Lecture Notes On Optimal Transport Problems
Luigi Ambrosio, luigi@ambrosio.sns.it
Contents
1 Some elementary examples
8 The Bouchitté–Buttazzo mass optimization problem
Introduction
Lectures given in Madeira (PT), Euro Summer School "Mathematical aspects of evolving
interfaces", 2–9 July 2000.
Scuola Normale Superiore, Pisa; work partially supported by the GNAFA project "Problemi
di Monge–Kantorovich e strutture deboli".
These notes are devoted to the Monge–Kantorovich optimal transport problem.
This problem, in the original formulation of Monge, can be stated as follows: given
two distributions with equal masses of a given material $g_0(x)$, $g_1(x)$ (corresponding
for instance to an embankment and an excavation), find a transport map $\psi$ which
carries the first distribution into the second and minimizes the transport cost
\[ C(\psi) := \int_X |x - \psi(x)|\, g_0(x)\, dx. \]
The condition that the first distribution of mass is carried into the second can be
written as
\[ \int_{\psi^{-1}(B)} g_0(x)\, dx = \int_B g_1(y)\, dy \qquad \forall B \subset X \text{ Borel}. \tag{1} \]
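For intuition, the Monge problem can be tried out on a finite toy case. The following Python sketch assumes that both distributions consist of N atoms of equal mass 1/N on the real line (an assumption made only for illustration; the notes deal with general distributions), so that a transport map is a bijection between atoms and the mass-preservation condition holds automatically. The names `transport_cost` and `best_map_bruteforce` are hypothetical.

```python
from itertools import permutations

def transport_cost(xs, ys, perm):
    """Cost of the map sending xs[i] to ys[perm[i]], each atom carrying mass 1/N."""
    n = len(xs)
    return sum(abs(xs[i] - ys[perm[i]]) for i in range(n)) / n

def best_map_bruteforce(xs, ys):
    """Minimize the cost over all bijections (feasible only for small N)."""
    return min(permutations(range(len(xs))),
               key=lambda p: transport_cost(xs, ys, p))

xs = [0.0, 1.0]          # atoms of the first distribution
ys = [1.0, 2.0]          # atoms of the second distribution
shift = (0, 1)           # 0 -> 1, 1 -> 2
swap = (1, 0)            # 0 -> 2, 1 -> 1 (the second atom stays put)

cost_shift = transport_cost(xs, ys, shift)
cost_swap = transport_cost(xs, ys, swap)
best_cost = transport_cost(xs, ys, best_map_bruteforce(xs, ys))
print(cost_shift, cost_swap, best_cost)
```

In this toy case both bijections have the same cost, which already hints that minimizers of the cost with $c(x,y) = |x - y|$ need not be unique.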
The infimum of the transport problem leads also to a c-dependent distance between
measures with equal mass, known as Kantorovich–Wasserstein distance.
The optimal transport problem and the Kantorovich–Wasserstein distance have
a very broad range of applications: Fluid Mechanics [9], [10]; Partial Differential
Equations [32, 29]; Optimization [13], [14], to quote just a few examples. Moreover,
the 1-Wasserstein distance (corresponding to $c(x, y) = |x - y|$ in (2)) is related to
the so-called flat distance in Geometric Measure Theory, which plays an important
role in its development (see [6], [24], [30], [27], [44]). However, rather than showing
specific applications (for which we mainly refer to the Evans survey [21] or to the
introduction of [9]), the main aim of the notes is to present the different formulations
of the optimal transport problem and to compare them, focussing mainly on the
linear case $c(x, y) = |x - y|$. The main sources for the preparation of the notes have
been the papers by Bouchitté–Buttazzo [13, 14], Caffarelli–Feldman–McCann [15],
Evans–Gangbo [22], Gangbo–McCann [26], Sudakov [42] and Evans [21].
The notes are organized as follows. In Section 1 we discuss some basic examples
and in Section 2 we discuss Kantorovich's generalized solutions, i.e. the transport
plans, pointing out the connection between them and the transport maps. Section 3
is entirely devoted to the one dimensional case: in this situation the order structure
plays an important role and considerably simplifies the theory. Sections 4 and 5 are
devoted to the ODE and PDE based formulations of the optimal transport problem
(respectively due to Brenier and Evans–Gangbo); we discuss in particular the role of
the so-called transport density and the equivalence of its different representations.
Namely, we prove that any transport density $\mu$ can be represented as
$\int_0^1 \pi_{t\#}(|y - x|\gamma)\, dt$, where $\gamma$ is an optimal planning, as
$\int_0^1 |E_t|\, dt$, where $E_t$ is the velocity field in the ODE formulation, or as the
solution $\mu$ of the PDE $\operatorname{div}(\nabla_\mu u\, \mu) = f_1 - f_0$, with
no regularity assumption on $f_1$, $f_0$. Moreover, in the same generality we prove
convergence of the $p$-laplacian approximation.
In Section 6 we discuss the existence of the optimal transport map, following
essentially the original Sudakov approach and filling a gap in his original proof
(see also [15, 43]). Section 7 deals with recent results, related to those obtained
in [25], on the regularity and the uniqueness of the transport density. Section 8 is
devoted to the connection between the optimal transport problem and the so-called
mass optimization problem. Finally, Section 9 contains a self-contained list of the
measure theoretic results needed in the development of the theory.
Main notation
$X$                     a compact convex subset of a Euclidean space $\mathbb{R}^n$
$\mathcal{B}(X)$        Borel $\sigma$-algebra of $X$
$\mathcal{L}^n$         Lebesgue measure in $\mathbb{R}^n$
$\mathcal{H}^k$         Hausdorff $k$-dimensional measure in $\mathbb{R}^n$
$\mathrm{Lip}(X)$       real valued Lipschitz functions defined on $X$
$\mathrm{Lip}_1(X)$     functions in $\mathrm{Lip}(X)$ with Lipschitz constant not greater than 1
$\Sigma_u$              the set of points where $u$ is not differentiable
$\pi_t$                 projections $(x, y) \mapsto x + t(y - x)$, $t \in [0, 1]$
$S_o(X)$                open segments $]]x, y[[$ with $x, y \in X$
$S_c(X)$                closed segments $[[x, y]]$ with $x, y \in X$, $x \neq y$
$\mathcal{M}(X)$        signed Radon measures with finite total variation in $X$
$\mathcal{M}_+(X)$      positive and finite Radon measures in $X$
$\mathcal{M}_1(X)$      probability measures in $X$
$|\mu|$                 total variation of $\mu \in [\mathcal{M}(X)]^n$
$\mu^+$, $\mu^-$        positive and negative part of $\mu \in \mathcal{M}(X)$
$f_\#\mu$               push forward of $\mu$ by $f$
Example 1.1 (Non existence) Let $f_0 = \delta_0$ and $f_1 = (\delta_1 + \delta_{-1})/2$. In this case
the optimal transport problem has no solution simply because there is no map $\psi$
such that $\psi_\# f_0 = f_1$.
The following two examples deal with the case when the cost function $c$ in $X \times X$
is $|x - y|$, i.e. the euclidean distance between $x$ and $y$. In this case we will use as
a test for optimality the fact that the infimum of the transport problem is always
greater than
\[ \sup\left\{ \int_X u\, d(f_1 - f_0) : u \in \mathrm{Lip}_1(X) \right\}. \tag{3} \]
Indeed,
\[ \int_X u\, d(f_1 - f_0) = \int_X u(\psi(x)) - u(x)\, df_0(x) \le \int_X |\psi(x) - x|\, df_0(x) \]
for any admissible transport $\psi$. Actually we will prove this lower bound is sharp if
$f_0$ has no atom (see (6) and (13)).
Our second example shows that in general the solution of the optimal transport
problem is not unique. In the one-dimensional case we will obtain (see Theorem 3.1),
uniqueness (and existence) in the class of nondecreasing maps.
We conclude this section with some two dimensional examples.
Example 1.5 Assume that $2f_0$ is the sum of the unit Dirac masses at $(1, 1)$ and
$(0, 0)$, and that $2f_1$ is the sum of the unit Dirac masses at $(1, 0)$ and $(0, 1)$. Then
the horizontal transport and the vertical transport are both optimal. Indeed,
the cost of these transports is 1 and choosing
\[ u(x_1, x_2) = \begin{cases} \sqrt{x_1^2 + x_2^2} & \text{if } x_1 + x_2 \le 1 \\ \sqrt{(1 - x_1)^2 + (1 - x_2)^2} & \text{if } x_1 + x_2 > 1 \end{cases} \]
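The optimality test based on 1-Lipschitz functions can be checked numerically in this example. The Python sketch below evaluates the integral of $u$ against $f_1 - f_0$ for the function $u$ above, compares it with the cost of the horizontal transport, and runs a crude grid check of the 1-Lipschitz property (numerical evidence only, not a proof). All variable names are illustrative.

```python
import math

def u(x1, x2):
    """Test function of Example 1.5: two distance functions glued along x1 + x2 = 1."""
    if x1 + x2 <= 1:
        return math.hypot(x1, x2)
    return math.hypot(1 - x1, 1 - x2)

# Atoms of f0 and f1, each carrying mass 1/2.
f0_atoms = [(1.0, 1.0), (0.0, 0.0)]
f1_atoms = [(1.0, 0.0), (0.0, 1.0)]

# Integral of u against f1 - f0.
dual_value = 0.5 * sum(u(*p) for p in f1_atoms) - 0.5 * sum(u(*p) for p in f0_atoms)

# Cost of the horizontal transport (1,1) -> (0,1), (0,0) -> (1,0).
horizontal = [((1.0, 1.0), (0.0, 1.0)), ((0.0, 0.0), (1.0, 0.0))]
cost = 0.5 * sum(math.dist(a, b) for a, b in horizontal)

# Grid check of |u(p) - u(q)| <= |p - q| on the unit square.
grid = [(i / 10, j / 10) for i in range(11) for j in range(11)]
lipschitz_ok = all(abs(u(*p) - u(*q)) <= math.dist(p, q) + 1e-12
                   for p in grid for q in grid)
print(dual_value, cost, lipschitz_ok)
```

Since the value obtained from $u$ coincides with the cost of the horizontal transport, the lower bound is attained, which is the mechanism used in the example.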
Example 1.6 Assume that $f_1$ is the sum of two Dirac masses at $A, B \in \mathbb{R}^2$ and
assume that $f_0$ is supported on the middle axis between them. Then
\[ \int_X |x - \psi(x)|\, df_0(x) = \int_X |x - A|\, df_0(x) \]
\[ \pi_{0\#}\gamma = f_0, \qquad \pi_{1\#}\gamma = f_1. \]
among all admissible $\gamma$ and we denote by $F_c(f_0, f_1)$ the value of the infimum.
The difference between transport maps and transport plans can be better understood
with the following proposition.
Proposition 2.1 (Transport plans versus transport maps) Any Borel transport
map $\psi : X \to X$ induces a transport plan $\gamma_\psi$ defined by
\[ \gamma_\psi := (\mathrm{Id} \times \psi)_\# f_0. \tag{5} \]
Conversely, a transport plan $\gamma$ is induced by a transport map if $\gamma$ is concentrated on
a $\gamma$-measurable graph $\Gamma$.
(here $\int^*$ denotes the outer integral). Hence $\gamma_x$ is the unit Dirac mass at $\psi(x)$ for
$f_0$-a.e. $x \in X$. Since
\[ x \mapsto \tilde\psi(x) := \int_X y\, d\gamma_x(y) \]
is a Borel map coinciding with $\psi$ $f_0$-a.e., we obtain that $\gamma_x$ is the unit Dirac mass
at $\tilde\psi(x)$ for $f_0$-a.e. $x$. For $A, B \in \mathcal{B}(X)$ we get
\[ \gamma(A \times B) = \int_A \gamma_x(B)\, df_0(x) = f_0(\{x : (x, \tilde\psi(x)) \in A \times B\}) = \gamma_{\tilde\psi}(A \times B) \]
and therefore $\gamma = \gamma_{\tilde\psi}$.
The existence of optimal transport plans is a straightforward consequence of the
$w^*$-compactness of probability measures and of the lower semicontinuity of $I$.
Proof. Clearly the set of admissible $\gamma$'s is closed, bounded and $w^*$-compact for the
$w^*$-convergence of measures (i.e. in the duality with continuous functions in $X \times X$).
Hence, it suffices to prove that
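By a standard approximation argument we can assume that $L = \mathrm{Lip}(c)$ is finite
and that
\[ \varepsilon |x - y| \le c(x, y) \qquad \forall x, y \in X \]
for some $\varepsilon > 0$. Possibly replacing $c$ by $c/\varepsilon$ we assume $\varepsilon = 1$.
Fix now an integer $h$, set $f_0^0 = f_0$ and choose $\psi_0 : X \to X$ such that $\psi_{0\#} f_0^0$ has
no atom and
\[ F_1(f_1, \psi_{0\#} f_0^0) < 2^{-h} \quad\text{and}\quad \int_X c(\psi_0(x), x)\, df_0^0(x) < F_c(f_1, f_0^0) + 2^{-h}. \]
Then, we set $f_0^1 = \psi_{0\#} f_0^0$, find $\psi_1 : X \to X$ such that $\psi_{1\#} f_0^1$ has no atom and
\[ F_1(f_1, \psi_{1\#} f_0^1) < 2^{-1-h} \quad\text{and}\quad \int_X c(\psi_1(x), x)\, df_0^1(x) < F_c(f_1, f_0^1) + 2^{-1-h}. \]
Proceeding inductively and setting $f_0^k = \psi_{(k-1)\#} f_0^{k-1}$ we find $\psi_k$ such that
\[ F_1(f_1, \psi_{k\#} f_0^k) < 2^{-k-h} \quad\text{and}\quad \int_X c(\psi_k(x), x)\, df_0^k(x) < F_c(f_1, f_0^k) + 2^{-k-h} \]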
and the proof is achieved.
For instance in the case of Example 1.1 (where transport maps do not exist at
all) it is easy to check that the unique optimal transport plan is given by
\[ \frac{1}{2}\, \delta_0 \times \delta_1 + \frac{1}{2}\, \delta_0 \times \delta_{-1}. \]
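A small Python sketch may help to see why this plan is admissible although no transport map exists. It assumes a hypothetical representation of a discrete plan as a dictionary `{(x, y): mass}`; the helper names are illustrative and the cost $c(x,y) = |x - y|$ is chosen only as an example.

```python
from fractions import Fraction

half = Fraction(1, 2)
# (1/2) delta_0 x delta_1 + (1/2) delta_0 x delta_{-1}
plan = {(0, 1): half, (0, -1): half}

def marginals(plan):
    """First and second marginals of a discrete plan, as dictionaries atom -> mass."""
    first, second = {}, {}
    for (x, y), mass in plan.items():
        first[x] = first.get(x, 0) + mass
        second[y] = second.get(y, 0) + mass
    return first, second

def is_induced_by_map(plan):
    """A discrete plan comes from a map iff no source atom is split among several targets."""
    sources = [x for (x, _), mass in plan.items() if mass > 0]
    return len(sources) == len(set(sources))

def plan_cost(plan, c=lambda x, y: abs(x - y)):
    """Integral of the cost c against the plan."""
    return sum(mass * c(x, y) for (x, y), mass in plan.items())

first, second = marginals(plan)
print(first, second, is_induced_by_map(plan), plan_cost(plan))
```

The marginals are $\delta_0$ and $(\delta_1 + \delta_{-1})/2$, and the plan splits the single atom at 0, which is exactly what a map cannot do.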
In general, however, uniqueness fails because of the linearity of $I$ and of the convexity
of the class of admissible plans $\gamma$. In Example 1.2, for instance, any measure
\[ t(\mathrm{Id} \times \psi_1)_\# f_0 + (1 - t)(\mathrm{Id} \times \psi_2)_\# f_0 \]
for any permutation $\sigma : \{1, \ldots, n\} \to \{1, \ldots, n\}$. The equivalence can be proved
either directly (reducing to the case when $\sigma$ has no nontrivial invariant set) or
verifying, as we will soon do, that any cyclically monotone set is contained in the
c-superdifferential of a c-concave function and then checking that the superdifferential
fulfils (8).
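For a finite family of pairs the permutation form of cyclical monotonicity can be tested by brute force. The sketch below assumes that the condition reads $\sum_i c(x_i, y_i) \le \sum_i c(x_{\sigma(i)}, y_i)$ for every permutation $\sigma$ (the standard formulation; the displayed inequality itself is not reproduced in this part of the notes). The function name `is_c_cyclically_monotone` is hypothetical.

```python
from itertools import permutations

def is_c_cyclically_monotone(pairs, c, tol=1e-12):
    """Check sum_i c(x_i, y_i) <= sum_i c(x_sigma(i), y_i) for all permutations sigma.

    Exponential in len(pairs), so only meant for a handful of pairs.
    """
    n = len(pairs)
    base = sum(c(x, y) for x, y in pairs)
    for sigma in permutations(range(n)):
        permuted = sum(c(pairs[sigma[i]][0], pairs[i][1]) for i in range(n))
        if base > permuted + tol:
            return False
    return True

linear = lambda x, y: abs(x - y)
quadratic = lambda x, y: 0.5 * (x - y) ** 2

increasing = [(0.0, 1.0), (1.0, 2.0), (2.0, 4.0)]   # graph of a nondecreasing map
book_shift = [(0.0, 2.0), (1.0, 1.0)]               # 0 jumps over 1, which stays put

print(is_c_cyclically_monotone(increasing, quadratic))  # monotone graph
print(is_c_cyclically_monotone(book_shift, linear))     # allowed by the linear cost
print(is_c_cyclically_monotone(book_shift, quadratic))  # rejected by the quadratic cost
```

The second and third calls illustrate how the linear cost tolerates non-monotone pairings that a strictly convex cost excludes.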
with $x_{n+1} = x_1$. For $1 \le i \le n$, let $U_i$, $V_i$ be compact neighbourhoods of $x_i$ and $y_i$
respectively such that $\gamma(U_i \times V_i) > 0$ and $f((u_i), (v_i)) < 0$ whenever $u_i \in U_i$ and
$v_i \in V_i$.
Set now $\lambda = \min_i \gamma(U_i \times V_i)$ and denote by $\gamma_i \in \mathcal{M}_1(U_i \times V_i)$ the normalized
restriction of $\gamma$ to $U_i \times V_i$. We can find a compact space $Y$, a probability measure
$\sigma$ in $Y$ and Borel maps $\eta_i = u_i \times v_i : Y \to U_i \times V_i$ such that $\gamma_i = \eta_{i\#}\sigma$ for $i = 1, \ldots, n$
(it suffices for instance to define $Y$ as the product of $U_i \times V_i$, so that $\eta_i$ are the
projections on the $i$-coordinate) and define
\[ \gamma' := \gamma + \frac{\lambda}{n} \sum_{i=1}^n (u_{i+1} \times v_i)_\#\sigma - (u_i \times v_i)_\#\sigma. \]
In order to show the last part of the statement we notice that the collection of
optimal transport plans is $w^*$-closed and compact. If $(\gamma_h)_{h \ge 1}$ is a countable dense
set, then
\[ \bigcup_{i=1}^h \operatorname{spt}\gamma_i = \operatorname{spt}\left( \sum_{i=1}^h \frac{1}{h}\, \gamma_i \right) \]
is c-cyclically monotone for any $h \ge 1$. Passing to the limit as $h \to \infty$ we obtain
that the closure of the union of $\operatorname{spt}\gamma_h$ is c-cyclically monotone. By the density of
$(\gamma_h)$, this closure contains $\operatorname{spt}\gamma$ for any optimal plan $\gamma$.
Next, we relate the c-cyclical monotonicity to suitable concepts (adapted to c)
of concavity and superdifferential.
Remark 2.1 [Linear and quadratic case] In the case when $c(x, y)$ is a symmetric
function satisfying the triangle inequality, the notion of c-concavity is equivalent to
1-Lipschitz continuity with respect to the metric $d_c$ induced by $c$. Indeed, given
$u \in \mathrm{Lip}_1(X, d_c)$, the family of functions whose infimum is $u$ is simply
\[ \{c(x, y) + u(y) : y \in X\}. \]
In the quadratic case $c(x, y) = |x - y|^2/2$ a function $u$ is c-concave if and only if
$u - |x|^2/2$ is concave. Indeed, $u = \inf_i c(\cdot, y_i) + t_i$ implies
\[ u(x) - \frac{1}{2}|x|^2 = \inf_i\, -\langle x, y_i \rangle + \frac{1}{2}|y_i|^2 + t_i \]
and therefore the concavity of $u - |x|^2/2$. Conversely, if $v = u - |x|^2/2$ is concave,
from the well known formula
The following theorem ([40], [41], [36]) shows that the graphs of superdifferentials
of c-concave functions are maximal (with respect to set inclusion) c-cyclically monotone
sets. It may be considered as the extension of the well known result of Rockafellar
to this setting.
Theorem 2.3 Any c-cyclically monotone set $\Gamma$ is contained in the graph of the
c-superdifferential of a c-concave function. Conversely, the graph of the
c-superdifferential of a c-concave function is c-cyclically monotone.
Proof. This proof is taken from [36]. We fix $(x_0, y_0) \in \Gamma$ and define
where the infimum runs among all collections $(x_i, y_i) \in \Gamma$ with $1 \le i \le n$ and
$n \ge 1$. Then $u$ is c-concave by construction and the cyclical monotonicity of $\Gamma$ gives
$u(x_0) = 0$ (the minimum is achieved with $n = 1$ and $(x_1, y_1) = (x_0, y_0)$).
We will prove the inequality
for any $x \in X$ and $(x', y') \in \Gamma$. In particular (choosing $x = x_0$) this implies that
$u(x') > -\infty$ and that $y' \in \partial_c u(x')$. In order to prove (10) we fix $\lambda > u(x')$ and find
$(x_i, y_i) \in \Gamma$, $1 \le i \le n$, such that
to obtain (8).
In the following corollary we assume that the cost function is symmetric, continuous
and satisfies the triangle inequality, so that c-concavity reduces to 1-Lipschitz
continuity with respect to the distance induced by $c$.
Proof. (Sufficiency) Let $\gamma'$ be any admissible transport plan; by applying (11) first
and then (12) we get
\[ I(\gamma') \ge \int_{X \times X} u(x) - u(y)\, d\gamma' = \int_X u\, df_0 - \int_X u\, df_1 = \int_{X \times X} u(x) - u(y)\, d\gamma = I(\gamma). \]
(Necessity) Let $\Gamma$ be the closure of the union of $\operatorname{spt}\gamma'$ as $\gamma'$ varies among all optimal
plans for (MK). Then we know that $\Gamma$ is c-cyclically monotone, hence there exists a
c-concave function $u$ such that $\Gamma \subset \partial_c u$. Then, (11) follows by the c-concavity of $u$,
while the inclusion $\Gamma \subset \partial_c u$ implies
\[ u(y) - u(x) \le c(y, y) - c(x, y) = -c(x, y) \]
for any $(x, y) \in \operatorname{spt}\gamma$. This, taking into account (11), proves (12).
A direct consequence of the proof of sufficiency is the identity
\[ \min\,(\mathrm{MK}) = \max\left\{ \int_X u\, d(f_0 - f_1) : u \in \mathrm{Lip}_1(X, d_c) \right\} \tag{13} \]
Proof. Let $u : X \to \mathbb{R}$ be a c-concave function such that the graph $\Gamma$ of its
superdifferential contains the support of any optimal planning $\gamma$. As $c(x, y) = |x - y|^2/2$,
an elementary computation shows that $(x_0, y_0) \in \Gamma$ if and only if
\[ -y_0 \in \partial^+ v(x_0) \]
where $v(x) = u(x) - |x|^2/2$ is the concave function already considered in Remark 2.1.
Then, by the above mentioned results on differentiability of concave functions, the
set of points where $v$ is not differentiable is $f_0$-negligible, hence for $f_0$-a.e. $x_0 \in X$
there is a unique $y_0 = -\nabla v(x_0) \in X$ such that $(x_0, y_0) \in \Gamma$. As $\operatorname{spt}\gamma \subset \Gamma$, by
Proposition 2.1 we infer that
\[ \gamma = (\mathrm{Id} \times \psi)_\# f_0 \]
for any optimal planning $\gamma$, with $\psi = -\nabla v$.
Example 2.1 (Brenier polar factorization theorem) A remarkable consequence
of Corollary 2.2 is the following result, known as polar factorization theorem.
A vector field $r : X \to X$ such that
Theorem 2.4 There exists a constant C = C(n, X) with the following property:
for any M+ (X) with (X) = Ln (X) and any M > 0 there exist a Lipschitz
function : X X and B B(X) such that Lip() M , Ln (X \ B) C/M and
= # (Ln B) + s
instance [24]) we can extend |B to a M -Lipschitz function : X X. Setting
B c = X \ B we have then
(ii) the function $\psi$ in (i) is an optimal transport, and if $p > 1$ it is the unique optimal
transport.
In the one-dimensional case these results are sharp: we have already seen that
transport maps need not exist if $f_0$ has atoms (Example 1.1) and that, without the
monotonicity constraint, they are not necessarily unique when $p = 1$.
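The role of monotonicity can be illustrated on a discrete analogue. The Python sketch below assumes that both measures consist of N distinct atoms of equal mass (an assumption for illustration only, since the statement above concerns $f_0$ without atoms), builds the nondecreasing map by sorting, and compares its cost with a brute-force minimum over all bijections. The function names are hypothetical.

```python
from itertools import permutations

def monotone_map(xs, ys):
    """Nondecreasing transport: the i-th smallest source atom goes to the i-th smallest target atom."""
    order_x = sorted(range(len(xs)), key=lambda i: xs[i])
    sorted_y = sorted(ys)
    image = [None] * len(xs)
    for rank, i in enumerate(order_x):
        image[i] = sorted_y[rank]
    return image

def cost(xs, image, p):
    """Average of |x - psi(x)|**p over the atoms (each atom has mass 1/N)."""
    return sum(abs(x - y) ** p for x, y in zip(xs, image)) / len(xs)

def brute_force_min(xs, ys, p):
    """Minimum cost over all bijections; only feasible for small N."""
    return min(cost(xs, [ys[j] for j in perm], p)
               for perm in permutations(range(len(ys))))

xs = [0.3, 0.0, 0.9, 0.5]
ys = [1.2, 2.0, 1.0, 1.7]
image = monotone_map(xs, ys)
for p in (1, 2):
    print(p, cost(xs, image, p), brute_force_min(xs, ys, p))
```

For both exponents the sorted pairing reaches the brute-force minimum in this example; for $p = 1$ other pairings may reach it as well.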
Proof. (i) Let m = min I and define
Let T be the at most countable set made by the atoms of f1 and by the points
t I such that 1 (t) contains more than one point; then 1 is well defined on
(I) \ T and 1 ([t, t0 ]) = [ 1 (t), 1 (t0 )] whenever t, t0 (I) \ T with t < t0 .
Then (c) gives
(notice that only here we use the fact that f0 is diffuse). By (b) the closed intervals
whose endpoints belong to (I) \ T generate the Borel -algebra of sptf1 , and this
proves that # f0 = f1 .
Let be any nondecreasing function such that # f0 = f1 and assume, possibly
modifying on a countable set, that is right continuous. Let
and notice that T is at most countable (since we can index with T a family of
pairwise disjoint open intervals). We claim that on sptf0 \ T ; indeed, for
s sptf0 \ T and s0 > s we have the inequalities
f1 ([m, (s0 )]) = f0 (1 ([m, (s0 )])) f0 ([m, s0 ]) > f0 ([m, s])
whenever (x, y), (x0 , y 0 ) spt. If x < x0 , this condition implies that y y 0 (this
is a simple analytic calculation that we omit, and here the fact that p > 1 plays a
role). This means that the set
is at most countable (since we can index with T a family of pairwise disjoint open
intervals) hence f0 -negligible. Therefore for f0 -a.e. x I there exists a unique
y = (x) I such that (x, y) spt (the existence of at least one y follows by the
fact that the projection of spt on the first factor is sptf0 ). Notice also that is
nondecreasing in its domain.
Arguing as in Proposition 2.1 we obtain that
# f0 .
= (Id )
In particular
# f0 = # f0
f1 = 1# = 1# (Id )
and since is non decreasing it follows that = (up to countable sets) on sptf0 .
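\[ \dot f_t + \nabla \cdot E_t = 0 \quad\text{in } (0, 1) \times \Omega \tag{17} \]
\[ \dot f_t(\varphi) = E_t \cdot \nabla\varphi \quad\text{in } (0, 1) \qquad \forall \varphi \in C_c^\infty(\Omega) \tag{18} \]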
One more interpretation of (17) is given in the following proposition. Recall that
a map $f$ defined in $(0, 1)$ with values in a metric space $(E, d)$ is said to be absolutely
continuous if for any $\varepsilon > 0$ there exists $\delta > 0$ such that
\[ \sum_i (y_i - x_i) < \delta \implies \sum_i d(f(y_i), f(x_i)) < \varepsilon \]
\[ \lim_{h \to 0} \frac{F_1(f_{t+h}, f_t)}{|h|} \le |E_t|(\Omega) \quad\text{for } \mathcal{L}^1\text{-a.e. } t \in (0, 1). \tag{19} \]
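Proof. In (13) we can obviously restrict to test functions $u$ such that $|u| \le r =
\operatorname{diam}(X)$ on $X$; by our assumption on $\Omega$ any of these functions can be extended
to $\mathbb{R}^n$ in such a way that the Lipschitz constant is still less than 1 and $u \equiv 0$ in a
neighbourhood of $\mathbb{R}^n \setminus \Omega$. In particular, choosing an optimal $u$ and setting $u_\varepsilon = u * \rho_\varepsilon$
we get
\[ F_1(f_s, f_t) = \lim_{\varepsilon \to 0^+} \int_X u_\varepsilon\, d(f_t - f_s) = \lim_{\varepsilon \to 0^+} \int_s^t E_\tau(\nabla u_\varepsilon)\, d\tau
   = \lim_{\varepsilon \to 0^+} \int_s^t E_\tau \cdot \nabla u_\varepsilon\, d\tau \le \int_s^t |E_\tau|(\Omega)\, d\tau \tag{20} \]
\[ G := \left\{ \varphi \in C^1(\mathbb{R}^n) \cap \mathrm{Lip}(\mathbb{R}^n) : \varphi \equiv 0 \text{ on } \mathbb{R}^n \setminus \Omega \right\} \]
endowed with the norm $\|\varphi\| = \mathrm{Lip}(\varphi)$. By using convolutions and (13) it is easy
to check that $F_1(\mu, \nu) = \|\mu - \nu\|_Y$, so that $\mathcal{M}_1(X)$ is isometrically embedded in
$Y$. Moreover, using the Hahn–Banach theorem it is easy to check that any $y \in Y$ is
representable as the divergence of a measure $E \in [\mathcal{M}(\Omega)]^n$ with $\|y\| = |E|(\Omega)$ ($E$ is
not unique, of course).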
By a general result proved in [5], valid also for more than one independent
variable, any Lipschitz map $f$ from $(0, 1)$ into the dual $Y$ of a separable Banach
space is weakly$^*$-differentiable for $\mathcal{L}^1$-almost every $t$, i.e.
\[ w^*\text{-}\lim_{h \to 0} \frac{f_{t+h} - f_t}{h} =: \dot f(t) \]
and $f_t - f_s = \int_s^t \dot f(\tau)\, d\tau$ for $s, t \in (0, 1)$. In addition, the map is also metrically
differentiable for $\mathcal{L}^1$-almost every $t$, i.e.
\[ \lim_{h \to 0} \frac{\|f_{t+h} - f_t\|}{|h|} =: m\dot f(t) \]
Although $\dot f$ is only a $w^*$-limit of the difference quotients, it turns out (see [5]) that
the metric derivative $m\dot f$ is $\mathcal{L}^1$-a.e. equal to $\|\dot f\|$.
Putting together this information the conclusion follows.
According to Brenier, we can formulate the optimal transport problem as follows.
among all Borel maps $f_t : [0, 1] \to \mathcal{M}_+(X)$ and $E_t : [0, 1] \to [\mathcal{M}(\Omega)]^n$ such that (17)
holds.
\[ f_t = \chi_{[t, t+1]}\, \mathcal{L}^1, \qquad E_t = \chi_{[t, t+1]}\, \mathcal{L}^1 \]
provide an admissible and optimal flow, obviously related to the optimal transport
map $x \mapsto x + 1$. But, quite surprisingly, we can also define an optimal flow by
\[ f_t = \begin{cases} \dfrac{1}{1 - 2t}\, \chi_{[2t, 1]}\, \mathcal{L}^1 & \text{if } 0 \le t < 1/2 \\ \delta_1 & \text{if } t = 1/2 \\ \dfrac{1}{2t - 1}\, \chi_{[1, 2t]}\, \mathcal{L}^1 & \text{if } 1/2 < t \le 1 \end{cases} \]
It is easy to check that $\dot f_t + \nabla \cdot E_t = 0$ and that $|E_t|$ are probability measures for any
$t \neq 1/2$, hence $J(E) = 1$. The relation of this new flow with the optimal transport
map $x \mapsto 2 - x$ will be seen in the following (see Remark 4.1(3)).
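The two flows of this example can be compared numerically through the maps $x \mapsto x + 1$ and $x \mapsto 2 - x$. The Python sketch below assumes that the initial measure is the Lebesgue measure on $[0, 1]$, approximates integrals with a midpoint rule, and reports the transport cost of each map together with the interval swept at time $t$ by the points $x + t(\psi(x) - x)$. The helper names are hypothetical.

```python
def midpoint_nodes(n):
    """Midpoints of n equal subintervals of [0, 1] (midpoint quadrature rule)."""
    return [(i + 0.5) / n for i in range(n)]

def transport_cost(psi, n=1000):
    """Approximate the integral over [0, 1] of |x - psi(x)| dx."""
    return sum(abs(x - psi(x)) for x in midpoint_nodes(n)) / n

def interpolated_range(psi, t, n=1000):
    """Smallest and largest value of x + t (psi(x) - x) over the quadrature nodes."""
    pts = [x + t * (psi(x) - x) for x in midpoint_nodes(n)]
    return min(pts), max(pts)

shift = lambda x: x + 1.0   # translation by one
flip = lambda x: 2.0 - x    # reflection about the point 1

print(transport_cost(shift), transport_cost(flip))
print(interpolated_range(shift, 0.25))  # inside [t, t + 1]
print(interpolated_range(flip, 0.25))   # inside [2t, 1]
print(interpolated_range(flip, 0.5))    # everything sits at the point 1
```

Both maps have the same cost, and at $t = 1/2$ the reflected flow concentrates all the mass at the point 1, in agreement with the Dirac mass appearing in the formula for $f_t$.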
In order to relate solutions of (ODE) to solutions of (MK) we will need a
uniqueness theorem, under regularity assumptions in the space variable, for the
ODE $\dot f_t + \nabla \cdot (g_t f_t) = 0$. If $f_t$, $g_t$ are smooth (say $C^2$) with respect to both the
space and time variables, uniqueness is a consequence of the classical method of
characteristics (see for instance §3.2 of [20]), which provides the representation
Z t
ft (xt ) = f0 (x) exp cs (xs ) ds
0
See also [33] for more general uniqueness and representation results in a weak setting.
Proof. Let gt be obtained from gt by a mollification with respect to the space and
time variables and define X (s, t, x) as the solution of the ODE x = gs (x) (with s
as independent variable) such that X (t, t, x) = x. Define, for Cc ((0, 1) Rn )
fixed, Z 1
(t, x) := (s, X (s, t, x)) ds.
t
Since X (s, t, X (t, 0, x)) = X (s, 0, x) we have
Z 1
(t, X (t, 0, x)) = (s, X (s, 0, x)) ds
t
whence = ( /t + gt ) in (0, 1) Rn .
Insert now the test function in (22) and take into account that f0 = 0 to
obtain
Z 1Z Z 1Z
0= ft ( + gt ) dxdt = ft + ft (gt gt ) dxdt.
0 Rn t 0 Rn
Theorem 4.2 ((MK) versus (ODE)) The problem (ODE) has at least one solution
and $\min\,(\mathrm{ODE}) = \min\,(\mathrm{MK})$. Moreover, for any optimal planning
$\gamma \in \mathcal{M}_1(X \times X)$ for (MK) the measures
hence the ODE (17) is satisfied. Then, we simply evaluate the energy $J(E)$ in (21)
by
\[ J(E) = \int_0^1 |\pi_{t\#}((y - x)\gamma)|(X)\, dt \le \int_0^1 \pi_{t\#}(|y - x|\gamma)(X)\, dt
        = \int_0^1 \int_{X \times X} |x - y|\, d\gamma\, dt = I(\gamma). \tag{24} \]
This shows that $\inf\,(\mathrm{ODE}) \le \min\,(\mathrm{MK})$.
In order to prove the opposite inequality we first use Proposition 4.1 and then
we present a different strategy, which provides more geometric information.
By Proposition 4.1 we have
\[ F_1(f_0, f_1) \le \int_0^1 \left| \frac{d}{dt} F_1(f_t, f_0) \right| dt \le \int_0^1 |E_t|(\Omega)\, dt \]
for any admissible flow $(f_t, E_t)$. This proves that $\min\,(\mathrm{MK}) \le \inf\,(\mathrm{ODE})$.
For given (and admissible) (ft , Et ), and assuming that
Z 1
spt |Et | dt
0
ft := ft + and Et := Et
where is any convolution kernel with compact support and (x) = n (x/).
Notice that ft are strictly positive on sptEt ; moreover (17) still holds, sptEt
for small enough and L1 -a.e. t and by (53) we have
Z
|Et |(y) dy |Et |() t [0, 1]. (25)
Rn
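We define $g_t^\varepsilon = E_t^\varepsilon / f_t^\varepsilon$ and denote by $\psi_t^\varepsilon(x)$ the semigroup in $[0, 1]$ associated to the
ODE
\[ \dot\psi_t^\varepsilon(x) = g_t^\varepsilon(\psi_t^\varepsilon(x)), \qquad \psi_0^\varepsilon(x) = x. \tag{26} \]
Notice that this flow for $\varepsilon$ small enough maps $\Omega$ into itself and leaves $\mathbb{R}^n \setminus \Omega$ fixed.
Now, we claim that $f_t^\varepsilon \mathcal{L}^n = \psi_{t\#}^\varepsilon(f_0^\varepsilon \mathcal{L}^n)$ for any $t \in [0, 1]$. Indeed, since by definition
\[ \dot f_t^\varepsilon + \nabla \cdot (g_t^\varepsilon f_t^\varepsilon) = \dot f_t^\varepsilon + \nabla \cdot E_t^\varepsilon = 0 \]
by Theorem 4.1 and the linearity of the equation we need only to check that also
$\nu_t^\varepsilon = \psi_{t\#}^\varepsilon(f_0^\varepsilon \mathcal{L}^n)$ satisfies the ODE $\dot\nu_t^\varepsilon + \nabla \cdot (g_t^\varepsilon \nu_t^\varepsilon) = 0$. This is a straightforward
computation based on (26).
Using this representation, we can view $\psi_1^\varepsilon$ as approximate solutions of the optimal
transport problem and define
\[ \gamma^\varepsilon := (\mathrm{Id} \times \psi_1^\varepsilon)_\#(f_0^\varepsilon \mathcal{L}^n), \]
i.e.
\[ \int \varphi(x, y)\, d\gamma^\varepsilon = \int_{\mathbb{R}^n} \varphi(x, \psi_1^\varepsilon(x))\, f_0^\varepsilon(x)\, dx \]
for any bounded Borel function $\varphi$ in $\mathbb{R}^n \times \mathbb{R}^n$.
In order to evaluate $I(\gamma^\varepsilon)$, we notice that
\[ \mathrm{Length}(\psi_t^\varepsilon(x)) = \int_0^1 |g_t^\varepsilon|(\psi_t^\varepsilon(x))\, dt \]
hence, by multiplying by $f_0^\varepsilon$ and integrating we get
\[ \int_{\mathbb{R}^n} \mathrm{Length}(\psi_t^\varepsilon(x))\, f_0^\varepsilon(x)\, dx = \int_0^1 \int_{\mathbb{R}^n} |g_t^\varepsilon(y)|\, f_t^\varepsilon(y)\, dy\, dt
   = \int_0^1 \int_{\mathbb{R}^n} |E_t^\varepsilon|(y)\, dy\, dt \le \int_0^1 |E_t|(\Omega)\, dt. \tag{27} \]
As
\[ I(\gamma^\varepsilon) = \int_{\mathbb{R}^n} |\psi_1^\varepsilon(x) - x|\, f_0^\varepsilon(x)\, dx \le \int_{\mathbb{R}^n} \mathrm{Length}(\psi_t^\varepsilon(x))\, f_0^\varepsilon(x)\, dx, \]
this proves that $I(\gamma^\varepsilon) \le J(E)$. Passing to the limit as $\varepsilon \to 0^+$ and noticing that
\[ \pi_{0\#}\gamma^\varepsilon = f_0^\varepsilon \mathcal{L}^n, \qquad \pi_{1\#}\gamma^\varepsilon = f_1^\varepsilon \mathcal{L}^n \]
we obtain, possibly passing to a subsequence, an admissible planning $\gamma \in \mathcal{M}_1(\mathbb{R}^n \times \mathbb{R}^n)$
for $f_0$ and $f_1$ such that $I(\gamma) \le J(E)$. We can turn $\gamma$ into a planning supported
in $X \times X$ simply replacing $\gamma$ by $(\pi \times \pi)_\#\gamma$, where $\pi : \mathbb{R}^n \to X$ is the orthogonal
projection.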
Remark 4.1 (1) Starting from $(f_t, E_t)$, the (possibly multivalued) operation leading
to $\gamma$ and then to the new flow
\[ \tilde f_t := \pi_{t\#}\gamma, \qquad \tilde E_t := \pi_{t\#}((y - x)\gamma) \]
can be understood as a sort of arclength reparameterization of $(f_t, E_t)$. However,
since this operation is local in space (consider for instance the case of two line paths,
one parameterized by arclength, one not), in general there is no function $\varphi(t)$ such
that
\[ (\tilde f_t, \tilde E_t) = (f_{\varphi(t)}, E_{\varphi(t)}). \]
(2) The solutions $(f_t, E_t)$ in Example 4.1 are built as $\pi_{t\#}\gamma$ and $\pi_{t\#}((y - x)\gamma)$ where
$\gamma = (\mathrm{Id} \times \psi)_\# f_0$ and $\psi(x) = x + 1$, $\psi(x) = 2 - x$. In particular, in the second
example, $f_{1/2} = \delta_1$ because 1 is the midpoint of any transport ray.
(3) By making a regularization first in space and then in time of $(f_t, E_t)$ one can
avoid the use of Theorem 4.1, using only the classical representation of solutions of
(22) with characteristics.
Remark
R1 4.2 (Optimality conditions) (1) If (ft , Et ) is optimal, then
spt 0 |Et | dt . Indeed, we can find 0 which still has the property
that the open r-neighbourhood of X is contained in 0 , with r = diam(X), hence
Z 1 Z 1
|Et |() dt = min (MK) = |Et |(0 ) dt.
0 0
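(2) Since the two sides in the chain of inequalities (24) are equal when $\gamma$ is optimal,
we infer
\[ |\pi_{t\#}((y - x)\gamma)| = \pi_{t\#}(|y - x|\gamma) \quad\text{for } \mathcal{L}^1\text{-a.e. } t \in (0, 1). \tag{28} \]
Analogously, we have
\[ \left| \int_0^1 E_t\, dt \right| = \int_0^1 |E_t|\, dt \tag{29} \]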
Remark 4.3 (Nonlinear cost) When the cost function $c(x, y)$ is $|x - y|^p$ for some
$p > 1$ the corresponding problem (MK) is still equivalent to (ODE), provided we
minimize, instead of $J$, the energy
Z 1
Jp (f, E) := p (ft , Et ) dt
0
where Z p
d
if || << ;
X
p (, ) =
+ otherwise.
The proof of the equivalence is quite similar to the one given in Theorem 4.2. Again
the essential ingredient is the inequality
p ( , ) p (, )
The latter follows by Jensen's inequality and the convexity of the map $(z, t) \mapsto
|z|^p t^{1-p}$ in $\mathbb{R}^n \times (0, \infty)$, which provide the pointwise estimate
p
.
p
In this case, under the same assumptions on f0 made in Corollary 2.2, due to the
uniqueness of the optimal planning = (Id # )f0 , we obtain that, for optimal
(ft , Et ), = (Id 1 )# f0 converge to as 0+ . As a byproduct, the maps 1
converge to as Young measures. If f0 << Ln it follows that h converge to in
[Lr (f0 )]n for any r [1, ) (see Lemma 9.1 and the remark following it).
See 2.6 of [4] for a systematic analysis of the continuity and semicontinuity
properties of the functional (, ) 7 p (, ) with respect to weak convergence of
measures.
In the previous proof I don't know whether it is actually possible to show full
convergence of $\gamma^\varepsilon$ as $\varepsilon \to 0^+$. However, using a more refined estimate and a geometric
lemma (see Lemma 4.1 below) we will prove that any limit measure $\gamma$ satisfies the
condition
\[ \int_0^1 \pi_{t\#}(|y - x|\gamma)\, dt = \int_0^1 |E_t|\, dt. \tag{30} \]
We call transport density any of the measures in (30). This double representation
will be relevant in Section 7, where under suitable assumptions we will obtain the
uniqueness of the transport density.
Corollary 4.1 Let $(f_t, E_t)$ be optimal for (ODE). Then there exists an optimal
planning $\gamma$ such that (30) holds. In particular $\operatorname{spt} \int_0^1 |E_t|\, dt \subset X$.
Proof. Let $m = \min\,(\mathrm{MK}) = \min\,(\mathrm{ODE})$ and recall that, by Remark 4.2(1), the
measure $\int_0^1 |E_t|\, dt$ has compact support in $\Omega$. It suffices to build an optimal planning
$\gamma$ such that
\[ \int_0^1 \pi_{t\#}((y - x)\gamma)\, dt = \int_0^1 E_t\, dt. \tag{31} \]
Indeed, by (29) we get
\[ \int_0^1 \pi_{t\#}(|y - x|\gamma)\, dt \ge \int_0^1 |E_t|\, dt \]
we infer
2
Z Z 1 (x)
(x) x
t
lim t | (x)|f0 (x) dtdx = 0. (32)
| t (x)| |t (x) x| t
0+ BR 0
Now using Lemma 4.1 and the Young inequality 2ab a2 + b2 /, for any
Cc (Rn ) we obtain
Z 1
[t# ((y x) ) Et ]() dt
Z0 Z 1
[(1 (x) x)(t(1 (x) x)) t (x)(t (x))]f0 (x) dtdx
=
X 0
Z
RLip() Length(t )f0 (x) dx
X
Z Z 1 2
RLip() t (x) t (x) x
+ | (x)|f0 (x) dtdx.
|t (x) x| t
X 0 |t (x)|
with R = diam(). Passing to the limit first as 0+ , taking (32) into account
and then passing to the limit as 0+ we obtain
Z 1 Z 1
t# ((y x))() dt = Et () dt.
0 0
Lemma 4.1 Let Lip([0, 1], Rn ) with (0) = 0 and Lip(Rn ). Then
Z 1 Z 1
(t) (t)
(1)(t(1)) (t)((t)) dt LR |(t)| dt
|(t)|
0
0 |(t)|
and integrate both sides in [0, 1] [0, 1]. Then, the left side becomes exactly the
integral that we need to estimate from above. The right side, up to the multiplicative
constant Lip(), can be estimated with
!
Z 1 Z 1
(t) (t)
|(t) (t)| dt =
|(t)| (t) dt
0 0
|(t)| |(t)|
Z 1
(t) (t)
sup || |(t)| dt.
|(t)|
0 |(t)|
Remark 4.5 The geometric meaning of the proof above is the following: the inte-
gral to be estimated is the action on the 1-form dxi of the closed and rectifiable
current T associated to the closed path starting at 0, arriving at (1) following
the curve (t), and then going back to 0 through the segment [[0, (1)]]; for any
2-dimensional current G such that G = T , as
this action can be estimated by the mass of G times Lip(). The cone construction
provides a current G with G = T , whose mass can be estimated by
Z 1Z 1
s|(t) (t)| dsdt.
0 0
In order to relate also (ft , Et ) to the Kantorovich potential, we define the transport
rays and the transport set and we prove the differentiability of the potential on the
transport set.
Definition 4.1 (Transport rays and transport set) Let $u \in \mathrm{Lip}_1(X)$. We say
that a segment $]]x, y[[ \subset X$ is a transport ray if it is a maximal open oriented segment
whose endpoints $x$, $y$ satisfy the condition
\[ u(x) - u(y) = |x - y|. \tag{35} \]
The transport set $T_u$ is defined as the union of all transport rays. We also define
$T_u^e$ as the union of the closures of all transport rays.
Denoting by $F$ the compact collection of pairs $(x, y)$ with $x \neq y$ such that (35)
holds, the transport set is also given by
\[ T_u = \bigcup_{t \in (0, 1)} \bigcup_{(x, y) \in F} \{x + t(y - x)\} \]
Proof. Let $x, y \in X$ be such that (35) holds. By the triangle inequality and the
1-Lipschitz continuity of $u$ we get
\[ u(x) - u(x + t(y - x)) = t|y - x| \qquad \forall t \in [0, 1]. \]
This implies that, setting $\nu = (y - x)/|y - x|$, the partial derivative of $u$ along $\nu$
is equal to $-1$ for any internal point $z$ of the segment. For any unit vector $\xi$
perpendicular to $\nu$ we have
\[ u(z + h\xi) - u(z) = u(z + h\xi) - u(z + \sqrt{|h|}\,\nu) + u(z + \sqrt{|h|}\,\nu) - u(z)
   \le \sqrt{|h|^2 + |h|} - \sqrt{|h|} = O(|h|^{3/2}) = o(|h|). \]
A similar argument also proves that $u(z + h\xi) - u(z) \ge o(|h|)$. This proves the
differentiability of $u$ at $z$ and the identity $\nabla u(z) = -\nu$.
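The statement can be checked numerically on a simple potential. The Python sketch below takes $u(z) = |z|$ in the plane (a 1-Lipschitz function chosen only for illustration), a pair $x$, $y$ with $u(x) - u(y) = |x - y|$, and verifies that $u$ decreases linearly along the segment and that a finite-difference gradient at an interior point is close to $-\nu$. The helper names are hypothetical and the finite differences are only an approximation.

```python
import math

def u(z):
    """A 1-Lipschitz function on R^2: the Euclidean distance from the origin."""
    return math.hypot(*z)

def along(x, y, t):
    """The point x + t (y - x)."""
    return tuple(xi + t * (yi - xi) for xi, yi in zip(x, y))

def grad_fd(f, z, h=1e-6):
    """Central finite-difference gradient (numerical approximation only)."""
    g = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += h
        zm[i] -= h
        g.append((f(zp) - f(zm)) / (2 * h))
    return tuple(g)

x, y = (3.0, 4.0), (0.0, 0.0)        # u(x) - u(y) = 5 = |x - y|
length = math.dist(x, y)
nu = tuple((yi - xi) / length for xi, yi in zip(x, y))   # direction from x to y
z = along(x, y, 0.5)                 # interior point of the segment
gradient = grad_fd(u, z)

for t in (0.0, 0.25, 0.5, 1.0):
    print(t, u(x) - u(along(x, y, t)), t * length)
print(gradient, tuple(-c for c in nu))
```

In this example the gradient at the interior point is $(0.6, 0.8)$ up to discretization error, which is $-\nu$ as predicted.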
In Section 6 we will need a mild Lipschitz property of the potential u on Tue
(see also [15, 43] for the case of general strictly convex norms). We will use this
property to prove in Corollary 6.1 that Tue \ Tu is Lebesgue negligible. This property
can also be used to prove that u is approximately differentiable Ln -a.e. on Tu (see
for instance Theorem 3.1.9 of [24]).
Theorem 4.3 (Countable Lipschitz property) Let $u \in \mathrm{Lip}_1(X)$. There exists
a sequence of Borel sets $T_h$ covering $\mathcal{L}^n$-almost all of $T_u^e$ such that $\nabla u$ restricted to
$T_h$ is a Lipschitz function.
Proof. Given a direction $\nu \in S^{n-1}$ and $a \in \mathbb{R}$, let $R$ be the union of the half closed
transport rays $[[x, y[[$ with $\langle y - x, \nu \rangle \ge 0$ and $\langle y, \nu \rangle \ge a$. It suffices to prove that the
restriction of $\nabla u$ to
\[ T := R \cap \{x : x \cdot \nu < a\} \setminus \Sigma_u \]
has the countable Lipschitz property stated in the theorem. To this aim, since $BV_{loc}$
functions have this property (see for instance Theorem 5.34 of [4] or [24]), it suffices
to prove that $\nabla u$ coincides $\mathcal{L}^n$-a.e. in $T$ with a suitable function $w \in [BV_{loc}(S_a)]^n$,
where $S_a = \{x : x \cdot \nu < a\}$. To this aim we define
u(y) + |x y| C|x|2 , y Ya
Theorem 4.4 (Regularity of $(f_t, E_t)$) For any solution $(f_t, E_t)$ of (ODE) representable
as in Theorem 4.2 for a suitable optimal planning $\gamma$ there exists a 1-Lipschitz
function $u$ such that
(i) $f_t$ is concentrated on the transport set $T_u$ and $|E_t| \le C f_t$ for $\mathcal{L}^1$-a.e. $t \in (0, 1)$,
with $C = \operatorname{diam}(X)$;
(iii) for any convolution kernel $\rho$ we have
\[ \lim_{\varepsilon \to 0^+} \int_0^1 \int_X |\nabla u - \nabla u * \rho_\varepsilon|^2\, d|E_t|\, dt = 0; \tag{36} \]
(iv) $|E_t|(X) = \int_0^1 |E_\tau|(X)\, d\tau$ for $\mathcal{L}^1$-a.e. $t \in (0, 1)$.
Proof. (i) Let $u$ be any maximal Kantorovich potential and let $T = T_u$ be the
associated transport set. For any $t \in (0, 1)$ we have
\[ f_t(X \setminus T) = \int_{X \times X} \chi_{\{(x, y) :\, x + t(y - x) \notin T\}}\, d\gamma = 0 \]
because, by (12), any segment $]]x, y[[$ with $(x, y) \in \operatorname{spt}\gamma$ is contained in a transport
ray, hence contained in $T$. The inequality $|E_t| \le C f_t$ simply follows by the fact that
$|x - y| \le C$ on $\operatorname{spt}\gamma$.
(ii) Choosing $t$ satisfying (28) and taking into account Proposition 4.2, for any
$\varphi \in C(X)$ we obtain
\[ \int_X \varphi \nabla u\, d|E_t| = \int_{X \times X} \varphi(\pi_t) \nabla u(\pi_t) |y - x|\, d\gamma
   = -\int_{X \times X} \varphi(\pi_t)(y - x)\, d\gamma = -E_t(\varphi). \]
Remark 4.6 (1) It is easy to produce examples of optimal flows (ft , Et ) satisfying
(i), (ii), (iii), (iv) which are not representable as in Theorem 4.2 (again it suffices
to consider the sum of two paths, with constant sum of velocities). It is not clear
which conditions must be added in order to obtain this representation property.
(2) Notice that condition (i) actually implies that f is a Lipschitz map from [0, 1]
into M1 (X) endowed with the 1-Wasserstein metric.
\[ -\nabla \cdot (\nabla_\mu u\, \mu) = f \quad\text{in } \Omega. \tag{38} \]
Given $(\mu, u)$ admissible for (PDE), we call $\mu$ the transport density (the reason
for that soon will be clear) and $\nabla_\mu u$ the tangential gradient of $u$.
so that the total mass $\mu(\Omega)$ depends only on $f$ and $u$. We will prove in Theorem 5.1
that actually $\mu(\Omega)$ depends only on $f$ and that $\mu$ is concentrated on $X$.
(2) It is easy to prove that, given (, u), there is at most one tangential gradient
u: indeed if uh u and uh u we have
Z Z
u u d = lim u uh d = lim f (
uh ) = f (u) = ()
h h
whence the two tangential gradients coincide $\mu$-a.e. On the other hand, Example 5.1 below shows that to
some $u$ there could correspond more than one measure $\mu$ solving (PDE). We will
obtain uniqueness in Section 7 under absolute continuity assumptions on $f^+$ or $f^-$.
We will need the following lemma, in which the assumption that $\Omega$ is large
enough plays a role.
we conclude that ( \ 00 ) = 0.
In the following theorem we build a solution of (PDE) choosing $\mu$ as a transport
density of (MK) with $f_0 = f^+$ and $f_1 = f^-$. As a byproduct, by Corollary 4.1 we
obtain that the support of $\mu$ is contained in $X$ and that $\mu$ is representable as the
right side in (30).
Theorem 5.1 ((ODE) versus (PDE)) Assume that $(f_t, E_t)$ with
$\operatorname{spt} \int_0^1 |E_t|\, dt \subset \Omega$ solves (ODE) and let $u$ be a maximal Kantorovich potential
relative to $f_0$, $f_1$. Then, setting $f = f_0 - f_1$, the pair $(\mu, u)$ with $\mu = \int_0^1 |E_t|\, dt$
solves (PDE) with $\nabla_\mu u = \nabla u$ and, in particular, $\operatorname{spt}\mu \subset X$.
Conversely, if $(\mu, u)$ solves (PDE), setting $f_0 = f^+$ and $f_1 = f^-$, the measures
$(f_t, E_t)$ defined by
\[ f_t := f_0 + t(f_1 - f_0), \qquad E_t := -\nabla_\mu u\, \mu \]
solve (ODE) and $\operatorname{spt}\mu \subset X$. In particular
\[ \mu(X) = \min\,(\mathrm{ODE}) = \min\,(\mathrm{MK}). \]
Theorem 5.2 Let $u_p$ be solutions of (40) with $p \ge n + 1$. Then
(i) the measures $H_p = |\nabla u_p|^{p-2} \nabla u_p\, \mathcal{L}^n$ are equi-bounded in $\Omega$;
(ii) $(u_p)$ are equibounded and equicontinuous in $\Omega$;
(iii) If $(H, u) = \lim_j (H_{p_j}, u_{p_j})$ with $p_j \to \infty$, then $(\mu, u)$ solves (PDE) with $\mu = |H|$.
Proof. (i) Using up as test function and the Sobolev embedding theorem we get
Z Z 1/p0
p p0
|up | dx = f (up ) |f |(X)kup k C |up0 | dx
is lower semicontinuous with respect to the weak convergence of measures for any
nonnegative $\varphi \in C_c(\Omega)$, $w \in [C(\Omega)]^n$. The verification of this fact is straightforward:
it suffices to expand the squares. Second, we notice that
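\[ \lim_{\varepsilon \to 0^+} \limsup_{p \to \infty} \int_\Omega \varphi \, \Big| \frac{H_p}{|H_p|} - \nabla u * \rho_\varepsilon \Big|^2 \, d|H_p| = 0 \]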
where $\delta_p = \sup_{t \ge 0} \, (t^{p-1} - t^p)$ tends to 0 as $p \to \infty$.
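As a quick sanity check of this last claim (a sketch added for illustration, not part of the original argument): for $p > 1$ the function $t \mapsto t^{p-1} - t^p$ is maximal at $t = (p-1)/p$, so the supremum has the closed form $\frac{1}{p}\big(1 - \frac{1}{p}\big)^{p-1}$, which is at most $1/p$. The function names below are hypothetical.

```python
def sup_closed_form(p):
    """Closed form of sup_{t >= 0} (t**(p-1) - t**p), valid for p > 1."""
    return (1.0 / p) * (1.0 - 1.0 / p) ** (p - 1)


def sup_on_grid(p, n=1000):
    """Brute-force maximum over a uniform grid of [0, 1].

    For t > 1 the expression is negative, so restricting to [0, 1] is enough.
    """
    return max((i / n) ** (p - 1) - (i / n) ** p for i in range(n + 1))


for p in (5, 10, 100, 1000):
    print(p, sup_closed_form(p))
```

The printed values decrease roughly like $1/(e\,p)$, in agreement with the convergence to 0 stated above.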
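Taking into account these two remarks, setting $p = p_j$ and passing to the limit
as $j \to \infty$ we obtain
\[ \lim_{\varepsilon \to 0^+} \int_\Omega \varphi \, \Big| \frac{H}{|H|} - \nabla u * \rho_\varepsilon \Big|^2 \, d|H| = 0 \]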
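for any nonnegative $\varphi \in C_c(\Omega)$. This implies that $\nabla u * \rho_\varepsilon$ converge in $L^2_{loc}(|H|)$ to the
Radon–Nikodým derivative $H/|H|$. Since $|\nabla u * \rho_\varepsilon| \le 1$ we have also convergence in
$[L^2(|H|)]^n$.
We now set $\mu = |H|$ and $\nabla_\mu u = \lim_\varepsilon \nabla u * \rho_\varepsilon$, so that $H = \nabla_\mu u \, \mu$ and (38) is satisfied.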
Theorem 6.1 Let $B \in \mathcal{B}(X)$ and let $\pi : B \to S_c(X)$ be a Borel map satisfying the
conditions
(iii) the direction $\nu(x)$ of $\pi(x)$ is a $S^{n-1}$-valued countably Lipschitz map on $B$.
Proof. Since the stated property is stable under countable disjoint unions, we may
assume that
(a) there exists a unit vector $\nu$ such that $\nu(x) \cdot \nu \ge \frac{1}{2}$ for any $x \in B$;
{x : a b x a}
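with $L = \{y \in \mathbb{R}^{n-1} : \mathcal{H}^1(f^{-1}(y)) > 0\}$ and
\[ \lambda'_y := \frac{\dfrac{g}{Cf} \, \mathcal{H}^1 \llcorner f^{-1}(y)}{\displaystyle\int_{f^{-1}(y)} g/Cf \, d\mathcal{H}^1}, \qquad \nu' = \Big( \int_{f^{-1}(y)} \frac{g}{Cf} \, d\mathcal{H}^1 \Big) \mathcal{L}^{n-1} \llcorner L. \]
By Theorem 9.2 we obtain $\nu = \nu'$ and $\lambda_y = \lambda'_y$ for $\nu$-a.e. $y$, and this concludes the
proof.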
Remark 6.1 (1) In [42] V.N. Sudakov stated the theorem above (see Proposition 78
therein) for maps $\pi : B \to S_o(X)$ without the countable Lipschitz assumption (iii)
and also in a greater generality, i.e. for a generic Borel decomposition of the
space in open affine regions, even of different dimensions. However, it turns out that
the assumption (iii) is essential even if we restrict to 1-dimensional decompositions.
Indeed, G. Alberti, B. Kirchheim and D. Preiss [3] have recently found an example of a
compact family of open and pairwise disjoint segments in $\mathbb{R}^3$ such that the collection
$B$ of the midpoints of the segments has strictly positive Lebesgue measure. In this
case, of course, if $\lambda = \mathcal{L}^n \llcorner B$ the conditional measures $\lambda_C$ are unit Dirac masses
concentrated at the midpoint of $C$, so that the conclusion of Theorem 6.1 fails.
If $n = 2$ it is not hard to prove that actually condition (iii) follows from (ii), so
that the counterexample mentioned above is optimal.
(2) Since any open segment can be approximated from inside by closed segments,
by a simple approximation argument one can prove that Theorem 6.1 still holds as
well for maps $\pi : B \to S_o(X)$ or for maps with values into half open $[[x, y[[$ segments.
Also the assumption (i) that the segments do not intersect can be relaxed. For
instance we can assume that the segments can intersect only at their extreme points,
provided the collection of these extreme points is Lebesgue negligible. This is precisely
the content of the next corollary; again, a counterexample in [28] shows that
this property need not be true for a generic compact family of disjoint line segments
in $\mathbb{R}^n$, $n \ge 3$.
Corollary 6.1 (Negligible extreme points) Let $u \in \mathrm{Lip}_1(X)$. Then the collection
of the extreme points of the transport rays is Lebesgue negligible.
Proof. We prove that the collection $L$ of all left extreme points is negligible, the
proof for the right ones being similar. Let $B = L \setminus \Sigma_u$ and set
\[ \pi(x) := [[\, x, \; x - \tfrac{r(x)}{2} \nabla u(x) \,]] \]
where $r(x)$ is the length of the transport ray emanating from $x$ (this ray is unique
due to the differentiability of $u$ at $x$). By Theorem 4.3 the map $\nu$ has the countable
Lipschitz property on $B$ and, by construction, $\pi(x) \cap \pi(x') = \emptyset$ whenever $x \ne x'$.
By Theorem 6.1 we obtain
\[ \lambda = \int_{S_c(X)} \lambda_C \, d\nu \]
Theorem 6.2 (Sudakov) Let $f_0, f_1 \in \mathcal{M}_1(X)$ and assume that $f_0 \ll \mathcal{L}^n$. Then
there exists an optimal transport $\psi$ mapping $f_0$ to $f_1$. Moreover, if $f_1 \ll \mathcal{L}^n$ we can
choose $\psi$ so that $\psi^{-1}$ is well defined $f_1$-a.e. and $\psi^{-1}_\# f_1 = f_0$.
Proof. Let $\gamma$ be an optimal planning. In the first two steps we assume that
This condition holds for instance if $f_0 \wedge f_1 = 0$ because in this case any optimal
planning $\gamma$ does not charge the diagonal $\Delta$ of $X \times X$, i.e. $\gamma(\Delta) = 0$ (otherwise
$h = \pi_{0\#}(\gamma \llcorner \Delta) = \pi_{1\#}(\gamma \llcorner \Delta)$ would be a nonzero measure less than $f_0$ and $f_1$).
Let $u \in \mathrm{Lip}_1(X)$ be a maximal Kantorovich potential given by Corollary 2.1, i.e.
a function satisfying
for any optimal planning $\gamma$ and let $T_u$ be the transport set relative to $u$ (see Definition 4.1).
In the following we set
\[ \tilde X = \{ x \in X \setminus \Sigma_u : \exists\, y \ne x \text{ s.t. } (x, y) \in \mathrm{spt}\, \gamma \}. \]
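\[ \gamma = \gamma_C \otimes \nu \quad \text{with } \nu := r_\# \gamma \]
for $\nu$-a.e. $C \in S_c(X)$. By the sufficiency part in Corollary 2.1 we infer that $\gamma_C$ is an
optimal planning relative to the probability measures
\[ f_{0C} := \pi_{0\#} \gamma_C, \qquad f_{1C} := \pi_{1\#} \gamma_C. \]
Indeed,
\[ I(\gamma) = \int_{X \times X} |x - y| \, d\gamma = \int_{S_c(X)} \int_{X \times X} |x - y| \, d\gamma_C \, d\nu(C) = \int_{S_c(X)} I(\gamma_C) \, d\nu(C). \]
\[ \nu = r_\# \gamma = \pi_\# (\pi_{0\#} \gamma) = \pi_\# f_0. \]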
Notice that the segments $\pi(x)$, $x \in \tilde X$, can intersect only at their right extreme
point and that, by Corollary 6.1, the collection of these extreme points is Lebesgue
negligible. As a consequence of (46) with $\nu = \pi_\# f_0$ and Remark 6.1(2), the measures
$f_{0C}$ are absolutely continuous with respect to $\mathcal{H}^1 \llcorner C$ for $\nu$-a.e. $C \in S_c(X)$. Hence,
by Theorem 3.1, for $\nu$-a.e. $C \in S_c(X)$ we can find a nondecreasing map $\psi_C : C \to C$
(this notion makes sense, since $C$ is oriented) such that $\psi_{C\#} f_{0C} = f_{1C}$. Notice also
that $\psi_C^{-1}$ is well defined $f_{1C}$-a.e., if also $f_{1C} \ll \mathcal{H}^1 \llcorner C$.
Taking into account that the closed transport rays in $\pi(\tilde X)$ are pairwise disjoint
in $\tilde X$, we can glue all the maps $\psi_C$ to produce a single Borel map $\psi : \tilde X \to X$. The
map is Borel because we have been able to exhibit the one dimensional transport
map constructively and because of the Borel property of the maps $C \mapsto f_{iC}$ (see
(15) and (54)); the simple but boring details are left to the reader.
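To make the one dimensional step concrete, here is a minimal sketch of a nondecreasing transport map in a discrete setting. It assumes that both measures are uniform on finitely many distinct points of a line (an assumption made only for illustration; the theorem above deals with absolutely continuous measures), and the function names are hypothetical. The monotone map pairs the i-th smallest source point with the i-th smallest target point, and a brute-force search over all pairings confirms that this is optimal for the cost $|x - y|$ on this small example.

```python
from itertools import permutations


def monotone_map(sources, targets):
    """Pair the i-th smallest source with the i-th smallest target.

    Assumes distinct source points and equally many sources and targets.
    """
    if len(sources) != len(targets):
        raise ValueError("sources and targets must have the same length")
    return dict(zip(sorted(sources), sorted(targets)))


def average_cost(pairs):
    """Average of |x - y| over a list of (x, y) pairs."""
    return sum(abs(x - y) for x, y in pairs) / len(pairs)


def brute_force_cost(sources, targets):
    """Minimal average cost over all one-to-one pairings (small inputs only)."""
    return min(
        average_cost(list(zip(sources, perm))) for perm in permutations(targets)
    )


xs = [0.1, 0.9, 0.4]
ys = [0.5, 0.2, 1.0]
psi = monotone_map(xs, ys)
print(psi)
print(average_cost(list(psi.items())), brute_force_cost(xs, ys))
```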
Since $\psi_\# f_{0C} = \psi_{C\#} f_{0C} = f_{1C}$ for $\nu$-a.e. $C \in S_c(X)$, taking (44) into account we
infer
\[ \psi_\# f_0 = \int_{S_c(X)} \psi_\# f_{0C} \, d\nu(C) = \int_{S_c(X)} f_{1C} \, d\nu(C) = f_1. \]
7 Regularity and uniqueness of the transport density
In this section we investigate the regularity and the uniqueness properties of the
transport density $\mu$ arising in (PDE). Recall that, as Theorem 5.1 shows, any such
measure $\mu$ can also be represented as $\int_0^1 |E_t| \, dt$ for a suitable optimal pair $(f_t, E_t)$
for (ODE) (even with $E_t$ independent of $t$). In turn, by Corollary 4.1, any optimal
measure $\mu = \int_0^1 |E_t| \, dt$ can be represented as
\[ \mu = \int_0^1 \pi_{t\#}(|y - x| \gamma) \, dt \qquad (47) \]
for a suitable optimal planning $\gamma$. For this reason, in the following we restrict our
attention to the representation (47), valid for any optimal measure $\mu$ for (PDE).
A more manageable formula for $\mu$, first considered by G. Bouchitté and
G. Buttazzo in [14], is given in the following elementary lemma.
A direct consequence of (48) is that $\mu$ is concentrated on the transport set (since
$\gamma$-a.e. segment $]]x, y[[$ is contained in a transport ray); another consequence is the
density estimate
\[ \frac{\mu(B_r(x))}{2r} \le f_0(X) \wedge f_1(X) \qquad (49) \]
because $\mathrm{Length}(]]x, y[[ \, \cap B_r(x)) \le 2r$ for any ball $B_r(x)$ and any segment $]]x, y[[$. The
first results on $\mu$ that we state, proved by A. Pratelli in [34] (see also [19]), relate
the dimensions of $f_0$ and $f_1$ to the dimension of $\mu$ and show that necessarily $\mu$ is
absolutely continuous with respect to the Lebesgue measure if $f_0$ (or, by symmetry,
$f_1$) has this property.
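The length bound used above can be illustrated in one dimension. By (47), the mass that the transport density gives to a set is the average, with respect to the planning, of the length of the part of each segment lying in that set. The sketch below assumes a discrete planning stored as a list of (x, y, weight) triples on the real line; this representation and the function name are hypothetical and serve only to check the bound by 2r times the total mass of the planning.

```python
def density_mass(plan, center, radius):
    """Mass given to the interval (center - radius, center + radius).

    plan is a list of (x, y, weight) triples; each triple contributes its
    weight times the length of the segment between x and y inside the interval.
    """
    lo, hi = center - radius, center + radius
    total = 0.0
    for x, y, weight in plan:
        a, b = min(x, y), max(x, y)
        total += weight * max(0.0, min(b, hi) - max(a, lo))
    return total


plan = [(0.0, 1.0, 0.5), (0.2, 0.4, 0.5)]
total_weight = sum(weight for _, _, weight in plan)
mass = density_mass(plan, center=0.3, radius=0.2)
print(mass, 2 * 0.2 * total_weight)
```

Taking a ball that contains every segment returns the total transport cost of the planning, that is the sum of weight times $|x - y|$.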
Theorem 7.1 Assume that for some $k > 0$ we have
\[ \sup_{r \in (0,1)} \frac{f_0(B_r(x))}{r^k} < \infty \qquad f_0\text{-a.e. in } X. \]
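\[ \limsup_{r \to 0^+} \frac{\mu(B_r(x))}{\omega_k r^k} \ge t \quad \forall x \in B \quad \Longrightarrow \quad \mu(B) \ge t \, \mathcal{H}^k(B) \qquad (50) \]
\[ \limsup_{r \to 0^+} \frac{\mu(B_r(x))}{\omega_k r^k} \le t \quad \forall x \in B \quad \Longrightarrow \quad \mu(B) \le 2^k t \, \mathcal{H}^k(B) \qquad (51) \]
we can prove a natural lower bound on the Hausdorff dimension of $\mu$.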
Proof. By (49) we infer that $\mu$ has finite (and even bounded) 1-dimensional density
at any point. In particular (51) gives $\mu(B) = 0$ whenever $\mathcal{H}^1(B) = 0$, so that
$\mathcal{H}\text{-dim}(\mu) \ge 1$.
Let $k = \mathcal{H}\text{-dim}(f_0)$ and $k' < k$; then $f_0$ has finite $k'$-dimensional density $f_0$-a.e.,
otherwise by (50) the set where the density is not finite would be $\mathcal{H}^{k'}$-negligible
and with strictly positive $f_0$-measure. By Theorem 7.1 we infer that $\mu$ has finite
$k'$-dimensional density $\mu$-a.e. By the same argument used before with $k' = 1$ we
obtain that $\mathcal{H}\text{-dim}(\mu) \ge k'$ and therefore, since $k' < k$ is arbitrary, $\mathcal{H}\text{-dim}(\mu) \ge k$.
A symmetric argument proves that $\mathcal{H}\text{-dim}(\mu) \ge \mathcal{H}\text{-dim}(f_1)$.
We conclude this section proving the uniqueness of the transport density, under
the assumption that either $f_0 \ll \mathcal{L}^n$ or $f_1 \ll \mathcal{L}^n$ (see Example 5.1 for a nonuniqueness
example if neither $f_0$ nor $f_1$ is absolutely continuous). Similar results were
first announced by Feldman and McCann (see [25]).
We first deal with the one dimensional case, where this absolute continuity assumption
is not needed.
Lemma 7.2 Let $\mu$ be a transport density in $X \subset \mathbb{R}$ and assume that the interior
$(a, b)$ of $X$ is a transport ray. Then $\mu = h \mathcal{L}^1$ in $(a, b)$ with
Proof. The proof of the first statement is based on the equation $\mu' = f$ and on a
smoothing argument. By integrating both sides we get
\[ c(b - a) = \mu(X) - \int_a^b f((a, t)) \, dt = \mu(X) - \int_a^b \int_{(a,t)} 1 \, df(\tau) \, dt = \mu(X) - \int_{(a,b)} \int_\tau^b 1 \, dt \, df(\tau) = \mu(X) - \int_{(a,b)} (b - \tau) \, df(\tau). \]
The conclusion is achieved taking into account that $\mu(X) = \min (\mathrm{MK}) = \mathcal{F}_1(f_0, f_1)$.
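The one dimensional picture can be checked numerically. The sketch below relies on a standard one dimensional fact, stated here as an assumption rather than quoted from the lemma: for two probability measures on the line, the function $t \mapsto |F_0(t) - F_1(t)|$, with $F_0$, $F_1$ the distribution functions, integrates to the minimal transport cost for the cost $|x - y|$. Both measures are taken uniform on finitely many points, and all function names are hypothetical.

```python
def cdf_gap_pieces(xs, ys):
    """Piecewise constant values of |F0 - F1| between consecutive atoms.

    xs and ys are the atoms of two uniform discrete probability measures.
    Returns a list of (left, right, value) triples.
    """
    events = sorted(
        [(x, 1.0 / len(xs)) for x in xs] + [(y, -1.0 / len(ys)) for y in ys]
    )
    pieces, gap = [], 0.0
    for (t, weight), (t_next, _) in zip(events, events[1:]):
        gap += weight  # running value of F0 - F1 just after t
        pieces.append((t, t_next, abs(gap)))
    return pieces


def total_mass(pieces):
    """Integral of the piecewise constant function."""
    return sum((right - left) * value for left, right, value in pieces)


def sorted_matching_cost(xs, ys):
    """Cost of the monotone pairing; assumes len(xs) == len(ys)."""
    return sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))) / len(xs)


xs = [0.1, 0.9, 0.4]
ys = [0.5, 0.2, 1.0]
print(total_mass(cdf_gap_pieces(xs, ys)), sorted_matching_cost(xs, ys))
```

Both numbers agree, which is the discrete counterpart of the identity between the total mass of the transport density and min (MK).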
Proof. We already know from Theorem 7.1 that any transport density is absolutely
continuous and we can assume that $f_0 \ll \mathcal{L}^n$. By Corollary 4.1 we know that
the class of transport densities relative to $(f_0, f_1)$ is equal to the class of transport
densities relative to $(f_0 - h, f_1 - h)$ where $h$ is any measure in $\mathcal{M}_+(X)$ such that
$h \le f_0 \wedge f_1$ (indeed, this subtraction does not change the velocity field $E_t$). Hence,
it is not restrictive to assume that $f_0 \wedge f_1 = 0$.
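We can assume that $\mu$ is representable as in (30) for a suitable optimal planning
$\gamma$. Adopting the same notation of the proof of Theorem 6.1, we have
\[ \gamma = \gamma_C \otimes \nu \]
where $\nu = \pi_\# f_0$ does not depend on $\gamma$ (and here the absolute continuity assumption
on $f_0$ plays a crucial role) and $\gamma_C$ is an optimal planning relative to the measures
$f_{0C} = \pi_{0\#} \gamma_C$, $f_{1C} = \pi_{1\#} \gamma_C$ for $\nu$-a.e. $C$. In particular
\[ \mu = \int_0^1 \pi_{t\#}(|y - x| \, \gamma_C \otimes \nu) \, dt = \Big( \int_0^1 \pi_{t\#}(|y - x| \gamma_C) \, dt \Big) \otimes \nu. \]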
do not depend on $\gamma$ (up to $\nu$-negligible sets of course). To this aim, taking into
account Lemma 7.2, it suffices to show that $f_{0C}$ and $f_{1C}$ do not depend on $\gamma$.
Indeed, we already know from (46) and Theorem 9.2 that $f_{0C}$ do not depend on
$\gamma$.
The argument for $f_{1C}$ is more involved. First, since $\gamma(\Delta) = 0$ (due to the
assumption $f_0 \wedge f_1 = 0$), we have $|x - y| > 0$ $\gamma$-a.e., and therefore $|x - y| > 0$
$\gamma_C$-a.e. for $\nu$-a.e. $C \in S_c(X)$. As a consequence, setting $C = [[x_C, y_C]]$, we obtain
that $f_{1C}(x_C) = 0$ for $\nu$-a.e. $C$. Second, we examine the restriction $f'_{1C}$ of $f_{1C}$ to the
relative interior of $C$ noticing that (44) gives
\[ f_1 \llcorner T_u = \int_{S_c(X)} f_{1C} \llcorner T_u \, d\nu(C) = \int_{S_c(X)} f'_{1C} \, d\nu(C) \]
because $T_u \cap C$ is the relative interior of $C$ for any $C \in \pi(\tilde X)$. By Theorem 9.2 we
obtain that $f'_{1C}$ depend only on $f_1$, $T_u$ and $\nu$.
In conclusion, since
\[ f_{1C} = f'_{1C} + (1 - f'_{1C}(X)) \, \delta_{y_C} \]
8 The Bouchitté–Buttazzo mass optimization problem
In [13, 14] Bouchitté and Buttazzo consider the following problem. Given $f \in \mathcal{M}(X)$
with $f(X) = 0$, they define
\[ E(\mu) := \inf \Big\{ \int_X \frac{1}{2} |\nabla v|^2 \, d\mu - f(v) : v \in C^\infty(X) \Big\} \]
for any $\mu \in \mathcal{M}_+(X)$. Then, they raised the following mass optimization problem.
(BB) Given $m > 0$, maximize $E(\mu)$ among all measures $\mu \in \mathcal{M}_+(X)$ with $\mu(X) = m$.
As a byproduct we have that is unique and absolutely continuous with respect
to Ln if either f0 or f1 are absolutely continuous with respect to Ln .
Theorem 8.1 ((PDE) versus (BB)) Let $(\mu, u)$ be a solution of (PDE) and set
$\tilde m = \mu(X)$. Then $\tilde\mu = \frac{m}{\tilde m} \mu$ solves (BB) and any solution of (BB) is representable in
this way. Moreover
\[ \max (\mathrm{BB}) = -\frac{1}{2} \frac{\tilde m^2}{m}. \]
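Proof. Setting $v = \lambda u_h$ (with $u_h$ as in the definition of (PDE)), for any $\nu \in \mathcal{M}_+(X)$
with $\nu(X) = m$ we estimate
\[ E(\nu) \le \frac{m}{2} \lambda^2 - \lambda f(u_h) \]
so that, letting $h \to \infty$ and taking into account (39), we obtain
\[ E(\nu) \le \frac{m}{2} \lambda^2 - \lambda \tilde m. \]
By minimizing with respect to $\lambda$ (the minimum is attained at $\lambda = \tilde m / m$) we obtain $E(\nu) \le -\tilde m^2/(2m)$.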
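On the other hand, using Young inequality and choosing $\lambda$ so that $\lambda m = \tilde m$ we
can estimate
\[ \int_X \frac{1}{2} |\nabla v|^2 \, d\tilde\mu - f(v) \ge \lambda \int_X \langle \nabla v, \nabla_\mu u \rangle \, d\tilde\mu - \frac{m}{2} \lambda^2 - f(v) = \big( -\nabla \cdot (\nabla_\mu u \, \mu) \big)(v) - f(v) - \frac{1}{2} \frac{\tilde m^2}{m} = -\frac{1}{2} \frac{\tilde m^2}{m} \]
for any $v \in C^\infty(X)$. This proves that $\tilde\mu$ is optimal for (BB).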
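Conversely, it has been proved in [13, 14] that for any solution $\sigma$ of (BB) there
exists $v \in \mathrm{Lip}(X)$ such that $|\nabla_\sigma v| = \tilde m / m$ $\sigma$-a.e. and
\[ -\nabla \cdot (\nabla_\sigma v \, \sigma) = f, \]
where $\nabla_\sigma v$ is understood in the Bouchitté–Buttazzo sense. But since (see [14] again)
\[ \int_X |\nabla_\sigma v|^2 \, d\sigma = \inf \Big\{ \liminf_{h \to \infty} \int_X |\nabla v_h|^2 \, d\sigma : v_h \to v \text{ uniformly}, \ v_h \in C^\infty(X) \Big\} \]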
9 Appendix: some measure theoretic results
In this section we list all the measure theoretic results used in the previous sections;
reference books for the content of this section are [37], [18] and [4].
Let us begin with some terminology.
(Measures) Let $X$ be a locally compact and separable metric space. We denote
by $[\mathcal{M}(X)]^m$ the space of Radon measures with values in $\mathbb{R}^m$ and with finite total
variation in $X$. We recall that the total variation measure of $\mu = (\mu_1, \ldots, \mu_m) \in [\mathcal{M}(X)]^m$
is defined by
\[ |\mu|(B) := \sup \Big\{ \sum_{i=1}^\infty |\mu(B_i)| : B = \bigsqcup_{i=1}^\infty B_i, \ B_i \in \mathcal{B}(X) \Big\} \]
and belongs to $\mathcal{M}_+(X)$. By Riesz theorem the space $[\mathcal{M}(X)]^m$ endowed with the
norm $\|\mu\| = |\mu|(X)$ is isometric to the dual of $[C(X)]^m$. The duality is given by the
integral, i.e.
\[ \langle \mu, u \rangle := \sum_{i=1}^m \int_X u_i \, d\mu_i. \]
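Recall also that, for $\mu \in \mathcal{M}_+(X)$ and $f \in [L^1(X, \mu)]^m$, the measure $f\mu \in [\mathcal{M}(X)]^m$
is defined by
\[ f\mu(B) := \int_B f \, d\mu \qquad \forall B \in \mathcal{B}(X) \]
and $|f\mu| = |f|\mu$.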
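(Push forward of measures) Let $\mu \in [\mathcal{M}(X)]^m$, let $Y$ be another metric space
and let $f : X \to Y$ be a Borel map. Then the push forward measure $f_\# \mu \in [\mathcal{M}(Y)]^m$
is defined by
\[ f_\# \mu(B) := \mu(f^{-1}(B)) \qquad \forall B \in \mathcal{B}(Y) \]
and satisfies the more general property
\[ \int_Y u \, d f_\# \mu = \int_X u \circ f \, d\mu \qquad \text{for any bounded Borel function } u : Y \to \mathbb{R}. \]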
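For a measure with finitely many atoms both formulas can be checked by hand. The following sketch assumes that a scalar discrete measure is stored as a dictionary from points to masses; this representation and the function names are hypothetical, chosen only to illustrate the definition and the change of variables formula.

```python
from collections import defaultdict


def push_forward(mu, f):
    """Push forward of a discrete measure: the mass of each atom x moves to f(x)."""
    image = defaultdict(float)
    for x, mass in mu.items():
        image[f(x)] += mass
    return dict(image)


def integrate(u, mu):
    """Integral of u with respect to a discrete measure."""
    return sum(u(x) * mass for x, mass in mu.items())


mu = {-1: 0.25, 0: 0.5, 1: 0.25}
square = lambda x: x * x
u = lambda y: 3 * y + 1

nu = push_forward(mu, square)
print(nu)
print(integrate(u, nu), integrate(lambda x: u(square(x)), mu))
```

The atoms at -1 and 1 are both sent to 1, so their masses add up, and the two integrals coincide as the change of variables formula predicts.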
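(Convolution) If $X \subset \mathbb{R}^n$, $\mu \in [\mathcal{M}(X)]^m$ and $\rho \in C_c(\mathbb{R}^n)$ (for instance a
convolution kernel), we define
\[ \mu * \rho(x) := \int_X \rho(x - y) \, d\mu(y) \qquad \forall x \in \mathbb{R}^n. \]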
and
\[ \limsup_{h \to \infty} \mu_h(X) \le \mu(X) \]
then $\int_X \varphi \, d\mu_h \to \int_X \varphi \, d\mu$ for any bounded continuous function $\varphi : X \to \mathbb{R}$.
(Measure valued maps) Let $X$, $Y$ be locally compact and separable metric
spaces. Let $y \mapsto \lambda_y$ be a map which assigns to any $y \in Y$ a measure $\lambda_y \in [\mathcal{M}(X)]^m$.
We say that $\lambda_y$ is a Borel map if $y \mapsto \lambda_y(A)$ is a real valued Borel map for any open
set $A \subset X$. By a monotone class argument it can be proved that $y \mapsto \lambda_y$ is a Borel
map if and only if
\[ y \mapsto \lambda_y(\{x : (y, x) \in B\}) \text{ is a Borel map for any } B \in \mathcal{B}(Y \times X). \qquad (54) \]
Moreover $y \mapsto |\lambda_y|$ is a Borel map whenever $y \mapsto \lambda_y$ is Borel (detailed proofs are
in §2.5 of [4]). If $\mu \in \mathcal{M}_+(Y)$, analogous statements hold for $\mathcal{B}(Y)_\mu$-measurable
measure valued maps, where $\mathcal{B}(Y)_\mu$ is the $\sigma$-algebra of $\mu$-measurable sets (in this
case one has to replace $\mathcal{B}(Y \times X)$ by $\mathcal{B}(Y)_\mu \otimes \mathcal{B}(X)$ in (54)).
(Decomposition of measures) The following result plays a fundamental role in
these notes; it is also known as disintegration theorem.
Once the decomposition theorem is known in the special case $X = Y \times Z$ and
$\pi(y, z) = y$ the general case can be easily recovered: it suffices to embed $X$ into the
product $Y \times X$ through the map $f(x) = (\pi(x), x)$ and to apply the decomposition
theorem to $\tilde\lambda = f_\# \lambda$.
Now we discuss the uniqueness of $\lambda_x$ and $\mu$ in the representation $\lambda = \lambda_x \otimes \mu$.
For simplicity we discuss only the case of positive measures.
Proof. Let $\lambda_y$, $\lambda'_y$ be satisfying (i), (ii). We have to show that $\lambda_y = \lambda'_y$ for $\mu$-a.e. $y$.
Let $(A_n)$ be a sequence of open sets stable by finite intersection which generates the
Borel $\sigma$-algebra of $X$. Choosing $A = A_n \cap \pi^{-1}(B)$, with $B \in \mathcal{B}(Y)$, in (i) gives
\[ \int_B \lambda_y(A_n) \, d\mu(y) = \int_B \lambda'_y(A_n) \, d\mu(y). \]
Since $B$ is arbitrary, we infer that $\lambda_y(A_n) = \lambda'_y(A_n)$ for $\mu$-a.e. $y$, and therefore there
exists a $\mu$-negligible set $N$ such that $\lambda_y(A_n) = \lambda'_y(A_n)$ for any $n \in \mathbb{N}$ and any
$y \in Y \setminus N$. By Proposition 1.8 of [4] we obtain that $\lambda_y = \lambda'_y$ for any $y \in Y \setminus N$.
Let B 0 B be any # -negligible set; then 1 (B 0 ) is -negligible and therefore
(ii) gives Z Z
1 0
0= y (B ) d(y) = y (X) d(y).
Y B0
0 0
As y (X) > 0 on B B this implies that (B ) = 0. Writing B = h# we
obtain = hy # and = y # . As a consequence (59) holds.
(Young measures) These measures, introduced by L.C.Young, arise in a natural
way in the study of oscillatory phenomena and in the analysis of weak limits of
nonlinear quantities (see [31] for a comprehensive introduction to this wide topic).
Specifically, assume that we are given compact metric spaces $X$, $Y$ and a sequence
of Borel maps $\psi_h : X \to Y$; in order to understand the limit behaviour of
$\psi_h$ we associate to them the measures
\[ \gamma_{\psi_h} = (\mathrm{Id} \times \psi_h)_\# \mu = \int \delta_{\psi_h(x)} \, d\mu(x) \]
= x
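A classical example may help to fix ideas; it is added here as an illustration and the concrete choices are assumptions, not taken from the text. Take $X = [0, 1]$ with the Lebesgue measure, $Y = [-1, 1]$ and the oscillating maps $\psi_h(x) = \mathrm{sign}(\sin(2\pi h x))$. These maps have no pointwise limit, but the integrals of $\varphi(x) g(\psi_h(x))$ converge to the integral of $\varphi(x) \, \frac{1}{2}(g(1) + g(-1))$, that is, the limit Young measure is $\frac{1}{2}\delta_{-1} + \frac{1}{2}\delta_1$ at every $x$. The sketch below checks this numerically with a midpoint rule; the function name is hypothetical.

```python
import math


def oscillating_average(h, phi, g, n=40000):
    """Midpoint-rule approximation of the integral over [0, 1] of
    phi(x) * g(psi_h(x)), where psi_h(x) = sign(sin(2 * pi * h * x))."""
    total = 0.0
    for i in range(n):
        x = (i + 0.5) / n
        y = 1.0 if math.sin(2.0 * math.pi * h * x) >= 0.0 else -1.0
        total += phi(x) * g(y)
    return total / n


phi = lambda x: x * x
g = lambda y: (1.0 + y) ** 2

# Value predicted by the Young measure (1/2) delta_{-1} + (1/2) delta_{1}:
# integral of x^2 over [0, 1] times (g(1) + g(-1)) / 2.
limit_value = (1.0 / 3.0) * 0.5 * (g(1.0) + g(-1.0))

for h in (1, 10, 200):
    print(h, oscillating_average(h, phi, g), limit_value)
```

For small h the integral is far from the predicted value, while for large h it approaches it, which is the kind of weak convergence the measures above are designed to capture.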
We will use the following two well known results of the theory of Young measures.
x = lim h (x) .
h
Moreover, we can choose $\psi_h$ in such a way that the measures $\psi_{h\#} \mu$ have no atom
as well.
h = pi on Xhi .
For any C(X) and C(Y ) we have
Z p Z
XX
d(h ) = (yi ) d
XY hZn i=1 Xhi
p Z Z
XX
pi (yi ) d = d
hZn i=1 XQh XY
weakly in M(X Y ).
The conclusion follows letting 0+ .
With a similar proof one can obtain a slightly more general result, namely
h := h (x) h := (x) .
References
[1] G. Alberti: On the structure of the singular set of convex functions. Calc.
Var., 2 (1994), 17–27.
[5] L. Ambrosio & B. Kirchheim: Rectifiable sets in metric and Banach spaces.
Mathematische Annalen, in press.
[10] Y. Brenier: A homogenized model for vortex sheets. Arch. Rational Mech. &
Anal., 138 (1997), 319–353.
[12] G. Bouchitté, G. Buttazzo & P. Seppecher: Energies with respect to a
measure and applications to low dimensional structures. Calc. Var., 5 (1997),
37–54.
[17] A. Cellina & S. Perrotta: On the validity of the maximum principle and of
the Euler–Lagrange equation for a minimum problem depending on the gradient.
SIAM J. Control Optim., 36 (1998), 1987–1998.
[19] L. De Pascale & A. Pratelli: Regularity properties for Monge transport
density and for solutions of some shape optimization problem. Forthcoming.
[22] L.C.Evans & W.Gangbo: Differential Equation Methods for the Monge-
Kantorovich Mass Transfer Problem. Memoirs AMS, 653, 1999.
[27] R.Jerrard & H.M.Soner: Functions of bounded n-variation. Preprint, 1999.
[28] D.G. Larman: A compact set of disjoint line segments in $\mathbb{R}^3$ whose end set
has positive measure. Mathematika, 18 (1971), 112–125.
[31] S. Müller: Variational methods for microstructure and phase transitions.
Proc. CIME summer school Calculus of variations and geometric evolution
problems, Cetraro (Italy), 1996, S. Hildebrandt and M. Struwe eds,
http:///www.mis.mpg.de/cgi-bin/lecture-notes.pl.
[32] F.Otto: The geometry of dissipative evolution equations: the porous medium
equation.
[37] W. Rudin: Real and Complex Analysis. McGraw-Hill, New York, 1974.
[38] L.Simon: Lectures on geometric measure theory, Proc. Centre for Math. Anal.,
Australian Nat. Univ., 3, 1983.
[41] C. Smith & M. Knott: On Hoeffding–Fréchet bounds and cyclic monotone
relations. J. Multivariate Anal., 40 (1992), 328–334.
[43] N.S.Trudinger & X.J.Wang: On the Monge mass transfer problem. Calc.
Var. PDE, forthcoming.
[45] L. Zajíček: On the differentiability of convex functions in finite and infinite
dimensional spaces. Czechoslovak Math. J., 29 (104) (1979), 340–348.