
SPRINGER BRIEFS IN MATHEMATICS

Alain Bensoussan
Jens Frehse
Phillip Yam

Mean Field Games and Mean Field Type Control Theory

SpringerBriefs in Mathematics

Series Editors

Krishnaswami Alladi
Nicola Bellomo
Michele Benzi
Tatsien Li
Matthias Neufang
Otmar Scherzer
Dierk Schleicher
Benjamin Steinberg
Vladas Sidoravicius
Yuri Tschinkel
Loring W. Tu
G. George Yin
Ping Zhang

SpringerBriefs in Mathematics showcases expositions in all areas of mathematics and applied mathematics. Manuscripts presenting new results or a single new result in a classical field, new field, or an emerging topic, applications, or bridges between new results and already published works, are encouraged. The series is intended for mathematicians and applied mathematicians.

For further volumes: http://www.springer.com/series/10030
Alain Bensoussan • Jens Frehse • Phillip Yam

Mean Field Games and Mean Field Type Control Theory

Alain Bensoussan
Naveen Jindal School of Management
University of Texas at Dallas
Richardson, TX, USA
Department of Systems Engineering and Engineering Management
City University of Hong Kong
Kowloon, Hong Kong SAR

Jens Frehse
Institut für Angewandte Mathematik
Universität Bonn
Bonn, Germany

Phillip Yam
Department of Statistics
The Chinese University of Hong Kong
Shatin, Hong Kong SAR

ISSN 2191-8198   ISSN 2191-8201 (electronic)
ISBN 978-1-4614-8507-0   ISBN 978-1-4614-8508-7 (eBook)
DOI 10.1007/978-1-4614-8508-7
Springer New York Heidelberg Dordrecht London
Library of Congress Control Number: 2013945868

Mathematics Subject Classification (2010): 49J20, 49N90, 58E25, 91G80, 35Q91, 49N70, 91A06, 91A13, 91A18, 91A25, 35R15, 60H30, 35R60, 60H15, 91A15, 93E20

© Alain Bensoussan, Jens Frehse, Phillip Yam 2013


This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection
with reviews or scholarly analysis or material supplied specifically for the purpose of being entered
and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of
this publication or parts thereof is permitted only under the provisions of the Copyright Law of the
Publisher’s location, in its current version, and permission for use must always be obtained from Springer.
Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations
are liable to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for
any errors or omissions that may be made. The publisher makes no warranty, express or implied, with
respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)


Preface

Mean field theory has attracted a lot of interest in recent years, since its independent introduction by Lasry–Lions and Huang–Caines–Malhamé; see, in particular, Lasry–Lions [25–27], Guéant et al. [17], Huang et al. [21, 22], Buckdahn et al. [13], Andersson–Djehiche [1], Cardaliaguet [14], Carmona–Delarue [15], Bensoussan et al. [10]. The applications concern approximating an infinite number of players with common behavior by a representative agent. This agent has to solve a control problem coupled with a field equation, representing in some way the average behavior of the infinite number of agents. The mean field term can influence the dynamics of the state equation of the agent as well as his or her objective functional. In the mean field game, the agent cannot influence the mean field term, which is considered as external. Therefore, he or she solves a standard control problem, in which the mean field term acts as a parameter. In this context one looks for an equilibrium, which means that the mean field term is the probability distribution of the state of the individual agent. The equilibrium is the core of the mathematical difficulty.
In the mean field type control problem, the agent can influence the mean field
term. The problem is thus a control problem, albeit more elaborate than the standard
control theory problem. Indeed, the state equation also contains the probability
distribution of the state and thus is of the McKean–Vlasov type; see McKean [29].
The objective of this book is to describe the major approaches to these two types of problems, together with more advanced questions for future research. In this framework, we do not present full proofs of results; rather, we describe what the mathematical problems are, where they come from, and the steps to be accomplished to obtain solutions.

Richardson, TX, USA   Alain Bensoussan
Bonn, Germany   Jens Frehse
Shatin, Hong Kong SAR   Phillip Yam

Acknowledgments

Alain Bensoussan expresses his gratitude for the financial support from the National Science Foundation under grant DMS-1303775 and from the Research Grants Council of the Hong Kong Special Administrative Region under grant GRF 500113. Phillip Yam acknowledges the financial support from the Hong Kong RGC GRF 404012 with the project title "Advance Topics in Multivariate Risk Management in Finance and Insurance" and from The Chinese University of Hong Kong Direct Grant 2011/2012, Project ID 2060444. Phillip Yam also expresses his sincere gratitude for the hospitality of the Hausdorff Center for Mathematics of the University of Bonn during his fruitful stay in the Hausdorff Trimester Program "Stochastic Dynamics in Economics and Finance".

Contents

1 Introduction .... 1
2 General Presentation of Mean Field Control Problems .... 7
   2.1 Model and Assumptions .... 7
   2.2 Definition of the Problems .... 8
3 The Mean Field Games .... 11
   3.1 HJB-FP Approach .... 11
   3.2 Stochastic Maximum Principle .... 12
4 The Mean Field Type Control Problems .... 15
   4.1 HJB-FP Approach .... 15
   4.2 Other Approaches .... 18
   4.3 Stochastic Maximum Principle .... 21
   4.4 Time Inconsistency Approach .... 23
5 Approximation of Nash Games with a Large Number of Players .... 31
   5.1 Preliminaries .... 31
   5.2 System of PDEs .... 32
   5.3 Independent Trajectories .... 32
   5.4 General Case .... 36
   5.5 Nash Equilibrium Among Local Feedbacks .... 42
6 Linear Quadratic Models .... 45
   6.1 Setting of the Model .... 45
   6.2 Solution of the Mean Field Game Problem .... 45
   6.3 Solution of the Mean Field Type Problem .... 48
   6.4 The Mean Variance Problem .... 51
   6.5 Approximate N Player Differential Game .... 57
7 Stationary Problems .... 59
   7.1 Preliminaries .... 59
   7.2 Mean Field Game Set-Up .... 60
   7.3 Additional Interpretations .... 61
   7.4 Approximate N Player Nash Equilibrium .... 64
8 Different Populations .... 67
   8.1 General Considerations .... 67
   8.2 Multiclass Agents .... 67
   8.3 Major Player .... 72
      8.3.1 General Theory .... 72
      8.3.2 Linear Quadratic Case .... 82
9 Nash Differential Games with Mean Field Effect .... 89
   9.1 Description of the Problem .... 89
   9.2 Mathematical Problem .... 89
   9.3 Interpretation .... 91
   9.4 Another Interpretation .... 93
   9.5 Generalization .... 94
   9.6 Approximate Nash Equilibrium for Large Communities .... 95
10 Analytic Techniques .... 99
   10.1 General Set-Up .... 99
      10.1.1 Assumptions .... 99
      10.1.2 Weak Formulation .... 101
   10.2 A Priori Estimates for u .... 102
      10.2.1 L^∞ Estimate for u .... 102
      10.2.2 L^2(W^{1,2}) Estimate for u .... 103
      10.2.3 C^α Estimate for u .... 103
      10.2.4 L^p(W^{2,p}) Estimate for u .... 106
   10.3 A Priori Estimates for m .... 110
      10.3.1 L^2(W^{1,2}) Estimate .... 110
      10.3.2 L^∞(L^∞) Estimates .... 111
      10.3.3 Further Estimates .... 115
      10.3.4 Statement of the Global A Priori Estimate Result .... 115
   10.4 Existence Result .... 116

References .... 125

Index .... 127
Chapter 1
Introduction

Mean field games and mean field type control introduce new problems in control
theory. The term “games” is appealing, although maybe confusing. In fact, mean
field games are control problems, in the sense that one is interested in a single
decision maker, who we call the representative agent. However, these problems
are not standard, since both the evolution of the state and the objective functional
are influenced by terms that are not directly related to the state or to the control of
the decision maker. They are, however, indirectly related to the decision maker,
in the sense that they model a very large community of agents similar to the
representative agent. All the agents behave similarly and impact the representative
agent. However, because of the large number, an aggregation effect takes place. The
interesting consequence is that the impact of the community can be modelled by a
mean field term, but when this is done the problem is reduced to a control problem.
Note that the concept of mean field is very fruitful in physics. However, applying
the idea of averaging to domains different from physics is the novelty. The idea is
also different from the concept of equilibrium in economics because of both the
emphasis on dynamic aspects (as in physics) and the direct effect of the mean field
term on the state evolution of the agent.
Of course, an important question is whether the mean field control problem is a good approximation, and if so, of what exactly. This is an essential aspect of the theory. It can be done for mean field games, and it explains the terminology. What can be expected is that if one considers a Nash game for identical agents, the equilibrium can be approximated by the solution of the mean field game, whenever all agents use the same control, defined by the optimal control of the representative agent. The approximation improves as the number of agents increases. This basic result is the justification of the concept of mean field games. The interest of the mean field type control problem is that it is a control problem and not an equilibrium. Consequently, a solution may be found more often (at least some approximate solution).
So to sum up, mean field games can be reduced to a standard control problem
and an equilibrium, and mean field type control is a nonstandard control problem.


To solve these problems, one naturally can rely on the techniques of control theory: dynamic programming and the stochastic maximum principle. It is interesting to observe that in the literature those papers dealing with mean field games use dynamic programming and those that deal with mean field type control use the stochastic maximum principle. The reason is that in the mean field games problem, the control problem is standard, since the mean field term is external. It is then natural to write a Hamilton–Jacobi–Bellman (HJB) equation. The optimal control is derived from a feedback. The mean field terms involve a probability distribution, which in the equilibrium is governed by a Fokker–Planck (FP) equation.
It represents the probability distribution of the state. This equation depends on the
feedback and also on the solution of the HJB equation. So the problem becomes a
coupled system of partial differential equations, one HJB equation, and one Fokker–
Planck equation. This is the mathematical problem to be solved. Of course, for
a standard control problem, one can use the stochastic maximum principle as an
alternative to the HJB equation. This has been done recently in [15] and [10]. One
obtains again a coupling, and there is a fixed point to be sought. In the mean
field type control problem, one uses the stochastic maximum principle and not
dynamic programming, because the control problem is not standard. In particular,
there is “time inconsistency”. The optimality principle of dynamic programming is
not valid, and therefore one cannot write an HJB equation for such a nonstandard
stochastic control problem. Since the stochastic maximum principle does not rely
on the dynamic programming optimality principle, the time inconsistency is not
an issue.
We show in this book that we can obtain an HJB equation, but only when
coupled with a Fokker-Planck equation. This is not a contradiction of the previous
discussion. What cannot be obtained is a single HJB equation, because the
optimality principle is not valid. However a coupled system of HJB and Fokker–
Planck equations can be obtained, exactly like those in the case of mean field
games. We show that this result is fully consistent with the results obtained from
the stochastic maximum principle, and in fact the stochastic maximum principle
can be recovered by this approach.
So this is the first contribution of this book—to present a unified approach
for both problems, using either HJB- and FP-coupled equations or the stochastic
maximum principle.
It is important to stress that the coupled system of partial differential equations
(HJB-FP) is different for both problems. The HJB-FP system for the mean field
type control problem contains additional terms, so it is more complex than that of
the mean field game. It had not appeared earlier in the literature, and therefore no
results were known. However, the general structures of the two coupled systems are
similar, so one can expect solutions for both, even paradoxically more often for the
mean field type problem, because it is a control problem and not an equilibrium.
We shall show this in the case of linear quadratic problems, for which everything is
explicit.
We then explain in the case of mean field games that what is to be done is to show
that the optimal solution derived from the coupled HJB-FP equations can be used as

a good approximation for the Nash equilibrium of a large number of identical agents.
Recalling that the representative agent gets an optimal feedback, the most natural
way to design the Nash equilibrium is to use identical feedbacks for all agents. This
works provided that the feedback is sufficiently smooth, and thus one obtains a Nash
equilibrium among smooth feedbacks. To avoid this smoothness difficulty, one can
introduce open loop controls derived from the feedback, and then consider a Nash
equilibrium among these open loop controls.
We conjecture that the mean field type optimal control can also be used to provide
an approximation for the Nash equilibrium of a large number of agents. However,
no such result exists in the literature.
This issue of “time inconsistency” has been considered in the literature independently of the mean field theory; see Björk–Murgoci [11] and Hu–Jin–Zhou [18]. The objective is to consider feedbacks that do not depend on the initial condition, as in standard control theory. A solution that depends on the initial conditions is labelled “precommitment” in the economics literature. One way to obtain only time-consistent feedbacks is to reformulate the control problem as a game problem.
one player at a time. Therefore, at each time, the decision maker is a player who
looks for a Nash equilibrium against players at future times. We shall see that one
can handle only a limited number of situations, as far as coupling is concerned.
Another way to understand the concept is to consider optimality against “spike
modifications.”
We next review the situation of linear quadratic problems. The interesting aspect
is that all solutions can be obtained explicitly and thus can be compared. It is
particularly useful to compare the assumptions. They are not at all equivalent. We do
not have the situation in which we can identify the best approach. It is a case-by-case
situation.
We explain next why the coupled HJB-FP system is written naturally as a
parabolic system, even when we consider an infinite horizon.
To derive stationary (elliptic) systems, Lasry–Lions have considered an ergodic control case. We show in this book that it is possible to consider elliptic coupled HJB-FP equations without using ergodic situations. The problems are less natural, but we gain in simplicity.
In fact, it is then possible to benefit in this framework from other interpretations
of the coupled HJB-FP system. If one considers the dual control problem, then the
state equation is the Fokker–Planck equation, describing the probability distribution
of the state, while the HJB equation can be interpreted as a necessary condition
of optimality for the dual problem. To generate mean field terms, it suffices to consider objective functions that are not merely linear in the probability distribution, but more complex.
This approach also has a different type of application. In the traditional stochastic
control problem, the objective functional is the expected value of a cost depending
on the trajectory. So it is linear in the probability measure. This type of functional
leaves out many current considerations in control theory, namely situations where
one wants to take into consideration not just the expected value but also the variance.
This case occurs often in risk management.

The famous mean-variance optimization problem (i.e., the Markowitz problem) is an example. In addition, one may be interested in several functionals along the
trajectory, even though one may be satisfied with expected values. Combining these
various expected values into a single payoff, one is led naturally to mean field
problems. They are meaningful even without considering ergodic theory, i.e., long-
term behavior.
We then address future important extensions.
In most real problems of economics, there is not just one representative agent and
a large community of identical players, which bring impact via a mean field term.
There are several major/dominating players, as well as large communities.
So a natural question is to consider the problem of these major players. They
know that they can influence the community, and they also compete with each
other. So the issue is that of differential games, with mean field terms, and not of
mean field equations arising from the limit of a Nash equilibrium for an infinite
number of players. Huang–Caines–Malhamé [21] allow for groups of players with
different characteristics, but they do not compete with one another. Huang [19] and Nourian–Caines [31] study the situation of a major player. The representative agent is subject to the influence of the major player, and the major player will take this into account in his or her decision. The problem has similarities with Stackelberg games. However, the state probabilities have to be replaced with conditional probabilities, which is much more complex.
In the context of coalitions competing with each other, the objective in this work
is to present systems of HJB equations, coupled with systems of FP equations.
We explain how they can be obtained from averaging large homogeneous communities, which compete with one another. This type of problem has not yet been addressed
in the literature, except in the paper by two of the authors, [7].
To recover the system of nonlinear PDEs it is easier to proceed with the
dual problems as explained above. One can consider a differential game for state
equations that are probability distributions of states and evolve according to FP
equations. One recovers nonlinear systems of PDEs with mean field terms, with
a different motivation. An additional interesting feature of this approach is that we
do not need to consider an ergodic situation. In fact, considering strictly positive
discounts is quite meaningful in our applications. This leads to systems of nonlinear
PDEs with mean field coupling terms, which we can study with a minimum set of
assumptions. The ergodic case, when the discount vanishes, requires much more
stringent assumptions, as is already the case when there is no mean field term.
We refer to Bensoussan–Frehse [5, 6] and Bensoussan–Frehse–Vogelgesang [8, 9]
for the situation without the mean field terms. Basically our set of assumptions
remains valid and we have to incorporate additional assumptions to deal with the
mean field terms.
We then provide an overview of the analytic techniques to solve for the systems
of HJB-FP equations. We do it only in a limited number of situations.
An interesting aspect of our approach is to proceed with a priori estimates.
We use fixed-point theory only for approximations, which is much easier.

It is clear that a lot remains to be done in developing models, techniques, and applications. For instance, we have not considered the issue of systemic risk (see [16]), which bears similarity with mean field theory. The objective is, however, different: in systemic risk, one is interested in the consequences of a random shock on an equilibrium within a network of interactions. Similarly, among the interesting new techniques, let us mention sensitivity (see [23, 24]), which leads to useful results. Unfortunately, within the page limitation of the present discussion, we could not cover these possible applications.
We hope that, in spite of its limitations, the present synthesis will further
understanding of the diversity of concepts and problems. We want to express our
gratitude to those who initiated the domain, Caines–Huang–Malhamé on the one
hand, and Lasry–Lions on the other hand, for their inspiring work. The number of
papers that have originated from their initial articles and lectures is the best evidence
of the importance of their ideas.
Chapter 2
General Presentation of Mean Field Control Problems

2.1 Model and Assumptions

Consider a probability space (Ω, A, P) and a filtration F^t generated by an n-dimensional standard Wiener process w(t). The state space is R^n, with generic notation x, and the control space is R^d, with generic notation v. We consider measurable functions

g(x, m, v) : R^n × L^1(R^n) × R^d → R^n;   σ(x) : R^n → L(R^n; R^n);
f(x, m, v) : R^n × L^1(R^n) × R^d → R;   h(x, m) : R^n × L^1(R^n) → R. (2.1)

These functions may depend on time. We omit this dependence to save notation. We assume that

σ(x), σ^{-1}(x) are bounded. (2.2)

The argument m will accommodate the mean field term. In practice, it will be a probability density on R^n. One could replace L^1(R^n) by a space L^p(R^n), 1 < p < ∞. At some point, we shall need to extend these functions to arguments that are probability measures not having densities with respect to the Lebesgue measure, for example, a finite sum of Dirac measures. The space of probability measures will be equipped with the topology of weak* convergence. In this situation we lose the Banach space structure, so we stick to densities as much as possible. Consider a function m(t) ∈ C(0, T; L^1(R^n)). We pick a feedback control in the form v(x, t), a measurable map from R^n × (0, T) to R^d. To save notation we shall write it v(x). We solve the stochastic differential equation (SDE)

dx = g(x(t), m(t), v(x(t))) dt + σ(x(t)) dw(t),
x(0) = x0. (2.3)


We will need to assume that

g, σ, f, h are differentiable in both x and v. (2.4)

With this assumption, if the feedback is Lipschitz, the SDE has a unique solution x(t), which is a continuous process adapted to the filtration F^t. This is the state of the system under the control defined by the feedback v(.). The initial state x0 is a random variable independent of the Wiener process, whose probability density is given by m0. Note that the function m(t) acts as a parameter. We assume, however, that m(0) = m0.
To the pair (v(.), m(.)) we associate the control objective

J(v(.), m(.)) = E [ ∫_0^T f(x(t), m(t), v(x(t))) dt + h(x(T), m(T)) ]. (2.5)

To define this functional properly, we need to assume that

g has linear growth in x, v;
f, h have quadratic growth in x, v. (2.6)

Recall that m(t) is a deterministic function.

2.2 Definition of the Problems

The mean field game problem is defined as follows: find a pair (v̂(.), m(.)) such that, denoting by x̂(.) the solution of

dx̂ = g(x̂(t), m(t), v̂(x̂(t))) dt + σ(x̂(t)) dw(t),
x̂(0) = x0, (2.7)

we have

m(t) is the probability distribution of x̂(t), ∀t ∈ [0, T];
J(v̂(.), m(.)) ≤ J(v(.), m(.)), ∀v(.). (2.8)

The mean field type control problem is defined as follows: for any feedback v(.), let x(t) = x_{v(.)}(t) be the solution of (2.3) with m(t) equal to the probability distribution of x_{v(.)}(t), so that (2.3) becomes a McKean–Vlasov equation. Denoting by m_{v(.)}(t) the probability distribution of x_{v(.)}(t), we thus have

dx_{v(.)} = g(x_{v(.)}(t), m_{v(.)}(t), v(x_{v(.)}(t))) dt + σ(x_{v(.)}(t)) dw(t),
x_{v(.)}(0) = x0, (2.9)

m_{v(.)}(t) = probability distribution of x_{v(.)}(t), (2.10)

and we have to find v̂(.) such that

J(v̂(.), m_{v̂(.)}(.)) ≤ J(v(.), m_{v(.)}(.)), ∀v(.). (2.11)

If we denote x̂(t) = x_{v̂(.)}(t) and m(t) = m_{v̂(.)}(t), then we can write

m(t) is the probability distribution of x̂(t), ∀t ∈ [0, T];
J(v̂(.), m(.)) ≤ J(v(.), m_{v(.)}(.)), ∀v(.). (2.12)

It is useful to compare (2.8) with (2.12). The difference takes place in the right-hand
side of the inequality in the second condition. For a mean field game, m(.) is fixed,
whereas for mean field type control, m depends on v(.).
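To make the difference concrete, here is a minimal special case (it anticipates the structure studied in Sect. 4.4): suppose g does not depend on m and

f(x, m, v) = f0(x, v) + F ( ∫ ξ m(ξ) dξ ).

In the mean field game, m(.) is frozen, so the term F(∫ ξ m(ξ, t) dξ) is a fixed function of time that does not affect the minimization; v̂(.) solves the standard control problem for f0, and the equilibrium condition in (2.8) only determines m(.) afterwards. In the mean field type control problem, m = m_{v(.)}, so the same term becomes F(E x_{v(.)}(t)) and genuinely depends on the control.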
Remark 1. We can see that in our control problems we do not control the term σ(x), nor does it contain the mean field term m. This will considerably simplify the mathematical treatment. Besides, there is no degeneracy. This will allow us to work with densities and to benefit from the nice structure of L^1(R^n), instead of having to work with the space of probability measures on R^n.
Chapter 3
The Mean Field Games

3.1 HJB-FP Approach

Let us set

a(x) = (1/2) σ(x) σ*(x), (3.1)

and introduce the second-order differential operator

A ϕ(x) = −tr (a(x) D²ϕ(x)). (3.2)

We define the dual operator

A* ϕ(x) = − ∑_{k,l=1}^{n} ∂²/(∂x_k ∂x_l) (a_{kl}(x) ϕ(x)). (3.3)

Since m(t) is the probability distribution of x̂(t), it has a density with respect to the Lebesgue measure, denoted by m(x, t), which is the solution of the Fokker–Planck equation

∂m/∂t + A*m + div (g(x, m, v̂(x)) m) = 0,
m(x, 0) = m0(x). (3.4)
We next want the feedback v̂(x) to solve a standard control problem, in which m appears as a parameter. We can thus readily associate an HJB equation with this problem, parametrized by m. We introduce the Lagrangian function

L(x, m, v, q) = f(x, m, v) + q · g(x, m, v), (3.5)

and the Hamiltonian function

H(x, m, q) = inf_v L(x, m, v, q). (3.6)


In this context, we shall assume that the infimum is attained and that we can define a sufficiently smooth function v̂(x, m, q) such that

H(x, m, q) = L(x, m, v̂(x, m, q), q). (3.7)

We shall also write

G(x, m, q) = g(x, m, v̂(x, m, q)). (3.8)
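As a simple illustration (our construction, not a model from the text): if f(x, m, v) = f0(x, m) + |v|²/2 and g(x, m, v) = v, then L(x, m, v, q) = f0(x, m) + |v|²/2 + q · v, the infimum in (3.6) is attained at v̂(x, m, q) = −q, and

H(x, m, q) = f0(x, m) − |q|²/2,   G(x, m, q) = −q.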

From standard control theory, the HJB equation is defined by

−∂u/∂t + Au = H(x, m, Du),
u(x, T) = h(x, m(T)), (3.9)

and setting v̂(x) = v̂(x, m, Du), we can assert, from standard arguments,

J(v̂(.), m(.)) = ∫_{R^n} u(x, 0) m0(x) dx ≤ J(v(.), m(.)), ∀v(.). (3.10)

Therefore, to solve the mean field game problem, we have to solve the coupled HJB-FP system of PDEs

−∂u/∂t + Au = H(x, m, Du),
∂m/∂t + A*m + div (G(x, m, Du) m) = 0,
u(x, T) = h(x, m(T)),
m(x, 0) = m0(x), (3.11)

and the pair v̂(x) = v̂(x, m, Du), m is the solution. We can then proceed backwards
as usual in control theory. We solve the system (3.11), which gives us a candidate
v̂, m, and then rely on a verification argument to check that it is indeed the solution.
Since we rely on PDE techniques, it is essential that the operator A be linear and
not degenerate.
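To make the structure of (3.11) concrete, here is a schematic numerical sketch (our construction, not taken from the book): a damped Picard iteration that alternates a backward sweep for the HJB equation and a forward sweep for the Fokker–Planck equation, in one space dimension and for the quadratic model H(x, m, q) = f0(x, m) − q²/2, G(x, m, q) = −q of the illustration above. The periodic domain, the coupling f0, the zero terminal cost, and the explicit finite-difference scheme are all illustrative assumptions.

```python
import numpy as np

# Schematic solver for the coupled HJB-FP system (3.11) in one dimension,
# for the illustrative quadratic model H(x, m, q) = f0(x, m) - q^2/2 and
# G(x, m, q) = -q. Periodic domain [0, 1), explicit Euler time stepping,
# centered differences in space; parameters satisfy the explicit CFL bound.

N, NT, T = 100, 2000, 1.0          # space points, time steps, horizon
dx, dt = 1.0 / N, T / NT
x = np.linspace(0.0, 1.0, N, endpoint=False)
nu = 0.05                          # constant diffusion a(x) = nu

def f0(x, m):                      # illustrative local mean field coupling
    return 0.5 * (x - 0.5) ** 2 + 0.1 * m

m0 = np.exp(-50.0 * (x - 0.3) ** 2)
m0 /= m0.sum() * dx                # normalize the initial density

def Dx(u):                         # centered first derivative (periodic)
    return (np.roll(u, -1) - np.roll(u, 1)) / (2.0 * dx)

def Dxx(u):                        # second derivative (periodic)
    return (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / dx ** 2

m_path = np.tile(m0, (NT + 1, 1))  # initial guess for t -> m(., t)
for it in range(50):               # damped Picard (fixed-point) iteration
    # backward HJB sweep: -u_t - nu u'' = f0(x, m) - (u')^2 / 2, u(T) = 0
    u = np.zeros(N)
    u_path = np.zeros((NT + 1, N)); u_path[NT] = u
    for k in range(NT - 1, -1, -1):
        u = u + dt * (nu * Dxx(u) + f0(x, m_path[k + 1]) - 0.5 * Dx(u) ** 2)
        u_path[k] = u
    # forward FP sweep: m_t - nu m'' + (G m)' = 0 with feedback G = -u'
    m = m0.copy()
    m_new = np.zeros((NT + 1, N)); m_new[0] = m
    for k in range(NT):
        m = m + dt * (nu * Dxx(m) - Dx(-Dx(u_path[k]) * m))
        m_new[k + 1] = m
    residual = np.abs(m_new - m_path).max()
    if residual < 1e-8:
        break
    m_path = 0.5 * m_path + 0.5 * m_new   # damping helps convergence

print("fixed-point residual after iteration:", residual)
```

The damping in the update of m is a common device to help the fixed-point iteration converge; neither convergence of the iteration nor uniqueness of the solution is guaranteed in general.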

3.2 Stochastic Maximum Principle

We use notation that is customary in stating the stochastic maximum principle. From the feedback v̂(.) and the probability distribution m(t), we construct stochastic processes X(t) ∈ R^n, V(t) ∈ R^d, Y(t) ∈ R^n, Z(t) ∈ L(R^n; R^n), which are defined as follows:

X(t) = x̂(t),   m(t) = P_{X(t)},

in which the notation P_{X(t)} means the probability distribution of the random variable X(t). We next define

Y(t) = Du(X(t), t),   V(t) = v̂(X(t), P_{X(t)}, Y(t)),

and finally

Z(t) = D²u(X(t), t) σ(X(t)).

We first have, from (2.7),

dX = g(X(t), P_{X(t)}, V(t)) dt + σ(X(t)) dw(t),
X(0) = x0. (3.12)

Using Ito's formula, we next write

dY_i(t) = ∑_k ∂²u/(∂x_i ∂x_k) dX_k + ( ∂²u/(∂x_i ∂t) + ∑_{k,l} a_{kl} ∂³u/(∂x_i ∂x_k ∂x_l) ) dt.

We differentiate the HJB equation in x_i to evaluate ∂²u/(∂x_i ∂t). We get

−∂²u/(∂x_i ∂t) − ∑_{k,l} (∂a_{kl}/∂x_i) ∂²u/(∂x_k ∂x_l) − ∑_{k,l} a_{kl} ∂³u/(∂x_i ∂x_k ∂x_l)
= ∂f/∂x_i (x, m, v̂(x)) + ∑_k g_k(x, m, v̂(x)) ∂²u/(∂x_i ∂x_k) + ∑_k (∂u/∂x_k) ∂g_k/∂x_i (x, m, v̂(x)).

Combining terms and using the definitions of Y(t) and Z(t), we easily obtain

−dY = [ ∂f/∂x (X(t), P_{X(t)}, V(t)) + (∂g/∂x)* (X(t), P_{X(t)}, V(t)) Y(t) + tr ( (∂σ/∂x (X(t)))* Z(t) ) ] dt − Z(t) dw(t),
Y(T) = ∂h/∂x (X(T), P_{X(T)}).
In the context of the stochastic maximum principle, it is customary to call Hamiltonian what is called Lagrangian in the context of dynamic programming. Because of this, we call Hamiltonian the function

H(x, m, v, q) = f(x, m, v) + q · g(x, m, v). (3.13)


We can collect results and summarize the stochastic maximum principle as follows: find adapted processes X(t) ∈ R^n, V(t) ∈ R^d, Y(t) ∈ R^n, Z(t) ∈ L(R^n; R^n) such that

dX = g(X(t), P_{X(t)}, V(t)) dt + σ(X(t)) dw(t),
−dY = [ ∂H/∂x (X(t), P_{X(t)}, V(t), Y(t)) + tr ( (∂σ/∂x (X(t)))* Z(t) ) ] dt − Z(t) dw(t),
X(0) = x0,   Y(T) = ∂h/∂x (X(T), P_{X(T)}); (3.14)

V(t) minimizes H(X(t), P_{X(t)}, v, Y(t)) in v. (3.15)

This is a forward–backward SDE of the McKean–Vlasov type.


The usual approach is not to derive this problem from the HJB-FP system, which requires a lot of smoothness that is not really necessary, but to study (3.14), (3.15) directly, by probabilistic techniques (see [15, 31]).
Example 2. Suppose

f(x, m, v) = f ( x, ∫_{R^n} ϕ(ξ) m(ξ) dξ, v ),   h(x, m) = h ( x, ∫_{R^n} ψ(ξ) m(ξ) dξ ),
g(x, m, v) = g ( x, ∫_{R^n} χ(ξ) m(ξ) dξ, v ), (3.16)

where ϕ, ψ, χ map R^n into R^n. By abuse of notation, we describe in the same way the functions f(x, ξ, v) on R^n × R^n × R^d and f(x, m, v) on R^n × L^1(R^n) × R^d, and similarly for h and g. We can then write the stochastic maximum principle (3.14), (3.15) as follows:

find adapted processes X(t) ∈ R^n, V(t) ∈ R^d, Y(t) ∈ R^n, Z(t) ∈ L(R^n; R^n) such that

dX = g(X(t), E χ(X(t)), V(t)) dt + σ(X(t)) dw(t),
−dY = [ ∂f/∂x (X(t), E ϕ(X(t)), V(t)) + (∂g/∂x)* (X(t), E χ(X(t)), V(t)) Y(t) + tr ( (∂σ/∂x (X(t)))* Z(t) ) ] dt − Z(t) dw(t),
X(0) = x0,   Y(T) = ∂h/∂x (X(T), E ψ(X(T))); (3.17)

V(t) minimizes f(X(t), E ϕ(X(t)), v) + Y(t) · g(X(t), E χ(X(t)), v) in v. (3.18)
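A minimal special case of (3.16): taking ϕ(ξ) = ψ(ξ) = χ(ξ) = ξ gives ∫_{R^n} ξ m(ξ) dξ = E X(t), so the mean field enters only through the mean of the state, and the Jacobian ∂ϕ/∂x reduces to the identity; this is the structure of the linear quadratic models of Chap. 6.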


Chapter 4
The Mean Field Type Control Problems

4.1 HJB-FP Approach

We need to assume that the maps

m → f(x, m, v), g(x, m, v), h(x, m) are differentiable in m ∈ L²(R^n), (4.1)

and we use the notation ∂f/∂m (x, m, v)(ξ) to represent the derivative, so that

(d/dθ) f(x, m + θ m̃, v) |_{θ=0} = ∫_{R^n} ∂f/∂m (x, m, v)(ξ) m̃(ξ) dξ.

Here, x, v are simply parameters. Coming back to the definition (2.9)–(2.11), consider a feedback v(x) and the corresponding trajectory defined by (2.9). The probability distribution m_{v(.)}(t) of x_{v(.)}(t) is a solution of the FP equation

∂m_{v(.)}/∂t + A* m_{v(.)} + div (g(x, m_{v(.)}, v(x)) m_{v(.)}) = 0,
m_{v(.)}(x, 0) = m0(x), (4.2)

and the objective functional J(v(.), m_{v(.)}(.)) can be expressed as follows:

J(v(.), m_{v(.)}(.)) = ∫_0^T ∫_{R^n} f(x, m_{v(.)}, v(x)) m_{v(.)}(x) dx dt + ∫_{R^n} h(x, m_{v(.)}(T)) m_{v(.)}(x, T) dx. (4.3)


Consider an optimal feedback v̂(x) and the corresponding probability density m_{v̂(.)}(x, t) = m(x, t). Then let v(.) be any feedback and consider the perturbation v̂(x) + θ v(x). We want to compute

(d/dθ) m_{v̂(.)+θ v(.)}(x) |_{θ=0} = m̃(x).

We can check that

∂m̃/∂t + A* m̃ + div (g(x, m, v̂(x)) m̃)
+ div ( [ ∫ ∂g/∂m (x, m, v̂(x))(ξ) m̃(ξ) dξ + ∂g/∂v (x, m, v̂(x)) v(x) ] m(x) ) = 0,
m̃(x, 0) = 0. (4.4)

It then follows that

(d/dθ) J(v̂(.) + θ v(.), m_{v̂(.)+θ v(.)}(.)) |_{θ=0}
= ∫_0^T ∫_{R^n} ∫_{R^n} ∂f/∂m (x, m, v̂(x))(ξ) m̃(ξ) m(x) dξ dx dt
+ ∫_0^T ∫_{R^n} f(x, m, v̂(x)) m̃(x) dx dt + ∫_0^T ∫_{R^n} ∂f/∂v (x, m, v̂(x)) v(x) m(x) dx dt
+ ∫_{R^n} h(x, m(T)) m̃(x, T) dx + ∫_{R^n} ∫_{R^n} ∂h/∂m (x, m(T))(ξ) m̃(ξ, T) m(x, T) dξ dx. (4.5)

We introduce the function u(x, t), solution of

−∂u/∂t + Au − g(x, m, v̂(x)) · Du − ∫_{R^n} Du(ξ) · ∂g/∂m (ξ, m, v̂(ξ))(x) m(ξ) dξ
= f(x, m, v̂(x)) + ∫_{R^n} ∂f/∂m (ξ, m, v̂(ξ))(x) m(ξ) dξ,
u(x, T) = h(x, m(T)) + ∫_{R^n} ∂h/∂m (ξ, m(T))(x) m(ξ, T) dξ; (4.6)
hence from (4.5)

(d/dθ) J(v̂(.) + θ v(.), m_{v̂(.)+θ v(.)}(.)) |_{θ=0}
= ∫_0^T ∫_{R^n} ∂f/∂v (x, m, v̂(x)) v(x) m(x) dx dt
+ ∫_0^T ∫_{R^n} [ −∂u/∂t + Au − g(x, m, v̂(x)) · Du − ∫_{R^n} Du(ξ) · ∂g/∂m (ξ, m, v̂(ξ))(x) m(ξ) dξ ] m̃(x) dx dt
+ ∫_{R^n} u(x, T) m̃(x, T) dx.

Using (4.4), we deduce

(d/dθ) J(v̂(.) + θ v(.), m_{v̂(.)+θ v(.)}(.)) |_{θ=0}
= ∫_0^T ∫_{R^n} ∂f/∂v (x, m, v̂(x)) v(x) m(x) dx dt
− ∫_0^T ∫_{R^n} u(x) div ( ∂g/∂v (x, m, v̂(x)) v(x) m(x) ) dx dt;

hence

(d/dθ) J(v̂(.) + θ v(.), m_{v̂(.)+θ v(.)}(.)) |_{θ=0}
= ∫_0^T ∫_{R^n} ∂f/∂v (x, m, v̂(x)) v(x) m(x) dx dt
+ ∫_0^T ∫_{R^n} Du(x) · ∂g/∂v (x, m, v̂(x)) v(x) m(x) dx dt.

Since v̂(.) is optimal, this expression must vanish for any v(.). Hence necessarily

∂f/∂v (x, m, v̂(x)) + (∂g/∂v)* (x, m, v̂(x)) Du(x) = 0. (4.7)

It follows that (at least under convexity assumptions)

v̂(x) = v̂(x, m, Du(x)). (4.8)

We note that

f(x, m, v̂(x)) + g(x, m, v̂(x)) · Du = H(x, m, Du), (4.9)

∫_{R^n} [ ∂f/∂m (ξ, m, v̂(ξ))(x) + Du(ξ) · ∂g/∂m (ξ, m, v̂(ξ))(x) ] m(ξ) dξ
= ∫_{R^n} ∂H/∂m (ξ, m, Du(ξ))(x) m(ξ) dξ. (4.10)

We also note that

g(x, m, v̂(x)) = g(x, m, v̂(x, m, Du(x))) = G(x, m, Du). (4.11)

Going back to (4.2), written for v(.) = v̂(.), we can finally write the system of HJB-FP PDEs

−∂u/∂t + Au = H(x, m, Du) + ∫_{R^n} ∂H/∂m (ξ, m, Du(ξ))(x) m(ξ) dξ,
u(x, T) = h(x, m(T)) + ∫_{R^n} ∂h/∂m (ξ, m(T))(x) m(ξ, T) dξ,
∂m/∂t + A*m + div (G(x, m, Du) m) = 0,
m(x, 0) = m0(x). (4.12)

We can compare the system (4.12) with (3.11): they differ through the partial derivatives of H and h with respect to m. The optimal feedback is

v̂(x) = v̂(x, m, Du(x)). (4.13)

Remark 3. Although they are similar, the systems (3.11) and (4.12) have been
derived in a completely different manner. This is related to the time inconsistency
issue. The derivation of (3.11) follows the standard pattern of dynamic program-
ming, since m is external. This approach cannot be used to derive (4.12), which is
obtained as a necessary condition of optimality.
The system (4.12) has not appeared in the literature, which relies on the stochastic
maximum principle. This is natural, since we express merely necessary conditions
of optimality.
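A minimal illustration of the difference (our choice of model): take g(x, m, v) = v and f(x, m, v) = |v|²/2 + ∫ k(x, ζ) m(ζ) dζ for a given smooth kernel k. Then H(x, m, q) = ∫ k(x, ζ) m(ζ) dζ − |q|²/2, the derivative in m is ∂H/∂m (ξ, m, q)(x) = k(ξ, x), and the HJB equation in (4.12) reads

−∂u/∂t + Au = ∫ k(x, ξ) m(ξ) dξ − |Du|²/2 + ∫ k(ξ, x) m(ξ) dξ,

whereas the mean field game equation in (3.11) contains only the first integral. In particular, for a symmetric kernel the interaction term is doubled: the mean field type control takes into account the effect of the representative agent's distribution on the whole population.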

4.2 Other Approaches

We first give another formula for the functional (4.3) and its Gateaux derivative (4.5). For a given feedback control v(.), we introduce the linear equation

−∂u_{v(.)}/∂t + Au_{v(.)} − g(x, m_{v(.)}(t), v(x)) · Du_{v(.)} = f(x, m_{v(.)}(t), v(x)),
u_{v(.)}(x, T) = h(x, m_{v(.)}(T)). (4.14)

It is then easy to check, combining (4.2) and (4.14), that the functional (4.3) can be written as follows:

J(v(.), m_{v(.)}(.)) = ∫_{R^n} u_{v(.)}(x, 0) m0(x) dx. (4.15)

We can then give another expression for the Gateaux differential. We have considered the Gateaux differential [recall that m_{v̂(.)}(x) = m(x)]

(d/dθ) m_{v̂(.)+θ v(.)}(x) |_{θ=0} = m̃(x).

We can consider similarly

(d/dθ) u_{v̂(.)+θ v(.)}(x) |_{θ=0} = ũ(x),

and ũ(x) is the solution of the linear equation

−∂ũ/∂t + Aũ − g(x, m(t), v̂(x)) · Dũ
− [ ∫ ∂g/∂m (x, m(t), v̂(x))(ξ) m̃(ξ) dξ + ∂g/∂v (x, m(t), v̂(x)) v(x) ] · Du_{v̂(.)}(x)
= ∫ ∂f/∂m (x, m(t), v̂(x))(ξ) m̃(ξ) dξ + ∂f/∂v (x, m(t), v̂(x)) v(x),
ũ(x, T) = ∫ ∂h/∂m (x, m(T))(ξ) m̃(ξ, T) dξ. (4.16)

We can check the relation

(d/dθ) J(v̂(.) + θ v(.), m_{v̂(.)+θ v(.)}(.)) |_{θ=0} = ∫ ũ(x, 0) m0(x) dx. (4.17)

In fact, formulas (4.5) and (4.17) coincide. We leave it to the reader to check this. Computation of the Gateaux derivative allows us to write the optimality condition (4.7) as follows:

∂L/∂v (x, m(t), v̂(x), Du(x, t)) = 0,

where

L(x, m, v, q) = f(x, m, v) + q · g(x, m, v).

We have interpreted this condition as a necessary condition for L(x, m(t), v, Du(x, t)) to attain its minimum in v at v = v̂(x, t). This requires convexity. We can prove this minimum property directly, as in the proof of Pontryagin's maximum principle, by using spike changes for the optimal control.
Suppose we change the optimum feedback v̂(x, s) into

v̄(x, s) = v(x) if s ∈ (t, t + ε);   v̄(x, s) = v̂(x, s) if s ∉ (t, t + ε),

in which v(x) is arbitrary (spike modification).



We then define m̄ = m_{v̄(.)}. We have

m̄(x, s) = m(x, s), ∀s ≤ t;
∂m̄/∂s + A* m̄ + div (g(x, m̄, v(x)) m̄) = 0, t < s < t + ε;
∂m̄/∂s + A* m̄ + div (g(x, m̄, v̂(x)) m̄) = 0, s > t + ε.
A tedious calculation allows us to write

J(v̄(.), m̄(.)) − J(v̂(.), m(.))
= ∫_t^{t+ε} ∫_{R^n} ( L(x, m(s), v(x), Du) − L(x, m(s), v̂(x), Du) ) m(x, s) dx ds
+ ∫_t^{t+ε} ∫_{R^n} ( L(x, m̄(s), v(x), Du) m̄(x, s) − L(x, m(s), v(x), Du) m(x, s) ) dx ds
− ∫_t^{t+ε} ∫_{R^n} ( m̄(x, s) − m(x, s) ) ( L(x, m(s), v̂(x), Du) + ∫ ∂L/∂m (ξ, m(s), v̂(ξ), Du(ξ))(x) m(ξ, s) dξ ) dx ds
+ ∫_{t+ε}^T ∫_{R^n} ( L(x, m̄(s), v̂(x), Du) − L(x, m(s), v̂(x), Du) ) ( m̄(x, s) − m(x, s) ) dx ds
+ ∫_{t+ε}^T ∫_{R^n} m(x, s) [ L(x, m̄(s), v̂(x), Du) − L(x, m(s), v̂(x), Du) − ∫ ∂L/∂m (x, m(s), v̂(x), Du(x))(ξ) ( m̄(ξ, s) − m(ξ, s) ) dξ ] dx ds
+ ∫_{R^n} ( h(x, m̄(T)) − h(x, m(T)) ) ( m̄(x, T) − m(x, T) ) dx
+ ∫_{R^n} m(x, T) [ h(x, m̄(T)) − h(x, m(T)) − ∫ ∂h/∂m (x, m(T))(ξ) ( m̄(ξ, T) − m(ξ, T) ) dξ ] dx.

The left-hand side is nonnegative. Dividing by ε and letting ε go to 0 yields

∫_{R^n} ( L(x, m(t), v(x), Du(x, t)) − L(x, m(t), v̂(x, t), Du(x, t)) ) m(x, t) dx ≥ 0, a.e. t,

and since v(x) is arbitrary, we get

L(x, m(t), v, Du(x, t)) ≥ L(x, m(t), v̂(x, t), Du(x, t)), a.e. x, t,

which expresses the minimality property.



4.3 Stochastic Maximum Principle

We can derive from the system (4.12) a stochastic maximum principle. We proceed as in Sect. 3.2. From the optimal feedback v̂(x) and the probability distribution m(t), we construct adapted stochastic processes X(t) ∈ R^n, V(t) ∈ R^d, Y(t) ∈ R^n, and Z(t) ∈ L(R^n; R^n) defined as follows:

X(t) = x̂(t),   m(t) = P_{X(t)}.

We next define

Y(t) = Du(X(t), t),   V(t) = v̂(X(t), P_{X(t)}, Y(t)),

and finally

Z(t) = D²u(X(t), t) σ(X(t)).

We first have

dX = g(X(t), P_{X(t)}, V(t)) dt + σ(X(t)) dw(t),
X(0) = x0. (4.18)

We proceed as in Sect. 3.2 to obtain

−dY = [ ∂f/∂x (X(t), P_{X(t)}, V(t)) + (∂g/∂x)* (X(t), P_{X(t)}, V(t)) Y(t) + tr ( (∂σ/∂x (X(t)))* Z(t) )
+ E { ( ∂²f/(∂x∂m) (X(t), P_{X(t)}, V(t)) + ∂²g/(∂x∂m) (X(t), P_{X(t)}, V(t)) Y(t) ) (X(t)) } ] dt − Z(t) dw(t),

Y(T) = ∂h/∂x (X(T), P_{X(T)}) + E { ∂²h/(∂x∂m) (X(T), P_{X(T)}) (X(T)) }.

The notation should be clearly understood to avoid confusion. When we write

E { ∂²f/(∂x∂m) (X(t), P_{X(t)}, V(t)) } (X(t)),

we mean the following. We take the function ∂f/∂m (ξ, m, v)(x), where ξ and v are parameters, and take its gradient in x, denoted by ∂²f/(∂x∂m) (ξ, m, v)(x). We then set ξ = X(t), v = V(t) and take the expected value, E ∂²f/(∂x∂m) (X(t), m, V(t))(x). We take m = P_{X(t)} (note that it is a deterministic quantity) and thus get E ∂²f/(∂x∂m) (X(t), P_{X(t)}, V(t))(x). Finally, we take the argument x = X(t). To emphasize the possibility of confusion, consider ∂f/∂x (x, m, v). If we want to take the derivative with respect to m, then we should consider x, v as parameters, so we change the notation to ξ and compute ∂²f/(∂m∂x) (ξ, m, v)(x). Clearly

∂²f/(∂m∂x) (ξ, m, v)(x) = ∂²f/(∂x∂m) (ξ, m, v)(x).

Recalling the definition of the Hamiltonian in this context, see (3.13), we can write the stochastic maximum principle as follows: find adapted processes X(t) ∈ R^n, V(t) ∈ R^d, Y(t) ∈ R^n, Z(t) ∈ L(R^n; R^n) such that

dX = g(X(t), P_{X(t)}, V(t)) dt + σ(X(t)) dw(t),
−dY = [ ∂H/∂x (X(t), P_{X(t)}, V(t), Y(t)) + E { ∂²H/(∂x∂m) (X(t), P_{X(t)}, V(t), Y(t)) (X(t)) } + tr ( (∂σ/∂x (X(t)))* Z(t) ) ] dt − Z(t) dw(t),
X(0) = x0,   Y(T) = ∂h/∂x (X(T), P_{X(T)}) + E { ∂²h/(∂x∂m) (X(T), P_{X(T)}) (X(T)) }; (4.19)

V(t) minimizes H(X(t), P_{X(t)}, v, Y(t)) in v. (4.20)

Example 4. Consider again Example 2. We consider functions f(x, η, v), g(x, η, v), h(x, η), and

f(x, m, v) = f ( x, ∫_{R^n} ϕ(ξ) m(ξ) dξ, v ),   g(x, m, v) = g ( x, ∫_{R^n} χ(ξ) m(ξ) dξ, v ),
h(x, m) = h ( x, ∫_{R^n} ψ(ξ) m(ξ) dξ ).

We get

∂²f/(∂x∂m) (ξ, m, v)(x) = (∂ϕ/∂x)*(x) ∂f/∂η ( ξ, ∫_{R^n} ϕ(ζ) m(ζ) dζ, v ),
(∂²g/(∂x∂m))* (ξ, m, v)(x) Du = (∂χ/∂x)*(x) (∂g/∂η)* ( ξ, ∫_{R^n} χ(ζ) m(ζ) dζ, v ) Du,
∂²h/(∂x∂m) (ξ, m)(x) = (∂ψ/∂x)*(x) ∂h/∂η ( ξ, ∫_{R^n} ψ(ζ) m(ζ) dζ ),

and the stochastic maximum principle reads: find adapted processes X(t) ∈ R^n, V(t) ∈ R^d, Y(t) ∈ R^n, Z(t) ∈ L(R^n; R^n) such that

dX = g(X(t), E χ(X(t)), V(t)) dt + σ(X(t)) dw(t),
−dY = [ ∂f/∂x (X(t), E ϕ(X(t)), V(t)) + (∂g/∂x)* (X(t), E χ(X(t)), V(t)) Y(t)
+ (∂ϕ/∂x (X(t)))* E ∂f/∂η (X(t), E ϕ(X(t)), V(t))
+ (∂χ/∂x (X(t)))* E { (∂g/∂η)* (X(t), E χ(X(t)), V(t)) Y(t) }
+ tr ( (∂σ/∂x (X(t)))* Z(t) ) ] dt − Z(t) dw(t),
X(0) = x0,   Y(T) = ∂h/∂x (X(T), E ψ(X(T))) + (∂ψ/∂x (X(T)))* E ∂h/∂η (X(T), E ψ(X(T))); (4.21)

V(t) minimizes f(X(t), E ϕ(X(t)), v) + Y(t) · g(X(t), E χ(X(t)), v) in v. (4.22)

We recover the stochastic maximum principle as established in [1, 12].

4.4 Time Inconsistency Approach

We discuss here the following particular mean field type problem:

dx = g(x(t), v(x(t))) dt + σ(x(t)) dw(t),
x(0) = x0, (4.23)

J(v(.), m(.)) = E [ ∫_0^T f(x(t), v(x(t))) dt + h(x(T)) ] + ∫_0^T F(E x(t)) dt + Φ(E x(T)). (4.24)
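To see why criteria of the form (4.24) arise, note (a standard observation) that for a scalar state a variance penalty fits this structure:

λ Var(x(T)) = E [ λ x(T)² ] − λ (E x(T))²,

which is of the form E h(x(T)) + Φ(E x(T)) with h(x) = λ x² and Φ(η) = −λ η². This is the structure behind the mean variance problem treated in Sect. 6.4.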

We consider a feedback v(x, t), and m(t) = m_{v(.)}(t) is the probability density of x_{v(.)}(t), the solution of (4.23). The functional becomes J(v(.), m_{v(.)}(.)). It is clearly a particular case of a mean field type control problem. We have indeed

f(x, m, v) = f(x, v) + F ( ∫ ξ m(ξ) dξ ),
h(x, m) = h(x) + Φ ( ∫ ξ m(ξ) dξ ).

Therefore

H(x, m, q) = H(x, q) + F ( ∫ ξ m(ξ) dξ ),

where

H(x, q) = inf_v ( f(x, v) + q · g(x, v) ).

Considering v̂(x, q), which attains the infimum in the definition of H(x, q), and setting

G(x, q) = g(x, v̂(x, q)),

the coupled HJB-FP system becomes, see (4.12),

−∂u/∂t + Au = H(x, Du) + F ( ∫ ξ m(ξ) dξ ) + ∑_k ∂F/∂x_k ( ∫ ξ m(ξ) dξ ) x_k,
u(x, T) = h(x) + Φ ( ∫ ξ m(ξ, T) dξ ) + ∑_k ∂Φ/∂x_k ( ∫ ξ m(ξ, T) dξ ) x_k,
∂m/∂t + A*m + div (G(x, Du) m) = 0,
m(x, 0) = δ(x − x0). (4.25)

We can reduce this problem slightly, using the following step: introduce the vector function Ψ(x, t; s), t < s, which is the solution of

−∂Ψ/∂t + AΨ − DΨ · G(x, Du) = 0, t < s,
Ψ(x, s; s) = x; (4.26)

then

∫ ξ m(ξ, t) dξ = Ψ(x0, 0; t),

so (4.25) becomes

−∂u/∂t + Au = H(x, Du) + F(Ψ(x0, 0; t)) + ∑_k ∂F/∂x_k (Ψ(x0, 0; t)) x_k,
u(x, T) = h(x) + Φ(Ψ(x0, 0; T)) + ∑_k ∂Φ/∂x_k (Ψ(x0, 0; T)) x_k. (4.27)

We now have the system (4.26), (4.27). We can also look at u(x, t) as the solution of a nonlocal HJB equation, depending on the initial state x0. The optimal feedback

v̂(x, t) = v̂(x, Du(x, t))

also depends on x0. Note that it does not depend on any intermediate state. This type of optimal control is called a precommitment optimal control.
In [11], the authors introduce a new concept, in order to define an optimization problem among feedbacks that do not depend on the initial condition. A feedback will be optimal only against spike changes, but not against global changes. They interpret this limited optimality as a game. Players are attached to small periods of time (in the limit, to each time instant). Therefore, if one uses the concept of Nash equilibrium, decisions at different times correspond to decisions of different players, and thus are out of reach. This explains why only spike changes are allowed. Of course, in standard control problems, this is not a limitation, but it is one in the present situation.
In the spirit of dynamic programming and the invariant embedding idea, we consider a family of control problems indexed by the initial conditions, and we control the system using feedbacks only. So if v(x, s) is a feedback, we consider the state equation x(s) = x_{xt}(s; v(.)),

dx = g(x(s), v(x(s), s)) ds + σ(x(s)) dw(s),
x(t) = x, (4.28)

and the payoff

J_{x,t}(v(.)) = E [ ∫_t^T f(x(s), v(x(s), s)) ds + h(x(T)) ] + ∫_t^T F(E x(s)) ds + Φ(E x(T)). (4.29)

Consider a specific control v̂(x, s) that will be optimal in the sense described earlier. We define x̂(.) to be the corresponding state, the solution of (4.28), and set

V(x, t) = J_{x,t}(v̂(.)). (4.30)

We make a spike modification and define

v̄(x, s) = v if t < s < t + ε;   v̄(x, s) = v̂(x, s) if s > t + ε,

where v is arbitrary. The idea is to evaluate J_{x,t}(v̄(.)) and to express that it is larger than V(x, t). We define x̄(s) to be the state corresponding to the feedback v̄(.). We have, by definition,

x̄(s) = x_{xt}(s; v) for t < s < t + ε;   x̄(s) = x̂_{x_{xt}(t+ε; v), t+ε}(s) for t + ε < s < T,

where we have made explicit the initial conditions. In the sequel, to simplify the notation, we write

x(t + ε) = x_{xt}(t + ε; v).

We introduce the function

Ψ(x, t; s) = E x̂_{xt}(s), t < s,

which is the solution of

−∂Ψ/∂t + AΨ − DΨ · g(x, v̂(x, t)) = 0, t < s,
Ψ(x, s; s) = x. (4.31)

We note the important property

E x̄(s) = E Ψ(x(t + ε), t + ε; s), ∀s ≥ t + ε.

Therefore,

J_{x,t}(v̄(.)) = E [ ∫_t^{t+ε} f(x(s), v) ds + ∫_{t+ε}^T f(x̂_{x(t+ε),t+ε}(s), v̂(x̂_{x(t+ε),t+ε}(s), s)) ds + h(x̂_{x(t+ε),t+ε}(T)) ]
+ ∫_t^{t+ε} F(E x(s)) ds + ∫_{t+ε}^T F(E Ψ(x(t + ε), t + ε; s)) ds + Φ(E Ψ(x(t + ε), t + ε; T)).

The next point is to compare F(E Ψ(x(t + ε), t + ε; s)) with E F(Ψ(x(t + ε), t + ε; s)). This is a simple application of Ito's formula:

E F(Ψ(x(t + ε), t + ε; s))
= F(Ψ(x, t; s)) + E ∫_t^{t+ε} ∑_{i,k} ∂F/∂x_k (Ψ(x(τ), τ; s)) ∂Ψ_k/∂x_i (x(τ), τ; s) g_i(x(τ), v) dτ
+ E ∫_t^{t+ε} ∑_{i,j} a_{ij}(x(τ)) [ ∑_{k,l} ∂²F/(∂x_k ∂x_l) (Ψ(x(τ), τ; s)) ∂Ψ_k/∂x_i (x(τ), τ; s) ∂Ψ_l/∂x_j (x(τ), τ; s)
+ ∑_k ∂F/∂x_k (Ψ(x(τ), τ; s)) ∂²Ψ_k/(∂x_i ∂x_j) (x(τ), τ; s) ] dτ
+ E ∫_t^{t+ε} ∑_k ∂F/∂x_k (Ψ(x(τ), τ; s)) ∂Ψ_k/∂t (x(τ), τ; s) dτ.

On the other hand,

F(E Ψ(x(t + ε), t + ε; s))
= F(Ψ(x, t; s)) + ∫_t^{t+ε} ∑_k ∂F/∂x_k (E Ψ(x(τ), τ; s)) E [ ∂Ψ_k/∂t (x(τ), τ; s)
+ ∑_i ∂Ψ_k/∂x_i (x(τ), τ; s) g_i(x(τ), v) + ∑_{i,j} a_{ij}(x(τ)) ∂²Ψ_k/(∂x_i ∂x_j) (x(τ), τ; s) ] dτ.

By comparison we have

E F(Ψ(x(t + ε), t + ε; s)) − F(E Ψ(x(t + ε), t + ε; s))
= ε ∑_{i,j} a_{ij}(x) ∑_{k,l} ∂²F/(∂x_k ∂x_l) (Ψ(x, t; s)) ∂Ψ_k/∂x_i (x, t; s) ∂Ψ_l/∂x_j (x, t; s) + O(ε). (4.32)

We can similarly compute the difference E Φ(Ψ(x(t + ε), t + ε; T)) − Φ(E Ψ(x(t + ε), t + ε; T)). Collecting results, we obtain

J_{x,t}(v̄(.)) = E V(x(t + ε), t + ε) + ε [ f(x, v) + F(x)
− ∑_{i,j} a_{ij}(x) ∑_{k,l} ( ∫_t^T ∂²F/(∂x_k ∂x_l) (Ψ(x, t; s)) ∂Ψ_k/∂x_i (x, t; s) ∂Ψ_l/∂x_j (x, t; s) ds
+ ∂²Φ/(∂x_k ∂x_l) (Ψ(x, t; T)) ∂Ψ_k/∂x_i (x, t; T) ∂Ψ_l/∂x_j (x, t; T) ) ] + O(ε).

So we have the inequality

V(x, t) ≤ E V(x(t + ε), t + ε) + ε [ f(x, v) + F(x)
− ∑_{i,j} a_{ij}(x) ∑_{k,l} ( ∫_t^T ∂²F/(∂x_k ∂x_l) (Ψ(x, t; s)) ∂Ψ_k/∂x_i (x, t; s) ∂Ψ_l/∂x_j (x, t; s) ds
+ ∂²Φ/(∂x_k ∂x_l) (Ψ(x, t; T)) ∂Ψ_k/∂x_i (x, t; T) ∂Ψ_l/∂x_j (x, t; T) ) ] + O(ε).

We expand E V(x(t + ε), t + ε) by the same token, divide by ε, and let ε → 0. We obtain the inequality

0 ≤ ∂V/∂t + ∑_i (∂V/∂x_i) g_i(x, v) + ∑_{i,j} a_{ij}(x) ∂²V/(∂x_i ∂x_j) + f(x, v) + F(x)
− ∑_{i,j,k,l} a_{ij}(x) ( ∫_t^T ∂²F/(∂x_k ∂x_l) (Ψ(x, t; s)) ∂Ψ_k/∂x_i (x, t; s) ∂Ψ_l/∂x_j (x, t; s) ds
+ ∂²Φ/(∂x_k ∂x_l) (Ψ(x, t; T)) ∂Ψ_k/∂x_i (x, t; T) ∂Ψ_l/∂x_j (x, t; T) ).

Since we get an equality for v = v̂(x, t), we deduce easily

−∂V/∂t + AV = H(x, DV) + F(x) − ∑_{i,j,k,l} a_{ij}(x) ( ∫_t^T ∂²F/(∂x_k ∂x_l) (Ψ(x, t; s)) ∂Ψ_k/∂x_i (x, t; s) ∂Ψ_l/∂x_j (x, t; s) ds
+ ∂²Φ/(∂x_k ∂x_l) (Ψ(x, t; T)) ∂Ψ_k/∂x_i (x, t; T) ∂Ψ_l/∂x_j (x, t; T) ),
V(x, T) = h(x) + Φ(x). (4.33)

Moreover, the equation for Ψ can be written as

−∂Ψ/∂t + AΨ − DΨ · G(x, DV) = 0, t < s,
Ψ(x, s; s) = x. (4.34)

Remark 5. The optimal feedback obtained from the system (4.33), (4.34) is time consistent: it does not depend on the initial condition. The drawback is that we cannot extend this approach to more complex modelling, as we can in the mean field type treatment. On the other hand, this time consistent approach can be extended to situations in which the functionals depend on the initial conditions. For instance, we can consider, instead of

E h(x(T)) + Φ(E x(T)),

the quantity

E h(x, t; x(T)) + Φ(x, t; E x(T)).

Even when Φ = 0, the problem is nonstandard.


Chapter 5
Approximation of Nash Games with a Large Number of Players

5.1 Preliminaries

We first assume that the functions f(x, m, v), g(x, m, v), and h(x, m), as functions of m, can be extended to Dirac measures and to sums of Dirac measures that are probabilities. Since m is no longer in L^p, the reference topology will be the weak* topology of measures. That will be sufficient for our purpose, but we refer to [14] for metric space topologies. At any rate, the vector space property is lost.
We consider N players. Each player is characterized by his or her state space. The state spaces are identical, equal to R^n. We denote by x^i(t) the state of player i. The index of players will be an upper index. We write

x(t) = (x^1(t), . . . , x^N(t)).

The controls are feedbacks. An important limitation is that the feedbacks are limited to the individual states of the players, so the feedback of player i is of the form v^i(x^i). We will denote

v(x) = (v^1(x^1), . . . , v^N(x^N)).


We consider N independent Wiener processes w^i(t), i = 1, . . . , N. The trajectories are given by

dx^i = g ( x^i(t), (1/(N−1)) ∑_{j=1, j≠i}^N δ_{x^j(t)}, v^i(x^i(t)) ) dt + σ(x^i(t)) dw^i(t),
x^i(0) = x0^i. (5.1)

We see that m has been replaced with (1/(N−1)) ∑_{j=1, j≠i}^N δ_{x^j}, which is a probability measure on R^n. The random variables x0^i are independent and identically distributed with density m0. They are independent of the Wiener processes. The objective functional of player i is defined by


J^i(v(.)) = E [ ∫_0^T f ( x^i(t), (1/(N−1)) ∑_{j=1, j≠i}^N δ_{x^j(t)}, v^i(x^i(t)) ) dt
+ h ( x^i(T), (1/(N−1)) ∑_{j=1, j≠i}^N δ_{x^j(T)} ) ]. (5.2)
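As a schematic illustration (our construction, not from the book) of the dynamics (5.1), the following sketch simulates the N-player system by the Euler–Maruyama method in one dimension; the drift depends on the empirical measure only through its mean (the case ϕ(ξ) = ξ of Example 2), and the common feedback v(x) = −x is an arbitrary stand-in, not an equilibrium control.

```python
import numpy as np

# Schematic Euler-Maruyama simulation of the N-player dynamics (5.1) in one
# dimension. The empirical measure (1/(N-1)) sum_{j != i} delta_{x^j} enters
# the drift only through its mean. All model functions are illustrative.

rng = np.random.default_rng(0)
N, NT, T = 500, 100, 1.0
dt = T / NT
sigma = 0.3                                  # constant volatility

def g(xi, mean_others, v):                   # illustrative drift g(x, m, v)
    return v + 0.5 * (mean_others - xi)      # attraction toward the crowd

x = rng.normal(0.0, 1.0, size=N)             # x^i(0) i.i.d. with density m0
for _ in range(NT):
    mean_others = (x.sum() - x) / (N - 1)    # mean of the other players
    v = -x                                   # common feedback v^i(x^i) = -x^i
    x = x + g(x, mean_others, v) * dt + sigma * np.sqrt(dt) * rng.normal(size=N)

print("empirical mean and std at time T:", x.mean(), x.std())
```

As N grows, the empirical measure of the particles approximates a deterministic flow m(t), which is the aggregation effect behind the mean field approximation.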

5.2 System of PDEs

To any feedback control v(.) we can associate a system of linear PDEs: find functions Φ^i(x, t; v(.)) that are the solution of

−∂Φ^i/∂t + ∑_{h=1}^N A_{x^h} Φ^i − ∑_{h=1}^N D_{x^h} Φ^i · g ( x^h, (1/(N−1)) ∑_{j≠h} δ_{x^j}, v^h(x^h) )
= f ( x^i, (1/(N−1)) ∑_{j≠i} δ_{x^j}, v^i(x^i) ),
Φ^i(x, T) = h ( x^i, (1/(N−1)) ∑_{j≠i} δ_{x^j} ), (5.3)

where A_{x^h} refers to the differential operator A operating on the variable x^h. By a simple application of Ito's formula, one can check easily that

J^i(v(.)) = ∫ ··· ∫ Φ^i(x, 0) ∏_{j=1}^N m0(x^j) dx^j. (5.4)

Remark 6. To find a Nash equilibrium, one can write a system of nonlinear PDE,
as in [5]. However the feedback one can obtain in this way is global, and uses the
state values of all players. So a Nash equilibrium related to individual state values
will not exist in general. The idea of mean field games is that we can obtain an
approximate Nash equilibrium based on individual states, which is good because N
is large. Nevertheless, we will consider in the next section a particular case in which
a Nash equilibrium among individual feedbacks exists.

5.3 Independent Trajectories

In this section, we assume that g(x, m, v) = g(x, v) is independent of m. Then the


individual trajectories become
dxi = g(xi (t), vi (xi (t))dt + σ (xi (t))dwi (t)
xi (0) = xi0 . (5.5)
5.3 Independent Trajectories 33

It is clear that the processes are now independent. The probability distribution of
xi (t) is the function mvi (.) (xi ,t) solution of the FP equation

∂ mvi (.)
+ Axi mvi (.) + divxi (g(xi , vi (xi ))mvi (.) (xi )) = 0
∂t
mvi (.) (xi , 0) = m0 (xi ). (5.6)

We then define the function


ˆ ˆ
Ψi (xi ,t) = ··· ∏ Φi (x,t)mv j (.) (x j ,t)dx j . (5.7)
j=i

By testing (5.3) with ∏ j=i mv j (.) (x j ) and integrating, and using (5.6), we obtain

∂ Ψi
− + Axi Ψi − Dxi Ψi .g(xi , vi (xi ))
∂t
ˆ ˆ  N

1
= ··· f x , i
∑ δxh , v (x ) ∏ mv j (.) (x j )dx j
N − 1 h=1
i i
=i j=i
ˆ ˆ  N

1
Ψi (xi , T ) = · · · h xi , ∑
N − 1 h=1=i x ∏
δh mv j (.) (x j , T )dx j . (5.8)
j=i

We also have
ˆ ˆ ˆ
Ψi (xi ,t)mvi (.) (xi ,t)dxi = · · · Φi (x,t) ∏ mv j (.) (x j ,t)dx j . (5.9)
Rn j

In particular
ˆ
J i (v(.)) = Ψi (xi , 0)m0 (xi )dxi . (5.10)
Rn

If we have a Nash equilibrium v̂(.), which we write (v̂i (.), v̂i (.)) in which v̂i (.)
represents all components different from i, then, noting Ψi (xi ,t; vi (.), v̂i (.)), the
solution of (5.8) when we take v j (x j ) = v̂ j (x j ), ∀ j = i, then we can assert that

Ψi (xi ,t; v̂i (.), v̂i (.)) ≤ Ψi (xi ,t; vi (.), v̂i (.)), ∀vi (.).

From standard dynamic programming theory, we obtain that the functions

ui (xi ,t) = Ψi (x,t; v̂i (.), v̂i (.)) (5.11)


34 5 Approximation of Nash Games with a Large Number of Players

satisfy

∂ ui
− + Axi ui
∂t
ˆ ˆ  N

1
= inf
v
· · · f xi , ∑ δ h,v
N − 1 h=1=i x ∏ mv̂ j (.) (x j )dx j + Dui(xi ).g(xi , v) .
j=i
(5.12)

We also have the terminal condition


ˆ ˆ  N

1
u (x , T ) = · · · h x ,
i i i
∑ δxh ∏ mv̂ j (.) (x j , T )dx j .
N − 1 h=1
(5.13)
=i j=i

We want to show that a Nash equilibrium exists, made of identical feedbacks


for all players. We proceed as follows: For x ∈ Rn , m ∈ L1 (Rn ), q ∈ Rn define the
Hamiltonian
ˆ ˆ
1 N−1 N−1
HN (x, m, q) = inf
v
··· f (x, ∑ x ∏ m(x j )dx j + q.g(x, v)
N − 1 h=1
δ h , v) (5.14)
j=1

and let v̂N (x, m, q) be the minimizer in the Hamiltonian. Set

GN (x, m, q) = g(x, v̂N (x, m, q)). (5.15)

Define finally
ˆ ˆ  
1 N−1 N−1
hN (x, m) = ··· h x, ∑ δxh
N − 1 h=1 ∏ m(x j )dx j . (5.16)
j=1

We consider the pair of functions (if it exists) uN (x,t), mN (x,t) the solution of the
system
∂ uN
− + AuN = HN (x, mN , DuN )
∂t
∂ mN
+ A∗ mN + div(GN (x, mN , DuN )mN ) = 0
∂t
uN (x, T ) = hN (x, m(T )), mN (x, 0) = m0 (x). (5.17)

Then, from symmetry considerations, it is easy to check that the functions ui (xi ,t)
coincide with uN (xi ,t). We can next make the connection with the differential game
for N players. Write J N,i (v(.)) the objective functional of player i defined by (5.2),
to emphasize that there are N players. Define common feedbacks

v̂N (x) = v̂N (x, mN (x), uN (x)) (5.18)


5.3 Independent Trajectories 35

and denote by J N,i (v̂N (.)) the value of the objective functional of Player i, when all
players use the same feedback. We can assert that
ˆ
J N,i (v̂N (.)) = uN (x, 0)m0 (x)dx (5.19)

and when all players use the same local feedback v̂N (.) one gets a Nash equilibrium
for the functionals J N,i (v(.)) among local feedbacks (i.e., feedbacks on individual
states).
In this case, mean field theory is concerned with what happens to the system
(5.17) as N → +∞. To simplify the analysis, consider the situation

f (x, m, v) = f0 (x, m) + f (x, v) (5.20)

then
ˆ ˆ  
1 N−1 N−1
HN (x, m, q) = ··· f0 x, ∑ δxh
N − 1 h=1 ∏ m(x j )dx j + H(x, q) (5.21)
j=1

in which

H(x, q) = inf[ f (x, v) + q.g(x, v)] (5.22)


v

and v̂N (x, m, q) = v̂(x, q), which is the minimizer in (5.22). Also

GN (x, m, q) = G(x, q) = g(x, v̂(x, q)).

If we set
ˆ ˆ  
1 N−1 N−1
f0N (x, m) = ··· f0 x, ∑ δxh
N − 1 h=1 ∏ m(x j )dx j (5.23)
j=1

then the system (5.17) amounts to

∂ uN
− + AuN = H(x, DuN ) + f0N (x, m)
∂t
∂ mN
+ A∗ mN + div(G(x, DuN )mN ) = 0
∂t
uN (x, T ) = hN (x, m(T )), mN (x, 0) = m0 (x). (5.24)

It is clear that the pair uN (x,t), mN (x,t) will converge pointwise and in Sobolev
spaces towards the u(x,t), m(x,t) solution of
36 5 Approximation of Nash Games with a Large Number of Players

∂u
− + Au = H(x, Du) + f0 (x, m)
∂t
∂m
+ A∗m + div(G(x, Du)m) = 0
∂t
u(x, T ) = h(x, m(T )), m(x, 0) = m0 (x) (5.25)

provided f0N (x, m(t)) → f0 (x, m(t)) and hN (x, m(T )) → h(x, m(T )) as N → +∞,for
any fixed x,t. This is a consequence of the law of large numbers. Indeed, consider a
sequence of independent random variables, identically distributed with probability
distribution m(x) (this is done for any t,so we do not mention the time dependence).
We denote this sequence by X j . Then
 
1 N−1
f0N (x, m) = E f0 x, ∑ δX h
N − 1 h=1
. (5.26)

We claim that the random measure on Rn , N−1 h=1 δX h converges a.s. towards m,
∑N−1
1

for the weak * topology of measures on R . Indeed, consider a continuous bounded


n

function on Rn , denoted by ϕ ; then

1 N−1 1 N−1
ϕ, ∑ δX h
N − 1 h=1
= ∑ ϕ (X h).
N − 1 h=1

But the real ´random variables ϕ (X h ) are independent and identically distributed.
The mean is ϕ (x)m(x)dx. According to the law of large numbers
ˆ
1 N−1
∑ ϕ (X h) →
N − 1 h=1
ϕ (x)m(x)dx, a.s. as N → +∞

h=1 δX h towards m, a.s. as N → +∞. If we assume


∑N−1
1
hence the convergence of N−1
that f0 (x, m) is continuous in m for the topology of weak * convergence, we get
 
1 N−1
f0 x, ∑ δX h → f0 (x, m) a.s. as N → +∞
N − 1 h=1

and f0N (x, m) → f0 (x, m), provided Lebesgue’s theorem can be applied.

5.4 General Case

In the general case, namely if g depends on m as in (5.1), (5.2), the problem


of Nash equilibrium among local feedbacks (i.e., those depending on individual
states) has no solution. In that case, the mean field theory can provide a good
5.4 General Case 37

feedback control, when N is large. However, PDE techniques cannot be used, since
the Bellman system necessitates allowing global feedbacks (i.e., those based on
the states of all players). The approximation property can be shown only with
probability techniques. In this context, there is no reason to consider specific
feedbacks, largely motivated by PDE techniques. The simplest is to consider open
loop local controls. These are random processes adapted to the individual Wiener
processes (the uncertainty that affects the evolution of the state of each individual
player), but not linked to the state. So we reformulate the game as follows

N
1
dxi = g(xi (t),
N −1 ∑ δx j (t) , vi (t))dt + σ (xi (t))dwi (t)
j=1=i

xi (0) = xi0 (5.27)

ˆ  
T N
1
J N,i
(v(.)) = E f x (t),
i
N −1 ∑ δx j (t) , v (t) dt
i
0 j=1=i
 
N
1
+ h xi (T ),
N −1 ∑ δx j (T ) . (5.28)
j=1=i

We recall that the Wiener processes wi (t) are standard one and independent, the
xi0 are independent identically distributed random variables, independent from the
Wiener processes and with probability distribution m0 (x). The processes vi (t) are
random processes adapted to the filtration generated by wi (t), for each i. So when
we write J N,i (v(.)), v(.) should be understood as (v1 (t), . . . , vN (t)). As already
mentioned there is no exact Nash equilibrium, among open loop controls. We now
construct our approximation.
We go back to (2.3), (2.5). We consider the solution (3.1), (2.8). We consider the
optimal feedback v̂(x) = v̂(x, m, Du) where the pair u, m is the solution of the system
(3.11) of coupled HJB-FP equations. We then construct the optimal trajectory

d x̂ = g(x̂(t), m(t), v̂(x̂(t)))dt + σ (x̂(t))dw(t)


x̂(0) = x0 (5.29)

and define the stochastic process v̂(t) = v̂(x̂(t)), which is adapted to the pair x0 , w(.).
If we consider any process v(t) adapted to the filtration generated by x0 , w(.) and the
trajectory

dx = g(x, m(t), v(t))dt + σ (x)dw(t)


x(0) = x0 (5.30)

with the same m(t) as in (5.29). The fact that m(t) is the probability distribution of
x̂(t) does not play a role in this aspect. Consider then the payoff functional
38 5 Approximation of Nash Games with a Large Number of Players

ˆ T 
J(v(.)) = E f (x(t), m(t), v(t)) dt + h(x(T ), m(T )) (5.31)
0

in which v(.) refers to v(t). From standard control theory, it follows that
ˆ
inf J(v(.)) = J(v̂(.)) = u(x, 0)m0 (x)dx. (5.32)
v(.) Rn

We now construct N replicas of x0 , w(.) called xi0 , wi (.), which are independent.
We can then define N replicas of v̂(t), called v̂i (t). Since v̂i (t) is adapted to xi0 , wi (.),
they are independent processes. We can deduce N replicas of x̂(t), called x̂i (t). They
satisfy

d x̂i = g(x̂i (t), m(t), v̂i (t))dt + σ (x̂i )dwi (t)


x̂i (0) = xi0 . (5.33)

From their construction, it can be seen they are independent processes. Moreover,

J(v̂i (.)) = J(v̂(.)). (5.34)

We will use v̂i (t) in the context of the differential game (5.27), (5.28). We will
show that they constitute an approximate local Nash equilibrium. We first compute
J N,i (v̂(.)). Note that, in the game (5.27), (5.28), the trajectories corresponding to
the controls v̂i (t) are not x̂i (t). We call them ŷi (t) and they are defined by
 
N
1
d ŷ = g ŷ (t),
i i
N −1 ∑ δŷ j (t) , v (t) dt + σ (ŷi (t))dwi (t)
i
j=1=i

ŷi (0) = xi0 . (5.35)

Note that ŷi (t) depends on N, which is not the case for x̂i (t).
The first task is to show that x̂i (t) constitutes a good approximation of ŷi (t).
We evaluate ŷi (t) − x̂i (t) as follows
    
N N
1 1
d(ŷ (t)−x̂ (t)) = g ŷ (t),
i i i
∑ δŷ j (t) , v (t) −g x̂ (t), N−1 ∑ δx̂ j (t) , v (t) dt
N−1 j=1
i i i
=i j=1=i
   
N
1
+ g x̂ (t),
i
∑ δx̂ j (t) , v (t) − g(x̂ (t), m(t), v̂ (t)) dt
N − 1 j=1
i i i
=i

+ (σ (ŷi (t)) − σ (x̂i (t)))dwi (t)

ŷi (0) − x̂i (0) = 0.

It is clear from this expression that one needs Lipschitz assumptions for g and σ in
x. An important issue is to evaluate
5.4 General Case 39

   
N N
1 1
g x̂ (t),
i
N −1 ∑ δŷ j (t) , v (t) − g x̂ (t), N − 1
i i
∑ δx̂ j (t) , v (t)
i
j=1=i j=1=i

so we have to evaluate g(x, μ , v) − g(x, ν , v) when μ = N−1 1


j=1 δξ j and ν =
∑N−1
N−1 ∑ j=1 δη j , where ξ and η are points in R . Assumptions must be made to
1 N−1 j j n

j=1 |ξ − η |. A standard case will be a function


∑N−1
1 j j
estimate this difference with N−1
g(x, μ ) of the form (omitting to indicate v)
ˆ 
g(x, μ ) = ϕ K(x, y)μ (dy) ,

where ϕ is Lipschitz and K(x, y) is Lipschitz in the second argument. There remains
the driving term (not depending on ŷi (t))

N
1
g(x̂i (t),
N −1 ∑ δx̂ j (t) , vi (t)) − g(x̂i (t), m(t), v̂i (t)).
j=1=i

But, since the random variables x̂ j (t) are independent and identically distributed
with m(t) for probability distribution, the random measure on Rn defined by
N−1 ∑ j=1=i δx̂ j (t) converges a.s. to m(t) for the weak * topology of measures.
1 N

If g(x, m, v) is continuous in m for the weak * topology of measures, we can assert


that
N
1
g(x̂i (t),
N −1 ∑ δx̂ j (t) , vi (t)) − g(x̂i (t), m(t), v̂i (t)) → 0, a.s.
j=1=i

This sequence of steps, with appropriate assumptions shows that ŷi (t) − x̂i (t) → 0
for instance in L(0, T ; L2 (Ω, A, P)) and ŷi (T ) − x̂i (T ) → 0 in L2 (Ω, A, P). With this
in hand, we can consider
ˆ T N
1
J N,i (v̂(.)) = E f (ŷi (t),
N −1 ∑ δŷ j (t) , v̂i (t))dt
0 j=1=i

N
1
+ h(ŷi (T ),
N −1 ∑ δŷ j (T ) ) .
j=1=i

We can write
40 5 Approximation of Nash Games with a Large Number of Players

ˆ   
T N
1
J N,i
(v̂(.)) = J(v̂ (.)) + E
i
0
f ŷ (t),
i
∑ δ j , v̂i (t)
N − 1 j=1=i ŷ (t)
.

 
N
1
−f x̂i (t),
N −1 ∑ δx̂ j (t) , v̂i (t) dt
j=1=i
ˆ    
T
1 N  i 
+E f x̂ (t),
i
N −1 ∑ δx̂ j (t) , v̂ (t) − f x̂ (t), m(t), v̂ (t) dt
i i
0 j=1=i
   
N N
1 1
+ E h ŷ (T ),
N −1
i
∑ δŷ j (T ) − h x̂ (T ),
i
N −1 ∑ δx̂ j (T )
j=1=i j=1=i
 
N
1
+ E h x̂i (T ),
N −1 ∑ δx̂ j (T ) − h(x̂i (T ), m(T )) .
j=1=i

Since x̂i (t) approximates ŷi (t), by appropriate smoothness assumptions on f and h,
analogous to g, we can state that
 
1
J N,i
(v̂(.)) = J(v̂ (.)) + O √
i
. (5.36)
N

Let us now focus on player 1, but we can naturally take any player. Player 1 will use
a different local control denoted by v1 (t),and all other players use v̂i (t), i ≥ 2. We
associate to these controls the trajectories

dx1 = g(x1 , m(t), v1 (t))dt + σ (x1 )dw1 (t)


x1 (0) = x10 (5.37)

and x̂i (t), i ≥ 2. Call ṽ(.) = (v1 (.), v̂2 (.), . . . v̂N (.)) and use these controls in the differ-
ential game (5.27), (5.28). The corresponding states are denoted by y1 (.), . . . , yN (.)
and are defined by the equations
 
1 N
dy = g y (t),
1 1
N − 1 j=2 ∑ δy j (t) , v (t) dt + σ (y1(t))dw1 (t)
1

y1 (0) = x10 (5.38)


 
N
1
dy = g y (t),
i
N −1
i
∑ δy j (t) , v̂ (t) dt + σ (yi (t))dwi (t)
i
j=1=i

y (0) =
i
xi0 . (5.39)

We want to show again that x1 (t) approximates y1 (t) and x̂i (t) approximates
yi (t), ∀i ≥ 2. We begin with i ≥ 2 and write
5.4 General Case 41

    
N N
1 1
d(y (t) − x̂ (t)) = g y (t),
i i i
∑ δy j (t), v̂ (t) − g y (t), N − 2 ∑ δy j (t) , v̂ (t) dt
N − 1 j=1
i i i
=i j=2=i
    
N N
1 1
+ g y (t),
i
∑ δy j (t), v̂ (t) − g x̂ (t), N − 2 ∑ δx̂ j (t), v̂ (t) dt
N − 2 j=2
i i i
=i j=2=i
   
N
1
+ g x̂i (t), ∑ x̂ (t)
N − 2 j=2
δ j , v̂i
(t) − g(x̂i
(t), m(t), v̂i
(t)) dt
=i

+ (σ (yi (t)) − σ (x̂i (t)))dwi (t)

yi (0) − x̂i (0) = 0.

For the first term, assumptions must be made so that it is estimated by


∑Nj=2 |y1 (t)−y j (t)|
(N−1)(N−2) , which should tend to 0 in L2 (0, T ; L2 (Ω, A, P)). It is sufficient
to show that all processes are bounded in L2 (0, T ; L2 (Ω, A, P)). The other terms
are dealt with as explained for ŷi (t) − x̂i (t) above. This argument shows that x̂i (t)
approximates yi (t), ∀i ≥ 2. We next consider

    
N
1 1 N
d(y (t)−x (t)) = g y (t),
1 1
∑ δy j (t), v (t) − g y (t), N−1 ∑ δx̂ j (t), v (t) dt
N − 1 j=2
1 1 1 1
j=2
    
N N
1 1
+ g y1 (t), ∑ x̂ (t)
N − 1 j=2
δ j , v 1
(t) − g x1
(t), ∑ x̂ (t)
N − 1 j=2
δ j , v1
(t) dt

   
1 N
+ g x (t),
1
∑ δx̂ j (t) , v (t) −g(x , m(t), v (t)) dt
N−1 j=2
1 1 1

+σ (y1 (t)) − σ (x1 (t)))dw1 (t)

y1 (0) − x1 (0) = 0

and similar considerations as above show that x1 (t) approximates y1 (t). We then
write
ˆ T N
1
J N,1 (ṽ(.)) = E
0
f (y1 (t),
N −1 ∑ δy j (t) , v1 (t))dt
j=2
 
N
1
+h y1 (T ),
N−1 ∑ δy j (T ) .
j=2

Thus,
42 5 Approximation of Nash Games with a Large Number of Players

ˆ  
T N
1
J N,1
(ṽ(.)) = J(v (.)) + E
1
0
y (t),
1
∑ δ j , v1 (t)
N − 1 j=2 y (t)
 
N
1
−f x1 (t),
N −1 ∑ δx̂ j (t), v1 (t) dt
j=2
ˆ    
T
1 N  
+E
0
f x (t), 1
N −1 ∑ δx̂ j (t), v (t)
1
− f x (t), m(t), v (t)
1 1
dt
j=2
   
N N
1 1
+ E h y (T ),
N −1
1
∑ δy j (T ) − h x (T ),
1
N −1 ∑ δx̂ j (T )
j=2 j=2
 
N
1
+ E h x1 (T ),
N −1 ∑ δx̂ j (T ) − h(x1 (T ), m(T )) .
j=2

Since x1 (t) approximates y1 (t) and x̂i (t) approximates yi (t), ∀i ≥ 2, and since
  N  
1
E h x1 (T ),
N −1 ∑ x̂ (T )
δ j − h(x 1
(T ), m(T )) →0
j=2

we can assert that

 
1
J N,1 (ṽ(.)) = J(v1 (.)) + O √ .
N
By the standard control theory, J(v1 (.)) ≥ J(v̂1 (.)), therefore,
 
1
J N,1 (ṽ(.)) ≥ J N,1 (v̂(.)) − O √
N

which proves that v̂(.) is an approximate Nash equilibrium among local open loop
controls.

5.5 Nash Equilibrium Among Local Feedbacks

In the preceding section, we considered local open loop controls. However the
approximate Nash equilibrium that has been obtained is obtained through a feed-
back. So it is natural to ask whether we can obtain an approximate Nash equilibrium
among local feedbacks (feedbacks on individual states).
Moreover, the feedback was given by v̂(x) = v̂(x, m, Du), so we can consider that
it is a Lipschitz function. So it is natural to consider the class of local feedbacks
that are Lipschitz continuous. In that case, it is true that v̂(.) = (v̂1 (.), . . . , v̂N (.))
constitutes an approximate Nash equilibrium.
5.5 Nash Equilibrium Among Local Feedbacks 43

Indeed, we consider the trajectories

d x̂i = g(x̂i (t), m(t), v̂i (x̂i (t)))dt + σ (x̂i )dwi (t)
x̂i (0) = xi0 . (5.40)

Now if we use the feedbacks v̂(.) in the differential game (5.1), (5.2), we get the
trajectories
 
N
1
d ŷ = g ŷ (t),
i i
N −1 ∑ δŷ j (t) , v̂ (ŷ (t)) dt + σ (ŷi (t))dwi (t)
i i
j=1=i

ŷi (0) = xi0 . (5.41)

We want to show that x̂i (t) approximates ŷi (t). The difference with the open loop
case is that v̂i (x̂i (t)) and v̂i (ŷi (t)) are now different in the two equations, whereas in
the open loop case we had the same term, v̂i (t). Nevertheless, since v̂i (x) is Lipschitz
|v̂i (ŷi (t)) − v̂i (x̂i (t))| is estimated by |ŷi (t) − x̂i (t)| and thus the reasoning of the open
loop case carries over, with additional terms. We obtain that x̂i (t) approximates ŷi (t).
Since we consider only Lipschitz local feedbacks, the reasoning of the open loop
case to show the property of approximate Nash equilibrium will apply again and the
result can then be obtained.
Chapter 6
Linear Quadratic Models

6.1 Setting of the Model

The linear quadratic model has been developed in [10]. See also [2, 20, 22].
We highlight here the results. We take
  ˆ ∗  ˆ 
1 ∗ ∗
f (x, m, v) = x Qx + v Rv + x − S ξ m(ξ )d ξ Q̄ x − S ξ m(ξ )d ξ
2
(6.1)
ˆ
g(x, m, v) = Ax + Ā ξ m(ξ )d ξ + Bv (6.2)
  ˆ ∗  ˆ 
1 ∗
h(x, m) = x QT x + x − ST ξ m(ξ )d ξ Q¯T x − ST ξ m(ξ )d ξ .
2
(6.3)

We also take σ (x) = σ , constant and set a = 12 σ σ ∗. Finally we assume that


m0 (x) is a gaussian with mean x̄0 and variance Γ0 . Of course, the matrix A is not the
operator −traD2 .

6.2 Solution of the Mean Field Game Problem

We need to solve the system of HJB-FP equations (3.11), which reads


 ˆ 
∂u 1 ∗ −1 ∗
− − tr aD u = − Du BR B Du + Du · Ax + Ā ξ m(ξ )d ξ
2
∂t 2
  ˆ ∗  ˆ 
1 ∗
+ x Qx + x − S ξ m(ξ )d ξ Q̄ x − S ξ m(ξ )d ξ
2

A. Bensoussan et al., Mean Field Games and Mean Field Type Control Theory, 45
SpringerBriefs in Mathematics, DOI 10.1007/978-1-4614-8508-7__6,
© Alain Bensoussan, Jens Frehse, Phillip Yam 2013
46 6 Linear Quadratic Models

  ˆ ∗  ˆ 
1 ∗
u(x, T ) = x QT x + x − ST ξ m(ξ )d ξ QT x − ST ξ m(ξ )d ξ
¯
2
(6.4)

  ˆ 
∂m −1 ∗
− tr aD m + div m Ax + Ā ξ m(ξ )d ξ − BR B Du
2
=0
∂t
m(x, 0) = m0 (x). (6.5)

We look for a solution u(x,t) of the form

1
u(x,t) = x∗ P(t)x + x∗r(t) + s(t) (6.6)
2
then

Du = P(t)x + r(t), D2 u = P(t).

Equation (6.5) becomes


  ˆ 
∂m
− tr aD2 m + div m (A − BR−1B∗ P)x − BR−1B∗ r + Ā ξ m(ξ )d ξ =0
∂t
´
Setting z(t) = xm(x,t)dx, it follows easily that

dz
= (A + Ā − BR−1B∗ P(t))z(t) − BR−1B∗ r(t) (6.7)
dt
z(0) = x̄0 . (6.8)

Going back to the HJB equation and using the expression (6.6), we obtain that
P(t) is the solution of the Riccati equation

dP
+ PA + A∗P − PBR−1B∗ P + Q + Q̄ = 0
dt
P(T ) = QT + Q̄T . (6.9)

This Riccati equation has a unique positive symmetric solution. To get r(t) we
need to solve the coupled system of differential equations

dz
= (A + Ā − BR−1 B∗ P(t))z(t) − BR−1B∗ r(t) (6.10)
dt
z(0) = x̄0 (6.11)
6.2 Solution of the Mean Field Game Problem 47

dr
− = (A∗ − P(t)BR−1B∗ )r(t) + (P(t)Ā − QS)z(t)
¯ (6.12)
dt
r(T ) = −Q̄T ST z(T ). (6.13)

It is easy to get s(t) by the formula

1
s(t) = z(T )∗ ST∗ Q̄T ST z(T )
2
ˆ T
1
+ tr aP(s) − r(s)∗ BR−1 B∗ r(s)
t 2

1
+r(s)∗ Āz(s) + z(s)∗ S∗ Q̄S z(s) ds. (6.14)
2

So the solvability condition reduces to solving the system (6.10) and (6.12).
The function u(x,t) is given by (6.6). The function m(x,t) can then be obtained
easily and it is a gaussian. However, (6.10) and (6.12) do not necessarily have a
solution. Conditions are needed. We will come back to this point after expressing
the maximum principle.
We write the maximum principle, applying (3.14) and (3.15) . We obtain stocastic
processes X(t),V (t),Y (t), Z(t) which must satisfy

dX = (AX − BR−1B∗Y + Ā EX(t))dt + σ dw


−dY = (A ∗ Y + (Q + Q̄)X − Q̄S EX(t))dt − Zdw
X(0) = x0
Y (T ) = (QT + Q̄T )X(T ) − Q̄T ST EX(T ) (6.15)

V (t) = −R−1 B∗Y (t). (6.16)

It follows that X̄(t) = EX(t), Ȳ (t) = EY (t) must satisfy the system

d X̄
= (A + Ā )X̄ − BR−1B∗Ȳ
dt
X̄(0) = x̄0
dȲ
− = A ∗ Ȳ + (Q + Q̄(I − S))X̄
dt
Ȳ (T ) = (QT + Q̄T (I − ST ))X̄(T ). (6.17)
48 6 Linear Quadratic Models

If we can solve this system, then (6.15) can be readily solved and the optimal
control is obtained. Now (6.17) is identical to (6.10) and (6.12) with the correspon-
dence
z(t) = X̄(t) (6.18)
r(t) = Ȳ (t) − P(t)X̄(t). (6.19)

Even though (6.17) is equivalent to (6.10) and (6.12) , the situation differs when
one tries to define sufficient conditions for a solution of (6.10) and (6.12) or for a
solution of (6.17) to exist. Indeed, (6.10) and (6.12) involve P(t), so the condition
will be expressed in terms of P(t) and thus is not easily checkable. The system (6.17)
involves the data directly. Thus, simpler conditions of a solution can be obtained
(see [10]) for a complete discussion. Moreover, we can see that (6.17) is related to
a nonsymmetric Riccati equation. Indeed, we have
Ȳ (t) = Σ(t)X̄(t) (6.20)
with

+ Σ(A + Ā) + A∗Σ − ΣBR−1B∗ Σ + Q + Q̄(I − S) = 0
dt
Σ(T ) = QT + Q̄T (I − ST ). (6.21)

It follows that
r(t) = (Σ(t) − P(t))X̄(t). (6.22)

However Σ(t) − P(t) is not solution of a single equation. Equation (6.21) is not
standard since it not symmetric. The existence of a solution of (6.17) is equivalent
to finding a solution of (6.21).

6.3 Solution of the Mean Field Type Problem

We apply (4.12). We first need to compute


 ˆ ∗
∂H ∗
(ξ , m, q)(x) = q Āx − ξ − S η m(η )d η Q̄Sx (6.23)
∂m
hence
ˆ
∂H
(ξ , m, Du(ξ ))(x)m(ξ )d ξ
∂m
ˆ ∗ ˆ ∗
= Du(ξ )m(ξ )d ξ Āx − η m(η )d η (I − S)∗Q̄Sx. (6.24)
6.3 Solution of the Mean Field Type Problem 49

The HJB equation reads


 ˆ 
∂u 1
− − tr aD2 u = − Du∗ BR−1 B∗ Du + Du. Ax + Ā ξ m(ξ )d ξ
∂t 2
  ˆ ∗  ˆ 
1 ∗
+ x Qx + x − S ξ m(ξ )d ξ Q̄ x − S ξ m(ξ )d ξ
2
ˆ ∗ ˆ ∗
+ Du(ξ )m(ξ )d ξ Āx − ξ m(ξ )d ξ (I − S)∗Q̄Sx
(6.25)
  ˆ ∗  ˆ 
1 ∗
u(x, T ) = x QT x + x − ST ξ m(ξ )d ξ Q¯T x − ST ξ m(ξ )d ξ
2
ˆ ∗
− ξ m(ξ )d ξ (I − ST )∗ Q¯T ST x. (6.26)

For the FP equation, there is no change. It is


  ˆ 
∂m −1 ∗
− tr aD m + div m Ax + Ā ξ m(ξ )d ξ − BR B Du
2
=0
∂t
m(x, 0) = m0 (x). (6.27)

We look for a solution


1
u(x,t) = x∗ P(t)x + ρ ∗(t) + τ (t). (6.28)
2
Calling again
ˆ
z(t) = xm(x,t)dx,

we get the differential equation

dz
= (A + Ā − BR−1B∗ P(t))z(t) − BR−1B∗ ρ (t)
dt
z(0) = x̄0 .

Replacing u(x,t) in the HJB equation (6.25) and identifying terms we obtain

dP
+ PA + A∗P − PBR−1B∗ P + Q + Q̄ = 0
dt
P(T ) = QT + Q̄T (6.29)
50 6 Linear Quadratic Models

and the pair z(t), ρ (t) must be a solution of the system

dz
= (A + Ā − BR−1B∗ P(t))z(t) − BR−1B∗ ρ (t)
dt
z(0) = x̄0 (6.30)

− = (A∗ + Ā∗ − P(t)BR−1B∗ )ρ (t)
dt
+ (P(t)Ā + Ā∗ P − Q̄S − S∗Q̄ + S∗Q̄S)z(t)
ρ (T ) = (−Q̄T ST − ST∗ Q̄T + ST∗ Q̄T ST )z(T ). (6.31)

We note that P(t) is identical to the case of a mean field game; see (6.9). The
system z(t), ρ (t) is different from (6.10) and (6.12). We will see in considering the
stochastic maximum principle that it always has a solution.
To write the stochastic maximum principle, we use (4.19). We obtain stochastic
processes X(t),V (t),Y (t), andZ(t), which must satisfy

dX = (AX − BR−1B∗Y + Ā EX(t))dt + σ dw


−dY = (A ∗ Y + (Q + Q̄)X + Ā∗EY (t) − (Q̄S − S∗Q̄(I − S))EX(t))dt − Zdw
(6.32)
X(0) = x0
Y (T ) = (QT + Q̄T )X(T ) − (Q̄T ST − ST∗ Q̄T (I − ST )) EX(T )

V (t) = −R−1 B∗Y (t). (6.33)

Writing X̄(t) = EX(t), Ȳ (t) = EY (t) we deduce the system

d X̄
= (A + Ā )X̄ − BR−1B∗Ȳ
dt
X̄(0) = x̄0
dȲ
− = (A + Ā) ∗ Ȳ + (Q + (I − S)∗Q̄(I − S))X̄
dt
Ȳ (T ) = (QT + (I − ST )∗ Q̄T (I − ST ))X̄(T ). (6.34)

Conversely, in the case of mean field games [see (6.17)], this system always has
a solution, provided that we assume

Q + (I − S)∗Q̄(I − S) ≥ 0, QT + (I − ST )∗ Q̄T (I − ST ) ≥ 0. (6.35)


6.4 The Mean Variance Problem 51

In particular, if we write Ȳ (t) = Σ(t)X̄(t), then Σ(t) is solution of the symmetric


Riccati equation


+ Σ(A + Ā) + (A + Ā)∗ Σ − ΣBR−1B∗ Σ + Q + (I − S)∗Q̄(I − S) = 0
dt
Σ(T ) = QT + (I − ST )∗ Q̄T (I − ST ). (6.36)

So we can see in the linear quadratic case that the mean field type control problem
has solutions more generally than the mean field game problem.

6.4 The Mean Variance Problem

The mean variance problem is the extension in continuous time for a finite horizon
of the Markowitz optimal portfolio theory. Without referring to the background of
the problem, it can be stated as follows, mathematically. The state equation is

dx = rxdt + xv · (α dt + σ dw)
x(0) = x0 (6.37)
x(t) is scalar, r is a positive constant, α is a vector in Rm , and σ is a matrix in
L(Rd ; Rm ). All can depend on time and they are deterministic quantities. v(t) is the
control in Rm . We note that, conversely to our general framework, the control affects
the volatility term. The objective function is
γ
J(v(.)) = Ex(T ) − var(x(T )) (6.38)
2
which we want to maximize.
Because of the variance term, the problem is not a standard stochastic control
problem. It is a mean field type control problem, since one can write
γ γ
J(v(.)) = E(x(T ) − x(T )2 ) + (Ex(T ))2 . (6.39)
2 2
Because of the presence of the control in the volatility term, we need to
reformulate our general approach of mean field type problems, which we shall
do formally, and without details. We consider a feedback control v(x, s) and the
corresponding state xv(.) (t) solution of (6.37) when the control is replaced by the
feedback. We associate the probability density mv(.) (x,t) solution of

∂ mv(.) ∂ 1 ∂2 2
+ (xmv(.) (r + α · v(x))) − (x mv(.) |σ ∗ v(x)|2 ) = 0
∂t ∂x 2 ∂ x2
mv(.) (x, 0) = δ (x − x0 ). (6.40)
52 6 Linear Quadratic Models

The functional (6.39) can be written as


ˆ ˆ 2
γ γ
J(v(.)) = mv(.) (x, T )(x − x2 )dx + mv(.) (x, T )xdx . (6.41)
2 2

Let v̂(x,t) be an optimal feedback, and m(t) = mv̂(.) (t). We compute the Frechet
derivative
d
m̃(x,t) = (m )
d θ v̂(.)+θ v(.) |θ =0
which is the solution of

∂ m̃ ∂ 1 ∂2 2 ∗ ∂
+ (xm̃(r + α · v̂(x))) − (x m̃|v̂ (x)σ |2 ) = − (xm α .v)
∂t ∂x 2 ∂ x2 ∂x
∂2 2 ∗
+ (x mv̂ (x)σ · v∗ σ )
∂ x2
m̃(x, 0) = 0. (6.42)

We deduce the Gateux differential of the objective functional


ˆ  ˆ ˆ
d γ 
J(v̂(.) + θ v(.))|θ =0 = m̃(x, T ) x − x2 dx + γ m(x, T )xdx m̃(x, T )xdx.
dθ 2
(6.43)
We next introduce the function u(x,t) solution of

∂u ∂u 1 2 ∗ ∂ 2u
− − x(r + α · v̂(x)) − x |σ v̂ (x)|2 2 = 0
∂t ∂x 2 ∂x
ˆ
γ
u(x, T ) = x − x2 + γ x m(ξ , T )ξ d ξ (6.44)
2

then
ˆ
d
J(v̂(.) + θ v(.))|θ =0 = m̃(x, T )u(x, T )dx

ˆ ˆ  
T
∗ ∂u 2∂ u
2

= m(x,t)v (x,t) x α + x σ σ v̂(x,t) dxdt.
0 R ∂x ∂ x2

Since v̂(.) maximizes J(v(.)), we obtain


 
∗ ∂u 2∂ u
2

v (x,t) x α + x σ σ v̂(x,t) ≤ 0, a.e. x,t, ∀v(x,t)
∂x ∂ x2
6.4 The Mean Variance Problem 53

hence
∂u
v̂(x,t) = − ∂ 2x (σ σ ∗ )−1 α . (6.45)
∂ u
x 2
∂x
So the pair u(x,t), m(x,t) satisfies


∂u 2
∂u ∂u 1 ∂x
− − xr + α ∗ (σ σ ∗ )−1 α = 0
∂t ∂ x 2 ∂ 2u
∂ x2
ˆ
γ 2
u(x, T ) = x − x + γ x m(ξ , T )ξ d ξ (6.46)
2

⎛ ⎞
∂u
∂m ∂ (xm) ∂ ⎜ ⎟
⎜m ∂ x ⎟ α ∗ (σ σ ∗ )−1 α
+r − ⎝
∂t ∂x ∂x ∂ u⎠
2

∂ x2
⎛ ⎞
∂u
1 ∂2 ⎜ ( )2 ⎟
− ⎜ m ∂ x ⎟ α ∗ (σ σ ∗ )−1 α = 0
2 ∂ x2 ⎝ ∂ 2 u 2 ⎠
( 2)
∂x
m(x, 0) = δ (x − x0 ). (6.47)

We can solve this system explicitly. We look for


1
u(x,t) = − P(t)x2 + s(t)x + ρ (t). (6.48)
2
We also define
ˆ
q(t) = m(ξ ,t)ξ d ξ . (6.49)

From (6.47) we obtain easily


 
1 1
Ṗ + r − α ∗ (σ σ ∗ )−1 α P = 0
2 2
P(T ) = γ

ṡ + (r − α ∗ (σ σ ∗ )−1 α )s = 0
s(T ) = 1 + γ q(T )
54 6 Linear Quadratic Models

1 s2 ∗
ρ̇ + α (σ σ ∗ )−1 α = 0
2P
ρ (T ) = 0.

We obtain

ˆ T 
∗ ∗ −1
P(t) = γ exp (2r − α (σ σ ) α )d τ
t
ˆ T 
∗ ∗ −1
s(t) = (1 + γ q(T )) exp (r − α (σ σ ) α )d τ
t
ˆ T
1 s2 ∗
ρ (t) = α (σ σ ∗ )−1 α (τ )d τ . (6.50)
t 2P
We need to fix q(T ). Equation (6.47) becomes
∂m ∂ (xm) ∂   s  ∗
+r − m x− α (σ σ ∗ )−1 α
∂t ∂x ∂x P
  
1 ∂2 s 2
− m x− α ∗ (σ σ ∗ )−1 α = 0
2 ∂ x2 P
m(x, 0) = δ (x − x0 ). (6.51)

If we test this equation with x we obtain easily


s ∗
q̇ − (r − α ∗ (σ σ ∗ )−1 α ))q = α (σ σ ∗ )−1 α
P
 ˆ T 
1 + γ q(T) ∗ ∗ −1
= α (σ σ ) α exp − rd τ
γ t

q(0) = x0 .

We deduce easily
ˆ T 
∗ ∗ −1
q(T ) = x0 exp (r − α (σ σ ) α )d τ
0
  ˆ T 
1 + γ q(T) ∗ ∗ −1
+L 1 − exp − α (σ σ ) α )d τ
γ 0

and we obtain
ˆ T   ˆ T  
1
q(T ) = x0 exp rd τ + exp α ∗ (σ σ ∗ )−1 α )d τ − 1 . (6.52)
0 γ 0
6.4 The Mean Variance Problem 55

This completes the definition of the function u(x,t). The optimal feedback is defined
by [see (6.45)]
 ˆ T 
1 1 + γ q(T )
∗ −1
v̂(x,t) = −(σ σ ) α + exp − rd τ . (6.53)
x γ t

We see that this optimal feedback depends on the initial condition x0 .


If we take the time consistency approach, we consider the family of problems

dx = rxds + xv(x, s).(α dt + σ dw), s > t


x(t) = x (6.54)

and the payoff


 γ  γ
Jx,t (v(.)) = E x(T ) − x(T )2 + (Ex(T ))2 . (6.55)
2 2

Denote by v̂(x, s) an optimal feedback and set V (x,t) = Jx,t (v̂(.)). We define

Ψ(x,t; T ) = E x̂xt (T )

where x̂xt (s) is the solution of (6.54) for the optimal feedback. The function
Ψ(x,t; T ) is the solution of

∂Ψ ∂Ψ 1 ∂ 2Ψ
+ (rx + xv̂(x,t)∗ α ) + x2 2 |σ ∗ v̂(x,t)|2 = 0
∂t ∂x 2 ∂x
Ψ(x, T ; T ) = x.

We can write
 γ  γ
V (x,t) = E x̂xt (T ) − x̂xt (T )2 + (Ψ(x,t; T ))2 .
2 2
We consider the spike modification

v t < s < t +ε
v̄(x, s) =
v̂(x, s) s > t + ε

then
γ
Jx,t (v̄(.)) = E(x̂x(t+ε ),t+ε (T ) − x̂x(t+ε ),t+ε (T )2 )
2
γ
+ (EΨ(x(t + ε ),t + ε ; T ))2
2
56 6 Linear Quadratic Models

where x(t + ε ) corresponds to the solution of (6.54) at time t + ε for the feedback
equal to the constant v. We note that

γ
EV (x(t + ε ),t + ε ) = E(x̂x(t+ε ),t+ε (T ) − x̂x(t+ε ),t+ε (T )2 )
2
γ
+ E(Ψ(x(t + ε ),t + ε ; T ))2
2

so we need to compare (EΨ(x(t + ε ),t + ε ; T ))2 with E(Ψ(x(t + ε ),t + ε ; T ))2 .


Using the techniques of Sect. 4.4 we see easily that

(EΨ(x(t + ε ),t + ε ; T ))2 − E(Ψ(x(t + ε ),t + ε ; T ))2


∂ 2Ψ
= − ε x2 (x,t; T )|σ ∗ v|2 + 0(ε ).
∂ x2
Therefore,

V (x,t) ≥ Jx,t (v̄(.)) = EV (x(t + ε ),t + ε )


ε ∂ 2Ψ
− x2 2 (x,t; T )|σ ∗ v|2 + 0(ε ).
∂x

Expanding EV (x(t + ε ),t + ε ) we obtain the HJB equation

∂V ∂V ∂V ∗ x2 ∂ 2 V ∂ 2Ψ
+ rx + max[x v α + ( 2 − γ 2 (x,t; T ))v∗ σ σ ∗ v] = 0
∂t ∂x v ∂x 2 ∂x ∂x
V (x, T ) = x. (6.56)

A direct checking shows that


ˆ T
1
V (x,t) = x exp(r(T − t)) + α ∗ (σ σ ∗ )α ds (6.57)
2γ t

ˆ T
1
Ψ(x,t) = x exp(r(T − t)) + α ∗ (σ σ ∗ )α ds
γ t

and

exp(−r(T − t))
v̂(x,t) = (σ σ ∗ )α (6.58)

This optimal control satisfies the time consistency property (it does not depend
on the initial condition).
6.5 Approximate N Player Differential Game 57

6.5 Approximate N Player Differential Game

From the definition of f (x, m, v), g(x, m, v), andh(x, m)—see (6.1), (6.2), and
(6.3)—the differential game (5.1) and (5.2) becomes
 
N
1
dx =
i
Ax + Bv + Ā
i i
N −1 ∑ x j
dt + σ dwi (6.59)
j=1=i

xi (0) = xi0 (6.60)

ˆ T
1
J i (v(.)) = E (xi )∗ Qxi
2 0
 ∗  
N N
1 1
+ x −S i
N−1 ∑ x j
Q̄ x −S
i
N−1 ∑ x j
+(vi )∗ Rvi dt
j=1=i j=1=i
 ∗  
N N
1 1 1
+ E (xi )∗ QT xi (T )+ xi −ST
2 N −1 ∑ xj Q̄T xi −ST
N −1 ∑ x j (T ) .
j=1=i j=1=i

(6.61)

The reasoning developed in Sect. 5.4, can be applied in the present situation.
Considering the optimal feedback

v̂(x) = −R−1 B∗ (P(t)x + r(t)) (6.62)

where P(t)and r(t) have been defined in (6.9) and (6.12). We consider the stochastic
process v̂(t) = v̂(x̂(t)) where x̂(t) is the optimal trajectory for the control problem
with the mean field term

d x̂ = ((A − BR−1B∗ P(t))x̂ − R−1 B∗ r(t) + ĀE x̂(t))dt + σ dw


x̂(0) = x0 (6.63)

so

v̂(t) = −R−1B∗ (P(t)x̂(t) + r(t)). (6.64)

We then consider N independent Wiener processes wi (.) and initial conditions


xi0 .
We duplicate the stochastic process v̂(t) into v̂i (t). These processes constitute
an approximate open-loop Nash equilibrium for the objective functionals (6.61).
We can also consider the feedback (6.62) and duplicate for all players. We will also
obtain an approximate Nash equilibrium among smooth feedbacks (see Sect. 5.5).
Chapter 7
Stationary Problems

7.1 Preliminaries

We shall consider only mean field games, but mean field type control can also
be considered. To obtain stationary problems, Lasry and Lions [27] consider
ergodic situations. This introduces an additional difficulty. It is, however, possible to
motivate stationary problems that correspond to infinite horizon discounted control
problems. The price to pay concerns the N player differential game associated with
the mean field game. It is less natural than the one used in the time-dependent case.
However, other interpretations are possible, which do not lead to the same difficulty.
The natural counterpart of (3.11) is

Au + α u = H(x, m, Du)
A∗ m+div (G(x, m, Du)m) + α m = α m0 , (7.1)

where α is a positive number (the discount). Note that when m0 is a probability,


the solution m is also a probability. We shall simplify a little bit with respect to the
time-dependent case. We consider f (x, m, v) as in Sect. 2.1, but we take g(x, v) as
not being dependent on m. Then, considering the Lagrangian

f (x, m, v) + q.g(x, v)

we suppose that the minimum is attained at a point v̂(x, m, q), and this function is
well-defined and sufficiently smooth. We next define the Hamiltonian

H(x, m, q) = inf[ f (x, m, v) + q.g(x, v)] (7.2)


v

= f (x, m, v̂(x, m, q)) + q.g(x, v̂(x, m, q)) (7.3)

A. Bensoussan et al., Mean Field Games and Mean Field Type Control Theory, 59
SpringerBriefs in Mathematics, DOI 10.1007/978-1-4614-8508-7__7,
© Alain Bensoussan, Jens Frehse, Phillip Yam 2013
60 7 Stationary Problems

and

G(x, m, q) = g(x, v̂(x, m, q)). (7.4)

These are the functions entering into (7.1).

7.2 Mean Field Game Set-Up

Consider a feedback v(.) and a probability density m(.). We construct the state
equation associated to the feedback control

dx = g(x(t), v(x(t))dt + σ (x(t))dw(t)


x(0) = x0 . (7.5)

We then define the cost functional


ˆ +∞
J(v(.), m(.)) = E exp(−α t) f (x(t), m, v(x(t)))dt (7.6)
0

and denote by pv(.) (x,t) the probability distribution of the process x(t). We use
pv(.) (x,t) instead of mv(.) (x,t) to avoid confusion. A pair v̂(.) and m(.) is a solution
of the mean field game problem, whenever

J(v̂(.), m(.)) ≤ J(v(.), m(.)), ∀v(.) (7.7)


ˆ +∞
m=α exp(−α t) pv̂(.) (t)dt. (7.8)
0

From standard control theory, it is clear that


ˆ
J(v̂(.), m(.)) = m0 (x)u(x)dx (7.9)
Rn

in which u(x) is the solution of the first equation (7.1), and

v̂(x) = v̂(x, m, Du(x)) (7.10)

so G(x, m, Du) = g(x, v̂(x)). Now pv̂(.) (x,t) is the solution of

∂ pv̂(.)
+ A∗ pv̂(.) + div (g(x, v̂(x))pv̂(.) ) = 0
∂t
pv̂(.) (x, 0) = m0 (x). (7.11)
7.3 Additional Interpretations 61

´ +∞
If we compute α 0 exp(−α t) pv̂(.) (t)dt we easily see that it satisfies the second
equation (7.1). Note that
ˆ
α J(v̂(.), m(.)) = m(x) f (x, m, v̂(x))dx. (7.12)
Rn

For any feedback v(.) we can introduce pv(.) (x,t) as the solution of

∂ pv(.)
+ A∗ pv(.) + div (g(x, v(x))pv(.) ) = 0
∂t
pv(.) (x, 0) = m0 (x)

and let
ˆ +∞
pv(.),α (x) = α exp(−α t) pv(.) (x,t)dt
0

then we can write


ˆ
α J(v(.), m(.)) = pv(.),α (x) f (x, m, v(x))dx (7.13)
Rn

so we have the fixed-point property m = pv̂(.),α (x), with v̂(.) depending on m from
formula (7.10).

7.3 Additional Interpretations

We restrict the model to

∂ Φ(m)
f (x, m, v) = f (x, v) + (x), (7.14)
∂m

where Φ(m) is a functional on L1 (Rn ). For any feedback v(.) consider the
probability pv(.),α (x). It is the solution of

A∗ pv(.),α + div (g(x, v(x))pv(.),α ) + α pv(.),α = α m0 . (7.15)

We define an objective functional


ˆ
α J(v(.)) = f (x, v(x))pv(.),α (x)dx + Φ(pv(.),α ). (7.16)
Rn

We can obtain the Gâteaux differential


ˆ
d J(v(.) + θ ṽ(.)) ∂L
α |θ =0 = (x, v(x), Duv(.) (x))pv(.),α (x)ṽ(.)dx, (7.17)
dθ Rn ∂ v
62 7 Stationary Problems

where uv(.) (x) is the solution of

∂ Φ(pv(.),α )
Auv(.) + α uv(.) = f (x, v(x)) + g(x, v(x)).Duv(.) + (x) (7.18)
∂m
and

L(x, v, q) = f (x, v) + q.g(x, v).

A necessary condition of optimality is

∂L
(x, v(x), Duv(.) (x)) = 0 (7.19)
∂v
which, if convexity in v holds, implies that the optimal feedback minimizes in v, the
Lagrangian L(x, v, Du(x)). We can define the Hamiltonian

H(x, q) = inf( f (x, v) + q.g(x, v)) (7.20)


v

and

∂ Φ(m)
H(x, m, q) = H(x, q) + (x). (7.21)
∂m

Considering the point of minimum v̂(x, q) and


G(x, m, q) = G(x, q) = g(x, v̂(x, q)) (7.22)

then the system (7.1) can be interpreted as a necessary condition of optimality for
the problem of minimizing J(v(.)).
Still, in the convex case, we can also interpret (7.1) as a necessary condition of
optimality for a control problem of the HJB equation.
Define next the conjugate of Φ(m) by
 ˆ 
Φ∗ (z(.)) = sup Φ(m) − z(x)m(x)dx (7.23)
m Rn

For any z(.) ∈ L∞ (Rn ), define mz(.) (x) to be the point of supremum in (7.23).
It satisfies
∂ Φ(mz(.) )
(x) = z(x). (7.24)
∂m
Define next uz(.) (x) to be the solution of

Auz(.) + α uz(.) = H(x, Duz(.) ) + z(x). (7.25)


7.3 Additional Interpretations 63

In this equation z(x) appears as a control, and the corresponding state is uz(.) .
The objective functional is defined by
ˆ
K(z(.)) = Φ∗ (z(.)) + α m0 (x)uz(.) (x)dx. (7.26)
Rn

We can look for a necessary condition of optimality in minimizing K(z(.)). We first


have
ˆ
d ∗
Φ (z(.) + θ z̃(.))|θ =0 = − mz(.) (x)z̃(x)dx (7.27)
dθ Rn

and
d
u (x) = ũ(x) (7.28)
d θ z(.)+θ z̃(.) |θ =0
with ũ(x) solution of

Aũ + α u = Dũ.g(x, v̂(x, uz(.) (x))) + z̃(x). (7.29)

Therefore,
ˆ ˆ
d
K(z(.) + θ z̃(.))|θ =0 = − mz(.) (x)z̃(x)dx + α m0 (x)ũ(x)dx.
dθ Rn Rn

Define next m̄z(.) by


A∗ m̄z(.) + div (g(x, v̂(x, uz(.) (x)))m̄z(.) ) + α m̄z(.) = α m0 (7.30)

then from (7.29) to (7.30) we obtain easily


ˆ ˆ
α m0 (x)ũ(x)dx = m̄z(.) (x)z̃(x)dx
Rn Rn

and thus we get


ˆ
d
K(z(.) + θ z̃(.))|θ =0 = (m̄z(.) (x) − mz(.) (x))z̃(x)dx. (7.31)
dθ Rn

If we express that the Frechet derivative is 0, then we must have m̄z(.) (x) = mz(.) (x).
If we call m(x) this common function, we have from (7.24)

∂ Φ(m)
z(x) = (x).
∂m

If we call u(x) = uz(.) (x), then the pair u, m satisfies the system (7.1).
64 7 Stationary Problems

7.4 Approximate N Player Nash Equilibrium

We consider N independent Wiener processes, wi (t) and N random variables xi0 ,


which are independent of the Wiener processes and identically distributed with
density m0 . Consider local feedbacks (feedbacks on the individual states) vi (xi ).
The evolution of the state of the player i is governed by
dxi = g(xi (t), vi (xi (t))dt + σ (xi (t))dwi (t)
xi (0) = xi0 . (7.32)

These trajectories are independent. The probability of the variable xi (t) is given by
pvi (.) (xi ,t) with pv(.) (x,t) the solution of

∂ pv(.)
+ A∗ pv(.) + div (g(x, v(x))pv(.) ) = 0
∂t
pv(.) (x, 0) = m0 (x). (7.33)

The objective functionals are defined by


ˆ ∞
J N,i (v(.)) = E exp(−α t) f (xi (t), vi ((xi (t)))
0
 ˆ 
+∞
α N
+ f0 x (t),
i
N −1
exp(−α s) ∑ δx j (s) ds dt (7.34)
0 j=1=i

Note the difference with respect to the time-dependent case; see (5.2).
This Nash game has an equilibrium composed of duplicates of a common
feedback. Indeed , recall v̂(x, q) which attains the minimum of
f (x, v) + q.g(x, v)
and

H(x, q) = inf[ f (x, v) + q.g(x, v)]


H(x, m, q) = H(x, q) + f0 (x, m)
G(x, q) = g(x, v̂(x, q)).

Next we define a system composed of a function uN (x) and stochastic processes


x̂1 (.), . . . , x̂N which are solutions of

d x̂i = G(x̂i , DuN (x̂i ))dt + σ (x̂i (t))dwi (t)


xi (0) = xi0 (7.35)
7.4 Approximate N Player Nash Equilibrium 65

AuN + α uN =H(x, DuN )


 ˆ 
+∞
α N
+ E f0 x,
N −1 0
exp(−α s) ∑ δx̂ j (s) ds . (7.36)
j=2

If this system can be solved, we deduce feedbacks (dependent on N) v̂N (x) =


v̂(x, DuN (x)). The copies v̂N (xi ) form a Nash equilibrium.
Indeed, if we set v̂N (.) = (v̂N (x1 ), . . . , v̂N (xN )), then
ˆ
J N,i (v̂N (.)) = uN (x)m0 (x)dx. (7.37)
Rn

Let us next focus on player 1. Assume that player 1 uses a feedback control v1 (x1 )
and all other players use v̂N (x j ), j = 2, . . . , N. We use the notation

J N,1 (v1 (.), v̂1N (.))

to denote the objective functional of player 1, when he uses the feedback v1 (x1 ) and
the other players use the feedbacks v̂N (x j ), j = 2, . . . , N. The trajectory of player
1 is
dx1 = g(x1 (t), v1 (x1 (t))dt + σ (x1 (t))dw1 (t)
x1 (0) = x10

and the trajectories of other players remain x̂ j (t), j = 2, . . . , N. By a standard


verification argument, computing the Ito’s differential of uN (x1 (t)) exp(−α t), one
checks that
ˆ
uN (x)m0 (x)dx ≤ J N,1 (v1 (.), v̂1N (.))
Rn

which proves the Nash equilibrium property.


It remains to compare (7.35) and (7.36) since N → +∞. Note that the probability
densities of x̂i (t) are identical and are defined by p̂N (x,t) solution of
∂ p̂N
+ A∗ p̂N + div (G(x, DuN (x)) p̂N ) = 0
∂t
p̂N (x, 0) = m0 (x). (7.38)

We set
ˆ +∞
mN (x) = α exp(−α t) p̂N (x,t)dt
0

then

A∗ mN + div (G(x, DuN (x))mN ) + α mN = α m0 . (7.39)


66 7 Stationary Problems

α
´ +∞
0 exp(−α s) ∑ j=2 δx̂ j (s) ds. It is a random measure on R .
N n
Consider now N−1
We can write for any continuous bounded function ϕ (x) on Rn
ˆ +∞
α N
ϕ,
N −1 0
exp(−α s) ∑ δx̂ j (s) ds
j=2
ˆ +∞
α N
= exp(−α s) ∑ ϕ (x̂ j (s))ds
N −1 0 j=2

+∞ ˆ
α N
= exp(−α s) ∑ (ϕ (x̂ j (s)) − E ϕ (x̂ j (s)))ds
N −1 0 j=2
ˆ
+ ϕ (x)mN (x)dx.
Rn
´ +∞
The random variables α 0 exp(−α s)(ϕ (x̂ j (s)) − E ϕ (x̂ j (s)))ds are independent,
identically distributed with 0 mean. So from the law of large numbers, we obtain
that
ˆ +∞
α N
exp(−α s) ∑ (ϕ (x̂ j (s)) − E ϕ (x̂ j (s)))ds → 0, a.s. as N → +∞.
N −1 0 j=2

So, if we show that


ˆ ˆ
ϕ (x)mN (x)dx → ϕ (x)m(x)dx
Rn Rn

then we will get


ˆ +∞
α N

N −1 0
exp(−α s) ∑ δx̂ j (s) ds → m(x)dx, a.s.
j=2

for the weak * topology of the set of measures on Rn . If f0 (x, m) is continuous in m


for the weak * topology, then we will get
 ˆ +∞ 
α N
E f0 x, exp(−α s) ∑ δx̂ j (s) ds → f0 (x, m).
N −1 0 j=2

If we get estimates for uN , mN in Sobolev spaces, then we can extract a subsequence


and the pair uN , mN will converge towards u, m, the solution of
Au + α u = H(x, Du) + f0 (x, m)
A∗ m+div (G(x, Du)m) + α m = α m0 . (7.40)
This shows the approximation property.
Chapter 8
Different Populations

8.1 General Considerations

In preceding chapters, we considered a single population composed of a large


number of individuals with identical behavior. In real situations, we will have several
populations. The natural extension to the preceding developments is to obtain mean
field equations for each population. A much more challenging situation will be
to consider competing populations. This will be addressed in the next chapter.
We discuss first the approach of multiclass agents, as described in [20, 22].

8.2 Multiclass Agents

We consider a more general situation than in [20, 22], which extends the model
discussed in Sect. 5.4. Instead of functions f (x, m, v), g(x, m, v), h(x, m), σ (x), we
consider K functions fk (x, m, v), gk (x, m, v), hk (x, m), σk (x), k = 1, . . . , K. The index
k represents some characteristics of the agents, and a class corresponds to one value
of the characteristics. So there are K classes. In the model discussed previously, we
considered a single class. In the sequel, when we consider an agent i, he or she will
have a characteristics α i ∈ (1, . . . , K). Agents will be defined with upper indices, so
i = 1, . . . , N with N very large. the value α i is known information. The important
assumption is

1 N
∑ Iα i =k → πk , as N → +∞
N i=1
(8.1)

and πk is a probability distribution on the finite set of characteristics, which


represents the probability that an agent has the characteristics k.

A. Bensoussan et al., Mean Field Games and Mean Field Type Control Theory, 67
SpringerBriefs in Mathematics, DOI 10.1007/978-1-4614-8508-7__8,
© Alain Bensoussan, Jens Frehse, Phillip Yam 2013
68 8 Different Populations

Generalizing the case of a single class, we define ak (x) = 12 σk (x)σk (x)∗ and the
operator

Ak ϕ (x) = −trak (x)D2 ϕ (x).

We define Lagrangians—i.e., Hamiltonians indexed by k,—namely

Lk (x, m, v, q) = fk (x, m, v) + q.gk (x, m, v)

Hk (x, m, q) = inf Lk (x, m, v, q)


v

and v̂k (x, m, q) denote the minimizer in the definition of the Hamiltonian. We also
define

Gk (x, m, q) = gk (x, m, v̂k (x, m, q)).

Given a function m(t) we consider the HJB equations indexed by k

∂ uk
− + Auk = Hk (x, m, Duk )
∂t
uk (x, T ) = hk (x, m(T )) (8.2)

and the FP equations


∂ mk
+ A∗ mk + div (Gk (x, m, Duk )mk ) = 0 (8.3)
∂t
mk (x, 0) = mk0 (x) (8.4)

in which the probability densities mk0 are given. A mean field game equilibrium for
the multiclass agents problem is attained whenever
K
m(x,t) = ∑ πk mk (x,t), ∀x,t. (8.5)
k=1

To this mean field game we associate a Nash differential game for a large number of
players, N. Considering independent Wiener processes wi (t), i = 1, . . . , N, we define
the state equations
 
1
dxi (t) = gα i xi (t),
N −1 ∑ δx j (t) , vi (xi (t)) dt + σα i (xi (t))dwi (t)
j=1=i

xi (0) = xi0 , (8.6)


8.2 Multiclass Agents 69

where the random variables xi0 are independent, with probability density mα i 0 (x).
In the state equation vi (xi (t)) denotes a feedback on the agent’s state. We define the
cost functional associated with player i
ˆ  
T N
1
J N,i
(v(.)) = E fα i x (t),
i
N −1 ∑ δx j (t) , v (x (t)) dt
i i
0 j=1=i
 
N
1
+ hα i xi (T ),
N −1 ∑ δx j (T ) . (8.7)
j=1=i

The objective is to obtain an approximate Nash equilibrium. We consider the


feedbacks

v̂k (x,t) = v̂k (x, m(t), Duk (x,t))

and player i will use the local feedback

v̂i (xi ,t) = v̂α i (xi ,t). (8.8)

We want to show that these feedbacks constitute an approximate Nash equilibrium


for the differential game with N players; see (8.6) and (8.7). The admissible
feedbacks will be Lipschitz feedbacks. We first consider the trajectories

d x̂i (t) = gα i (x̂i (t), m(t), v̂i (x̂i (t),t))dt + σα i (x̂i (t))dwi (t)
x̂i (0) = xi0 . (8.9)

In this equation, only player i is involved. We then consider the cost


ˆ T
Jα i (v̂i (.)) = E fα i (x̂i (t), m(t), v̂i (x̂i (t),t))dt
0

+ hα i (x̂i (T ), m(T )) . (8.10)

Again, this cost functional involves only player i. By standard results of dynamic
programming, we can check that
ˆ
Jα i (v̂ (.)) =
i
uα i (x, 0)m(x, 0)dx. (8.11)

Consider next the trajectories of the N players when they simultaneously apply the
feedbacks v̂i (xi ,t). They are defined as follows
70 8 Different Populations

 
N
1
d ŷ (t) = gα i
i
ŷ (t),
i
N−1 ∑ δŷ j (t) , v̂ (ŷ (t),t) dt + σα i (ŷi (t))dwi (t)
i i
j=1=i

ŷi (0) = xi0 (8.12)

and the corresponding cost functional is given by


ˆ  
T N
1
J N,i
(v̂(.))) =E fα i ŷ (t),
i
N −1 ∑ δŷ j (t) , v̂ (ŷ (t),t) dt
i i
0 j=1=i
 
N
1
+ hα i ŷ (T ),
i
N −1 ∑ δŷ j (T ) , (8.13)
j=1=i

where v̂(.) = (v̂1 (.), . . . , v̂N (.)). We claim that


 
1
J N,i (v̂(.))) = Jα i (v̂i (.)) + O √ . (8.14)
N

We proceed as in Sect. 5.4. We show that x̂i (t) is an approximation of ŷi (t). We write

d(ŷi (t) − x̂i (t))


   
N N
1 1
= gα i ŷ (t), i
∑ δŷ j (t) , v̂ (ŷ (t),t) − gα i x̂ (t), N − 1 ∑ δŷ j (t) , v̂ (x̂ (t),t)
N − 1 j=1
i i i i i
=i j=1=i
   
N N
1 1
+gα i x̂ (t),
i
∑ δŷ j (t), v̂ (x̂ (t),t) − gα i x̂ (t), N − 1 ∑ δx̂ j (t), v̂ (x̂ (t),t)
N − 1 j=1
i i i i i
=i j=1=i
 
N
1
+ gα i x̂i (t), ∑ δ j , v̂i (x̂i (t),t) − gα i (x̂i (t), m(t), v̂i (x̂i (t),t)) dt
N − 1 j=1=i x̂ (t)

+(σα i (ŷi (t)) − σα i (x̂i (t)))dwi


ŷi (0) − x̂i (0) = 0.

With appropriate Lipschitz assumptions, we control all terms with the difference
∑Ni=1 |ŷi (t) − x̂i (t)|2 , except the term
 
N
1
gα i x̂i (t),
N −1 ∑ δx̂ j (t) , v̂i (x̂i (t),t) − gα i (x̂i (t), m(t), v̂i (x̂i (t),t))
j=1=i

which is the driving term converging to 0 as N → +∞. Indeed, we have to show that
the random measure on Rn defined by N−1 1
∑Nj=1=i δx̂ j (t) converges for any t towards
the deterministic measure m(x,t)dx, a.s.
8.2 Multiclass Agents 71

But if ϕ is continuous and bounded on Rn we can write

1 N 1 N
ϕ,
N −1 ∑ δx̂ j (t) =
N −1 ∑ ϕ (x̂ j (t))
j=1=i j=1=i

1 N 1 N
=
N −1 ∑ (ϕ (x̂ j (t)) − E ϕ (x̂ j (t))) +
N −1 ∑ E ϕ (x̂ j (t)).
j=1=i j=1=i

But the random variables ϕ (x̂ j (t)) − E ϕ (x̂ j (t)) are independent and identically
distributed with mean 0. From the law of large numbers it follows that

N
1
N −1 ∑ (ϕ (x̂ j (t)) − E ϕ (x̂ j (t))) → 0, a.s. for fixed i, as N → ∞.
j=1=i

Next the probability density of x̂ j (t) is mα j (t). Therefore,

N N ˆ
1 1
N −1 ∑ E ϕ (x̂ (t)) = N − 1
j
∑ ϕ (x)mα j (x,t)dx
j=1=i R
n
j=1=i

K ˆ N
1
=∑ ϕ (x)mk (x,t) ∑ Iα j =k dx
k=1 N −1 j=1=i

and from the assumption (8.1) we obtain that

N K ˆ ˆ
1
N −1 ∑ E ϕ (x̂ j (t)) → ∑ ϕ (x)mk (x,t)πk dx = ϕ (x)m(x,t)dx
j=1=i k=1

which proves the convergence of N−1 1


∑Nj=1=i δx̂ j (t) to m(x,t)dx a.s. The result
follows from an assumption of continuity of gk (x, m, v) in m, with respect to the
weak * convergence of probability measures.
With this in hand we can compare J N,i (v̂(.)) with Jα i (v̂i (.)) as done for (5.36)
in Sect. 5.4. This proves (8.14).
We next focus on player 1. Suppose this player uses a different local feedback
v1 (x1 ) that is Lipschitz. The other players use v̂i (xi ), i = 2, . . . , N. We call this set
of controls ṽ(.) = (v1 (.), v̂2 (.), . . . , v̂N (.)) and we use it in the differential game; see
(8.6) and (8.7). We note the corresponding states by (y1 (.), . . . , yN (.)). We get the
equations
 
1
dy (t) = gα 1
1
y (t),
1
N −1 ∑ δy j (t) , v (y (t)) dt + σα 1 (y1 (t))dw1 (t)
1 1
j=1=i

y1 (0) = x10 (8.15)


72 8 Different Populations

 
1
dy (t) = gα i
i
y (t),
i
N −1 ∑ δy j (t) , v̂ (y (t)) dt + σα i (yi (t))dwi (t)
i i
j=1=i

yi (0) = xi0 (8.16)

for i = 2, . . . , N.
We also consider the state evolution of the first player, when he uses the feedback
v1 (.) and the measure is m(t), given by (8.5), namely

dx1 (t) = gα 1 (x1 (t), m(t), v1 (x1 (t)))dt + σα 1 (x1 (t))dw1 (t)
x1 (0) = x10 (8.17)

then we show that x1 (t) approximates y1 (t) and x̂i (t) approximates yi (t), for i ≥
2. The proof is very similar to that of Sect. 5.4 and what has been done above.
We arrive at
 
1
J (ṽ(.)) = Jα 1 (v (.)) + O √
N,1 1
N
with
ˆ T
Jα 1 (v1 (.)) =E fα 1 (x1 (t), m(t), v1 (x1 (t),t))dt
0

+hα 1 (x1 (T ), m(T )) . (8.18)

By standard control theory we have Jα 1 (v1 (.)) ≥ Jα 1 (v̂1 (.)) and thus again
 
1
J N,1 (ṽ(.)) ≥ Jα 1 (v̂1 (.)) − O √
N
and we obtain the approximate Nash equilibrium for a large number of multiclass
agents.

8.3 Major Player

8.3.1 General Theory

We consider here a problem initiated by Huang [19]. In this paper only the LQ case
is considered. In a recent paper, Nourian and Caines [31] have studied a nonlinear
mean field game with a major player. In both papers, there is a simplification in the
coupling between the major player and the representative agent. We will describe
here the problem in full generality and explain the simplification in [31].
8.3 Major Player 73

The new element is that, besides the representative agent, there is a major player.
This major player influences directly the mean field term. Since the mean field
term also impacts the major player, he or she will takes this into account to define
any decisions made. On the other hand, the mean field term can no longer be
deterministic, since it depends on the major player’s decisions. This coupling creates
new difficulties.
We introduce the following state evolution for the major player

dx0 = g0 (x0 (t), m(t), v0 (t))dt + σ0 (x0 )dw0


x0 (0) = ξ0 . (8.19)

We assume that x0 (t) ∈ Rn0 , v0 (t) ∈ Rd0 . The process w0 (t) is a standard Wiener
process with values in Rk0 and ξ0 is a random variable in Rn0 independent of the
Wiener process. The process m(t) is the mean field term, with values in the space of
probabilities on Rn . This term will come from the decisions of the representative
agent. However, it will be linked to x0 (t) since the major player influences the
decision of the representative agent. If we define the filtration

F 0t = σ (ξ0 , w0 (s), s ≤ t) (8.20)

then m(t) is a process adapted to F 0t . But it is not external, since it is assumed


in the above works. We will describe the link with the state x0 in analyzing the
representative agent problem. The control v0 (t) is also adapted to F 0t . The objective
functional of the major player is
ˆ T 
J0 (v0 (.)) = E f0 (x0 (t), m(t), v0 (t))dt + h0 (x0 (T ), m(T )) . (8.21)
0

The functions g0 , f0 , σ0 , h0 are deterministic. We do not specify the assumptions,


since our treatment is formal.
We turn now to the representative agent problem. The state x(t) ∈ Rn and the
control v(t) ∈ Rd .We have the evolution

dx = g(x(t), x0 (t), m(t), v(t))dt + σ (x(t))dw


x(0) = ξ (8.22)

in which w(t) is a standard Wiener process with values in Rk and ξ is a random


variable with values in Rn independent of w(.). Moreover, ξ , w(.) are independent
of ξ0 , w0 (.). We define

F t = σ (ξ , w(s), s ≤ t) (8.23)
G t = F 0t ∪ F t . (8.24)
74 8 Different Populations

The control v(t) is adapted to G t . The objective functional of the representative agent
is defined by
ˆ T 
J(v(.), x0 (.), m(.)) = E f (x(t), x0 (t), m(t), v(t))dt + h(x(T ), x0 (T ), m(T )) .
0
(8.25)

Conversely to the major player problem, in the representative agent problem the
processes x0 (.), m(.) are external. This explains the difference of notation between
(8.21) and (8.25). In (8.21), m(t) depends on x0 (.).
The representative agent’s problem is similar to the standard situation of Sect. 2.1
except for the presence of x0 (t).
We begin by limiting the class of controls for the representative agent to belong
to feedbacks v(x,t) random fields adapted to F 0t . The corresponding state, solution
of (8.22) is denoted by xv(.) (t). Of course, this process depends also of x0 (t), m(t).
Note that x0 (t), m(t) is independent from F t , therefore the conditional probability
density of xv(.) (t) given the filtration ∪t F 0t is the solution of the FP equation with
random coefficients
∂ pv(.)
+ A∗ pv(.) + div(g(x, x0 (t), m(t), v(x,t))pv(.) ) = 0
∂t
pv(.) (x, 0) = ϖ (x) (8.26)
in which ϖ (x) is the density probability of ξ . We can then rewrite the objective
functional J(v(.), x0 (.), m(.)) as follows
ˆ T ˆ
J(v(.), x0 (.), m(.)) =E pv(.),x0 (.),m(.) (x,t) f (x, x0 (t), m(t), v(x,t))dxdt
0 Rn
ˆ 
+ pv(.),x0 (.),m(.) (x, T )h(x, x0 (T ), m(T ))dx . (8.27)
Rn

We can give an expression for this functional. Introduce the random field χv(.) (x,t)
solution of the stochastic backward PDE:
∂ χv(.)
− + Aχv(.) = f (x, x0 (t), m(t), v(x,t)) + g(x, x0 (t), m(t), v(x,t)).Dχv(.)
∂t
χv(.) (x, T ) = h(x, x0 (T ), m(T )) (8.28)

then we can assert that


ˆ T ˆ
pv(.),x0 (.),m(.) (x,t) f (x, x0 (t), m(t), v(x,t))dxdt
0 Rn
ˆ ˆ
+ pv(.),x0 (.),m(.) (x, T )h(x, x0 (T ), m(T ))dx = χv(.) (x, 0)ϖ (x)dx
Rn Rn
8.3 Major Player 75

so
ˆ
J(v(.), x0 (.), m(.)) = ϖ (x)E χv(.) (x, 0)dx. (8.29)
Rn

Now define
0t
uv(.) (x,t) = E F χv(.) (x,t).
From (8.28) we can assert that
0t ∂ χv(.)
−E F + Auv(.) = f (x, x0 (t), m(t), v(x,t)) + g(x, x0 (t), m(t), v(x,t)).Duv(.)
∂t
uv(.) (x, T ) = h(x, x0 (T ), m(T )). (8.30)
On the other hand
ˆ t
0s ∂ χv(.)
uv(.) (x,t) − EF (x, s)ds
0 ∂s

is a F 0t martingale. Therefore, we can write


ˆ ˆ
t
0s ∂ χv(.) t
uv(.) (x,t) − EF (x, s)ds = uv(.) (x, 0) + Kv(.) (x, s)dw0 (s),
0 ∂s 0

where Kv(.) (x, s) is F 0s measurable, and uniquely defined. It is then easy to check
that the random field uv(.) (x,t) is a solution of the backward stochastic PDE (SPDE):

−∂t uv(.) (x,t) + Auv(.) (x,t)dt = f (x, x0 (t), m(t), v(x,t))dt


+ g(x, x0 (t), m(t), v(x,t)).Duv(.) (x,t)dt
− Kv(.) (x,t)dw0 (t)
uv(.) (x, T ) =h(x, x0 (T ), m(T )). (8.31)
From (8.29) we get immediately
ˆ
J(v(.), x0 (.), m(.)) = ϖ (x)Euv(.) (x, 0)dx. (8.32)
Rn

To express a necessary condition of optimality, we have to compute the Gâteaux


differential
d
J(v(.) + θ ṽ(.), x0 (.), m(.)).

We can state
ˆ
d
J(v(.) + θ ṽ(.), x0 (.), m(.)) = ϖ (x)E ũ(x, 0)dx (8.33)
dθ Rn
76 8 Different Populations

with

−∂t ũ(x,t) + Aũ(x,t)dt


= g(x, x0 (t), m(t), v(x,t)).Dũ(x,t)dt
∂L
+ (x, x0 (t), m(t), v(x,t), Duv(.) (x,t))ṽ(x,t)dt − K̃(x,t)dw0 (t)
∂v
ũ(x, T ) = 0, (8.34)

where

L(x, x0 , m, v, q) = f (x, x0 , m, v) + q.g(x, x0, m, v). (8.35)

Combining (8.26) and (8.34), we obtain

d
J(v(.) + θ ṽ(.), x0 (.), m(.))

ˆ Tˆ
∂L
=E pv(.) (x,t) (x, x0 (t), m(t), v(x,t), Duv(.) (x,t))ṽ(x,t)dxdt.
0 Rn ∂v
(8.36)

So the optimal feedback v̂(x,t) must satisfy

∂L
(x, x0 (t), m(t), v̂(x,t), Duv̂(.) (x,t)) = 0, a.e.x,t, a.s. (8.37)
∂v
As usual we consider that the function L achieves a minimum in v, denoted by
v̂(x, x0 , m, q) and define
H(x, x0 , m, q) = L(x, x0 , m, v̂(x, x0 , m, q), q). (8.38)

Setting u(x,t) = uv̂(.) (x,t), K(x,t) = Kv̂(.) (x,t) we obtain the stochastic HJB
equation
−∂t u(x,t) + Au(x,t)dt = H(x, x0 (t), m(t), Du)dt − K(x,t)dw0
u(x, T ) = h(x, x0 (T ), m(T )) (8.39)
and

v̂(x,t) = v̂(x, x0 (t), m(t), Du(x,t)). (8.40)


We next have to express the mean field game condition

m(t) = pv̂(.),x0 (.),m(.) (.,t).


8.3 Major Player 77

Setting
$$G(x, x_0, m, q) = g(x, x_0, m, \hat{v}(x, x_0, m, q)) \tag{8.41}$$
we obtain from (8.26) the FP equation
$$\frac{\partial m}{\partial t} + A^* m + \operatorname{div}\big(G(x, x_0(t), m(t), Du(x,t))\,m\big) = 0, \qquad m(x,0) = \varpi(x). \tag{8.42}$$

The coupled pair of HJB-FP equations, (8.39) and (8.42), allows us to define the reaction function of the representative agent to the trajectory x₀(·) of the major player. One obtains the random fields u(x,t) and m(x,t), and the optimal feedback is given by (8.40).
Consider now the problem of the major player. In [31], and also in [19] for the LQ case, it is limited to (8.19) and (8.21), since m(t) is treated as external. However, since m(t) is coupled to x₀(t) through (8.39) and (8.42), one cannot consider m(t) as external without limiting the decisions of the major player. So, in fact, the major player has to consider three state equations: (8.19), (8.39), and (8.42). For a given v₀(·) adapted to $\mathcal{F}_t^0$ we associate $x_{0,v_0(\cdot)}(\cdot)$, $u_{v_0(\cdot)}(\cdot,\cdot)$, $m_{v_0(\cdot)}(\cdot,\cdot)$ as the solution of the system (8.19), (8.39), and (8.42). We may drop the index v₀(·) when the context is clear. We need to compute the Gâteaux differentials
$$\tilde{x}_0(t) = \frac{d}{d\theta}\,x_{0,v_0(\cdot)+\theta\tilde{v}_0(\cdot)}(t)\Big|_{\theta=0}, \qquad \tilde{u}(x,t) = \frac{d}{d\theta}\,u_{v_0(\cdot)+\theta\tilde{v}_0(\cdot)}(x,t)\Big|_{\theta=0}, \qquad \tilde{m}(x,t) = \frac{d}{d\theta}\,m_{v_0(\cdot)+\theta\tilde{v}_0(\cdot)}(x,t)\Big|_{\theta=0}.$$
They are solutions of the following equations:
$$d\tilde{x}_0 = \Big[g_{0,x_0}(x_0(t), m(t), v_0(t))\,\tilde{x}_0(t) + \int\frac{\partial g_0}{\partial m}(x_0(t), m(t), v_0(t))(\xi)\,\tilde{m}(\xi,t)\,d\xi + g_{0,v_0}(x_0(t), m(t), v_0(t))\,\tilde{v}_0(t)\Big]dt + \sum_{l=1}^{k_0}\sigma_{0l,x_0}(x_0(t))\,\tilde{x}_0(t)\,dw_0^l, \qquad \tilde{x}_0(0) = 0 \tag{8.43}$$

$$-\partial_t\tilde{u}(x,t) + A\tilde{u}(x,t)\,dt = \Big[H_{x_0}(x, x_0(t), m(t), Du(x,t))\,\tilde{x}_0(t) + G(x, x_0(t), m(t), Du(x,t))\cdot D\tilde{u}(x,t) + \int\frac{\partial H}{\partial m}(x, x_0(t), m(t), Du(x,t))(\xi)\,\tilde{m}(\xi,t)\,d\xi\Big]dt - \tilde{K}(x,t)\,dw_0(t),$$
$$\tilde{u}(x,T) = h_{x_0}(x, x_0(T), m(T))\,\tilde{x}_0(T) + \int\frac{\partial h}{\partial m}(x, x_0(T), m(T))(\xi)\,\tilde{m}(\xi,T)\,d\xi \tag{8.44}$$


$$\frac{\partial\tilde{m}}{\partial t}(x,t) + A^*\tilde{m}(x,t) + \operatorname{div}\big(G(x, x_0(t), m(t), Du(x,t))\,\tilde{m}\big) + \operatorname{div}\Big(\Big[G_{x_0}(x, x_0(t), m(t), Du(x,t))\,\tilde{x}_0(t) + \int\frac{\partial G}{\partial m}(x, x_0(t), m(t), Du(x,t))(\xi)\,\tilde{m}(\xi,t)\,d\xi + G_q(x, x_0(t), m(t), Du(x,t))\,D\tilde{u}(x,t)\Big]\,m(x,t)\Big) = 0, \qquad \tilde{m}(x,0) = 0. \tag{8.45}$$

We can then assert that
$$\frac{d}{d\theta} J_0(v_0(\cdot) + \theta\tilde{v}_0(\cdot))\Big|_{\theta=0} = E\int_0^T\Big[f_{0,x_0}(x_0(t), m(t), v_0(t))\,\tilde{x}_0(t) + \int\frac{\partial f_0}{\partial m}(x_0(t), m(t), v_0(t))(\xi)\,\tilde{m}(\xi,t)\,d\xi + f_{0,v_0}(x_0(t), m(t), v_0(t))\,\tilde{v}_0(t)\Big]dt + E\Big[h_{0,x_0}(x_0(T), m(T))\,\tilde{x}_0(T) + \int\frac{\partial h_0}{\partial m}(x_0(T), m(T))(\xi)\,\tilde{m}(\xi,T)\,d\xi\Big]. \tag{8.46}$$

We then introduce the process p(t) and the random fields η(x,t), ζ(x,t) as solutions of the SDE and SPDEs:
$$-dp = \Big[g_{0,x_0}^*(x_0(t), m(t), v_0(t))\,p(t) + f_{0,x_0}(x_0(t), m(t), v_0(t)) + \int G_{x_0}^*(x, x_0(t), m(t), Du(x,t))\,D\eta(x,t)\,m(x,t)\,dx + \int\zeta(x,t)\,H_{x_0}(x, x_0(t), m(t), Du(x,t))\,dx\Big]dt - \sum_{l=1}^{k_0} q_l\,dw_0^l + \sum_{l=1}^{k_0}\sigma_{0l,x_0}^*(x_0(t))\,q_l\,dt, \tag{8.47}$$
$$p(T) = h_{0,x_0}(x_0(T), m(T)) + \int\zeta(x,T)\,h_{x_0}(x, x_0(T), m(T))\,dx$$


$$-\partial_t\eta + A\eta(x,t)\,dt = \Big[\frac{\partial g_0}{\partial m}(x_0(t), m(t), v_0(t))(x)\cdot p(t) + D\eta(x,t)\cdot G(x, x_0(t), m(t), Du(x,t)) + \int D\eta(\xi,t)\cdot\frac{\partial G}{\partial m}(\xi, x_0(t), m(t), Du(\xi,t))(x)\,m(\xi,t)\,d\xi + \int\zeta(\xi,t)\,\frac{\partial H}{\partial m}(\xi, x_0(t), m(t), Du(\xi,t))(x)\,d\xi + \frac{\partial f_0}{\partial m}(x_0(t), m(t), v_0(t))(x)\Big]dt - \sum_l\mu_l(x,t)\,dw_0^l(t), \tag{8.48}$$
$$\eta(x,T) = \frac{\partial h_0}{\partial m}(x_0(T), m(T))(x) + \int\zeta(\xi,T)\,\frac{\partial h}{\partial m}(\xi, x_0(T), m(T))(x)\,d\xi$$

$$\frac{\partial\zeta}{\partial t} + A^*\zeta(x,t) + \operatorname{div}\big(G(x, x_0(t), m(t), Du(x,t))\,\zeta(x,t)\big) + \operatorname{div}\big(G_q^*(x, x_0(t), m(t), Du(x,t))\,D\eta(x,t)\,m(x,t)\big) = 0, \qquad \zeta(x,0) = 0. \tag{8.49}$$

Thanks to these processes we can write (8.46) as follows:
$$\frac{d}{d\theta} J_0(v_0(\cdot) + \theta\tilde{v}_0(\cdot))\Big|_{\theta=0} = E\int_0^T\big[f_{0,v_0}(x_0(t), m(t), v_0(t)) + p(t)^*\,g_{0,v_0}(x_0(t), m(t), v_0(t))\big]\,\tilde{v}_0(t)\,dt \tag{8.50}$$

and, writing the necessary condition that this Gâteaux differential must vanish, we obtain:
$$v_0(t) \text{ minimizes } f_0(x_0(t), m(t), v_0) + p(t)\cdot g_0(x_0(t), m(t), v_0) \text{ in } v_0.$$
We introduce the notation
$$H_0(x_0, m, p) = \inf_{v_0}\big[f_0(x_0, m, v_0) + p\cdot g_0(x_0, m, v_0)\big],$$
with $\hat{v}_0(x_0, m, p)$ minimizing the expression in brackets, and
$$G_0(x_0, m, p) = g_0(x_0, m, \hat{v}_0(x_0, m, p)),$$

so we can write from (8.47)–(8.49):
$$-dp = \Big[H_{0,x_0}(x_0(t), m(t), p(t)) + \sum_{l=1}^{k_0}\sigma_{0l,x_0}^*(x_0(t))\,q_l(t) + \int G_{x_0}^*(x, x_0(t), m(t), Du(x,t))\,D\eta(x,t)\,m(x,t)\,dx + \int\zeta(x,t)\,H_{x_0}(x, x_0(t), m(t), Du(x,t))\,dx\Big]dt - \sum_{l=1}^{k_0} q_l\,dw_0^l, \tag{8.51}$$
$$p(T) = h_{0,x_0}(x_0(T), m(T)) + \int\zeta(x,T)\,h_{x_0}(x, x_0(T), m(T))\,dx$$


$$-\partial_t\eta + A\eta(x,t)\,dt = \Big[\frac{\partial H_0}{\partial m}(x_0(t), m(t), p(t))(x) + D\eta(x,t)\cdot G(x, x_0(t), m(t), Du(x,t)) + \int D\eta(\xi,t)\cdot\frac{\partial G}{\partial m}(\xi, x_0(t), m(t), Du(\xi,t))(x)\,m(\xi,t)\,d\xi + \int\zeta(\xi,t)\,\frac{\partial H}{\partial m}(\xi, x_0(t), m(t), Du(\xi,t))(x)\,d\xi\Big]dt - \sum_l\mu_l(x,t)\,dw_0^l(t), \tag{8.52}$$
$$\eta(x,T) = \frac{\partial h_0}{\partial m}(x_0(T), m(T))(x) + \int\zeta(\xi,T)\,\frac{\partial h}{\partial m}(\xi, x_0(T), m(T))(x)\,d\xi$$

$$\frac{\partial\zeta}{\partial t} + A^*\zeta(x,t) + \operatorname{div}\big(G(x, x_0(t), m(t), Du(x,t))\,\zeta(x,t)\big) + \operatorname{div}\big(G_q^*(x, x_0(t), m(t), Du(x,t))\,D\eta(x,t)\,m(x,t)\big) = 0, \qquad \zeta(x,0) = 0. \tag{8.53}$$

Next, x₀(t) satisfies
$$dx_0 = G_0(x_0(t), m(t), p(t))\,dt + \sigma_0(x_0(t))\,dw_0, \qquad x_0(0) = \xi_0. \tag{8.54}$$

So, in fact, the complete solution is provided by the six equations (8.54), (8.51), (8.39), (8.53), (8.42), and (8.52); the feedback of the representative agent and the control of the major player are given by (8.40) and
$$\hat{v}_0(t) = \hat{v}_0(x_0(t), m(t), p(t)). \tag{8.55}$$
If we follow the approach of Nourian–Caines [31], then the major player considers m(t) as external. In that case,
$$\eta(x,t) = 0, \qquad \zeta(x,t) = 0,$$
and thus the six equations reduce to four, namely (8.54), (8.39), (8.42), and
$$-dp = \Big[H_{0,x_0}(x_0(t), m(t), p(t)) + \sum_{l=1}^{k_0}\sigma_{0l,x_0}^*(x_0(t))\,q_l(t)\Big]dt - \sum_{l=1}^{k_0} q_l\,dw_0^l, \qquad p(T) = h_{0,x_0}(x_0(T), m(T)). \tag{8.56}$$

In fact, in this case the control problem of the major player is simply (8.19) and (8.21), in which m(t) is a given process adapted to $\mathcal{F}_t^0$. This is a standard stochastic control problem, except that the drift is random and adapted. We can apply the theory of Peng [32] and introduce the HJB equation
$$-\partial_t u_0(x_0,t) + A_0 u_0(x_0,t)\,dt = \Big[H_0(x_0, m(t), Du_0(x_0,t)) + \sum_{l=1}^{k_0}\sum_{i=1}^{n_0}\sigma_{0il}(x_0)\,K_{0l,i}(x_0,t)\Big]dt - \sum_{l=1}^{k_0} K_{0l}(x_0,t)\,dw_0^l(t), \qquad u_0(x_0,T) = h_0(x_0, m(T)) \tag{8.57}$$



in which
$$a_0(x_0) = \frac{1}{2}\sigma_0(x_0)\sigma_0^*(x_0), \qquad A_0\varphi(x_0) = -\operatorname{tr}\big(a_0(x_0)D^2\varphi(x_0)\big). \tag{8.58}$$

The solution of the major player problem is then defined by the four equations (8.54), (8.39), (8.42), and (8.57). Naturally we have the relations
$$p_i(t) = u_{0,i}(x_0(t),t), \qquad q_{il}(t) = K_{0l,i}(x_0(t),t) + \sum_{j=1}^{n_0} u_{0,ij}(x_0(t),t)\,\sigma_{0jl}(x_0(t)) \tag{8.59}$$
in which
$$u_{0,i}(x_0,t) = \frac{\partial u_0}{\partial x_{0i}}(x_0,t), \qquad K_{0l,i}(x_0,t) = \frac{\partial K_{0l}}{\partial x_{0i}}(x_0,t), \qquad u_{0,ij}(x_0,t) = \frac{\partial^2 u_0}{\partial x_{0i}\partial x_{0j}}(x_0,t).$$

8.3.2 Linear Quadratic Case

We now consider the linear quadratic case and apply the general theory above. We assume
$$g_0(x_0, m, v_0) = A_0 x_0 + B_0 v_0 + F_0\int\xi m(\xi)\,d\xi, \qquad \sigma_0(x_0) = \sigma_0 \tag{8.60}$$
$$f_0(x_0, m, v_0) = \frac{1}{2}\Big[\Big(x_0 - H_0\int\xi m(\xi)\,d\xi - \gamma_0\Big)^* Q_0\Big(x_0 - H_0\int\xi m(\xi)\,d\xi - \gamma_0\Big) + v_0^* R_0 v_0\Big], \qquad h_0(x_0, m) = 0 \tag{8.61}$$
$$g(x, x_0, m, v) = Ax + \Gamma x_0 + Bv + F\int\xi m(\xi)\,d\xi, \qquad \sigma(x) = \sigma \tag{8.62}$$
$$f(x, x_0, m, v) = \frac{1}{2}\Big[\Big(x - Hx_0 - \bar{H}\int\xi m(\xi)\,d\xi - \gamma\Big)^* Q\Big(x - Hx_0 - \bar{H}\int\xi m(\xi)\,d\xi - \gamma\Big) + v^* R v\Big], \qquad h(x, x_0, m) = 0. \tag{8.63}$$

We deduce easily
$$\hat{v}_0(x_0, m, p) = -R_0^{-1}B_0^*\,p \tag{8.64}$$
$$H_0(x_0, m, p) = \frac{1}{2}\Big(x_0 - H_0\int\xi m(\xi)\,d\xi - \gamma_0\Big)^* Q_0\Big(x_0 - H_0\int\xi m(\xi)\,d\xi - \gamma_0\Big) + p\cdot\Big(A_0 x_0 + F_0\int\xi m(\xi)\,d\xi\Big) - \frac{1}{2}p^* B_0 R_0^{-1}B_0^*\,p \tag{8.65}$$
$$G_0(x_0, m, p) = A_0 x_0 + F_0\int\xi m(\xi)\,d\xi - B_0 R_0^{-1}B_0^*\,p \tag{8.66}$$
and
$$\hat{v}(x, x_0, m, q) = -R^{-1}B^*\,q \tag{8.67}$$
$$H(x, x_0, m, q) = \frac{1}{2}\Big(x - Hx_0 - \bar{H}\int\xi m(\xi)\,d\xi - \gamma\Big)^* Q\Big(x - Hx_0 - \bar{H}\int\xi m(\xi)\,d\xi - \gamma\Big) + q\cdot\Big(Ax + \Gamma x_0 + F\int\xi m(\xi)\,d\xi\Big) - \frac{1}{2}q^* BR^{-1}B^*\,q \tag{8.68}$$
$$G(x, x_0, m, q) = Ax + \Gamma x_0 + F\int\xi m(\xi)\,d\xi - BR^{-1}B^*\,q. \tag{8.69}$$

We define
$$z(t) = \int\xi\,m(\xi,t)\,d\xi$$
and conjecture
$$u(x,t) = \frac{1}{2}x^* P(t)x + x^* r(t) + s(t), \qquad K_l(x,t) = x^* K_l(t) + k_l(t). \tag{8.70}$$

We deduce that P(t) is the solution of the Riccati equation
$$\dot{P} + PA + A^* P - PBR^{-1}B^* P + Q = 0, \qquad P(T) = 0, \tag{8.71}$$

and the equations for r(t) and s(t):
$$-dr = (A^* - P(t)BR^{-1}B^*)\,r(t)\,dt + \big[(P(t)F - Q\bar{H})z(t) + (P(t)\Gamma - QH)x_0(t) - Q\gamma\big]dt - \sum_l K_l(t)\,dw_0^l(t), \qquad r(T) = 0 \tag{8.72}$$
$$-ds = \Big[\operatorname{tr} aP(t) + r(t)\cdot(Fz(t) + \Gamma x_0(t)) - \frac{1}{2}r(t)^* BR^{-1}B^* r(t) + \frac{1}{2}(Hx_0(t) + \bar{H}z(t) + \gamma)^* Q\,(Hx_0(t) + \bar{H}z(t) + \gamma)\Big]dt - \sum_l k_l(t)\,dw_0^l(t). \tag{8.73}$$
l

From the equation for m(x,t) we easily get
$$\frac{dz}{dt} = (A - BR^{-1}B^* P(t) + F)z(t) + \Gamma x_0(t) - BR^{-1}B^* r(t), \qquad z(0) = \bar{\varpi} \tag{8.74}$$
in which
$$\bar{\varpi} = \int x\,\varpi(x)\,dx.$$
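As an illustration of how the backward sweep for the Riccati equation (8.71) looks in practice, here is a minimal numerical sketch in Python; the matrices A, B, Q, R and the horizon T are hypothetical toy data, not taken from the text, and a plain explicit Euler step is used.

import numpy as np

# Backward sweep for the Riccati equation (8.71):
#   dP/dt + P A + A* P - P B R^{-1} B* P + Q = 0,  P(T) = 0.
# All data below are illustrative assumptions (2-dimensional toy example).
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
Rinv = np.array([[1.0]])          # control weight R = 1, so R^{-1} = 1
T, N = 1.0, 1000
dt = T / N

P = np.zeros((2, 2))              # terminal condition P(T) = 0
Ps = [P.copy()]
for _ in range(N):                # step backward from T to 0
    dP = P @ A + A.T @ P - P @ B @ Rinv @ B.T @ P + Q
    P = P + dt * dP               # since dP/dt = -(P A + A* P - ... + Q)
    Ps.append(P.copy())
Ps.reverse()                      # Ps[k] approximates P(k * dt)
print("P(0) =\n", Ps[0])

The optimal feedback (8.80) below is then v̂(x,t) = −R⁻¹B*(P(t)x + r(t)).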

We proceed by computing
$$H_{0,x_0}(x_0, m, p) = Q_0\Big(x_0 - H_0\int\xi m(\xi)\,d\xi - \gamma_0\Big) + A_0^*\,p,$$
$$\frac{\partial H_0}{\partial m}(x_0, m, p)(\xi) = \xi^*\Big[-H_0^* Q_0\Big(x_0 - H_0\int u\,m(u)\,du - \gamma_0\Big) + F_0^*\,p\Big],$$
$$G_{x_0}(x, x_0, m, q) = \Gamma, \qquad \frac{\partial G}{\partial m}(x, x_0, m, q)(\xi) = F\xi, \qquad G_q(x, x_0, m, q) = -BR^{-1}B^*,$$
$$H_{x_0}(x, x_0, m, q) = -H^* Q\Big(x - Hx_0 - \bar{H}\int\xi m(\xi)\,d\xi - \gamma\Big) + \Gamma^*\,q,$$
$$\frac{\partial H}{\partial m}(x, x_0, m, q)(\xi) = \xi^*\Big[-\bar{H}^* Q\Big(x - Hx_0 - \bar{H}\int u\,m(u)\,du - \gamma\Big) + F^*\,q\Big].$$

From the equations of the major player, we introduce the mean
$$\chi(t) = \int x\,\zeta(x,t)\,dx$$
and we postulate
$$\eta(x,t) = x^*\lambda(t) + \theta(t). \tag{8.75}$$
After some easy calculations, we obtain the following relations:
$$dx_0 = (A_0 x_0(t) + F_0 z(t) - B_0 R_0^{-1}B_0^*\,p(t))\,dt + \sigma_0\,dw_0(t), \qquad x_0(0) = \xi_0 \tag{8.76}$$
$$-dp = \big[A_0^*\,p(t) + \Gamma^*\lambda(t) + Q_0(x_0(t) - H_0 z(t) - \gamma_0) + (\Gamma^* P(t) - H^* Q)\chi(t)\big]dt - \sum_{l=1}^{k_0} q_l(t)\,dw_0^l(t), \qquad p(T) = 0 \tag{8.77}$$
$$\frac{d\chi}{dt} = (A - BR^{-1}B^* P(t))\chi(t) - BR^{-1}B^*\lambda(t), \qquad \chi(0) = 0 \tag{8.78}$$
$$-d\lambda = \big[(A^* - P(t)BR^{-1}B^* + F^*)\lambda(t) - H_0^* Q_0(x_0(t) - H_0 z(t) - \gamma_0) + F_0^*\,p(t) + (F^* P(t) - \bar{H}^* Q)\chi(t)\big]dt - \sum_{l=1}^{k_0}\mu_l(t)\,dw_0^l(t), \qquad \lambda(T) = 0. \tag{8.79}$$

Therefore, we need to solve the system (6.41)–(6.44) together with (6.37) and (6.39). We obtain the optimal controls
$$\hat{v}(x,t) = -R^{-1}B^*(P(t)x + r(t)), \qquad \hat{v}_0(t) = -R_0^{-1}B_0^*\,p(t). \tag{8.80}$$

Note that θ(t) is given by
$$-d\theta = \lambda(t)\cdot(Fz(t) + \Gamma x_0(t) - BR^{-1}B^* r(t))\,dt - \sum_{l=1}^{k_0}\nu_l(t)\,dw_0^l(t), \qquad \theta(T) = 0. \tag{8.81}$$

In the framework of [31] we discard λ(t) and χ(t). We get the system of four equations:
$$dx_0 = (A_0 x_0(t) + F_0 z(t) - B_0 R_0^{-1}B_0^*\,p(t))\,dt + \sigma_0\,dw_0(t), \qquad x_0(0) = \xi_0 \tag{8.82}$$
$$-dp = \big[A_0^*\,p(t) + Q_0(x_0(t) - H_0 z(t) - \gamma_0)\big]dt - \sum_{l=1}^{k_0} q_l(t)\,dw_0^l(t), \qquad p(T) = 0 \tag{8.83}$$
$$\frac{dz}{dt} = (A - BR^{-1}B^* P(t) + F)z(t) + \Gamma x_0(t) - BR^{-1}B^* r(t), \qquad z(0) = \bar{\varpi} \tag{8.84}$$
$$-dr = (A^* - P(t)BR^{-1}B^*)\,r(t)\,dt + \big[(P(t)F - Q\bar{H})z(t) + (P(t)\Gamma - QH)x_0(t) - Q\gamma\big]dt - \sum_l K_l(t)\,dw_0^l(t), \qquad r(T) = 0. \tag{8.85}$$
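Taking expectations in (8.82)–(8.85) removes the dw₀ terms and leaves a deterministic two-point boundary value problem in the means, which can be attacked by alternating forward and backward sweeps. The following scalar sketch makes this concrete; all coefficients are illustrative assumptions, and convergence of the plain (undamped) sweep is itself an assumption; damping may be needed in general.

import numpy as np

# Mean version of (8.82)-(8.85): all coefficients scalar and illustrative.
A0, B0, Q0, R0, H0, g0 = 0.5, 1.0, 1.0, 1.0, 1.0, 0.0
A, B, Q, R = -0.2, 1.0, 1.0, 1.0
F, F0, Gam, Hbar, H, gam = 0.3, 0.2, 0.4, 1.0, 1.0, 0.0
xi0, varpi_bar, T, N = 1.0, 0.5, 1.0, 400
dt = T / N

# Riccati equation (8.71), scalar, backward Euler.
P = np.zeros(N + 1)
for k in range(N, 0, -1):
    P[k-1] = P[k] + dt * (2*A*P[k] - B**2/R*P[k]**2 + Q)

x0 = np.full(N + 1, xi0)          # mean of x_0, forward unknown
z  = np.full(N + 1, varpi_bar)    # mean state of the population
p  = np.zeros(N + 1)              # mean adjoint, backward unknown
r  = np.zeros(N + 1)              # backward unknown of (8.85)
for sweep in range(200):
    for k in range(N, 0, -1):     # backward pass: (8.83), (8.85) in mean
        p[k-1] = p[k] + dt * (A0*p[k] + Q0*(x0[k] - H0*z[k] - g0))
        r[k-1] = r[k] + dt * ((A - P[k]*B**2/R)*r[k]
                              + (P[k]*F - Q*Hbar)*z[k]
                              + (P[k]*Gam - Q*H)*x0[k] - Q*gam)
    x0n, zn = x0.copy(), z.copy() # forward pass: (8.82), (8.84) in mean
    for k in range(N):
        x0n[k+1] = x0n[k] + dt * (A0*x0n[k] + F0*zn[k] - B0**2/R0*p[k])
        zn[k+1]  = zn[k]  + dt * ((A - B**2/R*P[k] + F)*zn[k]
                                  + Gam*x0n[k] - B**2/R*r[k])
    gap = max(abs(x0n - x0).max(), abs(zn - z).max())
    x0, z = x0n, zn
    if gap < 1e-10:
        break
print("sweeps:", sweep + 1, " x0(T) =", x0[-1], " z(T) =", z[-1])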

The case considered by Huang [19] is somewhat intermediate. It amounts to neglecting ζ(x,t). Since ζ(x,t) is the adjoint associated with the state u(x,t), this means that we consider u(x,t) as external in the m equation. But m(t) is not external; it is influenced by x₀(t). So we get five equations. In the LQ case it boils down to taking χ(t) = 0 while keeping λ(t).
We get the system
$$dx_0 = (A_0 x_0(t) + F_0 z(t) - B_0 R_0^{-1}B_0^*\,p(t))\,dt + \sigma_0\,dw_0(t), \qquad x_0(0) = \xi_0 \tag{8.86}$$
$$-dp = \big[A_0^*\,p(t) + \Gamma^*\lambda(t) + Q_0(x_0(t) - H_0 z(t) - \gamma_0)\big]dt - \sum_{l=1}^{k_0} q_l(t)\,dw_0^l(t), \qquad p(T) = 0 \tag{8.87}$$
$$-d\lambda = \big[(A^* - P(t)BR^{-1}B^* + F^*)\lambda(t) - H_0^* Q_0(x_0(t) - H_0 z(t) - \gamma_0) + F_0^*\,p(t)\big]dt - \sum_{l=1}^{k_0}\mu_l(t)\,dw_0^l(t), \qquad \lambda(T) = 0 \tag{8.88}$$
$$\frac{dz}{dt} = (A - BR^{-1}B^* P(t) + F)z(t) + \Gamma x_0(t) - BR^{-1}B^* r(t), \qquad z(0) = \bar{\varpi} \tag{8.89}$$
$$-dr = (A^* - P(t)BR^{-1}B^*)\,r(t)\,dt + \big[(P(t)F - Q\bar{H})z(t) + (P(t)\Gamma - QH)x_0(t) - Q\gamma\big]dt - \sum_l K_l(t)\,dw_0^l(t), \qquad r(T) = 0. \tag{8.90}$$

Note that in the LQ case one can obtain the necessary conditions directly, without
having to solve the general problem.
Chapter 9
Nash Differential Games with Mean Field Effect

9.1 Description of the Problem

The mean field game and mean field type control problems introduced in Chap. 2 are
both control problems for a representative agent, with mean field terms influencing
both the evolution and the objective functional of this agent. The terminology game
comes from the fact that the optimal feedback of the representative agent can be
used as an approximation for a Nash equilibrium of a large community of agents
that are identical. In Sect. 8.2 we have shown that the theory extends to a multi-
class of representative agents. However, each class still has its individual control
problem.
A natural and important extension concerns the case of large coalitions com-
peting with one another. This problem has been considered in [7]. We present here
the general situation. However, we use as interpretation the dual game concept,
extending the considerations of Sect. 7.3. The interpretation as differential games
among large coalitions remains to be done.

9.2 Mathematical Problem

We can generalize the pair of HJB-FP equations (3.11) considered in the mean field
game to a system of N pairs of HJB-FP equations. Note that in this context N is
fixed and will not tend to +∞. We use the following notation

q = (q1 , . . . , qN ), qi ∈ Rn
v = (v1 , . . . , vN ), vi ∈ Rd


We consider functions

f i (x, v) : Rn × RdN → R
gi (x, v) : Rn × RdN → Rn

and define Lagrangians

Li (x, v, qi ) = f i (x, v) + qi · gi (x, v) (9.1)

We look for a Nash point of the Lagrangians in the controls v¹, …, vᴺ. We write the system of equations
$$\frac{\partial L^i}{\partial v^i}(x, v, q) = 0. \tag{9.2}$$
Assuming that we can solve this system, we obtain functions v̂ⁱ(x, q), which we call a Nash equilibrium for the Lagrangians. We denote by v̂(x, q) the vector of the v̂ⁱ(x, q).
We next define the Hamiltonians
$$H^i(x, q) = L^i(x, \hat{v}(x,q), q^i) \tag{9.3}$$
and
$$G^i(x, q) = g^i(x, \hat{v}(x,q)). \tag{9.4}$$
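As a concrete illustration of (9.2)–(9.4), the sketch below computes the Nash point v̂(x, q) when each Lagrangian is quadratic in v, so the first-order conditions form a linear system; the specific costs fⁱ, dynamics gⁱ, and numbers are illustrative assumptions, not data from the text.

import numpy as np

# Toy Nash point of (9.2) for quadratic Lagrangians (9.1) with
#   f^i(x, v) = 0.5 * r_i * v_i^2 + v_i * sum_{j != i} c_ij * v_j,
#   g^i(x, v) = a_i * x + b_i * v_i      (scalar controls, d = 1),
# so that dL^i/dv_i = r_i v_i + sum_{j != i} c_ij v_j + b_i q_i = 0,
# a linear system M v = -d solved once per (x, q).
N = 3
r = np.array([2.0, 1.5, 1.0])                 # control costs (assumed)
c = 0.1 * (np.ones((N, N)) - np.eye(N))       # cross couplings (assumed)
b = np.array([1.0, 0.8, 1.2])                 # control gains (assumed)

def nash_point(q):
    """Solve the first-order system (9.2); it is independent of x here
    because the v-dependent part of L^i does not involve x."""
    M = np.diag(r) + c
    return np.linalg.solve(M, -(b * q))

q = np.array([0.5, -0.2, 0.1])                # sample adjoint vector
print("v_hat(q) =", nash_point(q))
# Each H^i(x, q) of (9.3) is then L^i evaluated at v_hat(x, q).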

Consider next probability densities mⁱ(·) on Rⁿ. We also set
$$m(\cdot) = (m^1(\cdot), \ldots, m^N(\cdot)).$$
These probability densities are considered as elements of L¹(Rⁿ).


We can now define the system of pairs of HJB-FP equations. We look for functions uⁱ(x,t) and mⁱ(x,t); we write u(x,t), m(x,t) for the corresponding vectors. When we write Du(x,t) we mean (Du¹(x,t), …, Duᴺ(x,t)), so it is an n × N matrix. Consider functions f₀ⁱ(x, m) and hⁱ(x, m) defined on Rⁿ × L¹(Rⁿ). We set the system of pairs of PDEs
$$-\frac{\partial u^i}{\partial t} + Au^i = H^i(x, Du) + f_0^i(x, m^i(t)), \qquad u^i(x,T) = h^i(x, m^i(T)) \tag{9.5}$$
$$\frac{\partial m^i}{\partial t} + A^* m^i + \operatorname{div}\big(G^i(x, Du)\,m^i\big) = 0, \qquad m^i(x,0) = m_0^i(x), \tag{9.6}$$
which represents a generalization of the pair (3.11).



9.3 Interpretation

We can now interpret the system of pairs (9.5) and (9.6). The easiest way is to proceed as in Sect. 7.3. However, we need to assume that
$$f_0^i(x, m) = \frac{\partial\Phi^i(m)}{\partial m}(x) \tag{9.7}$$
$$h^i(x, m) = \frac{\partial\Psi^i(m)}{\partial m}(x). \tag{9.8}$$
Here m is an argument in L¹(Rⁿ).


We consider N players. Each of them chooses a feedback control. So we have
$$v(\cdot) = (v^1(\cdot), \ldots, v^N(\cdot)).$$
Considering player i, we use the notation
$$v(\cdot) = (v^i(\cdot), \bar{v}^i(\cdot)),$$
where v̄ⁱ(·) represents all the feedbacks except vⁱ(·). To a vector of feedbacks v(·) we associate the probabilities
$$p^i_{v(\cdot)}(x,t) = p^i_{v^i(\cdot),\bar{v}^i(\cdot)}(x,t)$$

as the solution of
$$\frac{\partial p^i_{v(\cdot)}}{\partial t} + A^* p^i_{v(\cdot)} + \operatorname{div}\big(g^i(x, v(x))\,p^i_{v(\cdot)}(x)\big) = 0, \qquad p^i_{v(\cdot)}(x,0) = m_0^i(x). \tag{9.9}$$

We define the functional
$$J^i(v(\cdot)) = \int_0^T\!\!\int_{\mathbb{R}^n} p^i_{v(\cdot)}(x,t)\,f^i(x, v(x))\,dx\,dt + \int_0^T\Phi^i(p^i_{v(\cdot)}(t))\,dt + \Psi^i(p^i_{v(\cdot)}(T)), \tag{9.10}$$
where $p^i_{v(\cdot)}(t)$ means the function $p^i_{v(\cdot)}(x,t)$. We need to compute the Gâteaux differential of Jⁱ(v(·)) with respect to vⁱ(·), namely the quantity

$$\frac{d}{d\theta}J^i(v^i(\cdot)+\theta\tilde{v}^i(\cdot), \bar{v}^i(\cdot))\Big|_{\theta=0} = \int_0^T\!\!\int_{\mathbb{R}^n}\tilde{m}^i(x,t)\,f^i(x,v(x))\,dx\,dt + \int_0^T\!\!\int_{\mathbb{R}^n} p^i_{v(\cdot)}(x,t)\,\frac{\partial f^i(x,v(x))}{\partial v^i}\,\tilde{v}^i(x,t)\,dx\,dt + \int_0^T\!\!\int_{\mathbb{R}^n} f_0^i(x, p^i_{v(\cdot)}(t))\,\tilde{m}^i(x,t)\,dx\,dt + \int_{\mathbb{R}^n} h^i(x, p^i_{v(\cdot)}(T))\,\tilde{m}^i(x,T)\,dx, \tag{9.11}$$

where m̃ⁱ(x,t) is the solution of
$$\frac{\partial\tilde{m}^i}{\partial t} + A^*\tilde{m}^i + \operatorname{div}\big(g^i(x,v(x))\,\tilde{m}^i\big) + \operatorname{div}\Big(\frac{\partial g^i(x,v(x))}{\partial v^i}\,\tilde{v}^i(x)\,p^i_{v(\cdot)}(x,t)\Big) = 0, \qquad \tilde{m}^i(x,0) = 0. \tag{9.12}$$

One can then introduce the functions $u^i_{v(\cdot)}(x,t)$, solutions of
$$-\frac{\partial u^i_{v(\cdot)}}{\partial t} + Au^i_{v(\cdot)} - g^i(x,v(x))\cdot Du^i_{v(\cdot)}(x) = f^i(x,v(x)) + f_0^i(x, p^i_{v(\cdot)}(t)), \qquad u^i_{v(\cdot)}(x,T) = h^i(x, p^i_{v(\cdot)}(T)), \tag{9.13}$$

and one can check easily that
$$\frac{d}{d\theta}J^i(v^i(\cdot)+\theta\tilde{v}^i(\cdot),\bar{v}^i(\cdot))\Big|_{\theta=0} = \int_0^T\!\!\int_{\mathbb{R}^n}\frac{\partial L^i}{\partial v^i}(x, v(x), Du^i_{v(\cdot)}(x))\,\tilde{v}^i(x)\,p^i_{v(\cdot)}(x,t)\,dx\,dt. \tag{9.14}$$
A Nash point v̂(·) of the functionals Jⁱ(v(·)) must satisfy
$$\frac{\partial L^i}{\partial v^i}(x, \hat{v}(x), Du^i_{\hat{v}(\cdot)}(x)) = 0.$$
It follows that if we set
$$u^i(x,t) = u^i_{\hat{v}(\cdot)}(x,t), \qquad m^i(x,t) = p^i_{\hat{v}(\cdot)}(x,t), \qquad \hat{v}(x,t) = \hat{v}(x, Du(x,t)),$$
then the system of pairs uⁱ, mⁱ is a solution of (9.5) and (9.6). Hence v̂(·) is a Nash equilibrium for problems (9.9) and (9.10).
We can give a probabilistic interpretation to problems (9.9) and (9.10). Consider N independent standard Wiener processes wⁱ(t) and N independent random variables x₀ⁱ, with probability densities m₀ⁱ. The random variables are also independent of the Wiener processes. For a vector of feedbacks, the representative agents have states $x^i(\cdot) = x^i_{v(\cdot)}(\cdot)$, solutions of the equations
$$dx^i = g^i(x^i(t), v(x^i(t)))\,dt + \sigma(x^i(t))\,dw^i(t), \qquad x^i(0) = x_0^i. \tag{9.15}$$
It is clear that the probability density of xⁱ(t) is $p^i_{v(\cdot)}(t)$. We write
$$P_{x^i(t)} = p^i_{v(\cdot)}(t)$$
and we have
$$J^i(v(\cdot)) = E\int_0^T f^i(x^i(t), v(x^i(t)))\,dt + \int_0^T\Phi^i(P_{x^i(t)})\,dt + \Psi^i(P_{x^i(T)}). \tag{9.16}$$

In this problem, each player sees the feedbacks of his or her opponents acting on
his or her own trajectory. All the trajectories are independent.
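A minimal simulation sketch of (9.15), with a Monte Carlo estimate of the non-mean-field part of (9.16); the scalar dynamics, feedback, running cost, and initial density below are illustrative assumptions.

import numpy as np

# Euler-Maruyama for the independent trajectories (9.15).
rng = np.random.default_rng(0)
T, Nt, M = 1.0, 200, 10000          # horizon, time steps, sample paths
dt = T / Nt

def g(x, v):   return -x + v        # drift g^i(x, v) (assumed)
def v(x):      return -0.5 * x      # a fixed feedback (assumed)
def f(x, vx):  return x**2 + vx**2  # running cost f^i (assumed)
sig = 0.5                           # constant diffusion (assumed)

x = rng.normal(0.0, 1.0, size=M)    # x^i(0) ~ m_0^i, assumed Gaussian
cost = np.zeros(M)
for _ in range(Nt):
    vx = v(x)
    cost += f(x, vx) * dt
    x += g(x, vx) * dt + sig * np.sqrt(dt) * rng.normal(size=M)
print("Monte Carlo estimate of E int_0^T f dt:", cost.mean())

The mean field terms Φⁱ, Ψⁱ of (9.16) act on the law of xⁱ(t), which in such a simulation is approximated by the empirical measure of the M paths.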

9.4 Another Interpretation

We can give another interpretation, related to the mean field game interpretation for a single player; see Chap. 2. For given deterministic functions mⁱ(t), to fix the ideas in C([0,T]; Rⁿ), we consider the following Nash equilibrium problem. We have state equations, as in (9.15):
$$dx^i = g^i(x^i(t), v(x^i(t)))\,dt + \sigma(x^i(t))\,dw^i(t), \qquad x^i(0) = x_0^i, \tag{9.17}$$

controlled by feedbacks v(·) = (v¹(·), …, vᴺ(·)), and payoffs
$$J^i(v(\cdot), m^i(\cdot)) = E\int_0^T\big[f^i(x^i(t), v(x^i(t))) + f_0^i(x^i(t), m^i(t))\big]dt + Eh^i(x^i(T), m^i(T)). \tag{9.18}$$

We look for a Nash equilibrium for problem (9.17) and (9.18). By classical methodology we obtain the system of HJB equations
$$-\frac{\partial u^i}{\partial t} + Au^i = H^i(x, Du) + f_0^i(x, m^i(t)), \qquad u^i(x,T) = h^i(x, m^i(T)) \tag{9.19}$$

with optimal feedbacks
$$\hat{v}(x,t) = \hat{v}(x, Du(x,t)).$$
If we use these optimal feedbacks in the state equations (9.17), we obtain the trajectories of the Nash equilibrium
$$d\hat{x}^i = g^i(\hat{x}^i(t), \hat{v}(\hat{x}^i(t)))\,dt + \sigma(\hat{x}^i(t))\,dw^i(t), \qquad \hat{x}^i(0) = x_0^i. \tag{9.20}$$

We now require that the functions mⁱ(t) represent the probability densities of the trajectories x̂ⁱ(t). Clearly the functions mⁱ(t) are given by the FP equations
$$\frac{\partial m^i}{\partial t} + A^* m^i + \operatorname{div}\big(G^i(x, Du)\,m^i\big) = 0, \qquad m^i(x,0) = m_0^i(x), \tag{9.21}$$
and
$$J^i(\hat{v}(\cdot), m^i(\cdot)) = \int_{\mathbb{R}^n} u^i(x,0)\,m_0^i(x)\,dx. \tag{9.22}$$

9.5 Generalization

We can introduce more general problems than (9.5) and (9.6). We can write
$$-\frac{\partial u^i}{\partial t} + Au^i = H^i(x, m, Du), \qquad u^i(x,T) = h^i(x, m(T)) \tag{9.23}$$
$$\frac{\partial m^i}{\partial t} + A^* m^i + \operatorname{div}\big(G^i(x, m, Du)\,m^i\big) = 0, \qquad m^i(x,0) = m_0^i(x), \tag{9.24}$$
in which m = (m¹, …, mᴺ) and the functions Hⁱ, Gⁱ depend on the full vector m. The interpretation is much more elaborate.

9.6 Approximate Nash Equilibrium for Large Communities

We now extend the theory developed in Sect. 5.4. We want to associate to problems (9.5) and (9.6) a differential game for N communities, each composed of a very large number of agents. We denote the agents by the index i, j where i = 1, …, N and j = 1, …, M. The number M will tend to +∞. Each player i, j chooses a feedback vⁱ,ʲ(x), x ∈ Rⁿ. The state of player i, j is denoted by xⁱ,ʲ(t) ∈ Rⁿ. We consider independent standard Wiener processes wⁱ,ʲ(t) and independent replicas x₀ⁱ,ʲ of the random variable x₀ⁱ, with the probability density m₀ⁱ. They are independent of the Wiener processes. We denote
$$v^{\cdot j}(\cdot) = (v^{1,j}(\cdot), \ldots, v^{N,j}(\cdot)).$$
The trajectory of the state xⁱ,ʲ is defined by the equation
$$dx^{i,j} = g^i(x^{i,j}, v^{\cdot j}(x^{i,j}))\,dt + \sigma(x^{i,j})\,dw^{i,j}, \qquad x^{i,j}(0) = x_0^{i,j}. \tag{9.25}$$

The trajectories are independent. The trajectory of player i, j is influenced by the feedbacks vᵏ,ʲ(x), k ≠ i, acting on his own state. When we focus on player i we use the notation
$$v^{\cdot j}(\cdot) = (v^{i,j}(\cdot), \bar{v}^{i,j}(\cdot)),$$
in which v̄ⁱ,ʲ(·) represents all feedbacks vᵏ,ʲ(x), k ≠ i. The notation v(·) represents all feedbacks.
We now define the objective functional of player i, j by
$$J^{i,j}(v(\cdot)) = E\int_0^T\Big[f^i(x^{i,j}(t), v^{\cdot j}(x^{i,j}(t))) + f_0^i\Big(x^{i,j}(t), \frac{1}{M-1}\sum_{l=1,\,l\neq j}^M\delta_{x^{i,l}(t)}\Big)\Big]dt + Eh^i\Big(x^{i,j}(T), \frac{1}{M-1}\sum_{l=1,\,l\neq j}^M\delta_{x^{i,l}(T)}\Big). \tag{9.26}$$

We look for a Nash equilibrium. Consider the system of pairs of HJB-FP equations (9.5) and (9.6) and the feedback v̂(x). We want to show that the feedback
$$\hat{v}^{i,j}(\cdot) = \hat{v}^i(\cdot)$$
is an approximate Nash equilibrium. If we use this feedback in the state equation (9.25) we get

$$d\hat{x}^{i,j} = g^i(\hat{x}^{i,j}, \hat{v}(\hat{x}^{i,j}))\,dt + \sigma(\hat{x}^{i,j})\,dw^{i,j}, \qquad \hat{x}^{i,j}(0) = x_0^{i,j},$$
and the trajectories x̂ⁱ,ʲ become independent replicas of the solution x̂ⁱ of
$$d\hat{x}^i = g^i(\hat{x}^i, \hat{v}(\hat{x}^i))\,dt + \sigma(\hat{x}^i)\,dw^i, \qquad \hat{x}^i(0) = x_0^i.$$

The probability density of x̂ⁱ(t) is mⁱ(t). Therefore,
$$J^{i,j}(\hat{v}(\cdot)) - J^i(\hat{v}(\cdot)) = E\int_0^T\Big[f_0^i\Big(\hat{x}^{i,j}(t), \frac{1}{M-1}\sum_{l=1,\,l\neq j}^M\delta_{\hat{x}^{i,l}(t)}\Big) - f_0^i(\hat{x}^i(t), m^i(t))\Big]dt + E\Big[h^i\Big(\hat{x}^{i,j}(T), \frac{1}{M-1}\sum_{l=1,\,l\neq j}^M\delta_{\hat{x}^{i,l}(T)}\Big) - h^i(\hat{x}^i(T), m^i(T))\Big]$$
and also
$$J^{i,j}(\hat{v}(\cdot)) - J^i(\hat{v}(\cdot), m^i(\cdot)) = E\int_0^T\Big[f_0^i\Big(\hat{x}^{i,j}(t), \frac{1}{M-1}\sum_{l=1,\,l\neq j}^M\delta_{\hat{x}^{i,l}(t)}\Big) - f_0^i(\hat{x}^{i,j}(t), m^i(t))\Big]dt + E\Big[h^i\Big(\hat{x}^{i,j}(T), \frac{1}{M-1}\sum_{l=1,\,l\neq j}^M\delta_{\hat{x}^{i,l}(T)}\Big) - h^i(\hat{x}^{i,j}(T), m^i(T))\Big].$$

For fixed i, the variables x̂ⁱ,ˡ(t), l = 1, …, M, are independent and distributed with the density mⁱ(t). By arguments already used (see Sect. 5.4), the random empirical measure on Rⁿ converges a.s. towards mⁱ(x,t)dx for the topology of weak* convergence of measures on Rⁿ. Provided the functionals f₀ⁱ(x,m) and hⁱ(x,m) are continuous in m for the topology of weak* convergence of measures on Rⁿ, for any fixed x, and provided Lebesgue's theorem can be used, we can assert that
$$J^{i,j}(\hat{v}(\cdot)) - J^i(\hat{v}(\cdot), m^i(\cdot)) \to 0, \quad \text{as } M\to+\infty.$$

We now focus on player 1,1, to fix the ideas. Suppose he or she uses a feedback v¹,¹(x) ≠ v̂¹,¹(x), while the other players use v̂ⁱ,ʲ(x) = v̂ⁱ(x), ∀i ≥ 2, ∀j, or ∀i, ∀j ≥ 2. We set v¹(x) = v¹,¹(x). Call this set of controls ṽ(·). By abuse of notation, we also write
$$\tilde{v}(\cdot) = (v^1(\cdot), \hat{v}^2(\cdot), \ldots, \hat{v}^N(\cdot)) = (v^1(\cdot), \hat{\bar{v}}^1(\cdot)).$$

The corresponding trajectories, denoted y¹,ʲ(t), are the solutions of
$$dy^{1,1} = g^1(y^{1,1}, v^1(y^{1,1}), \hat{\bar{v}}^1(y^{1,1}))\,dt + \sigma(y^{1,1})\,dw^{1,1}, \qquad y^{1,1}(0) = x_0^{1,1}, \tag{9.27}$$
and y¹,ʲ = x̂¹,ʲ for j ≥ 2.


We can then compute
ˆ T
1
J 1,1 (ṽ(.)) = E f 1 (y1,1 (t), v1 (y1,1 ), v̂ (y1,1 ))dt
0
ˆ    
T 1 M 1,l 1 M
+E
0
f01 1,1
y (t), ∑ x̂ (t) dt + Eh y (T ), M − 1 ∑ δx̂1,l (T )
M − 1 l=2
1 1,1
l=2
ˆ T
1
=E f 1 (y1,1 (t), v1 (y1,1 ), v̂ (y1,1 ))dt
0
ˆ T
+E f01 (y1,1 (t), m1 (t))dt + Eh1 (y1,1 (T ), m1 (T ))
0
ˆ   ˆ T
T 1 M 1,l
+E
0
y (t), f01 ∑
M − 1 l=2
1,1
x̂ (t) dt − E
0
f01 (y1,1 (t), m1 (t))dt

 
1 M
+ Eh1 y1,1 (T ), ∑ x̂ (T ) − Eh1 (y1,1 (T ), m1 (T ))
M − 1 l=2
δ 1,l

ˆ
≥ u1 (x, 0)m10 (x)dx
Rn
ˆ   ˆ T
T 1 M 1,l
+E
0
y (t), f01 ∑
M − 1 l=2
1,1
x̂ (t) dt − E
0
f01 (y1,1 (t), m1 (t))dt

 
1 M
1 1,1
+ Eh y (T ), ∑ δx̂1,l (T ) − Eh1 (y1,1 (T ), m1 (T )).
M − 1 l=2
(9.28)

Recalling that
$$\int_{\mathbb{R}^n} u^1(x,0)\,m_0^1(x)\,dx = J^1(\hat{v}(\cdot), m^1(\cdot))$$
and using the previous convergence arguments, we obtain
$$J^{1,1}(\tilde{v}(\cdot)) \geq J^1(\hat{v}(\cdot), m^1(\cdot)) - O\Big(\frac{1}{\sqrt{M}}\Big),$$

and this concludes the approximate Nash equilibrium property.
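The O(1/√M) rate is easy to observe numerically. The sketch below compares a linear functional of the empirical measure of M i.i.d. samples with its value at the true density; the Gaussian density and the test function are illustrative assumptions.

import numpy as np

# Empirical-measure error for f_0(mu) = integral of a test function,
# illustrating the O(1/sqrt(M)) gap used above.
rng = np.random.default_rng(1)
phi = lambda x: np.cos(x)               # bounded test function (assumed)
true_value = np.exp(-0.5)               # E cos(X) for X ~ N(0, 1)

for M in (10, 100, 1000, 10000):
    trials = 2000
    samples = rng.normal(size=(trials, M))
    emp = phi(samples).mean(axis=1)     # f_0 at the empirical measure
    err = np.abs(emp - true_value).mean()
    print(f"M = {M:6d}  mean error = {err:.4f}  sqrt(M)*error = {np.sqrt(M)*err:.3f}")
# sqrt(M) * error stays roughly constant, matching the O(1/sqrt(M)) rate.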


Chapter 10
Analytic Techniques

10.1 General Set-Up

10.1.1 Assumptions

We consider here the system
$$-\frac{\partial u^i}{\partial t} + Au^i = H^i(x, Du) + f_0^i(x, m(t)), \qquad u^i(x,T) = h^i(x, m(T)) \tag{10.1}$$
$$\frac{\partial m^i}{\partial t} + A^* m^i + \operatorname{div}\big(G^i(x, Du)\,m^i\big) = 0, \qquad m^i(x,0) = m_0^i(x), \tag{10.2}$$
which we consider as a system of PDEs. We call u, m the vectors with components uⁱ, mⁱ. We want to give an existence and a regularity result. We recall
$$A\varphi(x) = -\sum_{k,l=1}^n a_{k,l}(x)\frac{\partial^2\varphi}{\partial x_k\partial x_l}(x), \tag{10.3}$$
$$A^*\varphi(x) = -\sum_{k,l=1}^n\frac{\partial^2}{\partial x_k\partial x_l}\big(a_{kl}(x)\varphi(x)\big). \tag{10.4}$$

In the mean field problem, x ∈ Rⁿ. To simplify the analytic treatment, we take x ∈ O, a Lipschitz bounded domain with boundary denoted by ∂O. We assume
$$a_{k,l} = a_{l,k}\in W^{1,\infty}(O), \qquad \sum_{k,l=1}^n a_{k,l}(x)\xi_k\xi_l \geq \alpha|\xi|^2, \quad \forall\xi\in\mathbb{R}^n,\ \forall x\in O,\ \alpha>0. \tag{10.5}$$

We need to specify boundary conditions for problems (10.1) and (10.2). In this presentation, we shall use Neumann boundary conditions, but Dirichlet boundary conditions are possible and in fact simpler. If x ∈ ∂O, we call ν(x) the outward unit normal at x. The Neumann boundary condition for (10.1) reads
$$\sum_{k,l=1}^n\nu_k(x)\,a_{k,l}(x)\,\frac{\partial u^i}{\partial x_l}(x) = 0, \quad \forall x\in\partial O,\ i=1,\ldots,N, \tag{10.6}$$
and the Neumann boundary condition for (10.2) reads
$$\sum_{l=1}^n\nu_l(x)\Big[\sum_{k=1}^n\frac{\partial}{\partial x_k}\big(a_{k,l}(x)m^i(x)\big) - G_l^i(x, Du(x))\,m^i(x)\Big] = 0, \quad \forall x\in\partial O,\ i=1,\ldots,N. \tag{10.7}$$
To simplify notation we set
$$L^\infty(L^p) = L^\infty(0,T; L^p(O,\mathbb{R}^N)), \quad 1\leq p\leq\infty, \qquad C(L^2) = C([0,T]; L^2(O,\mathbb{R}^N)).$$
We shall also need the Sobolev spaces
$$L^2(W^{1,2}) = L^2(0,T; W^{1,2}(O,\mathbb{R}^N)), \qquad L^2((W^{1,2})^*) = L^2(0,T; (W^{1,2})^*(O,\mathbb{R}^N)).$$
We may also use the same short notation for the components of the vector in Rᴺ.
We next make the assumptions
$$H^i(x,t;q),\ G^i(x,t;q) : \mathbb{R}^n\times\mathbb{R}\times\mathbb{R}^{nN} \ \text{measurable, continuous in } q, \tag{10.8}$$
$$|H^i(x,t;q)| + |G^i(x,t;q)|^2 \leq K|q|^2 + K, \tag{10.9}$$
and
$$f_0^i(x,t,m) : \mathbb{R}^n\times\mathbb{R}\times L^1(O,\mathbb{R}^N) \ \text{measurable, continuous in } m, \qquad |f_0^i(x,t,m)| \leq k_0(\|m\|), \tag{10.10}$$
where ‖m‖ = ‖m‖_{L¹(O,Rᴺ)} and k₀ is bounded on bounded sets. Also
$$h^i(x,m) : \mathbb{R}^n\times L^1(O,\mathbb{R}^N) \ \text{measurable, continuous in } m, \qquad |h^i(x,m)| \leq k_0(\|m\|), \qquad m_0\in L^2(O,\mathbb{R}^N). \tag{10.11}$$

10.1.2 Weak Formulation

We now give a weak formulation of problems (10.1) and (10.2). We look for a pair u, m such that
$$u, m \in L^2(W^{1,2})\cap L^\infty(L^2), \qquad m\in C(L^2), \qquad m^i G^i(\cdot, Du)\in L^2(L^2), \tag{10.12}$$
and u, m satisfy
$$\int_0^T (u^i, \dot{\varphi}^i)_{L^2}\,dt + \sum_{k,l=1}^n\int_0^T (D_l u^i, D_k(a_{kl}\varphi^i))_{L^2}\,dt = (h^i(\cdot,m(T)), \varphi^i(T))_{L^2} + \int_0^T (H^i(\cdot,Du) + f_0^i(\cdot,m(t)), \varphi^i)_{L^2}\,dt, \quad \forall i=1,\ldots,N, \tag{10.13}$$
for any test function ϕⁱ ∈ L²(W^{1,2}) ∩ L∞(L∞), ϕ̇ⁱ ∈ L²((W^{1,2})*), such that ϕⁱ(t) vanishes in a neighborhood of t = 0. Note that ϕⁱ ∈ C(L²). Similarly,
$$\int_0^T (m^i, \dot{\varphi}^i)_{L^2}\,dt + \sum_{k,l=1}^n\int_0^T (D_l\varphi^i, D_k(a_{kl}m^i))_{L^2}\,dt = (m_0, \varphi^i(0))_{L^2} + \int_0^T (m^i G^i(\cdot,Du), D\varphi^i)_{L^2}\,dt, \quad \forall i=1,\ldots,N, \tag{10.14}$$
for any test function ϕⁱ ∈ L²(W^{1,2}) ∩ L∞(L∞), ϕ̇ⁱ ∈ L²((W^{1,2})*), such that ϕⁱ(t) vanishes in a neighborhood of t = T. We have denoted
$$(\varphi,\psi)_{L^2} = \int_O\varphi(x)\psi(x)\,dx,$$
and Dₖ stands for ∂/∂xₖ.

10.2 A Priori Estimates for u

10.2.1 L∞ Estimate for u

We assume here the structure condition
$$|H^i(x,q)| \leq K|q^i||q| + K. \tag{10.15}$$
Proposition 7. We assume (10.5), (10.10), (10.11), (10.15). Consider a weak solution of (10.13) with m ∈ L∞(L¹), m ≥ 0. Then
$$\|u\|_{L^\infty(L^\infty)} \leq K_0, \tag{10.16}$$
where K₀ depends only on the L∞(L¹) bound of m and on the various constants.
Proof. We note that f₀ⁱ(x,m(t)), hⁱ(x,m(T)) are bounded by a constant. So, in fact, we have
$$\Big|-\frac{\partial u^i}{\partial t} + Au^i\Big| \leq K|Du^i||Du| + K, \qquad |u^i(x,T)| \leq K.$$
Then there exists a function σ(x,t) such that ‖σ‖_{L∞(L∞)} ≤ 1 and
$$-\frac{\partial u^i}{\partial t} + Au^i = \sigma\big(K|Du^i||Du| + K\big).$$
Define
$$\tilde{G}^i(x,q) = K\sigma\frac{Du^i}{|Du^i|}\,|q| \ \text{ if } Du^i\neq 0, \qquad \tilde{G}^i(x,q) = 0 \ \text{ if } Du^i = 0;$$
then uⁱ satisfies
$$-\frac{\partial u^i}{\partial t} + Au^i = \tilde{G}^i(x, Du)\cdot Du^i + K\sigma.$$
The probabilistic interpretation, or the maximum principle, shows immediately that
$$|u(x_0,t_0)| \leq \max\big(\sup|u(x,T)|,\ KT\big),$$
which proves the result. □



10.2.2 L²(W^{1,2}) Estimate for u

Proposition 8. We make the assumptions of Proposition 7. Then a weak solution of (10.13) satisfies
$$\|u\|_{L^2(W^{1,2})} \leq K_0, \tag{10.17}$$
where K₀ depends only on the L∞(L¹) bound of m and on the various constants.
Proof. The proof is given in [3] and is not reproduced here. It relies on the test functions
$$\varphi^i = \big(\exp(\lambda u^i) - \exp(-\lambda u^i)\big)\exp\Big(\frac{\gamma}{\alpha}\sum_{j=1}^N\big(\exp(\lambda u^j) - \exp(-\lambda u^j)\big)\Big)$$
with parameters λ, γ sufficiently large. More precisely, we use iterated exponentials related to ϕⁱ (see the details in the reference above). In the reference, however, Dirichlet conditions are assumed; but the proof carries over to Neumann boundary conditions. □


10.2.3 Cᵅ Estimate for u

We need to consider here that the operator A is written in divergence form
$$A\varphi(x) = -D_k\big(a_{k,l}(x)D_l\varphi(x)\big),$$
which can be done since the a_{k,l}(x) are Lipschitz continuous, with a modification of the Hamiltonian that does not change the assumptions.
One makes use of the Green's function Γ_{x₀,t₀}(x,t), solution of the backward equation, for t < t₀,
$$-\frac{\partial\Gamma}{\partial t} + A\Gamma = 0, \qquad \Gamma_{x_0,t_0}(x,t_0) = \delta(x-x_0).$$
One can then prove the following estimate.
Proposition 9. We make the assumptions of Proposition 7, and assume ∂O smooth; then one has the estimate
$$\int_{t_0}^T\!\!\int_O|Du|^2\,\Gamma_{x_0,t_0}(x,t)\,dx\,dt \leq C, \tag{10.18}$$
where C depends only on the L∞(L¹) bound of m, the various constants, and the domain.

Proof. In reference [3], the Dirichlet problem is considered. It suffices to test with ϕⁱΓ_{x₀,t₀}, after extending the solution uⁱ by 0 outside the domain O. This is not valid for Neumann boundary conditions. Because the boundary is smooth, we can use the following procedure. First, the problem is transformed locally to the half-space xₙ > 0. In a portion U ∩ {xₙ > 0} the function u is extended by reflection (a change of coordinates is necessary):
$$\tilde{u} = u \ \text{on } U\cap\{x_n>0\}, \qquad \tilde{u}(x_1,\ldots,x_{n-1},x_n,t) = u(x_1,\ldots,x_{n-1},-x_n,t), \quad x_n<0.$$
We extend the coefficients by
$$a_{k,l}(x_1,\ldots,x_n) = a_{k,l}(x_1,\ldots,-x_n), \ x_n<0, \quad \text{if } l\neq n,\ k\neq n, \text{ or } k=l=n,$$
$$a_{n,l}(x_1,\ldots,x_n) = -a_{n,l}(x_1,\ldots,-x_n), \ l\neq n, \qquad a_{k,n}(x_1,\ldots,x_n) = -a_{k,n}(x_1,\ldots,-x_n), \ k\neq n.$$
The Hamiltonian is extended as follows:
$$H^i(x_1,\ldots,x_n, q_1,\ldots,q_{n-1},q_n) = H^i(x_1,\ldots,-x_n, q_1,\ldots,q_{n-1},-q_n),$$
and
$$f_0^i(x_1,\ldots,x_n, m(t)) = f_0^i(x_1,\ldots,-x_n, m(t)).$$
The extended elliptic operator is discontinuous at the boundary, but still uniformly elliptic. Since it is in divergence form, this is valid. The Neumann condition holds on both sides of xₙ = 0. The extended solution solves a parabolic problem for which we need to consider only interior estimates. So if the new domain is denoted Õ, we consider the Green's function with respect to Õ or a cube ⊃⊃ Õ. One then uses a test function ϕⁱΓ_{x₀,t₀}τ², where τ ≥ 0 is a localization function with compact support in Õ. This yields the desired estimate. □
From the preceding result, one can derive the Cᵅ a priori estimate.
Proposition 10. With the assumptions of Proposition 9, one has the estimate
$$[u]_{C^\alpha} \leq C, \tag{10.19}$$
where C depends only on the L∞(L¹) bound of m, the various constants, and the domain.
Proof. Again, we do not detail the proof and refer to [3, 6]. The idea is to use a Campanato-type estimate
$$\sup_R\Big\{R^{-n-2-2\alpha}\int_{Q_R}|u-\bar{u}_R|^2\,dx\,dt \ \Big|\ 0<R<R_1,\ Q_R\subset O\times(0,T)\Big\} \leq C, \tag{10.20}$$
where Q_R is a parabolic cylinder of size R around any point of the domain O × (0,T); the sup is taken over all points and all sizes. The quantity ū_R is the mean value of u over Q_R. Alternatively, ū_R can also be chosen to be the mean value over Q_{mR} − Q_R, m fixed.
The bound C is as indicated in the statement of the proposition. From this property, it follows that u is Hölder continuous on domains Q such that Q ⊂⊂ O × (0,T), with
$$[u]_{C^\alpha(Q)} \leq C_Q.$$
Due to the possibility of extending the solution across the boundary ∂O × (0,T) and O × {0}, we can, in fact, state
$$[u]_{C^\alpha(O\times(0,T))} \leq C$$
with a bound as indicated in the statement of the proposition.


In our setting, the Campanato criterium is established, proving first the estimate
ˆ
|Du|2 dxdt ≤ CRn+2α (10.21)
QR

which implies, via Poincaré’s inequality,


ˆ ˆ  2
t0  
u − u(x,t)dx dxdt ≤ CRn+2+2α ,

t0 −R2 BR BR

where BR = BR (x0 ) is the ball of radius R and center x0 . From this and the equation,
one can also estimate the differences of mean values
 
 
 u(x,t1 )dx − u(x,t2 )dx ≤ CR2α .

BR BR

So the crucial point is to establish the Morrey inequality (10.21). The standard technique is to obtain a "hole-filling inequality." In the elliptic case, such inequalities are of the form
$$\int_{B_R}|Du|^2\,G\,dx \leq C\int_{B_{2R}-B_R}|Du|^2\,G\,dx + CR^{2\alpha},$$
where G = G(·; x₀) is the fundamental solution of the underlying elliptic operator.
The parabolic analogue is much more complex, since one has to deal with fundamental solutions at different time levels. The corresponding inequality reads
$$\int_{Q_R}|Du|^2\,\Gamma\,dx\,dt \leq K(\varepsilon)\int_{Q_R - Q_{R/2}}|Du|^2\,\Gamma\,dx\,dt + \delta(\varepsilon)\,R^{-n}\int_{Q_R - Q_{R/2}}|Du|^2\,dx\,dt + CR^\beta,$$
where Γ is the fundamental solution of the parabolic operator with singularity at (x₀,t₀). If we discard the term in δ(ε), then we can use the standard hole-filling procedure, which implies
$$\int_{Q_R}|Du|^2\,\Gamma\,dx\,dt \leq \frac{K(\varepsilon)}{1+K(\varepsilon)}\int_{Q_{4R}}|Du|^2\,\Gamma\,dx\,dt + KR^\beta.$$
Then an iteration argument applied to R = 2⁻ᵏ yields
$$\int_{Q_R}|Du|^2\,\Gamma\,dx\,dt \leq CR^{2\alpha}.$$
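For the reader's convenience, the iteration can be spelled out as follows (a sketch, ignoring the harmless scale gap between Q₄R and Q_R, which only shifts indices). Set
$$E_k := \int_{Q_{2^{-k}}}|Du|^2\,\Gamma\,dx\,dt, \qquad \theta := \frac{K(\varepsilon)}{1+K(\varepsilon)} < 1,$$
so that the hole-filled inequality reads $E_{k+1}\le\theta E_k + K2^{-k\beta}$. Iterating,
$$E_k \le \theta^k E_0 + K\sum_{j=0}^{k-1}\theta^{\,j}\,2^{-(k-1-j)\beta} \le C\,(k+1)\big(\max(\theta,\,2^{-\beta})\big)^{k},$$
and for any α with $2^{-2\alpha} > \max(\theta,\,2^{-\beta})$ the factor (k+1) is absorbed, giving $E_k \le C\,2^{-2\alpha k}$, i.e., the stated bound $CR^{2\alpha}$ at $R = 2^{-k}$.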

To deal with the term in δ(ε), one uses the fact that
$$K(\varepsilon)\sim\frac{1}{\varepsilon}, \qquad \frac{\delta(\varepsilon)}{1+K(\varepsilon)}\sim\varepsilon^2.$$
One then uses a supremum argument, which covers the term in δ(ε) and takes into account that Γ ≥ c₀R⁻ⁿ on Q_{2R} − Q_R, but not necessarily on Q_R (this is different from the elliptic case). One derives (10.21). □

10.2.4 Lᵖ(W^{2,p}) Estimate for u

We show here how all the previous a priori estimates on u, including the Cᵅ estimate, allow us to obtain estimates in the space Lᵖ(W^{2,p}), 2 ≤ p < ∞, provided we assume the following regularity of the final condition:
$$\|h^i(\cdot,m)\|_{L^\infty(W^{2,p})} \leq k_0(\|m\|), \tag{10.22}$$
$$\sum_{k,l=1}^n\nu_k(x)\,a_{k,l}(x)\,\frac{\partial h^i}{\partial x_l}(x,t,m) = 0, \quad \forall x\in\partial O,\ \forall m,\ \forall t,\ i=1,\ldots,N. \tag{10.23}$$

We have the following:
Proposition 11. We make the assumptions of Proposition 9, together with (10.22) and (10.23). Then one has the estimate
$$\|u\|_{L^p(W^{2,p})} \leq C, \qquad \Big\|\frac{\partial u}{\partial t}\Big\|_{L^p(L^p)} \leq C, \quad p<\infty, \tag{10.24}$$
where C depends only on the L∞(L¹) bound of m, the various constants, and the domain.
Proof. The assumptions (10.22) and (10.23) allow us to reduce the final condition to uⁱ(x,T) = 0, simply by considering the difference uⁱ(x,t) − hⁱ(x,m(T)). The second property in (10.24) is a consequence of the first one. So it remains to prove the first one with zero final condition. We shall use linear parabolic Lᵖ theory. A delicate point in the discussion concerns the compatibility conditions for the boundary data on {t = T} × ∂O. We refer to [28], Theorem 5.3, p. 320. In our case, the compatibility condition is satisfied thanks to (10.23) and the continuity of u proved in the preceding section. A convenient reference for the Lᵖ(W^{2,p}) property in the case of Dirichlet boundary conditions is the paper of Schlag [33]. We believe that his technique can be adapted to the Neumann case. Nevertheless, we proceed with local estimates and local charts to treat the boundary.
We first prove a local estimate, namely if O₀ ⊂ O, then
$$\|u\|_{L^p(W^{2,p}(O_0))} \leq C_0. \tag{10.25}$$
Let x₀ ∈ O and t₀ ∈ (0,T). We consider a ball B_{2R} ⊂ O. The radius R will be small, but will not tend to 0. We set
$$Q_R = B_R\times((t_0-R^2)^+,\, t_0)$$
and we consider a test function τ(x,t) that is equal to 1 on Q_R, with support included in Q_{2R}, which is Lipschitz, with second derivatives in x bounded.
We note that
$$\Big|-\frac{\partial u^i}{\partial t} + Au^i\Big| \leq K|Du|^2 + K.$$
Let u_R^i be the mean value of uⁱ over Q_R. It is easy to check the following localized inequality:

 
$$\Big|-\frac{\partial}{\partial t}\big((u^i - u_R^i)\tau^2\big) + A\big((u^i - u_R^i)\tau^2\big)\Big| \leq K|D((u-u_R)\tau)|^2 + K(\tau),$$
in which K does not depend on τ, but the second constant depends on τ, its derivatives, and the L∞ bound of u. We then use the standard Lᵖ(W^{2,p}) theory of parabolic equations with Dirichlet conditions to claim the estimate
$$\int_0^T\!\!\int_O\Big|\frac{\partial}{\partial t}\big((u-u_R)\tau^2\big)\Big|^p\,dx\,dt + \int_0^T\!\!\int_O\big|D^2\big((u-u_R)\tau^2\big)\big|^p\,dx\,dt \leq K\int_0^T\!\!\int_O|D((u-u_R)\tau)|^{2p}\,dx\,dt + K(\tau). \tag{10.26}$$

We skip the details, which are easy. Setting w = (u − u_R)τ, which vanishes on ∂O, we may write, by integration by parts,
$$\int_0^T\!\!\int_O|D_k w^i|^{2p}\,dx\,dt = -(2p-1)\int_0^T\!\!\int_O w^i D_k^2 w^i\,|D_k w^i|^{2p-2}\,dx\,dt;$$
hence
$$\int_0^T\!\!\int_O|D_k w^i|^{2p}\,dx\,dt \leq K_p\int_0^T\!\!\int_O|w^i D_k^2 w^i|^p\,dx\,dt,$$
which yields easily
$$\int_0^T\!\!\int_O|D_k w^i|^{2p}\,dx\,dt \leq K_p\int_0^T\!\!\int_O|u^i - u_R^i|^p\,|D_k^2(w^i\tau)|^p\,dx\,dt + K(\tau).$$
Thanks to the fact that the functions uⁱ are Cᵅ, we can choose R sufficiently small so that this inequality becomes
$$\int_0^T\!\!\int_O|D_k w^i|^{2p}\,dx\,dt \leq \varepsilon\int_0^T\!\!\int_O|D_k^2(w^i\tau)|^p\,dx\,dt + K(\tau),$$
in which ε can be chosen small. Adding terms we get
$$\int_0^T\!\!\int_O|D((u-u_R)\tau)|^{2p}\,dx\,dt \leq \varepsilon\int_0^T\!\!\int_O|D^2((u-u_R)\tau^2)|^p\,dx\,dt + K(\tau),$$
which, used in (10.26), yields
$$\int_0^T\!\!\int_O\Big|\frac{\partial}{\partial t}\big((u-u_R)\tau^2\big)\Big|^p\,dx\,dt + \int_0^T\!\!\int_O|D^2((u-u_R)\tau^2)|^p\,dx\,dt \leq K(\tau).$$

By combining a finite number of small domains, we obtain the property (10.25).
To obtain the estimate up to the boundary, we need to use a system of local charts. To give all details would be too technical and too long. We just explain the generic problems. Suppose we have a domain O in xₙ ≥ 0, whose boundary contains a part Γ′ ⊂ {xₙ = 0} and a part Γ″ ⊂ {xₙ > 0}. We consider a Cᵅ function, which we denote by ϕ, which satisfies the boundary conditions
$$\varphi|_{\Gamma''} = 0, \qquad \sum_{k=1}^n a_{n,k}(x)D_k\varphi(x)\Big|_{\Gamma'} = 0.$$
We want to estimate the integral ∫_O|Dϕ|^{2p}dx, p > 1. We set
$$\frac{\partial\varphi}{\partial n}(x) = \sum_{k=1}^n a_{n,k}(x)D_k\varphi(x).$$
We can first notice that
$$|D\varphi(x)|^{2p} \leq K\Big(\sum_{k=1}^{n-1}|D_k\varphi(x)|^{2p} + \Big|\frac{\partial\varphi}{\partial n}(x)\Big|^{2p}\Big);$$
then for k = 1, …, n−1 we have
$$\int_O|D_k\varphi|^{2p}\,dx = -(2p-1)\int_O\varphi\,D_k^2\varphi\,|D_k\varphi|^{2p-2}\,dx.$$
This is thanks to the fact that the integration by parts produces no boundary terms: on Γ′ because k ≠ n, and on Γ″ because ϕ vanishes there. This leads to
$$\int_O|D_k\varphi|^{2p}\,dx \leq K_p\int_O|\varphi\,D_k^2\varphi|^p\,dx.$$
If the domain is small, in view of the fact that ϕ is Cᵅ and vanishes on Γ″, we can also bound ‖ϕ‖_{L∞} by a small number. So we obtain the estimate
$$\int_O|D_k\varphi|^{2p}\,dx \leq \varepsilon\int_O|D_k^2\varphi|^p\,dx.$$
Next we write
$$\int_O\Big|\frac{\partial\varphi}{\partial n}(x)\Big|^{2p}\,dx = \int_O\Big|\frac{\partial\varphi}{\partial n}(x)\Big|^{2p-1}\,\Big|\sum_{k=1}^n a_{n,k}(x)D_k\varphi(x)\Big|\,dx.$$
The terms with index k ≠ n can be estimated as above. There remains the term ∫_O|∂ϕ/∂n(x)|^{2p−1}|Dₙϕ(x)|dx. The integration by parts is still possible, because on the part Γ′ of the boundary we have ∂ϕ/∂n(x) = 0. Eventually we get an estimate
$$\int_O|D\varphi|^{2p}\,dx \leq \varepsilon\int_O|D^2\varphi|^p\,dx + \varepsilon.$$
The additional term stems from the derivatives of the functions a_{n,k}(x). This generic estimate is used together with local charts and additional localization to work with small domains. Collecting results, we can obtain the estimate (10.24) up to the boundary. □

10.3 A Priori Estimates for m

We build on the results obtained in the preceding section. Under the assumptions of Proposition 11, we can assert that Du ∈ L∞(L∞). Therefore, the vectors Gⁱ(x,Du) are bounded functions. So we can look at the functions mⁱ as solving a generic problem stated as follows (we drop the index i and replace Gⁱ(x,Du) by a bounded vector G(x,t), simply referred to as G(x)):
$$\frac{\partial m}{\partial t} + A^* m + \operatorname{div}(m\,G(x)) = 0, \ x\in O, \qquad \sum_{l=1}^n\nu_l(x)\Big[\sum_{k=1}^n\frac{\partial}{\partial x_k}(a_{kl}(x)m) - G_l\,m\Big] = 0, \ x\in\partial O, \qquad m(x,0) = m_0(x), \tag{10.27}$$
and m₀ is in L²(O). We also write (10.27) in the weak form
$$\int_O\frac{\partial m}{\partial t}\varphi\,dx + \sum_{k,l}\int_O a_{kl}\frac{\partial m}{\partial x_k}\frac{\partial\varphi}{\partial x_l}\,dx + \sum_l\int_O m\Big(\sum_k\frac{\partial a_{kl}}{\partial x_k} - G_l\Big)\frac{\partial\varphi}{\partial x_l}\,dx = 0, \quad \forall\varphi\in W^{1,2}. \tag{10.28}$$

10.3.1 L²(W^{1,2}) Estimate

We have the following:
Proposition 12. We assume G bounded and m₀ ∈ L²(O). Then we have
$$\|m\|_{L^\infty(L^2)} + \|m\|_{L^2(W^{1,2})} \leq C, \tag{10.29}$$
where C depends only on the L∞ bound of G, the W^{1,∞} bound of the a_{i,j}, the ellipticity constant, and the L² norm of m₀.
Proof. This is easily obtained by taking ϕ = m(t) in (10.28) and integrating in t. The result follows from Gronwall's inequality. □

10.3.2 L∞(L∞) Estimates

First, it is easy to check that if m₀ ∈ Lᵖ(O), then
$$\|m\|_{L^\infty(L^p)} \leq C_p, \quad \forall 2\leq p<\infty. \tag{10.30}$$
This is obtained by testing (10.28) with m^{p−1}, performing easy bounds, and using Gronwall's inequality (an intermediate step to justify the existence of the integrals is done by replacing m with min(m,L)). However, the constant C_p depends on p and goes to ∞ as p → ∞. This is due to the third term in (10.28). In the case when
$$\sum_k\frac{\partial a_{kl}}{\partial x_k} - G_l = 0, \tag{10.31}$$
we simply have
$$\int_O m^p(x,t)\,dx \leq \int_O m_0^p(x)\,dx.$$
Therefore,
$$\|m\|_{L^\infty(L^p)} \leq \|m_0\|_{L^\infty},$$
and we can let p → ∞ and obtain
$$\|m\|_{L^\infty(L^\infty)} \leq \|m_0\|_{L^\infty}. \tag{10.32}$$
The L∞(L∞) estimate in the general case requires an argument given by Moser (see [4], e.g., for the elliptic case). We provide some details. We set
$$\tilde{G}_l(x) = G_l(x) - \sum_k\frac{\partial a_{kl}}{\partial x_k}(x).$$

We have the following:
Proposition 13. Under the assumptions of Proposition 14, m ∈ L∞(L∞), and the bound depends only on the L∞ bound of G, the W^{1,∞} bound of the a_{i,j}, the ellipticity constant, and the L∞ norm of m₀.
Proof. We reduce the initial condition m₀ to 0 by introducing m̃, the solution of
$$\int_O\frac{\partial\tilde{m}}{\partial t}\varphi\,dx + \sum_{k,l}\int_O a_{kl}\frac{\partial\tilde{m}}{\partial x_k}\frac{\partial\varphi}{\partial x_l}\,dx = 0, \qquad \tilde{m}(x,0) = m_0(x). \tag{10.33}$$
We have seen above that m̃ ∈ L∞(L∞). We set π = m − m̃. Then π is the solution of
$$\int_O\frac{\partial\pi}{\partial t}\varphi\,dx + \sum_{k,l}\int_O a_{kl}\frac{\partial\pi}{\partial x_k}\frac{\partial\varphi}{\partial x_l}\,dx - \sum_l\int_O(\pi+\tilde{m})\tilde{G}_l\frac{\partial\varphi}{\partial x_l}\,dx = 0, \quad \forall\varphi\in W^{1,2}, \tag{10.34}$$
and π(x,0) = 0.
We take p > 2 and use ϕ = |π|^{p−2}π as a test function in (10.34). We obtain easily
$$\frac{1}{p}\frac{d}{dt}\int_O|\pi(x,t)|^p\,dx + (p-1)\sum_{k,l}\int_O a_{kl}\frac{\partial\pi}{\partial x_k}\frac{\partial\pi}{\partial x_l}|\pi|^{p-2}\,dx - (p-1)\sum_k\int_O(\pi+\tilde{m})\tilde{G}_k\frac{\partial\pi}{\partial x_k}|\pi|^{p-2}\,dx = 0,$$
and we can write
$$\frac{1}{p}\frac{d}{dt}\int_O|\pi(x,t)|^p\,dx + (p-1)\,\alpha\int_O\big|D|\pi|\big|^2|\pi|^{p-2}\,dx \leq (p-1)\,c\int_O|\pi|^{p-2}(|\pi|+1)\big|D|\pi|\big|\,dx;$$
hence, by standard estimation,
$$\frac{1}{p}\frac{d}{dt}\int_O|\pi(x,t)|^p\,dx + (p-1)\frac{\alpha}{2}\int_O\big|D|\pi|\big|^2|\pi|^{p-2}\,dx \leq (p-1)\frac{c^2}{\alpha}\int_O|\pi|^{p-2}(1+|\pi|^2)\,dx.$$
We use
$$\int_O\big|D|\pi|\big|^2|\pi|^{p-2}\,dx = \frac{4}{p^2}\int_O\big|D|\pi|^{p/2}\big|^2\,dx.$$

We next use Poincaré's inequality to state (n > 2)
$$\int_O\big|D|\pi|^{p/2}\big|^2\,dx \geq c_1\Big(\int_O\Big||\pi|^{p/2} - \fint_O|\pi|^{p/2}\,dx\Big|^{\frac{2n}{n-2}}\,dx\Big)^{\frac{n-2}{n}}.$$
If n = 2 we replace $\frac{n}{n-2}$ by any q₀ > 2.
After easy calculations,
$$\int_O\big|D|\pi|^{p/2}\big|^2\,dx \geq k\Big(\int_O|\pi|^{\frac{pn}{n-2}}\,dx\Big)^{\frac{n-2}{n}} - k\int_O|\pi|^p\,dx.$$

Collecting results, we obtain the inequality (recall that p > 2)
$$\frac{d}{dt}\int_O|\pi(x,t)|^p\,dx + \Big(\int_O|\pi|^{\frac{pn}{n-2}}\,dx\Big)^{\frac{n-2}{n}} \leq \beta p^2\int_O|\pi|^{p-2}(1+|\pi|^2)\,dx. \tag{10.35}$$
We deduce easily (modifying β)
$$\sup_{0\leq t\leq T}\int_O|\pi(x,t)|^p\,dx + \int_0^T\Big(\int_O|\pi(x,t)|^{\frac{pn}{n-2}}\,dx\Big)^{\frac{n-2}{n}}\,dt \leq \beta p^2\int_0^T\!\!\int_O|\pi(x,t)|^{p-2}(1+|\pi|^2)\,dx\,dt. \tag{10.36}$$
Multiplying by $\big(\sup_{0\leq t\leq T}\int_O|\pi(x,t)|^p\,dx\big)^{2/n}$, we obtain
$$\Big(\sup_{0\leq t\leq T}\int_O|\pi(x,t)|^p\,dx\Big)^{1+\frac{2}{n}} + \Big(\sup_{0\leq t\leq T}\int_O|\pi(x,t)|^p\,dx\Big)^{\frac{2}{n}}\int_0^T\Big(\int_O|\pi(x,t)|^{\frac{pn}{n-2}}\,dx\Big)^{\frac{n-2}{n}}\,dt \leq \Big(\sup_{0\leq t\leq T}\int_O|\pi(x,t)|^p\,dx\Big)^{\frac{2}{n}}\beta p^2\int_0^T\!\!\int_O|\pi(x,t)|^{p-2}(1+|\pi|^2)\,dx\,dt.$$
We next use the inequality
$$\int_0^T\!\!\int_O|\pi(x,t)|^{p(1+\frac{2}{n})}\,dx\,dt \leq \Big(\sup_{0\leq t\leq T}\int_O|\pi(x,t)|^p\,dx\Big)^{\frac{2}{n}}\int_0^T\Big(\int_O|\pi(x,t)|^{\frac{pn}{n-2}}\,dx\Big)^{\frac{n-2}{n}}\,dt;$$
therefore we obtain

$$\int_0^T\!\!\int_O|\pi(x,t)|^{p(1+\frac{2}{n})}\,dx\,dt \leq (\beta p^2)^{1+\frac{2}{n}}\Big(\int_0^T\!\!\int_O|\pi(x,t)|^{p-2}(1+|\pi|^2)\,dx\,dt\Big)^{1+\frac{2}{n}}. \tag{10.37}$$
We can write this relation as
$$\|\pi\|_{L^{p(1+\frac{2}{n})}(O\times(0,T))} \leq \beta^{\frac{1}{p}}p^{\frac{2}{p}}\Big(\int_0^T\!\!\int_O|\pi(x,t)|^{p-2}(1+|\pi|^2)\,dx\,dt\Big)^{\frac{1}{p}}. \tag{10.38}$$

We use
$$\int_0^T\!\!\int_O|\pi(x,t)|^{p-2}\,dx\,dt \leq \Big(\int_0^T\!\!\int_O|\pi(x,t)|^p\,dx\,dt\Big)^{\frac{p-2}{p}}(T|O|)^{\frac{2}{p}} \leq \max\Big(1,\ \int_0^T\!\!\int_O|\pi(x,t)|^p\,dx\,dt\Big)(T|O|)^{\frac{2}{p}}.$$
Therefore,
$$\int_0^T\!\!\int_O|\pi(x,t)|^{p-2}(1+|\pi|^2)\,dx\,dt \leq \max\Big(1,\ \int_0^T\!\!\int_O|\pi(x,t)|^p\,dx\,dt\Big)\big(1+(T|O|)^{\frac{2}{p}}\big) \leq \max\Big(1,\ \int_0^T\!\!\int_O|\pi(x,t)|^p\,dx\,dt\Big)\big(1+\max(1,\ T|O|)\big);$$
hence also
$$\Big(\int_0^T\!\!\int_O|\pi(x,t)|^{p-2}(1+|\pi|^2)\,dx\,dt\Big)^{\frac{1}{p}} \leq \max(1,\ \|\pi\|_{L^p})\,c^{\frac{1}{p}}.$$

Using this estimate in (10.38) and modifying the constant β, we can write
$$\|\pi\|_{L^{p(1+\frac{2}{n})}} \leq \beta^{\frac{1}{p}}p^{\frac{2}{p}}\max(1,\ \|\pi\|_{L^p}).$$
Since 1 is smaller than the right-hand side, we deduce
$$\max\big(1,\ \|\pi\|_{L^{p(1+\frac{2}{n})}}\big) \leq \beta^{\frac{1}{p}}p^{\frac{2}{p}}\max(1,\ \|\pi\|_{L^p}). \tag{10.39}$$
This is a classical inequality, which easily leads to the result. Indeed, set a = 1 + 2/n and p_j = 2aʲ, z_j = max(1, ‖π‖_{L^{p_j}}). We can write

$$z_{j+1} \leq \beta^{\frac{1}{p_j}}p_j^{\frac{2}{p_j}}\,z_j$$
and
$$z_j \leq z_0\,\beta^{\sum_{h=0}^\infty\frac{1}{p_h}}\exp\Big(2\sum_{h=0}^\infty\frac{\log p_h}{p_h}\Big).$$
Since a > 1, the series converge. Therefore, z_j is bounded. Letting j → ∞, we obtain that ‖π‖_{L^∞} is finite. This concludes the proof. □

10.3.3 Further Estimates

With the L∞(L∞) estimate, we can see that the term
$$\sum_l\int_O m\Big(\sum_k\frac{\partial a_{kl}}{\partial x_k} - G_l\Big)\frac{\partial\varphi}{\partial x_l}\,dx$$
can be extended to ϕ in W^{1,p}, ∀1 < p < 2. We can consider this term as a right-hand side for (10.28), belonging to L∞((W^{1,p})*).
With the methods of Sect. 10.2.3, we can obtain that m is Cᵅ up to the boundary. We can then obtain that m ∈ Lᵖ(W^{1,p}_loc), ∀2 ≤ p < ∞. Although we believe that m ∈ Lᵖ(W^{1,p}), we do not have a reference in the literature, in the regularity theory of parabolic equations with Neumann boundary conditions, allowing us to assert the regularity up to the boundary.
So we state the following:
Proposition 14. We assume G bounded and m₀ ∈ L∞(O). Then m ∈ Cᵅ ∩ Lᵖ(W^{1,p}_loc) and ṁ ∈ Lᵖ((W^{1,p}_loc)*), ∀2 ≤ p < ∞. The norms in these functional spaces depend only on the L∞ bound of G, the W^{1,∞} bound of the a_{i,j}, the ellipticity constant, and the L∞ norm of m₀.

10.3.4 Statement of the Global A Priori Estimate Result

We can collect all previous results in a theorem that synthesizes the a priori estimates for the system (10.1) and (10.2).
Theorem 15. We make the assumptions of Proposition 11, and m₀ⁱ ∈ L∞; then a weak solution of the system (10.13) and (10.14) in the functional space (10.12) satisfies the regularity properties
$$u\in L^p(W^{2,p}),\ \dot{u}\in L^p(L^p), \qquad m\in C^\alpha\cap L^p(W^{1,p}_{loc}),\ \dot{m}\in L^p((W^{1,p}_{loc})^*), \qquad \forall 2\leq p<+\infty. \tag{10.40}$$
The norms in the various functional spaces depend only on the L∞(L¹) norm of m, the L∞ norm of m₀, the various constants, and the domain.
Corollary 16. Under the assumptions of Proposition 11, with m₀ⁱ ∈ L∞, m₀ⁱ(x) ≥ 0, ∫_O m₀ⁱ(x)dx = 1, a weak solution of the system (10.13) and (10.14) in the functional space (10.12), such that mⁱ(x,t) ≥ 0, satisfies the regularity properties
$$u\in L^p(W^{2,p}),\ \dot{u}\in L^p(L^p), \qquad m\in C^\alpha\cap L^p(W^{1,p}_{loc}),\ \dot{m}\in L^p((W^{1,p}_{loc})^*), \qquad \forall 2\leq p<+\infty. \tag{10.41}$$
The norms in the various functional spaces depend only on the L∞ norm of m₀, the W^{1,∞} bound of the a_{i,j}, the ellipticity constant, the number k₀(1), and the domain.
Proof. We simply note that taking ϕⁱ(x,t) = χⁱ(t) in (10.14) we can write
$$-\int_0^T\dot{\chi}^i(t)\Big(\int_O m^i(x,t)\,dx\Big)dt = \chi^i(0),$$
from which it follows easily that
$$\int_O m^i(x,t)\,dx = 1, \quad \forall t.$$
Since mⁱ(x,t) ≥ 0, m ∈ L∞(L¹), and the theorem applies. □

10.4 Existence Result

We want to prove the following:
Theorem 17. Under the assumptions of Corollary 16, there exists a solution (u,m) of (10.12)–(10.14) such that mⁱ(x,t) ≥ 0 and which satisfies the regularity properties (10.41).

The proof consists of two parts. In the first part we consider an approximation of the problem as follows. Define
$$H_\varepsilon^i(x,q) = \frac{H^i(x,q)}{1+\varepsilon|q|^2}, \qquad G_\varepsilon^i(x,q) = \frac{G^i(x,q)}{1+\varepsilon|q|},$$
$$\beta_\varepsilon(m)(x) = \frac{m(x)}{1+\varepsilon|m(x)|}, \qquad \beta_\varepsilon^+(m)(x) = \frac{m^+(x)}{1+\varepsilon|m(x)|},$$
$$f_{0,\varepsilon}^i(x,m) = f_0^i(x,\beta_\varepsilon(m)), \qquad h_\varepsilon^i(x,m) = h^i(x,\beta_\varepsilon(m)).$$
Note that in βε(m), m can be a vector in Lᵖ(O,Rᴺ), whereas we shall apply βε⁺(m) only to a function in Lᵖ(O). We consider the following approximate problem:
$$u_\varepsilon, m_\varepsilon\in L^2(W^{1,2})\cap L^\infty(L^2), \qquad m_\varepsilon\in C(L^2), \tag{10.42}$$
$$\int_0^T(u_\varepsilon^i,\dot{\varphi}^i)_{L^2}\,dt + \sum_{k,l=1}^n\int_0^T(D_l u_\varepsilon^i, D_k(a_{kl}\varphi^i))_{L^2}\,dt = (h_\varepsilon^i(\cdot,m_\varepsilon(T)),\varphi^i(T))_{L^2} + \int_0^T(H_\varepsilon^i(\cdot,Du_\varepsilon)+f_{0,\varepsilon}^i(\cdot,m_\varepsilon(t)),\varphi^i)_{L^2}\,dt, \quad\forall i=1,\ldots,N, \tag{10.43}$$
$$\int_0^T(m_\varepsilon^i,\dot{\varphi}^i)_{L^2}\,dt + \sum_{k,l=1}^n\int_0^T(D_l\varphi^i, D_k(a_{kl}m_\varepsilon^i))_{L^2}\,dt = (m_0,\varphi^i(0))_{L^2} + \int_0^T(\beta_\varepsilon^+(m_\varepsilon^i)\,G_\varepsilon^i(\cdot,Du_\varepsilon), D\varphi^i)_{L^2}\,dt, \quad\forall i=1,\ldots,N. \tag{10.44}$$

Proposition 18. We make the assumptions of Corollary 16. Suppose we have a solution of (10.42)–(10.44); then mεⁱ(x,t) ≥ 0 and the pair uε, mε remains bounded in the functional spaces (10.41). We can extract a subsequence, still denoted uε, mε, which converges weakly to u, m, and also
$$u_\varepsilon, Du_\varepsilon \to u, Du \ \text{a.e. } x,t, \qquad m_\varepsilon\to m \ \text{a.e. } x,t, \qquad u_\varepsilon(\cdot,T)\to u(\cdot,T), \qquad m_\varepsilon(\cdot,T)\to m(\cdot,T) \ \text{in } L^2,$$
and u, m is a solution of (10.12)–(10.14).



We first note that, since the derivatives u̇εⁱ, ṁεⁱ are defined in Lᵖ(Lᵖ) and Lᵖ((W^{1,p})*), respectively, we can write the weak forms (10.43) and (10.44) as follows:
$$(-\dot{u}_\varepsilon^i,\varphi^i)_{L^2} + \sum_{k,l=1}^n(D_l u_\varepsilon^i, D_k(a_{kl}\varphi^i))_{L^2} = (H_\varepsilon^i(\cdot,Du_\varepsilon)+f_{0,\varepsilon}^i(\cdot,m_\varepsilon(t)),\varphi^i)_{L^2}, \quad \forall\varphi^i\in W^{1,2},\ \forall i=1,\ldots,N, \qquad u_\varepsilon^i(x,T) = h_\varepsilon^i(\cdot,m_\varepsilon(T)), \tag{10.45}$$
and
$$(\dot{m}_\varepsilon^i,\varphi^i)_{L^2} + \sum_{k,l=1}^n(D_l\varphi^i, D_k(a_{kl}m_\varepsilon^i))_{L^2} = (\beta_\varepsilon^+(m_\varepsilon^i)\,G_\varepsilon^i(\cdot,Du_\varepsilon), D\varphi^i)_{L^2}, \quad \forall\varphi^i\in W^{1,2},\ \forall i=1,\ldots,N, \qquad m_\varepsilon^i(x,0) = m_0(x). \tag{10.46}$$

We then prove the positivity property. We take ϕⁱ = (mεⁱ)⁻ in (10.46). The right-hand side vanishes, and we obtain
$$\frac{1}{2}\frac{d}{dt}|(m_\varepsilon^i)^-|^2 + \sum_{k,l=1}^n(D_l(m_\varepsilon^i)^-, D_k(a_{kl}(m_\varepsilon^i)^-))_{L^2} = 0$$
with (mεⁱ)⁻(x,0) = 0. It is easy to check that (mεⁱ)⁻(·,t) = 0, ∀t. Testing with ϕⁱ = 1, we get
$$\int_O m_\varepsilon^i(x,t)\,dx = \int_O m_0(x)\,dx = 1,$$
and thus
$$\|f_{0,\varepsilon}^i(\cdot,m_\varepsilon(t))\|_{L^\infty} \leq k_0(1), \qquad \|h_\varepsilon^i(\cdot,m_\varepsilon(T))\|_{L^\infty} \leq k_0(1).$$

Since Hεⁱ(x,q) satisfies the same assumptions as Hⁱ(x,q) in terms of growth, uniformly in ε, we obtain
$$\|u_\varepsilon\|_{L^\infty(L^\infty)},\ \|Du_\varepsilon\|_{L^p(W^{1,p})},\ \|\dot{u}_\varepsilon\|_{L^p(L^p)} \leq C;$$
hence also
$$\|u_\varepsilon\|_{C^\delta(\bar{O}\times[0,T])} \leq C.$$
Therefore, for a subsequence,
$$u_\varepsilon\rightharpoonup u \ \text{weakly in } L^p(W^{2,p}), \qquad \dot{u}_\varepsilon\rightharpoonup\dot{u} \ \text{weakly in } L^p(L^p), \qquad u_\varepsilon\to u \ \text{in } C^0(\bar{O}\times[0,T]).$$
Since Hεⁱ(·,Duε) + f₀,εⁱ(·,mε(t)) is bounded in L¹(O × (0,T)), we obtain from (10.45)
$$\sum_{k,l=1}^n\int_0^T(D_l u_\varepsilon^i,\ a_{kl}D_k(u_\varepsilon^i - u^i))_{L^2}\,dt \to 0,$$
from which we obtain
$$u_\varepsilon\to u \ \text{in } L^2(W^{1,2}).$$
We can extract a subsequence such that
$$Du_\varepsilon\to Du \ \text{a.e. } x,t,$$
and we know that
$$\|Du_\varepsilon\|_{L^\infty(L^\infty)} \leq C.$$

We next consider the equations for mεⁱ. We can write (10.46) as
$$(\dot{m}_\varepsilon^i,\varphi^i)_{L^2} + \sum_{k,l=1}^n(D_l\varphi^i,\ a_{kl}D_k m_\varepsilon^i)_{L^2} = \sum_{l=1}^n(m_\varepsilon^i\,\tilde{G}_{\varepsilon,l}^i,\ D_l\varphi^i)_{L^2} \tag{10.47}$$
with
$$\tilde{G}_{\varepsilon,l}^i(x) = \frac{G_{\varepsilon,l}^i(x, Du_\varepsilon(x))}{1+\varepsilon m_\varepsilon^i(x)} - \sum_k D_k a_{k,l}(x),$$
and the functions G̃ε,lⁱ are bounded in L∞, uniformly in ε. Therefore,
$$\|m_\varepsilon\|_{L^p(W^{1,p})} \leq C, \qquad \|\dot{m}_\varepsilon\|_{L^p((W^{1,p})^*)} \leq C, \qquad \|m_\varepsilon\|_{L^\infty(L^\infty)} \leq C.$$

We deduce that for a subsequence
$$m_\varepsilon\to m \ \text{a.e. } x,t$$
and
$$m_\varepsilon^i\,\tilde{G}_{\varepsilon,l}^i(x)\to m^i(x)\Big(G_l^i(x, Du) - \sum_k D_k a_{k,l}(x)\Big) \ \text{a.e. } x,t.$$
From (10.47), and with an argument similar to that used for u, we see that
$$m_\varepsilon\to m \ \text{in } L^2(W^{1,2}), \qquad m_\varepsilon(\cdot,t)\to m(\cdot,t) \ \text{in } L^2,\ \forall t.$$
We then obtain
$$h_\varepsilon^i(\cdot,m_\varepsilon(T))\to h^i(\cdot,m(T)) \ \text{in } L^2.$$
Collecting results, we see that u, m is a solution of (10.12)–(10.14) and mⁱ(x,t) ≥ 0. □

To complete the proof of Theorem 17, it remains to prove the existence of a solution of (10.42), (10.45), and (10.46). Now ε is fixed; we omit it for the unknown functions uε, mε. We use the following notation:
$$\Phi^i(u,m)(x,t) = H_\varepsilon^i(x,Du) + f_{0,\varepsilon}^i(x,m), \qquad \Psi^i(u,m)(x,t) = \beta_\varepsilon^+(m^i)\,G_\varepsilon^i(\cdot,Du),$$
and we call hⁱ(x,m) the functional hεⁱ(x,m).
The functionals Φⁱ, Ψⁱ map W^{1,2}(O,Rᴺ) × L²(O,Rᴺ) into L∞(O) and L∞(O,Rⁿ), respectively (they may depend on time). The system (10.45) and (10.46) becomes
becomes
$$(-\dot{u}^i,\varphi^i)_{L^2} + \sum_{k,l=1}^n(D_l u^i, D_k(a_{kl}\varphi^i))_{L^2} = (\Phi^i(u,m),\varphi^i)_{L^2}, \qquad u^i(x,T) = h^i(x,m(T)), \tag{10.48}$$
$$(\dot{m}^i,\varphi^i)_{L^2} + \sum_{k,l=1}^n(D_l\varphi^i,\ a_{kl}D_k m^i)_{L^2} = \sum_{l=1}^n(\Psi_l^i(u,m), D_l\varphi^i)_{L^2}, \qquad m^i(x,0) = m_0(x). \tag{10.49}$$

To solve this system, we use the Galerkin approximation. We consider an orthonormal basis of L²(O) made of functions of W^{1,2}(O), namely ϕ₁, …, ϕᵣ, …. We approximate uⁱ, mⁱ by u^{i,R}, m^{i,R} defined by
$$u^{i,R}(x,t) = \sum_{r=1}^R c_r^i(t)\varphi_r(x), \qquad m^{i,R}(x,t) = \sum_{r=1}^R b_r^i(t)\varphi_r(x).$$
The new unknowns are the functions cᵣⁱ(t), bᵣⁱ(t). They are solutions of a system of forward-backward differential equations:
$$-\frac{dc_r^i}{dt} + \sum_{\rho=1}^R\sum_{k,l=1}^n(D_k(a_{k,l}\varphi_r), D_l\varphi_\rho)_{L^2}\,c_\rho^i = (\Phi^i(u^R,m^R),\varphi_r)_{L^2}, \qquad c_r^i(T) = (h^i(\cdot,m^R(T)),\varphi_r)_{L^2}, \tag{10.50}$$
$$\frac{db_r^i}{dt} + \sum_{\rho=1}^R\sum_{k,l=1}^n(D_l\varphi_r,\ a_{k,l}D_k\varphi_\rho)_{L^2}\,b_\rho^i = \sum_{l=1}^n(\Psi_l^i(u^R,m^R), D_l\varphi_r)_{L^2}, \qquad b_r^i(0) = (m_0,\varphi_r)_{L^2}. \tag{10.51}$$
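To make the reduction (10.50)–(10.51) concrete, here is a minimal sketch for a single scalar forward equation of the type (10.2) on O = (0,1) with Neumann conditions, using the cosine eigenbasis of −Δ; the drift, the initial density, and all numerical parameters are illustrative assumptions, and the forward-backward coupling and the fixed-point argument are omitted.

import numpy as np

# Galerkin sketch for dm/dt - m_xx + (G m)_x = 0 on (0, 1), Neumann,
# with basis phi_0 = 1, phi_r = sqrt(2) cos(r pi x) (orthonormal in L^2).
R, Nt, T = 16, 2000, 0.5
dt = T / Nt
xq = np.linspace(0.0, 1.0, 401)              # quadrature grid

def phi(r, x):
    return np.ones_like(x) if r == 0 else np.sqrt(2)*np.cos(r*np.pi*x)

def dphi(r, x):
    return np.zeros_like(x) if r == 0 else -np.sqrt(2)*r*np.pi*np.sin(r*np.pi*x)

G  = 0.3 * np.ones_like(xq)                  # bounded drift (assumed)
m0 = 1.0 + 0.5*np.cos(np.pi*xq)              # initial density (assumed)

lam = np.array([(r*np.pi)**2 for r in range(R)])   # diagonal stiffness
b = np.array([np.trapz(m0*phi(r, xq), xq) for r in range(R)])

for _ in range(Nt):                          # explicit Euler in time
    m = sum(b[r]*phi(r, xq) for r in range(R))
    # weak drift term (m G, D phi_r)_{L^2}, as on the RHS of (10.51)
    rhs = np.array([np.trapz(m*G*dphi(r, xq), xq) for r in range(R)])
    b = b + dt * (-lam*b + rhs)
print("total mass b_0 (conserved by the scheme):", b[0])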

We begin by checking a priori estimates. We recall that Φⁱ(uᴿ,mᴿ) and Ψₗⁱ(uᴿ,mᴿ) are bounded by an absolute constant. We multiply (10.50) by cᵣⁱ and sum over r. We get
$$-\frac{1}{2}\frac{d}{dt}|u^{i,R}|^2 + \sum_{k,l=1}^n(D_k(a_{k,l}u^{i,R}), D_l u^{i,R})_{L^2} = (\Phi^i(u^R,m^R), u^{i,R})_{L^2}, \qquad u^{i,R}(T) = \sum_{r=1}^R(h^i(\cdot,m^R(T)),\varphi_r)_{L^2}\varphi_r,$$
from which we deduce easily that
$$\|u^R\|_{L^2(W^{1,2})} + \|u^R\|_{L^\infty(L^2)} \leq C. \tag{10.52}$$



Similarly, from (10.51) we can write
$$\frac{1}{2}\frac{d}{dt}|m^{i,R}|^2 + \sum_{k,l=1}^n(a_{k,l}D_k m^{i,R}, D_l m^{i,R})_{L^2} = \sum_{l=1}^n(\Psi_l^i(u^R,m^R), D_l m^{i,R})_{L^2}, \qquad m^{i,R}(0) = \sum_{r=1}^R(m_0,\varphi_r)_{L^2}\varphi_r,$$
and also
$$\|m^R\|_{L^2(W^{1,2})} + \|m^R\|_{L^\infty(L^2)} \leq C. \tag{10.53}$$

It follows in particular from these estimates that the functions cᵣⁱ(t), bᵣⁱ(t) are bounded by an absolute constant. From the differential equations we see that the derivatives are also bounded, however by a constant which this time depends on R. These a priori estimates are sufficient to prove that the system (10.50) and (10.51) has a solution. Indeed, one uses a fixed-point argument. Here R is fixed. Given uᴿ, mᴿ on the right-hand side of (10.50) and (10.51), we can solve (10.50) and (10.51) independently. They are linear differential equations. One is forward, the other backward, but they are uncoupled. In this way, one defines a map from a compact subset of (C[0,T])ᴿ × (C[0,T])ᴿ into itself. The compactness is provided by the Arzelà–Ascoli theorem. The map is continuous, so it has a fixed point, which is a solution of (10.50) and (10.51). The estimates (10.52) and (10.53) hold.
We also have the estimates
$$\frac{1}{h}\int_0^{T-h}\!\!\int_O|u^{i,R}(x,t+h) - u^{i,R}(x,t)|^2\,dx\,dt \leq C, \qquad \frac{1}{h}\int_0^{T-h}\!\!\int_O|m^{i,R}(x,t+h) - m^{i,R}(x,t)|^2\,dx\,dt \leq C. \tag{10.54}$$

Proof of (10.54). This is a classical property. We check only the first one. We first write
$$-\frac{d}{dt}u^{i,R} + \sum_{r=1}^R\sum_{k,l=1}^n(a_{k,l}D_k\varphi_r, D_l u^{i,R})_{L^2}\varphi_r = \sum_{r=1}^R\Big(\Phi^i(u^R,m^R) - \sum_{k,l=1}^n D_k a_{k,l}\,D_l u^{i,R},\ \varphi_r\Big)\varphi_r = g^{i,R}, \tag{10.55}$$
and gᴿ is bounded in L²(L²). It follows that
$$-\frac{1}{h}\big(u^{i,R}(t+h) - u^{i,R}(t),\ u^{i,R}(t)\big)_{L^2} + \sum_{k,l=1}^n\Big(a_{k,l}D_k u^{i,R}(t),\ D_l\frac{1}{h}\int_t^{t+h}u^{i,R}(s)\,ds\Big)_{L^2} = \Big(\frac{1}{h}\int_t^{t+h}g^{i,R}(s)\,ds,\ u^{i,R}(t)\Big)_{L^2}.$$
From previous estimates, we obtain
$$\frac{1}{h}\int_0^{T-h}\big(u^{i,R}(t),\ u^{i,R}(t) - u^{i,R}(t+h)\big)_{L^2}\,dt \leq C,$$
from which the estimate (10.54) follows easily. □


To proceed, we need to work with a particular basis of L2 (O).We take ϕr to be
the eigenvectors of the operator −Δ + I on the domain O with Neumann boundary
conditions. So ϕr satisfies

((ϕr , v))W 1,2 (O) = λr (ϕr , v)L2 (O) , ∀v ∈ W 1,2 (O),

where λr is the eigenvalue corresponding to ϕr . Define now AR over W 1,2 (O) by


writing

R n
AR z = ∑ ∑ (ak,l Dk ϕr , Dl z)L2 ϕr
r=1 k,l=1

then AR maps W 1,2 (O) into the subspace of L2 (O) generated by ϕr , r = 1, . . . , R,


with the property

||AR ||L(W 1,2 ;(W 1,2 )∗ ) ≤ C.

This is because √ϕr is an orthonormal basis of W 1,2 (O) and


λr
 
 R 
 
 ∑ (ϕr , v)ϕr  ≤ ||v||W 1,2 .
r=1 
W 1,2

The differential equation (10.55) writes


d i,R
− u + ARui,R = gi,R (10.56)
dt
and thus we get
   
 d i,R   d i,R 
 u    
 dt  2 1,2 ∗ ≤ C,  dt m  2 1,2 ∗ ≤ C (10.57)
L ((W ) ) L ((W ) )

the second property being proved similarly. We have also

||ui,R (T )||W 1,2 ≤ C. (10.58)



Thanks to the estimates (10.52), (10.54), and (10.58), we can extract a sequence
such that

ui,R → ui in L2 (L2 )
ui,R  ui in L2 (W 1,2 ) weakly
d i,R d
u  ui in L2 ((W 1,2 )∗ ) weakly
dt dt
ui,R (T ) → ui (T ) in L2 . (10.59)

Testing (10.56) with ui,R − ui , we see easily that

ui,R → ui in L2 (W 1,2 ). (10.60)

From (10.53) and (10.54) we can assert that, for a subsequence,

mi,R → mi in L2 (L2 ). (10.61)

Therefore, for a subsequence,

Dui,R → Dui
mi,R → mi , a.e.

and then

Φi (uR , mR ) → Φi (u, m), a.e.


Ψi (uR , mR ) → Ψi (u, m), a.e.

By a reasoning similar to that done for ui,R we can prove that

mi,R → mi in L2 (W 1,2 ). (10.62)

With these convergence results, one can pass to the limit in the Galerkin approxi-
mation and check that u, m is a solution of (10.45) and (10.46). 
References

1. Andersson, D., Djehiche, B. (2011). A maximum principle for SDEs of mean field type.
Applied Mathematics and Optimization, 63, 341–356.
2. Bardi, M. (2012). Explicit solutions of some linear-quadratic mean field games. Networks and Heterogeneous Media, 7(2), 243–261.
3. Bensoussan, A., Frehse, J. (1990). Cα regularity results for quasi-linear parabolic systems.
Commentationes Mathematicae Universitatis Carolinae, 31, 453–474.
4. Bensoussan, A., Frehse, J. (1995). Ergodic Bellman systems for stochastic games in arbitrary
dimension. Proceedings of the Royal Society, London, Mathematical and Physical sciences A,
449, 65–67.
5. Bensoussan, A., Frehse, J. (2002). Regularity results for nonlinear elliptic systems and
applications. Applied mathematical sciences, vol. 151. Springer, Berlin.
6. Bensoussan, A., Frehse, J. (2002). Smooth solutions of systems of quasilinear parabolic
equations. ESAIM: Control, Optimization and Calculus of Variations, 8, 169–193.
7. Bensoussan, A., Frehse, J. (2013). Control and Nash games with mean field effect. Chinese
Annals of Mathematics, 34B, 161–192.
8. Bensoussan, A., Frehse, J., Vogelgesang, J. (2010). Systems of Bellman equations to stochastic
differential games with noncompact coupling, discrete and continuous dynamical systems.
Series A, 274, 1375–1390.
9. Bensoussan, A., Frehse, J., Vogelgesang, J. (2012). Nash and stackelberg differential games.
Chinese Annals of Mathematics, Series B, 33(3), 317–332.
10. Bensoussan, A., Sung, K.C.J., Yam, S.C.P., Yung, S.P. (2011). Linear-quadratic mean field
games. Technical report.
11. Björk, T., Murgoci, A. (2010). A general theory of Markovian time inconsistent stochastic
control problems. SSRN: 1694759.
12. Buckdahn, R., Djehiche, B., Li, J. (2011). A general stochastic maximum principle for SDEs
of mean-field type. Applied Mathematics and Optimization, 64, 197–216.
13. Buckdahn, R., Li, J., Peng, SG. (2009). Mean-field backward stochastic differential equations
and related partial differential equations. Stochastic Processes and their Applications, 119,
3133–3154.
14. Cardaliaguet, P. (2010). Notes on mean field games. Technical report.
15. Carmona, R., Delarue, F. (2012). Probabilistic analysis of mean-field games. http://arxiv.org/
abs/1210.5780.
16. Garnier, J., Papanicolaou, G., Yang, T. W. (2013). Large deviations for a mean field model of
systemic risk. http://arxiv.org/abs/1204.3536.

A. Bensoussan et al., Mean Field Games and Mean Field Type Control Theory, 125
SpringerBriefs in Mathematics, DOI 10.1007/978-1-4614-8508-7,
© Alain Bensoussan, Jens Frehse, Phillip Yam 2013
126 References

17. Guéant, O., Lasry, J.-M., Lions, P.-L. (2011). Mean field games and applications. In:
A.R. Carmona, et al. (Eds.), Paris-Princeton Lectures on Mathematical Sciences 2010
(pp. 205–266).
18. Hu, Y., Jin, H. Q., Zhou, X. Y. (2011). Time-inconsistent stochastic linear-quadratic control.
http://arxiv.org/abs/1111.0818.
19. Huang, M. (2010). Large-population LQG games involving a major player: the nash certainty
equivalence principle. SIAM Journal on Control and Optimization, 48(5), 3318–3353.
20. Huang, M., Malhamé, R. P., Caines, P. E. (2006). Large population stochastic dynamic
games: closed-loop MCKean-Vlasov systems and the nash certainty equivalence principle.
Communications in Information and Systems, 6(3), 221–252.
21. Huang, M., Caines, P. E., Malhamé, R. P. (2007). Large-population cost-coupled LQG prob-
lems with nonuniform agents: individual-mass behavior and decentralized ε −nash equilibria.
IEEE Transactions on Automatic Control, 52(9), 1560–1571.
22. Huang, M., Caines, P. E., Malhamé, R. P. (2007). An invariance principle in large population
stochastic dynamic games. Journal of Systems Science and Complexity, 20(2), 162–172.
23. Kolokoltsov, V. N., Troeva, M., Yang, W. (2012). On the rate of convergence for the mean-field
approximation of controlled diffusions with a large number of players. Working paper.
24. Kolokoltsov, V. N., Yang, W. (2013). Sensitivity analysis for HJB equations with an application
to a coupled backward-forward system. Working paper.
25. Lasry, J.-M., Lions, P.-L. (2006). Jeux à champ moyen I- Le cas stationnaire. Comptes Rendus
de l’Académie des Sciences, Series I, 343, 619–625.
26. Lasry, J.-M., Lions, P.-L. (2006). Jeux à champ moyen II- Horizn fini et contrôle optimal,
Comptes Rendus de l’Académie des Sciences, Series I, 343, 679–684.
27. Lasry, J.-M., Lions, P.-L. (2007). Mean field games. Japanese Journal of Mathematics, 2(1),
229–260.
28. Ladyzhenskaya , O., Solonnikov, V., Uraltseva, N. (1968). Linear and Quasi-linear Equations
of Parabolic Type. In: Translations of Mathematical Monographs, vol. 23. American Mathe-
matical Society, Providence.
29. McKean, H. P. (1966). A class of Markov processes associated with nonlinear parabolic
equations. Proceedings of the National Academy of Sciences USA, 56, 1907–1911.
30. Meyer-Brandis, T., Øksendal, B., Zhou, X. Z. (2012). A mean-field stochastic maximum
principle via Malliavin calculus. Stochastics (A Special Issue for Mark Davis’ Festschrift),
84, 643–666.
31. Nourian, M., Caines, P. E. (2012). ε-nash mean field game theory for nonlinear stochastic
dynamical systems with major and minor agents. SIAM Journal on Control and Optimization
(submitted).
32. Peng, S. (1992). Stochastic Hamilton-Jacobi-Bellman equations. SIAM Journal on Control and
Optimization, 30(2), 284–304.
33. Schlag, W. (1994). Schauder and L p −estimates for Parabolic systems via Campanato spaces.
Comm. P.D.E., 21(7–8), 1141–1175.