
De Gruyter Graduate

Hans Föllmer
Alexander Schied

Stochastic Finance
An Introduction in Discrete Time

Third revised and extended edition

De Gruyter

Mathematics Subject Classification 2010: Primary: 60-01, 91-01, 91-02; Secondary: 46N10, 60E15,
60G40, 60G42, 91B08, 91B16, 91B30, 91B50, 91B52, 91B70, 91G10, 91G20, 91G80, 91G99.

The first and second edition of Stochastic Finance were published
in the series De Gruyter Studies in Mathematics.

ISBN 978-3-11-021804-6
e-ISBN 978-3-11-021805-3
Library of Congress Cataloging-in-Publication Data
Föllmer, Hans.
Stochastic finance : an introduction in discrete time / by Hans
Föllmer, Alexander Schied. – 3rd, rev. and extended ed.
p. cm.
Includes bibliographical references and index.
ISBN 978-3-11-021804-6 (alk. paper)
1. Finance – Statistical methods.
2. Stochastic analysis.
3. Probabilities. I. Schied, Alexander.
II. Title.
HG176.5.F65 2011
332.01'519232–dc22
2010045896

Bibliographic information published by the Deutsche Nationalbibliothek


The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie;
detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.
© 2011 Walter de Gruyter GmbH & Co. KG, Berlin/New York
Typesetting: Da-TeX Gerd Blumenstein, Leipzig, www.da-tex.de
Printing and binding: Hubert & Co. GmbH & Co. KG, Göttingen
Printed on acid-free paper
Printed in Germany
www.degruyter.com

Preface to the third edition

This third edition of our book appears in the de Gruyter graduate textbook series. We
have therefore included more than one hundred exercises. Typically, we have used the
book as an introductory text for two major areas, either combined into one course or
in two separate courses. The first area comprises static and dynamic arbitrage theory
in discrete time. The corresponding core material is provided in Chapters 1, 5, and 6.
The second area deals with mathematical aspects of financial risk as developed in
Chapters 2, 4, and 11. Most of the exercises we have included in this edition are
therefore contained in these core chapters. The other chapters of this book can be
used both as complementary material for the introductory courses and as basis for
special-topics courses.
In recent years, there has been an increasing awareness, both among practitioners
and in academia, of the problem of model uncertainty in finance and economics, often
called Knightian uncertainty; see, e.g., [259]. In this third edition we have put more
emphasis on this issue. The theory of risk measures can be seen as a case study of how
to deal with model uncertainty in mathematical terms. We have therefore updated
Chapter 4 on static risk measures and added the new Chapter 11 on dynamic risk
measures. Moreover, in Section 2.5 we have extended the characterization of robust
preferences in terms of risk measures from the coherent to the convex case. We have
also included the new Sections 3.5 and 8.3 on robust variants of the classical problems
of optimal portfolio choice and efficient hedging.
It is a pleasure to express our thanks to all students and colleagues whose comments
have helped us to prepare this third edition, in particular to Aurélien Alfonsi, Günter
Baigger, Francesca Biagini, Julia Brettschneider, Patrick Cheridito, Samuel Drapeau,
Maren Eckhoff, Karl-Theodor Eisele, Damir Filipović, Zicheng Hong, Kostas Kardaras,
Thomas Knispel, Gesine Koch, Heinz König, Volker Krätschmer, Christoph
Kühn, Michael Kupper, Mourad Lazgham, Sven Lickfeld, Mareike Massow, Irina
Penner, Ernst Presman, Michael Scheutzow, Melvin Sim, Alla Slynko, Stephan Sturm,
Gregor Svindland, Long Teng, Florian Werner, Wiebke Wittmüß, and Lei Wu. Special
thanks are due to Yuliya Mishura and Georgiy Shevchenko, our translators for the
Russian edition.
Berlin and Mannheim, November 2010

Hans Föllmer
Alexander Schied

Preface to the second edition

Since the publication of the first edition we have used it as the basis for several
courses. These include courses for a whole semester on Mathematical Finance in
Berlin and also short courses on special topics such as risk measures given at the
Institut Henri Poincaré in Paris, at the Department of Operations Research at Cornell
University, at the Academia Sinica in Taipei, and at the 8th Symposium on Probability
and Stochastic Processes in Puebla. In the process we have made a large number of
minor corrections, we have discovered many opportunities for simplification and clarification, and we have also learned more about several topics. As a result, major parts
of this book have been improved or even entirely rewritten. Among them are those on
robust representations of risk measures, arbitrage-free pricing of contingent claims,
exotic derivatives in the CRR model, convergence to the Black–Scholes model, and
stability under pasting with its connections to dynamically consistent coherent risk
measures. In addition, this second edition contains several new sections, including a
systematic discussion of law-invariant risk measures, of concave distortions, and of
the relations between risk measures and Choquet integration.
It is a pleasure to express our thanks to all students and colleagues whose comments
have helped us to prepare this second edition, in particular to Dirk Becherer, Hans
Bühler, Rose-Anne Dana, Ulrich Horst, Mesrop Janunts, Christoph Kühn, Maren
Liese, Harald Luschgy, Holger Pint, Philip Protter, Lothar Rogge, Stephan Sturm,
Stefan Weber, Wiebke Wittmüß, and Ching-Tang Wu. Special thanks are due to Peter
Bank and to Yuliya Mishura and Georgiy Shevchenko, our translators for the Russian edition. Finally, we thank Irene Zimmermann and Manfred Karbe of de Gruyter
Verlag for urging us to write a second edition and for their efficient support.
Berlin, September 2004

Hans Föllmer
Alexander Schied

Preface to the first edition

This book is an introduction to probabilistic methods in Finance. It is intended for
graduate students in mathematics, and it may also be useful for mathematicians in
academia and in the financial industry. Our focus is on stochastic models in discrete
time. This limitation has two immediate benefits. First, the probabilistic machinery
is simpler, and we can discuss right away some of the key problems in the theory
of pricing and hedging of financial derivatives. Second, the paradigm of a complete
financial market, where all derivatives admit a perfect hedge, becomes the exception
rather than the rule. Thus, the discrete-time setting provides a shortcut to some of the
more recent literature on incomplete financial market models.
As a textbook for mathematicians, it is an introduction at an intermediate level, with
special emphasis on martingale methods. Since it does not use the continuous-time
methods of Itô calculus, it needs less preparation than more advanced texts such as
[99], [98], [107], [171], [252]. On the other hand, it is technically more demanding
than textbooks such as [215]: We work on general probability spaces, and so the text
captures the interplay between probability theory and functional analysis which has
been crucial for some of the recent advances in mathematical finance.
The book is based on our notes for first courses in Mathematical Finance which
both of us are teaching in Berlin at Humboldt University and at Technical University.
These courses are designed for students in mathematics with some background in
probability. Sometimes, they are given in parallel to a systematic course on stochastic
processes. At other times, martingale methods in discrete time are developed in the
course, as they are in this book. Usually the course is followed by a second course on
Mathematical Finance in continuous time. There it turns out to be useful that students
are already familiar with some of the key ideas of Mathematical Finance.
The core of this book is the dynamic arbitrage theory in the first chapters of Part II.
When teaching a course, we found it useful to explain some of the main arguments
in the more transparent one-period model before using them in the dynamical setting.
So one approach would be to start immediately in the multi-period framework of
Chapter 5, and to go back to selected sections of Part I as the need arises. As an
alternative, one could first focus on the one-period model, and then move on to Part II.
We include in Chapter 2 a brief introduction to the mathematical theory of expected
utility, even though this is a classical topic, and there is no shortage of excellent expositions; see, for instance, [187] which happens to be our favorite. We have three
reasons for including this chapter. Our focus in this book is on incompleteness, and
incompleteness involves, in one form or another, preferences in the face of risk and
uncertainty. We feel that mathematicians working in this area should be aware, at
least to some extent, of the long line of thought which leads from Daniel Bernoulli via
von Neumann–Morgenstern and Savage to some more recent developments which are
motivated by shortcomings of the classical paradigm. This is our first reason. Second,
the analysis of risk measures has emerged as a major topic in mathematical finance,
and this is closely related to a robust version of the Savage theory. Third, but not least,
our experience is that this part of the course was found particularly enjoyable, both by
the students and by ourselves.
We acknowledge our debt and express our thanks to all colleagues who have contributed, directly or indirectly, through their publications and through informal discussions, to our understanding of the topics discussed in this book. Ideas and methods
developed by Freddy Delbaen, Darrell Duffie, Nicole El Karoui, David Heath, Yuri
Kabanov, Ioannis Karatzas, Dimitri Kramkov, David Kreps, Stanley Pliska, Chris
Rogers, Steve Ross, Walter Schachermayer, Martin Schweizer, Dieter Sondermann
and Christophe Stricker play a key role in our exposition. We are obliged to many
others; for instance the textbooks [73], [99], [98], [155], and [192] were a great help
when we started to teach courses on the subject.
We are grateful to all those who read parts of the manuscript and made useful suggestions, in particular to Dirk Becherer, Ulrich Horst, Steffen Krüger, Irina Penner,
and to Alexander Giese who designed some of the figures. Special thanks are due
to Peter Bank for a large number of constructive comments. We also express our
thanks to Erhan Çinlar, Adam Monahan, and Philip Protter for improving some of
the language, and to the Department of Operations Research and Financial Engineering at Princeton University for its hospitality during the weeks when we finished the
manuscript.
Berlin, June 2002

Hans Föllmer
Alexander Schied

Contents

Preface to the third edition
Preface to the second edition
Preface to the first edition

I Mathematical finance in one period

1 Arbitrage theory
1.1 Assets, portfolios, and arbitrage opportunities
1.2 Absence of arbitrage and martingale measures
1.3 Derivative securities
1.4 Complete market models
1.5 Geometric characterization of arbitrage-free models
1.6 Contingent initial data

2 Preferences
2.1 Preference relations and their numerical representation
2.2 Von Neumann–Morgenstern representation
2.3 Expected utility
2.4 Uniform preferences
2.5 Robust preferences on asset profiles
2.6 Probability measures with given marginals

3 Optimality and equilibrium
3.1 Portfolio optimization and the absence of arbitrage
3.2 Exponential utility and relative entropy
3.3 Optimal contingent claims
3.4 Optimal payoff profiles for uniform preferences
3.5 Robust utility maximization
3.6 Microeconomic equilibrium

4 Monetary measures of risk
4.1 Risk measures and their acceptance sets
4.2 Robust representation of convex risk measures
4.3 Convex risk measures on $L^\infty$
4.4 Value at Risk
4.5 Law-invariant risk measures
4.6 Concave distortions
4.7 Comonotonic risk measures
4.8 Measures of risk in a financial market
4.9 Utility-based shortfall risk and divergence risk measures

II Dynamic hedging

5 Dynamic arbitrage theory
5.1 The multi-period market model
5.2 Arbitrage opportunities and martingale measures
5.3 European contingent claims
5.4 Complete markets
5.5 The binomial model
5.6 Exotic derivatives
5.7 Convergence to the Black–Scholes price

6 American contingent claims
6.1 Hedging strategies for the seller
6.2 Stopping strategies for the buyer
6.3 Arbitrage-free prices
6.4 Stability under pasting
6.5 Lower and upper Snell envelopes

7 Superhedging
7.1 $\mathcal{P}$-supermartingales
7.2 Uniform Doob decomposition
7.3 Superhedging of American and European claims
7.4 Superhedging with liquid options

8 Efficient hedging
8.1 Quantile hedging
8.2 Hedging with minimal shortfall risk
8.3 Efficient hedging with convex risk measures

9 Hedging under constraints
9.1 Absence of arbitrage opportunities
9.2 Uniform Doob decomposition
9.3 Upper Snell envelopes
9.4 Superhedging and risk measures

10 Minimizing the hedging error
10.1 Local quadratic risk
10.2 Minimal martingale measures
10.3 Variance-optimal hedging

11 Dynamic risk measures
11.1 Conditional risk measures and their robust representation
11.2 Time consistency

Appendix
A.1 Convexity
A.2 Absolutely continuous probability measures
A.3 Quantile functions
A.4 The Neyman–Pearson lemma
A.5 The essential supremum of a family of random variables
A.6 Spaces of measures
A.7 Some functional analysis

Notes

Bibliography

List of symbols

Index

Part I

Mathematical finance in one period

Chapter 1

Arbitrage theory

In this chapter, we study the mathematical structure of a simple one-period model of
a financial market. We consider a finite number of assets. Their initial prices at time
$t = 0$ are known, their future prices at time $t = 1$ are described as random variables
on some probability space. Trading takes place at time $t = 0$. Already in this simple
model, some basic principles of mathematical finance appear very clearly. In Section 1.2, we single out those models which satisfy a condition of market efficiency:
There are no trading opportunities which yield a profit without any downside risk.
The absence of such arbitrage opportunities is characterized by the existence of an
equivalent martingale measure. Under such a measure, discounted prices have the
martingale property, that is, trading in the assets is the same as playing a fair game.
As explained in Section 1.3, any equivalent martingale measure can be identified with
a pricing rule: It extends the given prices of the primary assets to a larger space of
contingent claims, or financial derivatives, without creating new arbitrage opportunities. In general, there will be several such extensions. A given contingent claim has
a unique price if and only if it admits a perfect hedge. In our one-period model, this
will be the exception rather than the rule. Thus, we are facing market incompleteness,
unless our model satisfies the very restrictive conditions discussed in Section 1.4. The
geometric structure of an arbitrage-free model is described in Section 1.5.
The one-period market model will be used throughout the first part of this book.
On the one hand, its structure is rich enough to illustrate some of the key ideas of the
field. On the other hand, it will provide an introduction to some of the mathematical methods which will be used in the dynamic hedging theory of the second part.
In fact, the multi-period situation considered in Chapter 5 can be regarded as a sequence of one-period models whose initial conditions are contingent on the outcomes
of previous periods. The techniques for dealing with such contingent initial data are
introduced in Section 1.6.

1.1 Assets, portfolios, and arbitrage opportunities

Consider a financial market with $d + 1$ assets. The assets can consist, for instance, of equities, bonds, commodities, or currencies. In a simple one-period model, these assets are priced at the initial time $t = 0$ and at the final time $t = 1$. We assume that the $i$th asset is available at time 0 for a price $\pi^i \ge 0$. The collection
\[ \bar\pi = (\pi^0, \pi^1, \dots, \pi^d) \in \mathbb{R}^{d+1}_+ \]
is called a price system. Prices at time 1 are usually not known beforehand at time 0. In order to model this uncertainty, we fix a measurable space $(\Omega, \mathcal{F})$ and describe the asset prices at time 1 as non-negative measurable functions
\[ S^0, S^1, \dots, S^d \]
on $(\Omega, \mathcal{F})$ with values in $[0, \infty)$. Every $\omega \in \Omega$ corresponds to a particular scenario of market evolution, and $S^i(\omega)$ is the price of the $i$th asset at time 1 if the scenario $\omega$ occurs.
However, not all asset prices in a market are necessarily uncertain. Usually there is a riskless bond which will pay a sure amount at time 1. In our simple model for one period, such a riskless investment opportunity will be included by assuming that
\[ \pi^0 = 1 \quad\text{and}\quad S^0 \equiv 1 + r \]
for a constant $r$, the return of a unit investment into the riskless bond. In most situations it would be natural to assume $r \ge 0$, but for our purposes it is enough to require that $S^0 > 0$, or equivalently that
\[ r > -1. \]
In order to distinguish $S^0$ from the risky assets $S^1, \dots, S^d$, it will be convenient to use the notation
\[ \bar S = (S^0, S^1, \dots, S^d) = (S^0, S), \]
and in the same way we will write $\bar\pi = (1, \pi)$.
At time $t = 0$, an investor will choose a portfolio
\[ \bar\xi = (\xi^0, \xi) = (\xi^0, \xi^1, \dots, \xi^d) \in \mathbb{R}^{d+1}, \]
where $\xi^i$ represents the number of shares of the $i$th asset. The price for buying the portfolio $\bar\xi$ equals
\[ \bar\pi \cdot \bar\xi = \sum_{i=0}^{d} \pi^i \xi^i. \]
At time $t = 1$, the portfolio will have the value
\[ \bar\xi \cdot \bar S(\omega) = \sum_{i=0}^{d} \xi^i S^i(\omega) = \xi^0 (1 + r) + \xi \cdot S(\omega), \]
depending on the scenario $\omega \in \Omega$. Here we assume implicitly that buying and selling assets does not create extra costs, an assumption which may not be valid for a small investor but which becomes more realistic for a large financial institution. Note our convention of writing $x \cdot y$ for the inner product of two vectors $x$ and $y$ in Euclidean space.
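The two formulas for the price and the terminal value of a portfolio can be checked on a small numerical case. The following is only a minimal sketch: the two scenarios, the prices, and the interest rate are hypothetical values chosen for illustration, and the helper `dot` is not part of the text.

```python
from fractions import Fraction

def dot(x, y):
    """Inner product x . y of two vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))

# Hypothetical market (illustration only): d = 2 risky assets, r = 5 %.
r = Fraction(1, 20)
pi_bar = [Fraction(1), Fraction(10), Fraction(4)]   # price system with pi^0 = 1
S_bar = {                                           # scenario -> (S^0, S^1, S^2)
    "up":   [1 + r, Fraction(12), Fraction(3)],
    "down": [1 + r, Fraction(9), Fraction(5)],
}
# Borrow 6 units, buy one share of asset 1, sell one share of asset 2 short.
xi_bar = [Fraction(-6), Fraction(1), Fraction(-1)]

cost = dot(pi_bar, xi_bar)                               # price at time 0
payoff = {w: dot(xi_bar, s) for w, s in S_bar.items()}   # value at time 1

# The same value via the decomposition xi^0 (1 + r) + xi . S(omega).
payoff_split = {w: xi_bar[0] * (1 + r) + dot(xi_bar[1:], s[1:])
                for w, s in S_bar.items()}

print(cost, payoff, payoff_split)
```

Exact rational arithmetic is used so that the two ways of computing the terminal value agree without rounding error.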

Our definition of a portfolio allows the components $\xi^i$ to be negative. If $\xi^0 < 0$, this corresponds to taking out a loan such that we receive the amount $|\xi^0|$ at $t = 0$ and pay back the amount $(1 + r)|\xi^0|$ at time $t = 1$. If $\xi^i < 0$ for $i \ge 1$, a quantity of $|\xi^i|$ shares of the $i$th asset is sold without actually owning them. This corresponds to a short sale of the asset. In particular, an investor is allowed to take a short position $\xi^i < 0$, and to use up the received amount $\pi^i |\xi^i|$ for buying quantities $\xi^j \ge 0$, $j \ne i$, of the other assets. In this case, the price of the portfolio $\bar\xi = (\xi^0, \xi)$ is given by $\bar\xi \cdot \bar\pi = 0$.
Remark 1.1. So far we have not assumed that anything is known about probabilities that might govern the realization of the various scenarios $\omega \in \Omega$. Such a situation is often referred to as Knightian uncertainty, in honor of F. Knight [176], who introduced the distinction between risk which refers to an economic situation in which the probabilistic structure is assumed to be known, and uncertainty where no such assumption is made. ♦
Let us now assume that a probability measure $P$ is given on $(\Omega, \mathcal{F})$. The asset prices $S^1, \dots, S^d$ and the portfolio values $\bar\xi \cdot \bar S$ can thus be regarded as random variables on $(\Omega, \mathcal{F}, P)$.
Definition 1.2. A portfolio $\bar\xi \in \mathbb{R}^{d+1}$ is called an arbitrage opportunity if $\bar\pi \cdot \bar\xi \le 0$ but $\bar\xi \cdot \bar S \ge 0$ $P$-a.s. and $P[\, \bar\xi \cdot \bar S > 0 \,] > 0$.
Intuitively, an arbitrage opportunity is an investment strategy that yields with positive probability a positive profit and is not exposed to any downside risk. The existence
of such an arbitrage opportunity may be regarded as a market inefficiency in the sense
that certain assets are not priced in a reasonable way. In real-world markets, arbitrage
opportunities are rather hard to find. If such an opportunity would show up, it would
generate a large demand, prices would adjust, and the opportunity would disappear.
Later on, the absence of such arbitrage opportunities will be our key assumption. Absence of arbitrage implies that $S^i$ vanishes $P$-a.s. once $\pi^i = 0$. Hence, there is no loss of generality if we assume from now on that
\[ \pi^i > 0 \quad\text{for } i = 1, \dots, d. \]
Remark 1.3. Note that the probability measure $P$ enters the definition of an arbitrage opportunity only through the null sets of $P$. In particular, the definition can be formulated without any explicit use of probabilities if $\Omega$ is countable. In this case, we can simply apply Definition 1.2 with an arbitrary probability measure $P$ such that $P[\{\omega\}] > 0$ for every $\omega \in \Omega$. Then an arbitrage opportunity is a portfolio $\bar\xi$ with $\bar\pi \cdot \bar\xi \le 0$, with $\bar\xi \cdot \bar S(\omega) \ge 0$ for all $\omega \in \Omega$, and such that $\bar\xi \cdot \bar S(\omega_0) > 0$ for at least one $\omega_0 \in \Omega$. ♦
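On a finite scenario set in which every scenario has positive probability, the probability-free form of the definition given in Remark 1.3 can be tested directly. The sketch below assumes such a finite set; the function name `is_arbitrage` and the toy market are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def dot(x, y):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def is_arbitrage(xi_bar, pi_bar, scenarios):
    """Check the three conditions of an arbitrage opportunity on a finite
    scenario set, assuming each scenario has strictly positive probability.

    scenarios: list of price vectors (S^0(omega), ..., S^d(omega)).
    """
    payoffs = [dot(xi_bar, s) for s in scenarios]
    return (dot(pi_bar, xi_bar) <= 0            # costs nothing (or less)
            and all(v >= 0 for v in payoffs)    # no downside risk
            and any(v > 0 for v in payoffs))    # profit in some scenario

# Hypothetical market with r = 0: the risky asset never ends below the bond.
pi_bar = [Fraction(1), Fraction(1)]
scenarios = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(2)]]

print(is_arbitrage([Fraction(-1), Fraction(1)], pi_bar, scenarios))  # borrow 1, buy the asset
print(is_arbitrage([Fraction(1), Fraction(-1)], pi_bar, scenarios))  # the reverse position
```

The first portfolio costs nothing, never loses, and gains in the second scenario; the reverse position loses in that scenario and therefore fails the test.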

The following lemma shows that absence of arbitrage is equivalent to the following property of the market: Any investment in risky assets which yields with positive probability a better result than investing the same amount in the risk-free asset must be exposed to some downside risk.
Lemma 1.4. The following statements are equivalent.
(a) The market model admits an arbitrage opportunity.
(b) There is a vector $\xi \in \mathbb{R}^d$ such that
\[ \xi \cdot S \ge (1 + r)\, \xi \cdot \pi \quad P\text{-a.s.} \quad\text{and}\quad P[\, \xi \cdot S > (1 + r)\, \xi \cdot \pi \,] > 0. \]
Proof. To see that (a) implies (b), let $\bar\xi$ be an arbitrage opportunity. Then $0 \ge \bar\xi \cdot \bar\pi = \xi^0 + \xi \cdot \pi$. Hence,
\[ \xi \cdot S - (1 + r)\, \xi \cdot \pi \ge \xi \cdot S + (1 + r)\, \xi^0 = \bar\xi \cdot \bar S. \]
Since $\bar\xi \cdot \bar S$ is $P$-a.s. non-negative and strictly positive with non-vanishing probability, the same must be true of $\xi \cdot S - (1 + r)\, \xi \cdot \pi$.
Next let $\xi$ be as in (b). We claim that the portfolio $(\xi^0, \xi)$ with $\xi^0 := -\xi \cdot \pi$ is an arbitrage opportunity. Indeed, $\bar\xi \cdot \bar\pi = \xi^0 + \xi \cdot \pi = 0$ by definition. Moreover, $\bar\xi \cdot \bar S = -(1 + r)\, \xi \cdot \pi + \xi \cdot S$, which is $P$-a.s. non-negative and strictly positive with non-vanishing probability.
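The second half of the proof is constructive: a vector of risky positions satisfying (b) is completed by a bond position that finances it. A minimal sketch of this construction, with a hypothetical one-asset market and the illustrative helper name `extend_to_portfolio`, reads as follows.

```python
from fractions import Fraction

def dot(x, y):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))

def extend_to_portfolio(xi, pi):
    """Prepend the bond position xi^0 := -xi . pi, so the portfolio costs 0."""
    return [-dot(xi, pi)] + list(xi)

# Hypothetical data: r = 10 %, one risky asset priced at 10,
# worth 11 or 12 at time 1, and the risky position xi = (1).
r = Fraction(1, 10)
pi = [Fraction(10)]
S = [[Fraction(11)], [Fraction(12)]]
xi = [Fraction(1)]

# Left side minus right side of condition (b), scenario by scenario.
excess = [dot(xi, s) - (1 + r) * dot(xi, pi) for s in S]

xi_bar = extend_to_portfolio(xi, pi)
cost = dot([Fraction(1)] + pi, xi_bar)            # price of the full portfolio
payoffs = [dot(xi_bar, [1 + r] + s) for s in S]   # its value at time 1

print(excess, xi_bar, cost, payoffs)
```

In this example the terminal values of the completed portfolio coincide with the excess of the risky payoff over the riskless return, as in the last step of the proof.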
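Exercise 1.1.1. On $\Omega = \{\omega_1, \omega_2, \omega_3\}$ we fix a probability measure $P$ with $P[\{\omega_i\}] > 0$ for $i = 1, 2, 3$. Suppose that we have three assets with prices
\[ \bar\pi = \begin{pmatrix} 1 \\ 2 \\ 7 \end{pmatrix} \]
at time 0 and
\[ \bar S(\omega_1) = \begin{pmatrix} 1 \\ 3 \\ 9 \end{pmatrix}, \qquad \bar S(\omega_2) = \begin{pmatrix} 1 \\ 1 \\ 5 \end{pmatrix}, \qquad \bar S(\omega_3) = \begin{pmatrix} 1 \\ 5 \\ 10 \end{pmatrix} \]
at time 1. Show that this market model admits arbitrage.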

Exercise 1.1.2. We consider a market model with a single risky asset defined on a probability space with a finite sample space $\Omega$ and a probability measure $P$ that assigns strictly positive probability to each $\omega \in \Omega$. We let
\[ a := \min_{\omega \in \Omega} S(\omega) \quad\text{and}\quad b := \max_{\omega \in \Omega} S(\omega). \]
Show that the model does not admit arbitrage if and only if $a < \pi(1 + r) < b$.
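Taking the criterion of Exercise 1.1.2 as given, it can be evaluated mechanically for a concrete single-asset model. The sketch below does only that; the function name and the numbers are hypothetical, and it is not a substitute for the proof the exercise asks for.

```python
from fractions import Fraction

def satisfies_no_arbitrage_criterion(pi, r, values):
    """Evaluate a < pi (1 + r) < b, where a and b are the smallest and
    largest time-1 prices of the single risky asset."""
    a, b = min(values), max(values)
    return a < pi * (1 + r) < b

# Hypothetical models with initial price 10.
print(satisfies_no_arbitrage_criterion(10, Fraction(1, 20), [9, 12]))    # 10.5 lies in (9, 12)
print(satisfies_no_arbitrage_criterion(10, Fraction(1, 10), [11, 12]))   # 11 is not above a = 11
```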

Exercise 1.1.3. Show that the existence of an arbitrage opportunity implies the following seemingly stronger condition.
(a) There exists an arbitrage opportunity $\bar\xi$ such that $\bar\pi \cdot \bar\xi = 0$.
Show furthermore that the following condition implies the existence of an arbitrage opportunity.
(b) There exists $\bar\xi \in \mathbb{R}^{d+1}$ such that $\bar\pi \cdot \bar\xi < 0$ and $\bar\xi \cdot \bar S \ge 0$ $P$-a.s.
What can you say about the implication (a) $\Rightarrow$ (b)?

1.2 Absence of arbitrage and martingale measures

In this section, we are going to characterize those market models which do not admit
any arbitrage opportunities. Such models will be called arbitrage-free.
Definition 1.5. A probability measure $P^*$ is called a risk-neutral measure, or a martingale measure, if
\[ \pi^i = E^*\Bigl[ \frac{S^i}{1 + r} \Bigr], \quad i = 0, 1, \dots, d. \tag{1.1} \]
Remark 1.6. In (1.1), the price of the $i$th asset is identified as the expectation of the discounted payoff under the measure $P^*$. Thus, the pricing formula (1.1) can be seen as a classical valuation formula which does not take into account any risk aversion, in contrast to valuations in terms of expected utility which will be discussed in Section 2.3. This is why a measure $P^*$ satisfying (1.1) is called risk-neutral. The connection to martingales will be made explicit in Section 1.6. ♦
The following basic result is sometimes called the fundamental theorem of asset
pricing or, in short, FTAP. It characterizes arbitrage-free market models in terms of
the set

\[ \mathcal{P} := \{\, P^* \mid P^* \text{ is a risk-neutral measure with } P^* \approx P \,\} \]
of risk-neutral measures which are equivalent to $P$. Recall that two probability measures $P^*$ and $P$ are said to be equivalent ($P^* \approx P$) if, for $A \in \mathcal{F}$, $P^*[A] = 0$ if and only if $P[A] = 0$. This holds if and only if $P^*$ has a strictly positive density $dP^*/dP$ with respect to $P$; see Appendix A.2. An equivalent risk-neutral measure is also called a pricing measure or an equivalent martingale measure.
Theorem 1.7. A market model is arbitrage-free if and only if $\mathcal{P} \ne \emptyset$. In this case, there exists a $P^* \in \mathcal{P}$ which has a bounded density $dP^*/dP$.
We show first that the existence of a risk-neutral measure implies the absence of
arbitrage.

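Proof of the implication $\Leftarrow$ of Theorem 1.7. Suppose that there exists a risk-neutral measure $P^* \in \mathcal{P}$. Take a portfolio $\bar\xi \in \mathbb{R}^{d+1}$ such that $\bar\xi \cdot \bar S \ge 0$ $P$-a.s. and $E[\, \bar\xi \cdot \bar S \,] > 0$. Both properties remain valid if we replace $P$ by the equivalent measure $P^*$. Hence,
\[ \bar\pi \cdot \bar\xi = \sum_{i=0}^{d} \pi^i \xi^i = \sum_{i=0}^{d} E^*\Bigl[ \frac{\xi^i S^i}{1 + r} \Bigr] = E^*\Bigl[ \frac{\bar\xi \cdot \bar S}{1 + r} \Bigr] > 0. \]
Thus, $\bar\xi$ cannot be an arbitrage opportunity.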


For the proof of the implication $\Rightarrow$ of Theorem 1.7, it will be convenient to introduce the random vector $Y = (Y^1, \dots, Y^d)$ of discounted net gains:
\[ Y^i := \frac{S^i}{1 + r} - \pi^i, \quad i = 1, \dots, d. \tag{1.2} \]
With this notation, Lemma 1.4 implies that the absence of arbitrage is equivalent to the following condition:
\[ \text{For } \xi \in \mathbb{R}^d: \quad \xi \cdot Y \ge 0 \ P\text{-a.s.} \implies \xi \cdot Y = 0 \ P\text{-a.s.} \tag{1.3} \]
Since $Y^i$ is bounded from below by $-\pi^i$, the expectation $E^*[\, Y^i \,]$ of $Y^i$ under any measure $P^*$ is well-defined, and so $P^*$ is a risk-neutral measure if and only if
\[ E^*[\, Y \,] = 0. \tag{1.4} \]
Here, $E^*[\, Y \,]$ is a shorthand notation for the $d$-dimensional vector with components $E^*[\, Y^i \,]$, $i = 1, \dots, d$. The assertion of Theorem 1.7 can now be read as follows: Condition (1.3) holds if and only if there exists some $P^* \approx P$ such that $E^*[\, Y \,] = 0$, and in this case, $P^*$ can be chosen such that the density $dP^*/dP$ is bounded.
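On a finite scenario set, condition (1.4) is a finite system of equations that can be verified for any candidate measure. The following sketch assumes such a finite set and uses hypothetical prices; the function names are illustrative only. The requirement for the bond ($i = 0$ in (1.1)) holds automatically and is therefore not checked.

```python
from fractions import Fraction

def discounted_net_gains(S, pi, r):
    """Y(omega) = S(omega) / (1 + r) - pi, componentwise, per scenario."""
    return [[s_i / (1 + r) - p_i for s_i, p_i in zip(s, pi)] for s in S]

def expectation(Y, probs):
    """Componentwise expectation of the vectors Y(omega) under probs."""
    d = len(Y[0])
    return [sum(p * y[i] for p, y in zip(probs, Y)) for i in range(d)]

def is_risk_neutral(probs, S, pi, r):
    """Check E*[Y] = 0 for the candidate probabilities probs."""
    Y = discounted_net_gains(S, pi, r)
    return all(e == 0 for e in expectation(Y, probs))

# Hypothetical model: one risky asset with price 10, r = 5 %,
# worth 12 or 9 at time 1.
r = Fraction(1, 20)
pi = [Fraction(10)]
S = [[Fraction(12)], [Fraction(9)]]

print(is_risk_neutral([Fraction(1, 2), Fraction(1, 2)], S, pi, r))
print(is_risk_neutral([Fraction(1, 3), Fraction(2, 3)], S, pi, r))
```

Here the uniform weights satisfy $12 \cdot \tfrac12 + 9 \cdot \tfrac12 = 10.5 = 10\,(1 + r)$, so the first candidate passes and the second does not.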
Proof of the implication $\Rightarrow$ of Theorem 1.7. We have to show that (1.3) implies the existence of some $P^* \approx P$ such that (1.4) holds and such that the density $dP^*/dP$ is bounded. We will do this first in the case in which
\[ E[\, |Y| \,] < \infty. \]
Let $\mathcal{Q}$ denote the convex set of all probability measures $Q \approx P$ with bounded densities $dQ/dP$, and denote by $E_Q[\, Y \,]$ the $d$-dimensional vector with components $E_Q[\, Y^i \,]$, $i = 1, \dots, d$. Due to our assumption $|Y| \in L^1(P)$, all these expectations are finite. Let
\[ \mathcal{C} := \{\, E_Q[\, Y \,] \mid Q \in \mathcal{Q} \,\}, \]
and note that $\mathcal{C}$ is a convex set in $\mathbb{R}^d$: If $Q_1$, $Q_0 \in \mathcal{Q}$ and $0 \le \alpha \le 1$, then $Q_\alpha := \alpha Q_1 + (1 - \alpha) Q_0 \in \mathcal{Q}$ and
\[ \alpha E_{Q_1}[\, Y \,] + (1 - \alpha) E_{Q_0}[\, Y \,] = E_{Q_\alpha}[\, Y \,], \]
which lies in $\mathcal{C}$.
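The convexity step rests on the fact that the expectation is linear in the measure. On a finite scenario set this identity can be confirmed by direct computation; the sketch below uses hypothetical values for the net gains and for the two measures.

```python
from fractions import Fraction

def expectation(Y, probs):
    """Componentwise expectation of the vectors Y(omega) under probs."""
    d = len(Y[0])
    return [sum(p * y[i] for p, y in zip(probs, Y)) for i in range(d)]

def mix(q1, q0, alpha):
    """The measure alpha * Q1 + (1 - alpha) * Q0 on a finite scenario set."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(q1, q0)]

# Hypothetical net gains Y(omega) in R^2 on three scenarios, and two
# measures that charge every scenario.
Y = [[Fraction(1), Fraction(2)],
     [Fraction(-1), Fraction(-2)],
     [Fraction(3), Fraction(3)]]
Q1 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
Q0 = [Fraction(1, 5), Fraction(2, 5), Fraction(2, 5)]
alpha = Fraction(1, 3)

Q_alpha = mix(Q1, Q0, alpha)
lhs = [alpha * a + (1 - alpha) * b
       for a, b in zip(expectation(Y, Q1), expectation(Y, Q0))]
rhs = expectation(Y, Q_alpha)

print(Q_alpha, lhs, rhs)
```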
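Our aim is to show that $\mathcal{C}$ contains the origin. To this end, we suppose by way of contradiction that $0 \notin \mathcal{C}$. Using the separating hyperplane theorem in the elementary form of Proposition A.1, we obtain a vector $\xi \in \mathbb{R}^d$ such that $\xi \cdot x \ge 0$ for all $x \in \mathcal{C}$, and such that $\xi \cdot x_0 > 0$ for some $x_0 \in \mathcal{C}$. Thus, $\xi$ satisfies $E_Q[\, \xi \cdot Y \,] \ge 0$ for all $Q \in \mathcal{Q}$ and $E_{Q_0}[\, \xi \cdot Y \,] > 0$ for some $Q_0 \in \mathcal{Q}$. Clearly, the latter condition yields that $P[\, \xi \cdot Y > 0 \,] > 0$. We claim that the first condition implies that $\xi \cdot Y$ is $P$-a.s. non-negative. This fact will be a contradiction to our assumption (1.3) and thus will prove that $0 \in \mathcal{C}$.
To prove the claim that $\xi \cdot Y \ge 0$ $P$-a.s., let $A := \{ \xi \cdot Y < 0 \}$, and define functions
\[ \varphi_n := \Bigl( 1 - \frac{1}{n} \Bigr) \cdot I_A + \frac{1}{n} \cdot I_{A^c}. \]
We take $\varphi_n$ as densities for new probability measures $Q_n$:
\[ \frac{dQ_n}{dP} := \frac{1}{E[\, \varphi_n \,]} \cdot \varphi_n, \quad n = 2, 3, \dots. \]
Since $0 < \varphi_n \le 1$, it follows that $Q_n \in \mathcal{Q}$, and thus that
\[ 0 \le \xi \cdot E_{Q_n}[\, Y \,] = \frac{1}{E[\, \varphi_n \,]} E[\, \xi \cdot Y \varphi_n \,]. \]
Hence, Lebesgue's dominated convergence theorem yields that
\[ E\bigl[\, \xi \cdot Y \, I_{\{ \xi \cdot Y < 0 \}} \,\bigr] = \lim_{n \uparrow \infty} E[\, \xi \cdot Y \varphi_n \,] \ge 0. \]
This proves the claim that $\xi \cdot Y \ge 0$ $P$-a.s. and completes the proof of Theorem 1.7 in case $E[\, |Y| \,] < \infty$.
If $Y$ is not $P$-integrable, then we simply replace the probability measure $P$ by a suitable equivalent measure $\tilde P$ whose density $d\tilde P/dP$ is bounded and for which $\tilde E[\, |Y| \,] < \infty$. For instance, one can define $\tilde P$ by
\[ \frac{d\tilde P}{dP} = \frac{c}{1 + |Y|} \quad\text{for } c := \Bigl( E\Bigl[ \frac{1}{1 + |Y|} \Bigr] \Bigr)^{-1}. \]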
Recall from Remark 1.3 that replacing $P$ with an equivalent probability measure does not affect the absence of arbitrage opportunities in our market model. Thus, the first part of this proof yields a risk-neutral measure $P^*$ which is equivalent to $\tilde P$ and whose density $dP^*/d\tilde P$ is bounded. Then $P^* \in \mathcal{P}$, and
\[ \frac{dP^*}{dP} = \frac{dP^*}{d\tilde P} \cdot \frac{d\tilde P}{dP} \]
is bounded. Hence, $P^*$ is as desired, and the theorem is proved.
Remark 1.8. Note that neither the absence of arbitrage nor the definition of the class $\mathcal{P}$ involves the full structure of the probability measure $P$; they only depend on the class of null sets of $P$. In particular, the preceding theorem can be formulated in a situation of Knightian uncertainty, i.e., without fixing any initial probability measure $P$, whenever the underlying set $\Omega$ is countable. ♦
Remark 1.9. Our assumption that asset prices $S$ are non-negative implies that the components of $Y$ are bounded from below. Note however that this assumption was not needed in our proof. Thus, Theorem 1.7 also holds if we only assume that $S$ is finite-valued and $\pi \in \mathbb{R}^d$. In this case, the definition of a risk-neutral measure $P^*$ via (1.1) is meant to include the assumption that $S^i$ is integrable with respect to $P^*$ for $i = 1, \dots, d$. ♦
Example 1.10. Let $P$ be any probability measure on the finite set $\Omega := \{\omega_1, \dots, \omega_N\}$ that assigns strictly positive probability $p_i$ to each singleton $\{\omega_i\}$. Suppose that there is a single risky asset defined by its price $\pi = \pi^1$ at time 0 and by the random variable $S = S^1$. We may assume without loss of generality that the values $s_i := S(\omega_i)$ are distinct and arranged in increasing order: $s_1 < \cdots < s_N$. According to Theorem 1.7, this model does not admit arbitrage opportunities if and only if
$$\pi(1 + r) \in \big\{ \tilde E[\,S\,] \,\big|\, \tilde P \approx P \big\} = \Big\{ \sum_{i=1}^N s_i\, \tilde p_i \,\Big|\, \tilde p_i > 0,\ \sum_{i=1}^N \tilde p_i = 1 \Big\} = (s_1, s_N),$$
and $P^*$ is a risk-neutral measure if and only if the probabilities $p_i^* := P^*[\,\{\omega_i\}\,] \ge 0$ solve the linear equations
$$s_1 p_1^* + \cdots + s_N p_N^* = \pi(1 + r),$$
$$p_1^* + \cdots + p_N^* = 1.$$
If a solution exists, it will be unique if and only if $N = 2$, and there will be infinitely many solutions for $N > 2$. ◊
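The criterion of Example 1.10 is easy to check numerically. The following is a minimal sketch, not part of the original text: the function names and the numbers ($\pi = 100$, $r = 5\%$, $S \in \{90, 120\}$) are hypothetical and chosen only for illustration. It tests whether $\pi(1+r)$ lies strictly between the smallest and largest value of $S$ and, for $N = 2$, solves the two linear equations for the unique risk-neutral probabilities.

```python
from fractions import Fraction

def admits_no_arbitrage(s, pi, r):
    """Criterion of Example 1.10: pi*(1+r) lies strictly between min S and max S."""
    return min(s) < pi * (1 + r) < max(s)

def risk_neutral_two_states(s1, s2, pi, r):
    """Solve s1*p1 + s2*p2 = pi*(1+r), p1 + p2 = 1 (the case N = 2)."""
    p2 = (pi * (1 + r) - s1) / (s2 - s1)
    return 1 - p2, p2

# Hypothetical model: pi = 100, r = 5%, S takes the values 90 and 120.
s1, s2 = Fraction(90), Fraction(120)
pi, r = Fraction(100), Fraction(1, 20)

p1, p2 = risk_neutral_two_states(s1, s2, pi, r)
print(admits_no_arbitrage([s1, s2], pi, r), p1, p2)
```

Exact rational arithmetic is used so that the two linear equations can be verified without rounding error.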
Exercise 1.2.1. On $\Omega = \{\omega_1, \omega_2, \omega_3\}$ we fix a probability measure $P$ with $P[\,\{\omega_i\}\,] > 0$ for $i = 1, 2, 3$. Suppose that we have three assets with prices
$$\bar\pi = \begin{pmatrix} 1 \\ 2 \\ 7 \end{pmatrix}$$
at time 0 and
$$\bar S(\omega_1) = \begin{pmatrix} 1 \\ 3 \\ 9 \end{pmatrix}, \qquad \bar S(\omega_2) = \begin{pmatrix} 1 \\ 1 \\ 5 \end{pmatrix}, \qquad \bar S(\omega_3) = \begin{pmatrix} 1 \\ 5 \\ 13 \end{pmatrix}$$
at time 1. Show that this market model does not admit arbitrage and find all risk-neutral measures. Note that this model differs from the one in Exercise 1.1.1 only in the value of $S^2(\omega_3)$. ◊
Exercise 1.2.2. Consider a market model with one risky asset that is such that $\pi^1 > 0$ and the distribution of $S^1$ has a strictly positive density function $f : (0, \infty) \to (0, \infty)$. That is, $P[\,S^1 \le x\,] = \int_0^x f(y)\, dy$ for $x > 0$. Find an equivalent risk-neutral measure $P^*$. ◊
Remark 1.11. The economic reason for working with the discounted asset prices
$$X^i := \frac{S^i}{1 + r}, \qquad i = 0, \dots, d, \tag{1.5}$$
is that one should distinguish between one unit of a currency (e.g. €) at time $t = 0$ and one unit at time $t = 1$. Usually people tend to prefer a certain amount today over the same amount which is promised to be paid at a later time. Such a preference is reflected in an interest $r > 0$ paid by the riskless bond: Only the amount $1/(1 + r)$ € must be invested at time 0 to obtain 1 € at time 1. This effect is sometimes referred to as the time value of money. Similarly, the price $S^i$ of the $i$th asset is quoted in terms of € at time 1, while $\pi^i$ corresponds to time-zero euros. Thus, in order to compare the two prices $\pi^i$ and $S^i$, one should first convert them to a common standard. This is achieved by taking the riskless bond as a numéraire and by considering the discounted prices in (1.5). ◊
Remark 1.12. One can choose as numéraire any asset which is strictly positive. For instance, suppose that $\pi^1 > 0$ and $P[\,S^1 > 0\,] = 1$. Then all asset prices can be expressed in units of the first asset by considering
$$\tilde\pi^i := \frac{\pi^i}{\pi^1} \qquad \text{and} \qquad \frac{S^i}{S^1}, \qquad i = 0, \dots, d.$$
Clearly, the definition of an arbitrage opportunity is independent of the choice of a particular numéraire. Thus, an arbitrage-free market model should admit a risk-neutral measure with respect to the new numéraire, i.e., a probability measure $\tilde P^* \approx P$ such that
$$\tilde\pi^i = \tilde E^*\Big[\,\frac{S^i}{S^1}\,\Big], \qquad i = 0, \dots, d.$$


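Let us denote by $\tilde{\mathcal P}$ the set of all such measures $\tilde P^*$. Then
$$\tilde{\mathcal P} = \Big\{ \tilde P^* \,\Big|\, \frac{d\tilde P^*}{dP^*} = \frac{S^1}{E^*[\,S^1\,]} \text{ for some } P^* \in \mathcal P \Big\}.$$
Indeed, if $\tilde P^*$ lies in the set on the right, then
$$\tilde E^*\Big[\,\frac{S^i}{S^1}\,\Big] = \frac{E^*[\,S^i\,]}{E^*[\,S^1\,]} = \frac{\pi^i}{\pi^1} = \tilde\pi^i,$$
and so $\tilde P^* \in \tilde{\mathcal P}$. Reversing the roles of $\tilde{\mathcal P}$ and $\mathcal P$ then yields the identity of the two sets. Note that
$$\mathcal P \cap \tilde{\mathcal P} = \emptyset$$
as soon as $S^1$ is not $P$-a.s. constant, because Jensen's inequality then implies that
$$\frac{1}{\pi^1} = \tilde\pi^0 = \tilde E^*\Big[\,\frac{1 + r}{S^1}\,\Big] > \frac{1 + r}{\tilde E^*[\,S^1\,]}$$
and hence $\tilde E^*[\,S^1\,] > E^*[\,S^1\,]$ for all $\tilde P^* \in \tilde{\mathcal P}$ and $P^* \in \mathcal P$.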
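The change of numéraire in Remark 1.12 can be illustrated in a two-state model. This is a sketch under assumed data: the numbers ($\pi^1 = 100$, $r = 5\%$, $S^1 \in \{90, 120\}$, $P^* = (1/2, 1/2)$) and all names are hypothetical. It builds $\tilde P^*$ from the density $S^1 / E^*[S^1]$ and checks that the bond, expressed in units of the first asset, is priced correctly under $\tilde P^*$.

```python
from fractions import Fraction

r = Fraction(1, 20)
pi1 = Fraction(100)                         # price of the first asset (pi^0 = 1)
s = [Fraction(90), Fraction(120)]           # values of S^1
p_star = [Fraction(1, 2), Fraction(1, 2)]   # risk-neutral with respect to the bond

def expect(p, x):
    """Expectation of the vector x under the probability weights p."""
    return sum(pk * xk for pk, xk in zip(p, x))

mean_s = expect(p_star, s)                  # E*[S^1] = pi^1 * (1 + r)

# Density S^1 / E*[S^1] turns P* into the measure for the new numeraire.
p_tilde = [pk * sk / mean_s for pk, sk in zip(p_star, s)]

# The bond S^0 = 1 + r in units of the first asset.
bond_in_asset_units = [(1 + r) / sk for sk in s]
price_of_bond = expect(p_tilde, bond_in_asset_units)
print(p_tilde, price_of_bond, expect(p_tilde, s))
```

In this example the expectation of $S^1$ under $\tilde P^*$ exceeds its expectation under $P^*$, in line with the Jensen argument of the remark.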
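Let
$$\mathcal V := \big\{ \bar\xi \cdot \bar S \,\big|\, \bar\xi \in \mathbb R^{d+1} \big\}$$
denote the linear space of all payoffs which can be generated by some portfolio. An element of $\mathcal V$ will be called an attainable payoff. The portfolio that generates $V \in \mathcal V$ is in general not unique, but we have the following law of one price.
Lemma 1.13. Suppose that the market model is arbitrage-free and that $V \in \mathcal V$ can be written as $V = \bar\xi \cdot \bar S = \bar\zeta \cdot \bar S$ $P$-a.s. for two different portfolios $\bar\xi$ and $\bar\zeta$. Then $\bar\pi \cdot \bar\xi = \bar\pi \cdot \bar\zeta$.
Proof. We have $(\bar\xi - \bar\zeta) \cdot \bar S = 0$ $P^*$-a.s. for any $P^* \in \mathcal P$. Hence,
$$\bar\pi \cdot \bar\xi - \bar\pi \cdot \bar\zeta = E^*\Big[\,\frac{(\bar\xi - \bar\zeta) \cdot \bar S}{1 + r}\,\Big] = 0,$$
due to (1.1).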
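The law of one price can be seen at work in the three-asset model of Exercise 1.2.1, taking for granted that this model is arbitrage-free (which is what the exercise asks to show). The sketch below is illustrative only; the two portfolios and the helper `dot` are assumptions introduced here. Both portfolios generate the same payoff in every scenario, and their costs coincide.

```python
# Data of Exercise 1.2.1: prices at time 0 and payoffs in the three scenarios.
pi_bar = (1, 2, 7)
S_bar = {"w1": (1, 3, 9), "w2": (1, 1, 5), "w3": (1, 5, 13)}

def dot(a, b):
    """Euclidean inner product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

xi = (0, 0, 1)     # one unit of the third asset
zeta = (3, 2, 0)   # three bonds plus two units of the second asset

same_payoff = all(dot(xi, S) == dot(zeta, S) for S in S_bar.values())
cost_xi, cost_zeta = dot(pi_bar, xi), dot(pi_bar, zeta)
print(same_payoff, cost_xi, cost_zeta)
```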
By the preceding lemma, it makes sense to define the price of $V \in \mathcal V$ as
$$\pi(V) := \bar\pi \cdot \bar\xi \qquad \text{if } V = \bar\xi \cdot \bar S, \tag{1.6}$$
whenever the market model is arbitrage-free.

Remark 1.14. Via (1.6), the price system $\pi$ can be regarded as a linear form on the finite-dimensional vector space $\mathcal V$. For any $P^* \in \mathcal P$ we have
$$\pi(V) = E^*\Big[\,\frac{V}{1 + r}\,\Big], \qquad V \in \mathcal V.$$
Thus, an equivalent risk-neutral measure $P^*$ defines a linear extension of $\pi$ onto the larger space $L^1(P^*)$ of $P^*$-integrable random variables. Since this space is usually infinite-dimensional, one cannot expect that such a pricing measure is in general unique; see however Section 1.4. ◊
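We have seen above that, in an arbitrage-free market model, the condition $\bar\xi \cdot \bar S = 0$ $P$-a.s. implies that $\bar\pi \cdot \bar\xi = 0$. In fact, one may assume without loss of generality that
$$\bar\xi \cdot \bar S = 0 \ \ P\text{-a.s.} \quad \Longrightarrow \quad \bar\xi = 0, \tag{1.7}$$
for otherwise we can find $i \in \{0, \dots, d\}$ such that $\xi^i \ne 0$ and represent the $i$th asset as a linear combination of the remaining ones:
$$\pi^i = -\frac{1}{\xi^i} \sum_{j \ne i} \xi^j \pi^j \qquad \text{and} \qquad S^i = -\frac{1}{\xi^i} \sum_{j \ne i} \xi^j S^j.$$
In this sense, the $i$th asset is redundant and can be omitted.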


Definition 1.15. The market model is called non-redundant if (1.7) holds.
Exercise 1.2.3. Show that in a non-redundant market model the components of the vector $Y$ of discounted net gains are linearly independent in the sense that
$$\xi \cdot Y = 0 \ \ P\text{-a.s.} \quad \Longrightarrow \quad \xi = 0. \tag{1.8}$$
Show then that condition (1.8) implies non-redundance if the market model is arbitrage-free. ◊
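Exercise 1.2.4. Show that in a non-redundant and arbitrage-free market model the set
$$\big\{ \bar\xi \in \mathbb R^{d+1} \,\big|\, \bar\xi \cdot \bar\pi = w \text{ and } \bar\xi \cdot \bar S \ge 0 \ P\text{-a.s.} \big\}$$
is compact for any $w > 0$.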

Definition 1.16. Suppose that the market model is arbitrage-free and that $V \in \mathcal V$ is an attainable payoff such that $\pi(V) \ne 0$. Then the return of $V$ is defined by
$$R(V) := \frac{V - \pi(V)}{\pi(V)}.$$

Note that we have already seen the special case of the risk-free return
$$r = \frac{S^0 - \pi^0}{\pi^0} = R(S^0).$$
If an attainable payoff $V$ is a linear combination $V = \sum_{k=1}^n \alpha_k V_k$ of non-zero attainable payoffs $V_k$, then
$$R(V) = \sum_{k=1}^n \beta_k R(V_k) \qquad \text{for } \beta_k = \frac{\alpha_k \pi(V_k)}{\sum_{i=1}^n \alpha_i \pi(V_i)}.$$
The coefficient $\beta_k$ can be interpreted as the proportion of the investment allocated to $V_k$. As a particular case of the formula above, we have that
$$R(V) = \sum_{i=0}^d \frac{\pi^i \xi^i}{\bar\pi \cdot \bar\xi} \cdot R(S^i)$$
for all non-zero attainable payoffs $V = \bar\xi \cdot \bar S$ (recall that we have assumed that all $\pi^i$ are strictly positive).
Proposition 1.17. Suppose that the market model is arbitrage-free, and let $V \in \mathcal V$ be an attainable payoff such that $\pi(V) \ne 0$.
(a) Under any risk-neutral measure $P^*$, the expected return of $V$ is equal to the risk-free return $r$:
$$E^*[\,R(V)\,] = r.$$
(b) Under any measure $Q \approx P$ such that $E_Q[\,|\bar S|\,] < \infty$, the expected return of $V$ is given by
$$E_Q[\,R(V)\,] = r - \operatorname{cov}_Q\Big( \frac{dP^*}{dQ},\, R(V) \Big),$$
where $P^*$ is an arbitrary risk-neutral measure in $\mathcal P$ and $\operatorname{cov}_Q$ denotes the covariance with respect to $Q$.
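Proof. (a): Since $E^*[\,V\,] = \pi(V)(1 + r)$, we have
$$E^*[\,R(V)\,] = \frac{E^*[\,V\,] - \pi(V)}{\pi(V)} = r.$$
(b): Let $P^* \in \mathcal P$ and $\varphi^* := dP^*/dQ$. Then
$$\operatorname{cov}_Q(\varphi^*, R(V)) = E_Q[\,\varphi^* R(V)\,] - E_Q[\,\varphi^*\,] \cdot E_Q[\,R(V)\,] = E^*[\,R(V)\,] - E_Q[\,R(V)\,].$$
Using part (a) yields the assertion.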
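Both parts of Proposition 1.17 can be checked by direct computation on a finite probability space. The following sketch uses a hypothetical three-state model ($r = 0$, $\pi = 100$, $S \in \{80, 100, 120\}$) together with an assumed risk-neutral measure $P^*$ and an assumed measure $Q$; none of these numbers come from the text.

```python
from fractions import Fraction as F

r, pi = F(0), F(100)
s = [F(80), F(100), F(120)]
p_star = [F(1, 4), F(1, 2), F(1, 4)]   # satisfies E*[S] = pi * (1 + r)
q = [F(1, 5), F(2, 5), F(2, 5)]        # some other equivalent measure

def expect(p, x):
    """Expectation of the vector x under the probability weights p."""
    return sum(pk * xk for pk, xk in zip(p, x))

ret = [(sk - pi) / pi for sk in s]             # return R(V) for V = S
phi = [ps / qk for ps, qk in zip(p_star, q)]   # density dP*/dQ

cov_q = expect(q, [a * b for a, b in zip(phi, ret)]) - expect(q, phi) * expect(q, ret)

print(expect(p_star, ret))       # part (a): equals r
print(expect(q, ret), r - cov_q) # part (b): both sides agree
```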


Remark 1.18. Let us comment on the extension of the "fundamental equivalence" in Theorem 1.7 to market models with an infinity of tradable assets $S^0, S^1, S^2, \dots$. We assume that $S^0 \equiv 1 + r$ for some $r > -1$ and that the random vector
$$S(\omega) = (S^1(\omega), S^2(\omega), \dots)$$
takes values in the space $\ell^\infty$ of bounded real sequences. This space is a Banach space with respect to the norm
$$\|x\|_\infty := \sup_{i \ge 1} |x^i| \qquad \text{for } x = (x^1, x^2, \dots) \in \ell^\infty.$$
A portfolio $\bar\xi = (\xi^0, \xi)$ is chosen in such a way that $\xi = (\xi^1, \xi^2, \dots)$ is a sequence in the space $\ell^1$, i.e., $\sum_{i=1}^\infty |\xi^i| < \infty$. We assume that the corresponding price system $\bar\pi = (\pi^0, \pi)$ satisfies $\pi \in \ell^\infty$ and $\pi^0 = 1$. Clearly, this model class includes our model with $d + 1$ traded assets as a special case.
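Our first observation is that the implication $\Leftarrow$ of Theorem 1.7 remains valid, i.e., the existence of a measure $P^* \approx P$ with the properties
$$E^*[\,\|S\|_\infty\,] < \infty \qquad \text{and} \qquad E^*\Big[\,\frac{S^i}{1 + r}\,\Big] = \pi^i$$
implies the absence of arbitrage opportunities. To this end, suppose that $\bar\xi$ is a portfolio strategy such that
$$\bar\xi \cdot \bar S \ge 0 \ \ P\text{-a.s.} \qquad \text{and} \qquad E[\,\bar\xi \cdot \bar S\,] > 0. \tag{1.9}$$
Then we can replace $P$ in (1.9) by the equivalent measure $P^*$. Hence, $\bar\xi$ cannot be an arbitrage opportunity since
$$\bar\xi \cdot \bar\pi = \sum_{i=0}^\infty \xi^i E^*\Big[\,\frac{S^i}{1 + r}\,\Big] = E^*\Big[\,\frac{\bar\xi \cdot \bar S}{1 + r}\,\Big] > 0.$$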

Note that interchanging summation and integration is justified by dominated convergence, because
$$|\xi^0| + \|S\|_\infty \sum_{i=0}^\infty |\xi^i| \in L^1(P^*).$$
The following example shows that the implication $\Rightarrow$ of Theorem 1.7, namely that absence of arbitrage opportunities implies the existence of a risk-neutral measure, may no longer be true for an infinite number of assets. ◊
Example 1.19. Let $\Omega = \{1, 2, \dots\}$, and choose any probability measure $P$ which assigns strictly positive probability to all singletons $\{\omega\}$. We take $r = 0$ and define a price system $\pi^i = 1$, for $i = 0, 1, \dots$. Prices at time 1 are given by $S^0 \equiv 1$ and, for $i = 1, 2, \dots$, by
$$S^i(\omega) = \begin{cases} 0 & \text{if } \omega = i, \\ 2 & \text{if } \omega = i + 1, \\ 1 & \text{otherwise.} \end{cases}$$
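Let us show that this market model is arbitrage-free. To this end, suppose that $\bar\xi = (\xi^0, \xi)$ is a portfolio such that $\xi \in \ell^1$ and such that $\bar\xi \cdot \bar S(\omega) \ge 0$ for each $\omega \in \Omega$, but such that $\bar\pi \cdot \bar\xi \le 0$. Considering the case $\omega = 1$ yields
$$0 \le \bar\xi \cdot \bar S(1) = \xi^0 + \sum_{k=2}^\infty \xi^k = \bar\pi \cdot \bar\xi - \xi^1 \le -\xi^1.$$
Similarly, for $\omega = i > 1$,
$$0 \le \bar\xi \cdot \bar S(\omega) = \xi^0 + 2\xi^{i-1} + \sum_{\substack{k=1 \\ k \ne i,\, i-1}}^\infty \xi^k = \bar\pi \cdot \bar\xi + \xi^{i-1} - \xi^i \le \xi^{i-1} - \xi^i.$$
It follows that $0 \ge \xi^1 \ge \xi^2 \ge \cdots$. But this can only be true if all $\xi^i$ vanish, since we have assumed that $\xi \in \ell^1$. Hence, there are no arbitrage opportunities.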
However, there exists no probability measure $P^* \approx P$ such that $E^*[\,S^i\,] = \pi^i$ for all $i$. Such a measure $P^*$ would have to satisfy
$$1 = E^*[\,S^i\,] = 2P^*[\,\{i + 1\}\,] + \sum_{\substack{k=1 \\ k \ne i,\, i+1}}^\infty P^*[\,\{k\}\,] = 1 + P^*[\,\{i + 1\}\,] - P^*[\,\{i\}\,]$$
for $i > 1$. This relation implies that $P^*[\,\{i\}\,] = P^*[\,\{i + 1\}\,]$ for all $i > 1$, contradicting the assumption that $P^*$ is a probability measure and equivalent to $P$. ◊

1.3 Derivative securities

In real financial markets, not only the primary assets are traded. There is also a large
variety of securities whose payoff depends in a non-linear way on the primary assets
$S^0, S^1, \dots, S^d$, and sometimes also on other factors. Such financial instruments are
usually called options, contingent claims, derivative securities, or just derivatives.
Example 1.20. Under a forward contract, one agent agrees to sell to another agent an asset at time 1 for a price $K$ which is specified at time 0. Thus, the owner of a forward contract on the $i$th asset gains the difference between the actual market price $S^i$ and the delivery price $K$ if $S^i$ is larger than $K$ at time 1. If $S^i < K$, the owner loses the amount $K - S^i$ to the issuer of the forward contract. Hence, a forward contract corresponds to the random payoff
$$C^{\text{fw}} = S^i - K.$$

Example 1.21. The owner of a call option on the $i$th asset has the right, but not the obligation, to buy the $i$th asset at time 1 for a fixed price $K$, called the strike price. This corresponds to a payoff of the form
$$C^{\text{call}} = (S^i - K)^+ = \begin{cases} S^i - K & \text{if } S^i > K, \\ 0 & \text{otherwise.} \end{cases}$$
Conversely, a put option gives the right, but not the obligation, to sell the asset at time 1 for a strike price $K$. The corresponding random payoff is given by
$$C^{\text{put}} = (K - S^i)^+ = \begin{cases} K - S^i & \text{if } S^i < K, \\ 0 & \text{otherwise.} \end{cases}$$
Call and put options with the same strike $K$ are related through the formula
$$C^{\text{call}} - C^{\text{put}} = S^i - K.$$
Hence, if the price $\pi(C^{\text{call}})$ of a call option has already been fixed, then the price $\pi(C^{\text{put}})$ of the corresponding put option is determined by linearity through the put-call parity
$$\pi(C^{\text{call}}) = \pi(C^{\text{put}}) + \pi^i - \frac{K}{1 + r}. \tag{1.10}$$
◊
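The put-call parity (1.10) can be verified in a small model in which both options happen to be attainable. This is a sketch with hypothetical data: a two-state model with $\pi^i = 100$, $r = 5\%$, $S^i \in \{90, 120\}$, risk-neutral weights $(1/2, 1/2)$ and strike $K = 100$; the helper names are not from the text.

```python
from fractions import Fraction as F

def call(s, k):
    """Payoff (s - k)^+ of a call option."""
    return max(s - k, 0)

def put(s, k):
    """Payoff (k - s)^+ of a put option."""
    return max(k - s, 0)

r, pi, K = F(1, 20), F(100), F(100)
s = [F(90), F(120)]
p_star = [F(1, 2), F(1, 2)]     # unique risk-neutral measure of this model

def price(payoff):
    """Discounted risk-neutral expectation of payoff(S, K)."""
    return sum(p * payoff(sk, K) for p, sk in zip(p_star, s)) / (1 + r)

lhs = price(call)
rhs = price(put) + pi - K / (1 + r)
print(lhs, rhs)
```

The payoff identity $C^{\text{call}} - C^{\text{put}} = S^i - K$ holds scenario by scenario, which is why the parity does not depend on the particular model chosen here.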
Example 1.22. An option on the value $V = \xi \cdot S$ of a portfolio of several risky assets is sometimes called a basket or index option. For instance, a basket call would be of the form $(V - K)^+$. The asset on which the option is written is called the underlying asset or just the underlying. ◊
Put and call options can be used as building blocks for a large class of derivatives.
Example 1.23. A straddle is a combination of "at-the-money" put and call options on a portfolio $V = \xi \cdot S$, i.e., on put and call options with strike $K = \pi(V)$:
$$C = (\pi(V) - V)^+ + (V - \pi(V))^+ = |V - \pi(V)|.$$
Thus, the payoff of the straddle increases proportionally to the change of the price of $\xi$ between time 0 and time 1. In this sense, a straddle is a bet that the portfolio price will move, no matter in which direction. ◊


Example 1.24. The payoff of a butterfly spread is of the form
$$C = (K - |V - \pi(V)|)^+,$$
where $K > 0$ and where $V = \xi \cdot S$ is the price of a given portfolio or the value of a stock index. Clearly, the payoff of the butterfly spread is maximal if $V = \pi(V)$ and decreases if the price at time 1 of the portfolio $\xi$ deviates from its price at time 0. Thus, the butterfly spread is a bet that the portfolio price will stay close to its present value. ◊
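The payoff profiles of Examples 1.23 and 1.24 are simple functions of the terminal value $V$. A minimal sketch follows; the function names and the sample numbers ($\pi(V) = 100$, $K = 10$) are hypothetical. It evaluates both payoffs and confirms that the straddle is the sum of an at-the-money put and call.

```python
def straddle(v, price):
    """Payoff |V - pi(V)| of a straddle with strike pi(V)."""
    return abs(v - price)

def butterfly(v, price, k):
    """Payoff (K - |V - pi(V)|)^+ of a butterfly spread."""
    return max(k - abs(v - price), 0)

price, k = 100, 10
for v in (80, 95, 100, 105, 120):
    at_the_money_put = max(price - v, 0)
    at_the_money_call = max(v - price, 0)
    # The straddle is the sum of the two at-the-money options.
    assert straddle(v, price) == at_the_money_put + at_the_money_call
    print(v, straddle(v, price), butterfly(v, price, k))
```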
Exercise 1.3.1. Draw the payoffs of put and call options, a straddle, and a butterfly spread as functions of the underlying. ◊
Exercise 1.3.2. Consider a butterfly spread as in Example 1.24 and write its payoff as a combination of
(a) call options,
(b) put options
on the underlying. As in the put-call parity (1.10), such a decomposition determines the price of a butterfly spread once the prices of the corresponding put or call options have been fixed. ◊
Example 1.25. The idea of portfolio insurance is to increase exposure to rising asset prices, and to reduce exposure to falling prices. This suggests replacing the payoff $V = \xi \cdot S$ of a given portfolio by a modified profile $h(V)$, where $h$ is convex and increasing. Let us first consider the case where $V \ge 0$. Then the corresponding payoff $h(V)$ can be expressed as a combination of investments in bonds, in $V$ itself, and in basket call options on $V$. To see this, recall that convexity implies that
$$h(x) = h(0) + \int_0^x h'(y)\, dy$$
for the increasing right-hand derivative $h' := h'_+$ of $h$; see Appendix A.1. By the arguments in Lemma A.19, the increasing right-continuous function $h'$ can be represented as the distribution function of a positive Radon measure $\gamma$ on $[0, \infty)$: $h'(x) = \gamma([0, x])$ for $x \ge 0$. Recall that a positive Radon measure is a $\sigma$-additive measure that assigns to each Borel set $A \subset [0, \infty)$ a value $\gamma(A) \in [0, \infty]$, which is finite when $A$ is compact. An example is the Lebesgue measure on $[0, \infty)$. Using the representation $h'(x) = \gamma([0, x])$, Fubini's theorem implies that
$$h(x) = h(0) + \int_0^x \int_{[0, y]} \gamma(dz)\, dy = h(0) + \gamma(\{0\})\, x + \int_{(0, \infty)} \int_{\{y \,|\, z \le y \le x\}} dy\, \gamma(dz).$$
Since the inner integral equals $(x - z)^+$, we obtain
$$h(V) = h(0) + h'(0)\, V + \int_{(0, \infty)} (V - K)^+\, \gamma(dK). \tag{1.11}$$
◊
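Formula (1.11) can be tested numerically for a concrete convex profile. The sketch below takes $h(x) = x^2$ as an assumed example, so that $h(0) = h'(0) = 0$ and $\gamma(dK) = 2\,dK$, and approximates the integral over strikes by a midpoint rule. The function `call_integral` and its parameters are introduced here for illustration only.

```python
def call_integral(v, density, upper, n=10_000):
    """Midpoint rule for the integral of (v - K)^+ * density(K) over (0, upper)."""
    dk = upper / n
    total = 0.0
    for j in range(n):
        k = (j + 0.5) * dk
        total += max(v - k, 0.0) * density(k) * dk
    return total

def h(x):
    """Assumed convex profile h(x) = x^2, with h'(0) = 0 and gamma(dK) = 2 dK."""
    return x * x

v = 3.0
h_prime_at_zero = 0.0
# The integrand vanishes for K >= v, so integrating up to v is sufficient.
rhs = h(0.0) + h_prime_at_zero * v + call_integral(v, lambda k: 2.0, upper=v)
print(h(v), rhs)
```

In financial terms the integral is a continuum of call options with strikes $K$, each held in the amount $\gamma(dK)$.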

The formula (1.11) yields a representation of $h(V)$ in terms of investments in bonds, in $V = \xi \cdot S$ itself, and in call options on $V$. It requires, however, that $V$ is nonnegative and that both $h(0)$ and $h'(0)$ are finite. Also, it is sometimes more convenient to have a development around the initial value $\pi^V := \pi \cdot \xi$ of the portfolio $\xi$ than to have a development around zero. Corresponding extensions of formula (1.11) are explored in the following exercise.
Exercise 1.3.3. In this exercise, we consider the situation of Example 1.25 without insisting that the payoff $V = \xi \cdot S$ takes only nonnegative values. In particular, the portfolio $\xi$ may also contain short positions. Let $h : \mathbb R \to \mathbb R$ be a continuous function.
(a) Show that for convex $h$ there exists a nonnegative Radon measure $\gamma$ on $\mathbb R$ such that the payoff $h(V)$ can be realized by holding bonds, forward contracts, and a mixture of call and put options on $V$:
$$h(V) = h(\pi^V) + h'(\pi^V)(V - \pi^V) + \int_{(\pi^V, \infty)} (V - K)^+\, \gamma(dK) + \int_{(-\infty, \pi^V]} (K - V)^+\, \gamma(dK).$$
Note that the put and call options occurring in this formula are "out of the money" in the sense that their "intrinsic value", i.e., their value when $V$ is replaced by its present value $\pi^V$, is zero.
(b) Now let $h$ be any twice continuously differentiable function on $\mathbb R$. Deduce from part (a) that
$$h(V) = h(\pi^V) + h'(\pi^V)(V - \pi^V) + \int_{\pi^V}^{\infty} (V - K)^+\, h''(K)\, dK + \int_{-\infty}^{\pi^V} (K - V)^+\, h''(K)\, dK.$$
This formula is sometimes called the Breeden–Litzenberger formula.

Example 1.26. A reverse convertible bond pays interest which is higher than that earned by an investment into the riskless bond. But at maturity $t = 1$, the issuer may convert the bond into a predetermined number of shares of a given asset $S^i$ instead of paying the nominal value in cash. The purchase of this contract is equivalent to the purchase of a standard bond and the sale of a certain put option. More precisely, suppose that 1 is the price of the reverse convertible bond at $t = 0$, that its nominal value at maturity is $1 + \tilde r$, and that it can be converted into $x$ shares of the $i$th asset. This conversion will happen if the asset price $S^i$ is below $K := (1 + \tilde r)/x$. Thus, the payoff of the reverse convertible bond is equal to
$$1 + \tilde r - x(K - S^i)^+,$$
i.e., the purchase of this contract is equivalent to a risk-free investment of the amount $(1 + \tilde r)/(1 + r)$ with interest $r$ and the sale of the put option $x(K - S^i)^+$ for the price $(\tilde r - r)/(1 + r)$. ◊
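The decomposition in Example 1.26 amounts to the identity $\min(1 + \tilde r,\, x S^i) = 1 + \tilde r - x(K - S^i)^+$, where the left-hand side is one way to read the conversion rule: the holder receives the $x$ shares exactly when they are worth less than the nominal value. The sketch below checks this identity for hypothetical parameters ($\tilde r = 10\%$, $x = 1/50$, hence $K = 55$).

```python
from fractions import Fraction as F

def reverse_convertible(s, r_tilde, x):
    """Holder receives x shares if they are worth less than the nominal value."""
    return min(1 + r_tilde, x * s)

def bond_minus_put(s, r_tilde, x):
    """Nominal value minus x put options with strike K = (1 + r_tilde) / x."""
    k = (1 + r_tilde) / x
    return 1 + r_tilde - x * max(k - s, 0)

r_tilde, x = F(1, 10), F(1, 50)      # hypothetical contract terms
for s in (F(40), F(55), F(70)):      # asset prices below, at, and above K = 55
    print(s, reverse_convertible(s, r_tilde, x), bond_minus_put(s, r_tilde, x))
```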
Example 1.27. A discount certificate on $V = \xi \cdot S$ pays off the amount
$$C = V \wedge K,$$
where the number $K > 0$ is often called the cap. Since
$$C = V - (V - K)^+,$$
buying the discount certificate is the same as purchasing $\xi$ and selling the basket call option $C^{\text{call}} := (V - K)^+$. If the price $\pi(C^{\text{call}})$ has already been fixed, then the price of $C$ is given by $\pi(C) = \pi(V) - \pi(C^{\text{call}})$. Hence, the discount certificate is less expensive than the portfolio $\xi$ itself, and this explains the name. On the other hand, it participates in gains of $\xi$ only up to the cap $K$. ◊
Example 1.28. For an insurance company, it may be desirable to shift some of its
insurance risk to the financial market. As an example of such an alternative risk
transfer, consider a catastrophe bond issued by an insurance company. The interest
paid by this security depends on the occurrence of certain special events. For instance,
the contract may specify that no interest will be paid if more than a given number of
insured cars are damaged by hail on a single day during the lifetime of the contract; as
a compensation for taking this risk, the buyer will be paid an interest above the usual
market rate if this event does not occur. ◊
Mathematically, it will be convenient to focus on contingent claims whose payoff is non-negative. Such a contingent claim will be interpreted as a contract which is sold at time 0 and which pays a random amount $C(\omega) \ge 0$ at time 1. A derivative security whose terminal value may also become negative can usually be reduced to a combination of a non-negative contingent claim and a short position in some of the primary assets $S^0, S^1, \dots, S^d$. For instance, the terminal value of a reverse convertible bond is bounded from below so that it can be decomposed into a short position in cash and into a contract with positive value. From now on, we will work with the following formal definition of the term "contingent claim".

Definition 1.29. A contingent claim is a random variable $C$ on the underlying probability space $(\Omega, \mathcal F, P)$ such that
$$0 \le C < \infty \qquad P\text{-a.s.}$$
A contingent claim $C$ is called a derivative of the primary assets $S^0, \dots, S^d$ if it is measurable with respect to the $\sigma$-field $\sigma(S^0, \dots, S^d)$ generated by the assets, i.e., if
$$C = f(S^0, \dots, S^d)$$
for a measurable function $f$ on $\mathbb R^{d+1}$.
So far, we have only fixed the prices $\pi^i$ of our primary assets $S^i$. Thus, it is not clear what the correct price should be for a general contingent claim $C$. Our main goal in this section is to identify those possible prices which are compatible with the given prices in the sense that they do not generate arbitrage. Our approach is based on the observation that trading $C$ at time 0 for a price $\pi^C$ corresponds to introducing a new asset with the prices
$$\pi^{d+1} := \pi^C \qquad \text{and} \qquad S^{d+1} := C. \tag{1.12}$$

Definition 1.30. A real number $\pi^C \ge 0$ is called an arbitrage-free price of a contingent claim $C$ if the market model extended according to (1.12) is arbitrage-free. The set of all arbitrage-free prices for $C$ is denoted $\Pi(C)$.
In the previous definition, we made the implicit assumption that the introduction
of a contingent claim C as a new asset does not affect the prices of primary assets.
This assumption is reasonable as long as the trading volume of C is small compared
to that of the primary assets. In Section 3.6 we will discuss the equilibrium approach
to asset pricing, where an extension of the market will typically change the prices of
all traded assets.
The following result shows in particular that we can always find an arbitrage-free
price for a given contingent claim C if the initial model is arbitrage-free.
Theorem 1.31. Suppose that the set $\mathcal P$ of equivalent risk-neutral measures for the original market model is non-empty. Then the set of arbitrage-free prices of a contingent claim $C$ is non-empty and given by
$$\Pi(C) = \Big\{ E^*\Big[\,\frac{C}{1 + r}\,\Big] \,\Big|\, P^* \in \mathcal P \text{ such that } E^*[\,C\,] < \infty \Big\}. \tag{1.13}$$

Proof. By Theorem 1.7, $\pi^C$ is an arbitrage-free price for $C$ if and only if there exists an equivalent risk-neutral measure $\hat P$ for the market model extended via (1.12), i.e.,
$$\pi^i = \hat E\Big[\,\frac{S^i}{1 + r}\,\Big] \qquad \text{for } i = 1, \dots, d + 1.$$
In particular, $\hat P$ is necessarily contained in $\mathcal P$, and we obtain the inclusion $\subseteq$ in (1.13). Conversely, if $\pi^C = E^*[\,C/(1 + r)\,]$ for some $P^* \in \mathcal P$, then this $P^*$ is also an equivalent risk-neutral measure for the extended market model, and so the two sets in (1.13) are equal.
To show that $\Pi(C)$ is non-empty, we first fix some measure $\tilde P \approx P$ such that $\tilde E[\,C\,] < \infty$. For instance, we can take $d\tilde P = c(1 + C)^{-1}\, dP$, where $c$ is the normalizing constant. Under $\tilde P$, the market model is arbitrage-free. Hence, Theorem 1.7 yields $P^* \in \mathcal P$ such that $dP^*/d\tilde P$ is bounded. In particular, $E^*[\,C\,] < \infty$ and $\pi^C = E^*[\,C/(1 + r)\,] \in \Pi(C)$.
Exercise 1.3.4. Show that the set $\Pi(C)$ of arbitrage-free prices of a contingent claim is convex and hence an interval. ◊
The following theorem provides a dual characterization of the lower and upper bounds
$$\pi_{\inf}(C) := \inf \Pi(C) \qquad \text{and} \qquad \pi_{\sup}(C) := \sup \Pi(C),$$
which are often called arbitrage bounds for $C$.
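Theorem 1.32. In an arbitrage-free market model, the arbitrage bounds of a contingent claim $C$ are given by
$$\pi_{\inf}(C) = \inf_{P^* \in \mathcal P} E^*\Big[\,\frac{C}{1 + r}\,\Big] = \max\Big\{ m \in [0, \infty) \,\Big|\, \exists\, \xi \in \mathbb R^d \text{ with } m + \xi \cdot Y \le \frac{C}{1 + r} \ P\text{-a.s.} \Big\} \tag{1.14}$$
and
$$\pi_{\sup}(C) = \sup_{P^* \in \mathcal P} E^*\Big[\,\frac{C}{1 + r}\,\Big] = \min\Big\{ m \in [0, \infty] \,\Big|\, \exists\, \xi \in \mathbb R^d \text{ with } m + \xi \cdot Y \ge \frac{C}{1 + r} \ P\text{-a.s.} \Big\}.$$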

Proof. We only prove the identities for the upper arbitrage bound. The ones for the lower bound are obtained in a similar manner; see Exercise 1.3.5. We take $m \in [0, \infty]$ and $\xi \in \mathbb R^d$ such that $m + \xi \cdot Y \ge C/(1 + r)$ $P$-a.s., and we denote by $M$ the set of all such $m$. Taking the expectation with $P^* \in \mathcal P$ yields $m \ge E^*[\,C/(1 + r)\,]$, and we get
$$\inf M \ge \sup_{P^* \in \mathcal P} E^*\Big[\,\frac{C}{1 + r}\,\Big] \ge \sup\Big\{ E^*\Big[\,\frac{C}{1 + r}\,\Big] \,\Big|\, P^* \in \mathcal P,\ E^*[\,C\,] < \infty \Big\} = \pi_{\sup}(C), \tag{1.15}$$
where we have used Theorem 1.31 in the last identity.


Next we show that all inequalities in (1.15) are in fact identities. This is trivial if $\pi_{\sup}(C) = \infty$. For $\pi_{\sup}(C) < \infty$, we will show that $m > \pi_{\sup}(C)$ implies $m \ge \inf M$. By definition, $\pi_{\sup}(C) < m < \infty$ requires the existence of an arbitrage opportunity in the market model extended by $\pi^{d+1} := m$ and $S^{d+1} := C$. That is, there is $(\xi, \xi^{d+1}) \in \mathbb R^{d+1}$ such that $\xi \cdot Y + \xi^{d+1}(C/(1 + r) - m)$ is almost-surely non-negative and strictly positive with positive probability. Since the original market model is arbitrage-free, $\xi^{d+1}$ must be non-zero. In fact, we have $\xi^{d+1} < 0$ as taking expectations with respect to $P^* \in \mathcal P$ for which $E^*[\,C\,] < \infty$ yields
$$\xi^{d+1} \Big( E^*\Big[\,\frac{C}{1 + r}\,\Big] - m \Big) \ge 0,$$
and the term in parentheses is negative since $m > \pi_{\sup}(C)$. Thus, we may define $\zeta := -\xi/\xi^{d+1} \in \mathbb R^d$ and obtain $m + \zeta \cdot Y \ge C/(1 + r)$ $P$-a.s., hence $m \ge \inf M$.
We now prove that $\inf M$ belongs to $M$. To this end, we may assume without loss of generality that $\inf M < \infty$ and that the market model is non-redundant in the sense of Definition 1.15. For a sequence $m_n \in M$ that decreases towards $\inf M = \pi_{\sup}(C)$, we fix $\xi_n \in \mathbb R^d$ such that $m_n + \xi_n \cdot Y \ge C/(1 + r)$ $P$-almost surely. If $\liminf_n |\xi_n| < \infty$, there exists a subsequence of $(\xi_n)$ that converges to some $\xi \in \mathbb R^d$. Passing to the limit yields $\pi_{\sup}(C) + \xi \cdot Y \ge C/(1 + r)$ $P$-a.s., which gives $\pi_{\sup}(C) \in M$. But this is already the desired result, since the following argument will show that the case $\liminf_n |\xi_n| = \infty$ cannot occur. Indeed, after passing to some subsequence if necessary, $\eta_n := \xi_n/|\xi_n|$ converges to some $\eta \in \mathbb R^d$ with $|\eta| = 1$. Under the assumption that $|\xi_n| \to \infty$, passing to the limit in
$$\frac{m_n}{|\xi_n|} + \eta_n \cdot Y \ge \frac{C}{|\xi_n|(1 + r)} \qquad P\text{-a.s.}$$
yields $\eta \cdot Y \ge 0$. The absence of arbitrage opportunities thus implies $\eta \cdot Y = 0$ $P$-a.s., whence $\eta = 0$ by non-redundance of the model. But this contradicts the fact that $|\eta| = 1$.
Exercise 1.3.5. Prove the identity (1.14).


Remark 1.33. Theorem 1.32 shows that $\pi_{\sup}(C)$ is the lowest possible price of a portfolio $\bar\xi$ with
$$\bar\xi \cdot \bar S \ge C \qquad P\text{-a.s.}$$
Such a portfolio is often called a superhedging strategy or superreplication of $C$, and the identities for $\pi_{\inf}(C)$ and $\pi_{\sup}(C)$ obtained in Theorem 1.32 are often called superhedging duality relations. When using $\bar\xi$, the seller of $C$ would be protected against any possible future claims of the buyer of $C$. Thus, a natural goal for the seller would be to finance such a superhedging strategy from the proceeds of $C$. Conversely, the objective of the buyer would be to cover the price of $C$ from the sale of a portfolio $\bar\eta$ with
$$\bar\eta \cdot \bar S \le C \qquad P\text{-a.s.},$$
which is possible if and only if $\bar\pi \cdot \bar\eta \le \pi_{\inf}(C)$. Unless $C$ is an attainable payoff, however, neither objective can be fulfilled by trading $C$ at an arbitrage-free price, as shown in Corollary 1.35 below. Thus, any arbitrage-free price involves a trade-off between these two objectives. ◊
For a portfolio $\bar\xi$ the resulting payoff $V = \bar\xi \cdot \bar S$, if positive, may be viewed as a contingent claim, and in particular as a derivative. Those claims which can be replicated by a suitable portfolio will play a special role in the sequel.
Definition 1.34. A contingent claim $C$ is called attainable (replicable, redundant) if $C = \bar\xi \cdot \bar S$ $P$-a.s. for some $\bar\xi \in \mathbb R^{d+1}$. Such a portfolio strategy $\bar\xi$ is then called a replicating portfolio for $C$.
If one can show that a given contingent claim $C$ can be replicated by some portfolio $\bar\xi$, then the problem of determining a price for $C$ has a straightforward solution: The price of $C$ is unique and equal to the cost $\bar\xi \cdot \bar\pi$ of its replication, due to the law of one price. The following corollary also shows that the attainable contingent claims are in fact the only ones which admit a unique arbitrage-free price.
Corollary 1.35. Suppose the market model is arbitrage-free and C is a contingent
claim.
(a) C is attainable if and only if it admits a unique arbitrage-free price.
(b) If $C$ is not attainable, then $\pi_{\inf}(C) < \pi_{\sup}(C)$ and
$$\Pi(C) = \big( \pi_{\inf}(C),\, \pi_{\sup}(C) \big).$$
Proof. To prove part (a), note first that $|\Pi(C)| = 1$ if $C$ is attainable. The converse implication will follow from (b).
In order to prove part (b), note first that $\Pi(C)$ is an interval due to Exercise 1.3.4. To show that this interval is open, it suffices to exclude the possibility that it contains one of its boundary points $\pi_{\inf}(C)$ and $\pi_{\sup}(C)$. To this end, we use Theorem 1.32 to get $\xi \in \mathbb R^d$ such that
$$\pi_{\inf}(C) + \xi \cdot Y \le \frac{C}{1 + r} \qquad P\text{-a.s.}$$
Since $C$ is not attainable, this inequality cannot be an almost-sure identity. Hence, with $\xi^0 := \pi \cdot \xi - \pi_{\inf}(C)$, the strategy $(\xi^0, -\xi, 1) \in \mathbb R^{d+2}$ is an arbitrage opportunity in the market model extended by $\pi^{d+1} := \pi_{\inf}(C)$ and $S^{d+1} := C$. Therefore $\pi_{\inf}(C)$ is not an arbitrage-free price for $C$. The possibility $\pi_{\sup}(C) \in \Pi(C)$ is excluded by a similar argument.
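Remark 1.36. In Theorem 1.32, the set $\mathcal P$ of equivalent risk-neutral measures can be replaced by the set $\tilde{\mathcal P}$ of risk-neutral measures that are merely absolutely continuous with respect to $P$. That is,
$$\pi_{\inf}(C) = \inf_{\tilde P \in \tilde{\mathcal P}} \tilde E\Big[\,\frac{C}{1 + r}\,\Big] \qquad \text{and} \qquad \pi_{\sup}(C) = \sup_{\tilde P \in \tilde{\mathcal P}} \tilde E\Big[\,\frac{C}{1 + r}\,\Big], \tag{1.16}$$
for any contingent claim $C$. To prove this, note first that $\mathcal P \subset \tilde{\mathcal P}$, so that we get the two inequalities "$\ge$" and "$\le$" in (1.16). On the other hand, for $\tilde P \in \tilde{\mathcal P}$, $P^* \in \mathcal P$ with $E^*[\,C\,] < \infty$, and $\varepsilon \in (0, 1]$, the measure $P^*_\varepsilon := \varepsilon P^* + (1 - \varepsilon)\tilde P$ belongs to $\mathcal P$ and satisfies $E^*_\varepsilon[\,C\,] = \varepsilon E^*[\,C\,] + (1 - \varepsilon)\tilde E[\,C\,]$. Sending $\varepsilon \downarrow 0$ yields the converse inequalities. ◊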
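On a finite probability space with one risky asset, Remark 1.36 gives a simple recipe for the arbitrage bounds. The set $\tilde{\mathcal P}$ is then a polytope cut out by two linear equations, so its extreme points are carried by at most two scenarios, and a linear functional attains its infimum and supremum over the polytope at extreme points. The sketch below relies on this standard linear-programming fact, which is an addition and not taken from the text. The trinomial numbers and the function name are hypothetical, and the model is assumed to be arbitrage-free.

```python
from fractions import Fraction as F
from itertools import combinations

def arbitrage_bounds(s, payoff, pi, r):
    """Return (pi_inf, pi_sup) by enumerating the extreme points of the set of
    absolutely continuous risk-neutral measures (at most two scenarios each)."""
    target = pi * (1 + r)
    values = []
    for sk, ck in zip(s, payoff):
        if sk == target:                       # Dirac measure on one scenario
            values.append(ck / (1 + r))
    for (sa, ca), (sb, cb) in combinations(list(zip(s, payoff)), 2):
        if sa != sb and min(sa, sb) <= target <= max(sa, sb):
            w = (target - sa) / (sb - sa)      # weight on the second scenario
            values.append(((1 - w) * ca + w * cb) / (1 + r))
    return min(values), max(values)

# Hypothetical trinomial model: pi = 100, r = 0, S in {80, 100, 120}.
s = [F(80), F(100), F(120)]
K = F(100)
call_payoff = [max(sk - K, 0) for sk in s]
lower, upper = arbitrage_bounds(s, call_payoff, F(100), F(0))
print(lower, upper)
```

For this call option the set of arbitrage-free prices is the open interval between the two returned values, in agreement with Corollary 1.35, since the option is not attainable in the trinomial model.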
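Remark 1.37. Consider any arbitrage-free market model, and let $C^{\text{call}} = (S^i - K)^+$ be a call option on the $i$th asset with strike $K > 0$. Clearly, $C^{\text{call}} \le S^i$ so that
$$E^*\Big[\,\frac{C^{\text{call}}}{1 + r}\,\Big] \le \pi^i$$
for any $P^* \in \mathcal P$. From Jensen's inequality, we obtain the following lower bound:
$$E^*\Big[\,\frac{C^{\text{call}}}{1 + r}\,\Big] \ge \Big( E^*\Big[\,\frac{S^i}{1 + r}\,\Big] - \frac{K}{1 + r} \Big)^+ = \Big( \pi^i - \frac{K}{1 + r} \Big)^+.$$
Thus, the following universal bounds hold for any arbitrage-free market model:
$$\Big( \pi^i - \frac{K}{1 + r} \Big)^+ \le \pi_{\inf}(C^{\text{call}}) \le \pi_{\sup}(C^{\text{call}}) \le \pi^i. \tag{1.17}$$
For a put option $C^{\text{put}} = (K - S^i)^+$, one obtains the universal bounds
$$\Big( \frac{K}{1 + r} - \pi^i \Big)^+ \le \pi_{\inf}(C^{\text{put}}) \le \pi_{\sup}(C^{\text{put}}) \le \frac{K}{1 + r}. \tag{1.18}$$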


If $r \ge 0$, then the lower bound in (1.17) can be further reduced to $\pi_{\inf}(C^{\text{call}}) \ge (\pi^i - K)^+$. Informally, this inequality states that the value of the right to buy the $i$th asset at $t = 0$ for a price $K$ is strictly less than any arbitrage-free price for $C^{\text{call}}$. This fact is sometimes expressed by saying that the time value of a call option is non-negative. The quantity $(\pi^i - K)^+$ is called the intrinsic value of the call option. Observe that an analogue of this relation usually fails for put options: The left-hand side of (1.18) can only be bounded by its intrinsic value $(K - \pi^i)^+$ if $r \le 0$. If the intrinsic value of a put or call option is positive, then one says that the option is "in the money". For $\pi^i = K$ one speaks of an "at-the-money" option. Otherwise, the option is "out of the money". ◊
In many situations, the universal arbitrage bounds (1.17) and (1.18) are in fact attained, as illustrated by the following example.
Example 1.38. Take any market model with a single risky asset $S = S^1$ such that the distribution of $S$ under $P$ is concentrated on $\{0, 1, \dots\}$ with positive weights. Without loss of generality, we may assume that $S$ has under $P$ a Poisson distribution with parameter 1, i.e., $S$ is $P$-a.s. integer-valued and
$$P[\,S = k\,] = \frac{e^{-1}}{k!} \qquad \text{for } k = 0, 1, \dots.$$
If we take $r = 0$ and $\pi = 1$, then $P$ is a risk-neutral measure and the market model is arbitrage-free. We are going to show that the upper and lower bounds in (1.17) are attained for this model by using Remark 1.36. To this end, consider the measure $\tilde P \in \tilde{\mathcal P}$ which is defined by its density
$$\frac{d\tilde P}{dP} = e \cdot I_{\{S = 1\}}.$$
We get
$$\tilde E[\,(S - K)^+\,] = (1 - K)^+ = (\pi - K)^+,$$
so that the lower bound in (1.17) is attained, i.e., we have
$$\pi_{\inf}((S - K)^+) = (\pi - K)^+.$$
To see that also the upper bound is sharp, we define
$$g_n(k) := \Big( e - \frac{e}{n} \Big) \cdot I_{\{0\}}(k) + (n - 1)! \cdot e \cdot I_{\{n\}}(k), \qquad k = 0, 1, \dots.$$
It is straightforward to check that
$$d\tilde P_n := g_n(S)\, dP$$
defines a measure $\tilde P_n \in \tilde{\mathcal P}$ such that
$$\tilde E_n[\,(S - K)^+\,] = \Big( 1 - \frac{K}{n} \Big)^+.$$
By sending $n \uparrow \infty$, we see that also the upper bound in (1.17) is attained:
$$\pi_{\sup}((S - K)^+) = \pi.$$
Furthermore, the put-call parity (1.10) shows that the universal bounds (1.18) for put options are attained as well. ◊
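The claims about $\tilde P_n$ in Example 1.38 can be confirmed with a few lines of floating-point arithmetic. In the sketch below the function names and the test values $n = 50$, $K = 5$ are hypothetical choices. It checks that $g_n(S)\,dP$ has total mass 1, that $S$ has mean $\pi = 1$ under $\tilde P_n$, and that the call is valued at $(1 - K/n)^+$.

```python
import math

def poisson1(k):
    """P[S = k] for a Poisson distribution with parameter 1."""
    return math.exp(-1.0) / math.factorial(k)

def g(n, k):
    """Density g_n from Example 1.38; it vanishes outside {0, n}."""
    if k == 0:
        return math.e - math.e / n
    if k == n:
        return math.factorial(n - 1) * math.e
    return 0.0

def tilde_expect(n, f):
    """Expectation of f(S) under g_n(S) dP; only the states 0 and n carry mass."""
    return sum(f(k) * g(n, k) * poisson1(k) for k in (0, n))

n, K = 50, 5.0
mass = tilde_expect(n, lambda k: 1.0)
mean = tilde_expect(n, lambda k: float(k))
call_value = tilde_expect(n, lambda k: max(k - K, 0.0))
print(mass, mean, call_value)
```

Increasing `n` pushes the call value towards $\pi = 1$, which is the upper bound in (1.17) for this model.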
Exercise 1.3.6. We consider the market model from Exercise 1.1.2 and suppose that $a < \pi(1 + r) < b$ so that the model is arbitrage-free. Let $C$ be a derivative that is given by $C = h(S)$, where $h \ge 0$ is a convex function. Show that
$$\pi_{\sup}(C) = \frac{h(b)}{1 + r} \cdot \frac{\pi(1 + r) - a}{b - a} + \frac{h(a)}{1 + r} \cdot \frac{b - \pi(1 + r)}{b - a}.$$

Exercise 1.3.7. In an arbitrage-free market model, we consider a derivative $C$ that is given by $C = h(S^1)$, where $h \ge 0$ is a convex function. Derive the following arbitrage bounds for $C$:
$$\pi_{\inf}(C) \ge \frac{h(\pi^1(1 + r))}{1 + r} \qquad \text{and} \qquad \pi_{\sup}(C) \le \frac{h(0)}{1 + r} + \lim_{x \uparrow \infty} \frac{h(x)}{x}\, \pi^1.$$

1.4 Complete market models

Our goal in this section is to characterize the particularly transparent situation in which
all contingent claims are attainable.
Definition 1.39. An arbitrage-free market model is called complete if every contingent claim is attainable.
The following theorem characterizes the class of all complete market models. It is
sometimes called the second fundamental theorem of asset pricing.
Theorem 1.40. An arbitrage-free market model is complete if and only if there exists
exactly one risk-neutral probability measure, i.e., if $|\mathcal P| = 1$.
Proof. If the model is complete, then the indicator $I_A$ of each set $A \in \mathcal F$ is an attainable contingent claim. Hence, Corollary 1.35 implies that $P^*[\,A\,] = E^*[\,I_A\,]$
is independent of $P^* \in \mathcal P$. Consequently, there is just one risk-neutral probability
measure.


Conversely, suppose that $\mathcal P = \{P^*\}$. If $C$ is a contingent claim, then Theorem 1.31
states that the set $\Pi(C)$ of arbitrage-free prices is non-empty and given by
$$\Pi(C) = \Big\{\, E^*\Big[\,\frac{C}{1 + r}\,\Big] \;\Big|\; P^* \in \mathcal P \text{ such that } E^*[\,C\,] < \infty \,\Big\}.$$
Since $\mathcal P$ has just one element, the same must hold for $\Pi(C)$. Hence, Corollary 1.35
implies that $C$ is attainable.
We will now show that every complete market model has a finite structure and can
be reduced to a finite probability space. To this end, observe first that in every market
model the following inclusion holds for each $P^* \in \mathcal P$:
$$\mathcal V = \{\, \bar\xi \cdot \bar S \mid \bar\xi \in \mathbb R^{d+1} \,\} \subseteq L^1(\Omega, \sigma(S^1, \dots, S^d), P^*) \subseteq L^0(\Omega, \mathcal F, P^*) = L^0(\Omega, \mathcal F, P); \tag{1.19}$$
see Appendix A.7 for the definition of $L^p$-spaces. If the market is complete then all of
these inclusions are in fact equalities. In particular, $\mathcal F$ coincides with $\sigma(S^1, \dots, S^d)$
modulo $P$-null sets, and every contingent claim coincides $P$-a.s. with a derivative of
the traded assets. Since the linear space $\mathcal V$ is finite-dimensional, it follows that the
same must be true of $L^0(\Omega, \mathcal F, P)$. But this means that the model can be reduced to
a finite number of relevant scenarios. This observation can be made precise by using
the notion of an atom of the probability space $(\Omega, \mathcal F, P)$. Recall that a set $A \in \mathcal F$ is
called an atom of $(\Omega, \mathcal F, P)$, if $P[\,A\,] > 0$ and if each $B \in \mathcal F$ with $B \subseteq A$ satisfies
either $P[\,B\,] = 0$ or $P[\,B\,] = P[\,A\,]$.
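Proposition 1.41. For $p \in [0, \infty]$, the dimension of the linear space $L^p(\Omega, \mathcal F, P)$ is
given by
$$\dim L^p(\Omega, \mathcal F, P) = \sup\big\{\, n \in \mathbb N \mid \exists \text{ partition } A^1, \dots, A^n \text{ of } \Omega \text{ with } A^i \in \mathcal F \text{ and } P[\,A^i\,] > 0 \,\big\}. \tag{1.20}$$
Moreover, $n := \dim L^p(\Omega, \mathcal F, P) < \infty$ if and only if there exists a partition of $\Omega$
into $n$ atoms of $(\Omega, \mathcal F, P)$.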
Proof. Suppose that there is a partition $A^1, \dots, A^n$ of $\Omega$ such that $A^i \in \mathcal F$ and
$P[\,A^i\,] > 0$. The corresponding indicator functions $I_{A^1}, \dots, I_{A^n}$ can be regarded
as linearly independent vectors in $L^p := L^p(\Omega, \mathcal F, P)$. Thus $\dim L^p \ge n$. Consequently, it suffices to consider only the case in which the right-hand side of (1.20)
is a finite number, $n_0$. If $A^1, \dots, A^{n_0}$ is a corresponding partition, then each $A^i$ is
an atom because otherwise $n_0$ would not be maximal. Thus, any $Z \in L^p$ is $P$-a.s.
constant on each $A^i$. If we denote the value of $Z$ on $A^i$ by $z^i$, then
$$Z = \sum_{i=1}^{n_0} z^i I_{A^i} \quad P\text{-a.s.}$$
Hence, the indicator functions $I_{A^1}, \dots, I_{A^{n_0}}$ form a basis of $L^p$, and this implies
$\dim L^p = n_0$.
Since for a complete market model the inclusions in (1.19) are in fact equalities,
we have
$$\dim L^0(\Omega, \mathcal F, P) = \dim \mathcal V \le d + 1,$$
with equality when the model is non-redundant. Together with Proposition 1.41, this
implies the following result on the structure of complete market models.
Corollary 1.42. For every complete market model there exists a partition of $\Omega$ into
at most $d + 1$ atoms of $(\Omega, \mathcal F, P)$.
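On a finite sample space in which every scenario has positive probability, each scenario is an atom, so the dimension count above turns into a rank condition: every claim is attainable exactly when the payoff vectors of the assets (bond included) span $\mathbb R^N$, where $N$ is the number of scenarios. The sketch below illustrates this under stated assumptions: the payoff numbers are hypothetical, the helper names are illustrative, and arbitrage-freeness is assumed rather than checked.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def spans_all_claims(payoffs):
    """payoffs[i][j] is the time-1 payoff of asset i (bond included) in scenario j."""
    n_scenarios = len(payoffs[0])
    return rank(payoffs) == n_scenarios


if __name__ == "__main__":
    bond2, stock2 = [1, 1], [120, 90]
    print(spans_all_claims([bond2, stock2]))         # two scenarios, two assets

    bond3, stock3 = [1, 1, 1], [120, 100, 90]
    print(spans_all_claims([bond3, stock3]))         # three scenarios, two assets
    call3 = [max(s - 100, 0) for s in stock3]        # call with strike 100
    print(spans_all_claims([bond3, stock3, call3]))  # call added as third asset
```

With $d + 1$ assets the rank is at most $d + 1$, which is the finite-space counterpart of the bound in Corollary 1.42.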
Example 1.43. Consider the simple situation where the sample space $\Omega$ consists of
two elements $\omega^+$ and $\omega^-$, and where the measure $P$ is such that
$$p := P[\,\{\omega^+\}\,] \in (0, 1).$$
We assume that there is one single risky asset, which takes at time $t = 1$ the two
values $b$ and $a$ with the respective probabilities $p$ and $1 - p$, where $a$ and $b$ are such
that $0 \le a < b$:

[Diagram: one-period binary tree. From the initial price $\pi$, the asset moves to $S(\omega^+) = b$ with probability $p$ and to $S(\omega^-) = a$ with probability $1 - p$.]

This model does not admit arbitrage if and only if
$$\pi(1 + r) \in \big\{\, \tilde E[\,S\,] \mid \tilde P \approx P \,\big\} = \big\{\, \tilde p\, b + (1 - \tilde p)\, a \mid \tilde p \in (0, 1) \,\big\} = (a, b); \tag{1.21}$$
see also Example 1.10. In this case, the model is also complete: Any risk-neutral
measure $P^*$ must satisfy
$$\pi(1 + r) = E^*[\,S\,] = p^* b + (1 - p^*)\, a,$$
and this condition uniquely determines the parameter $p^* = P^*[\,\{\omega^+\}\,]$ as
$$p^* = \frac{\pi(1 + r) - a}{b - a} \in (0, 1).$$
Hence $|\mathcal P| = 1$, and completeness follows from Theorem 1.40. Alternatively, we can
directly verify completeness by showing that a given contingent claim $C$ is attainable
if (1.21) holds. Observe that the condition
$$C(\omega) = \xi^0 S^0(\omega) + \xi S(\omega) = \xi^0 (1 + r) + \xi S(\omega) \quad \text{for all } \omega \in \Omega$$

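is a system of two linear equations for the two real variables $\xi^0$ and $\xi$. The solution is
given by
$$\xi = \frac{C(\omega^+) - C(\omega^-)}{b - a} \quad \text{and} \quad \xi^0 = \frac{C(\omega^-)\, b - C(\omega^+)\, a}{(b - a)(1 + r)}.$$
Therefore, the unique arbitrage-free price of $C$ is
$$\pi(C) = \bar\pi \cdot \bar\xi = \frac{C(\omega^+)}{1 + r} \cdot \frac{\pi(1 + r) - a}{b - a} + \frac{C(\omega^-)}{1 + r} \cdot \frac{b - \pi(1 + r)}{b - a}.$$
For a call option $C = (S - K)^+$ with strike $K \in [a, b]$, we have
$$\pi\big((S - K)^+\big) = \frac{b - K}{b - a} \cdot \pi - \frac{(b - K)\, a}{b - a} \cdot \frac{1}{1 + r}. \tag{1.22}$$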
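The replication formulas can be cross-checked numerically. The sketch below computes the price of a claim in the binary model in three ways: as the cost of the replicating portfolio, as the discounted expectation under $p^*$, and, for a call, via formula (1.22). It assumes a bond with price 1 at time 0 and payoff $1 + r$ at time 1, as in the example; the function names are illustrative and not taken from the text.

```python
from fractions import Fraction


def replicate(c_up, c_down, a, b, r):
    """Solve xi0*(1+r) + xi*b = c_up and xi0*(1+r) + xi*a = c_down."""
    c_up, c_down, a, b, r = map(Fraction, (c_up, c_down, a, b, r))
    xi = (c_up - c_down) / (b - a)
    xi0 = (c_down * b - c_up * a) / ((b - a) * (1 + r))
    return xi0, xi


def price_by_replication(c_up, c_down, pi, a, b, r):
    """Cost xi0 * 1 + xi * pi of the replicating portfolio at time 0."""
    xi0, xi = replicate(c_up, c_down, a, b, r)
    return xi0 + xi * Fraction(pi)


def price_by_risk_neutral_measure(c_up, c_down, pi, a, b, r):
    """Discounted expectation under p* = (pi(1+r) - a) / (b - a)."""
    c_up, c_down, pi, a, b, r = map(Fraction, (c_up, c_down, pi, a, b, r))
    p_star = (pi * (1 + r) - a) / (b - a)
    return (p_star * c_up + (1 - p_star) * c_down) / (1 + r)


def call_price_1_22(strike, pi, a, b, r):
    """Formula (1.22) for a call with strike K in [a, b]."""
    strike, pi, a, b, r = map(Fraction, (strike, pi, a, b, r))
    return (b - strike) / (b - a) * pi - (b - strike) * a / (b - a) / (1 + r)


if __name__ == "__main__":
    pi, a, b, r, strike = 100, 90, 120, Fraction(1, 20), 100
    c_up, c_down = max(b - strike, 0), max(a - strike, 0)
    print(replicate(c_up, c_down, a, b, r))
    print(price_by_replication(c_up, c_down, pi, a, b, r))
    print(price_by_risk_neutral_measure(c_up, c_down, pi, a, b, r))
    print(call_price_1_22(strike, pi, a, b, r))
```

The three prices agree because the objective probability $p$ never enters any of the formulas.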

Note that this price is independent of $p$ and increasing in $r$, while the classical discounted expectation with respect to the objective measure $P$,
$$E\Big[\,\frac{C}{1 + r}\,\Big] = \frac{p\,(b - K)}{1 + r},$$
is decreasing in $r$ and increasing in $p$.
In this example, one can illustrate how options can be used to modify the risk of a
position. Consider the particular case in which the risky asset can be bought at time
$t = 0$ for the price $\pi = 100$. At time $t = 1$, the price is either $S(\omega^+) = b = 120$ or
$S(\omega^-) = a = 90$, both with positive probability. If we invest in the risky asset, the
corresponding returns are given by
$$R(S)(\omega^+) = +20\,\% \quad \text{or} \quad R(S)(\omega^-) = -10\,\%.$$

Now consider a call option $C := (S - K)^+$ with strike $K = 100$. Choosing $r = 0$,
the price of the call option is
$$\pi(C) = \frac{20}{3} \approx 6.67$$
from formula (1.22). Hence the return
$$R(C) = \frac{(S - K)^+ - \pi(C)}{\pi(C)}$$
on the initial investment $\pi(C)$ equals
$$R(C)(\omega^+) = \frac{20 - \pi(C)}{\pi(C)} = +200\,\%$$
or
$$R(C)(\omega^-) = \frac{0 - \pi(C)}{\pi(C)} = -100\,\%,$$
according to the outcome of the market at time $t = 1$. Here we see a dramatic increase
of both profit opportunity and risk; this is sometimes referred to as the leverage effect
of options.
On the other hand, we could reduce the risk of holding the asset by holding a
combination
$$\tilde C := (K - S)^+ + S$$
of a put option and the asset itself. This portfolio insurance will of course involve an
additional cost. If we choose our parameters as above, then the put-call parity (1.10)
yields that the price of the put option $(K - S)^+$ is equal to $20/3$. Thus, in order to
hold both $S$ and a put, we must invest the capital $100 + 20/3$ at time $t = 0$. At time
$t = 1$, we have an outcome of either 120 or of 100 so that the return of $\tilde C$ is given by
$$R(\tilde C)(\omega^+) = +12.5\,\% \quad \text{and} \quad R(\tilde C)(\omega^-) = -6.25\,\%.$$
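The numbers in this illustration can be reproduced with a few lines of exact arithmetic. The following sketch uses the parameters of the example ($\pi = 100$, $b = 120$, $a = 90$, $K = 100$, $r = 0$); the variable and function names are illustrative assumptions, and put-call parity is used in the form "call minus put equals $\pi - K/(1 + r)$".

```python
from fractions import Fraction

PI, UP, DOWN, STRIKE, RATE = 100, 120, 90, 100, 0


def returns(payoff_up, payoff_down, cost):
    """Return (payoff - cost) / cost in the up and in the down scenario."""
    cost = Fraction(cost)
    return (payoff_up - cost) / cost, (payoff_down - cost) / cost


# Risk-neutral probability of the up move, p* = (pi(1+r) - a) / (b - a).
p_star = Fraction(PI * (1 + RATE) - DOWN, UP - DOWN)

call_up, call_down = max(UP - STRIKE, 0), max(DOWN - STRIKE, 0)
call_price = (p_star * call_up + (1 - p_star) * call_down) / (1 + RATE)

# Put-call parity: call - put = pi - K / (1 + r).
put_price = call_price - PI + Fraction(STRIKE) / (1 + RATE)

stock_returns = returns(UP, DOWN, PI)
call_returns = returns(call_up, call_down, call_price)
# Portfolio insurance: hold the asset plus one put, with payoff max(S, K).
insured_returns = returns(max(UP, STRIKE), max(DOWN, STRIKE), PI + put_price)

if __name__ == "__main__":
    print("call price:", call_price, "put price:", put_price)
    for name, (up, down) in [("stock", stock_returns),
                             ("call", call_returns),
                             ("insured", insured_returns)]:
        print(f"{name}: {float(up):+.2%} / {float(down):+.2%}")
```

The call magnifies both the gain and the loss relative to the stock, while the insured position narrows the range of returns at the cost of a lower upside.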

Exercise 1.4.1. We consider the following three market models.

(A) $\Omega = \{\omega_1, \omega_2\}$ with $r = \frac{1}{9}$ and one risky asset with prices
$$\pi^1 = 5, \quad S^1(\omega_1) = \frac{20}{3}, \quad S^1(\omega_2) = \frac{49}{9}.$$
(B) $\Omega = \{\omega_1, \omega_2, \omega_3\}$ with $r = \frac{1}{9}$ and one risky asset with prices
$$\pi^1 = 5, \quad S^1(\omega_1) = \frac{20}{3}, \quad S^1(\omega_2) = \frac{49}{9}, \quad S^1(\omega_3) = \frac{10}{3}.$$
(C) $\Omega = \{\omega_1, \omega_2, \omega_3\}$ with $r = \frac{1}{9}$ and two risky assets with prices
$$\pi = \begin{pmatrix} 5 \\ 10 \end{pmatrix}, \quad S(\omega_1) = \begin{pmatrix} 20/3 \\ 40/3 \end{pmatrix}, \quad S(\omega_2) = \begin{pmatrix} 20/3 \\ 80/9 \end{pmatrix}, \quad S(\omega_3) = \begin{pmatrix} 40/9 \\ 80/9 \end{pmatrix}.$$

Each of these models is endowed with a probability measure that assigns strictly positive probability to each element of the corresponding sample space $\Omega$.
(a) Which of these models are arbitrage-free? For those that are, describe the set
$\mathcal P$ of equivalent risk-neutral measures. For those that are not, find an arbitrage
opportunity.
(b) Discuss the completeness of those models that are arbitrage-free. For those that
are not complete find non-attainable contingent claims.
◊


Exercise 1.4.2. Let $\Omega = \{\omega_1, \omega_2, \omega_3\}$ be endowed with a probability measure $P$
such that $P[\,\{\omega_i\}\,] > 0$ for $i = 1, 2, 3$ and consider the market model with $r = 0$
and one risky asset with prices $\pi^1 = 1$ and $0 < S^1(\omega_1) < S^1(\omega_2) < S^1(\omega_3)$. We
suppose that the model is arbitrage-free.
(a) Describe the following objects as subsets of three-dimensional Euclidean space
$\mathbb R^3$:
(i) the set $\mathcal P$ of equivalent risk-neutral measures;
(ii) the set $\tilde{\mathcal P}$ of absolutely continuous risk-neutral measures;
(iii) the set of attainable contingent claims.
(b) Find an example of a non-attainable contingent claim.
(c) Show that the supremum
$$\sup_{\tilde P \in \tilde{\mathcal P}} \tilde E[\,C\,] \tag{1.23}$$
is attained for every contingent claim $C$.
(d) Let $C$ be a contingent claim. Give a direct and elementary proof of the fact that
the map that assigns to each $\tilde P \in \tilde{\mathcal P}$ the expectation $\tilde E[\,C\,]$ is constant if and
only if the supremum (1.23) is attained in some element of $\mathcal P$.
◊
Exercise 1.4.3. Let $\Omega = \{\omega_1, \dots, \omega_N\}$ be endowed with a probability measure $P$
such that $P[\,\{\omega_i\}\,] > 0$ for $i = 1, \dots, N$. On this probability space we consider a
market model with interest rate $r = 0$ and with one risky asset whose prices satisfy
$\pi^1 = 1$ and
$$0 < S^1(\omega_1) < S^1(\omega_2) < \cdots < S^1(\omega_N).$$
Show that there are strikes $K_1, \dots, K_{N-2} > 0$ and prices $\pi^C(K_i)$ such that the
corresponding call options $(S^1 - K_i)^+$ complete the market in the following sense:
the market model extended by the risky assets with prices
$$\pi^i := \pi^C(K_{i-1}) \quad \text{and} \quad S^i := (S^1 - K_{i-1})^+, \quad i = 2, \dots, N - 1,$$
is arbitrage-free and complete.
◊

Exercise 1.4.4. Let $\Omega = \{\omega_1, \dots, \omega_{N+1}\}$ be endowed with a probability measure $P$
such that $P[\,\{\omega_i\}\,] > 0$ for $i = 1, \dots, N + 1$.
(a) On this probability space we consider a non-redundant and arbitrage-free market model with $d$ risky assets and prices $\bar\pi \in \mathbb R^{d+1}$ and $\bar S$, where $d < N$.
Show that this market model can be extended by additional assets with prices
$\pi^{d+1}, \dots, \pi^N$ and $S^{d+1}, \dots, S^N$ in such a way that the extended market
model is arbitrage-free and complete.
(b) Let specifically $N = 2$, $d = 1$, $\pi^1 = 2$, and
$$S^1(\omega_i) = \begin{cases} 1 & \text{for } i = 1, \\ 2 & \text{for } i = 2, \\ 3 & \text{for } i = 3. \end{cases}$$
We suppose furthermore that the risk-free interest rate $r$ is chosen such that
the model is arbitrage-free. Find a non-attainable contingent claim. Then find
an extended model that is arbitrage-free and complete. Finally determine the
unique equivalent risk-neutral measure $P^*$ in the extended model.
◊

1.5 Geometric characterization of arbitrage-free models

The fundamental theorem of asset pricing in the form of Theorem 1.7 states that a
market model is arbitrage-free if and only if the origin is contained in the set
$$M_b(Y, P) := \Big\{\, E_Q[\,Y\,] \;\Big|\; Q \approx P, \ \frac{dQ}{dP} \text{ is bounded}, \ E_Q[\,|Y|\,] < \infty \,\Big\} \subset \mathbb R^d,$$
where $Y = (Y^1, \dots, Y^d)$ is the random vector of discounted net gains defined in
(1.2). The aim of this section is to give a geometric description of the set $M_b(Y, P)$
as well as of the larger set
$$M(Y, P) := \big\{\, E_Q[\,Y\,] \mid Q \approx P, \ E_Q[\,|Y|\,] < \infty \,\big\}.$$
To this end, it will be convenient to work with the distribution
$$\mu := P \circ Y^{-1}$$
of $Y$ with respect to $P$. That is, $\mu$ is a Borel probability measure on $\mathbb R^d$ such that
$$\mu(A) = P[\,Y \in A\,] \quad \text{for each Borel set } A \subset \mathbb R^d.$$
If $\nu$ is a Borel probability measure on $\mathbb R^d$ such that $\int |y| \, \nu(dy) < \infty$, we will call
$\int y \, \nu(dy)$ its barycenter.
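Lemma 1.44. We have
$$M_b(Y, P) = M_b(\mu) := \Big\{\, \int y \, \nu(dy) \;\Big|\; \nu \approx \mu, \ \frac{d\nu}{d\mu} \text{ is bounded}, \ \int |y| \, \nu(dy) < \infty \,\Big\},$$
and
$$M(Y, P) = M(\mu) := \Big\{\, \int y \, \nu(dy) \;\Big|\; \nu \approx \mu, \ \int |y| \, \nu(dy) < \infty \,\Big\}.$$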


Proof. If $\nu \approx \mu$ is a Borel probability measure on $\mathbb R^d$, then the Radon-Nikodym
derivative of $\nu$ with respect to $\mu$ evaluated at the random variable $Y$ defines a probability measure $Q \approx P$ on $(\Omega, \mathcal F)$:
$$\frac{dQ}{dP}(\omega) := \frac{d\nu}{d\mu}\big(Y(\omega)\big).$$
Clearly, $E_Q[\,Y\,] = \int y \, \nu(dy)$. This shows that $M(\mu) \subseteq M(Y, P)$ and $M_b(\mu) \subseteq M_b(Y, P)$.
Conversely, if $\tilde Q$ is a given probability measure on $(\Omega, \mathcal F)$ which is equivalent to
$P$, then the Radon-Nikodym theorem in Appendix A.2 shows that the distribution
$\tilde\nu := \tilde Q \circ Y^{-1}$ must be equivalent to $\mu$, whence $M(Y, P) \subseteq M(\mu)$. Moreover,
it follows from Proposition A.11 that the density $d\tilde\nu / d\mu$ is bounded if $d\tilde Q / dP$ is
bounded, and so $M_b(Y, P) \subseteq M_b(\mu)$ also follows.
By the above lemma, the characterization of the two sets $M_b(Y, P)$ and $M(Y, P)$
is reduced to a problem for Borel probability measures on $\mathbb R^d$. Here and in the sequel,
we do not need the fact that $\mu$ is the distribution of the lower bounded random vector $Y$
of discounted net gains; our results are true for arbitrary $\mu$ such that $\int |y| \, \mu(dy) < \infty$;
see also Remark 1.9.
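Definition 1.45. The support of a Borel probability measure $\nu$ on $\mathbb R^d$ is the smallest
closed set $A \subset \mathbb R^d$ such that $\nu(A^c) = 0$, and it will be denoted by $\operatorname{supp} \nu$.
The support of a measure $\nu$ can be obtained as the intersection of all closed sets $A$
with $\nu(A^c) = 0$, i.e.,
$$\operatorname{supp} \nu = \bigcap_{\substack{A \text{ closed} \\ \nu(A^c) = 0}} A.$$
We denote by
$$\Gamma(\mu) := \operatorname{conv}(\operatorname{supp} \mu) = \Big\{\, \sum_{k=1}^n \alpha_k y_k \;\Big|\; \alpha_k \ge 0, \ \sum_{k=1}^n \alpha_k = 1, \ y_k \in \operatorname{supp} \mu, \ n \in \mathbb N \,\Big\}$$
the convex hull of the support of $\mu$. Thus, $\Gamma(\mu)$ is the smallest convex set which
contains $\operatorname{supp} \mu$; see also Appendix A.1.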
Example 1.46. Take $d = 1$, and consider the measure
$$\mu = \frac{1}{2} (\delta_{-1} + \delta_{+1}).$$
Clearly, the support of $\mu$ is equal to $\{-1, +1\}$ and so $\Gamma(\mu) = [-1, +1]$. A measure $\nu$
is equivalent to $\mu$ if and only if
$$\nu = \alpha \delta_{-1} + (1 - \alpha) \delta_{+1}$$
for some $\alpha \in (0, 1)$. Hence, $M_b(\mu) = M(\mu) = (-1, +1)$.

The previous example gives the correct intuition, namely that one always has the
inclusions
$$M_b(\mu) \subset M(\mu) \subset \Gamma(\mu).$$
But while the first inclusion will turn out to be an identity, the second inclusion is usually strict. Characterizing $M(\mu)$ in terms of $\Gamma(\mu)$ will involve the following concept.
Definition 1.47. The relative interior of a convex set $C \subset \mathbb R^d$ is the set of all points
$x \in C$ such that for all $y \in C$ there exists some $\varepsilon > 0$ with
$$x - \varepsilon (y - x) \in C.$$
The relative interior of $C$ is denoted $\operatorname{ri} C$.
If the convex set $C$ has non-empty topological interior $\operatorname{int} C$, then $\operatorname{ri} C = \operatorname{int} C$, and
the elementary properties of the relative interior collected in the following remarks
become obvious. This applies in particular to the set $\Gamma(\mu)$ if the non-redundance
condition (1.8) is satisfied. For the general case, proofs of these statements can be
found, for instance, in §6 of [221].
Remark 1.48. Let $C$ be a non-empty convex subset of $\mathbb R^d$, and consider the affine
hull $\operatorname{aff} C$ spanned by $C$, i.e., the smallest affine set which contains $C$. If we identify
$\operatorname{aff} C$ with some $\mathbb R^n$, then the relative interior of $C$ is equal to the topological interior
of $C$, considered as a subset of $\operatorname{aff} C \cong \mathbb R^n$. In particular, each non-empty convex set
has non-empty relative interior.
◊
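Exercise 1.5.1. Let $C$ be a non-empty convex subset of $\mathbb R^d$ and denote by $\overline{C}$ its
closure. Show that for $x \in \operatorname{ri} C$,
$$\alpha x + (1 - \alpha) y \in \operatorname{ri} C \quad \text{for all } y \in \overline{C} \text{ and all } \alpha \in (0, 1]. \tag{1.24}$$
In particular, $\operatorname{ri} C$ is convex. Moreover, show that the operations of taking the closure
or the relative interior of a convex set $C$ are consistent with each other:
$$\operatorname{ri} \overline{C} = \operatorname{ri} C \quad \text{and} \quad \overline{\operatorname{ri} C} = \overline{C}. \tag{1.25}$$
◊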


After these preparations, we can now state the announced geometric characterization of the set $M_b(\mu)$. Note that the proof of this characterization relies on the
fundamental theorem of asset pricing in the form of Theorem 1.7.
Theorem 1.49. The set of all barycenters of probability measures $\nu \approx \mu$ coincides
with the relative interior of the convex hull of the support of $\mu$. More precisely,
$$M_b(\mu) = M(\mu) = \operatorname{ri} \Gamma(\mu).$$
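Proof. In a first step, we show the inclusion $\operatorname{ri} \Gamma(\mu) \subseteq M_b(\mu)$. Suppose we are given
$m \in \operatorname{ri} \Gamma(\mu)$. Let $\tilde\mu$ denote the translated measure
$$\tilde\mu(A) := \mu(A + m) \quad \text{for Borel sets } A \subset \mathbb R^d,$$
where $A + m := \{\, x + m \mid x \in A \,\}$. Then $M_b(\tilde\mu) = M_b(\mu) - m$, and analogous
identities hold for $M(\tilde\mu)$ and $\Gamma(\tilde\mu)$. It follows that there is no loss of generality in
assuming that $m = 0$, i.e., we must show that $0 \in M_b(\mu)$ if $0 \in \operatorname{ri} \Gamma(\mu)$.
We claim that $0 \in \operatorname{ri} \Gamma(\mu)$ implies the following no-arbitrage condition:
$$\text{If } \xi \in \mathbb R^d \text{ is such that } \xi \cdot y \ge 0 \text{ for } \mu\text{-a.e. } y, \text{ then } \xi \cdot y = 0 \text{ for } \mu\text{-a.e. } y. \tag{1.26}$$
If (1.26) is false, then we can find some $\xi \in \mathbb R^d$ such that $\xi \cdot y \ge 0$ for $\mu$-a.e. $y$ but
$\mu(\{\, y \mid \xi \cdot y > \delta \,\}) > 0$ for some $\delta > 0$. In this case, the support of $\mu$ is contained in
the closed set $\{\, y \mid \xi \cdot y \ge 0 \,\}$ but not in the hyperplane $\{\, y \mid \xi \cdot y = 0 \,\}$. We conclude
that $\xi \cdot y \ge 0$ for all $y \in \operatorname{supp} \mu$ and that there exists at least one $y^* \in \operatorname{supp} \mu$ such
that $\xi \cdot y^* > 0$. In particular, $y^* \in \Gamma(\mu)$ so that our assumption $m = 0 \in \operatorname{ri} \Gamma(\mu)$
implies the existence of some $\varepsilon > 0$ such that $-\varepsilon y^* \in \Gamma(\mu)$. Consequently, $-\varepsilon y^*$
can be represented as a convex combination
$$-\varepsilon y^* = \alpha_1 y_1 + \cdots + \alpha_n y_n$$
of certain $y_1, \dots, y_n \in \operatorname{supp} \mu$. It follows that
$$0 > -\varepsilon\, \xi \cdot y^* = \alpha_1\, \xi \cdot y_1 + \cdots + \alpha_n\, \xi \cdot y_n,$$
in contradiction to our assumption that $\xi \cdot y \ge 0$ for all $y \in \operatorname{supp} \mu$. Hence, (1.26)
must be true.
Applying the fundamental theorem of asset pricing in the form of Theorem 1.7
to $\Omega := \mathbb R^d$, $P := \mu$, and to the random variable $Y(y) := y$, yields a probability measure $\mu^* \approx \mu$ whose density $d\mu^* / d\mu$ is bounded and which satisfies
$\int |y| \, \mu^*(dy) < \infty$ and $\int y \, \mu^*(dy) = 0$. This proves the inclusion $\operatorname{ri} \Gamma(\mu) \subseteq M_b(\mu)$.
Clearly, $M_b(\mu) \subset M(\mu)$. So the theorem will be proved if we can show the
inclusion $M(\mu) \subset \operatorname{ri} \Gamma(\mu)$. To this end, suppose by way of contradiction that $\nu \approx \mu$
is such that
$$\int |y| \, \nu(dy) < \infty \quad \text{and} \quad m := \int y \, \nu(dy) \notin \operatorname{ri} \Gamma(\mu).$$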


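Again, we may assume without loss of generality that $m = 0$. Applying the separating
hyperplane theorem in the form of Proposition A.1 with $\mathcal C := \operatorname{ri} \Gamma(\mu)$ yields some
$\xi \in \mathbb R^d$ such that $\xi \cdot y \ge 0$ for all $y \in \operatorname{ri} \Gamma(\mu)$ and $\xi \cdot y^* > 0$ for at least one
$y^* \in \operatorname{ri} \Gamma(\mu)$. We deduce from (1.24) that $\xi \cdot y \ge 0$ holds also for all $y \in \Gamma(\mu)$.
Moreover, $\xi \cdot y_0$ must be strictly positive for at least one $y_0 \in \operatorname{supp} \mu$. Hence,
$$\xi \cdot y \ge 0 \text{ for } \mu\text{-a.e. } y \in \mathbb R^d \quad \text{and} \quad \mu(\{\, y \mid \xi \cdot y > 0 \,\}) > 0. \tag{1.27}$$
By the equivalence of $\mu$ and $\nu$, (1.27) is also true for $\nu$ instead of $\mu$, and so
$$\xi \cdot m = \xi \cdot \int y \, \nu(dy) = \int \xi \cdot y \, \nu(dy) > 0,$$
in contradiction to our assumption that $m = 0$. We conclude that $M(\mu) \subset \operatorname{ri} \Gamma(\mu)$.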

Remark 1.50. Note that Theorem 1.49 does not extend to the set
$$\tilde M(\mu) := \Big\{\, \int y \, \nu(dy) \;\Big|\; \nu \ll \mu \text{ and } \int |y| \, \nu(dy) < \infty \,\Big\}.$$
Already the simple case $\mu := \frac{1}{2} (\delta_{-1} + \delta_{+1})$ serves as a counterexample, because
here $\tilde M(\mu) = [-1, +1]$ while $\operatorname{ri} \Gamma(\mu) = (-1, +1)$. In this case, we have an identity
between $\tilde M(\mu)$ and $\Gamma(\mu)$. However, also this identity fails in general as can be seen
by considering the normalized Lebesgue measure $\lambda$ on $[-1, +1]$. For this choice one
finds $\tilde M(\lambda) = (-1, +1)$ but $\Gamma(\lambda) = [-1, +1]$.
◊
From Theorem 1.49 we obtain the following geometric characterization of the absence of arbitrage.
Corollary 1.51. Let $\mu$ be the distribution of the discounted price vector $S/(1 + r)$ of
the risky assets. Then the market model is arbitrage-free if and only if the price system
$\pi$ belongs to the relative interior $\operatorname{ri} \Gamma(\mu)$ of the convex hull of the support of $\mu$.
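For a single risky asset with finitely many possible values, the criterion of Corollary 1.51 reduces to an interval check: the convex hull of the support of the discounted price is a closed interval, and its relative interior is the open interval, or the single point itself when the discounted price is deterministic. The following is a minimal sketch of that special case only; the function name and the sample numbers are illustrative assumptions.

```python
from fractions import Fraction


def is_arbitrage_free_1d(pi, payoffs, r) -> bool:
    """Check pi in ri Gamma(mu) for d = 1 and finite support.

    payoffs: the values that S takes with positive probability.
    """
    discounted = [Fraction(s) / (1 + Fraction(r)) for s in payoffs]
    lo, hi = min(discounted), max(discounted)
    pi = Fraction(pi)
    if lo == hi:
        # The support is a single point, so Gamma(mu) = ri Gamma(mu) = {lo}.
        return pi == lo
    # The relative interior of [lo, hi] is the open interval (lo, hi).
    return lo < pi < hi


if __name__ == "__main__":
    print(is_arbitrage_free_1d(100, [90, 120], 0))                  # interior point
    print(is_arbitrage_free_1d(120, [90, 120], 0))                  # boundary point
    print(is_arbitrage_free_1d(100, [105, 105], Fraction(1, 20)))   # deterministic asset
```

The boundary case fails because buying the bond and selling the asset short can then never lose money and gains with positive probability.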

1.6 Contingent initial data

The idea of hedging contingent claims develops its full power only in a dynamic
setting in which trading may occur at several times. The corresponding discrete-time theory is presented in Chapter 5. The introduction of additional trading periods
requires more sophisticated techniques than those we have used so far. In this section
we will introduce some of these techniques in an extended version of our previous
market model in which initial prices, and hence strategies, are contingent on scenarios.
In this context, we are going to characterize the absence of arbitrage strategies. The
results will be used as building blocks in the multiperiod setting of Part II; their study
can be postponed until Chapter 5.
Suppose that we are given a $\sigma$-algebra $\mathcal F_0 \subset \mathcal F$ which specifies the information
that is available to an investor at time $t = 0$. The prices for our $d + 1$ assets at time
0 will be modelled as non-negative $\mathcal F_0$-measurable random variables $S_0^0, S_0^1, \dots, S_0^d$.
Thus, the price system $\bar\pi = (\pi^0, \pi^1, \dots, \pi^d)$ of our previous discussion is replaced
by the vector
$$\bar S_0 = (S_0^0, \dots, S_0^d).$$
The portfolio $\bar\xi$ chosen by an investor at time $t = 0$ will also depend on the information available at time 0. Thus, we assume that
$$\bar\xi = (\xi^0, \xi^1, \dots, \xi^d)$$
is an $\mathcal F_0$-measurable random vector. The asset prices observed at time $t = 1$ will be
denoted by
$$\bar S_1 = (S_1^0, S_1^1, \dots, S_1^d).$$
They are modelled as non-negative random variables which are measurable with respect to a $\sigma$-algebra $\mathcal F_1$ such that $\mathcal F_0 \subset \mathcal F_1 \subset \mathcal F$. The $\sigma$-algebra $\mathcal F_1$ describes the
information available at time 1, and in this section we can assume that $\mathcal F = \mathcal F_1$.
A riskless bond could be included by taking $S_0^0 \equiv 1$ and by assuming $S_1^0$ to be
$\mathcal F_0$-measurable and $P$-a.s. strictly positive. However, in the sequel it will be sufficient
to assume that $S_0^0$ is $\mathcal F_0$-measurable, $S_1^0$ is $\mathcal F_1$-measurable, and that
$$P[\,S_0^0 > 0 \text{ and } S_1^0 > 0\,] = 1. \tag{1.28}$$
Thus, we can take the 0th asset as numéraire, and we denote by
$$X_t^i := \frac{S_t^i}{S_t^0}, \quad i = 1, \dots, d, \ t = 0, 1,$$
the discounted asset prices and by
$$Y = X_1 - X_0$$
the vector of the discounted net gains.
Definition 1.52. An arbitrage opportunity is a portfolio $\bar\xi$ such that $\bar\xi \cdot \bar S_0 \le 0$,
$\bar\xi \cdot \bar S_1 \ge 0$ $P$-a.s., and $P[\,\bar\xi \cdot \bar S_1 > 0\,] > 0$.
By our assumption (1.28), any arbitrage opportunity $\bar\xi = (\xi^0, \xi)$ satisfies
$$\xi \cdot Y \ge 0 \ P\text{-a.s.} \quad \text{and} \quad P[\,\xi \cdot Y > 0\,] > 0. \tag{1.29}$$


In fact, the existence of a $d$-dimensional $\mathcal F_0$-measurable random vector $\xi$ with (1.29)
is equivalent to the existence of an arbitrage opportunity. This can be seen as in
Lemma 1.4.
The space of discounted net gains which can be generated by some portfolio is
given by
$$\mathcal K := \big\{\, \xi \cdot Y \mid \xi \in L^0(\Omega, \mathcal F_0, P; \mathbb R^d) \,\big\}.$$
Here, $L^0(\Omega, \mathcal F_0, P; \mathbb R^d)$ denotes the space of $\mathbb R^d$-valued random variables which are
$P$-a.s. finite and $\mathcal F_0$-measurable modulo the equivalence relation (A.23) of coincidence up to $P$-null sets. The spaces $L^p(\Omega, \mathcal F_0, P; \mathbb R^d)$ for $p > 0$ are defined in the
same manner. We denote by $L^p_+ := L^p_+(\Omega, \mathcal F_1, P)$ the cone of all non-negative elements in the space $L^p := L^p(\Omega, \mathcal F_1, P)$. With this notation, the absence of arbitrage
opportunities is equivalent to the condition
$$\mathcal K \cap L^0_+ = \{0\}.$$
We will denote by
$$\mathcal K - L^0_+$$
the convex cone of all $Z \in L^0$ which can be written as the difference of some $\xi \cdot Y \in \mathcal K$ and some $U \in L^0_+$.
The following definition involves the notion of the conditional expectation
$$E_Q[\,Z \mid \mathcal F_0\,]$$
of a random variable $Z$ with respect to a probability measure $Q$, given the $\sigma$-algebra
$\mathcal F_0 \subset \mathcal F$; see Appendix A.2. If $Z = (Z^1, \dots, Z^n)$ is a random vector, then
$E_Q[\,Z \mid \mathcal F_0\,]$ is shorthand for the random vector with components $E_Q[\,Z^i \mid \mathcal F_0\,]$, $i = 1, \dots, n$.
Definition 1.53. A probability measure $Q$ satisfying
$$E_Q[\,X_t^i\,] < \infty \quad \text{for } i = 1, \dots, d \text{ and } t = 0, 1$$
and
$$X_0 = E_Q[\,X_1 \mid \mathcal F_0\,] \quad Q\text{-a.s.}$$
is called a risk-neutral measure or martingale measure. We denote by $\mathcal P$ the set of all
risk-neutral measures $P^*$ which are equivalent to $P$.
Remark 1.54. The definition of a martingale measure $Q$ means that for each asset
$i = 0, \dots, d$, the discounted price process $(X_t^i)_{t=0,1}$ is a martingale under $Q$ with
respect to the $\sigma$-fields $(\mathcal F_t)_{t=0,1}$. The systematic discussion of martingales in a multiperiod setting will begin in Section 5.2. The martingale aspect will be crucial for the
theory of dynamic hedging in Part II.
◊
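On a finite sample space where $\mathcal F_0$ is generated by a partition, the martingale condition of Definition 1.53 can be checked atom by atom: on every atom with positive $Q$-mass, the $Q$-weighted average of $X_1$ over the atom must equal the value of $X_0$ there. The sketch below assumes exactly this finite setting with one risky asset; the four-scenario data and all names are hypothetical.

```python
from fractions import Fraction


def is_martingale_measure(x0, x1, q, partition) -> bool:
    """Check X_0 = E_Q[X_1 | F_0] Q-a.s. for F_0 generated by `partition`.

    x0, x1: discounted prices per scenario (x0 must be constant on each atom).
    q: probabilities per scenario; partition: list of lists of scenario indices.
    """
    for atom in partition:
        mass = sum(q[w] for w in atom)
        if mass == 0:
            continue  # the condition is only required Q-almost surely
        conditional_mean = sum(q[w] * x1[w] for w in atom) / mass
        if any(x0[w] != conditional_mean for w in atom if q[w] > 0):
            return False
    return True


if __name__ == "__main__":
    partition = [[0, 1], [2, 3]]      # information available at time 0
    x0 = [100, 100, 50, 50]           # F_0-measurable initial prices
    x1 = [120, 90, 60, 45]            # discounted prices at time 1

    q_star = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 6), Fraction(1, 3)]
    uniform = [Fraction(1, 4)] * 4
    print(is_martingale_measure(x0, x1, q_star, partition))
    print(is_martingale_measure(x0, x1, uniform, partition))
```

Only the conditional probabilities within each atom matter for the martingale property; in this finite setting the masses assigned to the atoms themselves can be chosen freely.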


As the main result of this section, we can now state an extension of the fundamental theorem of asset pricing in Theorem 1.7 to our present setting. In the context of
Section 1.2, where $\mathcal F_0 = \{\emptyset, \Omega\}$, the following arguments simplify considerably, and
they yield an alternative proof of Theorem 1.7, in which the separation argument in
$\mathbb R^d$ is replaced by a separation argument in $L^1$.
Theorem 1.55. The following conditions are equivalent:
(a) $\mathcal K \cap L^0_+ = \{0\}$.
(b) $(\mathcal K - L^0_+) \cap L^0_+ = \{0\}$.
(c) There exists a measure $P^* \in \mathcal P$ with a bounded density $dP^* / dP$.
(d) $\mathcal P \ne \emptyset$.
Proof. (d) $\Rightarrow$ (a): Suppose by way of contradiction that there exist both a $P^* \in \mathcal P$
and some $\xi \in L^0(\Omega, \mathcal F_0, P; \mathbb R^d)$ with non-zero payoff $\xi \cdot Y \in \mathcal K \cap L^0_+$. For large
enough $c > 0$, $\xi^{(c)} := I_{\{|\xi| \le c\}}\, \xi$ will be bounded, and the payoff $\xi^{(c)} \cdot Y$ will still be
non-zero and in $\mathcal K \cap L^0_+$. However,
$$E^*[\,\xi^{(c)} \cdot Y\,] = E^*\big[\,\xi^{(c)} \cdot E^*[\,Y \mid \mathcal F_0\,]\,\big] = 0,$$
which is the desired contradiction.
(a) $\Leftrightarrow$ (b): It is obvious that (a) is necessary for (b). In order to prove sufficiency,
suppose that we are given some $Z \in (\mathcal K - L^0_+) \cap L^0_+$. Then there exists a random
variable $U \ge 0$ and a random vector $\xi \in L^0(\Omega, \mathcal F_0, P; \mathbb R^d)$ such that
$$0 \le Z = \xi \cdot Y - U.$$
This implies that $\xi \cdot Y \ge U \ge 0$, which, according to condition (a), can only happen
if $\xi \cdot Y = 0$. Hence, also $U = 0$ and in turn $Z = 0$.
(b) $\Rightarrow$ (c): This is the difficult part of the proof. The assertion will follow by
combining Lemmas 1.57, 1.58, 1.60, and 1.68.
Remark 1.56. If $\Omega$ is discrete, or if there exists a decomposition of $\Omega$ into countably
many atoms of $(\Omega, \mathcal F_0, P)$, then the martingale measure $P^*$ can be constructed by
applying the result of Theorem 1.7 separately on each atom. In the general case, the
idea of patching together conditional martingale measures would involve subtle arguments of measurable selection; see [67]. Here we present a different approach which
is based on separation arguments in $L^1(P)$. It is essentially due to W. Schachermayer
[233]; our version uses in addition arguments by Y. Kabanov and C. Stricker [164]. ◊
We start with the following simple lemma, which takes care of the integrability
condition in Definition 1.53.


Lemma 1.57. For the proof of the implication (b) $\Rightarrow$ (c) in Theorem 1.55, we may
assume without loss of generality that
$$E[\,|X_t|\,] < \infty \quad \text{for } t = 0, 1. \tag{1.30}$$
Proof. Define a probability measure $\tilde P$ by
$$\frac{d\tilde P}{dP} := c\, (1 + |X_0| + |X_1|)^{-1}$$
where $c$ is chosen such that the right-hand side integrates to 1. Clearly, (1.30) holds
for $\tilde P$. Moreover, condition (b) of Theorem 1.55 is satisfied by $P$ if and only if it is
satisfied by the equivalent measure $\tilde P$. If $P^* \in \mathcal P$ is such that the density $dP^* / d\tilde P$
is bounded, then so is the density
$$\frac{dP^*}{dP} = \frac{dP^*}{d\tilde P} \cdot \frac{d\tilde P}{dP}.$$
Therefore, the implication (b) $\Rightarrow$ (c) holds for $P$ if and only if it holds for $\tilde P$.
From now on, we will always assume (1.30). Our goal is to construct a suitable
$Z \in L^\infty$ such that
$$\frac{dP^*}{dP} := \frac{Z}{E[\,Z\,]}$$
defines an equivalent risk-neutral measure $P^*$. The following simple lemma gives a
criterion for this purpose, involving the convex cone
$$\mathcal C := (\mathcal K - L^0_+) \cap L^1.$$
Lemma 1.58. Suppose $c \ge 0$ and $Z \in L^\infty$ are such that
$$E[\,Z W\,] \le c \quad \text{for all } W \in \mathcal C.$$
Then:
(a) $E[\,Z W\,] \le 0$ for all $W \in \mathcal C$, i.e., we can take $c = 0$.
(b) $Z \ge 0$ $P$-a.s.
(c) If $Z$ does not vanish $P$-a.s., then
$$\frac{dQ}{dP} := \frac{Z}{E[\,Z\,]}$$
defines a risk-neutral measure $Q \ll P$.


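Proof. (a): Note that $\mathcal C$ is a cone, i.e., $W \in \mathcal C$ implies that $\alpha W \in \mathcal C$ for all $\alpha \ge 0$.
This property excludes the possibility that $E[\,Z W\,] > 0$ for some $W \in \mathcal C$.
(b): $\mathcal C$ contains the function $W := -I_{\{Z < 0\}}$. Hence, by part (a),
$$E[\,Z^-\,] = E[\,Z W\,] \le 0.$$
(c): For all $\xi \in L^\infty(\Omega, \mathcal F_0, P; \mathbb R^d)$ and $\alpha \in \mathbb R$ we have $\alpha\, \xi \cdot Y \in \mathcal C$ by our
integrability assumption (1.30). Thus, a similar argument as in the proof of (a) yields
$E[\,Z\, \xi \cdot Y\,] = 0$. Since $\xi$ is bounded, we may conclude that
$$0 = E[\,Z\, \xi \cdot Y\,] = E\big[\,\xi \cdot E[\,Z Y \mid \mathcal F_0\,]\,\big].$$
As $\xi$ is arbitrary, this yields $E[\,Z Y \mid \mathcal F_0\,] = 0$ $P$-almost surely. Proposition A.12
now implies
$$E_Q[\,Y \mid \mathcal F_0\,] = \frac{1}{E[\,Z \mid \mathcal F_0\,]}\, E[\,Z Y \mid \mathcal F_0\,] = 0 \quad Q\text{-a.s.},$$
which concludes the proof.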


In view of the preceding lemma, the construction of risk-neutral measures $Q \ll P$
with bounded density is reduced to the construction of elements of the set
$$\mathcal Z := \big\{\, Z \in L^\infty \mid 0 \le Z \le 1, \ P[\,Z > 0\,] > 0, \text{ and } E[\,Z W\,] \le 0 \text{ for all } W \in \mathcal C \,\big\}.$$
In the following lemma, we will construct such elements by applying a separation
argument suggested by the condition
$$\mathcal C \cap L^1_+ = \{0\},$$
which follows from condition (b) of Theorem 1.55. This separation argument needs
the additional assumption that $\mathcal C$ is closed in $L^1$. Showing that this assumption is indeed satisfied in our situation will be one of the key steps in our proof; see Lemma 1.68
below.
Lemma 1.59. Assume that $\mathcal C$ is closed in $L^1$ and satisfies $\mathcal C \cap L^1_+ = \{0\}$. Then for
each non-zero $F \in L^1_+$ there exists some $Z \in \mathcal Z$ such that $E[\,F Z\,] > 0$.
Proof. Let $\mathcal B := \{F\}$ so that $\mathcal B \cap \mathcal C = \emptyset$. Since the set $\mathcal C$ is non-empty, convex
and closed in $L^1$, we may apply the Hahn-Banach separation theorem in the form of
Theorem A.57 to obtain a continuous linear functional $\ell$ on $L^1$ such that
$$\sup_{W \in \mathcal C} \ell(W) < \ell(F).$$
Since the dual space of $L^1$ can be identified with $L^\infty$, there exists some $Z \in L^\infty$
such that $\ell(F) = E[\,F Z\,]$ for all $F \in L^1$. We may assume without loss of generality
that $\|Z\|_\infty \le 1$. By construction, $Z$ satisfies the assumptions of Lemma 1.58, and
so $Z \in \mathcal Z$. Moreover, $E[\,F Z\,] = \ell(F) > 0$ since the constant function $W \equiv 0$ is
contained in $\mathcal C$.


We will now use an exhaustion argument to conclude that $\mathcal Z$ contains a strictly
positive element $Z^*$ under the assumptions of Lemma 1.59. After normalization, $Z^*$
will serve as the density of our desired risk-neutral measure $P^* \in \mathcal P$.
Lemma 1.60. Under the assumptions of Lemma 1.59, there exists $Z^* \in \mathcal Z$ with
$Z^* > 0$ $P$-a.s.
Proof. As a first step, we claim that $\mathcal Z$ is countably convex: If $(\alpha_k)_{k \in \mathbb N}$ is a sequence
of non-negative real numbers summing up to 1, and if $Z^{(k)} \in \mathcal Z$ for all $k$, then
$$Z := \sum_{k=1}^\infty \alpha_k Z^{(k)} \in \mathcal Z.$$
Indeed, for $W \in \mathcal C$
$$\sum_{k=1}^\infty |\alpha_k Z^{(k)} W| \le |W| \in L^1,$$
and so Lebesgue's dominated convergence theorem implies that
$$E[\,Z W\,] = \sum_{k=1}^\infty \alpha_k E[\,Z^{(k)} W\,] \le 0.$$
For the second step, let
$$c := \sup\{\, P[\,Z > 0\,] \mid Z \in \mathcal Z \,\}.$$
We choose $Z^{(n)} \in \mathcal Z$ such that $P[\,Z^{(n)} > 0\,] \to c$. Then
$$Z^* := \sum_{n=1}^\infty 2^{-n} Z^{(n)} \in \mathcal Z$$
by step one, and
$$\{Z^* > 0\} = \bigcup_{n=1}^\infty \{Z^{(n)} > 0\}.$$
Hence $P[\,Z^* > 0\,] = c$.
In the final step, we show that $c = 1$. Then $Z^*$ will be as desired. Suppose by way
of contradiction that $P[\,Z^* = 0\,] > 0$, so that $W := I_{\{Z^* = 0\}}$ is a non-zero element of
$L^1_+$. Lemma 1.59 yields $Z \in \mathcal Z$ with $E[\,W Z\,] > 0$. Hence,
$$P[\,\{Z > 0\} \cap \{Z^* = 0\}\,] > 0,$$
and so
$$P\Big[\,\frac{1}{2}(Z + Z^*) > 0\,\Big] > P[\,Z^* > 0\,] = c,$$
in contradiction to the maximality of $P[\,Z^* > 0\,]$.
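The second step of the proof, combining a family of non-negative elements into one element whose support is the union of the individual supports, can be illustrated on a finite sample space. The sketch below is a simplified finite analogue under stated assumptions: the family is finite, so the weights $2^{-n}$ are renormalized to sum to 1, and the defining inequality $E[\,Z W\,] \le 0$ of $\mathcal Z$ is not modelled. Names and data are hypothetical.

```python
from fractions import Fraction


def support(z):
    """Indices of the scenarios on which z is strictly positive."""
    return {w for w, value in enumerate(z) if value > 0}


def exhaust(family):
    """Convex combination of a finite family with weights proportional to 2^{-n}."""
    raw = [Fraction(1, 2 ** n) for n in range(1, len(family) + 1)]
    total = sum(raw)
    weights = [a / total for a in raw]
    n_scenarios = len(family[0])
    return [sum(a * Fraction(z[w]) for a, z in zip(weights, family))
            for w in range(n_scenarios)]


if __name__ == "__main__":
    # Three elements with values in [0, 1] on a four-point sample space.
    family = [
        [1, 0, 0, 0],
        [0, Fraction(1, 2), 0, 0],
        [1, 0, 1, 0],
    ]
    z_star = exhaust(family)
    union = set().union(*(support(z) for z in family))
    print(z_star, support(z_star) == union)
```

Scenario 3 stays outside the support of the combination; in the proof, Lemma 1.59 is what supplies a further element of $\mathcal Z$ charging such a remaining set.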


Thus, we have completed the proof of the implication (b) $\Rightarrow$ (c) of Theorem 1.55
up to the requirement that $\mathcal C$ is closed in $L^1$. Let us pause here in order to state
general versions of two of the arguments we have used so far. The first is known as
the Halmos-Savage theorem.
Theorem 1.61. Let $\mathcal Q$ be a set of probability measures which are all absolutely continuous with respect to a given measure $P$. Suppose moreover that $\mathcal Q \approx P$ in the
sense that $Q[\,A\,] = 0$ for all $Q \in \mathcal Q$ implies that $P[\,A\,] = 0$. Then there exists a
countable subfamily $\tilde{\mathcal Q} \subset \mathcal Q$ which satisfies $\tilde{\mathcal Q} \approx P$. In particular, there exists an
equivalent measure in $\mathcal Q$ as soon as $\mathcal Q$ is countably convex in the following sense: If
$(\alpha_k)_{k \in \mathbb N}$ is a sequence of non-negative real numbers summing up to 1, and if $Q_k \in \mathcal Q$
for all $k$, then $\sum_{k=1}^\infty \alpha_k Q_k \in \mathcal Q$.
Exercise 1.6.1. Prove Theorem 1.61 by modifying the exhaustion argument used in
the proof of Lemma 1.60.
}
An inspection of Lemmas 1.58, 1.59, and 1.60 shows that the particular structure of
$\mathcal C = (\mathcal K - L^0_+) \cap L^1$ was only used for part (c) of Lemma 1.58. All other arguments
relied only on the fact that $\mathcal C$ is a closed convex cone in $L^1$ that contains all bounded
negative functions and no non-trivial positive function. Thus, we have in fact proved
the following Kreps-Yan theorem, which was obtained independently in [266] and
[186].
Theorem 1.62. Suppose $\mathcal C$ is a closed convex cone in $L^1$ satisfying
$$\mathcal C \supset -L^\infty_+ \quad \text{and} \quad \mathcal C \cap L^1_+ = \{0\}.$$
Then there exists $Z \in L^\infty$ such that $Z > 0$ $P$-a.s. and $E[\,W Z\,] \le 0$ for all $W \in \mathcal C$.
Let us now turn to the closedness of our set C D .K  L0C / \ L1 . The following
example illustrates that we cannot expect C to be closed without assuming the absence
of arbitrage opportunities.
Example 1.63. Let $P$ be the Lebesgue measure on the Borel field $\mathcal{F}_1$ of $\Omega = [0,1]$,
and take $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $Y(\omega) = \omega$. This choice clearly violates the no-arbitrage
condition, i.e., we have $\mathcal{K} \cap L^0_+ \neq \{0\}$. The convex set $\mathcal{C} = (\mathcal{K} - L^0_+) \cap L^1$ is a
proper subset of $L^1$. More precisely, $\mathcal{C}$ does not contain any function $F \in L^1$ with
$F \ge 1$: If we could represent $F$ as $\xi \cdot Y - U$ for a non-negative function $U$, then it
would follow that
\[ \xi \cdot Y = F + U \ge 1, \]
which is impossible for any $\xi$. However, as we show next, the closure of $\mathcal{C}$ in $L^1$
coincides with the full space $L^1$. In particular, $\mathcal{C}$ cannot be closed. Let $F \in L^1$ be
arbitrary, and observe that
\[ F_n := (F^+ \wedge n)\, I_{[\frac{1}{n}, 1]} - F^- \]
converges to $F$ in $L^1$ as $n \uparrow \infty$. Moreover, each $F_n$ belongs to $\mathcal{C}$ as
\[ (F^+ \wedge n)\, I_{[\frac{1}{n}, 1]} \le n^2 \cdot Y. \]
Consequently, $F$ is contained in the $L^1$-closure of $\mathcal{C}$.
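
The approximation in Example 1.63 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the concrete choice $F \equiv 1$ (so that $F \ge 1$ and $F \notin \mathcal{C}$), approximates integrals over $[0,1]$ by a midpoint rule, and uses hypothetical helper names. For this $F$ the $L^1$-distance between $F$ and $F_n$ is exactly $1/n$.

```python
def capped_positive_part(F, n, w):
    """(F^+ ∧ n) * 1_{[1/n, 1]} evaluated at the point w in [0, 1]."""
    return min(max(F(w), 0.0), float(n)) if w >= 1.0 / n else 0.0

def F_n(F, n, w):
    """F_n(w) = (F^+ ∧ n) 1_{[1/n, 1]}(w) - F^-(w), as in Example 1.63."""
    return capped_positive_part(F, n, w) - max(-F(w), 0.0)

def midpoints(grid=10_000):
    """Midpoints of an equidistant partition of [0, 1] (Lebesgue measure)."""
    return [(k + 0.5) / grid for k in range(grid)]

def l1_distance_to_F(F, n, grid=10_000):
    """Midpoint-rule approximation of E[|F - F_n|]."""
    return sum(abs(F(w) - F_n(F, n, w)) for w in midpoints(grid)) / grid

def dominated_by_n2_Y(F, n, grid=10_000):
    """Check (F^+ ∧ n) 1_{[1/n, 1]} <= n^2 * Y on the grid, where Y(w) = w."""
    return all(capped_positive_part(F, n, w) <= n * n * w for w in midpoints(grid))

F = lambda w: 1.0  # assumed test function with F >= 1
distances = {n: l1_distance_to_F(F, n) for n in (2, 10, 100)}
```

The distances shrink like $1/n$, while each $F_n$ stays below the payoff $n^2 \cdot Y$ of the constant portfolio $\xi = n^2$, which is what places $F_n$ in $\mathcal{C}$.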

In the special case $\mathcal{F}_0 = \{\emptyset, \Omega\}$, we can directly go on to the proof that $\mathcal{C}$ is
closed, using a simplified version of Lemma 1.68 below. In this way, we obtain an
alternative proof of Theorem 1.7. In the general case we need some preparation. Let
us first prove a randomized version of the Bolzano–Weierstraß theorem. It yields a
simple construction of a measurable selection of a convergent subsequence of a given
sequence in $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$.
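Lemma 1.64. Let $(\xi_n)$ be a sequence in $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ with $\liminf_n |\xi_n| < \infty$.
Then there exists $\xi \in L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ and a strictly increasing sequence $(\sigma_m)$ of
$\mathcal{F}_0$-measurable integer-valued random variables such that
\[ \xi_{\sigma_m(\omega)}(\omega) \to \xi(\omega) \quad \text{for $P$-a.e. } \omega \in \Omega. \]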

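Proof. Let $\Lambda(\omega) := \liminf_n |\xi_n(\omega)|$, and define $\sigma_m := m$ on the $P$-null set
$\{\Lambda = \infty\}$. On $\{\Lambda < \infty\}$ we let $\sigma^0_1 := 1$, and we define $\mathcal{F}_0$-measurable random indices $\sigma^0_m$
by
\[ \sigma^0_m := \inf\Big\{ n > \sigma^0_{m-1} \,\Big|\, \big|\, |\xi_n| - \Lambda \,\big| \le \frac{1}{m} \Big\}, \quad m = 2, 3, \dots. \]
We use recursion on $i = 1, \dots, d$ to define the $i$th component $\xi^i$ of the limit $\xi$ and to
extract a new subsequence $\sigma^i_m$ of random indices. Let
\[ \xi^i := \liminf_{m \uparrow \infty} \xi^i_{\sigma^{i-1}_m}, \]
which is already defined if $i = 1$. This $\xi^i$ can be used in the construction of $\sigma^i_m$: Let
$\sigma^i_1 := 1$ and, for $m = 2, 3, \dots$,
\[ \sigma^i_m(\omega) := \inf\Big\{ \sigma^{i-1}_n(\omega) \,\Big|\, \sigma^{i-1}_n(\omega) > \sigma^i_{m-1}(\omega) \text{ and } \big| \xi^i_{\sigma^{i-1}_n}(\omega) - \xi^i(\omega) \big| \le \frac{1}{m} \Big\}. \]
Then $\sigma_m := \sigma^d_m$ yields the desired sequence of random indices.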

It may happen that
\[ \xi \cdot Y = \tilde{\xi} \cdot Y \quad P\text{-a.s.}, \]
although $\xi$ and $\tilde{\xi}$ are two different portfolios in $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$.

Remark 1.65. We could exclude this possibility by the following assumption of non-redundance:
\[ \xi \cdot Y = \tilde{\xi} \cdot Y \ \ P\text{-a.s.} \implies \xi = \tilde{\xi} \ \ P\text{-a.s.} \tag{1.31} \]
Under this assumption, we can immediately move on to the final step in Lemma 1.68.
◊
Without assumption (1.31), it will be convenient to have a suitable linear space
$N^\perp$ of "reference portfolios" which are uniquely determined by their payoff. The
construction of $N^\perp$ is the purpose of the following lemma. We will assume that the
spaces $L^0$ and $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ are endowed with the topology of convergence in
$P$-measure, which is generated by the metric $d$ of (A.24).

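Lemma 1.66. Define two linear subspaces $N$ and $N^\perp$ of $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ by
\[ N := \{ \eta \in L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d) \mid \eta \cdot Y = 0 \ P\text{-a.s.} \}, \]
\[ N^\perp := \{ \xi \in L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d) \mid \xi \cdot \eta = 0 \ P\text{-a.s. for all } \eta \in N \}. \]
(a) Both $N$ and $N^\perp$ are closed in $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ and, in the following sense,
invariant under the multiplication with scalar functions $g \in L^0(\Omega, \mathcal{F}_0, P)$: If
$\eta \in N$ and $\xi \in N^\perp$, then $g\eta \in N$ and $g\xi \in N^\perp$.
(b) If $\xi \in N^\perp$ and $\xi \cdot Y = 0$ $P$-a.s., then $\xi = 0$, i.e., $N \cap N^\perp = \{0\}$.
(c) Every $\xi \in L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ has a unique decomposition $\xi = \eta + \xi^\perp$, where
$\eta \in N$ and $\xi^\perp \in N^\perp$.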
Remark 1.67. For the proof of this lemma, we will use a projection argument in
Hilbert space. Let us sketch a more probabilistic construction of the decomposition
$\xi = \eta + \xi^\perp$. Take a regular conditional distribution of $Y$ given $\mathcal{F}_0$, i.e., a stochastic
kernel $K$ from $(\Omega, \mathcal{F}_0)$ to $\mathbb{R}^d$ such that $K(\omega, A) = P[\,Y \in A \mid \mathcal{F}_0\,](\omega)$ for all
Borel sets $A \subset \mathbb{R}^d$ and $P$-a.e. $\omega$ (see, e.g., §44 of [20]). If one defines $\xi^\perp(\omega)$
as the orthogonal projection of $\xi(\omega)$ onto the linear hull $L(\omega)$ of the support of the
measure $K(\omega, \cdot)$, then $\eta := \xi - \xi^\perp$ satisfies $\eta \cdot Y = 0$ $P$-a.s., and any $\tilde{\eta}$ with the same
property must be $P$-a.s. perpendicular to $L(\omega)$. However, carrying out the details
of this construction involves certain measurability problems; this is why we use the
projection argument below.
◊
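
In the simplest situation the projection described in Remark 1.67 can be computed by hand or by machine. The sketch below is an illustration under strong assumptions that are not in the text: $\mathcal{F}_0$ is trivial, so portfolios are deterministic vectors, and $Y$ takes only finitely many values, listed in the hypothetical variable `support`. It projects $\xi$ onto the linear hull of the support by Gram-Schmidt and checks that the remainder $\eta$ has zero payoff.

```python
def project_onto_span(xi, support, tol=1e-12):
    """Orthogonal projection of xi onto the linear hull of the vectors in support."""
    basis = []  # orthonormal basis of the linear hull, built by Gram-Schmidt
    for y in support:
        r = list(y)
        for b in basis:
            c = sum(ri * bi for ri, bi in zip(r, b))
            r = [ri - c * bi for ri, bi in zip(r, b)]
        norm = sum(ri * ri for ri in r) ** 0.5
        if norm > tol:  # skip vectors that add no new direction
            basis.append([ri / norm for ri in r])
    proj = [0.0] * len(xi)
    for b in basis:
        c = sum(x * bi for x, bi in zip(xi, b))
        proj = [p + c * bi for p, bi in zip(proj, b)]
    return proj

# Assumed finite support of Y in R^3; its linear hull is {(a, a, b)}.
support = [(1.0, 1.0, 0.0), (2.0, 2.0, 0.0), (0.0, 0.0, 1.0)]
xi = (3.0, 1.0, 5.0)

xi_perp = project_onto_span(xi, support)          # reference portfolio
eta = [x - p for x, p in zip(xi, xi_perp)]        # redundant part
payoffs_eta = [sum(e * y for e, y in zip(eta, yv)) for yv in support]
```

Here $\xi^\perp = (2, 2, 5)$ and $\eta = (1, -1, 0)$: the two portfolios $\xi$ and $\xi^\perp$ have the same payoff in every scenario, and $\xi^\perp$ is the unique such portfolio in the linear hull.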
Proof. (a): The closedness of $N$ and $N^\perp$ follows immediately from the metrizability
of $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ (see Appendix A.7) and the fact that every sequence which
converges in measure has an almost-surely converging subsequence. The invariance
under the multiplication with $\mathcal{F}_0$-measurable scalar functions is obvious.
(b): Suppose that $\xi \in N \cap N^\perp$. Then taking $\eta := \xi$ in the definition of $N^\perp$ yields
$\xi \cdot \xi = |\xi|^2 = 0$ $P$-a.s.

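(c): Any given $\xi \in L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ can be written as
\[ \xi(\omega) = \xi^1(\omega)\, e_1 + \cdots + \xi^d(\omega)\, e_d, \]
where $e_i$ denotes the $i$th Euclidean unit vector, and where $\xi^i(\omega)$ is the $i$th component
of $\xi(\omega)$. Consider $e_i$ as a constant element of $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$, and suppose that
we can decompose $e_i$ as
\[ e_i = n_i + e_i^\perp \quad \text{where } n_i \in N \text{ and } e_i^\perp \in N^\perp. \tag{1.32} \]
Since by part (a) both $N$ and $N^\perp$ are invariant under the multiplication with $\mathcal{F}_0$-measurable
functions, we can then obtain the desired decomposition of $\xi$ by letting
\[ \eta(\omega) := \sum_{i=1}^{d} \xi^i(\omega)\, n_i(\omega) \quad \text{and} \quad \xi^\perp(\omega) := \sum_{i=1}^{d} \xi^i(\omega)\, e_i^\perp(\omega). \]
Uniqueness of the decomposition follows from $N \cap N^\perp = \{0\}$.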


It remains to construct the decomposition (1.32) of $e_i$. The constant $e_i$ is an element
of the space $H := L^2(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$, which becomes a Hilbert space if endowed
with the natural inner product
\[ (\eta, \xi)_H := E[\,\eta \cdot \xi\,], \quad \eta, \xi \in L^2(\Omega, \mathcal{F}_0, P; \mathbb{R}^d). \]
Observe that both $N \cap H$ and $N^\perp \cap H$ are closed subspaces of $H$, because convergence
in $H$ implies convergence in $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$. Therefore, we can define
the corresponding orthogonal projections
\[ \pi^0 : H \to N \cap H \quad \text{and} \quad \pi^\perp : H \to N^\perp \cap H. \]
Thus, letting $n_i := \pi^0(e_i)$ and $e_i^\perp := \pi^\perp(e_i)$ will be the desired decomposition
(1.32), once we know that $e_i = \pi^0(e_i) + \pi^\perp(e_i)$. To prove this, we need only show
that $\zeta := e_i - \pi^0(e_i)$ is contained in $N^\perp$. We assume by way of contradiction that $\zeta$ is
not contained in $N^\perp \cap H$. Then there exists some $\eta \in N$ such that $P[\,\zeta \cdot \eta > 0\,] > 0$.
Clearly,
\[ \tilde{\eta} := \eta\, I_{\{\zeta \cdot \eta > 0,\ |\eta| \le c\}} \]
is contained in $N \cap H$ for each $c > 0$. But if $c$ is large enough, then $0 < E[\,\tilde{\eta} \cdot \zeta\,] = (\tilde{\eta}, \zeta)_H$,
which contradicts the fact that $\zeta$ is by construction orthogonal to $N \cap H$.
After these preparations, we can now complete the proof of Theorem 1.55 by showing
the closedness of $\mathcal{C} = (\mathcal{K} - L^0_+) \cap L^1$ in $L^1$. This is an immediate consequence
of the following lemma, since convergence in $L^1$ implies convergence in $L^0$, i.e.,
convergence in $P$-measure. Recall that we have already proved the equivalence of the
conditions (a) and (b) in Theorem 1.55.

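Lemma 1.68. If $\mathcal{K} \cap L^0_+ = \{0\}$, then $\mathcal{K} - L^0_+$ is closed in $L^0$.

Proof. Suppose $W_n \in (\mathcal{K} - L^0_+)$ converges in $L^0$ to some $W$ as $n \uparrow \infty$. By passing
to a suitable subsequence, we may assume without loss of generality that $W_n \to W$
$P$-almost surely. We can write $W_n = \xi_n \cdot Y - U_n$ for $\xi_n \in N^\perp$ and $U_n \in L^0_+$.
In a first step, we will prove the assertion given the fact that
\[ \liminf_{n \uparrow \infty} |\xi_n| < \infty \quad P\text{-a.s.}, \tag{1.33} \]
which will be established afterwards. Assuming (1.33), Lemma 1.64 yields $\mathcal{F}_0$-measurable
integer-valued random variables $\sigma_1 < \sigma_2 < \cdots$ and some $\xi \in L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$
such that $P$-a.s. $\xi_{\sigma_n} \to \xi$. It follows that
\[ U_{\sigma_n} = \xi_{\sigma_n} \cdot Y - W_{\sigma_n} \to \xi \cdot Y - W =: U \quad P\text{-a.s.}, \tag{1.34} \]
so that $U \in L^0_+$ and $W = \xi \cdot Y - U \in \mathcal{K} - L^0_+$.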


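Let us now show that $A := \{\liminf_n |\xi_n| = +\infty\}$ satisfies $P[A] = 0$ as claimed
in (1.33). Let
\[ \zeta_n := \begin{cases} \dfrac{\xi_n}{|\xi_n|} & \text{when } |\xi_n| > 0, \\[1ex] e_1 & \text{otherwise,} \end{cases} \]
where $e_1$ is the unit vector $(1, 0, \dots, 0)$. Using Lemma 1.64 on the sequence $(\zeta_n)$
yields $\mathcal{F}_0$-measurable integer-valued random variables $\tau_1 < \tau_2 < \cdots$ and some $\zeta \in
L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ such that $P$-a.s. $\zeta_{\tau_n} \to \zeta$. The convergence of $(W_n)$ implies that
\[ 0 \le I_A\, \frac{U_{\tau_n}}{|\xi_{\tau_n}|} = I_A \Big( \zeta_{\tau_n} \cdot Y - \frac{W_{\tau_n}}{|\xi_{\tau_n}|} \Big) \to I_A\, \zeta \cdot Y \quad P\text{-a.s.} \]
Hence, our assumption $\mathcal{K} \cap L^0_+ = \{0\}$ yields $(I_A \zeta) \cdot Y = 0$. Below we will show that
$I_A \zeta \in N^\perp$, so that
\[ \zeta = 0 \quad P\text{-a.s. on } A. \tag{1.35} \]
On the other hand, the fact that $|\zeta_n| = 1$ $P$-a.s. implies that $|\zeta| = 1$ $P$-a.s., which
can only be consistent with (1.35) if $P[A] = 0$.
It remains to show that $I_A \zeta \in N^\perp$. To this end, we first observe that each $\zeta_{\tau_n}$
belongs to $N^\perp$ since, for each $\eta \in N$,
\[ \zeta_{\tau_n} \cdot \eta = \sum_{k=1}^{\infty} I_{\{\tau_n = k\}}\, \frac{1}{|\xi_k|}\, \xi_k \cdot \eta = 0 \quad P\text{-a.s.} \]
The closedness of $N^\perp$ implies $\zeta \in N^\perp$, and $A \in \mathcal{F}_0$ yields $I_A \zeta \in N^\perp$.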


If in the proof of Lemma 1.68 $W_n = \xi_n \cdot Y$ for all $n$, then $U = 0$ in (1.34), and
$W = \lim_n W_n$ is itself contained in $\mathcal{K}$. We thus get the following lemma, which will
be useful in Chapter 5.

Lemma 1.69. Suppose that $\mathcal{K} \cap L^0_+ = \{0\}$. Then $\mathcal{K}$ is closed in $L^0$.

In fact, it is possible to show that $\mathcal{K}$ is always closed in $L^0$; see [257], [233]. But
this stronger result will not be needed here.
As an alternative to the randomized Bolzano–Weierstraß theorem in Lemma 1.64,
we can use the following variant of Komlós' principle of subsequences. It yields
a convergent sequence of convex combinations of a sequence in $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$,
and this will be needed later on. Recall from Appendix A.1 the notion of the convex
hull
\[ \operatorname{conv} A = \Big\{ \sum_{i=1}^{n} \alpha_i x_i \,\Big|\, x_i \in A,\ \alpha_i \ge 0,\ \sum_{i=1}^{n} \alpha_i = 1,\ n \in \mathbb{N} \Big\} \]
of a subset $A$ of a linear space, which in our case will be $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$.


Lemma 1.70. Let $(\xi_n)$ be a sequence in $L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$ such that $\sup_n |\xi_n| < \infty$
$P$-almost surely. Then there exists a sequence of convex combinations
\[ \eta_n \in \operatorname{conv}\{\xi_n, \xi_{n+1}, \dots\} \]
which converges $P$-almost surely to some $\eta \in L^0(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$.
Proof. We can assume without loss of generality that $\sup_n |\xi_n| \le 1$ $P$-a.s.; otherwise
we consider the sequence $\tilde{\xi}_n := \xi_n / \sup_n |\xi_n|$. Then $(\xi_n)$ is a bounded sequence
in the Hilbert space $H := L^2(\Omega, \mathcal{F}_0, P; \mathbb{R}^d)$. Since the closed unit ball in $H$ is
weakly compact, the sequence $(\xi_n)$ has an accumulation point $\eta \in H$; note that
weak sequential compactness follows from the Banach–Alaoglu theorem in the form
of Theorem A.63 and the fact that the dual $H'$ of the Hilbert space $H$ is isomorphic
to $H$ itself. For each $n$, the accumulation point $\eta$ belongs to the $L^2$-closure $\mathcal{C}_n$ of
$\operatorname{conv}\{\xi_n, \xi_{n+1}, \dots\}$, due to the fact that a closed convex set in $H$ is also weakly
closed; see Theorem A.60. Thus, we can find $\eta_n \in \operatorname{conv}\{\xi_n, \xi_{n+1}, \dots\}$ such that
\[ E[\,|\eta_n - \eta|^2\,] \le \frac{1}{n^2}. \]
This sequence $(\eta_n)$ converges $P$-a.s. to $\eta$, since $\sum_n E[\,|\eta_n - \eta|^2\,] < \infty$ implies
that $\sum_n |\eta_n - \eta|^2 < \infty$ $P$-a.s.
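
A small worked instance may help to see why convex combinations are needed. The following LaTeX sketch is an illustration under assumptions that are not part of the text: $d = 1$, and $(\Omega, \mathcal{F}_0, P)$ is assumed rich enough to carry an i.i.d. sequence of fair signs. The block averages below are convex combinations of the tail of the sequence, and a fourth-moment bound gives their almost sure convergence.

```latex
% Illustrative example (assumption of this sketch, not taken from the text):
% d = 1 and (\xi_n) is i.i.d. and F_0-measurable with P[\xi_n = 1] = P[\xi_n = -1] = 1/2.
\[
  \eta_n := \frac{1}{n} \sum_{k=n}^{2n-1} \xi_k \in \operatorname{conv}\{\xi_n, \xi_{n+1}, \dots\},
  \qquad
  E[\,\eta_n^4\,] = \frac{3n^2 - 2n}{n^4} \le \frac{3}{n^2}.
\]
% Hence \sum_n E[\eta_n^4] < \infty, so \sum_n \eta_n^4 < \infty P-a.s. and \eta_n \to 0 P-a.s.,
% whereas no deterministic subsequence of (\xi_n) converges P-a.s.
```

The limit $\eta = 0$ is also the weak limit of $(\xi_n)$ in $L^2$, in line with the accumulation point used in the proof.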


Remark 1.71. The original result by Komlós [178] is more precise: It states that for
any bounded sequence $(\xi_n)$ in $L^1(\Omega, \mathcal{F}, P; \mathbb{R}^d)$ there is a subsequence $(\xi_{n_k})$ which
satisfies a strong law of large numbers, i.e.,
\[ \lim_{N \uparrow \infty} \frac{1}{N} \sum_{k=1}^{N} \xi_{n_k} \]
exists $P$-almost surely; see also [264].

Chapter 2

Preferences

In a complete financial market model, the price of a contingent claim is determined


by arbitrage arguments, without involving the preferences of economic agents. In an
incomplete model, such claims may carry an intrinsic risk which cannot be hedged
away. In order to determine desirable strategies in view of such risks, the preferences
of an investor should be made explicit, and this is usually done in terms of an expected
utility criterion.
The paradigm of expected utility is the theme of this chapter. We begin with a
general discussion of preference relations on a set X of alternative choices and their
numerical representation by some functional U on X. In the financial context, such
choices can usually be described as payoff profiles. These are defined as functions
X on an underlying set of scenarios with values in some set of payoffs. Thus we are
facing risk or even uncertainty. In the case of risk, a probability measure is given on
the set of scenarios. In this case, we can focus on the resulting payoff distributions.
We are then dealing with preferences on lotteries, i.e., on probability measures on
the set of payoffs.
In Sections 2.2 and 2.3 we discuss the conditions or axioms under which such a
preference relation on lotteries $\mu$ can be represented by a functional of the form
\[ \int u(x)\, \mu(dx), \]
where $u$ is a utility function on the set of payoffs. This formulation of preferences on
lotteries in terms of expected utility goes back to D. Bernoulli [25]; the axiomatic
theory was initiated by J. von Neumann and O. Morgenstern [209]. Section 2.4
characterizes uniform preference relations which are shared by a given class of functions $u$.
This involves the general theory of probability measures on product spaces with given
marginals which will be discussed in Section 2.6.
In Section 2.5 we return to the more fundamental level where preferences are defined
on payoff profiles, and where we are facing uncertainty in the sense that no
probability measure is given a priori. L. Savage [232] clarified the conditions under
which such preferences on a space of functions $X$ admit a representation of the form
\[ U(X) = E_Q[\,u(X)\,] \]
where $Q$ is a "subjective" probability measure on the set of scenarios. We are going to
concentrate on a robust extension of the Savage representation which was introduced
by I. Gilboa and D. Schmeidler [140] and later extended by Maccheroni, Marinacci,
and Rustichini [198]. Here the utility functional is of the form
\[ U(X) = \inf_{Q \in \mathcal{Q}} \big( E_Q[\,u(X)\,] + \alpha(Q) \big). \]
It thus involves a whole class $\mathcal{Q}$ of probability measures $Q$, which are taken more
or less seriously according to their penalization $\alpha(Q)$. The axiomatic approach to
the robust Savage representation is closely related to the construction of coherent and
convex risk measures, which will be the topic of Chapter 4.

2.1 Preference relations and their numerical representation

Let $\mathcal{X}$ be some non-empty set. An element $x \in \mathcal{X}$ will be interpreted as a possible
choice of an economic agent. If presented with two choices $x, y \in \mathcal{X}$, the agent
might prefer one over the other. This will be formalized as follows.
Definition 2.1. A preference order (or preference relation) on $\mathcal{X}$ is a binary relation
$\succ$ with the following two properties.

• Asymmetry: If $x \succ y$, then $y \not\succ x$.

• Negative transitivity: If $x \succ y$ and $z \in \mathcal{X}$, then either $x \succ z$ or $z \succ y$ or both
must hold.

Negative transitivity states that if a clear preference exists between two choices
$x$ and $y$, and if a third choice $z$ is added, then there is still a choice which is least
preferable ($y$ if $z \succ y$) or most preferable ($x$ if $x \succ z$).
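
On a finite set both axioms can be verified by brute force. The following sketch is an illustration only; the function names and the two test relations are assumptions of this example, not part of the text. A strict preference is encoded as a function `prefers(x, y)` that returns `True` when $x \succ y$.

```python
from itertools import product

def is_asymmetric(X, prefers):
    """Asymmetry: x ≻ y and y ≻ x never hold simultaneously."""
    return all(not (prefers(x, y) and prefers(y, x)) for x, y in product(X, X))

def is_negatively_transitive(X, prefers):
    """Negative transitivity: x ≻ y implies x ≻ z or z ≻ y for every z."""
    return all(prefers(x, z) or prefers(z, y)
               for x, y in product(X, X) if prefers(x, y)
               for z in X)

# Lexicographic order on a 3x3 grid (Python compares tuples lexicographically).
grid = [(a, b) for a in range(3) for b in range(3)]
lex = lambda x, y: x > y

# A cyclic relation on {0, 1, 2}: 1 ≻ 0, 2 ≻ 1, 0 ≻ 2.
cycle_set = [0, 1, 2]
cyclic = lambda x, y: (x - y) % 3 == 1
```

The lexicographic relation passes both checks, whereas the cyclic relation is asymmetric but not negatively transitive: $1 \succ 0$ holds, yet for $z = 2$ neither $1 \succ 2$ nor $2 \succ 0$ is true.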
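Definition 2.2. A preference order $\succ$ on $\mathcal{X}$ induces a corresponding weak preference
order $\succeq$ defined by
\[ x \succeq y \ :\Longleftrightarrow\ y \not\succ x, \]
and an indifference relation $\sim$ given by
\[ x \sim y \ :\Longleftrightarrow\ x \succeq y \text{ and } y \succeq x. \]
Thus, $x \succeq y$ means that either $x$ is preferred to $y$ or there is no clear preference
between the two.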
Remark 2.3. It is easy to check that the asymmetry and the negative transitivity of
$\succ$ are equivalent to the following two respective properties of $\succeq$:
(a) Completeness: For all $x, y \in \mathcal{X}$, either $y \succeq x$ or $x \succeq y$ or both are true.
(b) Transitivity: If $x \succeq y$ and $y \succeq z$, then also $x \succeq z$.
Conversely, any complete and transitive relation $\succeq$ induces a preference order $\succ$ via
the negation of $\succeq$, i.e.,
\[ y \succ x \ :\Longleftrightarrow\ x \not\succeq y. \]
The indifference relation $\sim$ is an equivalence relation, i.e., it is reflexive, symmetric
and transitive.
◊
Exercise 2.1.1. Prove the assertions in the preceding remark.

Definition 2.4. A numerical representation of a preference order $\succ$ is a function
$U : \mathcal{X} \to \mathbb{R}$ such that
\[ y \succ x \iff U(y) > U(x). \tag{2.1} \]
Clearly, (2.1) is equivalent to
\[ y \succeq x \iff U(y) \ge U(x). \]
Note that such a numerical representation $U$ is not unique: If $f$ is any strictly increasing
function, then $\tilde{U}(x) := f(U(x))$ is again a numerical representation.
Definition 2.5. Let $\succ$ be a preference relation on $\mathcal{X}$. A subset $\mathcal{Z}$ of $\mathcal{X}$ is called
order dense if for any pair $x, y \in \mathcal{X}$ such that $x \succ y$ there exists some $z \in \mathcal{Z}$ with
$x \succeq z \succeq y$.
The following theorem characterizes those preference relations for which there exists
a numerical representation.
Theorem 2.6. For the existence of a numerical representation of a preference relation
$\succ$ it is necessary and sufficient that $\mathcal{X}$ contains a countable, order dense subset $\mathcal{Z}$. In
particular, any preference order admits a numerical representation if $\mathcal{X}$ is countable.
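Proof. Suppose first that we are given a countable order dense subset $\mathcal{Z}$ of $\mathcal{X}$. For
$x \in \mathcal{X}$, let
\[ \mathcal{Z}_\succ(x) := \{ z \in \mathcal{Z} \mid z \succ x \} \quad \text{and} \quad \mathcal{Z}_\prec(x) := \{ z \in \mathcal{Z} \mid x \succ z \}. \]
The relation $x \succeq y$ implies that $\mathcal{Z}_\succ(x) \subseteq \mathcal{Z}_\succ(y)$ and $\mathcal{Z}_\prec(x) \supseteq \mathcal{Z}_\prec(y)$. If the strict relation
$x \succ y$ holds, then at least one of these inclusions is also strict. To see this, pick
$z \in \mathcal{Z}$ with $x \succeq z \succeq y$, so that either $x \succ z \succeq y$ or $x \succeq z \succ y$. In the first case,
$z \in \mathcal{Z}_\prec(x) \setminus \mathcal{Z}_\prec(y)$, while $z \in \mathcal{Z}_\succ(y) \setminus \mathcal{Z}_\succ(x)$ in the second case.
Next, take any strictly positive probability distribution $\mu$ on $\mathcal{Z}$, and let
\[ U(x) := \sum_{z \in \mathcal{Z}_\prec(x)} \mu(z) - \sum_{z \in \mathcal{Z}_\succ(x)} \mu(z). \]
By the above, $U(x) > U(y)$ if and only if $x \succ y$ so that $U$ is the desired numerical
representation.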
For the proof of the converse assertion take a numerical representation $U$ and let $\mathcal{J}$
denote the countable set
\[ \mathcal{J} := \{ [a, b] \mid a, b \in \mathbb{Q},\ a < b,\ U^{-1}([a, b]) \neq \emptyset \}. \]
For every interval $I \in \mathcal{J}$ we can choose some $z_I \in \mathcal{X}$ with $U(z_I) \in I$ and thus define
the countable set
\[ A := \{ z_I \mid I \in \mathcal{J} \}. \]
At first glance it may seem that $A$ is a good candidate for an order dense set. However,
it may happen that there are $x, y \in \mathcal{X}$ such that $U(x) < U(y)$ and for which there is
no $z \in \mathcal{X}$ with $U(x) < U(z) < U(y)$. In this case, an order dense set must contain
at least one $z$ with $U(z) = U(x)$ or $U(z) = U(y)$, a condition which cannot be
guaranteed by $A$.
Let us define the set $C$ of all pairs $(x, y)$ which do not admit any $z \in A$ with
$y \succeq z \succeq x$:
\[ C := \{ (x, y) \mid x, y \in \mathcal{X} \setminus A,\ y \succ x \text{ and } \nexists\, z \in A \text{ with } y \succeq z \succeq x \}. \]
Then $(x, y) \in C$ implies the apparently stronger fact that we cannot find any $z \in \mathcal{X}$
such that $y \succ z \succ x$: Otherwise we could find $a, b \in \mathbb{Q}$ such that
\[ U(x) < a < U(z) < b < U(y), \]
so $I := [a, b]$ would belong to $\mathcal{J}$, and the corresponding $z_I$ would be an element of
$A$ with $y \succ z_I \succ x$, contradicting the assumption that $(x, y) \in C$.
It follows that all intervals $(U(x), U(y))$ with $(x, y) \in C$ are disjoint and non-empty.
Hence, there can be only countably many of them. For each such interval
$J$ we pick now exactly one pair $(x^J, y^J) \in C$ such that $U(x^J)$ and $U(y^J)$ are the
endpoints of $J$, and we denote by $B$ the countable set containing all $x^J$ and all $y^J$.
Finally, we claim that $\mathcal{Z} := A \cup B$ is an order dense subset of $\mathcal{X}$. Indeed, if $x$,
$y \in \mathcal{X} \setminus \mathcal{Z}$ with $y \succ x$, then either there is some $z \in A$ such that $y \succeq z \succeq x$, or
$(x, y) \in C$. In the latter case, there will be some $z \in B$ with $U(y) = U(z) > U(x)$
and, consequently, $y \succeq z \succ x$.
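
The construction in the first part of the proof is easy to carry out when $\mathcal{X}$ is finite, since then $\mathcal{Z} := \mathcal{X}$ is itself a countable order dense subset. The sketch below is illustrative; the set `X`, the ranking `rank`, and the weights are hypothetical choices for this example.

```python
def numerical_representation(Z, prefers, weight):
    """U(x) = mu(Z_prec(x)) - mu(Z_succ(x)) for a strictly positive weight mu on Z."""
    def U(x):
        below = sum(weight[z] for z in Z if prefers(x, z))  # z with x ≻ z
        above = sum(weight[z] for z in Z if prefers(z, x))  # z with z ≻ x
        return below - above
    return U

# Hypothetical finite choice set; "b" and "c" are indifferent.
X = ["a", "b", "c", "d"]
rank = {"a": 0, "b": 1, "c": 1, "d": 2}
prefers = lambda x, y: rank[x] > rank[y]

# Any strictly positive probability distribution on Z = X will do.
raw = {z: 2.0 ** -(i + 1) for i, z in enumerate(X)}
total = sum(raw.values())
weight = {z: w / total for z, w in raw.items()}

U = numerical_representation(X, prefers, weight)
```

Indifferent elements receive the same value, and strictly preferred elements a strictly larger one, whatever strictly positive weights are used.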
The following example shows that even in a seemingly straightforward situation, a
given preference order may not admit a numerical representation.
Example 2.7. Let $\succ$ be the usual lexicographical order on $\mathcal{X} := [0, 1] \times [0, 1]$, i.e.,
$(x_1, x_2) \succ (y_1, y_2)$ if and only if either $x_1 > y_1$, or if $x_1 = y_1$ and simultaneously
$x_2 > y_2$. One easily checks that $\succ$ is asymmetric and negative transitive, and hence
a preference order. We show now that $\succ$ does not admit a numerical representation.
To this end, let $\mathcal{Z}$ be any order-dense subset of $\mathcal{X}$. Then, for $x \in [0, 1]$ there must
be some $(z_1, z_2) \in \mathcal{Z}$ such that $(x, 1) \succeq (z_1, z_2) \succeq (x, 0)$. It follows that $z_1 = x$
and that $\mathcal{Z}$ is uncountable. Theorem 2.6 thus implies that there cannot be a numerical
representation of the lexicographical order $\succ$.
◊
Definition 2.8. Let $\mathcal{X}$ be a topological space. A preference relation $\succ$ is called
continuous if for all $x \in \mathcal{X}$
\[ \mathcal{B}_\succ(x) := \{ y \in \mathcal{X} \mid y \succ x \} \quad \text{and} \quad \mathcal{B}_\prec(x) := \{ y \in \mathcal{X} \mid x \succ y \} \tag{2.2} \]
are open subsets of $\mathcal{X}$.

Remark 2.9. Every preference order that admits a continuous numerical representation
is itself continuous. Under some mild conditions on the underlying space $\mathcal{X}$, the
converse statement is also true; see Theorem 2.15 below.
◊
Example 2.10. The lexicographical order of Example 2.7 is not continuous: If
$(x_1, x_2) \in [0, 1] \times [0, 1]$ is given, then
\[ \{ (y_1, y_2) \mid (y_1, y_2) \succ (x_1, x_2) \} = (x_1, 1] \times [0, 1] \,\cup\, \{x_1\} \times (x_2, 1], \]
which is typically not an open subset of $[0, 1] \times [0, 1]$.

Recall that a topological space $\mathcal{X}$ is called a topological Hausdorff space if any
two distinct points in $\mathcal{X}$ have disjoint open neighborhoods. In this case, all singletons
$\{x\}$ are closed. Clearly, every metric space is a topological Hausdorff space.
Proposition 2.11. Let $\succ$ be a preference order on a topological Hausdorff space $\mathcal{X}$.
Then the following properties are equivalent:
(a) $\succ$ is continuous.
(b) The set $\{ (x, y) \mid y \succ x \}$ is open in $\mathcal{X} \times \mathcal{X}$.
(c) The set $\{ (x, y) \mid y \succeq x \}$ is closed in $\mathcal{X} \times \mathcal{X}$.
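Proof. (a) $\Rightarrow$ (b): We have to show that for any pair
\[ (x_0, y_0) \in M := \{ (x, y) \mid y \succ x \} \]
there exist open sets $U, V \subset \mathcal{X}$ such that $x_0 \in U$, $y_0 \in V$, and $U \times V \subset M$.
Consider first the case in which there exists some $z \in \mathcal{B}_\succ(x_0) \cap \mathcal{B}_\prec(y_0)$ for the notation
$\mathcal{B}_\succ(x_0)$ and $\mathcal{B}_\prec(y_0)$ introduced in (2.2). Then $y_0 \succ z \succ x_0$, so that $U := \mathcal{B}_\prec(z)$ and
$V := \mathcal{B}_\succ(z)$ are open neighborhoods of $x_0$ and $y_0$, respectively. Moreover, if $x \in U$
and $y \in V$, then $y \succ z \succ x$, and thus $U \times V \subset M$.
If $\mathcal{B}_\succ(x_0) \cap \mathcal{B}_\prec(y_0) = \emptyset$, we let $U := \mathcal{B}_\prec(y_0)$ and $V := \mathcal{B}_\succ(x_0)$. If $(x, y) \in U \times V$,
then $y_0 \succ x$ and $y \succ x_0$ by definition. We want to show that $y \succ x$ in order to
conclude that $U \times V \subset M$. To this end, suppose that $x \succeq y$. Then $y_0 \succ y$ by
negative transitivity, hence $y_0 \succ y \succ x_0$. But then $y \in \mathcal{B}_\succ(x_0) \cap \mathcal{B}_\prec(y_0) = \emptyset$, and we
have a contradiction.
(b) $\Rightarrow$ (c): First note that the mapping $\varphi(x, y) := (y, x)$ is a homeomorphism of
$\mathcal{X} \times \mathcal{X}$. Then observe that the set $\{ (x, y) \mid y \succeq x \}$ is just the complement of the open
set $\varphi(\{ (x, y) \mid y \succ x \})$.
(c) $\Rightarrow$ (a): Since $\mathcal{X}$ is a topological Hausdorff space, $\{x\} \times \mathcal{X}$ is closed in $\mathcal{X} \times \mathcal{X}$,
and so is the set
\[ \{x\} \times \mathcal{X} \,\cap\, \{ (x, y) \mid y \succeq x \} = \{x\} \times \{ y \mid y \succeq x \}. \]
Hence $\{ y \mid y \succeq x \}$ is closed in $\mathcal{X}$, and its complement $\{ y \mid x \succ y \}$ is open. The
same argument applies to $\{ y \mid y \succ x \}$.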
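Example 2.12. For $x_0 < y_0$ consider the set $\mathcal{X} := (-\infty, x_0] \cup [y_0, \infty)$ endowed
with the usual order $>$ on $\mathbb{R}$. Then, with the notation introduced in (2.2), $\mathcal{B}_\prec(y_0) =
(-\infty, x_0]$ and $\mathcal{B}_\succ(x_0) = [y_0, \infty)$. Hence,
\[ \mathcal{B}_\succ(x_0) \cap \mathcal{B}_\prec(y_0) = \emptyset \]
despite $y_0 \succ x_0$, a situation we had to consider in the preceding proof.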

Recall that the topological space $\mathcal{X}$ is called connected if $\mathcal{X}$ cannot be written as
the union of two disjoint and non-empty open sets. Assuming that $\mathcal{X}$ is connected
will rule out the situation occurring in Example 2.12.
Proposition 2.13. Let $\mathcal{X}$ be a connected topological space with a continuous preference
order $\succ$. Then every dense subset $\mathcal{Z}$ of $\mathcal{X}$ is also order dense in $\mathcal{X}$. In particular,
there exists a numerical representation of $\succ$ if $\mathcal{X}$ is separable.
Proof. Take $x$, $y \in \mathcal{X}$ with $y \succ x$, and consider $\mathcal{B}_\succ(x)$ and $\mathcal{B}_\prec(y)$ as defined in (2.2).
Since $y \in \mathcal{B}_\succ(x)$ and $x \in \mathcal{B}_\prec(y)$, neither $\mathcal{B}_\succ(x)$ nor $\mathcal{B}_\prec(y)$ are empty sets. Moreover,
negative transitivity implies that $\mathcal{X} = \mathcal{B}_\succ(x) \cup \mathcal{B}_\prec(y)$. Hence, the open sets $\mathcal{B}_\succ(x)$
and $\mathcal{B}_\prec(y)$ cannot be disjoint, as $\mathcal{X}$ is connected. Thus, the open set $\mathcal{B}_\succ(x) \cap \mathcal{B}_\prec(y)$
must contain some element $z$ of the dense subset $\mathcal{Z}$, which then satisfies $y \succ z \succ x$.
Therefore $\mathcal{Z}$ is an order dense subset of $\mathcal{X}$.
Separability of $\mathcal{X}$ means that there exists a countable dense subset $\mathcal{Z}$ of $\mathcal{X}$, which
then is order dense. Hence, the existence of a numerical representation follows from
Theorem 2.6.
Remark 2.14. Consider the situation of Example 2.12, where $\mathcal{X} := (-\infty, x_0] \cup
[y_0, \infty)$, and suppose that $x_0$ and $y_0$ are both irrational. Then $\mathcal{Z} := \mathbb{Q} \cap \mathcal{X}$ is dense
in $\mathcal{X}$, but there exists no $z \in \mathcal{Z}$ such that $y_0 \succeq z \succeq x_0$. This example shows that the
assumption of topological connectedness is essential for Proposition 2.13.
◊

Theorem 2.15. Let $\mathcal{X}$ be a topological space which satisfies at least one of the
following two properties:

• $\mathcal{X}$ has a countable base of open sets.

• $\mathcal{X}$ is separable and connected.

Then every continuous preference order on $\mathcal{X}$ admits a continuous numerical
representation.

For a proof we refer to [77], Propositions 3 and 4. For our purposes, namely for the
proof of the von Neumann–Morgenstern representation in the next section and for the
proof of the robust Savage representation in Section 2.5, the following lemma will be
sufficient.
Lemma 2.16. Let $\mathcal{X}$ be a connected metric space with a continuous preference order
$\succ$. If $U : \mathcal{X} \to \mathbb{R}$ is a continuous function, and if its restriction to some dense
subset $\mathcal{Z}$ is a numerical representation for the restriction of $\succ$ to $\mathcal{Z}$, then $U$ is also a
numerical representation for $\succ$ on $\mathcal{X}$.
Proof. We have to show that $y \succ x$ if and only if $U(y) > U(x)$. In order to verify
the "only if" part, take $x$, $y \in \mathcal{X}$ with $y \succ x$. As in the proof of Proposition 2.13,
we obtain the existence of some $z_0 \in \mathcal{Z}$ with $y \succ z_0 \succ x$. Repeating this argument
yields $z_0' \in \mathcal{Z}$ such that $z_0 \succ z_0' \succ x$. Now we take two sequences $(z_n)$ and $(z_n')$ in $\mathcal{Z}$
with $z_n \to y$ and $z_n' \to x$. By continuity of $\succ$, eventually
\[ z_n \succ z_0 \succ z_0' \succ z_n', \]
and thus
\[ U(z_n) > U(z_0) > U(z_0') > U(z_n'). \]
The continuity of $U$ implies that $U(z_n) \to U(y)$ and $U(z_n') \to U(x)$, whence
\[ U(y) \ge U(z_0) > U(z_0') \ge U(x). \]
For the proof of the converse implication, suppose that $x$, $y \in \mathcal{X}$ are such that
$U(y) > U(x)$. Since $U$ is continuous,
\[ \mathcal{U}_>(x) := \{ z \in \mathcal{X} \mid U(z) > U(x) \} \]
and
\[ \mathcal{U}_<(y) := \{ z \in \mathcal{X} \mid U(z) < U(y) \} \]
are both non-empty open subsets of $\mathcal{X}$. Moreover, $\mathcal{U}_<(y) \cup \mathcal{U}_>(x) = \mathcal{X}$. Connectedness
of $\mathcal{X}$ implies that $\mathcal{U}_<(y) \cap \mathcal{U}_>(x) \neq \emptyset$. As above, a repeated application of the
preceding argument yields $z_0$, $z_0' \in \mathcal{Z}$ such that
\[ U(y) > U(z_0) > U(z_0') > U(x). \]
Since $\mathcal{Z}$ is a dense subset of $\mathcal{X}$, we can find sequences $(z_n)$ and $(z_n')$ in $\mathcal{Z}$ with
$z_n \to y$ and $z_n' \to x$ as well as with $U(z_n) > U(z_0)$ and $U(z_n') < U(z_0')$. Since $U$ is
a numerical representation of $\succ$ on $\mathcal{Z}$, we have
\[ z_n \succ z_0 \succ z_0' \succ z_n'. \]
Hence, by the continuity of $\succ$, neither $z_0 \succ y$ nor $x \succ z_0'$ can be true, and negative
transitivity yields $y \succ x$.

2.2 Von Neumann–Morgenstern representation

Suppose that each possible choice for our economic agent corresponds to a probability
distribution on a given set of scenarios. Thus, the set $\mathcal{X}$ can be identified with a
subset $\mathcal{M}$ of the set $\mathcal{M}_1(S, \mathcal{S})$ of all probability distributions on a measurable space
$(S, \mathcal{S})$. In the context of the theory of choice, the elements of $\mathcal{M}$ are sometimes
called lotteries. We will assume in the sequel that $\mathcal{M}$ is convex. The aim of this
section is to characterize those preference orders $\succ$ on $\mathcal{M}$ which allow for a numerical
representation $U$ of the form
\[ U(\mu) = \int u(x)\, \mu(dx) \quad \text{for all } \mu \in \mathcal{M}, \tag{2.3} \]
where $u$ is a real function on $S$.
Definition 2.17. A numerical representation $U$ of a preference order $\succ$ on $\mathcal{M}$ is called
a von Neumann–Morgenstern representation if it is of the form (2.3).
Any von Neumann–Morgenstern representation $U$ is affine on $\mathcal{M}$ in the sense that
\[ U(\alpha\mu + (1 - \alpha)\nu) = \alpha U(\mu) + (1 - \alpha) U(\nu) \]
for all $\mu, \nu \in \mathcal{M}$ and $\alpha \in [0, 1]$. It is easy to check that affinity of $U$ implies the
following two properties, or axioms, for a preference order $\succ$ on $\mathcal{M}$. The first property
says that a preference $\mu \succ \nu$ is preserved in any convex combination, independent of
the context described by another lottery $\lambda$.
Definition 2.18. A preference relation $\succ$ on $\mathcal{M}$ satisfies the independence axiom if,
for all $\mu$, $\nu \in \mathcal{M}$, the relation $\mu \succ \nu$ implies
\[ \alpha\mu + (1 - \alpha)\lambda \succ \alpha\nu + (1 - \alpha)\lambda \]
for all $\lambda \in \mathcal{M}$ and all $\alpha \in (0, 1]$.

The independence axiom is also called the substitution axiom. It can be illustrated
by introducing a compound lottery, which represents the distribution $\alpha\mu + (1 - \alpha)\lambda$
as a two-step procedure. First, we sample either lottery $\mu$ or $\lambda$ with probability $\alpha$ and
$1 - \alpha$, respectively. Then the lottery drawn in this first step is realized. Clearly, this
is equivalent to playing directly the lottery $\alpha\mu + (1 - \alpha)\lambda$. With probability $1 - \alpha$,
the distribution $\lambda$ is drawn and in this case there is no difference to the compound
lottery where $\nu$ is replaced by $\mu$. The only difference occurs when $\mu$ is drawn, and
this happens with probability $\alpha$. Thus, if $\mu \succ \nu$ then it seems reasonable to prefer the
compound lottery with $\mu$ over the one with $\nu$.
Definition 2.19. A preference relation $\succ$ on $\mathcal{M}$ satisfies the Archimedean axiom if for
any triple $\mu \succ \lambda \succ \nu$ there are $\alpha$, $\beta \in (0, 1)$ such that
\[ \alpha\mu + (1 - \alpha)\nu \succ \lambda \succ \beta\mu + (1 - \beta)\nu. \]
The Archimedean axiom derives its name from its similarity to the Archimedean
principle in real analysis: For every small $\varepsilon > 0$ and each large $x$, there is some
$n \in \mathbb{N}$ such that $n\varepsilon > x$. Sometimes it is also called the continuity axiom, because
it can act as a substitute for the continuity of $\succ$ in a suitable topology on $\mathcal{M}$. More
precisely, suppose that $\mathcal{M}$ is endowed with a topology for which convex combinations
are continuous curves, i.e., $\alpha\mu + (1 - \alpha)\nu$ converges to $\nu$ or $\mu$ as $\alpha \downarrow 0$ or $\alpha \uparrow 1$,
respectively. Then continuity of our preference order $\succ$ in this topology automatically
implies the Archimedean axiom.
Remark 2.20. As an axiom for consistent behavior in the face of risk, the Archimedean
axiom is less intuitive than the independence axiom. Consider the following
three deterministic distributions: $\nu$ yields 1000 €, $\lambda$ yields 10 €, and $\mu$ is the lottery
where one dies for sure. Even for small $\alpha \in (0, 1)$ it is not clear that someone would
prefer the gamble $\alpha\mu + (1 - \alpha)\nu$, which involves the probability $\alpha$ of dying, over the
conservative 10 € yielded by $\lambda$. Note, however, that most people would not hesitate
to drive a car for a distance of 50 km in order to receive a premium of 1000 €, even
though this might involve the risk of a deadly accident.
◊
Our first goal is to show that the Archimedean axiom and the independence axiom
imply the existence of an affine numerical representation.
Theorem 2.21. Suppose that $\succ$ is a preference relation on $\mathcal{M}$ satisfying both the
Archimedean and the independence axiom. Then there exists an affine numerical
representation $U$ of $\succ$. Moreover, $U$ is unique up to positive affine transformations,
i.e., any other affine numerical representation $\tilde{U}$ with these properties is of the form
$\tilde{U} = aU + b$ for some $a > 0$ and $b \in \mathbb{R}$.
The affinity of a numerical representation does not always imply that it is also of
von Neumann–Morgenstern form; see Exercise 2.2.1 and Example 2.26 below. In
two important cases, however, such an affine numerical representation will already
be of von Neumann–Morgenstern form. This is the content of the following two
corollaries, which we state before proving Theorem 2.21. For the first corollary, we
need the notion of a simple probability distribution. This is a probability measure $\mu$
on $S$ which can be written as a finite convex combination of Dirac masses, i.e., there
exist $x_1, \dots, x_N \in S$ and $\alpha_1, \dots, \alpha_N \in (0, 1]$ with $\sum_{i=1}^{N} \alpha_i = 1$ such that
\[ \mu = \sum_{i=1}^{N} \alpha_i \delta_{x_i}. \]

Corollary 2.22. Suppose that $\mathcal{M}$ is the set of all simple probability distributions on $S$
and that $\succ$ is a preference order on $\mathcal{M}$ that satisfies both the Archimedean and the
independence axiom. Then there exists a von Neumann–Morgenstern representation $U$.
Moreover, both $U$ and $u$ are unique up to positive affine transformations.
Proof. Let $U$ be an affine numerical representation, which exists by Theorem 2.21.
We define $u(x) := U(\delta_x)$, for $x \in S$. If $\mu \in \mathcal{M}$ is of the form $\mu = \alpha_1 \delta_{x_1} + \cdots +
\alpha_N \delta_{x_N}$, then affinity of $U$ implies
\[ U(\mu) = \sum_{i=1}^{N} \alpha_i U(\delta_{x_i}) = \int u(x)\, \mu(dx). \]
This is the desired von Neumann–Morgenstern representation.
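
For simple lotteries the integral in (2.3) is a finite sum, and the affinity of $U$ can be checked directly. The following sketch is illustrative only: a simple lottery is stored as a dictionary mapping outcomes to probabilities, and the utility $u(x) = \sqrt{x}$ as well as the two lotteries are hypothetical choices.

```python
import math

def mix(alpha, mu, nu):
    """Convex combination alpha*mu + (1 - alpha)*nu of two simple lotteries."""
    out = {}
    for x, p in mu.items():
        out[x] = out.get(x, 0.0) + alpha * p
    for x, p in nu.items():
        out[x] = out.get(x, 0.0) + (1.0 - alpha) * p
    return out

def expected_utility(u, mu):
    """U(mu) = sum_i alpha_i u(x_i), i.e. the integral of u with respect to mu."""
    return sum(p * u(x) for x, p in mu.items())

u = math.sqrt                    # assumed utility function on S = [0, infinity)
mu = {0.0: 0.5, 100.0: 0.5}      # fair coin flip between 0 and 100
nu = {36.0: 1.0}                 # Dirac mass at 36
alpha = 0.3

lhs = expected_utility(u, mix(alpha, mu, nu))
rhs = alpha * expected_utility(u, mu) + (1 - alpha) * expected_utility(u, nu)
```

Both sides equal $0.3 \cdot 5 + 0.7 \cdot 6 = 5.7$, which is the affinity property for this particular pair of lotteries.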


On a finite set $S$, every probability measure is simple. Thus, we obtain the following
result as a special case.
Corollary 2.23. Suppose that $\mathcal{M}$ is the set of all probability distributions on a finite
set $S$ and that $\succ$ is a preference order on $\mathcal{M}$ that satisfies both the Archimedean and
the independence axiom. Then there exists a von Neumann–Morgenstern representation,
and it is unique up to positive affine transformations.
For the proof of Theorem 2.21, we need the following auxiliary lemma. Its first assertion
states that taking convex combination is monotone with respect to a preference
order satisfying our two axioms. Its second part can be regarded as an "intermediate
value theorem" for straight lines in $\mathcal{M}$, and (c) is the analogue of the independence
axiom for the indifference relation $\sim$.
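Lemma 2.24. Under the assumptions of Theorem 2.21, the following assertions are
true.
(a) If $\mu \succ \nu$, then $\alpha \mapsto \alpha\mu + (1 - \alpha)\nu$ is strictly increasing with respect to $\succ$. More
precisely, $\beta\mu + (1 - \beta)\nu \succ \alpha\mu + (1 - \alpha)\nu$ for $0 \le \alpha < \beta \le 1$.
(b) If $\mu \succ \nu$ and $\mu \succeq \lambda \succeq \nu$, then there exists a unique $\alpha \in [0, 1]$ with
$\lambda \sim \alpha\mu + (1 - \alpha)\nu$.
(c) If $\mu \sim \nu$, then $\alpha\mu + (1 - \alpha)\lambda \sim \alpha\nu + (1 - \alpha)\lambda$ for all $\alpha \in [0, 1]$ and all $\lambda \in \mathcal{M}$.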
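Proof. (a): Let $\lambda := \beta\mu + (1 - \beta)\nu$. The independence axiom implies that
$\lambda \succ \beta\nu + (1 - \beta)\nu = \nu$. Hence, for $\gamma := \alpha/\beta$,
\[ \beta\mu + (1 - \beta)\nu = (1 - \gamma)\lambda + \gamma\lambda \succ (1 - \gamma)\nu + \gamma\lambda = \alpha\mu + (1 - \alpha)\nu. \]
(b): Part (a) guarantees that $\alpha$ is unique if it exists. To show existence, we need only
consider the case $\mu \succ \lambda \succ \nu$, for otherwise we can take either $\alpha = 0$ or $\alpha = 1$.
The natural candidate is
\[ \alpha := \sup\{ \gamma \in [0, 1] \mid \lambda \succeq \gamma\mu + (1 - \gamma)\nu \}. \]
If $\lambda \sim \alpha\mu + (1 - \alpha)\nu$ is not true, then one of the following two possibilities must
occur:
\[ \lambda \succ \alpha\mu + (1 - \alpha)\nu, \quad \text{or} \quad \lambda \prec \alpha\mu + (1 - \alpha)\nu. \tag{2.4} \]
In the first case, we apply the Archimedean axiom to obtain some $\beta \in (0, 1)$ such that
\[ \lambda \succ \beta[\alpha\mu + (1 - \alpha)\nu] + (1 - \beta)\mu = \gamma\mu + (1 - \gamma)\nu \tag{2.5} \]
for $\gamma = 1 - \beta(1 - \alpha)$. Since $\gamma > \alpha$, it now follows from the definition of $\alpha$ that
$\gamma\mu + (1 - \gamma)\nu \succ \lambda$, which contradicts (2.5). If the second case in (2.4) occurs, the
Archimedean axiom yields some $\beta \in (0, 1)$ such that
\[ \beta(\alpha\mu + (1 - \alpha)\nu) + (1 - \beta)\nu = \beta\alpha\mu + (1 - \beta\alpha)\nu \succ \lambda. \tag{2.6} \]
Clearly $\beta\alpha < \alpha$, so that the definition of $\alpha$ yields some $\gamma \in (\beta\alpha, \alpha]$ with
$\lambda \succeq \gamma\mu + (1 - \gamma)\nu$. Part (a) and the fact that $\beta\alpha < \gamma$ imply that
\[ \lambda \succeq \gamma\mu + (1 - \gamma)\nu \succ \beta\alpha\mu + (1 - \beta\alpha)\nu, \]
which contradicts (2.6).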
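(c): We must exclude both of the following two possibilities
\[ \alpha\mu + (1 - \alpha)\lambda \succ \alpha\nu + (1 - \alpha)\lambda \quad \text{and} \quad \alpha\nu + (1 - \alpha)\lambda \succ \alpha\mu + (1 - \alpha)\lambda. \tag{2.7} \]
To this end, we may assume that there exists some $\rho \in \mathcal{M}$ with $\rho \not\sim \mu \sim \nu$; otherwise
the result is trivial. Let us assume that $\rho \succ \mu \sim \nu$; the case in which $\mu \sim \nu \succ \rho$
is similar. Suppose that the first possibility in (2.7) would occur. The independence
axiom yields
\[ \beta\rho + (1 - \beta)\nu \succ \beta\nu + (1 - \beta)\nu = \nu \sim \mu \]
for all $\beta \in (0, 1)$. Therefore,
\[ \alpha[\beta\rho + (1 - \beta)\nu] + (1 - \alpha)\lambda \succ \alpha\mu + (1 - \alpha)\lambda \quad \text{for all } \beta \in (0, 1). \tag{2.8} \]
Using our assumption that the first possibility in (2.7) is occurring, we obtain from
part (b) a unique $\gamma \in (0, 1)$ such that, for any fixed $\beta$,
\[ \alpha\mu + (1 - \alpha)\lambda \sim \gamma\big( \alpha[\beta\rho + (1 - \beta)\nu] + (1 - \alpha)\lambda \big) + (1 - \gamma)[\alpha\nu + (1 - \alpha)\lambda] \]
\[ = \alpha[\beta\gamma\rho + (1 - \beta\gamma)\nu] + (1 - \alpha)\lambda \succ \alpha\mu + (1 - \alpha)\lambda, \]
where we have used (2.8) for $\beta$ replaced by $\beta\gamma$ in the last step. This is a contradiction.
The second possibility in (2.7) is excluded by an analogous argument.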
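Proof of Theorem 2.21. For the construction of $U$, we first fix two lotteries $\lambda$ and $\rho$ with $\lambda \succ \rho$ and define
\[ \mathcal{M}(\lambda, \rho) := \{\mu \in \mathcal{M} \mid \lambda \succeq \mu \succeq \rho\}; \]
the assertion is trivial if no such pair $\lambda \succ \rho$ exists. If $\mu \in \mathcal{M}(\lambda, \rho)$, part (b) of Lemma 2.24 yields a unique $\alpha \in [0,1]$ such that $\mu \sim \alpha\lambda + (1-\alpha)\rho$, and we put $U(\mu) := \alpha$. To prove that $U$ is a numerical representation of $\succ$ on $\mathcal{M}(\lambda, \rho)$, we must show that for $\nu, \mu \in \mathcal{M}(\lambda, \rho)$ we have $U(\mu) > U(\nu)$ if and only if $\mu \succ \nu$. To prove sufficiency, we apply part (a) of Lemma 2.24 to conclude that
\[ \mu \sim U(\mu)\lambda + (1-U(\mu))\rho \succ U(\nu)\lambda + (1-U(\nu))\rho \sim \nu. \]
Hence $\mu \succ \nu$. Conversely, if $\mu \succ \nu$ then the preceding arguments already imply that we cannot have $U(\nu) > U(\mu)$. Thus, it suffices to rule out the case $U(\mu) = U(\nu)$. But if $U(\mu) = U(\nu)$, then the definition of $U$ yields $\mu \sim \nu$, which contradicts $\mu \succ \nu$. We conclude that $U$ is indeed a numerical representation of $\succ$ restricted to $\mathcal{M}(\lambda, \rho)$.
Let us now show that $\mathcal{M}(\lambda, \rho)$ is a convex set. Take $\mu, \nu \in \mathcal{M}(\lambda, \rho)$ and $\alpha \in [0,1]$. Then
\[ \lambda \succeq \alpha\lambda + (1-\alpha)\nu \succeq \alpha\mu + (1-\alpha)\nu, \]
using the independence axiom to handle the cases $\lambda \succ \nu$ and $\lambda \succ \mu$, and part (c) of Lemma 2.24 for $\lambda \sim \nu$ and for $\lambda \sim \mu$. By the same argument it follows that $\alpha\mu + (1-\alpha)\nu \succeq \rho$, which implies the convexity of the set $\mathcal{M}(\lambda, \rho)$.
Therefore, $U(\alpha\mu + (1-\alpha)\nu)$ is well defined; we proceed to show that it equals $\alpha U(\mu) + (1-\alpha)U(\nu)$. To this end, we apply part (c) of Lemma 2.24 twice:
\[ \alpha\mu + (1-\alpha)\nu \sim \alpha\big(U(\mu)\lambda + (1-U(\mu))\rho\big) + (1-\alpha)\big(U(\nu)\lambda + (1-U(\nu))\rho\big) \]
\[ = \big[\alpha U(\mu) + (1-\alpha)U(\nu)\big]\lambda + \big[1 - \alpha U(\mu) - (1-\alpha)U(\nu)\big]\rho. \]
The definition of $U$ and the uniqueness in part (b) of Lemma 2.24 imply that
\[ U(\alpha\mu + (1-\alpha)\nu) = \alpha U(\mu) + (1-\alpha)U(\nu). \]
So $U$ is indeed an affine numerical representation of $\succ$ on $\mathcal{M}(\lambda, \rho)$.
In a further step, we now show that the affine numerical representation $U$ on $\mathcal{M}(\lambda, \rho)$ is unique up to positive affine transformations. So let $\tilde U$ be another affine numerical representation of $\succ$ on $\mathcal{M}(\lambda, \rho)$, and define
\[ \hat U(\mu) := \frac{\tilde U(\mu) - \tilde U(\rho)}{\tilde U(\lambda) - \tilde U(\rho)}, \qquad \mu \in \mathcal{M}(\lambda, \rho). \]
Then $\hat U$ is a positive affine transformation of $\tilde U$, and $\hat U(\rho) = 0 = U(\rho)$ as well as $\hat U(\lambda) = 1 = U(\lambda)$. Hence, affinity of $\hat U$ and the definition of $U$ imply
\[ \hat U(\mu) = \hat U\big(U(\mu)\lambda + (1-U(\mu))\rho\big) = U(\mu)\hat U(\lambda) + (1-U(\mu))\hat U(\rho) = U(\mu) \]
for all $\mu \in \mathcal{M}(\lambda, \rho)$. Thus $\hat U = U$.
Finally, we have to show that $U$ can be extended as a numerical representation to the full space $\mathcal{M}$. To this end, we first take $\tilde\lambda, \tilde\rho \in \mathcal{M}$ such that $\mathcal{M}(\tilde\lambda, \tilde\rho) \supset \mathcal{M}(\lambda, \rho)$. By the arguments in the first part of this proof, there exists an affine numerical representation $\tilde U$ of $\succ$ on $\mathcal{M}(\tilde\lambda, \tilde\rho)$, and we may assume that $\tilde U(\lambda) = 1$ and $\tilde U(\rho) = 0$; otherwise we apply a positive affine transformation to $\tilde U$. By the previous step of the proof, $\tilde U$ coincides with $U$ on $\mathcal{M}(\lambda, \rho)$, and so $\tilde U$ is the unique consistent extension of $U$. Since each lottery belongs to some set $\mathcal{M}(\tilde\lambda, \tilde\rho)$, the affine numerical representation $U$ can be uniquely extended to all of $\mathcal{M}$.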
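The construction $U(\mu) := \alpha$ from the proof can be imitated numerically. The following is a minimal sketch, not part of the original argument: it assumes a hypothetical three-point outcome space whose preference order happens to be generated by the utilities in `u`, and it recovers $\alpha$ by bisection using only preference comparisons, mirroring the supremum in part (b) of Lemma 2.24. All names (`u`, `weakly_prefers`, `mix`, `U`) are illustrative.

```python
# Hypothetical setup: three outcomes with utilities u, lotteries as probability lists.
u = [0.0, 1.0, 5.0]

def value(mu):
    """Expected utility, used here only to define the preference order."""
    return sum(p * ui for p, ui in zip(mu, u))

def weakly_prefers(a, b):
    """True if lottery a is at least as good as lottery b."""
    return value(a) >= value(b)

def mix(a, b, alpha):
    """Convex combination alpha*a + (1-alpha)*b."""
    return [alpha * p + (1 - alpha) * q for p, q in zip(a, b)]

def U(mu, lam, rho, iters=60):
    """Bisection for sup{gamma : mu is weakly preferred to gamma*lam + (1-gamma)*rho}."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if weakly_prefers(mu, mix(lam, rho, mid)):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

lam, rho = [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]   # best and worst lottery
mu, nu = [0.2, 0.5, 0.3], [0.5, 0.5, 0.0]

a_mu, a_nu = U(mu, lam, rho), U(nu, lam, rho)
a_mix = U(mix(mu, nu, 0.5), lam, rho)
print(a_mu, a_nu, a_mix)   # a_mix should agree with 0.5*a_mu + 0.5*a_nu
```

In this toy setting the recovered values coincide with the normalization $(V(\mu) - V(\rho))/(V(\lambda) - V(\rho))$ of the underlying expected utility $V$, which is the uniqueness statement of the theorem in miniature.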
Remark 2.25. In the proof of the preceding theorem, we did not use the fact that the elements of $\mathcal{M}$ are probability measures. All that was needed was convexity of the set $\mathcal{M}$, the Archimedean axiom, and the independence axiom. Yet, even the concept of convexity can be generalized by introducing the notion of a mixture space; see, e.g., [187], [115], or [151]. ◊
Let us now return to the problem of constructing a von Neumann–Morgenstern representation for preference relations on distributions. If $\mathcal{M}$ is the set of all probability measures on a finite set $S$, any affine numerical representation is already of this form, as we saw in the proof of Corollary 2.23. However, the situation becomes more involved if we take an infinite set $S$. In fact, the following examples show that in this case a von Neumann–Morgenstern representation may not exist.
Exercise 2.2.1. Let $\mathcal{M}$ be the set of probability measures $\mu$ on $S := \{1, 2, \dots\}$ for which $U(\mu) := \lim_{k \uparrow \infty} k^2 \mu(k)$ exists as a finite real number. Show that $U$ is affine and induces a preference order on $\mathcal{M}$ which satisfies both the Archimedean and the independence axiom. Show next that $U$ does not admit a von Neumann–Morgenstern representation. ◊
Example 2.26. Let $\mathcal{M}$ be the set of all Borel probability measures on $S = [0,1]$, and denote by $\lambda$ the Lebesgue measure on $S$. According to the Lebesgue decomposition theorem, which is recalled in Theorem A.13, every $\mu \in \mathcal{M}$ can be decomposed as
\[ \mu = \mu_s + \mu_a, \]
where $\mu_s$ is singular with respect to $\lambda$, and $\mu_a$ is absolutely continuous. We define a function $U : \mathcal{M} \to [0,1]$ by
\[ U(\mu) := \int x \, \mu_a(dx). \]
It is easily seen that $U$ is an affine function on $\mathcal{M}$. Hence, $U$ induces a preference order $\succ$ on $\mathcal{M}$ which satisfies both the Archimedean and the independence axioms. But $\succ$ cannot have a von Neumann–Morgenstern representation: Since $U(\delta_x) = 0$ for all $x$, the only possible choice for $u$ in (2.3) would be $u \equiv 0$. So the preference relation would be trivial in the sense that $\mu \sim \lambda$ for all $\mu \in \mathcal{M}$, in contradiction for instance to $U(\lambda) = \frac{1}{2}$ and $U(\delta_{1/2}) = 0$. ◊
One way to obtain a von Neumann–Morgenstern representation is to assume additional continuity properties of $\succ$, where continuity is understood in the sense of Definition 2.8. As we have already remarked, the Archimedean axiom holds automatically if taking convex combinations is continuous for the topology on $\mathcal{M}$. This is indeed the case for the weak topology on the set $\mathcal{M}_1(S, \mathcal{S})$ of all probability measures on a separable metric space $S$, endowed with the $\sigma$-field $\mathcal{S}$ of Borel sets. The space $S$ will be fixed for the rest of this section, and we will simply write $\mathcal{M}_1(S) = \mathcal{M}_1(S, \mathcal{S})$.
Theorem 2.27. Let $\mathcal{M} := \mathcal{M}_1(S)$ be the space of all probability measures on $S$ endowed with the weak topology, and let $\succ$ be a continuous preference order on $\mathcal{M}$ satisfying the independence axiom. Then there exists a von Neumann–Morgenstern representation
\[ U(\mu) = \int u(x) \, \mu(dx) \]
for which the function $u : S \to \mathbb{R}$ is bounded and continuous. Moreover, $U$ and $u$ are unique up to positive affine transformations.
Proof. Let $\mathcal{M}_s$ denote the set of all simple probability distributions on $S$. Since continuity of $\succ$ implies the Archimedean axiom, we deduce from Corollary 2.22 that $\succ$ restricted to $\mathcal{M}_s$ has a von Neumann–Morgenstern representation.
Let us show that the function $u$ in this representation is bounded. For instance, if $u$ is not bounded from above, then there are $x_0, x_1, \dots \in S$ such that $u(x_0) < u(x_1)$ and $u(x_n) > n$. Now let
\[ \mu_n := \Big(1 - \frac{1}{\sqrt{n}}\Big)\delta_{x_0} + \frac{1}{\sqrt{n}}\,\delta_{x_n}. \]
Clearly, $\mu_n \to \delta_{x_0}$ weakly as $n \uparrow \infty$. The continuity of $\succ$ together with the assumption that $\delta_{x_1} \succ \delta_{x_0}$ imply that $\delta_{x_1} \succ \mu_n$ for all large $n$. However, $U(\mu_n) > \sqrt{n}$ for all $n$, in contradiction to $\delta_{x_1} \succ \mu_n$.
Suppose that the function $u$ is not continuous. Then there exists some $x \in S$ and a sequence $(x_n)_{n \in \mathbb{N}} \subset S$ such that $x_n \to x$ but $u(x_n) \not\to u(x)$. By taking a subsequence if necessary, we can assume that $u(x_n)$ converges to some number $a \ne u(x)$. Suppose that $u(x) - a =: \varepsilon > 0$. Then there exists some $m$ such that $|u(x_n) - a| < \varepsilon/3$ for all $n \ge m$. Let $\mu := \frac{1}{2}(\delta_x + \delta_{x_m})$. For all $n \ge m$,
\[ U(\delta_x) = a + \varepsilon > a + \frac{2\varepsilon}{3} > \frac{1}{2}\big(u(x) + u(x_m)\big) = U(\mu) > a + \frac{\varepsilon}{3} > U(\delta_{x_n}). \]
Therefore $\delta_x \succ \mu \succ \delta_{x_n}$, although $\delta_{x_n}$ converges weakly to $\delta_x$, in contradiction to the continuity of $\succ$. The case $u(x) < a$ is excluded in the same manner.
Let us finally show that
\[ U(\mu) := \int u(x) \, \mu(dx) \quad\text{for } \mu \in \mathcal{M} \]
defines a numerical representation of $\succ$ on all of $\mathcal{M}$. Since $u$ is bounded and continuous, $U$ is continuous with respect to the weak topology on $\mathcal{M}$. Moreover, Theorem A.38 states that $\mathcal{M}_s$ is a dense subset of the connected metrizable space $\mathcal{M}$. So the proof is completed by an application of Lemma 2.16.
The scope of the preceding theorem is limited insofar as it involves only bounded functions $u$. This will not be flexible enough for our purposes. In the next section, for instance, we will consider risk-averse preferences which are defined in terms of concave functions $u$ on the space $S = \mathbb{R}$. Such a function cannot be bounded unless it is constant. Thus, we must relax the conditions of the previous theorem. We will present two approaches. In our first approach, we fix some point $x_0 \in S$ and denote by $\overline{B}_r(x_0)$ the closed metric ball of radius $r$ around $x_0$. The space of boundedly supported measures on $S$ is given by
\[ \mathcal{M}_b(S) := \bigcup_{r > 0} \mathcal{M}_1(\overline{B}_r(x_0)) = \big\{\mu \in \mathcal{M}_1(S) \mid \mu(\overline{B}_r(x_0)) = 1 \text{ for some } r \ge 0\big\}. \]
Clearly, this definition does not depend on the particular choice of $x_0$.
Corollary 2.28. Let $\succ$ be a preference order on $\mathcal{M}_b(S)$ whose restriction to each space $\mathcal{M}_1(\overline{B}_r(x_0))$ is continuous with respect to the weak topology. If $\succ$ satisfies the independence axiom, then there exists a von Neumann–Morgenstern representation
\[ U(\mu) = \int u(x) \, \mu(dx) \]
with a continuous function $u : S \to \mathbb{R}$. Moreover, $U$ and $u$ are unique up to positive affine transformations.
Proof. Theorem 2.27 yields a von Neumann–Morgenstern representation of the restriction of $\succ$ to $\mathcal{M}_1(\overline{B}_r(x_0))$ in terms of some continuous function $u_r : \overline{B}_r(x_0) \to \mathbb{R}$. The uniqueness part of the theorem implies that the restriction of $u_r$ to some smaller ball $\overline{B}_{r'}(x_0)$ must be equal to $u_{r'}$ up to a positive affine transformation. Thus, it is possible to find a unique continuous extension $u : S \to \mathbb{R}$ of $u_{r'}$ which defines a von Neumann–Morgenstern representation of $\succ$ on each set $\mathcal{M}_1(\overline{B}_r(x_0))$.
Our second variant of Theorem 2.27 includes measures with unbounded support, but we need stronger continuity assumptions. Let $\psi$ be a continuous function with values in $[1, \infty)$ on the separable metric space $S$. We use $\psi$ as a gauge function and define
\[ \mathcal{M}_1^\psi(S) := \Big\{\mu \in \mathcal{M}_1(S) \;\Big|\; \int \psi(x) \, \mu(dx) < \infty\Big\}. \]
A suitable space of continuous test functions for measures in $\mathcal{M}_1^\psi(S)$ is provided by
\[ C_\psi(S) := \big\{f \in C(S) \mid \exists\, c : |f(x)| \le c \cdot \psi(x) \text{ for all } x \in S\big\}. \]
These test functions can now be used to define a topology on $\mathcal{M}_1^\psi(S)$ in precisely the same way one uses the set of bounded continuous functions to define the weak topology: A sequence $(\mu_n)$ in $\mathcal{M}_1^\psi(S)$ converges to some $\mu \in \mathcal{M}_1^\psi(S)$ if and only if
\[ \int f \, d\mu_n \to \int f \, d\mu \quad\text{for all } f \in C_\psi(S). \]
To be rigorous, one should first define a neighborhood base for the topology and then check that this topology is metrizable, so that it suffices indeed to consider the convergence of sequences; the reader will find all necessary details in Appendix A.6. We will call this topology the $\psi$-weak topology on $\mathcal{M}_1^\psi(S)$. If we take the trivial case $\psi \equiv 1$, $C_\psi(S)$ consists of all bounded continuous functions, and we recover the standard weak topology on $\mathcal{M}_1^1(S) = \mathcal{M}_1(S)$. However, by taking $\psi$ as some unbounded function, we can also include von Neumann–Morgenstern representations in terms of unbounded functions $u$. The following theorem is a version of Theorem 2.27 for the $\psi$-weak topology. Its proof is analogous to that of Theorem 2.27, and we leave it to the reader to fill in the details.
Theorem 2.29. Let $\succ$ be a preference order on $\mathcal{M}_1^\psi(S)$ that is continuous in the $\psi$-weak topology and satisfies the independence axiom. Then there exists a numerical representation $U$ of von Neumann–Morgenstern form
\[ U(\mu) = \int u(x) \, \mu(dx) \]
with a function $u \in C_\psi(S)$. Moreover, $U$ and $u$ are unique up to positive affine transformations.
So far, we have presented the classical theory of expected utility, starting with the
independence axiom and the Archimedean axiom. However, it is well known that in
reality people may not behave according to this paradigm.
Example 2.30 (Allais paradox). The so-called Allais paradox questions the descriptive aspect of expected utility by considering the following lotteries. Lottery
\[ \nu_1 = 0.33\,\delta_{2500} + 0.66\,\delta_{2400} + 0.01\,\delta_0 \]
yields 2500 € with a probability of 0.33, 2400 € with probability 0.66, and draws a blank with the remaining probability of 0.01. Lottery
\[ \mu_1 := \delta_{2400} \]
yields 2400 € for sure. When asked, most people prefer the sure amount even though lottery $\nu_1$ has the larger expected value, namely 2409 €.
Next, consider the following two lotteries $\mu_2$ and $\nu_2$:
\[ \mu_2 := 0.34\,\delta_{2400} + 0.66\,\delta_0 \quad\text{and}\quad \nu_2 := 0.33\,\delta_{2500} + 0.67\,\delta_0. \]
Here people tend to prefer the slightly riskier lottery $\nu_2$ over $\mu_2$, in accordance with the expectations of $\nu_2$ and $\mu_2$, which are 825 € and 816 €, respectively.
This observation is due to M. Allais [5]. It was confirmed by D. Kahneman and A. Tversky [165] in empirical tests where 82 % of interviewees preferred $\mu_1$ over $\nu_1$ while 83 % chose $\nu_2$ rather than $\mu_2$. This means that at least 65 % chose both $\mu_1 \succ \nu_1$ and $\nu_2 \succ \mu_2$. As pointed out by M. Allais, this simultaneous choice leads to a paradox in the sense that it is inconsistent with the von Neumann–Morgenstern paradigm. More precisely, any preference relation $\succ$ for which $\mu_1 \succ \nu_1$ and $\nu_2 \succ \mu_2$ are both valid violates the independence axiom, as we will show now. If the independence axiom were satisfied, then necessarily
\[ \alpha\mu_1 + (1-\alpha)\nu_2 \succ \alpha\nu_1 + (1-\alpha)\nu_2 \succ \alpha\nu_1 + (1-\alpha)\mu_2 \]
for all $\alpha \in (0,1)$. By taking $\alpha = 1/2$ we would arrive at
\[ \frac{1}{2}(\mu_1 + \nu_2) \succ \frac{1}{2}(\nu_1 + \mu_2), \]
which is a contradiction to the fact that
\[ \frac{1}{2}(\mu_1 + \nu_2) = \frac{1}{2}(\nu_1 + \mu_2). \]
Therefore, the independence axiom was violated by at least 65 % of the people who were interviewed. This effect is empirical evidence against the von Neumann–Morgenstern theory as a descriptive theory. Even from a normative point of view, there are good reasons to go beyond our present setting, and this will be done in Section 2.5. In particular, we will take a second look at the Allais paradox in Remark 2.72. ◊
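The arithmetic behind the Allais paradox is easy to check by machine. The sketch below is an illustration only: it encodes the four lotteries as dictionaries from payoff to probability (the names `mu1`, `nu1`, `mu2`, `nu2`, `mean`, and `mix` are chosen here for convenience), computes their expectations with exact rational arithmetic, and confirms that the two 50/50 mixtures coincide.

```python
from fractions import Fraction as F

# Lotteries as dicts {payoff: probability}; exact rational arithmetic.
mu1 = {2400: F(1)}
nu1 = {2500: F(33, 100), 2400: F(66, 100), 0: F(1, 100)}
mu2 = {2400: F(34, 100), 0: F(66, 100)}
nu2 = {2500: F(33, 100), 0: F(67, 100)}

def mean(lottery):
    """Expected payoff of a lottery."""
    return sum(x * p for x, p in lottery.items())

def mix(a, b, alpha):
    """Convex combination alpha*a + (1-alpha)*b of two lotteries."""
    out = {}
    for lot, weight in ((a, alpha), (b, 1 - alpha)):
        for x, p in lot.items():
            out[x] = out.get(x, 0) + weight * p
    return out

half = F(1, 2)
left = mix(mu1, nu2, half)    # (mu1 + nu2) / 2
right = mix(nu1, mu2, half)   # (nu1 + mu2) / 2

print(mean(nu1), mean(mu2), mean(nu2))
print(left == right)
```

Since `left` and `right` are the same distribution, no preference order can rank one strictly above the other, which is exactly the contradiction derived in the example.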

2.3 Expected utility

In this section, we focus on individual financial assets under the assumption that their payoff distributions at a fixed time are known, and without any regard to hedging opportunities in the context of a financial market model. Such asset distributions may be viewed as lotteries with monetary outcomes in some interval on the real line. Thus, we take $\mathcal{M}$ as a fixed set of Borel probability measures on a fixed interval $S \subset \mathbb{R}$. In this setting, we discuss the paradigm of expected utility in its standard form, where the function $u$ appearing in the von Neumann–Morgenstern representation has additional properties suggested by the monetary interpretation. We introduce risk aversion and certainty equivalents, and illustrate these notions with a number of examples.
Throughout this section, we assume that $\mathcal{M}$ is convex and contains all point masses $\delta_x$ for $x \in S$. We assume also that each $\mu \in \mathcal{M}$ has a well-defined expectation
\[ m(\mu) := \int x \, \mu(dx) \in \mathbb{R}. \]

Remark 2.31. For an asset whose (discounted) random payoff has a known distribution $\mu$, the expected value $m(\mu)$ is often called the fair price of the asset. For an insurance contract where $\mu$ is the distribution of payments to be received by the insured party, depending on some random damage within a given period, the expected value $m(\mu)$ is also called the fair premium. Typically, actual asset prices and actual insurance premiums will be different from these values. In many situations, such differences can be explained within the conceptual framework of expected utility, and in particular in terms of risk aversion. ◊
Definition 2.32. A preference relation $\succ$ on $\mathcal{M}$ is called monotone if
\[ x > y \quad\text{implies}\quad \delta_x \succ \delta_y. \]
The preference relation is called risk averse if for $\mu \in \mathcal{M}$
\[ \delta_{m(\mu)} \succ \mu \quad\text{unless } \mu = \delta_{m(\mu)}. \]
It is easy to characterize these properties within the class of preference relations which admit a von Neumann–Morgenstern representation.
Proposition 2.33. Suppose the preference relation $\succ$ has a von Neumann–Morgenstern representation
\[ U(\mu) = \int u \, d\mu. \]
Then
(a) $\succ$ is monotone if and only if $u$ is strictly increasing.
(b) $\succ$ is risk averse if and only if $u$ is strictly concave.
Proof. (a): Monotonicity is equivalent to
\[ u(x) = U(\delta_x) > U(\delta_y) = u(y) \quad\text{for } x > y. \]
(b): If $\succ$ is risk averse, then
\[ \delta_{\alpha x + (1-\alpha)y} \succ \alpha\delta_x + (1-\alpha)\delta_y \]
holds for all distinct $x, y \in S$ and $\alpha \in (0,1)$. Hence,
\[ u(\alpha x + (1-\alpha)y) > \alpha u(x) + (1-\alpha)u(y), \]
i.e., $u$ is strictly concave. Conversely, if $u$ is strictly concave, then Jensen's inequality implies risk aversion:
\[ U(\delta_{m(\mu)}) = u\Big(\int x \, \mu(dx)\Big) \ge \int u(x) \, \mu(dx) = U(\mu), \]
with equality if and only if $\mu = \delta_{m(\mu)}$.
Remark 2.34. In view of the monetary interpretation of the state space $S$, it is natural to assume that the preference relation $\succ$ is monotone. The assumption of risk aversion is more debatable, at least from a descriptive point of view. In fact, there is considerable empirical evidence that agents tend to switch between risk aversion and risk-seeking behavior, depending on the context. In particular, they may be risk averse after prior gains, and they may become risk seeking if they see an opportunity to compensate prior losses. Tversky and Kahneman [261] propose to describe such a behavioral pattern by a function $u$ of the form
\[ u(x) = \begin{cases} (x - c)^\gamma & \text{for } x \ge c, \\ -\lambda (c - x)^\gamma & \text{for } x < c, \end{cases} \tag{2.9} \]
where $c$ is a given benchmark level, and their experiments suggest parameter values $\lambda$ around 2 and $\gamma$ slightly less than 1. Nevertheless, one can insist on risk aversion from a normative point of view, and in the sequel we explore some of its consequences. ◊
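To see the switch between risk aversion and risk seeking concretely, one can evaluate a function of the form (2.9) on a pure-gain and a pure-loss lottery. The snippet below is a small sketch with assumed parameter values ($c = 0$, $\lambda = 2$, $\gamma = 0.9$) and two hypothetical 50/50 lotteries; the function and variable names are illustrative.

```python
def s_shaped(x, c=0.0, lam=2.0, gamma=0.9):
    """S-shaped function of the form (2.9) with benchmark level c."""
    if x >= c:
        return (x - c) ** gamma
    return -lam * (c - x) ** gamma

def expected(u, lottery):
    """Expected value of u under a lottery given as (payoff, probability) pairs."""
    return sum(p * u(x) for x, p in lottery)

gain = [(0.0, 0.5), (10.0, 0.5)]     # mean payoff 5
loss = [(-10.0, 0.5), (0.0, 0.5)]    # mean payoff -5

# Positive gap: the sure mean is preferred to the lottery (risk aversion).
gain_gap = s_shaped(5.0) - expected(s_shaped, gain)
# Negative gap: the lottery is preferred to the sure mean (risk seeking).
loss_gap = s_shaped(-5.0) - expected(s_shaped, loss)
print(gain_gap, loss_gap)
```

The sign change reflects that $u$ is concave above the benchmark and convex below it.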
Definition 2.35. A function $u : S \to \mathbb{R}$ is called a utility function if it is strictly concave, strictly increasing, and continuous on $S$. A von Neumann–Morgenstern representation
\[ U(\mu) = \int u \, d\mu \tag{2.10} \]
in terms of a utility function $u$ is called an expected utility representation.
Any increasing concave function $u : S \to \mathbb{R}$ is necessarily continuous on every interval $(a, b] \subset S$; see Proposition A.4. Hence, the condition of continuity in the preceding definition is only relevant if $S$ contains its lower boundary point. Note that any utility function $u(x)$ decreases at least linearly as $x \downarrow \inf S$. Therefore, $u$ cannot be bounded from below unless $\inf S > -\infty$.
From now on, we will consider a fixed preference relation $\succ$ on $\mathcal{M}$ which admits a von Neumann–Morgenstern representation
\[ U(\mu) = \int u \, d\mu \]
in terms of a strictly increasing continuous function $u : S \to \mathbb{R}$. The intermediate value theorem applied to the function $u$ yields for any $\mu \in \mathcal{M}$ a unique real number $c(\mu)$ for which
\[ u(c(\mu)) = U(\mu) = \int u \, d\mu. \tag{2.11} \]
It follows that
\[ \delta_{c(\mu)} \sim \mu, \]
i.e., there is indifference between the lottery $\mu$ and the sure amount of money $c(\mu)$.
Definition 2.36. The certainty equivalent of the lottery $\mu \in \mathcal{M}$ with respect to $u$ is defined as the number $c(\mu)$ of (2.11), and
\[ \varrho(\mu) := m(\mu) - c(\mu) \]
is called the risk premium of $\mu$.
When $u$ is a utility function, risk aversion implies that
\[ u(c(\mu)) = U(\mu) < U(\delta_{m(\mu)}) = u(m(\mu)) \]
for every lottery $\mu$ with $\mu \ne \delta_{m(\mu)}$. Hence, monotonicity yields that
\[ c(\mu) < m(\mu) \quad\text{for } \mu \ne \delta_{m(\mu)}. \]
In particular, the risk premium $\varrho(\mu)$ associated with a utility function is always nonnegative, and it is strictly positive as soon as the distribution $\mu$ carries any risk.
Remark 2.37. The certainty equivalent $c(\mu)$ can be viewed as an upper bound for any price of $\mu$ which would be acceptable to an economic agent with utility function $u$. Thus, the fair price $m(\mu)$ must be reduced at least by the risk premium $\varrho(\mu)$ if one wants the agent to buy the asset distribution $\mu$. Alternatively, suppose that the agent holds an asset with distribution $\mu$. Then the risk premium may be viewed as the amount that the agent would be ready to pay for replacing the asset by its expected value $m(\mu)$. ◊
Example 2.38 (St. Petersburg paradox). Consider the lottery
\[ \mu = \sum_{n=1}^{\infty} 2^{-n}\,\delta_{2^{n-1}}, \]
which may be viewed as the payoff distribution of the following game. A fair coin is tossed until a head appears. If the head appears on the $n$th toss, the payoff will be $2^{n-1}$ €. Up to the early 18th century, it was commonly accepted that the price of a lottery should be computed as the fair price, i.e., as the expected value $m(\mu)$. In the present example, the fair price is given by $m(\mu) = \infty$, but it is hard to find someone who is ready to pay even 20 €. In view of this paradox, posed by Nicholas Bernoulli in 1713, Gabriel Cramer and Daniel Bernoulli [25] independently introduced the idea of determining an acceptable price as the certainty equivalent with respect to some utility function. For the two utility functions
\[ u_1(x) = \sqrt{x} \quad\text{and}\quad u_2(x) = \log x \]
proposed, respectively, by G. Cramer and by D. Bernoulli, these certainty equivalents are given by
\[ c_1(\mu) = (2 - \sqrt{2})^{-2} \approx 2.91 \quad\text{and}\quad c_2(\mu) = 2, \]
and this is within the range of prices people are usually ready to pay. Note, however, that for any utility function which is unbounded from above we could modify the payoff in such a way that the paradox reappears. For example, we could replace the payoff $2^n$ by $u^{-1}(2^n)$ for $n \ge 1000$, so that $\int u \, d\mu = +\infty$. The choice of a utility function that is bounded from above would remove this difficulty, but would create others; see the discussion on pp. 77–80. ◊
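The two certainty equivalents can be reproduced numerically by truncating the series. The following sketch is an illustration under the assumption that 200 terms suffice (the neglected tail of the expected utility is negligible for both utility functions); the helper names are chosen here and do not come from the text.

```python
import math

def certainty_equivalent(u, u_inv, n_terms=200):
    """Certainty equivalent of the St. Petersburg lottery, truncated at n_terms.

    The lottery pays 2**(n-1) with probability 2**(-n), n = 1, 2, ...
    """
    expected_utility = sum(2.0 ** (-n) * u(2.0 ** (n - 1)) for n in range(1, n_terms + 1))
    return u_inv(expected_utility)

def truncated_mean(n_terms):
    """Partial sum of the expected payoff; each term contributes exactly 1/2."""
    return sum(2.0 ** (-n) * 2.0 ** (n - 1) for n in range(1, n_terms + 1))

c1 = certainty_equivalent(math.sqrt, lambda y: y * y)   # Cramer: u(x) = sqrt(x)
c2 = certainty_equivalent(math.log, math.exp)           # Bernoulli: u(x) = log(x)
print(c1, c2, truncated_mean(10), truncated_mean(100))
```

The partial sums of the mean grow linearly in the number of terms, which is the numerical face of $m(\mu) = \infty$, while both certainty equivalents stabilize quickly.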
Given the preference order $\succ$ on $\mathcal{M}$, we can now try to determine those distributions in $\mathcal{M}$ which are maximal with respect to $\succ$. As a first illustration, consider the following simple optimization problem. Let $X$ be an integrable random variable on some probability space $(\Omega, \mathcal{F}, P)$ with nondegenerate distribution $\mu \in \mathcal{M}$. We assume that $X$ is bounded from below by some number $a$ in the interior of $S$. Which is the best mix
\[ X_\lambda := (1-\lambda)X + \lambda c \]
of the risky payoff $X$ and the certain amount $c$, which also belongs to the interior of $S$? If we evaluate $X_\lambda$ by its expected utility $E[\,u(X_\lambda)\,]$ and denote by $\mu_\lambda$ the distribution of $X_\lambda$ under $P$, then we are looking for a maximum of the function $f$ on $[0,1]$ defined by
\[ f(\lambda) := U(\mu_\lambda) = \int u \, d\mu_\lambda = E[\,u((1-\lambda)X + \lambda c)\,]. \]
When $u$ is a utility function, $f$ is strictly concave and attains its maximum at a unique point $\lambda^* \in [0,1]$.
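Proposition 2.39. Let $u$ be a utility function.
(a) We have $\lambda^* = 1$ if $E[\,X\,] \le c$, and $\lambda^* > 0$ if $c \ge c(\mu)$.
(b) If $u$ is differentiable, then
\[ \lambda^* = 1 \iff E[\,X\,] \le c \]
and
\[ \lambda^* = 0 \iff c \le \frac{E[\,X u'(X)\,]}{E[\,u'(X)\,]}. \]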
Proof. (a): Jensen's inequality yields that
\[ f(\lambda) \le u(E[\,X_\lambda\,]) = u\big((1-\lambda)E[\,X\,] + \lambda c\big), \]
with equality if and only if $\lambda = 1$. It follows that $\lambda^* = 1$ if the right-hand side is increasing in $\lambda$, i.e., if $E[\,X\,] \le c$.
Strict concavity of $u$ implies
\[ f(\lambda) \ge E[\,(1-\lambda)u(X) + \lambda u(c)\,] = (1-\lambda)u(c(\mu)) + \lambda u(c), \]
with equality if and only if $\lambda \in \{0, 1\}$. The right-hand side is increasing in $\lambda$ if $c \ge c(\mu)$, and this implies $\lambda^* > 0$.
(b): Clearly, we have $\lambda^* = 0$ if and only if the right-hand derivative $f'_+$ of $f$ satisfies $f'_+(0) \le 0$; see Appendix A.1 for the definition of $f'_+$ and $f'_-$. Note that the difference quotients
\[ \frac{u(X_\lambda) - u(X)}{\lambda} = \frac{u(X_\lambda) - u(X)}{X_\lambda - X} \cdot (c - X) \]
are $P$-a.s. bounded by
\[ u'_+(a \wedge c)\,|c - X| \in L^1(P) \]
and that they converge to
\[ u'_+(X)(c - X)^+ - u'_-(X)(c - X)^- \]
as $\lambda \downarrow 0$. By Lebesgue's theorem, this implies
\[ f'_+(0) = E[\,u'_+(X)(c - X)^+\,] - E[\,u'_-(X)(c - X)^-\,]. \]
If $u$ is differentiable, or if the countable set $\{x \mid u'_+(x) \ne u'_-(x)\}$ has $\mu$-measure 0, then we can conclude
\[ f'_+(0) = E[\,u'(X)(c - X)\,], \]
i.e., $f'_+(0) \le 0$ if and only if
\[ c \le \frac{E[\,X u'(X)\,]}{E[\,u'(X)\,]}. \]
In the same way, we obtain
\[ f'_-(1) = u'_-(c)\,E[\,(X - c)^-\,] - u'_+(c)\,E[\,(X - c)^+\,]. \]
If $u$ is differentiable at $c$, then we can conclude
\[ f'_-(1) = u'(c)\,(c - E[\,X\,]). \]
This implies $f'_-(1) < 0$, and hence $\lambda^* < 1$, if and only if $E[\,X\,] > c$.
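The thresholds in Proposition 2.39 can be observed on a small example. The sketch below assumes a hypothetical payoff $X$ taking the values 50 and 150 with equal probability and the utility $u = \log$, for which $E[X u'(X)]/E[u'(X)]$ is the harmonic mean of $X$ (here 75) and $E[X] = 100$. The maximizer of the strictly concave function $f$ is located by a simple ternary search; all names are illustrative.

```python
import math

xs, ps = [50.0, 150.0], [0.5, 0.5]      # hypothetical payoff X: values and probabilities

def f(lmbda, c, u=math.log):
    """Expected utility of the mix (1 - lambda) * X + lambda * c."""
    return sum(p * u((1 - lmbda) * x + lmbda * c) for x, p in zip(xs, ps))

def optimal_lambda(c, iters=200):
    """Ternary search for the maximizer of the strictly concave f(., c) on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if f(m1, c) < f(m2, c):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

mean = sum(p * x for x, p in zip(xs, ps))             # E[X]
threshold = 1 / sum(p / x for x, p in zip(xs, ps))    # E[X u'(X)] / E[u'(X)] for u = log

for c in (70.0, 90.0, 110.0):
    print(c, round(optimal_lambda(c), 4))
```

With these numbers, a sure amount below the threshold leads to $\lambda^* = 0$, a sure amount above the mean leads to $\lambda^* = 1$, and values in between give an interior optimum.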
Exercise 2.3.1. As above, let $X$ be a random variable with a nondegenerate distribution $\mu \in \mathcal{M}$. Show that for a differentiable utility function $u$ we have
\[ m(\mu) > c(\mu) > \frac{E[\,u'(X)X\,]}{E[\,u'(X)\,]}. \tag{2.12} \]
◊

Example 2.40 (Demand for a risky asset). Let $S = S^1$ be a risky asset with price $\pi = \pi^1$. Given an initial wealth $w$, an agent with utility function $u \in C^1$ can invest a fraction $(1-\lambda)w$ into the asset and the remaining part $\lambda w$ into a risk-free bond with interest rate $r$. The resulting payoff is
\[ X_\lambda = \frac{(1-\lambda)w}{\pi}(S - \pi) + \lambda w \cdot r. \]
The preceding proposition implies that there will be no investment into the risky asset if and only if
\[ E\Big[\frac{S}{1+r}\Big] \le \pi. \]
In other words, the price of the risky asset must be below its expected discounted payoff in order to attract any risk-averse investor, and in that case it will indeed be optimal for the investor to invest at least some amount. Instead of the simple linear profiles $X_\lambda$, the investor may wish to consider alternative forms of investment. For example, this may involve derivatives such as $\max(S, K) = K + (S - K)^+$ for some threshold $K$. In order to discuss such non-linear payoff profiles, we need an extended formulation of the optimization problem; see Section 3.3 below. ◊
Example 2.41 (Demand for insurance). Suppose an agent with utility function $u \in C^1$ considers taking at least some partial insurance against a random loss $Y$, with $0 \le Y \le w$ and $P[\,Y \ne E[\,Y\,]\,] > 0$, where $w$ is a given initial wealth. If insurance of $\lambda Y$ is available at the insurance premium $\lambda\pi$, the resulting final payoff is given by
\[ X_\lambda := w - Y + \lambda(Y - \pi) = (1-\lambda)(w - Y) + \lambda(w - \pi). \]
By Proposition 2.39, full insurance is optimal if and only if $\pi \le E[\,Y\,]$. In reality, however, the insurance premium $\pi$ will exceed the fair premium $E[\,Y\,]$. In this case, it will be optimal to insure only a fraction $\lambda^* Y$ of the loss, with $\lambda^* \in [0, 1)$. This fraction will be strictly positive as long as
\[ \pi < \frac{E[\,Y u'(w - Y)\,]}{E[\,u'(w - Y)\,]} = w - \frac{E[\,(w - Y)u'(w - Y)\,]}{E[\,u'(w - Y)\,]}. \]
Since the right-hand side is strictly larger than $E[\,Y\,]$ due to (2.12), risk aversion may create a demand for insurance even if the insurance premium $\pi$ lies above the fair price $E[\,Y\,]$. As in the previous example, the agent may wish to consider alternative forms of insurance such as a stop-loss contract whose payoff has the non-linear structure $(Y - K)^+$ of a call option. ◊
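The premium bound from this example is easy to evaluate for a concrete loss distribution. The sketch below assumes a hypothetical initial wealth $w = 100$, a loss $Y$ equal to 50 with probability 0.1 and to 0 otherwise, and the utility $u = \log$; it computes the fair premium $E[Y]$ and the largest premium at which some insurance is still demanded. Variable names are illustrative.

```python
w = 100.0                                 # initial wealth (hypothetical)
ys, ps = [0.0, 50.0], [0.9, 0.1]          # loss Y: values and probabilities

def u_prime(x):
    """Derivative of the utility u(x) = log x."""
    return 1.0 / x

fair_premium = sum(p * y for y, p in zip(ys, ps))                  # E[Y]
num = sum(p * y * u_prime(w - y) for y, p in zip(ys, ps))          # E[Y u'(w - Y)]
den = sum(p * u_prime(w - y) for y, p in zip(ys, ps))              # E[u'(w - Y)]
max_premium = num / den   # some insurance is bought whenever the premium is below this

print(fair_premium, max_premium)
```

Under these assumptions any premium strictly between the two printed values is actuarially unfair yet still leads to a strictly positive insured fraction.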
Let us take another look at the risk premium $\varrho(\mu)$ of a lottery $\mu$. For an approximate calculation, we consider the Taylor expansion of a sufficiently smooth and strictly increasing function $u(x)$ at $x = c(\mu)$ around $m := m(\mu)$, and we assume that $\mu$ has finite variance $\operatorname{var}(\mu)$. On the one hand,
\[ u(c(\mu)) \approx u(m) + u'(m)(c(\mu) - m) = u(m) - u'(m)\varrho(\mu). \]
On the other hand,
\[ u(c(\mu)) = \int u(x) \, \mu(dx) = \int \Big[ u(m) + u'(m)(x - m) + \frac{1}{2}u''(m)(x - m)^2 + r(x) \Big] \mu(dx) \approx u(m) + \frac{1}{2}u''(m)\operatorname{var}(\mu), \]
where $r(x)$ denotes the remainder term in the Taylor expansion of $u$. It follows that
\[ \varrho(\mu) \approx -\frac{u''(m(\mu))}{2\,u'(m(\mu))}\operatorname{var}(\mu) =: \frac{1}{2}\,\alpha(m(\mu))\operatorname{var}(\mu). \tag{2.13} \]
Thus, $\alpha(m(\mu))$ is the factor by which an economic agent with von Neumann–Morgenstern preferences described by $u$ weighs the risk, measured by $\frac{1}{2}\operatorname{var}(\mu)$, in order to determine the risk premium he or she is ready to pay.
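The quality of the approximation (2.13) can be checked on a simple lottery. The sketch below assumes a hypothetical lottery paying 90 or 110 with equal probability and the utility $u = \log$, for which $-u''(m)/u'(m) = 1/m$; it compares the exact risk premium $m(\mu) - c(\mu)$ with $\frac{1}{2}\alpha(m)\operatorname{var}(\mu)$. The function name `risk_premium` is chosen for illustration.

```python
import math

def risk_premium(points, probs, u, u_inv):
    """Exact risk premium m - c of a discrete lottery for a utility u with inverse u_inv."""
    m = sum(p * x for x, p in zip(points, probs))
    eu = sum(p * u(x) for x, p in zip(points, probs))
    return m - u_inv(eu)

points, probs = [90.0, 110.0], [0.5, 0.5]
exact = risk_premium(points, probs, math.log, math.exp)

m = sum(p * x for x, p in zip(points, probs))
var = sum(p * (x - m) ** 2 for x, p in zip(points, probs))
alpha_m = 1.0 / m                 # -u''(m)/u'(m) for u = log
approx = 0.5 * alpha_m * var      # right-hand side of (2.13)

print(exact, approx)
```

For this small, symmetric risk the two numbers agree to about two decimal places; the gap widens as the spread of the lottery grows, since the remainder term $r(x)$ is no longer negligible.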


Definition 2.42. Suppose that $u$ is a twice continuously differentiable and strictly increasing function on $S$. Then
\[ \alpha(x) := -\frac{u''(x)}{u'(x)} \]
is called the Arrow–Pratt coefficient of absolute risk aversion of $u$ at level $x$.
Example 2.43. The following classes of utility functions $u$ and their corresponding coefficients of risk aversion are standard examples.
(a) Constant absolute risk aversion (CARA): $\alpha(x)$ equals some constant $\alpha > 0$. Since $\alpha(x) = -(\log u')'(x)$, it follows that $u(x) = a - b \cdot e^{-\alpha x}$. Using an affine transformation, $u$ can be normalized to
\[ u(x) = 1 - e^{-\alpha x}. \]
(b) Hyperbolic absolute risk aversion (HARA): $\alpha(x) = (1-\gamma)/x$ on $S = (0, \infty)$ for some $\gamma < 1$. Up to affine transformations, we have
\[ u(x) = \log x \quad\text{for } \gamma = 0, \qquad u(x) = \frac{1}{\gamma}\,x^\gamma \quad\text{for } \gamma \ne 0. \]
Sometimes, these functions are also called CRRA utility functions, because their relative risk aversion $x\alpha(x)$ is constant. Of course, these utility functions can be shifted to any interval $S = (a, \infty)$. The risk-neutral limiting case $\gamma = 1$ would correspond to an affine function $u$. ◊
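The coefficients in this example can be confirmed numerically without symbolic differentiation. The sketch below approximates $-u''(x)/u'(x)$ by central finite differences; the step size `h`, the parameter values, and the function names are assumptions made for the illustration.

```python
import math

def arrow_pratt(u, x, h=1e-3):
    """Approximate -u''(x)/u'(x) by central finite differences with step h."""
    d1 = (u(x + h) - u(x - h)) / (2 * h)
    d2 = (u(x + h) - 2 * u(x) + u(x - h)) / (h * h)
    return -d2 / d1

alpha, gamma = 0.5, 0.3

def cara(x):
    """Normalized CARA utility 1 - exp(-alpha * x)."""
    return 1 - math.exp(-alpha * x)

def hara(x):
    """HARA utility x**gamma / gamma for gamma != 0."""
    return x ** gamma / gamma

x = 2.0
print(arrow_pratt(cara, x))       # close to alpha
print(arrow_pratt(hara, x))       # close to (1 - gamma) / x
print(arrow_pratt(math.log, x))   # close to 1 / x, the case gamma = 0
```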
Exercise 2.3.2. Compute the coefficient of risk aversion for the S-shaped utility function in (2.9). Sketch the graphs of $u$ and its risk aversion for $\lambda = 2$ and $\gamma = 0.9$. ◊
Proposition 2.44. Suppose that $u$ and $\tilde u$ are two strictly increasing functions on $S$ which are twice continuously differentiable, and that $\alpha$ and $\tilde\alpha$ are the corresponding Arrow–Pratt coefficients of absolute risk aversion. Then the following conditions are equivalent:
(a) $\alpha(x) \ge \tilde\alpha(x)$ for all $x \in S$.
(b) $u = F \circ \tilde u$ for a strictly increasing concave function $F$.
(c) The respective risk premiums $\varrho$ and $\tilde\varrho$ associated with $u$ and $\tilde u$ satisfy $\varrho(\mu) \ge \tilde\varrho(\mu)$ for all $\mu \in \mathcal{M}$.
Proof. (a) $\Rightarrow$ (b): Since $\tilde u$ is strictly increasing, we may define its inverse function, $w$. Then $F(t) := u(w(t))$ is strictly increasing, twice differentiable, and satisfies $u = F \circ \tilde u$. To show that $F$ is concave, we calculate the first two derivatives of $w$:
\[ w' = \frac{1}{\tilde u'(w)}, \qquad w'' = \tilde\alpha(w) \cdot \frac{1}{\tilde u'(w)^2}. \]
Now we can calculate the first two derivatives of $F$:
\[ F' = u'(w) \cdot w' = \frac{u'(w)}{\tilde u'(w)} > 0 \]
and
\[ F'' = u''(w)(w')^2 + u'(w)w'' = \frac{u'(w)}{\tilde u'(w)^2}\big[\tilde\alpha(w) - \alpha(w)\big] \le 0. \tag{2.14} \]
This proves that $F$ is concave.
(b) $\Rightarrow$ (c): Jensen's inequality implies that the respective certainty equivalents $c(\mu)$ and $\tilde c(\mu)$ satisfy
\[ u(c(\mu)) = \int u \, d\mu = \int F \circ \tilde u \, d\mu \le F\Big(\int \tilde u \, d\mu\Big) = F\big(\tilde u(\tilde c(\mu))\big) = u(\tilde c(\mu)). \tag{2.15} \]
Hence, $\varrho(\mu) = m(\mu) - c(\mu) \ge m(\mu) - \tilde c(\mu) = \tilde\varrho(\mu)$.
(c) $\Rightarrow$ (a): If condition (a) is false, there exists an open interval $O \subset S$ such that $\tilde\alpha(x) > \alpha(x)$ for all $x \in O$. Let $\tilde O := \tilde u(O)$, and denote again by $w$ the inverse of $\tilde u$. Then the function $F(t) = u(w(t))$ will be strictly convex in the open interval $\tilde O$ by (2.14). Thus, if $\mu$ is a measure with support in $O$, the inequality in (2.15) is reversed and is even strict unless $\mu$ is concentrated at a single point. It follows that $\varrho(\mu) < \tilde\varrho(\mu)$, which contradicts condition (c).
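The implication from (a) to (c) can be illustrated with two CARA utilities, whose Arrow–Pratt coefficients are the constants $\alpha$ and $\tilde\alpha$. The sketch below assumes the hypothetical values $\alpha = 1$ and $\tilde\alpha = 0.2$ and a 50/50 lottery on $\{0, 2\}$; it uses the closed form $c(\mu) = -\frac{1}{\alpha}\log \int e^{-\alpha x}\,\mu(dx)$ of the CARA certainty equivalent. Names are illustrative.

```python
import math

def risk_premium_cara(lottery, alpha):
    """Risk premium m - c for the CARA utility u(x) = 1 - exp(-alpha * x)."""
    m = sum(p * x for x, p in lottery)
    c = -math.log(sum(p * math.exp(-alpha * x) for x, p in lottery)) / alpha
    return m - c

mu = [(0.0, 0.5), (2.0, 0.5)]                   # (payoff, probability) pairs
rho = risk_premium_cara(mu, alpha=1.0)          # more risk-averse agent
rho_tilde = risk_premium_cara(mu, alpha=0.2)    # less risk-averse agent

print(rho, rho_tilde)
```

As the proposition predicts, the agent with the larger coefficient of absolute risk aversion requires the larger risk premium for the same lottery.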
As an application of the preceding proposition, we will now investigate the structure of those continuous and strictly increasing functions $u$ on $\mathbb{R}$ whose associated certainty equivalents have the following translation property:
\[ c(\mu_t) = c(\mu) + t \quad\text{for all } \mu \in \mathcal{M} \text{ and all } t \in \mathbb{R}, \tag{2.16} \]
where the translation $\mu_t$ of $\mu \in \mathcal{M}$ by $t \in \mathbb{R}$ is defined by
\[ \int g(x) \, \mu_t(dx) = \int g(x + t) \, \mu(dx) \quad\text{for bounded measurable } g. \]
Here we also assume that $\mathcal{M}$ is closed under translation, i.e., $\mu_t \in \mathcal{M}$ for all $\mu \in \mathcal{M}$ and $t \in \mathbb{R}$.
Lemma 2.45. Suppose the certainty equivalent associated with a continuous and strictly increasing function $u : \mathbb{R} \to \mathbb{R}$ satisfies the translation property (2.16). Then $u$ belongs to $C^\infty(\mathbb{R})$.
Proof. Let $\lambda$ denote the Lebesgue measure on $[0,1]$. Then
\[ f(t) := u(c(\lambda) + t) = u(c(\lambda_t)) = \int u \, d\lambda_t = \int_t^{1+t} u(y) \, dy, \tag{2.17} \]
and this implies $f \in C^1(\mathbb{R})$ with
\[ f'(t) = u(1 + t) - u(t). \tag{2.18} \]
Thus, $u(x) = f(x - c(\lambda))$ is in $C^1(\mathbb{R})$, which implies that $f' \in C^1(\mathbb{R})$ by (2.18), hence $f \in C^2(\mathbb{R})$. Iterating the argument we get $u \in C^\infty(\mathbb{R})$.
The following proposition implies in particular that a utility function u that satisfies
the translation property (2.16) is necessarily a CARA utility function of exponential
type as in part (a) of Example 2.43.
Proposition 2.46. Suppose the certainty equivalent associated with a continuous and strictly increasing function $u : \mathbb{R} \to \mathbb{R}$ satisfies the translation property (2.16). Then $u$ has constant absolute risk aversion and is hence either linear or an exponential function. More precisely, there are constants $a \in \mathbb{R}$ and $b, \alpha > 0$ such that $u(x)$ equals one of the following three functions:
\[ a - b e^{-\alpha x}, \qquad a + b x, \qquad a + b e^{\alpha x}. \]
Proof. For $t \in \mathbb{R}$ let $u_t(x) := u(x + t)$, and denote by $c_t(\mu)$ the corresponding certainty equivalent. For $\mu \in \mathcal{M}$ we have
\[ u_t(c_t(\mu)) = \int u_t \, d\mu = \int u(x + t) \, \mu(dx) = \int u \, d\mu_t = u(c(\mu_t)) = u(c(\mu) + t) = u_t(c(\mu)). \]
It follows that $c_t(\mu) = c(\mu)$ for all $t \in \mathbb{R}$ and $\mu \in \mathcal{M}$. Therefore,
\[ \varrho_t(\mu) := m(\mu) - c_t(\mu) = m(\mu) - c(\mu) = \varrho(\mu) \]
for all $t \in \mathbb{R}$. Since $u$ is smooth by Lemma 2.45, we may apply Proposition 2.44 to conclude that the respective Arrow–Pratt coefficients $\alpha_t(x) = -u_t''(x)/u_t'(x)$ and $\alpha(x) = -u''(x)/u'(x)$ are equal for all $t$ and $x$. But $\alpha_t(x) = \alpha(x + t)$, and so $\alpha(x)$ does not depend on $x$; we write $\alpha$ for this constant. When $\alpha > 0$, we see as in Example 2.43 that $u$ is of the form $u(x) = a - b e^{-\alpha x}$. When $\alpha = 0$, $u$ must be linear. And when $\alpha < 0$, we must have $u(x) = a + b e^{-\alpha x}$.
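The translation property (2.16) can be tested directly on discrete lotteries. The sketch below is an illustration under assumed parameters: it checks that a CARA utility with $\alpha = 0.3$ satisfies $c(\mu_t) = c(\mu) + t$ up to rounding, and it contrasts this with $u(x) = \sqrt{x}$, which is only defined for nonnegative payoffs and therefore lies outside the setting of the proposition but shows how the property fails when absolute risk aversion is not constant.

```python
import math

def ce_cara(lottery, alpha):
    """Certainty equivalent for u(x) = 1 - exp(-alpha * x)."""
    return -math.log(sum(p * math.exp(-alpha * x) for x, p in lottery)) / alpha

def ce_sqrt(lottery):
    """Certainty equivalent for u(x) = sqrt(x); nonnegative payoffs only."""
    return sum(p * math.sqrt(x) for x, p in lottery) ** 2

def translate(lottery, t):
    """Translation mu_t: shift every payoff by t."""
    return [(x + t, p) for x, p in lottery]

mu = [(1.0, 0.5), (9.0, 0.5)]   # (payoff, probability) pairs
t = 7.0

cara_gap = ce_cara(translate(mu, t), 0.3) - (ce_cara(mu, 0.3) + t)
sqrt_gap = ce_sqrt(translate(mu, t)) - (ce_sqrt(mu) + t)
print(cara_gap, sqrt_gap)
```

The first gap vanishes up to floating-point error, while the second is clearly positive: with decreasing absolute risk aversion, shifting the lottery upward raises the certainty equivalent by more than the shift.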
Now we focus on the case in which u is a utility function and preferences have the
expected utility representation (2.10). In view of the underlying axioms, the paradigm
of expected utility has a certain plausibility on a normative level, i.e., as a guideline
of rational behavior in the face of risk. But this guideline should be applied with
care: If pushed too far, it may lead to implausible conclusions. In the remaining part of this section we discuss some of these issues. From now on, we assume that $S$ is unbounded from above, so that $w + x \in S$ for any $x \in S$ and $w \ge 0$. So far, we have implicitly assumed that the preference relation $\succ$ on lotteries reflects the views of an economic agent in a given set of conditions, including a fixed level $w \ge 0$ of the agent's initial wealth. In particular, the utility function may vary as the level of wealth changes, and so it should really be indexed by $w$. Usually one assumes that $u_w$ is obtained by simply shifting a fixed utility function $u$ to the level $w$, i.e., $u_w(x) := u(w + x)$. Thus, a lottery $\mu$ is declined at a given level of wealth $w$ if and only if
\[ \int u(w + x)\,\mu(dx) < u(w). \]
Let us now return to the situation of Proposition 2.39 when $\mu$ is the distribution of an integrable random variable $X$ on $(\Omega, \mathcal{F}, P)$, which is bounded from below by some number $a$ in the interior of $S$. We view $X$ as the net payoff of some financial bet, and we assume that the bet is favorable in the sense that
\[ m(\mu) = E[X] > 0. \]
Remark 2.47. Even though the favorable bet $X$ might be declined at a given level $w$ due to risk aversion, it follows from Proposition 2.39 that it would be optimal to accept the bet at some smaller scale, i.e., there is some $\lambda^* > 0$ such that
\[ E[u(w + \lambda^* X)] > u(w). \]
On the other hand, it follows from Proposition 2.49 below that the given bet $X$ becomes acceptable at a sufficiently high level of wealth whenever the utility function is unbounded from above. ◊
Sometimes it is assumed that some favorable bet is declined at every level of wealth.
The assumption that such a bet exists is not as innocent as it may look. In fact it has
rather drastic consequences. In particular, we are going to see that it rules out all
utility functions in Example 2.43 except for the class of exponential utilities.


Example 2.48. For any exponential utility function $u(x) = 1 - e^{-\alpha x}$ with constant risk aversion $\alpha > 0$, the induced preference order on lotteries does not at all depend on the initial wealth $w$. To see this, note that
\[ \int u(w + x)\,\mu(dx) < \int u(w + x)\,\nu(dx) \]
is equivalent to
\[ \int e^{-\alpha x}\,\mu(dx) > \int e^{-\alpha x}\,\nu(dx). \]
Let us now show that the rejection of some favorable bet $\mu$ at every wealth level $w$ leads to a not quite plausible conclusion: At high levels of wealth, the agent would reject a bet $\nu$ with huge potential gain even though the potential loss is just a negligible fraction of the initial wealth.
Proposition 2.49. If the favorable bet $\mu$ is rejected at any level of wealth, then the utility function $u$ is bounded from above, and there exists $A > 0$ such that the bet
\[ \nu := \frac{1}{2}\,(\delta_{-A} + \delta_{\infty}) \]
is rejected at any level of wealth.
Proof. We have assumed that $X$ is bounded from below, i.e., $\mu$ is concentrated on $[a, \infty)$ for some $a < 0$, where $a$ is in the interior of $S$. Moreover, we can choose $b > 0$ such that
\[ \tilde\mu(B) := \mu(B \cap [a, b]) + \delta_b(B) \cdot \mu((b, \infty)) \]
is still favorable. Since $u$ is increasing, we have
\[ \int u(w + x)\,\tilde\mu(dx) \le \int u(w + x)\,\mu(dx) < u(w) \]
for any $w \ge 0$, i.e., also the lottery $\tilde\mu$ is rejected at any level of wealth. It follows that
\[ \int_{[0,b]} \bigl[u(w + x) - u(w)\bigr]\,\tilde\mu(dx) < \int_{[a,0)} \bigl[u(w) - u(w + x)\bigr]\,\tilde\mu(dx). \]
Let us assume for simplicity that $u$ is differentiable; the general case requires only minor modifications. Then the previous inequality implies
\[ u'(w + b)\,m^+(\tilde\mu) < u'(w + a)\,m^-(\tilde\mu), \]
where
\[ m^+(\tilde\mu) := \int_{[0,b]} x\,\tilde\mu(dx) > \int_{[a,0]} (-x)\,\tilde\mu(dx) =: m^-(\tilde\mu), \]


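due to the fact that $\tilde\mu$ is favorable. Thus,
\[ \frac{u'(w + b)}{u'(w - |a|)} < \frac{m^-(\tilde\mu)}{m^+(\tilde\mu)} =: \gamma < 1 \]
for any $w$, hence
\[ u'(x + n(|a| + b)) < \gamma^n\,u'(x) \]
for any $x$ in the interior of $S$. This exponential decay of the derivative implies $u(\infty) := \lim_{x \uparrow \infty} u(x) < \infty$. More precisely, if $A := n(|a| + b)$ for some $n$, then
\begin{align*}
u(\infty) - u(x) &= \sum_{k=0}^{\infty} \int_{x+kA}^{x+(k+1)A} u'(y)\,dy \\
&= \sum_{k=0}^{\infty} \int_{x-A}^{x} u'(z + (k+1)A)\,dz \\
&< \sum_{k=0}^{\infty} \gamma^{(k+1)n} \int_{x-A}^{x} u'(z)\,dz \\
&= \frac{\gamma^n}{1 - \gamma^n}\,\bigl(u(x) - u(x - A)\bigr).
\end{align*}
Take $n$ such that $\gamma^n \le 1/2$. Then we obtain
\[ u(\infty) - u(x) < u(x) - u(x - A), \]
i.e.,
\[ \frac{1}{2}\bigl(u(\infty) + u(x - A)\bigr) < u(x) \]
for all $x$ such that $x - A \in S$.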

Example 2.50. For an exponential utility function $u(x) = 1 - e^{-\alpha x}$, the bet $\nu$ defined in the preceding proposition is rejected at any level of wealth as soon as $A > \frac{1}{\alpha} \log 2$. ◊
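The threshold in Example 2.50 can be confirmed directly: with $u(\infty) = 1$, the bet $\nu$ is declined at wealth $w$ exactly when $\frac12 u(w - A) + \frac12 < u(w)$. The sketch below uses a hypothetical helper name and illustrative parameter values.

```python
import math

def bet_is_rejected(alpha, A, w):
    """True if nu = (delta_{-A} + delta_{+inf}) / 2 is declined at wealth w under CARA utility."""
    u = lambda x: 1.0 - math.exp(-alpha * x)
    expected_utility = 0.5 * u(w - A) + 0.5 * 1.0  # u(+inf) = 1
    return expected_utility < u(w)

alpha = 2.0
threshold = math.log(2.0) / alpha  # the bound (1/alpha) * log 2

for w in (0.0, 1.0, 5.0):
    # Slightly above the threshold the bet is rejected, slightly below it is accepted.
    print(w, bet_is_rejected(alpha, threshold + 0.01, w), bet_is_rejected(alpha, threshold - 0.01, w))
```

The decision does not depend on $w$, in line with Example 2.48.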
Suppose now that the lottery $\mu \in \mathcal{M}$ is played not only once but $n$ times in a row. For instance, one can think of an insurance company selling identical policies to a large number of individual customers. More precisely, let $(\Omega, \mathcal{F}, P)$ be a probability space supporting a sequence $X_1, X_2, \dots$ of independent random variables with common distribution $\mu$. The value of $X_i$ will be interpreted as the outcome of the $i$-th drawing of the lottery $\mu$. The accumulated payoff of $n$ successive independent repetitions of the financial bet $X_1$ is given by
\[ Z_n := \sum_{i=1}^{n} X_i, \]


and we assume that this accumulated payoff takes values in $S$; this is the case if, e.g., $S = [0, \infty)$.
Remark 2.51. It may happen that an agent refuses the single favorable bet $X$ at any level of wealth but feels tempted by a sufficiently large series $X_1, \dots, X_n$ of independent repetitions of the same bet. It is true that, by the weak law of large numbers, the probability
\[ P[\,Z_n < 0\,] = P\Bigl[\,\frac{1}{n} \sum_{i=1}^{n} X_i < m(\mu) - \varepsilon\,\Bigr] \]
(for $\varepsilon := m(\mu)$) of incurring a cumulative loss at the end of the series converges to 0 as $n \uparrow \infty$. Nevertheless, the decision of accepting $n$ repetitions is not consistent with the decision to reject the single bet at any wealth level $w$. In fact, for $W_k := w + Z_k$ we obtain
\begin{align*}
E[\,u(W_n)\,] &= E\bigl[\,E[\,u(W_{n-1} + X_n) \mid X_1, \dots, X_{n-1}\,]\,\bigr] \\
&= E\Bigl[\,\int u(W_{n-1} + x)\,\mu(dx)\,\Bigr] \\
&< E[\,u(W_{n-1})\,] < \cdots < u(w),
\end{align*}
i.e., the bet described by $Z_n$ should be rejected as well.

Let us denote by $\mu_n$ the distribution of the accumulated payoff $Z_n$. The lottery $\mu_n$ has the mean $m(\mu_n) = n \cdot m(\mu)$, the certainty equivalent $c(\mu_n)$, and the associated risk premium $\varrho(\mu_n) = n \cdot m(\mu) - c(\mu_n)$. We are interested in the asymptotic behavior of these quantities for large $n$. Kolmogorov's law of large numbers states that the average outcome $\frac{1}{n} Z_n$ converges $P$-a.s. to the constant $m(\mu)$. Therefore, one might guess that a similar averaging effect occurs on the level of the relative certainty equivalents
\[ c_n := \frac{c(\mu_n)}{n} \tag{2.19} \]
and of the relative risk premiums
\[ \varrho_n := \frac{\varrho(\mu_n)}{n} = m(\mu) - c_n. \]
Does $c_n$ converge to $m(\mu)$, and is there a successive reduction of the relative risk premiums $\varrho_n$ as $n$ grows to infinity? Applying our heuristic (2.13) to the present situation yields
\[ \varrho_n \approx \frac{1}{2n}\,\alpha(m(\mu_n))\,\mathrm{var}(\mu_n) = \frac{1}{2}\,\alpha(n \cdot m(\mu))\,\mathrm{var}(\mu). \]
Thus, one should expect that $\varrho_n$ tends to zero only if the Arrow-Pratt coefficient $\alpha(x)$ becomes arbitrarily small as $x$ becomes large, i.e., if the utility function is decreasingly risk averse. This guess is confirmed by the following two examples.


Example 2.52. Suppose that $u(x) = 1 - e^{-\alpha x}$ is a CARA utility function with constant risk aversion $\alpha > 0$ and assume that $\mu$ is such that $\int e^{-\alpha x}\,\mu(dx) < \infty$. Then, with the notation introduced above,
\[ \int e^{-\alpha x}\,\mu_n(dx) = E\Bigl[\,\prod_{i=1}^{n} e^{-\alpha X_i}\,\Bigr] = \Bigl(\int e^{-\alpha x}\,\mu(dx)\Bigr)^{n}. \]
Hence, the certainty equivalent of $\mu_n$ is given by
\[ c(\mu_n) = -\frac{n}{\alpha} \log \int e^{-\alpha x}\,\mu(dx) = n \cdot c(\mu). \]
It follows that $c_n$ and $\varrho_n$ are independent of $n$. In particular, the relative risk premiums are not reduced if the lottery is drawn more than once. ◊
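The identity $c(\mu_n) = n \cdot c(\mu)$ can be illustrated with a two-point lottery, for which $Z_n$ has a binomial structure and all expectations are finite sums. This is a sketch under that assumption; the function name and the parameters are hypothetical.

```python
import math

def relative_certainty_equivalent_cara(alpha, x_up, x_down, p, n):
    """c_n = c(mu_n) / n for n i.i.d. draws of a two-point lottery under CARA utility."""
    # Z_n = k * x_up + (n - k) * x_down, where k up-moves occur with binomial probability.
    laplace = sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        * math.exp(-alpha * (k * x_up + (n - k) * x_down))
        for k in range(n + 1)
    )
    return -math.log(laplace) / (alpha * n)

# Hypothetical favorable bet: +2 with probability 0.6, -1 with probability 0.4.
values = [relative_certainty_equivalent_cara(0.5, 2.0, -1.0, 0.6, n) for n in (1, 5, 20)]
print(values)  # the three entries coincide up to rounding
```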
The second example displays a different behavior. It shows that for HARA utility functions the relative risk premiums will indeed decrease to 0. In particular, the lottery $\mu_n$ will become attractive for large enough $n$ as soon as the price of the single lottery $\mu$ is less than $m(\mu)$.
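Example 2.53. Suppose that $\mu$ is a non-degenerate lottery concentrated on $(0, \infty)$, and that $u$ is a HARA utility function of index $\gamma \in [0, 1)$. If $\gamma > 0$ then $u(x) = \frac{1}{\gamma} x^{\gamma}$ and $c(\mu_n) = E[\,(Z_n)^{\gamma}\,]^{1/\gamma}$, hence
\[ c_n = \frac{c(\mu_n)}{n} = E\Bigl[\,\Bigl(\frac{1}{n} Z_n\Bigr)^{\gamma}\,\Bigr]^{1/\gamma} < m(\mu). \]
If $\gamma = 0$ then $u(x) = \log x$, and the relative certainty equivalent satisfies
\[ \log c_n = \log c(\mu_n) - \log n = E\Bigl[\,\log\Bigl(\frac{1}{n} Z_n\Bigr)\,\Bigr]. \]
Thus, we have
\[ u(c_n) = E\Bigl[\,u\Bigl(\frac{1}{n} Z_n\Bigr)\,\Bigr] \]
for any $\gamma \in [0, 1)$. By symmetry,
\[ \frac{1}{n+1} Z_{n+1} = E[\,X_k \mid Z_{n+1}\,] \quad \text{for } k = 1, \dots, n+1; \]
see, e.g., part II of §20 in [20]. It follows that
\[ \frac{1}{n+1} Z_{n+1} = E\Bigl[\,\frac{1}{n} Z_n \,\Big|\, Z_{n+1}\,\Bigr]. \tag{2.20} \]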


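Since $u$ is strictly concave and since $\mu$ is non-degenerate, we get
\begin{align*}
u(c_{n+1}) &= E\Bigl[\,u\Bigl(E\Bigl[\,\frac{1}{n} Z_n \,\Big|\, Z_{n+1}\,\Bigr]\Bigr)\,\Bigr] \\
&> E\Bigl[\,E\Bigl[\,u\Bigl(\frac{1}{n} Z_n\Bigr) \,\Big|\, Z_{n+1}\,\Bigr]\,\Bigr] \\
&= u(c_n),
\end{align*}
i.e., the relative certainty equivalents are strictly increasing and the relative risk premiums $\varrho_n$ are strictly decreasing. By Kolmogorov's law of large numbers,
\[ \frac{1}{n} Z_n \longrightarrow m(\mu) \quad P\text{-a.s.} \tag{2.21} \]
Thus, by Fatou's lemma (we assume for simplicity that $\mu$ is concentrated on $[\varepsilon, \infty)$ for some $\varepsilon > 0$ if $\gamma = 0$),
\[ \liminf_{n \uparrow \infty} u(c_n) \ge E\Bigl[\,\liminf_{n \uparrow \infty} u\Bigl(\frac{1}{n} Z_n\Bigr)\,\Bigr] = u(m(\mu)), \]
hence
\[ \lim_{n \uparrow \infty} c_n = m(\mu) \quad \text{and} \quad \lim_{n \uparrow \infty} \varrho_n = 0. \]
Suppose that the price of $\mu$ is given by $\pi \in (c(\mu), m(\mu))$. At initial wealth $w = 0$, the agent would decline a single bet. But, in contrast to the situation in Remark 2.51, a series of $n$ repetitions of the same bet would now become attractive for large enough $n$, since $c(\mu_n) = n c_n > n\pi$ for
\[ n \ge n_0 := \min\{\,k \in \mathbb{N} \mid c_k > \pi\,\} < \infty. \]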
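The monotone convergence of the relative certainty equivalents can be observed for a two-point lottery and the HARA index $\gamma = 1/2$. The sketch below computes $c_n = E[(Z_n/n)^{\gamma}]^{1/\gamma}$ exactly as a binomial sum; the function name and the lottery are hypothetical choices made for illustration.

```python
import math

def relative_certainty_equivalent_hara(gamma, x_up, x_down, p, n):
    """c_n = E[(Z_n / n)^gamma]^(1/gamma) for u(x) = x^gamma / gamma with 0 < gamma < 1."""
    mean_power = sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        * ((k * x_up + (n - k) * x_down) / n) ** gamma
        for k in range(n + 1)
    )
    return mean_power ** (1.0 / gamma)

# Hypothetical lottery on (0, inf): payoff 4 or 1 with equal probability, so m(mu) = 2.5.
c = {n: relative_certainty_equivalent_hara(0.5, 4.0, 1.0, 0.5, n) for n in (1, 2, 10, 100)}
print(c)  # increasing in n and approaching 2.5 from below
```

With a price $\pi$ between $c_1$ and $m(\mu)$, the first $n$ with $c_n > \pi$ plays the role of $n_0$ above.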

Remark 2.54. The identity (2.20) can also be written as
\[ \frac{1}{n+1} Z_{n+1} = E\Bigl[\,\frac{1}{n} Z_n \,\Big|\, \mathcal{A}_{n+1}\,\Bigr] = E[\,X_1 \mid \mathcal{A}_{n+1}\,] \]
where $\mathcal{A}_{n+1} = \sigma(Z_{n+1}, Z_{n+2}, \dots)$. This means that the stochastic process $\frac{1}{n} Z_n$, $n = 1, 2, \dots$, is a backwards martingale, sometimes also called reversed martingale. In particular, Kolmogorov's law of large numbers (2.21) can be regarded as a special case of the convergence theorem for backwards martingales; see part II of §20 in [20]. ◊
Exercise 2.3.3. Investigate the asymptotics of $c_n$ in (2.19) for a HARA utility function $u(x) = \frac{1}{\gamma} x^{\gamma}$ with $\gamma < 0$. ◊


Section 2.4 Uniform preferences

2.4 Stochastic dominance

So far, we have considered preference relations on distributions defined in terms of a fixed utility function $u$. In this section, we focus on the question whether one distribution is preferred over another, regardless of the choice of a particular utility function.
For simplicity, we take $S = \mathbb{R}$ as the set of possible payoffs. Let $\mathcal{M}$ be the set of all $\mu \in \mathcal{M}_1(\mathbb{R})$ with well-defined and finite expectation
\[ m(\mu) = \int x\,\mu(dx). \]
Recall from Definition 2.35 that a utility function on $\mathbb{R}$ is a strictly concave and strictly increasing function $u : \mathbb{R} \to \mathbb{R}$. Since each concave function $u$ is dominated by an affine function, the existence of $m(\mu)$ implies the existence of the integral $\int u\,d\mu$ as an extended real number in $[-\infty, \infty)$.
Definition 2.55. Let $\nu$ and $\mu$ be lotteries in $\mathcal{M}$. We say that the lottery $\mu$ is uniformly preferred over $\nu$ and we write
\[ \mu \succcurlyeq_{\mathrm{uni}} \nu \]
if
\[ \int u\,d\mu \ge \int u\,d\nu \quad \text{for all utility functions } u. \]
Thus, $\mu \succcurlyeq_{\mathrm{uni}} \nu$ holds if and only if every risk-averse agent will prefer $\mu$ over $\nu$, regardless of which utility function the agent is actually using. In this sense, $\mu \succcurlyeq_{\mathrm{uni}} \nu$ expresses a uniform preference for $\mu$ over $\nu$. Sometimes, $\succcurlyeq_{\mathrm{uni}}$ is also called second order stochastic dominance; the notion of first order stochastic dominance will be introduced in Definition 2.67.
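Remark 2.56. The binary relation $\succcurlyeq_{\mathrm{uni}}$ is a partial order on $\mathcal{M}$, i.e., $\succcurlyeq_{\mathrm{uni}}$ satisfies the following three properties:

• Reflexivity: $\mu \succcurlyeq_{\mathrm{uni}} \mu$ for all $\mu \in \mathcal{M}$.

• Transitivity: $\mu \succcurlyeq_{\mathrm{uni}} \nu$ and $\nu \succcurlyeq_{\mathrm{uni}} \lambda$ imply $\mu \succcurlyeq_{\mathrm{uni}} \lambda$.

• Antisymmetry: $\mu \succcurlyeq_{\mathrm{uni}} \nu$ and $\nu \succcurlyeq_{\mathrm{uni}} \mu$ imply $\mu = \nu$.

The first two properties are obvious, the third is derived in Remark 2.58. Moreover, $\succcurlyeq_{\mathrm{uni}}$ is monotone and risk-averse in the sense that
\[ \delta_y \succcurlyeq_{\mathrm{uni}} \delta_x \ \text{ for } y \ge x, \quad \text{and} \quad \delta_{m(\mu)} \succcurlyeq_{\mathrm{uni}} \mu \ \text{ for all } \mu \in \mathcal{M}. \]
Note, however, that $\succcurlyeq_{\mathrm{uni}}$ is not a weak preference relation in the sense of Definition 2.2, since it is not complete, see Remark 2.3. ◊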


In the following theorem, we will give a number of equivalent formulations of the statement $\mu \succcurlyeq_{\mathrm{uni}} \nu$. One of them needs the notion of a stochastic kernel on $\mathbb{R}$. This is a mapping
\[ Q : \mathbb{R} \to \mathcal{M}_1(\mathbb{R}) \]
such that $x \mapsto Q(x, A)$ is measurable for each fixed Borel set $A \subset \mathbb{R}$. See Appendix A.3 for the notion of a quantile function, which will be used in condition (e).
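Theorem 2.57. For any pair $\mu, \nu \in \mathcal{M}$ the following conditions are equivalent:
(a) $\mu \succcurlyeq_{\mathrm{uni}} \nu$.
(b) $\int f\,d\mu \ge \int f\,d\nu$ for all increasing concave functions $f$.
(c) For all $c \in \mathbb{R}$,
\[ \int (c - x)^+\,\mu(dx) \le \int (c - x)^+\,\nu(dx). \]
(d) If $F_\mu$ and $F_\nu$ denote the distribution functions of $\mu$ and $\nu$, then
\[ \int_{-\infty}^{c} F_\mu(x)\,dx \le \int_{-\infty}^{c} F_\nu(x)\,dx \quad \text{for all } c \in \mathbb{R}. \]
(e) If $q_\mu$ and $q_\nu$ are quantile functions for $\mu$ and $\nu$, then
\[ \int_0^t q_\mu(s)\,ds \ge \int_0^t q_\nu(s)\,ds \quad \text{for } 0 < t \le 1. \]
(f) There exists a probability space $(\Omega, \mathcal{F}, P)$ with random variables $X_\mu$ and $X_\nu$ having respective distributions $\mu$ and $\nu$ such that
\[ E[\,X_\nu \mid X_\mu\,] \le X_\mu \quad P\text{-a.s.} \]
(g) There exists a stochastic kernel $Q(x, dy)$ on $\mathbb{R}$ such that $Q(x, \cdot) \in \mathcal{M}$ and $m(Q(x, \cdot)) \le x$ for all $x$ and such that $\nu = \mu Q$, where $\mu Q$ denotes the measure
\[ \mu Q(A) := \int Q(x, A)\,\mu(dx) \quad \text{for Borel sets } A \subset \mathbb{R}. \]
Below we will show the following implications between the conditions of the theorem:
\[ \text{(e)} \Longleftrightarrow \text{(d)} \Longleftrightarrow \text{(c)} \Longleftrightarrow \text{(b)} \Longleftrightarrow \text{(a)} \Longleftarrow \text{(g)} \Longleftarrow \text{(f)}. \tag{2.22} \]
The difficult part is the proof that (b) implies (f). It will be deferred to Section 2.6, where we will prove a multidimensional variant of this result; cf. Theorem 2.94.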


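Proof of (2.22). (e) $\Leftrightarrow$ (d): This follows from Lemma A.22.
(d) $\Leftrightarrow$ (c): By Fubini's theorem,
\begin{align*}
\int_{-\infty}^{c} F_\mu(y)\,dy &= \int_{-\infty}^{c} \int_{(-\infty, y]} \mu(dz)\,dy \\
&= \iint I_{\{z \le y \le c\}}\,dy\,\mu(dz) \\
&= \int (c - z)^+\,\mu(dz).
\end{align*}
(c) $\Leftrightarrow$ (b): Condition (b) implies (c) because $f(x) := -(c - x)^+$ is concave and increasing. In order to prove the converse assertion, we take an increasing concave function $f$ and let $h := -f$. Then $h$ is convex and decreasing, and its increasing right-hand derivative $h' := h'_+$ can be regarded as a "distribution function" of a nonnegative Radon measure $\gamma$ on $\mathbb{R}$,
\[ h'(b) = h'(a) + \gamma((a, b]) \quad \text{for } a < b; \]
see Appendix A.1. As in (1.11):
\[ h(x) = h(b) - h'(b)\,(b - x) + \int_{(-\infty, b]} (z - x)^+\,\gamma(dz) \quad \text{for } x < b. \]
Using $h'(b) \le 0$, Fubini's theorem, and condition (c), we obtain that
\begin{align*}
\int_{(-\infty, b]} h\,d\mu &= h(b)\,\mu((-\infty, b]) - h'(b) \int (b - x)^+\,\mu(dx) + \int_{(-\infty, b]} \int (z - x)^+\,\mu(dx)\,\gamma(dz) \\
&\le h(b)\,\mu((-\infty, b]) - h'(b) \int (b - x)^+\,\nu(dx) + \int_{(-\infty, b]} \int (z - x)^+\,\nu(dx)\,\gamma(dz) \\
&= \int_{(-\infty, b]} h\,d\nu + h(b)\,\bigl(\mu((-\infty, b]) - \nu((-\infty, b])\bigr).
\end{align*}
Taking $b \uparrow \infty$ yields $\int f\,d\mu \ge \int f\,d\nu$. Indeed, the last term equals $h(b)\,(\nu((b, \infty)) - \mu((b, \infty)))$, the convex decreasing function $h$ decays at most linearly, and the existence of first moments for $\mu$ and $\nu$ implies that $b\,\mu((b, \infty)) \to 0$ and $b\,\nu((b, \infty)) \to 0$ for $b \uparrow \infty$.
(a) $\Leftrightarrow$ (b): That (b) implies (a) is obvious. For the proof of the converse implication, choose any utility function $u_0$ for which both $\int u_0\,d\mu$ and $\int u_0\,d\nu$ are finite. For instance, one can take
\[ u_0(x) := \begin{cases} x - e^{x/2} + 1 & \text{if } x \le 0, \\ \sqrt{x + 1} - 1 & \text{if } x \ge 0. \end{cases} \]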


Then, for $f$ concave and increasing and for $\alpha \in [0, 1)$,
\[ u_\alpha(x) := \alpha f(x) + (1 - \alpha)\,u_0(x) \]
is a utility function. Hence,
\[ \int f\,d\mu = \lim_{\alpha \uparrow 1} \int u_\alpha\,d\mu \ge \lim_{\alpha \uparrow 1} \int u_\alpha\,d\nu = \int f\,d\nu. \]
(f) $\Rightarrow$ (g): By considering the joint distribution of $X_\mu$ and $X_\nu$, we may reduce our setting to the situation in which $\Omega = \mathbb{R}^2$ and where $X_\mu$ and $X_\nu$ are the respective projections on the first and second coordinates, i.e., for $\omega = (x, y) \in \Omega = \mathbb{R}^2$ we have $X_\mu(\omega) = x$ and $X_\nu(\omega) = y$. Let $Q(x, dy)$ be a regular conditional distribution of $X_\nu$ given $X_\mu$, i.e., a stochastic kernel on $\mathbb{R}$ such that
\[ P[\,X_\nu \in A \mid X_\mu\,](\omega) = Q(X_\mu(\omega), A) \]
for all Borel sets $A \subseteq \mathbb{R}$ and for $P$-a.e. $\omega \in \Omega$ (see, e.g., Theorem 44.3 of [20] for an existence proof). Clearly, $\nu = \mu Q$. Condition (f) implies that
\[ X_\mu(\omega) \ge E[\,X_\nu \mid X_\mu\,](\omega) = \int y\,Q(X_\mu(\omega), dy) \quad \text{for } P\text{-a.e. } \omega \in \Omega. \]
Hence, $Q$ satisfies
\[ \int y\,Q(x, dy) \le x \quad \text{for } \mu\text{-a.e. } x. \]
By modifying $Q$ on a $\mu$-null set (e.g., by putting $Q(x, \cdot) := \delta_x$ there), this inequality can be achieved for all $x \in \mathbb{R}$.
(g) $\Rightarrow$ (a): Let $u$ be a utility function. Jensen's inequality applied to the measure $Q(x, dy)$ implies
\[ \int u(y)\,Q(x, dy) \le u(m(Q(x, \cdot))) \le u(x). \]
Hence,
\[ \int u\,d\nu = \iint u(y)\,Q(x, dy)\,\mu(dx) \le \int u\,d\mu, \]
completing the proof of the set of implications (2.22).
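For lotteries with finitely many atoms, condition (c) of Theorem 2.57 reduces to finitely many comparisons. Both integrals are piecewise linear in $c$ with kinks only at the atoms, both vanish to the left of all atoms, and their difference is constant to the right of all atoms, so it suffices to compare them at the atoms. The following sketch uses hypothetical function names and represents a lottery as a dictionary mapping outcomes to probabilities.

```python
def put_integral(c, lottery):
    """Integral of (c - x)^+ against a finite lottery given as {outcome: probability}."""
    return sum(p * max(c - x, 0.0) for x, p in lottery.items())

def uniformly_preferred(mu, nu, tol=1e-12):
    """Check condition (c) of Theorem 2.57 for two finite lotteries, comparing at all atoms."""
    atoms = sorted(set(mu) | set(nu))
    return all(put_integral(c, mu) <= put_integral(c, nu) + tol for c in atoms)

# delta_{m(nu)} is uniformly preferred over nu (risk aversion) ...
print(uniformly_preferred({0.0: 1.0}, {-1.0: 0.5, 1.0: 0.5}))   # True
# ... while a riskier lottery with higher mean is not comparable with delta_0.
print(uniformly_preferred({0.0: 1.0}, {-1.0: 0.5, 3.0: 0.5}))   # False
print(uniformly_preferred({-1.0: 0.5, 3.0: 0.5}, {0.0: 1.0}))   # False
```

The last two lines illustrate that the uniform order is not complete.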


Remark 2.58. Let us note some consequences of the preceding theorem. First, taking in condition (b) the increasing concave function $f(x) = x$ yields
\[ m(\mu) \ge m(\nu) \quad \text{if } \mu \succcurlyeq_{\mathrm{uni}} \nu, \]
i.e., the expectation $m(\cdot)$ is increasing with respect to $\succcurlyeq_{\mathrm{uni}}$.


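Next, suppose that $\mu$ and $\nu$ are such that
\[ \int (c - x)^+\,\mu(dx) = \int (c - x)^+\,\nu(dx) \quad \text{for all } c. \]
Then we have both $\mu \succcurlyeq_{\mathrm{uni}} \nu$ and $\nu \succcurlyeq_{\mathrm{uni}} \mu$, and condition (d) of the theorem implies that the respective distribution functions satisfy
\[ \int_{-\infty}^{c} F_\mu(x)\,dx = \int_{-\infty}^{c} F_\nu(x)\,dx \quad \text{for all } c. \]
Differentiating with respect to $c$ gives the identity $\mu = \nu$, i.e., a measure $\mu \in \mathcal{M}$ is uniquely determined by the integrals $\int (c - x)^+\,\mu(dx)$ for all $c \in \mathbb{R}$. In particular, $\succcurlyeq_{\mathrm{uni}}$ is antisymmetric. ◊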
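The following proposition characterizes the partial order $\succcurlyeq_{\mathrm{uni}}$ considered on the set of all normal distributions $N(m, \sigma^2)$. Recall that the standard normal distribution $N(0, 1)$ is defined by its density function
\[ \varphi(x) = \frac{1}{\sqrt{2\pi}}\,e^{-x^2/2}, \quad x \in \mathbb{R}. \]
The corresponding distribution function is usually denoted
\[ \Phi(x) = \int_{-\infty}^{x} \varphi(y)\,dy, \quad x \in \mathbb{R}. \]
More generally, the normal distribution $N(m, \sigma^2)$ with mean $m \in \mathbb{R}$ and variance $\sigma^2 > 0$ is given by the density function
\[ \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Bigl(-\frac{(x - m)^2}{2\sigma^2}\Bigr), \quad x \in \mathbb{R}. \]
Proposition 2.59. For two normal distributions, we have $N(m, \sigma^2) \succcurlyeq_{\mathrm{uni}} N(\tilde m, \tilde\sigma^2)$ if and only if both $m \ge \tilde m$ and $\sigma^2 \le \tilde\sigma^2$ hold.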
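Proof. In order to prove necessity, note that $N(m, \sigma^2) \succcurlyeq_{\mathrm{uni}} N(\tilde m, \tilde\sigma^2)$ implies that
\[ e^{-\alpha m + \alpha^2\sigma^2/2} = \int e^{-\alpha x}\,N(m, \sigma^2)(dx) \le \int e^{-\alpha x}\,N(\tilde m, \tilde\sigma^2)(dx) = e^{-\alpha \tilde m + \alpha^2\tilde\sigma^2/2}. \]
Hence, for $\alpha > 0$,
\[ m - \frac{1}{2}\alpha\sigma^2 \ge \tilde m - \frac{1}{2}\alpha\tilde\sigma^2, \]
which gives $m \ge \tilde m$ by letting $\alpha \downarrow 0$ and $\sigma^2 \le \tilde\sigma^2$ for $\alpha \uparrow \infty$.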


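We show sufficiency first in the case $m = \tilde m = 0$. Note that the distribution function of $N(0, \sigma^2)$ is given by $\Phi(x/\sigma)$. Since $\varphi'(x) = -x\varphi(x)$,
\[ \frac{d}{d\sigma} \int_{-\infty}^{c} \Phi\Bigl(\frac{x}{\sigma}\Bigr)\,dx = -\int_{-\infty}^{c} \frac{x}{\sigma^2}\,\varphi\Bigl(\frac{x}{\sigma}\Bigr)\,dx = \varphi\Bigl(\frac{c}{\sigma}\Bigr) > 0. \]
Note that interchanging differentiation and integration is justified by dominated convergence. Thus, we have shown that $\sigma \mapsto \int_{-\infty}^{c} \Phi(x/\sigma)\,dx$ is strictly increasing for all $c$, and $N(0, \sigma^2) \succcurlyeq_{\mathrm{uni}} N(0, \tilde\sigma^2)$ follows from part (d) of Theorem 2.57.
Now we turn to the case of arbitrary expectations $m$ and $\tilde m$. Let $u$ be a utility function. Then
\[ \int u\,dN(m, \sigma^2) = \int u(m + x)\,N(0, \sigma^2)(dx) \ge \int u(\tilde m + x)\,N(0, \sigma^2)(dx), \]
because $m \ge \tilde m$. Since $x \mapsto u(\tilde m + x)$ is again a utility function, we obtain from the preceding step of the proof that
\[ \int u(\tilde m + x)\,N(0, \sigma^2)(dx) \ge \int u(\tilde m + x)\,N(0, \tilde\sigma^2)(dx) = \int u\,dN(\tilde m, \tilde\sigma^2), \]
and $N(m, \sigma^2) \succcurlyeq_{\mathrm{uni}} N(\tilde m, \tilde\sigma^2)$ follows.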
Remark 2.60. Let us indicate an alternative proof for the sufficiency part of Proposition 2.59 that uses condition (g) instead of (d) in Theorem 2.57. To this end, we define a stochastic kernel by $Q(x, \cdot) := N(x + \tilde m - m, \hat\sigma^2)$, where $\hat\sigma^2 := \tilde\sigma^2 - \sigma^2 > 0$. Then $m(Q(x, \cdot)) = x + \tilde m - m \le x$ and
\[ N(m, \sigma^2)\,Q = N(m, \sigma^2) * N(\tilde m - m, \hat\sigma^2) = N(m + \tilde m - m, \sigma^2 + \hat\sigma^2) = N(\tilde m, \tilde\sigma^2), \]
where $*$ denotes convolution. Hence, $N(m, \sigma^2) \succcurlyeq_{\mathrm{uni}} N(\tilde m, \tilde\sigma^2)$ follows. ◊
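Proposition 2.59 can be probed numerically through condition (c) of Theorem 2.57, using the standard closed form $\int (c - x)^+\,N(m, \sigma^2)(dx) = (c - m)\,\Phi(d) + \sigma\,\varphi(d)$ with $d = (c - m)/\sigma$. The sketch below checks the inequality on a finite grid only, so it illustrates the statement rather than proving it; the function names and the grid are hypothetical.

```python
import math

def normal_put_integral(c, m, sigma):
    """Integral of (c - x)^+ against N(m, sigma^2), via the closed form above."""
    d = (c - m) / sigma
    cdf = 0.5 * (1.0 + math.erf(d / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * d * d) / math.sqrt(2.0 * math.pi)
    return (c - m) * cdf + sigma * pdf

def preferred_on_grid(m, sigma, m_tilde, sigma_tilde, grid, tol=1e-12):
    """Grid check of condition (c) for N(m, sigma^2) versus N(m_tilde, sigma_tilde^2)."""
    return all(
        normal_put_integral(c, m, sigma) <= normal_put_integral(c, m_tilde, sigma_tilde) + tol
        for c in grid
    )

grid = [-10.0 + 0.1 * k for k in range(201)]
print(preferred_on_grid(1.0, 1.0, 0.5, 2.0, grid))  # higher mean, smaller variance
print(preferred_on_grid(1.0, 2.0, 0.5, 1.0, grid))  # larger variance: fails in the left tail
print(preferred_on_grid(0.0, 1.0, 1.0, 2.0, grid))  # smaller mean: fails for large c
```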

The following corollary investigates the relation $\mu \succcurlyeq_{\mathrm{uni}} \nu$ for lotteries with the same expectation. A multidimensional version of this result will be given in Corollary 2.95 below.
Corollary 2.61. For all $\mu, \nu \in \mathcal{M}$ the following conditions are equivalent:
(a) $\mu \succcurlyeq_{\mathrm{uni}} \nu$ and $m(\mu) = m(\nu)$.
(b) $\int f\,d\mu \ge \int f\,d\nu$ for all (not necessarily increasing) concave functions $f$.
(c) $m(\mu) \ge m(\nu)$ and $\int (x - c)^+\,\mu(dx) \le \int (x - c)^+\,\nu(dx)$ for all $c \in \mathbb{R}$.
(d) There exists a probability space $(\Omega, \mathcal{F}, P)$ with random variables $X_\mu$ and $X_\nu$ having respective distributions $\mu$ and $\nu$ such that
\[ E[\,X_\nu \mid X_\mu\,] = X_\mu \quad P\text{-a.s.} \]


(e) There exists a mean-preserving spread $Q$, i.e., a stochastic kernel on $\mathbb{R}$ such that $m(Q(x, \cdot)) = x$ for all $x \in S$, such that $\nu = \mu Q$.
Proof. (a) $\Rightarrow$ (e): Condition (g) of Theorem 2.57 yields a stochastic kernel $Q$ such that $\nu = \mu Q$ and $m(Q(x, \cdot)) \le x$. Due to the assumption $m(\mu) = m(\nu)$, $Q$ must satisfy $m(Q(x, \cdot)) = x$ at least for $\mu$-a.e. $x$. By modifying $Q$ on the $\mu$-null set where $m(Q(x, \cdot)) < x$ (e.g. by putting $Q(x, \cdot) := \delta_x$ there), we obtain a kernel as needed for condition (e).
(e) $\Rightarrow$ (b): Since
\[ \int f(y)\,Q(x, dy) \le f(m(Q(x, \cdot))) = f(x) \]
by Jensen's inequality, we obtain
\[ \int f\,d\nu = \iint f(y)\,Q(x, dy)\,\mu(dx) \le \int f\,d\mu. \]
(b) $\Rightarrow$ (c): Just take the concave functions $f(x) = -(x - c)^+$, and $f(x) = x$.
(c) $\Rightarrow$ (a): Note that
\[ \int (x - c)^+\,\mu(dx) = \int_{(c, \infty)} x\,\mu(dx) - c + c\,\mu((-\infty, c]). \]
The existence of $m(\mu)$ implies that $c\,\mu((-\infty, c]) \to 0$ as $c \downarrow -\infty$. Hence, we deduce from the second condition in (c) that $m(\mu) \le m(\nu)$, i.e., the two expectations are in fact identical. Now we can apply the following "put-call parity" (compare also (1.10))
\[ \int (c - x)^+\,\mu(dx) = c - m(\mu) + \int (x - c)^+\,\mu(dx) \]
to see that our condition (c) implies the third condition of Theorem 2.57 and, thus, $\mu \succcurlyeq_{\mathrm{uni}} \nu$.
(d) $\Leftrightarrow$ (a): Condition (d) implies both $m(\mu) = m(\nu)$ and condition (f) of Theorem 2.57, and this implies our condition (a). Conversely, assume that (a) holds. Then Theorem 2.57 provides random variables $X_\mu$ and $X_\nu$ having the respective distributions $\mu$ and $\nu$ such that $E[\,X_\nu \mid X_\mu\,] \le X_\mu$. Since $X_\mu$ and $X_\nu$ have the same mean, this inequality must in fact be an almost-sure equality, and we obtain condition (d).
Let us denote by
\[ \mathrm{var}(\mu) := \int (x - m(\mu))^2\,\mu(dx) = \int x^2\,\mu(dx) - m(\mu)^2 \in [0, \infty] \]
the variance of a lottery $\mu \in \mathcal{M}$.


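Exercise 2.4.1. Let $\mu$ and $\nu$ be two lotteries in $\mathcal{M}$ such that $m(\mu) = m(\nu)$ and $\mu \succcurlyeq_{\mathrm{uni}} \nu$. Show that $\mathrm{var}(\mu) \le \mathrm{var}(\nu)$.
In the financial context, comparisons of portfolios with known payoff distributions often use a mean-variance approach based on the relation
\[ \mu \succcurlyeq \nu \quad :\Longleftrightarrow \quad m(\mu) \ge m(\nu) \ \text{ and } \ \mathrm{var}(\mu) \le \mathrm{var}(\nu). \]
For normal distributions $\mu$ and $\nu$, we have seen that the relation $\mu \succcurlyeq \nu$ is equivalent to $\mu \succcurlyeq_{\mathrm{uni}} \nu$. Beyond this special case, the equivalence typically fails as illustrated by the following example and by Proposition 2.65 below.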
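Example 2.62. Let $\mu$ be the uniform distribution on the interval $[-1, 1]$, so that $m(\mu) = 0$ and $\mathrm{var}(\mu) = 1/3$. For $\nu$ we take $\nu = p\,\delta_{-1/2} + (1 - p)\,\delta_2$. With the choice of $p = 4/5$ we obtain $m(\nu) = 0$ and $1 = \mathrm{var}(\nu) > \mathrm{var}(\mu)$. However,
\[ \frac{1}{16} = \int \Bigl(-\frac{1}{2} - x\Bigr)^+\,\mu(dx) > \int \Bigl(-\frac{1}{2} - x\Bigr)^+\,\nu(dx) = 0, \]
so $\mu \succcurlyeq_{\mathrm{uni}} \nu$ does not hold.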
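The numbers in Example 2.62 can be reproduced with exact rational arithmetic. The sketch below uses the elementary closed form $\int (c - x)^+\,\mu(dx) = (c + 1)^2/4$ for $-1 \le c \le 1$ when $\mu$ is uniform on $[-1, 1]$; the helper names are hypothetical.

```python
from fractions import Fraction

# nu = p * delta_{-1/2} + (1 - p) * delta_2 with p = 4/5.
p = Fraction(4, 5)
nu = {Fraction(-1, 2): p, Fraction(2): 1 - p}

mean_nu = sum(x * w for x, w in nu.items())
var_nu = sum((x - mean_nu) ** 2 * w for x, w in nu.items())

def put_integral_uniform(c):
    """Integral of (c - x)^+ for the uniform law on [-1, 1] (density 1/2)."""
    if c <= -1:
        return Fraction(0)
    if c >= 1:
        return Fraction(c)  # equals c - m(mu) with m(mu) = 0
    return (c + 1) ** 2 / 4

def put_integral_nu(c):
    """Integral of (c - x)^+ against the two-point lottery nu."""
    return sum(w * max(c - x, 0) for x, w in nu.items())

c = Fraction(-1, 2)
print(mean_nu, var_nu, put_integral_uniform(c), put_integral_nu(c))
```

Condition (c) of Theorem 2.57 fails at $c = -1/2$ even though $\mu$ has the same mean and the smaller variance.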
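Remark 2.63. Let $\mu$ and $\nu$ be two lotteries in $\mathcal{M}$. We will write $\mu \succcurlyeq_{\mathrm{con}} \nu$ if
\[ \int f\,d\mu \ge \int f\,d\nu \quad \text{for all concave functions } f \text{ on } \mathbb{R}. \tag{2.23} \]
Note that $\mu \succcurlyeq_{\mathrm{con}} \nu$ implies that $m(\mu) = m(\nu)$, because both $f(x) = x$ and $\tilde f(x) = -x$ are concave. Corollary 2.61 shows that $\succcurlyeq_{\mathrm{con}}$ coincides with our uniform partial order $\succcurlyeq_{\mathrm{uni}}$ if we compare two measures which have the same mean. The partial order $\succcurlyeq_{\mathrm{con}}$ is sometimes called concave stochastic order. It was proposed in [226] and [227] to express the view that $\mu$ is less risky than $\nu$. The inverse relation $\mu \succcurlyeq_{\mathrm{bal}} \nu$ defined by
\[ \int f\,d\mu \ge \int f\,d\nu \quad \text{for all convex functions } f \text{ on } \mathbb{R} \tag{2.24} \]
is sometimes called balayage order or convex stochastic order. ◊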

The following class of asset distributions is widely used in finance.
Definition 2.64. A real-valued random variable $Y$ on some probability space $(\Omega, \mathcal{F}, P)$ is called log-normally distributed with parameters $\alpha \in \mathbb{R}$ and $\sigma \ge 0$ if it can be written as
\[ Y = \exp(\alpha + \sigma X), \tag{2.25} \]
where $X$ has a standard normal law $N(0, 1)$.


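Clearly, any log-normally distributed random variable $Y$ on $(\Omega, \mathcal{F}, P)$ takes $P$-a.s. strictly positive values. Recall from above the standard notations $\varphi$ and $\Phi$ for the density and the distribution function of the standard normal law $N(0, 1)$. We obtain from (2.25) the distribution function
\[ P[\,Y \le y\,] = \Phi\Bigl(\frac{\log y - \alpha}{\sigma}\Bigr), \quad 0 < y < \infty, \]
and the density
\[ \psi(y) = \frac{1}{\sigma y}\,\varphi\Bigl(\frac{\log y - \alpha}{\sigma}\Bigr)\,I_{(0, \infty)}(y) \tag{2.26} \]
of the log-normally distributed random variable $Y$. Its $p$-th moment is given by the formula
\[ E[\,Y^p\,] = \exp\Bigl(p\alpha + \frac{1}{2} p^2 \sigma^2\Bigr). \]
In particular, the law $\mu$ of $Y$ has the expectation
\[ m(\mu) = E[\,Y\,] = \exp\Bigl(\alpha + \frac{1}{2}\sigma^2\Bigr) \]
and the variance
\[ \mathrm{var}(\mu) = \exp(2\alpha + \sigma^2)\,\bigl(\exp(\sigma^2) - 1\bigr). \]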
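The moment formula can be sanity-checked by integrating $\exp(p(\alpha + \sigma x))$ against the standard normal density. The following sketch uses a plain midpoint rule on a truncated interval; the function names, the truncation width, and the parameter values are illustrative assumptions.

```python
import math

def lognormal_moment_formula(p, alpha, sigma):
    """Closed form E[Y^p] = exp(p * alpha + p^2 * sigma^2 / 2)."""
    return math.exp(p * alpha + 0.5 * p * p * sigma * sigma)

def lognormal_moment_numeric(p, alpha, sigma, half_width=12.0, steps=24000):
    """E[Y^p] for Y = exp(alpha + sigma * X), X ~ N(0, 1), by the midpoint rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        density = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
        total += math.exp(p * (alpha + sigma * x)) * density * h
    return total

alpha, sigma = 0.1, 0.5
first = lognormal_moment_numeric(1, alpha, sigma)
second = lognormal_moment_numeric(2, alpha, sigma)
variance_formula = math.exp(2 * alpha + sigma**2) * (math.exp(sigma**2) - 1.0)
print(first, lognormal_moment_formula(1, alpha, sigma))
print(second - first**2, variance_formula)
```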
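Proposition 2.65. Let $\mu$ and $\tilde\mu$ be two log-normal distributions with parameters $(\alpha, \sigma)$ and $(\tilde\alpha, \tilde\sigma)$, respectively. Then $\mu \succcurlyeq_{\mathrm{uni}} \tilde\mu$ holds if and only if $\sigma^2 \le \tilde\sigma^2$ and $\alpha + \frac{1}{2}\sigma^2 \ge \tilde\alpha + \frac{1}{2}\tilde\sigma^2$.
Proof. First suppose that $\sigma^2 \le \tilde\sigma^2$ and $m(\mu) \ge m(\tilde\mu)$. We define a kernel $Q(x, \cdot)$ as the law of $x \cdot \exp(\lambda + \beta Z)$ where $Z$ is a standard normal random variable. Now suppose that $\mu$ is represented by (2.25) with $X$ independent of $Z$, and let $f$ denote a bounded measurable function. It follows that
\[ \int f\,d(\mu Q) = E[\,f(e^{\alpha + \sigma X} \cdot e^{\lambda + \beta Z})\,] = E[\,f(e^{\alpha + \lambda + (\sigma^2 + \beta^2)^{1/2}\,U})\,], \]
where
\[ U = \frac{\sigma X + \beta Z}{\sqrt{\sigma^2 + \beta^2}} \]
is also $N(0, 1)$-distributed. Thus, $\mu Q$ is a log-normal distribution with parameters $(\alpha + \lambda, \sqrt{\sigma^2 + \beta^2})$. By taking $\beta := \sqrt{\tilde\sigma^2 - \sigma^2}$ and $\lambda := \tilde\alpha - \alpha$, we can represent $\tilde\mu$ as $\tilde\mu = \mu Q$. With this parameter choice,
\[ \lambda = \tilde\alpha - \alpha = \log m(\tilde\mu) - \log m(\mu) - \frac{1}{2}(\tilde\sigma^2 - \sigma^2) \le -\frac{\beta^2}{2}. \]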


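We have thus $m(Q(x, \cdot)) \le x$ for all $x$, and so $\mu \succcurlyeq_{\mathrm{uni}} \tilde\mu$ follows from condition (g) of Theorem 2.57.
As to the converse implication, the inequality $m(\mu) \ge m(\tilde\mu)$ is already clear. To prove $\sigma^2 \le \tilde\sigma^2$, let $\nu := \mu \circ \log^{-1}$ and $\tilde\nu := \tilde\mu \circ \log^{-1}$ so that $\nu = N(\alpha, \sigma^2)$ and $\tilde\nu = N(\tilde\alpha, \tilde\sigma^2)$. For $\varepsilon > 0$ we define the concave increasing function $f_\varepsilon(x) := \log(\varepsilon + x)$. If $u$ is a concave increasing function on $\mathbb{R}$, the function $u \circ f_\varepsilon$ is a concave and increasing function on $[0, \infty)$, which can be extended to a concave increasing function $v_\varepsilon$ on the full real line. Therefore,
\[ \int u\,d\nu = \lim_{\varepsilon \downarrow 0} \int v_\varepsilon\,d\mu \ge \lim_{\varepsilon \downarrow 0} \int v_\varepsilon\,d\tilde\mu = \int u\,d\tilde\nu. \tag{2.27} \]
Consequently, $\nu \succcurlyeq_{\mathrm{uni}} \tilde\nu$ and Proposition 2.59 yields $\sigma^2 \le \tilde\sigma^2$.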


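Remark 2.66. The inequality (2.27) shows that if $\nu = N(\alpha, \sigma^2)$, $\tilde\nu = N(\tilde\alpha, \tilde\sigma^2)$ and $\mu$ and $\tilde\mu$ denote the images of $\nu$ and $\tilde\nu$ under the map $x \mapsto e^x$, then $\mu \succcurlyeq_{\mathrm{uni}} \tilde\mu$ implies $\nu \succcurlyeq_{\mathrm{uni}} \tilde\nu$. However, the converse implication $\nu \succcurlyeq_{\mathrm{uni}} \tilde\nu \Rightarrow \mu \succcurlyeq_{\mathrm{uni}} \tilde\mu$ fails, as can be seen by increasing $\tilde\sigma$ until $m(\tilde\mu) > m(\mu)$. ◊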
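Because of its relation to the analysis of the Black-Scholes formula for option prices, we will now sketch a second proof of Proposition 2.65.
Second proof of Proposition 2.65. Let
\[ Y_{m,\sigma} := m \cdot \exp\Bigl(\sigma X - \frac{\sigma^2}{2}\Bigr) \]
for a standard normally distributed random variable $X$. Then
\[ E[\,(Y_{m,\sigma} - c)^+\,] = m\,\Phi(d_+) - c\,\Phi(d_-) \quad \text{with } d_\pm = \frac{\log\frac{m}{c} \pm \frac{1}{2}\sigma^2}{\sigma}; \]
see Example 5.56 in Chapter 5. Calculating the derivative of this expectation with respect to $\sigma > 0$, one finds that
\[ \frac{d}{d\sigma} E[\,(Y_{m,\sigma} - c)^+\,] = \frac{d}{d\sigma}\bigl(m\,\Phi(d_+) - c\,\Phi(d_-)\bigr) = m\,\varphi(d_+) > 0; \]
see (5.43) in Chapter 5. The law $\mu_{m,\sigma}$ of $Y_{m,\sigma}$ satisfies $m(\mu_{m,\sigma}) = m$ for all $\sigma > 0$. Condition (c) of Corollary 2.61 implies that $\mu_{m,\sigma}$ is decreasing in $\sigma > 0$ with respect to $\succcurlyeq_{\mathrm{uni}}$ and hence also with respect to $\succcurlyeq_{\mathrm{con}}$, i.e., $\mu_{m,\sigma} \succcurlyeq_{\mathrm{con}} \mu_{m,\tilde\sigma}$ if and only if $\sigma \le \tilde\sigma$. For two different expectations $m$ and $\tilde m$, simply use the monotonicity of the function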


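$u(y) := (c - y)^+$ to conclude
\[ \int u\,d\mu_{m,\sigma} = E[\,u(m \cdot \exp(\sigma X - \sigma^2/2))\,] \le E[\,u(\tilde m \cdot \exp(\sigma X - \sigma^2/2))\,] \le \int u\,d\mu_{\tilde m,\tilde\sigma}, \]
provided that $m \ge \tilde m$ and $0 < \sigma \le \tilde\sigma$; this is condition (c) of Theorem 2.57.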
The partial order $\succcurlyeq_{\mathrm{uni}}$ was defined in terms of integrals against increasing concave functions. By taking the larger class of all concave functions as integrands, we arrived at the partial order $\succcurlyeq_{\mathrm{con}}$ defined by (2.23) and characterized in Corollary 2.61. In the remainder of this section, we will briefly discuss the partial order of stochastic dominance, which is induced by increasing instead of concave functions:
Definition 2.67. Let $\mu$ and $\nu$ be two arbitrary probability measures on $\mathbb{R}$. We say that $\mu$ stochastically dominates $\nu$ and we write $\mu \succcurlyeq_{\mathrm{mon}} \nu$ if
\[ \int f\,d\mu \ge \int f\,d\nu \quad \text{for all bounded increasing functions } f \in C(\mathbb{R}). \]
Stochastic dominance is sometimes also called first order stochastic dominance. It is indeed a partial order on $\mathcal{M}_1(\mathbb{R})$: Reflexivity and transitivity are obvious, and antisymmetry follows, e.g., from the equivalence (a) $\Leftrightarrow$ (b) below. As will be shown by the following theorem, the relation $\mu \succcurlyeq_{\mathrm{mon}} \nu$ means that the distribution $\mu$ is "higher" than the distribution $\nu$. In our one-dimensional situation, we can provide a complete proof of this fact by using elementary properties of distribution functions. The general version of this result, given in Theorem 2.96, will require different techniques.
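Theorem 2.68. For $\mu, \nu \in \mathcal{M}_1(\mathbb{R})$ the following conditions are equivalent:
(a) $\mu \succcurlyeq_{\mathrm{mon}} \nu$.
(b) The distribution functions of $\mu$ and $\nu$ satisfy $F_\mu(x) \le F_\nu(x)$ for all $x$.
(c) Any pair of quantile functions for $\mu$ and $\nu$ satisfies $q_\mu(t) \ge q_\nu(t)$ for a.e. $t \in (0, 1)$.
(d) There exists a probability space $(\Omega, \mathcal{F}, P)$ with random variables $X_\mu$ and $X_\nu$ with distributions $\mu$ and $\nu$ such that $X_\mu \ge X_\nu$ $P$-a.s.
(e) There exists a stochastic kernel $Q(x, dy)$ on $\mathbb{R}$ such that $Q(x, (-\infty, x]) = 1$ and such that $\nu = \mu Q$.
In particular, $\mu \succcurlyeq_{\mathrm{mon}} \nu$ implies $\mu \succcurlyeq_{\mathrm{uni}} \nu$.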


Proof. (a) $\Rightarrow$ (b): Note that $F_\mu(x) = \mu((-\infty, x])$ can be written as
\[ F_\mu(x) = 1 - \int I_{(x, \infty)}(y)\,\mu(dy). \]
It is easy to construct a sequence of increasing continuous functions with values in $[0, 1]$ which increase to $I_{(x, \infty)}$ for each $x$. Hence,
\[ \int I_{(x, \infty)}(y)\,\mu(dy) \ge \int I_{(x, \infty)}(y)\,\nu(dy) = 1 - F_\nu(x). \]
(b) $\Leftrightarrow$ (c): This follows from the definition of a quantile function and from Lemma A.17.
(c) $\Rightarrow$ (d): Let $(\Omega, \mathcal{F}, P)$ be a probability space supporting a random variable $U$ with a uniform distribution on $(0, 1)$. Then $X_\mu := q_\mu(U)$ and $X_\nu := q_\nu(U)$ satisfy $X_\mu \ge X_\nu$ $P$-almost surely. Moreover, it follows from Lemma A.19 that they have the distributions $\mu$ and $\nu$.
(d) $\Rightarrow$ (e): This is proved as in Theorem 2.57 by using regular conditional distributions.
(e) $\Rightarrow$ (a): Condition (e) implies that $x \ge y$ for $Q(x, \cdot)$-a.e. $y$. Hence, if $f$ is bounded and increasing, then
\[ \int f(y)\,Q(x, dy) \le \int f(x)\,Q(x, dy) = f(x). \]
Therefore,
\[ \int f\,d\nu = \iint f(y)\,Q(x, dy)\,\mu(dx) \le \int f\,d\mu. \]
Finally, due to the equivalence (a) $\Leftrightarrow$ (b) above and the equivalence (a) $\Leftrightarrow$ (d) in Theorem 2.57, $\mu \succcurlyeq_{\mathrm{mon}} \nu$ implies $\mu \succcurlyeq_{\mathrm{uni}} \nu$.
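For finite lotteries, condition (b) of Theorem 2.68 is easy to test: distribution functions are step functions that are constant between atoms, so comparing them at the atoms is enough. The sketch below uses hypothetical function names and represents a lottery as a dictionary mapping outcomes to probabilities.

```python
def distribution_function(lottery, x):
    """F(x) = mass of (-inf, x] for a finite lottery given as {outcome: probability}."""
    return sum(p for y, p in lottery.items() if y <= x)

def stochastically_dominates(mu, nu, tol=1e-12):
    """Check F_mu(x) <= F_nu(x) at all atoms, i.e. condition (b) of Theorem 2.68."""
    atoms = sorted(set(mu) | set(nu))
    return all(
        distribution_function(mu, x) <= distribution_function(nu, x) + tol
        for x in atoms
    )

mu = {1.0: 0.5, 3.0: 0.5}
nu = {0.0: 0.5, 2.0: 0.5}
print(stochastically_dominates(mu, nu))  # True: every quantile of mu lies above that of nu
print(stochastically_dominates(nu, mu))  # False

# delta_0 is uniformly preferred over a fair coin flip on {-1, 1}, but does not dominate it.
print(stochastically_dominates({0.0: 1.0}, {-1.0: 0.5, 1.0: 0.5}))  # False
```

The last line shows that the converse of the final statement of the theorem fails in general.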
Remark 2.69. It is clear from conditions (d) or (e) of Theorem 2.68 that the set of bounded, increasing, and continuous functions in Definition 2.67 can be replaced by the set of all increasing functions for which the two integrals make sense. Thus, $\mu \succcurlyeq_{\mathrm{mon}} \nu$ for $\mu, \nu \in \mathcal{M}$ implies $\mu \succcurlyeq_{\mathrm{uni}} \nu$, and in particular $m(\mu) \ge m(\nu)$. Moreover, condition (d) shows that $\mu \succcurlyeq_{\mathrm{mon}} \nu$ together with $m(\mu) = m(\nu)$ implies $\mu = \nu$. ◊

2.5 Robust preferences on asset profiles

In this section, we discuss the structure of preferences for assets on a more fundamental level. Instead of assuming that the distributions of assets are known and that
preferences are defined on a set of probability measures, we will take as our basic
objects the assets themselves. An asset will be viewed as a function which associates real-valued payoffs to possible scenarios. More precisely, $\mathcal{X}$ will denote a set


of bounded measurable functions $X$ on some measurable space $(\Omega, \mathcal{F})$. We emphasize that no a priori probability measure is given on $(\Omega, \mathcal{F})$. In other words, we are facing uncertainty instead of risk.
We assume that $\mathcal{X}$ is endowed with a preference relation $\succ$. In view of the financial interpretation, it is natural to assume that $\succ$ is monotone in the sense that
\[ Y \succeq X \quad \text{if } Y(\omega) \ge X(\omega) \text{ for all } \omega \in \Omega. \]
Under a suitable condition of continuity, we could apply the results of Section 2.1 to obtain a numerical representation of $\succ$. L. J. Savage introduced a set of additional axioms which guarantee that there is a numerical representation of the special form
\[ U(X) = E_Q[\,u(X)\,] = \int u(X(\omega))\,Q(d\omega) \quad \text{for all } X \in \mathcal{X} \tag{2.28} \]
where $Q$ is a probability measure on $(\Omega, \mathcal{F})$ and $u$ is a function on $\mathbb{R}$. The measure $Q$ specifies the subjective view of the probabilities of events which is implicit in the preference relation $\succ$. Note that the function $u : \mathbb{R} \to \mathbb{R}$ is determined by restricting $U$ to the class of constant functions on $(\Omega, \mathcal{F})$. Clearly, the monotonicity of $\succ$ is equivalent to the condition that $u$ is an increasing function.
Definition 2.70. A numerical representation of the form (2.28) will be called a Savage representation of the preference relation $\succ$.
Remark 2.71. Let $\mu_{Q,X}$ denote the distribution of $X$ under the subjective measure $Q$. Clearly, the preference order $\succ$ on $\mathcal{X}$ given by (2.28) induces a preference order on
\[ \mathcal{M}_Q := \{\,\mu_{Q,X} \mid X \in \mathcal{X}\,\} \]
with von Neumann-Morgenstern representation
\[ U_Q(\mu_{Q,X}) := U(X) = E_Q[\,u(X)\,] = \int u\,d\mu_{Q,X}, \]
i.e.,
\[ U_Q(\mu) = \int u(x)\,\mu(dx) \quad \text{for } \mu \in \mathcal{M}_Q. \]
On this level, Section 2.3 specifies the conditions on $U_Q$ which guarantee that $u$ is a (strictly concave and strictly increasing) utility function. ◊
Remark 2.72. Even if an economic agent with preferences $\succ$ would accept the view that scenarios $\omega \in \Omega$ are generated in accordance to a given objective probability measure $P$ on $(\Omega, \mathcal{F})$, the preference order $\succ$ on $\mathcal{X}$ may be such that the subjective measure $Q$ appearing in the Savage representation (2.28) is different from the objective measure $P$. Suppose, for example, that $P$ is Lebesgue measure restricted


to $\Omega = [0, 1]$, and that $\mathcal{X}$ is the space of bounded right-continuous increasing functions on $[0, 1]$. Let $\mu_{P,X}$ denote the distribution of $X$ under $P$. By Lemma A.19, every probability measure on $\mathbb{R}$ with bounded support is of the form $\mu_{P,X}$ for some $X \in \mathcal{X}$, i.e.,
\[ \mathcal{M}_b(\mathbb{R}) = \{\,\mu_{P,X} \mid X \in \mathcal{X}\,\}. \]
Suppose the agent agrees that, objectively, $X \in \mathcal{X}$ can be identified with the lottery $\mu_{P,X}$, so that the preference relation on $\mathcal{X}$ could be viewed as a preference relation on $\mathcal{M}_b(\mathbb{R})$ with numerical representation
\[ U^*(\mu_{P,X}) := U(X). \]
This does not imply that $U^*$ satisfies the assumptions of Section 2.2; in particular, the preference relation on $\mathcal{M}_b(\mathbb{R})$ may violate the independence axiom. In fact, the agent might take a pessimistic view and distort $P$ by putting more emphasis on unfavorable scenarios. For example, the agent could replace $P$ by the subjective measure
\[ Q := \alpha\,\delta_0 + (1 - \alpha)\,P \]
for some $\alpha \in (0, 1)$ and specify preferences by a Savage representation in terms of $u$ and $Q$. In this case,
\begin{align*}
U^*(\mu_{P,X}) &= E_Q[\,u(X)\,] = \int u\,d\mu_{Q,X} \\
&= \alpha\,u(X(0)) + (1 - \alpha)\,E_P[\,u(X)\,] \\
&= \alpha\,u(X(0)) + (1 - \alpha) \int u\,d\mu_{P,X}.
\end{align*}
Note that $X(0) = \ell(\mu_{P,X})$ for
\[ \ell(\mu) := \inf(\operatorname{supp} \mu) = \sup\{\,a \in \mathbb{R} \mid \mu((-\infty, a)) = 0\,\}, \]
where $\operatorname{supp} \mu$ is the support of $\mu$. Hence, replacing $P$ by $Q$ corresponds to a nonlinear distortion on the level of lotteries: $\mu = \mu_{P,X}$ is distorted to the lottery $\mu^* = \mu_{Q,X}$ given by
\[ \mu^* = \alpha\,\delta_{\ell(\mu)} + (1 - \alpha)\,\mu, \]
and the preference relation on lotteries has the numerical representation
\[ U^*(\mu) = \int u(x)\,\mu^*(dx) \quad \text{for } \mu \in \mathcal{M}_b(\mathbb{R}). \]
Let us now show that such a subjective distortion of objective lotteries provides a possible explanation of the Allais paradox. Consider the lotteries $\mu_i$ and $\nu_i$, $i = 1, 2$, described in Example 2.30. Clearly,
\[ \mu_1^* = \mu_1 \quad \text{and} \quad \nu_1^* = \alpha\,\delta_0 + (1 - \alpha)\,\nu_1, \]


Section 2.5 Robust preferences on asset profiles

while
$$\mu_2^* = \alpha\delta_0 + (1-\alpha)\mu_2 \quad\text{and}\quad \nu_2^* = \alpha\delta_0 + (1-\alpha)\nu_2.$$
For the particular choice $u(x) = x$ we have $U^*(\nu_2) > U^*(\mu_2)$, and for $\alpha > 9/2409$ we obtain $U^*(\mu_1) > U^*(\nu_1)$, in accordance with the observed preferences $\nu_2 \succ^* \mu_2$ and $\mu_1 \succ^* \nu_1$ described in Example 2.30.
For a systematic discussion of preferences described in terms of a subjective distortion of lotteries we refer to [173]. In Section 4.6, we will discuss the role of distortions in the context of risk measures, and in particular the connection to Yaari's dual theory of choice under risk [265].
◊
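The thresholds quoted in Remark 2.72 can be checked with exact arithmetic. The following is a minimal sketch, not part of the original text: the four lotteries are written down as recalled from Example 2.30 (an assumption that should be checked against that example), and the helper names `distort` and `u_star` are hypothetical.

```python
from fractions import Fraction as F

def distort(lottery, alpha):
    """Return mu* = alpha * delta_{l(mu)} + (1 - alpha) * mu.

    A lottery is a dict {payoff: probability}; l(mu) is its smallest payoff.
    """
    lowest = min(x for x, p in lottery.items() if p > 0)
    distorted = {x: (1 - alpha) * p for x, p in lottery.items()}
    distorted[lowest] += alpha
    return distorted

def u_star(lottery, alpha, u=lambda x: x):
    """U*(mu): integral of u against the distorted lottery mu*."""
    return sum(p * u(x) for x, p in distort(lottery, alpha).items())

# Lotteries as recalled from Example 2.30 (assumption, not given in this section).
mu1 = {2400: F(1)}
nu1 = {2500: F(33, 100), 2400: F(66, 100), 0: F(1, 100)}
mu2 = {2400: F(34, 100), 0: F(66, 100)}
nu2 = {2500: F(33, 100), 0: F(67, 100)}

alpha = F(1, 100)  # an illustrative value above the threshold 9/2409
print(u_star(mu1, alpha) > u_star(nu1, alpha))  # mu1 preferred to nu1
print(u_star(nu2, alpha) > u_star(mu2, alpha))  # nu2 preferred to mu2
```

With $u(x) = x$, the comparison between $\mu_1$ and $\nu_1$ flips exactly at $\alpha = 9/2409$, while $\nu_2$ is preferred to $\mu_2$ for every $\alpha\in(0,1)$.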
Even in its general form (2.28), however, the paradigm of expected utility has a limited scope, as illustrated by the following example.
Example 2.73 (Ellsberg paradox). You are faced with a choice between two urns, each containing 100 balls which are either red or black. In the first urn, the proportion $p$ of red balls is known; assume, e.g., $p = 0.49$. In the second urn, the proportion $\tilde p$ is unknown. Suppose that you get 1000 € if you draw a red ball and 0 € otherwise. In this case, most people would choose the first urn. Naturally, they make the same choice if you get 1000 € for drawing a black ball and 0 € for a red one. But this behavior is not compatible with the paradigm of expected utility: For any subjective probability $\tilde p$ of drawing a red ball in the second urn, the first choice would imply $p > \tilde p$, the second would yield $1-p > 1-\tilde p$, and this is a contradiction.
◊
For this reason, we are going to make one further conceptual step beyond the Savage representation before we start to prove a representation theorem for preferences on $\mathcal X$. Instead of a single measure $Q$, let us consider a whole class $\mathcal Q$ of measures on $(\Omega,\mathcal F)$. Our aim is to characterize those preference relations on $\mathcal X$ which admit a representation of the form
$$U(X) = \inf_{Q\in\mathcal Q} E_Q[\,u(X)\,]. \tag{2.29}$$
This may be viewed as a robust version of the paradigm of expected utility: The agent has in mind a whole collection of possible probabilistic views of the given set of scenarios and takes a worst-case approach in evaluating the expected utility of a given payoff.
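It will be convenient to extend the discussion to the following framework where payoffs can be lotteries. Let $\mathcal X$ denote the space of all bounded measurable functions on $(\Omega,\mathcal F)$. We are going to embed $\mathcal X$ into a certain space $\tilde{\mathcal X}$ of functions $\tilde X$ on $(\Omega,\mathcal F)$ with values in the convex set
$$\mathcal M_b(\mathbb R) = \{\mu\in\mathcal M_1(\mathbb R) \mid \mu([-c,c]) = 1 \text{ for some } c\ge 0\}$$
of boundedly supported Borel probability measures on $\mathbb R$. More precisely, $\tilde{\mathcal X}$ is defined as the convex set of all those stochastic kernels $\tilde X(\omega,dy)$ from $(\Omega,\mathcal F)$ to $\mathbb R$ for which there exists a constant $c\ge 0$ such that
$$\tilde X(\omega,[-c,c]) = 1 \quad\text{for all } \omega\in\Omega.$$
In economics, the elements of $\tilde{\mathcal X}$ are sometimes called acts or horse race lotteries; see, for example, [187]. The space $\mathcal X$ can be embedded into $\tilde{\mathcal X}$ by virtue of the mapping
$$\mathcal X \ni X \longmapsto \delta_X \in \tilde{\mathcal X}. \tag{2.30}$$
In this way, $\mathcal X$ can be identified with the set of all $\tilde X\in\tilde{\mathcal X}$ for which the measure $\tilde X(\omega,\cdot)$ is a Dirac measure. A preference order on $\mathcal X$ defined by (2.29) clearly extends to $\tilde{\mathcal X}$ by
$$\tilde U(\tilde X) = \inf_{Q\in\mathcal Q}\int\!\!\int u(y)\,\tilde X(\omega,dy)\,Q(d\omega) = \inf_{Q\in\mathcal Q} E_Q[\,\tilde u(\tilde X)\,], \tag{2.31}$$
where $\tilde u$ is the affine function on $\mathcal M_b(\mathbb R)$ defined by
$$\tilde u(\mu) = \int u\,d\mu, \quad \mu\in\mathcal M_b(\mathbb R).$$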
Remark 2.74. Restricting the preference order $\succ$ on $\tilde{\mathcal X}$ obtained from (2.31) to the constant maps $\tilde X(\omega) = \mu$ for $\mu\in\mathcal M_b(\mathbb R)$, we obtain a preference order on $\mathcal M_b(\mathbb R)$, and on this level we know how to characterize risk aversion by the property that $u$ is strictly concave.
◊
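Example 2.75. Let us show how the Ellsberg paradox fits into our extended setting, and how it can be resolved by a suitable choice of the set $\mathcal Q$. For $\Omega = \{0,1\}$ define
$$\tilde X_0(\omega) := p\,\delta_{1000} + (1-p)\,\delta_0, \qquad \tilde X_1(\omega) := (1-p)\,\delta_{1000} + p\,\delta_0,$$
and
$$\tilde Z_i(\omega) := \delta_{1000}\cdot I_{\{i\}}(\omega) + \delta_0\cdot I_{\{1-i\}}(\omega), \quad i = 0,1.$$
Take
$$\mathcal Q := \{q\,\delta_1 + (1-q)\,\delta_0 \mid a\le q\le b\}$$
with $[a,b]\subset[0,1]$. For any increasing function $u$, the functional
$$\tilde U(\tilde X) := \inf_{Q\in\mathcal Q} E_Q[\,\tilde u(\tilde X)\,]$$
satisfies
$$\tilde U(\tilde X_i) > \tilde U(\tilde Z_i), \quad i = 0,1,$$
as soon as $a < p < b$, in accordance with the preferences described in Example 2.73.
◊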
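The worst-case evaluation in Example 2.75 is easy to compute by hand, because $E_Q[\tilde u(\tilde X)]$ is affine in $q$ and the infimum over $q\in[a,b]$ is therefore attained at an endpoint. The sketch below illustrates this with $u(x) = x$; the function names and the values $p = 0.49$, $a = 0.3$, $b = 0.7$ are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction as F

def u_tilde(lottery, u):
    """Expected utility of a lottery given as {payoff: probability}."""
    return sum(prob * u(x) for x, prob in lottery.items())

def robust_utility(act, a, b, u=lambda x: x):
    """inf over Q = q*delta_1 + (1-q)*delta_0, a <= q <= b, of E_Q[u~(act)].

    `act` maps each scenario in {0, 1} to a lottery. The expectation is
    affine in q, so it suffices to compare the endpoints q = a and q = b.
    """
    v0, v1 = u_tilde(act[0], u), u_tilde(act[1], u)
    return min(q * v1 + (1 - q) * v0 for q in (a, b))

def ellsberg_acts(p):
    """Acts of Example 2.75: X[i] ignores the scenario, Z[i] pays 1000 in scenario i."""
    X = [
        {w: {1000: p, 0: 1 - p} for w in (0, 1)},
        {w: {1000: 1 - p, 0: p} for w in (0, 1)},
    ]
    Z = [
        {w: {1000: F(int(w == i)), 0: F(int(w != i))} for w in (0, 1)}
        for i in (0, 1)
    ]
    return X, Z

X, Z = ellsberg_acts(F(49, 100))
a, b = F(3, 10), F(7, 10)
for i in (0, 1):
    print(i, robust_utility(X[i], a, b), robust_utility(Z[i], a, b))
```

Setting $a = b = q$ recovers a single subjective measure. In that case the two strict preferences cannot hold together, which is the contradiction described in Example 2.73.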

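Let us now formulate those properties of a preference order $\succ$ on the convex set $\tilde{\mathcal X}$ which are crucial for a representation of the form (2.31). For $\tilde X,\tilde Y\in\tilde{\mathcal X}$ and $\alpha\in(0,1)$, (2.31) implies
$$\tilde U(\alpha\tilde X + (1-\alpha)\tilde Y) = \inf_{Q\in\mathcal Q}\big(\alpha E_Q[\,\tilde u(\tilde X)\,] + (1-\alpha)E_Q[\,\tilde u(\tilde Y)\,]\big) \ge \alpha\tilde U(\tilde X) + (1-\alpha)\tilde U(\tilde Y).$$
In contrast to the Savage case $\mathcal Q = \{Q\}$, we can no longer expect equality, except for the case of certainty $\tilde Y(\omega)\equiv\mu$. If $\tilde X\sim\tilde Y$, then $\tilde U(\tilde X) = \tilde U(\tilde Y)$, and the lower bound reduces to $\tilde U(\tilde X) = \tilde U(\tilde Y)$. Thus, $\succ$ satisfies the following two properties:

Uncertainty aversion: If $\tilde X,\tilde Y\in\tilde{\mathcal X}$ are such that $\tilde X\sim\tilde Y$, then
$$\alpha\tilde X + (1-\alpha)\tilde Y \succeq \tilde X \quad\text{for all } \alpha\in[0,1].$$

Certainty independence: For $\tilde X,\tilde Y\in\tilde{\mathcal X}$, $\tilde Z\equiv\mu\in\mathcal M_b(\mathbb R)$, and $\alpha\in(0,1]$ we have
$$\tilde X\succ\tilde Y \iff \alpha\tilde X + (1-\alpha)\tilde Z \succ \alpha\tilde Y + (1-\alpha)\tilde Z.$$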
Remark 2.76. In order to motivate the term "uncertainty aversion", consider the situation of the preceding example. Suppose that an agent is indifferent between the choices $\tilde Z_0$ and $\tilde Z_1$, which both involve the same kind of Knightian uncertainty. For $\alpha\in(0,1)$, the convex combination $\tilde Y := \alpha\tilde Z_0 + (1-\alpha)\tilde Z_1$, which is weakly preferred to both $\tilde Z_0$ and $\tilde Z_1$ in the case of uncertainty aversion, takes the form
$$\tilde Y(\omega) = \begin{cases}\alpha\,\delta_{1000} + (1-\alpha)\,\delta_0 & \text{for } \omega = 1,\\ \alpha\,\delta_0 + (1-\alpha)\,\delta_{1000} & \text{for } \omega = 0,\end{cases}$$
i.e., uncertainty is reduced in favor of risk. For $\alpha = 1/2$, the resulting lottery $\tilde Y(\omega)\equiv\frac12(\delta_{1000}+\delta_0)$ is independent of the scenario $\omega$, i.e., Knightian uncertainty is completely replaced by the risk of a coin toss.
◊
Remark 2.77. The axiom of certainty independence extends the independence axiom for preferences on lotteries to our present setting, but only under the restriction that one of the two contingent lotteries $\tilde X$ and $\tilde Y$ is certain, i.e., does not depend on the scenario $\omega\in\Omega$. Without this restriction, the extended independence axiom would lead to the Savage representation in its original form (2.28); see Exercise 2.5.3 below. There are good reasons for not requiring full independence for all $\tilde Z\in\tilde{\mathcal X}$. As an example, take $\Omega = \{0,1\}$ and define $\tilde X(\omega) = \delta_\omega$, $\tilde Y(\omega) = \delta_{1-\omega}$, and $\tilde Z = \tilde X$. An agent may prefer $\tilde X$ over $\tilde Y$, thus expressing the implicit view that scenario 1 is somewhat more likely than scenario 0. At the same time, the agent may like the idea of hedging against the occurrence of scenario 0, and this could mean that the certain lottery
$$\tfrac12(\tilde Y + \tilde Z)(\cdot) \equiv \tfrac12(\delta_0 + \delta_1)$$
is preferred over the contingent lottery
$$\tfrac12(\tilde X + \tilde Z)(\cdot) \equiv \tilde X(\cdot),$$
thus violating the independence assumption in its unrestricted form. In general, the role of $\tilde Z$ as a hedge against scenarios unfavorable for $\tilde Y$ requires that $\tilde Y$ and $\tilde Z$ are not comonotone, i.e.,
$$\exists\,\omega,\eta\in\Omega:\quad \tilde Y(\omega)\succ\tilde Y(\eta),\quad \tilde Z(\omega)\prec\tilde Z(\eta). \tag{2.32}$$
Thus, the wish to hedge would still be compatible with the following enforcement of certainty independence, called

Comonotonic independence: For $\tilde X,\tilde Y,\tilde Z\in\tilde{\mathcal X}$ and $\alpha\in(0,1]$,
$$\tilde X\succ\tilde Y \iff \alpha\tilde X + (1-\alpha)\tilde Z \succ \alpha\tilde Y + (1-\alpha)\tilde Z \tag{2.33}$$
whenever $\tilde Y$ and $\tilde Z$ are comonotone in the sense that (2.32) does not occur.

The consequences of requiring comonotonic independence will be analyzed in Exercise 2.5.4 below. It is also relevant to weaken the axiom of certainty independence and to require instead

Weak certainty independence: if for $\tilde X,\tilde Y\in\tilde{\mathcal X}$ and for some $\nu\in\mathcal M_{1,c}(S)$ and $\alpha\in(0,1]$ we have $\alpha\tilde X + (1-\alpha)\nu \succ \alpha\tilde Y + (1-\alpha)\nu$, then
$$\alpha\tilde X + (1-\alpha)\mu \succ \alpha\tilde Y + (1-\alpha)\mu \quad\text{for all } \mu\in\mathcal M_b(S).$$

The consequences of requiring weak certainty independence will be analyzed in Theorem 2.88 below.
◊
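From now on, we assume that $\succ$ is a given preference order on $\tilde{\mathcal X}$. The set $\mathcal M_b(\mathbb R)$ will be regarded as a subset of $\tilde{\mathcal X}$ by identifying a constant function $\tilde Z\equiv\mu$ with its value $\mu\in\mathcal M_b(\mathbb R)$. We assume that $\succ$ possesses the following properties:

Uncertainty aversion.

Certainty independence.

Monotonicity: If $\tilde Y(\omega)\succeq\tilde X(\omega)$ for all $\omega\in\Omega$, then $\tilde Y\succeq\tilde X$. Moreover, $\succ$ is compatible with the usual order on $\mathbb R$, i.e., $\delta_y\succ\delta_x$ if and only if $y > x$.

Continuity: The following analogue of the Archimedean axiom holds on $\tilde{\mathcal X}$: If $\tilde X,\tilde Y,\tilde Z\in\tilde{\mathcal X}$ are such that $\tilde Z\succ\tilde Y\succ\tilde X$, then there are $\alpha,\beta\in(0,1)$ with
$$\alpha\tilde Z + (1-\alpha)\tilde X \succ \tilde Y \succ \beta\tilde Z + (1-\beta)\tilde X.$$
Moreover, for all $c > 0$ the restriction of $\succ$ to $\mathcal M_1([-c,c])$ is continuous with respect to the weak topology.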

Let us denote by
$$\mathcal M_{1,f} := \mathcal M_{1,f}(\Omega,\mathcal F)$$
the class of all set functions $Q:\mathcal F\to[0,1]$ which are normalized to $Q[\,\Omega\,] = 1$ and which are finitely additive, i.e., $Q[\,A\cup B\,] = Q[\,A\,] + Q[\,B\,]$ for all disjoint $A,B\in\mathcal F$; see Appendix A.6. By $E_Q[\,X\,]$ we denote the integral of $X$ with respect to $Q\in\mathcal M_{1,f}$; see Appendix A.6. With $\mathcal M_1 = \mathcal M_1(\Omega,\mathcal F)$ we denote the $\sigma$-additive members of $\mathcal M_{1,f}$, that is, the class of all probability measures on $(\Omega,\mathcal F)$. Note that the inclusion $\mathcal M_1\subset\mathcal M_{1,f}$ is typically strict, as is illustrated by Example A.53.
Theorem 2.78. Consider a preference order $\succ$ on $\tilde{\mathcal X}$ satisfying the four properties of uncertainty aversion, certainty independence, monotonicity, and continuity.
(a) There exists a strictly increasing function $u\in C(\mathbb R)$ and a convex set $\mathcal Q\subset\mathcal M_{1,f}(\Omega,\mathcal F)$ such that
$$\tilde U(\tilde X) = \min_{Q\in\mathcal Q} E_Q\Big[\int u(x)\,\tilde X(\cdot,dx)\Big]$$
is a numerical representation of $\succ$. Moreover, $u$ is unique up to positive affine transformations.
(b) If the induced preference order $\succ$ on $\mathcal X$, viewed as a subset of $\tilde{\mathcal X}$ as in (2.30), satisfies the following additional continuity property
$$X\succ Y \text{ and } X_n\nearrow X \implies X_n\succ Y \text{ for all large } n, \tag{2.34}$$
then the set functions in $\mathcal Q$ are in fact probability measures, i.e., each $Q\in\mathcal Q$ is $\sigma$-additive. In this case, the induced preference order on $\mathcal X$ has the robust Savage representation
$$U(X) = \min_{Q\in\mathcal Q} E_Q[\,u(X)\,] \quad\text{for } X\in\mathcal X$$
with $\mathcal Q\subset\mathcal M_1(\Omega,\mathcal F)$.
Remark 2.79. Even without its axiomatic foundation, the robust Savage representation is highly plausible as it stands, since it may be viewed as a worst-case approach
to the problem of model uncertainty. This aspect will be of particular relevance in our
discussion of risk measures in Chapter 4.
◊
The proof of Theorem 2.78 needs some preparation.
When restricted to $\mathcal M_b(\mathbb R)$, viewed as a subset of $\tilde{\mathcal X}$, the axiom of certainty independence is just the independence axiom of the von Neumann–Morgenstern theory. Thus, the preference relation $\succ$ on $\mathcal M_b(\mathbb R)$ satisfies the assumptions of Corollary 2.28, and we obtain the existence of a continuous function $u:\mathbb R\to\mathbb R$ such that
$$\tilde u(\mu) := \int u(x)\,\mu(dx) \tag{2.35}$$
is a numerical representation of $\succ$ on the set $\mathcal M_b(\mathbb R)$. Moreover, $u$ is unique up to positive affine transformations. The second part of our monotonicity assumption implies that $u$ is strictly increasing. Without loss of generality, we assume $u(0) = 0$ and $u(1) = 1$.
Remark 2.80. In view of the representation (2.35), it follows as in (2.11) that any $\mu\in\mathcal M_b(\mathbb R)$ admits a unique certainty equivalent $c(\mu)\in\mathbb R$ for which
$$\mu\sim\delta_{c(\mu)}.$$
Thus, if $X\in\mathcal X$ is defined for $\tilde X\in\tilde{\mathcal X}$ as $X(\omega) := c(\tilde X(\omega))$, then the first part of our monotonicity assumption yields
$$\tilde X\sim\delta_X, \tag{2.36}$$
and so the preference relation $\succ$ on $\tilde{\mathcal X}$ is uniquely determined by its restriction to $\mathcal X$.
◊
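For a lottery with finite support, the certainty equivalent of Remark 2.80 can be computed as $c(\mu) = u^{-1}(\int u\,d\mu)$. The following sketch assumes the exponential utility $u(x) = 1 - e^{-x}$, which is an illustrative choice and not one made in the text; it satisfies $u(0) = 0$ but not the normalization $u(1) = 1$, which does not affect $c(\mu)$ since $u$ is only determined up to positive affine transformations.

```python
import math

def certainty_equivalent(lottery, u, u_inv):
    """c(mu) = u^{-1}( integral of u d mu ) for mu given as {payoff: probability}."""
    expected_utility = sum(p * u(x) for x, p in lottery.items())
    return u_inv(expected_utility)

# Assumed utility: continuous, strictly increasing and strictly concave.
u = lambda x: 1.0 - math.exp(-x)
u_inv = lambda v: -math.log(1.0 - v)

coin_toss = {0.0: 0.5, 2.0: 0.5}
c = certainty_equivalent(coin_toss, u, u_inv)
mean = sum(p * x for x, p in coin_toss.items())
print(c, mean)  # c lies strictly below the mean because u is concave
```

Replacing each contingent lottery $\tilde X(\omega)$ by the number $c(\tilde X(\omega))$ is exactly the reduction from $\tilde{\mathcal X}$ to $\mathcal X$ expressed by (2.36).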
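Lemma 2.81. There exists a unique extension $\tilde U$ of the functional $\tilde u$ in (2.35) as a numerical representation of $\succ$ on $\tilde{\mathcal X}$.
Proof. For $\tilde X\in\tilde{\mathcal X}$ let $c > 0$ be such that $\tilde X(\omega,[-c,c]) = 1$ for all $\omega\in\Omega$. Then
$$\tilde u(\delta_{-c}) \le \tilde u(\tilde X(\omega)) \le \tilde u(\delta_c) \quad\text{for all } \omega\in\Omega,$$
and our monotonicity assumption implies that
$$\delta_c \succeq \tilde X \succeq \delta_{-c}.$$
We will show below that there exists a unique $\alpha\in[0,1]$ such that
$$\tilde X\sim(1-\alpha)\delta_{-c} + \alpha\delta_c. \tag{2.37}$$
Once this has been achieved, the only possible choice for $\tilde U(\tilde X)$ is
$$\tilde U(\tilde X) := \tilde u\big((1-\alpha)\delta_{-c} + \alpha\delta_c\big) = (1-\alpha)\tilde u(\delta_{-c}) + \alpha\tilde u(\delta_c).$$
This definition of $\tilde U$ provides a numerical representation of $\succ$ on $\tilde{\mathcal X}$.
The proof of the existence of a unique $\alpha\in[0,1]$ with (2.37) is similar to the proof of Lemma 2.24. Uniqueness follows from the monotonicity
$$\beta > \alpha \implies (1-\beta)\delta_{-c} + \beta\delta_c \succ (1-\alpha)\delta_{-c} + \alpha\delta_c, \tag{2.38}$$
which is an immediate consequence of the von Neumann–Morgenstern representation (2.35). Now we let
$$\alpha := \sup\{\gamma\in[0,1] \mid \tilde X\succeq(1-\gamma)\delta_{-c} + \gamma\delta_c\}.$$
We have to exclude the two following cases:
$$\tilde X \succ (1-\alpha)\delta_{-c} + \alpha\delta_c, \tag{2.39}$$
$$(1-\alpha)\delta_{-c} + \alpha\delta_c \succ \tilde X. \tag{2.40}$$
In the case (2.39), our continuity axiom yields some $\beta\in(0,1)$ for which
$$\tilde X \succ \beta\big[(1-\alpha)\delta_{-c} + \alpha\delta_c\big] + (1-\beta)\delta_c = (1-\gamma)\delta_{-c} + \gamma\delta_c,$$
where $\gamma = \beta\alpha + (1-\beta) > \alpha$, in contradiction to the definition of $\alpha$.
If (2.40) holds, then the same argument as above yields $\beta\in(0,1)$ with
$$\beta\alpha\,\delta_c + (1-\beta\alpha)\,\delta_{-c} \succ \tilde X.$$
By our definition of $\alpha$ there must be some $\gamma\in(\beta\alpha,\alpha)$ with
$$\tilde X \succeq (1-\gamma)\delta_{-c} + \gamma\delta_c \succ \beta\alpha\,\delta_c + (1-\beta\alpha)\,\delta_{-c},$$
where the second relation follows from (2.38). This, however, is a contradiction.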
Remark 2.82. Note that the proof of the preceding lemma relies only on the assumptions of continuity and monotonicity of $\succ$. Certainty independence and uncertainty aversion are not needed.
◊
Via the embedding (2.30), Lemma 2.81 induces a numerical representation $U$ of $\succ$ on $\mathcal X$ given by
$$U(X) := \tilde U(\delta_X). \tag{2.41}$$
The following proposition clarifies the properties of the functional $U$ and provides the key to a robust Savage representation of the preference order $\succ$ on $\mathcal X$.
Proposition 2.83. Given $u$ of (2.35) and the numerical representation $U$ on $\mathcal X$ constructed via Lemma 2.81 and (2.41), there exists a unique functional $\phi:\mathcal X\to\mathbb R$ such that
$$U(X) = \phi(u(X)) \quad\text{for all } X\in\mathcal X, \tag{2.42}$$
and such that the following four properties are satisfied:

Monotonicity: If $Y(\omega)\ge X(\omega)$ for all $\omega$, then $\phi(Y)\ge\phi(X)$.

Concavity: If $\lambda\in[0,1]$ then $\phi(\lambda X + (1-\lambda)Y) \ge \lambda\phi(X) + (1-\lambda)\phi(Y)$.

Positive homogeneity: $\phi(\lambda X) = \lambda\phi(X)$ for $\lambda\ge 0$.

Cash invariance: $\phi(X+z) = \phi(X) + z$ for all $z\in\mathbb R$.

Proof. Denote by $\mathcal X_u$ the space of all $X\in\mathcal X$ which take values in a compact subset of the range $u(\mathbb R)$ of $u$. Clearly, $\mathcal X_u$ coincides with the range of the non-linear transformation $\mathcal X\ni X\mapsto u(X)$. Note that this transformation is bijective since $u$ is continuous and strictly increasing due to our assumption of monotonicity. Thus, $\phi$ is well-defined on $\mathcal X_u$ via (2.42). We show next that this $\phi$ has the four properties of the assertion.
Monotonicity is obvious. For positive homogeneity on $\mathcal X_u$, it suffices to show that $\phi(\lambda X) = \lambda\phi(X)$ for $X\in\mathcal X_u$ and $\lambda\in(0,1]$. Let $X_0\in\mathcal X$ be such that $u(X_0) = X$. We define $\tilde Z\in\tilde{\mathcal X}$ by
$$\tilde Z := \lambda\delta_{X_0} + (1-\lambda)\delta_0.$$
By (2.36), $\tilde Z\sim\delta_Z$ where $Z$ is given by
$$Z(\omega) = c\big(\lambda\delta_{X_0(\omega)} + (1-\lambda)\delta_0\big) = u^{-1}\big(\lambda u(X_0(\omega)) + (1-\lambda)u(0)\big) = u^{-1}\big(\lambda u(X_0(\omega))\big),$$
where we have used our convention $u(0) = 0$. It follows that $u(Z) = \lambda u(X_0) = \lambda X$, and so
$$\phi(\lambda X) = U(Z) = \tilde U(\tilde Z). \tag{2.43}$$
As in (2.37), one can find $\nu\in\mathcal M_b(\mathbb R)$ such that $\nu\sim\delta_{X_0}$. Certainty independence implies that
$$\tilde Z = \lambda\delta_{X_0} + (1-\lambda)\delta_0 \sim \lambda\nu + (1-\lambda)\delta_0.$$
Hence,
$$\tilde U(\tilde Z) = \tilde u(\lambda\nu + (1-\lambda)\delta_0) = \lambda\tilde u(\nu) = \lambda U(X_0) = \lambda\phi(X).$$
This shows that $\phi$ is positively homogeneous on $\mathcal X_u$.
Since the range of $u$ is an interval, we can extend $\phi$ from $\mathcal X_u$ to all of $\mathcal X$ by positive homogeneity, and this extension, again denoted $\phi$, is also monotone and positively homogeneous.
Let us now show that $\phi$ is cash invariant. First note that
$$\phi(1) = \frac{\phi(u(x))}{u(x)} = \frac{\tilde u(\delta_x)}{u(x)} = 1$$
for any $x$ such that $u(x)\ne 0$. Now take $X\in\mathcal X$ and $z\in\mathbb R$. By positive homogeneity, we may assume without loss of generality that $2X\in\mathcal X_u$ and $2z\in u(\mathbb R)$. Then there are $X_0\in\mathcal X$ such that $2X = u(X_0)$ as well as $z_0,x_0\in\mathbb R$ with $2z = u(z_0)$ and $2\phi(X) = u(x_0)$. Note that $\delta_{X_0}\sim\delta_{x_0}$. Thus, certainty independence yields
$$\tilde Z := \tfrac12(\delta_{X_0} + \delta_{z_0}) \sim \tfrac12(\delta_{x_0} + \delta_{z_0}) =: \mu.$$
On the one hand, it follows that
$$\tilde U(\tilde Z) = \tilde U(\mu) = \tfrac12 u(x_0) + \tfrac12 u(z_0) = \phi(X) + z.$$
On the other hand, the same reasoning which led to (2.43) shows that
$$\tilde U(\tilde Z) = \phi(X+z).$$
As to concavity, we need only show that $\phi(\frac12 X + \frac12 Y) \ge \frac12\phi(X) + \frac12\phi(Y)$ for $X,Y\in\mathcal X_u$, by Exercise 2.5.2 below. Let $X_0,Y_0\in\mathcal X$ be such that $X = u(X_0)$ and $Y = u(Y_0)$. If $\phi(X) = \phi(Y)$, then $\delta_{X_0}\sim\delta_{Y_0}$, and uncertainty aversion gives
$$\tilde Z := \tfrac12(\delta_{X_0} + \delta_{Y_0}) \succeq \delta_{X_0},$$
which by the same arguments as above yields
$$\tilde U(\tilde Z) = \phi\big(\tfrac12 X + \tfrac12 Y\big) \ge \phi(X) = \tfrac12\big(\phi(X) + \phi(Y)\big).$$
The case in which $\phi(X) > \phi(Y)$ can be reduced to the previous one by letting $z := \phi(X) - \phi(Y)$, and by replacing $Y$ by $Y_z := Y + z$. Cash invariance then implies that
$$\phi\big(\tfrac12 X + \tfrac12 Y\big) + \tfrac12 z = \phi\big(\tfrac12 X + \tfrac12 Y_z\big) \ge \tfrac12\big(\phi(X) + \phi(Y_z)\big) = \tfrac12\big(\phi(X) + \phi(Y)\big) + \tfrac12 z.$$
A functional $\phi:\mathcal X\to\mathbb R$ satisfying the properties of monotonicity, concavity, positive homogeneity, and cash invariance is sometimes called a coherent monetary utility functional. The functional $\rho(X) := -\phi(X)$ is called a coherent risk measure. Functionals of this type will be studied in detail in Chapter 4.
Exercise 2.5.1. Let $\phi:\mathcal X\to\mathbb R$ be a functional that satisfies the properties of monotonicity and cash invariance stated in Proposition 2.83. Show that $\phi$ is Lipschitz continuous on $\mathcal X$ with respect to the supremum norm $\|\cdot\|$, i.e.,
$$|\phi(X) - \phi(Y)| \le \|X - Y\| \quad\text{for all } X,Y\in\mathcal X.$$

Exercise 2.5.2. Suppose that $\phi:\mathcal X\to\mathbb R$ is monotone, cash invariant, and satisfies $\phi(\frac12(X+Y)) \ge \frac12\phi(X) + \frac12\phi(Y)$ for all $X,Y\in\mathcal X$. Use Exercise 2.5.1 to show that $\phi$ is concave.
◊

Let us now show that a function with the four properties established in Proposition 2.83 can be represented in terms of a family of set functions in the class $\mathcal M_{1,f}$.
Proposition 2.84. A functional $\phi:\mathcal X\to\mathbb R$ is monotone, concave, positively homogeneous, and cash invariant if and only if there exists a set $\mathcal Q\subset\mathcal M_{1,f}$ such that
$$\phi(X) = \inf_{Q\in\mathcal Q} E_Q[\,X\,], \quad X\in\mathcal X.$$
Moreover, the set $\mathcal Q$ can always be chosen to be convex and such that the infimum above is attained, i.e.,
$$\phi(X) = \min_{Q\in\mathcal Q} E_Q[\,X\,], \quad X\in\mathcal X.$$
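The easy direction of Proposition 2.84 can be checked numerically on a finite scenario set, where every $Q$ is simply a probability vector. The sketch below uses exact rational arithmetic; the family `QS` and the test vectors are arbitrary illustrative choices, and `phi` and `check_properties` are hypothetical helper names.

```python
from fractions import Fraction as F

def phi(X, Qs):
    """phi(X) = min over Q in Qs of E_Q[X] on a finite scenario set."""
    return min(sum(q * x for q, x in zip(Q, X)) for Q in Qs)

def check_properties(X, Y, lam, z, Qs):
    """Check the four properties of Proposition 2.83, plus the Lipschitz
    bound of Exercise 2.5.1, for given X, Y, lam in [0, 1] and z."""
    upper = [max(x, y) for x, y in zip(X, Y)]            # upper >= X pointwise
    mix = [lam * x + (1 - lam) * y for x, y in zip(X, Y)]
    dist = max(abs(x - y) for x, y in zip(X, Y))          # sup norm of X - Y
    return {
        "monotone": phi(upper, Qs) >= phi(X, Qs),
        "concave": phi(mix, Qs) >= lam * phi(X, Qs) + (1 - lam) * phi(Y, Qs),
        "homogeneous": phi([lam * x for x in X], Qs) == lam * phi(X, Qs),
        "cash_invariant": phi([x + z for x in X], Qs) == phi(X, Qs) + z,
        "lipschitz": abs(phi(X, Qs) - phi(Y, Qs)) <= dist,
    }

# An arbitrary family of probability vectors on three scenarios.
QS = [
    (F(1, 2), F(1, 4), F(1, 4)),
    (F(1, 5), F(2, 5), F(2, 5)),
    (F(0), F(0), F(1)),
]
X = [F(1), F(-2), F(4)]
Y = [F(0), F(3), F(-1)]
print(check_properties(X, Y, F(1, 3), F(5), QS))
```

A passing check on sample vectors is of course no proof, and the harder converse direction (constructing $\mathcal Q$ from $\phi$) relies on a separation argument rather than on computation.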

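Proof. The necessity of the four properties is obvious. Conversely, we will construct for any $X\in\mathcal X$ a finitely additive set function $Q_X$ such that $\phi(X) = E_{Q_X}[\,X\,]$ and $\phi(Y)\le E_{Q_X}[\,Y\,]$ for all $Y\in\mathcal X$. Then
$$\phi(Y) = \min_{Q\in\mathcal Q_0} E_Q[\,Y\,] \quad\text{for all } Y\in\mathcal X, \tag{2.44}$$
where $\mathcal Q_0 := \{Q_X \mid X\in\mathcal X\}$. Clearly, (2.44) remains true if we replace $\mathcal Q_0$ by its convex hull $\mathcal Q := \operatorname{conv}\mathcal Q_0$.
To construct $Q_X$ for a given $X\in\mathcal X$, we define three convex sets in $\mathcal X$ by
$$\mathcal B := \{Y\in\mathcal X \mid \phi(Y) > 1\},$$
$$\mathcal C_1 := \{Y\in\mathcal X \mid Y\le 1\}, \quad\text{and}\quad \mathcal C_2 := \Big\{Y\in\mathcal X \;\Big|\; Y\le\frac{X}{\phi(X)}\Big\}.$$
The convexity of $\mathcal C_1$ and $\mathcal C_2$ implies that the convex hull of their union is given by
$$\mathcal C := \operatorname{conv}(\mathcal C_1\cup\mathcal C_2) = \{\alpha Y_1 + (1-\alpha)Y_2 \mid Y_i\in\mathcal C_i \text{ and } \alpha\in[0,1]\}.$$
Since $Y\in\mathcal C$ is of the form $Y = \alpha Y_1 + (1-\alpha)Y_2$ for some $Y_i\in\mathcal C_i$ and $\alpha\in[0,1]$,
$$\phi(Y) \le \phi(\alpha + (1-\alpha)Y_2) = \alpha + (1-\alpha)\phi(Y_2) \le 1,$$
and so $\mathcal B$ and $\mathcal C$ are disjoint. Let $\mathcal X$ be endowed with the supremum norm $\|Y\| := \sup_{\omega\in\Omega}|Y(\omega)|$. Then $\mathcal C_1$, and hence $\mathcal C$, contains the unit ball in $\mathcal X$. In particular, $\mathcal C$ has non-empty interior. Thus, we may apply the separation argument in the form of Theorem A.55, which yields a non-zero continuous linear functional $\ell$ on $\mathcal X$ such that
$$c := \sup_{Y\in\mathcal C}\ell(Y) \le \inf_{Z\in\mathcal B}\ell(Z).$$
Since $\mathcal C$ contains the unit ball, $c$ must be strictly positive, and there is no loss of generality in assuming $c = 1$. In particular, $\ell(1)\le 1$ as $1\in\mathcal C$. On the other hand, any constant $b > 1$ is contained in $\mathcal B$, and so
$$\ell(1) = \lim_{b\downarrow 1}\ell(b) \ge c = 1.$$
Hence, $\ell(1) = 1$.
If $A\in\mathcal F$ then $I_{A^c}\in\mathcal C_1\subset\mathcal C$, which implies that
$$\ell(I_A) = \ell(1) - \ell(I_{A^c}) \ge 1 - 1 = 0.$$
By Theorem A.51 there exists a finitely additive set function $Q_X\in\mathcal M_{1,f}(\Omega,\mathcal F)$ such that $\ell(Y) = E_{Q_X}[\,Y\,]$ for any $Y\in\mathcal X$.
It remains to show that $E_{Q_X}[\,Y\,]\ge\phi(Y)$ for all $Y\in\mathcal X$, with equality for $Y = X$. By the cash invariance of $\phi$, we need only consider the case in which $\phi(Y) > 0$. Then
$$Y_n := \frac{Y}{\phi(Y)} + \frac1n \in\mathcal B,$$
and $Y_n\to Y/\phi(Y)$ uniformly, whence
$$\frac{E_{Q_X}[\,Y\,]}{\phi(Y)} = \lim_{n\uparrow\infty} E_{Q_X}[\,Y_n\,] \ge 1.$$
On the other hand, $X/\phi(X)\in\mathcal C_2\subset\mathcal C$ yields the inequality
$$\frac{E_{Q_X}[\,X\,]}{\phi(X)} \le c = 1.$$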
We are now ready to complete the proof of the first main result in this section.
Proof of Theorem 2.78. (a): By Remark 2.80, it suffices to consider the induced preference relation $\succ$ on $\mathcal X$ once the function $u$ has been determined. According to Lemma 2.81 and the two Propositions 2.83 and 2.84, there exists a convex set $\mathcal Q\subset\mathcal M_{1,f}$ such that
$$U(X) = \min_{Q\in\mathcal Q} E_Q[\,u(X)\,]$$
is a numerical representation of $\succ$ on $\mathcal X$. This proves the first part of the assertion.
(b): The assumption (2.34) applied to $X\equiv 1$ and $Y\equiv b < 1$ gives that any sequence with $X_n\nearrow 1$ is such that $X_n\succ b$ for large enough $n$. We claim that this implies that $U(X_n)\nearrow u(1) = 1$. Otherwise, $U(X_n)$ would increase to some number $a < 1$. Since $u$ is continuous and strictly increasing, we may take $b$ such that $a < u(b) < 1$. But then $U(X_n) > U(b) = u(b) > a$ for large enough $n$, which is a contradiction.
In particular, we obtain that for any increasing sequence of events $A_n\in\mathcal F$ with $\bigcup_n A_n = \Omega$
$$\lim_{n\uparrow\infty}\min_{Q\in\mathcal Q} Q[\,A_n\,] = \lim_{n\uparrow\infty} U(I_{A_n}) = 1.$$
But this means that each $Q\in\mathcal Q$ satisfies $\lim_n Q[\,A_n\,] = 1$, which is equivalent to the $\sigma$-additivity of $Q$.
The continuity assumption (2.34), required for all $X_n\in\mathcal X$, is actually quite strong. In a topological setting, our discussion of risk measures in Chapter 4 will imply the following version of the representation theorem.
Proposition 2.85. Consider a preference order $\succ$ as in Theorem 2.78. Suppose that $\Omega$ is a Polish space with Borel field $\mathcal F$ and that (2.34) holds if $X_n$ and $X$ are continuous. Then there exists a class of probability measures $\mathcal Q\subset\mathcal M_1(\Omega,\mathcal F)$ such that the induced preference order on $\mathcal X$ has the robust Savage representation
$$U(X) = \min_{Q\in\mathcal Q} E_Q[\,u(X)\,] \quad\text{for continuous } X\in\mathcal X.$$
Proof. As in the proof of Theorem 2.78, the continuity property of $\succ$ implies the corresponding continuity property of $U$, and hence of the functional $\phi$ in (2.42). The result follows by combining Proposition 2.83, which reduces the representation of $U$ to a representation of $\phi$, with Proposition 4.27 applied to the coherent risk measure $\rho := -\phi$.
Now we consider an alternative setting where we fix in advance a reference measure $P$ on $(\Omega,\mathcal F)$. In this context, $\mathcal X$ will be identified with the space $L^\infty(\Omega,\mathcal F,P)$, and the representation of preferences will involve measures which are absolutely continuous with respect to $P$. Note, however, that this passage from measurable functions to equivalence classes of random variables in $L^\infty(\Omega,\mathcal F,P)$, and from arbitrary probability measures to absolutely continuous measures, involves a certain loss of robustness in the face of model uncertainty.
Theorem 2.86. Let $\succ$ be a preference relation as in Theorem 2.78, and assume that
$$X\sim Y \quad\text{whenever } X = Y\ P\text{-a.s.}$$
(a) There exists a robust Savage representation of the form
$$U(X) = \inf_{Q\in\mathcal Q} E_Q[\,u(X)\,], \quad X\in\mathcal X,$$
where $\mathcal Q$ consists of probability measures on $(\Omega,\mathcal F)$ which are absolutely continuous with respect to $P$, if and only if $\succ$ satisfies the following condition of continuity from above:
$$Y\succ X \text{ and } X_n\searrow X\ P\text{-a.s.} \implies Y\succ X_n\ P\text{-a.s. for all large } n.$$
(b) There exists a representation of the form
$$U(X) = \min_{Q\in\mathcal Q} E_Q[\,u(X)\,], \quad X\in\mathcal X,$$
where $\mathcal Q$ consists of probability measures on $(\Omega,\mathcal F)$ which are absolutely continuous with respect to $P$, if and only if $\succ$ satisfies the following condition of continuity from below:
$$X\succ Y \text{ and } X_n\nearrow X\ P\text{-a.s.} \implies X_n\succ Y\ P\text{-a.s. for all large } n.$$
Proof. As in the proof of Theorem 2.78, the continuity property of $\succ$ implies the corresponding continuity property of $U$, and hence of the functional $\phi$ in (2.42). The results follow by combining Proposition 2.83, which reduces the representation of $U$ to a representation of $\phi$, with Corollary 4.37 and Corollary 4.38 applied to the coherent risk measure $\rho := -\phi$.
In the following two exercises we explore the impact of replacing the axiom of certainty independence by stronger requirements as discussed in Remark 2.77.
Exercise 2.5.3. Show that the following conditions are equivalent:
(a) The preference relation $\succ$ satisfies the following unrestricted independence axiom on $\tilde{\mathcal X}$,
Independence: For $\tilde X,\tilde Y,\tilde Z\in\tilde{\mathcal X}$ and $\alpha\in(0,1]$ we have
$$\tilde X\succ\tilde Y \iff \alpha\tilde X + (1-\alpha)\tilde Z \succ \alpha\tilde Y + (1-\alpha)\tilde Z.$$
(b) The functional $\phi$ is additive: $\phi(X+Y) = \phi(X) + \phi(Y)$ for $X,Y\in\mathcal X$.
(c) The set $\mathcal Q$ has exactly one element $Q\in\mathcal M_{1,f}$, and so $U(X) = \phi(u(X))$ admits the Savage representation
$$U(X) = E_Q[\,u(X)\,], \quad X\in\mathcal X.$$

Exercise 2.5.4. Two random variables $X,Y\in\mathcal X$ will be called comonotone when
$$\big(X(\omega) - X(\omega')\big)\big(Y(\omega) - Y(\omega')\big) \ge 0 \quad\text{for all pairs } (\omega,\omega')\in\Omega\times\Omega.$$
Show that the following conditions are equivalent:
(a) The preference relation $\succ$ satisfies the axiom of comonotonic independence (2.33).
(b) The functional $\phi$ is comonotonic in the sense that $\phi(X+Y) = \phi(X) + \phi(Y)$ whenever $X,Y\in\mathcal X$ are comonotone.
◊
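On a finite scenario set, the comonotonicity condition of Exercise 2.5.4 is a finite check over pairs of scenarios. The sketch below implements it and illustrates property (b) in one special case only, namely the worst-case functional $\phi(X) = \min_\omega X(\omega)$, which corresponds to taking $\mathcal Q$ to be all probability vectors; the function names are hypothetical.

```python
from itertools import combinations

def is_comonotone(X, Y):
    """True if (X(w) - X(w')) * (Y(w) - Y(w')) >= 0 for all pairs of scenarios."""
    return all(
        (X[i] - X[j]) * (Y[i] - Y[j]) >= 0
        for i, j in combinations(range(len(X)), 2)
    )

def worst_case(X):
    """phi(X) = min over all probability vectors Q of E_Q[X] = min_w X(w)."""
    return min(X)

def add(X, Y):
    """Pointwise sum of two payoff vectors."""
    return [x + y for x, y in zip(X, Y)]

X, Y = [1, 2, 3], [0, 0, 5]          # comonotone pair
print(is_comonotone(X, Y), worst_case(add(X, Y)) == worst_case(X) + worst_case(Y))

H, K = [0, 1], [1, 0]                # the hedging pair of Remark 2.77
print(is_comonotone(H, K), worst_case(add(H, K)) == worst_case(H) + worst_case(K))
```

For the comonotone pair the worst-case functional is additive, whereas for the hedging pair the sum is the constant 1 and additivity fails. This is consistent with the role of comonotonicity in (2.32).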

Now we discuss what happens if we replace the assumption of certainty independence by weak certainty independence as introduced in Remark 2.77. We thus assume from now on that $\succ$ is a preference relation on $\tilde{\mathcal X}$ satisfying the following conditions.

Weak certainty independence.

Uncertainty aversion.

Monotonicity.

Continuity.

Exercise 2.5.5. Show that the restriction of $\succ$ to $\mathcal M_b(\mathbb R)$ satisfies the independence axiom of von Neumann–Morgenstern theory and hence admits a von Neumann–Morgenstern representation
$$\tilde u(\mu) := \int u(x)\,\mu(dx), \quad \mu\in\mathcal M_b(\mathbb R), \tag{2.45}$$
with a continuous and strictly increasing function $u:\mathbb R\to\mathbb R$.

For simplicity we will assume for the rest of this section that the function $u$ in (2.45) has an unbounded range $u(\mathbb R)$ containing zero. The assumption of an unbounded range is satisfied automatically if the restriction of $\succ$ to $\mathcal M_b(\mathbb R)$ is risk averse and $u$ hence concave. Lemma 2.81 and Remark 2.82 imply that the numerical representation $\tilde u$ in (2.45) admits a unique extension $\tilde U:\tilde{\mathcal X}\to\mathbb R$ that is a numerical representation of $\succ$ on all of $\tilde{\mathcal X}$. We define again
$$U(X) := \tilde U(\delta_X) \quad\text{for } X\in\mathcal X.$$
As in Remark 2.80, we see that $\tilde X\sim\delta_X$ when $X\in\mathcal X$ is defined as $X(\omega) := c(\tilde X(\omega))$ with
$$c(\mu) = u^{-1}\Big(\int u\,d\mu\Big)$$
denoting the certainty equivalent of a lottery $\mu\in\mathcal M_b(\mathbb R)$. In particular, the preference relation $\succ$ on $\tilde{\mathcal X}$ is uniquely determined by its restriction to $\mathcal X$. We now analyze the structure of $U$ in analogy to Proposition 2.83. Our next result shows that $U$ is again of the form $U(X) = \phi(u(X))$ for a functional $\phi:\mathcal X\to\mathbb R$. However, $\phi$ may no longer be positively homogeneous, since we have replaced certainty independence with weak certainty independence.
Proposition 2.87. Under the above assumptions, there exists a unique functional $\phi:\mathcal X\to\mathbb R$ such that
$$\tilde U(\tilde X) = \phi(\tilde u(\tilde X)) \quad\text{for all } \tilde X\in\tilde{\mathcal X}, \tag{2.46}$$
and such that the following three properties are satisfied:

Monotonicity: If $Y(\omega)\ge X(\omega)$ for all $\omega$, then $\phi(Y)\ge\phi(X)$.

Concavity: If $\lambda\in[0,1]$ then $\phi(\lambda X + (1-\lambda)Y) \ge \lambda\phi(X) + (1-\lambda)\phi(Y)$.

Cash invariance: $\phi(X+z) = \phi(X) + z$ for all $z\in\mathbb R$.

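Proof. Let $\mathcal X_u$ denote the set of all $X\in\mathcal X$ that take values in a compact subset of the range $u(\mathbb R)$ of $u$. Since $u$ is strictly increasing, we can define $\phi$ on $\mathcal X_u$ via
$$\phi(X) := \tilde U(\delta_{u^{-1}(X)}), \quad X\in\mathcal X_u.$$
Then we have
$$\tilde U(\tilde X) = \tilde U(\delta_{c(\tilde X)}) = \phi(u(c(\tilde X))) = \phi(\tilde u(\tilde X)), \tag{2.47}$$
and so (2.46) follows. Moreover, $\phi$ is monotone on $\mathcal X_u$ due to our monotonicity assumption.
We now prove that $\phi$ is cash invariant on $\mathcal X_u$. To this end, we assume first that $u(\mathbb R) = \mathbb R$ and take $X\in\mathcal X$ and some $z\in\mathbb R$. We then let $X_0 := u^{-1}(2X)$, $z_0 := u^{-1}(2z)$, and $y := u^{-1}(0)$. Taking $a > 0$ such that $-a\le X_0(\omega)\le a$ for each $\omega$, we see as in (2.37) that there exists $\alpha\in[0,1]$ such that
$$\tfrac12\delta_{X_0} + \tfrac12\delta_y \sim \tfrac{\alpha}{2}(\delta_a + \delta_y) + \tfrac{1-\alpha}{2}(\delta_{-a} + \delta_y) = \tfrac12\mu + \tfrac12\delta_y,$$
where $\mu = \alpha\delta_a + (1-\alpha)\delta_{-a}$. Using weak certainty independence, we may replace $\delta_y$ by $\delta_{z_0}$ and obtain $\frac12(\delta_{X_0} + \delta_{z_0}) \sim \frac12(\mu + \delta_{z_0})$. Hence, from (2.47),
$$\phi(X+z) = \phi\big(\tfrac12 u(X_0) + \tfrac12 u(z_0)\big) = \phi\big(\tilde u\big(\tfrac12\delta_{X_0} + \tfrac12\delta_{z_0}\big)\big) = \tilde U\big(\tfrac12\delta_{X_0} + \tfrac12\delta_{z_0}\big) = \tilde U\big(\tfrac12\mu + \tfrac12\delta_{z_0}\big) = \tilde u\big(\tfrac12\mu + \tfrac12\delta_{z_0}\big) = \tfrac12\tilde u(\mu) + \tfrac12 u(z_0).$$
Cash invariance now follows from $u(z_0) = 2z$ and the fact that
$$\tfrac12\tilde u(\mu) = \tfrac12\big(\tilde u(\mu) + u(y)\big) = \tilde u\big(\tfrac12\mu + \tfrac12\delta_y\big) = \tilde U\big(\tfrac12\mu + \tfrac12\delta_y\big) = \tilde U\big(\tfrac12\delta_{X_0} + \tfrac12\delta_y\big) = \phi\big(\tfrac12 u(X_0) + \tfrac12 u(y)\big) = \phi(X).$$
Here we have again applied (2.47). If $u(\mathbb R)$ is not equal to $\mathbb R$ it is sufficient to consider the cases in which $u(\mathbb R)$ contains $[0,\infty)$ or $(-\infty,0]$ and to work with positive or negative quantities $X$ and $z$, respectively. Then the preceding argument establishes the cash invariance of $\phi$ on the spaces of positive or negative bounded measurable functions, and $\phi$ can be extended by translation to the entire space of bounded measurable functions.
Now we prove the concavity of $\phi$ by showing $\phi(\frac12(X+Y)) \ge \frac12\phi(X) + \frac12\phi(Y)$. This is enough due to Exercise 2.5.2. Let $X_0 := u^{-1}(X)$ and $Y_0 := u^{-1}(Y)$ and suppose first that $\phi(X) = \phi(Y)$. Then $\delta_{X_0}\sim\delta_{Y_0}$ and uncertainty aversion implies that $\tilde Z := \frac12\delta_{X_0} + \frac12\delta_{Y_0} \succeq \delta_{X_0}$. Hence, by using (2.47),
$$\phi\big(\tfrac12(X+Y)\big) = \tilde U(\tilde Z) \ge \tilde U(\delta_{X_0}) = \phi(X) = \tfrac12\phi(X) + \tfrac12\phi(Y).$$
When $\phi(X)\ne\phi(Y)$ we let $z := \phi(X) - \phi(Y)$ so that $Y_z := Y + z$ satisfies $\phi(Y_z) = \phi(X)$. Hence,
$$\phi\big(\tfrac12(X+Y)\big) + \tfrac12 z = \phi\big(\tfrac12(X+Y_z)\big) \ge \tfrac12\phi(X) + \tfrac12\phi(Y_z) = \tfrac12\phi(X) + \tfrac12\phi(Y) + \tfrac12 z.$$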

A functional $\phi:\mathcal X\to\mathbb R$ satisfying the properties of monotonicity, concavity, and cash invariance is sometimes called a concave monetary utility functional, and $\rho := -\phi$ is called a convex risk measure. In Chapter 4 we will derive various representation results for convex risk measures. In particular it will follow from Theorem 4.16 that every concave monetary utility functional $\phi:\mathcal X\to\mathbb R$ is of the form
$$\phi(X) = \min_{Q\in\mathcal M_{1,f}}\big(E_Q[\,X\,] + \alpha(Q)\big), \quad X\in\mathcal X, \tag{2.48}$$
where the penalty function $\alpha:\mathcal M_{1,f}\to\mathbb R\cup\{+\infty\}$ is bounded from below.


Exercise 2.5.6. Prove the representation (2.48) for the case in which $\mathcal X$ is the set of all functions $X:\Omega\to\mathbb R$ on a finite set $\Omega$. To this end, one can either use biduality (Theorem A.62) or, more directly, a separation argument as given in Proposition A.1.
◊
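A small finite example shows how a penalty term in (2.48) preserves cash invariance and concavity while destroying positive homogeneity. In the sketch below the penalty is taken to be finite on two probability vectors only and $+\infty$ elsewhere, so the minimum runs over those two models; all numerical values and names are illustrative assumptions.

```python
from fractions import Fraction as F

def phi_penalized(X, models):
    """phi(X) = min over (Q, penalty) of E_Q[X] + penalty, as in (2.48).

    `models` lists the measures with finite penalty; all other Q are
    assumed to carry penalty +infinity and thus never attain the minimum.
    """
    return min(sum(q * x for q, x in zip(Q, X)) + pen for Q, pen in models)

MODELS = [
    ((F(1, 2), F(1, 2)), F(0)),   # reference model, no penalty
    ((F(1), F(0)), F(1)),         # pessimistic model, penalty 1
]

X = (F(0), F(4))
Y = (F(4), F(0))
half_sum = tuple((x + y) / 2 for x, y in zip(X, Y))

print(phi_penalized(X, MODELS))                           # value of X
print(phi_penalized(tuple(2 * x for x in X), MODELS))     # not twice the value of X
print(phi_penalized(tuple(x + 3 for x in X), MODELS))     # value of X plus 3
print(phi_penalized(half_sum, MODELS))                    # concavity check
```

When all finite penalties are zero, the functional reduces to the coherent case of Proposition 2.84 and positive homogeneity is restored.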
Combining Proposition 2.87 with the representation (2.48) yields the final result of this section:
Theorem 2.88. Consider a preference order $\succ$ on $\tilde{\mathcal X}$ satisfying the four properties of uncertainty aversion, weak certainty independence, monotonicity, and continuity. Assume moreover that the function $u$ in (2.45) has an unbounded range $u(\mathbb R)$. Then there exists a penalty function $\alpha:\mathcal M_{1,f}\to\mathbb R\cup\{+\infty\}$ that is bounded from below such that
$$\tilde U(\tilde X) = \min_{Q\in\mathcal M_{1,f}}\Big(E_Q\Big[\int u(x)\,\tilde X(\cdot,dx)\Big] + \alpha(Q)\Big), \quad \tilde X\in\tilde{\mathcal X},$$
is a numerical representation of $\succ$.

2.6 Probability measures with given marginals

In this section, we study the construction of probability measures with given marginals. In particular, this will yield the missing implication in the characterization of uniform preference in Theorem 2.57, but the results in this section are of independent interest. We focus on the following basic question: Suppose $\mu_1$ and $\mu_2$ are two probability measures on $S$, and $\Lambda$ is a convex set of probability measures on $S\times S$; when does $\Lambda$ contain some $\mu$ which has $\mu_1$ and $\mu_2$ as marginals?
The answer to this question will be given in a general topological setting. Let $S$ be a Polish space, and let us fix a continuous function $\psi$ on $S$ with values in $[1,\infty)$. As in Section 2.2 and in Appendix A.6, we use $\psi$ as a gauge function in order to define the space of measures
$$\mathcal M_1^\psi(S) := \Big\{\mu\in\mathcal M_1(S) \;\Big|\; \int\psi(x)\,\mu(dx) < \infty\Big\}$$
and the space of continuous test functions
$$C_\psi(S) := \{f\in C(S) \mid \exists\,c:\ |f(x)|\le c\cdot\psi(x) \text{ for all } x\in S\}.$$
The $\psi$-weak topology on $\mathcal M_1^\psi(S)$ is the coarsest topology such that
$$\mathcal M_1^\psi(S)\ni\mu\longmapsto\int f\,d\mu$$
is a continuous mapping for all $f\in C_\psi(S)$; see Appendix A.6 for details. On the product space $S\times S$, we take the gauge function
$$\bar\psi(x,y) := \psi(x) + \psi(y),$$
and define the corresponding set $\mathcal M_1^{\bar\psi}(S\times S)$, which will be endowed with the $\bar\psi$-weak topology.
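Theorem 2.89. Suppose that $\Lambda \subset \mathcal{M}_1^{\bar\psi}(S \times S)$ is convex and closed in the $\bar\psi$-weak topology, and that $\mu_1$, $\mu_2$ are probability measures in $\mathcal{M}_1^\psi(S)$. Then there exists some $\mu \in \Lambda$ with marginal distributions $\mu_1$ and $\mu_2$ if and only if
\[ \int f_1\, d\mu_1 + \int f_2\, d\mu_2 \le \sup_{\lambda \in \Lambda} \int \big( f_1(x) + f_2(y) \big)\, \lambda(dx, dy) \quad \text{for all } f_1, f_2 \in C_\psi(S). \]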

Theorem 2.89 is due to V. Strassen [255]. Its proof boils down to an application of the Hahn-Banach theorem; the difficult part consists in specifying the right topological setting. First, let us investigate the relations between $\mathcal{M}_1^{\bar\psi}(S \times S)$ and $\mathcal{M}_1^\psi(S)$. To this end, we define mappings
\[ \pi_i : \mathcal{M}_1^{\bar\psi}(S \times S) \to \mathcal{M}_1^\psi(S), \qquad i = 1, 2, \]
that yield the $i$th marginal distribution of a measure $\lambda \in \mathcal{M}_1^{\bar\psi}(S \times S)$:
\[ \int f\, d(\pi_1 \lambda) = \int f(x)\, \lambda(dx, dy) \quad \text{and} \quad \int f\, d(\pi_2 \lambda) = \int f(y)\, \lambda(dx, dy), \]
for all $f \in C_\psi(S)$.
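On a finite state space the marginal maps reduce to row and column sums of a weight table. The following is a minimal sketch under that assumption; the tables `lam1`, `lam2` and the helper names `marginals` and `mix` are hypothetical and serve only to illustrate the definition and the affinity of the two maps.

```python
from fractions import Fraction as F

def marginals(lam):
    """Return (pi_1 lam, pi_2 lam) for a joint measure given as a table lam[x][y]."""
    pi1 = [sum(row) for row in lam]                                  # integrate out y
    pi2 = [sum(row[y] for row in lam) for y in range(len(lam[0]))]   # integrate out x
    return pi1, pi2

def mix(a, lam1, lam2):
    """Convex combination a*lam1 + (1 - a)*lam2 of two joint measures."""
    return [[a * p + (1 - a) * q for p, q in zip(r1, r2)]
            for r1, r2 in zip(lam1, lam2)]

# Two hypothetical joint measures on {0, 1} x {0, 1}.
lam1 = [[F(1, 4), F(1, 4)], [F(0), F(1, 2)]]
lam2 = [[F(1, 2), F(0)], [F(1, 4), F(1, 4)]]
```

Exact rational arithmetic is used so that affinity, $\pi_i(a\lambda_1 + (1-a)\lambda_2) = a\,\pi_i\lambda_1 + (1-a)\,\pi_i\lambda_2$, can be checked without rounding error.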
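Lemma 2.90. $\pi_1$ and $\pi_2$ are continuous and affine mappings from $\mathcal{M}_1^{\bar\psi}(S \times S)$ to $\mathcal{M}_1^\psi(S)$.
Proof. Suppose that $\lambda_n$ converges to $\lambda$ in $\mathcal{M}_1^{\bar\psi}(S \times S)$. For $f \in C_\psi(S)$ let $\bar f(x, y) := f(x)$. Clearly, $\bar f \in C_{\bar\psi}(S \times S)$, and thus
\[ \int f\, d(\pi_1 \lambda_n) = \int \bar f\, d\lambda_n \longrightarrow \int \bar f\, d\lambda = \int f\, d(\pi_1 \lambda). \]
Therefore, $\pi_1$ is continuous, and the same is true of $\pi_2$. Affinity is obvious.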
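Now, let us consider the linear space
\[ E := \{ \alpha\mu - \beta\nu \mid \mu, \nu \in \mathcal{M}_1^\psi(S),\ \alpha, \beta \in \mathbb{R} \} \]
spanned by $\mathcal{M}_1^\psi(S)$. For $\rho = \alpha\mu - \beta\nu \in E$ the integral $\int f\, d\rho$ against a function $f \in C_\psi(S)$ is well-defined and given by
\[ \int f\, d\rho = \alpha \int f\, d\mu - \beta \int f\, d\nu. \]
In particular, $\rho \mapsto \int f\, d\rho$ is a linear functional on $E$, so we can regard $C_\psi(S)$ as a subset of the algebraic dual $E^*$ of $E$. Note that $\int f\, d\rho = \int f\, d\tilde\rho$ for all $f \in C_\psi(S)$ implies $\rho = \tilde\rho$, i.e., $C_\psi(S)$ separates the points of $E$. We endow $E$ with the coarsest topology $\sigma(E, C_\psi(S))$ for which all maps
\[ E \ni \rho \mapsto \int f\, d\rho, \qquad f \in C_\psi(S), \]
are continuous; see Definition A.58. With this topology, $E$ becomes a locally convex topological vector space.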
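Lemma 2.91. Under the above assumptions, $\mathcal{M}_1^\psi(S)$ is a closed convex subset of $E$, and the relative topology of the embedding coincides with the $\psi$-weak topology.
Proof. The sets of the form
\[ U_\varepsilon(\rho; f_1, \dots, f_n) := \bigcap_{i=1}^n \Big\{ \tilde\rho \in E \;\Big|\; \Big| \int f_i\, d\rho - \int f_i\, d\tilde\rho \Big| < \varepsilon \Big\} \]
with $\rho \in E$, $n \in \mathbb{N}$, $f_i \in C_\psi(S)$, and $\varepsilon > 0$ form a base of the topology $\sigma(E, C_\psi(S))$. Thus, if $U \subset E$ is open, then every point $\mu \in U \cap \mathcal{M}_1^\psi(S)$ possesses some neighborhood $U_\varepsilon(\mu; f_1, \dots, f_n) \subset U$. But $U_\varepsilon(\mu; f_1, \dots, f_n) \cap \mathcal{M}_1^\psi(S)$ is an open neighborhood of $\mu$ in the $\psi$-weak topology. Hence, $U \cap \mathcal{M}_1^\psi(S)$ is open in the $\psi$-weak topology. Similarly, one shows that every open set $V \subset \mathcal{M}_1^\psi(S)$ is of the form $V = U \cap \mathcal{M}_1^\psi(S)$ for some open subset $U$ of $E$. This shows that the relative topology $\mathcal{M}_1^\psi(S) \cap \sigma(E, C_\psi(S))$ coincides with the $\psi$-weak topology.
Moreover, $\mathcal{M}_1^\psi(S)$ is an intersection of closed subsets of $E$:
\[ \mathcal{M}_1^\psi(S) = \Big\{ \rho \in E \;\Big|\; \int 1\, d\rho = 1 \Big\} \cap \bigcap_{\substack{f \in C_\psi(S) \\ f \ge 0}} \Big\{ \rho \in E \;\Big|\; \int f\, d\rho \ge 0 \Big\}. \]
Therefore, $\mathcal{M}_1^\psi(S)$ is closed in $E$.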


Next, let $E^2$ denote the product space $E \times E$. We endow $E^2$ with the product topology for which the sets $U \times V$ with $U, V \in \sigma(E, C_\psi(S))$ form a neighborhood base. Clearly, $E^2$ is a locally convex topological vector space.
Lemma 2.92. Every continuous linear functional $\ell$ on $E^2$ is of the form
\[ \ell(\rho_1, \rho_2) = \int f_1\, d\rho_1 + \int f_2\, d\rho_2 \]
for some $f_1, f_2 \in C_\psi(S)$.
Proof. By linearity, $\ell$ is of the form $\ell(\rho_1, \rho_2) = \ell_1(\rho_1) + \ell_2(\rho_2)$, where $\ell_1(\rho_1) := \ell(\rho_1, 0)$ and $\ell_2(\rho_2) := \ell(0, \rho_2)$. By continuity of $\ell$, the set
\[ V := \ell^{-1}\big( (-1, 1) \big) \]
is open in $E^2$ and contains the point $(0, 0)$. Hence, there are two open neighborhoods $U_1, U_2 \subset E$ of $0$ such that $(0, 0) \in U_1 \times U_2 \subset V$. Therefore,
\[ 0 \in U_i \subset \ell_i^{-1}\big( (-1, 1) \big) \quad \text{for } i = 1, 2, \]
i.e., $0$ is an interior point of $\ell_i^{-1}((-1, 1))$. It follows that the $\ell_i$ are continuous at $0$, which in view of their linearity implies continuity everywhere on $E$. Finally, we may conclude from Proposition A.59 that each $\ell_i$ is of the form $\ell_i(\rho) = \int f_i\, d\rho$ for some $f_i \in C_\psi(S)$.
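The proof of the following lemma uses the characterization of compact sets for the $\psi$-weak topology that is stated in Corollary A.47. It is here that we need our assumption that $S$ is Polish.

Lemma 2.93. If $\Lambda$ is a closed convex subset of $\mathcal{M}_1^{\bar\psi}(S \times S)$, then
\[ H_\Lambda := \{ (\pi_1 \lambda, \pi_2 \lambda) \mid \lambda \in \Lambda \} \]
is a closed convex subset of $E^2$.
Proof. It is enough to show that $H_\Lambda$ is closed in $\mathcal{M}_1^\psi(S)^2 := \mathcal{M}_1^\psi(S) \times \mathcal{M}_1^\psi(S)$, because Lemma 2.91 implies that the relative topology induced by $E^2$ on $\mathcal{M}_1^\psi(S)^2$ coincides with the product topology for the $\psi$-weak topology. This is a metric topology by Corollary A.45. So let $(\mu_n, \nu_n) \in H_\Lambda$, $n \in \mathbb{N}$, be a sequence converging to some $(\mu, \nu) \in \mathcal{M}_1^\psi(S)^2$ in the product topology. Since both sequences $(\mu_n)_{n \in \mathbb{N}}$ and $(\nu_n)_{n \in \mathbb{N}}$ are relatively compact for the $\psi$-weak topology, Corollary A.47 yields functions $\phi_i : S \to [1, \infty]$, $i = 1, 2$, such that sets of the form $K_k^i := \{ \phi_i \le k\psi \}$, $k \in \mathbb{N}$, are relatively compact in $S$ and such that
\[ \sup_{n \in \mathbb{N}} \int \phi_1\, d\mu_n + \sup_{n \in \mathbb{N}} \int \phi_2\, d\nu_n < \infty. \]
For each $n$, there exists $\lambda_n \in \Lambda$ such that $\pi_1 \lambda_n = \mu_n$ and $\pi_2 \lambda_n = \nu_n$. Hence, if we let $\bar\phi(x, y) := \phi_1(x) + \phi_2(y)$, then
\[ \sup_{n \in \mathbb{N}} \int \bar\phi\, d\lambda_n = \sup_{n \in \mathbb{N}} \Big( \int \phi_1\, d\mu_n + \int \phi_2\, d\nu_n \Big) < \infty. \]
Moreover, we claim that each set $\{ \bar\phi \le k \bar\psi \}$ is relatively compact in $S \times S$. To prove this claim, let $l_i \in \mathbb{N}$ be such that
\[ l_i \ge \sup_{x \in K_k^i} \psi(x). \]
Then, since $\psi \ge 1$,
\[ \{ \bar\phi \le k \bar\psi \} \subset K_k^1 \times K_{k(1 + l_1)}^2 \,\cup\, K_{k(1 + l_2)}^1 \times K_k^2, \]
and the right-hand side is a relatively compact set in $S \times S$. It follows from Corollary A.47 that the sequence $(\lambda_n)_{n \in \mathbb{N}}$ is relatively compact for the $\bar\psi$-weak topology. Any accumulation point $\lambda$ of this sequence belongs to the closed set $\Lambda$. Moreover, $\lambda$ has marginal distributions $\mu$ and $\nu$, since the projections $\pi_i$ are continuous according to Lemma 2.90. Hence $(\mu, \nu) \in H_\Lambda$.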
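Proof of Theorem 2.89. Let $\mu_1, \mu_2 \in \mathcal{M}_1^\psi(S)$ be given. Since $H_\Lambda$ is closed and convex in $E^2$ by Lemma 2.93, we may apply Theorem A.57 with $B := \{ (\mu_1, \mu_2) \}$ and $C := H_\Lambda$: We conclude that $(\mu_1, \mu_2) \notin H_\Lambda$ if and only if there exists a continuous linear functional $\ell$ on $E^2$ such that
\[ \ell(\mu_1, \mu_2) > \sup_{(\nu_1, \nu_2) \in H_\Lambda} \ell(\nu_1, \nu_2) = \sup_{\lambda \in \Lambda} \ell(\pi_1 \lambda, \pi_2 \lambda). \]
Applying Lemma 2.92 to $\ell$ completes the proof.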


We will now use Theorem 2.89 to deduce the remaining implication of Theorem 2.57. We consider here a more general, $d$-dimensional setting. To this end, let $x = (x^1, \dots, x^d)$ and $y = (y^1, \dots, y^d)$ be two $d$-dimensional vectors. We will say that $x \le y$ if $x^i \le y^i$ for all $i$. A function on $\mathbb{R}^d$ is called increasing if it is increasing with respect to the partial order $\le$.
Theorem 2.94. Suppose $\mu_1$ and $\mu_2$ are Borel probability measures on $\mathbb{R}^d$ with $\int |x|\, \mu_i(dx) < \infty$ for $i = 1, 2$. Then the following assertions are equivalent:
(a) $\int f\, d\mu_1 \ge \int f\, d\mu_2$ for all increasing concave functions $f$ on $\mathbb{R}^d$.
(b) There exists a probability space $(\Omega, \mathcal{F}, P)$ with random variables $X_1$ and $X_2$ having distributions $\mu_1$ and $\mu_2$, respectively, such that
\[ E[\, X_2 \mid X_1 \,] \le X_1 \quad P\text{-a.s.} \]
(c) There exists a kernel $Q(x, dy)$ on $\mathbb{R}^d$ such that
\[ \int_{\mathbb{R}^d} y\, Q(x, dy) \le x \quad \text{for all } x \in \mathbb{R}^d \]
and such that $\mu_2 = \mu_1 Q$.
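Proof. (a) $\Rightarrow$ (b): We will apply Theorem 2.89 with $S := \mathbb{R}^d$ and with the gauge functions $\psi(x) := 1 + |x|$ and $\bar\psi(x, y) := \psi(x) + \psi(y)$. We denote by $C_b(\mathbb{R}^d)$ the set of bounded and continuous functions on $\mathbb{R}^d$. Let
\[ \Lambda := \bigcap_{\substack{f \in C_b(\mathbb{R}^d) \\ f \ge 0}} \Big\{ \lambda \in \mathcal{M}_1^{\bar\psi}(\mathbb{R}^d \times \mathbb{R}^d) \;\Big|\; \int y f(x)\, \lambda(dx, dy) \le \int x f(x)\, \lambda(dx, dy) \Big\}. \]
Each single set of the intersection is convex and closed in $\mathcal{M}_1^{\bar\psi}(\mathbb{R}^d \times \mathbb{R}^d)$, because the functions $g(x, y) := y f(x)$ and $\tilde g(x, y) := x f(x)$ belong to $C_{\bar\psi}(\mathbb{R}^d \times \mathbb{R}^d)$ for $f \in C_b(\mathbb{R}^d)$. Therefore, $\Lambda$ itself is convex and closed.
Suppose we can show that $\Lambda$ contains an element $P$ that has $\mu_1$ and $\mu_2$ as marginal distributions. Then we can take $\Omega := \mathbb{R}^d \times \mathbb{R}^d$ with its Borel $\sigma$-algebra $\mathcal{F}$, and let $X_1$ and $X_2$ denote the canonical projections on the first and the second components, respectively. By definition, $X_i$ will have the distribution $\mu_i$, and
\[ E\big[\, E[\, X_2 \mid X_1 \,]\, f(X_1) \,\big] = E[\, X_2 f(X_1) \,] \le E[\, X_1 f(X_1) \,] \quad \text{for all non-negative } f \in C_b(\mathbb{R}^d). \]
By monotone class arguments, we may thus conclude that
\[ E[\, X_2 \mid X_1 \,] \le X_1 \quad P\text{-a.s.,} \]
so that the assertion will follow.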


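It remains to prove the existence of $P$. To this end, we will apply Theorem 2.89 with the set $\Lambda$ defined above. Take a pair $f_1, f_2 \in C_\psi(\mathbb{R}^d)$, and let
\[ \tilde f_2(x) := \inf\{ g(x) \mid g \text{ is concave, increasing, and dominates } f_2 \}. \]
Then $\tilde f_2$ is concave, increasing, and dominates $f_2$. In fact, $\tilde f_2$ is the smallest function with these properties. We have
\[ \int f_1\, d\mu_1 + \int f_2\, d\mu_2 \le \int f_1\, d\mu_1 + \int \tilde f_2\, d\mu_2 \le \int (f_1 + \tilde f_2)\, d\mu_1 \le \sup_{x \in \mathbb{R}^d} \big( f_1(x) + \tilde f_2(x) \big) =: r_0. \]
We will establish the condition in Theorem 2.89 for our set $\Lambda$ by showing that for $r < r_0$ we have
\[ r < \sup_{\lambda \in \Lambda} \int \big( f_1(x) + f_2(y) \big)\, \lambda(dx, dy). \]
To this end, let for $z \in \mathbb{R}^d$
\[ \Lambda_z := \Big\{ \nu \in \mathcal{M}_1^\psi(\mathbb{R}^d) \;\Big|\; \int x\, \nu(dx) \le z \Big\} \]
and
\[ g_2(z) := \sup\Big\{ \int f_2\, d\nu \;\Big|\; \nu \in \Lambda_z \Big\}. \]
Then $g_2$ is increasing and $g_2(z) \ge f_2(z)$, because $\delta_z \in \Lambda_z$. Moreover, if $\nu_1 \in \Lambda_{z_1}$ and $\nu_2 \in \Lambda_{z_2}$, then
\[ \alpha \nu_1 + (1 - \alpha) \nu_2 \in \Lambda_{\alpha z_1 + (1 - \alpha) z_2} \]
for $\alpha \in [0, 1]$. Therefore, $g_2$ is concave, and we conclude that $g_2 \ge \tilde f_2$ (recall that $\tilde f_2$ is the smallest increasing and concave function dominating $f_2$). Hence, $r < f_1(z) + g_2(z)$ for some $z \in \mathbb{R}^d$, i.e., there exists some $\nu \in \Lambda_z$ such that the product measure $\lambda := \delta_z \otimes \nu$ satisfies
\[ r < f_1(z) + \int f_2\, d\nu = \int \big( f_1(x) + f_2(y) \big)\, \lambda(dx, dy). \]
But $\lambda = \delta_z \otimes \nu \in \Lambda$.
(b) $\Rightarrow$ (c): This follows as in the proof of the implication (f) $\Rightarrow$ (g) of Theorem 2.57 by using regular conditional distributions.
(c) $\Rightarrow$ (a): As in the proof of (g) $\Rightarrow$ (a) of Theorem 2.57, this follows by an application of Jensen's inequality.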


By the same arguments as for Corollary 2.61, we obtain the following result from Theorem 2.94.
Corollary 2.95. Suppose $\mu_1$ and $\mu_2$ are Borel probability measures on $\mathbb{R}^d$ such that $\int |x|\, \mu_i(dx) < \infty$, for $i = 1, 2$. Then the following conditions are equivalent:
(a) $\int f\, d\mu_1 \ge \int f\, d\mu_2$ for all concave functions $f$ on $\mathbb{R}^d$.
(b) There exists a probability space $(\Omega, \mathcal{F}, P)$ with random variables $X_1$ and $X_2$ having distributions $\mu_1$ and $\mu_2$, respectively, such that
\[ E[\, X_2 \mid X_1 \,] = X_1 \quad P\text{-a.s.} \]
(c) There exists a kernel $Q(x, dy)$ on $\mathbb{R}^d$ such that
\[ \int y\, Q(x, dy) = x \quad \text{for all } x \in \mathbb{R}^d \]
(i.e., $Q$ is a mean-preserving spread) and such that $\mu_2 = \mu_1 Q$.
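A mean-preserving spread is easy to exhibit on a finite set. The sketch below assumes $d = 1$, a hypothetical two-point measure `mu1`, and a hypothetical kernel `Q` that moves one unit up or down with equal probability; it builds $\mu_2 = \mu_1 Q$ and evaluates a few concave test functions against both measures.

```python
from fractions import Fraction as F

def push(mu, Q):
    """Compute mu Q, i.e. (mu Q)(y) = sum_x mu(x) Q(x, y), for dict-based measures."""
    out = {}
    for x, px in mu.items():
        for y, q in Q(x).items():
            out[y] = out.get(y, 0) + px * q
    return out

def integral(f, mu):
    """Integral of f with respect to a finitely supported measure mu."""
    return sum(f(x) * p for x, p in mu.items())

def Q(x):
    # Hypothetical kernel: move one unit up or down with equal probability,
    # so that the mean of Q(x, .) equals x.
    return {x - 1: F(1, 2), x + 1: F(1, 2)}

mu1 = {0: F(1, 2), 2: F(1, 2)}
mu2 = push(mu1, Q)

# A few concave test functions for condition (a).
concave = [lambda x: -x * x, lambda x: min(x, 1), lambda x: -abs(x - 1)]
```

Only three concave functions are tested, so this illustrates condition (a) rather than verifying it for all concave $f$.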
We conclude this section with a generalization of Theorem 2.68. Let $S$ be a Polish space which is endowed with a preference order $\succ$. We will assume that $\succ$ is continuous in the sense of Definition 2.8. A function on $S$ will be called increasing if it is increasing with respect to $\succ$.
Theorem 2.96. For two Borel probability measures $\mu_1$ and $\mu_2$ on $S$, the following conditions are equivalent.
(a) $\int f\, d\mu_1 \ge \int f\, d\mu_2$ for all bounded, increasing, and measurable functions $f$ on $S$.
(b) There exists a probability space $(\Omega, \mathcal{F}, P)$ with random variables $X_1$ and $X_2$ having distributions $\mu_1$ and $\mu_2$, respectively, such that $X_1 \succeq X_2$ $P$-a.s.
(c) There exists a kernel $Q$ on $S$ such that $\mu_2 = \mu_1 Q$ and
\[ Q\big( x, \{ y \mid x \succeq y \} \big) = 1 \quad \text{for all } x \in S. \]

Proof. (a) $\Rightarrow$ (b): We will apply Theorem 2.89 with the gauge function $\psi \equiv 1$, so that $\mathcal{M}_1^\psi(S)$ is just the space $\mathcal{M}_1(S)$ of all Borel probability measures on $S$ with the usual weak topology. Then $\bar\psi \equiv 2$, which is equivalent to taking $\bar\psi := 1$. Let
\[ M := \{ (x, y) \in S \times S \mid x \succeq y \}. \]
This set $M$ is closed in $S \times S$ by Proposition 2.11. Hence, the portmanteau theorem in the form of Theorem A.39 implies that the convex set
\[ \Lambda := \{ \lambda \in \mathcal{M}_1(S \times S) \mid \lambda(M) = 1 \} \]
is closed in $\mathcal{M}_1(S \times S)$. For $f_2 \in C_b(S)$, let
\[ \tilde f_2(x) := \sup\{ f_2(y) \mid x \succeq y \}. \]
Then $\tilde f_2$ is bounded, increasing, and dominates $f_2$. Therefore, if $f_1 \in C_b(S)$,
\[ \int f_1\, d\mu_1 + \int f_2\, d\mu_2 \le \int f_1\, d\mu_1 + \int \tilde f_2\, d\mu_2 \le \int (f_1 + \tilde f_2)\, d\mu_1 \le \sup_{x \in S} \big( f_1(x) + \tilde f_2(x) \big) = \sup_{x \succeq y} \big( f_1(x) + f_2(y) \big). \]
If $x \succeq y$, then the product measure $\lambda := \delta_x \otimes \delta_y$ is contained in $\Lambda$, and so
\[ \sup_{x \succeq y} \big( f_1(x) + f_2(y) \big) = \sup_{\lambda \in \Lambda} \int \big( f_1(x) + f_2(y) \big)\, \lambda(dx, dy). \]
Hence, all assumptions of Theorem 2.89 are satisfied, and we conclude that there exists a probability measure $P \in \Lambda$ with marginals $\mu_1$ and $\mu_2$. Taking $\Omega := S \times S$ and $X_i$ as the projection on the $i$th coordinate finishes the proof of (a) $\Rightarrow$ (b).
(b) $\Rightarrow$ (c) follows as in the proof of Theorem 2.57 by using regular conditional distributions.
(c) $\Rightarrow$ (a) is proved as the corresponding implication of Theorem 2.68.

Chapter 3 Optimality and equilibrium

Consider an investor whose preferences can be expressed in terms of expected utility. In Section 3.1, we discuss the problem of constructing a portfolio which maximizes
the expected utility of the resulting payoff. The existence of an optimal solution is
equivalent to the absence of arbitrage opportunities. This leads to an alternative proof
of the fundamental theorem of asset pricing, and to a specific choice of an equivalent martingale measure defined in terms of marginal utility. Section 3.2 contains
a detailed case study describing the interplay between exponential utility and relative
entropy. In Section 3.3, the optimization problem is formulated for general contingent
claims. Typically, optimal profiles will be non-linear functions of a given market portfolio, and this is one source of the demand for financial derivatives. Section 3.6 introduces the idea of market equilibrium. Prices of risky assets will no longer be given in
advance; they will be derived as equilibrium prices in a microeconomic setting, where
different agents demand contingent claims in accordance with their preferences and
with their budget constraints.

3.1 Portfolio optimization and the absence of arbitrage

Let us consider the one-period market model of Section 1.1 in which $d + 1$ assets are priced at time 0 and at time 1. Prices at time 0 are given by the price system
\[ \bar\pi = (\pi^0, \pi) = (\pi^0, \pi^1, \dots, \pi^d) \in \mathbb{R}_+^{d+1}; \]
prices at time 1 are modeled by the price vector
\[ \bar S = (S^0, S) = (S^0, S^1, \dots, S^d) \]
consisting of non-negative random variables $S^i$ defined on some probability space $(\Omega, \mathcal{F}, P)$. The 0th asset models a riskless bond, and so we assume that
\[ \pi^0 = 1 \quad \text{and} \quad S^0 \equiv 1 + r \]
for some constant $r > -1$. At time $t = 0$, an investor chooses a portfolio
\[ \bar\xi = (\xi^0, \xi) = (\xi^0, \xi^1, \dots, \xi^d) \in \mathbb{R}^{d+1} \]

where $\xi^i$ represents the number of shares of the $i$th asset. Such a portfolio $\bar\xi$ requires an initial investment $\bar\pi \cdot \bar\xi$ and yields at time 1 the random payoff $\bar\xi \cdot \bar S$.
Consider a risk-averse economic agent whose preferences are described in terms of a utility function $\tilde u$, and who wishes to invest a given amount $w$ into the financial market. Recall from Definition 2.35 that a real-valued function $\tilde u$ is called a utility function if it is continuous, strictly increasing, and strictly concave. A rational choice of the investor's portfolio $\bar\xi = (\xi^0, \xi)$ will be based on the expected utility
\[ E[\, \tilde u(\bar\xi \cdot \bar S) \,] \tag{3.1} \]
of the payoff $\bar\xi \cdot \bar S$ at time 1, where the portfolio $\bar\xi$ satisfies the budget constraint
\[ \bar\pi \cdot \bar\xi \le w. \tag{3.2} \]
Thus, the problem is to maximize the expected utility (3.1) among all portfolios $\bar\xi \in \mathbb{R}^{d+1}$ which satisfy the budget constraint (3.2). Here we make the implicit assumption that the payoff $\bar\xi \cdot \bar S$ is $P$-a.s. contained in the domain of definition of the utility function $\tilde u$.
In a first step, we remove the constraint (3.2) by considering instead of (3.1) the expected utility of the discounted net gain
\[ \frac{\bar\xi \cdot \bar S}{1 + r} - \bar\pi \cdot \bar\xi = \xi \cdot Y \]
earned by a portfolio $\bar\xi = (\xi^0, \xi)$. Here $Y$ is the $d$-dimensional random vector with components
\[ Y^i = \frac{S^i}{1 + r} - \pi^i, \qquad i = 1, \dots, d. \]
For any portfolio $\bar\xi$ with $\bar\pi \cdot \bar\xi < w$, adding the risk-free investment $w - \bar\pi \cdot \bar\xi$ would lead to the strictly better portfolio $(\xi^0 + w - \bar\pi \cdot \bar\xi, \xi)$. Thus, we can focus on portfolios $\bar\xi$ which satisfy $\bar\pi \cdot \bar\xi = w$, and then the payoff is an affine function of the discounted net gain:
\[ \bar\xi \cdot \bar S = (1 + r)(\xi \cdot Y + w). \]
Moreover, for any $\xi \in \mathbb{R}^d$ there exists a unique numéraire component $\xi^0 \in \mathbb{R}$ such that the portfolio $\bar\xi := (\xi^0, \xi)$ satisfies $\bar\pi \cdot \bar\xi = w$.
Let $u$ denote the following transformation of our original utility function $\tilde u$:
\[ u(y) := \tilde u\big( (1 + r)(y + w) \big). \]
Note that $u$ is again a utility function, and that CARA and (shifted) HARA utility functions are transformed into utility functions in the same class.
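The affine relation between the payoff and the discounted net gain can be checked in a single scenario. The following sketch uses hypothetical numbers for the interest rate, prices, holdings, and budget (none of them come from the text) and the helper names are illustrative only.

```python
def numeraire_component(xi, pi, w):
    """xi^0 such that the budget pi_bar . xi_bar equals w (recall pi^0 = 1)."""
    return w - sum(p * x for p, x in zip(pi, xi))

def payoff(xi0, xi, S, r):
    """xi_bar . S_bar in one scenario, with S^0 = 1 + r."""
    return xi0 * (1 + r) + sum(x * s for x, s in zip(xi, S))

def discounted_net_gain(xi, pi, S, r):
    """xi . Y with Y^i = S^i / (1 + r) - pi^i."""
    return sum(x * (s / (1 + r) - p) for x, s, p in zip(xi, S, pi))

# Hypothetical data: two risky assets, one scenario for the time-1 prices.
r, w = 0.25, 12.0
pi = [4.0, 2.0]      # time-0 prices of the risky assets
xi = [3.0, -1.0]     # holdings in the risky assets (a short position is allowed)
S = [6.0, 1.5]       # realized time-1 prices

xi0 = numeraire_component(xi, pi, w)
lhs = payoff(xi0, xi, S, r)
rhs = (1 + r) * (discounted_net_gain(xi, pi, S, r) + w)
```

Here `lhs` and `rhs` agree up to floating-point rounding, which is the identity $\bar\xi \cdot \bar S = (1 + r)(\xi \cdot Y + w)$ evaluated in one state.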

Clearly, the original constrained utility maximization problem is equivalent to the unconstrained problem of maximizing the expected utility $E[\, u(\xi \cdot Y) \,]$ among all $\xi \in \mathbb{R}^d$ such that $\xi \cdot Y$ is contained in the domain $D$ of $u$.
Assumption 3.1. We assume one of the following two cases:
(a) $D = \mathbb{R}$. In this case, we will admit all portfolios $\xi \in \mathbb{R}^d$, but we assume that $u$ is bounded from above.
(b) $D = [a, \infty)$ for some $a < 0$. In this case, we only consider portfolios which satisfy the constraint
\[ \xi \cdot Y \ge a \quad P\text{-a.s.,} \]
and we assume that the expected utility generated by such portfolios is finite, i.e.,
\[ E[\, u(\xi \cdot Y) \,] < \infty \quad \text{for all } \xi \in \mathbb{R}^d \text{ with } \xi \cdot Y \ge a \ P\text{-a.s.} \]
Remark 3.2. Part (a) of this assumption is clearly satisfied in the case of an exponential utility function $u(x) = 1 - e^{-\alpha x}$. Domains of the form $D = [a, \infty)$ appear, for example, in the case of (shifted) HARA utility functions $u(x) = \log(x - b)$ for $b < a$ and $u(x) = \frac{1}{\gamma}(x - c)^\gamma$ for $c \le a$ and $0 < \gamma < 1$. The integrability assumption in (b) holds if $E[\, |Y| \,] < \infty$, because any concave function is bounded from above by an affine function. ◊
In order to simplify notation, let us denote by
\[ S(D) := \{ \xi \in \mathbb{R}^d \mid \xi \cdot Y \in D \ P\text{-a.s.} \} \]
the set of admissible portfolios for $D$. Clearly, $S(D) = \mathbb{R}^d$ if $D = \mathbb{R}$. Our aim is to find some $\xi^* \in S(D)$ which is optimal in the sense that it maximizes the expected utility $E[\, u(\xi \cdot Y) \,]$ among all $\xi \in S(D)$. In this case, $\xi^*$ will be an optimal investment strategy into the risky assets. Complementing $\xi^*$ with a suitable numéraire component $\xi^0$ yields a portfolio $\bar\xi^* = (\xi^0, \xi^*)$ which maximizes the expected utility $E[\, \tilde u(\bar\xi \cdot \bar S) \,]$ under the budget constraint $\bar\pi \cdot \bar\xi = w$. Our first result in this section will relate the existence of such an optimal portfolio to the absence of arbitrage opportunities.
Theorem 3.3. Suppose that the utility function $u : D \to \mathbb{R}$ satisfies Assumption 3.1. Then there exists a maximizer of the expected utility
\[ E[\, u(\xi \cdot Y) \,], \qquad \xi \in S(D), \]
if and only if the market model is arbitrage-free. Moreover, there exists at most one maximizer if the market model is non-redundant in the sense of Definition 1.15.

Proof. The uniqueness part of the assertion follows immediately from the strict concavity of the function $\xi \mapsto E[\, u(\xi \cdot Y) \,]$ for non-redundant market models. As to existence, we may assume without loss of generality that our model is non-redundant. If the non-redundance condition (1.8) does not hold, then we define a linear space $N \subset \mathbb{R}^d$ by
\[ N := \{ \eta \in \mathbb{R}^d \mid \eta \cdot Y = 0 \ P\text{-a.s.} \}. \]
Clearly, $Y$ takes $P$-a.s. values in the orthogonal complement $N^\perp$ of $N$. Moreover, the no-arbitrage condition (1.3) holds for all $\xi \in \mathbb{R}^d$ if and only if it is satisfied for all $\xi \in N^\perp$. By identifying $N^\perp$ with some $\mathbb{R}^n$, we arrive at a situation in which the non-redundance condition (1.8) is satisfied and where we may apply our result for non-redundant market models.
If the model admits arbitrage opportunities, then a maximizer $\xi^*$ of the expected utility $E[\, u(\xi \cdot Y) \,]$ cannot exist: Adding to $\xi^*$ some non-zero $\eta \in \mathbb{R}^d$ for which $\eta \cdot Y \ge 0$ $P$-a.s., which exists by Lemma 1.4, would yield a contradiction to the optimality of $\xi^*$, because then
\[ E[\, u(\xi^* \cdot Y) \,] < E[\, u((\xi^* + \eta) \cdot Y) \,]. \]
From now on, we assume that the market model is arbitrage-free. Let us first consider the case in which $D = [a, \infty)$ for some $a \in (-\infty, 0)$. Then $S(D)$ is compact. In order to prove this claim, suppose by way of contradiction that $(\xi_n)$ is a sequence in $S(D)$ such that $|\xi_n| \to \infty$. By choosing a subsequence if necessary, we may assume that $\eta_n := \xi_n / |\xi_n|$ converges to some unit vector $\eta \in \mathbb{R}^d$. Clearly,
\[ \eta \cdot Y = \lim_{n \uparrow \infty} \frac{\xi_n \cdot Y}{|\xi_n|} \ge \lim_{n \uparrow \infty} \frac{a}{|\xi_n|} = 0 \quad P\text{-a.s.,} \]
and so non-redundance implies that $\bar\eta := (-\pi \cdot \eta, \eta)$ is an arbitrage opportunity, contradicting the absence of arbitrage.


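In the next step, we show that our assumptions guarantee the continuity of the function
\[ S(D) \ni \xi \mapsto E[\, u(\xi \cdot Y) \,], \]
which, in view of the compactness of $S(D)$, will imply the existence of a maximizer of the expected utility. To this end, it suffices to construct an integrable random variable which dominates $u(\xi \cdot Y)$ for all $\xi \in S(D)$. Define $\eta \in \mathbb{R}^d$ by
\[ \eta^i := 0 \vee \max_{\xi \in S(D)} \xi^i < \infty. \]
Then $\xi \cdot S \le \eta \cdot S$ for $\xi \in S(D)$, and hence
\[ \xi \cdot Y = \frac{\xi \cdot S}{1 + r} - \pi \cdot \xi \le \frac{\eta \cdot S}{1 + r} - 0 \wedge \min_{\xi' \in S(D)} \pi \cdot \xi'. \]
Note that $\eta \cdot Y$ is bounded from below by $-\pi \cdot \eta$ and that there exists some $\alpha \in (0, 1]$ such that $\alpha\, \pi \cdot \eta < |a|$. Hence $\alpha\eta \in S(D)$, and so $E[\, u(\alpha\eta \cdot Y) \,] < \infty$. Applying Lemma 3.4 below first with $b := \alpha\, \pi \cdot \eta$ and then with $b := -\big( 0 \wedge \min_{\xi' \in S(D)} \pi \cdot \xi' \big)$ shows that
\[ E\Big[\, u\Big( \frac{\eta \cdot S}{1 + r} - 0 \wedge \min_{\xi' \in S(D)} \pi \cdot \xi' \Big) \Big] < \infty. \]
This concludes the proof of the theorem in case $D = [a, \infty)$.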
Let us now turn to the case of a utility function on $D = \mathbb{R}$ which is bounded from above. We will reduce the assertion to a general existence criterion for minimizers of lower semicontinuous convex functions on $\mathbb{R}^d$, given in Lemma 3.5 below. It will be applied to the convex function $h(\xi) := -E[\, u(\xi \cdot Y) \,]$. We must show that $h$ is lower semicontinuous. Take a sequence $(\xi_n)_{n \in \mathbb{N}}$ in $\mathbb{R}^d$ converging to some $\xi$. By part (a) of Assumption 3.1, the random variables $-u(\xi_n \cdot Y)$ are uniformly bounded from below, and so we may apply Fatou's lemma:
\[ \liminf_{n \uparrow \infty} h(\xi_n) = \liminf_{n \uparrow \infty} E[\, -u(\xi_n \cdot Y) \,] \ge E[\, -u(\xi \cdot Y) \,] = h(\xi). \]
Thus, $h$ is lower semicontinuous.
By our non-redundance assumption, $h$ is strictly convex and admits at most one minimizer. We claim that the absence of arbitrage opportunities is equivalent to the following condition:
\[ \lim_{\alpha \uparrow \infty} h(\alpha\xi) = +\infty \quad \text{for all non-zero } \xi \in \mathbb{R}^d. \tag{3.3} \]
This is just the condition (3.4) required in Lemma 3.5. It follows from (1.3) and (1.8) that a non-redundant market model is arbitrage-free if and only if each non-zero $\xi \in \mathbb{R}^d$ satisfies $P[\, \xi \cdot Y < 0 \,] > 0$. Since the utility function $u$ is strictly increasing and concave, the set $\{ \xi \cdot Y < 0 \}$ can be described as
\[ \{ \xi \cdot Y < 0 \} = \Big\{ \lim_{\alpha \uparrow \infty} u(\alpha\, \xi \cdot Y) = -\infty \Big\} \quad \text{for } \xi \in \mathbb{R}^d. \]
The probability of the right-hand set is strictly positive if and only if
\[ \lim_{\alpha \uparrow \infty} E[\, u(\alpha\, \xi \cdot Y) \,] = -\infty, \]
because $u$ is bounded from above. This observation proves that the absence of arbitrage opportunities is equivalent to the condition (3.3) and completes the proof.
Lemma 3.4. If $D = [a, \infty)$, $b < |a|$, $0 < \alpha \le 1$, and $X$ is a non-negative random variable, then
\[ E[\, u(\alpha X - b) \,] < \infty \quad \Longrightarrow \quad E[\, u(X) \,] < \infty. \]
Proof. As in (A.1) in the proof of Proposition A.4 we obtain that
\[ \frac{u(X) - u(0)}{X - 0} \le \frac{u(\alpha X) - u(0)}{\alpha X - 0} \le \frac{u(\alpha X - b) - u(-b)}{\alpha X - b - (-b)}. \]
Multiplying by $X$ shows that $u(X)$ can be dominated by a multiple of $u(\alpha X - b)$ plus some constant.
Lemma 3.5. Suppose $h : \mathbb{R}^d \to \mathbb{R} \cup \{+\infty\}$ is a convex and lower semicontinuous function with $h(0) < \infty$. Then $h$ attains its infimum provided that
\[ \lim_{\alpha \uparrow \infty} h(\alpha\xi) = +\infty \quad \text{for all non-zero } \xi \in \mathbb{R}^d. \tag{3.4} \]
Moreover, if $h$ is strictly convex on $\{ h < \infty \}$, then also the converse implication holds: the existence of a minimizer implies (3.4).
Proof. First suppose that (3.4) holds. We will show below that for $c > \inf h$ the level sets $\{ x \mid h(x) \le c \}$ of $h$ are bounded and hence compact. Once the compactness of the level sets is established, it follows that the set
\[ \{ x \in \mathbb{R}^d \mid h(x) = \inf h \} = \bigcap_{c > \inf h} \{ x \in \mathbb{R}^d \mid h(x) \le c \} \]
of minimizers of $h$ is non-empty as an intersection of decreasing and non-empty compact sets.
Suppose $c > \inf h$ is such that the level set $\{ h \le c \}$ is not compact, and take a sequence $(x_n)$ in $\{ h \le c \}$ such that $|x_n| \to \infty$. By passing to a subsequence if necessary, we may assume that $x_n / |x_n|$ converges to some non-zero $\xi$. For any $\alpha > 0$,
\[ h(\alpha\xi) \le \liminf_{n \uparrow \infty} h\Big( \alpha \frac{x_n}{|x_n|} \Big) = \liminf_{n \uparrow \infty} h\Big( \frac{\alpha}{|x_n|} x_n + \Big( 1 - \frac{\alpha}{|x_n|} \Big) 0 \Big) \le \liminf_{n \uparrow \infty} \Big( \frac{\alpha}{|x_n|}\, c + \Big( 1 - \frac{\alpha}{|x_n|} \Big) h(0) \Big) = h(0). \]
Thus, we arrive at a contradiction to condition (3.4). This completes the proof of the existence of a minimizer under assumption (3.4).
In order to prove the converse implication, suppose that the strictly convex function $h$ has a minimizer $x^*$ but that there exists a non-zero $\xi \in \mathbb{R}^d$ violating (3.4), i.e., there exists a sequence $(\alpha_n)_{n \in \mathbb{N}}$ and some $c < \infty$ such that $\alpha_n \uparrow \infty$ but $h(\alpha_n \xi) \le c$ for all $n$. Let
\[ x_n := \lambda_n x^* + (1 - \lambda_n)\, \alpha_n \xi \]
where $\lambda_n$ is such that $|x^* - x_n| = 1$, which is possible for all large enough $n$. By the compactness of the Euclidean unit sphere centered in $x^*$, we may assume that $x_n$ converges to some $x$. Then necessarily $|x - x^*| = 1$. As $\alpha_n \xi$ diverges, we must have that $\lambda_n \to 1$. By using our assumption that $h(\alpha_n \xi)$ is bounded, we obtain
\[ h(x) \le \liminf_{n \uparrow \infty} h(x_n) \le \lim_{n \uparrow \infty} \big( \lambda_n h(x^*) + (1 - \lambda_n)\, h(\alpha_n \xi) \big) = h(x^*). \]
Hence, $x$ is another minimizer of $h$ besides $x^*$, contradicting the strict convexity of $h$. Thus, (3.4) must hold if the strictly convex function $h$ takes on its infimum.
Remark 3.6. Note that the proof of Theorem 3.3 under Assumption 3.1 (a) did not use the fact that the components of $Y$ are bounded from below. The result remains true for arbitrary $Y$. ◊
We turn now to a characterization of the solution $\xi^*$ of our utility maximization problem for continuously differentiable utility functions.
Proposition 3.7. Let $u$ be a continuously differentiable utility function on $D$ such that $E[\, u(\xi \cdot Y) \,]$ is finite for all $\xi \in S(D)$. Suppose that $\xi^*$ is a solution of the utility maximization problem, and that one of the following two sets of conditions is satisfied:
• $u$ is defined on $D = \mathbb{R}$ and is bounded from above.
• $u$ is defined on $D = [a, \infty)$, and $\xi^*$ is an interior point of $S(D)$.
Then
\[ u'(\xi^* \cdot Y)\, |Y| \in L^1(P), \]
and the following first-order condition holds:
\[ E[\, u'(\xi^* \cdot Y)\, Y \,] = 0. \tag{3.5} \]

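Proof. For $\xi \in S(D)$ and $\varepsilon \in (0, 1]$ let $\xi_\varepsilon := \varepsilon\xi + (1 - \varepsilon)\xi^*$, and define
\[ \Delta_\varepsilon := \frac{u(\xi_\varepsilon \cdot Y) - u(\xi^* \cdot Y)}{\varepsilon}. \]
The concavity of $u$ implies that $\Delta_\varepsilon \ge \Delta_\delta$ for $\varepsilon \le \delta$, and so
\[ \Delta_\varepsilon \nearrow u'(\xi^* \cdot Y)\, (\xi - \xi^*) \cdot Y \quad \text{as } \varepsilon \downarrow 0. \]
Note that our assumptions imply that $u(\xi \cdot Y) \in L^1(P)$ for all $\xi \in S(D)$. In particular, we have $\Delta_1 \in L^1(P)$, so that monotone convergence and the optimality of $\xi^*$ yield that
\[ 0 \ge E[\, \Delta_\varepsilon \,] \nearrow E[\, u'(\xi^* \cdot Y)\, (\xi - \xi^*) \cdot Y \,] \quad \text{as } \varepsilon \downarrow 0. \tag{3.6} \]
In particular, the expectation on the right-hand side of (3.6) is finite.
Both sets of assumptions imply that $\xi^*$ is an interior point of $S(D)$. Hence, we deduce from (3.6) by letting $\eta := \xi - \xi^*$ that
\[ E[\, u'(\xi^* \cdot Y)\, \eta \cdot Y \,] \le 0 \]
for all $\eta$ in a small ball centered in the origin of $\mathbb{R}^d$. Replacing $\eta$ by $-\eta$ shows that the expectation must vanish.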
Remark 3.8. Let us comment on the assumption that the optimal $\xi^*$ is an interior point of $S(D)$:
(a) If the non-redundance condition (1.8) is not satisfied, then either each or none of the solutions to the utility maximization problem is contained in the interior of $S(D)$. This can be seen by using the reduction argument given at the beginning of the proof of Theorem 3.3.
(b) Note that $\xi \cdot Y$ is bounded from below by $-\pi \cdot \xi$ in case $\xi$ has only non-negative components. Thus, the interior of $S(D)$ is always non-empty.
(c) As shown by the following example, the optimal $\xi^*$ need not be contained in the interior of $S(D)$ and, in this case, the first-order condition (3.5) will generally fail. ◊
Example 3.9. Take $r = 0$, and let $S^1$ be integrable but unbounded. We choose $D = [a, \infty)$ with $a := -\pi^1$, and we assume that $P[\, S^1 \le \varepsilon \,] > 0$ for all $\varepsilon > 0$. Then $S(D) = [0, 1]$. If $0 < E[\, S^1 \,] < \pi^1$ then Example 2.40 shows that the optimal investment is given by $\xi^* = 0$, and so $\xi^*$ lies in the boundary of $S(D)$. Thus, if $u$ is sufficiently smooth,
\[ E[\, u'(\xi^* \cdot Y)\, Y \,] = u'(0) \big( E[\, S^1 \,] - \pi^1 \big) < 0. \]
The intuitive reason for this failure of the first-order condition is that taking a short position in the asset would be optimal as soon as $E[\, S^1 \,] < \pi^1$. This choice, however, is ruled out by the constraint $\xi \in S(D)$. ◊
Proposition 3.7 yields a formula for the density of a particular equivalent risk-neutral measure. Recall that $P^*$ is risk-neutral if and only if $E^*[\, Y \,] = 0$.
Corollary 3.10. Suppose that the market model is arbitrage-free and that the assumptions of Proposition 3.7 are satisfied for a utility function $u : D \to \mathbb{R}$ and an associated maximizer $\xi^*$ of the expected utility $E[\, u(\xi \cdot Y) \,]$. Then
\[ \frac{dP^*}{dP} = \frac{u'(\xi^* \cdot Y)}{E[\, u'(\xi^* \cdot Y) \,]} \tag{3.7} \]
defines an equivalent risk-neutral measure.
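The first-order condition (3.5) and the density (3.7) can be traced numerically in a small model. The sketch below assumes a single risky asset whose discounted net gain takes three hypothetical values, and the utility $u(x) = 1 - e^{-x}$; the root of $\xi \mapsto E[u'(\xi Y) Y]$ is located by bisection, which is a convenient numerical device here and not a method prescribed by the text.

```python
import math

ys = [-1.0, 0.5, 2.0]   # hypothetical values of the discounted net gain Y
ps = [0.3, 0.4, 0.3]    # their probabilities under P

def u_prime(x):
    # Marginal utility of the assumed utility function u(x) = 1 - exp(-x).
    return math.exp(-x)

def first_order(xi):
    """E[u'(xi*Y) * Y], the left-hand side of the first-order condition (3.5)."""
    return sum(p * u_prime(xi * y) * y for y, p in zip(ys, ps))

def solve_xi(lo=-50.0, hi=50.0):
    """Bisection; first_order is decreasing in xi because u is concave."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if first_order(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xi_star = solve_xi()
norm = sum(p * u_prime(xi_star * y) for y, p in zip(ys, ps))
density = [u_prime(xi_star * y) / norm for y in ys]   # dP*/dP as in (3.7)
p_star = [p * z for p, z in zip(ps, density)]         # probabilities under P*
```

Since $Y$ takes both signs with positive probability, this toy model is arbitrage-free, and the resulting weights `p_star` are strictly positive, sum to one, and give $Y$ mean zero up to numerical error.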

Proof. Proposition 3.7 states that $u'(\xi^* \cdot Y)\, Y$ is integrable with respect to $P$ and that its expectation vanishes. Hence, we may conclude that $P^*$ is an equivalent risk-neutral measure if we can show that $P^*$ is well-defined by (3.7), i.e., if $u'(\xi^* \cdot Y) \in L^1(P)$. Let
\[ c := \sup\{ u'(x) \mid x \in D \text{ and } |x| \le |\xi^*| \} \le \begin{cases} u'(a) & \text{for } D = [a, \infty), \\ u'(-|\xi^*|) & \text{for } D = \mathbb{R}, \end{cases} \]
which is finite by our assumption that $u$ is continuously differentiable on all of $D$. Thus,
\[ 0 \le u'(\xi^* \cdot Y) \le c + u'(\xi^* \cdot Y)\, |Y| \cdot I_{\{ |Y| \ge 1 \}}, \]
and the right-hand side has a finite expectation.
Remark 3.11. Corollary 3.10 yields an independent and constructive proof of the fundamental theorem of asset pricing in the form of Theorem 1.7: Suppose that the model is arbitrage-free. If $Y$ is $P$-a.s. bounded, then so is $u(\xi^* \cdot Y)$, and the measure $P^*$ of (3.7) is an equivalent risk-neutral measure with a bounded density $dP^*/dP$. If $Y$ is unbounded, then we may consider the bounded random vector
\[ \tilde Y := \frac{Y}{1 + |Y|}, \]
which also satisfies the no-arbitrage condition (1.3). Let $\tilde\xi^*$ be a maximizer of the expected utility $E[\, u(\xi \cdot \tilde Y) \,]$. Then an equivalent risk-neutral measure $P^*$ is defined through the bounded density
\[ \frac{dP^*}{dP} := c \cdot \frac{u'(\tilde\xi^* \cdot \tilde Y)}{1 + |Y|}, \]
where $c$ is an appropriate normalizing constant.

Example 3.12. Consider the exponential utility function
\[ u(x) = 1 - e^{-\alpha x} \]
with constant absolute risk aversion $\alpha > 0$. The requirement that $E[\, u(\xi \cdot Y) \,]$ is finite is equivalent to the condition
\[ E[\, e^{-\alpha \xi \cdot Y} \,] < \infty \quad \text{for all } \xi \in \mathbb{R}^d. \]
If $\xi^*$ is a maximizer of the expected utility, then the density of the equivalent risk-neutral measure $P^*$ in (3.7) takes the particular form
\[ \frac{dP^*}{dP} = \frac{e^{-\alpha \xi^* \cdot Y}}{E[\, e^{-\alpha \xi^* \cdot Y} \,]}. \]
In fact, $P^*$ is independent of $\alpha$ since $\xi^*$ maximizes the expected utility $1 - E[\, e^{-\alpha \xi \cdot Y} \,]$ if and only if $\lambda^* := -\alpha \xi^*$ is a minimizer of the moment generating function
\[ Z(\lambda) := E[\, e^{\lambda \cdot Y} \,], \qquad \lambda \in \mathbb{R}^d, \]
of $Y$. In Corollary 3.25 below, the measure $P^*$ will be characterized by the fact that it minimizes the relative entropy with respect to $P$ among the risk-neutral measures in $\mathcal{P}$; see Definition 3.20 below. ◊
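That $P^*$ does not depend on the risk aversion can be observed numerically. The following sketch assumes one risky asset with three hypothetical outcomes for $Y$; for each $\alpha$ it maximizes the expected exponential utility directly (by bisection on the derivative, a choice made for this illustration) and then forms the density from (3.7).

```python
import math

ys = [-1.0, 0.5, 2.0]   # hypothetical values of Y
ps = [0.3, 0.4, 0.3]    # their probabilities under P

def optimal_xi(alpha):
    """Maximize xi -> 1 - E[exp(-alpha*xi*Y)] by bisection on its derivative."""
    def deriv(xi):
        return sum(p * alpha * y * math.exp(-alpha * xi * y) for y, p in zip(ys, ps))
    lo, hi = -20.0, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if deriv(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

def density(alpha):
    """Values of dP*/dP on the three outcomes for risk aversion alpha."""
    xi = optimal_xi(alpha)
    w = [math.exp(-alpha * xi * y) for y in ys]
    z = sum(p * v for p, v in zip(ps, w))
    return [v / z for v in w]
```

In this model `alpha * optimal_xi(alpha)` is the same number for every $\alpha$ (it equals $-\lambda^*$), so a more risk-averse investor holds a proportionally smaller position while `density(alpha)` is unchanged.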

3.2 Exponential utility and relative entropy

In this section we give a more detailed study of the problem of portfolio optimization with respect to a CARA utility function
\[ u(x) = 1 - e^{-\alpha x} \]
for $\alpha > 0$. As in the previous Section 3.1, the problem is to maximize the expected utility
\[ E[\, u(\xi \cdot Y) \,] \]
of the discounted net gain $\xi \cdot Y$ earned by an investment into risky assets. The key assumption for this problem is that
\[ E[\, u(\xi \cdot Y) \,] > -\infty \quad \text{for all } \xi \in \mathbb{R}^d. \tag{3.8} \]
Recall from Example 3.12 that the maximization of $E[\, u(\xi \cdot Y) \,]$ is reduced to the minimization of the moment generating function
\[ Z(\lambda) := E[\, e^{\lambda \cdot Y} \,], \qquad \lambda \in \mathbb{R}^d, \]
which does not depend on the risk aversion $\alpha$. The key assumption (3.8) is equivalent to the condition that
\[ Z(\lambda) < \infty \quad \text{for all } \lambda \in \mathbb{R}^d. \tag{3.9} \]
Throughout this section, we will always assume that (3.9) holds. But we will not need the assumption that $Y$ is bounded from below (which in our financial market model follows from assuming that asset prices are non-negative); all results remain true for general random vectors $Y$; see also Remarks 1.9 and 3.6.
Lemma 3.13. The condition (3.9) is equivalent to
\[ E[\, e^{\alpha |Y|} \,] < \infty \quad \text{for all } \alpha > 0. \]
Proof. Clearly, the condition in the statement of the lemma implies (3.9). To prove the converse assertion, take a constant $c > 0$ such that $|x| \le c \sum_{i=1}^d |x^i|$ for $x \in \mathbb{R}^d$. By Hölder's inequality,
\[ E[\, e^{\alpha |Y|} \,] \le E\Big[ \exp\Big( \alpha c \sum_{i=1}^d |Y^i| \Big) \Big] \le \prod_{i=1}^d E\big[\, e^{\alpha c d |Y^i|} \,\big]^{1/d}. \]
In order to show that the $i$th factor on the right is finite, take $\lambda \in \mathbb{R}^d$ such that $\lambda^i = \alpha c d$ and $\lambda^j = 0$ for $j \ne i$. With this choice,
\[ E\big[\, e^{\alpha c d |Y^i|} \,\big] \le E[\, e^{\lambda \cdot Y} \,] + E[\, e^{-\lambda \cdot Y} \,], \]
which is finite by (3.9).
Definition 3.14. The exponential family of $P$ with respect to $Y$ is the set of measures
\[ \{ P_\lambda \mid \lambda \in \mathbb{R}^d \} \]
defined via
\[ \frac{dP_\lambda}{dP} = \frac{e^{\lambda \cdot Y}}{Z(\lambda)}. \]

Example 3.15. Suppose that the risky asset $S^1$ has under $P$ a Poisson distribution with parameter $\alpha > 0$, i.e., $S^1$ takes values in $\{0, 1, \dots\}$ and satisfies
\[ P[\, S^1 = k \,] = e^{-\alpha} \frac{\alpha^k}{k!}, \qquad k = 0, 1, \dots. \]
Then (3.9) is satisfied for $Y := S^1 - \pi^1$, and $S^1$ has under $P_\lambda$ a Poisson distribution with parameter $e^\lambda \alpha$. Hence, the exponential family of $P$ generates the family of all Poisson distributions. ◊
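The Poisson claim can be confirmed numerically from Definition 3.14. The sketch below assumes hypothetical values for the Poisson parameter, the tilt $\lambda$, and the price $\pi^1$, and approximates $Z(\lambda)$ by truncating the series; the truncation level `K` is an arbitrary choice for this illustration.

```python
import math

def poisson_pmf(k, rate):
    """Probability of the value k under a Poisson distribution with the given rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def tilted_pmf(k, lam, alpha, price, K=150):
    """P_lambda[S^1 = k] from dP_lambda/dP = exp(lam*Y)/Z(lam) with Y = S^1 - price.

    Z(lam) is approximated by summing the first K terms of the series.
    """
    Z = sum(math.exp(lam * (j - price)) * poisson_pmf(j, alpha) for j in range(K))
    return math.exp(lam * (k - price)) * poisson_pmf(k, alpha) / Z

# Hypothetical parameters.
alpha, lam, price = 2.0, 0.7, 1.5
```

For these parameters `tilted_pmf(k, lam, alpha, price)` agrees with `poisson_pmf(k, alpha * math.exp(lam))` up to rounding; the price only shifts $Y$ by a constant and cancels in the normalization.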
Example 3.16. Let $Y$ have a standard normal distribution $N(0, 1)$. Then (3.9) is satisfied, and the distribution of $Y$ under $P_\lambda$ is equal to the normal distribution $N(\lambda, 1)$ with mean $\lambda$ and variance 1. ◊
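Remark 3.17. Two parameters $\lambda$ and $\lambda'$ in $\mathbb{R}^d$ determine the same element in the exponential family of $P$ if and only if $(\lambda - \lambda') \cdot Y = 0$ $P$-almost surely. It follows that the mapping
\[ \lambda \mapsto P_\lambda \]
is injective provided that the non-redundance condition holds in the form
\[ \xi \cdot Y = 0 \ P\text{-a.s.} \quad \Longrightarrow \quad \xi = 0. \tag{3.10} \]
◊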


Chapter 3 Optimality and equilibrium

In the sequel, we will be interested in the barycenters of the members of the exponential family of $P$ with respect to $Y$. We denote
$$m(\lambda) := E_\lambda[\,Y\,] = \frac{1}{Z(\lambda)}\,E[\,Y e^{\lambda\cdot Y}\,], \qquad \lambda\in\mathbb{R}^d.$$
The next lemma shows that $m(\lambda)$ can be obtained as the gradient of the logarithmic moment generating function.
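Lemma 3.18. $Z$ is a smooth function on $\mathbb{R}^d$, and the gradient of $\log Z$ at $\lambda$ is the expectation of $Y$ under $P_\lambda$:
$$(\nabla \log Z)(\lambda) = E_\lambda[\,Y\,] = m(\lambda).$$
Moreover, the Hessian of $\log Z$ at $\lambda$ equals the covariance matrix $(\mathrm{cov}_{P_\lambda}(Y^i, Y^j))_{i,j}$ of $Y$ under the measure $P_\lambda$:
$$\frac{\partial^2}{\partial\lambda_i\,\partial\lambda_j}\log Z(\lambda) = \mathrm{cov}_{P_\lambda}(Y^i, Y^j) = E_\lambda[\,Y^i Y^j\,] - E_\lambda[\,Y^i\,]\,E_\lambda[\,Y^j\,].$$
In particular, $\log Z$ is convex.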
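A finite-difference check of Lemma 3.18 in dimension $d = 1$ may help to fix ideas. The sketch below is only an illustration: the three-point distribution of $Y$, the point `lam`, and the step size `h` are hypothetical choices. It compares numerical derivatives of $\log Z$ with the mean and variance of $Y$ under $P_\lambda$.

```python
import math

# Hypothetical finite model: Y takes the values ys with probabilities ps under P.
ys = [-1.0, 0.5, 2.0]
ps = [0.3, 0.5, 0.2]

def log_Z(lam):
    """Logarithmic moment generating function log E[exp(lam * Y)]."""
    return math.log(sum(p * math.exp(lam * y) for p, y in zip(ps, ys)))

def tilted(lam):
    """Probabilities of the exponential-family member P_lambda."""
    weights = [p * math.exp(lam * y) for p, y in zip(ps, ys)]
    total = sum(weights)
    return [wt / total for wt in weights]

def mean_var(lam):
    """Mean and variance of Y under P_lambda."""
    q = tilted(lam)
    m = sum(qi * y for qi, y in zip(q, ys))
    v = sum(qi * (y - m) ** 2 for qi, y in zip(q, ys))
    return m, v

lam, h = 0.4, 1e-4
grad_fd = (log_Z(lam + h) - log_Z(lam - h)) / (2 * h)                   # central difference
hess_fd = (log_Z(lam + h) - 2 * log_Z(lam) + log_Z(lam - h)) / h ** 2   # second difference
m, v = mean_var(lam)
print(grad_fd, m)   # first derivative of log Z versus E_lambda[Y]
print(hess_fd, v)   # second derivative of log Z versus var_lambda(Y)
```

The non-negativity of the variance is what makes $\log Z$ convex in this one-dimensional setting.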
Proof. Observe that
$$\Big|\frac{\partial e^{\lambda\cdot x}}{\partial\lambda_i}\Big| = |x^i|\,e^{\lambda\cdot x} \le \exp\big[(1 + |\lambda|)\cdot|x|\big].$$
Hence, Lemma 3.13 and Lebesgue's dominated convergence theorem justify the interchanging of differentiation and integration (see the "differentiation lemma" in [21], §16, for details).
The following corollary summarizes the results we have obtained so far. Recall from Section 1.5 the notion of the convex hull $\Gamma(\nu)$ of the support of a measure $\nu$ on $\mathbb{R}^d$ and the definition of the relative interior $\mathrm{ri}\,C$ of a convex set $C$.
Corollary 3.19. Denote by $\mu := P\circ Y^{-1}$ the distribution of $Y$ under $P$. Then the function
$$\lambda \mapsto \lambda\cdot m_0 - \log Z(\lambda)$$
takes on its maximum if and only if $m_0$ is contained in the relative interior of the convex hull of the support of $\mu$, i.e., if and only if
$$m_0 \in \mathrm{ri}\,\Gamma(\mu).$$
In this case, any maximizer $\lambda^*$ satisfies
$$m_0 = m(\lambda^*) = E_{\lambda^*}[\,Y\,].$$
In particular, the set $\{m(\lambda) \mid \lambda\in\mathbb{R}^d\}$ coincides with $\mathrm{ri}\,\Gamma(\mu)$. Moreover, if the non-redundance condition (3.10) holds, then there exists at most one maximizer $\lambda^*$.


Proof. Taking $\tilde Y := Y - m_0$ reduces the problem to the situation where $m_0 = 0$. Applying Theorem 3.3 with the utility function $u(z) = 1 - e^{-z}$ shows that the existence of a maximizer $\lambda^*$ of $-\log Z$ is equivalent to the absence of arbitrage opportunities. Corollary 3.10 states that $m(\lambda^*) = 0$ and that 0 belongs to $M_b(\mu)$, where $M_b(\mu)$ was defined in Lemma 1.44. An application of Theorem 1.49 completes the proof.
It will turn out that the maximization problem of the previous corollary is closely
related to the following concept.
Definition 3.20. The relative entropy of a probability measure $Q$ with respect to $P$ is defined as
$$H(Q|P) := \begin{cases} E\Big[\dfrac{dQ}{dP}\log\dfrac{dQ}{dP}\Big] & \text{if } Q \ll P,\\[1ex] +\infty & \text{otherwise.}\end{cases}$$
Remark 3.21. Jensen's inequality applied to the strictly convex function $h(x) = x\log x$ yields
$$H(Q|P) = E\Big[\,h\Big(\frac{dQ}{dP}\Big)\Big] \ge h(1) = 0, \tag{3.11}$$
with equality if and only if $Q = P$. ♦
Example 3.22. Let $\Omega$ be a finite set and $\mathcal{F}$ be its power set. Every probability $Q$ on $(\Omega, \mathcal{F})$ is absolutely continuous with respect to the uniform distribution $P$. Let us denote $Q(\omega) := Q[\,\{\omega\}\,]$. Clearly,
$$H(Q|P) = \sum_{\omega\in\Omega} Q(\omega)\log\frac{Q(\omega)}{P(\omega)} = \sum_{\omega\in\Omega} Q(\omega)\log Q(\omega) + \log|\Omega|.$$
The quantity
$$H(Q) := -\sum_{\omega\in\Omega} Q(\omega)\log Q(\omega)$$
is usually called the entropy of $Q$. Observe that $H(P) = \log|\Omega|$, so that
$$H(Q|P) = H(P) - H(Q).$$
Since the left-hand side is non-negative by (3.11), the uniform distribution $P$ has maximal entropy among all probability distributions on $(\Omega, \mathcal{F})$. ♦
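The identity $H(Q|P) = H(P) - H(Q)$ for uniform $P$ is easy to verify on a small example. The following sketch uses a hypothetical four-point space and a hypothetical measure `q`; the function names are illustrative.

```python
import math

def entropy(q):
    """Entropy H(Q) = -sum Q(w) log Q(w), with the convention 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in q if x > 0)

def relative_entropy(q, p):
    """Relative entropy H(Q|P) on a finite space; +inf if Q is not << P."""
    total = 0.0
    for qi, pi in zip(q, p):
        if qi > 0:
            if pi == 0:
                return math.inf
            total += qi * math.log(qi / pi)
    return total

q = [0.5, 0.25, 0.125, 0.125]   # hypothetical probability measure Q
p = [0.25] * 4                  # uniform distribution P on four points

lhs = relative_entropy(q, p)
rhs = entropy(p) - entropy(q)   # H(P) - H(Q), with H(P) = log 4
print(lhs, rhs)
```

Both numbers agree, and both are non-negative, which reflects the maximal entropy of the uniform distribution.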
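Example 3.23. Let $\mu = N(m, \sigma^2)$ denote the normal distribution with mean $m$ and variance $\sigma^2$ on $\mathbb{R}$. Then, for $\tilde\mu = N(\tilde m, \tilde\sigma^2)$,
$$\frac{d\tilde\mu}{d\mu}(x) = \frac{\sigma}{\tilde\sigma}\exp\Big[-\frac{(x - \tilde m)^2}{2\tilde\sigma^2} + \frac{(x - m)^2}{2\sigma^2}\Big],$$
and hence
$$H(\tilde\mu|\mu) = \frac12\Big(\frac{\tilde\sigma^2}{\sigma^2} - \log\frac{\tilde\sigma^2}{\sigma^2} - 1\Big) + \frac12\Big(\frac{\tilde m - m}{\sigma}\Big)^2.$$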


The following result shows that $P_\lambda$ is the unique minimizer of the relative entropy $H(Q|P)$ among all probability measures $Q$ with $E_Q[\,Y\,] = E_\lambda[\,Y\,]$.
Theorem 3.24. Let $m_0 := m(\lambda_0)$ for some given $\lambda_0\in\mathbb{R}^d$. Then, for any probability measure $Q$ on $(\Omega, \mathcal{F})$ such that $E_Q[\,Y\,] = m_0$,
$$H(Q|P) \ge H(P_{\lambda_0}|P) = \lambda_0\cdot m_0 - \log Z(\lambda_0),$$
and equality holds if and only if $Q = P_{\lambda_0}$. Moreover, $\lambda_0$ maximizes the function
$$\lambda\cdot m_0 - \log Z(\lambda)$$
over all $\lambda\in\mathbb{R}^d$.
Proof. Let $Q$ be a probability measure on $(\Omega, \mathcal{F})$ such that $E_Q[\,Y\,] = m_0$. We show first that for all $\lambda\in\mathbb{R}^d$
$$H(Q|P) = H(Q|P_\lambda) + \lambda\cdot m_0 - \log Z(\lambda). \tag{3.12}$$
To this end, note that both sides of (3.12) are infinite if $Q \not\ll P$. Otherwise
$$\frac{dQ}{dP} = \frac{dQ}{dP_\lambda}\cdot\frac{dP_\lambda}{dP} = \frac{dQ}{dP_\lambda}\cdot\frac{e^{\lambda\cdot Y}}{Z(\lambda)},$$
and taking logarithms and integrating with respect to $Q$ yields (3.12).
Since $H(Q|P_\lambda) \ge 0$ according to (3.11), we get from (3.12) that
$$H(Q|P) \ge \lambda\cdot m_0 - \log Z(\lambda) \tag{3.13}$$
for all $\lambda\in\mathbb{R}^d$ and all measures $Q$ such that $E_Q[\,Y\,] = m_0$. Moreover, equality holds in (3.13) if and only if $H(Q|P_\lambda) = 0$, which is equivalent to $Q = P_\lambda$. In this case, $\lambda$ must be such that $m(\lambda) = m_0$. In particular, for any such $\lambda$
$$H(P_\lambda|P) = \lambda\cdot m_0 - \log Z(\lambda).$$
Thus, $\lambda_0$ maximizes the right-hand side of (3.13), and $P_{\lambda_0}$ minimizes the relative entropy on the set
$$M_0 := \{Q \mid E_Q[\,Y\,] = m_0\}.$$
But the relative entropy $H(Q|P)$ is a strictly convex functional of $Q$, and so it can have at most one minimizer in the convex set $M_0$. Thus, any $\lambda$ with $m(\lambda) = m_0$ induces the same measure $P_{\lambda_0}$.
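Theorem 3.24 can be illustrated on a finite probability space with $d = 1$. The sketch below is not part of the original text; the three-point model, the target mean `m0`, the competitor `q_other`, and the bisection bounds are assumptions made for illustration. It solves $m(\lambda) = m_0$ by bisection (which works because $m$ is the derivative of the convex function $\log Z$ and hence increasing), and then compares relative entropies.

```python
import math

# Hypothetical finite model: Y takes the values ys with probabilities ps under P.
ys = [-1.0, 0.0, 2.0]
ps = [0.3, 0.4, 0.3]

def log_Z(lam):
    return math.log(sum(p * math.exp(lam * y) for p, y in zip(ps, ys)))

def tilted(lam):
    """Probabilities of P_lambda, with density exp(lam*Y)/Z(lam) w.r.t. P."""
    weights = [p * math.exp(lam * y) for p, y in zip(ps, ys)]
    total = sum(weights)
    return [wt / total for wt in weights]

def mean(q):
    return sum(qi * y for qi, y in zip(q, ys))

def rel_entropy(q, p):
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def solve_lambda(m0, lo=-50.0, hi=50.0):
    """Bisection for m(lam) = m0; the bracket [lo, hi] is an assumption."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(tilted(mid)) < m0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

m0 = 0.5
lam0 = solve_lambda(m0)
h_min = rel_entropy(tilted(lam0), ps)      # H(P_lam0 | P)
dual = lam0 * m0 - log_Z(lam0)             # lam0 * m0 - log Z(lam0)

q_other = [0.1, 0.6, 0.3]                  # another measure with E_Q[Y] = 0.5
h_other = rel_entropy(q_other, ps)
print(h_min, dual, h_other)
```

The first two printed values coincide, and the third is strictly larger, as the theorem predicts for any competitor $Q \ne P_{\lambda_0}$ with the same mean.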
Taking $m_0 = 0$ in the preceding theorem yields a special equivalent risk-neutral measure in our financial market model, namely the entropy-minimizing risk-neutral measure. Sometimes it is also called the Esscher transform of $P$. Recall our assumption (3.9).


Corollary 3.25. Suppose the market model is arbitrage-free. Then there exists a unique equivalent risk-neutral measure $P^*\in\mathcal{P}$ which minimizes the relative entropy $H(\hat P|P)$ over all $\hat P\in\mathcal{P}$. The density of $P^*$ is of the form
$$\frac{dP^*}{dP} = \frac{e^{\lambda^*\cdot Y}}{E[\,e^{\lambda^*\cdot Y}\,]},$$
where $\lambda^*$ denotes a minimizer of the moment generating function $E[\,e^{\lambda\cdot Y}\,]$ of $Y$.
Proof. This follows immediately from Corollary 3.19 and Theorem 3.24.
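As an illustration of Corollary 3.25, the following sketch computes the Esscher transform in a one-period model with a single risky asset and three scenarios. All market data (`r`, `pi`, `S`, `P`) and the bisection bracket are hypothetical. The minimizer of $Z$ is found as the root of $Z'$, and the resulting measure is checked to be risk-neutral.

```python
import math

# Hypothetical one-period market: one risky asset, three scenarios.
r = 0.05                        # risk-free interest rate
pi = 100.0                      # price of the risky asset at time 0
S = [80.0, 105.0, 130.0]        # prices at time 1
P = [0.2, 0.5, 0.3]             # probabilities under P
Y = [s / (1 + r) - pi for s in S]   # discounted net gains

def Z(lam):
    """Moment generating function E[exp(lam * Y)]."""
    return sum(p * math.exp(lam * y) for p, y in zip(P, Y))

def dZ(lam):
    """Derivative of Z, equal to E[Y exp(lam * Y)]."""
    return sum(p * y * math.exp(lam * y) for p, y in zip(P, Y))

# Y takes both signs, so Z is strictly convex with a unique minimizer;
# bisection on dZ over an assumed bracket [-1, 1] locates it.
lo, hi = -1.0, 1.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if dZ(mid) < 0:
        lo = mid
    else:
        hi = mid
lam_star = 0.5 * (lo + hi)

# Density of P* with respect to P is exp(lam_star * Y) / Z(lam_star).
P_star = [p * math.exp(lam_star * y) / Z(lam_star) for p, y in zip(P, Y)]
price_check = sum(q * s for q, s in zip(P_star, S)) / (1 + r)
print(lam_star, P_star, price_check)
```

The value `price_check` reproduces the initial price `pi`, i.e., the discounted price is a martingale under the computed measure.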
By combining Theorem 3.24 with Remark 3.17, we obtain the following corollary. It clarifies the question of uniqueness in the representation of points in the relative interior of $\Gamma(P\circ Y^{-1})$ as barycenters of the exponential family.
Corollary 3.26. If the non-redundance condition (3.10) holds, then
$$\lambda \mapsto m(\lambda)$$
is a bijective mapping from $\mathbb{R}^d$ to $\mathrm{ri}\,\Gamma(P\circ Y^{-1})$.
Remark 3.27. It follows from Corollary 3.19 and Theorem 3.24 that for all $m\in\mathrm{ri}\,\Gamma(P\circ Y^{-1})$
$$\min_{E_Q[\,Y\,] = m} H(Q|P) = \max_{\lambda\in\mathbb{R}^d}\,[\,\lambda\cdot m - \log Z(\lambda)\,]. \tag{3.14}$$
Here, the right-hand side is the Fenchel–Legendre transform of the convex function $\log Z$ evaluated at $m\in\mathbb{R}^d$. ♦
The following theorem shows that the variational principle (3.14) remains true for all $m\in\mathbb{R}^d$, if we replace "min" and "max" by "inf" and "sup".
Theorem 3.28. For $m\in\mathbb{R}^d$
$$\inf_{E_Q[\,Y\,] = m} H(Q|P) = \sup_{\lambda\in\mathbb{R}^d}\,[\,\lambda\cdot m - \log Z(\lambda)\,].$$
The proof of this theorem relies on the following two general lemmas.
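Lemma 3.29. For any probability measure $Q$,
$$H(Q|P) = \sup_{Z\in L^\infty(\Omega,\mathcal{F},P)}\big(E_Q[\,Z\,] - \log E[\,e^Z\,]\big) = \sup\big\{E_Q[\,Z\,] - \log E[\,e^Z\,] \;\big|\; e^Z\in L^1(P)\big\}. \tag{3.15}$$
The second supremum is attained by $Z := \log\frac{dQ}{dP}$ if $Q \ll P$.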
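The variational formula (3.15) can be probed numerically on a finite space. The sketch below is an illustration only: the measures `p` and `q`, the random search range, and the seed are hypothetical. It evaluates $E_Q[Z] - \log E[e^Z]$ at the claimed maximizer $Z = \log(dQ/dP)$ and at many randomly drawn $Z$.

```python
import math
import random

def rel_entropy(q, p):
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def functional(z, q, p):
    """E_Q[Z] - log E_P[exp(Z)] on a finite space."""
    e_q = sum(qi * zi for qi, zi in zip(q, z))
    log_mgf = math.log(sum(pi * math.exp(zi) for pi, zi in zip(p, z)))
    return e_q - log_mgf

p = [0.1, 0.2, 0.3, 0.4]   # hypothetical reference measure P
q = [0.4, 0.3, 0.2, 0.1]   # hypothetical measure Q << P

h = rel_entropy(q, p)
z_opt = [math.log(qi / pi) for qi, pi in zip(q, p)]   # Z = log dQ/dP
at_optimum = functional(z_opt, q, p)

random.seed(0)
best_random = max(
    functional([random.uniform(-3.0, 3.0) for _ in p], q, p)
    for _ in range(2000)
)
print(h, at_optimum, best_random)
```

No random choice of $Z$ exceeds $H(Q|P)$, while $Z = \log(dQ/dP)$ attains it.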


Proof. We first show $\ge$ in (3.15). To this end, we may assume that $H(Q|P) < \infty$. For $Z$ with $e^Z\in L^1(P)$ let $P^Z$ be defined by
$$\frac{dP^Z}{dP} = \frac{e^Z}{E[\,e^Z\,]}.$$
Then $P^Z$ is equivalent to $P$ and
$$\log\frac{dQ}{dP} = \log\frac{dQ}{dP^Z} + \log\frac{dP^Z}{dP}.$$
Integrating with respect to $Q$ gives
$$H(Q|P) = H(Q|P^Z) + E_Q[\,Z\,] - \log E[\,e^Z\,].$$
Since $H(Q|P^Z) \ge 0$ by (3.11), we have proved that $H(Q|P)$ is larger than or equal to both suprema on the right of (3.15).
To prove the reverse inequality, consider first the case $Q \not\ll P$. Take $Z_n := n I_A$ where $A$ is such that $Q[\,A\,] > 0$ and $P[\,A\,] = 0$. Then, as $n\uparrow\infty$,
$$E_Q[\,Z_n\,] - \log E[\,e^{Z_n}\,] = n\cdot Q[\,A\,] \longrightarrow \infty = H(Q|P).$$
Now suppose that $Q \ll P$ with density $\varphi = dQ/dP$. Then $Z := \log\varphi$ satisfies $e^Z\in L^1(P)$ and
$$H(Q|P) = E_Q[\,Z\,] - \log E[\,e^Z\,].$$
For the first identity we use an approximation argument. Let $Z_n = (-n)\vee(\log\varphi)\wedge n$. We split the expectation $E[\,e^{Z_n}\,]$ according to the two sets $\{\varphi \ge 1\}$ and $\{\varphi < 1\}$. Using monotone convergence for the first integral and dominated convergence for the second yields
$$E[\,e^{Z_n}\,] \longrightarrow E[\,e^{\log\varphi}\,] = 1.$$
Since $x\log x \ge -1/e$, we have $\varphi Z_n \ge -1/e$ uniformly in $n$, and Fatou's lemma yields
$$\liminf_{n\uparrow\infty} E_Q[\,Z_n\,] = \liminf_{n\uparrow\infty} E[\,\varphi Z_n\,] \ge E[\,\varphi\log\varphi\,] = H(Q|P).$$
Putting both facts together shows
$$\liminf_{n\uparrow\infty}\big(E_Q[\,Z_n\,] - \log E[\,e^{Z_n}\,]\big) \ge H(Q|P),$$
and the inequality $\le$ in (3.15) follows.


Remark 3.30. The preceding lemma shows that the relative entropy is monotone with respect to an increase of the underlying $\sigma$-algebra: Let $P$ and $Q$ be two probability measures on a measurable space $(\Omega, \mathcal{F})$, and denote by $H(Q|P)$ their relative entropy. Suppose that $\mathcal{F}_0$ is a $\sigma$-field such that $\mathcal{F}_0\subset\mathcal{F}$ and denote by $H_0(Q|P)$ the relative entropy of $Q$ with respect to $P$ considered as probability measures on the smaller space $(\Omega, \mathcal{F}_0)$. Then the relation $L^\infty(\Omega, \mathcal{F}_0, P)\subset L^\infty(\Omega, \mathcal{F}, P)$ implies
$$H_0(Q|P) \le H(Q|P);$$
in general this inequality is strict. ♦


Lemma 3.31. For all $c \ge 0$, the set
$$\Lambda_c := \{\varphi\in L^1(\Omega, \mathcal{F}, P) \mid \varphi \ge 0,\ E[\,\varphi\,] = 1,\ E[\,\varphi\log\varphi\,] \le c\}$$
is weakly sequentially compact in $L^1(\Omega, \mathcal{F}, P)$.
Proof. Let $L^p := L^p(\Omega, \mathcal{F}, P)$. The set of all $P$-densities,
$$\mathcal{D} := \{\varphi\in L^1 \mid \varphi \ge 0,\ E[\,\varphi\,] = 1\},$$
is clearly convex and closed in $L^1$. Hence, this set is also weakly closed in $L^1$ by Theorem A.60. Moreover, Lemma 3.29 states that for $\varphi\in\mathcal{D}$
$$E[\,\varphi\log\varphi\,] = \sup_{Z\in L^\infty}\big(E[\,Z\varphi\,] - \log E[\,e^Z\,]\big).$$
In particular,
$$\varphi \mapsto E[\,\varphi\log\varphi\,]$$
is a weakly lower semicontinuous functional on $\mathcal{D}$, and so $\Lambda_c$ is weakly closed. In addition, $\Lambda_c$ is bounded in $L^1$ and uniformly integrable, due to the criterion of de la Vallée Poussin; see, e.g., Lemma 3 in §6 of Chapter II of [251]. Applying the Dunford–Pettis theorem and the Eberlein–Šmulian theorem as stated in Appendix A.7 concludes the proof.
Proof of Theorem 3.28. In view of Theorem 3.24 and inequality (3.13) (whose proof extends to all $m\in\mathbb{R}^d$), it remains to prove that
$$\inf_{E_Q[\,Y\,] = m} H(Q|P) \le \sup_{\lambda\in\mathbb{R}^d}\,[\,\lambda\cdot m - \log Z(\lambda)\,] \tag{3.16}$$
for those $m$ which do not belong to $\mathrm{ri}\,\Gamma(\mu)$, where $\mu := P\circ Y^{-1}$. The right-hand side of (3.16) is just the Fenchel–Legendre transform at $m$ of the convex function $\log Z$ and, thus, denoted $(\log Z)^*(m)$.


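First, we consider the case in which $m$ is not contained in the closure $\overline{\Gamma}(\mu)$ of the convex hull of the support of $\mu$. Proposition A.1, the separating hyperplane theorem, yields some $\xi\in\mathbb{R}^d$ such that
$$\xi\cdot m > \sup\{\xi\cdot x \mid x\in\overline{\Gamma}(\mu)\} \ge \sup\{\xi\cdot x \mid x\in\mathrm{supp}\,\mu\}.$$
By taking $\lambda_n := n\xi$, it follows that
$$\lambda_n\cdot m - \log Z(\lambda_n) \ge n\Big(\xi\cdot m - \sup_{y\in\mathrm{supp}\,\mu}\xi\cdot y\Big) \longrightarrow +\infty \quad\text{as } n\uparrow\infty.$$
Hence, the right-hand side of (3.16) is infinite if $m\notin\overline{\Gamma}(\mu)$.
It remains to prove (3.16) for $m\in\overline{\Gamma}(\mu)\setminus\mathrm{ri}\,\Gamma(\mu)$ with $(\log Z)^*(m) < \infty$. Recall from (1.25) that $\mathrm{ri}\,\overline{\Gamma}(\mu) = \mathrm{ri}\,\Gamma(\mu)$. Pick some $m_1\in\mathrm{ri}\,\Gamma(\mu)$ and let
$$m_n := \frac1n\,m_1 + \Big(1 - \frac1n\Big)m.$$
Then $m_n\in\mathrm{ri}\,\Gamma(\mu)$ by (1.24). By the convexity of $(\log Z)^*$, we have
$$\limsup_{n\uparrow\infty}\,(\log Z)^*(m_n) \le \limsup_{n\uparrow\infty}\Big(\frac1n(\log Z)^*(m_1) + \frac{n-1}{n}(\log Z)^*(m)\Big) = (\log Z)^*(m). \tag{3.17}$$
We also know that to each $m_n$ there corresponds a $\lambda_n\in\mathbb{R}^d$ such that
$$m_n = E_{\lambda_n}[\,Y\,] \quad\text{and}\quad H(P_{\lambda_n}|P) = (\log Z)^*(m_n). \tag{3.18}$$
From (3.17) and (3.18) we conclude that
$$\limsup_{n\uparrow\infty} H(P_{\lambda_n}|P) = \limsup_{n\uparrow\infty}\,(\log Z)^*(m_n) \le (\log Z)^*(m) < \infty.$$
In particular, $H(P_{\lambda_n}|P)$ is uniformly bounded in $n$, and Lemma 3.31 implies that, after passing to a suitable subsequence if necessary, the densities $dP_{\lambda_n}/dP$ converge weakly in $L^1(\Omega, \mathcal{F}, P)$ to a density $\varphi$. Let $dP_\infty = \varphi\,dP$. By the weak lower semicontinuity of
$$\frac{dQ}{dP} \mapsto H(Q|P),$$
which follows from Lemma 3.29, we may conclude that $H(P_\infty|P) \le (\log Z)^*(m)$.
The theorem will be proved once we can show that $E_\infty[\,Y\,] = m$. To this end, let $\gamma := \sup_n (\log Z)^*(m_n)$, which is a finite non-negative number by (3.17). Taking
$$Z := \alpha\,I_{\{|Y|\ge c\}}\,|Y|$$
on the right-hand side of (3.15) yields
$$\gamma \ge \alpha\,E_{\lambda_n}[\,|Y|\cdot I_{\{|Y|\ge c\}}\,] - \log E[\,\exp(\alpha|Y| I_{\{|Y|\ge c\}})\,] \quad\text{for all } n\ge 1.$$
Note that the rightmost expectation is finite due to condition (3.9) and Lemma 3.13. By taking $\alpha$ large so that $\gamma/\alpha < \varepsilon/2$ for some given $\varepsilon > 0$, and by choosing $c$ such that
$$\log E[\,\exp(\alpha|Y| I_{\{|Y|\ge c\}})\,] < \frac{\varepsilon}{2},$$
we obtain that
$$\sup_{n\ge 1} E_{\lambda_n}[\,|Y|\cdot I_{\{|Y|\ge c\}}\,] \le \varepsilon.$$
But
$$E_{\lambda_n}[\,|Y|\cdot I_{\{|Y|<c\}}\,] \longrightarrow E_\infty[\,|Y|\cdot I_{\{|Y|<c\}}\,]$$
by the weak convergence of $dP_{\lambda_n}/dP \to dP_\infty/dP$, and so taking $\varepsilon\downarrow 0$ yields
$$m = \lim_{n\uparrow\infty} E_{\lambda_n}[\,Y\,] = E_\infty[\,Y\,],$$
as desired.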

3.3 Optimal contingent claims

In this section we study the problem of maximizing the expected utility
$$E[\,u(X)\,]$$
under a given budget constraint in a broader context. The random variables $X$ will vary in a general convex class $\mathcal{X}\subset L^0(\Omega, \mathcal{F}, P)$ of admissible payoff profiles. In the setting of our financial market model, this will allow us to explain the demand for non-linear payoff profiles provided by financial derivatives.
In order to formulate the budget constraint in this general context, we introduce a linear pricing rule of the form
$$\Phi(X) = E^*[\,X\,] = E[\,\varphi X\,]$$
where $P^*$ is a probability measure on $(\Omega, \mathcal{F})$, which is equivalent to $P$ with density $\varphi$. For a given initial wealth $w\in\mathbb{R}$, the corresponding budget set is defined as
$$\mathcal{B} := \{X\in\mathcal{X}\cap L^1(P^*) \mid E^*[\,X\,] \le w\}. \tag{3.19}$$
Our optimization problem can now be stated as follows:
$$\text{Maximize } E[\,u(X)\,] \text{ among all } X\in\mathcal{B}. \tag{3.20}$$
Note, however, that we will need some extra conditions which guarantee that the expectations $E[\,u(X)\,]$ make sense and are bounded from above.


Remark 3.32. In general, our optimization problem would not be well posed without the assumption $P^*\approx P$. Note first that it should be rephrased in terms of a class $\mathcal{X}$ of measurable functions on $(\Omega, \mathcal{F})$ since we can no longer pass to equivalence classes with respect to $P$. If $P$ is not absolutely continuous with respect to $P^*$ then there exists $A\in\mathcal{F}$ such that $P[\,A\,] > 0$ and $P^*[\,A\,] = 0$. For $X\in L^1(P^*)$ and $c > 0$, the random variable $\tilde X := X + c\,I_A$ would satisfy $E^*[\,\tilde X\,] = E^*[\,X\,]$ and $E[\,u(\tilde X)\,] > E[\,u(X)\,]$. Similarly, if $P^*[\,A\,] > 0$ and $P[\,A\,] = 0$ then
$$\hat X := X + c - \frac{c}{P^*[\,A\,]}\,I_A$$
would have the same price as $X$ but higher expected utility. In particular, the expectations in (3.20) would be unbounded in both cases if $\mathcal{X}$ is the class of all measurable functions on $(\Omega, \mathcal{F})$ and if the function $u$ is not bounded from above. ♦
Remark 3.33. If a solution $X^*$ with $E[\,u(X^*)\,] < \infty$ exists then it is unique, since $\mathcal{B}$ is convex and $u$ is strictly concave. Moreover, if $\mathcal{X} = L^0(\Omega, \mathcal{F}, P)$ or $\mathcal{X} = L^0_+(\Omega, \mathcal{F}, P)$ then $X^*$ satisfies
$$E^*[\,X^*\,] = w$$
since $E^*[\,X^*\,] < w$ would imply that $X := X^* + w - E^*[\,X^*\,]$ is a strictly better choice, due to the strict monotonicity of $u$. ♦
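Let us first consider the unrestricted case $\mathcal{X} = L^0(\Omega, \mathcal{F}, P)$ where any finite random variable on $(\Omega, \mathcal{F}, P)$ is admissible. The following heuristic argument identifies a candidate $X^*$ for the maximization of the expected utility. Suppose that a solution $X^*$ exists. For any $X\in L^\infty(P)$ and any $\lambda\in\mathbb{R}$,
$$X_\lambda := X^* + \lambda\,(X - E^*[\,X\,])$$
satisfies the budget constraint $E^*[\,X_\lambda\,] = w$. A formal computation yields
$$\begin{aligned}
0 &= \frac{d}{d\lambda}\,E[\,u(X_\lambda)\,]\Big|_{\lambda=0} \\
&= E[\,u'(X^*)(X - E^*[\,X\,])\,] \\
&= E[\,u'(X^*)X\,] - E[\,X\,E[\,u'(X^*)\,]\,\varphi\,] \\
&= E[\,X(u'(X^*) - c\,\varphi)\,]
\end{aligned}$$
where $c := E[\,u'(X^*)\,]$. The identity
$$E[\,X\,u'(X^*)\,] = c\,E[\,X\varphi\,]$$
for all bounded measurable $X$ implies $u'(X^*) = c\,\varphi$ $P$-almost surely. Thus, if we denote by
$$I := (u')^{-1}$$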


the inverse function of the strictly decreasing function $u'$, then $X^*$ should be of the form
$$X^* = I(c\,\varphi).$$
We will now formulate a set of assumptions on our utility function $u$ which guarantee that $X^* := I(c\,\varphi)$ is indeed a maximizer of the expected utility, as suggested by the preceding argument.
Theorem 3.34. Suppose $u : \mathbb{R}\to\mathbb{R}$ is a continuously differentiable utility function which is bounded from above, and whose derivative satisfies
$$\lim_{x\downarrow-\infty} u'(x) = +\infty. \tag{3.21}$$
Assume moreover that $c > 0$ is a constant such that
$$X^* := I(c\,\varphi)\in L^1(P^*).$$
Then $X^*$ is the unique maximizer of the expected utility $E[\,u(X)\,]$ among all those $X\in L^1(P^*)$ for which $E^*[\,X\,] \le E^*[\,X^*\,]$. In particular, $X^*$ solves our optimization problem (3.20) for $\mathcal{X} = L^0(\Omega, \mathcal{F}, P)$ if $c$ can be chosen such that $E^*[\,X^*\,] = w$.
Proof. Uniqueness follows from Remark 3.33. Since $u$ is bounded from above, its derivative satisfies
$$\lim_{x\uparrow\infty} u'(x) = 0,$$
in addition to (3.21). Hence, $(0, \infty)$ is contained in the range of $u'$, and it follows that $I(c\,\varphi)$ is $P$-a.s. well-defined for all $c > 0$.
To show the optimality of $X^* = I(c\,\varphi)$, note that the concavity of $u$ implies that for any $X\in L^1(P^*)$
$$u(X) \le u(X^*) + u'(X^*)(X - X^*) = u(X^*) + c\,\varphi\,(X - X^*).$$
Taking expectations with respect to $P$ yields
$$E[\,u(X)\,] \le E[\,u(X^*)\,] + c\,E^*[\,X - X^*\,].$$
Hence, $X^*$ is indeed a maximizer in the class $\{X\in L^1(P^*) \mid E^*[\,X\,] \le E^*[\,X^*\,]\}$.

Example 3.35. Let $u(x) = 1 - e^{-\alpha x}$ be an exponential utility function with constant absolute risk aversion $\alpha > 0$. In this case,
$$I(y) = -\frac1\alpha\log\frac{y}{\alpha}.$$
It follows that
$$E^*[\,I(c\,\varphi)\,] = -\frac1\alpha\log\frac{c}{\alpha} - \frac1\alpha\,E[\,\varphi\log\varphi\,] = -\frac1\alpha\log\frac{c}{\alpha} - \frac1\alpha\,H(P^*|P),$$
where $H(P^*|P)$ denotes the relative entropy of $P^*$ with respect to $P$; see Definition 3.20. Hence, the utility maximization problem can be solved for any $w\in\mathbb{R}$ if and only if the relative entropy $H(P^*|P)$ is finite. In this case, the optimal profile is given by
$$X^* = -\frac1\alpha\log\varphi + w + \frac1\alpha\,H(P^*|P),$$
and the maximal value of expected utility is
$$E[\,u(X^*)\,] = 1 - \exp\big(-\alpha w - H(P^*|P)\big),$$
corresponding to the certainty equivalent
$$w + \frac1\alpha\,H(P^*|P).$$
Let us now return to the financial market model considered in Section 3.1, and let $P^*$ be the entropy-minimizing risk-neutral measure constructed in Corollary 3.25. The density of $P^*$ is of the form
$$\varphi = \frac{e^{-\alpha\xi^*\cdot Y}}{E[\,e^{-\alpha\xi^*\cdot Y}\,]},$$
where $\xi^*\in\mathbb{R}^d$ denotes a maximizer of the expected utility $E[\,u(\xi\cdot Y)\,]$; see Example 3.12. In this case, the optimal profile takes the form
$$X^* = \xi^*\cdot Y + w = \frac{\overline{\xi}^*\cdot\overline{S}}{1 + r},$$
i.e., $X^*$ is the discounted payoff of the portfolio $\overline{\xi}^* = (\xi^0, \xi^*)$, where $\xi^0 = w - \xi^*\cdot\pi$ is determined by the budget constraint $\overline{\xi}^*\cdot\overline{\pi} = w$. Thus, the optimal profile is given by a linear profile in the given primary assets $S^0, \dots, S^d$: No derivatives are needed at this point. ♦
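The closed-form solution of Example 3.35 can be verified on a finite probability space. The sketch below is illustrative only; the three-scenario measures `P` and `P_star`, the risk aversion `alpha`, and the initial wealth `w` are hypothetical. It builds $X^* = -\frac1\alpha\log\varphi + w + \frac1\alpha H(P^*|P)$ and checks the budget constraint and the formula for the maximal expected utility.

```python
import math

alpha, w = 2.0, 1.0                 # hypothetical risk aversion and initial wealth
P = [0.25, 0.25, 0.5]               # hypothetical probabilities under P
P_star = [0.4, 0.3, 0.3]            # hypothetical pricing measure P*
phi = [qs / p for qs, p in zip(P_star, P)]   # density dP*/dP

# Relative entropy H(P*|P) = E*[log phi]
H = sum(qs * math.log(f) for qs, f in zip(P_star, phi))

# Optimal profile from Example 3.35
X_star = [-math.log(f) / alpha + w + H / alpha for f in phi]

def u(x):
    """Exponential (CARA) utility."""
    return 1.0 - math.exp(-alpha * x)

price = sum(qs * x for qs, x in zip(P_star, X_star))   # E*[X*]
value = sum(p * u(x) for p, x in zip(P, X_star))       # E[u(X*)]
closed_form = 1.0 - math.exp(-alpha * w - H)
riskless_value = u(w)               # expected utility of holding w in all scenarios
print(price, value, closed_form, riskless_value)
```

The profile costs exactly `w`, its expected utility matches the closed form, and it is at least as good as the constant profile `w`, which has the same price.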
In most situations it will be natural to restrict the discussion to payoff profiles which are non-negative. For the rest of this section we will make this restriction, and so the utility function $u$ may be defined only on $[0, \infty)$. In several applications we will also use an upper bound on payoff profiles given by an $\mathcal{F}$-measurable random variable $W : \Omega\to[0, \infty]$. We include the case $W\equiv+\infty$ and define the convex class of admissible payoff profiles as
$$\mathcal{X} := \{X\in L^0(P) \mid 0 \le X \le W\ P\text{-a.s.}\}.$$


Thus, our goal is to maximize the expected utility $E[\,u(X)\,]$ among all $X\in\mathcal{B}$ where the budget set $\mathcal{B}$ is defined in terms of $\mathcal{X}$ and $P^*$ as in (3.19), i.e.,
$$\mathcal{B} = \{X\in L^1(P^*) \mid 0 \le X \le W\ P\text{-a.s. and } E^*[\,X\,] \le w\}.$$
We first formulate a general existence result:
Proposition 3.36. Let $u$ be any utility function on $[0, \infty)$, and suppose that $W$ is $P$-a.s. finite and satisfies $E[\,u(W)\,] < \infty$. Then there exists a unique $X^*\in\mathcal{B}$ which maximizes the expected utility $E[\,u(X)\,]$ among all $X\in\mathcal{B}$.
Proof. Take a sequence $(X_n)$ in $\mathcal{B}$ with $E^*[\,X_n\,] \le w$ and such that $E[\,u(X_n)\,]$ converges to the supremum of the expected utility. Since $\sup_n |X_n| \le W < \infty$ $P$-almost surely, we obtain from Lemma 1.70 a sequence
$$\tilde X_n\in\mathrm{conv}\{X_n, X_{n+1}, \dots\}$$
of convex combinations which converge almost surely to some $\tilde X$. Clearly, every $\tilde X_n$ is contained in $\mathcal{B}$. Fatou's lemma implies
$$E^*[\,\tilde X\,] \le \liminf_{n\uparrow\infty} E^*[\,\tilde X_n\,] \le w,$$
and so $\tilde X\in\mathcal{B}$. Each $\tilde X_n$ can be written as $\sum_{i=1}^m \alpha_i^n X_{n_i}$ for indices $n_i \ge n$ and coefficients $\alpha_i^n \ge 0$ summing up to 1. Hence,
$$u(\tilde X_n) \ge \sum_{i=1}^m \alpha_i^n\,u(X_{n_i}),$$
and it follows that
$$E[\,u(\tilde X_n)\,] \ge \inf_{m\ge n} E[\,u(X_m)\,].$$
By dominated convergence,
$$E[\,u(\tilde X)\,] = \lim_{n\uparrow\infty} E[\,u(\tilde X_n)\,],$$
and the right-hand side is equal to the supremum of the expected utility.
Remark 3.37. The argument used to prove the preceding proposition works just as well in the following general setting. Let $U : \mathcal{B}\to\mathbb{R}$ be a concave functional on a set $\mathcal{B}$ of random variables defined on a probability space $(\Omega, \mathcal{F}, P)$ and with values in $\mathbb{R}^n$. Assume that
• $\mathcal{B}$ is convex and closed under $P$-a.s. convergence,
• there exists a random variable $W\in L^0_+(\Omega, \mathcal{F}, P)$ with $|X^i| \le W < \infty$ $P$-a.s. for each $X = (X^1, \dots, X^n)\in\mathcal{B}$,
• $\sup_{X\in\mathcal{B}} U(X) < \infty$,
• $U$ is upper semicontinuous with respect to $P$-a.s. convergence.
Then there exists an $X^*\in\mathcal{B}$ which maximizes $U$ on $\mathcal{B}$, and $X^*$ is unique if $U$ is strictly concave. As a special case, this includes the utility functionals
$$U(X) = \inf_{Q\in\mathcal{Q}} E_Q[\,u(X)\,],$$
appearing in a robust Savage representation of preferences on $n$-dimensional asset profiles, where $u$ is a utility function on $\mathbb{R}^n$ and $\mathcal{Q}$ is a set of probability measures equivalent to $P$; see Section 2.5. ♦
We turn now to a characterization of the optimal profile $X^*$ in terms of the inverse of the derivative $u'$ of $u$ in the case where $u$ is continuously differentiable on $(0, \infty)$. Let
$$a := \lim_{x\uparrow\infty} u'(x) \ge 0 \quad\text{and}\quad b := u'(0+) = \lim_{x\downarrow 0} u'(x) \le +\infty.$$
We define
$$I^+ : (a, b)\to(0, \infty)$$
as the continuous, bijective, and strictly decreasing inverse function of $u'$ on $(a, b)$, and we extend $I^+$ to the full half axis $[0, \infty]$ by setting
$$I^+(y) := \begin{cases} 0 & \text{for } y \ge b,\\ +\infty & \text{for } y \le a.\end{cases} \tag{3.22}$$
With this convention, $I^+ : [0, \infty]\to[0, \infty]$ is continuous.
Remark 3.38. If $u$ is a utility function defined on all of $\mathbb{R}$, the function $I^+$ is the inverse of the restriction of $u'$ to $[0, \infty)$. Thus, $I^+$ is simply the positive part of the function $I = (u')^{-1}$. For instance, in the case of an exponential utility function $u(x) = 1 - e^{-\alpha x}$, we have $a = 0$, $b = \alpha$, and
$$I^+(y) = \frac1\alpha\Big(\log\frac{y}{\alpha}\Big)^- = (I(y))^+, \qquad y \ge 0. \tag{3.23}$$
♦
Theorem 3.39. Assume that $X^*\in\mathcal{B}$ is of the form
$$X^* = I^+(c\,\varphi)\wedge W$$
for some constant $c > 0$ such that $E^*[\,X^*\,] = w$. If $E[\,u(X^*)\,] < \infty$ then $X^*$ is the unique maximizer of the expected utility $E[\,u(X)\,]$ among all $X\in\mathcal{B}$.


Proof. In a first step, we consider the function
$$v(y, \omega) := \sup_{0\le x\le W(\omega)}\big(u(x) - xy\big) \tag{3.24}$$
defined for $y\in\mathbb{R}$ and $\omega\in\Omega$. Clearly, for each $\omega$ with $W(\omega) < \infty$ the supremum above is attained in a unique point $x^*(y)\in[0, W(\omega)]$, which satisfies
$$\begin{aligned}
x^*(y) = 0 \quad&\Longleftrightarrow\quad u'(x) < y \ \text{ for all } x\in(0, W(\omega)),\\
x^*(y) = W(\omega) \quad&\Longleftrightarrow\quad u'(x) > y \ \text{ for all } x\in(0, W(\omega)).
\end{aligned}$$
Moreover, $y = u'(x^*(y))$ if $x^*(y)$ is an interior point of the interval $[0, W(\omega)]$. It follows that
$$x^*(y) = I^+(y)\wedge W(\omega),$$
or
$$X^* = x^*(c\,\varphi) \quad\text{on } \{W < \infty\}. \tag{3.25}$$
If $W(\omega) = +\infty$, then the supremum in (3.24) is not attained if and only if $u'(x) > y$ for all $x\in(0, \infty)$. By our convention (3.22), this holds if and only if $y \le a$ and hence $I^+(y) = +\infty$. But our assumptions on $X^*$ imply that $I^+(c\,\varphi) < \infty$ $P$-a.s. on $\{W = \infty\}$, and hence that
$$X^* = x^*(c\,\varphi) \quad P\text{-a.s. on } \{W = \infty\}. \tag{3.26}$$
Putting (3.24), (3.25), and (3.26) together yields
$$u(X^*) - X^*\,c\,\varphi = v(c\,\varphi, \cdot) \quad P\text{-a.s.}$$
Applied to an arbitrary $X\in\mathcal{B}$, this shows that
$$u(X^*) - c\,\varphi X^* \ge u(X) - c\,\varphi X \quad P\text{-a.s.}$$
Taking expectations gives
$$E[\,u(X^*)\,] \ge E[\,u(X)\,] + c\cdot E^*[\,X^* - X\,] \ge E[\,u(X)\,].$$
Hence, $X^*$ maximizes the expected utility on $\mathcal{B}$. Uniqueness follows from Remark 3.33.
In the following examples, we study the application of the preceding theorem to CARA and HARA utility functions. For simplicity we consider only the case $W\equiv\infty$. The extension to a non-trivial bound $W$ is straightforward.


Example 3.40. For an exponential utility function $u(x) = 1 - e^{-\alpha x}$ we have by (3.23)
$$\varphi\,I^+(y\,\varphi) = \frac1\alpha\,\varphi\Big(\log\frac{y\,\varphi}{\alpha}\Big)^- = \frac1y\,h\Big(\frac{y\,\varphi}{\alpha}\Big),$$
where $h(x) = (x\log x)^-$. Since $h$ is bounded by $e^{-1}$, it follows that $\varphi\,I^+(y\,\varphi)$ belongs to $L^1(P)$ for all $y > 0$. Thus,
$$g(y) := E^*[\,I^+(y\,\varphi)\,] = \frac1y\,E\Big[\,h\Big(\frac{y\,\varphi}{\alpha}\Big)\Big]$$
decreases continuously from $+\infty$ to 0 as $y$ increases from 0 to $\infty$, and there exists a unique $c$ with $g(c) = w$. The corresponding profile
$$X^* := I^+(c\,\varphi)$$
maximizes the expected utility $E[\,u(X)\,]$ among all $X \ge 0$. Let us now return to the special situation of the financial market model of Section 3.1, and take $P^*$ as the entropy-minimizing risk-neutral measure of Corollary 3.25. Then the optimal profile $X^*$ takes the form
$$X^* = (\xi^*\cdot Y - K)^+,$$
where $\xi^*$ is the maximizer of the expected utility $E[\,u(\xi\cdot Y)\,]$, and where $K$ is given by
$$K = \frac1\alpha\log\frac{c}{\alpha} - \frac1\alpha\log E[\,e^{-\alpha\xi^*\cdot Y}\,] = \frac1\alpha\log\frac{c}{\alpha} + \frac1\alpha\,H(P^*|P).$$
Note that $X^*$ is a linear combination of the primary assets only in the case where $\xi^*\cdot Y \ge K$ $P$-almost surely. In general, $X^*$ is a basket call option on the attainable asset $w + (1 + r)\,\xi^*\cdot Y\in\mathcal{V}$ with strike price $w + (1 + r)K$. Thus, a demand for derivatives appears. ♦
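Example 3.41. If $u$ is a HARA utility function of index $\gamma\in[0, 1)$ then $u'(x) = x^{\gamma-1}$, hence
$$I^+(y) = y^{-\frac{1}{1-\gamma}}$$
and
$$I^+(y\,\varphi) = y^{-\frac{1}{1-\gamma}}\cdot\varphi^{-\frac{1}{1-\gamma}}.$$
In the logarithmic case $\gamma = 0$, we assume that the relative entropy $H(P|P^*)$ of $P$ with respect to $P^*$ is finite. Then
$$X^* = \frac{w}{\varphi} = w\,\frac{dP}{dP^*}$$
is the unique maximizer, and the maximal value of expected utility is
$$E[\,\log X^*\,] = \log w + H(P|P^*).$$
If $\gamma\in(0, 1)$ and
$$E[\,\varphi^{-\frac{\gamma}{1-\gamma}}\,] = E^*[\,\varphi^{-\frac{1}{1-\gamma}}\,] < \infty,$$
then the unique optimal profile is given by
$$X^* = w\,\big(E[\,\varphi^{-\frac{\gamma}{1-\gamma}}\,]\big)^{-1}\varphi^{-\frac{1}{1-\gamma}},$$
and the maximal value of expected utility is equal to
$$E[\,u(X^*)\,] = \frac1\gamma\,w^\gamma\big(E[\,\varphi^{-\frac{\gamma}{1-\gamma}}\,]\big)^{1-\gamma}.$$
♦
Exercise 3.3.1. In the context of Example 3.41, compute the maximal value of expected utility when $\varphi$ has a log-normal distribution. ♦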
The following corollary gives a simple condition on $W$ which guarantees the existence of the maximizer $X^*$ in Theorem 3.39.
Corollary 3.42. If $E[\,u(W)\,] < \infty$ and if $0 < w < E^*[\,W\,] < \infty$, then there exists a unique constant $c > 0$ such that
$$X^* = I^+(c\,\varphi)\wedge W$$
satisfies $E^*[\,X^*\,] = w$. In particular, $X^*$ is the unique maximizer of the expected utility $E[\,u(X)\,]$ among all $X\in\mathcal{B}$.
Proof. For any $\beta\in(0, \infty)$,
$$y \mapsto I^+(y)\wedge\beta$$
is a continuous decreasing function with $\lim_{y\uparrow b} I^+(y)\wedge\beta = 0$ and $I^+(y)\wedge\beta = \beta$ for all $y \le u'(\beta)$. Hence, dominated convergence implies that the function
$$g(y) := E^*[\,I^+(y\,\varphi)\wedge W\,]$$
is continuous and decreasing with
$$\lim_{y\uparrow\infty} g(y) = 0 < w < E^*[\,W\,] = \lim_{y\downarrow 0} g(y).$$
Moreover, $g$ is even strictly decreasing on $\{y \mid 0 < g(y) < E^*[\,W\,]\}$. Hence, there exists a unique $c$ with $g(c) = w$, and Theorem 3.39 yields the optimality of the corresponding $X^*$.
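The proof of Corollary 3.42 suggests a simple numerical procedure: since $g$ is continuous and decreasing, the constant $c$ with $g(c) = w$ can be located by bisection. The following sketch is an assumption-laden illustration, not the authors' method: it uses a hypothetical three-scenario model, the HARA utility with index $\gamma = 1/2$ (so that $I^+(y) = y^{-2}$), a constant bound `W`, and an assumed search bracket for $c$.

```python
import math

P = [0.25, 0.25, 0.5]               # hypothetical probabilities under P
P_star = [0.4, 0.3, 0.3]            # hypothetical pricing measure P*
phi = [qs / p for qs, p in zip(P_star, P)]   # density dP*/dP
W = [3.0, 3.0, 3.0]                 # upper bound on payoffs (constant here)
w = 1.5                             # initial wealth with 0 < w < E*[W] = 3

def I_plus(y):
    """Inverse of u'(x) = x**(-1/2), i.e. HARA utility with index 1/2."""
    return y ** -2.0

def g(y):
    """g(y) = E*[ I^+(y * phi) ^ W ], the price of the capped profile."""
    return sum(qs * min(I_plus(y * f), cap)
               for qs, f, cap in zip(P_star, phi, W))

# Bisection on a logarithmic scale; the bracket [1e-6, 1e6] is an assumption.
lo, hi = 1e-6, 1e6
for _ in range(200):
    mid = math.sqrt(lo * hi)
    if g(mid) > w:
        lo = mid        # profile too expensive: increase y
    else:
        hi = mid
c = math.sqrt(lo * hi)

X_star = [min(I_plus(c * f), cap) for f, cap in zip(phi, W)]
print(c, X_star, g(c))
```

The resulting profile exhausts the budget, $E^*[X^*] = w$, and respects the bound $0 \le X^* \le W$ in every scenario.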
Let us now extend the discussion to the case where preferences themselves are
uncertain. This additional uncertainty can be modelled by incorporating the choice of
a utility function into the description of possible scenarios; for an axiomatic discussion
see [172]. More precisely, we assume that preferences are described by a measurable


function $u$ on $[0, \infty)\times\Omega$ such that $u(\cdot, \omega)$ is a utility function on $[0, \infty)$ which is continuously differentiable on $(0, \infty)$. For each $\omega\in\Omega$, the inverse of $u'(\cdot, \omega)$ is extended as above to a function
$$I^+(\cdot, \omega) : [0, \infty]\to[0, \infty].$$
Using exactly the same arguments as above, we obtain the following extension of Corollary 3.42 to the case of random preferences:
Corollary 3.43. If $E[\,u(W, \cdot)\,] < \infty$ and if $0 < w < E^*[\,W\,] < \infty$, then there exists a unique constant $c > 0$ such that
$$X^*(\omega) := I^+(c\,\varphi(\omega), \omega)\wedge W(\omega)$$
is the unique maximizer of the expected utility
$$E[\,u(X, \cdot)\,] = \int u(X(\omega), \omega)\,P(d\omega)$$
among all $X\in\mathcal{B}$.

3.4 Optimal payoff profiles for uniform preferences

So far, we have discussed the structure of asset profiles which are optimal with respect to a fixed utility function $u$. Let us now introduce an optimization problem with respect to the uniform order $\succcurlyeq_{\mathrm{uni}}$ as discussed in Section 2.4. The partial order $\succcurlyeq_{\mathrm{uni}}$ can be viewed as a reflexive and transitive relation on the space of financial positions
$$\mathcal{X} := L^1_+(\Omega, \mathcal{F}, P)$$
by letting
$$\begin{aligned}
X \succcurlyeq_{\mathrm{uni}} Y \quad&:\Longleftrightarrow\quad \mu_X \succcurlyeq_{\mathrm{uni}} \mu_Y\\
&\;\Longleftrightarrow\quad E[\,u(X)\,] \ge E[\,u(Y)\,] \ \text{ for all utility functions } u,
\end{aligned} \tag{3.27}$$
where $\mu_X$ and $\mu_Y$ denote the distributions of $X$ and $Y$ under $P$. Note that $X \succcurlyeq_{\mathrm{uni}} Y \succcurlyeq_{\mathrm{uni}} X$ if and only if $X$ and $Y$ have the same distribution; see Remark 2.58. Thus, the relation $\succcurlyeq_{\mathrm{uni}}$ is antisymmetric on the level of distributions but not on the level of financial positions.
Let us now fix a position $X_0\in\mathcal{X}$ such that $E^*[\,X_0\,] < \infty$, and let us try to minimize the cost among all positions $X\in\mathcal{X}$ which are uniformly at least as attractive as $X_0$:
$$\text{Minimize } E^*[\,X\,] \text{ among all } X \succcurlyeq_{\mathrm{uni}} X_0.$$


In order to describe the minimal cost and the minimizing profile, let us denote by $F_\varphi$ and $F_{X_0}$ the distribution functions and by $q_\varphi$ and $q_{X_0}$ quantile functions of $\varphi$ and $X_0$; see Appendix A.3.
Theorem 3.44. For any $X\in\mathcal{X}$ such that $X \succcurlyeq_{\mathrm{uni}} X_0$,
$$E^*[\,X\,] \ge \int_0^1 q_\varphi(1 - s)\,q_{X_0}(s)\,ds. \tag{3.28}$$
The lower bound is attained by $X^* = f(\varphi)$, where $f$ is the decreasing function on $[0, \infty)$ defined by
$$f(x) := q_{X_0}(1 - F_\varphi(x))$$
if $x$ is a continuity point of $F_\varphi$, and by
$$f(x) := \frac{1}{F_\varphi(x) - F_\varphi(x-)}\int_{F_\varphi(x-)}^{F_\varphi(x)} q_{X_0}(1 - t)\,dt$$
otherwise.
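On a finite space with equally likely scenarios and distinct values of $\varphi$, the bound (3.28) reduces to pairing the largest values of $\varphi$ with the smallest values of $X_0$. The sketch below is an illustration under exactly these simplifying assumptions, with hypothetical data `phi` and `X0`. It compares the bound with the cost of all rearrangements of $X_0$; each rearrangement has the same distribution as $X_0$ and is therefore admissible.

```python
import itertools

# Hypothetical model: four equally likely scenarios, density phi with mean 1.
phi = [1.6, 1.2, 0.8, 0.4]
X0 = [1.0, 4.0, 2.0, 3.0]
n = len(phi)

def cost(X):
    """Price E*[X] = E[phi * X] under the uniform measure P."""
    return sum(f * x for f, x in zip(phi, X)) / n

# Lower bound (3.28): integral of q_phi(1 - s) * q_X0(s), which for n
# equally likely atoms pairs decreasing phi with increasing X0.
bound = sum(f * x for f, x in zip(sorted(phi, reverse=True), sorted(X0))) / n

# X* = f(phi): the k-th largest density receives the k-th smallest payoff.
order = sorted(range(n), key=lambda i: -phi[i])
X_star = [0.0] * n
for rank, i in enumerate(order):
    X_star[i] = sorted(X0)[rank]

best = min(cost(perm) for perm in itertools.permutations(X0))
print(bound, cost(X_star), best, cost(X0))
```

The cheapest rearrangement is the one that is decreasing in $\varphi$, and its cost equals the lower bound, while the original position $X_0$ is more expensive.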
The proof will use the following lemma, which yields another characterization of the relation $\succcurlyeq_{\mathrm{uni}}$.
Lemma 3.45. For two probability measures $\mu$ and $\nu$ on $\mathbb{R}$, the following conditions are equivalent:
(a) $\mu \succcurlyeq_{\mathrm{uni}} \nu$.
(b) For all decreasing functions $h : (0, 1)\to[0, \infty)$,
$$\int_0^1 h(t)\,q_\mu(t)\,dt \ge \int_0^1 h(t)\,q_\nu(t)\,dt, \tag{3.29}$$
where $q_\mu$ and $q_\nu$ are quantile functions of $\mu$ and $\nu$.
(c) The relation (3.29) holds for all bounded decreasing functions $h : (0, 1)\to[0, \infty)$.
Proof. The relation $\mu \succcurlyeq_{\mathrm{uni}} \nu$ is equivalent to
$$\int_0^y q_\mu(t)\,dt \ge \int_0^y q_\nu(t)\,dt \quad\text{for all } y\in[0, 1];$$
see Theorem 2.57. The implication (c) $\Rightarrow$ (a) thus follows by taking $h = I_{(0, t]}$. For the proof of (a) $\Rightarrow$ (b), we may assume without loss of generality that $h$ is left-continuous. Then there exists a positive Radon measure $\eta$ on $(0, 1]$ such that $h(t) = \eta([t, 1])$. Fubini's theorem yields
$$\int_0^1 h(t)\,q_\mu(t)\,dt = \int_0^1\!\!\int_0^y q_\mu(t)\,dt\,\eta(dy) \ge \int_0^1\!\!\int_0^y q_\nu(t)\,dt\,\eta(dy) = \int_0^1 h(t)\,q_\nu(t)\,dt.$$

Proof of Theorem 3.44. Using the first Hardy–Littlewood inequality in Theorem A.24, we see that
$$E^*[\,X\,] = E[\,X\varphi\,] \ge \int_0^1 q_\varphi(1 - t)\,q_X(t)\,dt,$$
where $q_X$ is a quantile function for $X$. Taking $h(t) := q_\varphi(1 - t)$ and using Lemma 3.45 thus yields (3.28).
Let us now turn to the identification of the optimal profile. Note that the function $f$ defined in the assertion satisfies
$$f(q_\varphi) = E_\lambda[\,g \mid q_\varphi\,] \tag{3.30}$$
where $g$ is defined by $g(t) = q_{X_0}(1 - t)$, and where $E_\lambda[\,\cdot \mid q_\varphi\,]$ denotes the conditional expectation with respect to $q_\varphi$ under the Lebesgue measure $\lambda$ on $(0, 1)$. Let us show that $X^* = f(\varphi)$ satisfies $X^* \succcurlyeq_{\mathrm{uni}} X_0$. Indeed, for any utility function $u$
$$\begin{aligned}
E[\,u(X^*)\,] = E[\,u(f(\varphi))\,] &= \int_0^1 u(f(q_\varphi))\,dt\\
&\ge \int_0^1 u(q_{X_0}(1 - t))\,dt = \int_0^1 u(q_{X_0}(t))\,dt\\
&= E[\,u(X_0)\,],
\end{aligned}$$
where we have applied Lemma A.19 and Jensen's inequality for conditional expectations. Moreover, $X^*$ attains the lower bound in (3.28):
$$\begin{aligned}
E^*[\,X^*\,] = E[\,f(\varphi)\,\varphi\,] &= \int_0^1 f(q_\varphi(t))\,q_\varphi(t)\,dt\\
&= \int_0^1 q_{X_0}(1 - t)\,q_\varphi(t)\,dt = \int_0^1 q_{X_0}(t)\,q_\varphi(1 - t)\,dt,
\end{aligned}$$
due to (3.30).


Remark 3.46. The solution $X^*$ has the same expectation under $P$ as $X_0$. Indeed, (3.30) shows that
$$E[\,X^*\,] = E[\,f(\varphi)\,] = \int_0^1 f(q_\varphi(t))\,dt = \int_0^1 q_{X_0}(1 - t)\,dt = E[\,X_0\,].$$
♦
Exercise 3.4.1. Prove (3.30).

Remark 3.47. The lower bound in (3.28) may be viewed as a "reservation price" for $X_0$ in the following sense. Let $X_0$ be a financial position, and let $\mathcal{X}$ be any class of financial positions such that $X\in\mathcal{X}$ is available at price $\pi(X)$. For a given relation $\succcurlyeq$ on $\mathcal{X}\cup\{X_0\}$,
$$\pi_R(X_0) := \inf\{\pi(X) \mid X\in\mathcal{X},\ X \succcurlyeq X_0\}$$
is called the reservation price of $X_0$ with respect to $\mathcal{X}$, $\pi$, and $\succcurlyeq$.
If $\mathcal{X}$ is the space of constants with $\pi(c) = c$, and if the relation $\succcurlyeq$ is of von Neumann–Morgenstern type with some utility function $u$, then $\pi_R(X_0)$ reduces to the certainty equivalent of $X_0$ with respect to $u$; see (2.11).
In the context of the optimization problem (3.20), where
$$X \succcurlyeq X_0 \quad:\Longleftrightarrow\quad E[\,u(X)\,] \ge E[\,u(X_0)\,],$$
the reservation price is given by $E^*[\,X^*\,]$, where $X^*$ is the utility maximizer in the budget set defined by $w := E^*[\,X_0\,]$.
In the context of the financial market model of Chapter 1, we can take $\mathcal{X}$ as the space $\mathcal{V}$ of attainable claims with
$$V \succcurlyeq X_0 \quad:\Longleftrightarrow\quad V \ge X_0 \ \ P\text{-a.s.}$$
and $\pi(V) = \overline{\xi}\cdot\overline{\pi}$ for $V = \overline{\xi}\cdot\overline{S}$. In this case, the reservation price $\pi_R(X_0)$ coincides with the upper bound $\pi_{\sup}(X_0)$ of the arbitrage-free prices for $X_0$; see Theorem 1.32. ♦

3.5 Robust utility maximization

In this section we consider the optimal investment problem for an economic agent
who, as in Section 2.5, is averse against both risk and Knightian uncertainty. Under the
assumptions of Theorem 2.78 (b), the preferences of such an agent can be described
by the following robust utility functional
inf EQ u.X / ;

Q2Q

(3.31)

where Q is a set of probability measures on .; F / and u is a utility function. We


assume that such a robust utility functional is given and that all measures in Q are

152

Chapter 3 Optimality and equilibrium

absolutely continuous with respect to a fixed pricing measure $P^*$ on $(\Omega, \mathcal{F})$. We will
also assume that all payoff profiles $X$ are nonnegative and, for simplicity, that the
utility function $u$ is defined and finite on $[0, \infty)$. As in Section 3.3, the budget set for
a given initial capital $w > 0$ is defined as
\[
\mathcal{B} := \{\, X \in L^1_+(P^*) \mid E^*[X] \le w \,\}.
\]
The problem of robust utility maximization can thus be stated as follows:
\[
\text{maximize } \inf_{Q \in \mathcal{Q}} E_Q[u(X)] \text{ over all } X \in \mathcal{B}. \tag{3.32}
\]

In the sequel, we will assume that $\mathcal{Q}$ is a convex set, which can be done without loss
of generality. Moreover, we will assume throughout this section that $\mathcal{Q}$ is equivalent
to $P^*$ in the following sense:
\[
P^*[A] = 0 \iff Q[A] = 0 \text{ for all } Q \in \mathcal{Q}. \tag{3.33}
\]
Clearly, our problem (3.32) would not be well-posed without the implication "$\Rightarrow$".
Note that (3.33) implies that every measure $Q \in \mathcal{Q}$ is absolutely continuous with
respect to $P^*$ but not necessarily equivalent to $P^*$.
When $\mathcal{Q}$ consists of the single element $Q \approx P^*$, we have seen in Section 3.3 that
the solution involves the Radon–Nikodym derivative $dP^*/dQ$. In our present situation,
however, not every $Q \in \mathcal{Q}$ needs to be equivalent to $P^*$. The Radon–Nikodym
derivative $dP^*/dQ$ will therefore be understood in the sense of the Lebesgue
decomposition; see Theorem A.13. With the convention $x/0 := +\infty$ for $x > 0$ it can be
expressed as
\[
\frac{dP^*}{dQ} = \Big( \frac{dQ}{dP^*} \Big)^{-1} \quad P^*\text{-a.s.}
\]
It will turn out in Theorem 3.56 below that the following concept provides a key to
solving problem (3.32).
Definition 3.48. A measure $Q_0 \in \mathcal{Q}$ is called a least favorable measure with respect
to $P^*$ if the density $\pi = dP^*/dQ_0$ satisfies
\[
Q_0[\pi \le t] = \inf_{Q \in \mathcal{Q}} Q[\pi \le t] \quad \text{for all } t > 0.
\]
Not every set $\mathcal{Q}$ admits a least favorable measure with respect to a
given measure $P^*$. But here are two examples in which least favorable measures can
be determined explicitly.
Example 3.49. Let $Y$ be a measurable function on $(\Omega, \mathcal{F})$, and denote by $\mu$ its law
under $P^*$. For $\nu \approx \mu$ given, let
\[
\mathcal{Q} := \{\, Q \ll P^* \mid Q \circ Y^{-1} = \nu \,\}.
\]
The interpretation behind the set $\mathcal{Q}$ is that an investor has full knowledge about the
pricing measure $P^*$ but is uncertain about the true distribution $P$ of market prices and
has only the weak information that a certain functional $Y$ of the stock price has the
distribution $\nu$ under $P$. Define $Q_0$ as follows via its Radon–Nikodym derivative:
\[
\frac{dQ_0}{dP^*} = \frac{d\nu}{d\mu}(Y).
\]
Then $Q_0 \in \mathcal{Q}$, and the law of $\pi := dP^*/dQ_0 = \frac{d\mu}{d\nu}(Y)$ is the same for all
$Q \in \mathcal{Q}$. Hence, $Q_0$ is a least favorable measure.
◊
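Example 3.50. For $\lambda \in (0, 1)$ and a given probability measure $P \approx P^*$ let
\[
\mathcal{Q}_\lambda := \Big\{\, Q \ll P \ \Big|\ \frac{dQ}{dP} \le \frac{1}{\lambda} \,\Big\}.
\]
In Chapter 4, this set will play an important role as the maximal representing set for the
risk measure $\mathrm{AV@R}_\lambda$. When $\varphi := dP^*/dP$ satisfies $P[\varphi > \lambda^{-1}] > 0$ and admits
a continuous and strictly increasing distribution function $F_\varphi(x) := P[\varphi \le x]$, then
$\mathcal{Q}_\lambda$ admits a least favorable measure with respect to $P^*$, which is given by
\[
\frac{dQ_0}{dP} = \frac{1}{\lambda} \cdot \frac{\varphi}{\varphi \vee q_\varphi(t_\lambda)},
\]
where $q_\varphi = F_\varphi^{-1}$ is the (unique) quantile function of $\varphi$ and $t_\lambda$ is the unique solution
of the equation
\[
q_\varphi(t_\lambda)\,(t_\lambda - 1 + \lambda) = \int_0^{t_\lambda} q_\varphi(t)\,dt.
\]
This will be proved in Corollary 8.28. In this case, one finds that
\[
\pi = \frac{dP^*}{dQ_0} = \lambda\,(\varphi \vee q_\varphi(t_\lambda)).
\]
Further examples for least favorable measures can be found in Huber [153].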
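The equation for $t_\lambda$ in Example 3.50 can be solved numerically once a quantile function is fixed. The sketch below is an illustration only: it assumes a bounded quantile function, uses a midpoint rule and bisection on $[1-\lambda, 1]$ (both choices of ours, not taken from the text), and tests the result on the hypothetical case $\varphi \sim \mathrm{Uniform}(0,2)$, where $q_\varphi(t) = 2t$ and $E[\varphi] = 1$.

```python
def solve_t_lambda(q, lam, n=2000, tol=1e-12):
    """Root of G(t) = q(t)*(t - 1 + lam) - int_0^t q(s) ds on [1 - lam, 1]."""
    def integral(t):
        h = t / n
        return h * sum(q((k + 0.5) * h) for k in range(n))   # midpoint rule

    def G(t):
        return q(t) * (t - 1.0 + lam) - integral(t)

    lo, hi = 1.0 - lam, 1.0        # G(lo) < 0, and G > 0 near 1 if P[phi > 1/lam] > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def total_mass(q, lam, t_lam, n=2000):
    """E_P[dQ0/dP] = (1/lam) * ( int_0^t q(s)/q(t) ds + (1 - t) ) with t = t_lam."""
    h = t_lam / n
    below = h * sum(q((k + 0.5) * h) for k in range(n)) / q(t_lam)
    return (below + 1.0 - t_lam) / lam

q_uniform = lambda t: 2.0 * t      # hypothetical example: phi ~ Uniform(0, 2)
lam = 0.75
t_lam = solve_t_lambda(q_uniform, lam)
mass = total_mass(q_uniform, lam, t_lam)
```

In this uniform case the equation reduces to $t(t - 1/2) = 0$, so $t_\lambda = 1/2$ and $q_\varphi(t_\lambda) = 1$. The second function checks that the density $dQ_0/dP$ integrates to one, which is the normalization behind the defining equation of $t_\lambda$.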


Remark 3.51. The existence of a least favorable measure of a set $\mathcal{Q}$ with respect to
$P^*$ is closely related to the concept of submodularity of the set function
\[
c(A) := \sup_{Q \in \mathcal{Q}} Q[A], \quad A \in \mathcal{F},
\]
which will be discussed in Sections 4.6 and 4.7. The Huber–Strassen theorem states
that a set $\mathcal{Q}$ that is weakly compact with respect to a Polish topology on $\Omega$ admits a
least favorable measure with respect to any other probability measure $P^* \approx \mathcal{Q}$ if and
only if the set function $c$ is submodular:
\[
c(A \cup B) + c(A \cap B) \le c(A) + c(B), \quad A, B \in \mathcal{F}.
\]
We refer to Huber and Strassen [154] and Lembcke [195]. Example 3.50 is a special
case of this situation since it will follow from Proposition 4.75 that the associated set
function is submodular.
◊
Let us now show that we always have $Q_0 \approx P^*$ if $\mathcal{Q}$ satisfies a mild closure
condition.
Lemma 3.52. Suppose that $\{\, dQ/dP^* \mid Q \in \mathcal{Q} \,\}$ is closed in $L^1(P^*)$. Then every
least favorable measure $Q_0$ is equivalent to $P^*$.
Proof. Due to our assumption (3.33) and the closedness of $\{\, dQ/dP^* \mid Q \in \mathcal{Q} \,\}$, we
may apply the Halmos–Savage theorem in the form of Theorem 1.61 and obtain a
measure $Q_1 \approx P^*$ in $\mathcal{Q}$. We get
\[
1 = Q_0[\pi < \infty] = \lim_{t \uparrow \infty} Q_0[\pi \le t] = \lim_{t \uparrow \infty} \inf_{Q \in \mathcal{Q}} Q[\pi \le t] \le Q_1[\pi < \infty].
\]
Hence, also $P^*[\pi < \infty] = 1$ and in turn $P^* \ll Q_0$.
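We have the following characterization of least favorable measures.
Proposition 3.53. For a measure $Q_0 \in \mathcal{Q}$ with $Q_0 \approx P^*$ and $\pi := dP^*/dQ_0$, the
following conditions are equivalent:
(a) $Q_0$ is a least favorable measure for $P^*$.
(b) If $f : (0, \infty] \to \mathbb{R}$ is decreasing and $\inf_{Q \in \mathcal{Q}} E_Q[f(\pi) \wedge 0] > -\infty$, then
\[
\inf_{Q \in \mathcal{Q}} E_Q[f(\pi)] = E_{Q_0}[f(\pi)].
\]
(c) If $g : (0, \infty] \to \mathbb{R}$ is increasing and $\sup_{Q \in \mathcal{Q}} E_Q[g(\pi) \vee 0] < \infty$, then
\[
\sup_{Q \in \mathcal{Q}} E_Q[g(\pi)] = E_{Q_0}[g(\pi)].
\]
(d) $Q_0$ minimizes the $g$-divergence
\[
I_g(Q|P^*) := \int g\Big( \frac{dQ}{dP^*} \Big)\,dP^*
\]
among all $Q \in \mathcal{Q}$ whenever $g : [0, \infty) \to \mathbb{R}$ is a continuous convex function
such that $I_g(Q|P^*)$ is finite for some $Q \in \mathcal{Q}$.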
Proof. (a) $\Leftrightarrow$ (b): According to the definition, $Q_0$ is a least favorable measure if and
only if $Q_0[\pi \le t] \le Q[\pi \le t]$ for all $t \ge 0$ and each $Q \in \mathcal{Q}$. By Theorem 2.68, this
is equivalent to the fact that, for all $Q \in \mathcal{Q}$, $Q_0 \circ \pi^{-1}$ stochastically dominates $Q \circ \pi^{-1}$
in the sense of Definition 2.67. Hence, if $f$ is bounded, then the equivalence of (a) and
(b) follows from Theorem 2.68. If $f$ is unbounded and $\inf_{Q \in \mathcal{Q}} E_Q[f(\pi) \wedge 0] > -\infty$,
then assertion (b) holds for $f_N := (-N) \vee f \wedge 0$ where $N \in \mathbb{N}$. Thus, for all $Q \in \mathcal{Q}$,
\[
E_Q[f_N(\pi)] \ge E_{Q_0}[f_N(\pi)] \ge E_{Q_0}[f(\pi) \wedge 0] > -\infty.
\]
By sending $N$ to infinity, it follows that $E_Q[f(\pi) \wedge 0] \ge E_{Q_0}[f(\pi) \wedge 0]$ for every
$Q \in \mathcal{Q}$. After using a similar argument for $0 \vee f(\pi)$, we get
\[
E_Q[f(\pi)] = E_Q[f(\pi) \vee 0] + E_Q[f(\pi) \wedge 0] \ge E_{Q_0}[f(\pi)] \quad \text{for all } Q \in \mathcal{Q}.
\]
(b) $\Leftrightarrow$ (c) follows by taking $f = -g$.


(b) $\Rightarrow$ (d): Clearly, $I_g(Q|P^*)$ is well-defined and larger than $g(1)$ for each $Q \ll P^*$
due to Jensen's inequality. Now take $Q_1 \in \mathcal{Q}$ with $I_g(Q_1|P^*) < \infty$, and denote
by $g'_+(x)$ the right-hand derivative of $g$ at $x \ge 0$. Suppose first that $g'_+$ is bounded.
Since $g(y) - g(x) \ge g'_+(x)(y - x)$, we have
\[
I_g(Q_1|P^*) - I_g(Q_0|P^*) \ge \int g'_+(\pi^{-1}) \Big( \frac{dQ_1}{dP^*} - \frac{dQ_0}{dP^*} \Big)\,dP^*
= \int f(\pi)\,dQ_1 - \int f(\pi)\,dQ_0,
\]
where $f(x) := g'_+(1/x)$ is a bounded decreasing function. Therefore $\int f(\pi)\,dQ_1 \ge
\int f(\pi)\,dQ_0$, and $Q_0$ minimizes $I_g(\,\cdot\,|P^*)$ on $\mathcal{Q}$.
To prove the assertion for convex functions $g$ with unbounded $g'_+$, let $\mu_Q$ denote
the law of $dQ/dP^*$ under $P^*$. Then the preceding step of the proof implies in particular
that $\int (x - c)^+\,\mu_Q(dx) \ge \int (x - c)^+\,\mu_{Q_0}(dx)$ for all $c \in \mathbb{R}$ and each $Q \in \mathcal{Q}$.
Since in addition $\int x\,\mu_Q(dx) = 1$ for all $Q \in \mathcal{Q}$, the result now follows from the
equivalence (b) $\Leftrightarrow$ (c) of Corollary 2.61.
(d) $\Rightarrow$ (b): It is enough to prove (b) for continuous bounded decreasing functions
$f$. For such a function $f$ let $g(x) := \int_1^x f(1/t)\,dt$. Then $g$ is convex. For $Q_1 \in \mathcal{Q}$
we let $Q_t := tQ_1 + (1 - t)Q_0$ and $h(t) := I_g(Q_t|P^*)$. The right-hand derivative of
$h$ satisfies $0 \le h'_+(0) = \int f(\pi)\,dQ_1 - \int f(\pi)\,dQ_0$, and the proof is complete.
Remark 3.54. Let us discuss the connection between least favorable measures and
the statistical test theory for composite hypotheses, extending the standard Neyman–
Pearson theory as outlined in Appendix A.4. In a composite hypothesis testing problem,
one tests a hypothesis $\mathcal{P}$ against a null hypothesis $\mathcal{Q}$ and allows both $\mathcal{P}$ and $\mathcal{Q}$
to be sets of probability measures. Our situation here corresponds to the special case
in which $\mathcal{P}$ consists of the single element $P^*$ and only the null hypothesis $\mathcal{Q}$ is
composite. As in Remark A.32 one looks for a randomized statistical test $\psi_0 : \Omega \to [0, 1]$
that maximizes the power $E^*[\psi]$ among all randomized tests $\psi$ with a given significance
level. For a composite null hypothesis $\mathcal{Q}$, the significance of a randomized test $\psi$
can be defined as $\sup_{Q \in \mathcal{Q}} E_Q[\psi]$. Thus, the problem can be stated as follows:
\[
\text{maximize } E^*[\psi] \text{ over all } \psi \in \mathcal{R} \text{ with } \sup_{Q \in \mathcal{Q}} E_Q[\psi] \le \alpha, \tag{3.34}
\]
where $\alpha \in (0, 1)$ is a given significance level and $\mathcal{R}$ denotes the class of all randomized
tests (see Appendix A.4). This problem can be solved in a straightforward
manner as soon as $\mathcal{Q}$ admits a least favorable measure $Q_0$ with respect to $P^*$. Indeed,
choose $\kappa \in [0, 1]$ and $c > 0$ such that
\[
\psi_0 := \kappa\, I_{\{\pi = c\}} + I_{\{\pi > c\}}
\]
satisfies $E_{Q_0}[\psi_0] = \alpha$. Since $\psi_0$ is an increasing function of $\pi$, Proposition 3.53
yields that
\[
\sup_{Q \in \mathcal{Q}} E_Q[\psi_0] = E_{Q_0}[\psi_0] = \alpha.
\]
Moreover, when $\psi$ is any other randomized test with $\sup_{Q \in \mathcal{Q}} E_Q[\psi] \le \alpha$, then in
particular $E_{Q_0}[\psi] \le \alpha$, and so Theorem A.31 implies $E^*[\psi] \le E^*[\psi_0]$, because
$\psi_0$ is an optimal randomized test for the standard problem of testing the hypothesis
$P^*$ against the null hypothesis $Q_0$. Therefore $\psi_0$ is a solution of (3.34). Note also
that the likelihood of a type 1 error for the test $\psi_0$ is maximized by $Q_0$. The fact that
this is the case for every significance level $\alpha$ is remarkable, and it explains why $Q_0$ is
called a least favorable measure.
◊
The next proposition prepares for the solution of the robust utility maximization
problem (3.32).
Proposition 3.55. Let $Q_0 \approx P^*$ be a least favorable measure and $\pi = dP^*/dQ_0$.
(a) For any $X \in \mathcal{B}$ there exists $\tilde X \in \mathcal{B}$ such that
\[
\inf_{Q \in \mathcal{Q}} E_Q[u(\tilde X)] \ge \inf_{Q \in \mathcal{Q}} E_Q[u(X)]
\]
and such that $\tilde X = f(\pi)$ for a decreasing function $f \ge 0$.
(b) Every solution $X^*$ of (3.32) is of the form $X^* = f^*(\pi)$ for a deterministic
decreasing function $f^* : (0, \infty) \to [0, \infty)$.
Proof. (a) We need to construct a decreasing function $f \ge 0$ such that $E^*[f(\pi)] \le w$
and
\[
\inf_{Q \in \mathcal{Q}} E_Q[u(f(\pi))] \ge \inf_{Q \in \mathcal{Q}} E_Q[u(X)]. \tag{3.35}
\]
To this end, we denote by $F_Y(x) := Q_0[Y \le x]$ the distribution function and by
$q_Y(t)$ a quantile function of a random variable $Y$ with respect to the probability
measure $Q_0$.


As in Theorem 3.44 we define a function $f$ by
\[
f(t) := \begin{cases}
q_X(1 - F_\pi(t)) & \text{if } F_\pi \text{ is continuous at } t, \\[4pt]
\dfrac{1}{F_\pi(t) - F_\pi(t-)} \displaystyle\int_{F_\pi(t-)}^{F_\pi(t)} q_X(1 - s)\,ds & \text{otherwise.}
\end{cases} \tag{3.36}
\]
Then $f$ is decreasing and satisfies $f(q_\pi) = E_\lambda[\,h \mid q_\pi\,]$, where $\lambda$ is the Lebesgue
measure and $h(t) := q_X(1 - t)$; see Exercise 3.4.1. Hence, Jensen's inequality for
conditional expectations and Lemma A.23 show that
\[
\begin{aligned}
\inf_{Q \in \mathcal{Q}} E_Q[u(X)] \le E_{Q_0}[u(X)] &= \int_0^1 u(q_X(t))\,dt = \int_0^1 u(h(t))\,dt \\
&\le \int_0^1 u\big( E_\lambda[\,h \mid q_\pi\,](t) \big)\,dt = \int_0^1 u\big( q_{f(\pi)}(1 - t) \big)\,dt \\
&= E_{Q_0}[u(f(\pi))] = \inf_{Q \in \mathcal{Q}} E_Q[u(f(\pi))],
\end{aligned} \tag{3.37}
\]
where we have used Proposition 3.53 in the last step. Thus, $f$ satisfies (3.35).
It remains to show that $f(\pi)$ satisfies the capital constraint. To this end, we first
use the lower Hardy–Littlewood inequality in Theorem A.24:
\[
w \ge E^*[X] = E_{Q_0}[\pi X] \ge \int_0^1 q_\pi(t)\, q_X(1 - t)\,dt. \tag{3.38}
\]
Here we may replace $q_X(1 - t) = h(t)$ by $E_\lambda[\,h \mid q_\pi\,](t) = f(q_\pi(t))$. We then get
\[
\int_0^1 q_\pi(t)\, q_X(1 - t)\,dt = \int_0^1 q_\pi(t)\, f(q_\pi(t))\,dt = E_{Q_0}[\pi f(\pi)] = E^*[f(\pi)]. \tag{3.39}
\]
Thus, $f$ is as desired.
(b) Now suppose $X^*$ solves (3.32). If $X^*$ is not $Q_0$-a.s. $\sigma(\pi)$-measurable, then
$Y := E_{Q_0}[\,X^* \mid \pi\,]$ must satisfy
\[
E_{Q_0}[u(Y)] > E_{Q_0}[u(X^*)], \tag{3.40}
\]
due to the strict concavity of $u$. If we define $\tilde f$ as in (3.36) with $Y$ replacing $X$, then
the proof of part (a) yields that
\[
E^*[\tilde f(\pi)] = E_{Q_0}[\pi \tilde f(\pi)] \le E_{Q_0}[\pi Y] = E_{Q_0}[\pi X^*] \le w,
\]
and by (3.37) and (3.40),
\[
\inf_{Q \in \mathcal{Q}} E_Q[u(\tilde f(\pi))] \ge E_{Q_0}[u(Y)] > E_{Q_0}[u(X^*)] \ge \inf_{Q \in \mathcal{Q}} E_Q[u(X^*)],
\]
in contradiction to the optimality of $X^*$. Thus, $X^*$ is necessarily $\sigma(\pi)$-measurable
and can hence be written as a (not yet necessarily decreasing) function of $\pi$.
If we define $f^*$ as in (3.36) with $X^*$ replacing $X$, then $f^*(\pi)$ is the terminal wealth
of yet another solution in $\mathcal{B}$. Clearly, we must have $E^*[X^*] = w = E^*[f^*(\pi)]$.
Thus, (3.38) and (3.39) yield that $E_{Q_0}[\pi X^*] = \int_0^1 q_\pi(t)\, q_{X^*}(1 - t)\,dt$. But then
the "only if" part of the lower Hardy–Littlewood inequality together with the
$\sigma(\pi)$-measurability of $X^*$ imply that $X^*$ is a decreasing function of $\pi$.
We can now state and prove the main result of this section.
Theorem 3.56. Suppose that $\mathcal{Q}$ admits a least favorable measure $Q_0 \approx P^*$. Then
the robust utility maximization problem (3.32) is equivalent to the standard utility
maximization problem with respect to $Q_0$, i.e., to (3.32) with $\mathcal{Q}$ replaced by $\{Q_0\}$.
More precisely, $X^* \in \mathcal{B}$ solves the robust problem (3.32) if and only if it solves the
standard problem for $Q_0$, and the values of the corresponding optimization problems
are equal, whether there exists a solution or not:
\[
\sup_{X \in \mathcal{B}} \inf_{Q \in \mathcal{Q}} E_Q[u(X)] = \sup_{X \in \mathcal{B}} E_{Q_0}[u(X)] \quad \text{for any initial capital } w > 0.
\]
Proof. Proposition 3.55 implies that in solving the robust utility maximization problem
(3.32) we may restrict ourselves to strategies whose terminal wealth is a decreasing
function of $\pi$. By Proposition 3.53, the robust utility of such a terminal
wealth is the same as the expected utility with respect to $Q_0$. On the other hand,
taking $\mathcal{Q} := \{Q_0\}$ in Proposition 3.55 implies that the standard utility maximization
problem for $Q_0$ also requires only strategies whose terminal wealth is a decreasing
function of $\pi$. Therefore, the two problems are equivalent, and Theorem 3.56 is
proved.
Theorem 3.56 states in particular that the robust utility maximization problem
(3.32) has a solution if and only if the standard problem for $Q_0$ has a solution, and we
refer to Theorem 3.39 (with $W = +\infty$) for a sufficient condition.
Example 3.57. Consider the situation of Example 3.50, where for $\lambda \in (0, 1)$ a least
favorable measure of $\mathcal{Q}_\lambda$ with respect to $P^*$ is given by
\[
\frac{dQ_0}{dP} = \frac{1}{\lambda} \cdot \frac{\varphi}{\varphi \vee q_\varphi(t_\lambda)}.
\]
Here $\varphi = dP^*/dP$ satisfies $P[\varphi > \lambda^{-1}] > 0$ and admits a continuous and strictly
increasing quantile function $q_\varphi$ with respect to $P$, and $t_\lambda$ is the solution of a certain
nonlinear equation. Let us also assume for the sake of concreteness that $u(x) = \frac{1}{\gamma} x^\gamma$
is a HARA utility function with risk aversion $\gamma \in (0, 1)$ and that $E[\varphi^{-\gamma/(1-\gamma)}] < \infty$.
It was shown in Example 3.41 that the standard utility maximization problem under
$P$ has the solution
\[
X^* = w \big( E[\varphi^{-\frac{\gamma}{1-\gamma}}] \big)^{-1} \varphi^{-\frac{1}{1-\gamma}}.
\]
Note that $X^*$ is large for small values of $\varphi$, that is, for low-price scenarios.
Now let us replace the single probability measure $P$ by the entire set $\mathcal{Q}_\lambda$. By
Theorem 3.56, the corresponding robust utility maximization problem will be solved
by
\[
\tilde X^* = w \big( E_{Q_0}[\pi^{-\frac{\gamma}{1-\gamma}}] \big)^{-1} \pi^{-\frac{1}{1-\gamma}},
\]
where
\[
\pi = \frac{dP^*}{dQ_0} = \lambda\,(\varphi \vee q_\varphi(t_\lambda)).
\]
It follows that
\[
\tilde X^* = c\,(X^* \wedge r)
\]
where $r > 0$ and $c > 1$ are certain constants. Thus, the effect of robustness is here
that the optimal payoff profile $X^*$ is cut off at a certain threshold. That is, one gives
up the opportunity for very high profits in low-price scenarios in favor of enhanced
returns in all other scenarios.
◊
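The cut-off structure $\tilde X^* = c\,(X^* \wedge r)$ can be observed on a discretized example. The sketch below is a hedged illustration: the finite state space, the midpoint discretization of a hypothetical $\varphi \sim \mathrm{Uniform}(0,2)$, and the parameter values $\lambda = 0.75$, $\gamma = 1/3$, $w = 1$ are all our assumptions. The threshold $q_\varphi(t_\lambda) = 1$ is the value obtained for this uniform case, and the normalizing expectation of the robust solution is taken under $Q_0$, as the standard solution formula applied to $Q_0$ requires.

```python
def robust_vs_standard(phi, p, lam, q_cut, gamma, w):
    """Return (x_std, x_rob, c, r) for HARA utility on a finite state space."""
    a = gamma / (1.0 - gamma)     # exponent inside the normalizing expectation
    b = 1.0 / (1.0 - gamma)       # exponent of the payoff profile
    # Standard problem under P: X* = w * phi^(-b) / E[phi^(-a)].
    norm_p = sum(pk * f ** (-a) for pk, f in zip(p, phi))
    x_std = [w * f ** (-b) / norm_p for f in phi]
    # Robust problem: pi = lam * max(phi, q_cut) and dQ0/dP = phi / pi.
    pi = [lam * max(f, q_cut) for f in phi]
    norm_q0 = sum(pk * (f / d) * d ** (-a) for pk, f, d in zip(p, phi, pi))
    x_rob = [w * d ** (-b) / norm_q0 for d in pi]
    # Constants in the representation x_rob = c * min(x_std, r).
    r = w * q_cut ** (-b) / norm_p
    c = lam ** (-b) * norm_p / norm_q0
    return x_std, x_rob, c, r

n = 1000
phi = [2.0 * (k + 0.5) / n for k in range(n)]   # midpoints of Uniform(0, 2)
p = [1.0 / n] * n
x_std, x_rob, c, r = robust_vs_standard(phi, p, lam=0.75, q_cut=1.0,
                                        gamma=1.0 / 3.0, w=1.0)
# Cost of the robust profile under the pricing measure: E[phi * x_rob].
budget_rob = sum(pk * f * x for pk, f, x in zip(p, phi, x_rob))
```

The robust profile uses the full budget $w$, coincides with $c\,X^*$ in the scenarios with $\varphi \ge q_\varphi(t_\lambda)$, and is capped at the constant level $c\,r$ in the low-price scenarios.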

3.6 Microeconomic equilibrium

The aim of this section is to provide a brief introduction to the theory of market equilibrium. Prices of assets will no longer be given in advance. Instead, they will be
derived from first principles in a microeconomic setting where different agents demand asset profiles in accordance with their preferences and with their budget constraints. These budget constraints are determined by a given price system. The role of
equilibrium prices consists in adjusting the constraints in such a way that the resulting
overall demand is matched by the overall supply of assets.
Consider a finite set $\mathcal{A}$ of economic agents and a convex set $\mathcal{X} \subset L^0(\Omega, \mathcal{F}, P)$ of
admissible claims. At time $t = 0$, each agent $a \in \mathcal{A}$ has an initial endowment whose
discounted payoff at time $t = 1$ is described by an admissible claim
\[
W_a \in \mathcal{X}, \quad a \in \mathcal{A}.
\]
The aggregated claim
\[
W := \sum_{a \in \mathcal{A}} W_a
\]
is also called the market portfolio. Agents may want to exchange their initial endowment
$W_a$ against some other admissible claim $X_a \in \mathcal{X}$. This could lead to a new
allocation $(X_a)_{a \in \mathcal{A}}$ if the resulting total demand matches the overall supply.


Definition 3.58. A collection $(X_a)_{a \in \mathcal{A}} \subset \mathcal{X}$ is called a feasible allocation if it
satisfies the market clearing condition
\[
\sum_{a \in \mathcal{A}} X_a = W \quad P\text{-a.s.} \tag{3.41}
\]
The budget constraints will be determined by a linear pricing rule of the form
\[
\Phi(X) := E[\varphi X], \quad X \in \mathcal{X},
\]
where $\varphi$ is a price density, i.e., an integrable function on $(\Omega, \mathcal{F})$ such that $\varphi > 0$
$P$-a.s. and $E[\,|W_a|\,\varphi\,] < \infty$ for all $a \in \mathcal{A}$. To any such $\varphi$ we can associate a normalized
price measure $P^\varphi \approx P$ with density $\varphi\, E[\varphi]^{-1}$.
Remark 3.59. In the context of our one-period model of a financial market with
$d$ risky assets $S^1, \dots, S^d$ and a risk-free asset $S^0 \equiv 1 + r$, $P^\varphi$ is a risk-neutral
measure if the pricing rule $\Phi$ is consistent with the given price vector $\bar\pi = (\pi^0, \pi)$,
where $\pi^0 = 1$. In this section, the pricing rule will be derived as an equilibrium
price measure, given the agents' preferences and endowments. In particular, this will
amount to an endogenous derivation of the price vector $\pi$. In a situation where the
structure of the equilibrium is already partially known in the sense that it is consistent
with the given price vector $\pi$, the construction of a microeconomic equilibrium yields
a specific choice of a martingale measure $P^*$, i.e., of a specific extension of $\pi$ from
the space $\mathcal{V}$ of attainable payoffs to a larger space of admissible claims.
◊
The preferences of agent $a \in \mathcal{A}$ are described by a utility function $u_a$. Given the
price density $\varphi$, an agent $a \in \mathcal{A}$ may want to exchange the endowment $W_a$ for an
admissible claim $X_a^\varphi$ which maximizes the expected utility
\[
E[u_a(X)]
\]
among all $X$ in the agent's budget set
\[
\begin{aligned}
\mathcal{B}_a(\varphi) &:= \{\, X \in \mathcal{X} \mid E[\varphi |X|] < \infty \text{ and } E[\varphi X] \le E[\varphi W_a] \,\} \\
&\ = \{\, X \in \mathcal{X} \mid X \in L^1(P^\varphi) \text{ and } E^\varphi[X] \le E^\varphi[W_a] \,\}.
\end{aligned}
\]
In this case, we will say that $X_a^\varphi$ solves the utility maximization problem of agent
$a \in \mathcal{A}$ with respect to the price density $\varphi$. The key problem is whether $\varphi$ can be
chosen in such a way that the requested profiles $X_a^\varphi$, $a \in \mathcal{A}$, form a feasible allocation.
Definition 3.60. A price density $\varphi^*$ together with a feasible allocation $(X_a^*)_{a \in \mathcal{A}}$ is
called an Arrow–Debreu equilibrium if each $X_a^*$ solves the utility maximization problem
of agent $a \in \mathcal{A}$ with respect to $\varphi^*$.


Thus, the price density $\varphi^*$ appearing in an Arrow–Debreu equilibrium decentralizes
the crucial problem of implementing the global feasibility constraint (3.41). This
is achieved by adjusting the budget sets in such a way that the resulting demands
respect the market clearing condition, even though the individual demand is determined
without any regard to this global constraint.
Example 3.61. Assume that each agent $a \in \mathcal{A}$ has an exponential utility function
with parameter $\alpha_a > 0$, and let us consider the unconstrained case
\[
\mathcal{X} = L^0(\Omega, \mathcal{F}, P).
\]
In this case, there is a unique equilibrium, and it is easy to describe it explicitly. For
a given pricing measure $P^* \approx P$ such that $W_a \in L^1(P^*)$ for all $a \in \mathcal{A}$, the utility
maximization problem for agent $a \in \mathcal{A}$ can be solved if and only if $H(P^*|P) < \infty$,
and in this case the optimal demand is given by
\[
X_a^* = -\frac{1}{\alpha_a} \log \varphi^* + w_a^* + \frac{1}{\alpha_a} H(P^*|P)
\]
where
\[
w_a^* := E^*[W_a];
\]
see Example 3.35. The market clearing condition (3.41) takes the form
\[
W = -\frac{1}{\alpha} \log \varphi^* + \sum_{a \in \mathcal{A}} w_a^* + \frac{1}{\alpha} H(P^*|P)
\]
where $\alpha$ is defined via
\[
\frac{1}{\alpha} = \sum_{a \in \mathcal{A}} \frac{1}{\alpha_a}. \tag{3.42}
\]
Thus, a normalized equilibrium price density must have the form
\[
\varphi^* = \frac{e^{-\alpha W}}{E[e^{-\alpha W}]}, \tag{3.43}
\]
and this shows uniqueness. As to existence, let us assume that
\[
E[\,|W_a|\, e^{-\alpha W}\,] < \infty, \quad a \in \mathcal{A};
\]
this condition is satisfied if, e.g., the random variables $W_a$ are bounded from below.
Define $P^* \approx P$ via (3.43). Then
\[
H(P^*|P) = -\alpha E^*[W] - \log E[e^{-\alpha W}] < \infty,
\]
and the optimal profile for agent $a \in \mathcal{A}$ with respect to the pricing measure $P^*$ takes
the form
\[
X_a^* = w_a^* + \frac{\alpha}{\alpha_a} (W - E^*[W]). \tag{3.44}
\]
Since
\[
\sum_{a \in \mathcal{A}} w_a^* = E^*[W],
\]
the allocation $(X_a^*)_{a \in \mathcal{A}}$ is feasible, and so we have constructed an Arrow–Debreu
equilibrium. Thus, the agents share the market portfolio in a linear way, and in inverse
proportion to their risk aversion.
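Let us now return to our financial market model of Section 3.1. We assume that
the initial endowment of agent $a \in \mathcal{A}$ is given by a portfolio $\bar\eta_a \in \mathbb{R}^{d+1}$ so that the
discounted payoff at time $t = 1$ is
\[
W_a = \frac{\bar\eta_a \cdot \bar S}{1 + r}, \quad a \in \mathcal{A}.
\]
In this case, the market portfolio is given by $W = \bar\eta \cdot \bar S/(1 + r)$ with
$\bar\eta := \sum_{a \in \mathcal{A}} \bar\eta_a = (\eta^0, \eta)$. The optimal claim for agent $a \in \mathcal{A}$ in (3.44) takes the form
\[
X_a^* = \bar\eta_a \cdot \bar\pi + \frac{\alpha}{\alpha_a}\, \eta \cdot \Big( \frac{S}{1 + r} - \pi \Big),
\]
where $\bar\pi = (1, \pi)$ and
\[
\pi^i = E^*\Big[ \frac{S^i}{1 + r} \Big] \quad \text{for } i = 1, \dots, d.
\]
Thus, we could have formulated the equilibrium problem within the smaller space
$\mathcal{X} = \mathcal{V}$ of attainable payoffs, and the resulting equilibrium allocation would have been
the same. In particular, the extension of $\mathcal{X}$ from $\mathcal{V}$ to the general space $L^0(\Omega, \mathcal{F}, P)$
of admissible claims does not create a demand for derivatives in our present example.
◊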
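The explicit equilibrium of Example 3.61 is easy to reproduce on a finite probability space. The following sketch assumes a three-state space with two agents; the probabilities, endowments, and risk-aversion parameters are hypothetical values chosen for illustration, and the function name `exponential_equilibrium` is ours.

```python
import math

def exponential_equilibrium(p, endowments, alphas):
    """Price density (3.43) and optimal profiles (3.44) on a finite state space."""
    n_states = len(p)
    W = [sum(Wa[s] for Wa in endowments) for s in range(n_states)]
    alpha = 1.0 / sum(1.0 / a for a in alphas)                    # (3.42)
    weights = [math.exp(-alpha * W[s]) for s in range(n_states)]
    Z = sum(p[s] * weights[s] for s in range(n_states))
    phi = [wgt / Z for wgt in weights]                             # (3.43)
    EW = sum(p[s] * phi[s] * W[s] for s in range(n_states))       # E*[W]
    X = []
    for Wa, a in zip(endowments, alphas):
        wa = sum(p[s] * phi[s] * Wa[s] for s in range(n_states))  # w_a* = E*[W_a]
        X.append([wa + alpha / a * (W[s] - EW) for s in range(n_states)])  # (3.44)
    return phi, X

# Hypothetical data: three states, two agents.
p = [0.2, 0.5, 0.3]
endowments = [[1.0, 2.0, 4.0], [3.0, 1.0, 0.0]]
alphas = [1.0, 2.0]
phi, X = exponential_equilibrium(p, endowments, alphas)
```

Each agent holds the constant $w_a^*$ plus the fraction $\alpha/\alpha_a$ of the centered market portfolio; since these fractions sum to one, the market clears state by state, and each profile costs exactly $w_a^*$ under $P^*$.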
From now on we assume that the set of admissible claims is given by
\[
\mathcal{X} = L^0_+(\Omega, \mathcal{F}, P),
\]
and that the preferences of agent $a \in \mathcal{A}$ are described by a utility function
$u_a : [0, \infty) \to \mathbb{R}$ which is continuously differentiable on $(0, \infty)$. In particular, the initial
endowments $W_a$ are assumed to be non-negative. Moreover, we assume
\[
P[W_a > 0] \ne 0 \text{ for all } a \in \mathcal{A} \quad \text{and} \quad E[W] < \infty. \tag{3.45}
\]
A function $\varphi \in L^1(\Omega, \mathcal{F}, P)$ such that $\varphi > 0$ $P$-a.s. is a price density if
\[
E[\varphi W] < \infty;
\]
note that this condition is satisfied as soon as $\varphi$ is bounded, due to our assumption
(3.45). Given a price density $\varphi$, each agent faces exactly the optimization problem
discussed in Section 3.3 in terms of the price measure $P^\varphi \approx P$. Thus, if $(X_a^*)_{a \in \mathcal{A}}$
is an equilibrium allocation with respect to the price density $\varphi^*$, feasibility implies
$0 \le X_a^* \le W$, and so it follows as in the proof of Corollary 3.42 that
\[
X_a^* = I_a^+(c_a \varphi^*), \quad a \in \mathcal{A}, \tag{3.46}
\]
with positive constants $c_a > 0$. Note that the market clearing condition
\[
W = \sum_{a \in \mathcal{A}} X_a^* = \sum_{a \in \mathcal{A}} I_a^+(c_a \varphi^*)
\]
will determine $\varphi^*$ as a decreasing function of $W$, and thus the optimal profiles $X_a^*$
will be increasing functions of $W$.
Before we discuss the existence of an Arrow–Debreu equilibrium, let us first illustrate
the structure of such equilibria by the following simple examples. In particular,
they show that an equilibrium allocation will typically involve non-linear derivatives
of the market portfolio $W$.
Example 3.62. Let us consider the constrained version of the preceding example
where agents $a \in \mathcal{A}$ have exponential utility functions with parameters $\alpha_a > 0$.
Define
\[
w := \sup\{\, c \mid W \ge c \ P\text{-a.s.} \,\} \ge 0,
\]
and let $P^*$ be the measure defined via (3.43). For any agent $a \in \mathcal{A}$ such that
\[
w_a^* := E^*[W_a] \ge \frac{\alpha}{\alpha_a} (E^*[W] - w), \tag{3.47}
\]
the unrestricted optimal profile
\[
X_a^* = w_a^* + \frac{\alpha}{\alpha_a} (W - E^*[W])
\]
satisfies $X_a^* \ge 0$ $P$-a.s. Thus, if all agents satisfy the requirement (3.47) then the
unrestricted equilibrium computed in Example 3.61 is a fortiori an Arrow–Debreu
equilibrium in our present context. In this case, there is no need for non-linear
derivatives of the market portfolio.
If some agents do not satisfy the requirement (3.47) then the situation becomes
more involved, and the equilibrium allocation will need derivatives such as call
options. Let us illustrate this effect in the simple setting where there are only two agents
$a \in \mathcal{A} = \{1, 2\}$. Suppose that agent 1 satisfies condition (3.47), while agent 2 does
not. For $c \ge 0$, we define the measure $P^c \approx P$ in terms of the density
\[
\varphi^c := \begin{cases}
\dfrac{1}{Z_1} e^{-\alpha_1 W} & \text{on } \{W \le c\}, \\[6pt]
\dfrac{1}{Z_2} e^{-\alpha W} & \text{on } \{W \ge c\},
\end{cases}
\]
where $\alpha$ is given by (3.42), and where the constants $Z_1$ and $Z_2$ are determined by the
continuity condition
\[
\log Z_2 - \log Z_1 = c\,(\alpha_1 - \alpha)
\]
and by the normalization $E[\varphi^c] = 1$. Note that $P^0 = P^*$ with $P^*$ as in (3.43).
Consider the equation
\[
\frac{\alpha}{\alpha_2} E^c[(W - c)^+] = w_2^c := E^c[W_2]. \tag{3.48}
\]
Both sides are continuous in $c$. As $c$ increases from 0 to $+\infty$, the left-hand side
decreases from $\frac{\alpha}{\alpha_2} E^*[W]$ to 0, while $w_2^c$ goes from $w_2^0 < \frac{\alpha}{\alpha_2} E^*[W]$ to $E^\infty[W_2] > 0$.
Thus, there exists a solution $c$ of (3.48). Let us now check that
\[
X_2^c := \frac{\alpha}{\alpha_2} (W - c)^+, \quad X_1^c := W - X_2^c
\]
defines an equilibrium allocation with respect to the pricing measure $P^c$. Clearly, $X_1^c$
and $X_2^c$ are non-negative and satisfy $X_1^c + X_2^c = W$. The budget condition for agent 2
is satisfied due to (3.48), and this implies the budget condition
\[
E^c[X_1^c] = E^c[W] - w_2^c = w_1^c
\]
for agent 1. Both are optimal since
\[
X_a^c = I_a^+(\lambda_a \varphi^c)
\]
with
\[
\lambda_1 := \alpha_1 Z_1 \quad \text{and} \quad \lambda_2 := \alpha_2 Z_2 e^{\alpha c}.
\]
Thus, agent 2 demands $\frac{\alpha}{\alpha_2}$ shares of a call option on the market portfolio $W$ with
strike $c$, agent 1 demands the remaining part of $W$, and so the market is cleared.
In the general case of a finite set $\mathcal{A}$ of agents, the equilibrium price measure $\hat P$ has
the following structure. There are levels $0 := c_0 < \cdots < c_N = \infty$ with $1 \le N \le |\mathcal{A}|$
such that the price density $\hat\varphi$ is given by
\[
\hat\varphi = \frac{1}{Z_i} e^{-\beta_i W} \quad \text{on } \{ W \in [c_{i-1}, c_i] \}
\]
for $i = 1, \dots, N$, where
\[
\beta_i := \Big( \sum_{a \in \mathcal{A}_i} \frac{1}{\alpha_a} \Big)^{-1},
\]
and where $\mathcal{A}_i$ $(i = 1, \dots, N)$ are the increasing sets of agents which are active at
the $i$th layer in the sense that $\hat X_a > 0$ on $\{ W \in (c_{i-1}, c_i] \}$. At each layer $(c_{i-1}, c_i]$,
the active agents are sharing the market portfolio in inverse proportions to their risk
aversion. Thus, the optimal profile $\hat X_a$ of any agent $a \in \mathcal{A}$ is given by an increasing
piecewise linear function in $W$, and thus it can be implemented by a linear combination
of call options with strikes $c_i$. More precisely, an agent $a \in \mathcal{A}_i$ takes $\beta_i/\alpha_a$
shares of the spread
\[
(W - c_{i-1})^+ - (W - c_i)^+,
\]
i.e., the agent goes long on a call option with strike $c_{i-1}$ and short on a call option
with strike $c_i$.
◊
Example 3.63. Assume that all agents $a \in \mathcal{A}$ have preferences described by HARA
utility functions so that
\[
I_a^+(y) = y^{-\frac{1}{1-\gamma_a}}, \quad a \in \mathcal{A},
\]
with $0 \le \gamma_a < 1$. For a given price density $\varphi$, the optimal claims take the form
\[
X_a = I_a^+(c_a \varphi) = b_a\, \varphi^{-\frac{1}{1-\gamma_a}} \tag{3.49}
\]
with constants $b_a > 0$. If $\gamma_a = \gamma$ for all $a \in \mathcal{A}$, then the market clearing condition
(3.41) implies
\[
W = \sum_{a \in \mathcal{A}} X_a = \Big( \sum_{a \in \mathcal{A}} b_a \Big) \varphi^{-\frac{1}{1-\gamma}},
\]
i.e., the equilibrium price density $\varphi^*$ takes the form
\[
\varphi^* = \frac{1}{Z} W^{\gamma - 1},
\]
where $Z$ is the normalizing constant, and so the agents demand linear shares of the
market portfolio $W$. If risk aversion varies among the agents then the structure of the
equilibrium becomes more complex, and it will involve non-linear derivatives of the
market portfolio. Let us number the agents so that $\mathcal{A} = \{1, \dots, n\}$ and $\gamma_1 \ge \cdots \ge \gamma_n$.
Condition (3.49) implies
\[
X_i = d_i X_n^{\beta_i}
\]
with some constants $d_i$, and where
\[
\beta_i := \frac{1 - \gamma_n}{1 - \gamma_i}
\]
satisfies $\beta_1 \ge \cdots \ge \beta_n = 1$ with at least one strict inequality. Thus, each $X_i$ is
a convex increasing function of $X_n$. In equilibrium, $X_n$ is a concave function of $W$
determined by the condition
\[
\sum_{i=1}^n d_i X_n^{\beta_i} = W, \tag{3.50}
\]
and the price density $\varphi^*$ takes the form
\[
\varphi^* = \frac{1}{Z} X_n^{\gamma_n - 1}.
\]
As an illustration, we consider the special case "Bernoulli vs. Cramer", where
$\mathcal{A} = \{1, 2\}$ with $u_1(x) = \sqrt{x}$ and $u_2(x) = \log x$, i.e., $\gamma_1 = \frac12$ and $\gamma_2 = 0$; see
Example 2.38. The solutions of (3.50) can be parameterized with $c \ge 0$ such that
\[
X_2^c = 2\sqrt{c}\,\big( \sqrt{W + c} - \sqrt{c} \big) \in [0, W]
\]
and
\[
X_1^c = W - X_2^c.
\]
The corresponding price density takes the form
\[
\varphi^c = \frac{1}{Z(c)} \cdot \frac{1}{\sqrt{W + c} - \sqrt{c}},
\]
where $Z(c)$ is the normalizing constant. Now assume that $W^{-1} \in L^1(P)$, and let
$P^\infty$ denote the measure with density $W^{-1} (E[W^{-1}])^{-1}$. As $c$ increases from 0 to
$\infty$, $E^c[X_2^c]$ increases continuously from 0 to $E^\infty[W]$, while $E^c[W_2]$ goes continuously
from $E^0[W_2] > 0$ to $E^\infty[W_2] < E^\infty[W]$; here we use our assumption that
$P[W_a > 0] \ne 0$ for all $a \in \mathcal{A}$. Thus, there is a $c \in (0, \infty)$ such that
\[
E^c[X_2^c] = E^c[W_2],
\]
and this implies that the budget constraint is satisfied for both agents. With this choice
of the parameter $c$, $(X_1^c, X_2^c)$ is an equilibrium allocation with respect to the pricing
measure $P^c$: Agent 2 demands the concave profile $X_2^c$, agent 1 demands the convex
profile $X_1^c$, both in accordance with their budget constraints, and the market is cleared.
◊
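The parameter $c$ in the "Bernoulli vs. Cramer" case can be located numerically. The sketch below is illustrative only: it assumes a finite state space with hypothetical endowments, rewrites $\sqrt{W+c} - \sqrt{c}$ as $W/(\sqrt{W+c} + \sqrt{c})$ for numerical stability (valid since $W > 0$), and finds a root of $E^c[X_2^c] - E^c[W_2]$ by bisection on a logarithmic scale; the bracket $[10^{-8}, 10^{8}]$ is an assumption that fits this data.

```python
import math

def allocation(W, c):
    """X_2^c = 2*sqrt(c)*(sqrt(W+c) - sqrt(c)) and X_1^c = W - X_2^c, statewise."""
    x2 = [2.0 * math.sqrt(c) * w / (math.sqrt(w + c) + math.sqrt(c)) for w in W]
    x1 = [w - y for w, y in zip(W, x2)]
    return x1, x2

def budget_gap(p, W, W2, c):
    """E^c[X_2^c] - E^c[W_2] for the density proportional to 1/(sqrt(W+c)-sqrt(c))."""
    dens = [(math.sqrt(w + c) + math.sqrt(c)) / w for w in W]
    Z = sum(pk * d for pk, d in zip(p, dens))
    _, x2 = allocation(W, c)
    return sum(pk * d * (y - w2) for pk, d, y, w2 in zip(p, dens, x2, W2)) / Z

def solve_c(p, W, W2, lo=1e-8, hi=1e8, iters=200):
    """Bisection on a log scale; the gap is negative near 0 and positive for large c."""
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if budget_gap(p, W, W2, mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return math.sqrt(lo * hi)

# Hypothetical data: three states; agent 1 has u(x) = sqrt(x), agent 2 has u(x) = log(x).
p = [0.25, 0.5, 0.25]
W1 = [1.0, 2.0, 5.0]
W2 = [1.0, 1.0, 1.0]
W = [a + b for a, b in zip(W1, W2)]
c_star = solve_c(p, W, W2)
x1, x2 = allocation(W, c_star)
```

At the computed $c$ the budget of agent 2 is balanced, hence so is that of agent 1 by market clearing. The profiles also satisfy $X_1^c = (X_2^c)^2/(4c)$, which is the relation $X_1 = d_1 X_2^{\beta_1}$ with $\beta_1 = 2$.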
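Let us now return to our general setting, and let us prove the existence of an Arrow–
Debreu equilibrium. Consider the following condition:
\[
\limsup_{x \downarrow 0}\, x\, u_a'(x) < \infty \quad \text{and} \quad E\Big[ u_a'\Big( \frac{W}{|\mathcal{A}|} \Big) \Big] < \infty, \quad a \in \mathcal{A}. \tag{3.51}
\]
Remark 3.64. Condition (3.51) is clearly satisfied if
\[
u_a'(0) := \lim_{x \downarrow 0} u_a'(x) < \infty, \quad a \in \mathcal{A}. \tag{3.52}
\]
But it also includes HARA utility functions $u_a$ with parameter $\gamma_a \in [0, 1)$ if we
assume
\[
E[W^{\gamma_a - 1}] < \infty, \quad a \in \mathcal{A},
\]
in addition to our assumption $E[W] < \infty$.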


Theorem 3.65. Under assumptions (3.45) and (3.51), there exists an Arrow–Debreu
equilibrium.
In a first step, we are going to show that an equilibrium allocation maximizes a
suitable weighted average
\[
U^\lambda(X) := \sum_{a \in \mathcal{A}} \lambda_a E[u_a(X_a)]
\]
of the individual utility functionals over all feasible allocations $X = (X_a)_{a \in \mathcal{A}}$. The
weights are non-negative, and without loss of generality we can assume that they are
normalized so that the vector $\lambda := (\lambda_a)_{a \in \mathcal{A}}$ belongs to the convex compact set
\[
\Lambda = \Big\{\, \lambda \in [0, 1]^{\mathcal{A}} \ \Big|\ \sum_{a \in \mathcal{A}} \lambda_a = 1 \,\Big\}.
\]
In a second step, we will use a fixed-point argument to obtain a weight vector and a
corresponding price density such that the maximizing allocation satisfies the individual
budget constraints.
Definition 3.66. A feasible allocation $(X_a)_{a \in \mathcal{A}}$ is called $\lambda$-efficient for $\lambda \in \Lambda$ if it
maximizes $U^\lambda$ over all feasible allocations.
In view of (3.46), part (b) of the following lemma shows that the equilibrium
allocation $(X_a^*)_{a \in \mathcal{A}}$ in an Arrow–Debreu equilibrium is $\lambda$-efficient for the vector
$\lambda = (c \cdot c_a^{-1})_{a \in \mathcal{A}}$, where $c^{-1} := \sum_a c_a^{-1}$. Thus, the existence proof for an Arrow–
Debreu equilibrium is reduced to the construction of a suitable vector $\lambda^* \in \Lambda$.
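Lemma 3.67. (a) For any $\lambda \in \Lambda$ there exists a unique $\lambda$-efficient allocation $(X_a^\lambda)_{a \in \mathcal{A}}$.
(b) A feasible allocation $(X_a)_{a \in \mathcal{A}}$ is $\lambda$-efficient if and only if it satisfies the first
order conditions
\[
\lambda_a u_a'(X_a) \le \varphi, \quad \text{with equality on } \{X_a > 0\} \tag{3.53}
\]
with respect to some price density $\varphi$. In this case, $(X_a)_{a \in \mathcal{A}}$ coincides with
$(X_a^\lambda)_{a \in \mathcal{A}}$, and the price density can be chosen as
\[
\varphi^\lambda := \max_{a \in \mathcal{A}} \lambda_a u_a'(X_a^\lambda).
\]
(c) For each $a \in \mathcal{A}$, $X_a^\lambda$ maximizes $E[u_a(X)]$ over all $X \in \mathcal{X}$ such that
\[
E[\varphi^\lambda X] \le E[\varphi^\lambda X_a^\lambda]. \tag{3.54}
\]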


Proof. (a): Existence and uniqueness follow from the general argument in Remark
3.37 applied to the set $\mathcal{B}$ of all feasible allocations and to the functional $U^\lambda$. Note
that
\[
U^\lambda(X) \le \max_{a \in \mathcal{A}} E[u_a(W)]
\]
for any feasible allocation, and that the right-hand side is finite due to our assumption
(3.45). Moreover, by dominated convergence, $U^\lambda$ is indeed continuous on $\mathcal{B}$ with
respect to $P$-a.s. convergence.
(b): Let us first show sufficiency. If $X = (X_a)_{a \in \mathcal{A}}$ is a feasible allocation satisfying
the first order conditions, and $Y = (Y_a)_{a \in \mathcal{A}}$ is another feasible allocation then
\[
\begin{aligned}
U^\lambda(X) - U^\lambda(Y) &= \sum_{a \in \mathcal{A}} \lambda_a E[u_a(X_a) - u_a(Y_a)] \\
&\ge \sum_{a \in \mathcal{A}} \lambda_a E[u_a'(X_a)(X_a - Y_a)] \\
&\ge E\Big[ \varphi \Big( \sum_{a \in \mathcal{A}} X_a - \sum_{a \in \mathcal{A}} Y_a \Big) \Big] = 0,
\end{aligned}
\]
using concavity of $u_a$ in the second step and the first order conditions in the third.
This shows that $X$ is $\lambda$-efficient.
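Turning to necessity, consider the $\lambda$-efficient allocation $(X_a^\lambda)_{a \in \mathcal{A}}$ for $\lambda \in \Lambda$ and
another feasible allocation $(X_a)_{a \in \mathcal{A}}$. For $\varepsilon \in (0, 1]$, let $Y_a^\varepsilon := \varepsilon X_a + (1 - \varepsilon) X_a^\lambda$.
Since $(Y_a^\varepsilon)_{a \in \mathcal{A}}$ is feasible, $\lambda$-efficiency of $(X_a^\lambda)_{a \in \mathcal{A}}$ yields
\[
\begin{aligned}
0 &\ge \frac{1}{\varepsilon} \sum_{a \in \mathcal{A}} \lambda_a E[u_a(Y_a^\varepsilon) - u_a(X_a^\lambda)] \\
&\ge \frac{1}{\varepsilon} \sum_{a \in \mathcal{A}} \lambda_a E[u_a'(Y_a^\varepsilon)(Y_a^\varepsilon - X_a^\lambda)] \\
&= \sum_{a \in \mathcal{A}} \lambda_a E[u_a'(Y_a^\varepsilon)(X_a - X_a^\lambda)].
\end{aligned} \tag{3.55}
\]
Let us first assume (3.52); in part (d) of the proof we show how to modify the
argument under condition (3.51). Using dominated convergence and (3.52), we may
let $\varepsilon \downarrow 0$ in the above inequality to conclude
\[
\sum_{a \in \mathcal{A}} E[\varphi_a^\lambda X_a] \le \sum_{a \in \mathcal{A}} E[\varphi_a^\lambda X_a^\lambda] \le E[\varphi^\lambda W], \tag{3.56}
\]
where
\[
\varphi_a^\lambda := \lambda_a u_a'(X_a^\lambda).
\]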


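Note that $\varphi^\lambda$ is a price density since by (3.52)
\[
0 < \varphi^\lambda \le \max\{\, \lambda_a u_a'(0) \mid a \in \mathcal{A} \,\} < \infty.
\]
Take a feasible allocation $(X_a)_{a \in \mathcal{A}}$ such that
\[
\sum_{a \in \mathcal{A}} \varphi_a^\lambda X_a = \varphi^\lambda W; \tag{3.57}
\]
for example, we can enumerate $\mathcal{A} := \{1, \dots, |\mathcal{A}|\}$ and take $X_a := W I_{\{T = a\}}$ where
\[
T(\omega) := \min\{\, a \mid \varphi_a^\lambda(\omega) = \varphi^\lambda(\omega) \,\}.
\]
In view of (3.56), we see that
\[
\sum_{a \in \mathcal{A}} E[\varphi_a^\lambda X_a^\lambda] = E[\varphi^\lambda W]. \tag{3.58}
\]
This implies $\varphi_a^\lambda = \varphi^\lambda$ on $\{X_a^\lambda > 0\}$, which is equivalent to the first order condition
(3.53) with respect to $\varphi^\lambda$.
(c): In order to show optimality of $X_a^\lambda$, we may assume without loss of generality
that $P[X_a^\lambda > 0] > 0$, and hence $\lambda_a > 0$. Thus, the first order condition with respect
to $\varphi^\lambda$ takes the form
\[
X_a^\lambda = I_a^+(\lambda_a^{-1} \varphi^\lambda),
\]
due to our convention (3.22). By Corollary 3.42, $X_a^\lambda$ solves the optimization problem
for agent $a \in \mathcal{A}$ under the constraint
\[
E[\varphi^\lambda X] \le E[\varphi^\lambda X_a^\lambda].
\]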
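(d): If (3.52) is replaced by (3.51), then we first need an additional argument in order to pass from (3.55) to (3.56). Note first that by Fatou's lemma,
\[
\liminf_{\varepsilon \downarrow 0} \sum_{a \in A} \lambda_a E[\, u_a'(Y_a^\varepsilon) X_a \,] \ge \sum_{a \in A} \lambda_a \liminf_{\varepsilon \downarrow 0} E[\, u_a'(Y_a^\varepsilon) X_a \,] \ge \sum_{a \in A} \lambda_a E[\, u_a'(X_a^\lambda) X_a \,].
\]
On the other hand, since
\[
\kappa := \max_{a \in A} \sup_{0 < x \le 1} x\, u_a'(x) < \infty
\]
by (3.51), we have $x u_a'(x) \le \kappa + x u_a'(1) \le \kappa (1 + x)$ for all $x \ge 0$. This implies
\[
u_a'(X_a^\lambda) X_a^\lambda \le V := \kappa (1 + W) \in L^1(P),
\tag{3.59}
\]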

170

Chapter 3 Optimality and equilibrium

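and also
\[
u_a'(Y_a^\varepsilon) X_a^\lambda \le u_a'((1 - \varepsilon) X_a^\lambda) X_a^\lambda \le (1 - \varepsilon)^{-1} V,
\]
since $Y_a^\varepsilon \ge (1 - \varepsilon) X_a^\lambda$. Thus, dominated convergence implies
\[
E[\, u_a'(Y_a^\varepsilon) X_a^\lambda \,] \longrightarrow E[\, u_a'(X_a^\lambda) X_a^\lambda \,], \qquad \varepsilon \downarrow 0,
\]
and this concludes the proof of (3.56).
By (3.59), we have
\[
\varphi_a^\lambda X_a^\lambda = \lambda_a u_a'(X_a^\lambda) X_a^\lambda \in L^1(P).
\]
Hence $E[\, \varphi^\lambda W \,] < \infty$ follows by taking in (3.56) a feasible allocation $(X_a)_{a \in A}$ which is as in (3.57). We furthermore get (3.58), which yields as in part (b) the first order conditions (3.53).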
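It remains to show that $\varphi^\lambda$ is integrable in order to conclude that $\varphi^\lambda$ is a price density. Our assumption (3.51) implies
\[
F := \max_{a \in A} u_a'\Big( \frac{W}{|A|} \Big) \in L^1(P),
\tag{3.60}
\]
and so it is enough to show that $F \ge \varphi^\lambda$. Since $X_a^\lambda = I_a^+(\varphi^\lambda / \lambda_a)$, feasibility and $\lambda_a \le 1$ imply
\[
W \le \sum_{a \in A} I_a^+(\varphi^\lambda) \le |A| \max_{a \in A} I_a^+(\varphi^\lambda),
\]
hence
\[
F \ge \max_{a \in A} u_a'\Big( \max_{b \in A} I_b^+(\varphi^\lambda) \Big) \ge u_{a_0}'\big( I_{a_0}^+(\varphi^\lambda) \big) = \varphi^\lambda
\quad \text{on } \Big\{ \max_{a \in A} I_a^+(\varphi^\lambda) = I_{a_0}^+(\varphi^\lambda) \Big\}.
\]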

After these preliminaries, we are now in a position to prove the existence of an Arrow-Debreu equilibrium. Note that for each $\lambda \in \Lambda$ the $\lambda$-efficient allocation $(X_a^\lambda)_{a \in A}$ and the price density $\varphi^\lambda$ would form an Arrow-Debreu equilibrium if
\[
E[\, \varphi^\lambda W_a \,] = E[\, \varphi^\lambda X_a^\lambda \,] \quad \text{for all } a \in A.
\tag{3.61}
\]
If this is not the case, then we can replace $\lambda$ by the vector $g(\lambda) = (g_a(\lambda))_{a \in A}$ defined by
\[
g_a(\lambda) := \lambda_a + \frac{1}{E[\, V \,]} \cdot E[\, \varphi^\lambda (W_a - X_a^\lambda) \,],
\]
where $V$ is given by (3.59). Note that $g(\lambda) \in \Lambda$: Since the first order conditions (3.53) together with (3.59) imply
\[
E[\, \varphi^\lambda X_a^\lambda \,] = \lambda_a E[\, u_a'(X_a^\lambda) X_a^\lambda \,] \le \lambda_a E[\, V \,],
\]
we have $g_a(\lambda) \ge 0$, and $\sum_a g_a(\lambda) = 1$ follows by feasibility. Thus, we increase the weights of agents which were allocated less than they could afford. Clearly, any fixed point of the map $g : \Lambda \to \Lambda$ will satisfy condition (3.61) and thus yield an Arrow-Debreu equilibrium.
Proof of Theorem 3.65. (a): The set $\Lambda$ is convex and compact. Thus, the existence of a fixed point of the map $g : \Lambda \to \Lambda$ follows from Brouwer's fixed point theorem as soon as we can verify that $g$ is continuous; see, for instance, Corollary 16.52 in [3] for a proof of Brouwer's fixed point theorem. Suppose that the sequence $(\lambda_n) \subset \Lambda$ converges to $\lambda \in \Lambda$. In part (c) we show that $X_n := X^{\lambda_n}$ and $\varphi_n := \varphi^{\lambda_n}$ converge $P$-a.s. to $X^\lambda$ and $\varphi^\lambda$, respectively. We will show next that we may apply the dominated convergence theorem, so that
\[
\lim_{n \uparrow \infty} E[\, \varphi_n W_a \,] = E[\, \varphi^\lambda W_a \,]
\quad \text{and} \quad
\lim_{n \uparrow \infty} E[\, \varphi_n X_n \,] = E[\, \varphi^\lambda X^\lambda \,],
\]
and this will prove the continuity of $g$. To verify the assumptions of the dominated convergence theorem, note that
\[
W_a \varphi_n \le W \varphi_n \le W F,
\]
where $F$ is as in (3.60). Moreover,
\[
W F \le |A|\, F\, \mathbf{1}_{\{W \le |A|\}} + \max_{a \in A} u_a'(1) \cdot W \in L^1(P).
\]
Thus, $\varphi_n W_a$ and $\varphi_n X_n$ are bounded by $W F \in L^1(P)$.


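(b): By our convention (3.22), the map $f : \Lambda \times [0, \infty] \to [0, \infty]$ defined by
\[
f(\lambda, y) = \sum_{a \in A} I_a^+(\lambda_a^{-1} y)
\]
is continuous. If we fix $\lambda \in \Lambda$, then the function $f(\lambda, \cdot)$ is continuous on $[0, \infty]$ and strictly decreasing on $(a(\lambda), b(\lambda))$ where
\[
a(\lambda) := \max_{a \in A} \lim_{x \uparrow \infty} \lambda_a u_a'(x) \ge 0
\quad \text{and} \quad
b(\lambda) = \max_{a \in A} \lambda_a u_a'(0+) \le +\infty.
\]
Moreover, $f(\lambda, y) = \infty$ for $y \le a(\lambda)$ and $f(\lambda, y) = 0$ for $y \ge b(\lambda)$. Hence, for each $w \in (0, \infty)$ there exists exactly one solution $y^\lambda \in (a(\lambda), b(\lambda))$ of the equation
\[
f(\lambda, y^\lambda) = w.
\]
Recall that $[0, \infty]$ can be regarded as a compact topological space. To see that $y^\lambda$ depends continuously on $\lambda \in \Lambda$, take a sequence $\lambda_n \to \lambda$ and a subsequence $(\lambda_{n_k})$ such that the solutions $y_k = y^{\lambda_{n_k}}$ of $f(\lambda_{n_k}, y) = w$ converge to some limit $y_\infty \in [a(\lambda), b(\lambda)]$. By continuity of $f$,
\[
f(\lambda, y_\infty) = \lim_{k \uparrow \infty} f(\lambda_{n_k}, y_k) = w,
\]
and so $y_\infty$ must coincide with $y^\lambda$.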
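(c): Recall that
\[
X_a^\lambda = I_a^+(\lambda_a^{-1} \varphi^\lambda)
\tag{3.62}
\]
for any $a \in A$. By feasibility,
\[
W = \sum_{a \in A} X_a^\lambda = f(\lambda, \varphi^\lambda).
\]
Thus, $\varphi^{\lambda_n}$ converges $P$-a.s. to $\varphi^\lambda$ as $n \uparrow \infty$ due to part (b), and so $X^{\lambda_n}$ converges $P$-a.s. to $X^\lambda$ due to (3.62). This completes the proof in (a) that the map $g$ is continuous.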
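
The construction in parts (b) and (c) can be sketched numerically for a single scenario. The following Python sketch is only an illustration under assumptions that are not part of the text: power utilities with $u_a'(x) = x^{-\gamma_a}$, $\gamma_a > 0$ (so that $I_a^+(y) = y^{-1/\gamma_a}$), strictly positive weights $\lambda_a$, and a fixed aggregate endowment `w`. The function names `f`, `solve_price`, and `allocation` are hypothetical.

```python
import math

def f(lam, gammas, y):
    """f(lambda, y) = sum_a I_a^+(y / lambda_a), assuming u_a'(x) = x**(-gamma_a)."""
    return sum((y / l) ** (-1.0 / g) for l, g in zip(lam, gammas))

def solve_price(lam, gammas, w, lo=1e-12, hi=1e12, tol=1e-12):
    """Find the unique y with f(lambda, y) = w by bisection on a log scale.

    f(lambda, .) is strictly decreasing, so the root is bracketed by [lo, hi]
    as long as f(lo) > w > f(hi).
    """
    assert f(lam, gammas, lo) > w > f(lam, gammas, hi)
    while hi / lo > 1.0 + tol:
        mid = math.sqrt(lo * hi)      # geometric midpoint of the bracket
        if f(lam, gammas, mid) > w:
            lo = mid                  # total demand too large: the price must rise
        else:
            hi = mid
    return math.sqrt(lo * hi)

def allocation(lam, gammas, w):
    """Return the price y and the allocation X_a = I_a^+(y / lambda_a)."""
    y = solve_price(lam, gammas, w)
    return y, [(y / l) ** (-1.0 / g) for l, g in zip(lam, gammas)]

lam = [0.5, 0.3, 0.2]       # weights in the simplex (illustrative values)
gammas = [1.0, 2.0, 0.5]    # risk aversion parameters (illustrative values)
price, alloc = allocation(lam, gammas, w=3.0)
```

By construction the resulting allocation is feasible, i.e., its components sum to `w` up to numerical tolerance. For logarithmic utilities ($\gamma_a = 1$ for all $a$) the equation reduces to $1/y = w$.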

Remark 3.68. In order to simplify the exposition, we have restricted the discussion of equilibrium prices to contingent claims with payoff at time $t = 1$. We have argued in terms of discounted payoffs, and so we have implicitly assumed that the interest rate $r$ has already been fixed. From an economic point of view, also the interest rate should be determined by an equilibrium argument. This requires an intertemporal extension of our setting, which distinguishes between deterministic payoffs $y$ at time $t = 0$ and nominal contingent payoffs $Y$ at time $t = 1$. Thus, we replace $\mathcal{X} = L^0_+$ by the space
\[
\mathcal{Y} := \{ \bar{Y} = (y, Y) \mid y \in [0, \infty),\ Y \in L^0_+ \}.
\]
A pricing rule is given by a linear functional on $\mathcal{Y}$ of the form
\[
\Phi(\bar{Y}) := \varphi_0 \cdot y + E[\, \varphi Y \,],
\]
where $\varphi_0 \in (0, \infty)$ and $\varphi$ is a price density as before. Any such price system specifies an interest rate for transferring income from time $t = 0$ to time $t = 1$. Indeed, comparing the forward price $c \cdot E[\, \varphi \,]$ for the fixed amount $c$ to be delivered at time 1 with the spot price $c \cdot \varphi_0$ for the amount $c$ made available at time 0, we see that the implicit interest rate is given by
\[
1 + r = \frac{\varphi_0}{E[\, \varphi \,]}.
\]
If we describe the preferences of agent $a \in A$ by a utility functional of the form
\[
U_a(\bar{Y}) = u_{a,0}(y) + E[\, u_{a,1}(Y) \,]
\]
with smooth utility functions $u_{a,0}$ and $u_{a,1}$, then we can show along the lines of the preceding discussion that an Arrow-Debreu equilibrium exists in this extended setting. Thus, we obtain an equilibrium allocation $(\bar{Y}_a^*)_{a \in A}$ and an equilibrium price system $\Phi^* = (\varphi_0^*, \varphi^*)$ such that each $\bar{Y}_a^*$ maximizes the functional $U_a$ in the agent's budget set determined by an initial endowment in $\mathcal{Y}$ and by the pricing rule $\Phi^*$. In particular, we have then specified an equilibrium interest rate $r^*$. Normalizing the price system to $\varphi_0^* = 1$ and defining $P^*$ as a probability measure with density $\varphi^* / E[\, \varphi^* \,]$, we see that the price at time $t = 0$ of a contingent claim with nominal payoff $Y \ge 0$ at time $t = 1$ is given as the expectation
\[
E^*\Big[ \frac{Y}{1 + r^*} \Big]
\]
of the discounted claim with respect to the measure $P^*$.

Let us now extend the discussion to situations where agents are heterogeneous not only in their utility functions but also in their expectations. Thus, we assume that the preferences of agent $a \in A$ are described by a Savage functional of the form
\[
U_a(X) := E_{Q_a}[\, u_a(X) \,],
\]
where $Q_a$ is a probability measure on $(\Omega, \mathcal{F})$ which is equivalent to $P$. In addition to our assumption
\[
\limsup_{x \downarrow 0} x\, u_a'(x) < \infty, \qquad a \in A,
\tag{3.63}
\]
we assume that
\[
E_{Q_a}[\, W \,] < \infty
\quad \text{and} \quad
E_{Q_a}\Big[ u_a'\Big( \frac{W}{|A|} \Big) \Big] < \infty, \qquad a \in A.
\tag{3.64}
\]
As before, a feasible allocation $(X_a^*)_{a \in A}$ together with a price density $\varphi^*$ is called an Arrow-Debreu equilibrium if each $X_a^*$ maximizes the functional $U_a$ on the budget set of agent $a \in A$, which is determined by $\varphi^*$.
Theorem 3.69. Under assumptions (3.45), (3.63), and (3.64), there exists an Arrow-Debreu equilibrium.
Proof. For any $\lambda \in \Lambda$, the general argument of Remark 3.37 yields the existence of a $\lambda$-efficient allocation $(X_a^\lambda)_{a \in A}$, i.e., of a feasible allocation which maximizes the functional
\[
U^\lambda(X) := \sum_{a \in A} \lambda_a U_a(X_a)
\]
over all feasible allocations $X = (X_a)_{a \in A}$. Since
\[
U_a(X_a^\lambda) = E[\, \varphi_a u_a(X_a^\lambda) \,],
\]
$(X_a^\lambda)_{a \in A}$ can be viewed as a $\lambda$-efficient allocation in the model where agents have random utility functions of the form
\[
\tilde{u}_a(x, \omega) = u_a(x)\, \varphi_a(\omega),
\]
while their expectations are homogeneous and given by $P$. In view of Corollary 3.43, it follows as before that $X^\lambda$ satisfies the first order conditions
\[
X_a^\lambda = I_a^+(\lambda_a^{-1} \varphi_a^{-1} \varphi^\lambda), \qquad a \in A,
\]
with
\[
\varphi^\lambda = \max_{a \in A} \lambda_a u_a'(X_a^\lambda)\, \varphi_a,
\]
and that $X_a^\lambda$ satisfies
\[
U_a(X_a^\lambda) \ge E[\, u_a(Y_a)\, \varphi_a \,] = U_a(Y_a)
\]
for all $Y_a$ in the budget set of agent $a \in A$. The remaining arguments are essentially the same as in the proof of Theorem 3.65.

Chapter 4

Monetary measures of risk

In this chapter, we discuss the problem of quantifying the risk of a financial position. As in Chapter 2, such a position will be described by the corresponding payoff
profile, that is, by a real-valued function X on some set of possible scenarios. In a
probabilistic model, specified by a probability measure on scenarios, we could focus
on the resulting distribution of X and try to measure the risk in terms of moments or
quantiles. Note that a classical measure of risk such as the variance does not capture
a basic asymmetry in the financial interpretation of X: Here it is the downside risk
that matters. This asymmetry is taken into account by measures such as Value at Risk
which are based on quantiles for the lower tail of the distribution, see Section 4.4 below. Value at Risk, however, fails to satisfy some natural consistency requirements.
Such observations have motivated the systematic investigation of measures of risk that
satisfy certain basic axioms.
From the point of view of an investor, we could simply turn around the discussion of Chapter 2 and measure the risk of a position $X$ in terms of the loss functional
\[
L(X) = -U(X).
\]
Here $U$ is a utility functional representing a given preference relation on financial positions. Assuming robust preferences, we are led to the notion of robust shortfall risk defined by
\[
L(X) = \sup_{Q \in \mathcal{Q}} E_Q[\, \ell(-X) \,],
\]
where $\ell(x) := -u(-x)$ is a convex increasing loss function and $\mathcal{Q}$ is a class of probability measures. The results of Section 2.5 show how such loss functionals can be characterized in terms of convexity and monotonicity properties of the preference relation. In particular, a financial position could be viewed as being acceptable if the robust shortfall risk of $X$ does not exceed a given bound.
From the point of view of a supervising agency, however, a specific monetary purpose comes into play. In this perspective a risk measure is viewed as a capital requirement: We are looking for the minimal amount of capital which, if added to the position and invested in a risk-free manner, makes the position acceptable. This monetary interpretation is captured by an additional axiom of cash invariance. Together with convexity and monotonicity, it singles out the class of convex risk measures. These measures can be represented in the form
\[
\rho(X) = \sup_{Q} \big( E_Q[\, -X \,] - \alpha(Q) \big),
\]
where $\alpha$ is a penalty function defined on probability measures on $\Omega$. Under the additional condition of positive homogeneity, we obtain the class of coherent risk measures. Here we are back to the situation in Proposition 2.84, and the representation takes the form
\[
\rho(X) = \sup_{Q \in \mathcal{Q}} E_Q[\, -X \,],
\]
where $\mathcal{Q}$ is some class of probability measures on $\Omega$.


The axiomatic approach to such monetary risk measures was initiated by P. Artzner,
F. Delbaen, J. Eber, and D. Heath [12], and it will be developed in the first three sections. In Section 4.4 we discuss some coherent risk measures related to Value at Risk.
These risk measures only involve the distribution of a position under a given probability measure. In Section 4.5 we characterize the class of convex risk measures which
share this property of law-invariance. Section 4.6 discusses the role of concave distortions, and in Section 4.7 the resulting risk measures are characterized by a property
of comonotonicity. In Section 4.8 we discuss risk measures which arise naturally in
the context of a financial market model. In Section 4.9 we analyze the structure of
monetary risk measures which are induced by our notion of robust shortfall risk.

4.1

Risk measures and their acceptance sets

Let $\Omega$ be a fixed set of scenarios. A financial position is described by a mapping $X : \Omega \to \mathbb{R}$ where $X(\omega)$ is the discounted net worth of the position at the end of the trading period if the scenario $\omega \in \Omega$ is realized. The discounted net worth corresponds to the profits and losses of the position and is also called the P&L. Our aim is to quantify the risk of $X$ by some number $\rho(X)$, where $X$ belongs to a given class $\mathcal{X}$ of financial positions. Throughout this section, $\mathcal{X}$ will be a linear space of bounded functions containing the constants. We do not assume that a probability measure is given on $\Omega$.
Definition 4.1. A mapping $\rho : \mathcal{X} \to \mathbb{R}$ is called a monetary risk measure if it satisfies the following conditions for all $X, Y \in \mathcal{X}$:
• Monotonicity: If $X \le Y$, then $\rho(X) \ge \rho(Y)$.
• Cash invariance: If $m \in \mathbb{R}$, then $\rho(X + m) = \rho(X) - m$.
The financial meaning of monotonicity is clear: The downside risk of a position is reduced if the payoff profile is increased. Cash invariance is also called translation invariance or translation property. It is motivated by the interpretation of $\rho(X)$ as a capital requirement, i.e., $\rho(X)$ is the amount which should be added to the position $X$ in order to make it acceptable from the point of view of a supervising agency. Thus, if the amount $m$ is added to the position and invested in a risk-free manner, the capital requirement is reduced by the same amount. In particular, cash invariance implies
\[
\rho(X + \rho(X)) = 0,
\tag{4.1}
\]
and
\[
\rho(m) = \rho(0) - m \quad \text{for all } m \in \mathbb{R}.
\]
For most purposes it would be no loss of generality to assume that a given monetary risk measure satisfies the condition of
• Normalization: $\rho(0) = 0$.
For a normalized risk measure, cash invariance is equivalent to cash additivity, i.e., to $\rho(X + m) = \rho(X) + \rho(m)$. In some situations, however, it will be convenient not to insist on normalization.
Remark 4.2. We are using the convention that $X$ describes the worth of a financial position after discounting. For instance, the discounting factor can be chosen as $1/(1 + r)$ where $r$ is the return of a risk-free investment. Instead of measuring the risk of the discounted position $X$, one could consider directly the nominal worth
\[
\tilde{X} = (1 + r) X.
\]
The corresponding risk measure $\tilde{\rho}(\tilde{X}) := \rho(X)$ is again monotone. Cash invariance is replaced by the following property:
\[
\tilde{\rho}(\tilde{X} + (1 + r) m) = \tilde{\rho}(\tilde{X}) - m,
\tag{4.2}
\]
i.e., the risk is reduced by $m$ if an additional amount $m$ is invested in a risk-free manner. Conversely, any $\tilde{\rho} : \mathcal{X} \to \mathbb{R}$ which is monotone and satisfies (4.2) defines a monetary measure of risk via $\rho(X) := \tilde{\rho}((1 + r) X)$.
◊
Lemma 4.3. Any monetary risk measure $\rho$ is Lipschitz continuous with respect to the supremum norm $\| \cdot \|$:
\[
|\rho(X) - \rho(Y)| \le \|X - Y\|.
\]
Proof. Clearly, $X \le Y + \|X - Y\|$, and so $\rho(Y) - \|X - Y\| \le \rho(X)$ by monotonicity and cash invariance. Reversing the roles of $X$ and $Y$ yields the assertion.
From now on we concentrate on monetary risk measures which have an additional convexity property.
Definition 4.4. A monetary risk measure $\rho : \mathcal{X} \to \mathbb{R}$ is called a convex risk measure if it satisfies
• Convexity: $\rho(\lambda X + (1 - \lambda) Y) \le \lambda \rho(X) + (1 - \lambda) \rho(Y)$, for $0 \le \lambda \le 1$.
Consider the collection of possible future outcomes that can be generated with the resources available to an investor: One investment strategy leads to $X$, while a second strategy leads to $Y$. If one diversifies, spending only the fraction $\lambda$ of the resources on the first possibility and using the remaining part for the second alternative, one obtains $\lambda X + (1 - \lambda) Y$. Thus, the axiom of convexity gives a precise meaning to the idea that diversification should not increase the risk. This idea becomes even clearer when we note that, for a monetary risk measure, convexity is in fact equivalent to the weaker requirement of
• Quasi Convexity: $\rho(\lambda X + (1 - \lambda) Y) \le \rho(X) \vee \rho(Y)$ for $0 \le \lambda \le 1$.
Exercise 4.1.1. Prove that a monetary risk measure is quasi-convex if and only if it is convex.
◊
Exercise 4.1.2. Show that if $\rho$ is convex and normalized, then
\[
\rho(\lambda X) \le \lambda \rho(X) \quad \text{for } 0 \le \lambda \le 1,
\]
\[
\rho(\lambda X) \ge \lambda \rho(X) \quad \text{for } \lambda \ge 1.
\]

Definition 4.5. A convex risk measure $\rho$ is called a coherent risk measure if it satisfies
• Positive Homogeneity: If $\lambda \ge 0$, then $\rho(\lambda X) = \lambda \rho(X)$.
When a monetary risk measure $\rho$ is positively homogeneous, then it is normalized, i.e., $\rho(0) = 0$. Under the assumption of positive homogeneity, convexity is equivalent to
• Subadditivity: $\rho(X + Y) \le \rho(X) + \rho(Y)$.
This property allows one to decentralize the task of managing the risk arising from a collection of different positions: If separate risk limits are given to different desks, then the risk of the aggregate position is bounded by the sum of the individual risk limits.
In many situations, however, risk may grow in a non-linear way as the size of the position increases. For this reason we will not insist on positive homogeneity. Instead, our focus will be mainly on convex measures of risk.
Exercise 4.1.3. Let $\rho$ be a normalized monetary risk measure on $\mathcal{X}$. Show that any two of the following properties imply the remaining third.
• Convexity.
• Positive homogeneity.
• Subadditivity.

A monetary risk measure $\rho$ induces the class
\[
\mathcal{A}_\rho := \{ X \in \mathcal{X} \mid \rho(X) \le 0 \}
\]
of positions which are acceptable in the sense that they do not require additional capital. The class $\mathcal{A}_\rho$ will be called the acceptance set of $\rho$. The following two propositions summarize the relations between monetary risk measures and their acceptance sets.

Proposition 4.6. Suppose that $\rho$ is a monetary risk measure with acceptance set $\mathcal{A} := \mathcal{A}_\rho$.
(a) $\mathcal{A}$ is non-empty, closed in $\mathcal{X}$ with respect to the supremum norm $\| \cdot \|$, and satisfies the following two conditions:
\[
\inf\{ m \in \mathbb{R} \mid m \in \mathcal{A} \} > -\infty.
\tag{4.3}
\]
\[
X \in \mathcal{A},\ Y \in \mathcal{X},\ Y \ge X \implies Y \in \mathcal{A}.
\tag{4.4}
\]
(b) $\rho$ can be recovered from $\mathcal{A}$:
\[
\rho(X) = \inf\{ m \in \mathbb{R} \mid m + X \in \mathcal{A} \}.
\tag{4.5}
\]
(c) $\rho$ is a convex risk measure if and only if $\mathcal{A}$ is convex.
(d) $\rho$ is positively homogeneous if and only if $\mathcal{A}$ is a cone. In particular, $\rho$ is coherent if and only if $\mathcal{A}$ is a convex cone.
Proof. (a): Properties (4.3) and (4.4) are straightforward, and closedness follows from Lemma 4.3.
(b): Cash invariance implies that for $X \in \mathcal{X}$,
\[
\begin{aligned}
\inf\{ m \in \mathbb{R} \mid m + X \in \mathcal{A} \} &= \inf\{ m \in \mathbb{R} \mid \rho(m + X) \le 0 \} \\
&= \inf\{ m \in \mathbb{R} \mid \rho(X) \le m \} \\
&= \rho(X).
\end{aligned}
\]
(c): $\mathcal{A}$ is clearly convex if $\rho$ is a convex measure of risk. The converse will follow from Proposition 4.7 together with (4.7).
(d): Clearly, positive homogeneity of $\rho$ implies that $\mathcal{A}$ is a cone. The converse follows as in (c).
Conversely, one can take a given class $\mathcal{A} \subset \mathcal{X}$ of acceptable positions as the primary object. For a position $X \in \mathcal{X}$, we can then define the capital requirement as the minimal amount $m$ for which $m + X$ becomes acceptable:
\[
\rho_{\mathcal{A}}(X) := \inf\{ m \in \mathbb{R} \mid m + X \in \mathcal{A} \}.
\tag{4.6}
\]
Note that, with this notation, (4.5) takes the form
\[
\rho_{\mathcal{A}_\rho} = \rho.
\tag{4.7}
\]

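Proposition 4.7. Assume that $\mathcal{A}$ is a non-empty subset of $\mathcal{X}$ which satisfies (4.3) and (4.4). Then the functional $\rho_{\mathcal{A}}$ has the following properties:
(a) $\rho_{\mathcal{A}}$ is a monetary risk measure.
(b) If $\mathcal{A}$ is a convex set, then $\rho_{\mathcal{A}}$ is a convex risk measure.
(c) If $\mathcal{A}$ is a cone, then $\rho_{\mathcal{A}}$ is positively homogeneous. In particular, $\rho_{\mathcal{A}}$ is a coherent risk measure if $\mathcal{A}$ is a convex cone.
(d) $\mathcal{A}$ is a subset of $\mathcal{A}_{\rho_{\mathcal{A}}}$, and $\mathcal{A} = \mathcal{A}_{\rho_{\mathcal{A}}}$ holds if and only if $\mathcal{A}$ is $\| \cdot \|$-closed in $\mathcal{X}$.
Proof. (a): It is straightforward to verify that $\rho_{\mathcal{A}}$ satisfies cash invariance and monotonicity. We show next that $\rho_{\mathcal{A}}$ takes only finite values. To this end, fix some $Y$ in the non-empty set $\mathcal{A}$. For $X \in \mathcal{X}$ given, there exists a finite number $m$ with $m + X > Y$, because $X$ and $Y$ are both bounded. Then
\[
\rho_{\mathcal{A}}(X) - m = \rho_{\mathcal{A}}(m + X) \le \rho_{\mathcal{A}}(Y) \le 0,
\]
and hence $\rho_{\mathcal{A}}(X) \le m < \infty$. Note that (4.3) is equivalent to $\rho_{\mathcal{A}}(0) > -\infty$. To show that $\rho_{\mathcal{A}}(X) > -\infty$ for arbitrary $X \in \mathcal{X}$, we take $m'$ such that $X + m' \le 0$ and conclude by monotonicity and cash invariance that $\rho_{\mathcal{A}}(X) \ge \rho_{\mathcal{A}}(0) + m' > -\infty$.
(b): Suppose that $X_1, X_2 \in \mathcal{X}$ and that $m_1, m_2 \in \mathbb{R}$ are such that $m_i + X_i \in \mathcal{A}$. If $\lambda \in [0, 1]$, then the convexity of $\mathcal{A}$ implies that $\lambda (m_1 + X_1) + (1 - \lambda)(m_2 + X_2) \in \mathcal{A}$. Thus, by the cash invariance of $\rho_{\mathcal{A}}$,
\[
\begin{aligned}
0 &\ge \rho_{\mathcal{A}}\big( \lambda (m_1 + X_1) + (1 - \lambda)(m_2 + X_2) \big) \\
&= \rho_{\mathcal{A}}\big( \lambda X_1 + (1 - \lambda) X_2 \big) - \big( \lambda m_1 + (1 - \lambda) m_2 \big),
\end{aligned}
\]
and the convexity of $\rho_{\mathcal{A}}$ follows.
(c): As in the proof of convexity, we obtain that $\rho_{\mathcal{A}}(\lambda X) \le \lambda \rho_{\mathcal{A}}(X)$ for $\lambda \ge 0$ if $\mathcal{A}$ is a cone. To prove the converse inequality, let $m < \rho_{\mathcal{A}}(X)$. Then $m + X \notin \mathcal{A}$ and hence $\lambda m + \lambda X \notin \mathcal{A}$ for $\lambda \ge 0$. Thus $\lambda m < \rho_{\mathcal{A}}(\lambda X)$, and (c) follows.
(d): The inclusion $\mathcal{A} \subset \mathcal{A}_{\rho_{\mathcal{A}}}$ is obvious, and Proposition 4.6 implies that $\mathcal{A}$ is $\| \cdot \|$-closed as soon as $\mathcal{A} = \mathcal{A}_{\rho_{\mathcal{A}}}$. Conversely, assume that $\mathcal{A}$ is $\| \cdot \|$-closed. We have to show that $X \notin \mathcal{A}$ implies that $\rho_{\mathcal{A}}(X) > 0$. To this end, take $m > \|X\|$. Since $\mathcal{A}$ is $\| \cdot \|$-closed and $X \notin \mathcal{A}$, there is some $\lambda \in (0, 1)$ such that $\lambda m + (1 - \lambda) X \notin \mathcal{A}$. Thus,
\[
0 \le \rho_{\mathcal{A}}\big( \lambda m + (1 - \lambda) X \big) = \rho_{\mathcal{A}}\big( (1 - \lambda) X \big) - \lambda m.
\]
Since $\rho_{\mathcal{A}}$ is a monetary risk measure, Lemma 4.3 shows that
\[
\big| \rho_{\mathcal{A}}((1 - \lambda) X) - \rho_{\mathcal{A}}(X) \big| \le \lambda \|X\|.
\]
Hence,
\[
\rho_{\mathcal{A}}(X) \ge \rho_{\mathcal{A}}((1 - \lambda) X) - \lambda \|X\| \ge \lambda (m - \|X\|) > 0.
\]
Exercise 4.1.4. Let $\mathcal{A}$ be a non-empty subset of $\mathcal{X}$ which satisfies (4.3) and (4.4). Show that $\mathcal{A}$ is $\| \cdot \|$-closed if and only if $\mathcal{A}$ satisfies the following closure property: if $X + m \in \mathcal{A}$ for all $m > 0$ then $X \in \mathcal{A}$.
◊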


In the following examples, we take $\mathcal{X}$ as the linear space of all bounded measurable functions on some measurable space $(\Omega, \mathcal{F})$, and we denote by $\mathcal{M}_1 = \mathcal{M}_1(\Omega, \mathcal{F})$ the class of all probability measures on $(\Omega, \mathcal{F})$.
Example 4.8. Consider the worst-case risk measure $\rho_{\max}$ defined by
\[
\rho_{\max}(X) = -\inf_{\omega \in \Omega} X(\omega) \quad \text{for all } X \in \mathcal{X}.
\]
The value $\rho_{\max}(X)$ is the least upper bound for the potential loss which can occur in any scenario. The corresponding acceptance set $\mathcal{A}$ is given by the convex cone of all non-negative functions in $\mathcal{X}$. Thus, $\rho_{\max}$ is a coherent risk measure. It is the most conservative measure of risk in the sense that any normalized monetary risk measure $\rho$ on $\mathcal{X}$ satisfies
\[
\rho(X) \le \rho\Big( \inf_{\omega \in \Omega} X(\omega) \Big) = \rho_{\max}(X).
\]
Note that $\rho_{\max}$ can be represented in the form
\[
\rho_{\max}(X) = \sup_{Q \in \mathcal{Q}} E_Q[\, -X \,],
\tag{4.8}
\]
where $\mathcal{Q}$ is the class $\mathcal{M}_1$ of all probability measures on $(\Omega, \mathcal{F})$.
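
On a finite scenario set the representation (4.8) is easy to check by hand, because the supremum over all probability measures is attained at a point mass on the worst scenario. The following Python sketch illustrates this under the assumption that $\Omega$ has three scenarios and a position is stored as a list of payoffs; the helper names are hypothetical.

```python
def rho_max(X):
    """Worst-case risk measure on a finite scenario set: minus the smallest payoff."""
    return -min(X)

def expected_loss(Q, X):
    """E_Q[-X] for a probability vector Q on the same finite scenario set."""
    return sum(q * (-x) for q, x in zip(Q, X))

X = [1.5, -2.0, 0.25]   # illustrative payoff profile

# Point masses (Dirac measures) on each scenario.
diracs = [[1.0 if i == j else 0.0 for j in range(len(X))] for i in range(len(X))]
sup_over_diracs = max(expected_loss(Q, X) for Q in diracs)
```

Any other probability vector is a mixture of point masses, so its expected loss cannot exceed `sup_over_diracs`, which coincides with `rho_max(X)`.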

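Example 4.9. Let $\mathcal{Q}$ be a set of probability measures on $(\Omega, \mathcal{F})$, and consider a mapping $\gamma : \mathcal{Q} \to \mathbb{R}$ with $\sup_Q \gamma(Q) < \infty$, which specifies for each $Q \in \mathcal{Q}$ some floor $\gamma(Q)$. Suppose that a position $X$ is acceptable if
\[
E_Q[\, X \,] \ge \gamma(Q) \quad \text{for all } Q \in \mathcal{Q}.
\]
The set $\mathcal{A}$ of such positions satisfies (4.3) and (4.4), and it is convex. Thus, the associated monetary risk measure $\rho = \rho_{\mathcal{A}}$ is convex, and it takes the form
\[
\rho(X) = \sup_{Q \in \mathcal{Q}} \big( \gamma(Q) - E_Q[\, X \,] \big).
\]
Alternatively, we can write
\[
\rho(X) = \sup_{Q \in \mathcal{M}_1} \big( E_Q[\, -X \,] - \alpha(Q) \big),
\tag{4.9}
\]
where the penalty function $\alpha : \mathcal{M}_1 \to (-\infty, \infty]$ is defined by $\alpha(Q) = -\gamma(Q)$ for $Q \in \mathcal{Q}$ and $\alpha(Q) = +\infty$ otherwise. Note that $\rho$ is a coherent risk measure if $\gamma(Q) = 0$ for all $Q \in \mathcal{Q}$.
◊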
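
The passage from an acceptance set to a capital requirement in (4.6) can be illustrated with the floors of Example 4.9. The Python sketch below assumes a finite scenario set and a finite family of probability vectors with floors (all numbers are illustrative). It locates $\rho_{\mathcal{A}}(X)$ by bisection, using only the acceptability test, and compares the result with the closed form $\sup_Q (\gamma(Q) - E_Q[X])$. The function names are hypothetical.

```python
def expectation(Q, X):
    """E_Q[X] on a finite scenario set."""
    return sum(q * x for q, x in zip(Q, X))

def is_acceptable(X, measures, floors):
    """X is acceptable if E_Q[X] >= gamma(Q) for every Q in the family."""
    return all(expectation(Q, X) >= g for Q, g in zip(measures, floors))

def capital_requirement(X, measures, floors, lo=-1e6, hi=1e6, tol=1e-10):
    """rho_A(X) = inf{m : m + X acceptable}, located by bisection.

    Relies on property (4.4): if m + X is acceptable, so is m' + X for m' >= m.
    The bracket [lo, hi] is an assumption that suits bounded toy examples.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_acceptable([x + mid for x in X], measures, floors):
            hi = mid
        else:
            lo = mid
    return hi

def closed_form(X, measures, floors):
    """sup_Q (gamma(Q) - E_Q[X]) over the finite family."""
    return max(g - expectation(Q, X) for Q, g in zip(measures, floors))

measures = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6], [1 / 3, 1 / 3, 1 / 3]]
floors = [0.0, -0.5, 0.1]
X = [2.0, -1.0, -3.0]

rho_bisect = capital_requirement(X, measures, floors)
rho_closed = closed_form(X, measures, floors)
```

The two values agree up to the bisection tolerance; here the second measure, which puts most weight on the worst scenario, determines the capital requirement.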
Example 4.10. Consider a utility function $u$ on $\mathbb{R}$, a probability measure $Q \in \mathcal{M}_1$, and fix some threshold $c \in \mathbb{R}$. Let us call a position $X$ acceptable if its certainty equivalent is at least $c$, i.e., if its expected utility $E_Q[\, u(X) \,]$ is bounded from below by $u(c)$. Clearly, the set
\[
\mathcal{A} := \{ X \in \mathcal{X} \mid E_Q[\, u(X) \,] \ge u(c) \}
\]
is non-empty, convex, and satisfies (4.3) and (4.4). Thus, $\rho_{\mathcal{A}}$ is a convex risk measure. As an obvious robust extension, we can define acceptability in terms of a whole class $\mathcal{Q}$ of probability measures on $(\Omega, \mathcal{F})$, i.e.,
\[
\mathcal{A} := \bigcap_{Q \in \mathcal{Q}} \{ X \in \mathcal{X} \mid E_Q[\, u(X) \,] \ge u(c_Q) \},
\]
with constants $c_Q$ such that $\sup_{Q \in \mathcal{Q}} c_Q < \infty$. The corresponding risk measures will be studied in more detail in Section 4.9.
◊
Example 4.11. Suppose now that we have specified a probabilistic model, i.e., a probability measure $P$ on $(\Omega, \mathcal{F})$. In this context, a position $X$ is often considered to be acceptable if the probability of a loss is bounded by a given level $\lambda \in (0, 1)$, i.e., if
\[
P[\, X < 0 \,] \le \lambda.
\]
The corresponding monetary risk measure $V@R_\lambda$, defined by
\[
V@R_\lambda(X) = \inf\{ m \in \mathbb{R} \mid P[\, m + X < 0 \,] \le \lambda \},
\]
is called Value at Risk at level $\lambda$. Note that it is well defined on the space $L^0(\Omega, \mathcal{F}, P)$ of all random variables which are $P$-a.s. finite, and that
\[
V@R_\lambda(X) = E[\, -X \,] + \Phi^{-1}(1 - \lambda)\, \sigma(X),
\tag{4.10}
\]
if $X$ is a Gaussian random variable with variance $\sigma^2(X)$ and $\Phi^{-1}$ denotes the inverse of the distribution function $\Phi$ of $N(0, 1)$. Clearly, $V@R_\lambda$ is positively homogeneous, but in general it is not convex, as shown by Example 4.46 below. In Section 4.4, Value at Risk will be discussed in detail. In particular, we will study some closely related coherent and convex risk measures.
◊
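
Formula (4.10) can be checked numerically against the defining condition $P[\, m + X < 0 \,] \le \lambda$. The following Python sketch uses `statistics.NormalDist` from the standard library; the parameter values and the helper names `var_gaussian` and `loss_probability` are illustrative assumptions.

```python
from statistics import NormalDist

def var_gaussian(mu, sigma, lam):
    """V@R at level lam for X ~ N(mu, sigma^2), following formula (4.10)."""
    return -mu + NormalDist().inv_cdf(1.0 - lam) * sigma

def loss_probability(mu, sigma, m):
    """P[m + X < 0] for X ~ N(mu, sigma^2)."""
    return NormalDist(mu, sigma).cdf(-m)

mu, sigma, lam = 0.05, 0.2, 0.01   # illustrative parameters
v = var_gaussian(mu, sigma, lam)

# At m = v the loss probability equals lam; any smaller m violates the bound.
p_at_v = loss_probability(mu, sigma, v)
```

Since the Gaussian distribution function is continuous and strictly increasing, the infimum in the definition is attained exactly where the loss probability equals $\lambda$.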
Exercise 4.1.5. Compute $V@R_\lambda(X)$ when $X$ is
(a) uniform,
(b) log-normally distributed, i.e., $X = e^{\sigma Z + m}$ with $Z \sim N(0, 1)$ and $m, \sigma \in \mathbb{R}$.
◊
Example 4.12. As in Example 4.11, we fix a probability measure $P$ on $(\Omega, \mathcal{F})$. For an asset with payoff $\tilde{X} \in L^2 = L^2(\Omega, \mathcal{F}, P)$, price $\pi(\tilde{X})$, and variance $\sigma^2(\tilde{X}) \ne 0$, the Sharpe ratio is defined as
\[
\frac{E[\, \tilde{X} \,] - \pi(\tilde{X})(1 + r)}{\sigma(\tilde{X})} = \frac{E[\, X \,]}{\sigma(X)},
\]
where $X := \tilde{X}(1 + r)^{-1} - \pi(\tilde{X})$ is the corresponding discounted net worth. Suppose that we find the position $X$ acceptable if the Sharpe ratio is bounded from below by some constant $c > 0$. The resulting functional $\rho_c$ on $L^2$ defined by (4.6) for the class
\[
\mathcal{A}_c := \{ X \in L^2 \mid E[\, X \,] \ge c \cdot \sigma(X) \}
\]
is given by
\[
\rho_c(X) = E[\, -X \,] + c \cdot \sigma(X).
\]
It is sometimes called the mean-standard deviation risk measure. It is cash invariant and positively homogeneous, and it is convex since $\sigma(\cdot)$ is a convex functional on $L^2$. But $\rho_c$ is not a monetary risk measure, because it is not monotone. Indeed, if $X = e^Z$ and $Z$ is a random variable with normal distribution $N(0, \sigma^2)$, then $X \ge 0$ but
\[
\rho_c(X) = -e^{\sigma^2/2} + c\, e^{\sigma^2/2} \sqrt{e^{\sigma^2} - 1}
\]
becomes positive for large enough $\sigma$. Note, however, that (4.10) shows that $\rho_c(X)$ coincides with $V@R_\lambda(X)$ if $X$ is Gaussian and if $c = \Phi^{-1}(1 - \lambda)$ with $0 < \lambda \le 1/2$. Thus, both $\rho_c$ and $V@R_\lambda$ have all the properties of a coherent risk measure if restricted to a Gaussian subspace $\tilde{\mathcal{X}}$ of $L^2$, i.e., a linear space consisting of normally distributed random variables. But neither $\rho_c$ nor $V@R_\lambda$ can be coherent on the full space $L^2$, since the existence of normal random variables on $(\Omega, \mathcal{F}, P)$ implies that $\mathcal{X}$ will also contain random variables as considered in Example 4.46.
◊
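
The failure of monotonicity in Example 4.12 can be made concrete by evaluating the closed-form expression for $\rho_c(e^Z)$. The Python sketch below uses illustrative values of $c$ and $\sigma$; the function name is hypothetical.

```python
import math

def rho_c_lognormal(c, sigma):
    """rho_c(X) = E[-X] + c * sd(X) for X = exp(Z) with Z ~ N(0, sigma^2)."""
    mean = math.exp(sigma ** 2 / 2.0)                   # E[X]
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)   # standard deviation of X
    return -mean + c * sd

# For c = 1 the sign changes between sigma = 0.5 and sigma = 1.0.
values = {s: rho_c_lognormal(1.0, s) for s in (0.5, 1.0, 2.0)}
```

Setting the expression to zero shows that $\rho_c(e^Z) > 0$ exactly when $\sigma^2 > \log(1 + 1/c^2)$, although $e^Z \ge 0$ and a monotone normalized risk measure would have to assign it a non-positive value.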
Example 4.13. Let $u : \mathbb{R} \to \mathbb{R}$ be a strictly increasing continuous function. For $X \in \mathcal{X} := L^\infty(\Omega, \mathcal{F}, P)$ we consider the certainty equivalent of the law of $X$ under $P$ as a functional of $X$ by setting
\[
c(X) := u^{-1}\big( E[\, u(X) \,] \big).
\]
Then $\rho(X) := -c(X)$ is monotone,
\[
X \ge Y \implies \rho(X) \le \rho(Y).
\]
If $\rho$ is also cash invariant, and hence a monetary risk measure, then Proposition 2.46 shows that $u$ is either linear or a function with exponential form: $u(x) = a + b e^{\beta x}$ or $u(x) = a - b e^{-\beta x}$ for constants $a \in \mathbb{R}$ and $b, \beta > 0$. In the linear case we have
\[
\rho(X) = E[\, -X \,].
\]
In the first exponential case $\rho$ is of the form
\[
\rho(X) = -\frac{1}{\beta} \log E[\, e^{\beta X} \,].
\]
In the second exponential case $\rho$ is given by
\[
\rho(X) = \frac{1}{\beta} \log E[\, e^{-\beta X} \,]
\tag{4.11}
\]
and called the entropic risk measure for reasons that will become clear in Example 4.34. There (and in Exercise 4.1.6 below) we will also see that $\rho$ is a convex risk measure.
◊
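
The entropic risk measure (4.11) is straightforward to evaluate on a finite probability space. The following Python sketch assumes three scenarios with illustrative probabilities and writes the parameter as `beta`; it uses a log-sum-exp shift to avoid overflow. The function name is hypothetical.

```python
import math

def entropic_risk(X, P, beta):
    """(1/beta) * log E_P[exp(-beta * X)] on a finite scenario set."""
    shift = max(-beta * x for x in X)   # subtract the largest exponent for stability
    total = sum(p * math.exp(-beta * x - shift) for p, x in zip(P, X))
    return (shift + math.log(total)) / beta

P = [0.2, 0.5, 0.3]     # illustrative scenario probabilities
X = [1.0, -0.5, 2.0]    # illustrative positions
Y = [-1.0, 0.5, 0.0]
beta = 2.0

rho_X = entropic_risk(X, P, beta)
rho_Y = entropic_risk(Y, P, beta)
t = 0.3
rho_mix = entropic_risk([t * x + (1 - t) * y for x, y in zip(X, Y)], P, beta)
```

In this example `rho_mix` does not exceed `t * rho_X + (1 - t) * rho_Y`, in line with convexity, and `rho_X` lies between the expected loss $E[-X]$ and the worst-case loss $-\min X$.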
Exercise 4.1.6. Let $u : \mathbb{R} \to \mathbb{R}$ be a strictly increasing continuous function and let $\rho$ be defined as in Example 4.13. Show that $\rho$ is quasi-convex,
\[
\rho(\lambda X + (1 - \lambda) Y) \le \rho(X) \vee \rho(Y) \quad \text{for } 0 \le \lambda \le 1,
\]
when $u$ is concave and conclude that the entropic risk measure in (4.11) is a convex risk measure.
◊
Example 4.14. Let $c : \mathcal{F} \to [0, 1]$ be any set function which is normalized and monotone in the sense that $c(\emptyset) = 0$, $c(\Omega) = 1$, and $c(A) \le c(B)$ if $A \subset B$. For instance, $c$ can be given by $c(A) := \psi(P[\, A \,])$ for some probability measure $P$ and an increasing function $\psi : [0, 1] \to [0, 1]$ such that $\psi(0) = 0$ and $\psi(1) = 1$. The Choquet integral of a bounded measurable function $X \ge 0$ with respect to $c$ is defined as
\[
\int X \, dc := \int_0^\infty c(X > x) \, dx.
\]
If $c$ is a probability measure, Fubini's theorem implies that $\int X \, dc$ coincides with the usual integral. In the general case, the Choquet integral is a nonlinear functional of $X$, but we still have $\int \lambda X \, dc = \lambda \int X \, dc$ and $\int (X + m) \, dc = \int X \, dc + m$ for constants $\lambda, m \ge 0$. If $X \in \mathcal{X}$ is arbitrary, we take $m \in \mathbb{R}$ such that $X + m \ge 0$ and get
\[
\int (X + m) \, dc - m = \int_{-m}^0 \big( c(X > x) - 1 \big) \, dx + \int_0^\infty c(X > x) \, dx.
\]
The right-hand side is independent of $m \ge -\inf X$, and so it makes sense to extend the definition of the Choquet integral by putting
\[
\int X \, dc := \int_{-\infty}^0 \big( c(X > x) - 1 \big) \, dx + \int_0^\infty c(X > x) \, dx
\]
for all $X \in \mathcal{X}$. It follows that
\[
\int \lambda X \, dc = \lambda \int X \, dc
\quad \text{and} \quad
\int (X + m) \, dc = \int X \, dc + m
\]
for all $\lambda \ge 0$ and $m \in \mathbb{R}$. Moreover, we have
\[
\int Y \, dc \ge \int X \, dc \quad \text{for } Y \ge X.
\]
Thus, the Choquet integral of the loss,
\[
\rho(X) := \int (-X) \, dc,
\tag{4.12}
\]
is a positively homogeneous monetary risk measure on $\mathcal{X}$. In Section 4.7, we will characterize these risk measures in terms of a property called comonotonicity. We will also show that $\rho$ is convex, and hence coherent, if and only if $c$ is submodular or 2-alternating, i.e.,
\[
c(A \cap B) + c(A \cup B) \le c(A) + c(B) \quad \text{for } A, B \in \mathcal{F}.
\]
In this case, $\rho$ admits the representation
\[
\rho(X) = \max_{Q \in \mathcal{Q}_c} E_Q[\, -X \,],
\tag{4.13}
\]
where $\mathcal{Q}_c$ is the core of $c$, defined as the class of all finitely additive and normalized set functions $Q : \mathcal{F} \to [0, 1]$ such that $Q[\, A \,] \le c(A)$ for all $A \in \mathcal{F}$; see Theorem 4.94.
◊
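
On a finite scenario set the Choquet integral with respect to $c(A) = \psi(P[A])$ reduces to a weighted sum over the sorted outcomes: the $i$-th largest value receives the weight $c(\text{top } i \text{ scenarios}) - c(\text{top } i - 1 \text{ scenarios})$. The Python sketch below implements this under the assumption of a finite probability vector `P`; the choice $\psi(t) = \sqrt{t}$ and the function names are illustrative.

```python
import math

def choquet_integral(Y, P, psi):
    """Choquet integral of Y with respect to c(A) = psi(P[A]), finite scenario set.

    Outcomes are sorted in decreasing order; the i-th largest value is weighted
    by the increment of c when its scenario joins the upper level set.
    """
    order = sorted(range(len(Y)), key=lambda i: Y[i], reverse=True)
    total, cum_prob, prev_c = 0.0, 0.0, 0.0
    for i in order:
        cum_prob += P[i]
        cur_c = psi(min(cum_prob, 1.0))   # guard against rounding slightly above 1
        total += Y[i] * (cur_c - prev_c)
        prev_c = cur_c
    return total

def distortion_risk(X, P, psi):
    """rho(X) = Choquet integral of the loss -X, as in (4.12)."""
    return choquet_integral([-x for x in X], P, psi)

P = [0.2, 0.5, 0.3]     # illustrative scenario probabilities
X = [1.0, -0.5, 2.0]    # illustrative position

rho_sqrt = distortion_risk(X, P, math.sqrt)       # concave distortion psi(t) = sqrt(t)
plain_mean = choquet_integral(X, P, lambda t: t)  # psi = identity gives E[X]
```

With the identity distortion the Choquet integral is the ordinary expectation; a concave distortion such as the square root shifts weight towards the largest losses, so `rho_sqrt` is at least the expected loss.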
Exercise 4.1.7. Let $P$ be a probability measure on $(\Omega, \mathcal{F})$ and fix $n \in \mathbb{N}$. For $X \in \mathcal{X}$ let $X_1, \dots, X_n$ be independent copies of $X$ and set
\[
\rho(X) := -E[\, \min(X_1, \dots, X_n) \,].
\]
The functional $\rho : \mathcal{X} \to \mathbb{R}$ is sometimes called MINVAR. Show that $\rho$ is a coherent risk measure on $\mathcal{X}$. Show next that $\rho$ fits into the framework of Example 4.14. More precisely, show that $\rho$ can be represented as a Choquet integral,
\[
\rho(X) = \int (-X) \, dc,
\]
where the set function $c$ is of the form $c(A) = \psi(P[\, A \,])$ for a concave increasing function $\psi : [0, 1] \to [0, 1]$ such that $\psi(0) = 0$ and $\psi(1) = 1$.
◊
In the next two sections, we are going to show how representations of the form
(4.8), (4.13), (4.9), or (4.12) for coherent or convex risk measures arise in a systematic
manner.

4.2

Robust representation of convex risk measures

In this section, we consider a situation of Knightian uncertainty, where no probability measure $P$ is fixed on the measurable space $(\Omega, \mathcal{F})$. Let $\mathcal{X}$ denote the space of all bounded measurable functions on $(\Omega, \mathcal{F})$. Recall that $\mathcal{X}$ is a Banach space if endowed with the supremum norm $\| \cdot \|$. As in Section 2.5, we denote by $\mathcal{M}_1 := \mathcal{M}_1(\Omega, \mathcal{F})$ the set of all probability measures on $(\Omega, \mathcal{F})$ and by $\mathcal{M}_{1,f} := \mathcal{M}_{1,f}(\Omega, \mathcal{F})$ the set of all finitely additive set functions $Q : \mathcal{F} \to [0, 1]$ which are normalized to $Q[\, \Omega \,] = 1$. By $E_Q[\, X \,]$ we denote the integral of $X$ with respect to $Q \in \mathcal{M}_{1,f}$; see Appendix A.6. We do not assume that a probability measure on $(\Omega, \mathcal{F})$ is given a priori.
If $\rho$ is a coherent risk measure on $\mathcal{X}$, then we are in the context of Proposition 2.84, i.e., the functional $\varphi$ defined by $\varphi(X) := -\rho(X)$ satisfies the four properties listed in Proposition 2.83. Hence, we have the following result:
Proposition 4.15. A functional $\rho : \mathcal{X} \to \mathbb{R}$ is a coherent risk measure if and only if there exists a subset $\mathcal{Q}$ of $\mathcal{M}_{1,f}$ such that
\[
\rho(X) = \sup_{Q \in \mathcal{Q}} E_Q[\, -X \,], \qquad X \in \mathcal{X}.
\tag{4.14}
\]
Moreover, $\mathcal{Q}$ can be chosen as a convex set for which the supremum in (4.14) is attained.
Our first goal in this section is to obtain an analogue of this result for convex risk measures. Applied to a coherent risk measure, it will yield an alternative proof of Proposition 4.15, which does not depend on the discussion in Chapter 2, and it will provide a description of the maximal set $\mathcal{Q}$ in (4.14). Our second goal will be to obtain criteria which guarantee that a risk measure can be represented in terms of $\sigma$-additive probability measures.
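Let $\alpha : \mathcal{M}_{1,f} \to \mathbb{R} \cup \{+\infty\}$ be any functional such that
\[
\inf_{Q \in \mathcal{M}_{1,f}} \alpha(Q) \in \mathbb{R}.
\]
For each $Q \in \mathcal{M}_{1,f}$ the functional $X \mapsto E_Q[\, -X \,] - \alpha(Q)$ is convex, monotone, and cash invariant on $\mathcal{X}$, and these three properties are preserved when taking the supremum over $Q \in \mathcal{M}_{1,f}$. Hence,
\[
\rho(X) := \sup_{Q \in \mathcal{M}_{1,f}} \big( E_Q[\, -X \,] - \alpha(Q) \big)
\tag{4.15}
\]
defines a convex risk measure on $\mathcal{X}$ such that
\[
\rho(0) = -\inf_{Q \in \mathcal{M}_{1,f}} \alpha(Q).
\]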
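
A penalty-function representation of this kind can be evaluated directly when the supremum runs over finitely many measures on a finite scenario set, which amounts to setting the penalty to $+\infty$ for all other measures. The Python sketch below makes exactly this simplifying assumption; the measures, penalties, and the function name `robust_risk` are illustrative.

```python
def robust_risk(X, measures, penalties):
    """max over a finite family of Q of (E_Q[-X] - alpha(Q))."""
    return max(
        sum(q * (-x) for q, x in zip(Q, X)) - a
        for Q, a in zip(measures, penalties)
    )

measures = [[0.5, 0.3, 0.2], [0.2, 0.2, 0.6], [0.1, 0.8, 0.1]]
penalties = [0.0, 0.3, 0.1]   # alpha(Q) for each measure; +infinity elsewhere

X = [1.0, -0.5, 2.0]
Y = [-1.0, 0.5, 0.0]
zero = [0.0, 0.0, 0.0]

rho_X = robust_risk(X, measures, penalties)
rho_Y = robust_risk(Y, measures, penalties)
rho_0 = robust_risk(zero, measures, penalties)   # equals -min(penalties)
t = 0.4
rho_mix = robust_risk([t * x + (1 - t) * y for x, y in zip(X, Y)], measures, penalties)
```

Each term inside the maximum is affine in `X`, which is why the maximum is convex, monotone, and cash invariant; `rho_0` recovers minus the smallest penalty.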

The functional $\alpha$ will be called a penalty function for $\rho$ on $\mathcal{M}_{1,f}$, and we will say that $\rho$ is represented by $\alpha$ on $\mathcal{M}_{1,f}$.
Theorem 4.16. Any convex risk measure  on X is of the form
.X / D

max .EQ X   min .Q//;

Q2M1;f

X 2 X;

(4.16)

where the penalty function min is given by


min .Q/ WD sup EQ X  for Q 2 M1;f .
X2A

Moreover, min is the minimal penalty function which represents , i.e., any penalty
function for which (4.15) holds satisfies .Q/  min .Q/ for all Q 2 M1;f .
Proof. In a first step, we show that
.X / 

sup .EQ X   min .Q//

for all X 2 X.

Q2M1;f

To this end, recall that X 0 WD .X / C X 2 A by (4.1). Thus, for all Q 2 M1;f
min .Q/  EQ X 0  D EQ X   .X /:
From here, our claim follows.
For $X$ given, we will now construct some $Q_X \in \mathcal M_{1,f}$ such that
\[
  \rho(X) \le E_{Q_X}[-X] - \alpha_{\min}(Q_X),
\]
which, in view of the previous step, will prove our representation (4.16). By cash
invariance it suffices to prove this for $X \in \mathcal X$ with $\rho(X) = 0$. Moreover, we may
assume without loss of generality that $\rho(0) = 0$. Then $X$ is not contained in the
nonempty convex set
\[
  \mathcal B := \{ Y \in \mathcal X \mid \rho(Y) < 0 \}.
\]
Since $\mathcal B$ is open due to Lemma 4.3, we may apply the separation argument in the form
of Theorem A.55. It yields a non-zero continuous linear functional $\ell$ on $\mathcal X$ such that
\[
  \ell(X) \le \inf_{Y \in \mathcal B} \ell(Y) =: b.
\]
We claim that $\ell(Y) \ge 0$ if $Y \ge 0$. Monotonicity and cash invariance of $\rho$ imply
that $1 + \lambda Y \in \mathcal B$ for any $\lambda > 0$. Hence,
\[
  \ell(X) \le \ell(1 + \lambda Y) = \ell(1) + \lambda\, \ell(Y) \qquad \text{for all } \lambda > 0,
\]
which could not be true if $\ell(Y) < 0$.


Chapter 4 Monetary measures of risk

Our next claim is that $\ell(1) > 0$. Since $\ell$ does not vanish identically, there must
be some $Y$ such that $0 < \ell(Y) = \ell(Y^+) - \ell(Y^-)$. We may assume without loss of
generality that $\|Y\| < 1$. Positivity of $\ell$ implies $\ell(Y^+) > 0$ and $\ell(1 - Y^+) \ge 0$.
Hence $\ell(1) = \ell(1 - Y^+) + \ell(Y^+) > 0$.
By the two preceding steps and Theorem A.51, we conclude that there exists some
$Q_X \in \mathcal M_{1,f}$ such that
\[
  E_{Q_X}[Y] = \frac{\ell(Y)}{\ell(1)} \qquad \text{for all } Y \in \mathcal X.
\]
Note that $\mathcal B \subset \mathcal A_\rho$, and so
\[
  \alpha_{\min}(Q_X) = \sup_{Y \in \mathcal A_\rho} E_{Q_X}[-Y] \ge \sup_{Y \in \mathcal B} E_{Q_X}[-Y] = - \frac{b}{\ell(1)}.
\]
On the other hand, $Y + \varepsilon \in \mathcal B$ for any $Y \in \mathcal A_\rho$ and each $\varepsilon > 0$. This shows that
$\alpha_{\min}(Q_X)$ is in fact equal to $-b/\ell(1)$. It follows that
\[
  E_{Q_X}[-X] - \alpha_{\min}(Q_X) = \frac{1}{\ell(1)} \bigl( b - \ell(X) \bigr) \ge 0 = \rho(X).
\]
Thus, $Q_X$ is as desired, and the proof of the representation (4.16) is complete.


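Finally, let $\alpha$ be any penalty function for $\rho$. Then, for all $Q \in \mathcal M_{1,f}$ and $X \in \mathcal X$,
\[
  \rho(X) \ge E_Q[-X] - \alpha(Q),
\]
and hence
\[
  \alpha(Q) \ge \sup_{X \in \mathcal X} \bigl( E_Q[-X] - \rho(X) \bigr)
            \ge \sup_{X \in \mathcal A_\rho} \bigl( E_Q[-X] - \rho(X) \bigr)
            \ge \alpha_{\min}(Q). \tag{4.17}
\]
Thus, $\alpha$ dominates $\alpha_{\min}$.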
Remark 4.17. (a) If we take $\alpha = \alpha_{\min}$ in (4.17), then all inequalities in (4.17) must
be identities. Thus, we obtain an alternative formula for the minimal penalty
function $\alpha_{\min}$:
\[
  \alpha_{\min}(Q) = \sup_{X \in \mathcal X} \bigl( E_Q[-X] - \rho(X) \bigr). \tag{4.18}
\]
(b) Note that $\alpha_{\min}$ is convex and lower semicontinuous for the total variation distance
on $\mathcal M_{1,f}$ as defined in Definition A.50, since it is the supremum of affine
continuous functions on $\mathcal M_{1,f}$.


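(c) Suppose $\rho$ is defined via $\rho := \rho_{\mathcal A}$ for a given acceptance set $\mathcal A \subset \mathcal X$. Then $\mathcal A$
determines $\alpha_{\min}$:
\[
  \alpha_{\min}(Q) = \sup_{X \in \mathcal A} E_Q[-X] \qquad \text{for all } Q \in \mathcal M_{1,f}.
\]
This follows from the fact that $X \in \mathcal A_\rho$ implies $\varepsilon + X \in \mathcal A$ for all $\varepsilon > 0$.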

Remark 4.18. Equation (4.18) shows that the penalty function $\alpha_{\min}$ corresponds to
the Fenchel–Legendre transform, or conjugate function, of the convex function $\rho$ on
the Banach space $\mathcal X$. More precisely,
\[
  \alpha_{\min}(Q) = \rho^*(\ell_Q), \tag{4.19}
\]
where $\rho^* : \mathcal X' \to \mathbb R \cup \{+\infty\}$ is defined on the dual $\mathcal X'$ of $\mathcal X$ by
\[
  \rho^*(\ell) = \sup_{X \in \mathcal X} \bigl( \ell(X) - \rho(X) \bigr),
\]
and where $\ell_Q \in \mathcal X'$ is given by $\ell_Q(X) = E_Q[-X]$ for $Q \in \mathcal M_{1,f}$. This suggests
an alternative proof of Theorem 4.16. First note that, by Theorem A.51, $\mathcal X'$ can be
identified with the space $ba := ba(\Omega, \mathcal F)$ of finitely additive set functions with finite
total variation. Moreover, $\rho$ is lower semicontinuous with respect to the weak topology
$\sigma(\mathcal X, \mathcal X')$, since any set $\{\rho \le c\}$ is convex, strongly closed due to Lemma 4.3,
and hence weakly closed by Theorem A.60. Thus, the general duality theorem for
conjugate functions as stated in Theorem A.62 yields
\[
  \rho^{**} = \rho,
\]
where $\rho^{**}$ denotes the conjugate function of $\rho^*$, i.e.,
\[
  \rho(X) = \sup_{\ell \in ba} \bigl( \ell(X) - \rho^*(\ell) \bigr). \tag{4.20}
\]
In a second step, using the arguments in the second part of the proof of Theorem 4.16,
we can now check that monotonicity and cash invariance of $\rho$ imply that $\ell \le 0$ and
$\ell(1) = -1$ for any $\ell \in \mathcal X' = ba$ such that $\rho^*(\ell) < \infty$. Identifying $-\ell$ with $Q \in \mathcal M_{1,f}$
and using equation (4.19), we see that (4.20) reduces to the representation
\[
  \rho(X) = \sup_{Q \in \mathcal M_{1,f}} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr).
\]
Moreover, the supremum is actually attained: $\mathcal M_{1,f}$ is weak$^*$ compact in $\mathcal X' = ba$
due to the Banach–Alaoglu theorem stated in Theorem A.63, and so the upper
semicontinuous functional $Q \mapsto E_Q[-X] - \alpha_{\min}(Q)$ attains its maximum on $\mathcal M_{1,f}$.
◊


The representation
\[
  \rho(X) = \sup_{Q \in \mathcal Q} E_Q[-X], \qquad X \in \mathcal X, \tag{4.21}
\]
of a coherent risk measure $\rho$ via some set $\mathcal Q \subset \mathcal M_{1,f}$, as formulated in Proposition
4.15, is a particular case of the representation theorem for convex risk measures, since
it corresponds to the penalty function
\[
  \alpha(Q) = \begin{cases} 0 & \text{if } Q \in \mathcal Q, \\ +\infty & \text{otherwise.} \end{cases}
\]
The following corollary shows that the minimal penalty function of a coherent risk
measure is always of this type.
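Corollary 4.19. The minimal penalty function $\alpha_{\min}$ of a coherent risk measure $\rho$ takes
only the values $0$ and $+\infty$. In particular,
\[
  \rho(X) = \max_{Q \in \mathcal Q_{\max}} E_Q[-X], \qquad X \in \mathcal X,
\]
for the convex set
\[
  \mathcal Q_{\max} := \{ Q \in \mathcal M_{1,f} \mid \alpha_{\min}(Q) = 0 \},
\]
and $\mathcal Q_{\max}$ is the largest set for which a representation of the form (4.21) holds.
Proof. Recall from Proposition 4.6 that the acceptance set $\mathcal A_\rho$ of a coherent risk
measure is a cone. Thus, the minimal penalty function satisfies
\[
  \alpha_{\min}(Q) = \sup_{X \in \mathcal A_\rho} E_Q[-X] = \sup_{\lambda X \in \mathcal A_\rho} E_Q[-\lambda X] = \lambda\, \alpha_{\min}(Q)
\]
for all $Q \in \mathcal M_{1,f}$ and $\lambda > 0$. Hence, $\alpha_{\min}$ can take only the values $0$ and $+\infty$.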
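The coherent case can be illustrated on a finite sample space. The sketch below is only an illustration under stated assumptions: the family `all_Qs` and the test positions are hypothetical, and the penalty takes the value $0$ on the representing models and $+\infty$ on an excluded one. It compares the representation (4.21) with the penalty form and checks positive homogeneity and subadditivity.

```python
import numpy as np

def rho_coherent(X, Qs):
    """Worst expected loss sup_{Q in Qs} E_Q[-X] over a finite family of models."""
    return np.max(Qs @ (-X))

def rho_penalty(X, all_Qs, alpha):
    """The same functional written with a penalty that is 0 or +inf."""
    return np.max(all_Qs @ (-X) - alpha)

# Hypothetical models on three scenarios; the third one is excluded by alpha = +inf.
all_Qs = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.2, 0.7],
                   [0.0, 1.0, 0.0]])
alpha = np.array([0.0, 0.0, np.inf])
Qs = all_Qs[np.isfinite(alpha)]

X = np.array([2.0, -1.0, -3.0])
Y = np.array([-1.0, 4.0, 0.5])
lam = 2.5

same_value = np.isclose(rho_coherent(X, Qs), rho_penalty(X, all_Qs, alpha))
homogeneous = np.isclose(rho_coherent(lam * X, Qs), lam * rho_coherent(X, Qs))
subadditive = rho_coherent(X + Y, Qs) <= rho_coherent(X, Qs) + rho_coherent(Y, Qs) + 1e-12
```

A model with infinite penalty never contributes to the maximum, which is why dropping it from `Qs` leaves the value unchanged.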
Exercise 4.2.1. Let $\rho$ be a coherent risk measure on $\mathcal X$ and assume that $\rho$ admits a
representation
\[
  \rho(X) = \sup_{Q \in \mathcal Q} E_Q[-X]
\]
with some class $\mathcal Q$ of probability measures on $(\Omega, \mathcal F)$. Show that $\rho$ is additive, i.e.,
\[
  \rho(X + Y) = \rho(X) + \rho(Y) \qquad \text{for all } X, Y \in \mathcal X,
\]
if and only if the class $\mathcal Q$ reduces to a single probability measure $Q$, i.e., $\rho$ is simply
the expected loss with respect to $Q$.
◊


The penalty function $\alpha$ arising in the representation (4.15) is not unique, and it is
often convenient to represent a convex risk measure by a penalty function that is not
the minimal one. For instance, the minimal penalty function may be finite for certain
finitely additive set functions while another $\alpha$ is concentrated only on probability
measures as in the case of Example 4.8. Another situation of this type occurs for risk
measures which are constructed as the supremum of a family of convex risk measures.
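Proposition 4.20. Suppose that for every $i$ in some index set $I$ we are given a convex
risk measure $\rho_i$ on $\mathcal X$ with associated penalty function $\alpha_i$. If $\sup_{i \in I} \rho_i(0) < \infty$ then
\[
  \rho(X) := \sup_{i \in I} \rho_i(X), \qquad X \in \mathcal X,
\]
is a convex risk measure that can be represented with the penalty function
\[
  \alpha(Q) := \inf_{i \in I} \alpha_i(Q), \qquad Q \in \mathcal M_{1,f}.
\]
Proof. The condition $\rho(0) = \sup_{i \in I} \rho_i(0) < \infty$ implies that $\rho$ takes only finite
values. Moreover,
\[
  \rho(X) = \sup_{i \in I} \sup_{Q \in \mathcal M_{1,f}} \bigl( E_Q[-X] - \alpha_i(Q) \bigr)
          = \sup_{Q \in \mathcal M_{1,f}} \Bigl( E_Q[-X] - \inf_{i \in I} \alpha_i(Q) \Bigr),
\]
and the assertion follows.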
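The interchange of the two suprema in this proof can be checked numerically. The following sketch assumes a finite sample space, a hypothetical finite family `Qs` of models, and two hypothetical penalty vectors stored in `alphas`; it compares the supremum of the individual risk measures with the risk measure built from the pointwise infimum of the penalties.

```python
import numpy as np

# Hypothetical models on three scenarios.
Qs = np.array([[0.6, 0.3, 0.1],
               [0.2, 0.5, 0.3],
               [0.1, 0.1, 0.8]])
# Row i holds the assumed penalties alpha_i(Q) of rho_i on the models in Qs.
alphas = np.array([[0.0, 0.5, 2.0],
                   [1.0, 0.2, 0.3]])

def rho_from_penalty(X, alpha):
    """max over Q of (E_Q[-X] - alpha(Q))."""
    return np.max(Qs @ (-X) - alpha)

def rho_sup(X):
    """Supremum over i of the individual risk measures rho_i."""
    return max(rho_from_penalty(X, a) for a in alphas)

def rho_inf_penalty(X):
    """Risk measure represented by the pointwise infimum of the penalties."""
    return rho_from_penalty(X, alphas.min(axis=0))

rng = np.random.default_rng(0)
samples = rng.normal(size=(100, 3))
agree = all(np.isclose(rho_sup(X), rho_inf_penalty(X)) for X in samples)
```

Note that the infimum of the penalties need not be the minimal penalty function of the resulting risk measure, in line with the remark preceding the proposition.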


In the sequel, we are particularly interested in those convex measures of risk which
admit a representation in terms of $\sigma$-additive probability measures. Such a risk
measure $\rho$ can be represented by a penalty function $\alpha$ which is infinite outside the set
$\mathcal M_1 := \mathcal M_1(\Omega, \mathcal F)$:
\[
  \rho(X) = \sup_{Q \in \mathcal M_1} \bigl( E_Q[-X] - \alpha(Q) \bigr). \tag{4.22}
\]
In this case, one can no longer expect that the supremum above is attained. This is
illustrated by Example 4.8 if $X$ does not take on its infimum.
A representation (4.22) in terms of probability measures is closely related to certain
continuity properties of $\rho$. We first examine a necessary condition of continuity from
above.
Lemma 4.21. A convex risk measure $\rho$ which admits a representation (4.22) on $\mathcal M_1$
is continuous from above in the sense that
\[
  X_n \searrow X \quad \Longrightarrow \quad \rho(X_n) \nearrow \rho(X). \tag{4.23}
\]
Moreover, continuity from above is equivalent to the Fatou property of lower
semicontinuity with respect to bounded pointwise convergence: If $(X_n)$ is a bounded
sequence in $\mathcal X$ which converges pointwise to $X \in \mathcal X$, then
\[
  \rho(X) \le \liminf_{n \uparrow \infty} \rho(X_n). \tag{4.24}
\]

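Proof. First we show (4.24) under the assumption that $\rho$ has a representation in terms
of probability measures. Dominated convergence implies that $E_Q[X_n] \to E_Q[X]$
for each $Q \in \mathcal M_1$. Hence,
\[
\begin{aligned}
  \rho(X) &= \sup_{Q \in \mathcal M_1} \Bigl( \lim_{n \uparrow \infty} E_Q[-X_n] - \alpha(Q) \Bigr) \\
          &\le \liminf_{n \uparrow \infty} \sup_{Q \in \mathcal M_1} \bigl( E_Q[-X_n] - \alpha(Q) \bigr) \\
          &= \liminf_{n \uparrow \infty} \rho(X_n).
\end{aligned}
\]
In order to show the equivalence of (4.24) and (4.23), let us first assume (4.24). By
monotonicity, $\rho(X_n) \le \rho(X)$ for each $n$ if $X_n \searrow X$, and so $\rho(X_n) \nearrow \rho(X)$ follows.
Now we assume continuity from above. Let $(X_n)$ be a bounded sequence in $\mathcal X$
which converges pointwise to $X$. Define $Y_m := \sup_{n \ge m} X_n \in \mathcal X$. Then $Y_m$ decreases
to $X$. Since $\rho(X_n) \ge \rho(Y_n)$ by monotonicity, condition (4.23) yields that
\[
  \liminf_{n \uparrow \infty} \rho(X_n) \ge \lim_{n \uparrow \infty} \rho(Y_n) = \rho(X).
\]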
The following theorem gives a strong sufficient condition which guarantees that
any penalty function for $\rho$ is concentrated on the set $\mathcal M_1$ of probability measures.
This condition is continuity from below rather than from above; we will see a class
of examples in Section 4.9.
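Theorem 4.22. For a convex risk measure $\rho$ on $\mathcal X$, the following two conditions are
equivalent:
(a) $\rho$ is continuous from below in the sense that
\[
  X_n \nearrow X \text{ pointwise on } \Omega \quad \Longrightarrow \quad \rho(X_n) \searrow \rho(X).
\]
(b) The minimal penalty function $\alpha_{\min}$ (and hence every other penalty function
representing $\rho$) is concentrated on the class $\mathcal M_1$ of probability measures, i.e.,
\[
  \alpha_{\min}(Q) < \infty \quad \Longrightarrow \quad Q \text{ is } \sigma\text{-additive.}
\]
In particular we have
\[
  \rho(X) = \max_{Q \in \mathcal M_1} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr), \qquad X \in \mathcal X,
\]
whenever one of these two equivalent conditions is satisfied.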


For the proof of this theorem we need the following lemma.


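Lemma 4.23. Let $\rho$ be a convex risk measure on $\mathcal X$ which is represented by the
penalty function $\alpha$ on $\mathcal M_{1,f}$, and consider the level sets
\[
  \Lambda_c := \{ Q \in \mathcal M_{1,f} \mid \alpha(Q) \le c \}, \qquad \text{for } c > -\rho(0) = \inf_{Q \in \mathcal M_{1,f}} \alpha(Q).
\]
For any sequence $(X_n)$ in $\mathcal X$ such that $0 \le X_n \le 1$, the following two conditions are
equivalent:
(a) $\rho(\lambda X_n) \to \rho(\lambda)$ for each $\lambda \ge 1$.
(b) $\inf_{Q \in \Lambda_c} E_Q[X_n] \to 1$ for all $c > -\rho(0)$.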
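Proof. (a) $\Rightarrow$ (b): In a first step, we show that for all $Y \in \mathcal X$
\[
  \inf_{Q \in \Lambda_c} E_Q[Y] \ge - \frac{c + \rho(\lambda Y)}{\lambda} \qquad \text{for all } \lambda > 0. \tag{4.25}
\]
Indeed, since $\alpha$ represents $\rho$, we have for $Q \in \Lambda_c$
\[
  c \ge \alpha(Q) \ge E_Q[-\lambda Y] - \rho(\lambda Y),
\]
and dividing by $-\lambda$ yields (4.25).
Now consider a sequence $(X_n)$ which satisfies (a). Then (4.25) shows that for all
$\lambda \ge 1$
\[
  \liminf_{n \uparrow \infty} \inf_{Q \in \Lambda_c} E_Q[X_n]
  \ge - \lim_{n \uparrow \infty} \frac{c + \rho(\lambda X_n)}{\lambda}
  = 1 - \frac{c + \rho(0)}{\lambda}.
\]
Taking $\lambda \uparrow \infty$ and assuming $X_n \le 1$ proves (b).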


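(b) $\Rightarrow$ (a): Clearly, for all $n$
\[
  \rho(\lambda) \le \rho(\lambda X_n) = \sup_{Q \in \mathcal M_{1,f}} \bigl( E_Q[-\lambda X_n] - \alpha(Q) \bigr).
\]
Since $E_Q[-\lambda X_n] \le 0$ for all $Q$, only those $Q$ can contribute to the supremum on
the right-hand side for which
\[
  \alpha(Q) \le 1 - \rho(\lambda) = 1 + \lambda - \rho(0) =: c.
\]
Hence, for all $n$
\[
  \rho(\lambda X_n) = \sup_{Q \in \Lambda_c} \bigl( E_Q[-\lambda X_n] - \alpha(Q) \bigr).
\]
But condition (b) implies that $E_Q[-\lambda X_n]$ converges to $-\lambda$ uniformly in $Q \in \Lambda_c$,
and so (a) follows.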


The proof of our theorem will also rely on Dini's lemma, which we recall here for
the convenience of the reader.
Lemma 4.24. On a compact set, a sequence of continuous functions $f_n$ increasing to
a continuous function $f$ converges even uniformly.
Proof. For $\varepsilon > 0$, the compact sets $K_n := \{ f_n \le f - \varepsilon \}$ satisfy $\bigcap_n K_n = \emptyset$, hence
$K_{n_0} = \emptyset$ for some $n_0$.
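Proof of Theorem 4.22. To prove the implication (a) $\Rightarrow$ (b), recall that $Q$ is
$\sigma$-additive if and only if $Q[A_n] \nearrow 1$ for any increasing sequence of events $A_n \in \mathcal F$
such that $\bigcup_n A_n = \Omega$. Thus, our claim is implied by the implication (a) $\Rightarrow$ (b) of
Lemma 4.23 if we take $X_n := I_{A_n}$.
We now prove the implication (b) $\Rightarrow$ (a) of our theorem. Suppose $X_n \nearrow X$ pointwise
on $\Omega$. We need to show that $\rho(X_n) \searrow \rho(X)$. By cash invariance, we may assume
without loss of generality that $X_n \ge 0$ for all $n$. As in the proof of the implication
(b) $\Rightarrow$ (a) of Lemma 4.23, where the supremum is restricted to a level set, we see that
\[
  \rho(X_n) = \max_{Q \in \mathcal M_{1,f}} \bigl( E_Q[-X_n] - \alpha_{\min}(Q) \bigr)
            = \max_{Q \in \Lambda_c} \bigl( E_Q[-X_n] - \alpha_{\min}(Q) \bigr), \tag{4.26}
\]
where $c := 1 - \rho(X)$ and $\Lambda_c = \{ Q \in \mathcal M_{1,f} \mid \alpha_{\min}(Q) \le c \}$. We will show below
that $\Lambda_c \subset \mathcal M_1$ implies that
\[
  E_Q[X_n] \to E_Q[X] \quad \text{uniformly in } Q \in \Lambda_c. \tag{4.27}
\]
Together with the representation (4.26), this will imply the desired convergence
$\rho(X_n) \to \rho(X)$.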
To prove (4.27), recall first from Appendix A.6, and in particular from Definition
A.50, that $\mathcal M_{1,f}$ belongs to the larger vector space $ba := ba(\Omega, \mathcal F)$ of all finitely
additive set functions $\mu : \mathcal F \to \mathbb R$ with finite total variation $\|\mu\|_{\mathrm{var}}$. In fact, $ba$ can be
identified with the topological dual of the Banach space $\mathcal X$ with respect to $\|\cdot\|$; see
Theorem A.51. Since we can write
\[
  \mathcal M_{1,f} = \{ \mu \in ba \mid \mu(\Omega) = 1 \text{ and } \mu(A) \ge 0 \text{ for all } A \in \mathcal F \},
\]
it is clear that $\mathcal M_{1,f}$ is a bounded and weak$^*$ closed set in $ba$. Hence $\mathcal M_{1,f}$ is weak$^*$
compact by the Banach–Alaoglu theorem (see Theorem A.63). Since $\alpha_{\min}$ is weak$^*$
lower semicontinuous on $\mathcal M_{1,f}$ as supremum of the weak$^*$ continuous maps
$Q \mapsto E_Q[-Y]$ with $Y \in \mathcal A_\rho$, the level set $\Lambda_c$ is also weak$^*$ compact.
After these preparations, we can now prove (4.27). Clearly, the functions $\ell_n(Q) :=
E_Q[-X_n]$ form a decreasing sequence of weak$^*$ continuous functions on $\Lambda_c$.
Moreover, when $\Lambda_c \subset \mathcal M_1$, we even have $\ell_n(Q) \searrow \ell(Q) := E_Q[-X]$ for each $Q \in \Lambda_c$.
By the established compactness of $\Lambda_c$, (4.27) thus follows from Dini's lemma.


Remark 4.25. Let $\rho$ be a convex risk measure which is continuous from below. Then
$\rho$ is also continuous from above, as can be seen by combining Theorem 4.22 and
Lemma 4.21.
◊
Exercise 4.2.2. Show that for a convex risk measure $\rho$ on $\mathcal X$ the following two
conditions are equivalent:
(a) $\rho$ is continuous from below.
(b) $\rho$ satisfies the following Lebesgue property: $\rho(X_n) \to \rho(X)$ whenever $(X_n)$
is a bounded sequence in $\mathcal X$ which converges pointwise to $X$.
◊
Exercise 4.2.3. Show that for a convex risk measure $\rho$ on $\mathcal X$ the following two
conditions are equivalent:
(a) $\rho$ is continuous from below.
(b) For every $c > -\rho(0)$, the coherent risk measures
\[
  \rho_c(X) := \sup_{Q \in \Lambda_c} E_Q[-X]
\]
are continuous from below, where $\Lambda_c = \{ Q \in \mathcal M_{1,f} \mid \alpha_{\min}(Q) \le c \}$.

Example 4.26. Let us consider a utility function $u$ on $\mathbb R$, a probability measure
$Q \in \mathcal M_1(\Omega, \mathcal F)$, and fix some threshold $c \in \mathbb R$. As in Example 4.10, we suppose that
a position $X$ is acceptable if its expected utility $E_Q[u(X)]$ is bounded from below
by $u(c)$. Alternatively, we can introduce the convex increasing loss function $\ell(x) =
-u(-x)$ and define the convex set of acceptable positions
\[
  \mathcal A := \{ X \in \mathcal X \mid E_Q[\ell(-X)] \le x_0 \},
\]
where $x_0 := -u(c)$. Let $\rho := \rho_{\mathcal A}$ denote the convex risk measure induced by $\mathcal A$.
In Section 4.9, we will show that $\rho$ is continuous from below, and we will derive a
formula for its minimal penalty function.
◊
Let us now continue the discussion in a topological setting. More precisely, we
will assume for the rest of this section that $\Omega$ is a separable metric space and that $\mathcal F$
is the $\sigma$-field of Borel sets. As before, $\mathcal X$ is the linear space of all bounded measurable
functions on $(\Omega, \mathcal F)$. We denote by $C_b(\Omega)$ the subspace of bounded continuous
functions on $\Omega$, and we focus on the representation of convex risk measures viewed
as functionals on $C_b(\Omega)$.
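Proposition 4.27. Let $\rho$ be a convex risk measure on $\mathcal X$ such that
\[
  \rho(X_n) \searrow \rho(\lambda) \text{ for any sequence } (X_n) \text{ in } C_b(\Omega) \text{ that increases to a constant } \lambda > 0. \tag{4.28}
\]
Then there exists a penalty function $\alpha$ on $\mathcal M_1$ such that
\[
  \rho(X) = \max_{Q \in \mathcal M_1} \bigl( E_Q[-X] - \alpha(Q) \bigr) \qquad \text{for } X \in C_b(\Omega). \tag{4.29}
\]
In fact, one can take
\[
  \alpha(Q) := \inf \bigl\{ \alpha_{\min}(\tilde Q) \bigm| E_{\tilde Q}[\,\cdot\,] = E_Q[\,\cdot\,] \text{ on } C_b(\Omega) \bigr\}. \tag{4.30}
\]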
Proof. Let $\alpha_{\min}$ be the minimal penalty function of $\rho$ on $\mathcal M_{1,f}$. We show that for any
$\tilde Q$ with $\alpha_{\min}(\tilde Q) < \infty$ there exists $Q \in \mathcal M_1$ such that $E_{\tilde Q}[X] = E_Q[X]$ for all
$X \in C_b(\Omega)$. Take a sequence $(Y_n)$ in $C_b(\Omega)$ which increases to some $Y \in C_b(\Omega)$,
and choose $\delta > 0$ such that $X_n := 1 + \delta (Y_n - Y) \ge 0$ for all $n$. Clearly, $(X_n)$ satisfies
condition (a) of Lemma 4.23, and so $E_{\tilde Q}[X_n] \to 1$, i.e.,
\[
  E_{\tilde Q}[Y_n] \nearrow E_{\tilde Q}[Y].
\]
This continuity property of the linear functional $E_{\tilde Q}[\,\cdot\,]$ on $C_b(\Omega)$ implies, via the
Daniell–Stone representation theorem as stated in Appendix A.6, that it coincides on
$C_b(\Omega)$ with the integral with respect to a $\sigma$-additive measure $Q$. Taking $\alpha$ as in (4.30)
gives the result.
Remark 4.28. If $\Omega$ is compact then any convex risk measure admits a representation
(4.29) on the space $C_b(\Omega) = C(\Omega)$. Indeed, if $(X_n)$ is a sequence in $C_b(\Omega)$ that
increases to a constant $\lambda$, then this convergence is even uniform by Lemma 4.24.
Since $\rho$ is Lipschitz continuous on $C(\Omega)$ by Lemma 4.3, it satisfies condition (4.28).
Alternatively, we could argue as in Remark 4.18 and apply the general duality theorem
for the Fenchel–Legendre transform to the convex functional $\rho$ on the Banach
space $C(\Omega)$. Just note that any continuous functional $\ell$ on $C(\Omega)$ which is positive and
normalized is of the form $\ell(X) = E_Q[X]$ for some probability measure $Q \in \mathcal M_1$;
see Theorem A.48.
◊
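Definition 4.29. A convex risk measure $\rho$ on $\mathcal X$ is called tight if there exists an
increasing sequence $K_1 \subset K_2 \subset \cdots$ of compact subsets of $\Omega$ such that
\[
  \rho(\lambda I_{K_n}) \to \rho(\lambda) \qquad \text{for all } \lambda \ge 1.
\]
Note that every convex risk measure is tight if $\Omega$ is compact.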


Proposition 4.30. Suppose that the convex risk measure $\rho$ on $\mathcal X$ is tight. Then (4.28)
holds and the conclusion of Proposition 4.27 is valid. Moreover, if $\Omega$ is a Polish space
and $\alpha$ is a penalty function on $\mathcal M_1$ such that
\[
  \rho(X) = \sup_{Q \in \mathcal M_1} \bigl( E_Q[-X] - \alpha(Q) \bigr) \qquad \text{for } X \in C_b(\Omega),
\]
then the level sets $\Lambda_c = \{ Q \in \mathcal M_1 \mid \alpha(Q) \le c \}$ are relatively compact for the weak
topology on $\mathcal M_1$.

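Proof. First we show (4.28). Suppose $X_n \in C_b(\Omega)$ are such that $X_n \nearrow \lambda > 0$. We
may assume without loss of generality that $\rho$ is normalized. Convexity and
normalization guarantee that condition (4.28) holds for all $\lambda > 0$ as soon as it holds for all
$\lambda \ge c$ where $c$ is an arbitrary constant larger than 1. Hence, the cash invariance of
$\rho$ implies that there is no loss of generality in assuming $X_n \ge 0$ for all $n$. We must
show that $\rho(X_n) \le \rho(\lambda) + 2\varepsilon$ eventually, where we take $\varepsilon \in (0, \lambda - 1)$.
By assumption, there exists a compact set $K_N$ such that
\[
  \rho\bigl( (\lambda - \varepsilon) I_{K_N} \bigr) \le \rho(\lambda - \varepsilon) + \varepsilon = \rho(\lambda) + 2\varepsilon.
\]
By Dini's lemma as recalled in Lemma 4.24, there exists some $n_0 \in \mathbb N$ such that
$\lambda - \varepsilon \le X_n$ on $K_N$ for all $n \ge n_0$. Finally, monotonicity implies
\[
  \rho(X_n) \le \rho\bigl( (\lambda - \varepsilon) I_{K_N} \bigr) \le \rho(\lambda) + 2\varepsilon.
\]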
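To prove the relative compactness of $\Lambda_c$, we will show that for any $\varepsilon > 0$ there
exists a compact set $K_\varepsilon \subset \Omega$ such that for all $c > -\rho(0)$
\[
  \inf_{Q \in \Lambda_c} Q[K_\varepsilon] \ge 1 - \varepsilon \bigl( c + \rho(0) + 1 \bigr).
\]
The relative compactness of $\Lambda_c$ will then be an immediate consequence of Prohorov's
characterization of weakly compact sets in $\mathcal M_1$, as stated in Theorem A.42. We fix a
countable dense set $\{\omega_1, \omega_2, \dots\} \subset \Omega$ and a complete metric $\delta$ which generates the
topology of $\Omega$. For $r > 0$ we define continuous functions $\Delta_i^r$ on $\Omega$ by
\[
  \Delta_i^r(\omega) := 1 - \frac{\delta(\omega, \omega_i) \wedge r}{r}.
\]
The function $\Delta_i^r$ is dominated by the indicator function of the closed metric ball
\[
  \overline B^{\,r}(\omega_i) := \{ \omega \in \Omega \mid \delta(\omega, \omega_i) \le r \}.
\]
Let
\[
  X_n^r(\omega) := \max_{i \le n} \Delta_i^r(\omega).
\]
Clearly, $X_n^r$ is continuous and satisfies $0 \le X_n^r \le 1$ as well as $X_n^r \nearrow 1$ for $n \uparrow \infty$.
According to (4.25), we have for all $\lambda > 0$
\[
  \inf_{Q \in \Lambda_c} Q\Bigl[ \bigcup_{i=1}^{n} \overline B^{\,r}(\omega_i) \Bigr]
  \ge \inf_{Q \in \Lambda_c} E_Q[X_n^r]
  \ge - \frac{c + \rho(\lambda X_n^r)}{\lambda}.
\]
Now we take $\lambda_k := 2^k/\varepsilon$ and $r_k := 1/k$. The first part of this proof and (4.28) yield
the existence of $n_k \in \mathbb N$ such that
\[
  \rho\bigl( \lambda_k X_{n_k}^{r_k} \bigr) \le \rho(\lambda_k) + 1 = -\lambda_k + 1,
\]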

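and thus
\[
  \sup_{Q \in \Lambda_c} Q\Bigl[ \bigcap_{i=1}^{n_k} \Omega \setminus \overline B^{\,r_k}(\omega_i) \Bigr]
  \le \frac{c + 1}{\lambda_k} = \varepsilon\, 2^{-k} (c + 1).
\]
We let
\[
  K_\varepsilon := \bigcap_{k=1}^{\infty} \bigcup_{i=1}^{n_k} \overline B^{\,r_k}(\omega_i).
\]
Then, for each $Q \in \Lambda_c$
\[
\begin{aligned}
  Q[K_\varepsilon] &= 1 - Q\Bigl[ \bigcup_{k=1}^{\infty} \bigcap_{i=1}^{n_k} \Omega \setminus \overline B^{\,r_k}(\omega_i) \Bigr] \\
                   &\ge 1 - \sum_{k=1}^{\infty} \varepsilon\, 2^{-k} (c + 1) \\
                   &= 1 - \varepsilon (c + 1).
\end{aligned}
\]
The reader may notice that $K_\varepsilon$ is closed, totally bounded and, hence, compact. A short
proof of this fact goes as follows: Let $(x_j)$ be a sequence in $K_\varepsilon$. We must show that
$(x_j)$ has a convergent subsequence. Since $K_\varepsilon$ is covered by $\overline B^{\,r_k}(\omega_1), \dots, \overline B^{\,r_k}(\omega_{n_k})$
for each $k$, there exists some $i_k \le n_k$ such that infinitely many $x_j$ are contained in
$\overline B^{\,r_k}(\omega_{i_k})$. A diagonalization argument yields a single subsequence $(x_{j'})$ which for
each $k$ is contained in some $\overline B^{\,r_k}(\omega_{i_k})$. Thus, $(x_{j'})$ is a Cauchy sequence with respect
to the complete metric $\delta$ and hence converges to some element $\omega \in \Omega$.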
Remark 4.31. Note that the representation (4.29) does not necessarily extend from
$C_b(\Omega)$ to the space $\mathcal X$ of all bounded measurable functions. Suppose in fact that $\Omega$
is compact but not finite, so that condition (4.28) holds as explained in Remark 4.28.
There is a finitely additive $Q_0 \in \mathcal M_{1,f}$ which does not belong to $\mathcal M_1$; see Example
A.53. The proof of Proposition 4.27 shows that there is some $\tilde Q \in \mathcal M_1$ such that
the coherent risk measure $\rho$ defined by $\rho(X) := E_{Q_0}[-X]$ coincides with $E_{\tilde Q}[-X]$
for $X \in C_b(\Omega)$. But $\rho$ does not admit a representation of the form
\[
  \rho(X) = \sup_{Q \in \mathcal M_1} \bigl( E_Q[-X] - \alpha(Q) \bigr) \qquad \text{for all } X \in \mathcal X.
\]
In fact, this would imply
\[
  \alpha(Q) \ge E_{Q_0}[X] - E_Q[X]
\]
for $Q \in \mathcal M_1$ and any $X \in \mathcal X$, hence $\alpha(Q) = \infty$ for any $Q \in \mathcal M_1$.

4.3 Convex risk measures on $L^\infty$

In the sequel, we fix a probability measure $P$ on $(\Omega, \mathcal F)$ and consider risk measures
$\rho$ such that
\[
  \rho(X) = \rho(Y) \quad \text{if } X = Y \ P\text{-a.s.} \tag{4.31}
\]
Note that only the nullsets of $P$ will matter in this section.
Lemma 4.32. Let $\rho$ be a convex risk measure that satisfies (4.31) and which is
represented by a penalty function $\alpha$ as in (4.15). Then $\alpha(Q) = +\infty$ for any
$Q \in \mathcal M_{1,f}(\Omega, \mathcal F)$ which is not absolutely continuous with respect to $P$.
Proof. If $Q \in \mathcal M_{1,f}(\Omega, \mathcal F)$ is not absolutely continuous with respect to $P$, then there
exists $A \in \mathcal F$ such that $Q[A] > 0$ but $P[A] = 0$. Take any $X \in \mathcal A_\rho$, and define
$X_n := X - n I_A$. Then $\rho(X_n) = \rho(X)$, i.e., $X_n$ is again contained in $\mathcal A_\rho$. Hence,
\[
  \alpha(Q) \ge \alpha_{\min}(Q) \ge E_Q[-X_n] = E_Q[-X] + n\, Q[A] \to \infty
\]
as $n \uparrow \infty$.
In view of (4.31), we can identify $\mathcal X$ with the Banach space $L^\infty := L^\infty(\Omega, \mathcal F, P)$.
Let us denote by
\[
  \mathcal M_1(P) := \mathcal M_1(\Omega, \mathcal F, P)
\]
the set of all probability measures on $(\Omega, \mathcal F)$ which are absolutely continuous with
respect to $P$. The following theorem characterizes those convex risk measures on $L^\infty$
that can be represented by a penalty function concentrated on probability measures,
and hence on $\mathcal M_1(P)$, due to Lemma 4.32.
Theorem 4.33. Suppose $\rho : L^\infty \to \mathbb R$ is a convex risk measure. Then the following
conditions are equivalent:
(a) $\rho$ can be represented by some penalty function on $\mathcal M_1(P)$.
(b) $\rho$ can be represented by the restriction of the minimal penalty function $\alpha_{\min}$ to
$\mathcal M_1(P)$:
\[
  \rho(X) = \sup_{Q \in \mathcal M_1(P)} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr), \qquad X \in L^\infty. \tag{4.32}
\]
(c) $\rho$ is continuous from above: If $X_n \searrow X$ $P$-a.s. then $\rho(X_n) \nearrow \rho(X)$.
(d) $\rho$ has the following Fatou property: for any bounded sequence $(X_n)$ which
converges $P$-a.s. to some $X$,
\[
  \rho(X) \le \liminf_{n \uparrow \infty} \rho(X_n).
\]
(e) $\rho$ is lower semicontinuous for the weak$^*$ topology $\sigma(L^\infty, L^1)$.
(f) The acceptance set $\mathcal A_\rho$ of $\rho$ is weak$^*$ closed in $L^\infty$, i.e., $\mathcal A_\rho$ is closed with
respect to the topology $\sigma(L^\infty, L^1)$.
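Proof. The implication (b) $\Rightarrow$ (a) is obvious, and (a) $\Rightarrow$ (c) $\Leftrightarrow$ (d) follows as in
Lemma 4.21, replacing pointwise convergence by $P$-a.s. convergence.
(c) $\Rightarrow$ (e): We have to show that $\mathcal C := \{\rho \le c\}$ is weak$^*$ closed for $c \in \mathbb R$. To
this end, let $\mathcal C_r := \mathcal C \cap \{ X \in L^\infty \mid \|X\|_\infty \le r \}$ for $r > 0$. If $(X_n)$ is a sequence
in $\mathcal C_r$ converging in $L^1$ to some random variable $X$, then there is a subsequence that
converges $P$-a.s., and the Fatou property of $\rho$ implies that $X \in \mathcal C_r$. Hence, $\mathcal C_r$ is
closed in $L^1$, and Lemma A.65 implies that $\mathcal C := \{\rho \le c\}$ is weak$^*$ closed.
(e) $\Rightarrow$ (f) is obvious.
(f) $\Rightarrow$ (b): We fix some $X \in L^\infty$ and let
\[
  m = \sup_{Q \in \mathcal M_1(P)} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr). \tag{4.33}
\]
In view of Theorem 4.16, we need to show that $m \ge \rho(X)$ or, equivalently, that
$m + X \in \mathcal A_\rho$. Suppose by way of contradiction that $m + X \notin \mathcal A_\rho$. Since the nonempty
convex set $\mathcal A_\rho$ is weak$^*$ closed by assumption, we may apply Theorem A.57 in
the locally convex space $(L^\infty, \sigma(L^\infty, L^1))$ with $\mathcal C := \mathcal A_\rho$ and $\mathcal B := \{m + X\}$. We
obtain a continuous linear functional $\ell$ on $(L^\infty, \sigma(L^\infty, L^1))$ such that
\[
  \beta := \inf_{Y \in \mathcal A_\rho} \ell(Y) > \ell(m + X) =: \gamma > -\infty. \tag{4.34}
\]
By Proposition A.59, $\ell$ is of the form $\ell(Y) = E[YZ]$ for some $Z \in L^1$. In fact, $Z \ge 0$.
To show this, fix $Y \ge 0$ and note that $\rho(\lambda Y) \le \rho(0)$ for $\lambda \ge 0$, by monotonicity.
Hence $\lambda Y + \rho(0) \in \mathcal A_\rho$ for all $\lambda \ge 0$. It follows that
\[
  -\infty < \gamma < \ell\bigl( \lambda Y + \rho(0) \bigr) = \lambda\, \ell(Y) + \ell(\rho(0)).
\]
Taking $\lambda \uparrow \infty$ yields that $\ell(Y) \ge 0$ and in turn that $Z \ge 0$. Moreover, $P[Z > 0] > 0$
since $\ell$ is non-zero. Thus,
\[
  \frac{dQ_0}{dP} := \frac{Z}{E[Z]}
\]
defines a probability measure $Q_0 \in \mathcal M_1(P)$. By (4.34), we see that
\[
  \alpha_{\min}(Q_0) = \sup_{Y \in \mathcal A_\rho} E_{Q_0}[-Y] = - \frac{\beta}{E[Z]}.
\]
However,
\[
  E_{Q_0}[X] + m = \frac{\ell(m + X)}{E[Z]} = \frac{\gamma}{E[Z]} < \frac{\beta}{E[Z]} = -\alpha_{\min}(Q_0),
\]
in contradiction to (4.33). Hence, $m + X$ must be contained in $\mathcal A_\rho$, and thus
$m \ge \rho(X)$.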


The theorem shows that any convex risk measure on $L^\infty$ that is continuous from
above arises in the following manner. We consider any probabilistic model $Q \in
\mathcal M_1(P)$, but these models are taken more or less seriously as described by the penalty
function. Thus, the value $\rho(X)$ is computed as the worst case, over all models
$Q \in \mathcal M_1(P)$, of the expected loss $E_Q[-X]$, but reduced by $\alpha(Q)$. In the following
example, the given model $P$ is the one which is taken most seriously, and the penalty
function $\alpha(Q)$ is proportional to the deviation of $Q$ from $P$, measured by the relative
entropy.
Example 4.34. Consider the penalty function $\alpha : \mathcal M_1(P) \to [0, \infty]$ defined by
\[
  \alpha(Q) := \frac{1}{\beta} H(Q|P),
\]
where $\beta > 0$ is a given constant and
\[
  H(Q|P) = E_Q\Bigl[ \log \frac{dQ}{dP} \Bigr]
\]
is the relative entropy of $Q \in \mathcal M_1(P)$ with respect to $P$; see Definition 3.20. The
corresponding entropic risk measure $\rho$ is given by
\[
  \rho(X) = \sup_{Q \in \mathcal M_1(P)} \Bigl( E_Q[-X] - \frac{1}{\beta} H(Q|P) \Bigr).
\]
The variational principle for the relative entropy as stated in Lemma 3.29 shows that
\[
  E_Q[-X] - \frac{1}{\beta} H(Q|P) \le \frac{1}{\beta} \log E[e^{-\beta X}],
\]
and the upper bound is attained by the measure with the density $e^{-\beta X}/E[e^{-\beta X}]$.
Thus, the entropic risk measure takes the form
\[
  \rho(X) = \frac{1}{\beta} \log E[e^{-\beta X}].
\]
Note that $\alpha$ is in fact the minimal penalty function representing $\rho$, since Lemma 3.29
implies
\[
  \alpha_{\min}(Q) = \sup_{X \in L^\infty} \Bigl( E_Q[-X] - \frac{1}{\beta} \log E[e^{-\beta X}] \Bigr) = \frac{1}{\beta} H(Q|P).
\]
A financial interpretation of the entropic risk measure in terms of shortfall risk will be
discussed in Example 4.114.
◊
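The variational principle behind the entropic risk measure can be checked on a finite probability space. The sketch below is an illustration only: the reference probabilities `p`, the position `X`, and the parameter `beta` are hypothetical. It verifies that the measure with density proportional to $e^{-\beta X}$ attains the closed-form value, and that randomly drawn models do not exceed it.

```python
import numpy as np

def entropic_risk(X, p, beta):
    """(1/beta) * log E_P[exp(-beta X)] on a finite probability space."""
    return np.log(np.sum(p * np.exp(-beta * X))) / beta

def relative_entropy(q, p):
    """H(Q|P) = sum q log(q/p) with the convention 0 log 0 = 0; assumes p > 0."""
    mask = q > 0
    return np.sum(q[mask] * np.log(q[mask] / p[mask]))

def penalized_loss(X, q, p, beta):
    """E_Q[-X] - (1/beta) H(Q|P), the quantity maximized in the representation."""
    return np.sum(q * (-X)) - relative_entropy(q, p) / beta

# Hypothetical data: four scenarios.
p = np.array([0.1, 0.2, 0.3, 0.4])
X = np.array([-3.0, -1.0, 0.5, 2.0])
beta = 1.5

# Candidate maximizer: density proportional to exp(-beta X).
gibbs = p * np.exp(-beta * X)
gibbs /= gibbs.sum()

# Randomly drawn models for comparison.
rng = np.random.default_rng(1)
random_models = rng.dirichlet(np.ones(4), size=1000)
best_random = max(penalized_loss(X, q, p, beta) for q in random_models)
```

Random sampling only probes the inequality; it is the explicit maximizer `gibbs` that shows the bound is attained.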
Exercise 4.3.1. Show that the entropic risk measure $\rho$ converges to the worst-case
risk measure for $\beta \uparrow \infty$ and to the expected loss under $P$ for $\beta \downarrow 0$.
◊
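The two limits in this exercise can be observed numerically before proving them. The sketch below uses the same hypothetical finite model as an illustration, not as a proof; the shift inside `entropic_risk` is a standard log-sum-exp device to avoid overflow for large `beta`.

```python
import numpy as np

def entropic_risk(X, p, beta):
    """(1/beta) * log E_P[exp(-beta X)], computed with a log-sum-exp shift."""
    a = -beta * X
    shift = a.max()   # subtracting the maximum keeps exp() from overflowing
    return (shift + np.log(np.sum(p * np.exp(a - shift)))) / beta

# Hypothetical data: four scenarios, each with positive probability.
p = np.array([0.1, 0.2, 0.3, 0.4])
X = np.array([-3.0, -1.0, 0.5, 2.0])

worst_case = -X.min()              # -ess inf X, since every atom has positive mass
expected_loss = np.sum(p * (-X))   # E_P[-X]

for beta in [1e-4, 1e-2, 1.0, 1e2, 1e4]:
    print(f"beta={beta:g}  rho={entropic_risk(X, p, beta):.6f}")
```

The printed values increase with `beta`, starting near `expected_loss` and approaching `worst_case`.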


The following corollary characterizes those convex risk measures on $L^\infty$ that
satisfy the property of continuity from below, which is stronger than continuity from
above.
Corollary 4.35. For a convex risk measure $\rho$ on $L^\infty$, the following conditions are
equivalent:
(a) $\rho$ is continuous from below: $X_n \nearrow X \Longrightarrow \rho(X_n) \searrow \rho(X)$.
(b) $\rho$ satisfies the Lebesgue property: $\rho(X_n) \to \rho(X)$ whenever $(X_n)$ is a bounded
sequence in $L^\infty$ which converges $P$-a.s. to $X$.
(c) The minimal penalty function $\alpha_{\min}$ is concentrated on $\mathcal M_1(P)$, i.e., $\alpha_{\min}(Q) <
\infty$ implies $Q \in \mathcal M_1(P)$.
In particular we have
\[
  \rho(X) = \max_{Q \in \mathcal M_1(P)} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr), \qquad X \in L^\infty,
\]
whenever one of these three equivalent conditions is satisfied.
Proof. The equivalence of conditions (a) and (b) was shown in Exercise 4.2.2. The
equivalence of conditions (a) and (c) follows from Theorem 4.22 and Lemma 4.32.
Exercise 4.3.2. Show that the three conditions in Corollary 4.35 are equivalent to the
following fourth condition:
(d) For each $c \in \mathbb R$, the level set $\Lambda_c := \{ Q \mid \alpha_{\min}(Q) \le c \}$ is contained in $\mathcal M_1(P)$,
and the corresponding set of densities,
\[
  \Bigl\{ \frac{dQ}{dP} \Bigm| Q \in \Lambda_c \Bigr\},
\]
is weakly compact in $L^1(\Omega, \mathcal F, P)$.
Hint: Use Lemma 4.23 and the Dunford–Pettis theorem (Theorem A.67).

Example 4.36. Let $g : [0, \infty) \to \mathbb R \cup \{+\infty\}$ be a lower semicontinuous convex
function satisfying $g(1) < \infty$ and the superlinear growth condition $g(x)/x \to +\infty$
as $x \uparrow \infty$. Associated with it is the $g$-divergence
\[
  I_g(Q|P) := E\Bigl[ g\Bigl( \frac{dQ}{dP} \Bigr) \Bigr], \qquad Q \in \mathcal M_1(P). \tag{4.35}
\]
The $g$-divergence $I_g(Q|P)$ quantifies the deviation of the hypothetical model $Q$ from
the reference measure $P$. Thus,
\[
  \alpha_g(Q) := I_g(Q|P), \qquad Q \in \mathcal M_1(P),
\]
is a natural choice for a penalty function. The resulting risk measure
\[
  \rho_g(X) := \sup_{Q \ll P} \bigl( E_Q[-X] - I_g(Q|P) \bigr) \tag{4.36}
\]
is sometimes called divergence risk measure. Note that, for $g(x) = \frac{1}{\beta} x \log x$, $\rho_g$ is
just the entropic risk measure discussed in Example 4.34. Divergence risk measures
will be discussed in more detail in Section 4.9.
◊
Exercise 4.3.3. Show that the risk measure $\rho_g$ in (4.36) is continuous from below and
that $\alpha_g(Q) = I_g(Q|P)$, $Q \in \mathcal M_1(P)$, is its minimal penalty function. In particular,
the supremum in (4.36) is in fact a maximum.
Hint: Using Exercise 4.3.2 can be helpful.
◊
Theorem 4.33 takes the following form for coherent risk measures; the proof is the
same as the one for Corollary 4.19.
Corollary 4.37. A coherent risk measure on $L^\infty$ can be represented by a set $\mathcal Q \subset
\mathcal M_1(P)$ if and only if the equivalent conditions of Theorem 4.33 are satisfied. In this
case, the maximal representing subset of $\mathcal M_1(P)$ is given by
\[
  \mathcal Q_{\max} := \{ Q \in \mathcal M_1(P) \mid \alpha_{\min}(Q) = 0 \}.
\]
Let us also state a characterization of those coherent risk measures on $L^\infty$ which
are continuous from below.
Corollary 4.38. For a coherent risk measure $\rho$ on $L^\infty$ the following properties are
equivalent:
(a) $\rho$ is continuous from below: $X_n \nearrow X \Longrightarrow \rho(X_n) \searrow \rho(X)$.
(b) $\rho$ satisfies the Lebesgue property: $\rho(X_n) \to \rho(X)$ whenever $(X_n)$ is a bounded
sequence in $L^\infty$ which converges $P$-a.s. to $X$.
(c) We have $\mathcal Q_{\max} \subset \mathcal M_1(P)$.
(d) The set of densities
\[
  \Bigl\{ \frac{dQ}{dP} \Bigm| Q \in \mathcal Q_{\max} \Bigr\}
\]
is weakly compact in $L^1(\Omega, \mathcal F, P)$.
In this case, the representation
\[
  \rho(X) = \max_{Q \in \mathcal Q_{\max}} E_Q[-X], \qquad X \in L^\infty,
\]
involves only $\sigma$-additive probability measures.


We now give three examples of coherent risk measures which will be studied in
more detail in Section 4.4.
Example 4.39. In our present context, where we require condition (4.31), the
worst-case risk measure takes the form
\[
  \rho_{\max}(X) := - \operatorname{ess\,inf} X = \inf \{ m \in \mathbb R \mid X + m \ge 0 \ P\text{-a.s.} \}.
\]
One can easily check that $\rho_{\max}$ is coherent and satisfies the Fatou property. Moreover,
the acceptance set of $\rho_{\max}$ is equal to the positive cone $L^\infty_+$ in $L^\infty$, and this implies
$\alpha_{\min}(Q) = 0$ for any $Q \in \mathcal M_1(P)$. Thus,
\[
  \rho_{\max}(X) = \sup_{Q \in \mathcal M_1(P)} E_Q[-X].
\]
Note however that the supremum on the right cannot be replaced by a maximum as
soon as $(\Omega, \mathcal F, P)$ cannot be reduced to a finite model. Indeed, in that case there
exists $X \in L^\infty$ such that $X$ does not attain its essential infimum, and so there can
be no $Q \in \mathcal M_1(P)$ such that $E_Q[X] = \operatorname{ess\,inf} X = -\rho_{\max}(X)$. In this case, the
preceding corollary shows that $\rho_{\max}$ is not continuous from below.
◊
Example 4.40. Let $\mathcal Q_\lambda$ be the class of all $Q \in \mathcal M_1(P)$ whose density $dQ/dP$ is
bounded by $1/\lambda$ for some fixed parameter $\lambda \in (0, 1)$. The corresponding coherent
risk measure
\[
  \mathrm{AV@R}_\lambda(X) := \sup_{Q \in \mathcal Q_\lambda} E_Q[-X] \tag{4.37}
\]
will be called the Average Value at Risk at level $\lambda$. This terminology will become clear
in Section 4.4, which contains a detailed study of $\mathrm{AV@R}_\lambda$. By taking $g(x) := 0$ for
$x \le \lambda^{-1}$ and $g(x) := +\infty$ for $x > \lambda^{-1}$, one sees that $\mathrm{AV@R}_\lambda$ falls into the class of
divergence risk measures as introduced in Example 4.36. It follows from Exercise 4.3.3
that $\mathcal Q_\lambda$ is equal to the maximal representing subset of $\mathrm{AV@R}_\lambda$, that $\mathrm{AV@R}_\lambda$ is
continuous from below, and that the supremum in (4.37) is actually attained. An explicit
construction of the maximizing measure will be given in the proof of Theorem 4.52.
◊
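On a finite probability space the supremum in (4.37) is a small linear program and can be solved greedily: put as much mass as the bound $dQ/dP \le 1/\lambda$ allows on the worst outcomes first. The following sketch is an illustration under that finite-space assumption, with hypothetical data; it is not the construction from the proof of Theorem 4.52.

```python
import numpy as np

def avar(X, p, lam):
    """AV@R_lam(X) = max E_Q[-X] over Q with dQ/dP <= 1/lam, on a finite space.

    Returns the value together with a maximizing probability vector q.
    """
    order = np.argsort(X)                 # worst outcomes (smallest X) first
    q = np.zeros_like(p, dtype=float)
    remaining = 1.0                       # probability mass still to be placed
    for i in order:
        q[i] = min(p[i] / lam, remaining) # density cap: q_i <= p_i / lam
        remaining -= q[i]
        if remaining <= 0:
            break
    return np.sum(q * (-X)), q

# Hypothetical data: ten equally likely scenarios.
p = np.full(10, 0.1)
X = np.array([-5.0, -2.0, -1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0])
value, q = avar(X, p, lam=0.25)
```

With these inputs the maximizing measure puts mass 0.4, 0.4 and 0.2 on the three worst scenarios, so `value` equals 3.0 up to floating-point rounding, which is the average loss over the worst quarter of the distribution.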
Example 4.41. We take for $\mathcal Q$ the class of all conditional distributions $P[\,\cdot \mid A\,]$ such
that $A \in \mathcal F$ has $P[A] > \lambda$ for some fixed level $\lambda \in (0, 1)$. The coherent risk measure
induced by $\mathcal Q$,
\[
  \mathrm{WCE}_\lambda(X) := \sup \{ E[-X \mid A] \mid A \in \mathcal F,\ P[A] > \lambda \}, \tag{4.38}
\]
is called the worst conditional expectation at level $\lambda$. We will show in Section 4.4
that it coincides with the Average Value at Risk of Example 4.40 if the underlying
probability space is rich enough.
◊
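On a small finite space the supremum in (4.38) can be computed by brute force over all events, which makes the comparison with the Average Value at Risk concrete. The sketch below assumes a hypothetical uniform six-point model. Since every conditional distribution $P[\,\cdot \mid A\,]$ with $P[A] > \lambda$ has density at most $1/\lambda$, the worst conditional expectation cannot exceed $\mathrm{AV@R}_\lambda$; in this atomic example it is strictly smaller.

```python
import numpy as np
from itertools import combinations

def wce(X, p, lam):
    """Worst conditional expectation: max of E[-X | A] over events A with P[A] > lam."""
    n = len(X)
    best = -np.inf
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            idx = list(idx)
            mass = p[idx].sum()
            if mass > lam + 1e-12:   # strict inequality P[A] > lam, with a float tolerance
                best = max(best, np.sum(p[idx] * (-X[idx])) / mass)
    return best

def avar_value(X, p, lam):
    """Greedy value of max E_Q[-X] over Q with dQ/dP <= 1/lam on a finite space."""
    q = np.zeros(len(p))
    remaining = 1.0
    for i in np.argsort(X):          # worst outcomes first
        q[i] = min(p[i] / lam, remaining)
        remaining -= q[i]
    return float(np.sum(q * (-X)))

# Hypothetical data: six equally likely scenarios.
p = np.full(6, 1 / 6)
X = np.array([-4.0, -1.0, 0.0, 1.0, 2.0, 3.0])
lam = 0.25

wce_value = wce(X, p, lam)
avar_val = avar_value(X, p, lam)
```

Here an admissible event needs at least two atoms, so `wce_value` is 2.5 (the average loss on the two worst scenarios), while `avar_val` is 3.0. The gap reflects the atoms of this toy model, consistent with the statement that equality requires a sufficiently rich probability space.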


Let $\rho$ be a convex risk measure with the Fatou property. We now consider the
situation in which $\rho$ admits a representation in terms of equivalent probability measures
$Q \approx P$, i.e.,
\[
  \rho(X) = \sup_{Q \approx P} \bigl( E_Q[-X] - \alpha_{\min}(Q) \bigr), \qquad X \in L^\infty. \tag{4.39}
\]
We will show in the next theorem that this property can be characterized by the
following concept of sensitivity, which is sometimes also called relevance. It formalizes
the idea that $\rho$ should react to every nontrivial loss at a sufficiently high level.
Definition 4.42. A convex risk measure $\rho$ on $L^\infty$ is called sensitive with respect to
$P$ if for every nonconstant $X \in L^\infty$ with $X \le 0$ there exists $\lambda > 0$ such that
$\rho(\lambda X) > \rho(0)$.
Theorem 4.43. For a convex risk measure $\rho$ with the Fatou property, the following
conditions are equivalent:
(a) $\rho$ admits the representation (4.39) in terms of equivalent probability measures.
(b) $\rho$ is sensitive with respect to $P$.
(c) For every $A \in \mathcal F$ with $P[A] > 0$ there exists $\lambda > 0$ such that $\rho(-\lambda I_A) > \rho(0)$.
(d) For every $A \in \mathcal F$ with $P[A] > 0$ there exists $Q \in \mathcal M_1(P)$ with $Q[A] > 0$ and
$\alpha_{\min}(Q) < \infty$.
(e) There exists $Q \approx P$ with $\alpha_{\min}(Q) < \infty$.
Proof. Throughout the proof we will assume for simplicity that $\rho$ is normalized in the
sense that $\rho(0) = 0$. This can be done without loss of generality.
(a) $\Rightarrow$ (b): Take $X \in L^\infty_+$ with $E[X] > 0$. By (4.39) there exists $Q \approx P$ with
$\alpha_{\min}(Q) < \infty$ and
\[
  \rho(-\lambda X) \ge \lambda E_Q[X] - \alpha_{\min}(Q).
\]
The right-hand side is strictly positive as soon as
\[
  \lambda > \frac{\alpha_{\min}(Q)}{E_Q[X]},
\]
which is a finite number since $E_Q[X] > 0$ due to $Q \approx P$.
The implications (b) $\Rightarrow$ (c) and (c) $\Rightarrow$ (d) are both obvious.
(d) $\Rightarrow$ (e): For every $c > 0$ the level set $\Lambda_c := \{ Q \in \mathcal M_1(P) \mid \alpha_{\min}(Q) \le c \}$
is nonempty due to our assumption $\rho(0) = 0$. We will show that $\Lambda_c$ contains some
$Q \approx P$. We show first the following auxiliary claim:
\[
  \text{For any } A \in \mathcal F \text{ with } P[A] > 0 \text{ there exists } Q \in \Lambda_c \text{ with } Q[A] > 0. \tag{4.40}
\]


Chapter 4 Monetary measures of risk

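Indeed, (d) implies that there is $Q_0 \in \mathcal{M}_1(P)$ with $Q_0[A] > 0$ and $\alpha^{\min}(Q_0) < \infty$. Now we take $Q_1 \in \Lambda_{c/2}$ and let $Q_\varepsilon := \varepsilon Q_0 + (1-\varepsilon) Q_1$ for $0 < \varepsilon < 1$. We clearly have $Q_\varepsilon[A] > 0$ and, by the convexity of $\alpha^{\min}$,
$$\alpha^{\min}(Q_\varepsilon) \le \varepsilon\, \alpha^{\min}(Q_0) + (1-\varepsilon)\, \frac{c}{2} < c$$
for sufficiently small $\varepsilon > 0$. This implies (4.40).
We now apply the Halmos-Savage theorem in the form of Theorem 1.61. It yields the existence of $Q \in \Lambda_c$ with $Q \approx P$ if we can show that $\Lambda_c$ is countably convex. To show countable convexity, let $(\alpha_k)_{k \in \mathbb{N}}$ be a sequence of nonnegative real numbers summing up to 1 and take $Q_k \in \Lambda_c$ for $k \in \mathbb{N}$. We define $Q := \sum_{k=1}^\infty \alpha_k Q_k$. Then $Q \in \mathcal{M}_1(P)$ and
$$\alpha^{\min}(Q) = \sup_{X \in \mathcal{A}_\rho} E_Q[-X] = \sup_{X \in \mathcal{A}_\rho} \sum_{k=1}^\infty \alpha_k E_{Q_k}[-X] \le \sum_{k=1}^\infty \alpha_k \sup_{X \in \mathcal{A}_\rho} E_{Q_k}[-X] = \sum_{k=1}^\infty \alpha_k\, \alpha^{\min}(Q_k) \le c.$$
Thus, $Q$ belongs to $\Lambda_c$, and (e) follows.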


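(e) $\Rightarrow$ (a): By the representation (4.32) in Theorem 4.33 it is sufficient to show that
$$\rho(X) = \sup_{Q \in \mathcal{M}_1(P)} \bigl( E_Q[-X] - \alpha^{\min}(Q) \bigr) \le \sup_{Q \approx P} \bigl( E_Q[-X] - \alpha^{\min}(Q) \bigr)$$
for any given $X \in L^\infty$. To this end, we take $\delta > 0$ and choose $Q_1 \in \mathcal{M}_1(P)$ such that
$$E_{Q_1}[-X] - \alpha^{\min}(Q_1) > \rho(X) - \delta.$$
Then we take $Q_0 \approx P$ with $\alpha^{\min}(Q_0) < \infty$, which exists due to (e). When letting $Q_\varepsilon := \varepsilon Q_0 + (1-\varepsilon) Q_1$ we have $Q_\varepsilon \approx P$ for all $\varepsilon \in (0,1)$ and $\alpha^{\min}(Q_\varepsilon) \le \varepsilon\, \alpha^{\min}(Q_0) + (1-\varepsilon)\, \alpha^{\min}(Q_1)$. Hence,
$$E_{Q_\varepsilon}[-X] - \alpha^{\min}(Q_\varepsilon) \ge \varepsilon \bigl( E_{Q_0}[-X] - \alpha^{\min}(Q_0) \bigr) + (1-\varepsilon) \bigl( E_{Q_1}[-X] - \alpha^{\min}(Q_1) \bigr),$$
and this is larger than $\rho(X) - \delta$ when $\varepsilon$ is sufficiently small.
Condition (e) in Theorem 4.43 is of course satisfied when $\alpha^{\min}(P) < \infty$. This is the case for the worst conditional expectation WCE (Example 4.41), Average Value at Risk AV@R (Example 4.40), the divergence risk measures (Example 4.36), and in particular the entropic risk measure (Example 4.34). These, and many other risk measures, are therefore sensitive and admit a representation (4.39) in terms of equivalent probability measures.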


Remark 4.44. In analogy to Remark 4.18, the implication (e) $\Rightarrow$ (a) in the Representation Theorem 4.33 can be viewed as a special case of the general duality in Theorem A.62 for the Fenchel-Legendre transform of the convex function $\rho$ on $L^\infty$, combined with the properties of a monetary risk measure. From this general point of view, it is now clear how to state representation theorems for convex risk measures on the Banach spaces $L^p(\Omega, \mathcal{F}, P)$ for $1 \le p < \infty$. More precisely, let $q \in (1, \infty]$ be such that $\frac1p + \frac1q = 1$, and define
$$\mathcal{M}_1^q(P) := \Bigl\{ Q \in \mathcal{M}_1(P) \Bigm| \frac{dQ}{dP} \in L^q \Bigr\}.$$
A convex risk measure $\rho$ on $L^p$ is of the form
$$\rho(X) = \sup_{Q \in \mathcal{M}_1^q(P)} \bigl( E_Q[-X] - \alpha(Q) \bigr)$$
if and only if it is lower semicontinuous on $L^p$, i.e., the Fatou property holds in the form
$$X_n \to X \text{ in } L^p \implies \rho(X) \le \liminf_{n \uparrow \infty} \rho(X_n).$$ ◊

4.4 Value at Risk

A common approach to the problem of measuring the risk of a financial position $X$ consists in specifying a quantile of the distribution of $X$ under the given probability measure $P$. For $\lambda \in (0,1)$, a $\lambda$-quantile of a random variable $X$ on $(\Omega, \mathcal{F}, P)$ is any real number $q$ with the property
$$P[X \le q] \ge \lambda \quad\text{and}\quad P[X < q] \le \lambda,$$
and the set of all $\lambda$-quantiles of $X$ is an interval $[q_X^-(\lambda), q_X^+(\lambda)]$, where
$$q_X^-(t) = \sup\{ x \mid P[X < x] < t \} = \inf\{ x \mid P[X \le x] \ge t \}$$
is the lower and
$$q_X^+(t) = \inf\{ x \mid P[X \le x] > t \} = \sup\{ x \mid P[X < x] \le t \}$$
is the upper quantile function of $X$; see Appendix A.3. In this section, we will focus on the properties of $q_X^+(\lambda)$, viewed as a functional on a space of financial positions $X$.
Definition 4.45. Fix some level $\lambda \in (0,1)$. For a financial position $X$, we define its Value at Risk at level $\lambda$ as
$$\mathrm{V@R}_\lambda(X) := -q_X^+(\lambda) = q_{-X}^-(1-\lambda) = \inf\{ m \mid P[X + m < 0] \le \lambda \}. \tag{4.41}$$


In financial terms, $\mathrm{V@R}_\lambda(X)$ is the smallest amount of capital which, if added to $X$ and invested in the risk-free asset, keeps the probability of a negative outcome below the level $\lambda$. However, Value at Risk only controls the probability of a loss; it does not capture the size of such a loss if it occurs. Clearly, $\mathrm{V@R}_\lambda$ is a monetary risk measure on $\mathcal{X} = L^0$, which is positively homogeneous; see also Example 4.11. The following example shows that the acceptance set of $\mathrm{V@R}_\lambda$ is typically not convex, and so $\mathrm{V@R}_\lambda$ is not a convex risk measure. Thus, $\mathrm{V@R}_\lambda$ may penalize diversification instead of encouraging it.
Example 4.46. Consider an investment into two defaultable corporate bonds, each with return $\tilde r > r$, where $r \ge 0$ is the return on a riskless investment. The discounted net gain of an investment $w > 0$ in the $i$th bond is given by
$$X_i = \begin{cases} -w & \text{in case of default,} \\[4pt] \dfrac{w(\tilde r - r)}{1 + r} & \text{otherwise.} \end{cases}$$
If a default of the first bond occurs with probability $p \le \lambda$, then
$$P\Bigl[ X_1 - \frac{w(\tilde r - r)}{1 + r} < 0 \Bigr] = P[\,\text{1st bond defaults}\,] = p \le \lambda.$$
Hence,
$$\mathrm{V@R}_\lambda(X_1) = -\frac{w(\tilde r - r)}{1 + r} < 0.$$
This means that the position $X_1$ is acceptable in the sense that it does not carry a positive Value at Risk, regardless of the possible loss of the entire investment $w$.
Diversifying the portfolio by investing the amount $w/2$ into each of the two bonds leads to the position $Y := (X_1 + X_2)/2$. Let us assume that the two bonds default independently of each other, each of them with probability $p$. For realistic $\tilde r$, the probability that $Y$ is negative is equal to the probability that at least one of the two bonds defaults: $P[Y < 0] = p(2 - p)$. If, for instance, $p = 0.009$ and $\lambda = 0.01$ then we have $p < \lambda < p(2 - p)$, hence
$$\mathrm{V@R}_\lambda(Y) = \frac{w}{2} \cdot \Bigl( 1 - \frac{\tilde r - r}{1 + r} \Bigr).$$
Typically, this value is close to one half of the invested capital $w$. In particular, the acceptance set of $\mathrm{V@R}_\lambda$ is not convex. This example also shows that V@R may strongly discourage diversification: it penalizes quite drastically the increase of the probability that something goes wrong, without rewarding the significant reduction of the expected loss conditional on the event of default. Thus, optimizing a portfolio with respect to $\mathrm{V@R}_\lambda$ may lead to a concentration of the portfolio in one single asset with a sufficiently small default probability, but with an exposure to large losses. ◊
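The computation in Example 4.46 can be checked numerically. The following is a minimal sketch in Python and not part of the original text: the helper `value_at_risk` and the figures $w = 100$, $r = 5\%$, $\tilde r = 8\%$ are hypothetical choices made for illustration, while $p = 0.009$ and $\lambda = 0.01$ are the values used in the example. Exact rational arithmetic is used so that comparisons with the level $\lambda$ are not affected by rounding.

```python
from fractions import Fraction as F
from itertools import product


def value_at_risk(dist, lam):
    """Return V@R_lam(X) = inf{m : P[X + m < 0] <= lam}.

    `dist` is a finite distribution given as (value, probability) pairs.
    """
    # P[X + m < 0] only changes when -m crosses an atom of X, so the
    # infimum is attained at m = -x for some atom x; scan them upwards.
    for m in sorted(-x for x, _ in dist):
        if sum(p for x, p in dist if x + m < 0) <= lam:
            return m
    raise ValueError("distribution must be non-empty")


# Hypothetical figures: capital w, riskless return r, bond return r_tilde.
w, r, r_tilde = F(100), F(5, 100), F(8, 100)
p, lam = F(9, 1000), F(1, 100)          # p = 0.009 and lambda = 0.01
gain = w * (r_tilde - r) / (1 + r)      # discounted net gain without default

single = [(-w, p), (gain, 1 - p)]       # X_1: everything in one bond
# Y = (X_1 + X_2) / 2 for two independent bonds with the same law.
diversified = [((x1 + x2) / 2, p1 * p2)
               for (x1, p1), (x2, p2) in product(single, single)]

var_single = value_at_risk(single, lam)
var_diversified = value_at_risk(diversified, lam)

print("V@R(X_1) =", float(var_single))
print("V@R(Y)   =", float(var_diversified))
```

Under these assumptions the concentrated position has a negative Value at Risk, while the diversified position has a Value at Risk close to $w/2$, in line with the formulas of the example.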


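Exercise 4.4.1. Let $(Y_n)$ be a sequence of independent and identically distributed random variables in $L^1(\Omega, \mathcal{F}, P)$. Show that
$$\mathrm{V@R}_\lambda\Bigl( \frac1n \sum_{i=1}^n Y_i \Bigr) \longrightarrow -E[Y_1] \quad\text{as } n \uparrow \infty$$
for any $\lambda \in (0,1)$. Choose the common distribution in such a way that convexity is violated for large $n$, i.e.,
$$\mathrm{V@R}_\lambda\Bigl( \frac1n \sum_{i=1}^n Y_i \Bigr) > \mathrm{V@R}_\lambda(Y_1).$$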

In the remainder of this section, we will focus on monetary measures of risk which, in contrast to $\mathrm{V@R}_\lambda$, are convex or even coherent on $\mathcal{X} := L^\infty$. In particular, we are looking for convex risk measures which come close to $\mathrm{V@R}_\lambda$. A first guess might be that one should take the smallest convex measure of risk, continuous from above, which dominates $\mathrm{V@R}_\lambda$. However, since $\mathrm{V@R}_\lambda$ itself is not convex, the following proposition shows that such a smallest $\mathrm{V@R}_\lambda$-dominating convex risk measure does not exist.
Proposition 4.47. For each $X \in \mathcal{X}$ and each $\lambda \in (0,1)$,
$$\mathrm{V@R}_\lambda(X) = \min\bigl\{ \rho(X) \bigm| \rho \text{ is convex, continuous from above, and } \rho \ge \mathrm{V@R}_\lambda \bigr\}.$$
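Proof. Let $q := -\mathrm{V@R}_\lambda(X) = q_X^+(\lambda)$ so that $P[X < q] \le \lambda$. If $A \in \mathcal{F}$ satisfies $P[A] > \lambda$, then $P[A \cap \{X \ge q\}] > 0$. Thus, we may define a measure $Q_A$ by
$$Q_A := P[\,\cdot \mid A \cap \{X \ge q\}\,].$$
It follows that $E_{Q_A}[-X] \le -q = \mathrm{V@R}_\lambda(X)$.
Let $\mathcal{Q} := \{ Q_A \mid P[A] > \lambda \}$, and use this set to define a coherent risk measure $\rho$ via
$$\rho(Y) := \sup_{Q \in \mathcal{Q}} E_Q[-Y].$$
Then $\rho(X) \le \mathrm{V@R}_\lambda(X)$. Hence, the assertion will follow if we can show that $\rho(Y) \ge \mathrm{V@R}_\lambda(Y)$ for each $Y \in \mathcal{X}$. Let $\varepsilon > 0$ and $A := \{ Y \le -\mathrm{V@R}_\lambda(Y) + \varepsilon \}$. Clearly $P[A] > \lambda$, and so $Q_A \in \mathcal{Q}$. Moreover, $Q_A[A] = 1$, and we obtain
$$\rho(Y) \ge E_{Q_A}[-Y] \ge \mathrm{V@R}_\lambda(Y) - \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, the result follows.
For the rest of this section, we concentrate on the following risk measure which is defined in terms of Value at Risk, but does satisfy the axioms of a coherent risk measure.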


Definition 4.48. The Average Value at Risk at level $\lambda \in (0,1]$ of a position $X \in \mathcal{X}$ is given by
$$\mathrm{AV@R}_\lambda(X) = \frac1\lambda \int_0^\lambda \mathrm{V@R}_\gamma(X)\, d\gamma.$$
Sometimes, the Average Value at Risk is also called the "Conditional Value at Risk" or the "expected shortfall", and one writes $\mathrm{CV@R}_\lambda(X)$ or $\mathrm{ES}_\lambda(X)$. These terms are motivated by formulas (4.44) and (4.42) below, but they are potentially misleading: "Conditional Value at Risk" might also be used to denote the Value at Risk with respect to a conditional distribution, and "expected shortfall" might be understood as the expectation of the shortfall $X^-$. For these reasons, we prefer the term Average Value at Risk. Note that
$$\mathrm{AV@R}_\lambda(X) = -\frac1\lambda \int_0^\lambda q_X(t)\, dt$$
by (4.41). In particular, the definition of $\mathrm{AV@R}_\lambda(X)$ makes sense for any $X \in L^1(\Omega, \mathcal{F}, P)$ and we have, in view of Lemma A.19,
$$\mathrm{AV@R}_1(X) = -\int_0^1 q_X^+(t)\, dt = E[-X].$$

Exercise 4.4.2. Compute $\mathrm{AV@R}_\lambda(X)$ when $X$ is
(a) uniform,
(b) normally distributed,
(c) log-normally distributed, i.e., $X = e^{\sigma Z + m}$ with $Z \sim N(0,1)$ and $m, \sigma \in \mathbb{R}$.
Recalling the results from Example 4.11 and Exercise 4.1.5, compare the behavior of $\mathrm{V@R}_\lambda(X)$ and $\mathrm{AV@R}_\lambda(X)$ as the parameter $\sigma$ increases from 0 to $\infty$. ◊
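Remark 4.49. Theorem 2.57 shows that the partial order $\succcurlyeq_{\mathrm{uni}}$ on probability measures on $\mathbb{R}$ with finite mean can be characterized in terms of Average Value at Risk:
$$\mu \succcurlyeq_{\mathrm{uni}} \nu \iff \mathrm{AV@R}_\lambda(X_\mu) \le \mathrm{AV@R}_\lambda(X_\nu) \quad\text{for all } \lambda \in (0,1],$$
where $X_\mu$ and $X_\nu$ are random variables with distributions $\mu$ and $\nu$.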

Remark 4.50. For $X \in L^\infty$, we have
$$\lim_{\lambda \downarrow 0} \mathrm{V@R}_\lambda(X) = -\operatorname{ess\,inf} X = \inf\{ m \mid P[X + m < 0] \le 0 \}.$$
Hence, it makes sense to define
$$\mathrm{AV@R}_0(X) := \mathrm{V@R}_0(X) := -\operatorname{ess\,inf} X,$$
which is the worst-case risk measure on $L^\infty$ introduced in Example 4.39. Recall that it is continuous from above but in general not from below. ◊


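Lemma 4.51. For $\lambda \in (0,1)$ and any $\lambda$-quantile $q$ of $X$,
$$\mathrm{AV@R}_\lambda(X) = \frac1\lambda E[(q - X)^+] - q = \frac1\lambda \inf_{r \in \mathbb{R}} \bigl( E[(r - X)^+] - \lambda r \bigr). \tag{4.42}$$
Proof. Let $q_X$ be a quantile function with $q_X(\lambda) = q$. By Lemma A.19,
$$\frac1\lambda E[(q - X)^+] - q = \frac1\lambda \int_0^1 (q - q_X(t))^+\, dt - q = -\frac1\lambda \int_0^\lambda q_X(t)\, dt = \mathrm{AV@R}_\lambda(X).$$
This proves the first identity. The second one follows from Lemma A.22.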
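The two expressions in (4.42) can be compared on a small discrete example. The sketch below is an illustration only: the four-point distribution, the level $\lambda = 1/10$, and the function names `avar_from_quantiles` and `objective` are hypothetical and do not come from the text. It evaluates the quantile integral directly and then minimises the function $r \mapsto \frac1\lambda (E[(r - X)^+] - \lambda r)$ over the atoms of $X$.

```python
from fractions import Fraction as F


def avar_from_quantiles(dist, lam):
    """AV@R_lam(X) = -(1/lam) * integral over (0, lam) of q_X(t) dt."""
    remaining, integral = lam, F(0)
    for x, p in sorted(dist):           # atoms in increasing order
        mass = min(p, remaining)        # part of the atom lying below level lam
        integral += x * mass
        remaining -= mass
        if remaining == 0:
            break
    return -integral / lam


def objective(dist, lam, r):
    """(1/lam) * (E[(r - X)^+] - lam * r), the quantity minimised in (4.42)."""
    expected_excess = sum(p * max(r - x, 0) for x, p in dist)
    return (expected_excess - lam * r) / lam


# Hypothetical position: (value, probability) pairs.
dist = [(F(-10), F(1, 20)), (F(-4), F(1, 10)), (F(0), F(1, 4)), (F(3), F(3, 5))]
lam = F(1, 10)

avar = avar_from_quantiles(dist, lam)
# The objective is convex and piecewise linear in r with kinks at the atoms,
# so it suffices to minimise over the atoms of X.
avar_by_minimisation = min(objective(dist, lam, x) for x, _ in dist)
q = F(-4)                               # the unique lam-quantile of this X
avar_at_quantile = objective(dist, lam, q)

print(avar, avar_by_minimisation, avar_at_quantile)
```

All three numbers agree for this example, and the minimum is attained at the $\lambda$-quantile, as the lemma states.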
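Theorem 4.52. For $\lambda \in (0,1]$, $\mathrm{AV@R}_\lambda$ is a coherent risk measure which is continuous from below. It has the representation
$$\mathrm{AV@R}_\lambda(X) = \max_{Q \in \mathcal{Q}_\lambda} E_Q[-X], \qquad X \in \mathcal{X}, \tag{4.43}$$
where $\mathcal{Q}_\lambda$ is the set of all probability measures $Q \ll P$ whose density $dQ/dP$ is $P$-a.s. bounded by $1/\lambda$. Moreover, $\mathcal{Q}_\lambda$ is equal to the maximal set $\mathcal{Q}_{\max}$ of Corollary 4.37.
Proof. Since $\mathcal{Q}_1 = \{P\}$, the assertion is obvious for $\lambda = 1$. For $0 < \lambda < 1$, consider the coherent risk measure $\rho_\lambda(X) := \sup_{Q \in \mathcal{Q}_\lambda} E_Q[-X]$. First we assume that we are given some $X < 0$. We define a measure $\tilde P \approx P$ by $d\tilde P/dP = X/E[X]$. Then
$$\rho_\lambda(X) = \frac{E[-X]}{\lambda} \sup\bigl\{ \tilde E[\varphi] \bigm| 0 \le \varphi \le 1,\ E[\varphi] = \lambda \bigr\}.$$
Clearly, the condition $E[\varphi] = \lambda$ on the right can be replaced by $E[\varphi] \le \lambda$. Thus, we can apply the Neyman-Pearson lemma in the form of Theorem A.31 and conclude that the supremum is attained by
$$\varphi_0 = I_{\{X < q\}} + \kappa I_{\{X = q\}}$$
for a $\lambda$-quantile $q$ of $X$ and some $\kappa \in [0,1]$ for which $E[\varphi_0] = \lambda$. Hence,
$$\rho_\lambda(X) = \frac{E[-X]}{\lambda} \cdot \tilde E[\varphi_0] = \frac1\lambda E[-X \varphi_0].$$
Since $dQ_0 = \lambda^{-1} \varphi_0\, dP$ defines a probability measure in $\mathcal{Q}_\lambda$, we conclude that
$$\begin{aligned} \rho_\lambda(X) = \max_{Q \in \mathcal{Q}_\lambda} E_Q[-X] = E_{Q_0}[-X] &= \frac1\lambda \bigl( E[-X;\, X < q] - q\lambda + q P[X < q] \bigr) \\ &= \frac1\lambda E[(q - X)^+] - q \\ &= \mathrm{AV@R}_\lambda(X), \end{aligned}$$
where we have used (4.42) in the last step. This proves (4.43) for $X < 0$. For arbitrary $X \in L^\infty$, we use the cash invariance of both $\rho_\lambda$ and $\mathrm{AV@R}_\lambda$.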
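It remains to prove that $\mathcal{Q}_\lambda$ is the maximal set of Corollary 4.37. This follows from Exercise 4.3.3 and Example 4.40, but we also give a different argument here. To this end, we show that
$$\sup_{X \in \mathcal{X}} \bigl( E_Q[-X] - \mathrm{AV@R}_\lambda(X) \bigr) = +\infty$$
for $Q \notin \mathcal{Q}_\lambda$. We denote by $\varphi$ the density $dQ/dP$. There exist $\lambda' \in (0, \lambda)$ and $k > 1/\lambda'$ such that $P[\varphi \wedge k \ge 1/\lambda'] > 0$. For $c > 0$ define $X^{(c)} \in \mathcal{X}$ by
$$X^{(c)} := -c\,(\varphi \wedge k)\, I_{\{\varphi \ge 1/\lambda'\}}.$$
Since
$$P[X^{(c)} < 0] = P\Bigl[ \varphi \ge \frac{1}{\lambda'} \Bigr] \le \lambda' < \lambda,$$
we have $\mathrm{V@R}_\lambda(X^{(c)}) = 0$, and (4.42) yields that
$$\mathrm{AV@R}_\lambda(X^{(c)}) = \frac1\lambda E[-X^{(c)}] = \frac{c}{\lambda}\, E\Bigl[ \varphi \wedge k;\ \varphi \ge \frac{1}{\lambda'} \Bigr].$$
On the other hand,
$$E_Q[-X^{(c)}] = c \cdot E\Bigl[ \varphi \cdot (\varphi \wedge k);\ \varphi \ge \frac{1}{\lambda'} \Bigr] \ge \frac{c}{\lambda'}\, E\Bigl[ \varphi \wedge k;\ \varphi \ge \frac{1}{\lambda'} \Bigr].$$
Thus, the difference between $E_Q[-X^{(c)}]$ and $\mathrm{AV@R}_\lambda(X^{(c)})$ becomes arbitrarily large as $c \uparrow \infty$.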
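Remark 4.53. The proof shows that for $\lambda \in (0,1)$ the maximum in (4.43) is attained by the measure $Q_0 \in \mathcal{Q}_\lambda$, whose density is given by
$$\frac{dQ_0}{dP} = \frac1\lambda \bigl( I_{\{X < q\}} + \kappa I_{\{X = q\}} \bigr),$$
where $q$ is a $\lambda$-quantile of $X$, and where $\kappa$ is defined as
$$\kappa := \begin{cases} 0 & \text{if } P[X = q] = 0, \\[4pt] \dfrac{\lambda - P[X < q]}{P[X = q]} & \text{otherwise.} \end{cases}$$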
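For a position with finitely many values, the maximizing density of Remark 4.53 can be written down explicitly. The following sketch is illustrative only: the distribution, the level $\lambda = 1/10$, and the helper names `lower_quantile` and `maximizing_density` are assumptions made for this example. It builds $dQ_0/dP$ as a function of the value of $X$ and checks that $Q_0$ has total mass one, that its density is bounded by $1/\lambda$, and what the expected loss under $Q_0$ is.

```python
from fractions import Fraction as F


def lower_quantile(dist, lam):
    """Smallest x with P[X <= x] >= lam, which is a lam-quantile of X."""
    cumulative = F(0)
    for x, p in sorted(dist):
        cumulative += p
        if cumulative >= lam:
            return x
    raise ValueError("probabilities must sum to at least lam")


def maximizing_density(dist, lam):
    """Density dQ_0/dP of Remark 4.53, as a function of the value of X."""
    q = lower_quantile(dist, lam)
    p_below = sum(p for x, p in dist if x < q)
    p_at = sum(p for x, p in dist if x == q)
    kappa = F(0) if p_at == 0 else (lam - p_below) / p_at

    def density(x):
        if x < q:
            return 1 / lam
        if x == q:
            return kappa / lam
        return F(0)

    return density


# Hypothetical position: (value, probability) pairs.
dist = [(F(-10), F(1, 20)), (F(-4), F(1, 10)), (F(0), F(1, 4)), (F(3), F(3, 5))]
lam = F(1, 10)
density = maximizing_density(dist, lam)

total_mass = sum(p * density(x) for x, p in dist)             # Q_0[Omega]
expected_loss = sum(p * density(x) * (-x) for x, p in dist)   # E_{Q_0}[-X]
largest_density = max(density(x) for x, _ in dist)

print(total_mass, expected_loss, largest_density)
```

For this example the expected loss under $Q_0$ coincides with the value of $\mathrm{AV@R}_\lambda(X)$ obtained from the quantile integral $-\frac1\lambda \int_0^\lambda q_X(t)\, dt$.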

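Corollary 4.54. For all $X \in \mathcal{X}$,
$$\mathrm{AV@R}_\lambda(X) \ge \mathrm{WCE}_\lambda(X) \ge E[-X \mid -X \ge \mathrm{V@R}_\lambda(X)] \ge \mathrm{V@R}_\lambda(X), \tag{4.44}$$
where $\mathrm{WCE}_\lambda$ is the coherent risk measure defined in (4.38). Moreover, the first two inequalities are in fact identities if
$$P[X \le q_X^+(\lambda)] = \lambda, \tag{4.45}$$
which is the case if $X$ has a continuous distribution.
Proof. If $P[A] \ge \lambda$, then the density of $P[\,\cdot \mid A\,]$ with respect to $P$ is bounded by $1/\lambda$. Therefore, Theorem 4.52 implies that $\mathrm{AV@R}_\lambda$ dominates $\mathrm{WCE}_\lambda$. Since
$$P[-X \ge \mathrm{V@R}_\lambda(X) - \varepsilon] > \lambda,$$
we have
$$\mathrm{WCE}_\lambda(X) \ge E[-X \mid -X \ge \mathrm{V@R}_\lambda(X) - \varepsilon],$$
and the second inequality follows by taking the limit $\varepsilon \downarrow 0$. Moreover, (4.42) shows that
$$\mathrm{AV@R}_\lambda(X) = E[-X \mid -X \ge \mathrm{V@R}_\lambda(X)]$$
as soon as (4.45) holds.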
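The chain of inequalities (4.44) can be observed on a finite probability space, where the supremum defining $\mathrm{WCE}_\lambda$ can be computed by brute force over all events. The sketch below is a hypothetical illustration: the four-point space, its probabilities, and the level $\lambda = 1/10$ are assumptions. Since such a space has atoms, condition (4.45) need not hold and the first inequality can be strict.

```python
from fractions import Fraction as F
from itertools import combinations

values = [F(-10), F(-4), F(0), F(3)]             # X(omega) on a 4-point space
probs = [F(1, 20), F(1, 10), F(1, 4), F(3, 5)]   # P[{omega}]
lam = F(1, 10)
omega = range(len(values))


def prob(event):
    return sum(probs[i] for i in event)


def cond_loss(event):
    """E[-X | event]."""
    return sum(-values[i] * probs[i] for i in event) / prob(event)


# V@R_lam(X) = inf{m : P[X + m < 0] <= lam}; the infimum is attained at m = -x.
var = min(m for m in (-x for x in values)
          if prob([i for i in omega if values[i] + m < 0]) <= lam)

# AV@R_lam(X) = -(1/lam) * integral_0^lam q_X(t) dt.
remaining, integral = lam, F(0)
for i in sorted(omega, key=lambda i: values[i]):
    mass = min(probs[i], remaining)
    integral += values[i] * mass
    remaining -= mass
avar = -integral / lam

# WCE_lam(X): brute force over all events A with P[A] > lam.
events = [a for k in range(1, len(values) + 1) for a in combinations(omega, k)]
wce = max(cond_loss(a) for a in events if prob(a) > lam)

# Tail conditional expectation E[-X | -X >= V@R_lam(X)].
tce = cond_loss([i for i in omega if -values[i] >= var])

print(avar, wce, tce, var)
```

In this example the four quantities are ordered as in (4.44), with a strict first inequality caused by the atoms of the underlying space.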
Remark 4.55. We will see in Corollary 4.68 that the two coherent risk measures $\mathrm{AV@R}_\lambda$ and $\mathrm{WCE}_\lambda$ coincide if the underlying probability space is rich enough. If this is not the case, then the first inequality in (4.44) may be strict for some $X$; see [2]. Moreover, the functional
$$E[-X \mid -X \ge \mathrm{V@R}_\lambda(X)]$$
does not define a convex risk measure. Hence, the second inequality in (4.44) cannot reduce to an identity in general. ◊
Remark 4.56. We have seen in Proposition 4.47 that there is no smallest convex risk measure dominating $\mathrm{V@R}_\lambda$. But if we restrict our attention to the class of convex risk measures that dominate $\mathrm{V@R}_\lambda$ and only depend on the distribution of a random variable, then the situation is different. In fact, we will see in Theorem 4.67 that $\mathrm{AV@R}_\lambda$ is the smallest risk measure in this class, provided that the underlying probability space is rich enough. In this sense, Average Value at Risk can be regarded as the best conservative approximation to Value at Risk. ◊

4.5 Law-invariant risk measures

Clearly, $\mathrm{V@R}_\lambda$ and $\mathrm{AV@R}_\lambda$ only involve the distribution of a position under the given probability measure $P$. In this section we study the class of all risk measures which share this property of law-invariance.


Definition 4.57. A monetary risk measure $\rho$ on $\mathcal{X} = L^\infty(\Omega, \mathcal{F}, P)$ is called law-invariant if $\rho(X) = \rho(Y)$ whenever $X$ and $Y$ have the same distribution under $P$.
Throughout this section, we assume that the probability space $(\Omega, \mathcal{F}, P)$ is rich enough in the sense that it supports a random variable with a continuous distribution. This condition is satisfied if and only if $(\Omega, \mathcal{F}, P)$ is atomless; see Proposition A.27.
Remark 4.58. Any law-invariant monetary risk measure $\rho$ is monotone with respect to the partial order $\succcurlyeq_{\mathrm{mon}}$ introduced in Definition 2.67. More precisely,
$$\mu \succcurlyeq_{\mathrm{mon}} \nu \implies \rho(X_\mu) \le \rho(X_\nu),$$
if $X_\mu$ and $X_\nu$ are random variables with distributions $\mu$ and $\nu$. To prove this, let $q_\mu$ and $q_\nu$ be quantile functions for $\mu$ and $\nu$ and take a random variable $U$ with a uniform distribution on $(0,1)$. Then $\tilde X_\mu := q_\mu(U) \ge q_\nu(U) =: \tilde X_\nu$ by Theorem 2.68, and $\tilde X_\mu$ and $\tilde X_\nu$ have the same distribution as $X_\mu$ and $X_\nu$ by Lemma A.19. Hence, law-invariance and monotonicity of $\rho$ imply $\rho(X_\mu) = \rho(\tilde X_\mu) \le \rho(\tilde X_\nu) = \rho(X_\nu)$. ◊
We can now formulate our first structure theorem for law-invariant convex risk measures.
Theorem 4.59. Let $\rho$ be a convex risk measure and suppose that $\rho$ is continuous from above. Then $\rho$ is law-invariant if and only if its minimal penalty function $\alpha^{\min}(Q)$ depends only on the law of $\varphi_Q := \frac{dQ}{dP}$ under $P$ when $Q \in \mathcal{M}_1(P)$. In this case, $\rho$ has the representation
$$\rho(X) = \sup_{Q \in \mathcal{M}_1(P)} \Bigl( \int_0^1 q_{-X}(t)\, q_{\varphi_Q}(t)\, dt - \alpha^{\min}(Q) \Bigr), \tag{4.46}$$
and the minimal penalty function satisfies
$$\alpha^{\min}(Q) = \sup_{X \in \mathcal{A}_\rho} \int_0^1 q_{-X}(t)\, q_{\varphi_Q}(t)\, dt = \sup_{X \in L^\infty} \Bigl( \int_0^1 q_{-X}(t)\, q_{\varphi_Q}(t)\, dt - \rho(X) \Bigr). \tag{4.47}$$

For the proof, we will need the following lemma. Here and in the sequel we will write $X \sim \tilde X$ to indicate that the two random variables $X$ and $\tilde X$ have the same law.
Lemma 4.60. For two random variables $X$ and $Y$,
$$\int_0^1 q_X(t)\, q_Y(t)\, dt = \max_{\tilde X \sim X} E[\tilde X Y],$$
provided that all occurring integrals are well-defined.
Proof. The upper Hardy-Littlewood inequality in Theorem A.24 yields "$\ge$". To prove the converse inequality, take a random variable $U$ with a uniform distribution on $(0,1)$ such that $Y = q_Y(U)$ $P$-a.s. Such a random variable $U$ exists by Lemma A.28 and our assumption that the underlying probability space is atomless. Since $\tilde X := q_X(U) \sim X$ by Lemma A.19, we obtain
$$E[\tilde X Y] = E[q_X(U)\, q_Y(U)] = \int_0^1 q_X(t)\, q_Y(t)\, dt,$$
and hence "$\le$".
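A discrete analogue of Lemma 4.60 can be verified by brute force. The sketch below assumes a space of five equally likely outcomes instead of an atomless space, so that the random variables with the law of $X$ are exactly the rearrangements of its values; the numbers are hypothetical. It compares the largest value of $E[\tilde X Y]$ over all rearrangements with the integral of the product of the two quantile functions.

```python
from fractions import Fraction as F
from itertools import permutations

# Hypothetical values of X and Y on five equally likely outcomes.
x = [F(3), F(-1), F(4), F(0), F(2)]
y = [F(1), F(5), F(-2), F(2), F(0)]
n = len(x)

# With equally likely outcomes, the random variables with the law of X are
# exactly the rearrangements of x, so the maximum can be found by brute force.
max_expectation = max(
    sum(a * b for a, b in zip(x_tilde, y)) for x_tilde in permutations(x)
) / n

# Both quantile functions are step functions with steps of length 1/n, so
# the integral of q_X * q_Y is the average product of the sorted values.
quantile_integral = sum(a * b for a, b in zip(sorted(x), sorted(y))) / n

print(max_expectation, quantile_integral)
```

The maximum is attained by pairing the values of $X$ and $Y$ in the same order, which is the discrete counterpart of choosing $\tilde X = q_X(U)$ with $Y = q_Y(U)$.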
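Proof of Theorem 4.59. Suppose first that $\rho$ is law-invariant. Then $X \in \mathcal{A}_\rho$ implies that $\tilde X \in \mathcal{A}_\rho$ for all $\tilde X \sim X$. Hence,
$$\alpha^{\min}(Q) = \sup_{X \in \mathcal{A}_\rho} E[-X \varphi_Q] = \sup_{X \in \mathcal{A}_\rho} \sup_{\tilde X \sim X} E[-\tilde X \varphi_Q] = \sup_{X \in \mathcal{A}_\rho} \int_0^1 q_{-X}(t)\, q_{\varphi_Q}(t)\, dt,$$
where we have used Lemma 4.60 in the last step. It follows that $\alpha^{\min}(Q)$ depends only on the law of $\varphi_Q$. In order to check the second identity in (4.47), note that $\tilde X := X + \rho(X)$ belongs to $\mathcal{A}_\rho$ for any $X \in L^\infty$ and that $q_{-X} - \rho(X)$ is a quantile function for $-\tilde X$.
Conversely, let us assume that $\alpha^{\min}(Q)$ depends only on the law of $\varphi_Q$. Let us write $\tilde Q \sim Q$ to indicate that $\varphi_Q$ and $\varphi_{\tilde Q}$ have the same law. Then Lemma 4.60 yields
$$\begin{aligned} \rho(X) &= \sup_{Q \in \mathcal{M}_1(P)} \bigl( E_Q[-X] - \alpha^{\min}(Q) \bigr) \\ &= \sup_{Q \in \mathcal{M}_1(P)} \sup_{\tilde Q \sim Q} \bigl( E[-X \varphi_{\tilde Q}] - \alpha^{\min}(Q) \bigr) \\ &= \sup_{Q \in \mathcal{M}_1(P)} \Bigl( \int_0^1 q_{-X}(t)\, q_{\varphi_Q}(t)\, dt - \alpha^{\min}(Q) \Bigr). \end{aligned}$$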

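Exercise 4.5.1. For $0 < \lambda < 1$, start from the definition of Average Value at Risk as
$$\mathrm{AV@R}_\lambda(X) := \sup\Bigl\{ E_Q[-X] \Bigm| Q \in \mathcal{M}_1(P) \text{ and } \frac{dQ}{dP} \le \frac1\lambda\ P\text{-a.s.} \Bigr\},$$
check that the conditions of Theorem 4.59 are satisfied, and deduce from (4.46) that the representation
$$\mathrm{AV@R}_\lambda(X) = \frac1\lambda \int_0^\lambda \mathrm{V@R}_\gamma(X)\, d\gamma$$
holds.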


Example 4.61. Let $u : \mathbb{R} \to \mathbb{R}$ be an increasing concave function, and suppose that a position $X \in L^\infty$ is acceptable if $E[u(X)] \ge c$, where $c$ is a given constant in the interior of $u(\mathbb{R})$. We have seen in Example 4.10 that the corresponding acceptance set induces a convex risk measure $\rho$. Clearly, $\rho$ is law-invariant, and it will be shown in Proposition 4.113 that $\rho$ is continuous from below and, hence, from above. Moreover, the corresponding minimal penalty function can be computed as
$$\alpha^{\min}(Q) = \inf_{\lambda > 0} \frac1\lambda \Bigl( -c + \int_0^1 \ell^*\bigl( \lambda \cdot q_{\varphi_Q}(t) \bigr)\, dt \Bigr),$$
where
$$\ell^*(y) = \sup_{x \in \mathbb{R}} \bigl( xy + u(-x) \bigr) = \sup_{x \in \mathbb{R}} \bigl( xy - \ell(x) \bigr)$$
is the Fenchel-Legendre transform of the convex increasing loss function $\ell(x) := -u(-x)$; see Theorem 4.115. ◊
The following theorem clarifies the crucial role of the risk measures $\mathrm{AV@R}_\lambda$: they can be viewed as the building blocks for law-invariant convex risk measures on $L^\infty$. Recall that we assume that $(\Omega, \mathcal{F}, P)$ is atomless.
Theorem 4.62. A convex risk measure $\rho$ is law-invariant and continuous from above if and only if
$$\rho(X) = \sup_{\mu \in \mathcal{M}_1((0,1])} \Bigl( \int_{(0,1]} \mathrm{AV@R}_\lambda(X)\, \mu(d\lambda) - \beta^{\min}(\mu) \Bigr), \tag{4.48}$$
where
$$\beta^{\min}(\mu) = \sup_{X \in \mathcal{A}_\rho} \int_{(0,1]} \mathrm{AV@R}_\lambda(X)\, \mu(d\lambda).$$

Proof. Clearly, the right-hand side of (4.48) defines a law-invariant convex risk measure that is continuous from above. Conversely, let $\rho$ be law-invariant and continuous from above. We will show that for $Q \in \mathcal{M}_1(P)$ there exists a measure $\mu \in \mathcal{M}_1((0,1])$ such that
$$\int_0^1 q_{-X}(t)\, q_\varphi(t)\, dt = \int_{(0,1]} \mathrm{AV@R}_s(X)\, \mu(ds),$$
where $\varphi := \varphi_Q = \frac{dQ}{dP}$. Then the assertion will follow from Theorem 4.59. Since $q_{-X}(t) = \mathrm{V@R}_{1-t}(X)$ and $q_\varphi(t) = q_\varphi^+(t)$ for a.e. $t \in (0,1)$,
$$\int_0^1 q_{-X}(t)\, q_\varphi(t)\, dt = \int_0^1 \mathrm{V@R}_t(X)\, q_\varphi^+(1-t)\, dt.$$
Since $q_\varphi^+$ is increasing and right-continuous, we can write $q_\varphi^+(t) = \nu((1-t, 1])$ for some positive locally finite measure $\nu$ on $(0,1]$. Moreover, the measure $\mu$ given by $\mu(dt) = t\,\nu(dt)$ is a probability measure on $(0,1]$:
$$\int_{(0,1]} t\, \nu(dt) = \int_0^1 \nu((s,1])\, ds = \int_0^1 q_\varphi^+(s)\, ds = E[\varphi] = 1.$$
Thus,
$$\begin{aligned} \int_0^1 q_{-X}(t)\, q_\varphi(t)\, dt &= \int_0^1 \mathrm{V@R}_t(X) \int_{(t,1]} \frac1s\, \mu(ds)\, dt \\ &= \int_{(0,1]} \frac1s \int_0^s \mathrm{V@R}_t(X)\, dt\, \mu(ds) \\ &= \int_{(0,1]} \mathrm{AV@R}_s(X)\, \mu(ds). \end{aligned} \tag{4.49}$$
Conversely, for any probability measure $\mu$ on $(0,1]$, the function $q$ defined by $q(t) := \int_{(1-t,1]} \frac1s\, \mu(ds)$ can be viewed as a quantile function of the density $\varphi := q(U)$ of a measure $Q \in \mathcal{M}_1(P)$, where $U$ has a uniform distribution on $(0,1)$. Altogether, we obtain a one-to-one correspondence between laws of densities $\varphi$ and probability measures $\mu$ on $(0,1]$.
Theorem 4.62 takes the following form for coherent risk measures.
Corollary 4.63. A coherent risk measure $\rho$ is continuous from above and law-invariant if and only if
$$\rho(X) = \sup_{\mu \in \mathcal{M}} \int_{(0,1]} \mathrm{AV@R}_\lambda(X)\, \mu(d\lambda)$$
for some set $\mathcal{M} \subset \mathcal{M}_1((0,1])$.
Remark 4.64. In the preceding results of this section, it was assumed that $\rho$ is law-invariant and continuous from above and that the underlying probability space is atomless. Under the additional regularity assumption that $L^2(\Omega, \mathcal{F}, P)$ is separable, it can be shown that every law-invariant convex risk measure is continuous from above; see [161]. ◊
Randomness of a position is reduced in terms of $P$ if we replace the position by its conditional expectation with respect to some $\sigma$-algebra $\mathcal{G} \subset \mathcal{F}$. Such a reduction of randomness is reflected by a convex risk measure if it is law-invariant.
Corollary 4.65. Assume that $\rho$ is a convex risk measure which is continuous from above and law-invariant. Then $\rho$ is monotone with respect to the binary relation $\succcurlyeq_{\mathrm{uni}}$ introduced in (3.27):
$$Y \succcurlyeq_{\mathrm{uni}} X \implies \rho(Y) \le \rho(X),$$
for $Y, X \in \mathcal{X}$. In particular,
$$\rho(E[X \mid \mathcal{G}]) \le \rho(X),$$
for $X \in \mathcal{X}$ and any $\sigma$-algebra $\mathcal{G} \subset \mathcal{F}$, and
$$\rho(E[X]) = \rho(0) - E[X] \le \rho(X).$$
Proof. The first inequality follows from Theorem 4.62 combined with Remark 4.49. The second inequality is a special case of the first one, since $E[X \mid \mathcal{G}] \succcurlyeq_{\mathrm{uni}} X$ according to Theorem 2.57. The third follows from the second by taking $\mathcal{G} = \{\emptyset, \Omega\}$.
Recall from Theorem 2.68 that $\mu \succcurlyeq_{\mathrm{mon}} \nu$ implies $\mu \succcurlyeq_{\mathrm{uni}} \nu$. Thus, the preceding conclusion for convex risk measures is stronger than the one of Remark 4.58 for monetary risk measures.
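The second inequality of Corollary 4.65 can be illustrated with $\rho = \mathrm{AV@R}_\lambda$. The sketch below is an assumption-laden example and not taken from the text: it uses a four-point space (which is not atomless; for Average Value at Risk the inequality nevertheless follows from (4.42) and Jensen's inequality for conditional expectations), and the partition generating $\mathcal{G}$ is hypothetical.

```python
from fractions import Fraction as F


def avar(dist, lam):
    """AV@R_lam for a finite distribution of (value, probability) pairs."""
    remaining, integral = lam, F(0)
    for x, p in sorted(dist):
        mass = min(p, remaining)        # part of the atom below level lam
        integral += x * mass
        remaining -= mass
    return -integral / lam


values = [F(-10), F(-4), F(0), F(3)]
probs = [F(1, 20), F(1, 10), F(1, 4), F(3, 5)]
partition = [[0, 1], [2, 3]]            # atoms of a hypothetical sigma-algebra G

position = list(zip(values, probs))
# E[X | G] is constant on each block of the partition.
conditional = []
for block in partition:
    mass = sum(probs[i] for i in block)
    mean = sum(values[i] * probs[i] for i in block) / mass
    conditional.append((mean, mass))

levels = [F(1, 20), F(1, 10), F(1, 5), F(1, 2), F(1)]
comparison = [(avar(conditional, lam), avar(position, lam)) for lam in levels]
for lam, (reduced, original) in zip(levels, comparison):
    print(float(lam), float(reduced), float(original))
```

At every level the risk of the conditional expectation is at most the risk of the original position, and at level 1 both reduce to $E[-X]$.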
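Remark 4.66. If $\mathcal{G}_1 \subset \mathcal{G}_2 \subset \cdots \subset \mathcal{F}$ are $\sigma$-algebras, then
$$\rho(E[X \mid \mathcal{G}_n]) \longrightarrow \rho(E[X \mid \mathcal{G}_\infty]) \quad\text{as } n \uparrow \infty,$$
where $\rho$ is as in Corollary 4.65 and $\mathcal{G}_\infty = \sigma\bigl( \bigcup_n \mathcal{G}_n \bigr)$. Indeed, Doob's martingale convergence theorem (see, e.g., Theorem 19.1 in [20]) states that $E[X \mid \mathcal{G}_n] \to E[X \mid \mathcal{G}_\infty]$ $P$-a.s. as $n \uparrow \infty$. Hence, the Fatou property and Corollary 4.65 show that
$$\rho(E[X \mid \mathcal{G}_\infty]) = \rho\Bigl( \lim_{n \uparrow \infty} E[X \mid \mathcal{G}_n] \Bigr) \le \liminf_{n \uparrow \infty} \rho(E[X \mid \mathcal{G}_n]) \le \rho(E[X \mid \mathcal{G}_\infty]).$$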

In contrast to Proposition 4.47, the following theorem shows that $\mathrm{AV@R}_\lambda$ is the best conservative approximation to $\mathrm{V@R}_\lambda$ in the class of all law-invariant convex risk measures which are continuous from above.
Theorem 4.67. $\mathrm{AV@R}_\lambda$ is the smallest law-invariant convex measure of risk which is continuous from above and dominates $\mathrm{V@R}_\lambda$.
Proof. That $\mathrm{AV@R}_\lambda$ dominates $\mathrm{V@R}_\lambda$ was already stated in (4.44). Suppose now that $\rho$ is another law-invariant convex risk measure which dominates $\mathrm{V@R}_\lambda$ and which is continuous from above. We must show that for a given $X \in \mathcal{X}$
$$\rho(X) \ge \mathrm{AV@R}_\lambda(X). \tag{4.50}$$
By cash invariance, we may assume without loss of generality that $X > 0$. Take $\varepsilon > 0$, and let $A := \{ -X \ge \mathrm{V@R}_\lambda(X) - \varepsilon \}$ and
$$Y := E[X \mid X I_{A^c}] = X \cdot I_{A^c} + E[X \mid A] \cdot I_A.$$
Since $Y > q_X^+(\lambda) + \varepsilon \ge E[X \mid A]$ on $A^c$, we get $P[Y < E[X \mid A]] = 0$. On the other hand, $P[Y \le E[X \mid A]] \ge P[A] > \lambda$, and this implies that $\mathrm{V@R}_\lambda(Y) = -E[X \mid A]$. Since $\rho$ dominates $\mathrm{V@R}_\lambda$, we have $\rho(Y) \ge -E[X \mid A]$. Thus,
$$\rho(X) \ge \rho(Y) \ge -E[X \mid A] = E[-X \mid -X \ge \mathrm{V@R}_\lambda(X) - \varepsilon],$$
where the first inequality holds by Corollary 4.65. Taking $\varepsilon \downarrow 0$ yields
$$\rho(X) \ge E[-X \mid -X \ge \mathrm{V@R}_\lambda(X)].$$
If the distribution of $X$ is continuous, Corollary 4.54 states that the conditional expectation on the right equals $\mathrm{AV@R}_\lambda(X)$, and we obtain (4.50). When the distribution of $X$ is not continuous, we denote by $D$ the set of all points $x$ such that $P[X = x] > 0$ and take any bounded random variable $Z \ge 0$ with a continuous distribution. Such a random variable exists due to our assumption that $(\Omega, \mathcal{F}, P)$ is atomless. Note that $X_n := X + \frac1n Z I_{\{X \in D\}}$ has a continuous distribution. Indeed, for any $x$,
$$P[X_n = x] = P[X = x,\ X \notin D] + \sum_{y \in D} P[X = y,\ Z = n(x - y)] = 0.$$
Moreover, $X_n$ decreases to $X$. The inequality (4.50) holds for each $X_n$ and extends to $X$ by continuity from above.
Corollary 4.68. $\mathrm{AV@R}_\lambda$ and $\mathrm{WCE}_\lambda$ coincide under our assumption that the probability space is atomless.
Proof. We know from Corollary 4.54 that $\mathrm{WCE}_\lambda(X) = \mathrm{AV@R}_\lambda(X)$ if $X$ has a continuous distribution. Repeating the approximation argument at the end of the preceding proof yields $\mathrm{WCE}_\lambda(X) = \mathrm{AV@R}_\lambda(X)$ for each $X \in \mathcal{X}$.

4.6 Concave distortions

Let us now have a closer look at the coherent risk measures
$$\rho_\mu(X) := \int \mathrm{AV@R}_\lambda(X)\, \mu(d\lambda), \tag{4.51}$$
which appear in the Representation Theorem 4.62 for law-invariant convex risk measures. We are going to characterize these risk measures $\rho_\mu$ in two ways, first as Choquet integrals with respect to some concave distortion of the underlying probability measure $P$, and then, in the next section, by a property of comonotonicity.


Again, we will assume throughout this section that the underlying probability space $(\Omega, \mathcal{F}, P)$ is atomless. Since $\mathrm{AV@R}_\lambda$ is coherent, continuous from below, and law-invariant, any mixture $\rho_\mu$ for some probability measure $\mu$ on $(0,1]$ has the same properties. According to Remark 4.50, we may set $\mathrm{AV@R}_0(X) = -\operatorname{ess\,inf} X$ so that we can extend the definition (4.51) to probability measures $\mu$ on the closed interval $[0,1]$. However, $\rho_\mu$ will only be continuous from above and not from below if $\mu(\{0\}) > 0$, because $\mathrm{AV@R}_0$ is not continuous from below.
Our first goal is to show that $\rho_\mu(X)$ can be identified with the Choquet integral of the loss $-X$ with respect to the set function $c_\psi(A) := \psi(P[A])$, where $\psi$ is the concave function defined in the following lemma. Choquet integrals were introduced in Example 4.14, and the risk measure MINVAR of Exercise 4.1.7 provides a first example for a risk measure arising as the Choquet integral of a set function $c_\psi$. Recall that every concave function $\psi$ admits a right-continuous right-hand derivative $\psi'_+$; see Proposition A.4.
Lemma 4.69. The identity
$$\psi'_+(t) = \int_{(t,1]} s^{-1}\, \mu(ds), \qquad 0 < t < 1, \tag{4.52}$$
defines a one-to-one correspondence between probability measures $\mu$ on $[0,1]$ and increasing concave functions $\psi : [0,1] \to [0,1]$ with $\psi(0) = 0$ and $\psi(1) = 1$. Moreover, we have $\psi(0+) = \mu(\{0\})$.
Proof. Suppose first that $\mu$ is given and $\psi$ is defined by $\psi(1) = 1$ and (4.52). Then $\psi$ is concave and increasing on $(0,1]$. Moreover,
$$1 - \psi(0+) = \int_0^1 \psi'(t)\, dt = \int_{(0,1]} \frac1s \int_0^1 I_{\{t < s \le 1\}}\, dt\, \mu(ds) = \mu((0,1]) \le 1.$$
Hence, we may set $\psi(0) := 0$ and obtain an increasing concave function on $[0,1]$.
Conversely, if $\psi$ is given, then $\psi'_+(t)$ is a decreasing right-continuous function on $(0,1)$ and can be written as $\psi'_+(t) = \nu((t,1])$ for some locally finite positive measure $\nu$ on $(0,1]$. We first define $\mu$ on $(0,1]$ by $\mu(dt) = t\,\nu(dt)$. Then (4.52) holds and, by Fubini's theorem,
$$\mu((0,1]) = \int_0^1 \int_{(0,1]} I_{\{t < s\}}\, \nu(ds)\, dt = 1 - \psi(0+) \le 1.$$
Hence, setting $\mu(\{0\}) := \psi(0+)$ defines a probability measure $\mu$ on $[0,1]$.


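Theorem 4.70. For a probability measure $\mu$ on $[0,1]$, let $\psi$ be the concave function defined in Lemma 4.69. Then, for $X \in \mathcal{X}$,
$$\begin{aligned} \rho_\mu(X) &= \psi(0+)\,\mathrm{AV@R}_0(X) + \int_0^1 q_{-X}(t)\, \psi'(1-t)\, dt \\ &= \int_{-\infty}^0 \bigl( \psi(P[-X > x]) - 1 \bigr)\, dx + \int_0^\infty \psi(P[-X > x])\, dx. \end{aligned} \tag{4.53}$$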

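Proof. Using the fact that $\mathrm{V@R}_\lambda(X) = q_{-X}^-(1-\lambda)$, we get as in (4.49) that
$$\int_{(0,1]} \mathrm{AV@R}_\lambda(X)\, \mu(d\lambda) = \int_0^1 q_{-X}(t)\, \psi'(1-t)\, dt.$$
Hence, we obtain the first identity. For the second one, we will first assume $X \le 0$. Then
$$q_{-X}^+(t) = \sup\{ x \ge 0 \mid F_{-X}(x) \le t \} = \int_0^\infty I_{\{F_{-X}(x) \le t\}}\, dx,$$
where $F_{-X}$ is the distribution function of $-X$. Using Fubini's theorem, we obtain
$$\begin{aligned} \int_0^1 q_{-X}(t)\, \psi'(1-t)\, dt &= \int_0^\infty \int_0^1 I_{\{F_{-X}(x) \le 1-t\}}\, \psi'(t)\, dt\, dx \\ &= \int_0^\infty \psi(1 - F_{-X}(x))\, dx - \psi(0+) \operatorname{ess\,sup}(-X), \end{aligned}$$
since $\int_0^y \psi'(t)\, dt = (\psi(y) - \psi(0+)) I_{\{y > 0\}}$. This proves the second identity for $X \le 0$, since $\psi(0+) = \mu(\{0\})$ and $\operatorname{ess\,sup}(-X) = \mathrm{AV@R}_0(X)$. If $X \in L^\infty$ is arbitrary, we consider $X - C$, where $C := \operatorname{ess\,sup} X$, so that $X - C \le 0$. The cash invariance of $\rho_\mu$ yields
$$\begin{aligned} C + \rho_\mu(X) &= \int_0^\infty \psi(P[-X > x - C])\, dx \\ &= \int_{-C}^0 \psi(P[-X > x])\, dx + \int_0^\infty \psi(P[-X > x])\, dx \\ &= C + \int_{-\infty}^0 \bigl( \psi(P[-X > x]) - 1 \bigr)\, dx + \int_0^\infty \psi(P[-X > x])\, dx. \end{aligned}$$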

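Example 4.71. Clearly, the risk measure $\mathrm{AV@R}_\lambda$ is itself of the form $\rho_\mu$ where $\mu = \delta_\lambda$. For $\lambda > 0$, the corresponding concave distortion function is given by
$$\psi_\lambda(t) = \Bigl( \frac{t}{\lambda} \Bigr) \wedge 1 = \frac1\lambda (t \wedge \lambda).$$
Thus, we obtain yet another representation of $\mathrm{AV@R}_\lambda$:
$$\mathrm{AV@R}_\lambda(X) = \frac1\lambda \int_0^\infty P[-X > x] \wedge \lambda\; dx \quad\text{for } -X \in L^\infty_+.$$ ◊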


Exercise 4.6.1. Find the probability measure $\mu$ on $[0,1]$ such that the coherent risk measure MINVAR introduced in Exercise 4.1.7 is of the form (4.51). ◊
Exercise 4.6.2. Suppose that there exists a set $A \in \mathcal{F}$ with $0 < P[A] < 1$ such that $\rho_\mu(-I_A) = -\rho_\mu(I_A)$. Use the representation (4.53) to deduce that $\mu = \delta_1$, i.e., $\rho_\mu(X) = E[-X]$ for all $X \in L^\infty$. More generally, show that $\mu = \delta_1$ if there exists a nonconstant $X \in L^\infty$ such that $\rho_\mu(-X) = -\rho_\mu(X)$. ◊
Corollary 4.72. If $\mu(\{0\}) = 0$ in Theorem 4.70, then
$$\rho_\mu(X) = -\int_0^1 q_X(\varphi(t))\, dt,$$
where $\varphi$ is an inverse function of $\psi$, taken in the sense of Definition A.14.
Proof. Due to Lemma A.15, the distribution of $\varphi$ under the Lebesgue measure has the distribution function $\psi$ and hence the density $\psi'$. Therefore
$$\int_0^1 q_X(\varphi(t))\, dt = \int_0^1 q_X(t)\, \psi'(t)\, dt = -\int_0^1 q_{-X}(1-t)\, \psi'(t)\, dt,$$
where we have used Lemma A.23 in the last step. An application of Theorem 4.70 concludes the proof.
Let us continue with a brief discussion of the set function $c_\psi(A) = \psi(P[A])$.
Definition 4.73. Let $\psi : [0,1] \to [0,1]$ be an increasing function such that $\psi(0) = 0$ and $\psi(1) = 1$. The set function
$$c_\psi(A) := \psi(P[A]), \qquad A \in \mathcal{F},$$
is called the distortion of the probability measure $P$ with respect to the distortion function $\psi$.
Definition 4.74. A set function $c : \mathcal{F} \to [0,1]$ is called monotone if
$$c(A) \le c(B) \quad\text{for } A \subset B$$
and normalized if
$$c(\emptyset) = 0 \quad\text{and}\quad c(\Omega) = 1.$$
A monotone set function is called submodular or 2-alternating if
$$c(A \cup B) + c(A \cap B) \le c(A) + c(B).$$
Clearly, any distortion $c_\psi$ is normalized and monotone.


Proposition 4.75. Let $c_\psi$ be the distortion of $P$ with respect to the distortion function $\psi$. If $\psi$ is concave, then $c_\psi$ is submodular. Moreover, if the underlying probability space is atomless, then also the converse implication holds.
Proof. Suppose first that $\psi$ is concave. Take $A, B \in \mathcal{F}$ with $P[A] \le P[B]$. We must show that $c := c_\psi$ satisfies
$$c(A) - c(A \cap B) \ge c(A \cup B) - c(B).$$
This is trivial if $r = 0$, where
$$r := P[A] - P[A \cap B] = P[A \cup B] - P[B].$$
For $r > 0$ the concavity of $\psi$ yields via (A.1) that
$$\frac{c(A) - c(A \cap B)}{P[A] - P[A \cap B]} \ge \frac{c(A \cup B) - c(B)}{P[A \cup B] - P[B]}.$$
Multiplying both sides with $r$ gives the result.
Now suppose that $c = c_\psi$ is submodular and assume that $(\Omega, \mathcal{F}, P)$ is atomless. By Exercise A.1.1 below it is sufficient to show that $\psi(y) \ge (\psi(x) + \psi(z))/2$ whenever $0 \le x \le z \le 1$ and $y = (x + z)/2$. To this end, we will construct two sets $A, B \in \mathcal{F}$ such that $P[A] = P[B] = y$, $P[A \cap B] = x$, and $P[A \cup B] = z$. Submodularity then gives $\psi(x) + \psi(z) \le 2\psi(y)$ and in turn the concavity of $\psi$.
In order to construct the two sets $A$ and $B$, take a random variable $U$ with a uniform distribution on $[0,1]$, which exists by Proposition A.27. Then
$$A := \{ 0 \le U \le y \} \quad\text{and}\quad B := \{ z - y \le U \le z \}$$
are as desired.
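The first implication of Proposition 4.75 can be checked exhaustively on a finite probability space. The following sketch is a hypothetical illustration: the four-point space and the two distortion functions $t \mapsto 1 - (1-t)^2$ (concave) and $t \mapsto t^2$ (not concave) are choices made for this example, and the helper `is_submodular` is not part of the text. Events are encoded as bit masks so that unions and intersections become bitwise operations.

```python
from fractions import Fraction as F


def is_submodular(psi, probs):
    """Check c(A u B) + c(A n B) <= c(A) + c(B) for all events A, B,
    where c(A) = psi(P[A]) and events are encoded as bit masks."""
    n = len(probs)

    def c(mask):
        return psi(sum(p for i, p in enumerate(probs) if (mask >> i) & 1))

    return all(
        c(a | b) + c(a & b) <= c(a) + c(b)
        for a in range(1 << n)
        for b in range(1 << n)
    )


probs = [F(1, 10), F(2, 10), F(3, 10), F(4, 10)]   # hypothetical 4-point space


def concave(t):
    return 1 - (1 - t) ** 2     # concave distortion function


def convex(t):
    return t ** 2               # distortion function that is not concave


print(is_submodular(concave, probs), is_submodular(convex, probs))
```

The concave distortion passes the test, as the proposition guarantees. The non-concave one fails already for two disjoint atoms of this particular space; in general the converse implication is only claimed for atomless spaces.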
Let us now recall the notion of a Choquet integral, which was introduced in Example 4.14.
Definition 4.76. Let $c : \mathcal{F} \to [0,1]$ be any set function which is normalized and monotone. The Choquet integral of a bounded measurable function $X$ on $(\Omega, \mathcal{F})$ with respect to $c$ is defined as
$$\int X\, dc := \int_{-\infty}^0 \bigl( c(X > x) - 1 \bigr)\, dx + \int_0^\infty c(X > x)\, dx.$$
Note that the Choquet integral coincides with the usual integral as soon as $c$ is a $\sigma$-additive probability measure; see also Lemma 4.97 below. With this definition, Theorem 4.70 allows us to identify the risk measure $\rho_\mu$ as the Choquet integral of the loss with respect to a concave distortion $c_\psi$ of the underlying probability measure $P$.


Corollary 4.77. For a probability measure $\mu$ on $[0,1]$, let $\psi$ be the concave distortion function defined in Lemma 4.69, and let $c_\psi$ denote the distortion of $P$ with respect to $\psi$. Then, for $X \in L^\infty$,
$$\rho_\mu(X) = \int (-X)\, dc_\psi.$$
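The identity of Corollary 4.77 can be tested on a discrete example. The sketch below is illustrative and rests on several assumptions: the position and the mixing measure $\mu$ (with an atom at 0) are hypothetical, the underlying space is finite rather than atomless, and the closed form $\psi(t) = \mu(\{0\}) + \int_{(0,1]} (t/s \wedge 1)\, \mu(ds)$ for $t > 0$ is obtained here by integrating (4.52) with $\psi(1) = 1$. The code computes the mixture of Average Values at Risk and the Choquet integral of the loss separately.

```python
from fractions import Fraction as F

# Hypothetical position X as (value, probability) pairs.
position = [(F(-10), F(1, 20)), (F(-4), F(1, 10)), (F(0), F(1, 4)), (F(3), F(3, 5))]
# Hypothetical mixing measure mu on [0, 1], given as {level: weight}.
mu = {F(0): F(1, 10), F(1, 10): F(1, 2), F(1): F(2, 5)}


def avar(dist, lam):
    """AV@R_lam(X); for lam = 0 this is the worst case -ess inf X."""
    if lam == 0:
        return -min(x for x, p in dist if p > 0)
    remaining, integral = lam, F(0)
    for x, p in sorted(dist):
        mass = min(p, remaining)
        integral += x * mass
        remaining -= mass
    return -integral / lam


def psi(t):
    """Distortion function obtained by integrating (4.52) with psi(1) = 1:
    psi(t) = mu({0}) + sum over s > 0 of min(t / s, 1) * mu({s}) for t > 0."""
    if t == 0:
        return F(0)
    return sum(w if s == 0 else min(t / s, 1) * w for s, w in mu.items())


def choquet_of_loss(dist):
    """Choquet integral of -X with respect to c(A) = psi(P[A])."""
    losses = [(-x, p) for x, p in dist]
    points = sorted({l for l, _ in losses} | {F(0)})
    total = F(0)
    for a, b in zip(points, points[1:]):
        # P[-X > x] is constant for x in [a, b) because no atom lies in (a, b).
        c = psi(sum(p for l, p in losses if l > a))
        total += (b - a) * (c - 1 if b <= 0 else c)
    return total


rho_mixture = sum(w * avar(position, lam) for lam, w in mu.items())
rho_choquet = choquet_of_loss(position)
print(rho_mixture, rho_choquet)
```

Both computations give the same number for this example. Outside the range of the loss the integrands in Definition 4.76 vanish, which is why the sum only runs over the intervals between consecutive atoms and 0.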

Combining Corollary 4.77 with Theorem 4.62, we obtain the following characterization of law-invariant convex risk measures in terms of concave distortions:
Corollary 4.78. A convex risk measure $\rho$ is law-invariant and continuous from above if and only if
$$\rho(X) = \sup_\psi \Bigl( \int (-X)\, dc_\psi - \gamma^{\min}(\psi) \Bigr),$$
where the supremum is taken over the class of all concave distortion functions $\psi$ and
$$\gamma^{\min}(\psi) := \sup_{X \in \mathcal{A}_\rho} \int (-X)\, dc_\psi.$$

The following series of exercises should be compared with Exercises 4.1.7 and 4.6.1.
Exercise 4.6.3. For $\lambda \ge 1$ consider the concave distortion function $\psi_\lambda(x) := x^{1/\lambda}$. Show that for $\lambda \in \mathbb{N}$ the corresponding risk measure
$$\mathrm{MAXVAR}_\lambda(X) := \int (-X)\, dc_{\psi_\lambda}$$
has the property that $\mathrm{MAXVAR}_\lambda(X) = -E[Y_1]$, if $Y_1, \dots, Y_\lambda$ are independent and identically distributed (i.i.d.) random variables for which $\max(Y_1, \dots, Y_\lambda) \sim X$. ◊
Exercise 4.6.4. For $\lambda \ge 1$ consider the distortion function $\tilde\psi_\lambda(x) := (1 - (1-x)^\lambda)^{1/\lambda}$. Show that for $\lambda \in \mathbb{N}$ the corresponding risk measure
$$\mathrm{MAXMINVAR}_\lambda(X) := \int (-X)\, dc_{\tilde\psi_\lambda}$$
has the property that $\mathrm{MAXMINVAR}_\lambda(X) = -E[Y_1]$, if $Y_1, \dots, Y_\lambda$ are i.i.d. random variables for which $\max(Y_1, \dots, Y_\lambda) \sim \min(X_1, \dots, X_\lambda)$, when $X_1, \dots, X_\lambda$ are independent copies of $X$. ◊
Exercise 4.6.5. For $\lambda \ge 1$ consider the distortion function $\hat\psi_\lambda(x) := 1 - (1 - x^{1/\lambda})^\lambda$. Show that for $\lambda \in \mathbb{N}$ the corresponding risk measure
$$\mathrm{MINMAXVAR}_\lambda(X) := \int (-X)\, dc_{\hat\psi_\lambda}$$
has the property that $\mathrm{MINMAXVAR}_\lambda(X) = -E[\min(Y_1, \dots, Y_\lambda)]$, if $Y_1, \dots, Y_\lambda$ are i.i.d. random variables for which $\max(Y_1, \dots, Y_\lambda) \sim X$. ◊


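As another consequence of Theorem 4.70, we obtain an explicit description of the maximal representing set $\mathcal{Q}_\mu \subset \mathcal{M}_1(P)$ for the coherent risk measure $\rho_\mu$.
Theorem 4.79. Let $\mu$ be a probability measure on $[0,1]$, and let $\psi$ be the corresponding concave function defined in Lemma 4.69. Then $\rho_\mu$ can be represented as
$$\rho_\mu(X) = \sup_{Q \in \mathcal{Q}_\mu} E_Q[\,-X\,],$$
where the set $\mathcal{Q}_\mu$ is given by
$$\mathcal{Q}_\mu := \Big\{ Q \in \mathcal{M}_1(P) \;\Big|\; Z := \frac{dQ}{dP} \text{ satisfies } \int_t^1 q_Z(s)\, ds \le \psi(1 - t) \text{ for } t \in (0,1) \Big\}.$$
Moreover, $\mathcal{Q}_\mu$ is the maximal subset of $\mathcal{M}_1(P)$ that represents $\rho_\mu$.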


Proof. The risk measure $\rho_\mu$ is coherent and continuous from above. By Corollary 4.37, it can be represented by taking the supremum of expectations over the set $\mathcal{Q}_{\max} = \{ Q \in \mathcal{M}_1(P) \mid \alpha^{\min}(Q) = 0 \}$. Using (4.47) and Theorem 4.70, we see that a measure $Q \in \mathcal{M}_1(P)$ with density $Z = dQ/dP$ belongs to $\mathcal{Q}_{\max}$ if and only if
$$\int_0^1 q_X(s) q_Z(s)\, ds \le \rho_\mu(-X) = \psi(0+)\, \mathrm{AV@R}_0(-X) + \int_0^1 q_X(s)\, \psi'(1 - s)\, ds \tag{4.54}$$
for all $X \in L^\infty$. For random variables $X$ with $P[X = 0] = t$ and $P[X = 1] = 1 - t$, we have $q_X = I_{[t,1]}$ a.e., and so we obtain
$$\int_t^1 q_Z(s)\, ds \le \psi(0+) + \int_t^1 \psi'(1 - s)\, ds = \psi(1 - t)$$
for all $t \in (0,1)$. Hence $\mathcal{Q}_{\max} \subset \mathcal{Q}_\mu$. For the proof of the converse inclusion, we show that the density $Z$ of a fixed measure $Q \in \mathcal{Q}_\mu$ satisfies (4.54) for any given $X \in L^\infty$. We may assume without loss of generality that $X \ge 0$. Let $\nu$ be the positive finite measure on $[0,1]$ such that $q_X^+(s) = \nu([0,s])$. Using Fubini's theorem and the definition of $\mathcal{Q}_\mu$, we get
$$\int_0^1 q_X(s) q_Z(s)\, ds = \int_{[0,1]} \int_t^1 q_Z(s)\, ds\, \nu(dt) \le \int_{[0,1]} \psi(1 - t)\, \nu(dt) = \psi(0+)\, \nu([0,1]) + \int_0^1 \psi'(1 - s) \int_{[0,s]} \nu(dt)\, ds,$$
which coincides with the right-hand side of (4.54).


Corollary 4.80. In the context of Theorem 4.79, the following conditions are equivalent:
(a) $\rho_\mu$ is continuous from below.
(b) $\mu(\{0\}) = 0$.
(c) $\rho_\mu(X) = \max_{Q \in \mathcal{Q}_\mu} E_Q[\,-X\,]$ for all $X \in L^\infty$.
If these equivalent conditions are satisfied, then the maximum in (c) is attained by the measure $Q_X \in \mathcal{Q}_\mu$ with density $dQ_X/dP = f(X)$, where $f$ is the decreasing function defined by
$$f(x) := \psi'(F_X(x))$$
if $x$ is a continuity point of $F_X$, and by
$$f(x) := \frac{1}{F_X(x) - F_X(x-)} \int_{F_X(x-)}^{F_X(x)} \psi'(t)\, dt$$
otherwise. Moreover, with $\lambda$ denoting the Lebesgue measure on $(0,1)$,
$$\mathcal{Q}_\mu = \Big\{ Q \ll P \;\Big|\; P \circ \Big( \frac{dQ}{dP} \Big)^{-1} \preccurlyeq_{\mathrm{uni}} \lambda \circ (\psi')^{-1} \Big\}. \tag{4.55}$$

Proof. The equivalence of conditions (a) and (c) has already been proved in Corollary 4.38. If (b) holds, then $\rho_\mu$ is continuous from below, due to Theorem 4.52 and monotone convergence. Let us now show that condition (a) is not satisfied if $\delta := \mu(\{0\}) > 0$. In this case, we can write
$$\rho_\mu = \delta\, \mathrm{AV@R}_0 + (1 - \delta)\, \rho_{\mu'},$$
where $\mu' := \mu(\,\cdot \mid (0,1])$. Then $\rho_{\mu'}$ is continuous from below since $\mu'(\{0\}) = 0$, but $\mathrm{AV@R}_0$ is not, and so $\rho_\mu$ does not satisfy (a); see Remark 4.50.
Let us now prove the remaining assertions. Since $\psi(0+) = \mu(\{0\}) = 0$, a measure $Q$ with density $Z = dQ/dP$ belongs to $\mathcal{Q}_\mu$ if and only if $\int_t^1 q_Z(s)\, ds \le \int_t^1 \psi'(1 - s)\, ds$ for all $t$. Since $\psi'(1 - t)$ is a quantile function for the law of $\psi'$ under $\lambda$, part (e) of Theorem 2.57 implies (4.55). The problem of identifying the maximizing measure $Q_X$ is hence equivalent to minimizing $E[\,ZX\,]$ under the constraint that $Z$ is a density function such that $P \circ Z^{-1} \preccurlyeq_{\mathrm{uni}} \lambda \circ (\psi')^{-1}$. Let us first assume that $X \ge 0$. Then it follows from Theorem 3.44 that $f(X)$ minimizes $E[\,YX\,]$ among all $Y \in L^1_+$ such that $P \circ Y^{-1} \preccurlyeq_{\mathrm{uni}} \lambda \circ (\psi')^{-1}$. Moreover, Remark 3.46 shows that $E[\,f(X)\,] = \int_0^1 \psi'(t)\, dt = 1$, and so $Z_X := f(X) \ge 0$ is the density of an optimal probability measure $Q_X \in \mathcal{Q}_\mu$. If $X$ is not positive, then we may take a constant $c$ such that $X + c \ge 0$ and apply the preceding argument. The formula for $f$ then follows from the fact that $F_{X+c}(X + c) = F_X(X)$.
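For a discrete position every value of $X$ is a jump point of $F_X$, so the density $f$ in Corollary 4.80 is given by the averaged formula, i.e. by an increment of $\psi$ divided by the size of the jump. The following sketch computes this density and the expected loss under $Q_X$; the distortion $\psi(t) = \sqrt{t}$, the values, and the probabilities are illustrative assumptions, and the values of $X$ are assumed to be distinct.

```python
import math

def psi(t):
    # Concave distortion with psi(0+) = 0 (hypothetical choice).
    return math.sqrt(t)

def maximizing_density(xs, ps):
    """Density f(x_i) of Q_X for a discrete X with P[X = xs[i]] = ps[i].

    Each atom is a jump of F_X, so f(x_i) is the average of psi' over
    (F_X(x_i-), F_X(x_i)], which equals an increment of psi over the jump size.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    dens = [0.0] * len(xs)
    F_left = 0.0
    for i in order:
        F_right = F_left + ps[i]
        dens[i] = (psi(F_right) - psi(F_left)) / ps[i]
        F_left = F_right
    return dens

xs = [-2.0, 1.0, 3.0]          # hypothetical values of X
ps = [0.25, 0.25, 0.5]         # hypothetical probabilities under P

f = maximizing_density(xs, ps)
mass = sum(p * d for p, d in zip(ps, f))                     # total mass of Q_X
worst_case = sum(p * d * (-x) for p, d, x in zip(ps, f, xs)) # E_{Q_X}[-X]
```

In this example `f` is decreasing in the value of `X`, its total mass is one, and `worst_case` is the expected loss under the resulting measure.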


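Exercise 4.6.6. For a concave distortion function $\psi$ on $[0,1]$ define its convex conjugate function as
$$\varphi(x) := \sup_{y \in [0,1]} \big( \psi(y) - xy \big), \qquad x \ge 0.$$
Show that the set $\mathcal{Q}_\mu$ in Theorem 4.79 can be represented as
$$\mathcal{Q}_\mu = \Big\{ Q \in \mathcal{M}_1(P) \;\Big|\; Z := \frac{dQ}{dP} \text{ satisfies } E[\,(Z - x)^+\,] \le \varphi(x) \text{ for } x \ge 0 \Big\}. \qquad ♦$$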
Remark 4.81. As long as we are interested in a law-invariant risk assessment, we can represent a financial position $X \in L^\infty$ by its distribution function $F_X$ or, equivalently, by the function
$$G_X(t) := 1 - F_X(t) = P[\,X > t\,].$$
If we only consider positions $X$ with values in $[0,1]$ then their proxies $G_X$ vary in the class of right-continuous decreasing functions $G$ on $[0,1]$ such that $G(1) = 0$ and $G(0) \le 1$. Due to Theorem 4.70, a law-invariant coherent risk measure $\rho_\mu$ induces a functional $U$ on the class of proxies via
$$U(G_X) := \rho_\mu(-X) = \int_0^1 \psi(G_X(t))\, dt.$$
Since $\psi$ is increasing and concave, the functional $U$ has the form of a von Neumann-Morgenstern utility functional on the probability space given by Lebesgue measure on the unit interval $[0,1]$. As such, it can be characterized by the axioms in Section 2.3, and this is the approach taken in Yaari's "dual theory of choice" [265]. More generally, we can introduce a utility function $u$ on $[0,1]$ with $u(0) = 0$ and consider the functional
$$U(G_X) := \int_0^1 \psi(G_X(t))\, du(t)$$
introduced by Quiggin [216]. For $u(x) = x$ this reduces to the "dual theory", for $\psi(x) = x$ we recover the classical utility functionals
$$U(G_X) = \int_0^1 G_X(t)\, du(t) = -\int_0^1 u(t)\, dG_X(t) = E[\,u(X)\,]$$
discussed in Section 2.3. ♦

4.7 Comonotonic risk measures

In many situations, the risk of a combined position X C Y will be strictly lower than
the sum of the individual risks, because one position serves as a hedge against adverse
changes in the other position. If, on the other hand, there is no way for X to work
as a hedge for Y then we may want the risk simply to add up. In order to make
this idea precise, we introduce the notion of comonotonicity. Our main goal in this
section is to characterize the class of all convex risk measures that share this property
of comonotonicity.
As in the first two sections of this chapter, we will denote by $\mathcal{X}$ the linear space of all bounded measurable functions on the measurable space $(\Omega, \mathcal{F})$.
Definition 4.82. Two measurable functions $X$ and $Y$ on $(\Omega, \mathcal{F})$ are called comonotone if
$$\big( X(\omega) - X(\omega') \big) \big( Y(\omega) - Y(\omega') \big) \ge 0 \quad \text{for all } (\omega, \omega') \in \Omega \times \Omega. \tag{4.56}$$
A monetary risk measure $\rho$ on $\mathcal{X}$ is called comonotonic if
$$\rho(X + Y) = \rho(X) + \rho(Y)$$
whenever $X, Y \in \mathcal{X}$ are comonotone.
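On a finite sample space, condition (4.56) can be checked directly by running through all pairs of outcomes. The sketch below does this for payoffs stored as lists indexed by the outcomes; the stock prices and the strike of 100 are hypothetical values chosen only for illustration.

```python
from itertools import combinations

def is_comonotone(x, y):
    """Check condition (4.56) for two functions on a finite sample space.

    x, y: lists of equal length, x[i] and y[i] being the values at outcome i.
    The condition is symmetric in the two outcomes, so unordered pairs suffice.
    """
    return all((x[i] - x[j]) * (y[i] - y[j]) >= 0
               for i, j in combinations(range(len(x)), 2))

# Hypothetical payoffs on three outcomes.
stock = [80.0, 100.0, 120.0]
call = [max(s - 100.0, 0.0) for s in stock]   # increasing in the stock price
put = [max(100.0 - s, 0.0) for s in stock]    # decreasing in the stock price
```

Here `is_comonotone(stock, call)` holds, since both payoffs increase with the stock price, whereas the pair `(stock, put)` is not comonotone: the put acts as a hedge for the stock.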
Lemma 4.83. If $\rho$ is a comonotonic monetary risk measure on $\mathcal{X}$, then $\rho$ is positively homogeneous.
Proof. Note that $(X, X)$ is a comonotone pair. Hence $\rho(2X) = 2\rho(X)$. An iteration of this argument yields $\rho(rX) = r\rho(X)$ for all rational numbers $r \ge 0$. Positive homogeneity now follows from the Lipschitz continuity of $\rho$; see Lemma 4.3.
We will see below that every comonotonic monetary risk measure on $\mathcal{X}$ arises as the Choquet integral with respect to a certain set function on $(\Omega, \mathcal{F})$. In the sequel, $c : \mathcal{F} \to [0,1]$ will always denote a set function that is normalized and monotone; see Definition 4.74. Unless otherwise mentioned, we will not assume that $c$ enjoys any additivity properties. Recall from Definition 4.76 that the Choquet integral of $X \in \mathcal{X}$ with respect to $c$ is defined as
$$\int X\, dc = \int_{-\infty}^0 \big( c(X > x) - 1 \big)\, dx + \int_0^\infty c(X > x)\, dx.$$
The proof of the following proposition was already given in Example 4.14.
Proposition 4.84. The Choquet integral of the loss,
$$\rho(X) := \int (-X)\, dc,$$
is a monetary risk measure on $\mathcal{X}$ which is positively homogeneous.


Definition 4.85. Let $X$ be a measurable function on $(\Omega, \mathcal{F})$. An inverse function $r_X : (0,1) \to \mathbb{R}$ of the increasing function $G_X(x) := 1 - c(X > x)$, taken in the sense of Definition A.14, is called a quantile function for $X$ with respect to $c$.
If $c$ is a probability measure, then $G_X(x) = c(X \le x)$. Hence, the preceding definition extends the notion of a quantile function given in Definition A.20. The following proposition yields an alternative representation of the Choquet integral in terms of quantile functions with respect to $c$.
Proposition 4.86. Let $r_X$ be a quantile function with respect to $c$ for $X \in \mathcal{X}$. Then
$$\int X\, dc = \int_0^1 r_X(t)\, dt.$$

Proof. We have $\int (X + m)\, dc = \int X\, dc + m$, and one easily checks that $r_{X+m} = r_X + m$ a.e. for all $m \in \mathbb{R}$ and each quantile function $r_{X+m}$ of $X + m$. Thus, we may assume without loss of generality that $X \ge 0$. In this case, Remark A.16 and Lemma A.15 imply that the largest quantile function $r_X^+$ is given by
$$r_X^+(t) = \sup\{ x \ge 0 \mid G_X(x) \le t \} = \int_0^\infty I_{\{G_X(x) \le t\}}\, dx.$$
Since $r_X = r_X^+$ a.e. on $(0,1)$, Fubini's theorem implies
$$\int_0^1 r_X(t)\, dt = \int_0^1 \int_0^\infty I_{\{G_X(x) \le t\}}\, dx\, dt = \int_0^\infty \big( 1 - G_X(x) \big)\, dx = \int X\, dc.$$

The preceding proposition yields the following generalization of Corollary 4.72 when applied to a continuous distortion of a probability measure as defined in Definition 4.73.
Corollary 4.87. Let $c_\psi(A) = \psi(P[A])$ be the distortion of the probability measure $P$ with respect to the continuous distortion function $\psi$. If $\varphi$ is an inverse function for the increasing function $\psi$ in the sense of Definition A.14, then the Choquet integral with respect to $c_\psi$ satisfies
$$\int X\, dc_\psi = \int_0^1 q_X(1 - \varphi(t))\, dt,$$
where $q_X$ is a quantile function for $X \in \mathcal{X}$, taken with respect to $P$.


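Proof. Due to the continuity of $\psi$, we have $\psi(a) \le t$ if and only if $a \le \varphi^+(t) = \inf\{ x \mid \psi(x) > t \}$. Thus, we can compute the lower quantile function of $X$ with respect to $c_\psi$:
$$\begin{aligned}
r_X^-(t) &= \inf\{ x \in \mathbb{R} \mid 1 - c_\psi(X > x) \ge t \} \\
&= \inf\{ x \in \mathbb{R} \mid \psi(P[\,X > x\,]) \le 1 - t \} \\
&= \inf\{ x \in \mathbb{R} \mid P[\,X > x\,] \le \varphi^+(1 - t) \} \\
&= q_X^-(1 - \varphi^+(1 - t)).
\end{aligned}$$
Next note that $\varphi^+(t) = \varphi(t)$ for a.e. $t$. Moreover, $\varphi$ has the continuous distribution function $\psi$ under the Lebesgue measure, and so we can replace $q_X^-$ by the arbitrary quantile function $q_X$.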
Theorem 4.88. A monetary risk measure $\rho$ on $\mathcal{X}$ is comonotonic if and only if there exists a normalized monotone set function $c$ on $(\Omega, \mathcal{F})$ such that
$$\rho(X) = \int (-X)\, dc, \qquad X \in \mathcal{X}.$$
In this case, $c$ is given by $c(A) = \rho(-I_A)$.
The preceding theorem implies in view of Corollary 4.77 that all mixtures
$$\rho_\mu = \int_{[0,1]} \mathrm{AV@R}_\lambda\, \mu(d\lambda)$$
are comonotonic. We will see in Theorem 4.93 below that these are in fact all convex risk measures that are law-invariant and comonotonic. The proof of Theorem 4.88 requires a further analysis of comonotone random variables.
Lemma 4.89. Two measurable functions $X$ and $Y$ on $(\Omega, \mathcal{F})$ are comonotone if and only if there exists a third measurable function $Z$ on $(\Omega, \mathcal{F})$ and increasing functions $f$ and $g$ on $\mathbb{R}$ such that $X = f(Z)$ and $Y = g(Z)$.
Proof. Clearly, $X := f(Z)$ and $Y := g(Z)$ are comonotone for given $Z$, $f$, and $g$. Conversely, suppose that $X$ and $Y$ are comonotone and define $Z$ by $Z := X + Y$. We show that $z := Z(\omega)$ has a unique decomposition as $z = x + y$, where $(x, y) = (X(\omega'), Y(\omega'))$ for some $\omega' \in \Omega$. Having established this, we can put $f(z) := x$ and $g(z) := y$. The existence of the decomposition as $z = x + y$ follows by taking $x := X(\omega)$ and $y := Y(\omega)$, so it remains to show that these are the only possible values $x$ and $y$. To this end, let us suppose that $X(\omega) + Y(\omega) = z = X(\omega') + Y(\omega')$ for some $\omega' \in \Omega$. Then
$$X(\omega) - X(\omega') = -\big( Y(\omega) - Y(\omega') \big),$$
and comonotonicity implies that this expression vanishes. Hence $x = X(\omega')$ and $y = Y(\omega')$.
Next, we check that both $f$ and $g$ are increasing functions on $Z(\Omega)$. So let us suppose that
$$X(\omega_1) + Y(\omega_1) = z_1 \le z_2 = X(\omega_2) + Y(\omega_2).$$
This implies
$$X(\omega_1) - X(\omega_2) \le -\big( Y(\omega_1) - Y(\omega_2) \big).$$
Comonotonicity thus yields that $X(\omega_1) - X(\omega_2) \le 0$ and $Y(\omega_1) - Y(\omega_2) \le 0$, whence $f(z_1) \le f(z_2)$ and $g(z_1) \le g(z_2)$. Thus, $f$ and $g$ are increasing on $Z(\Omega)$, and it is straightforward to extend them to increasing functions defined on $\mathbb{R}$.
Lemma 4.90. If $X, Y \in \mathcal{X}$ is a pair of comonotone functions, and $r_X$, $r_Y$, $r_{X+Y}$ are quantile functions with respect to $c$, then
$$r_{X+Y}(t) = r_X(t) + r_Y(t) \quad \text{for a.e. } t.$$
Proof. Write $X = f(Z)$ and $Y = g(Z)$ as in Lemma 4.89. The same argument as in the proof of Lemma A.23 shows that $f(r_Z)$ and $g(r_Z)$ are quantile functions for $X$ and $Y$ under $c$ if $r_Z$ is a quantile function for $Z$. An identical argument applied to the increasing function $h := f + g$ shows that $h(r_Z) = f(r_Z) + g(r_Z)$ is a quantile function for $X + Y$. The assertion now follows from the fact that all quantile functions of a random variable coincide almost everywhere, due to Lemma A.15.
Remark 4.91. Applied to the special case of quantile functions with respect to a probability measure, the preceding lemma yields that $\mathrm{V@R}_\lambda$ and $\mathrm{AV@R}_\lambda$ are comonotonic. ♦
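The additivity of quantile functions in Lemma 4.90 is easy to observe on a sample space with $n$ equally likely outcomes, where the lower quantile function of a position is the step function built from its sorted values. The sketch below compares the sorted values of $X + Y$ with the sum of the sorted values of $X$ and $Y$; the numbers and the functions applied to the hypothetical common factor `z` are illustrative assumptions.

```python
def lower_quantiles(values):
    """Sorted values: on n equally likely outcomes, the k-th smallest value
    is the lower quantile at level k/n."""
    return sorted(values)

def quantiles_add_up(x, y, tol=1e-12):
    """Check whether the quantile function of x + y is the sum of those of x and y."""
    total = [a + b for a, b in zip(x, y)]
    return all(abs(s - (a + b)) < tol
               for s, a, b in zip(lower_quantiles(total),
                                  lower_quantiles(x),
                                  lower_quantiles(y)))

z = [3.0, -1.0, 2.0, 0.5]            # hypothetical common factor Z
x = [2.0 * v for v in z]             # X = f(Z) with f increasing
y = [max(v, 0.0) for v in z]         # Y = g(Z) with g increasing
hedge = [-v for v in z]              # decreasing in Z, hence not comonotone with X

additive_for_comonotone = quantiles_add_up(x, y)
additive_for_hedge = quantiles_add_up(x, hedge)
```

For the comonotone pair the quantiles add up, while for the hedged pair they do not, which is why a quantile-based risk measure can fall strictly below the sum of the individual risks in that case.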
Proof of Theorem 4.88. We already know from Proposition 4.84 that the Choquet integral of the loss is a monetary risk measure. Comonotonicity follows by combining Proposition 4.86 with Lemma 4.90.
Conversely, suppose now that $\rho$ is comonotonic. Then $\rho$ is positively homogeneous according to Lemma 4.83. In particular we have $\rho(-m) = m$ for $m \ge 0$. Thus, we obtain a normalized monotone set function by letting $c(A) := \rho(-I_A)$. Moreover, $\rho_c(X) := \int (-X)\, dc$ is a comonotonic monetary risk measure on $\mathcal{X}$ that coincides with $\rho$ on indicator functions: $\rho(-I_A) = c(A) = \rho_c(-I_A)$. Let us now show that $\rho$ and $\rho_c$ coincide on simple random variables of the form
$$X = \sum_{i=1}^n x_i I_{A_i}, \qquad x_i \in \mathbb{R},\ A_i \in \mathcal{F}.$$
Since these random variables are dense in $L^\infty$, Lemma 4.3 will then imply that $\rho = \rho_c$. In order to show that $\rho_c(-X) = \rho(-X)$ for $X$ as above, we may assume without loss of generality that $x_1 \ge x_2 \ge \cdots \ge x_n$ and that the sets $A_i$ are disjoint. By cash invariance, we may also assume $X \ge 0$, i.e., $x_n \ge 0$. Thus, we can write $X = \sum_{i=1}^n b_i I_{B_i}$, where $b_i := x_i - x_{i+1} \ge 0$, $x_{n+1} := 0$, and $B_i := \bigcup_{k=1}^i A_k$. Note that $b_i I_{B_i}$ and $b_k I_{B_k}$ is a pair of comonotone functions. Hence, also $\sum_{i=1}^{k-1} b_i I_{B_i}$ and $b_k I_{B_k}$ are comonotone, and we get inductively
$$\rho(-X) = \sum_{i=1}^n b_i\, \rho(-I_{B_i}) = \sum_{i=1}^n b_i\, \rho_c(-I_{B_i}) = \rho_c(-X).$$

Remark 4.92. The argument at the end of the preceding proof shows that the Choquet integral of a simple random variable
$$X = \sum_{i=1}^n x_i I_{A_i} \quad \text{with } x_1 \ge \cdots \ge x_n \ge x_{n+1} := 0$$
and disjoint sets $A_1, \dots, A_n$ can be computed as
$$\int X\, dc = \sum_{i=1}^n (x_i - x_{i+1})\, c(B_i) = \sum_{i=1}^n x_i \big( c(B_i) - c(B_{i-1}) \big),$$
where $B_0 := \emptyset$ and $B_i := \bigcup_{k=1}^i A_k$ for $i = 1, \dots, n$.
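The second sum in Remark 4.92 translates into a short routine: sort the values in decreasing order, accumulate the sets $B_i$, and weight each value by the increment of $c$. The following sketch assumes that the sets $A_i$ form a partition of a hypothetical four-point sample space with uniform $P$ and uses the distortion $c = \sqrt{P}$; all names and numbers are illustrative.

```python
import math

def choquet_simple(xs, sets, c):
    """Choquet integral of X = sum_i xs[i] * I_{sets[i]} via Remark 4.92.

    xs:   nonnegative values, one per set
    sets: disjoint frozensets forming a partition of the sample space
    c:    normalized monotone set function on frozensets
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)
    total, B, c_prev = 0.0, frozenset(), 0.0
    for i in order:
        B = B | sets[i]                    # B_i = union of the sets seen so far
        c_B = c(B)
        total += xs[i] * (c_B - c_prev)    # x_i * (c(B_i) - c(B_{i-1}))
        c_prev = c_B
    return total

omega = frozenset({1, 2, 3, 4})            # hypothetical sample space

def c(B):
    # Distortion of the uniform distribution by psi(t) = sqrt(t) (an assumption).
    return math.sqrt(len(B) / len(omega))

sets = [frozenset({1}), frozenset({2, 3}), frozenset({4})]
xs = [1.0, 4.0, 0.0]
value = choquet_simple(xs, sets, c)
```

In this example the first sum of Remark 4.92 reads $(4 - 1)\, c(\{2,3\}) + (1 - 0)\, c(\{1,2,3\})$, and `value` agrees with it.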

So far, we have shown that comonotonic monetary risk measures can be identified with Choquet integrals of normalized monotone set functions. Our next goal is to characterize those set functions that induce risk measures with the additional property of convexity. To this end, we will first consider law-invariant risk measures. The following result shows that the risk measures $\mathrm{AV@R}_\lambda$ may be viewed as the extreme points in the convex class of all law-invariant convex risk measures on $L^\infty$ that are comonotonic.
Theorem 4.93. On an atomless probability space, the class of risk measures
$$\rho_\mu(X) := \int \mathrm{AV@R}_\lambda(X)\, \mu(d\lambda), \qquad \mu \in \mathcal{M}_1([0,1]),$$
is precisely the class of all law-invariant convex risk measures on $L^\infty$ that are comonotonic. In particular, any convex risk measure that is law-invariant and comonotonic is also coherent and continuous from above.
Proof. Comonotonicity of $\rho_\mu$ follows from Corollary 4.77 and Theorem 4.88. Conversely, let us assume that $\rho$ is a law-invariant convex risk measure that is also comonotonic. By Theorem 4.88, $\rho(X) = \int (-X)\, dc$ for $c(A) := \rho(-I_A)$. The law-invariance of $\rho$ implies that $c(A)$ is a function of the probability $P[A]$, i.e., there exists an increasing function $\psi$ on $[0,1]$ such that $\psi(0) = 0$, $\psi(1) = 1$, and $c(A) = \psi(P[A])$. Note that $I_{A \cup B}$ and $I_{A \cap B}$ is a pair of comonotone functions for all $A, B \in \mathcal{F}$. Hence, comonotonicity and subadditivity of $\rho$ imply
$$\begin{aligned}
c(A \cap B) + c(A \cup B) &= \rho(-I_{A \cap B}) + \rho(-I_{A \cup B}) = \rho(-I_{A \cap B} - I_{A \cup B}) \\
&= \rho(-I_A - I_B) \\
&\le c(A) + c(B).
\end{aligned} \tag{4.57}$$
Proposition 4.75 thus implies that $\psi$ is concave. Corollary 4.77 finally shows that the Choquet integral with respect to $c$ can be identified with a risk measure $\rho_\mu$, where $\mu$ is obtained from $\psi$ via Lemma 4.69.
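The link between submodularity of $c = \psi \circ P$ and concavity of $\psi$ used in this proof can be observed numerically by testing the inequality in (4.57) on all pairs of events of a small sample space. The sketch below uses a hypothetical four-point space with uniform $P$; the two test functions $\sqrt{t}$ and $t^2$ are illustrative choices, and the tolerance only absorbs rounding errors.

```python
import math
from itertools import combinations

def subsets(points):
    """All subsets of a finite collection of points, as frozensets."""
    return [frozenset(s)
            for r in range(len(points) + 1)
            for s in combinations(points, r)]

def is_submodular(c, points, tol=1e-12):
    """Check c(A | B) + c(A & B) <= c(A) + c(B) for all events A and B."""
    events = subsets(points)
    return all(c(A | B) + c(A & B) <= c(A) + c(B) + tol
               for A in events for B in events)

omega = [0, 1, 2, 3]                       # hypothetical sample space, uniform P

def distorted(psi):
    # Set function c(A) = psi(P[A]) for the uniform distribution on omega.
    return lambda A: psi(len(A) / len(omega))

concave_case = is_submodular(distorted(math.sqrt), omega)
convex_case = is_submodular(distorted(lambda t: t * t), omega)
```

The concave distortion passes the test, while the convex function $t^2$ already fails for two disjoint one-point events.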


Now we turn to the characterization of all comonotonic convex risk measures on $\mathcal{X}$. Recall that, for a positively homogeneous monetary risk measure, convexity is equivalent to subadditivity. Also recall that $\mathcal{M}_{1,f} := \mathcal{M}_{1,f}(\Omega, \mathcal{F})$ denotes the set of all finitely additive normalized set functions $Q : \mathcal{F} \to [0,1]$, and that $E_Q[\,X\,]$ denotes the integral of $X \in \mathcal{X}$ with respect to $Q \in \mathcal{M}_{1,f}$, as constructed in Theorem A.51.
Theorem 4.94. For the Choquet integral with respect to a normalized monotone set function $c$, the following conditions are equivalent:
(a) $\rho(X) := \int (-X)\, dc$ is a convex risk measure on $\mathcal{X}$.
(b) $\rho(X) := \int (-X)\, dc$ is a coherent risk measure on $\mathcal{X}$.
(c) For $\mathcal{Q}_c := \{ Q \in \mathcal{M}_{1,f} \mid Q[A] \le c(A) \text{ for all } A \in \mathcal{F} \}$,
$$\int X\, dc = \max_{Q \in \mathcal{Q}_c} E_Q[\,X\,] \quad \text{for } X \in \mathcal{X}.$$
(d) The set function $c$ is submodular.
In this case, $\mathcal{Q}_c$ is equal to the maximal representing set $\mathcal{Q}_{\max}$ for $\rho$.
Before giving the proof of this theorem, let us state the following corollary, which gives a complete characterization of all comonotonic convex risk measures, and a remark concerning the set $\mathcal{Q}_c$ in part (c), which is usually called the core of $c$.
Corollary 4.95. A convex risk measure $\rho$ on $\mathcal{X}$ is comonotonic if and only if it arises as the Choquet integral of the loss with respect to a submodular, normalized, and monotone set function $c$. In this case, $c$ is given by $c(A) = \rho(-I_A)$, and $\rho$ has the representation
$$\rho(X) = \max_{Q \in \mathcal{Q}_c} E_Q[\,-X\,],$$
where $\mathcal{Q}_c = \{ Q \in \mathcal{M}_{1,f} \mid Q[A] \le c(A) \text{ for all } A \in \mathcal{F} \}$ is equal to the maximal representing set $\mathcal{Q}_{\max}$.


Proof. Theorems 4.88 and 4.94 state that $\rho(X) := \int (-X)\, dc$ is a comonotonic coherent risk measure, which can be represented as in the assertion, as soon as $c$ is a submodular, normalized, and monotone set function. Conversely, any comonotonic convex risk measure $\rho$ is coherent and arises as the Choquet integral of $c(A) := \rho(-I_A)$, due to Theorem 4.88. Theorem 4.94 then gives the submodularity of $c$.
Remark 4.96. Let $c$ be a normalized, monotone, submodular set function. Theorem 4.94 implies in particular that the core $\mathcal{Q}_c$ of $c$ is non-empty. Moreover, $c$ can be recovered from $\mathcal{Q}_c$:
$$c(A) = \max_{Q \in \mathcal{Q}_c} Q[A] \quad \text{for all } A \in \mathcal{F}.$$
If $c$ has the additional continuity property that $c(A_n) \to 0$ for any decreasing sequence $(A_n)$ of events such that $\bigcap_n A_n = \emptyset$, then this property is shared by any $Q \in \mathcal{Q}_c$, and it follows that $Q$ is $\sigma$-additive. Thus, the corresponding coherent risk measure $\rho(X) = \int (-X)\, dc$ admits a representation in terms of $\sigma$-additive probability measures. It follows by Lemma 4.21 that $\rho$ is continuous from above. ♦
The proof of Theorem 4.94 requires some preparations. The assertion of the following lemma is not entirely obvious, since Fubini's theorem may fail if $Q \in \mathcal{M}_{1,f}$ is not $\sigma$-additive.
Lemma 4.97. For $X \in \mathcal{X}$ and $Q \in \mathcal{M}_{1,f}$, the integral $E_Q[\,X\,]$ is equal to the Choquet integral $\int X\, dQ$.
Proof. It is enough to prove the result for $X \ge 0$. Suppose first that $X = \sum_{i=1}^n x_i I_{A_i}$ is as in Remark 4.92. Then
$$\int X\, dQ = \sum_{i=1}^n (x_i - x_{i+1})\, Q\Big[ \bigcup_{k=1}^i A_k \Big] = \sum_{i=1}^n x_i\, Q[A_i] = E_Q[\,X\,].$$
The result for general $X \in \mathcal{X}$ follows by approximating $X$ uniformly with $X_n$ which take only finitely many values, and by using the Lipschitz continuity of both $E_Q[\,\cdot\,]$ and $\int \cdot\, dQ$ with respect to the supremum norm.
Lemma 4.98. Let $A_1, \dots, A_n$ be a partition of $\Omega$ into disjoint measurable sets, and suppose that the normalized monotone set function $c$ is submodular. Let $Q$ be the probability measure on $\mathcal{F}_0 := \sigma(A_1, \dots, A_n)$ with weights
$$Q[A_k] := c(B_k) - c(B_{k-1}) \quad \text{for } B_0 := \emptyset \text{ and } B_k := \bigcup_{j=1}^k A_j,\ k \ge 1. \tag{4.58}$$
Then $\int X\, dc \ge E_Q[\,X\,]$ for all $\mathcal{F}_0$-measurable $X = \sum_{i=1}^n x_i I_{A_i}$, and equality holds if the values of $X$ are arranged in decreasing order: $x_1 \ge \cdots \ge x_n$.


Proof. Clearly, it suffices to consider only the case $X \ge 0$. Then Remark 4.92 implies $\int X\, dc = E_Q[\,X\,]$ as soon as the values of $X$ are arranged in decreasing order.
Now we prove $\int X\, dc \ge E_Q[\,X\,]$ for arbitrary $\mathcal{F}_0$-measurable $X \ge 0$. To this end, note that any permutation $\sigma$ of $\{1, \dots, n\}$ induces a probability measure $Q_\sigma$ on $\mathcal{F}_0$ by applying the definition of $Q$ to the re-labeled partition $A_{\sigma(1)}, \dots, A_{\sigma(n)}$. If $\sigma$ is a permutation such that $x_{\sigma(1)} \ge \cdots \ge x_{\sigma(n)}$, then we have $\int X\, dc = E_{Q_\sigma}[\,X\,]$, and so the assertion will follow if we can prove that $E_{Q_\sigma}[\,X\,] \ge E_Q[\,X\,]$. To this end, it is enough to show that $E_{Q_\tau}[\,X\,] \ge E_Q[\,X\,]$ if $\tau$ is the transposition of two indices $i$ and $i + 1$ which are such that $x_i < x_{i+1}$, because $\sigma$ can be represented as a finite product of such transpositions.
Note next that
$$E_{Q_\tau}[\,X\,] - E_Q[\,X\,] = x_i \big( Q_\tau[A_i] - Q[A_i] \big) + x_{i+1} \big( Q_\tau[A_{i+1}] - Q[A_{i+1}] \big). \tag{4.59}$$
To compute the probabilities $Q_\tau[A_k]$, let us introduce
$$B_0^\tau := \emptyset \quad \text{and} \quad B_k^\tau := \bigcup_{j=1}^k A_{\tau(j)}, \qquad k = 1, \dots, n.$$
Then $B_k^\tau = B_k$ for $k \ne i$. Hence,
$$\begin{aligned}
Q_\tau[A_i] + Q_\tau[A_{i+1}] &= Q_\tau[A_{\tau(i)}] + Q_\tau[A_{\tau(i+1)}] = c(B_{i+1}^\tau) - c(B_{i-1}^\tau) \\
&= c(B_{i+1}) - c(B_{i-1}) = Q[A_i] + Q[A_{i+1}].
\end{aligned} \tag{4.60}$$
Moreover, $B_i^\tau \cap B_i = B_{i-1}$, $B_i^\tau \cup B_i = B_{i+1}$, and hence $c(B_{i-1}) + c(B_{i+1}) \le c(B_i^\tau) + c(B_i)$, due to the submodularity of $c$. Thus,
$$Q[A_{i+1}] = c(B_{i+1}) - c(B_i) \le c(B_i^\tau) - c(B_{i-1}) = Q_\tau[A_{\tau(i)}] = Q_\tau[A_{i+1}].$$
Using (4.59), (4.60), and our assumption $x_i < x_{i+1}$ thus yields $E_{Q_\tau}[\,X\,] \ge E_Q[\,X\,]$.
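The statement of Lemma 4.98 can be explored by brute force on a small partition: each ordering of the sets yields a measure with weights as in (4.58), and the ordering by decreasing values of $X$ gives the largest expectation. The sketch below uses a hypothetical three-set partition with probabilities 0.2, 0.3, 0.5 and the submodular set function $c = \sqrt{P}$; these choices are illustrative assumptions.

```python
import math
from itertools import permutations

n = 3
prob = [0.2, 0.3, 0.5]          # hypothetical probabilities of A_1, A_2, A_3

def c(indices):
    # Submodular set function c = sqrt(P), evaluated on a union of partition sets.
    return math.sqrt(sum(prob[i] for i in indices))

def core_measure(order):
    """Weights (4.58) for the partition relabeled in the given order."""
    q = [0.0] * n
    B, c_prev = [], 0.0
    for k in order:
        B.append(k)              # B_k = union of the first k relabeled sets
        c_B = c(B)
        q[k] = c_B - c_prev      # Q[A_k] = c(B_k) - c(B_{k-1})
        c_prev = c_B
    return q

def expectation(q, x):
    return sum(qi * xi for qi, xi in zip(q, x))

x = [1.0, 4.0, 2.0]              # hypothetical values of X on A_1, A_2, A_3
decreasing = sorted(range(n), key=lambda i: x[i], reverse=True)
choquet = expectation(core_measure(decreasing), x)
all_values = [expectation(core_measure(p), x) for p in permutations(range(n))]
```

Every entry of `all_values` is bounded by `choquet`, and the bound is attained for the decreasing ordering, in line with the lemma.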
Proof of Theorem 4.94. (a) $\Leftrightarrow$ (b): According to Proposition 4.84, the property of positive homogeneity is shared by all Choquet integrals, and the implication (b) $\Rightarrow$ (a) is obvious.
(b) $\Rightarrow$ (c): By Corollary 4.19, $\rho(X) = \max_{Q \in \mathcal{Q}_{\max}} E_Q[\,-X\,]$, where $Q \in \mathcal{M}_{1,f}$ belongs to $\mathcal{Q}_{\max}$ if and only if
$$E_Q[\,X\,] \le \rho(-X) = \int X\, dc \quad \text{for all } X \in \mathcal{X}. \tag{4.61}$$
We will now show that this set $\mathcal{Q}_{\max}$ coincides with the set $\mathcal{Q}_c$. If $Q \in \mathcal{Q}_{\max}$ then, in particular, $Q[A] \le \int I_A\, dc = c(A)$ for all $A \in \mathcal{F}$. Hence $Q \in \mathcal{Q}_c$. Conversely, suppose $Q \in \mathcal{Q}_c$. If $X \ge 0$ then
$$\int X\, dc = \int_0^\infty c(X > x)\, dx \ge \int_0^\infty Q[\,X > x\,]\, dx = E_Q[\,X\,],$$
where we have used Lemma 4.97. Cash invariance yields (4.61).
(c) $\Rightarrow$ (b) is obvious.
(b) $\Rightarrow$ (d): This follows precisely as in (4.57).
(d) $\Rightarrow$ (b): We have to show that the Choquet integral is subadditive. By Lemma 4.3, it is again enough to prove this for random variables which only take finitely many values. Thus, let $A_1, \dots, A_n$ be a partition of $\Omega$ into finitely many disjoint measurable sets. Let us write $X = \sum_i x_i I_{A_i}$, $Y = \sum_i y_i I_{A_i}$, and let us assume that the indices $i = 1, \dots, n$ are arranged such that $x_1 + y_1 \ge \cdots \ge x_n + y_n$. Then the probability measure $Q$ constructed in Lemma 4.98 is such that
$$\int (X + Y)\, dc = E_Q[\,X + Y\,] = E_Q[\,X\,] + E_Q[\,Y\,] \le \int X\, dc + \int Y\, dc.$$
But this is the required subadditivity of the Choquet integral.

4.8 Measures of risk in a financial market

In this section, we will consider risk measures which arise in the financial market model of Section 1.1. In this model, $d + 1$ assets are priced at times $t = 0$ and $t = 1$. Prices at time 1 are modelled as non-negative random variables $S^0, S^1, \dots, S^d$ on some probability space $(\Omega, \mathcal{F}, P)$, with $S^0 \equiv 1 + r$. Prices at time 0 are given by a vector $\bar\pi = (1, \pi)$, with $\pi = (\pi^1, \dots, \pi^d)$. The discounted net gain of a trading strategy $\bar\xi = (\xi^0, \xi)$ is given by $\xi \cdot Y$, where the random vector $Y = (Y^1, \dots, Y^d)$ is defined by
$$Y^i = \frac{S^i}{1 + r} - \pi^i \quad \text{for } i = 1, \dots, d.$$
As in the previous two sections, risk measures will be defined on the space $L^\infty = L^\infty(\Omega, \mathcal{F}, P)$. A financial position $X$ can be viewed as riskless if $X \ge 0$ or, more generally, if $X$ can be hedged without additional costs, i.e., if there exists a trading strategy $\bar\xi = (\xi^0, \xi)$ such that $\bar\pi \cdot \bar\xi = 0$ and
$$X + \frac{\bar\xi \cdot \bar S}{1 + r} = X + \xi \cdot Y \ge 0 \quad P\text{-a.s.}$$
Thus, we define the following set of acceptable positions in $L^\infty$:
$$\mathcal{A}_0 := \big\{ X \in L^\infty \mid \exists\, \xi \in \mathbb{R}^d \text{ with } X + \xi \cdot Y \ge 0 \ P\text{-a.s.} \big\}. \tag{4.62}$$

Proposition 4.99. Suppose that $\inf\{ m \in \mathbb{R} \mid m \in \mathcal{A}_0 \} > -\infty$. Then $\rho_0 := \rho_{\mathcal{A}_0}$ is a coherent risk measure. Moreover, $\rho_0$ is sensitive in the sense of Definition 4.42 if and only if the market model is arbitrage-free. In this case, $\rho_0$ is continuous from above and can be represented in terms of the set $\mathcal{P}$ of equivalent risk-neutral measures:
$$\rho_0(X) = \sup_{P^* \in \mathcal{P}} E^*[\,-X\,]. \tag{4.63}$$

Proof. The fact that $\rho_0$ is a coherent risk measure follows from Proposition 4.7. If the model is arbitrage-free, then Theorem 1.32 yields the representation (4.63), and it follows that $\rho_0$ is sensitive and continuous from above.
Conversely, suppose that $\rho_0$ is sensitive, but the market model admits an arbitrage opportunity. Then there are $\xi \in \mathbb{R}^d$ and $\varepsilon > 0$ such that $0 \le \xi \cdot Y$ $P$-a.s. and $A := \{ \xi \cdot Y \ge \varepsilon \}$ satisfies $P[A] > 0$. It follows that $\xi \cdot Y - \varepsilon I_A \ge 0$, i.e., $-\varepsilon I_A$ is acceptable. However, the sensitivity of $\rho_0$ implies that
$$\rho_0(-\varepsilon I_A) = \varepsilon \rho_0(-I_A) > \rho_0(0) = 0,$$
where we have used the coherence of $\rho_0$, which follows from the fact that $\mathcal{A}_0$ is a cone. Thus, we arrive at a contradiction.
There are several reasons why it may make sense to allow in (4.62) only strategies $\xi$ that belong to a proper subset $\mathcal{S}$ of the class $\mathbb{R}^d$ of all strategies. For instance, if the resources available to an investor are limited, only those strategies should be considered for which the initial investment in risky assets is below a certain amount. Such a restriction corresponds to an upper bound on $\xi \cdot \pi$. There may be other constraints. For instance, short sales constraints are lower bounds on the number of shares in the portfolio. In view of market illiquidity, the investor may also wish to avoid holding too many shares of one single asset, since the market capacity may not suffice to resell the shares. Such constraints will be taken into account by assuming throughout the remainder of this section that $\mathcal{S}$ has the following properties:
• $0 \in \mathcal{S}$.
• $\mathcal{S}$ is convex.
• Each $\xi \in \mathcal{S}$ is admissible in the sense that $\xi \cdot Y$ is $P$-a.s. bounded from below.
Under these conditions, the set
$$\mathcal{A}^{\mathcal{S}} := \big\{ X \in L^\infty \mid \exists\, \xi \in \mathcal{S} \text{ with } X + \xi \cdot Y \ge 0 \ P\text{-a.s.} \big\} \tag{4.64}$$
is non-empty, convex, and contains all $X \in \mathcal{X}$ which dominate some $Z \in \mathcal{A}^{\mathcal{S}}$. Moreover, we will assume from now on that
$$\inf\{ m \in \mathbb{R} \mid m \in \mathcal{A}^{\mathcal{S}} \} > -\infty. \tag{4.65}$$

Proposition 4.7 then guarantees that the induced risk measure
$$\rho^{\mathcal{S}}(X) := \rho_{\mathcal{A}^{\mathcal{S}}}(X) = \inf\{ m \in \mathbb{R} \mid m + X \in \mathcal{A}^{\mathcal{S}} \}$$
is a convex risk measure on $L^\infty$. Note that (4.65) holds, in particular, if $\mathcal{S}$ does not contain arbitrage opportunities in the sense that $\xi \cdot Y \ge 0$ $P$-a.s. for $\xi \in \mathcal{S}$ implies $P[\,\xi \cdot Y = 0\,] = 1$.
Remark 4.100. Admissibility of portfolios is a serious restriction; in particular, it prevents unhedged short sales of any unbounded asset. Note, however, that it is consistent with our notion of acceptability for bounded claims in (4.64), since $X + \xi \cdot Y \ge 0$ implies $\xi \cdot Y \ge -\|X\|$. ♦
Two questions arise: When is $\rho^{\mathcal{S}}$ continuous from above, and thus admits a representation (4.32) in terms of probability measures? And, if such a representation exists, how can we identify the minimal penalty function $\alpha^{\mathcal{S}}_{\min}$ on $\mathcal{M}_1(P)$? In the case $\mathcal{S} = \mathbb{R}^d$, both questions were addressed in Proposition 4.99. For general $\mathcal{S}$, only the second question has a straightforward answer, which will be given in Proposition 4.102. As can be seen from the proof of Proposition 4.99, an analysis of the first question requires an extension of the arbitrage theory in Chapter 1 for the case of portfolio constraints. Such a theory will be developed in Chapter 9 in a more general dynamic setting, and we will address both questions for the corresponding risk measures in Corollary 9.32. This result implies the following theorem for the simple one-period model of the present section:
Theorem 4.101. In addition to the above assumptions, suppose that the market model is non-redundant in the sense of Definition 1.15 and that $\mathcal{S}$ is a closed subset of $\mathbb{R}^d$ such that the cone $\{ \lambda \xi \mid \xi \in \mathcal{S},\ \lambda \ge 0 \}$ is closed. Then $\rho^{\mathcal{S}}$ is sensitive if and only if $\mathcal{S}$ contains no arbitrage opportunities. In this case, $\rho^{\mathcal{S}}$ is continuous from above and admits the representation
$$\rho^{\mathcal{S}}(X) = \sup_{Q \in \mathcal{M}_1(P)} \Big( E_Q[\,-X\,] - \sup_{\xi \in \mathcal{S}} E_Q[\,\xi \cdot Y\,] \Big). \tag{4.66}$$
In the following proposition, we will explain the specific form of the penalty function in (4.66). This result will not require the additional assumptions of Theorem 4.101.
Proposition 4.102. For $Q \in \mathcal{M}_1(P)$, the minimal penalty function $\alpha^{\mathcal{S}}_{\min}$ of $\rho^{\mathcal{S}}$ is given by
$$\alpha^{\mathcal{S}}_{\min}(Q) = \sup_{\xi \in \mathcal{S}} E_Q[\,\xi \cdot Y\,].$$
In particular, $\rho^{\mathcal{S}}$ can be represented as in (4.66) if $\rho^{\mathcal{S}}$ is continuous from above.

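Proof. Fix $Q \in \mathcal{M}_1(P)$. Clearly, the expectation $E_Q[\,\xi \cdot Y\,]$ is well defined for each $\xi \in \mathcal{S}$ by admissibility. If $X \in \mathcal{A}^{\mathcal{S}}$, there exists $\xi \in \mathcal{S}$ such that $-X \le \xi \cdot Y$ $P$-almost surely. Thus,
$$E_Q[\,-X\,] \le E_Q[\,\xi \cdot Y\,] \le \sup_{\xi \in \mathcal{S}} E_Q[\,\xi \cdot Y\,]$$
for any $Q \in \mathcal{M}_1(P)$. Hence, the definition of the minimal penalty function yields
$$\alpha^{\mathcal{S}}_{\min}(Q) \le \sup_{\xi \in \mathcal{S}} E_Q[\,\xi \cdot Y\,]. \tag{4.67}$$
To prove the converse inequality, take $\xi \in \mathcal{S}$. Note that $X_k := -((\xi \cdot Y) \wedge k)$ is bounded since $\xi$ is admissible. Moreover,
$$X_k + \xi \cdot Y = (\xi \cdot Y - k)\, I_{\{\xi \cdot Y \ge k\}} \ge 0,$$
so that $X_k \in \mathcal{A}^{\mathcal{S}}$. Hence,
$$\alpha^{\mathcal{S}}_{\min}(Q) \ge E_Q[\,-X_k\,] = E_Q[\,(\xi \cdot Y) \wedge k\,],$$
and so $\alpha^{\mathcal{S}}_{\min}(Q) \ge E_Q[\,\xi \cdot Y\,]$ by monotone convergence.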
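In a model with one risky asset on a finite sample space, the penalty $\sup_{\xi \in \mathcal{S}} E_Q[\,\xi \cdot Y\,]$ of Proposition 4.102 is elementary to evaluate when $\mathcal{S}$ is an interval, because $\xi \mapsto \xi\, E_Q[\,Y\,]$ is linear and the supremum is attained at an endpoint. The sketch below assumes the hypothetical constraint set $\mathcal{S} = [0, 1]$ (no short sales, at most one share); the net gains and the two measures are illustrative numbers.

```python
def penalty(q, y, endpoints):
    """sup over xi in an interval of E_Q[xi * Y], for one risky asset.

    q:         weights of Q on the outcomes
    y:         discounted net gains Y on the outcomes
    endpoints: the two endpoints of the interval S; the map xi -> xi * E_Q[Y]
               is linear, so the supremum is attained at one of them
    """
    mean_gain = sum(qi * yi for qi, yi in zip(q, y))
    return max(xi * mean_gain for xi in endpoints)

y = [-0.2, 0.1, 0.3]            # hypothetical discounted net gains
S_endpoints = [0.0, 1.0]        # S = [0, 1]: no short sales, at most one share

q_bear = [0.5, 0.3, 0.2]        # E_Q[Y] < 0: the best strategy is xi = 0
q_bull = [0.2, 0.3, 0.5]        # E_Q[Y] > 0: the best strategy is xi = 1
```

Under `q_bear` the penalty vanishes, so such a measure is not penalized in (4.66), whereas under `q_bull` the penalty equals the expected net gain of one share.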
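Exercise 4.8.1. Show that the identity
$$\alpha^{\mathcal{S}}_{\min}(Q) = \sup_{\xi \in \mathcal{S}} E_Q[\,\xi \cdot Y\,]$$
in Proposition 4.102 remains true even for $Q \in \mathcal{M}_{1,f}(P)$ if we assume in addition that $Y$ is $P$-a.s. bounded. We thus obtain the representation
$$\rho^{\mathcal{S}}(X) = \max_{Q \in \mathcal{M}_{1,f}(P)} \Big( E_Q[\,-X\,] - \sup_{\xi \in \mathcal{S}} E_Q[\,\xi \cdot Y\,] \Big)$$
without assuming continuity from above. ♦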

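Remark 4.103. Suppose that $\mathcal{S}$ is a cone. Then the acceptance set $\mathcal{A}^{\mathcal{S}}$ is also a cone, and $\rho^{\mathcal{S}}$ is a coherent measure of risk. If $\rho^{\mathcal{S}}$ is continuous from above, then Corollary 4.37 yields the representation
$$\rho^{\mathcal{S}}(X) = \sup_{Q \in \mathcal{Q}^{\mathcal{S}}_{\max}} E_Q[\,-X\,]$$
in terms of the non-empty set $\mathcal{Q}^{\mathcal{S}}_{\max} = \{ Q \in \mathcal{M}_1(P) \mid \alpha^{\mathcal{S}}_{\min}(Q) = 0 \}$. It follows from Proposition 4.102 that for $Q \in \mathcal{M}_1(P)$
$$Q \in \mathcal{Q}^{\mathcal{S}}_{\max} \quad \text{if and only if} \quad E_Q[\,\xi \cdot Y\,] \le 0 \ \text{ for all } \xi \in \mathcal{S}.$$
If $\rho^{\mathcal{S}}$ is sensitive, then the set $\mathcal{S}$ cannot contain any arbitrage opportunities, and $\mathcal{Q}^{\mathcal{S}}_{\max}$ contains the set $\mathcal{P}$ of all equivalent martingale measures whenever such measures exist. More precisely, $\mathcal{Q}^{\mathcal{S}}_{\max}$ can be described as the set of absolutely continuous supermartingale measures with respect to $\mathcal{S}$; this will be discussed in more detail in the dynamical setting of Chapter 9. ♦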

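Let us now relax the condition of acceptability in (4.64). We no longer insist that the final outcome of an acceptable position, suitably hedged, should always be nonnegative. Instead, we only require that the hedged position is acceptable in terms of a given convex risk measure $\rho_{\mathcal{A}}$ with acceptance set $\mathcal{A}$. Thus, we define
$$\bar{\mathcal{A}} := \big\{ X \in L^\infty \mid \exists\, \xi \in \mathcal{S},\ A \in \mathcal{A} \text{ with } X + \xi \cdot Y \ge A \ P\text{-a.s.} \big\}. \tag{4.68}$$
Clearly, $\mathcal{A} \subset \bar{\mathcal{A}}$ and hence
$$\rho_{\mathcal{A}} \ge \rho := \rho_{\bar{\mathcal{A}}}.$$
From now on, we assume that
$$\rho > -\infty, \tag{4.69}$$
which implies our assumption (4.65) for $\mathcal{A}^{\mathcal{S}}$.
Proposition 4.104. The minimal penalty function $\alpha^{\min}$ for $\rho$ is given by
$$\alpha^{\min}(Q) = \alpha^{\mathcal{S}}_{\min}(Q) + \alpha_{\min}(Q), \tag{4.70}$$
where $\alpha^{\mathcal{S}}_{\min}$ is the minimal penalty function for $\rho^{\mathcal{S}}$ and $\alpha_{\min}$ is the minimal penalty function for $\rho_{\mathcal{A}}$.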
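Proof. We claim that
$$\{ X \in L^\infty \mid \rho(X) < 0 \} \subset \{ X^{\mathcal{S}} + A \mid X^{\mathcal{S}} \in \mathcal{A}^{\mathcal{S}},\ A \in \mathcal{A} \} \subset \bar{\mathcal{A}}. \tag{4.71}$$
If $\rho(X) < 0$, then there exists $A \in \mathcal{A}$ and $\xi \in \mathcal{S}$ such that $X + \xi \cdot Y \ge A$. Therefore $X^{\mathcal{S}} := X - A \in \mathcal{A}^{\mathcal{S}}$. Next, if $X^{\mathcal{S}} \in \mathcal{A}^{\mathcal{S}}$ then $X^{\mathcal{S}} + \xi \cdot Y \ge 0$ for some $\xi \in \mathcal{S}$. Hence, for any $A \in \mathcal{A}$, we get $X^{\mathcal{S}} + A + \xi \cdot Y \ge A \in \mathcal{A}$, i.e., $X := X^{\mathcal{S}} + A \in \bar{\mathcal{A}}$.
In view of (4.71), we have
$$\sup_{X :\, \rho(X) < 0} E_Q[\,-X\,] \le \sup_{X^{\mathcal{S}} \in \mathcal{A}^{\mathcal{S}}} \sup_{A \in \mathcal{A}} E_Q[\,-X^{\mathcal{S}} - A\,] \le \sup_{X \in \bar{\mathcal{A}}} E_Q[\,-X\,] \le \sup_{X \in \mathcal{A}_\rho} E_Q[\,-X\,].$$
But the left- and rightmost terms are equal, and so we get that
$$\alpha^{\min}(Q) = \sup_{X \in \bar{\mathcal{A}}} E_Q[\,-X\,] = \sup_{X^{\mathcal{S}} \in \mathcal{A}^{\mathcal{S}}} \sup_{A \in \mathcal{A}} E_Q[\,-X^{\mathcal{S}} - A\,] = \alpha^{\mathcal{S}}_{\min}(Q) + \alpha_{\min}(Q).$$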

Remark 4.105. It can happen that the sum of two penalty functions as on the right-hand side of (4.70) is infinite for every $Q \in \mathcal{M}_1(P)$. In this case, the condition (4.69) will be violated and the risk measure $\rho$ does not exist. Consider, for example, the situation of a complete market model without trading constraints. Then $\rho^{\mathcal{S}}(X) = E^*[\,-X\,]$, where $P^*$ is the unique equivalent risk-neutral measure. That is,
$$\alpha^{\mathcal{S}}_{\min}(Q) = \begin{cases} 0 & \text{when } Q = P^*, \\ +\infty & \text{otherwise.} \end{cases}$$
For $\rho_{\mathcal{A}}$ we take $\mathrm{AV@R}_\lambda$. Then $\alpha_{\min}(Q) + \alpha^{\mathcal{S}}_{\min}(Q)$ is infinite for $Q \ne P^*$, and $\alpha_{\min}(P^*) + \alpha^{\mathcal{S}}_{\min}(P^*)$ is finite (and in fact zero) if and only if
$$\frac{dP^*}{dP} \le \frac{1}{\lambda} \quad P\text{-a.s.}$$
But the latter condition is violated as soon as $P^* \ne P$ and $\lambda$ is close enough to 1. ♦


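Exercise 4.8.2. Suppose that condition (4.69) is satisfied. Show that the risk measure
$\bar\rho$ can also be obtained as
$$\bar\rho(X) = \inf\{\, \rho_{\mathcal A}(X - X^{\mathcal S}) \mid X^{\mathcal S} \in \mathcal A^{\mathcal S} \,\}$$
and as
$$\bar\rho(X) = \inf_{Z \in L^\infty} \big( \rho_{\mathcal A}(X - Z) + \rho^{\mathcal S}(Z) \big).$$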

The preceding exercise shows that $\bar\rho$ is a special case of an inf-convolution, which
we now define in a general context.
Definition 4.106. For two convex risk measures $\rho_1$ and $\rho_2$ on $L^\infty$,
$$\rho_1 \,\square\, \rho_2\,(X) := \inf_{Z \in L^\infty} \big( \rho_1(X - Z) + \rho_2(Z) \big), \qquad X \in L^\infty,$$
is called the inf-convolution of $\rho_1$ and $\rho_2$.


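Exercise 4.8.3. Let $\rho_1$ and $\rho_2$ be two convex risk measures on $L^\infty$ and assume that
$$\rho_1 \,\square\, \rho_2\,(0) = \inf_{Z \in L^\infty} \big( \rho_1(-Z) + \rho_2(Z) \big) > -\infty.$$
(a) Show that $\rho_1 \,\square\, \rho_2 = \rho_2 \,\square\, \rho_1$.
(b) Show that $\rho := \rho_1 \,\square\, \rho_2$ is a convex risk measure on $L^\infty$.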


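(c) Show that the minimal penalty function $\alpha^{\min}$ of $\rho$ is equal to the sum of the
respective minimal penalty functions $\alpha_1^{\min}$ and $\alpha_2^{\min}$ of $\rho_1$ and $\rho_2$:
$$\alpha^{\min}(Q) = \alpha_1^{\min}(Q) + \alpha_2^{\min}(Q) \qquad \text{for } Q \in \mathcal M_{1,f}.$$
(d) Show that $\rho$ is continuous from below as soon as $\rho_1$ is continuous from below.
♦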
For the rest of this section, we consider the following case study, which is based
on [45]. Let us fix a finite class
$$\mathcal Q_0 = \{Q_1, \dots, Q_n\}$$
of equivalent probability measures $Q_i \approx P$ such that $|Y| \in L^1(Q_i)$; as in [45], we
call the measures in $\mathcal Q_0$ valuation measures. Define the sets
$$\mathcal B := \{\, X \in L^0 \mid E_{Q_i}[X] \text{ exists and is } \ge 0,\ i = 1, \dots, n \,\} \tag{4.72}$$
and
$$\mathcal B_0 := \{\, X \in \mathcal B \mid E_{Q_i}[X] = 0 \text{ for } i = 1, \dots, n \,\}.$$
Note that
$$\mathcal B_0 \cap L^0_+ = \{0\}, \tag{4.73}$$
since $X = 0$ $P$-a.s. as soon as $X \ge 0$ $P$-a.s. and $E_{Q_i}[X] = 0$, due to the equivalence
$Q_i \approx P$.
As the initial acceptance set, we take the convex cone
$$\mathcal A := \mathcal B \cap L^\infty. \tag{4.74}$$
The corresponding set $\bar{\mathcal A}$ of positions which become acceptable if combined with a
suitable hedge is defined as in (4.68):
$$\bar{\mathcal A} := \{\, X \in L^\infty \mid \exists\, \xi \in \mathbb R^d \text{ with } X + \xi \cdot Y \in \mathcal B \,\}.$$
Let us now introduce the following stronger version of the no-arbitrage condition
$\mathcal K \cap L^0_+ = \{0\}$, where $\mathcal K := \{\, \xi \cdot Y \mid \xi \in \mathbb R^d \,\}$:
$$\mathcal K \cap \mathcal B = \mathcal K \cap \mathcal B_0. \tag{4.75}$$
In other words, there is no portfolio $\xi \in \mathbb R^d$ such that the result satisfies the valuation
inequalities in (4.72) and is strictly favorable in the sense that at least one of the
inequalities is strict.
Note that (4.75) implies the absence of arbitrage opportunities:
$$\mathcal K \cap L^0_+ = \mathcal K \cap \mathcal B \cap L^0_+ = \mathcal K \cap \mathcal B_0 \cap L^0_+ = \{0\},$$


where we have used (4.73) and $\mathcal B \cap L^0_+ = L^0_+$. Thus, (4.75) implies, in particular, the
existence of an equivalent martingale measure, i.e., $\mathcal P \ne \emptyset$. The following proposition
may be viewed as an extension of the "fundamental theorem of asset pricing". Let us
denote by
$$\mathcal R := \Big\{ \sum_{i=1}^n \lambda_i Q_i \;\Big|\; \lambda_i > 0,\ \sum_{i=1}^n \lambda_i = 1 \Big\}$$
the class of all "representative" models for the class $\mathcal Q_0$, i.e., all mixtures such that
each $Q \in \mathcal Q_0$ appears with a positive weight.
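Proposition 4.107. The following two properties are equivalent:
(a) $\mathcal K \cap \mathcal B = \mathcal K \cap \mathcal B_0$.
(b) $\mathcal P \cap \mathcal R \ne \emptyset$.
Proof. (b) $\Rightarrow$ (a): For $V \in \mathcal K \cap \mathcal B$ and $R \in \mathcal R$, we have $E_R[V] \ge 0$. If we can
choose $R \in \mathcal P \cap \mathcal R$ then we get $E_R[V] = 0$, hence $V \in \mathcal B_0$.
(a) $\Rightarrow$ (b): Consider the convex set
$$\mathcal C := \{\, E_R[Y] \mid R \in \mathcal R \,\} \subset \mathbb R^d;$$
we have to show that $\mathcal C$ contains the origin. If this is not the case then there exists
$\xi \in \mathbb R^d$ such that
$$\xi \cdot x \ge 0 \quad \text{for } x \in \mathcal C, \tag{4.76}$$
and
$$\xi \cdot x^* > 0 \quad \text{for some } x^* \in \mathcal C;$$
see Proposition A.1. Define $V := \xi \cdot Y \in \mathcal K$. Condition (4.76) implies
$$E_R[V] \ge 0 \quad \text{for all } R \in \mathcal R,$$
hence $V \in \mathcal K \cap \mathcal B$. Let $R^* \in \mathcal R$ be such that $x^* = E_{R^*}[Y]$. Then $V$ satisfies
$E_{R^*}[V] > 0$, hence $V \notin \mathcal K \cap \mathcal B_0$, in contradiction to our assumption (a).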
We can now state a representation theorem for the coherent risk measure $\bar\rho$ corresponding to the convex cone $\bar{\mathcal A}$. It is a special case of Theorem 4.110 which will be
proved below.
Theorem 4.108. Under assumption (4.75), the coherent risk measure $\bar\rho := \rho_{\bar{\mathcal A}}$ corresponding to the acceptance set $\bar{\mathcal A}$ is given by
$$\bar\rho(X) = \sup_{P^* \in \mathcal P \cap \mathcal R} E^*[-X].$$


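Let us now introduce a second finite set $\mathcal Q_1 \subset \mathcal M_1(P)$ of probability measures
$Q \ll P$ with $|Y| \in L^1(Q)$; as in [45], we call them stress test measures. In addition
to the valuation inequalities in (4.72), we require that an admissible position passes a
stress test specified by a "floor"
$$\gamma(Q) < 0 \quad \text{for each } Q \in \mathcal Q_1.$$
Thus, the convex cone $\mathcal A$ in (4.74) is reduced to the convex set
$$\mathcal A_1 := \mathcal A \cap \mathcal B_1 = L^\infty \cap (\mathcal B \cap \mathcal B_1),$$
where
$$\mathcal B_1 := \{\, X \in L^0 \mid E_Q[X] \ge \gamma(Q) \text{ for } Q \in \mathcal Q_1 \,\}.$$
Let
$$\bar{\mathcal A}_1 := \{\, X \in L^\infty \mid \exists\, \xi \in \mathbb R^d \text{ with } X + \xi \cdot Y \in \mathcal B \cap \mathcal B_1 \,\}$$
denote the resulting acceptance set for positions combined with a suitable hedge.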
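Remark 4.109. The analogue
$$\mathcal K \cap (\mathcal B \cap \mathcal B_1) = \mathcal K \cap \mathcal B_0 \tag{4.77}$$
of our condition (4.75) looks weaker, but it is in fact equivalent to (4.75). Indeed, for
$X \in \mathcal K \cap \mathcal B$ we can find $\varepsilon > 0$ such that $X_1 := \varepsilon X$ satisfies the additional constraints
$$E_Q[X_1] \ge \gamma(Q) \quad \text{for } Q \in \mathcal Q_1.$$
Since $X_1 \in \mathcal K \cap \mathcal B \cap \mathcal B_1$, condition (4.77) implies $X_1 \in \mathcal K \cap \mathcal B_0$, hence $X = \frac{1}{\varepsilon} X_1 \in \mathcal K \cap \mathcal B_0$, since $\mathcal K \cap \mathcal B_0$ is a cone. ♦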
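Let us now identify the convex risk measure $\bar\rho_1$ induced by the convex acceptance
set $\bar{\mathcal A}_1$. Define
$$\mathcal R_1 := \Big\{ \sum_{Q \in \mathcal Q} \lambda(Q) \cdot Q \;\Big|\; \lambda(Q) \ge 0,\ \sum_{Q \in \mathcal Q} \lambda(Q) = 1 \Big\} \supset \mathcal R$$
as the convex hull of $\mathcal Q := \mathcal Q_0 \cup \mathcal Q_1$, and define
$$\gamma(R) := \sum_{Q \in \mathcal Q} \lambda(Q) \gamma(Q)$$
for $R = \sum_{Q \in \mathcal Q} \lambda(Q) Q \in \mathcal R_1$ with $\gamma(Q) := 0$ for $Q \in \mathcal Q_0$.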


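Theorem 4.110. Under assumption (4.75), the convex risk measure $\bar\rho_1$ induced by
the acceptance set $\bar{\mathcal A}_1$ is given by
$$\bar\rho_1(X) = \sup_{P^* \in \mathcal P \cap \mathcal R_1} \big( E^*[-X] + \gamma(P^*) \big), \tag{4.78}$$
i.e., $\bar\rho_1$ is determined by the penalty function
$$\alpha_1(Q) := \begin{cases} +\infty & \text{for } Q \notin \mathcal P \cap \mathcal R_1, \\ -\gamma(Q) & \text{for } Q \in \mathcal P \cap \mathcal R_1. \end{cases}$$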
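Proof. Let $\rho^*$ denote the convex risk measure defined by the right-hand side of (4.78),
and let $\mathcal A^*$ denote the corresponding acceptance set
$$\mathcal A^* := \{\, X \in L^\infty \mid E^*[X] \ge \gamma(P^*) \text{ for all } P^* \in \mathcal P \cap \mathcal R_1 \,\}.$$
It is enough to show $\mathcal A^* = \bar{\mathcal A}_1$.
(a): In order to show $\bar{\mathcal A}_1 \subset \mathcal A^*$, take $X \in \bar{\mathcal A}_1$ and $P^* \in \mathcal P \cap \mathcal R_1$. There exist
$\xi \in \mathbb R^d$ and $A_1 \in \mathcal A_1$ such that $X + \xi \cdot Y \ge A_1$. Thus,
$$E^*[X + \xi \cdot Y] \ge E^*[A_1] \ge \gamma(P^*),$$
due to $P^* \in \mathcal R_1$. Since $E^*[\xi \cdot Y] = 0$ due to $P^* \in \mathcal P$, we obtain $E^*[X] \ge \gamma(P^*)$,
hence $X \in \mathcal A^*$.
(b): In order to show $\mathcal A^* \subset \bar{\mathcal A}_1$, we take $X \in \mathcal A^*$ and assume that $X \notin \bar{\mathcal A}_1$. This
means that the vector $x^* = (x_1^*, \dots, x_N^*)$ with components
$$x_i^* := E_{Q_i}[X] - \gamma(Q_i)$$
does not belong to the convex cone
$$\mathcal C := \{\, (E_{Q_i}[\xi \cdot Y])_{i=1,\dots,N} + y \mid \xi \in \mathbb R^d,\ y \in \mathbb R^N_+ \,\} \subset \mathbb R^N,$$
where $\mathcal Q = \mathcal Q_0 \cup \mathcal Q_1 = \{Q_1, \dots, Q_N\}$ with $N \ge n$. In part (c) of this proof we will
show that $\mathcal C$ is closed. Thus, there exists $\lambda \in \mathbb R^N$ such that
$$\lambda \cdot x^* < \inf_{x \in \mathcal C} \lambda \cdot x; \tag{4.79}$$
see Proposition A.1. Since $\mathcal C \supset \mathbb R^N_+$, we obtain $\lambda_i \ge 0$ for $i = 1, \dots, N$, and we
may assume $\sum_i \lambda_i = 1$ since $\lambda \ne 0$. Define
$$R := \sum_{i=1}^N \lambda_i Q_i \in \mathcal R_1.$$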


Since $\mathcal C$ contains the linear space of vectors $(E_{Q_i}[V])_{i=1,\dots,N}$ with $V \in \mathcal K$, (4.79)
implies
$$E_R[V] = 0 \quad \text{for } V \in \mathcal K,$$
hence $R \in \mathcal P$. Moreover, the right-hand side of (4.79) must be zero, and the condition
$\lambda \cdot x^* < 0$ translates into
$$E_R[X] < \gamma(R),$$
contradicting our assumption $X \in \mathcal A^*$.
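(c): It remains to show that $\mathcal C$ is closed. For $\xi \in \mathbb R^d$ we define $y(\xi)$ as the vector
in $\mathbb R^N$ with coordinates $y_i(\xi) = E_{Q_i}[\xi \cdot Y]$. Any $x \in \mathcal C$ admits a representation
$$x = y(\xi) + z$$
with $z \in \mathbb R^N_+$ and $\xi \in \mathcal N^\perp$, where
$$\mathcal N := \{\, \eta \in \mathbb R^d \mid E_{Q_i}[\eta \cdot Y] = 0 \text{ for } i = 1, \dots, N \,\},$$
and
$$\mathcal N^\perp := \{\, \xi \in \mathbb R^d \mid \xi \cdot \eta = 0 \text{ for all } \eta \in \mathcal N \,\}.$$
Take a sequence
$$x_n = y(\xi_n) + z_n, \qquad n = 1, 2, \dots,$$
with $\xi_n \in \mathcal N^\perp$ and $z_n \in \mathbb R^N_+$, such that $x_n$ converges to $x \in \mathbb R^N$. If $\liminf_n |\xi_n| < \infty$,
then we may assume, passing to a subsequence if necessary, that $\xi_n$ converges to
$\xi \in \mathbb R^d$. In this case, $z_n$ must converge to some $z \in \mathbb R^N_+$, and we have $x = y(\xi) + z \in \mathcal C$. Let us now show that the case $\lim_n |\xi_n| = \infty$ is in fact excluded. In that case,
$\alpha_n := (1 + |\xi_n|)^{-1}$ converges to 0, and the vectors $\zeta_n := \alpha_n \xi_n$ stay bounded. Thus,
we may assume that $\zeta_n$ converges to $\zeta \in \mathcal N^\perp$. This implies
$$y(\zeta) = \lim_{n \uparrow \infty} y(\zeta_n) = -\lim_{n \uparrow \infty} \alpha_n z_n \in -\mathbb R^N_+.$$
Since $\zeta \in \mathcal N^\perp$ and $|\zeta| = \lim_n |\zeta_n| = 1$, we obtain $y(\zeta) \ne 0$. Thus, the inequality
$$E_{Q_i}[(-\zeta) \cdot Y] = -y_i(\zeta) \ge 0$$
holds for all $i$ and is strict for some $i$, in contradiction to our assumption (4.75).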

4.9 Utility-based shortfall risk and divergence risk measures

In this section, we will establish a connection between convex risk measures and the
expected utility theory of Chapter 2.


Suppose that a risk-averse investor assesses the downside risk of a financial position
$X \in \mathcal X$ by taking the expected utility $E[u(-X^-)]$ derived from the shortfall $X^-$, or
by considering the expected utility $E[u(X)]$ of the position itself. If the focus is
on the downside risk, then it is natural to change the sign and to replace $u$ by the
function $\ell(x) := -u(-x)$. Then $\ell$ is a strictly convex and increasing function, and
the maximization of expected utility is equivalent to minimizing the expected loss
$E[\ell(-X)]$ or the shortfall risk $E[\ell(X^-)]$. In order to unify the discussion of both
cases, we do not insist on strict convexity. In particular, $\ell$ may vanish on $(-\infty, 0]$,
and in this case the shortfall risk takes the form
$$E[\ell(X^-)] = E[\ell(-X)].$$
Definition 4.111. A function $\ell : \mathbb R \to \mathbb R$ is called a loss function if it is increasing
and not identically constant.
Let us return to the setting where we consider monetary risk measures defined on
the class $\mathcal X$ of all bounded measurable functions on some given measurable space
$(\Omega, \mathcal F)$. Let us fix a probability measure $P$ on $(\Omega, \mathcal F)$. For a given loss function $\ell$
and an interior point $x_0$ in the range of $\ell$, we define the following acceptance set:
$$\mathcal A := \{\, X \in \mathcal X \mid E[\ell(-X)] \le x_0 \,\}. \tag{4.80}$$
Alternatively, we can write
$$\mathcal A = \{\, X \in \mathcal X \mid E[u(X)] \ge y_0 \,\}, \tag{4.81}$$
where $u(x) = -\ell(-x)$ and $y_0 = -x_0$. The acceptance set $\mathcal A$ satisfies (4.3) and (4.4).
By part (a) of Proposition 4.7 it induces a monetary risk measure $\rho$ given by
$$\rho(X) = \inf\{\, m \in \mathbb R \mid E[\ell(-X - m)] \le x_0 \,\} = \inf\{\, m \in \mathbb R \mid E[u(X + m)] \ge y_0 \,\}. \tag{4.82}$$
This risk measure satisfies (4.31) and hence can be regarded as a monetary risk measure on $L^\infty$. When $\ell$ is convex or $u$ concave, $\rho$ is a convex risk measure. It is
normalized when $x_0 = \ell(0)$.
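The defining infimum in (4.82) can be evaluated numerically, since $m \mapsto E[\ell(-X - m)]$ is decreasing. The following is a minimal sketch on a finite probability space, assuming a continuous increasing loss function; the function name `shortfall_risk`, the bracket `[-50, 50]` and the sample data are illustrative choices, not part of the text. For the exponential loss $\ell(x) = e^{\beta x}$ the infimum can be solved by hand, which gives a check on the bisection.

```python
import math

def shortfall_risk(loss, x0, outcomes, probs, lo=-50.0, hi=50.0, tol=1e-10):
    """Approximate inf{m : E[loss(-X - m)] <= x0} by bisection.

    Assumes `loss` is increasing and continuous and that the answer lies in
    the (hypothetical) bracket [lo, hi].
    """
    def expected_loss(m):
        return sum(p * loss(-x - m) for x, p in zip(outcomes, probs))

    # m -> expected_loss(m) is decreasing, so the acceptable m form a half-line.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_loss(mid) <= x0:
            hi = mid   # mid is acceptable: the infimum is at or below mid
        else:
            lo = mid   # mid is not acceptable: more capital is needed
    return hi

# Illustrative data: a position X with four outcomes.
outcomes = [-2.0, 0.0, 1.0, 3.0]
probs = [0.1, 0.3, 0.4, 0.2]
beta, x0 = 0.5, 1.0

rho_numeric = shortfall_risk(lambda x: math.exp(beta * x), x0, outcomes, probs)

# For the exponential loss, e^{-beta m} E[e^{-beta X}] <= x0 can be solved for m.
log_mgf = math.log(sum(p * math.exp(-beta * x) for x, p in zip(outcomes, probs)))
rho_closed = (log_mgf - math.log(x0)) / beta
```

Shifting every outcome by a constant $c$ lowers the computed value by $c$, which is the cash invariance one expects from a monetary risk measure.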
Exercise 4.9.1. Let $\ell$ be a strictly increasing continuous loss function. Suppose that
the risk measure $\rho$ associated to $\ell$ via (4.82) satisfies
$$\rho(X) \ge \rho(Y) \iff E[\ell(-X)] \ge E[\ell(-Y)]$$
for any $X, Y \in L^\infty$. Show that $\ell$ is either linear or exponential.
Hint: Apply Proposition 2.46.
For the rest of this section, we will only consider convex loss functions.


Definition 4.112. The convex risk measure $\rho$ in (4.82) is called utility-based shortfall
risk measure.
Proposition 4.113. The utility-based shortfall risk measure $\rho$ is continuous from below. Moreover, the minimal penalty function $\alpha^{\min}$ for $\rho$ is concentrated on $\mathcal M_1(P)$,
and $\rho$ can be represented in the form
$$\rho(X) = \max_{Q \in \mathcal M_1(P)} \big( E_Q[-X] - \alpha^{\min}(Q) \big). \tag{4.83}$$
Proof. We have to show that $\rho$ is continuous from below. Note first that $z = \rho(X)$ is
the unique solution to the equation
$$E[\ell(-z - X)] = x_0. \tag{4.84}$$
Indeed, that $z = \rho(X)$ solves (4.84) follows by dominated convergence, since the
finite convex function $\ell$ is continuous. The solution is unique, since $\ell$ is strictly increasing on $(\ell^{-1}(x_0) - \varepsilon, \infty)$ for some $\varepsilon > 0$.
Suppose now that $(X_n)$ is a sequence in $\mathcal X$ which increases pointwise to some
$X \in \mathcal X$. Then $\rho(X_n)$ decreases to some finite limit $R$. Using the continuity of $\ell$ and
dominated convergence, it follows that
$$E[\ell(-\rho(X_n) - X_n)] \to E[\ell(-R - X)].$$
But each of the approximating expectations equals $x_0$, and so $R$ is a solution to (4.84).
Hence $R = \rho(X)$, and this proves continuity from below. Since $\rho$ satisfies (4.31), the
representation (4.83) follows from Theorem 4.22 and Lemma 4.32.
Let us now compute the minimal penalty function $\alpha^{\min}$.
Example 4.114. For an exponential loss function $\ell(x) = e^{\beta x}$, the minimal penalty
function can be described in terms of relative entropy, and the resulting risk measure
coincides, up to an additive constant, with the entropic risk measure introduced in
Example 4.34. In fact,
$$\rho(X) = \inf\{\, m \in \mathbb R \mid E[e^{-\beta(m + X)}] \le x_0 \,\} = \frac{1}{\beta} \big( \log E[e^{-\beta X}] - \log x_0 \big).$$
In this special case, the general formula (4.18) for $\alpha^{\min}$ reduces to the variational
formula for the relative entropy $H(Q|P)$ of $Q$ with respect to $P$:
$$\alpha^{\min}(Q) = \sup_{X \in \mathcal X} \Big( E_Q[-X] - \frac{1}{\beta} \log E[e^{-\beta X}] \Big) + \frac{\log x_0}{\beta} = \frac{1}{\beta} \big( H(Q|P) + \log x_0 \big);$$


see Lemma 3.29. Thus, the representation (4.83) of $\rho$ is equivalent to the following
dual variational identity:
$$\log E[e^X] = \max_{Q \in \mathcal M_1(P)} \big( E_Q[X] - H(Q|P) \big).$$
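The dual variational identity can be checked directly on a finite probability space. The sketch below assumes three states with illustrative weights and payoffs (all names and numbers are hypothetical); it uses the standard fact that the maximum is attained at the measure with density $e^X / E[e^X]$.

```python
import math

def relative_entropy(q, p):
    """H(Q|P) = sum_i q_i log(q_i / p_i), with 0 log 0 = 0; assumes Q << P."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def dual_objective(q, p, x):
    """E_Q[X] - H(Q|P) for measures given as weight lists."""
    return sum(qi * xi for qi, xi in zip(q, x)) - relative_entropy(q, p)

p = [0.2, 0.5, 0.3]     # reference measure P (illustrative)
x = [1.0, -0.5, 2.0]    # values of X on the three states (illustrative)

# Left-hand side of the identity: log E[e^X].
log_mgf = math.log(sum(pi * math.exp(xi) for pi, xi in zip(p, x)))

# Candidate maximizer: dQ*/dP = e^X / E[e^X].
q_star = [pi * math.exp(xi - log_mgf) for pi, xi in zip(p, x)]

# Any other measure should give a value no larger than log E[e^X].
others = [p, [1 / 3, 1 / 3, 1 / 3], [0.05, 0.05, 0.9]]
```

Evaluating `dual_objective(q_star, p, x)` reproduces `log_mgf` up to rounding, while each measure in `others` gives a smaller or equal value.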

In general, the minimal penalty function $\alpha^{\min}$ on $\mathcal M_1(P)$ can be expressed in terms
of the Fenchel–Legendre transform or conjugate function $\ell^*$ of the convex function $\ell$
defined by
$$\ell^*(z) := \sup_{x \in \mathbb R} \big( zx - \ell(x) \big).$$
x2R

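Theorem 4.115. For any convex loss function $\ell$, the minimal penalty function in the
representation (4.83) is given by
$$\alpha^{\min}(Q) = \inf_{\lambda > 0} \frac{1}{\lambda} \Big( x_0 + E\Big[ \ell^*\Big( \lambda \frac{dQ}{dP} \Big) \Big] \Big), \qquad Q \in \mathcal M_1(P). \tag{4.85}$$
In particular,
$$\rho(X) = \max_{Q \in \mathcal M_1(P)} \Big( E_Q[-X] - \inf_{\lambda > 0} \frac{1}{\lambda} \Big( x_0 + E\Big[ \ell^*\Big( \lambda \frac{dQ}{dP} \Big) \Big] \Big) \Big), \qquad X \in L^\infty.$$
To prepare the proof of Theorem 4.115, we summarize some properties of the functions $\ell$ and $\ell^*$ as stated in Appendix A.1. First note that $\ell^*$ is a proper convex function, i.e., it is convex and takes some finite value. We denote by $J := (\ell^*)'_+$ its
right-continuous derivative. Then, for $x, z \in \mathbb R$,
$$xz \le \ell(x) + \ell^*(z) \quad \text{with equality if } x = J(z). \tag{4.86}$$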

Lemma 4.116. Let $(\ell_n)$ be a sequence of convex loss functions which decreases
pointwise to the convex loss function $\ell$. Then the corresponding conjugate functions
$\ell_n^*$ increase pointwise to $\ell^*$.
Proof. It follows immediately from the definition of the Fenchel–Legendre transform
that each $\ell_n^*$ is dominated by $\ell^*$, and that $\ell_n^*(z)$ increases to some limit $\ell_\infty^*(z)$. We
have to prove that $\ell_\infty^* = \ell^*$.
The function $z \mapsto \ell_\infty^*(z)$ is a lower semicontinuous convex function as the increasing limit of such functions. Moreover, $\ell_\infty^*$ is a proper convex function, since it is
dominated by the proper convex function $\ell^*$. Consider the conjugate function $\ell_\infty^{**}$ of
$\ell_\infty^*$. Clearly, $\ell_\infty^{**} \ge \ell$, since $\ell_\infty^* \le \ell^*$ and since $\ell^{**} = \ell$ by Proposition A.6. On
the other hand, we have by a similar argument that $\ell_\infty^{**} \le \ell_n$ for each $n$. By taking
$n \uparrow \infty$, this shows $\ell_\infty^{**} = \ell$, which in turn gives $\ell_\infty^* = \ell^*$.


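Lemma 4.117. The functions $\ell$ and $\ell^*$ have the following properties:
(a) $\ell^*(0) = -\inf_{x \in \mathbb R} \ell(x)$ and $\ell^*(z) \ge -\ell(0)$ for all $z$.
(b) There exists some $z_1 \in [0, \infty)$ such that
$$\ell^*(z) = \sup_{x \ge 0} \big( xz - \ell(x) \big) \quad \text{for } z \ge z_1.$$
In particular, $\ell^*$ is increasing on $[z_1, \infty)$.
(c) $\dfrac{\ell^*(z)}{z} \to \infty$ as $z \uparrow \infty$.
Proof. Part (a) is obvious.
(b): Let $N := \{\, z \in \mathbb R \mid \ell^*(z) = -\ell(0) \,\}$. We show in a first step that $N \ne \emptyset$.
Note that convexity of $\ell$ implies that the set $S$ of all $z$ with $zx \le \ell(x) - \ell(0)$ for all
$x \in \mathbb R$ is non-empty. For $z \in S$ we clearly have $\ell^*(z) \le -\ell(0)$. On the other hand,
$\ell^*(z) \ge -\ell(0)$ by (a).
Now we take $z_1 := \sup N$. It is clear that $z_1 \ge 0$. If $z > z_1$ and $x < 0$, then
$$xz - \ell(x) \le xz_1 - \ell(x) \le \ell^*(z_1) \le -\ell(0),$$
where the last inequality follows from the lower semicontinuity of $\ell^*$. But $\ell^*(z) > -\ell(0)$, hence
$$\sup_{x < 0} \big( xz - \ell(x) \big) < \ell^*(z).$$
(c): For $z \ge z_1$,
$$\ell^*(z)/z = \sup_{x \ge 0} \big( x - \ell(x)/z \big)$$
by (b). Hence
$$\frac{\ell^*(z)}{z} \ge x_z - 1,$$
where $x_z := \sup\{\, x \mid \ell(x) \le z \,\}$. Since $\ell$ is convex, increasing, and takes only finite
values, we have $x_z \to \infty$ as $z \uparrow \infty$.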
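Proof of Theorem 4.115. Fix $Q \in \mathcal M_1(P)$, and denote by $\varphi := dQ/dP$ its density.
First, we show that it suffices to prove the claim for $x_0 > \ell(0)$. Otherwise we can find
some $a \in \mathbb R$ such that $\ell(-a) < x_0$, since $x_0$ was assumed to be an interior point of
$\ell(\mathbb R)$. Let $\tilde\ell(x) := \ell(x - a)$, and
$$\tilde{\mathcal A} := \{\, \tilde X \in \mathcal X \mid E[\tilde\ell(-\tilde X)] \le x_0 \,\}.$$
Then $\tilde{\mathcal A} = \{\, X - a \mid X \in \mathcal A \,\}$, and hence
$$\sup_{\tilde X \in \tilde{\mathcal A}} E_Q[-\tilde X] = \sup_{X \in \mathcal A} E_Q[-X] + a. \tag{4.87}$$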


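The convex loss function $\tilde\ell$ satisfies the requirement $\tilde\ell(0) < x_0$. So if the assertion is
established in this case, we find that
$$\sup_{\tilde X \in \tilde{\mathcal A}} E_Q[-\tilde X] = \inf_{\lambda > 0} \frac{1}{\lambda} \big( x_0 + E[\tilde\ell^*(\lambda\varphi)] \big) = \inf_{\lambda > 0} \frac{1}{\lambda} \big( x_0 + E[\ell^*(\lambda\varphi)] \big) + a;$$
here we have used the fact that the Fenchel–Legendre transform $\tilde\ell^*$ of $\tilde\ell$ satisfies
$\tilde\ell^*(z) = \ell^*(z) + az$. Together with (4.87), this proves that the reduction to the
case $\ell(0) < x_0$ is indeed justified.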
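For any $\lambda > 0$ and $X \in \mathcal A$, (4.86) implies
$$-X\varphi = \frac{1}{\lambda} (-X)(\lambda\varphi) \le \frac{1}{\lambda} \big( \ell(-X) + \ell^*(\lambda\varphi) \big).$$
Hence, for any $\lambda > 0$,
$$\alpha^{\min}(Q) \le \sup_{X \in \mathcal A} \frac{1}{\lambda} \big( E[\ell(-X)] + E[\ell^*(\lambda\varphi)] \big) \le \frac{1}{\lambda} \big( x_0 + E[\ell^*(\lambda\varphi)] \big).$$
Thus, it remains to prove that
$$\alpha^{\min}(Q) \ge \inf_{\lambda > 0} \frac{1}{\lambda} \big( x_0 + E[\ell^*(\lambda\varphi)] \big) \tag{4.88}$$
in case where $\alpha^{\min}(Q) < \infty$. This will be done first under the following extra conditions:
There exists $\kappa \in \mathbb R$ such that $\ell(x) = \inf \ell$ for all $x \le \kappa$. (4.89)
$\ell^*$ is finite on $(0, \infty)$. (4.90)
$J$ is continuous on $(0, \infty)$. (4.91)
Note that these assumptions imply that $\ell^*(0) < \infty$ and that $J(0+) \ge \kappa$. Moreover,
$J(z)$ increases to $+\infty$ as $z \uparrow \infty$, and hence so does $\ell(J(z))$. Since
$$\ell^*(z) \ge -\ell(0) > -x_0 \quad \text{for all } z, \tag{4.92}$$
it follows from (4.86) that
$$\lim_{z \downarrow 0} \ell(J(z)) - x_0 < \lim_{z \downarrow 0} \big( \ell(J(z)) + \ell^*(z) \big) = \lim_{z \downarrow 0} zJ(z) = 0.$$
These facts and the continuity of $J$ imply that for large enough $n$ there exists some
$\lambda_n > 0$ such that
$$E[\ell(J(\lambda_n\varphi) I_{\{\varphi \le n\}})] = x_0.$$
Let us define
$$X^n := -J(\lambda_n\varphi) I_{\{\varphi \le n\}}.$$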


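Then $X^n$ is bounded and belongs to $\mathcal A$. Hence, it follows from (4.86) and (4.92) that
$$\begin{aligned}
\alpha^{\min}(Q) \ge E_Q[-X^n] &= \frac{1}{\lambda_n} E\big[ I_{\{\varphi \le n\}} J(\lambda_n\varphi)(\lambda_n\varphi) \big] \\
&= \frac{1}{\lambda_n} E\big[ \big( \ell(-X^n) + \ell^*(\lambda_n\varphi) \big) \cdot I_{\{\varphi \le n\}} \big] \\
&= \frac{1}{\lambda_n} \big( x_0 - \ell(0) \cdot P[\varphi > n] + E[\ell^*(\lambda_n\varphi) I_{\{\varphi \le n\}}] \big) \\
&\ge \frac{x_0 - \ell(0)}{\lambda_n}.
\end{aligned}$$
Since we assumed that $\alpha^{\min}(Q) < \infty$, the decreasing limit $\lambda_\infty$ of $\lambda_n$ must be strictly
positive. The fact that $\ell^*$ is bounded from below allows us to apply Fatou's lemma:
$$\begin{aligned}
\alpha^{\min}(Q) &\ge \liminf_{n \uparrow \infty} \frac{1}{\lambda_n} \big( x_0 - \ell(0) \cdot P[\varphi > n] + E[\ell^*(\lambda_n\varphi) I_{\{\varphi \le n\}}] \big) \\
&\ge \frac{1}{\lambda_\infty} \big( x_0 + E[\ell^*(\lambda_\infty\varphi)] \big).
\end{aligned}$$
This proves (4.88) under the assumptions (4.89), (4.90), and (4.91).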
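If (4.89) and (4.90) hold, but $J$ is not continuous, then we can approximate the
upper semicontinuous function $J$ from above with an increasing continuous function
$\tilde J$ on $[0, \infty)$ such that
$$\tilde\ell^*(z) := \ell^*(0) + \int_0^z \tilde J(y)\, dy$$
satisfies
$$\ell^*(z) \le \tilde\ell^*(z) \le \ell^*((1 + \varepsilon)z) \quad \text{for } z \ge 0.$$
Let $\tilde\ell := \tilde\ell^{**}$ denote the Fenchel–Legendre transform of $\tilde\ell^*$. Since $\ell^{**} = \ell$ by
Proposition A.6, it follows that
$$\ell\Big( \frac{x}{1 + \varepsilon} \Big) \le \tilde\ell(x) \le \ell(x).$$
Therefore,
$$\tilde{\mathcal A} := \{\, X \in \mathcal X \mid E[\tilde\ell(-X)] \le x_0 \,\} \subset \{\, (1 + \varepsilon)X \mid X \in \mathcal A \,\} =: \mathcal A_\varepsilon.$$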


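Since we already know that the assertion holds for $\tilde\ell$, we get that
$$\begin{aligned}
\inf_{\lambda > 0} \frac{1}{\lambda} \Big( x_0 + E\Big[ \ell^*\Big( \lambda \frac{dQ}{dP} \Big) \Big] \Big) &\le \inf_{\lambda > 0} \frac{1}{\lambda} \Big( x_0 + E\Big[ \tilde\ell^*\Big( \lambda \frac{dQ}{dP} \Big) \Big] \Big) \\
&= \sup_{X \in \tilde{\mathcal A}} E_Q[-X] \\
&\le \sup_{X \in \mathcal A_\varepsilon} E_Q[-X] \\
&= (1 + \varepsilon)\, \alpha^{\min}(Q).
\end{aligned}$$
By letting $\varepsilon \downarrow 0$, we obtain (4.88).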
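Finally, we remove conditions (4.89) and (4.90). If $\ell^*(z) = +\infty$ for some $z$,
then $z$ must be an upper bound for the slope of $\ell$. So we will approximate $\ell$ by a
sequence $(\ell_n)$ of convex loss functions whose slope is unbounded. Simultaneously,
we can handle the case where $\ell$ does not take on its infimum. To this end, we choose
a sequence $\kappa_n \downarrow \inf \ell$ such that $\kappa_n \le \ell(0) < x_0$. We can define, for instance,
$$\ell_n(x) := \ell(x) \vee \kappa_n + \frac{1}{n} (e^x - 1)^+.$$
Then $\ell_n$ decreases pointwise to $\ell$. Each loss function $\ell_n$ satisfies (4.89) and (4.90).
Hence, for any $\varepsilon > 0$ there are $\lambda_n^\varepsilon$ such that
$$\infty > \alpha^{\min}(Q) \ge \alpha_n^{\min}(Q) \ge \frac{1}{\lambda_n^\varepsilon} \big( x_0 + E[\ell_n^*(\lambda_n^\varepsilon \varphi)] \big) - \varepsilon \quad \text{for each } n,$$
where $\alpha_n^{\min}(Q)$ is the penalty function arising from $\ell_n$. Note that $\ell_n^* \nearrow \ell^*$ by
Lemma 4.116. Our assumption $\alpha^{\min}(Q) < \infty$, the fact that
$$\inf_{z \in \mathbb R} \ell_n^*(z) \ge -\ell_n(0) = -\ell(0) > -x_0,$$
and part (c) of Lemma 4.117 show that the sequence $(\lambda_n^\varepsilon)_{n \in \mathbb N}$ must be bounded away
from zero and from infinity. Therefore, we may assume that $\lambda_n^\varepsilon$ converges to some
$\lambda^\varepsilon \in (0, \infty)$. Using again the fact that $\ell_n^*(z) \ge -\ell(0)$ uniformly in $n$ and $z$, Fatou's
lemma yields
$$\alpha^{\min}(Q) + \varepsilon \ge \liminf_{n \uparrow \infty} \frac{1}{\lambda_n^\varepsilon} \big( x_0 + E[\ell_n^*(\lambda_n^\varepsilon \varphi)] \big) \ge \frac{1}{\lambda^\varepsilon} \big( x_0 + E[\ell^*(\lambda^\varepsilon \varphi)] \big).$$
This completes the proof of the theorem.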


Example 4.118. Take
$$\ell(x) := \begin{cases} \frac{1}{p} x^p & \text{if } x \ge 0, \\ 0 & \text{otherwise,} \end{cases}$$


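where $p > 1$. Then
$$\ell^*(z) := \begin{cases} \frac{1}{q} z^q & \text{if } z \ge 0, \\ +\infty & \text{otherwise,} \end{cases}$$
where $q = p/(p - 1)$ is the usual dual coefficient. We may apply Theorem 4.115 for
any $x_0 > 0$. Let $Q \in \mathcal M_1(P)$ with density $\varphi := dQ/dP$. Clearly, $\alpha^{\min}(Q) = +\infty$
if $\varphi \notin L^q(\Omega, \mathcal F, P)$. Otherwise, the infimum in (4.85) is attained for
$$\lambda_Q = \Big( \frac{p x_0}{E[\varphi^q]} \Big)^{1/q}.$$
Hence, we can identify $\alpha^{\min}(Q)$ for any $Q \ll P$ as
$$\alpha_p^{\min}(Q) = (p x_0)^{1/p} \cdot E\Big[ \Big( \frac{dQ}{dP} \Big)^q \Big]^{1/q}.$$
Taking the limit $p \downarrow 1$, we obtain the case $\ell(x) = x^+$ where we measure the risk in
terms of the expected shortfall. Here we have
$$\alpha_1^{\min}(Q) = x_0 \cdot \Big\| \frac{dQ}{dP} \Big\|_\infty.$$
♦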
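The closed form for the power loss can be compared with a brute-force evaluation of the infimum in (4.85). The following sketch assumes a three-point probability space and a density $\varphi$ with $E[\varphi] = 1$; the grid search, the function names and the numbers are illustrative choices, not part of the text.

```python
def conjugate_power(z, q):
    """l*(z) = z^q / q for z >= 0, the conjugate of l(x) = x^p / p on x >= 0."""
    return z ** q / q

def penalty_numeric(phi, probs, p, x0, grid=20000, lam_max=10.0):
    """Minimise (x0 + E[l*(lam * phi)]) / lam over a grid of lam in (0, lam_max]."""
    q = p / (p - 1.0)
    best = float("inf")
    for k in range(1, grid + 1):
        lam = lam_max * k / grid
        expectation = sum(w * conjugate_power(lam * f, q) for f, w in zip(phi, probs))
        best = min(best, (x0 + expectation) / lam)
    return best

def penalty_closed_form(phi, probs, p, x0):
    """(p x0)^(1/p) * E[phi^q]^(1/q), as stated in the example."""
    q = p / (p - 1.0)
    moment = sum(w * f ** q for f, w in zip(phi, probs))   # E[phi^q]
    return (p * x0) ** (1.0 / p) * moment ** (1.0 / q)

probs = [0.25, 0.5, 0.25]   # illustrative reference measure P
phi = [0.4, 1.0, 1.6]       # illustrative density dQ/dP, E[phi] = 1
p, x0 = 2.0, 1.0

numeric = penalty_numeric(phi, probs, p, x0)
closed = penalty_closed_form(phi, probs, p, x0)
```

Because the grid minimum can only overshoot the true infimum, `numeric` sits slightly above `closed`, with a gap that shrinks as the grid is refined.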
Together with Proposition 4.20, Theorem 4.115 yields the following result for risk
measures which are defined in terms of a robust notion of bounded shortfall risk. Here
it is convenient to define $\ell^*(\infty) := \infty$.
Corollary 4.119. Suppose that $\mathcal Q$ is a family of probability measures on $(\Omega, \mathcal F)$, and
that $\ell$, $\ell^*$, and $x_0$ are as in Theorem 4.115. We define a set of acceptable positions by
$$\mathcal A := \{\, X \in \mathcal X \mid E_P[\ell(-X)] \le x_0 \text{ for all } P \in \mathcal Q \,\}.$$
Then the corresponding convex risk measure can be represented in terms of the penalty
function
$$\alpha(Q) = \inf_{\lambda > 0} \frac{1}{\lambda} \Big( x_0 + \inf_{P \in \mathcal Q} E_P\Big[ \ell^*\Big( \lambda \frac{dQ}{dP} \Big) \Big] \Big), \qquad Q \in \mathcal M_1(\Omega, \mathcal F),$$
where $dQ/dP$ is the density appearing in the Lebesgue decomposition of $Q$ with
respect to $P$ as in Theorem A.13.
Example 4.120. In the case of Example 4.114, the corresponding robust problem in
Corollary 4.119 leads to the following entropy minimization problem: For a given $Q$
and a set $\mathcal Q$ of probability measures, find
$$\inf_{P \in \mathcal Q} H(Q|P).$$
Note that this problem is different from the standard problem of minimizing $H(Q|P)$
with respect to the first variable $Q$ as it appears in Section 3.2.
♦


Example 4.121. Take $x_0 = 0$ in (4.80) and $\ell(x) := x$. Then
$$\ell^*(z) := \begin{cases} 0 & \text{if } z = 1, \\ +\infty & \text{otherwise.} \end{cases}$$
Therefore, $\alpha(Q) = \infty$ if $Q \ne P$, and $\rho(X) = E[-X]$. If $\mathcal Q$ is a set of probability
measures, the "robust" risk measure $\rho$ of Corollary 4.119 is coherent, and it is given
by
$$\rho(X) = \sup_{P \in \mathcal Q} E_P[-X].$$
♦

Exercise 4.9.2. Consider a situation in which model uncertainty is described by a
parametric family $P_\theta$ for $\theta \in \Theta$. In each model $P_\theta$, the expected utility of a random
variable $X \in \mathcal X$ is $E_\theta[u(X)]$, where $u : \mathbb R \to \mathbb R$ is a given utility function. In
a Bayesian approach, one would choose a prior distribution $\mu$ on $\Theta$. In terms of
F. Knight's distinction between risk and uncertainty, we would now be in a situation
of model risk. Risk neutrality with respect to this model risk would be described by
the utility functional
$$U(X) = \int E_\theta[u(X)]\, \mu(d\theta);$$
here we assume that $\theta \mapsto E_\theta[u(X)]$ is sufficiently measurable. In order to capture
model risk aversion, we could choose another utility function $\hat u : \mathbb R \to \mathbb R$ and consider
the utility functional $\hat U(X)$ defined by
$$\hat u(\hat U(X)) = \int \hat u\big( E_\theta[u(X)] \big)\, \mu(d\theta).$$
Show that $\hat U$ is quasi-concave, i.e.,
$$\hat U(\lambda X + (1 - \lambda)Y) \ge \hat U(X) \wedge \hat U(Y) \quad \text{for } X, Y \in L^\infty \text{ and } 0 \le \lambda \le 1.$$
Show next that for $\hat u(x) = 1 - e^{-\beta x}$, the utility functional $\hat U(X)$ takes the form
$\hat U(X) = -\rho(u(X))$ for a convex risk measure $\rho$. Then compute the minimal penalty
function in the robust representation of $\rho$ (here you may assume that $\Theta$ is a finite set).
Discuss the limit $\beta \uparrow \infty$.
♦
We now explore the relations between shortfall risk and the divergence risk measures introduced in Example 4.36. To this end, let $g : [0, \infty[ \to \mathbb R \cup \{+\infty\}$ be a lower
semicontinuous convex function satisfying $g(1) < \infty$ and the superlinear growth
condition $g(x)/x \to +\infty$ as $x \uparrow \infty$. Recall the definition of the $g$-divergence,
$$I_g(Q|P) := E\Big[ g\Big( \frac{dQ}{dP} \Big) \Big], \qquad Q \in \mathcal M_1(P), \tag{4.93}$$


and of the corresponding divergence risk measure
$$\rho_g(X) := \sup_{Q \ll P} \big( E_Q[-X] - I_g(Q|P) \big), \qquad X \in L^\infty. \tag{4.94}$$
We have seen in Exercise 4.3.3 that $\rho_g$ is continuous from below and that $I_g(\,\cdot\,|P)$
is its minimal penalty function, so the supremum in (4.94) is actually a maximum.
The following representation for $\rho_g$ extends the corresponding result for AV@R in
Lemma 4.51, where $g = \infty \cdot I_{(1/\lambda, \infty)}$.
Theorem 4.122. Let $g^*(y) = \sup_{x > 0} (xy - g(x))$ be the Fenchel–Legendre transform of $g$. Then
$$\rho_g(X) = \inf_{z \in \mathbb R} \big( E[g^*(z - X)] - z \big), \qquad X \in L^\infty. \tag{4.95}$$
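For the AV@R case $g = \infty \cdot I_{(1/\lambda, \infty)}$ the conjugate is $g^*(y) = y^+/\lambda$, so (4.95) can be evaluated by hand on a finite space. The sketch below assumes $n$ equally likely outcomes and a level $\lambda = k/n$, in which case the dual side (4.94) reduces to the negative average of the $k$ worst outcomes. Restricting the infimum over $z$ to the outcome values is a simplification that relies on the objective being piecewise linear and convex in $z$. All names and data are illustrative.

```python
def g_conjugate(y, lam):
    """g*(y) = sup_{0 < x <= 1/lam} x*y = max(y, 0) / lam for g = infinity on (1/lam, inf)."""
    return max(y, 0.0) / lam

def divergence_risk_via_inf(xs, lam):
    """inf_z (E[g*(z - X)] - z) for equally likely outcomes xs.

    The objective is convex and piecewise linear with kinks at the outcome
    values, so it is enough to search over z in xs (assumes 0 < lam < 1).
    """
    n = len(xs)

    def objective(z):
        return sum(g_conjugate(z - x, lam) for x in xs) / n - z

    return min(objective(z) for z in xs)

def avar_worst_average(xs, lam):
    """Dual side: the density n/k on the k = lam*n worst outcomes maximises E_Q[-X]."""
    k = round(lam * len(xs))
    worst = sorted(xs)[:k]
    return -sum(worst) / k

xs = [3.0, -1.0, 0.5, -4.0, 2.0, 1.0, -0.5, 0.0, 2.5, -2.0]   # illustrative outcomes
lam = 0.3

primal = divergence_risk_via_inf(xs, lam)
dual = avar_worst_average(xs, lam)
```

Both routes give the same number for this data, namely the negative mean of the three smallest outcomes.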

The proof of Theorem 4.122 is based on Theorem 4.115. In fact, we will see that,
in some sense, the two representations (4.85) and (4.95) are dual with respect to each
other. We prepare the proof with the following exercises. The first one concerns a
nice and sometimes very useful property of convex functions.
Exercise 4.9.3. If $h$ is a convex function on $[0, \infty)$, then
$$(x, y) \mapsto x\, h\Big( \frac{y}{x} \Big)$$
is a convex function of $(x, y) \in (0, \infty) \times [0, \infty)$.

Exercise 4.9.4. Let $g : [0, \infty[ \to \mathbb R \cup \{+\infty\}$ be a lower semicontinuous convex
function satisfying $g(1) < \infty$ and the superlinear growth condition $g(x)/x \to +\infty$
as $x \uparrow \infty$. For $\lambda > 0$ let $g_\lambda(x) := \lambda g(x/\lambda)$. Then $(\lambda, x) \mapsto g_\lambda(x)$ is convex by
Exercise 4.9.3. Let $\alpha_\lambda(Q) = I_{g_\lambda}(Q|P)$ be the corresponding $g_\lambda$-divergence. Show
that $(\lambda, Q) \mapsto \alpha_\lambda(Q)$ is a convex functional and that
$$h(\lambda) := \begin{cases} -\rho_{g_\lambda}(X) = \min_{Q \in \mathcal M_1(P)} \big( E_Q[X] + \alpha_\lambda(Q) \big) & \text{if } \lambda > 0, \\ +\infty & \text{otherwise,} \end{cases}$$
is a lower semicontinuous convex function in $\lambda$ if $X \in L^\infty$ is fixed.

Proof of Theorem 4.122. Let $g_\lambda$ and $h$ be as in Exercise 4.9.4. Our aim is to compute
$h(1)$. The idea is to use Theorem 4.115 so as to identify the Fenchel–Legendre transform $h^*$ of $h$. To this end, we first observe that $\ell := g^*$ satisfies the assumptions of
Theorem 4.115. Next, $\ell^* = g^{**} = g$ by Proposition A.6. Hence, Theorem 4.115


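yields that
$$\begin{aligned}
f(x) &:= \inf\{\, m \in \mathbb R \mid E[g^*(-m - X)] \le x \,\} \\
&= \max_{Q \in \mathcal M_1(P)} \Big( E_Q[-X] - \inf_{\lambda > 0} \lambda \Big( x + E\Big[ g\Big( \frac{1}{\lambda} \frac{dQ}{dP} \Big) \Big] \Big) \Big) \\
&= -\inf_{\lambda > 0}\ \min_{Q \in \mathcal M_1(P)} \big( E_Q[X] + \lambda x + \alpha_\lambda(Q) \big) \\
&= -\inf_{\lambda > 0} \big( \lambda x + h(\lambda) \big) = h^*(-x),
\end{aligned}$$
for all $x$ in the interior of $g^*(\mathbb R)$, which coincides with the interior of $\operatorname{dom} f$. Exercise 4.9.4 hence yields $h(1) = h^{**}(1) = \sup_x (-x - f(x))$. We have seen in the
proof of Proposition 4.113 that $x = E[g^*(-f(x) - X)]$ whenever $x$ belongs
to the interior of $g^*(\mathbb R)$. Hence,
$$h(1) = \sup_{x \in \mathbb R} \big( -E[g^*(-f(x) - X)] - f(x) \big),$$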

and the assertion follows by noting that the range of f contains all points to the left
of kX  k1  x0 , where x0 is the lower bound for all points in which the right-hand
derivative of g  is strictly positive.

Part II

Dynamic hedging

Chapter 5

Dynamic arbitrage theory

In this chapter we develop a dynamic version of the arbitrage theory of Chapter 1.


Here we will work in a multiperiod setting, where the stochastic price fluctuation of
a financial asset is described as a stochastic process in discrete time. Portfolios will
be successively readjusted, taking into account the information available at each time.
In its weakest form, market efficiency requires that such dynamic trading strategies
should not create arbitrage opportunities. In Section 5.2 we show that an arbitrage-free model is characterized by the existence of an equivalent martingale measure.
Under such a measure, the discounted price processes of the traded assets are martingales, that is, they have the mathematical structure of a fair game. In Section 5.3 we
introduce European contingent claims. These are financial instruments whose payoff
at the expiration date depends on the behavior of the underlying primary assets, and
possibly on other factors. We discuss the problem of pricing such contingent claims
in a manner which does not create new arbitrage opportunities. The pricing problem
is closely related to the problem of hedging a given claim by using a dynamic trading
strategy based on the primary assets. An ideal situation occurs if any contingent claim
can be perfectly replicated by the final outcome of such a strategy. In such a complete
model, the equivalent martingale measure $P^*$ is unique, and derivatives are priced in
a canonical manner by taking the expectation of the discounted payoff with respect
to the measure $P^*$. Section 5.5 contains a simple case study for completeness, the
binomial model introduced by Cox, Ross, and Rubinstein. In this context, it is possible to obtain explicit pricing formulas for a number of exotic options, as explained
in Section 5.6. In Section 5.7 we pass to the limiting diffusion model of geometric
Brownian motion. Using a suitable version of the central limit theorem, we are led
to the general Black–Scholes formula for European contingent claims and to explicit
pricing formulas for some exotic options such as lookback options and the up-and-in
and up-and-out calls.
The general structure of complete models is described in Section 5.4. There it will
become clear that completeness is the exception rather than the rule: Typical market
models in discrete time are incomplete.

5.1 The multi-period market model

Throughout this chapter, we consider a market model in which $d + 1$ assets are priced
at times $t = 0, 1, \dots, T$. The price of the $i$th asset at time $t$ is modelled as a nonnegative random variable $S_t^i$ on a given probability space $(\Omega, \mathcal F, P)$. The random


vector $\bar S_t = (S_t^0, S_t) = (S_t^0, S_t^1, \dots, S_t^d)$ is assumed to be measurable with respect
to a $\sigma$-algebra $\mathcal F_t \subset \mathcal F$. One should think of $\mathcal F_t$ as the class of all events which are
observable up to time $t$. Thus, it is natural to assume that
$$\mathcal F_0 \subset \mathcal F_1 \subset \cdots \subset \mathcal F_T. \tag{5.1}$$

Definition 5.1. A family $(\mathcal F_t)_{t=0,\dots,T}$ of $\sigma$-algebras satisfying (5.1) is called a filtration. In this case, $(\Omega, \mathcal F, (\mathcal F_t)_{t=0,\dots,T}, P)$ is also called a filtered probability space.
To simplify the presentation, we will assume that
$$\mathcal F_0 = \{\emptyset, \Omega\} \quad \text{and} \quad \mathcal F = \mathcal F_T. \tag{5.2}$$
Let $(E, \mathcal E)$ be a measurable space. A stochastic process with state space $(E, \mathcal E)$ is
given by a family of $E$-valued random variables on $(\Omega, \mathcal F, P)$ indexed by time. In
our context, the typical parameter sets will be $\{0, \dots, T\}$ or $\{1, \dots, T\}$, and the state
space will be some Euclidean space.
Definition 5.2. A stochastic process $Y = (Y_t)_{t=0,\dots,T}$ is called adapted with respect
to the filtration $(\mathcal F_t)_{t=0,\dots,T}$ if each $Y_t$ is $\mathcal F_t$-measurable. A stochastic process $Z = (Z_t)_{t=1,\dots,T}$ is called predictable with respect to $(\mathcal F_t)_{t=0,\dots,T}$ if each $Z_t$ is $\mathcal F_{t-1}$-measurable.
Note that in our definition predictable processes start at $t = 1$ while adapted processes are also defined at $t = 0$. In particular, the asset prices $\bar S = (\bar S_t)_{t=0,\dots,T}$ form
an adapted stochastic process with values in $\mathbb R^{d+1}$.
Definition 5.3. A trading strategy is a predictable $\mathbb R^{d+1}$-valued process
$$\bar\xi = (\xi^0, \xi) = (\xi_t^0, \xi_t^1, \dots, \xi_t^d)_{t=1,\dots,T}.$$
The value $\xi_t^i$ of a trading strategy $\bar\xi$ corresponds to the quantity of shares of the
$i$th asset held during the $t$th trading period between $t - 1$ and $t$. Thus, $\xi_t^i S_{t-1}^i$ is the
amount invested into the $i$th asset at time $t - 1$, while $\xi_t^i S_t^i$ is the resulting value at
time $t$. The total value of the portfolio $\bar\xi_t$ at time $t - 1$ is
$$\bar\xi_t \cdot \bar S_{t-1} = \sum_{i=0}^d \xi_t^i S_{t-1}^i.$$
By time $t$, the value of the portfolio $\bar\xi_t$ has changed to
$$\bar\xi_t \cdot \bar S_t = \sum_{i=0}^d \xi_t^i S_t^i.$$


The predictability of $\bar\xi$ expresses the fact that investments must be allocated at the
beginning of each trading period, without anticipating future price increments.
Definition 5.4. A trading strategy $\bar\xi$ is called self-financing if
$$\bar\xi_t \cdot \bar S_t = \bar\xi_{t+1} \cdot \bar S_t \quad \text{for } t = 1, \dots, T - 1. \tag{5.3}$$
Intuitively, (5.3) means that the portfolio is always rearranged in such a way that its
present value is preserved. It follows that the accumulated gains and losses resulting
from the asset price fluctuations are the only source of variations of the portfolio value:
$$\bar\xi_{t+1} \cdot \bar S_{t+1} - \bar\xi_t \cdot \bar S_t = \bar\xi_{t+1} \cdot (\bar S_{t+1} - \bar S_t). \tag{5.4}$$
In fact, $\bar\xi$ is self-financing if and only if (5.4) holds for $t = 1, \dots, T - 1$. It follows
through summation over (5.4) that
$$\bar\xi_t \cdot \bar S_t = \bar\xi_1 \cdot \bar S_0 + \sum_{k=1}^t \bar\xi_k \cdot (\bar S_k - \bar S_{k-1}) \quad \text{for } t = 1, \dots, T.$$
Here, the constant $\bar\xi_1 \cdot \bar S_0$ can be interpreted as the initial investment for the purchase
of the portfolio $\bar\xi_1$.
Example 5.5. Often it is assumed that the 0th asset plays the role of a locally riskless
bond. In this case, one takes $S_0^0 \equiv 1$ and one lets $S_t^0$ evolve according to a spot rate
$r_t \ge 0$: At time $t$, an investment $x$ made at time $t - 1$ yields the payoff $x(1 + r_t)$.
Thus, a unit investment at time 0 produces the value
$$S_t^0 = \prod_{k=1}^t (1 + r_k)$$
at time $t$. An investment in $S^0$ is "locally riskless" if the spot rate $r_t$ is known beforehand at time $t - 1$. This idea can be made precise by assuming that the process $r$ is
predictable.
♦
Without assuming predictability as in the preceding example, we assume from now
on that
$$S_t^0 > 0 \quad P\text{-a.s. for all } t.$$
This assumption allows us to use the 0th asset as a numéraire and to form the discounted price processes
$$X_t^i := \frac{S_t^i}{S_t^0}, \qquad t = 0, \dots, T,\ i = 0, \dots, d.$$
Then $X_t^0 \equiv 1$, and $X_t = (X_t^1, \dots, X_t^d)$ expresses the value of the remaining assets in
units of the numéraire. As explained in Remark 1.11, discounting allows comparison
of asset prices which are quoted at different times.


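Definition 5.6. The (discounted) value process $V = (V_t)_{t=0,\dots,T}$ associated with a trading strategy $\bar\xi$ is given by
$$V_0 := \bar\xi_1 \cdot \bar X_0 \qquad \text{and} \qquad V_t := \bar\xi_t \cdot \bar X_t \quad \text{for } t = 1, \dots, T.$$
The gains process associated with $\bar\xi$ is defined as
$$G_0 := 0 \qquad \text{and} \qquad G_t := \sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1}) \quad \text{for } t = 1, \dots, T.$$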

Clearly,
$$V_t = \bar\xi_t \cdot \bar X_t = \frac{\bar\xi_t \cdot \bar S_t}{S^0_t},$$
so $V_t$ can be interpreted as the portfolio value at the end of the $t$-th trading period expressed in units of the numéraire asset. The gains process
$$G_t = \sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1})$$
reflects, in terms of the numéraire, the net gains which have accumulated through the trading strategy $\bar\xi$ up to time $t$. For a self-financing trading strategy $\bar\xi$, the identity
$$\bar\xi_t \cdot \bar S_t = \bar\xi_1 \cdot \bar S_0 + \sum_{k=1}^{t} \bar\xi_k \cdot (\bar S_k - \bar S_{k-1}) \tag{5.5}$$
remains true if all relevant quantities are computed in units of the numéraire. This is the content of the following simple proposition.
Proposition 5.7. For a trading strategy $\bar\xi$ the following conditions are equivalent:
(a) $\bar\xi$ is self-financing.
(b) $\bar\xi_t \cdot \bar X_t = \bar\xi_{t+1} \cdot \bar X_t$ for $t = 1, \dots, T-1$.
(c) $V_t = V_0 + G_t = \bar\xi_1 \cdot \bar X_0 + \sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1})$ for all $t$.

Proof. By dividing both sides of (5.3) by $S^0_t$ it is seen that condition (b) is a reformulation of Definition 5.4. Moreover, (b) holds if and only if
$$\bar\xi_{t+1} \cdot \bar X_{t+1} - \bar\xi_t \cdot \bar X_t = \bar\xi_{t+1} \cdot (\bar X_{t+1} - \bar X_t) = \xi_{t+1} \cdot (X_{t+1} - X_t)$$
for $t = 1, \dots, T-1$, and this identity is equivalent to (c).


Remark 5.8. The numéraire component of a self-financing trading strategy $\bar\xi$ satisfies
$$\xi^0_{t+1} - \xi^0_t = -(\xi_{t+1} - \xi_t) \cdot X_t \qquad \text{for } t = 1, \dots, T-1. \tag{5.6}$$
Since
$$\xi^0_1 = V_0 - \xi_1 \cdot X_0, \tag{5.7}$$
the entire process $\xi^0$ is determined by the initial investment $V_0$ and the $d$-dimensional process $\xi$. Consequently, if a constant $V_0$ and an arbitrary $d$-dimensional predictable process $\xi$ are given, then we can use (5.7) and (5.6) as the definition of a predictable process $\xi^0$, and this construction yields a self-financing trading strategy $\bar\xi := (\xi^0, \xi)$. In dealing with self-financing strategies $\bar\xi$, it is thus sufficient to focus on the initial investment $V_0$ and the $d$-dimensional processes $X$ and $\xi$. ◊
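The construction in Remark 5.8 can be sketched in code. The example below assumes hypothetical discounted prices for $d = 2$ risky assets and hypothetical risky holdings along a single scenario; it builds the numéraire component from (5.7) and (5.6) and lets one compare the resulting value process with $V_0 + G_t$. All names (`X`, `xi`, `xi0`, `value`, `gains`) are illustrative.

```python
# Hypothetical discounted prices X_t of d = 2 risky assets for t = 0, ..., 3.
X = [(10.0, 5.0), (11.0, 4.5), (9.5, 5.5), (10.5, 6.0)]
# Hypothetical risky holdings xi_t for the trading periods t = 1, 2, 3.
xi = {1: (1.0, 2.0), 2: (0.5, -1.0), 3: (2.0, 0.0)}
V0 = 25.0
T = 3


def dot(a, b):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


# (5.7): xi^0_1 = V_0 - xi_1 . X_0
xi0 = {1: V0 - dot(xi[1], X[0])}
# (5.6): xi^0_{t+1} = xi^0_t - (xi_{t+1} - xi_t) . X_t
for t in range(1, T):
    diff = [a - b for a, b in zip(xi[t + 1], xi[t])]
    xi0[t + 1] = xi0[t] - dot(diff, X[t])


def value(t):
    """Discounted value V_t = xi^0_t + xi_t . X_t (the numeraire has price 1)."""
    return xi0[t] + dot(xi[t], X[t])


def gains(t):
    """Gains process G_t = sum_{k<=t} xi_k . (X_k - X_{k-1})."""
    return sum(dot(xi[k], [a - b for a, b in zip(X[k], X[k - 1])])
               for k in range(1, t + 1))
```

By Proposition 5.7, `value(t)` should coincide with `V0 + gains(t)` for each `t`, which is exactly condition (c) for the strategy built this way.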
Remark 5.9. Different economic agents investing into the same market may choose different numéraires. For example, consider the following simple market model in which prices are quoted in euros (€) as the domestic currency. Let $S^0$ be a locally riskless €-bond with the predictable spot rate process $r^0$, i.e.,
$$S^0_t = \prod_{k=1}^{t} (1 + r^0_k),$$
and let $S^1$ describe the price of a locally riskless investment into US dollars (\$). Since the price of this \$-bond is quoted in €, the asset $S^1$ is modeled as
$$S^1_t = U_t \cdot \prod_{k=1}^{t} (1 + r^1_k),$$
where $r^1$ is the spot rate for a \$-investment, and $U_t$ denotes the price of 1 \$ in terms of €, i.e., $U_t$ is the exchange rate of the \$ versus the €. While it may be natural for European investors to take $S^0$ as their numéraire, it may be reasonable for an American investor to choose $S^1$. This simple example explains why it may be relevant to check which concepts and results of our theory are invariant under a change of numéraire; see, e.g., the discussion at the end of Section 5.2. ◊
Exercise 5.1.1. Consider a market model with two assets which are modeled as usual by the stochastic process $\bar S = (S^0, S^1)$ that is adapted to the filtration $(\mathcal{F}_t)_{t=0,\dots,T}$. Decide which of the following processes $\xi$ are predictable and which in general are not.
(i) $\xi_t = I_{\{S^1_t > S^1_{t-1}\}}$;
(ii) $\xi_1 = 1$ and $\xi_t = I_{\{S^1_{t-1} > S^1_{t-2}\}}$ for $t \ge 2$;
(iii) $\xi_t = I_A \cdot I_{\{t > t_0\}}$, where $t_0 \in \{0, \dots, T\}$ and $A \in \mathcal{F}_{t_0}$;
(iv) $\xi_t = I_{\{S^1_t > S^1_0\}}$;
(v) $\xi_1 = 1$ and $\xi_t = 2\xi_{t-1} I_{\{S^1_{t-1} < S^1_0\}}$ for $t \ge 2$.

5.2 Arbitrage opportunities and martingale measures

Intuitively, an arbitrage opportunity is an investment strategy that yields a positive


profit with positive probability but without any downside risk.
Definition 5.10. A self-financing trading strategy is called an arbitrage opportunity if its value process $V$ satisfies
$$V_0 \le 0, \qquad V_T \ge 0 \ \ P\text{-a.s.}, \qquad \text{and} \qquad P[\, V_T > 0 \,] > 0.$$

The existence of such an arbitrage opportunity may be regarded as a market inefficiency in the sense that certain assets are not priced in a reasonable way. In this
section, we will characterize those market models which do not allow for arbitrage
opportunities. Such models will be called arbitrage-free. The following proposition
shows that the market model is arbitrage-free if and only if there are no arbitrage opportunities for each single trading period. Later on, this fact will allow us to apply the
results of Section 1.6 to our multi-period model.
Proposition 5.11. The market model admits an arbitrage opportunity if and only if there exist $t \in \{1, \dots, T\}$ and $\eta \in L^0(\Omega, \mathcal{F}_{t-1}, P; \mathbb{R}^d)$ such that
$$\eta \cdot (X_t - X_{t-1}) \ge 0 \quad P\text{-a.s.}, \qquad \text{and} \qquad P[\, \eta \cdot (X_t - X_{t-1}) > 0 \,] > 0. \tag{5.8}$$

Proof. To prove necessity, take an arbitrage opportunity $\bar\xi = (\xi^0, \xi)$ with value process $V$, and let
$$t := \min\{\, k \mid V_k \ge 0 \ P\text{-a.s., and } P[\, V_k > 0 \,] > 0 \,\}.$$
Then $t \le T$ by assumption, and either $V_{t-1} = 0$ $P$-a.s. or $P[\, V_{t-1} < 0 \,] > 0$. In the first case, it follows that
$$\xi_t \cdot (X_t - X_{t-1}) = V_t - V_{t-1} = V_t \quad P\text{-a.s.}$$
Thus, $\eta := \xi_t$ satisfies (5.8). In the second case, we let $\eta := \xi_t I_{\{V_{t-1} < 0\}}$. Then $\eta$ is $\mathcal{F}_{t-1}$-measurable, and
$$\eta \cdot (X_t - X_{t-1}) = (V_t - V_{t-1}) I_{\{V_{t-1} < 0\}} \ge -V_{t-1} I_{\{V_{t-1} < 0\}}.$$
The expression on the right-hand side is non-negative and strictly positive with a positive probability, so (5.8) holds.

Now we prove sufficiency. For $t$ and $\eta$ as in (5.8), define a $d$-dimensional predictable process $\xi$ by
$$\xi_s := \begin{cases} \eta & \text{if } s = t, \\ 0 & \text{otherwise.} \end{cases}$$
Via (5.7) and (5.6), $\xi$ uniquely defines a self-financing trading strategy $\bar\xi = (\xi^0, \xi)$ with initial investment $V_0 = 0$. Since the corresponding value process satisfies $V_T = \eta \cdot (X_t - X_{t-1})$, the strategy $\bar\xi$ is an arbitrage opportunity.
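Exercise 5.2.1. Let $V$ be the value process of a self-financing strategy in an arbitrage-free market model. Prove that the following two implications hold for all $t \in \{0, \dots, T-1\}$ and $A \in \mathcal{F}_t$ with $P[\,A\,] > 0$.
$$P[\, V_{t+1} - V_t \ge 0 \mid A \,] = 1 \quad \Longrightarrow \quad P[\, V_{t+1} - V_t = 0 \mid A \,] = 1,$$
$$P[\, V_{t+1} - V_t \le 0 \mid A \,] = 1 \quad \Longrightarrow \quad P[\, V_{t+1} - V_t = 0 \mid A \,] = 1.$$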

Definition 5.12. A stochastic process $M = (M_t)_{t=0,\dots,T}$ on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t), Q)$ is called a martingale if $M$ is adapted, satisfies $E_Q[\, |M_t| \,] < \infty$ for all $t$, and if
$$M_s = E_Q[\, M_t \mid \mathcal{F}_s \,] \qquad \text{for } 0 \le s \le t \le T. \tag{5.9}$$
A martingale can be regarded as the mathematical formalization of a "fair game": For each time $s$ and for each horizon $t > s$, the conditional expectation of the future gain $M_t - M_s$ is zero, given the information available at $s$.
Exercise 5.2.2. Let $M = (M_t)_{t=0,\dots,T}$ be an adapted process on $(\Omega, \mathcal{F}, (\mathcal{F}_t), Q)$ such that $E_Q[\, |M_t| \,] < \infty$ for all $t$. Show that the following conditions are equivalent:
(a) $M$ is a martingale.
(b) $M_t = E_Q[\, M_{t+1} \mid \mathcal{F}_t \,]$ for $0 \le t \le T-1$.
(c) There exists $F \in L^1(\Omega, \mathcal{F}_T, Q)$ such that $M_t = E_Q[\, F \mid \mathcal{F}_t \,]$ for $t = 0, \dots, T$, that is, $M$ arises as a sequence of successive conditional expectations. ◊

Exercise 5.2.3. Let $\tilde Q$ be a probability measure on $(\Omega, \mathcal{F}, (\mathcal{F}_t))$ that is absolutely continuous with respect to $Q$. Show that the density process
$$Z_t := \frac{d\tilde Q}{dQ}\bigg|_{\mathcal{F}_t}, \qquad t = 0, \dots, T,$$
is a martingale with respect to $Q$.

Whether or not a given process M is a martingale depends on the underlying probability measure Q. If we wish to emphasize the dependence of the martingale property
of M on a particular measure Q, we will say that M is a Q-martingale or that M is
a martingale under the measure Q.


Definition 5.13. A probability measure $Q$ on $(\Omega, \mathcal{F}_T)$ is called a martingale measure if the discounted price process $X$ is a ($d$-dimensional) $Q$-martingale, i.e.,
$$E_Q[\, X^i_t \,] < \infty \qquad \text{and} \qquad X^i_s = E_Q[\, X^i_t \mid \mathcal{F}_s \,], \qquad 0 \le s \le t \le T, \quad i = 1, \dots, d.$$
A martingale measure $P^*$ is called an equivalent martingale measure if it is equivalent to the original measure $P$ on $\mathcal{F}_T$. The set of all equivalent martingale measures is denoted by $\mathcal{P}$.
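To make the martingale condition concrete, here is a minimal sketch in a binomial model, which is an assumption made for illustration and is not introduced at this point of the text. A single discounted price moves by a factor `u` or `d` in each period, and the up-probability `q` is chosen so that one-step conditional expectations reproduce the current price. All parameter values and function names are hypothetical.

```python
from itertools import product

# Hypothetical binomial model for one discounted price: X_{t+1} = X_t * u or X_t * d.
X0, u, d, T = 100.0, 1.2, 0.9, 3

# Under Q the up-move gets probability q with q*u + (1 - q)*d = 1.
q = (1.0 - d) / (u - d)


def price(path):
    """Discounted price after following a path of +1 (up) / -1 (down) moves."""
    x = X0
    for move in path:
        x *= u if move == 1 else d
    return x


def prob(path):
    """Q-probability of a sequence of independent moves."""
    p = 1.0
    for move in path:
        p *= q if move == 1 else 1.0 - q
    return p


def cond_expectation(history, t):
    """E_Q[X_t | F_s] on the event that the first s = len(history) moves equal `history`."""
    rest = t - len(history)
    return sum(prob(tail) * price(history + tail)
               for tail in product((1, -1), repeat=rest))
```

With these parameters `q` lies strictly between 0 and 1, so `Q` charges every path, and `cond_expectation(history, t)` should return `price(history)` for every node and every later time `t`, which is the identity $X_s = E_Q[\,X_t \mid \mathcal{F}_s\,]$.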
The following result is a version of Doob's fundamental "systems theorem" for martingales. It states that a fair game admits no realistic gambling system which produces a positive expected gain. Here, $Y^-$ denotes the negative part $-(Y \wedge 0)$ of a random variable $Y$.
Theorem 5.14. For a probability measure $Q$, the following conditions are equivalent:
(a) $Q$ is a martingale measure.
(b) If $\bar\xi = (\xi^0, \xi)$ is self-financing and $\xi$ is bounded, then the value process $V$ of $\bar\xi$ is a $Q$-martingale.
(c) If $\bar\xi = (\xi^0, \xi)$ is self-financing and its value process $V$ satisfies $E_Q[\, V_T^- \,] < \infty$, then $V$ is a $Q$-martingale.
(d) If $\bar\xi = (\xi^0, \xi)$ is self-financing and its value process $V$ satisfies $V_T \ge 0$ $Q$-a.s., then $E_Q[\, V_T \,] = V_0$.
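Proof. (a) $\Rightarrow$ (b): Let $V$ be the value process of a self-financing trading strategy $\bar\xi = (\xi^0, \xi)$ such that there is a constant $c$ such that $|\xi^i| \le c$ for all $i$. Then
$$|V_t| \le |V_0| + \sum_{k=1}^{t} c\,(|X_k| + |X_{k-1}|).$$
Since each $|X_k|$ belongs to $L^1(Q)$, we have $E_Q[\, |V_t| \,] < \infty$. Moreover, for $0 \le t \le T-1$,
$$\begin{aligned}
E_Q[\, V_{t+1} \mid \mathcal{F}_t \,] &= E_Q[\, V_t + \xi_{t+1} \cdot (X_{t+1} - X_t) \mid \mathcal{F}_t \,] \\
&= V_t + \xi_{t+1} \cdot E_Q[\, X_{t+1} - X_t \mid \mathcal{F}_t \,] \\
&= V_t,
\end{aligned}$$
where we have used that $\xi_{t+1}$ is $\mathcal{F}_t$-measurable and bounded.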
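(b) $\Rightarrow$ (c): We will show the following implication:
$$\text{If } E_Q[\, V_t^- \,] < \infty \text{ then } E_Q[\, V_t \mid \mathcal{F}_{t-1} \,] = V_{t-1}. \tag{5.10}$$
Since $E_Q[\, V_T^- \,] < \infty$ by assumption, we will then get
$$E_Q[\, V_{T-1}^- \,] = E_Q\big[\, E_Q[\, V_T \mid \mathcal{F}_{T-1} \,]^- \,\big] \le E_Q[\, V_T^- \,] < \infty,$$
due to Jensen's inequality for conditional expectations. Repeating this argument will yield $E_Q[\, V_t^- \,] < \infty$ and $E_Q[\, V_t \mid \mathcal{F}_{t-1} \,] = V_{t-1}$ for all $t$. Since $V_0$ is a finite constant, we will also get $E_Q[\, V_t \,] = V_0$, which together with the fact that $E_Q[\, V_t^- \,] < \infty$ implies $V_t \in L^1(Q)$ for all $t$. Thus, the martingale property of $V$ will follow.

To prove (5.10), note first that $E_Q[\, V_t \mid \mathcal{F}_{t-1} \,]$ is well defined due to our assumption $E_Q[\, V_t^- \,] < \infty$. Next, let $\xi_t^{(a)} := \xi_t I_{\{|\xi_t| \le a\}}$ for $a > 0$. Then $\xi_t^{(a)} \cdot (X_t - X_{t-1})$ is a martingale increment by condition (b). In particular, $\xi_t^{(a)} \cdot (X_t - X_{t-1}) \in L^1(Q)$ and $E_Q[\, \xi_t^{(a)} \cdot (X_t - X_{t-1}) \mid \mathcal{F}_{t-1} \,] = 0$. Hence,
$$\begin{aligned}
E_Q[\, V_t \mid \mathcal{F}_{t-1} \,]\, I_{\{|\xi_t| \le a\}} &= E_Q[\, V_t I_{\{|\xi_t| \le a\}} \mid \mathcal{F}_{t-1} \,] - E_Q[\, \xi_t^{(a)} \cdot (X_t - X_{t-1}) \mid \mathcal{F}_{t-1} \,] \\
&= E_Q[\, V_t I_{\{|\xi_t| \le a\}} - \xi_t^{(a)} \cdot (X_t - X_{t-1}) \mid \mathcal{F}_{t-1} \,] \\
&= E_Q[\, V_{t-1} I_{\{|\xi_t| \le a\}} \mid \mathcal{F}_{t-1} \,] \\
&= V_{t-1} I_{\{|\xi_t| \le a\}}.
\end{aligned}$$
By sending $a \uparrow \infty$, we obtain (5.10).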
(c) $\Rightarrow$ (d): By (5.2), every $Q$-martingale $M$ satisfies
$$M_0 = E_Q[\, M_T \mid \mathcal{F}_0 \,] = E_Q[\, M_T \,].$$
(d) $\Rightarrow$ (a): To prove that $X^i_t \in L^1(Q)$ for given $i$ and $t$, consider the deterministic process $\xi$ defined by $\xi^i_s := I_{\{s \le t\}}$ and $\xi^j_s := 0$ for $j \ne i$. By Remark 5.8, $\xi$ can be complemented with a predictable process $\xi^0$ such that $\bar\xi = (\xi^0, \xi)$ is a self-financing strategy with initial investment $V_0 = X^i_0$. The corresponding value process satisfies
$$V_T = V_0 + \sum_{s=1}^{T} \xi_s \cdot (X_s - X_{s-1}) = X^i_t \ge 0.$$
From (d) we get
$$E_Q[\, X^i_t \,] = E_Q[\, V_T \,] = V_0 = X^i_0, \tag{5.11}$$
which yields $X^i_t \in L^1(Q)$.

Condition (a) will follow if we can show that $E_Q[\, X^i_t;\, A \,] = E_Q[\, X^i_{t-1};\, A \,]$ for given $t$, $i$, and $A \in \mathcal{F}_{t-1}$. To this end, we define a $d$-dimensional predictable process $\eta$ by $\eta^i_s := I_{\{s < t\}} + I_{A^c} I_{\{s = t\}}$ and $\eta^j_s := 0$ for $j \ne i$. As above, we take a predictable process $\eta^0$ such that $\bar\eta = (\eta^0, \eta)$ is a self-financing strategy with initial investment $\tilde V_0 = X^i_0$. Its terminal value is given by
$$\tilde V_T = \tilde V_0 + \sum_{s=1}^{T} \eta_s \cdot (X_s - X_{s-1}) = X^i_t I_{A^c} + X^i_{t-1} I_A \ge 0.$$
Using (d) yields
$$X^i_0 = \tilde V_0 = E_Q[\, \tilde V_T \,] = E_Q[\, X^i_t;\, A^c \,] + E_Q[\, X^i_{t-1};\, A \,].$$
By comparing this identity with (5.11), we conclude that $E_Q[\, X^i_t;\, A \,] = E_Q[\, X^i_{t-1};\, A \,]$.

Remark 5.15. (a) Suppose that the "objective" measure $P$ is itself a martingale measure, so that the fluctuation of prices may be viewed as a fair game. In this case, the preceding proposition shows that there are no realistic self-financing strategies which would generate a positive expected gain. Thus, the assumption $P \in \mathcal{P}$ is a strong version of the so-called efficient market hypothesis. For a market model containing a locally riskless bond, this strong hypothesis would imply that risk-averse investors would not be attracted towards investing into the risky assets if their expectations are consistent with $P$; see Example 2.40.
(b) The strong assumption $P \in \mathcal{P}$ implies, in particular, that there is no arbitrage opportunity, i.e., no self-financing strategy with positive expected gain and without any downside risk. Indeed, Theorem 5.14 implies that the value process of any self-financing strategy with $V_0 \le 0$ and $V_T \ge 0$ satisfies $E[\, V_T \,] = V_0$, hence $V_T = 0$ $P$-almost surely. The assumption that the market model is arbitrage-free may be viewed as a much milder and hence more flexible form of the efficient market hypothesis. ◊
We can now state the following dynamic version of the fundamental theorem of
asset pricing, which relates the absence of arbitrage opportunities to the existence of
equivalent martingale measures.
Theorem 5.16. The market model is arbitrage-free if and only if the set $\mathcal{P}$ of all equivalent martingale measures is non-empty. In this case, there exists a $P^* \in \mathcal{P}$ with bounded density $dP^*/dP$.
Proof. Suppose first that there exists an equivalent martingale measure $P^*$. Then it follows as in Remark 5.15 (b) that the market model in which the probability measure $P$ is replaced by $P^*$ is arbitrage-free. Since the notion of an arbitrage opportunity depends on the underlying measure only through its null sets and since these are common for the two equivalent measures $P$ and $P^*$, it follows that also the original market model is arbitrage-free.

Let us turn to the proof of the converse assertion. For $t \in \{1, \dots, T\}$, we define
$$\mathcal{K}_t := \{\, \eta \cdot (X_t - X_{t-1}) \mid \eta \in L^0(\Omega, \mathcal{F}_{t-1}, P; \mathbb{R}^d) \,\}. \tag{5.12}$$
By Proposition 5.11, the market model is arbitrage-free if and only if
$$\mathcal{K}_t \cap L^0_+(\Omega, \mathcal{F}_t, P) = \{0\} \tag{5.13}$$
holds for all $t$. Note that (5.13) depends on the measure $P$ only through its null sets.

Condition (5.13) allows us to apply Theorem 1.55 to the $t$-th trading period. For $t = T$ we obtain a probability measure $\tilde P_T \approx P$ which has a bounded density $d\tilde P_T/dP$ and which satisfies
$$\tilde E_T[\, X_T - X_{T-1} \mid \mathcal{F}_{T-1} \,] = 0.$$
Now suppose that we already have a probability measure $\tilde P_{t+1} \approx P$ with a bounded density $d\tilde P_{t+1}/dP$ such that
$$\tilde E_{t+1}[\, X_k - X_{k-1} \mid \mathcal{F}_{k-1} \,] = 0 \qquad \text{for } t+1 \le k \le T. \tag{5.14}$$
The equivalence of $\tilde P_{t+1}$ and $P$ implies that (5.13) also holds with $P$ replaced by $\tilde P_{t+1}$. Applying Theorem 1.55 to the $t$-th trading period yields a probability measure $\tilde P_t$ with a bounded $\mathcal{F}_t$-measurable density $Z_t := d\tilde P_t/d\tilde P_{t+1} > 0$ such that
$$\tilde E_t[\, X_t - X_{t-1} \mid \mathcal{F}_{t-1} \,] = 0.$$
Clearly, $\tilde P_t$ is equivalent to $P$ and has a bounded density, since
$$\frac{d\tilde P_t}{dP} = \frac{d\tilde P_t}{d\tilde P_{t+1}} \cdot \frac{d\tilde P_{t+1}}{dP}$$
is the product of two bounded densities. Moreover, if $t+1 \le k \le T$, Proposition A.12 and the $\mathcal{F}_t$-measurability of $Z_t = d\tilde P_t/d\tilde P_{t+1}$ imply
$$\begin{aligned}
\tilde E_t[\, X_k - X_{k-1} \mid \mathcal{F}_{k-1} \,] &= \frac{\tilde E_{t+1}[\, (X_k - X_{k-1}) Z_t \mid \mathcal{F}_{k-1} \,]}{\tilde E_{t+1}[\, Z_t \mid \mathcal{F}_{k-1} \,]} \\
&= \tilde E_{t+1}[\, X_k - X_{k-1} \mid \mathcal{F}_{k-1} \,] \\
&= 0.
\end{aligned}$$
Hence, (5.14) carries over from $\tilde P_{t+1}$ to $\tilde P_t$. We can repeat this recursion until finally $P^* := \tilde P_1$ yields the desired equivalent martingale measure.
Clearly, the absence of arbitrage in the market is independent of the choice of the numéraire, while the set $\mathcal{P}$ of equivalent martingale measures generally does depend on the numéraire. In order to investigate the structure of this dependence, suppose that the first asset $S^1$ is $P$-a.s. strictly positive, so that it can serve as an alternative numéraire. The price process discounted by $S^1$ is denoted by
$$\bar Y_t = (Y^0_t, Y^1_t, \dots, Y^d_t) := \left( \frac{S^0_t}{S^1_t},\, 1,\, \frac{S^2_t}{S^1_t},\, \dots,\, \frac{S^d_t}{S^1_t} \right) = \frac{S^0_t}{S^1_t}\, \bar X_t, \qquad t = 0, \dots, T.$$

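Let $\tilde{\mathcal{P}}$ be the set of equivalent martingale measures for $\bar Y$. Then $\tilde{\mathcal{P}} \ne \emptyset$ if and only if $\mathcal{P} \ne \emptyset$, according to Theorem 5.16 and the fact that the existence of arbitrage opportunities is independent of the choice of the numéraire.

Proposition 5.17. The two sets $\mathcal{P}$ and $\tilde{\mathcal{P}}$ are related via the identity
$$\tilde{\mathcal{P}} = \left\{\, \tilde P^* \;\middle|\; \frac{d\tilde P^*}{dP^*} = \frac{X^1_T}{X^1_0} \text{ for some } P^* \in \mathcal{P} \,\right\}.$$

Proof. The process $X^1_t / X^1_0$ is a $P^*$-martingale for any $P^* \in \mathcal{P}$. In particular, $E^*[\, X^1_T / X^1_0 \,] = 1$, and the formula
$$\frac{d\tilde P^*}{dP^*} = \frac{X^1_T}{X^1_0}$$
defines a probability measure $\tilde P^*$ which is equivalent to $P$. Moreover, by Proposition A.12,
$$\begin{aligned}
\tilde E^*[\, \bar Y_t \mid \mathcal{F}_s \,] &= \frac{1}{X^1_s} \cdot E^*[\, \bar Y_t \cdot X^1_t \mid \mathcal{F}_s \,] \\
&= \frac{1}{X^1_s} \cdot E^*[\, \bar X_t \mid \mathcal{F}_s \,] \\
&= \bar Y_s.
\end{aligned}$$
Hence, $\tilde P^*$ is an equivalent martingale measure for $\bar Y$, and it follows that
$$\tilde{\mathcal{P}} \supseteq \left\{\, \tilde P^* \;\middle|\; \frac{d\tilde P^*}{dP^*} = \frac{X^1_T}{X^1_0} \text{ for some } P^* \in \mathcal{P} \,\right\}.$$
Reversing the roles of $X$ and $Y$ yields the identity of the two sets.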
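The change of measure in Proposition 5.17 can be traced through a toy example. The sketch below assumes a hypothetical one-period market with two states, a bond and a stock; the numbers and the dictionary names (`P_star`, `P_tilde`, and so on) are illustrative only. It computes the martingale measure for the bond numéraire, reweights it by $X^1_T / X^1_0$, and evaluates the expected bond price in units of the stock under the new measure.

```python
# Hypothetical one-period, two-state market: a bond S^0 and a stock S^1.
S0_bond, S1_bond = 1.0, {"up": 1.05, "down": 1.05}
S0_stock, S1_stock = 100.0, {"up": 120.0, "down": 90.0}
states = ("up", "down")

# Discounted stock price X^1 = S^1 / S^0 and its martingale measure P*.
X0 = S0_stock / S0_bond
X1 = {w: S1_stock[w] / S1_bond[w] for w in states}
p_up = (X0 - X1["down"]) / (X1["up"] - X1["down"])
P_star = {"up": p_up, "down": 1.0 - p_up}

# Proposition 5.17: the density of the new measure is X^1_T / X^1_0.
P_tilde = {w: P_star[w] * X1[w] / X0 for w in states}

# Bond price in units of the stock, Y^0 = S^0 / S^1.
Y0 = S0_bond / S0_stock
Y1 = {w: S1_bond[w] / S1_stock[w] for w in states}
expected_Y1 = sum(P_tilde[w] * Y1[w] for w in states)
```

In this example `P_tilde` sums to one, `expected_Y1` should equal `Y0` up to rounding, and `P_tilde` differs from `P_star` because $X^1_T$ is not constant.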
Remark 5.18. Unless $X^1_T$ is $P$-a.s. constant, the two sets $\mathcal{P}$ and $\tilde{\mathcal{P}}$ satisfy
$$\mathcal{P} \cap \tilde{\mathcal{P}} = \emptyset.$$
This can be proved as in Remark 1.12. ◊

Exercise 5.2.4. Let $X_t := X^1_t$ be the $P$-a.s. strictly positive discounted price process of a risky asset. The corresponding returns are
$$\tilde R_t := \frac{X_t - X_{t-1}}{X_{t-1}}, \qquad t = 1, \dots, T,$$
so that
$$X_t = X_0 \prod_{k=1}^{t} (1 + \tilde R_k).$$
We take as filtration $\mathcal{F}_t = \sigma(X_0, \dots, X_t)$.
(a) Show that $X$ is a $P$-martingale when the $(\tilde R_t)$ are independent and integrable random variables with $E[\, \tilde R_t \,] = 0$.
(b) Now give necessary and sufficient conditions on the $(\tilde R_t)$ such that $X$ is a $P$-martingale.
(c) Construct an example in which $X$ is a martingale but the $(\tilde R_t)$ are not independent. ◊
Exercise 5.2.5. Let $Z_1, \dots, Z_T$ be independent standard normal random variables on $(\Omega, \mathcal{F}, P)$, and let $\mathcal{F}_t$ be the $\sigma$-field generated by $Z_1, \dots, Z_t$, where $t = 1, \dots, T$. We also let $\mathcal{F}_0 := \{\emptyset, \Omega\}$. For constants $X^1_0 > 0$, $\sigma_i > 0$, and $m_i \in \mathbb{R}$ we now define the discounted price process of a risky asset as the following sequence of log-normally distributed random variables,
$$X^1_t := X^1_0 \prod_{i=1}^{t} e^{\sigma_i Z_i + m_i}, \qquad t = 0, \dots, T. \tag{5.15}$$
Construct an equivalent martingale measure for $X^1$ under which the random variables $X^1_t$ still have a log-normal distribution. ◊
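Exercise 5.2.6. For a square-integrable random variable $X$ on $(\Omega, \mathcal{F}, P)$ and a $\sigma$-algebra $\mathcal{F}_0 \subset \mathcal{F}$, the conditional variance of $X$ given $\mathcal{F}_0$ is defined as
$$\operatorname{var}(X \mid \mathcal{F}_0) := E[\, (X - E[\, X \mid \mathcal{F}_0 \,])^2 \mid \mathcal{F}_0 \,].$$
Show that
$$\operatorname{var}(X \mid \mathcal{F}_0) = E[\, X^2 \mid \mathcal{F}_0 \,] - (E[\, X \mid \mathcal{F}_0 \,])^2$$
and that
$$\operatorname{var}(X) = E[\, \operatorname{var}(X \mid \mathcal{F}_0) \,] + \operatorname{var}(E[\, X \mid \mathcal{F}_0 \,]).$$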

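Exercise 5.2.7. Let $Y_1$ and $Y_2$ be jointly normal random variables with mean 0, variance 1, and correlation $\varrho \in (-1, 1)$. That is, the joint distribution of $(Y_1, Y_2)$ has the density
$$\varphi(y_1, y_2) = \frac{1}{2\pi\sqrt{1 - \varrho^2}}\, \exp\!\left( -\frac{1}{2(1 - \varrho^2)} \big( y_1^2 + y_2^2 - 2\varrho\, y_1 y_2 \big) \right), \qquad (y_1, y_2) \in \mathbb{R}^2.$$
(a) Compute the conditional expectation $E[\, Y_2 \mid Y_1 \,]$.
(b) Compute the conditional variance $\operatorname{var}(Y_2 \mid Y_1)$.
(c) For constants $m, \sigma \in \mathbb{R}$ compute $E[\, e^{\sigma Y_2 + m} \mid Y_1 \,]$.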


Exercise 5.2.8. Let $Y_1$ and $Y_2$ be as in Exercise 5.2.7. We use the $Y_i$ to construct a log-normal price process in analogy to (5.15),
$$X^1_t := X^1_0 \prod_{i=1}^{t} e^{\sigma_i Y_i + m_i}, \qquad t = 0, \dots, 2, \tag{5.16}$$
for constants $X^1_0, \sigma_i > 0$ and $m_i \in \mathbb{R}$ ($i = 1, 2$).
(a) Compute the conditional expectation $E[\, X^1_2 \mid X^1_1 \,]$.
(b) Construct an equivalent martingale measure for the price process in (5.16) when the filtration is the one generated by the process $X^1$. ◊
Exercise 5.2.9. Let $X_0, X_1, \dots$ describe the discounted prices of a risky asset in a market model with infinite time horizon that is modeled on a filtered probability space $(\Omega, (\mathcal{F}_t)_{t=0,1,\dots}, P)$. Suppose that every market model $X_0, \dots, X_T$ with finite time horizon $T \in \mathbb{N}$ is arbitrage-free.
(a) Show that there exists a sequence $(P^*_T)_{T=1,2,\dots}$ of probability measures such that $P^*_T$ is defined on $(\Omega, \mathcal{F}_T)$, is equivalent to $P$ on $\mathcal{F}_T$, and such that the restriction of $P^*_T$ to $\mathcal{F}_{T-1}$ equals $P^*_{T-1}$, i.e., $P^*_T[\, A \,] = P^*_{T-1}[\, A \,]$ for all $A \in \mathcal{F}_{T-1}$.
(b) Can you give conditions under which $P^*_T$ arises as the restriction to $\mathcal{F}_T$ of a measure $P^*$ that is defined on $\mathcal{F}_\infty := \sigma\big( \bigcup_{t \ge 0} \mathcal{F}_t \big)$?
Hint: You may choose a setting in which one can apply the Kolmogorov extension theorem. ◊

5.3 European contingent claims

A key topic of mathematical finance is the analysis of derivative securities or contingent claims, i.e., of certain assets whose payoff depends on the behavior of the primary assets $S^0, S^1, \dots, S^d$ and, in some cases, also on other factors.

Definition 5.19. A non-negative random variable $C$ on $(\Omega, \mathcal{F}_T, P)$ is called a European contingent claim. A European contingent claim $C$ is called a derivative of the underlying assets $S^0, S^1, \dots, S^d$ if $C$ is measurable with respect to the $\sigma$-algebra generated by the price process $(\bar S_t)_{t=0,\dots,T}$.

A European contingent claim has the interpretation of an asset which yields at time $T$ the amount $C(\omega)$, depending on the scenario $\omega$ of the market evolution. $T$ is called the expiration date or the maturity of $C$. Of course, maturities prior to the final trading period $T$ of our model are also possible, but unless it is otherwise mentioned, we will assume that our European contingent claims expire at $T$. In Chapter 6, we will meet another class of derivative securities, the so-called American contingent claims. As long as there is no risk of confusion between European and American contingent claims, we will use the term "contingent claim" to refer to a European contingent claim.
Example 5.20. The owner of a European call option has the right, but not the obligation, to buy an asset at time $T$ for a fixed price $K$, called the strike price. This corresponds to a contingent claim of the form
$$C^{\text{call}} = (S^i_T - K)^+.$$
Conversely, a European put option gives the right, but not the obligation, to sell the asset at time $T$ for a strike price $K$. This corresponds to the contingent claim
$$C^{\text{put}} = (K - S^i_T)^+.$$
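The two payoff formulas translate directly into code. This is a minimal sketch with hypothetical function names; `s_T` stands for the terminal price $S^i_T$ and `strike` for $K$.

```python
def call_payoff(s_T, strike):
    """European call: C^call = (S_T - K)^+."""
    return max(s_T - strike, 0.0)


def put_payoff(s_T, strike):
    """European put: C^put = (K - S_T)^+."""
    return max(strike - s_T, 0.0)
```

For any terminal price, `call_payoff(s_T, strike) - put_payoff(s_T, strike)` equals `s_T - strike`, since $x^+ - (-x)^+ = x$.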

Example 5.21. The payoff of an Asian option depends on the average price
$$S^i_{\text{av}} := \frac{1}{|\mathbb{T}|} \sum_{t \in \mathbb{T}} S^i_t$$
of the underlying asset during a predetermined set of periods $\mathbb{T} \subset \{0, \dots, T\}$. For instance, an average price call with strike $K$ corresponds to the contingent claim
$$C^{\text{call}}_{\text{av}} := (S^i_{\text{av}} - K)^+,$$
and an average price put has the payoff
$$C^{\text{put}}_{\text{av}} := (K - S^i_{\text{av}})^+.$$
Average price options can be used, for instance, to secure regular cash streams against exchange rate fluctuations. For example, assume that an economic agent receives at each time $t \in \mathbb{T}$ a fixed amount of a foreign currency with exchange rates $S^i_t$. In this case, an average price put option may be an efficient instrument for securing the incoming cash stream against the risk of unfavorable exchange rates.

An average strike call corresponds to the contingent claim
$$(S^i_T - S^i_{\text{av}})^+,$$
while an average strike put pays off the amount
$$(S^i_{\text{av}} - S^i_T)^+.$$
An average strike put can be used, for example, to secure the risk from selling at time $T$ a quantity of an asset which was bought at successive times over the period $\mathbb{T}$. ◊
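The four Asian payoffs can be written as functions of a price path. The sketch below assumes that a path is a list `path[0], ..., path[T]` of prices of the underlying asset and that `periods` lists the averaging times; the sample path and all names are hypothetical.

```python
def average_price(path, periods):
    """S_av = (1/|T|) * sum of S_t over the averaging periods t."""
    return sum(path[t] for t in periods) / len(periods)


def average_price_call(path, periods, strike):
    return max(average_price(path, periods) - strike, 0.0)


def average_price_put(path, periods, strike):
    return max(strike - average_price(path, periods), 0.0)


def average_strike_call(path, periods):
    # The terminal price S_T is the last entry of the path.
    return max(path[-1] - average_price(path, periods), 0.0)


def average_strike_put(path, periods):
    return max(average_price(path, periods) - path[-1], 0.0)


path = [100.0, 104.0, 98.0, 110.0]  # hypothetical prices S_0, ..., S_3
periods = [1, 2, 3]                 # hypothetical averaging periods
```

On this sample path the average over the three periods is 104, so the average price call with strike 100 pays 4 and the average strike call pays 6, while both puts pay nothing.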


Example 5.22. The payoff of a barrier option depends on whether the price of the underlying asset reaches a certain level before maturity. Most barrier options are either knock-out or knock-in options. A knock-in option pays off only if the barrier $B$ is reached. The simplest example is a digital option
$$C^{\text{dig}} := \begin{cases} 1 & \text{if } \max_{0 \le t \le T} S^i_t \ge B, \\ 0 & \text{otherwise,} \end{cases}$$
which has a unit payoff if the price process reaches a given upper barrier $B > S^i_0$. Another example is the down-and-in put with strike price $K$ and lower barrier $\tilde B < S^i_0$ which pays off
$$C^{\text{put}}_{\text{d\&i}} := \begin{cases} (K - S^i_T)^+ & \text{if } \min_{0 \le t \le T} S^i_t \le \tilde B, \\ 0 & \text{otherwise.} \end{cases}$$
A knock-out barrier option has a zero payoff once the price of the underlying asset reaches the predetermined barrier. For instance, an up-and-out call corresponds to the contingent claim
$$C^{\text{call}}_{\text{u\&o}} := \begin{cases} (S^i_T - K)^+ & \text{if } \max_{0 \le t \le T} S^i_t < B, \\ 0 & \text{otherwise;} \end{cases}$$
see Figure 5.1. Down-and-out and up-and-in options are defined analogously. ◊
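The barrier payoffs of Example 5.22 depend on the running maximum or minimum of the path. The following sketch again assumes that a path is a list of prices `path[0], ..., path[T]`; the function names are hypothetical.

```python
def digital(path, barrier):
    """Pays 1 if the running maximum reaches the upper barrier B."""
    return 1.0 if max(path) >= barrier else 0.0


def down_and_in_put(path, strike, barrier):
    """Put payoff (K - S_T)^+, active only if the running minimum reaches the lower barrier."""
    return max(strike - path[-1], 0.0) if min(path) <= barrier else 0.0


def up_and_out_call(path, strike, barrier):
    """Call payoff (S_T - K)^+, cancelled once the running maximum reaches the barrier B."""
    return max(path[-1] - strike, 0.0) if max(path) < barrier else 0.0
```

For instance, with strike 100 and barrier 110 the path `[100, 112, 105]` knocks the up-and-out call out, whereas the path `[100, 104, 105]` leaves it alive with payoff 5.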

Figure 5.1. In one scenario, the payoff of the up-and-out call becomes zero because the stock price hits the barrier $B$ before time $T$. In the other scenario, the payoff is given by $(S_T - K)^+$.


Example 5.23. Using a lookback option, one can trade the underlying asset at the maximal or minimal price that occurred during the life of the option. A lookback call has the payoff
$$S^i_T - \min_{0 \le t \le T} S^i_t,$$
while a lookback put corresponds to the contingent claim
$$\max_{0 \le t \le T} S^i_t - S^i_T.$$

The discounted value of a contingent claim $C$ when using the numéraire $S^0$ is given by
$$H := \frac{C}{S^0_T}.$$
We will call $H$ the discounted European claim or just the discounted claim associated with $C$. In the remainder of this text, "$H$" will be the generic notation for the discounted payoff of any type of contingent claim.

The reader may wonder why we work simultaneously with the notions of a contingent claim and a discounted claim. From a purely mathematical point of view, there would be no loss of generality in assuming that the numéraire asset is identically equal to one. In fact, the entire theory to be developed in Part II can be seen as a discrete-time "stochastic analysis" for the $d$-dimensional process $X = (X^1, \dots, X^d)$ and its "stochastic integrals"
$$\sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1})$$
of predictable $d$-dimensional processes $\xi$. However, some of the economic intuition would be lost if we would limit the discussion to this level. For instance, we have already seen the economic relevance of the particular choice of the numéraire, even though this choice may be irrelevant from the mathematician's point of view. As a compromise between the mathematician's preference for conciseness and the economist's concern for keeping track explicitly of economically relevant quantities, we develop the mathematics on the level of discounted prices, but we will continue to discuss definitions and results in terms of undiscounted prices whenever it seems appropriate.

From now on, we will assume that our market model is arbitrage-free or, equivalently, that
$$\mathcal{P} \ne \emptyset.$$


Definition 5.24. A contingent claim $C$ is called attainable (replicable, redundant) if there exists a self-financing trading strategy $\bar\xi$ whose terminal portfolio value coincides with $C$, i.e.,
$$C = \bar\xi_T \cdot \bar S_T \quad P\text{-a.s.}$$
Such a trading strategy $\bar\xi$ is called a replicating strategy for $C$.

Clearly, a contingent claim $C$ is attainable if and only if the corresponding discounted claim $H = C / S^0_T$ is of the form
$$H = \bar\xi_T \cdot \bar X_T = V_T = V_0 + \sum_{t=1}^{T} \xi_t \cdot (X_t - X_{t-1}),$$
for a self-financing trading strategy $\bar\xi = (\xi^0, \xi)$ with value process $V$. In this case, we will say that the discounted claim $H$ is attainable, and we will call $\bar\xi$ a replicating strategy for $H$. The following theorem yields the surprising result that an attainable discounted claim is automatically integrable with respect to every equivalent martingale measure. Note, however, that integrability may not hold for an attainable contingent claim prior to discounting.
Theorem 5.25. Any attainable discounted claim $H$ is integrable with respect to each equivalent martingale measure, i.e.,
$$E^*[\, H \,] < \infty \qquad \text{for all } P^* \in \mathcal{P}.$$
Moreover, for each $P^* \in \mathcal{P}$ the value process of any replicating strategy satisfies
$$V_t = E^*[\, H \mid \mathcal{F}_t \,] \quad P\text{-a.s. for } t = 0, \dots, T.$$
In particular, $V$ is a non-negative $P^*$-martingale.

Proof. This follows from $V_T = H \ge 0$ and the systems theorem in the form of Theorem 5.14.
Remark 5.26. The identity
$$V_t = E^*[\, H \mid \mathcal{F}_t \,], \qquad t = 0, \dots, T,$$
appearing in Theorem 5.25 has two remarkable implications. Since its right-hand side is independent of the particular replicating strategy, all such strategies must have the same value process. Moreover, the left-hand side does not depend on the choice of $P^* \in \mathcal{P}$. Hence, $V_t$ is a version of the conditional expectation $E^*[\, H \mid \mathcal{F}_t \,]$ for every $P^* \in \mathcal{P}$. In particular, $E^*[\, H \,]$ is the same for all $P^* \in \mathcal{P}$. ◊


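Remark 5.27. When applied to an attainable contingent claim $C$ prior to discounting, Theorem 5.25 states that
$$\bar\xi_t \cdot \bar S_t = S^0_t\, E^*\!\left[\, \frac{C}{S^0_T} \;\middle|\; \mathcal{F}_t \,\right], \qquad t = 0, \dots, T,$$
$P$-a.s. for all $P^* \in \mathcal{P}$ and for every replicating strategy $\bar\xi$. In particular, the initial investment which is needed for a replication of $C$ is given by
$$\bar\xi_1 \cdot \bar S_0 = S^0_0\, E^*\!\left[\, \frac{C}{S^0_T} \,\right]. \quad ◊$$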
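The statement $V_t = E^*[\, H \mid \mathcal{F}_t \,]$ can be illustrated in a two-period binomial model, which is an assumption made here for concreteness and is not part of the text at this point. The sketch computes the value process by backward induction under the martingale probability `q`, derives a hedge from the two successor nodes, and accumulates the trading gains along a path. Parameters and function names are hypothetical.

```python
from itertools import product

X0, u, d, T = 100.0, 1.2, 0.9, 2  # hypothetical binomial model, discounted prices
K = 100.0                         # hypothetical discounted strike
q = (1.0 - d) / (u - d)           # martingale probability of an up-move


def price(history):
    """Discounted price at the node reached by a tuple of 'u'/'d' moves."""
    x = X0
    for move in history:
        x *= u if move == "u" else d
    return x


def claim(path):
    """Discounted claim H = (X_T - K)^+."""
    return max(price(path) - K, 0.0)


def value(history):
    """V_t = E*[H | F_t] at the node `history`, by backward induction."""
    if len(history) == T:
        return claim(history)
    return q * value(history + ("u",)) + (1.0 - q) * value(history + ("d",))


def hedge(history):
    """Risky position for the next period, chosen at the node `history` (predictable)."""
    up, down = history + ("u",), history + ("d",)
    return (value(up) - value(down)) / (price(up) - price(down))


def replicated_payoff(path):
    """V_0 + sum_k xi_k (X_k - X_{k-1}) along one path."""
    total = value(())
    for t in range(T):
        h = path[:t]
        total += hedge(h) * (price(path[:t + 1]) - price(h))
    return total
```

On each of the four paths, `replicated_payoff(path)` should coincide with `claim(path)`, so the claim is attainable and its replication cost `value(())` equals the expectation of `H` under the martingale measure.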
Let us now turn to the problem of pricing a contingent claim. Consider first a discounted claim $H$ which is attainable. Then the (discounted) initial investment
$$\bar\xi_1 \cdot \bar X_0 = V_0 = E^*[\, H \,] \tag{5.17}$$
needed for the replication of $H$ can be interpreted as the unique (discounted) "fair price" of $H$. In fact, a different price for $H$ would create an arbitrage opportunity. For instance, if $H$ could be sold at time 0 for a price $\tilde\pi$ which is higher than (5.17), then selling $H$ and buying the replicating portfolio $\bar\xi$ yields the profit
$$\tilde\pi - \bar\xi_1 \cdot \bar X_0 > 0$$
at time 0, although the terminal portfolio value $V_T = \bar\xi_T \cdot \bar X_T$ suffices for settling the claim $H$ at maturity $T$. In order to make this idea precise, let us formalize the idea of an "arbitrage-free price" of a general discounted claim $H$.
Definition 5.28. A real number $\pi^H \ge 0$ is called an arbitrage-free price of a discounted claim $H$, if there exists an adapted stochastic process $X^{d+1}$ such that
$$X^{d+1}_0 = \pi^H, \qquad X^{d+1}_t \ge 0 \ \text{ for } t = 1, \dots, T-1, \qquad \text{and} \qquad X^{d+1}_T = H, \tag{5.18}$$
and such that the enlarged market model with price process $(X^0, X^1, \dots, X^d, X^{d+1})$ is arbitrage-free. The set of all arbitrage-free prices of $H$ is denoted by $\Pi(H)$. The lower and upper bounds of $\Pi(H)$ are denoted by
$$\pi_{\inf}(H) := \inf \Pi(H) \qquad \text{and} \qquad \pi_{\sup}(H) := \sup \Pi(H).$$

Thus, an arbitrage-free price $\pi^H$ of a discounted claim $H$ is by definition a price at which $H$ can be traded at time 0 without introducing arbitrage opportunities into the market model: If $H$ is sold for $\pi^H$, then neither buyer nor seller can find an investment strategy which both eliminates all the risk and yields an opportunity to make a positive profit. Our aim in this section is to characterize the set of all arbitrage-free prices of a discounted claim $H$.

Note that an arbitrage-free price $\pi^H$ is quoted in units of the numéraire asset. The amount that corresponds to $\pi^H$ in terms of currency units prior to discounting is equal to
$$\pi^C := S^0_0\, \pi^H,$$
and $\pi^C$ is an (undiscounted) arbitrage-free price of the contingent claim $C := S^0_T H$.
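Theorem 5.29. The set of arbitrage-free prices of a discounted claim $H$ is non-empty and given by
$$\Pi(H) = \{\, E^*[\, H \,] \mid P^* \in \mathcal{P} \text{ and } E^*[\, H \,] < \infty \,\}. \tag{5.19}$$
Moreover, the lower and upper bounds of $\Pi(H)$ are given by
$$\pi_{\inf}(H) = \inf_{P^* \in \mathcal{P}} E^*[\, H \,] \qquad \text{and} \qquad \pi_{\sup}(H) = \sup_{P^* \in \mathcal{P}} E^*[\, H \,].$$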

Proof. By Theorem 5.16, $\pi^H$ is an arbitrage-free price for $H$ if and only if we can find an equivalent martingale measure $\hat P$ for the market model extended via (5.18). $\hat P$ must satisfy
$$X^i_t = \hat E[\, X^i_T \mid \mathcal{F}_t \,] \qquad \text{for } t = 0, \dots, T \text{ and } i = 1, \dots, d+1.$$
In particular, $\hat P$ belongs to $\mathcal{P}$ and satisfies $\pi^H = \hat E[\, H \,]$. Thus, we obtain the inclusion $\subseteq$ in (5.19).

Conversely, if $\pi^H = E^*[\, H \,]$ for some $P^* \in \mathcal{P}$, then we can define the stochastic process
$$X^{d+1}_t := E^*[\, H \mid \mathcal{F}_t \,], \qquad t = 0, \dots, T,$$
which satisfies all the requirements of (5.18). Moreover, the same measure $P^*$ is clearly an equivalent martingale measure for the extended market model, which hence is arbitrage-free. Thus, we obtain the identity of the two sets in (5.19).

To show that $\Pi(H)$ is non-empty, we first fix some measure $\tilde P \approx P$ such that $\tilde E[\, H \,] < \infty$. For instance, we can take $d\tilde P = c(1 + H)^{-1}\, dP$, where $c$ is the normalizing constant. Under $\tilde P$, the market model is arbitrage-free. Hence, Theorem 5.16 yields $P^* \in \mathcal{P}$ such that $dP^*/d\tilde P$ is bounded. In particular, $E^*[\, H \,] < \infty$ and hence $E^*[\, H \,] \in \Pi(H)$.

The formula for $\pi_{\inf}(H)$ follows immediately from (5.19) and the fact that $\Pi(H) \ne \emptyset$. The one for $\pi_{\sup}(H)$ needs an additional argument. Suppose that $P^\infty \in \mathcal{P}$ is such that $E^\infty[\, H \,] = \infty$. We must show that for any $c > 0$ there exists some $\pi \in \Pi(H)$ with $\pi > c$. To this end, let $n$ be such that $\tilde\pi := E^\infty[\, H \wedge n \,] > c$, and define
$$X^{d+1}_t := E^\infty[\, H \wedge n \mid \mathcal{F}_t \,], \qquad t = 0, \dots, T.$$
Then $P^\infty$ is an equivalent martingale measure for the extended market model $(X^0, \dots, X^d, X^{d+1})$, which hence is arbitrage-free. Applying the already established fact that the set of arbitrage-free prices of any contingent claim is nonempty to the extended market model yields an equivalent martingale measure $P^*$ for $(X^0, \dots, X^d, X^{d+1})$ such that $E^*[\, H \,] < \infty$. Since $P^*$ is also a martingale measure for the original market model, the first part of this proof implies that $\pi := E^*[\, H \,] \in \Pi(H)$. Finally, note that
$$\pi = E^*[\, H \,] \ge E^*[\, H \wedge n \,] = E^*[\, X^{d+1}_T \,] = X^{d+1}_0 = \tilde\pi > c.$$
Hence, the formula for $\pi_{\sup}(H)$ is proved.
Example 5.30. In an arbitrage-free market model, we consider a European call option $C^{\text{call}} = (S_T^1 - K)^+$ with strike $K > 0$ and with maturity $T$. We assume that the numéraire $S^0$ is the predictable price process of a locally riskless bond as in Example 5.5. Then $S_t^0$ is increasing in $t$ and satisfies $S_0^0 \equiv 1$. For any $P^* \in \mathcal{P}$, Theorem 5.29 yields an arbitrage-free price $\pi^{\text{call}}$ of $C^{\text{call}}$ which is given by
$$\pi^{\text{call}} = E^*\Big[\frac{C^{\text{call}}}{S_T^0}\Big] = E^*\Big[\Big(X_T^1 - \frac{K}{S_T^0}\Big)^+\Big].$$
Due to the convexity of the function $x \mapsto x^+ = x \vee 0$ and our assumptions on $S^0$, $\pi^{\text{call}}$ can be bounded from below as follows:
$$\pi^{\text{call}} \ge \Big(E^*\Big[X_T^1 - \frac{K}{S_T^0}\Big]\Big)^+ = \Big(S_0^1 - E^*\Big[\frac{K}{S_T^0}\Big]\Big)^+ \ge (S_0^1 - K)^+. \tag{5.20}$$
In financial language, this fact is usually expressed by saying that the value of the option is higher than its "intrinsic value" $(S_0^1 - K)^+$, i.e., the payoff if the option were exercised immediately. The difference of the price $\pi^{\text{call}}$ of an option and its intrinsic value is often called the "time-value" of the European call option; see Figure 5.2. ◊
Example 5.31. For a European put option $C^{\text{put}} = (K - S_T^1)^+$, the situation is more complicated. If we consider the same situation as in Example 5.30, then the analogue of (5.20) fails unless the numéraire $S^0$ is constant. In fact, as a consequence of the put-call parity, the "time value" of a put option whose intrinsic value is large (i.e., the option is "in the money") usually becomes negative; see Figure 5.3. ◊
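The put-call parity invoked in Example 5.31 can be spelled out as follows. This is only a sketch under the assumptions of Example 5.30; the symbol $\pi^{\text{put}}$, denoting the price of the put computed under the same $P^*$ as $\pi^{\text{call}}$, is notation introduced here for illustration.

```latex
% Pathwise identity for the payoffs, then discounting and taking E^*,
% using E^*[X_T^1] = X_0^1 = S_0^1 (recall S_0^0 = 1):
(K - S_T^1)^+ = (S_T^1 - K)^+ - S_T^1 + K
\quad\Longrightarrow\quad
\pi^{\text{put}} = \pi^{\text{call}} - S_0^1 + K\,E^*\!\Big[\frac{1}{S_T^0}\Big].

% Time value of the put when it is in the money (S_0^1 \le K):
\pi^{\text{put}} - (K - S_0^1)
  = \pi^{\text{call}} - K\Big(1 - E^*\!\Big[\frac{1}{S_T^0}\Big]\Big).
```

If $S^0$ is increasing and not constant, then $E^*[1/S_T^0] < 1$, so the right-hand side is negative as soon as $\pi^{\text{call}}$ is smaller than $K(1 - E^*[1/S_T^0])$, which is what one expects when $S_0^1$ is far below $K$.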
Our next aim is to characterize the structure of the set of arbitrage-free prices of a discounted claim $H$. It follows from Theorem 5.29 that every arbitrage-free price $\pi^H$ of $H$ must lie between the two numbers
$$\pi_{\inf}(H) = \inf_{P^* \in \mathcal{P}} E^*[H] \quad\text{and}\quad \pi_{\sup}(H) = \sup_{P^* \in \mathcal{P}} E^*[H].$$
We also know that $\pi_{\inf}(H)$ and $\pi_{\sup}(H)$ are equal if $H$ is attainable. The following theorem shows that also the converse implication holds, i.e., $H$ is attainable if and only if $\pi_{\inf}(H) = \pi_{\sup}(H)$.

Figure 5.2. The typical price of a call option as a function of $S_0^1$ is always above the option's intrinsic value $(S_0^1 - K)^+$.

Theorem 5.32. Let H be a discounted claim.
(a) If $H$ is attainable, then the set $\Pi(H)$ of arbitrage-free prices for $H$ consists of the single element $V_0$, where $V$ is the value process of any replicating strategy for $H$.
(b) If $H$ is not attainable, then $\pi_{\inf}(H) < \pi_{\sup}(H)$ and
$$\Pi(H) = \big(\pi_{\inf}(H),\ \pi_{\sup}(H)\big).$$
Proof. The first assertion follows from Remark 5.26 and Theorem 5.29.
To prove (b), note first that
$$\Pi(H) = \{\, E^*[H] \mid P^* \in \mathcal{P},\ E^*[H] < \infty \,\}$$
is an interval because $\mathcal{P}$ is a convex set. We will show that $\Pi(H)$ is open by constructing for any $\pi \in \Pi(H)$ two arbitrage-free prices $\check{\pi}$ and $\hat{\pi}$ for $H$ such that $\check{\pi} < \pi < \hat{\pi}$. To this end, take $P^* \in \mathcal{P}$ such that $\pi = E^*[H]$. We will first construct an equivalent martingale measure $\hat{P} \in \mathcal{P}$ such that $\hat{E}[H] > E^*[H]$. Let
$$U_t := E^*[\,H \mid \mathcal{F}_t\,], \quad t = 0,\dots,T,$$
so that
$$H = U_0 + \sum_{t=1}^{T} (U_t - U_{t-1}).$$

Figure 5.3. The typical price of a European put option as a function of $S_0^1$ compared to the option's intrinsic value $(K - S_0^1)^+$.


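Since $H$ is not attainable, there must be some $t \in \{1,\dots,T\}$ such that $U_t - U_{t-1} \notin \mathcal{K}_t \cap L^1(P^*)$, where
$$\mathcal{K}_t := \{\, \eta \cdot (X_t - X_{t-1}) \mid \eta \in L^0(\Omega, \mathcal{F}_{t-1}, P; \mathbb{R}^d) \,\}.$$
By Lemma 1.69, $\mathcal{K}_t \cap L^1(P^*)$ is a closed linear subspace of $L^1(\Omega, \mathcal{F}_t, P^*)$. Therefore, Theorem A.57 applied with $\mathcal{B} := \{U_t - U_{t-1}\}$ and $\mathcal{C} := \mathcal{K}_t \cap L^1(P^*)$ yields some $Z \in L^\infty(\Omega, \mathcal{F}_t, P^*)$ such that
$$\sup\{\, E^*[\,WZ\,] \mid W \in \mathcal{K}_t \cap L^1(P^*) \,\} < E^*[\,(U_t - U_{t-1})\,Z\,] < \infty.$$
From the linearity of $\mathcal{K}_t \cap L^1(P^*)$ we deduce that
$$E^*[\,WZ\,] = 0 \quad\text{for all } W \in \mathcal{K}_t \cap L^1(P^*), \tag{5.21}$$
and hence that
$$E^*[\,(U_t - U_{t-1})\,Z\,] > 0. \tag{5.22}$$
There is no loss of generality in assuming that $|Z| \le 1/3$, so that
$$\hat{Z} := 1 + Z - E^*[\,Z \mid \mathcal{F}_{t-1}\,]$$
can be taken as the density $d\hat{P}/dP^* = \hat{Z}$ of a new probability measure $\hat{P} \approx P$.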


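Since $Z$ is $\mathcal{F}_t$-measurable, the expectation of $H$ under $\hat{P}$ satisfies
$$\begin{aligned}
\hat{E}[H] &= E^*[\,H\hat{Z}\,] \\
&= E^*[H] + E^*\big[\,E^*[\,H \mid \mathcal{F}_t\,]\,Z\,\big] - E^*\big[\,E^*[\,H \mid \mathcal{F}_{t-1}\,]\,E^*[\,Z \mid \mathcal{F}_{t-1}\,]\,\big] \\
&= E^*[H] + E^*[\,U_t Z\,] - E^*[\,U_{t-1} Z\,] \\
&> E^*[H],
\end{aligned}$$
where we have used (5.22) in the last step. On the other hand, $\hat{E}[H] \le \frac{5}{3}E^*[H] < \infty$. Thus, $\hat{\pi} := \hat{E}[H]$ will yield the desired arbitrage-free price larger than $\pi$ if we have $\hat{P} \in \mathcal{P}$.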
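Let us prove that $\hat{P} \in \mathcal{P}$. For $k > t$, the $\mathcal{F}_t$-measurability of $\hat{Z}$ and Proposition A.12 yield that
$$\hat{E}[\,X_k - X_{k-1} \mid \mathcal{F}_{k-1}\,] = E^*[\,X_k - X_{k-1} \mid \mathcal{F}_{k-1}\,] = 0.$$
For $k = t$, (5.21) yields $E^*[\,(X_t - X_{t-1})\,Z \mid \mathcal{F}_{t-1}\,] = 0$. Thus, it follows from $E^*[\,\hat{Z} \mid \mathcal{F}_{t-1}\,] = 1$ that
$$\begin{aligned}
\hat{E}[\,X_t - X_{t-1} \mid \mathcal{F}_{t-1}\,]
&= E^*\big[\,(X_t - X_{t-1})\big(1 - E^*[\,Z \mid \mathcal{F}_{t-1}\,]\big) \mid \mathcal{F}_{t-1}\,\big] + E^*[\,(X_t - X_{t-1})\,Z \mid \mathcal{F}_{t-1}\,] \\
&= 0.
\end{aligned}$$
Finally, if $k < t$ then $P^*$ and $\hat{P}$ coincide on $\mathcal{F}_k$. Hence
$$\hat{E}[\,X_k - X_{k-1} \mid \mathcal{F}_{k-1}\,] = E^*[\,X_k - X_{k-1} \mid \mathcal{F}_{k-1}\,] = 0,$$
and we may conclude that $\hat{P} \in \mathcal{P}$.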
It remains to construct another equivalent martingale measure $\check{P}$ such that
$$\check{\pi} := \check{E}[H] < E^*[H] = \pi. \tag{5.23}$$
But this is simply achieved by letting
$$\frac{d\check{P}}{dP^*} := 2 - \frac{d\hat{P}}{dP^*},$$
which defines a probability measure $\check{P} \approx P$, because the density $d\hat{P}/dP^*$ is bounded from above by $5/3$ and below by $1/3$. $\check{P} \in \mathcal{P}$ is then obvious as is (5.23).
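The dichotomy in Theorem 5.32 can be observed numerically in a very small example. The sketch below uses a hypothetical one-period trinomial model (the numbers, the function names, and the two claims are assumptions chosen for illustration, not taken from the text): an attainable claim has the same expectation under every equivalent martingale measure, whereas a non-attainable one produces prices that fill out an open interval.

```python
from fractions import Fraction

# Hypothetical one-period model (T = 1, d = 1), not taken from the text:
# discounted price X_0 = 1 and X_1 in {1/2, 1, 3/2}.
OUTCOMES = [Fraction(1, 2), Fraction(1), Fraction(3, 2)]


def martingale_weights(q):
    """Weights (down, middle, up) satisfying E*[X_1] = X_0 = 1.

    Solving the two linear constraints shows that every martingale measure
    has this form; it is equivalent to P exactly when 0 < q < 1/2.
    """
    return [q, 1 - 2 * q, q]


def expectation(q, payoff):
    """E*[payoff(X_1)] under the martingale measure with parameter q."""
    return sum(w * payoff(x) for w, x in zip(martingale_weights(q), OUTCOMES))


def call(x):
    """Discounted call with strike 1; not attainable in this model."""
    return max(x - 1, Fraction(0))


def linear(x):
    """Attainable claim 5 + 2 (X_1 - X_0) = 2 X_1 + 3."""
    return 2 * x + 3


# Grid of parameters strictly inside (0, 1/2).
qs = [Fraction(k, 100) for k in range(1, 50)]
call_prices = [expectation(q, call) for q in qs]
linear_prices = [expectation(q, linear) for q in qs]

print(min(call_prices), max(call_prices))   # stays strictly inside (0, 1/4)
print(set(linear_prices))                   # a single price
```

In this toy model the call prices equal $q/2$ with $q \in (0, 1/2)$, so the infimum $0$ and the supremum $1/4$ are approached but never attained, in line with part (b) of the theorem.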
Remark 5.33. So far, we have assumed that a contingent claim is settled at the terminal time $T$. A natural way of dealing with an $\mathcal{F}_{T_0}$-measurable payoff $C_0 \ge 0$ maturing at some time $T_0 < T$ is to apply our results to the corresponding discounted claim
$$H_0 := \frac{C_0}{S^0_{T_0}}$$
in the market model with the restricted time horizon $T_0$. Clearly, this restricted model is arbitrage-free. An alternative approach is to invest the payoff $C_0$ at time $T_0$ into the numéraire asset $S^0$. At time $T$, this yields the contingent claim
$$C := C_0 \cdot \frac{S^0_T}{S^0_{T_0}},$$
whose discounted claim
$$H = \frac{C}{S^0_T} = \frac{C_0}{S^0_{T_0}}$$
is formally identical to $H_0$. Moreover, our results can be directly applied to $H$. It is intuitively clear that these two approaches for determining the arbitrage-free prices of $C_0$ should be equivalent. A formal proof must show that the set $\Pi(H)$ is equal to the set
$$\Pi(H_0) := \{\, E_0^*[H_0] \mid P_0^* \in \mathcal{P}_0 \text{ and } E_0^*[H_0] < \infty \,\}$$
of arbitrage-free prices of $H_0$ in the market model whose time horizon is $T_0$. Here, $\mathcal{P}_0$ denotes the set of measures $P_0^*$ on $(\Omega, \mathcal{F}_{T_0})$ which are equivalent to $P$ on $\mathcal{F}_{T_0}$ and which are martingale measures for the restricted price process $(X_t)_{t=0,\dots,T_0}$. Clearly, each $P^* \in \mathcal{P}$ defines an element of $\mathcal{P}_0$ by restricting $P^*$ to the $\sigma$-algebra $\mathcal{F}_{T_0}$. In fact, Proposition 5.34 below shows that every element in $\mathcal{P}_0$ arises in this way. Thus, the two sets of arbitrage-free prices for $H$ and $H_0$ coincide, i.e.,
$$\Pi(H) = \Pi(H_0).$$
It follows, in particular, that $H_0$ is attainable if and only if $H$ is attainable.

Proposition 5.34. Consider the situation described in Remark 5.33 and let $P_0^* \in \mathcal{P}_0$ be given. Then there exists some $P^* \in \mathcal{P}$ whose restriction to $\mathcal{F}_{T_0}$ is equal to $P_0^*$.
Proof. Let $\hat{P} \in \mathcal{P}$ be arbitrary, and denote by $Z_{T_0}$ the density of $P_0^*$ with respect to the restriction of $\hat{P}$ to the $\sigma$-algebra $\mathcal{F}_{T_0}$. Then $Z_{T_0}$ is $\mathcal{F}_{T_0}$-measurable, and
$$dP^* := Z_{T_0}\,d\hat{P}$$
defines a probability measure on $\mathcal{F}$. Clearly, $P^*$ is equivalent to $\hat{P}$ and to $P$, and it coincides with $P_0^*$ on $\mathcal{F}_{T_0}$. To check that $P^* \in \mathcal{P}$, it suffices to show that $X_t - X_{t-1}$ is a martingale increment under $P^*$ for $t > T_0$. For these $t$, the density $Z_{T_0}$ is $\mathcal{F}_{t-1}$-measurable, so Proposition A.12 implies that
$$E^*[\,X_t - X_{t-1} \mid \mathcal{F}_{t-1}\,] = \hat{E}[\,X_t - X_{t-1} \mid \mathcal{F}_{t-1}\,] = 0.$$


Example 5.35. Let us consider the situation of Example 5.30, where the numéraire $S^0$ is a locally riskless bond. Remark 5.33 allows us to compare the arbitrage-free prices of two European call options $C_0 = (S^1_{T_0} - K)^+$ and $C = (S^1_T - K)^+$ with the same strikes and underlyings but with different maturities $T_0 < T$. As in Example 5.30, we get that for $P^* \in \mathcal{P}$
$$E^*\Big[\frac{C}{S^0_T} \,\Big|\, \mathcal{F}_{T_0}\Big] \ge \frac{1}{S^0_{T_0}}\Big(S^1_{T_0} - K\,E^*\Big[\frac{S^0_{T_0}}{S^0_T} \,\Big|\, \mathcal{F}_{T_0}\Big]\Big)^+ \ge \frac{C_0}{S^0_{T_0}}. \tag{5.24}$$
Hence, if $P^*$ is used to calculate arbitrage-free prices for $C_0$ and $C$, the resulting price of $C_0$ is lower than the price of $C$:
$$E^*\Big[\frac{C}{S^0_T}\Big] \ge E^*\Big[\frac{C_0}{S^0_{T_0}}\Big].$$
This argument suggests that the price of a European call option should be an increasing function of the maturity. ◊
Exercise 5.3.1. Let $Y_k$ $(k = 1,\dots,T)$ be independent identically distributed random variables in $L^1(\Omega, \mathcal{F}, P)$, and suppose that the $Y_k$ are not $P$-a.s. constant and satisfy $E[Y_k] = 0$. Let furthermore
$$X_t := X_0 + \sum_{k=1}^{t} Y_k, \quad t = 0,\dots,T.$$
Then $X$ is a $P$-martingale when we consider the filtration given by $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_t = \sigma(Y_1,\dots,Y_t)$ for $t = 1,\dots,T$. We now enlarge the filtration by adding "insider information" of the terminal value $X_T$. That is, we consider the enlarged filtration
$$\tilde{\mathcal{F}}_t := \sigma\big(\mathcal{F}_t \cup \sigma(X_T)\big), \quad t = 0,\dots,T.$$
(a) Show that $X$ is no longer a $P$-martingale with respect to $(\tilde{\mathcal{F}}_t)$.
(b) Prove that the process
$$\tilde{X}_t := X_t - \sum_{k=0}^{t-1} \frac{1}{T-k}\,(X_T - X_k)$$
is a $P$-martingale with respect to the enlarged filtration $(\tilde{\mathcal{F}}_t)$.


(c) The insider information of the terminal value $X_T$ implies the existence of self-financing strategies with positive expected profit. Construct a strategy $\xi^*$ that maximizes the expected profit
$$E\Big[\sum_{t=1}^{T} \xi_t\,(X_t - X_{t-1})\Big]$$
within the class of all $(\tilde{\mathcal{F}}_t)$-predictable strategies $\xi$ with $|\xi_t| \le 1$ $P$-a.s. for all $t$.

5.4 Complete markets

We have seen in Theorem 5.32 that any attainable claim in an arbitrage-free market
model has a unique arbitrage-free price. Thus, the situation becomes particularly
transparent if all contingent claims are attainable.
Definition 5.36. An arbitrage-free market model is called complete if every contingent claim is attainable.
Complete market models are precisely those models in which every contingent
claim has a unique and unambiguous arbitrage-free price. However, in discrete time,
only a very limited class of models enjoys this property. The following characterization of market completeness is sometimes called the second fundamental theorem of
asset pricing.
Theorem 5.37. An arbitrage-free market model is complete if and only if there exists exactly one equivalent martingale measure. In this case, the number of atoms in $(\Omega, \mathcal{F}_T, P)$ is bounded from above by $(d+1)^T$.
Proof. If the model is complete, then $H := \mathbb{1}_A$ for $A \in \mathcal{F}_T$ is an attainable discounted claim. It follows from the results of Section 5.3 that the mapping $P^* \mapsto E^*[H] = P^*[A]$ is constant over the set $\mathcal{P}$. Hence, there can be only one equivalent martingale measure.
Conversely, if $|\mathcal{P}| = 1$, then the set $\Pi(H)$ of arbitrage-free prices of every discounted claim $H$ has exactly one element. Hence, Theorem 5.32 implies that $H$ is attainable.
To prove the second assertion, note first that the asserted bound on the number of atoms in $\mathcal{F}_T$ holds for $T = 1$ by Corollary 1.42. We proceed by induction on $T$. Suppose that the assertion holds for $T - 1$. By assumption, any bounded $\mathcal{F}_T$-measurable random variable $H \ge 0$ can be written as
$$H = V_{T-1} + \xi_T \cdot (X_T - X_{T-1}),$$
where both $V_{T-1}$ and $\xi_T$ are $\mathcal{F}_{T-1}$-measurable and hence constant on each atom $A$ of $(\Omega, \mathcal{F}_{T-1}, P)$. It follows that the dimension of the linear space $L^\infty(\Omega, \mathcal{F}_T, P^*[\,\cdot \mid A\,])$ is less than or equal to $d + 1$. Thus, Proposition 1.41 implies that $(\Omega, \mathcal{F}_T, P^*[\,\cdot \mid A\,])$ has at most $d + 1$ atoms. Applying the induction hypothesis concludes the proof.
Below we state additional characterizations of market completeness. Denote by Q
the set of all martingale measures in the sense of Definition 5.13. Then both P and Q
are convex sets. Recall that an element of a convex set is called an extreme point of
this set if it cannot be written as a non-trivial convex combination of members of this
set.
Property (d) in the following theorem is usually called the predictable representation property, or the martingale representation property, of the $P^*$-martingale $X$.
Theorem 5.38. For $P^* \in \mathcal{P}$ the following conditions are equivalent:
(a) $\mathcal{P} = \{P^*\}$.
(b) $P^*$ is an extreme point of $\mathcal{P}$.
(c) $P^*$ is an extreme point of $\mathcal{Q}$.
(d) Every $P^*$-martingale $M$ can be represented as a "stochastic integral" of a $d$-dimensional predictable process $\xi$:
$$M_t = M_0 + \sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1}) \quad\text{for } t = 0,\dots,T.$$

Proof. (a) $\Rightarrow$ (c): If $P^*$ can be written as $P^* = \alpha Q_1 + (1-\alpha) Q_2$ for $\alpha \in (0,1)$ and $Q_1, Q_2 \in \mathcal{Q}$, then $Q_1$ and $Q_2$ are both absolutely continuous with respect to $P^*$. By defining
$$P_i := \tfrac{1}{2}(Q_i + P^*), \quad i = 1, 2,$$
we thus obtain two martingale measures $P_1$ and $P_2$ which are equivalent to $P^*$. Hence, $P_1 = P_2 = P^*$ and, in turn, $Q_1 = Q_2 = P^*$.
(c) $\Rightarrow$ (b): This is obvious since $\mathcal{P} \subset \mathcal{Q}$.
(b) $\Rightarrow$ (a): Suppose that there exists a $\hat{P} \in \mathcal{P}$ which is different from $P^*$. We will show below that in this case $\hat{P}$ can be chosen such that the density $d\hat{P}/dP^*$ is bounded by some constant $c > 0$. Then, if $\varepsilon > 0$ is less than $1/c$,
$$\frac{dP'}{dP^*} := 1 + \varepsilon - \varepsilon\,\frac{d\hat{P}}{dP^*}$$
defines another measure $P' \in \mathcal{P}$ different from $P^*$. Moreover, $P^*$ can be represented as the convex combination
$$P^* = \frac{\varepsilon}{1+\varepsilon}\,\hat{P} + \frac{1}{1+\varepsilon}\,P',$$
which contradicts condition (b). Hence, $P^*$ must be the unique equivalent martingale measure.
It remains to prove the existence of $\hat{P} \in \mathcal{P}$ with a bounded density $d\hat{P}/dP^*$ if there exists some $\tilde{P} \in \mathcal{P}$ which is different from $P^*$. Then there exists a set $A \in \mathcal{F}_T$ such that $P^*[A] \neq \tilde{P}[A]$. We enlarge our market model by introducing the additional asset
$$X_t^{d+1} := \tilde{P}[\,A \mid \mathcal{F}_t\,], \quad t = 0,\dots,T,$$
and we take $P^*$ instead of $P$ as our reference measure. By definition, $\tilde{P}$ is an equivalent martingale measure for $(X^0, X^1, \dots, X^d, X^{d+1})$. Hence, the extended market model is arbitrage-free, and Theorem 5.16 guarantees the existence of an equivalent martingale measure $\hat{P}$ such that the density $d\hat{P}/dP^*$ is bounded. Moreover, $\hat{P}$ must be different from $P^*$, since $P^*$ is not a martingale measure for $X^{d+1}$:
$$X_0^{d+1} = \tilde{P}[A] \neq P^*[A] = E^*[\,X_T^{d+1}\,].$$
(a) $\Rightarrow$ (d): The terminal value $M_T$ of a $P^*$-martingale $M$ can be decomposed into the difference of its positive and negative parts
$$M_T = M_T^+ - M_T^-.$$
$M_T^+$ and $M_T^-$ can be regarded as two discounted claims, which are attainable by Theorem 5.37. Hence, there exist two $d$-dimensional predictable processes $\xi^+$ and $\xi^-$ such that
$$M_T^{\pm} = V_0^{\pm} + \sum_{k=1}^{T} \xi_k^{\pm} \cdot (X_k - X_{k-1}) \quad P^*\text{-a.s.}$$
for two non-negative constants $V_0^+$ and $V_0^-$. Since the value processes
$$V_t^{\pm} := V_0^{\pm} + \sum_{k=1}^{t} \xi_k^{\pm} \cdot (X_k - X_{k-1})$$
are $P^*$-martingales by Theorem 5.25, we get that
$$M_t = E^*[\,M_T^+ - M_T^- \mid \mathcal{F}_t\,] = V_t^+ - V_t^-.$$
This proves that the desired representation of $M$ holds in terms of the $d$-dimensional predictable process $\xi := \xi^+ - \xi^-$.
(d) $\Rightarrow$ (a): Applying our assumption to the martingale $M_t := P^*[\,A \mid \mathcal{F}_t\,]$ shows that $H = \mathbb{1}_A$ is an attainable contingent claim. Hence, it follows from the results of Section 5.3 that the mapping $P^* \mapsto P^*[A]$ is constant over the set $\mathcal{P}$. Thus, there can be only one equivalent martingale measure.


Exercise 5.4.1. Consider the sample space
$$\Omega := \{-1,+1\}^T = \{\, \omega = (y_1,\dots,y_T) \mid y_i \in \{-1,+1\} \,\}$$
and denote by $Y_t(\omega) := y_t$, for $\omega = (y_1,\dots,y_T)$, the projection on the $t$-th coordinate. As filtration we take $\mathcal{F}_0 := \{\emptyset, \Omega\}$ and $\mathcal{F}_t := \sigma(Y_1,\dots,Y_t)$ for $t = 1,\dots,T$. We consider a financial market model with two assets such that the discounted price process $X_t := X_t^1 = S_t^1/S_t^0$ is of the form
$$X_t = X_0 \exp\Big(\sum_{k=1}^{t} (\sigma_k Y_k + m_k)\Big)$$
for a constant $X_0 > 0$ and two predictable processes $(\sigma_t)$ and $(m_t)$. We suppose that $0 \le |m_t| < \sigma_t$ for all $t$.
(a) Show that when $P$ is a probability measure on $\Omega$ for which $P[\{\omega\}] > 0$ for all $\omega$, then there exists a unique equivalent martingale measure $P^*$.
(b) By using the binary structure of the model, and without using Theorem 5.37, prove the following martingale representation result. If $P^*$ is as in (a), every $P^*$-martingale $M$ can be represented as
$$M_t = M_0 + \sum_{k=1}^{t} \xi_k\,(X_k - X_{k-1}), \quad t = 0,\dots,T,$$
where the predictable process $\xi$ is given by
$$\xi_k = \frac{M_k - M_{k-1}}{X_k - X_{k-1}}.$$

5.5 The binomial model

A complete financial market model with only one risky asset must have a binary tree
structure, as we have seen in Theorem 5.37. Under an additional homogeneity assumption, this reduces to the following particularly simple model, which was introduced by Cox, Ross, and Rubinstein in [58]. It involves the riskless bond
$$S_t^0 := (1+r)^t, \quad t = 0,\dots,T,$$
with $r > -1$ and one risky asset $S^1 = S$, whose return
$$R_t := \frac{S_t - S_{t-1}}{S_{t-1}}$$
in the $t$-th trading period can only take two possible values $a, b \in \mathbb{R}$ such that
$$-1 < a < b.$$


Thus, the stock price jumps from $S_{t-1}$ either to the higher value $S_t = S_{t-1}(1+b)$ or to the lower value $S_t = S_{t-1}(1+a)$. In this context, we are going to derive explicit formulas for the arbitrage-free prices and replicating strategies of various contingent claims.
Let us construct the model on the sample space
$$\Omega := \{-1,+1\}^T = \{\, \omega = (y_1,\dots,y_T) \mid y_i \in \{-1,+1\} \,\}.$$
Denote by
$$Y_t(\omega) := y_t \quad\text{for } \omega = (y_1,\dots,y_T) \tag{5.25}$$
the projection on the $t$-th coordinate, and let
$$R_t(\omega) := a\,\frac{1 - Y_t(\omega)}{2} + b\,\frac{1 + Y_t(\omega)}{2} = \begin{cases} a & \text{if } Y_t(\omega) = -1, \\ b & \text{if } Y_t(\omega) = +1. \end{cases}$$

The price process of the risky asset is modeled as
$$S_t := S_0 \prod_{k=1}^{t} (1 + R_k),$$
where the initial value $S_0 > 0$ is a given constant. The discounted price process takes the form
$$X_t = \frac{S_t}{S_t^0} = S_0 \prod_{k=1}^{t} \frac{1 + R_k}{1 + r}.$$
As filtration we take
$$\mathcal{F}_t := \sigma(S_0,\dots,S_t) = \sigma(X_0,\dots,X_t), \quad t = 0,\dots,T.$$
Note that $\mathcal{F}_0 = \{\emptyset, \Omega\}$, and
$$\mathcal{F}_t = \sigma(Y_1,\dots,Y_t) = \sigma(R_1,\dots,R_t) \quad\text{for } t = 1,\dots,T;$$
$\mathcal{F} := \mathcal{F}_T$ coincides with the power set of $\Omega$. Let us fix any probability measure $P$ on $(\Omega, \mathcal{F})$ such that
$$P[\{\omega\}] > 0 \quad\text{for all } \omega \in \Omega. \tag{5.26}$$
Such a model will be called a binomial model or a CRR model. The following theorem
characterizes those parameter values a; b; r for which the model is arbitrage-free.
Theorem 5.39. The CRR model is arbitrage-free if and only if $a < r < b$. In this case, the CRR model is complete, and there is a unique martingale measure $P^*$. The martingale measure is characterized by the fact that the random variables $R_1,\dots,R_T$ are independent under $P^*$ with common distribution
$$P^*[\,R_t = b\,] = p^* := \frac{r - a}{b - a}, \quad t = 1,\dots,T.$$


Proof. A measure $Q$ on $(\Omega, \mathcal{F})$ is a martingale measure if and only if the discounted price process is a martingale under $Q$, i.e.,
$$X_t = E_Q[\,X_{t+1} \mid \mathcal{F}_t\,] = X_t\,E_Q\Big[\frac{1 + R_{t+1}}{1 + r} \,\Big|\, \mathcal{F}_t\Big] \quad Q\text{-a.s.}$$
for all $t \le T - 1$. This identity is equivalent to the equation
$$r = E_Q[\,R_{t+1} \mid \mathcal{F}_t\,] = b \cdot Q[\,R_{t+1} = b \mid \mathcal{F}_t\,] + a \cdot \big(1 - Q[\,R_{t+1} = b \mid \mathcal{F}_t\,]\big),$$
i.e., to the condition
$$Q[\,R_{t+1} = b \mid \mathcal{F}_t\,](\omega) = p^* = \frac{r - a}{b - a} \quad\text{for } Q\text{-a.e. } \omega \in \Omega.$$
But this holds if and only if the random variables $R_1,\dots,R_T$ are independent under $Q$ with common distribution $Q[\,R_t = b\,] = p^*$. In particular, there can be at most one martingale measure for $X$.
If the market model is arbitrage-free, then there exists an equivalent martingale measure $P^*$. The condition $P^* \approx P$ implies
$$p^* = P^*[\,R_1 = b\,] \in (0,1),$$
which holds if and only if $a < r < b$.
Conversely, if $a < r < b$ then we can define a measure $P^* \approx P$ on $(\Omega, \mathcal{F})$ by
$$P^*[\{\omega\}] := (p^*)^k \cdot (1 - p^*)^{T-k} > 0,$$
where $k$ denotes the number of occurrences of $+1$ in $\omega = (y_1,\dots,y_T)$. Under $P^*$, $Y_1,\dots,Y_T$, and hence $R_1,\dots,R_T$, are independent random variables with common distribution $P^*[\,Y_t = 1\,] = P^*[\,R_t = b\,] = p^*$, and so $P^*$ is an equivalent martingale measure.
From now on, we consider only CRR models which are arbitrage-free, and we denote by $P^*$ the unique equivalent martingale measure.
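The characterization of the martingale measure in Theorem 5.39 translates directly into a one-line computation. The following minimal sketch (the function name and the parameter values are assumptions made for illustration) computes $p^* = (r-a)/(b-a)$ and checks the one-step martingale condition $E^*[R_t] = r$.

```python
def risk_neutral_probability(a, b, r):
    """Return p* = (r - a) / (b - a) for an arbitrage-free CRR model."""
    if not (-1 < a < r < b):
        raise ValueError("the CRR model is arbitrage-free only if -1 < a < r < b")
    return (r - a) / (b - a)


# Illustrative parameters (assumptions, not from the text).
a, b, r = -0.1, 0.2, 0.05
p_star = risk_neutral_probability(a, b, r)

# One-step martingale check: E*[R_t] = p* b + (1 - p*) a must equal r.
expected_return = p_star * b + (1 - p_star) * a
print(p_star, expected_return)
```

The guard on the parameters mirrors the condition $a < r < b$ of the theorem together with the standing assumption $-1 < a$.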
Remark 5.40. Note that the unique martingale measure $P^*$, and hence the valuation of any contingent claim, is completely independent of the initial choice of the "objective" measure $P$ within the class of measures satisfying (5.26). ◊
Let us now turn to the problem of pricing and hedging a given contingent claim $C$. The discounted claim $H = C/S_T^0$ can be written as
$$H = h(S_0,\dots,S_T)$$
for a suitable function $h$.


Proposition 5.41. The value process
$$V_t = E^*[\,H \mid \mathcal{F}_t\,], \quad t = 0,\dots,T,$$
of a replicating strategy for $H$ is of the form
$$V_t(\omega) = v_t\big(S_0, S_1(\omega),\dots,S_t(\omega)\big),$$
where the function $v_t$ is given by
$$v_t(x_0,\dots,x_t) = E^*\Big[\,h\Big(x_0,\dots,x_t,\ x_t\frac{S_1}{S_0},\dots,x_t\frac{S_{T-t}}{S_0}\Big)\Big]. \tag{5.27}$$
Proof. Clearly,
$$V_t = E^*\Big[\,h\Big(S_0, S_1,\dots,S_t,\ S_t\frac{S_{t+1}}{S_t},\dots,S_t\frac{S_T}{S_t}\Big) \,\Big|\, \mathcal{F}_t\Big].$$
Each quotient $S_{t+s}/S_t$ is independent of $\mathcal{F}_t$ and has under $P^*$ the same distribution as
$$\frac{S_s}{S_0} = \prod_{k=1}^{s} (1 + R_k).$$
Hence (5.27) follows from the standard properties of conditional expectations.


Since $V$ is characterized by the recursion
$$V_T := H \quad\text{and}\quad V_t = E^*[\,V_{t+1} \mid \mathcal{F}_t\,], \quad t = T-1,\dots,0,$$
we obtain a recursive formula for the functions $v_t$ defined in (5.27):
$$\begin{aligned}
v_T(x_0,\dots,x_T) &= h(x_0,\dots,x_T), \\
v_t(x_0,\dots,x_t) &= p^* \cdot v_{t+1}(x_0,\dots,x_t, x_t\hat{b}) + (1 - p^*) \cdot v_{t+1}(x_0,\dots,x_t, x_t\hat{a}),
\end{aligned} \tag{5.28}$$
where
$$\hat{a} := 1 + a \quad\text{and}\quad \hat{b} := 1 + b.$$
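The recursion (5.28) can be turned into a short backward-induction routine. The sketch below is only an illustration: the function `value`, the lookback-type payoff, and the parameters $a = -1/2$, $b = 1$, $r = 1/4$, $S_0 = 4$, $T = 2$ are assumptions chosen so that all quantities stay exact rational numbers. It enumerates the binary tree, so its cost grows like $2^T$ and it is meant for small $T$ only.

```python
from fractions import Fraction


def value(path, T, h, a_hat, b_hat, p_star):
    """Evaluate v_t(x_0, ..., x_t) by the recursion (5.28).

    `path` is the tuple (x_0, ..., x_t); `h` maps a full path
    (x_0, ..., x_T) to the discounted payoff.
    """
    if len(path) == T + 1:
        return h(path)                      # terminal condition v_T = h
    x_t = path[-1]
    up = value(path + (x_t * b_hat,), T, h, a_hat, b_hat, p_star)
    down = value(path + (x_t * a_hat,), T, h, a_hat, b_hat, p_star)
    return p_star * up + (1 - p_star) * down


# Illustrative parameters (assumptions): a = -1/2, b = 1, r = 1/4, so p* = 1/2.
T, r = 2, Fraction(1, 4)
a_hat, b_hat = Fraction(1, 2), Fraction(2)
p_star = (r - (a_hat - 1)) / (b_hat - a_hat)


def lookback_put(path):
    """Discounted payoff (max_t S_t - S_T) / (1 + r)^T of a lookback put."""
    return (max(path) - path[-1]) / (1 + r) ** T


price = value((Fraction(4),), T, lookback_put, a_hat, b_hat, p_star)
print(price)
```

Calling `value` with a longer initial path, for example `(Fraction(4), Fraction(8))`, returns the value $v_1$ at the corresponding node of the tree.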

Example 5.42. If $H = h(S_T)$ depends only on the terminal value $S_T$ of the stock price, then $V_t$ depends only on the value $S_t$ of the current stock price:
$$V_t(\omega) = v_t\big(S_t(\omega)\big).$$
Moreover, the formula (5.27) for $v_t$ reduces to an expectation with respect to the binomial distribution with parameter $p^*$:
$$v_t(x_t) = \sum_{k=0}^{T-t} h\big(x_t\,\hat{a}^{\,T-t-k}\,\hat{b}^{\,k}\big) \binom{T-t}{k} (p^*)^k (1 - p^*)^{T-t-k}.$$
In particular, the unique arbitrage-free price of $H$ is given by
$$\pi(H) = v_0(S_0) = \sum_{k=0}^{T} h\big(S_0\,\hat{a}^{\,T-k}\,\hat{b}^{\,k}\big) \binom{T}{k} (p^*)^k (1 - p^*)^{T-k}.$$
For $h(x) = (x - K)^+/(1+r)^T$ or $h(x) = (K - x)^+/(1+r)^T$, we obtain explicit formulas for the arbitrage-free prices of European call or put options with strike price $K$. For instance, the price of $H^{\text{call}} := (S_T - K)^+/(1+r)^T$ is given by
$$\pi(H^{\text{call}}) = \frac{1}{(1+r)^T} \sum_{k=0}^{T} \big(S_0\,\hat{a}^{\,T-k}\,\hat{b}^{\,k} - K\big)^+ \binom{T}{k} (p^*)^k (1 - p^*)^{T-k}. \qquad ◊$$
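The binomial sum for the call price in Example 5.42 is straightforward to evaluate. The following sketch is one possible implementation in floating-point arithmetic; the function name `crr_call_price` and the sample parameters are assumptions for illustration.

```python
from math import comb


def crr_call_price(S0, K, T, a, b, r):
    """Price of a European call in the CRR model via the binomial sum."""
    if not (-1 < a < r < b):
        raise ValueError("need -1 < a < r < b")
    p_star = (r - a) / (b - a)
    a_hat, b_hat = 1 + a, 1 + b
    total = 0.0
    for k in range(T + 1):                      # k = number of up-moves
        terminal = S0 * a_hat ** (T - k) * b_hat ** k
        weight = comb(T, k) * p_star ** k * (1 - p_star) ** (T - k)
        total += max(terminal - K, 0.0) * weight
    return total / (1 + r) ** T


# Illustrative parameters (assumptions, not from the text).
price = crr_call_price(S0=4.0, K=5.0, T=2, a=-0.5, b=1.0, r=0.25)
print(price)
```

With these parameters only the path with two up-moves ends in the money, so the sum has a single non-zero term; for large $T$ one would typically work with logarithms of the weights to avoid underflow.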

Example 5.43. Denote by
$$M_t := \max_{0 \le s \le t} S_s, \quad 0 \le t \le T,$$
the running maximum of $S$, and consider a discounted claim $H = h(S_T, M_T)$. For instance, $H$ can be an up-and-in or up-and-out barrier option or a lookback put. Then the value process of $H$ is of the form
$$V_t = v_t(S_t, M_t),$$
where
$$v_t(x_t, m_t) = E^*\Big[\,h\Big(x_t\frac{S_{T-t}}{S_0},\ m_t \vee \Big(x_t\frac{M_{T-t}}{S_0}\Big)\Big)\Big].$$
This follows from (5.27) or directly from the fact that
$$M_T = M_t \vee \Big(S_t \max_{t \le u \le T} \frac{S_u}{S_t}\Big),$$
where $\max_{t \le u \le T} S_u/S_t$ is independent of $\mathcal{F}_t$ and has the same law as $M_{T-t}/S_0$ under $P^*$. The same argument works for options that depend on the minimum of the stock price such as lookback calls or down-and-in barrier options. ◊
Exercise 5.5.1. For an Asian option depending on the average price
$$S_{\text{av}} := \frac{1}{|\mathbb{T}|} \sum_{t \in \mathbb{T}} S_t$$
during a predetermined set of periods $\mathbb{T} \subset \{0,\dots,T\}$, we introduce the process
$$A_t := \sum_{s \in \mathbb{T},\ s \le t} S_s.$$
Show that the value process $V_t$ of the Asian option is a function of $S_t$, $A_t$, and $t$. ◊


Let us now derive a formula for the hedging strategy $\bar{\xi} = (\xi^0, \xi)$ of our discounted claim $H = h(S_0,\dots,S_T)$.
Proposition 5.44. The hedging strategy is given by
$$\xi_t(\omega) = \Delta_t\big(S_0, S_1(\omega),\dots,S_{t-1}(\omega)\big),$$
where
$$\Delta_t(x_0,\dots,x_{t-1}) := (1+r)^t\,\frac{v_t(x_0,\dots,x_{t-1}, x_{t-1}\hat{b}) - v_t(x_0,\dots,x_{t-1}, x_{t-1}\hat{a})}{x_{t-1}\hat{b} - x_{t-1}\hat{a}}.$$
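Proof. For each $\omega = (y_1,\dots,y_T)$, $\xi_t$ must satisfy
$$\xi_t(\omega)\big(X_t(\omega) - X_{t-1}(\omega)\big) = V_t(\omega) - V_{t-1}(\omega). \tag{5.29}$$
In this equation, the random variables $\xi_t$, $X_{t-1}$, and $V_{t-1}$ depend only on the first $t-1$ components of $\omega$. For a fixed $t$, let us define $\omega^+$ and $\omega^-$ by
$$\omega^{\pm} := (y_1,\dots,y_{t-1}, \pm 1, y_{t+1},\dots,y_T).$$
Plugging $\omega^+$ and $\omega^-$ into (5.29) shows
$$\begin{aligned}
\xi_t(\omega) \cdot \big(X_{t-1}(\omega)\,\hat{b}\,(1+r)^{-1} - X_{t-1}(\omega)\big) &= V_t(\omega^+) - V_{t-1}(\omega), \\
\xi_t(\omega) \cdot \big(X_{t-1}(\omega)\,\hat{a}\,(1+r)^{-1} - X_{t-1}(\omega)\big) &= V_t(\omega^-) - V_{t-1}(\omega).
\end{aligned}$$
Solving for $\xi_t(\omega)$ and using our formula (5.28) for $V_t$, we obtain
$$\xi_t(\omega) = (1+r)\,\frac{V_t(\omega^+) - V_t(\omega^-)}{X_{t-1}(\omega)(\hat{b} - \hat{a})} = \Delta_t\big(S_0, S_1(\omega),\dots,S_{t-1}(\omega)\big).$$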
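For a claim of the form $H = h(S_T)$, the hedge ratio of Proposition 5.44 only requires the value function of Example 5.42 at the two possible successor prices. The sketch below computes it for a call and checks the replication identity (5.29) on the first trading period; the helper names `v` and `delta` and the parameters ($S_0 = 4$, $K = 5$, $T = 2$, $\hat{a} = 1/2$, $\hat{b} = 2$, $r = 1/4$) are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def v(t, x, T, h, a_hat, b_hat, p_star):
    """Value function v_t(x) for a claim H = h(S_T), as in Example 5.42."""
    n = T - t
    return sum(
        h(x * a_hat ** (n - k) * b_hat ** k)
        * comb(n, k) * p_star ** k * (1 - p_star) ** (n - k)
        for k in range(n + 1)
    )


def delta(t, x_prev, T, h, a_hat, b_hat, p_star, r):
    """Hedge ratio Delta_t(x_{t-1}) for a terminal-value claim."""
    up = v(t, x_prev * b_hat, T, h, a_hat, b_hat, p_star)
    down = v(t, x_prev * a_hat, T, h, a_hat, b_hat, p_star)
    return (1 + r) ** t * (up - down) / (x_prev * b_hat - x_prev * a_hat)


# Illustrative parameters (assumptions): S_0 = 4, K = 5, T = 2.
S0, K, T, r = Fraction(4), Fraction(5), 2, Fraction(1, 4)
a_hat, b_hat = Fraction(1, 2), Fraction(2)
p_star = (1 + r - a_hat) / (b_hat - a_hat)


def h(x):
    """Discounted call payoff."""
    return max(x - K, Fraction(0)) / (1 + r) ** T


xi_1 = delta(1, S0, T, h, a_hat, b_hat, p_star, r)
V0 = v(0, S0, T, h, a_hat, b_hat, p_star)

# Replication check (5.29) on the first period, for both branches.
for factor in (b_hat, a_hat):
    X1 = S0 * factor / (1 + r)               # discounted price after one step
    V1 = v(1, S0 * factor, T, h, a_hat, b_hat, p_star)
    assert xi_1 * (X1 - S0) == V1 - V0

print(xi_1, V0)
```

Exact rational arithmetic is used so that the replication identity can be tested with `==` rather than with a tolerance.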

Remark 5.45. The term $\Delta_t$ may be viewed as a discrete "derivative" of the value function $v_t$ with respect to the possible stock price changes. In financial language, a hedging strategy based on a derivative of the value process is often called a Delta hedge. ◊
Remark 5.46. Let $H = h(S_T)$ be a discounted claim which depends on the terminal value of $S$ by way of an increasing function $h$. For instance, $h$ can be the discounted payoff function $h(x) = (x - K)^+/(1+r)^T$ of a European call option. Then
$$v_t(x) = E^*[\,h(x\,S_{T-t}/S_0)\,]$$
is also increasing in $x$, and so the hedging strategy satisfies
$$\xi_t(\omega) = (1+r)^t\,\frac{v_t\big(S_{t-1}(\omega)\,\hat{b}\big) - v_t\big(S_{t-1}(\omega)\,\hat{a}\big)}{S_{t-1}(\omega)\,\hat{b} - S_{t-1}(\omega)\,\hat{a}} \ge 0.$$
In other words, the hedging strategy for $H$ does not involve short sales of the risky asset. ◊


Exercise 5.5.2. Let $T_0 \in \{1,\dots,T-1\}$ and $K > 0$. The payoff of a forward starting call option has the form
$$\Big(\frac{S_T}{S_{T_0}} - K\Big)^+.$$
Determine its arbitrage-free price and hedging strategy. ◊

5.6 Exotic derivatives

The recursion formula (5.28) can be used for the numeric computation of the value process of any contingent claim. For the value processes of certain exotic derivatives which depend on the maximum of the stock price, it is even possible to obtain simple closed-form solutions if we make the additional assumption that
$$\hat{a} = \frac{1}{\hat{b}},$$
where $\hat{a} = 1 + a$ and $\hat{b} = 1 + b$. In this case, the price process of the risky asset is of the form
$$S_t(\omega) = S_0\,\hat{b}^{\,Z_t(\omega)},$$
where, for $Y_k$ as in (5.25),
$$Z_0 := 0 \quad\text{and}\quad Z_t := Y_1 + \cdots + Y_t, \quad t = 1,\dots,T.$$

Let $P$ denote the uniform distribution
$$P[\{\omega\}] := \frac{1}{|\Omega|} = 2^{-T}, \quad \omega \in \Omega.$$
Under the measure $P$, the random variables $Y_t$ are independent with common distribution $P[\,Y_t = +1\,] = \frac{1}{2}$. Thus, the stochastic process $Z$ becomes a standard random walk under $P$. Therefore,
$$P[\,Z_t = k\,] = \begin{cases} 2^{-t}\dbinom{t}{\frac{t+k}{2}} & \text{if } t + k \text{ is even}, \\ 0 & \text{otherwise}. \end{cases} \tag{5.30}$$
The following lemma is the key to numerous explicit results on the distribution of $Z$ under the measure $P$; see, e.g., Chapter III of [110]. For its statement, it will be convenient to assume that the random walk $Z$ is defined up to time $T + 1$; this can always be achieved by enlarging our probability space $(\Omega, \mathcal{F})$. We denote by
$$M_t := \max_{0 \le s \le t} Z_s$$
the running maximum of $Z$.


Lemma 5.47 (Reflection principle). For all $k \in \mathbb{N}$ and $l \in \mathbb{N}_0$,
$$P[\,M_T \ge k \text{ and } Z_T = k - l\,] = P[\,Z_T = k + l\,],$$
and
$$P[\,M_T = k \text{ and } Z_T = k - l\,] = 2\,\frac{k + l + 1}{T + 1}\,P[\,Z_{T+1} = 1 + k + l\,].$$

Proof. Let
$$\tau(\omega) := \inf\{\, t \ge 0 \mid Z_t(\omega) = k \,\} \wedge T.$$
For $\omega = (y_1,\dots,y_T) \in \Omega$ we define $\phi(\omega)$ by $\phi(\omega) = \omega$ if $\tau(\omega) = T$ and by
$$\phi(\omega) = (y_1,\dots,y_{\tau(\omega)}, -y_{\tau(\omega)+1},\dots,-y_T)$$
otherwise, i.e., if the level $k$ is reached before the deadline $T$. Intuitively, the two trajectories $(Z_t(\omega))_{t=0,\dots,T}$ and $(Z_t(\phi(\omega)))_{t=0,\dots,T}$ coincide up to $\tau(\omega)$, but from then on the latter path is obtained by reflecting the original one on the horizontal axis at level $k$; see Figure 5.4.

Figure 5.4. The reflection principle.

Let $A_{k,l}$ denote the set of all $\omega \in \Omega$ such that $Z_T(\omega) = k - l$ and $M_T \ge k$. Then $\phi$ is a bijection from $A_{k,l}$ to the set
$$\{\, M_T \ge k \text{ and } Z_T = k + l \,\},$$
which coincides with $\{Z_T = k + l\}$, due to our assumption $l \ge 0$. Hence, the uniform distribution $P$ must assign the same probability to $A_{k,l}$ and $\{Z_T = k + l\}$, and we obtain our first formula.
The second formula is trivial in case $T + k + l$ is not even. Otherwise, we let $j := (T + k + l)/2$ and apply (5.30) together with part one of this lemma:
$$\begin{aligned}
P[\,M_T = k,\ Z_T = k - l\,]
&= P[\,M_T \ge k,\ Z_T = k - l\,] - P[\,M_T \ge k + 1,\ Z_T = k - l\,] \\
&= P[\,Z_T = k + l\,] - P[\,Z_T = k + l + 2\,] \\
&= 2^{-T}\binom{T}{j} - 2^{-T}\binom{T}{j+1} \\
&= 2^{-T}\binom{T+1}{j+1}\,\frac{2j + 1 - T}{T + 1},
\end{aligned}$$
and this expression is equal to the right-hand side of our second formula.
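Both identities of Lemma 5.47 can be confirmed by brute force for small $T$, since the uniform measure just counts paths. The following sketch enumerates all $2^T$ paths and compares the counts with formula (5.30); the function names are assumptions made for illustration, and the check is a sanity test for small horizons rather than a proof.

```python
from fractions import Fraction
from itertools import product
from math import comb


def joint_counts(T):
    """Count paths of length T by (running maximum M_T, endpoint Z_T)."""
    counts = {}
    for omega in product((-1, 1), repeat=T):
        z = m = 0                      # Z_0 = 0, so M_T >= 0
        for y in omega:
            z += y
            m = max(m, z)
        counts[(m, z)] = counts.get((m, z), 0) + 1
    return counts


def prob_Z(t, k):
    """P[Z_t = k] under the uniform measure, formula (5.30)."""
    if abs(k) > t or (t + k) % 2:
        return Fraction(0)
    return Fraction(comb(t, (t + k) // 2), 2 ** t)


def check_reflection(T):
    """Return True if both identities of the lemma hold for all k, l."""
    counts = joint_counts(T)
    total = 2 ** T
    for k in range(1, T + 1):
        for l in range(0, T + k + 1):
            z = k - l
            at_least = sum(c for (m, zz), c in counts.items() if m >= k and zz == z)
            exactly = counts.get((k, z), 0)
            first_ok = Fraction(at_least, total) == prob_Z(T, k + l)
            second_ok = Fraction(exactly, total) == (
                Fraction(2 * (k + l + 1), T + 1) * prob_Z(T + 1, 1 + k + l)
            )
            if not (first_ok and second_ok):
                return False
    return True


print(check_reflection(6))
```

Values of $l$ beyond $T + k$ are omitted because both sides of each identity then vanish.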
Formula (5.30) will change if we replace the uniform distribution $P$ by our martingale measure $P^*$, described in Theorem 5.39:
$$P^*[\,Z_t = k\,] = \begin{cases} (p^*)^{\frac{t+k}{2}} (1 - p^*)^{\frac{t-k}{2}} \dbinom{t}{\frac{t+k}{2}} & \text{if } t + k \text{ is even}, \\ 0 & \text{otherwise}. \end{cases}$$
Let us now show how the reflection principle carries over to $P^*$.


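Lemma 5.48 (Reflection principle for $P^*$). For all $k \in \mathbb{N}$ and $l \in \mathbb{N}_0$,
$$\begin{aligned}
P^*[\,M_T \ge k,\ Z_T = k - l\,]
&= \Big(\frac{1 - p^*}{p^*}\Big)^{l} P^*[\,Z_T = k + l\,] \\
&= \Big(\frac{p^*}{1 - p^*}\Big)^{k} P^*[\,Z_T = -k - l\,],
\end{aligned}$$
and
$$\begin{aligned}
P^*[\,M_T = k,\ Z_T = k - l\,]
&= \frac{1}{p^*}\Big(\frac{1 - p^*}{p^*}\Big)^{l}\,\frac{k + l + 1}{T + 1}\,P^*[\,Z_{T+1} = 1 + k + l\,] \\
&= \frac{1}{1 - p^*}\Big(\frac{p^*}{1 - p^*}\Big)^{k}\,\frac{k + l + 1}{T + 1}\,P^*[\,Z_{T+1} = -1 - k - l\,].
\end{aligned}$$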


Proof. We show first that the density of $P^*$ with respect to $P$ is given by
$$\frac{dP^*}{dP} = 2^T \cdot (p^*)^{\frac{T + Z_T}{2}} (1 - p^*)^{\frac{T - Z_T}{2}}.$$
Indeed, $P^*$ puts the weight
$$P^*[\{\omega\}] = (p^*)^k (1 - p^*)^{T-k}$$
to each $\omega = (y_1,\dots,y_T) \in \Omega$ which contains exactly $k$ components with $y_i = +1$. But for such an $\omega$ we have $Z_T(\omega) = k - (T - k) = 2k - T$, and our formula follows.
From the density formula, we get
$$P^*[\,M_T \ge k \text{ and } Z_T = k - l\,] = 2^T (p^*)^{\frac{T + k - l}{2}} (1 - p^*)^{\frac{T + l - k}{2}}\,P[\,M_T \ge k \text{ and } Z_T = k - l\,].$$
Applying the reflection principle and using again the density formula, we see that the probability term on the right is equal to
$$P[\,Z_T = k + l\,] = 2^{-T} (p^*)^{-\frac{T + k + l}{2}} (1 - p^*)^{-\frac{T - k - l}{2}}\,P^*[\,Z_T = k + l\,],$$
which gives the first identity. The proof of the remaining ones is analogous.
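Example 5.49 (Up-and-in call option). Consider an up-and-in call option with payoff
$$C^{\text{call}}_{\text{u\&i}} = \begin{cases} (S_T - K)^+ & \text{if } \max_{0 \le t \le T} S_t \ge B, \\ 0 & \text{otherwise}, \end{cases}$$
where $B > S_0 \vee K$ denotes a given barrier, and where $K > 0$ is the strike price. Our aim is to compute the arbitrage-free price
$$\pi(C^{\text{call}}_{\text{u\&i}}) = \frac{1}{(1+r)^T}\,E^*[\,C^{\text{call}}_{\text{u\&i}}\,].$$
Clearly,
$$\begin{aligned}
E^*[\,C^{\text{call}}_{\text{u\&i}}\,]
&= E^*\Big[\,(S_T - K)^+;\ \max_{0 \le t \le T} S_t \ge B\,\Big] \\
&= E^*[\,(S_T - K)^+;\ S_T \ge B\,] + E^*\Big[\,(S_T - K)^+;\ \max_{0 \le t \le T} S_t \ge B,\ S_T < B\,\Big].
\end{aligned}$$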

The first expectation on the right can be computed explicitly in terms of the binomial
distribution. Thus, it remains to compute the second expectation, which we denote
by I . To this end, we may assume without loss of generality that B lies within the


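range of possible asset prices, i.e., there exists some $k \in \mathbb{N}$ such that $B = S_0\,\hat{b}^{\,k}$. Then, by Lemma 5.48,
$$\begin{aligned}
I &= \sum_{l \ge 1} E^*[\,(S_T - K)^+;\ M_T \ge k,\ Z_T = k - l\,] \\
&= \sum_{l \ge 1} \big(S_0\,\hat{b}^{\,k-l} - K\big)^+\,P^*[\,M_T \ge k,\ Z_T = k - l\,] \\
&= \sum_{l \ge 1} \big(S_0\,\hat{b}^{\,k-l} - K\big)^+ \Big(\frac{p^*}{1 - p^*}\Big)^{k} P^*[\,Z_T = -k - l\,] \\
&= \Big(\frac{p^*}{1 - p^*}\Big)^{k}\,\hat{b}^{\,2k} \sum_{l \ge 1} \big(S_0\,\hat{b}^{\,-k-l} - \tilde{K}\big)^+\,P^*[\,Z_T = -k - l\,] \\
&= \Big(\frac{p^*}{1 - p^*}\Big)^{k} \Big(\frac{B}{S_0}\Big)^{2} E^*[\,(S_T - \tilde{K})^+;\ S_T < \tilde{B}\,],
\end{aligned}$$
where
$$\tilde{K} = K\,\hat{b}^{\,-2k} = K\Big(\frac{S_0}{B}\Big)^{2} \quad\text{and}\quad \tilde{B} := \frac{S_0^2}{B}.$$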
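Hence, we obtain the formula
$$\pi(C^{\text{call}}_{\text{u\&i}}) = \frac{1}{(1+r)^T}\Big( E^*[\,(S_T - K)^+;\ S_T \ge B\,] + \Big(\frac{p^*}{1 - p^*}\Big)^{k} \Big(\frac{B}{S_0}\Big)^{2} E^*[\,(S_T - \tilde{K})^+;\ S_T < \tilde{B}\,] \Big).$$
Both expectations on the right now only involve the binomial distribution with parameters $p^*$ and $T$. They can be computed as in Example 5.42: writing $S_T = S_0\,\hat{b}^{\,T-2n}$, where $n$ is the number of downward moves, the event $\{S_T \ge B\}$ corresponds to $T - 2n \ge k$ and the event $\{S_T < \tilde{B}\}$ to $T - 2n < -k$. We thus get the explicit formula
$$\begin{aligned}
\pi(C^{\text{call}}_{\text{u\&i}}) = \frac{1}{(1+r)^T}\Bigg[ &\sum_{n=0}^{n_k} \big(S_0\,\hat{b}^{\,T-2n} - K\big)^+ (p^*)^{T-n} (1 - p^*)^{n} \binom{T}{T-n} \\
&+ \Big(\frac{p^*}{1 - p^*}\Big)^{k} \Big(\frac{B}{S_0}\Big)^{2} \sum_{n=m_k}^{T} \big(S_0\,\hat{b}^{\,T-2n} - \tilde{K}\big)^+ (p^*)^{T-n} (1 - p^*)^{n} \binom{T}{T-n} \Bigg],
\end{aligned}$$
where $n_k$ is the largest integer $n$ such that $T - 2n \ge k$ and $m_k$ is the smallest integer $n$ such that $T - 2n < -k$; a sum over an empty range of indices is understood to be zero. ◊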

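Example 5.50 (Up-and-out call option). Consider an up-and-out call option with payoff
$$C^{\text{call}}_{\text{u\&o}} = \begin{cases} 0 & \text{if } \max_{0 \le t \le T} S_t \ge B, \\ (S_T - K)^+ & \text{otherwise}, \end{cases}$$
where $K > 0$ is the strike price and $B > S_0 \vee K$ is an upper barrier for the stock price. As in the preceding example, we assume that $B = S_0 (1 + b)^k$ for some $k \in \mathbb{N}$. Let
$$C^{\text{call}} := (S_T - K)^+$$
denote the corresponding "plain vanilla call", whose arbitrage-free price is given by
$$\pi(C^{\text{call}}) = \frac{1}{(1+r)^T}\,E^*[\,(S_T - K)^+\,].$$
Since $C^{\text{call}} = C^{\text{call}}_{\text{u\&o}} + C^{\text{call}}_{\text{u\&i}}$, we get from Example 5.49 that
$$\begin{aligned}
\pi(C^{\text{call}}_{\text{u\&o}}) &= \pi(C^{\text{call}}) - \pi(C^{\text{call}}_{\text{u\&i}}) \\
&= \frac{1}{(1+r)^T}\Big( E^*[\,(S_T - K)^+;\ S_T < B\,] - \Big(\frac{p^*}{1 - p^*}\Big)^{k} \Big(\frac{B}{S_0}\Big)^{2} E^*[\,(S_T - \tilde{K})^+;\ S_T < \tilde{B}\,] \Big),
\end{aligned}$$
where $\tilde{K} = K S_0^2/B^2$ and $\tilde{B} := S_0^2/B$. These expectations can be computed as in Example 5.49. ◊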
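The barrier formulas of Examples 5.49 and 5.50 can be cross-checked against a direct enumeration of all paths. The sketch below does this in exact rational arithmetic for one hypothetical parameter set ($T = 4$, $k = 2$, $S_0 = 100$, $K = 90$, $\hat{b} = 5/4$, $r = 0$); these values and all function names are assumptions made for illustration. The closed form is implemented in its expectation version, i.e., as the sum of $E^*[(S_T - K)^+; S_T \ge B]$ and the reflected term.

```python
from fractions import Fraction
from itertools import product
from math import comb


def brute_force(T, S0, b_hat, p, payoff):
    """E*[payoff(S_T, max_t S_t)] by enumerating all 2^T paths (a_hat = 1/b_hat)."""
    total = Fraction(0)
    for omega in product((-1, 1), repeat=T):
        z = m = 0
        for y in omega:
            z += y
            m = max(m, z)
        ups = (T + z) // 2
        weight = p ** ups * (1 - p) ** (T - ups)
        total += weight * payoff(S0 * b_hat ** z, S0 * b_hat ** m)
    return total


def expect_terminal(T, S0, b_hat, p, f):
    """E*[f(S_T)] using the binomial law of Z_T = T - 2n."""
    return sum(
        comb(T, n) * p ** (T - n) * (1 - p) ** n * f(S0 * b_hat ** (T - 2 * n))
        for n in range(T + 1)           # n = number of down-moves
    )


# Illustrative parameters (assumptions, not from the text).
T, k = 4, 2
S0, K, r = Fraction(100), Fraction(90), Fraction(0)
b_hat = Fraction(5, 4)
a_hat = 1 / b_hat
p = (1 + r - a_hat) / (b_hat - a_hat)
B = S0 * b_hat ** k
K_tilde, B_tilde = K * (S0 / B) ** 2, S0 ** 2 / B
zero = Fraction(0)


def call(s):
    return max(s - K, zero)


knock_in = brute_force(T, S0, b_hat, p, lambda s, m: call(s) if m >= B else zero)
knock_out = brute_force(T, S0, b_hat, p, lambda s, m: call(s) if m < B else zero)
vanilla = expect_terminal(T, S0, b_hat, p, call)

closed_form = (
    expect_terminal(T, S0, b_hat, p, lambda s: call(s) if s >= B else zero)
    + (p / (1 - p)) ** k * (B / S0) ** 2
    * expect_terminal(T, S0, b_hat, p,
                      lambda s: max(s - K_tilde, zero) if s < B_tilde else zero)
)

discount = (1 + r) ** T
print(knock_in / discount, closed_form / discount, knock_out / discount)
```

Besides the agreement of `knock_in` and `closed_form`, the decomposition $C^{\text{call}} = C^{\text{call}}_{\text{u\&o}} + C^{\text{call}}_{\text{u\&i}}$ shows up as `knock_in + knock_out == vanilla`.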
Exercise 5.6.1. Derive a formula for the arbitrage-free price of a down-and-in put option with payoff
$$C^{\text{put}}_{\text{d\&i}} = \begin{cases} 0 & \text{if } \min_{0 \le t \le T} S_t > B, \\ (K - S_T)^+ & \text{otherwise}, \end{cases}$$
where $K > 0$ is the strike price and $B < S_0$ is a lower barrier for the stock price.
Then compute the price of the option for the following specific parameter values:
$$T = 3, \quad S_0 = 100, \quad a = -0.1, \quad r = 0.05, \quad B = 70, \quad K = 90.$$

In the following example, we compute the price of a lookback put option.
Example 5.51 (Lookback put option). A lookback put option corresponds to the contingent claim
$$C^{\text{put}}_{\max} := \max_{0 \le t \le T} S_t - S_T;$$
see Example 5.23. In the CRR model, the discounted arbitrage-free price of $C^{\text{put}}_{\max}$ is given by
$$\pi(C^{\text{put}}_{\max}) = \frac{1}{(1+r)^T}\,E^*\Big[\max_{0 \le t \le T} S_t\Big] - S_0.$$


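The expectation of the maximum can be computed as
\[
E^*\Big[\max_{0\le t\le T} S_t\Big] = S_0\sum_{k=0}^{T}\hat b^{\,k}\,P^*[\,M_T = k\,].
\]
Lemma 5.48 yields
\[
\begin{aligned}
P^*[\,M_T=k\,] &= \sum_{l\ge 0} P^*[\,M_T=k,\ Z_T=k-l\,]\\
&= \sum_{l\ge 0}\Big(\frac{p^*}{1-p^*}\Big)^k\frac{1}{1-p^*}\,\frac{k+l+1}{T+1}\,P^*[\,Z_{T+1} = -1-k-l\,]\\
&= \Big(\frac{p^*}{1-p^*}\Big)^k\frac{1}{1-p^*}\,\frac{1}{T+1}\,E^*[\,-Z_{T+1};\ Z_{T+1}\le -1-k\,].
\end{aligned}
\]
Thus, we arrive at the formula
\[
\pi(C^{\mathrm{put}}_{\max}) + S_0 = \frac{S_0}{(1+r)^T(1-p^*)(T+1)}\sum_{k=0}^{T}\hat b^{\,k}\Big(\frac{p^*}{1-p^*}\Big)^k E^*[\,-Z_{T+1};\ Z_{T+1}\le -1-k\,].
\]
As before, one can give explicit formulas for the expectations occurring on the right. ◊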
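The lookback formula can be checked against a direct enumeration of paths. The sketch below assumes a CRR model with $\hat a = 1/\hat b$, so that $S_t = S_0\hat b^{Z_t}$ and $\max_t S_t = S_0\hat b^{M_T}$; the function names are hypothetical. The second function evaluates the sum over $k$ using the binomial law of $Z_{T+1}$.

```python
from itertools import product
from math import comb


def risk_neutral_p(b_hat, r):
    """Up-probability p* in a CRR model whose down factor is 1 / b_hat."""
    a_hat = 1.0 / b_hat
    return (1.0 + r - a_hat) / (b_hat - a_hat)


def lookback_put_brute_force(S0, b_hat, r, T):
    """Discounted expectation of max_t S_t - S_T over all 2**T paths."""
    p = risk_neutral_p(b_hat, r)
    total = 0.0
    for steps in product((1, -1), repeat=T):
        z, running_max, prob = 0, 0, 1.0
        for s in steps:
            z += s
            running_max = max(running_max, z)
            prob *= p if s == 1 else 1.0 - p
        total += prob * S0 * (b_hat**running_max - b_hat**z)
    return total / (1.0 + r) ** T


def lookback_put_formula(S0, b_hat, r, T):
    """Closed formula: sum over k of b_hat**k (p/(1-p))**k E*[-Z_{T+1}; Z_{T+1} <= -1-k]."""
    p = risk_neutral_p(b_hat, r)
    q = 1.0 - p
    total = 0.0
    for k in range(T + 1):
        tail = 0.0
        for n in range(T + 2):  # n = number of down moves among T + 1 steps
            z = T + 1 - 2 * n
            if z <= -1 - k:
                tail += -z * comb(T + 1, n) * p ** (T + 1 - n) * q**n
        total += (b_hat * p / q) ** k * tail
    return S0 * total / ((1.0 + r) ** T * q * (T + 1)) - S0
```

For $T = 1$ both functions reduce to $S_0(p^*\hat b + 1 - p^*)/(1+r) - S_0$, which is a quick sanity check on the indexing.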
Exercise 5.6.2. Derive a formula for the price of a lookback call option with payoff
\[
S_T - \min_{0\le t\le T} S_t.
\]

5.7 Convergence to the Black-Scholes price

In practice, a huge number of trading periods may occur between the current time $t=0$ and the maturity $T$ of a European contingent claim. Thus, the computation of option prices in terms of some martingale measure may become rather elaborate. On the other hand, one can hope that the pricing formulas in discrete time converge to a transparent limit as the number of intermediate trading periods grows larger and larger. In this section, we will formulate conditions under which such a convergence occurs.
Throughout this section, $T$ will not denote the number of trading periods in a fixed discrete-time market model but rather a physical date. The time interval $[0,T]$ will be divided into $N$ equidistant time steps $\frac{T}{N}, \frac{2T}{N},\dots,\frac{NT}{N}$, and the date $\frac{kT}{N}$ will correspond to the $k$th trading period of an arbitrage-free market model. For simplicity, we will assume that each market model contains a riskless bond and just one risky asset. In the $N$th approximation, the risky asset will be denoted by $S^{(N)}$, and the riskless bond will be defined by a constant interest rate $r_N > -1$.
The question is whether the prices of contingent claims in the approximating market models converge as $N$ tends to infinity. Since the terminal values of the riskless bonds should converge, we assume that
\[
\lim_{N\uparrow\infty}(1+r_N)^N = e^{rT},
\]
where $r$ is a finite constant. This condition is in fact equivalent to the following one:
\[
\lim_{N\uparrow\infty} N r_N = rT.
\]

Let us now consider the risky assets. We assume that the initial prices $S_0^{(N)}$ do not depend on $N$, i.e., $S_0^{(N)} = S_0$ for some constant $S_0>0$. The prices $S_k^{(N)}$ are random variables on some probability space $(\Omega_N,\mathcal F^{(N)},P_N^*)$, where $P_N^*$ is a risk-neutral measure for each approximating market model, i.e., the discounted price process
\[
X_k^{(N)} := \frac{S_k^{(N)}}{(1+r_N)^k},\qquad k=0,\dots,N,
\]
is a $P_N^*$-martingale with respect to the filtration $\mathcal F_k^{(N)} := \sigma(S_1^{(N)},\dots,S_k^{(N)})$. Our remaining conditions will be stated in terms of the returns
\[
R_k^{(N)} := \frac{S_k^{(N)}-S_{k-1}^{(N)}}{S_{k-1}^{(N)}},\qquad k=1,\dots,N.
\]
First, we assume that, for each $N$, the random variables $R_1^{(N)},\dots,R_N^{(N)}$ are independent under $P_N^*$ and satisfy
\[
-1<\alpha_N\le R_k^{(N)}\le\beta_N,\qquad k=1,\dots,N,
\]
for constants $\alpha_N$ and $\beta_N$ such that
\[
\lim_{N\uparrow\infty}\alpha_N = \lim_{N\uparrow\infty}\beta_N = 0.
\]
Second, we assume that the variances $\operatorname{var}_N(R_k^{(N)})$ under $P_N^*$ are such that
\[
\sigma_N^2 := \frac{1}{T}\sum_{k=1}^{N}\operatorname{var}_N(R_k^{(N)}) \longrightarrow \sigma^2\in(0,\infty).
\]
The following result can be regarded as a multiplicative version of the central limit theorem.

Theorem 5.52. Under the above assumptions, the distributions of $S_N^{(N)}$ under $P_N^*$ converge weakly to the log-normal distribution with parameters $\log S_0 + rT - \frac12\sigma^2T$ and $\sigma\sqrt T$, i.e., to the distribution of
\[
S_T := S_0\exp\Big(\sigma W_T + \big(r-\tfrac12\sigma^2\big)T\Big), \tag{5.31}
\]
where $W_T$ has a centered normal law $N(0,T)$ with variance $T$.
Proof. We may assume without loss of generality that $S_0=1$. Consider the Taylor expansion
\[
\log(1+x) = x - \frac12x^2 + \rho(x)\,x^2 \tag{5.32}
\]
where the remainder term $\rho$ is such that
\[
|\rho(x)|\le\delta(\alpha,\beta)\qquad\text{for } -1<\alpha\le x\le\beta,
\]
and where $\delta(\alpha,\beta)\to0$ for $\alpha,\beta\to0$. Applied to
\[
S_N^{(N)} = \prod_{k=1}^{N}\big(1+R_k^{(N)}\big),
\]
this yields
\[
\log S_N^{(N)} = \sum_{k=1}^{N}\Big(R_k^{(N)} - \frac12\big(R_k^{(N)}\big)^2\Big) + \Delta_N,
\]
where
\[
|\Delta_N|\le\delta(\alpha_N,\beta_N)\sum_{k=1}^{N}\big(R_k^{(N)}\big)^2.
\]
Since $P_N^*$ is a martingale measure, we have $E_N^*[\,R_k^{(N)}\,] = r_N$, and it follows that
\[
E_N^*[\,|\Delta_N|\,]\le\delta(\alpha_N,\beta_N)\sum_{k=1}^{N}\big(\operatorname{var}_N(R_k^{(N)}) + r_N^2\big)\longrightarrow0.
\]
In particular, $\Delta_N\to0$ in probability, and the corresponding laws converge weakly to the Dirac measure $\delta_0$. Slutsky's theorem, as stated in Appendix A.6, asserts that it suffices to show that the distributions of
\[
Z_N := \sum_{k=1}^{N}\Big(R_k^{(N)} - \frac12\big(R_k^{(N)}\big)^2\Big) =: \sum_{k=1}^{N}Y_k^{(N)}
\]
converge weakly to the normal law $N(rT-\frac12\sigma^2T,\ \sigma^2T)$. To this end, we will check that the conditions of the central limit theorem in the form of Theorem A.37 are satisfied.
Note that
\[
\max_{1\le k\le N}|Y_k^{(N)}|\le\gamma_N+\frac12\gamma_N^2\longrightarrow0
\]
for $\gamma_N := |\alpha_N|\vee|\beta_N|$, and that
\[
E_N^*[\,Z_N\,] = Nr_N - \frac12\big(\sigma_N^2T + Nr_N^2\big)\longrightarrow rT-\frac12\sigma^2T.
\]
Finally,
\[
\operatorname{var}_N(Z_N)\longrightarrow\sigma^2T,
\]
since for $p>2$
\[
\sum_{k=1}^{N}E_N^*[\,|R_k^{(N)}|^p\,]\le\gamma_N^{\,p-2}\sum_{k=1}^{N}E_N^*[\,(R_k^{(N)})^2\,]\longrightarrow0.
\]
Thus, the conditions of Theorem A.37 are satisfied.


Remark 5.53. The assumption of independent returns in Theorem 5.52 can be relaxed. Instead of Theorem A.37, we can apply a central limit theorem for martingales under suitable assumptions on the behavior of the conditional variances
\[
\operatorname{var}_N\big(R_k^{(N)}\,\big|\,\mathcal F_{k-1}^{(N)}\big);
\]
for details see, e.g., Section 9.3 of [54].

Example 5.54. Suppose the approximating model in the $N$th stage is a CRR model with interest rate
\[
r_N = \frac{rT}{N},
\]
and with returns $R_k^{(N)}$, which can take the two possible values $a_N$ and $b_N$; see Section 5.5. We assume that
\[
\hat a_N = 1+a_N = e^{-\sigma\sqrt{T/N}}\qquad\text{and}\qquad\hat b_N = 1+b_N = e^{\sigma\sqrt{T/N}}
\]
for some given $\sigma>0$. Since
\[
\sqrt N\,r_N\to0,\qquad\sqrt N\,a_N\to-\sigma\sqrt T,\qquad\sqrt N\,b_N\to\sigma\sqrt T\qquad\text{as } N\uparrow\infty, \tag{5.33}
\]
we have $a_N<r_N<b_N$ for large enough $N$. Theorem 5.39 yields that the $N$th model is arbitrage-free and admits a unique equivalent martingale measure $P_N^*$. The measure $P_N^*$ is characterized by
\[
P_N^*[\,R_k^{(N)} = b_N\,] =: p_N^* = \frac{r_N-a_N}{b_N-a_N},
\]
and we obtain from (5.33) that
\[
\lim_{N\uparrow\infty}p_N^* = \frac12.
\]
Moreover, $E_N^*[\,R_k^{(N)}\,] = r_N$, and we get
\[
\sum_{k=1}^{N}\operatorname{var}_N(R_k^{(N)}) = N\big(p_N^*b_N^2 + (1-p_N^*)a_N^2 - r_N^2\big)\longrightarrow\sigma^2T
\]
as $N\uparrow\infty$. Hence, the assumptions of Theorem 5.52 are satisfied.
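The two limits used in Example 5.54 are easy to inspect numerically. The following sketch (the function name and the parameter values in the loop are illustrative assumptions, not from the text) computes $p_N^*$ and $\sum_k \operatorname{var}_N(R_k^{(N)})$ for the CRR parameters of the example.

```python
from math import expm1, sqrt


def crr_stage(N, r, sigma, T):
    """Return (p_N, sum of the N return variances) for the N-th CRR model."""
    h = sigma * sqrt(T / N)
    a_N = expm1(-h)   # a_N = exp(-sigma * sqrt(T/N)) - 1
    b_N = expm1(h)    # b_N = exp(+sigma * sqrt(T/N)) - 1
    r_N = r * T / N
    p_N = (r_N - a_N) / (b_N - a_N)
    var_sum = N * (p_N * b_N**2 + (1.0 - p_N) * a_N**2 - r_N**2)
    return p_N, var_sum


if __name__ == "__main__":
    for N in (10, 1_000, 100_000):
        p_N, var_sum = crr_stage(N, r=0.05, sigma=0.2, T=1.0)
        print(N, p_N, var_sum)  # p_N approaches 1/2, var_sum approaches sigma**2 * T
```

`expm1` is used instead of `exp(h) - 1` only to limit cancellation errors for very small step sizes.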

Let us consider a derivative which is defined in terms of a function $f\ge0$ of the risky asset's terminal value. In each approximating model, this corresponds to a contingent claim
\[
C^{(N)} = f\big(S_N^{(N)}\big).
\]
Corollary 5.55. If $f$ is bounded and continuous, the arbitrage-free prices of $C^{(N)}$ calculated under $P_N^*$ converge to a discounted expectation with respect to a log-normal distribution, which is often called the Black-Scholes price. More precisely,
\[
\lim_{N\uparrow\infty}E_N^*\Big[\frac{C^{(N)}}{(1+r_N)^N}\Big] = e^{-rT}E^*[\,f(S_T)\,]
= \frac{e^{-rT}}{\sqrt{2\pi}}\int_{-\infty}^{\infty}f\big(S_0e^{\sigma\sqrt T\,y+rT-\sigma^2T/2}\big)e^{-y^2/2}\,dy, \tag{5.34}
\]
where $S_T$ has the form (5.31) under $P^*$.

This convergence result applies in particular to the choice $f(x) = (K-x)^+$ corresponding to a European put option with strike $K$. Since the put-call parity
\[
E_N^*\Big[\frac{(S_N^{(N)}-K)^+}{(1+r_N)^N}\Big] = E_N^*\Big[\frac{(K-S_N^{(N)})^+}{(1+r_N)^N}\Big] + S_0 - \frac{K}{(1+r_N)^N}
\]
holds for each $N$, the convergence (5.34) is also true for a European call option with the unbounded payoff profile $f(x) = (x-K)^+$.

Example 5.56 (Black-Scholes formula for the price of a call option). The limit of the arbitrage-free prices of $C^{(N)} = (S_N^{(N)}-K)^+$ is given by $v(S_0,T)$, where
\[
v(x,T) = \frac{e^{-rT}}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\big(xe^{\sigma\sqrt T\,y+rT-\sigma^2T/2}-K\big)^+e^{-y^2/2}\,dy.
\]
The integrand on the right vanishes for
\[
y\le-\frac{\log\frac{x}{K}+(r-\frac12\sigma^2)T}{\sigma\sqrt T} =: -d_-(x,T) =: -d_-.
\]
Let us also define
\[
d_+ := d_+(x,T) := d_-(x,T)+\sigma\sqrt T = \frac{\log\frac{x}{K}+(r+\frac12\sigma^2)T}{\sigma\sqrt T},
\]
and let us denote by $\Phi(x) = (2\pi)^{-1/2}\int_{-\infty}^{x}e^{-y^2/2}\,dy$ the distribution function of the standard normal distribution. Then
\[
v(x,T) = \frac{x}{\sqrt{2\pi}}\int_{-d_-}^{+\infty}e^{-(y-\sigma\sqrt T)^2/2}\,dy - e^{-rT}K\big(1-\Phi(-d_-)\big),
\]
and we arrive at the Black-Scholes formula for the price of a European call option with strike $K$ and maturity $T$:
\[
v(x,T) = x\,\Phi\big(d_+(x,T)\big) - e^{-rT}K\,\Phi\big(d_-(x,T)\big). \tag{5.35}
\]
See Figure 5.5 for the plot of the function $v(x,t)$. ◊
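The convergence of the CRR call prices from Example 5.54 to the Black-Scholes formula (5.35) can be observed numerically. The sketch below is one possible implementation, with hypothetical function names; binomial weights are evaluated in log-space via `lgamma`, which is a numerical convenience and not part of the text's argument.

```python
from math import erf, exp, lgamma, log, sqrt


def norm_cdf(x):
    """Distribution function Phi of the standard normal law."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(x, K, r, sigma, T):
    """Formula (5.35): x * Phi(d_+) - exp(-rT) * K * Phi(d_-)."""
    d_plus = (log(x / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d_minus = d_plus - sigma * sqrt(T)
    return x * norm_cdf(d_plus) - exp(-r * T) * K * norm_cdf(d_minus)


def crr_call(S0, K, r, sigma, T, N):
    """Call price in the N-th CRR model of Example 5.54."""
    h = sigma * sqrt(T / N)
    r_N = r * T / N
    p = (1.0 + r_N - exp(-h)) / (exp(h) - exp(-h))
    total = 0.0
    for n in range(N + 1):  # n = number of up moves
        log_weight = (lgamma(N + 1) - lgamma(n + 1) - lgamma(N - n + 1)
                      + n * log(p) + (N - n) * log(1.0 - p))
        terminal_price = S0 * exp((2 * n - N) * h)
        total += exp(log_weight) * max(terminal_price - K, 0.0)
    return total / (1.0 + r_N) ** N


if __name__ == "__main__":
    target = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
    for N in (10, 100, 1000):
        print(N, crr_call(100.0, 100.0, 0.05, 0.2, 1.0, N) - target)
```

The printed differences shrink as $N$ grows, although not necessarily monotonically, since binomial prices typically oscillate around the limit.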

Remark 5.57. For fixed $x$ and $T$, the Black-Scholes price of a European call option increases to the upper arbitrage bound $x$ as $\sigma\uparrow\infty$. In the limit $\sigma\downarrow0$, we obtain the lower arbitrage bound $(x-e^{-rT}K)^+$; see Remark 1.37. ◊
The following proposition gives a criterion for the convergence (5.34) in case $f$ is not necessarily bounded and continuous. It applies in particular to $f(x) = (x-K)^+$, and so we get an alternative proof for the convergence of call option prices to the Black-Scholes price.
Proposition 5.58. Let $f:(0,\infty)\to\mathbb R$ be measurable, continuous a.e., and such that $|f(x)|\le c\,(1+x)^q$ for some $c\ge0$ and $0\le q<2$. Then
\[
E_N^*[\,f(S_N^{(N)})\,]\longrightarrow E^*[\,f(S_T)\,],
\]
where $S_T$ has the form (5.31) under $P^*$.

Figure 5.5. The Black-Scholes price $v(x,t)$ of a European call option $(S_T-K)^+$ plotted as a function of the initial spot price $x = S_0$ and the time to maturity $t$.

Proof. Let us note first that by the Taylor expansion (5.32)
\[
\begin{aligned}
\log E_N^*[\,(S_N^{(N)})^2\,] &= \log\prod_{k=1}^{N}\Big(\operatorname{var}_N\big(1+R_k^{(N)}\big) + E_N^*[\,1+R_k^{(N)}\,]^2\Big)\\
&= \sum_{k=1}^{N}\log\Big(\operatorname{var}_N(R_k^{(N)}) + (1+r_N)^2\Big)\\
&\le\sigma_N^2T + 2Nr_N + Nr_N^2 + \tilde c\sum_{k=1}^{N}\Big(\operatorname{var}_N(R_k^{(N)}) + 2|r_N| + r_N^2\Big)^2
\end{aligned}
\]
for a finite constant $\tilde c$. Thus,
\[
\sup_N E_N^*[\,(S_N^{(N)})^2\,]<\infty.
\]
With this property established, the assertion follows immediately from Theorem 5.52 and the Corollaries A.46 and A.47, but we also give the following more elementary proof. To this end, we may assume that $q>0$, and we define $p := 2/q>1$. Then
\[
\sup_N E_N^*[\,|f(S_N^{(N)})|^p\,]\le c^p\sup_N E_N^*[\,(1+S_N^{(N)})^2\,]<\infty,
\]
and the assertion follows from Lemma 5.59 below.


Lemma 5.59. Suppose $(\mu_N)_{N\in\mathbb N}$ is a sequence of probability measures on $\mathbb R$ converging weakly to $\mu$. If $f$ is a measurable and $\mu$-a.e. continuous function on $\mathbb R$ such that
\[
c := \sup_{N\in\mathbb N}\int|f|^p\,d\mu_N<\infty\qquad\text{for some } p>1,
\]
then
\[
\int f\,d\mu_N\longrightarrow\int f\,d\mu.
\]
Proof. We may assume without loss of generality that $f\ge0$. Then $f_k := f\wedge k$ is a bounded and $\mu$-a.e. continuous function for each $k>0$. Clearly,
\[
\int f\,d\mu_N = \int f_k\,d\mu_N + \int(f-k)^+\,d\mu_N.
\]
Due to part (e) of the portmanteau theorem in the form of Theorem A.39, the first integral on the right converges to $\int f_k\,d\mu$ as $N\uparrow\infty$. Let us consider the second term on the right:
\[
\int(f-k)^+\,d\mu_N\le\int_{\{f>k\}}f\,d\mu_N\le\frac{1}{k^{p-1}}\int f^{p-1}f\,d\mu_N\le\frac{c}{k^{p-1}},
\]
uniformly in $N$. Hence,
\[
\int f_k\,d\mu = \lim_{N\uparrow\infty}\int f_k\,d\mu_N\le\liminf_{N\uparrow\infty}\int f\,d\mu_N\le\limsup_{N\uparrow\infty}\int f\,d\mu_N\le\int f_k\,d\mu+\frac{c}{k^{p-1}}.
\]
Letting $k\uparrow\infty$, we have $\int f_k\,d\mu\nearrow\int f\,d\mu$, and convergence follows.

Let us now continue the discussion of the Black-Scholes price of a European call option where $f(x) = (x-K)^+$. We are particularly interested in how it depends on the various model parameters. The dependence on the spot price $S_0 = x$ can be analyzed via the $x$-derivatives of the function $v(x,t)$ appearing in the Black-Scholes formula (5.35). The first derivative
\[
\Delta(x,t) := \frac{\partial}{\partial x}v(x,t) = \Phi\big(d_+(x,t)\big) \tag{5.36}
\]
is called the option's Delta; see Figure 5.6. In analogy to the formula for the hedging strategy in the binomial model obtained in Proposition 5.44, $\Delta(x,t)$ determines the "Delta hedging portfolio" needed for a replication of the call option in continuous time, as explained in (5.45) below.
The Gamma of the call option is given by
\[
\Gamma(x,t) := \frac{\partial}{\partial x}\Delta(x,t) = \frac{\partial^2}{\partial x^2}v(x,t) = \varphi\big(d_+(x,t)\big)\frac{1}{x\sigma\sqrt t}; \tag{5.37}
\]
see Figure 5.7. Here $\varphi(x) = \Phi'(x) = e^{-x^2/2}/\sqrt{2\pi}$ stands as usual for the density of the standard normal distribution. Large Gamma values occur in regions where the Delta changes rapidly, corresponding to the need for frequent readjustments of the Delta hedging portfolio. Note that $\Gamma$ is always strictly positive. It follows that $v(x,t)$ is a strictly convex function of its first argument.

Figure 5.6. The Delta $\Delta(x,t)$ of the Black-Scholes price of a European call option.

Figure 5.7. The option's Gamma $\Gamma(x,t)$.

Exercise 5.7.1. Prove the formulas (5.36) and (5.37) for Delta and Gamma of a European call option. ◊

Remark 5.60. On the one hand, $0\le\Delta(x,t)\le1$ implies that
\[
|v(x,t)-v(y,t)|\le|x-y|.
\]
Thus, the total change of the option values is always less than a corresponding change in the asset prices. On the other hand, the strict convexity of $x\mapsto v(x,t)$ together with (A.1) yields that for $t>0$ and $z>y$
\[
\frac{v(z,t)-v(y,t)}{z-y}>\frac{v(y,t)-v(0,t)}{y-0} = \frac{v(y,t)}{y}
\]
and hence
\[
\frac{v(z,t)-v(y,t)}{v(y,t)}>\frac{z-y}{y}.
\]
Similarly, one obtains
\[
\frac{v(x,t)-v(y,t)}{v(y,t)}<\frac{x-y}{y}
\]
for $x<y$. Thus, the relative change of option prices is larger in absolute value than the relative change of asset values. This fact can be interpreted as the leverage effect for call options; see also Example 1.43. ◊
Another important parameter is the Theta
\[
\Theta(x,t) := \frac{\partial}{\partial t}v(x,t) = \frac{x\sigma}{2\sqrt t}\,\varphi\big(d_+(x,t)\big) + Kr\,e^{-rt}\,\Phi\big(d_-(x,t)\big); \tag{5.38}
\]
see Figure 5.8. The fact $\Theta>0$ corresponds to our general observation, made in Example 5.35, that arbitrage-free prices of European call options are typically increasing functions of the maturity.
Exercise 5.7.2. Prove the formula (5.38) for the Theta of a European call option. Then show that the parameters $\Delta$, $\Gamma$, and $\Theta$ are related by the equation
\[
\Theta(x,t) = rx\,\Delta(x,t) + \frac12\sigma^2x^2\,\Gamma(x,t) - r\,v(x,t) \tag{5.39}
\]
when $t>0$. ◊

Equation (5.39) implies that, for $(x,t)\in(0,\infty)\times(0,\infty)$, the function $v$ solves the partial differential equation
\[
\frac{\partial v}{\partial t} = rx\frac{\partial v}{\partial x} + \frac12\sigma^2x^2\frac{\partial^2v}{\partial x^2} - rv, \tag{5.40}
\]
often called the Black-Scholes equation. Since
\[
v(x,t)\to f(x) = (x-K)^+\qquad\text{as } t\downarrow0, \tag{5.41}
\]
$v(x,t)$ is a solution of the Cauchy problem defined via (5.40) and (5.41). This fact is not limited to call options; it remains valid for all reasonable payoff profiles $f$.
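The closed forms (5.36), (5.37), (5.38) and the relation (5.39) can be sanity-checked numerically. The sketch below uses hypothetical helper names and arbitrary sample parameters; it evaluates the three Greeks from their formulas so that they can be compared with central finite differences of the price $v$.

```python
from math import erf, exp, log, pi, sqrt


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def d_plus(x, t, K, r, sigma):
    return (log(x / K) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))


def d_minus(x, t, K, r, sigma):
    return d_plus(x, t, K, r, sigma) - sigma * sqrt(t)


def price(x, t, K, r, sigma):
    """Black-Scholes call price v(x, t), formula (5.35)."""
    return (x * Phi(d_plus(x, t, K, r, sigma))
            - exp(-r * t) * K * Phi(d_minus(x, t, K, r, sigma)))


def delta(x, t, K, r, sigma):
    return Phi(d_plus(x, t, K, r, sigma))                           # (5.36)


def gamma(x, t, K, r, sigma):
    return phi(d_plus(x, t, K, r, sigma)) / (x * sigma * sqrt(t))   # (5.37)


def theta(x, t, K, r, sigma):
    return (x * sigma / (2.0 * sqrt(t)) * phi(d_plus(x, t, K, r, sigma))
            + K * r * exp(-r * t) * Phi(d_minus(x, t, K, r, sigma)))  # (5.38)


def pde_residual(x, t, K, r, sigma):
    """Theta - (r x Delta + sigma^2 x^2 Gamma / 2 - r v); zero by (5.39)."""
    return theta(x, t, K, r, sigma) - (
        r * x * delta(x, t, K, r, sigma)
        + 0.5 * sigma**2 * x**2 * gamma(x, t, K, r, sigma)
        - r * price(x, t, K, r, sigma)
    )
```

A numerical check of this kind does not replace the proofs asked for in Exercises 5.7.1 and 5.7.2, but it catches sign and scaling mistakes quickly.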

Figure 5.8. The Theta $\Theta(x,t)$.

Proposition 5.61. Let $f$ be a continuous function on $(0,\infty)$ such that $|f(x)|\le c(1+x)^p$ for some $c,p\ge0$, and define
\[
u(x,t) := e^{-rt}E^*[\,f(S_t)\,] = \frac{e^{-rt}}{\sqrt{2\pi}}\int_{-\infty}^{\infty}f\big(xe^{\sigma\sqrt t\,y+rt-\sigma^2t/2}\big)e^{-y^2/2}\,dy,
\]
where $S_t = x\exp(\sigma W_t+rt-\sigma^2t/2)$ and $W_t$ has law $N(0,t)$ under $P^*$. Then $u$ solves the Cauchy problem defined by the Black-Scholes equation (5.40) and the initial condition $\lim_{t\downarrow0}u(x,t) = f(x)$, locally uniformly in $x$.
The proof of Proposition 5.61 is the content of the next exercise.
Exercise 5.7.3. In the context of Proposition 5.61, use the formula (2.26) for the density of a log-normally distributed random variable to show that
\[
E^*[\,f(S_t)\,] = \int_0^{\infty}\frac{1}{y\sigma\sqrt t}\,\varphi\Big(\frac{\log y-rt+\sigma^2t/2-\log x}{\sigma\sqrt t}\Big)f(y)\,dy,
\]
where $\varphi(x) = e^{-x^2/2}/\sqrt{2\pi}$. Then verify the validity of (5.40) by differentiating under the integral. Use the bound $|f(x)|\le c(1+x)^p$ for some $c,p\ge0$ to justify the interchange of differentiation and integration and to verify the initial condition $\lim_{t\downarrow0}u(x,t) = f(x)$. ◊
Recall that the Black-Scholes price $v(S_0,T)$ was obtained as the expectation of the discounted payoff $e^{-rT}(S_T-K)^+$ under the measure $P^*$. Thus, at first glance, it may come as a surprise that the Rho of the option,
\[
\varrho(x,t) := \frac{\partial}{\partial r}v(x,t) = Kt\,e^{-rt}\,\Phi\big(d_-(x,t)\big), \tag{5.42}
\]
is strictly positive, i.e., the price is increasing in $r$; see Figure 5.9. Note, however, that the measure $P^*$ depends itself on the interest rate $r$, since $E^*[\,e^{-rT}S_T\,] = S_0$. In a simple one-period model, we have already seen this effect in Example 1.43.

Figure 5.9. The Rho $\varrho(x,t)$ of a call option.

The parameter $\sigma$ is called the volatility. As we have seen, the Black-Scholes price of a European call option is an increasing function of the volatility, and this is reflected in the strict positivity of
\[
\mathcal V(x,t) := \frac{\partial}{\partial\sigma}v(x,t) = x\sqrt t\,\varphi\big(d_+(x,t)\big); \tag{5.43}
\]
see Figure 5.10. The function $\mathcal V$ is often called the Vega of the call option price, and the functions $\Delta$, $\Gamma$, $\Theta$, $\varrho$, and $\mathcal V$ are usually called the Greeks (although "vega" does not correspond to a letter of the Greek alphabet).
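The sensitivities with respect to $r$ and $\sigma$ can be checked in the same spirit. This is a small sketch with hypothetical function names: it evaluates (5.42) and (5.43) in closed form so that they can be compared with central differences of the price in $r$ and in $\sigma$.

```python
from math import erf, exp, log, pi, sqrt


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def call_price(x, t, K, r, sigma):
    """Black-Scholes call price, formula (5.35)."""
    d_plus = (log(x / K) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d_minus = d_plus - sigma * sqrt(t)
    return x * Phi(d_plus) - exp(-r * t) * K * Phi(d_minus)


def rho(x, t, K, r, sigma):
    """Formula (5.42): K t exp(-rt) Phi(d_-)."""
    d_minus = (log(x / K) + (r - 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    return K * t * exp(-r * t) * Phi(d_minus)


def vega(x, t, K, r, sigma):
    """Formula (5.43): x sqrt(t) phi(d_+)."""
    d_plus = (log(x / K) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    return x * sqrt(t) * phi(d_plus)


def central_difference(f, point, h=1e-5):
    """Symmetric difference quotient of a one-variable function f at point."""
    return (f(point + h) - f(point - h)) / (2.0 * h)
```

For example, `central_difference(lambda s: call_price(100.0, 1.0, 100.0, 0.05, s), 0.2)` approximates the Vega at $\sigma = 0.2$.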
Exercise 5.7.4. Prove the respective formulas (5.42) and (5.43) for Rho and Vega of a European call option. Then derive formulas for the option's Vanna,
\[
\frac{\partial\Delta}{\partial\sigma} = \frac{\partial\mathcal V}{\partial x} = \frac{\partial^2v}{\partial x\,\partial\sigma},
\]
and the option's Volga, which is also called Vomma,
\[
\frac{\partial\mathcal V}{\partial\sigma} = \frac{\partial^2v}{\partial\sigma^2}.
\]

Let us conclude this section with some informal comments on the dynamic picture behind the convergence result in Theorem 5.52 and the pricing formulas in Example 5.56 and Proposition 5.58. The constant $r$ is viewed as the interest rate of a riskfree savings account
\[
S_t^0 = e^{rt},\qquad0\le t\le T.
\]

Figure 5.10. The Vega $\mathcal V(x,t)$.

The prices of the risky asset in each discrete-time model are considered as a continuous process $\tilde S^{(N)} = (\tilde S_t^{(N)})_{0\le t\le T}$, defined as $\tilde S_t^{(N)} := S_k^{(N)}$ at the dates $t = \frac{kT}{N}$, and by linear interpolation in between. Theorem 5.52 shows that the distributions of $\tilde S_t^{(N)}$ converge for each fixed $t$ weakly to the distribution of
\[
S_t = S_0\exp\Big(\sigma W_t + \big(r-\tfrac12\sigma^2\big)t\Big), \tag{5.44}
\]
where $W_t$ has a centered normal distribution with variance $t$. In fact, one can prove convergence in the much stronger sense of a functional central limit theorem: The laws of the processes $\tilde S^{(N)}$, considered as $C[0,T]$-valued random variables on $(\Omega_N,\mathcal F^{(N)},P_N^*)$, converge weakly to the law of a geometric Brownian motion $S = (S_t)_{0\le t\le T}$, where each $S_t$ is of the form (5.44), and where the process $W = (W_t)_{0\le t\le T}$ is a standard Brownian motion or Wiener process. A Wiener process is characterized by the following properties:

• $W_0 = 0$ almost surely,
• $t\mapsto W_t$ is continuous,
• for each sequence $0 = t_0<t_1<\dots<t_n = T$, the increments $W_{t_1}-W_{t_0},\dots,W_{t_n}-W_{t_{n-1}}$ are independent and have normal distributions $N(0,t_i-t_{i-1})$;

see, e.g., [171]. This multiplicative version of a functional central limit theorem follows as above if we replace the classical central limit theorem by Donsker's invariance principle; for details see, e.g., [99]. Sample paths of Brownian motion and geometric Brownian motion can be found in Figures 5.11 and 5.12.
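Sample paths of the kind shown in the figures can be produced from the defining properties of the Wiener process. The following sketch, with an illustrative function name and grid, simulates $W$ on an equidistant grid by summing independent $N(0,\Delta t)$ increments and maps it to a geometric Brownian motion via (5.44).

```python
import random
from math import exp, sqrt


def simulate_gbm_path(S0, r, sigma, T, n_steps, rng):
    """Return [S_0, S_{T/n}, ..., S_T] for one path of the form (5.44)."""
    dt = T / n_steps
    w = 0.0  # W_0 = 0
    path = [S0]
    for i in range(1, n_steps + 1):
        w += rng.gauss(0.0, sqrt(dt))  # independent increment with law N(0, dt)
        t = i * dt
        path.append(S0 * exp(sigma * w + (r - 0.5 * sigma**2) * t))
    return path


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed, chosen arbitrarily, for reproducibility
    path = simulate_gbm_path(S0=100.0, r=0.05, sigma=0.2, T=1.0, n_steps=250, rng=rng)
    print(path[0], path[-1])
```

On the grid points the simulated values have exactly the joint law of the geometric Brownian motion; only the behavior between grid points is left unspecified. Averaging $e^{-rT}S_T$ over many simulated paths should give a value close to $S_0$, in line with the martingale property of the discounted price process.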

Figure 5.11. A sample path of Brownian motion.

Figure 5.12. A sample path of geometric Brownian motion.

Geometric Brownian motion is the classical reference model in continuous-time mathematical finance. In order to describe the model more explicitly, we denote by $W = (W_t)_{0\le t\le T}$ the coordinate process on the canonical path space $\Omega = C[0,T]$, defined by $W_t(\omega) = \omega(t)$, and by $(\mathcal F_t)_{0\le t\le T}$ the filtration given by $\mathcal F_t = \sigma(W_s;\ s\le t)$. There is exactly one probability measure $\mathbb P$ on $(\Omega,\mathcal F_T)$ such that $W$ is a Wiener process under $\mathbb P$, and it is called the Wiener measure. Let us now model the price process of a risky asset as a geometric Brownian motion $S$ defined by (5.44). The discounted price process
\[
X_t := \frac{S_t}{e^{rt}} = S_0e^{\sigma W_t-\sigma^2t/2},\qquad0\le t\le T,
\]
is a martingale under $\mathbb P$, since
\[
\mathbb E[\,X_t\mid\mathcal F_s\,] = X_s\,\mathbb E\big[\,e^{\sigma(W_t-W_s)-\sigma^2(t-s)/2}\,\big] = X_s
\]
for $0\le s\le t\le T$. In fact, $\mathbb P$ is the only probability measure equivalent to $\mathbb P$ with that property.
As in discrete time, uniqueness of the equivalent martingale measure implies completeness of the model. Let us sketch the construction of the replicating strategy for a given European option with reasonable payoff profile $f(S_T)$, for example a call option with strike $K$. At time $t$ the price of the asset is $S_t(\omega)$, the remaining time to maturity is $T-t$, and the discounted price of the option is given by
\[
V_t(\omega) = e^{-rt}u\big(S_t(\omega),T-t\big),
\]
where $u$ is the function defined in Proposition 5.61. The process $V = (V_t)_{0\le t\le T}$ can be viewed as the value process of the trading strategy $\bar\xi = (\xi^0,\xi)$ defined by
\[
\xi_t = \Delta(S_t,T-t),\qquad\xi_t^0 = e^{-rt}u(S_t,T-t)-\xi_tX_t, \tag{5.45}
\]
where $\Delta = \partial u/\partial x$ is the option's Delta. Indeed, if we view $\xi$ as the number of shares in the risky asset $S$ and $\xi^0$ as the number of shares in the riskfree savings account $S_t^0 = e^{rt}$, then the value of the resulting portfolio in units of the numéraire is given by
\[
V_t = \xi_t\cdot X_t+\xi_t^0 = e^{-rt}\big(\xi_t\cdot S_t+\xi_t^0\cdot S_t^0\big).
\]
The strategy replicates the option since
\[
V_T := \lim_{t\uparrow T}e^{-rt}u(S_t,T-t) = e^{-rT}f(S_T) = \frac{f(S_T)}{S_T^0},
\]
due to Proposition 5.61. Moreover, its initial cost is given by the Black-Scholes price
\[
V_0 = u(S_0,T) = e^{-rT}\,\mathbb E[\,f(S_T)\,] = \frac{e^{-rT}}{\sqrt{2\pi}}\int_{-\infty}^{\infty}f\big(S_0e^{\sigma\sqrt T\,y+rT-\sigma^2T/2}\big)e^{-y^2/2}\,dy.
\]
It remains to show that the strategy is self-financing in the sense that changes in the portfolio value are only due to price changes in the underlying assets and do not require any additional capital. To this end, we use Itô's formula
\[
dF(W_t,t) = \frac{\partial F}{\partial x}(W_t,t)\,dW_t + \Big(\frac12\frac{\partial^2F}{\partial x^2}+\frac{\partial F}{\partial t}\Big)(W_t,t)\,dt
\]
for a smooth function $F$, see, e.g., [171] or, for a strictly pathwise approach, [117]. Applied to the function $F(x,t) = \exp(\sigma x+rt-\sigma^2t/2)$, it shows that the price process $S$ satisfies the stochastic differential equation
\[
dS_t = \sigma S_t\,dW_t + rS_t\,dt. \tag{5.46}
\]

Thus, the infinitesimal return $dS_t/S_t$ is the sum of the safe return $r\,dt$ and an additional noise term with zero expectation under $P^*$. The strength of the noise is measured by the volatility parameter $\sigma$. Similarly, we obtain
\[
dX_t = \sigma X_t\,dW_t = e^{-rt}\big(dS_t-rS_t\,dt\big). \tag{5.47}
\]
Applying Itô's formula to the function
\[
F(x,t) = e^{-rt}u\big(\exp(\sigma x+rt-\sigma^2t/2),\,T-t\big)
\]
and using (5.46), we obtain
\[
dV_t = e^{-rt}\frac{\partial u}{\partial x}(S_t,T-t)\,dS_t + e^{-rt}\Big(\frac12\sigma^2S_t^2\frac{\partial^2u}{\partial x^2}-\frac{\partial u}{\partial t}-ru\Big)(S_t,T-t)\,dt.
\]
The Black-Scholes partial differential equation (5.40) shows that the term in parentheses is equal to $-rS_t\,\partial u/\partial x$, and we obtain from (5.47) that
\[
dV_t = \frac{\partial u}{\partial x}(S_t,T-t)\,dX_t = \xi_t\,dX_t.
\]
More precisely,
\[
V_t = V_0+\int_0^t\xi_s\,dX_s,
\]
where the integral with respect to $X$ is defined as an Itô integral, i.e., as the limit of non-anticipating Riemann sums
\[
\sum_{t_i\in D_n,\ t_i\le t}\xi_{t_i}\big(X_{t_{i+1}}-X_{t_i}\big)
\]
along an increasing sequence $(D_n)$ of partitions of the interval $[0,T]$; see, e.g., [117]. Thus, the Itô integral can be interpreted in financial terms as the cumulative net gain generated by dynamic hedging in the discounted risky asset as described by the hedging strategy $\xi$. This fact is an analogue of property (c) in Proposition 5.7, and in this sense $\bar\xi = (\xi^0,\xi)$ is a self-financing trading strategy in continuous time. Similarly, we obtain the following continuous-time analogue of (5.5), which describes the undiscounted value of the portfolio as a result of dynamic trading both in the undiscounted risky asset and the riskfree asset:
\[
e^{rt}V_t = V_0+\int_0^t\xi_s\,dS_s+\int_0^t\xi_s^0\,dS_s^0.
\]
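The interpretation of the Itô integral as a limit of non-anticipating Riemann sums suggests a simple experiment: rebalance the Delta hedge only at finitely many dates and compare the resulting discounted value with the discounted payoff. The sketch below does this for a call option; the function names, the seed, and the rebalancing grid are illustrative assumptions, and the simulation is not part of the text's argument.

```python
import random
from math import erf, exp, log, sqrt


def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_call(x, t, K, r, sigma):
    """Black-Scholes call price u(x, t) for time to maturity t > 0."""
    d_plus = (log(x / K) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    return x * Phi(d_plus) - exp(-r * t) * K * Phi(d_plus - sigma * sqrt(t))


def bs_delta(x, t, K, r, sigma):
    """Delta of the call for time to maturity t > 0."""
    return Phi((log(x / K) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t)))


def hedge_error(S0, K, r, sigma, T, n_steps, rng):
    """Discounted hedge value minus discounted payoff along one simulated path."""
    dt = T / n_steps
    s = S0
    value = bs_call(S0, T, K, r, sigma)  # V_0 = u(S_0, T)
    for i in range(n_steps):
        t = i * dt
        xi = bs_delta(s, T - t, K, r, sigma)  # xi at t_i uses only the price at t_i
        x_old = exp(-r * t) * s
        s *= exp(sigma * rng.gauss(0.0, sqrt(dt)) + (r - 0.5 * sigma**2) * dt)
        x_new = exp(-r * (t + dt)) * s
        value += xi * (x_new - x_old)         # one term of the Riemann sum
    return value - exp(-r * T) * max(s - K, 0.0)


def mean_abs_error(n_steps, n_paths, seed):
    """Average absolute hedging error over n_paths simulated paths."""
    rng = random.Random(seed)
    errors = [abs(hedge_error(100.0, 100.0, 0.05, 0.2, 1.0, n_steps, rng))
              for _ in range(n_paths)]
    return sum(errors) / n_paths
```

With these sample parameters the average absolute error for 250 rebalancing dates is a small fraction of the option price, and it is noticeably smaller than for 10 dates, which is consistent with perfect replication in the continuous-time limit.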

Perfect replication also works for exotic options $C(S)$ defined by reasonable functionals $C$ on the path space $C[0,T]$, due to a general representation theorem for such functionals as Itô integrals of the underlying Brownian motion $W$ or, via (5.47), of the process $X$. Weak convergence on path space implies, in analogy to Proposition 5.61, that the arbitrage-free prices of the options $C(S^{(N)})$, computed as discounted expectations under the measure $P_N^*$, converge to the discounted expectation
\[
e^{-rT}\,\mathbb E[\,C(S)\,]
\]
under the Wiener measure $\mathbb P$.
On the other hand, the discussion in Section 5.6 suggests that the prices of certain exotic contingent claims, such as barrier options, can be computed in closed form as the Black-Scholes price for some corresponding payoff profile of the form $f(S_T)$. This is illustrated by the following example, where the price of an up-and-in call is computed in terms of the distribution of the terminal stock price under the equivalent martingale measure.
Example 5.62 (Black-Scholes price of an up-and-in call option). Consider an up-and-in call option
\[
C^{\mathrm{call}}_{\mathrm{u\&i}}(S) = \begin{cases}(S_T-K)^+ & \text{if } \max_{0\le t\le T}S_t\ge B,\\ 0 & \text{otherwise,}\end{cases}
\]
where $B>S_0\vee K$ denotes a given barrier, and where $K>0$ is the strike price. As approximating models we choose the CRR models of Example 5.54. That is, we have interest rates
\[
r_N = \frac{rT}{N}
\]
and parameters $a_N$ and $b_N$ defined by
\[
\hat a_N = 1+a_N = e^{-\sigma\sqrt{T/N}}\qquad\text{and}\qquad\hat b_N = 1+b_N = e^{\sigma\sqrt{T/N}}
\]
for some given $\sigma>0$. Applying the formula obtained in Example 5.49 yields
\[
E_N^*[\,C^{\mathrm{call}}_{\mathrm{u\&i}}(S^{(N)})\,] = E_N^*[\,(S_N^{(N)}-K)^+;\ S_N^{(N)}\ge B_N\,]
+ \Big(\frac{p_N^*}{1-p_N^*}\Big)^{k_N}\Big(\frac{B_N}{S_0}\Big)^2E_N^*[\,(S_N^{(N)}-\tilde K_N)^+;\ S_N^{(N)}<\tilde B_N\,],
\]
where
\[
k_N = \Big\lceil\frac{\sqrt N}{\sigma\sqrt T}\log\frac{B}{S_0}\Big\rceil
\]
is the smallest integer $k$ such that $B\le S_0\hat b_N^{\,k}$, $B_N := S_0\hat b_N^{\,k_N}$, $\tilde B_N := S_0^2/B_N = S_0\hat a_N^{\,k_N}$, and
\[
\tilde K_N = K\hat b_N^{-2k_N} = K\Big(\frac{S_0}{B_N}\Big)^2.
\]
Then we have
\[
B_N\searrow B,\qquad\tilde B_N\nearrow\tilde B = \frac{S_0^2}{B},\qquad\text{and}\qquad\tilde K_N\nearrow\tilde K = K\Big(\frac{S_0}{B}\Big)^2.
\]
Since $f(x) = (x-K)^+\,I_{\{x\ge B\}}$ is continuous a.e., we obtain
\[
E_N^*[\,(S_N^{(N)}-K)^+;\ S_N^{(N)}\ge B_N\,] = E_N^*[\,(S_N^{(N)}-K)^+;\ S_N^{(N)}\ge B\,]\longrightarrow\mathbb E[\,(S_T-K)^+;\ S_T\ge B\,],
\]
due to Proposition 5.58. Combining the preceding argument with the fact that
\[
P_N^*[\,\tilde K_N\le S_N^{(N)}\le\tilde K\,]\longrightarrow0
\]
also gives the convergence of the second expectation:
\[
E_N^*[\,(S_N^{(N)}-\tilde K_N)^+;\ S_N^{(N)}<\tilde B_N\,] = E_N^*[\,(S_N^{(N)}-\tilde K_N)^+;\ S_N^{(N)}<\tilde B\,]\longrightarrow\mathbb E[\,(S_T-\tilde K)^+;\ S_T<\tilde B\,].
\]
Next we note that for constants $c,d>0$
\[
\lim_{x\downarrow0}\frac1x\log\frac{cx^2+1-e^{-dx}}{e^{dx}-1-cx^2} = \frac{2c}{d}-d,
\]
due to l'Hôpital's rule. From this fact, one deduces that
\[
\Big(\frac{p_N^*}{1-p_N^*}\Big)^{k_N}\longrightarrow\Big(\frac{B}{S_0}\Big)^{\frac{2r}{\sigma^2}-1}.
\]
Thus, we may conclude that the arbitrage-free prices
\[
\frac{1}{(1+r_N)^N}E_N^*[\,C^{\mathrm{call}}_{\mathrm{u\&i}}(\tilde S^{(N)})\,]
\]
in the $N$th approximating model converge to
\[
e^{-rT}\bigg(\mathbb E[\,(S_T-K)^+;\ S_T\ge B\,] + \Big(\frac{B}{S_0}\Big)^{\frac{2r}{\sigma^2}+1}\mathbb E[\,(S_T-\tilde K)^+;\ S_T<\tilde B\,]\bigg).
\]
The expectations occurring in this formula are integrals with respect to a log-normal distribution and can be explicitly computed as in Example 5.56. Moreover, our limit is in fact equal to the Black-Scholes price of the up-and-in call option: The functional $C^{\mathrm{call}}_{\mathrm{u\&i}}(\cdot)$ is continuous in each path in $C[0,T]$ whose maximum is different from the value $B$, and one can show that these paths have full measure for the law of $S$ under $\mathbb P$. Hence, $C^{\mathrm{call}}_{\mathrm{u\&i}}(\cdot)$ is continuous $\mathbb P\circ S^{-1}$-a.e., and the functional version of Proposition 5.58 yields
\[
E_N^*[\,C^{\mathrm{call}}_{\mathrm{u\&i}}(\tilde S^{(N)})\,]\longrightarrow\mathbb E[\,C^{\mathrm{call}}_{\mathrm{u\&i}}(S)\,],
\]
so that our limiting price must coincide with the discounted expectation on the right. ◊
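The limit of the reflection weight $(p_N^*/(1-p_N^*))^{k_N}$ used in Example 5.62 can be inspected numerically. The sketch below uses hypothetical function names and follows the definitions of $r_N$, $a_N$, $b_N$ and $k_N$ given in the example.

```python
from math import ceil, expm1, log, sqrt


def reflection_weight(N, S0, B, r, sigma, T):
    """(p_N / (1 - p_N)) ** k_N in the N-th CRR model of Example 5.54."""
    h = sigma * sqrt(T / N)
    a_N = expm1(-h)
    b_N = expm1(h)
    r_N = r * T / N
    p_N = (r_N - a_N) / (b_N - a_N)
    k_N = ceil(log(B / S0) / h)  # smallest integer k with B <= S0 * exp(k * h)
    return (p_N / (1.0 - p_N)) ** k_N


def limiting_weight(S0, B, r, sigma):
    """(B / S0) ** (2 r / sigma**2 - 1)."""
    return (B / S0) ** (2.0 * r / sigma**2 - 1.0)


if __name__ == "__main__":
    target = limiting_weight(100.0, 120.0, 0.05, 0.2)
    for N in (100, 10_000, 1_000_000):
        print(N, reflection_weight(N, 100.0, 120.0, 0.05, 0.2, 1.0), target)
```

The error decays only at the order $N^{-1/2}$, which reflects the rounding in $k_N$ and is typical for barrier-type quantities in binomial models.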

Remark 5.63. Let us assume, more generally, that the price process $S$ is defined by
\[
S_t = S_0e^{\sigma W_t+\alpha t},\qquad0\le t\le T,
\]
for some $\alpha\in\mathbb R$. Applying Itô's formula as in (5.46), we see that $S$ is governed by the stochastic differential equation
\[
dS_t = \sigma S_t\,dW_t + bS_t\,dt
\]
with $b = \alpha+\frac12\sigma^2$. The discounted price process is given by
\[
X_t = S_0e^{\sigma W_t+(\alpha-r)t} = S_0e^{\sigma W_t^*-\sigma^2t/2}
\]
with $W_t^* = W_t+\lambda t$ for $\lambda = (b-r)/\sigma$. The process $W^*$ is a Wiener process under the measure $P^*\approx\mathbb P$ defined by the density
\[
\frac{dP^*}{d\mathbb P} = e^{-\lambda W_T-\lambda^2T/2}.
\]
In fact, $P^*$ is the unique equivalent martingale measure for $X$. We can now repeat the arguments above to conclude that the cost of perfect replication for a contingent claim $C(S)$ is given by
\[
e^{-rT}E^*[\,C(S)\,].
\]
◊
Even in the context of simple diffusion models such as geometric Brownian motion, however, completeness is lost as soon as the future behavior of the volatility parameter $\sigma$ is unknown. If, for instance, volatility itself is modeled as a stochastic process, we are facing incompleteness. Thus, the problems of pricing and hedging in discrete-time incomplete markets as discussed in this book reappear in continuous time. Other versions of the invariance principle may lead to other classes of continuous-time models with discontinuous paths, for instance to geometric Poisson or Lévy processes. Discontinuity of paths is another important source of incompleteness. In fact, this has already been illustrated in this book, since discrete-time models can be regarded as stochastic processes in continuous time, where jumps occur at predictable dates.

Chapter 6

American contingent claims

So far, we have studied European contingent claims whose payoff is due at a fixed
maturity date. In the case of American options, the buyer can claim the payoff at any
time up to the expiration of the contract.
First, we take the point of view of the seller, whose aim is to hedge against all
possible claims of the buyer. In Section 6.1, this problem is solved under the assumption of market completeness, using the Snell envelope of the contingent claim.
The buyer tries to choose the best date for exercising the claim, contingent on the
information available up to that time. Since future prices are usually unknown, a formulation of this problem will typically involve subjective preferences. If preferences
are expressed in terms of expected utility, the choice of the best exercise date amounts
to solving an optimal stopping problem. In the special case of a complete market
model, any exercise strategy which maximizes the expected payoff under the unique
equivalent martingale measure turns out to be optimal even in an almost sure sense.
In Section 6.3, we characterize the set of all arbitrage-free prices of an American
contingent claim in an incomplete market model. This involves a lower Snell envelope
of the claim, which is analyzed in Section 6.5, using the fact that the class of equivalent
martingale measures is stable under pasting. This notion of stability under pasting is
discussed in Section 6.4 in a general context, and in Section 6.5 we point out its
connection with the time-consistency of dynamic risk measures. This connection will
be discussed systematically in Chapter 11. The results on lower Snell envelopes can
also be regarded as a solution to the buyer's optimal stopping problem in the case
where preferences are described by robust Savage functionals. Moreover, these results
will be used in the theory of superhedging of Chapter 7.

6.1 Hedging strategies for the seller

Throughout this chapter we will continue to use the setting described in Section 5.1.
We start by introducing the Doob decomposition of an adapted process and the notion
of a supermartingale.
Proposition 6.1. Let $Q$ be a probability measure on $(\Omega,\mathcal F_T)$, and suppose that $Y$ is a stochastic process that is adapted to the filtration $(\mathcal F_t)_{t=0,\dots,T}$ and satisfies $Y_t\in L^1(Q)$ for all $t$. Then there exists a unique decomposition
\[
Y = M-A, \tag{6.1}
\]
where $M$ is a $Q$-martingale and $A$ is a process such that $A_0 = 0$ and $(A_t)_{t=1,\dots,T}$ is predictable. The decomposition (6.1) is called the Doob decomposition of $Y$ with respect to the probability measure $Q$.
Proof. Define $A$ by
\[
A_t-A_{t-1} := -E_Q[\,Y_t-Y_{t-1}\mid\mathcal F_{t-1}\,]\qquad\text{for } t=1,\dots,T. \tag{6.2}
\]
Then $A$ is predictable and $M_t := Y_t+A_t$ is a $Q$-martingale. Clearly, any process $A$ with the required properties must satisfy (6.2), so the uniqueness of the decomposition follows.
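The recursion (6.2) can be carried out explicitly on a finite tree. The sketch below is an illustration under additional assumptions that are not part of the proposition: the scenarios are sequences of $\pm1$ steps, $Q$ makes the steps independent with up-probability `q`, and an adapted process is represented as a Python function of the path prefix. All names are hypothetical.

```python
from itertools import product


def doob_decomposition(Y, q, T):
    """Return dicts A, M indexed by path prefixes (tuples of +1/-1 steps).

    Y maps a prefix of length t to the value Y_t on that prefix; q is the
    assumed probability of a +1 step under Q.
    """
    A = {(): 0.0}
    M = {(): float(Y(()))}
    for t in range(1, T + 1):
        for prefix in product((1, -1), repeat=t):
            past = prefix[:-1]
            # E_Q[Y_t | F_{t-1}] on the event described by `past`
            cond_exp = q * Y(past + (1,)) + (1.0 - q) * Y(past + (-1,))
            # (6.2): A_t - A_{t-1} = -E_Q[Y_t - Y_{t-1} | F_{t-1}]
            A[prefix] = A[past] - (cond_exp - Y(past))
            M[prefix] = Y(prefix) + A[prefix]
    return A, M


def squared_walk(prefix):
    """Y_t = Z_t**2 for the random walk Z_t given by the sum of the steps."""
    return float(sum(prefix) ** 2)
```

For the symmetric walk (`q = 0.5`) and `squared_walk`, the recursion gives $A_t = -t$ on every path, so $A$ is decreasing and $M_t = Z_t^2 - t$ is the martingale part. Note that `A[prefix]` depends only on the first $t-1$ steps, which is the predictability of $A$.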
Definition 6.2. Let $Q$ be a probability measure on $(\Omega,\mathcal F_T)$ and suppose that $Y$ is an adapted process such that $Y_t\in L^1(Q)$ for all $t$. Denote by $Y = M-A$ the Doob decomposition of $Y$.
(a) $Y$ is called a $Q$-supermartingale if $A$ is increasing.
(b) $Y$ is called a $Q$-submartingale if $A$ is decreasing.
Clearly, a process is a martingale if and only if it is both a supermartingale and a submartingale, i.e., if and only if $A\equiv0$. The following exercise gives equivalent characterizations of the supermartingale property of a process $Y$.
Exercise 6.1.1. Let $Y$ be an adapted process with $Y_t\in L^1(Q)$ for all $t$. Show that the following conditions are equivalent:
(a) $Y$ is a $Q$-supermartingale.
(b) $Y_s\ge E_Q[\,Y_t\mid\mathcal F_s\,]$ for $0\le s\le t\le T$.
(c) $Y_{t-1}\ge E_Q[\,Y_t\mid\mathcal F_{t-1}\,]$ for $t=1,\dots,T$.
(d) $-Y$ is a $Q$-submartingale.

Exercise 6.1.2. Let $Y$ be a nonnegative $Q$-supermartingale. Show that for $0\le s\le T-t$ we have $Y_{t+s} = 0$ $Q$-a.s. on $\{Y_t = 0\}$. ◊
We now return to the market model introduced in Section 5.1. An American option,
or American contingent claim, corresponds to a contract which is issued at time 0 and
which obliges the seller to pay a certain amount C  0 if the buyer decides at time
 to exercise the option. The choice of the exercise time  is entirely up to the buyer,
except that the claim is automatically exercised at the expiration date of the claim.
The American contingent claim can be exercised only once: It becomes invalid as
soon as the payoff has been claimed by the buyer. This concept is formalized as
follows:
Definition 6.3. An American contingent claim is a non-negative adapted process C D
.Ct / tD0;:::;T on the filtered space .; .F t / tD0;:::;T /.


For each $t$, the random variable $C_t$ is interpreted as the payoff of the American contingent claim if the claim is exercised at time $t$. The time horizon $T$ plays the role of the expiration date of the claim. The possible exercise times for $C$ are not limited to fixed deterministic times $t \in \{0, \dots, T\}$; the buyer may exercise the claim in a way which depends on the scenario $\omega \in \Omega$ of the market evolution.
Definition 6.4. An exercise strategy for an American contingent claim $C$ is an $\mathcal{F}_T$-measurable random variable $\tau$ taking values in $\{0, \dots, T\}$. The payoff obtained by using $\tau$ is equal to
\[
C_\tau(\omega) := C_{\tau(\omega)}(\omega), \quad \omega \in \Omega.
\]
Example 6.5. An American put option on the $i$th asset and with strike $K > 0$ pays the amount
\[
C_t^{\text{put}} := (K - S_t^i)^+
\]
if it is exercised at time $t$. The payoff at time $t$ of the corresponding American call option is given by
\[
C_t^{\text{call}} := (S_t^i - K)^+.
\]
Clearly, the American call option is out of the money (i.e., has zero payoff) if the corresponding American put is in the money (i.e., has non-zero payoff). It is therefore a priori clear that the respective owners of $C^{\text{put}}$ and $C^{\text{call}}$ will usually exercise their claims at different times. In particular, there will be no put-call parity for American options. ◊
Similarly, one defines American versions of most options mentioned in the examples of Section 5.3. Clearly, the value of an American option is at least as high as the
value of the corresponding European option with maturity T .
Remark 6.6. It should be emphasized that the concept of American contingent claims can be regarded as a generalization of European contingent claims: If $C^E$ is a European contingent claim, then we can define a corresponding American claim $C^A$ by
\[
C_t^A = \begin{cases} 0 & \text{if } t < T, \\ C^E & \text{if } t = T. \end{cases} \tag{6.3}
\]
◊
Example 6.7. A Bermuda option can be exercised by its buyer at each time of a predetermined subset $\mathbb{T} \subset \{0, \dots, T\}$. For instance, a Bermuda call option pays the amount $(S_t^i - K)^+$ if it is exercised at some time $t \in \mathbb{T}$. Thus, a Bermuda option is a financial instrument between an American option with $\mathbb{T} = \{0, \dots, T\}$ and a European option with $\mathbb{T} = \{T\}$, just as Bermuda lies between America and Europe; hence the name Bermuda option. A Bermuda option can be regarded as a particular American option $C$ that pays the amount $C_t = 0$ for $t \notin \mathbb{T}$. ◊


The process
\[
H_t = \frac{C_t}{S_t^0}, \quad t = 0, \dots, T,
\]
of discounted payoffs of $C$ will be called the discounted American claim associated with $C$. As far as the mathematical theory is concerned, the discounted American claim $H$ will be the primary object. For certain examples it will be helpful to keep track of the numéraire and, thus, of the payoffs $C_t$ prior to discounting.
In this section, we will analyze the theory of hedging American claims in a complete market model. We will therefore assume throughout this section that the set $\mathcal{P}$ of equivalent martingale measures consists of one single element $P^*$:
\[
\mathcal{P} = \{P^*\}.
\]
Under this assumption, we will construct a suitable trading strategy that permits the seller of an American claim to hedge against the buyer's discounted claim $H_\tau$. Let us first try to characterize the minimal amount of capital $U_t$ which will be needed at time $t \in \{0, \dots, T\}$. Since the choice of the exercise time $\tau$ is entirely up to the buyer, the seller must be prepared to pay at any time $t$ the current payoff $H_t$ of the option. This amounts to the condition $U_t \ge H_t$. Moreover, the amount $U_t$ must suffice to cover the purchase of the hedging portfolio for the possible payoffs $H_u$ for $u > t$. Since the latter condition is void at maturity, we require
\[
U_T = H_T.
\]
At time $T - 1$, our first requirement on $U_{T-1}$ reads $U_{T-1} \ge H_{T-1}$. The second requirement states that the amount $U_{T-1}$ must suffice for hedging the claim $H_T$ in case the option is not exercised before time $T$. Due to our assumption of market completeness, the latter amount equals
\[
E^*[\, H_T \mid \mathcal{F}_{T-1} \,] = E^*[\, U_T \mid \mathcal{F}_{T-1} \,].
\]
Thus,
\[
U_{T-1} := H_{T-1} \vee E^*[\, U_T \mid \mathcal{F}_{T-1} \,]
\]
is the minimal amount that fulfills both requirements. Iterating this argument leads to the following recursive scheme for $U_t$:
\[
U_T := H_T, \qquad U_t := H_t \vee E^*[\, U_{t+1} \mid \mathcal{F}_t \,] \quad \text{for } t = T - 1, \dots, 0. \tag{6.4}
\]
Definition 6.8. The process $U^{P^*} := U$ defined by the recursion (6.4) is called the Snell envelope of the process $H$ with respect to the measure $P^*$.
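The backward recursion (6.4) translates directly into a short program. The sketch below is an illustration under simplifying assumptions that are not part of the text: the payoff $H_t$ depends only on the number $j$ of up-moves in a recombining binary tree, and the measure gives each up-move a fixed conditional probability `p`. The function name `snell_envelope` and the sample data are hypothetical.

```python
def snell_envelope(H, p):
    """Snell envelope on a recombining binary tree.

    H[t][j] is the discounted payoff at time t after j up-moves (j = 0..t);
    p is the conditional probability of an up-move under the chosen measure.
    Both the lattice structure and the constant p are illustrative assumptions.
    """
    T = len(H) - 1
    U = [None] * (T + 1)
    U[T] = list(H[T])                      # U_T := H_T
    for t in range(T - 1, -1, -1):
        U[t] = [
            # U_t := H_t v E[U_{t+1} | F_t], evaluated node by node
            max(H[t][j], p * U[t + 1][j + 1] + (1 - p) * U[t + 1][j])
            for j in range(t + 1)
        ]
    return U

# Hypothetical two-period payoff table, indexed as H[t][j].
H = [[1.0], [0.0, 3.0], [4.0, 0.0, 2.0]]
U = snell_envelope(H, p=0.5)
```

In this example the recursion gives $U_1 = (2, 3)$ and $U_0 = 2.5$, so the envelope lies strictly above the payoff $H_0 = 1$ at time 0.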


Example 6.9. Let $H^E$ be a discounted European claim. Then the Snell envelope with respect to $P^*$ of the discounted American claim $H^A$ associated with $H^E$ via (6.3) satisfies
\[
U_t^{P^*} = E^*[\, H_T^A \mid \mathcal{F}_t \,] = E^*[\, H^E \mid \mathcal{F}_t \,].
\]
Thus, $U$ is equal to the value process of a replicating strategy for $H^E$.

Clearly, a Snell envelope $U^Q$ can be defined for any probability measure $Q$ on $(\Omega, \mathcal{F}_T)$ and for any adapted process $H$ that satisfies the following integrability condition:
\[
H_t \in L^1(Q) \quad \text{for } t = 0, \dots, T. \tag{6.5}
\]
In our finite-time setting, this condition is equivalent to
\[
E_Q\Big[\, \max_{t \le T} |H_t| \,\Big] < \infty.
\]
For later applications, the following proposition is stated for a general measure $Q$.
Proposition 6.10. Let $H$ be an adapted process such that (6.5) holds. Then the Snell envelope $U^Q$ of $H$ with respect to $Q$ is the smallest $Q$-supermartingale dominating $H$: If $\tilde U$ is another $Q$-supermartingale such that $\tilde U_t \ge H_t$ $Q$-a.s. for all $t$, then $\tilde U_t \ge U_t^Q$ $Q$-a.s. for all $t$.

Proof. It follows from the definition of $U^Q$ that $U_{t-1}^Q \ge E_Q[\, U_t^Q \mid \mathcal{F}_{t-1} \,]$ so that $U^Q$ is indeed a supermartingale. If $\tilde U$ is another supermartingale dominating $H$, then $\tilde U_T \ge H_T = U_T^Q$. We now proceed by backward induction on $t$. If we already know that $\tilde U_t \ge U_t^Q$, then
\[
\tilde U_{t-1} \ge E_Q[\, \tilde U_t \mid \mathcal{F}_{t-1} \,] \ge E_Q[\, U_t^Q \mid \mathcal{F}_{t-1} \,].
\]
Adding our assumption $\tilde U_{t-1} \ge H_{t-1}$ yields that
\[
\tilde U_{t-1} \ge H_{t-1} \vee E_Q[\, U_t^Q \mid \mathcal{F}_{t-1} \,] = U_{t-1}^Q,
\]
and the result follows.

Proposition 6.10 illustrates how the seller can (super-)hedge a discounted American claim $H$ by using the Doob decomposition
\[
U_t^{P^*} = M_t - A_t, \quad t = 0, \dots, T,
\]
of the Snell envelope $U^{P^*}$ with respect to $P^*$. Then $M$ is a $P^*$-martingale, $A$ is increasing, and $(A_t)_{t=1,\dots,T}$ is predictable. Since we assume the completeness of the market model, Theorem 5.38 yields the representation of the martingale $M$ as the stochastic integral of a suitable $d$-dimensional predictable process $\xi$:
\[
M_t = U_0^{P^*} + \sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1}), \quad t = 0, \dots, T. \tag{6.6}
\]
It follows that
\[
M_t \ge U_t^{P^*} \ge H_t \quad \text{for all } t.
\]
By adding a numéraire component $\xi^0$ such that $\bar\xi = (\xi^0, \xi)$ becomes a self-financing trading strategy with initial investment $U_0^{P^*}$, we obtain a (super-)hedge for $H$, namely a self-financing trading strategy whose value process $V$ satisfies
\[
V_t \ge H_t \quad \text{for all } t. \tag{6.7}
\]
Thus, $U_t^{P^*}$ may be viewed as the resulting capital at each time $t$ if we use the self-financing strategy $\bar\xi$, combined with a refunding scheme where we withdraw successively the amounts defined by the increments of $A$. In fact, $U_t^{P^*}$ is the minimal investment at time $t$ for which one can purchase a hedging strategy such that (6.7) holds. This follows from our next result.


Theorem 6.11. Let $H$ be a discounted American claim with Snell envelope $U^{P^*}$. Then there exists a $d$-dimensional predictable process $\xi$ such that
\[
U_t^{P^*} + \sum_{k=t+1}^{u} \xi_k \cdot (X_k - X_{k-1}) \ge H_u \quad \text{for all } u \ge t \quad P\text{-a.s.} \tag{6.8}
\]
Moreover, any $\mathcal{F}_t$-measurable random variable $\tilde U_t$ which, for some predictable $\tilde\xi$, satisfies (6.8) in place of $U_t^{P^*}$ is such that
\[
\tilde U_t \ge U_t^{P^*} \quad P\text{-a.s.}
\]
Thus, $U_t^{P^*}$ is the minimal amount of capital which is necessary to hedge $H$ from time $t$ up to maturity.

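Proof. Clearly, $U^{P^*}$ satisfies (6.8) for $\xi$ as in (6.6). Now suppose that $\tilde U_t$ is $\mathcal{F}_t$-measurable, that $\tilde\xi$ is predictable, and that
\[
V_u := \tilde U_t + \sum_{k=t+1}^{u} \tilde\xi_k \cdot (X_k - X_{k-1}) \ge H_u \quad \text{for all } u \ge t \quad P\text{-a.s.}
\]
We show $V_u \ge U_u^{P^*}$ for all $u \ge t$ by backward induction. $V_T \ge H_T = U_T^{P^*}$ holds by assumption, so assume $V_{u+1} \ge U_{u+1}^{P^*}$ for some $u$. Since our market model is complete, Theorem 5.37 implies that $\tilde\xi$ is bounded. Hence, we get
\[
E^*[\, V_{u+1} - V_u \mid \mathcal{F}_u \,] = E^*[\, \tilde\xi_{u+1} \cdot (X_{u+1} - X_u) \mid \mathcal{F}_u \,] = 0 \quad P\text{-a.s.}
\]
It follows that
\[
V_u = E^*[\, V_{u+1} \mid \mathcal{F}_u \,] \ge H_u \vee E^*[\, U_{u+1}^{P^*} \mid \mathcal{F}_u \,] = U_u^{P^*}.
\]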

6.2 Stopping strategies for the buyer

In this section, we take the point of view of the buyer of an American contingent
claim. Thus, our aim is to optimize the exercise strategy. It is natural to assume that
the decision to exercise the claim at a particular time t depends only on the market
information which is available at t . This constraint can be formulated as follows:
Definition 6.12. A function $\tau : \Omega \to \{0, 1, \dots, T\} \cup \{+\infty\}$ is called a stopping time if $\{\tau = t\} \in \mathcal{F}_t$ for $t = 0, \dots, T$.
In particular, the constant function $\tau \equiv t$ is a stopping time for fixed $t \in \{0, \dots, T\}$.
Exercise 6.2.1. Show that a function $\tau : \Omega \to \{0, 1, \dots, T\} \cup \{+\infty\}$ is a stopping time if and only if $\{\tau \le t\} \in \mathcal{F}_t$ for each $t$. Show next that, if $\tau$ and $\sigma$ are two stopping times, then the following functions are also stopping times:
\[
\tau \wedge \sigma, \qquad \tau \vee \sigma, \qquad (\tau + \sigma) \wedge T.
\]

Example 6.13. A typical example of a non-trivial stopping time is the first time at which an adapted process $Y$ exceeds a certain level $c$:
\[
\tau(\omega) := \inf\{\, t \ge 0 \mid Y_t(\omega) \ge c \,\}.
\]
In fact,
\[
\{\tau \le t\} = \bigcup_{s=0}^{t} \{Y_s \ge c\} \in \mathcal{F}_t
\]
for $t = 0, \dots, T$. This example also illustrates the role of the value $+\infty$ in Definition 6.12: We have $\tau(\omega) = +\infty$ if, for this particular $\omega$, the criterion that triggers $\tau$ is not met for any $t \in \{0, \dots, T\}$. ◊
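As a small illustration of Example 6.13, the first hitting time can be evaluated along a single observed path. This is a sketch only: representing a scenario by the list of values $Y_0(\omega), \dots, Y_T(\omega)$ and the function name `first_hitting_time` are assumptions made here for illustration.

```python
import math

def first_hitting_time(path, c):
    """First time t at which path[t] >= c, or +infinity if the level is never reached.

    `path` holds the values Y_0, ..., Y_T of one scenario (an illustrative
    representation); the loop only ever looks at values up to the current t.
    """
    for t, y in enumerate(path):
        if y >= c:
            return t
    return math.inf

tau_hit = first_hitting_time([1.0, 2.0, 5.0, 3.0], c=4.0)   # level reached at t = 2
tau_never = first_hitting_time([1.0, 2.0, 3.0], c=4.0)      # level never reached
```

Whether the hitting time is at most $t$ is decided by the first $t + 1$ values of the path alone, which mirrors the identity for $\{\tau \le t\}$ in the example.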
Definition 6.14. For any stochastic process $Y$ and each stopping time $\tau$ we denote by $Y^\tau$ the process stopped in $\tau$:
\[
Y_t^\tau(\omega) := Y_{t \wedge \tau(\omega)}(\omega) \quad \text{for } \omega \in \Omega \text{ and for all } t \in \{0, \dots, T\}.
\]
It follows from the definition of a stopping time that $Y^\tau$ is an adapted process if $Y$ is. Informally, the following basic theorem states that a martingale cannot be turned into a favorable game by using a clever stopping strategy. This result is often called Doob's stopping theorem or the optional sampling theorem. Recall that we assume $\mathcal{F}_0 = \{\emptyset, \Omega\}$.


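Theorem 6.15. Let $M$ be an adapted process such that $M_t \in L^1(Q)$ for each $t$. Then the following conditions are equivalent:
(a) $M$ is a $Q$-martingale.
(b) For any stopping time $\tau$ the stopped process $M^\tau$ is a $Q$-martingale.
(c) $E_Q[\, M_{\tau \wedge T} \,] = M_0$ for any stopping time $\tau$.
Proof. (a) $\Rightarrow$ (b): Note that
\[
M_{t+1}^\tau - M_t^\tau = (M_{t+1} - M_t)\, \mathbb{1}_{\{\tau > t\}}.
\]
Since $\{\tau > t\} \in \mathcal{F}_t$, we obtain that
\[
E_Q[\, M_{t+1}^\tau - M_t^\tau \mid \mathcal{F}_t \,] = E_Q[\, M_{t+1} - M_t \mid \mathcal{F}_t \,] \cdot \mathbb{1}_{\{\tau > t\}} = 0.
\]
(b) $\Rightarrow$ (c): This follows simply from the fact that the expectation of $M_t^\tau$ is constant in $t$.
(c) $\Rightarrow$ (a): We need to show that if $t < T$, then
\[
E_Q[\, M_T ; A \,] = E_Q[\, M_t ; A \,] \tag{6.9}
\]
for each $A \in \mathcal{F}_t$. Fix such an $A$ and define a stopping time $\tau$ as
\[
\tau(\omega) := \begin{cases} t & \text{if } \omega \in A, \\ T & \text{if } \omega \notin A. \end{cases}
\]
We obtain that
\[
M_0 = E_Q[\, M_{T \wedge \tau} \,] = E_Q[\, M_t ; A \,] + E_Q[\, M_T ; A^c \,].
\]
Using the constant stopping time $T$ instead of $\tau$ yields that
\[
M_0 = E_Q[\, M_T \,] = E_Q[\, M_T ; A \,] + E_Q[\, M_T ; A^c \,].
\]
Subtracting the latter identity from the previous one yields (6.9).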
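Condition (c) of Theorem 6.15 can be verified by brute force in a toy setting. The sketch below assumes, purely for illustration, that $M$ is a simple symmetric random walk started at 0 under a measure making the tosses fair and independent, and that $\tau$ is the first time $M$ reaches a level $c$; the function name `expected_stopped_value` is hypothetical.

```python
from itertools import product

def expected_stopped_value(T, c):
    """E[M_{tau ^ T}] for a simple symmetric random walk M with M_0 = 0,
    where tau is the first time M >= c (illustrative setting).

    All 2^T toss sequences are enumerated, each with probability 2^{-T}.
    """
    total = 0.0
    for tosses in product((-1, 1), repeat=T):
        m = 0
        for x in tosses:
            if m >= c:      # tau has already occurred: the walk stays frozen
                break
            m += x
        total += m          # m equals M_{tau ^ T} on this scenario
    return total / 2 ** T

value = expected_stopped_value(T=6, c=2)
```

The computed expectation equals $M_0 = 0$, although the stopped walk ends at the level $c$ on every scenario that reaches it.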
Exercise 6.2.2. Let $Y = M - A$ be the Doob decomposition with respect to $Q$ of an adapted process $Y$ with $Y_t \in L^1(Q)$ ($t = 0, \dots, T$), and let $\tau$ be a stopping time. Show that $Y^\tau = M^\tau - A^\tau$ is the Doob decomposition of $Y^\tau$. ◊
Corollary 6.16. Let $U$ be an adapted process such that $U_t \in L^1(Q)$ for each $t$. Then the following conditions are equivalent:
(a) $U$ is a $Q$-supermartingale.
(b) For any stopping time $\tau$, the stopped process $U^\tau$ is a $Q$-supermartingale.


Proof. If $U = M - A$ is the Doob decomposition of $U$, then Exercise 6.2.2 implies that $U^\tau = M^\tau - A^\tau$ is the Doob decomposition of $U^\tau$. This observation and Theorem 6.15 yield the equivalence of (a) and (b).
Let us return to the problem of finding an optimal exercise time $\tau$ for a discounted American claim $H$. We assume that the buyer chooses the possible exercise times from the set
\[
\mathcal{T} := \{\, \tau \mid \tau \text{ is a stopping time with } \tau \le T \,\}
\]
of all stopping times which do not take the value $+\infty$. Assume that the aim of the buyer is to choose a payoff from the class $\{\, H_\tau \mid \tau \in \mathcal{T} \,\}$ which is optimal in the sense that it has maximal expectation. Thus, the problem is
\[
\text{Maximize } E[\, H_\tau \,] \text{ among all } \tau \in \mathcal{T}. \tag{6.10}
\]

The analysis of the optimal stopping problem (6.10) does not require any properties of the underlying market model, not even the absence of arbitrage. We may also drop the positivity assumption on $H$: All we have to assume is that $H$ is an adapted process which satisfies
\[
H_t \in L^1(\Omega, \mathcal{F}_t, P) \quad \text{for all } t. \tag{6.11}
\]
This relaxed assumption will be useful in Chapter 9, and it allows us to include the interpretation of the optimal stopping problem in terms of the following utility maximization problem:
Remark 6.17. Suppose the buyer uses a preference relation on $\mathcal{X} := \{\, H_\tau \mid \tau \in \mathcal{T} \,\}$ which can be represented in terms of a Savage representation
\[
U(H_\tau) = E_Q[\, u(H_\tau) \,]
\]
where $Q$ is a probability measure on $(\Omega, \mathcal{F})$, and $u$ is a measurable or continuous function; see Section 2.5. Then a natural goal is to maximize the utility $U(H_\tau)$ among all $\tau \in \mathcal{T}$. This is equivalent to the optimal stopping problem (6.10) for the transformed process $\tilde H_t := u(H_t)$, and with respect to the measure $Q$ instead of $P$. This utility maximization problem is covered by the discussion in this section as long as $\tilde H_t \in L^1(Q)$ for all $t$. In Remark 6.49 we will discuss the problem of maximizing the more general utility functionals which appear in a robust Savage representation. ◊
Under the assumption (6.11), we can construct the Snell envelope $U := U^P$ of $H$ with respect to $P$, i.e., $U$ is defined via the recursive formula
\[
U_T := H_T \quad \text{and} \quad U_t := H_t \vee E[\, U_{t+1} \mid \mathcal{F}_t \,], \quad t = T - 1, \dots, 0.
\]
Let us define a stopping time $\tau_{\min}$ by
\[
\tau_{\min} := \min\{\, t \ge 0 \mid U_t = H_t \,\}.
\]
Note that $\tau_{\min} \le T$ since $U_T = H_T$. As we will see in the following theorem, $\tau_{\min}$ maximizes the expectation of $H_\tau$ among all $\tau \in \mathcal{T}$. In other words, $\tau_{\min}$ is a solution to our optimal stopping problem (6.10). Similarly, we let
\[
\tau_{\min}^{(t)} := \min\{\, u \ge t \mid U_u = H_u \,\},
\]
which is a member of the set
\[
\mathcal{T}_t := \{\, \tau \in \mathcal{T} \mid \tau \ge t \,\}.
\]
The following theorem uses the essential supremum of a family of random variables as explained in Appendix A.5.
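Theorem 6.18. The Snell envelope $U$ of $H$ satisfies
\[
U_t = E[\, H_{\tau_{\min}^{(t)}} \mid \mathcal{F}_t \,] = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E[\, H_\tau \mid \mathcal{F}_t \,].
\]
In particular,
\[
U_0 = E[\, H_{\tau_{\min}} \,] = \sup_{\tau \in \mathcal{T}} E[\, H_\tau \,].
\]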

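Proof. Since $U$ is a supermartingale under $P$, Corollary 6.16 shows that for $\tau \in \mathcal{T}_t$
\[
U_t \ge E[\, U_\tau \mid \mathcal{F}_t \,] \ge E[\, H_\tau \mid \mathcal{F}_t \,].
\]
Therefore,
\[
U_t \ge \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E[\, H_\tau \mid \mathcal{F}_t \,].
\]
Hence, the theorem will be proved if we can show that $U_t = E[\, H_{\tau_{\min}^{(t)}} \mid \mathcal{F}_t \,]$, which is in turn implied by the identity
\[
U_t = E[\, U_{\tau_{\min}^{(t)}} \mid \mathcal{F}_t \,]. \tag{6.12}
\]
In order to prove (6.12), let $U^{(t)}$ denote the stopped process
\[
U_s^{(t)} := U_{s \wedge \tau_{\min}^{(t)}},
\]
and fix some $s$ between $t$ and $T$. Then $U_s > H_s$ on $\{\tau_{\min}^{(t)} > s\}$. Hence, $P$-a.s. on $\{\tau_{\min}^{(t)} > s\}$:
\[
U_s^{(t)} = U_s = H_s \vee E[\, U_{s+1} \mid \mathcal{F}_s \,] = E[\, U_{s+1} \mid \mathcal{F}_s \,] = E[\, U_{s+1}^{(t)} \mid \mathcal{F}_s \,].
\]
On the set $\{\tau_{\min}^{(t)} \le s\}$ one has $U_{s+1}^{(t)} = U_{\tau_{\min}^{(t)}} = U_s^{(t)}$, hence $U_s^{(t)} = E[\, U_{s+1}^{(t)} \mid \mathcal{F}_s \,]$. Thus, $U^{(t)}$ is a martingale from time $t$ on:
\[
U_s^{(t)} = E[\, U_{s+1}^{(t)} \mid \mathcal{F}_s \,] \quad \text{for all } s \in \{t, t + 1, \dots, T - 1\}.
\]
It follows that
\[
E[\, U_{\tau_{\min}^{(t)}} \mid \mathcal{F}_t \,] = E[\, U_T^{(t)} \mid \mathcal{F}_t \,] = U_t^{(t)} = U_t.
\]
This proves the claim (6.12).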


Definition 6.19. A stopping time $\tau^* \in \mathcal{T}$ is called optimal (with respect to $P$) if
\[
E[\, H_{\tau^*} \,] = \sup_{\tau \in \mathcal{T}} E[\, H_\tau \,].
\]
In particular, $\tau_{\min}$ is an optimal stopping time in the sense of this definition. The following result implies that $\tau_{\min}$ is in fact the minimal optimal stopping time.
Proposition 6.20. A stopping time $\tau \in \mathcal{T}$ is optimal if and only if $H_\tau = U_\tau$ $P$-a.s., and if the stopped process $U^\tau$ is a martingale. In particular, any optimal stopping time $\tau$ satisfies $\tau \ge \tau_{\min}$.
Proof. First note that $\tau \in \mathcal{T}$ is optimal if it satisfies the two conditions of the assertion, because then Theorem 6.18 implies that
\[
\sup_{\sigma \in \mathcal{T}} E[\, H_\sigma \,] = U_0 = E[\, U_T^\tau \,] = E[\, U_\tau \,] = E[\, H_\tau \,].
\]
For the converse implication, we apply the assumption of optimality, the fact that $H_\tau \le U_\tau$, and the stopping theorem for supermartingales to obtain that
\[
U_0 = E[\, H_\tau \,] \le E[\, U_\tau \,] \le U_0,
\]
so that all inequalities are in fact equalities. It follows in particular that $H_\tau = U_\tau$ $P$-almost surely. Moreover, the identity $E[\, U_\tau \,] = U_0$ implies that the stopped process $U^\tau$ is a supermartingale with constant expectation $U_0$, and hence is a martingale.
In general, there can be many different optimal stopping times. The largest optimal stopping time admits an explicit description: It is the first time before $T$ for which the Snell envelope $U$ loses the martingale property:
\[
\tau_{\max} := \inf\{\, t \ge 0 \mid E[\, U_{t+1} - U_t \mid \mathcal{F}_t \,] \ne 0 \,\} \wedge T = \inf\{\, t \ge 0 \mid A_{t+1} \ne 0 \,\} \wedge T.
\]
Here, $A$ denotes the increasing process obtained from the Doob decomposition of $U$ under $P$.
Theorem 6.21. The stopping time $\tau_{\max}$ is the largest optimal stopping time. Moreover, a stopping time $\tau$ is optimal if and only if $P$-a.s. $\tau \le \tau_{\max}$ and $U_\tau = H_\tau$.


Proof. Let $U = M - A$ be the Doob decomposition of $U$. Recall from Exercise 6.2.2 that $U^\tau = M^\tau - A^\tau$ is the Doob decomposition of $U^\tau$ for any stopping time $\tau$. Thus, $U^\tau$ is a martingale if and only if $A_\tau = 0$, because $A$ is increasing. Therefore, $U^\tau$ is a martingale if and only if $\tau \le \tau_{\max}$, and so the second part of the assertion follows from Proposition 6.20. It remains to prove that $\tau_{\max}$ itself is optimal, i.e., that $U_{\tau_{\max}} = H_{\tau_{\max}}$. This is clear on the set $\{\tau_{\max} = T\}$. On the set $\{\tau_{\max} = t\}$ for $t < T$ one has $A_t = 0$ and $A_{t+1} > 0$. Hence,
\[
E[\, U_{t+1} - U_t \mid \mathcal{F}_t \,] = -(A_{t+1} - A_t) = -A_{t+1} < 0 \quad \text{on } \{\tau_{\max} = t\}.
\]
Thus, $U_t > E[\, U_{t+1} \mid \mathcal{F}_t \,]$ and the definition of the Snell envelope yields that $U_t = H_t \vee E[\, U_{t+1} \mid \mathcal{F}_t \,] = H_t$ on $\{\tau_{\max} = t\}$.
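The two extreme optimal stopping times can be read off from the Snell envelope along each scenario. The following sketch rests on illustrative assumptions not made in the text: payoffs live on a recombining binary tree indexed by the number of up-moves, each up-move has conditional probability `p`, and a scenario is given as a tuple of moves (1 for up, 0 for down). The names `snell` and `tau_min_max` are hypothetical.

```python
def snell(H, p):
    """Snell envelope U of H on a recombining binary tree (U_T = H_T, backward recursion)."""
    T = len(H) - 1
    U = [None] * (T + 1)
    U[T] = list(H[T])
    for t in range(T - 1, -1, -1):
        U[t] = [max(H[t][j], p * U[t + 1][j + 1] + (1 - p) * U[t + 1][j])
                for j in range(t + 1)]
    return U

def tau_min_max(H, p, moves):
    """Return (tau_min, tau_max) along one scenario given by `moves` (1 = up, 0 = down)."""
    U = snell(H, p)
    T = len(H) - 1
    j = 0                      # number of up-moves so far
    t_min = t_max = None
    for t in range(T + 1):
        if t_min is None and U[t][j] == H[t][j]:
            t_min = t          # first time the envelope touches the payoff
        if t_max is None:
            if t == T:
                t_max = T
            else:
                cont = p * U[t + 1][j + 1] + (1 - p) * U[t + 1][j]
                if cont < U[t][j]:   # A_{t+1} > 0: the martingale property is lost
                    t_max = t
        if t < T:
            j += moves[t]
    return t_min, t_max

H = [[1.0], [0.0, 3.0], [4.0, 0.0, 2.0]]        # hypothetical payoff table
flat = [[1.0], [1.0, 1.0], [1.0, 1.0, 1.0]]     # constant payoff
```

For the constant payoff every stopping time in $\mathcal{T}$ is optimal, so the two times are as far apart as possible: `tau_min_max(flat, 0.5, (1, 0))` gives `(0, 2)`.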
Let us now return to our complete financial market model, where $H_t$ is the discounted payoff of an American contingent claim. Thus, an optimal stopping strategy for $H$ maximizes the expected payoff $E[\, H_\tau \,]$. But a stopping time turns out to be the best choice even in a pathwise sense, provided that it is optimal with respect to the unique equivalent martingale measure $P^*$ in a complete market model. In order to explain this fact, let us first recall from Section 6.1 the construction of a perfect hedge of $H$ from the seller's perspective. Let
\[
U^{P^*} = M - A
\]
denote the Doob decomposition of the Snell envelope $U^{P^*}$ of $H$ with respect to $P^*$. Since $P^*$ is the unique equivalent martingale measure in our model, the martingale $M$ has the representation
\[
M_t = U_0^{P^*} + \sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1}), \quad t = 0, \dots, T,
\]
for a $d$-dimensional predictable process $\xi$. Clearly, $M$ is equal to the value process of the self-financing strategy constructed from $\xi$ and the initial investment $U_0^{P^*}$. Since $M$ dominates $H$, this yields a perfect hedge of $H$ from the perspective of the seller: If the buyer exercises the option at some stopping time $\tau$, then the seller makes a profit $M_\tau - H_\tau \ge 0$. The following corollary states that the buyer can in fact meet the value of the seller's hedging portfolio, and that this happens if and only if the option is exercised at an optimal stopping time with respect to $P^*$. In this sense, $U_0^{P^*}$ can be regarded as the unique arbitrage-free price of the discounted American claim $H$.
Corollary 6.22. With the above notation,
\[
H_\tau \le M_\tau = U_0^{P^*} + \sum_{k=1}^{\tau} \xi_k \cdot (X_k - X_{k-1}), \quad P^*\text{-a.s. for all } \tau \in \mathcal{T},
\]
and equality holds $P^*$-almost surely if and only if $\tau$ is optimal with respect to $P^*$.

Proof. At time $\tau$,
\[
H_\tau \le U_\tau^{P^*} = M_\tau - A_\tau \le M_\tau.
\]
Moreover, by Theorem 6.21, both $H_\tau = U_\tau^{P^*}$ and $A_\tau = 0$ hold $P^*$-a.s. if and only if $\tau$ is optimal with respect to $P^*$.
Let us now compare a discounted American claim $H$ to the corresponding discounted European claim $H_T$, i.e., to the contract which is obtained from $H$ by restricting the exercise time to be $T$. In particular, we are interested in the relation between American and European put or call options. Let
\[
V_t := E^*[\, H_T \mid \mathcal{F}_t \,]
\]
denote the amount needed at time $t$ to hedge $H_T$. Since our market model is complete, $V_t$ can also be regarded as the unique arbitrage-free price of the discounted claim $H_T$ at time $t$. From the seller's perspective, $U_t^{P^*}$ plays a similar role for the American option. It is intuitively clear that an American claim should be more expensive than the corresponding European one. This is made mathematically precise in the following statement.
Proposition 6.23. With the above notation, $U_t^{P^*} \ge V_t$ for all $t$. Moreover, if $V$ dominates $H$, then $U^{P^*}$ and $V$ coincide.

Proof. The first statement follows immediately from the supermartingale property of $U^{P^*}$:
\[
U_t^{P^*} \ge E^*[\, U_T^{P^*} \mid \mathcal{F}_t \,] = E^*[\, H_T \mid \mathcal{F}_t \,] = V_t.
\]
Next, if the $P^*$-martingale $V$ dominates $H$, then it also dominates the corresponding Snell envelope $U^{P^*}$ by Proposition 6.10. Thus $V$ and $U^{P^*}$ must coincide.
Remark 6.24. The situation in which $V$ dominates $H$ occurs, in particular, when the process $H$ is a $P^*$-submartingale. This happens, for instance, if $H$ is obtained by applying a convex function $f : \mathbb{R}^d \to [0, \infty)$ to the discounted price process $X$. Indeed, in this case, Jensen's inequality for conditional expectations implies that
\[
E^*[\, f(X_{t+1}) \mid \mathcal{F}_t \,] \ge f(E^*[\, X_{t+1} \mid \mathcal{F}_t \,]) = f(X_t).
\]

Example 6.25. The discounted payoff of an American call option $C_t^{\text{call}} = (S_t^1 - K)^+$ is given by
\[
H_t^{\text{call}} = \Big( X_t^1 - \frac{K}{S_t^0} \Big)^+.
\]
Under the hypothesis that $S_t^0$ is increasing in $t$, (5.24) states that
\[
E^*[\, H_{t+1}^{\text{call}} \mid \mathcal{F}_t \,] \ge H_t^{\text{call}} \quad P^*\text{-a.s. for } t = 0, \dots, T - 1.
\]
In other words, $H^{\text{call}}$ is a submartingale, and the Snell envelope $U^{P^*}$ of $H^{\text{call}}$ coincides with the value process
\[
V_t = E^*\Big[\, \Big( X_T^1 - \frac{K}{S_T^0} \Big)^+ \,\Big|\, \mathcal{F}_t \,\Big]
\]
of the corresponding European call option with maturity $T$. In particular, we have $U_0^{P^*} = V_0$, i.e., the unique arbitrage-free price of the American call option is equal to its European counterpart. Moreover, Theorem 6.21 implies that the maximal optimal stopping time with respect to $P^*$ is given by $\tau_{\max} \equiv T$. This suggests that, in a complete model, an American call should not be exercised before maturity. ◊
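The conclusion of Example 6.25 can be checked numerically in a binomial model. The sketch below is an illustration under assumptions chosen here, not taken from the example itself: a CRR-type model with returns $a < r < b$, a bond $S_t^0 = (1 + r)^t$ with $r \ge 0$, and risk-neutral up-probability $(r - a)/(b - a)$. The function name `call_prices_crr` and the parameter values are hypothetical.

```python
def call_prices_crr(x, K, a, b, r, T):
    """Time-0 prices (American, European) of a call with strike K in a binomial model.

    Illustrative assumptions: S_0 = x, returns a or b per period, riskless
    rate r with a < r < b, and up-probability p = (r - a) / (b - a).
    """
    p = (r - a) / (b - a)

    def h(t, j):
        # discounted call payoff at time t after j up-moves
        s = x * (1 + b) ** j * (1 + a) ** (t - j)
        return max(s - K, 0.0) / (1 + r) ** t

    amer = [h(T, j) for j in range(T + 1)]
    euro = list(amer)
    for t in range(T - 1, -1, -1):
        # European value: plain conditional expectation
        euro = [p * euro[j + 1] + (1 - p) * euro[j] for j in range(t + 1)]
        # American value: Snell envelope recursion
        amer = [max(h(t, j), p * amer[j + 1] + (1 - p) * amer[j])
                for j in range(t + 1)]
    return amer[0], euro[0]

amer, euro = call_prices_crr(x=100.0, K=100.0, a=-0.1, b=0.2, r=0.05, T=3)
```

With these parameters the two prices agree, since the continuation value is never smaller than the immediate payoff at any node.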
Example 6.26. For an American put option $C_t^{\text{put}} := (K - S_t^1)^+$ the situation is different, because the argument in (5.24) fails unless $S^0$ is decreasing. If $S^0$ is an increasing bond, then the time value
\[
W_t := S_t^0\, E^*\Big[\, \frac{(K - S_T^1)^+}{S_T^0} \,\Big|\, \mathcal{F}_t \,\Big] - (K - S_t^1)^+
\]
of a European put $(K - S_T^1)^+$ typically becomes negative at a certain time $t$, corresponding to an early exercise premium $-W_t$; see Figure 5.3. Thus, the early exercise premium is the surplus which an owner of the American put option would have over the value of the European put $(K - S_T^1)^+$.
The relation between the price of a put option and its intrinsic value can be illustrated in the context of the CRR model. With the notation of Section 5.5, the price process of the risky asset $S_t = S_t^1$ can be written as
\[
S_t = S_0 \Lambda_t \quad \text{for } \Lambda_t := \prod_{k=1}^{t} (1 + R_k)
\]
and with the constant $S_0 \ge 0$. Recall that the returns $R_k$ can take only two possible values $a$ and $b$ with $-1 < a < b$, and that the market model is arbitrage-free if and only if the riskless interest rate $r$ satisfies $a < r < b$. In this case, the model is complete, and the unique equivalent martingale measure $P^*$ is characterized by the fact that it makes $R_1, \dots, R_T$ independent with common distribution
\[
P^*[\, R_k = b \,] = p^* = \frac{r - a}{b - a}. \tag{6.13}
\]
Let
\[
\pi(x) := \sup_{\tau \in \mathcal{T}} E^*\Big[\, \frac{(K - x \Lambda_\tau)^+}{(1 + r)^\tau} \,\Big]
\]
denote the price of $C^{\text{put}}$ regarded as a function of $x := S_0$. Clearly, $\pi(x)$ is a convex and decreasing function in $x$. Let us assume that $r > 0$ and that the parameter $a$ is strictly negative. A trivial situation occurs if the option is far out of the money in the sense that
\[
x \ge \frac{K}{(1 + a)^T},
\]
because then $S_t = x \Lambda_t \ge K$ for all $t$, and the payoff of $C^{\text{put}}$ is always zero. In particular, $\pi(x) = 0$. If
\[
x \le \frac{K}{(1 + b)^T} \tag{6.14}
\]
then $S_t = x \Lambda_t \le K$ for all $t$, and hence
\[
\pi(x) = \sup_{\tau \in \mathcal{T}} E^*\Big[\, \frac{K}{(1 + r)^\tau} \,\Big] - x = K - x.
\]
In this case, the price of the American put option is equal to its intrinsic value $(K - x)^+$ at time $t = 0$, and an optimal strategy for the owner would simply consist in exercising the option immediately, i.e., there is no demand for the option in the regime (6.14).
Now consider the case
\[
K \le x < \frac{K}{(1 + a)^T}
\]
of a put option which is at the money or not too far out of the money. For large enough $t > 0$, the probability $P^*[\, C_t^{\text{put}} > 0 \,]$ of a non-zero payoff is strictly positive, while the intrinsic value $(K - x)^+$ vanishes. It follows that the price $\pi(x)$ is strictly higher than the intrinsic value, and so it is not optimal for the buyer to exercise the option immediately.
Summarizing our observations, we can say that there exists a value $x^*$ with
\[
\frac{K}{(1 + b)^T} \le x^* < K
\]
such that
\[
\begin{aligned}
\pi(x) &= (K - x)^+ && \text{for } x \le x^*, \\
\pi(x) &> (K - x)^+ && \text{for } x^* < x < K/(1 + a)^T, \\
\pi(x) &= 0 && \text{for } x \ge K/(1 + a)^T;
\end{aligned}
\]
see Figure 6.1.

Figure 6.1. The price of an American put option as a function of $S_0$ compared to the option's intrinsic value $(K - S_0)^+$.

Remark 6.27. In the context of an arbitrage-free CRR model, we consider a discounted American claim $H$ whose payoff is determined by a function of time and of the current spot price, i.e.,
\[
H_t = h_t(S_t) \quad \text{for all } t.
\]
Clearly, this setting includes American call and put options as special cases. By using the same arguments as in the derivation of (5.28), we get that the Snell envelope $U^{P^*}$ of $H$ is of the form
\[
U_t^{P^*} = u_t(S_t), \quad t = 0, \dots, T,
\]
where the functions $u_t$ are determined by the recursion
\[
u_T(x) = h_T(x) \quad \text{and} \quad u_t(x) = h_t(x) \vee \big( u_{t+1}(x \hat b)\, p^* + u_{t+1}(x \hat a)\, (1 - p^*) \big).
\]
Here $p^*$ is defined as in (6.13), and the parameters $\hat a$ and $\hat b$ are given by $\hat a = 1 + a$ and $\hat b = 1 + b$. Thus, the space $[0, T] \times [0, \infty)$ can be decomposed into the two regions
\[
R_c := \{\, (t, x) \mid u_t(x) > h_t(x) \,\} \quad \text{and} \quad R_s := \{\, (t, x) \mid u_t(x) = h_t(x) \,\},
\]
and the minimal optimal stopping time $\tau_{\min}$ can be described as the first exit time of the space time process $(t, S_t)$ from the continuation region $R_c$ or, equivalently, as the first entrance time into the stopping region $R_s$:
\[
\tau_{\min} = \min\{\, t \ge 0 \mid (t, S_t) \notin R_c \,\} = \min\{\, t \ge 0 \mid (t, S_t) \in R_s \,\}.
\]
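The recursion for $u_t$ can be turned into a small pricing routine for the American put in the CRR model. The sketch below is illustrative only: the function name `american_put_crr` and the parameter values are assumptions, and the code evaluates $u_t$ only at the lattice points $x \hat b^{\,j} \hat a^{\,t-j}$ that the price process can actually reach.

```python
def american_put_crr(x, K, a, b, r, T):
    """Price pi(x) of the American put with strike K in a CRR model with S_0 = x.

    Uses u_T = h_T and u_t = h_t v (u_{t+1}(x*b_hat) p + u_{t+1}(x*a_hat) (1 - p)),
    where h_t(s) = (K - s)^+ / (1 + r)^t is the discounted payoff.
    """
    p = (r - a) / (b - a)            # risk-neutral up-probability
    b_hat, a_hat = 1 + b, 1 + a

    def h(t, j):
        s = x * b_hat ** j * a_hat ** (t - j)   # spot price after j up-moves
        return max(K - s, 0.0) / (1 + r) ** t

    u = [h(T, j) for j in range(T + 1)]
    for t in range(T - 1, -1, -1):
        u = [max(h(t, j), p * u[j + 1] + (1 - p) * u[j]) for j in range(t + 1)]
    return u[0]

# Hypothetical parameters: K / (1 + b)^T is about 57.9, K / (1 + a)^T about 137.2.
params = dict(K=100.0, a=-0.1, b=0.2, r=0.05, T=3)
deep_in = american_put_crr(50.0, **params)     # below K / (1 + b)^T
at_money = american_put_crr(100.0, **params)   # x = K
far_out = american_put_crr(140.0, **params)    # above K / (1 + a)^T
```

The three values reproduce the regimes discussed for $\pi(x)$: immediate exercise with price $K - x$ for small $x$, a price strictly above the vanishing intrinsic value at $x = K$, and price zero far out of the money.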

Exercise 6.2.3. Consider a market model with two assets and an American contingent claim. The development of the discounted price process $X := X^1$ and the discounted American claim is described by the following diagram.

Time 0: $X_0 = 5$, $H_0 = 1$.
Time 1: either $X_1 = 8$, $H_1 = 1.5$ or $X_1 = 4$, $H_1 = 0$.
Time 2, after $X_1 = 8$: either $X_2 = 9$, $H_2 = 4$ or $X_2 = 6$, $H_2 = 1$.
Time 2, after $X_1 = 4$: either $X_2 = 6$, $H_2 = 1$ or $X_2 = 3$, $H_2 = 0$.

The buyer of the American claim uses a probability measure $P$ that assigns equal probability to each of the possible scenarios. Find an optimal stopping strategy that maximizes $E[\, H_\tau \,]$ over $\tau \in \mathcal{T}$. What would be an optimal stopping time if the buyer uses the utility function $u(x) = \sqrt{x}$ and thus aims at maximizing $E[\, \sqrt{H_\tau} \,]$? Then show that the market model admits a unique risk-neutral measure $P^*$ and compute the corresponding Snell envelope $U^{P^*}$. ◊
Exercise 6.2.4. Let
\[
H_t^K := \frac{(K - S_t)^+}{(1 + r)^t}
\]
be the discounted payoff of an American put option with strike $K$ in a market model with one risky asset $S = (S_t)_{t=0,\dots,T}$ and a riskless asset $S_t^0 = (1 + r)^t$, where $r > 0$. We denote by $\tau_{\min}^K$ the minimal optimal stopping time of the buyer's problem to maximize $E[\, H_\tau^K \,]$ over $\tau \in \mathcal{T}$.
(a) Show that $\tau_{\min}^K \ge \tau_{\min}^{K'}$ $P$-a.s. when $K \le K'$.
(b) Show that $\operatorname*{ess\,inf}_{K \ge 0} \tau_{\min}^K = 0$ $P$-a.s.
(c) Use (b) and the fact that $\mathcal{F}_0 = \{\emptyset, \Omega\}$ to conclude that there exists $K_0 \ge 0$ such that $\tau_{\min}^K = 0$ $P$-a.s. for all $K \ge K_0$. ◊

6.3 Arbitrage-free prices

In this section, we drop the condition of market completeness, and we develop the notion of an arbitrage-free price $\pi$ for a discounted American claim $H$ in a general incomplete framework. The basic idea consists in reducing the problem to the determination of the arbitrage-free price for the payoff $H_\tau$ which arises from $H$ by fixing the exercise strategy $\tau$. The following remark explains that $H_\tau$ can be treated like the discounted payoff of a European contingent claim, whose set of arbitrage-free prices is given by
\[
\Pi(H_\tau) = \{\, E^*[\, H_\tau \,] \mid P^* \in \mathcal{P},\ E^*[\, H_\tau \,] < \infty \,\}. \tag{6.15}
\]


Remark 6.28. As observed in Remark 5.33, a discounted payoff $\tilde H_t$ which is received at time $t < T$ can be regarded as a discounted European claim $\tilde H^E$ maturing at $T$. $\tilde H^E$ is obtained from $\tilde H_t$ by investing at time $t$ the payoff $S_t^0 \tilde H_t$ into the numéraire, i.e., by buying $\tilde H_t$ shares of the 0th asset, and by considering the discounted terminal value of this investment:
\[
\tilde H^E = \frac{1}{S_T^0} \big( S_T^0 \tilde H_t \big) = \tilde H_t.
\]
In the case of our discounted American claim $H$ which is paid off at the random time $\tau$, we can either apply this argument to each payoff
\[
\tilde H_t := H_\tau \mathbb{1}_{\{\tau = t\}} = H_t \mathbb{1}_{\{\tau = t\}},
\]
or directly use a stopping time version of this argument. We conclude that $H_\tau$ can be regarded as a discounted European claim, whose arbitrage-free prices are given by (6.15). ◊
Now suppose that $H$ is offered at time $t = 0$ for a price $\pi \ge 0$. From the buyer's point of view there should be at least one exercise strategy $\tau$ such that the proposed price $\pi$ is not too high in the sense that $\pi \le \pi'$ for some $\pi' \in \Pi(H_\tau)$. From the seller's point of view the situation looks different: There should be no exercise strategy $\tau'$ such that the proposed price $\pi$ is too low in the sense that $\pi < \pi'$ for all $\pi' \in \Pi(H_{\tau'})$. By adding the assumption that the buyer only uses stopping times in exercising the option, we obtain the following formal definition.
Definition 6.29. A real number $\pi$ is called an arbitrage-free price of a discounted American claim $H$ if the following two conditions are satisfied:
• The price $\pi$ is not too high in the sense that there exists some $\tau \in \mathcal{T}$ and $\pi' \in \Pi(H_\tau)$ such that $\pi \le \pi'$.
• The price $\pi$ is not too low in the sense that there exists no $\tau' \in \mathcal{T}$ such that $\pi < \pi'$ for all $\pi' \in \Pi(H_{\tau'})$.

The set of all arbitrage-free prices of $H$ is denoted $\Pi(H)$, and we define
\[
\pi_{\inf}(H) := \inf \Pi(H) \quad \text{and} \quad \pi_{\sup}(H) := \sup \Pi(H).
\]
Recall from Remark 6.6 that every discounted European claim $H^E$ can be regarded as a discounted American claim $H^A$ whose payoff is zero if $H^A$ is exercised before $T$, and whose payoff at $T$ equals $H^E$. Clearly, the two sets $\Pi(H^E)$ and $\Pi(H^A)$ coincide, and so the two Definitions 5.28 and 6.29 are consistent with each other.
Remark 6.30. It follows from the definition that any arbitrage-free price $\pi$ for $H$ must be an arbitrage-free price for some $H_\tau$. Hence, (6.15) implies that $\pi = E^*[\, H_\tau \,]$ for some $P^* \in \mathcal{P}$. Similarly, we obtain from the second condition in Definition 6.29 that $\pi \ge \inf_{P^* \in \mathcal{P}} E^*[\, H_\tau \,]$ for all $\tau \in \mathcal{T}$. It follows that
\[
\sup_{\tau \in \mathcal{T}} \inf_{P^* \in \mathcal{P}} E^*[\, H_\tau \,] \le \pi \le \sup_{\tau \in \mathcal{T}} \sup_{P^* \in \mathcal{P}} E^*[\, H_\tau \,] \quad \text{for all } \pi \in \Pi(H). \tag{6.16}
\]
In particular,
\[
\sup_{\tau \in \mathcal{T}} E^*[\, H_\tau \,]
\]
is the unique arbitrage-free price of $H$ if $P^*$ is the unique equivalent martingale measure in a complete market model, and so Definition 6.29 is consistent with the results of Sections 6.1 and 6.2. ◊
Exercise 6.3.1. Show that in every arbitrage-free market model and for any discounted American claim $H$,
\[
\inf_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[\, H_\tau \,] < \infty, \tag{6.17}
\]
and that the set $\Pi(H)$ of arbitrage-free prices is nonempty.

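Our main goal in this section is to characterize the set $\Pi(H)$, and to identify the upper and lower bounds in (6.16) with the quantities $\pi_{\sup}(H)$ and $\pi_{\inf}(H)$. We will work under the simplifying assumption that
\[
H_t \in L^1(P^*) \quad \text{for all } t \text{ and each } P^* \in \mathcal{P}. \tag{6.18}
\]
For each $P^* \in \mathcal{P}$ we denote by $U^{P^*}$ the corresponding Snell envelope of $H$, i.e.,
\[
U_t^{P^*} = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E^*[\, H_\tau \mid \mathcal{F}_t \,].
\]
With this notation, the right-hand bound in (6.16) can be written as
\[
\sup_{\tau \in \mathcal{T}} \sup_{P^* \in \mathcal{P}} E^*[\, H_\tau \,] = \sup_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[\, H_\tau \,] = \sup_{P^* \in \mathcal{P}} U_0^{P^*}.
\]
In fact, a similar relation also holds for the lower bound in (6.16):
\[
\sup_{\tau \in \mathcal{T}} \inf_{P^* \in \mathcal{P}} E^*[\, H_\tau \,] = \inf_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[\, H_\tau \,] = \inf_{P^* \in \mathcal{P}} U_0^{P^*}. \tag{6.19}
\]
The proof that the above interchange of infimum and supremum is indeed justified under assumption (6.18) is postponed to the next section; see Theorem 6.45.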
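Theorem 6.31. Under condition (6.18), the set of arbitrage-free prices for $H$ is a real interval with endpoints
\[
\pi_{\inf}(H) = \inf_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[\, H_\tau \,] = \sup_{\tau \in \mathcal{T}} \inf_{P^* \in \mathcal{P}} E^*[\, H_\tau \,]
\]
and
\[
\pi_{\sup}(H) = \sup_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[\, H_\tau \,] = \sup_{\tau \in \mathcal{T}} \sup_{P^* \in \mathcal{P}} E^*[\, H_\tau \,].
\]
Moreover, $\Pi(H)$ either consists of one single point or does not contain its upper endpoint $\pi_{\sup}(H)$.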
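Proof. Let $\tau^*$ be a stopping time which is optimal with respect to a given $P^* \in \mathcal{P}$. Then $U_0^{P^*} = E^*[H_{\tau^*}] = \sup_{\tau' \in \mathcal{T}} E^*[H_{\tau'}]$, and consequently $U_0^{P^*} \in \Pi(H)$. Together with the a priori bounds (6.16), we obtain the inclusions
$$\{U_0^{P^*} \mid P^* \in \mathcal{P}\} \subset \Pi(H) \subset [a, b], \tag{6.20}$$
where
$$a := \sup_{\tau \in \mathcal{T}} \inf_{P^* \in \mathcal{P}} E^*[H_\tau] \quad \text{and} \quad b := \sup_{\tau \in \mathcal{T}} \sup_{P^* \in \mathcal{P}} E^*[H_\tau].$$
Moreover, the minimax identity (6.19) shows that
$$a = \inf_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[H_\tau] = \inf_{P^* \in \mathcal{P}} U_0^{P^*} \quad \text{and} \quad b = \sup_{P^* \in \mathcal{P}} U_0^{P^*}.$$
Together with (6.20), this yields the identification of $\inf \Pi(H)$ and $\sup \Pi(H)$ as $a$ and $b$.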

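Now we claim that $\{U_0^{P^*} \mid P^* \in \mathcal{P}\}$ is an interval, which, in view of the preceding step, will prove that $\Pi(H)$ is also an interval. Take $P_0, P_1 \in \mathcal{P}$ and define $P_\alpha \in \mathcal{P}$ by $P_\alpha := \alpha P_1 + (1 - \alpha) P_0$ for $0 \le \alpha \le 1$. By Theorem 6.18, $f(\alpha) := U_0^{P_\alpha}$ is the supremum of the affine functions
$$\alpha \mapsto E_\alpha[H_\tau] = \alpha E_1[H_\tau] + (1 - \alpha) E_0[H_\tau], \quad \tau \in \mathcal{T}.$$
Thus, $f$ is convex and lower semicontinuous on $[0, 1]$, hence continuous; see part (a) of Proposition A.4. Since $\mathcal{P}$ is convex, this proves our claim.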
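It remains to exclude the possibility that $b$ belongs to $\Pi(H)$ in case $a < b$. Suppose by way of contradiction that $b \in \Pi(H)$. Then there exist $\hat\tau \in \mathcal{T}$ and $\hat P \in \mathcal{P}$ such that
$$\hat E[H_{\hat\tau}] = b = \sup_{\tau \in \mathcal{T}} \sup_{P^* \in \mathcal{P}} E^*[H_\tau].$$
In particular, $\hat P$ attains the supremum of $E^*[H_{\hat\tau}]$ for $P^* \in \mathcal{P}$. Theorem 5.32 implies that the discounted European claim $H_{\hat\tau}$ is attainable and that $E^*[H_{\hat\tau}]$ is in fact independent of $P^* \in \mathcal{P}$. Hence,
$$b = \hat E[H_{\hat\tau}] = \inf_{P^* \in \mathcal{P}} E^*[H_{\hat\tau}] \le \sup_{\tau \in \mathcal{T}} \inf_{P^* \in \mathcal{P}} E^*[H_\tau],$$
and we end up with the contradiction $b \le a$. Thus, $b$ cannot belong to $\Pi(H)$.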


Comparing the previous result with Theorem 5.32, one might wonder whether $\Pi(H)$ contains its lower bound if $\inf \Pi(H) < \sup \Pi(H)$. At first glance, it may come as a surprise that both cases
$$\inf \Pi(H) \in \Pi(H) \quad \text{and} \quad \inf \Pi(H) \notin \Pi(H)$$
can occur, as is illustrated by the following simple example.


Example 6.32. Consider a complete market model with $T = 2$, defined on some probability space $(\Omega_0, \mathcal{G}_0, P_0)$. This model will be enlarged by adding two external states $\omega^+$ and $\omega^-$, i.e., we define $\Omega := \Omega_0 \times \{\omega^+, \omega^-\}$ and
$$P[\{(\omega_0, \omega^\pm)\}] := \tfrac{1}{2} P_0[\{\omega_0\}], \quad \omega_0 \in \Omega_0.$$
We assume that this additional information is revealed at time 2. The enlarged financial market model will then be incomplete, and the corresponding set $\mathcal{P}$ of equivalent martingale measures satisfies
$$\mathcal{P} = \{P_p^* \mid 0 < p < 1\},$$
where $P_p^*$ is determined by $P_p^*[\Omega_0 \times \{\omega^+\}] = p$. Consider the discounted American claim $H$ defined as
$$H_0 \equiv 0, \quad H_1 \equiv 1, \quad \text{and} \quad H_2(\omega) := \begin{cases} 2 & \text{if } \omega = (\omega_0, \omega^+), \\ 0 & \text{if } \omega = (\omega_0, \omega^-). \end{cases}$$
Clearly, $\Pi(H) \subset [1, 2)$. On the other hand, $\tau_2 \equiv 2$ is an optimal stopping time for $P_p^*$ if $p > \tfrac12$, while $\tau_1 \equiv 1$ is optimal for $p \le \tfrac12$. Hence,
$$\Pi(H) = [1, 2),$$
and the lower bound $\inf \Pi(H) = 1$ is an arbitrage-free price for $H$. Now consider the discounted American claim $\tilde H$ defined by $\tilde H_t = H_t$ for $t = 0, 2$ and by $\tilde H_1 \equiv 0$. In this case, we have
$$\Pi(\tilde H) = (0, 2).$$
◊
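The two price intervals of Example 6.32 can be checked numerically. The following is a minimal sketch, assuming that under each $P_p^*$ the external state is $\omega^+$ with probability $p$ independently of the underlying complete model, so that the base coordinate can be ignored when evaluating $H$. The function name `snell_value_at_zero` and the grid of values for $p$ are illustrative choices, not part of the text.

```python
def snell_value_at_zero(h0, h1, h2_plus, h2_minus, p):
    """U_0 for a claim with constant H_0, H_1 and a two-point payoff H_2."""
    # Continuation value at time 1: expected payoff from waiting until time 2.
    continuation = p * h2_plus + (1.0 - p) * h2_minus
    u1 = max(h1, continuation)  # Snell envelope at time 1
    return max(h0, u1)          # Snell envelope at time 0

grid = [k / 1000 for k in range(1, 1000)]  # values of p in (0, 1)

# H has H_0 = 0, H_1 = 1; the modified claim has H_1 = 0 instead.
prices_H = [snell_value_at_zero(0, 1, 2, 0, p) for p in grid]
prices_H_tilde = [snell_value_at_zero(0, 0, 2, 0, p) for p in grid]

print(min(prices_H), max(prices_H))
print(min(prices_H_tilde), max(prices_H_tilde))
```

On this grid the values $U_0^{P_p^*}$ for $H$ stay in $[1, 2)$ and the value 1 is attained, while those for $\tilde H$ stay strictly between 0 and 2, in line with the intervals stated in the example.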
Theorem 6.31 suggests that an American claim $H$ which admits a unique arbitrage-free price should be attainable in an appropriate sense. Corollary 6.22, our hedging result in the case of a complete market, suggests the following definition of attainability.
Definition 6.33. A discounted American claim $H$ is called attainable if there exists a stopping time $\tau \in \mathcal{T}$ and a self-financing trading strategy $\bar\xi$ whose value process $V$ satisfies $P$-a.s.
$$V_t \ge H_t \text{ for all } t, \quad \text{and} \quad V_\tau = H_\tau.$$
The trading strategy $\bar\xi$ is called a hedging strategy for $H$.


If $H$ is attainable, then a hedging strategy protects the seller not only against those claims $H_\tau$ which arise from stopping times $\tau$. The seller is on the safe side even if the buyer would have full knowledge of future prices and would exercise $H$ at an arbitrary $\mathcal{F}_T$-measurable random time $\sigma$. For instance, the buyer even could choose $\sigma$ such that
$$H_\sigma = \max_{0 \le t \le T} H_t.$$
In fact, we will see in Remark 7.12 that $H$ is attainable in the sense of Definition 6.33 if and only if $V_t \ge H_t$ for all $t$ and $V_\sigma = H_\sigma$ for some $\mathcal{F}_T$-measurable random time $\sigma$.
If the market model is complete, then every American claim $H$ is attainable. Moreover, Theorem 6.11 and Corollary 6.22 imply that the minimal initial investment needed for the purchase of a hedging strategy for $H$ is equal to the unique arbitrage-free price of $H$. In a general market model, every attainable discounted American claim $H$ satisfies our integrability condition (6.18) and has a unique arbitrage-free price which is equal to the initial investment of a hedging strategy for $H$. This follows from Theorem 5.25. In fact, the converse implication is also true.
Theorem 6.34. For a discounted American claim $H$ satisfying (6.18), the following conditions are equivalent:
(a) $H$ is attainable.
(b) $H$ admits a unique arbitrage-free price $\pi(H)$, i.e., $\Pi(H) = \{\pi(H)\}$.
(c) $\sup \Pi(H) \in \Pi(H)$.
Moreover, if $H$ is attainable, then $\pi(H)$ is equal to the initial investment of any hedging strategy for $H$.
The equivalence of (b) and (c) is an immediate consequence of Theorem 6.31. The remainder of the proof of Theorem 6.34 is postponed to Remark 7.10 because it requires the technique of superhedging, which will be introduced in Chapter 7.

6.4 Stability under pasting

In this section we define the pasting of two equivalent probability measures at a given
stopping time. This operation will play an important role in the analysis of lower
and upper Snell envelopes as developed in Section 6.5. In particular, we will prepare
for the proof of the minimax identity (6.19), which was used in the characterization
of arbitrage-free prices of an American contingent claim. Let us start with a few
preparations.
Definition 6.35. Let $\tau$ be a stopping time. The $\sigma$-algebra of events which are observable up to time $\tau$ is defined as
$$\mathcal{F}_\tau := \{A \in \mathcal{F} \mid A \cap \{\tau \le t\} \in \mathcal{F}_t \text{ for all } t\}.$$


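Exercise 6.4.1. Prove that $\mathcal{F}_\tau$ is indeed a $\sigma$-algebra. Show next that
$$\mathcal{F}_\tau = \{A \in \mathcal{F} \mid A \cap \{\tau = t\} \in \mathcal{F}_t \text{ for all } t\}$$
and conclude that $\mathcal{F}_\tau$ coincides with $\mathcal{F}_t$ if $\tau \equiv t$. Finally show that $\mathcal{F}_\sigma \subset \mathcal{F}_\tau$ when $\sigma$ is a stopping time with $\sigma(\omega) \le \tau(\omega)$ for all $\omega \in \Omega$. ◊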
The following result is an addendum to Doob's stopping theorem; see Theorem 6.15:
Proposition 6.36. For an adapted process $M$ in $L^1(Q)$ the following conditions are equivalent:
(a) $M$ is a $Q$-martingale.
(b) $E_Q[M_\tau \mid \mathcal{F}_\sigma] = M_{\tau \wedge \sigma}$ for all $\tau \in \mathcal{T}$ and all stopping times $\sigma$.
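Proof. (a) $\Rightarrow$ (b): Take a set $A \in \mathcal{F}_\sigma$ and let us write
$$E_Q[M_\tau; A] = E_Q[M_\tau; A \cap \{\tau \le \sigma\}] + E_Q[M_\tau; A \cap \{\tau > \sigma\}].$$
Condition (b) will follow if we may replace $M_\tau$ by $M_\sigma$ in the rightmost expectation. To this end, note that
$$A \cap \{\sigma = t\} \cap \{\tau > \sigma\} = A \cap \{\sigma = t\} \cap \{\tau > t\} \in \mathcal{F}_t.$$
Thus, since the stopped process $M^\tau$ is a martingale by Theorem 6.15,
$$\begin{aligned}
E_Q[M_\tau; A \cap \{\tau > \sigma\}] &= \sum_{t=0}^{T} E_Q[M^\tau_T; A \cap \{\sigma = t\} \cap \{\tau > \sigma\}] \\
&= \sum_{t=0}^{T} E_Q[M^\tau_t; A \cap \{\sigma = t\} \cap \{\tau > \sigma\}] \\
&= E_Q[M_\sigma; A \cap \{\tau > \sigma\}].
\end{aligned}$$
(b) $\Rightarrow$ (a): This follows by taking $\tau \equiv t$ and $\sigma \equiv s \le t$.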
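Exercise 6.4.2. Let $Z$ be the density process of a probability measure $\tilde Q$ that is absolutely continuous with respect to $Q$; see Exercise 5.2.3. Show that for a stopping time $\sigma$, we have $\tilde Q \ll Q$ on $\mathcal{F}_\sigma$ with density given by
$$\frac{d\tilde Q}{dQ}\bigg|_{\mathcal{F}_\sigma} = E_Q[Z_T \mid \mathcal{F}_\sigma] = Z_{T \wedge \sigma}.$$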


Exercise 6.4.3. Show that for a stopping time $\tau$, a random variable $Y \in L^1(\Omega, \mathcal{F}, Q)$, and $t \in \{0, \dots, T\}$,
$$E_Q[Y \mid \mathcal{F}_\tau] = E_Q[Y \mid \mathcal{F}_t] \quad Q\text{-a.s. on } \{\tau = t\}. \tag{6.21}$$
◊

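We next state the following extension of Theorem 6.18. It provides the solution to the optimal stopping problem posed at any stopping time $\tau \le T$.
Proposition 6.37. Let $H$ be an adapted process in $L^1(\Omega, \mathcal{F}, Q)$, and define for $\tau \in \mathcal{T}$
$$\mathcal{T}_\tau := \{\sigma \in \mathcal{T} \mid \sigma \ge \tau\}.$$
Then the Snell envelope $U^Q$ of $H$ satisfies $Q$-a.s.
$$U^Q_\tau = \operatorname*{ess\,sup}_{\sigma \in \mathcal{T}_\tau} E_Q[H_\sigma \mid \mathcal{F}_\tau],$$
and the essential supremum is attained for
$$\sigma^{(\tau)}_{\min} := \min\{t \ge \tau \mid H_t = U^Q_t\}.$$
Exercise 6.4.4. Prove Proposition 6.37 by using the identity (6.21).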

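Definition 6.38. Let $Q_1$ and $Q_2$ be two equivalent probability measures and take $\sigma \in \mathcal{T}$. The probability measure
$$\tilde Q[A] := E_{Q_1}\big[Q_2[A \mid \mathcal{F}_\sigma]\big], \quad A \in \mathcal{F}_T,$$
is called the pasting of $Q_1$ and $Q_2$ in $\sigma$.
The monotone convergence theorem for conditional expectations guarantees that $\tilde Q$ is indeed a probability measure and that
$$E_{\tilde Q}[Y] = E_{Q_1}\big[E_{Q_2}[Y \mid \mathcal{F}_\sigma]\big]$$
for all $\mathcal{F}_T$-measurable $Y \ge 0$. Note that $\tilde Q$ coincides with $Q_1$ on $\mathcal{F}_\sigma$, i.e.,
$$E_{\tilde Q}[Y] = E_{Q_1}[Y] \quad \text{for all } \mathcal{F}_\sigma\text{-measurable } Y \ge 0.$$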

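Lemma 6.39. For $Q_1 \approx Q_2$, their pasting in $\sigma \in \mathcal{T}$ is equivalent to $Q_1$ and satisfies
$$\frac{d\tilde Q}{dQ_1} = \frac{Z_T}{Z_\sigma},$$
where $Z$ is the density process of $Q_2$ with respect to $Q_1$.
Proof. For $Y \ge 0$,
$$\begin{aligned}
E_{\tilde Q}[Y] &= E_{Q_1}\big[E_{Q_2}[Y \mid \mathcal{F}_\sigma]\big] \\
&= E_{Q_1}\Big[\frac{1}{Z_\sigma} E_{Q_1}[Y Z_T \mid \mathcal{F}_\sigma]\Big] \\
&= E_{Q_1}\Big[\frac{Z_T}{Z_\sigma} Y\Big],
\end{aligned}$$
where we have used the martingale property of $Z$ and the fact that $Z_\sigma > 0$ $Q_1$-almost surely. The equivalence of $\tilde Q$ and $Q_1$ follows from $Z_T > 0$ $Q_1$-almost surely.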
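The density formula of Lemma 6.39 can be verified by hand on a small example. The sketch below assumes a two-period binary tree with $\mathcal{F}_1$ generated by the first move, and a stopping time $\sigma$ that takes only the values 1 and 2; the helper names (`path_measure`, `pasting`, `density`) and the numerical probabilities are illustrative and do not come from the text.

```python
from itertools import product

PATHS = list(product("ud", repeat=2))  # Omega for T = 2: pairs of up/down moves

def path_measure(q_root, q_up, q_down):
    """Build a measure on PATHS from the up-probabilities at the three nodes."""
    second = {"u": q_up, "d": q_down}
    return {
        (a, b): (q_root if a == "u" else 1 - q_root)
        * (second[a] if b == "u" else 1 - second[a])
        for a, b in PATHS
    }

def first_move_prob(Q, a):
    """Q[first move = a], i.e. Q restricted to F_1."""
    return sum(p for (x, _), p in Q.items() if x == a)

def pasting(Q1, Q2, sigma):
    """Pasting of Q1 and Q2 in sigma, where sigma(first move) is 1 or 2."""
    pasted = {}
    for path in PATHS:
        a = path[0]
        if sigma(a) == 1:
            # Q1 governs the first move, Q2 the conditional law of the second one.
            pasted[path] = first_move_prob(Q1, a) * Q2[path] / first_move_prob(Q2, a)
        else:
            # sigma = T on this branch, so Q2[ . | F_sigma] is an indicator and Q1 is kept.
            pasted[path] = Q1[path]
    return pasted

def density(Q2, Q1, path, t):
    """Z_t = dQ2/dQ1 on F_t along the given path, for t = 1 or 2."""
    if t == 1:
        return first_move_prob(Q2, path[0]) / first_move_prob(Q1, path[0])
    return Q2[path] / Q1[path]

Q1 = path_measure(0.5, 0.5, 0.5)
Q2 = path_measure(0.3, 0.8, 0.4)
sigma = lambda a: 1 if a == "u" else 2  # stop after an up-move, otherwise wait until T

Q_tilde = pasting(Q1, Q2, sigma)
for path in PATHS:
    lhs = Q_tilde[path] / Q1[path]  # dQ~/dQ1
    rhs = density(Q2, Q1, path, 2) / density(Q2, Q1, path, sigma(path[0]))  # Z_T / Z_sigma
    print(path, round(lhs, 6), round(rhs, 6))
```

Both columns agree on every path, and the pasted measure has the same first-move probabilities as `Q1`, which reflects the fact that $\tilde Q$ coincides with $Q_1$ on $\mathcal{F}_\sigma$.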
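Lemma 6.40. For $Q_1 \approx Q_2$, let $\tilde Q$ be their pasting in $\sigma \in \mathcal{T}$. Then, for all stopping times $\tau$ and $\mathcal{F}_T$-measurable $Y \ge 0$,
$$E_{\tilde Q}[Y \mid \mathcal{F}_\tau] = E_{Q_1}\big[E_{Q_2}[Y \mid \mathcal{F}_{\sigma \vee \tau}] \mid \mathcal{F}_\tau\big].$$
Proof. If $\varphi \ge 0$ is $\mathcal{F}_\tau$-measurable, then $\varphi I_{\{\tau \le \sigma\}}$ is $\mathcal{F}_\tau \cap \mathcal{F}_\sigma$-measurable. Hence,
$$\begin{aligned}
E_{\tilde Q}[Y \varphi; \tau \le \sigma] &= E_{Q_1}\big[E_{Q_2}[Y \mid \mathcal{F}_\sigma] \varphi; \tau \le \sigma\big] \\
&= E_{Q_1}\big[E_{Q_1}[E_{Q_2}[Y \mid \mathcal{F}_\sigma] \mid \mathcal{F}_\tau] \varphi; \tau \le \sigma\big] \\
&= E_{\tilde Q}\big[E_{Q_1}[E_{Q_2}[Y \mid \mathcal{F}_\sigma] \mid \mathcal{F}_\tau] \varphi; \tau \le \sigma\big],
\end{aligned}$$
where we have used the fact that $\tilde Q$ coincides with $Q_1$ on $\mathcal{F}_\sigma$. On the other hand,
$$\begin{aligned}
E_{\tilde Q}[Y \varphi; \tau > \sigma] &= E_{Q_1}\big[E_{Q_2}[E_{Q_2}[Y \mid \mathcal{F}_\tau] \varphi \mid \mathcal{F}_\sigma]; \tau > \sigma\big] \\
&= E_{\tilde Q}\big[E_{Q_2}[Y \mid \mathcal{F}_\tau] \varphi; \tau > \sigma\big].
\end{aligned}$$
It follows that
$$E_{\tilde Q}[Y \mid \mathcal{F}_\tau] = E_{Q_1}\big[E_{Q_2}[Y \mid \mathcal{F}_\sigma] \mid \mathcal{F}_\tau\big] I_{\{\tau \le \sigma\}} + E_{Q_2}[Y \mid \mathcal{F}_\tau] I_{\{\tau > \sigma\}},$$
and this coincides with the right-hand side of the asserted identity.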
Definition 6.41. A set $\mathcal{Q}$ of equivalent probability measures on $(\Omega, \mathcal{F})$ is called stable if, for any $Q_1, Q_2 \in \mathcal{Q}$ and $\sigma \in \mathcal{T}$, also their pasting in $\sigma$ is contained in $\mathcal{Q}$.
The condition of stability in the preceding definition is sometimes also called fork
convexity, m-stability, or stability under pasting. For the purposes of this book, the
most important example of a stable set is the class P of all equivalent martingale
measures, but in Section 6.5 we will also discuss the connection between stable sets
and dynamic risk measures.
Proposition 6.42. P is stable.


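Proof. Take $P_1, P_2 \in \mathcal{P}$ and denote by $\tilde P$ their pasting in a given $\sigma \in \mathcal{T}$. Doob's stopping theorem in the form of Proposition 6.36 and Lemma 6.40 applied with $Y := X_t^i \ge 0$ and $\tau \equiv s$ yield that for $s \le t$
$$\tilde E[X_t \mid \mathcal{F}_s] = E_1\big[E_2[X_t \mid \mathcal{F}_{\sigma \vee s}] \mid \mathcal{F}_s\big] = E_1[X_{t \wedge (\sigma \vee s)} \mid \mathcal{F}_s] = X_s.$$
It follows in particular that each component $X_t^i$ is in $L^1(\tilde P)$ since $\tilde E[X_t^i] = X_0^i < \infty$, concluding the proof of $\tilde P \in \mathcal{P}$.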
We conclude this section by an alternative characterization of stable sets. It will be used in Section 11.2. Suppose that $\sigma \in \mathcal{T}$ takes at most one value $t \in \{0, \dots, T\}$ that is different from $T$. Then there exists a set $B \in \mathcal{F}_t$ such that $\sigma = t \cdot I_B + T \cdot I_{B^c}$. It follows that the pasting $\tilde Q$ of two equivalent probability measures $Q_1$ and $Q_2$ in $\sigma$ is given by
$$\tilde Q[A] := E_{Q_1}\big[Q_2[A \mid \mathcal{F}_t] \cdot I_B + I_{A \cap B^c}\big], \quad A \in \mathcal{F}_T. \tag{6.22}$$
This observation can be used to give the following characterization of stable sets.
Proposition 6.43. A set $\mathcal{Q}$ of equivalent probability measures is stable if and only if for any $t \in \{0, \dots, T\}$ and $B \in \mathcal{F}_t$ the probability measure $\tilde Q$ defined in (6.22) belongs again to $\mathcal{Q}$.
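Proof. We have already seen that $\tilde Q \in \mathcal{Q}$ when $\mathcal{Q}$ is stable. For the proof of the converse implication, let $\sigma \in \mathcal{T}$ be a stopping time and take $Q_1, Q_2 \in \mathcal{Q}$. We define recursively $\tilde Q_T := Q_1$ and
$$\tilde Q_{t-1}[A] := E_{\tilde Q_t}\big[Q_2[A \mid \mathcal{F}_{t-1}] \cdot I_{\{\sigma = t-1\}} + I_{A \cap \{\sigma \ne t-1\}}\big]$$
for $t = T, \dots, 1$. Then $\tilde Q_0 \in \mathcal{Q}$ by assumption. We claim that $\tilde Q_0$ coincides with the pasting of $Q_1$ and $Q_2$ in $\sigma$, and this will prove the assertion. To verify our claim, note that the densities of $\tilde Q_t$ with respect to $Q_1$ satisfy the recursion
$$\frac{d\tilde Q_{t-1}}{dQ_1} = \frac{d\tilde Q_t}{dQ_1} \Big(\frac{Z_T}{Z_{t-1}} I_{\{\sigma = t-1\}} + I_{\{\sigma \ne t-1\}}\Big),$$
where $(Z_t)$ is the density process of $Q_2$ with respect to $Q_1$. But this implies that
$$\frac{d\tilde Q_{t-1}}{dQ_1} = \frac{Z_T}{Z_\sigma} I_{\{\sigma \ge t-1\}} + I_{\{\sigma < t-1\}},$$
and so our claim $\tilde Q_0 = \tilde Q$ follows from Lemma 6.39.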


6.5 Lower and upper Snell envelopes

Our main goal in this section is to provide a proof of the minimax identity (6.19),
that was used in the characterization of the set of arbitrage-free prices of an American contingent claim. The techniques and results which we develop here will help
to characterize the time-consistency of dynamic coherent risk measures and they will
also be needed in Chapter 7. Moreover, they can be interpreted in terms of an optimal stopping problem for general utility functionals which appear in a robust Savage
representation of preferences on payoff profiles. Let us now fix a set Q of equivalent
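probability measures and an adapted process $H$ such that
$$H_t \in L^1(Q) \quad \text{for all } t \text{ and each } Q \in \mathcal{Q}.$$
Recall that this condition implies
$$\inf_{Q \in \mathcal{Q}} \sup_{\tau \in \mathcal{T}} E_Q[H_\tau] = \inf_{Q \in \mathcal{Q}} U_0^Q < \infty,$$
where $U^Q$ denotes the Snell envelope of $H$ with respect to $Q \in \mathcal{Q}$. Let us also assume that
$$\mathcal{Q} \text{ is stable.}$$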
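Definition 6.44. The lower Snell envelope of $H$ is defined as
$$U_t^{\downarrow} := \operatorname*{ess\,inf}_{Q \in \mathcal{Q}} U_t^Q = \operatorname*{ess\,inf}_{Q \in \mathcal{Q}} \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E_Q[H_\tau \mid \mathcal{F}_t], \quad t = 0, \dots, T.$$
The upper Snell envelope of $H$ is defined as
$$U_t^{\uparrow} := \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} U_t^Q = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E_Q[H_\tau \mid \mathcal{F}_t], \quad t = 0, \dots, T.$$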

We will first study the lower Snell envelope. The following minimax theorem states that the essential infimum and the essential supremum occurring in the definition of $U^{\downarrow}$ may be interchanged if $\mathcal{Q}$ is stable. Applied at $t = 0$ and combined with Proposition 6.42, this gives the identity (6.19), which was used in our characterization of the arbitrage-free prices of $H$.
Theorem 6.45. The lower Snell envelope of $H$ satisfies
$$U_t^{\downarrow} = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} \operatorname*{ess\,inf}_{Q \in \mathcal{Q}} E_Q[H_\tau \mid \mathcal{F}_t] \quad \text{for each } t.$$
In particular,
$$U_0^{\downarrow} = \inf_{Q \in \mathcal{Q}} \sup_{\tau \in \mathcal{T}} E_Q[H_\tau] = \sup_{\tau \in \mathcal{T}} \inf_{Q \in \mathcal{Q}} E_Q[H_\tau]. \tag{6.23}$$


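The inequality $\ge$ in (6.23) is obvious. Its converse is an immediate consequence of the next theorem, which solves the following optimal stopping problem that is formulated with respect to the nonadditive expectation operator $\inf_{Q \in \mathcal{Q}} E_Q[\,\cdot\,]$:
$$\text{maximize } \inf_{Q \in \mathcal{Q}} E_Q[H_\tau] \text{ among all } \tau \in \mathcal{T}.$$
Theorem 6.46. Define a stopping time $\tau_t \in \mathcal{T}_t$ by
$$\tau_t := \min\{u \ge t \mid U_u^{\downarrow} = H_u\}.$$
Then, $P$-a.s.,
$$U_t^{\downarrow} = \operatorname*{ess\,inf}_{Q \in \mathcal{Q}} E_Q[H_{\tau_t} \mid \mathcal{F}_t]. \tag{6.24}$$
In particular,
$$\sup_{\tau \in \mathcal{T}} \inf_{Q \in \mathcal{Q}} E_Q[H_\tau] = \inf_{Q \in \mathcal{Q}} E_Q[H_{\tau_0}] = U_0^{\downarrow}.$$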
For the proof of Theorem 6.46, we need some preparations.


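Lemma 6.47. Suppose that we are given $Q_1, Q_2 \in \mathcal{Q}$, a stopping time $\tau \in \mathcal{T}$, and a set $B \in \mathcal{F}_\tau$. Let $\tilde Q \in \mathcal{Q}$ be the pasting of $Q_1$ and $Q_2$ in the stopping time
$$\sigma := \tau I_B + T I_{B^c}.$$
Then the Snell envelopes associated with these three measures are related as follows:
$$U_\tau^{\tilde Q} = U_\tau^{Q_2} I_B + U_\tau^{Q_1} I_{B^c} \quad P\text{-a.s.} \tag{6.25}$$
Proof. With Proposition 6.37 and its notation, we have
$$U_\tau^{\tilde Q} = \operatorname*{ess\,sup}_{\rho \in \mathcal{T}_\tau} E_{\tilde Q}[H_\rho \mid \mathcal{F}_\tau].$$
To compute the conditional expectation on the right, note first that
$$E_{Q_2}[H_\rho \mid \mathcal{F}_{\sigma \vee \tau}] = E_{Q_2}[H_\rho \mid \mathcal{F}_\tau] I_B + H_\rho I_{B^c}.$$
Hence, Lemma 6.40 yields that
$$E_{\tilde Q}[H_\rho \mid \mathcal{F}_\tau] = E_{Q_2}[H_\rho \mid \mathcal{F}_\tau] I_B + E_{Q_1}[H_\rho \mid \mathcal{F}_\tau] I_{B^c}.$$
Moreover, whenever $\rho_1, \rho_2 \in \mathcal{T}_\tau$, then
$$\tilde\rho := \rho_1 I_B + \rho_2 I_{B^c}$$
is also a stopping time in $\mathcal{T}_\tau$. Thus,
$$U_\tau^{\tilde Q} = \operatorname*{ess\,sup}_{\rho \in \mathcal{T}_\tau} E_{Q_2}[H_\rho \mid \mathcal{F}_\tau] I_B + \operatorname*{ess\,sup}_{\rho \in \mathcal{T}_\tau} E_{Q_1}[H_\rho \mid \mathcal{F}_\tau] I_{B^c},$$
and (6.25) follows.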


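Lemma 6.48. For any $Q \in \mathcal{Q}$ and $\tau \in \mathcal{T}$ there exist $Q_k \in \mathcal{Q}$ such that $Q_k = Q$ on $\mathcal{F}_\tau$ and
$$U_\tau^{Q_k} \searrow \operatorname*{ess\,inf}_{\hat Q \in \mathcal{Q}} U_\tau^{\hat Q} = U_\tau^{\downarrow}.$$
Similarly, there exist $Q^k \in \mathcal{Q}$ such that $Q^k = Q$ on $\mathcal{F}_\tau$ and
$$U_\tau^{Q^k} \nearrow \operatorname*{ess\,sup}_{\hat Q \in \mathcal{Q}} U_\tau^{\hat Q} =: U_\tau^{\uparrow}.$$
Proof. For $Q_1, Q_2 \in \mathcal{Q}$, $B := \{U_\tau^{Q_1} > U_\tau^{Q_2}\}$, take $\tilde Q \in \mathcal{Q}$ as in Lemma 6.47. Then
$$U_\tau^{\tilde Q} = U_\tau^{Q_1} \cdot I_{B^c} + U_\tau^{Q_2} \cdot I_B = U_\tau^{Q_1} \wedge U_\tau^{Q_2}. \tag{6.26}$$
Moreover, if $Q_1 = Q$ on $\mathcal{F}_\tau$ then also $\tilde Q = Q$ on $\mathcal{F}_\tau$. Hence, the set
$$\Phi := \{U_\tau^{\hat Q} \mid \hat Q \in \mathcal{Q} \text{ and } \hat Q = Q \text{ on } \mathcal{F}_\tau\}$$
is such that $U_\tau^{\downarrow} = \operatorname{ess\,inf} \Phi$. Moreover, (6.26) implies that $\Phi$ is directed downwards, and the second part of Theorem A.33 states the existence of the desired sequence $(Q_k) \subset \mathcal{Q}$. The proof for the essential supremum is analogous.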
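Proof of Theorem 6.46. To prove (6.24), observe first that $U_t^Q \ge E_Q[H_{\tau_t} \mid \mathcal{F}_t]$ for each $Q \in \mathcal{Q}$, so that $\ge$ holds in (6.24). For the proof of the converse inequality, note that
$$\tau_t \le \min\{u \ge t \mid U_u^Q = H_u\} =: \tau_t^Q \quad \text{for } Q \in \mathcal{Q}.$$
It was shown in Theorem 6.18 that $\tau_t^Q$ is the minimal optimal stopping time after time $t$ and with respect to $Q$. It was also shown in the proof of Theorem 6.18 that the stopped process $(U^Q)^{\tau_t^Q}$ is a $Q$-martingale from time $t$ on. In particular,
$$U_t^Q = E_Q[U_{\tau_t}^Q \mid \mathcal{F}_t] \quad \text{for all } Q \in \mathcal{Q}. \tag{6.27}$$
Let us now fix some $Q \in \mathcal{Q}$. Lemma 6.48 yields $Q_k \in \mathcal{Q}$ with $Q_k = Q$ on $\mathcal{F}_{\tau_t}$ such that $U_{\tau_t}^{Q_k}$ decreases to $U_{\tau_t}^{\downarrow}$. We obtain
$$\begin{aligned}
E_Q[H_{\tau_t} \mid \mathcal{F}_t] &= E_Q[U_{\tau_t}^{\downarrow} \mid \mathcal{F}_t] = E_Q\big[\lim_{k \uparrow \infty} U_{\tau_t}^{Q_k} \mid \mathcal{F}_t\big] \\
&= \lim_{k \uparrow \infty} E_Q[U_{\tau_t}^{Q_k} \mid \mathcal{F}_t] = \lim_{k \uparrow \infty} E_{Q_k}[U_{\tau_t}^{Q_k} \mid \mathcal{F}_t] \\
&= \lim_{k \uparrow \infty} U_t^{Q_k} \ge U_t^{\downarrow}.
\end{aligned}$$
Here we have used that $H_{\tau_t} \le U_{\tau_t}^{Q_k} \le U_{\tau_t}^{Q_1}$ and $E_Q[|U_{\tau_t}^{Q_1}|] = E_{Q_1}[|U_{\tau_t}^{Q_1}|] < \infty$ together with dominated convergence in the third step, the fact that $Q_k = Q$ on $\mathcal{F}_{\tau_t} \supset \mathcal{F}_t$ in the fourth, and (6.27) in the fifth identity.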
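The minimax identity (6.23) and the optimal stopping rule of Theorem 6.46 can be checked by brute force on a toy model. The sketch below assumes a two-period binary tree and takes for $\mathcal{Q}$ all measures whose up-probability at each of the three nodes is chosen freely from a fixed two-point set; such a node-wise ("rectangular") family is closed under pasting. The payoff values and the set `CHOICES` are made up for illustration.

```python
from itertools import product

# Two-period binary tree; H is given at the root, the time-1 nodes and the four paths.
H0 = 1.0
H1 = {"u": 1.0, "d": 1.5}
H2 = {("u", "u"): 3.0, ("u", "d"): 0.0, ("d", "u"): 2.0, ("d", "d"): 0.5}

CHOICES = (0.4, 0.7)  # admissible up-probabilities at every node (assumption)
MEASURES = list(product(CHOICES, repeat=3))  # (q_root, q_u, q_d)

# Stopping times: stop at time 0 (None), or decide at each time-1 node
# whether to stop there (1) or to wait until time 2 (2).
STOPPING_TIMES = [None] + list(product((1, 2), repeat=2))

def expected_payoff(measure, tau):
    """E_Q[H_tau] for Q given by node-wise up-probabilities."""
    q_root, q_u, q_d = measure
    if tau is None:
        return H0

    def node_value(node, q, decision):
        if decision == 1:
            return H1[node]
        return q * H2[(node, "u")] + (1 - q) * H2[(node, "d")]

    return q_root * node_value("u", q_u, tau[0]) + (1 - q_root) * node_value("d", q_d, tau[1])

def worst_case(tau):
    """inf over Q of E_Q[H_tau]."""
    return min(expected_payoff(Q, tau) for Q in MEASURES)

inf_sup = min(max(expected_payoff(Q, tau) for tau in STOPPING_TIMES) for Q in MEASURES)
sup_inf = max(worst_case(tau) for tau in STOPPING_TIMES)
best_tau = max(STOPPING_TIMES, key=worst_case)

print(inf_sup, sup_inf, best_tau)
```

In this example both sides of (6.23) coincide, and the maximizing rule waits at the up-node and stops at the down-node, which is the first time at which the lower Snell envelope meets the payoff. Replacing `MEASURES` by a family that is not closed under pasting is a natural experiment, since the theorem then no longer guarantees equality.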


Remark 6.49. Suppose the buyer of an American option uses a utility functional of the form
$$\inf_{Q \in \mathcal{Q}} E_Q[u(Z)],$$
where $\mathcal{Q}$ is a set of probability measures and $u$ is a measurable function. This may be viewed as a robust Savage representation of a preference relation on discounted asset payoffs; see Section 2.5. Thus, the aim of the buyer is to maximize the utility
$$\inf_{Q \in \mathcal{Q}} E_Q[u(H_\tau)]$$
of the discounted payoff $H_\tau$ among all stopping times $\tau \in \mathcal{T}$. This generalized utility maximization problem can be solved with the results developed in this section, provided that the set $\mathcal{Q}$ is a stable set of equivalent probability measures. Indeed, assume
$$\tilde H_t := u(H_t) \in L^1(Q) \quad \text{for all } t \text{ and each } Q \in \mathcal{Q},$$
and let $U^Q$ be the Snell envelope of $\tilde H$ with respect to $Q \in \mathcal{Q}$. Theorem 6.46 states that the generalized optimal stopping problem is solved by the stopping time
$$\tau^* := \min\big\{t \ge 0 \mid \operatorname*{ess\,inf}_{Q \in \mathcal{Q}} U_t^Q = \tilde H_t\big\},$$
i.e.,
$$\inf_{Q \in \mathcal{Q}} \sup_{\tau \in \mathcal{T}} E_Q[u(H_\tau)] = U_0^{\downarrow} = \inf_{Q \in \mathcal{Q}} E_Q[u(H_{\tau^*})].$$

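Let us now turn to the analysis of the upper Snell envelope
$$U_t^{\uparrow} := \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} U_t^Q = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E_Q[H_\tau \mid \mathcal{F}_t], \quad t = 0, \dots, T.$$
In order to simplify the presentation, we will assume from now on that
$$\sup_{Q \in \mathcal{Q}} E_Q[|H_t|] < \infty \quad \text{for all } t.$$
This condition implies that
$$U_0^{\uparrow} = \sup_{Q \in \mathcal{Q}} U_0^Q \le \sup_{\tau \in \mathcal{T}} \sup_{Q \in \mathcal{Q}} E_Q[|H_\tau|] < \infty.$$
Our main result on upper Snell envelopes states that, for stable sets $\mathcal{Q}$, the upper Snell envelope $U^{\uparrow}$ satisfies a recursive scheme that is similar to the one for ordinary Snell envelopes. In contrast to (6.4), however, it involves the nonadditive conditional expectation operators $\operatorname{ess\,sup}_Q E_Q[\,\cdot \mid \mathcal{F}_t]$.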


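Theorem 6.50. $U^{\uparrow}$ satisfies the following recursive scheme:
$$U_T^{\uparrow} = H_T \quad \text{and} \quad U_t^{\uparrow} = H_t \vee \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[U_{t+1}^{\uparrow} \mid \mathcal{F}_t], \quad t = T-1, \dots, 0. \tag{6.28}$$
Proof. The definition of the Snell envelope $U^Q$ implies that
$$U_t^{\uparrow} = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} U_t^Q = H_t \vee \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[U_{t+1}^Q \mid \mathcal{F}_t]. \tag{6.29}$$
Next, we fix $Q \in \mathcal{Q}$ and denote by $\mathcal{Q}_{t+1}(Q)$ the set of all $\hat Q \in \mathcal{Q}$ which coincide with $Q$ on $\mathcal{F}_{t+1}$. According to Lemma 6.48, there are $Q_k \in \mathcal{Q}_{t+1}(Q)$ such that $U_{t+1}^{Q_k} \nearrow U_{t+1}^{\uparrow}$. The fact that $E_Q[|U_{t+1}^{Q_1}|] = E_{Q_1}[|U_{t+1}^{Q_1}|] < \infty$ combined with monotone convergence for conditional expectations shows that
$$\begin{aligned}
\operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[U_{t+1}^{\uparrow} \mid \mathcal{F}_t] &\ge \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[U_{t+1}^Q \mid \mathcal{F}_t] \\
&= \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} \operatorname*{ess\,sup}_{\hat Q \in \mathcal{Q}_{t+1}(Q)} E_Q[U_{t+1}^{\hat Q} \mid \mathcal{F}_t] \\
&\ge \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} \liminf_{k \uparrow \infty} E_Q[U_{t+1}^{Q_k} \mid \mathcal{F}_t] \\
&= \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[U_{t+1}^{\uparrow} \mid \mathcal{F}_t].
\end{aligned} \tag{6.30}$$
In particular, all inequalities are in fact identities. Together with (6.29) we obtain the recursive scheme for $U^{\uparrow}$.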
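The recursion (6.28) translates directly into a backward induction. The following sketch assumes a recombining binomial tree and a set $\mathcal{Q}$ consisting of all measures whose one-step conditional up-probabilities lie in an interval $[q_{lo}, q_{hi}] \subset (0, 1)$; for such a node-wise family the essential supremum of the one-step conditional expectation is attained at an endpoint of the interval, because the expectation is linear in the up-probability. The payoff table and the function name `upper_snell_envelope` are illustrative.

```python
def upper_snell_envelope(payoff, q_lo, q_hi):
    """Backward recursion U_t = max(H_t, sup_q E_q[U_{t+1}]) on a binomial tree.

    payoff[t][j] is H_t at the node with j up-moves; the result has the same shape.
    """
    T = len(payoff) - 1
    U = [list(level) for level in payoff]  # terminal condition U_T = H_T
    for t in range(T - 1, -1, -1):
        for j in range(t + 1):
            up, down = U[t + 1][j + 1], U[t + 1][j]
            # The one-step expectation is linear in q, so the sup over
            # [q_lo, q_hi] is attained at one of the two endpoints.
            continuation = max(q * up + (1 - q) * down for q in (q_lo, q_hi))
            U[t][j] = max(payoff[t][j], continuation)
    return U

# Put-like payoff: larger when fewer up-moves have occurred (made-up numbers).
payoff = [
    [0.5],
    [2.0, 0.0],
    [4.0, 1.0, 0.0],
]
U = upper_snell_envelope(payoff, q_lo=0.25, q_hi=0.5)
for t, level in enumerate(U):
    print(t, level)
```

By construction the resulting process dominates the payoff and is a supermartingale under every admissible one-step probability, which is the property exploited in Theorem 7.2.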
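The following result shows that the nonadditive conditional expectation operators $\operatorname{ess\,sup}_Q E_Q[\,\cdot \mid \mathcal{F}_t]$ associated with a stable set $\mathcal{Q}$ enjoy a consistency property that is similar to the martingale property for ordinary conditional expectations.
Theorem 6.51. Let $\mathcal{Q}$ be a set of equivalent probability measures and
$$V_t^{\uparrow} := \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[H \mid \mathcal{F}_t], \quad t = 0, \dots, T,$$
for some $\mathcal{F}_T$-measurable $H \ge 0$ such that $V_0^{\uparrow} < \infty$. If $\mathcal{Q}$ is stable then
$$V_\sigma^{\uparrow} = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[V_\tau^{\uparrow} \mid \mathcal{F}_\sigma]$$
for $\sigma, \tau \in \mathcal{T}$ with $\sigma \le \tau$.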


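Remark 6.52. Note that, for $H$ as in the theorem and $\sigma \in \mathcal{T}$,
$$\begin{aligned}
V_\sigma^{\uparrow} &= \sum_{t=0}^{T} \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[H \mid \mathcal{F}_t] \, I_{\{\sigma = t\}} \\
&= \sum_{t=0}^{T} \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[H \mid \mathcal{F}_\sigma] \, I_{\{\sigma = t\}} \\
&= \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[H \mid \mathcal{F}_\sigma],
\end{aligned}$$
where we have used (6.21) in the second identity.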


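Proof of Theorem 6.51. By Remark 6.52,
$$V_\sigma^{\uparrow} = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[H \mid \mathcal{F}_\sigma] = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q\big[E_Q[H \mid \mathcal{F}_\tau] \mid \mathcal{F}_\sigma\big].$$
The proof that the right-hand side is equal to $\operatorname{ess\,sup}_{Q \in \mathcal{Q}} E_Q[V_\tau^{\uparrow} \mid \mathcal{F}_\sigma]$ is done by first noting that $V^{\uparrow}$ is equal to the upper Snell envelope of the process $H_t$ given by $H_T = H$ and $H_t = 0$ for $t < T$. Then the same argument as in (6.30) applies. All one has to do is to replace $t + 1$ by $\tau$.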
Remark 6.53. Let us conclude this section by pointing out the connection between stability under pasting and the time-consistency of dynamic coherent risk measures. Let
$$\rho(Y) := \sup_{Q \in \mathcal{Q}} E_Q[-Y], \quad Y \in L^\infty(P),$$
be a coherent risk measure on $L^\infty(P)$ defined in terms of a set $\mathcal{Q}$ of probability measures equivalent to $P$. In the context of a dynamic financial market model, it is natural to update the initial risk assessment at later times $t > 0$. If one continues to use $\mathcal{Q}$ as a basis to compute the risk but takes into account the available information, one is led to consider the conditional risk measures
$$\rho_t(Y) = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[-Y \mid \mathcal{F}_t], \quad t = 0, \dots, T. \tag{6.31}$$
The sequence $\rho_0, \dots, \rho_T$ can be regarded as a dynamic coherent risk measure. Such a dynamic risk measure is called time-consistent or dynamically consistent if
$$\rho_s(-\rho_t(Y)) = \rho_s(Y) \quad \text{for } 0 \le s \le t \le T. \tag{6.32}$$
When the set $\mathcal{Q}$ in (6.31) is a stable set of equivalent probability measures, then Theorem 6.51 implies immediately the time consistency (6.32). The following converse of this statement, and hence a converse of Theorem 6.51, will be given in Theorem 11.22: if $(\rho_t)$ is a dynamically consistent sequence of conditional coherent risk measures satisfying certain regularity assumptions, then there exists a stable set $\mathcal{Q}$ of equivalent probability measures such that (6.31) holds. An extension of dynamic consistency to dynamic convex risk measures will be given in Section 11.2.
Note that Theorem 6.51 shows that in (6.32) the deterministic times $s$ and $t$ can even be replaced by stopping times when $\mathcal{Q}$ is stable.
◊
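The step from Theorem 6.51 to the time consistency of the conditional risk measures $\rho_t(Y) = \operatorname{ess\,sup}_{Q \in \mathcal{Q}} E_Q[-Y \mid \mathcal{F}_t]$ can be spelled out as follows. This is only a sketch: the constant $c := \|Y\|_\infty$ and the auxiliary claim $H := c - Y$ are introduced here for illustration, so that the non-negativity assumption of the theorem is met.

```latex
% Assumptions: Y \in L^\infty(P), c := \|Y\|_\infty, H := c - Y \ge 0, \mathcal{Q} stable.
\begin{align*}
V_t^{\uparrow}
  &= \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[\, c - Y \mid \mathcal{F}_t \,]
   = c + \rho_t(Y), \qquad V_0^{\uparrow} \le 2c < \infty, \\
c + \rho_s(Y)
  &= V_s^{\uparrow}
   = \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[\, V_t^{\uparrow} \mid \mathcal{F}_s \,]
   && \text{(Theorem 6.51 with } \sigma \equiv s,\ \tau \equiv t\text{)} \\
  &= c + \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[\, \rho_t(Y) \mid \mathcal{F}_s \,]
   = c + \rho_s\bigl(-\rho_t(Y)\bigr).
\end{align*}
```

Subtracting $c$ on both sides gives $\rho_s(-\rho_t(Y)) = \rho_s(Y)$ for $0 \le s \le t \le T$.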

Chapter 7

Superhedging

The idea of superhedging is to find a self-financing trading strategy with minimal initial investment which covers any possible future obligation resulting from the sale of
a contingent claim. If the contingent claim is not attainable, the proof of the existence
of such a superhedging strategy requires new techniques, and in particular a new
uniform version of the Doob decomposition. We will develop this theory for general
American contingent claims. In doing so, we will also obtain new results for European contingent claims. In the first three sections of this chapter, we assume that our
market model is arbitrage-free or, equivalently, that the set of equivalent martingale measures satisfies
$$\mathcal{P} \ne \emptyset.$$
In the final Section 7.4, we discuss liquid options in a setting where no probabilistic
model is fixed a priori. Such options may be used for the construction of specific
martingale measures, and also for the purpose of hedging illiquid exotic derivatives.

7.1 $\mathcal{P}$-supermartingales

In this section, $H$ denotes a discounted American claim with
$$\sup_{P^* \in \mathcal{P}} E^*[H_t] < \infty \quad \text{for all } t. \tag{7.1}$$
Our aim in this chapter is to find the minimal amount of capital $U_t$ that will be needed at time $t$ in order to purchase a self-financing trading strategy whose value process satisfies $V_u \ge H_u$ for all $u \ge t$. In analogy to our derivation of the recursive scheme (6.4), we will now heuristically derive a formula for $U_t$. At time $T$, the minimal amount needed is clearly given by
$$U_T = H_T.$$
At time $T - 1$, a first requirement is to have $U_{T-1} \ge H_{T-1}$. Moreover, the amount $U_{T-1}$ must suffice to purchase an $\mathcal{F}_{T-1}$-measurable portfolio $\bar\xi_T$ such that $\bar\xi_T \cdot \bar X_T \ge H_T$ almost surely. An informal application of Theorem 1.32, conditional on $\mathcal{F}_{T-1}$, shows that
$$U_{T-1} \ge \operatorname*{ess\,sup}_{P^* \in \mathcal{P}} E^*[H_T \mid \mathcal{F}_{T-1}].$$


Hence, the minimal amount $U_{T-1}$ is equal to the maximum of $H_{T-1}$ and this essential supremum. An iteration of this argument yields the recursive scheme
$$U_T = H_T \quad \text{and} \quad U_t = H_t \vee \operatorname*{ess\,sup}_{P^* \in \mathcal{P}} E^*[U_{t+1} \mid \mathcal{F}_t]$$
for $t = T-1, \dots, 0$. By combining Proposition 6.42 and Theorem 6.50, we can identify $U$ as the upper Snell envelope
$$U_t^{\uparrow} = \operatorname*{ess\,sup}_{P^* \in \mathcal{P}} U_t^{P^*} = \operatorname*{ess\,sup}_{P^* \in \mathcal{P}} \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E^*[H_\tau \mid \mathcal{F}_t]$$
of $H$ with respect to the stable set $\mathcal{P}$, where $U^{P^*}$ denotes the Snell envelope of $H$ with respect to $P^*$. In the first three sections of this chapter, we will in particular give a rigorous version of the heuristic argument above.
Note first that condition (7.1) implies that
$$\sup \Pi(H) = \sup_{P^* \in \mathcal{P}} U_0^{P^*} = \sup_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[H_\tau] < \infty,$$
where we have used the identification of the upper bound $\sup \Pi(H)$ of the arbitrage-free prices of $H$ given in Theorem 6.31. It will turn out that the following definition applies to the upper Snell envelope if we choose $\mathcal{Q} = \mathcal{P}$.
Definition 7.1. Suppose that $\mathcal{Q}$ is a non-empty set of probability measures on $(\Omega, \mathcal{F}_T)$. An adapted process is called a $\mathcal{Q}$-supermartingale if it is a supermartingale
with respect to each Q 2 Q. Analogously, we define the notions of a Q-submartingale
and of a Q-martingale.
In Theorem 5.25, we have already encountered an example of a P -martingale,
namely the value process of the replicating strategy of an attainable discounted European claim.
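Theorem 7.2. The upper Snell envelope $U^{\uparrow}$ of $H$ is the smallest $\mathcal{P}$-supermartingale that dominates $H$.
Proof. For each $P^* \in \mathcal{P}$ the recursive scheme (6.28) implies that $P^*$-a.s.
$$U_t^{\uparrow} \ge H_t \vee E^*[U_{t+1}^{\uparrow} \mid \mathcal{F}_t] \ge E^*[U_{t+1}^{\uparrow} \mid \mathcal{F}_t].$$
Since $U_0^{\uparrow}$ is a finite constant due to our integrability assumption (7.1), induction on $t$ shows that $U_t^{\uparrow}$ is integrable with respect to each $P^* \in \mathcal{P}$ and hence is a $\mathcal{P}$-supermartingale dominating $H$.
If $\tilde U$ is another $\mathcal{P}$-supermartingale which dominates $H$, then $\tilde U_T \ge H_T = U_T^{\uparrow}$. Moreover, if $\tilde U_{t+1} \ge U_{t+1}^{\uparrow}$ for some $t$, then
$$\tilde U_t \ge H_t \vee E^*[\tilde U_{t+1} \mid \mathcal{F}_t] \ge H_t \vee E^*[U_{t+1}^{\uparrow} \mid \mathcal{F}_t].$$
Thus,
$$\tilde U_t \ge H_t \vee \operatorname*{ess\,sup}_{P^* \in \mathcal{P}} E^*[U_{t+1}^{\uparrow} \mid \mathcal{F}_t] = U_t^{\uparrow},$$
and backward induction shows that $\tilde U$ dominates $U^{\uparrow}$.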


For European claims, Theorem 7.2 takes the following form.
Corollary 7.3. Let $H^E$ be a discounted European claim such that
$$\sup_{P^* \in \mathcal{P}} E^*[H^E] < \infty.$$
Then
$$V_t^{\uparrow} := \operatorname*{ess\,sup}_{P^* \in \mathcal{P}} E^*[H^E \mid \mathcal{F}_t], \quad t = 0, \dots, T,$$
is the smallest $\mathcal{P}$-supermartingale whose terminal value dominates $H^E$.


Remark 7.4. Note that the proof of Theorem 7.2 did not use any special properties of the set $\mathcal{P}$. Thus, if $\mathcal{Q}$ is an arbitrary set of equivalent probability measures, the process $U$ defined by the recursion
$$U_T = H_T \quad \text{and} \quad U_t = H_t \vee \operatorname*{ess\,sup}_{Q \in \mathcal{Q}} E_Q[U_{t+1} \mid \mathcal{F}_t]$$
is the smallest $\mathcal{Q}$-supermartingale dominating the adapted process $H$.

7.2 Uniform Doob decomposition

The aim of this section is to give a complete characterization of all non-negative $\mathcal{P}$-supermartingales. It will turn out that an integrable and non-negative process $U$ is a $\mathcal{P}$-supermartingale if and only if it can be written as the difference of a $\mathcal{P}$-martingale $N$ and an increasing adapted process $B$ satisfying $B_0 = 0$. This decomposition may be viewed as a uniform version of the Doob decomposition since it involves simultaneously the whole class $\mathcal{P}$. It will turn out that the $\mathcal{P}$-martingale $N$ has a special structure: It can be written as a stochastic integral of the underlying process $X$, which defines the class $\mathcal{P}$. On the other hand, the increasing process $B$ is only adapted, not predictable as in the Doob decomposition with respect to a single measure.


Theorem 7.5. For an adapted, non-negative process $U$, the following two statements are equivalent:
(a) $U$ is a $\mathcal{P}$-supermartingale.
(b) There exists an adapted increasing process $B$ with $B_0 = 0$ and a $d$-dimensional predictable process $\xi$ such that
$$U_t = U_0 + \sum_{k=1}^{t} \xi_k \cdot (X_k - X_{k-1}) - B_t \quad P\text{-a.s. for all } t.$$

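Proof. First, we prove the easier implication (b) $\Rightarrow$ (a). Fix $P^* \in \mathcal{P}$ and note that
$$V_T := U_0 + \sum_{k=1}^{T} \xi_k \cdot (X_k - X_{k-1}) \ge U_T \ge 0.$$
Hence, $V$ is a $\mathcal{P}$-martingale by Theorem 5.14. It follows that $U_t \in L^1(P^*)$ for all $t$. Moreover, for $P^* \in \mathcal{P}$
$$E^*[U_{t+1} \mid \mathcal{F}_t] = E^*[V_{t+1} - B_{t+1} \mid \mathcal{F}_t] \le V_t - B_t = U_t,$$
and so $U$ is a $\mathcal{P}$-supermartingale.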
The proof of the implication (a) $\Rightarrow$ (b) is similar to the proof of Theorem 5.32. We must show that for any given $t \in \{1, \dots, T\}$, there exist $\xi_t \in L^0(\Omega, \mathcal{F}_{t-1}, P; \mathbb{R}^d)$ and $R_t \in L^0_+(\Omega, \mathcal{F}_t, P)$ such that
$$U_t - U_{t-1} = \xi_t \cdot (X_t - X_{t-1}) - R_t.$$
This condition can be written as
$$U_t - U_{t-1} \in \mathcal{K}_t - L^0_+(\Omega, \mathcal{F}_t, P),$$
where $\mathcal{K}_t$ is as in (5.12). There is no loss of generality in assuming that $P$ is itself a martingale measure. In this case, $U_t - U_{t-1}$ is contained in $L^1(\Omega, \mathcal{F}_t, P)$ by the definition of a $\mathcal{P}$-supermartingale. Assume that
$$U_t - U_{t-1} \notin \mathcal{C} := \big(\mathcal{K}_t - L^0_+(\Omega, \mathcal{F}_t, P)\big) \cap L^1(P).$$
Absence of arbitrage and Lemma 1.68 imply that $\mathcal{C}$ is closed in $L^1(\Omega, \mathcal{F}_t, P)$. Hence, Theorem A.57 implies the existence of some $Z \in L^\infty(\Omega, \mathcal{F}_t, P)$ such that
$$\alpha := \sup_{W \in \mathcal{C}} E[Z W] < E[Z (U_t - U_{t-1})] =: \delta < \infty. \tag{7.2}$$
In fact, we have $\alpha = 0$ since $\mathcal{C}$ is a cone containing the constant function 0. Lemma 1.58 implies that such a random variable $Z$ must be non-negative and must satisfy
$$E[(X_t - X_{t-1}) Z \mid \mathcal{F}_{t-1}] = 0. \tag{7.3}$$


In fact, we can always modify $Z$ such that it is bounded from below by some $\varepsilon > 0$ and still satisfies (7.2). To see this, note first that every $W \in \mathcal{C}$ is dominated by a term of the form $\xi_t \cdot (X_t - X_{t-1})$. Hence, our assumption $P \in \mathcal{P}$, the integrability of $W$, and an application of Fatou's lemma yield that
$$E[W] \le E[\xi_t \cdot (X_t - X_{t-1})] \le \liminf_{c \uparrow \infty} E\big[I_{\{|\xi_t| \le c\}} \, \xi_t \cdot (X_t - X_{t-1})\big] = 0.$$
Thus, if we let $Z^\varepsilon := \varepsilon + Z$, then $Z^\varepsilon$ also satisfies $E[Z^\varepsilon W] \le 0$ for all $W \in \mathcal{C}$. If we choose $\varepsilon$ small enough, then $E[Z^\varepsilon (U_t - U_{t-1})]$ is still larger than 0; i.e., $Z^\varepsilon$ also satisfies (7.2) and in turn (7.3). Therefore, we may assume from now on that our $Z$ with (7.2) is bounded from below by some constant $\varepsilon > 0$.
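Let
$$Z_{t-1} := E[Z \mid \mathcal{F}_{t-1}],$$
and define a new measure $\tilde P \approx P$ by
$$\frac{d\tilde P}{dP} := \frac{Z}{Z_{t-1}}.$$
We claim that $\tilde P \in \mathcal{P}$. To prove this, note first that $X_k \in L^1(\tilde P)$ for all $k$, because the density $d\tilde P / dP$ is bounded. Next, let
$$\varphi_k := E\Big[\frac{Z}{Z_{t-1}} \,\Big|\, \mathcal{F}_k\Big], \quad k = 0, \dots, T.$$
If $k \ne t$, then $\varphi_{k-1} = \varphi_k$; this is clear for $k > t$, and for $k < t$ it follows from
$$\varphi_k = E\Big[\frac{E[Z \mid \mathcal{F}_{t-1}]}{Z_{t-1}} \,\Big|\, \mathcal{F}_k\Big] = 1.$$
Thus, for $k \ne t$
$$\tilde E[X_k - X_{k-1} \mid \mathcal{F}_{k-1}] = \frac{1}{\varphi_{k-1}} E[(X_k - X_{k-1}) \varphi_k \mid \mathcal{F}_{k-1}] = E[X_k - X_{k-1} \mid \mathcal{F}_{k-1}] = 0.$$
If $k = t$, then (7.3) yields that
$$\tilde E[X_k - X_{k-1} \mid \mathcal{F}_{k-1}] = \frac{1}{Z_{t-1}} E[(X_t - X_{t-1}) Z \mid \mathcal{F}_{t-1}] = 0.$$
Hence $\tilde P \in \mathcal{P}$.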

Since $\tilde P \in \mathcal{P}$, we have $\tilde E[U_t - U_{t-1} \mid \mathcal{F}_{t-1}] \le 0$, and we get
$$\begin{aligned}
0 &\ge \tilde E\big[\tilde E[U_t - U_{t-1} \mid \mathcal{F}_{t-1}] Z_{t-1}\big] \\
&= \tilde E[(U_t - U_{t-1}) Z_{t-1}] \\
&= E[(U_t - U_{t-1}) Z] \\
&= \delta.
\end{aligned}$$
This, however, contradicts the fact that $\delta > 0$.
Remark 7.6. The decomposition in part (b) of Theorem 7.5 is sometimes called the
optional decomposition of the P -supermartingale U . The existence of such a decomposition was first proved by El Karoui and Quenez [105] and D. Kramkov [182]
in a continuous-time framework where B is an optional process; this explains the
terminology.
◊

7.3 Superhedging of American and European claims

Let $H$ be a discounted American claim such that
$$\sup_{P^* \in \mathcal{P}} E^*[H_t] < \infty \quad \text{for all } t,$$
which is equivalent to the condition that the upper bound of the arbitrage-free prices of $H$ is finite:
$$\sup \Pi(H) = \sup_{P^* \in \mathcal{P}} \sup_{\tau \in \mathcal{T}} E^*[H_\tau] < \infty.$$
Our aim in this section is to construct self-financing trading strategies such that the seller of $H$ stays on the safe side in the sense that the corresponding portfolio value is always above $H$.
Definition 7.7. Any self-financing trading strategy $\bar\xi$ whose value process $V$ satisfies
$$V_t \ge H_t \quad P\text{-a.s. for all } t$$
is called a superhedging strategy for $H$.
Sometimes, a superhedging strategy is also called a superreplication strategy. According to Definition 6.33, $H$ is attainable if and only if there exist $\tau \in \mathcal{T}$ and a superhedging strategy whose value process satisfies $V_\tau = H_\tau$ $P$-almost surely.
Lemma 7.8. If $H$ is not attainable, then the value process $V$ of any superhedging strategy satisfies
$$P[\,V_t > H_t \text{ for all } t\,] > 0.$$


Proof. We introduce the stopping time
$$\tau := \inf\{t \ge 0 \mid H_t = V_t\}.$$
Then $P[\tau = \infty] = P[\,V_t > H_t \text{ for all } t\,]$. Suppose that $P[\tau = \infty] = 0$. In this case, $V_\tau = H_\tau$ $P$-a.s. so that we arrive at the contradiction that $H$ must be an attainable American claim.
Let us now turn to the question whether superhedging strategies exist. In Section 6.1, we have already seen how one can use the Doob decomposition of the Snell envelope $U^{P^*}$ of $H$ together with the martingale representation of Theorem 5.38 in order to obtain a superhedging strategy for the price $U_0^{P^*}$, where $P^*$ denotes the unique equivalent martingale measure in a complete market model. We have also seen that $U_0^{P^*}$ is the minimal amount for which such a superhedging strategy is available, and that $U_0^{P^*}$ is the unique arbitrage-free price of $H$. The same is true of any attainable American claim in an incomplete market model.
In the context of a non-attainable American claim $H$ in an incomplete financial market model, the $P^*$-Snell envelope will be replaced with the upper Snell envelope $U^\uparrow$ of $H$. The uniform Doob decomposition will take over the roles played by the usual Doob decomposition and the martingale representation theorem. Since $U^\uparrow$ is a $\mathcal P$-supermartingale by Theorem 7.2, the uniform Doob decomposition states that $U^\uparrow$ takes the form
\[
U_t^\uparrow = U_0^\uparrow + \sum_{s=1}^{t} \xi_s \cdot (X_s - X_{s-1}) - B_t \ge H_t \tag{7.4}
\]
for some predictable process $\xi$ and some increasing process $B$. Thus, the self-financing trading strategy $\bar\xi = (\xi^0, \xi)$ defined by $\xi$ and the initial capital
\[
\bar\xi_1 \cdot \bar X_0 = U_0^\uparrow = \pi_{\sup}(H)
\]
is a superhedging strategy for $H$. Moreover, if $\tilde V$ is the value process of any superhedging strategy, then Lemma 7.8 implies that $\tilde V_0 > E^*[\,H_\tau\,]$ for all $\tau \in \mathcal T$ and each $P^* \in \mathcal P$. In particular, $\tilde V_0$ is larger than any arbitrage-free price for $H$, and it follows that $\tilde V_0 \ge \pi_{\sup}(H)$. Thus, we have proved:
Corollary 7.9. There exists a superhedging strategy with initial investment $\pi_{\sup}(H)$, and this is the minimal amount needed to implement a superhedging strategy.
We will call $\pi_{\sup}(H)$ the cost of superhedging of $H$. Sometimes, a superhedging strategy is also called a superreplication strategy, and one says that $\pi_{\sup}(H)$ is the cost of superreplication or the upper hedging price of $H$. Recall, however, that $\pi_{\sup}(H)$ is typically not an arbitrage-free price for $H$. In particular, the seller cannot expect to receive the amount $\pi_{\sup}(H)$ for selling $H$.
On the other hand, the process $B$ in the decomposition (7.4) can be interpreted as a refunding scheme: Using the superhedging strategy $\bar\xi$, the seller may withdraw successively the amounts defined by the increments of $B$. With this capital flow, the hedging portfolio at time $t$ has the value $U_t^\uparrow \ge H_t$. Thus, the seller is on the safe side no matter when the buyer decides to exercise the option. As we are going to show in Theorem 7.13 below, this procedure is optimal in the sense that, if started at any time $t$, it requires a minimal amount of capital.
Remark 7.10. Suppose $\pi_{\sup}(H)$ belongs to the set $\Pi(H)$ of arbitrage-free prices for $H$. By Theorem 6.31, this holds if and only if $\pi_{\sup}(H)$ is the only element of $\Pi(H)$. In this case, the definition of $\Pi(H)$ yields a stopping time $\tau \in \mathcal T$ and some $P^* \in \mathcal P$ such that
\[
\pi_{\sup}(H) = E^*[\,H_\tau\,].
\]
Now let $V$ be the value process of a superhedging strategy bought at $V_0 = \pi_{\sup}(H)$. It follows that $E^*[\,V_\tau\,] = \pi_{\sup}(H)$. Hence, $V_\tau = H_\tau$ $P$-a.s., so that $H$ is attainable in the sense of Definition 6.33. This observation completes the proof of Theorem 6.34. ♦
Remark 7.11. If the American claim $H$ is not attainable, then $\pi_{\sup}(H)$ is not an arbitrage-free price of $H$. Thus, one may expect the existence of arbitrage opportunities if $H$ were traded at the price $\pi_{\sup}(H)$. Indeed, selling $H$ for $\pi_{\sup}(H)$ and buying a superhedging strategy $\bar\xi$ creates such an arbitrage opportunity: The balance at $t = 0$ is zero, but Lemma 7.8 implies that the value process $V$ of $\bar\xi$ cannot be reached by any exercise strategy $\sigma$, i.e., we always have
\[
V_\sigma \ge H_\sigma \quad \text{and} \quad P[\,V_\sigma > H_\sigma\,] > 0. \tag{7.5}
\]
Note that (7.5) is not limited to exercise strategies which are stopping times but holds for arbitrary $\mathcal F_T$-measurable random times $\sigma : \Omega \to \{0, \dots, T\}$. In other words, $\pi_{\sup}(H)$ is too expensive even if the buyer of $H$ had full information about the future price evolution. ♦
Remark 7.12. The argument of Remark 7.11 implies that an American claim $H$ is attainable if and only if there exists an $\mathcal F_T$-measurable random time $\sigma : \Omega \to \{0, \dots, T\}$ such that $H_\sigma = V_\sigma$, where $V$ is the value process of a superhedging strategy. In other words, the notion of attainability of American claims does not need the restriction to stopping times. ♦
We already know that $\pi_{\sup}(H)$ is the smallest amount for which one can buy a superhedging strategy at time 0. The following superhedging duality theorem extends this result to times $t > 0$. To this end, denote by $\mathcal U_t^\uparrow(H)$ the set of all $\mathcal F_t$-measurable random variables $\tilde U_t \ge 0$ for which there exists a $d$-dimensional predictable process $\tilde\xi$ such that
\[
\tilde U_t + \sum_{k=t+1}^{u} \tilde\xi_k \cdot (X_k - X_{k-1}) \ge H_u \quad \text{for all } u \ge t,\ P\text{-a.s.} \tag{7.6}
\]

Theorem 7.13. The upper Snell envelope $U_t^\uparrow$ of $H$ is the minimal element of $\mathcal U_t^\uparrow(H)$. More precisely:
(a) $U_t^\uparrow \in \mathcal U_t^\uparrow(H)$,
(b) $U_t^\uparrow = \operatorname{ess\,inf} \mathcal U_t^\uparrow(H)$.

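Proof. Assertion (a) follows immediately from the uniform Doob decomposition of the $\mathcal P$-supermartingale $U^\uparrow$. As to part (b), we clearly get $U_t^\uparrow \ge \operatorname{ess\,inf} \mathcal U_t^\uparrow(H)$ from (a). For the proof of the converse inequality, take $\tilde U_t \in \mathcal U_t^\uparrow(H)$ and choose a predictable process $\tilde\xi$ for which (7.6) holds. We must show that the set $B := \{U_t^\uparrow \le \tilde U_t\}$ satisfies $P[\,B\,] = 1$. Let
\[
\hat U_t := U_t^\uparrow \wedge \tilde U_t = U_t^\uparrow \cdot I_B + \tilde U_t \cdot I_{B^c}.
\]
Then $\hat U_t \le U_t^\uparrow$, and our claim will follow if we can show that $U_t^\uparrow \le \hat U_t$. Let $\xi$ denote the predictable process obtained from the uniform Doob decomposition of the $\mathcal P$-supermartingale $U^\uparrow$, and define
\[
\hat\xi_s := \begin{cases} \xi_s & \text{if } s \le t, \\ \xi_s \cdot I_B + \tilde\xi_s \cdot I_{B^c} & \text{if } s > t. \end{cases}
\]
With this choice, $\hat U_t$ satisfies (7.6), i.e., $\hat U_t \in \mathcal U_t^\uparrow(H)$. Let
\[
\hat V_s := U_0^\uparrow + \sum_{k=1}^{s} \hat\xi_k \cdot (X_k - X_{k-1}).
\]
Then $\hat V_s \ge U_s^\uparrow$ for all $s \le t$. In particular $\hat V_t \ge \hat U_t$, and hence $\hat V_T \ge H_T$, which implies that $\hat V$ is a $\mathcal P$-martingale; see Theorem 5.25. Hence, Doob's stopping theorem implies
\[
U_t^\uparrow = \operatorname*{ess\,sup}_{P^* \in \mathcal P}\, \operatorname*{ess\,sup}_{\tau \in \mathcal T_t} E^*[\,H_\tau \mid \mathcal F_t\,]
\le \operatorname*{ess\,sup}_{P^* \in \mathcal P}\, \operatorname*{ess\,sup}_{\tau \in \mathcal T_t} E^*\Big[\,\hat U_t + \sum_{k=t+1}^{\tau} \hat\xi_k \cdot (X_k - X_{k-1}) \,\Big|\, \mathcal F_t\,\Big]
= \hat U_t,
\]
which concludes the proof.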


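We now take the point of view of the buyer of the American claim $H$. The buyer allocates an initial investment $\pi$ to purchase $H$, and then receives the amount $H_\tau \ge 0$. The objective is to find an exercise strategy and a self-financing trading strategy $\bar\eta$ with initial investment $-\pi$, such that the portfolio value is covered by the payoff of the claim. In other words, find $\tau \in \mathcal T$ and a self-financing trading strategy with value process $V$ such that $V_0 = -\pi$ and $V_\tau + H_\tau \ge 0$. As shown below, the maximal $\pi$ for which this is possible is equal to
\[
\pi_{\inf}(H) = \sup_{\tau \in \mathcal T}\, \inf_{P^* \in \mathcal P} E^*[\,H_\tau\,] = \inf_{P^* \in \mathcal P}\, \sup_{\tau \in \mathcal T} E^*[\,H_\tau\,] = U_0^\downarrow,
\]
where
\[
U_t^\downarrow = \operatorname*{ess\,inf}_{P^* \in \mathcal P} U_t^{P^*}
= \operatorname*{ess\,inf}_{P^* \in \mathcal P}\, \operatorname*{ess\,sup}_{\tau \in \mathcal T_t} E^*[\,H_\tau \mid \mathcal F_t\,]
= \operatorname*{ess\,sup}_{\tau \in \mathcal T_t}\, \operatorname*{ess\,inf}_{P^* \in \mathcal P} E^*[\,H_\tau \mid \mathcal F_t\,]
\]
is the lower Snell envelope of $H$ with respect to the stable set $\mathcal P$. More generally, we will consider the buyer's problem for arbitrary $t \ge 0$. To this end, denote by $\mathcal U_t^\downarrow(H)$ the set of all $\mathcal F_t$-measurable random variables $\tilde U_t \ge 0$ for which there exists a $d$-dimensional predictable process $\tilde\eta$ and a stopping time $\sigma \in \mathcal T_t$ such that
\[
\tilde U_t - \sum_{k=t+1}^{\sigma} \tilde\eta_k \cdot (X_k - X_{k-1}) \le H_\sigma \quad P\text{-a.s.}
\]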
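Theorem 7.14. $U_t^\downarrow$ is the maximal element of $\mathcal U_t^\downarrow(H)$. More precisely:
(a) $U_t^\downarrow \in \mathcal U_t^\downarrow(H)$,
(b) $U_t^\downarrow = \operatorname{ess\,sup} \mathcal U_t^\downarrow(H)$.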


Proof. (a): Let $\bar\xi$ be a superhedging strategy for $H$ with initial investment $\pi_{\sup}(H)$, and denote by $V$ the value process of $\bar\xi$. The main idea of the proof is to use that $V_t - H_t \ge 0$ can be regarded as a new discounted American claim, to which we can apply Theorem 7.13. However, we must take care of the basic asymmetry of the hedging problem for American options: The seller of $H$ must hedge against all possible exercise strategies, while the buyer must find only one suitable stopping time. It will turn out that a suitable stopping time is given by $\tau_t := \inf\{\, u \ge t \mid U_u^\downarrow = H_u \,\}$. With this choice, let us define a modified discounted American claim $\tilde H$ by
\[
\tilde H_u = (V_u - H_u) \cdot I_{\{u = \tau_t\}}, \quad u = 0, \dots, T.
\]
Clearly $\tilde H_\sigma \le \tilde H_{\tau_t}$ for all $\sigma \in \mathcal T_t$. It follows that
\[
\operatorname*{ess\,sup}_{P^* \in \mathcal P}\, \operatorname*{ess\,sup}_{\sigma \in \mathcal T_t} E^*[\,\tilde H_\sigma \mid \mathcal F_t\,]
= \operatorname*{ess\,sup}_{P^* \in \mathcal P} E^*[\,\tilde H_{\tau_t} \mid \mathcal F_t\,]
= V_t - \operatorname*{ess\,inf}_{P^* \in \mathcal P} E^*[\,H_{\tau_t} \mid \mathcal F_t\,]
= V_t - U_t^\downarrow,
\]
where we have used that $V$ is a $\mathcal P$-martingale in the second and Theorem 6.46 in the third step. Thus, $V_t - U_t^\downarrow$ is equal to the upper Snell envelope $\tilde U^\uparrow$ of $\tilde H$ at time $t$. Let $\tilde\xi$ be the $d$-dimensional predictable process obtained from the uniform Doob decomposition of $\tilde U^\uparrow$. Then, due to part (a) of Theorem 7.13,
\[
V_t - U_t^\downarrow + \sum_{k=t+1}^{u} \tilde\xi_k \cdot (X_k - X_{k-1}) \ge \tilde H_u = (V_u - H_u) \cdot I_{\{u = \tau_t\}} \quad \text{for all } u \ge t.
\]
Thus, $\eta := \tilde\xi - \xi$ is as desired.
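(b): Part (a) implies the inequality $\le$ in (b). To prove its converse, take $\tilde U_t \in \mathcal U_t^\downarrow(H)$, a $d$-dimensional predictable process $\tilde\eta$, and $\sigma \in \mathcal T_t$ such that
\[
\tilde U_t - \sum_{k=t+1}^{\sigma} \tilde\eta_k \cdot (X_k - X_{k-1}) \le H_\sigma \quad P\text{-a.s.}
\]
We will show below that
\[
E^*\Big[\, \sum_{k=t+1}^{\sigma} \tilde\eta_k \cdot (X_k - X_{k-1}) \,\Big|\, \mathcal F_t \,\Big] = 0 \quad \text{for all } P^* \in \mathcal P. \tag{7.7}
\]
Given this fact, we obtain that
\[
\tilde U_t \le E^*[\,H_\sigma \mid \mathcal F_t\,] \le \operatorname*{ess\,sup}_{\tau \in \mathcal T_t} E^*[\,H_\tau \mid \mathcal F_t\,]
\]
for all $P^* \in \mathcal P$. Taking the essential infimum over $P^* \in \mathcal P$ thus yields $\tilde U_t \le U_t^\downarrow$ and in turn (b).
To prove (7.7), let
\[
\tilde G_s := I_{\{s \ge t+1\}} \sum_{k=t+1}^{s} I_{\{k \le \sigma\}}\, \tilde\eta_k \cdot (X_k - X_{k-1}), \quad s = 0, \dots, T.
\]
Then $\tilde G_T \ge \tilde U_t - H_\sigma \ge -H_\sigma \in L^1(P^*)$ for all $P^*$, and Theorem 5.14 implies that $\tilde G$ is a $\mathcal P$-martingale. Hence (7.7) follows.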


We conclude this section by stating explicitly the corresponding results for European claims. Recall from Remark 6.6 that every discounted European claim $H^E$ can be regarded as a discounted American claim. Therefore, the results we have obtained so far include the corresponding European counterparts as special cases.
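Corollary 7.15. For any discounted European claim $H^E$ such that
\[
\sup_{P^* \in \mathcal P} E^*[\,H^E\,] < \infty,
\]
there exist two $d$-dimensional predictable processes $\xi$ and $\eta$ such that $P$-a.s.
\[
\operatorname*{ess\,sup}_{P^* \in \mathcal P} E^*[\,H^E \mid \mathcal F_t\,] + \sum_{k=t+1}^{T} \xi_k \cdot (X_k - X_{k-1}) \ge H^E, \tag{7.8}
\]
\[
\operatorname*{ess\,inf}_{P^* \in \mathcal P} E^*[\,H^E \mid \mathcal F_t\,] - \sum_{k=t+1}^{T} \eta_k \cdot (X_k - X_{k-1}) \le H^E. \tag{7.9}
\]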

Remark 7.16. For $t = 0$, (7.8) takes the form
\[
\sup_{P^* \in \mathcal P} E^*[\,H^E\,] + \sum_{k=1}^{T} \xi_k \cdot (X_k - X_{k-1}) \ge H^E \quad P\text{-a.s.}
\]
Thus, the self-financing trading strategy $\bar\xi$ arising from $\xi$ and the initial investment $\bar\xi_1 \cdot \bar X_0 = \sup_{P^* \in \mathcal P} E^*[\,H^E\,]$ allows the seller to cover all possible obligations without any downside risk. Similarly, (7.9) yields an interpretation of the self-financing trading strategy $\bar\eta$ which arises from $\eta$ and the initial investment
\[
\bar\eta_1 \cdot \bar X_0 = -\inf_{P^* \in \mathcal P} E^*[\,H^E\,].
\]
The latter quantity corresponds to the largest loan the buyer can take out and still be sure that, by using the trading strategy $\bar\eta$, this debt will be covered by the payoff $H^E$. ♦
Remark 7.17. Let $H$ be a discounted European claim such that
\[
\sup_{P^* \in \mathcal P} E^*[\,H\,] < \infty.
\]
Suppose that $\hat P \in \mathcal P$ is such that $\hat E[\,H\,] = \pi_{\sup}(H)$. If $\bar\xi = (\xi^0, \xi)$ is a superhedging strategy for $H$, then
\[
\hat H := \hat E[\,H\,] + \sum_{k=1}^{T} \xi_k \cdot (X_k - X_{k-1})
\]
satisfies $\hat H \ge H \ge 0$. Hence, $\hat H$ is an attainable discounted claim, and it follows from Theorem 5.25 that
\[
\hat E[\,\hat H\,] = \hat E[\,H\,].
\]
This shows that $\hat H$ and $H$ are identical and that $H$ is attainable. We have thus obtained another proof of Theorem 5.32. ♦
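As the last result in this section, we formulate the following superhedging duality theorem, which states that the bounds in (7.8) and (7.9) are optimal.
Corollary 7.18. Suppose that $H^E$ is a discounted European claim with
\[
\sup_{P^* \in \mathcal P} E^*[\,H^E\,] < \infty.
\]
Denote by $\mathcal U_t^\uparrow(H^E)$ the set of all $\mathcal F_t$-measurable random variables $\tilde U_t$ for which there exists a $d$-dimensional predictable process $\tilde\xi$ such that
\[
\tilde U_t + \sum_{k=t+1}^{T} \tilde\xi_k \cdot (X_k - X_{k-1}) \ge H^E \quad P\text{-a.s.}
\]
Then
\[
\operatorname*{ess\,sup}_{P^* \in \mathcal P} E^*[\,H^E \mid \mathcal F_t\,] = \operatorname{ess\,inf} \mathcal U_t^\uparrow(H^E).
\]
By $\mathcal U_t^\downarrow(H^E)$ we denote the set of all $\mathcal F_t$-measurable random variables $\tilde U_t$ for which there exists a $d$-dimensional predictable process $\tilde\eta$ such that
\[
\tilde U_t - \sum_{k=t+1}^{T} \tilde\eta_k \cdot (X_k - X_{k-1}) \le H^E \quad P\text{-a.s.}
\]
Then
\[
\operatorname*{ess\,inf}_{P^* \in \mathcal P} E^*[\,H^E \mid \mathcal F_t\,] = \operatorname{ess\,sup} \mathcal U_t^\downarrow(H^E).
\]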

Remark 7.19. Define $\mathcal A$ as the set of financial positions $Z \in L^\infty(\Omega, \mathcal F_T, P)$ which are acceptable in the sense that there exists a $d$-dimensional predictable process $\xi$ such that
\[
Z + \sum_{k=1}^{T} \xi_k \cdot (X_k - X_{k-1}) \ge 0 \quad P\text{-a.s.}
\]
As in Section 4.8, this set $\mathcal A$ induces a coherent risk measure $\rho$ on $L^\infty(\Omega, \mathcal F_T, P)$:
\[
\rho(Z) = \inf\{\, m \in \mathbb R \mid m + Z \in \mathcal A \,\}, \quad Z \in L^\infty(\Omega, \mathcal F_T, P).
\]
Corollary 7.18 implies that $\rho$ can be represented as
\[
\rho(Z) = \sup_{P^* \in \mathcal P} E^*[\,-Z\,].
\]
We therefore obtain a multiperiod version of Proposition 4.99.

Remark 7.20. Often, the superhedging strategy in a given incomplete model can be identified as the perfect hedge in an associated extremal model. As an example, consider a one-period model with $d$ discounted risky assets given by bounded random variables $X^1, \dots, X^d$. Denote by $\mu$ the distribution of $X = (X^1, \dots, X^d)$ and by $\Gamma(\mu)$ the convex hull of the support of $\mu$. The closure $K := \overline{\Gamma(\mu)}$ of $\Gamma(\mu)$ is convex and compact. We know from Section 1.5 that the model is arbitrage-free if and only if the price system $\pi = (\pi^1, \dots, \pi^d)$ is contained in the relative interior of $\Gamma(\mu)$, and the equivalent martingale measures can be identified with the measures $\mu^* \approx \mu$ with barycenter $\pi$. Consider a derivative $H = h(X)$ given by a convex function $h$ on $K$. The cost of superhedging is given by
\[
\sup_{\mu^*} \int h \, d\mu^* = \inf\{\, \alpha(\pi) \mid \alpha \text{ affine on } K,\ \alpha \ge h \ \mu\text{-a.s.} \,\},
\]
which is a special case of the duality result of Theorem 1.32. Since $\{\alpha \ge h\}$ is convex and closed, the condition $\mu(\alpha \ge h) = 1$ implies $\alpha \ge h$ on $K$. Denote by $M(\pi)$ the class of all probability measures on $K$ with barycenter $\pi$. For any affine function $\alpha$ with $\alpha \ge h$ on $K$, and for any $\tilde\mu \in M(\pi)$ we have
\[
\int h \, d\tilde\mu \le \int \alpha \, d\tilde\mu = \alpha(\pi).
\]
Thus,
\[
\hat h(\pi) = \sup_{\tilde\mu \in M(\pi)} \int h \, d\tilde\mu \tag{7.10}
\]
where we define for $f \in C(K)$
\[
\hat f := \inf\{\, \alpha \mid \alpha \text{ affine on } K,\ \alpha \ge f \ \mu\text{-a.s.} \,\}.
\]
The supremum in (7.10) is attained since $M(\pi)$ is weakly compact. More precisely, it is attained by any measure $\hat\mu \in M(\pi)$ on $K$ which is maximal with respect to the balayage order $\succcurlyeq_{\mathrm{bal}}$ defined for measures on $K$ as in (2.24); see Théorème X.41 in [87]. But such a maximal measure is supported by the set of extreme points of the convex compact set $K$, i.e., by the Choquet boundary of $K$. This follows from a general integral representation theorem of Choquet; see, e.g., Théorème X.43 of [87]. In our finite-dimensional setting, $\hat\mu$ can in fact be chosen to have a support consisting of at most $d + 1$ points, due to a theorem of Carathéodory and the representation of $K$ as the convex hull of its extreme points; see [221], Theorems 17.1 and 18.5. But this means that $\hat\mu$ can be identified with a complete model, due to Proposition 1.41. Thus, the cost of superhedging $\hat h(\pi)$ can be identified with the canonical price
\[
\hat\pi := \int h \, d\hat\mu
\]
of the derivative $H$, computed in the complete model $\hat\mu$. Note that $\hat\mu$ sits on the Choquet boundary of $K = \overline{\Gamma(\mu)}$, but typically it will no longer be equivalent or absolutely continuous with respect to the original measure $\mu$. As a simple illustration, consider a one-period model with one risky asset $X^1$. If $X^1$ is bounded, then the distribution $\mu$ of $X^1$ has bounded support, and $\overline{\Gamma(\mu)}$ is of the form $[a, b]$. In this case, the cost of superhedging $H = h(X^1)$ for a convex function $h$ is given by the price
\[
p^* h(b) + (1 - p^*) h(a),
\]
computed in the binary model in which $X^1$ takes only the values $a$ and $b$, and where $p^* \in (0, 1)$ is determined by
\[
p^* b + (1 - p^*) a = \pi^1.
\]
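The one-dimensional illustration at the end of Remark 7.20 can be checked numerically. The following is a minimal sketch, assuming a hypothetical support $[a, b] = [80, 130]$, a price $\pi^1 = 100$ and a call payoff with strike 100; none of these numbers come from the text, and the function names are illustrative. It computes $p^*$ from the barycenter condition, prices $h$ in the binary model, and verifies on a grid that the affine function through $(a, h(a))$ and $(b, h(b))$ dominates the convex payoff, which is what makes it a superhedge.

```python
from fractions import Fraction

def binary_superhedge_cost(h, a, b, pi1):
    """Price of h(X^1) in the binary model on {a, b} with barycenter pi1."""
    if not a < pi1 < b:
        raise ValueError("pi1 must lie strictly between a and b")
    p_star = (pi1 - a) / (b - a)          # solves p*b + (1-p)*a = pi1
    cost = p_star * h(b) + (1 - p_star) * h(a)
    return p_star, cost

def call(x, strike=Fraction(100)):
    """Convex payoff h(x) = (x - strike)^+ (illustrative choice)."""
    return max(x - strike, Fraction(0))

# Hypothetical model data, chosen only for illustration.
a, b, pi1 = Fraction(80), Fraction(130), Fraction(100)
p_star, cost = binary_superhedge_cost(call, a, b, pi1)

# The chord through (a, h(a)) and (b, h(b)) is the affine superhedge.
slope = (call(b) - call(a)) / (b - a)
def chord(x):
    return call(a) + slope * (x - a)

grid = [a + Fraction(k, 10) * (b - a) for k in range(11)]
dominates = all(chord(x) >= call(x) for x in grid)
print(p_star, cost, dominates)
```

Exact rational arithmetic is used so that the comparison between the chord and the payoff is not affected by rounding.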

The following example illustrates that a superhedging strategy is typically too expensive from a practical point of view. However, we will see in Chapter 8 how superhedging strategies can be used in order to construct other hedging strategies which
are efficient in terms of cost and shortfall risk.
Example 7.21. Consider a simple one-period model where $S_1^1$ has under $P$ a Poisson distribution and where $S^0 \equiv 1$. Let $H := (S_1^1 - K)^+$ be a call option with strike $K > 0$. We have seen in Example 1.38 that $\pi_{\inf}(H)$ and $\pi_{\sup}(H)$ coincide with the universal arbitrage bounds of Remark 1.37:
\[
(S_0^1 - K)^+ = \pi_{\inf}(H) \quad \text{and} \quad \pi_{\sup}(H) = S_0^1.
\]
Thus, the superhedging strategy for the seller consists in the trivial hedge of buying the asset at time 0, while the corresponding strategy for the buyer is a short-sale of the asset in case the option is in the money, i.e., if $S_0^1 > K$. ♦

7.4 Superhedging with liquid options

In practice, some derivatives such as put or call options are traded so frequently that
their prices are quoted just like those of the primary assets. The prices of such liquid
options can be regarded as an additional source of information on the expectations of
the market as to the future evolution of asset prices. This information can be exploited
in various ways. First, it serves to single out those martingale measures $P^*$ which are compatible with the observed options prices, in the sense that the observed prices coincide with the expectations of the discounted payoff under $P^*$. Second, liquid options may be used as instruments for hedging more exotic options.
Our aim in this section is to illustrate these ideas in a simple setting. Assume that there is only one risky asset $S^1$ such that $S_0^1$ is a positive constant, and that $S^0$ is a riskless bond with interest rate $r = 0$. Thus, the discounted price process of the risky asset is given by $X_t = S_t^1 \ge 0$ for $t = 0, \dots, T$. As the underlying space of scenarios, we use the product space
\[
\Omega := [0, \infty)^T.
\]
We define $X_t(\omega) = x_t$ for $\omega = (x_1, \dots, x_T) \in \Omega$, and denote by $\mathcal F_t$ the $\sigma$-algebra generated by $X_0, \dots, X_t$; note that $\mathcal F_0 = \{\emptyset, \Omega\}$. No probability measure $P$ is given a priori. Let us now introduce a linear space $\mathcal X$ of $\mathcal F_T$-measurable functions as the smallest linear space such that the following conditions are satisfied:
(a) $1 \in \mathcal X$.
(b) $(X_t - X_s)\, I_A \in \mathcal X$ for $0 \le s < t \le T$ and $A \in \mathcal F_s$.
(c) $(X_t - K)^+ \in \mathcal X$ for $K \ge 0$ and $t = 1, \dots, T$.
The functions in the space $\mathcal X$ will be interpreted as (discounted) payoffs of liquid derivatives. The constant 1 in (a) corresponds to a unit investment into the riskless bond. The function $X_t - X_s$ in (b) corresponds to the payoff of a forward contract on the risky asset, issued at time $s$ for the price $X_s$ and expiring at time $t$. The decision to buy such a forward contract at time $s$ may depend on the market situation at time $s$; this is taken into account by allowing for payoffs $(X_t - X_s)\, I_A$ with $A \in \mathcal F_s$. Linearity of $\mathcal X$ together with conditions (a) and (b) implies that
\[
X_t \in \mathcal X \quad \text{for all } t.
\]
Finally, condition (c) states that call options with any possible strike and any maturity up to time $T$ can be used as liquid securities.
Suppose that a linear pricing rule $\Phi$ is given on $\mathcal X$. The value $\Phi(Y)$ will be interpreted as the market price of the liquid security $Y \in \mathcal X$. The price of a liquid call option with strike $K$ and maturity $t$ will be denoted by
\[
C_t(K) := \Phi((X_t - K)^+).
\]
Assumption 7.22. We assume that $\Phi : \mathcal X \to \mathbb R$ is a linear functional which satisfies the following conditions:
(a) $\Phi(1) = 1$.
(b) $\Phi(Y) \ge 0$ if $Y \ge 0$.
(c) $\Phi((X_t - X_s)\, I_A) = 0$ for all $0 \le s < t \le T$ and $A \in \mathcal F_s$.
(d) $C_t(K) = \Phi((X_t - K)^+) \to 0$ as $K \uparrow \infty$ for all $t$.
The first two conditions must clearly be satisfied if the pricing rule $\Phi$ is not to create arbitrage opportunities. Condition (c) states that $X_s$ is the fair price for a forward contract issued at time $s$. This condition is quite natural in view of Theorem 5.29. In our present setting, it can also be justified by the following simple replication argument. At time $s$, take out a loan $X_s(\omega)$ and use it for buying the asset. At time $t$, the asset is worth $X_t(\omega)$ and the loan must be paid back, which results in a balance $X_t(\omega) - X_s(\omega)$. Since this investment strategy requires zero initial capital, the price of the corresponding payoff should also be zero. The continuity condition (d) is also quite natural.
Our first goal is to show that any such pricing rule $\Phi$ is compatible with the paradigm that arbitrage-free prices can be identified as expectations with respect to some martingale measure for $X$. More precisely, we are going to construct a martingale measure $P^*$ such that $\Phi(Y) = E^*[\,Y\,]$ for all $Y \in \mathcal X$. On the one hand, this will imply regularity properties of $\Phi$. On the other hand, this will yield an extension of our pricing rule $\Phi$ to a larger space of payoffs including path-dependent exotic options.
As a first step in this direction, we have the following result.
Lemma 7.23. For each $t$, there exists a unique probability measure $\mu_t$ on $[0, \infty)$ such that for all $K \ge 0$
\[
C_t(K) = \Phi((X_t - K)^+) = \int (x - K)^+ \, \mu_t(dx).
\]
In particular, $\mu_t$ has the mean
\[
\int x \, \mu_t(dx) = X_0.
\]
Proof. Since $K \mapsto (X_t - K)^+$ is convex and decreasing, linearity and positivity of $\Phi$ imply that the function $K \mapsto C_t(K) = \Phi((X_t - K)^+)$ is convex and decreasing as well. Hence, there exists a decreasing right-continuous function $f : [0, \infty) \to [0, \infty)$ such that
\[
C_t(K) = C_t(0) - \int_0^K f(x) \, dx = X_0 - \int_0^K f(x) \, dx,
\]
i.e., $-f(K)$ is equal to the right-hand derivative of $C_t(K)$ at $K$. Our fourth condition on $\Phi$ yields
\[
\int_0^\infty f(x) \, dx = X_0,
\]
so that $f(x) \searrow 0$ as $x \uparrow \infty$. Hence, there exists a positive measure $\mu_t$ on $(0, \infty)$ such that
\[
f(x) = \mu_t((x, \infty)) \quad \text{for } x > 0.
\]
Fubini's theorem implies
\[
\int_{(0,\infty)} x \, \mu_t(dx) = \int_0^\infty f(y) \, dy = X_0
\]
and
\[
C_t(K) = X_0 - \int_0^K \int_{(0,\infty)} I_{\{y < x\}} \, \mu_t(dx) \, dy = \int_{(0,\infty)} (x - K)^+ \, \mu_t(dx).
\]
It remains to show that $\mu_t$ can be extended to a probability measure on $[0, \infty)$, i.e., we must show that $\mu_t((0, \infty)) \le 1$. To this end, we will use the put-call parity
\[
C_t(K) = X_0 - K + \Phi((K - X_t)^+),
\]
which follows from our assumptions on $\Phi$. Thus,
\[
\Phi((K - X_t)^+) = \int_0^K g(x) \, dx,
\]
where $g(x) = 1 - f(x)$. Since $K \mapsto \Phi((K - X_t)^+)$ is increasing, $g$ must be nonnegative, and we obtain $1 \ge f(0) = \mu_t((0, \infty))$.
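The two facts used in this proof, that $-f(K) = -\mu_t((K, \infty))$ is the right-hand derivative of $C_t$ and that put-call parity holds, can be illustrated on a discrete example. The sketch below assumes a hypothetical law $\mu_t$ with four atoms (the atoms, weights and helper names are illustrative and not taken from the text). It builds the call price curve from the law and then recovers the tail masses by a forward difference, which is exact here because $C_t$ is piecewise linear between atoms.

```python
from fractions import Fraction

# Hypothetical discrete law mu_t on [0, inf): atoms and weights (illustration only).
atoms = [Fraction(0), Fraction(50), Fraction(100), Fraction(150)]
weights = [Fraction(1, 10), Fraction(3, 10), Fraction(4, 10), Fraction(2, 10)]

def call_price(K):
    """C_t(K) = integral of (x - K)^+ with respect to mu_t."""
    return sum(w * max(x - K, 0) for x, w in zip(atoms, weights))

def put_price(K):
    """Integral of (K - x)^+ with respect to mu_t."""
    return sum(w * max(K - x, 0) for x, w in zip(atoms, weights))

def tail(K):
    """mu_t((K, inf))."""
    return sum(w for x, w in zip(atoms, weights) if x > K)

def right_derivative(K, eps=Fraction(1, 1000)):
    """Forward difference of C_t; exact if no atom lies in (K, K + eps]."""
    return (call_price(K + eps) - call_price(K)) / eps

X0 = call_price(0)  # the mean of mu_t, i.e. the forward price
strikes = [Fraction(k) for k in (0, 25, 50, 75, 100, 125, 150)]
for K in strikes:
    print(K, call_price(K), right_derivative(K), -tail(K))
```

Reading the tail masses off the slopes of the call price curve is the discrete analogue of the construction of $\mu_t$ from $f$ in the proof.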
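The following lemma shows that the measures $\mu_t$ constructed in Lemma 7.23 are related to each other by the balayage order $\succcurlyeq_{\mathrm{bal}}$, defined by
\[
\mu \succcurlyeq_{\mathrm{bal}} \nu \quad :\Longleftrightarrow \quad \int f \, d\mu \ge \int f \, d\nu \ \text{ for all convex functions } f
\]
for probability measures with finite expectation; see Remark 2.63.
Lemma 7.24. The map $t \mapsto \mu_t$ is increasing with respect to the balayage order $\succcurlyeq_{\mathrm{bal}}$:
\[
\mu_{t+1} \succcurlyeq_{\mathrm{bal}} \mu_t \quad \text{for all } t.
\]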

Proof. Note that
\[
(X_{t+1} - K)^+ - (X_{t+1} - X_t)\, I_{\{X_t > K\}} - (X_t - K)^+
\ge (X_{t+1} - K)^+ - (X_{t+1} - K)^+ I_{\{X_t > K\}}
\ge 0.
\]
Since the price of the forward contract $(X_{t+1} - X_t)\, I_{\{X_t > K\}}$ vanishes under our pricing rule $\Phi$, we must have that for all $K \ge 0$
\[
\int (x - K)^+ \, \mu_{t+1}(dx) - \int (x - K)^+ \, \mu_t(dx) = C_{t+1}(K) - C_t(K) \ge 0.
\]
An application of Corollary 2.61 concludes the proof.
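The pathwise inequality at the start of this proof does not depend on any probability measure, so it can be tested by brute force. The following sketch evaluates the left-hand side on a hypothetical integer grid of values for $X_t$, $X_{t+1}$ and $K$ (the grid and the function name `gap` are assumptions made for illustration) and reports the smallest value found.

```python
from itertools import product

def gap(x_t, x_next, K):
    """(x_next - K)^+ - (x_next - x_t) 1{x_t > K} - (x_t - K)^+."""
    forward = (x_next - x_t) if x_t > K else 0   # forward contract, entered only if x_t > K
    return max(x_next - K, 0) - forward - max(x_t - K, 0)

# Hypothetical grid of prices and strikes (illustration only).
grid = range(0, 201, 5)
min_gap = min(gap(x, y, K) for x, y, K in product(grid, grid, grid))
print(min_gap)
```

A minimum of zero is consistent with the inequality: it is attained, for instance, whenever $X_{t+1} = X_t$.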
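Let us introduce the class
\[
\mathcal P_\Phi = \{\, P^* \in \mathcal M_1(\Omega, \mathcal F) \mid E^*[\,Y\,] = \Phi(Y) \text{ for all } Y \in \mathcal X \,\}
\]
of all probability measures $P^*$ on $(\Omega, \mathcal F)$ which coincide with $\Phi$ on $\mathcal X$. Note that for any $P^* \in \mathcal P_\Phi$,
\[
E^*[\,(X_t - X_s)\, I_A\,] = 0 \quad \text{for } s < t \text{ and } A \in \mathcal F_s,
\]
so that $\mathcal P_\Phi$ consists of martingale measures for $X$. Our first main result in this section can be regarded as a version of the fundamental theorem of asset pricing without an a priori measure $P$.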
Theorem 7.25. Under Assumption 7.22, the class $\mathcal P_\Phi$ is non-empty. Moreover, there exists $P^* \in \mathcal P_\Phi$ with the Markov property: For $0 \le s \le t \le T$ and each bounded measurable function $f$,
\[
E^*[\,f(X_t) \mid \mathcal F_s\,] = E^*[\,f(X_t) \mid X_s\,].
\]
Proof. Since $\mu_{t+1} \succcurlyeq_{\mathrm{bal}} \mu_t$, Corollary 2.61 yields the existence of a stochastic kernel $Q_{t+1}$ such that $\int y \, Q_{t+1}(x, dy) = x$ and $\mu_{t+1} = \mu_t Q_{t+1}$. Let us define
\[
P^* := \mu_1 \otimes Q_2 \otimes \cdots \otimes Q_T,
\]
i.e., for each measurable set $A \subset \Omega = [0, \infty)^T$
\[
P^*[\,A\,] = \int \mu_1(dx_1) \int Q_2(x_1, dx_2) \cdots \int Q_T(x_{T-1}, dx_T) \, I_A(x_1, x_2, \dots, x_T).
\]
Clearly, $\mu_t$ is the law of $X_t$ under $P^*$. In particular, all call options are priced correctly by calculating their expectation with respect to $P^*$. Then one checks that $P^*$-a.s.
\[
E^*[\,f(X_{t+1}) \mid \mathcal F_t\,] = \int f(y) \, Q_{t+1}(X_t, dy) = E^*[\,f(X_{t+1}) \mid X_t\,]. \tag{7.11}
\]
The first identity above implies $E^*[\,X_{t+1} - X_t \mid \mathcal F_t\,] = 0$. In particular, $P^*$ is a martingale measure, and the expectation of $(X_t - X_s)\, I_A$ vanishes for $s < t$ and $A \in \mathcal F_s$. It follows that $E^*[\,Y\,] = \Phi(Y)$ for all $Y \in \mathcal X$. Finally, an induction argument applied to (7.11) yields the Markov property.

So far, we have assumed that our space $\mathcal X$ of liquidly traded derivatives contains call options with all possible strike prices and maturities. From now on, we will simplify our setting by assuming that only call options with maturity $T$ are liquidly traded. Thus, we replace $\mathcal X$ by the smaller space $\mathcal X_T$ which is defined as the linear hull of the constants, of all forward contracts
\[
(X_t - X_s)\, I_A, \quad 0 \le s < t \le T,\ A \in \mathcal F_s,
\]
and of all call options
\[
(X_T - K)^+, \quad K \ge 0,
\]
with maturity $T$. The observed market prices of derivatives in $\mathcal X_T$ are as before modeled by a linear pricing rule
\[
\Phi_T : \mathcal X_T \to \mathbb R.
\]
We assume that $\Phi_T$ satisfies Assumption 7.22 in the sense that condition (d) is only required for $t = T$:
(d') $C_T(K) := \Phi_T((X_T - K)^+) \to 0$ as $K \uparrow \infty$.
By
\[
\mathcal P_{\Phi_T} = \{\, P^* \in \mathcal M_1(\Omega, \mathcal F) \mid E^*[\,Y\,] = \Phi_T(Y) \text{ for all } Y \in \mathcal X_T \,\}
\]
we denote the class of all probability measures $P^*$ on $(\Omega, \mathcal F)$ which coincide with $\Phi_T$ on $\mathcal X_T$. As before, it follows from condition (c) of Assumption 7.22 that any $P^* \in \mathcal P_{\Phi_T}$ will be a martingale measure for the price process $X$. Obviously, any linear pricing rule $\Phi$ which is defined on the full space $\mathcal X$ and which satisfies Assumption 7.22 can be restricted to $\mathcal X_T$, and this restriction satisfies the above assumptions. Thus, we have $\mathcal P_{\Phi_T} \supset \mathcal P_\Phi \ne \emptyset$.
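Proposition 7.26. Under the above assumptions, $\mathcal P_{\Phi_T}$ is non-empty.
Proof. Let $\mu_T$ be the measure constructed in Lemma 7.23 from the call prices with maturity $T$. Now consider the measure $\tilde P$ on $(\Omega, \mathcal F)$ defined as
\[
\tilde P := \delta_{X_0} \otimes \cdots \otimes \delta_{X_0} \otimes \mu_T,
\]
i.e., under $\tilde P$ we have $X_t = X_0$ $\tilde P$-a.s. for $t < T$, and the law of $X_T$ is $\mu_T$. Clearly, we have $\tilde P \in \mathcal P_{\Phi_T}$.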
A measure $P^* \in \mathcal P_{\Phi_T}$ can be regarded as an extension of the pricing rule $\Phi_T$ to the larger space $L^1(P^*)$, and the expectation $E^*[\,H\,]$ of some European claim $H \ge 0$ can be regarded as an arbitrage-free price for $H$. Our aim is to obtain upper and lower bounds for $E^*[\,H\,]$ which hold simultaneously for all $P^* \in \mathcal P_{\Phi_T}$. We will derive such bounds for various exotic options; this will amount to the construction of certain superhedging strategies in terms of liquid securities.

As a first example, we consider the following digital option
\[
H^{\mathrm{dig}} := \begin{cases} 1 & \text{if } \max_{0 \le t \le T} X_t \ge B, \\ 0 & \text{otherwise,} \end{cases}
\]
which has a unit payoff if the price process reaches a given upper barrier $B > X_0$. If we denote by
\[
\tau_B := \inf\{\, t \ge 0 \mid X_t \ge B \,\}
\]
the first hitting time of the barrier $B$, then the payoff of the digital option can also be described as
\[
H^{\mathrm{dig}} = I_{\{\tau_B \le T\}}.
\]
For simplicity, we will assume from now on that
\[
C_T(B) > 0,
\]
so that in particular $\mu_T((B, \infty)) > 0$.
Theorem 7.27. The following upper bound on the arbitrage-free prices of the digital option holds:
\[
\max_{P^* \in \mathcal P_{\Phi_T}} E^*[\,H^{\mathrm{dig}}\,] = \min_{0 \le K < B} \frac{C_T(K)}{B - K}. \tag{7.12}
\]
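Proof. For $0 \le K < B$, we have $X_{\tau_B} \ge B$ and
\[
H^{\mathrm{dig}} = I_{\{\tau_B \le T\}} \le \frac{(X_T - K)^+}{B - K} + \frac{X_{\tau_B} - X_T}{B - K}\, I_{\{\tau_B \le T\}}.
\]
Taking expectations with respect to some $P^* \in \mathcal P_{\Phi_T}$ yields
\[
E^*[\,H^{\mathrm{dig}}\,] \le \frac{C_T(K)}{B - K} + \frac{1}{B - K}\, E^*[\,(X_{\tau_B} - X_T)\, I_{\{\tau_B \le T\}}\,].
\]
Since $P^*$ is a martingale measure, the stopping theorem in the form of Proposition 6.36 implies
\[
E^*[\,X_T\, I_{\{\tau_B \le T\}}\,] = E^*[\,X_{\tau_B}\, I_{\{\tau_B \le T\}}\,].
\]
This shows that
\[
\sup_{P^* \in \mathcal P_{\Phi_T}} E^*[\,H^{\mathrm{dig}}\,] \le \inf_{0 \le K < B} \frac{C_T(K)}{B - K}.
\]
The proof will be completed by Lemmas 7.28 and 7.29 below.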
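The first inequality in this proof holds path by path, so it can be checked without specifying any martingale measure. The sketch below is a minimal illustration under assumed data: barrier $B = 120$, initial price $X_0 = 100$, horizon $T = 5$, and randomly generated integer paths; the helper names are hypothetical. It compares the digital payoff with the payoff of the portfolio consisting of $(B - K)^{-1}$ calls plus the forward position entered at $\tau_B$.

```python
import random
from fractions import Fraction

def digital_payoff(path, B):
    """H^dig = 1 if the path reaches the barrier B, else 0."""
    return 1 if max(path) >= B else 0

def superhedge_payoff(path, B, K):
    """(X_T - K)^+/(B - K) + (X_tau - X_T)/(B - K) on {tau_B <= T}."""
    x_T = path[-1]
    value = Fraction(max(x_T - K, 0), B - K)
    x_hit = next((x for x in path if x >= B), None)  # price at the first hitting time
    if x_hit is not None:
        value += Fraction(x_hit - x_T, B - K)
    return value

# Hypothetical test data (illustration only); paths need not be martingales,
# because the inequality is pathwise.
rng = random.Random(0)
B, X0, T = 120, 100, 5
violations = 0
for _ in range(2000):
    path = [X0] + [rng.randint(0, 200) for _ in range(T)]
    for K in (0, 40, 80, 119):
        if superhedge_payoff(path, B, K) < digital_payoff(path, B):
            violations += 1
print(violations)
```

Taking expectations of both sides under a martingale measure is what turns this pathwise domination into the price bound $C_T(K)/(B - K)$.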

Lemma 7.28. If we let
\[
\lambda := 1 - \inf_{0 \le K < B} \frac{C_T(K)}{B - K} \in (0, 1),
\]
then the infimum on the right-hand side is attained in $K$ if and only if $K$ belongs to the set of $\lambda$-quantiles for $\mu_T$, i.e., if and only if
\[
\mu_T([0, K)) \le \lambda \le \mu_T([0, K]).
\]
In particular, it is attained in
\[
K^* := \inf\{\, K \mid \mu_T([0, K]) \ge \lambda \,\}.
\]
Proof. The convex function $C_T$ has left- and right-hand derivatives
\[
(C_T)'_-(K) = -\mu_T([K, \infty)) \quad \text{and} \quad (C_T)'_+(K) = -\mu_T((K, \infty));
\]
see also Proposition A.4. Thus, the function $g(K) := C_T(K)/(B - K)$ has a minimum in $K$ if and only if its left- and right-hand derivatives satisfy
\[
g'_-(K) \le 0 \quad \text{and} \quad g'_+(K) \ge 0.
\]
By computing $g'_-$ and $g'_+$, one sees that these two conditions are equivalent to the requirement that $K$ is a $\lambda$-quantile for $\mu_T$; see Lemma A.15.
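The quantile characterization of the optimal strike can be made concrete on a small example. The sketch below assumes a hypothetical three-point law $\mu_T$ and a barrier $B = 140$ (all numbers and names are illustrative). Since $C_T$ is piecewise linear, $g(K) = C_T(K)/(B - K)$ is monotone between atoms, so for this discrete law it is enough to search over $K = 0$ and the atoms below $B$.

```python
from fractions import Fraction

# Hypothetical law mu_T with mean X_0 = 100 (illustration only).
atoms = {Fraction(50): Fraction(3, 10),
         Fraction(100): Fraction(4, 10),
         Fraction(150): Fraction(3, 10)}
B = Fraction(140)

def call_price(K):
    """C_T(K) = integral of (x - K)^+ with respect to mu_T."""
    return sum(w * max(x - K, 0) for x, w in atoms.items())

def g(K):
    """Upper bound C_T(K)/(B - K) for the digital option, as a function of the strike."""
    return call_price(K) / (B - K)

def cdf(K, strict=False):
    """mu_T([0, K]) or, if strict, mu_T([0, K))."""
    return sum(w for x, w in atoms.items() if (x < K if strict else x <= K))

# g is monotone between atoms, so the minimum sits at 0 or at an atom below B.
candidates = sorted({Fraction(0)} | {x for x in atoms if x < B})
K_min = min(candidates, key=g)
lam = 1 - g(K_min)

# Lower lambda-quantile: the smallest atom whose distribution function reaches lambda.
K_star = min(x for x in atoms if cdf(x) >= lam)
print(K_min, g(K_min), lam, K_star)
```

In this example the minimizing strike and the lower $\lambda$-quantile coincide, and the two-sided quantile condition of the lemma holds at that strike.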

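Lemma 7.29. There exists a martingale measure $\hat P \in \mathcal P_{\Phi_T}$ such that
\[
\hat P[\,\tau_B \le T\,] = \min_{0 \le K < B} \frac{C_T(K)}{B - K}.
\]
Moreover, $\hat P$ can be taken such that
\[
\tau_B = T - 1 \quad \text{and} \quad X_{\tau_B} = B \quad \hat P\text{-a.s. on } \{\tau_B \le T\}
\]
and
\[
\{X_T > K^*\} \subset \{\tau_B \le T\} \subset \{X_T \ge K^*\} \quad \text{modulo } \hat P\text{-nullsets,}
\]
where $K^*$ is as in Lemma 7.28.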
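Proof. Let $\lambda$ be as in Lemma 7.28, and let
\[
q(t) := q^-_{\mu_T}(t) = \inf\{\, K \mid \mu_T([0, K]) \ge t \,\}
\]
be the lower quantile function for $\mu_T$; see A.3. We take an auxiliary probability space $(\tilde\Omega, \tilde{\mathcal F}, \tilde P)$ supporting a random variable $U$ which is uniformly distributed on $(0, 1)$. By Lemma A.19, $\tilde X_T := q(U)$ has distribution $\mu_T$ under $\tilde P$. Let $\beta$ be such that
\[
X_0 = \beta \lambda + B (1 - \lambda).
\]
Since $B > X_0$ we have $0 \le \beta < X_0$. We define $\tilde X_{T-1}$ by
\[
\tilde X_{T-1} := \beta \, I_{\{U \le \lambda\}} + B \, I_{\{U > \lambda\}},
\]
and we let $\tilde X_t := X_0$ for $0 \le t \le T - 2$.
We now prove that $\tilde X$ is a martingale with respect to its natural filtration $\tilde{\mathcal F}_t := \sigma(\tilde X_0, \dots, \tilde X_t)$. To this end, note first that $\tilde{\mathcal F}_{T-2} = \{\emptyset, \tilde\Omega\}$, and hence
\[
\tilde E[\,\tilde X_{T-1} \mid \tilde{\mathcal F}_{T-2}\,] = \tilde E[\,\tilde X_{T-1}\,] = X_0 = \tilde X_{T-2}.
\]
Furthermore, since $K^* = q(\lambda)$,
\[
\begin{aligned}
\tilde E[\,\tilde X_T;\ \tilde X_{T-1} = B\,] &= \tilde E[\,\tilde X_T;\ U > \lambda\,] \\
&= \tilde E[\,(\tilde X_T - K^*)^+\,] + K^* \tilde P[\,U > \lambda\,] \\
&= C_T(K^*) + K^* (1 - \lambda) \\
&= (1 - \lambda)(B - K^*) + K^* (1 - \lambda) \\
&= B \cdot \tilde P[\,\tilde X_{T-1} = B\,].
\end{aligned}
\]
Hence,
\[
\begin{aligned}
\tilde E[\,\tilde X_T;\ \tilde X_{T-1} = \beta\,] &= \tilde E[\,\tilde X_T\,] - \tilde E[\,\tilde X_T;\ \tilde X_{T-1} = B\,] \\
&= X_0 - B \cdot \tilde P[\,\tilde X_{T-1} = B\,] \\
&= \beta \cdot \tilde P[\,\tilde X_{T-1} = \beta\,].
\end{aligned}
\]
It follows that
\[
\tilde E[\,\tilde X_T \mid \tilde{\mathcal F}_{T-1}\,] = \tilde E[\,\tilde X_T \mid \tilde X_{T-1}\,] = \tilde X_{T-1},
\]
and so $\tilde X$ is indeed a martingale.
As the next step, we note that
\[
\{\tilde X_T \ge B\} \subset \{\tilde X_T > K^*\} \subset \{U > \lambda\} = \{\tilde X_{T-1} = B\} \subset \{U \ge \lambda\} \subset \{\tilde X_T \ge K^*\},
\]
where we have used the fact that $K^* = q(\lambda)$. Hence, if we denote by
\[
\tilde\tau_B := \inf\{\, t \ge 0 \mid \tilde X_t \ge B \,\}
\]
the first time at which $\tilde X$ hits the barrier $B$, then
\[
\{\tilde X_T > K^*\} \subset \{\tilde\tau_B \le T\} = \{\tilde X_{T-1} = B\} \subset \{\tilde X_T \ge K^*\}.
\]
Hence, $\{\tilde\tau_B \le T\} = \{\tilde\tau_B = T - 1\} = \{\tilde X_{T-1} = B\}$,
\[
\tilde P[\,\tilde\tau_B \le T\,] = \tilde P[\,\tilde X_{T-1} = B\,] = 1 - \lambda = \min_{0 \le K < B} \frac{C_T(K)}{B - K},
\]
and the distribution $\hat P$ of $\tilde X$ under $\tilde P$ is as desired.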
Remark 7.30. The inequality
\[
H^{\mathrm{dig}} \le \frac{(X_T - K)^+}{B - K} + \frac{X_{\tau_B} - X_T}{B - K}\, I_{\{\tau_B \le T\}}
\]
appearing in the proof of Theorem 7.27 can be interpreted in terms of a suitable superhedging strategy for the claim $H^{\mathrm{dig}}$ by using call options and forward contracts: At time $t = 0$, we buy $(B - K)^{-1}$ call options with strike $K$, and at the first time when the price process passes the barrier $B$, we sell forward $(B - K)^{-1}$ shares of the asset. This strategy will be optimal if the strike price $K$ is such that it realizes the minimum on the right-hand side of (7.12). By virtue of Lemma 7.28, such an optimal strike price can be identified as the Value at Risk at level $1 - \lambda$ of a short position $-X_T$ in the asset. ♦
Let us now derive bounds on the arbitrage-free prices of barrier call options. More precisely, we will consider an up-and-in call option
\[
H^{\mathrm{call}}_{\mathrm{u\&i}} := \begin{cases} (X_T - K)^+ & \text{if } \max_{0 \le t \le T} X_t \ge B, \\ 0 & \text{otherwise,} \end{cases}
\]
and the corresponding up-and-out call
\[
H^{\mathrm{call}}_{\mathrm{u\&o}} := \begin{cases} (X_T - K)^+ & \text{if } \max_{0 \le t \le T} X_t < B, \\ 0 & \text{otherwise.} \end{cases}
\]
If the barrier $B$ is below the strike price $K$, then the up-and-in call is identical to a plain vanilla call $(X_T - K)^+$, and the payoff of the up-and-out call is zero. Thus, we assume from now on that
\[
K < B.
\]
Recall that $K^*$ denotes the minimizer of the function $c \mapsto C_T(c)/(B - c)$ as constructed in Lemma 7.28.
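Theorem 7.31. For an up-and-in call option,
\[
\max_{P^* \in \mathcal P_{\Phi_T}} E^*[\,H^{\mathrm{call}}_{\mathrm{u\&i}}\,] = \begin{cases} C_T(K) & \text{if } K^* \le K, \\[4pt] \dfrac{B - K}{B - K^*}\, C_T(K^*) & \text{if } K^* > K. \end{cases}
\]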


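Proof. For any $c$ with $K \le c < B$,
\[
H^{\mathrm{call}}_{\mathrm{u\&i}} \le \frac{B - K}{B - c}\, (X_T - c)^+ + \frac{c - K}{B - c}\, (X_{\tau_B} - X_T)\, I_{\{\tau_B \le T\}}.
\]
Indeed, on $\{X_T \le K\}$ or on $\{\tau_B > T\}$ the payoff of $H^{\mathrm{call}}_{\mathrm{u\&i}}$ is zero, and the right-hand side is non-negative. On $\{X_T \ge c,\ \tau_B \le T\}$ and on $\{c > X_T > K,\ \tau_B \le T\}$ we have $\le$. The expectation of the right-hand side under a martingale measure $P^* \in \mathcal P_{\Phi_T}$ is equal to
\[
\frac{B - K}{B - c}\, C_T(c) + \frac{c - K}{B - c}\, E^*[\,(X_{\tau_B} - X_T)\, I_{\{\tau_B \le T\}}\,] = \frac{B - K}{B - c}\, C_T(c),
\]
due to the stopping theorem. The minimum of this upper bound over all $c \in [K, B)$ is attained in $c = K \vee K^*$, which shows $\le$ in the assertion.
Finally, let $\hat P$ be the martingale measure constructed in Lemma 7.29. If $K^* \le K$ then
\[
(X_T - K)^+ \, I_{\{\tau_B \le T\}} = (X_T - K)^+ \quad \hat P\text{-a.s.,}
\]
and so $\hat E[\,H^{\mathrm{call}}_{\mathrm{u\&i}}\,] = C_T(K)$. If $K^* > K$ then $\hat P$-a.s.
\[
\frac{B - K}{B - K^*}\, (X_T - K^*)^+ + \frac{K^* - K}{B - K^*}\, (B - X_T)\, I_{\{\tau_B \le T\}} = H^{\mathrm{call}}_{\mathrm{u\&i}}.
\]
Taking expectations with respect to $\hat P$ concludes the proof.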
Remark 7.32. The inequality
\[
H^{\mathrm{call}}_{\mathrm{u\&i}} \le \frac{B - K}{B - c}\, (X_T - c)^+ + \frac{c - K}{B - c}\, (X_{\tau_B} - X_T)\, I_{\{\tau_B \le T\}}
\]
appearing in the preceding proof can be interpreted as a superhedging strategy for the up-and-in call with liquid derivatives: At time $t = 0$, we purchase $(B - K)/(B - c)$ call options with strike $c$, and at the first time when the stock price passes the barrier $B$, we sell forward $(c - K)/(B - c)$ shares of the asset. This strategy will be optimal for $c = K^* \vee K$. ♦
We now turn to the analysis of the up-and-out call option
$$H^{\mathrm{call}}_{u\&o} = (X_T - K)^+\,\mathbb{1}_{\{\tau_B > T\}}.$$

Theorem 7.33. For an up-and-out call,
$$\max_{P^* \in \mathcal{P}_{\mu_T}} E^*\big[ H^{\mathrm{call}}_{u\&o} \big] = C_T(K) - C_T(B) - (B - K)\,\mu_T([B, \infty)).$$


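Proof. Clearly,
$$H^{\mathrm{call}}_{u\&o} \le (X_T - K)^+\,\mathbb{1}_{\{X_T < B\}} = (X_T - K)^+ - (X_T - B)^+ - (B - K)\,\mathbb{1}_{\{X_T \ge B\}}. \tag{7.13}$$
Taking expectations yields "$\le$" in the assertion.
Now consider the measure $\tilde P$ on $(\Omega, \mathcal{F})$ defined as
$$\tilde P := \delta_{X_0} \otimes \cdots \otimes \delta_{X_0} \otimes \mu_T,$$
i.e., under $\tilde P$ we have $X_t = X_0$ $\tilde P$-a.s. for $t < T$, and the law of $X_T$ is $\mu_T$. Clearly
$\tilde P \in \mathcal{P}_{\mu_T}$, and (7.13) is $\tilde P$-a.s. an identity.
Using the identity
$$(X_T - K)^+ = H^{\mathrm{call}}_{u\&o} + H^{\mathrm{call}}_{u\&i},$$
we get the following lower bounds as an immediate corollary.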


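Corollary 7.34. We have
$$\min_{P^* \in \mathcal{P}_{\mu_T}} E^*\big[ H^{\mathrm{call}}_{u\&i} \big] = C_T(B) + (B - K)\,\mu_T([B, \infty)),$$
and
$$\min_{P^* \in \mathcal{P}_{\mu_T}} E^*\big[ H^{\mathrm{call}}_{u\&o} \big] = \begin{cases} 0 & \text{if } K^* \le K, \\[4pt] C_T(K) - \dfrac{B - K}{B - K^*}\, C_T(K^*) & \text{if } K^* > K. \end{cases}$$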
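The four bounds of Theorems 7.31 and 7.33 and Corollary 7.34 can be evaluated numerically once the terminal law is known. The following is a minimal sketch, assuming that the law of $X_T$ is discrete and is passed as a dictionary from values to probabilities; the function names `call_price` and `barrier_call_bounds` are illustrative and not taken from the text. For a discrete law, $C_T$ is piecewise linear, so the minimization over $c \in [K, B)$ reduces to a finite search.

```python
def call_price(law, c):
    """C_T(c) = E[(X_T - c)^+] for a discrete law {value: probability}."""
    return sum(prob * max(x - c, 0.0) for x, prob in law.items())


def barrier_call_bounds(law, strike, barrier):
    """Bounds of Theorems 7.31, 7.33 and Corollary 7.34, assuming K < B."""
    if not strike < barrier:
        raise ValueError("the bounds assume K < B")
    vanilla = call_price(law, strike)
    # Between two atoms C_T is linear, so c -> C_T(c) / (B - c) is monotone
    # there. The minimum over [K, B) is attained at K or at an atom in (K, B).
    candidates = [strike] + [x for x in law if strike < x < barrier]
    max_in = min((barrier - strike) / (barrier - c) * call_price(law, c)
                 for c in candidates)
    tail = sum(prob for x, prob in law.items() if x >= barrier)  # mass of [B, inf)
    max_out = vanilla - call_price(law, barrier) - (barrier - strike) * tail
    # The lower bounds follow from (X_T - K)^+ = H_u&o + H_u&i.
    return {"max_in": max_in, "min_in": vanilla - max_out,
            "max_out": max_out, "min_out": vanilla - max_in}


# Illustrative law with mean 100, strike K = 90 and barrier B = 110.
example_law = {80.0: 0.25, 100.0: 0.5, 120.0: 0.25}
example_bounds = barrier_call_bounds(example_law, strike=90.0, barrier=110.0)
```

In this example the minimizing strike is the atom at 100, which lies above $K = 90$, so the second case of Theorem 7.31 applies.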

Chapter 8

Efficient hedging

In an incomplete financial market model, a contingent claim typically will not admit
a perfect hedge. Superhedging provides a method for staying on the safe side, but the
required cost is usually too high both from a theoretical and from a practical point of
view. It is thus natural to relax the requirements.
As a first preliminary step, we consider strategies of quantile hedging which stay
on the safe side with high probability. In other words, we maximize the probability
for staying on the safe side under a given cost constraint. The main idea consists
in reducing the construction of such strategies for a given claim $H$ to a problem of
superhedging for a modified claim $\tilde H$, which is the solution to a static optimization
problem of Neyman–Pearson type. Typically, $\tilde H$ will have the form of a knock-out
option, that is, $\tilde H = H \cdot \mathbb{1}_A$. At this stage, we only focus on the probability that a
shortfall occurs; we do not take into account the size of the shortfall if it does occur.
In Sections 8.2 and 8.3 we take a more comprehensive view of the downside risk.
Our discussion of risk measures in Section 4.8 suggests quantifying the downside risk
in terms of an acceptance set for suitably hedged positions. If acceptability is defined
in terms of utility-based shortfall risk as in Section 4.9, we are led to the problem of
constructing efficient strategies which minimize the utility-based shortfall risk under
a given cost constraint. As in the case of quantile hedging, this problem can be decomposed into a static optimization problem and the construction of a superhedging
strategy for a modified payoff profile $\tilde H$. In Section 8.3 we go even one step further
and assess the shortfall risk in terms of a general convex risk measure. For a complete
market model and the case of AV@R, we discuss the structure of the modified payoff profile $\tilde H$ and point out the relation to the problem of robust utility maximization
discussed in Section 3.5.

8.1 Quantile hedging

Let $H$ be a discounted European claim in an arbitrage-free market model such that
$$\pi_{\sup}(H) = \sup_{P^* \in \mathcal{P}} E^*[\,H\,] < \infty.$$
We saw in Corollary 7.15 that there exists a self-financing trading strategy whose
value process $V^{\uparrow}$ satisfies
$$V_T^{\uparrow} \ge H \quad P\text{-a.s.}$$


By using such a superhedging strategy, the seller of $H$ can cover almost any possible
obligation which may arise from the sale of $H$ and thus eliminate completely the
corresponding risk. The smallest amount for which such a superhedging strategy is
available is given by $\pi_{\sup}(H)$. This cost will often be too high from a practical point
of view, as illustrated by Example 7.21. Furthermore, if $H$ is not attainable then
$\pi_{\sup}(H)$, viewed as a price for $H$, is too high from a theoretical point of view since
it would permit arbitrage. Even if $H$ is attainable, a complete elimination of risk by
using a replicating strategy for $H$ would consume the entire proceeds from the sale of
$H$, and any opportunity of making a profit would be lost along with the risk.
Let us therefore suppose that the seller is unwilling to put up the initial amount
of capital required by a superhedge and is ready to accept some risk. What is the
optimal partial hedge which can be achieved with a given smaller amount of capital?
In order to make this question precise, we need a criterion expressing the seller's
attitude towards risk. Several such criteria will be studied in the following sections.
In this section, our aim is to construct a strategy which maximizes the probability of a
successful hedge given a constraint on the initial cost.
More precisely, let us fix an initial amount
$$v < \pi_{\sup}(H).$$
We are looking for a self-financing trading strategy whose value process maximizes
the probability
$$P[\, V_T \ge H \,]$$
among all those strategies whose initial investment $V_0$ is bounded by $v$ and which
respect the bounds $V_t \ge 0$ for $t = 0, \dots, T$. In view of Theorem 5.25, the second
restriction amounts to admissibility in the following sense:
Definition 8.1. A self-financing trading strategy is called an admissible strategy if its
value process satisfies $V_T \ge 0$.
The problem of quantile hedging consists in constructing an admissible strategy $\xi^*$
such that its value process $V^*$ satisfies
$$P[\, V_T^* \ge H \,] = \max P[\, V_T \ge H \,] \tag{8.1}$$
where the maximum is taken over all value processes $V$ of admissible strategies subject to the constraint
$$V_0 \le v. \tag{8.2}$$
Note that this problem would not be well posed if considered without the constraint
of admissibility.
Let us emphasize that the idea of quantile hedging corresponds to a Value at Risk
criterion, and that it invites the same criticism: Only the probability of a shortfall is
taken into account, not the size of the loss if a shortfall occurs. This exclusive focus
on the shortfall probability may be reasonable in cases where a loss is to be avoided
by any means. But for most applications, other optimality criteria as considered in
the next section will usually be more appropriate from an economic point of view. In
view of the mathematical techniques, however, some key ideas already appear quite
clearly in our present context.
Let us first consider the particularly transparent situation of a complete market
model before passing to the general incomplete case. The set
$$\{ V_T \ge H \}$$
will be called the success set associated with the value process $V$ of an admissible
strategy. As a first step, we reduce our problem to the construction of a success set of
maximal probability.
Proposition 8.2. Let $P^*$ denote the unique equivalent martingale measure in a complete market model, and assume that $A^* \in \mathcal{F}_T$ maximizes the probability $P[\,A\,]$
among all sets $A \in \mathcal{F}_T$ satisfying the constraint
$$E^*[\, H \cdot \mathbb{1}_A \,] \le v. \tag{8.3}$$
Then the replicating strategy $\xi^*$ of the knock-out option
$$H^* := H \cdot \mathbb{1}_{A^*}$$
solves the optimization problem defined by (8.1) and (8.2), and $A^*$ coincides up to
$P$-null sets with the success set of $\xi^*$.
Proof. As a first step, let $V$ be the value process of any admissible strategy such that
$V_0 \le v$. We denote by $A := \{V_T \ge H\}$ the corresponding success set. Admissibility
yields that $V_T \ge H \cdot \mathbb{1}_A$. Moreover, the results of Section 5.3 imply that $V$ is a
$P^*$-martingale. Hence, we obtain that
$$E^*[\, H \cdot \mathbb{1}_A \,] \le E^*[\, V_T \,] = V_0 \le v.$$
Therefore, $A$ fulfills the constraint (8.3) and it follows that
$$P[\,A\,] \le P[\,A^*\,].$$
As a second step, we consider the trading strategy $\xi^*$ and its value process $V^*$.
Clearly, $\xi^*$ is admissible, and its success set satisfies
$$\{ V_T^* \ge H \} = \{ H \cdot \mathbb{1}_{A^*} \ge H \} \supseteq A^*.$$
On the other hand, the first part of the proof yields that
$$P[\, V_T^* \ge H \,] \le P[\,A^*\,].$$
It follows that the two sets $A^*$ and $\{V_T^* \ge H\}$ coincide up to $P$-null sets. In particular,
$\xi^*$ is an optimal strategy.


Our next goal is the construction of the optimal success set $A^*$, whose existence was
assumed in Proposition 8.2. This problem is solved by using the Neyman–Pearson
lemma. To this end, we introduce the measure $Q^*$ given by
$$\frac{dQ^*}{dP^*} := \frac{H}{E^*[\,H\,]}. \tag{8.4}$$
The constraint (8.3) can be written as
$$Q^*[\,A\,] \le \alpha := \frac{v}{E^*[\,H\,]}. \tag{8.5}$$
Thus, an optimal success set must maximize the probability $P[\,A\,]$ under the constraint $Q^*[\,A\,] \le \alpha$. We denote by $dP/dQ^*$ the generalized density of $P$ with respect
to $Q^*$ in the sense of the Lebesgue decomposition as constructed in Theorem A.13.
Thus, we may define the level
$$c^* := \inf\Big\{ c \ge 0 \;\Big|\; Q^*\Big[ \frac{dP}{dQ^*} > c \cdot E^*[\,H\,] \Big] \le \alpha \Big\}, \tag{8.6}$$
and the set
$$A^* := \Big\{ \frac{dP}{dQ^*} > c^* \cdot E^*[\,H\,] \Big\} = \Big\{ \frac{dP}{dP^*} > c^* \cdot H \Big\}. \tag{8.7}$$

Proposition 8.3. If the set $A^*$ in (8.7) satisfies
$$Q^*[\,A^*\,] = \alpha,$$
then $A^*$ maximizes the probability $P[\,A\,]$ over all $A \in \mathcal{F}_T$ satisfying the constraint
$$E^*[\, H \cdot \mathbb{1}_A \,] \le v.$$
Proof. The condition $E^*[\, H \cdot \mathbb{1}_A \,] \le v$ is equivalent to $Q^*[\,A\,] \le \alpha = Q^*[\,A^*\,]$.
Thus, the particular form of the set $A^*$ in (8.7) and the Neyman–Pearson lemma in
the form of Proposition A.29 imply that $P[\,A\,] \le P[\,A^*\,]$.
By combining the two Propositions 8.2 and 8.3, we obtain the following result.
Corollary 8.4. Denote by $P^*$ the unique equivalent martingale measure in a complete market model, and assume that the set $A^*$ of (8.7) satisfies
$$Q^*[\,A^*\,] = \alpha.$$
Then the optimal strategy solving (8.1) and (8.2) is given by the replicating strategy
of the knock-out option $H^* = H \cdot \mathbb{1}_{A^*}$.


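Our solution to the optimization problem (8.1) and (8.2) still relies on the assumption that the set $A^*$ of (8.7) satisfies $Q^*[\,A^*\,] = \alpha$. This condition is clearly satisfied
if
$$P\Big[ \frac{dP}{dP^*} = c^* \cdot H \Big] = 0.$$
However, it may not in general be possible to find any set $A$ whose $Q^*$-probability
is exactly $\alpha$. In such a situation, the Neyman–Pearson theory suggests replacing the
indicator function $\mathbb{1}_{A^*}$ of the critical region $A^*$ by a randomized test, i.e., by an $\mathcal{F}_T$-measurable $[0,1]$-valued function $\psi$. Let $\mathcal{R}$ denote the class of all randomized tests,
and consider the following optimization problem:
$$E[\,\psi^*\,] = \max\big\{ E[\,\psi\,] \;\big|\; \psi \in \mathcal{R} \text{ and } E_{Q^*}[\,\psi\,] \le \alpha \big\},$$
where $Q^*$ is the measure defined in (8.4) and $\alpha = v/E^*[\,H\,]$ as in (8.5). The generalized Neyman–Pearson lemma in the form of Theorem A.31 states that the solution
is given by
$$\psi^* = \mathbb{1}_{\{ \frac{dP}{dP^*} > c^* \cdot H \}} + \gamma \cdot \mathbb{1}_{\{ \frac{dP}{dP^*} = c^* \cdot H \}} \tag{8.8}$$
where $c^*$ is defined through (8.6) and $\gamma$ is chosen such that $E_{Q^*}[\,\psi^*\,] = \alpha$, i.e.,
$$\gamma = \frac{\alpha - Q^*\big[ \frac{dP}{dP^*} > c^* \cdot H \big]}{Q^*\big[ \frac{dP}{dP^*} = c^* \cdot H \big]}$$
in case $P\big[ \frac{dP}{dP^*} = c^* \cdot H \big] \ne 0$.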
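On a finite sample space the randomized test (8.8) can be computed by a greedy pass over the level sets of the likelihood ratio. The sketch below is illustrative only: it assumes that $P$ and $P^*$ are given as lists of strictly positive state probabilities and $H$ as a list of non-negative payoffs, and the name `quantile_hedging_test` is a hypothetical helper, not notation from the text.

```python
def quantile_hedging_test(p, p_star, h, v):
    """Randomized test psi* of (8.8) on a finite sample space.

    p, p_star: state probabilities under P and P* (all strictly positive)
    h:         discounted payoff H >= 0 in each state
    v:         available capital, assumed to satisfy 0 <= v < E*[H]
    """
    n = len(p)
    cost = [p_star[i] * h[i] for i in range(n)]   # E*[H; state i]
    # States with H = 0 cost nothing and always belong to {dP/dP* > c* H}.
    psi = [1.0 if cost[i] == 0 else 0.0 for i in range(n)]
    # Likelihood ratio dP / (H dP*) on the remaining states.
    ratio = {i: p[i] / cost[i] for i in range(n) if cost[i] > 0}
    budget = v
    for level in sorted(set(ratio.values()), reverse=True):
        group = [i for i in ratio if ratio[i] == level]
        group_cost = sum(cost[i] for i in group)
        if group_cost <= budget:
            # The whole level set lies in {dP/dP* > c* H}.
            for i in group:
                psi[i] = 1.0
            budget -= group_cost
        else:
            # Boundary set {dP/dP* = c* H}: spend the rest of the budget.
            gamma = budget / group_cost
            for i in group:
                psi[i] = gamma
            break
    return psi
```

The boundary weight computed here equals $\gamma$ in (8.8), because the remaining budget divided by $E^*[\,H\,]$ is $\alpha - Q^*[\, dP/dP^* > c^* \cdot H \,]$.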


Definition 8.5. Let $V$ be the value process of an admissible strategy $\xi$. The success
ratio of $\xi$ is defined as the randomized test
$$\psi_V = \mathbb{1}_{\{V_T \ge H\}} + \frac{V_T}{H}\,\mathbb{1}_{\{V_T < H\}}.$$
Note that the set $\{\psi_V = 1\}$ coincides with the success set $\{V_T \ge H\}$ of $V$. In the
extended version of our original problem, we are now looking for a strategy which
maximizes the expected success ratio $E[\,\psi_V\,]$ under the measure $P$ subject to the cost
constraint $V_0 \le v$.
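As a small illustration of Definition 8.5, the success ratio can be evaluated state by state on a finite sample space. This is a sketch under the assumption that terminal values and payoffs are given as lists of non-negative numbers; `success_ratio` is an illustrative name.

```python
def success_ratio(v_T, h):
    """psi_V = 1 on {V_T >= H} and V_T / H on {V_T < H} (Definition 8.5).

    Assumes V_T >= 0 (admissibility), so H > 0 whenever V_T < H.
    """
    return [1.0 if v >= x else v / x for v, x in zip(v_T, h)]


# Three states: full success, partial success, and a state with H = 0.
example_ratio = success_ratio([3.0, 1.0, 0.0], [2.0, 4.0, 0.0])
```

By construction $H \cdot \psi_V = H \wedge V_T$ in every state, and the success ratio equals one wherever $H = 0$.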
Theorem 8.6. Suppose that $P^*$ is the unique equivalent martingale measure in a
complete market model. Let $\psi^*$ be given by (8.8), and denote by $\xi^*$ a replicating
strategy for the discounted claim $H^* = H \cdot \psi^*$. Then the success ratio $\psi_{V^*}$ of $\xi^*$
maximizes the expected success ratio $E[\,\psi_V\,]$ among all admissible strategies with
initial investment $V_0 \le v$. Moreover, the optimal success ratio $\psi_{V^*}$ is $P$-a.s. equal
to $\psi^*$.


We do not prove this theorem here, as it is a special case of Theorem 8.7 below
and its proof is similar to the one of Corollary 8.4, once the optimal randomized test
$\psi^*$ has been determined by the generalized Neyman–Pearson lemma. Note that the
condition
$$P\Big[ \frac{dP}{dP^*} = c^* \cdot H \Big] = 0$$
implies that $\psi^* = \mathbb{1}_{A^*}$ with $A^*$ as in (8.7), so in this case the strategy $\xi^*$ reduces to
the one described in Corollary 8.4.
Now we turn to the general case of an arbitrage-free but possibly incomplete market
model, i.e., we no longer assume that the set $\mathcal{P}$ of equivalent martingale measures
consists of a single element, but we assume only that
$$\mathcal{P} \ne \emptyset.$$
In this setting, our aim is to find an admissible strategy $\xi^*$ whose success ratio $\psi_{V^*}$
satisfies
$$E[\,\psi_{V^*}\,] = \max E[\,\psi_V\,], \tag{8.9}$$
where the maximum on the right-hand side is taken over all admissible strategies
whose initial investment satisfies the constraint
$$V_0 \le v. \tag{8.10}$$


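Theorem 8.7. There exists a randomized test $\psi^*$ such that
$$\sup_{P^* \in \mathcal{P}} E^*[\, H \cdot \psi^* \,] = v, \tag{8.11}$$
and which maximizes $E[\,\psi\,]$ among all $\psi \in \mathcal{R}$ subject to the constraints
$$E^*[\, H \cdot \psi \,] \le v \quad \text{for all } P^* \in \mathcal{P}. \tag{8.12}$$
Moreover, the superhedging strategy for the modified claim
$$H^* = H \cdot \psi^*$$
with initial investment $\pi_{\sup}(H^*)$ solves the problem (8.9) and (8.10).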
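Proof. Denote by $\mathcal{R}_0$ the set of all $\psi \in \mathcal{R}$ which satisfy the constraints (8.12), and
take a sequence $\psi_n \in \mathcal{R}_0$ such that
$$E[\,\psi_n\,] \longrightarrow \sup_{\psi \in \mathcal{R}_0} E[\,\psi\,] \quad \text{as } n \uparrow \infty.$$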


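Lemma 1.70 yields a sequence of convex combinations $\tilde\psi_n \in \operatorname{conv}\{\psi_n, \psi_{n+1}, \dots\}$
converging $P$-a.s. to a function $\tilde\psi \in \mathcal{R}$. Clearly, $\tilde\psi_n \in \mathcal{R}_0$ for each $n$. Hence, Fatou's
lemma yields that
$$E^*[\, H \tilde\psi \,] \le \liminf_{n \uparrow \infty} E^*[\, H \tilde\psi_n \,] \le v \quad \text{for all } P^* \in \mathcal{P},$$
and it follows that $\tilde\psi \in \mathcal{R}_0$. Moreover,
$$E[\,\tilde\psi\,] = \lim_{n \uparrow \infty} E[\,\tilde\psi_n\,] = \lim_{n \uparrow \infty} E[\,\psi_n\,] = \sup_{\psi \in \mathcal{R}_0} E[\,\psi\,],$$
so $\psi^* := \tilde\psi$ is the desired maximizer.
We must also show that (8.11) holds. To this end, note first that $P[\,\psi^* = 1\,] = 1$ is
impossible due to our assumption $v < \pi_{\sup}(H)$. Hence, if $\sup_{P^* \in \mathcal{P}} E^*[\, H \psi^* \,] < v$,
then we can find some $\varepsilon > 0$ such that $\psi^\varepsilon := \varepsilon + (1 - \varepsilon)\psi^* \in \mathcal{R}_0$, and the expectation
$E[\,\psi^\varepsilon\,]$ must be strictly larger than $E[\,\psi^*\,]$. This, however, contradicts the maximality
of $E[\,\psi^*\,]$.
Now let $\xi$ be any admissible strategy whose value process $V$ satisfies $V_0 \le v$. If
$\psi_V$ denotes the corresponding success ratio, then
$$H \cdot \psi_V = H \wedge V_T \le V_T.$$
The $\mathcal{P}$-martingale property of $V$ yields that for all $P^* \in \mathcal{P}$,
$$E^*[\, H \cdot \psi_V \,] \le E^*[\, V_T \,] = V_0 \le v. \tag{8.13}$$
Therefore, $\psi_V$ is contained in $\mathcal{R}_0$ and it follows that
$$E[\,\psi_V\,] \le E[\,\psi^*\,]. \tag{8.14}$$
Consider the superhedging strategy $\xi^*$ of $H^* = H \cdot \psi^*$ and denote by $V^*$ its value
process. Clearly, $\xi^*$ is an admissible strategy. Moreover,
$$V_0^* = \pi_{\sup}(H^*) = \sup_{P^* \in \mathcal{P}} E^*[\, H \cdot \psi^* \,] = v.$$
Thus, (8.14) yields that $\psi_{V^*}$ satisfies
$$E[\,\psi_{V^*}\,] \le E[\,\psi^*\,]. \tag{8.15}$$
On the other hand, $V_T^*$ dominates $H^*$, so
$$H \cdot \psi_{V^*} = H \wedge V_T^* \ge H \wedge H^* = H \cdot \psi^*.$$
Therefore, $\psi_{V^*}$ dominates $\psi^*$ on the set $\{H > 0\}$. Moreover, any success ratio is
equal to one on $\{H = 0\}$, and we obtain that $\psi_{V^*} \ge \psi^*$ $P$-almost surely. According
to (8.15), this can only happen if the two randomized tests $\psi_{V^*}$ and $\psi^*$ coincide $P$-almost everywhere. This proves that $\xi^*$ solves the hedging problem (8.9) and (8.10).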


8.2 Hedging with minimal shortfall risk

Our starting point in this section is the same as in the previous one: At time $T$, an
investor must pay the discounted random amount $H \ge 0$. A complete elimination of
the corresponding risk would involve the cost
$$\pi_{\sup}(H) = \sup_{P^* \in \mathcal{P}} E^*[\,H\,]$$
of superhedging $H$, but the investor is only willing to put up a smaller amount
$$v \in (0, \pi_{\sup}(H)).$$
This means that the investor is ready to take some risk: Any partial hedging strategy
whose value process $V$ satisfies the capital constraint $V_0 \le v$ will generate a nontrivial shortfall
$$(H - V_T)^+.$$
In the previous section, we constructed trading strategies which minimize the shortfall
probability
$$P[\, V_T < H \,]$$
among the class of trading strategies whose initial investment is bounded by $v$, and
which are admissible in the sense of Definition 8.1, i.e., their terminal value $V_T$ is
non-negative. In this section, we assess the shortfall in terms of a loss function, i.e.,
an increasing function $\ell : \mathbb{R} \to \mathbb{R}$ which is not identically constant. We assume
furthermore that
$$\ell(x) = 0 \text{ for } x \le 0 \quad \text{and} \quad E[\, \ell(H) \,] < \infty.$$
A particular role will be played by convex loss functions, which correspond to risk
aversion in view of the shortfall; compare the discussion in Section 4.9.
Definition 8.8. Given a loss function $\ell$ satisfying the above assumptions, the shortfall
risk of an admissible strategy with value process $V$ is defined as the expectation
$$E[\, \ell(H - V_T) \,] = E[\, \ell( (H - V_T)^+ ) \,]$$
of the shortfall weighted by the loss function $\ell$.
Our aim is to minimize the shortfall risk among all admissible strategies satisfying
the capital constraint $V_0 \le v$. Alternatively, we could minimize the cost under a
given bound on the shortfall risk. In other words, the problem consists in constructing
strategies which are efficient with respect to the trade-off between cost and shortfall
risk. This generalizes our discussion of quantile hedging in the previous Section 8.1,
which corresponds to a minimization of the shortfall risk with respect to the nonconvex loss function
$$\ell(x) = \mathbb{1}_{(0,\infty)}(x).$$
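To make Definition 8.8 concrete, the shortfall risk can be evaluated on a finite sample space for different loss functions. The following sketch assumes state probabilities, payoffs and terminal values given as lists; all function names are illustrative. With the indicator loss it returns the shortfall probability of quantile hedging.

```python
def shortfall_risk(p, h, v_T, loss):
    """E[ l((H - V_T)^+) ] on a finite sample space with probabilities p."""
    return sum(pi * loss(max(x - v, 0.0)) for pi, x, v in zip(p, h, v_T))


def indicator_loss(x):
    """l(x) = 1 for x > 0 and 0 otherwise: yields the shortfall probability."""
    return 1.0 if x > 0 else 0.0


def linear_loss(x):
    """l(x) = x for x >= 0: yields the expected shortfall."""
    return max(x, 0.0)


def power_loss(exponent):
    """l(x) = x^p / p for x >= 0; convex for p > 1, concave for 0 < p < 1."""
    return lambda x: max(x, 0.0) ** exponent / exponent


# Claim H = (1, 2, 4) hedged by a constant terminal value V_T = 1.
probabilities, claim, terminal = [0.5, 0.3, 0.2], [1.0, 2.0, 4.0], [1.0, 1.0, 1.0]
prob_of_shortfall = shortfall_risk(probabilities, claim, terminal, indicator_loss)
expected_shortfall = shortfall_risk(probabilities, claim, terminal, linear_loss)
quadratic_risk = shortfall_risk(probabilities, claim, terminal, power_loss(2.0))
```

The same hedge is ranked differently by the three criteria, since the indicator loss ignores the size of the shortfall while the quadratic loss penalizes the large shortfall in the third state most heavily.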


Remark 8.9. Recall our discussion of risk measures in Chapter 4. From this point of
view, it is natural to quantify the downside risk in terms of an acceptance set $\mathcal{A}$ for
hedged positions. As in Section 4.8, we denote by $\bar{\mathcal{A}}$ the class of all positions $X$ such
that there exists an admissible strategy $\xi$ with value process $V$ such that
$$V_0 = 0 \quad \text{and} \quad X + V_T \ge A \quad P\text{-a.s.}$$
for some $A \in \mathcal{A}$. Thus, the downside risk of the position $-H$ takes the form
$$\rho(-H) = \inf\{ m \in \mathbb{R} \mid m - H \in \bar{\mathcal{A}} \}.$$
Suppose that the acceptance set $\mathcal{A}$ is defined in terms of shortfall risk, i.e.,
$$\mathcal{A} := \{ X \in L^\infty \mid E[\, \ell(X^-) \,] \le x_0 \},$$
where $\ell$ is a convex loss function and $x_0$ is a given threshold. Then $\rho(-H)$ is the
smallest amount $m$ such that there exists an admissible strategy $\xi$ whose value process
$V$ satisfies $V_0 = m$ and
$$E[\, \ell( (H - V_T)^+ ) \,] \le x_0.$$
For a given $m$, we are thus led to the problem of finding a strategy $\xi$ which minimizes
the shortfall risk under the cost constraint $V_0 \le m$. In this way, the problem of
quantifying the downside risk of a contingent claim is reduced to the construction of
efficient hedging strategies as discussed in this section. ◊
As in the preceding section, the construction of the optimal hedging strategy is
carried out in two steps. The first one is to solve the static problem of minimizing
$$E[\, \ell(H - Y) \,]$$
among all $\mathcal{F}_T$-measurable random variables $Y \ge 0$ which satisfy the constraints
$$\sup_{P^* \in \mathcal{P}} E^*[\,Y\,] \le v.$$
If $Y^*$ solves this problem, then so does $\tilde Y := H \wedge Y^*$. Hence, we may assume
that $0 \le Y^* \le H$ or, equivalently, that $Y^* = H \psi^*$ for some randomized test $\psi^*$,
which belongs to the set $\mathcal{R}$ of all $\mathcal{F}_T$-measurable random variables with values in
$[0,1]$. Thus, the static problem can be reformulated as follows: Find a randomized
test $\psi^* \in \mathcal{R}$ which minimizes the shortfall risk
$$E[\, \ell( H(1 - \psi) ) \,] \tag{8.16}$$
among all $\psi \in \mathcal{R}$ subject to the constraints
$$E^*[\, H\psi \,] \le v \quad \text{for all } P^* \in \mathcal{P}. \tag{8.17}$$
The next step is to fit the terminal value $V_T$ of an admissible strategy to the optimal
profile $H \psi^*$. It turns out that this step can be carried out without any further assumptions on our loss function $\ell$. Thus, we assume at this point that the optimal $\psi^*$ of step
one is granted, and we construct the corresponding optimal strategy.


Theorem 8.10. Given a randomized test $\psi^*$ which minimizes (8.16) subject to (8.17),
a superhedging strategy $\xi^*$ for the modified discounted claim
$$H^* := H \psi^*$$
with initial investment $\pi_{\sup}(H^*)$ has minimal shortfall risk among all admissible
strategies $\bar\xi$ which satisfy the capital constraint $\bar\xi_1 \cdot \bar X_0 \le v$.
Proof. The proof extends the last argument in the proof of Theorem 8.7. As a first
step, we take any admissible strategy $\bar\xi$ such that the corresponding value process $V$
satisfies the capital constraint $V_0 \le v$. Denote by
$$\psi_V = \mathbb{1}_{\{V_T \ge H\}} + \frac{V_T}{H}\,\mathbb{1}_{\{V_T < H\}}$$
the corresponding success ratio. It follows as in (8.13) that $\psi_V$ satisfies the constraints
$$E^*[\, H \psi_V \,] \le v \quad \text{for all } P^* \in \mathcal{P}.$$
Thus, the optimality of $\psi^*$ implies the following lower bound on the shortfall risk
of $\bar\xi$:
$$E[\, \ell(H - V_T) \,] = E[\, \ell( H(1 - \psi_V) ) \,] \ge E[\, \ell( H(1 - \psi^*) ) \,].$$
In the second step, we consider the admissible strategy $\xi^*$ and its value process $V^*$.
On the one hand,
$$V_0^* = \pi_{\sup}(H^*) = \sup_{P^* \in \mathcal{P}} E^*[\, H\psi^* \,] \le v,$$
so $\xi^*$ satisfies the capital constraint. Hence, the first part of the proof yields
$$E[\, \ell(H - V_T^*) \,] = E[\, \ell( H(1 - \psi_{V^*}) ) \,] \ge E[\, \ell( H(1 - \psi^*) ) \,]. \tag{8.18}$$
On the other hand, $V_T^* \ge H^* = H \psi^*$, and therefore
$$\psi_{V^*} \ge \psi^* \quad P\text{-a.s.}$$
Hence, the inequality in (8.18) is in fact an equality, and the assertion follows.
Let us now return to the static problem defined by (8.16) and (8.17). We start by
considering the special case of risk aversion in view of the shortfall.
Proposition 8.11. If the loss function $\ell$ is convex, then there exists a randomized test
$\psi^* \in \mathcal{R}$ which minimizes the shortfall risk
$$E[\, \ell( H(1 - \psi) ) \,]$$
among all $\psi \in \mathcal{R}$ subject to the constraints
$$E^*[\, H\psi \,] \le v \quad \text{for all } P^* \in \mathcal{P}. \tag{8.19}$$
If $\ell$ is strictly convex on $[0, \infty)$, then $\psi^*$ is uniquely determined on $\{H > 0\}$.


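Proof. The proof is similar to the one of Proposition 3.36. Let $\mathcal{R}_0$ denote the set of
all randomized tests which satisfy the constraints (8.19). Take $\psi_n \in \mathcal{R}_0$ such that
$E[\, \ell( H(1 - \psi_n) ) \,]$ converges to the infimum of the shortfall risk, and use Lemma 1.70
to select convex combinations $\tilde\psi_n \in \operatorname{conv}\{\psi_n, \psi_{n+1}, \dots\}$ which converge $P$-a.s. to
some $\tilde\psi \in \mathcal{R}$. Since $\ell$ is continuous and increasing, Fatou's lemma implies that
$$E[\, \ell( H(1 - \tilde\psi) ) \,] \le \liminf_{n \uparrow \infty} E[\, \ell( H(1 - \tilde\psi_n) ) \,] = \inf_{\psi \in \mathcal{R}_0} E[\, \ell( H(1 - \psi) ) \,],$$
where we have used the convexity of $\ell$ to conclude that $E[\, \ell( H(1 - \tilde\psi_n) ) \,]$ tends to
the same limit as $E[\, \ell( H(1 - \psi_n) ) \,]$.
Fatou's lemma also yields that for all $P^* \in \mathcal{P}$
$$E^*[\, H \tilde\psi \,] \le \liminf_{n \uparrow \infty} E^*[\, H \tilde\psi_n \,] \le v.$$
Hence $\tilde\psi \in \mathcal{R}_0$, and we conclude that $\psi^* := \tilde\psi$ is the desired minimizer. The
uniqueness part is obvious.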

Remark 8.12. The proof shows that the analogous existence result holds if we use a
robust version of the shortfall risk defined as
$$\sup_{Q \in \mathcal{Q}} E_Q[\, \ell( H(1 - \psi) ) \,],$$
where $\mathcal{Q}$ is a class of equivalent probability measures; see also Remark 3.37 and
Sections 8.2 and 8.3. ◊
Combining Proposition 8.11 and Theorem 8.10 yields existence and uniqueness
of an optimal hedging strategy under risk aversion in a general arbitrage-free market
model.
Corollary 8.13. Assume that the loss function $\ell$ is strictly convex on $[0, \infty)$. Then
there exists an admissible strategy which is optimal in the sense that it minimizes
the shortfall risk among all admissible strategies $\bar\xi$ subject to the capital constraint
$\bar\xi_1 \cdot \bar X_0 \le v$. Moreover, any optimal strategy requires the exact initial investment $v$,
and its success ratio is $P$-a.s. equal to
$$\psi^*\,\mathbb{1}_{\{H > 0\}} + \mathbb{1}_{\{H = 0\}},$$
where $\psi^*$ denotes the solution of the static problem constructed in Proposition 8.11.

Proof. The existence of an optimal strategy follows by combining Proposition 8.11
and Theorem 8.10. Strict convexity of $\ell$ implies that $\psi^*$ is $P$-a.s. unique on $\{H > 0\}$.
Since $\ell$ is strictly increasing on $[0, \infty)$, $\psi^*$ and the success ratio $\psi_{V^*}$ of any optimal
strategy $\xi^*$ must coincide $P$-a.s. on $\{H > 0\}$. On $\{H = 0\}$, the success ratio $\psi_{V^*}$ is
equal to 1 by definition.
Since $\ell$ is strictly increasing on $[0, \infty)$, we must have that
$$\sup_{P^* \in \mathcal{P}} E^*[\, H \psi^* \,] = v,$$
for otherwise we could find some $\varepsilon > 0$ such that $\psi^\varepsilon := \varepsilon + (1 - \varepsilon)\psi^*$ would
also satisfy the constraints (8.17). Since we have assumed that $v < \pi_{\sup}(H)$, the
constraints (8.17) imply that $\psi^* \not\equiv 1$ and hence that
$$E[\, \ell( H(1 - \psi^\varepsilon) ) \,] < E[\, \ell( H(1 - \psi^*) ) \,].$$
This, however, contradicts the optimality of $\psi^*$.
Since the value process $V^*$ of an optimal strategy is a $\mathcal{P}$-martingale, and since
$$V_T^* \ge H \psi_{V^*} = H \psi^*,$$
we conclude from the above that
$$v \ge V_0^* = \sup_{P^* \in \mathcal{P}} E^*[\, V_T^* \,] \ge \sup_{P^* \in \mathcal{P}} E^*[\, H \psi^* \,] = v.$$
Thus, $V_0^*$ is equal to $v$.


Beyond the general existence statement of Proposition 8.11, it is possible to obtain
an explicit formula for the optimal solution of the static problem if the market model
is complete. Recall that we assume that the loss function $\ell(x)$ vanishes for $x \le 0$. In
addition, we will also assume that
$\ell$ is strictly convex and continuously differentiable on $(0, \infty)$.
Then the derivative $\ell'$ of $\ell$ is strictly increasing on $(0, \infty)$. Let $J$ denote the inverse function of $\ell'$ defined on the range of $\ell'$, i.e., on the interval $(a, b)$ where
$a := \lim_{x \downarrow 0} \ell'(x)$ and $b := \lim_{x \uparrow \infty} \ell'(x)$. We extend $J$ to a function $J^+ : [0, \infty] \to
[0, \infty]$ by setting
$$J^+(y) := \begin{cases} +\infty & \text{for } y \ge b, \\ 0 & \text{for } y \le a. \end{cases}$$
From now on, we assume also that
$$\mathcal{P} = \{P^*\},$$
i.e., $P^*$ is the unique equivalent martingale measure in a complete market model. Its
density will be denoted by
$$\varphi^* := \frac{dP^*}{dP}.$$


Theorem 8.14. Under the above assumptions, the solution of the static optimization
problem of Proposition 8.11 is given by
$$\psi^* = 1 - \Big( \frac{J^+(c\,\varphi^*)}{H} \wedge 1 \Big) \quad P\text{-a.s. on } \{H > 0\},$$
where the constant $c$ is determined by the condition $E^*[\, H \psi^* \,] = v$.
Proof. The problem is of the same type as those considered in Section 3.3. It can in
fact be reduced to Corollary 3.43 by considering the random utility function
$$u(x, \omega) := -\ell(H(\omega) - x), \quad 0 \le x \le H(\omega).$$
Just note that the shortfall risk $E[\, \ell(H - Y) \,]$ coincides with the negative expected
utility $-E[\, u(Y, \cdot) \,]$ for any profile $Y$ such that $0 \le Y \le H$. Moreover, since our
market model is complete, it has a finite structure by Theorem 5.37, and so all integrability conditions are automatically satisfied. Thus, Corollary 3.43 states that the
optimal profile $H^* := Y^*$ which maximizes the expected utility $E[\, u(Y, \cdot) \,]$ under
the constraints $0 \le Y \le H$ and $E^*[\,Y\,] \le v$ is given by
$$H^*(\omega) = I^+(c\,\varphi^*(\omega), \omega) \wedge H(\omega) = \big( H(\omega) - J^+(c\,\varphi^*(\omega)) \big)^+.$$
Dividing by $H$ yields the formula for the optimal randomized test $\psi^*$.

Corollary 8.15. In the situation of Theorem 8.14, suppose that the objective probability measure $P$ is equal to the martingale measure $P^*$. Then the modified discounted
claim takes the simple form
$$H^* = H \psi^* = ( H - J^+(c) )^+.$$
Example 8.16. Consider the discounted payoff $H$ of a European call option $(S_T^i - K)^+$
with strike $K$ under the assumption that the numéraire $S^0$ is a riskless bond, i.e.,
that $S_t^0 = (1 + r)^t$ for a certain constant $r \ge 0$. If the assumptions of Corollary 8.15
hold, then the modified profile $H^*$ is the discounted value of the European call option
struck at $\tilde K := K + J^+(c) \cdot (1 + r)^T$, i.e.,
$$H^* = \frac{(S_T^i - \tilde K)^+}{(1 + r)^T}.$$

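Example 8.17. Consider an exponential loss function $\ell(x) = (e^{\alpha x} - 1)^+$ for some
$\alpha > 0$. In this case,
$$J^+(y) = \Big( \frac{1}{\alpha} \log \frac{y}{\alpha} \Big)^+, \quad y \ge 0,$$
and the optimal profile is given by
$$H^* = H - \Big( \frac{1}{\alpha} \log \frac{c\,\varphi^*}{\alpha} \Big)^+ \wedge H. \qquad ◊$$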
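The constant $c$ in Theorem 8.14 is only defined implicitly through the budget condition $E^*[\, H\psi^* \,] = v$, but on a finite sample space it can be found by bisection, since the cost of the profile $(H - J^+(c\,\varphi^*))^+$ is non-increasing in $c$. The sketch below makes several assumptions that are not part of the text: the measures are lists of strictly positive state probabilities, the caller supplies an upper bracket `c_hi` at which the profile costs at most $v$, and the function names are illustrative. The exponential loss of Example 8.17 serves as the test case.

```python
import math


def exponential_j_plus(alpha):
    """J^+ for the loss l(x) = (exp(alpha * x) - 1)^+ of Example 8.17."""
    def j_plus(y):
        return math.log(y / alpha) / alpha if y > alpha else 0.0
    return j_plus


def efficient_hedge_profile(p, p_star, h, v, j_plus, c_hi, iterations=200):
    """Find c with E*[(H - J^+(c phi*))^+] = v by bisection on [0, c_hi].

    Returns the constant c and the modified claim H* state by state.
    Assumes 0 < v < E*[H] and that the profile at c_hi costs at most v.
    """
    phi = [q / w for q, w in zip(p_star, p)]          # density dP*/dP

    def profile(c):
        return [max(x - j_plus(c * f), 0.0) for x, f in zip(h, phi)]

    def cost(c):
        return sum(q * y for q, y in zip(p_star, profile(c)))

    lo, hi = 0.0, c_hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cost(mid) > v:
            lo = mid          # profile still too expensive: increase c
        else:
            hi = mid
    return hi, profile(hi)


# Three-state illustration with alpha = 1 and capital v = 1 < E*[H] = 2.8.
c_exp, modified_claim = efficient_hedge_profile(
    [0.5, 0.3, 0.2], [0.2, 0.3, 0.5], [1.0, 2.0, 4.0], 1.0,
    exponential_j_plus(1.0), c_hi=1000.0)
```

Dividing `modified_claim` by $H$ state by state gives the randomized test $\psi^*$ of Theorem 8.14. Other loss functions can be handled by passing a different `j_plus`.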


Example 8.18. If $\ell$ is the particular loss function
$$\ell(x) = \frac{x^p}{p}, \quad x \ge 0,$$
for some $p > 1$, then the problem is to minimize a lower partial moment of the
difference $V_T - H$. Theorem 8.14 implies that it is optimal to hedge the modified
claim
$$H_p^* = H - (c_p \cdot \varphi^*)^{1/(p-1)} \wedge H \tag{8.20}$$
where the constant $c_p$ is determined by $E^*[\, H_p^* \,] = v$. ◊
Let us now consider the limit $p \uparrow \infty$ in (8.20), corresponding to ever increasing
risk aversion with respect to large losses.
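Proposition 8.19. Let us consider the loss functions
$$\ell_p(x) = \frac{x^p}{p}, \quad x \ge 0,$$
for $p > 1$. As $p \uparrow \infty$, the modified claims $H_p^*$ of (8.20) converge $P$-a.s. and in
$L^1(P^*)$ to the discounted claim
$$(H - c_\infty)^+$$
where the constant $c_\infty$ is determined by
$$E^*[\, (H - c_\infty)^+ \,] = v. \tag{8.21}$$
Proof. Let $\gamma(p)$ be shorthand for $1/(p - 1)$ and note that
$$(\varphi^*)^{\gamma(p)} \longrightarrow 1 \quad P\text{-a.s. as } p \uparrow \infty.$$
Hence, if $(p_n)$ is a sequence for which $c_{p_n}^{\gamma(p_n)}$ converges to some $\tilde c \in [0, \infty]$, then
$$\lim_{n \uparrow \infty} H_{p_n}^* = H - \tilde c \wedge H = (H - \tilde c)^+.$$
Hence,
$$E^*[\, H_{p_n}^* \,] \longrightarrow E^*[\, (H - \tilde c)^+ \,].$$
Since each term on the left-hand side equals $v$, we must have
$$E^*[\, (H - \tilde c)^+ \,] = v,$$
which determines $\tilde c$ uniquely as the constant $c_\infty$ of (8.21).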
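The convergence in Proposition 8.19 can be observed numerically on a finite sample space. The sketch below is an illustration under stated assumptions: the state probabilities under $P^*$, the density $\varphi^*$ and the payoffs are passed as lists, and the power-loss profile (8.20) is parametrized by $z = c_p^{1/(p-1)}$ rather than by $c_p$ itself, which avoids very large constants for large $p$. The function names are hypothetical.

```python
def _bisect_budget(cost, hi, v, iterations=200):
    """Find the argument in [0, hi] at which the non-increasing cost hits v."""
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cost(mid) > v:
            lo = mid
        else:
            hi = mid
    return hi


def power_loss_profile(p_star, phi, h, v, p):
    """Modified claim (8.20) for l(x) = x^p / p, with z = c_p ** (1/(p-1))."""
    beta = 1.0 / (p - 1.0)

    def profile(z):
        return [max(x - z * f ** beta, 0.0) for x, f in zip(h, phi)]

    def cost(z):
        return sum(q * y for q, y in zip(p_star, profile(z)))

    z_hi = max(h) / min(f ** beta for f in phi)   # profile vanishes at z_hi
    return profile(_bisect_budget(cost, z_hi, v))


def limit_profile(p_star, h, v):
    """Limiting claim (H - c_inf)^+ with E*[(H - c_inf)^+] = v as in (8.21)."""
    def profile(c):
        return [max(x - c, 0.0) for x in h]

    def cost(c):
        return sum(q * y for q, y in zip(p_star, profile(c)))

    return profile(_bisect_budget(cost, max(h), v))


# Three-state illustration with capital v = 1.4.
q_star, density, payoff = [0.2, 0.3, 0.5], [0.4, 1.0, 2.5], [1.0, 2.0, 4.0]
limiting = limit_profile(q_star, payoff, 1.4)
for exponent in (2.0, 10.0, 400.0):
    claim_p = power_loss_profile(q_star, density, payoff, 1.4, exponent)
    gap = max(abs(a - b) for a, b in zip(claim_p, limiting))
    print(exponent, gap)
```

In this example the limiting constant is $c_\infty = 1.5$, and the printed gap to the limiting profile shrinks as the exponent grows.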


Example 8.20. If the discounted claim $H$ in Proposition 8.19 is the discounted payoff
of a call option with strike $K$, and the numéraire is a riskless bond as in Example 8.16,
then the limiting profile $\lim_{p \uparrow \infty} H_p^*$ is equal to the discounted call with the higher
strike price $K + c_\infty \cdot S_T^0$. ◊
In the remainder of this section, we consider loss functions which are not convex
but which correspond to risk neutrality and to risk-seeking preferences. Let us first
consider the risk-neutral case.
Example 8.21. In the case of risk neutrality, the loss function is given by
$$\ell(x) = x \quad \text{for } x \ge 0.$$
Thus, the task is to minimize the expected shortfall
$$E[\, (H - V_T)^+ \,]$$
under the capital constraint $V_0 \le v$. Let $P^*$ be the unique equivalent martingale measure in a complete market model. Then the static problem corresponding to Proposition 8.11 is to maximize the expectation
$$E[\, H\psi \,]$$
under the constraint that $\psi \in \mathcal{R}$ satisfies
$$E^*[\, H\psi \,] \le v.$$
We can define two equivalent measures $Q$ and $Q^*$ by
$$\frac{dQ}{dP} = \frac{H}{E[\,H\,]} \quad \text{and} \quad \frac{dQ^*}{dP^*} = \frac{H}{E^*[\,H\,]}.$$
The problem then becomes the hypothesis testing problem of maximizing $E_Q[\,\psi\,]$
under the side condition
$$E_{Q^*}[\,\psi\,] \le \alpha := \frac{v}{E^*[\,H\,]}.$$
Since the density $dQ/dQ^*$ is proportional to the inverse of the density $\varphi^* = dP^*/dP$,
Theorem A.31 implies that the optimal test takes the form
$$\psi_1^* = \mathbb{1}_{\{\varphi^* < c_1\}} + \gamma \cdot \mathbb{1}_{\{\varphi^* = c_1\}} \quad P\text{-a.s. on } \{H > 0\},$$
where the constant $c_1$ is given by
$$c_1 = \sup\{ c \in \mathbb{R} \mid E^*[\, H\,\mathbb{1}_{\{\varphi^* < c\}} \,] \le v \},$$
and where the constant $\gamma \in [0, 1]$ is chosen such that $E^*[\, H \psi_1^* \,] = v$. On $\{H = 0\}$,
$\psi_1^*$ can be arbitrary. ◊


Assume now that the shortfall risk is assessed by an investor who, instead of being
risk-averse, is in fact inclined to take risk. In our context, this corresponds to a loss
function which is concave on $[0, \infty)$ rather than convex. It is not difficult to generalize
Theorem 8.14 so that it covers this situation. Here we limit ourselves to the following
explicit case study.
Example 8.22. Consider the loss function
$$\ell(x) = \frac{x^q}{q}, \quad x \ge 0,$$
for some $q \in (0, 1)$. In order to solve our static optimization problem, one could apply
the results and techniques of Section 3.3. Here we will use an approach based on the
Neyman–Pearson lemma. Note first that for $\psi \in \mathcal{R}$
$$\ell(H(1 - \psi)) = (1 - \psi)^q \cdot \ell(H) \ge \ell(H) - \psi \cdot \ell(H).$$
Hence, we get a lower bound on the shortfall risk of $\psi$:
$$E[\, \ell(H(1 - \psi)) \,] \ge E[\, \ell(H) \,] - E[\, \psi \cdot \ell(H) \,]. \tag{8.22}$$
The problem of finding a minimizer of the right-hand side is equivalent to maximizing the expectation $E_Q[\,\psi\,]$ under the constraint that $E_{Q^*}[\,\psi\,] \le v/E^*[\,H\,]$ for the
measures $Q$ and $Q^*$ defined via
$$\frac{dQ}{dP} = \frac{H^q}{E[\,H^q\,]} \quad \text{and} \quad \frac{dQ^*}{dP^*} = \frac{H}{E^*[\,H\,]},$$
if we assume again $P[\, H > 0 \,] = 1$. As in Example 8.21, we then conclude that the
optimal test must be of the form
$$\mathbb{1}_{\{1 > c_q\,\varphi^* H^{1-q}\}} + \gamma \cdot \mathbb{1}_{\{1 = c_q\,\varphi^* H^{1-q}\}} \tag{8.23}$$
for certain constants $c_q$ and $\gamma$. Under the simplifying assumption that
$$P[\, 1 = c_q\,\varphi^* H^{1-q} \,] = 0, \tag{8.24}$$
the formula (8.23) reduces to
$$\psi_q^* = \begin{cases} 1 & \text{on } \{1 > c_q\,\varphi^* H^{1-q}\}, \\ 0 & \text{otherwise.} \end{cases} \tag{8.25}$$
By taking $\psi = \psi_q^*$ we obtain an identity in (8.22), and so $\psi_q^*$ must be a minimizer
for $E[\, \ell(H(1 - \psi)) \,]$ under the constraint that $E^*[\, H\psi \,] \le v$. ◊


In our last result of this section, we recover the knock-out option
$$H \cdot \mathbb{1}_{\{1 > c_0 H \varphi^*\}},$$
which was obtained as the solution to the problem of quantile hedging, by taking the
limit $q \downarrow 0$ in (8.25). Intuitively, decreasing $q$ corresponds to an increasing appetite
for risk in view of the shortfall.
Proposition 8.23. Let us assume for simplicity that (8.24) holds for all $q \in (0, 1)$,
that $P[\, H > 0 \,] = 1$, and that there exists a unique constant $c_0$ such that
$$E^*[\, H \cdot \mathbb{1}_{\{1 > c_0 H \varphi^*\}} \,] = v. \tag{8.26}$$
Then the solutions $\psi_q^*$ of (8.25) converge $P$-a.s. to the solution
$$\psi_0^* = \mathbb{1}_{\{1 > c_0 H \varphi^*\}}$$
of the corresponding problem of quantile hedging as constructed in Proposition 8.3.
Proof. Take any sequence $q_n \downarrow 0$ such that $(c_{q_n})^{1/(1 - q_n)}$ converges to some $\tilde c \in
[0, \infty]$. Then
$$\lim_{n \uparrow \infty} \psi_{q_n}^* = \mathbb{1}_{\{1 > \tilde c H \varphi^*\}}.$$
Hence,
$$E^*[\, H \psi_{q_n}^* \,] \longrightarrow E^*[\, H \cdot \mathbb{1}_{\{1 > \tilde c H \varphi^*\}} \,].$$
Since we assumed (8.24) for all $q \in (0, 1)$, the left-hand terms are all equal to $v$, and
it follows from (8.26) that $\tilde c = c_0$. This establishes the desired convergence.
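The dependence of the optimal test on the exponent $q$ can be explored on a finite sample space by ordering the states according to $\varphi^* H^{1-q}$, which is the statistic appearing in (8.23). The following sketch is illustrative and rests on assumptions that are not in the text: all states have $H > 0$ and positive probability, and the function names are hypothetical. Note that the greedy test solves the linear testing problem behind (8.22); it is a minimizer of the shortfall risk itself only when the boundary weight is 0 or 1, which is the situation of assumption (8.24). The capital in the example is chosen so that this is the case.

```python
def risk_seeking_test(p, p_star, h, v, q):
    """Test of the form (8.23) for l(x) = x^q / q, assuming H > 0 everywhere.

    q = 1 gives the ordering of the risk-neutral case (Example 8.21) and
    q = 0 the ordering of quantile hedging.
    """
    n = len(p)
    score = [(p_star[i] / p[i]) * h[i] ** (1.0 - q) for i in range(n)]
    cost = [p_star[i] * h[i] for i in range(n)]     # E*[H; state i]
    psi = [0.0] * n
    budget = v
    for level in sorted(set(score)):                # small scores first
        group = [i for i in range(n) if score[i] == level]
        group_cost = sum(cost[i] for i in group)
        if group_cost <= budget:
            for i in group:
                psi[i] = 1.0
            budget -= group_cost
        else:
            gamma = budget / group_cost             # boundary weight in (8.23)
            for i in group:
                psi[i] = gamma
            break
    return psi


def power_shortfall_risk(p, h, psi, q):
    """E[ l(H (1 - psi)) ] for l(x) = x^q / q."""
    return sum(pi * (x * (1.0 - s)) ** q / q for pi, x, s in zip(p, h, psi))


# Three states, capital v = 0.8; the optimal hedged set changes with q.
prob, mart, claim = [0.25, 0.25, 0.5], [0.2, 0.3, 0.5], [4.0, 1.0, 1.0]
test_neutral = risk_seeking_test(prob, mart, claim, 0.8, q=1.0)
test_seeking = risk_seeking_test(prob, mart, claim, 0.8, q=0.5)
```

In this example the risk-neutral ordering hedges the large claim in the first state, while for $q = 0.5$, and likewise in the limit $q = 0$, the capital is spent on the two cheaper states, in line with the convergence to the quantile hedging solution in Proposition 8.23.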

8.3 Efficient hedging with convex risk measures

As in the previous sections of this chapter, we consider the shortfall
$$(H - V_T)^+$$
arising from hedging the discounted claim $H$ with a self-financing trading strategy
with initial capital
$$V_0 = v \in (0, \pi_{\sup}(H)).$$
In this section, our aim is to minimize the shortfall risk
$$\rho( -(H - V_T)^+ ),$$
where $\rho$ is a given convex risk measure as discussed in Chapter 4. Here we assume that
$\rho$ is defined on a suitable function space, such as $L^p(\Omega, \mathcal{F}, P)$, so that the shortfall
risk is well-defined and finite; cf. Remark 4.44. In particular, we assume that $\rho(Y) =
\rho(\tilde Y)$ whenever $Y = \tilde Y$ $P$-a.s.
As in the preceding two sections, the construction of the optimal hedging strategy can be carried out in two steps. The first step is to solve the static problem of
minimizing
$$\rho( -(H - Y)^+ )$$
over all $\mathcal{F}_T$-measurable random variables $Y \ge 0$ that satisfy the constraint
$$\sup_{P^* \in \mathcal{P}} E^*[\,Y\,] \le v.$$
If $Y^*$ solves this problem, then so does $H \wedge Y^*$. Hence, we may assume that $0 \le
Y^* \le H$, and we can reformulate the problem as
$$\text{minimize } \rho(Y - H) \text{ subject to } 0 \le Y \le H \text{ and } \sup_{P^* \in \mathcal{P}} E^*[\,Y\,] \le v. \tag{8.27}$$
The next step is to fit the terminal value $V_T$ of an admissible strategy to the optimal
profile $Y^*$. It turns out that this step can be carried out without any further assumptions on our risk measure $\rho$. Thus, we assume at this point that the optimal $Y^*$ of step
one is granted, and we construct the corresponding optimal strategy.
Proposition 8.24. A superhedging strategy for a solution $Y^*$ of (8.27) with initial investment $\pi_{\sup}(Y^*)$ has minimal shortfall risk among all admissible strategies whose value process satisfies the capital constraint $V_0 \le v$.
Proof. Let $V$ be the value process of any admissible strategy such that $V_0 \le v$. Due to Doob's systems theorem in the form of Theorem 5.14, $V$ is a martingale under any $P^* \in \mathcal P$, and so
\[ \sup_{P^* \in \mathcal P} E^*[\,V_T\,] = V_0 \le v. \]
Thus, $Y := H \wedge V_T$ satisfies the constraints in (8.27), and we get
\[ \rho(-(H - V_T)^+) = \rho(Y - H) \ge \rho(Y^* - H). \]
Next let $V^*$ be the value process of a superhedging strategy for $Y^*$ with initial investment
\[ V_0^* = \pi_{\sup}(Y^*) = \sup_{P^* \in \mathcal P} E^*[\,Y^*\,]. \]
Then we have $V_0^* \le v$ and $V_T^* \ge 0$. Moreover, $V_T^* \ge Y^*$ $P$-a.s., and thus
\[ \rho(Y^* - H) = \rho(-(H - Y^*)^+) \ge \rho(-(H - V_T^*)^+). \]
This concludes the proof.


Let us now return to the static problem defined by (8.27).


Proposition 8.25. If $\rho$ is lower semicontinuous with respect to $P$-a.s. convergence of random variables in the class $\{-Y \mid 0 \le Y \le H\}$ and $\rho(-Y) < \infty$ for one such $Y$, then there exists a solution of the static optimization problem (8.27). In particular, there exists a solution if $H$ is bounded and $\rho$ is continuous from above.
Proof. Take a sequence $Y_n$ with $0 \le Y_n \le H$ and $\sup_{P^* \in \mathcal P} E^*[\,Y_n\,] \le v$ such that $\rho(Y_n - H)$ converges to the infimum $A$ of the shortfall risk. We can use Lemma 1.70 to select convex combinations $Z_n \in \operatorname{conv}\{Y_n, Y_{n+1}, \dots\}$ which converge $P$-a.s. to some random variable $Z$. Then $0 \le Z \le H$ and Fatou's lemma yields that
\[ E^*[\,Z\,] \le \liminf_{n\uparrow\infty} E^*[\,Z_n\,] \le v \]
for all $P^* \in \mathcal P$. Lower semicontinuity of $\rho$ implies that
\[ \rho(Z - H) \le \liminf_{n\uparrow\infty} \rho(Z_n - H). \]
Moreover, the right-hand side is equal to $A$, due to the convexity of $\rho$. Hence, $Z$ is the desired minimizer.
Combining Proposition 8.25 and Proposition 8.24 yields the existence of a risk-minimizing hedging strategy in a general arbitrage-free market model. So far, all arguments were practically the same as in the preceding two sections.
Beyond the general existence statement of Proposition 8.25, it is sometimes possible to obtain an explicit formula for the optimal solution of the static problem if the market model is complete and so $\mathcal P = \{P^*\}$. In this case, the static optimization problem (8.27) simplifies to
\[ \text{minimize } \rho(Y - H) \text{ subject to } 0 \le Y \le H \text{ and } E^*[\,Y\,] \le v. \]
By substituting $Z$ for $H - Y$, this is equivalent to the problem
\[ \text{minimize } \rho(-Z) \text{ subject to } 0 \le Z \le H \text{ and } E^*[\,Z\,] \ge \tilde v, \tag{8.28} \]
where $\tilde v := E^*[\,H\,] - v$. We will now discuss this problem in the case $\rho = \mathrm{AV@R}_\lambda$.
Our approach relies on the general idea that a minimax problem can be transformed into a standard minimization problem by using a duality result for the expression involving the maximum. In the case of $\mathrm{AV@R}_\lambda$, we can use the following representation of $\mathrm{AV@R}_\lambda$ from Lemma 4.51,
\[ \mathrm{AV@R}_\lambda(-Z) = \frac{1}{\lambda} \min_{r \in \mathbb R} \big( E[\,(Z - r)^+\,] + \lambda r \big) = \frac{1}{\lambda} \min_{r \ge 0} \big( E[\,(Z - r)^+\,] + \lambda r \big) \tag{8.29} \]
for $Z \ge 0$. Our discussion of problem (8.28) will be valid also beyond the setting of a complete discrete-time market model, whose underlying probability space has necessarily a discrete structure by Theorem 5.37. In fact it applies whenever $P$ and $P^*$ are two equivalent probability measures on a given measurable space $(\Omega, \mathcal F)$. This is important in view of the application of the next theorem in Example 3.50.
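The representation (8.29) can be checked numerically on a small example. The following is only a sketch under stated assumptions: a finite probability space with equally likely outcomes, a level $\lambda$ for which $\lambda n$ is an integer, and hypothetical function names and sample values that do not come from the text. It compares the minimization in (8.29) with the average of the largest $\lambda$-fraction of the values of $Z$.

```python
def avar_by_minimization(z, lam):
    """(1/lam) * min over r >= 0 of (E[(Z - r)^+] + lam * r), outcomes z >= 0 equally likely."""
    n = len(z)

    def objective(r):
        expected_excess = sum(max(x - r, 0.0) for x in z) / n
        return (expected_excess + lam * r) / lam

    # The objective is convex and piecewise linear in r with kinks at the outcomes,
    # so it suffices to compare r = 0 and r equal to each outcome.
    return min(objective(r) for r in [0.0, *z])


def avar_by_tail_average(z, lam):
    """Average of the largest lam-fraction of outcomes (assumes lam * len(z) is an integer)."""
    k = round(lam * len(z))
    worst = sorted(z, reverse=True)[:k]
    return sum(worst) / k


losses = [0.0, 1.0, 2.0, 5.0]  # hypothetical values of Z, each with probability 1/4
lam = 0.5
via_min = avar_by_minimization(losses, lam)
via_tail = avar_by_tail_average(losses, lam)
```

In this example both functions return the same number, and the minimum in the first function is attained for every $r$ between the two middle outcomes, in line with the fact that the minimizers are quantiles of $Z$.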
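Theorem 8.26. Suppose that $H \in L^1(P)$ and denote by $\varphi := dP^*/dP$ the price density of $P^*$ with respect to $P$. Then the problem (8.28) admits a solution for $\rho = \mathrm{AV@R}_\lambda$ which is of the form
\[ Z^* = H I_{\{\varphi > c\}} + (H \wedge r^*) I_{\{\varphi < c\}} + \big( \kappa H + (1 - \kappa)(H \wedge r^*) \big) I_{\{\varphi = c\}} \tag{8.30} \]
for certain constants $c > 0$, $r^* \ge 0$, and $\kappa \in [0,1]$.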


Proof. Recall from Section 4.4 that
\[ \mathrm{AV@R}_\lambda(Y) = \sup_{Q \in \mathcal Q_\lambda} E_Q[\,-Y\,], \]
where $\mathcal Q_\lambda$ is the set of all probability measures $Q \ll P$ whose density $dQ/dP$ is $P$-a.s. bounded by $1/\lambda$. Thus it follows from Fatou's lemma that $\mathrm{AV@R}_\lambda$ is lower semicontinuous with respect to $P$-a.s. convergence of random variables in the class $\{-Y \mid 0 \le Y \le H\}$ (the same argument also gives upper semicontinuity and hence continuity, but this fact is not needed here). Proposition 8.25 hence yields the existence of a solution $Z^*$ of the minimization problem (8.28). By (8.29), $Z^*$ must solve
\[ \text{minimize } E[\,(Z - r^*)^+\,] \text{ subject to } 0 \le Z \le H \text{ and } E^*[\,Z\,] \ge \tilde v, \tag{8.31} \]
where $r^* \ge 0$ is such that
\[ \mathrm{AV@R}_\lambda(-Z^*) = \frac{1}{\lambda} E[\,(Z^* - r^*)^+\,] + r^*. \tag{8.32} \]

Lemma 4.51 states that $r^*$ is a $(1-\lambda)$-quantile for $Z^*$, i.e., $-r^*$ is a $\lambda$-quantile for $-Z^*$.


Let us now solve (8.31). To this end, we consider first the case in which $r^* = 0$. Then we are in the situation of Example 8.21 and obtain
\[ Z^* = H I_{\{\varphi > c\}} + \kappa H I_{\{\varphi = c\}} \]
for constants $c > 0$ and $\kappa \in [0,1]$. This is indeed a special case of (8.30).
Now we consider the case $r^* > 0$. Note first that we must have $Z^* \ge H \wedge r^*$. Indeed, let us assume $P[\,Z^* < H \wedge r^*\,] > 0$. Then we could obtain a strictly lower risk $\mathrm{AV@R}_\lambda(-Z^*)$ either by decreasing the level $r^*$ in case $P[\,Z^* \le H \wedge r^*\,] = 1$ or, in case $P[\,Z^* > H \wedge r^*\,] > 0$, by shifting mass of $Z^*$ from $\{Z^* > H \wedge r^*\}$ to the set $\{Z^* < H \wedge r^*\}$.


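Thus, we can solve our problem by minimizing $E[\,(\hat Z + H \wedge r^* - r^*)^+\,]$ subject to
\[ 0 \le \hat Z \le H - H \wedge r^* \quad\text{and}\quad E^*[\,\hat Z\,] \ge \hat v := \tilde v - E^*[\,H \wedge r^*\,]. \]
Any $\hat Z$ satisfying these constraints must be concentrated on $\{H > r^*\}$, so that the problem is equivalent to
\[ \text{minimize } E[\,\hat Z\,] \text{ subject to } 0 \le \hat Z \le H - H \wedge r^* \text{ and } E^*[\,\hat Z\,] \ge \hat v. \tag{8.33} \]
But this problem is equivalent to the one for $r^* = 0$ if we replace $H$ by $H - H \wedge r^*$. Hence, it is solved by
\[ \hat Z^* = (H - H \wedge r^*) I_{\{\varphi > c\}} + \kappa (H - H \wedge r^*) I_{\{\varphi = c\}} \]
for some constants $c > 0$ and $\kappa \in [0,1]$. It follows that
\[ Z^* = \hat Z^* + H \wedge r^* = H I_{\{\varphi > c\}} + (H \wedge r^*) I_{\{\varphi < c\}} + \big( \kappa H + (1 - \kappa)(H \wedge r^*) \big) I_{\{\varphi = c\}}. \]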
We now solve (8.28) in the more specific situation in which $H \equiv 1$. In this case, one sees that $r^*$ and $c$ are functions of the capital $\tilde v$. The next result shows that these functions behave as follows. As long as $\tilde v$ is below some critical threshold $v^*$, we have $r^* = 0$, and $c = c(\tilde v)$ is determined by the requirement $E^*[\,Z^*\,] = \tilde v$. Above the critical threshold $v^*$, the value of $c$ is always equal to $c(v^*)$, and now $r^* > 0$ is determined by the requirement $E^*[\,Z^*\,] = \tilde v$.
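Theorem 8.27. Consider the setting of Theorem 8.26. Assume in addition that $H \equiv 1$ and that $\varphi$ has a continuous and strictly increasing quantile function $q_\varphi$ with respect to $P$ and satisfies $\|\varphi\|_\infty > \lambda^{-1}$. Then the solution $Z^*$ of problem (8.28) is $P$-a.s. unique. Moreover, there exists a critical capital level $v^*$ such that
\[ Z^* = I_{\{\varphi > q_\varphi(t_0)\}} \quad P\text{-a.s. for } \tilde v \le v^*, \]
where $t_0$ is determined by the condition $E^*[\,Z^*\,] = \tilde v$, and
\[ Z^* = r^* + (1 - r^*) I_{\{\varphi > q_\varphi(t_\lambda)\}} \quad P\text{-a.s. for } \tilde v > v^*, \]
where $r^* = 1 - \frac{1 - \tilde v}{\Phi(t_\lambda)} > 0$, $\Phi(t) := \int_0^t q_\varphi(s)\,ds$, and $t_\lambda$ is the unique solution of the equation
\[ q_\varphi(t_\lambda)(t_\lambda - 1 + \lambda) = \Phi(t_\lambda). \]
Finally, the critical capital level $v^*$ is equal to $1 - \Phi(t_\lambda)$.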
Proof. We fix $\tilde v \in (0,1)$. For a constant $c$ we then let
\[ Z_r = r + (1 - r) I_{\{\varphi > c\}}, \tag{8.34} \]
where $r = r(c) \ge 0$ is such that $E^*[\,Z_r\,] = \tilde v$, i.e.,
\[ r(c) = \frac{\tilde v - E[\,\varphi;\ \varphi > c\,]}{E[\,\varphi;\ \varphi \le c\,]}. \]
This makes sense as long as $c \ge c_0$, where $c_0$ is defined via $\tilde v = E[\,\varphi;\ \varphi > c_0\,]$. Theorem 8.26 states that a solution of our problem can be found within the class $\{Z_{r(c)} \mid c \ge c_0\}$. Thus, we have to minimize
\[ \lambda\,\mathrm{AV@R}_\lambda(-Z_{r(c)}) = \int_{1-\lambda}^1 q_{Z_{r(c)}}(s)\,ds = \lambda\, r(c) + (1 - r(c)) \int_{1-\lambda}^1 I_{\{q_\varphi(s) > c\}}\,ds \]
over $c \ge c_0$. Here we have used Lemma A.23 in the second identity. This minimization problem can be simplified further by using the reparameterization $c = q_\varphi(t)$, which is one-to-one according to our assumptions. Indeed, by letting
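\[ \varrho(t) := r(q_\varphi(t)) = 1 - \frac{1 - \tilde v}{\Phi(t)}, \]
we simply have to minimize the function
\[ \begin{aligned} R(t) := \lambda\,\mathrm{AV@R}_\lambda(-Z_{\varrho(t)}) &= \lambda \varrho(t) + (1 - \varrho(t)) \int_{1-\lambda}^1 I_{(t,1]}(s)\,ds \\ &= \lambda \varrho(t) + (1 - \varrho(t)) \big( \lambda - (t - 1 + \lambda)^+ \big) \\ &= \lambda - (1 - \tilde v)\, \frac{(t - 1 + \lambda)^+}{\Phi(t)} \end{aligned} \]
over $t \ge t_0 := F_\varphi(c_0)$. For $t \le 1 - \lambda$, we get $R(t) = \lambda$, which cannot be optimal.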


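We show next that the function
\[ t \longmapsto \frac{t - 1 + \lambda}{\Phi(t)} \]
has a unique maximizer $t_\lambda \in (1 - \lambda, 1]$, which will define the solution as soon as $t_\lambda \ge t_0$ and as long as $t = t_0$ does not give a better result. To this end, we note first that the derivative of this function is given by
\[ \frac{\Phi(t) - (t - 1 + \lambda)\, q_\varphi(t)}{\Phi(t)^2}. \tag{8.35} \]
The numerator of this expression is strictly larger than zero for $t \le 1 - \lambda$ and equal to $\Phi(1 - \lambda) > 0$ at $t = 1 - \lambda$. Moreover, for $t > 1 - \lambda$,
\[ \Phi(t) - (t - 1 + \lambda)\, q_\varphi(t) = \int_0^{1-\lambda} q_\varphi(s)\,ds + \int_{1-\lambda}^t \big( q_\varphi(s) - q_\varphi(t) \big)\,ds, \]
which is easily seen to be strictly decreasing in $t$. For $t \uparrow 1$ this expression converges to $1 - \lambda \|\varphi\|_\infty$, which is strictly negative due to our assumption $\|\varphi\|_\infty > \lambda^{-1}$. Hence, the numerator in (8.35) has a unique zero $t_\lambda \in (1 - \lambda, 1)$, which is the unique solution of the equation
\[ q_\varphi(t_\lambda)(t_\lambda - 1 + \lambda) = \Phi(t_\lambda), \]
and this solution $t_\lambda$ is consequently the unique maximizer of the function $t \mapsto (t - 1 + \lambda)/\Phi(t)$.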
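If $t_\lambda \le t_0$, then $R$ has no minimizer on $(t_0, 1]$, and it follows that $t^* = t_0$ is its minimizer. Let us compare $R(t_\lambda)$ with $R(t_0)$ in case $t_\lambda > t_0$. We have
\[ R(t_\lambda) = \lambda - (1 - \tilde v)\, \frac{(t_\lambda - 1 + \lambda)^+}{\Phi(t_\lambda)} = \lambda - \Phi(t_0)\, \frac{(t_\lambda - 1 + \lambda)^+}{\Phi(t_\lambda)} \]
and
\[ R(t_0) = \lambda - (t_0 + \lambda - 1)^+ = \lambda - \Phi(t_0)\, \frac{(t_0 - 1 + \lambda)^+}{\Phi(t_0)}. \]
Since $t_\lambda$ is the unique maximizer of the function $t \mapsto (t - 1 + \lambda)^+ / \Phi(t)$, we thus see that $R(t_\lambda)$ is strictly smaller than $R(t_0)$. Hence the solution is defined by
\[ t^* := t_0 \vee t_\lambda. \]
Clearly, $t_\lambda$ is independent of $\tilde v$, while $t_0$ decreases from 1 to 0 as $\tilde v$ increases from 0 to 1. Thus, by taking $v^*$ as the capital level for which $t_\lambda = t_0$, we see that the optimal solution has the form
\[ Z^* = \begin{cases} I_{\{\varphi > q_\varphi(t_0)\}} & \text{for } \tilde v \le v^*, \\ r^* + (1 - r^*) I_{\{\varphi > q_\varphi(t_\lambda)\}} & \text{for } \tilde v > v^*, \end{cases} \]
where $r^* = 1 - \frac{1 - \tilde v}{\Phi(t_\lambda)} > 0$.
Finally, when $\tilde v$ is equal to the critical capital level $v^*$, we must have that $t_\lambda = t_0$. But $t_0$ was defined to be the solution of $1 - \Phi(t) = \tilde v$, and so $v^* = 1 - \Phi(t_\lambda)$.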
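The quantities in Theorem 8.27 can be computed numerically once $q_\varphi$ is known. The sketch below is an illustration only and rests on assumptions that are not part of the text: it takes $\varphi = 2U$ with $U$ uniformly distributed on $[0,1]$ under $P$, so that $q_\varphi(t) = 2t$, $\Phi(t) = t^2$ and $\|\varphi\|_\infty = 2$, and it uses hypothetical function names and a plain bisection. For this choice the equation for $t_\lambda$ can be solved by hand, $t_\lambda = 2(1 - \lambda)$, which gives a check on the numerics whenever $\lambda > 1/2$.

```python
def q_phi(t):
    # Quantile function of the assumed density phi = 2U: continuous, strictly increasing.
    return 2.0 * t


def Phi(t):
    # Phi(t) = integral of q_phi over [0, t] for the assumed example.
    return t * t


def bisect(f, lo, hi, tol=1e-12):
    # Root of a continuous function f with f(lo) > 0 > f(hi).
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def t_lambda(lam):
    # Unique zero in (1 - lam, 1) of the numerator Phi(t) - (t - 1 + lam) * q_phi(t) in (8.35).
    return bisect(lambda t: Phi(t) - (t - 1.0 + lam) * q_phi(t), 1.0 - lam, 1.0)


def optimal_profile(v_tilde, lam):
    """Return (r_star, t_star) with Z* = r_star + (1 - r_star) * 1{phi > q_phi(t_star)}."""
    t_lam = t_lambda(lam)
    v_star = 1.0 - Phi(t_lam)  # critical capital level
    if v_tilde <= v_star:
        # Below the threshold: r* = 0 and t_0 solves 1 - Phi(t) = v_tilde.
        t0 = bisect(lambda t: (1.0 - Phi(t)) - v_tilde, 0.0, 1.0)
        return 0.0, t0
    return 1.0 - (1.0 - v_tilde) / Phi(t_lam), t_lam


lam = 0.75                       # requires ||phi||_inf = 2 > 1 / lam
t_lam = t_lambda(lam)            # closed form for this example: 2 * (1 - lam)
v_star = 1.0 - Phi(t_lam)
r_high, t_high = optimal_profile(0.9, lam)   # capital above the critical level
r_low, t_low = optimal_profile(0.36, lam)    # capital below the critical level
```

For $\tilde v$ above the critical level the threshold $t_\lambda$ no longer moves and only the constant part $r^*$ grows with $\tilde v$, which is the qualitative behaviour described before Theorem 8.27.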

Let us now point out the connections of the preceding theorem with robust statistical test theory as explained in Section 3.5 and in particular in Remark 3.54. To this end, let $\mathcal R$ denote the set of all measurable functions $\psi : \Omega \to [0,1]$, which will be interpreted as randomized statistical tests; see Remark A.32. Problem (8.28) for $H \equiv 1$ can then be rewritten as
\[ \text{minimize } \sup_{Q \in \mathcal Q_\lambda} E_Q[\,\psi\,] \text{ subject to } \psi \in \mathcal R \text{ and } E^*[\,\psi\,] \ge \tilde v, \]
where $\mathcal Q_\lambda$ is the set of all probability measures $Q \ll P$ whose density $dQ/dP$ is $P$-a.s. bounded by $1/\lambda$. The solution $\psi^*$ of this problem is described in Theorem 8.27. It is easy to see that $\psi^*$ must also solve the following dual problem:
\[ \text{maximize } E^*[\,\psi\,] \text{ subject to } \psi \in \mathcal R \text{ and } \sup_{Q \in \mathcal Q_\lambda} E_Q[\,\psi\,] \le \alpha, \]
where $\alpha = \sup_{Q \in \mathcal Q_\lambda} E_Q[\,\psi^*\,]$. Thus, $\psi^*$ is an optimal randomized test for testing the hypothesis $P^*$ against the composite null hypothesis $\mathcal Q_\lambda$; see Remark 3.54.
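When $\mathcal Q_\lambda$ admits a least-favorable measure $Q_0$ with respect to $P^*$ in the sense of Definition 3.48, then $\psi^*$ must also be a standard Neyman–Pearson test for testing the hypothesis $P^*$ against the null hypothesis $Q_0$. By Theorem A.31, it must hence be of the form
\[ \psi^* = \kappa I_{\{\pi = c\}} + I_{\{\pi > c\}}, \]
where $c > 0$ and $\kappa \in [0,1]$ are constants and
\[ \pi = \frac{dP^*}{dQ_0}. \]
Under the assumptions of Theorem 8.27, this is the case when $\pi = c\,(\varphi \vee q_\varphi(t_\lambda))$, where $c$ is a suitable constant. It follows that
\[ \frac{dQ_0}{dP} = \frac{1}{c} \cdot \frac{\varphi}{\varphi \vee q_\varphi(t_\lambda)}. \]
We have by (8.34),
\[ \begin{aligned} E\Big[\, \frac{\varphi}{\varphi \vee q_\varphi(t_\lambda)} \,\Big] &= P[\,\varphi > q_\varphi(t_\lambda)\,] + \frac{1}{q_\varphi(t_\lambda)} E[\,\varphi;\ \varphi \le q_\varphi(t_\lambda)\,] \\ &= 1 - t_\lambda + \frac{1}{q_\varphi(t_\lambda)} \Phi(t_\lambda) = \lambda. \end{aligned} \]
This yields that $c = \lambda$, and we see that
\[ \frac{dQ_0}{dP} = \frac{1}{\lambda} \cdot \frac{\varphi}{\varphi \vee q_\varphi(t_\lambda)} \tag{8.36} \]
defines a probability measure $Q_0 \in \mathcal Q_\lambda$ with
\[ \pi = \frac{dP^*}{dQ_0} = \lambda\,(\varphi \vee q_\varphi(t_\lambda)). \]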

The following result now follows immediately from Theorem 8.27:
Corollary 8.28. For a measure $P^* \approx P$ satisfying the assumptions of Theorem 8.27, the measure $Q_0$ in (8.36) is a least-favorable measure for
\[ \mathcal Q_\lambda = \Big\{ Q \in \mathcal M_1(P) \,\Big|\, \frac{dQ}{dP} \le \frac{1}{\lambda}\ P\text{-a.s.} \Big\} \]
in the sense of Definition 3.48.
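As a sanity check of (8.36), consider the following worked example. It is an illustration under assumptions not taken from the text: $P$ is the uniform distribution on $\Omega = [0,1]$, $\varphi(\omega) = 2\omega$, and $\lambda = 3/4$, so that $\|\varphi\|_\infty = 2 > \lambda^{-1}$.

```latex
% Assumed example: q_\varphi(t) = 2t and \Phi(t) = t^2.
% The equation q_\varphi(t)(t - 1 + \lambda) = \Phi(t) reads 2t(t - 1/4) = t^2,
% hence t_\lambda = 1/2 and q_\varphi(t_\lambda) = 1.
\[
  \frac{dQ_0}{dP}(\omega)
  = \frac{1}{\lambda}\cdot\frac{\varphi(\omega)}{\varphi(\omega)\vee 1}
  = \frac{4}{3}\,\min(2\omega, 1) \le \frac{4}{3} = \frac{1}{\lambda},
\]
\[
  \int_0^1 \frac{4}{3}\,\min(2\omega,1)\,d\omega
  = \frac{4}{3}\Big(\int_0^{1/2} 2\omega\,d\omega + \int_{1/2}^1 d\omega\Big)
  = \frac{4}{3}\Big(\frac{1}{4}+\frac{1}{2}\Big) = 1 .
\]
```

So in this example $Q_0$ is indeed a probability measure whose density is bounded by $1/\lambda$, i.e., $Q_0 \in \mathcal Q_\lambda$.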

Chapter 9

Hedging under constraints

So far, we have focused on frictionless market models, where asset transactions can
be carried out with no limitation. In this chapter, we study the impact of market imperfections generated by convex trading constraints. Thus, we develop the theory of
dynamic hedging under the condition that only trading strategies from a given class S
may be used. In Section 9.1 we characterize those market models for which S does not
contain arbitrage opportunities. Then we take a direct approach to the superhedging
duality for American options. To this end, we first derive a uniform Doob decomposition under constraints in Section 9.2. The appropriate upper Snell envelopes are
analyzed in Section 9.3. In Section 9.4 we derive a superhedging duality under constraints, and we explain its role in the analysis of convex risk measures in a financial
market model.

9.1

Absence of arbitrage opportunities

In practice, it may be reasonable to restrict the class of trading strategies which are
admissible for hedging purposes. As discussed in Section 4.8, there may be upper
bounds on the capital invested into risky assets, or upper and lower bounds on the
number of shares of an asset. Here we model such portfolio constraints by a set S of
d -dimensional predictable processes, viewed as admissible investment strategies into
risky assets. Throughout this chapter, we will assume that S satisfies the following
conditions:
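(a) $0 \in \mathcal S$.
(b) $\mathcal S$ is predictably convex: If $\xi, \eta \in \mathcal S$ and $h$ is a predictable process with $0 \le h \le 1$, then the process
\[ h_t \xi_t + (1 - h_t)\, \eta_t, \qquad t = 1, \dots, T, \]
belongs to $\mathcal S$.
(c) For each $t \in \{1, \dots, T\}$, the set
\[ \mathcal S_t := \{ \xi_t \mid \xi \in \mathcal S \} \]
is closed in $L^0(\Omega, \mathcal F_{t-1}, P; \mathbb R^d)$.
(d) For all $t$, $\xi_t \in \mathcal S_t$ implies $\xi_t^\perp \in \mathcal S_t$.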


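In order to explain condition (d), let us recall from Lemma 1.66 that each $\xi_t \in L^0(\Omega, \mathcal F_{t-1}, P; \mathbb R^d)$ can be uniquely decomposed as
\[ \xi_t = \eta_t + \xi_t^\perp, \qquad \text{where } \eta_t \in N_t \text{ and } \xi_t^\perp \in N_t^\perp, \]
and where
\[ \begin{aligned} N_t &= \{ \eta_t \in L^0(\Omega, \mathcal F_{t-1}, P; \mathbb R^d) \mid \eta_t \cdot (X_t - X_{t-1}) = 0\ P\text{-a.s.} \}, \\ N_t^\perp &= \{ \xi_t \in L^0(\Omega, \mathcal F_{t-1}, P; \mathbb R^d) \mid \xi_t \cdot \eta_t = 0\ P\text{-a.s. for all } \eta_t \in N_t \}. \end{aligned} \]
Remark 9.1. Under condition (d), we may replace $\xi_t \cdot (X_t - X_{t-1})$ by $\xi_t^\perp \cdot (X_t - X_{t-1})$, and $\xi_t^\perp \cdot (X_t - X_{t-1}) = 0$ $P$-a.s. implies $\xi_t^\perp = 0$. Note that condition (d) holds if the price increments satisfy the following non-redundance condition: For all $t \in \{1, \dots, T\}$ and $\xi_t \in L^0(\Omega, \mathcal F_{t-1}, P; \mathbb R^d)$,
\[ \xi_t \cdot (X_t - X_{t-1}) = 0 \quad P\text{-a.s.} \quad \Longrightarrow \quad \xi_t = 0 \quad P\text{-a.s.} \tag{9.1} \]
◊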

Example 9.2. For each $t$ let $C_t$ be a closed convex subset of $\mathbb R^d$ such that $0 \in C_t$. Take $\mathcal S$ as the class of all $d$-dimensional predictable processes $\xi$ such that $\xi_t \in C_t$ $P$-a.s. for all $t$. If the non-redundance condition (9.1) holds, then $\mathcal S$ satisfies conditions (a) through (d). This case includes short sales constraints and restrictions on the size of a long position. ◊
Example 9.3. Let $a, b$ be two constants such that $-\infty \le a < 0 < b \le \infty$, and take $\mathcal S$ as the set of all $d$-dimensional predictable processes $\xi$ such that
\[ a \le \xi_t \cdot X_{t-1} \le b \qquad P\text{-a.s. for } t = 1, \dots, T. \]
This class $\mathcal S$ corresponds to constraints on the capital invested into risky assets. If we assume that the non-redundance condition (9.1) holds, then $\mathcal S$ satisfies conditions (a) through (d). More generally, instead of the two constants $a$ and $b$, one can take dynamic margins defined via two predictable processes $(a_t)$ and $(b_t)$. ◊
Let $\overline{\mathcal S}$ denote the set of all self-financing trading strategies $\bar\xi = (\xi^0, \xi)$ which arise from an investment strategy $\xi \in \mathcal S$, i.e.,
\[ \overline{\mathcal S} = \{ \bar\xi = (\xi^0, \xi) \mid \bar\xi \text{ is self-financing and } \xi \in \mathcal S \}. \]
In this section, our goal is to characterize the absence of arbitrage opportunities in $\overline{\mathcal S}$. The existence of an equivalent martingale measure $P^* \in \mathcal P$ is clearly sufficient. Under an additional technical assumption, a condition which is both necessary and sufficient will involve a larger class $\mathcal P_{\mathcal S} \supset \mathcal P$. In order to introduce these conditions, we need some preparation.


Definition 9.4. An adapted stochastic process $Z$ on $(\Omega, \mathcal F, (\mathcal F_t), Q)$ is called a local $Q$-martingale if there exists a sequence of stopping times $(\tau_n)_{n \in \mathbb N} \subset \mathcal T$ such that $\tau_n \nearrow T$ $Q$-a.s., and such that the stopped processes $Z^{\tau_n}$ are $Q$-martingales. The sequence $(\tau_n)_{n \in \mathbb N}$ is called a localizing sequence for $Z$. In the same way, we define local supermartingales and local submartingales.
Remark 9.5. If $Q$ is a martingale measure for the discounted price process $X$, then the value process $V$ of each self-financing trading strategy $\bar\xi = (\xi^0, \xi)$ is a local $Q$-martingale. To prove this, one can take the sequence
\[ \tau_n := \inf\{ t \ge 0 \mid |\xi_{t+1}| > n \} \wedge T \]
as a localizing sequence. With this choice, $|\xi_t| \le n$ on $\{\tau_n \ge t\}$, and the increments
\[ V_t^{\tau_n} - V_{t-1}^{\tau_n} = I_{\{\tau_n \ge t\}}\, \xi_t \cdot (X_t - X_{t-1}), \qquad t = 1, \dots, T, \]
of the stopped process $V^{\tau_n}$ are $Q$-integrable and satisfy
\[ E_Q[\, V_t^{\tau_n} - V_{t-1}^{\tau_n} \mid \mathcal F_{t-1} \,] = I_{\{\tau_n \ge t\}}\, \xi_t \cdot E_Q[\, X_t - X_{t-1} \mid \mathcal F_{t-1} \,] = 0. \]
The following proposition is a generalization of an argument which we have already used in the proof of Theorem 5.25. Throughout this chapter, we will assume that $\mathcal F_0 = \{\emptyset, \Omega\}$ and $\mathcal F_T = \mathcal F$.
Proposition 9.6. A local $Q$-supermartingale $Z$ whose negative part $Z_t^-$ is integrable for each $t \in \{1, \dots, T\}$ is a $Q$-supermartingale.
Proof. Let $(\tau_n)$ be a localizing sequence. Then
\[ Z_t^{\tau_n} \ge -\sum_{s=0}^T Z_s^- \in L^1(Q). \]
In view of $\lim_n Z_t^{\tau_n} = Z_t$, Fatou's lemma for conditional expectations implies that $Q$-a.s.
\[ E_Q[\, Z_t \mid \mathcal F_{t-1} \,] \le \liminf_{n\uparrow\infty} E_Q[\, Z_t^{\tau_n} \mid \mathcal F_{t-1} \,] \le \liminf_{n\uparrow\infty} Z_{t-1}^{\tau_n} = Z_{t-1}. \]
We get in particular that $E_Q[\,Z_t\,] \le Z_0 < \infty$. Thus $Z_t \in L^1(Q)$, and the assertion follows.
Exercise 9.1.1. Let $Z$ be a local $Q$-martingale with $Z_T \ge 0$. Show that in our situation, where $\mathcal F_0 = \{\emptyset, \Omega\}$, $Z$ is a $Q$-martingale.


Definition 9.7. By $\mathcal P_{\mathcal S}$ we denote the class of all probability measures $\tilde P \approx P$ such that
\[ X_t \in L^1(\tilde P) \quad \text{for all } t, \tag{9.2} \]
and such that the value process of any trading strategy in $\overline{\mathcal S}$ is a local $\tilde P$-supermartingale.
Remark 9.8. If $\overline{\mathcal S}$ contains all self-financing trading strategies $\bar\xi = (\xi^0, \xi)$ with bounded $\xi$, then $\mathcal P_{\mathcal S}$ coincides with the class $\mathcal P$ of all equivalent martingale measures. To prove this, let $\tilde P \in \mathcal P_{\mathcal S}$, and note that the value process $V$ of any such $\bar\xi$ is a $\tilde P$-supermartingale by (9.2) and by Proposition 9.6. The same applies to the strategy $-\bar\xi$, so $V$ is in fact a $\tilde P$-martingale, and Theorem 5.14 shows that $\tilde P$ is a martingale measure for $X$. ◊
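Our first goal is to extend the fundamental theorem of asset pricing to our present setting; see Theorem 5.16. Let us introduce the positive cone
\[ \mathcal R := \{ \lambda \xi \mid \xi \in \mathcal S,\ \lambda \ge 0 \} \]
generated by $\mathcal S$. Accordingly, we define the cones $\overline{\mathcal R}$ and $\mathcal R_t$. Clearly, $\overline{\mathcal R}$ contains no arbitrage opportunities if and only if $\overline{\mathcal S}$ is arbitrage-free. We will need the following condition on the $L^0$-closure $\hat{\mathcal R}_t$ of $\mathcal R_t$:
\[ \text{for each } t, \quad \hat{\mathcal R}_t \cap L^\infty(\Omega, \mathcal F_t, P; \mathbb R^d) \subset \mathcal R_t. \tag{9.3} \]
This condition clearly holds if $\mathcal R_t$ itself is closed in $L^0$ and in particular if $\mathcal S_t = \mathcal R_t$ for all $t$.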
Theorem 9.9. Under condition (9.3), there are no arbitrage opportunities in $\overline{\mathcal S}$ if and only if $\mathcal P_{\mathcal S}$ is non-empty. In this case, there exists a measure $\tilde P \in \mathcal P_{\mathcal S}$ which has a bounded density $d\tilde P/dP$.
Example 9.10. In the situation of Example 9.2, condition (9.3) will be satisfied as soon as the cones generated by the convex sets $C_t$ are closed in $\mathbb R^d$. This case includes short sales constraints and constraints on the size of a long position, which are modeled by taking $C_t = [a_t^1, b_t^1] \times \cdots \times [a_t^d, b_t^d]$ for certain numbers $a_t^k, b_t^k$ such that $-\infty \le a_t^k \le 0 \le b_t^k \le \infty$.
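To see how Theorem 9.9 differs from the unconstrained case, the following worked example may help. It is a sketch under assumptions not taken from the text: a one-period model with $T = 1$, $d = 1$, a two-point space $\Omega = \{\omega_u, \omega_d\}$ with $P[\{\omega\}] > 0$ for both points, $X_0 = 1$, $X_1(\omega_u) = x_u > X_1(\omega_d) = x_d \ge 0$, and the short-sales constraint $C_1 = [0, \infty)$.

```latex
% Here \mathcal S = \{\xi_1 \ge 0\}; the cone generated by C_1 is closed, so (9.3) holds.
% A measure \tilde P \approx P is determined by \tilde p = \tilde P[\{\omega_u\}] \in (0,1).
% The value process V_1 = V_0 + \xi_1 (X_1 - X_0), \xi_1 \ge 0, is a supermartingale iff
\[
  \tilde E[\,X_1 - X_0\,] = \tilde p\, x_u + (1-\tilde p)\, x_d - 1 \le 0 .
\]
% Such a \tilde p \in (0,1) exists if and only if x_d < 1, so
\[
  \mathcal P_{\mathcal S} \ne \emptyset
  \iff x_d < 1
  \iff \text{no } \xi_1 \ge 0 \text{ yields an arbitrage opportunity}.
\]
% In contrast, \mathcal P \ne \emptyset requires x_d < 1 < x_u.
```

For instance, with $x_u = 0.9$ and $x_d = 0.5$ the unconstrained market admits arbitrage by short selling and $\mathcal P = \emptyset$, while $\mathcal P_{\mathcal S}$ contains every $\tilde P \approx P$, in line with the inclusion $\mathcal P_{\mathcal S} \supset \mathcal P$.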
Example 9.11. Consider now the situation of Example 9.3. We claim that $\overline{\mathcal S}$ does not contain arbitrage opportunities if and only if the unconstrained market is arbitrage-free, so that we have $\mathcal P_{\mathcal S} = \mathcal P$. To prove this, note that the existence of an arbitrage opportunity in the unconstrained market is equivalent to the existence of some $t$ and some $\mathcal F_{t-1}$-measurable $\xi_t$ such that $\xi_t \cdot (X_t - X_{t-1}) \ge 0$ $P$-a.s. and $P[\,\xi_t \cdot (X_t - X_{t-1}) > 0\,] > 0$ (see Proposition 5.11). Next, there exists a constant $c > 0$ such that these properties are shared by $\tilde\xi_t := \xi_t I_{\{|\xi_t \cdot X_{t-1}| \le c\}}$ and in turn by $\varepsilon \tilde\xi_t$, where $\varepsilon > 0$. But $\varepsilon \tilde\xi_t \in \mathcal S_t$ if $\varepsilon$ is small enough.


As to the proof of Theorem 9.9, we will first show that the condition $\mathcal P_{\mathcal S} \ne \emptyset$ implies the absence of arbitrage opportunities in $\overline{\mathcal S}$.
Proof of sufficiency in Theorem 9.9. Suppose $\tilde P$ is a measure in $\mathcal P_{\mathcal S}$, and $V$ is the value process of a trading strategy in $\overline{\mathcal S}$ such that $V_T \ge 0$ $P$-almost surely. Combining Lemma 9.12 below with Proposition 9.6 shows that $V$ is a $\tilde P$-supermartingale. Hence $V_0 \ge \tilde E[\,V_T\,]$, so $V$ cannot be the value process of an arbitrage opportunity.
Lemma 9.12. Suppose that $\mathcal P_{\mathcal S} \ne \emptyset$ and that $V$ is the value process of a trading strategy in $\overline{\mathcal S}$ such that $V_T \ge 0$ $P$-almost surely. Then $V_t \ge 0$ $P$-a.s. for all $t$.
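Proof. The assertion will be proved by backward induction on $t$. We have $V_T \ge 0$ by assumption, so let us assume that $V_t \ge 0$ $P$-a.s. for some $t$. For $\bar\xi = (\xi^0, \xi) \in \overline{\mathcal S}$ with value process $V$, we let $\xi_s^{(c)} := \xi_s I_{\{|\xi_s| \le c\}}$ for $c > 0$ and for all $s$. Then the value process $V^{(c)}$ of $\bar\xi^{(c)}$ is a $\tilde P$-supermartingale for any fixed $\tilde P \in \mathcal P_{\mathcal S}$. Furthermore,
\[ \begin{aligned} V_{t-1} I_{\{|\xi_t| \le c\}} &= V_t I_{\{|\xi_t| \le c\}} - \xi_t^{(c)} \cdot (X_t - X_{t-1}) \\ &\ge -\xi_t^{(c)} \cdot (X_t - X_{t-1}) \\ &= V_{t-1}^{(c)} - V_t^{(c)}. \end{aligned} \]
The last term on the right belongs to $L^1(\tilde P)$, so we may take the conditional expectation $\tilde E[\,\cdot \mid \mathcal F_{t-1}\,]$ on both sides of the inequality. We get
\[ V_{t-1} I_{\{|\xi_t| \le c\}} \ge \tilde E[\, V_{t-1}^{(c)} - V_t^{(c)} \mid \mathcal F_{t-1} \,] \ge 0 \qquad \tilde P\text{-a.s.} \]
By letting $c \uparrow \infty$, we obtain $V_{t-1} \ge 0$.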


Let us now prepare for the proof that the condition $\mathcal P_{\mathcal S} \ne \emptyset$ is necessary. First we argue that the absence of arbitrage opportunities in $\overline{\mathcal S}$ is equivalent to the absence of arbitrage opportunities in each of the embedded one-period models, i.e., to the non-existence of $\xi_t \in \mathcal S_t$ such that $\xi_t \cdot (X_t - X_{t-1})$ amounts to a non-trivial positive gain. This observation will allow us to apply the techniques of Section 1.6. Let us denote
\[ \mathcal S^\infty := \{ \xi \in \mathcal S \mid \xi \text{ is bounded} \}. \]
Similarly, we define
\[ \mathcal S_t^\infty := \{ \xi_t \mid \xi \in \mathcal S^\infty \} = \mathcal S_t \cap L^\infty(\Omega, \mathcal F_{t-1}, P; \mathbb R^d). \]


Lemma 9.13. The following conditions are equivalent:
(a) There exists an arbitrage opportunity in $\overline{\mathcal S}$.
(b) There exist $t \in \{1, \dots, T\}$ and $\xi_t \in \mathcal S_t$ such that
\[ \xi_t \cdot (X_t - X_{t-1}) \ge 0 \ \ P\text{-a.s.,} \quad \text{and} \quad P[\, \xi_t \cdot (X_t - X_{t-1}) > 0 \,] > 0. \tag{9.4} \]
(c) There exist $t \in \{1, \dots, T\}$ and $\xi_t \in \mathcal S_t^\infty$ which satisfies (9.4).
Proof. The proof is essentially the same as the one of Proposition 5.11.
In order to apply the results of Section 1.6, we introduce the convex sets
\[ \mathcal K_t^{\mathcal S} := \{ \xi_t \cdot (X_t - X_{t-1}) \mid \xi_t \in \mathcal S_t \}, \]
for $t \in \{1, \dots, T\}$. Lemma 9.13 shows that $\overline{\mathcal S}$ contains no arbitrage opportunities if and only if the condition
\[ \mathcal K_t^{\mathcal S} \cap L^0_+ = \{0\} \tag{9.5} \]
holds for all $t \in \{1, \dots, T\}$.
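Condition (9.4) is easy to test in a toy situation. The sketch below is illustrative only and uses assumptions that do not come from the text: one risky asset ($d = 1$), a single period with trivial $\mathcal F_{t-1}$ so that the position is a constant, finitely many scenarios that all have positive probability, and a box constraint $\xi_t \in [a, b]$ with $a \le 0 \le b$ as in Example 9.10. The function name is hypothetical.

```python
def one_period_arbitrage(increments, a, b):
    """Check condition (9.4) for a constant position xi in [a, b] with a <= 0 <= b.

    increments: values of X_t - X_{t-1} in scenarios that all have positive probability.
    """
    some_gain = any(dx > 0 for dx in increments)
    some_loss = any(dx < 0 for dx in increments)
    # A long position (xi > 0) is an arbitrage iff the price never falls and sometimes rises.
    long_arbitrage = b > 0 and some_gain and not some_loss
    # A short position (xi < 0) is an arbitrage iff the price never rises and sometimes falls.
    short_arbitrage = a < 0 and some_loss and not some_gain
    return long_arbitrage or short_arbitrage


falling_prices = [-1.0, 0.0]  # hypothetical increments: the price never rises
unconstrained = one_period_arbitrage(falling_prices, a=-1.0, b=1.0)
no_short_sales = one_period_arbitrage(falling_prices, a=0.0, b=1.0)
```

With these increments a short position is an arbitrage opportunity, but it is excluded once short sales are prohibited, so the constrained class is arbitrage-free although the unconstrained market is not.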
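Lemma 9.14. Condition (9.5) implies that $\mathcal K_t^{\mathcal S} - L^0_+(\Omega, \mathcal F_t, P)$ is a closed convex subset of $L^0(\Omega, \mathcal F_t, P)$.
Proof. The proof is essentially the same as the one of Lemma 1.68. Only the following additional observation is required: If $(\xi^n)$ is a sequence in $\mathcal S_t$, and if $\alpha$ and $\sigma$ are two $\mathcal F_{t-1}$-measurable random variables such that $0 \le \alpha \le 1$ and $\sigma$ is integer-valued, then $\zeta := \alpha\, \xi^\sigma \in \mathcal S_t$. Indeed, predictable convexity of $\mathcal S$ and our assumption that $0 \in \mathcal S$ imply that
\[ \sum_{k=1}^n I_{\{\sigma = k\}}\, \alpha\, \xi^k \in \mathcal S_t \]
for each $n$, and the closedness of $\mathcal S_t$ in $L^0(\Omega, \mathcal F_{t-1}, P; \mathbb R^d)$ yields
\[ \zeta = \sum_{k=1}^\infty I_{\{\sigma = k\}}\, \alpha\, \xi^k \in \mathcal S_t. \]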

From now on, we will assume that
\[ E[\,|X_s|\,] < \infty \quad \text{for all } s. \tag{9.6} \]
For the purpose of proving Theorem 9.9, this can be assumed without loss of generality: If (9.6) does not hold, then we replace $P$ by an equivalent measure $P'$ which has a bounded density $dP'/dP$ and for which the price process $X$ is integrable. For instance, we can take
\[ dP' = c \exp\Big[ -\sum_{s=1}^T |X_s| \Big]\, dP, \]
where $c$ denotes the normalizing constant. If there exists a measure $\tilde P \approx P'$ such that each value process for a strategy in $\overline{\mathcal S}$ is a local $\tilde P$-supermartingale and such that the density $d\tilde P/dP'$ is bounded, then $\tilde P \in \mathcal P_{\mathcal S}$, and the density $d\tilde P/dP$ is bounded as well.
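The two properties claimed for $P'$ can be verified directly. The following short computation is added as an illustration; it only uses the elementary bound $x e^{-x} \le 1/e$ for $x \ge 0$ and writes $E'$ for the expectation under $P'$.

```latex
% Bounded density: the exponent is non-positive, hence
\[
  \frac{dP'}{dP} = c\,\exp\Big[-\sum_{r=1}^T |X_r|\Big] \le c .
\]
% Integrability of X_s under P' for s = 1, ..., T:
\[
  E'[\,|X_s|\,]
  = c\,E\Big[\,|X_s|\exp\Big[-\sum_{r=1}^T |X_r|\Big]\Big]
  \le c\,E\big[\,|X_s|\,e^{-|X_s|}\,\big]
  \le \frac{c}{e} < \infty .
\]
```

The constant $X_0$ is trivially integrable because $\mathcal F_0$ is assumed to be trivial, so (9.6) holds with $P'$ in place of $P$.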
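Lemma 9.15. If $\overline{\mathcal S}$ contains no arbitrage opportunities and condition (9.3) holds, then for each $t \in \{1, \dots, T\}$ there exists some $Z_t \in L^\infty(\Omega, \mathcal F_t, P)$ such that $Z_t > 0$ $P$-a.s. and such that
\[ E[\, Z_t\, \xi_t \cdot (X_t - X_{t-1}) \,] \le 0 \quad \text{for all } \xi \in \mathcal S^\infty. \tag{9.7} \]
Proof. Recall that $\overline{\mathcal R}$ does not contain arbitrage opportunities if and only if $\overline{\mathcal S}$ is arbitrage-free. Hence, for each $t$, $\mathcal K_t^{\mathcal R} \cap L^0_+(\Omega, \mathcal F_t, P) = \{0\}$ by Lemma 9.13. By the equivalence of conditions (a) and (c) of the same lemma and condition (9.3), we even get
\[ \mathcal K_t^{\hat{\mathcal R}} \cap L^0_+(\Omega, \mathcal F_t, P) = \{0\}, \tag{9.8} \]
where $\hat{\mathcal R}_t$ denotes again the $L^0$-closure of $\mathcal R_t$. The cone $\hat{\mathcal R}_t$ satisfies all conditions required from $\mathcal S_t$, and hence Lemma 9.14 implies that each
\[ \mathcal C_t^{\hat{\mathcal R}} := \big( \mathcal K_t^{\hat{\mathcal R}} - L^0_+(\Omega, \mathcal F_t, P) \big) \cap L^1 \]
is a closed convex cone in $L^1$ which contains $-L^\infty_+(\Omega, \mathcal F_t, P)$. Furthermore, it follows from (9.8) and the argument in the proof of "(a) $\Leftrightarrow$ (b)" of Theorem 1.55 that $\mathcal C_t^{\hat{\mathcal R}} \cap L^0_+ = \{0\}$, so $\mathcal C_t^{\hat{\mathcal R}}$ satisfies the assumptions of the Kreps–Yan theorem, which is stated in Theorem 1.62. We conclude that there exists $Z_t \in L^\infty(\Omega, \mathcal F_t, P)$ such that $P[\,Z_t > 0\,] = 1$, and such that $E[\,Z_t W\,] \le 0$ for each $W \in \mathcal C_t^{\hat{\mathcal R}}$. As $\xi_t \cdot (X_t - X_{t-1}) \in \mathcal C_t^{\hat{\mathcal R}}$ for each $\xi \in \mathcal S^\infty$, $Z_t$ has property (9.7).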
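Now we can complete the proof of Theorem 9.9 by showing that the absence of arbitrage opportunities in $\overline{\mathcal S}$ implies the existence of a measure $\tilde P$ that belongs to the class $\mathcal P_{\mathcal S}$ and has a bounded density $d\tilde P/dP$.
Proof of necessity in Theorem 9.9. Suppose that $\overline{\mathcal S}$ does not contain arbitrage opportunities. We are going to construct the desired measure $\tilde P$ via backward recursion. First we consider the case $t = T$. Take a bounded random variable $Z_T > 0$ as constructed in Lemma 9.15, and define a probability measure $\tilde P_T$ by
\[ \frac{d\tilde P_T}{dP} = \frac{Z_T}{E[\,Z_T\,]}. \]
Clearly, $\tilde P_T$ is equivalent to $P$, and $X_t \in L^1(\tilde P_T)$ for all $t$. We claim that
\[ \tilde E_T[\, \xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] \le 0 \quad \text{for all } \xi \in \mathcal S^\infty. \tag{9.9} \]
To prove this claim, consider the family
\[ \Phi := \big\{ \tilde E_T[\, \xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] \,\big|\, \xi \in \mathcal S^\infty \big\}. \]
For $\xi, \tilde\xi \in \mathcal S^\infty$, let
\[ A := \big\{ \tilde E_T[\, \xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] > \tilde E_T[\, \tilde\xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] \big\}, \]
and define $\xi'$ by $\xi'_t = 0$ for $t < T$ and
\[ \xi'_T := \xi_T I_A + \tilde\xi_T I_{A^c}. \]
The predictable convexity of $\mathcal S$ implies that $\xi' \in \mathcal S^\infty$. Furthermore, we have
\[ \tilde E_T[\, \xi'_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] = \tilde E_T[\, \xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] \vee \tilde E_T[\, \tilde\xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,]. \]
Hence, the family $\Phi$ is directed upwards in the sense of Theorem A.33. By virtue of that theorem, $\operatorname{ess\,sup} \Phi$ is the increasing limit of a sequence in $\Phi$. By monotone convergence, we get
\[ \begin{aligned} \tilde E_T\Big[\, \operatorname*{ess\,sup}_{\xi \in \mathcal S^\infty} \tilde E_T[\, \xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] \,\Big] &= \sup_{\xi \in \mathcal S^\infty} \tilde E_T\big[\, \tilde E_T[\, \xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] \,\big] \\ &= \frac{1}{E[\,Z_T\,]} \sup_{\xi \in \mathcal S^\infty} E[\, \xi_T \cdot (X_T - X_{T-1})\, Z_T \,] \\ &\le 0, \end{aligned} \tag{9.10} \]
where we have used (9.7) in the last step. Since $\mathcal S$ contains 0, it follows that
\[ \operatorname*{ess\,sup}_{\xi \in \mathcal S^\infty} \tilde E_T[\, \xi_T \cdot (X_T - X_{T-1}) \mid \mathcal F_{T-1} \,] = 0 \qquad \tilde P_T\text{-a.s.,} \]
which yields our claim (9.9).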


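Now we apply the previous argument inductively: Suppose we already have a probability measure $\tilde P_{t+1} \approx P$ with a bounded density $d\tilde P_{t+1}/dP$ such that
\[ \tilde E_{t+1}[\,|X_s|\,] < \infty \quad \text{for all } s, \]
and such that
\[ \tilde E_{t+1}[\, \xi_k \cdot (X_k - X_{k-1}) \mid \mathcal F_{k-1} \,] \le 0 \quad P\text{-a.s. for } k \ge t+1 \text{ and } \xi \in \mathcal S^\infty. \tag{9.11} \]
Then we may apply Lemma 9.15 with $P$ replaced by $\tilde P_{t+1}$, and we get some strictly positive $\tilde Z_t \in L^\infty(\Omega, \mathcal F_t, \tilde P_{t+1})$ satisfying (9.7) with $\tilde P_{t+1}$ in place of $P$. We now proceed as in the first step by defining a probability measure $\tilde P_t \approx \tilde P_{t+1} \approx P$ as
\[ \frac{d\tilde P_t}{d\tilde P_{t+1}} = \frac{\tilde Z_t}{\tilde E_{t+1}[\,\tilde Z_t\,]}. \]
Then $\tilde P_t$ has bounded densities with respect to both $\tilde P_{t+1}$ and $P$. In particular, $\tilde E_t[\,|X_s|\,] < \infty$ for all $s$. Moreover, the $\mathcal F_t$-measurability of $d\tilde P_t/d\tilde P_{t+1}$ implies that (9.11) is satisfied for $\tilde P_t$ replacing $\tilde P_{t+1}$. Repeating the arguments that led to (9.9) yields
\[ \tilde E_t[\, \xi_t \cdot (X_t - X_{t-1}) \mid \mathcal F_{t-1} \,] \le 0 \quad \text{for all } \xi \in \mathcal S^\infty. \]
After $T$ steps, we arrive at the desired measure $\tilde P := \tilde P_1 \in \mathcal P_{\mathcal S}$.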

9.2

Uniform Doob decomposition

The goal of this section is to characterize those non-negative adapted processes $U$ which can be decomposed as
\[ U_t = U_0 + \sum_{k=1}^t \xi_k \cdot (X_k - X_{k-1}) - B_t, \tag{9.12} \]
where the predictable $d$-dimensional process $\xi$ belongs to $\mathcal S$, and where $B$ is an adapted and increasing process such that $B_0 = 0$. In the unconstrained case where $\mathcal S$ consists of all strategies, we have seen in Section 7.2 that such a decomposition exists if and only if $U$ is a supermartingale under each equivalent martingale measure $P^* \in \mathcal P$. In our present context, a first guess might be that the role of $\mathcal P$ is now played by $\mathcal P_{\mathcal S}$. Since each value process of a strategy in $\overline{\mathcal S}$ is a local $\tilde P$-supermartingale for each $\tilde P \in \mathcal P_{\mathcal S}$, any process $U$ which has a decomposition (9.12) is also a local $\tilde P$-supermartingale for $\tilde P \in \mathcal P_{\mathcal S}$. Thus, one might suspect that the latter property would also be sufficient for the existence of a decomposition (9.12). This, however, is not the case, as is illustrated by the following simple example.
Example 9.16. Consider a one-period market model with the riskless bond