Backward Stochastic Differential Equations in Financial Mathematics

arXiv:2312.06690v1 [math.PR] 9 Dec 2023
Weiye Yang
Contents
0 Introduction
1 Backward Stochastic Differential Equations
1.1 Preliminary notation
1.2 Basic definitions
1.3 Existence and uniqueness of solutions
1.4 LBSDEs and the comparison theorem
1.4.1 Linear backward stochastic differential equations
1.4.2 The comparison theorem
1.5 Supersolutions
2 European claims in dynamically complete markets
2.1 Basic definitions
2.2 Hedging claims
3 Concave BSDEs and applications
3.1 Extrema of standard data
3.2 Concave drivers
3.3 Application: European claims in convex markets
4 Utility theory in incomplete markets
4.1 Preliminaries
4.2 Logarithmic utility
References
0 Introduction
A backward stochastic differential equation (BSDE) is an SDE of the form
\[ -dY_t = f(t, Y_t, Z_t)\,dt - Z_t^*\,dW_t; \qquad Y_T = \xi. \]
It differs from a forward stochastic differential equation (FSDE) in two main aspects:
• A BSDE has a terminal condition YT = ξ, as opposed to FSDEs, which have initial condi-
tions.
• A solution of a BSDE consists of a pair of processes (Y, Z) satisfying the equation.
The subject of BSDEs has seen extensive attention since their introduction in the linear case by
Bismut (1973) and in the general case by Pardoux and Peng (1990). In contrast with deterministic
differential equations, it is not enough to simply reverse the direction of time and treat the
terminal condition as an initial condition, as we would then run into problems with adaptedness.
Intuitively, our “knowledge” at time t consists only of what has happened at all times s ∈ [0, t],
and we cannot reverse the direction of time whilst keeping this true. In this way, the theory of
BSDEs is very much its own beast, separate from the theory of FSDEs.
BSDEs have a number of applications in mathematical finance. For example, we can see
immediately that the problem of finding the time-t price of a European contingent claim expiring
at some fixed future time T ≥ t is exactly the problem of solving a linear BSDE. Additionally
there is an intimate link between BSDEs and partial differential equations, which allows one to
express the solution of a BSDE, and thus the price of a European contingent claim, in terms of
the solution of a related PDE. This gives a generalisation of the Black-Scholes formula. BSDEs
also act as the “Euler-Lagrange equations” in a number of utility maximisation problems, in the
sense that the optimal value of such problems can be found through solving the related BSDE.
The layout of this essay is as follows: In Section 1 we introduce BSDEs and go over the basic
results of BSDE theory, including two major theorems: the existence and uniqueness of solutions
and the comparison theorem. We also introduce linear BSDEs and the notion of supersolutions
of a BSDE. In Section 2 we set up the financial framework in which we will price European
contingent claims, and prove a result about the fair price of such claims in a dynamically complete
market. In Section 3 we extend the theory of linear BSDEs to include concave BSDEs, and apply
this to pricing claims in more complicated market models. In Section 4 we take a look at utility
maximisation problems, and see how utilising BSDE theory allows for a relatively simple and
neat solution in certain cases.
All proofs, unless stated otherwise, are my own.
1 Backward Stochastic Differential Equations
1.1 Preliminary notation
Before we start talking about SDEs we need a probability space on which to work. Here is our
setup. Fix a finite time horizon T > 0, and then:
• Let (Ω, F , P) be a probability space in which we have an n-dimensional Brownian motion
W.
• Let F = (Ft )t∈[0,T ] be the augmented (i.e. right-continuous and containing P-null sets)
filtration generated by W up to time T .
Note that this means that all solutions to SDEs in this setup will be strong solutions. It also
allows us to use the martingale representation theorem freely. We now define some notation.
• For vectors a, b ∈ Rd , let |a| denote the standard Euclidean norm and a · b the standard
Euclidean inner product.
• For a matrix A ∈ Rn×d , let A∗ ∈ Rd×n denote its transpose and |A| its Frobenius norm,
i.e. |A|2 = tr(AA∗ ).
• For continuous semimartingales $X = (X_t)_{t\in[0,T]}$, $Y = (Y_t)_{t\in[0,T]}$ taking values in $\mathbb{R}$, $\mathbb{R}^d$ respectively, let $\langle X\rangle = (\langle X\rangle_t)_{t\in[0,T]}$ denote the quadratic variation and $\langle X, Y\rangle = (\langle X, Y\rangle_t)_{t\in[0,T]}$ the $\mathbb{R}^d$-valued quadratic covariation, whose $i$-th component $\langle X, Y\rangle^i = \langle X, Y^i\rangle$ is the standard $\mathbb{R}$-valued quadratic covariation.
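• Let $L^{2,d}_T$ be the space of square-integrable $\mathcal{F}_T$-measurable random variables taking values in $\mathbb{R}^d$, i.e. the space of $\mathcal{F}_T$-measurable $X : \Omega \to \mathbb{R}^d$ such that $\|X\|^2 := \mathbb{E}\left[|X|^2\right] < \infty$.
• Let $H^{2,d}_T$ be the space of predictable processes $\varphi : \Omega \times [0,T] \to \mathbb{R}^d$ such that $\|\varphi\|^2 := \mathbb{E}\left[\int_0^T |\varphi_t|^2\,dt\right] < \infty$. We call these processes square-integrable.
• Let $H^{1,d}_T$ be the space of predictable processes $\varphi : \Omega \times [0,T] \to \mathbb{R}^d$ such that $\mathbb{E}\left[\sqrt{\int_0^T |\varphi_t|^2\,dt}\right] < \infty$.
• For $\beta > 0$ and $\varphi \in H^{2,d}_T$, define $\|\varphi\|_\beta^2 = \mathbb{E}\left[\int_0^T e^{\beta t}|\varphi_t|^2\,dt\right]$ and let $H^{2,d}_{T,\beta}$ be the space $H^{2,d}_T$ endowed with this norm.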
Remark 1.1. Recall that $L^{2,d}_T$ and $H^{2,d}_T$ are Hilbert spaces. This implies by equivalence of norms that $H^{2,d}_{T,\beta}$ is also a Hilbert space for any $\beta > 0$.
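The equivalence of norms invoked in Remark 1.1 can be made explicit. The following short computation is a sketch that uses only the definitions of the two norms given above.

```latex
Since $1 \le e^{\beta t} \le e^{\beta T}$ for all $t \in [0,T]$ and $\beta > 0$,
\[
  \|\varphi\|^2
  = \mathbb{E}\left[\int_0^T |\varphi_t|^2\,dt\right]
  \le \mathbb{E}\left[\int_0^T e^{\beta t}|\varphi_t|^2\,dt\right]
  = \|\varphi\|_\beta^2
  \le e^{\beta T}\,\|\varphi\|^2 .
\]
```

Hence the two norms have the same Cauchy sequences and the same limits, so completeness of $H^{2,d}_T$ carries over to $H^{2,d}_{T,\beta}$.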
1.2 Basic definitions

Definition 1.2. A backward stochastic differential equation (or BSDE for short) is a stochastic differential equation of the form
\[ -dY_t = f(t, Y_t, Z_t)\,dt - Z_t^*\,dW_t; \qquad Y_T = \xi, \tag{1.1} \]
or equivalently
\[ Y_t = \xi + \int_t^T f(s, Y_s, Z_s)\,ds - \int_t^T Z_s^*\,dW_s, \tag{1.2} \]
where
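1. the random variable $\xi : \Omega \to \mathbb{R}^d$ is $\mathcal{F}_T$-measurable,
2. the random function $f : \Omega \times [0,T] \times \mathbb{R}^d \times \mathbb{R}^{n\times d} \to \mathbb{R}^d$ (known as the driver) is $\mathcal{P} \otimes \mathcal{B}^d \otimes \mathcal{B}^{n\times d}$-measurable, where $\mathcal{P}$ is the predictable σ-algebra over $\Omega \times [0,T]$.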
The pair (f, ξ) is known as the data of the BSDE.
Notice that there are two unknown processes in a BSDE: Y and Z. It is therefore worth
giving an explicit definition of what we mean by a solution of a BSDE.
Definition 1.3. A continuous solution of BSDE (1.1) is a pair $(Y, Z) = (Y_t, Z_t)_{t\in[0,T]}$ such that $Y$ is a continuous adapted $\mathbb{R}^d$-valued process and $Z$ is a predictable $\mathbb{R}^{n\times d}$-valued process with $\int_0^T |Z_t|^2\,dt < \infty$ P a.s., which satisfies the BSDE. The solution $(Y, Z)$ is square-integrable if $(Y, Z) \in H^{2,d}_T \times H^{2,n\times d}_T$.
For our first main result we will need to assume some regularity conditions on the data:
Definition 1.4. The driver $f$ is standard if $f(\cdot, 0, 0) \in H^{2,d}_T$ and $f(\omega, t, y, z)$ is uniformly Lipschitz in $(y, z)$. This latter property means that $\exists C > 0$ such that $d\mathbb{P} \otimes dt$ a.s.,
\[ |f(\cdot, y_1, z_1) - f(\cdot, y_2, z_2)| \le C\left(|y_1 - y_2| + |z_1 - z_2|\right) \quad \forall (y_1, z_1), (y_2, z_2) \in \mathbb{R}^d \times \mathbb{R}^{n\times d}. \]
If in addition $\xi \in L^{2,d}_T$, we say that the data $(f, \xi)$ are standard data.
It’s worth making the following small observation before we move on:
Proposition 1.5. Suppose that $f$ is a standard driver with Lipschitz constant $C$ and that $(y, z) \in H^{2,d}_T \times H^{2,n\times d}_T$. Then $f(\cdot, y, z) \in H^{2,d}_T$.
Proof. Directly from Definition 1.4, we have that $d\mathbb{P} \otimes dt$ a.s.,
\[ |f(\cdot, y, z) - f(\cdot, 0, 0)| \le C\left(|y| + |z|\right). \]
Hence
\[ |f(\cdot, y, z)| \le |f(\cdot, 0, 0)| + C\left(|y| + |z|\right) \]
and the result follows by squaring and integrating.
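The final "squaring and integrating" step of Proposition 1.5 can be spelled out as follows. This is a sketch using the elementary bound $(a+b+c)^2 \le 3(a^2+b^2+c^2)$, which is not stated in the text.

```latex
\[
  |f(\cdot,y,z)|^2 \le 3\left(|f(\cdot,0,0)|^2 + C^2|y|^2 + C^2|z|^2\right)
  \quad d\mathbb{P}\otimes dt \text{ a.s.},
\]
so that, taking $\mathbb{E}\int_0^T \cdot \, dt$ on both sides,
\[
  \|f(\cdot,y,z)\|^2 \le 3\left(\|f(\cdot,0,0)\|^2 + C^2\|y\|^2 + C^2\|z\|^2\right) < \infty .
\]
```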
1.3 Existence and uniqueness of solutions

The purpose of this subsection is to prove the following theorem:
Theorem 1.6 (Pardoux–Peng (1990)). If (f, ξ) are standard data, then there exists a unique
continuous square-integrable solution to BSDE (1.1).
The method of proof is similar to other existence-uniqueness theorems in SDEs: we find a
suitable complete space and a suitable mapping from that space into itself, and then invoke the
contraction mapping theorem. To this end, recall the following result from the classical theory:
Proposition 1.7 (Burkholder–Davis–Gundy inequalities). For any $p > 0$ there exist positive constants $c_p$, $C_p$ such that, for all real-valued continuous local martingales $X$ with $X_0 = 0$ and stopping times $\tau$, the following inequality holds:
\[ c_p\,\mathbb{E}\left[\langle X\rangle_\tau^{p/2}\right] \le \mathbb{E}\left[(X_\tau^*)^p\right] \le C_p\,\mathbb{E}\left[\langle X\rangle_\tau^{p/2}\right] \tag{1.3} \]
where $X_t^* = \sup_{s\le t}|X_s|$ is the maximum process of $X$.
A proof of this result can be found in Karatzas and Shreve (1991). We prove a corollary that
will be useful later:
Corollary 1.8. Let $\varphi \in H^{1,n}_T$, and define $M_t = \int_0^t \varphi_s \cdot dW_s$ for $t \in [0,T]$. Then $(M_t)_{t\in[0,T]}$ is a uniformly integrable martingale.
Proof. $M$ is a real-valued continuous local martingale with $M_0 = 0$. By the Burkholder–Davis–Gundy inequalities with $p = 1$ and $\tau = T$,
\[ \mathbb{E}\left[\sup_{s\le T}|M_s|\right] \le C_1\,\mathbb{E}\left[\langle M\rangle_T^{1/2}\right] = C_1\,\mathbb{E}\left[\left(\int_0^T |\varphi_s|^2\,ds\right)^{1/2}\right] < \infty. \]
Thus the process $M$ is dominated by the integrable random variable $\sup_{s\le T}|M_s|$. So, using dominated convergence, we find that it is a uniformly integrable martingale as required.
In order to prove the theorem we need to find a mapping with a suitable Lipschitz constant. To
do this we first need to make a few a priori estimates that control the size of the difference between
two solutions. They may look nasty but their eventual application will be very straightforward.
First, a lemma.
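Lemma 1.9. Suppose BSDE (1.1) has a square-integrable solution $(Y, Z)$ with standard data $(f, \xi)$. Then $\sup_{t\le T}|Y_t| \in L^{2,1}_T$.
Proof from El Karoui (1997). From equation (1.2), we get
\[
\begin{aligned}
|Y_t| &= \left|\xi + \int_t^T f(s, Y_s, Z_s)\,ds - \int_t^T Z_s^*\,dW_s\right| \\
&\le |\xi| + \int_t^T |f(s, Y_s, Z_s)|\,ds + \left|\int_t^T Z_s^*\,dW_s\right| \\
&\le |\xi| + \int_0^T |f(s, Y_s, Z_s)|\,ds + \sup_{t\le T}\left|\int_t^T Z_s^*\,dW_s\right|
\end{aligned}
\]
and hence
\[ \sup_{t\le T}|Y_t| \le |\xi| + \int_0^T |f(s, Y_s, Z_s)|\,ds + \sup_{t\le T}\left|\int_t^T Z_s^*\,dW_s\right|. \tag{1.4} \]
By the Burkholder–Davis–Gundy inequalities (Proposition 1.7):
\[
\begin{aligned}
\mathbb{E}\left[\sup_{t\le T}\left|\int_t^T Z_s^*\,dW_s\right|^2\right]
&\le 2\,\mathbb{E}\left[\left|\int_0^T Z_s^*\,dW_s\right|^2\right] + 2\,\mathbb{E}\left[\sup_{t\le T}\left|\int_0^t Z_s^*\,dW_s\right|^2\right] \\
&\le 4C_2\,\mathbb{E}\left[\int_0^T |Z_s|^2\,ds\right] \\
&< \infty
\end{aligned}
\]
since $Z$ is square-integrable. By Proposition 1.5, $f(\cdot, Y, Z) \in H^{2,d}_T$. Hence every term on the right-hand side of equation (1.4) is in $L^{2,1}_T$ and so $\sup_{t\le T}|Y_t| \in L^{2,1}_T$.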
Proposition 1.10 (A priori estimates). Let $((f^i, \xi^i); i = 1, 2)$ be the standard data of two BSDEs (1.1) and suppose that the BSDEs have square-integrable solutions $((Y^i, Z^i); i = 1, 2)$ respectively. Let $C$ be a Lipschitz constant for $f^1$, and set $\delta Y = Y^1 - Y^2$, $\delta Z = Z^1 - Z^2$ and $\delta_2 f = f^1(\cdot, Y^2, Z^2) - f^2(\cdot, Y^2, Z^2)$. Then for any $\beta > C(2 + C)$, the following inequalities hold:
\[ \|\delta Y\|_\beta^2 \le T\left[e^{\beta T}\,\mathbb{E}\left[|\delta Y_T|^2\right] + \frac{1}{\beta - 2C - C^2}\,\|\delta_2 f\|_\beta^2\right], \tag{1.5a} \]
\[ \|\delta Z\|_\beta^2 \le (2 + 2C^2 T)\left[e^{\beta T}\,\mathbb{E}\left[|\delta Y_T|^2\right] + \frac{1}{\beta - 2C - C^2}\,\|\delta_2 f\|_\beta^2\right]. \tag{1.5b} \]
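Proof from El Karoui (1997). From Itō's formula applied from $s = t$ to $s = T$ to the semimartingale $e^{\beta s}|\delta Y_s|^2$, we get
\[
\begin{aligned}
e^{\beta t}|\delta Y_t|^2 ={}& e^{\beta T}|\delta Y_T|^2 + 2\int_t^T e^{\beta s}\,\delta Y_s \cdot \left(f^1(s, Y_s^1, Z_s^1) - f^2(s, Y_s^2, Z_s^2)\right) ds \\
& - \beta\int_t^T e^{\beta s}|\delta Y_s|^2\,ds - \int_t^T e^{\beta s}|\delta Z_s|^2\,ds - 2\int_t^T e^{\beta s}\,\delta Y_s \cdot \delta Z_s^*\,dW_s.
\end{aligned}
\tag{1.6}
\]
Now $(e^{\beta t}\,\delta Z_t\,\delta Y_t)_t \in H^{1,n}_T$ because, using Hölder's inequality,
\[
\begin{aligned}
\mathbb{E}\left[\sqrt{\int_0^T |e^{\beta t}\,\delta Z_t\,\delta Y_t|^2\,dt}\right]
&\le \mathbb{E}\left[\sup_t |\delta Y_t|\,\sqrt{\int_0^T e^{2\beta t}|\delta Z_t|^2\,dt}\right] \\
&\le \left(\mathbb{E}\left[\sup_t |\delta Y_t|^2\right]\right)^{1/2}\left(\mathbb{E}\left[\int_0^T e^{2\beta t}|\delta Z_t|^2\,dt\right]\right)^{1/2} \\
&< \infty
\end{aligned}
\]
by Lemma 1.9 and square-integrability of $\delta Z$. Hence by Corollary 1.8, the stochastic integral term in (1.6) is a uniformly integrable martingale and thus has zero expectation. Using the Lipschitz property of the driver $f^1$ we have that
\[
\begin{aligned}
|f^1(s, Y_s^1, Z_s^1) - f^2(s, Y_s^2, Z_s^2)| &\le |f^1(s, Y_s^1, Z_s^1) - f^1(s, Y_s^2, Z_s^2)| + |\delta_2 f_s| \\
&\le C\left(|\delta Y_s| + |\delta Z_s|\right) + |\delta_2 f_s|.
\end{aligned}
\]
So by using the Lipschitz property in equation (1.6) and taking expectations we get
\[
\begin{aligned}
\mathbb{E}\left[e^{\beta t}|\delta Y_t|^2\right] \le{}& \mathbb{E}\left[e^{\beta T}|\delta Y_T|^2 + 2\int_t^T e^{\beta s}|\delta Y_s|\left(C\left(|\delta Y_s| + |\delta Z_s|\right) + |\delta_2 f_s|\right) ds\right] \\
& + \mathbb{E}\left[-\beta\int_t^T e^{\beta s}|\delta Y_s|^2\,ds - \int_t^T e^{\beta s}|\delta Z_s|^2\,ds\right].
\end{aligned}
\tag{1.7}
\]
We notice that a quadratic form appears on the right-hand side of (1.7), given by
\[ Q(y, z) = 2C|y|^2 + 2C|y||z| + 2|\delta_2 f_s||y| - \beta|y|^2 - |z|^2 \tag{1.8} \]
which we can rearrange as
\[ Q(y, z) = -\beta_C\left(|y| - \beta_C^{-1}|\delta_2 f_s|\right)^2 - \left(|z| - C|y|\right)^2 + \beta_C^{-1}|\delta_2 f_s|^2 \tag{1.9} \]
where $\beta_C := \beta - 2C - C^2 > 0$ by assumption. Hence (1.7) becomes
\[
\begin{aligned}
& \mathbb{E}\left[e^{\beta t}|\delta Y_t|^2\right] + \mathbb{E}\left[\beta_C\int_t^T e^{\beta s}\left(|\delta Y_s| - \beta_C^{-1}|\delta_2 f_s|\right)^2 ds + \int_t^T e^{\beta s}\left(|\delta Z_s| - C|\delta Y_s|\right)^2 ds\right] \\
&\qquad \le \mathbb{E}\left[e^{\beta T}|\delta Y_T|^2\right] + \mathbb{E}\left[\int_t^T e^{\beta s}\,\frac{|\delta_2 f_s|^2}{\beta_C}\,ds\right].
\end{aligned}
\]
The target inequality (1.5a) for $\delta Y$ follows directly from the above inequality by integrating between $0$ and $T$. For $\delta Z$, notice $|\delta Z_s|^2 \le 2\left(|\delta Z_s| - C|\delta Y_s|\right)^2 + 2C^2|\delta Y_s|^2$ so
\[ \|\delta Z\|_\beta^2 \le 2\,\mathbb{E}\left[\int_0^T e^{\beta s}\left(|\delta Z_s| - C|\delta Y_s|\right)^2 ds\right] + 2C^2\|\delta Y\|_\beta^2 \]
and the result follows.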
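The rearrangement of the quadratic form (1.8) into (1.9) can be checked by expanding the squares. The sketch below writes $a := |\delta_2 f_s|$ as shorthand (a label introduced here, not in the text) and uses $\beta_C = \beta - 2C - C^2$, so that $\beta_C + C^2 = \beta - 2C$.

```latex
\begin{align*}
 & -\beta_C\bigl(|y| - \beta_C^{-1}a\bigr)^2 - \bigl(|z| - C|y|\bigr)^2 + \beta_C^{-1}a^2 \\
 &\quad = -\beta_C|y|^2 + 2a|y| - \beta_C^{-1}a^2 - |z|^2 + 2C|y||z| - C^2|y|^2 + \beta_C^{-1}a^2 \\
 &\quad = -(\beta_C + C^2)|y|^2 + 2C|y||z| + 2a|y| - |z|^2 \\
 &\quad = 2C|y|^2 + 2C|y||z| + 2a|y| - \beta|y|^2 - |z|^2 = Q(y,z).
\end{align*}
```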

The proof of the main theorem is an immediate consequence of the previous proposition.
Proof of Theorem 1.6. As may be evident from our previous work, the main task of this proof is to construct a contraction mapping on $H^{2,d}_{T,\beta} \times H^{2,n\times d}_{T,\beta}$, for some $\beta > 0$ to be specified later.
Fix $(y, z) \in H^{2,d}_T \times H^{2,n\times d}_T$ and let $M$ be the continuous version of the square-integrable martingale $\mathbb{E}\left[\int_0^T f(s, y_s, z_s)\,ds + \xi \,\middle|\, \mathcal{F}_t\right]$. By the martingale representation theorem (see Karatzas and Shreve (1991)) there exists a unique square-integrable process $Z \in H^{2,n\times d}_T$ for which $M_t = M_0 + \int_0^t Z_s^*\,dW_s$. Then define $Y$ by $Y_t = M_t - \int_0^t f(s, y_s, z_s)\,ds$, so that $Y \in H^{2,d}_T$ by Proposition 1.5, and $Y$ is continuous. Notice that the pair $(Y, Z)$ thus defined satisfies the equation
\[ -dY_t = f(t, y_t, z_t)\,dt - Z_t^*\,dW_t; \qquad Y_T = \xi. \]
Hence let $\Psi : H^{2,d}_T \times H^{2,n\times d}_T \to H^{2,d}_T \times H^{2,n\times d}_T$ be the map that sends $(y, z) \mapsto (Y, Z)$ as above.
Now let $(y^1, z^1)$, $(y^2, z^2)$ be two elements of $H^{2,d}_{T,\beta} \times H^{2,n\times d}_{T,\beta}$, and let $(Y^i, Z^i) = \Psi(y^i, z^i)$ for $i = 1, 2$. We seek to apply Proposition 1.10. Note that there is some potential for confusion here: in the statement of the proposition we have to set $f^1 = f(s, y_s^1, z_s^1)$, which actually has no explicit dependence on $(Y^1, Z^1)$, since $(y^1, z^1)$ and $(Y^1, Z^1)$ are different processes. Hence we can set the “Lipschitz constant” in the proposition to be 0. Also $\delta Y_T = \xi - \xi = 0$ and $\delta_2 f_t = f(t, y_t^1, z_t^1) - f(t, y_t^2, z_t^2)$ so we get the inequalities
\[ \|\delta Y\|_\beta^2 \le \frac{T}{\beta}\,\mathbb{E}\left[\int_0^T e^{\beta s}\,|f(s, y_s^1, z_s^1) - f(s, y_s^2, z_s^2)|^2\,ds\right], \tag{1.10} \]
\[ \|\delta Z\|_\beta^2 \le \frac{2}{\beta}\,\mathbb{E}\left[\int_0^T e^{\beta s}\,|f(s, y_s^1, z_s^1) - f(s, y_s^2, z_s^2)|^2\,ds\right]. \tag{1.11} \]
Now $f(\omega, t, y, z)$ is uniformly Lipschitz in $(y, z)$ with constant $C$ so
\[ \|\delta Y\|_\beta^2 + \|\delta Z\|_\beta^2 \le \frac{2(2 + T)C^2}{\beta}\left(\|\delta y\|_\beta^2 + \|\delta z\|_\beta^2\right) \tag{1.12} \]
and therefore, if we pick $\beta$ such that $\beta > 2(2 + T)C^2$, $\Psi$ is a contraction on the space $H^{2,d}_{T,\beta} \times H^{2,n\times d}_{T,\beta}$. Hence by the contraction mapping theorem there exists a unique fixed point of $\Psi$, which is the unique continuous square-integrable solution of the BSDE.
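The fixed-point construction in the proof can be illustrated numerically in a deliberately degenerate case. The sketch below assumes deterministic data (a constant terminal value and a driver depending only on time and the first unknown), so that the second unknown vanishes and the map Ψ reduces to sending a path y to ξ plus the integral of f(s, y_s) over [t, T]. The names `picard_step` and `picard_solve`, the time grid, and the trapezoidal quadrature are illustrative choices, not part of the text.

```python
import math


def picard_step(y, xi, driver, T):
    """Apply the (deterministic) map Psi once: Y_t = xi + int_t^T driver(s, y_s) ds.

    `y` holds the current iterate on a uniform grid of len(y) points on [0, T];
    the integral is approximated with the trapezoidal rule, backwards from T.
    """
    n = len(y) - 1
    h = T / n
    new_y = [0.0] * (n + 1)
    new_y[n] = xi  # terminal condition Y_T = xi
    for i in range(n - 1, -1, -1):
        f_left = driver(i * h, y[i])
        f_right = driver((i + 1) * h, y[i + 1])
        new_y[i] = new_y[i + 1] + 0.5 * h * (f_left + f_right)
    return new_y


def picard_solve(xi, driver, T, n=1000, iterations=30):
    """Iterate Psi starting from the zero path and return the final iterate."""
    y = [0.0] * (n + 1)
    for _ in range(iterations):
        y = picard_step(y, xi, driver, T)
    return y


if __name__ == "__main__":
    # Linear driver f(t, y) = 0.5 * y with xi = 1: exact solution is exp(0.5 * (T - t)).
    path = picard_solve(1.0, lambda t, v: 0.5 * v, 1.0)
    print("error at t = 0:", abs(path[0] - math.exp(0.5)))
```

With genuinely random data each application of Ψ also requires a conditional expectation and a martingale representation, which this sketch does not attempt.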
Remark 1.11. In what sense exactly is our solution “unique”? What we have proven above is this: The BSDE (1.1) with standard data has a continuous square-integrable solution, denoted by $(Y^1, Z^1)$, say. If $(Y^2, Z^2)$ is another square-integrable solution to the BSDE, then
\[ \|Y^1 - Y^2\|^2 \equiv \mathbb{E}\left[\int_0^T |Y_t^1 - Y_t^2|^2\,dt\right] = 0, \tag{1.13} \]
which is equivalent to the statement $Y^1 = Y^2$ $d\mathbb{P} \otimes dt$ a.s. A similar statement holds for $Z^1$ and $Z^2$.
We can, however, do better than this if $Y^2$ is also continuous. Equation (1.13) is equivalent to
\[ \int_0^T |Y_t^1 - Y_t^2|^2\,dt = 0 \quad \mathbb{P} \text{ a.s.}, \]
which means that there exists an almost sure set $\Omega_0 \subseteq \Omega$ on which
\[ \int_0^T |Y_t^1(\omega) - Y_t^2(\omega)|^2\,dt = 0 \quad \forall \omega \in \Omega_0, \tag{1.14} \]
and on which $|Y_t^1(\omega) - Y_t^2(\omega)|$ is continuous in $t$. Fix an $\omega \in \Omega_0$. The integrand in (1.14) above is non-negative and continuous in $t$, and its integral is zero. It therefore must be zero everywhere.
From this we get the following stronger formulation of uniqueness for continuous solutions:
Corollary 1.12. Let $(Y^1, Z^1)$ and $(Y^2, Z^2)$ be continuous square-integrable solutions of the BSDE (1.1) with standard data. Then
\[ Y_t^1 = Y_t^2 \quad \forall t \in [0, T] \quad \mathbb{P} \text{ a.s.} \tag{1.15} \]
Equivalently we write $Y^1 = Y^2$ P a.s. for the above statement. With the main proof complete,
we now give an example illustrating that the condition that the solution be square-integrable is
necessary for uniqueness to hold:
Example 1.13 (From El Karoui (1997)). From Dudley (1977) it is possible, for any finite time horizon $T > 0$, to construct an $\mathbb{R}^d$-valued stochastic integral $I_t := I_0 + \int_0^t \psi_s^*\,dW_s$ such that $I_0 = \mathbf{1}$ the vector of ones, $I_T = \mathbf{0}$ the vector of zeros, and $\int_0^T \|\psi_s\|^2\,ds < \infty$ P a.s. Note that we cannot have $\psi \in H^{2,n\times d}_T$ because this would imply that $I$ is a martingale, which it cannot be. Now let the BSDE (1.1) have standard data $(0, \xi)$ and let $(Y, Z)$ be its square-integrable solution, so that $Y_t = \mathbb{E}[\xi \mid \mathcal{F}_t]$ and $Z$ is given by the martingale representation theorem. Then for any $\lambda \in \mathbb{R}$, $(Y + \lambda I, Z + \lambda\psi)$ is also a solution of this BSDE.
Note that henceforth, when we refer to “the solution” of a BSDE with standard data, we
mean its unique continuous square-integrable solution. The rest of this section will be devoted
to proving a few select results about the behaviour of BSDEs.
1.4 LBSDEs and the comparison theorem

In this subsection we investigate a special class of BSDEs, from which we then prove a very
important and powerful result about general BSDEs.
1.4.1 Linear backward stochastic differential equations
Definition 1.14. A linear backward stochastic differential equation (or LBSDE for short) is a BSDE (1.1) of the form
\[ -dY_t = (\varphi_t + Y_t\beta_t + Z_t^*\gamma_t)\,dt - Z_t^*\,dW_t; \qquad Y_T = \xi, \tag{1.16} \]
where $\varphi$, $\beta$ and $\gamma$ are progressively measurable $\mathbb{R}^d$-, $\mathbb{R}$- and $\mathbb{R}^n$-valued processes respectively.

LBSDEs are well known in mathematical finance, as the classical problem of pricing a European contingent claim takes the form of an LBSDE, which we will deal with in the next section. For now, some notation: for a continuous semimartingale $M$, we write $\mathcal{E}(M)$ for its stochastic exponential, i.e.
\[ \mathcal{E}(M) := e^{M - \frac{1}{2}\langle M\rangle}. \tag{1.17} \]
We have that $d\mathcal{E}(M)_t = \mathcal{E}(M)_t\,dM_t$. Evidently if $M$ is a local martingale then $\mathcal{E}(M)$ is also a local martingale. Recall also the following result from the classical theory:
Proposition 1.15 (Novikov's condition). Suppose that $M$ is a continuous local martingale and that
\[ \mathbb{E}\left[e^{\frac{1}{2}\langle M\rangle_\infty}\right] < \infty. \tag{1.18} \]
Then $\mathcal{E}(M)$ is a uniformly integrable martingale.

The proof can be found in Karatzas and Shreve (1991). Note that, due to our finite time horizon, we can replace the $\langle M\rangle_\infty$ in the condition by $\langle M\rangle_T$ whenever we want to use it.
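The identity $d\mathcal{E}(M)_t = \mathcal{E}(M)_t\,dM_t$ stated after (1.17) follows from Itō's formula. The short derivation below is a sketch; the auxiliary process $X$ is a label introduced here for convenience.

```latex
Write $X := M - \tfrac{1}{2}\langle M\rangle$, so that $\mathcal{E}(M) = e^{X}$.
Since $\langle M\rangle$ has finite variation, $\langle X\rangle = \langle M\rangle$, and
\begin{align*}
  d\,e^{X_t}
  &= e^{X_t}\,dX_t + \tfrac{1}{2}\,e^{X_t}\,d\langle X\rangle_t \\
  &= e^{X_t}\left(dM_t - \tfrac{1}{2}\,d\langle M\rangle_t\right) + \tfrac{1}{2}\,e^{X_t}\,d\langle M\rangle_t \\
  &= \mathcal{E}(M)_t\,dM_t .
\end{align*}
```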
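Definition 1.16. Consider the LBSDE (1.16) in the case that $\beta$ and $\gamma$ are bounded. The adjoint process of the LBSDE is then the $\mathbb{R}$-valued positive process
\[ \Gamma := \mathcal{E}\left(\int \beta_s\,ds + \int \gamma_s \cdot dW_s\right), \tag{1.19} \]
so that $\Gamma$ satisfies the FSDE
\[ d\Gamma_t = \Gamma_t(\beta_t\,dt + \gamma_t \cdot dW_t); \qquad \Gamma_0 = 1. \tag{1.20} \]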
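Lemma 1.17 (Properties of the adjoint process). For $\beta$, $\gamma$ bounded and $\Gamma$ defined above,
1. $\Gamma \in H^{2,1}_T$,
2. $\sup_{t\le T}\Gamma_t \in L^{2,1}_T$.
Proof.
\[ \Gamma_t = \exp\left(\int_0^t \left(\beta_s - \tfrac{1}{2}|\gamma_s|^2\right) ds + \int_0^t \gamma_s \cdot dW_s\right) \]
so
\[
\begin{aligned}
(\Gamma_t)^2 &= \exp\left(\int_0^t \left(2\beta_s - |\gamma_s|^2\right) ds + \int_0^t 2\gamma_s \cdot dW_s\right) \\
&= \exp\left(\int_0^t \left(2\beta_s + |\gamma_s|^2\right) ds\right)\,\mathcal{E}\left(\int 2\gamma_s \cdot dW_s\right)_t.
\end{aligned}
\]
By Novikov's condition and the boundedness of $\gamma$, $\mathcal{E}\left(\int 2\gamma_s \cdot dW_s\right)$ is a uniformly integrable martingale. Let $c \ge 0$ be an upper bound for the bounded process $|2\beta + |\gamma|^2|$. Then
\[
\begin{aligned}
\mathbb{E}\left[(\Gamma_t)^2\right] &\le \mathbb{E}\left[\exp\left\{\int_0^T \left|2\beta_s + |\gamma_s|^2\right| ds\right\}\,\mathcal{E}\left(\int 2\gamma_s \cdot dW_s\right)_t\right] \\
&\le e^{cT}\,\mathbb{E}\left[\mathcal{E}\left(\int 2\gamma_s \cdot dW_s\right)_t\right] \\
&= e^{cT}.
\end{aligned}
\]
So $\|\Gamma\|^2 \le T e^{cT}$. This proves the first part of the lemma.

Now $d\Gamma_t = \Gamma_t(\beta_t\,dt + \gamma_t \cdot dW_t)$ and $\Gamma_0 = 1$ so
\[ \Gamma_t = 1 + \int_0^t \Gamma_s\beta_s\,ds + \int_0^t \Gamma_s\gamma_s \cdot dW_s. \]
Observe that, since $\Gamma \in H^{2,1}_T$, we have that $\Gamma\beta \in H^{2,1}_T$ and $\Gamma\gamma \in H^{2,n}_T$ by boundedness. The proof of the second part of this lemma is essentially identical to the method of proving Lemma 1.9.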
The next result shows that we can find an explicit formula for the solutions of LBSDEs that
have adjoint processes. It is a slightly modified result of El Karoui (1997):
Proposition 1.18. Consider the LBSDE (1.16) where $\beta$, $\gamma$ are bounded, $\varphi \in H^{2,d}_T$ and $\xi \in L^{2,d}_T$. Let $\Gamma$ be the adjoint process of the LBSDE. Then the LBSDE has a unique continuous square-integrable solution $(Y, Z)$ where $Y$ is given explicitly by
\[ Y_t = \frac{1}{\Gamma_t}\,\mathbb{E}\left[\xi\Gamma_T + \int_t^T \Gamma_s\varphi_s\,ds \,\middle|\, \mathcal{F}_t\right]. \tag{1.21} \]
Proof. Notice that with the conditions given, the LBSDE is a BSDE with standard data. So by Theorem 1.6, it has a unique continuous square-integrable solution $(Y, Z)$. Define
\[ M_t = \Gamma_t Y_t + \int_0^t \Gamma_s\varphi_s\,ds. \tag{1.22} \]
By Itō's formula,
\[
\begin{aligned}
dM_t &= \Gamma_t\,dY_t + Y_t\,d\Gamma_t + d\langle\Gamma, Y\rangle_t + \Gamma_t\varphi_t\,dt \\
&= \Gamma_t\left(-(\varphi_t + Y_t\beta_t + Z_t^*\gamma_t)\,dt + Z_t^*\,dW_t\right) + Y_t\Gamma_t(\beta_t\,dt + \gamma_t \cdot dW_t) \\
&\quad + \Gamma_t Z_t^*\gamma_t\,dt + \Gamma_t\varphi_t\,dt.
\end{aligned}
\]
All the drift terms cancel and we are left with
\[ M_t = Y_0 + \int_0^t \Gamma_s Y_s\gamma_s \cdot dW_s + \int_0^t \Gamma_s Z_s^*\,dW_s \tag{1.23} \]
which is a local martingale.

Now by Lemmas 1.9 and 1.17, we have that $\sup_{t\le T}|Y_t| \in L^{2,1}_T$, $\sup_{t\le T}\Gamma_t \in L^{2,1}_T$. Hence $\sup_{t\le T}|Y_t| \times \sup_{t\le T}\Gamma_t$ is P-integrable by the Cauchy–Schwarz inequality. Since $\gamma$ is bounded, we therefore have that $\Gamma Y\gamma^* \in H^{1,n\times d}_T$. Two applications of Hölder's inequality (see proof of Proposition 1.10) also show us that $\Gamma Z \in H^{1,n\times d}_T$. Therefore by Corollary 1.8, $M$ is a uniformly integrable martingale. In particular for $t \in [0, T]$ we have $M_t = \mathbb{E}[M_T \mid \mathcal{F}_t]$ from which, recalling the definition of $M$, we get
\[ \Gamma_t Y_t = \mathbb{E}\left[\xi\Gamma_T + \int_t^T \Gamma_s\varphi_s\,ds \,\middle|\, \mathcal{F}_t\right]. \]
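As a sanity check on the explicit formula (1.21), one can evaluate it at t = 0 in a simple one-dimensional case and compare with a known closed form. The sketch below assumes d = n = 1, constant coefficients β = −r and γ = −θ, φ = 0, and a terminal value ξ = max(S_T − K, 0) where S is a geometric Brownian motion with drift b and volatility σ and θ = (b − r)/σ. These modelling choices, the function names and the Simpson quadrature are illustrative assumptions, not taken from the text. Under them, E[ξ Γ_T] should agree with the Black–Scholes call price.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * normal_cdf(d1) - k * math.exp(-r * t) * normal_cdf(d2)


def adjoint_price(s0, k, r, b, sigma, t, n=20000, width=12.0):
    """Y_0 = E[xi * Gamma_T], integrating over W_T ~ N(0, t) with Simpson's rule.

    Gamma_T = exp(-(r + theta^2 / 2) t - theta W_T) is the adjoint process
    for beta = -r, gamma = -theta; n must be even.
    """
    theta = (b - r) / sigma
    lo, hi = -width * math.sqrt(t), width * math.sqrt(t)
    h = (hi - lo) / n

    def integrand(w):
        gamma_t = math.exp(-(r + 0.5 * theta ** 2) * t - theta * w)
        s_t = s0 * math.exp((b - 0.5 * sigma ** 2) * t + sigma * w)
        density = math.exp(-w * w / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        return gamma_t * max(s_t - k, 0.0) * density

    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(lo + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    print(adjoint_price(100.0, 100.0, 0.05, 0.08, 0.2, 1.0))
    print(black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0))
```

Note that the result does not depend on the drift b: changing b changes both S_T and θ, and the two effects cancel inside the expectation.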
The following are immediate consequences of the above proposition:

Corollary 1.19. Let d = 1.
1. If ξ is P a.s. non-negative and ϕ is dP⊗dt a.s. non-negative, then Y is P a.s. non-negative.
2. Suppose ξ is P a.s. non-negative and ϕ is dP ⊗ dt a.s. non-negative, and additionally
Yτ = 0 P a.s. for some constant τ ∈ [0, T ].
Then ξ = 0 P a.s., ϕ1[τ,T ] = 0 dP ⊗ dt a.s. and ∀t ∈ [τ, T ], Yt = 0 P a.s..
Remark 1.20. With $d = 1$, it turns out that for any (not necessarily square-integrable) continuous solution $(X, \zeta)$ of an LBSDE satisfying the conditions of Proposition 1.18, the first item of Corollary 1.19 will hold merely if there exists a positive $B \in L^{2,1}_T$ such that $X_t \ge -B$ $\forall t \in [0, T]$ P a.s. We can see this in the following way: with $\xi$, $\varphi$ non-negative, define $M_t = \Gamma_t X_t + \int_0^t \Gamma_s\varphi_s\,ds$ as before. This is still a local martingale, and is bounded from below by the P-integrable (using Lemma 1.17 and the Cauchy–Schwarz inequality) random variable $-B\sup_{t\le T}\Gamma_t$. Hence $M$ is a supermartingale, and $M_t \ge \mathbb{E}[M_T \mid \mathcal{F}_t]$ so
\[ \Gamma_t X_t \ge \mathbb{E}\left[\xi\Gamma_T + \int_t^T \Gamma_s\varphi_s\,ds \,\middle|\, \mathcal{F}_t\right] \ge 0. \]
Definition 1.21. Sometimes it is convenient to define a family of adjoint processes $(\Gamma^s : s \in [0, T])$, where for $t \in [s, T]$,
\[ \Gamma_t^s := \mathcal{E}\left(\int_s^\cdot \beta_u\,du + \int_s^\cdot \gamma_u \cdot dW_u\right)_t. \]
This means that $d\Gamma_t^s = \Gamma_t^s(\beta_t\,dt + \gamma_t \cdot dW_t)$; $\Gamma_s^s = 1$.

Remark 1.22. Notice that the original definition of the adjoint process $\Gamma$ coincides with $\Gamma^0$. Also notice that for any $0 \le s \le t \le T$,
\[ \frac{\Gamma_t}{\Gamma_s} = \Gamma_t^s. \]
This means that the explicit formula for $Y$ in Proposition 1.18 can be written as
\[ Y_t = \mathbb{E}\left[\xi\Gamma_T^t + \int_t^T \Gamma_s^t\varphi_s\,ds \,\middle|\, \mathcal{F}_t\right]. \tag{1.24} \]
1.4.2 The comparison theorem
This is just a corollary of Proposition 1.18, but is, somewhat miraculously, a result that extends
to all BSDEs with standard data. It can be interpreted as an analogue of the maximum principle
in PDE theory.
Theorem 1.23 (Peng (1992)). Let $d = 1$. Let $(f^1, \xi^1)$, $(f^2, \xi^2)$ be two standard data of BSDEs (1.1) with associated continuous square-integrable solutions $(Y^1, Z^1)$, $(Y^2, Z^2)$. Suppose that the following hold:
1. $\xi^1 \ge \xi^2$ P a.s.,
2. $\delta_2 f := f^1(\cdot, Y^2, Z^2) - f^2(\cdot, Y^2, Z^2) \ge 0$ $d\mathbb{P} \otimes dt$ a.s.
Then $Y^1 \ge Y^2$ P a.s.
Moreover, this comparison is strict, i.e. on the event $Y_t^1 = Y_t^2$, we have $\xi^1 = \xi^2$, $\delta_2 f_s = 0$ and $Y_s^1 = Y_s^2$ a.s. for all $s \ge t$.
Proof. For simplicity we assume $n = 1$. As in Proposition 1.10, let $\delta Y = Y^1 - Y^2$ and $\delta Z = Z^1 - Z^2$. We have that
\[ -d\delta Y_t = \left(f^1(t, Y_t^1, Z_t^1) - f^2(t, Y_t^2, Z_t^2)\right) dt - \delta Z_t\,dW_t, \]
so if we let
\[
\Delta_y f^1(t) =
\begin{cases}
\dfrac{f^1(t, Y_t^1, Z_t^1) - f^1(t, Y_t^2, Z_t^1)}{Y_t^1 - Y_t^2} & \text{if } Y_t^1 - Y_t^2 \ne 0, \\
0 & \text{otherwise,}
\end{cases}
\]
\[
\Delta_z f^1(t) =
\begin{cases}
\dfrac{f^1(t, Y_t^2, Z_t^1) - f^1(t, Y_t^2, Z_t^2)}{Z_t^1 - Z_t^2} & \text{if } Z_t^1 - Z_t^2 \ne 0, \\
0 & \text{otherwise,}
\end{cases}
\]
then $(\delta Y, \delta Z)$ is the solution of the LBSDE
\[ -d\delta Y_t = \left(\Delta_y f^1(t)\,\delta Y_t + \Delta_z f^1(t)\,\delta Z_t + \delta_2 f_t\right) dt - \delta Z_t\,dW_t; \qquad \delta Y_T = \xi^1 - \xi^2. \tag{1.25} \]
Observe that, since the driver $f^1$ is uniformly Lipschitz, $\Delta_y f^1$ and $\Delta_z f^1$ are bounded processes. This LBSDE therefore satisfies the conditions of Proposition 1.18. The theorem then follows from Corollary 1.19.
The argument for $n > 1$ is similar, but some additional care has to be taken in defining $\Delta_z f^1(t)$.
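The linearisation used in the proof rests on a telescoping decomposition of the difference of drivers. The following sketch writes it out for $n = 1$, using only the quantities defined in the proof.

```latex
\begin{align*}
  f^1(t,Y^1_t,Z^1_t) - f^2(t,Y^2_t,Z^2_t)
  &= \bigl[f^1(t,Y^1_t,Z^1_t) - f^1(t,Y^2_t,Z^1_t)\bigr]
   + \bigl[f^1(t,Y^2_t,Z^1_t) - f^1(t,Y^2_t,Z^2_t)\bigr] \\
  &\quad + \bigl[f^1(t,Y^2_t,Z^2_t) - f^2(t,Y^2_t,Z^2_t)\bigr] \\
  &= \Delta_y f^1(t)\,\delta Y_t + \Delta_z f^1(t)\,\delta Z_t + \delta_2 f_t .
\end{align*}
By the Lipschitz property of $f^1$ with constant $C$, each difference quotient satisfies
$|\Delta_y f^1(t)| \le C$ and $|\Delta_z f^1(t)| \le C$.
```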
If we set (f 2 , ξ 2 ) to zero, we get a sufficient condition for non-negativity:
Corollary 1.24. Let (f, ξ) be the standard data of a BSDE (1.1) with continuous square-
integrable solution (Y, Z), and suppose that ξ ≥ 0 P a.s. and f (·, 0, 0) ≥ 0 dP ⊗ dt a.s.. Then
Y ≥ 0 P a.s..
Remark 1.25. By Corollary 1.19 and Remark 1.20 we may relax the assumptions of square-integrability of solutions in the comparison theorem, and merely assume that there exists a positive $B \in L^{2,1}_T$ such that $Y_t^1 - Y_t^2 \ge -B$ for all $t \in [0, T]$, P a.s. We still find that $Y^1 \ge Y^2$ P a.s.
Related to this, we get the following extremely useful Corollary:
Corollary 1.26. Let $d = 1$, and let $(Y^1, Z^1)$ be any continuous solution of a BSDE (1.1) with standard data $(f, \xi)$. Suppose there exists a positive $B \in L^{2,1}_T$ such that $Y_t^1 \ge -B$ $\forall t \in [0, T]$ P a.s. Let $(Y^2, Z^2)$ be the unique continuous square-integrable solution of the BSDE. Then $Y^1 \ge Y^2$ P a.s.
Proof. The two solutions solve the same BSDE, so they satisfy conditions 1 and 2 of the comparison theorem. Additionally, using Lemma 1.9,
\[ Y_t^1 - Y_t^2 \ge -\left(B + \sup_{t\le T}|Y_t^2|\right) \in L^{2,1}_T \]
for all $t \in [0, T]$, P a.s. The result follows from the previous remark.
1.5 Supersolutions
In this subsection we assume d = 1.
Definition 1.27. A continuous supersolution of BSDE (1.1) is a triple $(Y, Z, C) = (Y_t, Z_t, C_t)_{t\in[0,T]}$ of processes such that:
• $Y$ is a continuous adapted $\mathbb{R}$-valued process which is bounded below, i.e. there exists positive $B \in L^{2,1}_T$ with $Y_t \ge -B$ $\forall t \in [0, T]$ P a.s.,
• $Z$ is a predictable $\mathbb{R}^n$-valued process with $\int_0^T |Z_t|^2\,dt < \infty$ P a.s.,
• $C$ is an increasing continuous adapted $\mathbb{R}$-valued process with $C_0 = 0$,
and which satisfies
\[ -dY_t = f(t, Y_t, Z_t)\,dt - Z_t^*\,dW_t + dC_t; \qquad Y_T = \xi. \tag{1.26} \]
Supersolutions are a common concept in mathematical finance, arising naturally in models that incorporate a notion of consumption (which is why the third process is denoted C). This link with finance will be covered in more detail in the next section.
Remark 1.28. Suppose $f$ is a linear driver, as in equation (1.16), and suppose that $\beta$, $\gamma$ are bounded and that $\varphi \in H^{2,1}_T$. Let $\Gamma$ be the adjoint process of the LBSDE, and let $(Y, Z, C)$ be a supersolution. Defining $M_t = \Gamma_t Y_t + \int_0^t \Gamma_s\varphi_s\,ds$ as in the proof of Proposition 1.18, we have by Itō's formula
\[ dM_t = \Gamma_t Y_t\gamma_t \cdot dW_t + \Gamma_t Z_t \cdot dW_t - \Gamma_t\,dC_t, \]
so $M$ is a local supermartingale. This leads to the next proposition.
Proposition 1.29. Let $d = 1$. Let the BSDE (1.1) have standard data $(f, \xi)$ and a continuous supersolution $(Y^1, Z^1, C)$, and let $(Y^2, Z^2)$ be its unique continuous square-integrable solution. Then $Y^1 \ge Y^2$ P a.s.
Proof. The proof is analogous to the comparison theorem, so we use the same notation, and assume $n = 1$ for simplicity. We have $f^1 = f^2 = f$, so we let $\delta Y$, $\delta Z$, $\Delta_y f(t)$ and $\Delta_z f(t)$ be defined as in the proof of the comparison theorem, so that $(\delta Y, \delta Z, C)$ is a continuous supersolution of a certain LBSDE. Namely, it satisfies
\[ -d\delta Y_t = \left(\Delta_y f(t)\,\delta Y_t + \Delta_z f(t)\,\delta Z_t\right) dt - \delta Z_t\,dW_t + dC_t; \qquad \delta Y_T = 0. \tag{1.27} \]
As before, $\Delta_y f(t)$ and $\Delta_z f(t)$ are bounded by the Lipschitz property. Let $\Gamma$ be the adjoint process of the above LBSDE, then we know from Remark 1.28 that $\Gamma\delta Y$ is a local supermartingale. However, $\Gamma\delta Y$ is bounded from below by $-\sup_{t\le T}\Gamma_t\left(B + \sup_{t\le T}|Y_t^2|\right)$, which is integrable by the Cauchy–Schwarz inequality, so by Fatou's lemma it is in fact a supermartingale, and in particular for $t \in [0, T]$,
\[ \Gamma_t\,\delta Y_t \ge \mathbb{E}[\Gamma_T\,\delta Y_T \mid \mathcal{F}_t] = 0, \]
so $\delta Y \ge 0$ $d\mathbb{P} \otimes dt$ a.s. The result follows from continuity.
2 European claims in dynamically complete markets
In this section we see how BSDEs can help in the solution of one of the most fundamental
problems in mathematical finance: the pricing of a European contingent claim.
2.1 Basic definitions

We keep our probability space (Ω, F , P) from the previous section with fixed time horizon T > 0,
and define a few new processes:
• The short rate r = (rt )t∈[0,T ] , which is R-valued, predictable and bounded,
• The vector of stock appreciation rates b = (bt )t∈[0,T ] , which is Rn -valued, predictable and
bounded,
• The volatility matrix σ = (σt )t∈[0,T ] , which is Rn×n -valued, predictable and bounded, and
such that σt is invertible P a.s. ∀t ∈ [0, T ] with bounded inverse.
With these processes defined, our $(n + 1)$-asset market is given by the price process $P = (P_t^0, P_t^1, \dots, P_t^n)^*_{t\in[0,T]}$ where
\[ dP_t^0 = P_t^0\,r_t\,dt \tag{2.1} \]
is our single locally riskless asset, representing a bank account or a bond, and
\[ dP_t^i = P_t^i\left(b_t^i\,dt + \sum_{j=1}^n \sigma_t^{i,j}\,dW_t^j\right) \tag{2.2} \]
for $i = 1, \dots, n$ are our risky securities, for example representing stocks. In addition, we define a predictable and bounded $\mathbb{R}^n$-valued process $\theta = (\theta_t)_{t\in[0,T]}$ known as a risk premium that satisfies
\[ b_t - r_t\mathbf{1} = \sigma_t\theta_t \quad d\mathbb{P} \otimes dt \text{ a.s.} \]
where $\mathbf{1} = (1, \dots, 1)^* \in \mathbb{R}^n$.
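Since σ is invertible, the risk premium is determined by θ = σ⁻¹(b − r·1) at each time. The sketch below computes it for a single time point in a two-asset market; the restriction to n = 2, the function names and the numerical values are illustrative assumptions only.

```python
def solve_2x2(a, rhs):
    """Solve the 2x2 linear system a @ x = rhs by Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0.0:
        raise ValueError("volatility matrix must be invertible")
    x1 = (rhs[0] * a22 - a12 * rhs[1]) / det
    x2 = (a11 * rhs[1] - a21 * rhs[0]) / det
    return [x1, x2]


def risk_premium(b, r, sigma):
    """Return theta solving b - r * 1 = sigma @ theta for n = 2."""
    excess = [b_i - r for b_i in b]  # excess appreciation rates b - r * 1
    return solve_2x2(sigma, excess)


if __name__ == "__main__":
    b = [0.08, 0.06]                     # hypothetical appreciation rates
    r = 0.03                             # hypothetical short rate
    sigma = [[0.20, 0.00], [0.06, 0.15]]  # hypothetical volatility matrix
    theta = risk_premium(b, r, sigma)
    # Reconstruct sigma @ theta and compare with b - r * 1.
    rebuilt = [sum(sigma[i][j] * theta[j] for j in range(2)) for i in range(2)]
    print(theta, rebuilt)
```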

Suppose we have an investor whose actions have no effect on the market. We let $V_t$ be his total wealth and $\pi_t = (\pi_t^1, \dots, \pi_t^n)^*$ the vector whose $i$-th component is the value of his holdings in the $i$-th risky asset at time $t \in [0, T]$, so that $\pi^0 := V - \sum_{i=1}^n \pi^i$ is the value of his holdings in the riskless asset. Since he can only decide what to do at time $t$ based on the current information available, we require that the processes $\pi$ and $V$ be predictable.
In the Merton model (1971), the investor also has a consumption rate $c = (c_t)_{t\in[0,T]}$ which is a scalar non-negative predictable process and represents him putting aside a portion of his wealth, not to be invested further. However, in this case we instead usually talk about the total consumption given by the continuous increasing predictable process $C = \int c_s\,ds$, whose absolute continuity we then relax (so $c$ need not exist at all).
Definition 2.1. A self-financing trading strategy is a pair of processes $(V, \pi)$ such that
\[ dV_t = r_t V_t\,dt + \pi_t \cdot \sigma_t(dW_t + \theta_t\,dt), \tag{2.3} \]
\[ \int_0^T |\sigma_t^*\pi_t|^2\,dt < \infty \quad \mathbb{P} \text{ a.s.} \]
This SDE is known as the wealth equation, and is equivalent to
\[ V_t = V_0 + \sum_{i=0}^n \int_0^t \pi_s^i\,\frac{dP_s^i}{P_s^i}, \tag{2.4} \]
and its correct interpretation is that all of the investor's wealth is always invested in some combination of the $n + 1$ assets, and that he does not gain or lose wealth in any other manner.
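The equivalence between the integral form (2.4) and the wealth equation (2.3) can be verified directly from the asset dynamics (2.1) and (2.2). The computation below is a sketch that uses $\pi^0 = V - \pi\cdot\mathbf{1}$ and the defining relation $b - r\mathbf{1} = \sigma\theta$ of the risk premium.

```latex
\begin{align*}
  \sum_{i=0}^n \pi^i_t\,\frac{dP^i_t}{P^i_t}
  &= \pi^0_t\,r_t\,dt + \sum_{i=1}^n \pi^i_t\Bigl(b^i_t\,dt + \sum_{j=1}^n \sigma^{i,j}_t\,dW^j_t\Bigr) \\
  &= (V_t - \pi_t\cdot\mathbf{1})\,r_t\,dt + \pi_t\cdot b_t\,dt + \pi_t\cdot\sigma_t\,dW_t \\
  &= r_t V_t\,dt + \pi_t\cdot(b_t - r_t\mathbf{1})\,dt + \pi_t\cdot\sigma_t\,dW_t \\
  &= r_t V_t\,dt + \pi_t\cdot\sigma_t\,(dW_t + \theta_t\,dt).
\end{align*}
```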
We extend these definitions to the Merton model in the following way:
Definition 2.2. A self-financing superstrategy is a triple of processes $(V, \pi, C)$ such that
\[ dV_t = r_t V_t\,dt - dC_t + \pi_t \cdot \sigma_t(dW_t + \theta_t\,dt), \tag{2.5} \]
\[ \int_0^T |\sigma_t^*\pi_t|^2\,dt < \infty \quad \mathbb{P} \text{ a.s.}, \]
where $C$ is increasing, right-continuous and adapted with $C_0 = 0$. Observe that if $(V, \pi)$ is a self-financing trading strategy, then $(V, \pi, 0)$ is a self-financing superstrategy.
Remark 2.3. Notice that the above two definitions coincide with the definitions of solution and
supersolution of a BSDE respectively.
Definition 2.4. A trading strategy (V, π) or superstrategy (V, π, C) is admissible if V is P a.s.
bounded from below. It is 0-admissible or feasible if V ≥ 0 P a.s..
2.2 Hedging claims

Definition 2.5. A European contingent claim settled at time T is an FT -measurable random
variable, usually denoted ξ.
Note that this is identical to the definition of ξ in the previous section. Examples of contingent
claims include call options and futures contracts.
The questions we would like to answer are as follows: Given a non-negative European contingent claim ξ, can we find a feasible self-financing strategy (V, π) or superstrategy (V, π, C) such that we can guarantee that our wealth at time T is VT = ξ? If so, what is the smallest initial wealth V0 needed to carry out such a strategy? With these questions in mind, we make the following definitions:
Definition 2.6. A hedging strategy against a European contingent claim ξ is a feasible
self-financing trading strategy (V, π) such that VT = ξ. Let H(ξ) denote the class of all hedging
strategies against ξ. We call ξ hedgeable if H(ξ) ≠ ∅.
For feasible self-financing superstrategies (V, π, C) we similarly define superhedging strategy
against ξ, H′ (ξ) and superhedgeable.
Definition 2.7. The fair price of a hedgeable European contingent claim ξ is
p(ξ) := inf {x ≥ 0 : ∃(V, π) ∈ H(ξ) s.t. V0 = x a. s.}
the lowest initial wealth of a hedging strategy against ξ. Similarly we define the upper price of
a superhedgeable European contingent claim ξ:
p′ (ξ) := inf {x ≥ 0 : ∃(V, π, C) ∈ H′ (ξ) s.t. V0 = x a. s.} .
We can now state the main theorem of this section, which we prove using our results on
BSDEs. It shows that all non-negative square-integrable claims are hedgeable, and in this case
we call the market dynamically complete.
Theorem 2.8. Let $\xi \in L^{2,1}_T$ be a non-negative square-integrable European contingent claim. Then
there exists some hedging strategy (X, π) ∈ H(ξ) achieving the fair price of ξ, i.e. X0 = p(ξ).
Moreover, the upper price p′ (ξ) is equal to the fair price.
Proof. This theorem is a simple consequence of BSDE theory. We are looking for a pair (X, π)
for which
dXt = rt Xt dt + πt · σt (dWt + θt dt),
XT = ξ, (2.6)
X ≥ 0 P a. s.
Rearranging this to $-dX_t = (-r_t X_t - (\sigma_t^* \pi_t)^* \theta_t)\,dt - (\sigma_t^* \pi_t)^*\,dW_t$, we see that this is exactly
the form of the LBSDE (1.16). Let $(H^s : s \in [0, T])$ be the family of adjoint processes of this
LBSDE, i.e.
\[ H^s_t := \mathcal{E}\left( -\int_s^\cdot r_u\,du - \int_s^\cdot \theta_u \cdot dW_u \right)_t, \quad s \le t. \tag{2.7} \]
Because of our boundedness assumptions, we can apply Proposition 1.18 to find the unique
square-integrable solution (X, σ∗π) of the LBSDE, from which the invertibility of σ gives us the
process π. The wealth process X is given by
\[ X_t = E\left[ \xi H^t_T \mid \mathcal{F}_t \right], \tag{2.8} \]
where ξ is non-negative and H is positive, so X must be non-negative. So (X, π) ∈ H(ξ) (and
(X, π, 0) ∈ H′(ξ)).
Now we let (Y, ρ) ∈ H(ξ) be another hedging strategy against ξ. Since Y is non-negative, we
can apply Corollary 1.26 to see that Y ≥ X P a.s. Similarly, if (Y, ρ, C) ∈ H′(ξ) is a
superhedging strategy against ξ, then Y ≥ X P a.s. by Proposition 1.29. Hence (X, π) achieves
the fair price and the upper price of ξ, given by
\[ p(\xi) = p'(\xi) = E\left[ \xi H^0_T \right]. \tag{2.9} \]
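As a sanity check of the pricing formula (2.9), the sketch below evaluates $E[\xi H^0_T]$ for a call option in a one-dimensional market with constant coefficients and compares it with the classical Black-Scholes price. The constant coefficients, the geometric Brownian motion for the stock, the relation θ = (b − r)/σ and all function names are illustrative assumptions, not part of the text above; the expectation is computed by a simple trapezoid rule over the Gaussian density of $W_T$.

```python
import math

def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(p0, strike, r, sigma, T):
    """Classical Black-Scholes call price, used only as a reference value."""
    d1 = (math.log(p0 / strike) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return p0 * normal_cdf(d1) - strike * math.exp(-r * T) * normal_cdf(d2)

def deflator_call_price(p0, strike, r, b, sigma, T, n=20001, zmax=10.0):
    """Approximate E[xi * H_T] for xi = (P_T - strike)^+ by the trapezoid rule.

    Assumes constant coefficients and writes W_T = sqrt(T) * z, z standard normal.
    """
    theta = (b - r) / sigma  # assumed risk premium: sigma * theta = b - r
    h = 2.0 * zmax / (n - 1)
    total = 0.0
    for i in range(n):
        z = -zmax + i * h
        w = math.sqrt(T) * z
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        # Deflator started at 0, as in (2.7), for constant r and theta.
        deflator = math.exp(-r * T - theta * w - 0.5 * theta**2 * T)
        stock = p0 * math.exp((b - 0.5 * sigma**2) * T + sigma * w)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * density * deflator * max(stock - strike, 0.0)
    return total * h

price = deflator_call_price(p0=100.0, strike=100.0, r=0.03, b=0.08, sigma=0.2, T=1.0)
reference = black_scholes_call(p0=100.0, strike=100.0, r=0.03, sigma=0.2, T=1.0)
```

The two numbers agree up to quadrature error, and the appreciation rate b drops out of the result, as one would expect from a price computed with the deflator.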
Remark 2.9. In finance the family of adjoint processes (Hts ) given in (2.7) is usually referred to
as the deflator.
Remark 2.10 (Equivalent martingale measure). Let $H = H^0$ be the deflator started at 0, and
consider the process
\[ e^{\int_0^\cdot r_s\,ds} H = \mathcal{E}\left( -\int_0^\cdot \theta_u \cdot dW_u \right). \]
Since this is in $H^{2,1}_T$ (by Lemma 1.17 and boundedness of r), it is a positive uniformly integrable
martingale with respect to P and so we can define a new probability measure Q by the
Radon-Nikodym derivative
\[ \frac{dQ}{dP} = e^{\int_0^T r_s\,ds} H_T. \tag{2.10} \]
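Under this new measure, the minimal hedging strategy (X, π) in equation (2.8) satisfies
\begin{align*}
E^Q\left[ e^{-\int_0^T r_s\,ds}\,\xi \mid \mathcal{F}_t \right]
&= \frac{E^P\left[ H_T\,\xi \mid \mathcal{F}_t \right]}{E^P\left[ e^{\int_0^T r_s\,ds} H_T \mid \mathcal{F}_t \right]} \\
&= \frac{E^P\left[ H_T\,\xi \mid \mathcal{F}_t \right]}{e^{\int_0^t r_s\,ds} H_t} \\
&= e^{-\int_0^t r_s\,ds}\, E^P\left[ \xi H^t_T \mid \mathcal{F}_t \right] \\
&= e^{-\int_0^t r_s\,ds} X_t
\end{align*}
by Bayes’ rule. This shows that the discounted wealth process $e^{-\int_0^\cdot r_s\,ds} X$ is a Q-martingale for
any positive square-integrable claim. The measure Q is called the equivalent martingale measure
or the risk-neutral measure.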
Remark 2.11. If we allow the definitions of hedging and superhedging strategies to include ad-
missible strategies, Theorem 2.8 still holds by essentially the same proof.
We give an example to illustrate what happens when we relax the condition of feasibility (or
more generally admissibility) of hedging strategies in Theorem 2.8:
Example 2.12 (From El Karoui, Peng and Quenez (1997)). Recall Example 1.13 in the case
d = 1. We can construct an R-valued stochastic integral $\int_0^T \psi_s \cdot dW_s = 1$ such that
$\int_0^T \|\psi_s\|^2\,ds < \infty$ P a.s. Construct the pair (Y, ϕ) by
\begin{align*}
Y_t &= H_t^{-1} \int_0^t \psi_s \cdot dW_s, \\
\varphi_t &= (\sigma_t^*)^{-1} \left( H_t^{-1} \psi_t + Y_t \theta_t \right). \tag{2.11}
\end{align*}
By an application of Itô’s lemma we can show that (Y, ϕ) is a self-financing strategy satisfying
the LBSDE
\[ dY_t = r_t Y_t\,dt + \varphi_t \cdot \sigma_t (dW_t + \theta_t\,dt) \]
such that $Y_0 = 0$ and $Y_T = H_T^{-1}$. It is what is known as an arbitrage opportunity. Now note that
the pair $(H^{-1}, H^{-1}(\sigma^*)^{-1}\theta)$ satisfies the same LBSDE, but with $H_0^{-1} = 1$. So we define
\[ (X^0, \pi^0) = \left( H^{-1} - Y,\; H^{-1}(\sigma^*)^{-1}\theta - \varphi \right), \]
which is by linearity a solution of the LBSDE for which $X^0_0 = 1$, $X^0_T = 0$.

Suppose we relax admissibility for hedging strategies, and allow any trading strategy (V, ρ)
satisfying the LBSDE and the terminal condition VT = ξ to be a hedging strategy. Then for any
λ ∈ R and any (X, π) ∈ H(ξ), by linearity we have $(X + \lambda X^0, \pi + \lambda \pi^0) \in H(\xi)$, which has initial
wealth X0 + λ. In this case the fair price is not well-defined.
3 Concave BSDEs and applications
We assume d = 1 in this section. Recall that in Proposition 1.18 we derived an explicit solution
for an LBSDE, under some regularity conditions. The purpose of this section is to extend the
class of standard data for which we can find an explicit solution. We do this by remarking
that a concave (resp. convex) function can be expressed as the infimum (resp. supremum) of a
collection of linear functions, by means of the Legendre transform. We then give an application
of this theory to more general investment problems.
Before we begin we would like to formalize the notion of supremum and infimum of a family of
stochastic processes in a way more suited to measure theory, and to this end we follow Dellacherie
(1977):
Definition 3.1 (Essential supremum and infimum of processes). Let U , V be two processes
defined up to time T . We say that U minorises V if
{ω ∈ Ω : ∃t ∈ [0, T ] : Ut (ω) > Vt (ω)} (3.1)
is a P-null set, i.e. U ≤ V a.s. Now let $\{U^\alpha : \alpha \in I\}$ be a family of processes defined up to time
T, with some indexing set I. We say that $U = \operatorname{ess\,inf}_\alpha U^\alpha$ if:
1. U minorises $U^\alpha$ for every α ∈ I, and
2. If another process V minorises $U^\alpha$ for every α ∈ I, then V minorises U.
We likewise define majorise and $\operatorname{ess\,sup}_\alpha V^\alpha$.
Recall that the essential supremum and essential infimum of a family of random variables is
defined similarly. We now state a result about essential infima without proof:
Lemma 3.2 (Dellacherie (1977)). Let $\{U^\alpha : \alpha \in I\}$ be a family of càdlàg processes defined up
to time T, with indexing set I. Then $U = \operatorname{ess\,inf}_\alpha U^\alpha$ exists and there exists a sequence $(\alpha_n) \subseteq I$
such that $U = \inf_n U^{\alpha_n}$.
3.1 Extrema of standard data

Let ((f α , ξ α ) : α ∈ I) be a family of data of BSDEs (1.1) with indexing set I. What happens
if we take the essential infimum of this family as the data of our BSDE? We would like to be
able to control solutions of this new BSDE using solutions of our original family of BSDEs. The
following proposition is one of a few ways of doing this:
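Proposition 3.3. Let $((f^\alpha, \xi^\alpha) : \alpha \in I)$ be a family of standard data of BSDEs (1.1) with
indexing set I and continuous square-integrable solutions $((Y^\alpha, Z^\alpha) : \alpha \in I)$. Let (f, ξ) be another
standard data with continuous square-integrable solution (Y, Z). Suppose there exists ᾱ ∈ I such
that
\begin{align*}
f(\cdot, Y, Z) &= \operatorname{ess\,inf}_\alpha f^\alpha(\cdot, Y, Z) = f^{\bar\alpha}(\cdot, Y, Z) \quad dP \otimes dt \text{ a.s.}, \\
\xi &= \operatorname{ess\,inf}_\alpha \xi^\alpha = \xi^{\bar\alpha} \quad P \text{ a.s.} \tag{3.2}
\end{align*}
Then the processes Y and $Y^\alpha$ satisfy
\[ Y = \operatorname{ess\,inf}_\alpha Y^\alpha = Y^{\bar\alpha} \quad P \text{ a.s.} \tag{3.3} \]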
Proof from Quenez (1997). Notice that $(f^\alpha, \xi^\alpha)$ and (f, ξ) are data that satisfy the conditions
of the comparison theorem (Theorem 1.23). So $Y_t \le Y^\alpha_t$ ∀t ∈ [0, T] P a.s. for each α and hence
$Y \le \operatorname{ess\,inf}_\alpha Y^\alpha$ P a.s. by the definition of essential infimum.
Now we see that (Y, Z) is a continuous square-integrable solution to the BSDE (1.1) with
data $(f^{\bar\alpha}, \xi^{\bar\alpha})$, so by uniqueness, $(Y, Z) = (Y^{\bar\alpha}, Z^{\bar\alpha})$ P a.s. Hence
\[ \operatorname{ess\,inf}_\alpha Y^\alpha \ge Y = Y^{\bar\alpha} \ge \operatorname{ess\,inf}_\alpha Y^\alpha \quad P \text{ a.s.} \]
3.2 Concave drivers

Recall the definitions of a concave function and of a convex function. Also recall that for any
m ∈ N we have an involution on the set of convex functions $\{f : R^m \to R \cup \{+\infty\}\}$ known as
the convex conjugate (or Legendre transform in the case m = 1), and given by $f \mapsto f^*$ where
\[ f^*(p) = \sup_{x \in R^m} (p \cdot x - f(x)). \tag{3.4} \]
The conjugate function $f^*(p)$ can be interpreted as −1 times the value at x = 0 of the tangent
hyperplane to f(x) with gradient p. An important property of the convex conjugate is that it is
an involution, i.e. self-inverse:
\[ f(x) = \sup_{p \in R^m} (x \cdot p - f^*(p)). \tag{3.5} \]
This shows that f can be written as the supremum of a family of linear functions.
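The involution property can be illustrated numerically. The sketch below replaces the suprema in (3.4) and (3.5) by maxima over a finite grid, which is only an approximation, and applies them to f(x) = x²/2 in dimension m = 1, a function that is its own conjugate. The grid, the choice of f and the helper names are assumptions made for illustration.

```python
def conjugate(f, grid):
    """Return p -> max over the grid of (p * x - f(x)), a discrete stand-in for (3.4)."""
    def f_star(p):
        return max(p * x - f(x) for x in grid)
    return f_star

def half_square(x):
    """f(x) = x^2 / 2, whose convex conjugate is p^2 / 2."""
    return 0.5 * x * x

# x and p both range over the same grid on [-5, 5].
grid = [-5.0 + 0.01 * i for i in range(1001)]

f_star = conjugate(half_square, grid)

# Tabulate f* once on the grid so that the second conjugate stays cheap.
f_star_table = {p: f_star(p) for p in grid}

def f_star_tabulated(p):
    return f_star_table[p]

f_double_star = conjugate(f_star_tabulated, grid)

value_star = f_star(1.5)                # approximates 1.5**2 / 2
value_double_star = f_double_star(1.5)  # approximates half_square(1.5)
```

Both values are close to 1.125, up to the discretisation error of the grid, which is consistent with f = f* = f** for this particular f.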
We proceed with a modification of this theory. Let f(t, y, z) be a standard generator of a
BSDE (1.1) and moreover suppose that f is concave (not convex!) in (y, z). Let $\xi \in L^{2,1}_T$, and
let (Y, Z) be the square-integrable solution of the BSDE with standard data (f, ξ). Let C be a
Lipschitz constant for f, and let $K = [-C, C]^{n+1} \subseteq R \times R^n$.
Definition 3.4. The polar process $F : \Omega \times [0, T] \times R \times R^n \to R$ associated with f is the convex
function given by
\[ F(\omega, t, \beta, \gamma) = \sup_{(y,z) \in R \times R^n} (f(\omega, t, y, z) - \beta y - \gamma \cdot z). \tag{3.6} \]
The effective domain of F is given by
\[ D_F := \{ (\omega, t, \beta, \gamma) \in \Omega \times [0, T] \times R \times R^n : F(\omega, t, \beta, \gamma) < \infty \}. \tag{3.7} \]
For (ω, t) ∈ Ω × [0, T], denote the (ω, t)-section of $D_F$ by $D_F^{(\omega,t)} \subseteq R \times R^n$.
Remark 3.5. Equation (3.6), along with the involutive property of the convex conjugate, shows
us that the conjugacy relation in this case is
\[ f(\omega, t, y, z) = \inf_{(\beta,\gamma) \in D_F^{(\omega,t)}} (F(\omega, t, \beta, \gamma) + \beta y + \gamma \cdot z). \tag{3.8} \]
Note that we could extend the infimum above to be over any bounded superset of $D_F^{(\omega,t)}$, because
F is infinite outside of $D_F^{(\omega,t)}$. This equation has the important interpretation that it allows f to
be expressed as the infimum of a family of functions that are linear in (y, z).
Remark 3.6. For all (ω, t) ∈ Ω × [0, T], we have $D_F^{(\omega,t)} \subseteq K$. This is because if, for example,
|β| > C, then the Lipschitz property gives
\[ f(\omega, t, y, z) - \beta y - \gamma \cdot z \ge -C|y| + f(\omega, t, 0, z) - \beta y - \gamma \cdot z \]
which is unbounded above in y, and hence $(\beta, \gamma) \notin D_F^{(\omega,t)}$ for any $\gamma \in R^n$.
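A small worked example may help fix ideas. The driver below is a hypothetical choice made for illustration (it does not appear in the text): take n = 1 and the concave, C-Lipschitz driver f(t, y, z) = −C|z|. Its polar process, effective domain and conjugacy relation can be computed by hand:

```latex
% Illustrative driver (an assumption for this example): f(t,y,z) = -C|z|, n = 1.
\begin{align*}
F(\beta, \gamma)
  &= \sup_{(y,z) \in R \times R} \left( -C|z| - \beta y - \gamma z \right)
   = \begin{cases}
       0 & \text{if } \beta = 0 \text{ and } |\gamma| \le C, \\
       +\infty & \text{otherwise},
     \end{cases} \\
D_F^{(\omega,t)} &= \{0\} \times [-C, C] \subseteq [-C, C]^2 = K, \\
\inf_{(\beta,\gamma) \in D_F^{(\omega,t)}} \left( F(\beta,\gamma) + \beta y + \gamma z \right)
  &= \inf_{|\gamma| \le C} \gamma z = -C|z| = f(t, y, z).
\end{align*}
```

Here the section of the effective domain is independent of (ω, t), it is contained in K as Remark 3.6 predicts, and the infimum in (3.8) is attained at γ = −C sgn(z).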
Definition 3.7. Let (β, γ) be predictable processes, known as control parameters. Let the linear
driver $f^{\beta,\gamma} : \Omega \times [0, T] \times R \times R^n \to R$ be given by
\[ f^{\beta,\gamma}(\omega, t, y, z) = F(\omega, t, \beta_t, \gamma_t) + \beta_t y + \gamma_t \cdot z. \tag{3.9} \]
Let the set of admissible control parameters be given by
\[ A = \left\{ (\beta, \gamma) \text{ predictable, } K\text{-valued} : F(\cdot, \beta, \gamma) \in H^{2,1}_T \right\}. \tag{3.10} \]
Remark 3.8. It’s easy to see that if (β, γ) ∈ A, then f β,γ is a standard driver.
We seek to apply Proposition 3.3 to the driver f and the family {f β,γ : (β, γ) ∈ A}, and need
a few lemmas to justify this. The first is given without proof:
Lemma 3.9 (Measurable selection theorem; Kuratowski–Ryll-Nardzewski (1965)). Let (E, E)
be a measurable space and let X be a Polish space. Denote the power set of X by PX. Let
F : E → PX be a point-to-set mapping and assume that
1. F(ω) ≠ ∅ ∀ω ∈ E,
2. For every open set G ⊆ X,
{ω ∈ E : F(ω) ∩ G ≠ ∅} ∈ E.
Then there is an E-measurable function f, known as a selection function, such that f(ω) ∈ F(ω)
∀ω ∈ E.
The details of this lemma are beyond us, but it suffices for us to know that Rm is a Polish
space for any m ≥ 1, and also that any closed subset of a Polish space (e.g. K) is a Polish space.
Lemma 3.10. For any (ω, t, y, z), the infimum in the conjugacy relation (3.8) is achieved in K
by some pair (β, γ).
Proof from El Karoui, Peng and Quenez (1997). We fix a quadruple (ω, t, y, z). By the infimum
(3.8), there exists a sequence $(\beta^n, \gamma^n)_{n \in N} \subseteq D_F^{(\omega,t)}$ such that
\[ f(\omega, t, y, z) = \lim_{n \to \infty} (F(\omega, t, \beta^n, \gamma^n) + \beta^n y + \gamma^n \cdot z). \]
Since $D_F^{(\omega,t)}$ is contained in a compact set, we can assume without loss of generality that $(\beta^n, \gamma^n)_n$
converges to some (β, γ) ∈ K by the Bolzano–Weierstrass theorem. F(ω, t, ·, ·) is a convex
conjugate, so by a well-known result in convex analysis, it is lower semi-continuous. Therefore
\begin{align*}
F(\omega, t, \beta, \gamma) + \beta y + \gamma \cdot z
&\le \lim_{n \to \infty} (F(\omega, t, \beta^n, \gamma^n) + \beta^n y + \gamma^n \cdot z) \\
&= f(\omega, t, y, z) \\
&= \inf_{(\beta', \gamma') \in K} (F(\omega, t, \beta', \gamma') + \beta' y + \gamma' \cdot z) \\
&\le F(\omega, t, \beta, \gamma) + \beta y + \gamma \cdot z
\end{align*}
where we have extended the infimum to be over the set K by Remark 3.5. So
f (ω, t, y, z) = F (ω, t, β, γ) + βy + γ · z.
We now prove that the main condition needed to invoke Proposition 3.3 holds in this case:
Lemma 3.11. There exists an optimal control (β̄, γ̄) ∈ A such that
f (·, Y, Z) = f β̄,γ̄ (·, Y, Z). (3.11)
Proof from El Karoui, Peng and Quenez (1997). We aim to construct a point-to-set function
from the measurable space (Ω × [0, T], P) to the Polish space $(K, B^{n+1})$ and then use the
measurable selection theorem (Lemma 3.9). By Lemma 3.10, given (ω, t) ∈ Ω × [0, T], the set of all (β, γ) ∈ K that
minimise (3.8) in the case that f(ω, t, y, z) = f(ω, t, Yt(ω), Zt(ω)) is non-empty. Moreover, the
predictability of f(·, Y, Z), Y and Z ensures that the second condition of the measurable selection
theorem holds. Hence we can find a predictable pair (β̄, γ̄) taking values in K such that for all
(ω, t),
\[ f^{\bar\beta,\bar\gamma}(\omega, t, Y_t(\omega), Z_t(\omega)) = \inf_{(\beta,\gamma) \in D_F^{(\omega,t)}} (F(\omega, t, \beta, \gamma) + \beta Y_t(\omega) + \gamma \cdot Z_t(\omega)) \tag{3.12} \]
and hence
\[ f(\cdot, Y, Z) = f^{\bar\beta,\bar\gamma}(\cdot, Y, Z) \tag{3.13} \]
where we have suppressed the dependence on ω. Now since f(·, Y, Z), Y and Z are
square-integrable and (β̄, γ̄) is bounded, the equation
\[ F(\cdot, \bar\beta, \bar\gamma) = f(\cdot, Y, Z) - \bar\beta Y - \bar\gamma \cdot Z \]
shows that $F(\cdot, \bar\beta, \bar\gamma) \in H^{2,1}_T$. So (β̄, γ̄) ∈ A.

Now for each (β, γ) ∈ A, we denote the square-integrable solution of the associated LBSDE
with data (f β,γ , ξ) by (Y β,γ , Z β,γ ). We now state our main theorem:
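Theorem 3.12. Let f be a concave standard driver and $\{f^{\beta,\gamma} : (\beta, \gamma) \in A\}$ the associated linear
standard drivers satisfying
\[ f = \operatorname{ess\,inf}_{(\beta,\gamma) \in A} f^{\beta,\gamma} \quad dP \otimes dt \text{ a.s.} \]
Then
\[ Y = \operatorname{ess\,inf}_{(\beta,\gamma) \in A} Y^{\beta,\gamma} \quad P \text{ a.s.} \]
Proof from El Karoui, Peng and Quenez (1997). From Lemma 3.11 we have that
\[ f(\cdot, Y, Z) = f^{\bar\beta,\bar\gamma}(\cdot, Y, Z) = \operatorname{ess\,inf}_{(\beta,\gamma) \in A} f^{\beta,\gamma}(\cdot, Y, Z) \quad dP \otimes dt \text{ a.s.}, \]
so the result follows directly from Proposition 3.3.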
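Now let’s use what we know about LBSDEs. Take (β, γ) ∈ A, so that (β, γ) is bounded,
predictable, and $F(\cdot, \beta, \gamma) \in H^{2,1}_T$. Let $(\Gamma^{\beta,\gamma}_{s,t})_{t \in [s,T]}$ denote the adjoint process associated with
the LBSDE with standard data $(f^{\beta,\gamma}, \xi)$ and solution $(Y^{\beta,\gamma}, Z^{\beta,\gamma})$. Then $Y^{\beta,\gamma}$ is given by
\[ Y^{\beta,\gamma}_t = E\left[ \xi\,\Gamma^{\beta,\gamma}_{t,T} + \int_t^T \Gamma^{\beta,\gamma}_{t,s} F(s, \beta_s, \gamma_s)\,ds \,\Big|\, \mathcal{F}_t \right] \]
and so we have an explicit expression for $(Y_t)_{t \le T}$:
\[ Y_t = \operatorname{ess\,inf}_{(\beta,\gamma) \in A} E\left[ \xi\,\Gamma^{\beta,\gamma}_{t,T} + \int_t^T \Gamma^{\beta,\gamma}_{t,s} F(s, \beta_s, \gamma_s)\,ds \,\Big|\, \mathcal{F}_t \right]. \tag{3.14} \]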
3.3 Application: European claims in convex markets

Recall the self-financing condition of a trading strategy in our dynamically complete market of
Section 2, represented by the following BSDE:
dVt = rt Vt dt + πt · σt (dWt + θt dt)

= (rt Vt + πt · σt θt )dt + πt · σt dWt .
Sometimes the market might be a bit more interesting. Note that the pair (V, σ ∗ π) in the above
wealth equation corresponds to the solution (Y, Z) of a BSDE, so we now consider a general
wealth equation
−dYt = b(t, Yt , Zt )dt − Zt∗ dWt , (3.15)
where b : Ω × [0, T ] × R × Rn → R is a standard driver, and the volatility matrix σ satisfies our
previous assumptions, i.e. Rn×n -valued, predictable and bounded, and such that σt is invertible
P a.s. ∀t ∈ [0, T ] with bounded inverse. In Section 2 we took
b(t, y, z) = −rt y − z ∗ θt .
Let C be a Lipschitz constant for b and let $K = [-C, C]^{n+1} \subseteq R \times R^n$. We apply our previous
results by assuming that b(t, y, z) is convex in (y, z). If (V, σ∗π) is a solution of (3.15), then
(−V, −σ∗π) is a solution of the BSDE
\[ -dY_t = (-b(t, -Y_t, -Z_t))\,dt - Z_t^*\,dW_t, \tag{3.16} \]
and notice that −b(t, −y, −z) is a standard driver with Lipschitz constant C, and is concave in
(y, z). Let B be the convex polar process associated with −b(t, −y, −z), given by
\begin{align*}
B(\omega, t, \beta, \gamma) &= \sup_{(y,z) \in R \times R^n} (-b(\omega, t, -y, -z) - \beta y - \gamma \cdot z) \\
&= \sup_{(y,z) \in R \times R^n} (\beta y + \gamma \cdot z - b(\omega, t, y, z)),
\end{align*}
so that the conjugacy relation is
\begin{align*}
-b(\omega, t, -y, -z) &= \inf_{(\beta,\gamma) \in D_B^{(\omega,t)}} (B(\omega, t, \beta, \gamma) + \beta y + \gamma \cdot z), \tag{3.17} \\
\Rightarrow \quad b(\omega, t, y, z) &= \sup_{(\beta,\gamma) \in D_B^{(\omega,t)}} (\beta y + \gamma \cdot z - B(\omega, t, \beta, \gamma)). \tag{3.18}
\end{align*}
Note that this means that (b, B) are a pair of convex conjugates. Let the set of admissible control
parameters be
\[ A = \left\{ (\beta, \gamma) \text{ predictable, } K\text{-valued} : B(\cdot, \beta, \gamma) \in H^{2,1}_T \right\}. \]
For (β, γ) ∈ A, let
\[ b^{\beta,\gamma}(\omega, t, y, z) := \beta_t y + \gamma_t \cdot z - B(\omega, t, \beta_t, \gamma_t) \tag{3.19} \]
be a linear standard driver. We can now apply Theorem 3.12:
Proposition 3.13. Let $\xi \in L^{2,1}_T$. Let (X, σ∗π) be the unique square-integrable solution of BSDE
(3.15) with convex standard parameters (b, ξ). Let $(X^{\beta,\gamma}, \sigma^* \pi^{\beta,\gamma})$ be the square-integrable solution
of the LBSDE with standard data $(b^{\beta,\gamma}, \xi)$. Then P a.s.,
\[ X = \operatorname{ess\,sup}_{(\beta,\gamma) \in A} X^{\beta,\gamma}. \tag{3.20} \]
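Proof. Recall that (−X, −σ∗π) is the unique square-integrable solution to the BSDE (3.16). The
pair $(-X^{\beta,\gamma}, -\sigma^* \pi^{\beta,\gamma})$ satisfies the LBSDE
\[ -dY_t = -b^{\beta,\gamma}(t, -Y_t, -Z_t)\,dt - Z_t^*\,dW_t. \]
Now
\[ -b^{\beta,\gamma}(\omega, t, -y, -z) = \beta_t y + \gamma_t \cdot z + B(\omega, t, \beta_t, \gamma_t) \]
so from equation (3.17),
\[ -b(\cdot, -Y, -Z) = \operatorname{ess\,inf}_{(\beta,\gamma) \in A} (-b^{\beta,\gamma}(\cdot, -Y, -Z)) \quad dP \otimes dt \text{ a.s.} \]
By Theorem 3.12,
\[ -X = \operatorname{ess\,inf}_{(\beta,\gamma) \in A} (-X^{\beta,\gamma}) \quad P \text{ a.s.} \]
and the result follows.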

Remark 3.14. We have that $(X^{\beta,\gamma}, \sigma^* \pi^{\beta,\gamma})$ satisfies the LBSDE
\[ -dY_t = (\beta_t Y_t + \gamma_t \cdot Z_t - B(t, \beta_t, \gamma_t))\,dt - Z_t^*\,dW_t; \qquad Y_T = \xi. \tag{3.21} \]
If we compare this to equation (2.5), we see that (X β,γ , π β,γ ) is a hedging strategy against ξ in
a fictitious market with bounded short rate −β, bounded risk premium −γ and (not necessarily
non-negative) instantaneous cost process −B(·, β, γ). Moreover, the price of the claim ξ is equal
to its price in an optimal such fictitious market.
We now give a concrete example, demonstrating the applicability of this result.
Example 3.15 (Markets with higher interest rate for borrowing). Let us suppose that holding a
negative quantity of the riskless asset incurs a higher interest rate R > r, where r is the original
short rate. We assume the process R to be predictable and bounded. The self-financing condition
in this case is thus
\[ dV_t = r_t V_t\,dt + \pi_t \cdot \sigma_t (\theta_t\,dt + dW_t) - (R_t - r_t)(\pi^0_t)^-\,dt \tag{3.22} \]
where $(\pi^0)^-$ is the negative part of the holdings of the riskless asset $\pi^0$ (and is a non-negative
process). Let’s eliminate $\pi^0$ and rearrange into standard SDE form:
\[ -dV_t = -\left( r_t V_t + \pi_t \cdot \sigma_t \theta_t - (R_t - r_t)(V_t - \pi_t \cdot \mathbf{1})^- \right) dt - \pi_t \cdot \sigma_t\,dW_t \tag{3.23} \]
where 1 = (1, . . . , 1)∗ ∈ Rn . Our driver in this case is given by
\[ b(t, y, z) = -r_t y - \theta_t \cdot z + (R_t - r_t)(y - (\sigma_t^{-1} \mathbf{1}) \cdot z)^- \tag{3.24} \]
which, with respect to (y, z), is the sum of a linear function and a convex function, so is convex.
The processes r, θ, R and $\sigma^{-1}$ are all bounded so b is a standard driver. Let $\xi \in L^{2,1}_T$, and
let (X, σ∗π) be the square-integrable solution of the BSDE (1.1) with standard data (b, ξ). The
convex conjugate B associated with b is given by
\begin{align*}
B(\omega, t, \beta, \gamma) &= \sup_{(y,z) \in R \times R^n} (\beta y + \gamma \cdot z - b(\omega, t, y, z)) \\
&= \sup_{(y,z) \in R \times R^n} \left( (\beta + r_t(\omega)) y + (\gamma + \theta_t(\omega)) \cdot z - (R_t(\omega) - r_t(\omega))(y - (\sigma_t(\omega)^{-1} \mathbf{1}) \cdot z)^- \right).
\end{align*}
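The expression above takes a finite maximum over (y, z) if and only if
\begin{align*}
-(R_t(\omega) - r_t(\omega)) &\le \beta + r_t(\omega) \le 0, \\
\gamma + \theta_t(\omega) &= -(\beta + r_t(\omega))\,\sigma_t(\omega)^{-1} \mathbf{1},
\end{align*}
and in this case
\[ B(\omega, t, \beta, \gamma) = \sup_{(y,z) \in R \times R^n} \left( (\beta + r_t(\omega))(y - (\sigma_t(\omega)^{-1} \mathbf{1}) \cdot z) - (R_t(\omega) - r_t(\omega))(y - (\sigma_t(\omega)^{-1} \mathbf{1}) \cdot z)^- \right). \]
So B is given by
\[ B(\omega, t, \beta, \gamma) = \begin{cases} 0 & \text{if } \beta \in [-R_t(\omega), -r_t(\omega)],\ \gamma = -\theta_t(\omega) - (\beta + r_t(\omega))\,\sigma_t(\omega)^{-1} \mathbf{1}; \\ \infty & \text{otherwise.} \end{cases} \tag{3.25} \]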
Thus in this case, the set of admissible control parameters is
\[ A = \left\{ (\beta, \gamma) \text{ predictable} : -R \le \beta \le -r,\ \gamma = -\theta - (\beta + r)\,\sigma^{-1} \mathbf{1} \right\}. \]
Notice that we no longer require that (β, γ) take values in some bounded set K. This is because
the conditions given already bound (β, γ), and the Lipschitz constant of b can be made arbitrarily
large.
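The conjugacy relation (3.18) can be checked numerically for this example. The sketch below takes a single risky asset (n = 1) with constant coefficients, so that $\sigma^{-1}\mathbf{1}$ is just 1/σ, and compares the driver (3.24) with the supremum of the linear drivers over β ∈ [−R, −r]. The numerical constants and function names are illustrative assumptions.

```python
def borrowing_driver(y, z, r, R, theta, sigma):
    """Driver (3.24) for one risky asset: -r*y - theta*z + (R - r) * (y - z/sigma)^-."""
    negative_part = max(-(y - z / sigma), 0.0)
    return -r * y - theta * z + (R - r) * negative_part

def sup_of_linear_drivers(y, z, r, R, theta, sigma):
    """Supremum over beta in [-R, -r] of beta*y + gamma*z, with gamma tied to beta.

    B vanishes on the admissible set and the expression is linear in beta,
    so it is enough to compare the two endpoints beta = -R and beta = -r.
    """
    values = []
    for beta in (-R, -r):
        gamma = -theta - (beta + r) / sigma
        values.append(beta * y + gamma * z)
    return max(values)

r, R, theta, sigma = 0.02, 0.06, 0.3, 0.25  # illustrative constants with R > r
points = [(0.5 * i, 0.5 * j) for i in range(-6, 7) for j in range(-6, 7)]
max_gap = max(
    abs(borrowing_driver(y, z, r, R, theta, sigma)
        - sup_of_linear_drivers(y, z, r, R, theta, sigma))
    for y, z in points
)
```

On this grid of (y, z) values the two expressions coincide up to rounding. The maximising endpoint is β = −r when y − z/σ ≥ 0 (no borrowing) and β = −R otherwise.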
For a predictable process β such that −R ≤ β ≤ −r, we hence define $(X^\beta, \sigma^* \pi^\beta)$ to be the
solution of the LBSDE with standard data given by
\[ -dY_t = \left( \beta_t Y_t - (\theta_t + (\beta_t + r_t)\,\sigma_t^{-1} \mathbf{1}) \cdot Z_t \right) dt - Z_t^*\,dW_t; \qquad Y_T = \xi. \tag{3.26} \]
Then by Proposition 3.13, it follows that P a.s.,
\[ X = \operatorname{ess\,sup}_{-R \le \beta \le -r} X^\beta. \tag{3.27} \]
In this case, for all admissible (β, γ) we have that B = 0, so if ξ is non-negative then $X^\beta$ is
non-negative by Corollary 1.19, and so X is P a.s. non-negative. (X, π) is therefore a feasible
square-integrable hedging strategy against ξ. By Corollary 1.26 and Proposition 1.29, it follows
that the fair price and the upper price of ξ are given by
\[ p(\xi) = p'(\xi) = \sup_{-R \le \beta \le -r} E\left[ \Gamma^\beta_T\,\xi \right], \tag{3.28} \]
where
\[ \Gamma^\beta_T = \mathcal{E}\left( \int \beta_s\,ds - \int (\theta_s + (\beta_s + r_s)\,\sigma_s^{-1} \mathbf{1}) \cdot dW_s \right)_T. \]
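Finally, notice that if we take β′ = −β so that r ≤ β′ ≤ R, then the strategy $(X^\beta, \pi^\beta)$ satisfies
\begin{align*}
dX^\beta_t &= \left( \beta'_t X^\beta_t + \pi^\beta_t \cdot \sigma_t \theta_t - (\beta'_t - r_t)\,\pi^\beta_t \cdot \mathbf{1} \right) dt + \pi^\beta_t \cdot \sigma_t\,dW_t \\
&= r_t X^\beta_t\,dt + \pi^\beta_t \cdot \sigma_t (\theta_t\,dt + dW_t) + (\beta'_t - r_t)(\pi^\beta)^0_t\,dt \tag{3.29}
\end{align*}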
which is a strategy against ξ in a fictitious dynamically complete market with a short rate r and
risk premium θ, and an instantaneous cost process that depends linearly on (π β )0 , the quantity
of wealth held in the riskless asset. Notice the similarities between this SDE and equation (3.22),
as expected.
4 Utility theory in incomplete markets
In this section we consider the problem of finding a portfolio process that maximises the expected
utility of our time-T wealth XT in a potentially incomplete market. That is, given a utility
function U taking values in R, an initial value x ∈ R and a set of admissible portfolio processes
A, our objective is to maximise
\[ E\left[ U\left( X^{(\pi)}_T \right) \right] \]
where $(X^{(\pi)}, \pi)$ is a trading strategy such that π ∈ A and $X^{(\pi)}_0 = x$. The utility function U
represents how much “value” our investor assigns to a given quantity of wealth, and is frequently
assumed to have various nice properties such as monotonicity and concavity. Examples of utility
functions include:
1. Logarithmic utility: U : (0, ∞) → R such that
\[ U(x) = \log x. \]
2. Power utility: $U_\gamma$ : [0, ∞) → R with γ ∈ (0, 1) such that
\[ U_\gamma(x) = \frac{1}{\gamma} x^\gamma. \]
3. Exponential utility: $U_\alpha$ : R → R with α ∈ (0, ∞) such that
\[ U_\alpha(x) = -e^{-\alpha x}. \]
Note that all of these functions satisfy both monotonicity and concavity.
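The monotonicity and concavity of the three utility functions can be spot-checked numerically. The sketch below is only a finite check on sample points, not a proof; the parameter values γ = 1/2 and α = 1, the sample grid and the helper names are assumptions made for illustration.

```python
import math

def log_utility(x):
    return math.log(x)

def power_utility(x, gamma=0.5):
    return x**gamma / gamma

def exponential_utility(x, alpha=1.0):
    return -math.exp(-alpha * x)

def is_increasing_and_concave(u, points):
    """Check monotonicity and midpoint concavity on an increasing list of points."""
    increasing = all(u(a) < u(b) for a, b in zip(points, points[1:]))
    concave = all(
        u(0.5 * (a + b)) >= 0.5 * (u(a) + u(b))
        for a in points
        for b in points
    )
    return increasing and concave

sample = [0.1 * k for k in range(1, 51)]  # points in (0, 5]
checks = {
    "log": is_increasing_and_concave(log_utility, sample),
    "power": is_increasing_and_concave(power_utility, sample),
    "exponential": is_increasing_and_concave(exponential_utility, sample),
}
```

Each entry of `checks` is True on this sample. A convex function such as x ↦ x² fails the same test, which gives some confidence that the check is not vacuous.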
4.1 Preliminaries
Definition 4.1. Let A be a closed subset of $R^m$ and let $a \in R^m$. Let the distance between a
and A be
\[ \operatorname{dist}_A(a) = \min_{b \in A} |a - b|, \tag{4.1} \]
the minimum distance between a and an element of A. Obviously if a ∈ A then $\operatorname{dist}_A(a) = 0$.
Let the set
ΠA (a) = {b ∈ A : |a − b| = distA (a)} (4.2)
be the set of elements of A attaining the minimum distance from a. The set A is closed, so ΠA (a)
is non-empty, and may contain more than one element.
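To see that $\Pi_A(a)$ can contain more than one element when A is not convex, the sketch below computes the distance and the set of nearest points for a finite (hence closed) set A in the plane. The set A, the tolerance used to compare floating-point distances and the function name are illustrative assumptions.

```python
import math

def dist_and_projection(a, A, tol=1e-12):
    """Return dist_A(a) and the list Pi_A(a) for a finite set A of points in R^m."""
    distances = [math.dist(a, b) for b in A]
    d = min(distances)
    # Keep every point of A whose distance to a equals the minimum (up to tol).
    nearest = [b for b, db in zip(A, distances) if db - d <= tol]
    return d, nearest

A = [(-1.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
d_origin, proj_origin = dist_and_projection((0.0, 0.0), A)  # two nearest points
d_inside, proj_inside = dist_and_projection((1.0, 0.0), A)  # here a lies in A
```

From the origin the minimum distance 1 is attained at both (−1, 0) and (1, 0), while for a point of A the distance is 0 and the projection is the point itself.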
We give without proof a measurability result relating to the distance function defined above.
A proof, based on joint continuity, can be found in Hu, Imkeller and Müller (2005).
Lemma 4.2. Let (at )t∈[0,T ] and (σt )t∈[0,T ] be Rn - and Rd×n -valued predictable processes respec-
tively. Let C̃ be a closed subset of Rd . Let Ct = σt∗ C̃. Then the process
d = (dist(at , Ct ))t∈[0,T ]
is predictable.
Our setup in this section differs from Section 2 in a number of ways:
1. The volatility matrix σ is now generalised to a full-rank predictable Rd×n -valued process,
where d ≤ n. The number d is the number of risky assets in the market, and n is the
dimension of the Brownian motion W , which can be interpreted as the number of “degrees
of freedom” in the market. Additionally, σσ ∗ is assumed to be uniformly elliptic, which
means that there exist ε, K > 0 such that P a.s.,
ε|x| ≤ |σt∗ x| ≤ K|x| ∀x ∈ Rd , ∀t ∈ [0, T ]. (4.3)
2. Since σ may no longer be invertible, we now define the Rn -valued risk premium θ by
θ = σ ∗ (σσ ∗ )−1 b P a. s. (4.4)
where b is the predictable, bounded and Rd -valued vector of stock appreciation rates. This
agrees with the original definition if d = n and σ is invertible. Since b is bounded and σσ ∗
is uniformly elliptic, we have that θ is bounded.
3. This is a constrained optimisation problem: We require that our portfolio processes only
take values inside some (non-empty(!)) closed subset. Note that this subset is not assumed
to be convex.
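The definition (4.4) of the risk premium can be made concrete in the smallest incomplete case, one risky asset driven by a two-dimensional Brownian motion (d = 1, n = 2). The sketch below is an illustration under that assumption; the numerical values and the function name are hypothetical.

```python
def risk_premium(sigma_row, b):
    """theta = sigma^* (sigma sigma^*)^{-1} b for a single risky asset (d = 1).

    sigma_row is the 1 x n volatility row, so sigma sigma^* is the scalar
    sum of squares and theta is a vector in R^n.
    """
    gram = sum(s * s for s in sigma_row)  # sigma sigma^*, positive by full rank
    return [s * b / gram for s in sigma_row]

sigma_row = [0.2, 0.1]  # one stock, two sources of noise
b = 0.05                # illustrative appreciation rate
theta = risk_premium(sigma_row, b)

sigma_theta = sum(s * t for s, t in zip(sigma_row, theta))  # sigma * theta
cross = theta[0] * sigma_row[1] - theta[1] * sigma_row[0]   # zero iff theta is parallel to sigma^*
```

In this sketch `sigma_theta` recovers b and `cross` vanishes, so θ solves σθ = b and lies in the image of σ∗; among all solutions of σθ = b it is the one of smallest norm.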
We keep the same definition of a self-financing trading strategy, i.e. (X, π) satisfying
\[ dX_t = r_t X_t\,dt + \pi_t \cdot \sigma_t (dW_t + \theta_t\,dt), \]
\[ \int_0^T |\sigma_t^* \pi_t|^2\,dt < \infty \quad P \text{ a.s.} \]
Remark 4.3. If d < n then the market is called incomplete, since there exist square-integrable
claims that cannot be hedged by a feasible trading strategy.
Definition 4.4. Sometimes it can be more instructive to describe the entries of the portfolio
process π by the proportion of the total wealth X that they represent. For a trading strategy
(X, π) such that X is positive, we therefore define the $R^d$-valued predictable process ρ̃ by
\[ \tilde\rho^i_t = \frac{\pi^i_t}{X_t} \tag{4.5} \]
so that the wealth equation can be written as
\[ dX_t = X_t \left( r_t\,dt + \tilde\rho_t \cdot \sigma_t (dW_t + \theta_t\,dt) \right). \tag{4.6} \]
This is useful because it allows us to write the total wealth as a stochastic exponential:
\[ X = X_0\,\mathcal{E}\left( \int r_s\,ds + \int \tilde\rho_s \cdot \sigma_s (dW_s + \theta_s\,ds) \right). \tag{4.7} \]
Finally, we define the $R^n$-valued predictable process ρ by ρ = σ∗ρ̃, for additional simplicity.
4.2 Logarithmic utility

This is possibly the simplest of the three utility functions given above over which to optimise.
We provide a generalisation of a result of Hu, Imkeller and Müller (2005), who assumed the short
rate r to be zero.
As a constraint, we require that our process ρ̃ (see Definition 4.4) take values in some fixed
non-empty closed subset C̃ ⊆ Rd . In this section we will mainly be working with the process
ρ = σ ∗ ρ̃, so for t ∈ [0, T ] we define
Ct = σt∗ C̃. (4.8)
Note that by working with ρ̃ we are implicitly assuming that our wealth processes X are positive.
Definition 4.5. Our set of admissible processes $A_l$ is defined as the set of processes $\rho \in H^{2,n}_T$
such that $\rho_t \in C_t$ dP ⊗ dt a.s.
Given $\rho \in A_l$, denote the wealth process associated with ρ by $X^{(\rho)}$, i.e.
\[ X^{(\rho)} = X^{(\rho)}_0\,\mathcal{E}\left( \int r_s\,ds + \int \rho_s \cdot (dW_s + \theta_s\,ds) \right). \tag{4.9} \]
Fixing some x > 0, our objective is thus to maximise
\[ E\left[ \log X^{(\rho)}_T \right] \]
over pairs $(X^{(\rho)}, \rho)$ such that $\rho \in A_l$ and $X^{(\rho)}_0 = x$. Let
\[ V(x) = \sup_{\rho \in A_l,\ X^{(\rho)}_0 = x} E\left[ \log X^{(\rho)}_T \right] \tag{4.10} \]
be the value we are trying to find. It follows that
\[ V(x) = \log(x) + \sup_{\rho \in A_l} E\left[ \int_0^T \rho_s \cdot dW_s + \int_0^T \left( r_s + \rho_s \cdot \theta_s - \tfrac{1}{2} |\rho_s|^2 \right) ds \right]. \tag{4.11} \]
Here is the rough plan of attack: We seek to define a suitable family of processes $(R^{(\rho)})$ indexed
by $\rho \in A_l$ such that
1. $R^{(\rho)}_0 = R_0$ is constant,
2. $R^{(\rho)}_T = \int_0^T \rho_s \cdot dW_s + \int_0^T \left( r_s + \rho_s \cdot \theta_s - \tfrac{1}{2} |\rho_s|^2 \right) ds$ for all $\rho \in A_l$ (as in (4.11)),
3. $R^{(\rho)}$ is a supermartingale for all $\rho \in A_l$,
4. $\exists \hat\rho \in A_l$ such that $R^{(\hat\rho)}$ is a martingale.
It will follow that $(X^{(\hat\rho)}, \hat\rho)$ attains the maximum V(x). To do this, we introduce the BSDE
\[ -dY_t = f(t, Y_t, Z_t)\,dt - Z_t^*\,dW_t; \qquad Y_T = 0, \tag{4.12} \]
where f is an R-valued driver, to be specified later. This is equivalent to
\[ Y_t = \int_t^T f(s, Y_s, Z_s)\,ds - \int_t^T Z_s^*\,dW_s. \]
Assume that the BSDE (4.12) (which we still haven’t actually specified) has a solution (Y, Z).
We can hence define
\begin{align*}
R^{(\rho)}_t &= \int_0^t \rho_s \cdot dW_s + \int_0^t \left( r_s - \tfrac{1}{2} |\rho_s - \theta_s|^2 + \tfrac{1}{2} |\theta_s|^2 \right) ds + Y_t \\
&= Y_0 + \int_0^t (\rho_s + Z_s) \cdot dW_s + \int_0^t \left( r_s - \tfrac{1}{2} |\rho_s - \theta_s|^2 + \tfrac{1}{2} |\theta_s|^2 - f(s, Y_s, Z_s) \right) ds,
\end{align*}
which can be written as
\[ dR^{(\rho)}_t = (\rho_t + Z_t) \cdot dW_t + \left( r_t - \tfrac{1}{2} |\rho_t - \theta_t|^2 + \tfrac{1}{2} |\theta_t|^2 - f(t, Y_t, Z_t) \right) dt \tag{4.13} \]
since the value of $Y_0$ is irrelevant (all we need is that it is independent of ρ). Notice that $R^{(\rho)}_T$
satisfies the terminal condition that we want. To get the required supermartingale property for
$R^{(\rho)}$, we would like the drift term here to be non-positive, but not strictly negative. Hence we
pick
\[ f(t, y, z) = r_t + \tfrac{1}{2} |\theta_t|^2 - \tfrac{1}{2} \operatorname{dist}(\theta_t, C_t)^2 \tag{4.14} \]
which is in fact independent of its last two arguments, and predictable by Lemma 4.2. Therefore
\[ dR^{(\rho)}_t = (\rho_t + Z_t) \cdot dW_t - \left( \tfrac{1}{2} |\rho_t - \theta_t|^2 - \tfrac{1}{2} \operatorname{dist}(\theta_t, C_t)^2 \right) dt. \tag{4.15} \]
This is all well and good, but now we have to show that a solution to our BSDE (4.12) actually
exists.
Lemma 4.6. There is a unique continuous square-integrable solution (Y, Z) to (4.12).
Proof. Since the driver f has no dependence on (y, z), a sufficient condition for us to use our
existence-uniqueness theorem is that f (·) is P a.s. bounded.
The short rate r and risk premium θ are both P a.s. bounded, so let M1 > 0 be an upper
bound for |r| and M2 > 0 an upper bound for |θ|. The constraint set C̃ is non-empty, so let
c ∈ C̃. Then we have that σt∗ c ∈ Ct ∀t ∈ [0, T ]. Then by uniform ellipticity, P a.s. for all
t ∈ [0, T ],
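\begin{align*}
\operatorname{dist}(\theta_t, C_t) &\le |\theta_t - \sigma_t^* c| \\
&\le |\theta_t| + |\sigma_t^* c| \\
&\le M_2 + K|c|
\end{align*}
so
\[ |f(t)| \le M_1 + \tfrac{1}{2} M_2^2 + \tfrac{1}{2} (M_2 + K|c|)^2 \]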
and hence f (·) is P a.s. bounded. Thus by Theorem 1.6 there is a unique square-integrable
solution.
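Let us return to $R^{(\rho)}$, now given by
\[ R^{(\rho)}_t = Y_0 + \int_0^t (\rho_s + Z_s) \cdot dW_s - \int_0^t \left( \tfrac{1}{2} |\rho_s - \theta_s|^2 - \tfrac{1}{2} \operatorname{dist}(\theta_s, C_s)^2 \right) ds. \tag{4.16} \]
The integrand of the stochastic integral is in $H^{2,n}_T$ so the stochastic integral is a martingale.
Additionally, since $\rho_s \in C_s$ implies $|\rho_s - \theta_s| \ge \operatorname{dist}(\theta_s, C_s)$, the integrand of the drift term
is dP ⊗ dt a.s. non-positive, so $R^{(\rho)}$ is a supermartingale.
We now seek to use the measurable selection theorem 3.9 to find ρ̂.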
Lemma 4.7. There exists ρ̂ ∈ Al such that R(ρ̂) is a martingale.
Proof. The mapping
\[ (\omega, t) \mapsto \Pi_{C_t(\omega)}(\theta_t(\omega)) \equiv \{ x \in C_t(\omega) : |x - \theta_t(\omega)| = d_t(\omega) \}, \tag{4.17} \]
where $d = (\operatorname{dist}(\theta_t, C_t))_{t \in [0,T]}$, is a point-to-set mapping between the measurable space
(Ω × [0, T], P) and the Polish space $(R^n, B^n)$. Since $C_t(\omega)$ is closed, the set is non-empty for all (ω, t).
The predictability of the processes θ, σ and d (by Lemma 4.2) ensures that the second condition
of the measurable selection theorem holds. Thus there exists an $R^n$-valued predictable process
ρ̂ such that $\hat\rho_t(\omega) \in C_t(\omega)$ ∀(ω, t) and
\[ |\hat\rho_t(\omega) - \theta_t(\omega)| = \operatorname{dist}(\theta_t(\omega), C_t(\omega)) \quad \forall (\omega, t). \tag{4.18} \]
Now recalling c ∈ C̃ from before we have that P a.s.,
\begin{align*}
|\hat\rho_t| &\le |\hat\rho_t - \theta_t| + |\theta_t| \\
&= \operatorname{dist}(\theta_t, C_t) + |\theta_t| \\
&\le 2M_2 + K|c|,
\end{align*}
i.e. ρ̂ is P a.s. bounded uniformly over t ∈ [0, T]. Thus $\hat\rho \in H^{2,n}_T$. So $\hat\rho \in A_l$. Finally,
\[ R^{(\hat\rho)}_t = Y_0 + \int_0^t (\hat\rho_s + Z_s) \cdot dW_s \tag{4.19} \]
so $R^{(\hat\rho)}$ is a martingale.
It only takes a few additional steps to prove our main theorem:
Theorem 4.8. Our optimisation problem is maximised at $\hat\rho \in A_l$ as defined above, with value
\[ V(x) = \log(x) + E\left[ \int_0^T f(s)\,ds \right] \tag{4.20} \]
where
\[ f(t) = r_t + \tfrac{1}{2} |\theta_t|^2 - \tfrac{1}{2} \operatorname{dist}(\theta_t, C_t)^2. \]
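Proof. Let $\rho \in A_l$. Then by the supermartingale property of $R^{(\rho)}$ and the martingale property of $R^{(\hat\rho)}$,
\[ E\left[ R^{(\rho)}_T \right] \le R^{(\rho)}_0 = R^{(\hat\rho)}_0 = E\left[ R^{(\hat\rho)}_T \right]. \]
So
\begin{align*}
V(x) &= \log(x) + \sup_{\rho \in A_l} E\left[ \int_0^T \rho_s \cdot dW_s + \int_0^T \left( r_s + \rho_s \cdot \theta_s - \tfrac{1}{2} |\rho_s|^2 \right) ds \right] \\
&= \log(x) + \sup_{\rho \in A_l} E\left[ R^{(\rho)}_T \right] \\
&= \log(x) + E\left[ R^{(\hat\rho)}_T \right] \\
&= \log(x) + Y_0 \\
&= \log(x) + E\left[ \int_0^T f(s)\,ds - \int_0^T Z_s^*\,dW_s \right] \\
&= \log(x) + E\left[ \int_0^T f(s)\,ds \right],
\end{align*}
where the last equality follows since $Z \in H^{2,n}_T$ so the stochastic integral is a martingale.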
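For constant coefficients the value (4.20) can be computed by hand, which gives a simple numerical check. The sketch below assumes d = n = 1, constant r, θ and σ > 0, and the constraint set C̃ = [lo, hi] (for instance no short selling and no leverage when lo = 0, hi = 1); all of these choices and the function names are illustrative assumptions. The brute-force comparison only ranges over constant ρ, so it is a sanity check rather than a verification of the theorem.

```python
import math

def project_onto_interval(a, lo, hi):
    """Nearest point of the closed interval [lo, hi] to a."""
    return min(max(a, lo), hi)

def log_utility_value(x, r, theta, sigma, lo, hi, T):
    """Evaluate (4.20) for constant coefficients and C-tilde = [lo, hi].

    With d = n = 1 and sigma > 0 the image set is C_t = [sigma*lo, sigma*hi].
    """
    c_lo, c_hi = sigma * lo, sigma * hi
    rho_hat = project_onto_interval(theta, c_lo, c_hi)  # element of Pi_{C_t}(theta)
    dist = abs(theta - rho_hat)
    f = r + 0.5 * theta**2 - 0.5 * dist**2              # driver (4.14)
    return math.log(x) + f * T, rho_hat

def expected_log_wealth(x, r, theta, rho, T):
    """E[log X_T] for a constant rho, read off from (4.11)."""
    return math.log(x) + (r + rho * theta - 0.5 * rho**2) * T

x, r, theta, sigma, lo, hi, T = 1.0, 0.02, 0.4, 0.2, 0.0, 1.0, 1.0
value, rho_hat = log_utility_value(x, r, theta, sigma, lo, hi, T)

# Brute-force search over constant rho in C_t = [sigma*lo, sigma*hi].
grid = [sigma * (lo + (hi - lo) * k / 1000) for k in range(1001)]
best_on_grid = max(expected_log_wealth(x, r, theta, rho, T) for rho in grid)
```

With these numbers θ lies outside $C_t$ = [0, 0.2], the optimal ρ̂ is the boundary point 0.2, and the closed-form value agrees with the best constant strategy found on the grid.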
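Remark 4.9. 1. Suppose we have no set constraint, or equivalently $\tilde{C} = R^d$. It follows that
$C_t = \sigma_t^* R^d = \operatorname{image}(\sigma_t^*)$, and by definition $\theta_t \in \operatorname{image}(\sigma_t^*)$ ∀t ∈ [0, T] P a.s., so
$\operatorname{dist}(\theta_t, C_t) = 0$ and we get the simplified form of the value function
\begin{align*}
V(x) &= \log(x) + E\left[ \int_0^T \left( r_s + \tfrac{1}{2} |\theta_s|^2 \right) ds \right] \\
&= \log(x) + \tfrac{1}{2} \|\theta\|^2 + E\left[ \int_0^T r_s\,ds \right]. \tag{4.21}
\end{align*}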
2. By using trading strategies described by (X (ρ) , ρ) we are constraining X (ρ) to be positive.

Were we to allow X to be merely bounded below, our derived value of V (x) may be greater,
but we would not be able to take advantage of the properties of the stochastic exponential.
We have also required that admissible processes ρ be square-integrable, another constraint
required for this derivation to work.
Remark 4.10. The proofs of the analogous results for power and exponential utility are slightly
more complicated, but follow the same general idea of defining a family of processes R which has
nice martingale and supermartingale properties, and which incorporates an underlying BSDE.
These proofs are given in Hu, Imkeller and Müller (2005) in the case that the short rate r is zero.
References
[1] Bismut, J., ‘Conjugate Convex Functions In Optimal Stochastic Control’, Journal Of Mathematical Analysis And Applications, vol. 44, no. 2, 1973, pp. 384-404.
[2] Black, F. and M Scholes, ‘The Pricing Of Options And Corporate Liabilities’, The Journal Of Political Economy, vol. 81, no. 3, 1973, pp. 637-654.
[3] Dellacherie, C., ‘Sur L’Existence De Certains Essinf Et Essup De Familles De Processus Mesurables’, in Séminaire De Probabilités XII, C Dellacherie, PA Meyer and M Weil (eds), Springer-Verlag, Berlin And New York, 1977, pp. 512-514.
[4] Dudley, R.M., ‘Wiener Functionals As Itô Integrals’, The Annals Of Probability, vol. 5, no. 1, 1977, pp. 140-141.
[5] El Karoui, N., ‘Backward Stochastic Differential Equations: A General Introduction’, in Backward Stochastic Differential Equations, N El Karoui and L Mazliak (eds), CRC Press, Harlow, 1997, pp. 7-26.
[6] El Karoui, N., S Peng and M Quenez, ‘Backward Stochastic Differential Equations In Finance’, Mathematical Finance, vol. 7, no. 1, 1997, pp. 1-71.
[7] Hu, Y., P Imkeller and M Müller, ‘Utility Maximization In Incomplete Markets’, The Annals Of Applied Probability, vol. 15, no. 3, 2005, pp. 1691-1712.
[8] Karatzas, I. and SE Shreve, Brownian Motion And Stochastic Calculus, 2nd edn, Springer-Verlag, New York, 1991.
[9] Kuratowski, K. and C Ryll-Nardzewski, ‘A General Theorem On Selectors’, Bull. Acad. Polon. Sci. Sér. Sci. Math. Astronom. Phys., vol. 13, 1965, pp. 397-403.
[10] Merton, R., ‘Optimum Consumption And Portfolio Rules In A Continuous-Time Model’, Journal Of Economic Theory, vol. 3, no. 4, 1971, pp. 373-413.
[11] Pardoux, É. and S Peng, ‘Adapted Solution Of A Backward Stochastic Differential Equation’, Systems & Control Letters, vol. 14, no. 1, 1990, pp. 55-61.
[12] Peng, S., ‘A Generalized Dynamic Programming Principle And Hamilton-Jacobi-Bellman Equation’, Stochastics And Stochastic Reports, vol. 38, no. 2, 1992, pp. 119-134.
[13] Quenez, M., ‘Stochastic Control And BSDEs’, in Backward Stochastic Differential Equations, N El Karoui and L Mazliak (eds), CRC Press, Harlow, 1997, pp. 83-100.