Slides Seance04
3 Practical implementation
It requires the discretization of $(X^x, \nabla X^x)$. One may use the usual Euler scheme on the grid $\pi$, setting:
$$X^\pi_{t_{i+1}} = X^\pi_{t_i} + b(X^\pi_{t_i})\,(t_{i+1}-t_i) + \sum_{j=1}^d \sigma^{\cdot j}(X^\pi_{t_i})\,(W^j_{t_{i+1}} - W^j_{t_i})$$
$$\nabla X^\pi_{t_{i+1}} = \nabla X^\pi_{t_i} + \nabla b(X^\pi_{t_i})\,\nabla X^\pi_{t_i}\,(t_{i+1}-t_i) + \sum_{j=1}^d \nabla\sigma^{\cdot j}(X^\pi_{t_i})\,\nabla X^\pi_{t_i}\,(W^j_{t_{i+1}} - W^j_{t_i})$$
Remark: simplification if d = 1.
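As an illustration, here is a minimal Monte Carlo sketch of this joint Euler scheme in the simplified case $d = 1$ (so that $\nabla X$ is scalar); the coefficient functions and parameter values below are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def euler_tangent_paths(x0, b, db, sig, dsig, T, n_steps, n_paths, seed=0):
    """Simulate (X^pi, nabla X^pi) with the Euler scheme, d = 1."""
    rng = np.random.default_rng(seed)
    h = T / n_steps
    X = np.full(n_paths, float(x0))
    DX = np.ones(n_paths)                      # nabla X_0 = 1 (derivative w.r.t. x0)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(h), n_paths)
        X_new = X + b(X) * h + sig(X) * dW
        DX = DX + db(X) * DX * h + dsig(X) * DX * dW   # tangent process step
        X = X_new
    return X, DX

# Example with (hypothetical) Black-Scholes coefficients b(x) = r x, sigma(x) = s x
r, s = 0.02, 0.2
X_T, DX_T = euler_tangent_paths(
    x0=1.0,
    b=lambda x: r * x, db=lambda x: r * np.ones_like(x),
    sig=lambda x: s * x, dsig=lambda x: s * np.ones_like(x),
    T=1.0, n_steps=100, n_paths=10_000,
)
```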
3.3 Greek weights
$$u(\theta) = \mathbb{E}\big[g(X_T^\theta)\big]$$
Example 3.3. (i) If $\theta = x$, where $x$ is the starting point of $X$, then we are computing the 'Delta'.
(ii) If $X$ follows a Black-Scholes model and $\theta = \sigma$, the volatility of $X$, then we are computing the 'Vega'.
• We are interested in finding a random variable $\Pi$ s.t.
$$\partial_\theta u(\theta_0) = \mathbb{E}\big[g(X_T^{\theta_0})\,\Pi\big]$$
• Precisely, we consider
$$\mathcal{W} := \big\{\Pi\in L^2 \mid \partial_\theta u(\theta_0) = \mathbb{E}\big[g(X_T^{\theta_0})\,\Pi\big]\ \text{for all bounded measurable functions } g\big\}$$
$$\partial_\theta u(\theta_0) = \mathbb{E}\big[g(X_T^{\theta_0})\,\Pi_\epsilon\big] + E(\epsilon)$$
where $\Pi_\epsilon$ is an approximation of such a Greek weight and $E(\epsilon)$ is the associated error.
3.3.1 Likelihood Ratio Method
• Assuming that $X_T^\theta$ has a differentiable positive probability density $f(\theta, \cdot)$:
$$\partial_\theta\,\mathbb{E}\big[g(X_T^\theta)\big]\Big|_{\theta=\theta_0} = \mathbb{E}\big[g(X_T^{\theta_0})\,s(\theta_0, X_T^{\theta_0})\big] \quad\text{with}\quad s(\theta, x) = \partial_\theta \ln\big(f(\theta, x)\big)\,.$$
We compute
$$\partial_\theta\,\mathbb{E}\big[g(X_T^\theta)\big] = \partial_\theta \int g(x)\,f(\theta, x)\,dx = \int g(x)\,\partial_\theta f(\theta, x)\,dx = \int g(x)\,\frac{\partial_\theta f(\theta, x)}{f(\theta, x)}\,f(\theta, x)\,dx$$
• If $s^{\theta_0} := s(\theta_0, X_T^{\theta_0}) \in L^2$, then
$$\mathcal{W} = \big\{\Pi\in L^2 \mid \mathbb{E}\big[\Pi \mid X_T^{\theta_0}\big] = s^{\theta_0}\big\}$$
$$\begin{aligned}
\mathrm{Var}\big[g(X_T^{\theta_0})\,\Pi\big] &= \mathbb{E}\big[(g(X_T^{\theta_0})\,\Pi)^2\big] - (\partial_\theta u)^2 \\
&= \mathbb{E}\Big[\mathbb{E}\big[(g(X_T^{\theta_0})\,\Pi)^2 \mid X_T^{\theta_0}\big]\Big] - (\partial_\theta u)^2 \\
&\ge \mathbb{E}\Big[\big(\mathbb{E}\big[g(X_T^{\theta_0})\,\Pi \mid X_T^{\theta_0}\big]\big)^2\Big] - (\partial_\theta u)^2 \\
&= \mathrm{Var}\big[g(X_T^{\theta_0})\,s^{\theta_0}\big]
\end{aligned}$$
Example: Black-Scholes delta. In the Black-Scholes setting, we have
$$\partial_x u(0, x) = e^{-rT}\,\mathbb{E}\Big[g(X_T^x)\,\frac{W_T}{x\,\sigma\,T}\Big]\,.$$
Indeed, the density of $X_T^x$ is
$$h(y) = \frac{1}{y\,\sigma\sqrt{T}}\,\varphi(\zeta(y))\,, \qquad \zeta(y) = \frac{\log(y/x) - (r - \tfrac12\sigma^2)T}{\sigma\sqrt{T}}\,,$$
and we compute
$$\partial_x h(y)/h(y) = -\zeta(y)\,\partial_x\zeta(y) = \frac{\log(y/x) - (r - \tfrac12\sigma^2)T}{x\,\sigma^2\,T}\,.$$
Similar computations lead to the following form for the Vega:
$$\partial_\sigma u(0, x) = e^{-rT}\,\mathbb{E}\Big[g(X_T^x)\,\Big(\frac{W_T^2}{\sigma T} - W_T - \frac{1}{\sigma}\Big)\Big]\,.$$
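As an illustration, a minimal Monte Carlo sketch of the likelihood-ratio Delta and Vega estimators based on these two weights; the call payoff and all parameter values are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def bs_delta_vega_lr(x, r, sigma, T, payoff, n_paths=200_000, seed=0):
    """Likelihood-ratio estimators of Delta and Vega in the Black-Scholes model."""
    rng = np.random.default_rng(seed)
    W_T = rng.normal(0.0, np.sqrt(T), n_paths)
    X_T = x * np.exp((r - 0.5 * sigma**2) * T + sigma * W_T)
    disc = np.exp(-r * T) * payoff(X_T)
    delta = np.mean(disc * W_T / (x * sigma * T))
    vega = np.mean(disc * (W_T**2 / (sigma * T) - W_T - 1.0 / sigma))
    return delta, vega

# Illustrative call payoff g(y) = (y - K)^+
delta, vega = bs_delta_vega_lr(x=100.0, r=0.05, sigma=0.2, T=1.0,
                               payoff=lambda y: np.maximum(y - 100.0, 0.0))
```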
Higher order derivatives. Assuming that $X_T^\theta$ has a twice differentiable positive probability density $f(\theta, x)$:
$$\partial^2_{\theta\theta}\,\mathbb{E}\big[g(X_T^\theta)\big] = \mathbb{E}\Big[g(X_T^\theta)\,\frac{\partial^2_{\theta\theta} f(\theta, X_T^\theta)}{f(\theta, X_T^\theta)}\Big]\,.$$
3.3.2 Integration by parts
• We consider the Delta in the Black-Scholes setting ($r = 0$); the payoff function $g$ is $C^1_b$.
$$\partial_x u(0, x) = \frac{1}{x}\,\mathbb{E}\big[g'(X_T^x)\,X_T^x\big]\,.$$
$$\partial_x u(0, x) = \frac{1}{x\,\sigma\,T}\,\mathbb{E}\big[g(X_T^x)\,W_T\big]\,.$$
We have
$$\partial_x u(0, x) = \int g'\big(x\,e^{-\sigma^2 T/2 + \sigma\sqrt{T}\,y}\big)\,e^{-\sigma^2 T/2 + \sigma\sqrt{T}\,y}\,\varphi(y)\,dy$$
which rewrites
$$\partial_x u(0, x) = \frac{1}{x\,\sigma\sqrt{T}}\int \partial_y\Big[g\big(x\,e^{-\sigma^2 T/2 + \sigma\sqrt{T}\,y}\big)\Big]\,\varphi(y)\,dy = \frac{1}{x\,\sigma\sqrt{T}}\int g\big(x\,e^{-\sigma^2 T/2 + \sigma\sqrt{T}\,y}\big)\,y\,\varphi(y)\,dy$$
3.3.3 Bismut’s formula
Theorem 3.2.
$$\partial_x u(0, x) = \mathbb{E}\Big[g(X_T^x)\,\frac{1}{T}\int_0^T \sigma(X_s^x)^{-1}\,\nabla X_s^x\,dW_s\Big]\,. \tag{3.4}$$
(Here we have $\Pi = \frac{1}{T}\int_0^T \sigma(X_s^x)^{-1}\,\nabla X_s^x\,dW_s$.)
and then
$$\partial_x u(0, x) = \frac{1}{T}\int_0^T \mathbb{E}\big[\partial_x u(s, X_s^x)\,\nabla X_s^x\big]\,ds$$
$$\mathbb{E}\Big[\int_0^T \partial_x u(s, X_s^x)\,\nabla X_s^x\,ds\Big] = \mathbb{E}\Big[\int_0^T \partial_x u(s, X_s^x)\,\sigma(X_s^x)\,dW_s\ \int_0^T \sigma(X_s^x)^{-1}\,\nabla X_s^x\,dW_s\Big]$$
And via the martingale representation theorem, since $u$ is solution to the PDE,
$$\int_0^T \partial_x u(s, X_s^x)\,\sigma(X_s^x)\,dW_s = g(X_T^x) - u(0, x)$$
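A sketch of a Monte Carlo estimator of (3.4) in dimension 1, combining the Euler scheme for $(X, \nabla X)$ above with a discretized version of the stochastic integral; the Black-Scholes-type coefficients, the call payoff and $r = 0$ are illustrative assumptions.

```python
import numpy as np

def bismut_delta(x0, b, db, sig, dsig, g, T, n_steps, n_paths, seed=0):
    """Estimate d/dx E[g(X_T^x)] with weight (1/T) * int_0^T sigma(X_s)^{-1} grad X_s dW_s."""
    rng = np.random.default_rng(seed)
    h = T / n_steps
    X = np.full(n_paths, float(x0))
    DX = np.ones(n_paths)
    I = np.zeros(n_paths)                      # running stochastic integral
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(h), n_paths)
        I += DX / sig(X) * dW                  # sigma(X_s)^{-1} grad X_s dW_s
        X_new = X + b(X) * h + sig(X) * dW
        DX = DX + db(X) * DX * h + dsig(X) * DX * dW
        X = X_new
    return np.mean(g(X) * I / T)

# Illustrative use (r = 0, sigma(x) = s x, call payoff)
r, s = 0.0, 0.2
delta = bismut_delta(1.0,
                     b=lambda x: r * x, db=lambda x: r * np.ones_like(x),
                     sig=lambda x: s * x, dsig=lambda x: s * np.ones_like(x),
                     g=lambda y: np.maximum(y - 1.0, 0.0),
                     T=1.0, n_steps=100, n_paths=100_000)
```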
4 U.S. options in a complete market
$W$ is a standard Brownian motion under the risk-neutral probability, and the market is complete ($\sigma$ is invertible).
Upon exercise at time $t$, the option pays its owner $g(X_t)$ ($g$ is the exercise payoff, Lipschitz continuous).
• Super-hedging price:
$$p^{us}_0 := \inf\big\{y\in\mathbb{R} \mid \exists\,\phi \text{ admissible financial strategy s.t. } V^{y,\phi}_t \ge g(X_t)\ \forall\, t\in[0,T]\big\}$$
$\hookrightarrow$ at time 0:
$$p^{us}_0 = \sup_{\tau\in\mathcal{T}_{[0,T]}} \mathbb{E}\big[e^{-r\tau}\,g(X_\tau)\big] = \mathbb{E}\big[e^{-r\tau^\star}\,g(X_{\tau^\star})\big]$$
$\hookrightarrow$ at time $t\in[0,T]$:
$$p^{us}_t = \operatorname*{esssup}_{\tau\in\mathcal{T}_{[t,T]}} \mathbb{E}\big[e^{-r(\tau - t)}\,g(X_\tau) \mid \mathcal{F}_t\big]$$
The discounted price is a super-martingale
• (non-linear) PDE representation: $p^{us}(t, x) = \sup_{\tau\in\mathcal{T}_{[t,T]}} \mathbb{E}\big[e^{-r(\tau - t)}\,g(X_\tau^{t,x})\big]$ is solution³ to
$$\min\big\{-\mathcal{L}_X u + r u\,,\ u - g\big\} = 0 \ \text{ on } [0, T)\times\mathbb{R}\,, \qquad u(T, \cdot) = g(\cdot)$$
• This comes from the Dynamic Programming Principle: for all stopping times $\theta$ with $t\le\theta\le T$,
$$p^{us}(t, x) = \sup_{\tau\in\mathcal{T}_{[t,T]}} \mathbb{E}\Big[e^{-r(\tau - t)}\,g(X_\tau^{t,x})\,\mathbf{1}_{\{\theta > \tau\}} + e^{-r(\theta - t)}\,p^{us}(\theta, X_\theta^{t,x})\,\mathbf{1}_{\{\theta \le \tau\}}\Big]$$
³ In some sense... e.g. in the viscosity sense.
4.2 Bermudan option
• An option that can be exercised only at a discrete set of times $\mathfrak{R}$ during its life:
$$\mathfrak{R} = \{0 =: s_0, \ldots, s_j, \ldots, s_\kappa := T\}\,.$$
• Super-hedging price:
• Dual representation of the price:
$\hookrightarrow$ at time 0:
$$p^b_0 = \sup_{\tau\in\mathcal{T}^{\mathfrak{R}}_{[0,T]}} \mathbb{E}\big[e^{-r\tau}\,g(X_\tau)\big] = \mathbb{E}\big[e^{-r\tau^*}\,g(X_{\tau^*})\big]$$
$\hookrightarrow$ at time $t\in\mathfrak{R}$:
$$p^b_t = \operatorname*{esssup}_{\tau\in\mathcal{T}^{\mathfrak{R}}_{[t,T]}} \mathbb{E}\big[e^{-r(\tau - t)}\,g(X_\tau) \mid \mathcal{F}_t\big]$$
• Backward Programming Algorithm (denoting $G := g(X)$): starting from $Y_{s_\kappa} = G_{s_\kappa}$,
$$Y_{s_j} = \max\big(C_{s_j},\, G_{s_j}\big) \quad\text{where}\quad C_{s_j} := \mathbb{E}\big[e^{-r(s_{j+1} - s_j)}\,Y_{s_{j+1}} \mid \mathcal{F}_{s_j}\big] \ \text{(continuation value)}$$
Proposition 4.1. (i) $Y$ is the smallest super-martingale above $G = g(X)$. Moreover, for
$$\tau^\star = \inf\{t\in\mathfrak{R} \mid Y_t = G_t\}\wedge T\,,$$
we have $Y_0 = \mathbb{E}[G_{\tau^\star}]$.
Proof.
(i) by induction
(ii) Since $Y$ is a super-martingale above $Z$ on $\mathfrak{R}$, we observe that $Y_0 \ge \mathbb{E}[Y_\tau] \ge \mathbb{E}[Z_\tau]$ for all $\tau\in\mathcal{T}^{\mathfrak{R}}_{[0,T]}$.
(iii) For $t\in\mathfrak{R}$, we have that $Y_t = Y_0 + M^\star_t - A^\star_t$ where $M^\star$ is a martingale and $A^\star$ is an increasing process given recursively by $(M^\star_0, A^\star_0) = (0, 0)$ and
$$M^\star_{s_{j+1}} = M^\star_{s_j} + Y_{s_{j+1}} - \mathbb{E}\big[Y_{s_{j+1}} \mid \mathcal{F}_{s_j}\big]$$
$$A^\star_{s_{j+1}} = A^\star_{s_j} + Y_{s_j} - \mathbb{E}\big[Y_{s_{j+1}} \mid \mathcal{F}_{s_j}\big]$$
We notice that $Y_{s_j} - \mathbb{E}\big[Y_{s_{j+1}} \mid \mathcal{F}_{s_j}\big] = [G_{s_j} - C_{s_j}]^+$.
(iv) a. For any initial wealth $y$ and admissible strategy $\phi$ such that $V^{y,\phi}$ super-hedges $G$, the smallest super-martingale property gives $y = V^{y,\phi}_0 \ge Y_0$.
b. Denote $\eta^\star = Y_0 + M^\star$; we have that
$$Y_0 = \mathbb{E}\big[\eta^\star_{\tau^\star}\big] = \mathbb{E}\big[\eta^\star_T\big]\,.$$
In our setting, this means that we can replicate $\eta^\star_T$ with an initial wealth of $Y_0$, i.e. there exists an admissible strategy $\phi$ s.t. $V^{Y_0,\phi}_t = \eta^\star_t = Y_t + A^\star_t \ge G_t$ for all $t\in\mathfrak{R}$.
Thus $Y_0$ is a super-hedging initial wealth. $\square$
Proposition 4.2. Set $|\mathfrak{R}| = \max_{0 < j \le \kappa}(s_j - s_{j-1})$; then the following holds:
$$\sup_{t\in[0,T]} \mathbb{E}\big[|p^b_t - p^{us}_t|^2\big] \le C\,|\mathfrak{R}|^{\alpha}$$
$$0 \le p^{us}_t - p^b_t$$
Let $\bar\tau$ be the projection on the grid $\mathfrak{R}$ of $\tau^*$ (the optimal stopping time for the US option); we have
$$\begin{aligned}
p^{us}_t - p^b_t &= \mathbb{E}\big[g(X_{\tau^*}) \mid \mathcal{F}_t\big] - p^b_t \\
&\le \mathbb{E}\big[g(X_{\tau^*}) \mid \mathcal{F}_t\big] - \mathbb{E}\big[g(X_{\bar\tau}) \mid \mathcal{F}_t\big] \\
&\le C\,\mathbb{E}\big[|X_{\tau^*} - X_{\bar\tau}| \mid \mathcal{F}_t\big] \\
&\le C\,\mathbb{E}\big[|\tau^* - \bar\tau|^{\frac12} \mid \mathcal{F}_t\big] \le C\,|\mathfrak{R}|^{\frac12}
\end{aligned}$$
4.2.1 Discretisation of the forward process
$$p^\pi_0 = \sup_{\tau\in\mathcal{T}^{\mathfrak{R}}_{[0,T]}} \mathbb{E}\big[e^{-r\tau}\,g(X^\pi_\tau)\big] = \mathbb{E}\big[e^{-r\tau^*}\,g(X^\pi_{\tau^*})\big]$$
Proposition 4.3. In our setting, we have
$$|p^b_0 - p^\pi_0| \le C\,|\pi|^{\frac12}\,.$$
Extension to the continuous case:
$$|p^{us}_0 - p^\pi_0| \le C\,\sqrt{|\pi|}\,.$$
4.2.2 Longstaff-Schwartz algorithm
• Observe that in Definition (4.1), the optimal stopping time $\hat\tau_0$ (from time 0) can be estimated by:
1. set $\hat\tau_\kappa = T$
• The Longstaff-Schwartz algorithm [59] focuses on the sequence of optimal exercise times:
1. set $\tilde\tau_\kappa = T$
2. then set $\tilde\tau_j = s_j\,\mathbf{1}_{E_j} + \tilde\tau_{j+1}\,\mathbf{1}_{E_j^c}$ with $E_j = \big\{\mathbb{E}\big[G_{\tilde\tau_{j+1}} \mid \mathcal{F}_{s_j}\big] \le G_{s_j}\big\}$.
• Note that the value process $Y$ is computed only through its representation in terms of optimal stopping times: indeed, $Y_{s_j} = \max\big(G_{s_j},\, \mathbb{E}\big[G_{\tilde\tau_{j+1}} \mid \mathcal{F}_{s_j}\big]\big) = \mathbb{E}\big[G_{\tilde\tau_j} \mid \mathcal{F}_{s_j}\big]$. A sketch of the resulting algorithm is given below.
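A minimal sketch of the Longstaff-Schwartz recursion on simulated Black-Scholes paths, with a polynomial regression basis; the model, the put payoff, the basis and all parameters are illustrative assumptions, not those of [59].

```python
import numpy as np

def longstaff_schwartz(x0, r, sigma, T, K, n_dates, n_paths, deg=3, seed=0):
    """Price a Bermudan put by regressing the payoff at the current stopping time."""
    rng = np.random.default_rng(seed)
    h = T / n_dates
    # simulate Black-Scholes paths on the exercise grid
    dW = rng.normal(0.0, np.sqrt(h), (n_paths, n_dates))
    X = x0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * h + sigma * dW, axis=1))
    X = np.hstack([np.full((n_paths, 1), x0), X])
    payoff = np.maximum(K - X, 0.0)

    # cash-flow at the current (estimated) optimal stopping time, discounted back step by step
    cash = payoff[:, -1]
    for j in range(n_dates - 1, 0, -1):
        cash = np.exp(-r * h) * cash
        itm = payoff[:, j] > 0                    # regress on in-the-money paths only
        coeff = np.polyfit(X[itm, j], cash[itm], deg)
        cont = np.polyval(coeff, X[:, j])         # estimated continuation value c_j(X_{s_j})
        exercise = itm & (payoff[:, j] >= cont)
        cash = np.where(exercise, payoff[:, j], cash)
    return max(payoff[0, 0], np.exp(-r * h) * np.mean(cash))

price = longstaff_schwartz(x0=100., r=0.05, sigma=0.2, T=1.0, K=100.,
                           n_dates=50, n_paths=100_000)
```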
4.3 Dual approach
The inf is achieved for $M^\star$, the martingale part of the Doob-Meyer decomposition, and
$$Y_0 = \sup_{t\in\mathfrak{R}}\big(G_t - M^\star_t\big)\,.$$
• The previous representation has been introduced in [71] (for the continuous case) and simultaneously in [47] where
Proof. 1. First, we observe that
$$Y_0 = \sup_{\tau} \mathbb{E}\big[G_\tau - M_\tau\big] \le \mathbb{E}\Big[\sup_{\tau}\big(G_\tau - M_\tau\big)\Big] = \mathbb{E}\Big[\sup_{t\in\mathfrak{R}}\big(G_t - M_t\big)\Big]\,.$$
2. Recall the Doob-Meyer decomposition theorem: $Y_t = Y_0 + M^\star_t - A^\star_t$ where $A^\star$ is an increasing predictable process, so that
$$G_t \le Y_t = Y_0 + M^\star_t - A^\star_t \le Y_0 + M^\star_t\,,$$
which leads to $\sup_{t\in\mathfrak{R}}\big(G_t - M^\star_t\big) \le Y_0$.
• In [71], sub-optimal martingales are constructed in an "ad-hoc" way; see Section 4.4.2 for possible systematic approaches.
• Numerical methods based on the dual formula 4.1 naturally lead to upper bounds for the true price.
4.4 Implementation using regression techniques
• The regression part is required to estimate the conditional expectation (linear or nonlinear [74, 47])
$$\mathfrak{R} = \{0 =: s_0, \ldots, s_j, \ldots, s_\kappa := T\}$$
(ii) For $j < \kappa$, compute
$$Y_j = \max\big(\mathbb{E}\big[Y_{j+1} \mid \mathcal{F}_{s_j}\big],\ g(X_{s_j})\big)$$
• Denote $C_j := c_j(X_{s_j}) := \mathbb{E}\big[Y_{j+1} \mid X_{s_j}\big] = \mathbb{E}\big[Y_{j+1} \mid \mathcal{F}_{s_j}\big]$ (the continuation value); we have
$$Y_j = c_j(X_{s_j}) \vee g(X_{s_j})$$
• Observe that
$$C_j = \operatorname*{argmin}_{Z\in L^2(\mathcal{F}_{s_j})} \mathbb{E}\big[|Y_{j+1} - Z|^2\big] = \operatorname*{argmin}_{Z\in L^2(\sigma(X_{s_j}))} \mathbb{E}\big[|Y_{j+1} - Z|^2\big]$$
$$c_j = \operatorname*{argmin}_{f\in L^2_j} \mathbb{E}\big[|Y_{j+1} - f(X_{s_j})|^2\big]$$
• $L^2_j$ is too big! So one considers some basis functions $(\psi_\ell)_{\ell\ge 1}$ and then picks a finite number of them, say $K$, to get
$$f \simeq \sum_{\ell=1}^K \lambda_\ell\,\psi_\ell$$
• With this approximation, the minimisation problem can be rewritten
$$\bar\lambda_j = \operatorname*{argmin}_{\lambda\in\mathbb{R}^K} \mathbb{E}\Big[\Big|u_{j+1}(X_{s_{j+1}}) - \sum_{\ell=1}^K \lambda_\ell\,\psi_\ell(X_{s_j})\Big|^2\Big] \tag{4.2}$$
$$\bar c_j := \sum_{\ell=1}^K \bar\lambda^\ell_j\,\psi_\ell\,, \quad\text{and}\quad c_j = \bar c_j + \text{error}$$
($C_j = \bar C_j + \text{error}$ where $\bar C_j = \bar c_j(X_{s_j}) = \sum_{\ell=1}^K \bar\lambda^\ell_j\,\psi_\ell(X_{s_j})$)
• The coefficient $\bar\lambda_j$ is easily calculated, once we observe that $\bar C_j$ is the orthogonal projection of $Y_{j+1}$ on $\mathcal{V} := \operatorname{span}\{\psi_\ell(X_{s_j}),\ 1\le\ell\le K\}$.
We denote $\psi := (\psi_1, \ldots, \psi_K)$ and set
$$B^j = \mathbb{E}\big[\psi(X_{s_j})^\top\,\psi(X_{s_j})\big]\,, \qquad B^j_u = \mathbb{E}\big[\psi(X_{s_j})^\top\,u_{j+1}(X_{s_{j+1}})\big]\,,$$
we have that
$$\bar\lambda_j = (B^j)^{-1}\,B^j_u\,,$$
assuming $B^j$ non-singular.
By the property of the orthogonal projection on the vector space $\mathcal{V}$, we have $\mathbb{E}\big[(Y_{j+1} - \bar C_j)\,Z\big] = 0$ for all $Z\in\mathcal{V}$, leading to
$$\mathbb{E}\Big[\Big(Y_{j+1} - \sum_{\ell=1}^K \bar\lambda^\ell_j\,\psi_\ell(X_{s_j})\Big)\,\psi_r(X_{s_j})\Big] = 0\,, \qquad 1\le r\le K\,.$$
• In practice, one has to consider estimated counterparts of $B^j$, $B^j_u$, ...
$$B^j \simeq \frac{1}{N}\sum_{i=1}^N \psi(X^i_{s_j})^\top\,\psi(X^i_{s_j})$$
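A sketch of the empirical counterpart of $\bar\lambda_j = (B^j)^{-1}B^j_u$ at one fixed date, using a hypothetical monomial basis $\psi_\ell(x) = x^{\ell-1}$; names and the toy data at the end are assumptions made for illustration.

```python
import numpy as np

def regression_coefficients(x_j, y_next, K=4):
    """Least-squares estimate of lambda_bar_j from samples of X_{s_j} and of the target Y_{j+1}."""
    psi = np.vander(x_j, K, increasing=True)        # psi_l(x) = x^(l-1), N x K design matrix
    B = psi.T @ psi / len(x_j)                      # empirical B^j
    B_u = psi.T @ y_next / len(x_j)                 # empirical B^j_u
    lam = np.linalg.solve(B, B_u)                   # assumes B^j non-singular
    c_bar = lambda x: np.vander(np.atleast_1d(x), K, increasing=True) @ lam
    return lam, c_bar                               # c_bar approximates E[Y_{j+1} | X_{s_j} = x]

# Toy usage: noisy target around a put-like function of X_{s_j}
rng = np.random.default_rng(0)
x = rng.lognormal(size=10_000)
y = np.maximum(1.0 - x, 0.0) + 0.1 * rng.normal(size=10_000)
lam, c_bar = regression_coefficients(x, y)
```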
• Full approximation (Tsitsiklis-Van Roy): observe that in practice one has to estimate $\bar u_j = \bar c_j \vee g$ instead of $u_j$;
set $\hat{\bar c}_j := \sum_{\ell=1}^K \hat{\bar\lambda}^\ell_j\,\psi_\ell$ and $\hat{\bar u}_j = \hat{\bar c}_j \vee g$.
• One can compute an approximated optimal policy $\hat\tau^*$ on each path and recompute $\mathbb{E}\big[g(X_{\hat\tau^*})\big]$.
Remark 4.1. 1. The choice of the basis functions is key, especially in high dimension.
2. The linear-regression method is used for the Longstaff-Schwartz Algorithm [59].
• Based on Theorem 4.1, one should find a "good" martingale. Various approaches beyond "guessing" [71] have been proposed.
• The martingale to use in (4.1) is $M^\star$, whose increments are given by
$$M^\star_{s_{j+1}} - M^\star_{s_j} = Y_{s_{j+1}} - \mathbb{E}\big[Y_{s_{j+1}} \mid \mathcal{F}_{s_j}\big]\,. \tag{4.3}$$
• To ensure the martingale property, Haugh & Kogan suggest to recompute the previous martingale increment for each simulated path by re-simulation:
– Simulate $M$ paths of $X$: for each $X^m_{s_j}$, compute $\mathbb{E}\big[\hat{\bar u}(X_{s_{j+1}}) \mid X_{s_j}\big]$ by re-simulation and set
$$\hat M_{j+1} = \hat M_j + \hat\Delta_j \quad\text{with}\quad \hat\Delta_j = \hat{\bar u}(X_{s_{j+1}}) - \mathbb{E}\big[\hat{\bar u}(X_{s_{j+1}}) \mid X_{s_j}\big] \tag{4.4}$$
– Compute $\frac{1}{M}\sum_{m=1}^M \max_{s_j\in\mathfrak{R}}\big(g(X^m_{s_j}) - \hat M^m_j\big)$; a sketch is given below.
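A sketch of this nested-simulation upper bound. It assumes an approximate value function `u_hat` (e.g. from the regression step above) and a one-step re-simulator are supplied by the user; these names, the array shapes and the omission of discounting are assumptions of the sketch, not part of the slides.

```python
import numpy as np

def dual_upper_bound(paths, payoff, u_hat, one_step_resim, n_inner=100):
    """Duality upper bound (1/M) sum_m max_j (G_{s_j}^m - M_hat_j^m), discounting omitted.

    paths          : (M, J+1) array of simulated values X_{s_j}^m on the exercise grid
    payoff         : vectorised exercise payoff g
    u_hat          : approximate value function u_hat(j, x), vectorised in x
    one_step_resim : (j, x, n) -> n samples of X_{s_{j+1}} given X_{s_j} = x
    """
    M, J1 = paths.shape
    G = payoff(paths)
    M_hat = np.zeros_like(G)                       # martingale built from (4.4)
    for m in range(M):
        for j in range(J1 - 1):
            inner = one_step_resim(j, paths[m, j], n_inner)       # nested simulation
            cond = np.mean(u_hat(j + 1, inner))                   # E[u_hat(X_{s_{j+1}}) | X_{s_j}]
            M_hat[m, j + 1] = M_hat[m, j] + u_hat(j + 1, paths[m, j + 1]) - cond
    return np.mean(np.max(G - M_hat, axis=1))
```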
4.5 Quantization based methods
See e.g. the review article [62] or Chapter 5 in the book [63]
• Let $X$ be an $\mathbb{R}^d$-valued random variable. Quantization $\to$ find the best approximation of $X$ by a discrete random variable.
• The Voronoi tessellation of the $M$-quantizer $x = (x_1, \ldots, x_M)$ is a (Borel) partition $C(x) := (C_i(x))_{1\le i\le M}$ s.t.
$$C_i(x) \subset \big\{\xi\in\mathbb{R}^d \mid |\xi - x_i| \le \min_{j\ne i}|\xi - x_j|\big\}$$
• Nearest-neighbour projection on $x$: $P_x : \xi\in\mathbb{R}^d \mapsto x_i$ if $\xi\in C_i(x)$
• $x$-quantization of $X$: $\hat X^x = P_x(X)$ (remark: if $X$ is absolutely continuous, any two $x$-quantizations are $\mathbb{P}$-a.s. equal).
• Quadratic mean quantization error: $\mathbb{E}\big[|X - \hat X^x|^2\big]^{\frac12} =: \|X - \hat X^x\|_2$
1. The quadratic distortion function $x \mapsto \mathbb{E}\big[|X - \hat X^x|^2\big]$ reaches a minimum at some quantizer $x^*$
2. The function $M \mapsto \mathbb{E}\big[|X - \hat X^{x^*}|^2\big]$ decreases to 0 as $M\to+\infty$
• Upper bound on the convergence rate for $X\in L^{2+\epsilon}$ for some $\epsilon > 0$: there exists $C_{d,\epsilon}$ s.t.
$$\forall\, M\ge 1\,, \quad \min_{x\in(\mathbb{R}^d)^M}\|X - \hat X^x\|_2 \le C_{d,\epsilon}\,\|X\|_{2+\epsilon}\,M^{-\frac1d} \tag{4.5}$$
• Any $L^2$-optimal $M$-quantizer is the best approximation of $X$ by $L^2$ r.v. taking at most $M$ values (least squares approximation), i.e.
$$\|X - \hat X^{x^*}\|_2 = \min\{\|X - Y\|_2 \mid \#\mathrm{Im}(Y) \le M\} \tag{4.6}$$
Proof. Set $y := \mathrm{Im}(Y) = \{y_1, \ldots, y_M\}$ and observe that
$$\min_i |X - y_i| \le |X - Y| \quad \mathbb{P}\text{-a.s.}$$
so that $\|X - Y\|_2 \ge \|X - \hat X^y\|_2 \ge \|X - \hat X^{x^*}\|_2$. $\square$
$$\mathbb{E}\big[X \mid \hat X^{x^*}\big] = \hat X^{x^*}\,. \tag{4.7}$$
(In particular $\mathbb{E}[X] = \mathbb{E}\big[\hat X^x\big]$ as soon as the quantizer is stationary, which generally does not correspond to a minimum of the distortion.)
Proof. We compute
$$\begin{aligned}
\mathbb{E}\big[|X - \hat X^{x^*}|^2\big] &= \mathbb{E}\Big[\big|X - \mathbb{E}\big[X \mid \hat X^{x^*}\big]\big|^2 + 2\big(X - \mathbb{E}\big[X \mid \hat X^{x^*}\big]\big)\big(\mathbb{E}\big[X \mid \hat X^{x^*}\big] - \hat X^{x^*}\big) + \big|\mathbb{E}\big[X \mid \hat X^{x^*}\big] - \hat X^{x^*}\big|^2\Big] \\
&= \mathbb{E}\Big[\big|X - \mathbb{E}\big[X \mid \hat X^{x^*}\big]\big|^2\Big] + \mathbb{E}\Big[\big|\mathbb{E}\big[X \mid \hat X^{x^*}\big] - \hat X^{x^*}\big|^2\Big]
\end{aligned}$$
where the last equality is obtained by conditioning w.r.t. $\hat X^{x^*}$.
From the previous point, we also have, as $\mathbb{E}\big[X \mid \hat X^{x^*}\big]$ takes at most $M$ different values (it is a measurable function of $\hat X^{x^*}$...),
$$\mathbb{E}\big[|X - \hat X^{x^*}|^2\big] \le \mathbb{E}\Big[\big|X - \mathbb{E}\big[X \mid \hat X^{x^*}\big]\big|^2\Big]$$
This leads to
$$\mathbb{E}\Big[\big|\mathbb{E}\big[X \mid \hat X^{x^*}\big] - \hat X^{x^*}\big|^2\Big] = 0\,.$$
• Two $M$-quantizers ($M = 500$) of $\mathcal{N}(0, I_2)$, one of them being (close to) optimal... [62]
For a discussion of how to obtain an optimal quantization grid, we refer to Section 5.3 in [63].
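One standard way to build a (stationary) grid, in the spirit of the fixed point $\mathbb{E}[X \mid \hat X^x] = \hat X^x$, is Lloyd's iteration; the sample-based sketch below is generic and not necessarily the procedure recommended in [63].

```python
import numpy as np

def lloyd_quantizer(samples, M, n_iter=50, seed=0):
    """Sample-based Lloyd iteration: alternate nearest-neighbour assignment / centroid update."""
    rng = np.random.default_rng(seed)
    x = samples[rng.choice(len(samples), M, replace=False)]        # initial grid (copy)
    for _ in range(n_iter):
        dist = np.linalg.norm(samples[:, None, :] - x[None, :, :], axis=2)
        cell = np.argmin(dist, axis=1)                             # Voronoi cell of each sample
        for i in range(M):
            pts = samples[cell == i]
            if len(pts):                                           # centroid = E[X | X in C_i(x)]
                x[i] = pts.mean(axis=0)
    return x

# Example: a 500-point quantizer of N(0, I_2) built from 10000 samples
grid = lloyd_quantizer(np.random.default_rng(1).normal(size=(10_000, 2)), M=500)
```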
Cubature formulas
$$\mathbb{E}[g(X)] \simeq \mathbb{E}\big[g(\hat X^x)\big] = \sum_{i=1}^M g(x_i)\,\mathbb{P}\big(X\in C_i(x)\big)$$
• If $g$ is Lipschitz,
$$\big|\mathbb{E}[g(X)] - \mathbb{E}\big[g(\hat X^x)\big]\big| \le [g]_{\mathrm{Lip}}\,\mathbb{E}\big[|X - \hat X^x|\big] \le [g]_{\mathrm{Lip}}\,\|X - \hat X^x\|_2$$
• If $Dg$ is Lipschitz,
$$\big|\mathbb{E}[g(X)] - \mathbb{E}\big[g(\hat X^x)\big]\big| \le [Dg]_{\mathrm{Lip}}\,\mathbb{E}\big[|X - \hat X^x|^2\big] \tag{4.8}$$
as $\hat X^x$ is stationary.
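A sketch of the cubature formula above, where the cell weights $\mathbb{P}(X\in C_i(x))$ are estimated from samples of $X$ (an assumption of convenience; with an optimal grid they would typically be provided together with the grid):

```python
import numpy as np

def quantized_expectation(g, grid, samples):
    """Approximate E[g(X)] by sum_i g(x_i) P(X in C_i(x)), weights estimated from samples."""
    dist = np.linalg.norm(samples[:, None, :] - grid[None, :, :], axis=2)
    cell = np.argmin(dist, axis=1)                         # nearest-neighbour projection P_x
    weights = np.bincount(cell, minlength=len(grid)) / len(samples)
    return np.sum(g(grid) * weights)

# Example: E[|X|^2] for X ~ N(0, I_2), exact value 2
rng = np.random.default_rng(2)
samples = rng.normal(size=(20_000, 2))
grid = samples[rng.choice(len(samples), 200, replace=False)]   # crude (non-optimised) grid
approx = quantized_expectation(lambda x: np.sum(x**2, axis=1), grid, samples)
```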
Conditional expectation
• Approximation of $\Theta = \mathbb{E}[F(X) \mid Y]$ by $\hat\Theta = \mathbb{E}\big[F(\hat X) \mid \hat Y\big]$...
$$\|\Theta - \hat\Theta\|_2 \le C\,\big(\|X - \hat X\|_2 + \|Y - \hat Y\|_2\big) \tag{4.9}$$
Proof. See Exercises ??. $\square$
4.5.2 Quantization tree for optimal stopping problem
This technique has been introduced in [6, 5], where a complete error analysis is done, see also Section 11.3.2 in [63].
• We are given a discrete-time Markov chain $(X_k)_{0\le k\le n}$ on a grid $\pi$ (samples of $(X_t)_{0\le t\le T}$ or an associated scheme).
• For each $k$, we consider the (optimal) quantization $\hat X_k$ of $X_k$ on the grid $C_k := (C_k^i)_{1\le i\le M_k}$.
• For any $\varphi$, we replace $\mathbb{E}\big[\varphi(X_{k+1}) \mid X_k\big]$ by $\mathbb{E}\big[\varphi(\hat X_{k+1}) \mid \hat X_k\big]$; denote then by $\pi_k$ the corresponding matrix of transition weights, $\pi_k^{ij} = \mathbb{P}\big(X_{k+1}\in C_{k+1}^j \mid X_k\in C_k^i\big)$.
• 'Online' computational cost: $\sim \sum_{k=0}^{n-1} M_k\,M_{k+1}$.
• Note that the grids and the transition weights $(\pi_k)$ are computed offline, e.g. by a MC method, where $(X^n_k)_{1\le n\le N}$ are MC simulations of the Markov chain $(X_k)$.
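A sketch of the resulting backward induction on the quantization tree, assuming the grids and the transition matrices have already been estimated offline; the argument names, shapes and the single per-step discount factor are assumptions of this sketch.

```python
import numpy as np

def quantization_tree_price(grids, transitions, payoff, discount=1.0):
    """Backward induction Y_k(i) = max(g(x_k^i), discount * sum_j pi_k[i, j] * Y_{k+1}(j)).

    grids       : list of arrays, grids[k] has shape (M_k, d)
    transitions : list of matrices, transitions[k][i, j] ~ P(X_hat_{k+1} in C_{k+1}^j | X_hat_k in C_k^i)
    payoff      : vectorised exercise payoff g, applied to a whole grid at once
    """
    Y = payoff(grids[-1])                                   # terminal value on the last grid
    for k in range(len(grids) - 2, -1, -1):
        cont = discount * transitions[k] @ Y                # quantized continuation value
        Y = np.maximum(payoff(grids[k]), cont)
    return Y
```

Each backward step is a matrix-vector product of size $M_k \times M_{k+1}$, in line with the 'online' cost above.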
4.5.3 Markovian quantization (grid method)
$$\Gamma = \{x\in\delta\,\mathbb{Z}^d \mid |x_j| \le R\,,\ 1\le j\le d\}$$
(with $\delta > 0$ the spatial step of the grid and $R$ its truncation level).
• We use an optimal quantization of the Gaussian increments $(\Delta W_i)$:
$$\widehat{\Delta W}_i := \sqrt{h_i}\;G_M\!\Big(\frac{\Delta W_i}{\sqrt{h_i}}\Big)$$
$G_M$ denotes the projection operator on the optimal quantization grid with $M$ points for the standard Gaussian distribution⁴, and
$$\mathbb{E}\big[|\Delta W_i - \widehat{\Delta W}_i|^p\big]^{\frac1p} \le C_{p,d}\,\sqrt{h}\,M^{-\frac1d}\,. \tag{4.12}$$
⁴ The grids can be downloaded from the website: http://www.quantize.maths-fi.com/.
• We introduce the following discrete/truncated version of the Euler scheme:
$$\begin{cases}
\hat X^\pi_0 = X_0 \\
\hat X^\pi_{i+1} = \Pi_\Gamma\Big[\hat X^\pi_i + h_i\,b(\hat X^\pi_i) + \sigma(\hat X^\pi_i)\,\widehat{\Delta W}_i\Big].
\end{cases} \tag{4.13}$$
Definition 4.3. We denote by $(\hat Y^\pi_i)_{0\le i\le n}$ the solution of the backward scheme satisfying
$$\hat Y^\pi_i = \max\big(\mathbb{E}_{t_i}\big[\hat Y^\pi_{i+1}\big],\ g(\hat X^\pi_i)\big) \tag{4.14}$$
Proposition 4.4. For all $i\in\{0, \ldots, n\}$, there exists a function $u^\pi(t_i, \cdot) : \Gamma \to \mathbb{R}$ such that
$$\hat Y^\pi_i = u^\pi(t_i, \hat X^\pi_i)\,.$$
This function is computed on the grid $\Gamma$ by the following backward induction: for all $i\in\{0, \ldots, n-1\}$ and $x\in\Gamma$,
$$u^\pi(t_i, x) = \max\Big\{\mathbb{E}\Big[u^\pi\Big(t_{i+1},\ \Pi_\Gamma\big(x + h_i\,b(x) + \sqrt{h_i}\,\sigma(x)\,G_M(U)\big)\Big)\Big],\ g(x)\Big\}$$
with $U$ a standard Gaussian random variable.
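A one-dimensional sketch of this backward induction, with a sorted uniform grid playing the role of $\Gamma$ and an $M$-point quantization (points and cell weights) of $\mathcal{N}(0,1)$ supplied as inputs, e.g. the downloadable grids of footnote 4 or the Lloyd sketch above; all function and parameter names are assumptions, and discounting is omitted.

```python
import numpy as np

def grid_method_price(b, sig, g, T, n_steps, x_grid, gauss_grid, gauss_weights):
    """Backward induction of Prop. 4.4 in dimension 1 on a fixed spatial grid Gamma."""
    h = T / n_steps

    def project(y):
        # nearest-point projection Pi_Gamma onto the sorted grid, returned as indices
        idx = np.clip(np.searchsorted(x_grid, y), 1, len(x_grid) - 1)
        left_closer = (y - x_grid[idx - 1]) < (x_grid[idx] - y)
        return np.where(left_closer, idx - 1, idx)

    u = g(x_grid)                                            # u_pi(T, .) = g on Gamma
    for _ in range(n_steps):
        # one Euler step from every grid point, driven by the quantized Gaussian G_M(U)
        y = (x_grid[:, None] + h * b(x_grid)[:, None]
             + np.sqrt(h) * sig(x_grid)[:, None] * gauss_grid[None, :])
        cont = u[project(y)] @ gauss_weights                 # E[u_pi(t_{i+1}, Pi_Gamma(...))]
        u = np.maximum(cont, g(x_grid))
    return u                                                 # u_pi(0, .) on the grid Gamma
```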