You are on page 1of 49

INSTITUT NATIONAL DE RECHERCHE EN INFORMATIQUE ET EN AUTOMATIQUE

Pricing and hedging American options by Monte


Carlo methods using a Malliavin calculus approach
Vlad Bally Lucia Caramellino Antonino Zanette

N 4804
April 2003

ISSN 0249-6399

ISRN INRIA/RR--4804--FR+ENG

THME 4

apport
de recherche

Pricing and hedging American options by Monte Carlo


methods using a Malliavin calculus approach
Vlad Bally

, Lucia Caramellino

, Antonino Zanette

Thme 4  Simulation et optimisation


de systmes complexes
Projet Math
Rapport de recherche n 4804  April 2003  46 pages

Abstract:

Following the pioneering papers of Fourni, Lasry, Lebouchoux, Lions and Touzi (see

[12] and [13]), an important work concerning the applications of the Malliavin calculus in numerical methods for mathematical nance has come after. One is concerned with two problems:
computation of a large number of conditional expectations on one hand and computation of Greeks
(sensitivities) on the other hand. A signicant test of the power of this approach is given by its
application to pricing and hedging American options.
Our paper gives a global and simplied presentation of this topic including the reduction of variance
techniques based on localization and control variables.

A special interest is given to practical

implementation, number of numerical tests are presented and their performances are carefully
discussed.

Key-words:

American options, Malliavin calculus, COnditional expectations, Sensitivities,

Monte Carlo, Reduction of variance, Localization


Acknowledgments. For the kind hospitality at the CERMICS-ENPC and the INRIA, the authors wish to thank
Bernard Lapeyre, Agnes Sulem and all the equipe of the
partially supported this research.

Premia project (http://cermics.enpc.fr/premia), which

The authors are also grateful to Bruno Bouchard and Herv Regnier for useful and interesting discussions.

The work of Antonino Zanette was partially supported by the research project Progetto di Ricerca Modelli
matematici innovativi per lo studio dei rischi nanziari e assicurativi(Regione Autonoma Friuli-Venezia Giulia,
Legge Regionale 3/1998).

Equipe Statistiques et Processus, Universit du Maine, and INRIA Rocquencourt, Project Math, Domaine de
Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex, France
Dipartimento di Matematica, Universit di Roma
, via della Ricerca Scientica, I-00133 Roma,

Tor Vergata

Italy
Dipartimento di Finanza dell'Impresa e dei Mercati Finanziari, Universit di Udine, Via Tomadini 30/A, I-33100
Udine, Italy

Unit de recherche INRIA Rocquencourt


Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France)
Tlphone : +33 1 39 63 55 11 Tlcopie : +33 1 39 63 53 30

Calcul du prix et couverture des options amricaines par


mthodes de Monte Carlo et calcul de Malliavin
Rsum :

A la suite des premiers articles de Fourni, Lasry, Lebouchoux, Lions et Touzi (voir [12]

et [13]), un travail considrable sur les applications du calcul de Malliavin pour les mthodes numriques en nance a t dvelopp. Deux problmes se posent en particulier : le calcul d'esprances
conditionnelles d'une part, les sensitivits et le calcul de la couverture de l'autre. Un test signicatif de la puissance de cette approche est donn par l'application au calcul de prix d'options
amricaines.
Notre papier donne une prsentation globale et simplie, avec l'utilisation de techniques de rduction de la variance, base sur la localisation et l'usage de variables de contrle.

Un intrt

particulier est donn l'implmentation concrte ; plusieurs tests numriques ont t raliss et
discuts.

Mots-cls :

Options amricaines, Calcul de Malliavin, Esprances conditionnelles, Sensitivits,

Monte Carlo, Rduction de variance, Localisation

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach3

1 Introduction
The theory of Markov Processes and the Stochastic Calculus have provided a probabilistic interpretation for the solutions of linear partial dierential equations (shortly, linear PDE's) by means
of the Feynman-Kac formula. One of the most striking applications is the emergence of the Monte
Carlo method as an alternative to deterministic numerical algorithms for solving PDE's.
method is slower than the analytical ones, but as the dimension increases (more than

3),

This

it is well

known that the analytical methods do no more work and so the Monte Carlo method remains the
only alternative.
But one has to notice that both the Feynman-Kac formula and the Monte Carlo method are specic to linear problems and collapse when dealing in the non linear case. In the last decay much
work has been done in order to extend the probabilistic methods in a non linear frame. On one
hand, in a set of papers Pardoux and Peng (see [20], [21]) introduced the Backward Stochastic
Dierential Equations (BSDE's in short), which generalize the probabilistic representation given
by the Feynman-Kac formula to non linear problems. In [14] (see also [1]) this representation was
obtained for obstacle problems as well (by means of reected BSDE's). Applications to mathematical nance have been discussed in [15]. So the theoretical background has been prepared. The
second step is to obtain ecient probabilistic algorithms for solving such problems. Recall that
one specic point in the Monte Carlo method is that it does not employ grids. More precisely, in
order to compute the solution

u(0, x0 )

of the linear PDE

(t + L)u(t, x) = 0,

(t, x) [0, T ] Rd

u(T, x) = f (x),
one represents

u(0, x0 )

as an expectation and employes a sample of the underlying diusion in

order to compute this expectation. On the contrary, an analytical method constructs a time-space
grid and employs a dynamical programing algorithm in order to compute the solution. This means
that in order to compute the solution in one point (0, x0 ), the analytical method is obliged to
(tk , xik ) while the Monte Carlo method provides directly the solution
in (0, x0 ). This is a big advantage because in large dimension grids become dicult to handle.

compute it on a whole grid

Think now that instead of the above linear problem, one has to solve a PDE of the type

(t + L)u(t, x) + f (t, x, u(t, x)) = 0.


Even if a probabilistic representation of

is available (and this is the case), one is obliged to

compute the solution on a whole grid for the simple reason that one has to know the value of

u(t, x)

which comes on in

f (t, x, u(t, x)).

Therefore, the dynamical programing algorithm becomes

essential. At this stage, a probabilistic approach is in the same position as an analytical one: one
has to construct a grid, then to solve at each time step a linear problem and nally to add the
input

f (t, x, u(t, x)). A terminology dierence however:

the probabilist would say that he computes

a conditional expectation instead of saying that he solves a linear PDE. But the problem remains
the same: computing a large number of conditional expectations, and this is not trivial.
The question is now if a probabilistic point of view provides a new approach to this kind of
problems. And the answer is armative. These last years, essentially motivated by mathematical
nance problems - and the most striking one is pricing American options in large dimension several new ideas appeared in this eld. Roughly speaking they may be divided in three families.
In the rst one, a tree is built up in order to obtain a discretization of the underlying diusion on a
grid. This family includes Broadie and Glasserman algorithm (see [10]), the quantization algorithm
(see [2], [3]) as well as Chevance's algorithm (see [11]). The second idea is to use regression on a
2
truncated basis of L in order to compute the conditional expectations. This is done in Longsta

RR n 4804

Bally & Caramellino & Zanette

and Schwartz [18] and in Tsisiklis and Van Roy [22]. Finally in [12], [13], Fourni et al. obtain
a representation of the conditional expectation using Malliavin's calculus and then employ this
representation in order to perform a Monte Carlo method. The specicity of this last approach is
that it appears as a pure Monte Carlo method despite the non linearity.
Except for solving PDE's (which amounts to pricing an option in the nancial frame), a second
problem of interest is to compute the sensitivity of the solution to some parameter (hedging and
Greeks, in nancial language). It seems that Malliavin calculus is an especially promising tool for
solving such a problem. It has been used by Lions and Reigner [17] which follow the third method,
as well as in [4] where quantization algorithm is employed.
The present paper deals with the last method based on Malliavin calculus. Although this approach
works for a large class of non linear problems, we focus on the American option pricing, which
amounts to solve obstacle problems for PDE's. This seems to be a signicant test on the eciency
of the method and several papers related to this subject appeared in the last time. Our aim is
to give a general view on this topic (with simplied proofs), to present concrete examples and to
discuss the numerical results. A special interest is given to the reduction of variance techniques
based on localization and control variables and to their impact in concrete simulation.
It worth to mention that as long as one restricts himself to the Black Scholes model (which of course
is of interest in mathematical nance) the computations related to Malliavin calculus are rather
explicit and elementary.

This permits us to give here an approach which is self contained and

accessible to readers who are not familiar with the rather heavy machinery of Mallavin calculus.
In the case of a general diusion the same reasonings work as well, but of course one has to employ
the Malliavin calculus in all the generality.

And consequently, the explicit formulas which are

available in the Balck Scholes model become heavier (in particular, they involve the inverse of the
Malliavin covariance matrix). The presentation that we give here is based on one more remark:
the results obtained in the one dimensional case extend easily (using an elementary geometrical
argument) to the multidimensional case. This signicantly simplies the proofs.
The paper is organized as follows. In Section 2 we present the problem and in Section 3 we give
the results (representation formulas for conditional expectation and for the strategy) in the one
dimensional case.

This section also contains the Malliavin calculus approach in the elementary

frame which is the ours. In Section 4 we present the multidimensional case and in Section 5 we
discuss optimal localization. In Section 6 we give the algorithm based on this approach and we
discus the numerical tests.

2 Presentation of the problem


An American option with maturity
any time up to

T.

Let

Xs

T,

is an option whose holder can exercise his right of option in

denote the underlying asset price process, here modeled as a diusion

process:

dXt = b(Xt )dt + (Xt )dWt

(1)

X0 = x

b and denote a vector and a matrix eld on Rd and W is a d-dimensional Brownian motion.
(Xs ) denote the cash-ow associated with the option. The price as seen at time t of such an

where
Let

American option is given by


 R

P (t, Xt ) = sup E e t rs ds (X ) Ft
Tt,T

where

Tt,T

stands for the set of all the stopping times taking values on

[t, T ]

and

denotes the

spot rate process, which is here supposed to be deterministic.

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach5

The solution of this optimal stopping problem has been provided by using the theory of the Snell
envelopes: the rst optimal stopping time is given by

t = inf{s [t, T ] ; P (s, Xs ) = (Xs )}.


Moreover, the function

P (t, x),

giving the price of the option, can be characterized as follows. We

dene the stopping region, also called exercise region, as

Es = {(t, x) [0, T ] R+ : P (t, x) = (x)}


and the continuation region as its complement, that is

Ec = {(t, x) [0, T ] R+ : P (t, x) > (x)}.


Roughly speaking, by using the Ito's Lemma, one can see that
dierential equation: setting

as the innitesimal generator of

Lg(x) =
being

a = ,

P (t, x) solves
X , that is

the following partial

d
d
X
1 X
2g
g
aij (x)
(x) +
bi (x)
(x)
2 i,j=1
xi xj
xi
i=1

then

P
(t, x) + LP (t, x) r(t)P (t, x) = 0
t
P (t, x) > (x), with the nal condition P (T, x) = (x). A

whenever
rigorous reasoning turns out
1
to be much more dicult because generally we have not sucient regularity for P (that is, C in t
2
and C in x) on one hand and on the other one, the behavior of the solution P in a neighborhood of
the free boundary

{(t, x) ; P (t, x) = (x)}

is rather delicate to describe (it gives a supplementary

term in the PDE). This leads to a weak formulation of the problem. The rst one was given in
terms of variational inequalities by Bensoussan and Lions [6]; in El Karoui et al. [14], one can nd
a formulation in viscosity sense and in [1], Bally et al. give a formulation in Sobolev sense.
Thus, the problem of the pricing of an American option is a strongly nonlinear problem, and there
is non hope to nd closed formulas. In order to numerically compute the price of an American
option, one can use either a deterministic or a stochastic algorithm.
Concerning the deterministic methods, they can be based on nite elements methods for variational
inequalities. But here, we are interested in a stochastic approach, that is in a Monte Carlo algorithm
for the pricing of American options. It turns out from a dynamic programming principle, but let
us rstly start by formalizing better the framework.

(, F , P) be a ltered probability space where a d-dimensional Browinan motion W is dened


Ft = (Ws : s t). Let Xt denote the underlying asset prices, which evolve according
to (1). The price at time t of an associated American option with maturity T and payo function
: Rd+ R is then



Let

and set

P (t, x) = sup Et,x er(t)(X )

(2)

Tt,T

P (0, x),
0, it is possible to set up a Bellman dynamic programming principle.
Indeed, let 0 = t0 < t1 < . . . < tn = T be a discretization of the time interval [0, T ], with step size
kt ' Xkt .
kt )k=0,1,...,n an approximation of (Xt )t[0,T ] , that is X
equal to t = T /n, and let (X
kt ) can be approximated by means of the quantity Pkt (X
kt ), given by the
The price P (kt, X
where we have supposed that the spot rate is constant. In order to numerically evaluate

that is the price as seen at time

following recurrence equality:

RR n 4804

Bally & Caramellino & Zanette

Theorem 2.1

Given

1, n 2, . . . , 1, 0,

t = T /n (0, 1),

dene

kt ) = (X
nt )
Pnt (X

k = n

and for any





kt .
kt ) = max (X
kt ) , ert E P(k+1)t (X
(k+1)t ) X
Pkt (X

Then

kt ) ' P (kt, Xkt ).


Pkt (X

The above statement is heuristic and a rigorous formulation supposes to precise the hypothesis on
the diusion coecients and on the regularity of the obstacle. This is done by Bally and Pags
[3]. Roughly speaking, the error is of order

if one has few regularity and of order

if more

regularity holds.
As a consequence, one can numerically evaluate the delta

(t, x)

of an American option, that is

the derivative of the price with respect to the initial value of the underlying asset price:

x P (t, x).

(t, x) =

Recall that this is important since it gives the sensibility of the price with respect to

the initial underlying asset price and also for the hedging of the option. By considering the case

t = 0,

as in Theorem 2.1, then the following approximation

Proposition 2.2

For any

t = T /n (0, 1),

0 (x)

of

(0, x)

can be stated.

set

t = { Rd ; Pt () < ()},
where

Pt ()

is dened as in Theorem 2.1, that is





t = .
2t ) X
Pt () = max () , ert E P2t (X
Then, by setting




kt = 1c

(k+1)t ) X
()
= () 1t + ert E P(k+1)t (X
t



0 (x) = Ex (Xt )
where

denotes the gradient, one has

and

0 (x)
(0, x) '

Such an assertion is heuristic and a rigorous statement, including error bound, turns out to be a
more dicult problem. One may nd in Bally et al. [4] such a bound but it is given in a weaker
L2 ([0, T ], dt)).

sense (in

Such results state that in order to numerically compute the price

P (0, x)

and its delta

(0, x),

it is sucient to approximate a family of conditional expectations and of their derivatives, thus


allowing one to set up Monte Carlo simulations.
Existing Monte Carlo methods applied to this context, consist in the numerical evaluation of the
conditional expectations by means of a stratication of the path space for the approximation of the
transition density of the process

Xt ,

as the quantization algorithm by Bally, Pags and Printems

[4], the algorithm by Broadie and Glassermann [10] or the one by Barraquand and Martineau [5].
Another Monte Carlo approach makes use of regression methods to perform the approximation of
the conditional expectation, as made by Longsta and Schwartz [18] or by Tsitsiklis and VanRoy
[22]
Also in order to overcome the problem of the discretization of the path space, another method can
be used. It has been introduced by Lions and Regnier [17] and uses formulas allowing to represent
conditional expectations like

E(F (Xt ) | Xs = )

and its derivative

E(F (Xt ) | Xs = )

written in

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach7

terms of a suitable ratio of non-conditioned expectations, that is



 E(F (X ) )

t
s
E F (Xt ) Xs = =
E(s )


 E(F (X ) 1, )E( ) E(F (X ) )E( 1, )

t
t
s
s
s
s
E F (Xt ) Xs = =
E(s )2
being

and

s1,

(3)

suitable weights, which could also depend on suitable localizing functions. Such

representations can be proved by using Malliavin calculus techniques. The proofs providing the
formulas as in (3) can be found in Section 3 and 4, in the framework of the Black and Scholes
model; the original reference is the paper by Lions and Regnier [17] but here we develop a little
dierent approach, mainly for the multidimensional case, allowing to simplify considerably the
proofs. Moreover, in Section 5 we discuss an optimality criterium for the choice of the localizing
functions, which play an important role for practical purposes, as already studied and observed by
Kohatsu-Higa and Petterson [16] and by Bouchard, Ekeland and Touzi [7].
Let us point out that we could employ the strong Malliavin calculus for our proofs (that is,
Malliavin derivatives, Skorohod integrals etc. - words which do not appear in this paper) but we
have opted for a more friendly approach, in order to make it readable and comprehensible also
to people who are not familiar with the topic.
Representations (3) can be used for the practical purpose of the pricing of American options

1,
can be written explicitly, expectations like
as follows. In fact, since the weights s and s

1,
E(f (Xt ) s ) or E(f (Xt ) s ) can be approximated through the associated empirical means and
used to numerically compute the price

P (0, x)

and its delta

(0, x)

by using Theorem (2.1) and

Proposition 2.2, thus avoiding the problem of the approximation of the transition density and of the
discretization of the path space. This plan gives also the considerable gain to provide a Monte Carlo
algorithm for the evaluation of

P (0, x)

and

(0, x)

which makes use of only one set of simulated

trajectories for the computation of any conditional expectation appearing in Theorem 2.1 and in
Proposition 2.2. Let us remark that, using this approach, the valuation of the delta is not made
through nite dierence approximations but it is performed by means of representation formulas
written in terms of expectations. We postpone to Section 6 (cfr. 6.1 and 6.2) a comprehensive
presentation of such an algorithm. Finally, some numerical results can be found in 6.3.
Finally, here we consider price and delta, then representation formulas for the conditional expectation and its gradient.

It is worth to point out that one could also attack the problem of the

numerical evaluation of other greeks. Indeed, similar techniques could be applied in order to obtain
representation formulas for other derivatives of the conditional expectation: the second derivative
(for the gamma), the derivative w.r.t.
w.r.t. the volatility

the spot rate

(for the theta), as well as the derivative

(for the vega).

3 Representation formulas for the conditional expectation


and its gradient: the one dimensional case
Let

be a single underlying asset price process, driven by the Black and Scholes model, i.e. it

solves the following stochastic dierential equation (sde)

dXt = (r )Xt dt + Xt dWt


X0 = x

RR n 4804

Bally & Caramellino & Zanette

where:

x R+ ; r, R, being r the (constant) spot rate and the dividend


W is a 1-dimensional Brownian motion. Thus,


1 2
Xt = x exp ht + Wt ,
where h = r .
2

of the option;

>0

denotes the volatility;

(4)

The aim is to study the conditional expectation

E((Xt ) | Xs = )
where

0 < s < t, R+

and

Eb (R) = {f M(R) :
where

is a function with polynomial growth, that is belonging to

there exist

M(R) = {f : R R : f

C>0

and

mN

such that

f (y) C(1 + |y|m )}

is measurable}. In such a case one can state the following result

(Lions and Regnier [17], Lemme 2.1.1):

Theorem 3.1 (Representation formulas I: without localization)


i) For any

0 s < t, Eb

where

being

> 0, one has




 T []()

s,t
E (Xt ) Xs = =
Ts,t [1]()

and



H(Xs )
Ts,t [f ]() = E f (Xt )
Ws,t
s(t s)Xs
H() = 10 , R,

(5)

and

Ws,t = (t s)(Ws + s) s(Wt Ws ).


ii) For any

where

0 s < t, Eb and > 0, one has




 R []()T [1]() T []()R [1]()

s,t
s,t
s,t
s,t
E (Xt ) Xs = =
Ts,t [1]()2

Ts,t [f ]

is dened above and


H(Xs ) h (Ws,t )2
t i
+
W
.
Rs,t [f ]() = E f (Xt )

s,t
s(t s)Xs2 s(t s)

Let us point out that, for any xed function

f,

the operator

Ts,t [f ](): Rs,t [f ]() = Ts,t [f ]().

Rs,t [f ]()

In the proof of Theorem 3.1 it is shown the existence of a process


speaking,

where

(6)

is simply the derivative of

s,t L2 ()

such that, roughly



E (Xt )0 (Xs ) = E((Xt )H(Xs )s,t )
stands for the Dirac measure in

associated with

0 ).

(notice that

is actually the distribution function

Obviously, the Dirac mass is something very irregular". In some sense, the

Malliavin integration by parts formula allows to regularize it, thus to overcome the problem of
handling a Dirac mass, but it is worth to point out that this procedure provides an high variance
because of the presence of the Heaviside function

H.

Thus, it is very useful to give a localization

method for the computation of conditional expectations, as made in the following Theorem 3.3
(Lions and Regnier [17], Lemme 2.1.1) by introducing the localizing function

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach9

Lemma 3.2
Ts,t

true with

R
: R [0, +) be such that R ()d = 1. Then formulas (5)

and Rs,t replaced by Ts,t = Ts,t and Rs,t = Rs,t respectively, where

Let

and (6) hold



H(Xs ) (Xs )
T
[f
]()
=
E(f
(X
)(X

))
+
E
f
(X
)
W
t
s
t
s,t
s,t
s(t s)Xs
and

denotes the


E f (Xt )(Xs )

Ws,t 
+
s(t s)Xs

H(Xs ) (Xs ) h (Ws,t )2
t i
+
W
.

E f (Xt )
s,t
s(t s)Xs2
s(t s)

Ry
probability distribution function associated with : (y) = ()d .

R
s,t [f ]() =

where

(7)

(8)

By using the localized version for the operators, we can immediately set up localized representation
formulas:

Theorem 3.3 (Representation formulas II:R with localization)


>0

and for any

: R [0, +),

such that

i ()d = 1,

For any

0 s < t, F E b ,

one has



 T []()

s,t
E (Xt ) Xs = =
Ts,t [1]()
and



 R []()T [1]() T []()R [1]()

s,t
s,t
s,t
s,t
E (Xt ) Xs = =
,

2
Ts,t [1]()

where the operators

T
s,t [f ]()

and

R
s,t [f ]()

are dened in (7) and (8) respectively.

We postpone to Section 5 a discussion on the choice of the localizing function

We give now another localization result due to Kohatsu-Higa and Pettersson [16] in the onedimensionale case and to Bouchard, Ekeland and Touzi [7] in the multidimensional setting.

Theorem 3.4 (Representation formulas III)


For any

Cb1 (R)

such that

()
=1

Let

0 s t, E b

and let

>0

be xed.

then



 J []()

s,t
E (Xt ) Xs = =
J
s,t [1]()
where

J
s,t [f ]() =


i
1
H(Xs ) h
E f (Xt )
(Xs )Ws,t + 0 (Xs ) Xs s(t s)
s(t s)
Xs

We shall give here an elementary approach to the computation of conditional expectations allowing
to prove Theorems 3.1, 3.3 and 3.4. We start next section by proving some results on Malliavin
Calculus allowing to prove the above Theorems, whose proofs can be found in the section after the
next one.

RR n 4804

Bally & Caramellino & Zanette

10

3.1

Some preliminary results in Malliavin Calculus

The representation formulas stated in the previous section are based on Malliavin Calculus techniques.

In the frame of the Black and Scholes model (on which we focus in this paper), these

techniques reduce to elementary reasonings based on (standard) integration by parts and so are
available to readers who are not familiar with this theory. Therefore, we present in this section
the elementary counterpart of Malliavin's integration by parts formula and the way in which one
employs this formula for computing conditional expectations. In the case of a general diusion,
the strategy is the same and people familiar with this calculus may easily adapt the arguments.
Let us start by a simple denition.

Denition 3.5

Let

and

G denote square integrable random variables.

We say that the

(Integration by Parts) property holds if there exists a square integrable weight

E(0 (F ) G) = E((F ) F (G)),


(here, as usual,

Cb (R)

for any

F (G)

IP (F, G)

such that

Cb (R)

(9)

stands for the set of bounded and innitely dierentiable paths).

One has

Lemma 3.6

(i) Suppose that

probability density

IP (F, 1)

holds.

Then the law of the random variable

has a

given by

p() = E(H(F ) F (1)).


(ii) Suppose that both

IP (F, 1)

and

IP (F, G)

hold. Then

E(G | F = ) =

E(H(F ) F (G))
,
E(H(F ) F (1))

with the convention that the above quantity is equal to 0 whenever

E(H(F ) F (1)) = 0.

Proof.

(i)

We employ regularization functions constructed as follows. Take a smooth, symmetric, non-

negative function

with support contained in [1, 1] and such that


(t) dt = 1. Then consider
R
Rx
0
and (x) =
(t) dt. Note that = and also that lim0 (x) =

(x) = (x/)/
e
1
[0,+) (x) = 1(0,+) (x) + 0 (x)/2. Take now G as a random variable, independent of F , whose
law is (x) dx. Clearly lim0 E(f (F G )) = E(f (F )), for any continuous function f . Then we
can write

Z Z

E(f (F G )) =
f (u v) (v) dv dP F 1 (u)
R R
Z
Z
Z
f (z) (u z) dP F 1 (u)dz =
f (z)E( (F z)) dz.
=
R

Since the

IP (F, 1)

property holds, we have

E( (F z)) = E(0 (F z)) = E( (F z) F (1)),


so that by using Lebesgue's Theorem we obtain

Z
lim E(f (F G )) =

e
f (z)E(1
[0,+) (F z) F (1)) dz.

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach11

Thus, for any continuous function

we have

Z
E(f (F )) =
which allows to state that the law of

e
f (z)E(1
[0,+) (F z) F (1)) dz,

e
p(z) = E(1
[0,+) (F
e
replace 1[0,+) (F z) with

has a probability density given by

z) F (1)) = E(1[z,+) (F ) F (1)) = E(H(F ) F (1)) (we can


1[0,+) (F z) because we already know that the law of F is absolutely
(ii)

Let us denote

() =

E(H(F ) F (G))
.
E(H(F ) F (1))

We only have to check that for any bounded and continuous function

E(f (F ) (F )). By
IP (F, G) property

continuous).

it holds that

E(f (F ) G) =

using again the regularization function introduced above, the validity of the
and what proved in part

(i)

allow to state that

Z


E(f (F ) G) = E G lim
f (z) (F z) dz = lim
f (z) E(G (F z)) dz
0 R
0 R
Z
Z
f (z) E(G 0 (F z)) dz = lim
f (z) E( (F z) F (G)) dz
= lim
0 R
0 R
Z
Z
f (z) E(1[0,+) (F z) F (G)) dz =
f (z) (z) p(z) dz = E(f (F ) (F )),
=
R

that is the statement holds.

2
We give now a localized version of the result presented above, which will be used in order to prove
Theorem 3.4.

Denition 3.7

Let

G denote square integrable random variables


IPD (F, G) (Integration by Parts localized on D)
D
weight F (G) such that

and

open set. We say that the


exists a square integrable

E(0 (F ) G) = E((F ) FD (G)),

for any

Cb (R)

F has a local density p on D if


Z
E(f (F )) = f (x) p(x) dx,
for any f Cb (R)

such that

and let

D R

be an

property holds if there

supp(0 ) D.

(10)

Moreover, we say that

Finally, we say that

E(G | F = ) = ()

E(f (F ) G) = E((F ) G)

for

for any

such that

supp(f ) D.

if

f Cb (R)

such that

supp(f ) D.

With these localized denitions, we obtain the exact analogous of Lemma 3.6, whose proof is
dropped since it is actually identical to the proof of Lemma 3.6:

Lemma 3.8

(i) Suppose that

local probability density

IPD (F, 1) holds.


D given by

Then the law of the random variable

on

p() = E(H(F ) FD (1)).

RR n 4804

has a

Bally & Caramellino & Zanette

12

(ii) Suppose that both

IPD (F, 1)

and

IPD (F, G)

E(G | F = ) =

hold. Then

E(H(F ) FD (G))
,
E(H(F ) FD (1))

with the convention that the above quantity is equal to 0 whenever


Let us prove now that the following lemma, allowing to state the

IP

E(H(F ) FD (1)) = 0.
and

IPD

properties in a

suitable gaussian case:

Lemma 3.9

Let

(a) For every

N(0, ).

f, g C 1 ,

one has





E(f 0 () g()) = E f () g() g 0 () .

X = xe+ ,

(b) Setting

one has



 g(X) 

+ g 0 (X) ,
E(f 0 (X) g(X)) = E f (X)
X

with

f, g C 1 .

(a)

Proof.

By using the standard integration by parts formula, one has

E(f () g()) =

Z

 1
2
1
y
y 2 /(2)

e
ey /(2) dy
f (y) g(y)
dy = f (y) g(y) g 0 (y)

2
2




0
= E f () g() g () .

(b), notice that X is a function of . Thus, we


g(X) = g1 (), one has f10 () = f 0 (X) X = f 0 (X) X
Therefore, by (a),
Concerning

can use (a): setting f (X) =


0
0
and g1 () = g (X) X =

f1 () and
g 0 (X) X .



 g ()
g1 () 
g1 () 
1
= E f1 ()

E(f 0 (X) g(X)) = E f10 ()
X
X
X



 g(X) 
0
+ g (X) .
= E f (X)
X

2
We can then state the following

Proposition 3.10
(i) The

Let

IP (Xs , g(Xt ))

evolve as in (4), let

s<t

and

g : RR

with polynomial growth.

property holds, that is

E(0 (Xs ) g(Xt )) = E((Xs ) s [g](Xt ))


for any

Cb (R),

with

s [g](Xt ) = g(Xt )
being

1
Ws,t ,
s(t s) Xs

Ws,t = tWs sWt + s(t s).

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach13

(ii) For a xed


Let

> 0,

Cb1 (R)

Cb1 (R) such that supp(0 ) B () = ( , + ),



that
= 1. Then,

let

such

with

> 0.

B ()

E(0 (Xs ) g(Xt )) = E((Xs ) s [g](Xt )),


with

s [g](Xt ) =

g(Xt )
Ws,t ,
s(t s) Xs

being



s ) tWs sWt + s(t s) + 0 (Xs ) Xs s(t s)
Ws,t = (X
s )Ws,t + 0 (Xs ) Xs s(t s).
= (X
In particular, the

IPD (Xs , g(Xt ))


W

s,t
IP (Xs , g(Xt ) s(ts)X
)
s

(iii) The

property holds, with

Cb (R),

Ws,t 
s [g](Xt ))
= E((Xs )
s(t s)Xs

We shall prove parts

(ii)

and

 (W )2
1
t
s,t
+
W
.

s,t
s(t s) Xs2 s(t s)

(iii):

the statement in

(almost identical) to the ones used for proving


Proof of

(ii).

(i)

follows by using arguments similar

(ii).

The key point of the localization formula, that is the formula in

Xt = Xs Y ,

(ii), is that 0 = 0 ,





s ) g(Xt ) .
E 0 (Xs ) g(Xt ) = E 0 (Xs ) (X

so that

Since

with

s [g](Xt ) = g(Xt )
Proof.

for any xed

property holds, that is


E 0 (Xs ) g(Xt )
for any

D = B (),

Y = eh(ts)+(Wt Ws )

independent of Xs , we can


s ) g(Xt )) = E E(0 (Xs ) (X
s ) g(y Xs ))
E(0 (Xs ) (X

with

write


.

y=Y

Now,

Xs = x ehs+Ws ,

so that by using

(b)

of Lemma 3.9, one has

s ) g(y Xs ))
E(0 (Xs ) (X

 (X



s ) g(y Xs ) Ws
s ) y g 0 (y Xs )
+ 0 (Xs ) g(y Xs ) (X
= E (Xs )
Xs
s

 (X




s ) g(y Xs )  Ws
s ) g 0 (y Xs ) .
+ 0 (Xs ) g(y Xs ) E y (Xs ) (X
= E (Xs )
Xs
s
By using again the independence between

Xs

and

Y,

one has

s ) g(Xt ))
E(0 (Xs ) (X

 (X




s ) g(Xt )  Ws
s ) g 0 (Xt ) .
+ 0 (Xs ) g(Xt ) E Y (Xs ) (X
= E (Xs )
Xs
s

RR n 4804

Bally & Caramellino & Zanette

14

Let us consider the second addendum of the r.h.s.: we will use the integration by parts formula
based on

in order to remove the derivative of

write

g.

By conditioning with respect to

Xs ,

one can



 


s ) g 0 (Xt ) = E E (z) (z)
Y g 0 (z Y )
E Y (Xs ) (X
z=Xs





0

.
= E (z) (z)E
Y g (z Y )
z=Xs

By using

(b)

of Lemma 3.9, it holds



 g(z Y ) W W 


 g(z Y )  Y  W W
t
s
t
s
+ 1 =E
E g 0 (z Y ) Y = E
z
Y
ts
z
ts
and by inserting this in the above equality one obtains





s ) g 0 (Xt ) = E (Xs ) (X
s ) g(Xt ) Wt Ws .
E Y (Xs ) (X
Xs
ts
We can nally write

s ) g(Xt ))
E(0 (Xs ) (X


 (X



s ) g(Xt )  Ws
s ) g(Xt ) Wt Ws
+ 0 (Xs ) g(Xt ) E (Xs ) (X
= E (Xs )
Xs
s
Xs
ts
and straightforward computations allow to conclude. Let us nally observe that to achieve this
1
representation one has implicitly assumed that g is regular (C ), which is not true in general. But
this is not really a problem: one can regularize

with some suitable mollier and by using density

arguments, the statement follows.


Proof of

(iii).

We write once again

v(Xs , Xt ) =
Then by using

(b)

Xt = y Xs , g(Xt ) = g(y Xs )

and

Ws,t = v(Xs , Xt ),

where

 s

t s
ln Xs ln x hs + 2 s
ln Xt ln Xs h(t s) .

of Lemma 3.9, we obtain


h g(X ) v(X , X )


t
s
t
Ws,t
E 0 (Xs )g(Xt ) s(ts)X

)
(Ws + s)
=
E
(X
s
s
2 s2 (t s)Xs2
vx (Xs , Xt ) + Y vy (Xs , Xt ) g(Xt ) v(Xs , Xt ) i
v(Xs ,Xt )

g(X
)
+
Y g 0 (Xt ) s(ts)X
t
s
s(t s)Xs
s(t s)Xs2


v(Xs ,Xt )
0
0
Consider now the term containing g , that is E (Xs )Y g (Xt )
s(ts)Xs : by conditioning rst
on Xs , one has to evaluate:

(x )
v(x, xY ) 
=
E(Y g 0 (xY ) v(x, xY )).
E (x )Y g 0 (xY )
s(t s)x
s(t s)x
By recalling that

Y = exp(h(t s) + (Wt Ws ))

and by using again part

(b)

of Lemma 3.9, one

has

E(Y g 0 (xY ) v(x, xY )) =


=


 g(xY ) h Y v(x, xY )  W W
i
t
s
+ v(x, xY ) Y vy (x, xY ) x
E
x
Y
ts

 g(xY )
v(x, xY ) (Wt Ws ) Y g(xY ) vy (x, xY )
E
(t s)x

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach15

so that


v(Xs , Xt ) 
E (Xs )Y g 0 (Xt )
s(t s)Xs
i

g(Xt ) h v(Xs , Xt )
(Wt Ws ) Y vy (Xs , Xt ) .
= E (Xs )
s(t s)Xs (t s)Xs

By inserting such a quantity in the expression above, straightforward computations allow to obtain




Ws,t
E 0 (Xs )g(Xt ) s(ts)X
=
E
(Xs )
s

g(Xt )
[v(Xs , Xt )(t s)(Ws + s)
s)Xs2

v(Xs , Xt )s(Wt Ws ) + v(Xs , Xt )s(t s) vx (Xs , Xt )s(t s)Xs ] .

Now, since

vx (Xs , Xt ) =

2 s2 (t

t
Xs , one can rewrite the above expression as


E 0 (Xs )g(Xt )


Ws,t 
(Xs ) h (Ws,t )2
t i
+ Ws,t
.
= E g(Xt )
2
s(t s)Xs
s(t s)Xs s(t s)

Remark 3.11

The key point in the above lemma was that we used two integration by parts forXs and the second with respect to Y = Xt /Xs = eh(ts)+(Wt Ws ) .
From a numerical point of view, it is the integration by parts with respect to Y which is dangermulas: the rst with respect to

ous. In fact, the noise contained in Y is Wt Ws , so, if t s is small, we have few noise.

As a consequence, the weight which appears in the integration by parts contains 1/ t s, which
blows up if

t = tk ,

t s is small. Think now to our dynamical programming principle, where s = tk1 and
t s = tk tk1 . If one wants to take a very rich time-grid (which is necessary in

so that

order to get a good approximation of an American option with a Bermudean one), then one has to
face the problem of

1/ tk tk1 .

This represents the specic diculty of this type of algorithm:

tk tk1 , and the


variance of the weights which appear in the integration by parts. In dimension d = 1, this is of
1/2
but in dimension d, we will have (tk tk1 )d/2 , and this is the way in
order (tk tk1 )
which the dimension comes on in this algorithm. The localization technique allows to reduce this
one has to handle the balance between the analytical error, which is of order

variance, and this is crucial.

3.2

Proofs

Proof of Theorem 3.1.

Concerning part

ii),

Part

i) is

is an immediate consequence of Lemma 3.6 and Proposition 3.10.

it suces to prove that


H(Xs ) h (Ws,t )2
t i
+
W
,
Ts,t [f ]() = Rs,t [f ]() = E f (Xt )

s,t
s(t s)Xs2 s(t s)

for any

f Eb (R).

First, let us take

0, to the Dirac mass in 0.


as 0, to H . Let us set

as a

probability density function weakly convergent, as

This means that the associated distribution function

converges,



H (Xs )
Ts,t [f ]() = E f (Xt )
Ws,t .
s(t s)Xs
Obviously,

Ts,t [f ]() Ts,t [f ]()

as

0,

but also, by using (almost) standard density argu-

ments, one can easily show that

lim Ts,t [f ]() = Ts,t [f ]().

RR n 4804

Bally & Caramellino & Zanette

16

So, let us work with

Ts,t [f ]().

One has


Ts,t [f ]() = E f (Xt ) h (Xs )
By using part

(iii)

Ws,t 
.
s(t s)Xs

of Proposition 3.10, one has


H (Xs ) h (Ws,t )2
ti
+
W
Ts,t [f ]() = E f (Xt )

s,t
s(t s)Xs2 s(t s)

(recall that

h = H0 )

and by taking the limit as

0,

the statement follows.

2
Proof of Lemma 3.2.

Let us denote for simplicity

F = Xs ,

G = f (Xt ),

so that we can write shortly


such that

R (t) dt = 1

F (G) = f (Xt )

1
Ws,t ,
s(t s) Xs

Ts,t [f ]() =
R xE(H(F ) F (G)), where H(x) = 1x0 .
(x) = (t) dt. Since the IP (F, G) holds (recall

and set

of Proposition 3.10), one has

(11)

Take

0
(i)

(11) and

E(H(F ) F (G)) = E((F ) G) E((F ) G) + E(H(F ) F (G))


= E((F ) G) + E(H(F ) F (G)) E(0 (F ) G)
= E((F ) G) + E(H(F ) F (G)) E((F ) F (G))
= E((F ) G) + E((H(F ) (F )) F (G)).
Inserting now the exact values for

F, G

and

F (G)

as in (11), one obtains





(H )(Xs )
Ts,t [f ]() = E f (Xt ) (Xs ) + E f (Xt )
Ws,t = T
s,t [f ]().
s(t s) Xs
Concerning

Rs,t ,

we set

 (W )2
1
1
t
s,t
= f (Xt )
+
W
,
Ws,t , F (G)

s,t
s(t s) Xs
s(t s) Xs2 s(t s)

(12)
R
. Take 0 such that
so thatRs,t [f ]() = E(H(F ) F (G))
(t)
dt
=
1
and set (x) =
R
Rx

(t) dt. By using the IP (F, G) property proved in (iii) of Proposition 3.10, one has
= f (Xt )
F = Xs , G

= E((F ) G)
+ E(H(F ) F (G))
E(0 (F ) G)

E(H(F ) F (G))

= E((F ) G) + E(H(F ) F (G)) E((F ) F (G))

+ E((H )(F ) F (G)).


= E((F ) G)
The conclusion follows by inserting the exact values for

F, G

and

F (G)

as in (12).

Proof of Theorem 3.4.

stronger condition

We recall that

B ()

= 1,

for some

is xed and

> 0.

()
= 1.

Then, by using

In a rst step, let us assume the

(ii)

of Proposition 3.10, we have

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach17

that the

IPB () (F, G)

G = (Xt ),



E H(Xs ) X
((Xs ))
s

 ,
E((Xt ) | Xs = ) =
for

E H(Xs ) X
(1)
s

obtain

and in particular for

property holds, with

F = Xs

and

and by using Lemma 3.8 we

any

B (),

(13)

= .

()
= 1. Then we may construct a sequence {n }n such that
c

B1/n (), n = on B2/n () and |n | 1 for any n. We have n () ()


for

0
0

n () () for any 6= . It then follows that Xs ((Xs )) Xs ((Xs )) and

Assume now that we just have

n = 1

on

any and
n

X
(1) X
(1) as
s
s

a.s. By using Lebesgue's dominated convergence theorem, one can

pass to the limit in the expectations in (13), written with

for such a

n ,

and to obtain the validity of (13)

4 Representation formulas for the conditional expectation


and its gradient: the multidimensional case
Let

be the underlying asset price process, driven by the Black and Scholes model, i.e. it solves

the following stochastic dierential equation (sde)

dXt = (
r )Xt dt + Xt dWt
X0 = x
where:

x Rd+ ;
r, Rd ,

with

ri = r

for any

i = 1, . . . , d,

being

the (constant) spot rate, and with

the

vector of the dividends of the option;

denotes the

is a

dd

volatility matrix which we suppose to be non-degenerate;

d-dimensional

correlated Brownian motion.

is a sub-triangular matrix, that is ij = 0


d-dimensional Brownian motion. Thus, any component

Without loss of generality, one can suppose that


whenever
of

Xt

i < j,

and that

is a standard

can be written as

i


X
Xti = xi exp hi t +
ij Wtj ,

i = 1, . . . , d

j=1
where from now on we set

1X 2
,
2 j=1 ij
i

hi = ri i

i = 1, . . . , d.

The aim is to study the conditional expectation

E((Xt ) | Xs = )

RR n 4804

(14)

Bally & Caramellino & Zanette

18

0 < s < t, Rd+ and Eb (Rd ) the class of the measurable functions with polynomial
m
growth, i.e |(y)| C(1 + |y| ), in order to derive representation formulas in dimension d using
the ones in dimension d = 1.
e with independent components
In few words, to this goal it suces to consider an auxiliary process X
where

for which a formula for the conditional expectation immediately follows as a product. In a second
step, such a formula can be adapted to the original process

giving

from the auxiliary process

e.
X

by means of an (inversible) function

We will discuss later the connections with what has been

developed in the paper of Lions and Regnier [17].


To our purposes, let

`t = (`1t , . . . , `dt )

be a xed

C1

function and let us set



e i = xi exp hi t + `i + ii W i ,
X
t
t
t

i = 1, . . . , d

(15)

which obviously satises the sde

et = (
et dt +
et dWt
dX
r + `0t )X
X
e0 = x
X
being

the diagonal matrix whose entries are given by

a transformation allowing to handle the new process

Lemma 4.1

For any

t0

there exists a function

Let

t, `, x

ii , i = 1, . . . , d.

As a rst result, we study

in place of the original process

Ft () : Rd+ Rd+

et )
Xt = Ft (X
Proof.

e
X

(16)

such that

Ft

X:

is inversible and

et = Ft1 (Xt ).
X

be xed. From (15) we have

Wti =


ei
1  X
ln it hi t `it .
ii
x

Inserting this in (14), one obtains

i
i  e j  /

Y
X
Xt ij jj
ij
Xti = xi exp hi t
(hj t + `jt )

xj
j=1 jj
j=1
Thus, by setting

eij =
and by using the notation

ij
,
jj

ln = (ln 1 , . . . , ln d ),

i, j = 1, . . . , d.
for

= (1 , . . . , d ) Rd , then Ft = (Ft1 , . . . , Ftd )

satises

ln Ft (y) = e
`t +
e ln y + (I
e)(ln x + ht)
and, by setting

b=
e

, its inverse function is given by

ln Ft1 (z) = `t +
b ln z + (I
b )(ln x + ht).
2
Let us summarize for later use the previous result. We set

eij =

ij
, i, j = 1, . . . , d,
jj

and

b=
e1

(17)

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach19

b is easy to

bii = 1 for any i.

It is worth noticing that

compute because

itself triangular and

Thus, the function

et )
Xt = Ft (X

is a triangular matrix. Moreover,


b is
1
and its inverse Gt = Ft
such that

Ft

et = Gt (Xt )
X

and

are given by

Fti (y) = e

Pi

j=1

eij `jt

yi

i1
Y
j=1

and
i

Git (z) = e`t zi

i1
Y

z

j=1
By using the process

e,
X

xj

yj hj t eij
e
,
xj

ehj t

bij

i = 1, . . . , d, y Rd+

(18)

i = 1, . . . , d, z Rd+ .

(19)

mainly the fact that its components are independent, we can easily obtain

a rst formula for the conditional expectation starting from the one-dimensional one.

Theorem 4.2 (Representation formulas I: without localization)


i) Let

0s<t

Eb (Rd )

be xed. For any function

and

Rd+ ,

one has


 T []()
s,t
E (Xt ) | Xs = =
Ts,t [1]()
where

being

d


Y
ei
H(X
ei )
s
i
Ts,t [f ]() = E f (Xt )
Ws,t
ei
i=1 ii s(t s)Xs

es = Gs (Xs )
X

and

e = Gs () [Gs

being dened in (19)],

i
Ws,t
= (t s)(Wsi + ii s) s(Wti Wsi ),

ii) Let

0s<t

be xed. For any function

Eb (Rd ),

H() = 10 , R,

(20)

and

i = 1, . . . , d.

one has, for

j = 1, . . . , d,

j

 X
Rs,t;k []()Ts,t [1]() Ts,t []()Rs,t;k [1]()

ek
j E (Xt ) | Xs = =

bkj

,
j
Ts,t [1]()2
k=1

where

Ts,t [f ]()

is dened above and, as

Rs,t;k [f ]() =

Remark 4.3

k = 1, . . . , d,


E f (Xt )

k 2
esk
)
H(X
t i
ek ) h (Ws,t
k
+ Ws,t

esk )2 kk s(t s)
kk
kk s(t s)(X
d

Y
esi
H(X
ei )
i
Ws,t
.

ei
i=1,i6=k ii s(t s)Xs

If no correlation is assumed among the assets, that is if the volatility matrix

is

Iddd . Thus, the sum appearing for the evaluation of j E(F (Xt ) | Xs = )
j
reduces to the single term with k = j , with coecient
ej /j = e`s , which in turn is equal to 1
whenever ` = 0.

diagonal, then

RR n 4804

b =

(21)

Bally & Caramellino & Zanette

20

Proof of Theorem 4.2.

Since

et )
Xt = Ft (X

(recall that

i)

Let us set

e
e t (y) (y)
= Ft (y), y Rd+ ,

being

Ft

dened in (18).

t, one obviously has









e X
et ) X
es = Gs () ,
E (Xt ) Xs = = E (

for any

Gs = Fs1 ).

Thus, setting

es
e = Gs (),

it is sucient to prove that



 T
e s,t [](e
e )
es =
e X
et ) X
E (
e =
e s,t [1](e
T
)
where

e s,t [f ](e
T
) =
Now, let us rstly suppose that
of

d functions

(22)

d


Y
ei
1
H(X
ei )
s
i
et )
E f (X
Ws,t
ei
s(t s)
X
s
i=1 ii
i=1
d
Y

e
e 1 (y1 )
e can be separated in the product
e d (yd ), that is
(y)
=
Eb (R). In such a case,

each one depending only on a single variable and belonging to

one obviously has

d



 Y


e
esi =
e
e
e i (X
eti ) X
E (Xt ) Xs =
e =
E
ei .
i=1

ei
X
t

e i is
Now, it is easy to see that for each
Theorem 3.1 can be applied (indeed, here the drift of X
t
i ei
i
of the form t Xt while the process handled in the one-dimensional case has t = const., but this
is not a diculty). Thus, recalling that

e
X

has independent components, by Theorem 3.1 one has

d
d



 Y

 Y
e i ](e
Tis,t [
i )
es =
ei =
e X
et ) X
e i (X
e i ) X
E (
e =
E
e
=
i
t
s
i
T
[1](e

i)
s,t
i=1
i=1
being

Tis,t [g](e
i ) =



e i ei )
1
i
eti ) H(Xs
E g(X
Ws,t
.
esi
ii s(t s)
X

By using again the independence for the components of

e s,t [](e
e ) =
T

d
Y

e i ](e
Tis,t [
i )

and

e,
X

it easily follows that

e s,t [1](e
T
) =

i=1

d
Y

Tis,t [1](e
i ),

i=1

e
e 1 (y1 )
e d (yd ). In the general case, the statement holds by
(y)
=
d
e Eb (R ) there exists a sequence of functions {
e n }n Eb (Rd )
using a density argument: for any
e n (X
e n is a linear combination of functions which
et ) (
e X
et ) in L2 and such that each
such that
e n , it nally holds for
e
separate the variables as above. Since representation (22) holds for any
so that (22) holds when

as well, as it immediately follows by passing to the limit.

ii)

bkj
ek /j = j Gks () = j
ek . Thus, by considering Ts,t [f ]()
e

e, that is Ts,t [f ]() = Ts,t [f ](e


), then we only have to show that

First, notice that, by (19),

a function (as it is!) of

as

d


Y
esi
H(X
ei )
i
e s,t [f ](e
et )
Rs,t;k [f ]() = ek T
) = ek E fe(X
Ws,t
,
ei
i=1 ii s(t s)Xs

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach21

where we have set


the

k th

et ). By conditioning w.r.t. all the coordinates


et ) = f Ft (X
fe(X
e has independent components, one has
X

of

et
X

except for

one, and by recalling that

d
 


Y
e k ek )
ei
H(X
ei )
s
k
i
e s,t [f ](e
e k ) H(Xs
T
) = E E fek (X
W

Ws,t
,

t
s,t
j
e ,j6=k
ek
ei
x
ej =X
kk s(t s)X
t
s
i=1,i6=k ii s(t s)Xs
being

etk ) = fe(e
etk , x
x1 , . . . , x
ek1 , X
ek+1 , . . . x
ed ). Thus,
fek (X



k
k )
k
e s,t [f ](e
etk ) H(Xes e
ek T
) = E ek E fek (X
W

s,t
k
e
s(ts)X
kk

d
Y

i=1,i6=k

e j ,j6=k
x
ej =X
t


ei
H(X
ei )
s
i
Ws,t
.
ei
ii s(t s)X
s

Now, by using the one dimensional results (see Theorem 3.1 and (6) in particular), one has

k



e k ek )
e k ek ) h Ws,t
t i
k
k
e k ) H(Xs
e k ) H(Xs
+Ws,t
ek E fek (X
Ws,t

= E fek (X
t
t
ek
e k )2 kk s(t s)

kk s(t s)X
kk s(t s)(X
s
s
By rearranging everything, the formula holds.

2
Let us give some commentary remarks about the signicance of the drift-function

Qit (z)

=e

`it

Git (z)

= zi

i1
Y
j=1

`t .

Setting

zj hj t bij
e
,
xj

then

H(Gis (Xs ) Gis ()) = 1Gis (Xs )Gis () = 1Qis (Xs )Qis () = H(Qis (Xs ) Qis ()),
so that we obtain

Ts,t [f ]() = e

Pd
i=1

`is

d


Y
1
H(Qis (Xs ) Qis ())
i
E f (Xt )
Ws,t
.
i
s(t s)
Qs (Xs )
i=1 ii
i=1
d
Y

(23)

i
Now, by recalling that Qt (z) does no more depend on `, the only way in which `t comes out in our
Pd
i
formula is
i=1 `t , so that we have just one degree of freedom. The problem is how to choose `,
let us now discuss about this.
The main application of the formula is addressed to Monte Carlo simulations, therefore, as usual,
one way is to optimize (here, minimize) the variance or the integrated variance of the Monte Carlo

`.

estimator with respect to

m() = E((Xt ) | Xs = )

More precisely, by using Theorem 4.2, the conditional expectation

is numerically evaluated in a Monte Carlo setting by means of

mN () =
where

TN
s,t [f ]() =

d
Y

N
d
Y
esi,(k)
1 X
1
H(X
ei )
(k)
i,(k)

f (Xt )
Ws,t
i,(k)

s(t

s)
N
e
Xs
i=1 ii
i=1
k=1

RR n 4804

TN
s,t []()
TN
s,t [1]()

Bally & Caramellino & Zanette

22

or equivalently

TN
s,t [f ]() =

d
Y

N
d
(k)
Y
1 X
1
H(Gis (Xs ) Gis ())
(k)
i,(k)

Ws,t ,
f (Xt )
i (X (k) )

s(t

s)
N
ii
G
s
i=1
i=1
s
k=1

where

(W (k) , X (k) ),

as

k = 1, . . . , N

stand for independent copies of

(W, X).

By using (23), we

can also write

TN
s,t [f ]() = e

Pd
i=1

`is

d
Y

N
d
(k)
Y
1 X
1
H(Qis (Xs ) Qis ())
(k)
i,(k)
Ws,t ,

f (Xt )
(k)
i

s(t

s)
N
Qs (Xs )
i=1 ii
i=1
k=1

TN
s,t [f ]() is the empirical mean, up to a constant

that is, in our context the Monte Carlo estimator

term, of independent copies of the random variable

`s
Hs,t
[f ]() = e

Pd
i=1

d
Y

d
Y
1
H(Qis (Xs ) Qis ())
i
f (Xt )
Ws,t
.
i (X )

s(t

s)
Q
ii
s
s
i=1
i=1

`is

Thus, one could ask for `s to be set in order to minimize the variance of

`s
Hs,t
[f ](), or its integrated

variance, that is



`s
v[f ](`s ) = Var Hs,t
[f ]()
By recalling that the

Qis 's

Z
v[f ](`s ) = Var

or

do not depend on

Rd

`,

we can write

v[f ](`s ) = e2
where

v[f ]


`s
Hs,t
[f ](d) .

Pd
i=1

`i (s)

v[f ]

`s . Thus, the percentage ratios




R

`s
`s
E Hs,t
[f ]()
E Rd Hs,t
[f ]()(d)
and

1/2
R
1/2
`s
`s
Var Hs,t
[f ]()
Var Rd Hs,t
[f ]()(d)

is independent of

do not depend on

`,

that is the minimization of the variance with respect to

is a nonsense in this

context.

Another way to choose the drift

is in order in order to have

by Lions and Regnier in [17]. The corresponding

Proposition 4.4

`i (s)

which is the choice made

Gs

if and only if

` = ` ,

where

` = ` ,

where

is

given by


j 

bij hj s ln
,
xj
j=1

i1
X

being dened in (17). In particular, if

`i

Gs () = ,

is computed in the following

is a xed point for the transformation

any path having value at time

`t = ` t

then

Gs () =


1 j 

bij hj ln
,
s xj
j=1

i1
X

i = 1, . . . , d,
if and only if

i = 1, . . . , d.

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach23

Proof.

hs),

The proof is immediate. By (19) one can shortly write

thus

= Gs ()

ln Gs (z) = `s +
b ln z + (I
b)(ln x +

if and only if




`s = `s =
b I h s ln
x
where the symbol

ln x

stands for a vector whose

ith

entry is given by

ln xii .
2

Remark 4.5

is diagonal, then = Gs () if and only if

b I = 0, which ensures that ` = 0, as it must


obviously follow. In any case, one always has `
e11 = 1. Moreover,
1 = 0, because of the fact that
the choice ` = ` gives a considerable simplication of the function Ft and its inverse Gt . Indeed,
since ln Ft (y) = e
`t +
e ln y + (I
e)(ln x + ht), one easily obtains
` = 0.

It is worth noticing that if originally

In fact, in such a case

Fti (y)

e=I

= yi

so that

i1
Y
j=1

moreover, since

yj  xj t/s
xj j

eij
,

ln Gs (z) = `s +
b ln z + (I
b)(ln x + hs),

conclude that

Git (z)

= zi

i1
Y
j=1

zj  xj t/s
xj j

i = 1, . . . , d;

(24)

straightforward computations allow to

bij
,

i = 1, . . . , d.

(25)

Let us now discuss about the connection with the formula given by Lions and Regnier [17], which
takes into account the process

b
X

dened as

i1


X
bt = xi exp hi t +
X
ij wtj + ii Wti
j=1
where (as usual,

P0

j=1 ()

:= 0
i
X

and)

wt

solves the system

ik wtk = ln i ln xi hi t,

i = 1, . . . , d.

(26)

k=1
It is worth remarking that

b
X

has independent components, as well as

e.
X

The formula given in

[17] states the following: setting

Ls,t [f ]() =
then

d


Y
bsi i )
1
H(X
i
E f (Xt )
Ws,t
bsi
s(t s)
X
i=1 ii
i=1
d
Y


 L []()
s,t
E (Xt ) | Xs = =
Ls,t [1]()

(27)

e and
Thus, both formulas depend on an auxiliary process with independent components (X
respectively), which in turn is determined by the introduction of a new drift (` and

the latter being dened through (26)). Furthermore, the formulas giving the operators

RR n 4804

b
X

respectively,

Ts,t

and

Bally & Caramellino & Zanette

24

Ls,t , allowing to write down the conditional expectation, are really very similar except for the point
. More precisely, both operators can be rewritten as
d


Y
1
H(Y i gi ())
i
E f (Xt )
As,t [f ]() =
W
,
s,t
s(t s)
Yi
i=1 ii
i=1
d
Y

in which

choose
choose

es
Y =X
bs
Y =X

and
and

g() = Gs () in order to obtain As,t [f ]() = Ts,t [f ]();


g() = in order to obtain As,t [f ]() = Ls,t [f ]().

1
Thus: what is the connection between the two approaches? It is simple: if one sets `t = 0 and for
P
i1
j
i = 2, . . . , d, `it = j=1 ij wt , then it is straightforward to see that condition (26) equals to the
condition studied in Proposition 4.4, that is
e = Gt () = . This means that if `s is xed in order
to have

Gs () = ,

as it has been made in Proposition 4.4, the two formulas are actually identical.

Therefore, the approach studied here is some more general than the one developed by Lions and
Regnier.
Let us resume the above observations in the following

Remark 4.6

In principle one could take

`(t) = ` t seems to be good


choices for ` can be suggested:
choice

` = 0:

arbitrarily, so that for practical purposes the simple

enough. Concerning the (now) constant

this simplies the process

e;
X

`,

up to now two main

the operator allowing to write down the expectation

becomes in this case

Ts,t [f ]() =

where

d

Y
1
E f (Xt )
s(t s)
i=1 ii
i=1
d
Y


Hi (Xs , )
i
W
,


s,t

b
Qi1 Xsj h s ij
i
j
Xs j=1 xj e

Hi (y, ) = 1Qi

b ij
1 .
j=1 (Xs /j )

` = ` ,

with

as in Proposition 4.4: as observed, this gives a formula for the conditional

expectation in point of fact identical to the one provided by Lions and Reigner and the operator

Ts,t [f ]()

is here

d

Y
e`i s
E f (Xt )
Ts,t [f ]() =
s(t s)
i=1 ii
i=1

d
Y


Hi (Xs )
i
W
,


s,t

bij
Q
Xsj hj s
Xsi i1
e
j=1
xj

which can be rewritten as

Ts,t [f ]() =

Qi1

d


Y
Hi (Xs )
i
E f (Xt )
W
.
Qi
s,t
j
bij
s(t s)
j=1 (Xs )
i=1 ii
i=1
d
Y

j=1

j ij

Let us now discuss formulas involving localization functions. If we restrict our attention to producttype localizing function, then we can rst state a localized formula for the operators

Rs,t;j [f ]()

R
Qd
(x) = i=1 i (xi ), x = (x1 , . . . , xd ) Rd , with i 0 and R i ()d = 1.
operators Ts,t and Rs,t;j , dened in (20) and (21) respectively, can be localized as follows:

Lemma 4.7
Then the

Ts,t [f ]() and

and then for the conditional expectation and its gradient. In fact, one rst has

Let

Ts,t [f ]() = T
s,t [f ]()

and

Rs,t;k [f ]() = R
s,t;j [f ](),

k = 1, . . . , d,

INRIA

Pricing and hedging American options by Monte Carlo methods using a Malliavin calculus approach25

where
d h

i
Y
esi
esi
H(X
ei ) i (X
ei )
i
T
[f
]()
=
E
f
(X
)
Ws,t
i (Xs )) +
t
s,t
esi
ii s(t s)X
i=1

(28)

and


h
esk
R
[f
]()
=
E
f
(X
)
k (X
ek )
t
s,t;k

k
Ws,t

+
esk
kk s(t s)X
k 2
ek
ek
)
t i
ek ) k (X
ek )  (Ws,t
H(X
s
s
k
+ Ws,t

+
e k )2
kk s(t s)
kk
kk s(t s)(X
s
d
h
i
Y
ei
ei
H(X
ei ) i (X
ei )
s
s
i
ei
ei ) +
Ws,t
i (X
.

s
ei
ii s(t s)X
s
i=1,i6=k

where

Proof.

denotes the probability distribution function associated with

i : i (y) =

et
et ). By conditioning w.r.t. all the coordinates of X
et ) = f Ft (X
fe(X
e has independent components, one has
by recalling that X

Set

one, and

Ry

(29)

i ()d .

except for the rst

d
 


Y
e 1 e1 )
esi
H(X
ei )
1
i
et1 ) H(Xs
Ts,t [f ]() = E E fe1 (X
Ws,t

W

s,t ,
e j ,j6=1
es1
ei
x
ej =X
11 s(t s)X
t
i=1,i6=1 ii s(t s)Xs
where we have set

etk ) = fe(X
et1 , x
e2 , . . . x
ed ).
fe1 (X

By using the one dimensional results (see Lemma

3.2), one has


h


i
e 1 e1 )
e1
(H )(X
e1 )
s
1
1
e 1 ) H(Xs
e 1 ) 1 (X 1
E fe1 (X
Ws,t
e1 ) +
Ws,t
= E fe1 (X
,
t
t
s
e1
e1
11 s(t s)X
11 s(t s)X
s
s
so that

d

h
i Y

e1
ei
(H )(X
e1 )
H(X
ei )
s
s
1
i
Ts,t [f ]() = E f (Xt ) 1 (Xs1
e1 ) +
Ws,t
Ws,t
.
e1
ei
11 s(t s)X
s
i=1,i6=1 ii s(t s)Xs
Now, by considering the conditioning w.r.t.

et1 , X
et3 , . . . , X
etd ,
X

one has

2 h
d

i Y

Y
ej
ei
(H )(X
ej )
H(X
ei )
j
s
s
i
Ts,t [f ]() = E f (Xt )
j )+
W
W
j (Xsj e
.
s,t
s,t
esj
ei
jj s(t s)X
j=1
i=1,i6=1,2 ii s(t s)Xs
By iterating a similar procedure on the remaining components, one arrives to the end.
Concerning

Rs,t;k ,

the statement can be proved in the same way, by using the one dimensional

localization formula for


except for the

k th

Rs,t

in Lemma 3.2.

Indeed, by conditioning w.r.t.

all the coordinates

one, one rst arrives to


h
ek
Rs,t;k [f ]() = E f (Xt ) k (X
ek )
s

k
Ws,t

+
ek
kk s(t s)X
s
k 2
esk
esk
)
t i
ek ) k (X
ek )  (Ws,t
H(X
k
+ Ws,t

+
esk )2
kk s(t s)
kk
kk s(t s)(X
d
i
Y
esi
H(X
ei )
i
Ws,t
.

ei
i=1,i6=k ii s(t s)Xs

RR n 4804

Bally & Caramellino & Zanette

26

Now, by conditioning w.r.t. all the coordinates except for the $j$-th one, with $j\neq k$, and by using the localization properties for $T_{s,t}$ as in Lemma 3.2, the final formula can be achieved. $\Box$

By using the localized version of the operators, the localized representation formulas for the conditional expectation and its gradient immediately follow:

Theorem 4.8 (Representation formulas II: with localization) For any $\xi\in\mathbb R^d_+$, $\phi\in\mathcal L_d$, and for any $0\leq s<t$, $F\in\mathcal E_b$, one has
$$
E\big(F(X_t)\,\big|\,X_s=\xi\big)=\frac{T^\phi_{s,t}[F](\xi)}{T^\phi_{s,t}[1](\xi)}
$$
and, as $j=1,\dots,d$,
$$
\partial_jE\big(F(X_t)\,\big|\,X_s=\xi\big)=\sum_{k=1}^{j}\beta_{kj}\,\frac{\widetilde\xi_k}{\xi_j}\,\frac{R^\phi_{s,t;k}[F](\xi)\,T^\phi_{s,t}[1](\xi)-T^\phi_{s,t}[F](\xi)\,R^\phi_{s,t;k}[1](\xi)}{T^\phi_{s,t}[1](\xi)^2},
$$
where the operators $T^\phi_{s,t}[f](\xi)$ and $R^\phi_{s,t;k}[f](\xi)$ are defined in (28) and (29) respectively.

Remark 4.9 In principle, one could take different localizing functions for each operator, that is:
$$
E\big(F(X_t)\,\big|\,X_s=\xi\big)=\frac{T^{\phi^1}_{s,t}[F](\xi)}{T^{\phi^2}_{s,t}[1](\xi)},
$$
$$
\partial_jE\big(F(X_t)\,\big|\,X_s=\xi\big)=\sum_{k=1}^{j}\beta_{kj}\,\frac{\widetilde\xi_k}{\xi_j}\,\frac{R^{\phi^3}_{s,t;k}[F](\xi)\,T^{\phi^4}_{s,t}[1](\xi)-T^{\phi^5}_{s,t}[F](\xi)\,R^{\phi^6}_{s,t;k}[1](\xi)}{T^{\phi^7}_{s,t}[1](\xi)^2}.
$$
See the next section for a discussion on localizing functions. Furthermore, what was observed in Remark 4.3 holds here as well: when $\sigma$ is diagonal, the sum giving $\partial_jE(F(X_t)\,|\,X_s=\xi)$ reduces to the single term with $k=j$, with coefficient $\widetilde\xi_j/\xi_j=e^{\ell_js}$, which in turn is equal to 1 if $\ell=0$.

5 Optimal localizing functions

Let us conclude this theoretical part with an analysis of the choice of the localizing functions. Let us first discuss the one dimensional case. By referring to Theorem 3.3, in order to compute $E(\Phi(X_t)\,|\,X_s=\xi)$ one has to evaluate
$$
T^\phi_{s,t}[f](\xi)=E\Big[f(X_t)\Big(\phi(X_s-\xi)+\frac{H(X_s-\xi)-\Phi_\phi(X_s-\xi)}{\sigma\,s(t-s)\,X_s}\,\widehat W_{s,t}\Big)\Big],
$$
with $f=\Phi$ and $f=1$, $\Phi_\phi$ being the probability distribution function of the localizer $\phi$. Such an expectation is practically evaluated by means of the empirical mean obtained through many independent replications:
$$
T^\phi_{s,t}[f](\xi)\simeq\frac1N\sum_{q=1}^Nf(X^{(q)}_t)\Big(\phi(X^{(q)}_s-\xi)+\frac{H(X^{(q)}_s-\xi)-\Phi_\phi(X^{(q)}_s-\xi)}{\sigma\,s(t-s)\,X^{(q)}_s}\,\widehat W^{(q)}_{s,t}\Big).
$$
The aim is now to choose the localizing function $\phi$ in order to reduce the variance as much as possible. To this purpose, let us introduce the quantity
$$
I_1^f(\phi)=\int_{\mathbb R}E\Big[f^2(X_t)\Big(\phi(X_s-\xi)+\frac{H(X_s-\xi)-\Phi_\phi(X_s-\xi)}{\sigma\,s(t-s)\,X_s}\,\widehat W_{s,t}\Big)^2\Big]\,d\xi,
$$


which gives the integrated variance up to the term $T_{s,t}[f](\xi)^2$, which is constant with respect to $\phi$. Then one has

Proposition 5.1 Setting $\mathcal L_1=\{\phi:\mathbb R\to[0,+\infty)\,;\ \phi\in C^1(\mathbb R),\ \phi(\pm\infty)=0\ \text{and}\ \int_{\mathbb R}\phi(t)\,dt=1\}$, then
$$
\inf_{\phi\in\mathcal L_1}I_1^f(\phi)=I_1^f(\phi^*),
$$
where $\phi^*$ is a Laplace-type probability density function:
$$
\phi^*(\eta)=\frac\lambda2\,e^{-\lambda|\eta|},\quad\eta\in\mathbb R,\qquad\text{with}\qquad
\lambda=\lambda[f]=\Bigg(\frac{E\big[f^2(X_t)\big(\tfrac{\widehat W_{s,t}}{\sigma s(t-s)X_s}\big)^2\big]}{E\big[f^2(X_t)\big]}\Bigg)^{1/2}.
$$

Proof. Let us set
$$
\Delta_{s,t}=\frac{\widehat W_{s,t}}{\sigma\,s(t-s)\,X_s}.
$$

Fix a localizing function $\phi\in\mathcal L_1$. First, by interchanging the order of integration and by considering the change of variable $\eta=X_s-\xi$, one has
$$
I_1^f(\phi)=E\int_{\mathbb R}f^2(X_t)\big(\phi(\eta)+(H-\Phi_\phi)(\eta)\,\Delta_{s,t}\big)^2\,d\eta.
$$
Now, take $\psi\in L^1(\mathbb R)$ such that for any small $\varepsilon$, $\phi+\varepsilon\psi\in\mathcal L_1$. Setting $\Psi(x)=\int_{-\infty}^x\psi(t)\,dt$, one has
$$
(I_1^f)'(\phi)(\psi)=\lim_{\varepsilon\to0}\frac1\varepsilon\big(I_1^f(\phi+\varepsilon\psi)-I_1^f(\phi)\big)
=2\,E\int_{\mathbb R}f^2(X_t)\big(\phi(\eta)+(H-\Phi_\phi)(\eta)\Delta_{s,t}\big)\big(\psi(\eta)-\Psi(\eta)\Delta_{s,t}\big)\,d\eta.
$$
We show now that $\phi^*$ is the only function in $\mathcal L_1$ such that $(I_1^f)'(\phi^*)(\psi)=0$ for any $\psi$ satisfying the conditions above. Consider the first term of the (last) r.h.s.: by using the standard integration by parts formula (recall that $\Psi'=\psi$), we can write
$$
\begin{aligned}
\int_{\mathbb R}f^2(X_t)\big(\phi(\eta)+(H-\Phi_\phi)(\eta)\Delta_{s,t}\big)\psi(\eta)\,d\eta
&=\Big[\Psi(\eta)\,f^2(X_t)\big(\phi(\eta)+(H-\Phi_\phi)(\eta)\Delta_{s,t}\big)\Big]_{-\infty}^{+\infty}\\
&\quad-\int_{\mathbb R}\Psi(\eta)\,\frac d{d\eta}\Big[f^2(X_t)\big(\phi(\eta)+(H-\Phi_\phi)(\eta)\Delta_{s,t}\big)\Big]\,d\eta.
\end{aligned}
$$
Now, since $\Psi(\eta)\to0$ as $\eta\to-\infty$ and $\phi$ is a (quite smooth) probability density function, so that $\phi(\eta)+(H-\Phi_\phi)(\eta)\to0$ as $\eta\to+\infty$, the first term vanishes. Moreover, $\frac d{d\eta}\big[\phi(\eta)+(H-\Phi_\phi)(\eta)\Delta_{s,t}\big]=\phi'(\eta)-\phi(\eta)\Delta_{s,t}$ for almost any $\eta$. Thus, we obtain
$$
(I_1^f)'(\phi)(\psi)=-2\,E\int_{\mathbb R}\Psi(\eta)\,f^2(X_t)\Big(\phi'(\eta)-\phi(\eta)\Delta_{s,t}+\big(\phi(\eta)+(H-\Phi_\phi)(\eta)\Delta_{s,t}\big)\Delta_{s,t}\Big)\,d\eta
=-2\int_{\mathbb R}\Psi(\eta)\,E\Big[f^2(X_t)\big(\phi'(\eta)+(H-\Phi_\phi)(\eta)\Delta^2_{s,t}\big)\Big]\,d\eta.
$$
Since $\Psi$ is the primitive function of an almost arbitrary integrable function, we can say that $(I_1^f)'(\phi)(\psi)=0$ for any $\psi$ if and only if
$$
E\Big[f^2(X_t)\big(\phi'(\eta)+(H-\Phi_\phi)(\eta)\Delta^2_{s,t}\big)\Big]=0
$$

for (almost) any $\eta$. Let us denote
$$
a^2=\frac{E\big(f^2(X_t)\,\Delta^2_{s,t}\big)}{E\big(f^2(X_t)\big)}
$$
and $v(\eta)=\Phi_\phi(\eta)$, so that $v'(\eta)=\phi(\eta)$ and $v''(\eta)=\phi'(\eta)$. We therefore achieve the following ordinary differential equation:
$$
v''(\eta)-a^2v(\eta)+a^2H(\eta)=0,
$$
whose solution is simply
$$
v(\eta)=\begin{cases}c_1e^{a\eta}+c_2e^{-a\eta}+1&\text{if }\eta>0\\c_3e^{a\eta}+c_4e^{-a\eta}&\text{if }\eta<0,\end{cases}
$$
with $c_1,\dots,c_4\in\mathbb R$ and $a=\sqrt{a^2}$. Now, by imposing that $v(+\infty)=1$, $v(-\infty)=0$ and that $v'(\eta)$ has to be a continuous probability density function, we finally get
$$
v^*(\eta)=\begin{cases}1-\tfrac12e^{-a\eta}&\text{if }\eta>0\\\tfrac12e^{a\eta}&\text{if }\eta<0,\end{cases}
$$
or equivalently
$$
\phi^*(\eta)=(v^*)'(\eta)=\frac a2\,e^{-a|\eta|}.
$$

Thus, $(I_1^f)'(\phi^*)(\psi)=0$ for any $\psi\in L^1(\mathbb R)$. Now, in order to show that $\phi^*$ gives the minimum, notice that for any $\psi$ such that $\phi^*+\psi\in\mathcal L_1$ one has
$$
I(\phi^*+\psi)-I(\phi^*)=E\int_{\mathbb R}f^2(X_t)\big(\psi(\eta)-\Psi(\eta)\Delta_{s,t}\big)^2\,d\eta+(I_1^f)'(\phi^*)(\psi),
$$
where $\Psi(x)=\int_{-\infty}^x\psi(z)\,dz$. Since $(I_1^f)'(\phi^*)(\psi)=0$ for any $\psi\in L^1(\mathbb R)$, then
$$
I(\phi^*+\psi)-I(\phi^*)=E\int_{\mathbb R}f^2(X_t)\big(\psi(\eta)-\Psi(\eta)\Delta_{s,t}\big)^2\,d\eta\geq0,
$$
which proves that $\phi^*$ is actually the minimizing function. $\Box$
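For numerical use, the optimal localizer costs essentially nothing to evaluate. The following minimal sketch (Python with NumPy; the function names are ours, and the samples of $X_s$, $X_t$ and $\widehat W_{s,t}$ are assumed to be available) shows how one might code $\phi^*$, its distribution function and the corresponding localized weight appearing in $T^\phi_{s,t}[f](\xi)$ in the one dimensional case:

```python
import numpy as np

def laplace_pdf(eta, a):
    """Optimal Laplace-type localizing density: phi*(eta) = (a/2) exp(-a|eta|)."""
    return 0.5 * a * np.exp(-a * np.abs(eta))

def laplace_cdf(eta, a):
    """Distribution function v*(eta) of phi*, as derived in the proof above."""
    eta = np.asarray(eta, dtype=float)
    half = 0.5 * np.exp(-a * np.abs(eta))
    return np.where(eta > 0, 1.0 - half, half)

def localized_weight(Xs, xi, What, s, t, sigma, a):
    """One-dimensional localized Malliavin weight:
       phi(Xs - xi) + (H - Phi)(Xs - xi) * What / (sigma * s * (t-s) * Xs)."""
    eta = Xs - xi
    H = (eta >= 0).astype(float)                   # Heaviside function
    delta = What / (sigma * s * (t - s) * Xs)      # the quantity Delta_{s,t}
    return laplace_pdf(eta, a) + (H - laplace_cdf(eta, a)) * delta

# T^phi_{s,t}[f](xi) is then estimated by the empirical mean
#   np.mean(f_of_Xt * localized_weight(Xs, xi, What, s, t, sigma, a))
```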

Remark 5.2 The optimal value of the parameter $\lambda$ corresponding to $f=1$ can be explicitly written (recall that $x$ denotes the starting underlying asset price):
$$
\lambda[1]=\frac1x\,e^{-hs+\sigma^2s}\sqrt{\frac{t+\sigma^2s(t-s)}{\sigma^2s(t-s)}}.
$$
In fact, by Proposition 5.1 one has
$$
\lambda[1]=\frac1{\sigma\,s(t-s)}\,E\big[X_s^{-2}(\widehat W_{s,t})^2\big]^{1/2},
$$

which can be computed. Indeed, by using the representation of $X_s$, it holds that
$$
E\big[X_s^{-2}(\widehat W_{s,t})^2\big]=x^{-2}e^{-2hs}\,E\Big[e^{-2\sigma W_s}\big((t-s)W_s-s(W_t-W_s)+\sigma s(t-s)\big)^2\Big]
=x^{-2}e^{-2hs}\,E\Big[E\Big(e^{-2\sigma W_s}\big((t-s)W_s-sy+\sigma s(t-s)\big)^2\Big)\Big|_{y=W_t-W_s}\Big].
$$
Now, since $E(e^{-2\sigma W_s})=e^{2\sigma^2s}$, $E(W_se^{-2\sigma W_s})=-2\sigma s\,e^{2\sigma^2s}$ and $E(W_s^2e^{-2\sigma W_s})=s(1+4\sigma^2s)\,e^{2\sigma^2s}$, one has
$$
E\Big[e^{-2\sigma W_s}\big((t-s)W_s-sy+\sigma s(t-s)\big)^2\Big]
=e^{2\sigma^2s}\Big(s(t-s)^2(1+4\sigma^2s)+s^2\big(\sigma(t-s)-y\big)^2-4\sigma s^2(t-s)\big(\sigma(t-s)-y\big)\Big).
$$
By inserting $y=W_t-W_s$ and by computing the resulting expectation, it follows that
$$
E\big[X_s^{-2}(\widehat W_{s,t})^2\big]=x^{-2}e^{-2hs+2\sigma^2s}\,s(t-s)\big(t+\sigma^2s(t-s)\big),
$$
so that
$$
\lambda[1]=\frac1x\,e^{-hs+\sigma^2s}\sqrt{\frac{t+\sigma^2s(t-s)}{\sigma^2s(t-s)}}.
$$

The above optimization criterion has been introduced by Kohatsu-Higa and Pettersson [16]. In principle, one could consider a measure more general than the Lebesgue one, namely replace $d\xi$ with $\mu(d\xi)$ in the expression for $I_1^f(\phi)$, but in such a case it is not possible to write down explicitly the optimal localizing function.

Such an approach can be generalized to the multidimensional case, where the functional $I_1^f$ has to be obviously replaced by
$$
I_d^f(\phi)=\int_{\mathbb R^d}E\Big[f^2(X_t)\prod_{i=1}^d\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+\frac{(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)}{\sigma_{ii}\,s(t-s)\,\widetilde X^i_s}\,\widehat W^i_{s,t}\Big)^2\Big]\,d\widetilde\eta.
$$
Then the following result holds:

Proposition 5.3 Setting $\mathcal L_d=\{\phi:\mathbb R^d\to[0,+\infty)\,;\ \phi(x)=\prod_{i=1}^d\phi_i(x_i),\ \text{where }\phi_i\in\mathcal L_1\ \text{for any }i\}$, then
$$
\inf_{\phi\in\mathcal L_d}I_d^f(\phi)=I_d^f(\phi^*),
$$
where $\phi^*(\eta)=\prod_{j=1}^d\phi^*_j(\eta_j)$, $\eta=(\eta_1,\dots,\eta_d)\in\mathbb R^d$, with $\phi^*_j(\eta_j)=\frac{\lambda_j}2e^{-\lambda_j|\eta_j|}$, $\eta_j\in\mathbb R$, and the parameters $\lambda_j=\lambda_j[f]$ enjoy the following system of nonlinear equations: setting $\Delta_{s,t;i}=\widehat W^i_{s,t}/(\sigma_{ii}\,s(t-s)\,\widetilde X^i_s)$,
$$
\lambda_j^2=\frac{E\Big[f^2(X_t)\,\Delta^2_{s,t;j}\prod_{i\,:\,i\neq j}\big(\lambda_i^2+\Delta^2_{s,t;i}\big)\Big]}{E\Big[f^2(X_t)\prod_{i\,:\,i\neq j}\big(\lambda_i^2+\Delta^2_{s,t;i}\big)\Big]},\qquad j=1,\dots,d.
$$

Proof. First, we can say that $I_d^f(\phi)=I_d^f(\phi_1,\dots,\phi_d)$, with $\phi_1,\dots,\phi_d\in\mathcal L_1$. We will then compute the derivative in the direction $(\widehat\psi_1,0,\dots,0)$, where $\widehat\psi_1=\widehat\psi_1(x_1)$ is some arbitrary function, by reducing the computation to the one-dimensional case, as in Proposition 5.1. For a fixed $f$, let us put $\widetilde f_t(y)\equiv\widetilde f(y)=f\circ F_t(y)$, $y\in\mathbb R^d_+$, $F_t$ being defined in (18). Setting
$$
\widetilde f_1^2(x_1)=\int_{\mathbb R^{d-1}}d\widetilde\eta_2\cdots d\widetilde\eta_d\;E\Big[\widetilde f^2(x_1,\widetilde X^2_t,\dots,\widetilde X^d_t)\prod_{i=2}^d\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+\frac{(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)}{\sigma_{ii}\,s(t-s)\,\widetilde X^i_s}\,\widehat W^i_{s,t}\Big)^2\Big],
$$

then
$$
I_d^f(\phi)=\int_{\mathbb R}d\widetilde\eta_1\,E\Big[\widetilde f_1^2(\widetilde X^1_t)\Big(\phi_1(\widetilde X^1_s-\widetilde\eta_1)+\frac{(H-\Phi_1)(\widetilde X^1_s-\widetilde\eta_1)}{\sigma_{11}\,s(t-s)\,\widetilde X^1_s}\,\widehat W^1_{s,t}\Big)^2\Big]=I_1^{\widetilde f_1}(\phi_1).
$$

By employing the one dimensional results, we get
$$
\frac{\partial I_d^f(\phi)}{\partial\widehat\psi_1}(\phi_1,0,\dots,0)=\frac{\partial I_1^{\widetilde f_1}(\phi_1)}{\partial\widehat\psi_1}(\widehat\psi_1)
=-2\int_{\mathbb R}d\eta\,\widehat\Psi_1(\eta)\,E\Big[\widetilde f_1^2(\widetilde X^1_t)\Big(\phi_1'(\eta)+(H-\Phi_1)(\eta)\Big[\frac{\widehat W^1_{s,t}}{\sigma_{11}\,s(t-s)\,\widetilde X^1_s}\Big]^2\Big)\Big],
$$
where $\widehat\Psi_1(y)=\int_{-\infty}^y\widehat\psi_1(\eta)\,d\eta$. We solve the equation $\frac{\partial I_d^f(\phi)}{\partial\widehat\psi_1}(\phi_1,0,\dots,0)=0$ for any $\widehat\psi_1$: since $\widehat\psi_1$ is arbitrary, $\widehat\Psi_1$ is also arbitrary, so that the above equation gives
$$
E\Big[\widetilde f_1^2(\widetilde X^1_t)\Big(\phi_1'(\eta)+(H-\Phi_1)(\eta)\Big[\frac{\widehat W^1_{s,t}}{\sigma_{11}\,s(t-s)\,\widetilde X^1_s}\Big]^2\Big)\Big]=0,\qquad\text{for any }\eta.
$$

This gives rise to an ordinary differential equation which has been studied in the proof of Proposition 5.1: the solution is then
$$
\phi_1^*(\eta)=\frac{\lambda_1}2\,e^{-\lambda_1|\eta|},\quad\eta\in\mathbb R,
$$
with
$$
\lambda_1^2=\lambda_1(\widetilde f_1)^2=\frac{E\Big[\widetilde f_1^2(\widetilde X^1_t)\Big(\frac{\widehat W^1_{s,t}}{\sigma_{11}s(t-s)\widetilde X^1_s}\Big)^2\Big]}{E\big[\widetilde f_1^2(\widetilde X^1_t)\big]}
=\frac{\int_{\mathbb R^{d-1}}d\widetilde\eta_2\cdots d\widetilde\eta_d\,E\Big[f^2(X_t)\Big(\frac{\widehat W^1_{s,t}}{\sigma_{11}s(t-s)\widetilde X^1_s}\Big)^2\prod_{i=2}^d\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+\frac{(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)}{\sigma_{ii}s(t-s)\widetilde X^i_s}\,\widehat W^i_{s,t}\Big)^2\Big]}{\int_{\mathbb R^{d-1}}d\widetilde\eta_2\cdots d\widetilde\eta_d\,E\Big[f^2(X_t)\prod_{i=2}^d\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+\frac{(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)}{\sigma_{ii}s(t-s)\widetilde X^i_s}\,\widehat W^i_{s,t}\Big)^2\Big]}.
$$
Now, since everything is symmetric, we obtain similar equations for the remaining coordinates $2,\dots,d$. If we want all these equations to hold simultaneously, we obtain the following system:

$$
\phi_j^*(\eta)=\frac{\lambda_j}2\,e^{-\lambda_j|\eta|},\quad\eta\in\mathbb R,
$$
with
$$
\lambda_j^2=\frac{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\,E\Big[f^2(X_t)\Big(\frac{\widehat W^j_{s,t}}{\sigma_{jj}s(t-s)\widetilde X^j_s}\Big)^2\prod_{i\,:\,i\neq j}\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+\frac{(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)}{\sigma_{ii}s(t-s)\widetilde X^i_s}\,\widehat W^i_{s,t}\Big)^2\Big]}{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\,E\Big[f^2(X_t)\prod_{i\,:\,i\neq j}\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+\frac{(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)}{\sigma_{ii}s(t-s)\widetilde X^i_s}\,\widehat W^i_{s,t}\Big)^2\Big]},\qquad j=1,\dots,d,
$$

where the notation $d\widetilde\eta_{-j}$ means that one has to integrate with respect to all the variables except for $\widetilde\eta_j$.

It remains now to give a better representation of the above integrals. They can be rewritten in the following simpler form: setting
$$
\Delta_{s,t;i}=\frac{\widehat W^i_{s,t}}{\sigma_{ii}\,s(t-s)\,\widetilde X^i_s},
$$


then one has to handle something like
$$
\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\,E\Big[f^2(X_t)\,\bar\Delta^2_{s,t;j}\prod_{i\,:\,i\neq j}\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\Big)^2\Big]
=E\Big[f^2(X_t)\,\bar\Delta^2_{s,t;j}\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\prod_{i\,:\,i\neq j}\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\Big)^2\Big],
$$
with $\bar\Delta_{s,t;j}=\Delta_{s,t;j}$ or $\bar\Delta_{s,t;j}=1$. Let us consider the integral inside the expectation: by recalling that $\phi_i^*(\eta)=\lambda_ie^{-\lambda_i|\eta|}/2$, then $(H-\Phi_i^*)(\eta)=\mathrm{sign}(\eta)\,e^{-\lambda_i|\eta|}/2$, so that
$$
\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\prod_{i\,:\,i\neq j}\Big[\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\Big]^2
=\prod_{i\,:\,i\neq j}\int_{\mathbb R}d\widetilde\eta_i\,\frac14\,e^{-2\lambda_i|\widetilde X^i_s-\widetilde\eta_i|}\Big[\lambda_i+\mathrm{sign}(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\Big]^2
=\prod_{i\,:\,i\neq j}\frac1{4\lambda_i}\big[\lambda_i^2+\Delta^2_{s,t;i}\big].
$$
By inserting everything, one obtains
$$
\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\,E\Big[f^2(X_t)\,\bar\Delta^2_{s,t;j}\prod_{i\,:\,i\neq j}\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\Big)^2\Big]
=E\Big[f^2(X_t)\,\bar\Delta^2_{s,t;j}\prod_{i\,:\,i\neq j}\frac1{4\lambda_i}\big[\lambda_i^2+\Delta^2_{s,t;i}\big]\Big]
$$
and the statement immediately follows. $\Box$

Remark 5.4 If $\ell=0$ (see Remark 4.6 for details), for $f=1$ the corresponding optimal values of the parameters $\lambda_j$ are given by (recall that $x_1,\dots,x_d$ are the starting underlying asset prices)
$$
\lambda_j[1]=\frac1{x_j}\,e^{-h_js+\sigma_{jj}^2s}\sqrt{\frac{t+\sigma_{jj}^2s(t-s)}{\sigma_{jj}^2s(t-s)}},\qquad j=1,\dots,d.
$$
This is an immediate consequence of the computations made in Remark 5.2.
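In practice, when $f\neq1$, the nonlinear system of Proposition 5.3 can be solved by a plain fixed-point iteration over the same Monte Carlo sample used for the price, since each $\lambda_j^2$ is a ratio of sample means depending on the current $\lambda$'s. A minimal sketch, assuming the samples f2 $=f^2(X_t)$ and Delta[i] $=\Delta_{s,t;i}$ are already available as NumPy arrays (the function name is ours), could read:

```python
import numpy as np

def optimal_lambdas(f2, Delta, n_iter=50, tol=1e-8):
    """Fixed-point iteration for the nonlinear system of Proposition 5.3.

    f2    : array (N,), samples of f^2(X_t)
    Delta : array (d, N), samples of Delta_{s,t;i}
    Returns the vector (lambda_1, ..., lambda_d)."""
    d, N = Delta.shape
    # start from the d decoupled one-dimensional problems
    lam2 = np.mean(Delta**2 * f2, axis=1) / np.mean(f2)
    for _ in range(n_iter):
        new = np.empty(d)
        for j in range(d):
            others = [i for i in range(d) if i != j]
            prod = np.prod(lam2[others][:, None] + Delta[others]**2, axis=0)
            new[j] = np.mean(f2 * Delta[j]**2 * prod) / np.mean(f2 * prod)
        if np.max(np.abs(new - lam2)) < tol:
            lam2 = new
            break
        lam2 = new
    return np.sqrt(lam2)
```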


It is worth pointing out that similar arguments could be used in order to handle the problem of minimizing the variance coming out from the expectation giving the operators $R_{s,t}[f](\xi)$ and $R^\phi_{s,t;k}[f](\xi)$ (see (7) and (29)). To this purpose, let us recall that
$$
R^\phi_{s,t;k}[f](\xi)=E\Big[f(X_t)\Big(\phi_k(\widetilde X^k_s-\widetilde\xi_k)\,\theta_{s,t;k}+(H-\Phi_k)(\widetilde X^k_s-\widetilde\xi_k)\,\pi_{s,t;k}\Big)\,\Lambda_{s,t;k}\big((\widetilde X_s-\widetilde\xi)_{-k}\big)\Big],
$$
where
$$
\theta_{s,t;k}=\frac{\widehat W^k_{s,t}}{\sigma_{kk}\,s(t-s)\,\widetilde X^k_s},\qquad
\pi_{s,t;k}=\frac1{\sigma_{kk}\,s(t-s)}\Big[\frac{(\widehat W^k_{s,t})^2}{\sigma_{kk}\,s(t-s)(\widetilde X^k_s)^2}+\widehat W^k_{s,t}-\frac t{\sigma_{kk}}\Big]
\tag{30}
$$
and
$$
\Lambda_{s,t;k}\big((\widetilde X_s-\widetilde\xi)_{-k}\big)=\prod_{i=1,\,i\neq k}^d\Big(\phi_i(\widetilde X^i_s-\widetilde\xi_i)+\frac{H(\widetilde X^i_s-\widetilde\xi_i)-\Phi_i(\widetilde X^i_s-\widetilde\xi_i)}{\sigma_{ii}\,s(t-s)\,\widetilde X^i_s}\,\widehat W^i_{s,t}\Big),
$$


where, for $\widetilde\eta\in\mathbb R^d$, the notation $\widetilde\eta_{-k}$ means the vector of $\mathbb R^{d-1}$ having the same coordinates of $\widetilde\eta$ except for the $k$-th one. In the one dimensional case, one simply has to drop the subscript $k$, take $\widetilde X=X$, $\widetilde\xi=\xi$ and set $\Lambda_{s,t;k}(\widetilde\eta_{-k})\equiv1$. Thus, if we set
$$
J_d^{f;k}(\phi)=\int_{\mathbb R^d}E\Big[f^2(X_t)\Big(\phi_k(\widetilde X^k_s-\widetilde\eta_k)\,\theta_{s,t;k}+(H-\Phi_k)(\widetilde X^k_s-\widetilde\eta_k)\,\pi_{s,t;k}\Big)^2\Lambda^2_{s,t;k}\big((\widetilde X_s-\widetilde\eta)_{-k}\big)\Big]\,d\widetilde\eta,
$$
then the following result holds:

Proposition 5.5 Setting $\mathcal L_d=\{\phi:\mathbb R^d\to[0,+\infty)\,;\ \phi(x)=\prod_{i=1}^d\phi_i(x_i),\ \text{where }\phi_i\in\mathcal L_1\ \text{for any }i\}$, then
$$
\inf_{\phi\in\mathcal L_d}J_d^{f;k}(\phi)=J_d^{f;k}(\phi^{*;k}),
$$
where $\phi^{*;k}(\eta)=\prod_{j=1}^d\phi^*_{j;k}(\eta_j)$, $\eta=(\eta_1,\dots,\eta_d)\in\mathbb R^d$, with $\phi^*_{j;k}(\eta_j)=\frac{\lambda_{j;k}}2e^{-\lambda_{j;k}|\eta_j|}$, $\eta_j\in\mathbb R$, and the $\lambda_{j;k}=\lambda_{j;k}[f]$ are given as follows:

i) in the one dimensional case ($k=j=1$), one has
$$
\lambda=\lambda[f]=\Bigg(\frac{E\big[f^2(X_t)\,\pi^2_{s,t}\big]}{E\big[f^2(X_t)\,\theta^2_{s,t}\big]}\Bigg)^{1/2},
$$
where $\theta_{s,t}$ and $\pi_{s,t}$ are as in (30);

ii) in the multidimensional case, for any fixed $k=1,\dots,d$, the $\lambda_{j;k}$'s, $j=1,\dots,d$, solve the nonlinear system
$$
(\lambda_{k;k})^2=\frac{E\Big[f^2(X_t)\,\pi^2_{s,t;k}\prod_{i=1,i\neq k}^d\big((\lambda_{i;k})^2+\Delta^2_{s,t;i}\big)\Big]}{E\Big[f^2(X_t)\,\theta^2_{s,t;k}\prod_{i=1,i\neq k}^d\big((\lambda_{i;k})^2+\Delta^2_{s,t;i}\big)\Big]}\qquad\text{and, for }j\neq k,
$$
$$
(\lambda_{j;k})^2=\frac{E\Big[f^2(X_t)\,\big((\lambda_{k;k})^2\theta^2_{s,t;k}+\pi^2_{s,t;k}\big)\,\Delta^2_{s,t;j}\prod_{i=1,i\neq j,k}^d\big((\lambda_{i;k})^2+\Delta^2_{s,t;i}\big)\Big]}{E\Big[f^2(X_t)\,\big((\lambda_{k;k})^2\theta^2_{s,t;k}+\pi^2_{s,t;k}\big)\prod_{i=1,i\neq j,k}^d\big((\lambda_{i;k})^2+\Delta^2_{s,t;i}\big)\Big]},
$$
where $\theta_{s,t;k}$ and $\pi_{s,t;k}$ are defined in (30).

In the proof, the same technique as in the proofs of Propositions 5.1 and 5.3 is used. Since it is also quite tedious, it is postponed to the Appendix.

Anyway, for practical purposes, numerical evidence shows that the choice $\lambda=1/\sqrt{t-s}$ works well enough (thus avoiding to weight the algorithm with the computation of further expectations).

Finally, let us conclude with a short consideration. For simplicity, let us consider the one dimensional case. The main problem is a good estimate of $E(\Phi(X_t)\,|\,X_s=\xi)$, which can be written as the ratio between $T_{s,t}[\Phi](\xi)$ and $T_{s,t}[1](\xi)$ but also in the following way:
$$
E\big(\Phi(X_t)\,\big|\,X_s=\xi\big)=E\Big(\Phi(X_t)\,\frac{\delta^\phi_{s,t}}{E(\delta^\phi_{s,t})}\Big),\qquad
\delta^\phi_{s,t}=\phi(X_s-\xi)+\frac{H(X_s-\xi)-\Phi_\phi(X_s-\xi)}{\sigma\,s(t-s)\,X_s}\,\widehat W_{s,t}
$$
(one could also complicate things by considering two different localizing functions in the above ratio...). So, another reasonable way to proceed might take into account the variance coming out from the weight $\delta^\phi_{s,t}/E(\delta^\phi_{s,t})$. But since it is written in terms of a ratio, at this stage it does not seem reasonably feasible to obtain results giving the associated optimal localizing function.


6 The algorithm for the pricing of American options

We give here first a detailed presentation of the use of the representation formulas in the applied context of the pricing and hedging of American options. Secondly, we summarize the pricing/hedging algorithm.

6.1 How to use the formulas in practice

The algorithm is devoted to the numerical evaluation of the price $P(0,x)$ and the delta $\Delta(0,x)$ of an American option with payoff function $\Phi$ and maturity $T$, on underlying assets whose price evolves following the Black-Scholes model, that is as in (14). It has been briefly described in the Introduction; let us now go into the details.

Let $0=t_0<t_1<\dots<t_n=T$ be a discretization of the time interval $[0,T]$, with step size equal to $\delta=T/n$. By using Theorem 2.1 and Proposition 2.2, the price $P(0,x)$ is approximated by means of $\bar P_0(x)$, where $\bar P_k(X_{k\delta})$, as $k=0,1,\dots,n$, is iteratively defined as:
$$
\begin{aligned}
&\bar P_n(X_{n\delta})=\Phi(X_{n\delta})\equiv\Phi(X_T)\\
&\bar P_k(X_{k\delta})=\max\Big\{\Phi(X_{k\delta})\,,\,e^{-r\delta}\,E\big(\bar P_{k+1}(X_{(k+1)\delta})\,\big|\,X_{k\delta}\big)\Big\},\qquad k=n-1,\dots,1,0;
\end{aligned}
\tag{31}
$$
and the delta $\Delta(0,x)$ is approximated by using the following plan: setting
$$
\bar\Delta(X_\delta)=
\begin{cases}
\nabla\Phi(\xi)\big|_{\xi=X_\delta}&\text{if }\bar P_1(X_\delta)<\Phi(X_\delta)\\[2pt]
e^{-r\delta}\,\nabla_\xi E\big(\bar P_2(X_{2\delta})\,\big|\,X_\delta=\xi\big)\big|_{\xi=X_\delta}&\text{if }\bar P_1(X_\delta)>\Phi(X_\delta),
\end{cases}
\tag{32}
$$
then $\bar\Delta_0(x)=E_x\big(\bar\Delta(X_\delta)\big)$. The conditional expectation $E(\bar P_{k+1}(X_{(k+1)\delta})\,|\,X_{k\delta})$ and the derivative $\nabla_\xi E(\bar P_2(X_{2\delta})\,|\,X_\delta=\xi)$ will be computed through the formulas given in the previous section, by means of suitable empirical means evaluated over $N$ simulated paths.

Remark 6.1 In the context of the geometric Brownian motion, the process $X$ can be exactly simulated at each instant $t_k=k\delta$. So, in this particular case we do not need an approximation of $X_{k\delta}$, and thus we write directly $X_{k\delta}$. Furthermore, it is worth remarking that the algorithm allows to use the same sample in order to simulate all the involved conditional expectations, as will follow from the next description.

As $k=n,n-1,\dots,0$ we need
$$
X^i_{k\delta}=x_i\,e^{(r_i-\frac12\sum_{j=1}^i\sigma^2_{ij})k\delta+\sum_{j=1}^i\sigma_{ij}W^j_{k\delta}},\qquad i=1,\dots,d.
$$
In order to have $X_{k\delta}$, we simply need $W_{k\delta}$.

Since the algorithm is of backward-type, for the above simulation we consider the following backward approach, which uses the Brownian bridge law. Indeed, at time $T=n\delta$, we can simulate $W_{n\delta}$ in the classical way:
$$
W_{n\delta}=\sqrt{n\delta}\,U_n,\qquad\text{with }U_n=(U^1_n,\dots,U^d_n),\ U^i_n\sim N(0,1),\ i=1,\dots,d,\ \text{independent},
$$
which gives $X_{n\delta}$. Now, in order to simulate $X_{(n-1)\delta}$, we need $W_{(n-1)\delta}$, which in turn can be simulated by using the Brownian bridge: since $W_{n\delta}$ is known, $W_{(n-1)\delta}$ can be simulated by using the conditional law of $W_s$ given $W_t=y$, which for $0<s<t$ is a Gaussian law with mean $\frac st\,y$ and variance $\frac{s(t-s)}t\,I$. Thus,
$$
W_{(n-1)\delta}=\frac{n-1}n\,W_{n\delta}+\sqrt{\frac{n-1}n\,\delta}\;U_{n-1},
$$
with $U_{n-1}=(U^1_{n-1},\dots,U^d_{n-1})$, $U^i_{n-1}\sim N(0,1)$, $i=1,\dots,d$, all independent. Obviously, we can proceed similarly for the simulation of $W_{k\delta}$, as $k=n-2,\dots,1$:
$$
W_{k\delta}=\frac k{k+1}\,W_{(k+1)\delta}+\sqrt{\frac k{k+1}\,\delta}\;U_k,
$$
with $U_k=(U^1_k,\dots,U^d_k)$, $U^i_k\sim N(0,1)$, $i=1,\dots,d$, independent. Thus, the basic data in the algorithm are given by
$$
\mathcal U=\{U^{i,q}_k\,;\ k=1,\dots,n\ \text{(time)},\ i=1,\dots,d\ \text{(dimension)},\ q=1,\dots,N\ \text{(sample)}\}
\tag{33}
$$
and the simulation algorithm can be summarized step by step as follows.


1. Computation of the samples

1, . . . , N ,
for
for

i,q
(Wk
)i=1,...,d; k=1,...,n , q = 1, . . . , N :

k=n:

i,q
Wn
=

n Uni,q ,
i,q
Wk

k = n 1, . . . , 1 :

i = 1, . . . , d,

k
W i,q
=
+
k + 1 (k+1)

k
Uki,q ,
k+1

i,q
(Xk
)i=1,...,d; k=0,...,n , q = 1, . . . , N :
k = n, n 1, . . . , 1

set for

i,q
= xi e(ri 2
Xk
and

q=

and

2. Computation of the samples

1, . . . , N ,

for any xed sample

set

X0i,q = xi , i = 1, . . . , d.
time=1

time=0

Figure 1

Pi
j=1

2
ij
)k+

Pi
j=1

i,q
ij Wk

for any xed sample

i = 1, . . . , d

q =

(35)

As an example, Figure 1 shows a set of simulated paths of

X.

Xq

........ .
.
......
.......
................1
....
..................
...............................................
....... ................................ .
......
.
..
........................................................................................................
.........
............
.........
..........................
....
............
.... ....................................................................................................................
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
...
........ ......
.....
........
....
................
....
....................................
...........................................................................
.............
....
..
......... .................................................. ................................
......
... ....... ......
............
............. ........... ...........................................................
.......
.......... .....
....
.
.
.................. ................... .................................................
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
............................. ...
........
................................................
..
...............
....
.....
..................... ........ ........... ..........................
...................................
................................. ...................................................... ..................
.....
..........................
..........
.........
.............................................................................................................................
.
.
.
.
.
.
.
.
.
.
.
.
.
.
..
. ..
.........................................
.... ..........
..................... ................................. ........................
. .
...................... ......... ....... .......................................................................................................
. .. .
...
..
........................................................................................................................................
.............
................. ...... ..... ............................
..
.....
......... ... .........
...............................................................................................................................................................
...................................................
............................

An example of the tree turning out by simulating

3. Computation of

(34)

i = 1, . . . , d.

e i,q )i=1,...,d; k=0,...,n , q = 1, . . . , N ,


(X
k

10

paths of the process

on

[0, 1].

3. Computation of the samples $(\widetilde X^{i,q}_{k\delta})_{i=1,\dots,d;\,k=0,\dots,n}$, $q=1,\dots,N$, allowing to numerically evaluate the conditional expectations involved in (31). In order to do this, if $d>1$ one needs to introduce the drift $\ell$, which could vary according to the time interval of interest $[k\delta,(k+1)\delta]$, that is, one has something like $\ell^i_k$, for $k=1,\dots,n-1$ (time) and $i=1,\dots,d$ (dimension). As a first stage, the $\ell^i_k$'s can be chosen arbitrarily. But since one could also choose $\ell_k=(\ell^1_k,\dots,\ell^d_k)$ depending on the position of $X$ at time $k\delta$, the weight $\ell_k$ could also depend on the $q$-th sample, as suggested in Remark 4.6. So let us consider the set
$$
\mathcal L=\{\ell^{i,q}_k\,;\ k=1,\dots,n-1\ \text{(time)},\ i=1,\dots,d\ \text{(dimension)},\ q=1,\dots,N\ \text{(sample)}\}.
\tag{36}
$$

Once $\mathcal L$ is given or computed, one can compute the sample $(\widetilde X^{i,q}_{k\delta})_{i=1,\dots,d;\,k=0,\dots,n}$: for any fixed sample $q=1,\dots,N$, set for $k=n,n-1,\dots,1$
$$
\widetilde X^{i,q}_{k\delta}=x_i\,e^{(r_i-\frac12\sum_{j=1}^i\sigma^2_{ij})k\delta+\ell^{i,q}_kk\delta+\sigma_{ii}W^{i,q}_{k\delta}},\qquad i=1,\dots,d,
\tag{37}
$$
and $\widetilde X^{i,q}_0=x_i$, $i=1,\dots,d$.

Let us point out some remarks:

(a) in the one dimensional case, the auxiliary process $\widetilde X$ is not needed, so one can drop this computation or else set $\widetilde X^q_{k\delta}=X^q_{k\delta}$;

(b) each $\widetilde X^{i,q}_{k\delta}$ has to be evaluated in terms of its corresponding $\ell^{i,q}_k$, $q=1,\dots,N$;

(c) there are two main differences between $X^{i,q}_{k\delta}$ and $\widetilde X^{i,q}_{k\delta}$:

i. $\widetilde X^{i,q}_{k\delta}$ does not contain $\sum_{j=1}^i\sigma_{ij}W^{j,q}_{k\delta}$ but only $\sigma_{ii}W^{i,q}_{k\delta}$. As a consequence, $\widetilde X^{1,q}_{k\delta},\dots,\widetilde X^{d,q}_{k\delta}$ are independent, for any fixed $q$ and $k$;

ii. $\widetilde X^{i,q}_{k\delta}$ contains the term $\ell^i_k$, which does not appear in $X^{i,q}_{k\delta}$.

4. Computation of the weights $\{\widehat W^{i,q}_k\}_{i=1,\dots,d;\,k=1,\dots,n-1}$, $q=1,\dots,N$, defined as
$$
\widehat W^{i,q}_k:=\widehat W^{i,q}_{t_k,t_{k+1}}=(t_{k+1}-t_k)\,W^{i,q}_{t_k}-t_k\big(W^{i,q}_{t_{k+1}}-W^{i,q}_{t_k}\big)+t_k(t_{k+1}-t_k)\,\sigma_{ii}.
$$
So, for any fixed sample $q=1,\dots,N$, set for $k=n-1,\dots,1$
$$
\widehat W^{i,q}_k=\delta\,W^{i,q}_{k\delta}-k\delta\big(W^{i,q}_{(k+1)\delta}-W^{i,q}_{k\delta}\big)+k\delta^2\sigma_{ii},\qquad i=1,\dots,d.
\tag{38}
$$
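To fix ideas, here is a minimal Python sketch of this simulation stage, gathering (34), (35), (37) and (38). The function name and interface are ours; x0 and r are NumPy arrays of length d, sigma is the (lower-triangular) d x d volatility matrix, and the drift ell defaults to the admissible choice $\ell=0$ discussed above.

```python
import numpy as np

def simulate_backward(n, N, d, T, x0, r, sigma, ell=None, rng=None):
    """Backward simulation of the Brownian paths (34), the asset X (35),
    the auxiliary process Xtilde (37) and the Malliavin weights What (38).
    Shapes: W, X, Xt are (n+1, d, N); What is (n, d, N); index k <-> time k*delta."""
    rng = rng or np.random.default_rng()
    delta = T / n
    if ell is None:
        ell = np.zeros((n + 1, d, N))      # choice ell = 0 (see step 3)
    W = np.zeros((n + 1, d, N))
    W[n] = np.sqrt(n * delta) * rng.standard_normal((d, N))
    for k in range(n - 1, 0, -1):          # Brownian bridge, backward in time
        W[k] = (k / (k + 1)) * W[k + 1] \
             + np.sqrt(k * delta / (k + 1)) * rng.standard_normal((d, N))
    X = np.empty((n + 1, d, N)); X[0] = x0[:, None]
    Xt = np.empty((n + 1, d, N)); Xt[0] = x0[:, None]
    for k in range(1, n + 1):
        for i in range(d):
            drift = (r[i] - 0.5 * np.sum(sigma[i, :i + 1] ** 2)) * k * delta
            X[k, i] = x0[i] * np.exp(drift + sigma[i, :i + 1] @ W[k, :i + 1])
            # auxiliary process: independent components, extra drift ell
            Xt[k, i] = x0[i] * np.exp(drift + ell[k, i] * k * delta
                                      + sigma[i, i] * W[k, i])
    What = np.zeros((n, d, N))
    for k in range(1, n):                  # weight on [k*delta, (k+1)*delta]
        What[k] = delta * W[k] - k * delta * (W[k + 1] - W[k]) \
                + k * delta ** 2 * np.diag(sigma)[:, None]
    return W, X, Xt, What
```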

Once we have all the previous ingredients, we can proceed to the computation of $\bar P_k(X_{k\delta})$ given by (31). To this purpose, let us set
$$
P_k[F](X^q_{k\delta})=E\big(F(X_{(k+1)\delta})\,\big|\,X_{k\delta}=\xi\big)\Big|_{\xi=X^q_{k\delta}},
$$
where $X^q_{k\delta}=(X^{1,q}_{k\delta},\dots,X^{d,q}_{k\delta})$ is the $q$-th sample, given by (35), which has to be considered here as a datum. Notice that for each fixed (time) $k$, we have a random space grid $X^q_{k\delta}=(X^{1,q}_{k\delta},\dots,X^{d,q}_{k\delta})\in\mathbb R^d$ and we compute $P_k[F](X^q_{k\delta})$ for each point of the grid.
By Theorem 4.2, $P_k[F](X^q_{k\delta})$ is given by
$$
P_k[F](X^q_{k\delta})=\frac{\displaystyle E\Big[F(X_{(k+1)\delta})\prod_{i=1}^d\frac{H(\widetilde X^i_{k\delta}-\widetilde\eta_i)}{\widetilde X^i_{k\delta}}\,\widehat W^i_k\Big]}{\displaystyle E\Big[\prod_{i=1}^d\frac{H(\widetilde X^i_{k\delta}-\widetilde\eta_i)}{\widetilde X^i_{k\delta}}\,\widehat W^i_k\Big]}\Bigg|_{\widetilde\eta=\widetilde X^q_{k\delta}},
$$
where, for $\eta\in\mathbb R$, $H(\eta)=1$ if $\eta\geq0$ and $H(\eta)=0$ otherwise.

In this practical context, such expectations are computed and thus replaced by the associated empirical means, that is, we set in practice
$$
P_k[F](X^q_{k\delta})=\frac{\displaystyle\sum_{q'=1}^NF(X^{q'}_{(k+1)\delta})\prod_{i=1}^d\frac{H(\widetilde X^{i,q'}_{k\delta}-\widetilde X^{i,q}_{k\delta})}{\widetilde X^{i,q'}_{k\delta}}\,\widehat W^{i,q'}_k}{\displaystyle\sum_{q'=1}^N\prod_{i=1}^d\frac{H(\widetilde X^{i,q'}_{k\delta}-\widetilde X^{i,q}_{k\delta})}{\widetilde X^{i,q'}_{k\delta}}\,\widehat W^{i,q'}_k}.
\tag{39}
$$
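The empirical quotient (39) can be vectorized over the whole random grid. The following Python fragment is a sketch under the same assumptions as above (the arrays come from the hypothetical simulate_backward helper; memory grows like $dN^2$, which is acceptable for the sample sizes used below):

```python
import numpy as np

def malliavin_cond_exp(F_next, Xt_k, What_k):
    """Empirical version (39) of E(F(X_{(k+1)delta}) | X_{k delta} = .),
    evaluated at the N grid points X^q_{k delta}.

    F_next : (N,), values F(X^{q'}_{(k+1)delta})
    Xt_k   : (d, N), auxiliary process at time k*delta
    What_k : (d, N), Malliavin weights on [k*delta, (k+1)*delta]"""
    # indicator H(Xt^{i,q'} - Xt^{i,q}) for all pairs (q', q)
    H = (Xt_k[:, :, None] >= Xt_k[:, None, :]).astype(float)      # (d, q', q)
    weights = np.prod(H * (What_k / Xt_k)[:, :, None], axis=0)    # (q', q)
    num = F_next @ weights            # sum over q' of F(X^{q'}) * weight(q', q)
    den = weights.sum(axis=0)
    return num / den                  # one value per grid point q
```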

Bally & Caramellino & Zanette

36

Here, $\widetilde X^{i,q}_{k\delta}$ and $\widehat W^{i,q}_k$ are computed through (37) and (38) respectively. Finally, we can set up the dynamic programming principle:
$$
\begin{aligned}
&u_n(X^q_{n\delta})=\Phi(X^q_{n\delta}),\qquad q=1,\dots,N;\\
&\text{for }k=n-1,\dots,1:\quad u_k(X^q_{k\delta})=\max\big\{\Phi(X^q_{k\delta})\,,\,e^{-r\delta}\,P_k[u_{k+1}](X^q_{k\delta})\big\},\qquad q=1,\dots,N.
\end{aligned}
$$
Obviously, $u_k$ gives the Monte Carlo estimate for $\bar P_k$ in (31) and finally the price $\bar P_0$ is approximated by $u_0(x)=\max\big(\Phi(x),e^{-r\delta}P_0[u_1](x)\big)$, where in practice we set
$$
P_0[u_1](x)=\frac1N\sum_{q=1}^Nu_1(X^q_\delta).
$$
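Putting the pieces together, the whole backward induction is a few lines on top of the two sketches above. Again this is only an illustration under our assumptions: simulate_backward and malliavin_cond_exp are the hypothetical helpers introduced earlier, payoff plays the role of $\Phi$, and the scalar discount rate is taken equal to r[0].

```python
import numpy as np

def american_price(payoff, n, N, T, x0, r, sigma, rng=None):
    """Monte Carlo dynamic programming for the price, as in (31).
    payoff maps a (d, N) array to an (N,) array; x0, r are length-d arrays."""
    d = len(x0)
    delta = T / n
    _, X, Xt, What = simulate_backward(n, N, d, T, x0, r, sigma, rng=rng)
    disc = np.exp(-r[0] * delta)
    u = payoff(X[n])                                  # u_n at maturity
    for k in range(n - 1, 0, -1):
        cont = malliavin_cond_exp(u, Xt[k], What[k])  # empirical (39)
        u = np.maximum(payoff(X[k]), disc * cont)
    p0 = float(payoff(x0[:, None])[0])                # obstacle at the spot
    return max(p0, disc * float(u.mean()))

# example: the standard one dimensional American put of Section 6.3, K = 100
# price = american_price(lambda x: np.maximum(100.0 - x[0], 0.0),
#                        n=10, N=5000, T=1.0, x0=np.array([100.0]),
#                        r=np.array([np.log(1.1)]), sigma=np.array([[0.2]]))
```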

This procedure gives the price. Concerning the delta, everything starts at the final steps, that is, when time $\delta$ is considered. Indeed, by (32) we can approximate $\bar\Delta_0(x)=(\bar\Delta_{0;1}(x),\dots,\bar\Delta_{0;d}(x))$ through its Monte Carlo estimate $v_0(x)=(v_{0;1}(x),\dots,v_{0;d}(x))$ given by, for $j=1,\dots,d$,
$$
v_{0;j}(x)=\frac1N\sum_{q=1}^Nv_{1;j}(X^q_\delta),
$$
where $v_{1;j}(X^q_\delta)=\partial_j\Phi(\xi)|_{\xi=X^q_\delta}$ if $u_1(X^q_\delta)<\Phi(X^q_\delta)$ and $v_{1;j}(X^q_\delta)=e^{-r\delta}\,\partial_jE(u_2(X_{2\delta})\,|\,X_\delta=\xi)|_{\xi=X^q_\delta}$ if $u_1(X^q_\delta)>\Phi(X^q_\delta)$ (recall that $u_2$ is the estimate for $\bar P_2$). The gradient $\nabla\Phi(\xi)$ should obviously be considered as given in input. Concerning the gradient of the conditional expectation, by using Theorem 4.2 one has to evaluate
$$
\mathcal H_j[u_2](X^q_\delta)\equiv\partial_jE\big(u_2(X_{2\delta})\,\big|\,X_\delta=\xi\big)\Big|_{\xi=X^q_\delta}
=\sum_{m=1}^j\beta_{mj}\,\frac{\widetilde X^{m,q}_\delta}{X^{j,q}_\delta}\,\frac{R_{s,t;m}[u_2](X^q_\delta)\,T_{s,t}[1](X^q_\delta)-T_{s,t}[u_2](X^q_\delta)\,R_{s,t;m}[1](X^q_\delta)}{T_{s,t}[1](X^q_\delta)^2},
\tag{40}
$$
where $T_{s,t}$ and $R_{s,t;m}$ are given by (20) and (21) respectively, with $s=\delta$ and $t=2\delta$. Now, since they are weighted expectations of random variables for which we have $N$ samples, they can be practically evaluated by means of the associated empirical means: by taking into account (20) and (21), we write, for $f=u_2$ or $f=1$ (more precisely, by replacing $f(X^{q'}_{2\delta})=u_2(X^{q'}_{2\delta})$ or $f(X^{q'}_{2\delta})=1$ in the formulas below),

$$
T_{s,t}[f](X^q_\delta)\simeq\frac1N\sum_{q'=1}^Nf(X^{q'}_{2\delta})\prod_{i=1}^d\frac{H(\widetilde X^{i,q'}_\delta-\widetilde X^{i,q}_\delta)}{\sigma_{ii}\,s(t-s)\,\widetilde X^{i,q'}_\delta}\,\widehat W^{i,q'}_1
$$
and
$$
R_{s,t;m}[f](X^q_\delta)\simeq\frac1N\sum_{q'=1}^Nf(X^{q'}_{2\delta})\,\frac{H(\widetilde X^{m,q'}_\delta-\widetilde X^{m,q}_\delta)}{\sigma_{mm}\,s(t-s)}\Big[\frac{(\widehat W^{m,q'}_1)^2}{\sigma_{mm}\,s(t-s)(\widetilde X^{m,q'}_\delta)^2}+\widehat W^{m,q'}_1-\frac t{\sigma_{mm}}\Big]\prod_{i=1,\,i\neq m}^d\frac{H(\widetilde X^{i,q'}_\delta-\widetilde X^{i,q}_\delta)}{\sigma_{ii}\,s(t-s)\,\widetilde X^{i,q'}_\delta}\,\widehat W^{i,q'}_1.
\tag{41}
$$
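As a sketch, (41) can also be vectorized over the grid, following the reconstruction of the weights given above (the helper name and interface are ours, in the same hypothetical setting as the previous fragments):

```python
import numpy as np

def delta_weights(f_vals, Xt_1, What_1, sigma, s, t):
    """Empirical T_{s,t}[f] and R_{s,t;m}[f] of (41) at all N grid points.
    f_vals: (N,) values f(X^{q'}_{2 delta}); Xt_1, What_1: (d, N) at time s = delta."""
    d, N = Xt_1.shape
    sig = np.diag(sigma)                                          # sigma_ii
    H = (Xt_1[:, :, None] >= Xt_1[:, None, :]).astype(float)      # (d, q', q)
    base = H * (What_1 / (sig[:, None] * s * (t - s) * Xt_1))[:, :, None]
    T = f_vals @ np.prod(base, axis=0) / N                        # (N,)
    R = np.empty((d, N))
    for m in range(d):
        brack = (What_1[m]**2 / (sig[m] * s * (t - s) * Xt_1[m]**2)
                 + What_1[m] - t / sig[m]) / (sig[m] * s * (t - s))
        others = [i for i in range(d) if i != m]
        prod = np.prod(base[others], axis=0)                      # (q', q)
        R[m] = f_vals @ (H[m] * brack[:, None] * prod) / N
    return T, R

# For (40): with T, R = delta_weights(u2_vals, ...) and T1, R1 on f = 1,
#   Hj = sum over m <= j of beta[m, j] * (Xt_1[m] / X_1[j])
#        * (R[m] * T1 - T * R1[m]) / T1**2
```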

This concludes the analysis of the pricing/hedging algorithm.

Let us point out that, for the sake of simplicity, in the above description we have taken into account the non-localized formulas. In practice, it is much better to use localizing functions in order to reduce the variance, so one should use the formulas coming from Theorem 4.8. Obviously, nothing changes except for the choice of the localizing functions, for which we refer to the discussion in Section 5.

Finally, let us observe that one could use a further technique allowing to reduce the variance: the introduction of a control variable. Unfortunately, there is no standard way to proceed in this direction. For example, one could use as a control variable the price of the associated European option. The idea is the following. For a fixed initial time $t$ and underlying asset price $x$, let us set $P^{am}(t,x)$ and $P^{eu}(t,x)$ as the price of an American and European option respectively, with the same payoff $\Phi$ and maturity $T$. We define

$$
\bar P(t,x)=P^{am}(t,x)-P^{eu}(t,x).
$$
Then it is easy to see that
$$
\bar P(t,X_t)=\sup_{\tau\in\mathcal T_{t,T}}E\Big(e^{-r(\tau-t)}\,\widehat\Phi(\tau,X_\tau)\,\Big|\,\mathcal F_t\Big),
$$
where $\mathcal T_{t,T}$ stands for the set of all the stopping times taking values on $[t,T]$ and $\widehat\Phi$ is defined by
$$
\widehat\Phi(t,x)=\Phi(x)-P^{eu}(t,x)
$$
(notice that the obstacle $\widehat\Phi(t,x)$ is now dependent on the time variable also, and is such that $\widehat\Phi(T,x)=0$). Thus, for the numerical valuation of $\bar P(0,x)$, one can set up a dynamic programming principle in point of fact identical to the one previously described, provided that the obstacle $\Phi$ is replaced by the new obstacle $\widehat\Phi(t,x)$. Once the estimated price $\bar P_0(x)$ and delta $\bar\Delta_0(x)$ are computed, the approximation of the price and delta of the American option is then given by
$$
P^{am}_0(x)=\bar P_0(x)+P^{eu}(0,x)\qquad\text{and}\qquad\Delta^{am}_0(x)=\bar\Delta_0(x)+\Delta^{eu}(0,x)
$$
respectively. Notice that the new obstacle has to be evaluated at each time step: in order to set up this program, it should be possible to compute the price/delta of a European option on $\Phi$. This happens for some call or put options, for which prices and deltas are known in closed form. But one could also think to proceed by simulation for their computation, by using the formulas given in Theorem 3.1 and Theorem 4.2.
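For the one dimensional put used in the experiments below, the European price and delta entering the control variable are indeed available in closed form; a short Python sketch of the standard Black-Scholes formulas (our own illustration, not part of the original algorithm description) is:

```python
from math import log, sqrt, exp
from scipy.stats import norm

def bs_put(t, x, K, r, sigma, T):
    """Black-Scholes European put price and delta at time t, spot x, no dividends."""
    tau = T - t
    if tau <= 0:
        return max(K - x, 0.0), (-1.0 if x < K else 0.0)
    d1 = (log(x / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    price = K * exp(-r * tau) * norm.cdf(-d2) - x * norm.cdf(-d1)
    delta = norm.cdf(d1) - 1.0
    return price, delta

# new obstacle for the dynamic programming on P_bar:
#   Phi_hat(t, x) = max(K - x, 0.0) - bs_put(t, x, K, r, sigma, T)[0]
```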

6.2 Sketch of the pricing/hedging algorithm

The algorithm itself can be stated as follows. We refer here to the simplest case: we do not consider localizing functions (but we underline in footnotes where they should be) nor control variables.

- Set the diffusion and option parameters: the dimension $d$, the starting point $x$, the interest rate $r$ and the dividends $\delta_i$, the volatility matrix $\sigma$ and the auxiliary matrices $\widetilde\sigma$ and $\beta$, the maturity $T$, the payoff $\Phi$ and its first derivatives $\partial_j\Phi$, $j=1,\dots,d$.

- Choose $n$ and set $\delta=T/n$.

STEP n






RR n 4804

Produce

Unq N(0, I), q = 1, . . . , N .

Using (34), compute

q
Wn
, q = 1, . . . , N .

Using (35), compute

q
Xn
, q = 1, . . . , N .

Initialization: set

q
q
un (Xn
) = (Xn
).

Bally & Caramellino & Zanette

38

STEP k, for k = n-1,...,1

- Produce $U^q_k\sim N(0,I)$, $q=1,\dots,N$.
- Using (34), compute $W^q_{k\delta}$, $q=1,\dots,N$.
- Using (35), compute $X^q_{k\delta}$, $q=1,\dots,N$.
- Do the following¹:
  * Choose $\ell^q_k$, $q=1,\dots,N$. For example:
    choice 1: $\ell^q_k=0$;
    choice 2: $\ell^q_k=\ell^*(X^q_{k\delta})$ as in Proposition 4.4 (with $\xi=X^q_{k\delta}$ and $s=k\delta$): $\ell^{1,q}_k=0$ and, for $i=2,\dots,d$,
    $$
    \ell^{i,q}_k=\sum_{j=1}^{i-1}\beta_{ij}\Big(\frac1{k\delta}\ln\frac{X^{j,q}_{k\delta}}{x_j}-h_j\Big).
    $$
  * Using (37), compute $\widetilde X^q_{k\delta}$, $q=1,\dots,N$.
  * Using (38), compute $\widehat W^q_k$, $q=1,\dots,N$.
- Using (39)², compute $P_k[u_{k+1}](X^q_{k\delta})$, $q=1,\dots,N$.
- Compute, for any $q=1,\dots,N$,
$$
u_k(X^q_{k\delta})=\max\big\{\Phi(X^q_{k\delta})\,,\,e^{-r\delta}\,P_k[u_{k+1}](X^q_{k\delta})\big\}.
$$
- If $k=1$, then add the following³:
  * Using (41) and (40), compute⁴ $\mathcal H_j[u_2](X^q_\delta)$, $j=1,\dots,d$, $q=1,\dots,N$.
  * Compute, for any $j=1,\dots,d$ and $q=1,\dots,N$,
$$
v_{1;j}(X^q_\delta)=\partial_j\Phi(X^q_\delta)\,\mathbf 1_{\{u_1(X^q_\delta)<\Phi(X^q_\delta)\}}+e^{-r\delta}\,\mathcal H_j[u_2](X^q_\delta)\,\mathbf 1_{\{u_1(X^q_\delta)>\Phi(X^q_\delta)\}}.
$$

Remark. For $q=1,\dots,N$, the quantities $U^q_{k+1}$, $W^q_{(k+1)\delta}$, $X^q_{(k+1)\delta}$, $\ell^q_k$, $\widetilde X^q_{k\delta}$, $\widehat W^q_k$ and $u_{k+1}(X^q_{(k+1)\delta})$ will not be employed anymore.

STEP 0

- Compute
$$
P_0[u_1](x)=\frac1N\sum_{q=1}^Nu_1(X^q_\delta)
$$
and set $u_0(x)=\max\big(\Phi(x),e^{-r\delta}P_0[u_1](x)\big)$, which finally approximates the price $P(0,x)$.

- Compute, for any $j=1,\dots,d$,
$$
v_{0;j}(x)=\frac1N\sum_{q=1}^Nv_{1;j}(X^q_\delta),
$$
which finally approximates the $j$-th component of the delta vector $\Delta(0,x)$.

¹ This step is devoted to the multidimensional case: recall that $\widetilde X\equiv X$ in the one dimensional case.
² Or: first choose the localizing (=probability density) functions $\phi_i$, $i=1,\dots,d$, compute the $\Phi_i$'s as the associated probability distribution functions and afterwards use the localized version of (39), coming from part i) of Theorem 4.8 (or Theorem 3.3 if $d=1$).
³ This step is devoted to the computation of the delta.
⁴ Or: first choose the localizing (=probability density) functions $\phi_i$, $i=1,\dots,d$, compute the $\Phi_i$'s as the associated probability distribution functions and afterwards use the localized version of (41), coming from part ii) of Theorem 4.8 (or Theorem 3.3 if $d=1$).


6.3 Numerical results

In this section, the Malliavin-Monte Carlo method is numerically illustrated on three test problems. We first study a one-dimensional and a two-dimensional American put problem in order to assess the numerical behavior of the algorithm. We then investigate the methodology by pricing and hedging American put options on the geometric mean of five stocks.

6.3.1 Numerical experiments

Standard American put options. We consider a one-dimensional American put option with payoff $\Phi(x)=(K-x)^+$, exercise price $K=100$, volatility $\sigma=0.2$, maturity $T=1$ year, instantaneous interest rate $r=\ln(1.1)$ and dividend yield rate $\delta=0$. We take as the true reference price and delta the ones issued from the Binomial Black-Scholes Richardson extrapolation tree-method [9] (BBSR-P and BBSR-$\Delta$ in the next tables) with 1000 steps. Results are reported for problems with $n=10,20,50$ time periods in Table 1, where Nmc is the number of Monte Carlo iterations.

Put on the minimum of 2 assets. We evaluate an American put option on the minimum of two underlying assets with payoff $\Phi(x)=(K-\min(x_1,x_2))^+$. We assume that the initial values of the stock prices are $x_1=x_2=100$, the exercise price is $K=100$, the volatility is $\sigma_1=\sigma_2=0.2$ and the value of the correlation is $\rho=0$ (this means that the volatility matrix is diagonal: $\sigma=\mathrm{diag}[\sigma_1,\sigma_2]$), the interest rate is $r=\ln(1.05)$, while the continuous dividend rates are $\delta_1=\delta_2=0$. We take as the true reference price and deltas the ones issued from the Villeneuve-Zanette finite difference algorithm [23] (VZ-P and VZ-$\Delta$ in the next tables) with 500 time-space steps. Results are reported for problems with $n=10,20,50$ time periods in Table 2.

Put on the geometric mean of 5 assets. We compute the American put option on the geometric mean of five assets with payoff $\Phi(x)=(K-(\prod_{i=1}^5x_i)^{1/5})^+$. We assume that the initial values of the stock prices are $x_1=\dots=x_5=100$, the exercise price is $K=100$, the volatility is $\sigma_1=\dots=\sigma_5=0.2$ and the correlations are null (again, this means that $\sigma=\mathrm{diag}[\sigma_1,\dots,\sigma_5]$), the interest rate is $r=\ln(1.1)$, while the continuous dividend rates are $\delta_1=\dots=\delta_5=0$. With this payoff, the simulation results are benchmarked with the one-dimensional American BBSR tree-method [9] with 1000 steps. Results are reported for problems with $n=5,10,20$ time periods in Table 3.

Tables of results. The tables refer to the examples previously introduced. Here, P and $\Delta$ denote the approximating price and delta respectively. In any experiment, the price of the associated European option has been used as a control variable. Moreover, the localization has also been taken into account, as a Laplace-type probability density function (see Section 5). In order to relax the computational executing time, we have always used the parameter of the localizing function equal to $1/\sqrt\delta$, $\delta=T/n$ being the time step, where $T$ stands for the maturity and $n$ is the number of time periods. Concerning the auxiliary drift $\ell$, we have chosen both $\ell=0$ and the $\ell$ giving $\xi$ as a fixed point for $G$ (see Remark 4.6 for details). Actually, the results are not influenced by the two possibilities: we obtain really similar outcomes.

For the sake of comparison with other existing Monte Carlo methods, in the case of the standard American put (Table 1) also the price and delta provided by the Barraquand and Martineau [5] and Longstaff-Schwartz [18] algorithms⁵ are reported (prices and deltas are denoted as BM-P, BM-$\Delta$ and LS-P, LS-$\Delta$, respectively).

⁵ It is worth noticing that the Barraquand and Martineau algorithm takes into account a space-grid, while the Longstaff and Schwartz algorithm uses least-squares approximation techniques: we have considered the ones implemented in the software Premia, see http://cermics.enpc.fr/premia


Table 1: Standard American put. For $n=10,20,50$ time periods and Nmc $=500,1000,5000,10000,20000$ Monte Carlo iterations, the table reports the price estimates (the reference BBSR-P, the Barraquand-Martineau BM-P, the Longstaff-Schwartz LS-P and the present method P) and the corresponding delta estimates (BBSR-$\Delta$, BM-$\Delta$, LS-$\Delta$ and $\Delta$); the reference values are BBSR-P $=4.918$ and BBSR-$\Delta=-0.387$.

Table 2: American put on the minimum of 2 assets (here, VZ-$\Delta$ denotes the common value of VZ-$\Delta_1$ and VZ-$\Delta_2$). For $n=10,20,50$ time periods and Nmc $=500,1000,5000,10000,20000$, the table reports the price estimate P against the reference VZ-P $=10.306$ and the delta estimates $\Delta_1,\Delta_2$ against the reference VZ-$\Delta=-0.295$.


Table 3: American put on the geometric mean of 5 assets (here, BBSR-$\Delta$ denotes the common value of BBSR-$\Delta_1,\dots,$BBSR-$\Delta_5$). For $n=5,10,20$ time periods and Nmc $=500,1000,5000,10000,20000$, the table reports the price estimate P against the reference BBSR-P $=1.583$ and the delta estimates $\Delta_1,\dots,\Delta_5$ against the reference BBSR-$\Delta=-0.0755$.


6.3.2 Final comments

The numerical values for the price and delta in Tables 1, 2 and 3 show good accordance with the true values. It is worth noticing that this holds already for a small number of both time periods (10) and Monte Carlo iterations (500, 1000). Moreover, from Table 3 one may deduce that, as the dimension increases, the convergence (in terms of the number of iterations) is slower when the number of time periods is high. This could be explained if one knew the theoretical error, for which at the moment there are no results. However, Bouchard and Touzi in [8] (Theorem 6.3) have proved that, in a simulation scheme very similar to the one developed here (in particular, using localizing functions which generalize the one given in Theorem 3.4 in the one dimensional case), the maximum of all the $L^p$ distances between the true conditional expectations and the associated regression estimators is of order $\delta^{-d/(4p)}N^{-1/(2p)}$ as $\delta\to0$ (recall that $\delta$ is the discretization time-step, $N$ is the number of simulations and $d$ stands for the dimension; notice that there is a dependence on the dimension as well). In particular, the maximum of the variances is of order $\delta^{-d/4}N^{-1/2}$ as $\delta\to0$. This would suggest that, as the dimension increases, one should also increase the number of Monte Carlo iterations in order to achieve good results.

We also tested what happens if one does not consider the localization function and/or the control variable. The results are shown in Table 4, which refers to the simplest case of the one dimensional American put previously introduced.

Table 4: Standard American put, 10 time periods (true price: 4.918).

Nmc   | P: with loc., without contr. var. | P: without loc., with contr. var. | P: without loc., without contr. var.
500   | 5.056 | 4.615  | 4.024
1000  | 4.873 | 4.725  | 4.272
5000  | 4.717 | 10.417 | 124.148
10000 | 4.773 | 14.330 | 10437.562
20000 | 4.829 | 10.224 | 219.259

First of all, it is worth pointing out that the algorithm numerically does not work if the localization is not taken into account. On the contrary, as Table 4 shows, the use of the localization, even without any control variable, gives unstable but rather reasonable prices. Thus, the results in Tables 1, 2, 3 and 4 allow to deduce that the introduction of the control variable together with the localization brings more stable results, with quite satisfactory precision both for prices and deltas even with a small number of Monte Carlo iterations (500, 1000).

Let us analyze the CPU time arising from the running of the algorithm. Our computations have been performed in double precision on a PC Pentium IV 1.8 GHz with 256 Mb of RAM. Figure 2 shows the computational time spent for the pricing of the one dimensional American put option as in the above example. As expected, this Monte Carlo procedure has a rather high CPU time cost, which empirically comes out to be quadratic with respect to the number of simulations.

Figure 2 allows to conclude that this method is interesting for further developments but is uncompetitive with respect to other Monte Carlo methods, such as Barraquand-Martineau [5] and Longstaff-Schwartz [18], in terms of computing time. Nevertheless, differently from these other procedures (which do not provide an ad hoc way for the Greeks: they are computed simply and poorly by finite differences), this method allows to efficiently obtain the delta. Therefore, as a further research development, one could think to combine this procedure with some other one in order to achieve better results, mainly cheaper in terms of CPU time costs, as has been suggested by H. Regnier in private communications.


[Figure 2: Required CPU time (in seconds, between 400 and 3200) in terms of Monte Carlo iterations (from 1000 to 10000), for 10 and 20 time periods.]

Appendix

Proof of Proposition 5.5. We give here only a sketch of the proof, since it is quite similar to the proofs of Propositions 5.1 and 5.3.

First, suppose $d=1$. Take $\psi\in L^1(\mathbb R)$ such that for any small $\varepsilon$, $\phi+\varepsilon\psi\in\mathcal L_1$. Setting $\Psi(x)=\int_{-\infty}^x\psi(t)\,dt$, one has (as in the proof of Proposition 5.1)
$$
(J_1^f)'(\phi)(\psi)=2\,E\int_{\mathbb R}f^2(X_t)\big(\phi(\eta)\,\theta_{s,t}+(H-\Phi_\phi)(\eta)\,\pi_{s,t}\big)\big(\psi(\eta)\,\theta_{s,t}-\Psi(\eta)\,\pi_{s,t}\big)\,d\eta
=-2\int_{\mathbb R}\Psi(\eta)\,E\Big[f^2(X_t)\big(\phi'(\eta)\,\theta^2_{s,t}+(H-\Phi_\phi)(\eta)\,\pi^2_{s,t}\big)\Big]\,d\eta.
$$
By setting
$$
a^2=\frac{E\big(f^2(X_t)\,\pi^2_{s,t}\big)}{E\big(f^2(X_t)\,\theta^2_{s,t}\big)}
$$
and $v(\eta)=\Phi_\phi(\eta)$, one has to solve the ordinary differential equation $v''(\eta)-a^2v(\eta)+a^2H(\eta)=0$. This gives $\phi(\eta)=v'(\eta)=\frac a2e^{-a|\eta|}$. Now, it is immediate to see that $\phi$ gives the minimum, and so the statement follows in dimension 1.


Consider now the case

d > 1.

in (18).
First, we compute the derivative of

fe12 (x1 ) =

k = 1: by symmetry arguments, the


fet (y) fe(y) = f Ft (y), y Rd+ and Ft being dened

For simplicity, let us assume

general case will be clear. Also, let us set

Jdf ;1 ()

in the direction

(1 , 0, . . . , 0).

then

Jdf ;1 () =

d h

i2 
Y
ei
(H i )(X
ei )
s
i
et2 , . . . , X
etd)
esi
de
1 E fe2 (x1 , X
ei ) +
Ws,t
i (X
ei
ii s(t s)X
Rd1
s
i=2

Z
R


h
i2 
e
et1 ) 1 (X
es1
es1
de
1 E fe12 (X
e1 )s,t;1 + (H 1 )(X
e1 )s,t;1
= J1f1 (1 ).

Now, since we are here interested in the behavior in the direction


one dimensional result, getting immediately the solution:

1 () =

RR n 4804

Setting

1 1 ||
e
,
2

R,

(1 , 0, . . . , 0),

we can employ the

Bally & Caramellino & Zanette

44

with
$$
\lambda_1^2=\frac{E\big[\widetilde f_1^2(\widetilde X^1_t)\,\pi^2_{s,t;1}\big]}{E\big[\widetilde f_1^2(\widetilde X^1_t)\,\theta^2_{s,t;1}\big]}
=\frac{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-1}\,E\Big[f^2(X_t)\,\pi^2_{s,t;1}\prod_{i=2}^d\big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\big)^2\Big]}{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-1}\,E\Big[f^2(X_t)\,\theta^2_{s,t;1}\prod_{i=2}^d\big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\big)^2\Big]}.
$$
We consider now the behavior in the direction $(0,\psi_2,0,\dots,0)$. Setting
$$
\widetilde f_2^2(x_2)=\int_{\mathbb R^{d-1}}d\widetilde\eta_{-2}\;E\Big[\widetilde f^2(\widetilde X^1_t,x_2,\widetilde X^3_t,\dots,\widetilde X^d_t)\Big(\phi_1(\widetilde X^1_s-\widetilde\eta_1)\,\theta_{s,t;1}+(H-\Phi_1)(\widetilde X^1_s-\widetilde\eta_1)\,\pi_{s,t;1}\Big)^2\prod_{i=3}^d\Big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\Big)^2\Big],
$$
then
$$
J_d^{f;1}(\phi)=\int_{\mathbb R}d\widetilde\eta_2\,E\Big[\widetilde f_2^2(\widetilde X^2_t)\Big(\phi_2(\widetilde X^2_s-\widetilde\eta_2)+\frac{(H-\Phi_2)(\widetilde X^2_s-\widetilde\eta_2)}{\sigma_{22}\,s(t-s)\,\widetilde X^2_s}\,\widehat W^2_{s,t}\Big)^2\Big]=I_1^{\widetilde f_2}(\phi_2),
$$
where $I_1^f$ is the functional handled in Proposition 5.1. By using similar arguments, one obtains
$$
\phi_2^*(\eta)=\frac{\lambda_2}2\,e^{-\lambda_2|\eta|},\quad\eta\in\mathbb R,
$$
with
$$
\lambda_2^2=\frac{E\big[\widetilde f_2^2(\widetilde X^2_t)\big(\frac{\widehat W^2_{s,t}}{\sigma_{22}s(t-s)\widetilde X^2_s}\big)^2\big]}{E\big[\widetilde f_2^2(\widetilde X^2_t)\big]}
=\frac{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-2}\,E\Big[f^2(X_t)\,\Theta(\widetilde X^1_s-\widetilde\eta_1)^2\,\Delta^2_{s,t;2}\prod_{i=3}^d\big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\big)^2\Big]}{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-2}\,E\Big[f^2(X_t)\,\Theta(\widetilde X^1_s-\widetilde\eta_1)^2\prod_{i=3}^d\big(\phi_i(\widetilde X^i_s-\widetilde\eta_i)+(H-\Phi_i)(\widetilde X^i_s-\widetilde\eta_i)\,\Delta_{s,t;i}\big)^2\Big]},
$$
where
$$
\Theta(\widetilde X^1_s-\widetilde\eta_1)=\phi_1(\widetilde X^1_s-\widetilde\eta_1)\,\theta_{s,t;1}+(H-\Phi_1)(\widetilde X^1_s-\widetilde\eta_1)\,\pi_{s,t;1}.
$$
Now, for the remaining coordinates one can apply similar arguments. Thus, by summarizing, one has that $\phi^*(\eta)=\prod_{j=1}^d\phi^*_j(\eta_j)$, $\eta=(\eta_1,\dots,\eta_d)\in\mathbb R^d$, with $\phi^*_j(\eta_j)=\frac{\lambda_j}2e^{-\lambda_j|\eta_j|}$, $\eta_j\in\mathbb R$, and the $\lambda_j=\lambda_j[f]$ given by:
$$
\lambda_1^2=\frac{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-1}\,E\Big[f^2(X_t)\,\pi^2_{s,t;1}\prod_{i=2}^d\big(\phi_i(\eta_i)+(H-\Phi_i)(\eta_i)\,\Delta_{s,t;i}\big)^2\Big]}{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-1}\,E\Big[f^2(X_t)\,\theta^2_{s,t;1}\prod_{i=2}^d\big(\phi_i(\eta_i)+(H-\Phi_i)(\eta_i)\,\Delta_{s,t;i}\big)^2\Big]}\qquad\text{and, for }j=2,\dots,d,
$$
$$
\lambda_j^2=\frac{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\,E\Big[f^2(X_t)\,\Theta(\eta_1)^2\,\Delta^2_{s,t;j}\prod_{i=2,i\neq j}^d\big(\phi_i(\eta_i)+(H-\Phi_i)(\eta_i)\,\Delta_{s,t;i}\big)^2\Big]}{\int_{\mathbb R^{d-1}}d\widetilde\eta_{-j}\,E\Big[f^2(X_t)\,\Theta(\eta_1)^2\prod_{i=2,i\neq j}^d\big(\phi_i(\eta_i)+(H-\Phi_i)(\eta_i)\,\Delta_{s,t;i}\big)^2\Big]},
$$

where $\Theta(\eta_1)=\phi_1(\eta_1)\,\theta_{s,t;1}+(H-\Phi_1)(\eta_1)\,\pi_{s,t;1}$. Now, in order to give a more interesting representation for the $\lambda$'s, it is easy to show that (see also the proof of Proposition 5.3)
$$
\int_{\mathbb R}\Big(\phi_i(\eta_i)+(H-\Phi_i)(\eta_i)\,\Delta_{s,t;i}\Big)^2d\eta_i=\frac1{4\lambda_i}\big[\lambda_i^2+\Delta^2_{s,t;i}\big],
\qquad
\int_{\mathbb R}\Theta(\eta_1)^2\,d\eta_1=\frac1{4\lambda_1}\big[\lambda_1^2\,\theta^2_{s,t;1}+\pi^2_{s,t;1}\big],
$$
so that the $\lambda$'s have to solve the nonlinear system
$$
\lambda_1^2=\frac{E\Big[f^2(X_t)\,\pi^2_{s,t;1}\prod_{i=2}^d\big(\lambda_i^2+\Delta^2_{s,t;i}\big)\Big]}{E\Big[f^2(X_t)\,\theta^2_{s,t;1}\prod_{i=2}^d\big(\lambda_i^2+\Delta^2_{s,t;i}\big)\Big]}\qquad\text{and, for }j=2,\dots,d,
$$
$$
\lambda_j^2=\frac{E\Big[f^2(X_t)\,\big(\lambda_1^2\,\theta^2_{s,t;1}+\pi^2_{s,t;1}\big)\,\Delta^2_{s,t;j}\prod_{i=2,i\neq j}^d\big(\lambda_i^2+\Delta^2_{s,t;i}\big)\Big]}{E\Big[f^2(X_t)\,\big(\lambda_1^2\,\theta^2_{s,t;1}+\pi^2_{s,t;1}\big)\prod_{i=2,i\neq j}^d\big(\lambda_i^2+\Delta^2_{s,t;i}\big)\Big]}.
$$
Finally, it is straightforward to see that $\phi^*$ actually gives the minimum. $\Box$

References

[1] V. Bally, M.E. Caballero, B. Fernandez, N. El Karoui (2002). Reflected BSDE's, PDE's and variational inequalities. Preprint INRIA.

[2] V. Bally and G. Pagès (2000). A quantization algorithm for solving multi-dimensional optimal stopping problems. Prépublication du Laboratoire de Probabilités & Modèles Aléatoires, n. 628, Université Paris VI.

[3] V. Bally and G. Pagès (2002). Error analysis of a quantization algorithm for obstacle problems. To appear in Stoch. Proc. and their Appl.

[4] V. Bally, G. Pagès and J. Printems (2003). First-order schemes in the numerical quantization method. Mathematical Finance, 13, 1-16.

[5] J. Barraquand and D. Martineau (1995). Numerical valuation of high dimensional multivariate American securities. Journal of Finance and Quantitative Analysis, 30, 383-405.

[6] A. Bensoussan and J.L. Lions (1984). Impulse Control and Quasivariational Inequalities. Gauthier-Villars.

[7] B. Bouchard, I. Ekeland and N. Touzi (2002). On the Malliavin approach to Monte Carlo approximation of conditional expectations. Prépublication du Laboratoire de Probabilités & Modèles Aléatoires, n. 709; available at http://www.proba.jussieu.fr/mathdoc/preprints/index.html

[8] B. Bouchard and N. Touzi (2002). Discrete time approximation and Monte Carlo simulation of Backward Stochastic Differential Equations. Preprint.

[9] M. Broadie and J. Detemple (1996). American option valuation: new bounds, approximations, and a comparison of existing methods. The Review of Financial Studies, 9, 1221-1250.

[10] M. Broadie and P. Glasserman (1997). Pricing American-style securities using simulation. Journal of Economic Dynamics and Control, 21, 1323-1352.

[11] D. Chevance (1997). Thèse de l'Université de Provence & INRIA.

[12] E. Fournié, J.M. Lasry, J. Lebuchoux, P.L. Lions, N. Touzi (1999). Applications of Malliavin calculus to Monte Carlo methods in finance. Finance & Stochastics, 3, 391-412.

[13] E. Fournié, J.M. Lasry, J. Lebuchoux, P.L. Lions (2001). Applications of Malliavin calculus to Monte Carlo methods in finance II. Finance & Stochastics, 5, 201-236.

[14] N. El Karoui, C. Kapoudjian, E. Pardoux, S. Peng and M.C. Quenez (1997). Reflected solutions of Backward SDE's and related obstacle problems for PDE's. The Annals of Probability, 25, 702-737.

[15] N. El Karoui, S. Peng and M.C. Quenez (1997). Backward stochastic differential equations in finance. Mathematical Finance, 7, 1-71.

[16] A. Kohatsu-Higa and R. Pettersson (2001). Variance reduction methods for simulation of densities on Wiener space. SIAM Journal on Numerical Analysis, 40, 431-450.

[17] P.L. Lions and H. Regnier (2001). Calcul du prix et des sensibilités d'une option américaine par une méthode de Monte Carlo. Preprint.

[18] F.A. Longstaff and E.S. Schwartz (2001). Valuing American options by simulation: a simple least-squares approach. The Review of Financial Studies, 14, 113-148.

[19] D. Nualart (1995). The Malliavin Calculus and Related Topics. Springer Verlag, Berlin.

[20] E. Pardoux and S. Peng (1990). Adapted solution of a backward stochastic differential equation. Systems and Control Letters, 14, 55-61.

[21] E. Pardoux and S. Peng (1992). Backward stochastic differential equations and quasilinear parabolic partial differential equations and their applications. In Lect. Notes Control Inf. Sci., Vol. 176, 200-217.

[22] J.N. Tsitsiklis and B. Van Roy (1999). Optimal stopping for Markov processes: Hilbert space theory, approximation algorithms and an application to pricing high-dimensional financial derivatives. IEEE Trans. Autom. Control, 44, 1840-1851.

[23] S. Villeneuve and A. Zanette (2002). Parabolic ADI methods for pricing American options on two stocks. Mathematics of Operations Research, 27, 121-149.
