
51 American Monte Carlo Method

51.1 American-style securities


A Bermudan option gives its holder the right to exercise it on a set of
dates $\{T_1, T_2, \ldots, T_N\}$.
We denote by $I_i(x_i)$ the payoff of the option if it is exercised at time $T_i$ in the state of
the world $x_i \in \mathbb{R}^d$. It is the amount of cash received at time $T_i$ by the holder if they decide
to exercise.
Then the value of our Bermudan option is given by:
$$V(0) = \sup_{\tau \in \Gamma} E^{Q}\left[ e^{-\int_0^{\tau} r_s\,ds}\, I_\tau(x_\tau) \right] = B(0, T_N)\, \sup_{\tau \in \Gamma} E^{Q_N}\left[ \frac{I_\tau(x_\tau)}{B(\tau, T_N)} \right],$$

where $\Gamma$ is the set of all stopping times with values in $\{T_1, T_2, \ldots, T_N\}$, and $Q$ and $Q_N$
are the risk-neutral and forward-neutral probabilities, respectively.
As $\tau \in \Gamma$ can also be referred to as an exercise strategy (or exercise boundary), the
expression above means that the price of a Bermudan option is its expected value under
the best exercise strategy.

51.2 Description of the Methods


51.2.1 General Framework
Both methods (Longstaff & Schwartz and Andersen) are based upon a common framework:

1. An exercise boundary for the option is computed.

2. The computed exercise boundary is used for forward pricing using classical Monte
Carlo (once known, the exercise boundary can be used to price the option like a
trigger option).

This methodology can be explained easily: as shown, the price of a Bermudan option is
given by its expected value provided its best exercise strategy (or stopping time). Hence,
a Bermudan option can be priced by first approximating its best exercise strategy and
then using it to price the option as a trigger.
The difference between the two numerical methods (L&S and Andersen) lies in the
way the approximation of the exercise boundary is computed.


51.2.2 Andersen’s algorithm


Step 1: Computation of the Exercise Boundary
In Andersen's method, the option's best exercise strategy is approximated
by stopping times of the form: $\tau = T_i$ for the first index $i$ such that $I_i(x_i) \ge a_i$. We refer to the
set of those stopping times as $\Gamma_N$, and we denote by $\tau_{a_1,\ldots,a_N}$ an element of $\Gamma_N$ (because
a stopping time $\tau \in \Gamma_N$ can be identified with an $N$-tuple $(a_1, \ldots, a_N)$).
Hence, the best exercise strategy in the Andersen algorithm is given by:
$$(a_1^*, \ldots, a_N^*) = \tau_{a_1^*,\ldots,a_N^*} = \operatorname*{ArgMax}_{\tau \in \Gamma_N} E^{Q_N}\left[ \frac{I_\tau(x_\tau)}{B(\tau, T_N)} \right]$$

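The value of a Bermudan option is then approximated by:
$$V(0) \approx B(0, T_N)\, \sup_{\tau \in \Gamma_N} E^{Q_N}\left[ \frac{I_\tau(x_\tau)}{B(\tau, T_N)} \right] = B(0, T_N)\, E^{Q_N}\left[ \frac{I_\tau(x_\tau)}{B(\tau, T_N)} \right]_{\tau = \tau_{a_1^*,\ldots,a_N^*}}.$$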

The question is now how to find $(a_1^*, \ldots, a_N^*)$.
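A first method would be to compute the expression
$$\tau_{a_1^*,\ldots,a_N^*} = \operatorname*{ArgMax}_{\tau \in \Gamma_N} E^{Q_N}\left[ \frac{I_\tau(x_\tau)}{B(\tau, T_N)} \right]$$
by approximating
$$E^{Q_N}\left[ \frac{I_\tau(x_\tau)}{B(\tau, T_N)} \right]$$
with
$$\frac{1}{M} \sum_{i=1}^{M} \sum_{n=1}^{N} \frac{I_n(x_n^{(i)})}{B(T_n, T_N)}\, 1_{\tau(x^{(i)}) = n},$$
where $x^{(1)}, \ldots, x^{(M)}$ are $M$ simulated paths, and by solving
$$\tau_{a_1^*,\ldots,a_N^*} = \operatorname*{ArgMax}_{\tau \in \Gamma_N} \sum_{i=1}^{M} \sum_{n=1}^{N} \frac{I_n(x_n^{(i)})}{B(T_n, T_N)}\, 1_{\tau(x^{(i)}) = n}.$$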

This expression involves an optimization over $N$ variables, which is quite difficult to
carry out. That is why Andersen proposed a solution involving $N$ one-variable optimizations.
This solution is a recursive backward algorithm. It works as follows:

1. Build $M$ trajectories $(x^{(1)}, \ldots, x^{(M)})$.

2. Set $a_N^* = 0$. This reflects the fact that on the last exercise date, the option is
exercised if and only if its exercise value is positive.

Copyright © 2005-2013 Pricing Partners



3. Assume that $(a_{n+1}^*, \ldots, a_N^*)$ are computed. We would like to compute $a_n^*$. Define
$f_n(a)$ as
$$f_n(a) = \sum_{i=1}^{M} \left( \frac{I_n(x_n^{(i)})}{B(T_n, T_N)}\, 1_{I_n(x_n^{(i)}) \ge a} + C_n(x_n^{(i)})\, 1_{I_n(x_n^{(i)}) < a} \right),$$
where $C_n(x_n^{(i)})$ is defined as
$$C_n(x_n^{(i)}) = I_q(x_q^{(i)})$$
with
$$q = \inf\left\{ p > n \;\middle|\; I_p(x_p^{(i)}) \ge a_p^* \right\}.$$
Now solve
$$a_n^* = \operatorname*{ArgMax}_{a} f_n(a).$$

For a better understanding of the method, $C_n(x_n^{(i)})$ can be interpreted as
the value of the option at date $T_n$ in the state of the world $x_n^{(i)}$ in case it is not exercised
at time $T_n$. Hence, $f_n(a_n)$ is the value of the option at date $T_n$ provided that its holder
exercises if $I_n(x_n) \ge a_n$.
This procedure is continued until $n = 0$.
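The backward recursion above can be sketched as follows. This is a minimal illustration rather than the Price-it® implementation: it assumes zero interest rates (so that $B(\cdot, T_N) = 1$), non-negative payoffs, and a one-variable search over a finite grid of candidate thresholds. The names `simulate_paths`, `andersen_thresholds` and `grid` are hypothetical.

```python
import math
import random

def simulate_paths(m, n_dates, s0=100.0, sigma=0.2, dt=0.25, seed=0):
    """Driftless lognormal paths observed at T_1..T_N (zero rates assumed)."""
    rng = random.Random(seed)
    paths = []
    for _ in range(m):
        s, path = s0, []
        for _ in range(n_dates):
            s *= math.exp(-0.5 * sigma**2 * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0))
            path.append(s)
        paths.append(path)
    return paths

def andersen_thresholds(payoffs, grid):
    """payoffs[i][n] holds I_n(x_n^(i)) for date index n = 0..N-1.
    Returns (thresholds, in-sample mean value), with B(., T_N) = 1 assumed."""
    n_dates = len(payoffs[0])
    a_star = [0.0] * n_dates                 # last entry stays 0: a*_N = 0
    # continuation cash flow on each path; payoffs are assumed >= 0,
    # so exercising at the last date always pays I_N
    cont = [row[-1] for row in payoffs]
    for n in range(n_dates - 2, -1, -1):
        def f_n(a):
            # exercise when I_n >= a, otherwise keep the continuation cash flow
            return sum(row[n] if row[n] >= a else c for row, c in zip(payoffs, cont))
        a_star[n] = max(grid, key=f_n)       # one-variable optimisation over the grid
        cont = [row[n] if row[n] >= a_star[n] else c for row, c in zip(payoffs, cont)]
    return a_star, sum(cont) / len(cont)
```

For a put struck at 100, one could build `payoffs = [[max(100.0 - s, 0.0) for s in p] for p in simulate_paths(2000, 4)]` and search over `grid = [float(k) for k in range(1, 102)]`. Including a threshold above every possible payoff in the grid lets the optimizer choose "never exercise at this date".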

Step 2: Computation of the price given the exercise boundary.

Once $(a_1^*, \ldots, a_N^*)$ is computed, it is used as follows:
$M$ other paths $x^{(1)}, \ldots, x^{(M)}$ of the underlying process are generated, and the price
of the security is given by:
$$P = B(0, T_N)\, \frac{1}{M} \sum_{i=1}^{M} \frac{I_{n(i)}(x_{n(i)}^{(i)})}{B(T_{n(i)}, T_N)},$$
with
$$n(i) = \inf\left\{ p \ge 0 \;\middle|\; \frac{I_p(x_p^{(i)})}{B(T_p, T_N)} \ge a_p^* \right\}.$$
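The forward pricing pass can be sketched as below. This is an illustrative sketch under simplifying assumptions: the bond prices $B(T_p, T_N)$ are taken to be deterministic (one value per date, whereas in general they depend on the path), and a path on which the trigger is never hit contributes zero. The function name and argument layout are hypothetical.

```python
def price_with_thresholds(payoffs, bond_to_tn, a_star, b0):
    """payoffs[i][p] = I_p(x_p^(i)); bond_to_tn[p] = B(T_p, T_N); b0 = B(0, T_N)."""
    total = 0.0
    for row in payoffs:
        for p, (cash, bond) in enumerate(zip(row, bond_to_tn)):
            deflated = cash / bond
            if deflated >= a_star[p]:     # first date at which the trigger is hit
                total += deflated
                break
    return b0 * total / len(payoffs)
```

Because the thresholds are frozen before the new paths are drawn, this pass is an ordinary Monte Carlo valuation of a trigger-style payoff.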

51.2.3 Longstaff & Schwartz’s Algorithm


The Longstaff & Schwartz algorithm works the same way as Andersen's, though exercise
decisions are taken in a quite different manner. It works by calculating an approximation
of the continuation value (denoted $C_p^*(x_p)$) of the option at each exercise date. The
decision to exercise is then taken when $I_p(x_p) \ge C_p^*(x_p)$.
The algorithm works as follows:


Step 1: Computation of the exercise boundary (i.e. of the continuation values).

1. $M$ trajectories $(x^{(1)}, \ldots, x^{(M)})$ are simulated.

2. Define $C_N^*(x_N) = 0$, coming from the fact that the decision to exercise at the last
exercise date is taken iff $I_N(x_N) \ge 0$.

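3. Let's now assume that $C_{n+1}^*, \ldots, C_N^*$ are computed. We want to compute $C_n^*$.
Let $\Phi = (\varphi_1, \ldots, \varphi_k)$ be a set of real-valued functions defined on $\mathbb{R}^d$.
We write, for all $i \in [1, M]$,
$$C_n^i = \frac{I_q(x_q^{(i)})}{B(T_q, T_N)}$$
where
$$q = \inf\left\{ p > n \;\middle|\; \frac{I_p(x_p^{(i)})}{B(T_p, T_N)} \ge C_p^*(x_p^{(i)}) \right\},$$
and we define $C_n^*$ by $C_n^* = \alpha_1^* \varphi_1 + \ldots + \alpha_k^* \varphi_k$ with
$$(\alpha_1^*, \ldots, \alpha_k^*) = \operatorname*{ArgMin}_{(\alpha_1, \ldots, \alpha_k)} \sum_{i=1}^{M} \left( \alpha_1 \varphi_1(x_n^{(i)}) + \ldots + \alpha_k \varphi_k(x_n^{(i)}) - C_n^i \right)^2.$$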
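The regression step can be sketched with NumPy as follows. This is a minimal illustration, not the Price-it® code: it assumes a one-dimensional state, payoffs already deflated by $B(T_n, T_N)$, and a monomial basis $\varphi_k(x) = x^{k-1}$. The function name `lsm_coefficients` and the array layout are hypothetical.

```python
import numpy as np

def lsm_coefficients(states, payoffs, degree=2):
    """states[i, n] = x_n^(i); payoffs[i, n] = I_n(x_n^(i)) / B(T_n, T_N).
    Returns one coefficient vector per date (zeros on the last date: C*_N = 0)."""
    m, n_dates = payoffs.shape
    coeffs = np.zeros((n_dates, degree + 1))
    cash = payoffs[:, -1].copy()                 # C_n^i: realised deflated cash flow
    for n in range(n_dates - 2, -1, -1):
        # design matrix with columns 1, x, ..., x^degree
        basis = np.vander(states[:, n], degree + 1, increasing=True)
        coeffs[n] = np.linalg.lstsq(basis, cash, rcond=None)[0]
        continuation = basis @ coeffs[n]         # C*_n(x_n^(i))
        exercise = payoffs[:, n] >= continuation
        cash = np.where(exercise, payoffs[:, n], cash)
    return coeffs
```

`np.linalg.lstsq` solves exactly the ArgMin problem of step 3; the loop then updates the realised cash flow on each path so that it can serve as $C_{n-1}^i$ at the previous date.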

Step 2: Now that $C_0^*, \ldots, C_N^*$ are computed, the price of the option is given by:
$$P = B(T_0, T_N)\, \frac{1}{M} \sum_{i=1}^{M} \frac{I_{n(i)}(x_{n(i)}^{(i)})}{B(T_{n(i)}, T_N)}$$
where
$$n(i) = \inf\left\{ p \ge 0 \;\middle|\; \frac{I_p(x_p^{(i)})}{B(T_p, T_N)} \ge C_p^*(x_p^{(i)}) \right\}.$$

Basis used in Price-it® The Longstaff & Schwartz algorithm works with a regression basis. As
is well known, the choice of this basis is critical as it changes the
numerical results substantially.
The intuitive idea is to take a basis whose first terms represent the continuation
value function well. Ideally, the optimal basis would be the one whose first term is the
continuation value function $C_p^*(x_p)$ itself: in this case, the regression is exact with
a single term, and the projection of the continuation value function onto the
Longstaff & Schwartz basis keeps all the information. However, this
optimal basis cannot be used in practice, as the continuation value function is precisely
the unknown we are trying to infer numerically.
Another important criterion in the choice of the basis is that the projection
of the continuation value function onto it converges rapidly to the continuation value function itself.
By rapidly, we mean that a polynomial of degree 4 or 5 is sufficient. In Price-it®, we decided to use for
the Longstaff & Schwartz basis the functions $1$, $F(S(t_i))$, $[F(S(t_i))]^2$, $[F(S(t_i))]^3$
and $[F(S(t_i))]^4$, where $F(S(t_i))$ is the intrinsic value of the option. This basis is of
particular interest as it makes the Longstaff & Schwartz method very similar to the Andersen method,
making the two methods highly compatible.
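As an illustration of this choice of basis, the sketch below builds the regression design matrix from the intrinsic value on each path. It assumes NumPy; the function name `intrinsic_power_basis` is hypothetical and not part of Price-it®.

```python
import numpy as np

def intrinsic_power_basis(intrinsic, degree=4):
    """Columns 1, F, F^2, ..., F^degree, one row per simulated path."""
    values = np.asarray(intrinsic, dtype=float)
    return np.vander(values, degree + 1, increasing=True)
```

With such a basis the estimated continuation value depends on the state only through the intrinsic value $F$, so the exercise decision compares $F$ with a function of $F$. This is one way to read the similarity with Andersen's threshold rule on the exercise value.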

51.2.4 Greeks computation in American Monte Carlo


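Let's assume that the exercise boundary of the priced security is parameterized by
an $n$-tuple $(a_1, \ldots, a_n)$ (coefficients of the regression functions in the case of Longstaff & Schwartz, or
exercise limits in the case of Andersen). Those parameters depend on $(\lambda_1, \ldots, \lambda_p)$,
the parameters given by the market or the model.
The price of the option can be written as $V(a_1, \ldots, a_n, \lambda_1, \ldots, \lambda_p)$.
Hence, by the chain rule:
$$\frac{dV}{d\lambda_i} = \frac{\partial V}{\partial \lambda_i} + \sum_{j=1}^{n} \frac{\partial V}{\partial a_j} \frac{\partial a_j}{\partial \lambda_i}$$
In the present case, the coefficients $(a_1, \ldots, a_n)$ are supposed to be optimal, i.e. they maximize $V$.
The first-order condition of this maximization gives
$\frac{\partial V}{\partial a_j} = 0$ for any $j$, so the second term vanishes whatever the sensitivities $\frac{\partial a_j}{\partial \lambda_i}$ are.
So, as a conclusion:
$$\frac{dV}{d\lambda_i} = \frac{\partial V}{\partial \lambda_i}$$
This means that when Greeks are calculated, there is no need for the exercise boundary
to be recomputed.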
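The statement can be checked numerically on a toy example. The value function below is purely hypothetical (it is not an option price): it is concave in a single boundary parameter `a` and maximal at `a = lam**2`, which is enough to compare a finite-difference Greek computed with a frozen boundary against one where the boundary is re-optimized after each bump.

```python
def value(a, lam):
    """Toy value V(a, lambda), maximal in a at a = lambda**2 (hypothetical)."""
    return lam - (a - lam**2) ** 2

def optimal_boundary(lam):
    """Argmax over a of value(a, lam)."""
    return lam**2

def greek_frozen(lam, h=1e-4):
    a_star = optimal_boundary(lam)        # boundary computed once, then frozen
    return (value(a_star, lam + h) - value(a_star, lam - h)) / (2 * h)

def greek_recomputed(lam, h=1e-4):
    up = value(optimal_boundary(lam + h), lam + h)
    down = value(optimal_boundary(lam - h), lam - h)
    return (up - down) / (2 * h)
```

Both estimators agree up to terms of order `h**2`: the boundary moves when `lam` is bumped, but the value is flat in the boundary at the optimum, so that move has no first-order effect.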



52 Longstaff-Schwartz Algorithm
52.1 Introduction
The aim of this chapter is to explain clearly how to compute the
quantity
$$\sup_{\tau \in \mathcal{T}_{0;T}} E\big( f(\tau, S_\tau) \big).$$
For example, $f(t, S_t) = e^{-\int_0^t r_s\,ds}\, (S_t - K)^+$.

52.2 Setting the problem


Consider the filtered probability space $(\Omega, \mathcal{F}, P)$ where $\mathcal{F} = (\mathcal{F}_t)_{t \ge 0}$. We use the
following notations:
• $\mathcal{T}_{0;T}$ is the set of stopping times with real values in $[0; T]$.
• $\tau$ is a stopping time in $\mathcal{T}_{0;T}$ with respect to $\mathcal{F}$.
• $S$ is the asset price.
• $f$ is a smooth function.
First, we replace this problem by one that is easier to handle. As with any numerical
method, we first need to work on a discrete set of times covering our time period:
$\{t_0, \ldots, t_N\}$ with the convention $t_0 = 0$ and $t_N = T$. We then denote by $\mathcal{T}_{t_0;t_N}$ the set of
stopping times with values in $\{t_0, \ldots, t_N\}$ with respect to $(\mathcal{F}_{t_n})_{0 \le n \le N}$. The option becomes
a Bermudan option, and one now has to compute the quantity
$$\sup_{\tau \in \mathcal{T}_{t_0;t_N}} E\big( f(\tau, S_\tau) \big).$$

To make the problem easier to write, let's introduce some further notations:
• $u(t_N, S_{t_N}) = f(t_N, S_{t_N})$
• $u(t_n, S_{t_n}) = \max\big\{ f(t_n, S_{t_n}),\ E(u(t_{n+1}, S_{t_{n+1}}) \mid \mathcal{F}_{t_n}) \big\}$ for $n = 0, \ldots, N - 1$
• $\tau_0^* = \inf\big\{ t_n \mid u(t_n, S_{t_n}) = f(t_n, S_{t_n}) \big\}$

We make the hypothesis that $(f(t_n, S_{t_n}))_{0 \le n \le N}$ is a martingale w.r.t. the filtration
$(\mathcal{F}_{t_n})_{0 \le n \le N}$.

Our main aim is to simulate correctly the optimal time $\tau_0^*$.
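When the conditional expectation can be computed exactly, the recursion defining $u$ can be run directly. The sketch below does so on a recombining binomial tree, with $f(t, S) = e^{-rt}(K - S)^+$ (a put version of the example of section 52.1 with a constant rate). The tree, the constant rate and the function name are assumptions made for illustration only; they are not part of the method described in this chapter.

```python
import math

def bermudan_put_on_tree(s0, strike, r, sigma, maturity, n_steps):
    """Backward recursion u(t_n) = max(f(t_n), E[u(t_{n+1}) | F_tn]) on a binomial tree,
    with f(t, S) = exp(-r t) (K - S)^+ discounted back to t_0."""
    dt = maturity / n_steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    p = (math.exp(r * dt) - down) / (up - down)      # risk-neutral up probability

    def f(n, j):                                     # node (n, j): j up-moves out of n
        spot = s0 * up**j * down**(n - j)
        return math.exp(-r * n * dt) * max(strike - spot, 0.0)

    u = [f(n_steps, j) for j in range(n_steps + 1)]  # u(t_N, .) = f(t_N, .)
    for n in range(n_steps - 1, -1, -1):
        u = [max(f(n, j), p * u[j + 1] + (1.0 - p) * u[j]) for j in range(n + 1)]
    return u[0]
```

The optimal time $\tau_0^*$ corresponds to the first node along a path where the maximum is attained by `f(n, j)`. In a Monte Carlo setting the conditional expectation is not available in closed form, which is what the regression of the following sections addresses.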


52.3 Some Results


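We have the following theorem.

Theorem 1.
$$u(t_0, S_0) = \sup_{\tau \in \mathcal{T}_{t_0;t_N}} E\big( f(\tau, S_\tau) \big) = E\big( f(\tau_0^*, S_{\tau_0^*}) \big). \qquad (52.1)$$

If we now introduce $\tau_j^* = \inf\big\{ t_n \ge t_j \mid u(t_n, S_{t_n}) = f(t_n, S_{t_n}) \big\}$ with $j = 1, \ldots, N - 1$, we
get a similar result.

Corollary 1.
$$u(t_j, S_{t_j}) = \sup_{\tau \in \mathcal{T}_{t_j;t_N}} E\big( f(\tau, S_\tau) \mid \mathcal{F}_{t_j} \big) = E\big( f(\tau_j^*, S_{\tau_j^*}) \mid \mathcal{F}_{t_j} \big) \qquad (52.2)$$
where $\mathcal{T}_{t_j;t_N}$ is the set of stopping times with values in $\{t_j, \ldots, t_N\}$.

The following result is important (we will not prove it).

Theorem 2. $(u(t_n, S_{t_n}))_{0 \le n \le N}$ is the smallest super-martingale above $(f(t_n, S_{t_n}))_{0 \le n \le N}$.
That is,
$$E\big( u(t_{n+1}, S_{t_{n+1}}) \mid \mathcal{F}_{t_n} \big) \le u(t_n, S_{t_n}), \quad n \in \{0, \ldots, N - 1\},$$
and for $n \in \{0, \ldots, N\}$
$$u(t_n, S_{t_n}) \ge f(t_n, S_{t_n}). \qquad (52.3)$$
Moreover, if there exists another super-martingale above $(f(t_n, S_{t_n}))_{0 \le n \le N}$, then it is greater than or equal
to
$$(u(t_n, S_{t_n}))_{0 \le n \le N}. \qquad (52.4)$$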

52.4 Relations between Optimal Times


It is in fact possible to calculate the $(\tau_j^*)_j$ explicitly by backward induction:
$$\tau_j^* = t_j\, 1_{f(t_j, S_{t_j}) = u(t_j, S_{t_j})} + \tau_{j+1}^*\, 1_{f(t_j, S_{t_j}) < u(t_j, S_{t_j})} \qquad \text{(i)}$$

Let us explain this point: by definition of $\tau_j^*$, if $f(t_j, S_{t_j}) = u(t_j, S_{t_j})$, it is clear that
$\tau_j^* = t_j$. Suppose now that $f(t_j, S_{t_j}) < u(t_j, S_{t_j})$. Then $\tau_j^* > t_j$, so $\tau_j^* \ge t_{j+1}$, which
implies $\tau_j^* \ge \tau_{j+1}^*$ since
$$\tau_{j+1}^* = \inf\big\{ t_n \ge t_{j+1} \mid u(t_n, S_{t_n}) = f(t_n, S_{t_n}) \big\}.$$
As $\tau_j^* \le \tau_{j+1}^*$ by definition of the two stopping times, it follows that on the
event $\{f(t_j, S_{t_j}) < u(t_j, S_{t_j})\}$, $\tau_j^* = \tau_{j+1}^*$.

Let $U_j = u(t_j, S_{t_j})$, $Z_j = f(t_j, S_{t_j})$, for $j = 0, \ldots, N$. As $(U_n)_{0 \le n \le N}$ is a
super-martingale and $(Z_n)_{0 \le n \le N}$ is a martingale, we have for $n = 0, \ldots, N - 1$, $E(U_{n+1} \mid \mathcal{F}_{t_n}) \le U_n$
and $E(Z_{n+1} \mid \mathcal{F}_{t_n}) = Z_n$.

With the new notations, (i) becomes
$$\tau_j^* = t_j\, 1_{Z_j = U_j} + \tau_{j+1}^*\, 1_{Z_j < U_j} \qquad \text{(ii)}$$

We now show the following result:
$$Z_j < U_j \iff Z_j < E(U_{j+1} \mid \mathcal{F}_{t_j}) \qquad \text{(iii)}$$
Indeed, we just have to use the definition $U_j = \max\{Z_j, E(U_{j+1} \mid \mathcal{F}_{t_j})\}$. If $Z_j < U_j$,
we get $Z_j < E(U_{j+1} \mid \mathcal{F}_{t_j})$ and conversely, if $Z_j < E(U_{j+1} \mid \mathcal{F}_{t_j})$, we have $Z_j < U_j$.

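It is then clear that, thanks to (iii), we also have the result
$$Z_j = U_j \iff Z_j \ge E(U_{j+1} \mid \mathcal{F}_{t_j}) \qquad \text{(iv)}$$
(it is important to note that the case $Z_j > U_j$ is impossible for all $j = 0, \ldots, N$).
(ii) becomes
$$\tau_j^* = t_j\, 1_{Z_j \ge E(U_{j+1} \mid \mathcal{F}_{t_j})} + \tau_{j+1}^*\, 1_{Z_j < E(U_{j+1} \mid \mathcal{F}_{t_j})} \qquad \text{(v)}$$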

 
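Thanks to Corollary 1, $U_{j+1} = E\big( f(\tau_{j+1}^*, S_{\tau_{j+1}^*}) \mid \mathcal{F}_{t_{j+1}} \big)$; using the properties of a filtration,
in particular $\mathcal{F}_{t_j} \subset \mathcal{F}_{t_{j+1}}$, and the tower property of the conditional expectation with nested
filtrations, we get the result
$$E(U_{j+1} \mid \mathcal{F}_{t_j}) = E\Big( E\big( f(\tau_{j+1}^*, S_{\tau_{j+1}^*}) \mid \mathcal{F}_{t_{j+1}} \big) \,\Big|\, \mathcal{F}_{t_j} \Big) = E\big( f(\tau_{j+1}^*, S_{\tau_{j+1}^*}) \mid \mathcal{F}_{t_j} \big)$$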

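and (v) gives
$$\tau_j^* = t_j\, 1_{f(t_j, S_{t_j}) \ge E(f(\tau_{j+1}^*, S_{\tau_{j+1}^*}) \mid \mathcal{F}_{t_j})} + \tau_{j+1}^*\, 1_{f(t_j, S_{t_j}) < E(f(\tau_{j+1}^*, S_{\tau_{j+1}^*}) \mid \mathcal{F}_{t_j})} \qquad \text{(vi)}$$
that is
$$\tau_j^* = t_j\, 1_{Z_j \ge E(U_{j+1} \mid \mathcal{F}_{t_j})} + \tau_{j+1}^*\, 1_{Z_j < E(U_{j+1} \mid \mathcal{F}_{t_j})}$$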

Thus we see explicitly that the problem amounts to pricing a barrier option.


Finally, we get the following algorithm:


• Set $\tau_N^* = t_N = T$.
• Considering that $\tau_{j+1}^*$ has already been computed in previous steps, one sets
$$\tau_j^* = t_j\, 1_{Z_j \ge E(U_{j+1} \mid \mathcal{F}_{t_j})} + \tau_{j+1}^*\, 1_{Z_j < E(U_{j+1} \mid \mathcal{F}_{t_j})}$$
and we do this until $j$ reaches 0, i.e. until we obtain $\tau_0^*$. Note that if $\tau_{j+1}^*$ has been
computed, $U_{j+1}$ is easily obtained as we have the relation, thanks to the corollary
of Theorem 1, $U_{j+1} = f(\tau_{j+1}^*, S_{\tau_{j+1}^*})$.

We now need to be more precise about how to compute the $(\tau_j^*)_{0 \le j \le N-1}$. First
remark that, to compute $\tau_j^*$ knowing the next one $\tau_{j+1}^*$, it suffices to know how
to compute $E(U_{j+1} \mid \mathcal{F}_{t_j})$, that is $E\big( f(\tau_{j+1}^*, S_{\tau_{j+1}^*}) \mid \mathcal{F}_{t_j} \big)$. In order to go on, we
recall some definitions about Hilbert spaces and projections.

52.5 Hilbert spaces and projections


Recall that the conditional expectation is defined here in
$L^2(\Omega)$, i.e. in the space of all random variables $X$ such that $E(|X|^2) < \infty$. Here $|\cdot|$ denotes the
Euclidean norm if $X$ takes values in $\mathbb{R}^d$, i.e. $X = (X_1, \ldots, X_d)$ with $X_i$ taking values in $\mathbb{R}$
for $i = 1, \ldots, d$.
Recall also that $L^2(\Omega)$ is a Hilbert space with the scalar product
$$\langle \cdot, \cdot \rangle : L^2 \times L^2 \to \mathbb{R}, \quad (X, Y) \mapsto E(XY) \qquad \text{(vii)}$$
and that, for $j = 0, \ldots, N$, the subspaces of $\mathcal{F}_{t_j}$-measurable elements of $L^2(\Omega)$ (denoted simply $\mathcal{F}_{t_j}$ below) are Hilbert spaces with the same scalar product.

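Characterization of a Hilbert Space Consider a separable Hilbert space $H$ with the scalar
product $\langle \cdot, \cdot \rangle$. Then there exists a countable orthonormal basis of elements $(h_i)_{i \ge 1}$ of $H$ such that
- the closure of the linear span of $\{h_i : i \ge 1\}$ is equal to $H$, that is to say the linear combinations of the $h_i$ are dense in $H$ (this
is one way to characterize a separable Hilbert space).
- every element $g$ of $H$ can be written as follows
$$g = \sum_{i=1}^{\infty} \alpha_i h_i \quad \text{with} \quad \alpha_i = \langle g, h_i \rangle.$$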

Definition of the Projection on a Hilbert Space Suppose we have two Hilbert spaces $H$
and $G$ such that $G \subsetneq H$, with the same scalar product.
Suppose also that we want to compute the projection of an element $h$ of $H$ on $G$, that is to
say we want to find the element $g_h$ of $G$ such that $\|h - g_h\|^2 = \langle h - g_h, h - g_h \rangle$ is the
smallest possible, which can be written as $g_h = \arg\min_{g \in G} \|h - g\|^2$. $g_h$ is called the
projection of $h$ upon $G$ and is denoted by $p_G^{\perp}(h)$.

This amounts to searching for
$$(\tilde{\alpha}_i)_{i \ge 1} = \arg\min_{(\alpha_i)_{i \ge 1}} \Big\| h - \sum_{i \ge 1} \alpha_i \hat{g}_i \Big\|^2$$


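where $(\hat{g}_i)_{i \ge 1}$ is an orthonormal basis of $G$. Remember that for $i \ge 1$ we have $\tilde{\alpha}_i = \langle h, \hat{g}_i \rangle$, but these coefficients can
be difficult to calculate using this relation. The preceding procedure for finding the projection of
$h$ on $G$ is a least-squares minimization. Moreover, finding the $(\tilde{\alpha}_i)_{i \ge 1}$ amounts to finding the Hilbert
decomposition of $h$ on $G$, and we have
$$g_h = p_G^{\perp}(h) = \arg\min_{g \in G} \|h - g\|^2 = \sum_{i=1}^{\infty} \tilde{\alpha}_i \hat{g}_i, \quad \text{with} \quad (\tilde{\alpha}_i)_{i \ge 1} = \arg\min_{(\alpha_i)_{i \ge 1}} \Big\| h - \sum_{i \ge 1} \alpha_i \hat{g}_i \Big\|^2.$$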
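A finite-dimensional analogue may help fix ideas. In the sketch below the Hilbert space is $\mathbb{R}^3$ with the Euclidean scalar product (an assumption made for illustration, in place of $L^2(\Omega)$ with $E(XY)$), and $G$ is spanned by an orthonormal family. The helper names are hypothetical.

```python
def inner(x, y):
    """Euclidean scalar product, standing in for <X, Y> = E(XY)."""
    return sum(a * b for a, b in zip(x, y))

def project(h, orthonormal_basis):
    """Return the coefficients alpha_i = <h, g_i> and the projection sum_i alpha_i g_i."""
    alphas = [inner(h, g) for g in orthonormal_basis]
    proj = [sum(a * g[k] for a, g in zip(alphas, orthonormal_basis))
            for k in range(len(h))]
    return alphas, proj
```

For `h = [3.0, 4.0, 5.0]` and the basis `[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]`, the residual `h - proj` is orthogonal to every basis vector, which is the property that characterizes the least-squares minimizer.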

52.6 Solving the problem


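We now come back to our problem. We need to compute $E(u(t_{j+1}, S_{t_{j+1}}) \mid \mathcal{F}_{t_j})$, i.e. a
conditional expectation, which is defined as a projection on $\mathcal{F}_{t_j}$, which is a Hilbert space.
But $(S_t)_{t \ge 0}$ has the Markov property, in particular
$$\begin{aligned}
E(U_{j+1} \mid \mathcal{F}_{t_j}) &= E(u(t_{j+1}, S_{t_{j+1}}) \mid \mathcal{F}_{t_j}) \\
&= E(u(t_{j+1}, S_{t_{j+1}}) \mid S_{t_j}) \\
&= E(u(t_{j+1}, S_{t_{j+1}}) \mid \sigma(S_{t_j})) \\
&= \phi(S_{t_j}),
\end{aligned}$$
where $\phi$ is a Borel function and $\sigma(S_{t_j})$ is the $\sigma$-algebra generated by the random variable
$S_{t_j}$; the associated space of square-integrable $\sigma(S_{t_j})$-measurable variables is also a Hilbert space included in $L^2(\Omega)$. Let us remark that every
$\sigma(S_{t_j})$-measurable variable $Y$ in fact depends only on $S_{t_j}$, which can be seen from the relation $Y = \varphi(S_{t_j})$ where
$\varphi$ is a Borel function. The space of projection is here $\sigma(S_{t_j}) \subset L^2(\Omega)$.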
The computation of the conditional expectation is then done by least-squares minimization.
We now suppose that we have a truncated basis (as it is impossible to work with
infinitely many terms) of $\sigma(S_{t_j})$-measurable random variables $(g_k(S_{t_j}))_{1 \le k \le K}$ to compute the conditional
expectation. We denote by $K$ the number of terms kept when truncating the Hilbert decomposition of
$E(u(t_{j+1}, S_{t_{j+1}}) \mid \mathcal{F}_{t_j})$ and by $(\tilde{\alpha}_k^j)_{1 \le k \le K}$ the coefficients of this decomposition.
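The sketch below illustrates this on a case where the answer is known. It assumes NumPy and a monomial truncated basis $g_k(x) = x^{k-1}$; the data are synthetic, with `x` playing the role of $S_{t_j}$ and `y` the role of $u(t_{j+1}, S_{t_{j+1}})$, so that $E(y \mid x) = x^2$ by construction. The function name is hypothetical.

```python
import numpy as np

def regress_conditional_expectation(x, y, n_terms):
    """Least-squares coefficients of y on the truncated basis 1, x, ..., x^(n_terms - 1)."""
    basis = np.vander(x, n_terms, increasing=True)
    return np.linalg.lstsq(basis, y, rcond=None)[0]

rng = np.random.default_rng(0)
x = rng.standard_normal(50_000)
y = x**2 + rng.standard_normal(50_000)      # E(y | x) = x^2 by construction
alpha = regress_conditional_expectation(x, y, 3)
```

With $K = 3$ the fitted coefficients should be close to $(0, 0, 1)$, i.e. the regression recovers $\phi(x) = x^2$ up to sampling noise.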

52.7 The Algorithm


Now we focus on the Monte Carlo algorithm:

1. Simulate $M$ trajectories $(S^{(m)})_{1 \le m \le M}$ at the times
$$0 = t_0, t_1, \ldots, t_N = T.$$

2. Set $\tau_N^{(m)} = T$ and $U^{(m)}_{\tau_N^{(m)}} = f\big(\tau_N^{(m)}, S^{(m)}_{\tau_N^{(m)}}\big)$ for $m = 1, \ldots, M$.

3. We detail the first 5 sub-steps of the third step.

a) calculate
$$(\tilde{\alpha}_k^{N-1})_{1 \le k \le K} = \operatorname*{Arg\,min}_{\alpha} \left( \frac{1}{M} \sum_{m=1}^{M} \Big\{ U^{(m)}_{\tau_N^{(m)}} - \sum_{k=1}^{K} \alpha_k\, g_k\big(S^{(m)}_{t_{N-1}}\big) \Big\}^2 \right)$$
At this sub-step, $\big(U^{(m)}_{\tau_N^{(m)}}\big)_{1 \le m \le M}$ and $\big(g_k(S^{(m)}_{t_{N-1}})\big)_{1 \le m \le M}$ are all known, so we just
have to carry out the least-squares minimization.
Remark that even though we use the truncated sum
$$\sum_{k=1}^{K} \tilde{\alpha}_k^{N-1}\, g_k\big(S^{(m)}_{t_{N-1}}\big),$$
we are not far from
$$E\Big( U^{(m)}_{\tau_N^{(m)}} \,\Big|\, \sigma\big(S^{(m)}_{t_{N-1}}\big) \Big)$$
for $m = 1, \ldots, M$. Then
b) set
$$\tau_{N-1}^{(m)} = t_{N-1}\, 1_{f(t_{N-1}, S^{(m)}_{t_{N-1}}) \ge \sum_{k=1}^{K} \tilde{\alpha}_k^{N-1} g_k(S^{(m)}_{t_{N-1}})} + \tau_N^{(m)}\, 1_{f(t_{N-1}, S^{(m)}_{t_{N-1}}) < \sum_{k=1}^{K} \tilde{\alpha}_k^{N-1} g_k(S^{(m)}_{t_{N-1}})}$$

c) set
$$U^{(m)}_{\tau_{N-1}^{(m)}} = \max\Big\{ f\big(\tau_{N-1}^{(m)}, S^{(m)}_{\tau_{N-1}^{(m)}}\big),\ \sum_{k=1}^{K} \tilde{\alpha}_k^{N-1}\, g_k\big(S^{(m)}_{t_{N-1}}\big) \Big\}$$

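d) calculate
$$(\tilde{\alpha}_k^{N-2})_{1 \le k \le K'} = \operatorname*{Arg\,min}_{\alpha} \left( \frac{1}{M} \sum_{m=1}^{M} \Big\{ U^{(m)}_{\tau_{N-1}^{(m)}} - \sum_{k=1}^{K'} \alpha_k\, g'_k\big(S^{(m)}_{t_{N-2}}\big) \Big\}^2 \right)$$
where $(g'_k)_{1 \le k \le K'}$ is the truncated Hilbert basis of $\sigma(S_{t_{N-2}})$, as at this step we
need to compute
$$E\big( u(t_{N-1}, S_{t_{N-1}}) \mid \mathcal{F}_{t_{N-2}} \big) = E\big( f(\tau_{N-1}, S_{\tau_{N-1}}) \mid S_{t_{N-2}} \big)$$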

e) set
$$\tau_{N-2}^{(m)} = t_{N-2}\, 1_{f(t_{N-2}, S^{(m)}_{t_{N-2}}) \ge \sum_{k=1}^{K'} \tilde{\alpha}_k^{N-2} g'_k(S^{(m)}_{t_{N-2}})} + \tau_{N-1}^{(m)}\, 1_{f(t_{N-2}, S^{(m)}_{t_{N-2}}) < \sum_{k=1}^{K'} \tilde{\alpha}_k^{N-2} g'_k(S^{(m)}_{t_{N-2}})}$$
and so on.
4. The backward induction is repeated until the $\tau_0^{(m)}$'s have been found for $m =
1, \ldots, M$.
5. Finally, it suffices to make a simple Monte Carlo evaluation to find the price, i.e.
$$U_0 = u(t_0, S_{t_0}) \approx \frac{1}{M} \sum_{m=1}^{M} f\big(\tau_0^{(m)}, S^{(m)}_{\tau_0^{(m)}}\big).$$
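The whole algorithm can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not a reference implementation: the asset follows a risk-neutral geometric Brownian motion with a constant rate, $f(t, S) = e^{-rt}(K - S)^+$, the basis is made of monomials in $S/K$, and the realised cash flow $f(\tau_j, S_{\tau_j})$ is carried backward along each path (the relation $U_{j+1} = f(\tau_{j+1}^*, S_{\tau_{j+1}^*})$ of section 52.4) rather than the maximum of sub-step c). Restricting exercise to paths with a strictly positive payoff is an added safeguard that does not appear in the text. All names and default values are hypothetical.

```python
import numpy as np

def lsm_price(s0=100.0, strike=100.0, r=0.05, sigma=0.2, maturity=1.0,
              n_steps=10, n_paths=20000, degree=3, seed=0):
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    times = dt * np.arange(n_steps + 1)
    # Step 1: simulate the paths S^(m) on t_0, ..., t_N
    z = rng.standard_normal((n_paths, n_steps))
    log_s = np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1)
    s = s0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_s]))
    f = np.exp(-r * times) * np.maximum(strike - s, 0.0)   # f(t_n, S_tn), discounted to t_0
    # Step 2: tau_N = T and U = f(tau_N, S_tauN)
    u = f[:, -1].copy()
    # Step 3: backward induction with a least-squares regression at each date
    for j in range(n_steps - 1, 0, -1):
        basis = np.vander(s[:, j] / strike, degree + 1, increasing=True)
        alpha = np.linalg.lstsq(basis, u, rcond=None)[0]
        # safeguard (not in the text): never exercise a zero payoff
        exercise = (f[:, j] > 0.0) & (f[:, j] >= basis @ alpha)
        u = np.where(exercise, f[:, j], u)                  # f(tau_j, S_tau_j) per path
    # Steps 4-5: at t_0 the state is deterministic, so compare with the plain average
    return max(f[0, 0], u.mean())
```

With the default parameters the estimate should lie somewhat above the corresponding European put value, the difference being the early-exercise premium. Since the same paths are used to fit the regression and to price, the estimate carries some in-sample bias; a second, independent set of paths can be used for the final evaluation, as in Step 2 of section 51.2.3.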
