
The Rate of Convergence of the Binomial Tree Scheme

John B. Walsh1
Department of Mathematics, University of British Columbia, Vancouver B.C. V6T
1Y4, Canada
(e-mail: walsh@math.ubc.ca)

Abstract. We study the detailed convergence of the binomial tree scheme. It
is known that the scheme is first order. We find the exact constants, and show it
is possible to modify Richardson extrapolation to get a method of order three-
halves. We see that the delta, used in hedging, converges at the same rate. We
analyze this by first embedding the tree scheme in the Black-Scholes diffusion
model by means of Skorokhod embedding. We remark that this technique applies
to much more general cases.

Key words: Tree scheme, options, rate of convergence, Skorokhod embedding
Mathematics Subject Classification (1991): 91B24, 60G40, 60G44
JEL Classification: G13

1 Introduction
The binomial tree scheme was introduced by Cox, Ross, and Rubinstein [1] as a
simplification of the Black-Scholes model for valuing options, and it is a popular
and practical way to evaluate various contingent claims. Much of its usefulness
stems from the fact that it mimics the real-time development of the stock price,
making it easy to adapt it to the computation of American and other options.
From another point of view, however, it is simply a numerical method for solving
initial-value problems for a certain partial differential equation. As such, it is
known to be of first order [6], [7], [2], [3], at least for standard options. That is,
the error varies inversely with the number of time steps.
A key point in typical financial problems is that the data is not smooth. For
instance, if the stock value at term is x, the payoff for the European call option
is of the form f (x) = (x − K)+ , which has a discontinuous derivative. Others,
such as digital and barrier options, have discontinuous payoffs. This leads to
1 I would like to thank O. Walsh for suggesting this problem and for many helpful conversations.

an apparent irregularity of convergence. It is possible, for example, to halve
the step size and actually increase the error. This phenomenon comes from the
discontinuity in the derivative, and makes it quite delicate to apply things such
as Richardson extrapolation and other higher-order methods which depend on
the existence of higher order derivatives in the data.
The aim of this paper is to study the convergence closely. We will determine
the exact rate of convergence, and even find explicit expressions for the
constants in the leading error term.
Merely knowing the form of the error allows us to modify the Richardson
extrapolation method to get a scheme of order 3/2.
We will also see that the delta, which determines the hedging strategy, can
also be determined from the tree scheme, and converges at exactly the same
rate.
The argument is purely probabilistic. The Black-Scholes model treats the
stock price as a diffusion process, while the binomial scheme treats it as a Markov
chain. We use a procedure called Skorokhod embedding to embed the Markov
chain in the diffusion process. This allows a close comparison of the two, and
an accurate evaluation of the error. This was done in a slightly different way by
L.C.G. Rogers and E.J. Stapleton [9], who used it to speed up the convergence of
the binomial tree scheme.
This embedding lets us split the error into two relatively easily analyzed
parts, one which depends on the global behavior of the data, and the other
which depends on its local properties.
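The embedding idea can be illustrated numerically. The following sketch is our own construction (the function name and the Euler time step `dt` are ours, not the paper's): the log of the discounted stock price is simulated as a driftless Brownian motion, which corresponds to the Brownian measure introduced in Section 7 (under the martingale measure the embedded walk would be slightly asymmetric), and each stopping time is recorded the first time the path has moved a full space step h = sigma*sqrt(T/n) away from the previously embedded lattice point.

```python
import numpy as np

# Illustrative sketch of the Skorokhod embedding (our own construction and
# names, not code from the paper).  The log-price is simulated as a driftless
# Brownian motion sigma*W_t on a fine Euler grid; tau_k is the first time the
# path has moved a full space step h = sigma*sqrt(T/n) away from the last
# embedded lattice point, so the sampled values X_{tau_k} form a simple
# symmetric random walk on hZ: the (log) binomial tree.
def embed_binomial_walk(sigma=0.2, T=1.0, n=50, dt=1e-5, seed=1):
    rng = np.random.default_rng(seed)
    h = sigma * np.sqrt(T / n)
    x = 0.0           # current log-price
    last = 0.0        # last embedded lattice value X_{tau_k}
    t = 0.0
    taus, walk = [0.0], [0.0]
    while len(walk) <= n:
        x += sigma * np.sqrt(dt) * rng.standard_normal()
        t += dt
        if abs(x - last) >= h:
            last += h * np.sign(x - last)  # snap the small overshoot to the lattice
            x = last
            taus.append(t)
            walk.append(last)
    return np.array(taus), np.array(walk), h
```

Every increment of `walk` is exactly ±h, and the average spacing of the `taus` is close to T/n = h²/σ², in line with E{τ1} = T/n; the snap-to-lattice step is a discretization compromise of the Euler grid, not part of the exact construction.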

2 Embeddings
The stock price (St ) in the Black-Scholes model is a logarithmic Brownian mo-
tion, and their famous hedging argument tells us that in order to calculate
option prices, the discounted stock price S̃t := e−rt St should be a martingale.
This hedging argument does not depend on the fact that the stock price is a
logarithmic Brownian motion, but only on the fact that the market is complete:
the stock prices in other complete-market models should also be martingales,
at least for the purposes of pricing options. Even in incomplete markets, it is
common to use a martingale measure to calculate option prices, at least as a
first approximation.
It is a general fact [8] that any martingale can be embedded in a Brownian
motion with the same initial value by Skorokhod embedding, and a strictly
positive martingale can be embedded in a logarithmic Brownian motion. That
means that one can embed the discounted stock price from other single-stock
models in the discounted Black-Scholes stock price. Suppose for example, that
Yk , k = 0, 1, 2, . . . is the stock price in a discrete model, and that Y0 = S0 .
Under the martingale measure, the discounted stock price Ỹk := e−krδ Yk is a
martingale. Then there are (possibly randomized) stopping times 0 = τ0 < τ1 <
. . . for St such that the processes {Ỹk , k = 0, 1, 2, . . . } and {S̃τk , k = 0, 1, 2, . . . }
have exactly the same distribution. Thus the process (Ỹk) is embedded in S̃t: Ỹk
is just the process S̃t sampled at the discrete times τk. This is what we mean by
embedding. However, the times are random, not fixed. In cases such as the binomial
tree, we have a good hold on the embedding times τn and can use this to get quite
accurate estimates of the error.

We should note that it is the discounted stock prices which are embedded, not the
stock prices themselves, although there is a simple relation between the two.
Rogers and Stapleton [9] have suggested modifying the binomial tree slightly in
order to embed the stock prices directly. We note also that this embedding works
for a single-stock market, but not in general for a multi-stock market, unless the
stocks evolve independently. Although we will only embed discrete-parameter
martingales here, the theorem is quite general: it is used for the trinomial tree
in [10], and one can even embed continuous martingales, or nearly so, so this
could apply to models in which the stock price has discontinuities.

3 The Tree Scheme

Let r be the interest rate. We will consider the Cox-Ross-Rubinstein binomial
tree model for the discounted stock price, and we will assume the probability
measure is the martingale measure. Fix an integer n and let δ = T/n, and let the
stock price at time t = kδ be Yk. The discounted price is Ỹk := e^{−rkδ} Yk, and
we assume it takes values in the discrete set of values a^j, j ∈ Z, where a > 1 is
a real number. At each step, Yk can jump to one of two possible values: either
Y_{k+1} = aYk or Y_{k+1} = a^{−1}Yk. The martingale property assures us that

    P{Ỹ_{j+1} = aỸ_j | Ỹ_j} = 1/(a+1) =: q,   P{Ỹ_{j+1} = a^{−1}Ỹ_j | Ỹ_j} = a/(a+1) = 1 − q,

so that (Ỹk) is a martingale, and (Ỹk) is a Markov chain with these transition
probabilities.

Let f(x) be a positive function. Suppose there is a contingent claim, such as a
European option, which pays off an amount f(ST) at time T if the stock price at
time T is ST. If S0 = s0, then the value of the claim at time zero is

    V(s0, 0) ≡ e^{−rT} E{f(ST)}.

On the other hand, if T = nδ, the same contingent claim for the discrete model
pays f(Yn) at maturity and has a value at time zero of

    U(s0, 0) := e^{−rT} E{f(Yn)}.

But Yn = e^{rT} Ỹn has the same distribution as e^{rT} S̃_{τn}, while
ST = e^{rT} S̃T. Thus U(s0, 0) = e^{−rT} E{f(e^{rT} S̃_{τn})}, and the difference
between the two values is

    U(s0, 0) − V(s0, 0) = e^{−rT} E{f(e^{rT} S̃_{τn}) − f(e^{rT} S̃T)}.   (1)

This involves the same process at two different times, the fixed time T and the
random time τn.

Let u(j, k) = U(a^j, k), where U(Ỹk, k) is the value of the claim at the
intermediate time t = kδ:

    U(Ỹk, k) := e^{−r(T−kδ)} E{f(Yn) | Yk} = e^{−r(T−kδ)} E{f(e^{rT} Ỹn) | Yk}.   (2)

Then u(j, n) = f(e^{rT} a^j), j ∈ Z, and u is the solution of the difference
scheme

    u(j, k) = e^{−rδ} [ q u(j+1, k+1) + (1 − q) u(j−1, k+1) ],   j ∈ Z, k = 0, . . . , n−1.   (3)

Under its own martingale measure, the corresponding Black-Scholes model will have
a stock price given by

    St = S0 e^{σWt + (r − σ²/2)t},   t ≥ 0,   (4)

where Wt is a standard Brownian motion and σ > 0 is the volatility. The
discounted stock price is the martingale S̃t = S0 e^{σWt − σ²t/2}. In this model,
the above contingent claim pays f(ST) at time T; its value at time zero is
V(s0, 0), and its value at an intermediate time 0 < t < T is

    V(St, t) := e^{−r(T−t)} E{f(ST) | St}.

There is a relation between a, δ, and σ which connects these models:

    δ = T/n,   log a = σ √(T/n).   (5)

If we let n, j, and k tend to infinity in such a way that kT/n → t and
e^{jσ√(T/n) + krT/n} → x, then u(j, k) will converge to V(x, t). The question we
will answer is: "How fast?"

4 Results

We say that a function f is piecewise C^(k) if f, f′, . . . , f^(k) have at most
finitely many discontinuities and no oscillatory discontinuities. We will treat
the following class of possible payoff functions.

Definition 4.1 Let K be the class of real-valued functions f on R which satisfy
(i) f is piecewise C^(2);
(ii) at each x, f(x) = ½ (f(x+) + f(x−));
(iii) f, f′, and f″ are polynomially bounded: i.e. there exist K > 0 and p > 0
such that |f(x)| + |f′(x)| + |f″(x)| ≤ K(1 + |x|^p) for all x.

Let us introduce some notation which will be in force for the remainder of the
paper. Let n be the number of time steps in the discrete model, so that the
time-step is δ = T/n. The space step h is then h = σ √(T/n). Let f ∈ K and
consider a contingent claim which pays an amount f(s) at a fixed time T > 0 if
the stock price at time T is s.
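As a concrete check of the scheme just described, here is a minimal sketch in Python (our code; the function name `crr_price` is ours). It implements the backward induction (3) with the martingale up-probability q = 1/(a+1) and, for a European call, reproduces the Black-Scholes value up to the first-order error analyzed in this paper.

```python
import math

# Minimal sketch (our code) of the tree scheme of Section 3: the discounted
# price moves on the lattice {a^j} with up-probability q = 1/(a+1), chosen so
# that the discounted chain is a martingale, and u(j,k) is found by the
# backward induction u(j,k) = e^{-r delta}( q u(j+1,k+1) + (1-q) u(j-1,k+1) ).
def crr_price(payoff, s0, r, sigma, T, n):
    delta = T / n
    a = math.exp(sigma * math.sqrt(delta))   # log a = sigma * sqrt(T/n)
    q = 1.0 / (a + 1.0)                      # martingale up-probability
    disc = math.exp(-r * delta)
    # terminal layer: lattice exponents j = -n, -n+2, ..., n
    u = [payoff(s0 * math.exp(r * T) * a ** j) for j in range(-n, n + 1, 2)]
    for _ in range(n):
        u = [disc * (q * u[i + 1] + (1 - q) * u[i]) for i in range(len(u) - 1)]
    return u[0]
```

Running this for several even values of n exhibits the O(1/n) convergence, including the quasi-periodic fluctuation in the constant discussed in the remarks of Section 5.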

Let the initial price be S0 = s0. The value in the Black-Scholes model is given
by V(s0, 0), and its value in the binomial tree scheme is given by U(s0, 0), so
the error of the tree scheme is defined to be

    E^tot(f) := U(s0, 0) − V(s0, 0).   (6)

The error depends on the discontinuities of f and f′, and on the relation of
these discontinuities to the lattice points. If all discontinuities are on
lattice points, then E^tot(f) is O(n^{−1}). If f is discontinuous, and if the
discontinuity is not on a lattice point, then E^tot(f) = O(n^{−1/2}).

Let hZ be the set of all multiples of h, N_h^e := 2hZ the set of all even
multiples of h, and N_h^o := h + N_h^e the set of all odd multiples of h. For
any real s, let s̃ = se^{−rT}, and set

    ∆f(s) = f(s+) − f(s−),   ∆f′(s) = f′(s+) − f′(s−),
    θ(s) = frac( (log s) / 2h ),

where frac(x) is the fractional part of x. The density of
X_T := log(S̃T/s0) ≡ σW_T − σ²T/2 is

    p̂(x) := (1/√(2πσ²T)) e^{−(x + σ²T/2)² / 2σ²T}.

The main result is Theorem 4.2 below, but we will first give a simple and
easily-used special case. Suppose f is a European call or put of strike price K,
let K̃ = Ke^{−rT} be the discounted strike price, and let s0 be the initial
stock price. We get:

Theorem 4.1 Suppose f ∈ K and that n is an even integer. Then the error in the
tree scheme is

    E^tot(f) = (1/n) [ A + B θ(1 − θ) ] + O(n^{−3/2}),   (7)

where θ = θ(K̃/s0), A is a constant which depends on f, and

    B = 2σ²T K ∆f′(K) p̂(log K̃/s0).

This is a special case of Theorem 4.2 below, so there is no need for a separate
proof: we collect (10) and Propositions 9.5, 9.6 and 9.7, and use (46) to
express them in terms of f instead of g.

Theorem 4.2 Suppose that f ∈ K, and let s0 be the initial stock price. Let s1,
s2, . . . , sk be the set of discontinuity points of f and f′, and let n be an
even integer. Then the error in the tree scheme is given by (8) below.

    E^tot(f) = (e^{−rT}/n) [ ( 5/12 + σ²T/6 + σ⁴T²/192 ) E{f(ST)}
        − (1/6σ²T) E{ (log(S̃T/s0))² f(ST) }
        − (1/12σ⁴T²) E{ (log(S̃T/s0))⁴ f(ST) }
        + (2/3) σ²T E{ ST² f″(ST) }
        + σ²T Σ_i ( s_i ∆f′(s_i) − ½ ∆f(s_i) ) ( ⅓ + 2θ(s̃_i/s0)(1 − θ(s̃_i/s0)) ) p̂(log(s̃_i/s0))
        − ⅓ Σ_{i: log(s̃_i/s0) ∈ N_h^e} log(s̃_i/s0) ∆f(s_i) p̂(log(s̃_i/s0))
        + ⅙ Σ_{i: log(s̃_i/s0) ∈ N_h^o} log(s̃_i/s0) ∆f(s_i) p̂(log(s̃_i/s0)) ]
        + e^{−rT} (σ√T/√n) Σ_{i: log(s̃_i/s0) ∉ hZ} (2θ(s̃_i/s0) − 1) ∆f(s_i) p̂(log(s̃_i/s0))
        + O(n^{−3/2}),   (8)

where the expectations are taken with respect to the martingale measure.

Remark 4.3 We have expressed the errors in terms of E{f(ST)}, since this is
exactly what the binomial scheme computes. However, we can also express them in
terms of e^{rT} S̃_{τn}: indeed, the theorem tells us that the expectations of
f(ST) and f(e^{rT} S̃_{τn}) only differ by O(1/n), and they occur as
coefficients multiplying 1/n in (8), so one can replace ST by e^{rT} S̃_{τn} in
(8) and the result will only change by O(n^{−2}). So in fact ST and
e^{rT} S̃_{τn} are interchangeable in (8), and these formulas remain correct.
For the same reason, both are interchangeable with S_{τn}.

The delta, which determines the hedging strategy in the Black-Scholes model, can
also be estimated in the tree scheme, and its symmetric estimate (35) also
converges with order one. (See Section 10.) Let θ̌(s) = frac( (h + log s) / 2h ).

Corollary 4.4 Suppose that f is continuous and both f and f′ are in K. For a
call or put option with strike price K, there are constants A and B such that
the error at time 0 is of the form

    (1/n) [ A + B θ̌(K̃)(1 − θ̌(K̃)) ] + o(n^{−1}).   (9)

5 Remarks and Extensions

1. The random walk Ỹk is periodic: it alternates between even and odd lattice
points. This leads to a well-understood even/odd fluctuation in the tree scheme.
To avoid this, we work exclusively with even values of n. We could as well have
worked exclusively with odd values, and it might be better to do so.

2. The striking fact about the tree scheme's convergence is that, even when
restricted to even values of n, the error goes to zero at the general rate of
O(1/n), but "with a wobble:" there are constants c1 < c2 for which
c1/n < E^tot(f) < c2/n, and the error fluctuates quasi-periodically between
these bounds. F. Diener and M. Diener [2], [3] have investigated this wobble
from a quite different point of view, based on an asymptotic expansion of the
binomial coefficients derived by a modification of Laplace's method.

The reason for the wobble is clear from (8). For example, a typical European
call with strike price K pays off f(x) = (x − K)^+ and (8) simplifies: the last
three series vanish, and the first reduces to the single term

    σ²T K ( ⅓ + 2θ(1 − θ) ) p̂(log(s̃/s0)).

It is not the only error term, but it is important. The quantity to focus on is
θ. It is in effect the fractional distance (in log scale) from K̃ to the nearest
even lattice point. It can vary from 0 to 1, so this term can vary by a factor
of nearly three, and it is why there are cases where one can double the number
of steps and more than double the error at the same time. In log scale, the
lattice points are multiples of σ√(T/n), so the whole lattice changes as n
changes. This means that θ changes with n too.

3. It is usually the raw stock price, not the discounted price, which evolves on
the lattice. We have therefore used the discounted price for its convenience in
the embedding. However, our numerical studies have shown that the behavior of
the two schemes is virtually identical: to adapt Theorem 4.1 to the evolution of
the raw price, just replace the discounted strike price K̃ by the raw strike
price K in the definition of θ.

4. Merely knowing the form of the error shows that one can make a
Richardson-like extrapolation to increase the order of convergence. If we run
the tree for three values of n which give different values of θ, we can then
write down (7) for the three, solve for the coefficients A and B, and subtract
off the first order error terms, giving us potentially a scheme of order 3/2.
In fact, one could do this cheaply: use two runs at roughly the square root of
n, and then one at n. (We thank the referee for pointing this out.) This might
be of interest when using the scheme to value American options.

5. The coefficients in Theorem 4.2 are rather complex, and Theorem 4.1 is
handier for vanilla options. From a purely probabilistic point of view, Theorem
4.2 is a rate-of-convergence result for a central limit theorem for Bernoulli
random variables. If we take f to be the indicator function of (−∞, z], we
recover the Berry-Esseen bound.

6 Embedding the Markov Chain in the Diffusion

The argument in Section 2 showed that the tree scheme could be embedded in a
logarithmic Brownian motion, but didn't say how.
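The extrapolation suggested in the remarks can be sketched as follows (our code and synthetic numbers; in practice the three prices would come from actual tree runs). Three runs at even n with distinct values of θ give three instances of the first-order error model of (7), P(n) = P* + (A + B θ(1−θ))/n, a linear system in (P*, A, B); solving it removes the first-order term.

```python
import numpy as np

# Sketch (ours) of the modified Richardson extrapolation: three tree runs at
# even n with distinct theta give the linear system
#   P(n_i) = P* + (A + B*theta_i*(1-theta_i)) / n_i,  i = 1, 2, 3,
# in the unknowns (P*, A, B).
def richardson_fit(prices, thetas, ns):
    M = np.array([[1.0, 1.0 / n, t * (1.0 - t) / n] for n, t in zip(ns, thetas)])
    pstar, A, B = np.linalg.solve(M, np.asarray(prices, dtype=float))
    return pstar, A, B

# Synthetic check: prices generated exactly from the model are recovered.
true_p, true_A, true_B = 10.45, 2.0, -3.0
ns, thetas = [100, 200, 400], [0.10, 0.37, 0.80]
prices = [true_p + (true_A + true_B * t * (1 - t)) / n for n, t in zip(ns, thetas)]
```

With real tree prices, the neglected O(n^{−3/2}) term limits the accuracy of the fit, which is why the extrapolated scheme is of order 3/2 rather than 2; the cheap variant uses two runs near √n and one at n.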

In fact the embedding times can be defined explicitly. Define stopping times
τ0, τ1, τ2, . . . by induction: τ0 = 0, and

    τ_{k+1} = inf{ t > τk : S̃t = a S̃_{τk} or a^{−1} S̃_{τk} }.

As S̃t is a martingale, and S̃_{τ_{k+1}} can only equal a S̃_{τk} or
a^{−1} S̃_{τk}, we must have

    P{S̃_{τ_{k+1}} = a S̃_{τk} | S̃_{τk}} = 1/(a+1),   P{S̃_{τ_{k+1}} = a^{−1} S̃_{τk} | S̃_{τk}} = a/(a+1).

It follows that (S̃_{τk}) is a Markov chain with the same transition
probabilities as (Ỹk). Moreover, since S̃_{τ0} = Ỹ0 = 1, the two are identical
in law. It follows that the error in the binomial scheme (considered as an
approximation to the Black-Scholes model) is given by

    E^tot(f) := u(1, 0) − v(1, 0) = e^{−rT} E{f(e^{rT} S̃_{τn}) − f(e^{rT} S̃T)}.   (10)

Here is a quick heuristic argument to show that the convergence is first order.
Set s = τn − T and expand E{f(S_{T+s})} in a Taylor series: for some a1 and a2,

    E{f(S_{T+s})} = E{f(ST)} + a1 s + a2 s² + O(s³).

Now τn = τ1 + (τ2 − τ1) + · · · + (τn − τ_{n−1}) is a sum of i.i.d. random
variables (see Proposition 11.1), so that E{τn − T} = 0 and E{(τn − T)²} = c/n
for some constant c. Moreover, τn is independent of the sequence (S_{τj}); so
if we stretch things a bit and assume it is independent of (St), we would have

    E{f(S_{τn}) − f(ST)} ∼ E{a1 (τn − T) + a2 (τn − T)²} = a2 c/n,   (11)

implying that the error is indeed O(1/n).

This argument is not rigorous, since τn is a function of the process (St), so
it can't be independent of it. Nevertheless, the dependence is essentially a
local property, and we can isolate it by breaking the error into a global part,
on which this argument is rigorous, and a local part, which can be handled
directly.

One lesson to draw from the above is that it is important that E{τn} = T. If it
were not, the lowest order error term above would not drop out, and would in
fact make a contribution of O(1/√n).

7 Splitting the Error

We will make two simplifying transformations. First, we take logarithms of the
stock price: set

    Xt := log(S̃t/s0) = σWt − ½σ²t.

From the form of the times τj we see that (Xτj) is a random walk on hZ, the
integer multiples of h, while (Xt) is a Brownian motion with drift.

It alternates between even and odd multiples of h. As the notation suggests, we
will restrict ourselves to even values of j and n. Thus let n = 2m for some
integer m and define

    J = inf{ 2j : τ_{2j} > T }.

Then J is a stopping time for the sequence τ0, τ1, τ2, . . . , taking
even-integer values, and τJ is a stopping time for (Xt). Notice that τJ > T; we
expect that τJ ∼ T. The error splits into two parts:

    E^tot(f) = e^{−rT} E{f(S_{τn}) − f(S_{τJ})} + e^{−rT} E{f(S_{τJ}) − f(ST)}
             =: E^glob(f) + E^loc(f).

As the names suggest, E^glob(f) depends on global properties of f, such as its
integrability, while E^loc(f) depends on local properties, such as its
continuity and smoothness. Notice that these concern the signed error, not the
absolute error.

Next, we make a Girsanov transformation to remove the drift of Xt. Let ξ be the
maximum of T, τn, and τJ (the value of ξ is not important, so long as it is
larger than the values of t we work with) and set

    dQ = e^{½Xξ + ⅛σ²ξ} dP.   (12)

By Girsanov's Theorem [4], { (1/σ)Xt, 0 ≤ t ≤ ξ } is a standard Brownian motion
on (Ω, F, Q). Under the measure Q, Xt is a Brownian motion, and (Xτj) is a
simple symmetric random walk on hZ. We will call Q the Brownian measure to
distinguish it from the martingale measure P. We will do all our calculations
in terms of Q, and then translate the results back to P at the very end.

Define a function g by

    g(x) := f(s0 e^{x+rT}) e^{−x/2 − σ²T/8}.

In terms of Q, the error in (10) is

    E^tot(f) = e^{−rT} E^Q{ ( f(s0 e^{Xτn + rT}) − f(s0 e^{XT + rT}) ) e^{−½Xξ − ⅛σ²ξ} }.   (13)

Now e^{−½Xt − σ²t/8} is a Q-martingale, so as τn ≤ ξ,

    E^P{f(e^{rT} S̃_{τn})} = E^Q{ f(e^{rT} S̃_{τn}) e^{−½Xξ − σ²ξ/8} }
                          = E^Q{ f(e^{rT} S̃_{τn}) e^{−½X_{τn} − σ²τn/8} }
                          = E^Q{ g(X_{τn}) e^{−σ²(τn − T)/8} },

since e^{rT} S̃_{τn} = s0 e^{X_{τn} + rT}. Similarly, E^P{f(ST)} = E^Q{g(XT)}.

Thus

    E^tot(f) = e^{−rT} E^Q{g(X_{τn}) − g(XT)} + e^{−rT} E^Q{ g(X_{τn}) ( e^{−σ²(τn−T)/8} − 1 ) }.

Now τn − T = (τ1 − T/n) + (τ2 − τ1 − T/n) + · · · + (τn − τ_{n−1} − T/n), and
under Q the summands are i.i.d. and are independent of X_{τ1}, X_{τ2}, . . . ,
so the last term above is

    e^{−rT} E^Q{g(X_{τn})} E^Q{ e^{−σ²(τn−T)/8} − 1 }.

Now Xt is a Q-Brownian motion, and

    E^Q{ e^{−(σ²/8)(τ1 − T/n)} } = e^{σ²T/8n} / cosh( (σ/2)√(T/n) ),

where we have used Proposition 11.1. Expanding in powers of 1/n, the last
expectation is

    ( e^{σ²T/8n} / cosh( (σ/2)√(T/n) ) )^n − 1 = σ⁴T²/192n + O(1/n²).

Thus

    E^tot(f) = e^{−rT} E^Q{g(X_{τn}) − g(XT)} + e^{−rT} (σ⁴T²/192n) E^Q{g(X_{τn})} + O(1/n²)
             = e^{−rT} E^Q{g(X_{τn}) − g(X_{τJ})} + e^{−rT} E^Q{g(X_{τJ}) − g(XT)}
               + e^{−rT} (σ⁴T²/192n) E^Q{g(X_{τn})} + O(1/n²)
             =: Ê^glob(g) + Ê^loc(g) + e^{−rT} (σ⁴T²/192n) E^Q{g(X_{τn})} + O(1/n²),   (14)

which defines Ê^glob(g) and Ê^loc(g). The final term comes from the fact that
we defined g with a fixed time T instead of the random time ξ when we changed
the probability measure.

This splits the error into two parts. The global error Ê^glob(g) can be handled
with a suitable modification of the Taylor series argument of (11). The local
error Ê^loc(g) can be computed explicitly, and it is here that local properties
such as the continuity and differentiability of g come into play.

8 The Global Error

Let us first look at the global error in (14).

Theorem 8.1 Let g be measurable and exponentially bounded. Then

    Ê^glob(g) = (1/6n) [ (5/2) E^Q{g(X_{τn})} − (1/σ²T) E^Q{X²_{τn} g(X_{τn})}
                − (1/2σ⁴T²) E^Q{X⁴_{τn} g(X_{τn})} ] + O(n^{−3/2}).   (15)

Proof. Let Pn(x) be the transition probabilities of a simple symmetric random
walk on the integers, so that Pj(x) = P^Q{X_{τj} = hx}. Let us remark that J is
independent of (X_{τj}), so that

    P^Q{X_{τJ} = hx} = Σ_{k=−n}^{∞} P^Q{J − n = k} P_{n+k}(x),

and

    E^Q{g(X_{τn}) − g(X_{τJ})} = Σ_k Σ_x P^Q{J − n = k} ( Pn(x) − P_{n+k}(x) ) g(hx)
        = Σ_k Σ_x [ k/2n − (3k² + 4kx²)/8n² + 3k²x²/4n³ − k²x⁴/8n⁴ + Q3 ] Pn(x) g(hx),

by Proposition 11.4 of the Appendix, where Q3 is a sum of terms of effective
order at most −3/2. By Corollary 11.5, the contributions to this sum for
|x| > n^{3/5} and/or |k| > n^{3/5} go to zero faster than any power of n. Thus
we can restrict ourselves to the sum over the values max(|x|, |k|) ≤ n^{3/5},
in which case P_{n+k}(x) and Pn(x) are both defined. Now, for integers p, q,
and r,

    Σ_k Σ_x P^Q{J − n = k} Pn(x) (k^p x^q / n^r) g(xh)
        = n^{(p+q)/2 − r} E^Q{ ((J − n)/√n)^p } E^Q{ X^q_{τn} g(X_{τn}) } / (σ√T)^q.   (16)

The two expectations are bounded, so if p ≠ 1 this term has order
(p+q)/2 − r, which is the effective order of the term k^p x^q / n^r. By (16),
we identify the sum above as

    (1/n) [ ( ½ E^Q{J − n} − (3/8n) E^Q{(J − n)²} ) E^Q{g(X_{τn})}
        − (1/σ²T) ( ½ E^Q{J − n} − (3/4n) E^Q{(J − n)²} ) E^Q{X²_{τn} g(X_{τn})}
        − (1/8σ⁴T²n) E^Q{(J − n)²} E^Q{X⁴_{τn} g(X_{τn})} ] + O(n^{−3/2}).   (17)

Proposition 11.2 gives the values E^Q{J − n} = 4/3 + O(h) and
E^Q{(J − n)²} = 2n/3 + O(1). Substituting, we get (15).

9 The Local Error

The local error E^loc comes from the interval of time between T and τJ. This
interval is short, but it is where the local properties of the payoff function
f come in. We will express this in terms of g rather than f. Now g inherits the
continuity and differentiability properties of f, and the polynomial
boundedness of f translates into exponential boundedness of g (there exist
A > 0 and a > 0 such that |g(x)| ≤ Ae^{a|x|} for all x).

so E{g(XτJ ) | XT .translates into exponential boundedness of g: there exist A > 0 and a > 0 such that |g(x)| ≤ Aea|x| for all x. Thus we can calculate the 12 . But now. τL ≤ t ≤ T =⇒ |Xt − XτL | < h. and τL = τJ−1 < T < τJ . in which case XτL ∈ Nho . k ∈ N. (19) On the other hand. in which case XτL ∈ Nhe . Apply the Markov property at T . L is even } = Πo Πe g(XT ) . But if L is even. Note that in either case. L is even } = Πe g(XτJ−1 ) . Reverse Xt from time T : let X̂t = XT −t . Either L is an odd integer. so that τJ−1 is the first time after T when Xt ∈ Nho . then τJ is the first time after T that Xt hits Nhe . Define two operators. Moreover. Let us define def L = sup{j : Xτj < T } . and τJ is the first time after τJ−1 that Xt ∈ Nhe . Then L is even if and only if X̂t hits Nhe before it hits Nho . L is odd } = Πe g(XT ) . L = J − 2. using the known hitting probabilities of Brownian motion. τJ−1 is the first hit of Nho . and if this is so. 0 ≤ t ≤ T } is a Brownian bridge. so. then {X̂t . or L is an even integer. Xt is a Brownian motion from T on. τJ−1 coincides with a stopping time when L is even. and x 7→ Πe u(x) is linear in each interval [2kh. so we can apply the Markov property at τJ−1 to see E{g(XτJ ) | XτJ−1 . def def Let Nhe = 2hZ and Nho = h + Nhe be the sets of even and odd multiples of h respectively. L is even } = E{Πe g(XτJ−1 ) | XT . x ∈ R by: • Πe u(x) = u(x) if x ∈ Nhe . Note that L is even if and only if XτL ∈ Nhe . (18) so that τL is the last stopping time before T . if we condition on XT = x. E{g(XτJ ) | XT . and x 7→ Πo u(x) is linear in each interval [(2k − 1)h. (2k + 2)h]. and if L is odd. (2k + 1)h]. Let def q(x) = P {L is even | XT = x} . or equivalently on X̂0 = x. Πe and Πo on functions u(x). L = J − 1. Recall that J was the first even integer j such that τj > T . 0 ≤ t ≤ T . and X̂t − X̂0 is a Brownian motion. k ∈ N. Thus Πe u and Πo u are linear interpolations of u in between the even (respec- tively odd) multiples of h. 
There are two cases. τL = τJ−2 < T < τJ−1 . Xt does not def hit Nho between τL and T . if L is even. XτL ∈ Nhe . • Πo u(x) = u(x) if x ∈ Nho .

exact probability that it hits Nhe before Nho . the probability of hitting Nhe before Nho is not much influenced by the drift. Then ξ is piecewise linear with vertices on Nhe . we see this is = E{Πo Πe g(XT )q(XT )} + E{Πe g(XT )(1 − q(XT ))}   = E Πo Πe g(XT ) − Πe g(XT ) q(XT ) + E{Πe g(XT )} . L odd }P {L odd | XT } . The first integral equals the first expectation on the right hand side of Proposition 9. By the definition of q and the above relations. (20) h Proposition 9. L odd }.1. Thus n E{g(XτJ )} = E E{g(XτJ ) | XT . if X̂0 = x ∈ (2kh. q(x) = P {X̂t reaches 2kh before (2k + 1)h} ∼ (2k+1)h−x h = dist(x. Nho ) + O(h) . Note that {L odd } ∈ F T . x 2 def Definition 9. let ξ(x) = Πe g(x). Nho )/h. The second expectation can be written Z ∞ (Πo Πe g(x) − Πe g(x))q(x)p(x) dx . Proof. L even } + E{g(XτJ ). Thus 1 q(x) = dist(x. we can just note that if h is small. Λ) is the distance from x to the set Λ. (21) −∞ 3 k Proof. t ≥ 0} given XT . (2k + 1)h). and so it is approximately that of unconditional Brownian motion.1 E{g(XτJ )−g(XT )} = E{Πe g(XT )−g(XT )}+E{(Πo Πe g(XT )−Πe g(XT ))q(XT )}. where dist(x. (22) −∞ To simplify the second term. Corollary 9. so we can write it in the form X1 ξ(x) = ax + b + ∆k |x − 2kh| 2 k 13 . More simply. Then ∞ h2 X Z E loc (g) = (Πe g(x) − g(x))p(x) dx + (∆k ) p(2kh) + O(h3 ) . which proves the proposition. Thus. Let us write E{g(XτJ )} = E{g(XτJ ).2 Let ∆k = (Πe g)0 (2kh+) − (Πe g)0 (2kh−) be the jump in the derivative of Πe g at the point 2kh. L even }P {L even | XT } o + E{g(XτJ ) | XT .1 Let p(x) = √ 1 2πσ2 T e− 2σ2 T be the density of XT under the Brownian measure. so it is conditionally independent of {XT +t .

Then it is easy to see that g3 is continuous. Write p(x) = p(2kh) + O(h) there. By the definition of K. we see this is Z ∞ 1X = ∆k (Πo |x − 2kh| − |x − 2kh|)q(x)p(x) dx . Define the modified Heaviside function H̃(x) by   1 if x > 0 1 H̃(x) = if x = 0  2 0 if x < 0 Lemma 9. Since Πo is a linear operator and Πo (ax + b) ≡ ax + b. and g3 ∈ C (1) with g300 piecewise continuous.) However. 14 . Then ∞ (2k+1)h 1 Z Z (Πo |x−2kh|−|x−2kh|)q(x)p(x) dx = (h−|x−2kh|)2 (p(2kh)+O(h)) dx −∞ h (2k−1)h 2 2 = h p(2kh) + O(h3 ) . then g − g1 is continuous. and g3 . (24) y y 2 Proof. Remark 9. it may still have a finite number of discontinuities in its def derivative.4 The local error is not hard to calculate. g2 . but it will have to be handled separately for each of the functions g1 . We remove these by subtracting g2 : g3 = g − g1 − g2 . the corollary follows. g2 is continuous and piecewise linear. 2kh) and (2kh. We can decompose g as follows. Moreover. (2k + 1)h].3 Let g be piecewise C (2) . It is easy to check that if we define g1 by (24). where g1 is a step function with at most finitely many discontinuities. The sums in (24) are finite. ∆k = 2. Then we can write g = g1 + g2 + g3 . (23) 3 If we remember that if g = |x − 2kh|. so that Πo |x − 2kh| = |x − 2kh| except on the interval [(2k − 1)h. we have X X1 g1 (x) = ∆g(y)H̃(x − y) . 2 −∞ k Now |x − 2kh| is linear on both (−∞. and that g300 is piecewise continuous.for some a and b. g(x) = 12 (g(x+) + g(x−)) at any discontinuity of g. has a continuous first derivative. (This is the reason we modified the Heaviside function. for q(x) is approximately 1/h times the distance to the nearest odd multiple of h. Πo |x − 2kh| ≡ h and q(x) = (h − |x − 2kh|)/h. ∞). On that interval. g2 (x) = (∆g 0 )(y)|x − y| .

Πe g(x) − 0 g(x) = 12 g 00 (yk )(h2 − (x − yk )2 ) + o(h2 ). Summing. 3 def where ∆k is the discontinuity of the derivative of Πe g at xk = 2kh: ∆k = (g(xk+1 ) − 2g(xk ) + g(xk−1 ))/2h Z 2h = (1/2h) (g 0 (xk + x) − g 0 (xk − x)) dx 0 = 2g 00 (xk )(h + o(h)) .2 The Piecewise Linear Case For x ∈ R. Then x = 2kh + 2θ̂(x)h for some integer k. 2 P h2 This gives h3 00 g 00 (x)p(x) dx + o(h2 ). but as it multiplies the coefficent of h2 . Let ∆g 0 (x) = g 0 (x+) − g 0 (x−). (Indeed. (One has to R be slightly careful here: the o(h2 ) term is uniform.9. Ik (Πe g(x)−g(x))p(x) dx = 32 g 00 (yk )p(yk )(h3 +o(h3 )). the error is o(h2 ) in any case. 15 . (26) 3 2 P The second contribution to the error in (21) is h3 k ∆k p(2kh) + O(h ). We will calculate the right hand side of (21). Let Ik be R ∞the interval [2kh. Πe g = g if g is a linear function of x. −∞ 3 k This is a Riemann sum for the integral (h2 /3) g 00 (x)p(x) dx. Proof. Then 2h2 ∞ 00 Z Ê loc (g) = g (x)p(x) dx + o(h2 ).1 The Smooth Case Proposition 9. (25) 3 −∞ If g ∈ C (4) and g 000 and g (iv) are exponentially bounded. (2k + 2)h] and let yk = (2k + 1)h be its midpoint. Notice that on Ik . we get Z ∞ X1 (Πe g(x) − g(x))p(x) dx = g 00 (yk )p(yk )(h2 + o(h2 ))2h .5 Suppose g is in C (2) and that g and its first two derivitaves are exponentially bounded. If g ∈ C (4) . The rest follows since x 7→ Πe g(x) is R linear on Ik and equals g at the endpoints. There is an o(1) error in approximating the integral by the sum. Write −∞ (Πe g(x) − P R g(x))p(x) dx = k Ik (Πe g(x) − g(x))p(x) dx. so that the first order terms drop out. while g 7→ Πe g is linear. it is O(h4 ). so it doesn’t cause trouble in the improper integral.) Thus h2 Z = g 00 (x)p(x) dx + o(h2 ) . 9. the error is O(h4 ). Add this R k g (xk )p(xk )(2h + o(h)) = 3 to (26) to finish the proof. Expand g around yk : g(x) = g(yk ) + g (yk )(x − yk ) + 12 g 00 (yk )(x − yk )2 + o(h2 ).) Write p(x) = p(yk )+O(h) on Ik . 
let θ̂(x) = frac(x/2h) be the fractional part of x/2h.

x 7→ g(x) is linear on the semi-infinite intervals on both sides of Ik . Let Ik be the interval [2kh.Proposition 9. y z so the final term is just 2h2 /3. If y ∈ Ik . Adding this to (28). we compute the integrals in (21). 9. Then the first term in (21) is Z ∞ Z (Πe g(x) − g(x))p(x) dx = (p(y) + O(h)) (Πe g(x) − g(x)) dx −∞ Ik = 4h2 θ̂(y)(1 − θ̂(y))p(y) + O(h3 ) . Πe g(x) = 1 if x > 2(k + 2)h.3 The Step Function Case Proposition 9. and Πe g is linear in Ik . Now Πe is a linear operator. we get (27). Write g(x) = ax + b + 12 y ∆g 0 (y)|x − y|. Write p(x) = p(y) + O(h) on Ik and note that the only contribution to the integral comes from Ik : 16 . By Lemma 9. so Πe g(x) = g(x) except on Ik . notice that as Πe g = g outside of Ik . we note that Πe g(x) = 0 if x < 2kh. Let Ik = [2kh. If y ∈ Ik and 0 < θ̂(y) < 1. Then X h2 X Ê loc (g) = h (2θ̂(y) − 1)∆g(y)p(y) − y ∆g(y) p(y) 3σ 2 T h y ∈hZ / y∈Ne h2 X + y ∆g(y) p(y) . (28) Turning to the final term in (21). Once again. Let us compute the two terms in (21). and Πe f = f if f is affine.6 Suppose g is continuous and piecewise linear. it is enough to consider the case where g(x) = H̃(x − y). we can write Πe g − g explicitly. Let p(x) = p(y) + O(h) for x ∈ Ik . Now x 7→ Πe g(x) is linear in Ik and equals g(x) at the endpoints. which we can do by Lemma P 9.3 we can write g(x) = y ∆g(y)H̃(x − y). (2k + 2)h].7 Suppose that g is a step function. so it is enough to prove this for g(x) = |x − y| for some fixed y. (29) 6σ 2 T h y∈No P Proof. As g(2kh) = 2θ̂(y)h and g((2k + 2)h) = 2(1 − θ̂(y))h. (2k + 2)h]. (27) y 3 Proof. By linearity. Then X 1  Ê loc (g) = h2 ∆g 0 (y) + 2θ̂(y)(1 − θ̂(y)) p(y) + O(h3 ) .3. X X ∆(Πe g)0 (z)0 (y) = ∆g 0 (z) = ∆g 0 (y) = 2 .

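The coefficient 4h²θ̂(1 − θ̂) in (28) can be verified directly, since for g(x) = |x − y| the quantity ∫_{I_k}(Π_e g − g)dx is just the area between the chord of g over I_k and g itself. A short check with illustrative values:

```python
# Check of the integral behind (28): for g(x) = |x - y| with
# y = 2kh + 2*theta*h inside I_k = [2kh, (2k+2)h], the chord of g over I_k
# overshoots g by exactly
#   integral over I_k of (Pi_e g - g) dx = 4 * h^2 * theta * (1 - theta).
# The values of h, k, theta below are arbitrary illustrations.

h, k, theta = 0.1, 3, 0.3
y = 2 * k * h + 2 * theta * h
a, b = 2 * k * h, 2 * (k + 1) * h
ga, gb = abs(a - y), abs(b - y)          # g at the endpoints of I_k

m = 100000
dx = (b - a) / m
num = 0.0
for i in range(m):                        # midpoint rule
    x = a + (i + 0.5) * dx
    chord = ga + (gb - ga) * (x - a) / (b - a)
    num += (chord - abs(x - y)) * dx

exact = 4 * h * h * theta * (1 - theta)
print(num, exact)
```

Adding the (2h²/3)p(y) contribution from the final term of (21) then reproduces the coefficient 1/3 + 2θ̂(1 − θ̂) in (27), since ∆g'(y) = 2 here.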
Proof. By Lemma 9.3 we can write g(x) = Σ_y ∆g(y) H̃(x − y), so by linearity it is enough to consider the case where g(x) = H̃(x − y). Let I_k = [2kh, (2k+2)h] be the interval containing y; once again, we compute the integrals in (21). Suppose first that 0 < θ̂(y) < 1. Then Π_e g(x) = 0 if x ≤ 2kh, Π_e g(x) = 1 if x ≥ (2k+2)h, and Π_e g is linear in I_k. Write p(x) = p(y) + O(h) for x ∈ I_k and note that the only contribution to the integral comes from I_k:

  ∫_{-∞}^{∞} (Π_e g(x) − g(x)) p(x) dx = [∫_{2kh}^{(2k+2)h} ((x − 2kh)/2h) dx − ∫_{2(k+θ̂(y))h}^{(2k+2)h} dx] (p(y) + O(h)) = (2θ̂(y) − 1) p(y) h + O(h²).    (30)

The second term in (21) is easily handled: since H̃'(x − y) = 0 for all x ≠ y, Σ_y ∆(H̃')(x − y) = 0, and by Lemma 9.3 the remaining integral is

  ∫_{-∞}^{∞} (Π_o Π_e g(x) − Π_e g(x)) p(x) dx = O(h³).

The cases θ̂(y) = ½ and θ̂(y) = 0 are special; in both we need to expand p up to a linear term, since the constant term cancels out. So write p(x) = p(y) + p'(y)(x − y) + O(h²). If θ̂(y) = ½, which means y = (2k+1)h, the first error term will be

  ∫_{-∞}^{∞} (Π_e g(x) − g(x)) p(x) dx = p'(y) [∫_{2kh}^{(2k+2)h} ((x − 2kh)²/2h) dx − ∫_{2(k+θ̂(y))h}^{(2k+2)h} (x − 2kh) dx] = −(1/6) h² p'(y) = (h² y/6σ²T) p(y),    (31)

where we have used the fact that p'(x) = −(x/σ²T) p(x). If θ̂(y) = 0, then g = H̃(x − 2kh), so g(2kh) = ½, and Π_e g is zero for x ≤ (2k−2)h and one for x ≥ (2k+2)h; thus Π_e g(x) = (x − (2k−2)h)/4h if (2k−2)h < x < (2k+2)h, so that

  ∫_{-∞}^{∞} (Π_e g(x) − g(x)) p(x) dx = p(2kh) [∫_{(2k−2)h}^{(2k+2)h} ((x − (2k−2)h)/4h) dx − ∫_{2kh}^{(2k+2)h} dx] + p'(2kh) [∫_{(2k−2)h}^{(2k+2)h} ((x − (2k−2)h)/4h)(x − 2kh) dx − ∫_{2kh}^{(2k+2)h} (x − 2kh) dx].    (32)

The first term in square brackets vanishes; evaluating the second and taking the remaining terms of (21) for this case into account, this is

  (1/3) p'(2kh) h² + O(h³) = −(2kh/3σ²T) p(2kh) h² + O(h³).    (33)

The proposition follows upon adding (30), (31), and (33).

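The generic-case computation (30) is also easy to confirm: the interpolant of a step ramps linearly from 0 to 1 across I_k, and the signed area between ramp and step is (2θ̂ − 1)h. A quick sketch with arbitrary illustrative values:

```python
# Check of the leading term in (30): for the step function g = H(x - y) with
# y = 2kh + 2*theta*h, 0 < theta < 1, Pi_e g ramps linearly from 0 to 1
# across I_k = [2kh, (2k+2)h], and
#   integral over I_k of (Pi_e g - g) dx = h - 2(1 - theta)h = (2*theta - 1)h.
# h, k, theta below are arbitrary illustrations.

h, k, theta = 0.1, 2, 0.3
y = 2 * k * h + 2 * theta * h
a, b = 2 * k * h, 2 * (k + 1) * h

m = 200000
dx = (b - a) / m
num = 0.0
for i in range(m):                      # midpoint rule
    x = a + (i + 0.5) * dx
    ramp = (x - a) / (b - a)            # Pi_e g on I_k
    num += (ramp - (1.0 if x > y else 0.0)) * dx
print(num, (2 * theta - 1) * h)
```

Note the leading term here is of order h, not h²: this is the source of the slower first-order behaviour of step-function (digital) payoffs at non-lattice discontinuities.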
10 Convergence of the Delta: Proof of Corollary 4.4

The price of our derivative at time t < T is

  V(s, t) = e^{−r(T−t)} E{f(S_T) | S_t = s}.    (34)

The hedging strategy depends on the space derivative ∂V/∂s, which is called the delta. It is of interest to know how well the tree scheme estimates this. If t = kδ and s = e^{jh+rt}, we approximate ∂V/∂s by the symmetric discrete derivative

  (u(j+1, k) − u(j−1, k)) / (s(e^h − e^{−h})),    (35)

where u is the solution of the tree scheme (3).

Remark 10.1 Estimating the delta is essentially equivalent to running the scheme on f', not f. However, if f' is discontinuous (as it is for a call or a put) and the discontinuity falls on a non-lattice point, Theorem 4.2 would give order 1/2, not order 1, which does not imply Corollary 4.4. Thus there is something to prove. In fact the proof depends on some uniform bounds which come from Theorem 4.2, and on the fact that we use the symmetric estimate of the derivative.

Proof. By the Markov property, it is enough to prove the result for t = 0 and S_0 = 1. We will also assume that r = 0 to simplify notation. The key remark is that if S_t is a logarithmic Brownian motion from s, then e^h S_t and e^{−h} S_t are logarithmic Brownian motions from e^h s and e^{−h} s respectively, so that

  ∂V/∂s (s, t) = e^{−r(T−t)} E{(S_T/s) f'(S_T) | S_t = s}.

From (34),

  ∂V/∂s (1, 0) = lim_{h→0} (V(e^h, 0) − V(e^{−h}, 0)) / (e^h − e^{−h}) = lim_{h→0} E{f̂(S_T, h)},

where

  f̂(s, h) = (f(e^h s) − f(e^{−h} s)) / (e^h − e^{−h}),

so that ∂V/∂s (1, 0) = E{S_T f'(S_T)}. Now f' ∈ K, so that f and its first three derivatives are polynomially bounded; hence there is a polynomial Q(s) which bounds f̂, ∂f̂/∂s and ∂²f̂/∂s², uniformly for h < 1. This will justify passages to the limit. Define

  E(h) = E{f̂(S_{τ_n}, h) − S_T f'(S_T)} = E{f̂(S_{τ_n}, h) − f̂(S_T, h)} + E{f̂(S_T, h) − S_T f'(S_T)} = E¹(h) + E²(h).

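To make the object of study concrete, here is a sketch (not the paper's code, with made-up parameters) of the symmetric delta estimate (35) for a call with r = 0, obtained by running the tree from e^{±h}s, compared with the exact Black–Scholes delta N(d₁):

```python
import math

# Sketch of the symmetric delta estimate (35) for a European call, r = 0.
# Prices from e^h * s and e^{-h} * s are each computed by the binomial scheme
# with up/down factors e^{+-h}; parameters are illustrative only.

def call_tree_delta(s, K, sigma, T, n):
    h = sigma * math.sqrt(T / n)
    u, d = math.exp(h), math.exp(-h)
    q = (1 - d) / (u - d)              # martingale probability when r = 0

    def price(s0):
        # backward induction on an n-step tree started at s0
        vals = [max(s0 * u ** j * d ** (n - j) - K, 0.0) for j in range(n + 1)]
        for step in range(n, 0, -1):
            vals = [q * vals[j + 1] + (1 - q) * vals[j] for j in range(step)]
        return vals[0]

    return (price(s * u) - price(s * d)) / (s * (u - d))

def bs_delta(s, K, sigma, T):
    # exact delta with r = 0: N(d1), d1 = (log(s/K) + sigma^2 T/2)/(sigma sqrt T)
    d1 = (math.log(s / K) + sigma * sigma * T / 2) / (sigma * math.sqrt(T))
    return 0.5 * (1 + math.erf(d1 / math.sqrt(2)))

approx = call_tree_delta(100.0, 100.0, 0.3, 1.0, 400)
exact = bs_delta(100.0, 100.0, 0.3, 1.0)
print(approx, exact)
```

With the strike on a lattice point, as here, the estimate is already within a few parts in ten thousand of the true delta at n = 400, in line with the first-order convergence proved below.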
Now f̂(·, h) is itself a payoff function, so E¹(h) is the error for the payoff function f̂(·, h) instead of f, and Theorem 4.2 applies. Note that f̂'(·, h) will have discontinuities of approximately 1/2h and −1/2h at s = Ke^{−h} and s = Ke^{h} respectively. Note also that θ(log s) (see Section 3) is periodic with period 2h, so that θ(e^h K) = θ(e^{−h} K) = θ̌, where θ̌ corresponds to θ̂, but for the odd instead of the even multiples of h. By the uniform polynomial bound on f̂ and the related functions, the coefficients in Theorem 4.2 are uniformly bounded in h for h < 1; in any case, we can conclude that there is a constant A such that |E¹(h)| ≤ Ah² for small h. To prove (9), let f = (s − K)⁺ and evaluate E¹(h) by Theorem 4.2: we can write (8) in the form

  E¹(h) = h² [C + D θ̌(1 − θ̌)] (p̂(log K − h) − p̂(log K + h))/2h.

As h → 0 the ratio converges to −p̂'(log K), and (9) follows.

The bound on E²(h) is straight analysis. We can write

  E²(h) = ∫_{-∞}^{∞} (f̂(s, h) − s f'(s)) p(s) ds,

where p is the density of S_T. Now

  f̂(s, h) − s f'(s) = (1/(e^h − e^{−h})) ∫_{se^{−h}}^{se^{h}} (f'(u) − f'(s)) du.

If f ∈ C^(2) on the interval (se^{−h}, se^{h}), expand f' to first order in a Taylor series and integrate to see that this is of the order s²h²f''(s) + o(h²); in any case, if |f''| ≤ C on the interval, it is bounded by Cs²h. There are only finitely many points where f ∉ C^(3); as the integral near each extends over an interval of length O(sh), each contributes at most Cs²h² to E²(h), and we see that |E²(h)| ≤ Bh² for some other constant B. This completes the proof.

11 Appendix

11.1 Moments of τ_n and J

The (very complicated!) coefficients in Theorem 4.2 come from moments of τ_n and J. We will derive them here. We will assume that P is the Brownian measure, i.e. that X_t is a Brownian motion; thus we will not write E^Q and P^Q to indicate that we are using the Brownian measure. We can write X_t = σW_t, where {W_t, t ≥ 0} is a standard Brownian motion.

Proposition 11.1 (i) τ₁ has the same distribution as (T/n)ν, where ν = inf{t > 0 : |W_t| = 1}, so it has the moment generating function

  F₁(λ) = E{e^{λτ₁}} = [cos(√(2λT/n))]^{−1},  −∞ < λ < nπ²/8T.    (36)

(ii) τ₁, τ₂ − τ₁, τ₃ − τ₂, ... are i.i.d.

(iii) E{τ₁} = T/n and var{τ₁} = 2T²/(3n²), so that E{τ_n} = T and var{τ_n} = 2T²/(3n).

(iv) For each k ≥ 1 there are constants c_k > 0, C_k > 0 such that

  E{τ₁^k} = c_k T^k/n^k,  E{|τ_n − T|^k} ≤ C_k T^k/n^{k/2}.

Proof. (i) follows by Brownian scaling; the moment generating function is well-known for λ < 0 [4], and since τ₁ has finite exponential moments, it is not difficult to extend it to λ > 0. Then (ii) is well-known, see e.g. [9], and (iii) is an easy consequence of (i). The kth moment of τ₁ is determined by Brownian scaling: c_k = E{ν^k}, and the moment generating function shows that the moments in question are finite. To get the kth central moment, note that by (ii) we can write τ_n − T as a sum of n i.i.d. copies of τ₁ − T/n, say τ_n − T = η₁ + ⋯ + η_n. The η_j have mean zero, so by Burkholder's and Hölder's inequalities, in that order,

  E{|τ_n − T|^k} ≤ C_k E{(Σ_{j=1}^{n} η_j²)^{k/2}} ≤ C_k n^{k/2} E{|η₁|^k}.

But E{|η₁|^k} = E{|τ₁ − T/n|^k} ≤ C'_k T^k/n^k, which implies (iv).

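The moment generating function (36) can be checked against the moments in (iii). A sketch using finite differences at λ = 0, in the normalized case T/n = 1 (i.e. for ν itself), which should reproduce E{ν} = 1 and E{ν²} = 5/3, hence var{ν} = 2/3:

```python
import cmath

# Finite-difference check that F(lam) = 1/cos(sqrt(2*lam)), the mgf of
# nu = inf{t : |W_t| = 1}, gives E{nu} = 1 and E{nu^2} = 5/3, hence
# var{nu} = 2/3 -- consistent with var{tau_1} = (2/3)(T/n)^2 in (iii).

def F(lam):
    # cmath handles lam < 0, where cos(sqrt(2*lam)) becomes cosh(sqrt(-2*lam))
    return (1 / cmath.cos(cmath.sqrt(complex(2 * lam, 0)))).real

eps = 1e-4
m1 = (F(eps) - F(-eps)) / (2 * eps)            # ~ E{nu}
m2 = (F(eps) - 2 * F(0) + F(-eps)) / eps ** 2  # ~ E{nu^2}
print(m1, m2)
```

Equivalently, sec(√(2λ)) = 1 + λ + (5/6)λ² + ⋯, and the λ² coefficient is E{ν²}/2 = 5/6.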
Proposition 11.2 Suppose n is an even integer. Then

(i) E{τ_J} = T + (4/3)(T/n) + O(h³);
(ii) E{J} = n + 4/3 + O(h);
(iii) E{(J − n)²} = (2/3)n + O(1);
(iv) for k > 1 there exists a constant C_k such that E{|J − n|^k} ≤ C_k n^{k/2}.

Proof. Set η_j = τ_j − τ_{j−1} − T/n, j = 1, 2, ..., and put

  M_j = Σ_{i=1}^{j} η_i = τ_j − j E{τ₁}.

Then (M_j) is a martingale. Apply the stopping theorem to the bounded stopping time J ∧ N and let N → ∞ to see that 0 = E{M_J} = E{τ_J} − E{J}E{τ₁}, so that

  E{J} = E{τ_J}/E{τ₁} = (n/T) E{τ_J}.    (37)

Now τ_J > T, so to find its expectation, recall that L is the index of the last stopping time before T. By (20), τ_J will either be the first hit of N_h^e after T, if L is odd (see (18)), or it will be the first hit of N_h^e after the first hit of N_h^o after T, if L is even. Let dist(x, A) be the shortest distance from x to the set A. The expected time for Brownian motion to reach the endpoints of an interval is well known: if X_0 = x ∈ (a, b), the expected time for X to leave (a, b) is σ^{−2}(x − a)(b − x). Thus if X_T = x, the expected additional time to reach N_h^e is σ^{−2}(h² − dist²(x, N_h^o)), while the expected additional time to reach N_h^o is σ^{−2}(h² − dist²(x, N_h^e)). Once at N_h^o, the expected time to reach N_h^e from there is T/n = h²/σ². The joint distribution of X_T and the parity of L was determined in Section 9 by reversing X_t from T:

  P{L is odd | X_T = x} = dist(x, N_h^e)/h + O(h),  P{L is even | X_T = x} = dist(x, N_h^o)/h + O(h).

Therefore, if p(x) is the density of X_T,

  E{τ_J − T} = ∫_{-∞}^{∞} σ^{−2} p(x) [h² − dist²(x, N_h^o)] [h^{−1} dist(x, N_h^e) + O(h)] dx + ∫_{-∞}^{∞} σ^{−2} p(x) [2h² − dist²(x, N_h^e)] [h^{−1} dist(x, N_h^o) + O(h)] dx.    (38)

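The exit-time formula just used has a discrete counterpart that is easy to verify: a simple symmetric random walk started at x in {0, ..., N} needs x(N − x) steps on average to leave (0, N), the analogue of σ^{−2}(x − a)(b − x). A small sketch solving the associated difference equation by relaxation:

```python
# Check of the exit-time formula in random-walk form: the expected number of
# steps for a simple symmetric walk started at x to leave (0, N) is x(N - x),
# the discrete analogue of sigma^{-2} (x - a)(b - x).

def expected_exit_steps(N, x, tol=1e-12):
    # solve m(i) = 1 + (m(i-1) + m(i+1))/2 with m(0) = m(N) = 0 by sweeping
    m = [0.0] * (N + 1)
    while True:
        delta = 0.0
        for i in range(1, N):
            new = 1 + 0.5 * (m[i - 1] + m[i + 1])
            delta = max(delta, abs(new - m[i]))
            m[i] = new
        if delta < tol:
            return m[x]

print(expected_exit_steps(10, 3))   # target: x(N - x) = 21
```

The continuous statement follows by scaling space by h and time by h²/σ².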
Now let x_k = (2k+1)h and write ∫_{-∞}^{∞} = Σ_k (1/2h) ∫_{x_k−h}^{x_k+h} · 2h, and write p(x) = p(x_k)(1 + O(h)) on the interval (x_k − h, x_k + h). We can then do the integrals explicitly:

  Σ_k (p(x_k)/2h) ∫_{x_k−h}^{x_k+h} (1 + O(h)) σ^{−2} [h² − dist²(x, N_h^o)] [dist(x, N_h^e)/h + O(h)] dx = (5/12) σ^{−2} h² Σ_k p(x_k) 2h + O(h³) ∼ (5/12)(T/n),    (39)

since the Riemann sum approximates ∫ p(x) dx = 1. The other integral is similar, and gives (11/12)(T/n), so we see that E{τ_J − T} = (4/3)(T/n) + O(h³), which implies (i) and (ii).

To see (iii), note that M_j² − j var(τ₁) is also a martingale. As with M, we can stop it at time J to see that E{M_J²} = E{J} var{τ₁}. Now

  E{M_J²} = E{τ_J²} − 2 E{Jτ_J} E{τ₁} + E{τ₁}² E{J²},

and E{Jτ_J} = T E{J} + E{J(τ_J − T)}. Now J is F_T-measurable, while the value of E{τ_J − T | F_T} depends only on the value of X_T and on the parity of L; it does not depend on J, as L is conditionally independent of {X_{T+t}, t ≥ 0} given X_T. In short, E{Jτ_J} = E{J}E{τ_J}. Thus

  E{τ_J²} − 2 E{J} E{τ_J} E{τ₁} + E{τ₁}² E{J²} = E{J} var(τ₁).

We have found the values of all the quantities except E{J²} and E{(τ_J − T)²}. The latter turns out to be negligible: we know it will be of the form γT²/n² for some γ > 0. So we can solve for E{J²} to find that

  E{J²} = n² + (10/3)n + O(1),

and (iii) follows.

To see (iv), notice that from (42), P{|J − n| > y√n} ≤ 4e^{−3y/4}, so that

  E{(|J − n|/√n)^k} ≤ 4 ∫_0^∞ y^k d(−e^{−3y/4}) = C_k,

which proves the assertion.

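The two cell averages behind (38) and (39) can be confirmed directly. Writing v = dist(x, N_h^o) over a cell around an odd lattice point, so that dist(x, N_h^e) = h − v, the averages over v ∈ [0, h] should come out to (5/12)h² and (11/12)h², whose sum gives the (4/3)h²/σ² = (4/3)T/n in (i):

```python
# Check of the cell averages behind (38)-(39): with v = dist(x, N_o) in [0, h]
# and dist(x, N_e) = h - v,
#   avg of (h^2 - v^2)(h - v)/h       = (5/12)  h^2,
#   avg of (2h^2 - (h - v)^2) * v/h   = (11/12) h^2.

h, m = 0.2, 100000
vs = [(i + 0.5) * h / m for i in range(m)]          # midpoint rule on [0, h]
a1 = sum((h * h - v * v) * (h - v) / h for v in vs) / m
a2 = sum((2 * h * h - (h - v) ** 2) * v / h for v in vs) / m
print(a1 / (h * h), a2 / (h * h))   # ~ 5/12 and 11/12
```

Both integrals are elementary, but the numerical check guards against slips in which distance pairs with which probability weight.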
We will need to control the tails of the distributions of τ_n and J. The following corollary is the key.

Corollary 11.3 Let (ξ_n) be a sequence of positive reals with √n ξ_n → ∞ as n → ∞, and suppose m = nξ_n. Then for ρ > 0, as n → ∞,

  P{√n |τ_m − (m/n)T| > ρ} ≤ 2 e^{−3ρ²/4T²ξ_n} (1 + O(1/(ξ_n√n))).    (40)

Proof. By Chebyshev's inequality and (36), for λ > 0,

  P_n = P{√n (τ_m − (m/n)T) > ρ} ≤ e^{−λρ} E{e^{λ√n(τ_m − (m/n)T)}} = e^{−λρ} (e^{−x²/2}/cos x)^{nξ_n},

where x = √(2λT/√n), so that λT/√n = x²/2. Take logs and choose λ = 3ρ/(2ξ_nT²) to see that

  log P_n ≤ −λρ + nξ_n (−x²/2 − log cos x).

Expand log cos x near x = 0 and note that x² = O(1/(ξ_n√n)) = o(1): then −x²/2 − log cos x = x⁴/12 + O(x⁶), and nξ_n x⁴/12 = λ²T²ξ_n/3, so that

  log P_n ≤ −3ρ²/(2ξ_nT²) + 3ρ²/(4ξ_nT²) + O(1/(ξ_n√n)) = −3ρ²/(4ξ_nT²) + O(1/(ξ_n√n)).    (41)

The other direction is similar. For λ > 0, let

  P'_n = P{√n (τ_m − (m/n)T) < −ρ} ≤ e^{−λρ} E{e^{−λ√n(τ_m − (m/n)T)}} = e^{−λρ} (e^{x²/2}/cosh x)^{nξ_n},

which differs from the above only in that cosh replaces the cosine. Exactly the same manipulations show that P'_n is again bounded by (41), and the conclusion follows.

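The expansion of log cos x used in the proof of Corollary 11.3 is easy to confirm numerically:

```python
import math

# Check of the expansion used in the proof of Corollary 11.3:
#   -x^2/2 - log cos x = x^4/12 + O(x^6),
# so the ratio below tends to 1/12 as x -> 0.

def excess(x):
    return (-x * x / 2 - math.log(math.cos(x))) / x ** 4

print([round(excess(x), 6) for x in (0.2, 0.1, 0.05)])
```

The x⁴/12 term is exactly what produces the Gaussian-type exponent −3ρ²/4T²ξ_n after optimizing over λ.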
Corollary 11.4 Let y > 0 and let g be exponentially bounded. Then

  P{|J − n|/√n ≥ y} ≤ 2 e^{−(3y²/4)(1 + y/√n)^{−1}},    (42)

and for all strictly positive p and ε,

  lim_{n→∞} n^p E{g(X_{τ_n}); |τ_n − T| > n^{−1/4+ε}} = 0,    (43)

  lim_{n→∞} n^p E{g(X_{τ_J}); |J − n| > n^{1/2+ε}} = 0.    (44)

Proof. Let ξ > 1. Then P{J > nξ} = P{τ_{nξ} < T} ≤ P{√n |τ_{nξ} − ξT| > √n (ξ − 1)T}, so by (40),

  P{J − n > n(ξ − 1)} ≤ 2 e^{−3n(ξ−1)²/4ξ}.

Take y = (ξ − 1)√n to see that P{J − n > y√n} ≤ 2 e^{−(3y²/4)(1 + y/√n)^{−1}}. Similarly, for ξ < 1, P{J < nξ} = P{τ_{nξ} > T} ≤ P{√n |τ_{nξ} − ξT| > √n (1 − ξ)T}; use (40) to get the same bound for P{J − n < −y√n}, and add to get (42).

For (43), note that X_{τ_n} and τ_n are independent and |g(x)| ≤ A e^{a|x|} for some A and a, so

  |E{g(X_{τ_n}); |τ_n − T| > n^{−1/4+ε}}| ≤ A E{e^{a|X_{τ_n}|}} P{|τ_n − T| > n^{−1/4+ε}}.

X_{τ_n} is binomial, so we use its moment generating function to see that E{e^{a|X_{τ_n}|}} ≤ 2 e^{a²σ²T/2}, uniformly in n. Combine this with the bound (40) on the tails of τ_n to see (43). The second assertion follows in the same way from (42) and Schwarz's inequality, once we notice that in any case |X_{τ_J} − X_T| ≤ 4h, so that |g(X_{τ_J})| ≤ A e^{a|X_{τ_J}|} ≤ A e^{a|X_T| + 4ah}.

11.2 Transition Probabilities

Let P_n(x) be the transition probability of a simple symmetric random walk on the integers:

  P_n(x) = n! / ( ((n+x)/2)! ((n−x)/2)! ) 2^{−n},    (45)

which is the probability of taking ½(n+x) positive steps and ½(n−x) negative steps out of n total steps. Now let

  R(n, k, x) = P_{n+k}(x) / P_n(x).

Define the effective order Ô of a monomial k^p x^q / n^r to be Ô(k^p x^q / n^r) = ½(p + q) − r.

Proposition 11.5 Let n, k, and x be even integers with max(|k|, |x|) ≤ n^{3/5}. Then

  R(n, k, x) = 1 − k/2n + (3k² + 4kx²)/8n² − 3k²x²/4n³ + k²x⁴/8n⁴ + Q₃ + O(n^{−3/2}),

where Q₃ is a sum of monomials of effective order at most −3/2.

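Proposition 11.5 can be exercised numerically before proving it. The sketch below compares the exact ratio R(n, k, x) of transition probabilities (45), computed via log-gamma, with Q₂, the part of the expansion of effective order at least −1; the parameter values are illustrative:

```python
import math

# Check of Proposition 11.5: compare the exact ratio
# R(n,k,x) = P_{n+k}(x) / P_n(x) of the transition probabilities (45) with
# Q2, the terms of the expansion with effective order >= -1.

def logP(n, x):
    # log of (45), via lgamma to avoid huge factorials
    return (math.lgamma(n + 1) - math.lgamma((n + x) / 2 + 1)
            - math.lgamma((n - x) / 2 + 1) - n * math.log(2))

def Q2(n, k, x):
    return (1 - k / (2 * n) + (3 * k ** 2 + 4 * k * x ** 2) / (8 * n ** 2)
            - 3 * k ** 2 * x ** 2 / (4 * n ** 3)
            + k ** 2 * x ** 4 / (8 * n ** 4))

n, k, x = 10000, 20, 40          # even integers with |k|, |x| <= n^{3/5}
R = math.exp(logP(n + k, x) - logP(n, x))
print(R, Q2(n, k, x))            # agree to O(n^{-3/2})
```

At these values the discrepancy is of order 10^{-6}, consistent with the O(n^{−3/2}) remainder.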
Proof. Write R(n, k, x) in terms of factorials and use Stirling's formula on each. We need Stirling's formula in the form

  n! = √(2π) n^{n+1/2} e^{−n + 1/12n + O(n^{−3})}.

The O(n^{−3}) term is uniform in the sense that there is an a such that for all n it is between −a/n³ and a/n³. Notice that errors in log R are of the same order as those of R: that is, if r_n → r > 0 and |log r_n − log r| < cn^{−p}, then |r_n − r| < 2rcn^{−p} for large n. Thus it is enough to determine log R up to terms of order 1/n.

Let ξ = k/n, η = x/n, and m = n/2, and write R(n, k, x) = R₁R₂, where R₂ comes from the factors e^{1/12n + O(n^{−3})}. We find

  log R₁ = (2m(1+ξ) + ½) log(1+ξ) + (m(1+η) + ½) log(1+η) + (m(1−η) + ½) log(1−η) − (m(1+ξ+η) + ½) log(1+ξ+η) − (m(1+ξ−η) + ½) log(1+ξ−η),

  log R₂ = (1/24m) [1/(1+ξ) + 2/(1+η) + 2/(1−η) − 1 − 2/(1+ξ+η) − 2/(1+ξ−η)],

and an easy calculation shows that

  log R₂ = ξ/4n + O(n^{−3/2}).

Expand log R₁ in a power series in ξ and η. Since |k| and |x| are smaller than n^{3/5}, ξ and η are smaller than n^{−2/5}. If we include terms up to order 6 in ξ and η, the remainder will be o(n^{−3/2}), so that

  log R₁ = −½ξ + ¼ξ² + mη²ξ − mη²ξ² + mη²ξ³ + ½mη⁴ξ + Ŝ(m, ξ, η) + o(n^{−3/2}),

where Ŝ(m, ξ, η) is a polynomial whose terms are all o(1/n). Notice that log R₂ is also o(1/n), so we can include it as part of Ŝ. In terms of n, k, and x, the terms making an O(1/n) or larger contribution will be of the form ξ^p η^q with p + q ≤ 2, and mξ^p η^q with p + q ≤ 5. Thus

  log R = −k/2n + (k² + 2kx²)/4n² − k²x²/2n³ + (2k³x² + kx⁴)/4n⁴ + S(n, k, x) + o(n^{−3/2}) = Q(n, k, x) + S(n, k, x) + o(n^{−3/2}),

where S(n, k, x) is a sum of monomials, each of which is o(1/n). The largest term in Q is kx²/2n² ≤ n^{−1/5}. To see how many terms of the exponential series we need to keep, note that the coefficients may be O(n), so that we can write

  R = e^{Q+S} (1 + o(n^{−3/2})) = [Σ_{p=0}^{8} (Q+S)^p / p!] (1 + o(n^{−3/2})) = Q₁ (1 + o(n^{−3/2})).

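The bracket appearing in log R₂ above can be checked symbolically by hand or numerically. Writing it as B(ξ, η), one finds B(0, η) = 0 identically and B(ξ, 0) = 3ξ/(1+ξ), which is what gives log R₂ = B/(24m) ≈ 3ξ/24m = ξ/4n:

```python
# Check of the bracket B(xi, eta) in log R2:
#   B = 1/(1+xi) + 2/(1+eta) + 2/(1-eta) - 1 - 2/(1+xi+eta) - 2/(1+xi-eta).
# B(0, eta) = 0 identically, and B(xi, 0) = 3*xi/(1+xi) ~ 3*xi for small xi.

def B(xi, eta):
    return (1 / (1 + xi) + 2 / (1 + eta) + 2 / (1 - eta) - 1
            - 2 / (1 + xi + eta) - 2 / (1 + xi - eta))

print(B(1e-4, 0.0) / 1e-4, B(0.0, 0.05))   # ~ 3 and ~ 0
```

In particular log R₂ has effective order −3/2, which is why it can be absorbed into Ŝ.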
Now the effective order of the product of monomials is the sum of the effective orders. Check the effective orders of the terms in the polynomial Q₁: we see that

  Ô(k/n) = Ô(kx²/n²) = −1/2,  Ô(k²/n²) = Ô(k²x²/n³) = −1,

so that the only terms in Q₁ of effective order at least −1 are those in Q and the three terms from ½Q² which come from the squares and products of the terms of effective order −1/2, namely

  k²/8n² + k²x⁴/8n⁴ − k²x²/4n³;

the other terms have effective order at most −3/2. All terms in S have effective order less than −1, and hence less than or equal to −3/2, since all effective orders are multiples of 1/2. Thus define

  Q₂ = 1 − k/2n + (3k² + 4kx²)/8n² − 3k²x²/4n³ + k²x⁴/8n⁴,

and let Q₃ be all the other terms in Q₁. Then R = (Q₂ + Q₃)(1 + o(n^{−3/2})), where all terms in Q₃ have effective order at most −3/2. This proves the proposition.

12 Summary and Translation

We have done all our work in terms of the function g, but we would like the final results in terms of the original function f. The translation is straightforward. Let s = s₀e^{x+rT} and let

  ρ(x) = e^{−x/2 − σ²T/8},

so that g(x) = f(s₀e^{x+rT})ρ(x). Note first that ρ'(x) = −½ρ(x). Moreover, the martingale measure P and the Brownian measure Q are connected by dP = ρ(X_T) dQ on F_T, and p̂(x) = ρ(x)p(x), where p and p̂ are the densities of X_T under the Brownian and martingale measures respectively. Now S_T = s₀e^{X_T+rT}, so X_T = log(S̃_T/s₀), and thus E^Q{g(X_T)} = E^P{f(S_T)}. We also have formulas involving the derivatives of g:

  E^Q{g(X_T)} = E^P{f(S_T)}
  E^Q{g''(X_T)} = E^P{S_T² f''(S_T) + ¼ f(S_T)}    (46)
  E^Q{X_T^k g(X_T)} = E^P{(log(S̃_T/s₀))^k f(S_T)}.

(The middle formula follows from ρ'' = ¼ρ: a short computation gives g''(x) = ρ(x)(s²f''(s) + ¼f(s)) with s = s₀e^{x+rT}.)

References

[1] Cox, J.C., Ross, S.A., and Rubinstein, M.: Option pricing: a simplified approach. Journal of Financial Economics 7 (1979) 229–263.

[2] Diener, F. and Diener, M.: Asymptotics of the price oscillations of a vanilla option in a tree model. (preprint)

[3] Diener, F. and Diener, M.: Asymptotics of the price of a barrier option in a tree model. (preprint)

[4] Karatzas, I. and Shreve, S.: Brownian Motion and Stochastic Calculus. Springer-Verlag, 1988.

[5] Karatzas, I. and Shreve, S.: Methods of Mathematical Finance. Springer-Verlag, 1998.

[6] Leisen, D.P.: Pricing the American put option: a detailed convergence analysis for binomial models. Journal of Economic Dynamics and Control 22 (1998) 1419–1444.

[7] Leisen, D.P. and Reimer, M.: Binomial models for option valuation – examining and improving convergence. Applied Mathematical Finance 3 (1996) 319–346.

[8] Monroe, Itrel: On embedding right-continuous martingales in Brownian motion. Ann. Math. Stat. 43 (1972) 1293–1316.

[9] Rogers, L.C.G. and Stapleton, E.J.: Fast accurate binomial pricing. Finance and Stochastics 2 (1998) 3–17.

[10] Walsh, John B. and Walsh, Owen D.: Embedding and the convergence of the binomial and trinomial tree schemes. Numerical Methods and Stochastics, Fields Institute Communications Series 34, pp. 101–121 (in press).