
Implied Binomial Trees from the Historical Distribution

Nusret Cakici

and

Kevin R. Foster*

City College of New York


Convent Avenue at 138th Street
New York, NY 10031
Tel: (212) 650-6201
email: ncakici@yahoo.com, kfoster@ccny.cuny.edu

November 2001

Draft -- Not for Citation

*
We are grateful for helpful comments from Salih Neftci. The comments of an anonymous
referee were enormously beneficial. We are grateful for support from the Schweger Fund.
Remaining errors are our own.
Implied Binomial Trees from the Historical Distribution

This paper shows how to build a binomial tree that is consistent with the
historical distribution and that can be used to value American options.
Previous methods of using the historical distribution (Stutzer 1996) use data on
just the underlying security to imply options prices -- but only European
options. Since the binomial trees in this paper are constructed only from
historical data on the underlying asset, not option prices, market prices may be
compared to the estimates from these trees, in order to measure the relative
richness or cheapness of the quoted options.
Implied Binomial Trees from the Historical Distribution

This paper demonstrates a way to build binomial trees that have an option volatility "smile"

consistent with the historical distribution. This research links two recent strands: the Stutzer

(1996) method of using the historical distribution of asset prices to infer European option

prices and volatilities, and the methods of constructing "smiling trees" of Derman and Kani

(1994) and others. The constructed tree then has a volatility smile derived entirely from the

historical data on the underlying asset, not option prices. It can serve as a benchmark in

evaluating the current option prices to judge whether they reflect higher or lower volatility

than would be implied by their history. Just as the Stutzer methodology calculates European

option prices that are consistent with the historical distribution, this paper extends that work to

build trees that calculate American option prices consistent with the historical distribution.

Since the pioneering work of Dupire (1994), Rubinstein (1994), and Derman and Kani (1994),

both academics and practitioners have been building trees that can match the volatility smile,

in order to price American options and other path-dependent derivatives. These approaches

typically begin with a set of option prices and then generate a set of nodes and transition

probabilities that are consistent with these prices. However they are limited by the need to

accept as inputs the present structure of option prices (or else make strong functional form

assumptions) and so are not suitable for traders looking to measure the relative richness or

cheapness of the quoted options. We would like to be able to construct some sort of historical

baseline to be able to provide an estimate, based on the past behavior of the underlying asset,

of the option value for American options or other path-dependent derivatives.

A method of constructing an historical baseline for European options is provided in the papers

of Stutzer (1996) and Zou and Derman (1999), which demonstrate a straightforward

methodology for generating a risk neutralized historical distribution (RNHD). From this

RNHD, a historically appropriate set of European options prices is derived, using only the

data on the underlying asset, not data on current option prices. The RNHD is the distribution

that is "closest" to the historical empirical distribution but still satisfies certain constraints: the

mean is inferred from the risk neutrality constraint and the spread may be inferred from the at-

the-money option price. The RNHD is "closest" in that it minimizes the Kullback-Leibler

entropic distance between a nonparametric kernel estimate of the historical distribution and

the risk-neutral distribution (Cakici and Foster 2001).

The options prices implied by this risk neutralized historical distribution (RNHD) then serve

as inputs into the tree-building procedure in order to derive an appropriate benchmark for

valuing American options. In constructing these, we may wish to match a particular option

price so as to leverage this knowledge to evaluate other prices. For example, the thickly

traded at-the-money option price can imply out-of-the-money prices based on the empirical

distribution. Since this procedure delivers prices for both American and European options,

analysts can also make cross-comparisons.

This paper demonstrates the utility of this joint procedure by valuing S&P100 options, which

are American and can be exercised early. The existence of a substantial volatility smile,

resulting from the skewed distribution that fails to satisfy the Black-Scholes assumptions of

log-normality, makes the proper modeling of these options particularly important.

Section 2 of this paper describes the procedure for constructing the RNHD based on an estimate that

will accurately reflect the historical distribution. Section 3 outlines the procedure for

estimating a nonparametric kernel. Section 4 describes the procedure, based on Barle and

Cakici (1998), for creating the tree from the option volatility smile. Section 5 applies this

method to the S&P100 data. Section 6 concludes with some suggestions for further research.

2. Constructing the RNHD Estimate of the Volatility Smile

Equilibrium in a financial market is generally characterized with assumptions about a lack of

risk-free arbitrage trading opportunities. Forward prices are rationally derived from the

expected price of the underlying asset, so a particular asset with current value S0, paying a

dividend stream, d, where agents can borrow and lend at the riskless interest rate r, will have

expected price at some future date, T, given as ST, evaluated under a probability distribution,

P, such that

E_P[S_T] = S_0 e^{(r-d)T} .    (1)

Options are priced from the forward distribution, P, valued as their discounted expected value

if they expire "in the money." This forward distribution, P, is unknowable to empiricists

except as inferred from options prices. Some economists postulate a particular functional

form for the distribution (lognormal for Black-Scholes), but instead we use the entire
historical distribution as our guide. This historical distribution, \tilde{Q}, is transformed to match

the no-arbitrage condition and the option price, so we form Q from \tilde{Q} to get a plausible

estimate of P. Construction of Q, the risk-neutralized historical distribution, allows

calculation of the values of any standard European options.

Estimation of Q from the historical distribution, \tilde{Q}, is not trivial, however. There is no

necessary link between past events and expectations about the future. However, if we assume

that present market participants take account of the past in forming their expectations, we can

expect some relationship to exist (see Grandmont 1993). The tightness of this relationship

may be measured by the entropy change associated with it. Since a reduction in entropy is an

increase in orderliness (due to the restriction), the smallest decrease measures the least

prejudicial application of the no-arbitrage constraint.

The Shannon entropy of a distribution, Q, with observations q_j, is measured as

\sum_j q_j \log(q_j) ,    (2)

and the Kullback-Leibler entropic distance between two distributions P and Q is measured

analogously as

\sum_j q_j \log(q_j / p_j) .    (3)

This entropic distance measure deviates from a standard metric in that it is not symmetric: it

gives the directed distance from Q to P (not necessarily the same as the directed distance from

P to Q). The reason for this asymmetry is that the Kullback-Leibler distance weights the

deviations by their probability of occurrence, so it is more sensitive to deviations that are

more likely to be observed and less sensitive to improbable deviations. Other familiar

measures such as the squared deviations would give uniform weight to each squared

deviation.
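
To make the directedness of this measure concrete, the short sketch below (illustrative only; the toy distributions are not from the paper) computes the Kullback-Leibler distance of Equation 3 in both directions for two discrete distributions.

```python
import numpy as np

def kl_distance(q, p):
    """Directed Kullback-Leibler distance from Q to P (Equation 3): sum_j q_j log(q_j / p_j)."""
    q, p = np.asarray(q, dtype=float), np.asarray(p, dtype=float)
    mask = q > 0                                  # terms with q_j = 0 contribute nothing
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

# Two toy distributions on the same support; the deviations are weighted by the
# probabilities of the first argument, so the two directed distances differ.
q = [0.5, 0.3, 0.2]
p = [0.2, 0.3, 0.5]
print(kl_distance(q, p), kl_distance(p, q))
```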

The Kullback-Leibler minimum entropy change used in this paper, in finding a distribution Q

that is near to \tilde{Q} but that satisfies the constraints, restricts us from inadvertently imposing

extra assumptions. The relation with the Shannon entropy measure allows us to interpret

these extra assumptions as adding information (reducing entropy). This follows Laplace's

principle of insufficient reason. For instance, the Black-Scholes assumption of a normal

distribution imposes precise assumptions upon the third and higher moments. These

assumptions can be measured as decreasing entropy (increasing information), and so finding a

distribution that makes the slightest decrease in information corresponds to making the

weakest set of assumptions that can fulfill the given constraints. This entropic distance is

commonly used in other fields such as physics and engineering, since for particular

distributional families the cross entropy minimization procedure produces analytical solutions

of Bose-Einstein or Maxwell-Boltzmann distributions (see Kapur and Kesavan 1992).

Further reasons for using this measure are summarized by Von Neumann's reply to Shannon,

on the reason to label his original measure as "entropy": First, it "is the same as the expression

for entropy in thermodynamics and as such you should not use two different names for the

same mathematical expression, and second, and more importantly, entropy, in spite of one

hundred years of history, is not very well understood yet and so as such you will win every

time you use entropy in an argument!" (reported in Kapur and Kesavan, p. 8).

This Kullback-Leibler distance criterion is minimized subject to the requirement that Q be a

proper probability measure, that the expected price satisfies risk-neutrality, and may be

minimized subject to the additional requirement that some European option prices (typically

at-the-money) implied by this modified distribution match. Writing this new Q distribution as

a function of the λi parameter vectors, the no-arbitrage constraint requires that the expected

value under the new probability distribution should equal the forward price, so:

\int Q(\lambda_1, \lambda_2)\, S_T\, dS = S_0 e^{(r-d)T}    (4)

where r is the risk-free rate and d is the continuous dividend yield. Additionally, we may

impose the constraint that the price of a European call or put (typically the at-the-money

option) should match, thus

C(K, T) = e^{-rT} E_{Q(\lambda_1,\lambda_2)}[\max(0, S_T - K)]   or    (5)

P(K, T) = e^{-rT} E_{Q(\lambda_1,\lambda_2)}[\max(0, K - S_T)] ,    (6)

where K is the strike price, set equal to the forward price for the at-the-money case.

By the entropy distance minimization (see Buchen and Kelly 1996), if the ATM call price is

matched, we can derive

Q(\lambda_1, \lambda_2) = \frac{\tilde{Q}(S_T)\, e^{-\lambda_1 S_T - \lambda_2 C(K,T)}}{\int \tilde{Q}(S)\, e^{-\lambda_1 S - \lambda_2 C(K,T)}\, dS} ,    (7)

where the \lambda_1 parameters can be interpreted as the shadow values of the no-arbitrage

constraints and the \lambda_2 parameters are the shadow values of the constraints that the at-the-money option

prices should match. The parameter vectors λ1 and λ2 are found numerically.
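
As a rough illustration of how the tilted distribution of Equation 7 can be computed, the sketch below discretizes a stand-in "historical" density on a grid and solves for \lambda_1 and \lambda_2 with a root finder. Everything here is an assumption for illustration: the toy lognormal prior, the hypothetical at-the-money call price target, and the choice of tilting by the terminal price and the discounted call payoff (one function per constraint). The paper's own estimation uses the kernel density described in Section 3.

```python
import numpy as np
from scipy.optimize import fsolve

# Illustrative parameters (not the paper's data).
S0, r, d, T = 100.0, 0.077, 0.0392, 1.0
K, C_target = 100.0, 9.64            # hypothetical ATM call price to match
grid = np.linspace(40.0, 220.0, 2000)

# Stand-in for the kernel-estimated historical density \tilde{Q}.
hist = np.exp(-0.5 * ((np.log(grid / S0) - 0.08) / 0.18) ** 2) / grid
hist /= np.trapz(hist, grid)

payoff = np.exp(-r * T) * np.maximum(grid - K, 0.0)
forward = S0 * np.exp((r - d) * T)

def tilted(lams):
    lam1, lam2 = lams
    w = hist * np.exp(-lam1 * grid - lam2 * payoff)
    return w / np.trapz(w, grid)

def residuals(lams):
    q = tilted(lams)
    return [np.trapz(q * grid, grid) - forward,          # Equation 4
            np.trapz(q * payoff, grid) - C_target]       # Equation 5

lam = fsolve(residuals, x0=[0.0, 0.0])
q_rnhd = tilted(lam)
print(lam, np.trapz(q_rnhd * grid, grid))                # mean equals the forward price
```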

A simple example may clarify this procedure. Suppose the "Dice" company has a history of

returns given as if from a roll of a die: one sixth are 1, one sixth are 2, ... , one sixth are 6. It
is a uniform discrete distribution from 1 to 6, which we refer to as \tilde{Q}. The mean of the series

is evidently 3.5 and the standard deviation is 1.71. We want to modify this series so that it satisfies risk-

neutrality: assume that we want to make the expected value equal to some other number, say

4.44. Note that the entropy distance from a uniform distribution (each p_i = 1/n) is just a constant plus the

Shannon entropy:

\sum_i q_i \ln \frac{q_i}{1/n} = \ln n + \sum_i q_i \ln q_i .    (8)

So minimize the distance (maximize the negative) subject to the constraints that the

probabilities sum to one and that the mean is 4.44, getting the Lagrangian:

- \ln 6 - \sum_{i=1}^{6} q_i \ln q_i - \lambda_0 \left( \sum_{i=1}^{6} q_i - 1 \right) - \lambda_1 \left( \sum_{i=1}^{6} i\, q_i - 4.44 \right)    (9)

Get the first order conditions with respect to the qi and rearrange so

q_i = a b^i ,    (10)

where a = e^{-1-\lambda_0} and b = e^{-\lambda_1}. Substitute this into the constraints to be left with two

equations in two unknowns:

a \sum_i b^i = 1   and    (11)

a \sum_i i\, b^i = 4.44 ,    (12)

where the second equation gives a polynomial in b while a is just a normalizing parameter.

Some calculation gives the following numbers for Q, the distribution that has the minimum
entropy distance from \tilde{Q} but satisfies the constraints required of P:

Prob{1} Prob{2} Prob{3} Prob{4} Prob{5} Prob{6}
0.0594 0.0839 0.1186 0.1675 0.2365 0.3341

This distribution now has a mean of 4.44. When applied to the problem of minimizing the

distance from a distribution of stock returns given the prior of the historical returns, the

answer may not necessarily be from a particular distributional family, since the cross entropy

minimization no longer has analytical solutions.
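
The dice example can be verified numerically. The sketch below solves Equations 11 and 12 for the tilting parameters a and b (using the fact that the ratio of the two constraints depends only on b) and reproduces the table above.

```python
import numpy as np
from scipy.optimize import brentq

# Tilted dice distribution q_i = a * b**i (Equation 10), constrained to sum to one
# and to have mean 4.44 (Equations 11 and 12).
faces = np.arange(1, 7)
target_mean = 4.44

def mean_given_b(b):
    w = b ** faces
    return np.sum(faces * w) / np.sum(w)          # Eq. 12 divided by Eq. 11

b = brentq(lambda x: mean_given_b(x) - target_mean, 0.5, 5.0)
a = 1.0 / np.sum(b ** faces)                      # normalization from Eq. 11
q = a * b ** faces
print(np.round(q, 4))                             # approx. [0.0594 0.0839 0.1186 0.1675 0.2365 0.3341]
print(np.sum(q), np.sum(faces * q))               # 1.0 and 4.44
```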

With some further calculations, the variance could be made consistent with the volatility

implied by an option price. For instance, an analyst might believe that the thickly-traded at-

the-money option gives an accurate volatility measure, but want to leverage this belief to

value thinly-traded options. Or the analyst might have a strong belief about another

probability (corresponding to a different option) and want to see the implied effect upon at-

the-money volatility.

3. Kernel Estimator of Historical Distribution

The simulations of Stutzer and Zou and Derman were made as simple draws from the daily

historical returns. While this has the advantage of simplicity, some of the tail events remain

discrete, chunky occurrences, not smooth probabilities. The 1987 Crash is one such notable

discrete event. In the historical simulations there is a simple chance of picking this calamity

with each draw, but no chance of ever picking a slightly larger or slightly smaller drop.

Market participants may have somewhat smoother priors that assign at least a tiny probability to

large negative movements in between those observed. Surely it seems plausible that market participants

may assign some nonzero probability to a substantial fall that is not exactly a clone of the

1987 Crash. This intuitive notion can be modeled with a non-parametric kernel estimate of

the historical distribution.

The kernel estimator will be appropriate, as long as the historical distribution is stationary,

ergodic, and the data generating process satisfies some relatively weak assumptions of

stochastic equicontinuity in order for a Uniform Law of Large Numbers result to obtain

(Andrews 1994). Stochastic equicontinuity of an empirical process, HT, is defined as

\forall \varepsilon > 0 and \forall \eta > 0, \exists \delta > 0 such that

\lim_{T \to \infty} P\left[ \sup_{\theta, \theta' \in \Theta,\ \rho(\theta,\theta') < \delta} \left| H_T(\theta') - H_T(\theta) \right| > \eta \right] < \varepsilon .    (13)

In generalizing this method to higher dimensions, we should keep in mind the tradeoff

between the smoothness required of the function and the dimension of the random variable.

The kernel estimator at each point is constructed as

H_T(x) = \frac{1}{hT} \sum_{i=1}^{T} K\left( \frac{x - X_i}{h} \right)    (14)

where K is the Epanechnikov kernel

K(z) = \begin{cases} \frac{3}{4\sqrt{5}} \left( 1 - \frac{1}{5} z^2 \right) & |z| \le \sqrt{5} \\ 0 & \text{else} \end{cases}    (15)

We use an Epanechnikov kernel because in the limit this has the highest efficiency in Mean

Squared Error in trading off between bias and variance (Silverman 1986). The bandwidth

parameter, h, is similarly selected based on limit efficiency as

h = 1.06\, \hat{\sigma}\, T^{-1/5}    (16)

where \hat{\sigma} is the standard deviation of the sample. We have experimented with other kernels

and other bandwidths, but found little variation (results available upon request). A Normal

(Gaussian) kernel implies basically the same volatility smile. The smile is also little changed

by variations in the bandwidth parameter or even the use of an adjustable bandwidth estimator

that allows the sensitivity to vary depending on the local density so that high-density areas get

a low bandwidth while low-density areas get a higher bandwidth.
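
A minimal sketch of the estimator in Equations 14-16 follows; the simulated returns are only a stand-in for the S&P100 series, and the grid and sample size are arbitrary choices.

```python
import numpy as np

def epanechnikov(z):
    """Epanechnikov kernel on |z| <= sqrt(5), as in Equation 15."""
    return np.where(np.abs(z) <= np.sqrt(5.0),
                    0.75 / np.sqrt(5.0) * (1.0 - z ** 2 / 5.0), 0.0)

def kernel_density(x_grid, sample):
    """Kernel estimate of the return density (Equations 14 and 16)."""
    T = len(sample)
    h = 1.06 * np.std(sample) * T ** (-0.2)           # Equation 16 bandwidth
    z = (x_grid[:, None] - sample[None, :]) / h       # (grid point, observation) pairs
    return epanechnikov(z).sum(axis=1) / (h * T)

# Hypothetical daily log returns standing in for the S&P100 series used in the paper.
rng = np.random.default_rng(0)
returns = rng.standard_t(df=4, size=4000) * 0.01
grid = np.linspace(-0.08, 0.08, 400)
pdf = kernel_density(grid, returns)
print(np.trapz(pdf, grid))                            # integrates to roughly one
```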

Use of a kernel estimator for interest rates has been criticized by Chapman and Pearson

(2000) for making a poor estimation of areas of the distribution with very few data points.

However, this analysis is not so concerned with the behavior at extreme values, but rather with the

well-estimated behavior at points where the data are dense. The kernel estimator is also well

suited to addressing the issue of the length of memory. This procedure assumes that the entire

historical series enters the present judgements of market participants as they assess the

likelihood of future events, without any decay so that events from decades past enter in the

same manner as the most current events. Statistically, this is the assumption of stationarity.

While this assumption may not be universally valid, it still gives an interesting baseline by

which to judge current option prices. A modification to the kernel estimator could address

this issue, since it is straightforward to introduce another dimension of temporal distance

when creating a kernel estimate of the historical distribution.

This kernel estimate is used to construct a RNHD estimator of the forward distribution, given

particular values for r, d, and at-the-money option prices. This forward distribution generates

a smile of implied Black-Scholes volatilities for the European call prices. A smooth version

of the smile is fitted as a function of the strike price and its functionals (square, cubic, etc.).

This is a simplification to minimize the computing time, since the construction of the tree

requires evaluation of call prices (therefore volatilities) at quite numerous intervals while the

RNHD procedure gives discrete values. Derman and Kani used a simple linear smile, but we

wish to capture more detail by using the higher-order functionals. The smooth functional

preserves the necessary information while minimizing the computational burden.

4. Constructing a Recombining Binomial Tree from the Volatility Smile

Following the Barle and Cakici modification of the Derman and Kani algorithm, the

recombining binomial tree is constructed recursively forward, one level at a time. Assuming

that the tree has been constructed up to the nth level, letting r be the risk-free interest rate, d be

the dividend yield rate, and s i be the stock price at node i of the nth level, then the forward

price, F_i = s_i e^{(r-d)\Delta t}, must by risk-neutrality satisfy

F_i = p_i S_{i+1} + (1 - p_i) S_i    (17)

where the capital letters S_{i+1} and S_i represent the nodes at the (n+1)th level branching from the

price s_i.

The call price of a European option maturing at (n+1) and with strike price, K, is C (K , n + 1) .

Of course, the price must be interpolated, which we derive from the functional describing

volatility. Then the call and put prices are:

C(K, n+1) = \sum_{i=1}^{n+1} \Lambda_i \max(S_i - K, 0)    (18)

P(K, n+1) = \sum_{i=1}^{n+1} \Lambda_i \max(K - S_i, 0)    (19)

where Λ i is the Arrow-Debreu price at each node, calculated as

\Lambda_i e^{r\Delta t} = \begin{cases} p_n \lambda_n & i = n+1 \\ p_{i-1} \lambda_{i-1} + (1 - p_i) \lambda_i & 2 \le i \le n \\ (1 - p_1) \lambda_1 & i = 1 \end{cases}    (20)

where λi is the known Arrow-Debreu price at node i.
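
Equation 20 is simply a weighted roll-forward of the known Arrow-Debreu prices. A small sketch follows; the ordering convention (index running from the lowest new node upward) and the numerical inputs are ours, chosen to match the first step of the worked example in Section 5.

```python
import numpy as np

def arrow_debreu_next(lam, p, r, dt):
    """Propagate Arrow-Debreu prices one level forward (Equation 20).

    lam : Arrow-Debreu prices at the n known nodes of the current level (lowest first)
    p   : up-move probabilities at those same n nodes
    Returns the n+1 Arrow-Debreu prices Lambda_i at the next level.
    """
    n = len(lam)
    nxt = np.zeros(n + 1)
    nxt[0] = (1.0 - p[0]) * lam[0]                        # i = 1 (lowest new node)
    for i in range(1, n):
        nxt[i] = p[i - 1] * lam[i - 1] + (1.0 - p[i]) * lam[i]
    nxt[n] = p[n - 1] * lam[n - 1]                        # i = n + 1 (highest new node)
    return nxt * np.exp(-r * dt)

# Illustrative check: lambda_1 = 1, p_1 = 0.482, r = 0.077, dt = 0.2
print(arrow_debreu_next(np.array([1.0]), np.array([0.482]), 0.077, 0.2))
# -> approximately [0.51, 0.4747], the Lambda_B and Lambda_A values of the Section 5 example
```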

One remaining assumption concerns finding the value of the central node(s). The tree is

allowed to grow at the risk-free rate so that the central nodes do not continue to represent zero

growth (as in Derman and Kani). If the number of new nodes is odd, then the central node

price is set equal to the forward value. If the number of new nodes is even, then the two

central nodes must satisfy S_i S_{i+1} = F_i^2. So the upper central node is

S_{i+1} = F_i \frac{\lambda_i F_i + \Delta C_i}{\lambda_i F_i - \Delta C_i} ,    (21)

where

\Delta C_i = e^{r\Delta t} C(F_i, n+1) - \sum_{j=i+1}^{n} \lambda_j (F_j - F_i) ,    (22)

while the lower central node is

S_i = \frac{F_i^2}{S_{i+1}} .    (23)

From these central nodes, the upper and lower parts of the tree may be derived.

For the upper nodes, the previous equations imply that

S_{i+1} = \frac{\Delta C_i S_i - \lambda_i F_i (F_i - S_i)}{\Delta C_i - \lambda_i (F_i - S_i)}    (24)

Analogously, the lower nodes are filled with the relation that:

S_i = \frac{\lambda_i F_i (S_{i+1} - F_i) - \Delta P_i S_{i+1}}{\lambda_i (S_{i+1} - F_i) - \Delta P_i}    (25)

where now

\Delta P_i = e^{r\Delta t} P(F_i, n+1) - \sum_{j=1}^{i-1} \lambda_j (F_i - F_j) .    (26)

Negative probabilities are not allowed; they are largely avoided by setting the strike price,

K, equal to F when evaluating the option. Given these modifications, negative

probabilities are avoided even when building very large trees.
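
The node-price formulas above translate directly into small helper functions. The sketch below is illustrative: the variable names are ours, and the check at the end uses the level-2 quantities from the worked example in Section 5.

```python
# Sketch of the node-price formulas (Equations 21-25), written as standalone helpers.
# lam_i, F_i, dC_i, dP_i are the quantities defined in the text for the node being filled.

def upper_central_node(lam_i, F_i, dC_i):
    """Equation 21: upper of the two central nodes when the new level has an even count."""
    return F_i * (lam_i * F_i + dC_i) / (lam_i * F_i - dC_i)

def lower_central_node(F_i, S_upper):
    """Equation 23: lower central node from S_i * S_{i+1} = F_i**2."""
    return F_i ** 2 / S_upper

def upper_node(S_i, lam_i, F_i, dC_i):
    """Equation 24: next node above an already-known node S_i."""
    return (dC_i * S_i - lam_i * F_i * (F_i - S_i)) / (dC_i - lam_i * (F_i - S_i))

def lower_node(S_ip1, lam_i, F_i, dP_i):
    """Equation 25: next node below an already-known node S_{i+1}."""
    return (lam_i * F_i * (S_ip1 - F_i) - dP_i * S_ip1) / (lam_i * (S_ip1 - F_i) - dP_i)

# Illustration with the level-2 numbers from Section 5 (F_1 = 100.7577, dC_1 = 3.6179, lam_1 = 1):
S_A = upper_central_node(1.0, 100.7577, 3.6179)
S_B = lower_central_node(100.7577, S_A)
print(round(S_A, 2), round(S_B, 2))    # about 108.26 and 93.77 (the text reports 108.27 and 93.77)
```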

The complete method to construct a binomial tree with volatilities consistent with the

historical distribution may be summarized as follows:

1. A nonparametric kernel is constructed to estimate the empirical distribution of asset

returns.

2. The RNHD is constructed as the distribution that is closest to the empirical distribution

but also satisfies the conditions of risk-neutrality (Equation 4) and perhaps option prices

(Equations 5 and 6), typically at-the-money as in Zou and Derman.

3. The implied option prices and volatilities from the RNHD are evaluated at discrete

intervals and a polynomial function interpolates the "smile".

4. A recombining binomial tree is constructed that matches the smile implied by the

historical data, using the Barle and Cakici algorithm.

5. Application to S&P100 Options

A nonparametric kernel is estimated for the S&P100 index using data from March 5, 1984 to

November 2, 2000; daily returns are computed as the log price ratio. This empirical

probability density function is shown in Figure 1, where it is compared with a normal density

(the assumption that underlies the Black-Scholes model). It can be seen that the empirical

probability density function (p.d.f.) has fatter tails (it is leptokurtic) but is also more

concentrated at the middle of the distribution so that the “shoulders” are lower than the

reference normal.

This kernel is used to generate a risk-neutralized historical distribution (RNHD), assuming a

risk-free rate of 8% and a dividend yield rate of 4%. These are not meant to be historical

averages but rather to demonstrate that, whatever the present market-quoted levels of these

parameters, this RNHD procedure is quite capable of matching them. The at-the-money

volatility is unrestricted, however additional computations could be made to match that price.

The RNHD generates a volatility smile for one-year (252-day) European options, shown in

Table 1. The Black-Scholes volatilities show a steep drop as the strike price rises, which

arises from the fact that, since the tails of the historical distribution are much thicker than

those of the normal distribution, the option prices are higher.

A regression is estimated to relate the implied volatility to the strike/price ratio as well

as its square, its cube, and a constant. Each regression is of the form:

\hat{\sigma}_i = \beta_0 + \beta_1 \left( \frac{K_i}{S} \right) + \beta_2 \left( \frac{K_i}{S} \right)^2 + \beta_3 \left( \frac{K_i}{S} \right)^3    (27)

and the estimated betas for this case are:

\hat{\sigma}_i = 2.04 - 4.59 \left( \frac{K_i}{S} \right) + 3.84 \left( \frac{K_i}{S} \right)^2 - 1.09 \left( \frac{K_i}{S} \right)^3    (28)
        (0.04)      (0.12)                       (0.12)                       (0.04)

where the standard errors are in parentheses under each coefficient. The R2 for the regression

is 0.9998. With all of the functionals, such a high R2 would normally bring worries of over-

fitting. However in this case that is the point. We want to get a very tight fit since we are not

particularly interested in out-of-sample prediction (which would be the reason to worry about

over-fitting) but rather are interested in getting a very precise description of the present

volatility surface. For strike values outside the range presented in Table 1, of [80,120], we

use the endpoint volatilities rather than extrapolate. This description of the volatility surface

is then put into the next step where it is used to generate a tree that is consistent with it. The

more precise the estimated volatility surface, the more precise is the tree.
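
A sketch of the fitting step is below. The strike/volatility points are taken from Table 1 (a coarser grid than the RNHD output actually used), so the fitted coefficients need not match Equation 28 exactly; with regressors this collinear it is the fitted curve, not the individual betas, that matters.

```python
import numpy as np

# Cubic smile regression (Equation 27): implied volatility on K/S, (K/S)^2, and (K/S)^3.
strikes = np.array([80, 85, 90, 95, 100, 105, 110, 115, 120], dtype=float)
vols    = np.array([0.270, 0.245, 0.227, 0.213, 0.203, 0.195, 0.189, 0.185, 0.182])
ks = strikes / 100.0                              # the spot price is 100 in the example

X = np.column_stack([np.ones_like(ks), ks, ks ** 2, ks ** 3])
beta, *_ = np.linalg.lstsq(X, vols, rcond=None)

def smile_vol(k_over_s):
    """Fitted implied volatility; outside [0.8, 1.2] the endpoint value is used, as in the text."""
    k = np.clip(np.asarray(k_over_s, dtype=float), 0.8, 1.2)
    return beta[0] + beta[1] * k + beta[2] * k ** 2 + beta[3] * k ** 3

print(np.round(smile_vol(np.array([0.90, 1.00, 1.10])), 3))   # compare with Table 1: 0.227, 0.203, 0.189
```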

The tree is generated with the Barle-Cakici algorithm. For illustration, a five-step binomial

tree is shown in Figure 2 for the one-year options, although a 50-step model is used for the

later results. Each node shows the stock price, the probability of a subsequent upward move,

and the Arrow-Debreu price, λi . The labels refer to the explanatory text below. The interest

rate is 8%, so the continuously compounded rate, r= ln(1 + .08) = 0.0770, while the dividend

payment stream is 4%, also modified to the continuously compounded rate, d= ln(1 + .04) =

0.0392. The time to maturity is one year so each step, ∆t , is 0.2 years. The initial stock

price, S1 , is 100 and λ1 =1.

To fill in the nodes of the tree, first we calculate the stock prices at the two nodes labeled A

and B, where the forward price is F_1 = 100 e^{(r-d)\Delta t} = 100.7577. Since we have an even

number of new nodes, we use Equation 21 to calculate

S_A = F_1 \frac{\lambda_1 F_1 + \Delta C_1}{\lambda_1 F_1 - \Delta C_1} ,    (29)

where the initial Arrow-Debreu price is set to be \lambda_1 = 1. From Equation 22 we have

\Delta C_1 = e^{r\Delta t} C(F_1, 2) - 0, and C(F_1, 2) = 3.5626 is the Black-Scholes price of a call with maturity

of 0.2 years, exercise price F_1, and volatility calculated from the regression coefficients

describing the smile. This volatility is 2.04 - 4.59(100/F_1) + 3.84(100/F_1)^2 - 1.09(100/F_1)^3 = 0.2013. Since

it is the first node, the summation terms in the formula for \Delta C_1 are zero. Therefore

\Delta C_1 = 3.6179 and S_A = 108.27.

Then from Equation 23, the lower node is easily derived so S_B = F_1^2 / S_A = 93.7727. The

probabilities are from Equation 17: the probability of moving up to node A from the first

node, p_1 = (F_1 - S_B)/(S_A - S_B) = 0.4820. (Note that this probability is recorded at the first level.)

Therefore the Arrow-Debreu price at node A is (from Equation 20) \Lambda_A = \lambda_1 p_1 e^{-r\Delta t} = 0.4747

and \Lambda_B = \lambda_1 (1 - p_1) e^{-r\Delta t} = 0.51.
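
The level-2 numbers above can be reproduced with a few lines. This is a sketch, not the paper's code: the Black-Scholes routine and the 0.2013 volatility are taken as given from the text, and small differences in the last digit reflect rounding.

```python
import math
from statistics import NormalDist

# Numerical check of the first tree level (nodes A and B): Black-Scholes call at the
# forward strike, then Equations 21-23 and 17.
r, d, dt, S0 = math.log(1.08), math.log(1.04), 0.2, 100.0

def bs_call(S, K, sigma, T):
    d1 = (math.log(S / K) + (r - d + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    N = NormalDist().cdf
    return S * math.exp(-d * T) * N(d1) - K * math.exp(-r * T) * N(d2)

F1 = S0 * math.exp((r - d) * dt)                      # 100.7577
dC1 = math.exp(r * dt) * bs_call(S0, F1, 0.2013, dt)  # about 3.618; summation terms are zero
S_A = F1 * (F1 + dC1) / (F1 - dC1)                    # lambda_1 = 1
S_B = F1 ** 2 / S_A
p1 = (F1 - S_B) / (S_A - S_B)
Lam_A = p1 * math.exp(-r * dt)
Lam_B = (1 - p1) * math.exp(-r * dt)
print(round(F1, 4), round(S_A, 2), round(S_B, 2), round(p1, 4), round(Lam_A, 4), round(Lam_B, 2))
# -> 100.7577, 108.26, 93.77, 0.4821, 0.4747, 0.51 (the text reports S_A = 108.27; the last
#    digit depends on the rounding of the 0.2013 volatility)
```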

The third column of nodes, n=3, is first calculated by setting the price of the middle node, D,

equal to the forward price, so S_D = 100 e^{(r-d)2\Delta t} = 101.5211. (Since there are an odd number of

nodes, calculating the central value is straightforward compared with the case of an even

number of nodes.) Additionally we calculate the forward price from node A,

F_A = 108.2629 e^{(r-d)\Delta t} = 109.0831. From these, we calculate the values for node C, where,

using Equation 24, the price of the stock is

S_C = \frac{\Delta C_C (101.5211) - \lambda_A (109.0831)(109.0831 - 101.5211)}{\Delta C_C - \lambda_A (109.0831 - 101.5211)} ,    (30)

where \Delta C_C = e^{r\Delta t} C(109.0831, 3) - 0, since the summation terms are zero, and C(109.0831, 3) =

2.0852 is the Black-Scholes price of a call option with the indicated strike, 0.4 year maturity,

and volatility from the regression on the smile of 0.1905. Thus \Delta C_C = 2.1175 and

S_C = 119.9606. The transition probability of moving from node A to node C is

p_A = (109.0831 - 101.52)/(119.9606 - 101.52) = 0.4101, and the new Arrow-Debreu price can be found as before.

To calculate the lower node price, which is labeled E, we use Equation 25, so that

S_E = \frac{\lambda_B (94.4832)(101.5211 - 94.4832) - \Delta P_E (101.5211)}{\lambda_B (101.5211 - 94.4832) - \Delta P_E}    (31)

where F_E = 94.4832 = e^{(r-d)0.2} S_B. We gather \lambda_B = 0.51 from the previous node and calculate

\Delta P_E from Equation 26. The Black-Scholes put price, 2.4243, is calculated with exercise price

of 94.4832 and volatility of 0.2142 (again, this volatility is from the regression in Equation

28). Since the summation terms are zero, \Delta P_E = e^{0.077(0.2)}(2.4243) = 2.4619. Therefore S_E is

79.12. From this price we can find the transition probability (recorded at node B) using

Equation 17 as (94.48 - 79.12)/(101.52 - 79.12) = 0.69.

The next column of prices, n=4, (the probabilities and Arrow-Debreu prices can be easily

found from the prices) is found by first using Equations 21 and 23 to find the two central

nodes, since n is even. The forward price is F_4 = 100 e^{(r-d)0.6} = 102.29, and the upper central

node (labeled G in Figure 2) is calculated from this as 102.29 \frac{\lambda_D (102.29) + \Delta C_G}{\lambda_D (102.29) - \Delta C_G}, where \lambda_D

was previously calculated as 0.62 and \Delta C_G is calculated from the call price and the summation

terms. The Black-Scholes call price of an option with an exercise price of 102.29 and

volatility (given from the regression) of 0.1989 is 5.9970. The summation term reads over the

nodes that are above and to the left of node G, which in this case is only node C. This term is

\lambda_C (F_C - F_4), where \lambda_C was previously calculated to be 0.1918, F_C = 120.87

= 119.96 e^{(r-d)\Delta t}, and F_4 is 102.29 (from above). The summation term is 3.5615, so

\Delta C_G = e^{0.2r}(5.9970) - 3.5615 = 2.5286. Therefore the upper central node, G, has price

S_G = 110.78. From Equation 23, the lower central node price (denoted H) is 94.45 = (102.29)^2 / 110.78.

After calculating the two central nodes we move centrifugally, first calculating the upper

nodes then the lower ones. The price at node F is calculated from Equation 24, where

F_F = 119.96 e^{0.2(r-d)} = 120.87 and S_i is, from the node below (the G node), equal to 110.78.

The \lambda_C is 0.19 and we only need to calculate \Delta C_F. The summation terms are zero since it is

at the edge of the tree. The Black-Scholes price of a call is calculated for an exercise of

120.87 and volatility of 0.1819 (since the node is beyond 120% of the initial spot price we do

not extrapolate but use the value truncated at 120). This call price is 0.8649 and so

\Delta C_F = 0.8783. From this we get

S_F = \frac{0.8783(110.78) - (0.19)(120.87)(120.87 - 110.78)}{0.8783 - (0.19)(120.87 - 110.78)} .    (32)

The lower node price, labeled I, uses the price at the node just above it (at H), which is S_{i+1} in

Equation 25, so S_H = 94.45. The price is

S_I = \frac{\lambda_E F_I (S_H - F_I) - \Delta P_I S_H}{\lambda_E (S_H - F_I) - \Delta P_I}    (33)

where F_I = 79.72 = 79.12 e^{(r-d)0.2}. The \Delta P_I term is calculated from the put price with exercise

79.72 and appropriate volatility (again truncated, set at 0.2696), which is 1.0194, so

\Delta P_I = 1.0352. Putting these terms into the formula above (\lambda_E was previously calculated) gives

S_I = 67.89.

Finally, in order to ensure that we have adequately explained our procedure, we show the

calculation for a node at the end of the tree, labeled J. Once the first four levels of nodes have

been calculated, the upper central node, just below J, is calculated to be 112.45 (since there

are an even number of nodes, the procedure from the steps with n=2 and n=4 is used again).

The stock price at J, S_J, is given from Equation 24 as

S_J = \frac{\Delta C_J (112.4492) - \lambda_{4,4} (121.1406)(121.1406 - 112.4492)}{\Delta C_J - \lambda_{4,4} (121.1406 - 112.4492)} ,    (34)

where \lambda_{4,4} = 0.2455 is the Arrow-Debreu price at the fourth level, fourth node up from the

bottom, and

\Delta C_J = e^{r\Delta t} C(121.1406, 5) - 1.0236 ,    (35)

where C(121.1406, 5) = 2.0841 is the Black-Scholes call price at the given strike, maturity of 1

year, and volatility from the smile of 0.1819 (again, this node is beyond 120% of the initial

spot price, so we use the level at 120). The summation term is \lambda_{4,5}(F_{4,5} - F_{4,4}), where

F_{4,i} = e^{(r-d)\Delta t} S_{4,i} and \lambda_{4,5} is the Arrow-Debreu price at the fourth level, fifth node up, so that

1.0236 = 0.0420(145.5104 - 121.1406), since we only sum over the nodes above and to the left

of J. Thus S_J = 130.2670.

From this simple five-step example, the general method should be clear. Table 2 shows the

option prices from a 50-step tree using the specified smile from the regression. These prices

reflect the early-exercise premium as well as the historically implied volatility skew. The

Cox-Ross-Rubinstein (CRR) implied volatilities are also shown (Cox, Ross, and Rubinstein

1979). These volatilities, of course, are different than the Black-Scholes volatilities from

Table 1, reflecting the early-exercise premium. The volatility skew, reflected in the tree by

the varying transition probabilities and prices, does not conform to the CRR assumptions.

The higher CRR volatility comes from the relatively thicker tails of the historical distribution:

there is a larger chance that the option will expire in the money and so the price (therefore

volatility) is higher.
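
For readers who want to reproduce the last column of Table 2, the sketch below prices an American call on a flat-volatility CRR tree and bisects on the volatility. The 50-step count, the continuous dividend yield treatment, and the bisection tolerances are our assumptions, so the result should land near, not necessarily exactly on, the reported values.

```python
import math

def crr_american_call(S0, K, r, d, sigma, T, steps=50):
    """American call on a flat-volatility Cox-Ross-Rubinstein tree with continuous yield d."""
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))
    dn = 1.0 / u
    disc = math.exp(-r * dt)
    p = (math.exp((r - d) * dt) - dn) / (u - dn)          # risk-neutral up probability
    # terminal payoffs
    values = [max(S0 * u ** j * dn ** (steps - j) - K, 0.0) for j in range(steps + 1)]
    # backward induction with an early-exercise check at each node
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            cont = disc * (p * values[j + 1] + (1 - p) * values[j])
            exercise = S0 * u ** j * dn ** (n - j) - K
            values[j] = max(cont, exercise)
    return values[0]

def crr_implied_vol(price, S0, K, r, d, T, lo=0.01, hi=2.0):
    """Back out the flat CRR volatility that reproduces a given American call price."""
    for _ in range(80):                                   # simple bisection
        mid = 0.5 * (lo + hi)
        if crr_american_call(S0, K, r, d, mid, T) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# e.g. back out the volatility consistent with the strike-100 price reported in Table 2
print(round(crr_implied_vol(11.73, 100.0, 100.0, math.log(1.08), math.log(1.04), 1.0), 3))
# should land near the 0.261 reported in Table 2
```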

These prices can provide a baseline for judging the relative prices of American options. This

baseline is constructed entirely using data on the underlying security--not the quoted options

prices. It gives an independent measure of the options prices. Alternatively, with some more

computation, we could match a quoted American option price (instead of a European option

price). From that we could find the rest of the (American) volatility skew that is consistent

with the historical distribution.

6. Conclusions

This paper shows how to build a binomial tree that is consistent with the historical

distribution and that can be used to value American options. The advantage of using the risk-

neutralized historical distribution (RNHD) is that it avoids making restrictive assumptions

about the distribution of the returns. Where we lack knowledge (about the higher moments

and the behavior of the tails) we will not impose strict assumptions but rather will use the past

as our guide. Only where we have specific knowledge or a strong theoretical rationale do we

make particular assumptions.

The RNHD method of Stutzer provides a way to use data on just the underlying security to

imply options valuations. As Zou and Derman discuss, this method of using only data on the

underlier has two virtues. First, options (or, for portfolio analyses, assets with option-like

characteristics) can be valued even if option markets are thin or absent. Second, in cases

where there is ample data on option prices, the historically based simulation method is still

useful because it provides an independent baseline. These papers and subsequent work,

however, were only able to value European options. This paper shows a method by which

early-exercise American options can be valued. Further research should test the accuracy

of prices generated by this method against those from other models.

Tree-generating methods that use option prices as inputs do not leave any degrees of freedom

to judge whether an option is dear or cheap, but this kernel-based historically consistent

simulation procedure can give stable benchmark option values. This paper's contribution

extends the Stutzer methodology to build binomial trees that can be used to value American

options.

Bibliography

Andrews, Donald W.K. (1994). "Empirical Process Methods in Econometrics," Cowles


Foundation Discussion Paper No. 1059.

Barle, Stanko and Nusret Cakici (1998). "How to Grow a Smiling Tree," Journal of
Financial Engineering, 7, 127-46.

Buchen, Peter W., and Michael Kelly (1996). "The Maximum Entropy Distribution of an
Asset Inferred from Option Prices," Journal of Financial and Quantitative Analysis,
31, 143-59.

Cakici, Nusret, and Kevin R. Foster (2001). "Risk-Neutralized At-the-Money Consistent


Historical Distributions in Currency Options Pricing," Journal of Computational
Finance, forthcoming.

Chapman, David A., and Neil D. Pearson (2000). "Is the Short Rate Drift Actually
Nonlinear?" Journal of Finance, 55(1), 355-88.

Cox, John C., Stephen Ross, and Mark Rubinstein (1979). "Option Pricing: A Simplified
Approach," Journal of Financial Economics, 7, 229-63.

Derman, Emanuel and Iraj Kani (1994). "Riding on the Smile," Risk, 7, 32-39.

Dupire, Bruno (1994). "Pricing with a Smile," Risk, 7, 18-20.

Grandmont, J.M. (1993). "Expectations Driven Nonlinear Business Cycles," Proceedings of


the German Academy of Sciences, Westdeutscher Verlag.

Kapur, J.N., and H.K. Kesavan (1992). Entropy Optimization Principles with Applications.
New York: Academic Press Harcourt Brace Jovanovich.

Rubinstein, Mark (1994). "Implied Binomial Trees," Journal of Finance 49 (3), 771-818.

Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis. New York:
Chapman and Hall.

Stutzer, Michael (1996). "A Simple Nonparametric Approach to Derivative Security


Valuation," Journal of Finance, 51, 1633-52.

Zou, Joseph and Emanuel Derman (1999). "Strike-Adjusted Spread: A New Metric for
Estimating the Value of Equity Options," Goldman Sachs Quantitative Strategies
Research Notes.

Figure 1

Comparison of Historical PDF of S&P100 returns (solid line) to Normal (dashed line)

Historical pdf is estimated with an Epanechnikov kernel with optimal bandwidth, as described in the text.

Figure 2

Implied Binomial Tree for one-year options, illustrated with 5 steps

[Five-step binomial tree diagram: at each node the stock price, up-move probability, and Arrow-Debreu price are shown, with nodes A through J labeled as in the text.]

At each node, the top entry is the stock price (beginning at 100); the middle entry is the

probability of a subsequent upward move, pi ; the lowest entry at each node is the Arrow-

Debreu price, λi . The labels, A, B, C, ... J, refer to the nodes that are explicitly calculated in

the paper.

Table 1
Prices of 1-year European Call Options, with Implied Volatilities, from
nonparametric kernel estimation RNHD procedure
Strike/Price    Option Price    Black-Scholes Implied Volatility
80 24.20 0.270
85 20.05 0.245
90 16.18 0.227
91 15.45 0.224
92 14.73 0.221
93 14.03 0.218
94 13.35 0.215
95 12.68 0.213
96 12.04 0.211
97 11.41 0.209
98 10.80 0.207
99 10.21 0.205
100 9.64 0.203
101 9.09 0.201
102 8.56 0.200
103 8.05 0.198
104 7.56 0.197
105 7.09 0.195
106 6.63 0.194
107 6.20 0.193
108 5.79 0.191
109 5.40 0.190
110 5.03 0.189
115 3.47 0.185
120 2.32 0.182

Table 2
American Call Prices from 50-step binomial tree with historically-based volatility skew
Strike/Price    American Call Price (historically-based volatility skew)    CRR Implied Volatility
80 26.99 0.387
85 22.79 0.345
90 18.80 0.312
91 18.02 0.307
92 17.24 0.301
93 16.46 0.293
94 15.74 0.287
95 15.05 0.282
96 14.36 0.276
97 13.66 0.271
98 12.97 0.267
99 12.32 0.263
100 11.73 0.261
101 11.13 0.256
102 10.54 0.252
103 9.95 0.247
104 9.38 0.243
105 8.89 0.242
106 8.41 0.240
107 7.93 0.239
108 7.45 0.235
109 6.97 0.231
110 6.58 0.229
115 4.74 0.221
120 3.30 0.214

