AlexanderNotes Module III

Lesson 10: Markov, Wiener, and Ito

Alright sports fans, it’s about to get real.

We've been using stochastic processes for the better part of our options pricing discussions
to this point, so it's probably best to define them more specifically.

Simply put, a stochastic process arises when a variable changes in an uncertain way.

Think about a single linear regression:

f ( x ) = α + β x + ε

This is not a stochastic process (assuming epsilon is small and normally distributed).

Since f(x) is a function of x, we expect every next iteration of f(x) to be alpha plus beta times x. In
contrast, a stochastic process would have a (seemingly) random outcome.

A subset of stochastic processes is the Markov process. Markov processes are stochastic
processes exhibiting the Markov property. The Markov property describes a memory-less
variable, in which only the present value is a relevant determinant of future value. Said another
way, past values have no predictive materiality.

If I define the set M = {1, 2, 3, 4} as a discrete Markov process, then which addition to the set
would be more likely – the value 3 or 5?

Exactly.

In this Markov process, the values 1, 2, and 3 have no effect on the next addition to the set; only
4 has predictive value. Given that one value is only a single point, we can’t determine a trend.
Without a trend, we lack clarity into the direction of the change, just its relative proximity to the
present value. Therefore, we would not be able to say with confidence which new value would
be more likely; in this space, 3 and 5 are equally likely.

So, let’s take a second to examine the evolution of a Markov process.

The following example is a Markov process in two dimensions (which means in this example the
process can move in four directions).

[Figures: the two-dimensional process after Step 1, Step 2, Step 5, Step 15, Step 100, Step 1,000, and Step 10,000]
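
The evolution pictured above can be reproduced in a few lines of code. What follows is a minimal sketch, assuming unit steps on a grid with each of the four directions equally likely (the notes do not state the step rule behind the plots). The name `random_walk_2d` and the seed are illustrative only.

```python
import random

# The four directions available to a two-dimensional process on a grid.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # right, left, up, down


def random_walk_2d(n_steps, seed=None):
    """Return the list of (x, y) positions visited, starting at the origin."""
    rng = random.Random(seed)
    x, y = 0, 0
    path = [(x, y)]
    for _ in range(n_steps):
        # Markov property: the next position depends only on the current (x, y).
        dx, dy = rng.choice(MOVES)
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Print where the process ends up after each step count used in the figures.
    for n in (1, 2, 5, 15, 100, 1000, 10000):
        print(n, random_walk_2d(n, seed=42)[-1])
```

Plotting the returned positions (with any charting tool) gives pictures of the same kind as the figures; the exact shape depends on the seed.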

Again, another interesting connection: we've just used an underlying tenet of modern financial
theory to draw a seahorse. And remember, this is only a discrete process in two dimensions;
let's see what three dimensions looks like.

[Figures: the three-dimensional process after Step 1, Step 200, Step 1,000, Step 10,000, and Step 100,000]

Stock returns are typically assumed to be Markov processes; after all, does today's price or last
year's price matter more to determine tomorrow's price? Of course, today's price matters more.
Now, are stock prices absolutely Markov processes? I think a strong case can be made to suggest
that they are not; however, the Markov property fits better than other mathematical
assumptions.

Moving onward. The mechanics of Markov arithmetic are not overly complicated. Again, if we
consider stock prices Markov processes, and if we describe a stock's one-year return by the
normal distribution $\phi(\mu, \sigma^2)$, where mean = 0 and variance = 1, then the return of the stock
in one year should be independent from the return of the stock in the following year. Given
their independence, the two-year return of the stock should be the addition of the independent
returns in years one and two. Therefore, the return of the stock after two years is distributed
$\phi(0, 2)$, with a standard deviation of $\sqrt{2}$. Further, this property can be used to describe a
shorter period of time as well; thus the return of the stock over three months is distributed
$\phi(0, 1/4)$, with a standard deviation of $1/2$.

So, this last example actually describes a subset of Markov processes known as the Wiener
process. Yes, that’s right, go ahead and get the giggle out now; you’re going to see this quite a
bit going forward.
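
The additive arithmetic in the two-year and three-month example above can be checked by simulation. This is a rough sketch using only the Python standard library; the function name, trial count, and seed are assumptions made for illustration.

```python
import random
import statistics


def std_of_total_return(n_periods, period_sd, n_trials=50_000, seed=0):
    """Sample standard deviation of the sum of n independent normal returns."""
    rng = random.Random(seed)
    totals = [
        sum(rng.gauss(0.0, period_sd) for _ in range(n_periods))
        for _ in range(n_trials)
    ]
    return statistics.pstdev(totals)


if __name__ == "__main__":
    # Two independent one-year returns, each phi(0, 1): variances add to 2,
    # so the standard deviation should be close to sqrt(2).
    print(std_of_total_return(2, 1.0))
    # Four independent quarters, each phi(0, 1/4) (sd 1/2): variances add to 1,
    # so the standard deviation should be close to 1.
    print(std_of_total_return(4, 0.5))
```

The point is that variances add across independent periods, while standard deviations scale with the square root of time.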

The Wiener process is a Markov process whose change over one unit of time is normally
distributed with a mean of zero and a variance of one. You may be more familiar with the
concept of the Wiener process in particle physics, where it is referred to as Brownian motion.

The Wiener process has two properties:

(1) $\Delta z = \varepsilon \sqrt{\Delta t}$, where epsilon is $\phi(0, 1)$
(2) The mean and variance of $\Delta z$ are $(0, \Delta t)$

Again, given the independence of each increment, we can define the change in z over a period
T, made up of $N = T / \Delta t$ steps, as:

$z(T) - z(0) = \sum_{i=1}^{N} \varepsilon_i \sqrt{\Delta t}$ [EQ 10.01]

Let’s take a look at a Wiener process graphically.

One year @ $\Delta t = 1/100$:

One year @ $\Delta t = 1/1{,}000$:
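
Paths like the ones charted here can be generated directly from EQ 10.01. The sketch below is one possible way to do it with the standard library; `wiener_path` and its parameters are hypothetical names, not part of the notes.

```python
import math
import random


def wiener_path(T=1.0, n_steps=100, seed=None):
    """Build z(0), z(dt), ..., z(T) by summing epsilon_i * sqrt(dt)."""
    rng = random.Random(seed)
    dt = T / n_steps
    z = [0.0]
    for _ in range(n_steps):
        epsilon = rng.gauss(0.0, 1.0)        # epsilon is phi(0, 1)
        z.append(z[-1] + epsilon * math.sqrt(dt))
    return z


if __name__ == "__main__":
    coarse = wiener_path(T=1.0, n_steps=100, seed=1)    # dt = 1/100
    fine = wiener_path(T=1.0, n_steps=1000, seed=1)     # dt = 1/1,000
    print(len(coarse), coarse[-1])
    print(len(fine), fine[-1])
```

Whatever step size is used, the value at the end of the year has a mean of zero and a variance of T; a smaller step only makes the path look more jagged.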

So, as delta t approaches zero, we can generalize about delta z, and instead refer to it as dz. Now,
this Wiener process still assumes a mean of zero and a variance of one, but we can adjust for a
more practical application of the Wiener process to account for both a drift rate and a variance
rate.

In a generalized form, we can express this Wiener process as:

dx = adt + bdz [EQ 10.02]

In EQ 10.02, a is the drift rate, which in the basic form of the Wiener process would equal zero.
The b dz element gives this equation its Wiener property; without it, dx = a dt would integrate to
x = x0 + at, a deterministic function of time (with x0 the starting value, and x here regarded as a
position with respect to either value or space-time). In a generalized Wiener process, a and b are
constants.

To get a sense of how the drift coefficient and variance coefficient affect the path of the
process, let’s chart two processes and mess with them.

In this iteration, both the red and green processes have the same initial value (0), and equal drift
and variance coefficients (0 and 0.5, respectively). Notice the independence.

In this iteration, the red process has a drift coefficient set to +1, while the green process has a
drift coefficient set to -1. Notice the difference between how the two processes move versus
each other and versus the original plot.

In this last iteration, the drifts were reset to zero, but the variance coefficients were changed;
the red process has a variance coefficient of +1, while the green process has a variance
coefficient of 0, effectively removing the Wiener property from the process.
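
To experiment with the drift and variance coefficients yourself, EQ 10.02 can be stepped forward in small increments of time. This is a minimal sketch; the function name is made up for illustration, and the variance coefficient of 0.5 used in the drift example is an assumption, since the notes only give the drift values for that chart.

```python
import math
import random


def generalized_wiener_path(a, b, x0=0.0, T=1.0, n_steps=1000, seed=None):
    """Discretize dx = a dt + b dz with constant a (drift) and b (variance coefficient)."""
    rng = random.Random(seed)
    dt = T / n_steps
    x = [x0]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, 1.0) * math.sqrt(dt)   # Wiener increment
        x.append(x[-1] + a * dt + b * dz)
    return x


if __name__ == "__main__":
    red = generalized_wiener_path(a=1.0, b=0.5, seed=1)     # drift +1
    green = generalized_wiener_path(a=-1.0, b=0.5, seed=2)  # drift -1
    flat = generalized_wiener_path(a=0.0, b=0.0)            # no drift, no dz term
    print(red[-1], green[-1], flat[-1])
```

With b set to 0 the dz term vanishes and the path is a straight line with slope a, which is the behavior described for the green process in the last chart.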

Now that we’ve got the generalized Wiener process under control, let’s take another step
forward.

As we just stated, the generalized Wiener process uses constants a and b to define both drift and
variance coefficients. But what happens if we relax that constraint to allow the coefficients to
become functions of the variable x and t?

Well, then you get Ito’s process.

dx = a ( x, t ) dt + b ( x, t ) dz [EQ 10.03]

That seems to make sense, right? After all, just like a rocket ship's trajectory, a stock price's drift
coefficient isn't actually a constant; it is the expected return, which is independent of the price of
the stock, scaled by that price. Said another way, the drift coefficient needs to be normalized by
the price of the underlying. Think about this as drift divided by the stock price equals expected
return, and expected return should not change just because the value of the underlying changes.
Therefore, drift cannot be constant.

Now, if the change in the stock price were only a function of time (let’s ignore volatility for a
moment), then we could adjust our process (switching out x for S):

dS = µ Sdt [EQ 10.04]

Luckily, this point is pretty simple; essentially, the change in the value of the underlying is the
expected return of the underlying multiplied by the value of the underlying adjusted for the
period of time. Makes sense, but this is not a Wiener process anymore; there’s no variance
coefficient.

To reenter Wiener space, let's stop ignoring volatility. In the same way that we reimagined the
drift constant into expected return for the underlying, independent of the underlying price, we
can (and need to) do the same thing for the variance coefficient. Think about it this way: if a
stock price doubles today, are you any more or less certain of the outcome tomorrow? Well, no,
probably not. All else equal, the volatility of a stock return is not linked to its price. Therefore we
can expand EQ 10.04 to include volatility and reclaim the Wiener property.

dS = µ Sdt + σ Sdz [EQ 10.05]

The percentage change in underlying value equals:

$\frac{dS}{S} = \mu \, dt + \sigma \, dz$ [EQ 10.06]
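
EQ 10.05 can be turned into a simple price simulator by replacing dt with a small time step. The sketch below uses a plain Euler step, which is an approximation chosen here for readability rather than anything prescribed by the notes; the function name and the default of 252 steps per year are assumptions.

```python
import math
import random


def simulate_stock_path(S0, mu, sigma, T=1.0, n_steps=252, seed=None):
    """Euler discretization of dS = mu*S*dt + sigma*S*dz."""
    rng = random.Random(seed)
    dt = T / n_steps
    path = [S0]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, 1.0) * math.sqrt(dt)
        S = path[-1]
        # Both the drift and the volatility terms scale with the current price.
        path.append(S + mu * S * dt + sigma * S * dz)
    return path


if __name__ == "__main__":
    path = simulate_stock_path(S0=50.0, mu=0.10, sigma=0.30, seed=1)
    print(path[0], path[-1])
```

Setting sigma to 0 recovers EQ 10.04: the price simply compounds at the expected return.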

Now, if you're paying attention, a swell of relief should overtake you, because within the
context of an Ito process, we've just defined the change in a stock price as a function of the
expected return and time, and variance and time. This is a premise that most financial
professionals can accept (and to which many cling).

So, two quick points before we leap forward again – (1) thus far expected return and variance
are not linked; though, one would/should expect more/less return given a change to the level of
risk in the portfolio. (2) expected return is not critically important to the valuation of derivatives;
the price of the underlying and the volatility of the underlying are critical to the valuation of
derivatives.

Ito’s Lemma
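So here's the thing: in 1951, K. Ito showed that a function G(S, t) of a variable S that follows
EQ 10.05 itself follows the process:

$dG = \left( \frac{\partial G}{\partial S} \mu S + \frac{1}{2} \frac{\partial^2 G}{\partial S^2} \sigma^2 S^2 + \frac{\partial G}{\partial t} \right) dt + \frac{\partial G}{\partial S} \sigma S \, dz$ [EQ 10.07]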

Yes. I know at first this may look intimidating, but look harder, what do you see?

Exactly, Ito used the partial derivatives of his function G to redefine the expected return of the
process as a portfolio of delta, gamma, and theta.

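Next step, define G(S, t) as ln S.

So:
$G = \ln S$
$\therefore \Delta_G = \frac{\partial G}{\partial S} = \frac{1}{S}$
$\therefore \Gamma_G = \frac{\partial^2 G}{\partial S^2} = -\frac{1}{S^2}$
$\therefore \theta_G = \frac{\partial G}{\partial t} = 0$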

Using these values as substitutes for EQ 10.07, we get:

$dG = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma \, dz$ [EQ 10.08]
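
For anyone who wants to see the substitution spelled out, here is the intermediate step written term by term. It only uses the three partial derivatives of ln S listed above and EQ 10.07.

```latex
\begin{align*}
dG &= \left( \frac{1}{S}\,\mu S
      + \frac{1}{2}\left(-\frac{1}{S^2}\right)\sigma^2 S^2
      + 0 \right) dt
      + \frac{1}{S}\,\sigma S \, dz \\
   &= \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma \, dz
\end{align*}
```

Every S cancels, which is why the drift and variance coefficients of ln S no longer depend on the price.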

So what did this get us?

Well, our drift coefficient is now defined by the constant expected return and variance, both of
which are independent from the price of the underlying. Additionally, our variance coefficient is
also constant and independent of the price of the underlying. Therefore, the change in Ito's
function G (in this case ln S) between time 0 and time T is normally distributed around the
mean $\left( \mu - \frac{\sigma^2}{2} \right) T$, with variance $\sigma^2 T$.
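
The drift of ln S being mu minus half the variance, rather than mu itself, is easy to doubt, so here is a rough numerical check. The sketch simulates the price process of EQ 10.05 in small Euler steps and averages ln(S_T / S_0) across paths; the function name, path count, step count, and seed are all assumptions for illustration.

```python
import math
import random


def mean_log_return(mu, sigma, T=1.0, n_steps=250, n_paths=4000, seed=0):
    """Average of ln(S_T / S_0) over Euler-simulated paths of dS = mu*S*dt + sigma*S*dz."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        S = 1.0  # start at S_0 = 1, so ln(S_T / S_0) = ln(S_T)
        for _ in range(n_steps):
            S *= 1.0 + mu * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        total += math.log(S)
    return total / n_paths


if __name__ == "__main__":
    mu, sigma, T = 0.10, 0.40, 1.0
    print("simulated mean of ln(S_T/S_0):", mean_log_return(mu, sigma, T))
    print("(mu - sigma^2 / 2) * T       :", (mu - sigma ** 2 / 2) * T)
    print("mu * T                       :", mu * T)
```

With these inputs the simulated average should land near 0.02, not near 0.10, up to sampling noise.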

So, the only question left is why did Ito insist upon using the lognormal of the underlying, and
not the normal distribution of the underlying as his function G?

First, let’s make sure we know what they look like.

Normal Distribution
[Figure: P(x) and D(x) plotted against x]

Lognormal Distribution
[Figure: P(x) and D(x) plotted against x]

Second (and kind of a key point), if a random variable is lognormally distributed, then if we take
the logarithm of the random variable, it becomes normally distributed.

Lognormal distributions often characterize a random variable that is the product of several
positive, independent variables. Additionally, lognormal distributions represent values
between zero and infinity, clearly with a positive skew. This tends to represent the value of stock
prices, as stock prices stop at zero (on the left side).

Congratulations, you now understand Markov, Wiener, and Ito – I think you’re ready for Black,
Scholes, and Merton.

Lesson 11.0 – Black Scholes Merton

So let's be very clear: Black Scholes Merton (BSM) is not magic. It does not miraculously yield
the absolute, correct option value for all options. BSM (in its pure form) values European
options on non-dividend paying stocks. BSM represents an inspired way to approach option
valuation via a closed form solution, but it is not the "end all, be all"; it's just a good way to value
European options on non-dividend paying stocks.

Now that that’s out of the way, let’s continue.

In 1997, Merton and Scholes won the Nobel Prize for economics for the Black Scholes Merton
Model; Black died in 1995 (very Jonathan Larson).

So let’s get under the hood; how does BSM work?

The first major assumption in BSM is that percentage changes in the stock price over a short
period of time are normally distributed (so that the stock price itself is lognormally distributed),
and this distribution can be defined by (μ, σ). Furthermore, the mean return in time Δt equals
$\mu \Delta t$, and the standard deviation equals $\sigma \sqrt{\Delta t}$; therefore, just like in Ito's process, the
percentage change in price in time is a function of the expected mean return in time and the
variance in time.

Therefore, as in Ito’s Lemma, we can use his function G as the ln S. Looking for the change in
price leaves us with:

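$\ln S_T - \ln S_0 \sim \phi\left[ \left( \mu - \frac{\sigma^2}{2} \right) T, \; \sigma^2 T \right]$
$\therefore \ln \frac{S_T}{S_0} \sim \phi\left[ \left( \mu - \frac{\sigma^2}{2} \right) T, \; \sigma^2 T \right]$
$\therefore \ln S_T \sim \phi\left[ \ln S_0 + \left( \mu - \frac{\sigma^2}{2} \right) T, \; \sigma^2 T \right]$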

Let’s take another look at a lognormal distribution:

$E(S_T) = S_0 e^{\mu T}$ [EQ 11.01]
[Figure: lognormal distribution, P(x) and D(x) plotted against x]

Given the mean (E[S]), the variance of the lognormal distribution of the stock at time T equals:

$\mathrm{var}(S_T) = S_0^2 e^{2 \mu T} \left( e^{\sigma^2 T} - 1 \right)$ [EQ 11.02]
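
To put numbers on EQ 11.01 and EQ 11.02, the two formulas can be coded directly. The helper name and the sample inputs below (a $20 stock, 20% expected return, 40% volatility, one year) are illustrative assumptions, not values from the notes.

```python
import math


def lognormal_mean_var(S0, mu, sigma, T):
    """Mean and variance of S_T under the lognormal assumption."""
    mean = S0 * math.exp(mu * T)                                   # EQ 11.01
    var = S0 ** 2 * math.exp(2 * mu * T) * (math.exp(sigma ** 2 * T) - 1.0)  # EQ 11.02
    return mean, var


if __name__ == "__main__":
    mean, var = lognormal_mean_var(S0=20.0, mu=0.20, sigma=0.40, T=1.0)
    print("E(S_T)   =", round(mean, 2))
    print("var(S_T) =", round(var, 2))
    print("sd(S_T)  =", round(math.sqrt(var), 2))
```

Note that the variance is in price-squared units; the square root gives a standard deviation in dollars that can be compared with the mean.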

Further, we can use these parameters of a stock’s lognormal distribution to define the
continuously compounded rate of return earned over the time period from t=0 to t=T.

So, if:

$S_T = S_0 e^{xT}$, then:

$x = \frac{1}{T} \ln \frac{S_T}{S_0}$ [EQ 11.03]
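
A quick worked example of EQ 11.03, using made-up numbers: suppose a stock moves from $S_0 = 50$ to $S_T = 55$ over six months ($T = 0.5$).

```latex
\begin{align*}
x &= \frac{1}{T} \ln \frac{S_T}{S_0}
   = \frac{1}{0.5} \ln \frac{55}{50}
   = 2 \ln 1.1
   \approx 0.1906
\end{align*}
```

So a 10% gain over half a year corresponds to a continuously compounded return of roughly 19.06% per annum.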

Alright, now that we have rekindled our understanding of lognormal distributions, let's push
deeper into BSM.

There are seven assumptions for BSM:
1. Stock prices follow the process in EQ 10.05; mu and sigma are constant.
2. Negative positions in the stock are allowed.
3. Ignore transaction costs and taxes.
4. Ignore dividends.
5. There are no riskless arbitrage opportunities.
6. Time is continuous.
7. Risk-free rate is constant, available, and the same across securities.

Okay, so if we remember from the previous lesson, the equation:

dS = µ Sdt + σ Sdz

can describe a stock price process, where the dz was the portion of the equation that made it a
stochastic process. Furthermore, Ito showed us that the change in his function G can be found
through the partial differentiation of the stock process:

$dG = \left( \frac{\partial G}{\partial S} \mu S + \frac{1}{2} \frac{\partial^2 G}{\partial S^2} \sigma^2 S^2 + \frac{\partial G}{\partial t} \right) dt + \frac{\partial G}{\partial S} \sigma S \, dz$

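Our next step is to allow G to equal f, the price of a derivative of the underlying S.

$df = \left( \frac{\partial f}{\partial S} \mu S + \frac{1}{2} \frac{\partial^2 f}{\partial S^2} \sigma^2 S^2 + \frac{\partial f}{\partial t} \right) dt + \frac{\partial f}{\partial S} \sigma S \, dz$ [EQ 11.04]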
Now, the assumption is that there are intervals of t where the changes to dz are the same for
both the derivative process and the stock process. If this is true (that is, if the same source of
uncertainty drives both the underlying and the derivative), then we can construct a portfolio of
the underlying and the derivative such that the Wiener process (the dz term) is eliminated.
Essentially, what we're looking for here is the ability to perfectly hedge underlying exposure to
zero. Thus, if we assume ownership of the underlying, then we must short the derivative
(assuming not a put option; it should be pretty safe to assume a call option unless otherwise
stated). Thus, an offsetting position that eliminates equity exposure, and thereby eliminates
the Wiener process, would be short one derivative against a long position of
$\frac{\partial f}{\partial S}$ shares:

$-f$ against $+\frac{\partial f}{\partial S} S$

Now, the df/dS should look familiar as it’s just the derivative’s delta. Therefore, any residual
exposure not hedged will affect the value of the portfolio. Thus:

$\Pi = -f + \frac{\partial f}{\partial S} S$ [EQ 11.05]

Therefore, any change in the value of the portfolio is given as:

$\Delta \Pi = -\Delta f + \frac{\partial f}{\partial S} \Delta S$

Written another way:

$\Delta \Pi = \left( -\frac{\partial f}{\partial t} - \frac{1}{2} \frac{\partial^2 f}{\partial S^2} \sigma^2 S^2 \right) \Delta t$

Remember, no dz means no Wiener process, which means no risk across the change in t.
Therefore, the portfolio must earn the risk-free rate over that interval:

ΔΠ = rΠΔt [EQ 11.06]

Through a substitution of equations, we are left with:

$rf = \frac{\partial f}{\partial t} + rS \frac{\partial f}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 f}{\partial S^2}$ [EQ 11.07]
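
The "substitution of equations" goes by quickly, so here it is one step at a time. This is a sketch of the algebra using the discrete forms of EQ 10.05 and EQ 11.04, together with EQ 11.05 and EQ 11.06; nothing new is assumed.

```latex
\begin{align*}
\Delta S &= \mu S \,\Delta t + \sigma S \,\Delta z \\
\Delta f &= \left( \frac{\partial f}{\partial S}\mu S
          + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2
          + \frac{\partial f}{\partial t} \right)\Delta t
          + \frac{\partial f}{\partial S}\sigma S \,\Delta z \\
\Delta\Pi &= -\Delta f + \frac{\partial f}{\partial S}\Delta S
           = \left( -\frac{\partial f}{\partial t}
           - \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2 \right)\Delta t \\
\left( -\frac{\partial f}{\partial t}
  - \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2 \right)\Delta t
  &= r\left( -f + \frac{\partial f}{\partial S} S \right)\Delta t \\
rf &= \frac{\partial f}{\partial t}
    + rS\frac{\partial f}{\partial S}
    + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 f}{\partial S^2}
\end{align*}
```

Both the $\mu$ terms and the $\Delta z$ terms cancel in the third line, which is why the expected return of the stock does not appear in the final equation.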

Again, note the similarities between this equation and many before it. The risk-free rate
multiplied by the value of the derivative equals theta plus the return of the underlying times
delta plus the gamma coefficient times gamma.

Excellent, now let’s put this into an even more familiar context.

The value of a derivative should be worth the difference between the current price of the
underlying less the present value of the delivery or strike price. We’ve already seen this in EQ
2.434:

$f = (F_0 - K) e^{-rT}$

If we change this slightly to conform to our new variables, this same expression translates
directly into:

$f = S - K e^{-r(T - t)}$ [EQ 11.08]

Therefore, we can translate EQ 11.08 into its counterpart Greek portfolio format:

$\theta = \frac{\partial f}{\partial t} = -rKe^{-r(T - t)}$
$\Delta = \frac{\partial f}{\partial S} = 1$
$\Gamma = \frac{\partial^2 f}{\partial S^2} = 0$
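
As a sanity check, these three Greeks can be dropped into EQ 11.07 to confirm that the forward-style value in EQ 11.08 satisfies it.

```latex
\begin{align*}
\theta + rS\,\Delta + \frac{1}{2}\sigma^2 S^2\,\Gamma
  &= -rKe^{-r(T-t)} + rS \cdot 1 + \frac{1}{2}\sigma^2 S^2 \cdot 0 \\
  &= r\left( S - Ke^{-r(T-t)} \right) \\
  &= rf
\end{align*}
```

The volatility term drops out entirely because gamma is zero, so the value of this position does not depend on sigma.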

So… moving right along – see if you notice anything familiar in the BSM pricing formulas.

CALL
$c = S_0 N(d_1) - K e^{-rT} N(d_2)$
$d_1 = \frac{\ln(S_0 / K) + \left( r + \frac{\sigma^2}{2} \right) T}{\sigma \sqrt{T}}$ [EQs 11.09]
$d_2 = d_1 - \sigma \sqrt{T}$

PUT
$p = K e^{-rT} N(-d_2) - S_0 N(-d_1)$ [EQ 11.10]

NOTE: if you use Excel, the corresponding function for N() is NORMSDIST.

So, what does the output of BSM look like?

The above represents a European option with T=0.25, vol=0.3, r=0.04, S=50, and K=65.
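
If you would rather compute the values than read them off a chart, EQs 11.09 and EQ 11.10 fit in a few lines. The sketch below uses only the standard library, building N() from `math.erf`; the function names are illustrative, and the inputs are the ones quoted above (T=0.25, vol=0.3, r=0.04, S=50, K=65).

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution, N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_prices(S0, K, r, sigma, T):
    """Return (call, put) for a European option on a non-dividend paying stock."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    call = S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)
    put = K * math.exp(-r * T) * norm_cdf(-d2) - S0 * norm_cdf(-d1)
    return call, put


if __name__ == "__main__":
    call, put = bsm_prices(S0=50.0, K=65.0, r=0.04, sigma=0.30, T=0.25)
    print("call:", round(call, 4))
    print("put: ", round(put, 4))
    # Put-call parity: c - p should equal S0 - K * exp(-rT).
    print("c - p:", round(call - put, 4), "vs", round(50.0 - 65.0 * math.exp(-0.04 * 0.25), 4))
```

The call here is well out of the money, so expect a small call value and a put value close to the discounted strike less the stock price.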

American Options
Now, we demonstrated in an earlier lesson that calls should not be exercised early, but now that
we have a better grasp on the tools needed to analyze derivatives more completely, let's
re-examine the case of the American option.

American options differ from European options insofar as European options can only be
exercised at expiration; American options, on the other hand, can be exercised at any time up
until expiration.

So, if there is no dividend, then the option should not be exercised early.

However, if there is a dividend, then we have to consider how well the option prices the
dividend.

First, how do we value a dividend?

Hint: think about the way in which we accounted for coupons in bonds earlier. Granted,
coupons are contractually bound (otherwise default may occur); however, we can approximate
the same framework if we assume that if a declared dividend is not paid, then the stock may
suffer bankruptcy as a corresponding consequence of a lack of liquidity.

Therefore, we can discount a dividend as before:

$\delta_T = \delta e^{-rT}$ [EQ 11.11]

Note, the above equation only accounts for a single dividend payment; often several dividend
payments may be considered for a single adjustment.
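
When several dividends fall within the life of the option, EQ 11.11 is simply applied to each one and the results are summed. A minimal sketch, assuming each dividend is given as an (amount, time in years) pair; the function name and sample figures are hypothetical.

```python
import math


def pv_dividends(dividends, r):
    """Sum of amount * exp(-r * t) over (amount, t) pairs, with t in years."""
    return sum(amount * math.exp(-r * t) for amount, t in dividends)


if __name__ == "__main__":
    # Two dividends of $0.50, paid in three and six months, discounted at 4%.
    schedule = [(0.50, 0.25), (0.50, 0.50)]
    print(round(pv_dividends(schedule, r=0.04), 4))
```

Using a single constant rate for every payment date follows the BSM assumption of a constant risk-free rate.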

Once you have calculated the PV of the dividend(s) to be considered, you may simply subtract
that PV from the current price of the underlying to account for the discount. This works for both
European and American options. Now that we have a basis for evaluating the price of a dividend,
how can we determine whether or not to exercise early in the case of an American option?

Well, let's assume that the ex-dividend date is $t_{n+1}$; therefore the time period immediately
preceding the ex-dividend date is $t_n$.

Therefore, if we exercise early at $t_n$, then the option-exerciser will receive:

$S(t_n) - K$

On the other hand, if we continue to hold the option through the ex-dividend date, then the
stock will drop by the value of the dividend:

$S(t_n) - \delta_n$

Therefore, the value of the option should be greater than:

$S(t_n) - \delta_n - K e^{-r(T - t_n)}$

Said another way, if the value of the dividend is no greater than the strike price less its present
value (the interest saved by deferring payment of the strike from $t_n$ to T), then do not
exercise early.

$\delta_n \le K \left( 1 - e^{-r(T - t_n)} \right)$

However, in the event that the dividend payment is greater than the strike price less its present
value, then exercising early can capture the difference in value. Of course, this too ignores
transaction costs and the difference in tax rates.

$\delta_n > K \left( 1 - e^{-r(T - t_n)} \right)$
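
The two inequalities above boil down to comparing the dividend with a single threshold. Here is a small sketch of that comparison; the function name and sample numbers are assumptions, and, like the text, it ignores transaction costs and taxes.

```python
import math


def early_exercise_may_pay(dividend, K, r, T, t_n):
    """True when the dividend exceeds K * (1 - exp(-r * (T - t_n)))."""
    threshold = K * (1.0 - math.exp(-r * (T - t_n)))
    return dividend > threshold


if __name__ == "__main__":
    K, r, T, t_n = 65.0, 0.04, 0.25, 0.10
    print("threshold:", round(K * (1.0 - math.exp(-r * (T - t_n))), 4))
    print("dividend 0.25:", early_exercise_may_pay(0.25, K, r, T, t_n))
    print("dividend 0.50:", early_exercise_may_pay(0.50, K, r, T, t_n))
```

A result of True only says early exercise is worth considering just before the ex-dividend date; whether it is actually optimal also depends on how far in the money the option is.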

Congratulations, you’ve made it through processes, BSM, and options. This concludes the
second module of the course, or as I like to call it – all things options. We will see options again,
and some of the theory used to value them, but for now, we will change direction slightly to
take another look at risk and other derivatives used to manage that risk.
