[Figure: panels at Step 2, Step 5, Step 15, Step 100, Step 1,000, and Step 10,000]
Again,
another
interesting
connection,
we’ve
just
used
an
underlying
tenet
of
modern
financial
theory
to
draw
a
seahorse.
And
remember,
this
is
only
discrete
in
two dimensions; let’s see what three dimensions look like.
[Figure: panels at Step 1, Step 200, Step 1,000, Step 10,000, and Step 100,000]
Stock
returns
are
typically
assumed
to
be
Markov
processes; after all, does
today’s
price
or
last
year’s
price
matter
more
to
determine
tomorrow’s
price?
Of
course,
today’s
price
matters
more.
Now,
are
stock
prices
absolutely
Markov
processes?
I
think
a
strong
case
can
be
made
to
suggest
that
they
are
not; however,
the
Markov
property
fits
better
than
other
mathematical
assumptions.
Moving
onward.
The
mechanics
of
Markov
arithmetic
are
not
overly
complicated.
Again,
if
we
consider
stock
prices
Markov
processes
then
if
we
define
a
stock’s
price
as
a
function
of
the
normal distribution $\phi(\mu, \sigma^2)$,
where
mean
=
0
and
variance
=
1,
then
the
returns
of
the
stock
in
one
year
should
be
independent
from
the
return
of
the
stock
in
the
following
year.
Given
their
independence,
the
two-year
return
of
the
stock
should
be
the
addition
of
the
independent
returns
in
years
one
and
two.
Therefore,
the two-year return of the stock is distributed as $\phi(0, 2)$: a mean of 0, a variance of 2, and a standard deviation of $\sqrt{2}$.
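The square-root-of-time scaling behind this result can be sketched in a few lines. This is a minimal illustration, assuming independent returns so that variances add across periods; the helper name `horizon_std` is hypothetical and not part of the original notes. The same helper also handles horizons shorter than one year.

```python
import math

def horizon_std(annual_std: float, years: float) -> float:
    """Standard deviation of the return over `years`, assuming independent
    period returns so that the variance grows linearly with time."""
    if years < 0:
        raise ValueError("years must be non-negative")
    variance = (annual_std ** 2) * years  # variances of independent returns add
    return math.sqrt(variance)

# Annual return assumed to have mean 0 and variance 1, as in the text.
two_year = horizon_std(1.0, 2.0)      # square root of 2
three_month = horizon_std(1.0, 0.25)  # one half
```

Note that the standard deviation grows with the square root of the horizon, not with the horizon itself.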
Further,
this
property
can
be
used
to
describe
a
shorter
period
of
time
as
well;
thus
the return of the stock over three months is distributed as $\phi\left(0, \tfrac{1}{4}\right)$, with a standard deviation of $\tfrac{1}{2}$.
So,
this
last
example
actually
describes
a
subset
of
Markov
processes
known
as
the
Wiener
process.
Yes,
that’s
right,
go
ahead
and
get
the
giggle
out
now;
you’re
going
to
see
this
quite
a
bit
going
forward.
The
Wiener
process
is
a
Markov
process
whose changes are normally distributed with a mean of zero and a variance of one per unit of time.
You
may
be
more
familiar
with
the
concept
of
the
Wiener
process
in
particle
physics
where
it
is
referred
to
as
Brownian
motion.
The
Wiener
process
has
two
properties:
(1) $\Delta z = \varepsilon \sqrt{\Delta t}$, where $\varepsilon$ is drawn from $\phi(0, 1)$
(2) $\Delta z$ has a mean of 0 and a variance of $\Delta t$
Again,
given
the
independence
of
each
process,
we
can
define
the
value
of
a
change
in
z
over
time
as:
$z(T) - z(0) = \sum_{i=1}^{N} \varepsilon_i \sqrt{\Delta t}$ [EQ 10.01]
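The sum in EQ 10.01 can be simulated directly. The sketch below is an illustration only: the function name `wiener_path`, the seed, and the step counts are assumptions chosen for the example. It builds one path from N independent standard normal draws, each scaled by the square root of the time step, and then checks across many paths that the one-year change has a mean near 0 and a variance near 1.

```python
import math
import random

def wiener_path(T: float, n_steps: int, rng: random.Random) -> list[float]:
    """Return z(0), z(dt), ..., z(T) for a basic Wiener process."""
    dt = T / n_steps
    z = [0.0]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, 1.0)              # epsilon ~ phi(0, 1)
        z.append(z[-1] + eps * math.sqrt(dt))  # delta z = epsilon * sqrt(delta t)
    return z

rng = random.Random(42)  # fixed seed so the sketch is repeatable
path = wiener_path(T=1.0, n_steps=100, rng=rng)

# Terminal values z(T) - z(0) across many independent paths
terminal = [wiener_path(1.0, 100, rng)[-1] for _ in range(2000)]
mean_T = sum(terminal) / len(terminal)
var_T = sum((x - mean_T) ** 2 for x in terminal) / len(terminal)
```

Shrinking the time step (for example from 1/100 to 1/1,000) makes each path look rougher but leaves the distribution of the one-year change unchanged.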
Let’s
take
a
look
at
a
Wiener
process
graphically.
[Figure: Wiener process over one year at $\Delta t = 1/100$]
[Figure: Wiener process over one year at $\Delta t = 1/1{,}000$]
So,
as
delta
t
approaches
zero,
we
can
generalize
about
delta
z,
and
instead
refer
to
it
as
dz.
Now
this
Wiener
process
still
assumes
a
mean
of
zero
and
a
variance
of
one,
but
we
can
adjust
for
a
more
practical
application
of
the
Wiener
process
to
account
for
both
a
drift
rate
and
a
variance
rate.
In
a
generalized
form,
we
can
express
this
Wiener
process
as:
dx = adt + bdz
[EQ
10.02]
In
EQ
10.02,
adt
is
the
drift
rate
that
in
the
basic
form
of
the
Wiener
process
would
equal
zero.
The
bdz
element
gives
this
equation
its
Wiener
property;
without it, dx = a dt would integrate to x = x0 + at, a deterministic straight-line function of time, where x0 is the starting value of the variable x (here regarded as a position with respect to either value or space-time).
In
a
general
Wiener
process,
a
and
b
are
constants.
To
get
a
sense
of
how
the
drift
coefficient
and
variance
coefficient
affect
the
path
of
the
process,
let’s
chart
two
processes
and
mess
with
them.
In
this
iteration,
both
the
red
and
green
processes
have
the
same
initial
value
(0),
and
equal
drift
and
variance
coefficients
(0 and 0.5, respectively, for both processes).
Notice
the
independence.
In
this
iteration,
the
red
process
has
a
drift
coefficient
set
to
+1,
while
the
green
process
has
a
drift
coefficient
set
to
-1.
Notice
the
difference
between
how
the
two
processes
move
versus
each
other
and
versus
the
original
plot.
In
this
last
iteration,
the
drifts
were
reset
to
zero,
but
the
variance
coefficients
were
changed;
the
red
process
has
a
variance
coefficient
of
+1,
while
the
green
process
has
a
variance
coefficient
of
0,
effectively
removing
the
Wiener
property
from
the
process.
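The three iterations above can be reproduced with a short simulation of EQ 10.02. This is a sketch under stated assumptions: the function name `generalized_wiener`, the seed, and the 250-step grid are illustrative choices, not taken from the original charts.

```python
import math
import random

def generalized_wiener(a, b, T, n_steps, rng, x0=0.0):
    """Simulate dx = a*dt + b*dz with constant drift a and variance coefficient b."""
    dt = T / n_steps
    x = [x0]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, 1.0) * math.sqrt(dt)  # Wiener increment
        x.append(x[-1] + a * dt + b * dz)
    return x

rng = random.Random(7)  # fixed seed for repeatability

# Opposite drifts, same variance coefficient: the paths trend apart
red = generalized_wiener(a=1.0, b=0.5, T=1.0, n_steps=250, rng=rng)
green = generalized_wiener(a=-1.0, b=0.5, T=1.0, n_steps=250, rng=rng)

# Zero drift and zero variance coefficient: the process never moves
frozen = generalized_wiener(a=0.0, b=0.0, T=1.0, n_steps=250, rng=rng)

# Drift only: a straight line from 0 to a*T, with no randomness left
line = generalized_wiener(a=1.0, b=0.0, T=1.0, n_steps=250, rng=rng)
```

Setting b to zero removes every random term, which is why the last two paths are fully deterministic.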
Now
that
we’ve
got
the
generalized
Wiener
process
under
control,
let’s
take
another
step
forward.
As
we
just
stated
the
generalized
Wiener
process
uses
constants
a
and
b
to
define
both
drift
and
variance
coefficients.
But what happens
if
we
relax
that
constraint
to
allow
the
coefficient
to
become
a
function
of
the
variable
x
and
t?
Well,
then
you
get
Ito’s
process.
dx = a ( x, t ) dt + b ( x, t ) dz
[EQ
10.03]
That
seems
to
make
sense,
right?
After
all,
just
like
a
rocket
ship’s
trajectory,
a
stock
price’s
drift
coefficient
isn’t
actually
a
constant
but rather a quantity that scales with the price of the stock.
Said
another
way,
the
drift
coefficient
needs
to
be
normalized
by
the
price
of
the
underlying.
Think
about
this
as
drift
divided
by
the
stock
price
equals
expected
return,
and
expected
return
should
not
change
just
because
the
value
of
the
underlying
changes.
Therefore,
drift
cannot
be
constant.
Now,
if
the
change
in
the
stock
price
were
only
a
function
of
time
(let’s
ignore
volatility
for
a
moment),
then
we
could
adjust
our
process
(switching
out
x
for
S):
dS = µ Sdt
[EQ
10.04]
Luckily,
this
point
is
pretty
simple;
essentially,
the
change
in
the
value
of
the
underlying
is
the
expected
return
of
the
underlying
multiplied
by
the
value
of
the
underlying
adjusted
for
the
period
of
time.
Makes
sense,
but
this
is
not
a
Wiener
process
anymore;
there’s
no
variance
coefficient.
To
reenter
Wiener
space,
let’s
stop
ignoring
volatility.
In
the
same
way
that
we
reimagined
the
drift
constant
into
expected
return
for
the
underlying,
independent
of
the
underlying
price
we
can
(and
need
to)
do
the
same
thing
for
the
variance
coefficient.
Think
about
it
this
way
–
if
a
stock
price
doubles
today,
are
you
any
more
or
less
certain
of
the
outcome
tomorrow?
Well,
no,
probably
not.
All
else
equal,
the
volatility
of
a
stock
return
is
not
linked
to
its
price.
Therefore
we
can
expand
EQ
10.04
to
include
volatility
and
reclaim
the
Wiener
property.
dS = µ Sdt + σ Sdz
[EQ
10.05]
The
percentage
change
in
underlying
value
equals:
$\frac{dS}{S} = \mu\,dt + \sigma\,dz$ [EQ 10.06]
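To see EQ 10.05 in action, the process can be stepped forward on a discrete grid. The sketch below uses a simple Euler discretisation, which is one common way to approximate the continuous process and is an assumption of this example; the function name `simulate_gbm`, the seed, and the parameter values (a 10% expected return, 30% volatility, 252 steps) are hypothetical.

```python
import math
import random

def simulate_gbm(s0, mu, sigma, T, n_steps, rng):
    """Euler approximation of dS = mu*S*dt + sigma*S*dz."""
    dt = T / n_steps
    prices = [s0]
    for _ in range(n_steps):
        dz = rng.gauss(0.0, 1.0) * math.sqrt(dt)  # Wiener increment
        s = prices[-1]
        # Drift and volatility terms both scale with the current price
        prices.append(s + mu * s * dt + sigma * s * dz)
    return prices

rng = random.Random(2024)  # fixed seed for repeatability
path = simulate_gbm(s0=100.0, mu=0.10, sigma=0.30, T=1.0, n_steps=252, rng=rng)

# With sigma = 0 the process collapses to EQ 10.04 (pure compounding)
no_vol = simulate_gbm(s0=100.0, mu=0.10, sigma=0.0, T=1.0, n_steps=252, rng=rng)
```

Because each step multiplies the current price by a factor close to one, the percentage change per step does not depend on the price level, which is the point of EQ 10.06.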
Now,
if
you’re
paying
attention,
a
swell
of
relief
should
overtake
you,
because
within
the
context
of
an
Ito
process,
we’ve
just
defined
the
change
in
a
stock
price
as
a
function
of
the
expected
return
and
time,
and
variance
and
time.
This
is
a
premise
that
most
financial
professionals
can
accept
(and
to
which
many
cling).
So,
two
quick
points
before
we
leap
forward
again
–
(1)
thus
far
expected
return
and
variance
are
not
linked;
though,
one
would/should
expect
more/less
return
given
a
change
to
the
level
of
risk
in
the
portfolio.
(2)
expected
return
is
not
critically
important
to
the
valuation
of
derivatives;
the
price
of
the
underlying
and
the
volatility
of
the
underlying
are
critical
to
the
valuation
of
derivatives.
Ito’s
Lemma
So
here’s
the
thing
–
in
1951,
K.
Ito
showed
that
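a function G(S, t) of a stock price S that follows EQ 10.05 itself follows an Ito process:
$dG = \left(\frac{\partial G}{\partial S}\mu S + \frac{1}{2}\frac{\partial^2 G}{\partial S^2}\sigma^2 S^2 + \frac{\partial G}{\partial t}\right)dt + \frac{\partial G}{\partial S}\sigma S\,dz$ [EQ 10.07]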
Yes.
I
know
at
first
this
may
look
intimidating,
but
look
harder,
what
do
you
see?
Exactly,
Ito
used
the
partial
derivatives
of
his
function
G
to
redefine
the
expected
return
of
the
process
as
a
portfolio
of
delta,
gamma,
and
theta.
Next
step,
define
G(S,
t)
as
ln
S.
So:
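$G = \ln S$
$\therefore \Delta_G = \frac{\partial G}{\partial S} = \frac{1}{S}$
$\therefore \Gamma_G = \frac{\partial^2 G}{\partial S^2} = -\frac{1}{S^2}$
$\therefore \theta_G = \frac{\partial G}{\partial t} = 0$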
Substituting these values into EQ 10.07, we get:
$dG = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dz$ [EQ 10.08]
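For readers who want the intermediate step, the substitution can be written out term by term. This is a worked sketch using only the delta, gamma, and theta of ln S given above:

```latex
\begin{aligned}
dG &= \left(\frac{1}{S}\,\mu S
      + \frac{1}{2}\left(-\frac{1}{S^{2}}\right)\sigma^{2}S^{2}
      + 0\right)dt
      + \frac{1}{S}\,\sigma S\,dz \\
   &= \left(\mu - \frac{\sigma^{2}}{2}\right)dt + \sigma\,dz
\end{aligned}
```

Every S cancels, which is why neither coefficient of the resulting process depends on the price of the underlying.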
So
what
did
this
get
us?
Well,
our
drift
coefficient
is
now
defined
by
the
constant
expected
return
and
variance,
both
of
which
are
independent
from
the
price
of
the
underlying.
Additionally,
our
variance
coefficient
is
also
constant
and
independent
of
the
price
of
the
underlying.
Therefore, the change in Ito’s function G (in this case ln S) between time 0 and time T is normally distributed with mean $\left(\mu - \frac{\sigma^2}{2}\right)T$ and variance $\sigma^2 T$.
So,
the
only
question
left
is
why
did
Ito
insist
upon
using
the
lognormal
of
the
underlying,
and
not
the
normal
distribution
of
the
underlying
as
his
function
G?
First,
let’s
make
sure
we
know
what
they
look
like.
[Figure: Normal Distribution, panels P(x) and D(x) plotted against x]
[Figure: Lognormal Distribution, panels P(x) and D(x) plotted against x]
Second
(and
kind
of
a
key
point),
if
a
random
variable
is
lognormally
distributed,
then
if
we
take
the
logarithm
of
the
random
variable,
it
becomes
normally
distributed.
Lognormal
distributions
are
often
characterized
by
a
random
variable
that
is
the product of several positive, independent variables.
Additionally,
lognormal distributions
represent
values
between
zero
and
infinity,
clearly
with
a
positive
skew.
This
tends
to
represent
the
value
of
stock
prices
as
stock
prices
stop
at
zero
(on
the
left
side).
Congratulations,
you
now
understand
Markov,
Wiener,
and
Ito
–
I
think
you’re
ready
for
Black,
Scholes,
and
Merton.
Lesson
11.0
–
Black
Scholes
Merton
So
let’s
be
very
clear
–
Black
Scholes
Merton
(BSM)
is
not
magic.
It
does
not
miraculously
yield
the
absolute,
correct
option
value
for
all
options.
BSM
(in
its
pure
form)
values
European
options
of
non-dividend
paying
stocks.
BSM
represents
an
inspired
way
to
approach
option
valuation
via
a
closed
form
solution;
but
it
is
not
the
“end
all,
be
all,”
it’s
just
a
good
way
to
value
European
options
on
non-dividend
paying
stocks.
Now
that
that’s
out
of
the
way,
let’s
continue.
In
1997,
Merton
and
Scholes
won
the
Nobel
Prize
for
economics
for
the
Black
Scholes
Merton
Model;
Black
died
in
1995
(very
Jonathan
Larson).
So
let’s
get
under
the
hood;
how
does
BSM
work?
The
first
major
assumption
in
BSM
is
that
percentage changes in the stock price are normally distributed
over
the
short
term,
and
this
distribution
can
be
defined
by
(μ,
σ).
Furthermore, the mean return over a short period of time $\Delta t$ equals $\mu\,\Delta t$, and the standard deviation equals $\sigma\sqrt{\Delta t}$;
therefore,
just
like
in
Ito’s
process,
the
percentage
change
in
price
in
time
is
a
function
of
the
expected
mean
return
in
time
and
the
variance
in
time.
Therefore,
as
in
Ito’s
Lemma,
we
can
use
his
function
G
as
the
ln
S.
Looking
for
the
change
in
price
leaves
us
with:
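$\ln S_T - \ln S_0 \sim \phi\left[\left(\mu - \frac{\sigma^2}{2}\right)T,\ \sigma^2 T\right]$
$\therefore \ln\frac{S_T}{S_0} \sim \phi\left[\left(\mu - \frac{\sigma^2}{2}\right)T,\ \sigma^2 T\right]$
$\therefore \ln S_T \sim \phi\left[\ln S_0 + \left(\mu - \frac{\sigma^2}{2}\right)T,\ \sigma^2 T\right]$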
Let’s
take
another
look
at
a
lognormal
distribution:
$E(S_T) = S_0 e^{\mu T}$ [EQ 11.01]
[Figure: Lognormal Distribution, panels P(x) and D(x) plotted against x]
Given
the
mean
(E[S]),
the
variance
of
the
lognormal
distribution
of
the
stock
at
time
T
equals:
$\operatorname{var}(S_T) = S_0^2 e^{2\mu T}\left(e^{\sigma^2 T} - 1\right)$ [EQ 11.02]
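EQ 11.01 and EQ 11.02 translate directly into code. The sketch below is illustrative: the function names are hypothetical, and the inputs (a 20 stock with a 20% expected return and 40% volatility over one year) are made-up values chosen for the example, not data from the notes.

```python
import math

def lognormal_mean(s0: float, mu: float, T: float) -> float:
    """Expected stock price at time T, per EQ 11.01."""
    return s0 * math.exp(mu * T)

def lognormal_variance(s0: float, mu: float, sigma: float, T: float) -> float:
    """Variance of the stock price at time T, per EQ 11.02."""
    return s0 ** 2 * math.exp(2 * mu * T) * (math.exp(sigma ** 2 * T) - 1)

# Hypothetical inputs for illustration only
mean_price = lognormal_mean(20.0, 0.20, 1.0)
var_price = lognormal_variance(20.0, 0.20, 0.40, 1.0)
std_price = math.sqrt(var_price)
```

With zero volatility the variance collapses to zero and the price simply compounds at the expected return.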
Further,
we
can
use
these
parameters
of
a
stock’s
lognormal
distribution
to
define
the
continuously
compounded
rate
of
return
earned
over
the
time
period
from
t=0
to
t=T.
So, if:
$S_T = S_0 e^{xT}$,
then:
$x = \frac{1}{T}\ln\frac{S_T}{S_0}$ [EQ 11.03]
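EQ 11.03 is a one-line calculation. A minimal sketch follows; the function name and the sample prices (50 growing to 60 over two years) are hypothetical values used only to show the round trip.

```python
import math

def continuously_compounded_return(s0: float, s_t: float, T: float) -> float:
    """Rate x such that s_t = s0 * exp(x * T), per EQ 11.03."""
    if s0 <= 0 or s_t <= 0 or T <= 0:
        raise ValueError("prices and T must be positive")
    return math.log(s_t / s0) / T

x = continuously_compounded_return(50.0, 60.0, 2.0)

# Round trip: compounding s0 at rate x for T years recovers s_t
recovered = 50.0 * math.exp(x * 2.0)
```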
Alright,
now
that
we
have
rekindled
our
understanding
of
lognormal
distributions,
let’s
push
deeper
into
BSM.
There
are
seven
assumptions
for
BSM:
1. Stock prices follow the process dS = µS dt + σS dz; mu and sigma are constant.
2. Negative (short) positions in the stock are allowed.
3. Ignore transaction costs and taxes.
4. Ignore dividends.
5. There are no riskless arbitrage opportunities.
6. Time is continuous.
7. The risk-free rate is constant, available, and the same across securities.
Okay,
so
if
we
remember
from
the
previous
lesson,
the
equation:
dS = µ Sdt + σ Sdz
can
describe
a
stock
price
process,
where
the
dz
was
the
portion
of
the
equation
that
made
it
a
stochastic
process.
Furthermore,
Ito
showed
us
that
the
change
in
his
function
G
can
be
found
through
the
partial
differentiation
of
the
stock
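process:
$dG = \left(\frac{\partial G}{\partial S}\mu S + \frac{1}{2}\frac{\partial^2 G}{\partial S^2}\sigma^2 S^2 + \frac{\partial G}{\partial t}\right)dt + \frac{\partial G}{\partial S}\sigma S\,dz$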
Our
next
step
is
to
allow
G
to
equal
f,
the
price
of
a
derivative
of
the
underlying
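S.
$df = \left(\frac{\partial f}{\partial S}\mu S + \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2 + \frac{\partial f}{\partial t}\right)dt + \frac{\partial f}{\partial S}\sigma S\,dz$ [EQ 11.04]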
Now,
the
assumption
is
that
there
are
intervals
of
t,
where
the
changes
to
dz
are
the
same
for
both
the
derivative
process
and
the
stock
process.
If
this
is
true
(this
being
the
fact
that
a
change
in
the
underlying
affects
the
underlying
and
the
derivative
by
the
same
amount),
then
we
can
construct
a
portfolio
of
the
underlying
and
the
derivative
such
that
the dz term is eliminated.
Essentially
what
we’re
looking
for
here
is
the
ability
to
perfectly
hedge
underlying
exposure
to
zero.
Thus,
if
we
assume
ownership
of
the
underlying,
then
we
must
short
the
derivative
(assuming
not
a
put
option
–
it
should
be
pretty
safe
to
assume
call
option
unless
otherwise
stated).
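Thus, the offsetting position that eliminates equity exposure, and with it the dz term, is short one derivative and long $\frac{\partial f}{\partial S}$ shares of the underlying:
$-f$ (short the derivative) and $+\frac{\partial f}{\partial S} S$ (long the underlying)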
Now,
the
df/dS
should
look
familiar
as
it’s
just
the
derivative’s
delta.
Therefore,
any
residual
exposure
not
hedged
will
affect
the
value
of
the
portfolio.
Thus:
$\Pi = -f + \frac{\partial f}{\partial S} S$ [EQ 11.05]
Therefore,
any
change
in
the
value
of
the
portfolio
is
given
as:
$\Delta\Pi = -\Delta f + \frac{\partial f}{\partial S}\Delta S$
Written
another
way:
$\Delta\Pi = \left(-\frac{\partial f}{\partial t} - \frac{1}{2}\frac{\partial^2 f}{\partial S^2}\sigma^2 S^2\right)\Delta t$
Remember, no dz means no stochastic process, which means no risk across the change in t.
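The reason dz drops out can be made explicit with the discrete versions of the stock and derivative processes. This is a worked sketch of the step the text summarizes, using the same symbols as EQ 11.04 and EQ 11.05:

```latex
\begin{aligned}
\Delta S &= \mu S\,\Delta t + \sigma S\,\Delta z \\
\Delta f &= \left(\frac{\partial f}{\partial S}\mu S
          + \frac{\partial f}{\partial t}
          + \frac{1}{2}\frac{\partial^{2} f}{\partial S^{2}}\sigma^{2}S^{2}\right)\Delta t
          + \frac{\partial f}{\partial S}\sigma S\,\Delta z \\
\Delta\Pi &= -\Delta f + \frac{\partial f}{\partial S}\,\Delta S
          = \left(-\frac{\partial f}{\partial t}
          - \frac{1}{2}\frac{\partial^{2} f}{\partial S^{2}}\sigma^{2}S^{2}\right)\Delta t
\end{aligned}
```

Both the $\Delta z$ terms and the $\mu S$ terms cancel, so the expected return of the stock disappears from the portfolio along with the randomness.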
Therefore, with no risk left, the portfolio must earn the risk-free rate over the change in t:
ΔΠ = rΠΔt
[EQ
11.06]
Through
a
substitution
of
equations,
we
are
left
with:
$rf = \frac{\partial f}{\partial t} + rS\frac{\partial f}{\partial S} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}$ [EQ 11.07]
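The substitution itself takes two lines. As a sketch, set the riskless change in the portfolio equal to the risk-free return in EQ 11.06, replace the portfolio value with EQ 11.05, and rearrange:

```latex
\begin{aligned}
\left(-\frac{\partial f}{\partial t}
  - \frac{1}{2}\frac{\partial^{2} f}{\partial S^{2}}\sigma^{2}S^{2}\right)\Delta t
  &= r\left(-f + \frac{\partial f}{\partial S}S\right)\Delta t \\
rf &= \frac{\partial f}{\partial t}
  + rS\frac{\partial f}{\partial S}
  + \frac{1}{2}\sigma^{2}S^{2}\frac{\partial^{2} f}{\partial S^{2}}
\end{aligned}
```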
Again,
note
the
similarities
between
this
equation
and
many
before
it.
The risk-free rate multiplied by the value of the derivative equals theta, plus the risk-free return on the underlying (rS) times delta, plus the gamma coefficient (½σ²S²) times gamma.
Excellent,
now
let’s
put
this
into
an
even
more
familiar
context.
The
value
of
a
derivative
should
be
worth
the
difference
between
the
current
price
of
the
underlying
less
the
present
value
of
the
delivery
or
strike
price.
We’ve
already
seen
this
in
EQ
2.434:
$f = (F_0 - K)e^{-rT}$
If
we
change
this
slightly
to
conform
to
our
new
variables,
this
same
expression
translates
directly
into:
$f = S - Ke^{-r(T-t)}$ [EQ 11.08]
Therefore,
we
can
translate
EQ
11.08
into
its
counterpart
Greek
portfolio
format:
$\theta = \frac{\partial f}{\partial t} = -rKe^{-r(T-t)}$
$\Delta = \frac{\partial f}{\partial S} = 1$
$\Gamma = \frac{\partial^2 f}{\partial S^2} = 0$
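As a quick consistency check (a worked step added for illustration), these three values can be substituted into the right-hand side of EQ 11.07:

```latex
\begin{aligned}
\theta + rS\,\Delta + \frac{1}{2}\sigma^{2}S^{2}\,\Gamma
  &= -rKe^{-r(T-t)} + rS\cdot 1 + \frac{1}{2}\sigma^{2}S^{2}\cdot 0 \\
  &= r\left(S - Ke^{-r(T-t)}\right) \\
  &= rf
\end{aligned}
```

The result is the left-hand side of EQ 11.07, so the value in EQ 11.08 satisfies the equation.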
So…
moving
right
along
–
see
if
you
notice
anything
familiar
in
the
BSM
pricing
formulas.
CALL
$c = S_0 N(d_1) - Ke^{-rT}N(d_2)$
$d_1 = \frac{\ln\frac{S_0}{K} + \left(r + \frac{\sigma^2}{2}\right)T}{\sigma\sqrt{T}}$ [EQs 11.09]
$d_2 = d_1 - \sigma\sqrt{T}$
PUT
$p = Ke^{-rT}N(-d_2) - S_0 N(-d_1)$ [EQ 11.10]
NOTE
–
if
you
use
Excel
the
corresponding
function
for
N()
is
NORMSDIST.
So,
what
does
the
output
of
BSM
look
like?
The
above
represents
a
European
option
with
T=0.25,
vol=0.3,
r=0.04,
S=50,
and
K=65.
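The pricing formulas in EQs 11.09 and EQ 11.10 can be evaluated with nothing more than the standard library. The sketch below is an illustration, not the code behind the original chart: the function names are hypothetical, N() is built from `math.erf`, and the text does not say whether the charted option is a call or a put, so both are priced with the parameters quoted above.

```python
import math

def norm_cdf(x: float) -> float:
    """Cumulative standard normal distribution, N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bsm_d1_d2(s0, k, r, sigma, T):
    """The d1 and d2 terms of EQs 11.09."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return d1, d2

def bsm_call(s0, k, r, sigma, T):
    """European call on a non-dividend-paying stock."""
    d1, d2 = bsm_d1_d2(s0, k, r, sigma, T)
    return s0 * norm_cdf(d1) - k * math.exp(-r * T) * norm_cdf(d2)

def bsm_put(s0, k, r, sigma, T):
    """European put on a non-dividend-paying stock, per EQ 11.10."""
    d1, d2 = bsm_d1_d2(s0, k, r, sigma, T)
    return k * math.exp(-r * T) * norm_cdf(-d2) - s0 * norm_cdf(-d1)

# Parameters quoted in the text: T=0.25, vol=0.3, r=0.04, S=50, K=65
call = bsm_call(50.0, 65.0, 0.04, 0.30, 0.25)
put = bsm_put(50.0, 65.0, 0.04, 0.30, 0.25)

# Put-call parity: c - p should equal S0 - K*exp(-rT)
parity_gap = (call - put) - (50.0 - 65.0 * math.exp(-0.04 * 0.25))
```

With the stock at 50 and the strike at 65, the call is far out of the money and the put is deep in the money, so most of the put's value is intrinsic.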
American
Options
Now
we
demonstrated
that
calls
should
not
be
exercised
early
in
an
earlier
lesson,
but
now
that
we
have
a
better
grasp
on
the
tools
needed
to
analyze
derivatives more completely, let’s re-examine
the
case
of
the
American
option.
American
options
differ
from
European
options
insofar
as
European
options
can
only
be
exercised
at
expiration;
American
options,
on
the
other
hand,
can
be
exercised
at
any
time
up
until
expiration.
So,
if
there
is
no
dividend,
then
the
option
should
not
be
exercised
early.
However,
if
there
is
a
dividend,
then
we
have
to
consider
how
well
the
option
prices
the
dividend.
First,
how
do
we
value
a
dividend?
Hint
–
think
about
the
way
in
which
we
accounted
for
coupons
in
bonds
earlier.
Granted, coupons are contractually bound (otherwise default may occur); however,
we
can
approximate
the
same
framework
if
we
assume
that
if
a
declared
dividend
is
not
paid,
then
the
stock
may
suffer
bankruptcy
as
a
corresponding
consequence
of
a
lack
of
liquidity.
Therefore,
we
can
discount
a
dividend
as
before:
$\delta_T = \delta e^{-rT}$ [EQ 11.11]
Note,
the
above
equation
only
accounts
for
a
single
dividend
payment;
often
several
dividend
payments
may
be
considered
for
a
single
adjustment.
Once
you
have
calculated
the
PV
of
the
dividend(s)
to
be
considered,
you
may
simply
adjust
the
PV
of
the
underlying
to
account
for
the
discount.
This
works
for
both
European
and
American
options.
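The dividend adjustment can be sketched as follows. This assumes the common convention of subtracting the present value of the dividends from the current stock price before valuing the option; the function names and the sample dividends (0.50 paid at three and six months) are hypothetical.

```python
import math

def pv_dividends(dividends, r: float) -> float:
    """Present value of (amount, time_in_years) dividend pairs, per EQ 11.11."""
    return sum(amount * math.exp(-r * t) for amount, t in dividends)

def dividend_adjusted_price(s0: float, dividends, r: float) -> float:
    """Stock price less the present value of the dividends considered."""
    return s0 - pv_dividends(dividends, r)

# Hypothetical schedule: two dividends of 0.50, at 3 and 6 months
schedule = [(0.50, 0.25), (0.50, 0.50)]
adjusted = dividend_adjusted_price(50.0, schedule, r=0.04)
```

The adjusted price then takes the place of the stock price in whatever valuation formula is being used.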
Now
that
we
have
a
basis
for
evaluating
the
price
of
a
dividend,
how
can
we
determine
whether
or
not
to
exercise
early
in
the
case
of
an
American
option?
Well, let’s assume that the ex-dividend date is $t_{n+1}$; therefore, the time immediately preceding the ex-dividend date is $t_n$. If we exercise early at $t_n$, then the option-exerciser will receive:
$S(t_n) - K$
On the other hand, if we continue to hold the option through the ex-dividend date, then the stock will drop by the value of the dividend, to:
$S(t_n) - \delta_n$
Therefore, the value of the option should be greater than:
$S(t_n) - \delta_n - Ke^{-r(T-t_n)}$
Said another way, if the value of the dividend is no greater than the strike price less its present value (the interest earned by deferring payment of the strike), then do not exercise early.
$\delta_n \le K\left(1 - e^{-r(T-t_n)}\right)$
However, in the event that the dividend payment is greater than the strike price less its present value, then exercising early can capture the difference in value. Of course, this too ignores transaction costs and the difference in tax rates.
$\delta_n > K\left(1 - e^{-r(T-t_n)}\right)$
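The two inequalities reduce to a single comparison, sketched below. The function names and the sample inputs (a strike of 65, a 4% risk-free rate, and three months between $t_n$ and expiration) are hypothetical, and the check ignores transaction costs and taxes just as the text does.

```python
import math

def early_exercise_threshold(strike: float, r: float, time_to_expiry: float) -> float:
    """K * (1 - exp(-r * (T - t_n))): the strike less its present value."""
    return strike * (1.0 - math.exp(-r * time_to_expiry))

def early_exercise_may_pay(dividend: float, strike: float, r: float,
                           time_to_expiry: float) -> bool:
    """True when the dividend exceeds the threshold, so exercising just
    before the ex-dividend date can capture the difference in value."""
    return dividend > early_exercise_threshold(strike, r, time_to_expiry)

threshold = early_exercise_threshold(65.0, 0.04, 0.25)

large_dividend = early_exercise_may_pay(1.00, 65.0, 0.04, 0.25)  # above threshold
small_dividend = early_exercise_may_pay(0.50, 65.0, 0.04, 0.25)  # below threshold
```

The threshold shrinks as expiration approaches, so a given dividend is more likely to justify early exercise late in the option's life.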
Congratulations,
you’ve
made
it
through
stochastic processes,
BSM,
and
options.
This
concludes
the
second
module
of
the
course,
or
as
I
like
to
call
it
–
all
things
options.
We
will
see
options
again,
and
some
of
the
theory
used
to
value
them,
but
for
now,
we
will
change
direction
slightly
to
take
another
look
at
risk
and
other
derivatives
used
to
manage
that
risk.