
Second year essay - not finished

Edgar Vitins
u2155924

February 14, 2023

1 Recap of statistics
First, it is useful to review some ideas and results from basic statistics.
Expected value of random variables \(X\) and \(Y\), where \(p_X(x)\) is the probability mass function of \(X\)
and \(f_Y(y)\) is the probability density function of \(Y\):
\[ \mu_X := E[X] = \sum_{x \in \mathcal{X}} x\, p_X(x), \qquad \mu_Y := E[Y] = \int_{y \in \mathcal{Y}} y\, f_Y(y)\, dy \]

Variance

\[ \mathrm{Var}[Y] = E[(Y - \mu_Y)^2] = E[Y^2] - \mu_Y^2 \]
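
As a small illustration of these definitions, the following sketch computes the expected value and the variance of a discrete random variable directly from its probability mass function, and checks that the two expressions for the variance agree. The fair six-sided die and the helper names `expectation` and `variance` are assumptions chosen for the example; they do not come from the essay.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete random variable given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var[X] = E[(X - mu)^2], computed directly from the definition."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

# Hypothetical example: a fair six-sided die.
die = {k: Fraction(1, 6) for k in range(1, 7)}

mu = expectation(die)
var_definition = variance(die)
var_shortcut = expectation(die, lambda x: x * x) - mu ** 2  # E[X^2] - mu^2

print(mu, var_definition, var_shortcut)
```

Exact fractions are used so that the two variance formulas can be compared without rounding error.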

2 Wiener process
Before introducing this process, it is important to understand the intuition behind it. It is usually
reasonable to assume that markets obey the weak form of market efficiency, which implies that stock
prices possess the Markov property: the only relevant information about the future price is its
current value. This means that knowing the path the price took to get there would not give an
advantage in predicting the future price. The reasoning is, first, that if traders knew that a
certain path implied rapid growth, then everyone would try to capitalize on this opportunity,
which would quickly eliminate any possible arbitrage. Second, there is little evidence that past
price paths can be used to predict future prices.
Now that we are comfortable assuming that stock prices have the Markov property, we can
model them as a random walk. This leads us to the first step, the Wiener process. The idea is that,
over each time step, the process moves up or down by a random amount with mean 0, and the
typical size of this move is tied to the length of the time step.
Definition 2.1 Standard Wiener process. It is a stochastic process \(W(t)\), \(t \in [0, T]\), that
satisfies the following properties (let \(\phi \sim N(0, 1)\)):

1. \(W(0) = 0\)

2. for \(0 \le t \le t + \delta t \le T\),

\[ W(t + \delta t) - W(t) = \Delta W(t, \delta t) \sim \sqrt{\delta t}\,\phi \]

3. for \(0 \le t \le t + \delta t \le t' \le t' + \delta t' \le T\), the increments
\(W(t + \delta t) - W(t)\) and \(W(t' + \delta t') - W(t')\)
are independent.
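Corollary 2.2 From the second property, we can see that

• the mean of \(\Delta W\) is \(E[\Delta W] = \sqrt{\delta t}\,E[\phi] = 0\)
• the variance is \(\mathrm{Var}(\Delta W) = \delta t\,\mathrm{Var}(\phi) = \delta t\)

• and hence the standard deviation is \(\sigma_{\Delta W} = \sqrt{\delta t}\)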
The third property ensures that the Markov property holds. From this we can also see that the sum
of changes over multiple intervals has similar properties. More specifically,
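\[ W(T) = \sum_{i=1}^{N} \Delta W_i = \sum_{i=1}^{N} \sqrt{\delta t}\,\phi_i = \sqrt{\delta t} \sum_{i=1}^{N} \phi_i, \quad \text{where } \delta t = \frac{T}{N} \text{ and } \phi_i \sim N(0, 1) \]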

Then, from [stats properties to be written] it follows that


• \(E[W(T) - W(0)] = \sqrt{\delta t} \sum_{i=1}^{N} E[\phi_i] = 0\)

• \(\mathrm{Var}(W(T) - W(0)) = \delta t \sum_{i=1}^{N} \mathrm{Var}(\phi_i) = \delta t\, N = T\)

• therefore, the standard deviation of \(W(T) - W(0)\) is \(\sigma_W = \sqrt{T}\)
I won’t define \(dW\) rigorously, but it will be enough to know that \(dW\) has the properties of
\(\Delta W\) in the limit \(\delta t \to 0\).
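
The properties above can be checked numerically. The following is a minimal sketch, assuming illustrative values for T, N and the number of simulated paths (none of which come from the essay): it builds W(T) as a sum of N independent increments, each distributed as sqrt(δt) times a standard normal draw, and compares the sample mean and variance with the theoretical values 0 and T.

```python
import math
import random
import statistics

def simulate_wiener_endpoint(T, N, rng):
    """Return W(T) built from N independent increments sqrt(dt) * phi_i."""
    dt = T / N
    return sum(math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(N))

rng = random.Random(0)          # fixed seed so the run is reproducible
T, N, n_paths = 2.0, 50, 5000   # illustrative values, not from the essay

endpoints = [simulate_wiener_endpoint(T, N, rng) for _ in range(n_paths)]
sample_mean = statistics.fmean(endpoints)
sample_var = statistics.variance(endpoints)

print(f"mean of W(T):     {sample_mean:.3f} (theory 0)")
print(f"variance of W(T): {sample_var:.3f} (theory {T})")
```

The sample statistics only approximate the theoretical values; the gap shrinks as the number of paths grows.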

3 Generalized Wiener process


It would be useful to include some kind of directional bias in this model. In the context of stocks,
for example, the price may be expected to increase over time. For a stochastic process, this
behaviour can be modeled with a drift rate, defined as the mean change per unit time. So, if \(\mu\) is
the drift rate and there is no randomness, then \(dx = \mu\,dt\). We can also define a quantity that
describes the stochastic process’s propensity for shifting about. This is called the variance rate,
and it is defined as the variance per unit time. The standard Wiener process has a variance rate
of 1 and a drift rate of 0. But we can describe a more general Wiener process as
\[ dX = a\,dt + b\,dW \]
where \(X\) is a stochastic process and \(a, b \in \mathbb{R}\). [an example image here could be nice]
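From this definition and using Corollary 2.2, if we look at the discrete case it is clear that, for
\(\delta t > 0\) and \(N = \frac{T}{\delta t}\), we get
\[ E[X(T) - X(0)] = E\left[\sum_{i=1}^{N} \left(a\,\delta t + b\,\Delta W_i\right)\right] = \sum_{i=1}^{N} E\left[a\,\delta t + b\sqrt{\delta t}\,\phi_i\right] = aT \]

\[ \mathrm{Var}\left(X(T) - X(0)\right) = b^2\,\delta t \sum_{i=1}^{N} \mathrm{Var}(\phi_i) = b^2 T \]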

In this case the drift rate of \(X\) would be \(a\), and the variance rate would be \(b^2\).
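
The drift rate and variance rate can also be estimated from simulated paths. The sketch below assumes illustrative values for a, b, T and the discretisation (all hypothetical) and checks that the change X(T) − X(0) has sample mean close to aT and sample variance close to b²T.

```python
import math
import random
import statistics

def simulate_generalized_wiener(a, b, T, N, rng):
    """Return X(T) - X(0) for dX = a dt + b dW, using N steps of size T / N."""
    dt = T / N
    change = 0.0
    for _ in range(N):
        dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        change += a * dt + b * dW
    return change

rng = random.Random(1)                          # fixed seed for reproducibility
a, b, T, N, n_paths = 0.3, 1.5, 2.0, 50, 5000   # illustrative values

changes = [simulate_generalized_wiener(a, b, T, N, rng) for _ in range(n_paths)]
mean_change = statistics.fmean(changes)
var_change = statistics.variance(changes)

print(f"mean:     {mean_change:.3f} (theory a*T   = {a * T})")
print(f"variance: {var_change:.3f} (theory b^2*T = {b * b * T})")
```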

4 Itô Process and Itô’s Lemma
We can make the aforementioned process even more general by letting \(a, b : \mathbb{R}^2 \to \mathbb{R}\) be functions
that depend on the values of \(X\) and \(t\).
Definition 4.1 Let \(W\) be a standard Wiener process and \(a, b : \mathbb{R}^2 \to \mathbb{R}\) be some functions. Then
an Itô process is defined as
\[ dX = a(X, t)\,dt + b(X, t)\,dW \]
Note that this process is Markov, since the change in \(X\) depends only on its current value and time.
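
One common way to approximate a path of such a process numerically is to replace dt and dW by a small step δt and an increment sqrt(δt)·ϕ. This discretisation is usually called the Euler-Maruyama scheme; it is not part of the essay, and the function name `euler_maruyama` and the example coefficients below are assumptions made for illustration.

```python
import math
import random

def euler_maruyama(a, b, x0, T, N, rng):
    """Approximate a path of dX = a(X, t) dt + b(X, t) dW on [0, T].

    Returns the list [X(0), X(dt), ..., X(T)] from the discretisation
    X_{i+1} = X_i + a(X_i, t_i) dt + b(X_i, t_i) sqrt(dt) phi_i.
    """
    dt = T / N
    x = x0
    path = [x]
    for i in range(N):
        t = i * dt
        phi = rng.gauss(0.0, 1.0)
        x = x + a(x, t) * dt + b(x, t) * math.sqrt(dt) * phi
        path.append(x)
    return path

rng = random.Random(2)  # fixed seed for reproducibility

# Hypothetical coefficients: pull towards 1 with constant noise.
path = euler_maruyama(lambda x, t: 2.0 * (1.0 - x), lambda x, t: 0.3, 0.0, 1.0, 1000, rng)

# Sanity check with the noise switched off: dX = X dt with X(0) = 1 has X(1) = e.
deterministic = euler_maruyama(lambda x, t: x, lambda x, t: 0.0, 1.0, 1.0, 10_000, rng)

print(f"X(1) with noise:    {path[-1]:.4f}")
print(f"X(1) without noise: {deterministic[-1]:.5f} (exact value e = {math.e:.5f})")
```

Because the update uses only the current value and the current time, the simulated process is Markov by construction.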
Ultimately we want to use this stochastic process to determine how other stochastic processes
would behave if they depended on this one. So ideally, given a function \(G(X, t)\), we would like
to find a stochastic differential equation that this function satisfies, essentially expressing it
in the Itô process form. However, here we run into a slight issue, because if we try the standard
approach of
\[ dG = \frac{\partial G}{\partial X}\,dX + \frac{\partial G}{\partial t}\,dt = \left(a(X, t)\frac{\partial G}{\partial X} + \frac{\partial G}{\partial t}\right)dt + b(X, t)\frac{\partial G}{\partial X}\,dW \]
this won’t give us the full picture, since the standard Wiener process is in some sense proportional
to \(\sqrt{dt}\) (recall \(dW \sim \sqrt{dt}\,\phi\)). Therefore, there are more terms that should be considered here.
This is where Itô’s Lemma comes in.
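Itô’s Lemma Let \(G(X, t)\) be a function of \(X\) and \(t\), where \(X\) is an Itô process and \(t\) is time, then

\[ dG = \left(a(X, t)\frac{\partial G}{\partial X} + \frac{\partial G}{\partial t} + \frac{1}{2} b^2(X, t)\frac{\partial^2 G}{\partial X^2}\right)dt + b(X, t)\frac{\partial G}{\partial X}\,dW \]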
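Proof:
Again this will not be a rigorous proof, but it will give insights into why this lemma is true. If
we had a function \(G : \mathbb{R}^2 \to \mathbb{R}\) that is \(n\)-times continuously differentiable (\(n > 2\)), then

\[ \Delta G(u, v) \approx \frac{\partial G}{\partial u}\Delta u + \frac{\partial G}{\partial v}\Delta v + \frac{1}{2}\frac{\partial^2 G}{\partial u^2}(\Delta u)^2 + \frac{\partial^2 G}{\partial u\,\partial v}\Delta u\,\Delta v + \frac{1}{2}\frac{\partial^2 G}{\partial v^2}(\Delta v)^2 + \dots \tag{1} \]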
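In the limit as \(\Delta u \to 0\), \(\Delta v \to 0\), this converges to the classical result. What if instead of \(u\)
and \(v\) we substituted \(X\) and \(t\)? As mentioned earlier, \(\Delta X\) has a term proportional to \(\sqrt{\Delta t}\), so we
cannot ignore the \((\Delta X)^2\) term in equation (1). If we expand \((\Delta X)^2\), we get
\[ (\Delta X)^2 = \left(a(X, t)\Delta t + b(X, t)\sqrt{\Delta t}\,\phi\right)^2 = b^2 \Delta t\,\phi^2 + 2ab\,\phi\,(\Delta t)^{3/2} + a^2 (\Delta t)^2 \]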

and as \(\Delta t \to 0\) the higher power terms will disappear, so

\[ (\Delta X)^2 \to b^2 \phi^2 \Delta t \]

Now, how do we tackle this? Well, first of all, by the properties of \(\phi\) [stats recap to be made]
we know that \(E[\phi^2] = \mathrm{Var}(\phi) + (E[\phi])^2 = 1\). So the expected value of \((\Delta X)^2\) is \(b^2 \Delta t\).
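Second, the variance of \((\Delta X)^2\) is [stats recap to be made]
\[ \mathrm{Var}[(\Delta X)^2] = E[(\Delta X)^4] - \left(E[(\Delta X)^2]\right)^2 \]
\[ (\Delta X)^4 = b^4 \phi^4 (\Delta t)^2 + \{\text{terms of order } (\Delta t)^{5/2} \text{ and higher}\} \]
\[ \Rightarrow \mathrm{Var}[(\Delta X)^2] = b^4 (\Delta t)^2 E[\phi^4] - b^4 (\Delta t)^2 \]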

Using the fact that \(E[\phi^4] = 3\) for a standard normal variable, we get that

\[ \mathrm{Var}[(\Delta X)^2] = 2 b^4 (\Delta t)^2 \]

This is a very nice result for our purposes, because it means that the variance, of order \((\Delta t)^2\),
decreases faster than the expected value, of order \(\Delta t\). Therefore, as \(\Delta t \to 0\) the variance
becomes negligibly small compared to the expected value, so \((\Delta X)^2\) can be treated as
effectively non-random. So,

\[ (\Delta X)^2 \to b^2 \Delta t \]
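
One way to see this effect numerically is to add up the squared increments over a whole interval [0, T]: as the number of steps N grows, the sum settles near b²T, which is what replacing each (ΔX)² by b²Δt predicts. The sketch below assumes illustrative values for a, b and T (all hypothetical) and is only a numerical illustration, not a proof.

```python
import math
import random

def sum_of_squared_increments(a, b, T, N, rng):
    """Sum of (Delta X_i)^2 over [0, T] for Delta X_i = a*dt + b*sqrt(dt)*phi_i."""
    dt = T / N
    total = 0.0
    for _ in range(N):
        dX = a * dt + b * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += dX * dX
    return total

rng = random.Random(3)     # fixed seed for reproducibility
a, b, T = 0.5, 2.0, 1.0    # illustrative values; theory gives b^2 * T = 4

results = {N: sum_of_squared_increments(a, b, T, N, rng) for N in (10, 1000, 100_000)}
for N, total in results.items():
    print(f"N = {N:>6}: sum of (dX)^2 = {total:.4f}   (b^2 T = {b * b * T})")
```

The randomness in each individual squared increment does not vanish, but it averages out across the many small steps.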

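Now we can return to equation (1), and substituting \(X\), \(t\) we get

\[ \Delta G(X, t) \approx \frac{\partial G}{\partial X}\Delta X + \frac{\partial G}{\partial t}\Delta t + \frac{1}{2}\frac{\partial^2 G}{\partial X^2}(\Delta X)^2 + \frac{\partial^2 G}{\partial X\,\partial t}\Delta X\,\Delta t + \frac{1}{2}\frac{\partial^2 G}{\partial t^2}(\Delta t)^2 + \dots \]
\[ \Delta G(X, t) \approx \frac{\partial G}{\partial X}\left(a\Delta t + b\Delta W\right) + \frac{\partial G}{\partial t}\Delta t + \frac{1}{2}\frac{\partial^2 G}{\partial X^2} b^2 \Delta t + \{\text{terms of order } (\Delta t)^{3/2} \text{ and higher}\} \]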
By grouping the terms and letting ∆t → 0, we get the result stated in the Lemma,

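\[ dG = \left(a(X, t)\frac{\partial G}{\partial X} + \frac{\partial G}{\partial t} + \frac{1}{2} b^2(X, t)\frac{\partial^2 G}{\partial X^2}\right)dt + b(X, t)\frac{\partial G}{\partial X}\,dW \]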
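
The extra second-order term can be seen in a simple numerical experiment. The sketch below takes the hypothetical example G = X² with X = W (so a = 0 and b = 1), for which the lemma gives dG = dt + 2W dW. It integrates this along one simulated path and compares the result with W(T)² computed directly, and also with the ordinary chain rule dG = 2W dW, which omits the dt term. The values of T and N are illustrative assumptions.

```python
import math
import random

rng = random.Random(4)   # fixed seed so the run is reproducible
T, N = 1.0, 10_000       # illustrative values
dt = T / N

W = 0.0         # X = W, i.e. a = 0 and b = 1 in the Ito process
G_ito = 0.0     # integrates dG = dt + 2 W dW (Ito's lemma for G = X^2)
G_naive = 0.0   # integrates dG = 2 W dW (ordinary chain rule, no second-order term)

for _ in range(N):
    dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)
    G_ito += dt + 2.0 * W * dW
    G_naive += 2.0 * W * dW
    W += dW

print(f"W(T)^2 directly:     {W * W:.4f}")
print(f"from Ito's lemma:    {G_ito:.4f}")
print(f"from the chain rule: {G_naive:.4f} (short by roughly T = {T})")
```

On a single path the Itô version tracks W(T)² up to discretisation noise, while the chain-rule version is off by about T.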

5 Stock prices and the log-normal property


To model stock prices, it is reasonable to assume that there could be an average drift upwards,
because we expect the company to grow or for other reasons. This growth would probably be
proportional to the stock’s price, since a $1 change in price is far more significant for a stock
that costs $2 than for one that costs $100. Furthermore, we expect the price to be somewhat
random, and again we would expect this randomness to be proportional to the stock’s price.
Therefore, we can define a simple model of a stock’s price as an Itô process,

\[ dS = \mu S\,dt + \sigma S\,dW \]

where

• \(S\) is the stock’s price in some currency

• \(\mu\) is the expected return per unit time of \(S\)

• \(\sigma\) is the volatility per unit time of \(S\)

So, for example, if the expected return per year was 10% and the annual volatility was 20%, then,
with time measured in years, it could be modeled as
\[ dS = 0.10\,S\,dt + 0.20\,S\,dW \]
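Note that if we model stock prices in this manner then they follow a log-normal distribution. Let
\(G = \ln S\), then by differentiating and using Itô’s lemma we get

\[ \frac{\partial G}{\partial S} = \frac{1}{S}, \qquad \frac{\partial^2 G}{\partial S^2} = -\frac{1}{S^2}, \qquad \frac{\partial G}{\partial t} = 0 \]

\[ dG = \left(\mu S\frac{\partial G}{\partial S} + \frac{\partial G}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 G}{\partial S^2}\right)dt + \sigma S\frac{\partial G}{\partial S}\,dW \]
\[ dG = \left(\mu S\cdot\frac{1}{S} + 0 - \frac{1}{2}\cdot\frac{1}{S^2}\,\sigma^2 S^2\right)dt + \sigma S\cdot\frac{1}{S}\,dW \]
\[ dG = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW \]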
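So we can see that, according to this model, \(\ln S\) follows a generalized Wiener process with
constant drift rate \(\mu - \sigma^2/2\) and constant variance rate \(\sigma^2\), since \(\mu\) and \(\sigma\) are fixed.
Therefore, we get that

\[ \ln S(T) - \ln S(0) \sim N\left(\left(\mu - \frac{\sigma^2}{2}\right)T,\; \sigma^2 T\right) \]

so \(S(T)\) is log-normally distributed.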
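
This distribution can be checked against a direct simulation of the price model. The sketch below steps dS = µS dt + σS dW forward in small time steps for the 10% return and 20% volatility example, then compares the sample mean and variance of ln S(T) − ln S(0) with (µ − σ²/2)T and σ²T. The step count, the number of paths and the starting price `S0` are illustrative assumptions.

```python
import math
import random
import statistics

def simulate_log_return(mu, sigma, T, N, rng, S0=1.0):
    """Return ln S(T) - ln S(0) from a stepwise simulation of dS = mu S dt + sigma S dW."""
    dt = T / N
    S = S0
    for _ in range(N):
        dW = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        S += mu * S * dt + sigma * S * dW
    return math.log(S / S0)

rng = random.Random(5)             # fixed seed for reproducibility
mu, sigma, T = 0.10, 0.20, 1.0     # 10% expected return, 20% volatility, one year
N, n_paths = 100, 3000             # illustrative discretisation choices

log_returns = [simulate_log_return(mu, sigma, T, N, rng) for _ in range(n_paths)]
mean_log = statistics.fmean(log_returns)
var_log = statistics.variance(log_returns)

print(f"mean:     {mean_log:.4f} (theory {(mu - sigma ** 2 / 2) * T:.4f})")
print(f"variance: {var_log:.4f} (theory {sigma ** 2 * T:.4f})")
```

Note that the mean log-return is close to 0.08 rather than 0.10, which reflects the −σ²/2 correction produced by Itô’s lemma.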

6 Black-Scholes equation
Now that we have the necessary ingredients, we can derive the Black-Scholes equation. To do this,
we imagine setting up a riskless portfolio of a stock and an option on it: given a position in the
option, we hold the right quantity of the stock so that any price movement in the stock is offset
by the movement in the option’s price. This is the so-called perfect hedge. However, it is
maintained only for an infinitesimal time interval, and the portfolio has to be continuously
adjusted. So, let \(f\) be the price of a European call option, \(T\) its maturity date, \(S\) the price of
the underlying stock and \(\Pi\) the total value of the portfolio. Then, given that we are short 1 call
option, a riskless portfolio would hold \(\frac{\partial f}{\partial S}\) shares of the stock:
\[ \Pi = -f + \frac{\partial f}{\partial S} S \tag{2} \]
So after some \(\Delta t\), the change in our portfolio will be
\[ \Delta\Pi = -\Delta f + \frac{\partial f}{\partial S}\Delta S \]
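Furthermore, assuming the same model for stock prices as in the previous section and using Itô’s
lemma, we get that
\[ df = \left(\mu S\frac{\partial f}{\partial S} + \frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}\right)dt + \sigma S\frac{\partial f}{\partial S}\,dW \]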
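So combining the results we have, we get

\[ \Delta S = \mu S\,\Delta t + \sigma S\,\Delta W \]

\[ \Delta f = \left(\mu S\frac{\partial f}{\partial S} + \frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}\right)\Delta t + \sigma S\frac{\partial f}{\partial S}\,\Delta W \]
\[ \Rightarrow \Delta\Pi = \left(-\frac{\partial f}{\partial t} - \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}\right)\Delta t \tag{3} \]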
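
The step to equation (3) is a direct substitution of the expressions for ΔS and Δf into the change in the portfolio value, treating the number of shares ∂f/∂S as fixed over the interval Δt. Written out, the ΔW terms and the µ terms cancel:

```latex
\begin{align*}
\Delta\Pi &= -\Delta f + \frac{\partial f}{\partial S}\,\Delta S \\
&= -\left(\mu S\frac{\partial f}{\partial S} + \frac{\partial f}{\partial t}
     + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}\right)\Delta t
   - \sigma S\frac{\partial f}{\partial S}\,\Delta W
   + \frac{\partial f}{\partial S}\left(\mu S\,\Delta t + \sigma S\,\Delta W\right) \\
&= \left(-\frac{\partial f}{\partial t}
     - \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}\right)\Delta t
\end{align*}
```

Since no ΔW term survives, the change in the portfolio value over Δt involves no randomness, which is what makes the portfolio riskless over that interval.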
Now, because the portfolio we have created is riskless, it must yield the risk-free interest rate,
denoted \(r\), which would usually be the yield of some government bond. Why is this the case?
Well, since the portfolio is riskless, its outcome is certain. Therefore, if it earned less than the
risk-free rate, traders could short this portfolio, invest the proceeds at the risk-free rate, and
earn a riskless profit. On the other hand, if it earned more, they could borrow at the risk-free
rate and buy the portfolio. Either trade would quickly eliminate the arbitrage opportunity.
Therefore, we assume that there aren’t any, and the return of this portfolio in \(\Delta t\) is the
risk-free rate. That is,

∆Π = rΠ∆t

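So by combining this and equations (2), (3), we get

\[ \Delta\Pi = r\left(-f + \frac{\partial f}{\partial S} S\right)\Delta t = \left(-\frac{\partial f}{\partial t} - \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2}\right)\Delta t \]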

And after simplifying we get the famous Black-Scholes equation,

\[ \frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 f}{\partial S^2} + rS\frac{\partial f}{\partial S} = rf \]
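
As a numerical sanity check, one can take the well-known closed-form price of a European call, which is not derived in this essay and is assumed here, and verify with finite differences that it satisfies the equation in the form f_t + ½σ²S²f_SS + rS f_S − rf = 0. The strike `K`, the contract parameters and the step sizes `hS` and `ht` below are hypothetical choices made for illustration.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def call_price(S, t, K, T, r, sigma):
    """Closed-form price of a European call with strike K and maturity T (assumed, not derived here)."""
    tau = T - t
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

def pde_residual(S, t, K, T, r, sigma, hS=1e-2, ht=1e-4):
    """f_t + 0.5 sigma^2 S^2 f_SS + r S f_S - r f, using central differences."""
    f = lambda s, u: call_price(s, u, K, T, r, sigma)
    f_t = (f(S, t + ht) - f(S, t - ht)) / (2 * ht)
    f_S = (f(S + hS, t) - f(S - hS, t)) / (2 * hS)
    f_SS = (f(S + hS, t) - 2 * f(S, t) + f(S - hS, t)) / hS ** 2
    return f_t + 0.5 * sigma ** 2 * S ** 2 * f_SS + r * S * f_S - r * f(S, t)

# Hypothetical contract: S = 100, K = 100, one year to maturity, r = 5%, sigma = 20%.
price = call_price(100.0, 0.0, 100.0, 1.0, 0.05, 0.20)
residual = pde_residual(100.0, 0.0, 100.0, 1.0, 0.05, 0.20)

print(f"call price:   {price:.4f}")
print(f"PDE residual: {residual:.2e}")
```

A residual close to zero at the chosen point is consistent with the equation, although a check at one point is of course not a proof.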
