
A Quick Guide through the Wondrous World of Financial Modelling


qhouwink
December 2016


1 Distributions
2 Return
Simple Returns:

R_t = (P_t − P_{t−1}) / P_{t−1}    (1)

Continuously Compounded Returns:

Y_t = log(1 + R_t) = log(P_t / P_{t−1}) = log(P_t) − log(P_{t−1})    (2)
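A minimal sketch of both definitions in Python (numpy only; the price series is hypothetical):

```python
import numpy as np

prices = np.array([100.0, 102.0, 101.0, 105.0])   # hypothetical price series

simple_returns = prices[1:] / prices[:-1] - 1      # R_t = (P_t - P_{t-1}) / P_{t-1}
log_returns = np.log(prices[1:] / prices[:-1])     # Y_t = log(P_t / P_{t-1})

# log(1 + R_t) equals the continuously compounded return, as in equation (2)
assert np.allclose(np.log(1 + simple_returns), log_returns)
```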

2.0.1 Simple vs. continuously compounded returns


• Simple is used for all accounting purposes
• Investors are concerned with simple returns
• Continuously compounded returns are symmetric

• Their mathematics are easier


• Also convenient for derivatives pricing (Black-Scholes)

2.0.2 Autocorrelations
Autocorrelations measure how returns on one day are correlated with returns on previous days. The coefficients of an autocorrelation function (ACF) give the correlation between returns and their lags:

β̂i = Corr(xt , xt−i ) (3)


Joint significance of autocorrelation coefficients (β̂_1, β̂_2, ..., β̂_N) can be tested by using the Ljung-Box (LB) test shown in the following formula:

J_N = T(T + 2) Σ_{i=1}^{N} β̂_i² / (T − i) ∼ χ²(N)    (4)
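A minimal sketch of the LB statistic in Python (numpy only; the sample ACF is computed directly, and the test series is simulated white noise):

```python
import numpy as np

def ljung_box(x, n_lags):
    """Ljung-Box statistic over the first n_lags sample autocorrelations."""
    x = np.asarray(x) - np.mean(x)
    T = len(x)
    lags = np.arange(1, n_lags + 1)
    acf = np.array([np.sum(x[i:] * x[:-i]) for i in lags]) / np.sum(x**2)
    return T * (T + 2) * np.sum(acf**2 / (T - lags))

# Compare against a chi-squared(N) critical value (18.31 for N = 10 at 5%)
rng = np.random.default_rng(0)
print(ljung_box(rng.standard_normal(500), n_lags=10))
```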

2.1 Normal Distribution


Used in pretty much every financial model.
• Asset returns are often not exactly normally distributed
• The normal assumption is wrong on three important accounts:

1. Volatility clustering
2. Fat tails
3. Nonlinear dependence
The density of the normal distribution is given by:

f(x) = (1 / (σ√(2π))) e^(−(x−µ)²/(2σ²))    (5)

2.1.1 Volatility clustering


Volatility tends to cluster: large moves are followed by large moves, so volatility itself is autocorrelated and this autocorrelation can be exploited in volatility models.


2.1.2 Fat tails


Financial data tends to show more extreme outcomes than you would expect from a normal distribution.
This is called fat tails. The presence of fat tails can be tested in several ways; a code sketch follows the list below.

1. Jarque-Bera (JB) test or Kolmogorov-Smirnov (KS) test.


(a) The JB test is given by the following test statistic:

(T/6) Skewness² + (T/24) (Kurtosis − 3)² ∼ χ²(2)    (6)

(b) The KS test is based on minimum-distance estimation, comparing the sample with a reference probability distribution (e.g. the t-distribution). Its advantage is that it is distribution-free and non-parametric.
2. QQ-plots
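A minimal sketch of the JB test from item 1(a) in Python (numpy/scipy; the fat-tailed sample is simulated from a t-distribution):

```python
import numpy as np
from scipy import stats

def jarque_bera(x):
    """JB statistic from sample skewness and raw kurtosis, as in equation (6)."""
    T = len(x)
    s = stats.skew(x)
    k = stats.kurtosis(x, fisher=False)   # raw kurtosis; equals 3 for a normal
    return T / 6 * s**2 + T / 24 * (k - 3)**2

x = np.random.default_rng(1).standard_t(df=4, size=1000)   # fat-tailed sample
jb = jarque_bera(x)
print(jb, stats.chi2.sf(jb, df=2))   # small p-value: normality rejected
```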

2.1.3 Nonlinear dependence


Under the normal distribution, dependence is linear. Nonlinear dependence (NLD) implies that dependence
between variables changes depending on some factor. In finance, perhaps according to market conditions.
Example: Different returns are relatively independent during normal times, but highly dependent during
crises.
Exceedance Correlations show the correlations of (standardized) stock returns X and Y as being condi-
tional on exceeding some threshold.

ρ_e(p) = Corr[X, Y | X ≤ Q_X(p) and Y ≤ Q_Y(p)]  for p ≤ 0.5
ρ_e(p) = Corr[X, Y | X > Q_X(p) and Y > Q_Y(p)]  for p > 0.5    (7)
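A minimal sketch in Python (numpy only; the bivariate normal sample is simulated, so the tail exceedance correlations should fall below the unconditional 0.5):

```python
import numpy as np

def exceedance_corr(x, y, p):
    """Correlation of x and y conditional on both exceeding their p-quantile."""
    qx, qy = np.quantile(x, p), np.quantile(y, p)
    if p <= 0.5:
        mask = (x <= qx) & (y <= qy)   # joint lower-tail exceedance
    else:
        mask = (x > qx) & (y > qy)     # joint upper-tail exceedance
    return np.corrcoef(x[mask], y[mask])[0, 1]

rng = np.random.default_rng(2)
x, y = rng.multivariate_normal([0, 0], [[1, 0.5], [0.5, 1]], size=5000).T
print(exceedance_corr(x, y, 0.05), exceedance_corr(x, y, 0.95))
```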

2.1.4 Frequency and non-stationarity


Other stylised features include the frequency at which trades occur and the non-stationarity of the covariance. The returns themselves, however, are considered stationary.

2.2 t-distribution (student distribution)


Models fat tails better than a normal distribution. Notation: t(v), where v is the degrees of freedom. A smaller v means fatter tails (more excess kurtosis); for v = ∞ the t-distribution becomes a normal distribution.

f(x) = Γ((v+1)/2) / (√(vπ) Γ(v/2)) · (1 + x²/v)^(−(v+1)/2)    (8)


3 Factor Models
A factor model estimates risk and return on the basis of other variables such as macroeconomic effects or
other stocks. The general form of a factor model (the return-generating process, RGP) is:

r̃_i = E(r̃_i) + b_{i,1} f̃_1 + ... + b_{i,K} f̃_K + ẽ_i    (9)

• The f̃_j's are common factors that affect most securities; examples are economic growth, interest rates, and inflation. Each factor is expressed as the deviation from its expected value, so E[f̃_j] = 0.
• bi,j denotes the loading of the i’th asset on the j’th factor. This tells you how much the asset’s return
goes up when the factor is one unit higher than expected.

3.0.1 What are factor models good for?


• Pricing Assets
• Performance evaluation
• Risk Management

• Portfolio selection

3.0.2 Merits and cons of factor models


Merits
• Easy to predict return, standard deviations and covariances

• When the relations between assets are captured by a small set of common factors, the number of input variables is drastically reduced
• Easy to specialize along certain factors
Cons

• Some factor models are purely statistical and do not explain whether a low price is because of risk or because of mispricing
• Relies on past data and assumes stationarity of volatility

3.0.3 Types of factors


• Macroeconomic (GDP, inflation, etc.)

• Firm specific (firm size, industry)


• Unobservable (Abstract factors determined by data)

3.1 CAPM
The CAPM is a single-factor model based on an equilibrium argument, specifying a relation between expected rates of return and covariances for all assets. The most commonly quoted equation for the CAPM is:

E(Ri ) = Rf + βi [E(Rm ) − Rf ] (10)


where Rf is the risk-free rate of interest and [E(Rm) − Rf] is the market risk premium.


3.1.1 CAPM assumptions


• No transaction costs.
• Assets are all tradable and are all infinitely divisible.
• No taxes.
• No individual can affect security prices (perfect competition).
• Investors care only about expected returns and variances:
– returns are normally distributed.
– all investors have a quadratic utility function
• Unlimited short sales and borrowing and lending
• Homogeneous expectations.

3.1.2 Finding β for CAPM


A stock’s beta can be calculated in two ways – one approach is to calculate it directly:
β_i = Cov(R_i^e, R_m^e) / Var(R_m^e)    (11)
Alternatively, and equivalently, we can run a simple time-series regression of the excess stock returns on
the excess returns to the market portfolio separately for each stock, and the slope estimate will be the beta:
R_{i,t}^e = α_i + β_i R_{m,t}^e + u_{i,t}    (12)
A more elaborate regression could also be run, with squared βs or with terms capturing the tendency of small-capitalisation firms to earn larger returns than the CAPM predicts. A code sketch of both beta estimates follows.
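A minimal sketch of both calculations in Python (numpy only; the excess-return series are simulated with a true beta of 1.2):

```python
import numpy as np

rng = np.random.default_rng(3)
rm_excess = 0.04 * rng.standard_normal(60)     # simulated market excess returns
ri_excess = 0.001 + 1.2 * rm_excess + 0.02 * rng.standard_normal(60)

# Direct approach, equation (11): beta = Cov(Ri^e, Rm^e) / Var(Rm^e)
beta_direct = np.cov(ri_excess, rm_excess)[0, 1] / np.var(rm_excess, ddof=1)

# Time-series regression, equation (12): the OLS slope is the same number
beta_ols, alpha_i = np.polyfit(rm_excess, ri_excess, deg=1)

print(beta_direct, beta_ols)   # identical up to rounding
```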

3.1.3 Problems of CAPM


• False assumption that all investors have the same preferences
• Heteroscedasticity: changing volatility across stocks
• Non-normality, caused by outliers
• Measurement error: the βs might be wrongly estimated

3.2 APT
The APT specifies a pricing relationship with a number of “systematic” factors. The idea behind the APT
is that investors require different rates of return from different securities, depending on the riskiness of the
securities. If, however, risk is priced inconsistently across securities, then there will be arbitrage opportuni-
ties.

3.2.1 Assumptions for APT


• All securities have finite expected value and variance
• Agents can form well-diversified portfolios
• There are no taxes or transactions costs

The APT pricing equation is:

E(r̃_i) = λ_0 + λ_1 b_{i,1} + ... + λ_K b_{i,K}    (13)


3.2.2 Selecting factors

1. We treat the factors as observable and specify the f̃_j directly. Loadings can be estimated using
regressions.
(a) In this case, the factors can be macroeconomic variables like inflation, output.
(b) We can estimate the loadings via regression.
(c) We think that these variables will be sufficient to capture the systematic risk in the economy.
2. We treat the loadings as observable and obtain these from fundamental information, e.g. value firms have higher returns than growth firms, or similar firms move together (the Fama-French approach).
3. Treat both factors and loadings as unobservable, and extract statistically determined factors and their loadings from the data via regressions (e.g. factor analysis or principal components).

3.2.3 CAPM vs. APT


• APT makes fewer assumptions, but gets weaker results
• APT does not say what the systematic factors are
• APT relies on a statistical model for returns
• APT does not assume everyone is optimizing

3.3 Other very related models:


• Fama and MacBeth: they use five years of observations to estimate the CAPM betas and the other
risk measures (the standard deviation and squared beta) and these are used as the explanatory variables
in a set of cross-sectional regressions each month for the following four years
• Fama-French, A set of cross-sectional regressions are run of the form:

Ri,t = α0,t + α1,t βi,t + α2,t M Vi,t + α3,t BT Mi,t + ui,t (14)

where Ri,t are again the monthly returns, βi,t are the CAPM betas, M Vi,t are the market capitalisa-
tions, and BT Mi,t are the book-to-price ratios, each for firm i and month t. Similar stuff can be done
with SMB (small minus big) and HML (high minus low). SMB is the difference in return between a
portfolio of small stocks and a portfolio of large stocks. HML is the difference in return between a
portfolio of value stocks and a portfolio of growth stocks.
• Carhart: it has become customary to add a fourth factor to the equations above based on momentum. This is measured as the difference between the returns on the best performing stocks over the past year and the worst performing stocks.

3.4 Tests
3.4.1 Pearson’s Correlation Coefficient
Pearson’s correlation coefficient measures the strength of linear dependence; it is scale-independent and defined as:
ρ_{X,Y} = σ_{X,Y} / (σ_X σ_Y)    (15)

ρ is always less than or equal to 1 in magnitude. If ρ is greater than 0 it implies that Y tends to be above
average when X is above average, and if ρ is less than 0 then Y tends to be below average when X is above
average. If ρ is 0 then there is no linear relationship between X and Y.


3.4.2 Testing CAPM


First estimate betas for, say, 100 stocks (N = 100) using five years of monthly data (T = 60). Then run a cross-sectional regression of the average returns on the betas:

R̄_i = λ_0 + λ_1 β_i + v_i    (16)

We should then find that λ_0 = Rf and λ_1 = [Rm − Rf]. The same can be done with squared betas or with added residual-risk terms, but those λs should always be zero. A code sketch follows.
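A minimal sketch of the second-pass regression in Python (numpy only; betas and average returns are simulated so that the CAPM holds by construction, with hypothetical Rf = 0.002 and risk premium 0.005):

```python
import numpy as np

rng = np.random.default_rng(4)
rf, premium = 0.002, 0.005                       # assumed monthly values
betas = rng.uniform(0.5, 1.5, size=100)          # first-pass beta estimates
avg_returns = rf + premium * betas + 0.001 * rng.standard_normal(100)

lam1, lam0 = np.polyfit(betas, avg_returns, deg=1)
print(lam0, lam1)   # lambda_0 should be near Rf, lambda_1 near [Rm - Rf]
```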


4 Event Studies
Event studies are extremely common in finance and in research projects! They represent an attempt to gauge
the effect of an identifiable event on a financial variable, usually stock returns. For example, research has
investigated the impact of various types of announcements (e.g., dividends, stock splits, entry into or deletion
from a stock index) on the returns of the stocks concerned.

• Define the return for each firm i on each day t during the event window as Rit
• We need to be able to separate the impact of the event from other, unrelated movements in prices.
• So we construct abnormal returns, denoted ARit , which are calculated by subtracting an expected
return from the actual return ARit = Rit - E(Rit )

• The simplest method for constructing expected returns is to assume a constant mean return. More complex methods include market-model regressions using indices like the S&P 500 or home-made indices (a sketch of the constant-mean approach follows this list).
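A minimal sketch of the constant-mean approach in Python (numpy only; both return windows are simulated):

```python
import numpy as np

rng = np.random.default_rng(5)
estimation_window = 0.01 * rng.standard_normal(250)   # pre-event returns
event_window = 0.01 * rng.standard_normal(21)         # returns around the event

expected = estimation_window.mean()      # constant-mean expected return
abnormal = event_window - expected       # AR_it = R_it - E(R_it)
car = abnormal.cumsum()                  # cumulative abnormal return
print(abnormal.mean(), car[-1])
```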

4.1 Hypothesis testing


The hypothesis testing framework is usually set up so that the null to be examined is of the event having
no effect on the stock price (i.e. an abnormal return of zero). These test statistics will be asymptotically
normally distributed.

AR_{it} ∼ N(0, σ²(AR_{it}))    (17)


List of types of tests

1. Variance of residuals from the market model


2. Standardised abnormal return, which is the abnormal return divided by its standard error
3. Cumulative abnormal return, sums all abnormal returns

4. Standardised cumulative abnormal return, which is a combination of the former two.

It can also be interesting to look at average abnormal returns in an event study across firms, or over both firms and time.

4.2 Problems to take into account in event studies


• Covariance between firms arises when similar firms with overlapping event windows are grouped together in a study, and must be accounted for

• The variance tends to increase over time after an event


• The estimation window or the number of stocks used for a study could be too small


5 EWMA, ARCH and GARCH


5.1 Modeling Framework
E[ε_t | Ω_{t−1}] = 0    (18)

E[ε_t² | Ω_{t−1}] = σ_t²    (19)

Writing ε_t = σ_t u_t, with u_t i.i.d. with unit conditional variance, it also holds that:

E[ε_t² | Ω_{t−1}] = σ_t² E[u_t² | Ω_{t−1}] = σ_t²    (20)

5.2 EWMA
The simplest volatility model is the moving average (MA) model, which is a very poor model:

σ̂_t² = (1/W_E) Σ_{i=1}^{W_E} y_{t−i}²    (21)

RiskMetrics is a slightly more advanced model; it uses a weighted sum of past squared returns:

σ̂_t² = w_1 y_{t−1}² + w_2 y_{t−2}² + ... + w_{W_E} y_{t−W_E}²    (22)
EWMA is even more advanced, exponentially declining the weights of past returns. This can also be
written as shown in the following formula, which is the standard notation of EWMA.
σ̂_t² = (1 − λ) y_{t−1}² + λ σ̂_{t−1}²    (23)
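A minimal sketch of the recursion (23) in Python (numpy only; λ = 0.94 is the classic RiskMetrics value, and the starting value for σ̂² is a sample variance):

```python
import numpy as np

def ewma_variance(y, lam=0.94, seed_window=30):
    """EWMA variance recursion (23), seeded with an initial sample variance."""
    sigma2 = np.empty(len(y))
    sigma2[0] = np.var(y[:seed_window])
    for t in range(1, len(y)):
        sigma2[t] = (1 - lam) * y[t - 1]**2 + lam * sigma2[t - 1]
    return sigma2

y = 0.01 * np.random.default_rng(6).standard_normal(1000)   # placeholder returns
print(ewma_variance(y)[-1])
```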

5.2.1 Pros and cons


The EWMA is a restricted GARCH model.
Pros
• Really easy to implement
• Not much less accurate than the more complex GARCH
• Multivariate versions are also easy
Cons
• By definition less accurate than GARCH
• Substituting the recursion for σ̂²_{t−1} repeatedly shows that the weights sum to one, so shocks never die out and the implied unconditional variance is not finite.

5.3 ARCH
ARCH (autoregressive conditional heteroskedasticity) models are used to model financial time series with time-varying volatility, such as stock returns. ARCH models assume that the variance of the current error term is related to the size of the previous periods’ error terms, giving rise to volatility clustering.

σ_t² = ω + α_1 ε_{t−1}²    (24)


Or in general, ARCH(q):
σ_t² = ω + α_1 ε_{t−1}² + α_2 ε_{t−2}² + ... + α_q ε_{t−q}²    (25)
Characteristics of ARCH
• ARCH provided a framework to model time series with varying volatility.
• However, how should q be determined? q might be very large
• Non-negativity constraints might be violated as q increases
• Does not capture autocorrelations


5.3.1 Write ARCH as AR


Using the following relation:

v_t = ε_t² − σ_t² ⇔ σ_t² = ε_t² − v_t    (26)

we can write the ARCH(1) as an AR(1) in ε_t²:

ε_t² = ω + α_1 ε_{t−1}² + v_t    (27)

Also, if the ARCH model is stationary, then:

E[ε_t²] = σ² = ω / (1 − α_1)    (28)

5.4 GARCH
The general process for a GARCH model involves three steps. The first is to estimate a best-fitting au-
toregressive model. The second is to compute autocorrelations of the error term. The third is to test for
significance. GARCH models are used by financial professionals in several areas including trading, investing,
hedging and dealing. A GARCH(1,1) model is shown below:

σ_t² = ω + α_1 u_{t−1}² + β σ_{t−1}²    (29)

• Avoids overfitting (no need for the large q an ARCH model may require)
• Less likely to breach non-negativity constraints than ARCH
• Takes autocorrelation in volatility into account

5.4.1 Finding parameters of GARCH(1,1)


1. Specify the appropriate equations for the mean and the variance of the GARCH(1,1) model.

y_t = µ + φ y_{t−1} + u_t    (30)

σ_t² = α_0 + α_1 u_{t−1}² + β σ_{t−1}²    (31)

2. Specify the log-likelihood function (LLF) to maximise under a normality assumption for the distur-
bances.
L = −(T/2) log(2π) − (1/2) Σ_{t=1}^{T} log(σ_t²) − (1/2) Σ_{t=1}^{T} (y_t − µ − φ y_{t−1})² / σ_t²    (32)
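A minimal sketch of the estimation in Python (numpy/scipy), assuming a constant mean (φ = 0) for brevity; the return series is a simulated placeholder and the LLF is maximised by minimising its negative:

```python
import numpy as np
from scipy.optimize import minimize

def neg_llf(params, y):
    """Negative Gaussian log-likelihood of a constant-mean GARCH(1,1)."""
    mu, omega, alpha, beta = params
    u = y - mu
    sigma2 = np.empty(len(y))
    sigma2[0] = np.var(y)                   # initialise at the sample variance
    for t in range(1, len(y)):
        sigma2[t] = omega + alpha * u[t - 1]**2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(sigma2) + u**2 / sigma2)

y = 0.01 * np.random.default_rng(7).standard_normal(2000)   # placeholder returns
res = minimize(neg_llf, x0=[0.0, 1e-6, 0.05, 0.90], args=(y,),
               bounds=[(None, None), (1e-12, None), (0, 1), (0, 1)])
print(res.x)   # estimates of mu, omega, alpha, beta
```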

5.4.2 Types of GARCH


• GJR-GARCH
– The GARCH model is “symmetric” in the sense that positive and negative values of ε_{t−1} have the same effect on σ_t².
– Often, we see volatility shooting up after stock crashes, much more so than after price increases.
– GJR-GARCH corrects for this asymmetry; estimation is by maximum likelihood
• E-GARCH
– The E-GARCH is similar to the GJR model, but is defined in logs
– Additional advantages: variance always positive by design
• (G)ARCH-M
– Sometimes, you want the volatility to enter the level equation
– Example: as a correction for (time-varying) risk


5.4.3 GARCH and ARMA


Substituting the following relation into the GARCH(1,1) equation:

v_t = ε_t² − σ_t² ⇔ σ_t² = ε_t² − v_t    (33)

This gives an ARMA(1,1) in ε_t²:

ε_t² = ω + (α_1 + β_1) ε_{t−1}² + v_t − β_1 v_{t−1}    (34)

E[ε_t²] = σ² = ω / (1 − α_1 − β_1)    (35)

5.4.4 Forecasting with GARCH
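Since E_t[u²_{t+k}] = E_t[σ²_{t+k}], the k-step-ahead variance forecast of a GARCH(1,1) follows the standard recursion

E_t[σ²_{t+k}] = ω + (α_1 + β) E_t[σ²_{t+k−1}]

so forecasts converge geometrically, at rate α_1 + β, to the unconditional variance σ² = ω / (1 − α_1 − β). A minimal sketch of the recursion in Python (numpy; the parameter values are hypothetical):

```python
import numpy as np

def garch_forecast(omega, alpha, beta, sigma2_next, horizon):
    """k-step-ahead GARCH(1,1) variance forecasts from the one-step forecast."""
    f = np.empty(horizon)
    f[0] = sigma2_next
    for k in range(1, horizon):
        f[k] = omega + (alpha + beta) * f[k - 1]
    return f

print(garch_forecast(1e-6, 0.05, 0.90, sigma2_next=2e-4, horizon=10))
```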


6 Simulations
6.1 Monte Carlo Simulation
Simulation studies are usually used to investigate the properties and behaviour of various statistics of
interest. The technique is often used in econometrics when the properties of a particular estimation method
are not known. For example, it may be known from asymptotic theory how a particular test behaves with
an infinite sample size, but how will the test behave if only 50 observations are available?

6.1.1 Conducting Monte Carlo Simulation


1. Generate the data according to the desired data generating process (DGP), with the errors being drawn
from some given distribution

2. Do the regression and calculate the test statistic


3. Save the test statistic or whatever parameter is of interest
4. Go back to stage 1 and repeat N times.
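A minimal sketch of the four steps in Python (numpy only): a made-up experiment asking how often a nominal 5% two-sided t-test rejects a true null (slope = 0) with only T = 50 observations.

```python
import numpy as np

rng = np.random.default_rng(8)
T, N, rejections = 50, 5000, 0
for _ in range(N):
    x = rng.standard_normal(T)                       # 1. generate data under the DGP
    y = 1.0 + 0.0 * x + rng.standard_normal(T)       #    true slope is zero
    sxx = np.var(x, ddof=1) * (T - 1)
    beta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)    # 2. run the regression
    resid = y - (y.mean() - beta * x.mean()) - beta * x
    se = np.sqrt(resid @ resid / (T - 2) / sxx)
    rejections += abs(beta / se) > 1.96              # 3. save the statistic of interest
print(rejections / N)                                # 4. after N repetitions, summarise
```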

6.1.2 Random Number Generation


• Fundamental input of MC analysis is a long sequence of random numbers (RNs)
• A sequence of RNs is usually generated as a function of the previous RNs
• Random number generators can only provide a fixed number of different random numbers, after which
they repeat themselves. This fixed number is called a period

• Serial correlation between successive draws should be prevented

• Normally distributed random numbers can be generated with the Box-Muller method, as sketched below
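A minimal sketch of the Box-Muller transform in Python (numpy only):

```python
import numpy as np

def box_muller(n, rng):
    """Turn pairs of uniform draws into independent standard normal draws."""
    u1, u2 = rng.random(n), rng.random(n)
    r = np.sqrt(-2.0 * np.log(u1))        # radius
    z1 = r * np.cos(2.0 * np.pi * u2)     # two independent N(0, 1) variates
    z2 = r * np.sin(2.0 * np.pi * u2)
    return z1, z2

z1, z2 = box_muller(100_000, np.random.default_rng(9))
print(z1.mean(), z1.std())   # approximately 0 and 1
```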

6.1.3 Limitations of Monte Carlo for financial modelling


• It might be computationally expensive
• The results might not be precise when wrong assumptions are made

• The results are often hard to replicate


• Simulation results are experiment specific

6.2 Variance Reduction Techniques


The sampling variation in a Monte Carlo study is measured by the standard error estimate, denoted Sx .
S_x = √(var(x) / N)    (36)

6.2.1 Antithetic Variates


One reason that a lot of replications are typically required of a Monte Carlo study is that it may take many,
many repeated sets of sampling before the entire probability space is adequately covered. By their very
nature, the values of the random draws are random, and so after a given number of replications, it may be
the case that not the whole range of possible outcomes has actually occurred.
The antithetic variate technique involves taking the complement of a set of random numbers and running a
parallel simulation on those. For example, if the driving stochastic force is a set of T N(0, 1) draws, denoted
ut , for each replication, an additional replication with errors given by -ut is also used.
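A minimal sketch in Python (numpy only): estimating E[exp(u)] for u ∼ N(0, 1), whose true value is exp(0.5) ≈ 1.6487, with and without the antithetic pair −u:

```python
import numpy as np

rng = np.random.default_rng(10)
u = rng.standard_normal(10_000)

plain = np.exp(u).mean()                            # ordinary MC estimate
antithetic = 0.5 * (np.exp(u) + np.exp(-u)).mean()  # pair each draw with -u
print(plain, antithetic)   # the antithetic estimate typically has lower variance
```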


6.3 Bootstrapping
In finance, the bootstrap is often used instead of a pure simulation, this is mainly because financial asset
returns do not follow the standard statistical distributions that are used in simulations. Bootstrapping is
similar to pure simulation but the former involves sampling from real data rather than creating new data.

6.4 Control Variates


• The application of control variates involves employing a variable similar to that used in the simulation,
but whose properties are known prior to the simulation.
• Denote the variable whose properties are known by y, and that whose properties are under simulation
by x.
• The simulation is conducted on x and also on y, with the same sets of random number draws being
employed in both cases.
• Denoting the simulation estimates of x and y using hats, a new estimate of x can be derived from:

x∗ = y + (x̂ − ŷ) (37)

6.4.1 Examples of bootstrapping


Consider a standard regression model:
y = βX + u (38)
The regression can be bootstrapped in two ways.
1. Resample the data: take the data and sample entire rows corresponding to observation i together. A problem with this approach is that we are sampling from the regressors.
2. Resample from the residuals: estimate the model on the actual data, obtain the fitted values ŷ, and calculate the residuals û (a sketch of this approach follows).
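A minimal sketch of the residual bootstrap (approach 2) in Python (numpy only; the data are simulated with a true slope of 2):

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.standard_normal(200)
y = 0.5 + 2.0 * x + rng.standard_normal(200)

beta, alpha = np.polyfit(x, y, deg=1)   # estimate the model on the actual data
resid = y - alpha - beta * x            # residuals u-hat

boot_betas = []
for _ in range(1000):
    y_star = alpha + beta * x + rng.choice(resid, size=len(resid), replace=True)
    boot_betas.append(np.polyfit(x, y_star, deg=1)[0])
print(np.std(boot_betas))   # bootstrap standard error of the slope
```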

6.4.2 Pros and cons of bootstrapping


• Makes inferences without using a distribution
• Properties are similar to the dataset by definition
• Large outliers have a significant effect on bootstrap results

6.5 Derivatives
A derivative is a financial instrument whose value depends on (or is derived from) the value of other, more basic, underlying variables, such as stocks and bonds.


6.5.1 Some definitions


• Moneyness
– strike price divided by stock price
– call is 5% in-the-money: strike price is equal to 0.95 × stock price
– put is 5% in-the-money: strike price is equal to 1.05 × stock price


7 Value at Risk (VaR)


The VaR is the loss on a portfolio such that losses equal to or exceeding it occur with probability p. Notation: VaR(p)

7.1 Important terms:


• Q = profit/loss
• V = Value of portfolio
• p = probability of equaling or exceeding VaR
• Holding period = 100/p

• We = estimation window, period of time used for developing a forecast model


• Wt = testing window, the period that is forecasted

7.2 Problems
• VaR is only a quantile on the P/L distribution, could be much worse than the VaR shows for one day
• VaR is not a coherent risk measure
• VaR is easy to manipulate by companies

• In general many datapoints are needed

7.3 Method for finding VaR


1. The probability of losses exceeding VaR, p

2. The holding period, the time period over which losses may occur
3. Identification of the probability distribution of the P/L of the portfolio

7.3.1 Possibilities for the distribution


1. Historical Simulation
Order all observations from small to large and take the value at the relevant quantile for p (e.g. VaR(5%) for a data set of 500 points corresponds to the 25th-smallest observation); see the code sketch after this list.
(a) Very simple and does not assume a certain distribution
(b) Assumes history repeats itself
(c) Big sample needed, minimum sample size of 3/p
(d) Adjusts slowly to changes in volatility

2. GARCH normal distribution (variance-covariance method)


Assumes stock returns are normally distributed. Probability is extracted from this normal distribution.

VaR_t = −σ Φ^{−1}(p) V    (39)

(a) Normal distribution assumed (often incorrect)


(b) Parameters of normal distribution need to be found
(c) Usually has a much smaller error term than HS
(d) GARCH needs much more than 3/p observations.


3. EWMA and MA: the MA model assumes that each day in the sample gets the same weight, but we can improve volatility forecasts by applying more weight to the most recent dates. This can be done using the EWMA model.
(a) MA is really bad at forecasting
(b) EWMA is only slightly worse than GARCH
(c) EWMA needs data set of at least 0.3/p observations
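A minimal sketch of approaches 1 and 2 in Python (numpy/scipy; the return series is a simulated placeholder and the portfolio value is hypothetical):

```python
import numpy as np
from scipy.stats import norm

def var_historical(returns, p, value):
    """Historical simulation: VaR is minus the p-quantile of returns, scaled."""
    return -np.quantile(returns, p) * value

def var_normal(returns, p, value):
    """Variance-covariance method, equation (39), assuming normal returns."""
    sigma = np.std(returns, ddof=1)
    return -sigma * norm.ppf(p) * value

y = 0.01 * np.random.default_rng(12).standard_normal(500)   # placeholder returns
print(var_historical(y, 0.05, 1e6), var_normal(y, 0.05, 1e6))
```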

7.4 Testing VaR


Done by backtesting: an estimation window is used to determine the VaR, which is then tested over the testing window. A loss exceeding the VaR is called a VaR violation, marked by η_t = 1. v_1 counts the number of violations (the sum of η_t) and v_0 counts the days with η_t = 0.

7.4.1 Violation Rate


Violation Rate = observed number of violations / expected number of violations = v_1 / (p × W_T)    (40)
Rule of thumb:

• The VaR model is good if the VR is in [0.8, 1.2]

• The VaR model is bad if VR < 0.5 or VR > 1.5

7.4.2 Bernoulli coverage test


Non-parametric, so it does not need input from any particular distribution. It performs a likelihood ratio test based on Bernoulli variables (0, 1).
Unrestricted Likelihood = L_U(p̂) = Π_{t=W_e+1}^{T} (1 − p̂)^{1−η_t} p̂^{η_t} = (1 − p̂)^{v_0} p̂^{v_1}    (41)

Restricted Likelihood = L_R(p) = Π_{t=W_e+1}^{T} (1 − p)^{1−η_t} p^{η_t} = (1 − p)^{v_0} p^{v_1}    (42)

The likelihood ratio test rejects if LR > 3.84 at a 5% significance level; a code sketch follows the test statistic below.

LR = 2(log L_U(p̂) − log L_R(p)) ∼ χ²(1) asymptotically    (43)
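A minimal sketch of the coverage test in Python (numpy/scipy; the violation indicators are simulated with a true violation rate of 8%, so the test should reject for p = 5%):

```python
import numpy as np
from scipy.stats import chi2

def bernoulli_coverage(eta, p):
    """LR coverage test from a 0/1 violation series, equations (41)-(43)."""
    v1 = eta.sum()
    v0 = len(eta) - v1
    p_hat = v1 / len(eta)
    log_lu = v0 * np.log(1 - p_hat) + v1 * np.log(p_hat)
    log_lr = v0 * np.log(1 - p) + v1 * np.log(p)
    lr = 2 * (log_lu - log_lr)
    return lr, chi2.sf(lr, df=1)

eta = np.random.default_rng(13).random(1000) < 0.08   # simulated violations
print(bernoulli_coverage(eta, p=0.05))                # LR > 3.84: reject
```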

7.4.3 Independence test


Test whether violations cluster (several in a row). This can be tested with an independence test, which compares the probability of a violation following a violation with the probability of a violation following a non-violation. With p̂_{ij} the estimated probability of moving from state i to state j, and v_{ij} the corresponding transition counts:

L_U = (1 − p̂_{01})^{v_{00}} p̂_{01}^{v_{01}} (1 − p̂_{11})^{v_{10}} p̂_{11}^{v_{11}}    (44)

L_R = (1 − p̂)^{v_{00}+v_{10}} p̂^{v_{01}+v_{11}}    (45)

LR = 2(log L_U − log L_R) ∼ χ²(1) asymptotically    (46)
