Lecture7 Bootstrap Simulation PDF

Bootstrap Simulation
Dr. Jinghan Meng
FINA0404/3351 – Spreadsheet Modelling in Finance

Bootstrapping
❖ Bootstrapping is also a widely used resampling algorithm.
❖ In statistics, bootstrapping relies on random sampling with replacement.
❖ Bootstrapping is the practice of estimating properties of an estimator
(such as its variance) by measuring those properties when sampling from an
approximating distribution.
❖ One standard choice for an approximating distribution is the empirical
distribution of the observed data.
❖ That is, if the observations can be assumed to come from an independent and
identically distributed population, this can be implemented by constructing a
number of resamples of the observed dataset, each drawn with replacement.
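As a rough illustration of the bullets above, the following Python sketch estimates the standard error of an estimator by resampling the observed data with replacement. The function name `bootstrap_std_error`, the number of resamples, and the sample values are all hypothetical and chosen only for illustration; they are not taken from the lecture workbook.

```python
import random
import statistics

def bootstrap_std_error(data, estimator, n_resamples=1000, seed=0):
    """Estimate the standard error of `estimator` by resampling `data`
    with replacement (the empirical distribution stands in for the
    unknown population)."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        # Each resample has the same size as the observed sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    # The spread of the estimator across resamples approximates its
    # sampling variability.
    return statistics.stdev(estimates)

# Hypothetical daily returns, for illustration only.
sample = [0.012, -0.008, 0.003, 0.021, -0.015, 0.007, -0.002, 0.010]
se_mean = bootstrap_std_error(sample, statistics.mean)
```

Any other estimator (for example `statistics.median`) can be passed in place of `statistics.mean` without changing the resampling logic.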
Bootstrapping
[Figure: an unknown population (shown as “?”) gives rise to one observed sample
(𝑟1, …, 𝑟5). Several bootstrapped samples are drawn from the observed sample
with replacement, and statistical inference is made from them.]

Simple Bootstrapping 1
❖ The most basic approach can be seen in the Excel workbook
“Lecture7_Bootstrap.xlsm”, tab “boot0”.
❖ Basically, we simulate the “position” of observations in the sample, rather
than simulating data based on any model.
❖ Steps:
1. Let 𝑁 denote the total number of observations in the sample. We index
the observations from 1 to 𝑁. The index is the “position” of each observation.
2. Generate a random seed 𝑟𝑠 ~ Uniform(0,1). Then 𝑟𝑠 × 𝑁 follows a
Uniform(0, 𝑁) distribution.
3. Calculate the “boot position” using 𝑏 = Ceiling(𝑟𝑠 × 𝑁), which rounds
𝑟𝑠 × 𝑁 up to the nearest integer. This is the simulated index of the
observation.
4. Find the data point corresponding to the index 𝑏. This is the bootstrapped
observation.
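The four steps above can be sketched in Python as follows. This is a minimal illustration, not the workbook's implementation: the names `boot_position` and `simple_bootstrap` and the sample values are hypothetical. One added assumption is the guard against a seed of exactly 0, which would otherwise give position 0.

```python
import math
import random

def boot_position(n, rng):
    """Return a simulated 1-based position b = Ceiling(rs * n)."""
    rs = rng.random()        # step 2: random seed in [0, 1)
    b = math.ceil(rs * n)    # step 3: round rs * n up to an integer
    return max(b, 1)         # assumption: map the rare rs == 0.0 case to 1

def simple_bootstrap(data, n_draws, seed=0):
    """Draw n_draws observations from data by simulating their positions."""
    rng = random.Random(seed)
    n = len(data)            # step 1: observations are indexed 1..n
    # Step 4: look up the observation at each simulated position
    # (subtract 1 because Python lists are 0-based).
    return [data[boot_position(n, rng) - 1] for _ in range(n_draws)]

# Hypothetical excess returns, for illustration only.
sample = [0.012, -0.008, 0.003, 0.021, -0.015]
boot = simple_bootstrap(sample, n_draws=10)
```

Because positions are drawn independently, the same observation can appear several times in `boot`, which is exactly sampling with replacement.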
❖ As you can see, the bootstrapped HSI excess returns have a mean and
standard deviation similar to those of the original sample.
❖ More importantly, the bootstrapped HSI series captures the “skewness” of
stock returns, which cannot be achieved by our previous simulation method.
❖ Skewness is an important property of stock returns.
The definition of “skewness”: $\dfrac{E\left[(r - E[r])^3\right]}{\sigma^3}$
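The skewness definition above can be computed directly, as in the short sketch below. It uses the population form (dividing by the number of observations), which is an assumption; Excel's SKEW function applies a small-sample adjustment, so its values can differ slightly. The function name `skewness` is hypothetical.

```python
import statistics

def skewness(returns):
    """Population skewness: E[(r - E[r])^3] / sigma^3."""
    mean = statistics.fmean(returns)
    sigma = statistics.pstdev(returns)   # population standard deviation
    third_moment = statistics.fmean([(r - mean) ** 3 for r in returns])
    return third_moment / sigma ** 3
```

A symmetric sample gives a skewness of zero, while a sample with a long right tail gives a positive value. Applying the function to both the original and the bootstrapped series is one way to check that the bootstrap preserves this property.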
Bootstrapping using VBA
❖ The function “bootseeds” in Module “Bootstrapping” is used to bootstrap
data positions using the simple bootstrapping method.
❖ The Sub procedure “boots_series()” in Module “Bootstrapping” is used to
bootstrap 100 times from the HSI excess return data (Columns N and O of Excel
tab “boot0”) using the function bootseeds.
Factor Model Simulation - Bootstrap
❖ The Excel “bootstrap” tab continues the example of the single-factor model
simulation ($r^e_{i,t} = \alpha + \beta r^e_{m,t} + \varepsilon_t$) of CAF stock prices.
❖ Now we simulate the HSI excess return ($r^e_{m,t}$) and the firm-specific
shock ($\varepsilon_t$) with the bootstrap method:
▪ HSI excess returns are bootstrapped from the sample observations.
▪ Shocks are bootstrapped from the regression residuals.
▪ 𝛼, 𝛽 are still estimated from the regression.
❖ In the example, we can simulate CAF returns and prices over the next 20
days (a time-series simulation).
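A minimal Python sketch of this time-series simulation is given below. It assumes simple (not log) returns and a constant risk-free rate `rf` added back to the simulated excess return to obtain the price path; neither assumption is stated in the slides. The function name `simulate_path` and all parameter names are hypothetical.

```python
import random

def simulate_path(alpha, beta, market_excess, residuals, p0,
                  horizon=20, rf=0.0, seed=0):
    """Simulate `horizon` days of returns and prices from the
    single-factor model, bootstrapping both inputs."""
    rng = random.Random(seed)
    returns, prices = [], [p0]
    for _ in range(horizon):
        rm = rng.choice(market_excess)   # bootstrapped HSI excess return
        eps = rng.choice(residuals)      # bootstrapped firm-specific shock
        excess = alpha + beta * rm + eps # single-factor model
        r = excess + rf                  # assumption: constant risk-free rate
        returns.append(r)
        prices.append(prices[-1] * (1.0 + r))  # assumption: simple returns
    return returns, prices
```

Here `rng.choice` draws the market return and the residual independently of each other and of earlier draws, which matches simple bootstrapping. The returned price list has `horizon + 1` entries because it starts with the initial price `p0`.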
❖ We can then use Excel's Data Table function to run a cross-sectional
simulation of month-end CAF prices 100 times. From these you can calculate the
mean and SD of the price, as well as a confidence interval for the mean price.
❖ Note: We don’t impose assumptions on the distributions of $r^e_{m,t}$ and
$\varepsilon_t$, but we require them to be independent and not autocorrelated.
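The summary statistics mentioned above can be computed as in the sketch below. The confidence interval uses a normal approximation with z = 1.96 (roughly 95%), which is an assumption; the workbook may construct its interval differently. The function name `summarize` is hypothetical.

```python
import math
import statistics

def summarize(prices, z=1.96):
    """Return the mean, sample SD, and a confidence interval for the
    mean of a list of simulated end-of-period prices."""
    n = len(prices)
    mean = statistics.fmean(prices)
    sd = statistics.stdev(prices)        # sample standard deviation
    half_width = z * sd / math.sqrt(n)   # z times the standard error of the mean
    return mean, sd, (mean - half_width, mean + half_width)
```

Note that the interval describes uncertainty about the mean simulated price, not the range of individual simulated prices; the latter is described by the SD itself.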
❖ You must be wondering: Why is this method called “bootstrap”?
=> It comes from the phrase “pulling yourself up by your own bootstraps”.
❖ Bootstrapping methods are “non-parametric”.
❖ Bootstrapping methods have better “small-sample properties” (especially
for higher-order properties such as skewness).
❖ Thus, instead of using parametric estimation, we would prefer using
bootstrapping to make robust statistical inferences and simulations.
❖ Simple bootstrapping can capture the non-normality of a small sample.
❖ But it cannot capture the autocorrelation in residuals (a time-series
property of the data). For that, we need “stationary bootstrapping”.
Autocorrelation in Residuals
❖ Check the autocorrelation coefficient for the CAF return series. In the tab
“FactorModel”, we find that the 1st order autocorrelation of predicted
residuals is as high as 0.43.
❖ However, the 1st order autocorrelation coefficient of the simulated
residuals ε is very close to 0, because our random seeds are totally
uncorrelated.
❖ Autocorrelation in residuals actually matters, as it suggests that the OLS
estimation is not precise (the standard errors and the t-stats of the alphas
and betas are thus incorrect).
❖ The intuition is that positive autocorrelation, as found here, will lead to
more volatile simulated time series, because shocks tend to persist rather
than cancel out.
=> Stationary bootstrap can generate autocorrelated residuals ε.
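To make the autocorrelation check above concrete, here is one common estimator of the first-order autocorrelation coefficient. This is a sketch under the assumption that the lag-1 sample autocorrelation is the statistic intended; the workbook may compute it differently (for example with CORREL on the series and its lag), which gives slightly different values. The function name `autocorr1` is hypothetical.

```python
import statistics

def autocorr1(series):
    """First-order sample autocorrelation of a series."""
    mean = statistics.fmean(series)
    dev = [x - mean for x in series]
    # Sum of products of each deviation with the next one.
    num = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    # Sum of squared deviations over the whole series.
    den = sum(d * d for d in dev)
    return num / den
```

A series that trends smoothly gives a positive value, a series that flips sign every period gives a negative value, and independently drawn values should give a value near zero in large samples.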
Stationary Bootstrapping 1
❖ We will bootstrap in the following way:
1. First, we randomly pick the observation at time 𝑡 as the first
bootstrapped observation.
2. Second, with probability 𝑞 we take the following observation, at time
𝑡 + 1, as the second bootstrapped observation; with probability 1 − 𝑞 we
randomly pick another observation (at a time other than 𝑡) instead.
3. We continue the above process to pick the third, the fourth, …
bootstrapped observations until the simulated timeline is complete.
4. If the picked observation is already the last one in the sample, we use
the first observation as the next one (“wrap-around”).
❖ Note: 𝑞 is not equal to the autocorrelation. It is a parameter of the
bootstrap, and we may try different values of 𝑞 in practice.
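The procedure above can be sketched in Python as a generator of bootstrapped positions. This is an illustrative sketch, not the workbook's “statbootseeds” function: the name `stationary_positions` is hypothetical, positions are 0-based rather than 1-based, and the sample is assumed to contain at least two observations.

```python
import random

def stationary_positions(n, length, q, seed=0):
    """Return `length` 0-based positions in a sample of size n (n >= 2)."""
    rng = random.Random(seed)
    positions = [rng.randrange(n)]       # step 1: random starting point
    while len(positions) < length:       # step 3: repeat until complete
        t = positions[-1]
        if rng.random() < q:
            # Step 2 (probability q): take the next observation in time.
            # Step 4: the modulo wraps the last observation to the first.
            nxt = (t + 1) % n
        else:
            # Step 2 (probability 1 - q): jump to a random position
            # other than t, drawn uniformly from the n - 1 alternatives.
            nxt = rng.randrange(n - 1)
            if nxt >= t:
                nxt += 1
        positions.append(nxt)
    return positions
```

With a high 𝑞 such as 0.8, the output consists mostly of runs of consecutive positions, roughly five observations long on average, which is how the time ordering of the original data is carried into the bootstrapped series.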
❖ See the tab “stationary” for an example with 𝑞 = 0.8.
❖ As you can see, the bootstrapped residuals for CAF excess returns
based on stationary bootstrapping show a first-order autocorrelation
similar to that of the original residuals.
❖ Now, since we assume that CAF follows the CAPM, and the market
factor also contains random noise, we will need to conduct the stationary
bootstrap for both noise series together (market factor residuals
and CAF residuals).
❖ After bootstrapping both the market factor residuals and the CAF residuals,
we can generate simulated CAF returns and the corresponding prices.
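One possible reading of bootstrapping both series “together” is that a single set of bootstrapped positions is applied to both residual series, so that each simulated day reuses the market and CAF residuals from the same historical day. The sketch below illustrates that reading only; it is an assumption about the workbook, and the function name `draw_jointly` is hypothetical.

```python
def draw_jointly(positions, market_resid, stock_resid):
    """Apply the same 0-based positions to two aligned residual series."""
    if len(market_resid) != len(stock_resid):
        raise ValueError("the two residual series must be aligned in time")
    market_boot = [market_resid[p] for p in positions]
    stock_boot = [stock_resid[p] for p in positions]
    return market_boot, stock_boot
```

Under this reading, any same-day relationship between the two residual series in the original data is carried over to the bootstrapped pairs.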
❖ We can then easily compute the simulated CAF price using
stationary bootstrapping.
❖ We can use the Data Table function (using the same input for the CAF
price on Aug 29) to generate 100 simulated CAF scenarios over the 20 days
after Aug 29.
❖ Let’s take a look at the standard deviation of the 100 simulated
prices to see the distribution (possible future scenarios) of CAF prices.
❖ Now, compare the SD of simulated CAF prices based on simple
bootstrapping with that based on stationary bootstrapping. Which
should give the more dispersed distribution? What’s the implication?
❖ We find that the simulations based on stationary bootstrapping are
more spread out than the simulations based on simple bootstrapping.
❖ Such a finding supports the intuition that autocorrelation matters
(it leads to higher standard deviations).
❖ So, when data display high autocorrelation, we should use
stationary bootstrapping for more robust and conservative simulations
(because we will get more spread-out distributions, which fit the
original data better).
❖ More importantly, stationary bootstrapping accommodates the
cross-correlation among variables.
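The comparison described above can be reproduced on synthetic data, as in the sketch below. Everything here is an assumption made for illustration: the residuals are generated from an AR(1) process with a hypothetical coefficient of 0.7 rather than taken from the CAF regression, 20-day outcomes are approximated by summing residuals rather than compounding prices, and all function names are hypothetical.

```python
import random
import statistics

def simple_positions(n, length, rng):
    """Independent uniform positions (simple bootstrapping)."""
    return [rng.randrange(n) for _ in range(length)]

def stationary_positions(n, length, q, rng):
    """Positions that continue in time order with probability q."""
    pos = [rng.randrange(n)]
    while len(pos) < length:
        t = pos[-1]
        if rng.random() < q:
            pos.append((t + 1) % n)          # next observation, wrapping
        else:
            j = rng.randrange(n - 1)         # random position other than t
            pos.append(j + 1 if j >= t else j)
    return pos

def sd_of_cumulative(series, draw_positions, n_sims):
    """SD across simulations of the summed bootstrapped series."""
    totals = [sum(series[p] for p in draw_positions()) for _ in range(n_sims)]
    return statistics.stdev(totals)

rng = random.Random(42)

# Synthetic, positively autocorrelated residuals (not the CAF data).
phi = 0.7
resid = [0.0]
for _ in range(999):
    resid.append(phi * resid[-1] + rng.gauss(0.0, 0.01))

n, horizon = len(resid), 20
sd_simple = sd_of_cumulative(
    resid, lambda: simple_positions(n, horizon, rng), 500)
sd_stationary = sd_of_cumulative(
    resid, lambda: stationary_positions(n, horizon, 0.8, rng), 500)
```

In this synthetic setup one would expect `sd_stationary` to be the larger of the two, because the stationary scheme keeps runs of persistent shocks together while simple bootstrapping breaks them up.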
Bootstrapping using VBA
❖ The function “bootseeds” in Module “Bootstrapping” is used to
bootstrap data positions using the simple bootstrapping method.
❖ The Sub procedure “boots_series()” in Module “Bootstrapping” is
used to bootstrap 20 times from the HSI excess return data (Columns N and
O of Excel tab “boot0”) using the function bootseeds.
❖ The function “statbootseeds” in Module “Stationary” is used to
bootstrap data positions using the stationary bootstrapping method.
❖ The Sub procedure “statboots_series()” in Module “Stationary” is
used to bootstrap 20 times from the CAF residual data (Columns K and L of
Excel tab “stationary”).