University of Connecticut
Xiao Liang
Yiran Liao
Ziran Luo
Yuechen Pan
Yuqi Peng
Zhiyong Yan
Zhaojie Yang
Professor Biolsi
8 December 2017
Forecasting Crude Oil and Natural Gas Volatility
1. Introduction
1.1 Background
1.2 Literature Review
2. Data
2.1 Data Source for EGARCH and GJR-GARCH
2.2 Mean Reversion Test
2.2.1 Hurst Exponent
2.2.2 Augmented Dickey–Fuller Test
2.2.3 Conclusion
3. Prediction
3.1 VIX
3.1.1 Theory foundation of OVX
3.1.1.1 VIX methodology
3.1.2 Prediction
3.1.3 Conclusion
3.2 Option Moneyness
3.2.1 Moneyness implied volatility and historical volatility
3.2.2 Absolute change of moneyness implied volatility and that of historical volatility
3.2.3 Conclusion
3.3 GARCH (1,1) Model
3.3.1 Theory foundation of GARCH (1,1)
3.3.2 Conclusion
3.4 EGARCH Model
3.4.1 Theory foundation of EGARCH
3.4.2 Prediction
3.4.3 Conclusion
3.5 GJR GARCH Model
3.5.1 Theory foundation of GJR-GARCH
3.5.2 Prediction
3.5.3 Conclusion
4. Model Comparison and Conclusion
4.1 Comparison
4.2 Conclusion
References
Figures
Figures – Crude Oil
Figures – Natural Gas
Tables
Tables - Crude Oil
Tables - Natural Gas
APPENDIX
R Code - Crude oil
R Code - Natural Gas
1. Introduction
1.1 Background
Crude oil is arguably one of the most important driving forces of the global economy,
and changes in the price of oil have significant effects on economic growth and welfare
around the world. Oil matters even more to industrialized economies. Today, oil is one of
the world's most important raw materials and will likely remain so for many decades to
come. Most countries are significantly affected by developments in the oil market, either
as producers, consumers, or both. Oil is directly responsible for about 2.5% of world GDP,
and in 2014 it provided about 38% of the world's energy needs; it is expected to remain a
leading component of the world's energy supply. By one estimate, meeting the projected
increase in world oil demand requires total petroleum supply to reach 118 million barrels
per day in 2030, up from 80 million barrels per day in 2003. Every day we use hundreds of
things that are made from oil or gas. Beyond its role as a cooking and heating fuel in most
U.S. households, natural gas has many other energy and raw-material uses that surprise most
people who learn about them. In the United States, most natural gas is burned as a fuel. In
2012, about 30% of the energy consumed across the nation was obtained from natural gas. It
was used to generate electricity, heat buildings, fuel vehicles, heat water, bake foods,
power industrial furnaces, and even run air conditioners.
Oil and gas power nearly 100% of all transportation, and the consumption of oil and gas
is very large. The world's oil and gas transport infrastructure is a globe-spanning
spiderweb of pipelines and shipping routes. The natural gas distribution pipelines in the US
alone could stretch from Earth to the Moon 7-8 times. There are millions upon millions of
miles of pipe on the planet to distribute crude oil, refined products, and natural gas.
There is no reason to think that any feasible amount of renewables growth can displace
fossil fuels within a couple of generations. Wind and solar are growing fast, but the use of
renewables as a percentage of total world energy consumption increased by only 0.07% from
1973 to 2009.
Oil and natural gas not only play a huge role in the world economy but also have a
strong influence on global crises. Understanding the macro impacts of oil and gas prices
also requires considering in detail the exposure and interactions of micro channels, such as
the housing or auto sector.
Predicting future oil and gas prices is therefore important. Higher oil and gas prices
mean higher business costs. From a financial perspective, many sectors of the economy are
adversely affected when oil and gas prices increase and helped when they go down. The
effect can be very different for importing and exporting countries. It is universal, and it
changes fast. During the past decade, the price of oil has traveled from $60 per barrel to a
peak of $145 in 2008
Here is how crude oil and natural gas prices reacted to a variety of geopolitical and
economic events during the past decade. 2008 was an extraordinary year. The crude oil price
rose to its pre-financial-crisis peak of $145 amid unrest and consumers' fears about the
wars in both Iraq and Afghanistan. In addition, it was the year of the Beijing Olympic
Games: millions of travelers entered the country, which drove up the demand for oil. The
crude oil price began to fall when the global market collapsed later that same year. Now
consider the natural gas price over the same period. Normally, natural gas prices fall in
the summer and rise in the winter. In 2008, however, instead of following seasonal weather,
the price spiked in the summer and then quickly dropped from its peak of over $13 per Mcf
to below $3 per Mcf in the winter, as the economic recession reduced demand. In that year
it followed the pattern of oil prices, rising in the summer and falling in the winter. After
the economy had recovered for two years, the crude oil price fluctuated in a relatively
small range and maintained steady growth from 2011 to early 2014, while the price of
natural gas fluctuated slightly without any major spikes. Due to oversupply by OPEC and the
appreciation of the U.S. dollar in the second half of 2014, oil and gas demand was driven
down. The crude oil price fell to a level not seen since the last global recession. The
Iran nuclear deal and the turmoil in Iraq and Libya also contributed to the declining crude
oil price and affected geopolitical risk in the market. Crude oil and natural gas prices
both reached their lowest points at the beginning of 2016, below $30 and $2 respectively.
Over the last couple of decades, volatility has become one of the most significant
issues in the energy market. Energy prices are clearly the most volatile of all commodity
prices: crude oil, natural gas, coal and other energy products all show significant price
fluctuations. These fluctuations create uncertainty in the minds of consumers and
producers. Oil price shocks due to such events have continuously increased in size and
frequency. Wide fluctuations in oil prices have played an important role in driving
recessions and even the collapse of regimes, which is why oil price movements are closely
watched by economists and investors. Given this evidence of significant changes in crude
oil and natural gas prices, we naturally ask whether the econometric tools that we
From a finance perspective, in the current context of an ongoing global financial crisis,
risk management and volatility forecasting are among the most important topics in the
financial world. Return is closely related to volatility, and most financial decisions are
based on a tradeoff between risk and return. Thus, in order to analyze and predict changes
in the returns of crude oil and natural gas, we have to analyze
Overall, the fluctuation in crude oil and natural gas prices over the past decade has
had a significant impact on the stock market and the global economy. For risk managers, it
is important to understand the degree of volatility in any investment, along with its
potential impact on the overall investment strategy, since volatility directly affects
investment valuation. Therefore, being able to understand and predict future volatility is
very important in
There are two main approaches to volatility forecasting. One is based on time series,
and the other uses the volatility implied by option prices. From a theoretical point of
view, the implied volatility of an option price should contain all the available
information relevant to forecasting future volatility. However, the actual situation is
more complicated. In general, implied volatility carries a risk premium because volatility
risk cannot be completely hedged, as shown in the study of Bollerslev and Zhou (2005). In
addition, one of the most noteworthy phenomena is the smile effect, which shows the
limitations of the classic Black-Scholes model. The smile effect means that implied
volatilities calculated for options with different strikes on the same underlying and with
the same time to maturity are not necessarily equal. In general, implied volatility is
U-shaped in the strike, and the minimum implied volatility occurs at the at-the-money
position. Thus, if we want to use implied volatility to predict future volatility, the same
market is giving multiple forecasts for the future volatility of the same
1.2 Literature Review
In the literature, many models have been used to forecast crude oil volatility. The
most widely used models are the ARCH model proposed by Engle (1982) and then
unconditional variance for the first time, but the model is simple and needs a lot of
parameters. Bollerslev (1986) extended the ARCH model to the Generalized Autoregressive
Conditional Heteroscedasticity (GARCH) model, which has the same key properties as ARCH but
requires far fewer parameters to adequately model the volatility process. Building on their
research, we can construct models that simultaneously describe both the mean and the
variance of financial time series. These approaches are significant improvements in time
series analysis.
Time-invariant GARCH(1,1) models have fared well in predicting the conditional
volatility of financial assets (Hansen and Lunde 2005). Moreover, oil price volatility has been
cannot avoid the impacts caused by the asymmetric effects of positive and negative asset
returns. To overcome this weakness, Nelson (1991) proposed an extension of the GARCH model
called the Exponential GARCH (EGARCH), which accounts for these asymmetric effects. Another
widely used extension is the GJR-GARCH model proposed by Glosten, Jagannathan and Runkle
(1993). GJR-GARCH has been shown to have good out-of-sample performance when forecasting
oil price volatility at short horizons (Mohammadi and Su 2010; Hou and Suardi 2012).
Building on previous papers, Wei et al. (2010) studied nine GARCH models and compared their
forecasting accuracy under six different loss functions. They concluded that although
nonlinear models can properly capture long-memory volatility and the asymmetric leverage
effect, these models are not the best models for forecasting volatility.
2. Data
In this paper, the data we obtained from Bloomberg are based on West Texas Intermediate
(WTI) crude oil futures contracts and natural gas futures traded on the New York Mercantile
Exchange (NYMEX) division of the Chicago Mercantile Exchange (CME). We chose WTI futures
because they provide a sufficient amount of data to measure and compare the forecasting
accuracy of different models. We obtained daily and monthly closing prices for crude oil
over a 10-year period, from 2008 to 2017. Options are divided into 7 strike price
categories ranging from 10 percent out-of-the-money to 10 percent in-the-money. The
historical model is estimated on a 21-day window of realized volatility.
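A rolling-window estimate of this kind can be sketched as follows. This is a minimal
illustration in Python (the paper's appendix code is in R), not the Bloomberg calculation:
the function name `rolling_volatility`, the sample-standard-deviation estimator, and the
252-trading-day annualization factor are all assumptions made for the example.

```python
import math
import statistics


def rolling_volatility(returns, window=21, periods_per_year=252):
    """Annualized rolling volatility of daily log returns.

    `window` and `periods_per_year` are illustrative defaults (assumptions).
    """
    vols = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]       # the most recent `window` returns
        daily_sd = statistics.stdev(chunk)      # sample standard deviation
        vols.append(daily_sd * math.sqrt(periods_per_year))
    return vols


# Hypothetical daily log returns alternating between +1% and -1%.
example_returns = [0.01, -0.01] * 15
example_vols = rolling_volatility(example_returns, window=21)
```

Each element of `example_vols` is the annualized volatility of one 21-day window; the
window then moves forward by one day, as described for HVT.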
Different models produce different series of predicted volatilities. To find the best
prediction model, we run a regression between each model's predicted values and HVT. HVT is
the historical volatility of the oil price, which we downloaded directly from Bloomberg.
Historical volatility reflects the actual price changes of oil over a given time period, so
the model whose regression fits HVT best is the optimal one. HVT is usually calculated over
several different moving windows. The window is the number of days included in the HVT
calculation, and it moves forward through the data set one day at a time. We take the
30-day HVT, which represents monthly volatility, for the VIX, option moneyness, GARCH(1,1),
and GJR-GARCH models, and use the 10-day, 30-day, 50-day, and 100-day windows for the
EGARCH model. For the period from January 2nd, 2008 to July 20th, 2017, the average
volatilities for crude oil are 34.62%, 35.73%, 36.05% and 36.57% for the 10-day, 30-day,
50-day, and 100-day windows, and the corresponding standard deviations are 21.65%, 19.18%,
18.42% and 17.42%. For natural gas, the data range from November 2nd, 2007 to November 8th,
2017; the average volatilities for the 10-day and 30-day windows are 44.79% and 46.00%, and
the standard deviations are 20.45% and 16.80%. The maximum 10-day volatility is 172.6%,
observed on September 29th, 2009. The maximum 30-day volatility is 130.66%, observed on
October 1st, 2009. This means that average volatility rises as the window lengthens, but
the volatility will be more
2.1 Data Source for EGARCH and GJR-GARCH
For crude oil, in both EGARCH and GJR-GARCH, we use the daily spot price of West Texas
Intermediate (WTI) crude oil, obtained through an R package. The sample period ranges from
January 2nd, 2008 to July 20th, 2017. Over this period, the average price of a barrel of
crude oil was $77.31, the median was $82.18, and the standard deviation was $24.79. A
maximum price of $145.29 was observed on July 3, 2008, and the minimum price of $26.21 on
February 11, 2016. To model oil price returns and their volatility, we calculate daily oil
returns as the difference in the logarithms of consecutive days' closing prices. The mean
rate of return is about -0.031%, with a standard deviation of 2.49%. WTI returns are
slightly positively skewed, at 0.2012. Kurtosis is somewhat high at 4.54, compared with 3
for a normal distribution. Large variations are observed during the global financial crisis
in late 2008 and after crude oil prices started decreasing in July 2014. The reasons for
choosing this date range are given in the background section above, and the period is long
enough for us to make reliable predictions.
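The return and moment calculations described above can be sketched as follows. This is an
illustrative Python sketch under stated assumptions, not the paper's R code: the function
names are hypothetical, the prices are made up, and population (divide-by-n) moments are
assumed, so skewness and kurtosis may differ slightly from a package that applies
small-sample corrections.

```python
import math


def log_returns(prices):
    """Daily log returns: ln(P_t) - ln(P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def summary_moments(values):
    """Mean, standard deviation, skewness and (non-excess) kurtosis.

    Population moments are an assumption of this sketch; under this
    convention a normal distribution has kurtosis 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
    }


# Hypothetical closing prices, not the paper's data.
prices = [100.0, 102.0, 99.0, 101.0, 100.0]
stats = summary_moments(log_returns(prices))
```

Because log returns add up across days, the mean return over a period depends only on the
first and last prices, which is why a long sample with a lower final price has a slightly
negative mean.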
For natural gas, we use the daily natural gas futures price from CME. The sample runs
from November 2nd, 2007 to November 8th, 2017, with 2510 observations in total. Over this
period, the average price of natural gas is $4.11 per MMBtu, the median is $3.70, and the
standard deviation of daily futures prices is $1.96. A maximum price of $13.577 was
observed on July 3, 2008, and the minimum price of $1.63 on March 3, 2016. The maximum
price of natural gas occurred on the same day as that of crude oil, and the minimum prices
of gas and oil occurred close together. As with crude oil, we calculate daily gas returns
as the difference in the logarithms of consecutive days' closing prices. The mean rate of
return is -0.039%, with a standard deviation of 3.04%. Natural gas returns are positively
skewed, at 0.6394, and have a slightly high kurtosis of 4.85. From these simple statistics
on natural gas prices and returns, we can observe that the price tendencies of crude oil
and natural gas are somewhat similar to each other.
For the comparable historical volatility, we download the 10-day, 30-day, 50-day, and
100-day HVT for the EGARCH model and the 30-day HVT for the GJR-GARCH model.
2.2 Mean Reversion Test
2.2.1 Hurst Exponent
The Hurst exponent is the classical test for detecting long memory in time series. The
analysis was introduced by the English hydrologist H.E. Hurst in 1951, based on Einstein's
contributions regarding the Brownian motion of physical particles, to deal with the problem
of reservoir control near the Nile River Dam. R/S analysis was introduced into economics by
Mandelbrot, who argued that this methodology was superior to the autocorrelation, the
The oldest and best-known method to estimate the Hurst exponent is R/S analysis. It was
When the process is a Brownian motion, H is 0.5; when it is persistent, H is greater
than 0.5; and when it is anti-persistent, H is less than 0.5. For white noise, H = 0, while
for a simple linear trend, H = 1. Note that H must lie between 0 and 1.
We used the R package "pracma 2.1.1" and its function hurstexp(x, box, display).
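To make the R/S idea concrete, the sketch below estimates H as the slope of log(R/S)
against log(window size). It is a simplified Python illustration, not the algorithm behind
pracma's hurstexp; the function names and the window sizes are assumptions, and the input
is treated as a plain sequence of observations split into non-overlapping windows.

```python
import math


def rescaled_range(chunk):
    """R/S statistic of one window: range of cumulative deviations over std."""
    n = len(chunk)
    mean = sum(chunk) / n
    cumulative, path = 0.0, []
    for x in chunk:
        cumulative += x - mean
        path.append(cumulative)
    spread = max(path) - min(path)
    std = math.sqrt(sum((x - mean) ** 2 for x in chunk) / n)
    return spread / std if std > 0 else None


def hurst_rs(series, window_sizes=(8, 16, 32, 64)):
    """Slope of log(mean R/S) on log(window size), by ordinary least squares."""
    xs, ys = [], []
    for n in window_sizes:
        chunks = [series[i:i + n] for i in range(0, len(series) - n + 1, n)]
        values = [v for v in map(rescaled_range, chunks) if v]
        if values:
            xs.append(math.log(n))
            ys.append(math.log(sum(values) / len(values)))
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


trend = [float(i) for i in range(256)]   # simple linear trend
alternating = [1.0, -1.0] * 128          # strictly anti-persistent toy series
h_trend = hurst_rs(trend)
h_alternating = hurst_rs(alternating)
```

On these toy inputs the estimate is close to 1 for the linear trend and 0 for the strictly
alternating series, the two ends of the admissible range.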
2.2.2 Augmented Dickey–Fuller Test
An augmented Dickey–Fuller (ADF) test tests the null hypothesis that a unit root is
present in a time series sample. The alternative hypothesis differs depending on which
version of the test is used, but is usually stationarity or trend-stationarity. It is an
augmented version of the Dickey–Fuller test for a larger and more complicated set of time
series models.
When the process is stationary, the p-value should be smaller than our threshold (5% for
95% confidence, for example), so that the unit-root null hypothesis is rejected.
Otherwise, if the p-value is larger than 5%, the null hypothesis cannot be rejected, which
means the series cannot be treated as stationary.
2.2.3 Conclusion
In both tests, the results show that natural gas and crude oil are mean-reverting
3. Prediction
3.1 VIX
The CBOE Crude Oil ETF Volatility Index ("Oil VIX", Ticker - OVX) measures the
market's expectation of 30-day volatility of crude oil prices by applying the VIX
methodology to United States Oil Fund, LP (Ticker - USO) options spanning a wide range of
strike prices.
The United States Oil Fund is an exchange-traded security designed to track changes in
crude oil prices. By holding near-term futures contracts and cash, the performance of the
Fund is intended to reflect, as closely as possible, the spot price of West Texas Intermediate
The CBOE Volatility Index (VIX Index) is considered by many to be the world's
premier barometer of equity market volatility. The VIX Index is based on real-time prices of
options on the S&P 500 Index (SPX) and is designed to reflect investors' consensus view of
future (30-day) expected stock market volatility. The VIX Index is often referred to as the
In 2008, CBOE pioneered the use of the VIX methodology to estimate the expected
volatility of certain commodities and foreign currencies. The CBOE Crude Oil ETF Volatility
Index (OVX), CBOE Gold ETF Volatility Index (GVZ) and CBOE EuroCurrency ETF Volatility
Index (EVZ) use exchange-traded fund options based on the United States Oil Fund, LP (USO),
SPDR Gold Shares (GLD) and CurrencyShares Euro Trust (FXE), respectively.
CBOE has since introduced several new volatility indexes, including indexes based on
individual stocks and indexes such as the CBOE U.S. Energy Sector ETF Volatility Index
(VXXLE). However, there is still no VIX-style index that tracks natural gas volatility
alone. As shown below, the VIX calculation gives different weights to different options:
the lower the strike price, the higher the weight. In our experience, the implied
volatility of the 90% moneyness put option might be a reasonable substitute for the natural gas
Stock indexes, such as the S&P 500, are calculated using the prices of their component
stocks. Each index employs rules that govern the selection of component securities and a
The VIX Index is a volatility index comprised of options rather than stocks, with the
price of each option reflecting the market’s expectation of future volatility. Like conventional
indexes, the VIX calculation employs rules for selecting component options and a formula to
$$\sigma^2 = \frac{2}{T}\sum_i \frac{\Delta K_i}{K_i^2}\, e^{RT}\, Q(K_i) - \frac{1}{T}\left[\frac{F}{K_0} - 1\right]^2$$
Where...
$\sigma$ is VIX/100, so VIX = $\sigma \times 100$
$T$ Time to expiration
$K_i$ Strike price of the ith out-of-the-money option; a call if $K_i > K_0$; and a put if
$K_i < K_0$; both put and call if $K_i = K_0$
$\Delta K_i$ Interval between strike prices, half the difference between the strikes on
either side of $K_i$:
$$\Delta K_i = \frac{K_{i+1} - K_{i-1}}{2}$$
(Note: $\Delta K$ for the lowest strike is simply the difference between the lowest strike
and the next higher strike. Likewise, $\Delta K$ for the highest strike is the difference
between the highest strike and the next lower strike.)
$Q(K_i)$ The midpoint of the bid-ask spread for each option with strike $K_i$
GETTING STARTED:
The VIX calculation measures time to expiration, T, in calendar days and divides each
day into minutes in order to replicate the precision that is commonly used by professional
option and volatility traders. The time to expiration is given by the following expression:
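$$T = \frac{M_{\text{current day}} + M_{\text{settlement day}} + M_{\text{other days}}}{\text{Minutes in a year}}$$
WHERE...
$M_{\text{current day}}$ = minutes remaining until midnight of the current day
$M_{\text{settlement day}}$ = minutes from midnight until 8:30 a.m. for "standard" SPX expirations
$M_{\text{other days}}$ = total minutes in the days between current day and expiration day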
STEP 3: Calculate the 30-day weighted average of $\sigma_1^2$ and $\sigma_2^2$. Then take
the square root of that value and multiply by 100 to get VIX.
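The single-maturity variance formula and the final scaling can be sketched as follows. This
is a hedged Python illustration with made-up quotes: F is taken to be the forward level,
K_0 the reference strike at or just below F, and R the risk-free rate, as in the CBOE
methodology; the function names are hypothetical, and the interpolation between two
maturities in Step 3 is omitted.

```python
import math


def strike_intervals(strikes):
    """Delta-K for each strike: half the gap between neighbours, one-sided at the ends."""
    last = len(strikes) - 1
    intervals = []
    for i in range(len(strikes)):
        if i == 0:
            intervals.append(strikes[1] - strikes[0])
        elif i == last:
            intervals.append(strikes[last] - strikes[last - 1])
        else:
            intervals.append((strikes[i + 1] - strikes[i - 1]) / 2)
    return intervals


def vix_variance(strikes, mid_quotes, forward, k0, rate, t):
    """sigma^2 = (2/T) * sum(dK/K^2 * e^(RT) * Q(K)) - (1/T) * (F/K0 - 1)^2."""
    weighted = sum(
        dk / k ** 2 * math.exp(rate * t) * q
        for dk, k, q in zip(strike_intervals(strikes), strikes, mid_quotes)
    )
    return (2 / t) * weighted - (1 / t) * (forward / k0 - 1) ** 2


def volatility_index(variance):
    """Index level: 100 times the square root of the variance."""
    return 100 * math.sqrt(variance)


# Hypothetical option chain, sorted by strike, for a 30-day maturity.
strikes = [45.0, 47.5, 50.0, 52.5, 55.0]
mid_quotes = [0.20, 0.60, 1.50, 0.55, 0.15]
variance = vix_variance(strikes, mid_quotes, forward=50.2, k0=50.0, rate=0.01, t=30 / 365)
index_level = volatility_index(variance)
```

The factor 1/K^2 in the sum is what gives lower strikes a larger weight for the same quote
and strike spacing.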
3.1.2 Prediction
We now run regressions between the historical volatilities and the OVX data, and then
judge whether OVX is a good predictor of future WTI oil price volatility. Since we download
monthly OVX data to predict the 30-day volatility, there is
In part one, the data range for OVX is from Nov-2007 to Jun-2017 and the data range
We then run a regression of HVT on the corresponding OVX level. The multiple R-squared
of this regression is 0.1863 and the adjusted R-squared is 0.1792.
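The two fit statistics are related by a simple degrees-of-freedom correction, which the
sketch below reproduces. It is an illustrative Python sketch with hypothetical function
names; the sample size of 116 monthly observations is inferred here from the Nov-2007 to
Jun-2017 range and is an assumption rather than a figure stated in the text.

```python
def ols_r_squared(x, y):
    """R-squared of a one-predictor least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def adjusted_r_squared(r_squared, n_obs, n_predictors=1):
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Consistency check on the reported level regression, assuming n = 116.
adjusted = adjusted_r_squared(0.1863, n_obs=116)   # about 0.1792
```

With a single predictor and more than a hundred observations the correction is small, so
the multiple and adjusted values stay close to each other.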
In part two, we run a regression on the absolute changes of HVT and OVX. The absolute
changes of OVX range from Dec-2007 minus Nov-2007 to Jun-2017 minus May-2017, and those of
HVT from Jan-2008 minus Dec-2007 to Jul-2017 minus Jun-2017.
We then run a regression of the change in HVT on the corresponding change in OVX. The
multiple R-squared of this regression is 4.108e-05 and the adjusted R-squared is
-0.008808.
To predict natural gas volatility, we do the same. First, we import the monthly
historical volatility of natural gas for the period from Jan-2008 to Sept-2017 and the
implied volatility of the 90% put option for the period from Dec-2007 to Aug-2017.
We then run a regression of the former on the latter. The multiple R-squared of this
regression is 0.6117 and the adjusted R-squared is 0.6083.
For the regression on the absolute changes of natural gas HVT and the implied
volatility of the 90% put option, the multiple R-squared is 0.2377 and
3.1.3 Conclusion
From the discussion above, we can conclude that OVX is not a completely good
The OVX measures not the actual volatility of the crude oil price but the volatility
implied by its option prices, which means that supply and demand factors for options are
included in the OVX. When demand for options is high, option prices would be higher
The VIX does not have the necessary predictive power in the real world. Sometimes
realized volatility rises when markets rise, and the VIX rises with it, but that does not
necessarily indicate that market sentiment is turning into panic. Conversely, the VIX may
fall as markets fall. The VIX, which fell 70 percent in 2009, is likely to be the result
Just as there is no causal relationship between the VIX and the S&P 500, there is none
between OVX and crude oil: a rising OVX cannot force the crude oil price down, which
explains to some extent why OVX is not an excellent predictor of the crude oil price.
For natural gas, we obtained a fairly good R-squared when using the implied volatility
of the 90% put option as a substitute. However, in the real world, a VIX-style index
tracking the volatility of natural gas option prices does not exist. Therefore, using VIX
methodology to track the
3.2 Option Moneyness
One of the best possible forecasts of future realized volatility is moneyness implied
volatility. Moneyness is the relative position of the current price or future price of an
underlying asset, such as a stock, with respect to the strike price of a derivative, most
commonly a call option or a put option. Moneyness is firstly a three-fold classification:
if the derivative would make money if it were to expire today, it is said to be in the
money, while if it would not make money it is said to be out of the money, and if the
current price and strike price are
percentage. For example, if a call option has a strike price of $50 and the underlying is
currently trading at $55, the contract is in the money by 10%, or equivalently the option
has a moneyness of 110%.
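The worked example above can be checked with a few lines of code. This is a minimal Python
sketch in which moneyness is assumed to be the ratio of the underlying price to the strike;
the function names are hypothetical.

```python
def moneyness(spot, strike):
    """Underlying price as a fraction of the strike (1.10 means 110% moneyness)."""
    return spot / strike


def classify(spot, strike, option_type="call"):
    """Three-fold classification for a call or a put."""
    if spot == strike:
        return "at the money"
    call_in_money = spot > strike
    in_money = call_in_money if option_type == "call" else not call_in_money
    return "in the money" if in_money else "out of the money"


# The example from the text: strike $50, underlying at $55.
example_ratio = moneyness(55.0, 50.0)          # 110% moneyness
example_state = classify(55.0, 50.0, "call")   # in the money
```

The same ratio that puts a call in the money puts the corresponding put out of the money,
which is why the option type has to be stated alongside the moneyness level.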
express the moneyness as a number, measuring how far the asset is in the money or out of
the money with respect to the strike, or conversely how far a strike is in or out of the
money with respect to the spot (or forward) price of the asset. This quantified notion of
moneyness is most importantly used in defining the relative volatility surface: the
implied volatility in
From the perspective of volatility, a graph called the volatility smile describes the
relationship between an option's implied volatility and its strike price. The more an
Option moneyness can be used to predict volatility not only because the trading volume
of oil options is quite large but also because moneyness implied volatility reflects market
expectations without very large deviation. So, our group focuses on figuring out which
moneyness implied volatility gives the best prediction for the next few days or
102.5%, 100%, 97.5%, 95% and 90% and then run the regression between each moneyness
3.2.2 Absolute change of moneyness implied volatility and that of historical volatility
To examine the predictive power of moneyness implied volatility further, we calculate
the absolute changes of monthly moneyness implied volatilities and of historical
volatilities. Our goal is to see whether movements in moneyness implied volatility match
those of historical volatility and whether they move in the same direction. Since the unit of
3.2.3 Conclusion
For crude oil, our results show that implied volatilities contain at least some
information about future realized volatility. In the regressions between moneyness implied
volatility and historical volatility, the 102.5% option is the best indicator of future
volatility at the monthly level, because its regression has the highest adjusted R-squared,
0.19. In the regressions between the absolute change of moneyness implied volatility and
that of historical volatility, the absolute change of the 95% implied volatility is the
best indicator of future volatility, judged by the sign of the coefficient and the
R-squared.
For natural gas, the best volatility prediction comes from the 110% option, whose
regression gives the highest adjusted R-squared, 0.6451, which is a fairly good result.
Moneyness implied volatility can therefore be an accurate indicator of realized volatility
for natural gas. In the regressions on absolute changes, the absolute change of the 102.5%
implied volatility is the best indicator of future volatility, judged by the sign of the
coefficient and the R-squared.
Comparing the regression results for crude oil and natural gas, we find that moneyness
implied volatility has more predictive power for natural gas than for crude oil. Although
the trading volume of crude oil is larger than that of natural gas, natural gas option
transactions may give us more specific and accurate market information with less noise.
There are some shortcomings in using moneyness implied volatility for prediction. First,
it is complex to implement. Second, the method is not suitable for products that are not
actively traded in the market. Even for an actively traded product such as the crude oil
option, the trading volume at some specific moneyness levels may be relatively low in some
periods. Third, because it is calculated from market transactions, the market noise made by
irrational expectations of
3.3 GARCH (1,1) Model
3.3.1 Theory foundation of GARCH (1,1)
GARCH is an econometric term developed in 1982 by Robert F. Engle, an economist and 2003
winner of the Nobel Memorial Prize for Economics, to describe an approach to estimating
volatility in financial markets. There are several forms of GARCH modeling. The GARCH process is
context than other forms when trying to predict the prices and rates of financial instruments.
GARCH processes, being autoregressive, depend on past squared observations and past
variances to model the current variance. GARCH processes are widely used in finance due to
their effectiveness in modeling asset returns and inflation. GARCH aims to minimize errors
ongoing predictions.
The GARCH model has several advantages. First, it can capture long-term mean
Third, the weight applied to each observation can be adjusted to better fit the past
observations to the subsequent observations. Fourth, the GARCH model can be modified to
account for the
$$\sigma^2 = \frac{\omega}{1 - \alpha - \beta}$$
Thus, tomorrow’s variance is a weighted average of the long-run variance, today’s squared
When forecasting, we lag the GARCH(1,1) series by one period, because the volatility
predicted by GARCH(1,1) for this month corresponds to the historical volatility of the next
month. An inconvenience shared by the two models is that the multi-period distribution is
unknown even if the one-day-ahead distribution is assumed to be normal. The GARCH model
produces a one-day-ahead volatility forecast $\sigma^2_{t+1}$ and can easily be extended to
a forecast k periods ahead, which is especially useful if our goal is to price an option
with k steps to expiration using our volatility model. We use a 21-day-ahead forecast, 21
days being the average time to expiration of the sample options. We then count the number
of business days in each month and multiply its square root by the volatility predicted at
the end of each month to obtain the monthly volatility. As the last step, we run the
regression between the predicted monthly volatility and monthly historical volatility to
assess the predictive power of GARCH(1,1).
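The forecasting steps above can be sketched as follows. This is a minimal Python
illustration, assuming the standard GARCH(1,1) recursion (next variance = omega + alpha *
squared return + beta * current variance) and its textbook k-step forecast; the parameter
values are invented for the example and the function names are hypothetical, so this is not
the fitted model from the appendix.

```python
import math


def long_run_variance(omega, alpha, beta):
    """Unconditional variance omega / (1 - alpha - beta); needs alpha + beta < 1."""
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1 for a finite long-run variance")
    return omega / (1 - alpha - beta)


def next_variance(omega, alpha, beta, last_return, last_variance):
    """One-day-ahead conditional variance from the GARCH(1,1) recursion."""
    return omega + alpha * last_return ** 2 + beta * last_variance


def forecast_variance(omega, alpha, beta, one_step_variance, k):
    """k-day-ahead variance: decays toward the long-run level at rate alpha + beta."""
    level = long_run_variance(omega, alpha, beta)
    return level + (alpha + beta) ** (k - 1) * (one_step_variance - level)


def monthly_volatility(daily_variance, business_days=21):
    """Scale a daily volatility by the square root of the number of business days."""
    return math.sqrt(business_days) * math.sqrt(daily_variance)


# Invented parameters and inputs, for illustration only.
omega, alpha, beta = 1e-5, 0.1, 0.8
one_step = next_variance(omega, alpha, beta, last_return=0.02, last_variance=1e-4)
day_21 = forecast_variance(omega, alpha, beta, one_step, k=21)
month_vol = monthly_volatility(day_21, business_days=21)
```

After a large return the one-step variance sits above the long-run level, and the 21-day
forecast has moved most of the way back toward it.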
3.3.2 Conclusion
We find that the adjusted R-squared of the crude oil regression is 0.1842, and the
corresponding t-statistic is 14.948, which means that the result is significant at the 99%
confidence level. For natural gas, the adjusted R-squared is 0.5252, which is much higher
than that of the crude oil regression, and the corresponding t-statistic is 8.518. From
this point of view, we can say that GARCH(1,1) is doing
We acknowledge that GARCH(1,1) is useful across a wide range of applications; however,
its limitation is its inability to respond asymmetrically to falling and rising
GARCH models are usually applied to return series, financial decisions are rarely based
(EGARCH) model introduced by Nelson (1991) builds in a directional effect of price moves
on conditional variance. In practice, there is a negative correlation between stock returns
and changes in return volatility. Volatility tends to rise in response to "bad news" (excess
returns lower than expected) and to fall in response to "good news" (excess returns higher
than expected), which means that large price declines can have a larger impact on volatility
than large increases. GARCH models, however, assume that only the magnitude, and not the
sign, of unanticipated excess returns determines $\sigma_t^2$. Moreover, GARCH models are
not able to explain the observed covariance between $\varepsilon_t^2$ and $\varepsilon_{t-j}$. This is
addition, GARCH models essentially specify the behavior of the square of the data. In this
As the general GARCH model has some limitations, asymmetric models provide an
explanation for the so-called leverage effect, in which an unexpected price drop increases
volatility more than an analogous unexpected price increase. In the EGARCH(p,q) model,
$\sigma_t^2$ depends on both the size and the sign of lagged residuals.
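volatility leverage effect in the model and α the magnitude. $e_t \sim N(0, 1)$ with $E|e_{t-1}| = \sqrt{2/\pi}$
formula
$$\ln \hat{\sigma}^2_{t+h} = \sigma^2 + \beta^{h-1} \left( \ln \hat{\sigma}^2_{t+1} - \sigma^2 \right)$$
where $\sigma^2 = \left( \omega - \gamma \sqrt{2/\pi} \right) / (1 - \beta)$.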
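The multi-step forecast recursion for the log variance can be written as a small R function. This is a sketch, assuming the forecast takes the form "long-run level plus β^(h-1) times the deviation of the one-step forecast from that level", with the long-run level read as (ω - γ√(2/π))/(1 - β). The function name and all parameter values are hypothetical, and the exact expression for the long-run level depends on how the EGARCH equation is parameterized.

```r
# h-step-ahead forecast of the log conditional variance (EGARCH(1,1)-type recursion).
# ln_sigma2_next is the one-step-ahead forecast ln(sigma^2_{t+1}).
egarch_log_var_forecast <- function(ln_sigma2_next, omega, beta, gamma, h) {
  stopifnot(h >= 1, abs(beta) < 1)
  # Long-run level of ln(sigma^2), as assumed in the lead-in
  long_run <- (omega - gamma * sqrt(2 / pi)) / (1 - beta)
  # The deviation from the long-run level decays geometrically at rate beta
  long_run + beta^(h - 1) * (ln_sigma2_next - long_run)
}

# Hypothetical parameter values, chosen only for illustration
f1  <- egarch_log_var_forecast(-7.5, omega = -0.2, beta = 0.97, gamma = 0.1, h = 1)
f21 <- egarch_log_var_forecast(-7.5, omega = -0.2, beta = 0.97, gamma = 0.1, h = 21)

# Implied 21-day-ahead daily volatility
sqrt(exp(f21))
```

With h = 1 the function returns the one-step forecast unchanged, and as h grows the forecast approaches the long-run level at a speed governed by β.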
An attractive feature of the more general GARCH models, such as EGARCH and
GJR-GARCH, is that they allow for an asymmetric effect of positive and negative shocks on
asymmetrical effects different types of shocks can have on volatility. In the case of crude oil
prices, political disruptions in the Middle East or large decreases in global demand tend to
increase volatility (see, e.g. Ferderer 1996, Wilson et al. 1996) whereas the effect of new oil
field discoveries seems to be more muted. There was a large increase in the volatility of WTI
crude oil returns around the global financial crisis but no decline when shale oil started to be
shipped in larger quantities to Cushing. Even when put together, the conditional normality
assumption and the simultaneous estimation of conditional variance do not capture the thick
tails entirely. As for natural gas, it plays a crucial role in the economy of the United States. In
2010, about 25% of the energy used in the U.S. came from natural gas. Similar to oil,
the volatility of natural gas seems to be asymmetric. Over the past several years, the
volatility exhibited in the natural gas market has become a great concern among
In this part, we used nonlinear GARCH models to predict the volatility with the purpose of
In many financial time series, the standardized residuals from the estimated models
display excess kurtosis which suggests departure from conditional normality. In such cases,
the fat-tailed distribution of the innovations driving an ARCH process can be better modeled
using the Student’s-t or the Generalized Error Distribution (GED). Taking the square root of
volatility estimate. A single estimated model can be used to construct forecasts of volatility
There are several things we want to test by running the EGARCH model. First, within our
sample period, the oil price has been highly volatile since the financial crisis. Given the
extremely high kurtosis present in the data, is EGARCH model assumed to follow a Student t
Second, some articles on the EGARCH model suggest that it predicts volatility more
accurately over the following 1 to 5 days. We intend to find out which prediction horizon is
with historical data. We downloaded historical oil price volatility series, named HVT, from
Bloomberg. The HVT series are classified by horizon, such as 10-day HVT, 30-day HVT,
50-day HVT, etc. We are going to test which HVT series fits best to
Lastly, having tried different types of GARCH models, which is the most stable for
predicting oil future volatility? We will look at the MSE result of the model and check out
3.4.2 Prediction
In this prediction, we use the Quandl package to download data and the rugarch package for
model fitting and forecasting. We use the rolling forecast in rugarch to predict the
volatility and compare it with the historical value. The ugarchroll function takes several
arguments, and we have to select the best combination from a set of candidate values: we use
(100, 200, 300, 500, 1000) as candidate windows (forecast lengths) and (5, 10, 20, 50) as
candidate refit lengths. We therefore wrote two nested loops over these parameter values.
We use MSE to quantify the forecasting error of the model and choose the combination
with the lowest MSE. We then regress the historical volatility on the forecast
volatility.
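The selection procedure can be illustrated without fitting any GARCH model. In the sketch below the EGARCH rolling forecast is replaced by a deliberately simple stand-in (the sample standard deviation of all returns before the last refit point), and the returns and the "realized" volatility proxy are simulated. Only the two candidate grids come from the text; every other name and value is an assumption.

```r
set.seed(3)
# Simulated daily returns and a crude realized-volatility proxy (not the paper's data)
ret <- rnorm(1500, sd = 0.02)
realized <- abs(ret)

window_grid <- c(100, 200, 300, 500, 1000)  # candidate forecast lengths from the text
refit_grid  <- c(5, 10, 20, 50)             # candidate refit lengths from the text

# Stand-in forecaster (NOT EGARCH): the forecast for each day is the sample sd of
# all returns observed up to the most recent refit point.
roll_forecast <- function(ret, n_fore, refit_every) {
  start <- length(ret) - n_fore
  out <- numeric(n_fore)
  for (k in seq_len(n_fore)) {
    last_refit <- start + ((k - 1) %/% refit_every) * refit_every
    out[k] <- sd(ret[1:last_refit])
  }
  out
}

# Two nested loops over the grids, keeping the combination with the lowest MSE
best <- list(mse = Inf, window = NA, refit = NA)
for (w in window_grid) {
  for (r in refit_grid) {
    fc <- roll_forecast(ret, w, r)
    mse <- mean((tail(realized, w) - fc)^2)
    if (mse < best$mse) best <- list(mse = mse, window = w, refit = r)
  }
}
print(best)
```

Note that each window is scored on a different evaluation sample (the last w observations), so the MSE values are not strictly comparable across windows.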
As in our forecast of oil future volatility, we use daily data on the 1-month natural gas
futures contract from CME, ranging from Nov. 11, 2007 to Nov. 8, 2017. We first take the
first difference of the logarithmic prices as the daily returns, which is
shown below.
We then calculate the mean and variance of the returns over this period. The
average return is about -0.00038, which is not significantly different from zero, and standard
Next, we ran several tests on the stationarity and distribution of the data.
• The Jarque-Bera statistic, computed on ret_total, shows that the null hypothesis of a
normal distribution should be rejected at the 1% significance level.
• The Ljung-Box Q statistics reject the null hypothesis of no autocorrelation up to the
10th order.
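Both diagnostics can be reproduced in base R. The sketch below is illustrative only: the price path is simulated with fat-tailed shocks rather than downloaded, the Jarque-Bera statistic is computed by hand from the sample skewness and kurtosis, and the Ljung-Box test uses Box.test from the stats package with 10 lags.

```r
set.seed(1)
# Simulated price path standing in for the futures prices (not the paper's data)
price <- 50 * exp(cumsum(0.02 * rt(1000, df = 4)))
ret_total <- diff(log(price))  # daily log returns

# Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4), chi-squared with 2 df under the null
n <- length(ret_total)
z <- ret_total - mean(ret_total)
s2 <- mean(z^2)
skew <- mean(z^3) / s2^1.5
kurt <- mean(z^4) / s2^2
jb_stat <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
jb_pval <- pchisq(jb_stat, df = 2, lower.tail = FALSE)

# Ljung-Box Q test of no autocorrelation up to lag 10
lb <- Box.test(ret_total, lag = 10, type = "Ljung-Box")

print(c(JB = jb_stat, JB_p = jb_pval, LB_p = lb$p.value))
```

A small Jarque-Bera p-value points to non-normal returns, and a small Ljung-Box p-value points to serial correlation in the first 10 lags.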
We then use the same process to fit the EGARCH model. As before, we fit models over a set of
candidate parameters, calculate the MSE between each model's prediction and the
historical volatility, and choose the parameters with the lowest MSE. We also plot the
We then wrote the predictions and the historical volatility to a CSV file and read the file
into R. To avoid autocorrelation, we fed the first observation of each month into a linear
regression and obtained the parameters a and b. We then plotted the predicted and historical
volatility.
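The monthly sampling and regression step can be sketched as follows. The data frame below is simulated and its column names (date, pred, hvt) are hypothetical. The sketch only shows how to keep the first observation of each calendar month and estimate hvt = a + b * pred.

```r
set.seed(2)
# Simulated daily series standing in for the CSV file described above
dates <- seq(as.Date("2016-01-01"), as.Date("2017-12-31"), by = "day")
pred  <- abs(rnorm(length(dates), mean = 0.02, sd = 0.005))  # predicted volatility
hvt   <- 5 + 1500 * pred + rnorm(length(dates), sd = 3)      # historical volatility
daily <- data.frame(date = dates, pred = pred, hvt = hvt)

# Keep only the first observation of each calendar month
month_id <- format(daily$date, "%Y-%m")
monthly  <- daily[!duplicated(month_id), ]

# Linear regression hvt = a + b * pred on the monthly sample
fit <- lm(hvt ~ pred, data = monthly)
a <- unname(coef(fit)[1])
b <- unname(coef(fit)[2])
adj_r2 <- summary(fit)$adj.r.squared
print(c(a = a, b = b, adj_r2 = adj_r2))
```

Sampling one point per month reduces the overlap between consecutive observations of a 30-day or 50-day historical volatility series, which is the source of the autocorrelation mentioned above.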
3.4.3 Conclusion
After the forecasting process, we obtain the combination with the lowest MSE (702.5309) for
crude oil, which gives the optimal prediction parameters. In this combination, the forecast
length equals 100, the refit length equals 20, and the historical data is the 50-day historical
volatility.
As we can see from the plot, where the blue line stands for the predicted sigma and the
gray line represents the historical volatility from the data, our prediction more or less
catches the trend of the true volatility, but in a less volatile fashion. So, it might not be the best
The GJR-GARCH model of Glosten et al. (1993) models positive and negative shocks
on the conditional variance asymmetrically via the use of the indicator function I,
where $\gamma_j$ now represents the 'leverage' term. The indicator function I takes the value 1 for
$\varepsilon \le 0$ and 0 otherwise. Because of the presence of the indicator function, the persistence of the
model now crucially depends on the asymmetry of the conditional distribution used. The
where $\kappa$ is the expected value of the standardized residuals $z_t$ below zero (effectively the
where f is the standardized conditional density with any additional skew and shape
parameters (. . .). In the case of symmetric distributions, the value of $\kappa$ is simply equal to 0.5.
The variance targeting, half-life and unconditional variance follow from the persistence
counterpart at every iteration of the solver following the mean equation filtration, and $\upsilon_j$
represents the sample mean of the jth external regressors in the variance equation (assuming
stationarity).
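As a worked illustration of how the half-life and unconditional variance follow from the persistence, consider a GJR-GARCH(1,1) without external regressors. The sketch assumes the usual definitions: persistence = α + β + γκ, half-life = -ln(2)/ln(persistence), and unconditional variance = ω/(1 - persistence). The function name and the parameter values are hypothetical.

```r
# Persistence, half-life and unconditional variance of a GJR-GARCH(1,1).
# kappa = 0.5 corresponds to a symmetric conditional distribution.
gjr_summary <- function(omega, alpha, beta, gamma, kappa = 0.5) {
  persistence <- alpha + beta + gamma * kappa
  stopifnot(persistence > 0, persistence < 1)  # required for a finite variance
  list(
    persistence = persistence,
    half_life   = -log(2) / log(persistence),  # periods for a shock to decay by half
    uncond_var  = omega / (1 - persistence)
  )
}

# Hypothetical parameter values, chosen only for illustration
s <- gjr_summary(omega = 2e-6, alpha = 0.03, beta = 0.92, gamma = 0.06)
print(s)
```

With a skewed conditional distribution, kappa would differ from 0.5 and the same α, β and γ would imply a different persistence.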
The naming conventions for passing fixed or starting parameters for this model are:
3.5.2 Prediction
The data range from Jan-02-2008 to July-20-2017, a total of 2407 days. We use 1000 days as
the fitting window for the GARCH model and predict the 21-day-ahead volatility for about
1407 days. For each prediction we simulate 100 paths for the random part of the model. We
pick the last day of every month and multiply by the square root of the number of days in that month as
The R-squared of the crude oil regression is 0.22688 and the adjusted R-squared is 0.2149.
The R-squared of the natural gas regression is 0.396 and the adjusted R-squared is 0.3868. In
the graph, the blue line is the volatility predicted by GJR-GARCH and the black line is the
real monthly volatility.
3.5.3 Conclusion
The GARCH family predictions do not perform well, which means these models are not very
suitable for long-term volatility prediction. If we use daily predictions over a one-day
horizon, we could obtain a very high R-squared, which means the GARCH family is a good
method for predicting short-horizon volatility, for example to calculate Value at Risk (VaR)
or for risk management. But it is not a good way to predict long-term volatility because
GARCH only uses historical data.
4.1 Comparison
We use adjusted R-squared as the evaluation criterion for model forecasting power. The table
below is the estimation summary for the five different models we use to forecast volatility based
Except for the EGARCH model, VIX has the worst performance in forecasting crude oil
volatility. VIX has been higher than realized volatility in recent years. This is not
to suggest that the VIX measure is of low value. Rather, it should be interpreted strictly for
what it aims to represent: the market price of volatility exposure consistent with observed
option prices. As a direct indicator of future volatility, it is more limited because it
combines volatility forecasting with the pricing of the risk associated with
volatility. However, there is no existing VIX to track natural gas volatility. We use the
has the highest R-squared, 0.1904, while the regression between the absolute change of
moneyness implied volatility and that of historical volatility shows that the 95% option is the
best indicator of crude oil future volatility. As implied volatility did not pass the test of
forecast rationality, this indirectly contradicts the conclusion that implied volatility is the
best available predictor of future volatility. For natural gas, the 110% option gives the highest
adjusted R-squared, 0.6451, while the 102.5% option of absolute change is the best
indicator of natural gas future volatility in terms of the sign of the coefficient and R-squared.
Moneyness implied volatility can be an accurate indicator of the real
volatility for natural gas because seasonal change plays an important role in shifts in natural
gas demand. This characteristic makes natural gas less uncertain than crude oil.
As we can see, neither volatility forecast obtained by the EGARCH model is very
significant, as we focus more on MSE in this paper. GJR-GARCH has the highest adjusted
R-squared in the GARCH family for the crude oil forecast. Asymmetric effects are present in
the data, and asymmetric models that allow different responses to different past
shocks perform better in explaining volatility. Overall, however, the GARCH family does not
perform well in forecasting oil future volatility. For natural gas forecasting, the simple
Low R-squared values are problematic when we need precise predictions. The
regression plots help interpret the results (see figures 1-5). The black line
represents the historical volatility data while the colored line represents the predicted
volatility. They indicate that all five models we studied are not very good indicators for forecasting
Overall, each model we used performs better in predicting natural gas volatility.
In order to improve our results for the crude oil future volatility forecast, it might be helpful to
use combined models. Meanwhile, the crude oil price volatility forecast performance can be
4.2 Conclusion
In this paper, we estimate and forecast WTI crude oil and natural gas volatility using
five different models: VIX, option moneyness (moneyness implied volatility),
GARCH(1,1), EGARCH, and GJR-GARCH. We take the 2008-2017 period as our sample and
use daily observations from the US stock markets. Taking into account what has been
discussed above, we come to the following conclusions for future volatility
forecasting:
eliminate biases caused by the Black-Scholes model assumptions. It is a better indicator for natural
(2) Behavioral finance matters. Investors have to behave rationally when using the
available market information in the decision-making process, since market noise can cause
(3) The GARCH family models used in this paper provide relatively higher predictive accuracy
for shorter time horizons (such as one day ahead). In general, in normal periods the simple
GARCH model performs better than the asymmetric GARCH models, but in volatile periods,
(4) The asymmetric GJR-GARCH model provides the preferred forecast for oil future
volatility, while VIX contributes a small but significant additional degree of forecast power
References
Frangoul, Anmar. "Natural Gas: Why It's Important and What You Need to Know." CNBC.
Irakli. “Oil's Role in the World Economy and in the Global Crises.” CNN. Cable News
Nielsen, Barry. "Financial Matters: The Importance of the Price of Oil." Helena Independent
Lux, Thomas, Mawuli Segnon, and Rangan Gupta. "Forecasting Crude Oil Price Volatility
and Value-at-risk: Evidence from Historical and Recent Data." Energy Economics 56
01 Dec. 2017.
https://en.wikipedia.org/wiki/Augmented_Dickey%E2%80%93Fuller_test.
Granero, M.A. Sánchez, J.E. Trinidad Segovia, and J. García Pérez. "Some Comments on
Hurst Exponent and the Long Memory Processes on Capital Markets." Physica A:
Statistical Mechanics and Its Applications 387.22 (2008): 5543-551. Web. 1 Oct. 2017.
“Cboe Crude Oil ETF Volatility Index (OVX).” Cboe. N.p., n.d. Web. 30 Oct. 2017.
http://www.cboe.com/products/vix-index-volatility/volatility-on-etfs/cboe-crude-oil-etf-volatility-index-ovx.
http://www.cboe.com/aboutcboe/cboe-cbsx-amp-cfe-press-releases?DIR=ACNews&FILE=cboe_20080714.doc.
“Cboe Volatility Index® (VIX®).” VIX Index. N.p., n.d. Web. 30 Oct. 2017.
http://www.cboe.com/products/vix-index-volatility/vix-options-and-futures/vix-index.
Staff, Investopedia. "VIX - CBOE Volatility Index." Investopedia. N.p., 07 Aug. 2015. Web.
30 Oct. 2017.
https://www.investopedia.com/terms/v/vix.asp.
“The CBOE Volatility Index - VIX®.” Cboe (n.d.): 1-23. Aug. 2014. Web. 3 Oct. 2017.
https://www.cboe.com/micro/vix/vixwhite.pdf.
Rossi, Eduardo. Lecture Notes on GARCH Models. Diss. University of Pavia, 2004. Web. 12
Oct. 2017.
Hansson, Mathias, and Rune Sand. FORECASTING CRUDE OIL FUTURES VOLATILITY.
Musaddiq, Tareena. "Modeling and Forecasting the Volatility of Oil Futures Using the
ARCH Family Models." The Lahore Journal of Business (2012): 79-108. Summer
Bentes, Sónia R. "A Comparative Analysis of the Predictive Power of Implied Volatility
Indices and GARCH Forecasted Volatility." Physica A: Statistical Mechanics and Its
Ghalanos, Alexios. "Introduction to the Rugarch Package. (Version 1.3-1)." (2017): 1-48.
Zhang, Yuejun, Ting Yao, and Lingyun He. "Forecasting Crude Oil Market Volatility: Can
the Regime Switching GARCH Model Beat the Single-regime GARCH Models?"
Figures
VIX
Part 1
(Regression on HVT & OVX price)
rm(list=ls())
library(readr)
VIX_Monthly <- read_csv("VIX-MONTHLY.csv")
HVT_Monthly <- read_csv("HVT-MONTHLY.csv")
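VIX1<-VIX_Monthly$`Price`
###vix1 and HVT1 are not defined in this part; the two definitions below are assumed to match Part 2
vix1<-VIX1[7:122]
HVT1<-HVT_Monthly$'CL1 COMB Comdty Hist Vol (30)'[1:116]
###plot
plot(HVT1,type='l')
points(vix1*0.224+23.371,type='l',col='blue')
summary(lm(HVT1~vix1+1))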
Part 2
###(Regression on HVT & OVX price absolute change)
rm(list=ls())
library(readr)
VIX_Monthly <- read_csv("VIX-MONTHLY.csv")
HVT_Monthly <- read_csv("HVT-MONTHLY.csv")
VIX1<-VIX_Monthly$`Price`
###import the OVX date from 2007-05 to 2017-09
vix1<-VIX1[7:122]
###from 2007-11 to 2017-06
vix1_change<-diff(vix1)
###calculate the change for VIX from 2007-12-2007-11 to 2017-06-2017-05
HVT1<-HVT_Monthly$'CL1 COMB Comdty Hist Vol (30)'[1:116]
###import the HVT data from 2007-12 to 2017-07
HVT1_change<-diff(HVT1)
###calculate the change for HVT from 2008-01-2007-12 to 2017-07-2017-06
lm(HVT1_change~vix1_change+1)
R_squared1<-summary(lm(HVT1_change~vix1_change+1))$adj.r.squared
print(R_squared1)
###run the regression correspondingly
plot(HVT1_change,type='l')
points(vix1_change*-0.001709+0.063119,type='l',col='blue')
summary(lm(HVT1_change~vix1_change+1))
###plot and summary (Regression on HVT &OVX Absolute Change)
Option moneyness
#Clear the workspace
rm(list=ls())
#Import the monthly historical volatility for 01/2008 to 09/2017
HVT<-read.csv('HVT MONTHLY.csv')
HVT_30<-HVT$CL1.COMB.Comdty.Hist.Vol..30.[2:118]
T<-length(HVT_30)
#Import the option moneyness data for 12/2007 to 08/2017, which lags the
#historical volatility by one period
moneyness<-read.csv('Moneyness_30_monthly.csv')[13:(13+T-1),]
#Import the monthly historical volatility for 12/2007 to 09/2017
HVT<-read.csv('HVT MONTHLY.csv')
HVT_30<-HVT$CL1.COMB.Comdty.Hist.Vol..30.[1:118]
T<-length(HVT_30)
#Calculate the absolute change of monthly historical volatility for 01/2008 to 09/2017
HVT_30_change<-diff(HVT_30)
#Import the monthly moneyness implied volatility for 12/2007 to 09/2017
moneyness<-read.csv('Moneyness_30_monthly.csv')[12:(12+T-1),]
#Calculate the absolute change of the monthly moneyness implied volatility for
#01/2008 to 09/2017
moneyness_110_change<-diff(moneyness$X30DAY_IMPVOL_110.0.MNY_DF)
moneyness_105_change<-diff(moneyness$X30DAY_IMPVOL_105.0.MNY_DF)
moneyness_102.5_change<-diff(moneyness$X30DAY_IMPVOL_102.5.MNY_DF)
moneyness_100_change<-diff(moneyness$X30DAY_IMPVOL_100.0.MNY_DF)
moneyness_97.5_change<-diff(moneyness$X30DAY_IMPVOL_97.5.MNY_DF)
moneyness_95_change<-diff(moneyness$X30DAY_IMPVOL_95.0.MNY_DF)
moneyness_90_change<-diff(moneyness$X30DAY_IMPVOL_90.0.MNY_DF)
#Regress the change of historical volatility with that of volatility of 110% option
lm_110<-lm(HVT_30_change ~ moneyness_110_change)
summary(lm_110)
#Regress the change of historical volatility with that of volatility of 105% option
lm_105<-lm(HVT_30_change ~ moneyness_105_change)
summary(lm_105)
#Regress the change of historical volatility with that of volatility of 102.5% option
lm_102.5<-lm(HVT_30_change ~ moneyness_102.5_change)
summary(lm_102.5)
#Regress the change of historical volatility with that of volatility of 100% option
lm_100<-lm(HVT_30_change ~ moneyness_100_change)
summary(lm_100)
#Regress the change of historical volatility with that of volatility of 97.5% option
lm_97.5<-lm(HVT_30_change ~ moneyness_97.5_change)
summary(lm_97.5)
#Regress the change of historical volatility with that of volatility of 95% option
lm_95<-lm(HVT_30_change ~ moneyness_95_change)
summary(lm_95)
#Regress the change of historical volatility with that of volatility of 90% option
lm_90<-lm(HVT_30_change ~ moneyness_90_change)
summary(lm_90)
plot(HVT_30_change,type='l',main="Absolute change of Moneyness and HVT")
points(moneyness_95_change,type='l',col='blue',lwd=2)
GARCH (1,1)
#Business Day
library(zoo)     #as.yearmon
library(Quandl)  #Quandl
library(dplyr)   #%>%, arrange, mutate, slice, group_by, select
rm(list=ls())
oil_daily1 <- Quandl("FRED/DCOILWTICO",api_key="2T1Yy7mQwKqsGtFXKtCy",
type="raw",collapse="daily",start_date="2007-12-31",
end_date="2017-08-20")
oil_daily <- oil_daily1 %>%
arrange(Date) %>%
mutate(Ret_total = c(9999,diff(log(Value)))) %>%
slice(-1)
oil_daily <- oil_daily %>%
mutate(year_mon = as.yearmon(Date)) %>%
group_by(year_mon) %>%
mutate(busdays = n()) %>%
ungroup()
busdays <- oil_daily %>%
group_by(year_mon) %>%
slice(1) %>%
select(Date,year_mon,busdays) %>%
ungroup()
busdays <- subset(busdays, select = c(1, 3))
write.csv(busdays,file="busdays.csv")
#Import Packages
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
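#Import data
data=read.csv('HVT-Daily2.csv')
oil_daily=data$Price
#Daily log returns and sample size (assumed definitions; Ret_total and T are not defined here)
Ret_total <- diff(log(oil_daily))
T <- length(Ret_total)
data$Pred <- NA
for (i in 1001:T){
#Fit GARCH(1,1) on the previous 1000 returns
Retwindow <- Ret_total[(i-1000):(i-1)]
fit1 <- garchFit( formula = ~garch(1, 1), data = Retwindow, trace = FALSE)
#Keep the 21st step of the 21-day-ahead forecast (assumed; predict returns 21 values)
data$Pred[i] <- predict(fit1, n.ahead=21)$standardDeviation[21]
print(i)
#predicted period: from 01/2008 to 09/2017
}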
#Write the predicted data with corresponding date into a csv file.
write.csv(data,file='Garch11.csv')
#Regression Preparation
result<- read.csv('Garch11.csv')
##Download Data
library(Quandl)
library(rugarch)
oil_daily <- Quandl("FRED/DCOILWTICO", api_key='TPywx-DUcfEE4VMynwHR',
type='raw', collapse='daily', order = 'asc', start_date="2008-01-01",end_date="2017-07-20")
rownames(oil_daily) <- oil_daily$Date
oil_daily$Date <- NULL
ret_total <- diff(log(oil_daily$Value))
hvt <- read.csv('HVT-Daily.csv', header = TRUE, col.names = c('Date',
'HVT','Vol_10','Vol_30','Vol_50','Vol_100'))
hvt <- hvt[-1,]$Vol_100
T <- length(ret_total)
# Iterating over the windows, and trying to find one with least Mean Squared Error(MSE)
forecast_length = c(100, 200, 300, 500, 1000)
refit_length = c(5, 10, 20, 50)
optimal_window = NULL
optimal_refit = NULL
mse = Inf
opt_forc = NULL
model = ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit2 = ugarchfit(data=ret_total, spec=model)
for(i in seq_along(forecast_length)){
for (j in seq_along(refit_length)){
#Rolling one-step-ahead forecast for this (forecast length, refit length) pair
rollforc = ugarchroll(spec=model, data=ret_total, n.ahead = 1, forecast.length = forecast_length[i],
refit.every = refit_length[j], refit.window = c('recursive'), solver = 'hybrid', keep.coef =
TRUE)
#Predicted sigma as a numeric vector so that mean() returns a number
sigmapred = as.numeric(rollforc@forecast$density$Sigma)
error = mean((hvt[(T-forecast_length[i]):(T-1)]- sigmapred)^2)
if (error <= mse){
mse = error
optimal_window = forecast_length[i]
optimal_refit = refit_length[j]
opt_forc = rollforc}
}}
print(optimal_window)
print(optimal_refit)
GJR-GARCH
#Clear the workspace
rm(list=ls())
#Import Packages
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
library(rugarch)
library(parallel)
data=read.csv('HVT-Daily.csv')
oil_monthly=data$Price
#Predict 21 days ahead, could adjust different GARCH model inside loop.
#Check the output csv file whether there is 999 data in Pred. If there is, it means there is
#an error in that loop.
#Simply substitute i-1's data if Pred[i] is 999
#Regression Prepare
result<- read.csv('GJRGarch.csv')
#Pick the last day of one month as the month volatility
Month=result$Month[1001:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1001:T]
Predd<-Pred[aa]
#add last day of data as the last month volatility.
Predd<-c(Predd,Pred[length(Pred)])
VIX
Part 1
###Regression on HVT & the volatility of 90% put option price(VIX)
rm(list=ls())
library(readr)
VIX_Monthly <- read.csv('Natural gas moneyness.csv')
VIX_30<-VIX_Monthly $ X30DAY_IMPVOL_90.0.MNY_DF[1:117]
###Import the option moneyness data for 12/2007 to 08/2017, which is one
###period ahead of the historical volatility
HVT_Monthly <- read.csv('Natural gas HVT_monthly.csv')
HVT_30<-HVT_Monthly $ VOLATILITY_30D[2:118]
###Import the monthly historical volatility with period from 01/2008 to 09/2017
R_squared1<-summary(lm(HVT_30~VIX_30+1))$adj.r.squared
print(R_squared1)
###run the regression correspondingly
plot(HVT_30,type='l')
points(VIX_30*1.10971-2.67889,type='l',col='blue')
summary(lm(HVT_30~VIX_30+1))
###plot and summary
Part 2
###Regression on HVT & the volatility of 90% put option price Absolute change
rm(list=ls())
library(readr)
VIX_Monthly <- read.csv('Natural gas moneyness.csv')
VIX_30<-VIX_Monthly $ X30DAY_IMPVOL_90.0.MNY_DF[1:117]
###Import the option moneyness data for 12/2007 to 08/2017, which is one
###period ahead of the historical volatility
VIX_30_change<-diff(VIX_30)
###calculate the difference for VIX_30
HVT_Monthly <- read.csv('Natural gas HVT_monthly.csv')
HVT_30<-HVT_Monthly $ VOLATILITY_30D[2:118]
###Import the monthly historical volatility with period from 01/2008 to 09/2017
HVT_30_change<- diff(HVT_30)
###calculate the difference for HVT_30
R_squared1<-summary(lm(HVT_30_change~VIX_30_change+1))$adj.r.squared
print(R_squared1)
###run the regression correspondingly
plot(HVT_30_change,type='l')
points(VIX_30_change*0.8824+0.1025,type='l',col='blue')
summary(lm(HVT_30_change~VIX_30_change+1))
###plot and summary
Option Moneyness
rm(list=ls())
#Clear the workspace
lm_110<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_110.0.MNY_DF)
summary(lm_110)
#Regress the historical volatility with volatility of 110% option
lm_105<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_105.0.MNY_DF)
summary(lm_105)
#Regress the historical volatility with volatility of 105% option
lm_102.5<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_102.5.MNY_DF)
summary(lm_102.5)
#Regress the historical volatility with volatility of 102.5% option
lm_100<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_100.0.MNY_DF)
summary(lm_100)
#Regress the historical volatility with volatility of 100% option
lm_97.5<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_97.5.MNY_DF)
summary(lm_97.5)
#Regress the historical volatility with volatility of 97.5% option
lm_95<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_95.0.MNY_DF)
summary(lm_95)
#Regress the historical volatility with volatility of 95% option
lm_90<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_90.0.MNY_DF)
summary(lm_90)
#Regress the historical volatility with volatility of 90% option
HVT_30_change<-diff(HVT_30)
#Calculate the absolute change of monthly historical volatility for 01/2008 to 09/2017
moneyness<-read.csv('Natural gas moneyness.csv')[1:T,]
#Import the monthly moneyness implied volatility for 12/2007 to 09/2017
moneyness_110_change<-diff(moneyness$X30DAY_IMPVOL_110.0.MNY_DF)
moneyness_105_change<-diff(moneyness$X30DAY_IMPVOL_105.0.MNY_DF)
moneyness_102.5_change<-diff(moneyness$X30DAY_IMPVOL_102.5.MNY_DF)
moneyness_100_change<-diff(moneyness$X30DAY_IMPVOL_100.0.MNY_DF)
moneyness_97.5_change<-diff(moneyness$X30DAY_IMPVOL_97.5.MNY_DF)
moneyness_95_change<-diff(moneyness$X30DAY_IMPVOL_95.0.MNY_DF)
moneyness_90_change<-diff(moneyness$X30DAY_IMPVOL_90.0.MNY_DF)
#Calculate the absolute change of the monthly moneyness implied volatility for
#01/2008 to 09/2017
lm_110<-lm(HVT_30_change ~ moneyness_110_change)
summary(lm_110)
#Regress the change of historical volatility with that of volatility of 110% option
lm_105<-lm(HVT_30_change ~ moneyness_105_change)
summary(lm_105)
#Regress the change of historical volatility with that of volatility of 105% option
lm_102.5<-lm(HVT_30_change ~ moneyness_102.5_change)
summary(lm_102.5)
#Regress the change of historical volatility with that of volatility of 102.5% option
lm_100<-lm(HVT_30_change ~ moneyness_100_change)
summary(lm_100)
#Regress the change of historical volatility with that of volatility of 100% option
lm_97.5<-lm(HVT_30_change ~ moneyness_97.5_change)
summary(lm_97.5)
#Regress the change of historical volatility with that of volatility of 97.5% option
lm_95<-lm(HVT_30_change ~ moneyness_95_change)
summary(lm_95)
#Regress the change of historical volatility with that of volatility of 95% option
lm_90<-lm(HVT_30_change ~ moneyness_90_change)
summary(lm_90)
#Regress the change of historical volatility with that of volatility of 90% option
rm(list=ls())
### Clean the R workspace
gas_daily1<-Quandl("CHRIS/CME_NG1",api_key="2T1Yy7mQwKqsGtFXKtCy",type="raw",
collapse="daily",start_date="2007-12-31",end_date="2017-07-20")
gas_daily<-gas_daily1[ nrow(gas_daily1):1, ]
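#Daily log returns and sample size (assumed definitions; Ret_total, T and Pred are not defined here)
Ret_total <- diff(log(gas_daily$Settle))
T <- length(Ret_total)
Pred <- rep(NA_real_, T)
for (i in 1001:T){
#Fit GARCH(1,1) on the previous 1000 returns
Retwindow <- Ret_total[(i-1000):(i-1)]
fit1 <- garchFit( formula = ~garch(1, 1), data = Retwindow, trace = FALSE)
#Keep the 21st step of the 21-day-ahead forecast (assumed; predict returns 21 values)
Pred[i] <- predict(fit1, n.ahead=21)$standardDeviation[21]
print(i)
#predicted period: from 01/2008 to 07/2017
}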
write.csv(Pred,file='result.csv')
#Regression Prepare
result<- read.csv('gas_daily.csv')
#Pick the last day of one month as the month volatility
Month=result$Month[1001:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1001:T]
Predd<-Pred[aa]
#add last day of data as the last month volatility
Predd<-c(Predd,Pred[length(Pred)])
##download data
gas_daily <- Quandl("CHRIS/CME_NG1", api_key='TPywx-DUcfEE4VMynwHR',
type='raw', collapse='daily', order = 'asc', start_date="2007-11-02",end_date="2017-11-08")
rownames(gas_daily) <- gas_daily$Date
gas_daily$Date <- NULL
ret_total <- diff(log(gas_daily$Settle))
gas_hvt <- read.csv('Natural gas HVT_daily.csv', header = TRUE)
hvt <- gas_hvt[,2]
T <- length(ret_total)
# iterating over the windows, and trying to find one with least Mean Squared Error(MSE)
forecast_length = c(100, 200, 300, 500, 1000)
refit_length = c(5, 10, 20, 50)
optimal_window = NULL
optimal_refit = NULL
mse = Inf
opt_forc = NULL
model = ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit2 = ugarchfit(data=ret_total, spec=model)
for(i in seq_along(forecast_length)){
for (j in seq_along(refit_length)){
#Rolling one-step-ahead forecast for this (forecast length, refit length) pair
rollforc = ugarchroll(spec=model, data=ret_total, n.ahead = 1, forecast.length =
forecast_length[i], refit.every = refit_length[j], refit.window = c('recursive'), solver = 'hybrid',
keep.coef = TRUE)
#Predicted sigma as a numeric vector so that mean() returns a number
sigmapred = as.numeric(rollforc@forecast$density$Sigma)
error = mean((hvt[(T-forecast_length[i]):(T-1)]- sigmapred)^2)
if (error <= mse){
mse = error
optimal_window = forecast_length[i]
optimal_refit = refit_length[j]
opt_forc = rollforc}
}}
print(optimal_window)
print(optimal_refit)
print(mse)
plot(opt_forc, which=2)
#use linear regression to regress HVT of every last day of a month on the volatility
#predicted by EGARCH on the last day of the previous month.
result<-read.csv('result.csv', header = TRUE, sep = '\t')
colnames(result)[1] = 'Month'
Monthdiff<-diff(result$Month)
aa<-which(Monthdiff!=0)
HVT<-result$HVT[aa]
Pred<-result$Sigmapred[aa]
HVTd<-HVT[2:48]
Predd<-Pred[1:47]
fit2<-lm(HVTd~Predd)
summary(fit2)
plot(HVTd,type='l')
points(Predd*1954.292-8.917,type='l',col='blue')
GJR-GARCH
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
library(rugarch)
library(parallel)
gas_daily1<-Quandl("CHRIS/CME_NG1",api_key="2T1Yy7mQwKqsGtFXKtCy",type="raw",
collapse="daily",start_date="2007-12-31",end_date="2017-09-30")
gas_daily<-gas_daily1[ nrow(gas_daily1):1, ]
price<-gas_daily$Settle
#Predict 21 days ahead, could adjust different garch model inside loop.
#Check the output csv file whether there is 999 data in Pred. If there is, it means there is
#an error in that loop.
#Simply substitute i-1's data if Pred[i] is 999
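#Daily log returns and sample size (assumed definitions; Ret_total and T are not defined here)
Ret_total<-diff(log(price))
T<-length(Ret_total)
pred<-numeric(1442)
for (i in 1001:T){
tryCatch({
Retwindow<-Ret_total[(i-1000):(i-1)]
spec = ugarchspec(variance.model=list(model="gjrGARCH"), distribution.model="std")
fit = ugarchfit(spec, Retwindow)
#Simulate 100 paths and keep the 21-day-ahead sigma forecast
bootp = ugarchboot(fit, method = c("Partial", "Full")[1],n.ahead = 21, n.bootpred = 100)
predsigma<-bootp@forc@forecast$sigmaFor[21]
pred[i]=predsigma
print(i)
}, error=function(e){
#Use <<- so that the flag is written to the outer pred vector, not a local copy
pred[i]<<-999})
}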
plot(pred,type='l')
write.csv(pred,file='GJRGarch-gas.csv')
write.csv(gas_daily,file='gas-daily.csv')
#Here we divide Date into Year, Month and Day as three columns in excel manually.
result=read.csv('gas-daily.csv')
data_real=read.csv('gas vol.csv')
Real_vol=data_real$VOLATILITY_30D[50:117]/100
Month=result$Month[1002:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1002:T]
Predd<-Pred[aa]
plot(Predd,type='l')
Busday<-read.csv('busdays.csv')
Predddd<-Predd*sqrt(Busday$busdays[2:69])
fit2<-lm(Real_vol~Predddd)
summary(fit2)
plot(Real_vol,type='l')
points(Predddd*fit2$coefficients[2]+fit2$coefficients[1],type='l',col='blue',lwd=2)