Forecasting Crude Oil and Natural Gas Volatility

University of Connecticut
Xiao Liang
Yiran Liao
Ziran Luo
Yuechen Pan
Yuqi Peng
Zhiyong Yan
Zhaojie Yang
Professor Biolsi
Vince Lanci-Echo Bay Partners
8 December 2017
1. Introduction
1.1 Background
1.2 Literature Review
2. Data
2.1 Data Source for EGARCH and GJR-GARCH
2.2 Mean Reversion Test
2.2.1 Hurst Exponent
2.2.2 Augmented Dickey--Fuller Test
2.2.3 Conclusion
3. Prediction
3.1 VIX
3.1.1 Theory foundation of OVX
3.1.1.1 VIX methodology
3.1.2 Prediction
3.1.3 Conclusion
3.2 Option Moneyness
3.2.1 Moneyness implied volatility and historical volatility
3.2.2 Absolute change of moneyness implied volatility and that of historical volatility
3.2.3 Conclusion
3.3 GARCH (1,1) Model
3.3.1 Theory foundation of GARCH (1,1)
3.3.2 Conclusion
3.4 EGARCH Model
3.4.1 Theory foundation of EGARCH
3.4.2 Prediction
3.4.3 Conclusion
3.5 GJR GARCH Model
3.5.1 Theory foundation of GJR-GARCH
3.5.2 Prediction
3.5.3 Conclusion
4. Model Comparison and Conclusion
4.1 Comparison
4.2 Conclusion
References
Figures
Figures – Crude Oil
Figures – Natural Gas
Tables
Tables - Crude Oil
Tables - Natural Gas
APPENDIX
R Code - Crude oil
R Code - Natural Gas
1. Introduction
1.1 Background
Crude oil is arguably one of the most important driving forces of the global economy,
and changes in the price of oil have significant effects on economic growth and welfare
around the world. Its importance is even greater for industrialized economies.
Today, oil is one of the most important raw materials in the world and will likely remain so
for many decades to come. Most countries are significantly affected by developments in the
oil market, whether as producers, consumers, or both. Oil is directly responsible for about 2.5%
of world GDP. In 2014, oil provided about 38% of the world’s energy needs, and it is
expected to remain a leading component of the world’s energy supply.
According to estimates, to meet the projected increase in world oil demand, total petroleum
supply must reach 118 million barrels per day in 2030, up from 80 million barrels per
day in 2003. Every day, we use hundreds of things that are made from oil or gas,
such as gasoline, diesel, plastic, synthetic fiber, and pitch.
Natural gas is used in a remarkable number of ways. Although it is widely seen as a
cooking and heating fuel in most U.S. households, natural gas has many other energy and raw
material uses that surprise most people who learn about them. In the United States,
most natural gas is burned as a fuel. In 2012, about 30% of the energy consumed across the
nation was obtained from natural gas. It was used to generate electricity, heat buildings, fuel
vehicles, heat water, bake foods, power industrial furnaces, and even run air conditioners.
Oil and gas power nearly 100% of all transportation. The consumption of oil and gas is
very large. The world’s oil and gas transport infrastructure is a globe-spanning web of
pipelines and shipping routes. The natural gas distribution pipelines in the US alone could
stretch from Earth to the Moon 7-8 times. There are millions upon millions of miles of pipe
on the planet to distribute crude oil, refined products, and natural gas. There is little reason
to think any feasible amount of renewables growth can displace fossil fuels within a
couple of generations. Wind and solar are growing fast, but the share of renewables in
total world energy consumption increased by only 0.07% from 1973 to 2009.
Oil and natural gas not only play huge roles in the world economy but also have a strong
influence on global crises. Understanding the macro impacts of oil and gas prices also requires
considering in detail the exposure and interactions of micro channels, such as the housing or
auto sector.
Predicting future oil and gas prices is therefore important. Higher oil and gas prices
mean higher business costs. From a financial perspective, many sectors of the economy are
adversely affected when oil and gas prices increase and helped when they fall. The effects can be
very different for importing and exporting countries. These price effects are universal and fast
changing. During the past decade, the price of oil has traveled from $60 per barrel to a peak
of $145 in 2008 and subsequently descended again to $50 in 2017.
Here is how crude oil and natural gas prices reacted to a variety of geopolitical and economic
events during the past decade. 2008 was a remarkable year. The crude oil price rose to
its pre-financial-crisis peak of $145 due to unrest and consumers’ fears about the
wars in both Iraq and Afghanistan. In addition, it was the time of the 2008 Beijing
Olympic Games: millions of travelers entered the country, which drove up the demand for oil.
The crude oil price began to fall when the global market collapsed later in the same
year. Now consider the natural gas price changes in that period. Natural gas prices normally
fall in the summer and rise in the winter. In 2008, however, instead of following this
seasonal pattern, a large price spike took place in the summer, and the price then quickly
dropped from its peak of over $13 per Mcf to below $3 per Mcf in the winter, as the economic
recession reduced demand. It followed the pattern of oil prices, rising in the summer and
falling in the winter. After the economy had recovered for two years, the crude oil price
fluctuated in a relatively small range and maintained steady growth from 2011 to early 2014,
while the price of natural gas fluctuated slightly without any major spikes. Due to
oversupply by OPEC and the appreciation of the U.S. dollar in the second half of 2014, oil and
gas demand was driven down. The crude oil price fell to levels not seen since the last global
recession. The Iran nuclear deal and the turmoil in Iraq and Libya also contributed to the
declining crude oil price and affected geopolitical risk in the market. Crude oil and natural
gas prices both reached their lowest points at the beginning of 2016, below $30 and $2
respectively.
Over the last couple of decades, volatility has become one of the significant issues in the
energy market. Energy prices appear to be the most volatile of all commodity prices.
Crude oil, natural gas, coal and other energy products all exhibit
significant price fluctuations. These fluctuations create uncertainty in the minds of
consumers and producers. Oil price shocks due to such events have increased in
size and frequency. Wide fluctuations in oil prices have played an important role in driving
recessions and even the collapse of regimes, which is why oil price movements are closely
watched by economists and investors. Given this evidence of significant changes in crude
oil and natural gas prices, we naturally ask whether the econometric tools available
today can help us forecast volatility accurately.
From a finance perspective, in the current context of an ongoing global financial crisis,
risk management and volatility forecasting are among the most important topics in the
financial world. Return is closely related to volatility, and most
financial decisions are based on a tradeoff between risk and return. Thus, in order
to analyze and predict changes in the returns of crude oil and natural gas, we first have to
analyze their volatility.
Overall, the fluctuation in crude oil and natural gas prices over the past decade has had a
significant impact on the stock market and the global economy. For risk managers, it is important to
understand the degree of volatility in any investment, along with its potential impact on the
overall investment strategy since volatility directly affects the investment valuation.
Therefore, being able to understand and predict future volatility is very important in
investment decision making.

1.2 Literature Review
There are two main sources of volatility forecasts. One is the approach
based on time series, and the other is the volatility implied by option prices. From a
theoretical point of view, the implied volatility of an option price should contain all the
available information relevant to forecasting future volatility.
However, the actual situation is more complicated. In general, implied volatility
contains a risk premium because volatility risk cannot be completely hedged, as shown in
the study of Bollerslev and Zhou (2005). In addition, one of the most noteworthy phenomena
is the smile effect, which shows the limitations of the classic Black-Scholes model. The smile
effect means that when implied volatility is calculated for options with different strikes on
the same underlying and with the same time to maturity, one does not necessarily get the same
implied volatility. In general, implied volatility plotted against strike is U-shaped, and the
minimum occurs at the at-the-money position. Thus, if we want to use implied volatility to
predict future volatility, the same market gives multiple forecasts for the future volatility of
the same underlying asset over the same time period.
In the literature, many models have been used to forecast crude oil volatility. The
most widely used are the ARCH model proposed by Engle (1982) and its
generalization by Bollerslev (1986). In his classic study, Engle distinguished conditional from
unconditional variance for the first time, but the model is simple and needs many
parameters. Bollerslev (1986) extended the ARCH model to the Generalized Autoregressive
Conditional Heteroscedasticity (GARCH) model, which has the same key properties as ARCH
but requires far fewer parameters to adequately model the volatility process. Building on their
research, we can construct models that simultaneously describe both the mean and the variance
of financial time series. These approaches are significant improvements in time series
analysis.
Time-invariant GARCH(1,1) models have fared well in predicting the conditional
volatility of financial assets (Hansen and Lunde 2005). Moreover, oil price volatility has
traditionally been modeled as a time-invariant GARCH process. However, the GARCH model
cannot capture the asymmetric effects of positive and negative asset
returns. To overcome this weakness, Nelson (1991) proposed an extension to the
GARCH model called the Exponential GARCH (EGARCH), which allows for such asymmetric
effects. Another widely used extension of the GARCH model is the
GJR-GARCH proposed by Glosten, Jagannathan and Runkle (1993). GJR-GARCH has been
shown to have good out-of-sample performance when forecasting oil price volatility at short
horizons (Mohammadi and Su 2010; Hou and Suardi 2012). Building on previous
papers, Wei et al. (2010) studied nine GARCH models and compared their forecasting
accuracy using six different loss functions. They concluded that although
nonlinear models can properly capture long-memory volatility and the asymmetric leverage
effect, these models are not the best models for forecasting
volatility.
2. Data
In this paper, the data, obtained from Bloomberg, cover West Texas
Intermediate (WTI) crude oil, the underlying commodity of the New York Mercantile
Exchange's oil futures contracts, and natural gas traded on the New York Mercantile Exchange
(NYMEX) division of the Chicago Mercantile Exchange (CME). The rationale for choosing WTI
futures is that they provide a sufficient amount of data to measure and
compare the forecasting accuracy of different models. We obtained daily and
monthly closing prices for crude oil over a 10-year period, from 2008 to 2017. Options are
divided into 7 different strike price categories ranging from 10% out-of-the-money to 10%
in-the-money. The estimation of the historical model is based on a 21-day window of
realized volatility.
Different models provide different series of predicted volatilities. To find the best
prediction model, we run regressions between our predicted numbers and HVT. HVT is the
historical volatility of the oil price, which we downloaded directly from
Bloomberg. Historical volatility reflects the actual price changes of oil over a
given time period. Therefore, the model whose regression results fit HVT best is the
optimal one. HVT is usually calculated over several different moving windows. The window
is the number of days included in the HVT calculation, and the window moves forward by one
day each day. We take the 30-day HVT, which represents monthly volatility, for the VIX, option
moneyness, GARCH(1,1), and GJR-GARCH models, and use the 10-day, 30-day, 50-day,
and 100-day windows for the EGARCH model. Over the period from January 2nd, 2008 to July
20th, 2017, the average volatilities for crude oil are 34.62%, 35.73%, 36.05% and 36.57%
for the 10-day, 30-day, 50-day, and 100-day windows, and the standard deviations are
21.65%, 19.18%, 18.42%, and 17.42% respectively. For natural
gas, with data from November 2nd, 2007 to November 8th, 2017, the average volatilities for
the 10-day and 30-day windows are 44.79% and 46.00%, and the standard deviations are
20.45% and 16.80%. The maximum 10-day volatility, 172.6%, was observed on
September 29th, 2009. The maximum 30-day volatility, 130.66%, was observed on October 1st,
2009. This means that average volatility rises as the window lengthens, while volatility
becomes more stable.
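The moving-window calculation described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Bloomberg's HVT formula: the 252-day annualisation factor, the use of the sample standard deviation, the function names, and the prices are all assumptions made for the example.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualisation factor; not stated in the text


def log_returns(prices):
    """Daily log returns ln(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_hist_vol(prices, window):
    """Annualised volatility in percent for each moving window of returns."""
    rets = log_returns(prices)
    vols = []
    # The window slides forward one day at a time.
    for end in range(window, len(rets) + 1):
        sample_sd = statistics.stdev(rets[end - window:end])
        vols.append(100.0 * sample_sd * math.sqrt(TRADING_DAYS))
    return vols


# Made-up prices for illustration only (not the Bloomberg data).
prices = [50.0, 51.0, 49.5, 50.2, 52.0, 51.1, 50.7, 53.0, 52.4, 51.9, 52.8, 54.0]
vol_5d = rolling_hist_vol(prices, window=5)
vol_10d = rolling_hist_vol(prices, window=10)
```

A longer window averages over more returns, which is why the longer-window series in the text have smaller standard deviations.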

2.1 Data Source for EGARCH and GJR-GARCH
For crude oil, in both the EGARCH and GJR-GARCH models, we use the daily spot price of
West Texas Intermediate (WTI) crude oil, obtained through an R package. The sample
period ranges from January 2nd, 2008 to July 20th, 2017. Over this period, the
average price of a barrel of crude oil was $77.31, the median was $82.18, and the
standard deviation was $24.79. A maximum price of $145.29 was observed on July 3, 2008,
and the minimum price of $26.21 on February 11, 2016. To model the returns on the oil
price and their volatility, we calculate daily oil returns as the difference in the logarithms
of consecutive days' closing prices. The mean rate of return is about -0.031% with a standard
deviation of 2.49%. WTI returns are slightly positively skewed at 0.2012.
Kurtosis is somewhat high at 4.54, compared with 3 for a normal distribution.
Large variations are observed during the global financial crisis in late 2008 and after crude
oil prices started decreasing in July 2014. The reason for choosing this data range is given
in the Background section, and the time period is sufficiently long for us to make a
reliable prediction.
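The return statistics quoted above (mean, standard deviation, skewness, kurtosis) follow from the log-return definition. The Python sketch below shows one way to compute them; it assumes population (divide-by-n) moments and non-excess kurtosis, so that a normal distribution scores 3, and the function names are illustrative rather than taken from the R code in the appendix.

```python
import math


def log_returns(prices):
    """Difference in the logarithms of consecutive closing prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def moments(xs):
    """Return (mean, std dev, skewness, kurtosis) using population moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # variance
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sd = math.sqrt(m2)
    # Kurtosis is NOT reduced by 3 here, matching the "compared with 3" remark.
    return mean, sd, m3 / sd ** 3, m4 / m2 ** 2


# A symmetric toy sample has zero skewness.
mean, sd, skew, kurt = moments([-2.0, -1.0, 0.0, 1.0, 2.0])
```

Positive skewness means the right tail of the return distribution is longer, and kurtosis above 3 means fatter tails than a normal distribution.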
For natural gas, we use the daily natural gas futures price from CME. The time range is from
November 2nd, 2007 to November 8th, 2017, comprising 2510 observations in total. In this
period, the average price of natural gas is $4.11 per MMBtu, the median is $3.70, and the
standard deviation of daily futures prices is $1.96. A maximum price of $13.577 was observed
on July 3, 2008 and the minimum price of $1.63 on March 3, 2016. The maximum price of
natural gas occurred on the same day as that of crude oil, and the minimum prices of gas and
oil occurred close together in time. As with crude oil, we calculate daily gas returns as the
difference in the logarithms of consecutive days' closing prices. The mean rate of return is
-0.039% with a standard deviation of 3.04%. Natural gas returns are positively skewed at
0.6394 and have a slightly high kurtosis of 4.85. From these simple calculations of natural
gas prices and returns, we can observe that the price tendencies of crude oil and natural gas
are somewhat similar.
For the comparable historical volatility, we download the 10-day, 30-day, 50-day, and
100-day HVT for the EGARCH model and the 30-day HVT for the GJR-GARCH model.

2.2 Mean Reversion Test
2.2.1 Hurst Exponent
The Hurst exponent is the classical test for detecting long memory in time series. This
analysis was introduced by the English hydrologist H.E. Hurst in 1951, based on Einstein’s
contributions regarding the Brownian motion of physical particles, to deal with the problem of
reservoir control near the Nile River Dam. R/S analysis was introduced into economics by
Mandelbrot, who argued that this methodology was superior to autocorrelation,
variance analysis, and spectral analysis.
The oldest and best-known method for estimating the Hurst exponent is R/S analysis. It was
proposed by Mandelbrot and Wallis, based on the previous work of Hurst.
When the process is a Brownian motion, H is 0.5; when it is persistent, H is
greater than 0.5; and when it is anti-persistent, H is less than 0.5. For white
noise, H = 0, while for a simple linear trend, H = 1. Note that H must lie between 0 and 1.
We used the package “pracma 2.1.1” in R and the function hurstexp(x, box, display),
and obtained the following result:

Series   Simple R/S Hurst estimation   Result
Gas      0.836415                      Persistent
Oil      0.9978132                     Persistent
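To make the R/S idea concrete, here is a minimal Python sketch of a rescaled-range estimator. It is not the pracma hurstexp routine used for the table above: the block sizes, the doubling scheme, and the function names are assumptions chosen for brevity. R/S analysis takes the increments of a process as input, so an uncorrelated input corresponds to the Brownian-motion case and should give an estimate near 0.5, while a strongly trending input gives an estimate close to 1.

```python
import math
import random
import statistics


def rescaled_range(chunk):
    """R/S of one block: range of cumulative deviations divided by std dev."""
    mean = sum(chunk) / len(chunk)
    cum, lo, hi = 0.0, 0.0, 0.0
    for x in chunk:
        cum += x - mean
        lo, hi = min(lo, cum), max(hi, cum)
    sd = statistics.pstdev(chunk)
    return (hi - lo) / sd if sd > 0 else None  # skip constant blocks


def hurst_rs(series, min_block=8):
    """Slope of log(mean R/S) against log(block size), doubling the block size."""
    xs, ys = [], []
    size = min_block
    while size <= len(series) // 2:
        blocks = [series[i:i + size] for i in range(0, len(series) - size + 1, size)]
        rs = [r for r in map(rescaled_range, blocks) if r is not None]
        if rs:
            xs.append(math.log(size))
            ys.append(math.log(sum(rs) / len(rs)))
        size *= 2
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)


random.seed(1)  # reproducible made-up data
noise = [random.gauss(0.0, 1.0) for _ in range(2048)]
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

h_noise = hurst_rs(noise)  # uncorrelated input
h_walk = hurst_rs(walk)    # cumulated (strongly persistent) input
```

Small-sample R/S estimates are known to be biased upward, so values for uncorrelated data typically land somewhat above 0.5 rather than exactly on it.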
2.2.2 Augmented Dickey--Fuller Test
An augmented Dickey–Fuller (ADF) test tests the null hypothesis that a unit root is
present in a time series sample. The alternative hypothesis depends on which
version of the test is used, but is usually stationarity or trend-stationarity. It is an augmented
version of the Dickey–Fuller test for a larger and more complicated set of time series models.
If the p-value is smaller than our threshold (5% for 95% confidence, for example), the null
hypothesis of a unit root is rejected, which is evidence that the process is stationary.
Otherwise, if the p-value is larger than 5%, the null hypothesis cannot be rejected, and the
series is consistent with a random walk process.
We used the package “tseries 0.10-42” and the function adf.test,
and obtained the following result:

Series   ADF.test p-value   Result
Gas      0.3616             Unit root not rejected at 5%
Oil      0.5701             Unit root not rejected at 5%
2.2.3 Conclusion
The Hurst exponents of both natural gas and crude oil are above 0.5, which indicates
persistent, long-memory behavior, and the ADF p-values are above 5%, so a unit root in the
price series cannot be rejected. As described in Section 2.1, our models are fitted to log
returns rather than price levels, and we use GARCH models for prediction.
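The logic of the unit-root test can be illustrated with a simplified Dickey-Fuller regression in Python. This sketch is not the tseries adf.test routine: it omits the lagged differences and the trend term, and it compares the statistic with the asymptotic 5% critical value of -2.86 for the constant-only case instead of computing a p-value. The function name and the simulated data are assumptions for illustration.

```python
import math
import random


def dickey_fuller_stat(y):
    """t-statistic of gamma in  dy_t = a + gamma * y_{t-1} + e_t  (no lag terms)."""
    x = y[:-1]
    dy = [b - a for a, b in zip(y, y[1:])]
    n = len(x)
    mx, md = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    gamma = sum((xi - mx) * (di - md) for xi, di in zip(x, dy)) / sxx
    a = md - gamma * mx
    rss = sum((di - a - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of gamma
    return gamma / se


CRITICAL_5PCT = -2.86  # asymptotic 5% value, regression with constant, no trend

random.seed(0)  # reproducible made-up data
shocks = [random.gauss(0.0, 1.0) for _ in range(500)]
random_walk, stationary = [0.0], [0.0]
for e in shocks:
    random_walk.append(random_walk[-1] + e)       # unit root
    stationary.append(0.2 * stationary[-1] + e)   # stationary AR(1)

stat_walk = dickey_fuller_stat(random_walk)
stat_stationary = dickey_fuller_stat(stationary)
# A statistic below CRITICAL_5PCT rejects the unit-root null at the 5% level.
reject_stationary = stat_stationary < CRITICAL_5PCT
```

The direction of the decision is the point of the example: a strongly negative statistic (equivalently, a small p-value) is evidence against a unit root.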

3. Prediction
3.1 VIX
3.1.1 Theory foundation of OVX
The CBOE Crude Oil ETF Volatility Index ("Oil VIX", Ticker - OVX) measures the
market's expectation of 30-day volatility of crude oil prices by applying the VIX
methodology to United States Oil Fund, LP (Ticker - USO) options spanning a wide range of
strike prices.
The United States Oil Fund is an exchange-traded security designed to track changes in
crude oil prices. By holding near-term futures contracts and cash, the performance of the
Fund is intended to reflect, as closely as possible, the spot price of West Texas Intermediate
light, sweet crude oil, less USO expenses.
The CBOE Volatility Index (VIX Index) is considered by many to be the world's
premier barometer of equity market volatility. The VIX Index is based on real-time prices of
options on the S&P 500 Index (SPX) and is designed to reflect investors' consensus view of
future (30-day) expected stock market volatility. The VIX Index is often referred to as the
market's "fear gauge".
In 2008, CBOE pioneered the use of the VIX methodology to estimate expected
volatility of certain commodities and foreign currencies. The CBOE Crude Oil ETF Volatility
Index (OVX), CBOE Gold ETF Volatility Index (GVZ) and CBOE Eurocurrency ETF
Volatility Index (EVZ) use exchange-traded fund options based on the United States Oil
Fund, LP (USO), SPDR Gold Shares (GLD) and CurrencyShares Euro Trust (FXE),
respectively.
CBOE has since introduced several new volatility indexes, including volatility indexes
based on individual stocks, as well as the CBOE U.S. Energy Sector ETF Volatility Index
(VXXLE). However, there is still no VIX-style index that tracks natural gas volatility alone.
As shown below, the VIX calculation gives different weights to different options: the
lower the strike price, the higher the weight. In our experience, the implied volatility of
the 90% moneyness put option might be a reasonable substitute for the nonexistent natural
gas VIX.
3.1.1.1 VIX methodology
Stock indexes, such as the S&P 500, are calculated using the prices of their component
stocks. Each index employs rules that govern the selection of component securities and a
formula to calculate index values.
The VIX Index is a volatility index comprised of options rather than stocks, with the
price of each option reflecting the market’s expectation of future volatility. Like conventional
indexes, the VIX calculation employs rules for selecting component options and a formula to
calculate index values.
The generalized formula used in the VIX calculation is:
$$\sigma^2 = \frac{2}{T} \sum_i \frac{\Delta K_i}{K_i^2} e^{RT} Q(K_i) - \frac{1}{T} \left[ \frac{F}{K_0} - 1 \right]^2$$
Where...
σ is VIX/100, so VIX = σ × 100
T Time to expiration
F Forward index level derived from index option prices
K_0 First strike below the forward index level, F
K_i Strike price of the i-th out-of-the-money option; a call if K_i > K_0; a put if K_i < K_0; both
put and call if K_i = K_0
ΔK_i Interval between strike prices, half the difference between the strikes on either side of K_i:
$$\Delta K_i = \frac{K_{i+1} - K_{i-1}}{2}$$
(Note: ΔK for the lowest strike is simply the difference between the lowest strike and
the next higher strike. Likewise, ΔK for the highest strike is the difference between the
highest strike and the next lower strike.)
R Risk-free interest rate to expiration
Q(K_i) The midpoint of the bid-ask spread for each option with strike K_i
GETTING STARTED:
The VIX calculation measures time to expiration, T, in calendar days and divides each
day into minutes in order to replicate the precision that is commonly used by professional
option and volatility traders. The time to expiration is given by the following expression:
$$T = \frac{M_{\text{current day}} + M_{\text{settlement day}} + M_{\text{other days}}}{\text{Minutes in a year}}$$
WHERE...
M_current day = minutes remaining until midnight of the current day
M_settlement day = minutes from midnight until 8:30 a.m. for “standard” SPX
expirations; or minutes from midnight until 3:00 p.m. for “weekly”
SPX expirations
M_other days = total minutes in the days between current day and expiration day
STEP 1: Select the options to be used in the VIX calculation
$$F = \text{Strike Price} + e^{RT} \times (\text{Call Price} - \text{Put Price})$$

STEP 2: Calculate volatility for both near-term and next-term options
STEP 3: Calculate the 30-day weighted average of $\sigma_1^2$ and $\sigma_2^2$. Then take the
square root of that value and multiply by 100 to get VIX.
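The single-term part of the calculation (the time to expiration, the forward level in Step 1, the strike intervals, and the variance in Step 2) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not CBOE's production procedure: it skips the option-selection rules, assumes 525,600 minutes in a year, and uses hypothetical function names and made-up quotes.

```python
import math

MINUTES_PER_YEAR = 525_600  # assumes a 365-day year


def time_to_expiration(m_current_day, m_settlement_day, m_other_days):
    """T as a fraction of a year, measured in minutes."""
    return (m_current_day + m_settlement_day + m_other_days) / MINUTES_PER_YEAR


def forward_level(strike, call_price, put_price, rate, t):
    """F = Strike Price + e^{RT} * (Call Price - Put Price)."""
    return strike + math.exp(rate * t) * (call_price - put_price)


def strike_intervals(strikes):
    """Delta K_i: half the gap between neighbouring strikes, one-sided at the ends."""
    last = len(strikes) - 1
    out = []
    for i in range(len(strikes)):
        if i == 0:
            out.append(strikes[1] - strikes[0])
        elif i == last:
            out.append(strikes[last] - strikes[last - 1])
        else:
            out.append((strikes[i + 1] - strikes[i - 1]) / 2.0)
    return out


def term_variance(strikes, quotes, forward, k0, rate, t):
    """sigma^2 = (2/T) * sum(dK_i / K_i^2 * e^{RT} * Q(K_i)) - (1/T) * (F/K_0 - 1)^2."""
    dks = strike_intervals(strikes)
    total = sum(dk / k ** 2 * math.exp(rate * t) * q
                for dk, k, q in zip(dks, strikes, quotes))
    return (2.0 / t) * total - (1.0 / t) * (forward / k0 - 1.0) ** 2


# Made-up quotes for illustration only.
strikes = [90.0, 95.0, 100.0, 110.0]
quotes = [0.8, 1.5, 3.0, 0.9]           # mid prices Q(K_i) of the selected options
t = time_to_expiration(600, 510, 40_320)
f = forward_level(100.0, 3.2, 2.9, rate=0.01, t=t)
sigma2 = term_variance(strikes, quotes, f, k0=100.0, rate=0.01, t=t)
index_level = 100.0 * math.sqrt(sigma2)  # single-term analogue of VIX = sigma * 100
```

Because each quote is divided by the square of its strike, low-strike options receive larger weights for the same price and strike spacing.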
3.1.2 Prediction
We now run regressions between the historical volatilities and the OVX
data, and then judge whether OVX is a good predictor of future WTI oil price
volatility. Since we use monthly OVX data to predict the 30-day volatility, there is
always a one-month lag.
In part one, the data range for OVX is from Nov-2007 to Jun-2017 and the data range
for HVT is from Dec-2007 to July-2017.
We then regress HVT on the corresponding OVX level. The multiple
R-squared for this regression is 0.1863 and the adjusted R-squared is
0.1792.
In part two, we run a regression on the absolute changes of HVT and OVX. The
absolute changes of OVX range from (Dec-2007 minus Nov-2007) to (Jun-2017 minus
May-2017), and those of HVT from (Jan-2008 minus Dec-2007) to (July-2017 minus
Jun-2017).
We then regress the absolute change of HVT on the corresponding absolute change of OVX. The
multiple R-squared for this regression is 4.108e-05 and the adjusted R-squared is
-0.008808.
For natural gas volatility, we do the same. First, we import the monthly natural gas
historical volatility for the period from Jan-2008 to Sept-2017 and the implied
volatility of the 90% put option for the period from Dec-2007 to Aug-2017.
We then run a regression on the two series. The multiple R-squared for
this regression is 0.6117 and the adjusted R-squared is 0.6083.
Running the regression on the absolute changes of natural gas HVT and the implied
volatility of the 90% put option gives a multiple R-squared of 0.2377 and
an adjusted R-squared of 0.231.
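The lagged regression and the two goodness-of-fit figures can be sketched in Python. The helper names are hypothetical, and the observation counts are an inference from the stated date ranges (Nov-2007 to Jun-2017 is 116 months, giving 115 month-on-month changes), not numbers reported in the text. With one regressor, the adjusted value is 1 - (1 - R²)(n - 1)/(n - 2).

```python
def lagged_pairs(predictor, target, lag=1):
    """Pair predictor[t] with target[t + lag] for two series on the same dates."""
    return predictor[:len(predictor) - lag], target[lag:]


def ols_r_squared(x, y):
    """R-squared of a simple regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def adjusted_r_squared(r2, n, k=1):
    """Penalise R-squared for k regressors estimated from n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Reproducing the reported adjustments under the assumed sample sizes.
adj_levels = adjusted_r_squared(0.1863, n=116)      # about 0.1792
adj_changes = adjusted_r_squared(4.108e-05, n=115)  # about -0.008808
```

An adjusted R-squared below zero, as in the regression on absolute changes, means the predictor explains less variation than would be expected by chance for that sample size.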
3.1.3 Conclusion
From the discussion above, we can conclude that OVX is not an entirely reliable
predictor of future crude oil volatility.
OVX measures not the actual volatility of the crude oil price but the volatility
implied by its option prices, which means that supply and demand factors for options
are included in OVX. When demand for options is high, option prices are higher,
and so is the implied volatility of these options.
The VIX does not have the necessary predictive power in the real world. Sometimes
realized volatility rises when markets rise, and the VIX rises in this case, but that does not
necessarily indicate that market sentiment is developing into a panic. Conversely, the VIX
may fall as markets fall. The 70 percent fall in the VIX in 2009 is likely to be the result
of a decline in historical volatility and a possible retreat of investor anxiety.
Just as there is no causal relationship between the VIX and the S&P 500, there is none
between OVX and crude oil: a rising OVX cannot force the crude oil price down,
which partly explains why OVX is not an excellent predictor for crude oil.
For natural gas, we obtained a fairly good R-squared when using the implied volatility of
the 90% put option as a substitute. However, in the real world, a VIX that tracks the volatility
of natural gas option prices does not exist. Therefore, using the VIX methodology to track
natural gas volatility is not feasible.

3.2 Option Moneyness
3.2.1 Moneyness implied volatility and historical volatility
One of the best possible forecasts of future realized volatility is moneyness implied
volatility. Moneyness is the relative position of the current price or future price of an
underlying asset, such as a stock, with respect to the strike price of a derivative, most
commonly a call option or a put option. Moneyness is first of all a three-fold classification:
if the derivative would make money if it were to expire today, it is said to be in the money;
if it would not make money, it is said to be out of the money; and if the current price and
strike price are equal, it is said to be at the money. Moneyness can be expressed precisely as
a percentage. For example, if a call option has a strike price of $50 and the underlying is
currently trading at $55, the contract is in the money by 10%, or the option has a moneyness of
110%.
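The worked example above translates directly into a small Python sketch. The function names are hypothetical, and moneyness is taken here as the underlying price divided by the strike, which is the convention the $55 / $50 example implies.

```python
def moneyness_pct(underlying_price, strike):
    """Moneyness as a percentage: underlying price relative to strike."""
    return 100.0 * underlying_price / strike


def classify(underlying_price, strike, option_type):
    """Three-fold classification for a call or a put."""
    if underlying_price == strike:
        return "at the money"
    call_in_money = underlying_price > strike
    in_money = call_in_money if option_type == "call" else not call_in_money
    return "in the money" if in_money else "out of the money"


example = moneyness_pct(55.0, 50.0)         # 110.0, as in the text
call_state = classify(55.0, 50.0, "call")   # in the money
put_state = classify(55.0, 50.0, "put")     # out of the money
```

The same price and strike put a call in the money and a put out of the money, which is why the option type has to be stated alongside the percentage.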
The rough classification of option moneyness can be quantified by various definitions to
express the moneyness as a number, measuring how far the asset is in the money or out of the
money with respect to the strike – or conversely how far a strike is in or out of the money
with respect to the spot (or forward) price of the asset. This quantified notion of moneyness is
most importantly used in defining the relative volatility surface: the implied volatility in
terms of moneyness, rather than absolute price.
From the perspective of volatility, a graph called the volatility smile
describes the relationship between an option's implied volatility and its strike price. The
more an option is in-the-money or out-of-the-money, the greater its implied volatility becomes.

Option moneyness can be used to predict volatility not only because the trading
volume of oil options is quite large but also because moneyness implied volatility
reflects market expectations without very large deviation. Our group therefore focuses on
finding which moneyness implied volatility gives the best prediction for the next few days or
months. We collect monthly implied volatilities at different moneyness levels (110%, 105%,
102.5%, 100%, 97.5%, 95% and 90%) and then run a regression between each of them
and the historical volatility for the corresponding period.
3.2.2 Absolute change of moneyness implied volatility and that of historical volatility
To examine the predictive power of moneyness implied volatility further, we calculate the
absolute change of monthly moneyness implied volatilities and that of historical volatilities.
Our goal is to see whether movements in moneyness implied volatility match those of
historical volatility and whether the directions of movement are the same. Since volatility
is already measured in percent, we use the absolute change instead of the relative change.
3.2.3 Conclusion
For crude oil, our results show that implied volatility contains at least some information
about future realized volatility. In the regressions between moneyness implied volatility and
historical volatility, the 102.5% option is the best indicator of future volatility
at the monthly level because its adjusted R-squared is the highest,
reaching 0.19. In the regressions between the absolute change of moneyness implied volatility
and that of historical volatility, the absolute change of the 95% implied volatility is the
best indicator of future volatility in terms of the sign of the coefficient and the R-squared.
For natural gas, the best volatility prediction comes from the 110% option, whose regression
gives the highest adjusted R-squared, 0.6451, which is a fairly good result. It can be said
that moneyness implied volatility can be an accurate indicator of the real volatility of
natural gas. In the regressions between the absolute change of moneyness implied volatility
and that of historical volatility, the absolute change of the 102.5% implied volatility is
the best indicator of future volatility in terms of the sign of the coefficient and the
R-squared.
Comparing the regression results for crude oil and natural gas, we find that
moneyness implied volatility has more predictive power for natural gas than for crude oil.
Although the trading volume of crude oil is larger than that of natural gas, natural gas
option transactions may give us more specific and accurate market information with
less noise.
There are some shortcomings in using moneyness implied volatility for prediction.
First, it is complex to implement. Second, this method is not suitable for products that
are not actively traded in the market. Even for an actively traded product such as crude oil,
the trading volume at some specific option moneyness levels may be relatively low
in some periods. Third, because it is
calculated from market transactions, market noise created by investors' irrational
expectations can heavily affect its accuracy.

3.3 GARCH (1,1) Model
3.3.1 Theory foundation of GARCH (1,1)
The generalized autoregressive conditional heteroskedasticity (GARCH) process builds on
the ARCH process developed in 1982 by Robert F. Engle, an economist and 2003 winner of
the Nobel Memorial Prize in Economic Sciences, and generalized by Bollerslev in 1986. It
describes an approach to estimating volatility in financial markets. There are several forms
of GARCH modeling. The GARCH process is often preferred by financial modeling professionals
because it provides a more real-world context than other forms when trying to predict the
prices and rates of financial instruments.
GARCH processes, being autoregressive, depend on past squared observations and past
variances to model current variance. GARCH processes are widely used in finance due to
their effectiveness in modeling asset returns and inflation. GARCH aims to minimize errors
in forecasting by accounting for errors in prior forecasting, enhancing the accuracy of
ongoing predictions.
The GARCH model has several advantages. First, it can capture the long-term mean
reversion of volatility. Second, it captures near-term persistence and fluctuations in volatility.
Third, the weights applied to past observations can be adjusted to better fit the past observations
to the subsequent observations. Fourth, the GARCH model can be modified to account for the
asymmetry of volatility. Last, it can be made multivariate to capture the cross-correlation of
volatility across asset classes.
The simplest generalized autoregressive conditional heteroskedasticity (GARCH) model
of dynamic variance, which is called the GARCH(1,1) model, can be written as:
$$\sigma_{t+1}^2 = \omega + \alpha R_t^2 + \beta \sigma_t^2, \quad \text{with } \alpha + \beta < 1$$
We can define the unconditional, or long-run average, variance to be
$$\sigma^2 = \frac{\omega}{1 - \alpha - \beta}$$
Thus, tomorrow’s variance is a weighted average of the long-run variance, today’s squared
return and today’s variance.
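The weighted-average reading can be checked numerically. The following is a minimal sketch in base R, assuming illustrative parameter values (omega, alpha, beta and the inputs below are not estimates from this paper): substituting ω = σ²(1 − α − β) into the GARCH(1,1) equation gives weights (1 − α − β), α and β on the long-run variance, today's squared return and today's variance.

```r
# Illustrative GARCH(1,1) parameters (assumptions, not estimates from the paper)
omega <- 0.000005
alpha <- 0.08
beta  <- 0.90

# Long-run (unconditional) variance: omega / (1 - alpha - beta)
long_run_var <- omega / (1 - alpha - beta)

# Tomorrow's variance written with omega ...
garch11_next_var <- function(ret_t, var_t) {
  omega + alpha * ret_t^2 + beta * var_t
}

# ... and as a weighted average of long-run variance, squared return and variance
garch11_next_var_weighted <- function(ret_t, var_t) {
  (1 - alpha - beta) * long_run_var + alpha * ret_t^2 + beta * var_t
}

ret_t <- -0.03     # today's return (hypothetical)
var_t <- 0.02^2    # today's variance (hypothetical)

v1 <- garch11_next_var(ret_t, var_t)
v2 <- garch11_next_var_weighted(ret_t, var_t)
print(c(direct = v1, weighted = v2, long_run = long_run_var))
```

The three weights sum to one, which is why the two forms agree whenever α + β < 1.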
When forecasting, we lag the GARCH(1,1) series by one period, because the volatility
predicted by GARCH(1,1) at the end of this month corresponds to the historical volatility of
next month. An inconvenience of GARCH-type models is that the multi-period distribution is
unknown even if the one-day-ahead distribution is assumed to be normal. The GARCH model
produces a one-day-ahead variance forecast, $\sigma_{t+1}^2$, which can easily be extended to a
forecast k periods ahead; this is especially useful if the goal is to price an option with k steps
to expiration using our volatility model. We use the 21-day-ahead forecast, 21 days being the
average time to expiration of the sample options. We then count the business days in each month
and multiply the square root of that count by the volatility predicted at the end of the month to
obtain the monthly volatility. As a last step, we regress monthly historical volatility on the
predicted monthly volatility to assess the predictive power of GARCH(1,1).
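The extension from one day to k days can be sketched as follows. This is a minimal base R illustration, assuming the standard GARCH(1,1) result that the expected variance k days ahead decays geometrically toward the long-run variance at rate α + β; the parameter values and the 21 business days are illustrative assumptions, and the function name is hypothetical.

```r
# Expected variance k days ahead for GARCH(1,1):
#   E_t[sigma^2_{t+k}] = sigma^2 + (alpha + beta)^(k - 1) * (sigma^2_{t+1} - sigma^2)
garch11_forecast_var <- function(var_next, k, omega, alpha, beta) {
  long_run_var <- omega / (1 - alpha - beta)
  long_run_var + (alpha + beta)^(k - 1) * (var_next - long_run_var)
}

# Illustrative values (assumptions, not estimates from the paper)
omega <- 0.000005
alpha <- 0.08
beta  <- 0.90
var_next <- 0.03^2          # one-day-ahead variance forecast

# 21-day-ahead daily variance, then scaling to a monthly volatility
var_21  <- garch11_forecast_var(var_next, k = 21, omega, alpha, beta)
busdays <- 21               # business days in the month (hypothetical)
monthly_vol <- sqrt(var_21) * sqrt(busdays)
print(c(daily_sd_21 = sqrt(var_21), monthly_vol = monthly_vol))
```

With k = 1 the function returns the one-day-ahead variance itself, and as k grows the forecast converges to the long-run variance.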
3.3.2 Conclusion
We find that the adjusted R-squared of the crude oil regression is 0.1842 and the
t-statistic of the slope coefficient is 3.988 (p-value 0.000172; see Table 3A), which means
that the result is significant at the 1% level. For natural gas, the adjusted R-squared of the
regression is 0.5202 (see Table 3B), which is much higher than that of the crude oil regression,
and the corresponding t-statistic is 8.518. From this point of view, GARCH(1,1) does
better at predicting natural gas volatility than crude oil volatility.
GARCH(1,1) is useful across a wide range of applications. Its main limitation, however,
is its inability to respond asymmetrically to positive and negative return shocks, even though
the asymmetric relationship between volatility and asset returns is important, observable and
persistent. Furthermore, GARCH models are only part of a solution. Although
GARCH models are usually applied to return series, financial decisions are rarely based
solely on expected returns and volatilities.

3.4 EGARCH Model
3.4.1 Theory foundation of EGARCH
The Exponential Generalized Autoregressive Conditional Heteroscedasticity
(EGARCH) model introduced by Nelson (1991) builds a directional effect of price moves
into the conditional variance. In practice, stock returns are negatively correlated with
changes in return volatility: volatility tends to rise in response to "bad news" (excess
returns lower than expected) and to fall in response to "good news" (excess returns higher
than expected). Large price declines can therefore have a larger impact on volatility than
large increases. GARCH models, however, assume that only the magnitude, and not the
sign, of unanticipated excess returns determines future $\sigma_t^2$. Moreover,
GARCH models are not able to explain the observed covariance between $\varepsilon_t^2$ and
$\varepsilon_{t-j}$. This is possible only if the conditional variance is expressed as an asymmetric
function of $\varepsilon_{t-j}$. In addition, GARCH models essentially specify the behavior of the
square of the data, so a few large observations can dominate the sample.
Because the general GARCH model has these limitations, asymmetric models are used to capture
the so-called leverage effect, in which an unexpected price drop increases
volatility more than an analogous unexpected price increase. In the EGARCH(p,q) model,
$\sigma_t^2$ depends on both the size and the sign of lagged residuals.
The conditional variance in the EGARCH(1,1) is given by
$$\ln(\sigma_t^2) = \omega + \alpha e_{t-1} + \gamma\left(|e_{t-1}| - E|e_{t-1}|\right) + \beta \ln(\sigma_{t-1}^2)$$

where $\alpha$, the coefficient on the signed shock $e_{t-1}$, is the asymmetric leverage parameter
that quantifies the degree of the volatility leverage effect in the model, and $\gamma$ captures the
magnitude effect. Here $e_t \sim N(0, 1)$, so that $E|e_{t-1}| = \sqrt{2/\pi}$.
The model parameters are free from nonnegativity constraints.
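The asymmetry in this specification can be illustrated with a short base R sketch. The function name and all parameter values below are hypothetical, chosen only so that the coefficient on the signed shock is negative; with such a coefficient, a negative shock raises next-period variance more than a positive shock of the same size.

```r
# One-step EGARCH(1,1) update of the log variance, following the equation above.
# e_prev is the standardized shock; E|e| = sqrt(2 / pi) for e ~ N(0, 1).
egarch11_next_logvar <- function(e_prev, logvar_prev, omega, alpha, gamma, beta) {
  expected_abs_e <- sqrt(2 / pi)
  omega + alpha * e_prev + gamma * (abs(e_prev) - expected_abs_e) + beta * logvar_prev
}

# Illustrative parameters (assumptions, not estimates from the paper)
omega <- -0.1
alpha <- -0.05   # coefficient on the signed shock
gamma <-  0.15   # coefficient on the size of the shock
beta  <-  0.98
logvar_prev <- log(0.02^2)

# Same-sized shocks of opposite sign
var_after_drop <- exp(egarch11_next_logvar(-2, logvar_prev, omega, alpha, gamma, beta))
var_after_rise <- exp(egarch11_next_logvar( 2, logvar_prev, omega, alpha, gamma, beta))
print(c(after_drop = var_after_drop, after_rise = var_after_rise))
```

Because the model is written in logs, the variance obtained by exponentiating is positive for any parameter values, which is why no nonnegativity constraints are needed.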

Following the same procedure as with GARCH(1,1), the h-step-ahead forecast formula
of the EGARCH(1,1) can be expressed as:
$$\ln \hat{\sigma}_{t+h}^2 = \sigma^2 + \beta^{h-1}\left(\ln \hat{\sigma}_{t+1}^2 - \sigma^2\right)$$
where
$$\sigma^2 = \frac{\omega - \gamma\sqrt{2/\pi}}{1 - \beta}$$
An attractive feature of the more general GARCH models, such as EGARCH and
GJR-GARCH, is that they allow for an asymmetric effect of positive and negative shocks on
the conditional variance. In fact, a well-documented feature of financial data is the
asymmetric effect that different types of shocks can have on volatility. In the case of crude oil
prices, political disruptions in the Middle East or large decreases in global demand tend to
increase volatility (see, e.g., Ferderer 1996, Wilson et al. 1996), whereas new oil
field discoveries seem to have a more muted effect. For example, the volatility of WTI
crude oil returns increased sharply around the global financial crisis but did not decline when
shale oil started to be shipped in larger quantities to Cushing. Even when put together, the
conditional normality assumption and the simultaneous estimation of the conditional variance do
not capture the thick tails entirely. Natural gas also plays a crucial role in the economy of the
United States: in 2010, about 25% of the energy used in the U.S. came from natural gas. As with
oil, the volatility of natural gas appears to be asymmetric. Over the past several years, the
volatility of natural gas prices has become a great concern among
market participants as well as researchers.
In this part, we use nonlinear GARCH models to predict volatility in order to
take the asymmetric effect into account.
In many financial time series, the standardized residuals from the estimated models
display excess kurtosis which suggests departure from conditional normality. In such cases,
the fat-tailed distribution of the innovations driving an ARCH process can be better modeled
using the Student’s-t or the Generalized Error Distribution (GED). Taking the square root of
the conditional variance and expressing it as an annualized percentage yields a time-varying
volatility estimate. A single estimated model can be used to construct forecasts of volatility
over any time horizon.
There are several things we want to test by running the EGARCH model. First, within our
sample period the oil price has been highly volatile since the financial crisis. Given the
extremely high kurtosis present in the data, is an EGARCH model with Student's t innovations
favored over one with normal innovations?
Second, some articles on the EGARCH model suggest that it predicts volatility more accurately
over the following 1 to 5 days. We intend to find out which forecast horizon is
best for EGARCH.
Third, once we have calculated the predicted volatility, we have to compare it
with historical data. We downloaded the historical oil price volatility series, named HVT, from
Bloomberg. The HVT series are classified by horizon, such as 10-day
HVT, 30-day HVT, 50-day HVT, etc. We test which HVT series best fits
our predicted numbers.
Lastly, having tried different types of GARCH models, which is the most stable for
predicting oil future volatility? We look at the MSE of each model and check which
models suit which situations.
3.4.2 Prediction
In this prediction, we use the Quandl package to download data and the rugarch package for
model fitting and forecasting. We use the rolling forecast in rugarch (the ugarchroll function)
to predict volatility and compare it with the historical value. The ugarchroll function takes
several arguments, and we have to select the best combination from a series of choices. Here we
try (100, 200, 300, 500, 1000) as the window (forecast length) selection and (5, 10, 20,
50) as the refit length selection. We therefore wrote two nested loops over these parameter
choices, used MSE to quantify the forecasting error of the model, and chose the combination
with the lowest MSE. We then regress the historical volatility on the forecast
volatility.
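The selection step is a plain grid search. The sketch below shows its structure in base R on simulated returns. Everything in it is an assumption made for illustration: the returns are simulated, absolute returns stand in for the historical volatility series, and the hypothetical helper rolling_sd_forecast (a rolling sample standard deviation refreshed every `refit` days) stands in for the rolling EGARCH forecast.

```r
set.seed(1)
ret      <- rnorm(600, sd = 0.02)   # simulated daily returns (assumption)
realized <- abs(ret)                # crude stand-in for historical volatility

# Hypothetical stand-in for a rolling model forecast: the standard deviation of
# the previous `window` returns, re-estimated every `refit` days.
rolling_sd_forecast <- function(ret, window, refit, n_forecast) {
  first_day <- length(ret) - n_forecast + 1
  pred <- numeric(n_forecast)
  current <- NA_real_
  for (k in seq_len(n_forecast)) {
    day <- first_day + k - 1
    if ((k - 1) %% refit == 0) {
      current <- sd(ret[(day - window):(day - 1)])
    }
    pred[k] <- current
  }
  pred
}

windows    <- c(50, 100, 200, 300)
refits     <- c(5, 10, 20, 50)
n_forecast <- 100
target     <- tail(realized, n_forecast)

# Evaluate every (window, refit) pair and keep the one with the lowest MSE
grid <- expand.grid(window = windows, refit = refits)
grid$mse <- mapply(
  function(w, r) mean((target - rolling_sd_forecast(ret, w, r, n_forecast))^2),
  grid$window, grid$refit
)
best <- grid[which.min(grid$mse), ]
print(best)
```

Using expand.grid keeps all candidate combinations and their errors in one table, which makes it easy to inspect how sensitive the MSE is to each setting rather than only reporting the winner.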
As in our forecast of oil future volatility, we use daily data on the natural gas
futures contract from CME with a maturity of 1 month, ranging from Nov. 11, 2007 to Nov. 8,
2017. We first take the first difference of logarithmic prices as the daily returns, which are
shown below.
We then calculate the mean and variance of the returns over this period. The
average return is about -0.00038, which is not significantly different from zero, and the standard
deviation is about 0.0305, which is much larger in magnitude.
Next, we ran several tests on the stationarity and distribution of the data.
• The Jarque-Bera statistic shows that the null hypothesis of a normal distribution
should be rejected at the 1% significance level.
Jarque Bera Test
data: ret_total
X-squared = 2448.1, df = 2, p-value < 2.2e-16
• The Ljung-Box Q statistics reject the null hypothesis of no autocorrelation up to the
10th order.
We also plotted the ACF of the returns.
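For reference, both tests can be reproduced with base R alone. The sketch below is run on simulated heavy-tailed data, not on the paper's return series; the helper jarque_bera is hypothetical and implements the usual statistic n(S²/6 + (K − 3)²/24) from sample skewness S and kurtosis K, while Box.test is the Ljung-Box implementation in the stats package.

```r
# Jarque-Bera statistic from sample skewness and kurtosis; under normality it is
# asymptotically chi-squared with 2 degrees of freedom.
jarque_bera <- function(x) {
  n  <- length(x)
  d  <- x - mean(x)
  m2 <- mean(d^2)
  m3 <- mean(d^3)
  m4 <- mean(d^4)
  skew <- m3 / m2^1.5
  kurt <- m4 / m2^2
  stat <- n * (skew^2 / 6 + (kurt - 3)^2 / 24)
  list(statistic = stat, p.value = pchisq(stat, df = 2, lower.tail = FALSE))
}

set.seed(42)
fat_tailed <- 0.03 * rt(2000, df = 3)   # simulated heavy-tailed "returns" (assumption)

jb <- jarque_bera(fat_tailed)
lb <- Box.test(fat_tailed, lag = 10, type = "Ljung-Box")   # Q statistic up to lag 10
print(jb)
print(lb)
```

A large Jarque-Bera statistic mainly reflects excess kurtosis here, which is the same feature that motivates fat-tailed innovation distributions in the models above.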
We then follow the same process as before to fit the EGARCH model: we fit models over a set
of parameter choices, calculate the MSE between each model's prediction and the
historical volatility, and choose the parameters with the lowest MSE. We also plot the
output from rugarch (see Figure 6B).
We then wrote the predictions and historical volatility to a CSV file and read the file
back into R. To avoid autocorrelation, we fed only the first observation of each month into a
linear regression, obtained the parameters a and b, and plotted the predicted and historical
volatility.
3.4.3 Conclusion
After the forecasting process, we obtain the combination with the lowest MSE (702.5309) for
crude oil, which gives the optimal prediction parameters. In this combination, the forecast
length equals 100, the refit length equals 20, and the historical data is the 50-day historical
volatility.
As the plot shows, where the blue line is the predicted sigma and the
gray line is the historical volatility from the data, our prediction more or less
catches the trend of true volatility, but is itself less volatile. So EGARCH might not be the best
model for predicting volatility.
Based on the linear regression for natural gas, the R-squared is low (adjusted R-squared
0.06132, see Table 4B) and the slope is only marginally significant.
3.5 GJR GARCH Model
3.5.1 Theory foundation of GJR-GARCH
The GJR GARCH model of Glosten et al. (1993) models positive and negative shocks
on the conditional variance asymmetrically via the use of the indicator function I,
$$\sigma_t^2 = \left(\omega + \sum_{j=1}^{m} \zeta_j v_{jt}\right) + \sum_{j=1}^{q} \left(\alpha_j \varepsilon_{t-j}^2 + \gamma_j I_{t-j} \varepsilon_{t-j}^2\right) + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$$
where $\gamma_j$ now represents the 'leverage' term. The indicator function I takes on the value 1 for
$\varepsilon \le 0$ and 0 otherwise. Because of the presence of the indicator function, the persistence of the
model now crucially depends on the asymmetry of the conditional distribution used. The
persistence of the model $\hat{P}$ is
$$\hat{P} = \sum_{j=1}^{q} \alpha_j + \sum_{j=1}^{p} \beta_j + \sum_{j=1}^{q} \gamma_j \kappa$$
where $\kappa$ is the expected value of the standardized residuals $z_t$ below zero (effectively the
probability of being below zero),
$$\kappa = E\left[I_{t-j} z_{t-j}^2\right] = \int_{-\infty}^{0} f(z, 0, 1, \ldots)\, dz$$
where f is the standardized conditional density with any additional skew and shape
parameters (. . .). In the case of symmetric distributions, the value of $\kappa$ is simply equal to 0.5.
The variance targeting, half-life and unconditional variance follow from the persistence
parameter, and $\omega$ is replaced by:
$$\bar{\sigma}^2 \left(1 - \hat{P}\right) - \sum_{j=1}^{m} \zeta_j \bar{v}_j$$
where $\bar{\sigma}^2$ is the unconditional variance of $\varepsilon^2$, which is consistently estimated by its sample
counterpart at every iteration of the solver following the mean equation filtration, and $\bar{v}_j$
represents the sample mean of the jth external regressor in the variance equation (assuming
stationarity).
The naming conventions for passing fixed or starting parameters for this model are:
● ARCH(q) parameters are 'alpha1', 'alpha2', ...,
● Leverage(q) parameters are 'gamma1', 'gamma2', ...,
● GARCH(p) parameters are 'beta1', 'beta2', ...,
● variance intercept parameter is 'omega'
● the external regressor parameters are 'vxreg1', 'vxreg2', ...,
The Leverage parameter follows the order of the ARCH parameter.
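To make the roles of these parameters concrete, the following base R sketch runs the GJR-GARCH(1,1) variance recursion by hand, with arguments named after the conventions listed above. It assumes no external regressors and a symmetric conditional distribution (so κ = 0.5); the function name and all numeric values are illustrative assumptions, not fitted values.

```r
# GJR-GARCH(1,1) variance recursion:
#   sigma2[i + 1] = omega + (alpha1 + gamma1 * I[eps[i] <= 0]) * eps[i]^2 + beta1 * sigma2[i]
gjr_garch11_variance <- function(eps, omega, alpha1, gamma1, beta1, var0) {
  sigma2 <- numeric(length(eps) + 1)
  sigma2[1] <- var0
  for (i in seq_along(eps)) {
    is_negative <- as.numeric(eps[i] <= 0)   # indicator function I
    sigma2[i + 1] <- omega + (alpha1 + gamma1 * is_negative) * eps[i]^2 +
      beta1 * sigma2[i]
  }
  sigma2
}

# Illustrative parameter values (assumptions)
omega  <- 0.000004
alpha1 <- 0.03
gamma1 <- 0.10
beta1  <- 0.90
kappa  <- 0.5                                  # symmetric conditional distribution
persistence <- alpha1 + beta1 + gamma1 * kappa

var0 <- 0.02^2
after_drop <- gjr_garch11_variance(-0.03, omega, alpha1, gamma1, beta1, var0)[2]
after_rise <- gjr_garch11_variance( 0.03, omega, alpha1, gamma1, beta1, var0)[2]
print(c(persistence = persistence, after_drop = after_drop, after_rise = after_rise))
```

The gap between the two next-day variances equals gamma1 times the squared shock, which is exactly the extra weight the leverage term places on negative shocks.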
3.5.2 Prediction
We fit the GJR-GARCH model with the rugarch R package.
The data range from Jan-02-2008 to July-20-2017, 2407 days in total. We use 1000 days as the fit
window for the GARCH model and predict the 21-day-ahead volatility for about 1407 days. For each
prediction we use 100 simulated paths for the random part of the model. We take the prediction on the
last day of every month and multiply it by the square root of the number of days in that month to
obtain the month's predicted volatility:
$$\sigma_{Pred} = \sigma_{lastday} \cdot \sqrt{d_i}$$
We obtained 68 predicted monthly volatilities and regressed historical monthly
volatility on them. The result of the crude oil regression is:
$$\sigma_{Hist,t+1} = 0.945 \cdot \sigma_{Pred,t} + 0.205$$
The result of the natural gas regression is:
$$\sigma_{Hist,t+1} = 3.51337 \cdot \sigma_{Pred,t} - 0.01778$$
In these equations, $\sigma_{Hist,t+1}$ is the historical monthly volatility in month t+1, and $\sigma_{Pred,t}$ is
the predicted volatility for month t+1 conditional on data up to month t.
The R-squared of the crude oil regression is 0.22688 and the adjusted R-squared is 0.2149.
The R-squared of the natural gas regression is 0.396 and the adjusted R-squared is 0.3868. In
the graph, the blue line is the volatility predicted by GJR-GARCH and the black line is the realized
monthly volatility.
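The month-end scaling and the lagged regression can be sketched in base R as follows. All inputs are simulated for illustration: each row stands in for one trading day with a hypothetical daily sigma forecast, and the historical monthly volatility is generated artificially, so the coefficients printed here say nothing about the paper's results.

```r
set.seed(7)
# Simulated inputs (assumptions): one row per day with a daily sigma forecast
daily <- data.frame(date = seq(as.Date("2016-01-01"), as.Date("2016-06-30"), by = "day"))
daily$sigma_pred <- 0.02 + 0.005 * runif(nrow(daily))
daily$month <- format(daily$date, "%Y-%m")

# Forecast on the last day of each month, and the number of days d_i in that month
last_sigma <- tapply(daily$sigma_pred, daily$month, function(x) x[length(x)])
n_days     <- tapply(daily$sigma_pred, daily$month, length)

# sigma_Pred = sigma_lastday * sqrt(d_i)
monthly_pred <- as.numeric(last_sigma) * sqrt(as.numeric(n_days))

# Simulated historical monthly volatility (assumption), then the regression of
# next month's historical volatility on this month's prediction
hist_vol  <- 0.05 + 1.5 * monthly_pred + rnorm(length(monthly_pred), sd = 0.005)
hist_next <- hist_vol[-1]
pred_now  <- monthly_pred[-length(monthly_pred)]
fit <- lm(hist_next ~ pred_now)
print(coef(fit))
```

Dropping the first historical value and the last prediction is what aligns month t's forecast with month t+1's realized volatility.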
3.5.3 Conclusion
The GARCH family does not perform well here, which suggests that it is not well suited
to long-term volatility prediction. With daily predictions over a one-day horizon we could
obtain a very high R-squared, which means the GARCH family is a good method for predicting
short-horizon volatility, for example when calculating Value at Risk (VaR) or in risk management.
It is not a good way to predict long-term volatility, because GARCH uses only
historical data.
4. Model Comparison and Conclusion
4.1 Comparison
We use adjusted R-squared as the evaluation criterion for model forecasting power. The table
below summarizes the estimates for the five models we use to forecast volatility based
on 10 years of historical data.
Forecasting Model Results Summary
Model              Oil Adjusted   Oil R-Squared       Gas Adjusted   Gas R-Squared
                   R-Squared      (absolute change)   R-Squared      (absolute change)
VIX                0.1792         -0.008808           0.6083         0.231
Option Moneyness   0.1904         -0.006138           0.6451         0.01628
GARCH(1,1)         0.1842         n/a                 0.5202         n/a
EGARCH             0.03551        n/a                 0.06132        n/a
GJR-GARCH          0.2149         n/a                 0.3868         n/a
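The adjusted R-squared values in this summary follow from the multiple R-squared and the sample size reported in the Tables section. A minimal base R check, assuming a single regressor (k = 1) and taking n as the residual degrees of freedom plus two; the function name is hypothetical:

```r
# Adjusted R-squared for a regression with k regressors and n observations
adjusted_r2 <- function(r2, n, k = 1) {
  1 - (1 - r2) * (n - 1) / (n - k - 1)
}

# Table 1A (crude oil, VIX): multiple R-squared 0.1863, 114 residual DF, so n = 116
adj_vix_oil <- adjusted_r2(0.1863, n = 116)

# Table 1B (natural gas, VIX): multiple R-squared 0.6117, 115 residual DF, so n = 117
adj_vix_gas <- adjusted_r2(0.6117, n = 117)

# Table 3B (natural gas, GARCH(1,1)): multiple R-squared 0.5275; the appendix code
# regresses 67 monthly observations (elements 2:68)
adj_garch_gas <- adjusted_r2(0.5275, n = 67)

print(round(c(adj_vix_oil, adj_vix_gas, adj_garch_gas), 4))
```

Because every regression here has one regressor and roughly 50 to 120 observations, the adjustment is small, so rankings by adjusted and unadjusted R-squared coincide.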
Except for the EGARCH model, VIX has the worst performance in forecasting crude oil
volatility. In recent years the index has tended to run higher than realized volatility. This is not
to suggest that the VIX measure is of low value. Rather, it should be interpreted strictly for
what it aims to represent: the market price of volatility exposure consistent with observed
option prices. As a direct indicator of future volatility it is more limited, because it
combines volatility forecasting with the pricing of the risk associated with
volatility. There is, however, no existing VIX-style index that tracks natural gas volatility, so we
use the implied volatility of the 90% put option as a substitute.

For option moneyness, which incorporates option implied volatility, the 102.5% option
has the highest adjusted R-squared for crude oil (0.1904), while the regression between the
absolute change of moneyness implied volatility and that of historical volatility shows the 95%
option to be the best indicator of crude oil future volatility. As implied volatility did not pass
the test of forecast rationality, this indirectly contradicts the conclusion that implied volatility
is the best available predictor of future volatility. For natural gas, the 110% option gives the
highest adjusted R-squared (0.6451), while the 102.5% option is the best indicator under the
absolute-change regression in terms of the sign of the coefficient and R-squared.
Moneyness implied volatility can be an accurate indicator of realized
volatility for natural gas because seasonal change plays an important role in shifting natural gas
demand, which makes natural gas less uncertain than crude oil.
As we can see, neither volatility forecast obtained with the EGARCH model is very
significant, since for that model we focused more on MSE in this paper. GJR-GARCH has the highest
adjusted R-squared in the GARCH family for the crude oil forecast. Asymmetric effects are present
in the data, and asymmetric models that allow different responses to different past
shocks explain volatility better. Overall, however, the GARCH family does not
perform well in forecasting oil future volatility. For natural gas forecasting, the simple
GARCH(1,1) did a better job.
Low R-squared values are problematic when we need precise predictions. The
regression plots help interpret the results (see Figures 1-5): the black line
represents the historical volatility data and the colored line represents the predicted
volatility. They indicate that none of the five models we studied is a very good indicator of
crude oil future volatility.
Overall, each model we used performs better in predicting natural gas volatility.
To improve our results for the crude oil future volatility forecast, it might be helpful to
use combined models. In addition, crude oil price volatility forecast performance can be
evaluated based on both statistical values and traders' behavior.

4.2 Conclusion
In this paper, we estimate and forecast WTI crude oil and natural gas volatility using
five different models: VIX, option moneyness (moneyness implied volatility),
GARCH(1,1), EGARCH, and GJR-GARCH. We take the 2008-2017 period as our sample and
use daily observations from U.S. markets. Taking into account what has been
discussed above, we come to the following conclusions for future volatility
forecasting:
(1) For implied volatility to be an efficient volatility forecast, we have to
eliminate the biases caused by the Black-Scholes model assumptions. It is a better indicator of
future volatility for natural gas than for oil.
(2) Behavioral finance matters. Investors have to behave rationally when using the
available market information in their decision-making process, since market noise can
substantially affect forecasting accuracy.
(3) The GARCH family used in this paper provides relatively higher predictive accuracy
over shorter time horizons (such as one day ahead). In general, the simple
GARCH model performs better than the asymmetric GARCH in normal periods, but in turbulent
periods the asymmetric GJR-GARCH model is preferred.
(4) The asymmetric GJR-GARCH model provides the preferred forecast for oil future
volatility, while VIX contributes a small but significant additional degree of forecast power
and information not contained in the GARCH forecasts.

References
Frangoul, Anmar. "Natural Gas: Why It's Important and What You Need to Know." CNBC.
CNBC, 25 Apr. 2017. Web. 28 Nov. 2017.
Irakli. “Oil's Role in the World Economy and in the Global Crises.” CNN. Cable News
Network, 21 Mar. 2015. Web. 28 Nov. 2017.
Nielsen, Barry. "Financial Matters: The Importance of the Price of Oil." Helena Independent
Record. N.p., 31 Dec. 2015. Web. 28 Nov. 2017.
Lux, Thomas, Mawuli Segnon, and Rangan Gupta. "Forecasting Crude Oil Price Volatility
and Value-at-risk: Evidence from Historical and Recent Data." Energy Economics 56
(2016): 117-33. Web. 1 Sep. 2017.
“Augmented Dickey–Fuller Test.” Wikipedia. Wikimedia Foundation, 23 Nov. 2017. Web.
01 Dec. 2017.
https://en.wikipedia.org/wiki/Augmented_Dickey%E2%80%93Fuller_test.
Granero, M.a. Sánchez, J.e. Trinidad Segovia, and J. García Pérez. "Some Comments on
Hurst Exponent and the Long Memory Processes on Capital Markets." Physica A:
Statistical Mechanics and Its Applications 387.22 (2008): 5543-551. Web. 1 Oct. 2017.
“Cboe Crude Oil ETF Volatility Index (OVX).” Cboe. N.p., n.d. Web. 30 Oct. 2017.
http://www.cboe.com/products/vix-index-volatility/volatility-on-etfs/cboe-crude-oil-etf-volatility-index-ovx.
“Cboe, CBSX, & CFE Press Releases.” Cboe. N.p., 14 July 2008. Web. 30 Oct. 2017.
http://www.cboe.com/aboutcboe/cboe-cbsx-amp-cfe-press-releases?DIR=ACNews&FILE=cboe_20080714.doc.
“Cboe Volatility Index® (VIX®).” VIX Index. N.p., n.d. Web. 30 Oct. 2017.
http://www.cboe.com/products/vix-index-volatility/vix-options-and-futures/vix-index.
Staff, Investopedia. "VIX - CBOE Volatility Index." Investopedia. N.p., 07 Aug. 2015. Web.
30 Oct. 2017.
https://www.investopedia.com/terms/v/vix.asp.
“The CBOE Volatility Index - VIX®.” Cboe (n.d.): 1-23. Aug. 2014. Web. 3 Oct. 2017.
https://www.cboe.com/micro/vix/vixwhite.pdf.
Rossi, Eduardo. Lecture Notes on GARCH Models. Diss. University of Pavia, 2004. Web. 12
Oct. 2017.
Hansson, Mathias, and Rune Sand. FORECASTING CRUDE OIL FUTURES VOLATILITY.
Thesis. BI Norwegian Business School, 2012. N.p.: n.p., n.d. Print.
Musaddiq, Tareena. "Modeling and Forecasting the Volatility of Oil Futures Using the
ARCH Family Models." The Lahore Journal of Business (2012): 79-108. Summer
2012. Web. 2 Oct. 2017.
Bentes, Sónia R. "A Comparative Analysis of the Predictive Power of Implied Volatility
Indices and GARCH Forecasted Volatility." Physica A: Statistical Mechanics and Its
Applications 424 (2015): 105-12. Web. 1 Oct. 2017.
Ghalanos, Alexios. "Introduction to the Rugarch Package. (Version 1.3-1)." (2017): 1-48.
Web. 15 Oct. 2017.
Zhang, Yuejun, Ting Yao, and Lingyun He. "Forecasting Crude Oil Market Volatility: Can
the Regime Switching GARCH Model Beat the Single-regime GARCH Models?"
(2015): 1-30. 5 Dec. 2015. Web. 1 Sept. 2017.

Figures
Figures – Crude Oil
Figure 1A: VIX Prediction
Figure 1a: VIX Prediction (absolute change)

Figure 2A: Option Moneyness
Figure 2a: Option Moneyness (absolute change)

Figure 3A: GARCH (1,1) Prediction
Figure 4A: EGARCH Prediction

Figure 5A: GJR-GARCH Prediction
Figure 6A: EGARCH Plot

Figures – Natural Gas
Figure 1B: VIX Prediction
Figure 1b: VIX Prediction (absolute change)

Figure 2B: Option Moneyness
Figure 2b: Option Moneyness (absolute change)

Figure 3B: GARCH (1,1) Prediction
Figure 4B: EGARCH Prediction

Figure 5B: GJR-GARCH Prediction
Figure 6B: EGARCH Plots

Tables
Tables - Crude Oil
Table 1A: VIX Prediction

Residuals:
Min 1Q Median 3Q Max
-12.250 -5.052 -2.578 4.829 14.101
Coefficients:
Estimate Std. Error t value Pr(>|t|)
Intercept 23.37130 1.91583 12.199 < 2e-16 ***
vix1 0.24403 0.04776 5.109 1.31e-06 ***
Residual standard error 7.273 on 114 degrees of freedom

Multiple R-squared Adjusted R-squared F-statistic p-value
0.1863 0.1792 26.11 on 1 and 114 DF 1.311e-06
Table 1a: VIX Prediction (absolute change)

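Residuals:
Min 1Q Median 3Q Max
-8.7193 -0.5075 -0.0791 0.2456 10.2201
Coefficients:
Estimate Std. Error t value Pr(>|t|)
Intercept 0.063119 0.163173 0.387 0.700
VIX_30 -0.001709 0.025081 -0.068 0.946

Multiple R-squared Adjusted R-squared F-statistic p-value
4.108e-05 -0.008808 0.004642 on 1 and 113 DF 0.9458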
Table 2A: Option Moneyness Prediction

Regression result
Option moneyness     110%        105%        102.5%      100%        97.5%       95%         90%
Slope                0.27240     0.26983     0.27213     0.26941     0.26208     0.2641      0.26284
Adjusted R-squared   0.1834      0.1848      0.1904      0.1873      0.1796      0.1778      0.1868
P-value              8.694e-07   7.876e-07   5.245e-07   6.559e-07   1.149e-06   1.306e-06   6.787e-07
Table 2a: Option Moneyness Prediction (absolute change)
Regression result
Option moneyness     110%        105%        102.5%      100%        97.5%       95%         90%
Slope                -0.002481   -0.001121   -0.001241   -0.001344   0.00864     0.01006     0.009465
Adjusted R-squared   -0.008542   -0.008663   -0.008656   -0.00865    -0.006822   -0.006138   -0.006345
P-value              0.895       0.9518      0.9464      0.9425      0.6445      0.5898      0.6052
Table 3A: GARCH (1,1) Prediction

Regression result
Coefficients t-statistics P-value
Intercept 0.23433 14.948 < 2e-16
Predddd 0.61246 3.988 0.000172
Multiple R-squared 0.1966 Adjusted R-squared 0.1842
Table 4A: EGARCH Prediction

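Residuals:
Min 1Q Median 3Q Max
-14.340 -9.069 -4.285 4.336 33.571
Coefficients:
Estimate Std. Error t value Pr(>|t|)
Intercept 35.796 5.253 6.814 1.92e-08 ***
pred -351.153 213.96 -1.641 0.108

Multiple R-squared Adjusted R-squared F-statistic p-value
0.05648 0.03551 2.694 on 1 and 45 DF 0.1077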
Tables - Natural Gas
Table 1B: VIX Prediction

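Residuals:
Min 1Q Median 3Q Max
-29.642 -6.682 -1.508 3.425 38.583
Coefficients:
Estimate Std. Error t value Pr(>|t|)
Intercept -2.67899 3.83329 -0.699 0.486
VIX_30 1.10971 0.08246 13.458 <2e-16 ***

Multiple R-squared Adjusted R-squared F-statistic p-value
0.6117 0.6083 181.1 on 1 and 115 DF < 2.2e-16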
Table 1b: VIX Prediction (absolute change)

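Residuals:
Min 1Q Median 3Q Max
-32.898 -6.614 -1.089 5.538 44.007
Coefficients:
Estimate Std. Error t value Pr(>|t|)
Intercept 0.1025 0.1025 0.094 0.926
VIX_30 0.8824 0.1480 5.962 2.85e-08 ***

Multiple R-squared Adjusted R-squared F-statistic p-value
0.2377 0.231 35.54 on 1 and 114 DF 2.855e-08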
Table 2B: Option Moneyness Prediction

Regression result for Natural Gas
Option moneyness     110%       105%       102.5%     100%       97.5%      95%        90%
Slope                1.13418    1.12053    1.11321    1.14609    1.12315    1.12021    1.10971
Adjusted R-squared   0.6451     0.6335     0.6268     0.639      0.6195     0.6187     0.6083
P-value              <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16
Table 2b: Option Moneyness Prediction (absolute change)
Regression result for Natural Gas
Option moneyness     110%       105%       102.5%     100%       97.5%      95%        90%
Slope                0.2257     0.22968    0.23045    0.23753    0.22869    0.25076    0.28068
Adjusted R-squared   0.01586    0.01593    0.01628    0.0136     0.009768   0.01193    0.0156
P-value              0.09297    0.09251    0.09018    0.1097     0.1458     0.1241     0.09473
Table 3B: GARCH (1,1) Prediction

Regression result
Coefficients t-statistics P-value
Intercept 0.03162 0.635 0.528
Predddd 3.11737 8.518 3.51e-12
Multiple R-squared 0.5275 Adjusted R-squared 0.5202
Table 4B: EGARCH Prediction

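Residuals:
Min 1Q Median 3Q Max
-23.102 -9.040 -5.174 5.005 45.610
Coefficients:
Estimate Std. Error t value Pr(>|t|)
Intercept 29.028 7.737 3.752 0.0005 ***
pred 517.458 258.562 2.001 0.0514

Multiple R-squared Adjusted R-squared F-statistic p-value
0.08173 0.06132 4.005 on 1 and 45 DF 0.05141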
APPENDIX
R Code - Crude oil
VIX
Part 1
(Regression on HVT & OVX price)
rm(list=ls())
library(readr)
VIX_Monthly <- read_csv("VIX-MONTHLY.csv")
HVT_Monthly <- read_csv("HVT-MONTHLY.csv")
###OVX monthly price, from 2007-05 to 2017-09
VIX1<-VIX_Monthly$`Price`
###keep 2007-11 to 2017-06
vix1<-VIX1[7:122]
###30-day historical volatility, from 2007-12 to 2017-07 (one month ahead of vix1)
HVT1<-HVT_Monthly$`CL1 COMB Comdty Hist Vol (30)`[1:116]
###run the regression correspondingly
fit1<-lm(HVT1~vix1+1)
R_squared1<-summary(fit1)$adj.r.squared
print(R_squared1)
###plot HVT against the fitted line, using the estimated coefficients
plot(HVT1,type='l')
points(vix1*coef(fit1)[2]+coef(fit1)[1],type='l',col='blue')
summary(fit1)
Part 2
###(Regression on HVT & OVX price absolute change)
rm(list=ls())
library(readr)
VIX_Monthly <- read_csv("VIX-MONTHLY.csv")
HVT_Monthly <- read_csv("HVT-MONTHLY.csv")
###import the OVX data from 2007-05 to 2017-09
VIX1<-VIX_Monthly$`Price`
###keep 2007-11 to 2017-06
vix1<-VIX1[7:122]
###calculate the monthly change of OVX (first change: 2007-12 minus 2007-11)
vix1_change<-diff(vix1)
###import the HVT data from 2007-12 to 2017-07
HVT1<-HVT_Monthly$`CL1 COMB Comdty Hist Vol (30)`[1:116]
###calculate the monthly change of HVT (first change: 2008-01 minus 2007-12)
HVT1_change<-diff(HVT1)
###regress the change of HVT on the change of OVX
fit2<-lm(HVT1_change~vix1_change+1)
R_squared1<-summary(fit2)$adj.r.squared
print(R_squared1)
###plot and summary (Regression on HVT & OVX Absolute Change)
plot(HVT1_change,type='l')
points(vix1_change*coef(fit2)[2]+coef(fit2)[1],type='l',col='blue')
summary(fit2)
Option moneyness
#Clear the workspace
rm(list=ls())
#Import the monthly historical volatility with period from 01/2008 to 09/2017
HVT<-read.csv('HVT MONTHLY.csv')
HVT_30<-HVT$CL1.COMB.Comdty.Hist.Vol..30.[2:118]
T<-length(HVT_30)
#Import the option moneyness data with period from 12/2007 to 08/2017, which is one period
#lag of the historical volatility
moneyness<-read.csv('Moneyness_30_monthly.csv')[13:(13+T-1),]
#Regress the historical volatility on the implied volatility of each moneyness level.
#The columns are selected by the naming pattern of the 110% and 102.5% columns.
iv_cols<-grep('^X30DAY_IMPVOL_.*MNY_DF$',names(moneyness),value=TRUE)
fits<-lapply(iv_cols,function(col) lm(HVT_30 ~ moneyness[[col]]))
names(fits)<-iv_cols
lapply(fits,summary)
#Plot the historical volatility with the 102.5% moneyness implied volatility
plot(HVT_30,type='l',main="Moneyness and HVT")
points(moneyness$X30DAY_IMPVOL_102.5.MNY_DF,type='l',col='blue',lwd=2)
Option Moneyness (absolute change)
rm(list=ls())
HVT<-read.csv('HVT MONTHLY.csv')
HVT_30<-HVT$CL1.COMB.Comdty.Hist.Vol..30.[1:118]
T<-length(HVT_30)
#Calculate the absolute change of monthly historical volatility; the period is from 01/2008
#to 09/2017
HVT_30_change<-diff(HVT_30)
#Import the monthly moneyness implied volatility with period from 12/2007 to 09/2017
moneyness<-read.csv('Moneyness_30_monthly.csv')[12:(12+T-1),]
#Calculate the absolute change of the monthly implied volatility for each moneyness level;
#the period is from 01/2008 to 09/2017. Columns are selected by the naming pattern of the
#110% and 102.5% columns.
iv_cols<-grep('^X30DAY_IMPVOL_.*MNY_DF$',names(moneyness),value=TRUE)
iv_change<-lapply(moneyness[iv_cols],diff)
#Regress the change of historical volatility on the change of each implied volatility
fits<-lapply(iv_change,function(x) lm(HVT_30_change ~ x))
lapply(fits,summary)
#Plot the change of historical volatility with the change of the 95% implied volatility
col_95<-grep('IMPVOL_95',iv_cols,value=TRUE)[1]
plot(HVT_30_change,type='l',main="Absolute change of Moneyness and HVT")
points(iv_change[[col_95]],type='l',col='blue',lwd=2)
GARCH (1,1)
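#Business Day
#Load the packages used below before calling them (Quandl, the dplyr pipe, as.yearmon)
library(Quandl)
library(dplyr)
library(zoo)
rm(list=ls())
#Read the Quandl API key from an environment variable instead of hard-coding it
oil_daily1 <- Quandl("FRED/DCOILWTICO",api_key=Sys.getenv("QUANDL_API_KEY"),
type="raw",collapse="daily",start_date="2007-12-31",
end_date="2017-08-20")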
oil_daily <- oil_daily1 %>%
arrange(Date) %>%
mutate(Ret_total = c(9999,diff(log(Value)))) %>%
slice(-1)
oil_daily <- oil_daily %>%
mutate(year_mon = as.yearmon(Date)) %>%
group_by(year_mon) %>%
mutate(busdays = n()) %>%
ungroup()
busdays <- oil_daily %>%
group_by(year_mon) %>%
slice(1) %>%
select(Date,year_mon,busdays) %>%
ungroup()
busdays <- subset(busdays, select = c(1, 3))
write.csv(busdays,file="busdays.csv")
#Import Packages
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
#Clean the R workspace

rm(list=ls())
#Import data
data=read.csv('HVT-Daily2.csv')
oil_daily=data$Price
#Basic data process

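Ret_total<-diff(log(oil_daily))
T<-length(Ret_total)
#Pre-allocate the prediction column so single elements can be assigned in the loop
data$Pred<-NA_real_
for (i in 1001:T){
#Fit GARCH(1,1) on the previous 1000 daily returns
Retwindow <- Ret_total[(i-1000):(i-1)]
fit1 <- garchFit( formula = ~garch(1, 1), data = Retwindow, trace = FALSE)
#Keep the 21-day-ahead standard deviation (predict returns one value per horizon)
data$Pred[i] <- predict(fit1, n.ahead=21)$standardDeviation[21]
print(i)
#predicted period: from 01/2008 to 09/2017
}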
#Write the predicted data with corresponding date into a csv file.
write.csv(data,file='Garch11.csv')
#Regression Preparation
result<- read.csv('Garch11.csv')
#Pick the last day of one month as the monthly volatility

Month<-result$Month[1001:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1001:T]
Predd<-Pred[aa]
#add last day of data as the last month volatility.

Predd<-c(Predd,Pred[length(Pred)])
#Compare with HVT30

Real<-read.csv('HVT.csv')
HVTd<-Real$HVT[2:68]/100
Preddd<-Predd[1:67]
#Multiply the number of business days in each month
Busday<-read.csv('busdays.csv')
Predddd<-Preddd*sqrt(Busday$busdays[2:68])
#Backtest Regression Part with HVT month

fit2<-lm(HVTd~Predddd)
summary(fit2)
plot(HVTd,type='l',main="GARCH(1,1) and HVT")
points(Predddd*fit2$coefficients[2]+fit2$coefficients[1],type='l',col='blue',lwd=2)
EGARCH
##Import Packages
rm(list=ls())
install.packages('Quandl')
install.packages('rugarch')
library(Quandl)
library(rugarch)
library(tseries)
library(forecast)
##Download Data
oil_daily <- Quandl("FRED/DCOILWTICO", api_key='TPywx-DUcfEE4VMynwHR',
type='raw', collapse='daily', order = 'asc', start_date="2008-01-01",end_date="2017-07-20")
rownames(oil_daily) <- oil_daily$Date
oil_daily$Date <- NULL
ret_total <- diff(log(oil_daily$Value))
hvt <- read.csv('HVT-Daily.csv', header = TRUE, col.names = c('Date',
'HVT','Vol_10','Vol_30','Vol_50','Vol_100'))
hvt <- hvt[-1,]$Vol_100
T <- length(ret_total)
## Data Description & Testing

mean_return=mean(ret_total)
st_dev = sd(ret_total)
qqnorm(ret_total)
qqline(ret_total)
adf.test(ret_total, alternative='stationary')
acf(ret_total)
pacf(ret_total)
## Estimating the Model

spec <- ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit = ugarchfit(data = ret_total, spec=spec)
# Iterating over the windows, and trying to find one with least Mean Squared Error(MSE)
forecast_length = c(100, 200, 300, 500, 1000)
refit_length = c(5, 10, 20, 50)
optimal_window = NULL
optimal_refit = NULL
mse = Inf
opt_forc = NULL
#Same specification as spec above
model = ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit2 = ugarchfit(data=ret_total, spec=model)
for(i in c(1:5)){
  for (j in c(1:4)){
    rollforc = ugarchroll(spec=spec, data=ret_total, n.ahead = 1,
      forecast.length = forecast_length[i], refit.every = refit_length[j],
      refit.window = c('recursive'), solver = 'hybrid', keep.coef = TRUE)
    #Extract the one-step-ahead sigma forecasts as a numeric vector
    sigmapred = as.numeric(rollforc@forecast$density$Sigma)
    error = mean((hvt[(T-forecast_length[i]):(T-1)]- sigmapred)^2)
    if (error <= mse){
      mse = error
      optimal_window = forecast_length[i]
      optimal_refit = refit_length[j]
      opt_forc = rollforc}
  }}
print(optimal_window)
print(optimal_refit)
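The search above keeps the combination of forecast length and refit frequency with the lowest MSE. A minimal base-R sketch of the same pattern follows. It is not the EGARCH procedure itself: roll_sd is a hypothetical stand-in for ugarchroll that forecasts with an expanding-window sample standard deviation, and ret and real are simulated series.

```r
set.seed(42)
# Simulated daily returns and a realized-volatility proxy (both assumed for illustration)
ret  <- rnorm(600, sd = 0.02)
real <- abs(ret)

# Hypothetical stand-in forecaster: the sample sd of all returns up to the most
# recent refit point, refreshed every `refit` observations (expanding window)
roll_sd <- function(ret, n_out, refit) {
  n <- length(ret)
  out <- numeric(n_out)
  for (k in seq_len(n_out)) {
    last_fit <- (n - n_out) + ((k - 1) %/% refit) * refit
    out[k] <- sd(ret[1:last_fit])
  }
  out
}

# Score every (forecast length, refit frequency) pair by out-of-sample MSE
grid <- expand.grid(n_out = c(100, 200, 300), refit = c(5, 10, 20, 50))
grid$mse <- mapply(function(n_out, refit) {
  idx <- (length(ret) - n_out + 1):length(ret)
  mean((real[idx] - roll_sd(ret, n_out, refit))^2)
}, grid$n_out, grid$refit)

# Keep the combination with the smallest MSE
best <- grid[which.min(grid$mse), ]
print(best)
```

Because each forecast length scores a different evaluation period, the MSE values are only loosely comparable across lengths; comparing refit frequencies at a fixed length is the cleaner comparison.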
GJR-GARCH
rm(list=ls())
#Import Packages
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
library(rugarch)
library(parallel)
#Preparation: put HVT-Daily.csv, HVT.csv and busdays.csv in the working directory.

#HVT.csv is HVT-monthly data
#In HVT-Daily.csv, the date should be split into columns manually with Excel's Text to Columns function.
data=read.csv('HVT-Daily.csv')
oil_monthly=data$Price
#Basic data process

Ret_total<-diff(log(oil_monthly))
T<-length(Ret_total)
#Pre-allocate the prediction column if the CSV does not already contain one
if (is.null(data$Pred)) data$Pred<-NA_real_
#Predict 21 days ahead; a different GARCH model can be substituted inside the loop.
#Check the output csv file for 999 values in Pred: each one marks a loop iteration that raised an error.
#Simply substitute i-1's data if Pred[i] is 999
for (i in 1001:T){
  tryCatch({
    Retwindow<-Ret_total[(i-1000):(i-1)]
    spec = ugarchspec(variance.model=list(model="gjrGARCH"), distribution.model="std")
    fit = ugarchfit(spec, Retwindow)
    bootp = ugarchboot(fit, method = c("Partial", "Full")[1],n.ahead = 21, n.bootpred = 100)
    predsigma<-bootp@forc@forecast$sigmaFor[21]
    data$Pred[i]=predsigma
    print(i)
  }, error=function(e){
    #<<- is needed so that the assignment reaches data outside the handler function
    data$Pred[i]<<-999})
}
write.csv(data,file='GJRGarch.csv')
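#Regression preparation
result<- read.csv('GJRGarch.csv')
#Pick the last day of one month as the month volatility
#(the month-end selection and regression steps are filled in to mirror the GARCH(1,1) listing)
Month=result$Month[1001:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1001:T]
Predd<-Pred[aa]
#add last day of data as the last month volatility.
Predd<-c(Predd,Pred[length(Pred)])
#Compare with HVT30

Real<-read.csv('HVT.csv')
HVTd<-Real$HVT[2:68]/100
Preddd<-Predd[1:67]
#Multiply by the square root of the number of business days in each month
Busday<-read.csv('busdays.csv')
Predddd<-Preddd*sqrt(Busday$busdays[2:68])

fit2<-lm(HVTd~Predddd)
summary(fit2)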
plot(HVTd,type='l')
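The comments in this listing suggest replacing a 999 entry in Pred with the previous day's value, but the listing does not include code for that step. One possible way to do it is sketched below; fill_failed is a hypothetical helper name, and the input vector is made up.

```r
# Replace every sentinel value with the most recent valid forecast before it.
# A sentinel in the first position has no predecessor and is left unchanged.
fill_failed <- function(pred, sentinel = 999) {
  for (i in seq_along(pred)) {
    if (i > 1 && !is.na(pred[i]) && pred[i] == sentinel) {
      pred[i] <- pred[i - 1]
    }
  }
  pred
}

# Made-up forecasts with two consecutive failed iterations
pred <- c(0.020, 0.021, 999, 999, 0.019)
filled <- fill_failed(pred)
print(filled)
```

Running the loop forward means a run of consecutive failures is filled with the last forecast that succeeded before the run.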
R Code - Natural Gas
VIX
Part 1
###Regression on HVT & the volatility of 90% put option price(VIX)
rm(list=ls())
library(readr)
VIX_Monthly <- read.csv('Natural gas moneyness.csv')
VIX_30<-VIX_Monthly $ X30DAY_IMPVOL_90.0.MNY_DF[1:117]
###Import the option moneyness data with period from 12/2007 to 08/2017, which is one period ahead of the historical volatility
HVT_Monthly <- read.csv('Natural gas HVT_monthly.csv')
HVT_30<-HVT_Monthly $ VOLATILITY_30D[2:118]
###Import the monthly historical volatility with period from 01/2008 to 09/2017
R_squared1<-summary(lm(HVT_30~VIX_30+1))$adj.r.squared
print(R_squared1)
plot(HVT_30,type='l')
points(VIX_30*1.10971-2.67889,type='l',col='blue')
summary(lm(HVT_30~VIX_30+1))
###plot and summary
Part 2
###Regression on HVT & the volatility of 90% put option price Absolute change
rm(list=ls())
library(readr)
VIX_Monthly <- read.csv('Natural gas moneyness.csv')
VIX_30<-VIX_Monthly $ X30DAY_IMPVOL_90.0.MNY_DF[1:117]
###Import the option moneyness data with period from 12/2007 to 08/2017, which is one period ahead of the historical volatility
VIX_30_change<-diff(VIX_30)
###calculate the difference for VIX_30
HVT_Monthly <- read.csv('Natural gas HVT_monthly.csv')
HVT_30<-HVT_Monthly $ VOLATILITY_30D[2:118]
###Import the monthly historical volatility with period from 01/2008 to 09/2017
HVT_30_change<- diff(HVT_30)
###calculate the difference for HVT_30
R_squared1<-summary(lm(HVT_30_change~VIX_30_change+1))$adj.r.squared
print(R_squared1)
plot(HVT_30_change,type='l')
points(VIX_30_change*0.8824+0.1025,type='l',col='blue')
summary(lm(HVT_30_change~VIX_30_change+1))
###plot and summary
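The plots in Part 1 and Part 2 overlay the fitted line using coefficients typed in by hand from the regression summary. A sketch of reading them from the fitted model instead, so that the overlay stays in step with the data, is given below on simulated stand-ins (iv and hvt are not the natural gas series).

```r
set.seed(1)
# Simulated stand-ins (assumed) for lagged implied volatility and realized volatility
iv  <- runif(60, 20, 60)
hvt <- 1.1 * iv - 2.5 + rnorm(60, sd = 3)

fit <- lm(hvt ~ iv)
b <- coef(fit)                 # b[1] is the intercept, b[2] is the slope

# Fitted line built from the estimated coefficients rather than typed-in numbers
overlay <- unname(b[1] + b[2] * iv)

plot(hvt, type = 'l')
lines(overlay, col = 'blue')   # identical to fitted(fit)
```

If the input data or sample period changes, the overlay updates with the regression and no numbers need to be copied from the summary output.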
Option Moneyness
rm(list=ls())
HVT<-read.csv('Natural gas HVT_monthly.csv')

HVT_30<-HVT$VOLATILITY_30D[2:118]
T<-length(HVT_30)
moneyness<-read.csv('Natural gas moneyness.csv')[1:117,]

#Import the option moneyness data with period from 12/2007 to 08/2017, which is a one-period lag of the historical volatility
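#The lm_* fits are not defined in this listing. The definitions below are an assumed
#reconstruction: only the 110 and 90 column names appear in the listings, and the other
#column names are assumed to follow the same pattern.
mny<-function(level) moneyness[[paste0('X30DAY_IMPVOL_',level,'.MNY_DF')]]
lm_110<-lm(HVT_30~mny('110.0'))
lm_105<-lm(HVT_30~mny('105.0'))
lm_102.5<-lm(HVT_30~mny('102.5'))
lm_100<-lm(HVT_30~mny('100.0'))
lm_97.5<-lm(HVT_30~mny('97.5'))
lm_95<-lm(HVT_30~mny('95.0'))
lm_90<-lm(HVT_30~mny('90.0'))
summary(lm_110)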
summary(lm_105)
summary(lm_102.5)
summary(lm_100)
summary(lm_97.5)
summary(lm_95)
summary(lm_90)
plot(HVT_30,type='l',main="Moneyness and HVT for Natural Gas")

points(moneyness$X30DAY_IMPVOL_110.0.MNY_DF,type='l',col='blue',lwd=2)
Option Moneyness (absolute change)
rm(list=ls())
HVT<-read.csv('Natural gas HVT_monthly.csv')

HVT_30<-HVT$VOLATILITY_30D[1:118]
T<-length(HVT_30)
HVT_30_change<-diff(HVT_30)
#Calculate the absolute change of monthly historical volatility; the period is from 01/2008 to 09/2017
moneyness<-read.csv('Natural gas moneyness.csv')[1:T,]
#Import the monthly moneyness implied volatility with period from 12/2007 to 09/2017
#Calculate the absolute change of the monthly moneyness implied volatility; the period is from 01/2008 to 09/2017
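#The change series and the lm_* fits are not defined in this listing. The definitions below
#are an assumed reconstruction: only the 110 and 90 column names appear in the listings, and
#the other column names are assumed to follow the same pattern.
mny_change<-function(level) diff(moneyness[[paste0('X30DAY_IMPVOL_',level,'.MNY_DF')]])
moneyness_102.5_change<-mny_change('102.5')
lm_110<-lm(HVT_30_change~mny_change('110.0'))
lm_105<-lm(HVT_30_change~mny_change('105.0'))
lm_102.5<-lm(HVT_30_change~moneyness_102.5_change)
lm_100<-lm(HVT_30_change~mny_change('100.0'))
lm_97.5<-lm(HVT_30_change~mny_change('97.5'))
lm_95<-lm(HVT_30_change~mny_change('95.0'))
lm_90<-lm(HVT_30_change~mny_change('90.0'))
summary(lm_110)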
summary(lm_105)
summary(lm_102.5)
summary(lm_100)
summary(lm_97.5)
summary(lm_95)
summary(lm_90)
plot(HVT_30_change,type='l',main="Absolute change of Moneyness and HVT for Natural Gas")
points(moneyness_102.5_change,type='l',col='blue',lwd=2)
GARCH (1,1)
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
rm(list=ls())
### Clean the R workspace
gas_daily1<-Quandl("CHRIS/CME_NG1",api_key="2T1Yy7mQwKqsGtFXKtCy",type="raw",
collapse="daily",start_date="2007-12-31",end_date="2017-07-20")
gas_daily<-gas_daily1[ nrow(gas_daily1):1, ]
#Basic data process

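Ret_total<-diff(log(gas_daily$Settle))
T<-length(Ret_total)
write.csv(gas_daily,'gas_daily.csv')
Pred<-numeric(T)
for (i in 1001:T){
  #Rolling window of the previous 1000 daily log returns
  Retwindow <- Ret_total[(i-1000):(i-1)]
  fit1 <- garchFit( formula = ~garch(1, 1), data = Retwindow, trace = FALSE)
  #predict() returns 21 values; the original assignment kept only the first one (with a warning)
  Pred[i] <- predict(fit1, n.ahead=21)$standardDeviation[1]
  print(i)
  #predicted period: from 01/2008 to 07/2017
}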
write.csv(Pred,file='result.csv')
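#Regression preparation
result<- read.csv('gas_daily.csv')
#Pick the last day of one month as the month volatility
#(aa, Busday and fit3 are not defined in this listing; the steps below mirror the crude oil
#GARCH(1,1) listing and assume a Month column has been added to gas_daily.csv)
Month<-result$Month[1001:T]
aa<-which(diff(Month)!=0)
Predd<-Pred[1001:T][aa]
#add last day of data as the last month volatility
Predd<-c(Predd,Pred[T])
#Compare with HVT30

Real<-read.csv('Natural gas HVT_monthly.csv')
HVTd<-Real$VOLATILITY_30D[49:115]/100
#Multiply by the square root of the number of business days in each month
Busday<-read.csv('busdays.csv')
Predddd<-Predd*sqrt(Busday$busdays[2:68])

fit3<-lm(HVTd~Predddd)
summary(fit3)
plot(HVTd, type='l')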
EGARCH
##import packages
rm(list=ls())
install.packages('Quandl')
install.packages('rugarch')
library(Quandl)
library(rugarch)
library(tseries) #provides adf.test
##download data
gas_daily <- Quandl("CHRIS/CME_NG1", api_key='TPywx-DUcfEE4VMynwHR',
type='raw', collapse='daily', order = 'asc', start_date="2007-11-02",end_date="2017-11-08")
rownames(gas_daily) <- gas_daily$Date
gas_daily$Date <- NULL
ret_total <- diff(log(gas_daily$Settle))
gas_hvt <- read.csv('Natural gas HVT_daily.csv', header = TRUE)
hvt <- gas_hvt[,2]
T <- length(ret_total)
## data description & testing

mean_return=mean(ret_total)
st_dev = sd(ret_total)
qqnorm(ret_total)
qqline(ret_total)
adf.test(ret_total, alternative='stationary')
acf(ret_total)
pacf(ret_total)
## estimating the model

spec <- ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit = ugarchfit(data = ret_total, spec=spec)
# iterating over the windows, and trying to find one with least Mean Squared Error(MSE)
forecast_length = c(100, 200, 300, 500, 1000)
refit_length = c(5, 10, 20, 50)
optimal_window = NULL
optimal_refit = NULL
mse = Inf
opt_forc = NULL
#Same specification as spec above
model = ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit2 = ugarchfit(data=ret_total, spec=model)
for(i in c(1:5)){
  for (j in c(1:4)){
    rollforc = ugarchroll(spec=spec, data=ret_total, n.ahead = 1,
      forecast.length = forecast_length[i], refit.every = refit_length[j],
      refit.window = c('recursive'), solver = 'hybrid', keep.coef = TRUE)
    #Extract the one-step-ahead sigma forecasts as a numeric vector
    sigmapred = as.numeric(rollforc@forecast$density$Sigma)
    error = mean((hvt[(T-forecast_length[i]):(T-1)]- sigmapred)^2)
    if (error <= mse){
      mse = error
      optimal_window = forecast_length[i]
      optimal_refit = refit_length[j]
      opt_forc = rollforc}
  }}
print(optimal_window)
print(optimal_refit)
print(mse)
plot(opt_forc, which=2)
#Use linear regression to regress the HVT on the last day of each month on the volatility
#predicted by EGARCH on the last day of the previous month.
result<-read.csv('result.csv', header = TRUE, sep = '\t')
colnames(result)[1] = 'Month'
Monthdiff<-diff(result$Month)
#Index of the last observation of each month
aa<-which(Monthdiff!=0)
HVT<-result$HVT[aa]
Pred<-result$Sigmapred[aa]
HVTd<-HVT[2:48]
Predd<-Pred[1:47]
fit2<-lm(HVTd~Predd)
summary(fit2)
plot(HVTd,type='l')
points(Predd*1954.292-8.917,type='l',col='blue')
GJR-GARCH
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
library(rugarch)
library(parallel)
rm(list=ls()) ### Clean the R workspace
gas_daily1<-Quandl("CHRIS/CME_NG1",api_key="2T1Yy7mQwKqsGtFXKtCy",type="raw",
collapse="daily",start_date="2007-12-31",end_date="2017-09-30")
gas_daily<-gas_daily1[ nrow(gas_daily1):1, ]
price<-gas_daily$Settle
#Basic data process

Ret_total<-diff(log(price))
T<-length(Ret_total)
#Predict 21 days ahead; a different GARCH model can be substituted inside the loop.
#Check the output csv file for 999 values in pred: each one marks a loop iteration that raised an error.
#Simply substitute i-1's data if pred[i] is 999
pred<-numeric(T)
for (i in 1001:T){
  tryCatch({
    Retwindow<-Ret_total[(i-1000):(i-1)]
    spec = ugarchspec(variance.model=list(model="gjrGARCH"), distribution.model="std")
    fit = ugarchfit(spec, Retwindow)
    bootp = ugarchboot(fit, method = c("Partial", "Full")[1],n.ahead = 21, n.bootpred = 100)
    predsigma<-bootp@forc@forecast$sigmaFor[21]
    pred[i]=predsigma
    print(i)
  }, error=function(e){
    #<<- is needed so that the assignment reaches pred outside the handler function
    pred[i]<<-999})
}
plot(pred,type='l')
write.csv(pred,file='GJRGarch-gas.csv')
write.csv(gas_daily,file='gas-daily.csv')
#Here we split Date into Year, Month and Day columns manually in Excel.
result=read.csv('gas-daily.csv')
data_real=read.csv('gas vol.csv')
Real_vol=data_real$VOLATILITY_30D[50:117]/100
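#Pred, aa and Busday are not defined in this listing; the definitions below mirror the
#crude oil GARCH(1,1) listing
Pred<-pred[1001:T]
aa<-which(diff(result$Month[1001:T])!=0)
Predd<-Pred[aa]
plot(Predd,type='l')
Busday<-read.csv('busdays.csv')
Predddd<-Predd*sqrt(Busday$busdays[2:69])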
fit2<-lm(Real_vol~Predddd)
summary(fit2)
plot(Real_vol,type='l')
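In the rolling loop of this listing, a failed fit is recorded by assigning inside the error handler. An assignment made with = or <- inside a handler function is local to that function, so one alternative, sketched below, is to let tryCatch return the value. risky_fit is a hypothetical stand-in for the ugarchfit and ugarchboot steps, and NA is used here in place of 999 as an assumption of this sketch.

```r
# Hypothetical fitting step that fails on every third window
risky_fit <- function(i) {
  if (i %% 3 == 0) stop('fit did not converge')
  i / 100
}

pred <- numeric(6)
for (i in seq_along(pred)) {
  # tryCatch returns the value of whichever branch ran, so the result can be
  # assigned outside the handler and always lands in pred
  pred[i] <- tryCatch(risky_fit(i), error = function(e) NA_real_)
}
print(pred)
```

Marking failures with NA rather than a numeric sentinel keeps them out of later arithmetic unless they are handled explicitly, for example with is.na.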

