FORECASTING CRUDE OIL AND NATURAL GAS VOLATILITY

University of Connecticut

Forecasting Crude Oil and Natural Gas Volatility

Xiao Liang

Yiran Liao

Ziran Luo

Yuechen Pan

Yuqi Peng

Zhiyong Yan

Zhaojie Yang

Professor Biolsi

Vince Lanci-Echo Bay Partners

8 December 2017
1. Introduction
1.1 Background
1.2 Literature Review
2. Data
2.1 Data Source for EGARCH and GJR-GARCH
2.2 Mean Reversion Test
2.2.1 Hurst Exponent
2.2.2 Augmented Dickey–Fuller Test
2.2.3 Conclusion
3. Prediction
3.1 VIX
3.1.1 Theory foundation of OVX
3.1.1.1 VIX methodology
3.1.2 Prediction
3.1.3 Conclusion
3.2 Option Moneyness
3.2.1 Moneyness implied volatility and historical volatility
3.2.2 Absolute change of moneyness implied volatility and that of historical volatility
3.2.3 Conclusion
3.3 GARCH (1,1) Model
3.3.1 Theory foundation of GARCH (1,1)
3.3.2 Conclusion
3.4 EGARCH Model
3.4.1 Theory foundation of EGARCH
3.4.2 Prediction
3.4.3 Conclusion
3.5 GJR GARCH Model
3.5.1 Theory foundation of GJR-GARCH
3.5.2 Prediction
3.5.3 Conclusion
4. Model Comparison and Conclusion
4.1 Comparison
4.2 Conclusion
References
Figures
Figures - Crude Oil
Figures - Natural Gas
Tables
Tables - Crude Oil
Tables - Natural Gas
APPENDIX
R Code - Crude Oil
R Code - Natural Gas

1. Introduction

1.1 Background

Crude oil is arguably one of the most important driving forces of the global economy, and
changes in its price have significant effects on economic growth and welfare around the
world. Its importance is even greater for industrialized economies. Oil is one of the most
important raw materials in the world and will likely remain so for many decades to come.
Most countries are significantly affected by developments in the oil market, as producers,
consumers, or both. Oil is directly responsible for about 2.5% of world GDP. In 2014 it
provided about 38% of the world's energy needs, and it is expected to remain a leading
component of the world's energy supply. To meet the projected increase in world oil demand,
total petroleum supply is estimated to need to reach 118 million barrels per day by 2030, up
from 80 million barrels per day in 2003. Every day we use hundreds of things that are made
from oil or gas, such as gasoline, diesel, plastics, synthetic fibers, and pitch.

Natural gas is used in a remarkable number of ways. Although it is widely seen as a
cooking and heating fuel in U.S. households, natural gas has many other energy and raw
material uses that surprise most people who learn about them. In the United States, most
natural gas is burned as a fuel. In 2012, about 30% of the energy consumed across the nation
was obtained from natural gas. It was used to generate electricity, heat buildings, fuel
vehicles, heat water, bake foods, power industrial furnaces, and even run air conditioners.

Oil and gas power nearly 100% of all transportation, and the consumption of oil and gas is
very large. The world's oil and gas transport infrastructure is a globe-spanning web of
pipelines and shipping routes. The natural gas distribution pipelines in the US alone could
stretch from the Earth to the Moon seven to eight times, and millions of miles of pipe
distribute crude oil, refined products, and natural gas around the planet. There is little
reason to think that any feasible amount of renewables growth can displace fossil fuels
within a couple of generations. Wind and solar are growing fast, but the share of renewables
in total world energy consumption increased by only 0.07% from 1973 to 2009.

Oil and natural gas not only play a huge role in the world economy but also have a strong
influence on global crises. Understanding the macroeconomic impact of oil and gas prices
also requires considering in detail the exposure and interactions of micro channels, such as
the housing and auto sectors.

Predicting future oil and gas prices is therefore important. Higher oil and gas prices
mean higher business costs. From a financial perspective, many sectors of the economy are
adversely affected when oil and gas prices rise and helped when they fall, and the effects
can be very different for importing and exporting countries. The impact is universal, and
prices change fast. During the past decade, the price of oil rose from about $60 per barrel
to a peak of $145 in 2008 and subsequently fell back to about $50 in 2017.

The following describes how crude oil and natural gas prices reacted to a variety of
geopolitical and economic events during the past decade. 2008 was an exceptional year. The
crude oil price climbed to its pre-financial-crisis peak of $145, driven by unrest and
consumers' fears about the wars in Iraq and Afghanistan. In addition, the 2008 Beijing
Olympic Games brought millions of travelers into China, which pushed up the demand for oil.
The crude oil price began to fall when global markets collapsed later in the same year.
Natural gas prices behaved unusually as well. Normally, natural gas prices fall in the
summer and rise in the winter. In 2008, however, instead of following seasonal weather,
prices spiked in the summer and then dropped quickly from a peak of over $13 per Mcf to
below $3 per Mcf in the winter, because the economic recession reduced demand. In that year
natural gas followed the pattern of oil prices, rising in the summer and falling in the
winter. After the economy had recovered for two years, the crude oil price fluctuated in a
relatively narrow range and grew steadily from 2011 to early 2014, while the natural gas
price fluctuated slightly without any major spikes. In the second half of 2014, oversupply
by OPEC and the appreciation of the U.S. dollar drove oil and gas demand down, and the crude
oil price fell to a level not seen since the last global recession. The Iran nuclear deal
and the turmoil in Iraq and Libya also contributed to the decline in the crude oil price and
affected geopolitical risk in the market. Crude oil and natural gas prices both reached
their lowest points at the beginning of 2016, below $30 and $2 respectively.

Over the last couple of decades, volatility has become one of the most significant issues
in the energy market. Energy prices are among the most volatile of all commodity prices:
crude oil, natural gas, coal, and other energy products all exhibit significant price
fluctuations. These fluctuations create uncertainty for consumers and producers. Oil price
shocks due to such events have continuously increased in size and frequency. Wide
fluctuations in oil prices have played an important role in driving recessions and even the
collapse of regimes, which is why oil price movements are closely watched by economists and
investors. Given this evidence of large changes in crude oil and natural gas prices, it is
natural to ask whether the econometric tools available today can help us forecast
volatility accurately.

From a finance perspective, in the current context of an ongoing global financial crisis,
risk management and volatility forecasting are among the most important topics in the
financial world. Return is closely related to volatility, and most financial decisions are
based on a tradeoff between risk and return. Thus, in order to analyze and predict changes
in the returns of crude oil and natural gas, we first have to analyze their volatility.

Overall, the fluctuation in crude oil and natural gas prices over the past decade has had a
significant impact on the stock market and the global economy. For risk managers, it is
important to understand the degree of volatility in any investment, along with its
potential impact on the overall investment strategy, since volatility directly affects
investment valuation. Being able to understand and predict future volatility is therefore
very important in investment decision making.


1.2 Literature Review

There are two main approaches to volatility forecasting. One is based on time series
models, and the other uses the volatility implied by option prices. From a theoretical point
of view, the implied volatility of an option should contain all the available information
relevant to forecasting future volatility. In practice, however, the situation is more
complicated. In general, implied volatility contains a risk premium because volatility risk
cannot be completely hedged, as shown in the study of Bollerslev and Zhou (2005). In
addition, one of the most noteworthy phenomena is the smile effect, which shows the
limitations of the classic Black-Scholes model. The smile effect means that options with
different strikes on the same underlying and with the same time to maturity do not
necessarily yield the same implied volatility. In general, implied volatility is U-shaped
in the strike, with its minimum at the at-the-money position. Thus, if we want to use
implied volatility to predict future volatility, the same market gives multiple forecasts
for the future volatility of the same underlying asset over the same time period.

In the literature, many models have been used to forecast crude oil volatility. The most
widely used are the ARCH model proposed by Engle (1982) and its generalization by
Bollerslev (1986). In his classic study, Engle distinguished conditional from unconditional
variance for the first time, but the model is simple and needs many parameters. Bollerslev
(1986) extended the ARCH model to the Generalized Autoregressive Conditional
Heteroscedasticity (GARCH) model, which has the same key properties as ARCH but requires
far fewer parameters to adequately model the volatility process. Based on their research,
we can build models that simultaneously describe both the mean and the variance of a
financial time series. These approaches are significant improvements in time series
analysis.

Time-invariant GARCH (1,1) models have fared well in predicting the conditional
volatility of financial assets (Hansen and Lunde 2005). Moreover, oil price volatility has
traditionally been modeled as a time-invariant GARCH process. However, the standard GARCH
model cannot capture the asymmetric effects of positive and negative asset returns on
volatility. To overcome this weakness, Nelson (1991) proposed an extension of the GARCH
model called the Exponential GARCH (EGARCH), which allows volatility to respond
asymmetrically to positive and negative shocks. Another widely used extension is the
GJR-GARCH model proposed by Glosten, Jagannathan and Runkle (1993). GJR-GARCH has been
shown to have good out-of-sample performance when forecasting oil price volatility at short
horizons (Mohammadi and Su 2010; Hou and Suardi 2012). Building on previous papers, Wei et
al. (2010) studied nine GARCH models and compared their forecasting accuracy using six
different loss functions. They concluded that although nonlinear models can properly
capture long-memory volatility and the asymmetric leverage effect, these models are not the
best models for forecasting volatility.

2. Data

The data in this paper were obtained from Bloomberg and are based on West Texas
Intermediate (WTI) crude oil, the underlying commodity of the New York Mercantile
Exchange's oil futures contracts, and on natural gas traded on the New York Mercantile
Exchange (NYMEX), which is part of the Chicago Mercantile Exchange (CME) Group. The
rationale for choosing WTI futures is that they provide a sufficient amount of data to
measure and compare the forecasting accuracy of different models. We obtained daily and
monthly closing prices for crude oil over a 10-year period, from 2008 to 2017. Options are
divided into 7 strike price categories ranging from 10 percent out-of-the-money to 10
percent in-the-money. The estimation of the historical model is based on a 21-day window of
realized volatility.

Different models produce different series of predicted volatilities. To find the best
prediction model, we run regressions between each series of predicted values and HVT. HVT
is the historical volatility of the oil price, downloaded directly from Bloomberg.
Historical volatility reflects the actual price changes of oil over a given time period, so
the model whose regression fits HVT best is considered the best one. HVT is usually
calculated over moving windows of several lengths. The window length is the number of days
included in the HVT calculation, and the window moves forward by one day at a time. We use
the 30-day HVT, which represents monthly volatility, for the VIX, option moneyness, GARCH
(1,1), and GJR-GARCH models, and the 10-day, 30-day, 50-day, and 100-day windows for the
EGARCH model. For crude oil, over the period from January 2nd, 2008 to July 20th, 2017, the
average volatilities are 34.62%, 35.73%, 36.05%, and 36.57% for the 10-day, 30-day, 50-day,
and 100-day windows, and the corresponding standard deviations are 21.65%, 19.18%, 18.42%,
and 17.42%. For natural gas, over the period from November 2nd, 2007 to November 8th, 2017,
the average volatilities for the 10-day and 30-day windows are 44.79% and 46.00%, and the
standard deviations are 20.45% and 16.80%. The maximum 10-day volatility is 172.6%, observed
on September 29th, 2009, and the maximum 30-day volatility is 130.66%, observed on October
1st, 2009. These figures show that the average volatility rises slightly as the window
lengthens, while the volatility series itself becomes more stable.
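To make the moving-window idea concrete, the following is a minimal Python sketch of a rolling historical volatility computed from closing prices. It is an illustration under stated assumptions, not the Bloomberg HVT calculation: the annualization factor of 252 trading days, the use of the sample standard deviation of log returns, the function name `rolling_hist_vol`, and the toy prices are all our own choices.

```python
import math
import statistics

TRADING_DAYS = 252  # assumed annualization factor; Bloomberg's HVT convention may differ


def log_returns(prices):
    """Daily log returns: ln(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def rolling_hist_vol(prices, window):
    """Annualized volatility in percent over a moving window of `window` returns."""
    rets = log_returns(prices)
    vols = []
    for end in range(window, len(rets) + 1):
        sample = rets[end - window:end]  # the window moves forward one day at a time
        vols.append(100.0 * statistics.stdev(sample) * math.sqrt(TRADING_DAYS))
    return vols


# Toy prices (made up), not the Bloomberg series used in this paper.
prices = [50.0, 51.0, 49.5, 50.2, 52.0, 51.1, 50.7, 53.0, 52.4, 51.9, 52.8, 54.0]
vol_5 = rolling_hist_vol(prices, window=5)
vol_10 = rolling_hist_vol(prices, window=10)
```

A longer window averages over more returns, so consecutive values overlap more and the resulting series moves more smoothly, which is in line with the lower standard deviations reported for the longer windows.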



2.1 Data Source for EGARCH and GJR-GARCH

For crude oil, in both the EGARCH and GJR-GARCH models, we use the daily spot price of
West Texas Intermediate (WTI) crude oil, obtained through an R package. The sample period
ranges from January 2nd, 2008 to July 20th, 2017. Over this period, the average price of a
barrel of crude oil was $77.31, the median was $82.18, and the standard deviation was
$24.79. The maximum price of $145.29 was observed on July 3, 2008, and the minimum price of
$26.21 on February 11, 2016. To model oil returns and their volatility, we calculate daily
returns as the difference in the logarithms of consecutive daily closing prices. The mean
daily return is about -0.031%, with a standard deviation of 2.49%. WTI returns are slightly
positively skewed, at 0.2012. Kurtosis is somewhat high at 4.54, compared with 3 for a
normal distribution. Large variations are observed during the global financial crisis in
late 2008 and after crude oil prices started decreasing in July 2014. The reasons for
choosing this date range are given in the Background section, and the period is long enough
for us to make a reliable prediction.
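The return statistics quoted above can be reproduced in outline as follows. This is a small Python sketch, assuming population (divide-by-n) moments and non-excess kurtosis so that a normal distribution gives 3; statistical packages may apply small-sample corrections, so values can differ slightly. The helper name `describe` is hypothetical.

```python
import math


def log_returns(prices):
    """Difference in the logarithms of consecutive closing prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def describe(returns):
    """Mean, standard deviation, skewness and (non-excess) kurtosis of a return series."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in returns) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    return {
        "mean": mean,
        "sd": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # equals 3 for a normal distribution
    }


# Toy closing prices (made up) to show the call pattern.
toy_prices = [77.0, 78.2, 76.9, 77.5, 80.1, 79.0, 78.4]
toy_stats = describe(log_returns(toy_prices))
```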

For natural gas, we use the daily natural gas futures price from CME. The sample period
is from November 2nd, 2007 to November 8th, 2017, for a total of 2510 observations. Over
this period, the average price of natural gas is $4.11 per MMBtu, the median is $3.70, and
the standard deviation of daily futures prices is $1.96. The maximum price of $13.577 was
observed on July 3, 2008, and the minimum price of $1.63 on March 3, 2016. The maximum price
of natural gas thus occurred on the same day as that of crude oil, and the minimum prices of
gas and oil occurred within a few weeks of each other. As for crude oil, we calculate daily
gas returns as the difference in the logarithms of consecutive daily closing prices. The
mean daily return is -0.039%, with a standard deviation of 3.04%. Natural gas returns are
positively skewed, at 0.6394, and have a slightly high kurtosis of 4.85. These simple
statistics suggest that the price tendencies of crude oil and natural gas are somewhat
similar.

For the comparison with historical volatility, we download the 10-day, 30-day, 50-day, and
100-day HVT for the EGARCH model and the 30-day HVT for the GJR-GARCH model.



2.2 Mean Reversion Test

2.2.1 Hurst Exponent

The Hurst exponent is the classical tool for detecting long memory in time series. The
analysis was introduced by the English hydrologist H.E. Hurst in 1951, building on
Einstein's contributions regarding the Brownian motion of physical particles, to deal with
the problem of reservoir control near the Nile River dam. R/S analysis was introduced into
economics by Mandelbrot, who argued that this methodology was superior to autocorrelation
analysis, variance analysis, and spectral analysis.

The oldest and best-known method for estimating the Hurst exponent is R/S (rescaled range)
analysis. It was proposed by Mandelbrot and Wallis, based on the previous work of Hurst.

When the process is a Brownian motion, H is 0.5; when it is persistent, H is greater than
0.5; and when it is anti-persistent (mean-reverting), H is less than 0.5. For white noise,
H = 0, while for a simple linear trend, H = 1. Note that H must lie between 0 and 1.

We used the R package “pracma 2.1.1” and its function hurstexp(x, box, display), and
obtained the following result:

Series    Simple R/S Hurst estimation    Result
Gas       0.836415                       Persistent (H > 0.5)
Oil       0.9978132                      Persistent (H > 0.5)
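For readers who want to see what a rescaled-range estimate involves, the Python sketch below computes a single-window R/S ratio and converts it to an exponent as log(R/S) / log(n). This is a simplified textbook variant written for illustration; we do not claim it reproduces the internals of the pracma function used above, and the function name `simple_rs_hurst` and the synthetic inputs are our own. Conventions also differ on whether H is quoted for a series or for its increments, so values from different estimators are not directly comparable.

```python
import math
import random


def simple_rs_hurst(x):
    """Single-window rescaled-range estimate: H ~ log(R/S) / log(n)."""
    n = len(x)
    mean = sum(x) / n
    # Cumulative deviations from the sample mean.
    running, cumulative = 0.0, []
    for value in x:
        running += value - mean
        cumulative.append(running)
    r = max(cumulative) - min(cumulative)               # range of cumulative deviations
    s = math.sqrt(sum((v - mean) ** 2 for v in x) / n)  # population standard deviation
    return math.log(r / s) / math.log(n)


# Synthetic inputs (not the paper's data): a pure linear trend and Gaussian noise.
trend = [float(i) for i in range(1000)]
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]

h_trend = simple_rs_hurst(trend)
h_noise = simple_rs_hurst(noise)
```

With this estimator the trending series produces a clearly higher value than the uncorrelated noise, which is the sense in which a high estimate signals persistence.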

2.2.2 Augmented Dickey–Fuller Test

An augmented Dickey–Fuller (ADF) test tests the null hypothesis that a unit root is
present in a time series sample. The alternative hypothesis depends on which version of the
test is used, but is usually stationarity or trend-stationarity. It is an augmented version
of the Dickey–Fuller test for a larger and more complicated set of time series models.

If the p-value is smaller than our threshold (5% for 95% confidence, for example), the
null hypothesis of a unit root is rejected and the series is treated as stationary.

Otherwise, if the p-value is larger than 5%, the null hypothesis cannot be rejected, which
means the series is consistent with a random walk process.

We used the R package “tseries 0.10-42” and its function adf.test, and obtained the
following result:

Series    ADF test p-value    Result
Gas       0.3616              Unit root not rejected at 5%
Oil       0.5701              Unit root not rejected at 5%

2.2.3 Conclusion

Both Hurst estimates are above 0.5, which indicates persistent rather than anti-persistent
behavior, and both ADF p-values are above 5%, so a unit root cannot be rejected for either
natural gas or crude oil. Taken at face value, these results do not establish that the
tested series are mean-reverting stationary processes. The GARCH-type models in Section 3
are therefore best understood as models of the daily log returns defined in Section 2.1 and
of their conditional variance, rather than of the price levels.
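The decision rule behind the unit-root test of Section 2.2.2 can be illustrated with a stripped-down Dickey–Fuller regression. The Python sketch below is a simplification and not a substitute for the adf.test function: it includes a constant but no trend term and no lagged differences, and instead of a p-value it compares the test statistic with an approximate large-sample 5% critical value of -2.86, which we take as an assumption. The unit root is rejected only when the statistic falls below that value.

```python
import math
import random

DF_CRITICAL_5PCT = -2.86  # assumed approximate 5% critical value (constant, no trend)


def dickey_fuller_stat(y):
    """t-statistic on gamma in: y[t] - y[t-1] = a + gamma * y[t-1] + e[t]."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y, y[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    gamma = sxd / sxx
    intercept = mean_d - gamma * mean_x
    rss = sum((d - intercept - gamma * x) ** 2 for x, d in zip(lagged, diffs))
    std_err = math.sqrt(rss / (n - 2) / sxx)  # standard error of gamma
    return gamma / std_err


# Synthetic series (not the paper's data) built from the same random shocks.
rng = random.Random(0)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]

random_walk, stationary = [0.0], [0.0]
for e in shocks:
    random_walk.append(random_walk[-1] + e)      # unit-root process
    stationary.append(0.2 * stationary[-1] + e)  # mean-reverting AR(1) process

stat_random_walk = dickey_fuller_stat(random_walk)
stat_stationary = dickey_fuller_stat(stationary)
reject_unit_root = stat_stationary < DF_CRITICAL_5PCT
```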



3. Prediction

3.1 VIX

3.1.1 Theory foundation of OVX

The CBOE Crude Oil ETF Volatility Index ("Oil VIX", Ticker - OVX) measures the

market's expectation of 30-day volatility of crude oil prices by applying the VIX

methodology to United States Oil Fund, LP (Ticker - USO) options spanning a wide range of

strike prices.

The United States Oil Fund is an exchange-traded security designed to track changes in

crude oil prices. By holding near-term futures contracts and cash, the performance of the

Fund is intended to reflect, as closely as possible, the spot price of West Texas Intermediate

light, sweet crude oil, less USO expenses.

The CBOE Volatility Index (VIX Index) is considered by many to be the world's

premier barometer of equity market volatility. The VIX Index is based on real-time prices of

options on the S&P 500 Index (SPX) and is designed to reflect investors' consensus view of

future (30-day) expected stock market volatility. The VIX Index is often referred to as the

market's "fear gauge". 

In 2008, CBOE pioneered the use of the VIX methodology to estimate the expected
volatility of certain commodities and foreign currencies. The CBOE Crude Oil ETF Volatility
Index (OVX), CBOE Gold ETF Volatility Index (GVZ) and CBOE Eurocurrency ETF Volatility
Index (EVZ) use exchange-traded fund options based on the United States Oil Fund, LP (USO),
SPDR Gold Shares (GLD) and CurrencyShares Euro Trust (FXE), respectively.

CBOE has since introduced several new volatility indexes, including volatility indexes
based on individual stocks as well as indexes such as the CBOE U.S. Energy Sector ETF
Volatility Index (VXXLE). However, there is still no VIX-style index that tracks natural
gas volatility alone. As the methodology below shows, the VIX calculation gives different
weights to different options: the lower the strike price, the higher the weight. In our
experience, the implied volatility of the 90% moneyness put option may be a reasonable
substitute for the natural gas VIX, which does not exist.

3.1.1.1 VIX methodology

Stock indexes, such as the S&P 500, are calculated using the prices of their component

stocks. Each index employs rules that govern the selection of component securities and a

formula to calculate index values.

The VIX Index is a volatility index comprised of options rather than stocks, with the

price of each option reflecting the market’s expectation of future volatility. Like conventional

indexes, the VIX calculation employs rules for selecting component options and a formula to

calculate index values.

The generalized formula used in the VIX calculation is:

$$\sigma^2 = \frac{2}{T} \sum_i \frac{\Delta K_i}{K_i^2} e^{RT} Q(K_i) - \frac{1}{T} \left[ \frac{F}{K_0} - 1 \right]^2$$

Where...

$\sigma$ is VIX / 100, so that VIX = $\sigma \times 100$

$T$ Time to expiration

$F$ Forward index level derived from index option prices

$K_0$ First strike below the forward index level, $F$

$K_i$ Strike price of the i-th out-of-the-money option; a call if $K_i > K_0$, a put if
$K_i < K_0$, and both put and call if $K_i = K_0$

$\Delta K_i$ Interval between strike prices: half the difference between the strikes on
either side of $K_i$:

$$\Delta K_i = \frac{K_{i+1} - K_{i-1}}{2}$$

(Note: $\Delta K$ for the lowest strike is simply the difference between the lowest strike
and the next higher strike. Likewise, $\Delta K$ for the highest strike is the difference
between the highest strike and the next lower strike.)

$R$ Risk-free interest rate to expiration

$Q(K_i)$ The midpoint of the bid-ask spread for each option with strike $K_i$
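As a rough illustration of how the generalized formula is evaluated for a single expiration, the Python sketch below sums the strike contributions and subtracts the forward adjustment term. It assumes the strikes are sorted in ascending order and that each quote is already the appropriate out-of-the-money midpoint; the function name `vix_style_variance` and all input numbers are made up for the example.

```python
import math


def vix_style_variance(strikes, quotes, forward, k0, rate, t):
    """sigma^2 = (2/T) * sum(dK_i / K_i^2 * e^(RT) * Q(K_i)) - (1/T) * (F/K_0 - 1)^2."""
    n = len(strikes)
    total = 0.0
    for i, (k, q) in enumerate(zip(strikes, quotes)):
        if i == 0:
            dk = strikes[1] - strikes[0]             # lowest strike: gap to next higher
        elif i == n - 1:
            dk = strikes[-1] - strikes[-2]           # highest strike: gap to next lower
        else:
            dk = (strikes[i + 1] - strikes[i - 1]) / 2.0
        total += dk / k ** 2 * math.exp(rate * t) * q
    return (2.0 / t) * total - (1.0 / t) * (forward / k0 - 1.0) ** 2


# Hypothetical option chain: three strikes with midpoint quotes, one month to expiry.
strikes = [90.0, 100.0, 110.0]
quotes = [1.0, 2.0, 1.0]
variance = vix_style_variance(strikes, quotes, forward=101.0, k0=100.0, rate=0.0, t=1.0 / 12.0)
index_level = 100.0 * math.sqrt(variance)
```

Because each contribution is divided by the squared strike, a given quote at a lower strike receives more weight than the same quote at a higher strike.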

GETTING STARTED:

The VIX calculation measures time to expiration, T, in calendar days and divides each

day into minutes in order to replicate the precision that is commonly used by professional

option and volatility traders. The time to expiration is given by the following expression:
$$T = \frac{M_{\text{current day}} + M_{\text{settlement day}} + M_{\text{other days}}}{\text{Minutes in a year}}$$

WHERE...

$M_{\text{current day}}$ = minutes remaining until midnight of the current day

$M_{\text{settlement day}}$ = minutes from midnight until 8:30 a.m. for “standard” SPX
expirations; or minutes from midnight until 3:00 p.m. for “weekly” SPX expirations

$M_{\text{other days}}$ = total minutes in the days between current day and expiration day
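The time-to-expiration expression is simple arithmetic, as the short Python sketch below shows. The year length of 365 days (525,600 minutes) and the example inputs (a calculation time of 9:46 a.m., a standard 8:30 a.m. settlement, and 24 full days in between) are assumptions chosen for illustration.

```python
MINUTES_PER_DAY = 1440
MINUTES_PER_YEAR = 365 * MINUTES_PER_DAY  # assumed 365-day year


def time_to_expiration(m_current_day, m_settlement_day, m_other_days):
    """T = (M_current day + M_settlement day + M_other days) / minutes in a year."""
    return (m_current_day + m_settlement_day + m_other_days) / MINUTES_PER_YEAR


# Hypothetical example: it is 9:46 a.m., so 854 minutes remain until midnight;
# settlement is at 8:30 a.m. (510 minutes after midnight); 24 full days lie in between.
t_near = time_to_expiration(
    m_current_day=854,
    m_settlement_day=510,
    m_other_days=24 * MINUTES_PER_DAY,
)
```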

STEP 1: Select the options to be used in the VIX calculation

$$F = \text{Strike Price} + e^{RT} \times (\text{Call Price} - \text{Put Price})$$

STEP 2: Calculate volatility for both near-term and next-term options

STEP 3: Calculate the 30-day weighted average of $\sigma_1^2$ and $\sigma_2^2$. Then take
the square root of that value and multiply by 100 to get VIX.
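Step 3 does not spell out the weights, so the Python sketch below should be read as our understanding of the minute-based interpolation in the published CBOE methodology rather than as a definitive statement of it. The near-term and next-term variances are weighted so that the result corresponds to a constant 30-day horizon, and the result is then annualized. The function name and the inputs are hypothetical.

```python
import math

N_30 = 30 * 1440    # minutes in 30 days
N_365 = 365 * 1440  # minutes in a 365-day year


def vix_from_two_terms(var_near, t_near, var_next, t_next):
    """Interpolate two term variances to a 30-day horizon and convert to an index level."""
    n_near = t_near * N_365  # minutes to near-term expiration
    n_next = t_next * N_365  # minutes to next-term expiration
    w_near = (n_next - N_30) / (n_next - n_near)
    w_next = (N_30 - n_near) / (n_next - n_near)
    blended = t_near * var_near * w_near + t_next * var_next * w_next
    return 100.0 * math.sqrt(blended * N_365 / N_30)


# Hypothetical inputs: variances of 0.09 and 0.10 with 25 and 32 days to expiration.
index_level = vix_from_two_terms(0.09, 25.0 / 365.0, 0.10, 32.0 / 365.0)
```

A useful sanity check is that when both terms have the same variance, the interpolation returns exactly 100 times the square root of that variance, whatever the two expiration dates are.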

3.1.2 Prediction

We now run regressions between the historical volatilities and the OVX data, and then
judge whether OVX is a good predictor of future WTI oil price volatility. Since we use
monthly OVX data to predict the 30-day volatility, OVX is always lagged by one month
relative to HVT.

In part one, the OVX data range from Nov-2007 to Jun-2017 and the HVT data range from
Dec-2007 to Jul-2017.

We then run a regression of HVT on the corresponding OVX level. The multiple R-squared of
this regression is 0.1863 and the adjusted R-squared is 0.1792.

In part two, we run a regression on the absolute changes of HVT and OVX. The absolute
changes of OVX range from (Dec-2007 minus Nov-2007) to (Jun-2017 minus May-2017), and those
of HVT range from (Jan-2008 minus Dec-2007) to (Jul-2017 minus Jun-2017).

We then run a regression of the absolute change in HVT on the corresponding absolute
change in OVX. The multiple R-squared of this regression is 4.108e-05 and the adjusted
R-squared is -0.008808.
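The two regressions can be sketched in Python as follows: a one-variable ordinary least squares fit that reports R-squared and adjusted R-squared, applied first to levels with a one-month lag and then to first differences (the "absolute changes"). The monthly values are made up, and the helper names `ols_fit` and `first_differences` are our own; the actual estimates in this paper were produced in R.

```python
def ols_fit(x, y):
    """Simple OLS of y on x with an intercept: slope, intercept, R^2, adjusted R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)  # one regressor plus intercept
    return slope, intercept, r2, adj_r2


def first_differences(series):
    """Absolute (not relative) month-over-month change."""
    return [b - a for a, b in zip(series, series[1:])]


# Made-up monthly values in percent, indexed by the same calendar months.
implied = [30.0, 35.0, 28.0, 40.0, 45.0, 38.0, 33.0]
realized = [27.0, 29.0, 36.0, 30.0, 41.0, 47.0, 36.0]

# One-month lag: implied volatility in month t is paired with realized volatility in t+1.
x_lagged, y_next = implied[:-1], realized[1:]
level_fit = ols_fit(x_lagged, y_next)
change_fit = ols_fit(first_differences(x_lagged), first_differences(y_next))
```

The adjusted R-squared can be negative when the fit is very weak, which is what happens in the absolute-change regression for crude oil reported above.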

For natural gas volatility, we do the same. First, we import the monthly historical
volatility of natural gas for the period from Jan-2008 to Sept-2017 and the implied
volatility of the 90% put option for the period from Dec-2007 to Aug-2017.

We then run a regression on the corresponding values of the two series. The multiple
R-squared of this regression is 0.6117 and the adjusted R-squared is 0.6083.

Running the regression on the absolute changes of natural gas HVT and of the implied
volatility of the 90% put option gives a multiple R-squared of 0.2377 and an adjusted
R-squared of 0.231.

3.1.3 Conclusion

From the discussion above, we conclude that OVX is not a fully reliable predictor of
future crude oil volatility.

OVX measures not the actual volatility of the crude oil price but the volatility implied
by option prices, which means that the supply and demand factors for options are included
in OVX. When demand for options is high, option prices are higher and so is the implied
volatility of these options.

The VIX does not have the necessary predictive power in the real world. Sometimes
realized volatility rises when markets rise, and the VIX rises as well, but that does not
necessarily indicate that market sentiment is turning into panic. Conversely, the VIX may
fall as markets fall. The 70 percent fall in the VIX in 2009 was likely the result of a
decline in historical volatility and a possible retreat of investor anxiety.

Just as there is no causal relationship between the VIX and the S&P 500, there is none
between OVX and crude oil: a rising OVX cannot force the crude oil price down, which
explains to some extent why OVX is not an excellent predictor of the crude oil price.

For natural gas, we obtain a fairly high R-squared when we use the implied volatility of
the 90% put option as a substitute. However, no VIX-style index that tracks the volatility
of natural gas actually exists. Therefore, using the VIX methodology to track natural gas
volatility is not feasible.



3.2 Option Moneyness

3.2.1 Moneyness implied volatility and historical volatility

One of the best available forecasts of future realized volatility is moneyness implied
volatility. Moneyness is the relative position of the current or future price of an
underlying asset, such as a stock, with respect to the strike price of a derivative, most
commonly a call or put option. Moneyness is first of all a three-fold classification: if
the derivative would make money if it expired today, it is said to be in the money; if it
would not make money, it is said to be out of the money; and if the current price and the
strike price are equal, it is said to be at the money. Moneyness can be expressed more
precisely as a percentage. For example, if a call option has a strike price of $50 and the
underlying is currently trading at $55, the contract is in the money by 10%, or
equivalently the option has a moneyness of 110%.
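The worked example above translates directly into code. The Python sketch below follows the convention of that example, in which moneyness is the underlying price divided by the strike; this is an assumption, since some data vendors quote the strike relative to the underlying instead. The function names are hypothetical.

```python
def moneyness_pct(underlying_price, strike):
    """Moneyness in percent, defined here as underlying price / strike."""
    return 100.0 * underlying_price / strike


def classify(underlying_price, strike, option_type):
    """Three-fold classification: in, at, or out of the money."""
    if underlying_price == strike:
        return "at the money"
    # A call pays off when the underlying is above the strike, a put when it is below.
    pays_off = underlying_price > strike if option_type == "call" else underlying_price < strike
    return "in the money" if pays_off else "out of the money"


call_moneyness = moneyness_pct(55.0, 50.0)  # the example from the text
call_state = classify(55.0, 50.0, "call")
put_state = classify(55.0, 50.0, "put")
```

For a put with the same strike and underlying price the classification is reversed, which is why the option type has to accompany any moneyness figure.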

The rough classification of option moneyness can be quantified by various definitions to

express the moneyness as a number, measuring how far the asset is in the money or out of the

money with respect to the strike – or conversely how far a strike is in or out of the money

with respect to the spot (or forward) price of the asset. This quantified notion of moneyness is

most importantly used in defining the relative volatility surface: the implied volatility in

terms of moneyness, rather than absolute price.

From the perspective of volatility, the volatility smile is a graph that describes the
relationship between an option's implied volatility and its strike price. The further an
option is in the money or out of the money, the greater its implied volatility becomes.



Option moneyness can be used to predict volatility not only because the trading volume of
oil options is quite large but also because moneyness implied volatility reflects the
expectation of the market without very large deviations. Our group therefore focuses on
finding which moneyness implied volatility gives the best prediction for the next few days
or months. We collect monthly implied volatilities for moneyness levels of 110%, 105%,
102.5%, 100%, 97.5%, 95% and 90%, and then run a regression between each moneyness implied
volatility and the historical volatility of the corresponding period.

3.2.2 Absolute change of moneyness implied volatility and that of historical volatility

To examine the predictive power of moneyness implied volatility further, we calculate the
absolute change of the monthly moneyness implied volatilities and that of the historical
volatilities. Our goal is to see whether movements in moneyness implied volatility match
those of historical volatility and whether the two move in the same direction. Since
volatility is already measured in percent, we use the absolute change instead of the
relative change.

3.2.3 Conclusion

For crude oil, our results show that implied volatilities contain at least some
information on future realized volatility. In the regressions between moneyness implied
volatility and historical volatility, we find that the 102.5% option is the best indicator
of future volatility at the monthly level because the adjusted R-squared of its regression
is the highest, at 0.19. In the regressions between the absolute change of moneyness
implied volatility and that of historical volatility, we find that the absolute change of
the 95% implied volatility is the best indicator of future volatility in terms of both the
sign of the coefficient and the R-squared.

For natural gas, the best volatility prediction comes from the 110% option, whose
regression gives the highest adjusted R-squared, 0.6451, which is a fairly good result.
Moneyness implied volatility can therefore be a reasonably accurate indicator of the real
volatility of natural gas. In the regressions between the absolute change of moneyness
implied volatility and that of historical volatility, we find that the absolute change of
the 102.5% implied volatility is the best indicator of future volatility in terms of both
the sign of the coefficient and the R-squared.

Comparing the regression results for crude oil and natural gas, we find that moneyness
implied volatility has more predictive power for natural gas than for crude oil. Although
the trading volume of crude oil is larger than that of natural gas, the transactions in
natural gas options perhaps give us more specific and accurate market information with
less noise.

Using moneyness implied volatility for prediction has some shortcomings. First, it is complex to implement. Second, the method is not suitable for products that are not actively traded. Even for an actively traded product such as crude oil options, the trading volume at some specific moneyness levels may be relatively low in some periods. Third, because implied volatility is calculated from market transactions, market noise created by investors' irrational expectations can heavily affect its accuracy.



3.3 GARCH (1,1) Model

3.3.1 Theory foundation of GARCH (1,1)

The generalized autoregressive conditional heteroskedasticity (GARCH) process is an econometric approach to estimating volatility in financial markets. It generalizes the ARCH process introduced in 1982 by Robert F. Engle, an economist and 2003 winner of the Nobel Memorial Prize in Economic Sciences; the generalization to GARCH is due to Tim Bollerslev (1986). There are several forms of GARCH modeling. The GARCH process is often preferred by financial modeling professionals because it provides a more real-world context than other forms when trying to predict the prices and rates of financial instruments.

GARCH processes, being autoregressive, use past squared observations and past variances to model the current variance. They are widely used in finance because of their effectiveness in modeling asset returns and inflation. GARCH aims to minimize forecasting errors by accounting for errors in prior forecasts, which enhances the accuracy of ongoing predictions.

The GARCH model has several advantages. First, it can capture the long-term mean reversion of volatility. Second, it captures near-term persistence and fluctuations in volatility. Third, the weights applied to past observations can be adjusted to better fit subsequent observations. Fourth, the GARCH model can be modified to account for the asymmetry of volatility. Last, it can be made multivariate to capture the cross-correlation of volatility across asset classes.

The simplest GARCH model of dynamic variance, the GARCH(1,1) model, can be written as:

$$\sigma_{t+1}^2 = \omega + \alpha R_t^2 + \beta \sigma_t^2, \quad \text{with } \alpha + \beta < 1$$

We can define the unconditional, or long-run average, variance to be

$$\sigma^2 = \frac{\omega}{1 - \alpha - \beta}$$

Substituting $\omega = (1 - \alpha - \beta)\sigma^2$ into the first equation gives $\sigma_{t+1}^2 = (1 - \alpha - \beta)\sigma^2 + \alpha R_t^2 + \beta \sigma_t^2$. Thus, tomorrow's variance is a weighted average of the long-run variance, today's squared return and today's variance.
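
As an illustration of the recursion above, the following sketch filters a short return series through a GARCH(1,1) update. The parameter values and returns are hypothetical and chosen only so that $\alpha + \beta < 1$; starting the recursion at the long-run variance is an assumption of this sketch, not something stated in the paper.

```python
def long_run_variance(omega, alpha, beta):
    """Unconditional variance omega / (1 - alpha - beta)."""
    if alpha + beta >= 1:
        raise ValueError("alpha + beta must be below 1")
    return omega / (1 - alpha - beta)


def garch11_next_variance(omega, alpha, beta, ret, var):
    """Tomorrow's variance from today's return and today's variance."""
    return omega + alpha * ret ** 2 + beta * var


def garch11_filter(returns, omega, alpha, beta):
    """Variance path, started (by assumption) at the long-run variance."""
    var = long_run_variance(omega, alpha, beta)
    path = [var]
    for ret in returns:
        var = garch11_next_variance(omega, alpha, beta, ret, var)
        path.append(var)
    return path


# Hypothetical daily parameters and returns (not estimates from the paper).
omega, alpha, beta = 0.000005, 0.08, 0.90
daily_returns = [0.01, -0.03, 0.002, -0.015]

variance_path = garch11_filter(daily_returns, omega, alpha, beta)
sigma2 = long_run_variance(omega, alpha, beta)
print(sigma2, variance_path)
```

Each step can be checked against the weighted-average form: the new variance equals $(1-\alpha-\beta)\sigma^2 + \alpha R_t^2 + \beta\sigma_t^2$.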

When forecasting, we lag the GARCH(1,1) series by one period, because the volatility predicted by GARCH(1,1) at the end of this month corresponds to next month's historical volatility. One inconvenience is that the multi-period return distribution is unknown even if the one-day-ahead distribution is assumed to be normal. The GARCH model produces a one-day-ahead variance forecast, $\sigma_{t+1}^2$, which can easily be extended to a forecast $k$ periods ahead; this is especially useful if the goal is to price an option with $k$ steps to expiration. We use a 21-day-ahead forecast, since 21 days is the average time to expiration of the options in our sample. We then count the number of business days in each month and multiply its square root by the volatility predicted at the end of that month to obtain the monthly volatility. As the last step, we regress monthly historical volatility on the predicted monthly volatility to assess the predictive power of GARCH(1,1).
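
The extension to $k$ periods can be sketched with the standard GARCH(1,1) result that the expected variance reverts geometrically to the long-run variance: $E_t[\sigma_{t+k}^2] = \sigma^2 + (\alpha+\beta)^{k-1}(\sigma_{t+1}^2 - \sigma^2)$. The code below is a sketch under that assumption; the parameter values and the one-day-ahead variance are hypothetical, and the final scaling by the square root of 21 follows the monthly scaling described above.

```python
import math


def garch11_forecast_variance(omega, alpha, beta, next_var, k):
    """Expected daily variance k days ahead, given the one-day-ahead variance."""
    long_run = omega / (1 - alpha - beta)
    return long_run + (alpha + beta) ** (k - 1) * (next_var - long_run)


# Hypothetical parameters and one-day-ahead variance (not from the paper).
omega, alpha, beta = 0.000005, 0.08, 0.90
next_var = 0.0004

var_21 = garch11_forecast_variance(omega, alpha, beta, next_var, 21)
daily_vol_21 = math.sqrt(var_21)
monthly_vol = daily_vol_21 * math.sqrt(21)  # scale a daily volatility to 21 business days
print(var_21, monthly_vol)
```

Because $\alpha + \beta < 1$, the forecast moves from the current level toward the long-run variance as $k$ grows, which is the mean reversion mentioned among the model's advantages.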

3.3.2 Conclusion

The adjusted R-squared of the crude oil regression is 0.1842, and the t-statistic of the slope is 3.988 (p-value 0.000172; see Table 3A), so the result is significant at the 1% level. For natural gas, the adjusted R-squared is 0.5252, much higher than for crude oil, and the corresponding t-statistic is 8.518. From this point of view, GARCH(1,1) predicts natural gas volatility better than crude oil volatility.

GARCH(1,1) is useful across a wide range of applications. Its main limitation, however, is that it cannot respond asymmetrically to rising and falling returns, even though the relationship between volatility and the direction of asset returns is important, observable and persistent. Furthermore, GARCH models are only part of a solution: although they are usually applied to return series, financial decisions are rarely based solely on expected returns and volatilities.



3.4 EGARCH Model

3.4.1 Theory foundation of EGARCH

The Exponential Generalized Autoregressive Conditional Heteroscedasticity (EGARCH) model introduced by Nelson (1991) builds a directional effect of price moves into the conditional variance. In practice, stock returns are negatively correlated with changes in return volatility: volatility tends to rise in response to "bad news" (excess returns lower than expected) and to fall in response to "good news" (excess returns higher than expected). Large price declines can therefore have a larger impact on volatility than large price increases. GARCH models, however, assume that only the magnitude, and not the sign, of unanticipated excess returns determines future $\sigma_t^2$. Moreover, GARCH models cannot explain the observed covariance between $\varepsilon_t^2$ and $\varepsilon_{t-j}$; this is possible only if the conditional variance is expressed as an asymmetric function of $\varepsilon_{t-j}$. In addition, GARCH models essentially specify the behavior of the square of the data, so a few large observations can dominate the sample.

Because the general GARCH model has these limitations, asymmetric models are used to account for the so-called leverage effect, whereby an unexpected price drop increases volatility more than an unexpected price increase of the same size. In the EGARCH(p,q) model, $\sigma_t^2$ depends on both the size and the sign of lagged residuals.

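The conditional variance in the EGARCH(1,1) model is given by

$$\ln(\sigma_t^2) = \omega + \alpha e_{t-1} + \gamma\left(|e_{t-1}| - E|e_{t-1}|\right) + \beta \ln(\sigma_{t-1}^2)$$

where $\alpha$ is the asymmetric leverage parameter, which quantifies the sign (leverage) effect in the model, and $\gamma$ captures the magnitude effect. The standardized innovations satisfy $e_t \sim N(0, 1)$, so that $E|e_{t-1}| = \sqrt{2/\pi}$. Because the model is specified for the logarithm of the variance, its parameters are free from nonnegativity constraints.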
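
To make the asymmetry concrete, the sketch below applies one step of an EGARCH(1,1) update of the form $\ln\sigma_t^2 = \omega + \alpha e_{t-1} + \gamma(|e_{t-1}| - E|e_{t-1}|) + \beta\ln\sigma_{t-1}^2$ to a negative and a positive shock of the same size. The parameter values are hypothetical; a negative coefficient on the signed shock is assumed here so that bad news raises variance more than good news.

```python
import math

EXPECTED_ABS_Z = math.sqrt(2 / math.pi)  # E|z| for a standard normal z


def egarch11_next_log_variance(omega, alpha, gamma, beta, z_prev, log_var_prev):
    """One EGARCH(1,1) step on the log-variance scale."""
    sign_effect = alpha * z_prev                          # responds to the sign of the shock
    size_effect = gamma * (abs(z_prev) - EXPECTED_ABS_Z)  # responds to its magnitude
    return omega + sign_effect + size_effect + beta * log_var_prev


# Hypothetical parameters (not estimates from the paper).
omega, alpha, gamma, beta = -0.2, -0.08, 0.15, 0.97
log_var_today = math.log(0.0004)

var_after_bad_news = math.exp(
    egarch11_next_log_variance(omega, alpha, gamma, beta, -2.0, log_var_today))
var_after_good_news = math.exp(
    egarch11_next_log_variance(omega, alpha, gamma, beta, 2.0, log_var_today))
print(var_after_bad_news, var_after_good_news)
```

Because the recursion is written for the logarithm, the implied variance is positive for any parameter values, which is why no nonnegativity constraints are needed.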


Following the same procedure as for GARCH(1,1), the $h$-step-ahead forecast formula of the EGARCH(1,1) model can be expressed as:

$$\ln\hat{\sigma}_{t+h}^2 = \bar{\sigma}^2 + \beta^{h-1}\left(\ln\hat{\sigma}_{t+1}^2 - \bar{\sigma}^2\right)$$

where $\bar{\sigma}^2 = \left(\omega - \gamma\sqrt{2/\pi}\right)/(1 - \beta)$.

An attractive feature of the more general GARCH models, such as EGARCH and GJR-GARCH, is that they allow positive and negative shocks to affect the conditional variance asymmetrically. Such asymmetric effects of different types of shocks on volatility are a well-documented feature of financial data. In the case of crude oil prices, political disruptions in the Middle East or large decreases in global demand tend to increase volatility (see, e.g., Ferderer 1996; Wilson et al. 1996), whereas new oil field discoveries seem to have a more muted effect. For example, the volatility of WTI crude oil returns increased sharply around the global financial crisis but did not decline when shale oil started to be shipped in larger quantities to Cushing. Even when put together, the conditional normality assumption and the simultaneous estimation of the conditional variance do not capture the thick tails entirely. Natural gas also plays a crucial role in the economy of the United States: in 2010, about 25% of the energy used in the U.S. came from natural gas. As with oil, the volatility of natural gas appears to be asymmetric. Over the past several years, the volatility of natural gas prices has become a major concern among market participants as well as researchers.

In this part, we use nonlinear GARCH models to predict volatility in order to take the asymmetric effect into account.

In many financial time series, the standardized residuals from the estimated models display excess kurtosis, which suggests a departure from conditional normality. In such cases, the fat-tailed distribution of the innovations driving an ARCH process can be better modeled using the Student's t distribution or the Generalized Error Distribution (GED). Taking the square root of the conditional variance and expressing it as an annualized percentage yields a time-varying volatility estimate. A single estimated model can be used to construct forecasts of volatility over any time horizon.

There are several things we want to test by running the EGARCH model. First, over our sample period, the oil price has been highly volatile since the financial crisis. Given the extremely high kurtosis present in the data, is an EGARCH model with Student's t innovations preferred over one with normal innovations?

Second, some articles on the EGARCH model suggest that it predicts volatility more accurately over the following 1 to 5 days. We intend to find out which forecast horizon is best for EGARCH.

Third, once we have calculated the predicted volatility, we have to compare it with historical data. We downloaded the historical oil price volatility series, named HVT, from Bloomberg. HVT is available for several horizons, such as 10-day HVT, 30-day HVT, 50-day HVT, etc. We test which HVT series best fits our predicted values.

Lastly, having tried different types of GARCH models, which one is the most stable for predicting future oil volatility? We look at the MSE of each model and check which models suit which situations.

3.4.2 Prediction

For this prediction, we use the Quandl package to download data and the rugarch package for model fitting and forecasting. We use the rolling forecast in rugarch to predict volatility and compare it with the historical value. The ugarchroll function takes several arguments, and we have to select the best combination from a series of choices; here we consider (100, 200, 300, 500, 1000) for the window and (5, 10, 20, 50) for the refit length. We therefore wrote two nested loops over these parameter choices, used the mean squared error (MSE) to quantify the forecasting error of each model, and chose the combination with the lowest MSE. We then regress the historical volatility on the forecast volatility.
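
The parameter search can be sketched as below. This is not the rugarch code from the appendix: to stay self-contained, a rolling sample standard deviation stands in for the GARCH forecast, simulated returns stand in for the futures data, and the absolute return is used as a rough proxy for realized volatility. All function names are hypothetical; only the two grids come from the text.

```python
import itertools
import math
import random


def rolling_sigma(returns, start, window, refit_every):
    """Stand-in forecaster: sample std of the last `window` returns,
    re-estimated only every `refit_every` steps."""
    forecasts = []
    sigma = 0.0
    for t in range(start, len(returns)):
        if (t - start) % refit_every == 0:
            sample = returns[t - window:t]
            mean = sum(sample) / window
            sigma = math.sqrt(sum((r - mean) ** 2 for r in sample) / (window - 1))
        forecasts.append(sigma)
    return forecasts


def mean_squared_error(predicted, actual):
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def grid_search(returns, windows, refit_lengths):
    """Score every (window, refit length) pair on a common evaluation period."""
    start = max(windows)  # same out-of-sample period for every combination
    realized = [abs(r) for r in returns[start:]]
    scores = {}
    for window, refit_every in itertools.product(windows, refit_lengths):
        forecasts = rolling_sigma(returns, start, window, refit_every)
        scores[(window, refit_every)] = mean_squared_error(forecasts, realized)
    best = min(scores, key=scores.get)
    return best, scores


random.seed(0)
simulated_returns = [random.gauss(0.0, 0.02) for _ in range(1500)]
best_pair, all_scores = grid_search(
    simulated_returns, (100, 200, 300, 500, 1000), (5, 10, 20, 50))
print(best_pair, all_scores[best_pair])
```

Evaluating every combination on the same period is a design choice of this sketch; otherwise models with short windows would be scored on more observations than models with long windows, and their MSEs would not be directly comparable.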

As in our previous process for forecasting future oil volatility, we use daily data on the CME natural gas futures contract with a maturity of 1 month, from Nov. 11, 2007 to Nov. 8, 2017. We first take the first difference of logarithmic prices as the daily return, which is shown below.

We then calculate the mean and variance of the returns over this period. The average return is about -0.00038, which is not significantly different from zero, and the standard deviation is about 0.0305, which is much larger in magnitude.
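
The return calculation can be sketched as follows, assuming a plain list of daily settlement prices; the prices below are hypothetical and the helper names are not from the paper. The t-statistic shown is the usual test of a zero mean, which is one way to make "not significantly different from zero" precise.

```python
import math


def log_returns(prices):
    """First difference of log prices: ln(P_t) - ln(P_{t-1})."""
    return [math.log(today / yesterday) for yesterday, today in zip(prices, prices[1:])]


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def t_stat_zero_mean(values):
    """t-statistic for the null hypothesis that the mean return is zero."""
    return mean(values) / (sample_std(values) / math.sqrt(len(values)))


# Hypothetical daily futures prices (not data from the paper).
prices = [3.10, 3.05, 3.12, 3.02, 3.08, 3.00]
returns = log_returns(prices)
print(mean(returns), sample_std(returns), t_stat_zero_mean(returns))
```

Log returns add up over time, so their sum over the sample equals the log of the last price divided by the first.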

Next, we ran several tests on the stationarity and distribution of the data.

• The Jarque-Bera statistic shows that the null hypothesis of a normal distribution should be rejected at the 1% significance level.

Jarque Bera Test

data: ret_total

X-squared = 2448.1, df = 2, p-value < 2.2e-16

• The Ljung-Box Q statistics reject the null hypothesis of no autocorrelation up to the 10th order.

We also plotted the autocorrelation function (ACF) of the returns.

We then follow the same process as before to fit the EGARCH model: we fit a model for each set of parameters, calculate the MSE between each model's prediction and the historical volatility, and choose the parameters with the lowest MSE. We also plot the output from rugarch (see Figure 6B).

We then write the predictions and the historical volatility to a CSV file and read the file back into R. To avoid autocorrelation, we feed only the first observation of each month into a linear regression to obtain the parameters a and b, and we plot the predicted and historical volatility.

3.4.3 Conclusion

After the forecasting process, we obtain the combination with the lowest MSE (702.5309) for crude oil, which gives the optimal prediction parameters. In this combination, the forecast length is 100, the refit length is 20, and the historical series is the 50-day historical volatility.

As the plot shows, where the blue line is the predicted sigma and the gray line is the historical volatility from the data, our prediction more or less catches the trend of the true volatility but is much smoother. EGARCH may therefore not be the best model for predicting volatility.


For natural gas, the linear regression yields a low R-squared (adjusted R-squared of 0.06132; see Table 4B).

3.5 GJR GARCH Model

3.5.1 Theory foundation of GJR-GARCH

The GJR-GARCH model of Glosten et al. (1993) models positive and negative shocks on the conditional variance asymmetrically via the use of the indicator function $I$,

$$\sigma_t^2 = \left(\omega + \sum_{j=1}^{m}\zeta_j v_{jt}\right) + \sum_{j=1}^{q}\left(\alpha_j \varepsilon_{t-j}^2 + \gamma_j I_{t-j}\varepsilon_{t-j}^2\right) + \sum_{j=1}^{p}\beta_j \sigma_{t-j}^2$$

where $\gamma_j$ now represents the 'leverage' term. The indicator function $I$ takes the value 1 for $\varepsilon \le 0$ and 0 otherwise. Because of the presence of the indicator function, the persistence of the model now crucially depends on the asymmetry of the conditional distribution used. The persistence of the model $\hat{P}$ is

$$\hat{P} = \sum_{j=1}^{q}\alpha_j + \sum_{j=1}^{p}\beta_j + \sum_{j=1}^{q}\gamma_j \kappa$$

where $\kappa$ is the expected value of the standardized residuals $z_t$ below zero (effectively the probability of being below zero),

$$\kappa = E\left[I_{t-j} z_{t-j}^2\right] = \int_{-\infty}^{0} f(z, 0, 1, \ldots)\,dz$$

where $f$ is the standardized conditional density with any additional skew and shape parameters (...). In the case of symmetric distributions, the value of $\kappa$ is simply equal to 0.5. The variance targeting, half-life and unconditional variance follow from the persistence parameter, and under variance targeting $\omega$ is replaced by:

$$\bar{\sigma}^2\left(1 - \hat{P}\right) - \sum_{j=1}^{m}\zeta_j \bar{v}_j$$

where $\bar{\sigma}^2$ is the unconditional variance of $\varepsilon^2$, which is consistently estimated by its sample counterpart at every iteration of the solver following the mean equation filtration, and $\bar{v}_j$ represents the sample mean of the $j$th external regressor in the variance equation (assuming stationarity).
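
The role of the indicator function and of the persistence can be illustrated with a GJR-GARCH(1,1) sketch without external regressors, that is, $\sigma_t^2 = \omega + (\alpha + \gamma I_{t-1})\varepsilon_{t-1}^2 + \beta\sigma_{t-1}^2$. The parameter values are hypothetical, and a symmetric conditional distribution ($\kappa = 0.5$) is assumed; the half-life uses the usual definition $-\ln 2 / \ln \hat{P}$.

```python
import math


def gjr_garch11_next_variance(omega, alpha, gamma, beta, eps_prev, var_prev):
    """One GJR-GARCH(1,1) step; the leverage term applies only when eps_prev <= 0."""
    indicator = 1.0 if eps_prev <= 0 else 0.0
    return omega + (alpha + gamma * indicator) * eps_prev ** 2 + beta * var_prev


def gjr_persistence(alpha, gamma, beta, kappa=0.5):
    """Persistence alpha + beta + gamma * kappa (kappa = 0.5 if symmetric)."""
    return alpha + beta + gamma * kappa


def unconditional_variance(omega, persistence):
    return omega / (1.0 - persistence)


def half_life(persistence):
    """Number of periods for a variance shock to decay by half."""
    return -math.log(2.0) / math.log(persistence)


# Hypothetical parameters (not estimates from the paper).
omega, alpha, gamma, beta = 0.000004, 0.03, 0.10, 0.90
persistence = gjr_persistence(alpha, gamma, beta)

var_after_drop = gjr_garch11_next_variance(omega, alpha, gamma, beta, -0.03, 0.0004)
var_after_rise = gjr_garch11_next_variance(omega, alpha, gamma, beta, 0.03, 0.0004)
print(persistence, unconditional_variance(omega, persistence), half_life(persistence))
print(var_after_drop, var_after_rise)
```

With a positive leverage coefficient, a negative shock raises next-period variance by an extra $\gamma\varepsilon^2$ relative to a positive shock of the same size.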

The naming conventions for passing fixed or starting parameters for this model are:

● ARCH(q) parameters are 'alpha1', 'alpha2', ...,

● Leverage(q) parameters are 'gamma1', 'gamma2', ...,

● GARCH(p) parameters are 'beta1', 'beta2', ...,

● variance intercept parameter is 'omega'

● the external regressor parameters are 'vxreg1', 'vxreg2', ...,

The Leverage parameter follows the order of the ARCH parameter.

3.5.2 Prediction

We fit the GJR-GARCH model with the rugarch R package.

The data range from Jan-02-2008 to July-20-2017, 2407 days in total. We use 1000 days as the fit window for the GARCH model and predict the 21-day-ahead volatility for the remaining roughly 1407 days. For each prediction we simulate 100 paths for the random part of the model. We take the prediction on the last day of every month and multiply it by the square root of the number of days in that month to obtain the month's predicted volatility:

$$\sigma_{Pred} = \sigma_{lastday} \cdot \sqrt{d_i}$$
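
The month-end scaling can be sketched as follows. The input format (ISO date strings paired with daily volatility forecasts, sorted by date) and the function name are assumptions of this sketch, and $d_i$ is taken to be the number of observations available in month $i$.

```python
import math


def monthly_predicted_volatility(daily_forecasts):
    """Scale each month's last daily forecast by sqrt(number of days in the month).

    `daily_forecasts` is a date-sorted list of ("YYYY-MM-DD", daily_sigma) pairs.
    """
    days_and_last = {}
    for date, sigma in daily_forecasts:
        month = date[:7]
        count, _ = days_and_last.get(month, (0, None))
        days_and_last[month] = (count + 1, sigma)  # keeps the latest sigma seen
    return {month: sigma * math.sqrt(count)
            for month, (count, sigma) in days_and_last.items()}


# Hypothetical daily forecasts (not output from the paper's models).
forecasts = [
    ("2017-01-27", 0.018), ("2017-01-30", 0.019), ("2017-01-31", 0.020),
    ("2017-02-01", 0.021), ("2017-02-02", 0.022),
]
monthly = monthly_predicted_volatility(forecasts)
print(monthly)
```

Square-root-of-time scaling treats daily returns within the month as uncorrelated with a constant variance, which is a simplification relative to the model's own multi-step dynamics.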


This gives 68 predicted monthly volatilities, on which we linearly regress the historical monthly volatility. The result of the crude oil regression is:

$$\sigma_{Hist,t+1} = 0.945 \cdot \sigma_{Pred,t} + 0.205$$

The result of the natural gas regression is:

$$\sigma_{Hist,t+1} = 3.51337 \cdot \sigma_{Pred,t} - 0.01778$$

In these equations, $\sigma_{Hist,t+1}$ is the historical monthly volatility in month $t+1$, and $\sigma_{Pred,t}$ is the predicted volatility for month $t+1$ conditional on data up to month $t$.

The R-squared of the crude oil regression is 0.22688 and the adjusted R-squared is 0.2149.

The R-squared of the natural gas regression is 0.396 and the adjusted R-squared is 0.3868. In the graphs (Figures 5A and 5B), the blue line is the volatility predicted by GJR-GARCH and the black line is the realized monthly volatility.

3.5.3 Conclusion

The GARCH family does not perform well here, which means it is not well suited to long-horizon volatility prediction. If we use daily predictions with a one-day horizon, we obtain a very high R-squared, which means the GARCH family is a good method for predicting short-horizon volatility, for example for calculating Value at Risk (VaR) or for risk management. It is not a good way to predict long-term volatility, however, because GARCH uses only historical data.

4. Model Comparison and Conclusion

4.1 Comparison

We use adjusted R-squared as the evaluation criterion for each model's forecasting power. The table below summarizes the estimates for the five models we use to forecast volatility, based on 10 years of historical data.

Forecasting Model Results Summary

Model              Oil adj. R-squared   Oil adj. R-squared (absolute change)   Gas adj. R-squared   Gas adj. R-squared (absolute change)
VIX                0.1792               -0.008808                              0.6083               0.231
Option Moneyness   0.1904               -0.006138                              0.6451               0.01628
GARCH(1,1)         0.1842               n/a                                    0.5252               n/a
EGARCH             0.03551              n/a                                    0.06132              n/a
GJR-GARCH          0.2149               n/a                                    0.3868               n/a
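
For readers who want to check the criterion, adjusted R-squared follows from the multiple R-squared as $1 - (1 - R^2)\frac{n-1}{n-p-1}$, where $n$ is the number of observations and $p$ the number of regressors. The sketch below applies this standard formula to the R-squared values and degrees of freedom reported in Tables 1A, 1B and 4A; the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors=1):
    """Adjusted R-squared: penalizes R-squared for the number of regressors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)


# With one regressor, n_obs = residual degrees of freedom + 2.
oil_vix = adjusted_r_squared(0.1863, 114 + 2)     # Table 1A
gas_vix = adjusted_r_squared(0.6117, 115 + 2)     # Table 1B
oil_egarch = adjusted_r_squared(0.05648, 45 + 2)  # Table 4A
print(round(oil_vix, 4), round(gas_vix, 4), round(oil_egarch, 4))
```

With a single regressor and more than a hundred observations the adjustment is small, so the ranking of the models is the same under either measure.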

Apart from the EGARCH model, VIX performs worst in forecasting crude oil volatility. In recent years, VIX has tended to be higher than realized volatility. This is not to suggest that the VIX measure is of low value. Rather, it should be interpreted strictly for what it aims to represent: the market price of volatility exposure consistent with observed option prices. As a direct indicator of future volatility it is more limited, because it combines volatility forecasting with the pricing of the risk associated with volatility. There is no existing VIX that tracks natural gas volatility, so we use the implied volatility of the 90% moneyness put option as a substitute.


For option moneyness, which incorporates option implied volatility, the 102.5% option has the highest adjusted R-squared for crude oil, 0.1904, while the regression of the absolute change in historical volatility on the absolute change in moneyness implied volatility indicates that the 95% option is the best indicator of future crude oil volatility. Because implied volatility did not pass the test of forecast rationality, this result is at odds with the conclusion that implied volatility is the best available predictor of future volatility. For natural gas, the 110% option gives the highest adjusted R-squared, 0.6451, while the 102.5% option is the best indicator in the absolute-change regressions in terms of the sign of the coefficient and R-squared. Moneyness implied volatility can be an accurate indicator of realized volatility for natural gas because seasonal changes play an important role in shifting natural gas demand, which makes natural gas less uncertain than crude oil.

The EGARCH forecasts have little explanatory power for either commodity in terms of adjusted R-squared; for that model we focused more on MSE in this paper. GJR-GARCH has the highest adjusted R-squared in the GARCH family for the crude oil forecast. Asymmetric effects are present in the data, and asymmetric models that allow different responses to different past shocks explain volatility better. Overall, however, the GARCH family does not perform well in forecasting future oil volatility. For natural gas, the simple GARCH(1,1) did a better job.

Low R-squared values are problematic when we need precise predictions. The regression plots help interpret the results (see Figures 1-5): the black line represents the historical volatility data, while the colored line represents the predicted volatility. They indicate that none of the five models we studied is a very good indicator of future crude oil volatility.

Overall, every model we used performs better in predicting natural gas volatility. To improve our forecasts of future crude oil volatility, it might be helpful to combine models. In addition, the performance of crude oil volatility forecasts can be evaluated on the basis of both statistical measures and traders' behavior.



4.2 Conclusion

In this paper, we estimate and forecast WTI crude oil and natural gas volatility using five different models: VIX, option moneyness (moneyness implied volatility), GARCH(1,1), EGARCH, and GJR-GARCH. We take 2008-2017 as the sample period and use daily observations from US markets. Based on the discussion above, we reach the following conclusions for forecasting future volatility:

(1) For implied volatility to be an efficient volatility forecast, we have to eliminate the biases caused by the assumptions of the Black-Scholes model. Implied volatility is a better indicator of future natural gas volatility than of future oil volatility.

(2) Behavioral finance matters. Investors have to behave rationally when using the available market information in their decision making, since market noise can substantially reduce forecasting accuracy.

(3) The GARCH family used in this paper provides relatively higher predictive accuracy over shorter time horizons (such as one day ahead). In general, the simple GARCH model performs better than the asymmetric GARCH models in normal periods, but the asymmetric GJR-GARCH model is preferred in turbulent periods.

(4) The asymmetric GJR-GARCH model provides the preferred forecast of future oil volatility, while VIX contributes a small but significant additional degree of forecast power and information not contained in the GARCH forecasts.



References
Frangoul, Anmar. "Natural Gas: Why It's Important and What You Need to Know." CNBC. CNBC, 25 Apr. 2017. Web. 28 Nov. 2017.

Irakli. "Oil's Role in the World Economy and in the Global Crises." CNN. Cable News Network, 21 Mar. 2015. Web. 28 Nov. 2017.

Nielsen, Barry. "Financial Matters: The Importance of the Price of Oil." Helena Independent Record. N.p., 31 Dec. 2015. Web. 28 Nov. 2017.

Lux, Thomas, Mawuli Segnon, and Rangan Gupta. "Forecasting Crude Oil Price Volatility and Value-at-risk: Evidence from Historical and Recent Data." Energy Economics 56 (2016): 117-33. Web. 1 Sep. 2017.

"Augmented Dickey–Fuller Test." Wikipedia. Wikimedia Foundation, 23 Nov. 2017. Web. 01 Dec. 2017. https://en.wikipedia.org/wiki/Augmented_Dickey%E2%80%93Fuller_test.

Granero, M.A. Sánchez, J.E. Trinidad Segovia, and J. García Pérez. "Some Comments on Hurst Exponent and the Long Memory Processes on Capital Markets." Physica A: Statistical Mechanics and Its Applications 387.22 (2008): 5543-551. Web. 1 Oct. 2017.

"Cboe Crude Oil ETF Volatility Index (OVX)." Cboe. N.p., n.d. Web. 30 Oct. 2017. http://www.cboe.com/products/vix-index-volatility/volatility-on-etfs/cboe-crude-oil-etf-volatility-index-ovx.

"Cboe, CBSX, & CFE Press Releases." Cboe. N.p., 14 July 2008. Web. 30 Oct. 2017. http://www.cboe.com/aboutcboe/cboe-cbsx-amp-cfe-press-releases?DIR=ACNews&FILE=cboe_20080714.doc.

"Cboe Volatility Index® (VIX®)." VIX Index. N.p., n.d. Web. 30 Oct. 2017. http://www.cboe.com/products/vix-index-volatility/vix-options-and-futures/vix-index.

Staff, Investopedia. "VIX - CBOE Volatility Index." Investopedia. N.p., 07 Aug. 2015. Web. 30 Oct. 2017. https://www.investopedia.com/terms/v/vix.asp.

"The CBOE Volatility Index - VIX®." Cboe (n.d.): 1-23. Aug. 2014. Web. 3 Oct. 2017. https://www.cboe.com/micro/vix/vixwhite.pdf.

Rossi, Eduardo. Lecture Notes on GARCH Models. Diss. University of Pavia, 2004. Web. 12 Oct. 2017.

Hansson, Mathias, and Rune Sand. Forecasting Crude Oil Futures Volatility. Thesis. BI Norwegian Business School, 2012. N.p.: n.p., n.d. Print.

Musaddiq, Tareena. "Modeling and Forecasting the Volatility of Oil Futures Using the ARCH Family Models." The Lahore Journal of Business (2012): 79-108. Summer 2012. Web. 2 Oct. 2017.

Bentes, Sónia R. "A Comparative Analysis of the Predictive Power of Implied Volatility Indices and GARCH Forecasted Volatility." Physica A: Statistical Mechanics and Its Applications 424 (2015): 105-12. Web. 1 Oct. 2017.

Ghalanos, Alexios. "Introduction to the Rugarch Package. (Version 1.3-1)." (2017): 1-48. Web. 15 Oct. 2017.

Zhang, Yuejun, Ting Yao, and Lingyun He. "Forecasting Crude Oil Market Volatility: Can the Regime Switching GARCH Model Beat the Single-regime GARCH Models?" (2015): 1-30. 5 Dec. 2015. Web. 1 Sept. 2017.



Figures

Figures – Crude Oil

Figure 1A: VIX Prediction

Figure 1a: VIX Prediction (absolute change)

Figure 2A: Option Moneyness

Figure 2a: Option Moneyness (absolute change)

Figure 3A: GARCH (1,1) Prediction

Figure 4A: EGARCH Prediction

Figure 5A: GJR-GARCH Prediction

Figure 6A: EGARCH Plot

Figures – Natural Gas

Figure 1B: VIX Prediction

Figure 1b: VIX Prediction (absolute change)

Figure 2B: Option Moneyness

Figure 2b: Option Moneyness (absolute change)

Figure 3B: GARCH (1,1) Prediction

Figure 4B: EGARCH Prediction

Figure 5B: GJR-GARCH Prediction

Figure 6B: EGARCH Plots

Tables

Tables - Crude Oil

Table 1A: VIX Prediction

Residuals:
     Min        1Q    Median        3Q       Max
 -12.250    -5.052    -2.578     4.829    14.101

Coefficients:
            Estimate   Std. Error   t value   Pr(>|t|)
Intercept   23.37130   1.91583      12.199    < 2e-16     ***
vix1        0.24403    0.04776      5.109     1.31e-06    ***

Residual standard error: 7.273 on 114 degrees of freedom

Multiple R-squared   Adjusted R-squared   F-statistic             p-value
0.1863               0.1792               26.11 on 1 and 114 DF   1.311e-06

Table 1a: VIX Prediction (absolute change)

Residuals:
     Min        1Q    Median        3Q       Max
 -8.7193   -0.5075   -0.0791    0.2456   10.2201

Coefficients:
            Estimate    Std. Error   t value   Pr(>|t|)
Intercept   0.063119    0.163173     0.387     0.700
VIX_30      -0.001709   0.025081     -0.068    0.946

Residual standard error: 1.75 on 113 degrees of freedom

Multiple R-squared   Adjusted R-squared   F-statistic                p-value
4.108e-05            -0.008808            0.004642 on 1 and 113 DF   0.9458

Table 2A: Option Moneyness Prediction

Regression result
Option moneyness     110%        105%        102.5%      100%        97.5%       95%         90%
Slope                0.27240     0.26983     0.27213     0.26941     0.26208     0.2641      0.26284
Adjusted R-squared   0.1834      0.1848      0.1904      0.1873      0.1796      0.1778      0.1868
P-value              8.694e-07   7.876e-07   5.245e-07   6.559e-07   1.149e-06   1.306e-06   6.787e-07

Table 2a: Option Moneyness Prediction (absolute change)

Regression result
Option moneyness     110%        105%        102.5%      100%        97.5%       95%         90%
Slope                -0.002481   -0.001121   -0.001241   -0.001344   0.00864     0.01006     0.009465
Adjusted R-squared   -0.008542   -0.008663   -0.008656   -0.00865    -0.006822   -0.006138   -0.006345
P-value              0.895       0.9518      0.9464      0.9425      0.6445      0.5898      0.6052

Table 3A: GARCH (1,1) Prediction

Regression result
            Coefficients   t-statistics   P-value
Intercept   0.23433        14.948         < 2e-16
Predddd     0.61246        3.988          0.000172

Multiple R-squared: 0.1966   Adjusted R-squared: 0.1842

Table 4A: EGARCH Prediction

Residuals:
     Min        1Q    Median        3Q       Max
 -14.340    -9.069    -4.285     4.336    33.571

Coefficients:
            Estimate   Std. Error   t value   Pr(>|t|)
Intercept   35.796     5.253        6.814     1.92e-08    ***
pred        -351.153   213.96       -1.641    0.108

Residual standard error: 7.273 on 114 degrees of freedom

Multiple R-squared   Adjusted R-squared   F-statistic            p-value
0.05648              0.03551              2.694 on 1 and 45 DF   0.1077

Tables - Natural Gas

Table 1B: VIX Prediction

Residuals:
     Min        1Q    Median        3Q       Max
 -29.642    -6.682    -1.508     3.425    38.583

Coefficients:
            Estimate   Std. Error   t value   Pr(>|t|)
Intercept   -2.67899   3.83329      -0.699    0.486
VIX_30      1.10971    0.08246      13.458    <2e-16      ***

Residual standard error: 11.16 on 115 degrees of freedom

Multiple R-squared   Adjusted R-squared   F-statistic             p-value
0.6117               0.6083               181.1 on 1 and 115 DF   < 2.2e-16

Table 1b: VIX Prediction (absolute change)

Residuals:
     Min        1Q    Median        3Q       Max
 -32.898    -6.614    -1.089     5.538    44.007

Coefficients:
            Estimate   Std. Error   t value   Pr(>|t|)
Intercept   0.1025     0.1025       0.094     0.926
VIX_30      0.8824     0.1480       5.962     2.85e-08    ***

Residual standard error: 11.79 on 114 degrees of freedom

Multiple R-squared   Adjusted R-squared   F-statistic             p-value
0.2377               0.231                35.54 on 1 and 114 DF   2.855e-08

Table 2B: Option Moneyness Prediction

Regression result for Natural Gas
Option moneyness     110%       105%       102.5%     100%       97.5%      95%        90%
Slope                1.13418    1.12053    1.11321    1.14609    1.12315    1.12021    1.10971
Adjusted R-squared   0.6451     0.6335     0.6268     0.639      0.6195     0.6187     0.6083
P-value              <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16   <2.2e-16

Table 2b: Option Moneyness Prediction (absolute change)

Regression result for Natural Gas
Option moneyness     110%       105%       102.5%     100%       97.5%      95%        90%
Slope                0.2257     0.22968    0.23045    0.23753    0.22869    0.25076    0.28068
Adjusted R-squared   0.01586    0.01593    0.01628    0.0136     0.009768   0.01193    0.0156
P-value              0.09297    0.09251    0.09018    0.1097     0.1458     0.1241     0.09473

Table 3B: GARCH (1,1) Prediction

Regression result
            Coefficients   t-statistics   P-value
Intercept   0.03162        0.635          0.528
Predddd     3.11737        8.518          3.51e-12

Multiple R-squared: 0.5275   Adjusted R-squared: 0.5202

Table 4B: EGARCH Prediction

Residuals:
     Min        1Q    Median        3Q       Max
 -23.102    -9.040    -5.174     5.005    45.610

Coefficients:
            Estimate   Std. Error   t value   Pr(>|t|)
Intercept   29.028     7.737        3.752     0.0005      ***
pred        517.458    258.562      2.001     0.0514

Residual standard error: 7.273 on 114 degrees of freedom

Multiple R-squared   Adjusted R-squared   F-statistic            p-value
0.08173              0.06132              4.005 on 1 and 45 DF   0.05141

APPENDIX

R Code - Crude oil

VIX
# Part 1: regression of HVT on the OVX price
rm(list = ls())
library(readr)

VIX_Monthly <- read_csv("VIX-MONTHLY.csv")
HVT_Monthly <- read_csv("HVT-MONTHLY.csv")

# OVX monthly prices, from 2007-05 to 2017-09
VIX1 <- VIX_Monthly$`Price`
# Predictor: OVX from 2007-11 to 2017-06
vix1 <- VIX1[7:122]
# Response: 30-day historical volatility from 2007-12 to 2017-07 (one month ahead)
HVT1 <- HVT_Monthly$`CL1 COMB Comdty Hist Vol (30)`[1:116]

# Fit the regression once and reuse the fitted object
fit1 <- lm(HVT1 ~ vix1)
R_squared1 <- summary(fit1)$adj.r.squared
print(R_squared1)

# Plot historical volatility (black) against the fitted values (blue).
# Using fitted() avoids hard-coding the estimated slope and intercept.
plot(HVT1, type = 'l')
lines(fitted(fit1), col = 'blue')
summary(fit1)

Part 2
###(Regression on HVT & OVX price absolute change)
rm(list=ls())
library(readr)
VIX_Monthly <- read_csv("VIX-MONTHLY.csv")
HVT_Monthly <- read_csv("HVT-MONTHLY.csv")
VIX1<-VIX_Monthly$`Price`
###import the OVX date from 2007-05 to 2017-09
vix1<-VIX1[7:122]
###from 2007-11 to 2017-06
vix1_change<-diff(vix1)
###calculate the change for VIX from 2007-12-2007-11 to 2017-06-2017-05
HVT1<-HVT_Monthly$'CL1 COMB Comdty Hist Vol (30)'[1:116]
###import the HVT data from 2007-12 to 2017-07
HVT1_change<-diff(HVT1)
###calculate the change for HVT from 2008-01-2007-12 to 2017-07-2017-06
###run the regression correspondingly
fit1<-lm(HVT1_change~vix1_change+1)
R_squared1<-summary(fit1)$adj.r.squared
print(R_squared1)

###plot and summary (Regression on HVT & OVX Absolute Change)
###(the original run gave slope -0.001709 and intercept 0.063119)
plot(HVT1_change,type='l')
points(coef(fit1)[1]+coef(fit1)[2]*vix1_change,type='l',col='blue')
summary(fit1)
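
Part 2 regresses month-on-month changes rather than levels. A small sketch with made-up numbers, showing that `diff()` shortens a series by one element and that the two differenced series remain aligned element by element:

```r
hvt <- c(30, 34, 31, 40, 38)   # hypothetical historical volatility (levels)
ovx <- c(28, 33, 35, 37, 36)   # hypothetical implied volatility, already lagged one month

hvt_change <- diff(hvt)        # 4, -3, 9, -2
ovx_change <- diff(ovx)        # 5, 2, 2, -1

# Regress change on change; both vectors have length(hvt) - 1 elements
fit <- lm(hvt_change ~ ovx_change)
print(coef(fit))
```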
Option moneyness
#Clear the workspace
rm(list=ls())

#Import the monthly historical volatility with period from 01/2008 to 09/2017
HVT<-read.csv('HVT MONTHLY.csv')
HVT_30<-HVT$CL1.COMB.Comdty.Hist.Vol..30.[2:118]
T<-length(HVT_30)

#Import the option moneyness data for 12/2007 to 08/2017, which lags the historical volatility by one period
moneyness<-read.csv('Moneyness_30_monthly.csv')[13:(13+T-1),]

#Regress the historical volatility with volatility of 110% option


lm_110<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_110.0.MNY_DF)
summary(lm_110)
#Regress the historical volatility with volatility of 105% option
lm_105<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_105.0.MNY_DF)
summary(lm_105)
#Regress the historical volatility with volatility of 102.5% option
lm_102.5<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_102.5.MNY_DF)
summary(lm_102.5)
#Regress the historical volatility with volatility of 100% option
lm_100<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_100.0.MNY_DF)
summary(lm_100)
#Regress the historical volatility with volatility of 97.5% option
lm_97.5<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_97.5.MNY_DF)
summary(lm_97.5)
#Regress the historical volatility with volatility of 95% option
lm_95<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_95.0.MNY_DF)
summary(lm_95)
#Regress the historical volatility with volatility of 90% option
lm_90<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_90.0.MNY_DF)
summary(lm_90)

plot(HVT_30,type='l',main="Moneyness and HVT")


points(moneyness$X30DAY_IMPVOL_102.5.MNY_DF,type='l',col='blue',lwd=2)
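
The seven regressions above differ only in the moneyness column. A compact sketch of the same comparison that collects the adjusted R-squared per column. The data frame is simulated and the column names are shortened placeholders, not the field names used in the code above.

```r
set.seed(42)
n <- 117
hvt <- rnorm(n, mean = 35, sd = 10)      # hypothetical historical volatility
mny <- c("90", "95", "97.5", "100", "102.5", "105", "110")

# Simulated implied volatilities, one column per moneyness level
iv <- as.data.frame(sapply(mny, function(m) hvt + rnorm(n, sd = 8)))
names(iv) <- paste0("iv_", mny)

# Fit one regression per column and keep the adjusted R-squared
adj_r2 <- sapply(names(iv), function(col) {
  summary(lm(hvt ~ iv[[col]]))$adj.r.squared
})
print(round(sort(adj_r2, decreasing = TRUE), 4))
```

Sorting the result makes it easy to see which moneyness level explains the most variation, which is the comparison the individual `summary()` calls support.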
Option Moneyness (absolute change)
#Clear the workspace
rm(list=ls())

#Import the monthly historical volatility with period from 12/2007 to 09/2017
HVT<-read.csv('HVT MONTHLY.csv')
HVT_30<-HVT$CL1.COMB.Comdty.Hist.Vol..30.[1:118]
T<-length(HVT_30)

#Calculate the absolute change of monthly historical volatility; the period is from 01/2008 to 09/2017
HVT_30_change<-diff(HVT_30)

#Import the monthly moneyness implied volatility with period from 12/2007 to 09/2017
moneyness<-read.csv('Moneyness_30_monthly.csv')[12:(12+T-1),]

#Calculate the absolute change of the monthly moneyness implied volatility; the period is from 01/2008 to 09/2017
moneyness_110_change<-diff(moneyness$X30DAY_IMPVOL_110.0.MNY_DF)
moneyness_105_change<-diff(moneyness$X30DAY_IMPVOL_105.0.MNY_DF)
moneyness_102.5_change<-diff(moneyness$X30DAY_IMPVOL_102.5.MNY_DF)
moneyness_100_change<-diff(moneyness$X30DAY_IMPVOL_100.0.MNY_DF)
moneyness_97.5_change<-diff(moneyness$X30DAY_IMPVOL_97.5.MNY_DF)
moneyness_95_change<-diff(moneyness$X30DAY_IMPVOL_95.0.MNY_DF)
moneyness_90_change<-diff(moneyness$X30DAY_IMPVOL_90.0.MNY_DF)

#Regress the change of historical volatility with that of volatility of 110% option
lm_110<-lm(HVT_30_change ~ moneyness_110_change)
summary(lm_110)
#Regress the change of historical volatility with that of volatility of 105% option
lm_105<-lm(HVT_30_change ~ moneyness_105_change)
summary(lm_105)
#Regress the change of historical volatility with that of volatility of 102.5% option
lm_102.5<-lm(HVT_30_change ~ moneyness_102.5_change)
summary(lm_102.5)
#Regress the change of historical volatility with that of volatility of 100% option
lm_100<-lm(HVT_30_change ~ moneyness_100_change)
summary(lm_100)
#Regress the change of historical volatility with that of volatility of 97.5% option
lm_97.5<-lm(HVT_30_change ~ moneyness_97.5_change)
summary(lm_97.5)
#Regress the change of historical volatility with that of volatility of 95% option
lm_95<-lm(HVT_30_change ~ moneyness_95_change)
summary(lm_95)
#Regress the change of historical volatility with that of volatility of 90% option
lm_90<-lm(HVT_30_change ~ moneyness_90_change)
summary(lm_90)
plot(HVT_30_change,type='l',main="Absolute change of Moneyness and HVT")
points(moneyness_95_change,type='l',col='blue',lwd=2)
GARCH (1,1)
#Business Day
rm(list=ls())
library(zoo)
#Quandl() and the %>% pipeline below need these packages
library(Quandl)
library(dplyr)
oil_daily1 <- Quandl("FRED/DCOILWTICO",api_key="2T1Yy7mQwKqsGtFXKtCy",
type="raw",collapse="daily",start_date="2007-12-31",
end_date="2017-08-20")
oil_daily <- oil_daily1 %>%
arrange(Date) %>%
mutate(Ret_total = c(9999,diff(log(Value)))) %>%
slice(-1)
oil_daily <- oil_daily %>%
mutate(year_mon = as.yearmon(Date)) %>%
group_by(year_mon) %>%
mutate(busdays = n()) %>%
ungroup()
busdays <- oil_daily %>%
group_by(year_mon) %>%
slice(1) %>%
select(Date,year_mon,busdays) %>%
ungroup()
busdays <- subset(busdays, select = c(1, 3))
write.csv(busdays,file="busdays.csv")

#Import Packages
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)

#Clean the R workspace


rm(list=ls())

#Import data
data=read.csv('HVT-Daily2.csv')
oil_daily=data$Price

#Basic data process


Ret_total<-diff(log(oil_daily))
T<-length(Ret_total)

#Initialise Pred so that it has one entry per row before the loop fills it
data$Pred<-NA
for (i in 1001:T){
Retwindow <- Ret_total[(i-1000):(i-1)]
fit1 <- garchFit( formula = ~garch(1, 1), data = Retwindow, trace = FALSE)
#predict() returns 21 values; a single-element assignment keeps only the first, made explicit here
data$Pred[i] <- predict(fit1, n.ahead=21)$standardDeviation[1]
print(i)
#predicted period: from 01/2008 to 09/2017
}

#Write the predicted data with corresponding date into a csv file.
write.csv(data,file='Garch11.csv')

#Regression Preparation
result<- read.csv('Garch11.csv')

#Pick the last day of one month as the monthly volatility


Month<-result$Month[1001:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1001:T]
Predd<-Pred[aa]

#add last day of data as the last month volatility.


Predd<-c(Predd,Pred[length(Pred)])

#Compare with HVT30


Real<-read.csv('HVT.csv')
HVTd<-Real$HVT[2:68]/100
Preddd<-Predd[1:67]
#Multiply the number of business days in each month
Busday<-read.csv('busdays.csv')
Predddd<-Preddd*sqrt(Busday$busdays[2:68])

#Backtest Regression Part with HVT month


fit2<-lm(HVTd~Predddd)
summary(fit2)
plot(HVTd,type='l',main="GARCH(1,1) and HVT")
points(Predddd*fit2$coefficients[2]+fit2$coefficients[1],type='l',col='blue',lwd=2)
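
The month-end selection and the square-root scaling by business days used above can be illustrated on a toy series. All values are invented. Scaling a daily standard deviation by the square root of the number of business days assumes uncorrelated daily returns.

```r
month <- c(1, 1, 1, 2, 2, 3, 3, 3)                            # month of each daily row
pred  <- c(.010, .011, .012, .020, .021, .015, .016, .017)   # daily sigma forecasts

# Rows where the next row belongs to a different month (rows 3 and 5)
last_of_month <- which(diff(month) != 0)

# Append the final row so that the last month is represented too
month_end <- c(pred[last_of_month], pred[length(pred)])

busdays <- c(21, 20, 22)                                      # hypothetical business-day counts
monthly_sigma <- month_end * sqrt(busdays)
print(monthly_sigma)
```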
EGARCH
##Import Packages
rm(list=ls())
install.packages('Quandl')
install.packages('rugarch')
library(Quandl)
library(rugarch)
library(tseries)
library(forecast)

##Download Data
oil_daily <- Quandl("FRED/DCOILWTICO", api_key='TPywx-DUcfEE4VMynwHR',
type='raw', collapse='daily', order = 'asc', start_date="2008-01-01",end_date="2017-07-20")
rownames(oil_daily) <- oil_daily$Date
oil_daily$Date <- NULL
ret_total <- diff(log(oil_daily$Value))
hvt <- read.csv('HVT-Daily.csv', header = TRUE, col.names = c('Date',
'HVT','Vol_10','Vol_30','Vol_50','Vol_100'))
hvt <- hvt[-1,]$Vol_100
T <- length(ret_total)

## Data Description & Testing


mean_return=mean(ret_total)
st_dev = sd(ret_total)
qqnorm(ret_total)
qqline(ret_total)
adf.test(ret_total, alternative='stationary')
acf(ret_total)
pacf(ret_total)

## Estimating the Model


spec <- ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit = ugarchfit(data = ret_total, spec=spec)

# Iterating over the windows, and trying to find one with least Mean Squared Error(MSE)
forecast_length = c(100, 200, 300, 500, 1000)
refit_length = c(5, 10, 20, 50)
optimal_window = NULL
optimal_refit = NULL
mse = Inf
opt_forc = NULL
model = ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit2 = ugarchfit(data=ret_total, spec=model)
for(i in c(1:5)){
for (j in c(1:4)){
rollforc = ugarchroll(spec=spec, data=ret_total, n.ahead = 1, forecast.length = forecast_length[i],
refit.every = refit_length[j], refit.window = c('recursive'), solver = 'hybrid', keep.coef =
TRUE)
#Sigma is kept as a numeric vector so that mean() returns a number
sigmapred = as.numeric(rollforc@forecast$density$Sigma)
error = mean((hvt[(T-forecast_length[i]):(T-1)]- sigmapred)^2)
if (error <= mse){
mse = error
optimal_window = forecast_length[i]
optimal_refit = refit_length[j]
opt_forc = rollforc}
}}
print(optimal_window)
print(optimal_refit)
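
The double loop above is a grid search that keeps the (window, refit) pair with the lowest MSE. A minimal sketch of the same selection logic, with a hypothetical `score()` function standing in for the rolling model fit so that the example runs without rugarch:

```r
forecast_length <- c(100, 200, 300, 500, 1000)
refit_length <- c(5, 10, 20, 50)

# Hypothetical stand-in for "fit the rolling model and return its MSE"
score <- function(window, refit) (window - 300)^2 / 1e4 + (refit - 10)^2

# One row per (window, refit) combination
grid <- expand.grid(window = forecast_length, refit = refit_length)
grid$mse <- mapply(score, grid$window, grid$refit)

# Row with the smallest MSE
best <- grid[which.min(grid$mse), ]
print(best)
```

Note that `which.min()` returns the first minimum, whereas a loop that updates on `error <= mse` keeps the last of any tied combinations.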
GJR-GARCH
#Clear the workspace
rm(list=ls())

#Import Packages
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
library(rugarch)
library(parallel)

#Preparation: Put HVT-Daily.csv, HVT.csv and busdays.csv in the working directory.
#HVT.csv is HVT-monthly data
#In HVT-Daily.csv, the data should be divided manually with the EXCEL data-divide column function.

data=read.csv('HVT-Daily.csv')
oil_monthly=data$Price

#Basic data process
Ret_total<-diff(log(oil_monthly))
T<-length(Ret_total)

#Predict 21 days ahead, could adjust different GARCH model inside loop.
#Check the output csv file whether there is 999 data in Pred. If there is, it means there is an error in that loop.
#Simply substitute i-1's data if Pred[i] is 999

#Initialise Pred so that it has one entry per row before the loop fills it
data$Pred<-NA
for (i in 1001:T){
#tryCatch returns the forecast, or 999 on error; assigning inside the handler would not change data
data$Pred[i]<-tryCatch({
Retwindow<-Ret_total[(i-1000):(i-1)]
spec = ugarchspec(variance.model=list(model="gjrGARCH"), distribution.model="std")
fit = ugarchfit(spec, Retwindow)
bootp = ugarchboot(fit, method = c("Partial", "Full")[1],n.ahead = 21, n.bootpred = 100)
bootp@forc@forecast$sigmaFor[21]
}, error=function(e){
999})
print(i)
}
write.csv(data,file='GJRGarch.csv')
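
Two details of the loop above are easy to get wrong: a value assigned inside an `error` handler is local to that handler function, and the 999 sentinel has to be replaced before the regression. A minimal sketch of both steps, where `risky()` is a made-up function that fails on some inputs:

```r
# Hypothetical stand-in for a model fit that sometimes fails
risky <- function(i) if (i %% 3 == 0) stop("fit failed") else i / 100

pred <- numeric(6)
for (i in 1:6) {
  # tryCatch returns either the value of the expression or the handler's value
  pred[i] <- tryCatch(risky(i), error = function(e) 999)
}
# pred is now 0.01 0.02 999 0.04 0.05 999

# Replace each 999 by the previous entry, as the comment above suggests
for (i in seq_along(pred)) {
  if (pred[i] == 999 && i > 1) pred[i] <- pred[i - 1]
}
print(pred)
```

Assigning the return value of `tryCatch()` keeps the update in the calling environment, so the sentinel actually reaches the output vector.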

#Regression Prepare
result<- read.csv('GJRGarch.csv')
#Pick the last day of one month as the month volatility
Month=result$Month[1001:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1001:T]
Predd<-Pred[aa]
#add last day of data as the last month volatility.
Predd<-c(Predd,Pred[length(Pred)])

#Compare with HVT30
Real<-read.csv('HVT.csv')
HVTd<-Real$HVT[2:68]/100
Preddd<-Predd[1:67]
#Multiply number of day in each month
Busday<-read.csv('busdays.csv')
Predddd<-Preddd*sqrt(Busday$busdays[2:68])

#Backtest Regression Part with HVT month
fit2<-lm(HVTd~Predddd)
summary(fit2)
plot(HVTd,type='l')
points(Predddd*fit2$coefficients[2]+fit2$coefficients[1],type='l',col='blue',lwd=2)
R Code - Natural Gas

VIX
Part 1
###Regression on HVT & the volatility of 90% put option price(VIX)
rm(list=ls())
library(readr)
VIX_Monthly <- read.csv('Natural gas moneyness.csv')
VIX_30<-VIX_Monthly $ X30DAY_IMPVOL_90.0.MNY_DF[1:117]
###Import the option moneyness data for 12/2007 to 08/2017, which is one period ahead of the historical volatility
HVT_Monthly <- read.csv('Natural gas HVT_monthly.csv')
HVT_30<-HVT_Monthly $ VOLATILITY_30D[2:118]
###Import the monthly historical volatility with period from 01/2008 to 09/2017
###run the regression correspondingly
fit1<-lm(HVT_30~VIX_30+1)
R_squared1<-summary(fit1)$adj.r.squared
print(R_squared1)
###plot and summary (the original run gave slope 1.10971 and intercept -2.67889)
plot(HVT_30,type='l')
points(coef(fit1)[1]+coef(fit1)[2]*VIX_30,type='l',col='blue')
summary(fit1)

Part 2
###Regression on HVT & the volatility of 90% put option price Absolute change
rm(list=ls())
library(readr)
VIX_Monthly <- read.csv('Natural gas moneyness.csv')
VIX_30<-VIX_Monthly $ X30DAY_IMPVOL_90.0.MNY_DF[1:117]
###Import the option moneyness data for 12/2007 to 08/2017, which is one period ahead of the historical volatility
VIX_30_change<-diff(VIX_30)
###calculate the difference for VIX_30
HVT_Monthly <- read.csv('Natural gas HVT_monthly.csv')
HVT_30<-HVT_Monthly $ VOLATILITY_30D[2:118]
###Import the monthly historical volatility with period from 01/2008 to 09/2017
HVT_30_change<- diff(HVT_30)
###calculate the difference for HVT_30
###run the regression correspondingly
fit1<-lm(HVT_30_change~VIX_30_change+1)
R_squared1<-summary(fit1)$adj.r.squared
print(R_squared1)
###plot and summary (the original run gave slope 0.8824 and intercept 0.1025)
plot(HVT_30_change,type='l')
points(coef(fit1)[1]+coef(fit1)[2]*VIX_30_change,type='l',col='blue')
summary(fit1)
Option Moneyness
rm(list=ls())
#Clear the workspace

HVT<-read.csv('Natural gas HVT_monthly.csv')


HVT_30<-HVT$VOLATILITY_30D[2:118]
#Import the monthly historical volatility with period from 01/2008 to 09/2017
T<-length(HVT_30)

moneyness<-read.csv('Natural gas moneyness.csv')[1:117,]


#Import the option moneyness data for 12/2007 to 08/2017, which lags the historical volatility by one period

lm_110<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_110.0.MNY_DF)
summary(lm_110)
#Regress the historical volatility with volatility of 110% option

lm_105<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_105.0.MNY_DF)
summary(lm_105)
#Regress the historical volatility with volatility of 105% option

lm_102.5<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_102.5.MNY_DF)
summary(lm_102.5)
#Regress the historical volatility with volatility of 102.5% option

lm_100<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_100.0.MNY_DF)
summary(lm_100)
#Regress the historical volatility with volatility of 100% option

lm_97.5<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_97.5.MNY_DF)
summary(lm_97.5)
#Regress the historical volatility with volatility of 97.5% option

lm_95<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_95.0.MNY_DF)
summary(lm_95)
#Regress the historical volatility with volatility of 95% option

lm_90<-lm(HVT_30 ~ moneyness$X30DAY_IMPVOL_90.0.MNY_DF)
summary(lm_90)
#Regress the historical volatility with volatility of 90% option

plot(HVT_30,type='l',main="Moneyness and HVT for Natural Gas")


points(moneyness$X30DAY_IMPVOL_110.0.MNY_DF,type='l',col='blue',lwd=2)
Option Moneyness (absolute change)
rm(list=ls())
#Clear the workspace

HVT<-read.csv('Natural gas HVT_monthly.csv')


HVT_30<-HVT$VOLATILITY_30D[1:118]
#Import the monthly historical volatility with period from 12/2007 to 09/2017
T<-length(HVT_30)

HVT_30_change<-diff(HVT_30)
#Calculate the absolute change of monthly historical volatility; the period is from 01/2008 to 09/2017
moneyness<-read.csv('Natural gas moneyness.csv')[1:T,]
#Import the monthly moneyness implied volatility with period from 12/2007 to 09/2017

moneyness_110_change<-diff(moneyness$X30DAY_IMPVOL_110.0.MNY_DF)
moneyness_105_change<-diff(moneyness$X30DAY_IMPVOL_105.0.MNY_DF)
moneyness_102.5_change<-diff(moneyness$X30DAY_IMPVOL_102.5.MNY_DF)
moneyness_100_change<-diff(moneyness$X30DAY_IMPVOL_100.0.MNY_DF)
moneyness_97.5_change<-diff(moneyness$X30DAY_IMPVOL_97.5.MNY_DF)
moneyness_95_change<-diff(moneyness$X30DAY_IMPVOL_95.0.MNY_DF)
moneyness_90_change<-diff(moneyness$X30DAY_IMPVOL_90.0.MNY_DF)
#Calculate the absolute change of the monthly moneyness implied volatility; the period is from 01/2008 to 09/2017

lm_110<-lm(HVT_30_change ~ moneyness_110_change)
summary(lm_110)
#Regress the change of historical volatility with that of volatility of 110% option

lm_105<-lm(HVT_30_change ~ moneyness_105_change)
summary(lm_105)
#Regress the change of historical volatility with that of volatility of 105% option

lm_102.5<-lm(HVT_30_change ~ moneyness_102.5_change)
summary(lm_102.5)
#Regress the change of historical volatility with that of volatility of 102.5% option

lm_100<-lm(HVT_30_change ~ moneyness_100_change)
summary(lm_100)
#Regress the change of historical volatility with that of volatility of 100% option

lm_97.5<-lm(HVT_30_change ~ moneyness_97.5_change)
summary(lm_97.5)
#Regress the change of historical volatility with that of volatility of 97.5% option

lm_95<-lm(HVT_30_change ~ moneyness_95_change)
summary(lm_95)
#Regress the change of historical volatility with that of volatility of 95% option

lm_90<-lm(HVT_30_change ~ moneyness_90_change)
summary(lm_90)
#Regress the change of historical volatility with that of volatility of 90% option

plot(HVT_30_change,type='l',main="Absolute change of Moneyness and HVT for Natural Gas")
points(moneyness_102.5_change,type='l',col='blue',lwd=2)
GARCH (1,1)
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)

rm(list=ls())
### Clean the R workspace

gas_daily1<-Quandl("CHRIS/CME_NG1",api_key="2T1Yy7mQwKqsGtFXKtCy",type="raw",
collapse="daily",start_date="2007-12-31",end_date="2017-07-20")
gas_daily<-gas_daily1[ nrow(gas_daily1):1, ]

#Basic data process


Ret_total<-diff(log(gas_daily$Settle))
T<-length(Ret_total)
write.csv(gas_daily,'gas_daily.csv')
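#One entry per return (the original hard-coded the length as 2392)
Pred<-numeric(T)

for (i in 1001:T){
Retwindow <- Ret_total[(i-1000):(i-1)]
fit1 <- garchFit( formula = ~garch(1, 1), data = Retwindow, trace = FALSE)
#predict() returns 21 values; a single-element assignment keeps only the first, made explicit here
Pred[i] <- predict(fit1, n.ahead=21)$standardDeviation[1]
print(i)
#predicted period: from 01/2008 to 07/2017
}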

write.csv(Pred,file='result.csv')

#Regression Prepare
result<- read.csv('gas_daily.csv')
#Pick the last day of one month as the month volatility
Month=result$Month[1001:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1001:T]
Predd<-Pred[aa]
#add last day of data as the last month volatility
Predd<-c(Predd,Pred[length(Pred)])

#Compare with HVT30


Real<-read.csv('Natural gas HVT_monthly.csv')
HVTd<-Real$VOLATILITY_30D[49:115]/100
#Multiply number of day in each month
Busday<-read.csv('busdays.csv')
Predddd<-Predd*sqrt(Busday$busdays[2:68])

#Backtest Regression Part with HVT month


fit3<-lm(HVTd~Predddd)
summary(fit3)
plot(HVTd, type='l')
points(Predddd*fit3$coefficients[2]+fit3$coefficients[1],type='l',col='blue',lwd=2)
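
The rolling scheme above refits on the previous 1000 returns at every step. The index arithmetic can be sketched with base R alone, using the sample standard deviation as a stand-in for the GARCH (1,1) forecast (an assumption made only to keep the example free of package dependencies) and a window of 5 on a simulated price path:

```r
set.seed(7)
price <- 3 * exp(cumsum(rnorm(30, sd = 0.02)))   # hypothetical price path
ret <- diff(log(price))                           # log returns, one fewer than prices
w <- 5                                            # window length (1000 in the code above)

pred <- rep(NA_real_, length(ret))
for (i in (w + 1):length(ret)) {
  window_ret <- ret[(i - w):(i - 1)]   # the w returns strictly before position i
  pred[i] <- sd(window_ret)            # stand-in for the model forecast
}
print(round(pred, 4))
```

The first `w` entries stay `NA` because no full window exists yet, which mirrors the loop starting at 1001.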
EGARCH
##import packages
rm(list=ls())
install.packages('Quandl')
install.packages('rugarch')
library(Quandl)
library(rugarch)
#tseries provides adf.test(), used below
library(tseries)

##download data
gas_daily <- Quandl("CHRIS/CME_NG1", api_key='TPywx-DUcfEE4VMynwHR',
type='raw', collapse='daily', order = 'asc', start_date="2007-11-02",end_date="2017-11-08")
rownames(gas_daily) <- gas_daily$Date
gas_daily$Date <- NULL
ret_total <- diff(log(gas_daily$Settle))
gas_hvt <- read.csv('Natural gas HVT_daily.csv', header = TRUE)
hvt <- gas_hvt[,2]
T <- length(ret_total)

## data description & testing


mean_return=mean(ret_total)
st_dev = sd(ret_total)
qqnorm(ret_total)
qqline(ret_total)
adf.test(ret_total, alternative='stationary')
acf(ret_total)
pacf(ret_total)

## estimating the model


spec <- ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit = ugarchfit(data = ret_total, spec=spec)

# iterating over the windows, and trying to find one with least Mean Squared Error(MSE)
forecast_length = c(100, 200, 300, 500, 1000)
refit_length = c(5, 10, 20, 50)
optimal_window = NULL
optimal_refit = NULL
mse = Inf
opt_forc = NULL
model = ugarchspec(variance.model = list(model='eGARCH'), mean.model =
list(armaOrder=c(0,0)), distribution.model = 'norm')
fit2 = ugarchfit(data=ret_total, spec=model)
for(i in c(1:5)){
for (j in c(1:4)){
rollforc = ugarchroll(spec=spec, data=ret_total, n.ahead = 1, forecast.length =
forecast_length[i], refit.every = refit_length[j], refit.window = c('recursive'), solver = 'hybrid',
keep.coef = TRUE)
#Sigma is kept as a numeric vector so that mean() returns a number
sigmapred = as.numeric(rollforc@forecast$density$Sigma)
error = mean((hvt[(T-forecast_length[i]):(T-1)]- sigmapred)^2)
if (error <= mse){
mse = error
optimal_window = forecast_length[i]
optimal_refit = refit_length[j]
opt_forc = rollforc}
}}
print(optimal_window)
print(optimal_refit)
print(mse)
plot(opt_forc, which=2)

#Regress the HVT on the last day of each month on the volatility predicted by EGARCH on the last day of the previous month.
result<-read.csv('result.csv', header = TRUE, sep = '\t')
colnames(result)[1] = 'Month'
Monthdiff<-diff(result$Month)
aa<-which(Monthdiff!=0)
HVT<-result$HVT[aa]
Pred<-result$Sigmapred[aa]
HVTd<-HVT[2:48]
Predd<-Pred[1:47]
fit2<-lm(HVTd~Predd)
summary(fit2)
plot(HVTd,type='l')
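#overlay uses the fitted coefficients (the original hard-coded slope 1954.292 and intercept -8.917)
points(Predd*fit2$coefficients[2]+fit2$coefficients[1],type='l',col='blue')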
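
The regression above pairs each month-end HVT with the forecast made at the end of the previous month. A sketch of that one-month lag using `head()` and `tail()` on invented numbers:

```r
hvt  <- c(40, 35, 50, 45, 38)             # hypothetical month-end historical volatility
pred <- c(.020, .018, .025, .022, .019)   # hypothetical month-end forecasts

hvt_next  <- tail(hvt, -1)    # months 2 to 5
pred_prev <- head(pred, -1)   # months 1 to 4

# Each forecast is matched with the volatility realised one month later
fit <- lm(hvt_next ~ pred_prev)
print(coef(fit))
```

Dropping the first response and the last predictor gives the same pairing as the explicit `[2:48]` and `[1:47]` index ranges, without hard-coding the series length.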
GJR-GARCH
library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(forecast)
library(dygraphs)
library(fGarch)
library(rugarch)
library(parallel)

rm(list=ls()) ### Clean the R workspace

gas_daily1<-Quandl("CHRIS/CME_NG1",api_key="2T1Yy7mQwKqsGtFXKtCy",type="raw",
collapse="daily",start_date="2007-12-31",end_date="2017-09-30")
gas_daily<-gas_daily1[ nrow(gas_daily1):1, ]

price<-gas_daily$Settle

#Basic data process


Ret_total<-diff(log(price))
T<-length(Ret_total)

#Predict 21 days ahead, could adjust different garch model inside loop.
#Check the output csv file whether there is 999 data in Pred. If there is, it means there is an error in that loop.
#Simply substitute i-1's data if Pred[i] is 999
#One entry per return (the original hard-coded the length as 1442)
pred<-numeric(T)

for (i in 1001:T){
#tryCatch returns the forecast, or 999 on error; assigning inside the handler would not change pred
pred[i]<-tryCatch({
Retwindow<-Ret_total[(i-1000):(i-1)]
spec = ugarchspec(variance.model=list(model="gjrGARCH"), distribution.model="std")
fit = ugarchfit(spec, Retwindow)
bootp = ugarchboot(fit, method = c("Partial", "Full")[1],n.ahead = 21, n.bootpred = 100)
bootp@forc@forecast$sigmaFor[21]
}, error=function(e){
999})
print(i)
}
plot(pred,type='l')
write.csv(pred,file='GJRGarch-gas.csv')
write.csv(gas_daily,file='gas-daily.csv')
#Here we divide Date into Year, Month and Day as three columns in excel manually.
result=read.csv('gas-daily.csv')
data_real=read.csv('gas vol.csv')
Real_vol=data_real$VOLATILITY_30D[50:117]/100

Month=result$Month[1002:T]
Monthdiff<-diff(Month)
aa<-which(Monthdiff!=0)
Pred<-result$Pred[1002:T]
Predd<-Pred[aa]
plot(Predd,type='l')
Busday<-read.csv('busdays.csv')
Predddd<-Predd*sqrt(Busday$busdays[2:69])

fit2<-lm(Real_vol~Predddd)
summary(fit2)
plot(Real_vol,type='l')
points(Predddd*fit2$coefficients[2]+fit2$coefficients[1],type='l',col='blue',lwd=2)
