Stochastic Models for Electricity Prices
in Alberta
by
Lei Xiong
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF MASTER OF SCIENCE
DEPARTMENT OF MATHEMATICS AND STATISTICS
CALGARY, ALBERTA
SEPTEMBER, 2004
© Lei Xiong 2004
THE UNIVERSITY OF CALGARY
FACULTY OF GRADUATE STUDIES
The undersigned certify that they have read, and recommend to the Faculty of
Graduate Studies for acceptance, a thesis entitled “Stochastic Models for Electricity
Prices in Alberta” submitted by Lei Xiong in partial fulfillment of the requirements
for the degree of MASTER OF SCIENCE.
Dr. L.P. Bos
Department of Mathematics and
Statistics
Dr. A.F. Ware
Department of Mathematics and
Statistics
Dr. G. Sick
Haskayne School of Business
Date
Abstract
This thesis investigates the modeling of electricity prices in the Canadian province of
Alberta. We model the electricity price processes as affine jump-diffusion processes,
and we exploit the transform analysis of Duffie, Pan and Singleton (1996) to develop
computationally tractable and asymptotically efficient estimators of the parameters.
We examine six mean-reverting jump-diffusion models for electricity spot prices. The
proposed models feature multiple types of jumps, or a time-varying mean and
stochastic volatility. The estimation methodologies we adopt include maximum
likelihood estimation based on the conditional characteristic function and the spectral
generalized method of moments. Extensive empirical comparisons have been conducted
via these estimation methods based on actual hourly spot electricity prices in Alberta.
Acknowledgments
I am indebted to many individuals for invaluable help and advice. While some are
acknowledged in the text, these acknowledgments do not convey the extent of the
assistance given to me.
First and foremost I would like to thank Professor Tony Ware for his support,
guidance and patience. I learned so much from him about stochastic modeling and
its application to finance. His guidance was invaluable in this thesis.
I am extremely grateful to Professor Len Bos for his incredible patience and
boundless encouragement. His comments and helpful suggestions allowed me to
greatly improve my work.
I also wish to acknowledge Professor Gordon Sick for his comments and corrections,
which made this thesis more satisfactory.
This thesis evolved out of a project sponsored and partially funded by NEXEN
and by the Networks of Centers of Excellence for the Mathematics of Information
Technology and Complex Systems (MITACS). I would also like to thank Professor
Ali Lari-Lavassani for the opportunity to be involved in this project and for his
generous support.
A special thanks goes to Professor Peter Zvengrowski, who was always friendly and
helpful in editing my thesis. I am also grateful to my collaborator Zhiyong Xu for his
sincere assistance in the project.
My deepest appreciation goes to my parents. They have always provided generously,
not only financially, but also with their unconditional and endless love.
Finally, and by far most importantly, I thank my husband Rui. Without his
constant encouragement and love I would never have made it to this point. His
silent but unwavering support has given me the strength to overcome the difficult
hurdles along the way.
Table of Contents
Approval Page
Abstract
Acknowledgments
Table of Contents
1 Introduction
2 Background
2.1 The Alberta Electricity Market
2.2 Models
2.2.1 Geometric Brownian Motion Process
2.2.2 Ornstein-Uhlenbeck Process
2.2.3 Jump-diffusion model
2.2.4 GARCH
2.2.5 Extensions to Models
3 Stochastic Models for Electricity Prices
3.1 Affine Jump-Diffusion Process
3.2 Model 1a
3.3 Model 1b
3.4 Model 2a
3.5 Model 2b
3.6 Model 3a
3.7 Model 3b
4 Parameter Estimation
4.1 Introduction
4.2 ML-CCF Estimators
4.3 ML-MCCF Estimators
4.4 Spectral GMM Estimators
5 Model Comparison
5.1 Data Description
5.2 Calibration
5.3 Goodness of Fit
5.4 Robustness Test
5.5 Descriptive Statistics of Empirical vs. Calibrated Hourly Returns
5.6 Quantile and Quantile plot
5.7 Simulation Study
5.8 Conclusion
6 Conclusion
Bibliography
List of Tables
5.1 Descriptive statistics of Hourly EP
5.2 Descriptive statistics of Peak EP
5.3 Descriptive statistics of Off-peak EP
5.4 Descriptive statistics of deseasonalized Hourly EP
5.5 Hourly EP parameters values for Model 1a
5.6 Hourly EP parameters values for Model 2a
5.7 Hourly EP parameters values for Model 1b
5.8 Hourly EP parameters values for Model 2b
5.9 Hourly EP parameters values for Model 3a
5.10 Hourly EP parameters values for Model 3b
5.11 Bias for Model 3a
5.12 Bias for Model 3b
5.13 Hourly EP parameters values for Model 3a (ρ = −1)
5.14 Hourly EP parameters values for Model 3b (ρ = −1)
5.15 Likelihood ratio statistics fitting on Hourly EP
5.16 Schwarz criteria statistics for Models
5.17 Robustness test for Model 1a
5.18 Robustness test for Model 1b
5.19 Robustness test for Model 2a
5.20 Robustness test for Model 2b
5.21 Empirical results vs. theoretical results for Model 1a
5.22 Empirical results vs. theoretical results for Model 2a
5.23 Empirical results vs. theoretical results for Model 1b
5.24 Empirical results vs. theoretical results for Model 2b
5.25 Descriptive statistics of Peak EP and simulated paths
5.26 Descriptive statistics of Peak EP and simulated paths
List of Figures
2.1 Plot of the hourly electricity prices
2.2 Empirical histogram of log returns with normal density superimposed
2.3 Sample plots of the on/off-peak prices together with the histogram of log returns
2.4 Average weekday hourly prices by season
2.5 A sample of hourly electricity prices for two weeks
2.6 The moving average electricity prices vs. their variance
5.1 Histogram of the log returns of Hourly EP
5.2 Histogram of the changes of the deseasonalized data (dX)
5.3 Results from optimization (Model 1a)
5.4 Results from optimization (Model 2a)
5.5 Results from optimization (Model 1b)
5.6 Results from optimization (Model 2b)
5.7 Results from optimization (Model 3a)
5.8 Results from optimization (Model 3b)
5.9 QQ Plots
5.10 Hourly EP superimposed by simulated paths (Model 1a)
5.11 Hourly EP superimposed by simulated paths (Model 1b)
5.12 Hourly EP superimposed by simulated paths (Model 2a)
5.13 Hourly EP superimposed by simulated paths (Model 2b)
5.14 Comparison of simulated price processes with Hourly EP (Model 1a)
5.15 Comparison of simulated price processes with Hourly EP (Model 1b)
5.16 Comparison of simulated price processes with Hourly EP (Model 2a)
5.17 Comparison of simulated price processes with Hourly EP (Model 2b)
5.18 Peak EP superimposed by simulated paths (Model 1a)
5.19 Peak EP superimposed by simulation paths (Model 1b)
5.20 Off-peak EP superimposed by simulation paths (Model 1a)
5.21 Off-peak EP superimposed by simulation paths (Model 1b)
5.22 Comparison of simulated price processes with Peak EP (Model 1a)
5.23 Comparison of simulated price processes with Peak EP (Model 1b)
5.24 Comparison of simulated price processes with Off-peak EP (Model 1a)
5.25 Comparison of simulated price processes with Off-peak EP (Model 1b)
Chapter 1
Introduction
Over the last decade, the structure of the North American electricity markets has
undergone radical changes. Electricity markets are experiencing rapid deregulation
and restructuring, especially with regard to generation and supply. Electricity prices
are no longer controlled by regulators and are now essentially determined by supply
and demand. At the same time, the deregulation and restructuring of electricity
markets has paved the way for a considerable amount of trading activity, which
provides utilities with new opportunities but also with more competition. The
resulting spot prices are crucial for the valuation of physical assets and derivatives
and, more generally, for the risk management of utilities. The recent experience of
the state of California has highlighted the importance of understanding electricity
price behaviour (see [1]). While there has been a significant amount of research on
commodity prices, empirical studies of electricity prices are not yet thoroughly
developed (see [26]). Modeling these prices is a very challenging task because of
their erratic nature. Furthermore, most studies have focused on the U.K. and U.S.
electricity markets.
In this thesis, we primarily study the Alberta Electricity Market. Instead of using
the expected future value of the electricity price as the underlying process, it seems
better to use electricity spot prices directly to study the dynamics of the electricity
market. Introducing a convenience yield, a non-observable risk factor, as discussed
in Eydeland and Hélyette (1998) (see [2]), does not really make sense for electricity:
since there is no available technique to store power (except for hydro), holding the
underlying asset does not help us. Moreover, in certain regions, electricity spot
markets are relatively more liquid than the corresponding futures markets, especially
for electricity futures with maturity beyond 12 months (see [3]). The spot price
process by itself should contain most of the fundamental properties of electricity
(see [2]). We represent electricity spot prices on a natural logarithmic scale, which
stabilizes the statistical estimation procedure and ensures strict positivity of
sample paths.¹
We begin by examining the empirical behaviour of electricity spot prices in the
Alberta Electricity Market in Chapter 2. This examination leads us to believe that
the class of mean-reverting jump-diffusion models is well suited to capturing the
characteristics of electricity prices in the Alberta Electricity Market. Some typical
models that have been used to explain the dynamics of electricity prices are also
discussed in Chapter 2.
The asset price models we consider all fall into the class of affine processes, which
are given a general treatment in [5]. In Chapter 3, we outline the relevant features
of this affine framework. As discussed by various authors (see [6], [7], [8]), affine
processes can be applied to multiple types of processes, such as those with jumps,
time-varying mean, and stochastic volatility, and even to more general structures like
latent variables or time-varying jumps, without sacrificing computational tractability.
Therefore, affine processes have great advantages for econometric work.
We will start with a one-factor model based on the logarithm of the spot prices.
This model will be considered and extended in subsequent sections of the thesis.
We will deal with a one-jump version and a two-jump version of the process. The
one-jump version was first proposed by Das and Foresi (see [9]). Its jump component
has an exponentially distributed absolute jump size, with the sign of the jump
determined by a Bernoulli variable, while the two-jump version assumes that there
are asymmetric upward and downward jumps. This type of process was first considered
in Duffie, Pan and Singleton (1998) and adopted by Deng (1998) and Villaplana (2003)
to mimic the “spikes” in the electricity price process (see [5], [10], [11]).
¹ Up to now, negative electricity prices have rarely been observed; see [4].
We also have two different specifications for the drift term:
• Deterministic long-term mean,
• Time-varying long-term mean incorporating on-peak and off-peak effects.
Furthermore, we also have two different specifications for the diffusion term:
• Deterministic volatility,
• Stochastic volatility, which is itself a mean-reverting square root process.
Specifically, the following alternative mean-reverting jump-diffusion (MRJD) models
have been explored.
• Model 1a: one-jump version of MRJD,
• Model 1b: two-jump version of MRJD,
• Model 2a: one-jump version of MRJD with time-varying long-term mean,
• Model 2b: two-jump version of MRJD with time-varying long-term mean,
• Model 3a: one-jump version of MRJD with stochastic volatility,
• Model 3b: two-jump version of MRJD with stochastic volatility.
Affine jump-diffusion processes have analytic solutions for the conditional
characteristic function associated with the conditional density function. This implies
that one can obtain the conditional densities via Fourier inversion or other methods.
The computations of those conditional characteristic functions (CCF) are introduced
in Chapter 3.
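As a minimal sketch of the Fourier inversion step, the Python snippet below recovers a density numerically from a characteristic function. The Gaussian characteristic function used as a test case, the function names, the truncation point `u_max` and the grid size are illustrative assumptions and are not taken from the thesis; the CCFs of the six models themselves are derived in Chapter 3.

```python
import numpy as np


def density_from_cf(cf, x, u_max=40.0, n=4001):
    """Recover a density at the points x from its characteristic function cf(u).

    Uses f(x) = (1/pi) * integral_0^inf Re[exp(-i*u*x) * cf(u)] du, truncated at
    u_max (an assumed cut-off) and evaluated with the trapezoidal rule.
    """
    u = np.linspace(0.0, u_max, n)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # One row per evaluation point x, one column per frequency u.
    integrand = np.real(np.exp(-1j * np.outer(x, u)) * cf(u))
    du = u[1] - u[0]
    integral = du * (integrand.sum(axis=1)
                     - 0.5 * (integrand[:, 0] + integrand[:, -1]))
    return integral / np.pi


def gaussian_cf(u, mean=0.1, var=0.25):
    """Characteristic function of N(mean, var); a stand-in for a model CCF."""
    return np.exp(1j * u * mean - 0.5 * var * u ** 2)
```

The Gaussian case is convenient only because its density is known in closed form, so the inversion can be checked; for a model CCF one would pass a different `cf` and would probably need to revisit `u_max` and the grid size.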
Chapter 4 exploits the CCF to develop computationally tractable estimators of
the parameters. Specifically, we use the Fourier transform of the CCF to derive
the conditional log-likelihood function. When all the state variables are observable,
we can use the standard maximum likelihood (ML) approach by maximizing this
likelihood function. This yields the asymptotically efficient ML-CCF estimator (see
[12]). But the computation cost will grow rapidly as the number of state variables
increases.
When there are unobservable state variables (latent variables), the ML-CCF
estimator cannot be constructed directly. Following Chacko and Viceira (1999) (see
[13]), we obtain a so-called marginal CCF by integrating the latent variable out of
the CCF. Based on this marginal conditional characteristic function (MCCF), we
can carry out the ML-MCCF estimation. The MCCF can also be used to derive
closed-form expressions for the so-called conditional moments, from which one can
construct the spectral generalized method of moments (SGMM) estimators introduced
by Chacko and Viceira (1999) (see [13]). However, the estimators based on the MCCF
are less efficient than other parameter estimation methods that have been used in
stochastic volatility models. There is a trade-off between computational ease and
efficiency.
Finally, extensive empirical comparisons among the models are presented in
Chapter 5. The data used are the hourly electricity price series for the province
of Alberta from January 1, 2002 to March 31, 2004.
Chapter 2
Background
2.1 The Alberta Electricity Market
As pointed out by Daniel et al. [14], “Alberta is somewhat unique in Canada in that
there has never existed a single, vertically integrated Crown (that is,
government-owned) monopoly serving the electricity needs of the province.” Generation
and retail services are open to competition, which strongly resembles the U.S.
industry. However, transmission and distribution services are still under government
regulation, which can be construed as a “Canadian” characteristic. The Alberta
Electricity Market is relatively small. It has about 12,400 MW of installed generating
capacity, including 11,500 MW in the province’s integrated electrical system and
access to 950 MW from B.C. and Saskatchewan. Thermal sources account for the
majority of Alberta’s installed generating capacity. The remainder is hydro and wind
(see [15]). Restructuring of the electricity industry was first broached in Alberta in
the early 1990s, and deregulation began on January 1, 1996. Alberta has benefitted
from more than 3,000 MW of new generation since 1998, an increase of about 30%
in capacity. The experience of the Alberta Electricity Market reflects many aspects
involved in deregulation and restructuring, and provides insights for other electricity
markets.
Power generated in Alberta is exchanged through the Alberta Electric System
Operator (AESO).¹ The AESO is also responsible for dispatching all electric power
generation in Alberta and directing the operation of Alberta’s electricity network
to ensure reliable and economic systems. The Alberta Energy and Utilities Board
(EUB) provides governance and direction of the AESO.
The System Coordination Centre is the heart of Alberta’s wholesale real-time
electricity market. The System Controllers are responsible for the real-time
operations in this market. They dispatch electricity to meet demand and monitor the
status of the electric system through the Energy Management System (EMS). The
Energy Trading System (ETS), which enables both spot and forward electricity
markets, facilitates the real-time wholesale electricity market.
All trading of power through the AESO is initiated by a process of offers and
bids. The AESO then establishes a “merit order”² to meet forecast pool demand
by ranking offers and bids from lowest cost to highest cost for each hour of the day.
The System Controllers dispatch offers and bids in merit order to keep supply and
demand in balance at the lowest cost. The last bid or offer dispatched in each minute
sets the System Marginal Price (SMP). At the end of the hour, a time-weighted
average of these marginal prices is calculated as the market price for that hour.
There are no differential transmission charges for locations in Alberta (see [16]).
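The time-weighted averaging of the System Marginal Price can be sketched in a few lines of Python. The input layout (a list of `(start_minute, price)` pairs, each price holding until the next change) and the function name are assumptions made for illustration; they are not the AESO's data format or settlement procedure.

```python
def hourly_pool_price(smp_changes, hour_minutes=60):
    """Time-weighted average of the System Marginal Price over one hour.

    smp_changes: list of (start_minute, price) pairs sorted by start_minute,
    with the first pair starting at minute 0 (hypothetical layout). Each price
    is assumed to hold until the next change or the end of the hour.
    """
    total = 0.0
    for k, (start, price) in enumerate(smp_changes):
        # The interval ends at the next change, or at the end of the hour.
        end = smp_changes[k + 1][0] if k + 1 < len(smp_changes) else hour_minutes
        total += price * (end - start)
    return total / hour_minutes
```

For example, an SMP of $30 for the first 20 minutes, $45 for the next 30 and $120 for the last 10 gives (30·20 + 45·30 + 120·10)/60 = $52.50/MWH, so a short spike moves the hourly price far less than its peak value might suggest.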
Figure 2.1 illustrates an example of hourly prices, which are measured in dollars
per megawatt hour ($/MWH, Canadian dollars being the monetary unit here and in
the remainder of this thesis), during the period January 1, 2002 to March 31, 2004
(19,704 observations). Figure 2.2 presents an empirical histogram of the log returns³
of the hourly electricity prices over the same period, overlaid with a normal density
curve. Clearly, the distribution of the log returns is not normal, because it has
slimmer tails and a higher peak at the mean than the normal distribution.
Figure 2.1: Plot of the hourly electricity prices
[Plot omitted. Title: Hourly Electricity Prices for Jan 1, 2002 to Mar 31, 2004;
horizontal axis: Time (2002 to 2004); vertical axis: $/MWH.]
Figure 2.2: Empirical histogram of log returns with normal density superimposed
[Plot omitted.]
Note: the histogram shows the log returns of the hourly electricity prices; the
overlaid solid line is a normal density with the same mean and variance as the log
returns.
¹ The Alberta Electric System Operator brings together two former entities: the
Power Pool of Alberta and the Transmission Administrator of Alberta.
² For more details, visit www.aeso.ca.
³ The log return of the price process $P$ at time $t$ is defined in this thesis as
$dX_t = X_{t+1} - X_t$, with $X_t = \log P_t$.
The day is divided into on-peak and off-peak periods to assist firms in managing
electricity consumption and in taking advantage of lower rates in off-peak hours,
which will in turn help conserve energy and enhance available supply. According to
the Alberta Energy and Utilities Board, the on-peak period is from 08:00 to 21:00
Monday through Friday inclusive, with the exception of statutory holidays.⁴ The
remaining hours from Monday through Sunday are off-peak hours. Figure 2.3 includes
sample plots of the on-peak and off-peak daily average electricity prices from
January 1, 2002 to March 31, 2004, together with histograms of the log returns with
a normal density superimposed.
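A minimal Python sketch of the on-peak rule follows. Two points are assumptions rather than statements from the thesis: the hour is identified by its starting time, so that "08:00 to 21:00" is read as hours beginning at 8 through 20, and the statutory holidays are passed in by the caller as a set of dates. The function name is hypothetical.

```python
from datetime import date, datetime


def is_on_peak(ts, holidays=frozenset()):
    """Return True if the hour starting at `ts` is on-peak.

    Assumed reading of the rule: Monday to Friday, not a statutory holiday,
    and 8 <= starting hour < 21. `holidays` is a caller-supplied set of dates.
    """
    if ts.weekday() >= 5:      # Saturday (5) or Sunday (6)
        return False
    if ts.date() in holidays:  # statutory holidays are off-peak all day
        return False
    return 8 <= ts.hour < 21
```

If the price data are stamped by hour ending rather than hour beginning, the inequality would need to be shifted by one hour.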
⁴ The statutory holidays in Alberta are New Year’s Day, Family Day (3rd Monday of
February), Good Friday, Victoria Day, Canada Day, Labour Day, Thanksgiving Day,
Remembrance Day, Christmas Day.
Figure 2.3: Sample plots of the on/off-peak prices together with the histogram of
log returns
[Plots omitted. Panels: Log of Price (On-Peak and Off-Peak, 2002 to 2004);
Histogram for the Log-Return (On-Peak and Off-Peak).]
Note: the upper left plot is the sample plot of the logarithm of the on-peak daily
average prices. The upper right plot is the histogram for the log returns of the
on-peak daily average. Similarly, the lower left plot is the sample plot of the
logarithm of the off-peak daily average prices. The lower right plot is the histogram
for the log returns of the off-peak daily average. Both histograms illustrate the
deviation from normality. Notice that the differences between the log of the on-peak
average prices and the log of the off-peak average prices are not significant.
Like electricity prices in other markets, electricity prices in the Alberta market
display the following distinct characteristics (see [17]):
1. Pronounced cyclical effects. Electricity exhibits the most complicated cyclical
patterns of all energy commodities. Normally, there are peaks reflecting heating
and cooling needs. Additionally, prices have hourly, daily, intra-daily, and
weekly patterns. Figure 2.4 presents the normalized average weekday hourly
electricity prices for each season from the year 2000 to the year 2003.⁵ It may
appear that temperatures are not a very significant factor in Alberta, since it is
hard to discern significantly different price behaviour across seasons, especially
from about 8:00 p.m. to 8:00 a.m., over this period. Notice that within a time
span of 24 hours, prices increase as demand increases, with a distinct hourly
pattern. Prices start to rise at around 5:00 a.m., as people wake up and the work
hours begin. They continue to increase throughout the day as demand increases,
peaking at around 5:00 p.m. After the work hours, demand shifts to primarily
residential usage and prices begin to decrease. Figure 2.5 illustrates the daily
usage pattern more clearly.⁶ Prices are measured in dollars per megawatt hour
($/MWH) on the left vertical axis and demand is measured in megawatts (MW) on
the right vertical axis. It is apparent that prices, in some sense, mimic demand:
electricity prices are normally higher when demand is greater.
2. Sizable price “spikes”. Electricity is non-storable (other than hydro) and is
more affected by transmission constraints, seasonality and weather. As shocks
in demand and supply cannot be smoothed, electricity spot prices are extremely
volatile and occasionally reach extremely high levels, commonly known as
“spikes”. This is the most dramatic feature of Figure 2.1. We see one (or several)
upward jumps shortly followed by a steep downward move. It is not uncommon
for electricity prices to fluctuate by more than $700/MWH. One consequence of
“spikes” is the presence of so-called “fat tails”, i.e., the probability of a very
large positive or negative change, though small, is much larger than permitted
by a normal distribution. This can be measured by kurtosis.
3. Mean reversion. This has been well documented as an important characteristic
of electricity prices. It is also quite clear in Figure 2.5. The price oscillates
around the mean level and gets pulled back to this mean level rapidly after a
spike. This phenomenon in particular gives rise to a high mean reversion rate.
4. Price-dependent variance. There is empirical evidence suggesting that the
volatility of electricity prices is high when the aggregate demand is high and
vice versa.⁷ Figure 2.6 plots the 168-hour moving average prices⁸ from
Jan 1, 2002 to Mar 31, 2004 versus the variance of the corresponding 168-hour
electricity prices. The prices are measured in dollars per megawatt hour
($/MWH) on the horizontal axis. This plot clearly exhibits a significant upward
trend, indicating that the variance is price-dependent.
5. Non-negativity. As it costs money to produce electricity, electricity prices are
normally positive although sometimes they are close to zero, as shown in Figure
2.1.
⁵ In normalizing, we adjust each average weekday hourly price by dividing by the
year’s average hourly price.
⁶ This is similar to the investigation of the hourly electricity prices from
California, see [26].
Figure 2.4: Average weekday hourly prices by season
[Plot omitted. Horizontal axis: Hour; vertical axis: $/MWH; legend: Spring (Mar 20
to Jun 20), Summer (Jun 21 to Sep 22), Fall (Sep 23 to Dec 21), Winter (Dec 22 to
Mar 19).]
Note: from this plot, it is hard to discern significantly different price behaviour,
especially from about 8:00 p.m. to 8:00 a.m., in different seasons in Alberta from
the year 2000 to the year 2003.
Figure 2.5: A sample of hourly electricity prices for two weeks
[Plot omitted. Horizontal axis: days over two weeks (Tue to Tue); left vertical axis:
Power Price ($/MWH); right vertical axis: Demand (MW); legend: Power Price, Demand,
Mean Level of Price.]
Note: this is a plot of sample hourly electricity prices together with the
corresponding demand for two weeks.
Figure 2.6: The moving average electricity prices vs. their variance
[Plot omitted. Log-log axes; horizontal axis: Power Price ($/MWH); vertical axis:
Variance.]
Note: this is a log-log scale plot of the 168-hour moving average electricity prices
versus the variance of the corresponding 168-hour electricity prices. Electricity
prices are roughly proportional to the square root of their variance.
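The relationship between the 168-hour moving average price and the variance over the same window, as plotted in Figure 2.6, can be explored with a short Python sketch. The function names, the use of the sample variance (`ddof=1`) and the least-squares slope on log-log axes are illustrative assumptions, not the procedure used in the thesis. A slope near 2 would correspond to prices being roughly proportional to the square root of their variance.

```python
import numpy as np


def rolling_mean_and_variance(prices, window=168):
    """Moving average and variance of prices over windows of `window` hours.

    Returns one value per window start, i.e. len(prices) - window + 1 values
    (one more than the index range used in the thesis footnote).
    """
    p = np.asarray(prices, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(p, window)
    # ddof=1 (sample variance) is an assumption; the thesis only writes var(.).
    return windows.mean(axis=1), windows.var(axis=1, ddof=1)


def loglog_slope(mean, var):
    """Least-squares slope of log(variance) against log(moving average)."""
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(var, dtype=float)
    mask = (mean > 0) & (var > 0)  # logarithms need strictly positive values
    slope, _intercept = np.polyfit(np.log(mean[mask]), np.log(var[mask]), 1)
    return slope
```

Calling `loglog_slope(*rolling_mean_and_variance(hourly_prices))` on an hourly price series gives a single number summarizing the upward trend visible in the figure.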
2.2 Models
Common asset price models represent and forecast the electricity price process
poorly because they fail to capture the erratic nature of electricity prices
(see [10], [17], [20]). Thus, modeling the price behaviour of electricity is a very
challenging task for researchers and practitioners. Some typical models that have
been used to explain the dynamics of electricity prices are discussed below.
⁷ In general, volatility is a statistical measure of the tendency of a market or
security to rise or fall sharply within a short period of time. Volatility is
typically calculated by using the variance or annualized standard deviation of the
prices or log returns (see [19]).
⁸ For the spot prices $P_t$, $t = 1, \dots, N$, the 168-hour moving average price is
$M_t = \sum_{i=0}^{167} P_{t+i}/168$, $t = 1, \dots, N-168$, and
$\sigma_t^2 = \operatorname{var}(\{P_{t+i}\}_{i=0}^{167})$.
2.2.1 Geometric Brownian Motion Process
The geometric Brownian motion process forms the basis for the “Black-Scholes-Merton”
option pricing model and can be written as the following stochastic differential
equation (SDE):
$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t. \tag{2.1}$$
Here $dS_t$ is the stochastic increment over an infinitesimal time interval $dt$. The
drift $\mu$ and the diffusion coefficient $\sigma$ are unknown constants, and $dW_t$
represents an increment of a standard Brownian motion $W_t$, which has zero mean and
variance $dt$. This model assumes that prices are lognormal or, equivalently, that the
log returns are normally distributed. It is relatively simple to calibrate by maximum
likelihood-based procedures and is very popular in non-energy markets (see [21]).
Likaa (2001) studied this model for the Alberta Electricity Market and found that it
failed to predict accurately the prices of electricity due to the lack of liquidity in
the Alberta Electricity Market (see [18]).
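To make the calibration remark concrete, here is a minimal Python sketch of maximum likelihood estimation for the geometric Brownian motion model. It relies on the standard fact that, under (2.1), log returns over a step $\Delta t$ are independent and normal with mean $(\mu - \sigma^2/2)\Delta t$ and variance $\sigma^2 \Delta t$. The function name and the default time step are assumptions for illustration; this is not the procedure used in [18] or [21].

```python
import numpy as np


def fit_gbm(prices, dt=1.0):
    """Maximum likelihood estimates (mu, sigma) for geometric Brownian motion.

    prices: strictly positive observations at equally spaced times, dt apart.
    """
    r = np.diff(np.log(np.asarray(prices, dtype=float)))  # log returns
    sigma2 = r.var() / dt               # ML variance estimate (divides by n)
    mu = r.mean() / dt + 0.5 * sigma2   # undo the -sigma^2/2 drift correction
    return mu, np.sqrt(sigma2)
```

Because the fitted returns are Gaussian by construction, a model calibrated this way cannot reproduce the spikes and fat tails described in Section 2.1.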
2.2.2 Ornstein-Uhlenbeck Process
Lucia and Schwartz (2000) examined the Nordpool market in terms of a totally
predictable deterministic component and an Ornstein-Uhlenbeck process based on
spot prices or log spot prices (see [22]). If the deterministic part is constant, then
their model is a constant-mean reverting process. Mean-reverting diffusion models
have been widely adopted to model financial time series, beginning with Vasicek
(1977). Likaa (2001) also applied these models to the Alberta Electricity Market
(see [18]).
One typical example of this type of continuous-time model specifies prices as
$$dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t, \tag{2.2}$$
where $X_t$ is the logarithm of the electricity price at time $t$, with initial
condition $X_0$, and $\kappa$ is the mean reversion rate, which determines how fast
the process returns to the long-term mean level $\alpha$. If the spot price is higher
than $e^{\alpha}$, then the drift $\kappa(\alpha - X_t)$ is negative. This brings the
process back towards the long-term mean level, and similarly if the spot price is
lower than $e^{\alpha}$. This model is also relatively easy to calibrate by maximum
likelihood-based procedures (see [21], [17], [23]).
If we write equation (2.2) in integral form as
$$X_t = e^{-\kappa t} X_0 + \alpha\left(1 - e^{-\kappa t}\right) + \int_0^t e^{\kappa(s-t)}\,\sigma\,dW_s, \tag{2.3}$$
we obtain a first-order autoregressive model (AR(1)) in continuous time, and the
parameters can be estimated by linear regression analysis (see [18]).
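The regression idea can be sketched in Python. Sampling (2.3) at equally spaced times $\Delta$ apart gives $X_{t+\Delta} = a + b X_t + \varepsilon_t$ with $b = e^{-\kappa\Delta}$, $a = \alpha(1 - b)$ and $\operatorname{Var}(\varepsilon_t) = \sigma^2(1 - b^2)/(2\kappa)$, so ordinary least squares on consecutive observations recovers the parameters. The function names, the simulation routine and the parameter values in the test are illustrative assumptions, not the thesis's code or estimates.

```python
import numpy as np


def simulate_ou(x0, kappa, alpha, sigma, dt, n, rng):
    """Simulate n steps of the mean-reverting process (2.2) using its exact
    Gaussian transition over a step of length dt."""
    b = np.exp(-kappa * dt)
    sd = sigma * np.sqrt((1.0 - b ** 2) / (2.0 * kappa))
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        x[k + 1] = alpha + (x[k] - alpha) * b + sd * rng.standard_normal()
    return x


def fit_ou(x, dt):
    """Estimate (kappa, alpha, sigma) by regressing X_{t+dt} on X_t.

    Assumes the fitted slope lies strictly between 0 and 1.
    """
    x = np.asarray(x, dtype=float)
    b, a = np.polyfit(x[:-1], x[1:], 1)   # slope b, intercept a
    kappa = -np.log(b) / dt
    alpha = a / (1.0 - b)
    resid = x[1:] - (a + b * x[:-1])
    sigma = np.sqrt(resid.var() * 2.0 * kappa / (1.0 - b ** 2))
    return kappa, alpha, sigma
```

On data with spikes, the regression residuals are far from Gaussian, which is one way to see the limitation of pure mean-reverting diffusions.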
Mean-reverting diffusion models are widely used in energy markets, but in the
case of electricity, we should allow for the possibility of more upside departures than
downside ones and more price “spikes” in one direction than the opposite direction,
which is not captured by pure mean-reverting processes (see [23]). Not surprisingly,
these models failed to give a reasonable prediction of electricity prices of the Alberta
Electricity Market due to their limiting assumptions (see [18]).
2.2.3 Jumpdiﬀusion model
Deng, et al.(1998), Clewlow and Strickland (1999), and Barz (1999) examined a broad
class of stochastic models which can be used to model the behaviour of electricity
prices, and performed empirical studies based on the empirical data. They found
that models with meanreversion and jumps are particularly suitable for modeling
electricity spot price processes (see [17], [24], [23]).
Indeed, many empirical studies in the literature demonstrate the presence of jumps as a significant feature in the behaviour of financial time series. Consequently, jump-diffusion models arise frequently in the financial literature. One famous example is Merton’s (1976) option pricing model, a mixed model of a continuous Brownian motion and a Poisson-driven, lognormally distributed jump component. In practice, besides the normal distribution, we can also have the uniform distribution (see [25]) and the exponential distribution (see [9]), including some very close variations, as qualified jump-size choices. The jump intensity can also be extended to vary with the time of day and season (see [26]).
For most jump-diffusion models, the parameters are estimated by Maximum Likelihood (ML) estimation. ML estimation is a powerful and general method of estimating the parameters of a stochastic process when one has an analytical form for the probability density function. Ball and Torous (1985) successfully used ML estimation to estimate parameters of jump-diffusion models for NYSE stock prices. But parameter estimation in jump-diffusion models is not as easy as it may appear to be (see [27], [28]). The major caveat is that the probability density function for jump-diffusion processes cannot be determined explicitly for most models, as it depends on the timing of jumps as well as their size. Moreover, the combination of the distributions of the Brownian motion and the jump component often cannot be computed analytically (see [29], [30]).
Let us take the discretized Merton model as an example,^9
$$\Delta X_t = \left(\alpha - \tfrac{1}{2}\sigma^2\right)\Delta t + \sigma\,\Delta W_t + J_t\,\Delta P_t, \tag{2.4}$$
where $\Delta X_t$ is the change of $X_t$ during the time interval $\Delta t$, $\alpha$ is the long-term mean, $\sigma$ is the volatility, and $W_t$ is a Brownian motion. Using the symbol $\sim$ to denote the distribution of a random variable, one has $\Delta W_t \sim N(0, \Delta t)$ and the jump amplitude $J_t \sim N(\mu, \delta^2)$, where $N(\mu, \delta^2)$ denotes a normal distribution with mean $\mu$ and standard deviation $\delta$. Furthermore, $\Delta P_t$ is a Bernoulli process with parameter $\lambda\Delta t$. Assuming that the two normal processes are independent, the probability density function for $\Delta X_t$ (denoted $f(Y_t; \theta)$, where $Y_t = \Delta X_t$, with parameter space $\Theta = \{(\mu_1, \sigma_1, \mu_2, \sigma_2, \lambda)\}$ in the Merton model) turns out to have the form:^10
^9 This model is also referred to as the Bernoulli diffusion model and is discussed by Ball and Torous (1983) (see [27]). The Poisson process is approximately modeled by a Bernoulli process. For a detailed discussion, see [27].
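$$f(Y_t;\theta) = (2\pi)^{-1/2}\left[a\,\sigma_1^{-1}\exp\!\left(-\frac{(Y_t-\mu_1)^2}{2\sigma_1^2}\right) + (1-a)\,\sigma_2^{-1}\exp\!\left(-\frac{(Y_t-\mu_2)^2}{2\sigma_2^2}\right)\right], \quad \theta\in\Theta, \tag{2.5}$$
where $a = 1 - \lambda\Delta t$, $\mu_1 = (\alpha - \tfrac{1}{2}\sigma^2)\Delta t$, $\sigma_1^2 = \sigma^2\Delta t$, $\mu_2 = (\alpha - \tfrac{1}{2}\sigma^2)\Delta t + \mu$, and $\sigma_2^2 = \sigma^2\Delta t + \delta^2$.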
Suppose we have a sequence of $T$ observations of $X_t$ sampled at times $t = 1, \dots, T$, so that $\Delta t = 1$. The joint density function $f(\cdot)$ of the sample is given by
$$f(Y_1,\dots,Y_{T-1};\theta) = \prod_{t=1}^{T-1}(2\pi)^{-1/2}\left[a\,\sigma_1^{-1}\exp\!\left(-\frac{(Y_t-\mu_1)^2}{2\sigma_1^2}\right) + (1-a)\,\sigma_2^{-1}\exp\!\left(-\frac{(Y_t-\mu_2)^2}{2\sigma_2^2}\right)\right] = \prod_{t=1}^{T-1} f(Y_t;\theta), \quad \theta\in\Theta, \tag{2.6}$$
with $Y_t = X_{t+1} - X_t$. Then the logarithm of the likelihood function, $L(Y_1,\dots,Y_{T-1};\theta)$, can be written as
$$L(Y_1,\dots,Y_{T-1};\theta) = \sum_{t=1}^{T-1}\ln f(Y_t;\theta). \tag{2.7}$$
^10 For more details, see [30].
However, if $\mu_1 = Y_t$ for some $t$, then $f(Y_t;\theta)$ increases without bound as $\sigma_1$ goes to zero. Moreover, $f$ is not zero at the other observations ($\mu_1 \neq Y_t$) because of the second term of $f$. Thus $L$ is unbounded and we cannot use standard ML estimation to estimate the parameters (see [32]).^11 Therefore, ML-based estimation can be applied to only a handful of jump-diffusion models (see [31]), such as those that fall into the class of affine processes, which we discuss in more detail later on.
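The unboundedness of the likelihood can be seen numerically. The sketch below evaluates the two-component normal mixture density and its log-likelihood on a handful of made-up increments (the data, the jump parameters and the grid of $\sigma_1$ values are all illustrative assumptions), with $\mu_1$ pinned to the first observation while $\sigma_1$ shrinks.

```python
import math


def mixture_density(y, mu1, sigma1, mu2, sigma2, lam, dt=1.0):
    """Two-component normal mixture density with weight a = 1 - lam*dt on the no-jump term."""
    a = 1.0 - lam * dt
    c = 1.0 / math.sqrt(2.0 * math.pi)
    no_jump = a / sigma1 * math.exp(-((y - mu1) ** 2) / (2.0 * sigma1 ** 2))
    jump = (1.0 - a) / sigma2 * math.exp(-((y - mu2) ** 2) / (2.0 * sigma2 ** 2))
    return c * (no_jump + jump)


def log_likelihood(ys, mu1, sigma1, mu2, sigma2, lam):
    """Sum of log densities over the increments ys."""
    return sum(math.log(mixture_density(y, mu1, sigma1, mu2, sigma2, lam)) for y in ys)


# Hypothetical increments; mu1 is set equal to the first observation.
ys = [0.10, -0.30, 0.25, 0.05, -0.15]
sigma1_grid = (1e-3, 1e-6, 1e-9, 1e-12)
values = [log_likelihood(ys, ys[0], s1, 0.0, 0.5, 0.2) for s1 in sigma1_grid]
# The first observation contributes a term of order -ln(sigma1), while the
# jump component keeps every other term finite, so `values` keeps growing.
```

On this grid each entry of `values` exceeds the previous one, which is the behaviour described above: the supremum of the likelihood is not attained at any admissible parameter value.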
2.2.4 GARCH
Autoregressive Conditional Heteroskedastic (ARCH) models were first proposed by Engle (1982) and have been widely applied in many different financial areas. Later, Bollerslev (1986) developed a generalized ARCH model, or GARCH model, which allows more terms in the model. GARCH models are able to capture the very important volatility clustering phenomenon which clearly exists in electricity prices.^12 In practice, the first-order GARCH(1,1) model, shown in Equation (2.8) below, is often adequate to capture the volatility structure of the electricity price $S_t$ (see [35]):
$$S_t = a_0 + a_1 S_{t-1} + \mu_t, \tag{2.8}$$
where $a_0$ and $a_1$ are unknown constants. The forecast error term is $\mu_t = \sigma_t\varepsilon_t$, where $\varepsilon_t$ is an independent random error distributed as $N(0, 1)$. The conditional variance, $\sigma_t^2 := \mathrm{var}(S_t \mid S_{t-1})$, is given by
$$\sigma_t^2 = b_0 + b_1\mu_{t-1}^2 + b_2\sigma_{t-1}^2. \tag{2.9}$$
Here, $b_0$, $b_1$ and $b_2$ are unknown constants (see [35]). Knittel and Roberts (2001) also introduced a similar approach to model electricity prices. They specified the price level as the sum of a deterministic component and a stochastic component and adopted the Exponential GARCH (EGARCH) model of Nelson (1991) (see [26]). Also, by introducing serial correlation in the error term, they obtained an Autoregressive Moving Average Exogenous (ARMAX) model. They also extended this ARMAX model by incorporating temperature data to capture the seasonality of electricity prices.
^11 However, it is still possible to obtain estimates via ML-based estimation (see [33], [30]).
^12 According to [34], the volatility clustering phenomenon describes the situation wherein large volatility movements are more likely to be succeeded by further large volatility movements of either sign than by small movements.
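As an illustration of how a GARCH(1,1) specification generates volatility clustering, the following sketch simulates an AR(1) price equation with a GARCH(1,1) error. It assumes the usual convention that the lagged squared forecast error and the lagged conditional variance drive the current conditional variance; the function name and all parameter values in the usage note are hypothetical.

```python
import math
import random


def simulate_garch11(a0, a1, b0, b1, b2, n, seed=0):
    """Simulate S_t = a0 + a1*S_{t-1} + mu_t with a GARCH(1,1) conditional variance.

    Assumes |a1| < 1 and b1 + b2 < 1 so that the unconditional mean and
    variance used as starting values exist.
    """
    rng = random.Random(seed)
    var_prev = b0 / (1.0 - b1 - b2)   # start at the unconditional variance
    s_prev = a0 / (1.0 - a1)          # start at the unconditional mean
    mu_prev = 0.0
    prices, variances = [], []
    for _ in range(n):
        # Conditional variance from the lagged squared error and lagged variance.
        var_t = b0 + b1 * mu_prev ** 2 + b2 * var_prev
        mu_t = math.sqrt(var_t) * rng.gauss(0.0, 1.0)
        s_t = a0 + a1 * s_prev + mu_t
        prices.append(s_t)
        variances.append(var_t)
        s_prev, mu_prev, var_prev = s_t, mu_t, var_t
    return prices, variances
```

Calling `simulate_garch11(5.0, 0.8, 0.1, 0.1, 0.8, 50000)` gives a conditional variance series that averages close to `b0 / (1 - b1 - b2) = 1` while wandering persistently above and below it, which is the clustering effect.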
2.2.5 Extensions to Models
There are also some plausible extensions of the above models, three of which are now
discussed.
Time-Varying Long-term Mean
As electricity prices tend to change dramatically across seasons due to heating and cooling needs, and across times of day due to changes in demand, they depart from the “usual” behaviour frequently and significantly, which cannot easily be modeled by a constant long-term mean. Dummy variables can be used to reflect those properties of electricity prices. One may consider a long-term mean $\alpha_t$ such as
$$\alpha_t = a_1\,\mathrm{peak}_t + a_2\,\mathrm{offpeak}_t + \sum_{i=1}^{4} b_i M_t^i, \tag{2.10}$$
where $a_1$, $a_2$ and $b_i$ ($i = 1, \dots, 4$) are unknown constants, and
$$\mathrm{peak}_t = \begin{cases} 1 & \text{if in on-peak periods,} \\ 0 & \text{otherwise,} \end{cases} \qquad \mathrm{offpeak}_t = \begin{cases} 1 & \text{if in off-peak periods,} \\ 0 & \text{otherwise,} \end{cases} \qquad M_t^i = \begin{cases} 1 & \text{if } t \text{ belongs to the } i\text{th season,} \\ 0 & \text{otherwise.} \end{cases} \tag{2.11}$$
Dummy variables are quite intuitive and can potentially provide some necessary
ﬂexibility. But dummy variables are very sensitive to anomalies as a result of this
ﬂexibility. For high frequency data, the use of dummy variables is always treated as
an approximation since the number of steps and the placing of each step point are
arbitrarily ﬁxed (see [22]).
We may also consider including a deterministic general trend in the long-term mean. For example, one can apply a linear time trend to the logarithm of price. This implies an exponential trend for the price itself. This gives the “trending Ornstein-Uhlenbeck process” proposed by Lo and Wang (1995) (see [22]). Also, as suggested by Pilipovic (1998), a sinusoidal function can be used to capture the seasonal pattern in the price (see [20]). Moreover, we can model the long-term mean itself as a stochastic process.
Stochastic Volatility
Due to the characteristics of electricity price behaviour, some authors have argued that models for electricity prices should incorporate a form of volatility which evolves stochastically over time (see [26], [10]). One typical example is to specify the volatility as a square-root process:
$$dv_t = \kappa(\mu - v_t)\,dt + \sigma\sqrt{v_t}\,dW_v, \tag{2.12}$$
where $W_v$ is correlated with the Brownian motion in the spot prices. This is different from the GARCH-type models (2.9), which specify the volatility as a deterministic function of the lagged squared forecast error and the lagged conditional variance.
It is also possible to specify the volatility as a regime-switching process (see [36]) or a jump-diffusion process (see [5], [10]). Estimation of stochastic volatility models presents intriguing challenges, and a variety of procedures have been proposed for fitting the models. Examples include Bayesian Markov Chain Monte Carlo (MCMC), the Efficient Method of Moments (EMM), the Generalized Method of Moments (GMM), the Simulated Method of Moments (SMM), and Kalman filtering methods. Two excellent recent surveys are Ghysels et al. (1995) and Shephard (1995) (see [37], [38]).
Markov Regime Switching Process
Markov regime-switching processes have proved to be quite useful in modeling a range of financial time series, including stocks, exchange rates and interest rates. Some authors have also incorporated them into stochastic processes to capture the dynamics of electricity prices. For example, Hélyette and Andrea (2002) studied the dynamics of electricity prices in the major U.S. electricity markets with a combination of a deterministic term and a regime-switching process. According to their study, a regime-switching model ensures the Markov property in the dynamics of electricity prices, which makes the calibration and forecasting easier (see [4]). Shijie Deng (1999), Elliott, Sick and Stein (2000) and Geman (2001) also used regime-switching processes to model electricity prices (see [10], [16], [4]).
The basic framework of a two-state regime-switching model is to assume that there are a “volatile” state and a “normal” state of the world. The price processes behave differently depending on the state of the world. Unexpected events such as earnings announcements, scandals, or changes in macroeconomic variables signal new information to investors, which often results in price processes following completely different dynamics, i.e., switching regimes. Some plausible scenarios for electricity prices would be forced outages of power generation plants, unexpected contingencies in transmission networks, and the like.
Chapter 3
Stochastic Models for Electricity Prices
A broad class of mean-reverting jump-diffusion models will be studied in this chapter for modeling electricity spot prices. We will impose an affine structure on the coefficients of the processes, which leads to closed-form or nearly closed-form expressions for the conditional characteristic functions (CCF). We will outline the relevant features of this affine framework, which are given a general treatment in Duffie, Pan and Singleton (2000) (see [5], [12]). We will illustrate how to exploit the transform analysis introduced by Duffie, Pan and Singleton (2000) to obtain the CCF in the models we adopt.
3.1 Affine Jump-Diffusion Process
Recently, considerable attention has been focused on affine processes. They are flexible enough to capture certain properties that occur in many financial time series, such as multiple jumps, a time-varying long-term mean, and stochastic volatility in various forms, without sacrificing computational tractability. Therefore, affine processes have been widely used to study the term structure of interest rates, the modeling of optimal dynamic portfolios, option pricing, and so on.
We follow here the presentation in Duffie, Pan and Singleton [5]. Suppose that we are given a strong Markov process^1 $X$ with realizations $\{X_t, 0 \le t < \infty\}$ in some state space $D \subset \mathbb{R}^n$. Under certain regularity conditions,^2 $X_t$ uniquely solves the following stochastic differential equation (SDE) (written in integral form):
$$X_t = X_0 + \int_0^t \mu(X_s, s)\,ds + \int_0^t \sigma(X_s, s)\,dW_s + \sum_{i=1}^{m} Z_t^i. \tag{3.1}$$
The jump behaviour of $X$ is governed by $m$ types of jump processes. Each jump type $Z_t^i$ is a pure jump process with a stochastic arrival intensity $\lambda^i(X_t, t)$ for some $\lambda^i : (D, t) \to \mathbb{R}$ and jump amplitude distribution $\nu_t^i$ on $\mathbb{R}^n$, where $\nu_t^i$ depends only on time $t$. The functions $\mu : (D, t) \to \mathbb{R}^n$ and $\sigma : (D, t) \to \mathbb{R}^{n\times n}$ must satisfy certain boundedness conditions in order to guarantee that (3.1) has a unique solution.^3 The process $W_s$ is a standard Brownian motion in $\mathbb{R}^n$.
^1 For the technical definition, see [39].
The process $X_t$, defined by (3.1), is said to be an affine jump-diffusion process if^4
$$\begin{aligned} \mu(X, t) &= K_0(t) + K_1(t)X, \\ \sigma(X, t)\sigma(X, t)^{\top} &= H_0(t) + \sum_{k=1}^{n} H_1^{(k)}(t)X_k, \\ \lambda^i(X, t) &= l_0^i(t) + l_1^i(t)\cdot X, \end{aligned}$$
where for each $0 \le t < \infty$, $K_0(t) \in \mathbb{R}^n$, $K_1(t) \in \mathbb{R}^{n\times n}$, $H_0(t) \in \mathbb{R}^{n\times n}$ and is symmetric, and $H_1(t) \in \mathbb{R}^{n\times n\times n}$. Also, for $k = 1, \dots, n$, $H_1^{(k)}(t)$, defined to be the matrix obtained by fixing the third index of $H_1(t)$ to be $k$, is in $\mathbb{R}^{n\times n}$ and is symmetric. Moreover, $X_k$ is the $k$th entry in $X$, $l_0^i(t) \in \mathbb{R}$, and $l_1^i(t) \in \mathbb{R}^n$.
Notice that given an initial condition $X_0$, the tuple $\theta = (K_0, K_1, H_0, H_1, l_0, l_1)$ can be used to determine a transform $\Psi^{\theta} : \mathbb{C}^n \times [0, \infty) \times [0, \infty) \times D \to \mathbb{C}$ of $X_T$ conditional on $X_t$, $0 \le t \le T$, defined by
$$\Psi^{\theta}(u, t, T, X_t) = E^{\theta}\left[\exp(u\cdot X_T) \mid X_t\right], \tag{3.2}$$
where $E^{\theta}$ denotes the expectation under the distribution of $X_T$ determined by $\theta$. Duffie et al. [5] have proved that if $\theta = (K_0, K_1, H_0, H_1, l_0, l_1)$ is “well behaved” at $(u, T)$,^5 then the transform $\Psi^{\theta}$, defined by (3.2), exists for $0 \le t \le T$ and is given by:
$$\Psi^{\theta}(u, t, T, X_t) = \exp\left(M(u, t, T) + N(u, t, T)\cdot X_t\right).$$
^2 The details are given in [5].
^3 See [5] for details.
^4 For a matrix $C$, $C^{\top}$ is the transpose of $C$. For column vectors $a$ and $b$, the operation $a\cdot b$ is the scalar product of $a$ and $b$.
Here $M(\cdot)$ and $N(\cdot)$ satisfy the following complex-valued Riccati equations,
$$\begin{aligned} \frac{\partial M(u, t, T)}{\partial t} &= -A(N(u, t, T), t), & M(u, T, T) &= 0, \\ \frac{\partial N(u, t, T)}{\partial t} &= -B(N(u, t, T), t), & N(u, T, T) &= u, \end{aligned} \tag{3.3}$$
where, for any $c \in \mathbb{C}^n$,^6
$$\begin{aligned} A(c, t) &= K_0(t)\cdot c + \tfrac{1}{2}c^{\top}H_0(t)c + \sum_{i=1}^{m} l_0^i\left(\varphi^i(c) - 1\right), \\ B(c, t) &= K_1(t)^{\top}c + \tfrac{1}{2}c^{\top}H_1(t)c + \sum_{i=1}^{m} l_1^i\left(\varphi^i(c) - 1\right). \end{aligned} \tag{3.4}$$
Here $\varphi^i(c)$ is the so-called “jump transform” for the $i$th jump. It is given by $\varphi^i(c) = \int_{\mathbb{R}^n}\exp(c\cdot z)\,d\nu_t^i(z)$ whenever the integral is well defined. For example, for a normally distributed jump size with mean $\mu$ and variance $\sigma^2$, it can be shown that $\varphi(c) = \exp(\mu c + \tfrac{1}{2}\sigma^2c^2)$; for an exponentially distributed jump size with mean $\mu$, $\varphi(c) = \frac{1}{1 - \mu c}$ (see [5]).
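The two closed-form jump transforms quoted above can be checked against a direct numerical evaluation of the defining integral in one dimension. The following sketch does this with a simple midpoint rule; the function names, the integration limits and the sample arguments in the usage note are illustrative assumptions, and the quadrature is only a sanity check.

```python
import cmath
import math


def jump_transform_normal(c, mu, sigma):
    """Closed form exp(mu*c + sigma^2*c^2/2) for a normal jump size."""
    return cmath.exp(mu * c + 0.5 * sigma ** 2 * c ** 2)


def jump_transform_exponential(c, mu):
    """Closed form 1/(1 - mu*c) for an exponential jump size with mean mu (needs Re(c) < 1/mu)."""
    return 1.0 / (1.0 - mu * c)


def jump_transform_numeric(c, density, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of exp(c*z)*density(z) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0j
    for k in range(n):
        z = lo + (k + 0.5) * h
        total += cmath.exp(c * z) * density(z)
    return total * h


def normal_density(mu, sigma):
    return lambda z: math.exp(-((z - mu) ** 2) / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))


def exponential_density(mu):
    return lambda z: math.exp(-z / mu) / mu
```

For a purely imaginary argument such as `c = 0.7j`, `jump_transform_numeric(c, exponential_density(0.4), 0.0, 16.0)` should agree with `jump_transform_exponential(c, 0.4)` to several decimal places, and similarly for the normal case when the limits cover about ten standard deviations on either side of the mean.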
^5 See [5] for details.
^6 $\mathbb{C}^n$ denotes the set of $n$-tuples of complex numbers; $\mathbb{R}^n$ denotes the set of $n$-tuples of real numbers.
By setting $u = is$ ($i = \sqrt{-1}$), one can obtain the CCF of $X_T$ conditional on $X_t$ as:
$$\begin{aligned} \phi(s, \theta, X_T \mid X_t) &:= \Psi^{\theta}(is, t, T, X_t) \\ &= E^{\theta}\left[\exp(is\cdot X_T) \mid X_t\right] \\ &= \int_{\mathbb{R}^n}\exp(is\cdot X_T)\,f(X_T, \theta \mid X_t)\,dX_T, \end{aligned} \tag{3.5}$$
where $f(X_T, \theta \mid X_t)$ is the conditional density of $X_T$ given $X_t$.
Notice that the CCF is actually the Fourier transform^7 of the conditional density. Through the inverse Fourier transform, one can recover the conditional density function from the CCF and implement a usual ML estimation. We will exploit this CCF of discretely sampled observations to develop computationally tractable estimators of parameters in Chapter 4.
^7 Brief summary of the Fourier transform:
1. Given a function $f(x)$, we take the Fourier transform to be $\hat{f}(\xi) = F[f](\xi) = \int_{-\infty}^{\infty} f(x)e^{ix\xi}\,dx$.
2. Integration by parts reveals that the Fourier transform takes differentiation to multiplication (by $\xi$): $F[f_x](\xi) = -i\xi\hat{f}(\xi)$.
3. If $h(x) = \int_{-\infty}^{\infty} f(x - y)g(y)\,dy$ then $\hat{h}(\xi) = \hat{f}(\xi)\hat{g}(\xi)$.
4. The Fourier transform is invertible, and $f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ix\xi}\hat{f}(\xi)\,d\xi$ is called the inverse Fourier transform.
For more details of the Fourier transform, check the link mathworld.wolfram.com/FourierTransform.html.
3.2 Model 1a
Our first attempt to capture the mean-reversion and spikes present in electricity prices is by a standard mean-reverting jump-diffusion process. We start by specifying the logarithm of the spot price, $X_t$, using a model adopted from Das and Foresi (1996). The diffusion part is represented by an Ornstein-Uhlenbeck process, and the jump component has an exponentially distributed absolute jump size, with the sign of the jump determined by a Bernoulli variable. This is encapsulated by the following SDE:
$$dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega). \tag{3.6}$$
Here, the tuple $\theta = [\kappa, \alpha, \sigma^2, \omega, \psi, \gamma]$ gives the unknown constant parameters. Specifically, $\kappa$ is the mean reversion rate, $\alpha$ is the long-term mean, $W_t$ is a standard Brownian motion with $dW_t \sim N(0, dt)$ for an infinitesimal time interval $dt$, and $P_t$ is a discontinuous, one-dimensional standard Poisson process with arrival rate $\omega$. During $dt$, $dP_t = 1$ if there is a jump and $dP_t = 0$ otherwise. The amplitude of the jump $Q_t$ is exponentially distributed with mean $\gamma$, and the sign of the jump $Q_t$ is distributed as a Bernoulli random variable with parameter $\psi$. We assume that the Brownian motion, Poisson process, and random jump amplitude are all Markov and pairwise independent.
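Notice that this equation fits in the framework outlined in Section 3.1 with
$$K_0 = \kappa\alpha, \quad K_1 = -\kappa, \quad H_0 = \sigma^2, \quad H_1 = 0, \quad l_0 = \omega, \quad l_1 = 0. \tag{3.7}$$
Thus the CCF of $X_T$ given $X_t$, $\phi(s, \theta, X_T \mid X_t)$, takes the form:
$$\begin{aligned} \phi(s, \theta, X_T \mid X_t) &= E\left[\exp(isX_T) \mid X_t\right] \\ &= \exp\left(A(s, t, T, \theta) + B(s, t, T, \theta)X_t\right), \end{aligned} \tag{3.8}$$
where $A(\cdot)$ and $B(\cdot)$ satisfy the following system of complex-valued ordinary differential equations (ODE):
$$\begin{aligned} \frac{\partial A(s, t, T, \theta)}{\partial t} &= -\kappa\alpha B(s, t, T, \theta) - \tfrac{1}{2}\sigma^2B^2(s, t, T, \theta) - \omega\left(\varphi(B(s, t, T, \theta)) - 1\right), \\ \frac{\partial B(s, t, T, \theta)}{\partial t} &= \kappa B(s, t, T, \theta), \end{aligned} \tag{3.9}$$
with boundary conditions:
$$A(s, T, T, \theta) = 0, \quad B(s, T, T, \theta) = is. \tag{3.10}$$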
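Here, the “jump transform” $\varphi(B(s, t, T, \theta))$ is given by:
$$\begin{aligned} \varphi(B(s, t, T, \theta)) &= \psi\int_0^{\infty}\exp(B(s, t, T, \theta)z)\,\frac{1}{\gamma}\exp\!\left(-\frac{z}{\gamma}\right)dz + (1 - \psi)\int_0^{\infty}\exp(-B(s, t, T, \theta)z)\,\frac{1}{\gamma}\exp\!\left(-\frac{z}{\gamma}\right)dz \\ &= \frac{\psi}{1 - B(s, t, T, \theta)\gamma} + \frac{1 - \psi}{1 + B(s, t, T, \theta)\gamma}. \end{aligned} \tag{3.11}$$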
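Now, we solve the system (3.9) for $A(\cdot)$ and $B(\cdot)$ and apply the corresponding boundary conditions to obtain (after some calculation)
$$\begin{aligned} A(s, t, T, \theta) ={}& i\alpha s\left(1 - e^{-\kappa(T-t)}\right) - \frac{\sigma^2s^2}{4\kappa}\left(1 - e^{-2\kappa(T-t)}\right) \\ &+ \frac{i\omega(1 - 2\psi)}{\kappa}\left[\arctan\!\left(\gamma se^{-\kappa(T-t)}\right) - \arctan(\gamma s)\right] \\ &+ \frac{\omega}{2\kappa}\ln\frac{1 + \gamma^2s^2e^{-2\kappa(T-t)}}{1 + \gamma^2s^2}, \\ B(s, t, T, \theta) ={}& ise^{-\kappa(T-t)}. \end{aligned} \tag{3.12}$$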
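The closed-form coefficients for Model 1a translate directly into code. The sketch below implements them and, as a consistency check, also integrates the ODE for $A$ numerically in the variable $\tau = T - t$, where it reads $dA/d\tau = \kappa\alpha B + \tfrac{1}{2}\sigma^2B^2 + \omega(\varphi(B) - 1)$ with $B = ise^{-\kappa\tau}$. The function names and the parameter values in the usage note are illustrative assumptions, not part of the thesis.

```python
import cmath
import math


def ccf_coefficients_model_1a(s, tau, kappa, alpha, sigma, omega, psi, gamma):
    """Closed-form A and B for Model 1a, with tau = T - t."""
    decay = math.exp(-kappa * tau)
    a = (1j * alpha * s * (1.0 - decay)
         - sigma ** 2 * s ** 2 / (4.0 * kappa) * (1.0 - decay ** 2)
         + 1j * omega * (1.0 - 2.0 * psi) / kappa
         * (math.atan(gamma * s * decay) - math.atan(gamma * s))
         + omega / (2.0 * kappa)
         * math.log((1.0 + (gamma * s * decay) ** 2) / (1.0 + (gamma * s) ** 2)))
    b = 1j * s * decay
    return a, b


def ccf_model_1a(s, x_t, tau, **params):
    """Conditional characteristic function exp(A + B * X_t)."""
    a, b = ccf_coefficients_model_1a(s, tau, **params)
    return cmath.exp(a + b * x_t)


def a_by_quadrature(s, tau, kappa, alpha, sigma, omega, psi, gamma, n=20000):
    """Midpoint-rule integration of dA/dtau from 0 to tau (a numerical cross-check)."""
    h = tau / n
    total = 0j
    for k in range(n):
        u = (k + 0.5) * h
        b = 1j * s * math.exp(-kappa * u)
        phi = psi / (1.0 - b * gamma) + (1.0 - psi) / (1.0 + b * gamma)
        total += kappa * alpha * b + 0.5 * sigma ** 2 * b ** 2 + omega * (phi - 1.0)
    return total * h
```

With hypothetical parameters such as `kappa=0.5, alpha=3.5, sigma=0.3, omega=0.2, psi=0.7, gamma=0.4`, the closed-form `A` and the quadrature value should agree closely, and `ccf_model_1a(0.0, x, tau, ...)` should equal 1 for any `x`, as a characteristic function must.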
3.3 Model 1b
In this model, we allow for asymmetric upward and downward jumps (see [10], [11]), each with an exponentially distributed jump magnitude. More specifically, we suppose that the logarithm of the spot price $X_t$ satisfies the following SDE:
$$dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q_t^u\,dP_t^u(\omega_u) + Q_t^d\,dP_t^d(\omega_d). \tag{3.13}$$
Again, $\kappa$ is the mean reversion rate, $\alpha$ is the long-term mean, and $W_t$ is a standard Brownian motion with $dW_t \sim N(0, dt)$. The jump behaviour of $X_t$ is governed by two types of jumps: upward jumps and downward jumps. The upward jumps $Q_t^u$ are exponentially distributed with positive mean $\gamma_u$ and jump arrival rate $\omega_u$. The downward jumps $Q_t^d$ are also exponentially distributed, with negative mean $\gamma_d$ and jump arrival rate $\omega_d$. Again, $P_t^u$ and $P_t^d$ are two independent discontinuous, one-dimensional standard Poisson processes with arrival rates $\omega_u$ and $\omega_d$ respectively.
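Notice that this equation also fits in the framework outlined in Section 3.1. Thus the transform analysis can be implemented in this case and the CCF can be written out explicitly for this model:
$$\begin{aligned} \phi(s, \theta, X_T \mid X_t) &= E\left[\exp(isX_T) \mid X_t\right] \\ &= \exp\left(A(s, t, T, \theta) + B(s, t, T, \theta)X_t\right). \end{aligned} \tag{3.14}$$
Here $A(\cdot)$ and $B(\cdot)$ satisfy the complex-valued system of ODEs:
$$\begin{aligned} \frac{\partial A(s, t, T, \theta)}{\partial t} ={}& -\kappa\alpha B(s, t, T, \theta) - \tfrac{1}{2}\sigma^2B^2(s, t, T, \theta) \\ &- \omega_u\left(\varphi_u(B(s, t, T, \theta)) - 1\right) - \omega_d\left(\varphi_d(B(s, t, T, \theta)) - 1\right), \\ \frac{\partial B(s, t, T, \theta)}{\partial t} ={}& \kappa B(s, t, T, \theta), \end{aligned} \tag{3.15}$$
with boundary conditions:
$$A(s, T, T, \theta) = 0, \quad B(s, T, T, \theta) = is. \tag{3.16}$$
Here the “jump transform” for the upward jump is given by:
$$\varphi_u(B(s, t, T, \theta)) = \int_0^{\infty}\exp(B(s, t, T, \theta)z)\,\frac{1}{\gamma_u}\exp\!\left(-\frac{z}{\gamma_u}\right)dz = \frac{1}{1 - B(s, t, T, \theta)\gamma_u}. \tag{3.17}$$
Similarly, the “jump transform” for the downward jump is given by:
$$\varphi_d(B(s, t, T, \theta)) = \frac{1}{1 - B(s, t, T, \theta)\gamma_d}. \tag{3.18}$$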
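Again, we solve for $A(\cdot)$ and $B(\cdot)$ and apply the corresponding boundary conditions to get (after some calculation)
$$\begin{aligned} A(s, t, T, \theta) ={}& i\alpha s\left(1 - e^{-\kappa(T-t)}\right) - \frac{\sigma^2s^2}{4\kappa}\left(1 - e^{-2\kappa(T-t)}\right) \\ &+ \frac{\omega_u}{\kappa}\ln\frac{1 - is\gamma_ue^{-\kappa(T-t)}}{1 - is\gamma_u} + \frac{\omega_d}{\kappa}\ln\frac{1 - is\gamma_de^{-\kappa(T-t)}}{1 - is\gamma_d}, \\ B(s, t, T, \theta) ={}& ise^{-\kappa(T-t)}. \end{aligned} \tag{3.19}$$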
3.4 Model 2a
We impose upon the price process (3.6) a time-varying component in the drift by replacing the long-term mean $\alpha$ with a deterministic function $\alpha(t)$. The logarithm of the electricity spot price is thus defined by
$$dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega). \tag{3.20}$$
Recall that the seasonal effects on electricity prices are not strongly significant for the specific data we analyze (refer to Figure 2.4). We only incorporate the on-peak and off-peak effects into the price process and consider the following form for $\alpha(t)$:
$$\alpha(t) = \alpha_1\,\mathrm{peak}_t + \alpha_2\,\mathrm{offpeak}_t, \tag{3.21}$$
where
$$\mathrm{peak}_t = \begin{cases} 1 & \text{if in on-peak periods,} \\ 0 & \text{otherwise,} \end{cases} \qquad \mathrm{offpeak}_t = \begin{cases} 1 & \text{if in off-peak periods,} \\ 0 & \text{otherwise.} \end{cases} \tag{3.22}$$
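This is also an affine process. By the same methods, the CCF of $X_T$ given $X_t$ can be expressed in closed form in this case:
$$\begin{aligned} \phi(s, \theta, X_T \mid X_t) &= E\left[\exp(isX_T) \mid X_t\right] \\ &= \exp\left(A(s, t, T, \theta) + B(s, t, T, \theta)X_t\right). \end{aligned} \tag{3.23}$$
Here $A(\cdot)$ and $B(\cdot)$ satisfy the complex-valued system of ODEs:
$$\begin{aligned} \frac{\partial A(s, t, T, \theta)}{\partial t} ={}& -\kappa\alpha(t)B(s, t, T, \theta) - \tfrac{1}{2}\sigma^2B^2(s, t, T, \theta) - \omega\left(\varphi(B(s, t, T, \theta)) - 1\right), \\ \frac{\partial B(s, t, T, \theta)}{\partial t} ={}& \kappa B(s, t, T, \theta), \end{aligned} \tag{3.24}$$
with boundary conditions:
$$A(s, T, T, \theta) = 0, \quad B(s, T, T, \theta) = is. \tag{3.25}$$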
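We can obtain $A(\cdot)$ and $B(\cdot)$ in the same fashion:
$$\begin{aligned} A(s, t, T, \theta) ={}& L(s, t, T, \theta) - \frac{\sigma^2s^2}{4\kappa}\left(1 - e^{-2\kappa(T-t)}\right) \\ &+ \frac{i\omega(1 - 2\psi)}{\kappa}\left[\arctan\!\left(\gamma se^{-\kappa(T-t)}\right) - \arctan(\gamma s)\right] \\ &+ \frac{\omega}{2\kappa}\ln\frac{1 + \gamma^2s^2e^{-2\kappa(T-t)}}{1 + \gamma^2s^2}, \\ B(s, t, T, \theta) ={}& ise^{-\kappa(T-t)}. \end{aligned} \tag{3.26}$$
Here, let $t = t_0 < t_1 < \cdots < t_N = T$ be a partition such that $\alpha(\cdot)$ is constant on each subinterval $[t_{j-1}, t_j]$; then we have
$$\begin{aligned} L(s, t, T, \theta) &= \int_t^T \kappa\alpha(u)\,is\,e^{-\kappa(T-u)}\,du \\ &= is\sum_{j=1}^{N}\alpha_{(j)}\,e^{-\kappa(T-t_j)}\left(1 - e^{-\kappa(t_j - t_{j-1})}\right), \end{aligned} \tag{3.27}$$
where
$$\alpha_{(j)} = \begin{cases} \alpha_1 & \text{if } [t_{j-1}, t_j] \text{ is in on-peak periods,} \\ \alpha_2 & \text{otherwise.} \end{cases} \tag{3.28}$$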
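The piecewise sum for $L(s, t, T, \theta)$ can be checked against a direct numerical integration of $\kappa\alpha(u)\,is\,e^{-\kappa(T-u)}$. In the sketch below, time is measured in hours and the on-peak window 8:00 to 21:00 is a purely hypothetical choice made for illustration, as are the function names and parameter values.

```python
import math


def is_on_peak(time_in_hours, start=8.0, end=21.0):
    """Hypothetical on-peak indicator: True between `start` and `end` o'clock."""
    return start <= time_in_hours % 24.0 < end


def l_term(s, grid, kappa, alpha_on, alpha_off):
    """Piecewise sum over the partition grid = [t_0, ..., t_N], with t_0 = t and t_N = T.

    The long-term mean is assumed constant on each subinterval, so it is read
    off at the subinterval midpoint.
    """
    big_t = grid[-1]
    total = 0.0
    for t_prev, t_next in zip(grid[:-1], grid[1:]):
        level = alpha_on if is_on_peak(0.5 * (t_prev + t_next)) else alpha_off
        total += (level * math.exp(-kappa * (big_t - t_next))
                  * (1.0 - math.exp(-kappa * (t_next - t_prev))))
    return 1j * s * total


def l_term_quadrature(s, t, big_t, kappa, alpha_on, alpha_off, n=24000):
    """Midpoint-rule integral of kappa * alpha(u) * i*s * exp(-kappa*(T-u)) over [t, T]."""
    h = (big_t - t) / n
    total = 0.0
    for k in range(n):
        u = t + (k + 0.5) * h
        level = alpha_on if is_on_peak(u) else alpha_off
        total += kappa * level * math.exp(-kappa * (big_t - u))
    return 1j * s * total * h
```

Over one day, `l_term(0.7, [0.0, 8.0, 21.0, 24.0], 0.1, 4.0, 3.2)` and `l_term_quadrature(0.7, 0.0, 24.0, 0.1, 4.0, 3.2)` should agree to high accuracy, since the partition points coincide with the switches between on-peak and off-peak periods.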
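3.5 Model 2b
We consider the following extension to the model (3.13):
$$dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q_t^u\,dP_t^u(\omega_u) + Q_t^d\,dP_t^d(\omega_d), \tag{3.29}$$
where $\alpha(t)$ is defined as in (3.21).
Then, similar to the above models, the CCF of $X_T$ given $X_t$ is of the form:
$$\begin{aligned} \phi(s, \theta, X_T \mid X_t) &= E\left[\exp(isX_T) \mid X_t\right] \\ &= \exp\left(A(s, t, T, \theta) + B(s, t, T, \theta)X_t\right), \end{aligned} \tag{3.30}$$
where
$$\begin{aligned} A(s, t, T, \theta) ={}& L(s, t, T, \theta) - \frac{\sigma^2s^2}{4\kappa}\left(1 - e^{-2\kappa(T-t)}\right) \\ &+ \frac{\omega_u}{\kappa}\ln\frac{1 - is\gamma_ue^{-\kappa(T-t)}}{1 - is\gamma_u} + \frac{\omega_d}{\kappa}\ln\frac{1 - is\gamma_de^{-\kappa(T-t)}}{1 - is\gamma_d}, \\ B(s, t, T, \theta) ={}& ise^{-\kappa(T-t)}, \end{aligned} \tag{3.31}$$
with $L(s, t, T, \theta)$ defined in Equation (3.27).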
3.6 Model 3a
It has been well documented that jumps alone are inadequate to mimic the level of skewness present in electricity spot prices. Kaminski (1997) and Deng (1998) emphasized the need to incorporate stochastic volatility in the modeling of electricity spot prices (see [40], [10]). Also, according to the study of Mika and Andrew, volatility in electricity prices varies over time and is likely mean-reverting itself (see [41]). So we consider the following two-factor affine process (3.32) to model electricity spot prices. Let $X_t$ be the logarithm of the spot price of electricity and $V_t$ be the volatility of the price process, which evolves stochastically over time. We have
$$d\begin{pmatrix} X_t \\ V_t \end{pmatrix} = \begin{pmatrix} \kappa(\alpha - X_t) \\ \kappa_v(\alpha_v - V_t) \end{pmatrix}dt + \begin{pmatrix} \sqrt{1 - \rho^2}\sqrt{V_t} & \rho\sqrt{V_t} \\ 0 & \sigma_v\sqrt{V_t} \end{pmatrix}\begin{pmatrix} dW_t \\ dW_v \end{pmatrix} + \begin{pmatrix} Q_t\,dP_t(\omega) \\ 0 \end{pmatrix}. \tag{3.32}$$
Here, $\kappa$ is the mean reversion rate of the log prices, $\alpha$ is the long-term mean of the log prices, $P_t$ is a discontinuous, one-dimensional standard Poisson process with arrival rate $\omega$, the amplitude of $Q_t$ is exponentially distributed with mean $\gamma$, and the sign of $Q_t$ is distributed as a Bernoulli random variable with parameter $\psi$. The processes $W_t$ and $W_v$ are two uncorrelated standard Brownian motions. Also, $\kappa_v$ is the mean reversion rate of the volatility $V_t$, $\alpha_v$ is the long-term mean of $V_t$, and $\sigma_v$ is the volatility of $V_t$.
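This model also fits in the framework outlined in Section 3.1 with
$$\begin{gathered} K_0 = \begin{pmatrix} \kappa\alpha \\ \kappa_v\alpha_v \end{pmatrix}, \quad K_1 = \begin{pmatrix} -\kappa & 0 \\ 0 & -\kappa_v \end{pmatrix}, \quad H_0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \\ H_1^{(1)} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad H_1^{(2)} = \begin{pmatrix} 1 & \rho\sigma_v \\ \rho\sigma_v & \sigma_v^2 \end{pmatrix}, \\ l_0 = \omega, \quad l_1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \end{gathered} \tag{3.33}$$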
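Suppose that the jump component in the logarithm of the spot price is defined as in Model 1a and Model 2a. Then the “jump transform” is given by
$$\begin{aligned} \varphi\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} &= \int_0^{\infty}\left[\psi\exp(A(\cdot)z) + (1 - \psi)\exp(-A(\cdot)z)\right]\frac{1}{\gamma}\exp\!\left(-\frac{z}{\gamma}\right)dz \\ &= \frac{\psi}{1 - A(\cdot)\gamma} + \frac{1 - \psi}{1 + A(\cdot)\gamma}. \end{aligned} \tag{3.34}$$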
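Thus the CCF is of the form
$$\begin{aligned} \phi(s_x, s_v, \theta, X_T, V_T \mid X_t, V_t) &= E^{\theta}\left[\exp(is_xX_T + is_vV_T) \mid (X_t, V_t)\right] \\ &= \exp\left(A(s_x, s_v, t, T, \theta)X_t + B(s_x, s_v, t, T, \theta)V_t + C(s_x, s_v, t, T, \theta)\right), \end{aligned} \tag{3.35}$$
where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ satisfy the following complex-valued Riccati equations,
$$\begin{aligned} \frac{\partial A(\cdot)}{\partial t} &= \kappa A(\cdot), \\ \frac{\partial B(\cdot)}{\partial t} &= \kappa_vB(\cdot) - \tfrac{1}{2}A(\cdot)\left(A(\cdot) + \rho\sigma_vB(\cdot)\right) - \tfrac{1}{2}B(\cdot)\left(A(\cdot)\rho\sigma_v + B(\cdot)\sigma_v^2\right), \\ \frac{\partial C(\cdot)}{\partial t} &= -\kappa\alpha A(\cdot) - \kappa_v\alpha_vB(\cdot) - \omega\left(\varphi\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} - 1\right), \end{aligned} \tag{3.36}$$
with boundary conditions:
$$A(s_x, s_v, T, T, \theta) = is_x, \quad B(s_x, s_v, T, T, \theta) = is_v, \quad C(s_x, s_v, T, T, \theta) = 0. \tag{3.37}$$
We can still solve the first equation for $A(\cdot)$ and apply the corresponding boundary condition to obtain
$$A(\cdot) = is_xe^{-\kappa(T-t)}. \tag{3.38}$$
However, we do not have closed forms for $B(\cdot)$ and $C(\cdot)$ and need to solve for them numerically.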
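One way to carry out this numerical step is a classical fourth-order Runge-Kutta scheme applied to the equations for $B$ and $C$, written in the variable $\tau = T - t$ so that the boundary conditions become initial conditions at $\tau = 0$. The sketch below is an illustration under that choice of scheme (the thesis does not prescribe a particular solver here); the function name, step count and test values are assumptions.

```python
import cmath
import math


def solve_model_3a(s_x, s_v, tau, kappa, alpha, kappa_v, alpha_v, sigma_v, rho,
                   omega, psi, gamma, n_steps=2000):
    """Return (A, B, C) at time-to-maturity tau using RK4 for B and C.

    In tau = T - t the equations read
      dB/dtau = -kappa_v*B + (A^2 + 2*rho*sigma_v*A*B + sigma_v^2*B^2) / 2,
      dC/dtau = kappa*alpha*A + kappa_v*alpha_v*B + omega*(phi(A) - 1),
    with A(tau) = i*s_x*exp(-kappa*tau), B(0) = i*s_v and C(0) = 0.
    """
    def a_of(u):
        return 1j * s_x * math.exp(-kappa * u)

    def rhs(u, b):
        a = a_of(u)
        db = -kappa_v * b + 0.5 * (a * a + 2.0 * rho * sigma_v * a * b + sigma_v ** 2 * b * b)
        phi = psi / (1.0 - a * gamma) + (1.0 - psi) / (1.0 + a * gamma)
        dc = kappa * alpha * a + kappa_v * alpha_v * b + omega * (phi - 1.0)
        return db, dc

    h = tau / n_steps
    b, c = 1j * s_v, 0j
    for k in range(n_steps):
        u = k * h
        k1b, k1c = rhs(u, b)
        k2b, k2c = rhs(u + 0.5 * h, b + 0.5 * h * k1b)
        k3b, k3c = rhs(u + 0.5 * h, b + 0.5 * h * k2b)
        k4b, k4c = rhs(u + h, b + h * k3b)
        b += h / 6.0 * (k1b + 2.0 * k2b + 2.0 * k3b + k4b)
        c += h / 6.0 * (k1c + 2.0 * k2c + 2.0 * k3c + k4c)
    return a_of(tau), b, c
```

A useful sanity check is the special case $s_x = 0$, in which $A \equiv 0$ and the equations for $B$ and $C$ reduce to those of a pure square-root process, for which $B = \dfrac{is_ve^{-\kappa_v\tau}}{1 - is_v\sigma_v^2(1 - e^{-\kappa_v\tau})/(2\kappa_v)}$ and $C = -\dfrac{2\kappa_v\alpha_v}{\sigma_v^2}\ln\!\left(1 - \dfrac{is_v\sigma_v^2(1 - e^{-\kappa_v\tau})}{2\kappa_v}\right)$.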
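3.7 Model 3b
We now redefine the model (3.32) above by allowing two types of jumps, just as before:
$$d\begin{pmatrix} X_t \\ V_t \end{pmatrix} = \begin{pmatrix} \kappa(\alpha - X_t) \\ \kappa_v(\alpha_v - V_t) \end{pmatrix}dt + \begin{pmatrix} \sqrt{1 - \rho^2}\sqrt{V_t} & \rho\sqrt{V_t} \\ 0 & \sigma_v\sqrt{V_t} \end{pmatrix}\begin{pmatrix} dW_t \\ dW_v \end{pmatrix} + \begin{pmatrix} Q_t^u\,dP_t^u(\omega_u) + Q_t^d\,dP_t^d(\omega_d) \\ 0 \end{pmatrix}. \tag{3.39}$$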
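Then the “jump transform” for the upward jump is given by
$$\varphi_u\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} = \int_0^{\infty}\exp(A(\cdot)z)\,\frac{1}{\gamma_u}\exp\!\left(-\frac{z}{\gamma_u}\right)dz = \frac{1}{1 - A(\cdot)\gamma_u}. \tag{3.40}$$
Similarly, the “jump transform” for the downward jump is given by
$$\varphi_d\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} = \frac{1}{1 - A(\cdot)\gamma_d}. \tag{3.41}$$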
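As this model also fits in the setting of affine processes, one can obtain the CCF in the same fashion:
$$\begin{aligned} \phi(s_x, s_v, \theta, X_T, V_T \mid X_t, V_t) &= E^{\theta}\left[\exp(is_xX_T + is_vV_T) \mid (X_t, V_t)\right] \\ &= \exp\left(A(s_x, s_v, t, T, \theta)X_t + B(s_x, s_v, t, T, \theta)V_t + C(s_x, s_v, t, T, \theta)\right), \end{aligned} \tag{3.42}$$
where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ satisfy the following complex-valued Riccati equations,
$$\begin{aligned} \frac{\partial A(\cdot)}{\partial t} ={}& \kappa A(\cdot), \\ \frac{\partial B(\cdot)}{\partial t} ={}& \kappa_vB(\cdot) - \tfrac{1}{2}A(\cdot)\left(A(\cdot) + \rho\sigma_vB(\cdot)\right) - \tfrac{1}{2}B(\cdot)\left(A(\cdot)\rho\sigma_v + B(\cdot)\sigma_v^2\right), \\ \frac{\partial C(\cdot)}{\partial t} ={}& -\kappa\alpha A(\cdot) - \kappa_v\alpha_vB(\cdot) \\ &- \omega_u\left(\varphi_u\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} - 1\right) - \omega_d\left(\varphi_d\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} - 1\right), \end{aligned} \tag{3.43}$$
with boundary conditions:
$$A(s_x, s_v, T, T, \theta) = is_x, \quad B(s_x, s_v, T, T, \theta) = is_v, \quad C(s_x, s_v, T, T, \theta) = 0. \tag{3.44}$$
Again, we obtain $A(\cdot) = is_xe^{-\kappa(T-t)}$ and need to solve for $B(\cdot)$ and $C(\cdot)$ numerically.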
Chapter 4
Parameter Estimation
4.1 Introduction
Affine processes are flexible enough to allow us to capture the special characteristics of electricity prices such as mean-reversion, seasonality, and “spikes” (see [10]). Moreover, under suitable regularity conditions, one can exploit the information in the CCF of discretely sampled observations to develop computationally tractable and asymptotically efficient^1 estimators of the parameters of affine processes (see [5]). The CCF is unique and contains the same information as the conditional density function through the Fourier transform. We can use it to recover the conditional density function via the inverse Fourier transform and implement a usual ML estimation. This is the approach of ML-CCF estimation, for which we will give more details later on. If the $N$-dimensional state variables are all observable, ML-CCF estimation can be implemented and the ML-CCF estimators so obtained are asymptotically efficient (see [12]).
But the estimation can be costly in higher dimensions ($N \ge 2$) because we need to compute the multivariate Fourier inversions repeatedly and accurately in order to maximize the likelihood function. According to Singleton (2001), considerable computational saving can be achieved by using limited-information ML-CCF (LML-CCF) estimation (see [12]). Suppose $\{X_t, t = 1, 2, \dots\}$ is a set of discretely sampled observations of an $N$-dimensional state variable with a joint CCF $\phi(s, \theta, X_{t+1} \mid X_t)$. Let $\eta_j$ denote an $N$-dimensional selection vector whose $j$th entry is 1 and whose other entries are zero. Define $X_{t+1}^j := \eta_j\cdot X_{t+1}$; then the conditional density of $X_{t+1}^j$ conditioned on $X_t$ is the inverse Fourier transform of $\phi(\xi\eta_j, \theta, X_{t+1} \mid X_t)$ with respect to the scalar $\xi$:^2
$$f_j(X_{t+1}^j, \theta \mid X_t) = \frac{1}{2\pi}\int_{\mathbb{R}}\phi(\xi\eta_j, \theta, X_{t+1} \mid X_t)\,e^{-i\xi\eta_j^{\top}X_{t+1}}\,d\xi. \tag{4.1}$$
^1 The ratio of the Rao-Cramér lower bound to the actual variance of any unbiased estimator of a parameter is called the efficiency of that statistic. If the efficiency tends to 1 as the number of observations increases, the estimator is said to be asymptotically efficient. For more details about Rao-Cramér bounds and efficiency, see [42].
The basic idea behind this is to exploit the information in $f_j(X_{t+1}^j, \theta \mid X_t)$ instead of the information in the joint conditional density function,
$$f(X_{t+1}, \theta \mid X_t) = \frac{1}{(2\pi)^N}\int_{\mathbb{R}^N}\phi(s, \theta, X_{t+1} \mid X_t)\,e^{-is^{\top}X_{t+1}}\,ds. \tag{4.2}$$
Thus the estimation involves at most $N$ one-dimensional integrations instead of an $N$-dimensional integration. The estimators obtained are called LML-CCF estimators. Although the LML-CCF estimators do not exploit any information about the joint conditional density function, they are typically more efficient than the quasi-maximum likelihood (QML) estimators for affine diffusions (see [12]).^3
But for those multi-factor models with unobservable (latent) state variables, such as stochastic volatility models, the ML-CCF or LML-CCF estimators cannot be obtained. There are several recent papers that discuss the methodologies related to CCF-based estimators of stochastic volatility models. Singleton (1999) (see [12]) proposed a Simulated Method of Moments (SMM-CCF) estimator; Jiang and Knight (1999) (see [43]) explored the Method of System of Moments (MSM) estimators; Chacko and Viceira (1999) (see [13]) considered the so-called Spectral Generalized Method of Moments (SGMM) estimators. In this thesis, we adopt the idea of SGMM because this methodology is more computationally tractable than the others (see [12]). To deal with stochastic volatility models, they derived the stationary (unconditional) characteristic function^4 from the CCF of the volatility, and utilized it to obtain a so-called marginal CCF. We apply an ML-type estimation based on the marginal CCF (ML-MCCF) to estimate stochastic volatility models. Furthermore, we also introduce SGMM estimators based on the marginal CCF to estimate stochastic volatility models.
^2 Refer back to (3.5) in Chapter 3.
^3 QML acts as if the data were generated by a density function that provides an estimator that is easy to obtain. This method only makes assumptions about the mean and the second moment. For further details, see Bollerslev and Wooldridge (1992) and Newey and Steigerwald (1997).
4.2 ML-CCF Estimators
ML estimation is the most common method of estimating the parameters of stochastic processes if the probability density has an analytical form. It provides a consistent approach to parameter estimation problems, and ML estimators become minimum variance unbiased estimators as the sample size increases.^5
Suppose that $X$ is an $N$-dimensional continuous random variable with probability density function $f(X, \theta)$, where $\theta = \{\theta_1, \dots, \theta_k\}$ are $k$ unknown constant parameters which need to be estimated. Given a sequence of observations $\{X_t\}$ sampled at $t = 1, 2, \dots, n$, the log-likelihood function of the sample is given by:
$$L(X_1, \dots, X_n, \theta) = \sum_{t=1}^{n}\ln f(X_t, \theta). \tag{4.3}$$
The maximum likelihood based estimators of $\theta$ are obtained by maximizing $L(\cdot)$,
$$\hat{\theta}_{ml} = \arg\max_{\theta} L(X_1, \dots, X_n, \theta) = \arg\max_{\theta}\sum_{t=1}^{n}\ln\left(f(X_t, \theta)\right). \tag{4.4}$$
^4 It turns out that as $t \to \infty$, there exists a limiting distribution for $X_t$. This limiting distribution is called the stationary distribution and its Fourier transform is called the stationary characteristic function (see [44]).
^5 Any statistic that converges in probability to a parameter is called a consistent estimator of that parameter. By unbiased, we mean that the mathematical expectation of the estimator of a parameter is equal to the parameter. By minimum variance, we mean that the estimator has the smallest variance among all unbiased estimators of the parameter (see [42]).
Now, for the models we adopt, the CCF, $\phi(s, \theta, X_{t+1} \mid X_t)$, of the sample is known,
often in closed form, as an exponential of an affine function of $X_t$. Thus the conditional
density function of $X_{t+1}$ given $X_t$ can be obtained by the Fourier transform
of the CCF:

$$f(X_{t+1}, \theta \mid X_t) = \frac{1}{(2\pi)^N} \int_{\mathbb{R}^N} \phi(s, \theta, X_{t+1} \mid X_t)\, e^{-is \cdot X_{t+1}}\, ds. \tag{4.5}$$

One can use the standard ML estimation based on this conditional density function
to obtain MLCCF estimators of the sample as:

$$\hat{\theta}_{CCF} = \arg\max_{\theta} \sum_{t=1}^{n-1} \ln(f(X_{t+1}, \theta \mid X_t)). \tag{4.6}$$
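The inversion in (4.5) can be checked numerically in one dimension. The sketch below uses a Gaussian transition as a stand-in (an assumption for illustration, not one of the thesis models), because its conditional density is known exactly; the truncation bound `s_max`, the number of steps and all names are hypothetical choices.

```python
import cmath
import math

def gaussian_ccf(s, mean, var):
    """CCF of a Gaussian transition: E[exp(i s X_{t+1}) | X_t] = exp(i s mean - var s^2 / 2)."""
    return cmath.exp(1j * s * mean - 0.5 * var * s * s)

def density_from_ccf(x_next, ccf, s_max=40.0, n_steps=4000):
    """Approximate (1/2pi) * integral of ccf(s) exp(-i s x) ds over [-s_max, s_max].

    Uses the compound trapezoidal rule (first and last terms halved).
    """
    ds = 2.0 * s_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        s = -s_max + k * ds
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * (ccf(s) * cmath.exp(-1j * s * x_next)).real
    return total * ds / (2.0 * math.pi)

mean, var = 0.3, 0.25      # assumed conditional mean and variance given X_t
x_next = 0.5
f_numeric = density_from_ccf(x_next, lambda s: gaussian_ccf(s, mean, var))
f_exact = math.exp(-(x_next - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
```

Summing `math.log(density_from_ccf(...))` over consecutive observation pairs would give the objective in (4.6) for this toy transition.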
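Take Model 1a (3.6) as an example. The conditional density function of $X_{t+1}$
given $X_t$ of the sample is of the form:

$$\begin{aligned}
f(X_{t+1}, \theta \mid X_t) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \phi(s, \theta, X_{t+1} \mid X_t)\, e^{-isX_{t+1}}\, ds \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-isY_t}\, h(\theta, s)\, ds,
\end{aligned} \tag{4.7}$$

where

$$Y_t = (X_{t+1} - \alpha) - e^{-\kappa}(X_t - \alpha), \tag{4.8}$$

and

$$\begin{aligned}
h(\theta, s) = \exp\Big( &-\frac{\sigma^2 s^2}{4\kappa}\,(1 - e^{-2\kappa}) + \frac{i\omega(1 - 2\psi)}{\kappa}\,\big(\arctan(\gamma s e^{-\kappa}) - \arctan(\gamma s)\big) \\
&+ \frac{\omega}{2\kappa} \ln \frac{1 + \gamma^2 s^2 e^{-2\kappa}}{1 + \gamma^2 s^2} \Big).
\end{aligned} \tag{4.9}$$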
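Equations (4.7) to (4.9) translate directly into a short numerical routine. The sketch below evaluates $h(\theta, s)$ and the truncated inversion integral by the trapezoidal rule, using the Model 1a estimates of Table 5.5 as inputs; the truncation bound `s_max`, the grids and the function names are assumptions made for illustration, and this is not the Matlab routine used for the thesis.

```python
import cmath
import math

def h(s, kappa, sigma2, omega, psi, gamma):
    """Equation (4.9): characteristic function of Y_t for Model 1a (one-hour step)."""
    decay = math.exp(-kappa)
    diffusion = -sigma2 * s * s * (1.0 - decay ** 2) / (4.0 * kappa)
    jump_imag = (omega * (1.0 - 2.0 * psi) / kappa) * (
        math.atan(gamma * s * decay) - math.atan(gamma * s))
    jump_real = (omega / (2.0 * kappa)) * math.log(
        (1.0 + (gamma * s * decay) ** 2) / (1.0 + (gamma * s) ** 2))
    return cmath.exp(diffusion + 1j * jump_imag + jump_real)

def density(y, params, s_max=100.0, n_steps=2000):
    """Equation (4.7) truncated to [-s_max, s_max], trapezoidal rule in s."""
    ds = 2.0 * s_max / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        s = -s_max + k * ds
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * (cmath.exp(-1j * s * y) * h(s, *params)).real
    return total * ds / (2.0 * math.pi)

# Table 5.5 estimates: kappa, sigma^2, omega, psi, gamma (alpha only enters through Y_t).
params = (0.1124, 0.0144, 1.2260, 0.5261, 0.3367)

dy = 0.1
grid = [-8.0 + j * dy for j in range(161)]
values = [density(y, params) for y in grid]
total_mass = sum(values) * dy      # should be close to h(theta, 0) = 1
```

Two quick sanity checks on such a routine are that $h(\theta, 0) = 1$ and that the recovered density integrates to approximately one over a wide enough range of $Y_t$.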
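To assist in computing the integral in (4.7) we define

$$F(Y_t, \theta) := f(X_{t+1}, \theta \mid X_t) = \frac{1}{2\pi} \lim_{R \to \infty} \int_{-R}^{R} e^{-isY_t}\, h(\theta, s)\, ds. \tag{4.10}$$

Notice that $h(\theta, s)$ is continuous in $s$ and, because the logarithmic term in (4.9) is
non-positive for $\kappa > 0$, $|h(\theta, s)| \le \exp\left(-\frac{\sigma^2 s^2}{4\kappa}(1 - e^{-2\kappa})\right)$. Thus one can
truncate the integral to a finite interval $[-R, R]$ outside of which the function $h(\theta, s)$
to be integrated is negligibly small. Then, for this choice of $R$,

$$F(Y_t, \theta) \approx \frac{1}{2\pi} \int_{-R}^{R} e^{-isY_t}\, h(\theta, s)\, ds. \tag{4.11}$$

Also, one can discretize $Y_t$ into $M$ subintervals such that:⁶

$$\begin{aligned}
Y_n &= n\Delta Y_t = n\left(\frac{Y_t}{M}\right), \\
s_k &= k\Delta s = k\left(\frac{R}{M}\right), \\
F(Y_t, \theta) &\approx \frac{1}{2\pi} \frac{R}{M} \sideset{}{''}\sum_{n=-M}^{M-1} \left( e^{-ink\frac{RY_t}{M^2}}\, h\!\left(\theta, \frac{nR}{M}\right) \right).
\end{aligned} \tag{4.12}$$

If we arrange $\frac{RY_t}{M} = 2\pi$, then we have

$$F(Y_t, \theta) \approx \frac{1}{2\pi} \frac{R}{M} \sideset{}{''}\sum_{n=-M}^{M-1} \left( e^{-ink\frac{2\pi}{M}}\, h\!\left(\theta, \frac{nR}{M}\right) \right). \tag{4.13}$$

One can approximate $F(Y_t, \theta)$ by the discrete Fourier transform (DFT) of $h(\theta, \frac{nR}{M})$,
and the integral in Equation (4.7) can be estimated on a suitable grid of $s$ values by
a fast Fourier transform (FFT) algorithm.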
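The link between the truncated integral and the DFT can be sketched as follows. The example uses centred grids $s_k = (k - M/2)\Delta s$ and $y_j = (j - M/2)\Delta y$ with $\Delta s\, \Delta y = 2\pi/M$, which is one common way of arranging the grids and is an assumption here rather than the exact indexing of (4.12) and (4.13). A small recursive radix-2 FFT is written out only to keep the example self-contained, and the standard normal characteristic function stands in for $h(\theta, s)$ so that the result can be compared with a known density.

```python
import cmath
import math

def fft(values):
    """Recursive radix-2 FFT: out[j] = sum_k values[k] * exp(-2*pi*i*j*k/n), n a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for j in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * j / n) * odd[j]
        out[j] = even[j] + twiddle
        out[j + n // 2] = even[j] - twiddle
    return out

def density_on_grid(char_fn, m=256, ds=0.25):
    """Evaluate (1/2pi) * integral of char_fn(s) exp(-i s y) ds on y_j = (j - m/2) * dy.

    Uses s_k = (k - m/2) * ds and dy = 2*pi / (m * ds), so that ds * dy = 2*pi / m.
    m must be a power of two and a multiple of 4, so the centring factors reduce to signs.
    """
    dy = 2.0 * math.pi / (m * ds)
    shifted = [((-1) ** k) * char_fn((k - m // 2) * ds) for k in range(m)]
    spectrum = fft(shifted)
    ys = [(j - m // 2) * dy for j in range(m)]
    fs = [((-1) ** j) * spectrum[j].real * ds / (2.0 * math.pi) for j in range(m)]
    return ys, fs

# Stand-in characteristic function: standard normal, exp(-s^2 / 2).
ys, fs = density_on_grid(lambda s: cmath.exp(-0.5 * s * s))
exact = [math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi) for y in ys]
max_error = max(abs(a - b) for a, b in zip(fs, exact))
```

One transform returns the density on the whole grid of $y$ values at once, which is what makes the likelihood evaluation cheap: each observed $Y_t$ can then be looked up (or interpolated) on that grid instead of requiring its own quadrature.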
⁶ Here we use the compound trapezoidal rule to approximate the integral, and $\sideset{}{''}\sum_i (A_i)$ denotes
the sum of the $A_i$ with the first and last terms halved.

4.3 MLMCCF Estimators
MLCCF estimators are asymptotically efficient if all of the state variables are observable.
But for those multi-factor models with unobservable state variables such as
Model 3a and Model 3b, MLCCF estimators cannot be obtained directly. If option
prices are available, implied volatilities can be calculated from option prices observed
in the market. Various numerical methods have been proposed for estimating implied
volatility functions from option prices (see [45], [46], [47]). Then one can use
those values as the data of volatilities and implement MLCCF estimation.
But in our case, option prices are not available. Following Chacko and Viceira
(1999) we can integrate the unobservable variable (volatility) out of the joint CCF of
the log price and the volatility, and set $s_v = 0$ to get the so-called marginal CCF.
Based on this marginal CCF, we can implement an ML-based estimation. This is
somewhat similar to LMLCCF or Singleton's SMM method. But LMLCCF estimation
not only keeps $s_v = 0$ but also needs to utilize the volatility information
(not workable in our case). Singleton's SMM method integrates out the unobservable
variable in the CCF by simulation. This requires a huge number of simulated
paths of the volatility and can be quite time-consuming. Furthermore, this induces
an estimation bias due to the discretization used in the simulation (see [13]). Meanwhile,
compared to the SGMM method that we will introduce later on, MLMCCF
estimation avoids the so-called ad hoc moment conditions selection problem and is
easier to implement in the case of stochastic volatility models.
Take Model 3a as an example. Recall that the volatility follows a square-root
process such as

$$dV_t = \kappa_v(\alpha_v - V_t)\,dt + \sigma_v \sqrt{V_t}\, dW_v. \tag{4.14}$$
The infinitesimal generator of the square-root process is:⁷

$$Lf(v) = \frac{\sigma_v^2 v}{2} \frac{\partial^2 f}{\partial v^2} + \kappa_v(\alpha_v - v) \frac{\partial f}{\partial v}. \tag{4.15}$$

Let $\mu_t$ be the distribution function of $V_t$; then it solves the forward Kolmogorov
equation:⁸

$$\mu_t(Lf) = \frac{d}{dt} \mu_t(f), \tag{4.16}$$

with $\mu_t(f) := \int f(v)\, d\mu_t$.
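In particular, let $\mu$ be the stationary distribution of the volatility and $\hat{\mu}$ its
characteristic function (the stationary characteristic function). In this case, with
$f(v) = e^{iuv}$ and $\hat{\mu}(u) := \mu(e^{iuv})$, we have

$$\begin{aligned}
Le^{iuv} &= -\frac{\sigma_v^2 v}{2} u^2 e^{iuv} + i\kappa_v(\alpha_v - v)\, u\, e^{iuv} \\
&= \left( iv\left(\frac{i\sigma_v^2 u^2}{2} - \kappa_v u\right) + i\kappa_v \alpha_v u \right) e^{iuv},
\end{aligned} \tag{4.17}$$

and

$$\begin{aligned}
\mu(Le^{iuv}) &= \left(\frac{i\sigma_v^2 u^2}{2} - \kappa_v u\right) \int iv\, e^{iuv}\, d\mu + i\kappa_v \alpha_v u \int e^{iuv}\, d\mu \\
&= \left(\frac{i\sigma_v^2 u^2}{2} - \kappa_v u\right) \frac{d\hat{\mu}(u)}{du} + i\kappa_v \alpha_v u\, \hat{\mu}(u).
\end{aligned} \tag{4.18}$$

Because $\frac{d\mu(\cdot)}{dt} = 0$, the right-hand side of (4.18) vanishes; dividing by $u$, we have

$$\left(\frac{i\sigma_v^2 u}{2} - \kappa_v\right) \frac{d\hat{\mu}(u)}{du} + i\kappa_v \alpha_v\, \hat{\mu}(u) = 0, \tag{4.19}$$

with $\hat{\mu}(0) = 1$. Then the solution for (4.19) has the form:

$$\hat{\mu}(u) = \left(1 - \frac{iu\sigma_v^2}{2\kappa_v}\right)^{-2\kappa_v \alpha_v / \sigma_v^2}. \tag{4.20}$$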
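A quick numerical check of (4.19) and (4.20) is sketched below. The parameter values are hypothetical and chosen only for illustration, and the derivative of the characteristic function is replaced by a central difference. The last lines use the standard identity that the derivative of a characteristic function at zero equals $i$ times the mean, which here should return the long-run volatility level $\alpha_v$.

```python
def mu_hat(u, kappa_v, alpha_v, sigma_v):
    """Equation (4.20): stationary characteristic function of the square-root process."""
    return (1.0 - 1j * u * sigma_v ** 2 / (2.0 * kappa_v)) ** (
        -2.0 * kappa_v * alpha_v / sigma_v ** 2)

def ode_residual(u, kappa_v, alpha_v, sigma_v, step=1e-5):
    """Left-hand side of (4.19), with d mu_hat / du replaced by a central difference."""
    derivative = (mu_hat(u + step, kappa_v, alpha_v, sigma_v)
                  - mu_hat(u - step, kappa_v, alpha_v, sigma_v)) / (2.0 * step)
    return ((1j * sigma_v ** 2 * u / 2.0 - kappa_v) * derivative
            + 1j * kappa_v * alpha_v * mu_hat(u, kappa_v, alpha_v, sigma_v))

# Hypothetical parameter values, chosen only for illustration.
kappa_v, alpha_v, sigma_v = 0.5, 0.04, 0.3

residuals = [abs(ode_residual(u, kappa_v, alpha_v, sigma_v))
             for u in (-3.0, -0.5, 0.0, 1.0, 4.0)]

# Mean of the stationary law via mu_hat'(0) = i * E[V]; it should equal alpha_v.
stationary_mean = ((mu_hat(1e-5, kappa_v, alpha_v, sigma_v)
                    - mu_hat(-1e-5, kappa_v, alpha_v, sigma_v)) / 2e-5 / 1j).real
```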
Recall that the joint CCF of the log price and volatility in this model is defined
as

$$\begin{aligned}
&\phi(s_x, s_v, \theta, X_T, V_T \mid X_t, V_t) \\
&\quad = \exp\big(A(s_x, s_v, t, T, \theta) X_t + B(s_x, s_v, t, T, \theta) V_t + C(s_x, s_v, t, T, \theta)\big),
\end{aligned} \tag{4.21}$$

where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ are the solutions of system (Equation (3.36)). As the
stochastic volatility $V_t$ is unobservable, we cannot estimate the parameters of stochastic
models directly from the joint CCF of the log price and volatility. Let us define
the marginal CCF as

$$\begin{aligned}
\phi(s_x, \theta, X_T \mid X_t) &= \int_0^{\infty} \phi(s_x, 0, \theta, X_T, V_T \mid X_t, V_t)\, d\mu \\
&= e^{A(s_x, 0, t, T, \theta) X_t + C(s_x, 0, t, T, \theta)} \int_0^{\infty} e^{B(s_x, 0, t, T, \theta) V_t}\, d\mu \\
&= e^{A(s_x, 0, t, T, \theta) X_t + C(s_x, 0, t, T, \theta)}\, \hat{\mu}(-iB(s_x, 0, t, T, \theta)).
\end{aligned} \tag{4.22}$$

⁷ Taken from Rama Cont [48].
⁸ For more details about the forward Kolmogorov equation, see Paul Wilmott [19].
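Applying Equation (4.20), we obtain the marginal CCF of the form

$$\phi(s_x, \theta, X_T \mid X_t) = e^{A(s_x, 0, t, T, \theta) X_t + C(s_x, 0, t, T, \theta)} \left(1 - \frac{B(s_x, 0, t, T, \theta)\, \sigma_v^2}{2\kappa_v}\right)^{-2\kappa_v \alpha_v / \sigma_v^2}. \tag{4.23}$$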
Through the Fourier transform, the marginal conditional density function is given
by

$$f(X_{t+1}, \theta \mid X_t) = \frac{1}{2\pi} \int_{\mathbb{R}} \phi(s_x, \theta, X_{t+1} \mid X_t)\, e^{-is_x X_{t+1}}\, ds_x. \tag{4.24}$$

Then, given a sample $\{X_t, t = 1, \cdots, n\}$, one can implement the maximum likelihood
estimation based on this marginal distribution of the observed variables (electricity
prices), and obtain MLMCCF estimators as

$$\hat{\theta}_{MCCF} = \arg\max_{\theta} \sum_{t=1}^{n-1} \ln(f(X_{t+1}, \theta \mid X_t)). \tag{4.25}$$
Notice that since we only rely on the level of the electricity prices in the previous
period, we lose efficiency. Moreover, the point estimates (including the SGMM
estimators we will discuss later on), as pointed out by Chacko and Viceira (1999),
are biased and inconsistent (see [13]). What is more, the theoretical value for the
bias is hard to calculate as we do not have closed forms for $B(\cdot)$ and $C(\cdot)$. Following
Chacko and Viceira (1999), we try to correct the bias by a bootstrap method.
Specifically, we simulate 500 paths with a given parameter $\theta_0$. For each path, there
are 19,704 hourly observations (same length as the actual data). The estimates $\hat{\theta}_i$,
$i = 1, \cdots, n$, obtained from the simulated paths, result in a distribution for each
parameter. We will regard the difference between the mean of those estimates and
the given parameter as the bias, i.e.,

$$\text{bias} = \theta_0 - \frac{1}{n} \sum_{i=1}^{n} \hat{\theta}_i, \tag{4.26}$$

with $n = 500$ in our setting. We will try adjusting the estimates from the actual
data by this bias in the next chapter.
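The bootstrap recipe behind (4.26) can be sketched on a deliberately simple stand-in problem (an assumption for illustration, not the MLMCCF estimator itself): the ML estimator of a Gaussian variance, which is known to be biased downward in short samples. The path length, the seed, the toy data and the final correction step are all hypothetical.

```python
import random

def estimate_variance_ml(xs):
    """ML estimator of a Gaussian variance (divides by n, hence biased in small samples)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def bootstrap_bias(theta_0, n_paths, path_length, rng):
    """Equation (4.26): bias = theta_0 - average of the estimates over simulated paths."""
    estimates = []
    for _ in range(n_paths):
        path = [rng.gauss(0.0, theta_0 ** 0.5) for _ in range(path_length)]
        estimates.append(estimate_variance_ml(path))
    return theta_0 - sum(estimates) / n_paths

rng = random.Random(42)
theta_0 = 4.0                      # assumed "given parameter" (a variance)
bias = bootstrap_bias(theta_0, n_paths=500, path_length=10, rng=rng)

# Adjusting an estimate from (made-up) observed data by the simulated bias.
observed = [1.2, -0.7, 3.1, 0.4, -2.5, 1.9, -1.1, 0.3, 2.2, -0.8]
raw_estimate = estimate_variance_ml(observed)
corrected_estimate = raw_estimate + bias
```

For this toy estimator the theoretical bias is $\theta_0 / 10 = 0.4$, so the simulated value can be checked against it; for the stochastic volatility models no such closed form is available, which is why the simulation is needed.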
4.4 Spectral GMM Estimators
In this section, we will describe the SGMM estimators constructed by Chacko and
Viceira (1999). This method is essentially GMM in a complex setting. GMM estimation
is one of the most fundamental estimation methods in statistics and econometrics,
especially after Hansen's influential paper [49] appeared in 1982. Unlike ML
estimation, which requires the complete specification of the model and its probability
distribution, GMM estimation requires neither full knowledge of the specification
nor strong distributional assumptions. GMM estimators are best suited to
study models that are only partially specified, and they are attractive alternatives
to likelihood-type estimators.

We start with some basics of GMM, and define the moment conditions and moment
functions first.⁹
Definition 4.1 Suppose that we have a set of random variables $\{x_t, t = 1, 2, \dots\}$.
Let $\theta = \{\theta_1, \dots, \theta_k\}$ be an unknown tuple with true value $\theta_0$ to be estimated; $\theta_0$, $\theta$
in some parameter space $\Theta$. Then the $q$-dimensional vector of functions $m(x_t, \theta)$ is
called an (unconditional) moment function if the following moment conditions
hold:

$$E[m(x_t, \theta_0)] = 0. \tag{4.27}$$

Notice that $\theta$ is a $k$-tuple vector and $E[m(x_t, \theta_0)] = 0$ consists of $q$ equations. If
one has as many moment conditions as parameters to be estimated ($q = k$), one can
simply solve the $k$ equations in $k$ unknowns to obtain the estimates. If we have fewer
moment conditions than unknowns ($q < k$), then we cannot identify $\theta$. In this case,
we can "create" more moment conditions by the so-called weighting functions (often
termed "instruments" in the GMM literature, see [50]). If we have more functions
than unknowns ($q > k$), then this is an over-identified problem. Such cases of
over-identification arise easily, and then the moment estimator is not well-defined: different
choices of moment conditions may lead to different estimates. GMM is a method to
solve this kind of over-identification problem.
Let us take the standard linear regression model as an example and consider

$$y = x'\theta_0 + \varepsilon. \tag{4.28}$$

Here $y$ is the response variable, $x = [x_1, x_2, \cdots, x_k]'$ is a $k$-dimensional vector of
regressors, $x'$ (as before) is its transpose, and $\theta = [\theta_1, \dots, \theta_k]'$ is the unknown vector
of parameters with true value $\theta_0$. We assume that $\varepsilon$ has zero expectation and is
uncorrelated with $x$. Using the Law of Iterated Expectations¹⁰ we find that

$$E[x\varepsilon] = E[E[x\varepsilon \mid x]] = E[xE[\varepsilon \mid x]] = 0.$$

Therefore, we can have the moment functions $m((x, y), \theta) = x(y - x'\theta)$. These
moment functions are well defined, since, by the assumptions

$$E[m((x, y), \theta_0)] = E[x(y - x'\theta_0)] = E[x\varepsilon] = 0. \tag{4.29}$$

⁹ For more details see [50].
Suppose $n > k$ observations on the response variable are available, say $y_1, y_2, \dots, y_n$.
Along with each observed response $y_t$, we have a $k$-dimensional observation vector
of regressors $x_t$. We have exactly as many moment conditions as parameters to be
estimated, since $x_t$ is a $k$-dimensional vector. If we assume that the strong law of
large numbers holds then we have

$$\frac{1}{n} \sum_{t=1}^{n} m((x_t, y_t), \hat{\theta}_n) \to E[m((x, y), \theta_0)] = 0, \quad \text{almost surely.} \tag{4.30}$$

So the Method of Moments (MM) estimator for this model is just the solution of

$$\frac{1}{n} \sum_{t=1}^{n} x_t (y_t - x_t' \hat{\theta}_n) = 0, \tag{4.31}$$

which gives

$$\hat{\theta}_n = \left(\sum_{t=1}^{n} x_t x_t'\right)^{-1} \sum_{t=1}^{n} x_t y_t = (X'X)^{-1} X'y \tag{4.32}$$

with $X = [x_1, \cdots, x_n]'$ and $y = [y_1, \cdots, y_n]'$. Thus, the ordinary least squares (OLS)
estimator is an MM estimator.
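To make (4.31) and (4.32) concrete, the sketch below solves the sample moment conditions for a regression with an intercept and one slope ($k = 2$) and then verifies that the sample moments vanish at the estimate. The data are made up, and the helper names are hypothetical.

```python
def solve_2x2(a, b):
    """Solve a 2x2 linear system a @ theta = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]

def mm_estimator(xs, ys):
    """Equation (4.32): theta_hat = (sum x_t x_t')^{-1} sum x_t y_t, for 2-dimensional x_t."""
    sxx = [[sum(x[i] * x[j] for x in xs) for j in range(2)] for i in range(2)]
    sxy = [sum(x[i] * y for x, y in zip(xs, ys)) for i in range(2)]
    return solve_2x2(sxx, sxy)

def sample_moments(xs, ys, theta):
    """Left-hand side of (4.31): (1/n) sum x_t (y_t - x_t' theta)."""
    n = len(xs)
    return [sum(x[i] * (y - x[0] * theta[0] - x[1] * theta[1])
                for x, y in zip(xs, ys)) / n
            for i in range(2)]

# Made-up data: regressors are (1, u_t), so theta = (intercept, slope).
us = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]
xs = [(1.0, u) for u in us]

theta_hat = mm_estimator(xs, ys)
moments_at_estimate = sample_moments(xs, ys, theta_hat)
```

With these numbers the estimate is $109/105 \approx 1.038$ for the intercept and $209.1/105 \approx 1.991$ for the slope, the same values an ordinary least squares fit returns.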
Notice that we specified relatively little information about the error term $\varepsilon$. For
ML estimation we would be required to give the distribution of the error term $\varepsilon$, as
well as the autocorrelation and heteroskedasticity,¹¹ which are also not required in
formulating the moment conditions.

¹⁰ For more details see [42].
¹¹ Heteroskedasticity means that the variance of the errors is not constant across observations.
Now instead of assuming that the error term has zero expectation on certain
observed variables, we can specify the moment conditions directly by requiring the
error term to be uncorrelated with certain observed "instruments". Let us consider
the previous model again. This time we do not assume the error term has zero
expectation, but that it is still uncorrelated with the regressors. Suppose we have a
$q$-dimensional observed instrument $z$, ($q \ge k$) and $E[z\varepsilon] = 0$. Thus we have the
moment conditions

$$E[z\varepsilon] = E[z(y - x'\theta_0)] = 0, \tag{4.33}$$

and the moment functions

$$m((x, y, z), \theta) = z(y - x'\theta). \tag{4.34}$$
If $q = k$, then this is also a well-defined problem. Let $z_t$ denote the
$k$-dimensional observation vector of instruments corresponding to $y_t$. We assume that the strong
law of large numbers holds so that we have

$$\frac{1}{n} \sum_{t=1}^{n} m((x_t, y_t, z_t), \hat{\theta}_n) \to E[m((x, y, z), \theta_0)] = 0, \quad \text{almost surely.} \tag{4.35}$$

Therefore we solve

$$\frac{1}{n} \sum_{t=1}^{n} z_t (y_t - x_t' \hat{\theta}_n) = 0, \tag{4.36}$$

which gives

$$\hat{\theta}_n = \left(\sum_{t=1}^{n} z_t x_t'\right)^{-1} \sum_{t=1}^{n} z_t y_t = (Z'X)^{-1} Z'y, \tag{4.37}$$

with $Z = [z_1, \cdots, z_n]'$.
Definition 4.2¹² Suppose we have an observed sample $\{x_t, t = 1, 2, \dots, n\}$ from
a stochastic process $x$. Assume that the moment conditions $E[m(x_t, \theta_0)] = 0$ hold.
One can define the criterion function as

$$Q_n(\theta) = \left[\frac{1}{n} \sum_{t=1}^{n} m(x_t, \theta)\right]' W_n \left[\frac{1}{n} \sum_{t=1}^{n} m(x_t, \theta)\right], \tag{4.38}$$

where $W_n$ is a $q \times q$ symmetric positive definite matrix. The GMM estimator $\hat{\theta}_n$
of $\theta$ associated with $W_n$ is the solution to the problem:

$$\begin{aligned}
\hat{\theta}_n &= \arg\min_{\theta} Q_n(\theta) \\
&= \arg\min_{\theta} \left[\frac{1}{n} \sum_{t=1}^{n} m(x_t, \theta)\right]' W_n \left[\frac{1}{n} \sum_{t=1}^{n} m(x_t, \theta)\right].
\end{aligned} \tag{4.39}$$
Consider the linear regression model with instruments again, and suppose that
we have $q > k$ moment functions this time. Suppose we choose

$$W_n = \left(\frac{1}{n} \sum_{t=1}^{n} z_t z_t'\right)^{-1} = n(Z'Z)^{-1}. \tag{4.40}$$

Then the criterion function is

$$Q_n(\theta) = n^{-1} (Z'y - Z'X\theta)' (Z'Z)^{-1} (Z'y - Z'X\theta). \tag{4.41}$$

Differentiating with respect to $\theta$ gives the first order conditions

$$\left.\frac{\partial Q_n(\theta)}{\partial \theta}\right|_{\theta = \hat{\theta}_n} = -n^{-1}\, 2X'Z(Z'Z)^{-1}(Z'y - Z'X\hat{\theta}_n) = 0, \tag{4.42}$$

and solving for $\hat{\theta}_n$ gives

$$\hat{\theta}_n = (X'Z(Z'Z)^{-1}Z'X)^{-1} X'Z(Z'Z)^{-1}Z'y. \tag{4.43}$$

This is the standard instrumental variables estimator for the case where there are more
instruments than regressors.

¹² Taken from [50].
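The over-identified case can be sketched with one regressor ($k = 1$) and two instruments ($q = 2$), so that (4.43) reduces to a ratio of two quadratic forms. The simulated data, the coefficients used to generate them and all names below are assumptions for illustration; the regressor is made correlated with the error so that the instruments actually matter.

```python
import random

def iv_gmm(xs, ys, zs):
    """Equation (4.43) for one regressor (k = 1) and two instruments (q = 2)."""
    zx = [sum(z[i] * x for z, x in zip(zs, xs)) for i in range(2)]             # Z'X
    zy = [sum(z[i] * y for z, y in zip(zs, ys)) for i in range(2)]             # Z'y
    zz = [[sum(z[i] * z[j] for z in zs) for j in range(2)] for i in range(2)]  # Z'Z
    det = zz[0][0] * zz[1][1] - zz[0][1] * zz[1][0]
    zz_inv = [[zz[1][1] / det, -zz[0][1] / det],
              [-zz[1][0] / det, zz[0][0] / det]]

    def quad(a, b):
        """a' (Z'Z)^{-1} b for 2-vectors a and b."""
        return sum(a[i] * zz_inv[i][j] * b[j] for i in range(2) for j in range(2))

    theta_hat = quad(zx, zy) / quad(zx, zx)

    def criterion(theta):
        """Equation (4.41): n^{-1} (Z'y - Z'X theta)' (Z'Z)^{-1} (Z'y - Z'X theta)."""
        g = [zy[i] - zx[i] * theta for i in range(2)]
        return quad(g, g) / len(xs)

    return theta_hat, criterion

rng = random.Random(1)
theta_true, n = 2.0, 2000
zs, xs, ys = [], [], []
for _ in range(n):
    z1, z2, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    x = z1 + 0.5 * z2 + u                      # regressor, correlated with the error through u
    eps = 0.8 * u + 0.5 * rng.gauss(0, 1)      # error, uncorrelated with the instruments
    zs.append((z1, z2))
    xs.append(x)
    ys.append(theta_true * x + eps)

theta_hat, criterion = iv_gmm(xs, ys, zs)
```

Because the criterion is a convex quadratic in θ, the closed form is its exact minimizer, which can be confirmed by evaluating `criterion` slightly to either side of `theta_hat`.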
Now, the definition of the CCF of the sample implies that

$$E[\exp(is \cdot X_T) - \phi(s, \theta, X_T \mid X_t)] = 0, \quad s \in \mathbb{R}^N. \tag{4.44}$$

By taking real and imaginary parts of this function, we get the following pair of
moment conditions:

$$\begin{aligned}
E[\mathrm{Re}(\exp(is \cdot X_T) - \phi(s, \theta, X_T \mid X_t))] &= 0, \\
E[\mathrm{Im}(\exp(is \cdot X_T) - \phi(s, \theta, X_T \mid X_t))] &= 0.
\end{aligned} \tag{4.45}$$
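Thus one can define a set of moment functions $m(s, \theta, X_T, X_t)$ as follows:¹³

$$\begin{aligned}
m(s, \theta, X_T, X_t) &= \varepsilon_t(s, \theta, X_T, X_t) = \begin{bmatrix} \varepsilon_t^{Re}(s, \theta, X_T, X_t) \\ \varepsilon_t^{Im}(s, \theta, X_T, X_t) \end{bmatrix}, \\
\varepsilon_t^{Re}(s, \theta, X_T, X_t) &:= \mathrm{Re}(\varepsilon_t(s, \theta, X_T, X_t)) = \mathrm{Re}(\exp(is \cdot X_T) - \phi(s, \theta, X_T \mid X_t)), \\
\varepsilon_t^{Im}(s, \theta, X_T, X_t) &:= \mathrm{Im}(\varepsilon_t(s, \theta, X_T, X_t)) = \mathrm{Im}(\exp(is \cdot X_T) - \phi(s, \theta, X_T \mid X_t)).
\end{aligned} \tag{4.46}$$

¹³ $\mathrm{Re}(A)$ denotes the real part of $A$, while $\mathrm{Im}(A)$ denotes the imaginary part of $A$.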
More generally, we can add a set of "instruments" or "weighting functions" to
obtain more moment restrictions. So one can define the moment function based on
the CCF as¹⁴

$$m(s, \theta, X_T, X_t) = \varepsilon_t(s, \theta, X_T, X_t) \otimes p(X_t), \tag{4.47}$$

where $p(X_t)$ are "instruments" independent of $\varepsilon_t(s, \theta, X_T, X_t)$. The SGMM estimator
is of the form:

$$\hat{\theta}_{SGMM} = \arg\min_{\theta \in \Theta} \left[\frac{1}{n} \sum_{t=1}^{n} m(s, \theta, X_T, X_t)\right]' W_n \left[\frac{1}{n} \sum_{t=1}^{n} m(s, \theta, X_T, X_t)\right]. \tag{4.48}$$
Just as for other GMM estimators, the asymptotic variance of the SGMM estimator
is minimized with the optimal weighting matrix $W_n = S^{-1}$, where $S$ is
the covariance matrix of the moment functions (see [13]). Under the usual regularity
conditions,¹⁵ according to Chacko and Viceira (1999), the SGMM estimator,
$\hat{\theta}_{SGMM}$, inherits the optimality properties of GMM estimators such as consistency
and asymptotic normality (see [49], [13]).

¹⁴ Let $A$, $B$ be $K \times L$ and $M \times N$ matrices with elements indexed as
$A[k, l]$, $k = 0, 1, \cdots, K - 1$, $l = 0, 1, \cdots, L - 1$,
$B[m, n]$, $m = 0, 1, \cdots, M - 1$, $n = 0, 1, \cdots, N - 1$.
We define the Kronecker product $A \otimes B$ to be the $KM \times LN$ matrix

$$A \otimes B := \begin{bmatrix}
A[0, 0]B & A[0, 1]B & \cdots & A[0, L-1]B \\
A[1, 0]B & A[1, 1]B & \cdots & A[1, L-1]B \\
\vdots & \vdots & \ddots & \vdots \\
A[K-1, 0]B & A[K-1, 1]B & \cdots & A[K-1, L-1]B
\end{bmatrix}$$

with elements $(A \otimes B)[m + kM, n + lN] := A[k, l]B[m, n]$.
¹⁵ We assume that Hansen's regularity conditions are satisfied. For more details see [49].

We can now apply SGMM to estimate stochastic volatility models like Model 3a
and Model 3b. Recall that the marginal CCF of the sample is given by:

$$\phi(s_x, \theta, X_{t+1} \mid X_t) = e^{A(s_x, 0, t, \theta) X_t + C(s_x, 0, t, \theta)} \left(1 - \frac{B(s_x, 0, t, \theta)\, \sigma_v^2}{2\kappa_v}\right)^{-2\kappa_v \alpha_v / \sigma_v^2}, \tag{4.49}$$
where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ are the solutions of systems (3.36) and (3.43) for Model
3a and Model 3b respectively. Given a sample $\{X_t, t = 1, \cdots, T\}$, we have moment
functions as follows:

$$\begin{aligned}
m(s_x, X_t, \theta) &= \varepsilon_t(s_x, X_t, \theta) = \begin{bmatrix} \varepsilon_t^{Re}(s_x, X_t, \theta) \\ \varepsilon_t^{Im}(s_x, X_t, \theta) \end{bmatrix}, \\
\varepsilon_t^{Re}(s_x, X_t, \theta) &= \big(\cos(s_x X_{t+1}) - \mathrm{Re}(\phi(s_x, \theta, X_{t+1} \mid X_t))\big) \otimes p(X_t), \\
\varepsilon_t^{Im}(s_x, X_t, \theta) &= \big(\sin(s_x X_{t+1}) - \mathrm{Im}(\phi(s_x, \theta, X_{t+1} \mid X_t))\big) \otimes p(X_t),
\end{aligned} \tag{4.50}$$

with $\phi(s_x, \theta, X_{t+1} \mid X_t)$ defined as in Equation (4.49). Following Chacko and Viceira
(1999), one can compute the $n$th conditional moment by simple substitution of
$s_x = n$ into Equation (4.50).¹⁶
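The construction of spectral moments in (4.50) can be sketched on a model whose CCF is elementary. The example below uses a Gaussian AR(1) transition as a stand-in (an assumption for illustration, since the marginal CCF (4.49) requires the numerical solutions $A$, $B$ and $C$), takes $s_x = 1, \dots, 6$ with $p(X_t) = 1$, and uses the identity matrix as a simple, non-optimal choice of $W_n$. The grid search, the seed and the parameter values are likewise hypothetical.

```python
import cmath
import math
import random

def ccf(s, x_t, a, v):
    """Conditional characteristic function of X_{t+1} = a * X_t + N(0, v) noise."""
    return cmath.exp(1j * s * a * x_t - 0.5 * v * s * s)

def spectral_moments(path, a, v, s_values):
    """Sample averages of the real and imaginary parts in (4.50), with p(X_t) = 1."""
    n = len(path) - 1
    moments = []
    for s in s_values:
        re_sum = 0.0
        im_sum = 0.0
        for t in range(n):
            phi = ccf(s, path[t], a, v)
            re_sum += math.cos(s * path[t + 1]) - phi.real
            im_sum += math.sin(s * path[t + 1]) - phi.imag
        moments.extend([re_sum / n, im_sum / n])
    return moments

def sgmm_objective(path, a, v, s_values):
    """Criterion (4.48) with the identity matrix as an assumed (non-optimal) W_n."""
    return sum(m * m for m in spectral_moments(path, a, v, s_values))

rng = random.Random(7)
a_true, v = 0.9, 0.05              # assumed persistence and innovation variance
path = [0.0]
for _ in range(1500):
    path.append(a_true * path[-1] + rng.gauss(0.0, math.sqrt(v)))

s_values = [1, 2, 3, 4, 5, 6]
grid = [0.50 + i / 100.0 for i in range(50)]      # candidate values 0.50, ..., 0.99
a_hat = min(grid, key=lambda a: sgmm_objective(path, a, v, s_values))
```

Here only the persistence parameter is estimated and the innovation variance is treated as known, which keeps the search one-dimensional; a full implementation would minimize over all parameters and would typically replace the identity matrix by an estimate of the optimal weighting matrix.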
In the next chapter, in order to construct the moment functions, we use the first
six spectral moments by setting $s_x = 1, 2, \cdots, 6$ and $p(X_t)$ as a $T$-dimensional vector
of 1s. Again, as we only exploited the information in the marginal CCF, the estimates
we obtain are biased and inconsistent (see [50]).¹⁷

¹⁶ Paraphrasing [50], the selection of an appropriate set of $s_x$ is crucial for the efficiency of GMM
estimation. A poor selection of $s_x$ which results in a poor choice of moment conditions may lead
to very inefficient estimators and can even cause identification problems. This is known as the ad
hoc choice of moment conditions problem in the GMM literature.
According to Carrasco et al. (see [29]), when $s_x$ goes through all real numbers, i.e., when the
number of moment conditions increases to infinity, the conditional density can be recovered. The
corresponding GMM estimator based on a continuum of moment conditions yields ML efficiency.
But the size of the weighting matrix is $n \times n$ where $n$ is the sample size. Therefore the computational
burden increases dramatically as the sample size increases. Also, the estimators will deteriorate
for large sample sizes because of the numerical errors associated with large matrix operations, according to
Mikhail Chernov (one of the authors of [29]).
¹⁷ We will use the GMM and MINZ Program Libraries for Matlab, by Michael T. Cliff, to perform
the estimation.
Chapter 5
Model Comparison
This chapter gives some empirical comparisons among the models we introduced in
Chapter 3. Three data sets, the hourly electricity prices (Hourly EP), on-peak daily
average electricity prices (Peak EP), and off-peak daily average electricity prices
(Off-peak EP), from January 1, 2002 to March 31, 2004, are used to evaluate the
price models.¹
5.1 Data Description
A summary of statistics for the hourly electricity price series from January 1, 2002
to March 31, 2004 is presented in Table 5.1. The statistics reported here are for
electricity prices (P), the change in electricity prices (dP), the logarithm of electricity
prices (ln(P)), and the log returns of electricity prices (d ln(P)), from one hour to
the next. Clearly, the price series are skewed with excess kurtosis.

The descriptive statistics for the on-peak/off-peak daily average electricity prices
during the period January 1, 2002 to March 31, 2004 are shown in Tables 5.2 and
5.3 respectively. The data are daily in frequency. As might be expected, Hourly
EP has more volatility, larger skewness and higher kurtosis than the daily average
on-peak/off-peak data. And the descriptive statistics for Peak EP are larger than
those for Off-peak EP.

¹ The hourly electricity prices were obtained from the public website of the Alberta Electric
System Operator. Altogether, we have 19,704 observations at an hourly frequency. The on-peak
and off-peak daily average electricity prices were computed according to the Alberta Energy and
Utilities Board. Thus we have 569 observations for the on-peak period, and 821 observations for
the off-peak period.

           Mean      Std.Dev   Skewness  Kurtosis  Minimum     Maximum
P          58.5911   64.2576   5.1325    46.6263   0.0100      999.9900
dP         0.0042    54.4378   0.4207    53.1331   −862.1000   826.3900
ln(P)      3.5922    0.8359    0.1089    4.1445    −4.6052     6.9077
d ln(P)    0.0000    0.5529    0.0133    8.4214    −6.0426     6.0234

Table 5.1: Descriptive statistics of Hourly EP
Note: here and in the remainder of this thesis, the column labeled Std.Dev reports
the standard deviation. Skewness and Kurtosis are the standardized third and fourth
moments around the mean, namely skewness $= \frac{E[(X - E[X])^3]}{(\mathrm{var}[X])^{1.5}}$, kurtosis $= \frac{E[(X - E[X])^4]}{(\mathrm{var}[X])^2}$.
For a normal distribution, skewness is equal to 0 and kurtosis is equal to 3.

           Mean      Std.Dev   Skewness  Kurtosis  Minimum     Maximum
P          68.3287   43.0324   2.8119    16.6829   12.2264     413.9657
dP         0.0150    46.5599   0.0897    11.1163   −283.8329   255.9136
ln(P)      4.0763    0.5340    0.1347    3.6559    2.5036      6.0258
d ln(P)    0.0004    0.5225    0.0910    5.0479    −2.3969     2.1600

Table 5.2: Descriptive statistics of Peak EP
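The skewness and kurtosis definitions given in the note to Table 5.1 can be written out as a short routine. This is a sketch using population (divide-by-n) moments, which is an assumption about how the table entries were computed; the two small data sets are made up so that the results can be verified by hand.

```python
def skewness_kurtosis(xs):
    """Standardized third and fourth central moments, as in the note to Table 5.1."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    third = sum((x - mean) ** 3 for x in xs) / n
    fourth = sum((x - mean) ** 4 for x in xs) / n
    return third / var ** 1.5, fourth / var ** 2

# Symmetric data: skewness 0, kurtosis (2/3) / (2/3)^2 = 1.5.
skew_sym, kurt_sym = skewness_kurtosis([-1.0, 0.0, 1.0])

# Right-skewed data: variance 3, third moment 6, fourth moment 21.
skew_right, kurt_right = skewness_kurtosis([0.0, 0.0, 0.0, 4.0])
```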
           Mean      Std.Dev   Skewness  Kurtosis  Minimum     Maximum
P          39.6513   29.8016   2.6022    14.9079   6.5713      299.2710
dP         0.0116    25.5697   0.2651    12.7988   −200.9500   128.0813
ln(P)      3.4647    0.6445    0.2188    2.7684    1.8827      5.7013
d ln(P)    0.0005    0.4710    0.1014    4.9058    −2.1208     2.2547

Table 5.3: Descriptive statistics of Off-peak EP

Figure 5.1 plots the histogram of the log returns of Hourly EP over the period
January 1, 2002 to March 31, 2004. There is a big spike in the middle which is a
quantization effect (mainly due to rounding to ±0.01). This is hard for most models
to capture. In order to get reasonable estimates, we remove systematic day to
day variations from the data prior to fitting. In this way, we can reduce the influence of
hourly price patterns and compensate for quantization effects. Specifically, if $P_t$ is
the actual data at the $i$th hour, $i = 1, \cdots, 24$, then the "deseasonalized" data $X_t$ is
obtained by:

$$X_t = \ln(P_t) - \text{Mean of the log price at the } i\text{th Hour}. \tag{5.1}$$

Figure 5.1: Histogram of the log returns of Hourly EP
Note: the price series we obtained from the public website of the Alberta Electric
System Operator have been rounded to the nearest two decimals. A significant
number of log returns are inside the range −0.01 to 0.01.

Figure 5.2 plots the histogram of the changes of the "deseasonalized" data. Table
5.4 presents descriptive statistics for the log of Hourly EP (ln(P)), the log returns of
Hourly EP, the "deseasonalized" data (X) and the changes of the deseasonalized data
(dX).

Figure 5.2: Histogram of the changes of the deseasonalized data (dX)
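Equation (5.1) amounts to subtracting, from each log price, the average log price observed at the same hour of the day. A minimal sketch follows; it assumes an hourly series that starts at the first hour of a day (so observation `t` belongs to hour `t % 24`), and the price series is made up for illustration.

```python
import math

def deseasonalize(prices):
    """Equation (5.1): log price minus the mean log price of the same hour of the day.

    `prices` is assumed to be an hourly series that starts at the first hour of a day,
    so observation t belongs to hour t % 24.
    """
    logs = [math.log(p) for p in prices]
    hourly_mean = []
    for hour in range(24):
        same_hour = logs[hour::24]
        hourly_mean.append(sum(same_hour) / len(same_hour))
    return [x - hourly_mean[t % 24] for t, x in enumerate(logs)]

# Made-up prices: three days with a daily pattern plus a small day-specific shift.
prices = [30.0 + 20.0 * math.sin(math.pi * (t % 24) / 24.0) + 2.0 * (t // 24)
          for t in range(72)]
x = deseasonalize(prices)
hour_means_after = [sum(x[hour::24]) / 3 for hour in range(24)]
```

By construction the deseasonalized series has mean zero at every hour of the day, which is consistent with the zero mean reported for X in Table 5.4.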
           Mean      Std.Dev   Skewness  Kurtosis  Minimum     Maximum
ln(P)      3.5922    0.8359    0.1089    4.1445    −4.6052     6.9077
d ln(P)    0.0000    0.5529    0.0133    8.4214    −6.0426     6.0234
X          0.0000    0.7344    0.3104    4.9780    −7.7314     3.5917
dX         0.0000    0.5362    0.0307    8.8230    −5.9823     5.8074

Table 5.4: Descriptive statistics of deseasonalized Hourly EP
5.2 Calibration
This section presents estimates from fitting each of the six models to Hourly EP
using the methodologies described in Chapter 4. In the remainder of this thesis, we
will refer to Model 1a, Model 2a and Model 3a as the one-jump version models and
refer to Model 1b, Model 2b and Model 3b as the two-jump version models. In all
cases, Hourly EP were deseasonalized prior to fitting.²

For Model 1a, Model 2a, Model 1b and Model 2b, we perform calibration via
MLCCF estimation. As we have computed the CCF of $X_{t+1}$ given $X_t$ for each
model in Chapter 3, just as for the example in Section 4.2, the conditional density of
$X_{t+1}$ given $X_t$ can be recovered through the Fourier transform.³ The estimates are
obtained by using standard ML estimation on these conditional densities. We used
the Optimization Toolbox in Matlab for the optimization. The optimizations converged in
less than 20 steps for Hourly EP. The actual computer time required for each optimization
is around 1 minute on a 2.46 GHz Pentium 4 PC. The resulting parameter estimates,
the corresponding standard errors (labeled as Std.) and t-ratios⁴ for Hourly EP via
MLCCF estimation are provided in Tables 5.5–5.10.

Notice that the models are generally consistent with each other. The mean-reverting
rate κ is similar for all the models, and is close to 0.11. Thus the half-life of
the mean-reverting process, ln(2)/κ, is about 6.3 hours.⁵ What is more, the jump parameters
ω, ψ and γ for all the one-jump version models are quite similar. Also, the jump
parameters $\omega_u$, $\gamma_u$, $\omega_d$ and $\gamma_d$ in the two-jump version models are similar. This
may be because the difference between the on-peak data and off-peak
data so obtained is not significant in Alberta (refer to Figure [?]).

Meanwhile, we observe that the frequencies of positive jumps and negative jumps
are quite close to each other. The probability of a positive jump ψ is roughly 50% in
the one-jump version models. Likewise, the upward jump arrival rate $\omega_u$ is roughly
equal to the downward jump arrival rate in the two-jump version models. According
to Barz (1999), the derived balance between upward and downward jumps can be
explained by the fact that we use the logarithm of the prices instead of the spot prices:
the magnitude of downward movements is amplified relative to upward movements.
What is more, he suggested that care must be taken to ensure one does not use a
misspecified model in this case (see [17]).

We also observe some different behaviour among these models. Incorporating
the on-peak and off-peak effects in Model 2a does not change most of the estimates
significantly compared with the estimates from Model 1a. Only the estimate for the
long-term mean in Model 1b is much smaller than the estimate for the long-term
mean in Model 1a. The on-peak coefficient $\alpha_1$ is close to the off-peak coefficient $\alpha_2$
in Model 2a. However, in Model 2b, this is not the case. Moreover, the probability
of positive jumps is slightly more than the probability of negative jumps in Model
1a, while we have slightly more negative jumps than positive jumps in Model 1b.

² In the remainder of this thesis, we just denote the deseasonalized Hourly EP as Hourly EP.
³ All the routines are realized in Matlab, and are available on request.
⁴ The t-ratio is defined as the ratio of the estimate to the estimate of the standard deviation of the
sampling distribution of the estimate. That is,

$$\text{t-ratio} = \frac{\text{Estimate}}{\text{Std.}}. \tag{5.2}$$

For large degrees of freedom (usually 30 or more), if the absolute value of the t-ratio is larger than
1.96, we can consider the parameter to be significant at the 95% level.
⁵ Half-life is a key property of a mean-reverting process. It is the time that it takes for the
price to revert half way back to its long-term mean level from its current level if no more random
shocks arrive. Typically, for a mean-reverting process, the half-life is equal to ln(2) over
the mean-reverting rate. For more details, see [23].
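The t-ratio in (5.2) and the 1.96 threshold can be applied directly to the Model 1a estimates and standard errors reported in Table 5.5. The short sketch below does this; the dictionary layout and variable names are assumptions for illustration.

```python
# Model 1a estimates and standard errors, as reported in Table 5.5.
table_5_5 = {
    "kappa":  (0.1124, 0.0040),
    "alpha":  (0.2368, 0.0250),
    "sigma2": (0.0144, 0.0008),
    "omega":  (1.2260, 0.0344),
    "psi":    (0.5261, 0.0061),
    "gamma":  (0.3367, 0.0058),
}

def t_ratio(estimate, std_error):
    """Equation (5.2): estimate divided by its standard error."""
    return estimate / std_error

t_ratios = {name: t_ratio(est, std) for name, (est, std) in table_5_5.items()}
significant = {name: abs(t) > 1.96 for name, t in t_ratios.items()}
```

The computed ratios reproduce the T-ratio column of Table 5.5 to the reported four decimals, and every parameter clears the 1.96 threshold.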
The graphic outputs from the optimization process are reported in Figures 5.3–
5.6. Each figure consists of three parts. The upper left plot is the empirical histogram
of "deviations from the expected values". For Model 1a and Model 1b, this is defined
as:⁶

$$Y_t = (X_{t+1} - \alpha) - e^{-\kappa}(X_t - \alpha), \tag{5.3}$$

where $X_t$ is the log price at time $t$, κ is the mean-reverting rate, and α is the long-term
mean. It can be shown that the density function of $Y_t$ is the same as $f(X_{t+1}, \theta \mid X_t)$
(with possibly a translated mean). Hence this histogram and the corresponding
density should have the same shape. For Model 2a and Model 2b, it is defined as:

$$Y_t = (X_{t+1} - \alpha(t)) - e^{-\kappa}(X_t - \alpha(t)), \tag{5.4}$$

where

$$\alpha(t) = \alpha_1\, \text{peak}_t + \alpha_2\, \text{offpeak}_t, \tag{5.5}$$

with $\text{peak}_t$ and $\text{offpeak}_t$ defined in Equation (3.22). The upper right plot is the
theoretical density of $Y_t$. For good estimates, the histograms of deviations from
expected values should be similar to the theoretical conditional density plots. The
lower plots are slices of the log-likelihood function versus each parameter. The
asterisk on each curve is the estimate for the corresponding parameter. If the asterisk
is at the peak of the curve, then the estimate is a good one. The number labeled
below the x-axis is the width of each curve, which is the 95% confidence interval. All
the histograms of deviations from expected values are very similar to the theoretical
conditional density plots. Also, all of the asterisks (which denote the estimates) are at the
peak of the curves.

⁶ Refer back to Equation (4.8).
For these stochastic volatility models (Model 3a and Model 3b), since the information
on the volatility is not available, we cannot use MLCCF estimation. However,
we can exploit the marginal CCF, and perform MLMCCF estimation instead.⁷ It
takes more than 20 minutes for the optimizations to converge on a 2.46 GHz Pentium
4 PC. The resulting parameter estimates, the corresponding standard errors
and t-ratios for Hourly EP via MLMCCF estimation are provided in Tables 5.9 and
5.10. As pointed out by Chacko and Viceira (1999), these estimates are biased and
inconsistent, and need to be modified. We tried to correct the bias by a bootstrap
method as suggested by Chacko and Viceira (1999), and simulated 500 paths with a
given parameter $\theta_0$.⁸ The given parameter, the mean of the estimates, the mean bias and
the root mean square error (RMSE)⁹ for Model 3a and Model 3b are reported in Table
5.11 and Table 5.12 respectively. In comparison with the estimates, the RMSEs are
quite large, which indicates that the estimates are not reliable. We therefore decided not
to use these biases to adjust our estimates.

⁷ Refer back to Section 4.3.
⁸ Unfortunately, we cannot obtain simulated paths with the same length as the actual data using
the estimates in Tables 5.9 and 5.10 directly. Because $2\kappa_v \alpha_v$ is very close to $\sigma_v^2$, it is very hard to
guarantee the positiveness of the volatility process for a long path.
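The positivity problem for simulated volatility paths can be illustrated with a plain Euler discretization of the square-root process (4.14). The sketch below is an assumption-laden illustration: the parameter values are hypothetical (not the estimates of Tables 5.9 and 5.10), the Euler scheme with a reset at zero is one simple choice among several, and the ratio $2\kappa_v \alpha_v / \sigma_v^2$ is the quantity whose closeness to one is described as the source of the difficulty.

```python
import math
import random

def feller_ratio(kappa_v, alpha_v, sigma_v):
    """2 * kappa_v * alpha_v / sigma_v^2; the continuous-time process stays strictly
    positive when this ratio is at least one."""
    return 2.0 * kappa_v * alpha_v / sigma_v ** 2

def count_negative_steps(kappa_v, alpha_v, sigma_v, n_steps, dt, rng):
    """Euler scheme for (4.14); counts steps where the discretized volatility drops
    below zero (it is then reset to zero so that the square root stays defined)."""
    v = alpha_v
    negatives = 0
    for _ in range(n_steps):
        v += kappa_v * (alpha_v - v) * dt + sigma_v * math.sqrt(v * dt) * rng.gauss(0.0, 1.0)
        if v < 0.0:
            negatives += 1
            v = 0.0
    return negatives

n_steps = 19704                       # same length as the hourly sample
borderline = (0.5, 0.04, 0.2)         # hypothetical: 2 * kappa * alpha equals sigma^2
comfortable = (0.5, 0.04, 0.02)       # hypothetical: ratio far above one

neg_borderline = count_negative_steps(*borderline, n_steps, 1.0, random.Random(3))
neg_comfortable = count_negative_steps(*comfortable, n_steps, 1.0, random.Random(3))
```

With the borderline parameters the discretized path falls below zero repeatedly over a path of this length, while the comfortable parameters keep it well away from zero.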
We also report the graphic outputs from the optimization process in Figures 5.7–
5.8. As before, each figure consists of three parts. The upper left plot is
the empirical histogram of deviations from the expected values. For Model 3a and
Model 3b, the deviation from the expected value is defined as:

$$Y_t = X_{t+1} - e^{-\kappa} X_t. \tag{5.6}$$

The upper right plot is the theoretical density of $Y_t$. The lower plots are slices of the
log-likelihood function versus each parameter. The histograms of deviations from
expected values are close to the theoretical conditional density plots. But not all
of the asterisks are at the peak of the curves. The asterisk for ρ is not
at the peak. Also, the curve of the log-likelihood function versus the correlation
coefficient ρ in both models is quite flat (not convex). This indicates that this
parameter is hard to estimate, and extra care should be taken when computing
it. It seems that ρ tends to go towards −1, and this may result in a higher
log-likelihood. Therefore, we recomputed the log-likelihood function for both models
fixing ρ = −1. The resulting parameter estimates, the corresponding standard errors
and t-ratios for Hourly EP via MLMCCF estimation are provided in Tables 5.13
and 5.14. The log-likelihood values are indeed higher with ρ = −1, which supports our
hypothesis. As the spot price and the volatility are so strongly correlated, Model 3a
and Model 3b may be misspecified models. Therefore, in the following sections, we
will only consider the other four models.

We also carried out estimation for Model 3a and Model 3b, via SGMM estimation,
by fitting Hourly EP. Unfortunately, the results that were obtained within the available
time were not very accurate, and are not presented here. In comparison
with the other four models, the optimization processes of Model 3a and Model 3b
are quite computationally intensive. This is because we have to solve the Riccati
equations numerically during each iteration.

⁹ Let $\hat{\theta}_i$ be the estimate for each sample path $i$, $i = 1, \cdots, n$, and $\theta_0$ be the true value; then the mean
bias is defined as:

$$\text{Mean Bias} = \frac{1}{n} \sum_{i=1}^{n} (\hat{\theta}_i - \theta_0).$$

RMSE is defined as:

$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{\theta}_i - \theta_0)^2}.$$

        Estimate   Std.     T-ratio
κ       0.1124     0.0040   28.1000
α       0.2368     0.0250   9.4720
σ²      0.0144     0.0008   18.0000
ω       1.2260     0.0344   35.6395
ψ       0.5261     0.0061   86.2459
γ       0.3367     0.0058   58.0517

Table 5.5: Hourly EP parameter values for Model 1a
Note: this table reports the estimates, the standard errors of estimation and the
t-statistics from fitting Model 1a ($dX_t = \kappa(\alpha - X_t)dt + \sigma dW_t + Q_t dP_t(\omega)$) to Hourly
EP. All the estimates are hourly based. The standard errors are quite small. All
the absolute values of the t-ratios are larger than 1.96, which indicates that all the
parameters are likely significant.
[Figure 5.3 panels: upper left, histogram of deviations from expected values; upper right,
plot of the density function; lower panels, slices of the log-likelihood function (vertical axis
from −11435 to −11424) against kappa, alpha, sigma2, omega, psi and gamma, with curve
widths 0.015, 0.98, 0.0033, 0.07, 0.024 and 0.023 respectively.]

Figure 5.3: Results from optimization (Model 1a)
Note: these are the results of fitting Model 1a ($dX_t = \kappa(\alpha - X_t)dt + \sigma dW_t + Q_t dP_t(\omega)$)
to Hourly EP. If we get good estimates, the histogram of deviations from expected values
(defined in (5.3)) should be very similar to the plot of the theoretical density function.
From these plots, one can tell that Model 1a is a good fit to Hourly EP. Moreover,
all the estimates are at the peak of each curve, which indicates that we have found
the optimum points. These results are also consistent with the small standard errors
in Table 5.5.
| Parameter | Estimate | Std. error | t-ratio |
|---|---|---|---|
| $\kappa$ | 0.1051 | 0.0041 | 25.6341 |
| $\alpha_1$ | 0.2765 | 0.0278 | 9.9460 |
| $\sigma^2$ | 0.0136 | 0.0008 | 17 |
| $\omega$ | 1.2188 | 0.0328 | 37.1585 |
| $\psi$ | 0.5228 | 0.0061 | 81.7049 |
| $\gamma$ | 0.3384 | 0.0057 | 59.3684 |
| $\alpha_2$ | 0.2009 | 0.0274 | 7.3321 |

Table 5.6: Hourly EP parameter values for Model 2a
Note: this table reports the estimates, the standard errors of estimation and the t-statistics from fitting Model 2a ($dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega)$) to Hourly EP. All the estimates are hourly based. The standard errors are quite small. All the absolute values of the t-ratios are larger than 1.96, which indicates that all the parameters are significant. In comparison with the estimates from Model 1a, incorporating the on-peak and off-peak effects does not change our estimates significantly. Moreover, the on-peak coefficient $\alpha_1$ is close to the off-peak coefficient $\alpha_2$.
| Parameter | Estimate | Std. error | t-ratio |
|---|---|---|---|
| $\kappa$ | 0.1162 | 0.0042 | 27.6667 |
| $\alpha$ | 0.1405 | 0.0320 | 4.3906 |
| $\sigma^2$ | 0.0131 | 0.0009 | 14.5556 |
| $\omega_u$ | 0.6190 | 0.0187 | 33.1016 |
| $\gamma_u$ | 0.3536 | 0.0073 | 48.4384 |
| $\omega_d$ | 0.6622 | 0.0299 | 22.1472 |
| $\gamma_d$ | 0.3059 | 0.0084 | 36.4167 |

Table 5.7: Hourly EP parameter values for Model 1b
Note: this table reports the estimates, the standard errors of estimation and the t-statistics from fitting Model 1b ($dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q^u_t\,dP^u_t(\omega_u) + Q^d_t\,dP^d_t(\omega_d)$) to Hourly EP. All the estimates are hourly based. The standard errors are small. All the absolute values of the t-ratios are larger than 1.96, which indicates that all the parameters are significant. Notice that we have slightly more negative jumps than positive jumps, while, in Model 1a, the probability of positive jumps is slightly higher than the probability of negative jumps. Furthermore, the estimate for the long-term mean in Model 1b is much smaller than the estimate for the long-term mean in Model 1a.
[Figure 5.4 panels: histogram of deviations from expected values; plot of density function; log-likelihood curves against kappa, alpha, sigma2, omegau, gammau, omegad and gammad.]
Figure 5.4: Results from optimization (Model 2a)
Note: these are the results of fitting Model 2a ($dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega)$) to Hourly EP. If we get good estimates, the histogram of deviations from expected values (defined in (5.4)) should be very similar to the plot of the theoretical density function. From these plots, one can tell that Model 2a is a good fit to Hourly EP. Moreover, all the estimates are at the peak of each curve, which indicates that we have found the optimum points. These results are also consistent with the small standard errors in Table 5.6.
[Figure 5.5 panels: histogram of deviations from expected values; plot of density function; log-likelihood curves against kappa, alpha, sigma2, omega, psi, gamma and alpha2.]
Figure 5.5: Results from optimization (Model 1b)
Note: these are the results of fitting Model 1b ($dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q^u_t\,dP^u_t(\omega_u) + Q^d_t\,dP^d_t(\omega_d)$) to Hourly EP. If we get good estimates, the histogram of deviations from expected values (defined in (5.3)) should be very similar to the plot of the theoretical density function. From these plots, one can tell that Model 1b is a good fit to Hourly EP. Moreover, all the estimates are at the peak of each curve, which indicates that we have found the optimum points. These results are also consistent with the small standard errors in Table 5.7.
| Parameter | Estimate | Std. error | t-ratio |
|---|---|---|---|
| $\kappa$ | 0.1084 | 0.0041 | 26.1427 |
| $\alpha_1$ | 0.1665 | 0.0338 | 4.9205 |
| $\sigma^2$ | 0.0121 | 0.0008 | 14.8066 |
| $\omega_u$ | 0.6096 | 0.0175 | 34.9320 |
| $\gamma_u$ | 0.3569 | 0.0072 | 49.5694 |
| $\omega_d$ | 0.6725 | 0.0294 | 22.8429 |
| $\gamma_d$ | 0.3043 | 0.0081 | 37.4514 |
| $\alpha_2$ | 0.0870 | 0.0343 | 2.5346 |

Table 5.8: Hourly EP parameter values for Model 2b
Note: this table reports the estimates, the standard errors of estimation and the t-statistics from fitting Model 2b ($dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q^u_t\,dP_t(\omega_u) + Q^d_t\,dP_t(\omega_d)$) to Hourly EP. All the estimates are hourly based. The standard errors are quite small. All the absolute values of the t-ratios are larger than 1.96, which indicates that all the parameters are significant. In comparison with the estimates from Model 1b, incorporating the on-peak and off-peak effects does not change our estimates significantly. The on-peak coefficient $\alpha_1$ is not as close to the off-peak coefficient $\alpha_2$ as in Model 2a.
[Figure 5.6 panels: histogram of deviations from expected values; plot of density function; log-likelihood curves against kappa, alpha, sigma2, omegau, gammau, omegad, gammad and alpha2.]
Figure 5.6: Results from optimization (Model 2b)
Note: these are the results of fitting Model 2b ($dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q^u_t\,dP_t(\omega_u) + Q^d_t\,dP_t(\omega_d)$) to Hourly EP. If we get good estimates, the histogram of deviations from expected values (defined in (5.4)) should be very similar to the plot of the theoretical density function. From these plots, one can tell that Model 2b is a good fit to Hourly EP. Moreover, all the estimates are at the peak of each curve, which indicates that we have found the optimum points. These results are also consistent with the small standard errors in Table 5.8.
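| Parameter | Estimate | Std. error | t-ratio |
|---|---|---|---|
| $\kappa$ | 0.1152 | 0.0023 | 50.0870 |
| $\alpha$ | 0.5086 | 0.0413 | 12.3148 |
| $\omega$ | 1.0814 | 0.0170 | 63.6118 |
| $\psi$ | 0.5781 | 0.0090 | 64.2333 |
| $\gamma$ | 0.3522 | 0.0065 | 54.1846 |
| $\rho$ | 0.9710 | 0.0014 | 693.5714 |
| $\kappa_v$ | 0.5483 | 0.0315 | 17.4063 |
| $\alpha_v$ | 0.0261 | 0.0020 | 13.0500 |
| $\sigma_v$ | 0.1529 | 0.0171 | 8.9415 |

Log-likelihood: 11345.8013

Table 5.9: Hourly EP parameter values for Model 3a
Note: this table reports the estimates, the standard errors of estimation, the t-statistics and adjusted estimates (labeled as A.Est.) from fitting Model 3a
$$d\begin{pmatrix} X_t \\ V_t \end{pmatrix} = \begin{pmatrix} \kappa(\alpha - X_t) \\ \kappa_v(\alpha_v - V_t) \end{pmatrix} dt + \begin{pmatrix} \sqrt{1-\rho^2}\sqrt{V_t} & \rho\sqrt{V_t} \\ 0 & \sigma_v\sqrt{V_t} \end{pmatrix} \begin{pmatrix} dW_t \\ dW_v \end{pmatrix} + \begin{pmatrix} Q_t\,dP_t(\omega) \\ 0 \end{pmatrix}$$
to Hourly EP. All the estimates are hourly based. The standard errors are still reasonably small. All the absolute values of the t-ratios are larger than 1.96, which indicates that all the parameters are significant.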
[Figure 5.7 panels: histogram of deviations from expected values; plot of density function; log-likelihood curves against kappax, alphax, kappav, alphav, sigmav, rho, omega, psi and gamma.]
Figure 5.7: Results from optimization (Model 3a)
Note: these are the results of fitting Model 3a to Hourly EP. The histogram of deviations from expected values (defined in (5.6)) is very similar to the plot of the theoretical density function. But not all the estimates are at the peak of the curves. Here, the curve of the log-likelihood function versus the correlation coefficient (rho) is quite flat and the estimate is not at the peak.
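| Parameter | Estimate | Std. error | t-ratio |
|---|---|---|---|
| $\kappa$ | 0.1140 | 0.0022 | 51.8182 |
| $\alpha$ | 0.5994 | 0.0737 | 8.1330 |
| $\omega_u$ | 0.6420 | 0.0237 | 27.0886 |
| $\gamma_u$ | 0.3459 | 0.0092 | 37.5978 |
| $\omega_d$ | 0.4187 | 0.0338 | 12.3876 |
| $\gamma_d$ | 0.3670 | 0.0151 | 24.3046 |
| $\rho$ | 0.9987 | 0.0002 | 4993.5 |
| $\kappa_v$ | 0.6915 | 0.0349 | 19.8138 |
| $\alpha_v$ | 0.0281 | 0.0022 | 12.7727 |
| $\sigma_v$ | 0.1782 | 0.0219 | 8.1370 |

Log-likelihood: 11343.3605

Table 5.10: Hourly EP parameter values for Model 3b
Note: this table reports the estimates, the standard errors of estimation and the t-statistics from fitting Model 3b
$$d\begin{pmatrix} X_t \\ V_t \end{pmatrix} = \begin{pmatrix} \kappa(\alpha - X_t) \\ \kappa_v(\alpha_v - V_t) \end{pmatrix} dt + \begin{pmatrix} \sqrt{1-\rho^2}\sqrt{V_t} & \rho\sqrt{V_t} \\ 0 & \sigma_v\sqrt{V_t} \end{pmatrix} \begin{pmatrix} dW_t \\ dW_v \end{pmatrix} + \begin{pmatrix} Q^u_t\,dP_t(\omega_u) + Q^d_t\,dP_t(\omega_d) \\ 0 \end{pmatrix}$$
to Hourly EP. All the estimates are hourly based. The standard errors are still reasonably small. All the absolute values of the t-ratios are larger than 1.96, which indicates that all the parameters are significant.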
[Figure 5.8 panels: histogram of deviations from expected values; plot of density function; log-likelihood curves against kappax, alphax, kappav, alphav, sigmav, rho, omegau, gammau, omegad and gammad.]
Figure 5.8: Results from optimization (Model 3b)
Note: these are the results of fitting Model 3b to Hourly EP. The histogram of deviations from expected values (defined in (5.6)) is very similar to the plot of the theoretical density function. But not all the estimates are at the peak of the curves. Similar to Model 3a, the curve of the log-likelihood function versus the correlation coefficient (rho) is quite flat and the estimate is not at the peak.
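| Parameter | True Theta | Mean | Mean Bias | RMSE |
|---|---|---|---|---|
| $\kappa$ | 0.1148 | 0.1217 | 0.0069 | 0.0079 |
| $\alpha$ | 0.5169 | 0.8895 | 0.3726 | 0.9010 |
| $\omega$ | 1.0715 | 1.3234 | 0.2519 | 1.0728 |
| $\psi$ | 0.5795 | 0.6396 | 0.0601 | 0.1529 |
| $\gamma$ | 0.3532 | 0.4019 | 0.0487 | 0.1643 |
| $\rho$ | 0.9680 | 0.5994 | 0.3686 | 0.8381 |
| $\kappa_v$ | 0.6070 | 0.6088 | 0.0018 | 1.4249 |
| $\alpha_v$ | 0.5000 | 0.5640 | 0.0640 | 0.1487 |
| $\sigma_v$ | 0.1695 | 0.0674 | 0.1021 | 0.1567 |

Table 5.11: Bias for Model 3a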
5.3 Goodness of Fit
Because Model 1a is subsumed by Model 2a, likelihood ratio tests (LRT) may be applied to compare these models. Likewise, we can apply LRT to compare Model 1b and Model 2b.
LRT is a statistical test of the goodness-of-fit between two models. A relatively more complex model is compared to a simpler model to see if it fits a particular data set significantly better. If so, the additional parameters of the more complex model are often used in subsequent analyses.¹⁰ LRT is only valid if used to compare hierarchically nested models. That is, the more complex model must only differ from the simpler model by the addition of one or more parameters. Adding parameters never lowers the maximized likelihood. However, one should also consider whether the additional parameters result in a significant improvement in fitting a model to a particular data set. LRT provides one objective criterion for selecting among possible models. LRT begins with a comparison of the likelihood values of
¹⁰ LRT is explained in detail by Felsenstein (1981).
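| Parameter | True Theta | Mean | Mean Bias | RMSE |
|---|---|---|---|---|
| $\kappa$ | 0.1140 | 0.1207 | 0.0067 | 0.0078 |
| $\alpha$ | 0.5994 | 1.4925 | 0.8931 | 0.7309 |
| $\omega_u$ | 0.6420 | 1.0963 | 0.4543 | 0.0141 |
| $\gamma_u$ | 0.3459 | 0.3563 | 0.0104 | 0.0991 |
| $\omega_d$ | 0.4187 | 0.4269 | 0.0082 | 0.1891 |
| $\gamma_d$ | 0.3670 | 0.4491 | 0.0821 | 0.1835 |
| $\rho$ | 0.9987 | 0.9833 | 0.0154 | 0.1679 |
| $\kappa_v$ | 0.6915 | 1.0192 | 0.3277 | 0.1272 |
| $\alpha_v$ | 0.5000 | 0.5404 | 0.0404 | 0.1891 |
| $\sigma_v$ | 0.1782 | 0.0970 | 0.0812 | 0.1679 |

Table 5.12: Bias for Model 3b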
the two models:
$$LR = 2(\ln(L_f) - \ln(L_r)), \tag{5.7}$$
where $L_f$ and $L_r$ are the likelihood values of the more complex model and the simpler model respectively. This LRT statistic approximately follows a chi-square distribution with $k$ degrees of freedom, where $k$ is equal to the number of additional parameters in the more complex model. Using this information we can then determine the critical value of the test statistic from standard statistical tables.
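As a small illustration of equation (5.7), the sketch below computes the LR statistic for Model 1a against Model 2a and compares it with a 95% chi-square cutoff. It assumes that the log-likelihood magnitudes listed in Table 5.16 carry a negative sign; the function names are hypothetical, and the cutoff helper only covers one and two degrees of freedom, where closed forms exist.

```python
import math
from statistics import NormalDist

def lr_statistic(loglik_full, loglik_restricted):
    """LR = 2 (ln L_f - ln L_r), as in equation (5.7)."""
    return 2.0 * (loglik_full - loglik_restricted)

def chi2_cutoff_95(df):
    """95% chi-square cutoff, using closed forms available for df = 1 and df = 2."""
    if df == 1:
        # A chi-square(1) variable is the square of a standard normal.
        return NormalDist().inv_cdf(0.975) ** 2
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return -2.0 * math.log(0.05)
    raise ValueError("only df = 1 or 2 are handled in this sketch")

# Log-likelihoods for Model 1a (restricted) and Model 2a (full), read from
# Table 5.16 under the assumption that the listed magnitudes are negated.
loglik_1a = -11424.1002
loglik_2a = -11397.1492

lr = lr_statistic(loglik_2a, loglik_1a)
cutoff = chi2_cutoff_95(1)
reject_restricted = lr > cutoff
```

Under these assumptions the statistic is far above the cutoff, so the restriction to a constant long-term mean would be rejected.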
Table 5.15 lists the LR statistics for different pairs of models. In this table, the first column lists the pair of models under consideration. The second column labeled DF lists the additional degrees of freedom contained in the more complex model. The column labeled LR contains the actual likelihood ratio statistic. The column labeled Cutoff contains the 95% chi-squared cutoff value for a likelihood ratio statistic with degrees of freedom corresponding to the number in the DF column. The hypothesis, that the restrictions included in the simpler model are valid, is
| Parameter | Estimate | Std. error | t-ratio |
|---|---|---|---|
| $\kappa$ | 0.1135 | 0.1322 | 0.85855 |
| $\alpha$ | 0.5250 | 1.0000 | 0.5250 |
| $\omega$ | 1.0878 | 0.4979 | 2.1848 |
| $\psi$ | 0.5796 | 0.4993 | 1.1608 |
| $\gamma$ | 0.3508 | 0.3767 | 0.9312 |
| $\kappa_v$ | 0.5884 | 0.0000 | 5884 |
| $\alpha_v$ | 0.0259 | 0.0374 | 0.6925 |
| $\sigma_v$ | 0.1562 | 0.0119 | 13.1261 |

Log-likelihood: 11344.1626

Table 5.13: Hourly EP parameter values for Model 3a ($\rho = -1$)
Note: this table reports the estimates, the standard errors of estimation and the t-statistics from fitting Model 3a to Hourly EP with $\rho = -1$. All the estimates are hourly based. Although we do get a higher log-likelihood, some of the estimates are no longer significant (the absolute value of the t-ratio is below 1.96). Moreover, the standard error of $\alpha$ is one, which suggests that the estimates may not be accurate.
| Parameter | Estimate | Std. error | t-ratio |
|---|---|---|---|
| $\kappa$ | 0.1135 | 0.0171 | 6.6501 |
| $\alpha$ | 0.6008 | 0.6502 | 0.9240 |
| $\omega_u$ | 0.6411 | 0.5239 | 1.2238 |
| $\gamma_u$ | 0.3461 | 0.1690 | 2.0471 |
| $\omega_d$ | 0.4180 | 0.3974 | 1.0519 |
| $\gamma_d$ | 0.3672 | 0.3612 | 1.0166 |
| $\kappa_v$ | 0.6809 | 0.1277 | 5.3335 |
| $\alpha_v$ | 0.0281 | 0.0271 | 1.0372 |
| $\sigma_v$ | 0.1769 | 0.1919 | 0.9221 |

Log-likelihood: 11343.3294

Table 5.14: Hourly EP parameter values for Model 3b ($\rho = -1$)
Note: this table reports the estimates, the standard errors of estimation and the t-statistics from fitting Model 3b to Hourly EP with $\rho = -1$. All the estimates are hourly based. Although we do get a higher log-likelihood, some of the estimates are no longer significant (the absolute value of the t-ratio is below 1.96).
| Models | DF | LR | Cutoff |
|---|---|---|---|
| Model 1a vs. Model 2a | 1 | 53.9024 | 3.84 |
| Model 1a vs. Model 3a | 2 | 156.5978 | 5.99 |
| Model 1b vs. Model 2b | 1 | 58.4908 | 3.84 |
| Model 1b vs. Model 3b | 2 | 140.0579 | 5.99 |

Table 5.15: Likelihood ratio statistics fitting on Hourly EP
Note: in this table, the first column lists the pair of models under consideration. The second column labeled DF lists the additional degrees of freedom contained in the more complex model. The column labeled LR contains the actual likelihood ratio statistic. The column labeled Cutoff contains the 95% chi-squared cutoff value for a likelihood ratio statistic with degrees of freedom corresponding to the number in the DF column. The hypothesis, that the restrictions included in the simpler model are valid, is rejected if the quantity in the LR column is greater than the quantity in the Cutoff column.
rejected if the quantity in the LR column is greater than the quantity in the Cutoff column. Thus this test suggests that the time-varying mean models provide a better fit than the ones with constant long-term mean and volatility. Moreover, although the two-jump version processes and the one-jump version processes are not nested, the log-likelihood values for the two-jump version processes are higher than those of their one-jump counterparts. Thus one would expect them to give a better fit.
Another related statistical measurement of fit is the Schwarz criterion (see [51]). This is defined as:
$$SC = -2\ln L + k\ln n, \tag{5.8}$$
where $L$ represents the likelihood value, $k$ is the number of parameters estimated in the model, and $n$ is the sample size. Models that yield a minimum value for the criterion are to be preferred. One should also consider whether adding additional parameters results in a significant improvement in fitting a model to a particular data set.
| Model | ln L | SC |
|---|---|---|
| Model 1a | 11424.1002 | 22907.53 |
| Model 2a | 11397.1492 | 22863.51 |
| Model 1b | 11413.3895 | 22895.99 |
| Model 2b | 11384.1441 | 22847.39 |

Table 5.16: Schwarz criteria statistics for Models
Note: in this table, the first column lists the model under consideration. The second column labeled ln L lists the log-likelihood value. The column labeled SC reports the computed Schwarz criteria. Models that yield a minimum value for the criterion are to be preferred. From this table, Model 2b is the best model.
The model which has the maximum likelihood value and an appropriate number of additional parameters will be chosen. Therefore, the Schwarz criterion accommodates the trade-off between a better fit and fewer parameters by penalising models with additional parameters. The Schwarz criterion values can be compared among various models as a basis for model selection. Table 5.16 reports the log-likelihood values together with the Schwarz criteria for the models. It appears that Model 2b better describes the dynamics of Hourly EP as it has the smallest Schwarz criterion.
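The ranking by equation (5.8) can be sketched as follows. The log-likelihoods are the Table 5.16 magnitudes with an assumed negative sign and the parameter counts are read from Tables 5.5 to 5.8. The sample size `n_obs` is a placeholder chosen only for illustration, so the resulting SC values are not expected to reproduce the table exactly; the function name is hypothetical.

```python
import math

def schwarz_criterion(loglik, k, n):
    """SC = -2 ln L + k ln n, as in equation (5.8)."""
    return -2.0 * loglik + k * math.log(n)

# n_obs is a placeholder sample size chosen for illustration, not the thesis value.
n_obs = 20000

# (log-likelihood, number of parameters); log-likelihoods are the Table 5.16
# magnitudes with an assumed negative sign, parameter counts from Tables 5.5-5.8.
models = {
    "Model 1a": (-11424.1002, 6),
    "Model 2a": (-11397.1492, 7),
    "Model 1b": (-11413.3895, 7),
    "Model 2b": (-11384.1441, 8),
}
scores = {name: schwarz_criterion(ll, k, n_obs) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)
```

Because the penalty term grows only with ln n, the ordering of these four models is driven mainly by the log-likelihood differences.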
While comparative tests like LRT and the Schwarz criterion offer an indication of the relative quality of each model, they do not yield an absolute assessment. We evaluate these models further in the following sections.
5.4 Robustness Test
A simulation exercise was undertaken to determine how robust the MLCCF estimators are. The row labeled True Theta is used to generate 300 sample paths from the
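| | $\kappa$ | $\alpha$ | $\sigma^2$ | $\omega$ | $\psi$ | $\gamma$ |
|---|---|---|---|---|---|---|
| True Theta | 0.1124 | 0.2368 | 0.0144 | 1.2260 | 0.5261 | 0.3367 |
| Mean | 0.1194 | 0.2299 | 0.0160 | 1.2407 | 0.5249 | 0.3560 |
| Mean Bias | 0.0070 | 0.0069 | 0.0016 | 0.0147 | 0.0012 | 0.0193 |
| RMSE | 0.0095 | 0.0760 | 0.0029 | 0.0984 | 0.0165 | 0.0260 |

Table 5.17: Robustness test for Model 1a

| | $\kappa$ | $\alpha$ | $\sigma^2$ | $\omega_u$ | $\gamma_u$ | $\omega_d$ | $\gamma_d$ |
|---|---|---|---|---|---|---|---|
| True Theta | 0.1162 | 0.1405 | 0.0131 | 0.6190 | 0.3536 | 0.6622 | 0.3059 |
| Mean | 0.1241 | 0.1439 | 0.0148 | 0.6252 | 0.3758 | 0.6664 | 0.3265 |
| Mean Bias | 0.0079 | 0.0034 | 0.0017 | 0.0062 | 0.0222 | 0.0042 | 0.0206 |
| RMSE | 0.0100 | 0.0871 | 0.0029 | 0.0587 | 0.0332 | 0.0689 | 0.0307 |

Table 5.18: Robustness test for Model 1b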
discretized model beginning with the long-term mean $\alpha$. Because the estimates are hourly based, the generated sample paths are also taken to be on an hourly time interval. We use Milstein's scheme to simulate 3000 hourly observations for each path, and the 300 paths are generated using the antithetic variate technique. Estimation is undertaken via MLCCF estimation. The estimates are provided in Tables 5.17–5.20. We report the true parameter values, along with the mean, mean bias and RMSE of the estimates. Notice that the MLCCF estimators performed quite well, and the small mean bias and RMSE indicate that our estimation procedure is accurate.
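The path generation described above might look roughly like the sketch below for Model 1a. Several points are assumptions rather than the thesis procedure: jump sizes are taken to be exponential with mean gamma, positive with probability psi (the form suggested by the characteristic function (5.11)); the number of jumps per hour is Poisson with mean omega; and the antithetic partner reuses the same jumps with Gaussian shocks of opposite sign. Since the diffusion coefficient is constant, the Milstein correction term vanishes and the update coincides with the Euler step. All function names are hypothetical.

```python
import math
import random

def poisson_draw(rng, lam):
    """Knuth's multiplication method for a Poisson(lam) variate (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_antithetic_pair(kappa, alpha, sigma2, omega, psi, gamma, n_steps, seed=0):
    """Simulate two paths of dX = kappa(alpha - X)dt + sigma dW + Q dP with dt = 1.

    The two paths share the same jumps and use Gaussian shocks of opposite sign.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    x_plus, x_minus = [alpha], [alpha]
    for _ in range(n_steps - 1):
        z = rng.gauss(0.0, 1.0)
        jump = 0.0
        for _ in range(poisson_draw(rng, omega)):
            size = rng.expovariate(1.0 / gamma)  # exponential with mean gamma
            jump += size if rng.random() < psi else -size
        x_plus.append(x_plus[-1] + kappa * (alpha - x_plus[-1]) + sigma * z + jump)
        x_minus.append(x_minus[-1] + kappa * (alpha - x_minus[-1]) - sigma * z + jump)
    return x_plus, x_minus

# Model 1a "True Theta" values from Table 5.17, 3000 hourly observations.
path_a, path_b = simulate_antithetic_pair(
    0.1124, 0.2368, 0.0144, 1.2260, 0.5261, 0.3367, n_steps=3000, seed=1
)
```

Repeating this for 150 seeds would give 300 paths, each of which can then be passed to the estimator to build the Mean, Mean Bias and RMSE rows.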
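| | $\kappa$ | $\alpha$ | $\sigma^2$ | $\omega$ | $\psi$ | $\gamma$ | $\alpha_1$ |
|---|---|---|---|---|---|---|---|
| True Theta | 0.1114 | 0.2353 | 0.0151 | 1.2105 | 0.5216 | 0.3598 | 0.2254 |
| Mean | 0.1051 | 0.2765 | 0.0136 | 1.2188 | 0.5228 | 0.3384 | 0.2009 |
| Mean Bias | 0.0063 | 0.0412 | 0.0015 | 0.0083 | 0.0012 | 0.0214 | 0.0245 |
| RMSE | 0.0083 | 0.0917 | 0.0027 | 0.0865 | 0.0187 | 0.0280 | 0.0832 |

Table 5.19: Robustness test for Model 2a

| | $\kappa$ | $\alpha$ | $\sigma^2$ | $\omega_u$ | $\gamma_u$ | $\omega_d$ | $\gamma_d$ | $\alpha_1$ |
|---|---|---|---|---|---|---|---|---|
| True Theta | 0.1084 | 0.1665 | 0.0121 | 0.6096 | 0.3569 | 0.6725 | 0.3043 | 0.0870 |
| Mean | 0.1151 | 0.1223 | 0.0135 | 0.6145 | 0.3788 | 0.6817 | 0.3194 | 0.1156 |
| Mean Bias | 0.0067 | 0.0443 | 0.0014 | 0.0048 | 0.0219 | 0.0092 | 0.0151 | 0.0287 |
| RMSE | 0.0087 | 0.1076 | 0.0027 | 0.0566 | 0.0329 | 0.0738 | 0.0270 | 0.1009 |

Table 5.20: Robustness test for Model 2b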
5.5 Descriptive Statistics of Empirical vs. Calibrated Hourly Returns
Given the CCF, we can obtain the moments for any choice of jump distribution where the jump intensity or distribution does not depend on the state variables. To obtain the moments, we differentiate the CCF ($\phi$) successively with respect to $s$ and then find the value of the derivative when $s = 0$. Let $U_n$ denote the $n$th moment, and $\phi_n$ be the $n$th derivative of $\phi$ with respect to $s$, i.e., $\phi_n = \partial^n \phi / \partial s^n$. Then
$$U_n := \frac{1}{i^n}\left[\phi_n \mid s = 0\right]. \tag{5.9}$$
For Model 1a, the CCF of $X_T$ given $X_t$ has the closed form
$$\phi(s, \theta, X_T \mid X_t) = E[\exp(isX_T) \mid X_t] = \exp\big(A(s, t, T, \theta) + B(s, t, T, \theta)X_t\big), \tag{5.10}$$
where
$$\begin{aligned}
A(s, t, T, \theta) &= i\alpha s\,(1 - e^{-\kappa(T-t)}) - \frac{\sigma^2 s^2}{4\kappa}\,(1 - e^{-2\kappa(T-t)}) \\
&\quad + \frac{i\omega(1 - 2\psi)}{\kappa}\left[\arctan(\gamma s e^{-\kappa(T-t)}) - \arctan(\gamma s)\right] \\
&\quad + \frac{\omega}{2\kappa}\ln\frac{1 + \gamma^2 s^2 e^{-2\kappa(T-t)}}{1 + \gamma^2 s^2}, \\
B(s, t, T, \theta) &= i s e^{-\kappa(T-t)}.
\end{aligned} \tag{5.11}$$
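Then the unconditional (stationary) characteristic function $\phi(s, \theta)$ of $X_T$, which is obtained by letting $T \to \infty$, is of the form
$$\phi(s, \theta) = \exp\left(i\alpha s - \frac{\sigma^2 s^2}{4\kappa} + \frac{i\omega(2\psi - 1)}{\kappa}\arctan(\gamma s) - \frac{\omega}{2\kappa}\ln(1 + \gamma^2 s^2)\right). \tag{5.12}$$
We will let $X_\infty$ denote the random variable with this unconditional (stationary) distribution. One can obtain the $n$th moment by equation (5.9). For example, to get the first moment, we differentiate $\phi(s, \theta)$ with respect to $s$:
$$\phi_1 := \frac{\partial \phi(s, \theta)}{\partial s} = \left(i\alpha - \frac{\sigma^2 s}{2\kappa} + \frac{i\omega(2\psi - 1)}{\kappa}\,\frac{\gamma}{1 + \gamma^2 s^2} - \frac{\omega}{2\kappa}\,\frac{2\gamma^2 s}{1 + \gamma^2 s^2}\right)\phi(s, \theta). \tag{5.13}$$
Thus the first moment $U_1$ is given by
$$U_1 := E[X_\infty] = \frac{1}{i}\left[\phi_1 \mid s = 0\right] = \alpha + \frac{\omega(2\psi - 1)}{\kappa}\gamma. \tag{5.14}$$
Similarly, we can obtain other moments:
$$\begin{aligned}
U_2 &:= E[X_\infty^2] = \frac{\sigma^2 + 2\omega\gamma^2}{2\kappa} + U_1^2, \\
U_3 &:= E[X_\infty^3] = \frac{2\omega(2\psi - 1)\gamma^3}{\kappa} + 3U_1U_2 - 2U_1^3, \\
U_4 &:= E[X_\infty^4] = \frac{6\omega\gamma^4}{\kappa} + 3U_2^2 - 12U_2U_1^2 + 4U_1U_3 + 6U_1^4.
\end{aligned} \tag{5.15}$$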
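Equation (5.9) can be checked numerically for the first moment. The sketch below evaluates the stationary characteristic function (5.12) with the Model 1a estimates from Table 5.5, approximates its derivative at $s = 0$ by a central difference, and compares the result with the closed form (5.14). The step size `h` and the function names are illustrative choices, not part of the thesis.

```python
import cmath
import math

# Model 1a estimates from Table 5.5.
KAPPA, ALPHA, SIGMA2 = 0.1124, 0.2368, 0.0144
OMEGA, PSI, GAMMA = 1.2260, 0.5261, 0.3367

def stationary_cf(s):
    """Unconditional characteristic function (5.12) evaluated at real s."""
    exponent = (
        1j * ALPHA * s
        - SIGMA2 * s**2 / (4 * KAPPA)
        + 1j * OMEGA * (2 * PSI - 1) / KAPPA * math.atan(GAMMA * s)
        - OMEGA / (2 * KAPPA) * math.log(1 + GAMMA**2 * s**2)
    )
    return cmath.exp(exponent)

def first_moment_numeric(h=1e-4):
    """U_1 = phi'(0) / i, with phi'(0) approximated by a central difference."""
    derivative = (stationary_cf(h) - stationary_cf(-h)) / (2 * h)
    return (derivative / 1j).real

u1_closed_form = ALPHA + OMEGA * (2 * PSI - 1) * GAMMA / KAPPA  # equation (5.14)
u1_numeric = first_moment_numeric()
```

The same finite-difference idea extends to higher derivatives, although the closed forms in (5.15) are more accurate in practice.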
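Therefore, we have the unconditional variance, skewness and kurtosis:
$$\begin{aligned}
\text{Variance} &:= E[(X_\infty - U_1)^2] = \frac{\sigma^2 + 2\omega\gamma^2}{2\kappa}, \\
\text{Skewness} &:= E\left[\frac{(X_\infty - U_1)^3}{(U_2 - U_1^2)^{3/2}}\right] = \frac{4\sqrt{2\kappa}\,\omega\gamma^3(2\psi - 1)}{(\sigma^2 + 2\omega\gamma^2)^{3/2}}, \\
\text{Kurtosis} &:= E\left[\frac{(X_\infty - U_1)^4}{(U_2 - U_1^2)^2}\right] = \frac{24\kappa(\omega\gamma^4)}{(\sigma^2 + 2\omega\gamma^2)^2} + 3.
\end{aligned} \tag{5.16}$$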
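We can also obtain the conditional variance, skewness and kurtosis for the process in the same fashion:
$$\begin{aligned}
\text{Variance} &:= E[(X_{t+1} - U_1)^2 \mid X_t] = \left(\frac{\sigma^2 + 2\omega\gamma^2}{2\kappa}\right)(1 - e^{-2\kappa}), \\
\text{Skewness} &:= E\left[\frac{(X_{t+1} - U_1)^3}{(U_2 - U_1^2)^{3/2}} \,\Big|\, X_t\right] = \frac{4\omega\sqrt{2\kappa}\,(2\psi - 1)\gamma^3}{\big((\sigma^2 + 2\omega\gamma^2)(1 - e^{-2\kappa})\big)^{3/2}}\,(1 - e^{-3\kappa}), \\
\text{Kurtosis} &:= E\left[\frac{(X_{t+1} - U_1)^4}{(U_2 - U_1^2)^2} \,\Big|\, X_t\right] = \frac{24\kappa\omega\gamma^4(1 + e^{-2\kappa})}{(\sigma^2 + 2\omega\gamma^2)^2(1 - e^{-2\kappa})} + 3.
\end{aligned} \tag{5.17}$$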
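The closed forms (5.16) and (5.17) are straightforward to evaluate. The sketch below plugs in the Model 1a estimates from Table 5.5; the function name is hypothetical. Evaluated this way, the results should agree with the theoretical columns reported for Model 1a in Table 5.21 to about three decimal places.

```python
import math

def model_1a_moments(kappa, sigma2, omega, psi, gamma):
    """Standard deviation, skewness and kurtosis from (5.16) and (5.17)."""
    s = sigma2 + 2 * omega * gamma**2
    decay2 = math.exp(-2 * kappa)
    decay3 = math.exp(-3 * kappa)
    skew_u = 4 * math.sqrt(2 * kappa) * omega * gamma**3 * (2 * psi - 1) / s**1.5
    excess_u = 24 * kappa * omega * gamma**4 / s**2
    unconditional = {
        "std": math.sqrt(s / (2 * kappa)),
        "skewness": skew_u,
        "kurtosis": excess_u + 3,
    }
    conditional = {
        "std": math.sqrt(s / (2 * kappa) * (1 - decay2)),
        "skewness": skew_u * (1 - decay3) / (1 - decay2) ** 1.5,
        "kurtosis": excess_u * (1 + decay2) / (1 - decay2) + 3,
    }
    return unconditional, conditional

# Model 1a estimates from Table 5.5.
unconditional, conditional = model_1a_moments(0.1124, 0.0144, 1.2260, 0.5261, 0.3367)
```

Note that alpha does not enter any of these expressions, which is why Model 2a shares the same formulae.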
Notice that since the higher moments (variance, skewness and kurtosis) do not include the information of the long-term mean, Model 2a has the same formulae for higher moments as Model 1a.
As a first check we compute the moments of the unconditional distribution and conditional distribution of the logarithm of electricity prices, using the estimated parameters for Model 1a and Model 2a in Table 5.5 and Table 5.6 respectively. Empirical and theoretical results are listed in Tables 5.21 and 5.22. According to Das [52], empirical approximations of the unconditional moments can be obtained by computing higher moments of the empirical data (Hourly Peak), and the empirical conditional moments are approximated by computing higher moments of the changes of the empirical data (Hourly Peak).
For Model 1b, we can obtain the unconditional characteristic function in the same
| | Empirical, unconditional | Empirical, conditional | Theoretical, unconditional | Theoretical, conditional |
|---|---|---|---|---|
| Std.Dev | 0.8359 | 0.5529 | 1.1404 | 0.5117 |
| Skewness | 0.1089 | 0.0133 | 0.0293 | 0.0929 |
| Kurtosis | 4.1445 | 8.4214 | 3.4972 | 7.4424 |

Table 5.21: Empirical results vs. theoretical results for Model 1a

| | Empirical, unconditional | Empirical, conditional | Theoretical, unconditional | Theoretical, conditional |
|---|---|---|---|---|
| Std.Dev | 0.8359 | 0.5529 | 1.1801 | 0.5138 |
| Skewness | 0.1089 | 0.0133 | 0.0249 | 0.0817 |
| Kurtosis | 4.1445 | 8.4214 | 3.4704 | 7.4926 |

Table 5.22: Empirical results vs. theoretical results for Model 2a
fashion,
$$\phi(s, \theta) = \exp\left(i\alpha s - \frac{\sigma^2 s^2}{4\kappa} - \frac{\omega_u}{\kappa}\ln(1 - is\gamma_u) - \frac{\omega_d}{\kappa}\ln(1 - is\gamma_d)\right). \tag{5.18}$$
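Thus we have the unconditional variance, skewness and kurtosis:
$$\begin{aligned}
\text{Variance} &:= E[(X_\infty - U_1)^2] = \frac{\sigma^2 + 2\omega_u\gamma_u^2 + 2\omega_d\gamma_d^2}{2\kappa}, \\
\text{Skewness} &:= E\left[\frac{(X_\infty - U_1)^3}{(U_2 - U_1^2)^{3/2}}\right] = \frac{4\sqrt{2\kappa}\,(\omega_u\gamma_u^3 + \omega_d\gamma_d^3)}{(\sigma^2 + 2\omega_u\gamma_u^2 + 2\omega_d\gamma_d^2)^{3/2}}, \\
\text{Kurtosis} &:= E\left[\frac{(X_\infty - U_1)^4}{(U_2 - U_1^2)^2}\right] = \frac{24\kappa(\omega_u\gamma_u^4 + \omega_d\gamma_d^4)}{(\sigma^2 + 2\omega_u\gamma_u^2 + 2\omega_d\gamma_d^2)^2} + 3,
\end{aligned} \tag{5.19}$$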
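and the conditional variance, skewness and kurtosis:
$$\begin{aligned}
\text{Variance} &:= E[(X_{t+1} - U_1)^2 \mid X_t] = \left(\frac{\sigma^2 + 2\omega_u\gamma_u^2 + 2\omega_d\gamma_d^2}{2\kappa}\right)(1 - e^{-2\kappa}), \\
\text{Skewness} &:= E\left[\frac{(X_{t+1} - U_1)^3}{(U_2 - U_1^2)^{3/2}} \,\Big|\, X_t\right] = \frac{4\sqrt{2\kappa}\,(\omega_u\gamma_u^3 + \omega_d\gamma_d^3)}{\big((\sigma^2 + 2\omega_u\gamma_u^2 + 2\omega_d\gamma_d^2)(1 - e^{-2\kappa})\big)^{3/2}}\,(1 - e^{-3\kappa}), \\
\text{Kurtosis} &:= E\left[\frac{(X_{t+1} - U_1)^4}{(U_2 - U_1^2)^2} \,\Big|\, X_t\right] = \frac{24\kappa(\omega_u\gamma_u^4 + \omega_d\gamma_d^4)(1 + e^{-2\kappa})}{(\sigma^2 + 2\omega_u\gamma_u^2 + 2\omega_d\gamma_d^2)^2(1 - e^{-2\kappa})} + 3.
\end{aligned} \tag{5.20}$$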
| | Empirical, unconditional | Empirical, conditional | Theoretical, unconditional | Theoretical, conditional |
|---|---|---|---|---|
| Std.Dev | 0.8359 | 0.5529 | 1.1206 | 0.5523 |
| Skewness | 0.1089 | 0.0133 | 0.1029 | 0.3207 |
| Kurtosis | 4.1445 | 8.4214 | 3.5068 | 7.3809 |

Table 5.23: Empirical results vs. theoretical results for Model 1b

| | Empirical, unconditional | Empirical, conditional | Theoretical, unconditional | Theoretical, conditional |
|---|---|---|---|---|
| Std.Dev | 0.8359 | 0.5529 | 1.1604 | 0.5544 |
| Skewness | 0.1089 | 0.0133 | 0.1035 | 0.3338 |
| Kurtosis | 4.1445 | 8.4214 | 3.4779 | 7.4261 |

Table 5.24: Empirical results vs. theoretical results for Model 2b
Empirical results and theoretical results are listed in Table 5.23. Similarly, Model 2b has the same formulae for higher moments. We report empirical results and theoretical results for this model in Table 5.24. Notice that theoretical moments of the two-jump version models match empirical results better than the one-jump version models. However, comparing empirical and theoretical moments does not discriminate between Model 1b and Model 2b, because their results are quite close to each other.
5.6 Quantile-Quantile Plot
In this section, we will apply the quantile-quantile (QQ) plot to compare Model 1b and Model 2b. A QQ plot is a graphical technique for determining if two data sets come from the same distribution. A QQ plot is a plot of the quantiles¹¹ of the first data set against the quantiles of the second data set, superimposed by a 45-degree reference line. If the two sets come from a common distribution, the points should fall approximately along this reference line. The greater the departure from this reference line, the greater the evidence for the conclusion that the two data sets come from populations with different distributions. We can use a QQ plot to examine whether the one-jump version models or the two-jump version models match the price dynamics better.
Let us consider
$$Y_t := X_t - X_{t-1} + \kappa X_{t-1}\Delta t \tag{5.21}$$
with step size $\Delta t$, where $X_t$ is the log price at time $t$, and $\Delta t = 1$ for the hourly data series. For one-jump version models, one may generate random paths from the following model (Equation (5.22)) with the estimates in Table 5.5:
$$Y_t = \kappa\alpha\Delta t + \sigma\,dW_t + Q\,dP(\omega), \tag{5.22}$$
with the long-term mean $\alpha$ different for Model 1a and Model 2a.¹² Likewise, for two-jump version models, one may generate random paths from the following model (5.23) with the estimates in Table 5.7:
$$Y_t = \kappa\alpha\Delta t + \sigma\,dW_t + Q^u\,dP^u(\omega_u) + Q^d\,dP^d(\omega_d), \tag{5.23}$$
with the long-term mean $\alpha$ different for Model 1b and Model 2b.¹³ Furthermore, we can obtain a sample $Y$ from the historical data.
¹¹ By a quantile, we mean the fraction (or percent) of points below a given value. That is, the 0.3 (or 30%) quantile is the point at which 30% of the data falls below and 70% falls above that value.
¹² Refer back to Equation (3.6) and Equation (3.20). By doing so, we obtain the stationary process (5.22).
¹³ Refer back to (3.13) and (3.29). By doing so, we obtain the stationary process (5.23).
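The construction of the QQ points can be sketched as follows: form the adjusted increments (5.21) for the historical and the simulated log prices, then pair up matching quantiles of the two samples. The two short series are hypothetical values used only to exercise the functions, and the function names are illustrative.

```python
import statistics

def adjusted_increments(log_prices, kappa, dt=1.0):
    """Y_t = X_t - X_{t-1} + kappa * X_{t-1} * dt, as in equation (5.21)."""
    return [
        x_now - x_prev + kappa * x_prev * dt
        for x_prev, x_now in zip(log_prices, log_prices[1:])
    ]

def qq_points(sample_a, sample_b, n_quantiles=20):
    """Pair up matching quantiles of two samples; near the 45-degree line if alike."""
    qa = statistics.quantiles(sample_a, n=n_quantiles)
    qb = statistics.quantiles(sample_b, n=n_quantiles)
    return list(zip(qa, qb))

# Hypothetical log-price series, used only to exercise the functions.
historical = [3.9, 4.1, 4.0, 4.4, 4.2, 4.3, 3.8, 4.0, 4.1, 4.5]
simulated = [4.0, 4.2, 3.9, 4.1, 4.3, 4.4, 4.0, 3.9, 4.2, 4.1]
kappa = 0.1124

y_hist = adjusted_increments(historical, kappa)
y_sim = adjusted_increments(simulated, kappa)
points = qq_points(y_hist, y_sim, n_quantiles=4)
```

Plotting `points` together with the line y = x gives the kind of QQ plot described above; a sample compared with itself falls exactly on that line.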
Figure 5.9 illustrates QQ plots for the different jump version data versus the adjusted historical data. The horizontal axis shows the quantiles computed from the adjusted historical data by (5.21) while the vertical axis shows the quantiles for one typical simulated path. The points in each QQ plot are close to the reference line, which indicates that the performance of all the models is satisfactory. But it is hard to tell which model is better from the QQ plots since they are quite similar.
5.7 Simulation Study
Using the parameters we obtained in Tables 5.5–5.8, we can simulate the hourly price series to compare with Hourly EP. Here we display the sample plot of one typical simulated path (dashed line), and the sample plot of Hourly EP (solid line) in Figures 5.10–5.13. The 95% quantile plot and 5% quantile plot are also shown in the same figure. Notice that most of the empirical data, except for the extreme “spikes”, are inside the simulation range. In this case, it is hard to tell which model performs best. Furthermore, the histogram of the changes of Hourly EP, and the corresponding distribution of the log returns of the simulated data (an overlaid black line) are shown in Figures 5.14–5.17. All the models underestimate the number of small changes.
If we separate the on-peak/off-peak data, then one only needs to fit Model 1a and Model 1b to the data and compare these two models. Using the resulting estimates from fitting Peak EP and Offpeak EP, we can simulate price series to compare with the empirical data. We display plots of one typical simulated path (dashed line), and the plot of the actual data (solid line) in Figures 5.18–5.21. The 95% quantile plot
[Figure 5.9 panels: Model 1a, Model 2a, Model 1b, Model 2b; horizontal axis: Y Quantiles; vertical axis: Simulated Y Quantiles.]
Figure 5.9: QQ Plots
Note: in each QQ plot, the horizontal axis shows the quantiles for the adjusted historical data. The vertical axis shows the quantiles for one typical simulated path for the model whose name is listed in the subtitle.
| | On-peak | Model 1a | Model 1b |
|---|---|---|---|
| Mean | 4.07631 | 4.04856 | 4.08371 |
| Std.Dev | 0.53402 | 0.67293 | 0.63997 |
| Skewness | 0.13465 | 0.31442 | 0.48105 |
| Kurtosis | 3.65594 | 4.06480 | 3.89188 |

Table 5.25: Descriptive statistics of Peak EP and simulated paths
and 5% quantile plot are also plotted in the same figure for comparison. Again, most of the empirical data, except for the extreme “spikes”, are inside the simulation range. The histogram of the actual data and the corresponding distribution of the log returns of the simulated data (an overlaid black line) are shown in Figures 5.22–5.25. Moreover, Table 5.25 and Table 5.26 compare the moments of the actual data and the simulated data from Model 1a and Model 1b. It seems that Model 1b does a better job since its moments are closer to those of the empirical data.
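The four descriptive statistics in these tables can be computed from a sample as in the sketch below. It assumes population (divide by n) central moments and non-excess kurtosis; the thesis does not state which convention was used, and the function name is hypothetical.

```python
import math

def descriptive_stats(sample):
    """Mean, standard deviation, skewness and kurtosis using 1/n central moments."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2**1.5,
        "kurtosis": m4 / m2**2,
    }

# A small symmetric sample, used only to exercise the function.
stats = descriptive_stats([1.0, 2.0, 3.0, 4.0, 5.0])
```

Applying the same function to the log of the actual prices and to a simulated path gives columns that can be compared row by row.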
5.8 Conclusion
This chapter reports comparisons among all the models via different means. A simulation study shows that MLCCF estimators are very robust and the estimation procedures are quite accurate. But we failed to apply MLMCCF estimators in stochastic volatility models and we did not get accurate results from SGMM estimation within the time available. Therefore, a comparison between stochastic volatility models and the other four models (deterministic volatility) is not supplied. The deterministic volatility models all performed quite well in various comparisons. There is only a slight improvement of the two-jump version models over the one-jump
| | Off-peak | Model 1a | Model 1b |
|---|---|---|---|
| Mean | 3.46474 | 3.42967 | 3.47128 |
| Std.Dev | 0.64445 | 0.75322 | 0.69549 |
| Skewness | 0.21882 | 0.31544 | 0.20126 |
| Kurtosis | 2.76840 | 3.13474 | 3.16340 |

Table 5.26: Descriptive statistics of Offpeak EP and simulated paths
version models. As the only extra term in Model 2b is the on-peak/off-peak dummy variable, we feel that, with proper data preparation (such as deseasonalization, and separation of the on-peak data and the off-peak data), Model 1b will suffice in mimicking the dynamics of the electricity price process in Alberta.
[Figure 5.10 plot: Deseasonalized Hourly EP, a typical path of simulated P, 5% percentile and 95% percentile.]
Figure 5.10: Hourly EP superimposed by simulated paths (Model 1a)
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
[Figure 5.11 plot: Deseasonalized Hourly EP, a typical path of simulated P, 5% percentile and 95% percentile.]
Figure 5.11: Hourly EP superimposed by simulated paths (Model 1b)
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
[Figure 5.12 plot: Deseasonalized Hourly EP, a typical path of simulated P, 5% percentile and 95% percentile.]
Figure 5.12: Hourly EP superimposed by simulated paths (Model 2a)
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
[Figure 5.13 plot: Deseasonalized Hourly EP, a typical path of simulated P, 5% percentile and 95% percentile.]
Figure 5.13: Hourly EP superimposed by simulated paths (Model 2b)
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
−5 −4 −3 −2 −1 0 1 2 3 4 5
0
500
1000
1500
2000
2500
3000
3500
Log Return
F
r
e
q
u
e
n
c
y
Figure 5.14: Comparison of simulated price processes with Hourly EP (Model 1a)
Note: the histogram is of the change of the deseaonalized Hourly EP and the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. Note that this model underestimates the number of small changes.
97
−5 −4 −3 −2 −1 0 1 2 3 4 5
0
500
1000
1500
2000
2500
3000
3500
Log Return
F
r
e
q
u
e
n
c
y
Figure 5.15: Comparison of simulated price processes with Hourly EP (Model 1b)
Note: the histogram shows the changes of the deseasonalized Hourly EP; the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. This model underestimates the number of small changes.
[Plot omitted. x-axis: Log Return; y-axis: Frequency.]
Figure 5.16: Comparison of simulated price processes with Hourly EP (Model 2a)
Note: the histogram shows the changes of the deseasonalized Hourly EP; the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. This model underestimates the number of small changes.
[Plot omitted. x-axis: Log Return; y-axis: Frequency.]
Figure 5.17: Comparison of simulated price processes with Hourly EP (Model 2b)
Note: the histogram shows the changes of the deseasonalized Hourly EP; the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. This model underestimates the number of small changes.
[Plot omitted. Title: Simulated Price and Real Price. x-axis: Time; y-axis: $/MWh. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile.]
Figure 5.18: Peak EP superimposed by simulated paths (Model 1a)
Note: this figure includes the sample plot of Peak EP, one typical simulated path,
and the 95% and 5% quantiles of the simulation.
[Plot omitted. Title: Simulated Price and Real Price. x-axis: Time; y-axis: $/MWh. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile.]
Figure 5.19: Peak EP superimposed by simulated paths (Model 1b)
Note: this figure includes the sample plot of Peak EP, one typical simulated path,
and the 95% and 5% quantiles of the simulation.
[Plot omitted. Title: Simulated Price and Real Price. x-axis: Time; y-axis: $/MWh. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile.]
Figure 5.20: Off-peak EP superimposed by simulated paths (Model 1a)
Note: this figure includes the sample plot of Off-peak EP, one typical simulated path,
and the 95% and 5% quantiles of the simulation.
[Plot omitted. Title: Simulated Price and Real Price. x-axis: Time; y-axis: $/MWh. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile.]
Figure 5.21: Off-peak EP superimposed by simulated paths (Model 1b)
Note: this figure includes the sample plot of Off-peak EP, one typical simulated path,
and the 95% and 5% quantiles of the simulation.
[Plot omitted. x-axis: Log Return; y-axis: Frequency.]
Figure 5.22: Comparison of simulated price processes with Peak EP (Model 1a)
Note: the histogram shows the changes of the deseasonalized Peak EP; the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. This model overestimates the number of medium-sized changes.
[Plot omitted. x-axis: Log Return; y-axis: Frequency.]
Figure 5.23: Comparison of simulated price processes with Peak EP (Model 1b)
Note: the histogram shows the changes of the deseasonalized Peak EP; the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. This model underestimates the number of small changes and overestimates the
number of medium-sized changes.
[Plot omitted. x-axis: Log Return; y-axis: Frequency.]
Figure 5.24: Comparison of simulated price processes with Off-peak EP (Model 1a)
Note: the histogram shows the changes of the deseasonalized Off-peak EP; the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. This model underestimates the number of small changes.
[Plot omitted. x-axis: Log Return; y-axis: Frequency.]
Figure 5.25: Comparison of simulated price processes with Off-peak EP (Model 1b)
Note: the histogram shows the changes of the deseasonalized Off-peak EP; the overlaid
black line is the corresponding distribution of the log returns of one typical simulated
path. This model underestimates the number of small changes.
Chapter 6
Conclusion
Energy commodity markets are growing rapidly as the deregulation and restructuring
of electricity supply industries spread in North America and around the world.
There is a heightened awareness of the need to understand the dynamics of electricity
markets for electricity trading, risk management and project pricing.
Because electricity is difficult to store, the interplay between supply and
demand produces unique electricity price dynamics in each electricity market.
Among all the energy commodities, electricity therefore poses the biggest challenge
for researchers and practitioners seeking to model price behaviour. Furthermore,
because deregulation is relatively new, there have been few empirical studies
focusing entirely on electricity prices.
In this thesis, we have addressed issues of modeling electricity spot prices in
Alberta. We examined a broad class of stochastic models which can be used to
model the behaviour of electricity prices, with features including mean-reversion,
time-varying mean, stochastic volatility, and multiple jumps. We demonstrated how to
obtain the CCF by means of the transform analysis. We also showed how to fit those
models to the Alberta hourly electricity prices via ML-CCF and SGMM estimation.
Through extensive empirical comparisons among the models, we
showed that the two-jump version of the mean-reverting jump-diffusion (Model 1b) is
generally superior to the other models after proper deseasonalization and separation
of the on-peak and off-peak data.
In future research, a number of important issues ought to be considered. Affine
processes can be applied to multiple types of processes, such as those with stochastic
volatility and jumps, and to more general structures, such as multiple latent variables
and time-varying jump components, without sacrificing option pricing tractability.
Although this thesis concentrated on calibration, affine processes have many additional
advantages that could be further investigated. In future work, we should be able to
compute the prices of various electricity derivatives (options) under the assumed
underlying affine jump-diffusion price processes by exploiting the transform analysis
when applicable.
Stochastic volatility models should be further considered from a purely statistical
perspective. They may perform better if we fully exploit the information in
the joint CCF and use other estimation methodologies such as Bayesian Markov chain
Monte Carlo, efficient method of moments, generalized method of moments,
simulated method of moments, Kalman filtering methods, and others.
However, generating useful forecasts from jump-diffusion models is difficult.
Averaging processes could cause the loss of much important information. As pointed
out by Knittel and Roberts (see [26]), “One must simulate a forecasted path because
of the model's dependency on random jumps. Of course, there are a continuum of
possible paths that may be simulated by, intuitively, flipping a λ-coin each time
period, and drawing from the appropriate conditional distribution. Combining many
simulated paths averages out the excess variation induced from the jumps, and leaves
a very smooth forecast representing a number falling somewhere between the means
of the two conditional distributions that make up the mixture.”
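
The smoothing effect described in the quotation can be illustrated with a small simulation. The sketch below assumes a deliberately simplified setting rather than one of the calibrated models: in each period a λ-coin decides whether the price change is drawn from a "jump" normal distribution or from a "no-jump" normal distribution. The parameter values and function names are hypothetical.

```python
import random
import statistics


def simulate_changes(n_periods, lam, mu_jump, sd_jump, mu_diff, sd_diff, rng):
    """Draw one path of per-period changes from a two-component mixture."""
    changes = []
    for _ in range(n_periods):
        if rng.random() < lam:                       # the lambda-coin lands on "jump"
            changes.append(rng.gauss(mu_jump, sd_jump))
        else:                                        # otherwise draw a no-jump change
            changes.append(rng.gauss(mu_diff, sd_diff))
    return changes


def average_forecast(n_paths, n_periods, rng, **params):
    """Average the simulated changes across paths, period by period."""
    paths = [simulate_changes(n_periods, rng=rng, **params) for _ in range(n_paths)]
    return [statistics.fmean(p[t] for p in paths) for t in range(n_periods)]


rng = random.Random(42)  # fixed seed so the illustration is reproducible
params = dict(lam=0.1, mu_jump=2.0, sd_jump=0.5, mu_diff=0.0, sd_diff=0.2)
single_path = simulate_changes(50, rng=rng, **params)
forecast = average_forecast(5000, 50, rng, **params)

# The averaged forecast settles near the mixture mean, which lies between
# the two conditional means mu_diff and mu_jump.
mixture_mean = params["lam"] * params["mu_jump"] + (1 - params["lam"]) * params["mu_diff"]
```

In this toy setting `single_path` shows occasional large moves, while `forecast` is nearly flat around `mixture_mean`. That is the loss of information the quotation refers to.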
Furthermore, none of the models we consider can capture unexpected events such
as changes in macroeconomic variables, forced outages of power generation plants,
unexpected contingencies in transmission networks and the like. Such unexpected
events often result in price processes following completely different dynamics.
Therefore, we need to try other types of affine models to get useful forecasts of
the spot prices, and this could be one of the most interesting areas of future research.
Bibliography
[1] S.L. Puller, Pricing and firm conduct in California's deregulated electricity
market, University of California Energy Institute, The US Power Market, Risk
Publications, Tech. Report, 2002.
[2] A. Eydeland and G. Hélyette, Pricing power derivatives, Risk, 1998.
[3] J.S. Deng, Valuation of investment and opportunity-to-invest in power
generation assets with spikes in electricity price, School of Industrial & Systems
Engineering, Georgia Institute of Technology, Tech. Report, 2003.
[4] H. Geman and A. Roncoroni, A class of marked point processes for modeling
electricity prices, Department of Finance, ESSEC Business School, Tech.
Report, 2001.
[5] D. Duffie, J. Pan, and K.J. Singleton, Transform analysis and asset pricing
for affine jump-diffusions, Econometrica (68), 2000.
[6] D.S. Bates, Empirical Option Pricing: A Retrospection, University of Iowa and
the National Bureau of Economic Research, Tech. Report, 2002.
[7] Q. Dai and K.J. Singleton, Specification analysis of affine term structure
models, Graduate School of Business, Stanford University, Working Paper, 1998.
[8] G. Chacko and S. Das, Pricing interest rate derivatives: a general approach,
NBER Working Paper, 2000.
[9] S. Das and S. Foresi, Exact solutions for bond and option prices with systematic
jump risk, Review of Derivatives Research, Vol. 1, 1996.
[10] J.S. Deng, Pricing electricity derivatives under alternative stochastic spot price
models, University of California Energy Institute, Tech. Report, 1998.
[11] P. Villaplana, Pricing power derivatives: a two-factor jump-diffusion approach,
University Carlos III Madrid, Job Market Paper, 2003.
[12] K.J. Singleton, Estimation of affine asset pricing models using the empirical
characteristic function, Stanford University and NBER, Journal of Econometrics,
2001.
[13] G. Chacko and L.M. Viceira, Spectral GMM estimation of continuous-time
processes, Graduate School of Business Administration, Harvard University,
Tech. Report, 2001.
[14] T. Daniel, J. Doucet and A. Plourde, Electricity Industry Restructuring: The
Alberta Experience, School of Business, University of Alberta, Working Paper,
2001.
[15] http://www.energy.gov.ab.ca/com/Electricity/Introduction/Electricity.htm
[16] R. Elliott, G. Sick and M. Stein, Pricing electricity calls, University of Alberta
and University of Adelaide, University of Calgary, University of Oregon, Tech.
Report, 2000.
[17] G. Lee and B. Jr, Stochastic Financial Models for Electricity Derivatives,
University of Stanford, Ph.D. Thesis, 1999.
[18] L.L. Yousef, Derivation and Empirical Testing of Alternative Pricing Models
in Alberta’s Electricity Market, University of Calgary, Thesis, 2001.
[19] P. Wilmott, Derivatives, John Wiley and Sons, 1998.
[20] D. Pilipovich, Energy Risk: Valuing and Managing Energy Derivatives,
McGraw-Hill, New York, 1997.
[21] Y. Aït-Sahalia, Maximum likelihood estimation of discretely sampled diffusions:
A closed-form approximation approach, Econometrica, Vol. 70, No. 1, 2002.
[22] J.J. Lucia and E.S. Schwartz, Electricity prices and power derivatives: evidence
from the Nordic Power Exchange, Review of Derivatives Research (5), 2002.
[23] L. Clewlow and C. Strickland, Energy Derivatives: Pricing and Risk Management,
Pricing and Risk, LACIMA Publications, 2000.
[24] J.S. Deng, Stochastic models of energy commodity prices: Mean-reversion with
jumps and spikes, University of California Energy Institute, Tech. Report, 1998.
[25] H. Zhou, Jump-diffusion term structure and Itô conditional moment generator,
Federal Reserve Board, Tech. Report, 2001.
[26] C.R. Knittel and M.R. Roberts, An empirical examination of deregulated
electricity prices, University of California Energy Institute, Working Paper, 2001.
[27] C. Ball and W.N. Torous, A simplified jump process for common stock returns,
Journal of Financial and Quantitative Analysis (18), 1983.
[28] S. Beckers, A note on estimating the parameters of the diffusion-jump model
of stock returns, Journal of Financial and Quantitative Analysis (16), 1981.
[29] M. Carrasco and J.P. Florens, Estimation of a mixture via the empirical
characteristic function, Tech. Report, 2000.
[30] P. Honoré, Pitfalls in estimating jump-diffusion models, Centre for Analytical
Finance, University of Aarhus, Working Paper Series No. 18, 1998.
[31] A. Lo, Maximum likelihood estimation of generalized Itô processes with
discretely sampled data, Econometric Theory (4), 1988.
[32] N.M. Kiefer, Discrete parameter variation: Efficient estimation of a switching
regression model, Econometrica, 1978.
[33] J.D. Hamilton, Time Series Analysis, Princeton University Press, 1994.
[34] D.R. Smith, Essays in Empirical Asset Pricing, The University of British
Columbia, Ph.D. Thesis, 2002.
[35] T. Bollerslev, Generalized autoregressive conditional heteroskedasticity, Journal
of Econometrics (31), 1986.
[36] V. Naik, Option valuation and hedging strategies with jumps in the volatility
of asset returns, Journal of Finance (48), 1993.
[37] E. Ghysels, A. Harvey, and E. Renault, Handbook of Statistics, vol. 14, 1995.
[38] N. Shephard, Statistical aspects of ARCH and stochastic volatility, Nuffield
College, Oxford, Manuscript, 1995.
[39] P.E. Protter, Stochastic Integration and Differential Equations,
Springer-Verlag, 2nd Edition, 2003.
[40] V. Kaminski, The challenge of pricing and risk managing electricity
derivatives, The US Power Market, Risk Publications, 1997.
[41] M. Goto and G.A. Karolyi, Understanding Electricity Price Volatility Within
and Across Markets, The Central Research Institute of Electric Power Industry
in Japan and the Dice Center for Financial Economics, Tech. Report, 2003.
[42] V. Robert and T. Allen, Introduction to Mathematical Statistics, Fifth Edition,
Prentice Hall, Englewood Cliffs, New Jersey, 1995.
[43] G.J. Jiang and J.L. Knight, Estimation of continuous time processes via the
empirical characteristic function, Journal of Business and Economic Statistics,
2001.
[44] G.R. Grimmett and D.R. Stirzaker, Probability and Random Processes, Oxford
Science Publications, 2nd Edition, 1982.
[45] B. Dupire, Pricing with a smile, RISK, Vol. 7, 1994.
[46] T. Coleman, Li and A. Verma, Reconstructing the unknown volatility function,
Journal of Computational Finance, Vol. 2, Number 3, 1999.
[47] B. Hamida and R. Cont, Recovering volatility from option prices by
evolutionary optimization, Manuscript, CNRS, 2004.
[48] R. Cont and P. Tankov, Financial Modeling with Jump Processes, Chapman
and Hall/CRC, 2004.
[49] L. Hansen, Large sample properties of generalized method of moments
estimators, Econometrica, Vol. 50, 1982.
[50] L. Matyas, Generalized Method of Moments Estimation, Cambridge, 1999.
[51] G. Schwarz, Estimating the dimension of a model, Annals of Statistics (6),
1978.
[52] S.R. Das, The surprise element: jumps in interest rates, Santa Clara
University, Tech. Report, 2000.
Chapter 1
Introduction
Over the last decade, the structure of the North American electricity markets has
undergone radical changes, especially with regard to generation and supply. Electricity
markets are experiencing rapid deregulation and restructuring, which provides utilities
with new opportunities but more competition as well. Electricity prices are no longer
controlled by regulators and are now essentially determined according to the economic
rule of supply and demand. At the same time, the deregulation and restructuring of
electricity markets has paved the way for a considerable amount of trading activity.
The resulting spot prices are crucial for the valuation of physical assets, derivatives
and, more generally, for the risk management of utilities. The recent experience of the
state of California has illuminated the importance of understanding electricity price
behaviour (see [1]).

While there has been a significant amount of research on commodity prices, the
empirical studies on electricity prices are not thoroughly developed (see [26]).
Furthermore, most studies have been focused on the U.S. and U.K. electricity markets.
In this thesis, however, we will primarily study the Alberta Electricity Market. It is a
very challenging task due to the erratic nature of electricity prices.

Instead of using the expected future value of electricity price as the underlying
process, as discussed in Eydeland and Hélyette (1998) (see [2]), it seems that it is
better to use electricity spot prices directly to study the dynamics of the electricity
market. Introducing a convenience yield, a non-observable risk factor, does not really
make sense for electricity: since there is no available technique to store power (except
for hydro), holding the underlying asset does not help us. The spot price process by
itself should contain most of the fundamental properties of electricity (see [2]).
Moreover, in certain regions, electricity spot markets are relatively more liquid than
the corresponding futures markets, especially for electricity futures with maturity
beyond 12 months (see [3]). We represent electricity spot prices in a natural
logarithmic scale, which stabilizes the statistical estimation procedure and ensures
strict positivity of sample paths.^1

^1 Up to now, negative electricity prices have been rarely observed, see [4].

We begin by examining the empirical behaviour of electricity spot prices in the
Alberta Electricity Market in Chapter 2. Some typical models that have been used to
explain the dynamics of electricity prices are also discussed in Chapter 2. Through this
examination, we believe that the class of mean-reverting jump-diffusion models is one
of the ideal choices for capturing the characteristics of electricity prices in the
Alberta Electricity Market.

The asset price models we consider all fall into the class of affine processes, which
are given a general treatment in [5]. In Chapter 3, we outline the relevant features of
this affine framework. As discussed by various authors (see [6], [7], and [8]), affine
processes can be applied to multiple types of processes such as those with jumps,
time-varying mean, and stochastic volatility, and even more general structures like
latent variables or time-varying jumps, without sacrificing computational tractability.
Therefore, affine processes have great advantages for econometric work.

We will start with a one-factor model based on the logarithm of the spot prices. This
model will be considered and extended in subsequent sections of the thesis.

We will deal with a one-jump version of the process and a two-jump version of the
process. The one-jump version was first proposed by Das and Foresi (see [9]). The jump
component has an exponentially distributed absolute value of jump size with sign of
jump determined by a Bernoulli variable, while the two-jump version assumes that there
are asymmetric upward jumps and downward jumps. This type of process was first
considered in Duffie, Pan and Singleton (1998) and adopted by Deng (1998) and
Villaplana (2003) to mimic the “spikes” in the electricity price process (see [5],
[10], [11]).

We also have two different specifications for the drift term:
• Deterministic long-term mean.
• Time-varying long-term mean incorporating on-peak and off-peak effects.

Furthermore, we also have two different specifications for the diffusion term:
• Deterministic volatility.
• Stochastic volatility, which is itself a mean-reverting square root process.

Specifically, the following alternative mean-reverting jump-diffusion (MRJD) models
have been explored.
• Model 1a: one-jump version of MRJD.
• Model 1b: two-jump version of MRJD.
• Model 2a: one-jump version of MRJD with time-varying long-term mean.
• Model 2b: two-jump version of MRJD with time-varying long-term mean.
• Model 3a: one-jump version of MRJD with stochastic volatility.
• Model 3b: two-jump version of MRJD with stochastic volatility.

Affine jump-diffusion processes have analytic solutions for the conditional
characteristic function associated with the conditional density function. This implies
that one can obtain the conditional densities via Fourier inversion or other methods.
The computations of those conditional characteristic functions (CCF) are introduced in
Chapter 3.

Chapter 4 exploits the CCF to develop computationally tractable estimators of the
parameters. Specifically, we use the Fourier transform of the CCF to derive the
conditional log-likelihood function. When all the state variables are observable, we
can use the standard maximum likelihood (ML) approach by maximizing this likelihood
function. This yields the asymptotically efficient ML-CCF estimator (see [12]). But the
computation cost will grow rapidly as the number of state variables increases.

When there are unobservable state variables (latent variables), the ML-CCF estimator
cannot be constructed directly. Following Chacko and Viceira (1999) (see [13]), we
obtain a so-called marginal CCF by integrating the latent variable out of the CCF.
Based on this marginal conditional characteristic function (MCCF), we can carry out the
ML-MCCF estimation. This MCCF can be used, as well, to derive closed-form expressions
for the so-called conditional moments. One can then construct the spectral generalized
method of moments (SGMM) estimators introduced by Chacko and Viceira (1999) (see
[13]). But the estimators based on the MCCF are less efficient compared to other
parameter estimation methods that have been used in stochastic volatility models. It is
a tradeoff between computational ease and efficiency.

Also, extensive empirical comparisons among the models have been conducted in
Chapter 5. The data series we used are the hourly electricity price series for the
province of Alberta from January 1, 2002 to March 31, 2004.
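
The step from a characteristic function to a density, and from there to a log-likelihood, can be illustrated numerically. The sketch below is not the implementation used in this thesis. It takes the characteristic function of a normal distribution as a stand-in for a model CCF, because the corresponding density is known in closed form and the inversion can be checked. The truncation limit `u_max`, the grid size `n`, and the function names are arbitrary choices made for illustration.

```python
import cmath
import math


def normal_cf(u, mean, sd):
    """Characteristic function of N(mean, sd^2), used here as a stand-in CCF."""
    return cmath.exp(1j * u * mean - 0.5 * (sd * u) ** 2)


def density_from_cf(cf, x, u_max=50.0, n=20000):
    """f(x) = (1/pi) * integral_0^inf Re[exp(-i u x) cf(u)] du, trapezoidal rule."""
    h = u_max / n
    total = 0.0
    for k in range(n + 1):
        u = k * h
        value = (cmath.exp(-1j * u * x) * cf(u)).real
        weight = 0.5 if k in (0, n) else 1.0   # half weight at the two end points
        total += weight * value
    return total * h / math.pi


def log_likelihood(observations, cf):
    """Sum of log densities obtained by inverting the characteristic function."""
    return sum(math.log(density_from_cf(cf, x)) for x in observations)


def cf(u):
    # Hypothetical parameter values, for illustration only.
    return normal_cf(u, mean=0.1, sd=0.5)


approx = density_from_cf(cf, 0.3)
exact = math.exp(-0.5 * ((0.3 - 0.1) / 0.5) ** 2) / (0.5 * math.sqrt(2 * math.pi))
```

Here `approx` and `exact` should agree closely, because the integrand decays rapidly in `u`. For a model CCF the same inversion would be applied to each transition, with the characteristic function depending on the previous observation and on the parameters being estimated.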
Chapter 2 Background

2.1 The Alberta Electricity Market

As pointed out by Daniel et al. [14], “Alberta is somewhat unique in Canada in that there has never existed a single, vertically integrated Crown (that is, government-owned) monopoly serving the electricity needs of the province, which can be construed as a “Canadian” characteristic.” Generation and retail services are open to competition, which strongly resembles the U.S. industry. However, transmission and distribution services are still under government regulation.

Restructuring of the electricity industry was first broached in Alberta in the early 1990s and deregulation started from January 1, 1996. The Alberta Electricity Market is relatively small. It has about 12,400 MW of installed generating capacity, including 11,500 MW in the province's integrated electrical system and access to 950 MW from B.C. and Saskatchewan. Alberta has benefitted from more than 3,000 MW of new generation since 1998, an increase of about 30% in capacity. Thermal sources account for the majority of Alberta's installed generating capacity. The remainder is hydro and wind (see [15]). The experience of the Alberta Electricity Market reflects many aspects involved in deregulation and restructuring, and provides insights to other electricity markets.

Power generated in Alberta is exchanged through the Alberta Electric System Operator (AESO). The Alberta Energy and Utilities Board (EUB) provides governance and direction of the AESO.¹ The AESO is also responsible for dispatching all electric power generation in Alberta and directing the operation of Alberta's electricity network to ensure reliable and economic systems. The Energy Trading System (ETS), which enables both spot and forwards electricity markets, facilitates the real-time wholesale electricity market. All trading of power through the AESO is initiated by a process of offers and bids. Then the AESO establishes a “merit order”² to meet forecast pool demand by ranking offers and bids from lowest cost to highest cost for each hour of the day.

¹ The Alberta Electric System Operator brings together two former entities: the Power Pool of Alberta and the Transmission Administrator of Alberta.
² For more details, visit www.aeso.ca.

The System Coordination Centre is the heart of Alberta's wholesale real-time electricity market. The System Controllers are responsible for the real-time operations in this market. They dispatch electricity to meet demand and monitor the status of the electric system through the Energy Management System (EMS). The System Controllers dispatch offers and bids to keep the balance of supply and demand in the “merit order” and ensure the lowest cost. The last bid or offer that is dispatched every minute sets the System Marginal Price (SMP). A time-weighted average of the marginal prices at the end of the hour is calculated as the market price for that hour. There are no differential transmission charges for locations in Alberta (see [16]).

Figure 2.1 illustrates an example of hourly prices, which are measured in dollars per megawatt hour ($/MWH, Canadian dollars being the monetary unit here and in the remainder of this thesis), during the period January 1, 2002 to March 31, 2004 (19,704 observations). Figure 2.2 presents an empirical histogram for the log returns of the hourly electricity prices over the same period overlaid with a normal density curve.³ Clearly, the probability density curve of the change is not normal, because it has slimmer tails and a higher peak at the mean than the normal distribution.

³ The log return of the price process $P$ at time $t$ in this thesis is defined as $d(X_t) = X_{t+1} - X_t$ with $X_t = \log P_t$.

Figure 2.1: Plot of the hourly electricity prices. (Panel title: Hourly Electricity Prices for Jan 1, 2002 to Mar 31, 2004; vertical axis: $/MWH; horizontal axis: Time.)

Figure 2.2: Empirical histogram of log returns with normal density superimposed. Note: the histogram is the log returns of the hourly electricity prices; the overlaid solid line is the plot of a normal density with the same mean and variance as the log returns.

The day time is divided into on-peak periods and off-peak periods to assist firms in managing electricity consumption and in taking advantage of lower rates in off-peak hours, which will in turn help conserve energy and enhance available supply. According to the Alberta Energy and Utilities Board, the on-peak period is from 08:00 to 21:00 Monday through Friday inclusive, with the exception of statutory holidays.⁴ The remaining hours from Monday through Sunday are off-peak hours. Figure 2.3 includes sample plots of the on-peak and off-peak daily average electricity prices from January 1, 2002 to March 31, 2004 together with a histogram of the log returns superimposed by a normal density.

⁴ The statutory holidays in Alberta are New Year's Day, Family Day (3rd Monday of February), Good Friday, Victoria Day, Canada Day, Labour Day, Thanksgiving Day, Remembrance Day, and Christmas Day.

Figure 2.3: Sample plots of the on/off-peak prices together with the histogram of log returns. (Panel titles: Log of Price; Histogram for the Log-Return. Rows: On-Peak; Off-Peak.) Note: the upper left plot is the sample plot of the logarithm of the on-peak daily average prices. Similarly, the lower left plot is the sample plot of the logarithm of the off-peak daily average prices. The upper right plot is the histogram for the log returns of the on-peak daily average. The lower right plot is the histogram for the log returns of the off-peak daily average. Both histograms illustrate the deviation from normality. Notice that the differences between the log of the on-peak average prices and the log of the off-peak are not significant.

Just as electricity prices in other markets, electricity prices in the Alberta Market also display the following distinct characteristics (see [17]):

1. Pronounced cyclical effects. Electricity exhibits the most complicated cyclical patterns of all energy commodities. Normally, prices have hourly, intra-daily, daily, and weekly patterns. Additionally, there are peaks reflecting heating and cooling needs. Figure 2.4 presents the normalized average weekday hourly electricity prices for each season of the year 2000 to the year 2003.⁵ It may appear that temperatures are not a very significant factor in Alberta since it is hard to tell that electricity prices have strongly significant different behaviour, especially from about 8:00 p.m. to 8:00 a.m., in different seasons in Alberta from the year 2000 to the year 2003. Notice that within a time span of 24 hours, prices increase as demand increases with a distinct hourly pattern. The price increases at around 5:00 a.m. as people wake up and the work hours begin. Prices continuously increase throughout the day as demand increases, peaking at around 5:00 p.m. After the work hours, demand shifts to primarily residential usage and prices begin to decrease. Figure 2.5 illustrates more clearly the daily usage pattern.⁶ The prices are measured in dollars per megawatt hour ($/MWH) on the left vertical axis and demand is measured in megawatts (MW) on the right vertical axis. Moreover, it is apparent that prices, in some sense, are mimicking demand. Electricity prices are normally higher when demand is greater.

⁵ In normalizing, we adjust each average weekday hourly price by dividing by the year's average hourly price.
⁶ This is similar to the investigation of the hourly electricity prices from California, see [26].

Figure 2.4: Average weekday hourly prices by season. (Legend: Spring (Mar 20 to Jun 20), Summer (Jun 21 to Sep 22), Fall (Sep 23 to Dec 21), Winter (Dec 22 to Mar 19); vertical axis: normalized $/MWH; horizontal axis: Hour.) Note: from this plot, it is hard to tell that electricity prices have strongly significant different behaviour, especially from about 8:00 p.m. to 8:00 a.m., in different seasons in Alberta from the year 2000 to the year 2003.

Figure 2.5: A sample of hourly electricity prices for two weeks. (Series: Power Price, Demand, Mean Level of Price; left vertical axis: $/MWH; right vertical axis: MW.) Note: this is a plot of sample hourly electricity prices together with the corresponding demand for two weeks.

2. Sizable price “spikes”. Electricity is non-storable (other than hydro) and is more affected by transmission constraints, seasonality and weather. As shocks in demand and supply cannot be smoothed, electricity spot prices are extremely volatile and occasionally reach extremely high levels, commonly known as “spikes”. We see one (or several) upward jump shortly followed by a steep downward move. This is the most dramatic feature of Figure 2.1. It is not uncommon to see that the fluctuation of electricity prices is more than $700/MWH. One consequence of “spikes” is the presence of so-called “fat tails”, i.e., the probability of a very large positive or negative change (though small) is much larger than permitted by a normal distribution. This can be measured by kurtosis.

3. Mean reversion. This has been well documented as an important characteristic of electricity prices. The price oscillates around the mean level and gets pulled back to this mean level rapidly after a spike. It is also quite clear in Figure 2.5. This phenomenon in particular gives rise to a high mean reversion rate.

4. Price-dependent variance. There is empirical evidence suggesting that the volatility of electricity prices is high when the aggregate demand is high and vice versa.⁷ Figure 2.6 plots the 168-hour moving average prices⁸ from Jan 1, 2002 to Mar 31, 2004 versus the variance of the corresponding 168-hour electricity prices. The prices are measured in dollars per megawatt hour ($/MWH) on the horizontal axis. This plot clearly exhibits an upward trend, which is significant, indicating that the variance is price-dependent.

⁷ In general, volatility is a statistical measure of the tendency of a market or security to rise or fall sharply within a short period of time. Volatility is typically calculated by using variance or annualized standard deviation of the prices or log returns (see [19]).
⁸ For the spot prices $P_t$, $t = 1, 2, \cdots, N$, the 168-hour moving average price is $M_t = \sum_{i=0}^{167} P_{t+i}/168$, and $\sigma_t^2 = \mathrm{var}(\{P_{t+i}\}_{i=0}^{167})$, $t = 1, \cdots, N - 168$.

Figure 2.6: The moving average electricity prices vs. their variance. (Horizontal axis: Power Price ($/MWH); vertical axis: Variance.) Note: this is a log-log scale plot of the 168-hour moving average electricity prices versus the variance of the corresponding 168-hour electricity prices. Electricity prices are roughly proportional to the square root of their variance.

5. Non-negativity. As it costs money to produce electricity, electricity prices are normally positive although sometimes they are close to zero, as shown in Figure 2.1.

Thus, modeling the price behaviour of electricity is a very challenging task for researchers and practitioners. Some typical models that have been used to explain the dynamics of electricity prices are discussed in the following.

2.2 Models

Common models of asset prices offer a poor representation and forecasting of the electricity price process because they fail to capture the erratic nature of electricity prices (see [10], [17], [20]).

2.2.1 Geometric Brownian Motion Process

The geometric Brownian motion process forms the basis for the “Black-Scholes-Merton” option pricing model and can be written as the following stochastic differential equation (SDE):

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t. \tag{2.1}$$

Here $dS_t$ is the stochastic increment over an infinitesimal time interval $dt$, and $dW_t$ represents an increment to a standard Brownian motion $W_t$ which has zero mean and variance $dt$. The drift $\mu$ and the diffusion coefficient $\sigma$ are unknown constants. This model assumes that prices are log normal, or alternatively, the log returns are normally distributed. It is relatively simple to calibrate by maximum-likelihood-based procedures and very popular in non-energy markets (see [21]). Likaa (2001) studied this model for the Alberta Electricity Market and found that it failed to predict accurately the prices of electricity due to the lack of liquidity in the Alberta Electricity Market (see [18]).

2.2.2 Ornstein-Uhlenbeck Process

Mean-reverting diffusion models have been widely adopted to model financial time series, beginning with Vasicek (1977). Lucia and Schwartz (2000) examined the Nordpool market in terms of a totally predictable deterministic component and an Ornstein-Uhlenbeck process based on spot prices or log spot prices (see [22]). If the deterministic part is constant, then their model is a constant-mean reverting process. Likaa (2001) also applied these models to the Alberta Electricity Market (see [18]). One typical example of this type of continuous-time model specifies prices as:

$$dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t, \tag{2.2}$$

where $X_t$ is the logarithm of the electricity price at time $t$, with initial condition $X_0$.
The parameter $\kappa$ is the mean reversion rate, which determines how fast the process returns to the long-term mean level $\alpha$. If the spot price is higher than $e^{\alpha}$, then the drift, $\kappa(\alpha - X_t)$, will be negative, and similarly it will be positive if the spot price is lower than $e^{\alpha}$. This brings the process back towards the long-term mean level. If we write equation (2.2) in integral form as follows

$$X_t = e^{-\kappa t}X_0 + \alpha(1 - e^{-\kappa t}) + \int_0^t e^{\kappa(s-t)}\sigma\,dW_s, \tag{2.3}$$

we obtain a first order autoregressive model (AR(1)) in continuous time and the parameters can be estimated by linear regression analysis (see [18]). The calibration of this model is also relatively easy by maximum-likelihood-based procedures (see [21], [17], [24], [23]).

Mean-reverting diffusion models are widely used in energy markets, but in the case of electricity, we should allow for the possibility of more upside departures than downside ones and more price “spikes” in one direction than the opposite direction, which is not captured by pure mean-reverting processes (see [23]). Not surprisingly, these models failed to give a reasonable prediction of electricity prices of the Alberta Electricity Market due to their limiting assumptions (see [18]).

2.2.3 Jump-diffusion model

Deng et al. (1998), Clewlow and Strickland (1999), and Barz (1999) examined a broad class of stochastic models which can be used to model the behaviour of electricity prices, and performed empirical studies based on the empirical data. They found that models with mean-reversion and jumps are particularly suitable for modeling electricity spot price processes (see [17], [23]).

Actually, there are many empirical studies in the literature that demonstrate the presence of jumps as a significant feature in the behaviour of financial time series. Consequently, jump-diffusion models arise frequently in financial literature. One famous example is Merton's (1976) option pricing model, a mixed model of a continuous Brownian motion and a discrete Poisson lognormally distributed jump. In practice, besides the normal distribution, we can also have the uniform distribution (see [25]) and the exponential distribution (see [9]), including some very close variations, as qualified jump-size choices. The jump intensity can also be extended to vary with the time of day and season (see [26]).

For most jump-diffusion models, the parameters are estimated by Maximum-Likelihood (ML) estimation. ML estimation is a powerful and general method of estimating the parameters of a stochastic process when one has an analytical form for the probability density function. Ball and Torous (1985) successfully used ML estimation to estimate parameters of jump-diffusion models for NYSE stock prices. But parameter estimation in jump-diffusion models is not as easy as it may appear to be (see [27], [28]). The major caveat is that the probability density function for jump-diffusion processes cannot be determined explicitly for most models as it depends on the timing of jumps as well as their size. Moreover, the combination of the distributions of the Brownian motion and the jump component often cannot be computed analytically (see [29], [30]). Let us take the discretized Merton model as an example:⁹

$$\Delta X_t = \left(\alpha - \tfrac{1}{2}\sigma^2\right)\Delta t + \sigma\,\Delta W_t + J_t\,\Delta P_t. \tag{2.4}$$

⁹ This model is also referred to as the Bernoulli diffusion model and is discussed by Ball and Torous (1983) (see [27]). The Poisson process is approximately modeled by a Bernoulli process. For a detailed discussion, see [27].
Here $\Delta X_t$ is the change of $X_t$ during the time interval $\Delta t$, $\alpha$ is the drift, $\sigma$ is the volatility, and $W_t$ is the Brownian motion. Furthermore, $\Delta P_t$ is a Bernoulli process with parameter $\lambda\Delta t$, and the jump amplitude is $J_t \sim N(\mu, \delta^2)$. Using the symbol $\sim$ to denote the distribution of a random variable, one has $\Delta W_t \sim N(0, \Delta t)$, where $N(\mu, \delta^2)$ denotes a normal distribution with mean $\mu$ and standard deviation $\delta$. Assuming that the two normal processes are independent, the probability density function for $\Delta X_t$ (which is denoted as $f(Y_t, \theta)$, where $Y_t = \Delta X_t$, $\theta \in \Theta$, and the parameter space is $\Theta = \{(\mu_1, \mu_2, \sigma_1, \sigma_2, \lambda)\}$ in the Merton model) turns out to have the form:¹⁰

$$f(Y_t, \theta) = (2\pi)^{-\frac{1}{2}}\left[a\,\sigma_1^{-1}\exp\left(-\frac{(Y_t - \mu_1)^2}{2\sigma_1^2}\right) + (1 - a)\,\sigma_2^{-1}\exp\left(-\frac{(Y_t - \mu_2)^2}{2\sigma_2^2}\right)\right], \tag{2.5}$$

where $a = 1 - \lambda\Delta t$, $\mu_1 = (\alpha - \tfrac{1}{2}\sigma^2)\Delta t$, $\mu_2 = (\alpha - \tfrac{1}{2}\sigma^2)\Delta t + \mu$, $\sigma_1^2 = \sigma^2\Delta t$, and $\sigma_2^2 = \sigma^2\Delta t + \delta^2$.

¹⁰ For more details, see [30].

Suppose we have a sequence of $T$ observations of $X_t$ sampled at time $t$, $t = 1, \cdots, T$, so that $\Delta t = 1$. The joint density function $f(\cdot)$ at the sample is given by the following equation:

$$f(Y_1, \cdots, Y_{T-1}, \theta) = \prod_{t=1}^{T-1} f(Y_t, \theta) = \prod_{t=1}^{T-1}(2\pi)^{-\frac{1}{2}}\left[a\,\sigma_1^{-1}\exp\left(-\frac{(Y_t - \mu_1)^2}{2\sigma_1^2}\right) + (1 - a)\,\sigma_2^{-1}\exp\left(-\frac{(Y_t - \mu_2)^2}{2\sigma_2^2}\right)\right], \quad \theta \in \Theta, \tag{2.6}$$

with $Y_t = X_{t+1} - X_t$. Then the logarithm of the likelihood function, $L(Y_1, \cdots, Y_{T-1}, \theta)$, can be written as:

$$L(Y_1, \cdots, Y_{T-1}, \theta) = \sum_{t=1}^{T-1}\ln f(Y_t, \theta). \tag{2.7}$$

However, if $\mu_1 = Y_t$ for any $t$ then $f(Y_t, \theta)$ increases without bound as $\sigma_1$ goes to zero. Moreover, $f$ is not zero at other observations ($\mu_1 \neq Y_t$) due to the existence of the second term of $f$. Thus, $L$ is unbounded and we cannot use standard ML estimation to estimate the parameters (see [32]).¹¹ Therefore ML-based estimation can be only applied to a handful of jump-diffusion models (see [31]), such as those that fall into the class of affine processes, which we will discuss in more detail later on.

¹¹ In practice, it is still possible to obtain estimates via ML-based estimation (see [33], [30]).

2.2.4 GARCH

Autoregressive Conditional Heteroskedastic (ARCH) models were first proposed by Engle (1982) and have been widely applied in many different financial areas. Later, Bollerslev (1986) developed a generalized ARCH model, or GARCH model, which allows more terms in the model. GARCH models are able to capture the very important volatility clustering phenomenon which clearly exists in electricity prices.¹² Moreover, the first order GARCH(1,1) model, shown in Equation (2.8) below, for example, is often adequate to capture the volatility structure of the electricity price $S_t$ (see [35]):

$$S_t = a_0 + a_1 S_{t-1} + \mu_t, \tag{2.8}$$

where $a_0$ and $a_1$ are unknown constants. The forecast error term is $\mu_t = \sigma_t\varepsilon_t$, where $\varepsilon_t$ is an independent random error distributed as $N(0, 1)$. The conditional variance, $\sigma_t^2 := \mathrm{var}(S_t \mid S_{t-1})$, is given by

$$\sigma_t^2 = b_0 + b_1\mu_{t-1}^2 + b_2\sigma_{t-1}^2. \tag{2.9}$$

Here, $b_0$, $b_1$ and $b_2$ are unknown constants (see [35]).

¹² According to [34], the volatility clustering phenomenon describes the situation wherein large volatility movements are more likely to be succeeded by further large volatility movements of either sign than by small movements.

Knittel and Roberts (2001) also introduced a similar approach to model electricity prices. They specified the price level as the sum of a deterministic component and a stochastic component and adopted the Exponential GARCH (EGARCH) model of Nelson (1991) (see [26]). Also, by introducing serial correlation in the error term, they obtained an Autoregressive Moving Average Exogenous (ARMAX) model. They also extended this ARMAX model by incorporating temperature data to capture the seasonality of electricity prices.

2.2.5 Extensions to Models

There are also some plausible extensions of the above models, three of which are now discussed.

Time-Varying Long-term Mean

As electricity prices tend to have dramatic changes in different seasons due to heating and cooling needs and in different times of day due to change in demand, they depart from the “usual” behaviour frequently and significantly, which cannot easily be modeled by a constant long-term mean. Dummy variables can be used to reflect those properties of electricity prices. One may consider a long-term mean $\alpha_t$ such as

$$\alpha_t = a_1\,\mathrm{peak}_t + a_2\,\mathrm{offpeak}_t + \sum_{i=1}^{4} b_i M_t^i, \tag{2.10}$$

where $a_1$, $a_2$ and $b_i$ ($i = 1, \cdots, 4$) are unknown constants, and

$$\mathrm{peak}_t = \begin{cases} 1 & \text{if in on-peak periods,} \\ 0 & \text{otherwise,} \end{cases} \qquad \mathrm{offpeak}_t = \begin{cases} 1 & \text{if in off-peak periods,} \\ 0 & \text{otherwise,} \end{cases} \qquad M_t^i = \begin{cases} 1 & \text{if } t \text{ belongs to the } i\text{th season,} \\ 0 & \text{otherwise.} \end{cases} \tag{2.11}$$

Dummy variables are quite intuitive and can potentially provide some necessary flexibility. But dummy variables are very sensitive to anomalies as a result of this flexibility. Also, the use of dummy variables is always treated as an approximation since the number of steps and the placing of each step point are arbitrarily fixed (see [22]). For high frequency data, a sinusoidal function can be used to capture the seasonal pattern in the price (see [20]).

We may also consider including a deterministic general trend in the long-term mean. For example, one can apply a linear time trend to the logarithm of price. This implies an exponential trend for the price itself. This gives the “trending Ornstein-Uhlenbeck process” proposed by Lo and Wang (1995) (see [22]). Moreover, as suggested by Pilipovic (1998), we can model the long-term mean itself as a stochastic process.

Stochastic Volatility

Due to the characteristics of electricity price behaviour, some authors have argued that models for electricity prices should incorporate a form of volatility which evolves stochastically over time (see [26], [10]). One typical example is to specify the volatility as a square-root process:

$$dv_t = \kappa(\mu - v_t)\,dt + \sigma\sqrt{v_t}\,dW_v, \tag{2.12}$$
where $W_v$ is correlated with the Brownian motion in the spot prices. This is different from the GARCH-type models (2.9), which specify the volatility as a deterministic function of lagged squared forecast error and lagged conditional variance. It is also possible to specify the volatility as a regime-switching (see [36]) or jump-diffusion process (see [5], [10]).

Estimation of stochastic volatility models presents intriguing challenges, and a variety of procedures have been proposed for fitting the models. Examples include Bayesian Markov Chain Monte Carlo (MCMC), Efficient Method of Moments (EMM), Generalized Method of Moments (GMM), Simulated Method of Moments (SMM), and Kalman filtering methods. Two excellent recent surveys are Ghysels et al. (1995) and Shephard (1995) (see [37], [38]).

Markov Regime Switching Process

Markov regime-switching processes have proved to be quite useful in modeling a range of financial time series including stocks, exchange rates and interest rates. Some authors also incorporated them into stochastic processes to capture the dynamics of electricity prices. For example, Hélyette and Andrea (2002) studied the dynamics of electricity prices in the major U.S. electricity markets with a combination of a deterministic term and a regime-switching process. According to their study, a regime-switching model ensures the Markov property in the dynamics of electricity prices, which makes the calibration and forecasting easier (see [4]). Shijie Deng (1999), Elliott, Sick and Stein (2000) and Geman (2001) also used regime-switching processes to model electricity prices (see [10], [16], [4]).
The basic framework of a two-state regime-switching model is to assume that there are a “volatile” state and a “normal” state of the world. The price processes behave differently depending on the state of the world. Unexpected events such as earnings announcements, scandals, or changes in macroeconomic variables signal to investors new information which often results in price processes following completely different dynamics, i.e., switching regimes. Some plausible scenarios for electricity prices would be the forced outages of power generation plants or unexpected contingencies in transmission networks and the like.
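The two-state framework described above can be sketched in a few lines of simulation code. This is a hypothetical illustration, not a model taken from the cited studies: it assumes that the log price follows a discretized mean-reverting process in each regime (unit time step), that the regimes differ only through their parameter triples `(kappa, alpha, sigma)`, and that the state switches with a fixed per-step probability. All names are made up for the example.

```python
import random


def simulate_two_state(n_steps, x0, params, p_switch, seed=0):
    """Simulate a log-price path under a two-state Markov regime switch.

    params[state]   = (kappa, alpha, sigma) used while in that state
                      (state 0 plays the role of "normal", 1 of "volatile").
    p_switch[state] = probability of leaving that state at each step.
    Returns the list of visited states and the simulated path.
    """
    rng = random.Random(seed)
    state, x = 0, x0
    states, path = [state], [x]
    for _ in range(n_steps):
        # Markov switch: depends only on the current state.
        if rng.random() < p_switch[state]:
            state = 1 - state
        kappa, alpha, sigma = params[state]
        # Euler step with dt = 1 of dX = kappa * (alpha - X) dt + sigma dW.
        x = x + kappa * (alpha - x) + sigma * rng.gauss(0.0, 1.0)
        states.append(state)
        path.append(x)
    return states, path


if __name__ == "__main__":
    normal = (0.2, 3.5, 0.05)    # hypothetical "normal" regime
    volatile = (0.2, 3.5, 0.60)  # hypothetical "volatile" regime
    states, path = simulate_two_state(200, 3.5, [normal, volatile], (0.05, 0.30))
    print(sum(states), "of", len(states), "steps spent in the volatile state")
```

Because the switch depends only on the current state, the pair (state, log price) is Markov, which is the property credited above with simplifying calibration and forecasting.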
Chapter 3 Stochastic Models for Electricity Prices
A broad class of mean-reverting jump-diffusion models will be studied in this chapter for modeling electricity spot prices. We will impose an affine structure on the coefficients of the processes, which leads to closed-form or nearly closed-form expressions for the conditional characteristic functions. We will outline the relevant features of this affine framework, which are given a general treatment in Duffie, Pan and Singleton (2000) (see [5], [12]). We will illustrate how to exploit the transform analysis introduced by Duffie, Pan and Singleton (2000) to obtain the CCF in the models we adopt.
3.1 Affine Jump-Diffusion Process
Recently, considerable attention has been focused on affine processes. They are flexible enough to capture certain properties, such as multiple jumps, time-varying long-term mean, and stochastic volatility in various forms, that occur in many financial time series without sacrificing computational tractability. Therefore, affine processes have been widely used to study the term structure of interest rates, the modeling of optimal dynamic portfolios and option pricing, and so on.

We follow here the presentation in Duffie, Pan and Singleton [5]. Suppose that we are given a strong Markov process¹ $X$ with realizations $\{X_t, 0 \le t < \infty\}$ in some state space $D \subset \mathbb{R}^n$. Under certain regularity conditions,² $X_t$ uniquely solves the following stochastic differential equation (SDE) (written in integral form):

$$X_t = X_0 + \int_0^t \mu(X_s, s)\,ds + \int_0^t \sigma(X_s, s)\,dW_s + \sum_{i=1}^{m} Z_t^i. \tag{3.1}$$

The jump behaviour of $X$ is governed by $m$ types of jump processes. Each jump type $Z_t^i$ is a pure jump process with a stochastic arrival intensity $\lambda_i(X_t, t)$ for some $\lambda_i : (D, t) \to \mathbb{R}$ and jump amplitude distribution $\nu_t^i$ on $\mathbb{R}^n$, where $\nu_t^i$ only depends on time $t$.³ The random variable $W_s$ is a standard Brownian motion in $\mathbb{R}^n$. The functions $\mu : (D, t) \to \mathbb{R}^n$ and $\sigma : (D, t) \to \mathbb{R}^{n \times n}$ must satisfy certain boundedness conditions in order to guarantee that (3.1) has a unique solution.

¹ For the technical definition, see [39].
² See [5] for details.
³ The details are given in [5].

The process $X_t$, defined by (3.1), is said to be an affine jump-diffusion process if⁴

$$\mu(X, t) = K_0(t) + K_1(t)X,$$
$$\sigma(X, t)\sigma(X, t)^{\top} = H_0(t) + \sum_{k=1}^{n} H_1^{(k)}(t)X_k,$$
$$\lambda_i(X, t) = l_0^i(t) + l_1^i(t) \cdot X,$$

where for each $0 \le t < \infty$, $K_0(t) \in \mathbb{R}^n$, $K_1(t) \in \mathbb{R}^{n \times n}$, $H_0(t) \in \mathbb{R}^{n \times n}$ and is symmetric, $H_1(t) \in \mathbb{R}^{n \times n \times n}$, $l_0^i(t) \in \mathbb{R}$, and $l_1^i(t) \in \mathbb{R}^n$. Also, for $k = 1, \ldots, n$, $H_1^{(k)}(t)$, defined to be the matrix obtained by fixing the third index of $H_1(t)$ to be $k$, is in $\mathbb{R}^{n \times n}$ and is symmetric. Moreover, $X_k$ is the $k$th entry in $X$.

⁴ For a matrix $C$, $C^{\top}$ is the transpose of $C$. For column vectors $a$ and $b$, the operation $a \cdot b$ is the scalar product of $a$ and $b$.

Notice that, given an initial condition $X_0$, the tuple $\theta = (K_0, K_1, H_0, H_1, l_0, l_1)$ can be used to determine a transform $\Psi^{\theta} : \mathbb{C}^n \times [0, \infty) \times [0, \infty) \times D \to \mathbb{C}$ of $X_T$ conditional on $X_t$, $0 \le t \le T$, defined by

$$\Psi^{\theta}(u, t, T, X_t) = E^{\theta}[\exp(u \cdot X_T) \mid X_t], \tag{3.2}$$

where $E^{\theta}$ denotes the expectation under the distribution of $X_T$ determined by $\theta$. Duffie et al. [5] have proved that if we suppose $\theta = (K_0, K_1, H_0, H_1, l_0, l_1)$ is “well-behaved” at $(u, T)$,⁵ then the transform $\Psi^{\theta}$ of $X_t$, defined by (3.2), exists and is given by:

$$\Psi^{\theta}(u, t, T, X_t) = \exp(M(u, t, T) + N(u, t, T) \cdot X_t).$$

Here $M(\cdot)$ and $N(\cdot)$ satisfy the following complex-valued Riccati equations,

$$\frac{\partial M(u, t, T)}{\partial t} = -A(N(u, t, T), t), \qquad M(u, T, T) = 0,$$
$$\frac{\partial N(u, t, T)}{\partial t} = -B(N(u, t, T), t), \qquad N(u, T, T) = u,$$

where, for any $c \in \mathbb{C}^n$,⁶

$$A(c, t) = K_0(t) \cdot c + \frac{1}{2}c^{\top}H_0(t)c + \sum_{i=1}^{m} l_0^i(\varphi_i(c) - 1), \tag{3.3}$$
$$B(c, t) = K_1(t)^{\top}c + \frac{1}{2}c^{\top}H_1(t)c + \sum_{i=1}^{m} l_1^i(\varphi_i(c) - 1). \tag{3.4}$$

Here $\varphi_i(c)$ is the so-called “jump transform” for the $i$th jump. It is given by $\varphi_i(c) = \int_{\mathbb{R}^n}\exp(c \cdot z)\,d\nu_t^i(z)$ whenever the integral is well defined. For example, for a normally distributed jump size with mean $\mu$ and variance $\sigma^2$, it can be shown that $\varphi(c) = \exp(\mu c + \tfrac{1}{2}\sigma^2c^2)$; for an exponentially distributed jump size with mean $\mu$, $\varphi(c) = \frac{1}{1 - \mu c}$ (see [5]).

⁵ See [5] for details.
⁶ $\mathbb{R}^n$ denotes the set of $n$-tuples of real numbers. $\mathbb{C}^n$ denotes the set of $n$-tuples of complex numbers.
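The two closed-form jump transforms quoted above can be checked numerically in the scalar case. The sketch below is only an illustration: the helper names, the truncated integration ranges, the trapezoid rule, and the parameter values are assumptions chosen for the example, and the complex argument `c` is taken with a real part small enough for the integrals to converge.

```python
import cmath
import math


def jump_transform(c, density, lo, hi, n=20000):
    """Trapezoid approximation of phi(c) = integral of exp(c*z) * density(z) dz
    over the truncated range [lo, hi]."""
    dz = (hi - lo) / n
    total = 0j
    for k in range(n + 1):
        z = lo + k * dz
        weight = 0.5 if k in (0, n) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(c * z) * density(z)
    return total * dz


def exponential_density(mean):
    """Density of an exponential jump size with the given mean (z >= 0)."""
    return lambda z: math.exp(-z / mean) / mean


def normal_density(mean, sd):
    """Density of a normal jump size with the given mean and standard deviation."""
    return lambda z: math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    c = 0.3 + 2.0j
    mu = 0.5
    print(jump_transform(c, exponential_density(mu), 0.0, 40.0), 1.0 / (1.0 - mu * c))
    m, sd = 0.1, 0.3
    print(jump_transform(c, normal_density(m, sd), m - 10 * sd, m + 10 * sd),
          cmath.exp(m * c + 0.5 * sd ** 2 * c ** 2))
```

For the exponential case the integral only converges when the real part of `c` is below `1 / mean`; purely imaginary arguments, as used for characteristic functions, always satisfy this.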
By setting $u = is$ ($i = \sqrt{-1}$), one can obtain the CCF of $X_T$ conditional on $X_t$ as:

$$\phi(s, \theta, X_T \mid X_t) := \Psi^{\theta}(is, t, T, X_t) = E^{\theta}[\exp(is \cdot X_T) \mid X_t] = \int_{\mathbb{R}^n}\exp(is \cdot X_T)\,f(X_T, \theta \mid X_t)\,dX_T, \tag{3.5}$$

where $f(X_T, \theta \mid X_t)$ is the conditional density of $X_T$ conditional on $X_t$. Notice that the CCF is actually the Fourier transform⁷ of the conditional density. Through the inverse Fourier transform, one can recover the conditional density function from the CCF and implement a usual ML estimation. We will exploit this CCF of discretely sampled observations to develop computationally tractable estimators of parameters in Chapter 4.

⁷ Brief summary of the Fourier transform:
1. Given a function $f(x)$, we take the Fourier transform to be $\hat{f}(\xi) = \mathcal{F}[f](\xi) = \int_{-\infty}^{\infty} f(x)e^{ix\xi}\,dx$.
2. Integration by parts reveals that the Fourier transform takes differentiation to multiplication (by $\xi$): $\mathcal{F}[f_x](\xi) = -i\xi\hat{f}(\xi)$.
3. If $h(x) = \int_{-\infty}^{\infty} f(x - y)g(y)\,dy$ then $\hat{h}(\xi) = \hat{f}(\xi)\hat{g}(\xi)$.
4. The Fourier transform is invertible, and $f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ix\xi}\hat{f}(\xi)\,d\xi$ is called the inverse Fourier transform.
For more details of the Fourier transform, check the link mathworld.wolfram.com/FourierTransform.html.

3.2 Model 1a

Our first attempt to capture the mean-reversion and spikes present in electricity prices is by a standard mean-reverting jump-diffusion process. We start with specifying the logarithm of the spot price, using a model adopted from Das and Foresi (1996). The diffusion part is represented by an Ornstein-Uhlenbeck process and the jump component has exponentially distributed absolute value of jump size with the sign of the jump determined by a Bernoulli variable. This is encapsulated by the following SDE:

$$dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega). \tag{3.6}$$

Here, $\kappa$ is the mean reversion rate, $\alpha$ is the long-term mean, $W_t$ is a standard Brownian motion with $dW_t \sim N(0, dt)$ for an infinitesimal time interval $dt$, and $P_t$ is a discontinuous, one-dimensional standard Poisson process with arrival rate $\omega$. During $dt$, $dP_t = 1$ if there is a jump, $dP_t = 0$ otherwise. The jump amplitude $Q_t$ is exponentially distributed with mean $\gamma$ and the sign of the jump $Q_t$ is distributed as a Bernoulli random variable with parameter $\psi$. We assume that the Brownian motion, Poisson process, and random jump amplitude are all Markov and pairwise independent. Specifically, the tuple $\theta = [\kappa, \alpha, \sigma^2, \omega, \psi, \gamma]$ gives the unknown constant parameters.

Notice that this equation fits in the framework outlined in Section 3.1 with $K_0 = \kappa\alpha$, $K_1 = -\kappa$, $H_0 = \sigma^2$, $H_1 = 0$, $l_0 = \omega$, $l_1 = 0$. Thus the CCF of $X_T$ given $X_t$, $\phi(s, \theta, X_T \mid X_t)$, takes the form:

$$\phi(s, \theta, X_T \mid X_t) = E[\exp(isX_T) \mid X_t] = \exp(A(s, t, T, \theta) + B(s, t, T, \theta)X_t), \tag{3.7}$$

where $A(\cdot)$ and $B(\cdot)$ satisfy the following system of complex-valued ordinary differential equations (ODE):

$$\frac{\partial A(s, t, T, \theta)}{\partial t} = -\kappa\alpha B(s, t, T, \theta) - \frac{1}{2}\sigma^2B^2(s, t, T, \theta) - \omega(\varphi(B(s, t, T, \theta)) - 1), \tag{3.8}$$
$$\frac{\partial B(s, t, T, \theta)}{\partial t} = \kappa B(s, t, T, \theta), \tag{3.9}$$
with boundary conditions:

$$A(s, T, T, \theta) = 0, \qquad B(s, T, T, \theta) = is. \tag{3.10}$$

Here, the “jump transform” $\varphi(B(s, t, T, \theta))$ is given by (writing $B$ for $B(s, t, T, \theta)$):

$$\varphi(B) = \psi\int_0^{\infty}\exp(Bz)\frac{1}{\gamma}\exp\left(-\frac{z}{\gamma}\right)dz + (1 - \psi)\int_0^{\infty}\exp(-Bz)\frac{1}{\gamma}\exp\left(-\frac{z}{\gamma}\right)dz = \frac{\psi}{1 - B\gamma} + \frac{1 - \psi}{1 + B\gamma}. \tag{3.11}$$

Now, we solve the system (3.8)-(3.9) for $A(\cdot)$ and $B(\cdot)$ and apply the corresponding boundary conditions to obtain (after some calculation)

$$A(s, t, T, \theta) = i\alpha s(1 - e^{-\kappa(T-t)}) - \frac{\sigma^2s^2}{4\kappa}(1 - e^{-2\kappa(T-t)}) + \frac{i\omega(1 - 2\psi)}{\kappa}\left[\arctan(\gamma se^{-\kappa(T-t)}) - \arctan(\gamma s)\right] + \frac{\omega}{2\kappa}\ln\frac{1 + \gamma^2s^2e^{-2\kappa(T-t)}}{1 + \gamma^2s^2},$$
$$B(s, t, T, \theta) = ise^{-\kappa(T-t)}. \tag{3.12}$$

3.3 Model 1b

In this model, we allow for asymmetric upward and downward jumps (see [10], [11]), each with an exponentially distributed jump magnitude. More specifically, we suppose that the logarithm of the spot price $X_t$ satisfies the following SDE:

$$dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q_t^u\,dP_t^u(\omega_u) + Q_t^d\,dP_t^d(\omega_d). \tag{3.13}$$

Again, $\kappa$ is the mean reversion rate, $\alpha$ is the long-term mean, and $W_t$ is a standard Brownian motion with $dW_t \sim N(0, dt)$. The jump behaviour of $X_t$ is governed by two types of jumps: upward jumps and downward jumps. The upward jumps $Q_t^u$ are exponentially distributed with positive mean $\gamma_u$ and jump arrival rate $\omega_u$. The downward jumps $Q_t^d$ are also exponentially distributed, with negative mean $\gamma_d$ and jump arrival rate $\omega_d$. $P_t^u$ and $P_t^d$ are two independent discontinuous, one-dimensional standard Poisson processes with arrival rates $\omega_u$ and $\omega_d$ respectively.

Notice that this equation also fits in the framework outlined in Section 3.1. Thus the transform analysis can be implemented in this case and the CCF can be written out explicitly for this model:

$$\phi(s, \theta, X_T \mid X_t) = E[\exp(isX_T) \mid X_t] = \exp(A(s, t, T, \theta) + B(s, t, T, \theta)X_t). \tag{3.14}$$

Here $A(\cdot)$ and $B(\cdot)$ satisfy the complex-valued system of ODEs:

$$\frac{\partial A(s, t, T, \theta)}{\partial t} = -\kappa\alpha B(s, t, T, \theta) - \frac{1}{2}\sigma^2B^2(s, t, T, \theta) - \omega_u(\varphi_u(B(s, t, T, \theta)) - 1) - \omega_d(\varphi_d(B(s, t, T, \theta)) - 1), \tag{3.15}$$
$$\frac{\partial B(s, t, T, \theta)}{\partial t} = \kappa B(s, t, T, \theta), \tag{3.16}$$

with boundary conditions:

$$A(s, T, T, \theta) = 0, \qquad B(s, T, T, \theta) = is. \tag{3.17}$$

Here the “jump transform” for the upward jump is given by:

$$\varphi_u(B(s, t, T, \theta)) = \int_0^{\infty}\exp(B(s, t, T, \theta)z)\frac{1}{\gamma_u}\exp\left(-\frac{z}{\gamma_u}\right)dz = \frac{1}{1 - B(s, t, T, \theta)\gamma_u}.$$

Similarly, the “jump transform” for the downward jump is given by:

$$\varphi_d(B(s, t, T, \theta)) = \frac{1}{1 - B(s, t, T, \theta)\gamma_d}. \tag{3.18}$$
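The ODE system for the two-jump model, together with its terminal conditions at time T, can also be integrated numerically. The sketch below does so with a hand-written classical Runge-Kutta (RK4) scheme stepping backward from T to t. It is a hypothetical illustration: the function names, the dictionary of parameters, the step count, and the parameter values are assumptions for the example, not estimates from this thesis.

```python
import cmath


def riccati_rhs(A, B, p):
    """Right-hand sides dA/dt and dB/dt of the two-jump ODE system.

    phi_u and phi_d are the exponential jump transforms 1 / (1 - B * gamma).
    A does not appear on the right-hand side; it is kept for a uniform signature.
    """
    phi_u = 1.0 / (1.0 - B * p["gamma_u"])
    phi_d = 1.0 / (1.0 - B * p["gamma_d"])
    dA = (-p["kappa"] * p["alpha"] * B
          - 0.5 * p["sigma"] ** 2 * B * B
          - p["omega_u"] * (phi_u - 1.0)
          - p["omega_d"] * (phi_d - 1.0))
    dB = p["kappa"] * B
    return dA, dB


def integrate_backward(s, tau, p, n_steps=2000):
    """Start from A = 0, B = i*s at time T and step back to t = T - tau (RK4)."""
    A, B = 0j, 1j * s
    h = -tau / n_steps  # negative step: we move backward in time
    for _ in range(n_steps):
        k1a, k1b = riccati_rhs(A, B, p)
        k2a, k2b = riccati_rhs(A + 0.5 * h * k1a, B + 0.5 * h * k1b, p)
        k3a, k3b = riccati_rhs(A + 0.5 * h * k2a, B + 0.5 * h * k2b, p)
        k4a, k4b = riccati_rhs(A + h * k3a, B + h * k3b, p)
        A += h * (k1a + 2.0 * k2a + 2.0 * k3a + k4a) / 6.0
        B += h * (k1b + 2.0 * k2b + 2.0 * k3b + k4b) / 6.0
    return A, B


def ccf(s, x_t, tau, p):
    """Conditional characteristic function exp(A + B * x_t) for horizon tau = T - t."""
    A, B = integrate_backward(s, tau, p)
    return cmath.exp(A + B * x_t)


if __name__ == "__main__":
    params = {"kappa": 2.0, "alpha": 3.5, "sigma": 0.5,
              "omega_u": 3.0, "gamma_u": 0.4, "omega_d": 2.0, "gamma_d": -0.3}
    print(ccf(1.5, 3.0, 1.0, params))
```

A numerical integrator of this kind is mainly useful as an independent check on closed-form expressions, or for affine specifications where no closed form is available.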
Again, we solve for $A(\cdot)$ and $B(\cdot)$ and apply the corresponding boundary conditions to get (after some calculation)
\[ \begin{aligned}
A(s,t,T,\theta) &= i\alpha s\big(1-e^{-\kappa(T-t)}\big) - \frac{\sigma^2 s^2}{4\kappa}\big(1-e^{-2\kappa(T-t)}\big) \\
&\quad + \frac{\omega_u}{\kappa}\ln\frac{1 - is\gamma_u e^{-\kappa(T-t)}}{1 - is\gamma_u} + \frac{\omega_d}{\kappa}\ln\frac{1 - is\gamma_d e^{-\kappa(T-t)}}{1 - is\gamma_d}, \\
B(s,t,T,\theta) &= is\,e^{-\kappa(T-t)}.
\end{aligned} \tag{3.19} \]

3.4 Model 2a
We impose upon the price process (3.6) a time-varying component in the drift by replacing the long-term mean $\alpha$ with a deterministic function $\alpha(t)$. The logarithm of the electricity spot price is thus defined by
\[ dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega). \tag{3.20} \]
Recall that the seasonal effects on electricity prices are not strongly significant for the specific data we analyze (refer to Figure 2.4). We only incorporate the on-peak and off-peak effects into the price process and consider the following form for $\alpha(t)$:
\[ \alpha(t) = \alpha_1\,\mathrm{peak}_t + \alpha_2\,\mathrm{offpeak}_t, \tag{3.21} \]
where
\[ \mathrm{peak}_t = \begin{cases} 1 & \text{if in on-peak periods,} \\ 0 & \text{otherwise,} \end{cases} \qquad \mathrm{offpeak}_t = \begin{cases} 1 & \text{if in off-peak periods,} \\ 0 & \text{otherwise.} \end{cases} \tag{3.22} \]
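The closed form (3.19) for Model 1b can be evaluated directly. The following is a minimal Python sketch, not the routine used in the thesis (which was written in Matlab); the function name `ccf_model_1b` and the argument `tau` (standing for $T - t$) are illustrative assumptions.

```python
import cmath
import math


def ccf_model_1b(s, x_t, tau, kappa, alpha, sigma,
                 omega_u, gamma_u, omega_d, gamma_d):
    """Evaluate E[exp(i s X_T) | X_t = x_t] from (3.14) and (3.19).

    tau is the horizon T - t (hypothetical name); gamma_d is expected
    to be negative, as in the model description.
    """
    decay = math.exp(-kappa * tau)
    B = 1j * s * decay
    # Diffusion part of A: mean-reversion drift and Brownian variance.
    A = (1j * alpha * s * (1.0 - decay)
         - sigma ** 2 * s ** 2 * (1.0 - decay ** 2) / (4.0 * kappa))
    # Jump part of A: one logarithmic term per jump type.
    for omega, gamma in ((omega_u, gamma_u), (omega_d, gamma_d)):
        ratio = (1 - 1j * s * gamma * decay) / (1 - 1j * s * gamma)
        A += omega / kappa * cmath.log(ratio)
    return cmath.exp(A + B * x_t)
```

A quick sanity check is that the function returns 1 at $s = 0$, and that $-i$ times its derivative at $s = 0$ reproduces the conditional mean $\alpha(1-e^{-\kappa\tau}) + X_t e^{-\kappa\tau} + (\omega_u\gamma_u + \omega_d\gamma_d)(1-e^{-\kappa\tau})/\kappa$ implied by (3.19).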
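This is also an affine process. By the same methods, the CCF of $X_T$ given $X_t$ can be expressed in closed form in this case:
\[ \phi(s,\theta;X_T \mid X_t) = E[\exp(isX_T) \mid X_t] = \exp\big(A(s,t,T,\theta) + B(s,t,T,\theta)X_t\big). \tag{3.23} \]
Here $A(\cdot)$ and $B(\cdot)$ satisfy the complex-valued system of ODEs:
\[ \begin{aligned}
\frac{\partial A(s,t,T,\theta)}{\partial t} &= -\kappa\alpha(t)B(s,t,T,\theta) - \frac{1}{2}\sigma^2 B^2(s,t,T,\theta) - \omega\big(\varphi(B(s,t,T,\theta)) - 1\big), \\
\frac{\partial B(s,t,T,\theta)}{\partial t} &= \kappa B(s,t,T,\theta),
\end{aligned} \tag{3.24} \]
with boundary conditions:
\[ A(s,T,T,\theta) = 0, \qquad B(s,T,T,\theta) = is. \tag{3.25} \]
We can obtain $A(\cdot)$ and $B(\cdot)$ in the same fashion:
\[ \begin{aligned}
A(s,t,T,\theta) &= L(s,t,T,\theta) - \frac{\sigma^2 s^2}{4\kappa}\big(1-e^{-2\kappa(T-t)}\big) \\
&\quad + \frac{i\omega(1-2\psi)}{\kappa}\Big(\arctan\big(\gamma s e^{-\kappa(T-t)}\big) - \arctan(\gamma s)\Big) + \frac{\omega}{2\kappa}\ln\frac{1+\gamma^2 s^2 e^{-2\kappa(T-t)}}{1+\gamma^2 s^2}, \\
B(s,t,T,\theta) &= is\,e^{-\kappa(T-t)}.
\end{aligned} \tag{3.26} \]
Here, let $t = t_0 < t_1 < \cdots < t_N = T$; then we have
\[ L(s,t,T,\theta) = \int_t^T \kappa\,\alpha(u)\,is\,e^{-\kappa(T-u)}\,du = is\sum_{j=1}^{N}\tilde{\alpha}_j\,e^{-\kappa(T-t_j)}\big(1 - e^{-\kappa(t_j - t_{j-1})}\big), \tag{3.27} \]
where
\[ \tilde{\alpha}_j = \begin{cases} \alpha_1 & \text{if } [t_{j-1}, t_j] \text{ is in on-peak periods,} \\ \alpha_2 & \text{otherwise.} \end{cases} \tag{3.28} \]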
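The sum in (3.27) is straightforward to evaluate once the horizon has been split at the on-peak/off-peak switching times. The sketch below is an illustration in Python under that assumption; the function name `L_piecewise` and its argument layout are hypothetical and do not come from the thesis.

```python
import math


def L_piecewise(s, grid, alphas, kappa):
    """Evaluate the sum in (3.27).

    grid   : [t_0, t_1, ..., t_N] with t_0 = t and t_N = T
    alphas : alphas[j - 1] is the mean level (alpha_1 or alpha_2)
             that applies on the interval [t_{j-1}, t_j]
    """
    T = grid[-1]
    total = 0.0
    for j in range(1, len(grid)):
        width = grid[j] - grid[j - 1]
        total += (alphas[j - 1]
                  * math.exp(-kappa * (T - grid[j]))
                  * (1.0 - math.exp(-kappa * width)))
    return 1j * s * total
```

When every entry of `alphas` equals the same constant $\alpha$, the sum telescopes to $i\alpha s(1 - e^{-\kappa(T-t)})$, which is the first term of (3.12); this gives a simple consistency check between Model 1a and Model 2a.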
3.5 Model 2b
We consider the following extension to the model (3.13):
\[ dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q^u_t\,dP_t(\omega_u) + Q^d_t\,dP_t(\omega_d), \tag{3.29} \]
where $\alpha(t)$ is defined the same as in (3.21). Then, similar to the above models, the CCF of $X_T$ given $X_t$ is of the form:
\[ \phi(s,\theta;X_T \mid X_t) = E[\exp(isX_T) \mid X_t] = \exp\big(A(s,t,T,\theta) + B(s,t,T,\theta)X_t\big), \tag{3.30} \]
where
\[ \begin{aligned}
A(s,t,T,\theta) &= L(s,t,T,\theta) - \frac{\sigma^2 s^2}{4\kappa}\big(1-e^{-2\kappa(T-t)}\big) + \frac{\omega_u}{\kappa}\ln\frac{1 - is\gamma_u e^{-\kappa(T-t)}}{1 - is\gamma_u} + \frac{\omega_d}{\kappa}\ln\frac{1 - is\gamma_d e^{-\kappa(T-t)}}{1 - is\gamma_d}, \\
B(s,t,T,\theta) &= is\,e^{-\kappa(T-t)},
\end{aligned} \tag{3.31} \]
with $L(s,t,T,\theta)$ defined in Equation (3.27).

3.6 Model 3a
It has been well documented that jumps alone are inadequate to mimic the level of skewness present in electricity spot prices. Kaminski (1997) and Deng (1998) emphasized the need to incorporate stochastic volatility in the modeling of electricity spot prices (see [40], [10]). Also, according to the study of Mika and Andrew, volatility in electricity prices varies over time and is likely mean-reverting itself (see [41]). So we consider the following two-factor affine process (3.32) to model electricity spot
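prices. Let $X_t$ be the logarithm of the spot price of electricity and $V_t$ be the volatility of the price process which evolves stochastically over time. We have
\[ d\begin{pmatrix} X_t \\ V_t \end{pmatrix} = \begin{pmatrix} \kappa(\alpha - X_t) \\ \kappa_v(\alpha_v - V_t) \end{pmatrix} dt + \begin{pmatrix} \sqrt{1-\rho^2}\sqrt{V_t} & \rho\sqrt{V_t} \\ 0 & \sigma_v\sqrt{V_t} \end{pmatrix} \begin{pmatrix} dW_t \\ dW_v \end{pmatrix} + \begin{pmatrix} Q_t\,dP_t(\omega) \\ 0 \end{pmatrix}. \tag{3.32} \]
Here, $\kappa$ is the mean reversion rate of the log prices, $\alpha$ is the long-term mean of the log prices, $\kappa_v$ is the mean reversion rate of the volatility $V_t$, $\alpha_v$ is the long-term mean of $V_t$, and $\sigma_v$ is the volatility of $V_t$. The two random variables $W_t$ and $W_v$ are two uncorrelated standard Brownian motions. Also, $P_t$ is a discontinuous, one-dimensional standard Poisson process with arrival rate $\omega$; the amplitude of $Q_t$ is exponentially distributed with mean $\gamma$, and the sign of $Q_t$ is distributed as a Bernoulli random variable with parameter $\psi$. This model also fits in the framework outlined in Section 3.1 with
\[ \begin{gathered}
K_0 = \begin{pmatrix} \kappa\alpha \\ \kappa_v\alpha_v \end{pmatrix}, \quad K_1 = \begin{pmatrix} -\kappa & 0 \\ 0 & -\kappa_v \end{pmatrix}, \quad H_0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \\
H_1^{(1)} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad H_1^{(2)} = \begin{pmatrix} 1 & \rho\sigma_v \\ \rho\sigma_v & \sigma_v^2 \end{pmatrix}, \quad l_0 = \omega, \quad l_1 = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\end{gathered} \tag{3.33} \]
Suppose that the jump component in the logarithm of the spot price is defined as in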
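Model 1a and Model 2a. Then the “jump transform” is given by
\[ \varphi\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} = \int_0^\infty \big(\psi\exp(A(\cdot)z) + (1-\psi)\exp(-A(\cdot)z)\big)\,\frac{1}{\gamma}\exp\Big(-\frac{z}{\gamma}\Big)\,dz = \frac{\psi}{1 - A(\cdot)\gamma} + \frac{1-\psi}{1 + A(\cdot)\gamma}. \tag{3.34} \]
Thus the CCF is of the form
\[ \begin{aligned}
\phi(s_x,s_v,\theta;X_T,V_T \mid X_t,V_t) &= E^{\theta}[\exp(is_xX_T + is_vV_T) \mid (X_t,V_t)] \\
&= \exp\big(A(s_x,s_v,t,T,\theta)X_t + B(s_x,s_v,t,T,\theta)V_t + C(s_x,s_v,t,T,\theta)\big),
\end{aligned} \tag{3.35} \]
where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ satisfy the following complex-valued Riccati equations,
\[ \begin{aligned}
\frac{\partial A(\cdot)}{\partial t} &= \kappa A(\cdot), \\
\frac{\partial B(\cdot)}{\partial t} &= \kappa_v B(\cdot) - \frac{1}{2}A(\cdot)\big(A(\cdot) + \rho\sigma_v B(\cdot)\big) - \frac{1}{2}B(\cdot)\big(A(\cdot)\rho\sigma_v + B(\cdot)\sigma_v^2\big), \\
\frac{\partial C(\cdot)}{\partial t} &= -\kappa\alpha A(\cdot) - \kappa_v\alpha_v B(\cdot) - \omega\left(\varphi\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} - 1\right),
\end{aligned} \tag{3.36} \]
with boundary conditions:
\[ A(s_x,s_v,T,T,\theta) = is_x, \qquad B(s_x,s_v,T,T,\theta) = is_v, \qquad C(s_x,s_v,T,T,\theta) = 0. \tag{3.37} \]
We can still solve the first equation for $A(\cdot)$ and apply the corresponding boundary condition to obtain
\[ A(\cdot) = is_x\,e^{-\kappa(T-t)}. \tag{3.38} \]
However, we do not have closed forms for $B(\cdot)$ and $C(\cdot)$ and need to solve for them numerically.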
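One way to carry out the numerical step for system (3.36) is a fixed-step Runge-Kutta scheme in the reversed time variable $u = T - t$. The sketch below is an illustration only: the thesis does not say which solver was used, and the function name `riccati_model_3a`, the classical RK4 scheme and the default step count are assumptions made here.

```python
import cmath
import math


def riccati_model_3a(s_x, s_v, tau, kappa, alpha, kappa_v, alpha_v,
                     sigma_v, rho, omega, psi, gamma, n_steps=400):
    """Integrate (3.36) from u = 0 (t = T) to u = tau (= T - t) with RK4.

    Returns (A, B, C) at horizon tau. In reversed time the right-hand
    sides of (3.36) change sign, and (3.37) becomes the initial condition.
    """
    def A(u):
        # Closed form (3.38).
        return 1j * s_x * math.exp(-kappa * u)

    def rhs(u, B):
        a = A(u)
        phi = psi / (1 - a * gamma) + (1 - psi) / (1 + a * gamma)  # (3.34)
        dB = (-kappa_v * B
              + 0.5 * a * (a + rho * sigma_v * B)
              + 0.5 * B * (a * rho * sigma_v + B * sigma_v ** 2))
        dC = kappa * alpha * a + kappa_v * alpha_v * B + omega * (phi - 1)
        return dB, dC

    h = tau / n_steps
    B, C, u = complex(0.0, s_v), 0j, 0.0
    for _ in range(n_steps):
        k1 = rhs(u, B)
        k2 = rhs(u + h / 2, B + h / 2 * k1[0])
        k3 = rhs(u + h / 2, B + h / 2 * k2[0])
        k4 = rhs(u + h, B + h * k3[0])
        B += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        C += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        u += h
    return A(tau), B, C
```

For $s_x = 0$ the equations for $B$ and $C$ reduce to those of a pure square-root process, whose solution is known in closed form, so that special case can be used to check the integrator before it is embedded in a likelihood or moment evaluation.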
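3.7 Model 3b
We now redefine the model (3.32) above by allowing two types of jumps just as before:
\[ d\begin{pmatrix} X_t \\ V_t \end{pmatrix} = \begin{pmatrix} \kappa(\alpha - X_t) \\ \kappa_v(\alpha_v - V_t) \end{pmatrix} dt + \begin{pmatrix} \sqrt{1-\rho^2}\sqrt{V_t} & \rho\sqrt{V_t} \\ 0 & \sigma_v\sqrt{V_t} \end{pmatrix} \begin{pmatrix} dW_t \\ dW_v \end{pmatrix} + \begin{pmatrix} Q^u_t\,dP_t(\omega_u) + Q^d_t\,dP_t(\omega_d) \\ 0 \end{pmatrix}. \tag{3.39} \]
Then the “jump transform” for the upward jump is given by
\[ \varphi_u\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} = \int_0^\infty \exp(A(\cdot)z)\,\frac{1}{\gamma_u}\exp\Big(-\frac{z}{\gamma_u}\Big)\,dz = \frac{1}{1 - A(\cdot)\gamma_u}. \tag{3.40} \]
Similarly, the “jump transform” for the downward jump is given by
\[ \varphi_d\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} = \frac{1}{1 - A(\cdot)\gamma_d}. \tag{3.41} \]
As this model also fits in the setting of affine processes, one can obtain the CCF in the same fashion:
\[ \begin{aligned}
\phi(s_x,s_v,\theta;X_T,V_T \mid X_t,V_t) &= E^{\theta}[\exp(is_xX_T + is_vV_T) \mid (X_t,V_t)] \\
&= \exp\big(A(s_x,s_v,t,T,\theta)X_t + B(s_x,s_v,t,T,\theta)V_t + C(s_x,s_v,t,T,\theta)\big),
\end{aligned} \tag{3.42} \]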
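where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ satisfy the following complex-valued Riccati equations,
\[ \begin{aligned}
\frac{\partial A(\cdot)}{\partial t} &= \kappa A(\cdot), \\
\frac{\partial B(\cdot)}{\partial t} &= \kappa_v B(\cdot) - \frac{1}{2}A(\cdot)\big(A(\cdot) + \rho\sigma_v B(\cdot)\big) - \frac{1}{2}B(\cdot)\big(A(\cdot)\rho\sigma_v + B(\cdot)\sigma_v^2\big), \\
\frac{\partial C(\cdot)}{\partial t} &= -\kappa\alpha A(\cdot) - \kappa_v\alpha_v B(\cdot) - \omega_u\left(\varphi_u\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} - 1\right) - \omega_d\left(\varphi_d\begin{pmatrix} A(\cdot) \\ B(\cdot) \end{pmatrix} - 1\right),
\end{aligned} \tag{3.43} \]
with boundary conditions:
\[ A(s_x,s_v,T,T,\theta) = is_x, \qquad B(s_x,s_v,T,T,\theta) = is_v, \qquad C(s_x,s_v,T,T,\theta) = 0. \tag{3.44} \]
Again, we obtain $A(\cdot) = is_x\,e^{-\kappa(T-t)}$ and need to solve for $B(\cdot)$ and $C(\cdot)$ numerically.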
Chapter 4 Parameter Estimation

4.1 Introduction
Affine processes are flexible enough to allow us to capture the special characteristics of electricity prices such as mean-reversion, seasonality, and “spikes” (see [10]). Moreover, under suitable regularity conditions, one can explore the information from the CCF of discretely sampled observations to develop computationally tractable and asymptotically efficient¹ estimators of the parameters of affine processes (see [5]).

According to Singleton (2001), the CCF is unique and contains the same information as the conditional density function through the Fourier transform. We can use it to recover the conditional density function via the Fourier transform and implement a usual ML estimation. This is the approach of ML-CCF estimation, for which we will give more details later on. If the $N$-dimensional state variables are all observable, ML-CCF estimation can be implemented and the ML-CCF estimators so obtained are asymptotically efficient (see [12]). But the estimation can be costly in higher dimensions ($N \ge 2$) because we need to compute the multivariate Fourier inversions repeatedly and accurately in order to maximize the likelihood function. Moreover, considerable computational saving can be achieved by using limited-information ML-CCF (LML-CCF) estimation (see [12]).

Suppose $\{X_t, t = 1, 2, \cdots\}$ is a set of discretely sampled observations of an $N$-dimensional state variable with a joint CCF $\phi(s,\theta;X_{t+1} \mid X_t)$. Let $\eta_j$ denote an $N$-dimensional selection vector where the $j$th entry is 1 and zeros elsewhere. Define $X^j_{t+1} := \eta_j \cdot X_{t+1}$; then the conditional density of $X^j_{t+1}$ conditioned on $X_t$ is the inverse Fourier transform of $\phi(\xi\eta_j,\theta;X_{t+1} \mid X_t)$ with some scalar $\xi$:²
\[ f_j(X^j_{t+1},\theta \mid X_t) = \frac{1}{2\pi}\int_{\mathbb{R}} \phi(\xi\eta_j,\theta;X_{t+1} \mid X_t)\,e^{-i\xi\eta_j \cdot X_{t+1}}\,d\xi. \tag{4.1} \]
The basic idea behind this is to exploit the information in $f_j(X^j_{t+1},\theta \mid X_t)$ instead of the information in the joint conditional density function,
\[ f(X_{t+1},\theta \mid X_t) = \frac{1}{(2\pi)^N}\int_{\mathbb{R}^N} \phi(s,\theta;X_{t+1} \mid X_t)\,e^{-is \cdot X_{t+1}}\,ds. \tag{4.2} \]
Thus the estimation involves at most $N$ one-dimensional integrations instead of an $N$-dimensional integration. The estimators obtained are called LML-CCF estimators. Although the LML-CCF estimators do not exploit any information about the joint conditional density function, they are typically more efficient than the quasi-maximum likelihood (QML) estimators for affine diffusions (see [12]).³

But for those multi-factor models with unobservable (latent) state variables such as stochastic volatility models, the ML-CCF or LML-CCF estimators cannot be obtained. There are several recent papers that discuss the methodologies related to CCF-based estimators of stochastic volatility models. Singleton (1999) (see [12]) proposed a Simulated Method of Moments (SMM-CCF) estimator. Jiang and Knight (1999) (see [43]) explored the Method of System of Moments (MSM) estimators.

1 The ratio of the Rao-Cramér lower bound to the actual variance of any unbiased estimation of a parameter is called the efficiency of that statistic. If the efficiency tends to 1 as the number of observations increases, the estimator is said to be asymptotically efficient. For more details about Rao-Cramér bounds and efficiency, see [42].
2 Refer back to (3.5) in Chapter 3.
3 QML acts as if the data were generated by a density function that provides an estimator that is easy to obtain. This method only makes assumptions about the mean and the second moment. For further details, see Bollerslev and Wooldridge (1992) and Newey and Steigerwald (1997).
Chacko and Viceira (1999) (see [13]) considered the so-called Spectral Generalized Method of Moments (SGMM) estimators. To deal with stochastic volatility models, they derived the stationary (unconditional) characteristic function⁴ from the CCF of the volatility, and utilized this CCF to obtain a so-called marginal CCF.

In this thesis, we adopt the idea of SGMM because this methodology is more computationally tractable than the others (see [12]). We apply an ML-type estimation based on the so-called marginal CCF (ML-MCCF) to estimate stochastic volatility models. Furthermore, we also introduce SGMM estimators based on the so-called marginal CCF to estimate stochastic volatility models.

4.2 ML-CCF Estimators
ML estimation is the most common method of estimating the parameters of stochastic processes if the probability density has an analytical form. It provides a consistent⁵ approach to parameter estimation problems, and ML estimators become minimum variance unbiased estimators as the sample size increases. By unbiased, we mean that the mathematical expectation of the estimator of a parameter is equal to the parameter. By minimum variance, we mean that the estimator has the smallest variance among all unbiased estimators of the parameter (see [42]).

Suppose that $X$ is an $N$-dimensional continuous random variable with probability density function $f(X,\theta)$, where $\theta = \{\theta_1, \ldots, \theta_k\}$ are $k$ unknown constant parameters which need to be estimated. Given a sequence of observations $\{X_t\}$ sampled at

4 It turns out that as $t \to \infty$, there exists a limiting distribution for $X_t$. This limiting distribution is called the stationary distribution and its Fourier transform is called the stationary characteristic function (see [44]).
5 Any statistic that converges in probability to a parameter is called a consistent estimator of that parameter.
$t = 1, 2, \ldots, n$, the log likelihood function at the sample is given by:
\[ L(X_1,\ldots,X_n,\theta) = \sum_{t=1}^{n} \ln f(X_t,\theta). \tag{4.3} \]
The maximum likelihood based estimators of $\theta$ are obtained by maximizing $L(\cdot)$,
\[ \hat{\theta}_{ml} = \operatorname*{argmax}_{\theta} L(X_1,\ldots,X_n,\theta) = \operatorname*{argmax}_{\theta} \sum_{t=1}^{n} \ln(f(X_t,\theta)). \tag{4.4} \]
Now, for the models we adopt, the CCF, $\phi(s,\theta;X_{t+1} \mid X_t)$, of the sample is known, often in closed form, as an exponential of an affine function of $X_t$. Thus the conditional density function of $X_{t+1}$ given $X_t$ can be obtained by the Fourier transform of the CCF:
\[ f(X_{t+1},\theta \mid X_t) = \frac{1}{(2\pi)^N}\int_{\mathbb{R}^N} \phi(s,\theta;X_{t+1} \mid X_t)\,e^{-is \cdot X_{t+1}}\,ds. \tag{4.5} \]
One can use the standard ML estimation based on this conditional density function to obtain ML-CCF estimators of the sample as:
\[ \hat{\theta}_{CCF} = \operatorname*{argmax}_{\theta} \sum_{t=1}^{n-1} \ln(f(X_{t+1},\theta \mid X_t)). \tag{4.6} \]
Take Model 1a (3.6) as an example. The conditional density function of $X_{t+1}$ given $X_t$ of the sample is of the form:
\[ f(X_{t+1},\theta \mid X_t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \phi(s,\theta;X_{t+1} \mid X_t)\,e^{-isX_{t+1}}\,ds = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-isY_t}\,h(\theta,s)\,ds, \tag{4.7} \]
where
\[ Y_t = (X_{t+1} - \alpha) - e^{-\kappa}(X_t - \alpha), \tag{4.8} \]
and
\[ h(\theta,s) = \exp\left(-\frac{\sigma^2 s^2}{4\kappa}\big(1-e^{-2\kappa}\big) + \frac{i\omega(1-2\psi)}{\kappa}\big(\arctan(\gamma s e^{-\kappa}) - \arctan(\gamma s)\big) + \frac{\omega}{2\kappa}\ln\frac{1+\gamma^2 s^2 e^{-2\kappa}}{1+\gamma^2 s^2}\right). \tag{4.9} \]
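To make (4.7) to (4.9) concrete, the inversion integral can be approximated by truncating it to a finite interval and applying the trapezoidal rule. The sketch below is an illustration in plain Python rather than the Matlab routine used for the thesis; the function names, the truncation bound `R` and the number of subintervals `M` are assumptions chosen for this example.

```python
import cmath
import math


def h_model_1a(s, kappa, sigma, omega, psi, gamma):
    """The function h(theta, s) of (4.9) for a one-period step."""
    d = math.exp(-kappa)
    g2s2 = (gamma * s) ** 2
    exponent = (-sigma ** 2 * s ** 2 * (1 - d ** 2) / (4 * kappa)
                + 1j * omega * (1 - 2 * psi) / kappa
                * (math.atan(gamma * s * d) - math.atan(gamma * s))
                + omega / (2 * kappa)
                * math.log((1 + g2s2 * d ** 2) / (1 + g2s2)))
    return cmath.exp(exponent)


def density_model_1a(y, kappa, sigma, omega, psi, gamma, R=45.0, M=1500):
    """Approximate (4.7) at Y_t = y by the trapezoidal rule on [-R, R].

    R must be large enough that h is negligible outside [-R, R].
    """
    ds = R / M
    total = 0j
    for n in range(-M, M + 1):
        weight = 0.5 if abs(n) == M else 1.0  # halve the two end points
        s = n * ds
        total += (weight * cmath.exp(-1j * s * y)
                  * h_model_1a(s, kappa, sigma, omega, psi, gamma))
    return (total * ds).real / (2 * math.pi)
```

With $\omega = 0$ the jumps disappear and the result should agree with a normal density of mean zero and variance $\sigma^2(1 - e^{-2\kappa})/(2\kappa)$ in $Y_t$, which is a convenient check on the choice of `R` and `M`. The log likelihood in (4.6) is then the sum of the logarithms of these densities over the sample.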
To assist in computing this integral (4.7) we define
\[ F(Y_t,\theta) := f(X_{t+1},\theta \mid X_t) = \frac{1}{2\pi}\lim_{R\to\infty}\int_{-R}^{R} e^{-isY_t}\,h(\theta,s)\,ds. \tag{4.10} \]
Notice that $h(\theta,s)$ is continuous in $s$ and $|h(\theta,s)| \le \exp\big(-\frac{\sigma^2 s^2}{4\kappa}(1-e^{-2\kappa})\big)$. Thus one can truncate the integral to a finite interval $[-R, R]$ outside of which the function $h(\theta,s)$ to be integrated is negligibly small. Then, for this choice of $R$,
\[ F(Y_t,\theta) \approx \frac{1}{2\pi}\int_{-R}^{R} e^{-isY_t}\,h(\theta,s)\,ds, \tag{4.11} \]
and the integral in Equation (4.7) can be estimated on a suitable grid of $s$ values by a fast Fourier transform (FFT) algorithm. Also, one can discretize $Y_t$ into $M$ subintervals such that:⁶
\[ s_n = n\Delta s = n\Big(\frac{R}{M}\Big), \qquad Y_k = k\Delta Y_t = k\Big(\frac{Y_t}{M}\Big); \]
then we have
\[ F(Y_k,\theta) \approx \frac{1}{2\pi}\,\frac{R}{M}\,{\sum_{n=-M}^{M}}'' \Big(e^{-ink\frac{RY_t}{M^2}}\,h\big(\theta,\tfrac{nR}{M}\big)\Big). \tag{4.12} \]
If we arrange $\frac{RY_t}{M} = 2\pi$, then we have
\[ F(Y_k,\theta) \approx \frac{1}{2\pi}\,\frac{R}{M}\,{\sum_{n=-M}^{M}}'' \Big(e^{-ink\frac{2\pi}{M}}\,h\big(\theta,\tfrac{nR}{M}\big)\Big). \tag{4.13} \]
One can approximate $F(Y_t,\theta)$ by the discrete Fourier transform (DFT) of $h(\theta,\frac{nR}{M})$.

4.3 ML-MCCF Estimators
ML-CCF estimators are asymptotically efficient if all of the state variables are observable. But for those multi-factor models with unobservable state variables such as

6 Here we use the compound trapezoidal rule to approximate the integral, and ${\sum_i}''(A_i)$ denotes the sum of the $A_i$ with the first and last terms halved.
Model 3a and Model 3b, ML-CCF estimators cannot be obtained directly. If option prices are available, implied volatilities can be calculated from option prices observed in the market. Various numerical methods have been proposed for estimating implied volatility functions from option prices (see [45], [46], [47]). Then one can use those values as the data of volatilities and implement ML-CCF estimation. But in our case, option prices are not available.

Following Chacko and Viceira (1999) we can integrate the unobservable variable (volatility) from the joint CCF of the log price and the volatility, and set $s_v = 0$ to get the so-called marginal CCF. Based on this marginal CCF, we can implement an ML-based estimation. This is somewhat similar to LML-CCF or Singleton’s SMM method. But LML-CCF estimation not only keeps $s_v = 0$ but also needs to utilize the volatility information (not workable in our case). Singleton’s SMM method integrates out the unobservable variable in the CCF by simulation. This requires a huge number of simulated paths of the volatility and can be quite time-consuming. Meanwhile, this induces an estimation bias due to the discretization used in the simulation (see [13]). Furthermore, compared to the SGMM method that we will introduce later on, ML-MCCF estimation avoids the so-called ad hoc moment conditions selection problem and is easier to implement in the case of stochastic volatility models.

Take Model 3a as an example. Recall that the volatility follows a square-root process such as
\[ dV_t = \kappa_v(\alpha_v - V_t)\,dt + \sigma_v\sqrt{V_t}\,dW_v. \tag{4.14} \]
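The infinitesimal generator of the square-root process is:⁷
\[ Lf(v) = \frac{\sigma_v^2 v}{2}\,\frac{\partial^2 f}{\partial v^2} + \kappa_v(\alpha_v - v)\,\frac{\partial f}{\partial v}. \tag{4.15} \]
Let $\mu_t$ be the distribution function of $V_t$; then it solves the forward Kolmogorov equation (4.16):⁸
\[ \mu_t(Lf) = \frac{d}{dt}\mu_t(f), \qquad \text{with } \mu_t(f) := \int f(v)\,d\mu_t. \tag{4.16} \]
In particular, let $\hat{\mu}$ be the stationary characteristic function of the volatility. In this case, with $f(v) = e^{iuv}$ and $\hat{\mu}(u) := \mu(e^{iuv})$, where $\mu$ denotes the stationary distribution, we have
\[ Le^{iuv} = -\frac{\sigma_v^2 v}{2}u^2 e^{iuv} + i\kappa_v(\alpha_v - v)u\,e^{iuv} = \Big(iv\Big(\frac{i\sigma_v^2 u^2}{2} - \kappa_v u\Big) + i\kappa_v\alpha_v u\Big)e^{iuv}, \tag{4.17} \]
and
\[ \mu(Le^{iuv}) = \Big(\frac{i\sigma_v^2 u^2}{2} - \kappa_v u\Big)\int iv\,e^{iuv}\,d\mu + i\kappa_v\alpha_v u\int e^{iuv}\,d\mu = \Big(\frac{i\sigma_v^2 u^2}{2} - \kappa_v u\Big)\frac{d\hat{\mu}(u)}{du} + i\kappa_v\alpha_v u\,\hat{\mu}(u). \tag{4.18} \]
Because $\frac{d\mu(\cdot)}{dt} = 0$, we have
\[ \Big(\frac{i\sigma_v^2 u}{2} - \kappa_v\Big)\frac{d\hat{\mu}(u)}{du} + i\kappa_v\alpha_v\,\hat{\mu}(u) = 0, \tag{4.19} \]
with $\hat{\mu}(0) = 1$. Then the solution for (4.19) has the form:
\[ \hat{\mu}(u) = \Big(1 - \frac{iu\sigma_v^2}{2\kappa_v}\Big)^{-2\kappa_v\alpha_v/\sigma_v^2}. \tag{4.20} \]
Recall that the joint CCF of the log price and volatility in this model is defined as
\[ \phi(s_x,s_v,\theta;X_T,V_T \mid X_t,V_t) = \exp\big(A(s_x,s_v,t,T,\theta)X_t + B(s_x,s_v,t,T,\theta)V_t + C(s_x,s_v,t,T,\theta)\big), \tag{4.21} \]

7 Taken from Rama Cont [48].
8 For more details about the forward Kolmogorov equation, see Paul Wilmott [19].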
where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ are the solutions of system (3.36). As the stochastic volatility $V_t$ is unobservable, we cannot estimate the parameters of stochastic models directly from the joint CCF of the log price and volatility. Let us define the marginal CCF as
\[ \begin{aligned}
\phi(s_x,0,\theta;X_T \mid X_t) &= \int_0^\infty \phi(s_x,0,\theta;X_T,V_T \mid X_t,V_t)\,d\mu \\
&= e^{A(s_x,0,t,T,\theta)X_t + C(s_x,0,t,T,\theta)}\int_0^\infty e^{B(s_x,0,t,T,\theta)V_t}\,d\mu \\
&= e^{A(s_x,0,t,T,\theta)X_t + C(s_x,0,t,T,\theta)}\,\hat{\mu}\big(-iB(s_x,0,t,T,\theta)\big).
\end{aligned} \tag{4.22} \]
Applying Equation (4.20), we obtain the marginal CCF of the form
\[ \phi(s_x,0,\theta;X_T \mid X_t) = e^{A(s_x,0,t,T,\theta)X_t + C(s_x,0,t,T,\theta)}\Big(1 - \frac{B(s_x,0,t,T,\theta)\,\sigma_v^2}{2\kappa_v}\Big)^{-2\kappa_v\alpha_v/\sigma_v^2}. \tag{4.23} \]
Through the Fourier transform, the marginal conditional density function is given by
\[ f(X_{t+1},\theta \mid X_t) = \frac{1}{2\pi}\int_{\mathbb{R}} \phi(s_x,0,\theta;X_{t+1} \mid X_t)\,e^{-is_xX_{t+1}}\,ds_x. \tag{4.24} \]
Then, given a sample $\{X_t, t = 1, \cdots, n\}$, one can implement the maximum likelihood estimation based on this marginal distribution of the observed variables (electricity prices), and obtain ML-MCCF estimators as
\[ \hat{\theta}_{MCCF} = \operatorname*{argmax}_{\theta} \sum_{t=1}^{n-1} \ln(f(X_{t+1},\theta \mid X_t)). \tag{4.25} \]
Notice that since we only rely on the level of the electricity prices in the previous period, we lose efficiency. Moreover, as pointed out by Chacko and Viceira (1999), the point estimates (including the SGMM estimators we will discuss later on) are biased and inconsistent (see [13]). What’s more, the theoretical value for the
bias is hard to calculate as we do not have closed forms for $B(\cdot)$ and $C(\cdot)$. Following Chacko and Viceira (1999), we try to correct the bias by a bootstrap method. Specifically, we simulate 500 paths with a given parameter $\theta_0$. For each path, there are 19,704 hourly observations (same length as the actual data). The estimates $\hat{\theta}_i$, $i = 1, \cdots, n$, obtained from the simulated paths, result in a distribution for each parameter. We will regard the difference between the mean of those estimates and the given parameter as the bias, i.e.,
\[ \text{bias} = \theta_0 - \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_i, \tag{4.26} \]
with $n = 500$ in our setting. We will try adjusting the estimates from the actual data by this bias in the next chapter.

4.4 Spectral GMM Estimators
In this section, we will describe the SGMM estimators constructed by Chacko and Viceira (1999). This method is essentially GMM in a complex setting. GMM estimation is one of the most fundamental estimation methods in statistics and econometrics, especially after Hansen’s influential paper [49] appeared in 1982. Unlike ML estimation, which requires the complete specification of the model and its probability distribution, full knowledge of the specification and strong distributional assumptions are not required for GMM estimation. GMM estimators are best suited to study models that are only partially specified, and they are attractive alternatives to likelihood-type estimators. We start with some basics of GMM, and define the moment conditions and moment functions first.⁹

Definition 4.1 Suppose that we have a set of random variables $\{x_t, t = 1, 2, \ldots\}$. Let $\theta = \{\theta_1, \ldots, \theta_k\}$ be an unknown $k$-tuple with true value $\theta_0$ to be estimated, $\theta$ in some parameter space $\Theta$. Then the $q$-dimensional vector of functions $m(x_t,\theta)$ is called an (unconditional) moment function if the following moment conditions hold:
\[ E[m(x_t,\theta_0)] = 0. \tag{4.27} \]
Notice that $\theta$ is a $k$-tuple vector and $E[m(x_t,\theta_0)] = 0$ consists of $q$ equations. If we have fewer moment conditions than unknowns ($q < k$), then we cannot identify $\theta$. In this case, we can “create” more moment conditions by the so-called weighting functions (often termed “instruments” in the GMM literature, see [50]). If one has as many moment conditions as parameters to be estimated ($q = k$), one can simply solve the $k$ equations in $k$ unknowns to obtain the estimates. If we have more functions than unknowns ($q > k$), then this is an over-identified problem. Such cases of over-identification can easily arise and the moment estimator is not well-defined. Different choices of moment conditions may lead to different estimates. GMM is a method to solve this kind of over-identification problem.

Let us take the standard linear regression model as an example and consider
\[ y = x'\theta_0 + \varepsilon. \tag{4.28} \]
Here $y$ is the response variable, $x = [x_1, x_2, \ldots, x_k]'$ is a $k$-dimensional vector of regressors, $x'$ (as before) is its transpose, and $\theta = [\theta_1, \ldots, \theta_k]'$ is the unknown vector

9 For more details see [50].
of parameters with true value $\theta_0$. We assume that $\varepsilon$ has zero expectation and is uncorrelated with $x$. Using the Law of Iterated Expectations¹⁰ we find that
\[ E[x\varepsilon] = E[E[x\varepsilon \mid x]] = E[xE[\varepsilon \mid x]] = 0. \tag{4.29} \]
Therefore, we can have the moment functions $m((x,y),\theta) = x(y - x'\theta)$. These moment functions are well defined, since, by the assumptions, $E[m((x,y),\theta_0)] = E[x(y - x'\theta_0)] = E[x\varepsilon] = 0$.

Suppose $n > k$ observations on the response variable are available, say $y_1, y_2, \cdots, y_n$. Along with each observed response $y_t$, we have a $k$-dimensional observation vector of regressors $x_t$. We have exactly as many moment conditions as parameters to be estimated, since $x_t$ is a $k$-dimensional vector. If we assume that the strong law of large numbers holds then we have
\[ \frac{1}{n}\sum_{t=1}^{n} m((x_t,y_t),\hat{\theta}_n) \to E[m((x,y),\theta_0)] = 0, \quad \text{almost surely.} \tag{4.30} \]
So the Method of Moments (MM) estimator for this model is just the solution of
\[ \frac{1}{n}\sum_{t=1}^{n} x_t(y_t - x_t'\hat{\theta}_n) = 0, \tag{4.31} \]
which gives
\[ \hat{\theta}_n = \Big(\sum_{t=1}^{n} x_tx_t'\Big)^{-1}\sum_{t=1}^{n} x_ty_t = (X'X)^{-1}X'y \tag{4.32} \]
with $X = [x_1, \cdots, x_n]'$ and $y = [y_1, \cdots, y_n]'$. Thus, the ordinary least squares (OLS) estimator is an MM estimator.

Notice that we specified relatively little information about the error term $\varepsilon$. For ML estimation we would be required to give the distribution of the error term $\varepsilon$, as well as the autocorrelation and heteroskedasticity,¹¹ which are also not required in

10 For more details see [42].
11 Heteroskedasticity means that the variance of the errors is not constant across observations.
formulating the moment conditions.

Now, instead of assuming that the error term has zero expectation on certain observed variables, we can specify the moment conditions directly by requiring the error term to be uncorrelated with certain observed “instruments”. Let us consider the previous model again. This time we do not assume the error term has zero expectation, but that it is still uncorrelated to the regressors. Suppose we have a $q$-dimensional observed instrument $z$ ($q \ge k$) and
\[ E[z\varepsilon] = 0. \tag{4.33} \]
If $q = k$, then this is also a well-defined problem. Let $z_t$ denote the $k$-dimensional observation vector of instruments corresponding to $y_t$. Thus we have the moment conditions
\[ E[z\varepsilon] = E[z(y - x'\theta_0)] = 0, \tag{4.34} \]
and the moment functions $m((x,y,z),\theta) = z(y - x'\theta)$. We assume that the strong law of large numbers holds so that we have
\[ \frac{1}{n}\sum_{t=1}^{n} m((x_t,y_t,z_t),\hat{\theta}_n) \to E[m((x,y,z),\theta_0)] = 0, \quad \text{almost surely.} \tag{4.35} \]
Therefore we solve
\[ \frac{1}{n}\sum_{t=1}^{n} z_t(y_t - x_t'\hat{\theta}_n) = 0, \tag{4.36} \]
which gives
\[ \hat{\theta}_n = \Big(\sum_{t=1}^{n} z_tx_t'\Big)^{-1}\sum_{t=1}^{n} z_ty_t = (Z'X)^{-1}Z'y, \tag{4.37} \]
with $Z = [z_1, \cdots, z_n]'$.
Definition 4.2¹² Suppose we have an observed sample $\{x_t, t = 1, 2, \ldots, n\}$ from a stochastic process $x$, and suppose that we have $q > k$ moment functions this time. Assume that the moment conditions $E[m(x_t,\theta_0)] = 0$ hold. One can define the criterion function as
\[ Q_n(\theta) = \Big(\frac{1}{n}\sum_{t=1}^{n} m(x_t,\theta)\Big)' W_n \Big(\frac{1}{n}\sum_{t=1}^{n} m(x_t,\theta)\Big), \tag{4.38} \]
where $W_n$ is a $q \times q$ symmetric positive definite matrix. The GMM estimator $\hat{\theta}_n$ of $\theta$ associated with $W_n$ is the solution to the problem:
\[ \hat{\theta}_n = \operatorname*{argmin}_{\theta} Q_n(\theta) = \operatorname*{argmin}_{\theta} \Big(\frac{1}{n}\sum_{t=1}^{n} m(x_t,\theta)\Big)' W_n \Big(\frac{1}{n}\sum_{t=1}^{n} m(x_t,\theta)\Big). \tag{4.39} \]
Consider the linear regression model with instruments again. Suppose we choose
\[ W_n = \Big(\frac{1}{n}\sum_{t=1}^{n} z_tz_t'\Big)^{-1} = n(Z'Z)^{-1}. \tag{4.40} \]
Then the criterion function is
\[ Q_n(\theta) = n^{-1}(Z'y - Z'X\theta)'(Z'Z)^{-1}(Z'y - Z'X\theta). \tag{4.41} \]
Differentiating with respect to $\theta$ gives the first order conditions
\[ \frac{\partial Q_n(\theta)}{\partial\theta}\Big|_{\theta=\hat{\theta}_n} = -2n^{-1}X'Z(Z'Z)^{-1}(Z'y - Z'X\hat{\theta}_n) = 0, \tag{4.42} \]
and solving for $\hat{\theta}_n$ gives
\[ \hat{\theta}_n = (X'Z(Z'Z)^{-1}Z'X)^{-1}X'Z(Z'Z)^{-1}Z'y. \tag{4.43} \]

12 Taken from [50].
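For a small over-identified case, the estimator (4.43) and the criterion (4.41) can be written out by hand. The sketch below assumes one regressor ($k = 1$) and two instruments ($q = 2$) so that only a 2 by 2 inverse is needed; the function name `gmm_linear_iv` and the data layout are illustrative choices, not part of the thesis.

```python
def gmm_linear_iv(x, y, z1, z2):
    """GMM estimate (4.43) for y = x * theta + eps with instruments z1, z2.

    All arguments are equal-length lists of floats. Returns the estimate
    and a function evaluating the criterion Q_n(theta) of (4.41).
    """
    n = len(y)
    dot = lambda a, b: sum(u * v for u, v in zip(a, b))
    zx = [dot(z1, x), dot(z2, x)]   # Z'X, a 2-vector because k = 1
    zy = [dot(z1, y), dot(z2, y)]   # Z'y
    s11, s12, s22 = dot(z1, z1), dot(z1, z2), dot(z2, z2)
    det = s11 * s22 - s12 * s12     # determinant of Z'Z
    inv = [[s22 / det, -s12 / det], [-s12 / det, s11 / det]]

    def quad(u, v):
        # u' (Z'Z)^{-1} v
        return sum(u[i] * inv[i][j] * v[j]
                   for i in range(2) for j in range(2))

    theta = quad(zx, zy) / quad(zx, zx)

    def criterion(t):
        r = [zy[0] - zx[0] * t, zy[1] - zx[1] * t]
        return quad(r, r) / n

    return theta, criterion
```

With this weighting matrix the estimator coincides with the familiar two-stage least squares estimator; the returned `criterion` can be used to confirm numerically that the closed-form estimate sits at the minimum of $Q_n(\theta)$.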
This is the standard instrumental variable estimator for the case where there are more instruments than regressors.

Now, the definition of the CCF of the sample implies that
\[ E[\exp(is \cdot X_T) - \phi(s,\theta;X_T \mid X_t)] = 0, \qquad s \in \mathbb{R}^N. \tag{4.44} \]
By taking real and imaginary parts of this function, we get the following pair of moment conditions:
\[ \begin{aligned}
E[\mathrm{Re}(\exp(is \cdot X_T) - \phi(s,\theta;X_T \mid X_t))] &= 0, \\
E[\mathrm{Im}(\exp(is \cdot X_T) - \phi(s,\theta;X_T \mid X_t))] &= 0.
\end{aligned} \tag{4.45} \]
Thus one can define a set of moment functions $m(s,\theta;X_T,X_t)$ as follows:¹³
\[ \begin{aligned}
m(s,\theta;X_T,X_t) &= \varepsilon_t(s,\theta;X_T,X_t) = \begin{pmatrix} \varepsilon^{\mathrm{Re}}_t(s,\theta;X_T,X_t) \\ \varepsilon^{\mathrm{Im}}_t(s,\theta;X_T,X_t) \end{pmatrix}, \\
\varepsilon^{\mathrm{Re}}_t(s,\theta;X_T,X_t) &:= \mathrm{Re}(\varepsilon_t(s,\theta;X_T,X_t)) = \mathrm{Re}(\exp(is \cdot X_T) - \phi(s,\theta;X_T \mid X_t)), \\
\varepsilon^{\mathrm{Im}}_t(s,\theta;X_T,X_t) &:= \mathrm{Im}(\varepsilon_t(s,\theta;X_T,X_t)) = \mathrm{Im}(\exp(is \cdot X_T) - \phi(s,\theta;X_T \mid X_t)).
\end{aligned} \tag{4.46} \]

13 Re(A) denotes the real part of A, while Im(A) denotes the imaginary part of A.
More generally, we can add a set of “instruments” or “weighting functions” to obtain more moment restrictions. So one can define the moment function based on the CCF as¹⁴
\[ m(s,\theta;X_T,X_t) = \varepsilon_t(s,\theta;X_T,X_t) \otimes p(X_t), \tag{4.47} \]
where $p(X_t)$ are “instruments” independent of $\varepsilon_t(s,\theta;X_T,X_t)$. The SGMM estimator is of the form:
\[ \hat{\theta}_{SGMM} = \operatorname*{argmin}_{\theta} \Big(\frac{1}{n}\sum_{t=1}^{n} m(s,\theta;X_T,X_t)\Big)' W_n \Big(\frac{1}{n}\sum_{t=1}^{n} m(s,\theta;X_T,X_t)\Big), \qquad \theta \in \Theta. \tag{4.48} \]
Just as for other GMM estimators, the asymptotic variance of the SGMM estimator is minimized with the optimal weighting matrix $W_n = S^{-1}$, where $S$ is the covariance matrix of the moment functions (see [13]). Under the usual regularity conditions,¹⁵ according to Chacko and Viceira (1999), the SGMM estimator, $\hat{\theta}_{SGMM}$, inherits the optimality properties of GMM estimators such as consistency and asymptotic normality (see [49], [13]).

We now can apply SGMM to estimate stochastic volatility models like Model 3a

14 Let $A$, $B$ be $K \times L$, $M \times N$ matrices with elements indexed as $A[k,l]$, $k = 0, 1, \cdots, K-1$, $l = 0, 1, \cdots, L-1$, and $B[m,n]$, $m = 0, 1, \cdots, M-1$, $n = 0, 1, \cdots, N-1$. We define the Kronecker product $A \otimes B$ to be the $KM \times LN$ matrix
\[ A \otimes B := \begin{pmatrix} A[0,0]B & A[0,1]B & \cdots & A[0,L-1]B \\ A[1,0]B & A[1,1]B & \cdots & A[1,L-1]B \\ \vdots & \vdots & & \vdots \\ A[K-1,0]B & A[K-1,1]B & \cdots & A[K-1,L-1]B \end{pmatrix} \]
with elements $(A \otimes B)[m + kM, n + lN] := A[k,l]\,B[m,n]$.
15 We assume that Hansen’s regularity conditions are satisfied. For more details see [49].
and Model 3b. Recall that the marginal CCF of the sample is given by:
\[ \phi(s_x,0,\theta;X_{t+1} \mid X_t) = e^{A(s_x,0,t,T,\theta)X_t + C(s_x,0,t,T,\theta)}\Big(1 - \frac{B(s_x,0,t,T,\theta)\,\sigma_v^2}{2\kappa_v}\Big)^{-2\kappa_v\alpha_v/\sigma_v^2}, \tag{4.49} \]
where $A(\cdot)$, $B(\cdot)$ and $C(\cdot)$ are the solutions of system (3.36) and (3.43) for Model 3a and Model 3b respectively. Also, one can compute the $n$th conditional moment by simple substitution of $s_x = n$ into Equation (4.49). Given a sample $\{X_t, t = 1, \cdots, T\}$, we have moment functions as follows:
\[ \begin{aligned}
m(s_x,X_t,\theta) &= \varepsilon_t(s_x,X_t,\theta) = \begin{pmatrix} \varepsilon^{\mathrm{Re}}_t(s_x,X_t,\theta) \\ \varepsilon^{\mathrm{Im}}_t(s_x,X_t,\theta) \end{pmatrix}, \\
\varepsilon^{\mathrm{Re}}_t(s_x,X_t,\theta) &= \big(\cos(s_xX_{t+1}) - \mathrm{Re}(\phi(s_x,0,\theta;X_{t+1} \mid X_t))\big) \otimes p(X_t), \\
\varepsilon^{\mathrm{Im}}_t(s_x,X_t,\theta) &= \big(\sin(s_xX_{t+1}) - \mathrm{Im}(\phi(s_x,0,\theta;X_{t+1} \mid X_t))\big) \otimes p(X_t),
\end{aligned} \tag{4.50} \]
with $\phi(s_x,0,\theta;X_{t+1} \mid X_t)$ defined as in Equation (4.49).

Again, as we only explored the information in the marginal CCF, the estimates we obtain are biased and inconsistent (see [50]). Also, the selection of an appropriate set of $s_x$ is crucial for the efficiency of GMM estimation. A poor selection of $s_x$, which results in a poor choice of moment conditions, may lead to very inefficient estimators and can even cause identification problems. This is known as the ad hoc choice of moment conditions problem in the GMM literature.¹⁶ Following Chacko and Viceira (1999), in order to construct the moment functions, we use the first six spectral moments by setting $s_x = 1, 2, \cdots, 6$ and $p(X_t)$ as a $T$-dimensional vector of 1s. In the next chapter, we perform the estimation with existing Matlab libraries.¹⁷

16 Paraphrasing [50], when $s_x$ goes through all real numbers, i.e., when the number of moment conditions increases to infinity, the conditional density can be recovered. The corresponding GMM estimator based on a continuum of moment conditions yields ML efficiency, according to Carrasco et al. (see [29]). But the size of the weighting matrix is $n \times n$ where $n$ is the sample size. Therefore the computational burden increases dramatically as the sample size increases. Also, according to Mikhail Chernov (one of the authors of [29]), the estimators will deteriorate for large sample sizes because of the numerical errors associated with large matrix operations.
17 We will use the GMM and MINZ Program Libraries for Matlab, by Michael T. Cliff, to perform the estimation.
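The moment functions (4.50) can be assembled for any one-step CCF supplied as a function. The sketch below is a simplified Python illustration, not the Matlab libraries cited above: it takes the instrument $p(X_t)$ to be the constant 1, and `sgmm_objective` uses an identity weighting matrix instead of the optimal $W_n = S^{-1}$. The names `sgmm_moments`, `sgmm_objective` and the callable `ccf(s, x_t)` are assumptions made for this example.

```python
import math


def sgmm_moments(xs, ccf, s_values=(1, 2, 3, 4, 5, 6)):
    """Sample averages of the moment functions (4.50) with p(X_t) = 1.

    xs  : observed series X_1, ..., X_T
    ccf : callable ccf(s, x_t) returning the complex one-step CCF
          of X_{t+1} given X_t = x_t (hypothetical interface)
    Returns the real-part moments followed by the imaginary-part moments.
    """
    n = len(xs) - 1
    re_part, im_part = [], []
    for s in s_values:
        re_sum = im_sum = 0.0
        for t in range(n):
            phi = ccf(s, xs[t])
            re_sum += math.cos(s * xs[t + 1]) - phi.real
            im_sum += math.sin(s * xs[t + 1]) - phi.imag
        re_part.append(re_sum / n)
        im_part.append(im_sum / n)
    return re_part + im_part


def sgmm_objective(xs, ccf, s_values=(1, 2, 3, 4, 5, 6)):
    """Criterion (4.48) with W_n replaced by the identity (a simplification)."""
    return sum(g * g for g in sgmm_moments(xs, ccf, s_values))
```

In an actual estimation, `ccf` would wrap the marginal CCF (4.49) for a candidate parameter vector, and the objective would be minimized over $\theta$ by a numerical optimizer.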
Chapter 5 Model Comparison
This chapter gives some empirical comparisons among the models we introduced in Chapter 3. Three data sets, namely the hourly electricity prices (Hourly EP), on-peak daily average electricity prices (Peak EP), and off-peak daily average electricity prices (Off-peak EP), from January 1, 2002 to March 31, 2004, are used to evaluate the price models.¹
5.1 Data Description
A summary of statistics for the hourly electricity price series from January 1, 2002 to March 31, 2004 is presented in Table 5.1. The statistics reported here are for electricity prices ($P$), the change in electricity prices ($dP$), the logarithm of electricity prices ($\ln(P)$), and the log returns of electricity prices ($d\ln(P)$), from one hour to the next. Clearly, the price series are skewed with excess kurtosis. The descriptive statistics for the on-peak/off-peak daily average electricity prices during the period January 1, 2002 to March 31, 2004 are shown in Tables 5.2 and 5.3 respectively. These data are daily in frequency. As might be expected, Hourly EP has more volatility, larger skewness and higher kurtosis than the daily average on-peak/off-peak data. The descriptive statistics for Peak EP are also larger than those for Off-peak EP.

1 The hourly electricity prices were obtained from the public website of the Alberta Electric System Operator. Altogether, we have 19,704 observations at an hourly frequency. The on-peak and off-peak daily average electricity prices were computed according to the Alberta Energy and Utilities Board. Thus we have 569 observations for the on-peak period, and 821 observations for the off-peak period.

           mean      Std.Dev   Skewness  Kurtosis  Minimum     Maximum
P          58.5911   64.2576   5.1325    46.6263   0.0100      999.9900
dP         0.0042    54.4378   0.4207    53.1331   −862.1000   826.3900
ln(P)      3.5922    0.8359    0.1089    4.1445    −4.6052     6.9077
d ln(P)    0.0000    0.5529    0.0133    8.4214    −6.0426     6.0234

Table 5.1: Descriptive statistics of Hourly EP
Note: here and in the remainder of this thesis, the column labeled Std.Dev reports the standard deviation. Skewness and Kurtosis are the standardized third and fourth moments around the mean, namely $\text{skewness} = \frac{E[(X - E[X])^3]}{(\mathrm{var}[X])^{1.5}}$ and $\text{kurtosis} = \frac{E[(X - E[X])^4]}{(\mathrm{var}[X])^{2}}$. For a normal distribution, skewness is equal to 0 and kurtosis is equal to 3.

           mean      Std.Dev   Skewness  Kurtosis  Minimum     Maximum
P          68.3287   43.0324   2.8119    16.6829   12.2264     413.9657
dP         0.0150    46.5599   0.0897    11.1163   −283.8329   255.9136
ln(P)      4.0763    0.5340    0.1347    3.6559    2.5036      6.0258
d ln(P)    0.0004    0.5225    0.0910    5.0479    −2.3969     2.1600

Table 5.2: Descriptive statistics of Peak EP

Figure 5.1 plots the histogram of the log returns of Hourly EP over the period January 1, 2002 to March 31, 2004. There is a big spike in the middle which is a quantization effect (mainly due to rounding to ±0.01). Most models have difficulty capturing this effect. In order to get reasonable estimates, we remove systematic day-to-day variations from the data prior to fitting. In this way, we can reduce the influence of hourly price patterns and compensate for quantization effects. Specifically, if $P_t$ is the actual price observed at the $i$th hour of the day, $i = 1, \cdots, 24$, then the “deseasonalized” data $X_t$ is
            Mean      Std.Dev   Skewness  Kurtosis  Minimum     Maximum
P           39.6513   29.8016   2.6022    14.9079   6.5713      299.2710
dP          0.0116    25.5697   0.2651    12.7988   -200.9500   128.0813
ln(P)       3.4647    0.6445    0.2188    2.7684    1.8827      5.7013
d ln(P)     0.0005    0.4710    0.1014    4.9058    -2.1208     2.2547

Table 5.3: Descriptive statistics of Off-peak EP
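The statistics reported in Tables 5.1 to 5.3 can be reproduced along the following lines. This is a minimal sketch, not the thesis's Matlab routine: the function names `descriptive_stats` and `price_series` are illustrative, and the standard deviation is assumed here to be the population version (dividing by n), which the text does not specify.

```python
import math


def descriptive_stats(x):
    """Mean, std. dev., skewness, kurtosis, minimum and maximum of a sample.

    Skewness and kurtosis follow the note under Table 5.1:
    E[(X - E[X])^k] / var[X]^(k/2) for k = 3 and k = 4.
    Assumption: population moments (divide by n), not sample moments.
    """
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return {
        "mean": mean,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "min": min(x),
        "max": max(x),
    }


def price_series(prices):
    """Build the four series of the tables: P, dP, ln(P) and d ln(P)."""
    log_p = [math.log(p) for p in prices]
    return {
        "P": list(prices),
        "dP": [b - a for a, b in zip(prices, prices[1:])],
        "ln(P)": log_p,
        "d ln(P)": [b - a for a, b in zip(log_p, log_p[1:])],
    }
```

Applying `descriptive_stats` to each entry of `price_series(prices)` gives one table row per series.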
Figure 5.1: Histogram of the log returns of Hourly EP

Note: the price series we obtained from the public website of the Alberta Electric System Operator have been rounded to two decimal places. A significant number of log returns are inside the range −0.01 to 0.01.
obtained by:

$$X_t = \ln(P_t) - (\text{mean of the log price at the } i\text{th hour}). \tag{5.1}$$

Table 5.4 presents descriptive statistics for the log of Hourly EP (ln(P)), the log returns of Hourly EP, the “deseasonalized” data (X) and the changes of the deseasonalized data (dX). Figure 5.2 plots the histogram of the changes of the “deseasonalized” data.

Figure 5.2: Histogram of the changes of the deseasonalized data (dX)
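The deseasonalization in (5.1) can be sketched as follows. This is an illustration under stated assumptions rather than the thesis's own routine: the function name `deseasonalize` is hypothetical, the series is assumed to start at the first hour of a day, and there are assumed to be exactly 24 observations per day with no gaps.

```python
import math


def deseasonalize(prices):
    """Return X_t = ln(P_t) - (mean log price at the same hour of the day).

    Assumptions: prices[0] is the first hour of a day and the series has
    24 consecutive hourly observations per day, with all prices positive.
    """
    log_p = [math.log(p) for p in prices]

    # Accumulate the log price separately for each of the 24 hours.
    sums = [0.0] * 24
    counts = [0] * 24
    for t, value in enumerate(log_p):
        hour = t % 24
        sums[hour] += value
        counts[hour] += 1
    hourly_mean = [s / c if c else 0.0 for s, c in zip(sums, counts)]

    # Subtract the hour-of-day mean from every observation.
    return [value - hourly_mean[t % 24] for t, value in enumerate(log_p)]
```

By construction the deseasonalized series has mean zero within each hour of the day, which is consistent with the zero mean reported for X in Table 5.4.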
            Mean      Std.Dev   Skewness  Kurtosis  Minimum    Maximum
ln(P)       3.5922    0.8359    0.1089    4.1445    -4.6052    6.9077
d ln(P)     0.0000    0.5529    0.0133    8.4214    -6.0426    6.0234
X           0.0000    0.7344    0.3104    4.7314    -7.5917    3.9780
dX          0.0000    0.5362    0.0307    5.8230    -5.9823    8.8074

Table 5.4: Descriptive statistics of deseasonalized Hourly EP

5.2 Calibration

This section presents estimates from fitting each of the six models to Hourly EP using the methodologies described in Chapter 4. In all cases, Hourly EP were deseasonalized prior to fitting.² All the routines are realized in Matlab, and are available on request.

For Model 1a, Model 2a, Model 1b and Model 2b, we perform calibration via ML-CCF estimation. As we have computed the CCF of $X_{t+1}$ given $X_t$ for each model in Chapter 3, the conditional density of $X_{t+1}$ given $X_t$ can be recovered through the Fourier transform, just as for the example in Section 4.2.³ The estimates are obtained by using standard ML estimation on these conditional densities. We used the Optimization Toolbox in Matlab for the optimization. The optimizations converged in less than 20 steps for Hourly EP. Actual computer time required for each optimization is around 1 minute on a 2.46 GHz Pentium 4 PC. The resulting parameter estimates, the corresponding standard error (labeled as Std.) and t-ratio⁴ for Hourly EP via ML-CCF estimation are provided in Tables 5.5–5.8.

2 In the remainder of this thesis, we just denote the deseasonalized Hourly EP as Hourly EP.
3 In the remainder of this thesis, we will refer to Model 1a, Model 2a and Model 3a as the one-jump version models and refer to Model 1b, Model 2b and Model 3b as the two-jump version models.
4 T-ratio is defined as the ratio of the estimate to the estimate of standard deviation of the
sampling distribution of the estimate. That is,

$$t\text{-ratio} = \frac{\text{Estimate}}{\text{Std.}}. \tag{5.2}$$

For large degrees of freedom (usually 30 or more), if the absolute value of the t-ratio is larger than 1.96, we can consider the parameter to be significant at the 95% level. For more details, see [23].

Notice that the models are generally consistent with each other. This may be because the difference between the on-peak data and off-peak data so obtained is not significant in Alberta (refer to Figure [?]). The mean-reverting rate κ is similar for all the models, and is close to 0.11. Thus the half-life of the mean-reverting process is about 6.3 hours.⁵ Moreover, the jump parameters ω, ψ and γ for all the one-jump version models are quite similar. Likewise, the jump parameters $\omega_u$, $\gamma_u$, $\omega_d$ and $\gamma_d$ in the two-jump version models are similar. We also observe that the frequencies of positive jumps and negative jumps are quite close to each other. The probability of a positive jump ψ is roughly 50% in the one-jump version models. Meanwhile, the upward jump arrival rate $\omega_u$ is roughly equal to the downward jump arrival rate in the two-jump version models. According to Barz (1999), the derived balance between upward and downward jumps can be explained because we use the logarithm of the prices instead of using the spot prices. The magnitude of downward movements is amplified relative to upward movements. Also, he suggested that care must be taken to ensure one does not use a misspecified model in this case (see [17]).

5 Half-life is a key property of a mean-reverting process. It is the time that it takes for the price to revert half way back to its long-term mean level from its current level if no more random shocks arrive. Typically, for a mean-reverting process, the half-life is equal to ln(2) over the mean-reverting rate.

We also observe some different behaviour among these models. Incorporating the on-peak and off-peak effects in Model 2a does not change most of the estimates
significantly compared with the estimates from Model 1a. The on-peak coefficient $\alpha_1$ is close to the off-peak coefficient $\alpha_2$ in Model 2a. However, in Model 2b, this is not the case. Moreover, the probability of positive jumps is slightly more than the probability of negative jumps in Model 1a, while we have slightly more negative jumps than positive jumps in Model 1b. Only the estimate for the long-term mean in Model 1b is much smaller than the estimate for the long-term mean in Model 1a.

6 Refer back to Chapter 4.

The graphic outputs from the optimization process are reported in Figures 5.3–5.6. Each figure consists of three parts. The upper left plot is the empirical histogram of “deviations from the expected values”. For Model 1a and Model 1b, this is defined as:⁶

$$Y_t = (X_{t+1} - \alpha) - e^{-\kappa}(X_t - \alpha), \tag{5.3}$$

where $X_t$ is the log price at time $t$, κ is the mean-reverting rate, and α is the long-term mean. For Model 2a and Model 2b, it is defined as:

$$Y_t = (X_{t+1} - \alpha(t)) - e^{-\kappa}(X_t - \alpha(t)), \tag{5.4}$$

where

$$\alpha(t) = \alpha_1\,\mathrm{peak}_t + \alpha_2\,\mathrm{offpeak}_t, \tag{5.5}$$

with $\mathrm{peak}_t$ and $\mathrm{offpeak}_t$ as defined in Chapter 3. The upper right plot is the theoretical density of $Y_t$. It can be shown that the density function of $Y_t$ is the same as $f(X_{t+1}, \theta \mid X_t)$ (with possibly a translated mean). Hence this histogram and the corresponding density should have the same shape. For good estimates, the histograms of deviations from expected values should be similar to the theoretical conditional density plots. The lower plots are slices of the log-likelihood function versus each parameter. The
asterisk on each curve is the estimate for the corresponding parameter. The number labeled below the x-axis is the width of each curve, which is the 95% confidence interval. If the asterisk is at the peak of the curve, then the estimate is a good one. All the histograms of deviations from expected values are very similar to the theoretical conditional density plots. Also, all of the asterisks (which denote the estimates) are at the peak of the curves.

7 Refer back to Section 4.3.
8 Unfortunately, we cannot obtain simulated paths with the same length as the actual data using the estimates in Tables 5.9 and 5.10 directly. Because $2\kappa_v\alpha_v$ is very close to $\sigma_v^2$, it is very hard to guarantee the positiveness of the volatility process for a long path.

For these stochastic volatility models (Model 3a and Model 3b), since the information of the volatility is not available, we cannot use ML-CCF estimation. However, we can explore the marginal CCF, and perform ML-MCCF estimation instead.⁷ It takes more than 20 minutes for the optimizations to converge on a 2.46 GHz Pentium 4 PC. The resulting parameter estimates, the corresponding standard error and t-ratio for Hourly EP via ML-MCCF estimation are provided in Tables 5.9 and 5.10. As pointed out by Chacko and Viceira (1999), these estimates are biased and inconsistent, and need to be modified. We tried to correct the bias by a bootstrap method as suggested by Chacko and Viceira (1999), and simulated 500 paths with a given parameter $\theta_0$.⁸ The given parameter, the mean of the estimates, mean bias and
root mean square error (RMSE)⁹ for Model 3a and Model 3b are reported in Table 5.11 and Table 5.12 respectively. In comparison with the estimates, the RMSEs are pretty big, which indicates that the estimates are not reliable. We then decided not to use these biases to adjust our estimates.

9 Let $\hat\theta_i$ be the estimate for each sample path $i$, $i = 1, \cdots, n$, and $\theta_0$ be the true value, then mean bias is defined as:

$$\text{Mean Bias} = \frac{1}{n}\sum_{i=1}^{n}(\hat\theta_i - \theta_0).$$

RMSE is defined as:

$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat\theta_i - \theta_0)^2}.$$

We also report the graphic outputs from the optimization process in Figures 5.7–5.8. Similar to before, each figure consists of three parts. The upper left plot is the empirical histogram of deviations from the expected values. For Model 3a and Model 3b, deviation from the expected value is defined as:

$$Y_t = X_{t+1} - e^{-\kappa}X_t. \tag{5.6}$$

The upper right plot is the theoretical density of $Y_t$. The lower plots are slices of the log-likelihood function versus each parameter. The histograms of deviations from expected values are close to the theoretical conditional density plots. But not all of the asterisks are at the peak of the curves. The asterisk for ρ is not at the peak. Also, the curve of the log-likelihood function versus the correlation coefficient (rho) in both models is quite flat (not convex). This indicates that this parameter is hard to estimate, and extra care should be taken when computing this parameter. It seems that ρ tends to go towards −1, and this may result in a higher log-likelihood. Therefore, we recomputed the log-likelihood function for both models
by fitting Hourly EP, fixing ρ = −1. The log-likelihood values are indeed higher with ρ = −1, which verifies our hypothesis. The resulting parameter estimates, the corresponding standard error and t-ratio for Hourly EP via ML-MCCF estimation are provided in Tables 5.13 and 5.14. As the spot price and the volatility are so strongly correlated, Model 3a and Model 3b may be misspecified models.

In comparison with the other four models, the optimization processes of Model 3a and Model 3b are quite computationally intensive. This is because we have to solve the Riccati equations numerically during each iteration. We also carried out estimation for Model 3a and Model 3b via SGMM estimation. Unfortunately, the results that were obtained within the available time were not very accurate, and are not presented here. Therefore, in the following sections, we will only consider the other four models.

        Estimate   Std.     T-ratio
κ       0.1124     0.0040   28.1000
α       0.2368     0.0250   9.4720
σ²      0.0144     0.0008   18.0000
ω       1.2260     0.0344   35.6395
ψ       0.5261     0.0061   86.2459
γ       0.3367     0.0058   58.0517

Table 5.5: Hourly EP parameters values for Model 1a

Note: this table reports the estimates, the standard error of estimation and t-statistics from fitting Model 1a ($dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega)$) to Hourly EP. All the estimates are hourly based. The standard errors are quite small. All the absolute values of the t-ratios are larger than 1.96, which indicates that all the parameters are likely significant.
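The t-ratio column of Table 5.5 follows directly from definition (5.2), as the short sketch below illustrates. The dictionary `table_5_5` simply copies the Estimate and Std. columns of the table; the names `t_ratio` and `is_significant` are illustrative and not part of the thesis's Matlab code.

```python
def t_ratio(estimate, std_error):
    """t-ratio as in (5.2): the estimate divided by its standard error."""
    return estimate / std_error


def is_significant(estimate, std_error, cutoff=1.96):
    """True if |t-ratio| exceeds the large-sample 95% cutoff of 1.96."""
    return abs(t_ratio(estimate, std_error)) > cutoff


# (Estimate, Std.) pairs copied from Table 5.5 (Model 1a).
table_5_5 = {
    "kappa": (0.1124, 0.0040),
    "alpha": (0.2368, 0.0250),
    "sigma2": (0.0144, 0.0008),
    "omega": (1.2260, 0.0344),
    "psi": (0.5261, 0.0061),
    "gamma": (0.3367, 0.0058),
}

t_ratios = {name: t_ratio(est, std) for name, (est, std) in table_5_5.items()}
all_significant = all(is_significant(est, std) for est, std in table_5_5.values())
```

Because the published estimates and standard errors are rounded to four decimals, ratios recomputed this way can differ slightly from printed t-ratios in other tables.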
[Figure: histogram of deviations from expected values (upper left); plot of density function (upper right); slices of the log-likelihood function versus each parameter (lower)]

Figure 5.3: Results from optimization (Model 1a)

Note: these are the results of fitting Model 1a ($dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega)$) to Hourly EP. If we get good estimates, the histogram of deviation of expected values (defined in (5.3)) should be very similar to the plot of the theoretical density function. Moreover, all the estimates are at the peak of each curve, which indicates that we have found the optimum points. These results are also consistent with the small standard errors in Table 5.5. From these plots, one can tell that Model 1a is a good fit to Hourly EP.
3536 0. Furthermore.0328 0. while. the probability of positive jumps is slightly more than the probability of negative jumps.5556 0.5228 0.3321 Table 5.6667 0. All the estimates are hourly based.0008 0.6: Hourly EP parameters values for Model 2a Note: this table reports the estimates.96. incorporating the onpeak and oﬀpeak eﬀects does not change our estimates signiﬁcantly.0274 Tratio 25.6190 0.0299 22. Notice that we have slightly more negative jumps than positive jumps.0061 0.3684 7. which indicates that all the parameters are signiﬁcant.1016 0. All the absolute values of the tratios are larger than 1.96.0131 0.7: Hourly EP parameters values for Model 1b Note: this table reports the estimates.0041 0.7049 59.1051 0.6341 9. which indicates that all the parameters are signiﬁcant. the standard error of estimation and tstatistics from ﬁtting Model 2a (dXt = κ(α(t) − Xt )dt + σdWt + Qt dPt (ω)) to Hourly EP. Estimate 0.2188 0. 0.0320 4. Tratio 0. the standard error of estimation and tstatistics from ﬁtting Model 1b (dXt = κ(α−Xt )dt+σdWt +Qu dPtu (ωu )+Qd dPtd (ωd )) t t to Hourly EP.1472 0.3906 0. the onpeak coeﬃcient α1 is close to the oﬀpeak coeﬃcient α2 .6622 0.0009 14.0073 48.3059 Std.0278 0.0042 27. All the estimates are hourly based.0057 0.1585 81.67 κ α1 σ2 ω ψ γ α2 Estimate 0. in Model 1a. the estimate for the longterm mean in Model 1b is much smaller than the estimate for the longterm mean in Model 1a. The standard errors are quite small.1405 0. .1162 0. The standard errors are small.2765 0.0136 1.4167 κ α σ2 ωu γu ωd γd Table 5. All the absolute values of the tratios are larger than 1.2009 Std.0187 33.3384 0. In comparison with the estimates from Model 1a. Moreover.0084 36.4384 0.9460 17 37.
[Figure: histogram of deviations from expected values (upper left); plot of density function (upper right); slices of the log-likelihood function versus each parameter (lower)]

Figure 5.4: Results from optimization (Model 2a)

Note: these are the results of fitting Model 2a ($dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q_t\,dP_t(\omega)$) to Hourly EP. If we get good estimates, the histogram of deviation of expected values (defined in (5.4)) should be very similar to the plot of the theoretical density function. Moreover, all the estimates are at the peak of each curve, which indicates that we have found the optimum points. These results are also consistent with the small standard errors in Table 5.6. From these plots, one can tell that Model 2a is a good fit to Hourly EP.
[Figure: histogram of deviations from expected values (upper left); plot of density function (upper right); slices of the log-likelihood function versus each parameter (lower)]

Figure 5.5: Results from optimization (Model 1b)

Note: these are the results of fitting Model 1b ($dX_t = \kappa(\alpha - X_t)\,dt + \sigma\,dW_t + Q^u_t\,dP^u_t(\omega_u) + Q^d_t\,dP^d_t(\omega_d)$) to Hourly EP. If we get good estimates, the histogram of deviation of expected values (defined in (5.3)) should be very similar to the plot of the theoretical density function. Moreover, all the estimates are at the peak of each curve, which indicates that we have found the optimum points. These results are also consistent with the small standard errors in Table 5.7. From these plots, one can tell that Model 1b is a good fit to Hourly EP.
The standard errors t are quite small.8429 37.9205 14. the standard error of estimation and tstatistics from ﬁtting Model 2b (dXt = κ(α(t) − Xt )dt + σdWt + Qu dPt (ωu ) + t Qd dPt (ωd )) to Hourly EP.3043 0.3569 0.8: Hourly EP parameters values for Model 2b Note: this table reports the estimates.4514 2.0343 Tratio 26.5346 Table 5. incorporating the onpeak and oﬀpeak eﬀects does not change our estimates signiﬁcantly.96.0175 0. which indicates that all the parameters are signiﬁcant. The onpeak coeﬃcient α1 is not that close to the oﬀpeak coeﬃcient α2 as in Model 2a.6725 0. .9320 49.1084 0.0041 0.0008 0.0338 0.1427 4.0121 0. All the absolute values of the tratios are larger than 1.0294 0.6096 0. All the estimates are hourly based.8066 34.0870 Std.0072 0.5694 22. 0.0081 0.70 κ α1 σ2 ωu γu ωd γd α2 Estimate 0.1665 0. In comparison with the estimates from Model 1b.
[Figure: histogram of deviations from expected values (upper left); plot of density function (upper right); slices of the log-likelihood function versus each parameter (lower)]

Figure 5.6: Results from optimization (Model 2b)

Note: these are the results of fitting Model 2b ($dX_t = \kappa(\alpha(t) - X_t)\,dt + \sigma\,dW_t + Q^u_t\,dP^u_t(\omega_u) + Q^d_t\,dP^d_t(\omega_d)$) to Hourly EP. If we get good estimates, the histogram of deviation of expected values (defined in (5.4)) should be very similar to the plot of the theoretical density function. Moreover, all the estimates are at the peak of each curve, which indicates that we have found the optimum points. These results are also consistent with the small standard errors in Table 5.8. From these plots, one can tell that Model 2b is a good fit to Hourly EP.
0170 63.) from ﬁtting Model √ √ 2 V Xt κ(α − Xt ) dWt Qt dPt (ω) 1−ρ V ρ√ t t 3a ( d = dt + + ) to Vt κv (αv − Vt ) 0 0 σv Vt dWv Hourly EP.9415 11345.8013 Table 5. the standard error of estimation.9: Hourly EP parameters values for Model 3a Note: this table reports the estimates.0065 54.2333 0.0413 12. The standard errors are still reasonably small.5781 0.1152 0.96.0500 0.0315 17.4063 0. tstatistics and adjusted estimates (labeled as A.72 κ α ω ψ γ ρ κv αv σv Loglikelihood Estimate Std.1846 0. All the absolute values of the tratios are larger than 1.0014 693.5714 0. which indicates that all the parameters are signiﬁcant.0814 0.6118 0.3522 0. Tratio 0.1529 0.0090 64. All the estimates are hourly based.3148 1.5483 0.9710 0.0171 8.0023 50.Est.0020 13.5086 0.0870 0. .0261 0.
[Figure: histogram of deviations from expected values (upper left); plot of density function (upper right); slices of the log-likelihood function versus each parameter (lower)]

Figure 5.7: Results from optimization (Model 3a)

Note: these are the results of fitting Model 3a to Hourly EP. The histogram of deviation of expected values (defined in (5.6)) is very similar to the plot of the theoretical density function. But not all the estimates are at the peak of the curves. Here, the curve of the log-likelihood function versus the correlation coefficient (rho) is quite flat and the estimate is not at the peak.
5 0.3605 Table 5.5978 0. which indicates that all the parameters are signiﬁcant.8138 0. The standard errors are still reasonably small. Tratio 0.7727 0.0237 27.1370 11343.0281 0.0338 12.0022 12.4187 0.6915 0.0349 19.1782 0.5994 0.6420 0. the standard error of estimation and tstatistics √ from ﬁtting Model 3b √ Xt κ(α − Xt ) dWt 1 − ρ 2 Vt ρ √ t V d = dt + Vt κv (αv − Vt ) 0 σv Vt dWv to Hourly EP.0092 37.1330 0.9987 0.0022 51. .96.0151 24.3046 0.3670 0.1140 0. Qu dPt (ωu ) + Qd dPt (ωd ) t t + 0 All the estimates are hourly based.10: Hourly EP parameters values for Model 3b Note: this table reports the estimates.3459 0.0002 4993.0886 0.0219 8.3876 0.8182 0. All the absolute values of the tratios are larger than 1.74 κ α ωu γu ωd γd ρ κv αv σv Loglikelihood Estimate Std.0737 8.
[Figure: histogram of deviations from expected values (upper left); plot of density function (upper right); slices of the log-likelihood function versus each parameter (lower)]

Figure 5.8: Results from optimization (Model 3b)

Note: these are the results of fitting Model 3b to Hourly EP. The histogram of deviation of expected values (defined in (5.6)) is very similar to the plot of the theoretical density function. But not all the estimates are at the peak of the curves. Similar to Model 3a, the curve of the log-likelihood function versus the correlation coefficient (rho) is quite flat and the estimate is not at the peak.
6088 0.5994 0.4249 0.1217 0. we can apply LRT to compare Model 1b and Model 2b.0640 0.1529 0.3234 0.0487 0.6396 0. Adding additional parameters will always result in a higher likelihood. Likewise.4019 0.5640 0. one should also consider whether adding additional parameters results in signiﬁcant improvement ﬁtting a model to a particular data set.6070 0.1695 0.9680 0.0018 1. That is.5795 0.1487 0. LRT begins with a comparison of the likelihood values of 10 LRT is explained in detail by Felsenstein (1981). the more complex model must only diﬀer from the simpler model by the addition of one or more parameters. .3 Goodness of Fit Because Model 1a is subsumed by Model 2a.10 LRT is only valid if used to compare hierarchically nested models. A relatively more complex model is compared to a simpler model to see if it ﬁts a particular data set signiﬁcantly better.76 κ α ω ψ γ ρ κv αv σv True Theta Mean Bias RMSE 0.11: Bias for Model 3a 5.1021 0. LRT provides one objective criterion for selecting among possible models.8895 0.1567 Table 5.3532 0.0674 0.9010 1.0069 0. However.1643 0.2519 1.3686 0.8381 0. likelihood ratio tests (LRT) may be applied to compare these models.3726 0.1148 0.0601 0. LRT is a statistical test of the goodnessofﬁt between two models.0079 0.0728 0. If so.5000 0.5169 0. the additional parameters of the more complex model are often used in subsequent analyses.0715 1.
5404 0.6420 1.1891 0.5000 0.1679 Table 5.4187 0. Table 5.3277 0.0067 0.0141 0.5994 1.0104 0. The column labeled LR contains the actual likelihood ratio statistic.1891 0. (5. the ﬁrst column lists the pair of models under consideration.1272 0.7) where Lf and Lr are the likelihood values of the more complex model and the simpler model respectively. Using this information we can then determine the critical value of the test statistic from standard statistical tables.1679 0.8931 0.0812 0. The second column labeled DF lists the additional degrees of freedom contained in the more complex model.0404 0. is .4543 0. that the restrictions included in the simpler model are valid.1835 0.4491 0.9833 0.3459 0. where k is equal to the number of additional parameters in the more complex model. The hypothesis. This LRT statistic approximately follows a chisquare distribution with degrees of freedom k.15 lists the LR statistics for diﬀerent pairs of models.4269 0.77 κ α ωu γu ωd γd ρ κv αv σv True Theta Mean Bias RMSE 0.1207 0.4925 0.7309 0.0082 0.3670 0.1782 0.12: Bias for Model 3b the two models: LR = 2(ln(Lf ) − ln(Lr )).0154 0.1140 0.0991 0. The column labeled Cutoﬀ contains the 95% chisquared cutoﬀ value for a likelihood ratio statistic with degrees of freedom corresponding to the number in the DF column.6915 1.0970 0. In this table.0963 0.0078 0.0821 0.9987 0.0192 0.3563 0.
All the estimates are hourly based.1690 2.1769 0. the standard error of estimation and tstatistics from ﬁtting Model 3b to Hourly EP with ρ = −1.9240 0.78 κ α ω ψ γ κv αv σv Loglikelihood Estimate Std.6501 0.1135 0.96). which suggests that the estimates may be not accurate.6809 0.4180 0.3974 1.6502 0.6925 0.3767 0.3294 Table 5.0119 13.5250 1.0259 0.0000 5884 0. some of the estimates are not signiﬁcant any more (the absolute values of the tratio < 1.1562 0.1277 5.3612 1.1626 Table 5.0171 6.5239 1.1608 0. κ α ωu γu ωd γd κv αv σv Loglikelihood Estimate Std.6411 0.1322 0.96).1919 0. the standard error of α is one.0166 0.0519 0.0271 1. Although we do get a higher loglikelihood.0471 0.0372 0.1848 0.9221 11343.14: Hourly EP parameters values for Model 3b (ρ = −1) Note: this table reports the estimates. All the estimates are hourly based.5884 0.6008 0. Moreover.1135 0.4993 1. some of the estimates are not signiﬁcant any more (the absolute values of the tratio < 1. the standard error of estimation and tstatistics from ﬁtting Model 3a to Hourly EP with ρ = −1.4979 2.5250 1.5796 0.9312 0.0281 0.2238 0.1261 11344.0878 0.0374 0.3461 0.3672 0. .3335 0. Tratio 0.3508 0.0000 0. Although we do get a higher loglikelihood.13: Hourly EP parameters values for Model 3a (ρ = −1) Note: this table reports the estimates. Tratio 0.85855 0.
rejected if the quantity in the LR column is greater than the quantity in the Cutoff column. Thus this test suggests that the time-varying mean models provide a better fit than the ones with constant long-term mean and volatility. Moreover, the log-likelihood values for the two-jump version processes are higher than their one-jump counterparts, whereas the two-jump version processes and the one-jump version processes are not nested. Thus one would expect they will give a better fit.

Models                    DF   LR         Cutoff
Model 1a vs. Model 2a     1    53.9024    3.84
Model 1a vs. Model 3a     2    156.5978   5.99
Model 1b vs. Model 2b     1    58.4908    3.84
Model 1b vs. Model 3b     2    140.0579   5.99

Table 5.15: Likelihood ratio statistics fitting on Hourly EP

Note: in this table, the first column lists the pair of models under consideration. The second column labeled DF lists the additional degrees of freedom contained in the more complex model. The column labeled LR contains the actual likelihood ratio statistic. The column labeled Cutoff contains the 95% chi-squared cutoff value for a likelihood ratio statistic with degrees of freedom corresponding to the number in the DF column. The hypothesis, that the restrictions included in the simpler model are valid, is rejected if the quantity in the LR column is greater than the quantity in the Cutoff column.

One should also consider whether adding additional parameters results in significant improvement in fitting a model to a particular data set. Another related statistical measurement of fit is the Schwarz criterion (see [51]). This is defined as:

$$SC = -2\ln L + k\ln n, \tag{5.8}$$

where $L$ represents the likelihood value, $k$ is the number of parameters estimated in the model, and $n$ is the sample size. Models that yield a minimum value for the criterion are to be preferred.
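The likelihood ratio statistic (5.7) and the Schwarz criterion (5.8) are simple to evaluate once the log-likelihood values are known. The sketch below is illustrative only: the function names are hypothetical, the chi-square cutoff is implemented in closed form for 1 and 2 degrees of freedom only (the cases in Table 5.15), and the example values assume log-likelihoods of −11424.1002 (Model 1a, 6 parameters) and −11397.1492 (Model 2a, 7 parameters) with n = 19,704 hourly observations.

```python
import math
from statistics import NormalDist


def likelihood_ratio(loglik_full, loglik_restricted):
    """LR = 2 (ln L_f - ln L_r), as in (5.7)."""
    return 2.0 * (loglik_full - loglik_restricted)


def chi2_cutoff_95(df):
    """95% chi-square critical value, closed form for df = 1 or 2 only."""
    if df == 1:
        # A chi-square(1) variable is the square of a standard normal.
        return NormalDist().inv_cdf(0.975) ** 2
    if df == 2:
        # Chi-square(2) has CDF 1 - exp(-x / 2).
        return -2.0 * math.log(0.05)
    raise ValueError("only df = 1 or df = 2 is handled in this sketch")


def schwarz_criterion(loglik, k, n):
    """SC = -2 ln L + k ln n, as in (5.8)."""
    return -2.0 * loglik + k * math.log(n)


# Assumed inputs (see lead-in): Model 1a versus Model 2a on Hourly EP.
lr_1a_2a = likelihood_ratio(-11397.1492, -11424.1002)
reject_1a = lr_1a_2a > chi2_cutoff_95(1)
sc_1a = schwarz_criterion(-11424.1002, k=6, n=19704)
sc_2a = schwarz_criterion(-11397.1492, k=7, n=19704)
```

With these inputs the restricted Model 1a is rejected in favour of Model 2a, and Model 2a also has the smaller Schwarz criterion, so both criteria point the same way for this pair.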
Model       ln L          SC
Model 1a    -11424.1002   22907.53
Model 2a    -11397.1492   22863.51
Model 1b    -11413.3895   22895.99
Model 2b    -11384.1441   22847.39

Table 5.16: Schwarz criteria statistics for Models

Note: in this table, the first column lists the model under consideration. The second column labeled ln L lists the log-likelihood value. The column labeled SC reports the computed Schwarz criteria.

Table 5.16 reports the log-likelihood values together with the Schwarz criteria for the models. The Schwarz criterion values can be compared among various models as a basis for the selection of the model. The model which has maximum likelihood value and an appropriate number of additional parameters will be chosen. Therefore, the Schwarz criterion accommodates the trade-off between a better fit and fewer parameters by penalising the model with additional parameters. Models that yield a minimum value for the criterion are to be preferred. From this table, Model 2b is the best model. It appears that Model 2b better describes the dynamics of Hourly EP as it has the smallest Schwarz criterion. While comparative tests like LRT and the Schwarz criterion offer an indication of the relative quality of each model, they do not yield an absolute assessment. We will do further evaluation on those models in the following sections.

5.4 Robustness Test

A simulation exercise was undertaken to determine how robustly the ML-CCF estimators performed. The row labeled True Theta is used to generate 300 sample paths from the
0100 0.0034 0.0070 0.0165 0.0012 0.2407 0.0871 0.2260 0.0689 0.0062 0.0332 Table 5. We report the true parameter values. Notice that MLCCF estimators performed quite well and our estimation procedure is very accurate since the mean bias and RMSE are small.0095 0.0193 0.81 True Theta Mean Mean Bias RMSE κ α σ2 ω ψ γ 0.0984 0.0147 0.6622 0.1439 0.0079 0. We use Milstein’s scheme to simulate 3000 hourly observations for each path.20.0222 0.3367 0.17: Robustness test for Model 1a True Theta Mean Mean Bias RMSE κ α σ2 ωu γu 0.3758 0.0016 0.5249 0.6252 0.18: Robustness test for Model 1b ωd γd 0. along with those statistical measures from the estimation.3265 0.0017 0.1124 0.1194 0.0144 1.3059 0.0069 0.0029 0.0029 0.17–5.0148 0. Estimation is undertaken via the MLCCF estimation.0131 0.5261 0. we assume the generated sample paths are also for an hourly time interval.0206 0. .1162 0.0760 0.6190 0.0260 Table 5. Because each sample path is for an hourly time interval.1405 0.3536 0.2368 0.0042 0. The estimates are provided in Tables 5.1241 0.6664 0.0160 1.0307 discretized model beginning with the longterm mean α.0587 0.3560 0.2299 0. and the 300 paths are generated using the antithetic variate technique.
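The Mean, Mean Bias and RMSE rows of the robustness tables are summary statistics of the per-path estimates. The sketch below shows how they could be computed for a single parameter; the function names and the example numbers are hypothetical and are not taken from the thesis's simulation output.

```python
import math


def mean_estimate(estimates):
    """Average of the estimates over all simulated sample paths."""
    return sum(estimates) / len(estimates)


def mean_bias(estimates, true_value):
    """Mean Bias = (1/n) * sum_i (theta_hat_i - theta_0)."""
    return sum(e - true_value for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """RMSE = sqrt((1/n) * sum_i (theta_hat_i - theta_0)^2)."""
    n = len(estimates)
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)


# Hypothetical example: three estimates of a parameter whose true value is 0.11.
example_estimates = [0.10, 0.12, 0.11]
example_bias = mean_bias(example_estimates, 0.11)
example_rmse = rmse(example_estimates, 0.11)
```

A small mean bias together with a small RMSE, relative to the size of the true parameter, is what the text interprets as evidence that the estimator is accurate.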
3194 0.0566 0.2353 0. θ) + B(s.5 Descriptive Statistics of Empirical vs.0027 0.5216 0.9) For Model 1a. ∂sn Then (5.0329 ωd γd α1 0.0015 0. XT Xt ) = E[exp(isXT )Xt ] = exp(A(s.0151 1.2765 0.82 True Theta Mean Mean Bias RMSE κ α σ2 ω ψ γ α1 0.. Let Un denote the nth moment.6145 0.0014 0.0280 0.1223 0.2188 0. θ)Xt ).2105 0.1084 0.19: Robustness test for Model 2a True Theta Mean Mean Bias RMSE κ α σ2 ωu γu 0.0087 0.3043 0.e.1151 0.10) .0738 0.20: Robustness test for Model 2b 5. To obtain the moments.0412 0.6096 0.0048 0. (5. t. we can obtain the moments for any choice of jump distribution where the jump intensity or distribution does not depend on the state variables.3788 0.0870 0.2254 0.1051 0. t.1665 0.0832 Table 5. we diﬀerentiate the CCF (φ) successively with respect to s and then ﬁnd the value of the derivative when s = 0.0214 0.5228 0.0063 0.0865 0.0287 0.6817 0.0012 0. i. in ∂nφ .0245 0.0092 0.0121 0.1114 0.1009 Table 5.0083 0.0027 0.0443 0. the CCF of XT given Xt has the closed form φ(s. Calibrated Hourly Returns Given the CCF. T. T.6725 0.0136 1.0917 0.3384 0.0083 0.0135 0. and φn be the nth derivative of φ with respect to s. φn = Un := 1 [φn s = 0].2009 0.3569 0.0067 0.0219 0.3598 0.1156 0.1076 0.0187 0.0270 0.0151 0. θ.
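where

$$A(s, t, T, \theta) = i\alpha s\left(1 - e^{-\kappa(T-t)}\right) - \frac{\sigma^2 s^2\left(1 - e^{-2\kappa(T-t)}\right)}{4\kappa} + \frac{i\omega(1 - 2\psi)}{\kappa}\left[\arctan\left(\gamma s e^{-\kappa(T-t)}\right) - \arctan(\gamma s)\right] + \frac{\omega}{2\kappa}\ln\frac{1 + \gamma^2 s^2 e^{-2\kappa(T-t)}}{1 + \gamma^2 s^2},$$

$$B(s, t, T, \theta) = i s e^{-\kappa(T-t)}. \tag{5.11}$$

Then the unconditional (stationary) characteristic function $\phi(s, \theta)$ of $X_T$, which is obtained by letting $T \to \infty$, is of the form

$$\phi(s, \theta) = \exp\left(i\alpha s - \frac{\sigma^2 s^2}{4\kappa} + \frac{i\omega(2\psi - 1)}{\kappa}\arctan(\gamma s) - \frac{\omega}{2\kappa}\ln(1 + \gamma^2 s^2)\right). \tag{5.12}$$

We will let $X_\infty$ denote the random variable with this unconditional (stationary) distribution. One can obtain the $n$th moment by equation (5.9). For example, to get the first moment, we differentiate $\phi(s, \theta)$ with respect to $s$:

$$\phi_1 := \frac{\partial\phi(s, \theta)}{\partial s} = \left(i\alpha - \frac{\sigma^2 s}{2\kappa} + \frac{i\omega(2\psi - 1)}{\kappa}\,\frac{\gamma}{1 + \gamma^2 s^2} - \frac{\omega}{2\kappa}\,\frac{2\gamma^2 s}{1 + \gamma^2 s^2}\right)\phi(s, \theta). \tag{5.13}$$

Thus the first moment $U_1$ is given by

$$U_1 := E[X_\infty] = \frac{1}{i}\left[\phi_1 \mid s = 0\right] = \alpha + \frac{\omega(2\psi - 1)}{\kappa}\gamma. \tag{5.14}$$

Similarly, we can obtain other moments:

$$U_2 := E[X_\infty^2] = \frac{\sigma^2 + 2\omega\gamma^2}{2\kappa} + U_1^2,$$

$$U_3 := E[X_\infty^3] = \frac{2\omega(2\psi - 1)\gamma^3}{\kappa} + 3U_1U_2 - 2U_1^3,$$

$$U_4 := E[X_\infty^4] = \frac{6\omega\gamma^4}{\kappa} + 3U_2^2 - 12U_2U_1^2 + 4U_1U_3 + 6U_1^4. \tag{5.15}$$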
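The moment formulas above can be checked numerically by differentiating the stationary characteristic function (5.12) at s = 0 with finite differences, following (5.9). The sketch below does this for the first two moments; it is an illustration rather than the thesis's method, the function names and the step size `h` are assumptions, and the argument `sigma2` stands for σ². The parameter values are the Model 1a estimates of Table 5.5.

```python
import cmath
import math


def stationary_cf(s, kappa, alpha, sigma2, omega, psi, gamma):
    """Unconditional characteristic function (5.12) of Model 1a."""
    exponent = (
        1j * alpha * s
        - sigma2 * s * s / (4.0 * kappa)
        + 1j * omega * (2.0 * psi - 1.0) / kappa * math.atan(gamma * s)
        - omega / (2.0 * kappa) * math.log(1.0 + gamma ** 2 * s ** 2)
    )
    return cmath.exp(exponent)


def moments_numeric(theta, h=1e-3):
    """U_1 and U_2 from central finite differences of the CF at s = 0."""
    f_plus = stationary_cf(h, *theta)
    f_zero = stationary_cf(0.0, *theta)
    f_minus = stationary_cf(-h, *theta)
    d1 = (f_plus - f_minus) / (2.0 * h)            # approximates phi'(0)
    d2 = (f_plus - 2.0 * f_zero + f_minus) / h ** 2  # approximates phi''(0)
    return (d1 / 1j).real, (d2 / (1j ** 2)).real


def moments_closed_form(kappa, alpha, sigma2, omega, psi, gamma):
    """U_1 from (5.14) and U_2 from (5.15)."""
    u1 = alpha + omega * (2.0 * psi - 1.0) * gamma / kappa
    u2 = (sigma2 + 2.0 * omega * gamma ** 2) / (2.0 * kappa) + u1 ** 2
    return u1, u2


# Model 1a estimates from Table 5.5: kappa, alpha, sigma^2, omega, psi, gamma.
theta_1a = (0.1124, 0.2368, 0.0144, 1.2260, 0.5261, 0.3367)
u1_num, u2_num = moments_numeric(theta_1a)
u1_cf, u2_cf = moments_closed_form(*theta_1a)
stationary_std = math.sqrt(u2_cf - u1_cf ** 2)
```

Agreement between the numerical and closed-form values is a quick sanity check on the algebra; higher moments could be checked the same way with higher-order difference stencils, at the cost of more care over the step size.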
Therefore, we have the unconditional variance, skewness and kurtosis:

$$\text{Variance} := E[(X_\infty - U_1)^2] = \frac{\sigma^2 + 2\omega\gamma^2}{2\kappa},$$

$$\text{Skewness} := \frac{E[(X_\infty - U_1)^3]}{(U_2 - U_1^2)^{3/2}} = \frac{4\sqrt{2\kappa}\,\omega\gamma^3(2\psi - 1)}{(\sigma^2 + 2\omega\gamma^2)^{3/2}},$$

$$\text{Kurtosis} := \frac{E[(X_\infty - U_1)^4]}{(U_2 - U_1^2)^2} = \frac{24\kappa\omega\gamma^4}{(\sigma^2 + 2\omega\gamma^2)^2} + 3. \tag{5.16}$$

We can also obtain the conditional variance, skewness and kurtosis for the process in the same fashion:

$$\text{Variance} := E[(X_{t+1} - U_1)^2 \mid X_t] = \frac{(\sigma^2 + 2\omega\gamma^2)(1 - e^{-2\kappa})}{2\kappa},$$

$$\text{Skewness} := \frac{E[(X_{t+1} - U_1)^3 \mid X_t]}{(U_2 - U_1^2)^{3/2}} = \frac{4\omega\sqrt{2\kappa}\,(2\psi - 1)\gamma^3}{\left((\sigma^2 + 2\omega\gamma^2)(1 - e^{-2\kappa})\right)^{3/2}}\left(1 - e^{-3\kappa}\right),$$

$$\text{Kurtosis} := \frac{E[(X_{t+1} - U_1)^4 \mid X_t]}{(U_2 - U_1^2)^2} = \frac{24\kappa\omega\gamma^4(1 + e^{-2\kappa})}{(\sigma^2 + 2\omega\gamma^2)^2(1 - e^{-2\kappa})} + 3. \tag{5.17}$$

Notice that since higher moments (variance, skewness and kurtosis) do not include the information of the long-term mean, Model 2a has the same formulae for higher moments as Model 1a. As a first check we compute the moments of the unconditional distribution and conditional distribution of the logarithm of electricity prices, using the estimated parameters for Model 1a and Model 2a in Table 5.5 and Table 5.6 respectively. According to Das [52], empirical approximations of the unconditional moments can be obtained by computing higher moments of the empirical data (Hourly EP), and the empirical conditional moments are approximated by computing higher moments of the changes of the empirical data (Hourly EP). Empirical and theoretical results are listed in Tables 5.21 and 5.22. For Model 1b, we can obtain the unconditional characteristic function in the same
4424 Table 5. theoretical results for Model 1a Empirical results unconditional conditional 0.4214 Theoretical results unconditional conditional 1.1445 8.Dev Skewness Kurtosis Empirical results unconditional conditional 0.18) Thus we have the unconditional variance.20) .1445 8. (5.1089 0.8359 0.0249 0. skewness and kurtosis: Variance := E[(X∞ − U1 )2 ] = Skewness := E 2 (U2 − U1 ) 2 (X∞ − U1 )4 Kurtosis := E 2 (U2 − U1 )2 (X∞ − U1 )3 3 (5.21: Empirical results vs.0133 4. 4κ κ κ 2 2 σ 2 + 2ωu γu + 2ωd γd .1089 0. 2κ √ 3 3 4 2κ(ωu γu + ωd γd ) 2 2 + 2ωu γu + 2ωd γd )(1 − e−2κ )) 2 4 4 24κ(ωu γu + ωd γd )(1 + e−2κ ) 2 2 ((σ 2 + 2ωu γu + 2ωd γd )(1 − e−2κ ))2 Xt = ((σ 2 3 (1 − e−3κ ) .4926 Std.0817 3. = 2 2 2 (σ + 2ωu γu + 2ωd γd )2 (5. 2 3 2 (σ 2 + 2ωu γu + 2ωd γd ) 2 4 4 24κ(ωu γu + ωd γd ) +3.4214 Theoretical results unconditional conditional 1. +3.1404 0.5529 0. theoretical results for Model 2a fashion.1801 0.85 Std. √ 2κ 3 3 4 2κ(ωu γu + ωd γd ) = .19) and the conditional variance. φ(s.4972 7.4704 7.5138 0.0133 4.22: Empirical results vs.5529 0. skewness and kurtosis: Variance := E[(Xt+1 − U1 )2 Xt ] = ( Skewness := E (U2 − (Xt+1 − U1 )4 Kurtosis := E Xt = 2 (U2 − U1 )2 (Xt+1 − U1 )3 2 3 U1 ) 2 2 2 σ 2 + 2ωu γu + 2ωd γd )(1 − e−2κ ) .5117 0. θ) = exp iαs − ωd σ 2 s 2 ωu − ln(1 − isγu ) − ln(1 − isγd ) .Dev Skewness Kurtosis Table 5.0929 3.0293 0.8359 0.
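The two-jump moment formulas (5.19) and (5.20) can be evaluated numerically with a short Python sketch such as the one below. It is an illustration under stated assumptions rather than the code used for the tables: the function name is hypothetical, $\gamma_d$ is taken as a signed mean jump size (negative for downward jumps), the time step is one hour, and the conditional kurtosis is written in the form that follows from the cumulants of the conditional characteristic function.

```python
import math


def two_jump_moments(kappa, sigma, omega_u, gamma_u, omega_d, gamma_d,
                     conditional=False):
    """Return (std_dev, skewness, kurtosis) for the two-jump model.

    Hypothetical helper. gamma_d is assumed to be signed, so a downward
    jump component has gamma_d < 0.
    """
    # Diffusion-plus-jump variance term and higher jump aggregates.
    v = sigma ** 2 + 2.0 * omega_u * gamma_u ** 2 + 2.0 * omega_d * gamma_d ** 2
    j3 = omega_u * gamma_u ** 3 + omega_d * gamma_d ** 3
    j4 = omega_u * gamma_u ** 4 + omega_d * gamma_d ** 4

    if conditional:
        # One-step-ahead moments (time step of one hour assumed).
        d2 = 1.0 - math.exp(-2.0 * kappa)
        d3 = 1.0 - math.exp(-3.0 * kappa)
        variance = v * d2 / (2.0 * kappa)
        skewness = 4.0 * math.sqrt(2.0 * kappa) * j3 * d3 / (v * d2) ** 1.5
        kurtosis = (24.0 * kappa * j4 * (1.0 + math.exp(-2.0 * kappa))
                    / (v ** 2 * d2) + 3.0)
    else:
        # Moments of the stationary distribution.
        variance = v / (2.0 * kappa)
        skewness = 4.0 * math.sqrt(2.0 * kappa) * j3 / v ** 1.5
        kurtosis = 24.0 * kappa * j4 / v ** 2 + 3.0
    return math.sqrt(variance), skewness, kurtosis
```

Two limiting cases are easy to verify by hand: with both intensities set to zero the skewness is 0 and the kurtosis is 3, and for very large $\kappa$ the conditional values approach the unconditional ones.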
0133 4.1089 0. Similarly. Model 2b has the same formulae for higher moments.5523 0.1029 0.8359 0.4214 Theoretical results unconditional conditional 1.0133 4.23: Empirical results vs. But we did not get any useful information about the comparison between Model 1b and Model 2b through computing empirical moments and theoretical moments because the results are quite close to each other.4779 7. 5.1604 0. A QQ plot is a graphical technique for determining if two data .1445 8.3207 3.3809 Table 5.Dev Skewness Kurtosis Empirical results unconditional conditional 0.8359 0.1445 8.5529 0.5544 0.Dev Skewness Kurtosis Table 5.5529 0. Notice that theoretical moments of the twojump version models match empirical results better than the onejump version models.24.1035 0.24: Empirical results vs.4261 Std. theoretical results for Model 1b Empirical results unconditional conditional 0.86 Std.5068 7.4214 Theoretical results unconditional conditional 1.3338 3. we will apply the quantilequantile (QQ) plot to compare Model 1b and Model 2b. theoretical results for Model 2b Empirical results and theoretical results are listed in Table 5.6 Quantile and Quantile plot In this section.1089 0. We report empirical results and theoretical results for this model in Table 5.23.1206 0.
sets come from the same distribution. A QQ plot is a plot of the quantiles¹¹ of the first data set against the quantiles of the second data set superimposed by a 45-degree reference line. If the two sets come from a common distribution, the points should fall approximately along this reference line. The greater the departure from this reference line, the greater the evidence for the conclusion that the two data sets come from populations with different distributions.

We can use a QQ plot to examine whether the one-jump version models or the two-jump version models match the price dynamics better. Let us consider
\[
Y_t := X_t - X_{t-1} + \kappa X_{t-1}\,\Delta t \tag{5.21}
\]
with step size $\Delta t$, where $X_t$ is the log price at time $t$, and $\Delta t = 1$ for the hourly data series. By doing so, we can obtain a sample $Y$ from the historical data. For one-jump version models, we obtain the stationary process
\[
Y_t = \kappa\alpha\,\Delta t + \sigma\,dW_t + Q\,dP(\omega), \tag{5.22}
\]
with the long-term mean $\alpha$ different for Model 1a and Model 2a.¹² Likewise, for two-jump version models, we obtain the stationary process
\[
Y_t = \kappa\alpha\,\Delta t + \sigma\,dW_t + Q_u\,dP_u(\omega_u) + Q_d\,dP_d(\omega_d), \tag{5.23}
\]
with the long-term mean $\alpha$ different for Model 1b and Model 2b.¹³ Furthermore, one may generate random paths from the model (5.22) with the estimates in Table 5.5, and from the model (5.23) with the estimates in Table 5.7.

¹¹ By a quantile, we mean the fraction (or percent) of points below a given value. That is, the 0.3 (or 30%) quantile is the point at which 30% of the data falls below and 70% falls above that value.
¹² Refer back to Equation (3.6) and Equation (3.20).
¹³ Refer back to (3.13) and (3.29).
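The QQ comparison described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for Figure 5.9: the function names are hypothetical, the jump term is approximated by at most one jump per hourly step (a Bernoulli approximation that needs $\omega\,\Delta t < 1$), and jump sizes are assumed to be exponential with mean $\gamma$, pointing upward with probability $\psi$.

```python
import math
import random


def adjusted_increments(log_prices, kappa, dt=1.0):
    """Y_t = X_t - X_{t-1} + kappa * X_{t-1} * dt, as in (5.21)."""
    return [x1 - x0 + kappa * x0 * dt
            for x0, x1 in zip(log_prices, log_prices[1:])]


def simulate_one_jump_y(n, kappa, alpha, sigma, omega, psi, gamma,
                        dt=1.0, seed=0):
    """Draw n increments from the one-jump model (5.22).

    Assumptions: at most one jump per step; exponential jump sizes with
    mean gamma, positive with probability psi.
    """
    rng = random.Random(seed)
    ys = []
    for _ in range(n):
        y = kappa * alpha * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if rng.random() < omega * dt:
            size = rng.expovariate(1.0 / gamma)
            y += size if rng.random() < psi else -size
        ys.append(y)
    return ys


def quantile(sorted_values, p):
    """Linearly interpolated quantile of an already sorted list."""
    pos = p * (len(sorted_values) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac


def qq_points(sample_a, sample_b, n_points=99):
    """Pairs (quantile of a, quantile of b) at evenly spaced levels."""
    a, b = sorted(sample_a), sorted(sample_b)
    probs = [(k + 1) / (n_points + 1) for k in range(n_points)]
    return [(quantile(a, p), quantile(b, p)) for p in probs]
```

Plotting the pairs returned by `qq_points(adjusted_increments(x, kappa), simulate_one_jump_y(...))` against the 45-degree line gives the kind of picture discussed here; points far from the line indicate that the simulated and historical increments differ in distribution.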
Figure 5.9 illustrates QQ plots for the different jump version data versus the adjusted historical data. The horizontal axis shows the quantiles computed from the adjusted historical data by (5.21) while the vertical axis shows the quantiles for one typical simulated path. The points in each QQ plot are close to the reference line, which indicates that the performance of all the models is satisfactory. But it is hard to tell which model is better from the QQ plots since they are quite similar.

5.7 Simulation Study

Using the parameters we obtained in Tables 5.5–5.8, we can simulate price series to compare with the empirical data. We display plots of one typical simulated path (dashed line), and the plot of the actual data (solid line) in Figures 5.10–5.13. The 95% quantile plot and 5% quantile plot are also shown in the same figure. Notice that most of the empirical data except for those extreme "spikes" are inside the simulation range. Furthermore, the histogram of the changes of Hourly EP, and the corresponding distribution of the log returns of the simulated data (an overlaid black line) are shown in Figures 5.14–5.17. All the models underestimate the number of small changes. In this case, it is hard to tell which model performs best.

If we separate the on-peak/off-peak data, then one only needs to fit Model 1a and Model 1b to the data and compare these two models. Using the resulting estimates from fitting Peak EP and Off-peak EP, we can simulate the hourly price series to compare with Hourly EP. Here we display the sample plot of one typical simulated path (dashed line), and the sample plot of Hourly EP (solid line) in Figures 5.18–5.21. The 95% quantile plot
Figure 5.9: QQ Plots. Panels: Model 1a, Model 1b, Model 2a, Model 2b. Horizontal axis: Y Quantiles; vertical axis: Simulated Y Quantiles.
Note: in each QQ plot, the horizontal axis shows the quantiles for the adjusted historical data. The vertical axis shows the quantiles for one typical simulated path for the model whose name is listed in the subtitle.
The deterministic volatility models all performed quite well in various comparisons.08371 0.13465 0.07631 4.63997 0.06480 3. Moreover. It seems that Model 1b does a better job since the moments are closer to the empirical data. Still most of the empirical data except for those extreme “spikes” are inside the simulation range.53402 0. But we failed to apply MLMCCF estimators in stochastic volatility models and we did not get accurate results from SGMM estimation in the time constraint.65594 4.26 list the comparison of the moments of the actual data and the simulated data from Model 1a and Model 1b.67293 0. The histogram of the actual data and the corresponding distribution of the log returns of the simulated data (an overlaid black line) are shown in Figures 5.25.31442 0. 5. a comparison between stochastic volatility models and the other four models (deterministic volatility) are not supplied.Dev Skewness Kurtosis onpeak Model 1a Model 1b 4.8 Conclusion This chapter report comparisons among all the models via diﬀerent means. A simulation study shows that MLCCF estimators are very robust and the estimation procedures are quite accurate.22– 5.25 and Table 5. Table 5. There is only a little bit improvement of the twojump version models over the onejump .48105 3.90 Mean Std.89188 Table 5.04856 4. Therefore.25: Descriptive statistics of Peak EP and simulated paths and 5% quantile plot are also plotted in the same ﬁgure for comparison.
91 Mean Std.13474 3.76840 3. with proper data preparation (such as deseasonalization.20126 2.16340 Table 5.21882 0. and separation of the onpeak data and the oﬀpeak data). we feel that.64445 0.Dev Skewness Kurtosis oﬀpeak Model 1a Model 1b 3.69549 0. As the only extra term in Model 2b is the onpeak/oﬀpeak dummy variables.47128 0. .46474 3.31544 0. Model 1b will suﬃce in mimicking the dynamics in electricity price processes of Alberta.42967 3.26: Descriptive statistics of Peak EP and simulated paths version models.75322 0.
Figure 5.10: Hourly EP superimposed by simulated paths (Model 1a). Legend: Deseasonalized Hourly EP; a typical path of simulated P; 5% percentile; 95% percentile.
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
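One way to produce 5% and 95% curves like those in these figures is to simulate many paths and take pointwise quantiles across them at each hour. The Python sketch below illustrates that idea only; reading "quantile of the simulation" as a pointwise quantile is an assumption, as are the function names, the Euler step with $\Delta t = 1$, the one-jump-per-step approximation, and the exponential jump sizes with mean $\gamma$ (upward with probability $\psi$).

```python
import math
import random


def simulate_log_price_path(n_steps, x0, kappa, alpha, sigma,
                            omega, psi, gamma, rng, dt=1.0):
    """Euler path of a mean-reverting jump-diffusion for the log price.

    Assumptions: at most one jump per step; exponential jump sizes with
    mean gamma, positive with probability psi.
    """
    x = x0
    path = [x0]
    for _ in range(n_steps):
        x += kappa * (alpha - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        if rng.random() < omega * dt:
            size = rng.expovariate(1.0 / gamma)
            x += size if rng.random() < psi else -size
        path.append(x)
    return path


def pointwise_band(paths, lower=0.05, upper=0.95):
    """Lower and upper empirical quantiles across paths at each time index."""
    band = []
    for values in zip(*paths):
        s = sorted(values)
        lo = s[int(math.floor(lower * (len(s) - 1)))]
        hi = s[int(math.ceil(upper * (len(s) - 1)))]
        band.append((lo, hi))
    return band


# Illustrative run with made-up parameters; prices are exp(log price).
rng = random.Random(42)
paths = [simulate_log_price_path(200, 3.5, 0.2, 3.5, 0.1, 0.05, 0.6, 0.4, rng)
         for _ in range(100)]
price_band = [(math.exp(lo), math.exp(hi)) for lo, hi in pointwise_band(paths)]
```

Because the exponential map is monotone, taking quantiles of the log price and then exponentiating gives the same band as taking quantiles of the simulated prices directly.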
Figure 5.11: Hourly EP superimposed by simulated paths (Model 1b). Legend: Deseasonalized Hourly EP; a typical path of simulated P; 5% percentile; 95% percentile.
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
Figure 5.12: Hourly EP superimposed by simulated paths (Model 2a). Legend: Deseasonalized Hourly EP; a typical path of simulated P; 5% percentile; 95% percentile.
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
Figure 5.13: Hourly EP superimposed by simulated paths (Model 2b). Legend: Deseasonalized Hourly EP; a typical path of simulated P; 5% percentile; 95% percentile.
Note: this figure includes the sample plot of the deseasonalized Hourly EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
Figure 5.14: Comparison of simulated price processes with Hourly EP (Model 1a). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Hourly EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model underestimates the number of small changes.

Figure 5.15: Comparison of simulated price processes with Hourly EP (Model 1b). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Hourly EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model underestimates the number of small changes.

Figure 5.16: Comparison of simulated price processes with Hourly EP (Model 2a). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Hourly EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model underestimates the number of small changes.

Figure 5.17: Comparison of simulated price processes with Hourly EP (Model 2b). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Hourly EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model underestimates the number of small changes.
Figure 5.18: Peak EP superimposed by simulated paths (Model 1a). Plot title: Simulated Price and Real Price. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile. Horizontal axis: Time; vertical axis: $/MWh.
Note: this figure includes the sample plot of Peak EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.

Figure 5.19: Peak EP superimposed by simulation paths (Model 1b). Plot title: Simulated Price and Real Price. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile. Horizontal axis: Time; vertical axis: $/MWh.
Note: this figure includes the sample plot of Peak EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.

Figure 5.20: Off-peak EP superimposed by simulation paths (Model 1a). Plot title: Simulated Price and Real Price. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile. Horizontal axis: Time; vertical axis: $/MWh.
Note: this figure includes the sample plot of Off-peak EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.

Figure 5.21: Off-peak EP superimposed by simulation paths (Model 1b). Plot title: Simulated Price and Real Price. Legend: Real P; one path of simulated P; 5% percentile; 95% percentile. Horizontal axis: Time; vertical axis: $/MWh.
Note: this figure includes the sample plot of Off-peak EP, one typical simulated path, the 95% quantile, and 5% quantile of the simulation.
Figure 5.22: Comparison of simulated price processes with Peak EP (Model 1a). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Peak EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model overestimates the number of medium-sized changes.

Figure 5.23: Comparison of simulated price processes with Peak EP (Model 1b). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Peak EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model underestimates the number of small changes and overestimates the medium-sized changes.

Figure 5.24: Comparison of simulated price processes with Off-peak EP (Model 1a). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Off-peak EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model underestimates the number of small changes.

Figure 5.25: Comparison of simulated price processes with Off-peak EP (Model 1b). Horizontal axis: Log Return; vertical axis: Frequency.
Note: the histogram is of the change of the deseasonalized Off-peak EP and the overlaid black line is the corresponding distribution of the log returns of one typical simulated path. Note that this model underestimates the number of small changes.
Chapter 6

Conclusion

Energy commodity markets are growing rapidly as the deregulation and restructuring of the electricity supply industry spreads in North America and around the world. There is a heightened awareness of the need to understand the dynamics of electricity markets for trading electricity, risk management and project pricing. Due to the difficulties in storing electricity, the interplay between supply and demand produces unique electricity price dynamics in each different electricity market. Thus electricity poses the biggest challenge for researchers and practitioners to model its price behaviour among all the energy commodities. Furthermore, because of the relative newness of deregulation, there have been few empirical studies focusing entirely on electricity prices.

In this thesis, we have addressed issues of modeling electricity spot prices in Alberta. We examined a broad class of stochastic models which can be used to model the behaviour of electricity pricing including mean-reversion, stochastic volatility, time-varying mean, as well as multiple jumps. We demonstrated how to get the CCF by means of the transform analysis. We also introduced methods showing how to fit those models via ML-CCF and SGMM estimation to the Alberta hourly electricity prices. Through extensive empirical comparisons among the models, we showed that the two-jump version of mean-reverting jump-diffusion (Model 1b) is generally superior to other models after the proper deseasonalization and separation of the on-peak and off-peak data.
In future research, a number of important issues ought to be considered. Stochastic volatility models should be further considered from a purely statistical perspective. They may perform better if we fully exploit the information of the joint CCF and use other estimation methodologies such as the Bayesian Monte Carlo Markov Chain, efficient method of moments, generalized method of moments, simulated method of moments, Kalman filtering methods, and others.

Affine processes can be applied to multiple types of process such as those with stochastic volatility, and jumps, and more general structures such as multiple latent variables and time-varying jump component, without sacrificing option pricing tractability. However, they have many additional advantages that could be further investigated. In future work, we should be able to compute the prices of various electricity derivatives (options) under the assumed underlying affine jump-diffusion price processes by exploiting the transform analysis when applicable.

Although this thesis concentrated on calibration, generating useful forecasts from jump-diffusion models is difficult. Averaging processes could cause the loss of much important information. As pointed out by Knittel and Roberts (see [26]), "One must simulate a forecasted path because of the model's dependency on random jumps. Of course, there are a continuum of possible paths that may be simulated by, intuitively, flipping a λ-coin each time period, and drawing from the appropriate conditional distribution. Combining many simulated paths averages out the excess variation induced from the jumps, and leaves a very smooth forecast representing a number falling somewhere between the means of the two conditional distributions that make up the mixture." Furthermore, all the models we consider cannot capture unexpected events such
Therefore. . and this could be one of the most interesting area of future research.110 as changes in macroeconomic variables. forced outages of power generation plants or unexpected contingencies in transmission networks and the like. Such unexpected events often result in price processes following completely diﬀerently dynamics. we need to try other types of aﬃne models to get useful forecasts of the spot prices.
The US Power Market. School of Industrial & Systems Engineering. 2001. Bates. Valuation of investment and opportunitytoinvest in power generation assets with spikes in electricity price. Report. [5] D. Report. and K.S.S. Risk. Eydeland and G. 2002. Risk Publications. 1998. Graduate School of Business Stanford University. Singleton. Das. Econometrica (68). 2000. 111 . 1998. e [3] J. 2000. Working Paper. Pricing interest rate derivatives: a general approach. J. Speciﬁcation analysis of aﬃne term structure models. A class of marked point processes for modeling electricity prices. Report. Department of Finance. Georgia Institute of Technology. Tech. Pan. [4] H. Report.L. Geman and A. University of California Energy Institute. Singleton. Empirical Option Pricing: A Retrospection. Pricing power derivatives.J. Puller. [6] D. 2003.J. [8] G. Dai and K. Transform analysis and asset pricing for aﬃne jumpdiﬀusion. Chacko and S. Tech. H´lyette.Bibliography [1] S. Duﬃe. Roncoroni. 2002. NBER Working Paper. Tech. [7] Q. ESSEC Business school. Pricing and ﬁrm conduct in Californias deregulated electricity market. Tech. University of Iowa and the National Bureau of Economic Research. Deng. [2] A.
Job Market Paper. University of Stanford. Foresi.ab. Working Paper.M. 1998. Chacko and L. G. Sick and M. Ph. [14] T. Journal of Econometrics. Harvard University.S. Report. Singleton. [12] K. Daniel. 2001. Review of Derivatives Research. Vol. Report. 1. [10] J. [17] G.D. 1996.112 [9] S. . [15] http://www.energy. Plourde. University of Alberta and University of Adelaide. 2001. [11] P. Electricity Industry Restructuring: The Alberta Experience. University of Oregon. Stein. School of Business. Das and S. Pricing electricity derivatives under alternative stochastic spot price models.gov. Graduate School of Business Administration. University Carlos III Madrid. 2001. Doucet and A. Spectral GMM estimation of continuoustime processes. Pricing power derivatives: a twofactor jumpdiﬀusion approach. Elliott. University of Alberta. Tech. [13] G. J. Tech.htm [16] R. University of Calgary. Lee and B.J. Stochastic Financial Models for Electricity Derivatives. Report. Tech. Deng.ca/com/Electricity/Introduction/Electricity. University of California Energy Institute. 2003. 2000. Stanford University and NBER. Exact solutions for bond and option prices with systematic jump risk. Viceria. Pricing electricity calls. Jr. 1999. Estimation of aﬃne asset pricing models using the empirical characteristic function. Thesis. Villaplana.
New York. Yousef. Knittel and M. Tech. Ball and W. Schwartz. Report. Derivation and Empirical Testing of Alternative Pricing Models in Alberta’s Electricity Market.J. Tech. Strickland. o Federal Reserve Board.S. Roberts. [21] Y. Thesis.113 [18] L. Electricity prices and power derivatives: evidence from the Nordic Power Exchange. University of California Energy Institute. 1997. No. University of Calgary. [19] P. An empirical examination of deregulated electricity prices. [23] L. . Energy Risk: McGrawHill. Energy Derivatives: Pricing and Risk Management. [24] J. LACIMA Publications. Valuing and Managing Energy Derivatives.R. A¨ ıtSahalia. Vol. 2000. 2002. 2001. Maximum likelihood estimation of discretely sampled diﬀusions: A closedform approximation approach. Econometrica. A simpliﬁed jump process for common stock returns.R. 1998. Zhou. 2001. 1998. Working Paper.1. John Wiley and Sons. [27] C. 1983. Torous. Pricing and Risk. Deng. 2001. JumpDiﬀusion term structure and Itˆ conditional moment generator.S. Review of Derivatives Research (5). Journal of Financial and Quantitative Analysis (18). Stochastic models of energy commodity prices: Meanreversion with jumps and spikes. Pilipovich.N. [26] C. Report. Wilmott. 2002. [25] H. Derivatives. [20] D.L. Clewlow and C. [22] J. University of California Energy Institute.70. Lucia and E.
Oxford. Princeton University Press. Renault.. Option valuation and hedging strategies with jumps in the volatility of asset returns. Report. Florens. Maximum likelihood estimation of generalized Ito processes with discretely sampled data. Estimation of a mixture via the empirical characteristic function. [38] N. Econometric Theory (4). Time Series Analysis. 1978. Journal of Finance (48). [36] V. [35] T. 1995. [33] J. Ph. Econometrica. [37] E. Nuﬃeld College. and E. University of Aarhus. Naik. .D.114 [28] S. Generalized autoregressive conditional heteroskedasticity. 1988. The University of British Columbia. center for analytical e ﬁnance. 1998. Honor´. Journal of Econometrics (31). [34] D. Discrete parameter variation: Eﬃcient estimation of a switching regression model. Thesis. Statistical aspects of ARCH and stochastic volatility. Journal of Financial and Quantitative Analysis (16). Working Paper Series No.R. Hamilton. Shephard.18. Smith. 1994.P. [29] M. [30] P. A. Kiefer. Carrasco and J. Tech. Manuscript. [32] N. 1993. Essays in Empirical Asset Pricing. vol. Beckers. [31] A. Ghysels. Pitfalls in estimation jumpdiﬀusion models. A note on estimating the parameters of the diﬀusionjump model of stock returns. 1981. Lo. Harvey. Bollerslev. 14. Handbook of Statistics.D. 2000. 2002. 1995.M. 1986.
RISK. Stirzaker. Oxford Science Publications. Financial Modeling with Jump Processes. Li and A.E. Number 3. 1999. Risk Publication. Journal of Computational Finance. The Central Research Institute of Electric Power Industry in Japan and the Dice Center for Financial Economics. 2nd Edition. 2003. [45] B. Tankov. Vol. New Jersey. 1995. Estimation of continuous time processes via the empirical characteristic function. Stochastic Integration and Diﬀerential Equations. 1994.J. Vol. Reconstructing the unknown volatility function. Coleman. 2. 2004. Prentice Hall. Kaminske. 7. Goto and G. Manuscript. The US Power Market.. Robert and T. 1982. Protter. Cont and P.R. 2001. [47] B. Englewood Cliﬀs.R. [41] M. Verma. [42] V. CNRS. 2nd Edition. Cont.A.L. The challenge of pricing and risk managing electricity derivatives. Fifth Edition. Dupire. Pricing with a smile. Tech. Report. Hamida and R. Knight. Understanding Electricity Price Volatility Within and Across Markets. Probability and Random Processes. Karolyi. [40] V. [44] G. . Introduction to Mathematical Statistics. Grimmett and D. SpringerVerlag. Journal of Business and Economic Statistics. Allen. Chapman and Hall/CRC. [43] G. 1997. 2003.115 [39] P. [48] R. Jiang and J. [46] T. Recovering volatility from option prices by evolutionary optimization. 2004.
[49] L. Matyas. Generalized Method of Moments Estimation. Cambridge, 1999.
[50] L. Hansen. Large sample properties of generalized method of moments estimators. Econometrica, Vol. 50, 1982.
[51] G. Schwarz. Estimating the dimension of a model. Annals of Statistics (6), 1978.
[52] S.R. Das. The surprise element: jumps in interest rates. Tech. Report, Santa Clara University, 2000.