
Comparison of static and adaptive models for short-term residential natural gas forecasting in Croatia

Primož Potočnik a,*, Božidar Soldo b, Goran Šimunović c, Tomislav Šarić c, Andrej Jeromen a,
Edvard Govekar a

a University of Ljubljana, Faculty of Mechanical Engineering, SI-1000 Ljubljana, Slovenia
b HEP-Plin Ltd., HR-31000 Osijek, Croatia
c University of Osijek, Faculty of Mechanical Engineering Slavonski Brod, HR-35000 Slavonski Brod, Croatia

Abstract
In this paper the performance of static and adaptive models for short-term natural gas
load forecasting has been investigated. The study is based on two sets of data, i.e. natural gas
consumption data for an individual model house, and natural gas consumption data for a local
distribution company. Various forecasting models including linear models, neural network
models, and support vector regression models, were constructed for the one day ahead
forecasting of natural gas demand. The models were examined in their static versions, and in
adaptive versions. A cross-validation approach was applied in order to estimate the
generalization performance of the examined forecasting models. Compared to the static model
performance, the results confirmed the significantly improved forecasting performance of
adaptive models in the case of the local distribution company, whereas, as was expected, the
forecasts made in the case of the individual house were not improved by the adaptive models,
due to the stationary regime of the latter’s heating. The results also revealed that nonlinear
models do not outperform linear models in terms of generalization performance. In summary,
if the relevant inputs are properly selected, adaptive linear models are recommended for
applications in daily natural gas consumption forecasting.

Keywords: Short-term natural gas demand; Adaptive forecasting models; Linear forecasting
models; Nonlinear forecasting models.

Highlights
• Comparisons of static and adaptive models for natural gas consumption forecasting.
• Comparisons of linear and nonlinear forecasting models (ARX, neural networks, SVM).
• Relevant inputs to forecasting models are determined by stepwise regression.
• Methods are applied to local distribution company and individual house data sets.
• The best results are obtained by using linear adaptive forecasting models.

* Corresponding author. Tel.: +386-1-4771-167; fax: +386-1-2518-567; E-mail address: primoz.potocnik@fs.uni-lj.si

1. Introduction
Planning and forecasting of natural gas consumption have become vital for ensuring
the stability of distribution systems. Consequently, natural gas consumption forecasting has
recently been widely discussed and has been investigated at many
different levels: at the world level [1,2], at the national level [3-7], in both the industrial [8,9] and
residential sectors [10-14], at the level of gas distribution systems [15,16], in both the
commercial and residential sectors, and finally, at the level of individual customers [17,18].
Demand and production have been investigated by means of various forecasting tools, using
various techniques, the forecasting horizons varying from a few hours ahead to a few decades
ahead [19]. Many different tools have been applied in this area, such as the autoregressive
integrated moving average model [20], support vector regression [21], neural networks [22],
and an adaptive network-based fuzzy inference system [23]. A recent broad overview of the
various approaches which have so far been used in the field of natural gas consumption
forecasting was given by Soldo [24].
Natural gas transmission system operators commonly utilize natural gas consumption
forecasting models as supply-demand balancing tools. This area is usually regulated by the
government, or by natural gas supply contracts. These regulations usually require the
forecasting of future consumption within a defined tolerance range. Otherwise, a penalty
system is applied, as for example in the case of Slovenia [25]. For this reason, local
distribution companies need accurate forecasting models. This is also the case in Croatia,
where natural gas regulations currently anticipate a future penalty system for short-term
natural gas load forecasting. HEP-Plin Ltd., Croatia’s local distribution company, will
therefore need an online application for one day ahead forecasting of natural gas demand.
The existing literature on natural gas demand forecasting provides only a few
comparative studies analysing the performance of different forecasting models. Comparisons
between several mathematical models for natural gas consumption forecasting have been
presented in Sabo et al. [16]. The presented results were obtained with three different models:
the Gompertz model function, FTW (Fermat–Torricelli–Weber approach) estimation, and a linear
model function using outside temperature and past consumption as input data. The
results show that the most acceptable forecast is provided by mathematical models in which
natural gas consumption and temperature are related explicitly.
A comparison of nonlinear mixed effect models with various autoregressive
models with exogenous inputs was presented by Brabec et al. [17]. The authors presented
natural gas consumption forecasting results for 62 individual customers. Results were
obtained with NLME (nonlinear mixed effect models) and compared with traditional ARX
(auto-regressive model with exogenous inputs) and ARIMAX (auto-regressive integrated
moving-average model with exogenous inputs) models. The authors observed that the NLME
approach and the individual time series approaches (ARIMAX, ARX) all have their merits, and
a comparison using the Diebold-Mariano test showed no clear winner.
A comparison between seasonal autoregressive integrated moving average models with
exogenous inputs (SARIMAX), neural network models (ANN) and ordinary least squares
regression models (OLS) was examined by Taşpinar et al. [26]. The authors used four years of
consumption data from one region in Turkey and compared the results obtained with the mentioned
models. Their overall conclusion was that a time series forecaster (SARIMAX) for the prediction of

daily natural gas consumption in residences performs better than ANN approaches. This study
also indicated that local daily natural gas consumption can be forecast successfully in the
short run when meteorological parameters are included.
Furthermore, several papers have presented research results based on only one
forecasting model, for example, an autoregressive integrated moving average model
(ARIMA) [6], that has been applied to estimate the natural gas demand in Turkey. A
statistical approach based on nonlinear regression principles has been proposed in [18]. The
approach is based on customer segmentation into various types and is suitable for natural gas
consumption estimation of individual residential and small commercial customers.
A neural network model with different training algorithms was studied by Kizilaslan
[22]. The models were developed for forecasting natural gas consumption for residential and
commercial consumers in Istanbul, Turkey. The results showed that various artificial neural
network models could be useful for natural gas consumption forecasting, and the
conjugate gradient descent neural network was found to be the most efficient solution.
Azadeh et al. [23] developed an adaptive network-based fuzzy inference system
(ANFIS) for estimation of natural gas demand and reported good results on data representing
Iranian natural gas consumption. The proposed approach is capable of handling non-linearity,
complexity as well as uncertainty that may exist in actual data sets due to erratic responses
and measurement errors. The input variables used included the day of the week, the demand on the
same day in the previous year, the demand one day before, and the demand two days before, but
weather-related parameters, including temperature, were not considered.
Consequently it is not possible, from the existing literature, to obtain clear answers to
the following questions: Which input variables and extracted features are the most relevant for
the construction of forecasting models? What are the required embedding dimensions of the
inputs? Which model is the most suitable for use as a forecasting tool in real-world short-term
natural gas forecasting applications? Should the model be linear or nonlinear? Do adaptive
models outperform non-adaptive models?
The aim of this paper is to provide new answers to these questions by constructing and
comparing a broad range of natural gas consumption forecasting models with a one day ahead
forecasting horizon. The comparative study includes linear modelling approaches (stepwise
regression, auto-regressive model with exogenous inputs, adaptive auto-regressive model with
exogenous inputs), as well as nonlinear modelling approaches (neural networks, adaptive
neural networks, support vector regression), and brings new insights regarding the
performance of static and adaptive versions of the forecasting models. The forecasting models
are applied to two natural gas consumption systems on different scales: an individual house,
and the residential sector of a local distribution company. A successful natural gas forecasting
solution targeted for real-world applications depends on the synergy of many factors,
including feature extraction, feature selection, data pre-processing, model selection with
respect to model order, and the selection of linear/nonlinear and static/adaptive model
structures, as well as learning approaches and applied training/validating procedures. The
novelty of this approach is thus in demonstrating that a successful forecasting solution can be
obtained by a balanced combination of feature extraction, feature selection, and the
construction of a simple linear model with an adaptation mechanism. The performed research
thus makes a contribution by providing a broad comparison between the various established
forecasting models for short-term natural gas forecasting, and a novel combination of several
known methods for building forecasting solutions. The conclusions of the paper provide
answers regarding the appropriate selection of inputs, feature extraction, the choice of a
suitable model structure (linear/nonlinear), the recommended training approach
(static/adaptive), and the required model order, all of which are relevant for efficient short-
term natural gas forecasting. The results of the research are directly applicable to the
preparation of forecasting solutions for the natural gas market.
The paper is organized as follows: the aims of the research are described in the next
section. The “Data sets” section describes the weather and natural gas consumption data
collection and preparation process, presents the basic relations between the collected data, and
describes extraction of additional features relevant for the forecasting task. The forecasting
models applied in this study are introduced in the “Forecasting models” section, whereas in
the section “Selection of relevant inputs” the stepwise regression based selection of relevant
inputs is described. The “Forecasting framework” section presents the formulation of the
forecasting task, the performance measures, and the cross-validation procedure, and describes
the static vs. adaptive models testing procedure. Comparisons and evaluations of the
forecasting results are presented in the “Results” section. The key findings of the research are
summarized in the “Conclusions” section.

2. Research aims
HEP-Plin Ltd. is a local distribution company (LDC), which is obliged to forecast and
nominate the amount of natural gas that will be consumed in the following day. The company
needs an accurate forecasting tool in order to avoid the anticipated penalty rule. The first task
of this research was to develop forecasting models for short-term natural gas demand
forecasting, with emphasis on the static or adaptive versions of the forecasting models. The
second task was to investigate a wide range of forecasting model structures, including linear
models, neural networks, and support vector regression models. In order to achieve the given
objectives, the following forecasting models were examined:
• Benchmark models: random walk model and temperature correlation model.
• Linear models: stepwise regression model, an auto-regressive model with
exogenous inputs, and an adaptive linear auto-regressive model with exogenous
inputs.
• Nonlinear models: neural networks, an adaptive neural network, and support vector
regression.
The main aim of this paper was to select the best forecasting model with respect to
generalization performance as evaluated by cross-validation on the testing data. The
forecasting horizon was one day ahead which corresponds to the anticipated forecasting
regulations in Croatia.

3. Data sets
This study is based on two different sets of natural gas consumption data and
corresponding weather data. In order to study the natural gas consumption on two different
scales, the first set of natural gas consumption data was acquired from a model house, and the
second set from the local distribution company. The objective of including the individual
house data is also to form a basis for comparison of static and adaptive models. For the
individual house, which was not modified during the data acquisition period, the adaptive model
is not expected to yield improved results, therefore this represents a suitable basis from which
to evaluate models for a larger distribution system. The measurements of natural gas
consumption and meteorological data were carried out through two heating seasons,
2011/2012 and 2012/2013, as follows:
• Season 1: from 5th November 2011 until 26th April 2012,
• Season 2: from 9th November 2012 until 31st March 2013.

3.1. Weather data


Weather data were collected at a weather station with hourly readings, located at
N 45° 41.170', E 18° 24.200'. Temperature and solar radiation were both included as potential input
variables in our research. The correlation coefficient between average daily temperature and solar
radiation is low (R = 0.462), which indicates that the two variables carry different kinds of
information that may be relevant for natural gas consumption.

3.2. Individual house data


A model house that represents a typical individual residential building was used for the
research. The house is located at N 45° 39.720', E 18° 25.970'. It has only one storey, and the
natural gas heated space covers an area of approximately 100 m². The living space is heated to
22 °C for 24 hours a day, every day, during the whole heating season. Natural gas data are
measured by a natural gas meter with hourly readings.

3.3. Local distribution company data


In the observed natural gas distribution area, the local distribution company delivers
natural gas to two small towns and two villages, with a total of 4314 customers. The model
house described above is one of these customers. There are only twelve customers
with technological natural gas consumption in a production process, whereas the other
residential, small commercial and industrial customers use natural gas for space heating.
Local distribution company consumption data are collected from natural gas meters with
hourly readings.

3.4. Preparing the data for analysis


The weather data, the model house (HOUSE) data, and the local distribution company
(LDC) data, collected as described above, were combined into a database suitable for
subsequent analysis. The variables were re-sampled from hourly values into daily values as
follows:
– the natural gas consumption and solar radiation values were expressed as daily
sums,
– the outside temperature was expressed as a daily average.
The combined data, including the HOUSE and LDC cumulative daily natural gas
consumption, the average daily outside temperature, and the cumulative daily solar radiation,
are presented in Fig. 1.
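
The re-sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `to_daily`, the requirement that the hourly series starts at midnight and has no missing hours, and the synthetic values are all hypothetical and are not taken from the study.

```python
from statistics import mean

def to_daily(hourly, how):
    """Aggregate a flat list of hourly readings into one value per day.

    Assumption: the series starts at midnight and contains no missing hours.
    """
    if len(hourly) % 24 != 0:
        raise ValueError("expected a whole number of 24-hour days")
    reducers = {"sum": sum, "mean": mean}
    days = [hourly[i:i + 24] for i in range(0, len(hourly), 24)]
    return [reducers[how](day) for day in days]

# Two synthetic days of hourly readings (illustrative values only).
gas_hourly = [1.0] * 24 + [2.0] * 24
temp_hourly = [float(h % 24) for h in range(48)]

gas_daily = to_daily(gas_hourly, "sum")     # daily sums (consumption, solar radiation)
temp_daily = to_daily(temp_hourly, "mean")  # daily averages (outside temperature)
```

Real meter data would additionally need gap handling and time-zone alignment, which this sketch leaves out.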

Fig. 1. HOUSE and LDC data (natural gas consumption, average daily outside temperature, and cumulative daily
solar radiation) for the 2011/2012 and 2012/2013 heating seasons.

3.5. The relationship between the natural gas consumption and the outside temperature
With regard to the effect of the weather on residential natural gas consumption, the
outside temperature has the strongest influence, as has been previously shown by numerous
authors [12,16,17]. As shown in Fig. 2, natural gas consumption decreases as the outside
temperature rises, and the outside temperature exhibits high correlations with both the HOUSE and the
LDC data. The correlation coefficients are R = -0.895 for the HOUSE data, and R = -0.939 for
the LDC data. Consequently, the outside temperature is considered as one of the most
informative inputs to the forecasting models.
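
For readers who want to reproduce this kind of check, the correlation coefficient R can be computed as the Pearson correlation between the two daily series. The sketch below uses made-up numbers, not the HOUSE or LDC data, and the function name `pearson_r` is an illustrative choice.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Synthetic example: consumption falls as temperature rises.
temperature = [-5.0, 0.0, 5.0, 10.0, 15.0]
consumption = [30.0, 24.0, 21.0, 13.0, 9.0]
r = pearson_r(temperature, consumption)  # strongly negative
```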

Fig. 2. Correlation of the HOUSE and LDC natural gas consumption with the average daily outside temperature.

3.6. Comparison of the HOUSE and LDC data
The comparison of HOUSE and LDC natural gas consumption data, which is presented
in Fig. 3, shows a slightly nonlinear relationship between the two data sets. Compared to the
HOUSE data, the LDC system exhibits increased consumption at low outside temperatures,
and the correlation coefficient between the two data sets amounts to R = 0.926. Based on the
results shown in Fig. 3 it can be assumed that the HOUSE and LDC data sets represent two
slightly different dynamic processes. Both data sets are included in our study as
representatives of an individual natural gas consumer and a regional natural gas distributor.

Fig. 3. Comparison of HOUSE and LDC natural gas consumption data.

3.7. Feature extraction


Besides the original data presented above, two additional features with a potential to
improve natural gas consumption forecasting were extracted from the hourly weather data as
follows:
Tmin: minimum daily temperature
SRmax: maximum daily solar radiation
Since residential natural gas consumption is influenced by the repetitive behaviour
patterns of residents, additional features describing population dynamics can be extracted,
such as days of the week and holidays. For the purpose of this study, the information about
holidays was implicitly encoded in the day of the week by encoding holidays as Sundays. Day
of the week was represented through a normalized weekly gas consumption index (WGCI)
that was extracted through the normalization of daily consumption data y(t) by the mean of
the current week:
\[ y_w(t) = \frac{y(t)}{\frac{1}{7}\sum_{k=-3}^{3} y(t+k)} \tag{1} \]

When the weekly-normalized daily consumption data yw(t) have been collected for each
day of the week (Monday, Tuesday, ...), a weekly gas consumption index (WGCI), with
confidence intervals, can be obtained. Fig. 4 shows the WGCI profiles for the HOUSE and
LDC data. An analysis of variance (ANOVA) of the WGCI profile for different days of the
week does not indicate significant differences for the HOUSE data (p = 0.358), but more
pronounced differences were obtained in the case of the LDC data set (p = 0.090). For this
reason the WGCI information was included in further modelling analysis. The proposed
WGCI profile reduces the dimensionality of the input data to only one input, instead of seven dummy
variables (one for each day of the week) and additional holiday indicators. The WGCI already
captures the likely nonlinear gas consumption response for different days of the week, so it
can be effectively applied as an input to models that are otherwise linear in their parameters.
The WGCI is also directly applicable to online forecasting solutions. It can be calculated
offline on available past data and then applied in the forecasting solution.
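
A minimal sketch of the WGCI computation follows, assuming the index for a given day of the week is simply the average of the weekly-normalized values yw(t) from Eq. (1) over all days of that type. The function names, the weekday coding (0 = Monday, ..., 6 = Sunday) and the synthetic data are assumptions made for illustration. Holidays would be handled by recoding them as Sundays in the `weekday` list before the call.

```python
from collections import defaultdict

def weekly_normalized(y):
    """Return {t: y_w(t)} for every day with a full centred 7-day window, Eq. (1)."""
    out = {}
    for t in range(3, len(y) - 3):
        window_mean = sum(y[t - 3:t + 4]) / 7.0
        out[t] = y[t] / window_mean
    return out

def wgci_profile(y, weekday):
    """Average y_w(t) per day of the week (0 = Monday, ..., 6 = Sunday)."""
    groups = defaultdict(list)
    for t, yw in weekly_normalized(y).items():
        groups[weekday[t]].append(yw)
    return {day: sum(v) / len(v) for day, v in sorted(groups.items())}

# Synthetic four-week series: lower consumption on Saturdays and Sundays.
weekday = [t % 7 for t in range(28)]
y = [8.0 if d >= 5 else 10.0 for d in weekday]
profile = wgci_profile(y, weekday)  # one index value per day of the week
```

Because the profile is computed once on past data, an online forecaster only needs a lookup by day of the week at prediction time.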

Fig. 4. The influence of population dynamics expressed by means of a weekly gas consumption index (WGCI)
for HOUSE and LDC natural gas consumption.

3.8. Complete data for analysis


The complete set of variables and extracted features applied in the forecasting study are
summarized in Table 1. Symbol t denotes time in daily resolution.

Table 1: Description of the variables used in the forecasting analysis.

Symbol Description
y(t) cumulative daily natural gas consumption
T(t) average daily temperature
SR(t) cumulative daily solar radiation
Tmin(t) minimum daily temperature
SRmax(t) maximum daily solar radiation
WGCI(t) weekly gas consumption index

4. Forecasting models
In this section the various forecasting models applied in this study are introduced. The
models were examined with the aim of finding the most suitable structure for the one day
ahead forecasting of residential natural gas consumption. Benchmark models are presented for
comparison only, and include the random-walk (RW) model and the temperature correlation
(TC) model. Linear models include the stepwise regression method, and auto-regressive
models with exogenous inputs (ARX) of various model orders. In addition to linear models,
nonlinear modelling approaches are considered by including neural network (NN) models and
support vector regression (SVR). Beside the model structures, the research described in this
paper was focused on the question of using static or adaptive versions of forecasting models.

Consequently, various adaptive versions of the above-mentioned forecasting models were also
examined; these were adaptive auto-regressive models with exogenous inputs (RARX), and
an adaptive neural network (RNN).

4.1. RW model
The random-walk (RW) model predictor y(t+1) derives its value from past natural gas
consumption y(t), with e(t) denoting noise and t the arbitrary time in daily resolution:

\[ y(t+1) = y(t) + e(t+1) \tag{2} \]

The random walk model is considered only as a baseline against which the other, more elaborate
models are evaluated, as recommended in [34], where the RW model is implicitly included in the proposed
mean absolute scaled error measure.

4.2. TC model
The temperature correlation model (TC) correlates the natural gas consumption y(t+1)
with the average daily temperature T(t+1):

\[ y(t+1) = b_0 + b_1 T(t+1) + e(t+1) \tag{3} \]

The motivation for this model is the strong negative correlation between daily outside
temperature and natural gas consumption, as described in section 3.5.
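
The two benchmark models of Sections 4.1 and 4.2 can be sketched as follows. The ordinary least squares fit of b0 and b1 is an assumption (the text does not state how the TC coefficients were estimated), and the function names and data are illustrative only.

```python
def rw_forecast(y):
    """Random-walk forecasts, Eq. (2) without noise: the forecast for day t+1 is y(t)."""
    return y[:-1]  # forecasts for days 1 .. n-1

def fit_tc(T, y):
    """Least squares fit of y = b0 + b1 * T, Eq. (3); returns (b0, b1)."""
    n = len(T)
    mT, my = sum(T) / n, sum(y) / n
    b1 = sum((a - mT) * (b - my) for a, b in zip(T, y)) / sum((a - mT) ** 2 for a in T)
    b0 = my - b1 * mT
    return b0, b1

# Synthetic data lying exactly on the line y = 25 - T.
T = [-5.0, 0.0, 5.0, 10.0]
y = [30.0, 25.0, 20.0, 15.0]
b0, b1 = fit_tc(T, y)
tc_next = b0 + b1 * 2.0  # forecast for a day with an average temperature of 2 °C
rw_next = rw_forecast(y)
```

In an online setting the temperature T(t+1) passed to the TC model would itself be a weather forecast.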

4.3. Stepwise regression


The stepwise regression model in this study is a linear regression model which is
constructed by iteratively adding and removing terms from a multi-linear model, based on
their statistical significance in a regression [27]. The method begins with an initial model, and
then compares the explanatory power of incrementally larger and smaller models. At each
step, the p-value of an F-statistic is computed in order to test models both with and without a
potential input. Tested inputs are iteratively added or removed from the model until the
procedure converges to a locally optimal forecasting model with statistically significant input
variables. The method also resolves the collinearity problem by reducing the available set of
inputs to the relevant ones.

4.4. ARX model


The linear auto-regressive model with exogenous inputs (ARX) is defined by the
equation:
\[ A(q)\,y(t) = B(q)\,x(t) + e(t) \tag{4} \]

where A(q) and B(q) denote polynomials with respect to the time-shift operator q:
$A(q) = 1 + a_1 q^{-1} + \ldots + a_M q^{-M}$, and $B(q) = b_1 + b_2 q^{-1} + \ldots + b_M q^{-M+1}$. Various model orders M were
tested in order to find the best compromise between the model’s accuracy and complexity.

4.5. RARX model


The RARX model denotes an adaptive (recursive) linear auto-regressive model with
exogenous inputs of order M. The model has the same form as the ARX model (4) but

additionally uses an online adaptation mechanism with exponential forgetting defined by a
forgetting factor λ. A description of this model can be found in [28] and the algorithm is also
known as recursive least squares (RLS). The forgetting factor algorithm can be described by
the following equations. The change of the model parameter estimate $\hat{\theta}(t)$ at time t is
obtained by multiplying the forecasting error $y(t) - \hat{y}(t)$ by the gain K(t):

\[ \hat{\theta}(t) = \hat{\theta}(t-1) + K(t)\,\bigl( y(t) - \hat{y}(t) \bigr) \tag{5} \]

Gain K(t) is expressed as:

\[ K(t) = Q(t)\,\psi(t) \tag{6} \]

where ψ(t) represents the gradient of the model output $\hat{y}(t|\theta)$ with respect to the parameters
θ and Q(t) is defined as:

\[ Q(t) = \frac{Q(t-1)}{\lambda + \psi(t)^{T}\,Q(t-1)\,\psi(t)} \tag{7} \]

The actual value of Q(t) is obtained by the minimization of the following function at time t:

\[ \sum_{k=1}^{t} \lambda^{t-k} e^{2}(k) \tag{8} \]

This approach discounts old measurements exponentially such that an observation that
is τ samples old carries a weight that is equal to $\lambda^{\tau}$ times the weight of the most recent
observation. τ = 1/(1−λ) represents the memory horizon of this algorithm. Measurements older
than τ = 1/(1−λ) typically carry a weight that is less than about 0.3. λ is called the forgetting
factor and typically has a positive value between 0.97 and 0.995. In this study, the best results
were obtained by using a forgetting factor of λ = 0.98.
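
As an illustration of the adaptation mechanism, the sketch below implements one recursive least squares step with exponential forgetting for a model that is linear in its parameters, so that ψ(t) is simply the regressor vector. It is a sketch under assumptions, not the authors' implementation: the matrix `P` plays the role of Q(t−1) in the numerator of Eq. (7), and its own update is the standard RLS covariance recursion, which the text does not spell out. The initial value 1000·I and the synthetic data are illustrative. With λ = 0.98 the memory horizon is τ = 1/(1 − 0.98) = 50 days.

```python
def rls_step(theta, P, psi, y, lam=0.98):
    """One RLS update with forgetting factor lam; returns (theta, P, y_hat)."""
    n = len(theta)
    y_hat = sum(th * p for th, p in zip(theta, psi))       # forecast before the update
    P_psi = [sum(P[i][j] * psi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(psi[i] * P_psi[i] for i in range(n))  # lam + psi^T P psi
    K = [v / denom for v in P_psi]                          # gain, cf. Eqs. (6)-(7)
    err = y - y_hat
    theta = [th + k * err for th, k in zip(theta, K)]       # parameter update, Eq. (5)
    # Standard covariance recursion (assumes P is symmetric).
    P = [[(P[i][j] - K[i] * P_psi[j]) / lam for j in range(n)] for i in range(n)]
    return theta, P, y_hat

# Synthetic noise-free data generated by y = 2*u + 1, regressor psi = [u, 1].
theta = [0.0, 0.0]
P = [[1000.0, 0.0], [0.0, 1000.0]]
for t in range(100):
    u = float(t % 10)
    theta, P, _ = rls_step(theta, P, [u, 1.0], 2.0 * u + 1.0)
```

After the loop, `theta` should be close to the generating values 2 and 1. A smaller λ shortens the memory horizon and makes the estimate react faster to changes, at the cost of noisier parameters.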

4.6. NN models
Various Neural Network (NN) models [29] were examined in order to estimate the
benefits of nonlinear model structures compared to linear models. NN models were defined as
feed-forward networks containing one hidden layer with sigmoid neurons and an output layer
with a linear activation function. The number of hidden neurons L was an open design
parameter, and several NN architectures were examined to find a suitable network
architecture with L = 5 hidden neurons (increasing the number of neurons did not improve the
generalization capability). An output of a NN can be represented by a generic expression of
the inputs u, e.g. for a network with a K-dimensional input $u = \{u_1, u_2, \ldots, u_K\}$, L neurons in
the single hidden layer, and a single output, the model can be described as:

\[ y(t+1) = F_o\!\left( \sum_{j=0}^{L} w_j\, F_h\!\left( \sum_{i=0}^{K} w_{ji}\, u_i \right) \right) + e(t+1) \tag{9} \]

$F_o$ and $F_h$ represent linear output and sigmoid hidden layer activation functions. The NN
models were trained by the Levenberg–Marquardt algorithm with early stopping based on
internal validation on a training data set.
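
The forward pass of Eq. (9) can be written out as a short sketch. Two points are assumptions rather than statements from the text: the index 0 terms in both sums are treated as bias terms (a constant input of 1), and the logistic function is used as the sigmoid. The weights below are arbitrary placeholders, not trained values.

```python
from math import exp

def sigmoid(z):
    """Logistic sigmoid, used here as the hidden layer activation F_h."""
    return 1.0 / (1.0 + exp(-z))

def nn_forward(u, W_hidden, w_out):
    """One-hidden-layer network with a linear output, cf. Eq. (9).

    u:        K input values
    W_hidden: L rows of K+1 weights each (bias weight first)
    w_out:    L+1 output weights (bias weight first)
    """
    u_ext = [1.0] + list(u)
    hidden = [sigmoid(sum(w * v for w, v in zip(row, u_ext))) for row in W_hidden]
    h_ext = [1.0] + hidden
    return sum(w * h for w, h in zip(w_out, h_ext))  # F_o is the identity

# L = 5 hidden neurons, K = 3 inputs, placeholder weights.
W_hidden = [[0.0] * 4 for _ in range(5)]
w_out = [1.0] + [2.0] * 5
y_next = nn_forward([0.3, -1.2, 0.8], W_hidden, w_out)
```

With all hidden weights set to zero every hidden neuron outputs 0.5, which makes the result easy to check by hand.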

4.7. RNN model
The RNN model denotes an adaptive (recursive) neural network model with the same
architecture as the described NN model, but with additional online adaptation capability.
Online adaptation was accomplished by a momentum back-propagation algorithm defined by
a learning rate η = 0.01, and a momentum constant α = 0.1.
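
A single momentum update with the stated constants can be sketched as below. The classical form Δw(t) = α·Δw(t−1) − η·g(t) is assumed here; other momentum conventions exist, and the text does not specify which variant was used. The gradient values are placeholders.

```python
def momentum_step(w, grad, velocity, eta=0.01, alpha=0.1):
    """One gradient step with momentum; returns (new_w, new_velocity)."""
    new_velocity = [alpha * v - eta * g for v, g in zip(velocity, grad)]
    new_w = [wi + dv for wi, dv in zip(w, new_velocity)]
    return new_w, new_velocity

# Two consecutive updates of a single weight with the same gradient.
w, vel = [1.0], [0.0]
w, vel = momentum_step(w, [2.0], vel)
w, vel = momentum_step(w, [2.0], vel)
```

In the adaptive setting one such update would be applied to all network weights after each new daily observation.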

4.8. SVR model


Support Vector Regression (SVR) is a regression formulation of the Support Vector
Machines (SVM) proposed in [30]. The theory is well established and is explained in several
excellent works [31−33]. The SVR model can be represented by a linear combination of N
kernel functions denoted by $\phi_i$, weights $w_i$, and bias b:

\[ y(t+1) = \sum_{i=0}^{N} w_i\, \phi_i(u_i) + b \tag{10} \]

In this study, the radial basis function (RBF) kernel was applied in the SVR
formulation. The calculation of a SVR solution depends on a generalization parameter C, and
a kernel parameter γ. The optimal values of both parameters were determined numerically by
a cross-validation method over a wide range of possible values as follows:
HOUSE data: $C = 1.94 \times 10^{4}$ and $\gamma = 3.97 \times 10^{-6}$
LDC data: $C = 6.65 \times 10^{8}$ and $\gamma = 1.13 \times 10^{-4}$
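
To make the role of the kernel parameter γ concrete, the sketch below evaluates a kernel expansion of the kind shown in Eq. (10) with an RBF kernel. The parameterization k(u, v) = exp(−γ‖u − v‖²) is an assumption (it is the convention used by common SVM libraries), and the support vectors, weights and bias are invented for illustration rather than obtained by training.

```python
from math import exp

def rbf_kernel(u, v, gamma):
    """RBF kernel exp(-gamma * ||u - v||^2)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svr_predict(u, support_vectors, weights, bias, gamma):
    """Weighted sum of kernels centred on the support vectors, plus a bias."""
    return sum(w * rbf_kernel(u, sv, gamma)
               for w, sv in zip(weights, support_vectors)) + bias

# Hypothetical trained model with two support vectors.
support_vectors = [[0.0], [10.0]]
weights = [2.0, -1.0]
pred = svr_predict([0.0], support_vectors, weights, bias=0.5, gamma=1.0)
```

A very small γ, as reported for both data sets, makes each kernel wide, so the resulting model varies slowly over the input space.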

5. Selection of relevant inputs


Insight into the predictive relevance of various inputs was obtained by using the
stepwise regression method. The complete set of available inputs as presented in Table 1 (y,
T, SR, Tmin, SRmax, WGCI) was used, with delayed values up to M = 10. The next value of
natural gas consumption y(t+1) was assigned as the output of the model. The results of
stepwise regression input selection are presented in Table 2 for both data sets. All the
proposed regressors from Table 1 have some relevance either for the HOUSE or the LDC
system (or both systems). Consequently, for the unified treatment of both data sets, the
complete set of inputs as presented in Table 1 was applied to both data sets, and to both linear
and nonlinear forecasting models. The future values of weather parameters such as T(t+1) and
SR(t+1) that are applied in our analysis, should be substituted by corresponding weather
forecasts for online forecasting applications. Usually, relevant weather forecasts can be
obtained from governmental weather forecasting institutions.

Table 2: Selected relevant inputs based on stepwise regression.

Data set   Output   Selected inputs
HOUSE      y(t+1)   y(t), y(t-1), T(t+1), T(t), T(t-1), SR(t+1)
LDC        y(t+1)   y(t), T(t+1), T(t), T(t-1), Tmin(t-1), SRmax(t+1), SRmax(t), WGCI(t+1)

6. Forecasting framework
This section presents a formulation of the forecasting task, describes the training and
testing procedures, and defines the performance measures which were applied for the
evaluation of the forecasting models. The emphasis of the research was on a comparison of
the static (RW, TC, ARX, NN, SVR) and the adaptive versions (RARX, RNN) of the
forecasting models.

6.1. Formulation of the forecasting task


The forecasting problem can be formulated as follows:

\[ \hat{y}(t+1) = \phi \left\{ \begin{array}{l} y(t),\ y(t-1),\ \cdots,\ y(t-M_y), \\ x_1(t+1),\ x_1(t),\ x_1(t-1),\ \cdots,\ x_1(t-M_{x_1}), \\ \cdots, \\ x_K(t+1),\ x_K(t),\ x_K(t-1),\ \cdots,\ x_K(t-M_{x_K}) \end{array} \right\} \tag{11} \]

The next value of natural gas consumption $\hat{y}(t+1)$ is forecast by a model ϕ based on
autoregressive values of past natural gas consumption y and inputs $x_1, \cdots, x_K$. The inputs x
consist of past, current and future values of influential variables collected in Table 1. Time
resolution is one day ($\Delta t = 1$) and the forecasting horizon h is also equal to one day (h = 1).
$\{M_y, M_{x_1}, \ldots, M_{x_K}\}$ denote the embedding dimensions of the autoregressive values y and the
inputs $x_1, \cdots, x_K$.
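
The input vector of Eq. (11) can be assembled from the daily series as in the following sketch. For simplicity it assumes a single embedding dimension M shared by y and all inputs; the function name, the dictionary layout of the inputs and the sample values are illustrative assumptions.

```python
def build_regressors(y, x, M):
    """Build (rows, targets) for one-day-ahead forecasting, cf. Eq. (11).

    y: list of daily consumption values
    x: dict mapping an input name to a list of daily values (same length as y)
    Each row holds y(t), ..., y(t-M), then for every input x(t+1), x(t), ..., x(t-M).
    """
    rows, targets = [], []
    for t in range(M, len(y) - 1):
        row = [y[t - k] for k in range(M + 1)]
        for name in sorted(x):
            row += [x[name][t + 1 - k] for k in range(M + 2)]
        rows.append(row)
        targets.append(y[t + 1])
    return rows, targets

# Tiny synthetic example with one exogenous input and M = 1.
y = [10.0, 11.0, 12.0, 13.0, 14.0]
x = {"T": [0.0, 1.0, 2.0, 3.0, 4.0]}
rows, targets = build_regressors(y, x, M=1)
```

In an online application the x(t+1) entries would be filled with weather forecasts rather than measured values.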

6.2. Performance measures


For the evaluation of the performance of the applied forecasting models, the following
two performance measures were used:
a) the mean absolute range normalized error (e), and
b) the adjusted R² measure.
The mean absolute range normalized error e is expressed as the absolute difference
between the forecast and the actual natural gas consumption, normalized to the maximum
natural gas consumption of the system:

\[ e = 100\,\frac{|\hat{y} - y|}{\max(y)}\ [\%] \tag{12} \]

In the study, the mean absolute range normalized error e was chosen as the final model
performance measure, and the adjusted R² measure was used as an additional reported value
to express the forecasting results. In the analyses, the error e was averaged over the training
and testing sets separately, but using the same max(y) denoting the maximum transfer
capacity of the system. Such an approach is relevant for the natural gas market where
forecasting errors are often rescaled to the nominal transfer capacity of the local distribution
system.
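
Both performance measures can be computed as in the sketch below. The error e follows Eq. (12), averaged over a data subset with max(y) passed in explicitly so that the same value can be used for training and testing. The adjusted R² uses the textbook definition 1 − (1 − R²)(n − 1)/(n − p − 1), which is an assumption because the text does not give its formula. All numbers are illustrative.

```python
def mean_range_normalized_error(y_true, y_pred, y_max):
    """Mean of 100 * |y_hat - y| / y_max over all samples, in percent."""
    n = len(y_true)
    return 100.0 * sum(abs(p - a) for a, p in zip(y_true, y_pred)) / (n * y_max)

def adjusted_r2(y_true, y_pred, n_params):
    """Adjusted R^2 for a model with n_params regressors (intercept excluded)."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)

y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
e = mean_range_normalized_error(y_true, y_pred, y_max=40.0)
r2_adj = adjusted_r2(y_true, y_pred, n_params=1)
```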

6.3. Cross-validation procedure


The focus of this paper is on the investigation of forecasting models with respect to
generalization performance, so a cross-validation approach was applied and the available data

were split into training (season 2011/2012) and testing (season 2012/2013) subsets. For both
data sets (HOUSE and LDC), and for each investigated forecasting model, the cross-
validation procedure was applied as follows:
1. The forecasting model was trained on a training data subset (season 2011/2012).
2. The model was then tested on an independent testing data subset (season 2012/2013).
The performance measure e obtained on testing data subset was selected as a criterion
for the evaluation of model performance.
However, when constructing real-world applications, after the initial training/testing of
the models, the complete available data should finally be used for training in order to derive the
maximum information for model construction.

6.4. Static vs. adaptive models


All of the applied forecasting models (i.e. both the static and the adaptive models) were
trained on the training data subset. After training, the static models kept their parameters
frozen during testing on the independent testing data subset. The adaptive models, in contrast,
were updated after every forecast on the testing data subset, thus utilizing the most recent
past data. Such an adaptive approach is suitable for online forecasting applications, and also
complies with the cross-validation principle, because each forecast is issued before the
corresponding measurement is used for adaptation. The next section presents the results of this
study that were obtained according to the described forecasting framework.
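
The difference between the two testing regimes can be illustrated by a short sketch. It assumes the textbook recursive least squares update with an exponential forgetting factor (here `lam = 0.98`, an arbitrary choice) and synthetic data with a hypothetical slow drift in the testing season; it is not the exact configuration used in the study.

```python
import numpy as np

def rls_update(theta, P, x, y, lam=0.98):
    """One recursive least squares step with exponential forgetting factor lam."""
    Px = P @ x
    k = Px / (lam + x @ Px)               # gain vector
    theta = theta + k * (y - x @ theta)   # correct parameters with the forecast error
    P = (P - np.outer(k, Px)) / lam       # update the scaled inverse correlation matrix
    return theta, P

rng = np.random.default_rng(1)
n = 200
T_train = rng.uniform(-10.0, 15.0, n)
T_test = rng.uniform(-10.0, 15.0, n)
X_train = np.column_stack([np.ones(n), T_train])
X_test = np.column_stack([np.ones(n), T_test])
y_train = 50.0 - 2.0 * T_train + rng.normal(0.0, 1.0, n)
drift = 0.05 * np.arange(n)               # hypothetical slow system growth
y_test = 50.0 + drift - 2.0 * T_test + rng.normal(0.0, 1.0, n)

# Both variants start from the same parameters estimated on the training data.
theta_static, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
theta = theta_static.copy()
P = np.linalg.inv(X_train.T @ X_train)

err_static, err_adaptive = [], []
for x, y in zip(X_test, y_test):
    err_static.append(abs(x @ theta_static - y))  # parameters stay frozen
    err_adaptive.append(abs(x @ theta - y))       # forecast first ...
    theta, P = rls_update(theta, P, x, y)         # ... then adapt

mae_static = float(np.mean(err_static))
mae_adaptive = float(np.mean(err_adaptive))
```

In this synthetic setting the adaptive variant tracks the drift while the static one accumulates a growing bias; without drift the two would be expected to perform similarly.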

7. Results
An overview of the forecasting results, expressed through the mean absolute range
normalized error (e) and adjusted R2 performance measures, is presented in Table 3. The
results are shown separately for the HOUSE and LDC data sets, and for the training and
testing data subsets. The forecasting models are presented as RW, TC, Stepwise, ARX(M),
RARX(M), NN(M), RNN(M), and SVR(M), where M denotes the embedding dimension of
the inputs. Various embedding dimensions were applied in the study (M = 1,2,…,10), but
for clarity of presentation only the results for M = {3,5,7} are reported, because increasing
the embedding dimension beyond M = 3 did not improve the generalization capability of the
forecasting models. In summary, the best generalization performance was obtained for both
data sets by the linear adaptive RARX(3) model, but only in the case of the LDC data set did
the adaptive model considerably improve the forecasting performance compared to the static
version of the model. The nonlinear models did not improve the generalization performance.
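
To make the role of the embedding dimension M concrete, the sketch below builds a regressor matrix from the last M values of two series. It assumes that the one-day-ahead target y(t+1) is regressed on y(t), …, y(t−M+1) and T(t), …, T(t−M+1); the function names and this exact lag layout are assumptions for illustration and may differ from the input arrangement used in the study.

```python
import numpy as np

def embed(series, M):
    """Rows are [s(t), s(t-1), ..., s(t-M+1)] for t = M-1 .. len(series)-1."""
    s = np.asarray(series, dtype=float)
    return np.column_stack([s[M - 1 - k : len(s) - k] for k in range(M)])

def arx_design(y, T, M):
    """Regressors from the last M values of y and T; the target is next-day y."""
    Y_lags = embed(y, M)[:-1]   # drop the last row: it has no next-day target
    T_lags = embed(T, M)[:-1]
    X = np.column_stack([np.ones(len(Y_lags)), Y_lags, T_lags])
    target = np.asarray(y, dtype=float)[M:]
    return X, target

# Toy series, chosen so that the lag structure is easy to read off.
y = np.arange(10.0)
T = -np.arange(10.0)
X, target = arx_design(y, T, M=3)
print(X.shape)   # one intercept column plus 2 * M lagged inputs
```

Each additional unit of M adds one column per input series, so the number of parameters grows linearly with M.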

Table 3: Overview of training and testing forecasting results for all the models which were applied to the
HOUSE and LDC data sets. The results are expressed in terms of the mean absolute range normalized error e
and the adjusted R2.

           HOUSE                             LDC
           training         testing          training         testing
Model      e [%]  adj. R2   e [%]  adj. R2   e [%]  adj. R2   e [%]  adj. R2
TC          5.81  0.868      6.69  0.667      4.46  0.925     10.04  0.414
RW          5.43  0.888      6.80  0.662      3.37  0.954      5.17  0.815
Stepwise    3.64  0.947      4.93  0.815      1.29  0.994      2.48  0.959
ARX(3)      3.52  0.946      4.75  0.813      1.25  0.994      2.18  0.964
ARX(5)      3.43  0.944      4.74  0.793      1.14  0.994      2.17  0.961
ARX(7)      3.13  0.947      4.69  0.763      1.05  0.994      2.05  0.960
NN(3)       3.54  0.879      4.84  0.387      0.86  0.992      2.46  0.856
SVM(3)      3.55  --         4.84  --         0.79  --         1.91  --
RARX(3)     3.52  0.946      4.68  0.824      1.25  0.994      1.65  0.977
RARX(5)     3.43  0.944      4.84  0.793      1.14  0.994      1.72  0.972
RARX(7)     3.13  0.947      4.91  0.752      1.05  0.994      1.70  0.970
RNN(3)      3.54  0.879      4.77  0.367      0.86  0.992      1.80  0.911

A comparison of the training and testing results on the HOUSE data set is shown in
Fig. 5. The best test result, e = 4.68 %, was obtained by the RARX(3) model, but very
similar results were also obtained by the static ARX models. As expected, given the
unchanging conditions of the HOUSE natural gas consumption system, the adaptive models
offer no advantage over the static models as forecasting tools. Also, the nonlinear
models failed to improve on the performance of the simpler linear models.
A comparison of the forecasting models in the case of the LDC data set is shown in Fig.
6. This data set is more predictable than the HOUSE data, and the improvement
achieved by the more elaborate models over the benchmark models is considerably
greater. The best test result, e = 1.65 %, was obtained by the RARX(3) model, and it
can also be seen that all the adaptive models considerably surpassed the performance of the
static models. The best static result was obtained by the nonlinear SVM(3) model,
but among the adaptive models the nonlinear models did not surpass the performance of the
adaptive linear models.

Fig. 5. Training and testing forecasting errors e in the case of the HOUSE data set.

Fig. 6. Training and testing forecasting errors e in the case of the LDC data set.

Fig. 7 presents a comparison of the forecasting performance of the static and adaptive
models for both data sets. Whereas no improvement can be seen in the case of the HOUSE
data set, a clearly improved adaptive forecasting performance can be observed in the case of
the LDC data set. In general, the individual HOUSE data set is less predictable than the
LDC data set due to the switching regime of the individual heating system.

Fig. 7. Comparison of the static and adaptive models for the HOUSE and LDC data sets.

Figs. 8 and 9 present the forecasting results obtained by the RARX(3) model for
the HOUSE and LDC data sets, respectively. The time scales include both the training and the
testing seasons. The graphs show the normalized measured and forecast natural gas
consumption, and also include the absolute range normalized errors in original and filtered
form. The filtered errors were obtained by applying a smoothing filter (local regression over
n = 40 points using the weighted linear least squares method and a 2nd degree polynomial
model) to the original errors.
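
A filter of the kind described above can be sketched as follows. The implementation assumes a window of 40 consecutive samples, tricube weights, and a weighted quadratic fit evaluated at the current sample; the weighting function and the edge handling are assumptions, since the text specifies only the span, the fitting method, and the polynomial degree. The input series is synthetic.

```python
import numpy as np

def loess_smooth(y, span=40, degree=2):
    """Local polynomial regression smoother (a simple LOESS-type sketch)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    span = min(span, n)
    x = np.arange(n, dtype=float)
    out = np.empty(n)
    for i in range(n):
        # Window of `span` consecutive points, shifted inwards at the edges.
        lo = min(max(i - span // 2, 0), n - span)
        idx = np.arange(lo, lo + span)
        d = np.abs(x[idx] - x[i])
        # Tricube weights (assumed); the +1 keeps all weights strictly positive.
        w = (1.0 - (d / (d.max() + 1.0)) ** 3) ** 3
        # np.polyfit weights the unsquared residuals, hence sqrt(w).
        coeffs = np.polyfit(x[idx] - x[i], y[idx], deg=degree, w=np.sqrt(w))
        out[i] = coeffs[-1]   # value of the local polynomial at offset 0
    return out

rng = np.random.default_rng(2)
errors = np.abs(rng.normal(2.0, 1.0, size=200))   # synthetic error series
filtered = loess_smooth(errors, span=40, degree=2)
```

Because the local model is quadratic, a purely quadratic trend passes through the filter unchanged, while day-to-day fluctuations are attenuated.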

Fig. 8. Forecasting results of the RARX(3) model in the case of the HOUSE data set.

Fig. 9. Forecasting results of the RARX(3) model in the case of the LDC data set.

A comparison of the filtered absolute range normalized errors obtained by the static
ARX(3) and the adaptive RARX(3) models is presented in Fig. 10. The results refer to the
testing period of the HOUSE and LDC data sets. The results for the LDC data set confirm
the significantly improved performance of adaptive forecasting compared to static forecasting.

Fig. 10. Comparison of the filtered forecasting errors in the case of the static ARX(3) model and the adaptive
RARX(3) model, for the HOUSE and LDC data sets.

The statistical significance of the presented results was tested by using the
Diebold-Mariano (DM) statistic [35]. The test was performed at the 0.05 significance level on the
testing seasons of both data sets under the null hypothesis of equal forecasting accuracy for
the static ARX model and the adaptive RARX model. The results of the DM test (using the squared
error loss function) confirm that in the case of the LDC data set, the adaptive RARX model
significantly outperforms the static ARX model (DM statistic S = 3.88), whereas in the case
of the HOUSE data set (DM statistic S = 0.93), as expected, the null hypothesis of equal
forecasting accuracy cannot be rejected.
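
For reference, a minimal sketch of the DM statistic for one-step-ahead forecasts under squared error loss is given below. It assumes that the loss differential is serially uncorrelated, so that only its sample variance enters the denominator, and it uses the asymptotic standard normal distribution for the two-sided p-value. The function name and the synthetic error series are illustrative and are not the data of the study.

```python
import math
import numpy as np

def diebold_mariano(err_a, err_b):
    """DM statistic and two-sided p-value for squared error loss (horizon 1)."""
    d = np.asarray(err_a, dtype=float) ** 2 - np.asarray(err_b, dtype=float) ** 2
    n = len(d)
    s = d.mean() / math.sqrt(d.var() / n)      # positive if model a has the larger loss
    p = math.erfc(abs(s) / math.sqrt(2.0))     # two-sided p-value under N(0, 1)
    return s, p

rng = np.random.default_rng(3)
err_static = rng.normal(0.0, 2.0, size=180)    # synthetic forecast errors, model a
err_adaptive = rng.normal(0.0, 1.0, size=180)  # synthetic forecast errors, model b

S, p_value = diebold_mariano(err_static, err_adaptive)
reject_equal_accuracy = p_value < 0.05
```

At the 0.05 level the two-sided critical value of the standard normal distribution is about 1.96, which is consistent with rejecting the null hypothesis for S = 3.88 and not rejecting it for S = 0.93.
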
Based on the best results obtained by the adaptive RARX model, the flow chart
summarizing the proposed adaptive forecasting approach is presented in Fig. 11. The data (for
each data set) are prepared for the analysis and the relevant features are extracted. Training
data are applied for model construction as follows. Based on stepwise selection of informative
inputs and their corresponding embedding dimensions, the static ARX model is developed. In
the next step, the static model is upgraded with an exponential forgetting mechanism into an
adaptive RARX model in order to accommodate future system changes. The final step is
to assess the generalization performance of the RARX model on independent testing data in
order to confirm a suitable forecasting performance.

[Flow chart: Data -> Preparing data, feature extraction -> Training data -> Selecting relevant inputs (stepwise regression) -> Building static ARX model on training data -> Upgrade model into adaptive (exponential forgetting) RARX model -> Test model performance on testing data (input: Testing data) -> Testing errors]

Fig. 11. Flow chart summarizing the proposed adaptive forecasting approach.

8. Conclusions
A comparative investigation of forecasting models for short-term residential natural
gas consumption is presented in this paper. The research was focused on a comparison of
static and adaptive forecasting models. The comparative study involved various types of
models, including simple benchmark models, linear models, and nonlinear models. Linear
ARX models and nonlinear neural network based models were examined in their static and
adaptive versions. The presented results are based on two data sets, representing an individual
house and a local distribution system. Natural gas consumption data and additional weather
related parameters were collected through the last two winter seasons, and the data were
applied for the training and testing of the forecasting models. In a cross-validation procedure,
the data for the season 2011/2012 were applied for training, and the data for the season
2012/2013 for the testing of the forecasting models, in order to estimate the generalization
capability on independent data sets. Based on the results presented in the previous section, the
following conclusions can be drawn:
• The comparison of static and adaptive models for natural gas consumption forecasting
reveals the superiority of adaptive models for local distribution systems (the LDC data
set), whereas individual house consumption (the HOUSE data set) can be sufficiently
well estimated by static forecasting models. The individual house system in this case
study was stationary throughout the entire data acquisition period, so adaptive
forecasting models are not needed to improve forecasting accuracy. On the other hand,
the local distribution system described in this case study is a dynamic one, evolving
over time (system growth and other changes), and can therefore be more adequately
represented by adaptive versions of forecasting models. As local distribution systems
(and currently not individual consumers) are the target application area of short-term
natural gas forecasting methods, the first recommendation of this research is a
preference for adaptive forecasting models.
• The second conclusion of this research is derived from the results of comparisons of
linear and nonlinear forecasting models. The nonlinear models (NN, SVR) may yield
better training results, and thus better fit the training data, but the results reveal that the
generalization performance of these models does not surpass the testing performance
of simpler linear models. Thus, unless clearly nonlinear relations are observed in the
data, the application of linear models is recommended. This conclusion is expected
due to the fairly linear relationship between residential natural gas consumption and its
main causal influence, i.e. the outside temperature. Nonlinear features (such as WGCI)
may be helpful in capturing any nonlinear response of a system, so that
models which are linear in their parameters can still be applied. The principal advantages
of linear models are their robustness, their smaller and easier to understand
architectures, and the fast and globally optimal training procedures which can be
applied to them.
• A successful forecasting solution should be based on informative inputs which
represent the dynamics of a system. In the present research, the relevance of possible
inputs was investigated through a stepwise regression method, and by examining
various embedding dimensions for each applied input. The results showed that the
most relevant inputs include the past natural gas consumption (y), the outside
temperature (T), the solar radiation (SR), and possibly also quantities derived from them,
such as the minimum daily temperature (Tmin) and the maximum daily solar radiation (SRmax),
calculated as peak hourly values. The results in this study are based on measured
values of the influencing weather parameters. Further research will be directed
towards the analysis of forecasted weather parameters, where weather forecasts are
obtained from the corresponding weather forecasting institutions. Natural gas distribution
systems are usually subject to population dynamics, so it is also recommended that
population dynamics indicators (day of the week, holiday, …) are included as inputs
either directly or indirectly in the form of condensed features (such as the proposed
WGCI). The required embedding dimensions of relevant inputs are usually as low as
M = 3, because it has been shown in this study that more complex models do not
necessarily improve the generalization performance.
In summary, the recommended combination for short-term residential natural gas
consumption forecasting is an adaptive linear model, with proper selection of the relevant
inputs and low embedding dimensions. In this study, the winning forecasting model is
RARX(3), since it yielded the best testing forecasting performance in the case of both data
sets in terms of the e and adjusted R2 performance measures. Such a model is simple to
implement and can be constructed by using short data sets (as demonstrated in this research);
it is therefore a suitable solution for inclusion in an online forecasting system.

9. Acknowledgments
The support from the Slovenian Research Agency (Program P2-0241 Synergetics of
complex systems and processes) is hereby gratefully acknowledged, as well as the help of
HEP-Plin Ltd. in providing natural gas consumption data for the LDC forecasting model.

10. References
[1] Valero A, Valero A. Physical geonomics: Combining the exergy and Hubbert peak
analysis for predicting mineral resources depletion. Resour. Conserv. Recy.
2010;54(12):1074-1083.

[2] Mohr SH, Evans GM. Long term forecasting of natural gas production. Energ. Policy
2011;39(9):5550-5560.
[3] Siemek J, Nagy S, Rychlicki S. Estimation of natural-gas consumption in Poland based
on the logistic-curve interpretation. Appl Energy 2003;75(1–2):1–7.
[4] Gutierrez R, Nafidi A, Sanchez RG. Forecasting total natural-gas consumption in Spain
by using the stochastic Gompertz innovation diffusion model. Appl Energy
2005;80(2):115–24.
[5] Forouzanfar M, Doustmohammadi A, Menhaj MB, Hasanzadeh S. Modeling and
estimation of the natural gas consumption for residential and commercial sectors in Iran.
Appl Energy 2010;87(1):268–74.
[6] Erdogdu E. Natural gas demand in Turkey. Appl Energy 2010;87(1):211–9.
[7] Li J, Dong X, Shangguan J, Hook M. Forecasting the growth of China's natural gas
consumption. Energy 2011;36(3):1380-1385.
[8] Sanchez-Ubeda EF, Berzosa A. Modeling and forecasting industrial end-use natural gas
consumption. Energ. Econ. 2007;29(4):710-742.
[9] Huntington HG. Industrial natural gas consumption in the United States: an empirical
model for evaluating future trends. Energy Econ 2007;29(4):743–59.
[10] Sarak H, Satman A. The degree-day method to estimate the residential heating natural gas
consumption in Turkey: A case study. Energy 2003;28:929-939.
[11] Aras H, Aras N. Forecasting residential natural gas demand. Energy Source.
2004;26(5):463-472.
[12] Timmer RP, Lamb PJ. Relations between temperature and residential natural gas
consumption in the Central and Eastern United States. J. Appl. Meteorol. Clim.
2007;46(11):1993-2013.
[13] Aydinalp-Koksal M, Ugursal VI. Comparison of neural network, conditional demand
analysis, and engineering approaches for modeling end-use energy consumption in the
residential sector. Appl Energy 2008;85(4):271-296.
[14] Yoo S-H, Lim H-J, Kwak S-J. Estimating the residential demand function for natural gas
in Seoul with correction for sample selection bias. Appl Energy 2009;86(4):460–5.
[15] Potočnik P, Thaler M, Govekar E, Grabec I, Poredoš A. Forecasting risks of natural gas
consumption in Slovenia. Energ. Policy 2007;35:4271-4282.
[16] Sabo K, Scitovski R, Vazler I, Zekić-Sušac M. Mathematical models of natural gas
consumption. Energ. Convers. Manage. 2011;52(3):1721-1727.
[17] Brabec M, Konar O, Pelikan E, Maly M. A nonlinear mixed effects model for the
prediction of natural gas consumption by individual customers. Int J Forecasting
2008;24(4):659-678.
[18] Vondracek J, Pelikan E, Konar O, Cermakova J, Eben K, Maly M, et al. A statistical
model for the estimation of natural gas consumption. Appl Energy 2008;85(5):362–70.
[19] Bianco V, Scarpa F, Tagliafico LA. Scenario analysis of nonresidential natural gas
consumption in Italy. Appl Energy 2014;113:392–403.
[20] Ediger VS, Akar S. ARIMA forecasting of primary energy demand by fuel in Turkey.
Energ Policy 2007;35(3):1701-1708.
[21] Liu H, Liu D, Zheng G, Liang Y. Research on natural gas short-term load forecasting
based on support vector regression. Chinese J. Chem. Eng. 2004;12(5):732-736.
[22] Kizilaslan R, Karlik B. Combination of neural networks forecasters for monthly natural
gas consumption prediction. Neural Netw. World 2009;19(2):191–199.
[23] Azadeh A, Asadzadeh SM, Ghanbari A. An adaptive network-based fuzzy inference
system for short-term natural gas demand estimation: Uncertain and complex
environments. Energ Policy 2010;38(3):1529-1536.
[24] Soldo B. Forecasting natural gas consumption. Appl Energy 2012;92:26–37.
[25] Potočnik P, Thaler M, Govekar E, Grabec I, Poredoš A. Forecasting risks of natural gas
consumption in Slovenia. Energ. Policy 2007;35(8):4271–4282.
[26] Taşpinar F, Çelebi N, Tutkun N. Forecasting of daily natural gas consumption on
regional basis in Turkey using various computational methods. Energ Buildings
2013;56:23-31.
[27] Draper N, Smith H. Applied Regression Analysis, 2nd ed. New York: John Wiley & Sons
Inc.; 1981.
[28] Ljung L. System Identification: Theory for the User, 2nd ed., New Jersey: PTR Prentice
Hall, Upper Saddle River; 1999.
[29] Haykin, S., editor. Neural networks and learning machines, 3rd ed. New York: Pearson;
2009.
[30] Vapnik VN. The Nature of Statistical Learning Theory. New York: Springer; 1995.
[31] Cortes C, Vapnik VN. Support vector networks. Mach. Learn. 1995;20(3):273–97.
[32] Vapnik VN. Statistical Learning Theory. New York: Wiley; 1998.
[33] Smola AJ, Schölkopf B. A tutorial on support vector regression. Stat. Comput.
2004;14:199–222.
[34] Hyndman RJ, Koehler AB. Another look at measures of forecast accuracy, International
Journal of Forecasting 2006;22(4):679–688.
[35] Diebold FX, Mariano RS. Comparing predictive accuracy. Journal of Business &
Economic Statistics 1995;13(3):253-263.
