Abstract: This paper contributes two different prediction approaches for long-term power consumption demand prediction using an artificial neural network (ANN) short-term time series predictor filter. The techniques proposed here are non-linear stochastic models using the energy associated to series and Bayesian inference, implemented by ANN. The system has the advantage of requiring as input only the historical demand time series of power consumption, and allows its extension to a medium- and long-term forecast (months forward). The paper predicts the power consumption in the area covered by the country during the period January [...] - November [...] in Argentina. Thus, the next forecasted values are presented by the evolution of total monthly power consumption demand of the National Interconnected System of Argentina. The computational results of the prediction comparison are evaluated against the classical non-linear ANN predictor on high-roughness short-term chaotic time series, showing a better performance of the Bayesian approach in long-short term forecasting.

Keywords: power consumption forecast; energy time series; neural networks; energy associated to series; Bayesian inference; Computational Intelligence.

I. INTRODUCTION
Electricity is one of the most important and widely used forms of energy, serving many different kinds of needs. Nowadays electricity is essential for economic development, especially for the industrial sector. Decision makers around the world widely use energy demand forecasting as one of the most important policy tools. Electricity has thus become a key energy source in each country and an important condition for economic development. A reliable forecast of energy consumption represents a starting point in policy development and in the improvement of production and distribution facilities in Argentina.
Electricity demand forecasting is a central and integral process for planning periodical operations and facility expansion in the electricity sector [1]. The demand pattern is usually very complex owing to the deregulation of energy markets. Therefore, finding an appropriate forecasting model [2] [3] for a specific electricity network is not an easy task. Although many forecasting methods have been developed [4], none can be generalized to all demand patterns. Different models for electric energy demand forecasting have been proposed in recent decades [5] [6] [7] [8], and they play an important role in economic planning and the safe operation of modern power systems [9].
These models can be divided into two categories: the first includes the traditional algorithms of load forecasting, including time series analysis, regression and gray models. The second category includes more recent algorithms for load forecasting, such as neural networks and intelligent expert systems [10] [11] [12] [13]. This paper proposes alternatives for improving prediction in electricity demand [14].
Time series forecasting has recently gained considerable significance as a way to anticipate the behavior of a system under study, for example in the availability of estimated scenarios for water predictability [15], the rainfall forecast problem [16] [17] at some geographical points of Argentina, energy demand [18] [19] [20], and the guidance of seedling growth [21], [22]. For general feed-forward neural networks [23] [24] [25] [26], the computational complexity [27] [28] [29] [30] of these solutions grows exponentially with the number of missing features [46]. In this paper we describe an approximation for the problem of short-term prediction that is applicable to a large class of learning algorithms [10] [11] [12] and [26], including ANNs. One major advantage of the proposed technique is that the complexity does not increase with an increasing number of inputs. The solutions can easily be generalized to the problem of uncertain (noisy) inputs, such as Bayesian inference [31], against other generalized approaches [17].
The problem of short time series forecasting [32] [33] [34] poses a difficulty for the analysis, which depends on which methods of estimation and prediction fit better and more efficiently. Various techniques exist as a solution to this problem, employing statistical and artificial intelligence techniques [35] [36] [37] [38]. The techniques proposed here are non-linear stochastic auto-regressive moving average (NAR) models using the energy associated [23] to series and the Bayesian approach [17], implemented by ANN. The power consumption
forecasts obtained using the proposed methods are then compared with a well-known neural network based predictor for a case study of Argentina. The study analyses and compares the relative advantages and limitations of each time-series predictor technique [39] used for issuing short-term electrical consumption forecasts. The structure of the filter is changed taking into account the energy of the short series, calculated as the primitive of the original, and Bayesian inference. The long-short term stochastic dependence of the time series is measured by the Hurst parameter, the series being considered as a path of the fractional Brownian motion. Twenty percent of the dataset is set aside to give the prediction horizon and the validation data. Moreover, the next 15 forecasted time series values are presented for the cumulative monthly historical electricity consumption and for solutions of the Mackey-Glass (MG) and one-dimensional Henon equations.
The paper is organized as follows. Section 2 presents a review of two methods for evolving various parameters of ANNs to model the NN parameters and the optimum architecture/weights applied to electrical time series. Section 3 provides an overview of the dataset used and the methodology proposed. In Section 4, prediction results are presented, highlighting the application to electrical load forecasting. Finally, Section 5 provides some discussion and concluding remarks.

II. REVIEW OF PROPOSED NEURAL NETWORKS ALGORITHMS
The main issue when forecasting a time series is how to retrieve the maximum of information from the available data [52]. In this work the coefficients of the ANN filter are adjusted on-line in the learning process, by considering the two methods proposed: energy associated to series and the Bayesian approach as a new input to the neural networks. In both cases, the criterion followed modifies at each pass of the time series the number of patterns, the number of iterations and the length of the tapped-delay line, respectively, according to the long-short term stochastic behavior of the series.

A. Energy associated to series approach
The assumption of the method is the following [23]: the area resulting from integrating the time series data is obtained by considering each value of the time series as its derivative;
$$\int_{t_k}^{t_{k+1}} y_t \, dt \cong y_t \,(t_{k+1} - t_k) \tag{3}$$
where y_t is the original time series value. The approximated area is assumed to be its periodical primitive:
$$I_{t_n} = \int_{t_n}^{t_{n+p}} y_t \, dt = Y_t \Big|_{t_n}^{t_{n+p}}, \quad n = 1, 2, \ldots, N. \tag{4}$$
During the learning process, those primitives are calculated as a new input to the ANN. The predictor filter attempts to make the area of the forecasted time series equal to the predicted primitive of the real area. The real area is used in two instances. In the first, an area is obtained from the real time series; the H parameter associated with this series is called HA. In the second, the time series data is forecasted by the algorithm, so the H parameter from this time series is called HS. After the training process is completed, both sequences, {{I_n}, {I_e}} and {∫{y_n, y_e}}, should have the same H parameter, in accordance with the hypothesis.

B. Bayesian approach for tuning the neural networks
A model is most often recognized as Bayesian when a probability distribution is used to describe uncertainty regarding the unknown parameters and when Bayes' Theorem is applied [40]. A full Bayesian analysis can lead to the optimal choice among a set of alternative inferences, taking into account all sources of uncertainty in the problem and the consequences of every possible selection. When a rainfall series is being analyzed, it is important to make use of the simplest possible models. Specifically, the number of unknown parameters must be kept at a minimum. For forecasting problems, Bayesian analysis generates point and interval forecasts by combining all the information and sources of uncertainty into a predictive distribution for the future values [53]. It does so with a function that measures the loss to the forecaster that will result from a particular choice of forecasts. The gamma distribution is chosen for this purpose [31].
When a Bayesian analysis is conducted, inferences about the unknown parameters are derived from the posterior distribution. This is a probability model which describes the knowledge gained after observing a set of data. The application is a regression problem [54] involving the corresponding neural network function y(x, w) and a data set consisting of N pairs of input vectors x_n and targets t_n (n = 1, ..., N).
Assuming Gaussian noise on the target, the likelihood function takes the form:
$$P(D \mid w, M) = \left(\frac{\beta}{2\pi}\right)^{N/2} \exp\left\{ -\frac{\beta}{2} \sum_{n=1}^{N} \left[ y(x_n; w) - t_n \right]^2 \right\}, \tag{3}$$
$$P(w) = \left(2\pi\omega^2\right)^{-N/2} \exp\left( -\frac{\lVert w \rVert^2}{2\omega^2} \right), \tag{3}$$
assuming that the expected scale of the weights is given by ω, set by hand. This was carried out considering that the network function f(x_{n+1}, w) is approximately linear with respect to w in the vicinity of this mode; in fact, the predictive distribution for y_{n+1} will be another multivariate Gaussian.

III. DATA AND METHODOLOGY
The performance of the proposed approaches is given for predicting the long-short term chaotic time series that have appeared in the literature. The normalized symmetric mean-absolute percentage square error (SMAPE) is used as a performance index for measuring the quality of prediction of the time series.
A time series can actually be regarded as an integration of stochastic (or random) and deterministic components [40] [41] [42] [43]. Once the stochastic (noise) component is appropriately eliminated, the deterministic component can then be easily modeled. Rainfall is an end product of a number of complex atmospheric processes which vary both in space and time.
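As an illustration of the chaotic benchmarks used in this section, the following is a minimal sketch of how a Mackey-Glass series such as MG17 could be generated with a fourth-order Runge-Kutta step and uniform sampling. The function name `mackey_glass`, the values alpha = 0.2, beta = 0.1, c = 10, the step h = 0.1, the sampling interval and the constant initial history 1.2 are assumptions chosen for illustration (they are commonly used defaults, not values stated in this paper), and the delayed term is held fixed over each integration step as a simplification.

```python
def mackey_glass(n_samples, tau=17.0, alpha=0.2, beta=0.1, c=10.0,
                 h=0.1, sample_every=10, y0=1.2):
    """Sample a Mackey-Glass series dy/dt = alpha*y(t-tau)/(1+y(t-tau)^c) - beta*y(t).

    All default parameter values are illustrative assumptions.
    """
    lag = int(round(tau / h))            # delay expressed in integration steps
    n_steps = n_samples * sample_every   # total number of RK4 steps
    y = [y0] * (lag + 1)                 # constant history on [-tau, 0]

    def rhs(y_now, y_delayed):
        # Right-hand side of the delay differential equation.
        return alpha * y_delayed / (1.0 + y_delayed ** c) - beta * y_now

    for k in range(lag, lag + n_steps):
        y_del = y[k - lag]               # delayed value, frozen over the step
        k1 = rhs(y[k], y_del)
        k2 = rhs(y[k] + 0.5 * h * k1, y_del)
        k3 = rhs(y[k] + 0.5 * h * k2, y_del)
        k4 = rhs(y[k] + h * k3, y_del)
        y.append(y[k] + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0)

    # Keep one value every `sample_every` steps, skipping the initial history.
    return y[lag + sample_every::sample_every]


mg17 = mackey_glass(80)                  # tau = 17
mg30 = mackey_glass(80, tau=30.0)        # tau = 30
train, validation = mg17[:65], mg17[65:]
```

The last line mirrors the 65/15 training and validation split mentioned in the paper; whether that split applies to every benchmark is an assumption here.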
The standard non-parametric approaches presented in this work are based on stochastic techniques that assume a non-linear relationship among data and reproduce the power consumption demand series only in a statistical sense.

A. Power consumption demand series
The case study considered herein refers to the evolution of the total monthly power consumption demand series [44] from the National Interconnected System of Argentina over the period January 1980 - September 2013, shown in Fig. 1.

Monthly consumption in MWh (one row per year):
Jan     Feb     Mar     Apr     May     Jun     Jul     Aug     Sept    Oct     Nov     Dec
3.909   4.032   4.169   4.089   4.154   4.511   4.679   4.327   4.162   3.941   3.942   4.442
4.325   4.452   4.553   4.495   4.510   5.048   4.879   4.590   4.655   4.368   4.393   4.602
4.485   4.615   4.808   4.569   4.416   4.911   4.820   4.698   4.553   4.485   4.575   4.728
4.564   4.567   4.720   4.632   5.098   5.301   5.166   5.135   4.876   4.726   4.780   4.982
4.886   4.653   4.931   4.957   5.272   5.919   5.720   5.503   5.183   4.943   4.928   5.071
5.338   5.311   5.624   5.536   5.694   5.976   6.046   6.024   5.900   5.475   5.715   5.725
5.731   5.768   5.753   6.034   6.342   6.251   6.501   6.691   6.590   6.095   6.072   6.257
6.442   6.329   6.577   6.548   7.543   7.585   7.481   7.842   7.190   6.879   7.217   7.291
7.176   6.821   7.471   7.232   7.785   7.783   7.949   7.463   6.996   6.705   7.129   7.020
7.098   7.174   6.974   6.991   7.155   7.041   7.495   6.442   6.401   6.241   6.408   6.548
6.645   6.388   6.375   6.345   6.975   7.190   7.049   6.898   6.892   6.645   6.700   6.664
6.610   6.594   7.111   6.952   7.295   7.723   7.886   7.892   7.409   7.301   7.331   7.147
7.353   7.573   7.974   7.810   8.411   8.715   9.035   8.799   8.386   8.080   7.565   7.757
8.065   7.906   8.749   8.317   9.105   9.126   9.325   9.158   9.056   8.481   8.540   8.759
8.660   8.774   9.274   9.037   9.263   9.842   10.104  9.547   9.531   9.081   9.394   9.820
9.386   9.253   9.890   9.550   9.563   10.190  10.213  10.117  9.317   9.659   9.614   10.058
9.805   10.186  10.479  10.193  10.426  11.243  11.081  10.423  10.181  10.487  10.830  10.842
10.721  10.899  10.971  11.399  11.366  11.776  11.656  11.661  10.959  11.066  11.161  11.526
11.393  11.361  12.220  11.399  11.821  12.269  12.033  12.085  11.684  11.990  11.766  12.034
11.382  12.259  12.650  11.734  12.112  12.545  12.730  12.503  11.862  12.154  12.470  12.640
12.788  12.808  12.709  12.347  12.641  13.211  13.754  12.781  12.969  12.412  12.621  13.224
13.501  14.061  13.780  12.866  12.968  13.639  13.794  13.030  12.642  12.365  12.595  12.626
12.296  13.481  13.481  12.209  12.444  13.428  13.405  12.908  12.392  12.394  12.828  12.939
13.774  13.900  13.721  12.670  13.218  13.567  14.359  14.331  13.570  13.384  13.461  14.185
14.350  14.207  14.655  14.732  14.257  14.512  14.789  14.848  13.611  13.569  14.708  15.032
15.129  15.253  15.211  14.552  14.900  15.699  15.792  15.648  15.485  14.799  16.143  15.657
15.831  16.753  15.723  15.212  16.224  16.406  16.777  16.686  16.448  16.649  16.579  16.689
15.831  16.753  16.335  15.898  16.876  17.037  17.395  17.309  17.097  17.252  17.237  17.323
17.073  17.654  17.400  17.881  18.279  18.345  17.743  17.669  16.590  16.745  17.291  17.786
17.885  17.930  17.697  17.129  18.670  19.126  18.389  18.071  17.615  16.652  18.441  17.571
17.351  18.596  17.218  16.963  17.780  18.948  19.566  17.862  17.895  18.023  17.426  18.422
19.370  19.332  18.408  16.937  18.228  18.770  20.396  20.743  19.346  17.211  18.353  20.209
20.531  20.171  20.913  18.309  18.765  21.024  21.403  21.564  18.648  17.565  19.508  20.513
21.309  21.949  20.095  18.264  18.472  20.978  20.912  19.995  18.626  17.834  20.991  20.921
21.982  22.169  19.523  18.443  20.035  21.270  22.552  21.773  21.711  19.484  20.436
Fig. 1. Total power monthly consumption demand series from the National Interconnected System of Argentina.

B. Chaotic time series
The benchmarks chosen are called MG17, with τ=17, and MG30, with τ=30. Here the time series values to be predicted are taken from the solution of the MG equation [46], which is given by the time-delay differential equation:
$$\dot{y}(t) = \frac{\alpha \, y(t-\tau)}{1 + y^{c}(t-\tau)} - \beta \, y(t) \tag{1}$$
Equation (1) is solved by a standard fourth-order Runge-Kutta integration step, and the series to forecast is formed by sampling values with a given time interval.
The algorithm uses a wavelet method to estimate the H parameter of the time series in order to have an idea of the roughness of the signal [48] [49]. Such series are considered as a trace of an fBm depending on the so-called Hurst parameter 0<H<1. Furthermore, by setting the parameter β between 0.1 and 1.9, the stochastic dependence of the deterministic time series obtained varies according to its roughness [47].
In order to compare the results of the proposed technique with the results published in the literature, the second set of time series is chosen from the Henon equation [50] according to [51], where the constants are taken to be A = 1.3, B = 0.22, x(0) = 0 and x(1) = 0. The benchmark is called HEN. The first 65 data points are used for training and the remaining 15 points are kept as validation data.

IV. PREDICTION RESULTS
The simulation results for different order approximations and time periods are presented in Table I. The performance of the comparison is measured by the Symmetric Mean Absolute Percent Error (SMAPE), used in most metric evaluations and defined by
$$SMAPE_s = \frac{1}{n} \sum_{t=1}^{n} \frac{\lvert X_t - F_t \rvert}{(X_t + F_t)/2} \cdot 100 \tag{9}$$
where t is the observation time, n is the size of the test set, s is each time series, and X_t and F_t are the actual and the forecasted time series values at time t, respectively. The SMAPE of each series s calculates the symmetric absolute error in percent between the actual X_t and its corresponding forecast value F_t, across all observations t of the test set of size n for each time series s.
In each figure the testing and computing data are detailed, where the testing data are labeled “Validation data” and were not used in the computation of the predictor filter.
The assessment of the obtained results, comparing the performance of the predictor filters, shows a significant improvement, measured by the SMAPE index, of the Bayesian approach over the energy associated and NAR neural network approaches, all based on ANN.
Although the difference between filters resides only in the model, that is, in the coefficients that each filter has, each one behaves differently. It can be noted that even though the training points are too few for the learning process [44], the behavior of the proposed filter meets the expectation for short-term time series prediction [26]. The POWER series presents more roughness than the MG and HEN solutions, so the Bayesian approach applied to the parameters of the ANN demonstrates an improvement, in which the adequate prior distribution model chosen shows that it can be used for tuning the parameters and outputs of the predictor filter [36].

TABLE I. RESULTS OBTAINED BY THE PROPOSED APPROACHES
Series No.   Filter     H       Real Mean   SMAPE
POWER        Energy     1.68    20.42       0.689
POWER        Bayesian   0.71    20.42       0.026
POWER        Neural     0.71    20.42       0.689
MG17         Energy     2.92    2.80        184.56
MG17         Bayesian   1.78    1.72        7e-06
MG17         Neural     1.78    1.76        1.20
HEN          Energy     0.346   0.349       0.19
HEN          Bayesian   0.469   0.474       6.5e-15
HEN          Neural     0.469   0.559       13.41

The Monte Carlo method was used to forecast the next 15 values for each of the MG and HEN series, and 18 values for the POWER time series. Such outcomes are shown from Fig. 2 to Fig. 4.
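To make the SMAPE index reported in Table I concrete, the following is a minimal sketch of the metric for a single series, i.e. the mean over the test set of |X_t - F_t| divided by (X_t + F_t)/2, expressed in percent. The function name `smape` and the example values are hypothetical and are not taken from the paper's experiments.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percent error of one series, in percent."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equally long")
    total = 0.0
    for x_t, f_t in zip(actual, forecast):
        # Absolute error relative to the mean of actual and forecasted value.
        total += abs(x_t - f_t) / ((x_t + f_t) / 2.0)
    return 100.0 * total / len(actual)


# Hypothetical validation values, for illustration only.
error = smape([100.0, 200.0], [110.0, 180.0])
```

Note that the denominator vanishes when an actual value and its forecast sum to zero, so this form is only meaningful for series that stay away from zero, as the positive consumption and benchmark series considered here do.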
Fig. 2. Non-linear Autoregressive (NAR) Neural network predictor filter; a) [...] [Panels (a)-(d); legend: Mean Forecasted, Real Data, Validation Data; axes: MW versus Time [samples].]

[Figure panels (a), (b); legend: Area of the forecasted time series, Forecasted area; panel title: Henon series parameters a 1.3 b 0.22; axis: Time [samples].]

Fig. 4. Bayesian approach-based neural network predictor filter; POWER series, b) Horizon of POWER Series, c) MG17 series with τ =17, d) HEN one- [...] [Panels (a)-(d); legend: Mean Forecasted, Real, Validation Data; axes: MW versus Time [samples] from 1980 to 2013.]

[...] series forecast. The structure of the filter is changed according to the long-short term stochastic dependence method, taking into account the energy of the short series calculated as the primitive of the original and Bayesian inference.
[...] in terms of SMAPE indices when compared with other existing forecasting methods in the literature.

ACKNOWLEDGMENT
This work was supported by Universidad Nacional de Córdoba (UNC), FONCYT-PDFT PRH N°3 (UNC Program RRHH03), SECYT UNC, Universidad Nacional de San Juan – Institute of Automatics (INAUT), National Agency for Scientific and Technological Promotion (ANPCyT) and