https://doi.org/10.1007/s42835-021-00973-5
ORIGINAL ARTICLE
Abstract
Accurate and rapid price forecasting plays a crucial role in the electricity spot market. Owing to the variability of market
participants’ activities, prices are usually too volatile to forecast accurately. In this study, a short-term price forecasting
model for locational marginal price (LMP) based on a multiple temporal convolutional network (mTCN) and attention-long
short-term memory (ATT-LSTM) is proposed. An attention-LSTM method is adopted to reconstruct the electrical features
for the future. The mTCN is used to extract the hidden information and long-term temporal relationships in the input features
included in electrical features. The effectiveness of the proposed model is demonstrated using datasets from the New England
electricity market (ISO-NE) in the U.S. The proposed model provides accurate price forecasting results in experiments and comparisons with existing models.
Journal of Electrical Engineering & Technology
distributions [10]. A multiple-input LSTM method was proposed for electricity price and demand forecasting with large data, whose hyperparameters were tuned using the Jaya optimization algorithm to improve forecasting capability [11]. A method was proposed to create new variables based on temperature to forecast electricity demand and tariffs; the variables were linearly related to electricity demand, and the method could avoid clustering data into different seasons and accurately determine the temperature [12]. A day-ahead SMP forecasting model implemented an artificial neural network (ANN) algorithm [13]. A statistical data-filtering method worked at the input-data preprocessing stage to evaluate the reliability of input load data by analyzing all possible data confidence levels and filtering out noise/outliers, improving the accuracy of short-term load forecasting models [14]. Input features were separated into historical and prediction data; historical data were input to an LSTM layer to model their relationships, and the layer's outputs were incorporated with those of a fully connected layer [15]. Deep learning methods have greater adaptability and accuracy in time series forecasting at different scales because they can identify structures and patterns such as nonlinearity and complexity. Bai et al. of Carnegie Mellon University proposed the temporal convolutional network (TCN) [16], which has performance advantages over classical recurrent neural networks for sequential data processing tasks.

Attention-based neural networks have achieved good results in natural language processing tasks such as machine translation, parsing, and automatic summarization. By assigning different weights to the hidden layer units of a neural network, the attention mechanism enables the hidden layer to pay more attention to critical information. The TCN model has been used in fields such as pattern recognition, anomaly detection, and mental assessment, but its application to feature extraction in load forecasting is relatively limited. We propose a method based on the attention-LSTM-TCN model to forecast the short-term electricity price, aiming to improve prediction accuracy. The main contributions are summarized as follows:

(3) We use the architecture of a model including daily and weekly forecasts. Multistep forecasting is improved by paying different attention to each feature and considering consequences from previous steps.

2 Summary of Method

2.1 Attentional LSTM

LSTM is a recurrent neural network with hidden states h_t and C_t that store short- and long-term sequence feature information, respectively, as shown in Fig. 1. Its three-gate structure realizes long-term memory of the time series. The three gates are the input gate i_t, forgetting gate f_t, and output gate o_t, each with a sigmoid activation function σ.

The parameters of the LSTM model are collected as

$$\begin{bmatrix} \hat{i}_t \\ \hat{o}_t \\ \hat{f}_t \\ \tilde{C}_t \end{bmatrix} = \begin{bmatrix} \mathrm{sigmoid} \\ \mathrm{sigmoid} \\ \mathrm{sigmoid} \\ \tanh \end{bmatrix} \begin{bmatrix} W_i & U_i & b_i \\ W_o & U_o & b_o \\ W_f & U_f & b_f \\ W_c & U_c & b_c \end{bmatrix} \begin{bmatrix} x_t \\ h_{t-1} \\ 1 \end{bmatrix} \quad (1)$$

comprising the 12 LSTM parameters W_*, U_*, and b_*. Through the control of the three gates, the relationship between the front and back of the time series is established, so that the gradient of the model error does not vanish during backpropagation and the error remains unchanged; a network can be created and trained for long-term dependency [17].

At each moment, the importance attached to each past moment is different. If we can set a weight vector over past moments that also changes dynamically from moment to moment, we obtain the attention model. To achieve this, h_(t−1) is replaced by an attention-weighted summary: weight information is added, and the weight proportion α_τ(t) corresponding to past time τ is introduced [18].
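As an illustrative sketch (not the authors' code), the attention reweighting of past hidden and cell states described above (cf. Eqs. 2–4) can be written as a small score-and-softmax routine; the dimensions and the concrete score function g are our assumptions.

```python
import numpy as np

def attention_reweight(x_t, h_hist, c_hist, Wx, Wh_tilde, Wh, v, h_tilde_prev):
    """Compute attention weights over past hidden states and return
    the reweighted summaries (h~, c~) that replace h_{t-1}, c_{t-1}."""
    # Score each past step tau with g(x_t, h~_{t-1}, h_tau) = v^T tanh(...).
    scores = np.array([
        v @ np.tanh(Wx @ x_t + Wh_tilde @ h_tilde_prev + Wh @ h_tau)
        for h_tau in h_hist
    ])
    # Softmax turns raw scores into the weight proportions alpha_tau(t).
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    # Weighted sums over the history replace the plain previous states.
    h_tilde = (alpha[:, None] * h_hist).sum(axis=0)
    c_tilde = (alpha[:, None] * c_hist).sum(axis=0)
    return h_tilde, c_tilde, alpha
```

With all score parameters zero, the weights degenerate to a uniform average over the history, which makes the reweighting easy to check.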
$$\begin{pmatrix} \tilde{h}_{(t-1)} \\ \tilde{c}_{(t-1)} \end{pmatrix} = \sum_{\tau=1}^{t-1} \alpha_{\tau}(t) \begin{pmatrix} h_{(\tau)} \\ c_{(\tau)} \end{pmatrix} \quad (2)$$

$$\alpha_{\tau}(t) = \operatorname{softmax}\left( w_{\tau}(t) \right) \quad (3)$$

$$w_{\tau}(t) := g\left( x_{(t)}, \tilde{h}_{(t-1)}, h_{(\tau)} \right) = v^{T} \tanh\left( W_{x} x_{(t)} + W_{\tilde{h}} \tilde{h}_{(t-1)} + W_{h} h_{(\tau)} \right) \quad (4)$$

The parameters of the attention-LSTM model are collected as

$$\begin{bmatrix} \hat{i}_t \\ \hat{o}_t \\ \hat{f}_t \\ \tilde{C}_t \end{bmatrix} = \begin{bmatrix} \mathrm{sigmoid} \\ \mathrm{sigmoid} \\ \mathrm{sigmoid} \\ \tanh \end{bmatrix} \begin{bmatrix} W_i & U_i & b_i \\ W_o & U_o & b_o \\ W_f & U_f & b_f \\ W_c & U_c & b_c \end{bmatrix} \begin{bmatrix} x_t \\ \tilde{h}_{t-1} \\ 1 \end{bmatrix} \quad (5)$$

The attention mechanism pays more attention to the key parts of the input sequence that affect the output, so as to better learn their information, without increasing the model's computation and storage. It is therefore introduced into the LSTM model to effectively improve forecasting.

2.2 Temporal Convolutional Network

The TCN is essentially a one-dimensional CNN that is optimized and adapted to the time series problem through causal convolution, dilated convolution, and a residual block [16].

Causal convolution ensures that the prediction results at earlier time steps do not involve future data information, which allows the convolutional network to be used in time series models, as in Fig. 2. Dilated ("hole") convolution allows each hidden layer to keep the size of the input sequence of the previous layer, reducing computational effort while enlarging the receptive field and allowing the network to receive information over a longer period of time. With dilated convolution, as the number of layers increases, the convolution window becomes larger and the number of holes in the window increases. For a one-dimensional sequence input x ∈ R^n and a filter f: {0, 1, …, k − 1} → R, the dilated convolution operation F on the sequence element s is

$$F(s) = (x * f_{d})(s) = \sum_{i=0}^{k-1} f(i)\, x_{s-di} \quad (6)$$

where x is the input time series, "∗" denotes convolution, f_d is the filter corresponding to dilation factor d, x_{s−di} is the input sequence element, and k is the filter size. An example of causal convolution with dilation factors d = 1, 2, 4 and filter size k = 3 is shown in Fig. 3, where x_0, x_1, …, x_T are the input sequences and y_0, y_1, …, y_T are the output sequences. Obviously, the output receptive field can cover all the values in the input sequence at and before the time point, as in Fig. 3.

TCN uses residual networks to solve the problem of gradient disappearance or explosion in deep networks. ResNet uses a nonlinear change function F to describe the input and output of a network, i.e., the input is x and the output is o; F usually includes operations such as convolution and activation. The output is expressed as a linear superposition of the input and a nonlinear transformation of the input,

$$o = F(x) + x \quad (7)$$

Since deep learning relies on chain backpropagation of errors for parameter updates, once one of the derivatives is small, the gradient may become increasingly smaller after multiple multiplications, which is referred to as gradient vanishing; for deep networks, the gradient passed to shallow layers almost disappears. With the use of residuals, a constant term of 1 is added to each derivative,

$$\frac{\partial h}{\partial x} = \frac{\partial (f + x)}{\partial x} = 1 + \frac{\partial f}{\partial x} \quad (8)$$
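A minimal sketch of the dilated causal convolution of Eq. (6) and the residual connection of Eq. (7); this is an illustration under our own naming, not the paper's implementation, with zero padding assumed before the start of the sequence.

```python
import numpy as np

def dilated_causal_conv(x, f, d):
    """F(s) = sum_{i=0}^{k-1} f(i) * x[s - d*i]; indices before the
    sequence start are zero-padded, so no future value is ever used."""
    k = len(f)
    y = np.zeros_like(x, dtype=float)
    for s in range(len(x)):
        y[s] = sum(f[i] * (x[s - d * i] if s - d * i >= 0 else 0.0)
                   for i in range(k))
    return y

def residual_block(x, f, d):
    """o = F(x) + x: the identity path keeps gradients from vanishing."""
    return dilated_causal_conv(x, f, d) + x
```

With the identity filter f = [1, 0, 0] the operation returns the input unchanged for any dilation factor, while f = [0, 1] with d = 1 shifts the sequence by one step, which makes the causality easy to verify.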
Fig. 6 Multiple TCN-based real-time LMP forecasting model for electricity spot market

…time series data. The adaptive moment estimation method (Adam) is used for gradient optimization. It dynamically adjusts the first- and second-order moment estimates for different parameter gradients based on the loss function, and has the advantages of high computational efficiency, small memory occupation, and easy implementation.
2.4 Performance Evaluation

We adopt

$$\mathrm{MAPE} = \frac{100\%}{N} \sum_{i=1}^{N} \frac{\left| P_{i}^{\mathrm{pre}} - P_{i}^{\mathrm{real}} \right|}{P_{i}^{\mathrm{real}}} \quad (10)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( P_{i}^{\mathrm{pre}} - P_{i}^{\mathrm{real}} \right)^{2}} \quad (11)$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| P_{i}^{\mathrm{pre}} - P_{i}^{\mathrm{real}} \right| \quad (12)$$

as evaluation metrics of power prediction error, where N is the number of test data points, P_i^pre denotes the i-th predicted power data, and P_i^real denotes the i-th measured power data.

Fig. 7 Real-time tariff fluctuations in September 2019
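The three metrics of Eqs. (10)–(12) compute directly from the predicted and measured series; a sketch with NumPy (function names are ours):

```python
import numpy as np

def mape(p_real, p_pre):
    """Mean absolute percentage error, Eq. (10), in percent."""
    return 100.0 / len(p_real) * np.sum(np.abs(p_pre - p_real) / np.abs(p_real))

def rmse(p_real, p_pre):
    """Root mean square error, Eq. (11)."""
    return np.sqrt(np.mean((p_pre - p_real) ** 2))

def mae(p_real, p_pre):
    """Mean absolute error, Eq. (12)."""
    return np.mean(np.abs(p_pre - p_real))
```

Note that MAPE is undefined when a measured price is zero (or negative, which can occur for LMPs), so the denominator deserves care on real market data.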
Fig. 8 Real-time LMP forecast on July 31, 2019
3 Case Study

3.1 Experimental Settings

The experimental environment was implemented with Python 3.6.2 and the TensorFlow 2.0.0a GPU deep learning framework, on a 64-bit Intel Core i5-7200U CPU at 2.50–2.70 GHz with 8 GB RAM and an Nvidia GeForce 940MX graphics card.

The dataset was the annual whole-point data of the New England electricity market (ISO-NE) in the United States, selected for the Connecticut (CT) region [19]. The real-time electricity price data were collected for 1461 consecutive days, from January 1, 2016 to December 31, 2019, once per hour, for a total of 35,064 moments, including real-time tariff-related data, load-related data, and day-ahead tariff-related data.

Fluctuations of the real-time LMP series are shown in Fig. 7, using a time series decomposition of the 24 full points per day in December of the 2019 dataset. Figure 7 shows, from top to bottom, the original, trend, seasonal, and residual series. It can be seen that the linear autocorrelation of the real-time LMP is not strong and is influenced by random noise.

3.2 Experiment 1: Real-time LMP Forecasting on Two Representative Days

This experiment involved the prediction of 24 h real-time LMP on July 31, 2019. Data from January 1, 2016, to July 31, 2019, were used as the training set, data of July 29–30, 2019, as the validation set, and data from July 31, 2019, as the test set.

We also predicted 24 h real-time LMP on December 31, 2019, with data from January 1, 2016, to December 29, 2019, as the training set, data of December 29–30, 2019, as the validation set, and data of December 31, 2019, as the test set.

Figure 8 and Table 2 compare the 24 h real-time LMP prediction results of the LSTM, S-TCN, and M-TCN models on July 31, 2019; Fig. 9 and Table 3 present the same comparisons for December 31, 2019; and Table 4 shows the overall evaluation.

3.3 Experiment 2: Real-time LMP Intraweek Forecasting

In this experiment, price data were forecast from December 25 to 31, 2019, a span of 168 h. The training set was from January 1, 2016, to December 10, 2019, the validation
Table 2 Forecast result comparisons on July 31, 2019

Predicted time | Actual value ($/MWh) | LSTM value | LSTM APE | S-TCN value | S-TCN APE | M-TCN value | M-TCN APE
4 Conclusion
Acknowledgements This work was supported by the National Key R&D Program of China (No. 2016YFB0900100).

References

1. Su J, Du S, Li T (2007) Research on short-term spot electricity price forecasting method based on multi-factor wavelet decomposition by neural network. Power Autom Equipment 27(11)
2. Du S, Wen B, Jiang C (2004) Power market. China Electric Power Press, pp 1–9
3. Shahidehpour M et al (2005) Market oriented operation of power system (original work; compiled by Du S et al). China Electric Power Press, pp 13–14
4. Zhang J, Tan Z, Wei Y (2020) An adaptive hybrid model for short term electricity price forecasting. Appl Energy 258:114087
5. Heydari A, Nezhad MM, Pirshayan E, Garcia DA, Keynia F, De Santoli L (2020) Short-term electricity price and load forecasting in isolated power grids based on composite neural network and gravitational search optimization algorithm. Appl Energy 277:115503
6. Deng J, Song W, Zio E (2020) A discrete increment model for electricity price forecasting based on fractional Brownian motion. IEEE Access 8:130762–130770. https://doi.org/10.1109/ACCESS.2020.3008797
7. Jahangir H, Tayarani H, Baghali S, Ahmadian A, Elkamel A (2020) A novel electricity price forecasting approach based on dimension reduction strategy and rough artificial neural networks. IEEE Trans Ind Inform 16(4):2369–2361
8. Das R, Bo R et al (2020) Cross-market price difference forecast using deep learning for electricity markets. In: IEEE PES innovative smart grid technologies Europe (ISGT-Europe), virtual, 26–28
9. Liu S, Zhang L, Zou B (2019) Study on electricity market price forecasting with large-scale wind power based on LSTM. In: 2019 6th international conference on dependable systems and their applications (DSA). IEEE. https://doi.org/10.1109/DSA.2019.00045
10. Yang H, Schell KR (2020) HFNet: forecasting real-time electricity price via novel GRU architectures. IEEE PMAPS
11. Khalid R, Javaid N et al (2020) Electricity load and price forecasting using Jaya-long short term memory (JLSTM) in smart grids. Entropy 22(1):10. https://doi.org/10.3390/e22010010
12. Jasiński T (2020) Use of new variables based on air temperature for forecasting day-ahead spot electricity prices using deep neural networks: a new approach. Energy 213:118784
13. Jufri FH, Oh S, Jung J (2019) Day-ahead system marginal price forecasting using artificial neural network and similar-days information. J Electr Eng Technol 14:561–568. https://doi.org/10.1007/s42835-018-00058-w
14. Kwon B-S, Park R-J, Song K-B (2020) Short-term load forecasting based on deep neural networks using LSTM layer. J Electr Eng Technol 15:1501–1509. https://doi.org/10.1007/s42835-020-00424-7
15. Bui DM, Le PD, Cao TM et al (2020) A statistical data-filtering method proposed for short-term load forecasting models. J Electr Eng Technol 15:1947–1967. https://doi.org/10.1007/s42835-020-00460-3
16. Bai S, Kolter JZ, Koltun V (2018) An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. In: Proceedings of the AAAI conference on artificial intelligence (AAAI), New Orleans, LA, United States, pp 2159–2166
17. Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. ICLR
18. Cheng J, Dong L, Lapata M (2016) Long short-term memory-networks for machine reading. arXiv:1601.06733
19. https://www.iso-ne.com/

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.