
Energy Conversion and Management 185 (2019) 783–799


Multifactor spatio-temporal correlation model based on a combination of convolutional neural network and long short-term memory neural network for wind speed forecasting

Yong Chen, Shuai Zhang, Wenyu Zhang, Juanjuan Peng, Yishuai Cai

School of Information, Zhejiang University of Finance and Economics, Hangzhou 310018, China

ARTICLE INFO

Keywords:
Wind speed forecasting
Convolutional neural network
Long short-term memory neural network
Multifactor spatio-temporal correlation
Deep learning

ABSTRACT

The accurate forecasting of wind speed plays a vital role in the transformation of wind energy and the dispatching of electricity. However, the inherent intermittence of wind makes it a challenge to achieve high-precision wind speed forecasting. Many existing studies consider the spatio-temporal correlation of wind speed but ignore the influence of meteorological factors on wind speed with changes in time and space. Therefore, to obtain a reliable and accurate forecasting result, a novel multifactor spatio-temporal correlation model for wind speed forecasting is proposed in this study by combining a convolutional neural network and a long short-term memory neural network. The convolutional neural network is used to extract the spatial feature relationship between the meteorological factors at various sites. The long short-term memory neural network is used to extract the temporal feature relationship between the historical time points. Meanwhile, a new data reconstruction method based on a three-dimensional matrix is developed to represent the proposed multifactor spatio-temporal correlation model. Finally, the datasets collected from the National Wind Institute in Texas, 14 baseline models, 8 evaluation metrics, a performance improvement percentage, and hypothesis testing are used to evaluate the proposed model and provide further discussion comprehensively and scientifically. The experiment results demonstrate that the proposed model outperforms other baseline models in the accuracy of forecasting and the generalization ability.

1. Introduction

In recent years, problems such as environmental pollution and energy depletion have hindered economic development and social progress. To alleviate the energy crisis, renewable energy, which has the advantages of cleanliness and inexhaustibility, has received considerable attention all over the world [1]. Meanwhile, wind energy is considered to be the most promising and exploitable of all renewable energy sources such as tidal energy, solar energy, and biomass energy [2]. However, the intermittent and random characteristics of wind speed, the core factor of wind energy generation, pose a great challenge to wind energy transformation and management. Inaccurate wind speed forecasting may affect power system scheduling decisions [3] and subsequently increase operating costs, reduce energy efficiency, and reduce the reliability of electricity grids [4]. In addition, the relevant research shows that if the accuracy of wind speed forecasting is improved by 10%, wind power generation can increase by approximately 30% compared with the expected values [5]. Therefore, against this background, reducing the errors in wind speed forecasting and enhancing the conversion efficiency of wind energy have become a major initiative of the world's energy development strategy [1].

Over the past few decades, a variety of forecasting models have been proposed and improved to enhance wind speed forecasting performance. These models can be divided into the following categories: physical models, statistical models, machine learning models, deep learning models, combined models, and spatio-temporal correlation (STC) models [6]. The physical models forecast wind speed by using physical parameters such as meteorological and geographical information [1]. However, physical models are not suitable for short-term wind speed forecasting because of their high calculation cost and inability to capture the real-time temporal relationship between various meteorological factors. By contrast, statistical models such as the autoregressive moving average (ARMA) model [7] and the autoregressive integrated moving average (ARIMA) model [8] use the linear relation of each


Corresponding author.
E-mail addresses: chenyong@zufe.edu.cn (Y. Chen), zhangshuai@zufe.edu.cn (S. Zhang), wyzhang@e.ntu.edu.sg (W. Zhang), pengjj81@csu.edu.cn (J. Peng),
caiyishuai@zufe.edu.cn (Y. Cai).

https://doi.org/10.1016/j.enconman.2019.02.018
Received 24 November 2018; Accepted 6 February 2019
0196-8904/ © 2019 Elsevier Ltd. All rights reserved.

variable in the historical time series to realize short-term wind speed forecasting. These models overcome the shortcomings of physical models for short-term wind speed forecasting, but they can only analyze the linear relationship between the variables in the historical time series and have difficulty dealing with nonlinear relationships. With the rapid development of computer technology, machine learning models such as logistic regression (LR) [9], support vector machine (SVM) [10], multi-layer perceptron (MLP) [11], and extreme learning machine (ELM) [12] are widely used in wind speed forecasting. These models can extract complex nonlinear features in wind speed time series and improve the accuracy of the forecast to a certain extent. For instance, Zhang et al. [13] presented a hybrid ELM model based on feature selection and a backtracking search algorithm for parameter optimization to forecast wind speed. The results showed that the model can effectively capture nonlinear features of wind speed series and has good forecast performance compared with statistical models such as the ARIMA model. Nevertheless, these traditional linear and nonlinear models can only extract very superficial features, and they require a lot of feature engineering before features can be extracted. The automatic extraction of time series features cannot be achieved.

Fortunately, as an important branch of machine learning, deep learning has developed rapidly in many fields such as image recognition [14] and text mining [15]. Compared with the shallow learning model, the deep learning model uses a distributed and hierarchical feature representation method [16], which can automatically extract the inherent abstract features and hidden invariant structures in data from the lowest level to the highest level [17]. The excellent feature extraction ability of the deep learning model has attracted the attention of scholars in the fields of wind speed and wind power forecasting. Among them, Wang et al. [2] proposed a convolutional neural network (CNN) model based on an ensemble approach to achieve probabilistic wind power forecasting. Ghaderi et al. [18] used a long short-term memory network (LSTM) model based on STC for wind speed forecasting, and the results showed that the proposed model exhibited a significant improvement over the widely used benchmark models.

The models mentioned above can forecast wind speed accurately. However, each individual model has its advantages and disadvantages. To exploit the complementary advantages of the individual models and further improve the accuracy of forecasting, many combined models that combine multiple individual models have been proposed. For instance, Cadenas et al. [19] developed a combined artificial neural network (ANN) and ARIMA model to fully consider the linear and nonlinear tendencies of wind speed time series. Liu et al. [6] proposed a combined LSTM and CNN model to forecast the wind speed of different frequency sub-layers obtained by a data decomposition algorithm. To improve the generalization ability and robustness of a single LSTM model, Chen et al. [20] combined a cluster of LSTMs with diverse neurons and nonlinear-learning regression. All of the combined models presented in these studies improved the forecast accuracy and reliability compared with the corresponding individual models. Therefore, the combination of individual models will be adopted in this paper.

Moreover, a recent study showed that there is a significant cross-correlation between the wind speed of a target site and its adjacent site [21], and more exact data showed that the distance of correlation between sites can reach at least 435 km [22]. Therefore, with an increasing amount of spatial and temporal correlation data at each site, exploring wind speed forecasting models that use an STC has become a hot research direction. For instance, Baxevani et al. [23] proposed an anisotropic STC model to achieve multi-step ahead probability forecasting for wind power. Zhao et al. [24] presented a correlation-constrained and sparsity-controlled vector autoregressive model by considering the spatial correlation between different wind farms. Ye et al. [25] presented a short-term wind speed forecasting model based on both a physical method and spatial correlation. The spatial correlation was extracted according to the specific spatial position of each wind turbine and the layout of the adjacent wind turbine. Zhu et al. [26] represented the wind speed of 100 sites as a 10 × 10 matrix. A CNN was used to analyze the spatial correlation between various sites, and MLP was used to integrate the spatial features of each historical time point to realize STC wind speed forecasting. These models can make effective use of the geographical features of each site and obtain good forecasting results. On the other hand, an STC model can also solve the difficulty of the lack of wind speed data at some sites.

As mentioned above, the existing combined model and STC model can improve the forecast performance to a certain extent, but there are still some drawbacks that need to be explored: (a) There is a tight coupling relationship between various meteorological factors such as temperature, wind speed, and wind direction. The existing STC model considers the STC of wind speed changes between different sites but ignores the influence of the meteorological factors on the wind speed in time and space. An STC analysis based on multiple meteorological factors has not been studied. (b) Extracting spatial correlation requires a lot of prior knowledge and professional background. Meanwhile, there are some hidden abstract feature relationships that cannot be obtained directly and effectively. (c) CNN and LSTM are good at extracting spatial features and time-series features, respectively. However, the existing STC model can only take advantage of the feature extraction capability of either CNN or LSTM; a model that combines the advantages of CNN and LSTM to achieve simultaneous extraction of the correlation between time and space has not been studied. Thus, the motivation behind this study is to construct a multifactor spatio-temporal correlation (MFSTC) model by combining CNN and LSTM to overcome the above shortcomings and achieve more accurate and reliable wind speed forecasting. The main contributions and novelty of this study are as follows:

(1) Proposed a new STC model: To make use of the relationship between various meteorological factors and wind speed, an MFSTC model is presented in this study. This model can consider the multiple temporal and spatial correlations between time, site, and meteorological factors simultaneously, and enhance the reliability of wind speed forecasting.
(2) Presented a new data reconstruction method: To clearly represent the proposed MFSTC model, a novel data reconstruction method based on a three-dimensional (3D) matrix is put forward. The matrix contains values for each meteorological factor of each site at multiple historical time points. At the same time, each reconstructed matrix is used as an input of the proposed forecasting model.
(3) Developed a new deep learning-based combined forecasting approach: To simultaneously extract the temporal and spatial correlations and obtain a satisfactory forecasting result, a CNN-LSTM model with a multiple-inputs single-output combination structure is developed. A CNN is used to extract the spatial feature relationships between the meteorological factors at various sites. Meanwhile, the abstract spatial feature relation vector obtained by the CNN at each historical time point will be used as the input of the LSTM. This combined model of CNN-LSTM based on deep learning not only automatically extracts deep spatial and temporal features but can also effectively handle the difficulty in obtaining complex geographic features of various sites.
(4) Conducted a comprehensive and scientific model evaluation and discussion: To verify the practicability of the model, datasets from the National Wind Institute in Texas containing 46 sites are used in experiments. Eight common evaluation metrics and 14 baseline models are employed to verify the accuracy and superiority of the proposed model. Moreover, a Diebold–Mariano (DM) test [27] and a performance improvement percentage are introduced into the discussion, and the effectiveness of the proposed model is further demonstrated. The experimental results show that the proposed model achieves excellent performance compared with other baseline models.


The rest of the paper is organized as follows. Section 2 provides the detailed information on the wind speed forecasting model proposed in this paper. Section 3 introduces some metrics to evaluate the performance of the model. The results and analysis of the experiments are presented in Sections 4 and 5. Section 6 puts forward some conclusions of this study and prospects for future work.

2. Proposed combined wind speed forecasting model

In this section, before formally introducing the model proposed in this paper, the basic concepts of MFSTC, CNN, and LSTM are described.

2.1. Mathematical representation of multifactor spatio-temporal correlation model based on three-dimensional matrix

The basic idea of a spatio-temporal model is that the wind speed features of a region are similar to those of its adjacent regions [28]. However, some meteorological factors are closely related to wind speed changes, including temperature, air pressure, humidity, and wind direction. Therefore, making full use of the meteorological information of the target site and its surrounding site will help improve the accuracy and reliability of wind speed forecasting. The core of the MFSTC model based on a 3D matrix is that it simultaneously considers multiple correlations of sites and meteorological factors in the dimensions of time and space. The correlations include three aspects: correlation between multiple sites, correlation between multiple factors, and correlation between sites and factors. The model structure is shown in Fig. 1, in which the 3D matrix can be represented by

$$SFT = \{TF, TS, FS\} = \{E_{sft}\} \tag{1}$$

where $SFT$ represents a 3D matrix of "site-factor-time". $TF$, $TS$, and $FS$ represent a two-dimensional plane set of "time-factor", "time-site", and "factor-site", respectively. The corresponding plane segmentation operation is shown in Fig. 1(II). $E_{sft}$ is a set of 3D matrix points, and each point can be defined as $e(s, f, t)$, which represents the value of the $f$th meteorological factor of the $s$th site at historical time point $t$. Meanwhile, as shown in Fig. 1(III), each two-dimensional plane can be represented by a two-dimensional matrix. Among them, $TF_s$ contains values of all meteorological factors that change with time at $T$ historical time points at site $s$, $TS_f$ contains values of the $f$th meteorological factor changing with time at $T$ historical time points at each site, and $FS_t$ contains values of all meteorological factors of each site at historical time point $t$. $S$, $F$, and $T$ represent the numbers of sites, meteorological factors, and historical time points, respectively. Further, the correlation coefficient $R_{e_1}^{e_2}$ between any two points $e_1(s, f, t)$ and $e_2(s, f, t)$ can be expressed as

$$R_{e_1}^{e_2} = Corr(e_1(s, f, t), e_2(s, f, t)) \tag{2}$$

Then, based on the correlation coefficient between points, the wind speed of the target site at the next moment can be calculated from the meteorological factor values of each site at $T$ historical time points. The formula is defined as

$$e_a(q, wind, t+1) = \sum_{s=1}^{S} \sum_{f=1}^{F} \sum_{t=1}^{T} \left( e_g(s, f, t)\, R_{e_a}^{e_g} + \varepsilon_{ag} \right) \tag{3}$$

where $e_g(s, f, t)$ is the value of a point in $E_{sft}$, $e_a(q, wind, t+1)$ is the value of the wind speed forecast of target site $q$ at the next moment $t+1$, $R_{e_a}^{e_g}$ is the correlation coefficient between points $e_a(q, wind, t+1)$ and $e_g(s, f, t)$, and $\varepsilon_{ag}$ is the error term.

2.2. Convolutional neural network

A CNN is a multi-layer feed-forward artificial neural network and has been proven to have salient performance in extracting hidden spatial features [29]. CNNs are widely used to solve image recognition and classification tasks. Compared with other types of neural networks such as the deep belief network (DBN) [16], a CNN has the characteristics of sparse connectivity and weight sharing [30]. These two characteristics of a CNN greatly reduce the number of parameters it needs to learn. In this study, a CNN will be used to extract the potential spatial relationship between the meteorological factors of a target site and its adjacent site to reduce the forecasting errors of wind speed. The architecture of a CNN is shown in Fig. 2 and consists of three parts: convolutional layer, pooling layer, and fully connected layer. The computation of the CNN can be defined as [31]:

$$map_{mn}^{uv} = relu\left( \sum_{l} \sum_{h=0}^{h'-1} \sum_{w=0}^{w'-1} \omega_{mnl}^{hw} \cdot map_{(m-1)l}^{(u+h)(v+w)} + b_{mn} \right) \tag{4}$$

where $u$ and $v$ are the indexes of the rows and columns of the feature map, respectively. $h$ and $w$ are the indexes of the rows and columns of the convolution filter, respectively. $h'$ and $w'$ are the numbers of rows and columns of the convolution filter, respectively. $l$ is the index of the feature maps in the $(m-1)$th layer. $b_{mn}$ is the bias of the $n$th feature

Fig. 1. Structure of MFSTC model based on 3D matrix.


Fig. 2. Basic structure of CNN model.

map in the $m$th layer. $\omega_{mnl}^{hw}$ is the value of a position $(h, w)$ in the convolution filter that connects the $l$th feature map in the $(m-1)$th layer with the $n$th new feature map in the $m$th layer. $map_{mn}^{uv}$ is the value of a position $(u, v)$ in the $n$th new feature map in the $m$th layer. $map_{(m-1)l}^{(u+h)(v+w)}$ is the value of the position $(u+h, v+w)$ in the $l$th feature map in the $(m-1)$th layer. $relu$ is the activation function for each layer of the CNN, and its formula is defined in Eq. (5) [32]. For more detailed descriptions of CNN, please refer to [30].

$$relu(x) = \max(0, x) \tag{5}$$

2.3. Long short-term memory network

As a special variant of a recurrent neural network (RNN) [30], the LSTM has the ability to automatically store and remove temporal state information. Therefore, it can extract complex feature relationships of long and short time series and significantly alleviate the issue of vanishing gradients in RNNs [33]. The LSTM has a special built-in memory block, which determines the addition and deletion of memory information through three gate structures: input gate, forget gate, and output gate. There is a memory cell (MC) in the memory block that stores temporal state information about current and past moments. The forget gate determines which information needs to be retained in the previous memory cell. The input gate determines which part of the input information needs to be updated. The output gate determines which part of the memory block information is exported. The structure of a memory block is shown in Fig. 3. The detailed implementation process of the LSTM and the calculation of the corresponding gates are shown below [34]:

$$for_t = \sigma(W_{fx} x_t + W_{fh} h_{t-1} + W_{fm} Mc_{t-1} + b_{for}) \tag{6}$$

$$in_t = \sigma(W_{ix} x_t + W_{ih} h_{t-1} + W_{im} Mc_{t-1} + b_{in}) \tag{7}$$

$$\widetilde{Mc}_t = \tanh(W_{mx} x_t + W_{mh} h_{t-1} + b_{mc}) \tag{8}$$

$$Mc_t = in_t \odot \widetilde{Mc}_t + for_t \odot Mc_{t-1} \tag{9}$$

$$out_t = \sigma(W_{ox} x_t + W_{oh} h_{t-1} + W_{om} Mc_t + b_{out}) \tag{10}$$

$$h_t = out_t \odot \tanh(Mc_t) \tag{11}$$

where $x_t$, $h_{t-1}$, and $Mc_{t-1}$ are the input vector at time point $t$, the output of the memory block at time point $t-1$, and the state information of the memory cell at time point $t-1$, respectively. $for_t$, $in_t$, $\widetilde{Mc}_t$, $Mc_t$, $out_t$, and $h_t$ are the output of the forget gate, input gate, temporary memory cell, new memory cell, output gate, and memory block at time point $t$, respectively. $W_{fx}$, $W_{fh}$, $W_{fm}$, $W_{ix}$, $W_{ih}$, $W_{im}$, $W_{mx}$, $W_{mh}$, $W_{ox}$, $W_{oh}$, and $W_{om}$ represent the corresponding weight matrices. $b_{for}$, $b_{in}$, $b_{mc}$, and $b_{out}$ are the biases of the forget gate, input gate, memory cell, and output gate, respectively. $\odot$ denotes the Hadamard product [35] between two matrices. $\sigma$ and $\tanh$ are two different activation functions, and their formulas are defined in Eqns. (12) [36] and (13) [6]. For more detailed descriptions of LSTM, please refer to [33].

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{12}$$

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{13}$$

2.4. Process of the proposed combined model

In this study, to reduce the influence of the intermittence of the wind on wind speed forecasting, an MFSTC-CNN-LSTM model is proposed. The model not only considers the spatio-temporal correlation (STC) of wind speed in the process of propagation and change, but also considers the intrinsic relationship between the wind speed and various meteorological factors. To simultaneously extract in depth the hidden correlation between the three dimensions of time, space, and meteorological factors, a combined CNN-LSTM model with a multiple-inputs single-output combination structure is proposed. CNN has been proven to be a reliable technology to extract spatial features [29]. Compared with other deep learning models, such as MLP and DBN, CNN has more efficient computation ability and spatial feature extraction ability because of its characteristics of sparse connectivity and local perception. However, for feature extraction from long and short time series, CNN is similar to an ordinary neural network and has no advantage. On the other hand, the LSTM, which is good at automatically storing information, is specially designed for extracting complex temporal features, but it is not good at extracting spatial features. Therefore, the effective combination of CNN and LSTM will give full play to their respective capabilities in extracting time and space features, and finally achieve the deep extraction of the correlation features between meteorological factors of multiple sites. The framework of the MFSTC-CNN-LSTM model is depicted in Fig. 4. The detailed implementation process of the model is described as follows:

(1) Historical meteorological data are collected from the website of the National Wind Institute in Texas, USA. The details of the meteorological information for each site are presented in Sub-Section 4.1.
(2) To apply the proposed MFSTC-CNN-LSTM model, the input data from each sub-dataset is reconstructed. As shown in Part 2 of Fig. 4, after data reconstruction, each input of the model is a 3D matrix consisting of three dimensions: time, site, and meteorological factor. The matrix contains values for all meteorological factors of
Fig. 3. Basic structure of LSTM memory block.


Fig. 4. Process of proposed wind speed forecasting model.

all sites at $T$ historical time points. In this study, the value of $T$ is 12, that is, the total time span of the data used is 1 h. The details of the MFSTC model and 3D matrix are depicted in Sub-Section 2.1.
(3) The correlation between the points in the matrix is extracted and analyzed by the CNN-LSTM, and the wind speed forecasting value at the target site is output. The CNN-LSTM effectively combines the advantages of CNN and LSTM, that is, CNN and LSTM are effective at extracting hidden feature information from spatial and temporal dimensions, respectively. The structure of the CNN-LSTM is shown in Part 3 of Fig. 4. First, the CNN is executed to extract the spatial feature relationship between all meteorological factors at each site. The input of the CNN is a two-dimensional matrix $FS_t$ that contains values of all meteorological factors of each site at a historical time point $t$. The CNN consists of two convolutional layers, one max-pooling layer, and one fully connected layer. The "Conv" and "FC" in the figure represent the convolution and fully connected operations, respectively. The details of the CNN are illustrated in Sub-Section 2.2. Note that, to prevent the loss of the boundary features of the input matrix, the "same" padding operation [30] is used in the first convolutional layer and the pooling layer. Through the feature extraction of the CNN, the abstract correlation between all meteorological factors of each site at a historical time point $t$ can be obtained, which is represented by a feature vector. Then, the $T$ feature vectors obtained by the CNN are used as input for the LSTM, and the temporal feature relationships are extracted and analyzed. The details of the LSTM are depicted in Sub-Section 2.3. In this study, the output of the LSTM is a feature vector of length 128, which represents the abstract correlation between time, site, and meteorological factor. Further, to extract deeper feature relationships, the output layer of the LSTM is connected to a fully connected layer that consists of 256 neurons. Last, all neurons in the fully connected layer are connected to the output layer to obtain the wind speed forecasting results of the target site.
(4) As shown in Part 4 of Fig. 4, to verify the practicability of the proposed model, three sub-datasets with different site distributions are selected for experimental analysis. Each dataset includes the data about multiple target sites for performance evaluation. Furthermore, several baseline models and evaluation metrics are selected to verify the accuracy and superiority of the proposed model. To show the effectiveness of the proposed model more intuitively, a DM test and a performance improvement percentage are adopted. The details of the evaluation metrics and test functions are provided in Section 3.
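The data reconstruction in steps (2) and (3) can be illustrated with a minimal Python sketch. It assumes the raw records are held as a nested list indexed `[time][site][factor]`; the function name `build_samples`, its arguments, and the toy values are hypothetical and are not taken from the paper. Each sample stacks T consecutive "factor-site" matrices (the per-time-point CNN inputs) and pairs them with the wind speed of the target site at the next time point.

```python
from typing import List, Tuple

Frame = List[List[float]]  # one factor-site matrix: rows = sites, columns = factors


def build_samples(data: List[Frame], T: int, target_site: int,
                  wind_index: int) -> List[Tuple[List[Frame], float]]:
    """Slide a window of T time points over data[time][site][factor].

    Returns (X, y) pairs: X holds T consecutive factor-site matrices and
    y is the wind speed of the target site at the following time point.
    """
    if T < 1:
        raise ValueError("T must be at least 1")
    samples = []
    # Each window must leave one later time point to serve as the target.
    for start in range(len(data) - T):
        window = data[start:start + T]  # nested shape (T, S, F)
        target = data[start + T][target_site][wind_index]
        samples.append((window, target))
    return samples


# Toy data (assumed): 15 time points, 3 sites, 2 factors; factor 0 is wind speed.
toy = [[[float(t + s), 20.0 + s] for s in range(3)] for t in range(15)]
samples = build_samples(toy, T=12, target_site=1, wind_index=0)

first_window, first_target = samples[0]
print(len(samples))                                   # number of (X, y) pairs
print(len(first_window), len(first_window[0]), len(first_window[0][0]))
print(first_target)                                   # site 1 wind speed at time index 12
```

With T = 12, each window covers the 1 h span stated in step (2) if the records are 5 min apart, which is an inference from the text rather than a stated sampling rate.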

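The memory-block update of Sub-Section 2.3 (Eqs. (6) to (11)) can also be sketched in a few lines. The version below is a simplified scalar form, written only to make the order of operations concrete: the input, the block output, and the cell state are single numbers, and the parameter names in the dictionary (`w_fx`, `w_fh`, and so on) are assumed labels for the weights and biases, not the paper's notation.

```python
import math
from typing import Dict, Tuple


def sigmoid(x: float) -> float:
    """Logistic activation, Eq. (12)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h_prev: float, mc_prev: float,
              p: Dict[str, float]) -> Tuple[float, float]:
    """One scalar memory-block update following Eqs. (6)-(11).

    x is the current input, h_prev the previous block output and
    mc_prev the previous memory-cell state. Returns (h, mc).
    """
    forget = sigmoid(p["w_fx"] * x + p["w_fh"] * h_prev
                     + p["w_fm"] * mc_prev + p["b_for"])      # Eq. (6)
    inp = sigmoid(p["w_ix"] * x + p["w_ih"] * h_prev
                  + p["w_im"] * mc_prev + p["b_in"])          # Eq. (7)
    mc_tmp = math.tanh(p["w_mx"] * x + p["w_mh"] * h_prev
                       + p["b_mc"])                           # Eq. (8)
    mc = inp * mc_tmp + forget * mc_prev                      # Eq. (9)
    out = sigmoid(p["w_ox"] * x + p["w_oh"] * h_prev
                  + p["w_om"] * mc + p["b_out"])              # Eq. (10)
    h = out * math.tanh(mc)                                   # Eq. (11)
    return h, mc


# Illustrative parameters only: all weights and biases set to zero.
names = ["w_fx", "w_fh", "w_fm", "b_for", "w_ix", "w_ih", "w_im", "b_in",
         "w_mx", "w_mh", "b_mc", "w_ox", "w_oh", "w_om", "b_out"]
zero_params = {name: 0.0 for name in names}
h, mc = lstm_step(x=1.0, h_prev=0.0, mc_prev=2.0, p=zero_params)
print(h, mc)
```

With all parameters at zero, every gate evaluates to 0.5 and the temporary cell to 0, so the new cell state is half the previous one; this makes the role of the forget gate in Eq. (9) easy to check by hand. In the vector case the products with gate outputs become Hadamard products.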

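Table 1
Forecasting performance evaluation metrics.

| Metrics | Definition | Equation |
|---|---|---|
| SSE | Sum of squares for error | $SSE = \sum_{i=1}^{K} (A_i - P_i)^2$ |
| MAE | Mean absolute error of the forecasting results | $MAE = \frac{1}{K} \sum_{i=1}^{K} \lvert A_i - P_i \rvert$ |
| RMSE | Square root of average of the error squares | $RMSE = \sqrt{\frac{1}{K} \sum_{i=1}^{K} (A_i - P_i)^2}$ |
| SDE | Standard deviation of error | $SDE = \sqrt{\frac{1}{K} \sum_{i=1}^{K} \left( A_i - P_i - \frac{1}{K} \sum_{i=1}^{K} (A_i - P_i) \right)^2}$ |
| U1 | Theil U statistic 1 of the forecasting results | $U1 = \sqrt{\sum_{i=1}^{K} (A_i - P_i)^2} \Big/ \left( \sqrt{\sum_{i=1}^{K} A_i^2} + \sqrt{\sum_{i=1}^{K} P_i^2} \right)$ |
| IA | Index of agreement of the forecasting results | $IA = 1 - \sum_{i=1}^{K} (A_i - P_i)^2 \Big/ \sum_{i=1}^{K} \left( \lvert P_i - \bar{A} \rvert + \lvert A_i - \bar{A} \rvert \right)^2$ |
| DA | Direction accuracy of the forecasting results | $DA = \frac{100}{K-1} \sum_{i=1}^{K-1} \delta_i$, where $\delta_i = 1$ if $(A_{i+1} - A_i)(P_{i+1} - A_i) > 0$ and $\delta_i = 0$ otherwise |
| PCC | Pearson correlation coefficient | $PCC = \sum_{i=1}^{K} (A_i - \bar{A})(P_i - \bar{P}) \Big/ \sqrt{\sum_{i=1}^{K} (A_i - \bar{A})^2 \sum_{i=1}^{K} (P_i - \bar{P})^2}$ |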
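The metrics in Table 1 can be computed directly from the actual and forecasted series. The following is a minimal pure-Python sketch under stated assumptions: the function name `evaluate` and the toy series are hypothetical, SDE uses the population form (division by K), and the index of agreement is implemented in its standard form with the absolute deviations of both the forecast and the actual value taken from the mean of the actual values.

```python
import math
from typing import Dict, Sequence


def evaluate(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute the eight Table 1 metrics for two equally long series."""
    k = len(actual)
    if k != len(predicted) or k < 2:
        raise ValueError("series must have equal length of at least 2")
    pairs = list(zip(actual, predicted))
    errors = [a - p for a, p in pairs]
    mean_a = sum(actual) / k
    mean_p = sum(predicted) / k
    mean_e = sum(errors) / k

    sse = sum(e * e for e in errors)
    mae = sum(abs(e) for e in errors) / k
    rmse = math.sqrt(sse / k)
    sde = math.sqrt(sum((e - mean_e) ** 2 for e in errors) / k)
    u1 = math.sqrt(sse) / (math.sqrt(sum(a * a for a in actual))
                           + math.sqrt(sum(p * p for p in predicted)))
    ia = 1.0 - sse / sum((abs(p - mean_a) + abs(a - mean_a)) ** 2 for a, p in pairs)
    # A direction is correct when the forecast moves the same way as the
    # actual series relative to the last observed value.
    hits = sum(1 for i in range(k - 1)
               if (actual[i + 1] - actual[i]) * (predicted[i + 1] - actual[i]) > 0)
    da = 100.0 * hits / (k - 1)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in pairs)
    pcc = cov / math.sqrt(sum((a - mean_a) ** 2 for a in actual)
                          * sum((p - mean_p) ** 2 for p in predicted))
    return {"SSE": sse, "MAE": mae, "RMSE": rmse, "SDE": sde,
            "U1": u1, "IA": ia, "DA": da, "PCC": pcc}


# Toy series (assumed values): the forecast is the actual series shifted by +1.
metrics = evaluate([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A constant bias such as the one in the toy series leaves SDE at zero and PCC at one while SSE, MAE, and RMSE are non-zero, which shows why several metrics are reported together. The sketch does not guard against constant series, for which the IA and PCC denominators vanish.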

3. Performance evaluation metric

To verify the performance of the proposed wind speed forecasting model, several evaluation metrics and statistical testing methods are employed for experimental comparison. In this section, the relevant evaluation metrics will be introduced in detail.

3.1. Typical evaluation metric

To the best knowledge of the authors, there are many metrics used in various studies to measure the forecasting ability of the models [37]. However, there is no uniform criterion for selecting evaluation metrics [38]. In this study, eight common metrics are selected to evaluate the performance of the models. The names, definitions, and calculation formulas of the metrics are listed in Table 1. Here, $K$ is the total sample number of the forecasting series. $A_i$ and $P_i$ are the actual and forecasted values of the $i$th wind speed sample, respectively. $\bar{A}$ and $\bar{P}$ are the averages of the actual and forecasted values of all wind speed samples, respectively. Except for IA, DA, and PCC, a smaller metric value indicates that the model has better forecasting accuracy. For more detailed descriptions of SSE, MAE, RMSE, U1, IA, and DA, please refer to reference [37]. For a detailed description of PCC and SDE, please refer to Refs. [38] and [39], respectively. In addition, to better analyze the forecasting effectiveness of the proposed model, the improvement percentage of each evaluation metric of the proposed model is used in this study. The formula for calculating the percentage of performance improvement between the proposed model and baseline models is defined in Eqns. (14) and (15):

$$IMP_1 = \frac{ME_{baseline} - ME_{proposed}}{ME_{baseline}} \times 100\% \tag{14}$$

$$IMP_2 = \frac{ME_{proposed} - ME_{baseline}}{ME_{baseline}} \times 100\% \tag{15}$$

where $IMP_1$ is used to calculate the improvement percentage of SSE, MAE, RMSE, SDE, and U1. $IMP_2$ is used to calculate the improvement percentage of IA, DA, and PCC. $ME_{proposed}$ and $ME_{baseline}$ are the evaluation values of each metric of the proposed model and other baseline models, respectively.

3.2. Diebold–Mariano test

To further determine whether the forecast results of the proposed model are significantly superior to those of other baseline models, a widely used DM test method is adopted in this paper. The relevant formulas for the DM test [27] are shown below.

First, the null hypothesis and the alternative hypothesis are defined in Eqns. (16) and (17), respectively [27]:

$$H_0: E[L(r_i^1) - L(r_i^2)] = 0 \tag{16}$$

$$H_1: E[L(r_i^1) - L(r_i^2)] \neq 0 \tag{17}$$

where $H_0$ indicates that the expected forecast errors of the proposed model and the baseline model are equal, that is, there is no significant difference in the forecast performance of the two models. The meaning of $H_1$ is opposite to that of $H_0$. $r_i^1$ and $r_i^2$ are the $i$th forecast residuals of the two models. $L$ is a loss function whose calculation formula is defined in Eq. (18) [27]:

$$L(r_i^j) = (r_i^j)^2, \quad j = 1, 2 \tag{18}$$

Finally, the DM test statistic can be represented by Eq. (19) [27]:

$$DM = \frac{\sum_{i=1}^{K} [L(r_i^1) - L(r_i^2)] / K}{\sqrt{S^2 / K}}\, s^2 \tag{19}$$

where $DM$ is the value of the test statistic, $K$ is the length of the forecast series, $S^2$ is an estimation of the variance of $L(r_i^1) - L(r_i^2)$, and $s^2$ is an adjustment value of the DM test statistic. When the significance level is $\alpha$, if the value of $DM$ falls into the acceptance domain $[-Z_{\alpha/2}, Z_{\alpha/2}]$, the null hypothesis is accepted, and the alternative hypothesis is rejected. Conversely, if the value of $DM$ falls into the rejection domain $(-\infty, -Z_{\alpha/2})$ or $(Z_{\alpha/2}, +\infty)$, the null hypothesis is rejected, and the alternative hypothesis is accepted.

4. Experiments

In this section, the performance of the proposed MFSTC-CNN-LSTM model is verified by three experiments. In each experiment, different datasets are used to forecast the wind speed of different target sites in the spring and summer, and eight performance evaluation metrics are used to evaluate each model. To ensure the fairness of the model comparison and the effectiveness of the experiments, the same hyperparameters are assigned to each model in the three experiments. Among them, the size of the batch, number of iteration epochs, value of the regularization coefficient, and value of the learning rate are set to 128, 2000, 0.34, and 0.3, respectively [30]. At the same time, the "Adam" optimizer is used to optimize the network parameters, and the "MSE" loss function is used as the objective function of the network [30]. In addition, to extract different wind speed features in the corresponding region, the internal parameters of the model will be trained and adjusted by different sub-datasets of different experiments. All of the experiments are developed in Python version 3.6, and the relevant deep learning algorithms are implemented using the Keras deep learning package (https://keras.io/). The core configuration of the computer includes an Intel Core i7-8700K processor running at 3.7 GHz and a 64-bit system with 32 GB of RAM.


Table 2
Site selection for each sub-dataset.
Sub-dataset Site type Site name

Sub-dataset 1 All sites PADU, JAYT, ASPE, KNOX, HASK, CROW, THRO
Target sites ASPE, THRO

Sub-dataset 2 All sites HERE, HART, AMNW, UMBA, TULI, AMAN, AMAS, CLAU, PANH, GDNT, CLAR, SILY, PAMP, VALL, MEMP, MCLE
Target sites CLAU, AMNW

Sub-dataset 3 All sites LAMS, LEVE, WELC, BROW, ODON, ANTO, ABER, WOLF, REE, NEWH, TAHO, LBBW, SLAT, RALL, POST, MACY, FLOY, ALAN, AIKE, SPUR, FLUV,
GAIL, ROAR
Target sites SLAT, RALL, MACY
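
The site selection in Table 2 and the 6:2:2 split described in Section 4.1 can be sketched in a few lines of Python. This is an illustrative sketch only: the names `SUB_DATASET_1` and `split_sizes` are hypothetical, the split is assumed to be chronological and contiguous, and the rounding rule is an assumption, since the paper does not state how fractional sizes are handled. The 90-day season length is an inference from the reported count of 25,920 five-minute observations, not a figure given in the paper.

```python
# Sub-dataset 1 as listed in Table 2 (dictionary layout is hypothetical).
SUB_DATASET_1 = {
    "all_sites": ["PADU", "JAYT", "ASPE", "KNOX", "HASK", "CROW", "THRO"],
    "target_sites": ["ASPE", "THRO"],
}

# Number of observations per day at a 5-minute sampling interval.
SAMPLES_PER_DAY = 24 * 60 // 5


def split_sizes(n_samples, ratios=(6, 2, 2)):
    """Return (train, validation, test) sizes for a ratio such as 6:2:2.

    Assumption: train and validation sizes are rounded down and the
    remainder goes to the test set.
    """
    total = sum(ratios)
    n_train = n_samples * ratios[0] // total
    n_val = n_samples * ratios[1] // total
    n_test = n_samples - n_train - n_val
    return n_train, n_val, n_test


def chronological_split(series, ratios=(6, 2, 2)):
    """Split a time-ordered sequence into contiguous train/val/test parts."""
    n_train, n_val, _ = split_sizes(len(series), ratios)
    return (
        series[:n_train],
        series[n_train:n_train + n_val],
        series[n_train + n_val:],
    )


spring_sizes = split_sizes(25918)  # spring observation count from Section 4.1
summer_sizes = split_sizes(25920)  # summer observation count from Section 4.1
```

Under these assumptions the test portion holds roughly 5,180 to 5,190 points per season; windowing of the input sequences could reduce that number slightly.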

Table 3
Comparison of forecasting results of spring and summer wind-speed datasets between proposed model and baseline models (Experiment 1).
Site name Seasons Models Evaluation metrics

SSE MAE RMSE SDE U1 IA DA PCC

ASPE Spring ARIMA 11714.8984 1.1040 1.5037 1.5037 0.0628 0.9969 39.3050 0.9652
MLP 12072.4160 1.1086 1.5265 1.5188 0.0635 0.9967 39.1120 0.9641
CNN 12376.1260 1.1060 1.5456 1.5439 0.0645 0.9966 37.0270 0.9628
LSTM 12067.3496 1.1050 1.5262 1.5189 0.0634 0.9967 38.7838 0.9642
CNN-LSTM 12410.0440 1.1033 1.5477 1.5473 0.0650 0.9966 37.5869 0.9627
MFSTC-CNN 11004.0469 1.0716 1.4574 1.4563 0.0612 0.9970 43.0502 0.9670
MFSTC-CNN-LSTM 10809.8447 1.0652 1.4445 1.4444 0.0605 0.9971 42.3359 0.9675
Summer ARIMA 11607.8971 1.0979 1.4968 1.4968 0.0713 0.9958 43.2432 0.9320
MLP 12969.4463 1.1629 1.5822 1.5639 0.0749 0.9953 37.8958 0.9240
CNN 12244.2637 1.1100 1.5373 1.5357 0.0738 0.9955 41.9884 0.9268
LSTM 12149.1914 1.1171 1.5313 1.5242 0.0738 0.9955 42.2973 0.9282
CNN-LSTM 12682.6934 1.1270 1.5646 1.5624 0.0751 0.9954 39.2471 0.9242
MFSTC-CNN 10505.2178 1.0463 1.4240 1.4237 0.0679 0.9962 46.2934 0.9375
MFSTC-CNN-LSTM 10158.9609 1.0384 1.4003 1.3960 0.0666 0.9963 46.9112 0.9400

THRO Spring ARIMA 22981.4320 1.5330 2.1061 2.1061 0.0971 0.9924 50.3861 0.9169
MLP 27958.6016 1.7006 2.3230 2.2982 0.1064 0.9905 43.7839 0.8977
CNN 27973.5801 1.6803 2.3236 2.3214 0.1075 0.9905 44.7876 0.8956
LSTM 26015.9004 1.6222 2.2413 2.2398 0.1039 0.9911 46.9486 0.9032
CNN-LSTM 27933.9258 1.6859 2.3220 2.3178 0.1073 0.9905 43.3591 0.8959
MFSTC-CNN 21719.4746 1.5266 2.0475 2.0449 0.0955 0.9926 52.5290 0.9200
MFSTC-CNN-LSTM 21170.5234 1.4837 2.0214 2.0214 0.0935 0.9929 52.3938 0.9220
Summer ARIMA 24182.7105 1.5785 2.1605 2.1605 0.1127 0.9894 50.6371 0.8318
MLP 28493.0859 1.7459 2.3451 2.3210 0.1213 0.9872 46.2355 0.7939
CNN 28078.9160 1.7196 2.3280 2.3195 0.1212 0.9874 47.2201 0.7947
LSTM 25810.9688 1.6289 2.2320 2.2135 0.1193 0.9883 49.2664 0.8136
CNN-LSTM 28340.1211 1.7125 2.3388 2.3355 0.1220 0.9874 46.1969 0.7931
MFSTC-CNN 21324.6563 1.5056 2.0288 2.0281 0.1069 0.9904 53.6680 0.8463
MFSTC-CNN-LSTM 20877.7207 1.4857 2.0074 2.0059 0.1049 0.9907 54.0541 0.8504

Bold values represent the optimal value of each evaluation metric.
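
Several of the error metrics in Table 3 are algebraically linked, which offers a quick consistency check on the reported values. The sketch below assumes the conventional definitions (SSE as the sum of squared errors, RMSE as the root of the mean squared error, SDE as the standard deviation of the errors); the paper's own definitions are given in its Section 3.1 and may differ in detail. The function names are hypothetical.

```python
import math


def error_metrics(actual, forecast):
    """Compute SSE, MAE, RMSE, bias and SDE under conventional definitions."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    sse = sum(e * e for e in errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sse / n)
    bias = sum(errors) / n
    # SDE: standard deviation of the errors around their mean (the bias).
    sde = math.sqrt(sum((e - bias) ** 2 for e in errors) / n)
    return {"sse": sse, "mae": mae, "rmse": rmse, "bias": bias, "sde": sde}


def implied_sample_size(sse, rmse):
    """Since RMSE^2 = SSE / n, the number of forecast points is SSE / RMSE^2."""
    return sse / rmse ** 2


# Small synthetic example (not data from the paper).
metrics = error_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 5.0])

# ARIMA row for the ASPE site in spring, taken from Table 3.
n_points = implied_sample_size(11714.8984, 1.5037)
```

With these definitions, RMSE² equals bias² plus SDE², so SDE can never exceed RMSE, and a model whose SDE equals its RMSE has essentially unbiased errors. The implied number of forecast points for the ARIMA row is about 5,181, which is consistent with a test set of roughly 20% of the 25,918 spring observations.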

4.1. Data description

In this paper, datasets containing 46 sites are collected from the website of the National Wind Institute in Texas, USA (http://www.depts.ttu.edu). Each set of site data contains measurements of wind speed, wind direction, temperature, dew point temperature, gust, altimeter setting, and relative humidity for an observation interval of 5 min from January 1 to June 29, 2018. According to the distribution of the sites and the weather patterns of different seasons, the original datasets are divided into three sub-datasets, each containing data for the spring and summer seasons. The selection of all sites and target sites for each sub-dataset is shown in Table 2. In addition, the numbers of observation points in the spring and summer datasets are 25,918 and 25,920, respectively. The proportion of the training set, validation set, and test set of the model is set to 6:2:2.

4.2. Experiment 1

In Experiment 1, sub-dataset 1 is selected to compare two experimental results to verify the superiority of the proposed model. For comparison 1, CNN, MFSTC-CNN, CNN-LSTM, and MFSTC-CNN-LSTM models participate in the comparison to verify the validity of the MFSTC model. For comparison 2, CNN, LSTM, MFSTC-CNN, and MFSTC-CNN-LSTM models participate in the comparison to verify the validity of combining CNN and LSTM based on the MFSTC model. In addition, ARIMA and MLP as the benchmark models also participate in the model comparison. The comparison results of the evaluation metrics for each model are listed in Table 3. The bold numbers represent the best values for each evaluation metric among the various models. Moreover, Figs. 5 and 6 intuitively show the forecast effect of the proposed model and the comparison results of each model mentioned above. On both sides of Fig. 5, the red line represents the wind speed series forecast by the proposed model at each time point. The lower orange bar graph represents the actual value of the wind speed at each time point. The upper green bar graph represents the residual difference between the actual value and the forecasted value of the wind speed at each time point. The detailed comparison results of experiment 1 are summarized below.

(1) By comparing the forecasting results of each model (i.e., CNN vs. MFSTC-CNN and CNN-LSTM vs. MFSTC-CNN-LSTM), it can be found that the metric evaluation result of MFSTC-CNN (MFSTC-


Fig. 5. Forecasting results of proposed model in spring and summer wind-speed datasets (Experiment 1).

Fig. 6. Comparison of forecasting results of spring and summer wind-speed datasets between proposed model and baseline models (Experiment 1).


Table 4
Comparison of forecasting results among three combined models and other individual models in spring and summer wind-speed datasets (Experiment 2).
Site name Seasons Models Evaluation metrics

SSE MAE RMSE SDE U1 IA DA PCC

CLAU Spring LR 40287.4193 2.0461 2.7886 2.7871 0.0973 0.9922 50.6564 0.8944
MLP 42124.8398 2.0950 2.8514 2.8513 0.0994 0.9918 48.0695 0.8891
CNN 42433.8789 2.0931 2.8619 2.8619 0.0998 0.9918 48.7645 0.8882
LSTM 40634.3164 2.0510 2.8005 2.7955 0.0984 0.9920 50.7143 0.8945
MFSTC-CNN-MLP 34639.6641 1.9241 2.5857 2.5854 0.0899 0.9934 56.6602 0.9100
MFSTC-MLP-LSTM 35711.3086 1.9195 2.6254 2.6217 0.0915 0.9932 56.1776 0.9078
MFSTC-CNN-LSTM 33052.8164 1.8456 2.5258 2.5255 0.0879 0.9936 57.5483 0.9142
Summer LR 43106.3738 2.1476 2.8845 2.8837 0.1138 0.9892 52.4131 0.8440
MLP 46072.8711 2.2675 2.9821 2.9227 0.1150 0.9885 50.0000 0.8392
CNN 44307.0352 2.1875 2.9244 2.9243 0.1152 0.9888 50.6950 0.8390
LSTM 44586.2578 2.1923 2.9336 2.9305 0.1165 0.9886 50.0386 0.8396
MFSTC-CNN-MLP 37215.5445 2.0349 2.6801 2.6783 0.1056 0.9908 57.6448 0.8672
MFSTC-MLP-LSTM 43269.4766 2.1873 2.8899 2.8833 0.1144 0.9892 56.3321 0.8442
MFSTC-CNN-LSTM 36970.2227 2.0055 2.6713 2.6698 0.1053 0.9908 58.4942 0.8679

AMNW Spring LR 40752.8522 2.0107 2.8046 2.8029 0.0863 0.9940 46.8919 0.9291
MLP 41316.6719 2.0217 2.8239 2.8224 0.0870 0.9939 45.3668 0.9281
CNN 41839.1484 2.0356 2.8417 2.8414 0.0876 0.9938 44.2278 0.9271
LSTM 42150.9414 2.0468 2.8523 2.8501 0.0879 0.9937 43.6293 0.9267
MFSTC-CNN-MLP 39523.9414 2.0388 2.7620 2.7585 0.0851 0.9941 51.4865 0.9318
MFSTC-MLP-LSTM 37518.6992 1.9582 2.6910 2.6910 0.0830 0.9945 50.7143 0.9349
MFSTC-CNN-LSTM 34470.5352 1.8761 2.5794 2.5764 0.0793 0.9949 52.7992 0.9405
Summer LR 38348.2321 1.9717 2.7206 2.7173 0.0899 0.9934 47.9730 0.9031
MLP 40149.8945 2.0372 2.7838 2.7674 0.0908 0.9931 45.6564 0.8992
CNN 39278.0586 1.9961 2.7534 2.7440 0.0914 0.9931 45.8880 0.9011
LSTM 39319.2344 1.9992 2.7548 2.7473 0.0915 0.9931 45.3282 0.9010
MFSTC-CNN-MLP 38495.5977 1.9609 2.7258 2.7242 0.0894 0.9934 54.4209 0.9027
MFSTC-MLP-LSTM 40707.0391 2.0352 2.8030 2.8026 0.0919 0.9930 53.0502 0.8974
MFSTC-CNN-LSTM 35179.0430 1.8680 2.6058 2.6039 0.0854 0.9940 53.9382 0.9114

Bold values represent the optimal value of each evaluation metric.

Fig. 7. Forecasting results of proposed model in spring and summer wind-speed datasets (Experiment 2).


Fig. 8. Histogram of mean forecasting residuals for different models at two target sites (Experiment 2).

CNN-LSTM) is better than that of CNN (CNN-LSTM). For instance, at the “THRO” site in Table 3, the spring SSE (MAE) values of MFSTC-CNN and CNN are 21719.4746 and 27973.5801 (1.5266 and 1.6803), respectively.
(2) By comparing the evaluation results of CNN, LSTM, MFSTC-CNN, and MFSTC-CNN-LSTM in Table 3, it can be clearly observed that, except for the spring DA values of the “ASPE” and “THRO” sites, all evaluation metric values of the combined MFSTC-CNN-LSTM model are superior to those of the individual models. For example, at the “ASPE” site in Table 3, the summer SSE, MAE, and RMSE values of MFSTC-CNN are 10505.2178, 1.0463, and 1.4240, respectively. When considering the combination strategy of CNN-LSTM, the forecast effect is further improved, and the SSE, MAE, and RMSE values of MFSTC-CNN-LSTM reach 10158.9609, 1.0384, and 1.4003, respectively.
(3) In addition, the comparison with several baseline models further supports the superior forecasting performance of the proposed model. For example, at the “ASPE” and “THRO” sites in Table 3, the MFSTC-CNN-LSTM model achieves the optimal value for each evaluation metric in the summer datasets.
Remark: Through the experimental verification and analysis of multiple sites and evaluation metrics, it is found that the proposed MFSTC model and CNN-LSTM combination strategy can effectively improve the accuracy of wind speed forecasting. In addition, a comparison of the models shows that the proposed MFSTC-CNN-LSTM model has excellent forecasting performance at different forecasting sites.

4.3. Experiment 2

In this part of the experiment, by comparing the forecasting results of the MFSTC-CNN-MLP, MFSTC-MLP-LSTM, and MFSTC-CNN-LSTM models, the superiority of the proposed CNN-LSTM combination strategy over other common combination strategies is further verified. In addition, the individual LR, MLP, CNN, and LSTM models also participate in the comparison to show the effects of each combination model. Sub-dataset 2 is selected for the training and testing of the models, and the corresponding performance evaluation results are presented in Table 4. The best values for each evaluation metric are expressed in bold font. Moreover, similar to Fig. 5, Fig. 7 displays the forecasting effects of the MFSTC-CNN-LSTM model visually. Fig. 8 plots a histogram that illustrates the mean forecasting residuals of the spring and summer wind-speed datasets for each model. The detailed experimental comparison results of experiment 2 are summarized below.

(1) By comparing the forecasting results of the combined models and individual models, it can be clearly seen that the combined models outperform the individual models on most of the datasets. For example, at the “CLAU” site in the spring dataset of Table 4, the MFSTC-CNN-LSTM model has the best forecasting performance, while the individual CNN model has the worst performance. The SSE value of MFSTC-CNN-LSTM is 33052.8164, which is much lower than that of the CNN model (42433.8789).
(2) At all target sites in sub-dataset 2, it can be observed that the proposed model has a lower forecasting error than the other two combined models. For instance, in Table 4, the SSE values of MFSTC-CNN-LSTM in spring and summer datasets at two target sites are 1586.8477, 245.3218, 5053.4062, and 3316.5547 lower than those of MFSTC-CNN-MLP, respectively. The SSE values of MFSTC-CNN-LSTM in spring and summer datasets at two target


Table 5
Comparison of forecasting results of proposed model and nine baseline models in spring wind-speed datasets (Experiment 3).
Site name Models Evaluation metrics

SSE MAE RMSE SDE U1 IA DA PCC

SLAT ARIMA 13219.2842 1.1596 1.5973 1.5973 0.0522 0.9978 40.5212 0.9722
LR 13957.9795 1.1795 1.6414 1.6379 0.0536 0.9977 40.0386 0.9704
SVC 15227.0000 1.1702 1.7144 1.7108 0.0564 0.9975 4.1313 0.9677
ELM 13954.8418 1.1792 1.6412 1.6377 0.0536 0.9977 39.9807 0.9704
MLP 14214.9815 1.1717 1.6564 1.6561 0.0542 0.9976 37.0270 0.9697
CNN 14708.0342 1.2035 1.6849 1.6669 0.0557 0.9975 37.6834 0.9695
LSTM 13835.8057 1.1725 1.6342 1.6316 0.0534 0.9977 40.1738 0.9706
CNN-LSTM 14653.1592 1.2000 1.6817 1.6640 0.0556 0.9975 37.0656 0.9696
MFSTC-CNN 14047.1641 1.2213 1.6466 1.6453 0.0541 0.9977 42.2008 0.9702
MFSTC-CNN-LSTM 12530.1357 1.1341 1.5552 1.5551 0.0510 0.9979 44.5946 0.9734

RALL ARIMA 15073.1137 1.2701 1.7057 1.7057 0.0674 0.9964 40.7143 0.9600
LR 13148.3552 1.1653 1.5931 1.5925 0.0631 0.9968 43.7452 0.9643
SVC 14928.0000 1.1870 1.6974 1.6872 0.0679 0.9963 9.5753 0.9599
ELM 13148.3214 1.1653 1.5931 1.5925 0.0631 0.9968 43.7645 0.9643
MLP 13263.8389 1.1695 1.6000 1.5969 0.0633 0.9968 42.7413 0.9641
CNN 14053.8926 1.1974 1.6470 1.6467 0.0653 0.9966 38.9189 0.9618
LSTM 13413.8662 1.1791 1.6091 1.5950 0.0633 0.9967 42.6448 0.9642
CNN-LSTM 14064.8565 1.1956 1.6476 1.6471 0.0653 0.9966 38.5328 0.9617
MFSTC-CNN 12987.2549 1.1929 1.5833 1.5832 0.0629 0.9968 44.8456 0.9647
MFSTC-CNN-LSTM 12476.4961 1.1585 1.5518 1.5518 0.0616 0.9969 44.7490 0.9661

MACY ARIMA 16936.4353 1.3386 1.8080 1.8080 0.0623 0.9969 42.2780 0.9604
LR 16622.1853 1.3168 1.7912 1.7912 0.0619 0.9969 44.5946 0.9604
SVC 18624.0000 1.3492 1.8960 1.8839 0.0661 0.9965 9.7490 0.9561
ELM 16621.9592 1.3169 1.7912 1.7912 0.0619 0.9969 44.4788 0.9604
MLP 16790.1738 1.3227 1.8002 1.7983 0.0622 0.9969 44.0734 0.9601
CNN 17175.6699 1.3321 1.8208 1.8206 0.0630 0.9968 41.8340 0.9591
LSTM 16816.5625 1.3223 1.8016 1.7988 0.0622 0.9969 43.7259 0.9601
CNN-LSTM 17318.4023 1.3329 1.8283 1.8283 0.0633 0.9968 41.0425 0.9587
MFSTC-CNN 16929.3359 1.3615 1.8077 1.8050 0.0627 0.9968 46.1390 0.9598
MFSTC-CNN-LSTM 15615.6690 1.2843 1.7361 1.7353 0.0602 0.9971 47.4904 0.9629

Bold values represent the optimal value of each evaluation metric.

sites are 2658.4922, 6299.2539, 3048.1640, and 5527.9961 lower than those of MFSTC-MLP-LSTM, respectively.
(3) It can also be seen from the residuals histogram that the proposed model has better forecasting quality than the other models at different target sites with different fluctuations.
Remark: In a comparison with other commonly used combined models and individual models, the MFSTC-CNN-LSTM model obtained satisfactory results at all target sites. In other words, based on the MFSTC model, the proposed CNN-LSTM combination strategy is superior to other combination strategies in reducing forecasting errors.

4.4. Experiment 3

To further verify the generalization ability and stability of the proposed model, sub-dataset 3 and twelve forecasting models are selected to make a comparison analysis in experiment 3. Among them, in addition to the common forecasting models, several recent models proposed by other researchers, such as ARIMA-ANN [19], STC-LSTM [18], and STC-CNN-MLP [26], are also involved in the comparison. The comparison models involved in the experiment include statistical models, machine learning models, deep learning models, STC models, and combined models. By comparing the models of different types, the superiority and accuracy of the proposed MFSTC-CNN-LSTM model can be demonstrated comprehensively. Detailed comparison results are listed in Tables 5–7. The best value of each evaluation metric is also expressed in bold font. Similar to Fig. 5, Figs. 9 and 10 show the forecasting results of the MFSTC-CNN-LSTM model at each target site. In addition, Fig. 11 displays the correlation between the observed values and forecast values of eight models in summer wind-speed datasets through a scatter plot diagram.
According to Tables 5–7 and Figs. 9–11 for experiment 3, conclusions similar to those of experiments 1 and 2 can be drawn. Observing the results of the evaluation metrics at each site, except for the DA value of the “RALL” and “MACY” sites in Tables 5 and 7, all metrics’ evaluation values of the MFSTC-CNN-LSTM model outperform those of the other baseline models. For example, at the “RALL” site in Table 6, the SVC model has the worst forecasting performance with the MAE and RMSE values at 1.2295 and 1.7695, respectively. The MFSTC-CNN-LSTM model is significantly superior to SVC, and its MAE and RMSE values are 1.1927 and 1.6419, respectively. At the “MACY” site in Table 7, the SSE values of the MFSTC-CNN-LSTM, ARIMA-ANN [19], STC-LSTM [18], and STC-CNN-MLP [26] in spring datasets are 15615.6690, 16338.0734, 16192.3145, and 16023.5273, respectively. Moreover, from the scatter plot diagrams in Fig. 11, it can also be seen that the proposed MFSTC-CNN-LSTM model obtained the best fit effect at each site.
Remark: By comparing several different types of forecasting models, it can be concluded that the proposed combined MFSTC-CNN-LSTM model has a stronger generalization ability and higher forecasting accuracy. Moreover, the proposed model can maintain excellent forecasting quality and strong forecasting stability under wind speed datasets with different degrees of fluctuation.

5. Discussion and analysis

In this section, the Diebold–Mariano (DM) test method is employed to analyze the performance difference levels between the proposed model and other baseline models as judged through hypothesis tests to further illustrate the effectiveness of the proposed model. Then, the performance improvement percentage of the MFSTC-CNN-LSTM model compared with baseline models is calculated to show the optimization effect of the proposed model.


Table 6
Comparison of forecasting results of proposed model and nine baseline models in summer wind-speed datasets (Experiment 3).
Site name Models Evaluation metrics

SSE MAE RMSE SDE U1 IA DA PCC

SLAT ARIMA 17955.8585 1.2686 1.8616 1.8616 0.0632 0.9967 40.0772 0.9482
LR 16966.6184 1.2300 1.8096 1.8096 0.0616 0.9969 39.7297 0.9501
SVC 18048.0000 1.2117 1.8664 1.8663 0.0635 0.9967 3.6680 0.9472
ELM 16975.6821 1.2305 1.8101 1.8101 0.0616 0.9969 39.6718 0.9500
MLP 17657.2266 1.2478 1.8461 1.8392 0.0625 0.9968 36.6602 0.9484
CNN 17575.3125 1.2434 1.8418 1.8380 0.0630 0.9968 37.7220 0.9484
LSTM 17114.3770 1.2630 1.8175 1.8069 0.0624 0.9968 39.1506 0.9503
CNN-LSTM 17549.3613 1.2339 1.8405 1.8389 0.0629 0.9968 37.6834 0.9483
MFSTC-CNN 17510.9238 1.2743 1.8384 1.8384 0.0626 0.9968 43.0116 0.9485
MFSTC-CNN-LSTM 15907.8789 1.2160 1.7523 1.7490 0.0594 0.9971 44.1699 0.9534

RALL ARIMA 15723.9537 1.2385 1.7421 1.7421 0.0783 0.9950 43.7838 0.9324
LR 14895.6419 1.2193 1.6956 1.6956 0.0764 0.9952 43.7066 0.9343
SVC 16222.0000 1.2295 1.7695 1.7689 0.0799 0.9948 6.7761 0.9283
ELM 14894.9349 1.2193 1.6956 1.6955 0.0764 0.9952 43.6680 0.9343
MLP 16178.9912 1.2773 1.7671 1.7564 0.0791 0.9948 39.4981 0.9290
CNN 15535.9258 1.2419 1.7317 1.7315 0.0783 0.9950 42.2201 0.9310
LSTM 15610.0791 1.2494 1.7358 1.7357 0.0784 0.9950 41.6795 0.9307
CNN-LSTM 15931.2578 1.2570 1.7536 1.7535 0.0793 0.9949 40.3475 0.9292
MFSTC-CNN 15401.0781 1.2547 1.7241 1.7240 0.0779 0.9950 46.2741 0.9317
MFSTC-CNN-LSTM 13966.4707 1.1927 1.6419 1.6411 0.0739 0.9955 47.7606 0.9383

MACY ARIMA 20671.1250 1.4579 1.9975 1.9974 0.0678 0.9962 44.9035 0.9408
LR 20984.3921 1.4671 2.0125 2.0125 0.0685 0.9961 44.7490 0.9390
SVC 22246.0000 1.4646 2.0721 2.0673 0.0708 0.9959 12.3552 0.9355
ELM 20987.9707 1.4672 2.0127 2.0127 0.0685 0.9961 44.5560 0.9390
MLP 22378.7441 1.5196 2.0783 2.0670 0.0703 0.9958 41.3900 0.9355
CNN 21866.6055 1.5011 2.0544 2.0458 0.0695 0.9960 41.0425 0.9369
LSTM 20739.5352 1.4687 2.0008 1.9997 0.0681 0.9961 44.2278 0.9399
CNN-LSTM 21871.6172 1.4982 2.0546 2.0472 0.0695 0.9960 40.1931 0.9368
MFSTC-CNN 20922.3242 1.4722 2.0096 2.0095 0.0684 0.9961 47.0463 0.9391
MFSTC-CNN-LSTM 18861.8906 1.4106 1.9080 1.9066 0.0648 0.9965 47.8378 0.9454

Bold values represent the optimal value of each evaluation metric.

Table 7
Comparison of forecasting results of proposed model with those of the other three researchers (Experiment 3).
Site name Seasons Models Evaluation metrics

SSE MAE RMSE SDE U1 IA DA PCC

SLAT Spring ARIMA-ANN 13355.6161 1.1533 1.6056 1.6052 0.0526 0.9978 40.0000 0.9716
STC-LSTM 13673.9102 1.1739 1.6246 1.6028 0.0538 0.9977 40.9266 0.9722
STC-CNN-MLP 13482.6318 1.1830 1.6132 1.6081 0.0531 0.9978 43.8417 0.9716
MFSTC-CNN-LSTM 12530.1357 1.1341 1.5552 1.5551 0.0510 0.9979 44.5946 0.9734
Summer ARIMA-ANN 17373.0450 1.2317 1.8312 1.8307 0.0622 0.9968 39.9421 0.9490
STC-LSTM 16697.5996 1.2287 1.7952 1.7850 0.0617 0.9969 41.0811 0.9523
STC-CNN-MLP 16926.2695 1.2553 1.8075 1.8053 0.0614 0.9969 43.3012 0.9503
MFSTC-CNN-LSTM 15907.8789 1.2160 1.7523 1.7490 0.0594 0.9971 44.1699 0.9534

RALL Spring ARIMA-ANN 13044.6819 1.1662 1.5868 1.5867 0.0630 0.9968 42.6062 0.9645
STC-LSTM 12883.8750 1.1641 1.5769 1.5628 0.0632 0.9968 43.4942 0.9660
STC-CNN-MLP 12742.1367 1.1802 1.5683 1.5679 0.0623 0.9969 45.4633 0.9655
MFSTC-CNN-LSTM 12476.4961 1.1585 1.5518 1.5518 0.0616 0.9969 44.7490 0.9661
Summer ARIMA-ANN 14933.7217 1.2222 1.6978 1.6977 0.0765 0.9952 43.5714 0.9339
STC-LSTM 14188.2842 1.2020 1.6549 1.6523 0.0745 0.9954 44.8456 0.9374
STC-CNN-MLP 14394.4238 1.2298 1.6668 1.6663 0.0751 0.9954 46.5830 0.9363
MFSTC-CNN-LSTM 13966.4707 1.1927 1.6419 1.6411 0.0739 0.9955 47.7606 0.9383

MACY Spring ARIMA-ANN 16338.0734 1.3077 1.7758 1.7756 0.0615 0.9969 43.9189 0.9611
STC-LSTM 16192.3145 1.2970 1.7679 1.7528 0.0617 0.9969 45.1351 0.9624
STC-CNN-MLP 16023.5273 1.3249 1.7586 1.7586 0.0609 0.9970 46.8147 0.9619
MFSTC-CNN-LSTM 15615.6690 1.2843 1.7361 1.7353 0.0602 0.9971 47.4904 0.9629
Summer ARIMA-ANN 20252.6223 1.4402 1.9771 1.9770 0.0672 0.9963 45.0386 0.9412
STC-LSTM 19820.4551 1.4208 1.9559 1.9537 0.0669 0.9963 45.9653 0.9431
STC-CNN-MLP 19930.4258 1.4439 1.9613 1.9589 0.0667 0.9963 47.8958 0.9425
MFSTC-CNN-LSTM 18861.8906 1.4106 1.9080 1.9066 0.0648 0.9965 47.8378 0.9454

Bold values represent the optimal value of each evaluation metric.


Fig. 9. Forecasting results of proposed model in spring wind-speed datasets (Experiment 3).

Fig. 10. Forecasting results of proposed model in summer wind-speed datasets (Experiment 3).

5.1. Diebold–Mariano test

The DM test is a common hypothesis testing method, and it is used to analyze the forecasting performance difference levels between the proposed model and baseline models from a statistical point of view. Details of the DM test are provided in Sub-Section 3.2. The results of the DM test are listed in Table 8. From the test results, of the 120 DM values, 104 are higher than the upper limit at the 1% significance level, and 115 are higher than the upper limit at the 5% significance level. In addition, 4 DM values are lower than the upper limit at the 10% significance level. In other words, 116 of the 120 comparison experiments (96.67%) show that the proposed model is significantly better than the baseline models at the 10% significance level or stricter. Therefore, the test results strongly support that the proposed model is superior to the baseline models in wind speed forecasting.

5.2. Performance improvement percentage

According to Eqns. (14) and (15), the mean improvement percentage of each evaluation metric in experiments 1–3 is calculated to further verify the effectiveness and superiority of the proposed model. The results are listed in Tables 9–11, and detailed analyses are shown below.

(1) Compared with the CNN and CNN-LSTM models, which consider only the relationship between the meteorological attributes of a single site, the MFSTC model fully takes into account the coupling relationship between the time and space of various meteorological factors at different sites, and provides a more complete and reliable basis for forecasting the wind speed at the next moment. By observing the comparison results of the MFSTC-CNN-LSTM model


Fig. 11. Scatter plot of forecast and observed values of 10 models in summer wind-speed datasets (Experiment 3).

with CNN, STC-LSTM, STC-CNN-MLP, CNN-LSTM, and MFSTC-CNN, it can be found that considering the STC of multiple factors can significantly improve the model forecasting accuracy. For instance, at the “ASPE” site in Table 9, compared to the CNN, CNN-LSTM, and MFSTC-CNN models, the SSE values of the MFSTC-CNN-LSTM model are reduced by 14.84%, 16.40%, and 2.53%, respectively. Moreover, at the “SLAT” site in Table 11, compared to the STC-LSTM and STC-CNN-MLP models, the SSE values of the MFSTC-CNN-LSTM model are reduced by 5.76% and 4.57%, respectively; the RMSE values are reduced by 2.93% and 2.32%, respectively.
(2) On the basis of the MFSTC model, the CNN-LSTM combination strategy makes full use of the advantages of the CNN’s strong spatial feature extraction capability and the LSTM’s strong temporal feature extraction capability. The deep extraction of the relationship between the time and space features is realized by the complementary advantages of the two individual models. The comparison results of MFSTC-CNN-LSTM, MFSTC-CNN, MFSTC-CNN-MLP, and MFSTC-MLP-LSTM further show the superiority of the proposed CNN-LSTM combination strategy. For example, at the “AMNW” site in Table 10, compared to the MFSTC-CNN-MLP and MFSTC-MLP-LSTM models, the SSE values of the MFSTC-CNN-LSTM model are reduced by 10.70% and 10.85%, respectively; the U1 values of the MFSTC-CNN-LSTM model are reduced by 5.65% and 5.76%, respectively.
(3) In addition, by comparing the MFSTC-CNN-LSTM model with other individual baseline models, the effectiveness of the proposed


Table 8
DM test results of experiments 1–3.
Sites Seasons Models
ARIMA MLP CNN LSTM CNN-LSTM MFSTC-CNN
ASPE Spring 7.885** 7.548*** 10.088*** 7.105*** 10.296*** 2.475**
Summer 5.452** 13.411*** 10.228*** 9.411*** 11.595*** 2.265**
THRO Spring 6.518*** 13.219*** 12.945*** 6.635*** 12.955*** 2.373**
Summer 10.088*** 16.033*** 15.084*** 10.873*** 15.161*** 1.991**

Sites Seasons LR MLP CNN LSTM MFSTC-CNN-MLP MFSTC-MLP-LSTM
CLAU Spring 8.927*** 11.139*** 11.249*** 9.365*** 3.054*** 5.019***
Summer 6.658*** 9.701*** 7.953*** 8.304*** 0.374 7.428***
AMNW Spring 8.823*** 9.538*** 10.082*** 9.876*** 8.442*** 5.443***
Summer 3.075*** 4.723*** 4.168*** 4.610*** 5.400*** 7.818***

Sites Seasons ARIMA LR SVC ELM MLP CNN LSTM CNN-LSTM MFSTC-CNN ARIMA-ANN STC-LSTM STC-CNN-MLP
SLAT Spring 3.857*** 6.307*** 10.513*** 6.283*** 7.776*** 9.298*** 5.763*** 9.170*** 9.037*** 4.498*** 5.199*** 5.939***
Summer 3.513*** 2.048** 4.364*** 2.069*** 3.706*** 3.512*** 2.425** 3.453*** 3.459*** 2.834*** 1.612 2.587***
RALL Spring 9.806*** 3.059*** 9.987*** 3.059*** 3.641*** 7.321*** 4.003*** 7.424*** 2.843*** 3.084*** 2.044** 1.607
Summer 3.078*** 2.656*** 6.527*** 2.654*** 6.299*** 4.695*** 4.883*** 5.807*** 5.245*** 3.085*** 0.841 1.856*
MACY Spring 5.389*** 3.783*** 9.222*** 3.780*** 4.059*** 5.801*** 4.061*** 6.184*** 5.267*** 2.967*** 2.228** 1.993**
Summer 5.877*** 5.380*** 8.617*** 5.387*** 8.959*** 7.787*** 5.518*** 8.051*** 7.349*** 4.442*** 3.502*** 4.614***

*** represents 1% significance level.
** represents 5% significance level.
* represents 10% significance level.
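
As a rough illustration of how DM values such as those in Table 8 could be obtained and read, the sketch below computes a simplified DM statistic and maps it to a significance level. It is a sketch under stated assumptions, not the authors' implementation: it assumes a squared-error loss, a one-step-ahead forecast horizon (so no autocovariance correction), and two-sided standard normal critical values (1.645, 1.960, 2.576). The exact formulas used in the paper are those of its Sub-Section 3.2 and may differ. The function names are hypothetical.

```python
import math
import statistics

# Two-sided standard normal critical values (assumed decision thresholds).
CRITICAL_VALUES = ((2.576, "1%"), (1.960, "5%"), (1.645, "10%"))


def dm_statistic(errors_baseline, errors_proposed):
    """Simplified Diebold-Mariano statistic for one-step-ahead forecasts.

    The loss differential is d_t = e_baseline_t^2 - e_proposed_t^2, so a
    positive value means the proposed model has the lower squared error.
    Raises ZeroDivisionError if every d_t is identical.
    """
    d = [eb ** 2 - ep ** 2 for eb, ep in zip(errors_baseline, errors_proposed)]
    n = len(d)
    return statistics.mean(d) / math.sqrt(statistics.pvariance(d) / n)


def significance_level(dm_value):
    """Return the strictest level at which |DM| exceeds the critical value."""
    for threshold, label in CRITICAL_VALUES:
        if abs(dm_value) > threshold:
            return label
    return "not significant"


# Synthetic forecast errors (not data from the paper).
example_dm = dm_statistic([1.0, 2.0, 1.0, 2.0], [0.5, 1.0, 0.5, 1.0])

# A few DM values taken from Table 8, classified with the assumed thresholds.
labels = [significance_level(v) for v in (9.806, 2.044, 1.856, 0.374)]

# Share of comparisons significant at the 10% level or stricter (Section 5.1).
share_significant = 116 / 120 * 100
```

For forecast horizons longer than one step, the variance term would normally include autocovariances of the loss differential; that refinement is omitted here for brevity.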

Table 9
Calculation results of mean performance improvement percentage for experiment 1.
Models Sites SSE MAE RMSE SDE U1 IA DA PCC

MFSTC-CNN-LSTM vs ARIMA ASPE 10.10% 4.46% 5.19% 5.34% 5.15% 0.03% 8.10% 0.54%
THRO 7.18% 3.03% 3.70% 3.73% 3.53% 0.06% 3.58% 0.93%

MFSTC-CNN-LSTM vs MLP ASPE 16.06% 7.31% 8.43% 7.82% 7.93% 0.07% 16.02% 1.04%
THRO 17.00% 9.22% 9.13% 8.54% 8.56% 0.20% 12.19% 3.28%

MFSTC-CNN-LSTM vs CNN ASPE 14.84% 5.07% 7.73% 7.77% 8.03% 0.06% 13.03% 0.95%
THRO 16.66% 8.43% 8.93% 8.82% 8.80% 0.19% 10.49% 3.32%

MFSTC-CNN-LSTM vs LSTM ASPE 13.40% 5.32% 6.96% 6.66% 7.27% 0.06% 10.03% 0.81%
THRO 12.58% 5.78% 6.62% 6.38% 7.36% 0.14% 7.11% 2.20%

MFSTC-CNN-LSTM vs CNN-LSTM ASPE 16.40% 5.66% 8.59% 8.65% 9.15% 0.07% 16.08% 1.11%
THRO 16.85% 8.41% 9.04% 8.97% 8.96% 0.19% 12.61% 3.38%

MFSTC-CNN-LSTM vs MFSTC-CNN ASPE 2.53% 0.67% 1.27% 1.38% 1.58% 0.01% −0.16% 0.16%
THRO 1.54% 1.38% 0.78% 0.75% 1.30% 0.02% 0.15% 0.23%
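
The entries of Table 9 can be traced back to Table 3 through Eqns. (14) and (15). The sketch below assumes that the "mean" improvement is the simple average of the spring and summer improvement percentages; this reading is an assumption, although it reproduces the tabulated values for the example chosen. The function names are hypothetical.

```python
def imp1(me_baseline, me_proposed):
    """Eqn. (14): improvement for metrics where lower is better (SSE, MAE, ...)."""
    return (me_baseline - me_proposed) / me_baseline * 100.0


def imp2(me_baseline, me_proposed):
    """Eqn. (15): improvement for metrics where higher is better (IA, DA, PCC)."""
    return (me_proposed - me_baseline) / me_baseline * 100.0


def mean_improvement(imp, pairs):
    """Average an improvement formula over (baseline, proposed) value pairs."""
    return sum(imp(baseline, proposed) for baseline, proposed in pairs) / len(pairs)


# ASPE site, ARIMA vs MFSTC-CNN-LSTM, spring and summer values from Table 3.
sse_pairs = [(11714.8984, 10809.8447), (11607.8971, 10158.9609)]
da_pairs = [(39.3050, 42.3359), (43.2432, 46.9112)]

sse_gain = mean_improvement(imp1, sse_pairs)
da_gain = mean_improvement(imp2, da_pairs)
print(f"SSE: {sse_gain:.2f}%  DA: {da_gain:.2f}%")
```

Under this assumption the computed values round to 10.10% for SSE and 8.10% for DA, which agree with the first row of Table 9.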

Table 10
Calculation results of mean performance improvement percentage for experiment 2.
Models Sites SSE MAE RMSE SDE U1 IA DA PCC

MFSTC-CNN-LSTM vs LR CLAU 16.10% 8.21% 8.41% 8.40% 8.57% 0.15% 12.60% 2.53%
AMNW 11.84% 5.98% 6.13% 6.13% 6.53% 0.08% 12.52% 1.07%

MFSTC-CNN-LSTM vs MLP CLAU 20.65% 11.73% 10.92% 10.04% 10.00% 0.21% 18.35% 3.12%
AMNW 14.48% 7.75% 7.53% 7.31% 7.41% 0.10% 17.26% 1.34%

MFSTC-CNN-LSTM vs CNN CLAU 19.33% 10.07% 10.20% 10.23% 10.26% 0.19% 16.70% 3.18%
AMNW 14.02% 7.13% 7.30% 7.22% 8.06% 0.10% 18.46% 1.29%

MFSTC-CNN-LSTM vs LSTM CLAU 17.87% 9.27% 9.38% 9.28% 10.18% 0.20% 15.19% 2.79%
AMNW 14.38% 7.45% 7.49% 7.41% 8.19% 0.10% 20.01% 1.32%

MFSTC-CNN-LSTM vs MFSTC-CNN-MLP CLAU 2.62% 2.76% 1.32% 1.32% 1.27% 0.02% 1.52% 0.28%
AMNW 10.70% 6.36% 5.51% 5.51% 5.65% 0.07% 0.83% 0.95%

MFSTC-CNN-LSTM vs MFSTC-MLP-LSTM CLAU 11.00% 6.08% 5.68% 5.54% 5.96% 0.10% 3.14% 1.76%
AMNW 10.85% 6.20% 5.59% 5.67% 5.76% 0.07% 2.89% 1.08%

combined model is verified once again. Compared with the individual baseline models, the MFSTC-CNN-LSTM model not only takes into account the multiple spatio-temporal correlations between various sites but is also combined with the mainstream deep learning model, which makes it possible to automatically extract deeper feature information. For instance, at the “RALL” site in Table 11, compared to the ARIMA, SVC, and ELM models, the RMSE values of the MFSTC-CNN-LSTM model are reduced by 7.39%, 7.90%, and 2.88%, respectively.


Table 11
Calculation results of mean performance improvement percentage for experiment 3.
Models Sites SSE MAE RMSE SDE U1 IA DA PCC

MFSTC-CNN-LSTM vs ARIMA SLAT 8.31% 3.18% 4.26% 4.35% 4.15% 0.02% 10.13% 0.33%
RALL 14.20% 6.24% 7.39% 7.41% 7.05% 0.05% 9.50% 0.64%
MACY 8.28% 3.65% 4.23% 4.28% 3.90% 0.03% 9.43% 0.37%

MFSTC-CNN-LSTM vs LR SLAT 8.23% 2.49% 4.21% 4.20% 4.25% 0.02% 11.28% 0.33%
RALL 5.67% 1.38% 2.88% 2.89% 2.77% 0.02% 5.79% 0.31%
MACY 8.08% 3.16% 4.13% 4.19% 4.10% 0.03% 6.70% 0.47%

MFSTC-CNN-LSTM vs SVC SLAT 14.78% 1.37% 7.70% 7.69% 8.01% 0.04% 1041.83% 0.63%
RALL 15.16% 2.70% 7.90% 7.63% 8.42% 0.07% 486.09% 0.87%
MACY 15.68% 4.25% 8.18% 7.83% 8.74% 0.06% 337.16% 0.88%

MFSTC-CNN-LSTM vs ELM SLAT 8.25% 2.50% 4.22% 4.21% 4.27% 0.02% 11.44% 0.33%
RALL 5.67% 1.38% 2.88% 2.89% 2.77% 0.02% 5.81% 0.31%
MACY 8.09% 3.16% 4.14% 4.19% 4.11% 0.03% 7.07% 0.47%

MFSTC-CNN-LSTM vs MLP SLAT 10.88% 2.88% 5.60% 5.50% 5.48% 0.03% 20.46% 0.45%
RALL 9.81% 3.78% 5.05% 4.70% 4.60% 0.05% 12.81% 0.61%
MACY 11.36% 5.04% 5.88% 5.63% 5.56% 0.05% 11.67% 0.67%

MFSTC-CNN-LSTM vs CNN SLAT 12.15% 3.99% 6.28% 5.77% 7.13% 0.04% 17.72% 0.46%
RALL 10.66% 3.61% 5.48% 5.50% 5.61% 0.05% 14.05% 0.62%
MACY 11.41% 4.81% 5.89% 5.74% 5.57% 0.04% 15.04% 0.65%

MFSTC-CNN-LSTM vs LSTM SLAT 8.24% 3.50% 4.21% 3.95% 4.64% 0.02% 11.91% 0.31%
RALL 8.76% 3.14% 4.48% 4.08% 4.22% 0.04% 9.76% 0.51%
MACY 8.10% 3.41% 4.14% 4.09% 4.00% 0.03% 8.39% 0.44%

MFSTC-CNN-LSTM vs CNN-LSTM SLAT 11.92% 3.47% 6.16% 5.71% 6.89% 0.04% 18.76% 0.46%
RALL 11.81% 4.11% 6.09% 6.10% 6.19% 0.05% 17.25% 0.72%
MACY 11.80% 4.75% 6.09% 5.98% 5.84% 0.04% 17.37% 0.68%

MFSTC-CNN-LSTM vs MFSTC-CNN SLAT 9.98% 5.86% 5.12% 5.17% 5.40% 0.03% 4.18% 0.43%
RALL 6.62% 3.92% 3.38% 3.40% 3.61% 0.03% 1.50% 0.43%
MACY 8.80% 4.93% 4.51% 4.49% 4.69% 0.03% 2.31% 0.50%

MFSTC-CNN-LSTM vs ARIMA-ANN SLAT 5.27% 1.16% 2.67% 2.66% 2.66% 0.01% 8.26% 0.17%
RALL 5.42% 1.54% 2.75% 2.77% 2.77% 0.02% 7.32% 0.32%
MACY 5.64% 1.92% 2.87% 2.91% 2.85% 0.02% 7.17% 0.32%

MFSTC-CNN-LSTM vs STC-LSTM SLAT 5.76% 1.94% 2.93% 1.84% 3.93% 0.02% 5.92% 0.07%
RALL 2.36% 0.63% 1.19% 0.69% 1.64% 0.01% 4.69% 0.06%
MACY 4.20% 0.85% 2.12% 1.70% 2.85% 0.02% 4.65% 0.15%

MFSTC-CNN-LSTM vs STC-CNN-MLP SLAT 4.57% 2.98% 2.32% 2.16% 2.53% 0.01% 0.07% 0.13%
RALL 2.53% 2.43% 1.27% 1.27% 1.29% 0.01% 0.48% 0.14%
MACY 3.95% 2.68% 2.00% 2.00% 2.01% 0.02% 0.66% 0.20%
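
The percentages in Table 11 can be read with the help of a small sketch. The exact formula used by the authors is not restated in this part of the paper, so the definition below is an assumption: a relative reduction for the error metrics (SSE, MAE, RMSE, SDE, U1) and a relative increase for the agreement metrics (IA, DA, PCC). The function names and the sample metric values are hypothetical and serve only as illustration.

```python
# Assumed grouping: smaller is better for errors, larger is better for agreement.
LOWER_IS_BETTER = {"SSE", "MAE", "RMSE", "SDE", "U1"}
HIGHER_IS_BETTER = {"IA", "DA", "PCC"}


def improvement_percentage(metric: str, baseline: float, proposed: float) -> float:
    """Relative improvement (in %) of the proposed model over a baseline.

    Assumed definition, not quoted from the paper: relative reduction
    for error metrics, relative increase for agreement metrics.
    """
    if baseline == 0:
        raise ValueError("baseline metric must be non-zero")
    if metric in LOWER_IS_BETTER:
        return (baseline - proposed) / abs(baseline) * 100.0
    if metric in HIGHER_IS_BETTER:
        return (proposed - baseline) / abs(baseline) * 100.0
    raise ValueError(f"unknown metric: {metric}")


def sse_improvement_from_rmse(rmse_improvement_pct: float) -> float:
    """SSE improvement implied by an RMSE improvement on the same test set.

    Because RMSE = sqrt(SSE / N), reducing RMSE by a fraction r
    reduces SSE by 1 - (1 - r)**2.
    """
    r = rmse_improvement_pct / 100.0
    return (1.0 - (1.0 - r) ** 2) * 100.0


# Hypothetical metric values, for illustration only (not taken from the paper).
print(improvement_percentage("RMSE", baseline=1.00, proposed=0.95))
print(improvement_percentage("PCC", baseline=0.990, proposed=0.995))

# Consistency check using the SLAT row of "MFSTC-CNN-LSTM vs ARIMA" (RMSE 4.26%).
print(round(sse_improvement_from_rmse(4.26), 2))
```

Under this assumption, an RMSE improvement of 4.26% implies an SSE improvement of roughly 8.3%, close to the 8.31% reported for SLAT against ARIMA. Because Table 11 reports mean values, the relation is expected to hold only approximately.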

6. Conclusion and future work

To improve energy conversion efficiency and reduce unnecessary labor costs, it is of great importance to improve the accuracy and reliability of wind speed forecasting. However, the intermittence and randomness of wind speed remain a large obstacle to accurate forecasting. Therefore, in this study, a novel combined model based on multifactor spatio-temporal correlations and deep learning algorithms is proposed. In the proposed MFSTC-CNN-LSTM model, the spatial and temporal correlations between multiple sites and meteorological factors are considered simultaneously. The deep extraction of temporal and spatial correlations and the forecasting of the wind speed at target sites are then realized by a combined CNN-LSTM model. Meanwhile, so that the CNN-LSTM can be applied to the proposed MFSTC model, a new data reconstruction method based on a 3D matrix is put forward. In addition, to verify the accuracy and superiority of the proposed model, datasets from the National Wind Institute in Texas containing 46 sites are used in the experiments. Three comparison experiments and eight common metrics are used to evaluate the forecasting performance of the models involved in the experiments, including a statistical model (ARIMA), machine learning models (LR, ELM, MLP, and SVC), deep learning models (CNN and LSTM), and combined models (CNN-LSTM, MFSTC-CNN, MFSTC-CNN-MLP, MFSTC-MLP-LSTM, ARIMA-ANN [19], STC-LSTM [18], and STC-CNN-MLP [26]). Moreover, the effectiveness and superiority of the proposed model are further verified by statistical testing and calculation of the model performance improvement percentage. The experimental results support the following conclusions: (a) fully considering the temporal and spatial correlation of meteorological factors among multiple sites helps to improve the accuracy and reliability of wind speed forecasting; (b) the combined CNN-LSTM model can effectively utilize the advantages of the individual models to deeply extract the temporal and spatial correlation features simultaneously; (c) compared with the other baseline models, the proposed forecasting model has better precision and generalization under different levels of wind speed fluctuations. Overall, the proposed MFSTC-CNN-LSTM model can provide more reliable and accurate short-term wind speed forecasting for wind power generation systems, thus enhancing their operation efficiency, power quality, and economic benefit. In addition, with accurate short-term wind speed forecast results, the maintenance, dispatch, and scheduling plans of wind energy equipment can be arranged more flexibly, thus reducing unnecessary economic losses.

However, there are still some limitations that need to be addressed. For example, no data preprocessing method is considered to reduce the influence of data noise on the forecasting performance. A deeper theoretical analysis of the deep neural network combination model used in this study is also needed. In addition, with the continued improvement of deep learning models and technologies, more advanced combination models can be explored and applied in the field of wind speed forecasting. In future work, more effective data preprocessing methods and deep learning combination models can be evaluated and considered. Moreover, the proposed model can be
adapted by modifying the underlying architecture of the deep neural network and applied to other types of forecasting, such as stock forecasting and air quality forecasting.

Conflict of interest

The authors declare that there is no conflict of interest.

Acknowledgements

The work has been supported by the National Natural Science Foundation of China (No. 51875503, No. 71701065, No. 51775496, No. 51475410).

Appendix A. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.enconman.2019.02.018.

References

[1] Du P, Wang JZ, Guo ZH, Yang WD. Research and application of a novel hybrid forecasting system based on multi-objective optimization for wind speed forecasting. Energy Convers Manage 2017;150:90–107.
[2] Wang HZ, Li GQ, Wang GB, Peng JC, Jiang H, Liu YT. Deep learning based ensemble approach for probabilistic wind power forecasting. Appl Energy 2017;188:56–70.
[3] Meng AB, Ge JF, Yin H, Chen SZ. Wind speed forecasting based on wavelet packet decomposition and artificial neural networks trained by crisscross optimization algorithm. Energy Convers Manage 2016;114:75–88.
[4] Khare V, Nema S, Baredar P. Solar-wind hybrid renewable energy system: a review. Renew Sustain Energy Rev 2016;58:23–33.
[5] Ackermann T, Söder L. Wind energy technology and current status: a review. Renew Sustain Energy Rev 2000;4(4):315–74.
[6] Liu H, Mi XW, Li YF. Smart deep learning based wind speed prediction model using wavelet packet decomposition, convolutional neural network and convolutional long short term memory network. Energy Convers Manage 2018;166:120–31.
[7] Jiang Y, Chen XY, Yu K, Liao YC. Short-term wind power forecasting using hybrid method based on enhanced boosting algorithm. J Mod Power Syst Clean Energy 2017;5(1):126–33.
[8] Liu H, Mi XW, Li YF. An experimental investigation of three new hybrid wind speed forecasting models using multi-decomposing strategy and ELM algorithm. Renew Energy 2018;123:694–705.
[9] Park B, Hur J. Accurate short-term power forecasting of wind turbines: the case of Jeju Island’s wind farm. Energies 2017;10(6). https://doi.org/10.3390/en10060812.
[10] Yuan XH, Chen C, Yuan YB, Huang YH, Tan QX. Short-term wind power prediction based on LSSVM-GSA model. Energy Convers Manage 2015;101:393–401.
[11] Li S, Wang P, Goel L. Wind power forecasting using neural network ensembles with feature selection. IEEE Trans Sustain Energy 2017;6(4):1447–56.
[12] Liu H, Mi XW, Li YF. Smart multi-step deep learning model for wind speed forecasting based on variational mode decomposition, singular spectrum analysis, LSTM network and ELM. Energy Convers Manage 2018;159:54–64.
[13] Zhang C, Zhou JZ, Li CS, Fu WL, Peng T. A compound structure of ELM based on feature selection and parameter optimization using hybrid backtracking search algorithm for wind speed forecasting. Energy Convers Manage 2017;143:360–76.
[14] Ferreira A, Giraldi G. Convolutional neural network approaches to granite tiles classification. Expert Syst Appl 2017;84:1–11.
[15] Han DM, Liu QG, Fan WG. A new image classification method using CNN transfer learning and web data augmentation. Expert Syst Appl 2018;95:43–56.
[16] Wang HZ, Wang GB, Li GQ, Peng JC, Liu YT. Deep belief network based deterministic and probabilistic wind speed forecasting approach. Appl Energy 2016;182:80–93.
[17] Zhao Z, Chen WH, Wu XM, Chen PC, Liu JM. LSTM network: a deep learning approach for short-term traffic forecast. IET Intel Trans Syst 2017;11(2):68–75.
[18] Ghaderi A, Sanandaji BM, Ghaderi F. Deep forecast: deep learning-based spatio-temporal forecasting. arXiv preprint arXiv:1707.08110. 2017.
[19] Cadenas E, Rivera W. Wind speed forecasting in three different regions of Mexico, using a hybrid ARIMA–ANN model. Renew Energy 2010;35(12):2732–8.
[20] Chen J, Zeng GQ, Zhou WN, Du W, Lu KD. Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization. Energy Convers Manage 2018;165:681–95.
[21] Sanandaji BM, Tascikaraoglu A, Poolla K, Varaiya P. Low-dimensional models in spatio-temporal wind speed forecasting. Proceedings of the American Control Conference, July 1-3, Chicago, USA. 2015. p. 4485–90.
[22] Hill DC, McMillan D, Bell KR, Infield D. Application of auto-regressive models to UK wind speed data for power system impact studies. IEEE Trans Sustain Energy 2012;3(1):134–41.
[23] Baxevani A, Lenzi A. Very short-term spatio-temporal wind power prediction using a censored Gaussian field. Stoch Env Res Risk Assess 2018;32(4):931–48.
[24] Zhao Y, Ye L, Pinson P, Tang Y, Lu P. Correlation-constrained and sparsity-controlled vector autoregressive model for spatio-temporal wind power forecasting. IEEE Trans Power Syst 2018;33:5029–40.
[25] Ye L, Zhao Y, Zeng C, Zhang C. Short-term wind power prediction based on spatial model. Renew Energy 2017;101:1067–74.
[26] Zhu QM, Chen JF, Zhu L, Duan XZ, Liu YL. Wind speed prediction with spatio-temporal correlation: a deep learning approach. Energies 2018;11(4). https://doi.org/10.3390/en11040705.
[27] Diebold FX, Mariano RS. Comparing predictive accuracy. J Business Econ Stat 2002;20(1):134–44.
[28] Pourhabib A, Huang JZ, Ding Y. Short-term wind speed forecast using measurements from multiple turbines in a wind farm. Technometrics 2016;58(1):138–47.
[29] Oehmcke S, Zielinski O, Kramer O. Input quality aware convolutional LSTM networks for virtual marine sensors. Neurocomputing 2018;275:2603–15.
[30] Goodfellow I, Bengio Y, Courville A. Deep Learning. The MIT Press. 2016.
[31] Ji S, Xu W, Yang M, Yu K. 3D convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell 2013;35(1):221–31.
[32] Nair V, Hinton GE. Rectified linear units improve restricted boltzmann machines. Proceedings of the 27th International Conference on Machine Learning (ICML-10), June 21-24, Haifa, Israel. 2010. p. 807–14.
[33] Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput 1997;9(8):1735–80.
[34] Ma XL, Tao ZM, Wang YH, Yu HY, Wang YP. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Trans Res Part C: Emerg Technol 2015;54:187–97.
[35] Davis C. The norm of the Schur product operation. Numer Math 1962;4(1):343–4.
[36] Han J, Moraga C. The influence of the sigmoid function parameters on the speed of backpropagation learning. Proceedings of the International Workshop on Artificial Neural Networks, June 7-9, Malaga-Torremolinos, Spain. 1995. p. 195–201.
[37] Wang JZ, Yang WD, Du P, Niu T. A novel hybrid forecasting system of wind speed based on a newly developed multi-objective sine cosine algorithm. Energy Convers Manage 2018;163:134–50.
[38] Xu YZ, Yang WD, Wang JZ. Air quality early-warning system for cities in China. Atmos Environ 2017;148:239–57.
[39] Qureshi AS, Khan A, Zameer A, Usman A. Wind power prediction using deep neural network based meta regression and transfer learning. Appl Soft Comput 2017;58:742–55.
