
Energy Conversion and Management 213 (2020) 112824

Contents lists available at ScienceDirect

Energy Conversion and Management


journal homepage: www.elsevier.com/locate/enconman

A new short-term wind speed forecasting method based on fine-tuned LSTM neural network and optimal input sets
Gholamreza Memarzadeh (a), Farshid Keynia (b)

(a) Department of Power and Control Engineering, Graduate University of Advanced Technology, Kerman, Iran
(b) Department of Energy Management and Optimization, Institute of Science and High Technology and Environmental Sciences, Graduate University of Advanced Technology, Kerman, Iran

ARTICLE INFO

Keywords: Wind speed forecasting; Wavelet transform; Feature selection; Crow search algorithm; Long short term memory; Neural network

ABSTRACT

In recent years, clean energy sources such as wind power have developed rapidly, and wind power generation has become a significant source of energy in some power grids. However, because wind speed behaves in an uncertain and non-convex way, forecasting and scheduling wind power generation can be very difficult. In this paper, to improve the accuracy of short-term wind speed forecasting, a hybrid wind speed forecasting model is proposed based on four modules: the crow search algorithm (CSA), wavelet transform (WT), feature selection (FS) based on entropy and mutual information (MI), and deep learning time series prediction based on Long Short Term Memory (LSTM) neural networks. The proposed wind speed forecasting strategy is applied to real-life data from Sotavento, located in the south-west of Europe, in Galicia, Spain, and from Kerman, located in the Middle East, in the southeast of Iran. The numerical results demonstrate the efficiency of the proposed method compared to some other existing wind speed forecasting methods.

1. Introduction

Wind energy, as one of the renewable energy sources in the world, has attracted a lot of attention. Its benefits include reduced pollution, safety, renewability, and environmental protection. Governments have also shown support for increasing wind energy potential to compete with traditional energy sources [1].

However, due to the randomness and nonlinearity of wind speed data, the use of wind energy is a challenging task in terms of economic development and social progress [2,3]. Furthermore, the uncertainty of wind speed makes it difficult to predict how much power will be produced. Accurate wind speed forecasting can therefore reduce the costs of wind power producers and help them to be more successful in the electricity market. In recent years, many researchers have proposed different methods for wind speed forecasting. In some of these approaches, the researchers employed meteorological and geographical data to forecast wind speed, but such approaches are weak at forecasting short-term wind speed [3,4]. Some of the techniques used in short-term wind speed forecasting try to find a relationship between the input and output variables. These approaches belong to the statistical methods [5–7]. These models perform better than physical models in wind speed forecasting. These methods are attractive because they are capable of interpreting the physical properties of the components, but they are not capable of detecting nonlinear behaviors of wind speed signals. The third approach is spatial correlation methods, which can achieve greater forecasting accuracy in some conditions [8,9]. However, artificial intelligence-based approaches such as artificial neural networks (ANN) [10–13], support vector machines (SVM) [14], and fuzzy logic [15] are more successful in short-term wind speed forecasting. Among neural networks, deep neural networks are much more efficient in short-term wind speed forecasting [16–19]. Besides, these neural networks are remarkably accurate in forecasting various parameters, such as electrical load and price [20,21]. For example, Khodayar et al. proposed two deep neural network architectures: in the first model a stacked auto-encoder is used to extract important features, and in the second model a stacked denoising auto-encoder is proposed [16]. Hu et al. proposed transfer learning with deep neural networks for short-term wind speed forecasting [17]. Wang et al. proposed a convolutional neural network for wind power forecasting [18]. A method based on Hinton and Salakhutdinov's deep belief nets for time series forecasting is proposed by Kuremoto et al. [19]. However, the forecasting model performs better if the input data are reduced by selecting the most efficient and useful inputs. Feature selection methods include correlation analysis


Corresponding author.
E-mail address: f.keynia@kgut.ac.ir (F. Keynia).

https://doi.org/10.1016/j.enconman.2020.112824
Received 21 December 2019; Received in revised form 6 April 2020; Accepted 7 April 2020
Available online 21 April 2020
0196-8904/ © 2020 Elsevier Ltd. All rights reserved.
G. Memarzadeh and F. Keynia Energy Conversion and Management 213 (2020) 112824

Nomenclature

CSA    Crow search algorithm
WT     Wavelet transform
DWT    Discrete wavelet transform
IWT    Inverse wavelet transform
LSTM   Long Short Term Memory
FS     Feature selection
MI     Mutual information
RNN    Recurrent Neural Network
MAE    Mean Absolute Error
MAPE   Mean Absolute Percentage Error
RMSE   Root Mean Square Error
x_t    The input vector
y_t    The output vector
i_t    The output of the input gate
f_t    The output of the forget gate
o_t    The output of the output gate
c_t    The final state in the memory block
c̄_t    The temporary state of the memory cell in the memory block
σ      The sigmoid function, which determines the propagation of the information
W_xf, W_xi, W_xc and W_xo    The input weight matrices
W_hf, W_hi, W_hc and W_ho    The recurrent weight matrices
W_hy   The hidden output weight matrix
b_f, b_i, b_c, b_o and b_y   The related bias vectors
r_i    Uniformly distributed random number
AP_{j,iter}   Awareness probability of crow j at iteration iter
fl_{i,iter}   Flight length of crow i at iteration iter
S_i^ACT   The actual value at time i
S_i^FOR   The forecasted value at time i
N      The total number of data

[22–24] and numerical sensitivity analysis [25,26], all of which are linear methods of input selection, while the wind speed data are nonlinear. Therefore, mutual information feature selection methods will be more efficient [27]. This method calculates the relevancy of input and output data.

Hybrid models are developed to improve the accuracy of the proposed methods for short-term wind speed forecasting. These hybrid models are typically based on a forecaster method and one of the data processing methods [28–38]. For example, Wang et al. proposed a multi-objective sine cosine algorithm (MOSCA) for wind speed forecasting [39]. A hybrid model based on a decomposition method and an online sequential outlier robust extreme learning machine is proposed by Zhang et al. for short-term wind speed forecasting [40]. Jiang et al. proposed a novel hybrid forecasting system based on a data preprocessing module, optimization module, and forecasting module for accurate and stable forecasting [41]. Chen et al. proposed a novel hybrid method for short-term wind speed forecasting called EnsemLSTM, in which LSTMs, SVRM, and EO are used [42]. Liu et al. proposed a novel hybrid model based on the empirical wavelet transformation and two kinds of recurrent neural networks [43]. Zhou et al. proposed a novel hybrid wind speed forecasting model, which includes data analysis, a model selection strategy, forecasting processing combined with a modified multi-objective optimization algorithm, and model evaluation [44].

In this paper, to improve the accuracy and speed of short-term wind speed forecasting, a hybrid wind speed forecasting model is developed, which includes four modules: CSA, WT, FS based on entropy and mutual information, and deep learning time series prediction based on LSTM. To eliminate the fluctuating behavior of the wind speed time series, the WT module decomposes the original wind speed signal into four sub-series. The FS module ranks candidate inputs and eliminates redundant inputs according to their information value for forecasting the wind speed signal. By doing this, in addition to improving the prediction accuracy, the run time of the model is significantly decreased. Finally, the outputs of the previous step are used as inputs of the LSTM module and forecasted by it. The CSA is used to optimize the LSTM structure and the number of input features. To validate the performance of the proposed model for short-term wind speed forecasting, data from two case studies are used: Sotavento, located in the south-west of Europe, in Galicia, Spain, and Kerman, located in the Middle East, in the center of Iran. The results demonstrate that the proposed hybrid model has satisfactory performance in high-precision wind speed forecasting.

The rest of the paper is organized as follows: Section 2 describes the framework of the proposed model; Section 3 provides the methodology of the proposed model for wind speed forecasting, which includes an explanation of WT, the FS technique, the LSTM model, and CSA; and Section 4 contains the simulation and discussion of results.

2. The framework of the proposed model

This paper proposes a novel model that consists of WT, FS, LSTM, and CSA for wind speed forecasting. Fig. 1 shows the framework of the proposed model, and the steps are as follows:

1. In the first step, WT is applied to decompose the original wind speed signal. The details of WT are presented in Section 3.1.
2. FS based on entropy and MI is used to rank candidate inputs according to their information value for prediction of the wind speed signal. Section 3.2 presents the details of FS.
3. Then, all sub-layers from the previous steps are forecasted by the LSTM model. The LSTM network is presented in Section 3.3.
4. In the two previous steps, the CSA is used to optimize the LSTM structure and the number of input features.
5. Finally, the results from all sub-layers are aggregated to obtain the short-term wind speed forecasting results.

3. Methodology

3.1. Wavelet transform

Generally, time-series patterns of wind velocity have nonlinear and dynamic properties that present themselves as sharp and fluctuating points. These points strongly influence the prediction of wind speed. Wind speeds include several variable features, such as changes in level, slope, and seasonal features, that are often among the most challenging parts of the signal. Therefore, the WT is used to analyze the asymmetric nature of this time series data set. In other words, the WT can extract different aspects of the data, such as breakpoints and discontinuities. Usually, a DWT is used to improve the computation; the wavelet filter bank used in this paper is the Daubechies orthogonal wavelet. It divides the input data into four distinct levels (A3, D1, D2, D3), where A and D stand for approximation and detail, respectively (f = A3 + D3 + D2 + D1). Fig. 2 shows the separation procedure of the input data using WT.

3.2. Feature selection technique

FS is a process commonly used in machine learning, in which a subset of the available features of the data is selected for use by the learning algorithm [27]. One of the most advanced FS techniques is MI, which is based on the entropy concept. In the following, the proposed method to select the best inputs is described.
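As a rough illustration of MI-based ranking, the sketch below estimates the MI between lagged values of a series and the value to be predicted, using the standard discrete definition MI(x, y) = Σ P(x, y) log2[P(x, y) / (P(x) P(y))]. The equal-width binning, the `bins` parameter, the use of lags as candidate inputs, and the function names are assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=4):
    """Estimate MI in bits between two equal-length series (assumed equal-width binning)."""
    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant series: every value falls in bin 0
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(bx)
    joint = Counter(zip(bx, by))   # counts for P(x_i, y_j)
    px, py = Counter(bx), Counter(by)  # counts for the marginals
    mi = 0.0
    for (i, j), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[i] / n) * (py[j] / n)))
    return mi


def rank_lags(series, max_lag, bins=4):
    """Rank lags 1..max_lag by the MI between series[t - lag] and series[t], highest first."""
    scores = []
    for lag in range(1, max_lag + 1):
        inputs, targets = series[:-lag], series[lag:]
        scores.append((lag, mutual_information(inputs, targets, bins)))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)
```

A threshold rule would keep every lag whose score exceeds a chosen relevancy threshold, while an optimizer could instead choose how many of the top-ranked lags to keep.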


Fig. 1. Framework of the proposed method.
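The decomposition block of the framework (Section 3.1) splits the signal so that f = A3 + D3 + D2 + D1. The sketch below illustrates that additive structure with the Haar wavelet, which is swapped in here only because its three-level approximation reduces to block averages and can be written without a wavelet library; the paper itself uses a Daubechies filter bank. The function names and the requirement that the signal length be a multiple of 2^levels are assumptions of this example.

```python
def block_means(signal, width):
    """Replace each consecutive block of `width` samples by the block mean."""
    smoothed = []
    for start in range(0, len(signal), width):
        block = signal[start:start + width]
        smoothed.extend([sum(block) / len(block)] * len(block))
    return smoothed


def haar_components(signal, levels=3):
    """Return (A_levels, [D1, ..., D_levels]) so that their sum rebuilds the signal.

    For the Haar wavelet, the level-j approximation equals block averages over
    2**j samples, and each detail is the difference of successive approximations.
    """
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    details = []
    previous = list(signal)  # A0 is the signal itself
    for level in range(1, levels + 1):
        approx = block_means(signal, 2 ** level)
        details.append([p - a for p, a in zip(previous, approx)])  # D_level = A_(level-1) - A_level
        previous = approx
    return previous, details


a3, (d1, d2, d3) = haar_components([4, 6, 10, 12, 8, 6, 5, 3])
rebuilt = [a + x + y + z for a, x, y, z in zip(a3, d1, d2, d3)]
```

Each of the four components can then be forecast separately, and the forecasts summed to recover a forecast of the original series.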


Fig. 2. Data separation using DWT.

Relevancy: The similarity of input and target is an essential factor for input selection. The MI between output and input is an efficient way to calculate this similarity. In the filter used to select the inputs in this paper, the purpose is to find the MI between inputs X and target Y. The mutual information MI(x, y) between the two variables is defined as follows [27]:

$$MI(x, y) = \sum_{i=1}^{n} \sum_{j=1}^{m} P(x_i, y_j) \log_2 \frac{P(x_i, y_j)}{P(x_i)\, P(y_j)} \tag{1}$$

If the MI between input and target is large, these two variables are similar; if the MI is zero, the two random variables are totally unrelated. In Eq. (1), the input x and target y contain n and m variables, respectively. After the MI value is calculated based on Eq. (1), the input features are ranked. Without an optimization algorithm, FS is preset using a threshold limit: candidate inputs whose MI value is higher than a relevancy threshold are selected. When an optimization algorithm is used, the selection among candidate inputs is made by it.

3.3. LSTM model

Deep learning is a machine learning method that uses deep neural networks. A deep neural network is a multi-layer neural network that has several layers. Fig. 3 illustrates the concept of deep learning and its relation to machine learning.

According to the figure, the deep neural network is in the final position of machine learning, and the learning rule is an algorithm that extracts the deep neural network model from the training data. When the proper model is obtained with the appropriate learning rule, deep neural network, and training data, it is then applied to input data.

RNN is a type of deep neural network. One of the major problems of this model is the vanishing gradient. To address it, the LSTM network is introduced, which is a combination of short-term and long-term memory that can partially solve the vanishing gradient problem. The LSTM network is capable of addressing the long-term and short-term dependency problems [43]. The critical component of the LSTM network is the memory cell. The basic LSTM cell is shown in Fig. 4.

According to Fig. 4, the basic LSTM cell has three gate units, called the input gate, output gate, and forget gate. The cell state updates and the computation of the LSTM outputs are as follows:

$$f_t = \sigma(W_{xf} \cdot x_t + W_{hf} \cdot h_{t-1} + b_f) \tag{2}$$
$$i_t = \sigma(W_{xi} \cdot x_t + W_{hi} \cdot h_{t-1} + b_i) \tag{3}$$
$$\bar{c}_t = \tanh(W_{xc} \cdot x_t + W_{hc} \cdot h_{t-1} + b_c) \tag{4}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \bar{c}_t \tag{5}$$
$$o_t = \sigma(W_{xo} \cdot x_t + W_{ho} \cdot h_{t-1} + b_o) \tag{6}$$
$$h_t = o_t \odot \tanh(c_t) \tag{7}$$
$$y_t = \sigma(W_{hy} \cdot h_t + b_y) \tag{8}$$
$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{9}$$

As is known, neural network accuracy can improve with an increasing number of network layers. This is why multi-layer LSTM networks also perform better. The multi-layer LSTM network has only one direction of transmission for the sequence.

3.4. Crow search algorithm

Crows are among the most intelligent birds, and there is considerable evidence of their intelligence. Crows can remember faces and warn each other whenever an unkind person approaches them. They can also use tools, communicate in complex ways, and remember the hiding place of their food for several months. Besides, crows have a greedy habit of watching where other birds hide their food: they try to find the hiding places of other birds' food and steal it. The CSA tries to imitate this smart behavior to provide an efficient way to solve optimization problems [45]. In this algorithm, N is the number of crows, and the position of crow i at iteration iter is given by $x_{i,iter} = [x_1^{i,iter}, x_2^{i,iter}, \ldots, x_d^{i,iter}]$, where d is the number of decision variables. Each crow has a memory that stores the position of its hiding place. The process of the algorithm is detailed as follows:

Fig. 3. The concept of deep learning and its relation to machine learning.


Fig. 4. the basic LSTM cell.
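To make the gate computations of the basic LSTM cell in Fig. 4 concrete, the following sketch performs one time step of Eqs. (2) to (7) with plain Python lists. The parameter dictionary layout (keys such as "Wxf", "Whf", "bf") and the function names are assumptions chosen to mirror the nomenclature; the output layer of Eq. (8) and any training logic are left out, and this is not the implementation used in the paper.

```python
import math


def sigmoid(x):
    """Logistic function of Eq. (9)."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(w_x, x, w_h, h, b):
    """Compute W_x . x + W_h . h + b for matrices stored as lists of rows."""
    return [
        sum(wx * xv for wx, xv in zip(row_x, x))
        + sum(wh * hv for wh, hv in zip(row_h, h))
        + bias
        for row_x, row_h, bias in zip(w_x, w_h, b)
    ]


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM time step; `p` maps names like "Wxf", "Whf", "bf" to parameters."""
    def gate(name, activation):
        pre = affine(p["Wx" + name], x, p["Wh" + name], h_prev, p["b" + name])
        return [activation(v) for v in pre]

    f = gate("f", sigmoid)        # forget gate, Eq. (2)
    i = gate("i", sigmoid)        # input gate, Eq. (3)
    c_bar = gate("c", math.tanh)  # candidate cell state, Eq. (4)
    c = [ft * cp + it * cb for ft, cp, it, cb in zip(f, c_prev, i, c_bar)]  # Eq. (5)
    o = gate("o", sigmoid)        # output gate, Eq. (6)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]  # Eq. (7)
    return h, c
```

Calling `lstm_step` repeatedly over a sequence, feeding each returned `h` and `c` into the next call, gives the forward pass of a single LSTM layer.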

Table 1
The parameters of the WT-FS-LSTM model.

WT-FS-LSTM model parameter   Input feature   Learning rate   Batch size
Range                        1–50            0.0001–0.1      10–256

1. Initialization of the position and memory of each crow. The initial position of each crow is randomly determined. In the first iteration, the memory of each crow is the same as its initial position.
2. Fitness evaluation. In this step, the objective function of the optimization problem is calculated for each crow.
3. Position update. At iteration iter, the position of the hiding place of crow i is denoted by $m_{i,iter}$ [45]. In CSA, at each iteration the position of crow i is updated as follows:

$$x_{i,iter+1} = \begin{cases} x_{i,iter} + r_i \times fl_{i,iter} \times (m_{j,iter} - x_{i,iter}) & r_i \geq AP_{j,iter} \\ \text{a random position} & \text{otherwise} \end{cases} \tag{10}$$

Based on Eq. (10), if crow j is not aware that it is being followed by crow i, crow i moves toward the hiding place of crow j. Otherwise, crow i moves to a random position.

4. Memory update. If the new position of a crow is better than its memory position, the memory of the crow is updated.
5. Convergence. By increasing the number of iterations, the best

objective function value of the problem is found, and with it the best solution of the optimization problem is identified.

Fig. 5. Hourly wind speed data in Sotaventogalicia site.

Table 2
The results of different forecasting methods in Sotaventogalicia site.

Forecasting method             Number of input features   Learning rate   Batch size   MAE (m/s)   RMSE (m/s)   MAPE (%)
MLP                            400                        –               –            2.6867      3.4945       38.4705
WT-FS-MLP                      12                         –               –            0.3850      0.6180       5.6050
LSTM                           400                        0.0005          128          2.0018      2.6535       28.0287
FS-LSTM                        12                         0.0005          128          1.4308      1.8871       21.5643
WT-LSTM                        400                        0.0005          128          0.6673      0.8739       10.0586
WT-FS-LSTM                     12                         0.0005          128          0.1973      0.2700       3.1271
Optimized WT-FS-LSTM by PSO    7                          0.0053          193          0.1921      0.2629       2.9940
Optimized WT-FS-LSTM by CSA    9                          0.0043          126          0.1839      0.2591       2.8578

Fig. 6. The comparison of a different forecasting method in Sotaventogalicia site.

4. Simulation and discussion of results

As mentioned, the time series of wind speeds have many nonlinear and dynamic patterns. It was also noted that a different approach was used to address this point. In this paper, to resolve these problematic patterns in wind speed prediction, the wind speed data is first sent to the WT block, and then data with the same oscillation are grouped into their categories. Then the train and target matrices are made. Then the appropriate inputs are selected according to the method described in Section 3.2. Finally, for forecasting, these data are given to the LSTM network. One of the features of the presented paper is relying on

the use of the LSTM network as a suitable approach to predict wind speed. Therefore, the following steps should be followed to forecast wind speed using the method described in this paper:

1- First, the historical time series of wind speed data is received, and then these data are normalized.
2- The normalized data from the previous step are imported into the CSA.
3- In this step, CSA generates crow positions based on the decision variables and tries to optimize the number of input features and the LSTM structure. The decision variables of CSA are the number of input features, learning rate, and batch size. Its objective function is the MAE, which is described in Eq. (13).
4- For each crow position, WT divides the normalized data from step 1 into four different frequency levels.
5- At this step, train and target matrices for each frequency level are constructed. For this purpose, 400 training samples are considered from the end of the input time series, so that the size of the training matrix is 1200×400 and the size of the target matrix is 1200×1. Of these data, 80% is used to train the LSTM network and the rest to test it.
6- At this point, the 400 training samples enter the FS step and are ranked based on the MI method. Then, given the random number generated by the CSA for the number of input features, the corresponding number of top-ranked inputs is selected. Therefore, the inputs that are of high value in predicting wind speed are selected.
7- At this step, an appropriate forecaster based on the LSTM network, in which CSA optimizes the learning rate and batch size, is designed for each WT frequency level. The designed LSTM network has two layers: the first layer has 200 hidden units, and the second layer has 100 hidden units.
8- The IWT is applied to each of the forecaster outputs.
9- The trained weights of the forecaster are applied to the test data, the objective function is evaluated, and the optimized input features and LSTM structure are obtained. The formulas of the MAPE, RMSE, and MAE are described in Eqs. (11), (12), and (13), respectively. These indices are used to evaluate the efficiency of the proposed method in the presented paper.

Fig. 7. The real and forecasted results in Sotaventogalicia site using optimized WT-FS-LSTM methods.
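The search loop that drives the steps above can be sketched with the crow search update of Eq. (10). In this minimal example the objective is a hypothetical stand-in for the test MAE of a trained forecaster, and several details are assumptions rather than statements about the paper: the function and parameter names, drawing separate random numbers for the awareness test and the step size, clipping positions to the bounds, and the fixed iteration count.

```python
import random


def crow_search(objective, bounds, n_crows=30, awareness_prob=0.2,
                flight_length=1.0, iterations=50, seed=0):
    """Minimize `objective` over box `bounds` with a basic crow search loop."""
    rng = random.Random(seed)

    def random_position():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    def clip(position):
        # Assumption: positions leaving the search box are pulled back to its edge.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(position, bounds)]

    positions = [random_position() for _ in range(n_crows)]
    memory = [list(p) for p in positions]          # hiding places start at the initial positions
    memory_cost = [objective(p) for p in memory]   # fitness evaluation

    for _ in range(iterations):
        for i in range(n_crows):
            j = rng.randrange(n_crows)             # crow i picks a crow j to follow
            if rng.random() >= awareness_prob:     # crow j is unaware: move toward its memory
                r = rng.random()
                candidate = clip([x + r * flight_length * (m - x)
                                  for x, m in zip(positions[i], memory[j])])
            else:                                  # crow j is aware: crow i goes somewhere random
                candidate = random_position()
            positions[i] = candidate
            cost = objective(candidate)
            if cost < memory_cost[i]:              # memory update
                memory[i], memory_cost[i] = list(candidate), cost

    best = min(range(n_crows), key=memory_cost.__getitem__)
    return memory[best], memory_cost[best]
```

In the setting of this paper, a position would hold the three decision variables (number of input features, learning rate, batch size) within the ranges of Table 1, with the integer-valued variables rounded before training.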


Fig. 8. the errors for different forecasting methods in Sotaventogalicia site.

Fig. 9. the errors for optimized WT-FS-LSTM methods in Sotaventogalicia site.
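The three error indices of Eqs. (11) to (13) translate directly into code. The sketch below is a plain Python rendering with assumed function names; it takes the actual and forecasted values as equal-length sequences and, for MAPE, assumes that no actual value is zero.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error in percent, Eq. (11); actual values must be nonzero."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean square error, Eq. (12)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error, Eq. (13)."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
```

Because MAPE divides by the actual value, it becomes very large for calm periods with wind speeds near zero, which is one reason MAE and RMSE are reported alongside it.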

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{|S_i^{ACT} - S_i^{FOR}|}{S_i^{ACT}} \times 100\% \tag{11}$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (S_i^{ACT} - S_i^{FOR})^2} \tag{12}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} |S_i^{ACT} - S_i^{FOR}| \tag{13}$$

In this paper, to compare the performance of the proposed method for forecasting with other methods, wind speed data from the Sotaventogalicia site, Spain, in the year 2017, and from the city of Kerman, Iran, in the year 2009 have been used. The comparison models are as follows: the LSTM model, WT-LSTM model, WT-FS-LSTM model, FS-LSTM model, optimized WT-FS-LSTM model, multilayer perceptron (MLP) neural network model, and WT-FS-MLP model. The CSA parameters in the optimized WT-FS-LSTM method are as follows: population size is 30, awareness probability is 0.2, and flight length is 1. Also, the ranges of the WT-FS-LSTM model parameters (input features, learning rate, and batch size) are shown in Table 1. The feed-forward MLP neural network


Fig. 10. coefficient of determination for Sotaventogalicia site by different methods.

that is used in this paper has two layers and was trained with the Levenberg-Marquardt method. All the proposed methods are implemented in MATLAB R2018b.

4.1. Sotaventogalicia site

Sotavento is located in the south-west of Europe, in Galicia, Spain, in “A Serra da Loba”, with Lat/Long of 43.354377°N and 7.881213°W. The wind speed at the Sotaventogalicia wind farm is recorded every 10 min, but in this paper, the hourly wind speed is used. In this case study, the hourly wind speed data from October 10, 2017, to December 30, 2017, was utilized as the dataset to perform short-term wind speed forecasting [46]. 80% of the total 1200 samples are used as the training set and 20% as the test set. The wind speed prepared in case study 1 is shown in Fig. 5.

The results of the different forecasting methods at the Sotaventogalicia site are listed in Table 2. As can be seen from the


Fig. 11. the hourly wind speed data in the Kerman site.

Table 3
The results of different forecasting methods in the Kerman site.

Forecasting method             Number of input features   Learning rate   Batch size   MAE (m/s)   RMSE (m/s)   MAPE (%)
MLP                            400                        –               –            2.1418      2.6213       70.7681
WT-FS-MLP                      15                         –               –            0.1810      0.2335       6.1221
LSTM                           400                        0.0005          128          1.8017      2.1735       57.4884
FS-LSTM                        15                         0.0005          128          1.3863      1.7223       47.4771
WT-LSTM                        400                        0.0005          128          1.5289      1.8756       47.5317
WT-FS-LSTM                     15                         0.0005          128          0.1311      0.1655       4.2365
Optimized WT-FS-LSTM by PSO    5                          0.0056          243          0.1306      0.1621       4.2201
Optimized WT-FS-LSTM by CSA    11                         0.0034          152          0.1217      0.1536       4.0857

results in Table 2, the optimized WT-FS-LSTM by CSA model forecasts wind speed better than the other proposed methods for this case study. It has the lowest values of MAPE, RMSE, and MAE among all methods (2.8578%, 0.2591 m/s, and 0.1839 m/s, respectively). So, by adding the CSA optimization method, FS, and WT blocks, the accuracy of the LSTM method is significantly improved. These results also indicate that the CSA optimization method performs better than the PSO algorithm.

These results are verified by Fig. 6, in which the real and predicted wind speeds are presented for the first six methods. As is evident, the WT-FS-LSTM method follows the real wind speed closely. These results indicate that the LSTM network performs better than the MLP neural network. The optimized WT-FS-LSTM method can still improve the forecasting results further. The real wind speed and the wind speed forecasted by the optimized WT-FS-LSTM methods are shown in Fig. 7. The results in this figure also confirm the prediction improvement.

Fig. 8 shows the error value of the wind speed forecasting for the Sotaventogalicia site using different methods. As is clear, the error value of the WT-FS-LSTM method is low. The error value of the wind speed forecasting for the Sotaventogalicia site using the optimized WT-FS-LSTM methods is shown in Fig. 9. Among all the proposed methods at the Sotaventogalicia site, the optimized WT-FS-LSTM by CSA method has the lowest prediction error. To illustrate the efficiency of the proposed method for forecasting wind speed, in addition to the previously mentioned indexes, the coefficient of determination (R2) has been used. Fig. 10 shows the correlation between the real and the forecasted wind speed values using different methods. The value of R2 for the Sotaventogalicia site using the WT-FS-LSTM and optimized WT-FS-LSTM methods is not less than 0.99. This indicates that the forecasted and real values of these methods are very close. That is, these methods are well able to forecast the wind speed at the Sotaventogalicia site.

4.2. Kerman site

Kerman is located in the Middle East, in the center of Iran, with Lat/Long of 30.2839° N and 57.0834° E. In this case, the hourly wind speed data from the year 2009 were utilized as the dataset to perform short-term wind speed forecasting. As in the previous case study, 80% of the total 1200 samples are used as the training set and 20% as the test set. The wind speed collected in this case study is shown in Fig. 11. As is clear, the wind speed in this case study has a lot of fluctuations, which makes it difficult to forecast.

To forecast the wind speed in this case study, the same eight models as in the previous case study are used: MLP, WT-FS-MLP, LSTM, FS-LSTM, WT-LSTM, WT-FS-LSTM, and optimized WT-FS-LSTM by PSO and by CSA. The forecasting results of the different forecasting models are listed in Table 3. As can be seen from the results in Table 3, the optimized WT-FS-LSTM by CSA model forecasts wind speed better than the other proposed methods for the Kerman site. Therefore, although the wind speed in the Kerman site is


Fig. 12. The comparison of a different forecasting method in Kerman site.

highly volatile, the optimized WT-FS-LSTM by CSA method is capable of forecasting this data accurately. The lower values of MAPE, RMSE, and MAE of this method compared with the other methods also confirm this claim. In other words, by adding CSA and more valuable features, as well as adding WT to the LSTM network, the accuracy of this prediction method is increased.

These results are verified by Fig. 12, in which the real and predicted wind speeds are presented for the different methods. The optimized WT-FS-LSTM methods can still improve the forecasting results further. The results of the optimized WT-FS-LSTM methods and the real wind speed almost overlap, so it can be claimed that they are very similar. The results of these methods are shown in Fig. 13.

Fig. 14 shows the error value of the wind speed forecasting for the Kerman site using different methods. As is clear, the error value of the WT-FS-LSTM method is low. The error values calculated from the optimized WT-FS-LSTM methods are shown in Fig. 15. The error value of the WT-FS-LSTM by CSA method is lower than those calculated from the other presented models for wind speed forecasting at the Kerman site (Table 3). Therefore, these results demonstrate that the optimized WT-FS-LSTM by CSA model is superior to all other proposed models in this paper in wind speed forecasting.

The graph of the coefficient of determination (R2) between the forecasted and the real wind speed values is used to show how well the forecasted data fit the real data. This is shown in Fig. 16. The value of R2 for the Kerman site using the proposed optimized WT-FS-LSTM by CSA method is 0.9942. In other words, this figure indicates that the proposed method fits the actual and the forecasted data very well. This method also has a better R2 value than the other methods.

5. Conclusion

So far, various methods have been proposed for short term wind

Fig. 13. The real and forecasted results in Kerman site using optimized WT-FS-LSTM.

speed forecasting, in which each of them addressed this issue using a 6. Conclusion
variety of methods and techniques. In this paper, for short term wind
speed forecasting, a hybrid model based on the CSA, WT, FS based on So far, various methods have been proposed for short term wind
entropy and MI, and deep learning time series prediction based on speed forecasting, in which each of them addressed this issue using a
LSTM is proposed. In the proposed hybrid model, the WT is applied to variety of methods and techniques. In this paper, for short term wind
speed forecasting, a hybrid model is proposed based on the crow search algorithm (CSA),
WT, FS based on entropy and MI, and deep learning time series prediction based on LSTM.
In the proposed hybrid model, the WT decomposes the raw wind speed signal into four
sub-layers, the FS module ranks candidate inputs according to their information value
for forecasting the wind speed signal, the LSTM network forecasts all sub-layers, and
CSA optimizes the LSTM structure and the number of input features. To illustrate the
performance of the proposed algorithm, two case studies are used: wind speed data from
Sotavento for the year 2017 and from Kerman for the year 2009. To evaluate the validity
of the hybrid system, three error measures, MAPE, MAE, and RMSE, are used in the
forecasting experiments. For example, at the Sotavento site, the MAPE, MAE, and RMSE
are 2.8578, 0.1839, and 0.2591, respectively. The results show that the WT-FS-LSTM
model optimized by CSA outperforms MLP, WT-FS-MLP, basic LSTM, FS-LSTM, WT-LSTM,
WT-FS-LSTM, and WT-FS-LSTM optimized by PSO. The proposed model can also be applied to
forecast other important power system parameters such as price, load, and reserve.
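
The three error measures named above can be sketched as follows. This is a minimal illustration using the standard textbook definitions of MAPE (in percent), MAE, and RMSE; the exact formulas used in the paper are not restated in this section, so the definitions, the function name `forecast_errors`, and the sample values are assumptions rather than the authors' implementation.

```python
import numpy as np


def forecast_errors(actual, predicted):
    """Return MAPE (percent), MAE, and RMSE for two equal-length series.

    Assumes the standard definitions; MAPE is undefined when any
    actual value is zero, so that case is rejected explicitly.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")
    if np.any(actual == 0.0):
        raise ValueError("MAPE is undefined when an actual value is zero")

    residual = actual - predicted
    mae = float(np.mean(np.abs(residual)))           # mean absolute error
    rmse = float(np.sqrt(np.mean(residual ** 2)))    # root mean square error
    mape = float(100.0 * np.mean(np.abs(residual / actual)))  # percent
    return {"MAPE": mape, "MAE": mae, "RMSE": rmse}


# Hypothetical wind speeds in m/s, for illustration only.
errors = forecast_errors([5.0, 10.0, 8.0], [4.0, 11.0, 8.0])
print(errors)
```

Because MAPE divides by the actual value, calm periods with near-zero wind speed can dominate it, which is one reason to report MAE and RMSE alongside it.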


Fig. 14. The errors for different forecasting methods in case study 2.

Fig. 15. The errors for the optimized WT-FS-LSTM methods at the Kerman site.


Fig. 16. Coefficient of determination for the Kerman site by different methods.


CRediT authorship contribution statement

Gholamreza Memarzadeh: Conceptualization, Methodology, Data curation, Supervision,
Writing - review & editing. Farshid Keynia: Writing - original draft, Software,
Investigation, Validation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal
relationships that could have appeared to influence the work reported in this paper.
