ABSTRACT Wind is one of the most essential sources of clean, renewable energy, and is therefore a critical
element in responsible power consumption and production. The accurate prediction of wind speed plays a
key role in decision-making and management in wind power generation. This study proposes a model using
a deep belief network with genetic algorithms (DBNGA) for wind speed forecasting. The genetic algorithms
are used to determine parameters for deep belief networks. Wind speed and weather-related data are collected
from Taiwan’s Central Weather Bureau for this purpose. This study uses both time series data and multivariate
regression data to forecast wind speed. The seasonal autoregressive integrated moving average (SARIMA)
method and the least squares support vector regression for time series with genetic algorithms (LSSVRTSGA)
are used to forecast wind speed in a time series, and the least squares support vector regression with genetic
algorithms (LSSVRGA) and DBNGA models are used to predict wind speed in a multivariate format.
Empirical results show that forecasting wind speed by DBNGA models outperforms the other forecasting
models in terms of forecasting accuracy. Thus, the DBNGA model is a feasible and effective approach for
wind speed forecasting.
INDEX TERMS Forecast, wind speed, deep belief networks, genetic algorithms, multivariate regression, time series.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2019.2929542, IEEE
Access
into low-frequency wind speed data to be processed by the long short term memory neural network, and high-frequency wind speed data to be processed by the Elman neural network. The experimental results showed that the proposed hybrid model outperformed other forecasting techniques in terms of forecasting accuracy. Liu et al. [9] proposed a multistep forecasting model including variational mode decomposition, singular spectrum analysis, long short term memory networks, and an extreme learning machine for predicting wind speed. The variational mode decomposition and the singular spectrum analysis were employed to decompose data and extract data trends; the long short term memory network and the extreme learning machine were used to deal with low-frequency and high-frequency data, respectively. The empirical results revealed that the presented hybrid model could generate more accurate results in forecasting wind speed than the other models. Chen et al. [10] developed a combined model by applying the ensemble learning technique to long short term memory neural networks and a support vector regression machine for wind speed forecasting. Simulation results indicated that the proposed model achieved better forecasting performance than autoregressive integrated moving average models, support vector regression, the k-nearest neighbor technique, artificial neural networks, and the gradient boosting regression tree. Liu et al. [11] proposed a hybrid model using the wavelet packet decomposition technique, a convolutional neural network and a long short term memory network for predicting wind speed. The wavelet packet decomposition technique was used to decompose wind speed data into high-frequency and low-frequency data patterns. The convolutional neural network was employed to cope with high-frequency wind speed data, while the low-frequency wind speed data was processed by the convolutional long short term memory network. The numerical results demonstrated that the developed hybrid model outperformed other forecasting models in terms of wind speed prediction accuracy. Li et al. [12] developed a hybrid model for forecasting wind speed. The proposed model included empirical wavelet transform decomposition, a long short term memory network, a regularized extreme learning machine network and inverse empirical wavelet transform. These components performed raw wind speed data decomposition, time series wind speed data forecasting, forecasting error series modeling, final forecasting series integration, and outlier filtering. The numerical results showed that the performance of the presented hybrid model was superior to that of the other seven compared forecasting techniques in terms of forecasting accuracy.

Nevertheless, the choice of parameters for deep belief networks significantly influences their forecasting accuracy. Experimental or trial-and-error methods have been adopted to cope with such parameter selection [6, 13-16]. In addition, some metaheuristics, including the tabu search algorithm [17], particle swarm optimization [18-20] and genetic algorithms [21-24], have been used for deep learning network parameter selection. In this study, genetic algorithms [25] are used for deep belief network parameter selection for predicting wind speed from weather-related data. The remainder of this paper is arranged as follows. Section 2 presents the methodologies used in this study. The proposed wind speed forecasting framework and numerical results are described in Section 3. Finally, conclusions are drawn in Section 4.

II. METHODOLOGIES
A. THE SEASONAL AUTOREGRESSIVE INTEGRATED MOVING AVERAGE MODEL
Developed by Box and Jenkins [26], the autoregressive integrated moving average (ARIMA) model is a popular method for coping with time series data. The ARIMA method includes two components, the autoregressive and the moving average, and a future variable value is determined by a linear combination of prior values and errors. Three parameters, the order of the autoregressive component (p), the degree of differencing (d), and the order of the moving average component (q), are contained in the ARIMA model. The autoregressive model, AR(p), represents the association between present and past values. For a sequence with variables Y, the values of prior periods are employed to predict the present period value (Y_t), as expressed by Eq. (1):

Y_t = C + Σ_{i=1}^{p} φ_i Y_{t-i} + ε_t + Σ_{j=1}^{q} θ_j ε_{t-j}    (1)

where C is a constant, φ_i and θ_j are the autoregressive and moving average coefficients respectively, and ε_t is white noise at time t.

The seasonal autoregressive integrated moving average (SARIMA) model [26, 27] is a revised form of the ARIMA model. The SARIMA model was designed for forecasting time series data with trend and seasonal tendency, and uses regular and seasonal differencing to remove trends and seasonal patterns. A SARIMA (p, d, q)(P, D, Q)_s model produces a time series, {Y_t, t = 1, 2, ..., k}, with mean M from an ARIMA [26] time series model satisfying Eq. (2):

φ_p(L) ϑ_P(L^s) (1-L)^d (1-L^s)^D (Y_t - M) = θ_q(L) Θ_Q(L^s) ε_t    (2)

where L is the backward shift operator, ε_t is white noise at time t, and p, d, q, P, D and Q are nonnegative integers. The orders p and P belong to the regular and seasonal autoregressive operators, the orders q and Q to the regular and seasonal moving average operators, d and D denote the degrees of regular and seasonal differencing, and s indicates the seasonal interval. φ_p(L) = (1 - φ_1 L - φ_2 L^2 - ... - φ_p L^p) is the regular autoregressive operator of order p, and θ_q(L) = (1 - θ_1 L - θ_2 L^2 - ... - θ_q L^q) is the regular moving average operator of order q. In addition, ϑ_P(L^s) = (1 - ϑ_1 L^s - ϑ_2 L^{2s} - ... - ϑ_P L^{Ps}) is the seasonal autoregressive operator of order P, and Θ_Q(L^s) = (1 - Θ_1 L^s - Θ_2 L^{2s} - ... - Θ_Q L^{Qs}) is the seasonal moving average operator of order Q. The degrees of differencing, d and D, are determined so as to eliminate the tendency and seasonality of the data. The autocorrelation function and the
partial autocorrelation function of the differenced series are employed to determine the values of p, P, q and Q. In general, a four-step procedure involving identification, estimation, validation, and forecasting is performed to find suitable parameters and to make a forecast with the SARIMA model [26-29].

B. THE LEAST SQUARES SUPPORT VECTOR REGRESSION MODEL
Revised from the support vector regression model [30-32] to decrease the computational load, the least squares support vector regression (LSSVR) model [33] is a popular and powerful technique for dealing with forecasting problems in both time series and multivariate regression formats. By introducing a least squares approach to the support vector regression model for a training data set {X_i, Y_i}, i = 1, ..., N, in order to formulate a regression problem, the LSSVR model can be depicted as the following optimization problem [33, 34]:

Min: (1/2) w^T w + (Ʊ/2) Σ_{i=1}^{N} e_i^2    (3)
subject to
Y_i = w · ρ(X_i) + b + e_i,  i = 1, ..., N

where w is the weight vector of the same dimension as the feature space, Ʊ is the regularization constant signifying the trade-off between the empirical error and the flatness of the function, N is the number of data points, e_i is the ith training error, X_i and Y_i are the input and output data points of the ith pattern, ρ(X) is a nonlinear mapping function, and b is the bias. By the Lagrange multiplier technique, the Lagrange form of Eq. (3) is rewritten as:

L(w, b, e, α) = (1/2)‖w‖^2 + (Ʊ/2) Σ_{i=1}^{N} e_i^2 - Σ_{i=1}^{N} α_i ( w · ρ(X_i) + b + e_i - Y_i )    (4)

where α_i are the Lagrange multipliers. Then, the partial derivatives of Eq. (4) are taken with respect to w, b, e and α, and all derivatives are set equal to zero according to the Karush–Kuhn–Tucker conditions [35-37]. Using the least squares method to resolve the resulting linear equations, the LSSVR model can be represented as Eq. (5):

Y(X) = Σ_{i=1}^{N} α_i ρ(X)^T ρ(X_i) + b    (5)

With respect to Mercer’s principle [38], the following kernel function is defined:

K(X, X_i) = ρ(X)^T ρ(X_i)    (6)

In this study, the radial basis function with kernel width σ indicated by Eq. (7) serves as the kernel function:

K(X, X_i) = exp( -‖X - X_i‖^2 / (2σ^2) )    (7)

In this investigation, suitable LSSVR model parameters are provided by genetic algorithms [25, 39].

C. DEEP BELIEF NETWORKS
A deep belief network [40] consists of several stacked restricted Boltzmann machines (RBMs), which are the main components of a deep belief network. The RBM was revised from the original Boltzmann machine (BM) [41], which has node connections both within layers and between layers. The RBM only allows links between nodes in adjacent layers. An RBM consists of two layers: a visible layer and a hidden layer. Each node in the visible layer is linked to all hidden nodes without directions. Deep belief network learning strategies include unsupervised learning in the pre-training stage, and supervised learning in the fine-tuning stage. The main task of unsupervised learning is to select appropriate initial parameters, including biases and weights, for the supervised learning, using only independent variables. Thus, the pre-training phase rebuilds training samples by adjusting parameters in order to maximize the likelihood estimation. Based on the initial parameters provided by the pre-training phase, the supervised learning phase can further tune biases and weights efficiently. Figure 1 illustrates an RBM structure consisting of a visible layer with i units and a hidden layer with j units. The outputs of units in the visible layer are treated as inputs of units in the hidden layer. An energy function of a joint configuration (v, h) of visible and hidden units can be depicted as Eq. (8):

Energy(v, h) = - Σ_m α_m v_m - Σ_n β_n h_n - Σ_m Σ_n v_m W_{m,n} h_n    (8)

where W_{m,n} is the weight of the connection between unit m in the visible layer, with bias α_m, and unit n in the hidden layer, with bias β_n. The values of v_m and h_n follow a binary distribution stochastically, and the joint probability can be represented as Eq. (9):

p(v, h) = exp(-Energy(v, h)) / Σ_{v',h'} exp(-Energy(v', h'))    (9)

where Σ_{v',h'} exp(-Energy(v', h')) is a normalization element. When the state vector v of the visible layer or h of the hidden layer is given, the activation probabilities of h_n and v_m can be expressed as Eq. (10) and Eq. (11), respectively:

p(h_n = 1 | v) = sigm( β_n + Σ_m v_m W_{m,n} )    (10)
p(v_m = 1 | h) = sigm( α_m + Σ_n W_{m,n} h_n )    (11)

Here sigm(·) is the sigmoid function and can be represented as Eq. (12):

sigm(x) = 1 / (1 + e^{-x})    (12)

The aim of the RBM is to search for three appropriate parameters, namely α, β and W, represented collectively as a parameter set Ω. The pre-trained parameter set serves as an initial parameter set for the supervised learning stage of the deep belief network. Thus, the features of the training data set can be depicted well by the parameter set, and the likelihood function can be maximized. The contrastive divergence algorithm [42] was presented to optimize the likelihood function when the distributions of the visible layer and the hidden layer are in steady states [43]. The likelihood function is expressed as Eq. (13):

ln L(Ω) = ln Π_{t=1}^{T} P(V^t) = Σ_{t=1}^{T} ln P(V^t)    (13)

where T is the number of training data. The gradient of Eq. (13) with respect to the parameter set can be expressed as Eq. (14):

∂ ln L(Ω) / ∂Ω = Σ_{t=1}^{T} ∂ ln P(V^t) / ∂Ω    (14)

Then, the derivative of ln P(V^t) with respect to Ω can be
α_m(t+1) = MO·α_m(t) + η (1/T) Σ_{t=1}^{T} ( v_m^{(0)} - v_m^{(k)} )    (19)
β_n(t+1) = MO·β_n(t) + η (1/T) Σ_{t=1}^{T} ( P(h_n = 1 | v^{(0)}) - P(h_n = 1 | v^{(k)}) )    (20)

where v^{(0)} denotes a training (data) state of the visible layer and v^{(k)} the state reconstructed after k Gibbs sampling steps.

FIGURE 2. An illustration of a deep belief network. [44-46]
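The pre-training described in this section is commonly implemented with k-step contrastive divergence. The sketch below is a minimal NumPy illustration, not the paper's code: variable names are assumptions, and it applies standard momentum to the gradient estimates, a common variant of the momentum (MO) and learning rate (η) updates of Eqs. (19)-(20).

```python
import numpy as np

rng = np.random.default_rng(0)

def sigm(x):
    # Eq. (12): logistic sigmoid
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(V, W, a, b, vel, lr=0.1, momentum=0.9):
    """One CD-1 update of RBM weights W, visible biases a, hidden biases b.

    V is a (T, m) batch of binary visible vectors; vel holds the momentum
    velocities. All names are illustrative assumptions.
    """
    T = V.shape[0]
    ph0 = sigm(V @ W + b)                        # Eq. (10): p(h = 1 | v)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigm(h0 @ W.T + a)                     # Eq. (11): p(v = 1 | h)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigm(v1 @ W + b)
    dW = (V.T @ ph0 - v1.T @ ph1) / T            # positive minus negative phase
    da = (V - v1).mean(axis=0)                   # cf. Eq. (19)
    db = (ph0 - ph1).mean(axis=0)                # cf. Eq. (20)
    for key, g in (("W", dW), ("a", da), ("b", db)):
        vel[key] = momentum * vel[key] + lr * g  # momentum on the gradient
    return W + vel["W"], a + vel["a"], b + vel["b"], vel

# toy usage: 6 visible units, 3 hidden units, 20 training vectors
W = 0.01 * rng.standard_normal((6, 3))
a, b = np.zeros(6), np.zeros(3)
vel = {"W": np.zeros_like(W), "a": np.zeros_like(a), "b": np.zeros_like(b)}
V = (rng.random((20, 6)) < 0.5).astype(float)
W, a, b, vel = cd1_step(V, W, a, b, vel)
```

In a deep belief network, each trained RBM's hidden activations become the visible input of the next RBM, and the resulting parameters initialize the supervised fine-tuning stage.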
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/ACCESS.2019.2929542, IEEE
Access
provided by genetic algorithms, using the negative RMSE of the wind speed forecasts as the fitness function in the training stage. The number of generations is used as the stop criterion for the genetic algorithms. The goal of selecting proper controlling parameters for metaheuristics is to optimize problems. The

(Flowchart fragments: “Use GA to generate a new set of MO1 and η1, and provisional RBM models”; “Use GA to generate a new set of MO2 and η2, and provisional MLP models”; “Train parameters (γ, W) of MLP models”; “Calculate values of the …”.)
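The GA-driven selection just described (negative RMSE as fitness, generation count as the stop criterion) can be sketched as follows. The evaluate function here is a synthetic stand-in for training a provisional model and scoring its forecasts; the operator choices (truncation selection, uniform crossover, Gaussian mutation) and all names are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(1)

def evaluate(mo, eta):
    """Stand-in fitness: negative RMSE of a model trained with (MO, eta).
    In the paper this would train a provisional DBN and forecast wind speed;
    a synthetic bowl-shaped error keeps the sketch self-contained."""
    rmse = (mo - 0.9) ** 2 + (eta - 0.5) ** 2 + 0.1
    return -rmse  # GA maximizes fitness = negative RMSE

def ga_search(pop_size=20, generations=30, mut_sd=0.05):
    # each chromosome encodes (MO, eta), both clipped to (0, 1]
    pop = rng.random((pop_size, 2))
    for _ in range(generations):  # generation count is the stop criterion
        fit = np.array([evaluate(mo, eta) for mo, eta in pop])
        order = np.argsort(fit)[::-1]
        parents = pop[order[: pop_size // 2]]  # truncation selection
        idx = rng.integers(0, len(parents), (pop_size - len(parents), 2))
        kids = parents[idx, [0, 1]]            # uniform crossover per gene
        kids = np.clip(kids + rng.normal(0, mut_sd, kids.shape), 1e-3, 1.0)
        pop = np.vstack([parents, kids])       # elitist: parents survive
    fit = np.array([evaluate(mo, eta) for mo, eta in pop])
    return pop[np.argmax(fit)]

best_mo, best_eta = ga_search()
```

The same loop applies unchanged to the LSSVR parameters (Ʊ, σ); only the chromosome encoding and the evaluate function differ.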
at three weather stations. The wind speed forecasting values of the different forecasting models and the actual values at the three weather stations are shown in Figs. 5-7. Figures 5-7 show that the DBNGA model can capture wind speed patterns well, especially at turning points. However, the other three models are unable to effectively follow the wind speed data patterns in mid-term time windows. Tables 4-6 list various measurements of forecasting accuracy by the four models with various time windows at the three weather stations. Table 7

FIGURE 5. Actual and forecasted values of hourly wind speed (m/s) at the Penghu station (actual values, DBNGA, SARIMA, LSSVRGA and LSSVRTSGA).

TABLE 1. Parameters of forecasting models at the Penghu station.
Stations | Models | Parameters
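The accuracy measurements reported in Tables 4-7, MAPE and RMSE, can be computed as follows. This plain-Python sketch is added for reference; the paper does not provide code, and the toy inputs are illustrative.

```python
import math

def rmse(actual, forecast):
    # root mean squared error
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mape(actual, forecast):
    # mean absolute percentage error, in percent; assumes no zero actuals
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# toy wind speeds (m/s)
y_true = [5.0, 4.0, 8.0]
y_hat = [4.0, 5.0, 6.0]
print(round(rmse(y_true, y_hat), 3))  # 1.414
print(round(mape(y_true, y_hat), 3))  # 23.333
```

MAPE is scale-free, which makes it suitable for comparing stations with different typical wind speeds, while RMSE penalizes large errors more heavily.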
TABLE 3. Parameters of forecasting models at the Matsu station.
Stations | Models | Parameters
Matsu | DBNGA | (MO1, η1) = (0.88, 0.49); (MO2, η2) = (0.98, 0.94)
Matsu | LSSVRGA | (Ʊ, σ) = (335.44, 8.92)
Matsu | LSSVRTSGA | (Ʊ, σ) = (463.41, 4.86)
Matsu | SARIMA | (p, d, q)(P, D, Q)s = (0, 1, 1)(0, 0, 1)24

FIGURE 7. Actual and forecasted values of hourly wind speed at the Matsu station.
TABLE 4. Measurements of forecasting accuracy by four models with various time windows at the Penghu station.
Time window (hours) | Measurement | DBNGA | LSSVRGA | LSSVRTSGA | SARIMA
24 | MAPE(%) | 11.831 | 11.299 | 21.360 | 16.609
24 | RMSE | 0.708 | 0.655 | 1.338 | 0.875
36 | MAPE(%) | 13.116 | 12.505 | 21.472 | 29.550
36 | RMSE | 0.708 | 0.655 | 1.183 | 1.301

TABLE 5. Measurements of forecasting accuracy by four models with various time windows at the Kinmen station.
Time window (hours) | Measurement | DBNGA | LSSVRGA | LSSVRTSGA | SARIMA
24 | MAPE(%) | 14.119 | 11.971 | 16.484 | 11.622
24 | RMSE | 0.734 | 0.630 | 0.629 | 0.589
36 | MAPE(%) | 12.528 | 12.181 | 16.609 | 10.601
36 | RMSE | 0.659 | 0.615 | 0.710 | 0.533
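The LSSVRGA and LSSVRTSGA columns in these tables rest on the LSSVR model of Section II.B, whose training reduces to a single linear solve of the Karush–Kuhn–Tucker system. The following is a minimal NumPy sketch under assumed names, with the RBF kernel of Eq. (7); C plays the role of the regularization constant Ʊ in Eq. (3).

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Eq. (7): K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvr_fit(X, Y, C, sigma):
    """Solve the LSSVR KKT system [[0, 1^T], [1, K + I/C]] [b; alpha] = [0; Y]."""
    N = len(Y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(N) / C
    rhs = np.concatenate([[0.0], Y])
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, Lagrange multipliers alpha

def lssvr_predict(Xq, X, alpha, b, sigma):
    # Eq. (5) with the kernel substitution of Eq. (6)
    return rbf_kernel(Xq, X, sigma) @ alpha + b

# toy check: learn y = sin(x) on [0, 6]
X = np.linspace(0, 6, 40)[:, None]
Y = np.sin(X).ravel()
b, alpha = lssvr_fit(X, Y, C=100.0, sigma=1.0)
pred = lssvr_predict(X, X, alpha, b, sigma=1.0)
```

In the paper's setting, (C, σ), i.e., (Ʊ, σ), are the two values chosen by the genetic algorithm, as reported in Tables 1-3.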
TABLE 6. Measurements of forecasting accuracy by four models with various time windows at the Matsu station.
Time window (hours) | Measurement | DBNGA | LSSVRGA | LSSVRTSGA | SARIMA
60 | MAPE(%) | 14.077 | 15.199 | 32.731 | 22.525
60 | RMSE | 0.652 | 0.661 | 1.163 | 0.846
72 | MAPE(%) | 13.799 | 16.232 | 30.632 | 22.517
72 | RMSE | 0.671 | 0.790 | 1.174 | 1.000
84 | MAPE(%) | 13.407 | 15.764 | 27.826 | 22.200
84 | RMSE | 0.678 | 0.801 | 1.116 | 1.020
96 | MAPE(%) | 17.010 | 18.585 | 48.938 | 39.818
96 | RMSE | 0.679 | 0.786 | 1.410 | 1.231
108 | MAPE(%) | 17.471 | 23.154 | 52.414 | 42.570
108 | RMSE | 0.741 | 0.964 | 1.485 | 1.296
120 | MAPE(%) | 16.681 | 21.728 | 49.262 | 41.365
120 | RMSE | 0.735 | 0.943 | 1.472 | 1.366
132 | MAPE(%) | 15.791 | 20.666 | 45.612 | 39.365
132 | RMSE | 0.712 | 0.917 | 1.412 | 1.338
144 | MAPE(%) | 15.780 | 20.460 | 43.966 | 37.636
144 | RMSE | 0.709 | 0.919 | 1.385 | 1.302
156 | MAPE(%) | 17.831 | 23.888 | 50.277 | 42.257
156 | RMSE | 0.733 | 0.980 | 1.484 | 1.351
168 | MAPE(%) | 21.015 | 30.071 | 62.415 | 52.073
168 | RMSE | 0.739 | 1.032 | 1.604 | 1.427

TABLE 7. Averages of forecasting measurements at three stations.
Stations | Measurement | DBNGA | LSSVRGA | LSSVRTSGA | SARIMA
Penghu | MAPE(%) | 12.000 | 13.953 | 39.426 | 59.060
Penghu | RMSE | 0.621 | 1.326 | 1.645 | 2.417
Kinmen | MAPE(%) | 17.850 | 23.060 | 37.310 | 61.414
Kinmen | RMSE | 0.712 | 0.887 | 1.254 | 1.771
Matsu | MAPE(%) | 15.597 | 18.092 | 36.741 | 29.275
Matsu | RMSE | 0.710 | 0.808 | 1.185 | 1.038

IV. CONCLUSIONS
As global fossil fuel reserves have shrunk, and global warming has become an ever more serious concern, energy policy has begun to receive increasingly more attention, and options for clean, renewable energy, such as wind power, have become an essential issue. Wind speed is always intermittent, making its prediction very challenging. Due to topographical features, and strong annual southwest and northeast monsoons, Taiwan has an abundance of wind, and wind energy utilization, as well as forecasting, has become indispensable. This study develops a deep belief network with genetic algorithms model for predicting wind speed in Taiwan. Both wind speed data and various weather factors are employed to forecast wind speed at three weather stations in Taiwan. Numerical results indicate that DBNGA models achieve more accurate forecasting results than the other models over mid-term periods, as well as in terms of the average forecasting accuracy over all time windows. Thus, the proposed DBNGA model is a feasible and promising method for forecasting wind speed. However, this study only used limited weather factors in the multivariate methods of predicting wind speed. Future work will include other factors, such as landforms, to increase the forecasting accuracy. Another possible direction for future study is to include the number of hidden layers and the number of hidden nodes in the fitness function of the genetic algorithms used to select deep belief network models. Finally, other metaheuristics could be employed to select deep belief network models in order to improve forecasting performance.

REFERENCES
1. H. Samet, F. Marzbani, “Quantizing the deterministic nonlinearity in wind speed time series,” Renew. Sust. Energ. Rev., vol. 39, pp. 1143-1154, 2014.
2. A. Tascikaraoglu, M. Uzunoglu, “A review of combined approaches for prediction of short-term wind speed and power,” Renew. Sust. Energ. Rev., vol. 34, pp. 243-254, 2014.
3. J. Jung, R. P. Broadwater, “Current status and future advances for wind speed and power forecasting,” Renew. Sust. Energ. Rev., vol. 31, pp. 762-777, 2014.
4. L. Xiao, J. Wang, Y. Dong, J. Wu, “Combined forecasting models for wind energy forecasting: A case study in China,” Renew. Sust. Energ. Rev., vol. 31, pp. 271-288, 2015.
5. J. Wang, Y. Song, F. Liu, R. Hou, “Analysis and application of forecasting models in wind power integration: A review of multi-step-ahead wind speed forecasting models,” Renew. Sust. Energ. Rev., vol. 60, pp. 960-981, 2016.
6. H. Z. Wang, G. B. Wang, G. Q. Li, J. C. Peng, Y. T. Liu, “Deep belief network based deterministic and probabilistic wind speed forecasting approach,” Appl. Energy, vol. 182, pp. 80-93, 2016.
7. Q. Hu, R. Zhang, Y. Zhou, “Transfer learning for short-term wind speed prediction with deep neural networks,” Renew. Energy, vol. 85, pp. 83-95, 2016.
8. H. Liu, X. W. Mi, Y. F. Li, “Wind speed forecasting method based on deep learning strategy using empirical wavelet transform, long short term memory neural network and Elman neural network,” Energ. Convers. Manage., vol. 156, pp. 498-514, 2018.
9. H. Liu, X. Mi, Y. Li, “Smart multi-step deep learning model for wind speed forecasting based on variational mode decomposition, singular spectrum analysis, LSTM network and ELM,” Energ. Convers. Manage., vol. 159, pp. 54-64, 2018.
10. J. Chen, G. Q. Zeng, W. Zhou, W. Du, K. D. Lu, “Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization,” Energ. Convers. Manage., vol. 165, pp. 681-695, 2018.
11. H. Liu, X. Mi, Y. Li, “Smart deep learning based wind speed prediction model using wavelet packet decomposition, convolutional neural network and convolutional long short term memory network,” Energ. Convers. Manage., vol. 166, pp. 120-131, 2018.
12. Y. Li, H. Wu, “Multi-step wind speed forecasting using EWT decomposition, LSTM principal computing, RELM subordinate computing and IEWT reconstruction,” Energ. Convers. Manage., vol. 167, pp. 203-219, 2018.
13. M. Berglund, T. Raiko, K. Cho, “Measuring the usefulness of hidden units in Boltzmann machines with mutual information,” Neural Networks, vol. 64, pp. 12-18, 2015.
14. F. Shen, J. Chao, J. Zhao, “Forecasting exchange rate using deep belief networks and conjugate gradient method,” Neurocomputing, vol. 167, pp. 243-253, 2015.
15. G. Wang, J. Qiao, X. Li, L. Wang, “Improved classification with semi-supervised deep belief network,” IFAC-PapersOnLine, vol. 50, pp. 4174-4179, 2017.
16. Z. Geng, Z. Li, Y. Han, “A new deep belief network based on RBM with glial chains,” Information Sciences, vol. 463, pp. 294-306, 2018.
17. T. Hayashida, I. Nishizaki, S. Sekizaki, M. Nishida, M. D. Prasetio, “Structural optimization of deep belief network by evolutionary computation methods including tabu search,” Transactions on Machine Learning and Artificial Intelligence, vol. 6, p. 69, 2018.
18. T. Kuremoto, S. Kimura, K. Kobayashi, M. Obayashi, “Time series forecasting using a deep belief network with restricted Boltzmann machines,” Neurocomputing, vol. 137, pp. 47-56, 2014.
19. B. Qolomany, M. Maabreh, A. Al-Fuqaha, A. Gupta, D. Benhaddou, “Parameters optimization of deep learning models using particle swarm optimization,” in 2017 13th International Wireless Communications and Mobile Computing Conference (IWCMC), IEEE, pp. 1285-1290, 2017.
20. Q. Sun, Y. Wang, Y. Jiang, L. Shao, D. Chen, “Fault diagnosis of SEPIC converters based on PSO-DBN and wavelet packet energy spectrum,” in Prognostics and System Health Management Conference (PHM-Harbin), IEEE, pp. 1-7, 2017.
21. D. Hossain, G. Capi, “Genetic algorithm based deep learning parameters tuning for robot object recognition and grasping,” Int. Sch. Sci. Res. Innov., vol. 11, pp. 629-633, 2017.
22. F. Ghasemi, A. Mehridehnavi, A. Fassihi, H. Pérez-Sánchez, “Deep neural network in QSAR studies using deep belief network,” Appl. Soft Comput., vol. 62, pp. 251-258, 2018.
23. H. Shao, H. Jiang, X. Li, T. Liang, “Rolling bearing fault detection using continuous deep belief network with locally linear embedding,” Comput. Ind., vol. 96, pp. 27-39, 2018.
24. Y. Zhang, G. Huang, “Traffic flow prediction model based on deep belief network and genetic algorithm,” IET Intell. Transp. Syst., vol. 12, no. 6, pp. 533-541, 2018.
25. J. Holland, “Adaptation in Natural and Artificial Systems: An Introductory Analysis with Application to Biology, Control, and Artificial Intelligence,” Ann Arbor, University of Michigan Press, 1975.
26. G. E. P. Box, G. M. Jenkins, “Time Series Analysis: Forecasting and Control,” revised edition, Holden-Day, San Francisco, CA, 1976.
27. G. E. P. Box, G. M. Jenkins, G. C. Reinsel, G. M. Ljung, “Time Series Analysis: Forecasting and Control,” 5th edition, John Wiley and Sons Inc., Hoboken, New Jersey, 2015.
28. A. K. Palit, D. Popovic, “Computational Intelligence in Time Series Forecasting,” Springer-Verlag, London, 2005.
29. R. H. Shumway, D. S. Stoffer, “Time Series Analysis and Its Applications: With R Examples,” Springer, New York, 2011.
30. S. Mukherjee, E. Osuna, F. Girosi, “Nonlinear prediction of chaotic time series using support vector machines,” in Proceedings of the 1997 IEEE Signal Processing Society Workshop, IEEE, pp. 511-520, 1997.
31. K. R. Müller, A. J. Smola, G. Rätsch, B. Schölkopf, J. Kohlmorgen, V. Vapnik, “Predicting time series with support vector machines,” in International Conference on Artificial Neural Networks, Springer, Berlin, Heidelberg, pp. 999-1004, 1997.
32. V. Vapnik, S. E. Golowich, A. J. Smola, “Support vector method for function approximation, regression estimation and signal processing,” in Advances in Neural Information Processing Systems, pp. 281-287, 1997.
33. J. A. K. Suykens, J. Vandewalle, “Least squares support vector machine classifiers,” Neural Processing Lett., vol. 9, pp. 293-300, 1999.
34. J. A. K. Suykens, J. De Brabanter, L. Lukas, J. Vandewalle, “Weighted least squares support vector machines: robustness and sparse approximation,” Neurocomputing, vol. 48, pp. 85-105, 2002.
35. R. Fletcher, “Practical Methods of Optimization,” John Wiley & Sons, New York, 2013.
36. W. Karush, “Minima of functions of several variables with inequalities as side constraints,” M.Sc. dissertation, Department of Mathematics, University of Chicago, 1939.
37. H. W. Kuhn, A. W. Tucker, “Nonlinear programming,” in Proceedings of the 2nd Berkeley Symposium on