
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TII.2021.3097716, IEEE Transactions on Industrial Informatics

Deep Spatial-Temporal 2-D CNN-BLSTM Model for Ultra-Short-Term LiDAR-Assisted Wind Turbine’s Power and Fatigue Load Forecasting
Amirhossein Dolatabadi, Graduate Student Member, IEEE, Hussein Abdeltawab, Member, IEEE, and Yasser Abdel-Rady I. Mohamed, Fellow, IEEE

Abstract: Optimizing wind turbine performance is still a challenge due to the dynamic interactions between the spatially-temporally stochastic wind fields and the wind turbine as a complex mechanical system. Recent cost reduction of remote sensing wind measurement technologies, such as light detection and ranging (LiDAR), has opened a new research area on the use of deep learning models for predicting wind turbine’s responses. In this paper, a LiDAR-aided deep learning model is presented to learn the powerful spatial-temporal characteristics from the input wind fields. In the proposed method, the combination of 2-D convolutional neural networks (CNNs) and bidirectional long short-term memory (BLSTM) units is used to capture high levels of abstraction in wind fields concurrently, and thus to forecast wind output power and fatigue load as two representatives of wind turbine responses. The LiDAR wind preview information is used as the 2-D images of wind fields for the CNN. Moreover, the BLSTM is incorporated with the proposed CNN to improve the forecasting accuracy further and learn deep temporal features. The aero-elastic 5-MW reference wind turbine of the National Renewable Energy Laboratory (NREL) is used to evaluate the performance of the proposed model compared to the state-of-the-art deep-learning-based architectures in the recent literature.

Index Terms: Deep learning, convolutional neural network, bidirectional long short-term memory, spatial-temporal features, fatigue load, light detection and ranging (LiDAR).

NOMENCLATURE

A. Indices:
$t$  Index of the time sample.
$i$  Index of laser beam vector focus point.
$l$  Index of hidden layer for CNN.
$n$  Index of output layer sample.

B. Parameters:
$a$  Distance of wind profile from the hub.
$\phi$  Vertical dimension of the measurement grid.
$\gamma$  Horizontal dimension of the measurement grid.
$\eta^{wt}$  Overall efficiency of energy conversion system.
$\tau_m$  Mechanical time constant of the wind turbine.
$\tau_e$  Electrical time constant of the wind turbine.
$\rho$  Air density.
$A$  Swept area of the rotor blades.
$R$  Radius of the rotor blades.
$C_p$  Power coefficient of the rotor.
$\lambda$  Tip speed ratio.
$\beta$  Pitch angle.
$C_t$  Thrust coefficient.
$N$  Total number of samples.

C. Variables:
$\upsilon_i^{los}$  Line-of-sight wind speed.
$u$  Wind field component vector in x direction.
$v$  Wind field component vector in y direction.
$w$  Wind field component vector in z direction.
$P_o^{wt}$  Wind turbine output electrical power.
$P_i^{wt}$  Wind turbine input mechanical power.
$k_l$  Trainable kernel weight vector of the $l$th layer.
$b_l$  Trainable kernel bias vector of the $l$th layer.
$\xi_l$  Trainable weight matrix of the $l$th layer.
$f$  Activation vectors of the forget gate.
$z$  Activation vectors of the input gate.
$o$  Activation vectors of the output gate.
$h_t^f$  Forward hidden states in BLSTM network.
$h_t^b$  Backward hidden states in BLSTM network.
$\hat{y}$  Desired output vector.
$y$  Actual output vector.

This work was supported in part by the Future Energy Systems Research through the Canada First Research Excellence Fund (CFREF) and in part by the Natural Sciences and Engineering Research Council of Canada (NSERC). (Corresponding author: Amirhossein Dolatabadi.)
Amirhossein Dolatabadi and Yasser Abdel-Rady I. Mohamed are with the Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB T6G 2V4, Canada (e-mail: adolatab@ualberta.ca; yasser2@ualberta.ca).
Hussein Abdeltawab is with the School of Engineering - Penn State Behrend, Erie, PA 16563, USA (e-mail: hza5222@psu.edu).

I. INTRODUCTION

With growing concerns about the greenhouse effect and climate change, the integration of renewable energy sources (RESs) into energy systems has gained much attention around the world. In particular, the government of Alberta, Canada has set a firm target for the Alberta electric system operator (AESO) to move towards having 30% of Alberta’s electricity coming from RESs by the year 2030, with an estimated additional RESs integration of 5,000 MW to the grid. It is anticipated that a significant portion of the 5,000 MW of renewable energy capacity will come from wind power [1]. While being non-polluting and widely available, wind power could also cause considerable challenges to the stability and security of the energy system due to its stochasticity,

1551-3203 (c) 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSITY OF ALBERTA. Downloaded on July 29,2021 at 18:40:20 UTC from IEEE Xplore. Restrictions apply.

uncertainty, and intermittency [2], [3]. Thus, in this changing environment, an accurate and stable prediction of wind turbine responses is a pressing issue for operational planning, scheduling and real-time balancing of large-scale wind power integration.

Fatigue loads induced by sudden variations of the wind can increase maintenance cost and reduce the operating life of a wind turbine. In this regard, various yaw misalignment and de-loading control methods have been presented to increase power generation efficiency and mitigate the mechanical loads on the wind turbine structure [4]. From the wind farm control engineering point of view, low maintenance cost and efficient energy harvesting in wind turbines can be achieved by more precise wind field measurement methodologies instead of traditional technologies, such as the anemometer and nacelle vane, with which only wind flows in the close vicinity of the sensor location can be measured [5].

Broadly speaking, according to the length of the forecasting horizon, wind turbine response prediction techniques can be mainly categorized into: 1) ultra-short-term and short-term, 2) medium-term, and 3) long-term ones [6]. Ultra-short-term and short-term forecasting methods refer to the prediction of wind turbine dynamic responses in the range of a few seconds to a few minutes and a few minutes to hours ahead, respectively. These tasks are mainly applied for real-time operations and electricity market clearing [7]. A medium-term prediction model seeks to forecast wind turbine output data at time intervals of several hours to one week. This type of prediction generally benefits unit commitment decisions and energy storage system (ESS) scheduling [8]. The long-term forecasting model focuses on the time scale of one week to a year or more to optimize expansion and maintenance planning [9].

In the technical literature, wind turbine response forecasting is generally made using a wind speed time series measured at the hub height of the turbine. This highly varying time series has a chaotic and stochastic characteristic; thus, forecasting can be considered as a complex regression task. With a pack of meteorological factors and boundary conditions, a physical forecasting model was built in [10]. Some forecasting models are statistical-based, which try to capture mathematical relationships in the historical data. For example, an autoregressive integrated moving average (ARIMA) approach was used to represent the upper and lower bounds of the wind power generation in [11]. However, the linear nature of statistical models restricts their ability to handle the nonlinear patterns and deal with challenging wind data prediction problems. On the other hand, wind turbine responses can be forecasted by pattern recognition and machine learning techniques such as support vector regression (SVR) and artificial neural networks (ANNs). Using wind speed data, a wind power prediction model based on a feed-forward neural network (FFNN) was proposed in [12]. Recurrent neural networks (RNNs), which forecast the future value by using the current inputs and the experience, have also been investigated to improve the forecast performance [13].

Conventional shallow ANN forecast models such as those mentioned above suffer from the problem that they cannot efficiently learn sophisticated features, and thus the prediction accuracy may be highly variable. With the booming advancement of machine learning techniques, recent research on wind turbine response prediction has focused more on deep models, which can effectively extract inherent abstract features of the highly varying time series data. Commonly seen networks include the combination of interval probability distribution learning (IPDL) and deep belief network (DBN), gated recurrent unit (GRU) and long short-term memory (LSTM). In particular, the LSTM architecture, which was initially proposed by Hochreiter and Schmidhuber [14], opened a new research area in the realm of deep networks. In very recent literature, a hybrid forecasting model based on stacked denoising autoencoders and an LSTM network was developed to predict wind speed data in [15]. As another example, a two-layer nonlinear combination method based on extreme learning machine (ELM), Elman neural network (ENN), and LSTM for short-term wind speed prediction was introduced in [16].

Nevertheless, the aforementioned approaches focus on the wind speed time series measured at the hub height of the turbine, which ignores the chaotic and stochastic characteristics of turbulent wind flow over the rotor area. Thus, there is a pressing need for considering more extensive feature measurements of inflowing wind towards the turbine, such as the speed, direction and turbulence. Recently, as an advanced remote sensing wind measurement technology, light detection and ranging (LiDAR) has been proposed and has attracted extensive attention as a way to improve on the inaccurate and unstable measurements of the nacelle vane [17]. Moreover, besides higher accuracy, other advantages of the LiDAR over the conventional mechanical wind vanes and anemometers used in wind turbines are measurement over diverse and longer distances and flexible installation. For example, in [18], the annual energy production increased by 1.83% through a LiDAR-based method for yaw error alignment. Results from [19] showed that LiDAR-aided wind speed measurement yielded fatigue or extreme load reduction on the wind turbine tower by 10%, thus improving the operating life of the wind turbine. Comprehensive analyses and discussions on LiDAR-based wind turbine performance can be found in [20]. Although different efforts have been made to improve LiDAR-assisted control of wind turbines, few studies have integrated such extensive wind field data with machine learning approaches to find wind turbine responses. In [21], an FFNN was employed to extrapolate the wind speed at higher heights from values at lower heights. However, the LiDAR measurements were only based on single hub-height values, and thus did not consider the dynamics of the turbulent wind field. Table I summarizes a taxonomy of recently proposed models in the LSTM and CNN combination area. To predict wind turbine’s responses with satisfactory accuracy, three main challenges have to be addressed. (1) Most data-driven methods have employed 1-D and 2-D networks in a serial manner. This arrangement creates some error in the forecasted results, as the extracted features from the first network have a significant influence on the training of the next network. (2) The proposed one-directional LSTM models, e.g., [31], introduce some error in spike points forecasting as they solely obey the recursive procedure which feeds back the


previous information in an iterative manner. (3) Most CNN-based approaches have just converted 1-D time series data to a matrix form and then applied it to the CNN instead of using 2-D wind field images, e.g., [32]–[35]. These time series-based matrices cannot model the complex interactions between the turbine and a sequence of wind fields.

To bridge this gap, we seek to address two important issues for wind power operators. Is it possible to apply deep learning to the data of a LiDAR-enhanced wind turbine? Can image-based deep learning approaches really increase the accuracy of forecasting compared with other techniques? To this end, this paper proposes a novel LiDAR-assisted deep neural network model to better forecast ultra-short-term wind turbine response. Complex wind fields measured by the LiDAR are employed as a sequence of images for the deep convolutional neural network (CNN), and time series of different wind speed components are handled by the bidirectional LSTM (BLSTM) network to learn deep spatial and temporal features of turbulent wind flow simultaneously. Considering the limitations of previous models and research works, we propose a novel data-driven LiDAR-assisted framework, namely 2-D CNN-BLSTM, for predicting wind turbine’s responses. Our main contributions are summarized as follows.

• To the best of our knowledge, this is the first effort to incorporate the 2-D CNN with recurrent BLSTM to concurrently capture the spatial and temporal features of a turbulent wind field.
• Wind profiles in front of the turbine’s blades are used to predict future wind turbine responses instead of using a single hub-height wind speed time series.
• The proposed 2-D CNN-BLSTM framework can better handle the uncertainties and learn deep temporal features from the sequential input data using recurrent BLSTM units, which utilize the previous and future hidden layer features, compared to single-network models.
• The proposed LiDAR-assisted deep model provides ultra-short-term future measurements according to upcoming wind flows before they reach the turbine blades, rather than considering those which have already interacted with the turbine.
• The accurate ultra-short-term forecasting of fatigue or extreme loads provided by the proposed model yields more efficient yaw misalignment and de-loading control strategies for a LiDAR-enhanced wind turbine and consequently better lifetime and energy harvesting efficiency.

The rest of this paper is structured as follows. Section II introduces the LiDAR and wind turbine modeling methodologies. In Section III, the overall body of the proposed framework, which consists of 2-D CNN and BLSTM networks, is presented. The numerical results are presented and discussed in Section IV. Finally, the conclusion is drawn in Section V.

II. PRINCIPLE OF LIDAR MEASUREMENTS

Generally, there are two types of LiDAR installations: 1) Nacelle-mounted LiDAR, which, similar to the nacelle vane, must be deployed on the nacelle rooftop and can measure freestream wind at a distance of 50 to 200 meters in front of the turbine blades, as shown in Fig. 1(a). 2) Ground-based LiDAR, which must be located on the ground and can vertically emit a laser beam to measure freestream wind, as shown in Fig. 1(b). A LiDAR sensor mounted on the nacelle can overcome the above-stated disadvantages, as it is able to provide a sufficiently early preview measurement of the undisturbed inflow over the entire rotor area at a far distance. Moreover, it is worth noting that a nacelle-mounted LiDAR can reliably determine the speed and direction of the wind regardless of the rotor turbulence, which is possible by manipulating the location of the beam in front of the rotor. The wind speed measurement along the direction of the i-th LiDAR laser beam can be modeled by equations (1) and (2) [36].

$$\upsilon_i^{los} = \int_{-\infty}^{+\infty} \left( l_i^x\, u(a) + l_i^y\, v(a) + l_i^z\, w(a) \right) f_L(a)\, da \tag{1}$$

$$f_L(a) = \frac{e^{-4 \ln 2\, (a/W)^2}}{\int_{-\infty}^{+\infty} e^{-4 \ln 2\, (a/W)^2}\, da} \tag{2}$$

where $f_L(a)$ is the weighting function at the distance $a$. The simplified version of equation (1) can be expressed by equation (3).

$$\upsilon_i^{los} = l_i^x u_i + l_i^y v_i + l_i^z w_i \tag{3}$$

TABLE I: Comparison of recently published CNN+LSTM studies with the proposed approach.

Reference   Forecasting area   CNN input          Bidirectional process   Combination   Ensemble learning
[22]        Air pollutant      2D time series     ×                       Serial        ×
[23]        Turbofan RUL       2D time series     ×                       Serial        ×
[24]        PMU data           2D time series     ×                       Serial        ×
[25]        Battery RUL        2D time series     ×                       Serial        ✓
[26]        Battery SOC        1D time series     ×                       Serial        ×
[27]        Load               2D time series     ×                       Serial        ×
[28]        Load               2D time series     ✓                       Serial        ×
[29]        PV                 1D time series     ×                       Serial        ×
[30]        PV                 2D time series     ×                       Parallel      ✓
[31]        Wind               2D time series     ×                       Serial        ×
Proposed    Wind               2D LiDAR images    ✓                       Serial        ✓
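
As an illustration of the line-of-sight model in equations (1)-(3), the following Python sketch approximates the range-weighted integral numerically and compares it with the point projection. It is a minimal sketch under stated assumptions, not the measurement model used in the paper: the integral is truncated to a finite span and evaluated with a trapezoidal rule, the beam direction is a unit vector obtained by normalizing a focus point by its length, and the function names, the focus point, and the value of W are illustrative only. Note that the weight in equation (2) drops to one half at a = W/2, so W acts as the full width at half maximum of the weighting function.

```python
import math


def unit_beam(focus_point):
    """Normalize a focus point (x, y, z) by its length to get the beam direction."""
    length = math.sqrt(sum(c * c for c in focus_point))
    return tuple(c / length for c in focus_point)


def range_weight(a, W):
    """Unnormalized weighting exp(-4 ln2 (a/W)^2) from Eq. (2)."""
    return math.exp(-4.0 * math.log(2.0) * (a / W) ** 2)


def point_line_of_sight_speed(beam, wind):
    """Eq. (3): projection of the wind vector (u, v, w) onto the beam direction."""
    return sum(l * c for l, c in zip(beam, wind))


def line_of_sight_speed(beam, wind_at, W, half_span=None, steps=2001):
    """Approximate Eq. (1) by a trapezoidal rule on [-half_span, half_span].

    wind_at(a) returns (u, v, w) at offset a along the beam (assumed interface).
    Dividing by the summed weights plays the role of the normalization in Eq. (2).
    """
    if half_span is None:
        half_span = 3.0 * W  # assumption: the weight is negligible beyond 3 W
    da = 2.0 * half_span / (steps - 1)
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        a = -half_span + k * da
        end_factor = 0.5 if k in (0, steps - 1) else 1.0
        weight = end_factor * range_weight(a, W)
        numerator += weight * point_line_of_sight_speed(beam, wind_at(a))
        denominator += weight
    return numerator / denominator


# Illustrative usage with a hypothetical focus point and a uniform wind field.
beam = unit_beam((120.0, 30.0, 40.0))
uniform_wind = lambda a: (10.0, 0.0, 0.0)
weighted = line_of_sight_speed(beam, uniform_wind, W=30.0)
pointwise = point_line_of_sight_speed(beam, (10.0, 0.0, 0.0))
```

For a wind field that is uniform along the beam, the weighted value and the point value coincide, which is the situation in which the simplification from equation (1) to equation (3) is exact.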


Fig. 1: Different types of LiDAR based on installation location. (a) Nacelle-mounted LiDAR. (b) Ground-based LiDAR.

where $\upsilon_i^{los}$ is the projection of the wind vector $[u_i\ v_i\ w_i]^T$ at the i-th focus point onto the direction of the normalized laser beam vector with length $f_i$:

$$\begin{bmatrix} l_i^x \\ l_i^y \\ l_i^z \end{bmatrix} = \frac{1}{f_i} \begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} \tag{4}$$

III. BACKGROUND THEORIES

To enhance the forecasting accuracy and reliability of the wind turbine response, it is essential to learn the spatial-temporal patterns, which improves the knowledge of a turbulent wind field. In this section, the structure of the designed wind turbine response forecasting method is described in full detail.

A. Wind turbine modeling

In practice, horizontal-axis wind turbines of the MW class typically adopt different active de-loading and nacelle direction control strategies by using the predicted wind speed and wind direction data. However, traditional wind field measurement methods such as the anemometer and vane have a common drawback: they can only cover a limited close vicinity of their location; therefore, these approaches would not perform adequately in the wind turbine response prediction task. On the other hand, accurate wind data measurement has a crucial role in the optimal harvesting of wind power and in increasing turbine fatigue lifespan by decreasing the induced parasitic load on the blades and main bearing. For example, reference [37] showed that for 7.5° to 15° of yaw error, the power loss in a wind turbine lay in the range from 2.4% to 13%.

The wind speed at height $p_y$ and distance $p_x$ from the hub center of the turbine is $\upsilon_{p_x,p_y}(a_i)$, which is measured by LiDAR at distance $a_i$ in front of the turbine. On the other hand, the wind field at distance $a_i$ will reach the turbine after time $t_i = a_i/\upsilon_{avg}$, where $\upsilon_{avg}$ denotes the average wind speed. By assuming uniform average wind speed for the proposed measurement grid:

$$\upsilon_{p_x,p_y}(a_i) = \upsilon_{p_x,p_y}(\upsilon_{avg} \cdot t_i) \simeq \upsilon_{p_x,p_y}(t + t_i) \tag{5}$$

$p_x \in [-\phi, \phi]$ and $p_y \in [-\gamma, \gamma]$ ($\Delta p_x = R_{p_x}$ and $\Delta p_y = R_{p_y}$ are the resolutions in the vertical and horizontal directions). The effective uniform speed that will have the same effect as the wind speed grid at time $t_i$ can be defined as follows:

$$\upsilon(t) = \theta\, \upsilon_{p_x,p_y}(t) \quad \forall\, p_x \in [-\phi, \phi] \text{ and } p_y \in [-\gamma, \gamma] \tag{6}$$

The wind turbine output power can be described by the following equation:

$$P_o^{wt}(s) = \frac{\eta^{wt} P_i^{wt}(s)}{(\tau_m s + 1)(\tau_e s + 1)} \simeq \frac{\eta^{wt} P_i^{wt}(s)}{\tau_m s + 1} = G(s)\, P_i^{wt}(s) \tag{7}$$

$P_i^{wt}$ is the input power of the turbine and can be calculated as follows.

$$P_i^{wt} = \frac{1}{2} \rho A_R\, \upsilon^3\, C_p(\lambda, \beta) \tag{8}$$

$\upsilon$ is the magnitude of the wind velocity component which is perpendicular to the rotor plane. From equation (8), it can be seen that the power extracted by the wind turbine increases with the cube of $\upsilon$. Apart from affecting the generated power, inaccurate estimations of wind speed and wind direction will also have a significant effect on the turbine’s structural loads. From the wind farm operator's point of view, premature fatigue failure can be problematic, as the drivetrain of a wind turbine incurs much more maintenance cost and downtime than the other subassemblies of the wind turbine.

Against this background, the most significant issue is how to calculate the wind turbine’s fatigue load. In this paper, we consider the tower bending moment as a measure of fatigue load [4]. The thrust force, which is the main cause of tower bending moment, is described by the following equation:

$$F_{thrust}^{wind} = \frac{1}{2} \rho A_R\, \upsilon^2\, C_t(\lambda, \beta) \tag{9}$$

The details of $C_t$ and $C_p$ are stated in [38] as 2-D lookup tables.

By using the convolution operation, the wind turbine output power is expressed as:

$$P_o^{wt}(t + t_i) = \int_0^{t + t_i} g(t)\, P_i^{wt}(t + t_i - \tau)\, d\tau = g(t) * P_i^{wt}(\tau + t_i) \tag{10}$$

Finally, it is worth noting that the above-mentioned procedure can be used for wind turbine fatigue load by replacing Eq. (8) with Eq. (9). The structure of the proposed deep spatial-temporal feature learning model, namely 2-D CNN-BLSTM, is visualized in Fig. 2. As presented in the figure, 2-D wind field images and 1-D wind time series data are fed to the 2-D CNN and BLSTM deep learning architecture to extract spatial and temporal features from each type of wind data, respectively. These branches operate independently of each other until they are concatenated.

B. Convolutional neural network (CNN)

Due to the superior ability of convolutional algorithms in solving complex tasks, CNNs have been successfully used in many areas. CNNs have powerful self-tuning and learning capability; thus, they can efficiently capture complex spatial features from the highly varying wind flows. The convolutional layers provide translation invariance ability by using a set


[Figure 2. Recoverable diagram labels: 2-D wind field images; convolution and pooling stages; flattening; 1-D hub-height wind speed time series; three BLSTM layers built from LSTM blocks; fully connected layers; concatenate layer; output layer (wind power or fatigue load).]
Fig. 2: Overall structure of the 2-D CNN-BLSTM framework.

Fig. 3: Schematic of a typical LSTM block.

Fig. 4: Illustration of an un-rolled LSTM network.

of learnable kernels and the inductive bias of local connectivity, decreasing the number of learning parameters and, consequently, increasing generalization capability. Such a convolution operator is equivalent to moving kernels over spatial positions, which for the $q$th feature map of the $l$th layer is defined as given in equation (11) [39].

$$x_l^q = f\left( \sum_{p \in N^q} x_{l-1}^p \otimes k_l^{pq} + b_l^q \right) \tag{11}$$

$f(\cdot)$ is a nonlinear activation function, and $\otimes$ denotes a convolution operation.

Then, a pooling layer is integrated to reduce the computational cost of the deep CNN network and consequently the possibility of over-fitting by down-sampling feature maps. The pooling operation can be defined mathematically by:

$$x_l^q = f_{down}\left( x_{l-1}^q \right) \tag{12}$$

where $f_{down}(\cdot)$ represents the down-sampling function.

Average-pooling and max-pooling operations are the two popular pooling strategies in CNNs. In this work, max-pooling is chosen to produce a feature map containing the most prominent features. Finally, the fully-connected layer is utilized to flatten the 2-D feature maps and make them suitable for putting through the activation function as follows:

$$x_l = g_{fc}\left( \xi_l\, x_{l-1} + b_l \right) \tag{13}$$

In a deep CNN with multiple filters with different kernel sizes, deep spatial features can be effectively extracted. It is worth noting that this benefit emerges due to the shared weights, local connectivity, and receptive field features of CNNs.

C. BLSTM Network

The RNN has gradually gained attention as a powerful sequence-based architecture to process dynamic characteristics of input data along with the development of data-driven techniques. In contrast to FFNNs, such models are capable of using the data of the previous hidden or output layers with their internal state (memory) to find the temporal correlations between the current information and past circumstances.

Fig. 5: General structure of BLSTM framework.
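
To make the convolution and pooling operations of equations (11) and (12) concrete, the following Python sketch builds one feature map from a small 2-D input and then down-samples a map by max-pooling. It is a minimal sketch under stated assumptions rather than the configuration used in this work: the kernel is applied without flipping (cross-correlation, as is common in deep learning libraries), stride 1 and no padding are used, ReLU stands in for the unspecified activation $f(\cdot)$, and the input values, kernel, and function names are illustrative only.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def feature_map(inputs, kernels, bias):
    """Eq. (11): sum per-input-map convolutions, add the bias, apply f.

    `inputs` and `kernels` are parallel lists (one kernel per input map);
    ReLU is an assumed choice for the activation f.
    """
    partial = [conv2d_valid(x, k) for x, k in zip(inputs, kernels)]
    rows, cols = len(partial[0]), len(partial[0][0])
    return [
        [max(0.0, sum(m[r][c] for m in partial) + bias) for c in range(cols)]
        for r in range(rows)
    ]


def max_pool(fmap, size=2):
    """Eq. (12) with f_down chosen as non-overlapping max-pooling."""
    rows, cols = len(fmap) // size, len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Illustrative usage: a hypothetical 4x4 "wind field image" and a 2x2 kernel.
image = [[1, 2, 3, 0],
         [4, 5, 6, 1],
         [7, 8, 9, 2],
         [0, 1, 2, 3]]
kernel = [[1, 0],
          [0, -1]]
activated = feature_map([image], [kernel], bias=1.0)
pooled = max_pool([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12],
                   [13, 14, 15, 16]])
```

The same kernel weights are reused at every spatial position, which is the weight sharing that keeps the number of trainable parameters small.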


Fig. 6: Different wind field components generated by TurbSim. (a) Streamwise (longitudinal) component u. (b) Transverse (crosswise) component v. (c) Vertical component w.

Although RNNs are considered a significant improvement compared to traditional feed-forward networks, they can suffer from the exponentially fast decay of back-propagated training errors in the steepest descent algorithm, called the gradient vanishing problem. This issue restricts the capability of the network to learn temporal correlations. To address this, the LSTM architecture was introduced as an elaborated version of the RNN with rich dynamics [14]. A general structure of the LSTM block is shown in Fig. 3. Each LSTM block has three main multiplicative units called input, output, and forget gates for continuous writing, reading, and resetting of data, respectively. Specifically, the new and previous information are memorized by the input gate, and inconsequential and irrelevant information is scrapped from the memory cell by the forget gate. The output gate exploits the beneficial information and controls the impact of memory content on the output. Different LSTM blocks can be stacked together to compose a deep LSTM network and propagate the features among these blocks during the network’s training process. In this situation, the stochastic trend of complex and challenging temporal phenomena is captured more effectively. Fig. 4 illustrates the sequential schematic of an un-rolled LSTM network.

Overall, the computational procedure of the forward pass in an LSTM architecture can be formulated by equations (14)-(19):

$$f_t = \sigma\left( W_{xf}\, x_t + W_{hf}\, h_{t-1} + b_f \right) \tag{14}$$

$$z_t = \sigma\left( W_{xz}\, x_t + W_{hz}\, h_{t-1} + b_z \right) \tag{15}$$

$$g_t = \tanh\left( W_{xg}\, x_t + W_{hg}\, h_{t-1} + b_g \right) \tag{16}$$

$$o_t = \sigma\left( W_{xo}\, x_t + W_{ho}\, h_{t-1} + b_o \right) \tag{17}$$

$$C_t = g_t \odot z_t + C_{t-1} \odot f_t \tag{18}$$

$$h_t = \tanh\left( C_t \right) \odot o_t \tag{19}$$

where $\sigma$ denotes the logistic sigmoid, $\tanh$ is the hyperbolic tangent activation function, and $\odot$ denotes the element-wise product. $W$ and $b$ are the weight matrices and the bias vectors, which are tuned by the optimization algorithm during the training procedure.

Another important issue is that by processing the data in temporal order only, the information of the future context can be neglected. In this study, to overcome this shortcoming of LSTM networks, the bidirectional structure is incorporated into the proposed LSTM network to handle the information of the whole temporal horizon. In this way, not only the previous features but also upcoming information is used. The overall structure of the BLSTM concept is illustrated in Fig. 5. Two different hidden layers are utilized to bring bidirectional memory: one as a forward hidden layer for passing information from past to future and the other as a backward hidden layer for passing information from future to past. More specifically, when deep learning-based architectures are built in this way, one can achieve much higher data representation capability than with typical LSTMs. This process prevents error accumulation and leads to a more accurate prediction of highly intermittent and stochastic phenomena. In BLSTM, the output is explained according to the following equations:

$$h_t^f = \tanh\left( x_t W_{xh}^f + h_{t-1}^f W_{hh}^f + b_h^f \right) \tag{20}$$

$$h_t^b = \tanh\left( x_t W_{xh}^b + h_{t-1}^b W_{hh}^b + b_h^b \right) \tag{21}$$

$$\hat{y}_t = h_t W_o + b_o \tag{22}$$

$h_t$ is composed by integrating $h_t^f$ and $h_t^b$.

IV. CASE STUDY AND NUMERICAL RESULTS

In this section, the simulation environment applied in this research and the details of the datasets are explained first, and a comprehensive prediction evaluation and comparison with well-known benchmark models are performed subsequently.

A. Wind turbine simulation environment

In this work, the 5-MW reference horizontal axis wind turbine of the National Renewable Energy Laboratory (NREL) is used as a full nonlinear aero-elastic model to perform simulations over a broad range of wind speeds [40]. Table II shows the detailed characteristics of the proposed NREL wind turbine.


TABLE II: Characteristics of NREL 5-MW Wind Turbine

Symbol     Parameter                 Value
P_rated    Rated power               5 MW
M_rated    Rated generator torque    43.1 kN.m
V_in       Cut-in wind speed         3 m/s
V_rated    Rated wind speed          11.4 m/s
V_out      Cut-out wind speed        25 m/s
h_hub      Hub height                90 m
R_rotor    Rotor radius              63 m

1) TurbSim wind field simulator: TurbSim is a full-field, stochastic, turbulent-wind simulator developed by the NREL. It uses statistical models to generate realistic time series of longitudinal, crosswise, and vertical components of the wind field [41]. Taylor's frozen wind hypothesis is utilized to simulate wind signals; that is, the wind field is modeled as a turbulence box marching toward the turbine. Thus, the wind fields do not evolve with time and are assumed to be frozen from the LiDAR focal point until the turbine. The height and width of the wind field grid are both chosen to be 145 m, large enough to encompass the entire rotor disk of the proposed wind turbine. Moreover, the hub is horizontally centered in the grid, so the top of the grid can be determined by the turbine hub height plus rotor radius.

2) FAST wind turbine simulator: The fatigue, aerodynamics, structures, and turbulence (FAST) code developed by the NREL is used to simulate the response of the proposed turbine by providing a full nonlinear and high-fidelity turbine response simulation [42]. The wind field vectors created by TurbSim are applied to the FAST turbine simulator to compute the turbine's output signal and state vector. In this study, electrical generator power (GenPwr) and low-speed shaft torque (LSShftTq) are considered as the representatives of the wind turbine responses. It should be noted that we enabled the degrees of freedom (DOFs) associated with first and second blade flap-wise modes (2 × 3 DOFs), first and second tower side-to-side modes (2 DOFs), first and second tower fore-aft modes (2 DOFs), first and second blade edgewise modes (2 × 3 DOFs), drive-train mode (1 DOF), and generator mode (1 DOF).

B. Data description

The proposed approach aims to perform the ultra-short-term forecasting of wind turbine responses by taking the 2-D images of wind fields and the hub-height wind speed time series as inputs for the CNN and BLSTM networks, respectively. This means that, besides the hub-height wind speed time series, upcoming wind fields are also utilized to increase the capability of the proposed model to capture complex wind data abstractions.

As mentioned earlier, the NREL TurbSim package is employed to generate turbulent wind fields and the NREL FAST turbine simulator is used to model the high-order aeroelastic nonlinear wind turbine. As depicted in Fig. 6, each TurbSim simulation generates three 3-D wind field component vectors $u_{p_h \times p_v \times p_t}$, $v_{p_h \times p_v \times p_t}$, and $w_{p_h \times p_v \times p_t}$ in the x, y, and z directions, respectively, where $p_h$ and $p_v$ denote the number of horizontal and vertical points in the spatial grid, respectively, and $p_t$ represents the temporal dimension of the vector. A broad range of wind fields is generated according to various mean longitudinal wind speeds from 6 to 24 m/s with 2 m/s resolution steps to ensure that the forecasting model is trained with different wind field characteristics. The complete data set contains 3000 wind fields and the corresponding electrical generator power and low-speed shaft torque. In this work, the training, validation, and testing sets account for 75%, 15%, and 10% of the dataset, respectively [43].

C. Evaluation Criteria for wind turbine response forecasting

In this study, the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE) are employed as three evaluation metrics to evaluate the forecasting performance of the different models as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(\hat{y}(n) - y(n)\right)^{2}} \qquad (23)$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N}\left|\hat{y}(n) - y(n)\right| \qquad (24)$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{n=1}^{N}\left|\frac{\hat{y}(n) - y(n)}{\hat{y}(n)}\right| \times 100\% \qquad (25)$$

where $\hat{y}(n)$, $y(n)$, and $N$ represent the desired output, the actual output, and the number of samples, respectively.

D. Results and comparisons

To verify the efficiency and validity of the proposed framework, several single and hybrid forecasting approaches that have been proposed in the literature are chosen as the benchmarks. The ARIMA, MLP, DBN, IPDL, GRU, LSTM, and BLSTM models use the 1D wind speed time series, 2D-CNN uses the 2D wind field images, and 2D-CNN-MLP, 2D-CNN-GRU, and 2D-CNN-LSTM use both of them, similar to the proposed 2D-CNN-BLSTM model. As these machine learning models are completely data-dependent, all hyperparameters, such as the

TABLE III: Cross-Validation results of proposed model with some typical BLSTM structures for 1-step ahead power forecasting.

BLSTM structure    RMSE (kW)    MAE (kW)    MAPE (%)    Online execution time (ms)
[100 100]          63.48        29.14       2.26        0.31
[250 250]          63.01        28.78       2.21        0.38
[100 100 50]       58.13        25.64       1.93        0.46
[250 250 50]       56.27        23.77       1.92        0.61
[250 250 100]      61.22        26.94       2.02        0.73
[250 250 250]      61.38        27.18       2.11        0.81
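As an illustration of the evaluation criteria (23)-(25), the following minimal Python sketch computes the three metrics for a pair of sequences. The function names (`rmse`, `mae`, `mape`), the argument names (`desired`, `actual`), and the sample values are illustrative assumptions and are not taken from the authors' implementation. Following the definition in (25), the MAPE denominator is the desired output, which is assumed to be nonzero.

```python
import math


def rmse(desired, actual):
    """Root mean square error, as in (23)."""
    n = len(desired)
    return math.sqrt(sum((d - a) ** 2 for d, a in zip(desired, actual)) / n)


def mae(desired, actual):
    """Mean absolute error, as in (24)."""
    n = len(desired)
    return sum(abs(d - a) for d, a in zip(desired, actual)) / n


def mape(desired, actual):
    """Mean absolute percentage error in percent, as in (25).

    The denominator is the desired output; it is assumed to be nonzero.
    """
    n = len(desired)
    return 100.0 * sum(abs((d - a) / d) for d, a in zip(desired, actual)) / n


# Hypothetical sample values (kW), chosen only to exercise the functions.
desired = [100.0, 200.0, 400.0]
actual = [110.0, 190.0, 400.0]

print("RMSE:", rmse(desired, actual))
print("MAE:", mae(desired, actual))
print("MAPE (%):", mape(desired, actual))
```

Because RMSE squares each error before averaging, it weights large deviations (such as missed spikes) more heavily than MAE, which is one reason the three metrics are reported together.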


TABLE IV: Forecasting performance comparison for 1-step and 2-step ahead tasks.

Method          Power, 1-step (10 sec)     Power, 2-step (20 sec)     Load, 1-step (10 sec)      Load, 2-step (20 sec)
                RMSE     MAE      MAPE     RMSE     MAE      MAPE     RMSE     MAE      MAPE     RMSE     MAE      MAPE
                (kW)     (kW)     (%)      (kW)     (kW)     (%)      (kN.m)   (kN.m)   (%)      (kN.m)   (kN.m)   (%)
ARIMA [44]      351.80   219.17   21.76    463.82   424.34   32.43    591.51   326.42   32.27    746.58   717.64   43.85
DBN [45]        338.63   214.31   20.59    458.91   408.70   32.02    589.72   318.43   29.85    718.89   668.54   40.66
MLP [46]        325.68   196.64   19.32    424.78   391.36   27.56    560.59   295.98   27.49    650.23   577.73   34.54
GRU [47]        277.63   163.73   15.81    369.46   342.68   24.07    535.97   255.62   20.93    548.01   437.58   27.11
LSTM [15]       281.02   165.66   16.04    365.71   338.51   22.83    534.71   255.57   20.07    542.23   435.43   25.64
IPDL [48]       267.16   159.93   15.38    342.07   312.22   21.39    466.69   247.91   19.72    537.64   421.80   23.96
BLSTM [49]      263.01   154.38   15.29    330.08   298.21   21.01    454.46   239.34   17.96    518.56   398.44   23.80
2D-CNN          234.02   107.08   8.64     241.29   140.37   9.25     252.95   97.9     8.11     282.95   193.81   10.86
2D-CNN-MLP      176.23   77.31    6.61     190.57   139.49   8.49     197.17   87.94    7.66     234.86   185.55   9.94
2D-CNN-GRU      150.89   52.97    3.66     160.05   68.49    4.73     110.38   59.18    5.31     153.34   123.02   6.90
2D-CNN-LSTM     148.33   51.38    3.61     155.61   64.82    4.46     110.22   59.19    5.29     150.59   111.56   5.73
2D-CNN-BLSTM    58.34    24.73    1.98     67.02    41.05    2.74     102.61   51.22    4.55     145.11   105.58   5.41
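The percentage improvements quoted in the discussion of Table IV can be reproduced directly from the table entries. The short Python sketch below does this for a few 1-step ahead power results; the helper name `relative_improvement` and the dictionary layout are illustrative assumptions, while the numerical values are copied from Table IV.

```python
def relative_improvement(baseline: float, proposed: float) -> float:
    """Percentage reduction of an error metric relative to a baseline model."""
    return 100.0 * (baseline - proposed) / baseline


# 1-step ahead power forecasting errors copied from Table IV (kW).
power_rmse = {
    "BLSTM": 263.01,
    "2D-CNN": 234.02,
    "2D-CNN-MLP": 176.23,
    "2D-CNN-LSTM": 148.33,
    "2D-CNN-BLSTM": 58.34,
}
power_mae = {"2D-CNN-LSTM": 51.38, "2D-CNN-BLSTM": 24.73}

# RMSE gain of adding an MLP branch to the single 2D-CNN (about 24.69%).
cnn_to_mlp = relative_improvement(power_rmse["2D-CNN"], power_rmse["2D-CNN-MLP"])

# MAE gain of the bidirectional model over 2D-CNN-LSTM (about 51.9%).
lstm_to_blstm_mae = relative_improvement(
    power_mae["2D-CNN-LSTM"], power_mae["2D-CNN-BLSTM"]
)

# RMSE gains of the proposed model over single BLSTM and single 2D-CNN
# (roughly 78% and 75%).
blstm_to_proposed = relative_improvement(
    power_rmse["BLSTM"], power_rmse["2D-CNN-BLSTM"]
)
cnn_to_proposed = relative_improvement(
    power_rmse["2D-CNN"], power_rmse["2D-CNN-BLSTM"]
)

for name, value in [
    ("2D-CNN -> 2D-CNN-MLP (RMSE)", cnn_to_mlp),
    ("2D-CNN-LSTM -> 2D-CNN-BLSTM (MAE)", lstm_to_blstm_mae),
    ("BLSTM -> 2D-CNN-BLSTM (RMSE)", blstm_to_proposed),
    ("2D-CNN -> 2D-CNN-BLSTM (RMSE)", cnn_to_proposed),
]:
    print(f"{name}: {value:.2f}%")
```

The same helper can be applied to the load columns or the 2-step ahead columns by substituting the corresponding table entries.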

Fig. 7: Wind power forecasting comparison of four 2D-CNN-based models for the first 60 steps (600 sec) of test data. (a) 2D-CNN. (b) 2D-CNN-MLP. (c) 2D-CNN-LSTM. (d) 2D-CNN-BLSTM. [Each panel plots generated wind power (kW) against time steps (10 sec) for the Targets and Forecasted series.]

number of neurons in each layer and the number of layers in each model, are optimally tuned using the training data according to common practice recommended by the deep learning community [50]. The optimal number of hidden layers for the MLP model is determined to be 2, with 50 hidden nodes in each layer. The DBN is established by using three hidden layers with 100, 125, and 125 units, respectively. The GRU, LSTM, and BLSTM networks are each composed of three main hidden layers with 250, 250, and 50 units, respectively. The IPDL model consists of three stacked layers with a linear regression model at the top. The CNN structure comprises six layers (three convolution layers and three pooling layers), followed by a flattening layer. To verify the validation performance, the 1-step ahead power forecasting case study is repeated for different network structures. Table III shows the cross-validation results of some typical BLSTM structures with different numbers of layers and blocks. As shown in this table, both increasing and decreasing the network size relative to the [250 250 50] structure reduces the performance according to the forecasting indices. The online computation time of the proposed method for different network parameters is also reported in Table III. Considering that the sampling time scale of the forecasting is 10 s or longer, a computational time of around 0.6 ms is fast enough to support reliable and safe real-time wind turbine operation. Moreover, all models are implemented in Python with the Keras library and TensorFlow as the backend [51]. The workstation used is configured with an Intel Core i7-8700 3.2 GHz CPU, an NVIDIA GeForce GTX 1070 GPU, and 32 GB of RAM.

Table IV compares the forecasting results and provides


the average test RMSE, MAE, and MAPE for ultra-short-term generated power and load predictions with 1-step (10 sec) and 2-step (20 sec) ahead tasks, respectively. This table clearly shows that the proposed method (2D-CNN-BLSTM) outperforms the other benchmarking models and has the best performance in both 1-step and 2-step ahead forecasting tasks. The results show that the machine learning models perform better than ARIMA, a linear statistical model. Among the machine learning models, DBN has the largest error range and randomness, which is due to the pretraining and supervised training process. The IPDL model yields more accurate predictions when compared to the classic DBN. According to the results, the GRU and LSTM approaches both outperform the shallow MLP architecture, since these deep recurrent networks can better model the highly nonlinear temporal features of wind time series. As a variant of LSTM, GRU yields forecasting metrics close to those of LSTM, but in the 2-step ahead scenarios LSTM is better in all criteria. BLSTM is the best single time series-based architecture compared to the LSTM and MLP models. Moreover, using wind fields has a significant effect on the prediction accuracy, which can be understood by comparing three time series-based models, MLP, LSTM, and BLSTM, with the 2D-CNN-based ones. For example, the RMSEs of 2D-CNN-MLP for 1-step and 2-step ahead wind power predictions are 176.23 kW and 190.57 kW, respectively, which increase to 325.68 kW and 424.78 kW for the single MLP model. The power MAPE of 2D-CNN-BLSTM is 1.98% for 1-step predictions and reaches 2.74% for 2-step predictions, whereas the best time series-based approach, single BLSTM, has MAPEs of 15.29% and 21.01% for 1-step and 2-step predictions, respectively. Therefore, applying single time series-based models for longer-term predictions cannot yield reliable performance. In the 1-step ahead task, 2D-CNN-MLP has 24.69% power RMSE and 22.05% load RMSE improvements over single 2D-CNN. These improvements are further increased to 36.61% and 56.42% for the power RMSE and load RMSE results, respectively, when the MLP is replaced by the deep LSTM network. The more precise forecast shows the better generalization capability of the deep recurrent models.

The proposed 2D-CNN-BLSTM obtains better results compared to 2D-CNN-LSTM. 2D-CNN-BLSTM outperforms 2D-CNN-LSTM with 51.86% and 13.46% MAE improvements in power and load 1-step ahead forecasts, respectively. Furthermore, this model decreases MAE by 36.67% and 5.36% for power and load 2-step ahead prediction tasks, respectively. The higher accuracy of 2D-CNN-BLSTM indicates the superiority of the bidirectional learning method in effectively capturing the previous and future hidden features of the proposed wind data.

To provide better visualization, Fig. 7 demonstrates the wind power 1-step ahead prediction results of the four 2D-CNN-based models. As shown in this figure, 2D-CNN-MLP outperforms 2D-CNN, and 2D-CNN-LSTM performs relatively better at spike points than both of them. Our deep hybrid model significantly outperforms all of them and can follow the sharp spikes accurately.

Fig. 8 depicts the performance comparison of all single and hybrid models for extended time horizons from 1-step ahead to 6-step ahead prediction tasks. For the extended prediction time steps, hybrid models with more complex structures are needed to achieve high accuracy by using both 1-D and 2-D data. As shown in Fig. 8, single architectures are dominated by hybrid deep learning models in all time horizons. BLSTM and IPDL have relatively good performance compared to CNN-based models; however, for larger forecasting time steps, the plot shows that hybrid frameworks outperform single ones.

Fig. 8: Comparison of power forecasting RMSE results for 1-step up to 6-step ahead tasks (step time = 10 sec). [The plot shows RMSE (kW) against time steps 1 to 6 for the proposed model, CNN-LSTM, CNN-GRU, CNN-MLP, CNN, BLSTM, IPDL, LSTM, GRU, MLP, DBN, and ARIMA.]

Fig. 9 illustrates the regression responses of the proposed method for wind output power and fatigue load as two representatives of wind turbine responses. As can be seen in this figure, the predicted values mostly stay below the blue line and show slightly higher errors for wind powers above 4500 kW and for fatigue loads higher than 4000 kN.m. The fatigue load case shows somewhat higher errors than the wind power case, which seems reasonable in that the fatigue load is highly influenced by the dynamic interactions between the turbine structure and the wind flow. Overall, the proposed method shows acceptable performance in both tasks and can help to obtain more accurate results.

To investigate the effect of wind field snapshots on the performance of our proposed model in another way, another extension of 2D-CNN-BLSTM is designed as our baseline. This baseline methodology only uses the longitudinal component of the wind field, u, instead of the entire set of wind field components. Fig. 10 compares the 1-step ahead load prediction results of the proposed and baseline models. The proposed method, by considering all wind field components, shows slightly better performance and generalization capability compared to the baseline model, which considers only the streamwise component, especially when the load time series has an abrupt change. It should be noted that the main limitation of this work lies in the fact that these improvements are conditional on an effective preview of wind data provided by the LiDAR sensor, which is still a high-cost solution.

If the LiDAR system fails for an unknown reason and, consequently, 2D wind field images are not available, the proposed framework can still be used for forecasting, as it is a kind of ensemble learning that combines the CNN and BLSTM networks in a parallel manner. Fig. 11 compares the wind power 1-step ahead prediction results of the three models


Fig. 9: Regression plots of the proposed method (2D-CNN-BLSTM): (a) Wind power (kW); (b) fatigue load (kN.m). [Each panel plots predicted values against target values.]

Fig. 10: Fatigue load forecasting results comparison of proposed 2D-CNN-BLSTM and baseline model for the first 60 steps (600 sec) of test data. (a) baseline model. (b) proposed 2D-CNN-BLSTM model. [Each panel plots fatigue load (kN.m) against time steps (10 sec) for the Targets and Forecasted series.]

Fig. 11: Wind power forecasting comparison of three Image-based models for the first 60 steps (600 sec) of test data in case of LiDAR failure. (a) 2D-CNN-MLP. (b) 2D-CNN-LSTM. (c) 2D-CNN-BLSTM.

which previously used both 1D wind speed time series and 2D wind fields as inputs. As shown in this figure, all of the methods have relatively higher prediction errors than in the normal situation (Fig. 7). The figure shows that the predicted values of 2D-CNN-BLSTM generally conform well with the target values, with low average error levels.

Fig. 12 compares the ultra-short-term generated power and load 1-step forecasting performance of the proposed model with spatiotemporal prediction methods which have been recently employed in the literature, including GDNN [52], CNN-Taguchi [53], HNN-QR [31], and STNN-VB [54]. It is worth noting that the procedure of designing an optimal structure for the other frameworks is similar to that of the proposed 2D-CNN-BLSTM model. In the generated power prediction task, CNN-Taguchi and STNN-VB produce MAPEs of 8.52% and 7.14%, respectively. For the GDNN and HNN-QR models, the MAPEs of the power forecast are 4.42% and 4.07%, respectively. Compared with HNN-QR, the 2D-CNN-BLSTM method improves the wind power MAPE by 51.35%. A similar trend can be seen for the load forecasting. 2D-CNN-BLSTM yields the most accurate ultra-short-term power and load predictions and achieves the best forecasting metrics. This finding verifies the superiority of the proposed 2D-CNN-BLSTM model.

V. CONCLUSION

This paper develops a novel LiDAR-assisted deep 2-D CNN-BLSTM model for the ultra-short-term prediction of future wind turbine responses using upcoming sequences of full wind field components and hub-height wind speed time


series before reaching the turbine blades as inputs. As a data-driven framework, the performance of the proposed model is determined solely by the potential interactions hidden in the wind field and time series data rather than by physical equations or predetermined distribution types. Thus, it can avoid the dual risks of model incorrectness and distribution type misspecification. The NREL 5-MW reference horizontal axis wind turbine and FAST are utilized for simulations. Realistic 3-D wind field component vectors are generated by NREL TurbSim. The proposed 2-D CNN-BLSTM model is designed for ultra-short-term forecasting of wind turbine responses and achieves the smallest error metrics among the compared models. For example, it has demonstrated 78% and 75% improvements in 1-step ahead power RMSE when compared with the single BLSTM and 2-D CNN models, respectively.

The proposed model employs 2D-CNN and BLSTM networks to better handle complex spatial-temporal features from the highly variable wind data compared to conventional forecasting methods, which simply use historical time series data. The main advantage of the proposed model over other deep learning-based forecasting methods is that it uses wind preview information provided by LiDAR as an advanced remote sensing wind measurement technology. Thus, it can be helpful for wind farm operators as an efficient tool in yaw misalignment and de-loading control strategies. While this study is geared towards exploring the effects of the proposed forecasting methodology on onshore wind turbine control strategies, future work could study the effectiveness of adding LiDAR to offshore wind turbines and considering the sea current and wave data as external inputs.

Fig. 12: Comparison of power and fatigue load forecasting results with different spatiotemporal models. [The bar chart shows the power and load MAPE for each forecasting method: 2D-CNN-BLSTM, HNN-QR, GDNN, STNN-VB, and CNN-Taguchi.]

REFERENCES

[1] “Renewable Energy in Alberta.” [Online]. Available: https://www.alberta.ca/renewable-energy-in-alberta.aspx/.
[2] A. Dolatabadi, B. Mohammadi-Ivatloo, M. Abapour, and S. Tohidi, “Optimal stochastic design of wind integrated energy hub,” IEEE Trans. Ind. Inf., vol. 13, no. 5, pp. 2379–2388, 2017.
[3] A. Dolatabadi, M. Jadidbonab, and B. Mohammadi-ivatloo, “Short-term scheduling strategy for wind-based energy hub: A hybrid stochastic/IGDT approach,” IEEE Trans. Sustainable Energy, vol. 10, no. 1, pp. 438–448, 2019.
[4] Q. Yao, J. Liu, and Y. Hu, “Optimized active power dispatching strategy considering fatigue load of wind turbines during de-loading operation,” IEEE Access, vol. 7, pp. 17 439–17 449, 2019.
[5] A. Mahdizadeh, R. Schmid, and D. Oetomo, “LIDAR-Assisted Exact Output Regulation for Load Mitigation in Wind Turbines,” IEEE Trans. Control Syst. Technol., 2020.
[6] M. Khodayar, O. Kaynak, and M. E. Khodayar, “Rough deep neural architecture for short-term wind speed forecasting,” IEEE Trans. Ind. Inf., vol. 13, no. 6, pp. 2770–2779, 2017.
[7] M. B. Ozkan and P. Karagoz, “A novel wind power forecast model: Statistical hybrid wind power forecast technique (SHWIP),” IEEE Trans. Ind. Inf., vol. 11, no. 2, pp. 375–387, 2015.
[8] S. Buhan and I. Çadırcı, “Multistage wind-electric power forecast by using a combination of advanced statistical methods,” IEEE Trans. Ind. Inf., vol. 11, no. 5, pp. 1231–1242, 2015.
[9] H. B. Azad, S. Mekhilef, and V. G. Ganapathy, “Long-term wind speed forecasting and general pattern recognition using neural networks,” IEEE Trans. Sustainable Energy, vol. 5, no. 2, pp. 546–553, 2014.
[10] D. Allen, A. Tomlin, C. Bale, A. Skea, S. Vosper, and M. Gallani, “A boundary layer scaling technique for estimating near-surface wind energy using numerical weather prediction and wind map data,” Appl. Energy, vol. 208, pp. 1246–1257, 2017.
[11] P. Chen, T. Pedersen, B. Bak-Jensen, and Z. Chen, “ARIMA-based time series model of stochastic wind power generation,” IEEE Trans. Power Syst., vol. 25, no. 2, pp. 667–676, 2009.
[12] K. Bhaskar and S. Singh, “AWNN-assisted wind power forecasting using feed-forward neural network,” IEEE Trans. Sustainable Energy, vol. 3, no. 2, pp. 306–315, 2012.
[13] Z. Shi, H. Liang, and V. Dinavahi, “Direct interval forecast of uncertain wind power based on recurrent neural networks,” IEEE Trans. Sustainable Energy, vol. 9, no. 3, pp. 1177–1187, 2017.
[14] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Comput., vol. 9, no. 8, pp. 1735–1780, 1997.
[15] Y.-X. Wu, Q.-B. Wu, and J.-Q. Zhu, “Data-driven wind speed forecasting using deep feature extraction and LSTM,” IET Renewable Power Gener., vol. 13, no. 12, pp. 2062–2069, 2019.
[16] M.-R. Chen, G.-Q. Zeng, K.-D. Lu, and J. Weng, “A two-layer nonlinear combination method for short-term wind speed prediction based on ELM, ENN, and LSTM,” IEEE Internet Things J., vol. 6, no. 4, pp. 6997–7010, 2019.
[17] M. Harris, M. Hand, and A. Wright, “Lidar for turbine control,” National Renewable Energy Laboratory, Golden, CO, Report No. NREL/TP-500-39154, 2006.
[18] L. Zhang and Q. Yang, “A Method for Yaw Error Alignment of Wind Turbine Based on LiDAR,” IEEE Access, vol. 8, pp. 25 052–25 059, 2020.
[19] D. Schlipf, P. Fleming, F. Haizmann, A. Scholbrock, M. Hofsäß, A. Wright, and P. W. Cheng, “Field testing of feedforward collective pitch control on the CART2 using a nacelle-based lidar scanner,” in J. Phys. Conf. Ser., vol. 555, no. 1. IOP Publishing, 2014, p. 012090.
[20] A. Clifton, P. Clive, J. Gottschall, D. Schlipf, E. Simley, L. Simmons, D. Stein, D. Trabucchi, N. Vasiljevic, and I. Würth, “IEA Wind Task 32: Wind lidar identifying and mitigating barriers to the adoption of wind lidar,” Remote Sensing, vol. 10, no. 3, p. 406, 2018.
[21] M. A. Mohandes and S. Rehman, “Wind speed extrapolation using machine learning methods and LiDAR measurements,” IEEE Access, vol. 6, pp. 77 634–77 642, 2018.
[22] D. Qin, J. Yu, G. Zou, R. Yong, Q. Zhao, and B. Zhang, “A novel combined prediction scheme based on CNN and LSTM for urban PM 2.5 concentration,” IEEE Access, vol. 7, pp. 20 050–20 059, 2019.
[23] J. Li, X. Li, and D. He, “A directed acyclic graph network combined with CNN and LSTM for remaining useful life prediction,” IEEE Access, vol. 7, pp. 75 464–75 475, 2019.
[24] Q. Wang, S. Bu, Z. He, and Z. Y. Dong, “Toward the prediction level of situation awareness for electric power systems using CNN-LSTM network,” IEEE Trans. Ind. Inf., 2020.
[25] L. Ren, J. Dong, X. Wang, Z. Meng, L. Zhao, and J. Deen, “A data-driven auto-CNN-LSTM prediction model for lithium-ion battery remaining useful life,” IEEE Trans. Ind. Inf., 2020.
[26] X. Song, F. Yang, D. Wang, and K.-L. Tsui, “Combined CNN-LSTM network for state-of-charge estimation of lithium-ion batteries,” IEEE Access, vol. 7, pp. 88 894–88 902, 2019.
[27] T.-Y. Kim and S.-B. Cho, “Predicting residential energy consumption using CNN-LSTM neural networks,” Energy, vol. 182, pp. 72–81, 2019.
[28] F. U. M. Ullah, A. Ullah, I. U. Haq, S. Rho, and S. W. Baik, “Short-term prediction of residential power energy consumption via CNN and multi-


layer bi-directional LSTM networks,” IEEE Access, vol. 8, pp. 123 369–123 380, 2019.
[29] K. Wang, X. Qi, and H. Liu, “Photovoltaic power forecasting based LSTM-convolutional network,” Energy, vol. 189, p. 116225, 2019.
[30] H. Zang, L. Liu, L. Sun, L. Cheng, Z. Wei, and G. Sun, “Short-term global horizontal irradiance forecasting based on a hybrid CNN-LSTM model with spatiotemporal correlations,” Renewable Energy, vol. 160, pp. 26–41, 2020.
[31] Y. Yu, X. Han, M. Yang, and J. Yang, “Probabilistic prediction of regional wind power based on spatiotemporal quantile regression,” IEEE Trans. Ind. Appl., vol. 56, no. 6, pp. 6117–6127, 2020.
[32] M. Afrasiabi, M. Mohammadi, M. Rastegar, and S. Afrasiabi, “Advanced deep learning approach for probabilistic wind speed forecasting,” IEEE Trans. Ind. Inf., vol. 17, no. 1, pp. 720–727, 2020.
[33] Z. Wang, J. Zhang, Y. Zhang, C. Huang, and L. Wang, “Short-term wind speed forecasting based on information of neighboring wind farms,” IEEE Access, vol. 8, pp. 16 760–16 770, 2020.
[34] Y. Ju, G. Sun, Q. Chen, M. Zhang, H. Zhu, and M. U. Rehman, “A model combining convolutional neural network and LightGBM algorithm for ultra-short-term wind power forecasting,” IEEE Access, vol. 7, pp. 28 309–28 318, 2019.
[35] Q. Zhu, J. Chen, D. Shi, L. Zhu, X. Bai, X. Duan, and Y. Liu, “Learning temporal and spatial correlations jointly: A unified framework for wind speed prediction,” IEEE Trans. Sustainable Energy, vol. 11, no. 1, pp. 509–523, 2019.
[36] A. Peña, C. B. Hasager, J. Lange, J. Anger, M. Badger, F. Bingöl, O. Bischoff, J.-P. Cariou, F. Dunne, S. Emeis et al., Remote sensing for wind energy. DTU Wind Energy, 2013.
[37] P. A. Fleming, A. Scholbrock, A. Jehu, S. Davoust, E. Osler, A. D. Wright, and A. Clifton, “Field-test results using a nacelle-mounted lidar for improving wind turbine power capture by reducing yaw misalignment,” in J. Phys. Conf. Ser., vol. 524, no. 1. IOP Publishing, 2014, p. 012002.
[38] A. D. Hansen, F. Iov, P. E. Sørensen, N. A. Cutululis, C. Jauch, and F. Blaabjerg, “Dynamic wind turbine models in power system simulation tool DIgSILENT,” 2007.
[39] H. Samet, S. Ketabipoor, M. Afrasiabi, S. Afrasiabi, and M. Mohammadi, “Deep Learning Forecaster based Controller for SVC: Wind Farm Flicker Mitigation,” IEEE Trans. Ind. Inf., 2020.
[40] J. Jonkman, S. Butterfield, W. Musial, and G. Scott, “Definition of a 5-MW reference wind turbine for offshore system development,” National Renewable Energy Lab. (NREL), Golden, CO (United States), Tech. Rep., 2009.
[41] B. J. Jonkman, “TurbSim user’s guide: Version 1.50,” National Renewable Energy Lab. (NREL), Golden, CO (United States), Tech. Rep., 2009.
[42] J. M. Jonkman and M. L. Buhl Jr, “FAST user’s guide-updated August 2005,” National Renewable Energy Lab. (NREL), Golden, CO (United States), Tech. Rep., 2005.
[43] H. Jahangir, H. Tayarani, S. Sadeghi Gougheri, M. Aliakbar Golkar, A. Ahmadian, and A. Elkamel, “Deep Learning-based Forecasting Approach in Smart Grids with Micro-Clustering and Bi-directional LSTM Network,” IEEE Trans. Ind. Electron., pp. 1–1, 2020.
[44] P. Gangwar, A. Mallick, S. Chakrabarti, and S. N. Singh, “Short-term forecasting-based network reconfiguration for unbalanced distribution systems with distributed generators,” IEEE Trans. Ind. Inf., vol. 16, no. 7, pp. 4378–4389, 2019.
[45] K.-P. Lin, P.-F. Pai, and Y.-J. Ting, “Deep belief networks with genetic algorithms in forecasting wind speed,” IEEE Access, vol. 7, pp. 99 244–99 253, 2019.
[46] L. Wang, Z. Zhang, H. Long, J. Xu, and R. Liu, “Wind Turbine Gearbox Failure Identification With Deep Neural Networks,” IEEE Trans. Ind. Inf., vol. 13, no. 3, pp. 1360–1368, 2017.
[47] C. Li, G. Tang, X. Xue, A. Saeed, and X. Hu, “Short-term wind speed interval prediction based on ensemble GRU model,” IEEE Trans. Sustainable Energy, vol. 11, no. 3, pp. 1370–1380, 2020.
[48] M. Khodayar, J. Wang, and M. Manthouri, “Interval deep generative neural network for wind speed forecasting,” IEEE Trans. Smart Grid, vol. 10, no. 4, pp. 3974–3989, 2019.
[49] A. Dolatabadi, H. Abdeltawab, and Y. A.-R. I. Mohamed, “Hybrid deep learning-based model for wind speed forecasting based on DWPT and bidirectional LSTM network,” IEEE Access, vol. 8, pp. 229 219–229 232, 2020.
[50] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning. MIT Press, Cambridge, 2016, vol. 1.
[51] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard et al., “TensorFlow: A system for large-scale machine learning,” in 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 2016, pp. 265–283.
[52] M. Khodayar and J. Wang, “Spatio-temporal graph deep neural network for short-term wind speed forecasting,” IEEE Trans. Sustainable Energy, vol. 10, no. 2, pp. 670–681, 2018.
[53] Y.-Y. Hong and T. R. A. Satriani, “Day-ahead spatiotemporal wind speed forecasting using robust design-based deep learning neural network,” Energy, vol. 209, p. 118441, 2020.
[54] Y. Liu, H. Qin, Z. Zhang, S. Pei, Z. Jiang, Z. Feng, and J. Zhou, “Probabilistic spatiotemporal wind speed forecasting based on a variational Bayesian deep learning model,” Applied Energy, vol. 260, p. 114259, 2020.

Amirhossein Dolatabadi (GS’16) was born in Tabriz, Iran in 1991. He received the B.Sc. and M.Sc. degrees in electrical engineering from University of Tabriz, Tabriz, Iran, in 2014 and 2016, respectively. From 2016 to 2018, he served as a Research Assistant with the Smart Energy Systems Laboratory (SES Lab), University of Tabriz.
He is currently pursuing the PhD degree at the University of Alberta, Edmonton, Canada. His research interests include artificial intelligence, machine learning and data analytics applications in energy systems.

Hussein Abdeltawab (GS’12–M’17) was born in Bani-Souwaif, Egypt, in April 1987. He received the B.Sc. (Hons.) and M.Sc. degrees in electrical engineering from Cairo University, in 2009 and 2012, respectively, and the Ph.D. degree in electrical engineering from the University of Alberta, Edmonton, AB, Canada, in 2017.
He is currently an Assistant Professor of electrical engineering with Penn State University, Erie, PA, USA. He is also a licensed Professional Engineer in Saskatchewan, Canada. His research interests include energy management, control system applications in renewable energy, energy storage, and smart distribution systems.

Yasser Abdel-Rady I. Mohamed (M’06–SM’11–F’021) was born in Cairo, Egypt, on November 25, 1977. He received the B.Sc. (with honors) and M.Sc. degrees in electrical engineering from Ain Shams University, Cairo, in 2000 and 2004, respectively, and the Ph.D. degree in electrical engineering from the University of Waterloo, Waterloo, ON, Canada, in 2008.
He is currently with the Department of Electrical and Computer Engineering, University of Alberta, AB, Canada, as a Professor. His research interests include dynamics and controls of power converters; grid integration of distributed generation and renewable resources, microgrids, modeling, analysis, and control of smart grids; and electric machines and motor drives.
Dr. Mohamed is an Associate Editor of the IEEE Transactions on Power Electronics and an Editor of the IEEE Transactions on Power Systems, IEEE Transactions on Smart Grid, and IEEE Power Engineering Letters. He is a registered Professional Engineer in the Province of Alberta.

