
Received: 16 August 2019 Revised: 11 February 2020 Accepted: 27 February 2020

DOI: 10.1002/er.5331

REVIEW PAPER

Deep learning methods and applications for electrical power systems: A comprehensive review

Asiye K. Ozcanli | Fatma Yaprakdal | Mustafa Baysal

Department of Electrical Engineering, Yıldız Technical University, Istanbul, Turkey

Correspondence
Asiye K. Ozcanli, Department of Electrical Engineering, Yıldız Technical University, Istanbul, Turkey.
Email: f4915038@std.yildiz.edu.tr

Funding information
The Scientific and Technological Research Council of Turkey, Grant/Award Number: 2211/C

Summary
Over the past decades, electric power systems (EPSs) have undergone an evolution from an ordinary bulk structure to intelligent, flexible systems by way of advanced electronics and control technologies. Moreover, the EPS has become a more complex, unstable and nonlinear structure with the integration of distributed energy resources, in comparison with traditional power grids. Classically, physical methods, statistical approaches and computer calculation techniques are commonly used to solve EPS problems; unlike these classical approaches, artificial intelligence (AI) techniques have recently been used in many fields. Deep neural networks have become increasingly attractive as an AI approach due to their robustness and flexibility in handling nonlinear, complex relationships on large-scale data sets. In the present study, the major deep learning concepts addressing problems in EPS are reviewed through a comprehensive literature survey. The applications of deep learning and its combinations are organized with up-to-date references in various fields such as load forecasting, wind and solar power forecasting, power quality disturbance detection and classification, fault detection on power system equipment, energy security, energy management and energy optimization. Furthermore, the difficulties encountered in implementation and the future trends of these methods in EPS are discussed on the basis of the findings of current studies. It is concluded that deep learning has a huge application potential in EPS, owing to the integration of smart technologies, which will increase considerably in the future.

KEYWORDS
CNN, DBM, deep learning, forecasting, power systems, RNN, SAE, smart grid

1 | INTRODUCTION

There has been a significant alteration at the distribution level with the installation of distributed generation units such as wind and solar, the integration of storage systems and the deployment of charging stations for electric vehicles in recent times.1 These new integrations provide new opportunities and more resilience in conventional energy management. However, this emerging energy system is not only complicated but also complex, and it needs state-of-the-art technology to overcome this growing complexity. In consequence, the smart grid paradigm has emerged to provide solutions that enable powerful, secure and reliable use of electricity as well as an optimized and more resilient conception of the power grid.2,3 Essentially, it is quite common to consider the smart grid as the modern electric power system (EPS) of the near future, one which integrates the most novel power electronics, computer, information, communication and cyber technologies. Traditional power system analysis techniques are no longer suitable for such advanced power systems (smart grids). Therefore, many researchers have been interested in the use of intelligent techniques in EPS. At this point, artificial intelligence (AI) plays a major role in power system problems such as operation, protection, control, planning, diagnosis, etc.4-7 Unlike stringent mathematical algorithms, AI can easily cope with nonlinearities and discontinuities in the internal structure of power system problems. AI can be grouped into five categories: learning methods (machine learning [ML], probabilistic learning methods), statistical methods, search methods and optimization theory (genetic algorithm, particle swarm optimization), game theory, and decision-making algorithms. The chart of AI methods is depicted in Figure 1. In AI, ML provides systems with the ability to automatically learn from massive amounts of historical or synthetic data without any human intervention. ML algorithms based on neural networks (NNs) and their variations (eg, artificial neural networks [ANNs], recurrent neural networks [RNNs] and deep neural networks [DNNs]) have received more attention and have been applied to several fields of power systems.8,9 Among them, deep learning (DNNs) has especially become a vital part of state-of-the-art systems in different disciplines, including speech recognition,10 image classification,11 natural language processing and fault diagnosis.

FIGURE 1 AI subfields and techniques [Colour figure can be viewed at wileyonlinelibrary.com]

In recent years, deep learning (DL) has achieved huge success in many fields thanks to advanced training algorithms and the power of parallel and distributed computing. DNN architectures can model the relationships in nonlinear complex systems, enabling raw data to be processed from the first layer to the last layer for feature extraction. DNNs enable the composition of features over multiple layers and have the potential for complex data modeling with fewer units, as opposed to shallow neural networks (SNNs).12 Therefore, DL algorithms have been used in many application areas, and it has been put forth in many different studies that complex problems may be solved by way of novel methods using DL. Figure 2 represents the popularity of DL in all fields and in EPS according to the ScienceDirect database. The number of articles has been obtained annually from 2012 to 2019 by searching for the term "deep learning" for all areas, as well as the terms "deep learning AND electric OR power" for EPS, in the title, keywords or abstract.

FIGURE 2 Annual trend in the number of articles on deep learning [Colour figure can be viewed at wileyonlinelibrary.com]

DL applications and methods have been reviewed for diverse fields in the literature, such as bioinformatics,13 medical image analysis14 and smart manufacturing.15 A number of review papers on DL applications in EPS have been presented,16-21 to the best of the authors' knowledge. DL approaches for automatic autonomous vision-based power line inspection are reviewed in Nguyen et al.16 Unsupervised and supervised DL methods for fault detection in wind turbines are covered in Helbing and Ritter.17 Various other studies have been carried out on DL applications for load and renewable energy forecasting in Almalaq and Edwards18 and Wang et al.19 These studies each present a single DL application field in power systems. Unlike References 16-19, an extensive survey is presented here on various fields of EPS (ie, load, solar and wind forecasting, grid protection, fault analysis, power quality, energy management and cyber security). On the other hand, the remaining review papers address both ML and DL methods in different power system fields. In Zhang et al20 and Cheng and Yu,21 ML technologies and reinforcement learning applications are reviewed alongside DL in smart grid applications and EPS, but the DL sections are only mentioned briefly. However, this paper focuses specifically on DL applications, and it includes novel hybrid variations of DL, unlike other reviews.

In this context, this review presents an up-to-date library of DL algorithms developed for EPS. Three popular search engines, ScienceDirect, IEEE Xplore and Google Scholar, have been used to browse papers written between 2014 and 2019. The literature search has been carried out using the keywords "electrical power systems," "ANN" and "DL" in general, and "SAE, DAE, CNN, DBM, DNN, AE, RNN, LSTM" as the DL algorithms in detail. As a result, 84 articles have been selected for full review. The DL algorithms are classified according to application areas as load forecasting, wind and solar power forecasting, power quality disturbance (PQD) detection and classification, fault detection on power system equipment, and other application fields. All articles are separately categorized by the type of algorithm, the accuracy achieved and the existing methods compared against. Consequently, the paper presents the problems most frequently encountered in the literature, with references, in addition to discussions of the superiority of DL algorithms from different aspects. Furthermore, DL techniques are discussed with respect to future scope. This review also aims to put forth the various concepts of DL applied in EPS and to plant a seed of interest for researchers regarding how DL may offer solutions to their electrical data. The review will thus present valuable insight while serving as a reference resource for researchers who may implement DL approaches in electrical power system studies.

The main focus points of this research on DL applications in power systems can be summarized as follows:

• A brief overview is given regarding the history, algorithms, tools and applications of DL.
• The most frequently used DL algorithms in power systems are categorized as single and hybrid models, the latter combining DL with other DL, preprocessing and ML techniques.
• A comparative analysis is realized between various existing ML techniques.
• The different DL application fields in power systems are presented with an abundance of significant references.
• All reference papers are classified according to the proposed algorithm, application area and compared methods, and the results are tabulated.
• Some useful remarks which can guide the application of DL techniques in power systems are given.
• The advantages of DL and some challenges encountered in the reference papers are addressed.
The remainder of the article is organized as follows. Section 2 presents the main DL methods that have been used especially for electrical power systems. Section 3 presents applications of DL for EPS and is divided into subsections on load forecasting, wind and solar power forecasting, PQD detection and classification, fault detection on power system equipment, and other application fields. Section 4 focuses on the obtained results and open challenges in the different areas of application. We end with a conclusion and suggestions for future research in Section 5.

2 | OVERVIEW OF DEEP LEARNING

DL is a kind of ML which has sophisticated training algorithms built from a nonlinear combination of multiple layers. DL is used to solve nonlinear problems (recognition, detection, classification, etc.) and to extract representations of data with multiple levels of features. The idea behind DL, by McCulloch and Pitts, actually dates back to 1943.22 The progress of DNNs took a long time due to technological constraints such as the lack of sufficient data, computing resources and training methods, as well as training difficulties, vanishing gradients, local minima and optimization problems.23,24 Deep networks take a much longer time to train due to their complex mathematical algorithms and code. The use of GPUs (graphical processing units) has allowed the development of scalable applications with high computing capacity.25 In particular, leading studies by Geoff Hinton, Yann LeCun and Yoshua Bengio created a breakthrough using a new learning algorithm known as greedy layer-wise unsupervised pre-training in 2006-2007.26,27 As a result of these advancements, DL methods have found a significant range of application fields such as computer vision,11 genomics,28 autonomous vehicles29 and robotics.30 Top technology companies such as Google, Facebook, Microsoft and IBM have made major investments in research and development projects such as the Big Sur, Tensorflow and Watson platforms. The key aspects of DL can be summarized as the methods used, problem categories, general use cases, performance indices, application industries and model parameters. Figure 3 presents general information on these classifications.

FIGURE 3 The key aspects of deep learning [Colour figure can be viewed at wileyonlinelibrary.com]

As seen in Figure 3, DL has been applied to various fields such as computer vision, social media, finance, genomics, automotive and EPS. Also, DL can make use of any video, image, speech signal or time series to classify, detect or predict on a dataset. Many libraries have been developed to carry out DL tasks, such as Tensorflow, Caffe, Theano and Keras. One of the most important challenges is to determine the model parameters used to train the data set. These parameters include the activation function, the number of layers, the number of epochs, the weights, etc. The definitions of these parameters are given in detail in Goodfellow et al.31 Model accuracy can be measured and compared with other ML algorithms through performance indices such as the RMSE (root mean square error) and the MAPE (mean absolute percentage error).
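As a brief illustration of these two indices, the sketch below computes the RMSE and MAPE of a forecast against measured values; the load values used here are hypothetical placeholders, not data from any of the reviewed studies.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean absolute percentage error (%); assumes y_true contains no zeros."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# Hypothetical hourly load values (MW) and the corresponding model forecasts
measured = [310.0, 295.5, 280.2, 305.8]
forecast = [302.1, 300.0, 275.0, 310.4]
print(f"RMSE = {rmse(measured, forecast):.3f} MW, MAPE = {mape(measured, forecast):.2f}%")
```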
DL architectures can be classified into different groups based on the training algorithms used. In this study, the DL architectures which have been used especially in electric power systems are classified into five groups: CNNs (convolutional neural networks), AEs (autoencoders), RBMs (restricted Boltzmann machines), RNNs and other combined approaches. The general features of these structures are given in the following sections.

2.1 | Convolutional neural network

A CNN is intended to learn high-level features from the data set via convolution.32 CNNs set a milestone in image, video, speech and sound processing.33 These networks have gained popularity thanks to their superior performance in object recognition through images and in image classification competitions. CNNs have been successfully utilized on visual data in many areas such as face recognition, individual recognition, traffic sign recognition and handwriting recognition.32 There can be hundreds of millions of weight values and billions of connections between neurons in a CNN where high-dimensional training is done. Figure 4A illustrates an example CNN structure for image classification, which contains several key components such as a convolution layer, a pooling layer, a fully connected layer, layer filters and an activation function; these components are detailed in Krizhevsky et al.11

FIGURE 4 Structures of DL algorithms (A) CNN, (B) AEs, (C) Boltzmann machines, (D) RNN & LSTM [Colour figure can be viewed at wileyonlinelibrary.com]
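Although Figure 4A shows an image-classification CNN, many of the EPS studies reviewed later apply the same idea to one-dimensional signals such as voltage waveforms or load series. The minimal Keras sketch below illustrates the convolution-pooling-dense pattern described above; the window length of 640 samples and the five output classes are purely illustrative assumptions and are not taken from any cited study.

```python
import numpy as np
from tensorflow.keras import layers, models

NUM_CLASSES = 5      # illustrative number of signal classes
WINDOW = 640         # illustrative number of samples per input window

model = models.Sequential([
    layers.Input(shape=(WINDOW, 1)),
    layers.Conv1D(16, kernel_size=9, activation="relu"),  # convolution layer (filters)
    layers.MaxPooling1D(pool_size=4),                     # pooling layer
    layers.Conv1D(32, kernel_size=5, activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(NUM_CLASSES, activation="softmax"),      # fully connected output layer
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# Random placeholder data, only to show the shape of the training call
x = np.random.randn(64, WINDOW, 1).astype("float32")
y = np.random.randint(0, NUM_CLASSES, size=64)
model.fit(x, y, epochs=2, batch_size=16, verbose=0)
```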
2.2 | Autoencoders and its variants

An AE is an unsupervised learning algorithm that uses backpropagation (BP) to extract features from input data. It has a three-layered network structure comprised of input, hidden and output sections.27 The basic structure of the AE is obtained by minimizing the reconstruction error of the combination of an encoder function, f_Θ, which maps the inputs to a lower-dimensional space, and a decoder function, which reconstructs this representation back into the original space. While f_Θ transforms the input data x with an explicit and effective calculation into the hidden representation h^(t) (feature vector), the decoder maps the hidden layer back to the output layer. These functions are given by formulae (1) and (2). The AE has constraints on the number of hidden units, so it can be used as a low-dimensional representation of the data:

h^(t) = f_Θ(x^(t)) → h = s(Wx + b),  (1)

r = g_Θ'(h) → r = s(W'h + b').  (2)

The parameter set for such a model is Θ = {W, b, W', b'}. These parameters are the variables comprising the encoder weight matrix, W, with bias vector, b, and the decoder weight matrix, W', with bias vector, b'. s denotes the activation function (sigmoid, tanh, etc.) used in the network. The reconstruction error is minimized through training by the best selection of these values. Stochastic gradient descent (SGD) is usually used to calculate the parameters in this minimization process so that the output approaches the input. The average reconstruction loss is minimized by optimizing the model variables as indicated by expression (3); the autoencoder is thus trained to minimize the reconstruction error L(x, r)34:

J_DAE(Θ) = Σ_t L(x^(t), g_Θ'(f_Θ(x^(t)))).  (3)

Consequently, AEs usually involve an unsupervised pre-training phase which helps to better optimize the training criterion and to prevent overfitting. The use of AEs can be a better choice when there is a large amount of unlabeled, high-dimensional data.35 Different learning algorithms based on the basic structure of the autoencoder can be derived, such as stacked autoencoders (SAEs)36 and denoising autoencoders (DAEs).37

The structure of an SAE consists of a stack of multiple AEs, which enables more complex features to be learned, and is illustrated in Figure 4B.26,38 On the other hand, unlike the SAE, a DAE uses corrupted data x̃, generated artificially from the clean data x. In this structure, an example x is corrupted to x̃; the AE then maps it to h and attempts to reconstruct x, and it is trained to estimate the reconstruction distribution using this data. As a result, the uncorrupted data appear at its output, and it extracts features that are useful for denoising.
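To make formulae (1) to (3) concrete, the sketch below builds a small fully connected autoencoder in Keras and, as in the denoising variant, simply corrupts the inputs with Gaussian noise before training. The layer sizes, the noise level and the random placeholder data are illustrative assumptions only, not values from the reviewed papers.

```python
import numpy as np
from tensorflow.keras import layers, models

INPUT_DIM, CODE_DIM = 48, 8  # eg, 48 half-hourly readings mapped to 8 features

inputs = layers.Input(shape=(INPUT_DIM,))
h = layers.Dense(CODE_DIM, activation="sigmoid", name="encoder")(inputs)  # h = s(Wx + b)
r = layers.Dense(INPUT_DIM, activation="linear", name="decoder")(h)       # r = s(W'h + b')
autoencoder = models.Model(inputs, r)
autoencoder.compile(optimizer="adam", loss="mse")  # reconstruction error L(x, r)

x = np.random.rand(256, INPUT_DIM).astype("float32")              # placeholder unlabeled data
x_noisy = x + 0.1 * np.random.randn(*x.shape).astype("float32")   # corruption step of a DAE

autoencoder.fit(x_noisy, x, epochs=5, batch_size=32, verbose=0)   # reconstruct the clean x
features = models.Model(inputs, h).predict(x)  # learned low-dimensional representation
```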
2.3 | Restricted Boltzmann machine and its variants

A Boltzmann machine is an ANN composed of probabilistic decision-making units with symmetric connections between neurons. The RBM developed from it is a nonlinear graphical model that defines a probabilistic distribution over observed (visible) and hidden vectors.26,39 A group of binary vectors (eg, a sample image) can be modeled as two layers with an RBM. Here the binary pixels detect features by way of symmetric weighted links, and they are connected stochastically. The pixels represent the "visible" units of the RBM because their states can be observed, while the feature detectors correspond to the "hidden" units. The energy value of a joint configuration of the visible and hidden units (v, h) can be expressed by formula (4), where W is the weight matrix between the hidden (h) and visible (v) layers, and b_v and b_h represent the biases of the visible and hidden variables, respectively. Figure 4C shows the structures of the developed Boltzmann machines. RBMs are generally trained using gradient descent on the negative log-likelihood. The neurons in the same layer are independent of each other but dependent on the next layer. In terms of learning, RBMs with hidden layers are faster than classical Boltzmann machines due to these properties40:

E(v, h) = -v^T W h - v^T b_v - h^T b_h.  (4)
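A direct numerical reading of formula (4) is sketched below; the weight matrix and bias vectors are random placeholders, and the helper only evaluates the energy of one visible-hidden configuration rather than training an RBM.

```python
import numpy as np

def rbm_energy(v, h, W, b_v, b_h):
    """Energy of a joint (v, h) configuration as in formula (4)."""
    return -(v @ W @ h) - (v @ b_v) - (h @ b_h)

rng = np.random.default_rng(0)
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # visible-to-hidden weights
b_v = np.zeros(n_visible)                              # visible biases
b_h = np.zeros(n_hidden)                               # hidden biases

v = rng.integers(0, 2, n_visible).astype(float)        # one binary visible vector
h = rng.integers(0, 2, n_hidden).astype(float)         # one binary hidden vector
print("E(v, h) =", rbm_energy(v, h, W, b_v, b_h))
```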
A deep Boltzmann machine (DBM) is formed by stacking RBMs and is constructed as a multilayer deep network with connections only between adjacent layers. After each RBM is trained in this network structure, the activities of its hidden units serve as training data for the next RBM layer.41,42 Due to its structure, the DBM possesses several interesting properties. Its ability to learn internal representations helps it capture complex statistical structure in the higher layers. High-level representations can be established with a large amount of unsupervised data and a small amount of supervised data, allowing the model to be fine-tuned for a particular discrimination task. Figure 4C presents a DBM with both visible-to-hidden and hidden-to-hidden connections but no within-layer connections. All connections in a DBM are undirected. In another respect, the parameters of all layers can be optimized jointly by following an approximate variational lower bound on the likelihood function.43

A deep belief network (DBN) first emerged in 2006 with the Hinton model, created by way of a greedy layer-wise unsupervised learning algorithm. Following greedy learning, the whole stack is treated as a single prediction model known as the DBN.26 The DBN consists of a four-layer network, with the visible layer x comprising the first layer and hidden layers h1, h2 and h3, as presented in Figure 4C. Unlike the DBM, the top two layers form an undirected RBM, while the other layers form a directed sigmoid belief network.44
2.4 | Recurrent neural network

An RNN is a type of ANN which utilizes connections between units to form a directed graph along an input sequence. This allows the exhibition of dynamic temporal behavior. All inputs in an RNN are related to each other, and hence the previous hidden state is fed as an input to the current state. Therefore, an RNN is capable of handling variable-length sequential data, which makes it quite advantageous for DL.8,45 In general, it is used in areas such as machine translation, language processing, text-to-speech and market forecasting.

An RNN has an intrinsically deep structure, since an RNN unfolded in time can be expressed as a combination of multiple nonlinear layers. Several deep RNN structures have been recommended with different approaches, and these have been reported to perform better than the classical RNN.46 Among these solutions, the most widely used structure has been the long short-term memory (LSTM).47 The LSTM, which consists of memory cells and gates, was developed to solve the vanishing gradient problem and to handle complicated time series. An LSTM can be created by incorporating memory cells as hidden layers inside the classical RNN structure. Simple RNN and LSTM structures are illustrated in Figure 4D.
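As a minimal sketch of how such a memory-cell network is typically set up for the forecasting problems discussed in Section 3, the Keras model below maps a 24-step input window to a one-step-ahead value. The window length, layer width and synthetic sine-wave data are illustrative assumptions and do not reproduce any particular study.

```python
import numpy as np
from tensorflow.keras import layers, models

WINDOW = 24  # illustrative input window (eg, 24 hourly values)

# Synthetic periodic series standing in for a load or wind-speed signal
series = np.sin(np.linspace(0, 60, 2000)) + 0.05 * np.random.randn(2000)
X = np.array([series[i:i + WINDOW] for i in range(len(series) - WINDOW)])
y = series[WINDOW:]
X = X[..., np.newaxis]  # shape (samples, timesteps, features)

model = models.Sequential([
    layers.Input(shape=(WINDOW, 1)),
    layers.LSTM(32),   # memory cells and gates
    layers.Dense(1),   # one-step-ahead output
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=3, batch_size=64, verbose=0)
next_value = model.predict(X[-1:])  # forecast for the step after the last window
```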
2.5 | Other approaches combined with deep learning methods

The idea of combining methods can be described as the utilization of different algorithms together to overcome the drawbacks of the individual methods. The combined approaches that include applications of DL in EPS can thus be classified into three groups, as presented in Figure 5. The first group is obtained by composing two DL algorithms, such as LSTM-AE, SAE-DBN and CNN-LSTM. The second group of combined approaches is built around data pre-processing, such as data filtering and data decomposition; for example, the wavelet transform (WT) is combined for data decomposition with DL methods such as WT-SAE, EWT (empirical wavelet transform)-LSTM-Elman neural network and WT-DNN in EPS, as sketched below. Finally, some approaches are combined with ML algorithms, such as SAE-support vector machine (SVM), SAE-ELM (extreme learning machine), DBN-Q learning and deep reinforcement learning (DRL). These combined methods are also discussed in detail in the following sections.

FIGURE 5 The schema of combined approaches with DL in EPS [Colour figure can be viewed at wileyonlinelibrary.com]
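For the second group, the sketch below uses the PyWavelets package to split a measured series into approximation and detail coefficients before they would be fed to a deep model. The wavelet family, decomposition level and synthetic signal are assumptions chosen for illustration and do not reproduce any specific WT-DNN pipeline from the cited studies.

```python
import numpy as np
import pywt  # PyWavelets

# Synthetic stand-in for a wind-speed or load series
t = np.linspace(0, 10, 1024)
signal = np.sin(2 * np.pi * 0.5 * t) + 0.2 * np.random.randn(t.size)

# Multilevel discrete wavelet decomposition: [approximation, detail_3, detail_2, detail_1]
coeffs = pywt.wavedec(signal, wavelet="db4", level=3)
for name, c in zip(["A3", "D3", "D2", "D1"], coeffs):
    print(name, c.shape)

# Each sub-series could then be forecast by its own deep model and the
# component forecasts recombined, as in the hybrid schemes discussed above.
```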

3 | DEEP LEARNING APPLICATIONS FOR ELECTRICAL POWER SYSTEMS

The smart grid concept emerged with the development of currently used power systems that can integrate energy demand, production and storage. These grids require flexibility in generating and distributing energy at an extraordinary level in order to minimize energy consumption and to optimize its usage. Furthermore, developments in power system planning, operation and control have become compulsory as a result of the latest changes in the smart power system field, namely the deregulation of power markets, insistent needs for efficiency and quality in power grids, the expansion of distributed generation, the growing size of interconnections, and power exchange between utilities. A great many AI implementations, such as ANN, ML and DL techniques, can remarkably contribute to solving the issues at hand.

Within the scope of this study, the power system areas to which DL methods are applied are summarized in Table 1 under the categories of load forecasting, wind and solar power forecasting, PQD detection and classification, fault detection on power systems and equipment, and other application fields.
TABLE 1 Summary of deep learning approaches in electrical power systems

| Problem categories | Methods | Application fields in electrical power systems | Input data in training | Literature studies |
|---|---|---|---|---|
| Load forecasting | LSTM-RNN, DBN, RBM (CRBM-FCRBM), deep CNN | Load consumption | Power consumption, load characterization, residents' behaviors | 50-77 |
| Wind forecasting | LSTM-RNN | Wind speed and wind power | Wind speed, power, direction, temperature, atmospheric pressure, humidity | 78-92 |
| Solar forecasting | CNN and k-means clustering, FF-DNN, D-RNN | Solar power and solar irradiance | Solar radiation, power, sun and weather data | 93-99 |
| Power quality disturbance detection and classification | SAE, SAE-SVM, WT-SAE, CNN, wide CNN, CDBN, FF-DNN, SSAE | Power quality disturbance classification; island detection in DG systems | Voltage and current signals; measurements of voltage, current, active power and reactive power | 101-115 |
| Fault detection on power system and equipment | SAE, WT-DNN, LSTM-SVM, DNN, DBN-DNN, CSAE | Fault type classification in power systems; fault diagnosis of transformers, gearboxes, fuel cells, etc. | Gearbox failures, rotor current signals, gearbox lubricant pressure SCADA data, transformer oil data; voltage and current signals | 116-124 |
| Other application fields in power systems | Deep reinforcement learning, BM, WT-DNN, wide and deep CNN | Detection of false data injection in the smart grid; power system security; electricity-theft detection in the smart grid; optimization of decentralized renewable energy systems | Historical energy consumption, power generation and electrical device data; fire, natural disaster, environmental pollution and energy interruption data | 125-134 |

3.1 | Load forecasting in power systems

The importance of energy demand forecasting stems from the significant role played by energy production planning and adaptation in power systems. The power network must be able to respond dynamically to variations in demand and should be able to distribute energy efficiently and optimally. Furthermore, the usage of renewables should be kept at the optimum level for smart grids. Intelligent and adaptable elements, which require more advanced techniques for accurate and precise estimates of future energy demand and generation, are imperatively used by smart power systems for maximum performance. Demand forecasting is realized at the aggregated level and at the building level and, depending on the application, is categorized as very short-term load forecasting (VSTLF), from seconds or minutes to several hours; short-term load forecasting (STLF), from 1 hour to a week; medium-term load forecasting (MTLF), from 1 week to a year; and long-term load forecasting (LTLF), corresponding to a time period longer than a year. Depending on the forecasting category, load forecasting uses different models to meet the particular objectives of utilization. When it comes to analyzing the input parameters for feature extraction, the aggregated load is highly affected by many contextual parameters such as weather (temperature, humidity, solar radiation, etc.), special events (like holidays) and the day of the week, while the energy consumption of a residential house (individual level) has a higher dependency on the behaviors of the individuals.

Energy consumption estimation is essentially a time-series forecasting problem. Many linear forecasting methods, such as auto-regressive moving average (ARMA), auto-regressive integrated moving average (ARIMA), linear regression (LR) and iteratively re-weighted least-squares (IRWLS), and nonlinear forecasting methods, such as ANNs, the multi-layer perceptron (MLP), the general regression neural network (GRNN) and SVM, have been presented in the literature.60,61 However, it became inevitable to start implementing statistical ML techniques in load estimation, as in other prediction problems, owing to the rapid development that took place during the 1990s. The popularity of DNNs as the latest improved subset of ML techniques has recently been increasing in time-series load forecasting, as in many other fields of power systems. Table 2 provides a detailed overview of the application studies of DL methods in electrical load forecasting. Comparative methods are also presented in the table, since conventional linear and nonlinear forecasting methods have frequently been used for performance comparison with state-of-the-art DNN methods.
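Before any of the models in Table 2 can be trained, the raw consumption history has to be framed as a supervised learning problem. The sketch below shows one common way of doing this: a sliding window of past readings becomes the input and the next reading becomes the target. The window length, the synthetic load profile and the optional calendar feature are illustrative choices, not a prescription from the reviewed papers.

```python
import numpy as np

def make_supervised(load, window=168):
    """Turn an hourly load series into (X, y) pairs: past `window` hours -> next hour."""
    X = np.array([load[i:i + window] for i in range(len(load) - window)])
    y = np.array(load[window:])
    return X, y

# Hypothetical two months of hourly load data (kW)
hours = np.arange(24 * 60)
load = 300 + 50 * np.sin(2 * np.pi * hours / 24) + 5 * np.random.randn(hours.size)

X, y = make_supervised(load, window=168)    # one week of history per sample
day_of_week = (hours[168:] // 24) % 7       # simple calendar feature, optional
X = np.hstack([X, day_of_week[:, None]])    # append it as an extra input column
print(X.shape, y.shape)
```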

TABLE 2 Summary of reports for application of deep learning (DL) approaches in load forecasting

| Reference | DL methods | Application | Comparative methods | Outcome |
|---|---|---|---|---|
| 48 | Parallel CNN; LSTM-RNN | STLF, AL | LR, SVR, DNN and CNN-RNN | MAPE 1.349 |
| 62 | LSTM-RNN | STLF, AL | SARIMA, NARX, SVR, NNETAR | MAPE 0.0535 |
| 63 | LSTM-RNN | S-MTLF, AL | The ML benchmark | RMSE 341.40 |
| 64 | DBN | STLF, AL | Traditional NN | MAPE improv. 21% |
| 65 | DNN DeepEnergy | STLF, AL | SVM, RF, DT, MLP, LSTM | MAPE 9.77 |
| 66 | FCRBM | STLF, AL | SVM | MAPE 1.43 |
| 67 | LSH stack deep autoencoder | STLF, VSTLF, AL | — | RMSE 6.99 |
| 68 | IoT-based DL (DCEN + ILFN) | STLF, AL | SDNN, DCEN, ILFN, Holt-Winters (HW) | MAPE ~1.00 |
| 69 | SDA | STLF, AL | Classical NNs, SVM, MARS, LAS and SO | MAPE 2.47 |
| 70 | CNN | STLF, AL | TSAM, SVM | DBI 8.57 |
| 71 | LSTM-RNN | STLF, BL | BPNN, KNN-R, ELM, IS-HF | MAPE 8.18 |
| 72 | LSTM-based RNN | STLF, BL | FFNN, KNN-R | MAPE average 21.99 |
| 73 | Pooling-based D-RNN | STLF, BL | ARIMA, SVR | RMSE 0.4505 kWh |
| 74 | D-RNN | STLF, BL | Different prediction approaches and recurrent unit types | RMSE 111.9 |
| 75 | CNN-LSTM | STLF, BL | ARIMA, persistent model, SVR, LSTM | MAPE 10.1582 |
| 76 | Pinball loss guided LSTM | S-LTLF, BL | Q-LSTM, Q-RNN, Q-GBRT | Error 0.1775 kWh |
| 77 | HFM with GPU | STLF, BL | — | No result given |
| 78 | DNN (RBM), DNN (RELU) | STLF, BL | 3-layered NN, ARIMA and DSHW | MAPE 8.84 |
| 79 | CNN | STLF, BL | LSTM S2S, FCRBM, ANN and SVM | RMSE 0.732% |
| 80 | CRBM, FCRBM | STLF, BL | SVM and RNN | RMSE 0.1702 kW |
| 81 | Q-learning with DBN, SARSA with DBN | LTLF, BL | Q-learning without DBN, SARSA without DBN | RMSE 0.02 |
| 82 | Parallel CPU-GPU (HFM) | STLF, BL | CPU | No result given |
| 83 | DBN ensemble | M-LTLF, AL | SVR, FF-NN, DBN, ensemble FF-NN | MAPE 5.93 |
| 84 | LSTM-DRNN | M-LTLF, AL + BL | MLP (3-layered) | Average RMSE 7.10 kWh |
| 85 | Combined SAE and ELM | M-LTLF, BL | BPNN, GRPFNN, MLR and SVR | MRE 2.92 |
| 86 | D-FC-NN, CNN and LSTM-NN | M-LTLF, BL | LR, RR, PLS and SGD regressors | RE 17.29% |
| 87 | RNN, CNN | M-LTLF, BL | SARIMAX | Average RMSE 17.84 kW |
| 49 | EMD-DL ensemble | M-LTLF, BL | SVR, ANN, DBN, ensemble DBN, EMD-SVR, EMD-ANN and EMD-RF | MAPE 3.00 |

Abbreviations: AL, aggregated level; BL, building (individual) level; DBI, Davies-Bouldin index; GBRT, gradient boosting regression tree; MAPE, mean absolute percentage error; Q, quantile; RE, relative error; RMSE, root mean square error; SARSA, state action reward state action; TD, time domain.

As can be seen in Table 2, it is important to notice that the greatest number of studies have been presented in the context of short-term, aggregated-level load forecasting. Within the framework of this topic, the LSTM-RNN algorithm48 is the most commonly used,62,63 followed by DBN64 and RBM-based65 deep architectures, respectively. Unlike the other relevant reference studies of Jian et al62 and Bouktif et al,63 parallel CNN components are used to derive various features of the past energy consumption data in He,48 with the variable and dynamic structure of the past consumption data modeled via the LSTM-based RNN component.
It is important to notice that Bouktif et al63 also conducted an MTLF study with the same data as for STLF, and that they additionally obtain the optimal time lags and number of layers for predictive performance optimization, which distinguishes their work from other studies using the LSTM method. Debinec et al64 utilize a DBN composed of multiple layers of Boltzmann machines for demand forecasting in Macedonia over a time period of 24 hours. Here, electrical consumption estimates for the distribution network (mostly represented by the residential consumption sector) and the transmission network (often represented by major industrial companies) are made separately and together. In this way, it is noticed that the electrical consumption estimates for the distribution network are more accurate than those for the transmission network. In Kuo and Huang,66 a robust deep CNN model (DeepEnergy) is presented for electricity demand forecasting. It is found that DeepEnergy can accurately forecast electricity consumption for the following 3 days and that it outperforms all tested algorithms. Furthermore, some other deeply structured methods such as an LSH stack deep AE,67 IoT-based DL (DCEN + ILFN),68 SDA69 and a deep CNN70 have so far been utilized in other studies on this topic.

Within the scope of short-term, individual-level load forecasting, LSTM-based deep architectures71,72 have been used most frequently in the literature thus far. Kong et al71 put forth a deep LSTM-RNN framework for this problem with much better performance than various ML algorithms, including the most advanced ones. Moreover, it is the first study to show that aggregating all individual forecasts provides better performance than forecasting the aggregated load directly. Another study73 proposes an LSTM-based RNN and emphasizes the importance of appliance usage information obtained from advanced metering infrastructure datasets. Shi et al74 propose a novel approach of pooling-based deep RNNs for the forecasting problem. They utilize a pool of inputs from a group of 920 smart-metered households in Ireland to enhance the variety and dimension of the data. A deep RNN technique encompassing the LSTM structure has also been assessed in detail recently75; here, the capacity of deep recurrent frameworks is confirmed for STLF of individual buildings. A hybrid DL-NN algorithm that integrates the CNN method and LSTM, the most commonly used algorithm in this area as mentioned at the outset, has been utilized for a multi-step short-term forecasting strategy to extend the response time for electricity market bidding in Yan et al.76 This hybrid method yields more powerful results for the multi-time-step forecasting strategy in comparison with the other compared methods, such as the ARIMA model, the persistence model, SVR and LSTM alone. Wang et al72 perform both STLF and LTLF by extending conventional LSTM-based point forecasting to probabilistic forecasting, and the proposed pinball loss guided LSTM method outperforms traditional methods.
some other deeply structured methods such as LSH stack (CRBM), (b) factored conditional Boltzmann machine
deep AE,67 IoT-based DL (DCEN + ILFN),68 SDA69 and (FCRBM) and they find that as the prediction horizon is
Deep-CNN70 have so far been utilized in other studies with increasing (eg, 1 year ahead), FCRBMs and CRBMs seem
regard to this content. to be stronger and that their prediction errors reduce by
Within the scope of short-term and individual level almost half compared to other classical methods. Besides,
load forecasting, LSTM-based deep architectures71,72 have Mocanu et al81 go further among state-of-the-art energy
been used most frequently in the literature thus far. Kong estimation methods by presenting a novel approach
et al71 put forth a deep LSTM-RNN framework for the which does not call for labeled data. As a major academic
related problem with a much better performance than contribution, DBN is used to extract the high-level fea-
various ML algorithms including the most advanced tech- tures automatically; it learns a building model by involv-
nologies. Moreover, it is the first study which shows that ing a generalization of the state space domain. The point
aggregation of all individual forecasts that provides better of attention in Coelho et al82 by proposing a parallel GPU
performance than the aggregated load forecasting. (HFM) DL metaheuristic-based architecture is that it is
Another study73 proposes LSTM-based RNN and empha- presented as an alternative solution with low-energy con-
sizes the importance of usage information of the appli- sumption and high processing power for the analysis of
ances obtained from the advanced metering infrastructure large data which takes a long and costly process for ordi-
datasets. Shi et al74 propose a novel approach of pooling- nary computer conditions.
based deep RNNs for the forecasting problem. They utilize For the aggregated level M-LTLF, Qio et al83 propose
a pool of inputs for a group of 920 smart metered house- a DL belief networks (DBN) ensemble to estimate regres-
holds in Ireland to enhance the variety and dimension of sions and time series for the first time. Another novelty
the data. Lastly, deep RNN technique encompassing the in this study is the collection of output from various
LSTM structure has been assessed in detail in recent DBNs using the support vector regression (SVR) model.
times.75 Here, the capacity of deep recurrent frameworks A new Deep-RNN model is developed and optimized for
is confirmed in STLF of individual buildings. A hybrid medium- and long-term electrical load estimations on an
DL-NN algorithm that integrates the CNN method and hourly basis (medium- and long-term LF), and DNN was
LSTM as the most commonly used algorithm in this case used to work with a data set that contains missing data
as mentioned at the outset, has been utilized for a multi- blocks in Rahman et al.84 For the building level M-LTLF,
step short-term forecasting strategy to extend the time to a new prediction model is developed in combination with
response electricity market bidding in Yan et al.76 This the DNN method, SAE and the ELM in Li et al.85 Here,
hybrid method yields more powerful results for the multi- the high-quality energy consumption data are revealed by
time step forecasting strategy in comparison with the the SAE learning algorithm while the estimation accuracy
other compared methods like ARIMA model, persistent is increased with ELM. Unlike the methods presented in
OZCANLI ET AL. 11

For aggregated-level M-LTLF, Qiu et al83 propose a deep belief network (DBN) ensemble to estimate regressions and time series for the first time. Another novelty in this study is the aggregation of the outputs of the various DBNs using a support vector regression (SVR) model. A new deep RNN model is developed and optimized for medium- and long-term electrical load estimation on an hourly basis in Rahman et al,84 where the DNN is used to work with a data set that contains missing data blocks. For building-level M-LTLF, a new prediction model is developed in Li et al85 by combining the DNN method, SAE, with the ELM. Here, high-quality energy consumption data are revealed by the SAE learning algorithm, while the estimation accuracy is increased with the ELM. Unlike the methods presented in previous power consumption forecasting studies, the study in Berriel et al86 includes a dataset with monthly energy consumption and several items of meta-data related to customer behavior. Three DL models are examined with more than 10 million data points: deep fully connected, convolutional and LSTM neural networks; the LSTM-based model shows the best results for the proposed problem in terms of absolute and relative errors. Unlike classical CNN and RNN algorithms, a gated RNN and CNN are proposed in another related study by Cai et al87 for day-ahead multistep load forecasting. Weather prediction information is also used as an input variable in this study. The proposed models are compared to a seasonal ARIMAX model in terms of accuracy, computational efficiency and robustness; the GCNN outperforms all other models for 24-hour prediction. A hybrid algorithm is used in Qiu et al49 for predicting load demand, where the load demand time series in that ensemble model is divided into several different components by way of an empirical approach known as empirical mode decomposition (EMD). These new signals, known as intrinsic mode functions (IMFs), are applied to a DBN with two RBMs and an ANN. According to the estimation results, the hybrid models based on EMD outperform the other nine monolithic prediction algorithms.

3.2 | Wind and solar power forecasting in power systems

Wind and solar energy systems are utilized in stand-alone and grid-connected systems to generate power for individual building, residential, commercial and industrial applications. Power forecasting for wind and solar systems is essential for the operation, planning and control of distributed generation. Thus, accurate and effective power prediction approaches are crucial for operating power networks. Solar and wind power forecasting methods are quite similar and can be categorized into time-series statistical methods (ie, ANN, SVM, Markov chain, autoregressive and regression models, DL), physical methods (numerical weather prediction) and ensemble methods (ie, NN-fuzzy, wavelet-NN, ensembles of various prediction models).

DBM, DBN, CNN, DAE, SAE, LSTM and hybrid models are frequently used as DL techniques in short- to medium-term wind and solar power prediction frameworks. The related papers are summarized in Table 3, which provides a detailed overview of DL applications in wind and solar forecasting studies. The studies are categorized by the applied DL methods, forecasting time horizon (short, medium, long), compared methods and accuracy of the proposed models. In Hu et al,50 a DAE is used to extract the hidden rules of wind speed patterns in the deep network architecture. After unsupervised learning is performed during pre-training on each layer, supervised learning is used to fine-tune the network in the hidden and output layers. Qureshi et al88 report that SAEs as base regressors and a DBM as a meta-regressor are combined for short-term wind power prediction; training time is reduced effectively thanks to transfer learning and the meta-regressor. Similarly, SAE and DAE algorithms are proposed separately in Khodayar et al89 for very short-term (10-30 minutes) and short-term wind speed forecasting (STWSF). Dalto et al90 present a standard feedforward multilayer perceptron NN (MLP) with four layers for ultra-short-term wind speed forecasting. Input variable selection (IVS) based on partial mutual information (PMI) is preferred to reduce the complexity of the inputs and make learning computationally feasible in limited time. The proposed models possess more powerful prediction capacity than the conventional forecasting methods in all of these studies.

Since it alleviates the local optimum problem, a DBN algorithm is used in Tao et al,91 thus achieving better learning with more hidden layers. Zhang et al92 use the DBM model with real wind speed datasets for both hour-ahead and day-ahead multi-step prediction experiments. Similarly, a DBN based on Boltzmann machines is used for wind power forecasting in Wang et al.93 In order to obtain higher accuracy, a k-means clustering algorithm is used to analyze a large number of numerical weather prediction (NWP) samples.
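As a rough illustration of that clustering step, the sketch below groups hypothetical NWP feature vectors (wind speed, direction, pressure) with scikit-learn's k-means; the feature set and the number of clusters are assumptions, and in a scheme like the one above a separate forecaster would then be trained for each weather regime.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Hypothetical NWP samples: columns = wind speed (m/s), direction (deg), pressure (hPa)
nwp = np.column_stack([
    rng.uniform(0, 25, 500),
    rng.uniform(0, 360, 500),
    rng.normal(1013, 8, 500),
])

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(nwp)
labels = kmeans.labels_        # cluster index for each weather sample
print(np.bincount(labels))     # how many samples fall in each regime
# A dedicated DBN (or other forecaster) could then be trained on each cluster.
```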
A CNN and LSTM hybrid model is used in Wu et al94 for the prediction of wind power. The original wind power time series is also regularized with an equivalent probabilistic power curve method before training, owing to measurement error and noise. Another hybrid CNN algorithm is presented in Wang et al95 with WT as a hybrid model, in order to remove the uncertainties of wind power generation. In Qin et al,96 the proposed model uses a hybrid CNN-LSTM-SVM model for wind turbine energy and demand forecasting. This model is contrasted with an MLP and an RBFNN (radial basis function neural network) and performs with less forecasting error than the others. In Liu et al,97 a WPD (wavelet packet decomposition)-LSTM-CNN hybrid model shows robust and effective performance in predicting wind speed compared to ARIMA, SVM, BP models, etc. A hybrid model named LSTM-PCA (principal component analysis) is used to estimate wind power in Xiaoyun et al.98 Variables such as atmospheric pressure, humidity, wind speed and wind direction are ranked according to their effect levels by the PCA method. Another study99 proposes a hybrid model based on LSTM, SVRM and an extremal optimization algorithm.
TABLE 3 Summary of reports for the application of deep learning (DL) approaches in wind and solar forecasting

| Reference | DL methods | Time span | Comparative methods | Outcome/accuracy |
|---|---|---|---|---|
| 50 | DAE | STWSF | SVR, ELM | MAE 0.679, MAPE 14.27 |
| 88 | SAE-DBM | STWSF | ARIMA, SVR | RMSE 0.094, MAE 0.065 |
| 89 | SAE, SDA | STWSF | FFNN, TDNN, NARNN | RMSE 0.521, MAE 0.213 |
| 90 | MLP | STWSF | SDA, SNN | MAE 0.53 |
| 91 | DBN | SWPF | SVR, MLP | Effective accuracy |
| 92 | DBM-BP | STWSF | AR, ANFIS, SVR | MSE 1.29, MAPE 6.17 |
| 93 | DBN | STWPF | BP, MWNN | MAPE 1.1739 |
| 94 | CNN-LSTM | WPF | CNN-FFNN, CNN-RNN, FNN | RMSE 0.079, MAE 0.056 |
| 95 | CNN-WT | STWPF | SVC + QR and BP + QR | CRPS 0.2809 |
| 96 | CNN-LSTM-SVM | MTWPF | MLP, RBFNN | MAPE 0.78 |
| 97 | WPD-CNNLSTM-CNN | STWSF | ARIMA, SVM, BP, Elman, ELM | Effective accuracy |
| 98 | LSTM-PCA | STWPF | BP, SVM | RMSE 1.073, MAE 0.567 |
| 99 | LSTM-SVRM-EO | STWSF | ARIMA, SVR, ANN, KNN and GBRT | MAE 1.14, MAPE 1.5335, RMSE 17.10 |
| 100 | EWT-LSTM-Elman | STWSF | ARIMA, BP, hybrid methods | RMSE 0.37, MAE 0.28 |
| 51 | ELM-LSTM-DE | STWSF | GBRT, persistence model | MAE 0.47054, RMSE 0.658 |
| 52 | Deep LSTM-RNN | STSPF | CNN-LSTM | RMSE 82.15 |
| 101 | LSTM and AE | MTSPF | SAE, SDA | RMSE 0.0642 |
| 102 | DBN | STSPF | SAE-DBM | MAPE 5.11% |
| 103 | DL with DBN | STSPF | DBM-BP | MSE 4.801 |
| 104 | D-CNN | STSPF | MLP | MAE 0.53 |
| 105 | D-RNN with LSTM | — | SDA | Average RMSE 0.086 |
| 53 | LSTM-supported D-RNN | — | RNN | RMSE 0.077 |

Abbreviations: ANFIS, adaptive neuro fuzzy inference system; CRPS, continuous ranking probability score; FFNN, feed-forward neural network; GBRT, gradient boosted regression trees; KNN, k-nearest neighbors; MWNN, Morelet wavelet neural network; NARNN, nonlinear autoregressive neural network; PCA, principal component analysis; SVRM, support vector regression machine; TDNN, time delay neural network.

A combined approach (EWT-LSTM-Elman) for multi-step wind speed forecasting, in which EWT is used to decompose the raw wind speed data into a few sub-layers, is proposed in Liu et al.100 The low- and high-frequency parts of the decomposed wind signal are predicted by an LSTM model and an Elman NN, respectively. The proposed algorithm shows satisfactory prediction performance in comparison to 11 different forecasting models. A novel hybrid method is presented in Hu and Chen51 utilizing LSTM, an extreme learning machine (ELM) and a differential evolution (DE) algorithm. The DE optimization algorithm is used to determine the number of hidden layers and neurons for the LSTM network.

The papers related to solar forecasting can be found at the end of Table 3. The LSTM-RNN model is proposed in Abdel-Nasser and Mahmoud52 for estimating the output power of PV, since LSTM networks can model short-term changes in the PV output power thanks to their recurrent structure and memory units. The LSTM network is trained via BP, and its structure overcomes the gradient problem. Among the different LSTM models applied in several case studies, LSTM for regression with time steps yields the most accurate results. In Gensler et al,101 combinations of DL algorithms such as LSTM, AE and DBN, similar to those in Li et al102 and Neo et al103 which yield successful results among state-of-the-art estimation methods, are used for power generation estimation of 21 solar power plants. According to the RMSE metric, the best estimation is realized with the Auto-LSTM method, with a mean of 0.0713. An AE is utilized to perform the feature learning, and an LSTM network is then combined with the encoding part of the AE to exploit temporal information in the form of sequences of the extracted features. In Wang et al,104 a deep CNN is used to estimate PV energy generation. In this method, the past photovoltaic production data are in the form of one-dimensional nonlinear signals. The time-series signals are separated into a main signal and several detail signals by the WT method to increase the estimation accuracy, after which they are used in a NN.
Quantile regression (QR) is also applied to statistically analyze the prediction error of the results after reconstruction of the signals by inverse wavelet transformation following training. Consequently, the combined use of the proposed WT-DCNN-QR techniques produces a high-specification, stable and robust prediction structure in comparison with other methods. Unlike other studies, a deep recurrent neural network (DRNN) model is used to estimate the solar radiation value in Alzahrani et al,53,105 with the aim of handling complex models and revealing high-level features in this manner. The application, built with an LSTM-supported DRNN network structure, is implemented on the Keras API and Matlab platforms.

3.3 | Detection and classification of power quality disturbances in power systems

The determination and classification of PQDs are important in improving the protection, security and reliability of power systems. In recent years, the system data of smart grids (such as voltage-current signals and magnitudes, grid status information and electricity usage) have become available owing to developments in inter-communicating devices such as phasor measurement units and smart meters. Artificial intelligence methods can be implemented easily thanks to the availability of this high-dimensional data. In a recent study,106 soft computing techniques for classifying PQDs have been reviewed and DL applications are briefly summarized through a few articles. However, the applications of DNNs are comprehensively presented here by comparing them with the existing methods. These studies are summarized in Table 4, which provides a detailed overview of the application studies of DL methods in this area of EPS. The studies are categorized by the applied DL methods, application area, compared methods and performance accuracy of the proposed models.
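Several of the studies summarized in Table 4 work with synthetically generated disturbance waveforms (eg, Reference 116). The sketch below generates a clean 50 Hz voltage, a sag and a swell in NumPy purely as an illustration of such signals; the sag and swell depths, the event duration and the sampling rate are arbitrary choices and not the parameter sets used in the cited papers.

```python
import numpy as np

FS, F0 = 3200, 50                   # sampling rate (Hz) and fundamental frequency (Hz)
t = np.arange(0, 0.2, 1 / FS)       # ten cycles of a 50 Hz waveform
clean = np.sin(2 * np.pi * F0 * t)  # 1.0 pu reference voltage

def with_magnitude_event(v, t, start, stop, factor):
    """Scale the waveform between start and stop seconds (factor < 1: sag, > 1: swell)."""
    out = v.copy()
    mask = (t >= start) & (t < stop)
    out[mask] *= factor
    return out

sag = with_magnitude_event(clean, t, 0.06, 0.14, 0.5)    # 50% sag for four cycles
swell = with_magnitude_event(clean, t, 0.06, 0.14, 1.3)  # 30% swell for four cycles
signals = np.stack([clean, sag, swell])
labels = np.array([0, 1, 2])        # class indices for a subsequent classifier
print(signals.shape, labels)
```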
An SAE is used to classify PQDs, including sag, swell, interruption, harmonics and oscillatory transients, in Ma et al.54 The method uses unsupervised learning, without any labeled data, to extract high-level features from the input data. Mahdi and Genc107 propose a novel method based on a stacked sparse AE for detecting the transient stability status after the occurrence of a fault in power systems. The results show that the proposed method is quite effective for real-time detection of stability. Similarly, the SAE algorithm is combined with independent component analysis (ICA) in Shi et al.108 The PQD signals are classified as voltage sag, voltage swell, voltage interruption, impulsive transient, oscillatory transient, harmonic and flicker. In Deng et al,109 a novel DL architecture is used, based on a bidirectional gated recurrent unit (Bi-GRU).
TABLE 4 Summary of reports for the application of deep learning (DL) approaches in detection and classification of power quality disturbances

| Reference | DL methods | Application | Comparative methods | Outcome/accuracy |
|---|---|---|---|---|
| 54 | SAE | Classification of PQD | SK-ANN, DWT-FFT | Accuracy 99.75% |
| 107 | SSAE | Detection of transient instabilities | MLP | Accuracy 99.4% |
| 108 | ICA-SAE | Classification of complex PQD | — | Accuracy 98.6% |
| 109 | RNN (Bi-GRU) | Detection of type and time location of PQD | — | Accuracy 98% |
| 110 | MFCNN | Identification of complex PQD | SVM, DT, DWT, PNN, … | Accuracy 99.26% |
| 111 | LSTM | Voltage dip classification | — | Accuracy 93.4% |
| 112 | DBN | Classification of PQD | WT and SVM | Accuracy 99.15% |
| 113 | DBN | Diagnosis of voltage sag events | SVM | Accuracy 96.92% |
| 114 | CNN | Detecting and classifying PQD | Conventional methods | Accuracy 99.96% |
| 115 | CNN | Detecting and classifying PQD | SVM | Accuracy 99.04% |
| 116 | PCA-CNN | Monitoring and classification of PQD | DWT, ELM, HST, PNN, … | Accuracy 99.92% |
| 117 | CNN-LSTM | Classification of PQD | CNN, RNN, LSTM, GRU | Accuracy 98.4% |
| 118 | CS-CNN | Classification of complex PQD | ELM, PCA-SVM, PNN, PSO-H-ELM | Accuracy 98.6% |
| 119 | CS-SAE | Classification of complex PQD | WT-PNN, DT-NN | Accuracy 99.52% |
| 55 | WT-SAE | Island detection and classifying PQD | DT, SVM | Accuracy 98.3% |

Abbreviations: AUC, area under curve; CDBN, conditional deep belief network; DT, decision tree; DWT-FFT, discrete wavelet transform-fast Fourier transform; ELM, extreme learning method; GRU, gated recurrent unit; HST, hyperbolic S-transform; PCA, principal component analysis; PNN, probabilistic NN; PQD, power quality disturbance; PSO, particle swarm optimization; RF, random forest; SK, spectral kurtosis; SSAE, sparse stack autoencoder; WT, wavelet transform.
In Deng et al109 and Qiu et al,110 the proposed models are evaluated under different noise levels and real operating environments. Three-phase voltage dip patterns are trained with an LSTM to classify PQDs in Balouji.111 The accuracy of this model is relatively low compared with other DL applications.

It is indicated in other studies, such as Li et al112 and Mei et al,113 that the DBN algorithm achieves better performance in comparison with conventional methods and that it is robust against noise. Various voltage sags, such as those caused by three-phase short circuits, induction motor starting and transformer energizing, are classified in Mei et al.113

In recent studies, various CNN architectures have been used to classify PQDs; hybrid CNN models in particular provide better performance than conventional methods. For example, Wang and Chen114 and Liu et al115 address both the detection and classification of PQDs. While Liu et al115 combine a novel full closed-loop approach based on a deep CNN, Wang and Chen114 use singular spectrum analysis (SSA), the curvelet transform (CT) and a deep CNN. Other combined methods utilizing CNN-PCA and CNN-LSTM are presented in Shen et al116 and Mohan et al,117 respectively. The suggested model is tested on a modified IEEE 13-bus system with synthetic PQDs in Reference 116; it is noted that the method is sensitive to noise. Various DL algorithms such as CNN, RNN, LSTM, GRU and a hybrid CNN-LSTM are compared for effective and accurate classification in Mohan et al,117 and the best result is obtained with the CNN-LSTM architecture. Compressive sensing (CS) as a data acquisition technology is combined with CNN and SAE algorithms in Wang et al118 and Liu et al,119 respectively. The proposed models achieve rapid classification and high accuracy thanks to CS. A novel DL framework is presented in Kong et al55 with WT and multi-resolution singular spectrum entropy for the detection and classification of grid disturbances and islanding, based on the voltage at the point of common coupling of the distributed grid. While the islanded mode is classified through voltage rise and drop, grid disturbances are labeled as sag and swell.

3.4 | Fault detection on power system and equipment

Fault detection and prediction is the process of analyzing historical data to detect whether there is a fault in the electrical system, in order to ensure the reliability and stability of the power system. The applications of DNNs in this field are presented for both power system protection and electrical equipment protection, covering equipment such as wind turbines, wind turbine gearboxes, transformers and fuel cells. These studies are presented briefly in Table 5, which provides a detailed overview of the DL applications in this area. The studies are categorized by the applied DL methods, application area, compared methods and accuracy of the proposed models.

An alternative solution for fault diagnosis in power systems is presented in Yixing et al56 by employing the SAE algorithm, which handles local optima and the diffusion of gradients better. The SAE is constructed with hidden layers of different dimensions, and thus the influence of the hidden layers on the diagnosis accuracy rate is demonstrated. A combined approach is presented in Yu et al120 with WT-based DNNs for microgrid fault detection, including fault type classification, fault phase identification and fault location detection. A line trip fault is predicted using LSTM and SVM algorithms for the operational reliability and stability of a power system in Zhang et al.121

TABLE 5 Summary of reports for the application of deep learning (DL) approaches in fault detection on power system and power system equipment

| Reference | DL methods | Application | Comparative methods | Outcome/accuracy |
|---|---|---|---|---|
| 56 | SAE | Power system fault diagnosis | BP | Accuracy 71.3% |
| 120 | WT-DNN | Fault detection for microgrids | DT, kNN, SVM, … | Effective accuracy |
| 121 | LSTM-SVM | Line trip fault prediction | BPNN, SAE, RNN, … | Accuracy 97.7% |
| 122 | DNN | WT gearbox failure identification | kNN, SVM, NN | Accuracy 89.3% |
| 123 | DBN-DNN, SAE | Fault diagnosis for WT gearbox | SVM | MAPE 6.01 |
| 124 | SAE | Fault diagnosis for SOFC system | SVM | Accuracy 79.94% |
| 125 | CSAE | Transformer fault diagnosis | BP | Accuracy 93.6% |
| 126 | DBN | Transformer fault diagnosis | SVM, BPNN | Accuracy 89.2% |
| 57 | DBN | Cable fault recognition | BP, SVM, ACCLN | Accuracy 97.8% |

Abbreviations: ACCLN, annealed chaotic competitive learning network; BPNN, back propagation NN; CSAE, continuous sparse autoencoder.
OZCANLI ET AL. 15

According to comprehensive results in References 56, 120, 121, DL algorithms perform much better than
conventional methods.

An FF-DNN-based framework is developed by the authors in Wang et al122 to detect and monitor WT gearbox
failures by using the lubricant pressure from SCADA data. The DNN method demonstrates more accurate
prediction results compared with five benchmarking data-driven methods. Similarly, the SAE and SVM
algorithms are proposed using a rotor current signal for the drivetrain gearboxes of wind turbines by Cheng
et al.123 Another fault diagnosis method based on the SAE algorithm is applied to solid oxide fuel cell
systems (SOFCs), which are extensively used in auxiliary power units and stationary power generators, in
Zhang et al.124 In the proposed model, the raw data of the SOFC system are used for training without any
preprocessing, so the model can be applied directly to real system measurements.

The transformer fault was diagnosed from the concentrations of gases dissolved in transformer oil in Wang
et al125 and Dai et al.126 The proposed models are built with sparse AE and DBN, respectively. According to
the results in Dai et al,126 multiple fault types and the number of sample sets have a direct impact on the
accuracy of the model. Similarly, a DBN architecture is proposed to identify underground cable faults in
Yizhe et al.57 The results demonstrate that DBN-based methods have distinct advantages compared with
shallow NNs.
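Several of the diagnosis studies above (eg, References 56, 123-125) follow a stacked autoencoder recipe: each autoencoder is pretrained to reconstruct its input, and the stacked encoders are then fine-tuned with a softmax layer that outputs the fault class. The sketch below shows one common way to arrange this in Keras; the input dimension, layer sizes and class count are hypothetical and are not taken from any of the cited papers.

# Sketch of a stacked autoencoder (SAE) fault classifier; all sizes are illustrative.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

N_FEATURES = 200   # assumed length of a measurement window
N_FAULTS = 6       # assumed number of fault classes
SIZES = [128, 64]  # hidden sizes of the two stacked autoencoders

def pretrain_autoencoder(inputs, units, epochs=20):
    """Train one autoencoder on `inputs` and return its (pretrained) encoder layer."""
    inp = keras.Input(shape=(inputs.shape[1],))
    encoder = layers.Dense(units, activation="relu")
    code = encoder(inp)
    out = layers.Dense(inputs.shape[1], activation="linear")(code)
    ae = keras.Model(inp, out)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(inputs, inputs, epochs=epochs, batch_size=64, verbose=0)
    return encoder

def build_sae_classifier(x_unlabeled):
    """Greedy layer-wise pretraining followed by a softmax classification head.
    x_unlabeled must have N_FEATURES columns."""
    encoders, features = [], x_unlabeled
    for units in SIZES:
        enc = pretrain_autoencoder(features, units)
        encoders.append(enc)
        features = enc(features).numpy()        # encoded data feeds the next autoencoder
    inp = keras.Input(shape=(N_FEATURES,))
    h = inp
    for enc in encoders:                        # reuse the pretrained encoder weights
        h = enc(h)
    out = layers.Dense(N_FAULTS, activation="softmax")(h)
    model = keras.Model(inp, out)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model   # fine-tune with model.fit(x_labeled, y_labeled, ...)

# Example with random placeholder data, just to exercise the pipeline:
# x_unlab = np.random.rand(1000, N_FEATURES).astype("float32")
# sae = build_sae_classifier(x_unlab)

Greedy pretraining of this kind is what lets SAE-based models start from unlabeled measurement windows before the often scarce labeled fault records are used for fine-tuning.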
3.5 | Other application fields in power systems

The DL methods used in areas other than those mentioned in the preceding sections are summarized in this
part. There are significant studies in the literature in the fields of energy management, energy
optimization and power system security. The security, reliability and economic issues of power systems are
being coordinated more tightly than ever by way of the deregulation of the electric power industry in recent
times. Therefore, there is a much greater need for fast and robust optimization tools than before. For
example, false data injection (FDI) attacks, composed of cyberattacks, are launched by attackers who can
manipulate the measurement data without causing any severe change in the physical structure of the system.
The conditional deep belief network (CDBN) structure is proposed to detect unobservable FDI attacks in the
smart grid by He et al.58 SAE is suggested to predict and detect power system security weak spots in
potential operation scenarios in another study.127 The simulation results put forth that model
simplification and data parallelism can reduce training time in a real system. Similarly, a novel
electricity-theft detection model including a wide and deep CNN is presented to secure the smart grid in
Zheng et al.128 Here, the wide component, which forms a fully-connected layer of NNs, memorizes the global
knowledge in the one-dimensional electricity consumption data, while the deep CNN component is trained on
two-dimensional electricity consumption data since it can accurately capture the non-periodicity of
electricity theft and the periodicity of normal electricity usage. Another CNN is applied for the detection
of solar photovoltaic panels in satellite imagery in Malof et al.129 The sample work includes the
determination of about 2700 individual distributed PV panels over an area of 135 km2. The proposed CNN
model does not overfit to the training data and generalizes well to previously unseen aerial imagery. A
significant study is carried out to estimate the capacity of lithium-ion batteries via a deep CNN algorithm
in Shen et al.130 The proposed method achieves promising accuracy despite limitations such as its
applicability under variable ambient temperature conditions.
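As a rough illustration of the wide and deep idea (not the exact architecture of Zheng et al128), the sketch below combines a fully-connected "wide" branch that sees the flat one-dimensional consumption record with a "deep" 2D-CNN branch that sees the same record arranged as a weeks-by-days matrix; the record length and all layer settings are assumptions.

# Sketch of a wide & deep CNN for electricity-theft detection (illustrative shapes).
from tensorflow import keras
from tensorflow.keras import layers

DAYS, WEEKS = 7, 148          # assumed: about 148 weeks of daily consumption readings
FLAT_LEN = DAYS * WEEKS

# Wide branch: fully-connected layer on the flat 1D consumption vector
wide_in = keras.Input(shape=(FLAT_LEN,), name="consumption_1d")
wide = layers.Dense(64, activation="relu")(wide_in)

# Deep branch: 2D CNN on the consumption record arranged as a weeks x days "image"
deep_in = keras.Input(shape=(WEEKS, DAYS, 1), name="consumption_2d")
deep = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(deep_in)
deep = layers.MaxPooling2D((2, 2))(deep)
deep = layers.Conv2D(64, (3, 3), activation="relu", padding="same")(deep)
deep = layers.GlobalAveragePooling2D()(deep)

merged = layers.concatenate([wide, deep])
out = layers.Dense(1, activation="sigmoid")(merged)   # theft (1) vs normal (0)

model = keras.Model([wide_in, deep_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[keras.metrics.AUC()])
# model.fit([x_flat, x_matrix], y, epochs=20, batch_size=128)

Feeding both views of the same consumption data lets the wide branch memorize customer-level regularities while the CNN branch picks up the periodic day/week structure that normal usage exhibits and theft tends to break.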
For the first time, the benefits of using the DRL model, a hybrid method consisting of a combination of DL
and RL, have been investigated with regard to the optimization of building energy systems (smart grid) by
Mocanu et al.131 The learning procedure has been applied using deep policy gradient (DPG) and deep
Q-learning methods, which are extended to perform many actions at the same time. These methods are proposed
for solving the same consecutive decision problem both at the individual and the aggregated building levels.
For both levels, the DPG method has been shown to be more suitable for on-line planning of energy resources.
François et al132 suggest designing an energy storage management strategy via the "Deep Reinforcement
Learning" hybrid method as a solution to the sequential decision-making problem (operating the storage
devices at the optimum level at all times). The proposed DRL architecture is based on a large continuous
non-handcrafted feature space that utilizes convolutional layers to extract significant features. The
proposed approach is empirically tested in the case of a residential customer located in Belgium, and it is
original in its overall validation process. A real-time optimal control strategy based on DL adaptive
dynamic programming for managing distributed energy in a microgrid is presented in Wu and Wang.133 The
results show that this dynamic, DL-based approach is quite effective for online real-time decision making,
reducing operational costs and so on.
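The deep Q-learning ingredient of these studies can be pictured with the toy sketch below: a small network maps a state vector (here assumed to be hour, price, storage level and forecast demand) to Q-values for a few discrete storage actions, and a one-step temporal-difference target is regressed from logged transitions. This is a deliberately simplified sketch, not the agents of Mocanu et al131 or François et al,132 and it omits standard DQN refinements such as a separate target network, replay memory and terminal-state handling.

# Toy deep Q-network update for energy storage scheduling (hypothetical problem setup).
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

STATE_DIM = 4            # e.g., [hour, price, storage level, forecast demand]
N_ACTIONS = 3            # e.g., charge, idle, discharge
GAMMA = 0.95             # discount factor

def build_q_network():
    return keras.Sequential([
        layers.Input(shape=(STATE_DIM,)),
        layers.Dense(64, activation="relu"),
        layers.Dense(64, activation="relu"),
        layers.Dense(N_ACTIONS, activation="linear"),   # one Q-value per action
    ])

q_net = build_q_network()
q_net.compile(optimizer=keras.optimizers.Adam(1e-3), loss="mse")

def dqn_update(states, actions, rewards, next_states):
    """Regress one batch of one-step Q-learning targets."""
    q_next = q_net.predict(next_states, verbose=0)
    targets = q_net.predict(states, verbose=0)
    targets[np.arange(len(actions)), actions] = rewards + GAMMA * q_next.max(axis=1)
    q_net.fit(states, targets, epochs=1, verbose=0)

# Example with random transitions, just to exercise the update:
# s  = np.random.rand(32, STATE_DIM); a = np.random.randint(N_ACTIONS, size=32)
# r  = np.random.rand(32);            s2 = np.random.rand(32, STATE_DIM)
# dqn_update(s, a, r, s2)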

Another study has been carried out on utilizing DL techniques to estimate the energy consumption and the
power generation together with numerical weather prediction in Sogabe et al.134 An optimization tool
platform using the Boltzmann machine algorithm for the NMIP problem has also been proposed in this study for
better computation in scalable decentralized renewable energy systems. In fact, the DL method has been used
to estimate energy consumption and production, but the forecast values have been integrated into the
optimization problem. For this reason, this study is covered here as a part of the optimization studies. In
Wang et al,59 a hybrid DL method is used for data analysis and emergency system management in large power
systems. This study is discussed in this section because the data analyzed by way of the DL method are
integrated into the optimization problem, as in Sogabe et al.134 Moreover, the papers are summarized in
Table 6. The papers are categorized by applied DL methods, application area, compared methods and accuracy
of the proposed models.

4 | ADVANTAGES AND CHALLENGES

It is clear that DL methods have several advantages over traditional methods. It is noticed from the studies
reviewed that higher performance accuracy may be obtained in comparison with traditional methods:

• DL can extract high-level features automatically even from insufficient labeled data, and it has high
computational efficiency in terms of the capability of parallelizing the computation. As a result, DL has
better generalization ability than traditional methods.
• DL has higher robustness in case of missing data, noisy data, uncertain data, etc.
• DL is quite suitable for the detection of power system disturbances considering the rapid growth of
monitoring devices in the multi-energy system.
• Another advantage in the implementation stage is that DL can be easily applied in programming languages
such as Python, which are widely used in data science. There are also dedicated platforms, such as
TensorFlow and Keras, which provide DL libraries (a minimal example is given after this list).

TABLE 6 Summary of reports for the application of deep learning (DL) approaches in various application areas

Reference DL methods Application Comparative methods Outcome/accuracy
58 CDBN Detection of false data injection in smart grid ANN, SVM Accuracy 93%
127 SAE Prediction of power system security weak spots Shallow model Accuracy 95.78%
128 Wide-Deep CNN Electricity-theft detection in smart grid SVM, RF, Wide, CNN AUC 0.78
129 CNN Solar photovoltaic array detection RF Precision rate 72%
130 Deep CNN Estimating capacity of lithium-ion batteries RVM RMSE 0.622%
131 DRL On-line building energy optimization Heuristic methods (e.g., PSO) —
132 DRL Microgrids energy management No comparison —
133 DNN Microgrid economic dispatch (ED) The robust and deterministic optimization methods —
134 DML (BMA) Optimization of decentralized renewable energy system No comparison —
59 Multilevel DL Big data analysis and emergency management of power system SVM —

Abbreviations: BMA, Boltzmann machine algorithm; DML, deep machine learning; DRL, deep reinforcement learning; RF, random forest; RVM, relevance vector machine; SVM, support vector machine.

Despite all of these benefits, DL methods in power systems also have some challenges, which can be indicated
as follows:

• One of the most important points that affect the accuracy of DNNs is the training data size, which is
particularly problematic in electrical power systems because generating training data on physical systems
can be expensive and time-consuming. There is a risk of overfitting if a sufficient amount of data is not
used, and in this case the test error is high even for low training errors.
• Another important feature that should be included in the data sets to improve the result is the criterion
of clean and noiseless data. Disturbed input sources, poor-quality data, the trustworthiness of data
analysis and limited labeled data have adverse impacts on the performance of the model.
• Model parameter selection is another important stage for an optimal deep architecture. There are no
specifically defined rules for the selection of model parameters, and hence it is difficult to find the most
appropriate model. A trial-and-error method is used for determining parameters such as the number of layers,
the number of neurons in each layer, the learning rate, the momentum and the number of epochs, and hence the
process may be time-consuming (a minimal sketch of such a validation-based search follows this list).
• The time elapsed in the detection of problems in electrical power systems is another parameter that must
be considered for the protection and operational sustainability of other systems. DL algorithms involve
mathematical operations over millions of connections and indirect, interconnected weights. In electrical
systems, especially in real-time applications, this process needs to be performed quickly, and so DL
algorithms need additional powerful hardware such as GPUs, which can perform parallel and distributed
computing.
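In practice, the overfitting and parameter-selection issues above are usually handled with a held-out validation set: early stopping halts training when the validation error stops improving, and a simple trial-and-error loop compares a handful of depth and learning-rate settings. The sketch below is one hedged way to do this with Keras; build_model, the candidate values, the input dimension and the data variables are placeholders.

# Validation-based early stopping and a simple trial-and-error search (placeholder setup).
from tensorflow import keras
from tensorflow.keras import layers

def build_model(n_layers, n_units, learning_rate, input_dim=24):
    model = keras.Sequential([layers.Input(shape=(input_dim,))])
    for _ in range(n_layers):
        model.add(layers.Dense(n_units, activation="relu"))
    model.add(layers.Dense(1))
    model.compile(optimizer=keras.optimizers.Adam(learning_rate), loss="mse")
    return model

early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                           restore_best_weights=True)

def trial_and_error(x_train, y_train):
    """Compare a few candidate configurations on a validation split."""
    best = (None, float("inf"))
    for n_layers in (1, 2, 3):
        for lr in (1e-2, 1e-3):
            model = build_model(n_layers, n_units=64, learning_rate=lr)
            hist = model.fit(x_train, y_train, epochs=100, batch_size=32,
                             validation_split=0.2, callbacks=[early_stop], verbose=0)
            val = min(hist.history["val_loss"])
            if val < best[1]:
                best = ((n_layers, lr), val)
    return best   # e.g., ((2, 0.001), 0.013)

More systematic alternatives such as random search or Bayesian optimization exist; the loop above simply mirrors the trial-and-error procedure described in the text.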
5 | CONCLUSIONS AND FUTURE PERSPECTIVES

Electric power systems have faced serious challenges such as power outages, losses, and faults in operation,
control and monitoring with the integration of distributed generation and its increased capacities.
Researchers have recently tried to use intelligent methods instead of traditional methods in order to cope
with these difficulties. DL has emerged as a state-of-the-art solution tool with robust training algorithms
and parallel computing capability where high-dimensional data are available. In this study, the methods
related to DL in electrical power systems have been reviewed comprehensively, and the developed algorithms
are summarized in detail in terms of the input data, the fields of application, the proposed methods and the
level of accuracy.

In this context, this review has examined 84 articles related to the usage of DL algorithms in EPS.
According to the search results, it is observed that AEs, deep CNNs and LSTM-RNNs have been the most widely
used methods in the literature so far. SAEs have been applied in 20% of all articles, and the authors of
these papers have taken advantage of their capability for feature extraction, dimensionality reduction and
data denoising. LSTM-RNNs have been proposed in 24% of all studies and applied frequently to time series
forecasting such as load, wind and solar forecasting. CNNs have been used in 20% of all papers for
detection, forecasting and classification problems. Among DL applications in EPS, the most widely studied
fields are electric power consumption and wind and solar power output forecasting. After that, DL algorithms
are applied extensively for the classification of PQDs and for fault detection of transformers, turbine
gearboxes, cables, etc. The remaining DL application studies on power systems are related to energy
management, energy optimization and power system security. On the other hand, there are very few studies on
power system protection and islanding detection. In this context, a wide research and application area of
electrical power system studies in terms of DL techniques is waiting to be explored. This study will serve
as a launching point for many electrical power system researchers who aim to apply DL approaches in their
profession.

Partial solutions have been suggested until now via DL techniques for EPS issues; however, the current
literature is still at a starting stage. There are still open questions on the DL structure, such as the best
algorithm, the number of layers, the type of hidden unit activation function, the learning rate, the
momentum, the number of training cases, the weights of neurons, etc. Even so, DL algorithms have great
application potential for EPS in the future, since the increasing integration of renewable energy sources in
power systems will facilitate the use of DL techniques thanks to intelligent monitoring systems and smart
metering. Moreover, DL can be appropriate for use alongside traditional methods as an auxiliary system in
terms of the reliability of the system, due to the fact that 100% accuracy cannot be achieved in solutions
realized with AI. However, DL algorithms remain on the agenda as promising approaches in spite of various
limitations and challenges in electrical power systems.

ACKNOWLEDGEMENT
We hereby wish to acknowledge the support provided by the Scientific and Technological Research Council of
Turkey under grant (2211-C).

ORCID
Asiye K. Ozcanli https://orcid.org/0000-0001-5536-5371
Mustafa Baysal https://orcid.org/0000-0002-6298-918X

REFERENCES
1. Bharati GR, Paudyal S. Coordinated control of distribution grid and electric vehicle loads. Electr Pow Syst Res. 2016;140:761-768. https://doi.org/10.1016/j.epsr.2016.05.031.
2. Golshannavaz S, Afsharnia S, Siano P. A comprehensive stochastic energy management system in reconfigurable microgrids. Int J Energy Res. 2016;40:1518-1531. https://doi.org/10.1002/er.
3. Yigit K, Acarkan B. A new ship energy management algorithm to the smart electricity grid system. Int J Energy Res. 2018;42:1-16. https://doi.org/10.1002/er.4062.
4. Raza MQ, Khosravi A. A review on artificial intelligence based load demand forecasting techniques for smart grid and buildings. Renew Sustain Energy Rev. 2015;50:1352-1372. https://doi.org/10.1016/j.rser.2015.04.065.
5. Bicer Y, Dincer I, Aydin M. Maximizing performance of fuel cell using artificial neural network approach for smart grid

applications. Energy. 2016;116:1205-1217. https://doi.org/10. and electric power systems. Int J Energy Res. 2019;43:1-46.
1016/j.energy.2016.10.050. https://doi.org/10.1002/er.4333.
6. Evangelopoulos VA, Georgilakis PS. Optimal operation of 22. McCulloch WS, Pitts W. A logical calculus of the ideas imma-
smart distribution networks: a review of models, methods and nent in nervous activity. Bull Math Biophys. 1943;5(4):115-133.
future research. Electr Pow Syst Res. 2016;140:95-106. https:// https://doi.org/10.1007/BF02478259.
doi.org/10.1016/j.epsr.2016.06.035. 23. Tesauro G. Practical issues in temporal difference learning.
7. Yaprakdal F, Baysal M. Optimal operational scheduling of Mach Learn. 1992;277:257-277.
reconfigurable microgrids in presence of renewable energy 24. Bengio Y, Lecun Y. Scaling learning algorithms towards AI.
sources. Energies. 2019;12(10):1858. https://doi.org/10.3390/ Large-Scale Kernel Machines. Vol 1. Cambridge, MA: MIT
en12101858. Press; 2007:1-41.
8. Ku C, Lee KY. Diagonal recurrent neural networks for dynamic 25. Schmidhuber J. Deep learning in neural networks: an over-
systems control. IEEE Trans Neural Netw. 1995;6(1):144-156. view. Neural Netw. 2015;61:85-117. https://doi.org/10.1016/j.
9. Kermany SD, Joorabian M, Deilami S, Masoum MAS. Hybrid neunet.2014.09.003.
islanding detection in microgrid with multiple connection 26. Hinton G, Asindero S, Whye TY. A fast learning algorithm for
points to smart grids using fuzzy-neural network. IEEE Trans deep belief nets. Neural Comput. 2006;18(7):1527-1554.
Power Syst. 2017;32(4):2640-2651. https://doi.org/10.1109/ https://doi.org/10.1162/neco.2006.18.7.1527.
TPWRS.2016.2617344. 27. Lecunn Y, Bottou L, Bengiu Y, Haffner P. Reducing the
10. Hinton G, Deng L, Yu D, et al. Deep neural networks for dimensionality of data with neural networks. Science. 2006;
acoustic modeling in speech recognition. IEEE Signal Process 313(5786):504-507. https://doi.org/10.1126/science.1127647.
Mag. 2012;29(6):82-97. https://doi.org/10.1109/MSP.2012. 28. Yue T, Wang H. Deep learning for genomics: a concise over-
2205597. view. Handbook of Deep Learning Applications; 2018. https://
11. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification doi.org/10.1021/acs.molpharmaceut.5b00982.
with deep convolutional neural networks. Adv Neural Inf Pro- 29. NVIDIA. Deep learnıng for self-drıvıng cars. https://www.
cess Syst. 2012;1:1-9. https://doi.org/10.1016/j.protcy.2014. nvidia.com/en-us/deep-learning-ai/industries/automotive/.
09.007. Accessed May 10, 2018.
12. Bengio Y. Learning deep architectures for AI. Found Trends® 30. Gashler MS. Deep learning in robotics: a review of recent
Mach Learn. 2009;2(1):1-127. https://doi.org/10.1561/2200000006. research. arXiv:1707.07217. 1–41.
13. Min S, Lee B, Yoon S. Deep learning in bioinformatics. Brief 31. Goodfellow I, Bengio Y, Courville A. Deep Learning. Cam-
Bioinform. 2017;18(5):851-869. https://doi.org/10.1093/bib/ bridge, MA: MIT Press. http://www.deeplearningbook.org/.
bbw068. Accessed 9 October 2019; 2016.
14. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learn- 32. Bengio Y, Haffner P. Gradient-based learning applied to docu-
ing in medical image analysis. Med Image Anal J. 2017;42:60- ment recognition. Proc IEEE. 1998;86(11):2278-2324.
88. https://doi.org/10.1016/j.media.2017.07.005. 33. Rawat W. Deep convolutional neural networks for image clas-
15. Wang J, Ma Y, Zhang L, Gao RX, Wu D. Deep learning for sification: a comprehensive review. Neural Comput. 2017;
smart manufacturing: methods and applications. J Manuf 2449:2352-2449. https://doi.org/10.1162/NECO.
Syst. 2018;48:1-13. https://doi.org/10.1016/j.jmsy.2018.01.003. 34. Bengio Y, Courville A, Vincent P. Representation learning: a
16. Nguyen VN, Jenssen R, Roverso D. Electrical power and review and new perspectives. IEEE Trans Softw Eng. 2013;35
energy systems automatic autonomous vision-based power (8):1798-1828. https://doi.org/10.1145/1756006.1756025.
line inspection: a review of current status and the potential 35. Erhan D, Courville A, Vincent P. Why does unsupervised pre-
role of deep learning. Electr Power Energy Syst. 2018;99:107- training help deep learning? J Mach Learn Res. 2010;11(2007):
120. https://doi.org/10.1016/j.ijepes.2017.12.016. 625-660. https://doi.org/10.1145/1756006.1756025.
17. Helbing G, Ritter M. Deep learning for fault detection in wind 36. Bengio Y, Lamblin P, Popovici D, Larochelle H. Greedy layer-
turbines. Renew Sustain Energy Rev. 2018;98:189-198. https:// wise training of deep networks. Adv Neural Inf Process Syst.
doi.org/10.1016/j.rser.2018.09.012. 2007;19(1):153.
18. Almalaq A, Edwards G. A review of deep learning methods 37. Vincent P, Larochelle H. Extracting and composing
applied on load forecasting. Paper presented at: 2017 16th robust features with denoising autoencoders. Paper pres-
IEEE International Conference on Machine Learning and ented at: International Conference on Machine Learning,
Applications (ICMLA), Cancun, 2017. https://doi.org/10. 2008.
1109/ICMLA.2017.0-110. 38. Hinton GE. Learning multiple layers of representation. Trends
19. Wang H, Lei Z, Zhang X, Zhou B, Peng J. A review of deep Cogn Sci. 2007;11(10):428-434. https://doi.org/10.1016/j.tics.
learning for renewable energy forecasting. Energ Conver Man- 2007.09.004.
age. 2019;198:111799. https://doi.org/10.1016/j.enconman. 39. Rumelhart DE. Parallel Distributed Processing: Explorations in the
2019.111799. Microstructure of Cognition. Cambridge, MA: MIT Press; 1986.
20. Zhang D, Han X, Deng C. Review on the research and prac- 40. Pearl J. Probabilistic Reasoning in Intelligent Systems: Networks
tice of deep learning and reinforcement learning in smart of Plausible Inference. San Mateo, CA: Morgan Kaufmann
grids. CSEE J Power Energy Syst. 2018;4(3):362-370. https:// Publishers; 1988.
doi.org/10.17775/CSEEJPES.2018.00520. 41. Salakhutdinov R, Hinton G. Deep Boltzmann machines.
21. Cheng L, Yu T. A new generation of AI: a review and perspec- In van Dyk D, and Welling M. (Eds). Int Conf Artif Intell Stat.
tive on machine learning technologies applied to smart energy 2009;5:448-455.

42. Salakhutdinov R, Larochelle H. Efficient learning of deep management of power system. IEEE Int Conf Big Data Anal.
Boltzmann machines. Int Conf Artif Intell Stat. 2010;9: 2016;1-5. https://doi.org/10.1109/ICBDA.2016.7509811.
693-700. 60. Saviozzi M, Massucco S, Silvestro F. Implementation of
43. Salakhutdinov R, Hinton G, Hinton G. An Efficient Learning advanced functionalities for distribution management sys-
Procedure for Deep Boltzmann Machines. Cambridge, MA: tems: load forecasting and modeling through Artificial neural
MIT Press; 2010. networks ensembles. Electr Pow Syst Res. 2019;167:230-239.
44. Xu J, Li H, Zhou S. An overview of deep generative models. https://doi.org/10.1016/j.epsr.2018.10.036.
IETE Tech Rev. 2015;32(2):131-139. https://doi.org/10.1080/ 61. Pappas SS, Ekonomou L, Karampelas P, et al. Electricity
02564602.2014.987328. demand load forecasting of the Hellenic power system using
45. Williams RJ, Zipser D. A learning algorithm for continually an ARMA model. Electr Pow Syst Res. 2010;80:256-264.
running fully recurrent neural networks. Neural Comput. https://doi.org/10.1016/j.epsr.2009.09.006.
1989;1:270-280. https://doi.org/10.1162/neco.1989.1.2.270. 62. Zheng J, Xu C, Zhang Z, Li X. Electric load forecasting in
46. Chung J, Gulcehre C, Cho K, Bengio Y. Empirical evaluation smart grids using long-short-term-memory based recurrent
of gated recurrent neural networks on sequence modeling, neural network. Annu Conf Inf Sci Syst. 2017;1-6. https://doi.
2014. arXiv: 1412.3555. http://arxiv.org/abs/1412.3555 org/10.1109/CISS.2017.7926112.
47. Hochrejter S, Schmidhuber J. Long short term memory. Neu- 63. Bouktif S, Fiaz A, Ouni A, Serhani MA. Optimal deep learn-
ral Comput. 1997;9(8):1735-1780. ing LSTM model for electric load forecasting using feature
48. He W. Load forecasting via deep neural networks. Procedia selection and genetic algorithm: comparison with machine
Comput Sci. 2017;122:308-314. https://doi.org/10.1016/j.procs. learning approaches. Energies. 2018;11(7):1636. https://doi.
2017.11.374. org/10.3390/en11071636.
49. Qiu X, Ren Y, Suganthan PN, Amaratunga GAJ. Empirical 64. Dedinec A, Filiposka S, Dedinec A, Kocarev L. Deep belief
mode decomposition based ensemble deep learning for load network based electricity load forecasting: an analysis of Mac-
demand time series forecasting. Appl Soft Comput. 2017;54: edonian case. Energy. 2016;115:1688-1700. https://doi.org/10.
246-255. https://doi.org/10.1016/j.asoc.2017.01.015. 1016/j.energy.2016.07.090.
50. Hu Q, Zhang R, Zhou Y. Transfer learning for short-term 65. Mocanu E, Nguyen PH, Gibescu M, Larsen EM, Pinson P.
wind speed prediction with deep neural networks. Renew Demand forecasting at low aggregation levels using factored
Energy. 2016;85:83-95. https://doi.org/10.1016/j.renene.2015. conditional RBM. Paper presented at: 19th Power Systems
06.034. Computation Conference, 2016. https://doi.org/10.1109/PSCC.
51. Hu YL, Chen L. A nonlinear hybrid wind speed forecasting 2016.7540994.
model using LSTM network, hysteretic ELM and differential 66. Kuo PH, Huang CJ. A high precision artificial neural net-
evolution algorithm. Energ Conver Manage. 2018;173:123-142. works model for short-term energy load forecasting. Energies.
https://doi.org/10.1016/j.enconman.2018.07.070. 2018;11(1):1-13. https://doi.org/10.3390/en11010213.
52. Abdel-Nasser M, Mahmoud K. Accurate photovoltaic power 67. Varga ED, Beretka SF, Noce C, Sapienza G. Robust real-time
forecasting models using deep LSTM-RNN. Neural Comput load profile encoding and classification framework for efficient
Applic. 2017;31:1-14. https://doi.org/10.1007/s00521-017- power systems operation. IEEE Trans Power Syst. 2014;30(4):
3225-z. 1897-1904. https://doi.org/10.1109/TPWRS.2014.2354552.
53. Alzahrani A, Shamsi P, Dagli C, Ferdowsi M. Solar irradiance 68. Li L, Ota K, Dong M. When weather matters: IoT-based elec-
forecasting using deep recurrent neural networks. ICRERA trical load forecasting for smart grid. IEEE Commun Mag.
USA. 2017;114:304-313. https://doi.org/10.1016/j.procs.2017. 2017;55(10):46-51. https://doi.org/10.1109/MCOM.2017.
09.045. 1700168.
54. Ma J, Zhang J, Xiao L, Chen K, Wu J. Classification of power 69. Wang L, Zhang Z, Chen J. Short-term electricity price fore-
quality disturbances via deep learning. IETE Tech Rev. 2017; casting with stacked denoising autoencoders. IEEE Trans
34(4):408-415. https://doi.org/10.1080/02564602.2016.1196620. Power Syst. 2017;32(4):2673-2681. https://doi.org/10.1109/
55. Kong X, Xu X, Yan Z, Chen S, Yang H, Han D. Deep learning TPWRS.2016.2628873.
hybrid method for islanding detection in distributed genera- 70. Li L, Ota K, Dong M. Everything is image: CNN-based short-
tion. Appl Energy. 2018;210:776-785. https://doi.org/10.1016/j. term electrical load forecasting for smart grid. Paper presented
apenergy.2017.08.014. at: 11th International Conference on Frontier of Computer
56. Yixing W, Meiqin LIU, Zhejing BAO. Deep learning neural Science and Technology, Exeter, UK. 2017:344–351. https://
network for power system fault diagnosis. Paper presented at: doi.org/10.1109/ISPAN-FCST-ISCC.2017.78.
35th Chinese Control Conference, 2016:6678–6683. 71. Kong W, Dong ZY, Jia Y, Hill DJ, Xu Y, Zhang Y. Short-term
57. Yizhe Z, Wang M, Gang D, Jun G, Pai W. A cable fault recog- residential load forecasting based on LSTM recurrent neural
nition method based on a deep belief network. Comput Electr network. IEEE Trans Smart Grid. 2017;3053(c):1-11. https://
Eng. 2018;71:452-464. https://doi.org/10.1016/j.compeleceng. doi.org/10.1109/TSG.2017.2753802.
2018.07.043. 72. Wang Y, Gan D, Sun M, Zhang N, Lu Z, Kang C. Probabilistic
58. He Y, Mendis GJ, Wei J. Real-time detection of false data injec- individual load forecasting using pinball loss guided LSTM.
tion attacks in smart grid: a deep learning-based intelligent Appl Energy. 2019;235:10-20. https://doi.org/10.1016/j.
mechanism. IEEE Trans Smart Grid. 2017;8(5):2505-2516. apenergy.2018.10.078.
59. Wang XZ, Zhou J, Huang ZL, Bi XL, Ge ZQ, Li L. A multilevel 73. Kong W, Dong ZY, Hill DJ, Luo F, Xu Y. Short-term residen-
deep learning method for big data analysis and emergency tial load forecasting based on resident behaviour learning.

IEEE Trans Power Syst. 2018;33(1):1-1088. https://doi.org/10. series techniques. Appl Energy. 2019;236:1078-1088. https://
1109/TPWRS.2017.2688178. doi.org/10.1016/J.APENERGY.2018.12.042.
74. Shi H, Xu M, Li R. Deep learning for household load 88. Qureshi AS, Khan A, Zameer A, Usman A. Wind power pre-
forecasting—a novel pooling deep RNN. IEEE Trans Smart Grid. diction using deep neural network based meta regression and
2018;9:5271-5280. https://doi.org/10.1109/TSG.2017.2686012. transfer learning. Appl Soft Comput J. 2017;58:742-755.
75. Fan C, Wang J, Gang W, Li S. Assessment of deep recurrent https://doi.org/10.1016/j.asoc.2017.05.031.
neural network-based strategies for short-term building 89. Khodayar M, Kaynak O, Khodayar ME. Rough deep neural
energy predictions. Appl Energy. 2019;236:700-710. https:// architecture for short-term wind speed forecasting. IEEE
doi.org/10.1016/j.apenergy.2018.12.004. Trans Ind Inform. 2017;13(6):2770-2779.
76. Yan K, Wang X, Du Y, Jin N, Huang H, Zhou H. Multi-step 90. Dalto M, Matuško J, Vašak M. Deep neural networks for
short-term power consumption forecasting with a hybrid deep ultra-short-term wind forecasting. Paper presented at: IEEE
learning strategy. Energies. 2018;11(11):1-15. https://doi.org/ International Conference on Industrial Technology
10.3390/en11113089. (ICIT), Seville. 2015:1657-1663.
77. Coelho VN, Coelho IM, Rios E, et al. A hybrid deep learning 91. Tao Y, Chen H. Wind power prediction and pattern feature
forecasting model using GPU disaggregated function evalua- based on deep learning method. Paper presented at: IEEE PES
tions applied for household electricity demand forecasting. Asia-Pacific Power and Energy Engineering Conference
Energy Procedia. 2016;103:280-285. https://doi.org/10.1016/j. (APPEEC), Hong Kong, China. 2014:1–4.
egypro.2016.11.286. 92. Zhang C, Chen CLP, Gan M, Chen L. Predictive deep
78. Ryu S, Noh J, Kim H. Deep neural network based demand side Boltzmann machine for multiperiod wind speed forecast-
short term load forecasting. Paper presented at: 2016 IEEE ing. IEEE Trans Sustain Energy. 2015;6(4):1416-1425.
International Conference on Smart Grid Communications, 93. Wang K, Qi X, Liu H, Song J. Deep belief network based k-
Sydney, NSW, Australia. 2016:308–313. https://doi.org/10. means cluster approach for short-term wind power forecast-
1109/SmartGridComm.2016.7778779. ing. Energy. 2018;165:840-852. https://doi.org/10.1016/j.
79. Amarasinghe K, Marino DL, Manic M. Deep neural net- energy.2018.09.118.
works for energy load forecasting. Paper presented at: 94. Wu W, Chen K, Qiao Y, Lu Z. “Probabilistic short-term wind
IEEE 26th International Symposium on Industrial Elec- power forecasting based on deep neural networks,” 2016
tronics, 2017:1483–1488. https://doi.org/10.1109/ISIE.2017. International Conference on Probabilistic Methods Applied to
8001465 Power Systems (PMAPS), Beijing, 2016;1-8. https://doi.org/10.
80. Mocanu E, Nguyen PH, Gibescu M, Kling WL. Deep learning 1109/PMAPS.2016.7764155.
for estimating building energy consumption. Sustain Energy 95. Wang H, Li G, Wang G, Peng J, Jiang H, Liu Y. Deep learning
Grids Netw. 2016;6:91-99. https://doi.org/10.1016/j.segan.2016. based ensemble approach for probabilistic wind power fore-
02.005. casting. Appl Energy. 2017;188:56-70. https://doi.org/10.1016/
81. Mocanu E, Nguyen PH, Kling WL, Gibescu M. Unsupervised j.apenergy.2016.11.111.
energy prediction in a smart grid context using reinforcement 96. Qin Y, Li K, Liang Z, et al. Hybrid forecasting model based on
cross-building transfer learning. Energ Buildings. 2016;116: long short term memory network and deep learning neural
646-655. https://doi.org/10.1016/j.enbuild.2016.01.030. network for wind signal. Appl Energy. 2019;236:262-272.
82. Coelho IM, Coelho VN, Eduardo J, Luz S, Ochi LS, https://doi.org/10.1016/j.apenergy.2018.11.063.
Guimar~aes FG. A GPU deep learning metaheuristic based 97. Liu H, Mi X, Li Y. Smart deep learning based wind
model for time series forecasting. Appl Energy. 2017;201:412- speed prediction model using wavelet packet decomposi-
418. https://doi.org/10.1016/j.apenergy.2017.01.003. tion, convolutional neural network and convolutional
83. Qiu X, Zhang L, Ren Y, Suganthan PN. Ensemble deep learn- long short term memory network. Energ Conver Manage.
ing for regression and time series forecasting. Paper presented 2018;166:120-131. https://doi.org/10.1016/j.enconman.2018.
at: IEEE Symposium on Computational Intelligence in Ensem- 04.021.
ble Learn, 2014. https://doi.org/10.1109/CIEL.2014.7015739 98. Xiaoyun Q, Xiaoning K, Chao Z, Shuai J, Ma X. Short-
84. Rahman A, Srikumar V, Smith AD. Predicting electricity con- term prediction of wind power based on deep long
sumption for commercial and residential buildings using deep short-term memory. Paper presented at: IEEE PES Asia-
recurrent neural networks. Appl Energy. 2018;212:372-385. Pacific Power and Energy Engineering Conference,
https://doi.org/10.1016/j.apenergy.2017.12.051. Xi'an, China. 2016. https://doi.org/10.1109/APPEEC.2016.
85. Li C, Ding Z, Zhao D, Yi J, Zhang G. Building energy 7779672.
consumption prediction: an extreme deep learning 99. Du W, Chen J, Zeng G-Q, Lu K-D, Zhou W. Wind speed fore-
approach. Energies. 2017;10(10):1525. https://doi.org/10. casting using nonlinear-learning ensemble of deep learning
3390/en10101525. time series prediction and extremal optimization. Energ Con-
86. Berriel RF, Lopes AT, Rodrigues A, Varejao FM, Oliveira- ver Manage. 2018;165:681-695. https://doi.org/10.1016/j.
Santos T. Monthly energy consumption forecast: a deep learn- enconman.2018.03.098.
ing approach. Paper presented at: International Joint Confer- 100. Liu H, Mi X-W, Li Y-F. Wind speed forecasting method based
ence on Neural Networks, Anchorage, AK, USA. 2017: on deep learning strategy using empirical wavelet transform,
4283–4290. https://doi.org/10.1109/IJCNN.2017.7966398. long short term memory neural network and Elman neural
87. Cai M, Pipattanasomporn M, Rahman S. Day-ahead building- network. Energ Conver Manage. 2019;156:498-514. https://doi.
level load forecasts using deep learning vs. traditional time- org/10.1016/j.enconman.2017.11.053.

101. Gensler A, Henze J, Sick B, Raabe N. Deep learning for 115. Liu H, Hussain F, Shen Y, Arif S, Nazir A, Abubakar M. Com-
solar power forecasting—an approach using AutoEncoder plex power quality disturbances classification via curvelet
and LSTM neural networks. Paper presented at: IEEE Inter- transform and deep learning. Electr Pow Syst Res. 2018;163:1-
national Conference on Systems, Man, and Cybernetics 9. https://doi.org/10.1016/j.epsr.2018.05.018.
(SMC), Budapest, Hungary. 2016:2858–2865. https://doi.org/ 116. Shen Y, Abubakar M, Liu H, Hussain F. Power quality distur-
10.1109/SMC.2016.7844673. bance monitoring and classification based on improved PCA
102. Li L-L, Cheng P, Lin H-C, Dong H. Short-term output power and convolution neural network for wind-grid. Energies. 2019;
forecasting of photovoltaic systems based on the deep belief 12:1280. https://doi.org/10.3390/en12071280.
net. Adv Mech Eng. 2017;9(9). https://doi.org/10.1177/ 117. Mohan N, Soman KP, Vinayakumar R. Deep power: deep
1687814017715983. learning architectures for power quality disturbances classifica-
103. Neo YQ, Teo TT, Woo WL, Logenthiran T, Sharma A. Fore- tion. 2017 International Conference on Technological
casting of photovoltaic power using deep belief network. Advancements in Power and Energy (TAP Energy 2017). 2018:
Paper presented at: TENCON 2017—2017 IEEE Region 1–6. https://doi.org/10.1109/TAPENERGY.2017.8397249.
10 Conference, Penang, Malaysia. 2017:1189–1194. https:// 118. Wang J, Xu Z, Che Y. Power quality disturbance classification
doi.org/10.1109/TENCON.2017.8228038. based on compressed sensing and deep convolution neural
104. Wang H, Yi H, Peng J, et al. Deterministic and probabilistic networks. IEEE Access. 2019;7:78336-78346. https://doi.org/
forecasting of photovoltaic power based on deep con- 10.1109/ACCESS.2019.2922367.
volutional neural network. Energ Conver Manage. 2017;153: 119. Liu H, Hussain F, Yue S, Yildirim O. Classification of multiple
409-422. https://doi.org/10.1016/j.enconman.2017.10.008. power quality events via compressed deep learning. Int Trans
105. Alzahrani A, Shamsi P, Dagli C, Ferdowsi M. Solar irradiance Electr Energy Syst. 2019;29:1-14. https://doi.org/10.1002/2050-
forecasting using deep neural networks. Procedia Comput Sci. 7038.12010.
2017;114:304-313. https://doi.org/10.1016/j.procs.2017.09.045. 120. Yu JJQ, Hou Y, Lam AYS, Li VOK. Intelligent fault detection
106. Mishra M. Power quality disturbance detection and classifica- scheme for microgrids with wavelet-based deep neural net-
tion using signal processing and soft computing techniques: a works. IEEE Trans Smart Grid. 2017;3053(c):1-10. https://doi.
comprehensive review. Int Trans Electr Energy Syst. 2019;29: org/10.1109/TSG.2017.2776310.
e12008. https://doi.org/10.1002/2050-7038.12008. 121. Zhang S, Wang Y, Liu M, Bao Z. Data-based line trip fault pre-
107. Mahdi M, Genc VMI. Post-fault prediction of transient insta- diction in power systems using LSTM networks and SVM.
bilities using stacked sparse autoencoder. Electr Pow Syst Res. IEEE Access. 2017;6:7675-7686. https://doi.org/10.1109/
2018;164:243-252. https://doi.org/10.1016/j.epsr.2018.08.009. ACCESS.2017.2785763.
108. Shi X, Yang HUI, Xu Z, Zhang X, Farahani MR. An indepen- 122. Wang L, Zhang Z, Long H, Xu J, Liu R. Wind turbine gearbox
dent component analysis classification for complex power failure identification with deep neural networks. IEEE Trans
quality disturbances with sparse autoencoder features. IEEE Ind Inform. 2017;13(3):1360-1368.
Access. 2019;7:20961-20966. https://doi.org/10.1109/ACCESS. 123. Cheng F, Wang J, Qu L. Rotor current-based fault diagnosis
2019.2898211. for DFIG wind turbine drivetrain gearboxes using frequency
109. Deng Y, Wang L, Jia H, Tong X, Li F. A sequence-to-sequence analysis and a deep classifier. IEEE Trans Ind Appl. 2018;54
deep learning architecture based on bidirectional GRU for (2):1062-1071. https://doi.org/10.1109/TIA.2017.2773426.
type recognition and time location of combined power quality 124. Zhang Z, Li S, Xiao Y, Yang Y. Intelligent simultaneous fault
disturbance. IEEE Trans Ind Inform. 2019;15:4481-4493. diagnosis for solid oxide fuel cell system based on deep learn-
https://doi.org/10.1109/TII.2019.2895054. ing. Appl Energy. 2019;233-234:930-942. https://doi.org/10.
110. Qiu W, Tang Q, Liu J, Yao W. An automatic identification 1016/j.apenergy.2018.10.113.
framework for complex power quality disturbances based on 125. Wang L, Zhao X, Pei J, Tang G. Transformer fault diagnosis
multifusion convolutional neural network. IEEE Trans Ind using continuous sparse autoencoder. SpringerPlus. 2016;5:
Inform. 2020;16:3233-3241. https://doi.org/10.1109/TII.2019. 448. https://doi.org/10.1186/s40064-016-2107-7.
2920689. 126. Dai J, Song H, Sheng G, Jiang X. Dissolved gas analysis of
111. Balouji E. A LSTM-based deep learning method with applica- insulating oil for power transformer fault diagnosis with deep
tion to voltage dip classification. Paper presented at: 18th Inter- belief network. IEEE Trans Dielectr Electr Insul. 2017;24:2828-
national Conference on Harmonics and Quality of Power, 2835. https://doi.org/10.1109/TDEI.2017.006727.
2018:1–5. https://doi.org/10.1109/ICHQP.2018.8378893. 127. Huang T, Guo Q, Sun H, Tan C, Hu T. A deep spatial-
112. Li C, Li Z, Jia N, Qi Z, Wu J. Classification of power-quality temporal data-driven approach considering microclimates for
disturbances using deep belief network. Paper presented at: power system security assessment. Appl Energy. 2019;237:36-
International Conference on Wavelet Analysis and Pattern 48. https://doi.org/10.1016/j.apenergy.2019.01.013.
Recognition (ICWAPR), Chengdu, China. 2018:231–237. 128. Zheng Z, Yatao Y, Niu X, Dai H-N, Zhou Y. Wide and deep
113. Mei F, Ren Y, Wu Q, Zhang C, Pan Y, Sha H. Online recogni- convolutional neural networks for electricity-theft detection
tion method for voltage sags based. Energies. 2018;12(1):43. to secure smart grids. IEEE Trans Ind Inform. 2017;3203(c):1-
https://doi.org/10.3390/en12010043. 1615. https://doi.org/10.1109/TII.2017.2785963.
114. Wang S, Chen H. A novel deep learning method for the classi- 129. Malof JM, Collins LM, Bruadbury K, Newell RG. A deep con-
fication of power quality disturbances using deep con- volutional neural network and a random forest classifier for
volutional neural network. Appl Energy. 2019;235:1126-1140. solar photovoltaic array detection in aerial imagery. Int Conf
https://doi.org/10.1016/j.apenergy.2018.09.160. Renew Energy Res Appl. 2016;5:5-9.

130. Shen S, Sadoughi M, Chen X, Hong M, Hu C. A deep learning 134. Sogabe T, Haruhisa I, Sakamoto K, Yamaguchi K,
method for online capacity estimation of lithium-ion batteries. Sogabe M, Sato T. Optimization of decentralized renewable
J Energy Storage. 2019;25:100817. https://doi.org/10.1016/j.est. energy system by weather forecasting and deep machine
2019.100817. learning techniques. IEEE Innovative Smart Grid Technol.
131. Mocanu E, Mocanu DC, Nguyen PH, et al. On-line building 2016;1-5.
energy optimization using deep reinforcement learning. arXiv:
1707.05878, 2017:1–9.
132. François-lavet V, Fonteneau R, Ernst D. Deep reinforcement
learning solutions for energy microgrids management. Paper How to cite this article: Ozcanli AK,
presented at: European Workshop on Reinforcement Learn- Yaprakdal F, Baysal M. Deep learning methods
ing, 2016:1–7.
and applications for electrical power systems: A
133. Wu N, Wang H. Deep learning adaptive dynamic program-
ming for real time energy management and control strategy of
comprehensive review. Int J Energy Res. 2020;1–22.
micro-grid. J Clean Prod. 2018;204:1169-1177. https://doi.org/ https://doi.org/10.1002/er.5331
10.1016/j.jclepro.2018.09.052.
