
Energy 281 (2023) 128256


A comprehensive review of machine learning and IoT solutions for demand side energy management, conservation, and resilient operation

Mahmoud Elsisi a,b,*, Mohammed Amer c,***, Alya’ Dababat d, Chun-Lien Su a,**
a Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, 807618, Taiwan
b Department of Electrical Engineering, Faculty of Engineering at Shoubra, Benha University, Cairo, 11629, Egypt
c Department of Mechanical Engineering, Palestine Technical University – Kadoorie, Tulkarm, Palestine
d Department of Computer Systems Engineering, Faculty of Engineering and Information Technology, Arab American University, Palestine

A R T I C L E  I N F O

Handling editor: Henrik Lund

Keywords: Energy conservation; Energy management; Machine learning; IoT; Control; Decision making

A B S T R A C T

The energy consumption of major equipment in residential and industrial facilities can be minimized through a variety of cost-effective energy-saving measures. Most saving strategies are economically viable, and several algorithms can be employed to reduce energy consumption and thereby cut costs to a considerable extent. Machine learning (ML) is one of these techniques. A review of recent research efforts concerning the application of ML strategies to energy conservation and management problems is presented in this study. In addition, ML approaches and strategies for energy-saving problems, management, technologies, and control methods are discussed. A comprehensive review of the available publications is also used to make observations about past considerations. It is concluded that ML is capable of solving a wide range of decision and management problems within a short period of time and with minimal energy consumption. In addition, ML is viewed from the perspective of emerging communication technologies, instruments, and cyber-physical systems (CPSs), along with the advancement of ultra-durable and energy-efficient Internet-of-Things (IoT) based communication sensor technology. Moreover, a comprehensive review of recent developments in ML algorithms is included, covering safe reinforcement learning (RL), deep RL, path integral control for RL, and others not reviewed previously. Lastly, critical ML considerations such as emergency and remedial measures, integrity protection, fusion with existing robust controls, and the combination of preventive and emergent measures are discussed. The implementation of recently applied ML, RL, and IoT strategies for energy management, conservation, and resilient operation is clarified in this paper. The review highlights the advantages and drawbacks of recent energy conservation strategies. Finally, perspective solutions are clarified to cope with the worldwide move toward zero-energy buildings.

1. Introduction

There has been a growing concern about energy in many countries over the past few years. The trend toward almost zero-energy buildings has been gaining momentum in major countries around the world [1]. Various techniques have been employed to accomplish this, but the process is not as straightforward as it appears due to the cost associated with change. With the advent of artificial intelligence (AI), energy savings or energy management can now be accomplished. Numerous intelligent energy management systems (IEMSs) have been developed in recent years to conserve energy in both industrial and domestic applications. IEMSs are designed to manage the energy needs of multiple facilities using the Internet of Things (IoT) and AI techniques [2–4]. Smart systems can be implemented to reduce energy consumption and labor costs and to extend device lifespans by managing the various facilities used in their design [5]. As far as the planning and construction of IEMSs are concerned, smart IoT technology is increasingly being incorporated. AI is being integrated into existing environments as a means of conserving energy. Energy management strategies are then being investigated for enhancing energy efficiency in existing systems [6].

* Corresponding author. Department of Electrical Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, 807618, Taiwan.
** Corresponding author. National Kaohsiung University of Science and Technology, Department of Electrical Engineering , Kaohsiung, 807618, Taiwan.
*** Corresponding author. Department of Mechanical Engineering, Palestine Technical University – Kadoorie, Tulkarm, Palestine.
E-mail addresses: mahmoudelsisi@nkust.edu.tw (M. Elsisi), mohammed.amer@ptuk.edu.ps (M. Amer), a.dababat2@student.aaup.edu (A. Dababat), cls@nkust.edu.tw (C.-L. Su).

https://doi.org/10.1016/j.energy.2023.128256
Received 12 March 2023; Received in revised form 21 May 2023; Accepted 23 June 2023
Available online 26 June 2023
0360-5442/© 2023 Elsevier Ltd. All rights reserved.

Nomenclature

AE Autoencoder
AI Artificial Intelligence
ANN Artificial Neural Network
BEMS Building Energy Management System
CNN Convolutional Neural Network
CPS Cyber-Physical System
DBNs Deep Belief Networks
DBSCAN Density-Based Spatial Clustering
DNNs Deep Neural Networks
DQNs Deep Q-Networks
DR Demand Response
DRL Deep Reinforcement Learning
DT Decision Tree
EL Ensemble Learning
ELMs Extreme Learning Machines
FC Fully Connected
FMDP Finite Markov Decision Process
GAN Generative Adversarial Network
GP Gaussian Random Process Distribution
GPR Gaussian Process Regression
GRU Gate Recurrent Unit
HVAC Heating, Ventilation, and Air Conditioning
ICA-BEMS Intelligent Context-Aware Building Energy Monitoring System
IEMS Intelligent Energy Management System
IoE Internet of Everything
IoT Internet of Things
IPM Interior Point Method
k-NN k-Nearest Neighbor
LR Linear Regression
LSTM Long Short-Term Memory
MAS Multi-Agent System
MEC Mobile Edge Computing
ML Machine Learning
MLP Multilayer Perceptron
PCA Principal Component Analysis
QoS Quality of Service
RC Remote Controlled
RC Reservoir Computing
RF Random Forests
RL Reinforcement Learning
RNN Recurrent Neural Network
RVM Relative Vector Machine
SLFNN Single-Layer Feedforward Neural Network
Smart-CAM Smart Context-Awareness Management
SMO Sequential Minimization Method
SVC Support Vector Clustering
SVM Support Vector Machines
TL Transfer Learning

When implementing IoT, an advanced low-cost IoT-based intelligent strategy can be performed during the preplanning process for a new IEMS. The construction process may be launched with high-efficiency equipment and an integrated management system. Adding an IEMS to an existing system, however, will require additional resources. For example, developing a smart IoT system will require the replacement of expensive facilities and the hiring of additional employees. By implementing simple, light, and low-cost IoT solutions on existing infrastructure, operational efficiency can be improved over the old and costly ones. As a result, it is more feasible to deploy IEMSs within existing structures. Monitoring environmental conditions and energy usage by implementing a cyber-physical system (CPS) can be an example [7–9]. Increasing energy efficiency would seem to make sense since the cost of building an IEMS in terms of price, size, and labor is high. In this regard, the cost-effectiveness and lightness of IoT devices are important factors [10]. IoT devices can perform critical functions by integrating with the control plan of IEMSs. Integrating an IEMS into an existing CPS does not necessitate the replacement of large and expensive devices since the IoT components are implemented on the CPS near the consumer and the environment. Further, the facility is equipped with sensors and control interfaces that can be intelligently controlled by gathering information about the users and the surroundings. Large amounts of data are another important component of the IoT. By incorporating the user's and the environment's information into the analysis of the data collected over time from IoT devices, the most efficient methods for decreasing energy consumption can be identified. The CPS provides an optimal environment for users by referencing guidelines that are derived from the analyzed data.

Models of occupant behavior can be classified into three types: rule-based, stochastic, and data-driven [11]. These models are no longer deterministic but are rather described by stochastic laws [12]. A stochastic model incorporates occupant behavior as a stochastic process since it can vary over time and from individual to individual [13]. On the other hand, the data-driven method fails to capture occupant behavior in a structured manner. Yet, there is a lot of complexity and dynamics in the physical environment of a CPS. A process that is often non-stationary allows occupants to respond quickly to environmental changes. A CPS that accommodates more features often lags on the front end due to the difficulty in modeling all possible features. Note that data-driven methods analyze occupant behavior rather than relying on physical models or historical data. Instead of attempting to understand the behavior of occupants, CPS systems often maximize future rewards through intelligent control procedures. With these techniques, a machine can learn from historical behavior and adjust its actions to the behavior of its occupants. The CPS control system can provide feedback regarding an individual's comfort level as he or she interacts with the linked systems such as heating, ventilation, and air conditioning (HVAC), lighting, windows, or other systems. It is therefore possible to model the system in a novel way by optimizing control methods that consider both occupant impact and CPS performance. In a control problem, a decision-making machine is used to make decisions in accordance with the requirements by combining hardware and software networks, utilizing renewable energy technologies to meet local energy demands while maintaining indoor comfort levels. Using a CPS control system will optimize energy efficiency and reduce costs while meeting local energy loads [14,15]. In most cases, controls are directed at the shading systems, windows, lighting systems, or HVAC systems.

Data analysis is dominated today by machine learning (ML), which is the most extensively used state-of-the-art technology [16,17]. A recently developed Markov decision process-based machine learning technique, reinforcement learning (RL), can be applied in both model-based and model-free environments [18]. In AI applications, however, model-free learning algorithms, such as Q-learning and TD(λ), have proven to be more effective and attractive than traditional RL methods [19–22]. Deep learning provides a possibility to work with large continuous datasets through the application of RL [23,24]. By using RL, an agent can determine the optimal action through trial and error without the assistance of a supervisor, which matches the objective of a control problem.

Decisions can be made using data-driven models in the control systems as previously discussed. Data can be adapted to underlying logic using ML and RL methods to effectively make decisions in a stochastic environment.


Based on the observations of human interactions, RL agents learn to adapt themselves to the behavior of occupants. Although RL has existed for over seventy years, researchers only started exploring its applications a decade ago. Yet, the methodological point of view on ML and RL applications to occupant behavior, and the appropriate literature reviews, have not been considered. This results in uncertainty regarding the future of ML applications in general and energy management and conservation in particular. The purpose of this study is as follows:

− Review the available empirical studies in order to provide details on the applications of ML methods to energy conservation.
− Present the results of the literature search and highlight key points emerging from recent research on this topic.
− Describe ML's role in CPS control and how to implement it.
− Describe the characteristics of ML and RL algorithms and the modifications needed to achieve energy saving.
− Study the benefits of combining different learning techniques on accuracy and robustness.
− Describe the current research gap in ML and suggest possible future lines of research.
− Highlight perspective solutions to cope with the worldwide move toward zero-energy buildings.

The second section discusses the scope of our literature search and its findings. The philosophy and algorithms of ML are briefly introduced in Section 3. Section 4 explains the challenges of data pre-processing. Then, Section 5 gives an empirical analysis of the articles. Section 6 offers conclusions and possible directions for future research.

2. Related works

A considerable amount of research has been conducted on the subject of minimizing power consumption and maximizing resource efficiency. Modern research in this field primarily relies on machine learning (ML) techniques to optimize resource utilization. It was primarily focused on reducing workload by distributing it among data centers. Additionally, the focus was on reducing energy consumption and carbon emissions in electric power stations and enhancing energy conservation. Research shows that there has been very limited work on the integration of IoT sensors and energy conservation with RL. According to Shroud et al. [25], energy costs comprise the majority of a factory's total production costs in modern industry. Decision-makers can then approach this issue with the utmost priority. Through ML methods, a mathematical model is developed to intelligently turn the machine on or off to minimize energy consumption. This method ultimately results in less energy consumption. By shifting production time into lower-priced sessions, considerable energy savings can also be achieved. Energy consumption reduction during peak periods is another benefit of this minimization process. Consequently, CO2 emissions from energy generation are likely to be reduced. Fig. 1 shows the benefits of utilizing ML in industry.

Wang et al. [26] developed a multi-objective optimization technique for building energy management systems (BEMS) by integrating solar energy resources with other electricity production methods. In the context of the interior environment, user relief can be classified into three categories: visual relief, thermal relief, and indoor air quality relief. During optimal operation, the energy style and the electrical, thermal, and cooling loads are balanced by taking into account the adjustable loads that can be used in demand response (DR) control. The optimization was solved using MATLAB's YALMIP toolbox in order to verify the effectiveness and adaptability of the model. On the other hand, Wang et al. [27] described a system that uses Wi-Fi probes to detect occupancy based on energy-cyber-physics. An ensemble classification algorithm extracts three types of occupancy information using the proposed framework. A connection between energy management and cyber physics is achieved by creating a data interface that connects weak identifiers for Wi-Fi data. It also creates a data interface to detect and interpret occupants. Verification tests were performed in a large office to evaluate the performance of the proposed inhabited energy-cyber-physical system. Based on these results, the system was able to save 26.4%. Degha et al. [28] introduced a smart building with context-awareness (CA) for energy monitoring. The system uses smart-CA management to assemble smart building knowledge and provides contextual information in order to use energy-saving technology.

Fig. 1. Machine learning (a) use in industry and (b) its effect on energy management.


Park et al. [29] introduced a smart DR-based home energy management approach based on human comfort. Thermal comfort and visual comfort were used as parameters to control the heating and lighting systems. As a result of the approach, users' energy consumption was reduced and their comfort was improved. Through intelligent energy savings, machine learning is used effectively in BEMS to reduce energy consumption. In the studies of Dey et al. [30,31], ML-based fault detection and diagnosis are demonstrated to increase energy savings and user convenience. Jafarinejad et al. [32] proposed an optimized energy consumption reduction approach for university departmental buildings by using a demand-driven control strategy and a bi-level energy-efficient occupancy profile optimization method. In terms of data transfer accuracy, Fan et al. [33] conducted a systematic study on insufficient or extremely limited operational data in order to calibrate models and adapt domains. The results of the study provided a helpful description of the transfer patterns and guidelines for developing cost-effective data-driven solutions for predicting energy usage in buildings.

ML, such as classification, clustering, and regression, has been revolutionized by intelligent decisions based on data [34,35]. The application and implementation of deep learning did not take long. In comparison with neural networks' evolution over time, deep learning neural networks are relatively novel concepts. Initially, neural networks were based on perception as a result of Frank Rosenblatt's discovery of perceptron-based learning in 1957, and this method was used for more than two decades [36]. The application was initially described as relying more on machines than programs. A perceptron is essentially a threshold-based logical output machine in which the inputs influence its final output to varying degrees [37]. Activation or nonlinearity is required to provide the final output. As a result of the high computation power required in those computing eras, the multi-layered ML perceptron became very powerful for learning and predicting data patterns. In 1989, the multilayer perceptron (MLP) became a potentially effective algorithm for ML when it introduced the backpropagation strategy and faster computing processing [38]. In later years, parallel computing based on GPUs has been extended to many layers and multiple activation functions. This is an entirely new form of deep learning neural network. Currently, deep learning neural networks are used to achieve targeted, certain objective learning across large datasets [39]. The discovery of patterns in big data can improve almost every aspect of human knowledge discovery [40]. In spite of this, the emphasis on ML was placed on sensor management and resource utilization because ML was meant to be simplified [41].

Recent studies on building energy consumption modeling and forecasting conducted by Zhao et al. [42] included various studies using statistical and AI approaches. Zhao mentioned that artificial neural networks (ANNs) and support vector machines (SVMs) are the most widely adopted AI methods. Kalogirou et al. [43] reviewed various applications of energy systems with neural networks. A neural network has been applied to simulate and predict energy in a variety of energy-related fields. With the aid of a Bayesian normalization algorithm, Chae et al. [44] developed a predictive model for short-term building energy consumption using an ANN. The researchers explored how performance changes due to network factors such as the number of hidden neurons and the training data size. Karagorou et al. [45] suggest using an ANN to describe buildings' thermal characteristics using a multilayer iterative process architecture based on a standard backpropagation algorithm.

A deep learning model based on IoT sensors can be applied to mobile edge computing (MEC) and cloud computing strategies to save energy [46]. A variety of state-of-the-art research is currently being conducted in the field of cloud computing and virtual platforms for MEC [47]. MEC and cloud infrastructure provide lower communication latency and faster processing speed [48]. The MEC environment facilitates the development of faster applications that provide fast service or pre-recommendations in response to recommendations [49]. The use of IoT-based systems with deep learning has been extensively researched in order to improve quality of service (QoS) and resource optimization [50]. There is, however, a heavy focus on deep learning algorithms in this research. The research did not only focus on energy-saving resource utilization but also on various detection schemes and network environments [51,52]. The concept of RL in computing was coined by Sutton et al. [53]. Definitions are broken down into sectors. As far as RL is concerned, agents, infrastructures, states, actions, and rewards can be used to explain it properly. With deep learning-based RL, the agent can find patterns in the data while providing suitable actions to maximize the reward by integrating the agent with deep learning neural networks. As a result of this objective, the emerging deep-learning-based RL agents performed well in control-based simulations [54]. A detailed description of all techniques can be found in Section 3.

3. Machine learning models

Combining machine learning and reinforcement learning is a general method of developing algorithms that can be adapted for specific problem denotations and trained accordingly. Yet, problem-solving in real life can be improved with ML and RL by learning the best solution to a specific problem set. The following sections describe the features of these algorithms and the modifications that have been made to reduce the energy consumption of the target. A detailed explanation has been provided of the ML, RL, and hybrid models used in this research, as illustrated in Fig. 2.

3.1. Unsupervised learning strategy

Unsupervised learning is typically used to solve clustering problems since it involves learning from unlabeled data. Common clustering algorithms include k-means clustering algorithms such as k-nearest neighbor (k-NN). Initial cluster centroids are determined randomly for k clusters, and then patterns are assigned to the nearest cluster centroids. Ultimately, the cluster centroids need to be moved to decrease the objective function, such as the distance between the patterns and the centroids. Once the maximum number of iterations or the minimum cost function has been reached, the learning process comes to an end. The hierarchical clustering algorithm is an alternative widely used clustering algorithm that compares the similarity of two sets of data (characterized based on the Euclidean distance or on the linkage method) and merges the datasets with the highest degree of similarity. Upon reaching a specified number of clusters, the learning process is terminated. Among the widely used clustering methods is density-based spatial clustering (DBSCAN), which was introduced by Mnih et al. [55]. By combining data points that form dense regions (described by the scanning radius, ε, and the minimum number of points required to determine a dense area, MinPts), this algorithm groups the data points that are closely packed and marks them as outliers when they are located in low-density regions. Upon scanning all data points, the learning process ends. An alternative approach would be to apply a Gaussian mixture strategy or a mean shift method.

Fig. 2. Machine learning models used in energy management and conservation.
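To make the clustering procedure of Section 3.1 concrete, the following is a minimal NumPy sketch of the k-means loop described above (random initial centroids, nearest-centroid assignment, centroid update until convergence or a maximum number of iterations). It is an illustrative sketch rather than code from the reviewed studies; the array X stands for any pre-processed feature matrix, e.g., hourly consumption features.

```python
import numpy as np

def k_means(X, k, max_iter=100, tol=1e-6, seed=0):
    """Minimal k-means: X is an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random initial centroids
    for _ in range(max_iter):
        # assign each pattern to the nearest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # move each centroid to the mean of its assigned patterns
        new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                  else centroids[j] for j in range(k)])
        if np.linalg.norm(new_centroids - centroids) < tol:  # stop when centroids settle
            centroids = new_centroids
            break
        centroids = new_centroids
    return labels, centroids

# usage with synthetic two-cluster data
X = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 5.0])
labels, centroids = k_means(X, k=2)
```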

3.2. Supervised learning strategy

The supervised learning process relies on data labels to construct learning models, which are particularly useful when dealing with classification and regression problems. Regression techniques such as linear, polynomial, and exponential regression can be used to extract the characteristics of independent and dependent states. Regression strategies utilizing Gaussian process regression (GPR) are gaining increasing attention. According to GPR, the pattern follows a Gaussian random process distribution rather than a parametric scheme, where the sample's mean and variance functions determine the sample's shape, see Eqns. (1)–(3). In Eqns. (1)–(3), x and x′ represent the input vectors, m(x) specifies the mean function, and κ(x, x′) specifies the covariance function. The covariance functions (kernel functions) can be linear kernels, squared exponential kernels, Matern kernels, periodic kernels, or a compound format of multiple types. Hyperparameters in Eqn. (1) are optimized as part of the training process with a labeled sample dataset. An advantage of GPR over the LR algorithm lies in its accuracy when dealing with nonlinear relationships.

f(x) = GP{m(x), κ(x, x′)}  (1)

where:

m(x) = E{f(x)}  (2)

and

κ(x, x′) = E{[f(x) − m(x)][f(x′) − m(x′)]}  (3)
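As a hedged illustration of Eqns. (1)–(3), the sketch below fits a Gaussian process regressor with a squared-exponential (RBF) covariance function using scikit-learn; the kernel hyperparameters are tuned against the labeled samples during fitting, as described above. The toy data and kernel choice are illustrative assumptions, not taken from the reviewed works.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# toy labeled sample dataset: x -> noisy nonlinear target
X = np.linspace(0, 10, 40).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * np.random.randn(40)

# squared-exponential (RBF) covariance function; hyperparameters are optimized in fit()
kernel = ConstantKernel(1.0) * RBF(length_scale=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, alpha=1e-2, normalize_y=True)
gpr.fit(X, y)

# posterior mean m(x*) and standard deviation at new inputs
x_test = np.array([[2.5], [7.5]])
mean, std = gpr.predict(x_test, return_std=True)
```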
The support vector machine (SVM) is one of the most popular kernel-based supervised learning algorithms. A linearly separable dataset (x, y) can be clearly divided with two hyperplanes (the margin) using SVM to separate the data given the input pattern vector (x) and the labeled target vector (y). The decision zone of the two parallel hyperplanes has the form shown in Eqn. (4), in which w denotes the weight vector and b denotes the bias factor. When using a normalized or standard dataset, the distance between the decision zone and every hyperplane is 1/‖w‖. For classification targets on linearly inseparable datasets, SVM applies a hinge loss formula to the hyperplanes to decrease classification errors. The target is to minimize the function L in Eqn. (5), where n refers to the pattern number and λ corresponds to the regularization parameter. It is also possible to use SVM to determine decision boundaries (the decision boundary is defined in Eqns. (6) and (7), where φ represents a mapping function) by utilizing kernel functions, for example those shown in Table 1, to translate input states into a wide-dimensional feature region and then applying hyperplanes to divide the input data. Support vector machines can also be used to deal with regression problems. Parameters of the hyperplanes can be solved using an interior point approach, a sequential minimization method (SMO), or stochastic gradient descent. SVM is also frequently used in other ways, such as support vector clustering (SVC) and Bayesian SVM. In terms of sparse Bayesian learning theory, the relative vector machine (RVM) can also be classified as a kernel-based model [56,57]. Energy storage has been addressed using RVM in a similar manner to SVM.

w^T x + b = 0  (4)

L = (1/n) Σ_{i=1}^{n} max(0, 1 − y_i (w^T x_i + b)) + λ‖w‖²  (5)

w^T φ(x) + b = 0  (6)

κ(x_i, x_j) = φ(x_i)^T φ(x_j)  (7)

Table 1
Types of kernels used for vector machine approaches.

Kernel type | Formulation
Linear kernel | κ(x_i, x_j) = x_i^T x_j
Polynomial kernel | κ(x_i, x_j) = (x_i^T x_j)^n
Laplacian kernel | κ(x_i, x_j) = exp(−‖x_i − x_j‖/σ)
Radial basis kernel (Gaussian kernel) | κ(x_i, x_j) = exp(−‖x_i − x_j‖²/(2σ²))
Sigmoid kernel | κ(x_i, x_j) = tanh[a(x_i^T x_j) − b], a, b > 0

x_i and x_j are two sample vectors; n, a, b, and σ are kernel function parameters.
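The short scikit-learn sketch below corresponds to the kernel SVM formulation of Eqns. (4)–(7) with the Gaussian kernel from Table 1; the synthetic data and parameter values are illustrative assumptions only, and C plays the role of the trade-off controlled by λ in Eqn. (5).

```python
import numpy as np
from sklearn.svm import SVC

# toy linearly inseparable dataset: label depends on the distance from the origin
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))           # input pattern vectors
y = (np.linalg.norm(X, axis=1) > 1.0).astype(int)  # labeled target vector

# RBF-kernel SVM: gamma relates to 1/(2*sigma^2) of the Gaussian kernel in Table 1,
# and C balances the hinge-loss term against the margin term ||w||^2 of Eqn. (5)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X, y)
print(clf.predict([[0.2, 0.1], [2.0, 2.0]]))
```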
The decision tree (DT) is another supervised learning algorithm. DT selects a root node by identifying the features and calculating their information gain rate. Child nodes contain data based on the different values in the main node, whereas the primary node has the highest information gain. This process continues until there is no further information gain left or no features to choose from at each subsequent child node. DTs typically optimize the information gain using algorithms based on information entropy, such as ID3, C4.5, or CART. Furthermore, random forests (RFs), which are ensemble learning algorithms consisting of DT models, can also be used to enhance the robustness of tree-based algorithms. The result of each DT model is compared within the RF algorithm, and the one with the most votes is selected.


Fig. 3. A representation diagram of popular utilized ML algorithms: (a) SLFNN, (b) DNN, (c) AE, (d) CNN, (e) RNN, (f) RL architecture, and (g) GAN.

3.3. Deep learning strategy

A deep learning system can be either supervised or unsupervised, and it is composed of ANNs. By using non-linear functions, it builds a relationship between input and target factors and then calculates the function parameters. As big data mining and recent computational methodologies advance, deep learning is gaining more attention. Fig. 3a illustrates a two-layer feedforward neural network consisting of an input layer, a hidden layer, and an output layer. This two-layer feedforward neural network is defined as a single-layer feedforward neural network (SLFNN). The SLFNN is presented by Eqn. (8), where x represents the input sample, f(x) represents the output sample, f_a represents the activation formula (e.g., Table 2), W represents the weights, and b represents the bias. Note that deep neural networks (DNNs) contain multiple hidden layers (Fig. 3b). The DNN is depicted in Eqn. (9). Backpropagation is used to train the SLFNN and DNN, where backpropagation is considered an optimization method using gradient descent to determine the weights and bias of the neural network. Deep belief networks (DBNs) were proposed as another widely used type of ANN [58].

y = f(x) = f_a(W_1^T f_a(W_0^T x + b_0) + b_1)  (8)

where:

f(x) = f_a(W_i^T f_a(W_{i−1}^T … f_a(W_0^T x + b_0) …) + b_i)  (9)

Table 2
Commonly used deep learning activation functions.

Name (f_a) | Expression
Sigmoid function | f_a(x) = 1/(1 + exp(−x))
ReLU function | f_a(x) = max(0, x)
Tanh function | f_a(x) = [exp(x) − exp(−x)]/[exp(x) + exp(−x)]

Extreme learning machines (ELMs) are neural networks with a single hidden layer [59]. The weights and biases of the hidden layer are randomly selected, and the output layer does not contain any bias; when training the output layer, its weights are obtained by solving a linear system via the inverse method. In typical ANN-based models, weight and bias parameters are determined by gradient descent.
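For clarity, the following NumPy sketch evaluates the forward pass of Eqns. (8) and (9): a single-hidden-layer network (SLFNN) and its deep extension obtained by stacking further weight matrices, using the sigmoid activation from Table 2. The weights are random placeholders, i.e., an untrained illustrative network rather than any model from the reviewed papers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # activation f_a from Table 2

def slfnn_forward(x, W0, b0, W1, b1):
    # Eqn. (8): y = f_a(W1^T f_a(W0^T x + b0) + b1)
    return sigmoid(W1.T @ sigmoid(W0.T @ x + b0) + b1)

def dnn_forward(x, weights, biases):
    # Eqn. (9): repeated application of f_a(W_i^T h + b_i) over all layers
    h = x
    for W, b in zip(weights, biases):
        h = sigmoid(W.T @ h + b)
    return h

rng = np.random.default_rng(0)
x = rng.normal(size=4)                          # input sample
W0, b0 = rng.normal(size=(4, 8)), np.zeros(8)   # input -> hidden
W1, b1 = rng.normal(size=(8, 1)), np.zeros(1)   # hidden -> output
y = slfnn_forward(x, W0, b0, W1, b1)
y_deep = dnn_forward(x, [W0, rng.normal(size=(8, 8)), W1], [b0, np.zeros(8), b1])
```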
Autoencoder (AE) is an unsupervised learning algorithm based on NNs, as illustrated in Fig. 3c. Input datasets are encoded into codes (also called latent variables or latent representations), and the original inputs are then reconstructed from these codes. Commonly used AEs include sparse, denoising, and contractive autoencoders. The combination of various AE models with further approaches (for example clustering) can also be used as a deep learning technique.

Convolutional neural networks (CNNs) are advanced deep learning models that distill features and learn representations. These models are commonly used for object and image recognition. The convolutional layer is composed of convolutional kernels, which stretch the input tensor and the tensor of prior convolutional layers to extract feature behavior. As shown in Fig. 3d, pooling layers and fully connected layers are used to provide further feature identification. Equations (10) and (11) [60] illustrate the convolutional layer format, where Z^{l+1}(i, j) stands for the (ith, jth) output pixel of the (l + 1)th feature map of the convolutional layer, Z_k^l stands for the input to the (l + 1)th convolutional layer (the kth channel), K represents the number of channels in the lth convolutional layer, L_{l+1} represents the size of Z^{l+1}, w_k^{l+1}(x, y) represents the weight of the (xth, yth) element of the convolutional kernel in the (l + 1)th convolutional layer, b stands for the bias vector, f represents the size


of the convolutional kernel, s_0 represents the stride number, and p represents the padding number. Pooling is the process by which features are selected and information is filtered, and it normally takes the form of Eqn. (12) [61], where P_k^{l+1}(i, j) is the output and a denotes the parameter that determines the pooling strategy (a = 1 denotes average pooling; a → ∞ denotes max pooling). Common pooling strategies include the maximum or the average. Data mining performance is often enhanced by combining the convolutional layer and the pooling layer. Equations (8) and (9) demonstrate that the fully connected (FC) layer aggregates the distilled features before sending them to the output layer. The backpropagation algorithm is commonly used to train CNNs, like the SLFNN and DNN. Some DNN and CNN models might benefit from transfer learning (TL) and ensemble learning (EL) techniques to enhance training performance and robustness. Training with TL uses small sets of training data to train a complex neural network. In essence, the use of multiple learning systems in a neural network increases the accuracy and robustness of the network.

Z^{l+1}(i, j) = Σ_{k=1}^{K} Σ_{x=1}^{f} Σ_{y=1}^{f} [Z_k^l(s_0 i + x, s_0 j + y) w_k^{l+1}(x, y)] + b,  (i, j) ∈ {0, 1, …, L_{l+1}}  (10)

L_{l+1} = (L_l + 2p − f)/s_0 + 1  (11)

P_k^{l+1}(i, j) = [Σ_{x=1}^{f} Σ_{y=1}^{f} Z_k^l(s_0 i + x, s_0 j + y)^a]^{1/a}  (12)

A recurrent neural network (RNN) has become increasingly popular for handling time-series data, as shown in Fig. 3e. At every moment t, each RNN block receives an input variable x(t). The hidden status of each block is h(t), and the output is y(t). Importing h(t) and combining it with x(t + 1) produces h(t + 1), which in turn produces the output y(t + 1). The process creates time-series memories. Input time-series variables can be predicted by a trained RNN. A representative RNN architecture is Long Short-Term Memory (LSTM), which was first proposed by Hochreiter [62]. LSTMs contain a forget gate (which overcomes the gradient vanishing and exploding problem by forgetting the unimportant inputs to the block and strengthening the crucial ones). It is known that stacked RNNs, bidirectional RNNs, and reservoir computing (RC) algorithms provide better training performance than traditional RNN algorithms. Other types of RNN architecture include the gate recurrent unit (GRU), which is a mutational form of LSTM [63], the stacked RNN, and the bidirectional RNN.

3.4. Reinforcement learning strategy

The concept of reinforcement learning is derived from the field of computer science and was originally designed by Sutton [53]. Several traditional standards are included in it. RL can be explained adequately using agents, environments, states, actions, and rewards. Note that agents take actions. The use of a remote-controlled (RC) drone to deliver goods or a player navigating a video game are exemplary examples. Agents can take a finite number of actions. Then, they choose one action from a series of options. Although this is almost self-explanatory, it should be noted that they select from a set of options. The agent environment is the environment in which the agent moves. The measurement of a reward allows us to determine whether an agent's actions were successful or unsuccessful. To dampen the effects of future rewards on the agent's possible action sets, the discount factor is multiplied by future rewards. A variety of settings can be chosen to make future rewards more or less valuable than immediate rewards. A sort of short-term hedonistic nature is enforced on agents through feedback or rewards. Agents interact primarily with their environment. The agent receives rewards based on its actions. A simple algorithm can describe this process. Fig. 4 illustrates reinforcement learning states and actions, with S = {1, …, n} and A = {1, …, n}. The problem specification determines which of these sets of values are discrete or continuous. A policy is then defined over these sets of states and actions. The reinforcement learning agent is then tasked with maximizing the reward and obtaining it by taking appropriate actions based on its state. This algorithm requires an exit condition since it is a self-contained loop. To achieve the desired results, RL must be properly conditioned. Sometimes RL conditions are improperly designed when states are oversimplified. This results in poor results and an inability to respond positively to rewards.

RL is mostly used to determine which actions, taken in the intelligent agent's interactions with the environment, will lead to the greatest cumulative reward. This is done with the principle of Markov decision processes. It is typically composed of an environment and an agent. There are two categories of RL, model-based and model-free, which are based on whether explicit modeling of the environment is needed, see Fig. 3f. Agents learn actions based on their environment, and environments reward them. The regular RL approaches utilize:

1) Q-learning, which combines the quality values Q(s, a) within the Q-table to create the next-step action, resulting in a new quality value generated by Eqn. (13), where α and γ denote the learning rate and discount factor, R denotes the reward, and s′ and a′ denote the next-step state and action.
2) Deep Q-networks (DQNs), which are aimed at overcoming the exponentially growing computational costs of Q-learning by utilizing deep learning techniques (e.g., DNN, CNN, DT).
3) Gradient policy algorithms, which generate the posterior step based on policy functions (which are quantifications of status and action behaviors at the current step) rather than Q values.
4) The actor and critic algorithm, where the actor generates the posterior-step action using the current-step state and adjusts its scoring policy using the critic's score, while the critic scores the actor at the current step.

Q(s, a) ← Q(s, a) + α[R + γ max_{a′} Q(s′, a′) − Q(s, a)],  s ← s′  (13)

Fig. 4. Basic reinforcement algorithm.
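A minimal tabular sketch of the update in Eqn. (13) is given below; the environment is a toy stand-in (a short chain with a reward at its end), so the states, actions, and rewards are illustrative assumptions rather than an energy-system model.

```python
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2   # learning rate, discount factor, exploration rate
rng = np.random.default_rng(0)

s = 0
for step in range(1000):
    # epsilon-greedy action selection from the Q-table
    a = int(rng.integers(n_actions)) if rng.random() < epsilon else int(Q[s].argmax())
    # toy environment: action 1 moves right, action 0 moves left; reward at the last state
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    R = 1.0 if s_next == n_states - 1 else 0.0
    # Eqn. (13): Q(s,a) <- Q(s,a) + alpha [R + gamma max_a' Q(s',a') - Q(s,a)]
    Q[s, a] += alpha * (R + gamma * Q[s_next].max() - Q[s, a])
    s = 0 if s_next == n_states - 1 else s_next   # restart the episode at the goal
```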


3.5. Q-learning strategy

Reinforcement learning in unsupervised machine learning is accomplished using Q-learning. To achieve the final goal, the agent can use Q-learning to determine the most rewarding policy to implement. Stochastic transitions and reward issues can be managed without changing the environment design. In other words, it is a method of determining an optimal approach for a finite Markov decision process (FMDP) in the sense that it increases the cumulative results of all subsequent steps. The Q-learning goal is to optimize the expected reward of every subsequent stage that makes up the maximum reward for the current state of any FMDP. An action's value is reflected in Q, the function that feeds back the reward to provide reinforcement. When a value-based model-free RL strategy is employed, the action value is approximated by an approximation function, such as an NN. DNNs are based on iteratively reducing loss functions, which is the basis of one-step Q-learning for teaching action-value variables. This is an off-policy learning technique that finds the highest possible result or movement based on the current state. In the case of Q-learning, decision-making outside the recent policy, like taking small actions randomly, is considered off-policy because the mechanism does not require the existing policy every time. Moreover, Q-learning aims to maximize its overall incentive through implementation. As a consequence of this feedback mechanism, learning is more precise and compensation is optimized. The input structure is classifiable by an agent with time-series capabilities, and output is provided in a timely manner. As a result of deep learning LSTM agent training, the memory portion inside the DNN layer can determine the action and state of time series-based elements. Various optimization problems can be solved using the generated deep Q-learning model. Nevertheless, the main focus of the research is the resource utilization of sensors in the context of a closed smart sensor grid. In other words, the application was designed to be used exclusively in the above-mentioned application domain. Fig. 5 illustrates the structure of the LSTM-based Q-network agent. Input vectors include the previous action, the current state, and the reward. To overcome overfitting and model errors in the latest ensemble, more LSTM layers are combined with a batch normalization layer. In the following part, the data is fed into the so-called fully connected layer of deep learning. As input numbers from real-life circuit signals or simulation inputs vary, so will the total layers and nodes. Using a simple grid search algorithm and sample data, they can be easily selected (Fig. 6).

Fig. 5. Schematic of LSTM-based QN agent.
Fig. 6. Flowchart of Q-learning algorithm.
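The agent architecture sketched in Fig. 5 (stacked LSTM layers, batch normalization, and a fully connected output over the action values) can be approximated as follows in Keras. The layer sizes, sequence length, and number of actions are illustrative assumptions; a full deep Q-learning agent would additionally need a replay buffer and a target network, which are omitted here for brevity.

```python
import tensorflow as tf

seq_len, n_features, n_actions = 24, 4, 3   # assumed: 24-step history of 4 signals, 3 control actions

# time-series state (previous action, current state, reward history) -> one Q-value per action
q_network = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, return_sequences=True,
                         input_shape=(seq_len, n_features)),   # memory over the time series
    tf.keras.layers.BatchNormalization(),                      # helps against overfitting/model errors
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(32, activation="relu"),              # fully connected layer
    tf.keras.layers.Dense(n_actions, activation="linear"),     # Q-value for each action
])
q_network.compile(optimizer="adam", loss="mse")   # regression on bootstrapped Q-targets
q_network.summary()
```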
3.6. Reinforcement learning-based IoT

Fig. 7 shows a general framework for RL-based IoT technology. As shown in the figure, the core of the RL-based IoT framework is summarized. It derives input as a target that the RL paradigm must learn how to accomplish. The purpose is to perform a state-of-the-art chain setting that the device needs to comply with, i.e., routes on the machine state. This target is a precise tool, which will be clarified in the discussions regarding practical energy savings.

RL-based IoT grows an internal message dictionary including a listing of today's IoT protocol commands that can be utilized to link with smart devices. The dictionary can be built from protocol elaborations, automated reverse-engineering solutions, or traffic sniffing. It can include a mix of commands from exclusive IoT protocols, types, vendors, and many others. RL-based IoT technology utilizes state-of-the-art RL techniques, where the learning model builds and upgrades the device state. The learning model enhances the previously mentioned RL techniques, such as Q-learning [24], with its parameters. It identifies which of the numerous commands in the dictionary may be utilized to change the state of the IoT device. This is done for a specified goal. The reward function of the RL techniques evaluates the benefit of each command taken by the learner in a specified state.
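To make this framework concrete, the sketch below pairs a small command dictionary with a tabular Q-learner and a simulated device. The command names, the reward (comfort versus energy use), and the device model are hypothetical placeholders, not an actual IoT protocol or the implementation behind Fig. 7.

```python
import random

# hypothetical command dictionary for the target IoT device (illustrative names only)
commands = ["hvac_off", "hvac_eco", "hvac_on", "no_op"]
states = ["cool", "comfortable", "warm"]                 # simplified room states
Q = {(s, c): 0.0 for s in states for c in commands}
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def device_step(state, cmd):
    """Toy stand-in for the Socket API: apply a command, return next state and reward."""
    transitions = {"hvac_off": "warm", "hvac_eco": "comfortable",
                   "hvac_on": "cool", "no_op": state}
    nxt = transitions[cmd]
    energy_cost = {"hvac_off": 0.0, "hvac_eco": 0.3, "hvac_on": 1.0, "no_op": 0.1}[cmd]
    comfort = 1.0 if nxt == "comfortable" else 0.0
    return nxt, comfort - energy_cost                    # reward: comfort minus energy use

state = random.choice(states)
for step in range(2000):
    cmd = random.choice(commands) if random.random() < epsilon \
        else max(commands, key=lambda c: Q[(state, c)])
    nxt, r = device_step(state, cmd)
    Q[(state, cmd)] += alpha * (r + gamma * max(Q[(nxt, c)] for c in commands) - Q[(state, cmd)])
    state = nxt

print(max(commands, key=lambda c: Q[("warm", c)]))       # learned command when the room is warm
```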
Other modules work in conjunction with the learning model. The discoverer module is primarily responsible for scanning the nearby community in the search for IoT devices. It deploys traditional scanning procedures for searching online devices and displays a preliminary fingerprint to decide on open ports. Lastly, the Socket API module epitomizes all the mechanisms to communicate with the goal IoT device. Moreover, it can also facilitate the receipt of feedback immediately obtained from the IoT system, as well as the sending of commands if any are available. For example, it could help with exegetic messages that return back the device state.

The features of applied ML and RL strategies, as well as hybrid RL with IoT, are enhanced. However, many challenges still need to be taken into account, such as cybersecurity issues [64,65], computational burdens [66,67], and the implementation cost [68]. Table 3 summarizes the main features of, and the comparison between, the AI strategies and AI with IoT.

4. Data pre-processing

ML models require data pre-processing in different situations in order to achieve high levels of training accuracy. The ML model is unable to comprehend a dataset's features with too few data because the amount of data varies significantly for different types of datasets. A widely used pre-processing method is data compression. Compressing data is often done with principal component analysis (PCA). The goal is to keep the major features of the dataset while reducing its dimensions. During PCA, the correlation coefficient matrix of the preprocessed data matrix is calculated first, then the eigenvectors of the correlation coefficient matrix are computed, and eventually the data is projected into the space created by the feature vectors. The use of AE for data compaction is similar to PCA. A number of other algorithms have been used to compact the dataset, for example, under-sampling algorithms using clustering.
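The PCA procedure described above (correlation matrix, eigen-decomposition, projection onto the leading eigenvectors) can be written in a few lines of NumPy; the code is a generic sketch with a random placeholder dataset, not the preprocessing pipeline of any specific study.

```python
import numpy as np

X = np.random.randn(500, 10)                 # placeholder data matrix (samples x features)
Xs = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize the features

R = np.corrcoef(Xs, rowvar=False)            # correlation coefficient matrix of the features
eigvals, eigvecs = np.linalg.eigh(R)         # eigen-decomposition (ascending eigenvalues)
order = np.argsort(eigvals)[::-1]            # sort components by explained variance
k = 3                                        # keep the k leading principal components
W = eigvecs[:, order[:k]]

X_compressed = Xs @ W                        # project data into the reduced feature space
```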


Fig. 7. Schematic of RL-based IoT framework.

Table 3
Comparison between the AI and IoT strategies.

Item | Unsupervised ML | Supervised ML | Deep RL | Deep RL with IoT
Preference | Discovering data correlation, finding new patterns, and clustering | Mapping between inputs and outputs | Behavioral learning mapping | Online monitoring and remote control
Training data | Unlabeled data | Learn from labeled data | Utilizing the interaction with the environment for learning | Data sharing and interaction with the environment for learning
Optimal strategy | Depending on the data and its categories | Depending on the data and learning criteria | Learn optimal strategy from experience | Utilize the cyber and physical components for learning
Exploration | No exploration | No exploration | Adaptive according to the changes | Adaptive according to the changes
Computational burden | Need high processing | Need high processing | Medium processing | Low computational burden by using the cloud facilities
Cybersecurity | No cybersecurity risk | No cybersecurity risk | No cybersecurity risk | Cybersecurity risk

ML algorithms with complex structures, for example, DNNs with various hidden layers or CNNs with various convolutional and pooling layers, may have difficulties training in some application scenarios (especially for datasets with complicated features or CNNs with multiple convolutional and pooling layers). An ML model's parameters are usually not optimal during the training process (if the model parameters are acquired using optimization algorithms) or calculated directly (if the model parameters are calculated analytically). ML model training requires having more datasets with similar characteristics but diverse details. In particular, ML models whose parameters are acquired through optimization algorithms or analytically cannot reach their optimal values during training. It requires a large dataset of similar features, but with different details, to train an ML model. Goodfellow [69] proposed an unsupervised learning method called generative adversarial networks (GAN) to solve this problem. Datasets with similar features to the original inputs are generated or reconstructed by the GAN. These datasets can also be used to improve or enhance the dataset. A GAN consists of a generator and a discriminator. Fig. 3g shows an example of the dual neural networks. Generators receive noise vectors as inputs that contain details of the real dataset, whereas they output fabricated datasets. The fabricated dataset is fed into the discriminator, whose output indicates whether the fabricated dataset is false or not. GANs are trained by optimizing the game function in Eqn. (14), which is quantified according to cross-entropy theory. This is achieved by first fabricating a dataset that can be easily used to diagnose whether or not it is fake. Based on the loss function, the diversity between the fabricated and real datasets is reduced by continuous training with the noise sample distribution p(z), the generated dataset G(z) derived from the noise samples, and the judgment result based on the generator output D(G(z)), refer to Eqn. (15). In addition, the discriminator keeps improving its ability to recognize the real dataset from the fabricated one based upon the loss function in Eqn. (16), where x ∼ p_data represents the real datasets conforming to a distribution characteristic p_data, and D(x) represents the decision result for the real datasets. In training the GAN, the training process ends at the Nash equilibrium, where the model cannot distinguish between fake and real datasets (i.e., the system optimization achieves the Nash equilibrium, D(x) = 1/2). To regenerate datasets with similar features, other types of algorithms are also widely used, including oversampling algorithms based on SMOTE [70].

min_G max_D E_{x∼p_data}[log D(x)] + E_{z∼p(z)}[log(1 − D(G(z)))]  (14)

l_G = −E_{z∼p(z)}[log D(G(z))]  (15)

l_D = −E_{x∼p_data}[log D(x)] − E_{z∼p(z)}[log(1 − D(G(z)))]  (16)

Prior to ML training, other techniques such as data normalization and data smoothing are frequently used. Data normalization involves scaling the data values so that they fit within a certain range (typically between 0 and 1), which can accelerate parameter convergence and enhance learning accuracy during training. In normalization, data is normalized by one of two methods: min-max normalization, given by x′ = (x − min(x))/(max(x) − min(x)), or mean normalization, given by x′ = (x − mean(x))/(max(x) − min(x)). Several algorithms are used to decrease the noise within the signal during smoothing, which reflects the dataset properties and prevents overfitting. In addition to moving averages and exponential moving averages, there are many algorithms for smoothing data, including Savitzky–Golay, Laplacian, and kernel-based smoothing as well as Kalman filtering.

5. Practical issues in building energy saving

This section presents and discusses the causes and impacts of issues that appear during ML implementation, as well as provides potential solutions for practical energy-saving strategies in buildings. The study includes various cases to assist researchers and practitioners in identifying major factors that limit the implementation of ML approaches in building energy-saving practices.

Building energy-saving studies have been conducted with ML algorithms due to ML development. A number of algorithms have been used in the past, including ANNs [71,72], SVMs [73,74], multiple linear regressions (MLRs) [75,76], ELMs [77,78], deep learning algorithms (DLs) [79], DTs [80], and RFs [81,82].


Numerous studies have used over 128 ML algorithms [83] according to the latest research. Yet, a suitable algorithm has not been found. Essentially, when several different algorithms have been developed to resolve the same problem, no one ML algorithm has proven to be more effective than the others. Various studies have used inconsistent experimental settings, which is the main cause of this phenomenon. For model testing and verification, most studies use building examples. Data structures, prediction output, and data volume differ from one study to another. This is due to the unique characteristics of the buildings used in each referenced study. Due to these differences, each study draws different conclusions that are not generalizable. As well as inconsistent evaluation criteria, there is no optimal algorithm. An algorithm will be evaluated on a variety of criteria, including Cv, MAE, and computation time, to determine whether it is stable, accurate, and fast. ML algorithms have been analyzed using various evaluation indicators; however, few studies have done so. Researchers and users may prefer different indicators, which results in different algorithmic evaluations. Building energy savings have been extensively studied using ML algorithms. However, because building characteristics are unique and user preferences vary, existing research is unable to identify one algorithm that suits most buildings. In addition to facilitating cross-project comparisons of different studies, sharing data and standardizing evaluation criteria would enhance the selection of optimal algorithms between them.

In most current ML-based energy-saving studies, a single model is used or a comparison of several models is performed [84–90]. Though these two approaches vary in the types and number of forecasting algorithms, they both follow a strategy-oriented paradigm, which investigates whether a machine-learning model could be used. This research paradigm aims to validate that ML algorithms can match the available energy creation data. Recognition of problems with considerable application potential is a prerequisite for effective ML applications, as mentioned before. ML models should be able to solve these problems and impact their fields. As a result of technology-oriented research paradigms, researchers tend to focus on evaluating ML models' capabilities to learn from created energy data. According to existing studies, there is a lack of analysis and discussion regarding how the proposed ML models fit into building energy management practices. This explains why research targets and data structures are inconsistent. Research findings and practical demands eventually clash, resulting in mismatches. Implementers should use a data-driven research paradigm when applying ML models to building energy-saving practices to effectively promote ML technology application.

As a whole, existing studies suffer from fragmentation due to the scatter and diversity of research objects, learning algorithms, and data structures. Despite the research fragmentation, integrating it into general solutions for building energy savings is challenging. Secondly, current studies primarily involve a technology-oriented paradigm to validate whether ML models could fit building energy-related data and their effectiveness. The application value and feasibility of the model need to be considered in this research paradigm. Research findings obtained from this paradigm are not very useful or practical. Furthermore, restricted data collection approaches and inadequate resources, insufficient model adaptability, a shortage of user confidence, and low model adoption in building energy-saving applications make machine learning models ill-suited for implementation and promotion. Table 4 shows the aspects that need to be considered during the design of an energy management system. These aspects include system parameter uncertainty, resilient operation, time delays, computational burden, proper selection of system parameters, online monitoring, and cybersecurity.

Table 4
The involved aspects in the previous works.

Method | References | Consumption prediction | Robustness | Feedback response | Computational burden | Intelligent design | Online monitoring
ANN | Deb et al. [71] | ✓ | – | – | – | – | –
Hybrid NAA | Li et al. [72] | ✓ | ✓ | – | ✓ | ✓ | –
SVM, DT, and RF | Paudel et al. [73], Yu et al. [80], and Wang et al. [81] | ✓ | – | – | ✓ | – | –
ELM | Sajjadi et al. [77] | ✓ | – | – | ✓ | ✓ | –
RNN and CNN | Cai et al. [79] | ✓ | ✓ | – | – | – | –
RF | Smarra et al. [82] | ✓ | ✓ | ✓ | ✓ | – | –
Deep RL | Liu et al. [84] and Yang et al. [85] | ✓ | – | ✓ | ✓ | – | –
Hybrid ML and RL | Wu et al. [86] and Ruan et al. [87] | ✓ | – | ✓ | – | ✓ | –
IoT and RNN | Bedi et al. [90] | ✓ | – | ✓ | – | – | ✓

6. Potential research and development interest

There is a growing interest in the use of ML in the field of energy conservation. Research is currently being conducted on a variety of potential applications, including building energy management systems and smart energy grids. Through ML technology, energy systems can be optimized to achieve greater efficiency and lower costs. The application of ML to energy systems can help them detect and respond to changes in power conditions more effectively, in turn reducing energy waste. By analyzing energy consumption patterns, for example, the amount of energy used can be adjusted to match the amount of demand. As a result, less energy will be consumed while energy will be utilized more efficiently. It is also possible to use ML to assist in the identification and correction of problems with energy systems. For example, it can be used to identify wiring issues and other faulty electrical components, thereby reducing the risk of power outages and saving energy. Furthermore, it is capable of detecting issues with air conditioning and heating systems, thereby reducing energy consumption and enhancing the overall comfort of the building. Automating energy-saving processes is another potential application. ML can be used to automatically turn off unused lights and appliances, set thermostats at optimal levels, and perform other energy-saving operations. Through this automation, energy consumption can be reduced and the efficiency of energy systems can be increased. In addition, ML can be used to improve the accuracy of energy forecasts. Analysis of energy usage patterns can assist in making more accurate forecasts of energy consumption. As a result, energy waste can be reduced, and energy can be used more efficiently. Overall, ML has the potential to revolutionize the energy sector and make energy savings a reality. By leveraging its power, energy systems can be optimized for efficiency, cost savings, and improved comfort. With the right applications, it can help reduce energy waste and increase energy savings.

Modeling complex systems with high data requirements and many interdependencies is possible through the application of artificial intelligence [91]. Based on this study, the following directions can be concluded:


Modeling complex systems with high data requirements and many interdependencies is possible through the application of artificial intelligence [91]. Based on this study, the following directions can be concluded:

1) The random forest classifier has been recommended for accelerating buildings' open data energy savings [92]. It has been shown to be more accurate than decision trees, regression trees, random forest regressors, gradient-boosting classifiers, and gradient-boosting regressors.
2) Hybrid RL with IoT is recommended for energy management and conservation applications; a minimal sketch of this pairing is given after this list. However, additional research is necessary to address cybersecurity issues, computational burdens, and the cost of implementation.
3) The inconsistency of experimental settings makes it difficult to select one effective ML algorithm for application. Consequently, future research should concentrate on applying ML strategies in the same building or environment so that the strategies can be compared fairly and the best one chosen.
4) An online platform is recommended for collaboration between researchers in order to accelerate the application of machine learning in the industry.
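To make the second direction more concrete, the sketch below pairs a tabular Q-learning agent with a simulated IoT temperature reading: the agent decides whether to run cooling, and the reward trades off energy use against deviation from a comfort band. The thermal model, state discretization, and reward weights are illustrative assumptions only and are not taken from the reviewed works; a practical deployment would also have to address the cybersecurity and computational issues noted above.

```python
# Illustrative sketch: tabular Q-learning over simulated IoT temperature readings.
# Thermal model, discretization, and reward weights are illustrative assumptions.
import random

random.seed(0)
ACTIONS = (0, 1)                      # 0 = cooling off, 1 = cooling on
COMFORT = (22.0, 25.0)                # acceptable indoor temperature band (deg C)

def step(temp, action):
    """Very simple room model: outdoor heat gain vs. cooling, plus sensor noise."""
    temp += 0.5 - 1.2 * action + random.gauss(0, 0.1)
    energy = 1.0 * action             # energy used this step when cooling runs
    discomfort = max(0.0, COMFORT[0] - temp) + max(0.0, temp - COMFORT[1])
    reward = -energy - 2.0 * discomfort
    return temp, reward

def bucket(temp):
    return max(0, min(14, int(temp - 18)))   # discretize 18-32 deg C into 15 bins

Q = [[0.0, 0.0] for _ in range(15)]
alpha, gamma, eps = 0.1, 0.95, 0.1

for episode in range(300):
    temp = 24.0
    for _ in range(96):               # one day of 15-minute control steps
        s = bucket(temp)
        a = random.choice(ACTIONS) if random.random() < eps else max(ACTIONS, key=lambda x: Q[s][x])
        temp, r = step(temp, a)
        s2 = bucket(temp)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])

print("Learned policy (state -> action):", [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(15)])
```

In a hybrid RL and IoT system the simulated readings would be replaced by live sensor measurements, and a deep RL agent (e.g., a DQN) would replace the table when the state space grows.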

7. Conclusions

There has been an increase in the importance of the energy management industry in recent years, and a variety of energy management projects are now regularly planned by the top management of companies. This paper examines several energy-saving strategies, including energy savings achieved through management, technologies, and policies. Cost-effective measures for reducing the energy consumption of major energy-consuming equipment in industrial facilities include high-efficiency electric motors, reduction of boiler flue gas temperatures, and the use of variable speed drives to match load requirements. Saving strategies like these are economically viable in most cases and can reduce costs considerably; a corresponding reduction in industrial energy consumption and cost is therefore recommended. Based on this review, the following conclusions can be made:

1) The application of machine learning, LSTM-based neural networks, modified DNNs, and RL that incorporates previous moves can enhance decision-making for energy conservation problems while taking the time-series nature of the data into account.
2) Using an ML module has a positive effect on agent performance, because the input observations are highly sensitive to time.
3) Reinforcement learning is an innovative way of boosting model performance by transferring what a general ML model has learned to the DQN.
4) Many challenges still need to be addressed, such as cybersecurity issues, computational burdens, and implementation cost.

There is an urgent need to change traditional methods in this field, with a focus on algorithm development and the practical implementation of ML models; this focus is what can make ML technology genuinely useful for building energy-saving programs. A modern nano-grid-based micro-grid substation with multiple battery packs and high-performance 5G networks could revolutionize future telecommunications, the Internet of Things (IoT), and machine learning for energy management. However, the deployment of IoT technology for online monitoring of energy systems still needs additional effort to handle cybersecurity issues and the uncertainty of energy system parameters in future works.

Credit author statement

Mahmoud Elsisi: Methodology, Data curation, Writing – original draft preparation, Visualization, Validation, Writing – reviewing and editing. Mohammed Amer: Investigation, Validation, Writing – original draft preparation, Writing – reviewing and editing. Alya' Dababat, Chun-Lien Su: Conceptualization, Writing – reviewing and editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

Acknowledgment

The authors would like to thank National Kaohsiung University of Science and Technology, Taiwan, the National Science and Technology Council of Taiwan (Grant MOST 110-2221-E-992-044-MY3), and Palestine Technical University – Kadoorie, Palestine, for supporting this work.

References

[1] Qin H, Yu Z, Li T, Liu X, Li L. Energy-efficient heating control for nearly zero energy residential buildings with deep reinforcement learning. Energy 2023;264:126209.
[2] Ouedraogo KE, Ekim PO, Demirok E. Feasibility of low-cost energy management system using embedded optimization for PV and battery storage assisted residential buildings. Energy 2023:126922.
[3] Al-Ali AR, Zualkernan IA, Rashid M, Gupta R, AliKarar M. A smart home energy management system using IoT and big data analytics approach. IEEE Trans Consum Electron 2017;63(4):426–34.
[4] Marinakis V, Doukas H. An advanced IoT-based system for intelligent energy management in buildings. Sensors 2018;18:610.
[5] Ahammed MT, Khan I. Ensuring power quality and demand-side management through IoT-based smart meters in a developing country. Energy 2022;250:123747.
[6] Griego D, Krarti M, Hernandez-Guerrero A. Energy efficiency optimization of new and existing office buildings in Guanajuato, Mexico. Sustain Cities Soc 2015;17:132–40.
[7] Olabi AG, Abdelkarem MA, Jouhara H. Energy digitalization: main categories, applications, merits and barriers. Energy 2023:126899.
[8] Peters L, Saidin H. IT and the mass customization of services: the challenge of implementation. Int J Inf Manag 2000;20:103–19.
[9] Ma S, Zhang Y, Lv J, Ge Y, Yang H, Li L. Big data driven predictive production planning for energy-intensive manufacturing industries. Energy 2020;211:118320.
[10] Li S, Xu LD, Zhao S. The internet of things: a survey. Inf Syst Front 2015;17:243–59.
[11] Carlucci S, et al. Modeling occupant behavior in buildings. Build Environ 2020;174:106768.
[12] Hong T, Yan D, D'Oca S, Chen C. Ten questions concerning occupant behavior in buildings: the big picture. Build Environ 2017;114:518–30.
[13] Yan D, et al. Occupant behavior modeling for building performance simulation: current state and future challenges. Energy Build 2015;107:264–78.
[14] Shaikh PH, Nor NBM, Nallagownden P, Elamvazuthi I, Ibrahim T. A review on optimized control systems for building energy and comfort management of smart sustainable buildings. Renew Sustain Energy Rev 2014;34:409–29.
[15] Zhao P, Suryanarayanan S, Simoes MG. An energy management system for building structures using a multi-agent decision-making control methodology. IEEE Trans Ind Appl 2013;49(1):322–30.
[16] O'Dwyer E, Pan I, Acha S, Shah NJ. Smart energy systems for sustainable smart cities: current developments, trends and future directions. Appl Energy 2019;237:581–97.
[17] Qin SJ, Chiang LH. Advances and opportunities in machine learning for process data analytics. Comput Chem Eng 2019;126:465–73.
[18] Kaelbling LP, Littman ML, Moore AW. Reinforcement learning: a survey. J Artif Intell Res 1996;4:237–85.
[19] Mnih V, et al. Human-level control through deep reinforcement learning. Nature 2015;518(7540):529–33.
[20] Silver D, et al. Mastering the game of Go with deep neural networks and tree search. Nature 2016;529(7587):484–9.
[21] Silver D, et al. Mastering the game of Go without human knowledge. Nature 2017;550(7676):354–9.
[22] Mnih V, et al. Playing Atari with deep reinforcement learning. 2013. http://arxiv.org/abs/1312.5602. [Accessed 16 June 2022].
[23] Gu S, Lillicrap T, Sutskever I, Levine S. Continuous deep Q-learning with model-based acceleration. In: Proceedings of the 33rd international conference on machine learning, vol. 48. New York, NY, USA; 2016.
[24] Lillicrap TP, et al. Continuous control with deep reinforcement learning. http://arxiv.org/abs/1509.02971; 2016.
[25] Shrouf F, Ordieres-Meré J, García-Sánchez A, Ortega-Mier M. Optimizing the production scheduling of a single machine to minimize total energy consumption costs. J Clean Prod 2014;67:197–207.


[26] Wang F, Zhou L, Ren H, Liu X, Talari S, Shafie-khah M, Catalão JP. Multi-objective optimization model of source–load–storage synergetic dispatch for a building energy management system based on TOU price demand response. IEEE Trans Ind Appl 2017;54:1017–28.
[27] Wang W, Hong T, Li N, Wang RQ, Chen J. Linking energy-cyber-physical systems with occupancy prediction and interpretation through WiFi probe-based ensemble classification. Appl Energy 2019;236:55–69.
[28] Degha HE, Laallam FZ, Said B. Intelligent context-awareness system for energy efficiency in smart building based on ontology. Sustain Comput Inform Syst 2019;21:212–33.
[29] Park H. Human comfort-based-home energy management for demand response participation. Energies 2020;13:2463.
[30] Dey M, Rana SP, Dudley S. A case study based approach for remote fault detection using multi-level machine learning in a smart building. Smart Cities 2020;3(21).
[31] Yang C, Gunay B, Shi Z, Shen W. Machine learning-based prognostics for central heating and cooling plant equipment health monitoring. IEEE Trans Autom Sci Eng 2020;18(1):346–55.
[32] Jafarinejad T, Erfani A, Fathi A, Shafii MB. Bi-level energy-efficient occupancy profile optimization integrated with demand-driven control strategy: university building energy saving. Sustain Cities Soc 2019;48:101539.
[33] Fan C. Data-centric or algorithm-centric: exploiting the performance of transfer learning for improving building energy predictions in data-scarce context. Energy 2022;240:122775.
[34] Wu X, et al. Top 10 algorithms in data mining. Knowl Inf Syst 2008;14(1):1–37.
[35] Pradhan A. Support vector machine-a survey. Int J Emerg Technol Adv Eng 2012;2(8):82–5.
[36] Rosenblatt F. The perceptron, a perceiving and recognizing automaton (Project Para). Buffalo, NY, USA: Cornell Aeronautical Laboratory; 1957.
[37] Zini G, d'Onofrio G. Neural network in hematopoietic malignancies. Clin Chim Acta 2003;333(2):195–201.
[38] Hirose Y, Yamashita K, Hijiya S. Back-propagation algorithm which varies the number of hidden units. Neural Netw 1991;4(1):61–6.
[39] LeCun Y, Bengio Y. Convolutional networks for images, speech, and time series. In: The handbook of brain theory and neural networks, vol. 3361. Cambridge, MA, USA: MIT Press; 1995.
[40] Yu J, Zhang B, Kuang Z, Lin D, Fan J. iPrivacy: image privacy protection by identifying sensitive objects via deep multi-task learning. IEEE Trans Inf Forensics Secur 2017;12(5):1005–16.
[41] Yu J, Tan M, Zhang H, Tao D, Rui Y. Hierarchical deep click feature prediction for fine-grained image recognition. IEEE Trans Pattern Anal Mach Intell 2022;44(2):563–78.
[42] Zhao H-X, Magoulès F. A review on the prediction of building energy consumption. Renew Sustain Energy Rev 2012;16:3586–92.
[43] Kalogirou SA. Applications of artificial neural-networks for energy systems. Appl Energy 2000;67:17–35.
[44] Chae YT, Horesh R, Hwang Y, Lee YM. Artificial neural network model for forecasting sub-hourly electricity usage in commercial buildings. Energy Build 2016;111:184–94.
[45] Kalogirou SA, Bojic M. Artificial neural networks for the prediction of the energy consumption of a passive solar building. Energy 2000;25:479–91.
[46] Yin Y, Yu F, Xu Y, Yu L, Mu J. Network location-aware service recommendation with random walk in cyber-physical systems. Sensors 2017;17(9):2059.
[47] Subashini S, Kavitha V. A survey on security issues in service delivery models of cloud computing. J Netw Comput Appl 2011;34(1):1–11.
[48] Yin Y, et al. QoS prediction for service recommendation with deep feature learning in edge computing environment. Mobile Netw Appl 2020;25(2):391–401.
[49] Yin Y, et al. Group-wise itinerary planning in temporary mobile social network. IEEE Access 2019;7:83682–93.
[50] Yu J, Li J, Yu Z, Huang Q. Multimodal transformer with multi-view visual representation for image captioning. IEEE Trans Circuits Syst Video Technol 2019 (early access).
[51] Liu L, Cheng Y, Cai L, Zhou S, Niu Z. Deep learning based optimization in wireless network. In: Proc IEEE Int Conf Commun (ICC); May 2017. p. 1–6.
[52] Lane ND. DeepX: a software accelerator for low-power deep learning inference on mobile devices. In: Proc 15th ACM/IEEE Int Conf Inf Process Sensor Netw (IPSN), 23; 2016.
[53] Sutton RS, Barto AG. Introduction to reinforcement learning. Cambridge, MA, USA: MIT Press; 1998.
[54] Li Y, et al. Modeling and energy dynamic control for a ZEH via hybrid model-based deep reinforcement learning. Energy 2023:127627.
[55] Ester M, Kriegel H-P, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise. In: KDD-96 proceedings (AAAI); 1996. p. 226–31.
[56] Tipping ME. Bayesian inference: an introduction to principles and practice in machine learning. In: Bousquet O, von Luxburg U, Ratsch G, editors. Advanced lectures on machine learning. Springer; 2003. p. 41–62.
[57] Tipping ME. Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 2001;1:211–44.
[58] Hinton GE, Osindero S, Teh Y-W. A fast learning algorithm for deep belief nets. Neural Comput 2006;18:1527–54.
[59] Huang G-B, Zhou H, Ding X, Zhang R. Extreme learning machine for regression and multiclass classification. IEEE Trans Syst Man Cybern B Cybern 2011;42:513–29.
[60] Goodfellow I, Bengio Y, Courville A. Deep learning, vol. 1. MIT Press; 2016.
[61] Estrach JB, Szlam A, LeCun Y. Signal recovery from pooling representations. In: International conference on machine learning. PMLR; 2014. p. 307–15.
[62] Hochreiter S. Long short-term memory. Neural Comput 1997;9:1735–80.
[63] Cho K, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proc 2014 conference on empirical methods in natural language processing (EMNLP); 2014. p. 1724–34.
[64] Li Y, et al. Intrusion detection of cyber physical energy system based on multivariate ensemble classification. Energy 2021;218:119505.
[65] Sheng L, et al. Optimal communication network design of microgrids considering cyber-attacks and time-delays. IEEE Trans Smart Grid 2022;13(5):3774–85.
[66] Zhang B, et al. Hybrid data-driven method for low-carbon economic energy management strategy in electricity-gas coupled energy systems based on transformer network and deep reinforcement learning. Energy 2023;273:127183.
[67] Taheri S, Jooshaki M, Moeini-Aghtaie M. Long-term planning of integrated local energy systems using deep learning algorithms. Int J Electr Power Energy Syst 2021;129:106855.
[68] Lu Y, et al. Deep reinforcement learning based optimal scheduling of active distribution system considering distributed generation, energy storage and flexible load. Energy 2023;271:127087.
[69] Goodfellow I, et al. Generative adversarial nets. In: Ghahramani Z, Welling M, Cortes C, Lawrence ND, Weinberger KQ, editors. Advances in neural information processing systems, vol. 27. Curran Associates, Inc.; 2014. p. 2672–80.
[70] Chawla NV, et al. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 2002;16:321–57.
[71] Deb C, Eang LS, Yang J, Santamouris M. Forecasting diurnal cooling energy load for institutional buildings using artificial neural networks. Energy Build 2016;121:284–97.
[72] Li K, Xie X, Xue W, Dai X, Chen X, Yang X. A hybrid teaching-learning artificial neural network for building electrical energy consumption prediction. Energy Build 2018;174:323–34.
[73] Paudel S, et al. A relevant data selection method for energy consumption prediction of low energy building based on support vector machine. Energy Build 2017;138:240–56.
[74] Shen M, Lu Y, Wei KH, Cui Q. Prediction of household electricity consumption and effectiveness of concerted intervention strategies based on occupant behaviour and personality traits. Renew Sustain Energy Rev 2020;127:109839.
[75] Ahmad T, Chen H. Short and medium-term forecasting of cooling and heating load demand in building environment with data-mining based approaches. Energy Build 2018;166:460–76.
[76] Ciulla G, D'Amico A. Building energy performance forecasting: a multiple linear regression approach. Appl Energy 2019;253:113500.
[77] Sajjadi S, et al. Extreme learning machine for prediction of heat load in district heating systems. Energy Build 2016;122:222–7.
[78] Roy SS, Roy R, Balas VE. Estimating heating load in buildings using multivariate adaptive regression splines, extreme learning machine, a hybrid model of MARS and ELM. Renew Sustain Energy Rev 2018;82:4256–68.
[79] Cai M, Pipattanasomporn M, Rahman S. Day-ahead building-level load forecasts using deep learning vs. traditional time-series techniques. Appl Energy 2019;236:1078–88.
[80] Yu Z, Haghighat F, Fung BCM, Yoshino H. A decision tree method for building energy demand modeling. Energy Build 2010;42:1637–46.
[81] Wang Z, et al. Random forest based hourly building energy prediction. Energy Build 2018;171:11–25.
[82] Smarra F, et al. Data-driven model predictive control using random forests for building energy optimization and climate control. Appl Energy 2018;226:1252–72.
[83] Zhang L, et al. A review of machine learning in building load prediction. Appl Energy 2021;285:116452.
[84] Liu T, Tan Z, Xu C, Chen H, Li Z. Study on deep reinforcement learning techniques for building energy consumption forecasting. Energy Build 2020;208:109675.
[85] Yang N, et al. Reinforcement learning-based real-time intelligent energy management for hybrid electric vehicles in a model predictive control framework. Energy 2023;270:126971.
[86] Wu C, et al. The application of machine learning based energy management strategy in multi-mode plug-in hybrid electric vehicle, part I: twin delayed deep deterministic policy gradient algorithm design for hybrid mode. Energy 2023;262:125084.
[87] Ruan J, et al. The application of machine learning-based energy management strategy in a multi-mode plug-in hybrid electric vehicle, part II: deep deterministic policy gradient algorithm design for electric mode. Energy 2023;269:126792.
[88] Chae YT, Horesh R, Hwang Y, Lee YM. Artificial neural network model for forecasting sub-hourly electricity usage in commercial buildings. Energy Build 2016;111:184–94.
[89] Fan C, Xiao F, Wang S. Development of prediction models for next-day building energy consumption and peak power demand using data mining techniques. Appl Energy 2014;127:1–10.
[90] Bedi G, Venayagamoorthy GK, Singh R. Development of an IoT-driven building environment for prediction of electric energy consumption. IEEE Internet Things J 2020;7(6):4912–21.
[91] Mazzeo D, et al. Artificial intelligence application for the performance prediction of a clean energy community. Energy 2021;232:120999.
[92] Hettinga S, van 't Veer R, Boter J. Large scale energy labelling with models: the EU TABULA model versus machine learning with open data. Energy 2023;264:126175.

