
Available online at www.sciencedirect.com
www.elsevier.com/locate/procedia
ScienceDirect
Procedia Computer Science 197 (2022) 16–24

Sixth Information Systems International Conference (ISICO 2021)


Performance analysis of artificial neural network models for hour-ahead electric load forecasting
Lemuel Clark P. Velasco*, Karl Anthony S. Arnejo, Justine Shane S. Macarat
Premiere Research Institute of Science and Mathematics
Mindanao State University-Iligan Institute of Technology, Iligan City, 9200, The Philippines

Abstract
Supervised Artificial Neural Networks (ANN) are considered a popular machine learning framework for year-ahead, month-ahead and day-ahead electric load forecasting but are yet to be optimized for very short-term predictions such as hour-ahead forecasting granularity. This study conducted a performance analysis of ANN models for hour-ahead electric load forecasting that power utility companies can use for agile reaction in their participation in spot markets. Data preparation procedures were conducted which transformed the historical electric load records of a certain geographic area served by a power utility into appropriate forms, resulting in partitioned, represented and normalized datasets for the ANN training and testing processes. Thirty-six ANN models, all having nine input neurons and one output neuron, were evaluated, and it was found that ANN models with the Sigmoid activation function exhibited promising forecasting performance in terms of Mean Absolute Percentage Error (MAPE): the Back Propagation training algorithm with three hidden neurons had 2.85% MAPE; the Quick Propagation training algorithm with six hidden neurons had 2.91% MAPE; and the Resilient Propagation training algorithm with six hidden neurons had 3.49% MAPE. The performance analysis presented in this study shows how these ANN models were able to generate close to accurate forecasting results which power utility companies can effectively use for optimal resource management as they participate in spot markets that will require an internationally accepted forecasting tolerance error in hour-ahead electric load nomination.
© 2021 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the Sixth Information Systems International Conference.
Keywords: Artificial neural networks; performance analysis; hour-ahead load forecasting

* Corresponding author. Tel.: +63-935-868-8933; fax: +63-063-221-4071.
E-mail address: lemuelclark.velasco@g.msuiit.edu.ph

1877-0509 © 2021 The Authors. Published by Elsevier B.V.


This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the Sixth Information Systems International Conference.
10.1016/j.procs.2021.12.113
Lemuel Clark P. Velasco et al. / Procedia Computer Science 197 (2022) 16–24 17

1. Introduction

Artificial Neural Networks (ANN), which draw inspiration from the systemic analysis of human brains and biological cognitive patterns, have been widely used to forecast natural and social phenomena. Despite their growing popularity in processing both linear and non-linear historical datasets, the performance of ANN models for very short-term forecasting, which generates micro-scale predictions, is still considered limited compared with their optimal performance on long-term, medium-term and short-term forecasting [1]-[4]. Among supervised ANNs that adjust weights according to the error between the network’s output and the desired output, data modelers still face challenges in data preparation and in tuning the ANN model so that it delivers close to accurate predictions [1], [3], [5], [6]. Being a machine learning framework whose outputs can only be as good as the inputs it is fed, ANN is highly dependent on data preparation, since preprocessing techniques ensure the integrity of the data being processed. For ANNs with a multilayer perceptron architecture having feed-forward neurons across the input layer, hidden layer and output layer, determining the number of hidden neurons along with the appropriate training algorithm and activation function is a very common challenge in implementing an ANN model. Moreover, parameters such as the learning rate, momentum, epoch and maximum error also need to be determined, and these greatly affect the predictive capability of the ANN model [2], [4]. These considerations on data preparation and ANN model implementation have led to various performance analysis procedures that aim to determine the proper tuning of model variables so that ANN as a machine learning framework can deliver accurate forecasting results in long-term, medium-term, short-term and very short-term predictions.
Hour-ahead electric load forecasting is considered by most power utility practitioners as a very short-term prediction that augments the year-ahead or long-term forecast, the week-ahead to month-ahead or medium-term forecast, and the day-ahead or short-term electricity consumption forecast. With the distinct supplementary purpose of supporting agile purchasing decisions in electricity spot markets, power utility companies use machine learning frameworks like ANN to develop electric load forecasting models that help avoid under-nomination of electric load consumption, which results in power outages, and over-nomination, which could result in unutilized resources [2], [6], [7]. Hour-ahead electric load forecasts are used by power utility companies when fluctuations and shortages of electricity supply occur that were not foreseen by day-ahead and week-ahead load forecasting. Despite their evident importance in the decision-making processes of unit commitment, economic dispatch and maintenance scheduling, power utility companies have made little use of machine learning in hour-ahead load forecasting, relying instead on last-minute estimation, in contrast to their deployment of month-ahead and day-ahead electric load forecasting models [8]. Beyond the issue of model training time, power utility companies continue to explore hour-ahead load forecasting models that can meet the 5% international tolerance error and avoid additional operation costs [5], [8]. Forecasting models like ANN therefore have great potential to address the gap in precise prediction of electricity consumption that will maximize profit and optimize expenditure for producers, distributors and consumers.
A power utility company in Southern Philippines already has month-ahead, week-ahead and day-ahead electricity load forecasting models which are used as analysis tools for bilateral contracts with its supplier load generators. In preparation for the participation of Southern Philippine power utility companies in electricity spot markets, the said power utility needs to develop an ANN model that can generate hour-ahead load forecasts, since spot markets will require power utility market participants to come up with nomination reactions less than an hour ahead of the prospective time frame. The historical electric load consumption data of the implementing power utility company should undergo data preparation techniques so that the data is suited for machine learning processing [3], [7]. Additionally, the ANN model to be deployed, comprising the architecture, training algorithm and activation function along with the necessary training and testing procedures and essential parameters such as the learning rate, momentum, epoch and maximum error, should collectively produce predictions that meet the forecasting tolerance error of less than 5% [2], [3], [7]. This study aims to conduct a performance analysis of different ANN models that power utility companies can use in deciding on appropriate solutions for hour-ahead electric load forecasting. By conducting a thorough examination of ANN model performance, this research hopes to contribute to the research on load prediction by exploring the capability of ANN in very short-term
electricity load forecasting while at the same time shedding light on how power utility companies can prepare for participation in spot markets through the implementation of forecasting models that generate acceptable tolerance errors.

2. Methodology

2.1. Electric load data preparation

The historical electric load data from the power utility, consisting of a highly urbanized city’s electricity consumption from 2012 to 2014, underwent data cleaning, data partitioning and data transformation. Table 1 shows a sample of the 85,896 rows, with 96 fifteen-minute observations per day, each with its corresponding electricity load consumption expressed as kilowatts delivered (KW_DEL). Stored in .csv file format, the raw datasets from three metering points were examined for data cleaning before being aggregated to represent the city’s overall electricity load consumption. It was found that 2.6% of the entire dataset reflected power interruptions, shown as zeros in the KW_DEL column; these were corrected by taking the average of the preceding and the following day’s electric load consumption [6]. After correcting these out-of-range values caused by power outage service interruptions, which could potentially affect the ANN models’ performance, no other data errors resulting from human intervention or automated electricity recording were identified as noise.
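The zero-value correction described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `fill_outage_zeros` is hypothetical, and it assumes that the "preceding and following day" average is taken at the same 15-minute slot, 96 observations earlier and later.

```python
def fill_outage_zeros(load, per_day=96):
    """Replace zero readings with the mean of the same slot on adjacent days.

    Assumption (not stated in the paper): the adjacent-day values are taken
    at the same 15-minute slot, i.e. `per_day` observations before and after.
    """
    fixed = list(load)
    for i, value in enumerate(load):
        if value != 0:
            continue
        # Collect the valid, non-zero neighbours one day before and after.
        neighbours = [
            load[j]
            for j in (i - per_day, i + per_day)
            if 0 <= j < len(load) and load[j] != 0
        ]
        if neighbours:
            fixed[i] = sum(neighbours) / len(neighbours)
    return fixed


# Toy series with two observations per day; the third reading is an outage.
toy_load = [10.0, 20.0, 0.0, 22.0, 14.0, 24.0]
cleaned = fill_outage_zeros(toy_load, per_day=2)
```

When only one adjacent day is available, as at the start or end of the series, the sketch falls back to that single value.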

Table 1. A sample of the electric load data from the power utility.
DATE TIME KW_DEL DATE TIME KW_DEL DATE TIME KW_DEL
MM/DD/YYYY 00:15 NN.NN MM/DD/YYYY 00:45 NN.NN MM/DD/YYYY 01:15 NN.NN
MM/DD/YYYY 00:30 NN.NN MM/DD/YYYY 01:00 NN.NN MM/DD/YYYY 01:30 NN.NN

Data partitioning of the dataset was then conducted to prepare for ANN training. As an ANN data preprocessing technique, numerous studies highly recommend that the training set, which is used to adjust the neural network’s weights, be made as large as possible [5], [9]. Furthermore, the testing set, a separate set of independent data used to test the design of the neural network and confirm each ANN model’s predictive performance, should be large enough to validate the respective model’s accuracy and precision [1], [4], [6]. Fig. 1 shows that the training set contained 83,267 rows or 97% of the whole dataset while the testing set contained 2,629 rows or 3% of the whole dataset. Almost three years of the city’s electric load data was partitioned into the training set, comprising the electric load consumption records from 00:00:00 of April 01, 2012 to 07:45:00 of August 16, 2014, and the testing set, comprising the electric load consumption records from 08:00:00 of August 16, 2014 to 17:15:00 of September 12, 2014.
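The partition sizes reported above can be checked with a short sketch of a chronological split. The helper `chronological_split` is a hypothetical name, and the rows are stand-in integers rather than the actual load records; the only values taken from the study are the row counts.

```python
def chronological_split(rows, n_train):
    """Split time-ordered rows into a leading training set and a trailing test set."""
    return rows[:n_train], rows[n_train:]


TOTAL_ROWS = 85_896   # rows in the prepared dataset, as reported in the study
TRAIN_ROWS = 83_267   # rows assigned to the training set, as reported

rows = list(range(TOTAL_ROWS))  # placeholder for the time-ordered records
train, test = chronological_split(rows, TRAIN_ROWS)

train_share = round(100 * len(train) / TOTAL_ROWS)
test_share = round(100 * len(test) / TOTAL_ROWS)
```

Keeping the split chronological means the testing set contains only observations that come after every training observation, which matches how an hour-ahead forecaster would be used in operation.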

Fig. 1. (a) The training set; (b) The testing set.

An ANN cannot accept just any kind of data and still yield accurate results. For the dataset to be accepted by the ANN models, it underwent a transformation process in which the data was consolidated into forms appropriate for the ANN training and testing processes. The dataset’s identified temporal attributes, namely the time capture sequence, day of the week, weekend indicator, type of day, week number, month, date and year, need to be represented numerically in order to be processed by the ANN. Additionally, the data transformation process included data normalization, which has been shown to make the training of the network faster and more memory-efficient by transforming the dataset into a certain range so that the dataset becomes uniform and no value overpowers another, preventing instability of the data [10]. In this study, the entire dataset was transformed to fall within the range of 0.0 to 1.0, as suggested by studies on electric load forecasting, since negative values were considered outliers that can affect the accuracy of the load forecasting [4], [8], [11]. As shown in Equation 1, the dataset employed the Min-Max normalization technique where min(x) is the minimum value of an attribute and
max(x) is the maximum value of an attribute. After the ANN models’ training and testing on normalized data, the outputs from the ANN models underwent a denormalization phase employing the derived reverse formula so that the forecasted values were returned to their original scale and compared with the actual load values.

$$x' = \frac{x - \min(x)}{\max(x) - \min(x)} \qquad (1)$$
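As a worked illustration of Min-Max normalization and its reverse, the sketch below normalizes one raw load value using the minimum and maximum load reported in Table 4 and then restores it. The function names are hypothetical, and the denormalization formula is the algebraic inverse of the standard Min-Max form, which is assumed here to be the "derived reverse formula" mentioned in the text.

```python
def min_max_normalize(x, x_min, x_max):
    """Scale x into the range 0.0 to 1.0 given the attribute's min and max."""
    return (x - x_min) / (x_max - x_min)


def min_max_denormalize(x_norm, x_min, x_max):
    """Invert min_max_normalize to recover the value in its original units."""
    return x_norm * (x_max - x_min) + x_min


# Minimum and maximum raw load as listed in Table 4.
LOAD_MIN, LOAD_MAX = 689.145, 39243.663

# First sample raw load from Table 4; Table 5 lists it as about 0.6617.
normalized = min_max_normalize(26199.175, LOAD_MIN, LOAD_MAX)
restored = min_max_denormalize(normalized, LOAD_MIN, LOAD_MAX)
```

The same pair of functions applies to every input attribute, each with its own minimum and maximum.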

2.2. ANN model design

Following the data preprocessing phase, ANN models were formulated by identifying the architecture and the model design, followed by training and testing of the formulated models. A multilayer perceptron (MLP) neural network was used in this study because electricity load and the factors that affect it exhibit a non-linear relationship, owing to varying consumption behavior which can create spikes [12], [13]. Having three layers, namely the input layer, hidden layer and output layer, each containing a set of neurons, MLP with its feed-forward architecture has been considered best suited for electricity load data due to its capability to solve non-linear problems, resulting in optimal ANN model performance for electricity load forecasting [4]-[6], [8], [13]. The number of neurons in the input layer was determined according to the number of factors that affect the prediction of the next hour load, along with the 15-minute electric load data. For the hidden layer, a single hidden layer is enough to calculate the next hour load forecast since more than one hidden layer can complicate the system and increase training time [5], [6]. For the number of hidden layer neurons, there is no uniform theoretical approach, but researchers have come up with varied techniques for calculating it [14], [15]. Table 2 shows the techniques from different authors that this study used to determine the number of neurons in the single hidden layer. With the expectation of electricity spot markets that the power utility should react to at least 15 minutes of load nomination, a single output neuron in the output layer was identified to be enough to generate the next 15-minute load.

Table 2. Techniques for identifying the number of hidden layer neurons.
AUTHOR(S) HIDDEN LAYER NEURONS IDENTIFICATION
Researcher’s Rule of thumb Within the range of the number of input and output neurons
S., Param (2015) [⅔(input layer size)] + output layer size
S., Karsoliya (2012) Input layer size – 2
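The three rules in Table 2 can be evaluated for the architecture used in this study, which has nine input neurons and one output neuron. The sketch below is illustrative only: the function name is hypothetical, and rounding the two-thirds term to the nearest integer is an assumption made for input sizes that are not multiples of three.

```python
def hidden_neuron_candidates(n_inputs, n_outputs):
    """Apply the hidden neuron rules listed in Table 2."""
    low, high = sorted((n_inputs, n_outputs))
    return {
        # Rule of thumb: anywhere between the output and input layer sizes.
        "rule_of_thumb_range": (low, high),
        # Param (2015): two thirds of the input layer size plus the output size.
        # Rounding is an assumption for sizes not divisible by three.
        "param_2015": round(2 * n_inputs / 3) + n_outputs,
        # Karsoliya (2012): input layer size minus two.
        "karsoliya_2012": n_inputs - 2,
    }


candidates = hidden_neuron_candidates(n_inputs=9, n_outputs=1)
```

For nine inputs and one output both formula-based rules give 7, and the rule of thumb range of 1 to 9 contains the hidden layer sizes of 3 to 8 that the study evaluates.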

After the ANN architecture was determined, the MLP was designed with its needed configuration, namely the training algorithm and activation function, along with parameters such as the learning rate, momentum, epoch and maximum error, so that the ANN could run optimally. The researchers formulated different ANN models with different combinations of training algorithms and activation functions, followed by training and testing. The learning rate was scaled to determine the learning pace of the network, while the maximum error was set to determine whether the system has to iterate or stop [13]. If the error after an iteration is greater than the defined maximum error value, the network iterates again until the error is lower than the defined maximum error [4], [13], [14]. The epoch was identified by the researchers as the value that determines how many iterations the network goes through before completing the training, while momentum specifies to what degree the weight changes of the previous iteration should be applied to the current iteration [1]. Activation functions were then attached to their appropriate layers and used to scale the data output from a layer. The identified activation functions were assigned to the hidden and output layers only, since activation functions affect data coming from the previous layer and there is no layer before the input layer [2], [10], [13]. Since the forecasted electricity load does not go to zero or below, the activation functions used in this study should also cater to datasets that do not go to zero or below. Thus, the neural network was trained using the training set with different combinations of learning algorithm, learning rate, momentum, hidden neurons and activation functions whose outputs do not go to zero or below.
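The roles of the learning rate, momentum, epoch limit and maximum error can be illustrated with a generic gradient descent loop. This is a minimal sketch on a one-parameter toy problem, not the training procedure of the study: the function names, the quadratic error function and the toy learning rate of 0.1 are all assumptions chosen so that the example converges quickly.

```python
def momentum_update(weight, gradient, previous_delta, learning_rate, momentum):
    """Return the updated weight and the weight change that produced it."""
    # The learning rate scales the current gradient step; momentum carries
    # over a fraction of the previous iteration's weight change.
    delta = -learning_rate * gradient + momentum * previous_delta
    return weight + delta, delta


def train_until(error_fn, grad_fn, weight, learning_rate, momentum,
                max_epochs, max_error):
    """Iterate until the error is at most max_error or max_epochs is reached."""
    delta = 0.0
    for epoch in range(1, max_epochs + 1):
        weight, delta = momentum_update(
            weight, grad_fn(weight), delta, learning_rate, momentum
        )
        if error_fn(weight) <= max_error:
            return weight, epoch  # stop: error is within the allowed maximum
    return weight, max_epochs     # stop: epoch limit reached


# Toy problem (hypothetical): minimise (w - 2)^2 starting from w = 0.
final_weight, epochs_used = train_until(
    error_fn=lambda w: (w - 2.0) ** 2,
    grad_fn=lambda w: 2.0 * (w - 2.0),
    weight=0.0,
    learning_rate=0.1,   # toy value, not the learning rate used in the study
    momentum=0.7,
    max_epochs=20_000,
    max_error=0.001,
)
```

The loop stops on whichever condition is met first, which mirrors the description above: training iterates while the error exceeds the maximum error, and the epoch value bounds the number of iterations.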

In order to analyze the performance of the different ANN models in hour-ahead electric load forecasting, the researchers tried out different combinations of the configuration and parameters, followed by the calculation of the respective error measurements. After being designed, the ANN models were trained using the training set and then tested, with the formulated ANN models evaluated on the testing set for their predictive capability. As suggested by researchers on electric load forecasting, Mean Absolute Percentage Error (MAPE) was used as each model’s error measurement, as shown in Equation 2, where PA is the actual load, PF is the forecasted load, and N is the number of data points [1], [8], [14]. Expressed as a percentage, an ANN model’s MAPE shows the forecasting error between the predicted and the actual load, which simplifies the performance analysis as well as the evaluation of the model’s compliance with the 5% internationally accepted tolerance error used by both electricity spot markets and power utility companies in assessing the acceptability of an electric load forecasting model. The electric load results of the top 3 best performing ANN models were then denormalized to give a visual comparison of the actual and the predicted electric load.

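$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{P_{A,i} - P_{F,i}}{P_{A,i}} \right| \times 100 \qquad (2)$$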
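A minimal sketch of the MAPE calculation and the 5% tolerance check is given below. The function name and the sample load values are hypothetical and serve only to show the arithmetic; the formula is the standard MAPE definition using the actual load, the forecasted load and the number of data points.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent, of forecast against actual."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal in length")
    # Actual loads must be non-zero, which holds once outage zeros are corrected.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


TOLERANCE_PERCENT = 5.0  # internationally accepted tolerance cited in the text

# Hypothetical actual and forecasted loads in kilowatts.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [95.0, 210.0, 400.0]

error = mape(actual_load, forecast_load)
within_tolerance = error < TOLERANCE_PERCENT
```

Here the individual percentage errors are 5%, 5% and 0%, so the mean falls below the 5% tolerance.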

3. Results and discussion

3.1. Electric load data preparation results

As a result of the data preprocessing phases, the 85,896 rows of data underwent a transformation process so that the dataset was consolidated into forms appropriate for the ANN processes. The researchers first converted the time attribute into integers before conducting further transformation since the raw time cannot be normalized directly. Table 3 shows the representation of the dataset’s time attribute in integer format, wherein the numerical value of 1 was assigned to the day’s 00:00 time, incrementing every 15 minutes until 23:45, which was represented by the numerical value of 96.

Table 3. Time attribute in integer format.


TIME VALUE TIME VALUE TIME VALUE TIME VALUE TIME VALUE
00:00 1 00:30 3 01:00 5 01:30 7 … …
00:15 2 00:45 4 01:15 6 01:45 8 23:45 96
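The time-to-integer rule described in the text, where 00:00 maps to 1 and the value increments every 15 minutes up to 96 at 23:45, can be sketched as follows. The function name `time_to_index` is hypothetical.

```python
def time_to_index(hhmm):
    """Map an 'HH:MM' time on a 15-minute grid to an integer from 1 to 96."""
    hours, minutes = (int(part) for part in hhmm.split(":"))
    if not 0 <= hours < 24 or minutes not in (0, 15, 30, 45):
        raise ValueError(f"not a 15-minute time slot: {hhmm!r}")
    # Four slots per hour, numbered from 1 at midnight.
    return hours * 4 + minutes // 15 + 1


first_slot = time_to_index("00:00")
last_slot = time_to_index("23:45")
```

The resulting integers have a minimum of 1 and a maximum of 96, which are the bounds used when the time attribute is later normalized.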

Table 4 shows the raw data of the study’s dataset. With the dataset normalized using Min-Max normalization, which produces transformed outputs within the range of 0 to 1, it was found that normalizing the electricity load alone produced non-converging results during training, evaluation and prediction. Thus, the other factors affecting the electricity load were also normalized using Min-Max normalization; the resulting normalized data, with values lying only between 0 and 1, are shown in Table 5.

Table 4. The raw data of the dataset.


RAW LOAD | RAW DAY OF THE WEEK | RAW WEEKEND (0) / WEEKDAY (1) | RAW TYPE OF DAY: Holiday (1) / Regular Day (0) | RAW WEEK NUMBER | RAW TIME | RAW DAY | RAW MONTH | RAW YEAR
MIN 689.145, MAX 39243.663 | MIN 1, MAX 7 | MIN 0, MAX 1 | MIN 0, MAX 1 | MIN 1, MAX 52 | MIN 1, MAX 96 | MIN 1, MAX 31 | MIN 1, MAX 12 | MIN 2012, MAX 2014
… | … | … | … | … | … | … | … | …
26199.175 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 2012
26133.892 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2013
… | … | … | … | … | … | … | … | …

A study contends that the input features should be represented in a binary format of N number of bits [16]. However,
as all the input features should be normalized for convergence, a problem arose in determining the Min and Max if
they were converted into binary structures since the computer system used cannot determine if the inputted data for
Min-Max normalization is in binary format or not. Unless the input feature has only two choices, like the weekend or weekday indicator or the type of day, the input features were, and should be, converted into a counting sequence of numbers so that their Min and Max can easily be determined for normalization.

Table 5. The dataset’s normalized data.


NORMALIZED LOAD | NORMALIZED DAY OF THE WEEK | NORMALIZED WEEKEND (0) / WEEKDAY (1) | NORMALIZED TYPE OF DAY: Holiday (1) / Regular Day (0) | NORMALIZED WEEK NUMBER | NORMALIZED TIME | NORMALIZED DAY | NORMALIZED MONTH | NORMALIZED YEAR
… | … | … | … | … | … | … | … | …
0.661661235 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0
0.659967971 | 0.166666667 | 1 | 0 | 0.019607843 | 0.010526316 | 0.033333333 | 0.090909091 | 0.5
… | … | … | … | … | … | … | … | …

3.2. ANN model design results

The performance analysis results show the predictive performance of the designed ANN models implemented in an MLP feed-forward architecture. Fig. 2 shows the block diagram of the ANN models with nine input neurons consisting of the 15-minute electric load, day of the week, weekend/weekday indicator, type of day (holiday or regular day), week number, time, day, month and year [2], [8]. As recommended by authors, the training algorithms used in this study were those that need exploration on performance optimization: Resilient Propagation, Back Propagation and Quick Propagation [2], [6], [8], [13]. The activation functions were identified as Sigmoid and Gaussian due to their ability to process values from 0 to 1 in an MLP feed-forward architecture.

Fig. 2. The block diagram of the ANN models.
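For reference, the two activation functions can be sketched in their common textbook forms. These definitions are an assumption: the paper does not state the exact formulas or the library used, and the Gaussian in particular may be parameterized differently in the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic sigmoid; output lies strictly between 0 and 1."""
    # Branch on the sign of x so that math.exp never receives a large
    # positive argument, which would raise OverflowError.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gaussian(x):
    """Unit Gaussian bump exp(-x^2); peaks at 1 for x = 0 (assumed form)."""
    return math.exp(-x * x)


sigmoid_at_zero = sigmoid(0.0)
gaussian_at_zero = gaussian(0.0)
```

Both functions produce outputs bounded between 0 and 1, which is consistent with targets normalized to the range of 0 to 1. Unlike the sigmoid, the Gaussian is not monotonic: its output falls toward zero for inputs of large magnitude in either direction.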

This study set the epoch value to 20,000 with a maximum error of 0.001 since the researchers observed that training for all models finished in fewer than 20,000 epochs with an error rate of 0.001. Table 6 shows the results for the learning rate and momentum, determined by trial and error, showing that convergence happened at a learning rate of 0.00001 and a momentum of 0.7. Since learning rates of 0.01, 0.001 and 0.0001 and momentum values of less than 0.7 produced non-converging results, the learning rate was set to 0.00001 and the momentum to 0.7 for all training algorithms that require a learning rate and momentum.

Table 6. Results of the ANN Models’ learning rate and momentum.


LEARNING RATE MOMENTUM RESULTS LEARNING RATE MOMENTUM RESULTS
0.01 0.4 Does not converge 0.0001 0.6 Does not converge
0.001 0.5 Does not converge 0.00001 0.7 Converging

The researchers formulated different combinations of training algorithms and activation functions along with the identified hidden neurons since there are a number of training algorithms and activation functions that iteratively modify the weights of the network to minimize the overall error between the actual data and the predicted output [10], [17].
Each of the ANN models underwent training using the training set and testing using the testing set, with the prediction error calculated using MAPE. Table 7 shows the ANN models’ performance for each combination of training algorithm and activation function, with the number of hidden neurons ranging from 3 to 8.

Table 7. ANN Models performance.


MODEL TRAINING ACTIVATION HIDDEN MAPE MODEL TRAINING ACTIVATION HIDDEN MAPE
NAME ALGORITHM FUNCTION NEURONS NAME ALGORITHM FUNCTION NEURONS
1 Resilient Sigmoid 3 3.73285214 19 Resilient Sigmoid 6 3.49373548
Propagation 8 Propagation 1
2 Resilient Gaussian 3 (Did not 20 Resilient Gaussian 6 (Did not
Propagation converge) Propagation converge)
3 Quick Sigmoid 3 3.32983361 21 Quick Sigmoid 6 2.91436199
Propagation 5 Propagation 2
4 Quick Gaussian 3 (NaN 22 Quick Gaussian 6 (NaN
Propagation Error) Propagation Error)
5 Back Propagation Sigmoid 3 2.85099359 23 Back Propagation Sigmoid 6 3.26811718
7
6 Back Propagation Gaussian 3 (NaN 24 Back Propagation Gaussian 6 (NaN
Error) Error)
7 Resilient Sigmoid 4 4.52546024 25 Resilient Sigmoid 7 4.09569427
Propagation 5 Propagation 3
8 Resilient Gaussian 4 (Did not 26 Resilient Gaussian 7 (Did not
Propagation converge) Propagation converge)
9 Quick Sigmoid 4 3.52533982 27 Quick Sigmoid 7 3.17050174
Propagation 7 Propagation 4
10 Quick Gaussian 4 (NaN 28 Quick Gaussian 7 (NaN
Propagation Error) Propagation Error)
11 Back Propagation Sigmoid 4 3.08950898 29 Back Propagation Sigmoid 7 3.21281982
8
12 Back Propagation Gaussian 4 (NaN 30 Back Propagation Gaussian 7 (NaN
Error) Error)
13 Resilient Sigmoid 5 3.72627754 31 Resilient Sigmoid 8 3.67044548
Propagation 5 Propagation 9
14 Resilient Gaussian 5 (Did not 32 Resilient Gaussian 8 (Did not
Propagation converge) Propagation converge)
15 Quick Sigmoid 5 3.18155688 33 Quick Sigmoid 8 3.02432275
Propagation 4 Propagation 9
16 Quick Gaussian 5 (NaN 34 Quick Gaussian 8 (NaN
Propagation Error) Propagation Error)
17 Back Propagation Sigmoid 5 3.23938874 35 Back Propagation Sigmoid 8 3.32428119
5 8
18 Back Propagation Gaussian 5 (NaN 36 Back Propagation Gaussian 8 (NaN
Error) Error)
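The numbering of the thirty-six models in Table 7 follows a regular pattern that can be reproduced with a small grid enumeration. This is a sketch for orientation only: the function name is hypothetical, and the ordering (hidden neurons outermost, then training algorithm, then activation function) is inferred from the model numbers listed in the table.

```python
from itertools import product

TRAINING_ALGORITHMS = ("Resilient Propagation", "Quick Propagation", "Back Propagation")
ACTIVATION_FUNCTIONS = ("Sigmoid", "Gaussian")
HIDDEN_NEURONS = range(3, 9)  # 3 to 8 hidden neurons


def build_model_grid():
    """Number every combination in the order used by Table 7."""
    combos = product(HIDDEN_NEURONS, TRAINING_ALGORITHMS, ACTIVATION_FUNCTIONS)
    return {
        number: {"algorithm": algorithm, "activation": activation, "hidden": hidden}
        for number, (hidden, algorithm, activation) in enumerate(combos, start=1)
    }


grid = build_model_grid()
gaussian_models = sorted(n for n, m in grid.items() if m["activation"] == "Gaussian")
```

Three training algorithms, two activation functions and six hidden layer sizes give the thirty-six models, and every even-numbered model uses the Gaussian activation function.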

The results showed that some ANN models produced non-converging or Not a Number (NaN) error results, all of which employed Gaussian activation. ANN models 2, 8, 14, 20, 26 and 32, all using the Resilient Propagation training algorithm and Gaussian activation function with varying hidden neurons, generated non-converging results. Similarly, ANN models 4, 6, 10, 12, 16, 18, 22, 24, 28, 30, 34 and 36, all using either the Quick Propagation or the Back Propagation training algorithm and Gaussian activation function with varying hidden neurons, generated NaN error results. The researchers observed that Sigmoid activation works best for hour-ahead electric load forecasting while Gaussian activation performed poorly, producing a non-converging training error for Resilient Propagation and a NaN error for Back Propagation and Quick Propagation [10]. This could be the result of an error computation being pushed too close to infinity, a log(0) or similar, since Gaussian activation is by nature expensive to use, with the error output exceeding the maximum bits. The ANN models with Sigmoid activation function all exhibited MAPE results below the 5% internationally accepted tolerance error for electricity forecasting, with Model 5, composed of the Back Propagation training algorithm and Sigmoid activation function, as the best performing model at 2.85% MAPE and Model 7, composed of the Resilient Propagation training algorithm and Sigmoid activation function, as the worst performing model at 4.53% MAPE. This is supported by studies in which Sigmoid activation resulted in superior performance in electricity load forecasting [6], [7]. The researchers also observed that ANN models with Resilient Propagation and Quick Propagation training algorithms followed the same trend of MAPE results with respect to the number of hidden neurons, where the error lowers a little when the hidden neurons were set to 8 after an increase when the hidden neurons were set to 7. Moreover, ANN models with Resilient Propagation and Quick Propagation training algorithms showed the lowest error when the number of hidden neurons was set to 6 and the highest error when it was set to 4. These observations showed that models with 5 hidden neurons or fewer are prone to higher error, with 6 as the optimum number of hidden neurons when the training algorithm is Resilient Propagation or Quick Propagation. ANN models with the Back Propagation training algorithm seem to
deviate from the trends of ANN models employing Resilient Propagation and Quick Propagation, with a gradually
increasing error rate starting from the lowest number of hidden neurons. Fig. 3 shows the top 3 best performing ANN
models, those with the lowest MAPE, implying a close to accurate electric load forecast.
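
As a hedged illustration of how a NaN error of the kind described above can arise, the following Python sketch shows generic double-precision floating-point behavior. It is not a reproduction of the training runs in this study. The function names, the Gaussian form exp(-x^2), and the input values are assumptions chosen only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid, 1 / (1 + e^-x). Used here only with a non-negative input."""
    return 1.0 / (1.0 + math.exp(-x))


def gaussian(x: float) -> float:
    """Gaussian activation, assumed here to have the form e^(-x^2)."""
    return math.exp(-x * x)


# Hypothetical large net input to a neuron.
net_input = 30.0

# e^(-900) is smaller than the smallest positive double, so it underflows to 0.0.
g = gaussian(net_input)

# The sigmoid of the same input stays strictly between 0 and 1.
s = sigmoid(net_input)

# A value pushed past the double-precision range becomes infinity.
overflowed = 1e308 * 10.0

# Infinity times zero, or infinity minus infinity, is undefined and yields NaN.
nan_from_product = overflowed * g
nan_from_difference = overflowed - overflowed

print("gaussian:", g)
print("sigmoid:", s)
print("overflowed:", overflowed)
print("inf * 0.0:", nan_from_product)
print("inf - inf:", nan_from_difference)
```

Once a single NaN enters an error or weight computation, every later arithmetic result that depends on it is also NaN, which is consistent with a training run that reports a NaN error from then on.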

[Fig. 3 panels a, b, and c. Vertical axis: Kilowatt Delivered. Horizontal axis: 15-Minute Data Observations.]

Fig. 3. Comparison of the actual and predicted electric load for (a) model 5, (b) model 21, (c) model 19.

The top 3 best performing ANN models all employed the Sigmoid activation function: Model 5, using the Back
Propagation training algorithm with 3 hidden neurons, generated a MAPE of 2.85%; Model 21, using the Quick
Propagation training algorithm with 6 hidden neurons, generated a MAPE of 2.91%; and Model 19, using the Resilient
Propagation training algorithm with 6 hidden neurons, generated a MAPE of 3.49%. Despite having the same activation
function, the top 3 best performing ANN models used three different training algorithms. Based on Models 5, 21 and
19, the researchers observed that the prediction fluctuates at some points in time while performing well at others.
The researchers also noticed that the networks performed well from midnight up to sunrise, at around 0.5 electric
load. This could be because consumers are mostly asleep during these hours, while in the succeeding hours people are
awake and consuming electricity at higher, fluctuating rates. This observation is supported by studies in which
networks performed well at certain times of the day and poorly at others [18-20]. Additionally, the researchers
observed a notable fluctuation in the predictive performance of Model 19, which uses Resilient Propagation, every
time the load is at its lowest. Given the results, the researchers propose that 6 is the optimal number
of hidden neurons for this hour-ahead electricity load forecasting situation.
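
The MAPE comparison above can be sketched in Python as follows. This is a minimal sketch that assumes the standard definition of MAPE (the mean of |actual - predicted| / |actual|, expressed in percent). The function name `mape` and the toy series are hypothetical; the per-model values and the 5% tolerance are the ones reported in the text.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Raises ValueError for empty or mismatched inputs, or when an actual
    value is zero (the percentage error is undefined in that case).
    """
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("actual values must be non-zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Tolerance error for electricity forecasting cited in the text.
TOLERANCE_PERCENT = 5.0

# MAPE values reported in the text for the Sigmoid models it names.
reported_mape = {
    "Model 5": 2.85,   # Back Propagation, 3 hidden neurons
    "Model 21": 2.91,  # Quick Propagation, 6 hidden neurons
    "Model 19": 3.49,  # Resilient Propagation, 6 hidden neurons
    "Model 7": 4.53,   # Resilient Propagation, worst Sigmoid model
}

# Rank the models from lowest to highest MAPE and check the tolerance.
ranked = sorted(reported_mape.items(), key=lambda item: item[1])
within_tolerance = {name: value < TOLERANCE_PERCENT for name, value in reported_mape.items()}

# Hypothetical toy series, used only to exercise the function.
toy_actual = [100.0, 200.0]
toy_predicted = [90.0, 210.0]
toy_mape = mape(toy_actual, toy_predicted)  # errors of 10% and 5% average to about 7.5%

for name, value in ranked:
    print(f"{name}: {value:.2f}% (within tolerance: {within_tolerance[name]})")
print(f"toy MAPE: {toy_mape:.2f}%")
```

Because MAPE divides by the actual load, it is best computed on values that are never zero, which is one reason to evaluate it on the de-normalized load rather than on a series scaled to start at 0.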

4. Conclusion and recommendations

In order for power utility companies to decide on an appropriate electric load forecasting solution for close to
accurate forecasting at very short-term time granularity, this study conducted a performance analysis of
different ANN models for hour-ahead electric load forecasting. In data preparation, the selected input features and
the normalization results proved to be effective, since they resulted in a low and acceptable MAPE. The study also
conducted Min-Max normalization in data transformation and found that all of the input features, rather than the
electricity load data alone, should be normalized in order for the neural network to converge. During ANN model
design, six hidden neurons were found to be the optimal number, and it was further discovered that the Gaussian
activation function does not work well in any of the formulated neural network models, while the Sigmoid activation
function works efficiently. In evaluating the designed models for their predictive capability, it was found that Back
Propagation has the lowest MAPE of 2.85%, followed by Quick Propagation with a MAPE of 2.91%, and lastly
Resilient Propagation with a MAPE of 3.49%. These forecasting results delivered by the ANN models comply with
the internationally accepted forecasting tolerance error that governs the acceptability of hour-ahead nominations
among spot markets.
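
The point that every input feature, and not only the load column, is Min-Max normalized can be sketched as follows. This is a minimal Python sketch under stated assumptions: the target range [0, 1], the function names, and the feature names `load_kw` and `hour_of_day` are hypothetical and are not taken from the study's dataset.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values linearly to the range [0, 1].

    A constant column has no spread, so it is mapped to all zeros.
    """
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lowest) / span for v in values]


def normalize_all_features(table: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Apply Min-Max normalization to every column, not only the load column."""
    return {name: min_max_normalize(column) for name, column in table.items()}


# Hypothetical input table with a load column and one additional feature.
raw = {
    "load_kw": [1200.0, 1500.0, 1800.0],
    "hour_of_day": [0.0, 12.0, 23.0],
}

normalized = normalize_all_features(raw)

for name, column in normalized.items():
    print(name, column)
```

When new observations arrive, the minimum and maximum learned from the training data would normally be reused so that old and new inputs share the same scale.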
The researchers recommend further studies to explore other normalization methods, training algorithms, and
activation functions, along with their corresponding ANN parameters, in the conduct of ANN model performance
analysis. Inclusion in the models of methodologies like fuzzy logic and adaptive network-based fuzzy inference system
is also something that future researchers can explore. The researchers also recommend using a machine with high
computing power to cope with the large resource requirement that neural networks typically have, as this study
was conducted using a desktop computer. For further performance optimization of the ANN models, the researchers
recommend a separate study on determining how frequently a network should be retrained when new data are appended to
the dataset, since hour-ahead electric load forecasting delivers predictions at a very short-term time granularity.
Through a thorough examination of the performance of various ANN models, this study shows the capability of ANN in very
short-term electric load forecasting, so that power utility companies can prepare for agile reaction in their
participation in the spot market and deliver efficient and optimal service to their stakeholders.

Acknowledgement

The authors would like to thank the Mindanao State University-Iligan Institute of Technology
(MSU-IIT) Office of the Vice Chancellor for Research and Extension, through the Premiere Research Institute in
Science and Mathematics (PRISM), for their support and assistance in this study. This work is supported by MSU-IIT as
an internally funded research under the PRISM-Applied Mathematics and Statistics Group.

References

[1] Laouafi, A., Mourad Mordjaoui, and Djalel Dib. (2015) “One-hour ahead electric load forecasting using neuro-fuzzy system in a parallel
approach.” Computational Intelligence Applications in Modeling and Control: 95-121, Springer Cham.
[2] Guan, C., P. B. Luh, L. D. Michel, Y. Wang, and P. B. Friedland. (2013) “Very short-term load forecasting: wavelet neural networks with data
pre-filtering.” IEEE Transactions on Power Systems 28(1): 30-41.
[3] Liu, X., Zijun Zhang, and Zhe Song. (2020) “A comparative study of the data-driven day-ahead hourly provincial load forecasting methods:
From classical data mining to deep learning.” Renewable and Sustainable Energy Reviews 119: 109632.
[4] Senjyu, T., H. Takara, K. Uezato, and T. Funabashi. (2002) “One-hour-ahead load forecasting using neural network.” IEEE Transactions on
Power Systems 17(1): 113-118.
[5] Benlembarek, K., M. T. Khadir, and F. Benabbas. (2010) “A Web Based System for Short-Term Forecasting of Algerian Electricity Load Using
Artificial Neural Network.” Journal of Automation and Systems Engineering 4(2): 94-100.
[6] Adepoju, G. A., S. O. A. Ogunjuyigbe, and K. O. Alawode. (2007) “Application of neural network to load forecasting in Nigerian electrical
power system.” The Pacific Journal of Science and Technology 8(1): 68-72.
[7] Neusser, L., L. N. Canha, A. Abaide, and M. Finger. (2012) “Very short-term load forecast for demand side management in absence of historical
data.” In International Conference on Renewable Energies and Power Quality, Santiago de Compostela (Spain).
[8] Al-Shareef, A. J., E. A. Mohamed, and E. Al-Judaibi. (2008) “One hour ahead load forecasting using artificial neural network for the western
area of Saudi Arabia.” International Journal of Electrical Systems Science and Engineering 1(1): 35-40.
[9] Adamowski, J. F. (2008) “Development of a short-term river flood forecasting method for snowmelt driven floods based on wavelet and cross-
wavelet analysis.” Journal of Hydrology 353(3-4): 247-266.
[10] Heaton, J. (2011) “Programming Neural Networks with Encog3 in Java.” Heaton Research, Inc.
[11] Henderi, H., Tri Wahyuningsih, and Efana Rahwanto. (2021) “Comparison of Min-Max normalization and Z-Score Normalization
in the K-nearest neighbor (kNN) Algorithm to Test the Accuracy of Types of Breast Cancer.” International Journal of Informatics and
Information System 4(1): 13-20.
[12] Kouhi, S., F. Keynia, and S. N. Ravadanegh. (2014) “A new short-term load forecast method based on neuro-evolutionary algorithm and
chaotic feature selection.” International Journal of Electrical Power & Energy Systems 62: 862-867.
[13] Mandal, P., T. Senjyu, and T. Funabashi. (2006) “Neural networks approach to forecast several hour ahead electricity prices and loads in
deregulated market.” Energy Conversion and Management 47(15-16): 2128-2142.
[14] Param, S., Md. Minhaz Chowdhury, Damian Lampl, Pranav Dass, and Kendall E. Nygard. (2016) “Energy Demand Prediction Using Neural
Networks.” 28th International Conference on Computer Applications in Industry and Engineering, Volume 1, CA, USA.
[15] Karsoliya, S. (2012) “Approximating number of hidden layer neurons in multiple hidden layer BPNN architecture.” International Journal of
Engineering Trends and Technology 3(6): 714-717.
[16] Ceperic, E., V. Ceperic, and A. Baric. (2013) “A strategy for short-term load forecasting by support vector regression machines.” IEEE
Transactions on Power Systems 28(4): 4356-4364.
[17] Sibi, P., S. A. Jones, and P. Siddarth. (2013) “Analysis of different activation functions using back propagation neural networks.” Journal of
Theoretical and Applied Information Technology 47(3): 1264-1268.
[18] Taylor, J. W. (2008) “An evaluation of methods for very short-term load forecasting using minute-by-minute British data.” International
Journal of Forecasting 24(4): 645-658.
[19] Karlik, B., and A. V. Olgac. (2011) “Performance analysis of various activation functions in generalized MLP architectures of neural networks.”
International Journal of Artificial Intelligence and Expert Systems 1(4): 111-122.
[20] Wang, J., Q. Zheng, Jingyi Wang, and P. Yan. (2020) “Hour-Ahead Photovoltaic Power Forecasting Using an Analog Plus Neural Network
Ensemble Method.” Energies 13(12): 3259.
