
Electrical Load Forecasting using Artificial Neural Network
ECE 668 - Project
Student Name: Farhana Hossain Priana ID#: 20862161

Abstract-- Load forecasting is one of the major challenges in the development of power-supply projects and the demand-supply balance of a power grid. It acts as the foundation of power market operation and planning. Improving the accuracy of load forecasting is critical in improving the utilization ratio of power equipment and reducing energy consumption. Load-forecasting methods are divided into two categories based on the forecasting techniques. One category is the statistical approach, such as time-series and regression, which is based on the analysis of the intrinsic attributes of past data. The second approach is established on correlated factors such as meteorological factors, and modelling those factors requires auxiliary data. Knowledge of the future load demand of the system is the first requirement in optimal planning and operation, and the load forecasting horizon can be month(s) or year(s) and even day(s) or hour(s). Accordingly, forecasts are divided into long, medium and short-term load forecasting. Long and medium-term forecasts are generally used to determine the capacity of generation and transmission, for expansion planning and annual maintenance scheduling, etc. Lastly, a short-term forecast is required for daily control and scheduling of the power system, etc. This project presents a complete analysis of the performance of a load forecasting model based on the second approach. In the original paper [1], the authors introduce a methodology to evaluate a simple Artificial Neural Network with a backpropagation algorithm and the performance of the model under different case studies, to emulate real-life utility requirements. Thus, the main purpose of this document is to analyse the methodology proposed in [1], reproduce the results, and evaluate its effectiveness in real distribution systems. The models used to reproduce the results are developed in MATLAB/NN TOOL and EXCEL.

Index Terms-- Load forecasting, Artificial Neural Networks, Non-linear load-temperature relation, Simple ANN system, MATLAB/NN Tool.

1. Introduction
Load forecasting is one of the essential parts of electric power utilities. Determining the electricity demand for the next hour, day and year is of great interest in order to allocate finite resources and serve the rising demand and population. Distribution power system planning and operation rely heavily on accurate load forecasting. With ever-growing demand, it is becoming increasingly important to forecast load accurately, as it is crucial for several planning decisions. Planning decisions can be divided into three broad categories: long-term, medium-term and short-term decisions. Long-term decisions help in laying out new distribution system infrastructure and expanding old infrastructure. Medium-term decisions facilitate electricity market decisions such as buying or selling electricity from neighbouring networks and, additionally, planning the fuel required in the near term. Lastly, short-term decisions help in day-to-day or weekly operational decisions such as operational planning and unit commitment. These can be further grouped in terms of the forecasting time horizon: long-term forecasting has a horizon of 1-20 years, mid-term spreads over 1 week to about 12 weeks, and short-term usually covers an hour ahead to a week. Therefore, at the distribution level, load forecasting is a significant input for the design and proper operation of the distribution network.

Various techniques have been developed and studies have been conducted by researchers for different forecasting horizons. Out of many, some key factors that may affect the load forecasting models and methods are time horizon, historical data, load density, population growth and alternative energy source availability, etc. The empirical process of forecasting depends on the factors which determine the precision of the outcome. Thus, careful consideration is required to choose the factors for the different models. Researchers have found that time and historical data are the most important factors. It has been observed that the load curve has a time-of-day property; it varies with the hour of the day, day of the week, week of the month and month of the season. Therefore, current data along with historical data contribute to the accuracy of the forecast, and the load cannot be inferred without reliable data.

Additionally, the weather has a noticeable effect on consumer behaviour, which eventually impacts load demand. When the weather deviates from the average temperature, load demand increases, and sudden weather changes may result in over/underestimation. Therefore, weather factors, which may include temperature, dew point temperature, humidity and wind index, etc., are needed to account for the weather dependence of the load in the forecasting models.

For short-term forecasting, several approaches have been proposed. It is necessary to have proper knowledge of the factors affecting the load, as the interaction between the factors and the demand is the primary concern and the demand differs at any time of the day. Traditional approaches which are computationally economical, such as regression and interpolation, have been used for decades. However, these approaches may not always produce acceptable results. On the other hand, methods which take advantage of complex algorithms are often computationally heavy. Some of the techniques widely proposed by researchers are time series analysis, regression analysis, fuzzy logic, artificial neural networks, support vector machine algorithms, genetic algorithms, and hybrid methods, etc.

Traditional time series analysis suffers from numerical instability, and inaccuracy occurs because it does not incorporate weather information, although load demand has a strong correlation with temperature and other weather variables. Similarly, conventional regression models assume a linear relationship between the weather variables and the load demand. However, the functional relationship between them is dynamic and depends on spatial-temporal variation.

Expert-based systems have been proposed in the past that use expert knowledge of operators, which is very difficult to articulate accurately, and have the disadvantage of expert
dependency.

The application of artificial intelligence techniques to the load forecasting problem is very popular among researchers. Since quantitative forecasting extracts patterns from past observed occurrences and extrapolates them into future events, Artificial Neural Networks have proved to perform well with good accuracy. Unlike previous techniques such as the time series model, state-space and knowledge-based approaches, ANNs learn the patterns from input-output data of the utilities’ system and produce their own non-linear model for future forecasting.

Recently, due to advancements in computing technology, computational intelligence (CI) methods are widely used in state-of-the-art works. Such a method intuitively teaches itself a particular task by trial and error from the available past data and forecasts accurately in the future based on that learning. CI has the unique ability of autonomous operation without requiring any complex mathematical formulation and/or quantitative correlations between input and output. Another recent approach is to use hybrid AI and deep learning algorithms [1] [2] [3] [4].

Accuracy and reliability of short-term load forecasts have significant effects on power system operations, due to the sensitivity of economic operation and the control of the power systems to forecasting errors. Substantial errors can lead to either overly conservative or risky scheduling, for example resulting in the start-up of excessive units and unnecessary reserves, or failure to meet spinning and operating reserve requirements, etc., which eventually ends with heavy economic penalties. Therefore, this paper simulates and evaluates a short-term load forecasting model to critically analyze its performance and forecasting capability.

This document is divided into 6 sections to reproduce the results obtained in [1]. First, a literature review of the forecasting models and the state of the art is presented. Then, the modelling of an artificial neural network (ANN) for short-term load forecasting is developed using MATLAB. Once the model is developed, it is used to simulate the same scenarios described in [1]. Finally, conclusions and suggestions for improvements are stated.

2. Literature Review
Load forecasting has always been one of the vital parts of efficient power system planning and operation. Several electric power companies have been forecasting load power based on conventional methods for several years. However, since the relationship between load demand and the factors influencing load demand is complex and nonlinear, it is challenging to characterize its nonlinearity by using conventional methods. In this section, a critical literature review of existing and emerging forecasting models developed and proposed by researchers to overcome the limitations of the conventional methods is presented.

A. Time Series
Time series models are based on the assumption that the load curve can be decomposed into a time series signal with daily, weekly and seasonal cyclicities. However, these cyclicities do not produce exact forecast results, but rather an average. Therefore, analysis of the difference between the actual and the forecasted values is critical in the time series approach to producing more accurate results. Some of the methods used by researchers to study this stochastic process are discussed below:

1. Autoregressive Model (AR)
The autoregressive model is one of the most fundamental models, where the present value of the time series is formulated in terms of past values and random noise in a linear combination.
2. Moving-Average Model (MA)
In this method, the present value is expressed linearly in terms of the present and past values of the noise, where the noise is derived from the forecast errors once the actual observation is recorded.
3. Autoregressive Moving-Average Model (ARMA)
ARMA is the hybrid of AR and MA. Here, the present value is composed of its previous values at past periods along with the current noise and the noise at past periods. Unlike MA, this model depends on both the past values of the series itself and the noise.
4. Autoregressive Integrated Moving-Average (ARIMA)
ARIMA is a generalized version of ARMA. A time series defined as an AR, MA, or ARMA process is called a stationary process: the mean of the series and the covariances among its observations do not change with time. A non-stationary series can be transformed into a stationary process by differencing, and the differenced series can then be modelled as AR, MA or ARMA to produce an ARIMA time series [1][5].

In [6] a modified ARIMA method was presented to forecast hourly load. The model was improved with utility knowledge on estimation, weather data (temperature), and load data. The operators’ knowledge acted as an initial forecast, which was then merged with temperature and load data in ARIMA, which resulted in enhanced forecasting. Simulation results compared with conventional ARIMA highlighted the better performance of the modified ARIMA. Furthermore, the proposed method was able to successfully forecast daily peak load with another ARIMA model. Overall, the processing time of the modified ARIMA was approximately equal to that of ARIMA but less than that of ANN. Moreover, simulations for the same data with the operators’ method, ARIMA and an error backpropagation ANN model were carried out, which revealed that the modified ARIMA outperformed the others. The modified ARIMA obtained on average 2% error compared to the operators’ 4 to 4.97% error.

The disadvantage of time series models such as ARIMA is that they extrapolate the history of load data using a linear relationship; therefore, whenever there is a drastic change in the weather, they cannot model the non-linear behaviour. A possible solution to this shortcoming is the use of Transfer Function modelling.

[7] tried to improve the reflection of the load-temperature relationship by modifying the transfer function of previous work in [8]. By subjecting the temperature to a non-linear transformation before including it in the transfer function model, [7] aimed to overcome the limitations of previous models, which did not incorporate the load-temperature relationship. At first, a standard transfer function was developed. Autocorrelation and partial autocorrelation functions were used to identify the time series model tentatively and, once it was identified, model parameters were approximated using the maximum likelihood estimation (MLE) method, which is a commonly used method for the estimation of parameters in statistical models, based on the principle that the chosen parameter values maximize the likelihood of producing the observations. An optimal differencing scheme was chosen after several tests to produce a
stationary process from the data collected, followed by model order identification. Finally, they developed four ARIMA and four transfer function models to forecast the load of each season of the year. Results demonstrated that the proposed model performed better than the ARIMA models in terms of average percentage error. Thus, it can be concluded that explicit inclusion of temperature influence in the transfer function model improves the forecasting model. Lastly, to include the non-linear relationship between load and temperature, it was fitted to a third-order polynomial and used as an input to the transfer function model. It can be inferred from the results that the modified transfer function showed improvement given that a simple procedure was adopted. However, it did not bring any significant improvement to the forecast error, which is evident from the mean absolute error of the models, i.e., 5.25% for ARIMA, 3.93% for Transfer Function and 4.02% for the Non-linear approach for summer.

B. Regression
In regression, the key steps are selecting the explanatory variables and estimating the coefficients of the variables. In most conventional models, temperature is widely used, and other variables such as humidity, wind velocity, etc., when used, sometimes produce better results.

[9] demonstrated a linear regression-based model for short-term load forecasting. Their work was an improvement of an existing algorithm that forecasted the peak load and hourly load for the present day and the following day. A multiple-regression (MR) model and an ARIMA model were used in the process of forecasting the peak load. A weighted average model then used the results from the MR and ARIMA models to determine the weighted peak load. An hourly load model, which took the forecasted peak and historical hourly load as inputs, was used to forecast the 24 hourly loads. Lastly, a final peak load was forecasted from the hourly forecasts. A notable improvement was made for holidays, as the previous algorithm had a high relative error for holidays because the SLF model did not use any special model for holidays. It was also observed by [9] that the SLF model was affected when the weather changes, especially during seasons such as spring and fall, as the weather model used for summer did not take into account the effects of heating loads due to cold fronts; likewise, the weather model for winter did not model the cooling effects on loads due to warm fronts. Therefore, [9] modelled the cold effects on heating loads by heating degree functions and the warm effects on cooling loads by cooling degree functions. A binary holiday model was also implemented to account for the effect holidays have on the load demand. The ARIMA model was omitted from the algorithm, as it was observed that the regression and ARIMA models produced similar results. The weighted average of their results gave better accuracy than the single models, but the increase in accuracy was small relative to the added complexity of the model; thus, the ARIMA model was eliminated. Therefore, the final model was formulated as an initial daily peak forecast using a regression model, which was then used to produce initial hourly forecasts using another regression model. To capture autocorrelation of random effects, the initial daily peak forecast, the maximum of the initial hourly forecasts and the most recent initial peak forecast error, along with an exponential smoothing of errors, were used in yet another regression model to finally yield an adjusted peak forecast. The resulting model proved to be accurate and robust; most importantly, it was adaptive to changing conditions.

C. Knowledge-based expert system
A knowledge-based expert system is a computer program that has a knowledge base and an inference engine and can act as an expert. Researchers have introduced this method in load forecasting using knowledge from experts in the field. The knowledge is used to articulate facts and rules in the knowledge base, and reasoning is performed via the inference engine.

[3] presented two expert-based algorithms for 1 to 6 hours ahead and 24-hour ahead forecasts. The analogical approach had the advantage of a reduced online database requirement and the expertise of system operators. Intuitive relationships were developed between load and weather parameters: dry bulb and wet bulb temperatures, wind direction, wind speed and relative humidity. Four different sets of rules were required for the different seasons (spring, summer, autumn and winter). To account for the weather change from one season to another, two forecasts were run, one for the present season and one for the coming season. Expert systems must find variables that are strongly and weakly correlated with the load to help in forming the rules. Another important factor is the study of the load curves to intuitively account for the seasonal influence. Furthermore, a day-of-the-week impact was also considered to fine-tune the seasonality. The absolute average error was in the range of 0.869 to 1.218% for the one to six hours ahead forecast and from 2.429 to 3.3% for the 24-hour ahead forecast algorithm. Although the expert-based system may have yielded acceptable results, it can be concluded that it is not superior to the regression-based and time-series approaches.

In a later work [10], a generalized knowledge-based load forecasting technique was presented. A pairwise comparison technique was used to prioritize categorical variables, which made it generalized and applicable to different areas. A pairwise comparison lets one choose the judgement with maximum likelihood. A site-specific feature also ensured adaptability with less effort. The generalized model gave the advantage of having a set of weighting parameters that can be updated if necessary. It was more robust than [3], as it did not depend only on the exact weather forecast, and it was naturally updatable, as it did not depend on any present model for the load-temperature relationship because the site-specific weather-load relationship parameter was ignored and only site-independent features were included. The model was tested for four different sites, and results demonstrated that average errors were higher when there was a weather transition, especially for fall and spring, and for weekends. Overall average error varied between 1.22% and 3.36%.

Both the methods proposed in [3] and [10] were able to produce acceptable forecast results. However, the knowledge-based system has limitations in emulating the acquired knowledge in arriving at clear quantitative solutions concerning a problem. This is rather difficult and cumbersome and often leads to inconsistent rules.

D. Artificial Neural Networks
Forecasting using artificial intelligence has been proven to provide promising results. Artificial Neural Networks (ANN) are the foundation of artificial intelligence. An ANN is described as a computing system that is composed of highly interconnected processing elements known as neurons that process information by their dynamic state response to external inputs. ANN design consists of 3 layers: an input layer, hidden
layer(s), and output layer. Neurons within the layers are interconnected by weights, and the output from every neuron is multiplied by its corresponding weight before reaching the inputs of the neurons within the next layer. Every neuron has an activation function that is employed to work out the output of the neuron from its input. All weighted inputs to a hidden layer neuron are summed to form the net input to that neuron's activation function. Similarly, the sum of all weighted inputs to an output neuron forms that neuron's net input. Net inputs are computed for every neuron within the hidden layer and the output layer. The output of a neuron within the hidden layer and the output layer is generally computed by the sigmoid activation function. At the start of the learning process, the interconnection weights are initialized to random values according to the defined initialization technique, and are later adjusted according to a group of input-output pairs until the error is negligible [2]. A feedforward network is the most common ANN model, where information travels in the forward direction through the neural network, from the input nodes through the hidden layers to the output nodes [11].

A backpropagation algorithm, which is most commonly used to train networks, was proposed in [11] as a load forecasting method together with a nonlinear load model, where the parameters of the nonlinear load model were estimated using the backpropagation algorithm. Loads were separated into weekday and weekend patterns, with further classification of weekend patterns. The weekends were grouped into Saturdays, 1st and 3rd Sundays, 2nd, 4th and 5th Sundays, 1st and 3rd Mondays and lastly 2nd, 4th and 5th Mondays. These classifications were based on their historical data and therefore may not produce accurate results for other datasets. Two methods were developed and implemented for one-day ahead load forecasting; test results showed a satisfactory outcome with 2% forecasting error. The first method was a static approach where the 24-hour load was forecasted simultaneously. The present load was formulated such that it is affected by the past loads and the pattern in which the current load is included, incorporated in a nonlinear function with a weight vector that represents the load model. For weekday load forecasting, the three latest weekdays were used to adjust the weight vector and the next two latest weekdays as inputs. For weekend forecasting, the five grouped weekend load patterns were used to adjust the weights. The second method proposed was a dynamic approach where the load was forecasted sequentially using the previous time forecasts. It was based on the strong correlation they found for the peaks at the same hours independent of the day of the week. The same models as in method 1 were used to adjust and estimate the weights. Another difference between the two methods was that method 1 has 42 forecasts and method 2 only one. Computational studies showed that, for comparable accuracy, method 2 needed less input. The backpropagation algorithm demonstrated robust performance in estimating the weights of the nonlinear load model.

However, [11] did not take into account any weather variables, which could have significantly improved the model performance.

In [12] an ANN forecasting method was proposed which considers a correction approach. It attempted to overcome the drawbacks of the methods which forecast the 24-hour-ahead or next-day peak load by using forecasted weather information such as temperature. The problem arises when the temperature fluctuates rapidly on the forecast day and hence the load changes significantly, increasing forecast error as forecasting was based on similarity. In the proposed method, similar days were selected based on a Euclidean norm with weighted factors; the smaller the Euclidean norm, the better the evaluation of similar days. Load power was used as a variable of the Euclidean norm to overcome the uncertainties that come with using the maximum and minimum temperature of the forecast day, as is usually done. This provided an approach that is unaffected by rapid temperature changes yet achieves successful similarity selection. A three-layer network with feedforward connections was used, which has 9 inputs and 20 hidden units resulting in 1 output, the hourly forecasted load. The correction came from the neural network itself, where the NN utilized deviation data of load and temperature for its learning. The load curve was also forecasted using a neural network that adopted online learning and used feedback data and forecasted errors for its learning. Simulation results were compared for different case studies: forecasting the load curve using a simple regression model with past temperature and load data, forecasting the load curve using only similar day data, forecasting the load curve using the neural network with off-line learning and only similar day data as learning data, and lastly the proposed method. Forecasting ability was the best for the proposed method compared to the other case study methods in terms of mean absolute percentage error. Furthermore, it outperformed the others remarkably in months when there were seasonal changes. In conclusion, if the forecast day were changed, the neural network would need to be re-trained to obtain the relationship between load and temperature around that forecast day and to be able to output the correction corresponding to the rapid temperature changes.

In another study [2], a multi-layered feedforward ANN and a fuzzy set-based classification algorithm were analyzed. Fuzzy sets were employed to classify the hour-wise data into various classes of weather conditions. ANNs created non-linear models for each class of data. Dry-bulb temperature and relative humidity were subdivided into categories such as cold, warm, hot and dry, normal and humid respectively, which gave combinations like cold-dry, cold-normal, cold-humid, etc. In total 48 combinations were formulated from classes of temperature and 6 classes of humidity. As in [11], a multi-layered feedforward network using a backpropagation learning algorithm was employed. An ANN was assigned to each combination class to learn and perform hourly forecasting. This let each ANN learn from data that correspond to its class and a few data that are less related but have some degree of fit to that class. For forecasting, only one class of weather condition was identified by calculating the membership value, to be able to select the proper ANN. The ANN had 14 input variables and a single output. This method of using forecasted weather information at each hour, classifying it into one of the 48 classes by the fuzzy-based classifier and then automatically selecting the corresponding ANN of that class to forecast the load proved to be capable. Results validated the method, as its mean absolute percentage error was below 2%.

E. State-of-the-art forecasting approaches
Recent developments in load forecasting and state-of-the-art solutions and their results are discussed in this part.

Load forecasting techniques have progressed with time and demand. Now it is not only constrained to weather factors but to load diversity as well, which can also be caused by weather diversity for an area. With increasing load demand for power
system covering a large geographical area, conventional reviewers assessed it to be over-parameterized sometimes and
approaches and a single model may not provide satisfactory additionally along with the fact that the size of neural networks
results all the time. In a power system with a large geographical increases quickly with the increase in the numbers of inputs,
area as demand diversity and weather plays a key role in hidden nodes and/or hidden layers, the main critic is the
forecasting accuracy, a state-of-the-art approach is utilizing overfitting issue of neural networks. Current advancements in
multi-region load forecasting. The biggest challenge of this neural networks and artificial intelligence have shown terrific
method is to optimally identify areas into regions. A short-term impacts in fields like computer vision and speech recognition.
multi-region load forecasting system based on support vector One of the advancements in the use of deep neural networks,
regression (SVR) was proposed in [13]. First weather and load rather than implementing fixed shallow structures of NNs with
diversity within the selected test system were studied and then hand-crafted characteristics as inputs, is now possible to
the forecasting model was developed. Among many weather incorporate one’s understanding of various tasks. Deep neural
indexes temperature was found to be a dominant factor and an networks are highly adaptable and efficient due to the different
approximate piecewise linear relationship of correlation building blocks including long short-term memory and
between load and temperature for cold and hot days were convolutional neural networks (CNN). The application of deep
computed. This work took into consideration of load diversity neural networks to short-term load forecasting is relatively a
as a reference to the level that different demand patterns affect new research area [20].
the overall system demand. A coincidence factor was In [20] state-of-the-art deep neural network structures was
formulated to define the level of diversity for a group of loads implemented. A basic neural network structure was used for
which revealed that there was a presence of load diversity forecasting for one hour. It did not include external feature
among different zones. The optimal region partition was extraction and/or selection algorithms, rather only raw data of
performed such that the aggregated load forecasting error was minimized, and merging of areas occurred only when the aggregated error could be lowered based on the partition model. If merging was not possible, then a new area was created and a similar searching scheme was applied again until all areas were grouped. A load forecasting model for each new region was also established using an SVR model, which supported a more general and flexible regression. Good forecasting was also achieved because different models were used for regular days and special days, which included weekends, holidays, and anomalous days. Test results showed that the overall forecasting error increases when partitioning goes beyond the optimal solution.
In the earlier sections, we found that the nonstationary characteristics of the load play a critical role in the accuracy of forecasting. In another recent work, Wavelet Decomposition (WD) was investigated to improve load forecast accuracy in combination with a gray neural network. WD has been widely used due to its capability of reducing the nonstationary characteristics of a series and thus improving the accuracy [14]. However, using WD to process the time series has two issues: there is no theoretical basis for determining the WD level, and it cannot predict high-frequency components. Therefore, a method of selecting the WD series based on an augmented Dickey-Fuller (ADF) test was proposed in [14], and a second-order gray forecasting model was implemented, with a neural network mapping used to build it. The ADF test is an improved technique of the unit root test. The gray forecast model offered a simple model that required less historical data yet gave high forecast accuracy, could be conveniently calculated, and did not need to consider the distribution. Simulation results supported the claim that WD can improve forecast accuracy. The model was compared with two other models: WD-Elman, which had five layers of WD combined with the Elman neural network, and a gray neural network model, GNNM (2,1), which had two eigenvalues representing the static and dynamic changes. The model performed better than the two, with a mean absolute percentage error of 2.4% compared to 3.94% and 3.23%, respectively.
Short-term load forecasting with ANNs has long been one of the recognized solutions, and different types and variants of neural networks have been proposed and applied in [1] [2] [11] [12] [15] [16] [17] [18] [19]. However, some researchers and
historical loads, and weather information, temperature are used as input variables. Inputs were processed in a low-level basic structure by many fully connected layers to generate preliminary forecasts, which were then passed through a deep residual network. To capture periodic and unusual temporal characteristics of the load curve, one-hot codes for season, weekday/weekend distinction, and holiday/non-holiday distinction were added. By stacking residual blocks, a deep residual network (ResNet) was constructed and added to the neural network to improve the 24-hour forecast. To further enhance the learning capability, some modifications were made. A modified deep residual network (ResNetPlus) was implemented by adding a series of side residual blocks, where the input of the side residual blocks is the output of the first residual block on the main path. The performance of the modified model was evaluated against existing models that had a high dependency on external feature extraction and selection, and/or hyper-parameter optimization. Results showed excellent performance of the proposed method, with a mean absolute percentage error of 1.665% compared to other models in the range of 1.8% to 2.6%. As complicated feature extraction and selection techniques and additional weather information such as humidity and wind speed were not used, this work can serve as a good benchmark for comparison.
More works have been proposed, such as a novel hybrid model with a new signal decomposition in [21], where the load series was decomposed into regular low-frequency components using improved empirical mode decomposition (IEMD), which mitigates the limitations of conventional empirical mode decomposition (EMD). Correlation analysis using T-Copula was carried out for exogenous variables, to incorporate them and compensate for the information loss during signal decomposition.
After going through IEMD and T-Copula correlation analysis, the data was applied to a deep belief network (DBN) for forecasting the future load. A DBN is believed to overcome the shortcomings of traditional neural network-based models. The DBN teaches itself to probabilistically reconstruct the input data and then detect feature patterns. Simulation results showed higher load forecasting accuracy: the MAPE value of the proposed model was lower than that of the other comparative models. The MAPE value of the proposed model was reduced by 21.19%.
In a similar work in [22], an EMD-DNN was executed to perform short-term load forecasting. The EMD was applied to decompose the load curve into components, an approach comparable to what was performed in [21]. These components were then passed through a convolutional neural network (CNN). The output from the CNN and the raw data were then fed into the long short-term memory (LSTM) layer. This whole process was the extraction model for multimodal spatial-temporal features. Features from this layer, along with multimodal spatial-temporal features extracted from electricity price, day and hour information, and loads of similar days, were used as supplementary features for the forecasting layer. The forecasting layer had a multilayer fully connected neural network consisting of two fully connected (FC) layers.
It was observed that the proposed method could deliver effective and accurate results. However, it was targeted at weekdays, as electricity consumption is largest during weekdays.
For many decades load forecasting has been studied using various methods such as time series, regression, machine learning, etc. Recently, the trend is to develop an ensemble model or hybrid model to improve the accuracy of load forecasting. Moreover, in recent years the field of Artificial Intelligence and Machine Learning has experienced a rise in popularity due to the advancements of computing technologies. Various advanced AI and ML techniques such as deep learning, reinforcement learning, extreme learning machine and transfer learning have been adopted. All these methods have their advantages and disadvantages; a model is evaluated based on its forecasting accuracy, generalization ability, anomaly adaptation and computational complexity. An extensive literature review of different load forecasting methods highlighted that ensemble or hybrid approaches tend to perform better.

3. Modelling and Simulation
In [1] an ANN is presented that is trained with a backpropagation algorithm. The backpropagation algorithm is also called the generalized Delta rule, where the activation function is a sigmoid function. The network's weights are adapted using the gradient descent algorithm. A momentum term was introduced to improve the convergence characteristics.
To model an ANN similar to the one proposed in [1], a feedforward neural network was modelled using the Neural Network (NN) toolbox in MATLAB. The built-in training function “traingdm” was used, which employs a gradient descent algorithm with momentum backpropagation to update the weight and bias values.
Since the exact dataset used in [1] was not available, a North American dataset from another paper [20] was used for simulation; it had load and temperature data from 2003 to 2014. Hourly temperature and load data from [20] in the interval of Nov.1, 2013 – Jan.31, 2014, were used to train the ANN and test its performance.
Table 2 shows the five sets used to test the neural network. Each set contains at least 6 normal days; the focus was on normal weekdays according to [1], with no holidays or weekends. Test data were not used to train the neural network.
Three case studies were performed to analyse the performance of the proposed ANN:
 Case 1: Peak Load of the Day
 Case 2: Total Load of the Day
 Case 3: Hourly Load
Peak load at day d is defined as the maximum of the 24 hourly loads, and total load at day d is defined as the summation of L over 24 hours for each day:
$$\text{Peak load}(d) = \max\{L(1,d), \ldots, L(24,d)\}$$
$$\text{Total load}(d) = \sum_{h=1}^{24} L(h,d)$$
where L(h, d) is the load at hour h on day d.
The neural network structure is shown in Figure 1. The size of the hidden layers and the number of hidden layers were chosen from among several structures as those that gave the best network performance in terms of accuracy.

Figure 1: Neural Network Structure

Throughout the simulation, a single hidden layer was used with various numbers of hidden neurons. From several simulation runs the optimal hidden neuron size was selected. The number of inputs depended on the cases, but the output was always one. Figure 2 shows the training information from the MATLAB NN tool, where the training algorithm used is Gradient Descent with Momentum and training stopped when it reached minimum gradient.

Figure 2: NN training

Modelling starts by defining the inputs and outputs, followed by dividing them into training and test sets. Then the network and the topology along with the transfer function are defined, where the input, output and hidden layers, activation function and learning are specified and configured. Lastly, the training algorithm is specified, training begins, and the model is tested. At this point, based on the simulation results, training parameters such as learning rate and momentum for the gradient descent algorithm were varied to find the optimal settings.
Table 1 summarizes the network architecture for the three case studies. Gradient descent with momentum was chosen as the initial algorithm for all the case studies; however, it failed to produce accurate results for hourly load forecasts. The optimal training parameters for gradient descent with momentum were observed to be learning rate = 0.01 and momentum = 0.5. Therefore, for the hourly case, a Levenberg-Marquardt backpropagation algorithm, “trainlm”, was used, which updates the weight and bias values according to the Levenberg-Marquardt optimization. To evaluate the resulting ANN’s performance, the following percentage error measure is used:
$$\text{error} = \frac{|\text{actual load} - \text{forecasted load}|}{\text{actual load}} \times 100$$
Case Study  Inputs                                                                              Output              Training function               Hidden neurons
1           Average temperature; Peak temperature; Lowest temperature                           Peak Load at day k  Gradient descent with momentum  5
2           Average temperature; Peak temperature; Lowest temperature                           Total Load at day k Gradient descent with momentum  5
3           Hour of predicted load; Load at hour k-2 & k-1; Temperature at hour k-2 & k-1       Load at hour k      Levenberg Marquardt             10
Table 1: Summary of Network Architecture

4. Results & Discussion
Table 2 shows five sets used to test the neural network. Each set contains at least 6 weekdays. The test sets were not used in the training stage.

Sets    Test data from
Set 1   11/9/2003 - 11/18/2003
Set 2   11/18/2003 - 11/25/2003
Set 3   12/08/2003 - 12/15/2003
Set 4   12/27/2003 - 01/04/2004
Set 5   01/23/2004 - 01/30/2004
Table 2: Test Data Sets

Table 3 shows the error (%) of the peak load of each day in the test sets. The average error for all 5 sets is 3.83%.

Days    Set 1  Set 2  Set 3  Set 4  Set 5
Day 1   2.27   3.85   4.78   2.77   2.96
Day 2   4.50   3.82   5.11   1.67   3.66
Day 3   3.20   3.92   1.12   6.51   1.43
Day 4   2.65   8.74   6.50   3.17   1.34
Day 5   1.50   3.14   2.14   8.60   4.94
Day 6   3.85   1.87   2.24   9.88   3.29
Avg.    2.92   4.22   3.65   5.43   2.94
Table 3: Error (%) of Peak Load Forecasting

Table 4 shows the error (%) of total load of each day in test sets. The average error for all 5 sets is 5.74%.

Days    Set 1  Set 2  Set 3  Set 4  Set 5
Day 1   4.67   8.03   3.03   6.79   3.80
Day 2   5.86   8.18   1.33   6.31   4.57
Day 3   8.48   9.20   1.60   9.57   1.29
Day 4   5.20   10.03  1.47   6.70   1.83
Day 5   6.09   7.78   3.75   14.08  6.47
Day 6   8.03   7.17   4.63   1.02   5.34
Avg.    6.39   8.40   2.63   7.41   3.88
Table 4: Error (%) of Total Load Forecasting

Table 5 shows the error (%) of the hourly forecast of each day in the test sets. The average error for all 5 sets is found to be 1.60 %. Note that each day’s result is averaged over a 24-hour period.

Days    Set 1  Set 2  Set 3  Set 4  Set 5
Day 1   2.64   1.33   1.50   1.50   1.24
Day 2   1.24   1.57   1.43   1.60   0.98
Day 3   2.18   1.50   1.59   1.93   0.84
Day 4   2.66   1.35   1.76   1.87   1.00
Day 5   2.25   1.58   1.62   1.79   1.31
Day 6   2.04   1.40   1.87   1.50   0.89
Avg.    2.17   1.45   1.63   1.70   1.04
Table 5: Error (%) of Hourly Load Forecasting with One Hour Lead Time

To find the effect of the lead time on the ANN load forecasting model, set 3 was used whose performance in Table 5 was the nearest to the average following the method proposed in [1]. The lead time was varied from 1 to 24 hours with a 3-hour interval. The topology of the ANN was as follows:
Inputs:
 k, hour of predicted load
 Load at lead time 24 for hour k {L(24,k)}
 Temperature at lead time 24 for hour k {T(24,k)}
Output:
 Predicted load at hour k {L(k)}
Hidden Neurons:
 10 neurons
Training Function:
 Levenberg Marquardt

Figure 3 and 4 show examples of the hourly actual and forecasted loads with one-hour and 24-hour lead times. Figure 5 shows the average errors (%) of the forecasted loads with different lead hours for test set 3.

Figure 3: Hourly Load Forecasting and Actual Load
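The lead-time input topology (hour of the predicted load, plus load and temperature at the chosen lead time) can be illustrated by building the training pairs from two hourly series. The sketch below is an assumption-laden illustration rather than the project's MATLAB code: the function name is hypothetical, "load at lead time n for hour k" is read as the load observed n hours before hour k, and index 0 of each series is assumed to be hour 1 of the first day.

```python
from typing import List, Sequence, Tuple


def build_lead_time_samples(
    load: Sequence[float],
    temperature: Sequence[float],
    lead: int,
) -> Tuple[List[List[float]], List[float]]:
    """Build (inputs, targets) for forecasting the load at hour k.

    Each input row is [hour of day of k, load at k - lead, temperature at k - lead];
    the matching target is the load at hour k.
    """
    if len(load) != len(temperature):
        raise ValueError("load and temperature must have the same length")
    if lead < 1:
        raise ValueError("lead must be at least 1 hour")

    inputs: List[List[float]] = []
    targets: List[float] = []
    for k in range(lead, len(load)):
        # Assumption: index 0 is hour 1 of the first day, so hours run 1..24.
        hour_of_day = k % 24 + 1
        inputs.append([hour_of_day, load[k - lead], temperature[k - lead]])
        targets.append(load[k])
    return inputs, targets


if __name__ == "__main__":
    # Two days of made-up hourly data, for illustration only.
    demo_load = [1000.0 + 5.0 * i for i in range(48)]
    demo_temp = [-2.0 + 0.1 * i for i in range(48)]
    x, y = build_lead_time_samples(demo_load, demo_temp, lead=24)
    print(len(x), "samples; first input row:", x[0], "first target:", y[0])
```

Calling the function once per lead time gives one training set per lead, so a separate network can be trained and scored for each lead hour.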

Figure 4: Hourly Load Forecasting and Actual Load

Figure 5: Average errors (%) of the forecasted loads with different lead hours for test set 3

Figure 6: Hourly Load Forecast for set 3 with actual Load

Figure 7: Hourly Load Forecast for set 1 with actual Load

Figure 8: Hourly Load Forecast for set 2 with actual Load

Figure 9: Hourly Load Forecast for set 4 with actual Load


Figure 10: Hourly Load Forecast for set 5 with actual Load

In [1], the historical temperature was used in the training stage and the predicted temperature was used in the test stage. However, since the predicted temperature was not available due to access constraints, the temperature from the dataset was applied in the test stage.
From Figure 5, the error gradually increases as the lead hour grows, which holds up to 18 hours of lead time, similar to what was observed in [1]. One of the causes of this error pattern can be the periodicity of the load and temperature patterns, as can be seen in Figures 6-10 and as also assumed in [1]. Even though they are not exactly the same as those of the previous day, the temperature and system load are very similar to those of the previous day.
However, compared to [1], the average error is much higher and varies between 2% and 13%, compared to 1% to 3% in [1]. An exact comparison cannot be made, as the datasets and years of the forecasts are not the same. The one-hour lead error results compare quite favourably with those obtained in [1].
Comparing our case study simulation results to [1], it can be concluded that the NN modelled in this paper shows satisfactory results in terms of peak, total and hourly load forecasting. The average errors are 3.83%, 5.74% and 1.60%, compared to 2.04%, 1.68% and 1.40% in [1]. The hourly forecast error is close to that of [1] and the peak error is of a comparable order, while the total load error is noticeably higher. In addition, it was noticed that forecasts for weekends in the selected 5 datasets were highly inaccurate when simulated with Gradient Descent with Momentum. Though the paper [1] explicitly stated that the focus was on weekday forecasts, it was found that weekend forecast errors dropped significantly when training with Levenberg-Marquardt. Figures 6-10 graph the hourly load forecasts, including weekends and holidays, and from the graphs it can be seen that an accurate forecast was produced.
The replicated load forecasting technique uses only temperature data. On a different dataset it gave acceptable results in simulation, but it did not deliver the same accuracy in certain cases. To improve the accuracy and extrapolation of the NN model and to avoid overfitting, data pre-processing can play a key role. There was no mention of any data pre-processing in [1]; techniques such as scaling and normalizing can lead to improvement. These can often provide faster training and less chance of getting trapped in a local optimum. Moreover, incorporating further weather variables such as humidity, cloud cover and dew level, using more data spread over various years and seasons, and taking different human activities into consideration, as human activities contribute significantly to load demand, can further improve the performance; this has been seen to be effective in the approaches discussed in the literature review.

5. Conclusion
In this paper, an electric load forecasting methodology using an artificial neural network has been presented and implemented in MATLAB/ NN TOOL. It was based on the topology used in [1], and the performance of this method was comparable.
The results reinforced that the ANN is suitable for incorporating historical load and temperature patterns into the future load pattern. In order to reproduce all the results shown in [1], two different NN models were developed, according to the study cases presented in [1].
To forecast future loads from the trained ANN, [1] suggested using recent load and temperature data in addition to the predicted future temperature. A similar ANN was used here without the predicted temperature, due to its unavailability, and the performance for each study case replicated the expected results from the simulation analysis.
MATLAB and EXCEL were used to run the simulations, generate the required results, and compute and evaluate performance.
In general, neural networks require training data well spread in the feature space to provide exceptionally accurate results [1].
In conclusion, as discussed earlier, it is of paramount importance that a short-term load forecasting model delivers accurate forecasts for the decision-making of power system operators. Uniform performance during all seasons, and especially during times of unusual and unexpected weather conditions, is expected of a robust forecasting model.

6. References

[1] S. S. Madani, “Electric load forecasting using an artificial neural network,” Middle East J. Sci. Res., vol. 18, no. 3, pp. 396–400, 2013, doi: 10.5829/idosi.mejsr.2013.18.3.11682.
[2] M. Daneshdoost, “Neural network with fuzzy set-based classification for short-term load forecasting,” IEEE Trans. Power Syst., vol. 13, no. 4, pp. 1386–1391, 1998, doi: 10.1109/59.736281.
[3] S. Rahman and R. Bhatnagar, “An expert system based algorithm for short term load forecast,” IEEE Trans. Power Syst., vol. 3, no. 2, pp. 392–399, 1988, doi: 10.1109/59.192889.
[4] S. H. Sunny and D. R. O. Y. Dipta, “A Comprehensive Review of the Load Forecasting Techniques Using Single and Hybrid Predictive Models,” pp. 134911–134939, 2020, doi: 10.1109/ACCESS.2020.3010702.
[5] I. Moghram and S. Rahman, “Analysis and Evaluation of Five Short-Term Load Forecasting Techniques,” IEEE Power Eng. Rev., vol. 9, no. 11, pp. 42–43, 1989, doi: 10.1109/MPER.1989.4310383.
[6] N. Amjady, “Short-term hourly load forecasting using time-series modeling with peak load estimation capability,” IEEE Trans. Power Syst.,
vol. 16, no. 4, pp. 798–805, 2001, doi: 10.1109/59.962429.
[7] M. T. Hagan and S. M. Behr, “The Time Series Approach to Short Term Load Forecasting,” Power, vol. 0, no. 3, pp. 785–791, 1987.
[8] M. Hagan and R. Klein, “On-Line Maximum Likelihood Estimation for Load Forecasting,” no. 5, pp. 711–715, 1978.
[9] A. D. Papalexopoulos and T. C. Hesterberg, “A regression-based approach to short-term system load forecasting,” Forecast, pp. 1535–1550, 1990.
[10] S. Member, S. Member, and E. Systems, “A Generalized Knowledge-Based Short-Term Load-Forecasting Technique,” vol. 8, no. 2, pp. 508–514, 1993.
[11] Y. Shimakura et al., “Short-term load forecasting using an artificial neural network,” Proc. 2nd Int. Forum Appl. Neural Networks to Power Syst. ANNPS 1993, vol. 7, no. 1, pp. 233–238, 1993, doi: 10.1109/ANN.1993.264285.
[12] T. Senjyu, H. Takara, K. Uezato, and T. Funabashi, “One-Hour-Ahead Load Forecasting,” vol. 17, no. 1, pp. 113–118, 2002.
[13] S. Fan, K. Methaprayoon, and W. Lee, “Multiregion Load Forecasting for System With Large Geographical Area,” vol. 45, no. 4, pp. 1452–1459, 2009.
[14] B. Li, J. Zhang, Y. U. He, and Y. Wang, “Short-Term Load-Forecasting Method Based on Wavelet Decomposition With Second-Order Gray Neural Network Model Combined With ADF Test,” vol. 5, 2017.
[15] S. A. Villalba and C. Á. Bel, “Hybrid demand model for load estimation and short term load forecasting in distribution electric systems,” IEEE Trans. Power Deliv., vol. 15, no. 2, pp. 764–769, 2000, doi: 10.1109/61.853017.
[16] H. S. Hippert, C. E. Pedreira, and R. C. Souza, “Neural networks for short-term load forecasting: A review and evaluation,” IEEE Trans. Power Syst., vol. 16, no. 1, pp. 44–55, 2001, doi: 10.1109/59.910780.
[17] H. Quan, D. Srinivasan, and A. Khosravi, “Short-term load and wind power forecasting using neural network-based prediction intervals,” IEEE Trans. Neural Networks Learn. Syst., vol. 25, no. 2, pp. 303–315, 2014, doi: 10.1109/TNNLS.2013.2276053.
[18] N. Mahdavi, M. B. Menhaj, and S. Barghinia, “Short-term load forecasting for special days using Bayesian neural networks,” 2006 IEEE PES Power Syst. Conf. Expo. PSCE 2006 - Proc., vol. 15, no. 2, pp. 1518–1522, 2006, doi: 10.1109/PSCE.2006.296525.
[19] G. Gross, S. Francisco, D. Galiana, and L. Number, “Short-Term Load Forecasting,” vol. 75, no. 12, 1987.
[20] K. Chen, K. Chen, Q. Wang, Z. He, J. Hu, and J. He, “Short-Term Load Forecasting With Deep Residual Networks,” IEEE Trans. Smart Grid, vol. 10, no. 4, pp. 3943–3952, 2019, doi: 10.1109/TSG.2018.2844307.
[21] R. Haq and Z. Ni, “A New Hybrid Model for Short-Term Electricity Load Forecasting,” IEEE Access, vol. 7, pp. 125413–125423, 2019, doi: 10.1109/ACCESS.2019.2937222.
[22] F. Xiong and Z. Fu, “Multimodal Feature Extraction and Fusion Deep Neural Networks for Short-Term Load Forecasting,” pp. 185373–185383, 2020, doi: 10.1109/ACCESS.2020.3029828.
