
Energy Consumption Forecasting Using SARIMA and NARNET: An Actual Case Study at University Campus


Junior A. Zancanaro, Orlem L. dos Santos, Luis F. Ugarte, Mateus Giesbrecht and Madson C. de Almeida
School of Electrical and Computer Engineering
State University of Campinas
Campinas, Brazil
juniorzancanaro@ufmt.br

Abstract: Energy consumption forecasting is important since it allows the managers of electrical systems to plan the energy supply over time, reschedule consumption to periods of low activity and propose more profitable energy purchase plans. In this paper, two methods for energy consumption prediction, the seasonal autoregressive integrated moving average (SARIMA) and the nonlinear autoregressive neural network (NARNET), are applied and compared in an actual case using historical data from the Federal University of Mato Grosso. The results of both methods provide a better understanding of energy consumption, highlighting a better identification of behaviors in electric energy use patterns.

Index Terms: Energy consumption prediction, Energy efficiency, NAR, SARIMA, Time series analysis.

I. INTRODUCTION
Contemporary society and sustainable development are an inseparable whole. The first relates to the human need of forming socially stable groups, while the second relates to the longevity of the human species [1]. In this sense, electric energy is one of the most important vectors. Its importance lies directly in the human capacity to transform resources into essential goods. As a result, electrical energy is subject to three conflicting requirements: economic, social and environmental [2].

Universities, as centers dedicated to knowledge development, must be protagonists in the process of proposing sustainable solutions. In this perspective, programs such as the Sustainable Campus, a partnership between the State University of Campinas, the private sector represented by CPFL, and the Brazilian regulatory agency (ANEEL), are innovative. Currently in progress, the program has the ambition of establishing a model for energy management and efficiency for higher education institutions in Brazil and Latin America. Indeed, this could be a very important achievement, since Brazilian Federal Universities paid out approximately 150 million dollars for electric energy in 2018 [3]. This amount represented almost 22% of the Federal Government budget for the acquisition of electric energy.

Among the options to reduce costs is the prerogative of most Brazilian higher education institutions to contract electric energy in the free market environment. Data from [4] indicate that free consumers realized, on average, a 23% profit by operating in the free environment between 2003 and 2017.

Before moving to a free-environment electric energy contract, it is fundamental to evaluate and estimate the benefits. In this sense, real-time monitoring and communication capabilities can provide conditions to improve the management of electric energy. Additionally, the literature provides reliable models for the projection of electrical energy consumption, which help to predict its use during different periods, plan the use of electric energy and improve energy purchase contracts, with a view to energy sustainability [5]. Therefore, the modeling of electric energy consumption plays an important role in management.

The literature shows a large number of methodologies for time series forecasting, such as regressions [6], machine learning [10], [12] and deep learning [13], [14], to name a few. In this context, this paper presents the assessment of two very promising techniques for energy consumption forecasting: SARIMA [6] and the nonlinear autoregressive neural network (NARNET) [8]. Both methodologies assume that future values can be predicted from patterns observed only in the past values of the series under analysis. In the case of SARIMA, the approach adopted was to model the series in state space and to estimate its parameters using the Kalman filter. In the case of NARNET, the approach has been shown to be suitable for short-term forecasting while performing poorly in long-term forecasting [8], [10].

The analysis is carried out using historical data of the Federal University of Mato Grosso, which has almost 21 thousand undergraduate students, a budget of approximately 254 million dollars [7] and an electric power consumption expenditure of 4 million dollars in 2017.

The remainder of the paper is structured as follows. The prediction models of energy consumption are presented in Section II. The application of model identification to the forecasting problem studied in this paper is briefly described in Section III. The evaluation of the prediction models is made in Section IV. Finally, conclusions are drawn in Section V.

978-1-5386-8218-0/19/$31.00 © 2019 IEEE
II. PREDICTION MODEL OF ENERGY CONSUMPTION

A. SARIMA Model
Let $y_t$ be a time series with no trend or seasonality observed at the instants $t = 1, 2, \cdots, k$. Among the polynomial models widely cited in the literature to describe it is the Autoregressive Moving Average (ARMA) model. Such a representation assumes that the value of the variable at any instant is a linear combination of past observations and white noise. It has the following structure:

$$y_t = \sum_{i=1}^{p} \phi_i y_{t-i} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j} + \varepsilon_t, \qquad y_t = \frac{\theta_q(L)}{\phi_p(L)} \varepsilon_t \tag{1}$$

where $\varepsilon_t$ is a normally distributed zero-mean white noise, $\varepsilon_t \sim N(0, \sigma^2)$, and $\phi_i$ ($i = 1, 2, \cdots, p$) and $\theta_j$ ($j = 1, 2, \cdots, q$) are the autoregressive and moving average parameters, respectively. $\phi_p(L)$ and $\theta_q(L)$ are polynomials in the lag operator $L$ of orders $p$ and $q$, respectively.
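The recursion in Equation 1 can be illustrated with a minimal pure-Python simulation. This is only a sketch: the function name `simulate_arma`, the coefficient values and the zero initial conditions are illustrative assumptions and are not taken from this study.

```python
import random

def simulate_arma(phi, theta, n, sigma=1.0, seed=0):
    """Simulate y_t = sum_i phi_i*y_{t-i} + sum_j theta_j*eps_{t-j} + eps_t.

    Values before the first sample are treated as zero (assumption).
    """
    rng = random.Random(seed)
    y, eps = [], []
    for t in range(n):
        e_t = rng.gauss(0.0, sigma)  # white noise eps_t ~ N(0, sigma^2)
        # Autoregressive part: weighted sum of the p previous observations.
        ar = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        # Moving average part: weighted sum of the q previous noise terms.
        ma = sum(theta[j] * eps[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        y.append(ar + ma + e_t)
        eps.append(e_t)
    return y

# Hypothetical ARMA(1, 1) example with phi_1 = 0.6 and theta_1 = 0.3.
series = simulate_arma(phi=[0.6], theta=[0.3], n=200)
```

With `sigma=0.0` the noise vanishes and the simulated series stays at zero, which is a quick sanity check of the recursion.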
[Fig. 1: Box & Jenkins Approach for Polynomial Models. Flowchart ending in the decision "Is model adequate?", with "yes" leading to Forecast.]

Time series usually exhibit trends and/or seasonality. They may also depend on random values and errors shifted by a period $s$ and on exogenous variables. The seasonal autoregressive integrated moving average with exogenous regressors (SARIMAX) model is a generalization of the ARMA model that covers these possibilities. Equation 3
presents it in the state-space format. In this format, it is also known as a regression model with SARIMA errors [9].

$$y_t = \beta_t x_t + \xi_t \tag{2}$$

$$\Phi_P(L^s)\,\phi_p(L)\,\Delta^d \Delta_s^D\, \xi_t = \Theta_Q(L^s)\,\theta_q(L)\,\varepsilon_t \tag{3}$$

In state space, the parameters and states can be calculated recursively using the Kalman filter and the maximum likelihood method. For predictions, the objective is to identify the orders $(p, d, q) \times (P, D, Q)_s$, estimate the parameters so as to minimize $\sigma^2$, and extrapolate the model to the future horizon of interest.

The Box & Jenkins approach [6] consists of proposing and adjusting models to the observed series. Fig. 1 shows its typical flow and the tools used in this study.

1) Descriptive Analysis of Data: The evaluation of the statistical properties and graphs of the series allows formulating hypotheses about its constituent components and the necessity of transformations.

2) Model Identification: Several models may explain certain observations. The parsimonious ones, that is, the ones that accurately explain the series with the smallest number of parameters, are indicated as candidates. For this, tools such as total and partial autocorrelation analysis and the Akaike Information Criterion (AIC) are used.

3) Parameters Estimation: Once the model and its orders are defined, the parameters are estimated. This can be done by a batch method or a recursive method. The SARIMAX implementation of the Statsmodels library was used in this paper. It implements the so-called regression model with SARIMA errors. In this approach, it also rewrites the SARIMA model in state space and estimates the parameters recursively using the Kalman filter and maximum likelihood estimation [11].

4) Diagnostics: The model is evaluated by checking whether the residue is consistent with the assumption of being normally and independently distributed. The Shapiro-Wilk test and the quantile-quantile graph evaluate normality, while the Ljung-Box test and the partial and total autocorrelation graphs evaluate independence.

5) Forecasting: Finally, once a parsimonious model is determined, it is apt to be used for forecasting.

B. NARNET Model
The nonlinear autoregressive neural network (NARNET) uses the past values of the time series to predict future values. Thus, this model has as input the sequence $(y_{t-1}, y_{t-2}, \ldots, y_{t-p})$, where the value $p$ is known as the number of timestamps. The NARNET model, to predict the value of a data series $y$ at time $t$ using the past $p$ values of the series, can be represented as follows:

$$y_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-p}) + \varepsilon(t) \tag{4}$$

The function $f(\cdot)$ is

$$f(\cdot) = \sum_{j=0}^{n} w_j\, \varphi_j\!\left( \sum_{i=0}^{m} v_{ji}\, y_{li} \right), \tag{5}$$

where $\varphi_j$ are the activation functions of each layer $j$. The function $f(\cdot)$ is approximately determined during the training of the neural network by updating the weights and biases. $\varepsilon(t)$ denotes the approximation error. A NARNET model can be represented as in Fig. 2. The estimated value $\hat{y}_t$ output by the NARNET model is:

$$\hat{y}_t = f(y_{t-1}, y_{t-2}, \ldots, y_{t-p}) \tag{6}$$

TABLE I: Data-set Splitting

Set          Set Size   Number of Samples
Training     65.38%     68
Validation   11.54%     12
Test         23.08%     24
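The NARNET predictor of Equations 4 to 6 can be illustrated with a minimal pure-Python sketch of a shallow autoregressive network: the last $p$ observations feed one hidden layer whose outputs are combined linearly. The tanh activation, the linear output, the names `lag_windows` and `narnet_predict`, and the hand-picked weights are illustrative assumptions; they are not the trained network of this paper.

```python
import math

def lag_windows(y, p):
    """Build the autoregressive inputs [y[t-1], ..., y[t-p]] and targets y[t]."""
    inputs = [[y[t - i] for i in range(1, p + 1)] for t in range(p, len(y))]
    targets = [y[t] for t in range(p, len(y))]
    return inputs, targets

def narnet_predict(window, V, b, w, w0):
    """Shallow network: y_hat = w0 + sum_j w[j] * tanh(b[j] + sum_i V[j][i] * window[i])."""
    hidden = [math.tanh(b_j + sum(v * x for v, x in zip(V_j, window)))
              for V_j, b_j in zip(V, b)]
    return w0 + sum(w_j * h_j for w_j, h_j in zip(w, hidden))

# Hypothetical standardized series and weights, for illustration only.
y = [0.1, 0.3, 0.2, 0.5, 0.4, 0.6]
X, targets = lag_windows(y, p=2)      # X[0] holds y[1], y[0]; targets[0] is y[2]
V = [[0.5, -0.2], [0.1, 0.3]]         # hidden-layer weights (2 neurons, 2 lags)
b = [0.0, 0.1]                        # hidden-layer biases
w = [0.7, -0.4]                       # output weights
y_hat = narnet_predict(X[0], V, b, w, w0=0.05)
```

In practice the weights would be obtained by training, and the number of hidden neurons and of lags would be tuned on validation data.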

[Fig. 2: NARNET Model. Inputs $y_{t-1}, y_{t-2}, y_{t-3}, \ldots, y_{t-p}$ obtained through unit delays $z^{-1}$; output $\hat{y}_t$.]

The architecture represented is known as shallow, which means there is only one hidden layer. The number of neurons in the hidden layer, $n_h$, and the number of timestamps, $p$, are hyper-parameters of the architecture that must be optimized to obtain the best generalization capacity and therefore avoid over-fitting.

1) Training of the Model: To train a suitable parameter set $w_k$, we minimize the Mean Square Error (MSE) loss function, and the optimal parameter set $w_k^*$ can be computed as follows,

$$w_k^* = \arg\min_{w_k \in \mathbb{R}^{n+1}} \frac{1}{2} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2 + \lambda \|w_k\|_2^2 \tag{7}$$

where $N$ is the number of observations and $\lambda$ is the regularization coefficient used to avoid over-fitting. The Levenberg-Marquardt method is employed to determine the parameters, with a decay $\gamma$ and a learning rate $\alpha$. The early stopping method is also used, which stops the iterations if the loss function does not decrease for $l^*$ consecutive times.

III. MODEL IDENTIFICATION APPLIED TO THE ENERGY CONSUMPTION FORECAST
The data-set refers to a historical series of energy consumption at the headquarters of the Federal University of Mato Grosso Foundation, located in Cuiaba, Mato Grosso, Brazil. For this data-set, there are 104 observations, from March 2010 to October 2018.

1) Data-set Splitting: The data-set was split into Training-set, Validation-set and Test-set. The sizes of these sets are presented in TABLE I.

Firstly, the Training-set is the part of the data-set used for model identification. Secondly, the Validation-set is the part of the data-set employed to obtain the best generalization capacity of the model, that is, to avoid overfitting, which means achieving the minimum MSE for this set. Thirdly, the Test-set is the part of the data-set used to test the final model.

Fig. 3 shows the data-set splitting into Train (the first 68 observations), Validation (the next 12 observations) and Test (the remaining ones). Also, TABLE II presents the main statistical characteristics of the consumption series.

[Fig. 3: Time series of Energy consumption divided into training, validation and testing. Axes: Consumption [MWh] versus Year-Month; segments: Train, Val, Test.]

TABLE II: Statistical data of the electric energy consumption series

Statistics      Value
$\bar{y}$       1059.51 MWh
$\sigma_y$      192.64 MWh
Min             622.92 MWh
1st Quartile    951.98 MWh
2nd Quartile    1073.51 MWh
3rd Quartile    1204.96 MWh
Max             1462.01 MWh

It should be noted that the institution went through three strikes that significantly changed the consumption pattern and deteriorated the estimates provided by both models. The first one lasted 126 days (May to September 2012), the second 139 days (May to October 2015) and the last 65 days (May to June 2018) [17].

Both models are run in free simulation, that is, the validation data are not used to update the forecasts.

A. Proposed SARIMA Model
In the construction of the SARIMA model, the training and validation sets are merged. The resulting set is called Identification.
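The chronological split of TABLE I and the merged Identification set can be sketched as follows. The function name `chronological_split` and the placeholder series are illustrative assumptions; only the set sizes (68, 12 and 24 out of 104 observations) come from the text.

```python
def chronological_split(series, n_train=68, n_val=12):
    """Split a time-ordered series into train, validation and test without shuffling."""
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]
    return train, val, test

# Placeholder values standing in for the 104 monthly consumption observations.
series = list(range(104))
train, val, test = chronological_split(series)

# For the SARIMA model, training and validation data are merged.
identification = train + val

# Share of each set in percent, rounded as in TABLE I.
sizes = {name: round(100 * len(part) / len(series), 2)
         for name, part in (("train", train), ("val", val), ("test", test))}
```

Keeping the split chronological ensures that the test months always lie after the months used for identification.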
Among the various combinations of orders tested, the one with the lowest AIC, equal to 1428, and with significant parameters was $(0, 1, 1) \times (0, 1, 1)_{12}$. The model does not have an autoregressive component, either ordinary or seasonal.

We can observe the recursive performance of the Kalman filter for this model in Fig. 4. It is initialized with zero value and reaches steady state after 15 iterations. This type of initialization is called diffuse.

[Fig. 4: Observed value $y_t$ and estimated $\hat{y}_t$. Axes: Consumption [MWh] versus Year-Month; series: $y_t$, $y_{SARIMA}$.]

Table III shows the statistical results of this model. According to the Dickey-Fuller test, the unit-root hypothesis is rejected at the 1% level (statistic of -5.06 against a critical value of -3.55), so the series transformed by ordinary and seasonal differencing can be considered stationary.

TABLE III: Statistical data of the electric energy consumption series

Test Name             Series                                     P-Val    T-Stat   Critical Value
Dickey-Fuller         $\Delta_{12}\Delta y_t$                    17e-6    -5.06    -3.55 (1%)
Shapiro-Wilk          $\hat{\varepsilon}_t, \forall\, t \geq 0$   0.93     -        0.75
Shapiro-Wilk          $\hat{\varepsilon}_t, \forall\, t \geq 1$   0.21     -        0.75
Ljung-Box - Lag 5     $\hat{\varepsilon}_t, \forall\, t \geq 0$   0.43     -        -
Ljung-Box - Lag 10    $\hat{\varepsilon}_t, \forall\, t \geq 0$   0.52     -        -
Ljung-Box - Lag 15    $\hat{\varepsilon}_t, \forall\, t \geq 0$   0.54     -        -

The model initially does not pass the Shapiro-Wilk test of the residue for normality. This is due to the distortion caused by the Kalman filter initialization. By eliminating it, the residue then passes the normality test.

The quantile-quantile plot in Fig. 6 also indicates the normality of the residue, with the points fitting the 45° line reasonably well. The Ljung-Box test and the autocorrelation graph in Fig. 5 show that the residue can be considered independently distributed. The diagnosis, then, is that the model is fit for forecasting.

[Fig. 5: Autocorrelation function of the residue of model $(0, 1, 1) \times (0, 1, 1)_{12}$. Axes: Autocorrelation versus Lag.]

[Fig. 6: Quantile-Quantile plot of residue of model $(0, 1, 1) \times (0, 1, 1)_{12}$. Axes: Sample Quantiles versus Theoretical Quantiles.]

The coefficients of the SARIMA model are:

TABLE IV: SARIMA Model Coefficients

Coefficient    Value                    Std                       P-Value
$\theta_1$     -0.6376                  0.151                     0.000
$\Theta_1$     -0.4712                  0.147                     0.001
$\sigma^2$     $3.35 \times 10^{10}$    $6.4 \times 10^{-13}$     0.000

B. Proposed NARNET Model
1) Data Preprocessing: There are two widely applied techniques, known as normalization and standardization. In normalization, the input data are re-scaled to a specific range, generally (-1, 1) or (0, 1). The other technique is standardization, in which the input data are assumed to fit a Gaussian distribution and are therefore re-scaled to have zero mean and unit standard deviation. The data preprocessing applied in the NARNET model was standardization.

2) Hyper-parameter Tuning: Some hyper-parameters belong to the NARNET model ($n_h$ and $p$) and others to the training method ($\alpha$, $\gamma$, $\lambda$ and $l^*$). These hyper-parameters were obtained empirically by trial and error to achieve the minimum MSE in the Validation-set.
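The standardization step described under Data Preprocessing can be sketched in pure Python as follows. Estimating the mean and standard deviation on the training data only is an assumption made here to avoid leaking information from later observations; the function names and the sample values are hypothetical.

```python
import statistics

def fit_standardizer(train):
    """Estimate mean and (population) standard deviation on the training data."""
    return statistics.fmean(train), statistics.pstdev(train)

def standardize(values, mean, std):
    """Re-scale values to zero mean and unit standard deviation."""
    return [(v - mean) / std for v in values]

def destandardize(values, mean, std):
    """Map standardized values (e.g. network outputs) back to the original unit."""
    return [v * std + mean for v in values]

# Hypothetical monthly consumption values in MWh, for illustration only.
train = [950.0, 1100.0, 1200.0, 1000.0, 1050.0]
mean, std = fit_standardizer(train)
z = standardize(train, mean, std)
restored = destandardize(z, mean, std)
```

The same `mean` and `std` would then be reused to transform validation and test inputs and to convert predictions back to MWh.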
The proposed NARNET model used to predict energy consumption has the hyper-parameters presented in TABLE V.

TABLE V: NARNET Hyper-Parameters

Group                        Hyper-parameter   Value
NARNET hyper-parameters      $n_h$             42
                             $p$               25
Training hyper-parameters    $\alpha$          0.001
                             $\gamma$          0.1
                             $\lambda$         0.085
                             $l^*$             6

C. Performance Measurement Metrics
The performance of the models is evaluated through the Mean Absolute Percentage Error (MAPE), Equation 8, the Root Mean Square Error (RMSE), Equation 9, and the monthly percentage error graph.

$$\mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{\hat{y}_t - y_t}{y_t} \right| \tag{8}$$

$$\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{\frac{\sum_{t=1}^{N} (\hat{y}_t - y_t)^2}{N}} \tag{9}$$

where $\hat{y}_t$ is the predicted energy consumption in month $t$, $y_t$ is the observed value and $N$ is the number of test points.

IV. RESULTS
Fig. 7 and Fig. 8 present the values observed and predicted by the SARIMA and NARNET models, respectively. Analyzing the behavior of the prediction curves compared to the reference, the two models adhere to the real behavior of energy consumption. However, the NARNET model better captures the non-linearities of the series, that is, it fits better to the atypical events present in the behavior of electric energy consumption.

[Fig. 7: Observed value $y$ and value predicted $\hat{y}$ by SARIMA model $(0, 1, 1) \times (0, 1, 1)_{12}$. Axes: Consumption [MWh] versus Year-Month.]

[Fig. 8: Observed value $y$ and value predicted $\hat{y}$ by NARNET model. Axes: Consumption [MWh] versus Year-Month.]

Fig. 9 shows the monthly percentage error of both prediction models. It should be noticed that the NARNET model presented greater stability around zero percent. On the other hand, this model presented a percentage error peak higher than 30%. TABLE VI presents the performance metrics for both methods. According to that table, in the first year the NARNET model presented a MAPE of 5.72% compared to 10.46% for the SARIMA model. In the second year, its performance deteriorated, presenting 8.54% compared to 7.47% for the SARIMA model. The RMSE behaves similarly to the MAPE: during the last year, the SARIMA model presented an RMSE of 92 MWh compared to 106.57 MWh for the NARNET model.

[Fig. 9: Percent error of prediction of both models between 11/2016 to 07/2018. Axes: percent error versus Year-Month; series: $y_{NARNET}$, $y_{SARIMA}$.]

The errors between May and June as well as August and September of 2017 can be partially explained by strikes. Traditionally July is in the summer break, so an attenuated effect is obtained. The prediction error of more than 20% in both models coincides with the strike period observed between May and June of 2018.
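The metrics of Equations 8 and 9 can be computed with a few lines of pure Python. This is a minimal sketch: the function names and the sample values are hypothetical, and the MAPE is multiplied by 100 so that it is expressed in percent, as in the reported results. The last lines check that the yearly metrics reported in TABLE VI are consistent with the 24-month values, which holds because both years have the same number of months.

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error (Equation 8), expressed in percent."""
    return 100.0 * sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error (Equation 9), in the unit of the series."""
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical observed and predicted monthly values in MWh (not the paper's data).
y_true = [1000.0, 1100.0, 900.0, 1200.0]
y_pred = [950.0, 1150.0, 990.0, 1200.0]
example_mape = mape(y_true, y_pred)
example_rmse = rmse(y_true, y_pred)

# Pooling two equal-length years: MAPE values average directly,
# RMSE values average in the squared (MSE) domain.
sarima_mape_24 = (10.46 + 7.47) / 2
narnet_rmse_24 = math.sqrt((111.04 ** 2 + 106.57 ** 2) / 2)
```

The pooled values agree with the 24-month entries of TABLE VI (8.96% for the SARIMA MAPE and 108.83 MWh for the NARNET RMSE) up to rounding.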
TABLE VI: Metrics of both models

MODEL                                     METRIC             ERROR
SARIMA $(0, 1, 1) \times (0, 1, 1)_{12}$  MAPE - 01 to 24    8.96%
                                          MAPE - 01 to 12    10.46%
                                          MAPE - 13 to 24    7.47%
                                          RMSE - 01 to 24    132 MWh
                                          RMSE - 01 to 12    163 MWh
                                          RMSE - 13 to 24    92 MWh
NARNET                                    MAPE - 01 to 24    7.13%
                                          MAPE - 01 to 12    5.72%
                                          MAPE - 13 to 24    8.54%
                                          RMSE - 01 to 24    108.83 MWh
                                          RMSE - 01 to 12    111.04 MWh
                                          RMSE - 13 to 24    106.57 MWh

V. CONCLUSIONS
Electricity acquisition in the free market is an appealing alternative for Brazilian universities. To do that properly while mitigating the risks, universities must settle several questions, for example, how much to buy and how to distribute it over the months. A wrong decision can incur considerable losses.

To help solve this problem, this paper addressed two alternatives for the monthly forecast of electric energy consumption: the SARIMA and NARNET models. Both models fitted the observed data series appropriately. The NARNET model stands out because it best represents the non-linearity of the observed series, that is, the atypical events are identified by this model with better precision. Both models presented acceptable percentage errors in the short term. In the long term, the precision of the NARNET prediction tends to deteriorate, since this model is more suitable for short-term forecasting, possibly due to overfitting.

The SARIMA model is a well-known classical technique for time series analysis. It is an intuitive model with a low computational burden, and it offers the flexibility to add intervention variables that better capture the atypical events that cause some types of non-linearity in the observed series.

Some of the prediction errors can be attributed to strikes that were not addressed in the models. On the other hand, it is worth mentioning that the simulation is free, which usually leads to greater uncertainty for more distant horizons because there is no recursive update of the estimation.

As a general conclusion, despite the pros and cons, the study presented in this paper indicates that both approaches have the potential to properly solve the problem of the monthly forecast of electric energy consumption required to support the acquisition of energy in the free market.

ACKNOWLEDGMENT
The authors would like to thank CAPES, CNPQ, UFMT, FAPESP under Grant No. 2017/25425-5 and CPFL/P&D ANEEL n 00063-3032/2017 for funding this research.

REFERENCES
[1] Brundtland, Gru, et al. "Our common future". https://sswm.info/sites/default/files/reference attachments/UN%20WCED%201987%20Brundtland%20Report.pdf Accessed in: April 2019.
[2] Montalvo, E. and Faria, I. D. "Energia Sustentável para Todos", Núcleo de Estudos e Pesquisa - Consultoria Legislativa. https://www12.senado.leg.br/publicacoes/estudos-legislativos/tipos-de-estudos/outras-publicacoes/temas-e-agendas-para-o-desenvolvimento-sustentavel/energia-sustenta vel-para-todo Accessed in: April 2019.
[3] BRASIL. Painel de Custeio. http://paineldecusteio.planejamento.gov.br/custeio.html. Accessed in: April 2019.
[4] BRASIL. Comissão Especial do Senado - Projeto de Lei n 1917/18. https://www2.camara.leg.br/atividade-legislativa/comissoes/comissoes-temporarias/especiais/55a-legislatura/pl-1917-15-portabilidade-da-conta-de-luz/documentos/audiencias-publicas/ReginaldoAlmeidadeMedeiros22.05.18.pdf. Accessed in: April 2019.
[5] S. Aman, Y. Simmhan and V. K. Prasanna, "Improving Energy Use Forecast for Campus Micro-grids Using Indirect Indicators," 2011 IEEE 11th International Conference on Data Mining Workshops, Vancouver, BC, 2011, pp. 389-397.
[6] Box, G. E. P. and Jenkins, G. M., "Time Series Analysis: Forecasting and Control," San Francisco: Holden Day, 1970.
[7] BRASIL. Fundação Universidade Federal de Mato Grosso (FUFMT). Ministério da Educação. Relatório de Gestão: FUFMT, 2018. http://www.ufmt.br/proplan/arquivos/b76293709cb75c0b03577acac9d7a235.pdf. Accessed in: April 2019.
[8] M. Datta, Pallab Kumar. "An artificial neural network approach for short-term wind speed forecast." (Master Dissertation), 2018.
[9] C. Fulton. "Estimating time series models by state space methods in Python: Statsmodels." http://www.chadfulton.com/files/fulton statsmodels 2017 v1.pdf. Accessed in: April 2019.
[10] T. W. S. Chow and C-T. Leung. "Nonlinear autoregressive integrated neural network model for short-term load forecasting." IEE Proceedings-Generation, Transmission and Distribution 143.5 (1996): 500-506.
[11] J. Durbin and S. J. Koopman. Time series analysis by state space methods. Oxford University Press, 2012.
[12] Zhang, G. Peter. "Time series forecasting using a hybrid ARIMA and neural network model." Neurocomputing 50 (2003): 159-175.
[13] Shi, Heng, Minghao Xu, and Ran Li. "Deep learning for household load forecasting - A novel pooling deep RNN." IEEE Transactions on Smart Grid 9.5 (2018): 5271-5280.
[14] Kong, Weicong, et al. "Short-term residential load forecasting based on LSTM recurrent neural network." IEEE Transactions on Smart Grid (2017).
[15] Dickey, D. A., and Fuller, W. A. (1979). "Distribution for the estimates for autoregressive time series with a unit root". Journal of the American Statistical Association, 74, 427-31.
[16] S. S. Shapiro; M. B. Wilk. (1965). "An Analysis of Variance Test for Normality (Complete Samples)". Biometrika, Vol. 52, No. 3/4, pp. 591-611.
[17] M. Vinicius. (2018). "Apos 65 dias de greve, aulas na UFMT serão retomadas nesta segunda-feira". https://olhardireto.com.br/noticias/exibir.asp?id=447319edt=25noticia=8203apos-65-dias-de-greve-aulas-na-ufmt-serao-retomadas-nesta-segunda-feira. Accessed in: June 2019.
