
Machine Translated by Google

2021 14th IEEE International Conference on Industry Applications Mo5Track C.3

Challenges for the Application of MLOps in Forecasting Energy Consumption

Tiago Yukio Fujii
Department of Computer Engineering and Digital Systems
Polytechnic School of USP
Sao Paulo, Brazil
tiago.fujii@usp.br

Wilson Vicente Ruggiero
Department of Computer Engineering and Digital Systems
Polytechnic School of USP
Sao Paulo, Brazil
wilson@larc.usp.br

Haroldo L. M. do Amaral
Intelligent Power Systems and Techniques Laboratory
Paulista State University - UNESP
Sao Paulo, Brazil
haroldo.amaral@alumni.usp.br

Victor Takashi Hayashi
Department of Computer Engineering and Digital Systems
Polytechnic School of USP
Sao Paulo, Brazil
victor.hayashi@usp.br

Reginald Arakaki
Department of Computer Engineering and Digital Systems
Polytechnic School of USP
Sao Paulo, Brazil
reg@usp.br

Khalil Ahmad Khalil
Department of Computer Engineering and Digital Systems
Polytechnic School of USP
Sao Paulo, Brazil
kha@usp.br

Abstract—Models for forecasting residential energy consumption in the short term (horizon of hours or days) allow their users to plan and make decisions in order to reduce their consumption. However, much of the academic work in this area uses offline environments for experimentation, disregarding the challenges of automating, monitoring and updating models in online environments. This article presents the challenges and solutions in a case study, detailing the implementation, in a real online scenario, of consumption forecasting models for 4 Brazilian residences during 2020. It is concluded that the use of best practices and metrics for development in online environments not only increases the accuracy of forecasts, but also facilitates the development of the models and helps the speed of experimentation and the reproducibility of results.

Keywords—Machine Learning, Model Update, Residential Energy Consumption, Consumption Forecast.

I. INTRODUCTION

The increase in the use of smart energy consumption meters makes it possible to monitor and analyze usage habits of household appliances in ways that were previously impossible. Coupled with direct feedback, applications personalized to the user, such as forecasting and classification of energy consumption, allow users to educate themselves and reduce their electricity bill [1].

The installations of the future, not only for residences but also for the industrial sector, should comprise a consumption chain in which the real-time behavior of energy use is made visible by digital platforms based on measurements of residential or business units. These aggregated data will allow analyses of consumption, seasonality, costs and planning in terms of generation, transmission and distribution capacity [2], [3]. With this, the scenario of digital data, ready to be processed by platforms of algorithms and artificial intelligence, is quite consistent with the innovations of products and services in this area.

Machine learning (ML) techniques for forecasting [4] and classifying [5] energy consumption are widely used both in academia and in industry, applying different learning models, such as neural networks [6], support vector machines [7] and gradient boosting [8].

The forecast of residential energy consumption can be used to assist residents in decision-making and cost-conscious planning [9], and energy concessionaires in medium-scale and large-scale forecasting and in the detection of customer consumption anomalies [10]. It can also facilitate energy transactions between prosumers in peer-to-peer (P2P) energy markets [11], [12], promoting the efficient use of the electrical network. Home Energy Management Systems (HEMS) can use consumption prediction as input for predictive control models [13], helping to plan the use of controlled appliances such as washing machines, air conditioning systems and electric vehicles, in order to optimize energy use and financial savings for the user under variable tariff schemes.

However, academic research on consumption forecasting models has focused on static environments, analyzing the degradation of accuracy over time due to unexpected changes in the behavior of the time series (concept drift) [14], the sensitivity to the manual configuration of hyperparameters, and the training and prediction times of the models.

Although there are standardized metrics to measure the accuracy of models, there is no consensus on which metrics to use to measure the fitness of machine learning systems for operation in online environments, making comparisons between solutions difficult.

This work considers real scenarios with residential energy consumption data, which depend strongly on the season of the year and on temperature [4], [7], causing the concept drift that is not analyzed in experiments in static environments.

The main contributions of this work are the consideration of the problems that arise in online environments for ML systems [4], [7], and the application of methodologies for the development and retraining of models to attenuate these effects, in a case study using real energy consumption data from 4 Brazilian residences during 2020 in order to approximate a real scenario found in industry.
In the following sections, the problems encountered during the development of an online solution for forecasting residential energy consumption are discussed, and their solutions are presented with the objective of improving not only the accuracy of the model, but also the reproducibility, speed of experimentation, and operability of the system.

978-1-6654-4118-6/21/$31.00 ©2021 IEEE 455 ISBN 978-1-6654-4118-6

Authorized licensed use limited to: LIVERPOOL JOHN MOORES UNIVERSITY. Downloaded on August 13,2022 at 06:16:40 UTC from IEEE Xplore. Restrictions apply.

II. MLOPS

In order to integrate the stages of software development and information technology systems operations, the DevOps culture uses test automation, monitoring and integration, and infrastructure management as code, among other techniques, thus allowing continuous delivery and deployment of the system [15].

The application of the DevOps culture to ML systems, known as MLOps [16], seeks to adapt DevOps techniques to this area, which differs from traditional software systems in three ways: its dependence on data quality, which requires correct extraction and processing; its exploratory character during development, with the testing of different configurations, model architectures and feature generation; and its need to monitor not only errors derived from incorrect programming of the system, but also those caused by obsolete or biased models and training data.

Testing systems before introducing them into production environments and monitoring their performance is considered a good practice in the development and operation of software systems. However, due to the predictive nature of ML systems, these practices are difficult to define and implement [17].

Google Research uses 28 metrics to measure the readiness of ML systems for production [18]. These metrics involve tests in 4 categories, related to the input data, the model used, the infrastructure, and system monitoring, listed in TABLE I.

TABLE I. METRICS FOR THE READINESS OF ML SYSTEMS IN PRODUCTION. SOURCE: [18]

Data
1 Feature expectations are captured in data schemas
2 All features are beneficial to the model's accuracy
3 Features are not too expensive in memory usage
4 Features adhere to business requirements
5 Pipeline has appropriate privacy controls
6 New features can be added quickly
7 Code for feature creation is tested

Model
1 Changes in specifications are reviewed and versioned
2 Offline metrics correlate with real online impact
3 All hyperparameters are tuned
4 The aging effects of the model are known
5 A simpler model is not better than the current one
6 The quality of the model is sufficient in all parts of the data
7 Model was tested considering issues of inclusivity

Infrastructure
1 The training of the model is reproducible
2 Specification of the model goes through unit tests
3 All stages of the pipeline pass through integration tests
4 Quality of the model is validated before use
5 It is possible to debug the model
6 New models are gradually introduced to users
7 New models can be rolled back quickly and safely

Monitoring
1 Changes in system dependencies are monitored
2 Data invariants hold in both the offline and online environments
3 Features are computed identically during training and prediction
4 Models are not too old
5 The model is numerically stable
6 The model shows no significant changes in training speed or serving latency
7 There is no degradation in prediction quality in the online environment

Besides these metrics, another good practice in ML projects is the separation of their stages into pipelines [19], in order to facilitate the integration of the different stages, the scalability of the system and the reproducibility of results.

One of the differences of online systems is the need for continuous training of their models to avoid the occurrence of concept drift. In [20], a strategy was defined for simulating and evaluating the effects of periodic training on time series: the seasonality of the input data is identified and the model is updated at each seasonal cycle, using training and validation data that reflect the most recent cycle.

III. FORECAST OF RESIDENTIAL ENERGY CONSUMPTION

With the deployment of smart meters in the residential sector, individual consumption data of greater granularity allow new applications and discoveries that help users with education and savings.

Among these applications is the forecasting of residential consumption, which helps the consumer in planning expenses and making decisions [9], supports medium-scale and large-scale forecasting by energy distributors [10], assists energy transactions between prosumers (consumers who also generate electricity on a small scale) in peer-to-peer markets (energy transactions carried out directly between consumers) [12], [21], and feeds the predictive control models of Home Energy Management Systems [13].

Unlike energy consumption on a medium and large scale, individual hourly consumption presents greater volatility, with daily consumption peaks that may occur at different times.


Because of this characteristic, traditional forecast metrics such as the mean absolute error (MAE) end up measuring only point-by-point accuracy, without analyzing temporal or shape errors.

Fig. 1 shows an example in which a constant forecast (F1), which provides no significant value to its user, has a smaller point-to-point error than a forecast whose behavior is closer to the real one but displaced in time (F3). While forecast F1 has an MAE of 0.82, forecast F3 has an MAE of 0.99 [18], [19], [20].

Fig. 1 Four different predictions F1, F2, F3 and F4 (dotted lines) compared to the real value (solid lines). Source: [24].

For a satisfactory analysis of forecasting models, it is necessary to use metrics that consider shape and time errors, such as Dynamic Time Warping, Move-Split-Merge [22], DILATE [23], or the adjusted error [24].

Among the main features used to improve the accuracy of short-term consumption forecasts (horizon of the next few hours or days) are climatic data, such as temperature, precipitation, or wind speed, and calendar references, such as time of day, day of the week, or occurrence of holidays [4], [7], [14].

Most of the experiments carried out, however, were in offline environments, giving no importance to the treatment of erroneous or incomplete data, nor to the degradation of accuracy over time [14].

IV. PROPOSED SOLUTION

The project is based on the architecture for energy consumption data collection in [2], implemented at the beginning of 2020 in 4 Brazilian residences. Fig. 2 shows one of the installed meters.

Fig. 2 Smart meter used for data collection installed next to the circuit breaker panel. Source: [2].

The meters have a data collection system that is tolerant to connection failures, guaranteeing the integrity of the data during network unavailability through the connection with an intermediary for temporary storage of the data [3], [25].

The hourly consumption and internal temperature data of the residence are sent to a remote database and are used by the proposed solution to forecast energy consumption by means of ML models. Currently, the database provides information for the period from January 2020 to February 2021.

Each stage of the pipeline has multiple steps, as shown in Fig. 3. In the offline environment, important tests are carried out to prototype the initial model and experiment with new functionalities.

The online pipeline, in turn, mirrors the behavior of the offline environment, with differences in the degree of automation, execution time restrictions and error handling.

Fig. 3 Pipeline of online and offline environments for machine learning.

The first stage of the pipeline, which occurs exclusively in online environments, is an automated data fetch, either from internal databases or through external interfaces, which requires the correct handling of unavailability or transfer errors.

The next stage, pre-processing, includes the cleaning and engineering of features, as well as the treatment of anomalies and missing values, which is performed manually in offline environments.

In offline environments, an exploratory data analysis is also performed, involving familiarization with the data, detection of anomalies, and analysis of the distribution and correlation between features, in order to refine the pre-processing step.

In the next stage, the model is built by defining its hyperparameters, either manually or automatically by means of a grid search, and training it on the available data.

Finally, in the evaluation of the model and the measurement of its accuracy, the hyperparameters that optimize the defined metric are selected. It is therefore important to analyze and choose the metrics that are most adequate and relevant to the problem.

In order to measure the readiness of the solution for production environments, we use the metrics provided by Google Research; the implementation that satisfies each metric is explained in the following sections.

A. Data Loading

The input data are received from the cloud storage service, which stores the total and per-sector energy consumption, as well as the internal temperature of the residence, with a sampling frequency of one hour.

Every hour, the files in the cloud are searched; files that are not present in the local directories are downloaded to these directories.

B. Pre-processing

During pre-processing, the presence of missing hours and gross anomalies is verified. A value is considered anomalous when the consumption is less than 0, or when the temperature varies by more than 10 ºC in relation to the previous value, satisfying test Data 1: feature expectations are captured in data schemas [18].

There were occurrences of temperature anomalies, in which the variation in relation to the previous hour exceeded 20 ºC, as well as moments with missing temperature and consumption readings. These cases were attributed to reading errors, and the values were discarded.

For the definition of features, we added three past hourly consumption values, referring to 25, 24 and 23 hours before the moment to be forecast, in addition to calendar attributes, such as the time of day, day of the month, and month of the year to be forecast. The choice of these features for the final prediction model was made during the exploratory data analysis stage.

However, new features can be added by altering the input files, such as by adding the internal temperature of the residence, or generated by modifying the source code at the pre-processing stage, such as by adding the 1st derivative of hourly energy consumption. In this way, the other stages of the pipeline do not need new modifications (test Data 6: new features can be added quickly).

C. Exploratory Data Analysis

In [2], we developed energy consumption forecasting models using Extreme Gradient Boosting (XGBoost), long short-term memory (LSTM) neural networks, and support vector machines (SVM). The results showed that the XGBoost architecture obtains better accuracy in most of the monitored residences, so this architecture was chosen for this work.

XGBoost is an open source ML library for regression and classification models by means of ensembles of decision trees [26]. Its implementation allows models to be trained in a parallelized and distributed way. The models also accept missing values in the input data, both in the training stage and in the prediction stage.

To analyze the gain from introducing new features (test Data 2: all features are beneficial, and test Data 3: features are not too expensive in memory use), a baseline reference model was used, which uses only the consumption of 24 hours earlier and calendar data: hour, day of the week and of the month, month and year.

This reference model was compared with other models that have additional features beyond those used in the reference model (test Model 5: a simpler model is not better than the current one), as shown in TABLE II.

Cross-validation was used for each residence, obtaining the mean over all the residences of the mean squared error (MSE) and of the adjusted error with a 2-hour window and norm 4 [24]. TABLE II also shows the percentage changes of the MSE and the adjusted error in relation to the reference model.

TABLE II. ANALYSIS OF THE ADDITION OF FEATURES ON MODEL ACCURACY

Model                                            MSE                Adjusted Error
Reference model                                  0.0479 (0%)        0.2535 (0%)
Reference + 1st derivative                       0.0476 (-0.62%)    0.2529 (-0.24%)
Reference + internal temperature                 0.0502 (+4.80%)    0.2568 (+1.30%)
Reference + consumption of 25 and 23 hours ago   0.0415 (-13.36%)   0.2043 (-19.40%)

Adding the internal temperature of the residence as a feature ended up reducing the accuracy of the model, while using the 1st derivative of energy consumption yielded only marginal gains. Due to these results, these features were not considered in the final model.

We observed a low weekly correlation for all the residences, with little variation between weekdays and weekends, as shown in Fig. 4 for one of the residences. It should be noted that the consumption figures refer to the year 2020, and this low variation may be related to the quarantine period during the Covid-19 pandemic.

Fig. 4 Box plot of total consumption per day of the week for one of the residences.
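As a rough illustration of the pre-processing rules of Section IV.B and of the lag and calendar features evaluated in TABLE II, the sketch below rejects anomalous readings and builds one feature row. It is not the authors' code: the function names, the dictionary layout, and the choice to measure temperature steps against the last accepted reading are assumptions made for the example. Rejected or absent lags are left as `None`, relying on the statement above that the model accepts missing values.

```python
from datetime import datetime, timedelta

MAX_TEMP_STEP_C = 10.0      # anomaly threshold stated in Section IV.B
LAGS_HOURS = (25, 24, 23)   # past hourly consumption used as features


def clean(readings):
    """Map each timestamp to its consumption, or None when the reading is rejected.

    `readings` maps an hourly datetime to (consumption_kwh, temperature_c).
    Temperature steps are measured against the last accepted reading (an assumption).
    """
    cleaned = {}
    last_temperature = None
    for timestamp in sorted(readings):
        consumption, temperature = readings[timestamp]
        negative = consumption < 0
        jump = (last_temperature is not None
                and abs(temperature - last_temperature) > MAX_TEMP_STEP_C)
        if negative or jump:
            cleaned[timestamp] = None  # treated as a reading error and discarded
        else:
            cleaned[timestamp] = consumption
            last_temperature = temperature
    return cleaned


def build_features(cleaned, target):
    """Feature row for forecasting the hour `target`; missing lags stay None."""
    features = {
        f"lag_{h}h": cleaned.get(target - timedelta(hours=h)) for h in LAGS_HOURS
    }
    features.update(
        hour=target.hour,
        day_of_week=target.weekday(),
        day_of_month=target.day,
        day_of_year=target.timetuple().tm_yday,
        month=target.month,
    )
    return features


# Synthetic example: 26 hourly readings with two injected anomalies.
start = datetime(2020, 7, 1, 0)
readings = {start + timedelta(hours=i): (0.5 + 0.01 * i, 22.0) for i in range(26)}
readings[start + timedelta(hours=1)] = (-1.0, 22.0)  # negative consumption
readings[start + timedelta(hours=2)] = (0.52, 45.0)  # temperature step above 10 C
features = build_features(clean(readings), start + timedelta(hours=26))
```

Because a new feature only adds a key to the returned dictionary, the later pipeline stages do not need to change, which is the property that test Data 6 asks for.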


Fig. 5 shows the autocorrelation function of energy consumption sampled at hourly frequency for one of the monitored residences. In the graph, a higher value on the ordinate indicates a high correlation between the time series and the same series lagged by k units, with k represented on the abscissa. Autocorrelation peaks can be observed for lags at every 24 hours, evidencing daily seasonality.

Fig. 5 Autocorrelation function of hourly energy consumption (95% significance band).

Fig. 6 shows the 1st decision tree of the model, in which it is possible to observe the relevance of the time of day to the prediction, while Fig. 7 shows the importance of the features for one of the residences, indicating the relative contribution of each feature in the XGBoost tree construction process. The possibility of obtaining information about the internal structure of the model is important because it allows us to debug its operation and investigate performance problems or instability (test Infrastructure 5: it is possible to debug the model).

Fig. 6 Decision tree of one of the XGBoost forecasting models.

Fig. 7 Importance of the features for the forecast model of one of the residences of the project.

The models are also evaluated with extreme or even invalid inputs to assess their robustness (test Monitoring 5: the model is numerically stable). The inputs tested are consumption equal to zero, negative, infinite, and with missing values.

D. Model Training and Prediction

The model uses the XGBoost library to predict the hourly consumption of the next 24 hours, and is trained with the following features: the consumption of 23, 24 and 25 hours ago, the hour of the day, day of the week, day of the month, day of the year and month.

To tune the hyperparameters of XGBoost, a grid search was performed with cross-validation partitioned into 4 subsets for each residence (test Model 3: all hyperparameters are tuned), varying the tree size, the learning rate, and the objective function to be minimized. After training the models for each combination of hyperparameters, the one with the smallest mean squared error is chosen.

The random seed used by XGBoost is fixed automatically, guaranteeing the reproducibility of the results (test Infrastructure 1: the training of the model is reproducible).

Both the data and the code are versioned by means of the Git and DVC version control systems (test Model 1: changes are reviewed and versioned).

Fig. 8 shows an example of the energy consumption forecast for one of the residences during the month of July, in which it is possible to observe the daily seasonality of energy consumption.

E. Evaluation

The evaluation of accuracy in a static environment was carried out by means of the method proposed in [20]. In this method, multiple models are trained, each one based on training data from a different instant, in order to reflect the arrival of new data in an online environment (test Model 4: the aging effects of the model are known, and test Model 6: the quality of the model is sufficient in all parts of the data).

A daily training schedule is considered, with a data split of 80% for training and 20% for testing. The hyperparameters are defined by means of grid search, as stated in Section IV.D.

The adjusted error is used to compare the updated model with the previous one, and the one with the lower error is used (test Infrastructure 4: the quality of the model is validated before serving it, and test Monitoring 4: models are not too old).
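The retrain-and-compare step of Section IV.E can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: an hourly-average profile stands in for the XGBoost model, plain MSE stands in for the adjusted error of [24], and all function names are hypothetical. Only the 80/20 chronological split and the rule "serve the new model only if its error is lower" come from the text.

```python
from statistics import mean


def mean_squared_error(actual, predicted):
    """Stand-in metric; the paper compares models with the adjusted error of [24]."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))


def fit_hourly_profile(train):
    """Toy stand-in for training: average consumption for each hour of the day.

    Assumes the series starts at midnight, so index % 24 is the hour of day.
    """
    by_hour = {}
    for index, value in enumerate(train):
        by_hour.setdefault(index % 24, []).append(value)
    profile = {hour: mean(values) for hour, values in by_hour.items()}
    return lambda index: profile[index % 24]


def evaluate(model, test, offset, metric=mean_squared_error):
    """Error of `model` on `test`, whose first sample has absolute index `offset`."""
    predicted = [model(offset + i) for i in range(len(test))]
    return metric(test, predicted)


def retrain_and_select(current_model, series, train_fraction=0.8):
    """Train a candidate on the newest data; serve it only if it beats the current model."""
    cut = int(len(series) * train_fraction)  # chronological 80/20 split
    train, test = series[:cut], series[cut:]
    candidate = fit_hourly_profile(train)
    current_error = evaluate(current_model, test, cut)
    candidate_error = evaluate(candidate, test, cut)
    if candidate_error < current_error:
        return candidate, candidate_error
    return current_model, current_error


# Ten days of synthetic hourly data with an evening peak between 18:00 and 21:59.
series = [1.0 if 18 <= i % 24 <= 21 else 0.3 for i in range(240)]
stale_model = lambda index: 0.3  # an outdated model that ignores the evening peak
selected, error = retrain_and_select(stale_model, series)
```

Keeping the current model on ties is a conservative choice for the sketch: a candidate that is merely as good as the model already in production does not justify a swap.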


Fig. 8 Forecast (red) and real value (black) of energy consumption for the full month of July 2020 and for the 1st week of July 2020.

The system in an online environment was implemented as a Flask application on an Apache2 server hosted on Amazon Elastic Compute Cloud (EC2), retraining periodically every 24 hours and accepting calls in REST API format to forecast the consumption of the monitored residences.

The API can be used by other systems to query the consumption forecast of users. Fig. 9 shows an example of an application, in which a website was developed on the Dash platform [27] to display consumption forecasts for time periods customized by the user.

Fig. 9 Website to visualize the forecasts made.

The calls made to the API and the training time are monitored and saved in log files (test Monitoring 6: the model shows no significant changes in training speed or serving latency).

When an abnormal value is found as defined in Section IV.B (negative consumption or temperature variation greater than 10 ºC), an alert is added to the log files (test Monitoring 2: data invariants hold both online and offline).

V. RESULTS

TABLE III shows the average of the results obtained for all the residences for the MSE and adjusted error metrics. In order to assess the effectiveness of the model update, the metrics are also calculated for static models without retraining. The values in parentheses refer to the reduction of the error of the model with updating in relation to the model without retraining.

TABLE III. METRICS FOR STATIC AND DYNAMIC ENVIRONMENTS

Model                      MSE               Adjusted error
Model without update       0.0495            0.3472
Model with weekly update   0.0486 (-1.82%)   0.3089 (-11.03%)

One of the difficulties in comparing results between different works is the use of different methodologies, which use various databases for model training, methods for defining hyperparameters and metrics for accuracy analysis. Because the scope of this work is to analyze the accuracy gains from using practical tools for the development of online models, we did not attempt a comparison with other works in terms of accuracy. Such a comparison would require reproducing the different methodologies for a fair analysis and retraining on new databases, which is out of the scope of the project.

TABLE IV shows the tests performed by the system following the metrics defined in [18]: automatically (A), manually (M), not performed (-), or not applicable (N/A).

Tests Data 4 and 5 are not applicable to the project in its current state because no personal data that would allow users to be identified have been collected, due to concerns about user privacy. Test Model 2 is also not applicable because no online metrics are monitored at the moment. Test Infrastructure 6 is not applicable due to the insufficient number of users for gradual rollouts of new versions, and test Monitoring 3 is not applicable because there are no differences between the offline and online training data.

Due to the relatively low complexity of the pipeline and the low cost of training, no integration tests (test Infrastructure 3) or rollback tests (test Infrastructure 7) were performed. Given that the XGBoost library runs a large series of unit tests to guarantee the correct execution of the code for training and prediction of the models, the verification of the model specification was considered outside the scope of the project (test Infrastructure 2).
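To make the two automated monitoring tests discussed above more concrete (Monitoring 2, data invariants, and Monitoring 6, training speed and serving latency), the following sketch logs call durations and invariant violations with the Python standard library. It is illustrative only: the logger name, the decorator, and the placeholder forecast function are hypothetical, and the paper does not describe how its log files are written.

```python
import logging
import time
from functools import wraps

logger = logging.getLogger("forecast_service")  # hypothetical logger name


def log_duration(func):
    """Log how long each call takes, so latency or training-time drift shows in the logs."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - started
            logger.info("%s took %.3f s", func.__name__, elapsed)
    return wrapper


def check_invariants(consumption_kwh, temperature_c, previous_temperature_c):
    """Return the violated invariants of Section IV.B and log one warning for each."""
    violations = []
    if consumption_kwh < 0:
        violations.append("negative consumption")
    if abs(temperature_c - previous_temperature_c) > 10.0:
        violations.append("temperature step above 10 C")
    for violation in violations:
        logger.warning("invariant violated: %s", violation)
    return violations


@log_duration
def forecast_next_24h(last_day_kwh):
    """Placeholder forecast: repeat the last 24 hourly values (not the paper's model)."""
    return list(last_day_kwh[-24:])
```

Applying the same `check_invariants` function in both the offline and online pipelines is one simple way to keep the invariants identical in the two environments.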


TABLE IV. TESTS RELATED TO GIVEN metrics proposed in the literature, and the positive and negative
Test No. points in their due use, not development of systems
1 3 4 5 6 7
Set
two
of online machine learning, through a case study using real data
Dices AMMN/AN/AAM of energy consumption of 4 residences in the period of 2020 in
Model AN/YYYMM - order to approximate a real scenario found in the industry.
Infrastructure A- - AMN/A -
Monitoring - AN/AAMA -
The results also show the difficulty in defining metrics that
capture the relevant characteristics in the domain of the energy
Due to having a low number of users at the moment, or the
consumption prediction problem, especially the need to consider
project still does not pay careful attention to questions of social
both spatial and temporal errors. Other difficulties encountered
inclusion of the system (test Model 7). When new users are
during the development of the system were also considered, and
invited to participate, questions of representativeness of the
the impact of these difficulties were quantified from experiments
Brazilian population will be important so as not to be sent to the
system. in dynamic scenarios, we were able to analyze the impact of the
model training, reducing the adjusted error by 11.03% in relation
At the moment there were no changes in the structure of two to a model static.
input data, so that the monitoring of changes (test Monitoring 1)
is not currently carried out, but in future stages, in the case of
As future work, the application of the system in a larger
new features obtained from external sources, such as forecast of
number of users can help in the validation of its scalability, also
time, maybe inserted, this test will become more important.
taking into account the representation of the Brazilian population
in the selection of new testers to analyze the performance of two
Given that the model forecasts consumption for the next 24 models for different consumption profiles. Além of the residential
hours, or real-time monitoring of the quality of the forecasts made scope focused on this work, the application of the architecture of
(Test Monitoring 7) was not done, once its accuracy can only be MLOps proposed power
measured 24 hours after the forecast. also contribute to the prediction of consumption in industrial or
commercial establishments, being a possible future scenario to
analyze its viability, gains for the client, consumption peculiarities
The exploratory data analysis proved to be of extreme
and make a comparison
importance, satisfying various tests (Data 2, Model 5, Infrastructure com or residential sector.
5, Monitoring 5) that were previously performed manually, can be
reused in the future for addition to the pipeline, being executed automatically.

One of the main objections raised at the start of the project concerned the large number of functional changes expected at that stage, on the assumption that tests implemented then would quickly become obsolete. However, this belief proved unfounded, since simply defining the tests verified not only the correct execution of the code but also the development process itself, following the test-driven development (TDD) technique [28].

Data versioning proved important not only for the online environment but also for experimentation during the exploratory analysis of the data, guaranteeing the reproducibility of experiments carried out on previous versions.

During development it was necessary to balance the delivery of results with the execution of tests, so defining priorities was extremely important to keep the project from stalling. Priorities were selected according to the probability of the related problems occurring and the impact of those problems on the system.

Another key point for the execution of the tests was the modularization of the pipeline stages. When the functionality, expected inputs, and expected outputs of each stage are precisely defined, it becomes easier to change the source code and experiment with new configurations, and the impact of those changes on the project can be seen more clearly.

VI. CONCLUSION

This article presented the details of the implementation and the difficulties encountered in

ACKNOWLEDGMENTS

The authors thank the endowment fund “Amigos da Poli” for its financial support.

REFERENCES

[1] K. Carrie Armel, A. Gupta, G. Shrimali, and A. Albert, “Is disaggregation the holy grail of energy efficiency? The case of electricity,” Energy Policy, vol. 52, pp. 213–234, 2013.
[2] V. Hayashi, R. Arakaki, T. Fujii, K. Khalil, and F. Hayashi, “B2B B2C Architecture for Smart Meters using IoT and Machine Learning: a Brazilian Case Study,” International Conference on Smart Grids and Energy Systems, to be published, 2020.
[3] V. Hayashi, T. Fujii, R. Arakaki, H. Amaral, and A. Souza, “Boa Energia: Public Database of Residential Consumption with Data Quality,” 2020.
[4] S. Humeau, T. K. Wijaya, M. Vasirani, and K. Aberer, “Electricity load forecasting for residential customers: Exploiting aggregation and correlation between households,” 2013 Sustainable Internet and ICT for Sustainability, SustainIT 2013, 2013.
[5] P. B. M. Martins, R. G. D. Pinto, and S. P. Bittencourt, “Load Disaggregation of Industrial Machinery Power Consumption Monitoring Using Factorial Hidden Markov Models,” The International Workshop on Non-Intrusive Load Monitoring (NILM), p. 6, 2018.
[6] W. Kong, Z. Y. Dong, Y. Jia, D. J. Hill, Y. Xu, and Y. Zhang, “Short-Term Residential Load Forecasting Based on LSTM Recurrent Neural Network,” IEEE Transactions on Smart Grid, vol. 10, no. 1, pp. 841–851, 2019.
[7] P. Lusis, K. R. Khalilpour, L. Andrew, and A. Liebman, “Short-term residential load forecasting: Impact of calendar effects and forecast granularity,” Applied Energy, vol. 205, no. March, pp. 654–669, 2017.
[8] S. Ben Taieb and R. J. Hyndman, “A gradient boosting approach to the Kaggle load forecasting competition,” International Journal of Forecasting, vol. 30, no. 2, pp. 382–394, 2014.
[9] T. Serrenho and P. Bertoldi, “Smart home and appliances: State of the art,” Luxembourg, 2019.
[10] H. L. M. D. Amaral, J. A. G. Maginador, R. M. J. Ayres, A. N. de Souza, and D. S. Gastaldello, “Integration of consumption forecasting in smart meters and evaluating smart home management
systems,” SBSE 2018 - 7th Brazilian Electrical Systems Symposium, pp. 1–6, 2018.
[11] Y. Wang, Q. Chen, T. Hong, and C. Kang, “Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges,” IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 3125–3148, 2019.
[12] W. Tushar, T. K. Saha, C. Yuen, P. Liddell, R. Bean, and H. V. Poor, “Peer-to-Peer Energy Trading With Sustainable User Participation: A Game Theoretic Approach,” IEEE Access, vol. 6, no. October, pp. 62932–62943, 2018.
[13] A. Pratt, D. Krishnamurthy, M. Ruth, H. Wu, M. Lunacek, and P. Vaynshenk, “Transactive Home Energy Management Systems: The Impact of Their Proliferation on the Electric Grid,” IEEE Electrification Magazine, vol. 4, no. 4, pp. 8–14, Dec. 2016.
[14] A. Gerossier, R. Girard, A. Bocquet, and G. Kariniotakis, “Robust day-ahead forecasting of household electricity demand and operational challenges,” Energies, vol. 11, no. 12, 2018.
[15] M. Senapathi, J. Buchan, and H. Osman, “DevOps Capabilities, Practices, and Challenges,” in Proceedings of the 22nd International Conference on Evaluation and Assessment in Software Engineering 2018, Jun. 2018, pp. 57–67.
[16] D. Sculley et al., “Hidden technical debt in machine learning systems,” Advances in Neural Information Processing Systems, vol. 2015-Janua, pp. 2503–2511, 2015.
[17] Google Cloud, “MLOps: Continuous delivery and automation pipelines in machine learning,” 2020. https://cloud.google.com/solutions/machine-learning/mlops-continuous-delivery-and-automation-pipelines-in-machine-learning.
[18] E. Breck, S. Cai, E. Nielsen, M. Salib, and D. Sculley, “The ML test score: A rubric for ML production readiness and technical debt reduction,” in 2017 IEEE International Conference on Big Data (Big Data), Dec. 2017, vol. 47, no. 3, pp. 1123–1132.
[19] P. Sugimura and F. Hartl, “Building a reproducible machine learning pipeline,” arXiv, 2018.
[20] J. A. Guajardo, R. Weber, and J. Miranda, “A model updating strategy for predicting time series with seasonal patterns,” Applied Soft Computing Journal, vol. 10, no. 1, pp. 276–283, 2010.
[21] Y. Wang, Q. Chen, T. Hong, and C. Kang, “Review of Smart Meter Data Analytics: Applications, Methodologies, and Challenges,” IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 3125–3148, 2019.
[22] A. Stefan, V. Athitsos, and G. Das, “The move-split-merge metric for time series,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 6, pp. 1425–1438, 2013.
[23] V. Le Guen and N. Thome, “Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models,” no. NeurIPS, pp. 1–13, 2019.
[24] S. Haben, J. Ward, D. Vukadinovic Greetham, C. Singleton, and P. Grindrod, “A new error measure for forecasts of household level, high resolution electrical energy consumption,” International Journal of Forecasting, vol. 30, no. 2, pp. 246–256, 2014.
[25] R. Arakaki, V. T. Hayashi, and W. V. Ruggiero, “Available and Fault Tolerant IoT System: Applying Quality Engineering Method,” 2nd International Conference on Electrical, Communication and Computer Engineering, ICECCE 2020, no. June, 2020.
[26] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, vol. 13-17-August, pp. 785–794, 2016.
[27] Plotly Technologies Inc., “Dash.” Plotly Technologies Inc., Montreal, QC.
[28] K. Beck, Test Driven Development: By Example, 1st ed. Addison Wesley Professional, 2002.
