
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/327160009

Towards Hybrid Energy Consumption Prediction in Smart Grids with Machine Learning

Conference Paper · August 2018


All content following this page was uploaded by Abdulsalam Yassine on 22 August 2018.



Towards Hybrid Energy Consumption Prediction in
Smart Grids with Machine Learning
Shailendra Singh
Department of Electrical and Computer Engineering, Lakehead University, Thunder Bay, ON, Canada
Email: ssingh59@lakeheadu.ca

Abdulsalam Yassine and Rachid Benlamri
Department of Software Engineering, Lakehead University, Thunder Bay, ON, Canada
Email: ayassine@lakeheadu.ca, rbenlamr@lakeheadu.ca

Abstract: This paper addresses the problem of prediction accuracy of multivariate models. We propose a hybrid system to analyze the energy consumption data along with associated weather data at different time periods and address the limitations of learning techniques. Our model addresses a rather practical problem that applies to real-world scenarios where energy consumption data is influenced by multiple variables and varies according to the utility's cyber infrastructure. Such variations affect the accuracy of the model as time changes from day to day and during shoulder seasons. The proposed system combines both the long-term and the short-term learning mechanisms to achieve improved performance and accuracies. The performance and accuracy of the proposed model are evaluated experimentally using real-life data from the Thunder Bay electric grid system. The results show the significance of the proposed system for practical implementations.

I. INTRODUCTION

Energy consumption prediction is vital to the reliability of the electric grid, especially in urban areas where the increasing demand for electricity requires sufficient supply to keep communities functioning [1]. It is the responsibility of electric grid operators (EGOs) to balance supply and demand of energy ahead of time. To plan such balance, EGOs collect energy consumption data from power feeds and employ analytical and statistical models to forecast the future demand. However, with the introduction of smart grid applications, EGOs need to predict energy consumption at short or very short time periods to address stringent accuracy requirements for smart grid applications such as automatic demand response for commercial and residential consumers [2] [3] [4]. The major challenge that affects the accuracy of prediction in power networks is the correlation between weather and energy consumption. Energy consumption patterns and weather change significantly from year to year, season to season, or even day to day depending on local weather patterns. Therefore, it is essential for the prediction strategy to take into account the historical energy consumption and weather variations, but it is equally necessary to capture these variations on an ongoing basis to adapt to current situations, possibly daily or even a few times a day.

In this paper, we address the above-mentioned issue with a hybrid prediction mechanism in which models trained with long-term and short-term learning techniques complement each other and can result in higher accuracies, shorter run-times, and improved performance. From a machine learning perspective, a long-term learning approach captures the longer weather variations and energy usage patterns, but it is not adequate for capturing short-term changes in weather and energy usage patterns. One prediction strategy is therefore to relearn the model at a higher frequency. On the one hand, this is impractical and expensive for long-term learning, as the training dataset could be very large (e.g., a few years of data). On the other hand, it might fail to anticipate the shoulder times (transitions between seasons) for short-term learning due to variations in energy usage and weather. The solution is to train on smaller datasets, as this is inexpensive and can be conducted at a higher frequency (e.g., daily). An additional reason for hybrid prediction is that certain models might perform differently at different runs. For example, one model may perform better for the predictions conducted at 10:00 AM, while another model might outperform it during the next run at 11:00 AM if we decide to retrain the models every hour. Consequently, the most accurate prediction model must be determined dynamically at each run.

For the implementation and evaluation of the proposed hybrid prediction mechanism, an energy consumption dataset with high time resolution and historical weather data were acquired for Thunder Bay, Ontario, Canada. The energy dataset has a time resolution of 5 minutes for 1500 homes in Thunder Bay, ON, Canada, over a period of two years from January 2016 to December 2017. We also acquired the weather data from Environment Canada [5] for the same period, which includes measurements of temperature, wind chill, dew point temperature, wind speed, and relative humidity recorded at an interval of one hour.

The paper is organized as follows: Section II reviews the work related to the proposed system. Section III presents the details of the proposed approach, followed by evaluation and results in Section IV. Finally, in Section V, we conclude the paper and discuss future work.

II. RELATED WORK

Prediction mechanisms for energy consumption are extensively studied in the literature. The works presented in [6] and [7] use artificial neural networks to predict the short-term load of energy consumption. In [8] and [9], the authors minimize the complexity of the non-linear forecasting models using fuzzy logic and random forest techniques. Adaptive approaches that use Support Vector Regression (SVR) to perform energy predictions are studied in [10] and [11]. The authors in [12] and [13] proposed Bayesian networks and statistical methods to predict time series of energy consumption for short- and long-term peak load forecasting, while in [14] the focus of prediction is on identifying and learning occupants' activities for healthcare applications. The work in [15] proposes an adaptive univariate and multivariate approach to select the best forecasting model among several methods in the system as a means of coping with forecasting errors.
The above-discussed approaches have merit, but they suffer from shortcomings when real-world guarantees must be satisfied for meaningful end-user applications. As mentioned above, the accuracy of predicting energy consumption in real life is sensitive to the application domain of the smart grid system. In such domains, robust prediction models that are trained to capture the various features of the model are required to maintain higher accuracy. To this end, several researchers proposed hybrid prediction models such as those in [16] [17] [18] [19] and [20]. The work in [16] enhanced the accuracy with a Bayesian neural network (BNN) model and used the entropy-based criterion of a discrete wavelet transform (DWT) to decompose the load components into various levels of resolution. The authors later enhanced the model further in [17] with a correlation analysis that calculates the coefficients between the training inputs and output. Similar to [16] and [17], the work in [18] used the wavelet transform with a new ensemble method for short-term load forecasting based on extreme learning machine (ELM) and partial least squares regression. Dong et al. in [19] discussed five machine learning algorithms in a hybrid system to predict household energy consumption.

Unlike the above-mentioned studies, the proposed hybrid model addresses a rather practical problem that applies to real-world scenarios where energy consumption data is influenced by multiple variables and varies according to the utility's cyber infrastructure. Such variations affect the accuracy of the model as time changes from day to day and during shoulder seasons. These problems are not extensively studied in the current state-of-the-art.

III. PROPOSED SYSTEM

We propose a system to analyze the energy consumption data along with associated weather data at different time periods and address the limitations of learning techniques. It consists of a hybrid approach that combines both the long-term and the short-term learning mechanisms to achieve improved performance and accuracies. Figure 1 illustrates the overall architecture of the proposed system.

Fig. 1. System Architecture

The system has four stages: data pre-processing, long-term learning, short-term learning, and finally, model selection and predictions. The first stage accepts the historical data, that is, power consumption time-series and weather data from two sources, and tags the power consumption records with features from weather data such as temperature, wind chill, dew point temperature, wind speed, and relative humidity. Additionally, features such as time of day, day of the week, week of the year, day of the month, month, and season are also added.

The second and the third stages take the ready-to-mine data from the first stage and run different models against it to identify the best performing model, that is, the one observed to be the best fit. This is done while performing N-fold cross-validation using a rolling forecasting origin [21] on each prediction model. Both stages use a 70/20/10 rule to split the data into a training set (70%), a validation and tuning set (20%), and a test set (10%). The difference between the two stages is the size of the dataset used for training. Stage two deals with the long-term learning approach, where the maximum possible data is processed, whereas the third stage deals with the short-term learning approach using a rolling window mechanism that considers the last M months of historical data. The values of N and M can be defined before each execution or left at the default values of N = 10 and M = 6. Furthermore, in stages two and three we employ an optimization technique as a model selection strategy that utilizes the Root Mean Square Error (RMSE), RSquared, and Mean Absolute Error (MAE) metrics to evaluate various models and decide the best-fit prediction model for the data.

The fourth stage combines the two models that were chosen during the earlier two stages, one trained with long-term learning and the other with short-term learning, to form a hybrid prediction system for future energy consumption. The system tracks the accuracies of both models to ascertain the best performance and decide if they need retaining or
replacement before the next scheduled run of stages two and three. By doing so, we ensure that there will always be a best-fit model that produces the highest accuracy.

The proposed system can process energy consumption time-series data with any time resolution, i.e., the frequency of capturing power readings. Data with higher time resolution are desired for better results. Additionally, the accuracy of weather data has a direct impact on the efficiency of any prediction model. Therefore, accurate historical and forecast data are required. The power consumption dataset used for the evaluation of our system covers January 2016 to December 2017 and has a time resolution of 5 minutes for 1500 homes in Thunder Bay, ON, Canada. The weather data was extracted from Environment Canada [5] for the same period (2016-2017) with measures such as temperature, wind chill, dew point temperature, wind speed, and relative humidity recorded at an interval of one hour.

A. Predictive Models

We chose five of the most widely employed multivariate regression algorithms to build various prediction models. Additionally, the current system design facilitates plugging in additional algorithms on demand.

1) k-NN: k-nearest neighbors: The k-nearest neighbors algorithm is a non-parametric approach for regression analysis. The input to the algorithm is formed of the k closest training observations from the feature space. The outcome is the property value that is calculated by computing the average of the values of the k nearest neighbors [22] [23].

The Mahalanobis or Euclidean distance is calculated from the data point in consideration to the already labeled data points, which are ordered according to increasing distance. Thereafter, an inverse-distance weighted average over the k nearest neighbors is computed. The optimal value of k is determined based on RMSE and by using cross-validation [22] [23].

2) XGBoost: eXtreme Gradient Boosting: XGBoost is a highly scalable and performance-oriented implementation of gradient boosting machines (gradient boosted regression trees), where the underlying model is a tree ensemble. XGBoost controls over-fitting by using a regularized model formalization. The complexity of a tree model can be measured by the depth of the tree and the number of terminal nodes. To regularize the model, a constraint on the complexity measurement, or a penalty on the number of terminal nodes or the leaf weights, can be applied [24] [25].

3) Support Vector Machines with Radial Basis Function Kernel for Regression - SVR: SVM regression uses kernel functions and is regarded as a nonparametric approach [26]. The kernel functions convert the data to a higher dimensional space to facilitate linear separation. A margin of tolerance (epsilon) is set in the approximation, and training is achieved via a symmetrical loss function that penalizes over- and under-estimates equally. A flexible tube having a minimum radius is constructed encompassing the estimated function to ensure that errors below a threshold are ignored for both over- and under-estimates. Points outside the tube are therefore penalized, but not those inside it [27] [28].

4) CUBIST: The Cubist predictive approach is a rule-based model that is an extension to the M5P model tree proposed in [29], where M5P is an improved version of Classification And Regression Trees (CART). In CUBIST, trees are grown using M5P and then translated into a set of rules using paths from the root to the terminal nodes while combining and removing the paths. Rules can have overlapping criteria. Therefore, the final predicted value is determined by taking an average over the predictions from all the applicable rules. A prediction at the terminal node is made using the multivariate linear regression model, but it is further smoothed using the predictions from previous nodes of the tree [30] [31].

5) RANDOM FORESTS: Random forests is an ensemble method of learning for regression. It constructs a multitude of decision trees during training and outputs the mean prediction of all the trees grown, with a weight function applied to them [32] [33]. The random forests approach addresses the tendency of over-fitting generally associated with decision trees [34]. Random forests can be viewed as related to the k-NN method, as both follow weighted neighborhood schemes and the formation of the neighborhood adjusts according to the localized importance of features [35].

B. Dynamic Model Selection

In multivariate predictive analytics it is imperative to check how independent variables and data influence the decision-making. For example, do the model variables have equal influence on the result? Do all the prediction models perform equally well at all times and with every type of data? Will the accuracy change with time for the same trained model? In our case, weather data changes constantly and influences the energy consumption. Therefore, there is a need to keep training the prediction models at regular intervals and determine which one is the best fit to ensure predictive accuracy is within acceptable limits. Furthermore, on-the-fly or dynamic selection of the one best performing model among many is critical. Metrics such as RMSE, RSquared, and MAE are used to evaluate the different models and decide the best-fit prediction using the selection strategy at stages two and three for long-term and short-term learning respectively. Additionally, we utilize Partitioning Around Medoids (PAM/k-medoids) [36] cluster analysis to determine the final model for the predictions at the last stage in the hybrid system. A detailed discussion on final model selection and the prediction mechanism is presented in sub-section (III-F).

C. Stage I: Data Pre-processing and Preparation

The raw time series and weather data were cleaned by filling in the missing values and removing noise. We used the Random Forest regression method to fill missing values and k-means clustering to identify and eliminate the outliers. Next, we added features such as time of day, day of the week, week of the year, day of the month, month, and the season that represent various time granularities. Additionally, we added a
variable identifying if a specific day is declared a holiday or not. The rationale behind adding it was to explore whether energy consumption is affected by holidays, either positively or negatively, that is, by increasing or reducing the energy usage respectively.

The weather data was at a time resolution of 1 hour, but energy consumption was at a time resolution of 5 minutes. Therefore, to ensure the same timescale or resolution, the weather data was transformed into a time series with a 5-minute interval. Without loss of generality, we assumed that the weather remains more or less stable and does not change abruptly within a 1-hour time frame. Therefore, we replicated the hourly weather value to all the records for that hour. Later, the transformed weather data is used to tag energy consumption time series records with variables such as temperature, wind chill, dew point temperature, wind speed, and relative humidity to produce ready-to-mine data with fourteen variables. Table (I) presents a sample of ready-to-mine data.

D. Stage II: Long-term learning

In the long-term learning stage, the prediction models are trained while performing N-fold cross-validation using the rolling forecasting origin approach on the maximum training dataset available, which includes historical observations from many years in the past. A dataset split rule of (70/20/10) is used to split the data into a training set (70%), a validation and tuning set (20%), and a test set (10%). This stage of the system is computationally expensive due to the possibly large training dataset and the various iterations needed to tune each model so that the models are compared at their best efficiency and accuracy. Therefore, it is recommended to retrain the models at a low to very low frequency, on a need basis. We recommend training twice a year for the long-term learning approach.

E. Stage III: Short-term learning

In the short-term learning stage, the models are trained using a rolling window approach that considers only the last M months of historical data, i.e., M months of historical evidence is selected and split into a training set (70%), a validation and tuning set (20%), and a test set (10%). The variable M can be initialized at each run of the system. Hence, the training dataset size is small, and the training and evaluation of various models with multiple iterations are computationally inexpensive. Therefore, the retraining can be performed daily or multiple times in a day. At each run in this stage, the best performing model is chosen using the model selection strategy.

F. Final Model Selection and Prediction Mechanism

The proposed system combines the two models chosen in stages two and three to form a hybrid system for dynamic predictions. Additionally, the system tracks the actual prediction accuracy of the chosen models and compares each prediction against the actual value when it becomes available. First, it is important to decide if the models are performing as expected or require retraining. This is decided by constantly monitoring the accuracy levels and checking them against a pre-defined threshold level. Second, the system identifies the model that exhibits better accuracy at a given time stamp with a 5-minute resolution for the next 24-hour period, such as 12:05 PM, 12:10 PM, 12:15 PM, and so on. We employ k-medoids (PAM, Partitioning Around Medoids) [36] cluster analysis to cluster the observations of the winning model on the basis of time of day. The optimal number of clusters k is determined by computing the average silhouette width. The results are updated after each prediction event, ensuring the optimal model selection for the next time period.

IV. SYSTEM EVALUATION

A. Dataset and System Setup

The dataset used is actual energy consumption on the power feed connecting to 1500 homes in Thunder Bay, Ontario, Canada. These power measurements are in kilowatts and contain a two-year energy time series collected from January 2016 to December 2017 at a 5-minute time resolution. Weather data was extracted from Environment Canada [5] for the same period of two years (2016-2017) with measures such as temperature, wind chill, dew point temperature, wind speed, and relative humidity recorded at an interval of one hour.

We used an Ubuntu Linux machine with an Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz (4 CPUs) and 64 GB of memory to implement and conduct experiments. We used the following R language libraries for our system development and implementation:

• caret: classification and regression training [37]
• kknn: Weighted k-Nearest Neighbors for Classification, Regression and Clustering [38]
• xgboost: Extreme Gradient Boosting [39]
• kernlab: An S4 Package for Kernel Methods in R [40]
• Cubist: Rule And Instance Based Regression Modeling [41]
• ranger: A Fast Implementation of Random Forests [42]
• fpc: Flexible procedures for clustering [43]

B. Results

Figure 2 presents the comparison of prediction models with the long-term training technique. The Random Forest model was found to outperform the other models and was therefore identified by the system as the best fit. Similarly, Figure 3 presents the model comparison for the short-term learning technique. The CUBIST prediction model was found to be the best-fit model for the specific execution. It is important to note that the selected model can change at each consecutive run of the system because new data becomes available for training.

Table II shows the comparison of variable importance for the two final models identified for predictions. For Random Forest, wind chill, temperature, dew point temperature, and wind speed were the determining variables, whereas other variables such as relative humidity and the time components have no influence. In the case of CUBIST, the variables temperature, dew point temperature, relative humidity, day of month (MonthDay), day of week (WeekDay), wind speed, and
wind chill were the determining variables, but other variables such as day of year (YearDay) and week of year (YearWeek) have very little or no influence.

TABLE I
READY TO MINE DATA: SAMPLE

DateTime    Power KW   Temperature  Dew Point    Relative  Wind   Wind   Holiday  WeekDay  MonthDay  YearDay  Month  YearWeek  Season
                                    Temperature  Humidity  Speed  Chill
1451626200  47178.744  -8.2         -10.9        81        13     -14    Yes      5        1         0        1      1         Winter
1451628900  45305.868  -8.2         -10.9        81        13     -14    Yes      5        1         0        1      1         Winter
1451627100  46648.74   -8.2         -10.9        81        13     -14    Yes      5        1         0        1      1         Winter
1451631300  43980.047  -8.1         -10.8        81        15     -14    Yes      5        1         0        1      1         Winter
1451625300  47735.842  -8.5         -11          82        11     -14    Yes      5        1         0        1      1         Winter
1451696100  54666.551  -5.7         -8.8         79        15     -11    Yes      5        1         0        1      1         Winter
1451631900  43892.473  -8.1         -10.8        81        15     -14    Yes      5        1         0        1      1         Winter
1451627700  45988.737  -8.2         -10.9        81        13     -14    Yes      5        1         0        1      1         Winter
1451626500  46928.385  -8.2         -10.9        81        13     -14    Yes      5        1         0        1      1         Winter
1451634900  42722.338  -7.9         -10.8        80        17     -15    Yes      5        1         0        1      1         Winter
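
The time-derived columns of Table I and the weather tagging described in Section III-C can be illustrated with a short sketch. The paper's implementation used R, so the Python below is only an approximation under stated assumptions: a fixed UTC-5 offset for local time, Sunday-based WeekDay and zero-based YearDay numbering (inferred from the sample rows), a simple week-of-year rule, meteorological seasons, a caller-supplied holiday set, and replication of each hourly weather reading to the 5-minute records within that hour. All function, key, and variable names are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

# Assumption: local time is a fixed UTC-5 offset (no daylight-saving handling).
LOCAL_TZ = timezone(timedelta(hours=-5))

# Assumption: meteorological seasons; the paper does not define its season rule.
SEASONS = {12: "Winter", 1: "Winter", 2: "Winter",
           3: "Spring", 4: "Spring", 5: "Spring",
           6: "Summer", 7: "Summer", 8: "Summer",
           9: "Fall", 10: "Fall", 11: "Fall"}


def time_features(epoch_seconds, holidays):
    """Derive calendar features for one reading (hypothetical helper)."""
    local = datetime.fromtimestamp(epoch_seconds, tz=LOCAL_TZ)
    year_day = local.timetuple().tm_yday - 1      # zero-based day of year
    return {
        "Holiday": "Yes" if local.date() in holidays else "No",
        "WeekDay": local.isoweekday() % 7,        # Sunday = 0 ... Saturday = 6
        "MonthDay": local.day,
        "YearDay": year_day,
        "Month": local.month,
        "YearWeek": year_day // 7 + 1,            # assumed week-of-year rule
        "Season": SEASONS[local.month],
    }


def tag_with_weather(power_rows, hourly_weather, holidays):
    """Attach the hourly weather reading and time features to 5-minute rows."""
    tagged = []
    for epoch_seconds, power_kw in power_rows:
        # Floor to the start of the hour (valid because the offset is whole hours).
        hour_start = epoch_seconds - epoch_seconds % 3600
        row = {"DateTime": epoch_seconds, "PowerKW": power_kw}
        row.update(hourly_weather[hour_start])    # same values for the whole hour
        row.update(time_features(epoch_seconds, holidays))
        tagged.append(row)
    return tagged


# Two readings and their weather values taken from Table I.
power_rows = [(1451628900, 45305.868), (1451631900, 43892.473)]
hourly_weather = {
    1451628000: {"Temperature": -8.2, "DewPoint": -10.9, "RelHumidity": 81,
                 "WindSpeed": 13, "WindChill": -14},
    1451631600: {"Temperature": -8.1, "DewPoint": -10.8, "RelHumidity": 81,
                 "WindSpeed": 15, "WindChill": -14},
}
holidays = {date(2016, 1, 1)}

ready_to_mine = tag_with_weather(power_rows, hourly_weather, holidays)
print(ready_to_mine[0])
```

Keying the weather lookup by the start of the hour keeps the replication step a plain dictionary access; a real pipeline would also need a policy for hours with no weather reading, which this sketch does not cover.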

TABLE II
VARIABLE IMPORTANCE: COMPARISON

                         Variable Importance
Feature                  Random Forest   CUBIST
Wind Chill               100.00          46.20
Temperature              91.84           100.00
Dew Point Temperature    86.54           99.37
Wind Speed               3.14            63.29
Month                    0.32            0.00
YearWeek                 0.31            1.27
YearDay                  0.29            8.23
WeekDay                  0.04            75.32
MonthDay                 0.01            82.28
Relative Humidity        0.00            98.73

Fig. 2. Model Performance Comparison: Long Term Training

Fig. 3. Model Performance Comparison: Short Term Training

Figures 4 and 6 compare the energy predictions of the long-term and short-term learning strategies; the former presents a normal day and the latter presents a day during a transition of weather, i.e., shoulder time. The corresponding weather conditions are shown in Figures 5 and 7 respectively. We noted that the short-term learning technique performs well during a season but might suffer from low accuracy during the transition period. In the case presented for the short-term learning approach, the prediction follows the general pattern of power consumption but consistently overestimates the expected power consumption, whereas the long-term learning approach performs at a much better accuracy level.
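
The comparison above rests on the error metrics named in Section III (RMSE, RSquared, and MAE). As a rough illustration only, the following Python sketch computes these metrics for two candidate models and picks one. The paper does not spell out its optimization technique, so the selection rule (lowest RMSE, ties broken on MAE), the RSquared definition, the added mean-bias helper, all function names, and the sample numbers are assumptions made for this example, not the authors' implementation or data.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_bias(actual, predicted):
    """Average signed error; positive values indicate overestimation."""
    return sum(p - a for a, p in zip(actual, predicted)) / len(actual)


def select_best(actual, candidates):
    """Pick the candidate with the lowest RMSE, ties broken on MAE (assumed rule)."""
    return min(candidates,
               key=lambda name: (rmse(actual, candidates[name]),
                                 mae(actual, candidates[name])))


# Made-up readings in kW, for illustration only (not data from the paper).
actual = [100.0, 110.0, 120.0, 130.0]
candidates = {
    "long_term": [102.0, 108.0, 121.0, 129.0],
    "short_term": [110.0, 120.0, 130.0, 140.0],  # follows the pattern, reads high
}

for name, predicted in candidates.items():
    print(name,
          round(rmse(actual, predicted), 3),
          round(mae(actual, predicted), 3),
          round(r_squared(actual, predicted), 3),
          round(mean_bias(actual, predicted), 3))
print("best:", select_best(actual, candidates))
```

A model that tracks the shape of the load curve but sits consistently above it, as in the shoulder-time case, shows up here as a positive mean bias together with a high RMSE.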
Over time, the accuracy of the long-term learning model may drop because of the long gap between training runs and the inability to capture the current impact of weather conditions. Figure 8 presents one such case, where the model accuracy is much lower than that of a model trained on a short-term basis. Figure 9 shows the weather conditions for the chosen day. The long-term model therefore requires relearning or retraining. However, it is computationally expensive to retrain these models at very short time intervals. Short-term learning models might suffer from low accuracies during shoulder times, but are computationally inexpensive to retrain. Therefore, a hybrid approach that combines both models into one system to achieve better accuracies and performance is highly desirable.

V. CONCLUSION AND FUTURE WORK

In this paper we proposed a hybrid system to analyze the energy consumption data along with associated weather data at different time periods. The developed hybrid model combines long-term learning and short-term learning techniques for better accuracies, shorter run-times, and improved performance. The proposed system includes a selection strategy to determine the prediction model and tracks the accuracy for all time periods. In the future, we plan to formally present the hybrid model and the optimization technique for determining the accuracy error. Also, we plan to test our model with various datasets from different regions to evaluate the robustness and validity of the system.
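
As a supplementary illustration of the two learning modes that the hybrid system combines (Sections III-D and III-E), the sketch below selects the training data for each mode and applies the 70/20/10 split. It is a minimal Python approximation rather than the authors' R implementation: the chronological ordering of the split, the month arithmetic (day clamped to 28), the function names, and the placeholder monthly series are all assumptions, and the N-fold rolling forecasting origin cross-validation is not shown.

```python
from datetime import datetime


def months_back(moment, months):
    """Shift a datetime back by whole calendar months.

    Assumption: the day is clamped to 28 so the result is always a valid date.
    """
    total = moment.year * 12 + (moment.month - 1) - months
    year, month_index = divmod(total, 12)
    return moment.replace(year=year, month=month_index + 1, day=min(moment.day, 28))


def split_70_20_10(records):
    """Chronological split into training, validation/tuning, and test sets."""
    n_train = len(records) * 70 // 100
    n_val = len(records) * 20 // 100
    return (records[:n_train],
            records[n_train:n_train + n_val],
            records[n_train + n_val:])


def long_term_window(records):
    """Long-term learning: use every available record."""
    return list(records)


def short_term_window(records, now, m_months=6):
    """Short-term learning: keep only the last M months (default M = 6)."""
    cutoff = months_back(now, m_months)
    return [(stamp, value) for stamp, value in records if stamp >= cutoff]


# Placeholder series: one record per month for 2016-2017; values are just indices.
records = [(datetime(2016 + i // 12, i % 12 + 1, 1), float(i)) for i in range(24)]
now = datetime(2017, 12, 31)

long_sets = split_70_20_10(long_term_window(records))
short_sets = split_70_20_10(short_term_window(records, now))
print([len(part) for part in long_sets])   # sizes of the long-term splits
print([len(part) for part in short_sets])  # sizes of the short-term splits
```

The short-term window is recomputed from the current time at every run, which is what makes frequent retraining cheap compared with refitting on the full history.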
Fig. 4. Energy Predictions Long Term Vs. Short Term Training

Fig. 5. Weather Conditions: Day Of Prediction

Fig. 6. Energy Predictions Long Term Vs Short Term Training: Shoulder Times

Fig. 7. Weather Conditions: Day Of Prediction (Shoulder Times)

Fig. 8. Energy Predictions Long Term Vs Short Term Training: Long-Term Learning Model with Reduced Accuracy

Fig. 9. Weather Conditions: Day Of Prediction (Long-Term Learning Model with Reduced Accuracy)

REFERENCES
[1] G.K.F. Tso and K.K.W. Yau, Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks. Elsevier, Energy, Volume 32, Issue 9, Pages 1761-1768, September 2007.
[2] A. Yassine, Implementation challenges of automatic demand response for households in smart grids. 3rd International Conference on Renewable Energies for Developing Countries (REDEC), Zouk Mosbeh, 2016, pp. 1-6.
[3] A. Yassine, A. A. Nazari Shirehjini and S. Shirmohammadi, "Smart Meters Big Data: Game Theoretic Model for Fair Data Sharing in Deregulated Smart Grids", in IEEE Access, vol. 3, pp. 2743-2754, 2015.
[4] A. Yassine, Cooperative games among consumers in the smart grid. 7th IEEE GCC Conference and Exhibition (GCC), Doha, 2013, pp. 70-75.
[5] Environment Canada, climate.weather.gc.ca/
[6] L. Hernández, C. Baladrón, J.M. Aguiar, L. Calavia, B. Carro, A. Sánchez-Esguevillas, F. Pérez, A. Fernández, J. Lloret, Artificial neural network for short-term load forecasting in distribution systems. Energies 2014, 7, 1576-1598.
[7] A.S. Khwaja, M. Naeem, A. Anpalagan, A. Venetsanopoulos, B. Venkatesh, Improved short-term load forecasting using bagged neural networks. Electric Power Systems Research 2015, 125, 109-115.
[8] K.B. Song, Y.S. Baek, D.H. Hong, G. Jang, Short-term load forecasting for the holidays using fuzzy linear regression method. IEEE Transactions on Power Systems, 2005, 20, pages 96-101.
[9] N. Huang, G. Lu, D. Xu, A Permutation Importance-Based Feature Selection Method for Short-Term Electricity Load Forecasting Using Random Forest. Energies 2016, 9, 767.
[10] G. Lv, X. Wang, Y. Jin, Short-Term Load Forecasting in Power System Using Least Squares Support Vector Machine. Computational Intelligence Theory Applications. 2006, 38, 117-126.
[11] Y.H. Chen, W.C. Hong, W. Shen, N.N. Huang, Electric Load Forecasting Based on a Least Squares Support Vector Machine with Fuzzy Time Series and Global Harmony Search Algorithm. Energies 2016, 9, 70.
[12] S. Singh, A. Yassine, Big Data Mining of Energy Time Series for Behavioral Analytics and Energy Consumption Forecasting. Energies 2018, 11, 452.
[13] S. Singh and A. Yassine, Mining Energy Consumption Behavior Patterns for Households in Smart Grid. IEEE Transactions on Emerging Topics in Computing. doi: 10.1109/TETC.2017.2692098
[14] A. Yassine, S. Singh and A. Alamri, Mining Human Activity Patterns From Smart Home Big Data for Health Care Applications. IEEE Access, vol. 5, pp. 13131-13141, 2017. doi: 10.1109/ACCESS.2017.2719921
[15] M. Matijaš, J.A. Suykens, S. Krajcar, Load forecasting using a multivariate meta-learning system. Expert Systems with Applications 2013, 40, 4427-4437.
[16] M. Ghayekhloo, M.B. Menhaj, M. Ghofrani, A hybrid short-term load forecasting with a new data preprocessing framework. Elsevier, Electric Power Systems Research 119 (2015) 138-148.
[17] M. Ghofrani, M. Ghayekhloo, A. Arabali, A. Ghayekhloo, A hybrid short-term load forecasting with a new input selection framework. Elsevier, Energy 81, 777-786, 2015.
[18] S. Li, L. Goel, P. Wang, An ensemble approach for short-term load forecasting by extreme learning machine. Elsevier, Applied Energy 170, 22-29, 2016.
[19] B. Dong, Z. Li, S.M. Mahbobur Rahman, R. Vega, A hybrid model approach for forecasting future residential electricity consumption. Energy and Buildings 117, 341-351, 2016.
[20] C.W. Lee, B.Y. Lin, Application of Hybrid Quantum Tabu Search with Support Vector Regression (SVR) for Load Forecasting. Energies 2016, 9, 873.
[21] R. J. Hyndman, Cross-validation for time series, 2016, https://robjhyndman.com/hyndsight/tscv/
[22] L. Breiman, Classification and Regression Trees, New York: Routledge, 1984.
[23] N. S. Altman (2012) An Introduction to Kernel and Nearest-Neighbor Nonparametric Regression, The American Statistician, 46:3, 175-185, DOI: 10.1080/00031305.1992.10475879
[24] T. Chen, C. Guestrin, XGBoost: A Scalable Tree Boosting System, CoRR, abs/1603.02754, 2016, http://arxiv.org/abs/1603.02754
[25] Scalable and Flexible Gradient Boosting, http://xgboost.readthedocs.io/en/latest/index.html
[26] V. Vapnik, The Nature of Statistical Learning Theory. Information Science and Statistics, Springer, New York, 1995. DOI: 10.1007/978-1-4757-3264-1, ISBN: 9781475732641
[27] https://www.mathworks.com/help/stats/understanding-support-vector-machine-regression.html
[28] M. Awad, R. Khanna, Support Vector Regression. In: Efficient Learning Machines: Theories, Concepts, and Applications for Engineers and System Designers. Apress, Berkeley, CA, DOI https://doi.org/10.1007, 2015
[29] J.R. Quinlan, Learning with continuous classes, 5th Australian joint conference on artificial intelligence, World Scientific, Singapore (1992), pp. 343-348
[30] L. Yang, S. Liu, S. Tsoka, L. G. Papageorgiou, A regression tree approach using mathematical programming, Expert Systems with Applications, Volume 78, 2017, Pages 347-357, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2017.02.013.
[31] Data Mining with Cubist, https://www.rulequest.com/cubist-info.html
[32] T.K. Ho, Random Decision Forests. Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, 14-16 August 1995, pp. 278-282.
[33] T.K. Ho, The Random Subspace Method for Constructing Decision Forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20 (8): 832-844, doi:10.1109/34.709601, 1998.
[34] T. Hastie, R. Tibshirani, J. Friedman, The Elements of Statistical Learning (2nd ed.), Springer. ISBN 0-387-95284-5, 2008.
[35] Y. Lin and Y. Jeon, Random forests and adaptive nearest neighbors (Technical report). Technical Report No. 1055. University of Wisconsin, 2002.
[36] L. Kaufman and P. J. Rousseeuw, Finding groups in data: an introduction to cluster analysis, 68-125, 1990.
[37] M. Kuhn, Caret: classification and regression training. Astrophysics Source Code Library (2015).
[38] K. Hechenbichler and K.P. Schliep, Weighted k-Nearest-Neighbor Techniques and Ordinal Classification, Discussion Paper 399, SFB 386, Ludwig-Maximilians University Munich
[39] T. Chen, T. He, M. Benesty, V. Khotilovich, Y. Tang, H. Cho, K. Chen, R. Mitchell, I. Cano, T. Zhou, M. Li, J. Xie, M. Lin, Y. Geng, Y. Li, XGBoost contributors (base XGBoost implementation), xgboost: Extreme Gradient Boosting, https://cran.r-project.org/web/packages/xgboost/
[40] A. Karatzoglou, A. Smola, K. Hornik, A. Zeileis, kernlab - An S4 Package for Kernel Methods in R. Journal of Statistical Software 11(9), 1-20, 2004.
[41] M. Kuhn, S. Weston, C. Keefer, N. Coulter, R. Quinlan, Rulequest Research Pty Ltd., Cubist: Rule- And Instance-Based Regression Modeling, https://cran.r-project.org/web/packages/Cubist/index.html
[42] M. N. Wright, A. Ziegler, ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. Journal of Statistical Software, 77(1), 1-17, 2017. doi:10.18637/jss.v077.i01, https://cran.r-project.org/web/packages/ranger/index.html
[43] C. Hennig, Fpc: Flexible procedures for clustering. R Package Version 2.0-3, 2010, https://cran.r-project.org/web/packages/fpc/index.html
