
AI Project

Electricity consumption Prediction

Hannan Aamir

Muhammad Zeeshan

Muhammad Noman Shahid

Introduction:
During the last few decades, energy demand and consumption have increased. With the growth of the world's population, energy demand keeps rising while production has not kept pace. Great importance has therefore been given to electric load prediction and forecasting, which is an important input used by many electric utilities for optimal planning and operational decisions. However, the level at which this management happens has shifted from large-scale management of the entire grid down to the scale of a single household, for reasons such as the management of a large grid being far harder than the management of a building or a house [1].
A significant prerequisite for small-scale energy management is the identification of effective models for forecasting electric power usage. This identification of load forecasting techniques is not limited to the most famous forecasting models, such as regression, but extends to finding the most suitable and efficient model for the specifics of each case study.

Dataset:
Our dataset comes from a Kaggle competition [2]. It has a total of 13 columns and 698,675 rows: 12 features and 1 target variable. The dataset contains time series data recording the meter reading of 100 buildings for every hour of one complete year. The first 10 months' data was used for training the model and the last two months' data was used to validate the results.
The 12 features and the target variable are as follows [2]:

1) building_id (int64):
Identifies each of the 100 buildings whose hourly electricity consumption was recorded over 12 months.
2) timestamp (object):
When the measurement was taken.
3) primary_use (object):
Indicator of the primary category of activities for the building, based on EnergyStar property type definitions.
4) square_feet (int64):
Area of the building.
5) year_built (int64):
Year the building was opened.
6) air_temperature (float64):
In degrees.
7) cloud_coverage (float64):
Portion of the sky covered in clouds, in oktas.
8) dew_temperature (float64):
In degrees.
9) precip_depth_1_hr (float64):
In millimeters.
10) sea_level_pressure (float64):
In millibars/hectopascals.
11) wind_direction (float64):
Compass direction (0-360).
12) wind_speed (float64):
In meters per second.
13) meter_reading (float64):
The target variable. Energy consumption in kWh.
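For illustration, a minimal pandas sketch of loading the data and checking the shape and column types might look like the following; the file name train.csv is an assumption, and the actual Kaggle file name may differ.

    import pandas as pd

    # Load the competition data (file name assumed)
    df = pd.read_csv("train.csv")

    print(df.shape)   # expected: (698675, 13)
    print(df.dtypes)  # building_id int64, timestamp object, meter_reading float64, ...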

Baseline:
As our problem required us to predict values, we assumed it could be solved through regression. We initially applied ridge regression to predict electricity consumption, but this yielded only 34% accuracy, so we moved on to lasso regression, which gave similar results. Our thinking had been that the data was multicollinear; we later realized this was not the case, and we kept these two models as our baselines.
Main Approach:
After establishing our baselines, we moved towards models that would give better results. All inputs given to the models were floating-point values, and each model returned its result as a column of predicted values. The training data was split 70/30 into train and test sets, with the 30 percent covering the later dates; the meter_reading of this portion was predicted after training the model, as sketched below.
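A minimal sketch of such a chronological split, assuming the dataframe df from above and that non-numeric columns (such as primary_use) have already been encoded as floats:

    # Sort by time so the held-out 30 percent covers the later dates
    df = df.sort_values("timestamp")

    split = int(len(df) * 0.7)
    train, test = df.iloc[:split], df.iloc[split:]

    # Separate the features from the target column
    X_train = train.drop(columns=["meter_reading", "timestamp"])
    y_train = train["meter_reading"]
    X_test = test.drop(columns=["meter_reading", "timestamp"])
    y_test = test["meter_reading"]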
Random Forest:
The first proper model implemented was random forest. The initial accuracy achieved was 77 percent; applying hyperparameter tuning increased this to 91 percent. The model took n_estimators, max_depth, and max_features as parameters: n_estimators is the number of trees used, max_depth is how deep each tree may grow, and max_features is the number of features considered at each split. After this model we needed to be sure the dataset was giving optimal results, so we applied a process of parameter reduction: we isolated the features that were most effective during processing and eliminated the rest. This decreased the computational load in exchange for very little loss of accuracy.
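A sketch of the tuned random forest, using the hyperparameter values reported in the results section (X_train and the other matrices come from the split above; note that max_features=14 assumes the encoded dataset has at least 14 input columns):

    from sklearn.ensemble import RandomForestRegressor

    rf = RandomForestRegressor(n_estimators=300,  # number of trees
                               max_depth=14,      # maximum depth of each tree
                               max_features=14)   # features considered at each split
    rf.fit(X_train, y_train)
    print(rf.score(X_test, y_test))  # r2 on the held-out 30 percent (around 0.91 reported)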
Gradient Boost:
We then applied gradient boosting to the dataset. The main reason for implementing different models was to see which gave the best accuracy; in the case of gradient boosting we got a maximum of 91 percent accuracy. The parameters taken by this model were n_estimators, max_depth, max_features, learning_rate, and subsample. As before, n_estimators is the number of trees, max_depth the depth of each tree, and max_features the number of features considered at each split. The learning rate shrinks the contribution of each tree, and subsample is the fraction of samples used to fit the individual base learners; a value less than 1 leads to stochastic gradient boosting.
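A corresponding sketch of the gradient boosting model, again with the hyperparameter values listed in the results section:

    from sklearn.ensemble import GradientBoostingRegressor

    gb = GradientBoostingRegressor(n_estimators=300,
                                   max_depth=14,
                                   max_features=14,
                                   learning_rate=0.1,  # shrinks each tree's contribution
                                   subsample=0.6)      # < 1 gives stochastic gradient boosting
    gb.fit(X_train, y_train)
    print(gb.score(X_test, y_test))  # around 0.91 reported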
K-Neighbors Regressor:
This model was applied to see if less computation could give comparable results. The model takes the reduced feature set, computes Minkowski distances, and then averages the target values of the specified number of nearest neighbors. The only parameter passed is the number of neighbors to consider; we used 5 neighbors, which gave 88 percent accuracy. A sketch follows.
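A minimal sketch of this model, assuming the reduced-feature matrices built in the feature-reduction step described later:

    from sklearn.neighbors import KNeighborsRegressor

    # Uses the Minkowski distance with p=2 (Euclidean) by default
    knn = KNeighborsRegressor(n_neighbors=5)
    knn.fit(X_train_reduced, y_train)
    print(knn.score(X_test_reduced, y_test))  # around 0.88 reported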
Neural Networks:
We had implemented multiple models at this point and simply wanted to explore the outcome with neural networks, so we implemented an artificial neural network and gave it the reduced-feature dataset to work on. We used three layers in this model: the first with 100 neurons, the second with only 20 neurons, and the final layer with a single neuron and a linear activation; the model was trained with an MSE loss. The accuracy with 100 epochs was 10 percent. As this was only an experiment, we did not look further into applying different networks or tuning the parameters.
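A Keras sketch of the described architecture; the hidden-layer activations and the Adam optimizer are assumptions, since only the layer sizes, the MSE loss, the linear output, and the 100 epochs are stated above:

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(7,)),               # 7 reduced features
        layers.Dense(100, activation="relu"),  # first layer; relu assumed
        layers.Dense(20, activation="relu"),   # second layer; relu assumed
        layers.Dense(1, activation="linear"),  # single output neuron
    ])
    model.compile(optimizer="adam", loss="mse")  # MSE loss; optimizer assumed
    model.fit(X_train_reduced, y_train, epochs=100, verbose=0)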

Summary of Approach:

As our problem is one of regression, the main approach towards the solution was to use regression models to predict the energy consumption of a given data point. We implemented 6 models on the dataset and then predicted the output value: ridge regression, lasso regression, random forest, gradient boosting, k-nearest neighbors, and a neural network.
Evaluation Metric:

The evaluation metric used for the regression is r2 (r squared), which is the proportion of the variance in the dependent variable that is predictable from the independent variables. That means 100% or 1.00 is the maximum r2 score we can achieve, representing that all the predictions made by the model are correct.

The models we used from the scikit-learn library each have a ".score" method that calculates r2 for regression models and accuracy for classification models. However, other metrics, such as RMSE, can be calculated on the models too, as in the sketch below.
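For example, with any of the fitted regressors above (rf here), r2 and RMSE can be computed as follows:

    from sklearn.metrics import r2_score, mean_squared_error

    preds = rf.predict(X_test)
    r2 = r2_score(y_test, preds)                     # same value as rf.score(X_test, y_test)
    rmse = mean_squared_error(y_test, preds) ** 0.5  # root mean squared error
    print(r2, rmse)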

Results & Analysis:

Lasso and ridge regression did not give satisfying results; the reason was high variance in a huge dataset. Their scores stayed around the 34% noted in the baseline section.

Random forest regression gave better results. After hyperparameter tuning was applied, the results became much better (about 91 percent) with the following parameters:

• 'n_estimators': [300]

• 'max_depth': [14]

• 'max_features': [14]
The gradient boosting regressor reached a similar accuracy (a maximum of 91 percent, as noted above) with the given hyperparameters:

• 'n_estimators': [300]

• 'max_depth': [14]

• 'max_features': [14]

• 'learning_rate': [0.1]

• 'subsample': [0.6]
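The bracketed values above read like scikit-learn parameter grids, so the tuning may be reproduced with a GridSearchCV sketch like the following (the cv and scoring settings are assumptions):

    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import GridSearchCV

    param_grid = {
        "n_estimators": [300],
        "max_depth": [14],
        "max_features": [14],
        "learning_rate": [0.1],
        "subsample": [0.6],
    }
    search = GridSearchCV(GradientBoostingRegressor(), param_grid, cv=3, scoring="r2")
    search.fit(X_train, y_train)
    print(search.best_params_, search.best_score_)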

Feature Reduced dataset and results:

This was the most important step: making the training and prediction of the models use fewer resources with the smallest possible trade-off in accuracy. Moreover, KNN and the neural network use the feature-reduced dataset for training, so this step was even more necessary for those models.

The approach used was based on the feature_importances_ attribute of the random forest regressor. This attribute gives higher values to the features that help reduce the variance in the model. The more important (higher-value) features were retained while the others were removed; only the top 7 features were kept, and the models were trained on the new reduced-feature dataset, as sketched below.
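A minimal sketch of that selection, assuming the fitted random forest rf from earlier:

    import numpy as np

    importances = rf.feature_importances_     # one value per input column
    top7 = np.argsort(importances)[::-1][:7]  # indices of the 7 most important features
    X_train_reduced = X_train.iloc[:, top7]
    X_test_reduced = X_test.iloc[:, top7]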

The random forest regressor's r2 score decreased a little on the reduced dataset.

Gradient boosting, however, gave better results on fewer features than on the full feature set.

KNN with 5 neighbors gave the results noted above (88 percent). This algorithm was the least resource-consuming of all the models while keeping good accuracy.
[Figure: KNN plot of the training and test values of meter_reading]

Error Analysis:
Lasso only gave importance to the "parking" feature and set the coefficients of the rest of the features to zero, which resulted in very poor predictions.

The KNN algorithm kept training for 20 minutes with the full feature set. From the documentation we learned that it needs the feature-reduced dataset; after that, the results came out better.

The neural network was used as a mere experiment and its results are not worth mentioning.

Future Work:

The neural network and KNN can certainly be improved; the models trained were still not fully optimized. Exploratory data analysis (EDA) can be done to improve the models further.

References

[1] https://www.energy.gov/sites/prod/files/2017/03/f34/qtr-2015-chapter5.pdf. [Online].


[2] https://www.kaggle.com/c/predicting-electricity-consumption. [Online].
