
Available online at www.sciencedirect.com
ScienceDirect

Energy Reports 8 (2022) 705–709


www.elsevier.com/locate/egyr

2021 The 2nd International Conference on Power Engineering (ICPE 2021), December 09–11,
2021, Nanning, Guangxi, China

Experimental and analysis on household electronic power consumption
Jing Qin
Northeastern University, Wenhua Road, Heping District, Shenyang, 110819, China
Received 28 January 2022; accepted 22 February 2022
Available online 9 March 2022

Abstract
Household power consumption data help the power supply department understand how residents use electricity and
detect abnormal power consumption. Taking the individual household electric power consumption dataset as an
example, this paper establishes an extensible experimental analysis framework and analyzes the data visually.
In the experiment, the performance of a linear regression model and a neural network model is compared on different features.
The results show that the neural network model outperforms the linear regression model on the experimental dataset.
© 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of the scientific committee of the 2021 The 2nd International Conference on Power Engineering, ICPE, 2021.

Keywords: Power consumption; Household electric; Linear model; Neural network model

1. Introduction
Household power consumption data not only reflect how a household uses electricity, but also provide
information that helps the power sector understand the power supply. In addition, the data can serve as a
reference for the power bureau when collecting electricity charges. Abnormal readings can also be verified
against the historical household power consumption data.
In recent years, power consumption prediction has become a hot research topic [1–8]. Taking the individual
household electric power consumption dataset as an example
(http://archive.ics.uci.edu/ml/datasets/Individual+Household+electric+power+consumption), this paper designs an
experimental framework for predicting and analyzing user power consumption data. Many machine learning and deep
learning methods can be applied to the prediction of user power consumption, but linear regression is the most
basic method and can reflect the correlation between features. Although linear regression is a basic method, it
is still widely used [9–11]. In addition, as the basis of deep learning models, the neural network model is also
widely used [12–14]. Therefore, this experiment uses the linear regression model and the neural network model in
the prediction method module to predict power consumption.

E-mail address: liuj_job@126.com.
https://doi.org/10.1016/j.egyr.2022.02.270

2. Experimental
This section introduces the dataset and framework, and reports experimental results.

2.1. Dataset and general framework

Dataset: This experiment uses the individual household electric power consumption dataset, which contains
2,075,259 records. The dataset records the power consumption of one household over about 4 years, and the features
sub_metering_1, sub_metering_2 and sub_metering_3 represent the power consumption of different appliances.
Experimental environment: To facilitate storage of the dataset, this experiment uses a SQL Server
database. The programming language used in the experiment is Python.
General framework:
The purpose of this experiment is to compute power consumption statistics and to predict power consumption. The
framework consists of four modules, as shown in Fig. 1.

Fig. 1. General framework.

The specific functions of each module are as follows:


Data preprocessing module: cleans the data and aggregates them by minute, day and month; the results
are saved to the SQL Server database in the form of views.
Data analysis module: (1) computes the user's power consumption statistics per minute, per day and per month, and
displays the data with box plots; (2) analyzes the association between features in the dataset and represents it
with a heatmap.
Prediction method module: predicts the power consumption of users using different models. In the experiment,
this module contains the linear regression model and the neural network model.
Evaluation result module: contains different evaluation metrics. In the experiment, this module contains the R²
score metric.
Every module in the framework is extensible. The results obtained in the data analysis module and the evaluation
result module are reported and analyzed in turn below.
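The preprocessing step described above (cleaning, then day and month statistics stored as views) could look roughly like the following minimal sketch. It is not the code used in the experiment: an in-memory SQLite database stands in for SQL Server so that the example is self-contained, and the table name `readings`, the view names and the sample rows are all hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite replaces SQL Server for portability.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings ("
    "ts TEXT, sub_metering_1 REAL, sub_metering_2 REAL, sub_metering_3 REAL)"
)

# Hypothetical per-minute rows; None marks a missing measurement.
rows = [
    ("2007-01-01 00:00:00", 0, 1, 17),
    ("2007-01-01 00:01:00", 0, 2, 18),
    ("2007-01-01 00:02:00", None, None, None),
    ("2007-01-02 00:00:00", 1, 0, 19),
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?, ?)", rows)

# Data cleaning: keep only rows in which all three sub-meterings are present.
conn.execute("""
    CREATE VIEW clean_readings AS
    SELECT * FROM readings
    WHERE sub_metering_1 IS NOT NULL
      AND sub_metering_2 IS NOT NULL
      AND sub_metering_3 IS NOT NULL
""")

# Daily and monthly totals, each saved as a view.
conn.execute("""
    CREATE VIEW daily_consumption AS
    SELECT date(ts) AS day,
           SUM(sub_metering_1), SUM(sub_metering_2), SUM(sub_metering_3)
    FROM clean_readings
    GROUP BY date(ts)
""")
conn.execute("""
    CREATE VIEW monthly_consumption AS
    SELECT strftime('%Y-%m', ts) AS month,
           SUM(sub_metering_1), SUM(sub_metering_2), SUM(sub_metering_3)
    FROM clean_readings
    GROUP BY strftime('%Y-%m', ts)
""")

daily = conn.execute("SELECT * FROM daily_consumption ORDER BY day").fetchall()
monthly = conn.execute("SELECT * FROM monthly_consumption ORDER BY month").fetchall()
```

Keeping the aggregates as views means the analysis and prediction modules can query them without recomputing the statistics from the raw per-minute records.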

2.2. The results in the data analysis module

(1) User’s power consumption statistics per minute and day



Table 1. Electricity power consumption per minute.


Type            Max value  Min value  Mean value  Median value
sub_metering_1  88         0          1.12        0
sub_metering_2  80         0          1.30        0
sub_metering_3  31         0          6.46        1

Fig. 2. Box plot for electricity power consumption per minute (sub_metering_1, sub_metering_2, sub_metering_3).

Table 1 shows the maximum, minimum, mean and median of the per-minute power consumption of the different
appliances (sub_metering_1, sub_metering_2 and sub_metering_3).
The box plot of the data distribution of sub_metering_1, sub_metering_2 and sub_metering_3 is shown in Fig. 2.
As can be seen from Fig. 2, the outlier values of sub_metering_3 are lower than those of sub_metering_1 and
sub_metering_2.
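The summary statistics in the tables and the outliers drawn in a box plot can be reproduced with a few lines of standard-library Python. The sketch below assumes the common 1.5 × IQR whisker rule for outliers, which the paper does not state explicitly; the function names and the sample readings are hypothetical.

```python
import statistics


def summarize(values):
    """Return the four statistics reported per feature: max, min, mean, median."""
    return {
        "max": max(values),
        "min": min(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def tukey_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed whisker rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical per-minute readings: mostly zero with one large spike.
sample = [0, 0, 0, 0, 1, 0, 0, 2, 0, 38, 0, 1]
stats = summarize(sample)
outliers = tukey_outliers(sample)
```

A distribution like this one, with a median of zero and a mean pulled up by rare spikes, matches the pattern of sub_metering_1 and sub_metering_2 in Table 1.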
The user's daily power consumption statistics are shown in Table 2. The user's power consumption data cover
1433 days.

Table 2. Electricity power consumption per day.


Type            Max value  Min value  Mean value  Median value
sub_metering_1  11178      0          1604.421    1113
sub_metering_2  12109      0          1856.965    682
sub_metering_3  23743      0          9235.985    9273

The box plot of the user’s daily power consumption is shown in Fig. 3.

Fig. 3. Box plot for electricity power consumption per day (sub_metering_1, sub_metering_2, sub_metering_3).

As can be seen from Fig. 3, there are few outliers in sub_metering_3.
(2) Statistics of users' monthly power consumption
So that every calendar month is fully covered in each year, the statistics use all the data of
2007, 2008 and 2009 in the dataset. The statistical results of power consumption are shown in Fig. 4.

Fig. 4. Distribution of different types of electric power consumption.

Fig. 5. The heatmap of correlation between features.

As can be seen from Fig. 4, sub_metering_3 has the largest power consumption of the three types, while the
data distributions of sub_metering_1 and sub_metering_2 are similar.
(3) Correlation of data
The dataset contains seven features, and a heatmap is used to represent the correlation between the features,
as shown in Fig. 5. The Pearson correlation coefficient is used to calculate the correlation between features,
and the formula is as follows.

$$R_{i,j} = \frac{\mathrm{Cov}(i, j)}{\sqrt{\mathrm{Var}(i)\,\mathrm{Var}(j)}} \qquad (1)$$

where i and j denote feature vectors, Cov(i, j) denotes the covariance of i and j, and Var(i) and Var(j) denote
the variances of i and j, so the denominator is the product of their standard deviations.
The correlation coefficient ranges from −1 to 1. The larger its absolute value, the stronger the linear
correlation between the two vectors. Global_active_power has the highest correlation with sub_metering_3;
therefore, Global_active_power is important when predicting sub_metering_3.
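As a worked illustration of the Pearson coefficient, the following minimal sketch computes it from the covariance and the two variances using only the standard library. The function name `pearson` and the toy vectors are hypothetical. Population (divide by n) estimates are used, which is an arbitrary choice because the normalizing factor cancels in the ratio.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return cov / math.sqrt(var_x * var_y)


# Toy vectors: one perfectly increasing pair and one perfectly decreasing pair.
r_pos = pearson([1, 2, 3, 4], [2, 4, 6, 8])
r_neg = pearson([1, 2, 3], [3, 2, 1])
```

Applying such a function to every pair of features gives the matrix that a heatmap like Fig. 5 displays.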

2.3. The results in the evaluation result module

The module uses the per-minute power consumption statistics to predict the sub_metering_3 value. The linear
model is the linear regression model based on the least squares method, that is, the LinearRegression model in sklearn
(https://scikit-learn.org/stable/). The neural network model is the MLPRegressor model in sklearn, and the test set
proportion is set to 0.2. To verify the influence of different features on the prediction results, the experiment
tested models using all four features ('Global_active_power', 'Global_reactive_power', 'Voltage',
'Global_intensity') and models using only one of these features.
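The comparison described above can be sketched as follows. This is an illustration on synthetic data, not the UCI dataset, and the hyperparameters (`hidden_layer_sizes`, `max_iter`, `random_state`) are assumptions, since the paper does not report the settings it used. Only the model classes and the 0.2 test proportion come from the text.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in data: a single feature with a nonlinear target.
X = rng.uniform(-2.0, 2.0, size=(1000, 1))
y = X[:, 0] ** 2 + rng.normal(0.0, 0.1, size=1000)

# Hold out 20% of the samples for testing, as in the experiment.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "LinearRegression": LinearRegression(),
    # Assumed hyperparameters, chosen only so that the sketch trains quickly.
    "MLPRegressor": MLPRegressor(
        hidden_layer_sizes=(32,), max_iter=2000, random_state=0
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # For regressors, score() returns the R² score on the given data.
    scores[name] = model.score(X_test, y_test)
```

To repeat the single-feature runs, the same loop would be applied to one column of the feature matrix at a time.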

Table 3. Prediction results of R² score.


Features               LinearRegression  MLPRegressor
ALL                    0.475             0.7041
Global_active_power    0.409             0.6679
Global_reactive_power  0.008             0.0342
Voltage                0.073             0.085
Global_intensity       0.394             0.6647

The R² scores of the predictions are shown in Table 3, where "ALL" indicates that all four features are used.
As can be seen from Table 3, the MLPRegressor model performs better than the LinearRegression model, and the
best result is obtained when all four features in the dataset are used. Among the single features,
Global_active_power predicts power consumption best, while Global_reactive_power performs worst.
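For reference, the R² score used as the evaluation metric is one minus the ratio of the residual sum of squares to the total sum of squares. The minimal sketch below implements that definition directly; the function name `r2` and the toy values are hypothetical.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# A perfect prediction scores 1; always predicting the mean scores 0.
perfect = r2([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
baseline = r2([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Read this way, a score of 0.008 means the model is barely better than predicting the mean, while a score of 0.7041 means that about 70% of the variance in the test targets is explained.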

3. Conclusion
This paper carries out an experiment on the individual household electric power consumption dataset and
establishes an extensible framework for it. The correlation between features obtained in the analysis module
of the framework is consistent with the prediction results: the Global_active_power feature is relatively
important to the results. In addition, the experimental results show that the neural network model predicts
better than the linear model.

Declaration of competing interest


The authors declare that they have no known competing financial interests or personal relationships that could
have appeared to influence the work reported in this paper.

References
[1] Shi H, Xu M, Li R. Deep learning for household load forecasting—A novel pooling deep RNN. IEEE Trans Smart Grid
2018;9(5):5271–80.
[2] Kim TY, Cho SB. Predicting the household power consumption using CNN-LSTM hybrid networks. In: Intelligent data engineering
and automated learning – IDEAL 2018. Lecture notes in computer science, vol. 11314, Cham: Springer; 2018.
[3] Loginov A, Heywood MI, Wilson G. Benchmarking a coevolutionary streaming classifier under the individual household electric power
consumption dataset. In: 2016 international joint conference on neural networks. 2016, p. 2834–41.
[4] Hajjaji I, Alami HE, El-Fenni MR, Dahmouni H. Evaluation of artificial intelligence algorithms for predicting power consumption in
university campus microgrid. In: 2021 international wireless communications and mobile computing. 2021, p. 2121–6.
[5] Pinto T, Praça I, Vale Z, Silva J. Ensemble learning for electricity consumption forecasting in office buildings.
Neurocomputing 2021;423:747–55.
[6] Lim C, Choi H. Deep learning-based analysis on monthly household consumption for different electricity contracts. In: 2020 IEEE
international conference on big data and smart computing. 2020, p. 545–7.
[7] Hyeon J, Lee H, Ko B, Choi H-J. Building energy consumption forecasting: Enhanced deep learning approach. In: 2nd international
workshop on big data analysis for smart energy. p. 22–5.
[8] Liang K, Liu F, Zhang Y. Household power consumption prediction method based on selective ensemble learning. IEEE Access
2020;8:95657–66.
[9] Pandiaraj K, Sivakumar P, Jeya Prakash K. Machine learning based effective linear regression model for TSV layer assignment in
3DIC. Microprocess Microsyst 2021;83.
[10] Barbhuiya S, Kilpatrick P, Nikolopoulos DS. Linear regression based DDoS attack detection. In: 13th international
conference on machine learning and computing. p. 568–74.
[11] Zekić-Sušac M, Knežević M, Scitovski R. Modeling the cost of energy in public sector buildings by linear regression and deep learning.
CEJOR Cent Eur J Oper Res 2021;29:307–22.
[12] Miri D, Khedher A, BenOthman K. Tracking of trajectory and fault estimation of MIABOT robot using an artificial neural network.
In: 2021 18th international multi-conference on systems, signals & devices. 2021, p. 1296–301.
[13] Abdallah_Qasaimeh Bashar Mohammad, Abdallah Ammar, Ratte Sylvie. Detecting depression in Alzheimer and MCI using artificial
neural networks (ANN). In: International conference on data science, E-learning and information systems. 2021, p. 250–3.
[14] El-Khalek AAA, Khalil AT, El-Soud MAA, Yasser I. Classification of galaxy images using computer vision and artificial neural network
techniques: A survey. In: Proceedings of the international conference on artificial intelligence and computer vision, vol. 1377, 2021.
