
2022 IEEE 23rd International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM) | 978-1-6654-0876-9/22/$31.00 ©2022 IEEE | DOI: 10.1109/WoWMoM54355.2022.00079
An Intelligent Machine Learning Approach for Smart Grid Theft Detection
Dhruv Garg
Computer Science & Engineering Department
Thapar Institute of Engineering & Technology
Patiala, India
Email: dhruvgarg01@gmail.com

Neeraj Kumar
Computer Science & Engineering Department
Thapar Institute of Engineering & Technology
Patiala, India
Email: neeraj.kumar@thapar.edu

Nazeeruddin Mohammad
Cybersecurity Center
Prince Mohammad Bin Fahd University
Khobar, Saudi Arabia
Email: nmohammad@pmu.edu.sa
Abstract—Smart grids are an improvement over traditional electric grids. They allow a much higher degree of automation and more efficient power distribution. Nonetheless, due to automation, these grids become more vulnerable to cyber attacks. Hence, cyber security becomes a major hurdle to overcome before we can permanently shift to smart grids. Electricity theft is one of the most dangerous cyber attacks in a smart grid. It allows users to lie about their load profiles and decrease their electricity bills. Several research studies have been conducted regarding the detection of such cyber attacks in a smart grid, but none of them consider weather information as a feature. This paper proposes a novel machine learning-based approach to smart grid electricity theft detection using both the load profile of a household and the weather features. The results show that our approach using both load and weather information performs much better than previous approaches that only use load information.

Index Terms—Electrical Theft, Machine Learning, False Data Injection, Smart Grid, Data Analytics, Cyber theft.

I. INTRODUCTION

Smart grid (SG) is a type of electric grid that was proposed to handle the increasing complexity and electricity demands of the 21st century. SG consists of several devices working together and automating our electrical grid to respond digitally to our quickly changing electric demand. Furthermore, SG provides efficient transmission of electricity and quick restoration of electricity in case of power disturbances. It allows better integration of renewable energy sources and also electric vehicles. Moreover, SG provides improved security from theft at certain levels as the grid is fully automated, so human intervention is minimal.

Advanced Metering Infrastructure (AMI) is the infrastructure used in SG. AMI makes two-way communication possible and is the backbone of SG. Furthermore, AMI enables the gathering and transfer of energy usage information in near real-time. This allows utilities to provide dynamic pricing services and demand response management, forecast load profiles of households, and perform better management of the electric grid. Consequently, such a vast communication infrastructure increases the attack surface, making the AMI more vulnerable to cyber attacks [1].

Due to such vulnerabilities, cyber security is an integral part of an SG. The main cyber security objectives are to maintain the availability, integrity, and confidentiality of data. Availability means ensuring that authorized personnel are never denied access to or control of the system. Integrity means preventing any modification of critical information such as consumer load profiles. Confidentiality means preventing unauthorized access to classified information by an adversary. The first line of defense in cyber security is preventing the attacker from accessing the system. However, if the attacker gains access to the system due to weak protocols, then the next line of defense is to detect the changes in our data so that our system can bounce back from any modifications made by the attacker [2].

An integrity attack occurs after the attacker gains access to our system and attempts to modify the data. Electricity theft is a type of integrity attack that can occur at both transmission and distribution levels. However, the probability of theft occurring at the transmission level is very low because the data is being transmitted at a very high rate; theft is much more likely at the distribution level. Distribution level theft usually occurs at the Home Area Network (HAN) level, where the consumer can perform a false data injection (FDI) attack and post their electricity consumption onto the victim's load profile. In effect, the attacker decreases their own electricity bill and increases the victim's electricity bill. According to an article by Smart Energy International, an estimated $96 billion is lost every year worldwide due to electricity theft [3]. This paper focuses on detecting these kinds of attacks.

One way to stop such attacks would be to stop the hacker from ever gaining access to the data. Several papers have attempted to use blockchain technology for this purpose [4]. There are several issues in implementing a blockchain such as
very large storage and processing capabilities are required. It is very difficult to scale and very slow, which is not good for smart meters since the sampling rate is very high. Finally, the data can still be modified on the HAN level before it is put onto a blockchain. So, overall this is not an efficient solution.

At the HAN level, IEEE 802.15.4 or ZigBee specifications are implemented for communication between home appliances and the home network's coordinator [5]. Several penetration tests have shown that these protocols can be easily hacked and the data can be modified by the attacker [5]. This can lead to very high losses if the attacker is not caught. This paper proposes an approach that can detect whether the load values of a user have been tampered with, using the load profiles of households and weather information.

A. Related work

The approaches taken for SG theft detection can be categorised into two different types: hardware based and data driven. Hardware based approaches concentrate on designing devices and infrastructures to detect electricity theft. Khoo and Cheng [6] proposed a system that uses radio-frequency identification (RFID) technology to manage inventory in an electricity supply company and detect electricity theft. Ngamchuen and Pirak [7] proposed a smart anti-tampering algorithm for a single-phase smart meter by using a single chip solution. Dineshkumar et al. [8] proposed a new system based on an ARM-Cortex M3 processor to protect the energy meter from tampering. There are several limitations to these hardware based approaches, such as the cost of the device, the sensitivity of the device to weather conditions and dust, and device maintenance such as changing batteries.

Since this paper proposes a data driven approach, we will mostly be focusing on those approaches. Earlier theft detection approaches use machine learning models like Support Vector Machines (SVM) and Decision Trees (DT). Jokar et al. [9] used a regular Irish smart grid dataset where they simulated theft cases using some novel FDI attack functions and used an SVM model for classification. Nagi et al. [10] also used an SVM classifier for theft detection, which improved upon their earlier work [11]. Esmalifalak et al. [12] used Principal Component Analysis for feature extraction and an SVM classifier for predicting theft cases. Cody et al. [13] proposed an approach to first predict the load consumption using DT and then calculate the difference between predicted and actual load to detect theft users. Jindal et al. [14] proposed a novel theft detection scheme capable of detecting theft at both transmission and distribution levels. They used a combination of DT and SVM classifiers for finding the theft cases. Also, Zanetti et al. [15] proposed an approach using clustering algorithms to find anomalies that would be considered as theft cases. However, all of these methods lack the accuracy needed for practical deployment.

Since the evolution of deep learning, it has also been used very widely for theft detection. Ismail et al. [16] used a data creation technique similar to [9] and proposed a deep artificial neural network (ANN) for theft detection that gave good results. Zheng et al. [17] proposed a Convolutional Neural Network (CNN), which outperformed logistic regression (LR), SVM, and random forest (RF) classifiers. Hasan et al. [18] used the SMOTE (Synthetic Minority Over-sampling Technique) class balancing technique and a CNN-LSTM based model to predict non-technical electricity theft. Their proposed model performed better than SVM and LR on the same dataset. Nabil et al. [19] used the same dataset creation technique as [9] and found that the Gated Recurrent Unit (GRU) produces the best results. Gul et al. [20] proposed using SMOTE Over Sampling Tomek Link (SOSTLink) and bi-GRU to detect electricity theft. Li et al. [21] used SMOTE to handle class imbalance and proposed a novel hybrid CNN-RF model for electricity theft detection which outperformed SVM, RF, Gradient Boosting Decision Trees (GBDT), LR, CNN, CNN-GBDT, and CNN-SVM models. Nevertheless, these approaches only used load information, and no weather data was used for identifying fraudulent users.

In more recent approaches, Punmiya and Choe [22] used a modified version of the FDIs that were proposed in [9] for dataset creation and obtained excellent results by using GBTD based classifiers such as XGBoost, CatBoost, and LightGBM for classifying theft cases. Avila et al. [23] proposed a framework using the maximal overlap discrete wavelet packet transform (MODWPT) for feature extraction from time-series data and the random undersampling boosting (RUSBoost) algorithm for theft detection. Also, Buzau et al. [24] proposed using an XGBoost classifier for detecting theft cases on a dataset that contained some geographical data along with the load profiles of users. The proposed model outperformed several machine learning models including SVM, K-Nearest Neighbors (KNN), and LR. Aldegheishem et al. [25] proposed two novel hybrid approaches, SALM and GAN-NETBoost. Both models outperformed several state-of-the-art techniques, such as SVM, Bi-GRU, and CNN-RF.

After reviewing the literature, our approach uses several machine learning models similar to those used by previous approaches: XGBoost, LightGBM, CatBoost, Random Forest, Decision Tree, Logistic Regression, KNN and ANN. Along with load information, we also take weather data into consideration for predicting electricity theft, which was not used by any of the approaches listed above. This paper also uses a modified version of the FDIs that were originally proposed by [9] to simulate theft attacks. We also use the SMOTE sampling technique to handle class imbalance, like [18].

B. Motivation

Electricity theft in an SG poses a very serious threat to society and also to the adoption of SGs worldwide. Consumers can easily bypass the weak security protocols applied in SG communication and manipulate their load values to decrease their electricity bills. This can be very hard to detect in a real-time environment since SGs have a very fast sampling
rate and a consumer can change their load values in many different ways. We cannot track all of these different methods, so there is a need to use machine learning based approaches that can automatically detect electricity theft in SG. Weather information also plays a major role in detecting these malicious cases, as the load consumption would be different in different weather conditions, such as extreme cold or heat. Further, the weather information cannot be tampered with since weather data can easily be verified. Considering all of these factors, this paper proposes a theft detection approach using machine learning techniques that makes predictions based on a consumer's daily load profile and the weather information of that particular day.

C. Contribution

The primary contribution of this paper is to present an intelligent machine learning approach for SG theft detection that is efficient and accurate in making predictions.

• The first contribution of the paper is the usage of weather data as a feature along with load information to detect theft in an SG.
• Several machine learning models have been tried for the proposed approach, and the proposed approach has been compared with several other intelligent SG theft detection approaches.

D. Organisation

The rest of the paper is organized as follows. Section II gives a brief description of the proposed scheme. The results and discussions are presented in Section III. The paper is finally concluded in Section IV.

II. PROPOSED SCHEME

TABLE I: False Data Injection Functions for Attack Simulation.

Attack Simulation Functions
$f_1(x) = x \cdot \mathrm{random}(0.1, 0.9)$
$f_2(x_t) = x_t \cdot \mathrm{random}(0.1, 1.0)$
$f_3(x_t) = x_t \cdot \mathrm{random}[0, 1]$
$f_4(x_t) = \mathrm{mean}(x)$
$f_5(x_t) = \mathrm{mean}(x) \cdot \mathrm{random}(0.1, 1.0)$
$f_6(x_t) = x_{z-t}$ ($z$ is total timesteps in one observation)

This section describes the working of our proposed scheme for SG theft detection. Data pre-processing has been divided into three steps. The first step is processing the smart meter data from the publicly available UK Power Networks dataset [26]. This dataset contains energy consumption readings for 5,567 London households measured at half-hour intervals between November 2011 and February 2014. Since this dataset is quite large, we use Apache Spark, a big data framework, for processing it. Data processing steps using Apache Spark include typecasting and handling null values: rows are dropped if the house id or date is missing, and a missing consumption value is substituted by the mean value. Further, the outliers are removed using the standard deviation method, and the consumption values are normalized between 0 and 1. Finally, data for 3000 houses is extracted so that it is small enough to be processed further without a big data framework.

The second step includes processing the weather data obtained using the darksky API and combining both datasets. The Pandas Python library is used for processing, as the pre-processed consumption dataset is small and the weather dataset has just one observation for each day between Nov 2011 and Feb 2014. The steps taken to pre-process this weather data include dropping unnecessary features, one-hot encoding categorical features, dropping rows with null values, and normalizing the continuous numerical values in this dataset. After the weather data is processed, the consumption dataset and the weather dataset are merged based on their date column, and any rows for which weather data is found null are dropped. So, at the end of the second pre-processing step, we have a dataset containing the load information for a specific house on a specific day (48 measurements taken at half-hour intervals, house id, and date) along with the weather information for that day (25 different weather-related features).

The third and final step of data pre-processing includes simulating theft by using several FDI functions on the combined dataset obtained after the second step of pre-processing, and also applying SMOTE to the generated imbalanced dataset. From our combined dataset, we extract 100,000 rows of data and apply the six theft functions shown in Table I to every observation, which gives 700,000 rows of data (the 100,000 original rows plus 600,000 tampered rows). This dataset has imbalanced classes since we have 100,000 non-fraudulent users and 600,000 fraudulent users. So, to balance the classes we use the SMOTE technique, which gives us a total of 1.2 million rows containing 600,000 non-fraudulent users and 600,000 fraudulent users. After the pre-processing stages are over, these datasets are divided into train, validation and test datasets with an 80%, 10%, and 10% split. Finally, the chosen machine learning models are trained, validated, and tested on these datasets.

Fig. 1: Different simulated theft cases. Panels (a) to (f) show Theft Cases 1 to 6; each panel plots the real and the tampered load (y-axis: Load) against time (x-axis: Time).

The data injection functions used for simulating theft cases are shown in Table I, where $x_t$ signifies the load consumed at timestep $t$, and each observation has 48 timesteps in total (taken at half hour intervals). These functions are inspired by [22], but slightly modified to make them more realistic. These functions modify the consumption data only; no changes are made to weather data since that can be cross-checked by the authorities directly. For theft case 1, we multiply all the values by one randomly generated number between 0.1 and 0.9. In theft case 2, we slightly modify theft case 1 and multiply each value by a different randomly generated number between 0.1 and 1.0. In theft case 3, the consumer either sends the real usage or sends zero usage for each timestep. Theft cases 4 and 5
are generated using the mean value of the consumption; theft case 5 is made more realistic by multiplying the mean by a different randomly generated number between 0.1 and 1.0 at each timestep. Theft case 6 is when the consumer reverses the order of their usage readings for that particular day. All of these simulated attacks are shown in Fig. 1.
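The six attack functions of Table I can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not the authors' implementation: the function names `f1` to `f6` mirror the table, while `simulate_theft`, the seed, and the toy load profile are hypothetical. It assumes that random[0, 1] in $f_3$ means a choice between 0 and 1 (matching the description "real usage or zero usage") and that $f_6$ is a plain reversal of the day's readings.

```python
import random
from statistics import mean


def f1(x, rng):
    # Theft case 1: one random factor in [0.1, 0.9] for the whole day.
    alpha = rng.uniform(0.1, 0.9)
    return [v * alpha for v in x]


def f2(x, rng):
    # Theft case 2: a different random factor in [0.1, 1.0] per timestep.
    return [v * rng.uniform(0.1, 1.0) for v in x]


def f3(x, rng):
    # Theft case 3: report either the real reading or zero (assumed 0/1 choice).
    return [v * rng.choice((0, 1)) for v in x]


def f4(x, rng):
    # Theft case 4: report the daily mean at every timestep.
    m = mean(x)
    return [m for _ in x]


def f5(x, rng):
    # Theft case 5: daily mean times a fresh random factor per timestep.
    m = mean(x)
    return [m * rng.uniform(0.1, 1.0) for _ in x]


def f6(x, rng):
    # Theft case 6: reverse the order of the day's readings (assumed reading of x_{z-t}).
    return list(reversed(x))


THEFT_FUNCTIONS = (f1, f2, f3, f4, f5, f6)


def simulate_theft(x, seed=0):
    # Hypothetical helper: apply all six functions to one honest daily profile.
    rng = random.Random(seed)
    return [f(x, rng) for f in THEFT_FUNCTIONS]


# Toy normalized profile with 48 half-hourly readings (made-up values).
honest = [0.1 + 0.01 * t for t in range(48)]
tampered = simulate_theft(honest)
rows = [honest] + tampered  # 1 honest row + 6 theft rows per observation
```

Applying the six functions to every honest observation yields seven rows per original row, which is how 100,000 observations become 700,000.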
Fig. 3: Electricity Consumption on different days. (Bar chart; x-axis: Days, Mon to Sun; y-axis: Electricity Consumption.)
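The class balancing and splitting described in Section II can be illustrated with a scaled-down sketch. The paper does not state which SMOTE implementation was used, so `smote_like` below is a simplified stand-in written with the standard library only: it interpolates between a minority sample and one of its `k` nearest minority neighbours, which is the core idea of SMOTE. The names `smote_like` and `split_80_10_10`, the value `k=5`, the 2-D toy features, and the counts (100 and 600 instead of 100,000 and 600,000) are all assumptions made for illustration.

```python
import random


def smote_like(minority, n_new, k=5, rng=None):
    # Simplified SMOTE-style oversampling: each synthetic point lies on the
    # segment between a random minority sample and one of its k nearest
    # minority neighbours.
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def split_80_10_10(rows, rng):
    # Shuffle a copy, then cut into 80% train, 10% validation, 10% test.
    rows = rows[:]
    rng.shuffle(rows)
    n_train = len(rows) * 8 // 10
    n_valid = len(rows) // 10
    return rows[:n_train], rows[n_train:n_train + n_valid], rows[n_train + n_valid:]


rng = random.Random(42)
# Toy stand-ins: 100 honest (minority) and 600 theft (majority) feature vectors.
honest = [[rng.random(), rng.random()] for _ in range(100)]
theft = [[rng.random(), rng.random()] for _ in range(600)]

# Oversample the honest class until both classes have the same size.
synthetic = smote_like(honest, n_new=len(theft) - len(honest), k=5, rng=rng)
balanced = [(p, 0) for p in honest + synthetic] + [(p, 1) for p in theft]

train, valid, test = split_80_10_10(balanced, rng)
```

At the paper's scale the same arithmetic gives 600,000 + 600,000 = 1.2 million rows, split into 960,000 training, 120,000 validation and 120,000 test rows.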
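The cleaning rules of the first pre-processing step in Section II (drop rows with a missing house id or date, fill a missing consumption value with the mean, remove outliers by standard deviation, scale to the range 0 to 1) can be sketched as follows. The paper performs this step with Apache Spark; the sketch below uses plain Python only to show the logic. The field names `house_id`, `date` and `kwh`, the helper names, and the cut-off of three standard deviations are assumptions, since the paper does not give them.

```python
from statistics import mean, pstdev


def clean_rows(rows):
    # Drop rows without a house id or date; fill missing consumption with the mean.
    kept = [r for r in rows if r["house_id"] is not None and r["date"] is not None]
    known = [r["kwh"] for r in kept if r["kwh"] is not None]
    fill = mean(known)  # assumes at least one reading is present
    return [dict(r, kwh=fill if r["kwh"] is None else r["kwh"]) for r in kept]


def drop_outliers(values, k=3.0):
    # Keep values within k standard deviations of the mean (k = 3 is an assumption).
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) <= k * sigma]


def min_max(values):
    # Scale values linearly so the minimum maps to 0 and the maximum to 1.
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up records for illustration.
raw = [
    {"house_id": "H1", "date": "2013-01-01", "kwh": 0.2},
    {"house_id": "H1", "date": "2013-01-02", "kwh": None},
    {"house_id": None, "date": "2013-01-02", "kwh": 0.9},
    {"house_id": "H2", "date": "2013-01-01", "kwh": 0.4},
]
cleaned = clean_rows(raw)

readings = [1.0] * 20 + [100.0]      # one obvious spike
filtered = drop_outliers(readings)   # the spike is removed
scaled = min_max([2.0, 4.0, 6.0])
```

Whether normalization is applied per household or over the whole dataset is not stated in the paper; the helper simply scales whatever list it is given.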
Fig. 2: Sample consumption data of five different houses on one day. (Five stacked line plots; x-axis: Timesteps.)

III. RESULTS

This section presents the results of all the experiments conducted and compares different classification models over six evaluation metrics.

A. Dataset Description and Analysis

The dataset used for experiments is taken from the publicly available Low Carbon London project led by UK Power Networks [26]. This dataset contains energy consumption readings for a sample of 5,567 London households measured between November 2011 and February 2014. The readings were taken on a half hourly basis, and the actual dataset contains energy consumption readings in Kilowatts, along with a unique household identifier, date-time and CACI Acorn group. Daily weather data for these households was extracted using the darksky API (Application Programming Interface) and combined with the consumption dataset. The weather features include temperatureMax, windBearing, dewPoint, cloudCover, windSpeed, pressure, visibility, humidity, uvIndex, moonPhase, weather icon and some more temperature related information.

A single observation in our dataset will include all the 48
TABLE II: Comparison of several performance metrics obtained for different models on the test set.

Model                 Accuracy  ROC AUC  F1 score  Precision  Recall  FPR

WITHOUT SMOTE
XGBoost               0.984     0.963    0.991     0.988      0.993   0.066
LightGBM              0.969     0.927    0.982     0.978      0.986   0.131
CatBoost              0.927     0.785    0.958     0.934      0.983   0.413
Random Forest         0.910     0.737    0.949     0.920      0.979   0.504
Decision Tree         0.864     0.731    0.920     0.923      0.916   0.453
Logistic Regression   0.866     0.566    0.926     0.873      0.986   0.853
KNN                   0.848     0.555    0.916     0.871      0.966   0.854
ANN                   0.909     0.800    0.947     0.941      0.953   0.352

WITH SMOTE
XGBoost               0.950     0.950    0.949     0.969      0.929   0.029
LightGBM              0.913     0.913    0.912     0.927      0.897   0.070
CatBoost              0.896     0.896    0.895     0.903      0.887   0.094
Random Forest         0.929     0.930    0.928     0.949      0.908   0.048
Decision Tree         0.856     0.856    0.854     0.867      0.842   0.129
Logistic Regression   0.691     0.691    0.708     0.672      0.748   0.365
KNN                   0.828     0.829    0.800     0.964      0.683   0.025
ANN                   0.908     0.908    0.905     0.940      0.872   0.055
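The threshold-based metrics in Table II follow from the four confusion-matrix counts. The sketch below is illustrative only: it assumes that theft is the positive class, and the counts are hypothetical, chosen to match a 10% test split of the 700,000-row imbalanced dataset under the assumption that the split preserves the 6:1 class ratio. ROC AUC is omitted because it needs predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    # Standard definitions; theft is assumed to be the positive class.
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fpr,
    }


# Hypothetical classifier that flags every row as theft on an imbalanced
# test set of 60,000 theft rows and 10,000 honest rows.
always_theft = classification_metrics(tp=60_000, fp=10_000, tn=0, fn=0)
```

Such a degenerate classifier would score about 0.857 accuracy and about 0.923 F1 while having an FPR of 1.0. This is qualitatively similar to the Logistic Regression and KNN rows without SMOTE (accuracy around 0.85 to 0.87 with FPR around 0.85), and it shows why FPR is reported alongside accuracy on the imbalanced dataset.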
Fig. 4: Electricity Consumption in different months. (Bar chart; x-axis: Months, Jan to Dec; y-axis: Electricity Consumption.)

Fig. 5: Correlation Matrix of all the features. (Heatmap over the 48 half-hourly load features hh_0 to hh_47, the weather features temperatureMax, windBearing, dewPoint, cloudCover, windSpeed, pressure, apparentTemperatureHigh, visibility, humidity, apparentTemperatureLow, apparentTemperatureMax, uvIndex, temperatureLow, temperatureMin, temperatureHigh, apparentTemperatureMin and moonPhase, and eight columns labelled 0 to 7; colour scale from −1.00 to 1.00.)
energy consumption readings taken at half hourly intervals for one specific house on one specific day, the unique house identifier, the date and all the weather information for that particular day. Fig. 2 visualizes the energy consumption for five different houses on one day taken from the dataset. In total, 100,000 rows from this dataset are used, which became 700,000 rows of data after all six different FDIs were applied and finally turned into 1.2 million rows after applying the SMOTE sampling technique. Furthermore, the sum of energy consumption on different days and months for the year 2013 by all the houses in our extracted dataset can be seen in Fig. 3 and Fig. 4. Fig. 5 shows the correlation between all the features in our dataset
Fig. 6: Evaluation metrics on different models. Panels: (a) Accuracy on dataset without SMOTE; (b) FPR on dataset without SMOTE; (c) Accuracy on dataset with SMOTE; (d) FPR on dataset with SMOTE. (Bar charts; x-axis: Models (XGB, LGB, CatBoost, Random Forest, Decision Tree, Logistic Regression, KNN, ANN); y-axis: Accuracy or False Positive Rate.)
including consumption and weather data. The FDI functions used for synthetic attack simulations, as stated in Table I, are visualized in Fig. 1.

B. Experimental Results

Two separate tests were done to validate the proposed scheme. The first test was done on an imbalanced dataset which contained 700,000 rows in total. Out of these, 100,000 rows are legitimate consumer data and the remaining 600,000 rows are simulated theft data. The second dataset is balanced and contains 1.2 million rows, 600,000 observations for each class. This dataset was generated by using the SMOTE sampling technique on the first dataset. The datasets are then divided into train, validation and test datasets with an 80%, 10% and 10% split. XGBoost, LightGBM, CatBoost, Random Forest, Decision Tree, Logistic Regression, KNN and ANN were the eight models implemented on both datasets, and six evaluation metrics were picked to compare the performance of all these models, as shown in Table II. All these tests were conducted on Google Colab, which has an Nvidia K80 with 2496 CUDA cores operating at 4.1 TFLOPS with 12 GB of primary memory.

To evaluate the performance of our classification models we have chosen accuracy, ROC AUC, F1 score, precision, recall and false positive rate (FPR) as the evaluation metrics. Accuracy is the most basic evaluation metric for classification models; it is the number of observations correctly predicted divided by the total number of observations. The ROC curve is a graph which plots True Positive Rate against FPR and shows the performance of a classification model at all classification thresholds, and the area under this curve (AUC) tells us how well the model is performing. Precision is a good metric when the cost of False Positives is high, and Recall is a good metric when the cost of False Negatives is high. F1 score is the harmonic mean of precision and recall, so it provides a balance between those two metrics. Finally, FPR is also used since it is very important to minimize the number of False Positives in theft detection if the model is to be implemented practically.

As shown in Table II and Fig. 6, XGBoost has the best performance in terms of accuracy, ROC AUC, F1 score, precision and recall for both datasets. In terms of FPR, XGBoost had the best performance on the 700,000 row imbalanced dataset, but it had the second best performance on the balanced dataset, slightly higher than KNN. Other than XGBoost, LightGBM,
TABLE III: Comparative analysis of the proposed scheme with other approaches in the literature.

Reference               Technique                           Accuracy  FPR
Proposed Approach       XGBoost (with and without SMOTE)    98.40%    2.90%
Jindal et al. [14]      DT coupled SVM                      92.50%    5.12%
Hasan et al. [18]       CNN-LSTM                            89.14%    NA
Jokar et al. [9]        CPBETD                              94.00%    11.00%
Nabil et al. [19]       GRU                                 94.20%    4.70%
Punmiya and Choe [22]   GBTD                                94-97%    5-7%
CatBoost and Random Forest also performed very well with high accuracy, ROC AUC, F1 score, precision, recall and low FPR, and Random Forest actually managed to attain a better overall performance on the second dataset. ROC AUC for almost all the models increased on the second dataset, except for XGBoost and LightGBM, which already had very high values of ROC AUC on the first dataset. Another important observation is that most evaluation metrics decreased on the second dataset (F1 score and recall dropped for every model), which is to be expected since the dataset is also larger, but FPR also decreased significantly for every model, which is very important in practical scenarios.

The results of our proposed approach have also been compared to several other existing approaches, as shown in Table III. Our approach reaches the highest accuracy (98.40%, without SMOTE) and the lowest FPR (2.90%, with SMOTE) out of all these approaches. We attribute this to the fact that our approach takes both load and weather information into consideration, while the rest of these approaches only use load information to detect theft.

IV. CONCLUSION

Electricity theft leads to major losses in the power sector and disrupts the functioning of SGs. These thefts need to be detected accurately and efficiently so the malicious users can be stopped in time. Previous theft detection approaches used only load information and tried to find inconsistencies in the load consumption sequence to detect theft. But the inconsistencies cannot be detected properly with only the load data. This paper proposed a novel SG theft detection approach using both load and weather information to detect theft. Weather data is very useful in this scenario since it is directly related to daily household load usage. Also, it can be easily verified by the authorities, preventing any modifications to this data. Several machine learning models were trained and tested on the generated dataset. Among all the machine learning models, the XGBoost model gave the best results in terms of most evaluation metrics. Our proposed approach has also been compared with several previous works, and the results show that its performance is better than previous methods in terms of accuracy and FPR, the two most important metrics in this case. Further work can be done to improve this approach by using different machine learning and deep learning models. Also, larger datasets and more diverse FDI functions need to be tested.

REFERENCES

[1] S. Sridhar, A. Hahn, and M. Govindarasu, “Cyber attack-resilient control for smart grid,” in 2012 IEEE PES Innovative Smart Grid Technologies (ISGT), 2012, pp. 1–3.
[2] S. Hussain, M. Meraj, M. Abughalwa, and A. Shikfa, “Smart grid cybersecurity: Standards and technical countermeasures,” in 2018 International Conference on Computer and Applications (ICCA), 2018, pp. 136–140.
[3] “Combatting fraud and theft in the smart grid,” Smart Energy International, 2020, accessed: Jul. 2021. [Online]. Available: https://www.smart-energy.com/industry-sectors/smart-grid/combatting-fraud-and-theft-in-the-smart-grid/
[4] A. Aderibole, A. Aljarwan, M. H. Ur Rehman, H. H. Zeineldin, T. Mezher, K. Salah, E. Damiani, and D. Svetinovic, “Blockchain technology for smart grids: Decentralized NIST conceptual model,” IEEE Access, vol. 8, pp. 43177–43190, 2020.
[5] M. M. Fouda, Z. M. Fadlullah, and N. Kato, “Assessing attack threat against ZigBee-based home area network for smart grid communications,” in The 2010 International Conference on Computer Engineering Systems, 2010, pp. 245–250.
[6] B. Khoo and Y. Cheng, “Using RFID for anti-theft in a Chinese electrical supply company: A cost-benefit analysis,” in 2011 Wireless Telecommunications Symposium (WTS), 2011, pp. 1–6.
[7] S. Ngamchuen and C. Pirak, “Smart anti-tampering algorithm design for single phase smart meter applied to AMI systems,” in 2013 10th International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, 2013, pp. 1–6.
[8] K. Dineshkumar, P. Ramanathan, and S. Ramasamy, “Development of ARM processor based electricity theft control system using GSM network,” in 2015 International Conference on Circuits, Power and Computing Technologies [ICCPCT-2015], 2015, pp. 1–6.
[9] P. Jokar, N. Arianpoo, and V. C. M. Leung, “Electricity theft detection in AMI using customers’ consumption patterns,” IEEE Transactions on Smart Grid, vol. 7, no. 1, pp. 216–226, 2016.
[10] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and F. Nagi, “Improving SVM-based nontechnical loss detection in power utility using the fuzzy inference system,” IEEE Transactions on Power Delivery, vol. 26, no. 2, pp. 1284–1285, 2011.
[11] J. Nagi, K. S. Yap, S. K. Tiong, S. K. Ahmed, and M. Mohamad, “Nontechnical loss detection for metered customers in power utility using support vector machines,” IEEE Transactions on Power Delivery, vol. 25, no. 2, pp. 1162–1171, 2010.
[12] M. Esmalifalak, L. Liu, N. Nguyen, R. Zheng, and Z. Han, “Detecting stealthy false data injection using machine learning in smart grid,” IEEE Systems Journal, vol. 11, no. 3, pp. 1644–1652, 2017.
[13] C. Cody, V. Ford, and A. Siraj, “Decision tree learning for fraud detection in consumer energy consumption,” in 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), 2015, pp. 1175–1179.
[14] A. Jindal, A. Dua, K. Kaur, M. Singh, N. Kumar, and S. Mishra, “Decision tree and SVM-based data analytics for theft detection in smart grid,” IEEE Transactions on Industrial Informatics, vol. 12, no. 3, pp. 1005–1016, 2016.
[15] M. Zanetti, E. Jamhour, M. Pellenz, M. Penna, V. Zambenedetti, and I. Chueiri, “A tunable fraud detection system for advanced metering infrastructure using short-lived patterns,” IEEE Transactions on Smart Grid, vol. 10, no. 1, pp. 830–840, 2019.
[16] M. Ismail, M. Shahin, M. F. Shaaban, E. Serpedin, and K. Qaraqe, “Efficient detection of electricity theft cyber attacks in AMI networks,” in 2018 IEEE Wireless Communications and Networking Conference (WCNC), 2018, pp. 1–6.
[20] H. Gul, N. Javaid, I. Ullah, A. M. Qamar, M. K. Afzal, and G. P. Joshi, “Detection of non-technical losses using SOSTLink and bidirectional gated recurrent unit to secure smart meters,” Applied Sciences, vol. 10, no. 9, 2020. [Online]. Available: https://www.mdpi.com/2076-3417/10/9/3151
[21] S. Li, Y. Han, X. Yao, S. Yingchen, J. Wang, and Q. Zhao, “Electricity theft detection in power grids with deep learning and random forests,” Journal of Electrical and Computer Engineering, vol. 2019, p. 4136874, Oct 2019. [Online]. Available: https://doi.org/10.1155/2019/4136874
[22] R. Punmiya and S. Choe, “Energy theft detection using gradient boosting theft detector with feature engineering-based preprocessing,” IEEE Transactions on Smart Grid, vol. 10, no. 2, pp. 2326–2329, 2019.
[23] N. F. Avila, G. Figueroa, and C.-C. Chu, “NTL detection in electric distribution systems using the maximal overlap discrete wavelet-packet transform and random undersampling boosting,” IEEE Transactions on Power Systems, vol. 33, no. 6, pp. 7171–7180, 2018.
[24] M. M. Buzau, J. Tejedor-Aguilera, P. Cruz-Romero, and A. Gómez-Expósito, “Detection of non-technical losses using smart meter data and supervised learning,” IEEE Transactions on Smart Grid, vol. 10, no. 3, pp. 2661–2670, 2019.
[25] A. Aldegheishem, M. Anwar, N. Javaid, N. Alrajeh, M. Shafiq, and H. Ahmed, “Towards sustainable energy efficiency with intelligent electricity theft detection in smart grids emphasising enhanced neural networks,” IEEE Access, vol. 9, pp. 25036–25061, 2021.
[26] “Smart meter energy consumption data in London households,” UK Power Networks, 2015, available: https://data.london.gov.uk/dataset/smartmeter-energy-use-data-in-london-households.
[17] Z. Zheng, Y. Yang, X. Niu, H.-N. Dai, and Y. Zhou,
“Wide and deep convolutional neural networks for
electricity-theft detection to secure smart grids,” IEEE
Transactions on Industrial Informatics, vol. 14, no. 4,
pp. 1606–1615, 2018.
[18] M. N. Hasan, R. N. Toma, A.-A. Nahid, M. M. M.
Islam, and J.-M. Kim, “Electricity theft detection
in smart grid systems: A cnn-lstm based approach,”
Energies, vol. 12, no. 17, 2019. [Online]. Available:
https://www.mdpi.com/1996-1073/12/17/3310
[19] M. Nabil, M. Mahmoud, M. Ismail, and E. Serpedin,
“Deep recurrent electricity theft detection in ami net-
works with evolutionary hyper-parameter tuning,” in
2019 International Conference on Internet of Things
(iThings) and IEEE Green Computing and Communica-
tions (GreenCom) and IEEE Cyber, Physical and Social
Computing (CPSCom) and IEEE Smart Data (Smart-
Data), 2019, pp. 1002–1008.

[Figure: six panels, (a) Theft Case 1, (b) Theft Case 2, (c) Theft Case 3, (d) Theft Case 4, (e) Theft Case 5, (f) Theft Case 6; legend label "Real".]

each timestep. Theft case 6 is when the consumer reverses his

Model                Accuracy  ROC AUC  F1 score  Precision  Recall  FPR
XGBoost              0.984     0.963    0.991     0.988      0.993   0.066
LightGBM             0.969     0.927    0.982     0.978      0.986   0.131
CatBoost             0.927     0.785    0.958     0.934      0.983   0.413
Random Forest        0.910     0.737    0.949     0.920      0.979   0.504
Decision Tree        0.864     0.731    0.920     0.923      0.916   0.453
Logistic Regression  0.866     0.566    0.926     0.873      0.986   0.853
KNN                  0.848     0.555    0.916     0.871      0.966   0.854
ANN                  0.909     0.800    0.947     0.941      0.953   0.352

Model                Accuracy  ROC AUC  F1 score  Precision  Recall  FPR
XGBoost              0.950     0.950    0.949     0.969      0.929   0.029
LightGBM             0.913     0.913    0.912     0.927      0.897   0.070
CatBoost             0.896     0.896    0.895     0.903      0.887   0.094
Random Forest        0.929     0.930    0.928     0.949      0.908   0.048
Decision Tree        0.856     0.856    0.854     0.867      0.842   0.129
Logistic Regression  0.691     0.691    0.708     0.672      0.748   0.365
KNN                  0.828     0.829    0.800     0.964      0.683   0.025
ANN                  0.908     0.908    0.905     0.940      0.872   0.055
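
The metric columns reported above (accuracy, F1 score, precision, recall, FPR) follow the standard confusion-matrix definitions. The following is a minimal sketch of those definitions, assuming that theft is treated as the positive class (this section does not state which class is positive). The function name `classification_metrics` and the counts in the usage example are hypothetical and are not taken from the paper's experiments. ROC AUC is omitted because it depends on the classifier's scores rather than on a single confusion matrix.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts.

    tp: positive samples predicted positive
    fp: negative samples predicted positive
    tn: negative samples predicted negative
    fn: positive samples predicted negative
    """

    def ratio(numerator: float, denominator: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": ratio(2 * precision * recall, precision + recall),
        # FPR is the share of negative samples wrongly flagged as positive.
        "fpr": ratio(fp, fp + tn),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    for name, value in classification_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because FPR is computed only over the negative class, a model can reach high accuracy and recall on an imbalanced dataset while still showing a large FPR, which is why the two metrics are reported side by side.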

[Figures: two plots, each with x-axis "False Positive Rate".]

Reference              Technique                         Accuracy  FPR
Proposed Approach      XGBoost (with and without SMOTE)  98.40%    2.90%
Jindal et al. [14]     DT coupled SVM                    92.50%    5.12%
Hasan et al. [18]      CNN-LSTM                          89.14%    NA
Jokar et al. [9]       CPBETD                            94.00%    11.00%
Nabil et al. [19]      GRU                               94.20%    4.70%
Punmiya and Choe [22]  GBTD                              94–97%    5–7%
