
IET Generation, Transmission & Distribution

Research Article

Power theft localisation using voltage measurements from distribution feeder nodes

ISSN 1751-8687
Received on 15th December 2016
Revised 10th March 2017
Accepted on 28th March 2017
E-First on 14th July 2017
doi: 10.1049/iet-gtd.2016.2011
www.ietdl.org

Kalyan Dasgupta1, Manikandan Padmanaban1, Jagabondhu Hazra1

1 IBM Research India, Bangalore, India
E-mail: kalyand1@in.ibm.com

Abstract: In this study, a novel algorithm to locate regions in a distribution feeder, where power is being illegally tapped, is
proposed. The basic requirements of the algorithm are voltage measuring devices located at distribution feeder nodes or
transformers that can communicate data to the distribution substation. Initially, how voltage magnitude difference between
successive nodes in a feeder can help identify possible locations of illegal tapping is shown. The technique is further refined by
a normalised voltage double difference method, to pin-point the exact location of power theft. The algorithm does not require
network parameters. Simulations are performed on the IEEE 34 node test feeder, to demonstrate the efficacy of this method.

Nomenclature

\(\bar{V}_k\), \(|\bar{V}_k|\)    voltage phasor at node k, magnitude of voltage phasor at node k
\(\bar{I}_k\), \(\bar{I}_{th}\)    current phasor at node k, current phasor representing theft
\(Z_{kj}\)    impedance of the feeder line joining nodes k and j
Re(·), Im(·)    real and imaginary parts of a phasor
\(E_{substn}\)    energy reading in a distribution substation
\(\Sigma E_{nodes}\)    sum of the energy consumed at all the feeder nodes
δ    angle between voltage phasors of successive nodes
θ    angle between voltage and lagging current phasors at a feeder node
Var(·), MD(·)    variance function, Mahalanobis distance function
\(\sigma_{k-1,k}\)    standard deviation in \((|\bar{V}_{k-1}| - |\bar{V}_k|)\) over a set of measurements
C, μ    sample covariance matrix, sample mean vector
\(L_p\), \(L_{np}\)    peak period load, non-peak period load in kW

1 Introduction

Power/energy theft, also known as non-technical losses (NTLs), is a huge problem in distribution systems all over the world. The world loses US$89.3 billion annually to electricity theft and, especially in India, NTL can be of the order of 20–30%. India loses around US$16.2 billion annually to NTL [1]. Identification of power theft and, consequently, its localisation are crucial to stopping this menace. Many countries have now started putting in place advanced metering infrastructure (AMI) to measure, collect, store, analyse and use power consumption information [2]. AMI essentially involves having smart meters and their associated communication systems in large numbers in the distribution system. In emerging economies like India, most of the metering is through legacy meters and installation of AMI may take some time. While digital meters have been installed in distribution feeder nodes, they lack the ability to communicate data to a distribution substation.

Several papers and reports on using smart meter or AMI data to detect NTL can be found in the public domain. In [3], techniques using AMI data to detect power theft are discussed. It lists several AMI energy-theft detection schemes, relying on classification-based, state estimation-based, and game theory-based algorithms, and finally compares all these methods. Erol-Kantarci and Mouftah [4] discuss how data extracted from smart meters and data collectors can provide evidence for legal proceedings in electricity theft matters. In [5], the authors discuss existing smart metering practices and the prevailing situation in the Netherlands. They also present a novel automated method to detect tampering and theft. In [6], the authors propose a method to locate illegal tapping by using smart meter data. In this method, the network model/parameters of the distribution system and mainly the voltage measurement data from the smart meters are used to locate theft points. The voltage measurements from smart meters are compared with the estimated voltages of the network, and a probable theft current is calculated. This is one paper in which voltage measurements are used to detect theft. However, this method requires knowledge of the network parameters (which may have a lot of inaccuracies) and also requires estimation of an impedance matrix of illegal (theft) loads.

There are several other papers on theft detection using smart meters. The primary difficulty all methods/algorithms face is the lack of network topology and parameter data beyond the distribution transformer (DT). In [7], a method of detecting theft by estimating a statistical model of technical losses and NTL is discussed. It does this by assuming that every user is connected directly to its local DT and thereby calculating effective resistances between the DTs and the customer premises. If the NTL detected goes beyond a statistical threshold, the utility can be warned of possible illegal loads. A method of identifying energy theft by using pattern recognition techniques is proposed in [8]. It uses the energy meter readings of the smart meter and the energy reading of a central observer for several time instants. A state-estimation-based approach for energy theft detection in microgrids that preserves privacy is proposed in [9]. This paper mainly deals with theft involving tampering of smart meter data. In [10], the authors propose a method based on dynamic programming to find the optimal number of feeder remote terminal units that need to be deployed for detection. One can also find papers using game-theoretic approaches to detect power theft [11, 12]. Game-theory-based approaches focus on detecting the amount of power stolen by rigging or tampering of the AMI meters at the customer level, by formulating the problem as a game between the electric utility and the electricity thief. They do not use network parameters to detect theft. However, these methods do not detect illegal hooking/tapping into the distribution lines.

In the contemporary literature, all the works have focused on detecting and localising power theft based on the data from smart meters and AMI at the customer and at the distribution feeder level. This may not be feasible for countries like India,

IET Gener. Transm. Distrib., 2017, Vol. 11 Iss. 11, pp. 2831-2839 2831
© The Institution of Engineering and Technology 2017
Fig. 1  Schematic and phasor diagram of a typical distribution feeder
(a) Schematic diagram of the distribution feeder, (b) Phasor diagram for two successive nodes

where cost-effective solutions are the need of the day. A solution involving a low-bandwidth communication system integrated with an existing metering infrastructure at the DT level could be more acceptable. In this paper, a method is proposed by which a utility can detect and locate probable points in a feeder (DT level) where electricity theft is happening, by using only voltage measurements. This requires digital meters that can record voltage measurements at the DT level and can calculate and communicate differences in voltage magnitude measurements between successive nodes to the substation. While the digital meter can record many measurements (power level, units consumed, power factor etc.) apart from voltage measurements, it needs to communicate only the voltage information. This requires only a low-bandwidth communication system. For example, for a hundred-node three-phase system with 1 min granularity, a 1 kbps data rate would be more than sufficient to communicate voltage difference information to the substation. The choice of voltage measurements has its inherent advantages, since voltage captures the state of the system well compared with other derived measurements.

In many developing countries, the bulk of the meters at the consumer premises, downstream of the DT, are legacy meters. Given the granularity of the legacy meters, which is of the order of days to a month, it will be difficult to locate power theft. While the granularity of measurement data in smart meters is in minutes, in the legacy meters it is in days. Detecting and locating theft at the consumer level, in areas where we have a mix of legacy and smart meters, is a very difficult proposition. On the other hand, if we have the voltage measurements at the DT level with high granularity, we could use the proposed algorithm to at least detect and locate points of electricity theft at the feeder. Fig. 1a gives the schematic diagram of the distribution feeder. Once theft is detected, an inspection of all the connections below the suspected DT could lead to identification of theft points at the consumer level.

The proposed method mainly uses voltage measurements taken from meters located at all the important nodes of the feeder. The nodes could be either DTs or industrial consumer premises. The granularity of the measurements is expected to be in minutes. The algorithm we propose uses the voltage magnitude difference between successive nodes. There are two variations of the method. The first one uses a simple voltage measurement difference between successive nodes. The second variation uses a normalised voltage double difference (NVDD) technique. In both of these methods, essentially, outliers are detected using a statistical approach. The biggest advantage of the proposed method is that it uses only voltage measurements and does not require network parameters.

The paper is organised as follows. In Section 2, the formulation of the methods is presented. The outlier detection technique used in the simulations is also presented. In Section 3, some of the implementation issues and how data are to be processed before finally implementing the algorithm are discussed. In Section 4, some of the simulation results are presented. Finally, Section 5 gives the important conclusions.

2 Formulation

Let us have a look at the voltage difference between successive nodes in a distribution feeder. Fig. 1b gives the phasor diagram of the voltage phasors for two successive nodes. \(\bar{V}_2\) is the voltage at the node upstream of the node with voltage \(\bar{V}_1\). \(\bar{I}\) is the current flowing between the nodes. Re and Im shown in Fig. 1b refer to the real and imaginary parts of a phasor, respectively. The voltage drop, \(V_d\), in the line is defined as the difference in the magnitudes of the voltages at nodes 2 and 1. It is given as follows:

\[ V_d = |\bar{V}_2| - |\bar{V}_1| \]

The bar indicates that the quantities are phasors. In distribution feeders, the ratio of resistance to reactance (R/X ratio) is higher than in transmission lines, and successive nodes/buses are also not very distant. This leads to very low phase angle differences in the voltages of successive nodes, denoted by δ in Fig. 1b. Since the angle δ between phasors \(\bar{V}_2\) and \(\bar{V}_1\) is very small, \(V_d\) is approximated as

\[ V_d \simeq \mathrm{Re}(Z\bar{I}) \tag{1} \]

In other words, the voltage drop is approximately equal to the real part of the impedance drop [13]. As the current in the line increases, so does the voltage drop.

2.1 Voltage difference as an indicator

Detecting NTL/electricity theft is easier than locating it. NTL is the difference between the energy reading in a distribution substation (\(E_{substn}\)) and the sum of the energy consumed at all the nodes (\(\Sigma E_{nodes}\)) plus the technical losses (\(I^2R\) losses) in the network, as given in the following equation:

\[ \mathrm{NTL} = E_{substn} - \Sigma E_{nodes} - I^2R\ \text{losses} \tag{2} \]

Power theft points can be located when network parameters and voltage measurements are available by (i) first estimating the voltages that are expected at the customer premises and then (ii) comparing the estimates with meter readings. A significant difference between the measured and estimated values indicates the prevalence of theft [6]. The problems these methods face are:

i. the network parameters for a distribution system as provided by the utility companies are not very accurate and are seldom known,
ii. in case of theft, the difference between the estimated and measured values will be noticeable everywhere in a radial feeder,
iii. not every customer may have a smart meter.

Correctly determining the exact location is thus a tricky business.
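The approximation in (1) can be checked numerically for a single line section. The following is only a minimal sketch: the voltage level, line impedance, load current and theft current are hypothetical values chosen for illustration, not data from the paper, and the upstream phasor is taken as the angle reference.

```python
import cmath
import math

def exact_drop(v_up, z, i_line):
    """Exact magnitude drop |V_up| - |V_down| across one line section."""
    v_down = v_up - z * i_line
    return abs(v_up) - abs(v_down)

def approx_drop(z, i_line):
    """Approximation (1): real part of the impedance drop Z*I."""
    return (z * i_line).real

# Hypothetical values: 2.4 kV upstream phasor as angle reference,
# a line with R/X = 1.5 and a 40 A load at 0.9 lagging power factor.
v_up = complex(2400.0, 0.0)
z = complex(0.30, 0.20)
i_load = cmath.rect(40.0, -math.acos(0.9))
i_theft = cmath.rect(10.0, -math.acos(0.9))  # assumed extra (illegal) current

drop_exact = exact_drop(v_up, z, i_load)
drop_approx = approx_drop(z, i_load)
drop_with_theft = approx_drop(z, i_load + i_theft)
print(f"exact {drop_exact:.3f} V, approx {drop_approx:.3f} V, "
      f"with theft {drop_with_theft:.3f} V")
```

With these assumed numbers the exact and approximate drops agree to well within a hundredth of a volt, and the extra current raises the drop in the section that carries it, which is the effect the method relies on.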

Fig. 2  Radial feeder system with a main node branching into lateral nodes

As mentioned in Section 1, two variants of the algorithm are proposed. The main idea of the first variant is as follows:

i. The basic idea is to analyse the measured voltage magnitude difference between successive nodes in the radial system. The voltage readings can be obtained from meter readings at the nodes of the feeder as shown in Fig. 1a.
ii. The voltage difference between successive nodes in a radial feeder would rise significantly upstream of the theft location.
iii. Downstream of the theft location the voltage difference would not change significantly.
iv. The voltage difference is also a function of time, as the loading condition changes throughout the day. The challenge is to identify the difference under conditions of theft. The cumulative loading condition of a group of customers follows a rough pattern every day and one could use this information.

To understand the concept, let us have a look at Fig. 2. The voltages and currents in the figure are all phasors. In Fig. 2a, we have a normal case with no power theft. A main node branches out into lateral nodes. Every node has a group of customers. The total current drawn by the loads in node 2 is \(\bar{I}_1\). Similarly, the loads connected to nodes 3 and 4 draw currents \(\bar{I}_2\) and \(\bar{I}_3\), respectively. The current flowing in the radial line connecting nodes 1 and 2 is thus \(\bar{I}_1 + \bar{I}_2 + \bar{I}_3\). In Fig. 2b, we have a case where there is illegal tapping at node 3, represented by \(\bar{I}_{th}\). Due to this extra current, the voltage drops between nodes 1 and 2 and between nodes 2 and 3 increase. However, the drop between nodes 3 and 4 remains the same. In other words, upstream of node 3 we see increased voltage drops, whereas downstream of node 3 the drops remain the same. This can be summarised by the following set of equations.

• Under normal circumstances, the voltage differences are approximated as follows:

\[
\begin{aligned}
|\bar{V}_1| - |\bar{V}_2| &\simeq \mathrm{Re}\big((\bar{I}_1 + \bar{I}_2 + \bar{I}_3)Z_{12}\big) \\
|\bar{V}_2| - |\bar{V}_3| &\simeq \mathrm{Re}\big((\bar{I}_2 + \bar{I}_3)Z_{23}\big) \\
|\bar{V}_3| - |\bar{V}_4| &\simeq \mathrm{Re}\big(\bar{I}_3 Z_{34}\big)
\end{aligned}
\tag{3}
\]

• Under conditions of illegal tapping (theft):

\[
\begin{aligned}
|\bar{V}_1| - |\bar{V}_2| &\simeq \mathrm{Re}\big((\bar{I}_1 + \bar{I}_2 + \bar{I}_3 + \bar{I}_{th})Z_{12}\big) \\
|\bar{V}_2| - |\bar{V}_3| &\simeq \mathrm{Re}\big((\bar{I}_2 + \bar{I}_3 + \bar{I}_{th})Z_{23}\big) \\
|\bar{V}_3| - |\bar{V}_4| &\simeq \mathrm{Re}\big(\bar{I}_3 Z_{34}\big)
\end{aligned}
\tag{4}
\]

Clearly, \(|\bar{V}_1| - |\bar{V}_2|\) and \(|\bar{V}_2| - |\bar{V}_3|\) change depending upon \(\bar{I}_{th}\), while \(|\bar{V}_3| - |\bar{V}_4|\) remains unchanged. Node 3 is the probable point where illegal power consumption is happening. Once we are able to find a node upstream of which there are changes in voltage drops, we can manually inspect all the connections emanating from the DT representing that node and pin-point illegal tapping. This change in drop should ideally be noticeable throughout the day under different loading conditions (different times of the day) of the feeder. A technique to detect voltage drops as outliers (and hence detect theft) using a statistical approach is discussed later in this section.

This kind of analysis should ideally work when there are not too many nodes in the lateral. When there are many nodes, the cumulative variation in the currents being drawn at all the nodes could overshadow the current due to illegal tapping/theft. For example, for a given time window, if \(\bar{I}_{\sigma 1}\), \(\bar{I}_{\sigma 2}\) and \(\bar{I}_{\sigma 3}\) are the standard deviations in the currents \(\bar{I}_1\), \(\bar{I}_2\) and \(\bar{I}_3\), respectively (refer to Fig. 2), we could have

\[ |\bar{I}_{\sigma 1} + \bar{I}_{\sigma 2} + \bar{I}_{\sigma 3}| > |\bar{I}_{th}| \tag{5} \]

for a given set of days. Under such conditions the described method will not be very accurate. To overcome this problem, we propose the NVDD technique.

Fig. 3  Radial feeder with illegal load at one of the nodes

2.2 NVDD technique

Let us have a look at Fig. 3. The voltage phasors at nodes k − 1, k etc. are \(\bar{V}_{k-1}\), \(\bar{V}_k\) and so on, respectively. \(Z_{k-1,k}\) is the impedance of the line connecting nodes k − 1 and k. Node k has some illegal load connected to it. Under normal circumstances the current drawn at node k is \(\bar{I}_{k-1}\). The illegal load draws a current of \(\bar{I}_{th}\). Without any instances of power theft (illegal load), we have

\[ |\bar{V}_{k-1}| - |\bar{V}_k| \simeq \mathrm{Re}\big((\bar{I}_{k-1} + \bar{I}_k + \bar{I}_{k+1})Z_{k-1,k}\big) \tag{6} \]

\[ |\bar{V}_k| - |\bar{V}_{k+1}| \simeq \mathrm{Re}\big((\bar{I}_k + \bar{I}_{k+1})Z_{k,k+1}\big) \tag{7} \]

The double differencing of node voltages essentially involves calculating the difference of (6) and (7). If \(Z_{k-1,k} \simeq Z_{k,k+1}\), the difference will give us

\[ (|\bar{V}_{k-1}| - |\bar{V}_k|) - (|\bar{V}_k| - |\bar{V}_{k+1}|) \simeq \mathrm{Re}\big(\bar{I}_{k-1}Z_{k-1,k}\big) \tag{8} \]

However, when there is an illegal load at node k, we have

\[ (|\bar{V}_{k-1}| - |\bar{V}_k|) - (|\bar{V}_k| - |\bar{V}_{k+1}|) \simeq \mathrm{Re}\big((\bar{I}_{k-1} + \bar{I}_{th})Z_{k-1,k}\big) \tag{9} \]

which can also be written as

\[ (|\bar{V}_{k-1}| + |\bar{V}_{k+1}|) - 2|\bar{V}_k| \simeq \mathrm{Re}\big((\bar{I}_{k-1} + \bar{I}_{th})Z_{k-1,k}\big) \tag{10} \]

So, by doing a double differencing with respect to node k, we get the voltage drop due only to the current drawn at node k, i.e. \(\bar{I}_{k-1} + \bar{I}_{th}\). If we run an outlier detection algorithm on the double differenced voltage, we should ideally get an outlier, provided the standard deviation in the current \(\bar{I}_{k-1}\) is less than \(\bar{I}_{th}\), as given in the following equation:

\[ \bar{I}_{\sigma(k-1)} \le \bar{I}_{th} \tag{11} \]

When this is compared with the set of equations given in (4), one can clearly see that with a double difference technique the chances of an outlier getting detected are higher, since we do not have the sum of variations in the other branch currents (\(\bar{I}_k\) and \(\bar{I}_{k+1}\)) in the network to deal with.

In the double difference technique, if \(Z_{k,k+1} \ne Z_{k-1,k}\), we will have

\[
\begin{aligned}
(|\bar{V}_{k-1}| - |\bar{V}_k|) - (|\bar{V}_k| - |\bar{V}_{k+1}|) \simeq{} & \mathrm{Re}\big((\bar{I}_{k-1} + \bar{I}_{th})Z_{k-1,k}\big) \\
& - \mathrm{Re}\big((\bar{I}_k + \bar{I}_{k+1})(Z_{k,k+1} - Z_{k-1,k})\big)
\end{aligned}
\tag{12}
\]

Here again, the effect of \(\bar{I}_k\) and \(\bar{I}_{k+1}\) could overshadow the effect of \(\bar{I}_{th}\) if the difference between \(Z_{k,k+1}\) and \(Z_{k-1,k}\) is significant.

One can get around this problem by a normalisation procedure. Let there be n nodes downstream of the node (node k) we are checking. Let

\[ X_1 \simeq \frac{|\bar{V}_{k-1}| - |\bar{V}_k|}{\sigma_{k-1,k}} \tag{13} \]

where \(\sigma_{k-1,k}\) is the standard deviation in \((|\bar{V}_{k-1}| - |\bar{V}_k|)\) over a set of measurements spanning several reference days (reference days are discussed in the next section). Now, as per (1), and without any electricity theft,

\[ |\bar{V}_{k-1}| - |\bar{V}_k| \simeq \sum_{j=k-1}^{k+n-1} \mathrm{Re}(\bar{I}_j Z_{k-1,k}) \tag{14} \]

Considering the \(\bar{I}_j\) to be independent and of almost identical phase, (13) can be reduced to

\[
X_1 \simeq \frac{\sum_{j=k-1}^{k+n-1} \mathrm{Re}(\bar{I}_j Z_{k-1,k})}{\sqrt{\mathrm{Var}\big(\sum_{j=k-1}^{k+n-1} \mathrm{Re}(\bar{I}_j Z_{k-1,k})\big)}}
\simeq \frac{\sum_{j=k-1}^{k+n-1} |\bar{I}_j|}{\sqrt{\sum_{j=k-1}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
\tag{15}
\]

where Var(·) is the variance function. In (15), the effect of the impedance \(Z_{k-1,k}\) has been eliminated. Similarly, let

\[
X_2 \simeq \frac{|\bar{V}_k| - |\bar{V}_{k+1}|}{\sigma_{k,k+1}}
= \frac{\sum_{j=k}^{k+n-1} |\bar{I}_j|}{\sqrt{\sum_{j=k}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
\tag{16}
\]

Now

\[
X_1 - X_2 \simeq \frac{\sum_{j=k-1}^{k+n-1} |\bar{I}_j|}{\sqrt{\sum_{j=k-1}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
- \frac{\sum_{j=k}^{k+n-1} |\bar{I}_j|}{\sqrt{\sum_{j=k}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
\tag{17}
\]

Equation (17) can be further reduced if certain assumptions are made as follows:

• Let \(\mathrm{Var}(|\bar{I}_j|) = \epsilon\) for \(j = k-1, \ldots, k+n-1\).
• Let \(\sum_{j=k-1}^{k+n-1} |\bar{I}_j| = (n+1)I_\mu\), and similarly \(\sum_{j=k}^{k+n-1} |\bar{I}_j| = nI_\mu\).

The first assumption is based on the fact that the variance in the current sum drawn by a set of customers (connected to one node) does not change much as we move from node to node. The second assumption is true when we have a large number of nodes. Equation (17) can be written as follows:

\[
X_1 - X_2 \simeq \frac{(n+1)I_\mu}{\sqrt{(n+1)\epsilon}} - \frac{nI_\mu}{\sqrt{n\epsilon}}
= \frac{I_\mu}{\sqrt{\epsilon}}\left(\frac{n+1}{\sqrt{n+1}} - \frac{n}{\sqrt{n}}\right)
\tag{18}
\]

On taking the binomial expansion \(\sqrt{n+1} \simeq \sqrt{n}\,(1 + 1/(2n))\), we have

\[
X_1 - X_2 \simeq \frac{I_\mu}{\sqrt{\epsilon}} \cdot \frac{(n+1)\sqrt{n} - n\sqrt{n}\,(1 + 1/(2n))}{\sqrt{n(n+1)}}
\simeq \frac{I_\mu}{2\sqrt{(n+1)\epsilon}}
= \frac{I_\mu}{2\sqrt{\sum_{j=k-1}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
\tag{19}
\]

With electricity theft at node k, we will have

\[
X_1 - X_2 \simeq \frac{I_\mu}{2\sqrt{\sum_{j=k-1}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
+ \frac{|\bar{I}_{th}|}{\sqrt{\sum_{j=k-1}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
= \frac{I_\mu/2 + |\bar{I}_{th}|}{\sqrt{\sum_{j=k-1}^{k+n-1} \mathrm{Var}(|\bar{I}_j|)}}
\tag{20}
\]
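The normalisation in (13) and (16) and the resulting double difference can be illustrated with a toy calculation. This is a sketch under stated assumptions rather than the authors' implementation: the feeder is purely resistive with in-phase currents, all quantities are in per unit, and the function names, section resistances, load statistics and theft current are hypothetical.

```python
import random
import statistics

def nvdd(v_prev, v_k, v_next, sigma_up, sigma_down):
    """Normalised voltage double difference X1 - X2 at node k, per (13) and (16)."""
    x1 = (v_prev - v_k) / sigma_up
    x2 = (v_k - v_next) / sigma_down
    return x1 - x2

def magnitudes(loads, theft, r_up, r_down, v_source=1.0):
    """Toy resistive feeder returning (|V_{k-1}|, |V_k|, |V_{k+1}|).
    loads[0] is drawn at node k, loads[1:] at the nodes downstream of k."""
    v_prev = v_source
    v_k = v_prev - r_up * (sum(loads) + theft)
    v_next = v_k - r_down * sum(loads[1:])
    return v_prev, v_k, v_next

random.seed(0)
R_UP, R_DOWN = 0.004, 0.009   # hypothetical, deliberately unequal section resistances
N_LOADS = 6                   # node k plus five downstream nodes
I_THEFT = 0.5                 # hypothetical theft current

def random_loads():
    return [random.gauss(1.0, 0.1) for _ in range(N_LOADS)]

# Theft-free reference measurements give the normalisation factors.
ref = [magnitudes(random_loads(), 0.0, R_UP, R_DOWN) for _ in range(200)]
sigma_up = statistics.stdev([a - b for a, b, _ in ref])
sigma_down = statistics.stdev([b - c for _, b, c in ref])

loads = random_loads()
normal = nvdd(*magnitudes(loads, 0.0, R_UP, R_DOWN), sigma_up, sigma_down)
theft = nvdd(*magnitudes(loads, I_THEFT, R_UP, R_DOWN), sigma_up, sigma_down)
print(f"NVDD without theft: {normal:.2f}, with theft: {theft:.2f}")
```

In this toy model the theft current raises only the upstream term X1, so the NVDD shifts by the theft-induced drop divided by the upstream standard deviation, even though the two sections have different resistances.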
2834 IET Gener. Transm. Distrib., 2017, Vol. 11 Iss. 11, pp. 2831-2839
© The Institution of Engineering and Technology 2017
Fig. 4  Selection of days for analysis

So, in the NVDD technique, not only has the effect of variations in the sum of multiple currents been eliminated, but the effect of the impedance being different in different feeder sections is also no longer a factor.

2.3 Theft at a junction point

We could have junction points that carry significant load currents in all the line segments connected to them. Under such circumstances, (17) will need modifications. Let node k, as discussed in (13)–(17), be a junction point, branching out into two separate line segments. Let the immediate nodes in these segments be denoted by k + 1 and k + 2, respectively. For theft localisation, (17) should then be changed to calculate \(X_1 - (X_2 + X_3)\), where \(X_3\) is given by

\[ X_3 \simeq \frac{|\bar{V}_k| - |\bar{V}_{k+2}|}{\sigma_{k,k+2}} \tag{21} \]

2.4 Outlier detection

The method we propose here essentially compares the calculated voltage differences (as discussed in the previous subsections) of some observation days with those of a fixed number of reference days. Reference days are the days during which it is considered that no power theft has taken place. The observation days are the days that are being tested for power theft. The proposed algorithm, in that sense, will not be able to detect power theft that was taking place before the installation of the system. It can only detect power theft once data comprising a sufficient number of reference days are available, after the installation of the meters and the collection of voltage measurement data. The outlier is detected in the observation days. If no outliers are found, the observed days are deemed normal and a subsequent set of observation days is chosen. The voltage profile may also change if new legitimate connections come up at any of the nodes. Under such circumstances, the reference days will require an update. Once sufficient reference days become available, the theft analytic can be started again. The implementation issues concerning the choice of reference and observation days are discussed in the next section.

In our simulations, the Mahalanobis distance metric has been used to detect outliers. This kind of metric has been used in power systems for a wide variety of applications, ranging from bad data detection in state estimation to event detection using PMU data [14, 15]. The Mahalanobis distance MD(y) of a sample vector \(y \in \mathbb{R}^m\) from the sample mean vector \(\mu \in \mathbb{R}^m\) is given as follows:

\[ \mathrm{MD}(y) = \sqrt{(y - \mu)^T C^{-1} (y - \mu)} \tag{22} \]

\(C \in \mathbb{R}^{m \times m}\) is the sample covariance matrix given by

\[ C = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^T \tag{23} \]

taken over N instances. For a normal distribution, \(\mathrm{MD}^2\) follows the Chi-squared distribution. We could then use the thresholds of a Chi-squared distribution to detect our outliers.

The sample mean μ and the covariance matrix C in our electricity theft case are computed from the reference days. The Mahalanobis distance MD(y) is calculated for the observation days. A value above the Chi-squared threshold for a given confidence level is considered a suspect case.

3 Implementation issues

3.1 Reference days and observation days

It has been mentioned before that the algorithms will not be able to locate existing power theft. However, any new cases of theft can be detected. To effectively detect and locate theft, we need to have a good set of reference days. A set of observation days is then checked with respect to these reference days. This is represented in Fig. 4.

Here \(D_1, D_2, \ldots, D_r\) represent the r reference days, and \(t_1, t_2, \ldots, t_m\) represent the different time slots in a day. The loading condition at every node varies throughout the day. The voltage pattern also varies accordingly. However, the pattern is roughly the same every 24 h working day for a group of customers connected to a node (with some noise). \(D_1, D_2, \ldots, D_p\) are the p observation days, as shown on the right side of the figure. \(V^i_j\) denotes the voltage difference value (simple difference or NVDD) at the jth time slot of the ith day.

In our case, the vector \(x_i \in \mathbb{R}^m\) in (23) represents the vector \([V^i_1, V^i_2, \ldots, V^i_m]^T\) from the ith reference day. The vector \(\mu \in \mathbb{R}^m\) is the sample mean vector, with the mean taken over the r reference days. The covariance matrix can then be calculated over the r [N in (23)] reference days. The vector \(y \in \mathbb{R}^m\), whose distance we are going to find [as given in (22)], is the mean of the p observation days. For example, the jth element of y is given as follows:

\[ y(j) = \frac{1}{p} \sum_{i=1}^{p} V^i_j \tag{24} \]

The \(V^i_j\) in (24) are taken from the observation days. Both observation and reference days are chosen in the form of a moving window. In the figure, there are r reference days and p observation days in a window with r > p.
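The outlier test built from (22)–(24) can be sketched in plain Python. This is only an illustrative sketch: the function names and the tiny two-slot data set are hypothetical, the linear solve is ordinary Gaussian elimination rather than anything specified in the paper, and the closed-form Chi-squared CDF used here is valid only for an even number of degrees of freedom.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sample_covariance(rows, mu):
    """Sample covariance matrix, as in (23)."""
    m = len(mu)
    c = [[0.0] * m for _ in range(m)]
    for x in rows:
        d = [xi - mi for xi, mi in zip(x, mu)]
        for a in range(m):
            for b in range(m):
                c[a][b] += d[a] * d[b] / (len(rows) - 1)
    return c

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for j in range(col, m + 1):
                aug[r][j] -= f * aug[col][j]
    z = [0.0] * m
    for r in reversed(range(m)):
        tail = sum(aug[r][j] * z[j] for j in range(r + 1, m))
        z[r] = (aug[r][m] - tail) / aug[r][r]
    return z

def md_squared(reference_days, observation_days):
    """Squared Mahalanobis distance of the observation-day mean, per (22)-(24)."""
    mu = mean_vector(reference_days)
    c = sample_covariance(reference_days, mu)
    d = [yi - mi for yi, mi in zip(mean_vector(observation_days), mu)]
    return sum(di * zi for di, zi in zip(d, solve(c, d)))

def chi2_cdf_even(x, dof):
    """Chi-squared CDF; closed form valid only for even degrees of freedom."""
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(dof // 2))
    return 1.0 - math.exp(-half) * terms

# Hypothetical voltage-difference profiles with m = 2 time slots per day.
reference = [[11, 20], [9, 20], [10, 21], [10, 19], [11, 21], [9, 19]]
normal_obs = [[10.5, 20.5], [10.5, 20.5]]
shifted_obs = [[12.5, 20.0], [11.5, 20.0]]

for name, obs in (("normal", normal_obs), ("shifted", shifted_obs)):
    d2 = md_squared(reference, obs)
    print(f"{name}: MD^2 = {d2:.3f}, confidence = {chi2_cdf_even(d2, 2):.4f}")
```

For this made-up data the squared distance is about 0.42 for the normal window and about 6.67 for the shifted one. The number of reference days must exceed the number of time slots m, otherwise the sample covariance matrix is singular and the solve fails. As a sanity check on the helper, `chi2_cdf_even(20.09, 8)` evaluates to roughly 0.99, in line with the 99% threshold quoted in Section 4.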

3.2 Selection algorithm

Since the power consumption pattern changes with time, reference days and observation days should have identical consumption patterns (e.g. Monday–Friday, summer, winter etc.). Reference and observation days could be selected based on the calculated equivalent load (\(L_{eq}\)) at the distribution substation. \(L_{eq}\) is calculated as follows [16]:

\[ L_{eq} = \left(\frac{L_p \cdot t_p + L_{np} \cdot t_{np}}{t_p + t_{np}}\right)^{0.5} \tag{25} \]

where \(L_p\) and \(L_{np}\) are the peak and non-peak period loads (in kW/kVA), respectively, and \(t_p\) and \(t_{np}\) (in min/h) their corresponding durations. \(L_p\) and \(L_{np}\) can be calculated by adding the energy consumption data of all the digital meters located at the nodes (DTs). If the variance of \(L_{eq}\) for a set of days is below some threshold, then these days could be examined for locating theft. The algorithm to select reference and observation days is given below as a step-by-step process.

i. Let us consider that the first N days were found to have \(L_{eq}\) with variance below the threshold. We take the last p days from this set as our observation days and the first r = N − p days as our reference days. p is a constant in our algorithm, selected beforehand.
ii. If the observation days are found to be normal (no theft), then we take the first k (out of the p) observation days and, together with the last r − k days from the reference set, form our new reference set.
iii. The next set of p observation days is created, such that the variance of their \(L_{eq}\) is within the specified threshold. If we are not able to find these p days, the value of k is decreased suitably to allow more days from the previous observation set to join the present set of observation days.

Thus the sets of reference and observation days are continuously updated until theft at a node is located. Once theft is suspected at a node, the reference days are left unchanged and only the window of observation days is kept moving.

3.3 Data communication

In the literature, several papers on communication technologies as applied to power networks can be found. In [17], communication infrastructure for smart grids at different levels is considered and the pros and cons of different technologies at the field level are discussed. In [18], the role of power line communications (PLC) in the smart grid is discussed. This paper gives a detailed review of all the different technologies available in PLC, their standards and their applications in power networks. Given the low bandwidth requirement of the proposed system, a less expensive narrow-band PLC with bandwidth in the range of tens of kbps should suffice. Although we have already discussed the low bandwidth requirements of the proposed method, there could be further innovative ways of relaying data to the substation to reduce the communication overload. In [19], the authors proposed a multi-agent system in a small-scale power network, whereby the nodes communicate data only to their neighbours after an event. The data transmission scheme proposed in [19] is aperiodic and an event is triggered when the measurement error crosses a threshold. While the focus of this paper is not on the communication protocol, we would like to add that a similar approach could be used in the algorithm proposed here. In this case, however, an event could be a significant change in the voltage difference measurements of neighbouring nodes. The voltage difference with respect to a downstream node/bus could be calculated at every node and relayed back to the substation control centre in the event of a significant deviation. The downstream nodes in turn could communicate their voltage measurement information to an upstream node. Once all the information has been collected for a day at the substation, average measurements representative of a time span (hourly, half-hourly etc.) could then be used for further analysis.

3.4 Complexity of the methods

In terms of complexity, the NVDD method differs from the simple voltage difference method only in the calculation of the normalisation factors (13). In the simple voltage difference method, for every node, and for r reference days, p observation days and m time slots, the number of operations required is 2(r + p)m. The factor 2 arises because, first, the voltage magnitude differences are calculated and, second, the mean for every time slot is calculated. This is done for all the reference days as well as the observation days. In the NVDD technique, apart from the 2(r + p)m operations for calculating the voltage differences and their mean, an extra 2(r + p)m (total 4(r + p)m) operations are required for calculating the normalisation factors.

Calculation of the Mahalanobis distance is common to both methods. Calculation of the covariance matrix has a complexity of \(O(rm^2)\), while calculation of its inverse has \(O(m^3)\). It is to be noted that the complexity discussed so far is for a single node. In any typical feeder system, the number of nodes is going to be in the tens. Given that for hourly data m = 24, with reference days in the range of 15–20 days and observation days in the range of 5–10 days, the overall complexity of the process is not going to be very large for the computers in a distribution control centre, and the result can be calculated within seconds (if not milliseconds).

3.5 Limitations

The primary limitation of this method is in the accuracy of voltage difference measurements. The algorithm requires voltage difference measurements with error/noise levels below 5–6% (refer to Table 1). Higher levels of noise would result in no/incorrect outliers getting detected.

Table 1 Sensitivity to noise levels in voltage magnitude difference measurements

% Noise level    Average MD values    % Confidence
1                27.13                99.93
2                26.89                99.92
3                24.68                99.82
4                20.5                 99.13
5                20.4                 99.10
6                20.2                 99.03
7                18.89                98.45
8                15.47                94.93
9                13.36                89.99
10               11.42                82.09

Another case in which the proposed method will fail is when the cumulative load of the connected customers is highly variable with time of day. The basic assumption made here is that the cumulative load of multiple customers follows some pattern (with some noise) in a 24 h period. For the experiments conducted using the proposed algorithm, we have allowed for ±10% noise in the cumulative load levels of all the customers connected to a feeder node at every instant of time. Violation of this assumption could lead to inconsistent results.

One example of such cumulative load variations could be due to heavy penetration of rooftop PVs. On normal sunny days, PV outputs follow a predictable pattern with respect to time. This will be reflected in the load and voltage profile of the reference days, and should not adversely impact the outcome of the method. However, on suddenly cloudy days, and with a large percentage of loads being normally displaced by PV, this could adversely affect the outcome. If the variations in the cumulative output of all the rooftop PVs concerning that node are more than the considered noise levels of the entire load at the feeder node, the results may not be consistent.
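The day-selection procedure of Section 3.2 can be sketched as a pair of window updates. This is a hedged illustration, not the authors' code: the function names are hypothetical, (25) is implemented exactly as printed, and the reading of step iii (the new observation window keeps the remaining p − k old observation days and appends k newly qualifying days) is an assumption about how the moving window is meant to work.

```python
import statistics

def equivalent_load(l_peak, t_peak, l_nonpeak, t_nonpeak):
    """Equivalent load L_eq, implemented exactly as (25) is printed."""
    return ((l_peak * t_peak + l_nonpeak * t_nonpeak) / (t_peak + t_nonpeak)) ** 0.5

def is_stable(leq_values, threshold):
    """True if the variance of L_eq over the given days is below the threshold."""
    return statistics.variance(leq_values) < threshold

def initial_windows(days, p):
    """Step i: last p days (p >= 1) are observation days, the first r = N - p are reference days."""
    return days[:-p], days[-p:]

def advance_windows(reference, observation, new_days, k):
    """Steps ii and iii after a 'no theft' verdict (assumed interpretation):
    drop the k oldest reference days, promote the first k observation days
    to the reference set, and refill the observation window with k new days."""
    new_reference = reference[k:] + observation[:k]
    new_observation = observation[k:] + new_days[:k]
    return new_reference, new_observation

# Days are represented by their index; values below are hypothetical.
days = list(range(1, 21))                  # N = 20 qualifying days
reference, observation = initial_windows(days, p=5)
reference2, observation2 = advance_windows(reference, observation, [21, 22, 23], k=2)
print(reference2[0], reference2[-1], observation2)
```

Both window sizes are preserved by the update, so r and p stay constant as the windows move. Once theft is suspected, only the observation window would be advanced while the reference set is held fixed, as the text describes.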

Fig. 5  IEEE 34 bus feeder system
(a) Single-line diagram of the IEEE 34 bus feeder system, (b) Voltage magnitude variation in a day for node 842

Fig. 6  Mahalanobis Distance (MD)
(a) MD of the voltage differences, (b) MD of the NVDD

4 Results from simulated data
All the simulations were performed in the IEEE 34 bus feeder system [20]. Load flow analysis was done on this setup using a Newton Raphson (NR) method. Fig. 5a gives the single-line diagram of the feeder system.
Load data was generated for several time instances for each of the nodes to simulate the load variations over a day. The load data for a particular day (base data) was generated by varying the base load over a range of ±10%. The load data was generated for several days by adding 10% normally distributed noise to the base load values. The voltage magnitude at the nodes varies accordingly. Fig. 5b gives the variation of the voltage magnitude at node 842 for one particular day.

4.1 Power theft at a node
4.1.1 Voltage difference: To demonstrate the usefulness of measuring simple voltage magnitude difference, the following experiment was performed. The nodes 814, 850, 816 and 824 (see Fig. 5a) were considered. A feeder segment branches out from node 816 towards node 818. As such, the line segment between nodes 850 and 816 carries currents serving loads at two different line segments. The node 816 does not have any spot load; it has a distributed load of 5 kW in phase 2, in the line segment connecting 816 and 824. We also have a distributed load of 40 kW in phase 2 in the line segment joining nodes 824 and 826. Nodes 818, 820 and 822 do not have any load in phase 2. An extra load of 40 kW in phase 2 of node 816 was added to mimic illegal load/power theft.
Power flow data was generated for 40 days. For the first 20 days, no illegal load was added. From day 21 to day 40 the extra (illegal) load was added in phase 2 of node 816. For calculation of MD, the size of the voltage vectors was kept at $\mathbb{R}^8$, by selecting equally spaced time slots (e.g. $V_i^{1}, V_i^{4}, \ldots, V_i^{22}$), with each slot representing an hour. This was done to cover the entire day and at the same time to keep the matrix of measurements from becoming ill-conditioned due to repetitive measurements. The first 15 days were chosen as the reference days in the first instance. The number of observation days was kept at 5. The sliding windows of observation days and reference days were shifted by one day every time. So, the first window of observation days ranged from day 16 to day 20. These days were normal days and hence the MD had no spikes. As such, the reference days’ window was shifted to day 2–day 16. The observation days’ window was shifted to day 17–day 21. This set had one day (day 21) with illegal load. The MD starts rising. Since the MD starts rising, the reference days’ window is kept fixed and the observation days’ window is kept moving, e.g. day 18–day 22, day 19–day 23 and so on. Fig. 6a gives the MD of the voltage magnitude differences of phase 2.
The Chi-squared threshold for 8 degrees of freedom with a 99% confidence level is 20.09. Clearly, the MD of voltage magnitude difference between the nodes 850-816 rises and remains so (except for the first window) for all the windows of observation days. The MD crosses the threshold in the fifth window itself. The sixth window has day 21–day 25, all of which were days that had the extra load. The MD peaks in the sixth window. The difference for nodes 850-816 rises because this line segment carries the extra current for the illegal load. As is discussed in Section 2.1, the difference should also show up for all the lines upstream to 816. This, however, does not happen in our case, as can be seen from the line 814-850. Clearly, when we have a significant number of nodes downstream to the point of theft, not all upstream voltage differences may come out as outliers. This test still does the job of pointing out where we may have a load that has not been accounted for.

4.1.2 Normalised voltage double difference: The same set of data is taken and the NVDD technique is used. Fig. 6b gives the MD for the double difference. The MD for (850-816)–(816-824), which we will refer to as 816, spikes up by a huge margin. The threshold is crossed in the third window, much earlier than the simple voltage difference case. The MD for (814-850)–(850-816), which we will refer to as 850, remains within the threshold limit.
In the example that was just discussed, it was seen that a power theft of the order of 40 kW is quite high when compared to the normal load level attributed to that node. It is, thus, easier for the algorithm to detect an outlier. Next, we show the effect of changing the load level (from high to low) of the illegal load, on the performance of the outlier detection algorithm.

4.2 Sensitivity to changes in the theft load level
To see the sensitivity of this algorithm to the level of theft, the illegal load level at a node with respect to its already existing legal load is varied. In the IEEE 34 node feeder system, node 844 has a spot load of 135 kW and 105 kVar in all the three phases. To create multiple scenarios, extra loads (illegal load) varying from 2 to 10% of this load (with a power factor of 0.8), on phase 2 of node 844 were added, to mimic power theft.
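
The windowed MD test used in Section 4.1 can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names (`make_days`, `sliding_md`, and so on) are hypothetical, the data are synthetic numbers rather than load-flow results, the window score is taken as the mean squared MD of the observation days against the reference-day mean and covariance, and both windows slide together (the reference window is not frozen once the MD starts rising).

```python
import random

CHI2_99_8DOF = 20.09  # 99% chi-squared threshold for 8 degrees of freedom (from the text)


def mean_vec(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def covariance(rows):
    """Unbiased sample covariance matrix of a list of equal-length vectors."""
    n, mu = len(rows), mean_vec(rows)
    d = len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        c = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def md_squared(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T cov^{-1} (x - mu)."""
    diff = [a - b for a, b in zip(x, mu)]
    return sum(d * v for d, v in zip(diff, solve(cov, diff)))


def sliding_md(days, n_ref, n_obs):
    """Return (first observation day, mean squared MD) for every window position."""
    out = []
    for start in range(len(days) - n_ref - n_obs + 1):
        ref = days[start:start + n_ref]
        obs = days[start + n_ref:start + n_ref + n_obs]
        mu, cov = mean_vec(ref), covariance(ref)
        score = sum(md_squared(v, mu, cov) for v in obs) / n_obs
        out.append((start + n_ref + 1, score))
    return out


def make_days(n_days=40, theft_from=21, shift=0.02, seed=0):
    """Synthetic 8-slot voltage-difference vectors; a constant shift mimics theft."""
    rng = random.Random(seed)
    days = []
    for day in range(1, n_days + 1):
        extra = shift if day >= theft_from else 0.0
        days.append([0.01 + extra + rng.gauss(0.0, 0.001) for _ in range(8)])
    return days


windows = sliding_md(make_days(), n_ref=15, n_obs=5)
flagged = [day for day, score in windows if score > CHI2_99_8DOF]
print(len(windows), "windows; first observation days above threshold:", flagged)
```

With 40 days, 15 reference days and 5 observation days this produces 21 window positions, the first covering day 16–day 20. Because the reference window keeps sliding in this sketch, later windows whose reference days already contain the extra load will score lower again, which is one reason for keeping the reference window fixed once the MD starts rising.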

Fig. 7  MD of the NVDD with different theft load levels

Eighty days of power flow data were generated for each of the scenarios by adding noise to the base case (as mentioned in the previous section). For the first 40 days, for all the scenarios, no extra/illegal load was added. For the remaining 40 days, extra load was added at node 844. The window size of reference days was kept at 25 and the window size of observation days at 5. For calculation of MD, the size of the voltage vectors was kept at $\mathbb{R}^8$, with each element representing an equally spaced hour (hours 1, 4, 7, …, 22). Fig. 7 gives the MD plot for the NVDD of phase 2, for three different scenarios: 2, 5 and 10% illegal load levels.
In Fig. 7, 844 refers to the NVDD, (842-844)–(844-846). There are 51 windows of observation days. The first window has day 26–day 30, while the last window has day 76–day 80. From the figure, it can be observed that with a 2% power theft (compared to the existing base case load), only a couple of minor peaks come up in the MD. These peaks cross the Chi-square threshold of 18.17 for a 98% confidence level. With a 5% power theft, one can see some peaks and a sustained high level of MD for some of the windows. The peaks cross the Chi-square threshold for a 98% confidence level, while the sustained higher level of MD crosses the threshold for a 90% confidence level at 13.36. With a 10% theft level, the MD rises from the 15th window onwards, and remains above the threshold for a 99% confidence level, on a consistent basis.
It is clear from the above simulation that with very low levels of power theft with respect to the existing load, the algorithm may not be able to detect anything. This is primarily because of the noise level in the load demand of the customers connected to the node. With higher levels of theft, the algorithm is able to detect power theft correctly. Detection accuracy will be dependent on the levels of power theft with respect to the variation in the existing load from day to day. The variation level (normally distributed) in the existing load in our simulation case studies is at 10% (1σ). So, for example, if we calculate the MD squared for only one degree of freedom (one time instant, univariate distribution), with theft levels of 10% and above, there is a 68% chance that the total load (existing + theft) would be considered as an outlier. Whereas, if the theft levels are above 30% (≥ 3σ of existing load variation levels), there is a 99% chance of detection. With a higher number of degrees of freedom (more time instants spanning over the day), the chances of outlier detection will be dependent upon the Chi-square distribution.

4.3 Sensitivity to noise levels in voltage magnitude difference measurements
The outlier detection technique will be sensitive to noise in the voltage magnitude difference measurements. To check for the effect of noise levels in the voltage difference measurements, we considered the case discussed in Section 4.2. 10% of the load (with a power factor of 0.8) on phase 2 of node 844 was added, to mimic power theft. Normally distributed noise was then added to the voltage magnitude differences, ranging from 1 to 10%. Table 1 gives the MD values and the Chi-squared confidence levels (8 degrees of freedom) with such values.
A noise level of 1–6% in the measurements could still result in a confidence level above 99%. With a 10% noise level, the confidence level is quite low and cannot be used to conclude anything. From the above results, it is clear that a noise level of 6% and below is preferable for the algorithm to give good results.

4.4 Effect of mutual coupling on other phases
Due to the effect of mutual coupling between phases, the double differencing of voltages on the other phases (phases that do not have illegal loads) can also throw up some outliers. The probability of getting outliers in the other two phases is higher when the node which is being examined for theft does not have any existing load in these two phases. If the other two phases have loads, then the mutual coupling effect due to the theft load may not be significant. In Section 4.2, a theft at phase 2 of node 844 is simulated. It has a spot load of 135 kW and 105 kVar in all the three phases. Fig. 8a gives the MD plot for the NVDD of phase 1 of node 844.
Clearly, the effect on phase 1 due to a theft in phase 2 is not strong enough to throw up an alarm. This is mainly because of the 135 kW spot load present in phase 1 of node 844.
Node 808, on the other hand, has a base load of 8 kW in phase 2 and no loads in phases 1 and 3. An extra 8 kW (illegal) load in phase 2 of node 808 was added. Fig. 8b gives the MD plot of phases 1 and 2 of node 808. The MD in phase 2 gives a clear indication of extra current flowing into node 808, with very high values. The MD in phase 1 also points to something abnormal.
Fig. 8c gives the plot of voltage magnitudes of phase 2 (the phase where we have the illegal load) and phase 1. The plot gives the hourly values for all the 80 days (24 × 80). If we look at the plot of phase 2, we see a clear dip in the voltage magnitudes for the last 40 days. This gets reflected in the high MD values. The dip in phase 1 cannot be found visually. Nonetheless, the MD still throws up an alarm. This information from a mutually coupled phase could be useful under circumstances where we may have missing data in one of the phases.

4.5 Theft at a junction point
In Section 4.1, a theft at node 816, which is a junction point with two different line segments downstream to it, is simulated. While doing the voltage double differencing, (850–816) and (816–824) were considered, but (816–818) was ignored. This is because nodes 818, 820 and 822 do not have any load in phase 2, and any voltage drop in the line segment 816-818 due to the current in phase 2 would be negligible.
To simulate a theft at a junction, an extra load in phase 1 of node 834 was added. Node 834 is a junction point with two line segments branching out of it (to nodes 860 and 842). Between nodes 834 and 860, distributed loads of 16, 20 and 116 kW are present in phases 1, 2 and 3, respectively. A theft of 20% in phase 3 was simulated. The voltage double difference now was calculated using (21). Since phase 3 has a bigger load compared to the other phases, a 20% theft in phase 3 will have a significant effect on the other phases due to mutual coupling. Fig. 8d gives the MD plot of phase 1 corresponding to node 834. There is a definite rise in the MD.

Fig. 8  Simulation results
(a) MD in a mutually coupled phase having load, (b) MD in a mutually coupled phase having no load, (c) Voltage magnitude plots of phases 2 and 1 for 80 days, (d) MD in a mutually coupled phase due to theft at a junction point

5 Conclusions
Simple voltage differences and normalised double differences between successive nodes in a distribution feeder can be used to detect and locate power theft/illegal loads. The Mahalanobis distance measure was used to detect outliers. Once an outlier is detected, technicians in the field can be asked to verify the connections going out of a feeder node.
The algorithms discussed here, however, are dependent on very accurate voltage measurements. Because the algorithms depend on voltage magnitude differences and double differences, measurement/calibration errors in the voltage measurements could lead to additive errors and hence erroneous results. With increased availability of accurate digital meters, algorithms of this kind could be deployed on a large scale in the future.

6 References
[1] ‘Emerging markets smart grid: outlook 2015’ (Northeast group, llc, 2014)
[2] Dan, L., Bo, H.: ‘Advanced metering standard infrastructure for smart grid’. Proc. of China Int. Conf. Electricity Distribution, Shanghai, China, September 2012
[3] Jiang, R., Lu, R., Wang, Y., et al.: ‘Energy-theft detection issues for advanced metering infrastructure in smart grid’, Tsinghua Sci. Technol., 2014, 19, (2), pp. 105–120
[4] Erol-Kantarci, M., Mouftah, H.T.: ‘Smart grid forensic science: applications, challenges, and open issues’, IEEE Commun. Mag., 2013, 51, (1), pp. 68–74
[5] Kadurek, P., Blom, J., Cobben, J.F.G., et al.: ‘Theft detection and smart metering practices and expectations in the Netherlands’. IEE PES Innovative Smart Grid Technologies Conf. Europe, Gothenburg, Sweden, October 2010
[6] Bergh, F.V.D., Kadurek, P., Cobben, S., et al.: ‘Electricity theft localization based on smart metering’. 21st Int. Conf. Electricity Distribution, Frankfurt, Germany, June 2011
[7] Nikovski, D., Wang, Z., Esenther, A., et al.: ‘Smart meter data analysis for power theft detection’ (Mitsubishi Electric Research Laboratories, 2013)
[8] Bandim, C.J., Alves, J.E.R., Pinto, A.V., et al.: ‘Identification of energy theft and tampered meters using a central observer meter: a mathematical approach’. Proc. of IEEE PES Transmission and Distribution Conf. and Exposition, September 2003, pp. 163–168
[9] Salinas, S.A., Li, P.: ‘Privacy-preserving energy theft detection in microgrids: a state estimation approach’, IEEE Trans. Power Syst., 2016, 31, (2), pp. 883–894
[10] Zhou, Y., Chen, X., Zomaya, A.Y., et al.: ‘A dynamic programming algorithm for leveraging probabilistic detection of energy theft in smart home’, IEEE Trans. Emerg. Top. Comput., 2015, 3, (4), pp. 502–513
[11] Amin, S., Schwartz, G.A., Cardenas, A.A., et al.: ‘Game-theoretic models of electricity theft detection in smart utility networks: providing new capabilities with advanced metering infrastructure’, IEEE Control Syst., 2015, 35, (1), pp. 66–81
[12] Cardenas, A.A., Amin, S., Schwartz, G.A., et al.: ‘A game theory model for electricity theft detection and privacy-aware control in AMI systems’. Fiftieth Annual Allerton Conf. Allerton House, Illinois, USA, October 2012, pp. 1830–1837
[13] Kersting, W.H.: ‘Distribution system modeling and analysis’ (CRC Press, 2002)
[14] Gardner, R.M., Liu, Y.: ‘Generation-load mismatch detection and analysis’, IEEE Trans. Smart Grid, 2012, 3, (1), pp. 105–112
[15] Gajjar, G., Soman, S.A.: ‘Auto detection of power system events using wide area frequency measurements’. National Power Systems Conf. (NPSC), Guwahati, India, 2014, pp. 1–6
[16] ANSI/IEEE C57.91-1981: ‘Guide for loading mineral-oil-immersed overhead and pad-mounted distribution transformers’ (American National Standards Institute, Inc., 1981)
[17] Sauter, T., Lobashov, M.: ‘End-to-end communication architecture for smart grids’, IEEE Trans. Ind. Electron., 2012, 58, (4), pp. 1218–1228
[18] Galli, S., Scaglione, A., Wang, Z.: ‘For the grid and through the grid: the role of power line communications in the smart grid’, Proc. IEEE, 2011, 99, (6), pp. 998–1027
[19] Meng, W., Wang, X., Liu, S.: ‘Distributed load sharing of an inverter-based microgrid with reduced communication’, IEEE Trans. Smart Grid, 2016, PP, (99), pp. 1–1
[20] ‘Distribution Test Feeders’, http://ewh.ieee.org/soc/pes/dsacom/testfeeders/, accessed 1 October 2016

