
Accepted Manuscript

Lightweight Privacy-Preserving Data Aggregation Scheme for Smart Grid Metering Infrastructure Protection

Ulas Baran BALOGLU, Yakup DEMİR

PII: S1874-5482(17)30110-5
DOI: 10.1016/j.ijcip.2018.04.005
Reference: IJCIP 246

To appear in: International Journal of Critical Infrastructure Protection

Received date: 30 June 2017


Revised date: 23 January 2018
Accepted date: 27 April 2018

Please cite this article as: Ulas Baran BALOGLU, Yakup DEMİR, Lightweight Privacy-Preserving
Data Aggregation Scheme for Smart Grid Metering Infrastructure Protection, International Journal of
Critical Infrastructure Protection (2018), doi: 10.1016/j.ijcip.2018.04.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.

Highlights:
• A lightweight privacy-preserving data aggregation scheme for smart grids
• Lossless scheme by using a combination of encryption and perturbation techniques
• Suitable for devices with limited hardware
• Resilient to both filtering and true value attacks
• Case study using Holt-Winters and STL prediction methods is presented


Lightweight Privacy-Preserving Data Aggregation Scheme for Smart Grid Metering Infrastructure Protection

Ulas Baran BALOGLU (1,*), Yakup DEMİR (2)

(1) Computer Engineering Department, Munzur University, Tunceli, Turkey
(2) Electrical and Electronics Engineering Department, Firat University, Elazig, Turkey
(*) ulasbaloglu@gmail.com

Abstract
The electric industry’s planned shift to smart grid metering infrastructure has raised several
concerns, especially about preserving privacy. Various data perturbation and aggregation
solutions have been developed to address these concerns. The drawback of these solutions is that a
simple random noise scheme cannot protect privacy, and more advanced perturbation
techniques may increase the hardware costs of smart metering devices. The proposed data
aggregation scheme combines the power of perturbation techniques with cryptosystems in an
efficient and lightweight way so that it is suitable for devices with limited hardware, such as
smart meters. We investigated the privacy-preserving capabilities of the proposed aggregation
scheme with the Holt-Winters and Seasonal Trend Decomposition using Loess prediction
methods. The results indicate that the proposed scheme is resilient to both filtering and true
value attacks.
Keywords: data perturbation; privacy; security; smart grid protection.
This research did not receive any specific grant from funding agencies in the public,
commercial, or not-for-profit sectors.

1. Introduction

Recently, there has been significant research on smart grids and their smart components, such as
electricity meters. The introduction of these new technologies raises privacy concerns
because data engineering and data mining techniques can analyze large
volumes of private data quickly. The electric power industry has to cooperate with
information technologists to build cybersecurity into the smart grid, because
reliability requires security [1]. Sender authentication and privacy preservation of
consumer data are two major security problems in smart grid communication [2].
Applications and services should be designed to operate efficiently without intruding
on the privacy of consumers [3]. Smart meter data can be associated with a consumer’s
activities, and such privacy-sensitive household data should not be shared, disclosed or used by
any third-party company to profile energy consumption patterns [4].
Smart meters are essential components of a smart grid, and they are usually referred to as
the future generation of power measurement systems [5]. Advances in technology have not only
started the emergence of smart grids but have also triggered the development of
smart metering devices over time. The evolution began with Automatic Meter
Reading (AMR) devices, which provide one-way communication. AMR technology has made
it easier to read electricity meters, but it is still not suitable for implementing smart grid
applications. Consequently, the Advanced Metering Infrastructure (AMI), which provides
two-way communication, has emerged. AMI meters can be read at much shorter intervals,
and they can be communicated with remotely [6]. The historical development of metering devices for
electrical grids is shown in Figure 1.


Figure 1. Historical development of metering devices and technologies.


With current communication techniques, smart meters not only monitor the power
consumption of all household appliances but also report meter readings at smaller time
intervals. Naturally, a smart meter becomes a key requirement for smart grid applications,
such as a demand response system, which encourages consumers to reduce demand in order to
limit peak loads [7]. Metering data collected by these applications could be used to
generate profiles of consumers even in anonymous environments, and failure to adequately
secure personal privacy may threaten customer participation and put the success of smart grid
applications at risk. Metering data can reveal the daily routines of a household, such as
when the occupants sleep, when they go to work, or which brands of appliances are used at what
time. This information can be used for a variety of purposes, from burglary to assassination.
The information that metering devices read should therefore be used only by smart grid
applications and should not be readable or accessible for any other purpose.
This study focuses on developing a lightweight privacy-preserving data aggregation
scheme which efficiently utilizes the components of a smart grid metering infrastructure. Our
goal is to preserve information privacy in smart meters while keeping hardware costs
down. Unlike many other privacy-preserving data aggregation schemes, the computational
cost of aggregation and privacy preservation is minimal in this study. The required
processing power has been minimized to decrease the infrastructure installation
cost. Nonetheless, reducing processing power may lead to complexity problems in distributed
systems, such as the problem of a limited number of Privacy Preserving Nodes (PPNs) [8]. The
proposed approach extends this previous study and introduces a task scheduling layer to cope
with this minimization problem. Another contribution of the proposed approach is that
each user’s data is processed in at least two processing units and the status of the
system is monitored continuously by the developed algorithms. These measures help the system
prevent data loss in the event of a malfunction in a processing unit and also capture possible
errors resulting from smart grid communication. Finally, the proposed scheme does not require
the distribution or a model of the data to reconstruct perturbed data, and unlike other studies the
reconstruction process is lossless.
The remainder of this paper is organized as follows. Related work is given in Section 2,
followed by the details of the proposed privacy-preserving data aggregation scheme and
the problem formulation in Section 3. In Section 4, we use two different prediction methods to
investigate the privacy-preserving capabilities of the proposed aggregation scheme. The last
section presents the concluding remarks.

2. Related Work

Privacy preservation for smart metering infrastructure is an important research problem
because meter readings can be used to observe household activity in real time [8]. Those
meter readings are sensitive data. They may create risks related to profiling or data mining in
the future, so they should be kept and processed securely. Nobody wants someone else to have
information about their household activities or the schedules of particular home appliances.
In a smart metering infrastructure, each meter measures energy consumption and sends
it to a utility company at regular intervals, typically every 15 minutes [9]. There should be
an external aggregator on the company side to gather the metering data, which represent energy
consumption in the form of a time series. The privacy of consumers can be protected by
implementing a security architecture which enables aggregation of the collected data [8].
There are different approaches in the literature for this purpose. Those approaches can be
classified as trusted aggregator, homomorphic cryptosystem, secret sharing, blind signature,
differential privacy and trusted dealer [10-12]. Aggregation schemes can be categorized as
message appending and mathematical summation [13]. Techniques without aggregation are not
feasible because they require smart meters with substantial computational capabilities. It would be
expensive to construct, renew or modify such systems, so a reliable and secure
aggregation scheme is a better choice.

The data perturbation process for time series adds noise to hide the true metering values or
daily routines of consumers while preserving usability [14]. This process protects user
privacy in two different ways: either locally, by randomizing user data before its transmission,
or in a centralized manner at an aggregator [15]. Perturbed time-series data are transmitted from
smart meters to protect the original usage information. There are different perturbation techniques
in the literature, such as adding noise (additive perturbation), k-anonymity, compressing data
and geometric transformation [16]. A distributed noise generation procedure can be used to
employ differential privacy [17]. However, these methods have difficulties in achieving an
effective privacy and utility tradeoff because of the correlations and high fluctuations of
time-series data [18]. Bayes estimation and Principal Component Analysis might also be selected for
noise generation and data reconstruction tasks [19]. Removing noise from the data is
difficult, and this process may lead to data losses. Another technique is homomorphic secret
sharing, which is a special form of secret sharing where homomorphic encryption is used to
encrypt the secret. Homomorphic encryption allows anyone to encrypt data from a set of
messages without knowing the decryption key [20]. Homomorphic schemes that are not
semantically secure may also cause security problems [21].
Many of the existing studies in the literature employ computationally expensive
operations. These operations are not feasible for smart grid applications, where bandwidth
and computation resources are usually limited [22]. In a previous
study, finding the minimum number of PPNs was shown to be an NP-hard problem
[8]. In this study, a task scheduler layer is added to the scheme to eliminate the need for
minimum PPN computation. This layer avoids problems due to a low number of processing
units and also leads to a more scalable architecture. Unlike previous studies, the calculations to be
done by the smart metering devices are kept simple so that their hardware costs do not
increase significantly. Finally, the proposed aggregation scheme is not concerned only with the
privacy of the data in the aggregator. The scheme also achieves a robust architecture by
distributing the decryption process to multiple processing units.

3. The Proposed Data Aggregation Scheme

We now describe how smart metering data is processed in the proposed data
aggregation scheme. There are four layers in the architecture, as shown in Figure 2. The first
layer contains smart meters, where metering data is collected as time series, perturbed and then
transmitted together with encrypted noise information. The second layer contains a task scheduler,
which is responsible for two processes: aggregating data and transmitting metering data. The
metering data is transmitted to at least two separate PPNs to maintain integrity and to
increase robustness. Perturbed data and encrypted noise data are aggregated in this layer by
appending. Metering data is decrypted and reconstructed in the third layer by the PPNs. The
final layer represents the utilities or other services where metering data is used or stored.
Figure 2. Privacy-preserving aggregation scheme for time-series metering data.

Some studies aggregate all metering data and encrypt them together. The main
problem with this approach is its vulnerability to hardware failures. A malfunction in
one metering device causes the whole aggregation scheme to fail, so these studies
have to concentrate on fault tolerance as well as on privacy preservation. The
proposed aggregation scheme is robust and resistant to hardware failures in the system.
Since the aggregated metering data is sent to more than one processing unit, it is possible to
recover the metering data in case of a hardware failure or communication error. Processing
units that do not respond are automatically removed from the system, as they can
be easily detected by the task scheduler.

3.1. Problem Formulation

The proposed scheme perturbs a consumer’s smart metering data such that the
individual data items cannot be estimated with high accuracy by attackers and their trend
cannot be used for creating a profile. Suppose that there are N smart metering devices (D1, D2,
…, DN) in the smart grid infrastructure. Each device records meter readings at a predetermined
time interval t. When t is 15 minutes, there are 96 records in total in the daily time series of each
device. Smart grid metering data is represented as below,

$$\vec{D} = \{\vec{d}_1, \vec{d}_2, \ldots, \vec{d}_N\} \qquad (1)$$
Perturbed data is generated by adding noise to the metering data before transmission.
A scaled version of the noise yields better privacy, so the noise data is multiplied by a series
of coefficients. Given a series d_t and a perturbation series p_t with zero mean, E[p_t] = 0,
additive perturbation is defined as calculating the series x_t as the sum of d_t and p_t for
every t ≥ 1 [23]. For wavelet perturbation, wavelet coefficients are multiplied with Gaussian
noise; other coefficients can be preferred to generate different perturbation schemes. In this
study, the perturbed time series is defined as follows,
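
$$x_t = d_t + \alpha \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(z-\mu)^2}{2\sigma^2}} \qquad (2)$$

where $\alpha$ denotes the coefficient, $\mu$ is the mean value, $\sigma$ is the standard deviation and z
represents the grey level. Gaussian noise is time independent, and its generation is not
associated with the consumer data. Perturbed smart grid metering data is represented as,

$$\vec{X} = \{\vec{x}_1, \vec{x}_2, \ldots, \vec{x}_N\} \qquad (3)$$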

There are two problems related to time-series data reconstruction and preserving the
privacy of the data. When reconstructing the perturbed data with function f, the
approximation error at every time t should be zero for all metering data vectors.
⃑ ⃑ … (4)
( ⃑) ⃑ ⃑ ⃑ (5)
Transmitted smart metering data consists of both perturbed and unperturbed time-series
data. A scaled version of f_n can be added to the metering data to improve privacy, and
Equation (5) can be rewritten as follows,

( ⃑) ⃑ ⃑ ⃑ (6)
where the scaling coefficient is a random variable. Previous
solutions to this problem have to deal with minimization of the function f, because high
approximation errors make it difficult to separate unperturbed time-series data from the
perturbed data. In a smart grid environment, no utility wants to charge its consumers less or
more than the actual consumption amount, so metering data should be transmitted without
any loss. Reconstructing a time series from noisy data may cause data losses, so we decided to
transmit the data in two parts: the perturbed time-series data and the encrypted noise amount. This
information is used by the processing units to reconstruct the original data without loss. A
traditional single perturbation technique cannot achieve this lossless reconstruction without
compromising security. In the scheme, the noise is generated with a known function f_n but
with unknown parameters, and it is applied with an unknown scaling coefficient. As a result,
the noise variance is not small, and the noise values are correlated.
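
The two-part transmission can be illustrated with a minimal Python sketch. It assumes that readings and noise are held as integers (for example watt-hours) so that the subtraction is exact, and it leaves out the encryption of the noise entirely. The names `perturb` and `reconstruct` and the sample values are hypothetical and are not taken from the scheme.

```python
import random

def perturb(readings, noise):
    """Meter side: add one noise value to every reading (x_t = d_t + p_t)."""
    return [d + p for d, p in zip(readings, noise)]

def reconstruct(perturbed, noise):
    """Processing-unit side: subtract the (already decrypted) noise to get d_t back."""
    return [x - p for x, p in zip(perturbed, noise)]

rng = random.Random(7)  # fixed seed, for illustration only
readings = [412, 398, 405, 950, 1210, 430]          # hypothetical Wh values
noise = [rng.randint(-300, 300) for _ in readings]  # stand-in for the scaled noise

perturbed = perturb(readings, noise)      # first transmission
recovered = reconstruct(perturbed, noise) # done only where the noise is known
```

Because the noise values travel separately, a party that holds both parts recovers the readings exactly, whereas a party that only sees `perturbed` has to estimate them.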
The second problem is defined as protecting a smart meter’s data values in such a way that
individual readings or data trends cannot be estimated by statistical models. The minimum
metering data leak for a given time-series function with a noise function f_n is defined as,

⃑ ⏟ (7)
L denotes the metering data leak function and, as expected, lower values of this function
give better privacy. The optimal noise value should satisfy two constraints: it should
limit the metering data leak, and it should not increase the size of the transmitted data too much.
The optimal noise value is application specific, and it should be determined according to the
environment parameters.

Figure 3. The system model for metering data transmissions of smart meters.

3.2. Encryption Scheme


A Decision Diffie-Hellman (DDH) based scheme is used for the encryption of noise data.
Let G be a multiplicative cyclic group of Sophie Germain prime order q with random
generator g ∈ G. Under the DDH assumption, for a, b ∈ Z_q, g^{ab} is indistinguishable from a
random element of G given independently chosen g^a and g^b.
H0 : Z → G and H1 : Z → G are two hash functions. Array A with m random elements over Z
is defined as,
∑ ⁄
{ (8)
∑ ⁄

For 0 < i ≤ m, the secret key is keyi = (Ai, A2i) and

. (9)
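
The group setting used by the encryption scheme can be made concrete with a toy Python sketch. It assumes a deliberately tiny Sophie Germain prime (q = 11, so that p = 2q + 1 = 23 is also prime) and works in the order-q subgroup of quadratic residues modulo p. The exponents a and b are hypothetical, and the key array of Equations (8) and (9) is not modelled here. Real deployments need primes of cryptographic size.

```python
def is_prime(n):
    """Trial-division primality test, adequate for toy-sized numbers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

q = 11         # toy Sophie Germain prime: 2*q + 1 is prime as well
p = 2 * q + 1  # the corresponding safe prime, 23
assert is_prime(q) and is_prime(p)

# Squaring any h in [2, p-2] gives a quadratic residue other than 1,
# and every such element generates the subgroup of prime order q.
h = 2
g = pow(h, 2, p)

a, b = 3, 7                  # hypothetical secret exponents in Z_q
g_a = pow(g, a, p)
g_b = pow(g, b, p)
g_ab = pow(g_a, b, p)        # same value as pow(g_b, a, p)
subgroup = {pow(g, k, p) for k in range(q)}
```

The DDH assumption states that, for properly sized parameters, `g_ab` cannot be told apart from a random member of `subgroup` by someone who only sees `g_a` and `g_b`. With a 5-bit modulus this obviously does not hold, so the sketch only shows the algebra.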
The system model for metering data transmissions of smart meters is given in Figure 3.
Perturbed data and encrypted noise data are sent to the task scheduler as two separate
transmissions. It is possible to employ different security schemes for the encryption part, such
as Shamir’s Secret Sharing algorithm [24], homomorphic encryption [20,21,25], NTRU [26-27],
the Paillier cryptosystem [28-29], elliptic curve cryptography [30], differential privacy [31],
or a neural network based multi-key algorithm [32]. The proposed security scheme is
preferred because of its low complexity and simplicity. To reduce computational
complexity further, no noise is added to the metering data for coefficients which are lower
than a predetermined threshold value. This setting may lead to the leakage of only an
insignificant number of actual values, as seen in the case study, but it does not give
information about the general trend of the series. The algorithm of the perturbation process is
given below.

Algorithm 1. Perturbation
Input: metering data time-series d, coefficient series k, time interval count t, threshold
Output: perturbed series x and encrypted noise p (both transmitted)
while (t > 0) do
    if (k < threshold) then
        p = 0;
    else
        calculate noise p;
        Encrypt(p);
        Transmit(p);
    end if
    x = d + p;
    Transmit(x);
    t--;
end while
The proposed data aggregation scheme transmits the metering data as discrete values,
so an attacker or a malicious user may capture only a portion of the data, which cannot be
estimated accurately by merely removing the noise. An attacker has to capture both the perturbed
data and the encrypted noise data, and then also has to identify and decrypt the data. The proposed
data aggregation scheme makes the following assumptions:

• We assume that the internal hardware of metering devices is not accessible to any
attacker. This study concentrates on securing the transmitted data.
• We assume that all the information transmitted by a metering device is
accessible to attackers. There is no safe zone in the communication
infrastructure of metering devices.
• We assume that each metering device operates independently without interacting
with other metering devices. This assumption makes the system robust, as a failure
in one metering device does not affect the other devices.
• Finally, we assume that the processing units and the utility are in a safe zone. Data can
also be encrypted at the utility, but protection mechanisms for storage are beyond
the scope of this study.

3.3. Task-Assign Algorithm

The task scheduler learns nothing other than the perturbed data and the encrypted parts of the
noise. It communicates with smart meters and PPNs, and it assigns PPNs
according to the Task-Assign algorithm. The task scheduler waits idly like a server, and it
transmits aggregated metering data to suitable processing units when there is data in the
metering data queue. It is not possible to retrieve information about the consumption after
aggregation of the metering data.
Two different data structures are used in the task scheduler. The first structure is a
queue for keeping the received metering data. New meter readings are appended to the end of the
queue, and meter readings picked from the front are sent to the processing units. The second
structure is a list which keeps information about the processing units. In the list,
processing units are stored as ordered pairs. The first entry is the PPN identifier, and the
second entry is the allocation counter, which is used to measure available processing power.
When a processing unit finishes its job, it transmits a message to the task scheduler, and
the allocation counter is incremented. The allocation counter is decremented when the task
scheduler assigns a new job to a processing unit. If the allocation counter value is 0, it means
no job is given to that processing unit.

Algorithm 2. Task-Assign
Input: Metering data queue Q
while (Q is not empty) do
    if (front == NIL) then return error
    Data *aggregate = front;
    aggregate.append(aggregate->next);
    front = aggregate->next;
    firstPPN = *PPNPointer;
    while (firstPPN->allocation-counter != NIL) do
        PPNPointer++;
        firstPPN = *PPNPointer;
    end while
    PPNPointer++;
    secondPPN = *PPNPointer;
    while (secondPPN->allocation-counter != NIL) do
        PPNPointer++;
        secondPPN = *PPNPointer;
    end while
    Transmit(aggregate, firstPPN, secondPPN);
    Delete aggregate;
end while
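
The queue and the list of (PPN identifier, allocation counter) pairs can be sketched in Python as below. The sketch follows the prose description and assumes that the allocation counter counts free job slots, so a PPN is eligible when its counter is positive. The class name `Scheduler`, its methods, the batch size and the PPN identifiers are all hypothetical, and transmission is reduced to returning the chosen pair.

```python
from collections import deque

class Scheduler:
    def __init__(self, ppn_slots):
        # ppn_slots maps a PPN identifier to its allocation counter (free job slots).
        self.queue = deque()
        self.ppns = dict(ppn_slots)

    def receive(self, reading):
        self.queue.append(reading)   # new readings join the back of the queue

    def job_done(self, ppn_id):
        self.ppns[ppn_id] += 1       # a PPN reports that it finished a job

    def assign(self, batch_size=2):
        """Aggregate by appending and pick two distinct PPNs for the aggregate."""
        if not self.queue:
            return None
        free = [pid for pid, slots in self.ppns.items() if slots > 0]
        if len(free) < 2:
            return None              # wait until two PPNs are available
        count = min(batch_size, len(self.queue))
        aggregate = [self.queue.popleft() for _ in range(count)]
        first, second = free[0], free[1]
        for pid in (first, second):
            self.ppns[pid] -= 1      # one slot on each chosen PPN is now occupied
        return aggregate, first, second

s = Scheduler({"ppn-a": 1, "ppn-b": 0, "ppn-c": 2})
for r in ["m1:x,enc(p)", "m2:x,enc(p)", "m3:x,enc(p)"]:
    s.receive(r)
first_job = s.assign()
```

Sending every aggregate to two units is what later allows a faulty unit to be spotted by comparing results.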

3.4. Malfunctioning Processing Unit Problem

A malfunctioning PPN either does not respond or cannot process the metering data
according to the proposed scheme. As described in Section 3.2, the task scheduler can
easily detect processing units which are not responding due to a possible hardware problem.
A processing unit may also transmit corrupted or wrong data values. These data do not match
the data from the alternative processing unit at the utility. In this case, there are two
possible scenarios. First, consider that one processing unit is malfunctioning. This unit can be
detected by using the second and third units from the list data structure. If the data transmitted by
the second and third units match, then the first unit is marked as malfunctioning. Otherwise, the
second unit is marked. Second, consider that both of the processing units are malfunctioning.
Similar to the first scenario, these malfunctioning units can be detected by checking the third and
fourth units.
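
Both scenarios can be covered by one cross-check, sketched below in Python. This is a generalisation for illustration, not the authors' procedure: it assumes that healthy units return identical results for the same aggregate and that two faulty units never agree on the same wrong result. The function name and the sample data are hypothetical.

```python
def find_malfunctioning(outputs):
    """outputs: list of (ppn_id, reconstructed_data) pairs in list order.

    The first value on which two units agree is taken as the reference.
    Returns the ids whose output differs from it, or None if no two units agree.
    """
    for i in range(len(outputs)):
        for j in range(i + 1, len(outputs)):
            if outputs[i][1] == outputs[j][1]:
                reference = outputs[i][1]
                return [pid for pid, data in outputs if data != reference]
    return None

# Scenario 1: the first unit is faulty, the second and third units agree.
one_bad = find_malfunctioning(
    [("ppn-1", [1, 2, 9]), ("ppn-2", [1, 2, 3]), ("ppn-3", [1, 2, 3])])

# Scenario 2: both original units are faulty, the third and fourth units agree.
two_bad = find_malfunctioning(
    [("ppn-1", [0]), ("ppn-2", [9]), ("ppn-3", [1]), ("ppn-4", [1])])
```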

3.5. Security Model

An honest-but-curious security model is considered in this study. In this model, all
entities follow the protocol honestly and do not learn anything beyond their outputs. However,
some entities, such as the task scheduler, can be considered curious or semi-honest because they
collect data from other entities. Entities such as smart meters and PPNs should transmit data
without leaking any information about their inputs, and they are considered honest. The
model is secure if and only if all entities know only their outputs and gain
no new knowledge from the other entities.
The task scheduler is in the middle of the proposed scheme, and it temporarily stores
the metering data, so this entity is an appealing target for security attacks. A malicious
task scheduler does not necessarily follow the security model and can leak encrypted metering
data. In addition, data communication between metering devices and the task scheduler is considered
open to attacks from malicious users. The privacy requirement, or security goal, of the
proposed scheme is to hide the consumption trend and to prevent the disclosure of true
meter reading values. In more detail, smart metering data should be kept secure, and no
information should be leaked when a subset of the time-series data is captured. To ensure this
privacy protection, in the proposed scheme all entities other than the PPNs and the utility can
only process and transmit the perturbed metering data.



4. Case Study

A case study has been carried out on different scenarios to test the proposed
privacy-preserving scheme. Two different prediction methods are used to
reconstruct the perturbed time series. The main advantage of using statistical techniques to
demonstrate security attacks is that privacy problems related to malicious users and
servers in the infrastructure can be monitored. Details of these prediction methods and evaluations
of the proposed scheme are briefly given in this section.

Prediction with Holt-Winters Method

The Holt-Winters method [33] was developed by Holt and Winters to capture seasonality.
There are two types of this method: the multiplicative model and the additive model. The
method is widely used for exponential smoothing and time-series prediction. The additive
Holt-Winters method for prediction is expressed as follows,

$$\hat{y}_{t+h} = a_t + h\,b_t + s_{t-p+1+(h-1) \bmod p} \qquad (10)$$

where smoothing parameters a_t, b_t and s_t are calculated as,

$$a_t = \alpha\,(y_t - s_{t-p}) + (1-\alpha)\,(a_{t-1} + b_{t-1}) \qquad (11)$$
$$b_t = \beta\,(a_t - a_{t-1}) + (1-\beta)\,b_{t-1} \qquad (12)$$
$$s_t = \gamma\,(y_t - a_t) + (1-\gamma)\,s_{t-p} \qquad (13)$$

In these equations, y_t denotes the observed value, h the prediction horizon and p the period
length, and α, β, and γ are filter parameters which are decisive in the estimation process.
Small values of α give more importance to previous data, and high values give more weight to
recent data. Values of β close to 0 give weight to the trend, and level changes become
important when β is around 1. Finally, high values of γ make predictions sensitive to variations.

Prediction with Seasonal Trend Decomposition using Loess


The Seasonal Trend Decomposition using Loess (STL) is an algorithm that was
developed by Robert B. Cleveland, William S. Cleveland, Jean E. McRae and Irma
Terpenning [34]. It decomposes a time series into three components: trend,
seasonality, and remainder. Loess is used to smooth the output. This algorithm is simple, fast
and powerful, and it can decompose time series with missing values. In this algorithm, every
member of the time-series data is decomposed as follows,

$$y_i = t_i + s_i + r_i \qquad (14)$$

where t denotes the trend, s denotes the seasonality, r indicates the remainder, and i denotes the
corresponding index. STL consists of two nested recursive procedures. Seasonality and
trend are updated in the inner loop, and weights are calculated when the execution moves to
the outer loop.
Daily smart metering data is illustrated in Figure 4(a). This data is perturbed
according to the proposed scheme and shown in Figure 4(b). Figure 4(c) shows the power
spectral densities of both the original data and the perturbed data. It is seen from the graph that
the perturbed data is only allocated at the frequencies where the original data is allocated. In other
words, their spectra are similar, which means that the perturbed data and the original data
cannot be separated by filtering.

Figure 4. Illustration of daily metering data: (a) original data, (b) after perturbation,
(c) power spectral densities of both.

In the first two experiments, it is assumed that an attacker is capable of performing
privacy-invading inference attacks and that the attacker obtains 50% of the true values of the
perturbed data. The Holt-Winters and STL methods are used to estimate the consumption data from this
captured data collection. Since the attacker has only 50% of the data, we assume that the
attacker preprocesses the data and fills in the unknown values using one of two
methods. In the first experiment the unknown values are filled with 0, and in the second
experiment they are filled with the previous values. The results of these two experiments are
shown in Figures 5(a) and 5(b). Since the Holt-Winters method is additive, it cannot make any
predictions in the first experiment. The predictions made with STL are not successful, and they
do not even reflect the general trend. In the second experiment, the estimation methods perform
better, but they still cannot obtain the true data values. At only one point was the STL method
able to capture a true value, because some members of the time series with
small coefficient values are not perturbed, in order to reduce the computational cost.
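
The second experiment can be imitated with a short Python sketch: missing samples are filled with the previous known value and the additive Holt-Winters recursions are then run over the filled series. The initialisation from the first two periods is a common simple choice and an assumption here, as are the function names, the parameter values and the sample data. The sketch does not reproduce the tooling or the data used in the case study.

```python
def fill_previous(series):
    """Replace missing entries (None) with the last known value (0.0 if none yet)."""
    filled, last = [], 0.0
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled

def holt_winters_additive(y, p, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast with period p (requires len(y) >= 2 * p)."""
    # Simple initialisation from the first two periods (an assumed choice).
    level = sum(y[:p]) / p
    trend = (sum(y[p:2 * p]) - sum(y[:p])) / (p * p)
    season = [y[i] - level for i in range(p)]
    for t in range(p, len(y)):
        s_old = season[t % p]  # seasonal term from one period earlier
        new_level = alpha * (y[t] - s_old) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season[t % p] = gamma * (y[t] - new_level) + (1 - gamma) * s_old
        level = new_level
    n = len(y)
    return [level + h * trend + season[(n + h - 1) % p]
            for h in range(1, horizon + 1)]

# Half of the samples are unknown to the attacker (None).
captured = [1.0, None, 3.0, None, 1.0, 2.0, None, 4.0, None, 2.0, 3.0, None]
filled = fill_previous(captured)
forecast = holt_winters_additive(filled, p=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4)
```

Replacing `fill_previous` with a function that inserts zeros gives the preprocessing of the first experiment.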

Figure 5. Illustration of prediction attacks on the captured data: 50% true value leaks with
(a) 0-value insertions and (b) previous-value insertions.

In the third experiment, the worst-case scenario is investigated, and it is assumed that the
attacker obtains 100% of the true values of the perturbed data. Although the performance of the STL
method increases in this scenario, neither method succeeds in obtaining the true values, as
shown in Figure 6. The STL method captures a couple of true values, but there is no
improvement in the performance of the Holt-Winters method, which only generates a
smoother output this time. With parameter tuning or a combination of different methods,
an attacker might get a slightly better result than the results shown here. Even in this case, it
seems difficult for the attacker to capture true data values. The main reason for this is the
spectral similarity of the perturbed data and the original data.

Figure 6. Illustration of prediction attacks to the captured data: 100% true value leaks.


Computational Complexity

In the proposed data aggregation scheme, computational tasks include data
aggregation operations, task scheduling, perturbation and encryption-decryption operations.
Smart meters and processing units do not need high computational power because the
proposed scheme is lightweight and does not demand frequent encryption and decryption
operations. Further, it is easy to increase or decrease the number of processing units
dynamically in the case of a change in demand. The ease of capacity change makes the
infrastructure more scalable. Data aggregation and task scheduling operations are similarly
lightweight. Table 1 shows that the running time of the proposed scheme increases slowly with
the number of smart meters, from 8.21 ms for 10,000 meters to 24.02 ms for 30,000 meters, and
that the proposed scheme takes roughly 19% to 31% less time than
Shamir’s Secret Sharing [24] algorithm, which is itself a fast and lightweight algorithm. From these
results, it is observed that the number of users has very little effect on the computational cost,
which makes the proposed scheme a good candidate for smart metering infrastructure.

Table 1. Running time of the proposed scheme and Shamir’s scheme with a PPN parameter setting of 10.

Scheme              Number of Smart Meters    Time (ms)
Proposed Security   10000                     8.21
Proposed Security   15000                     10.70
Proposed Security   20000                     14.33
Proposed Security   25000                     18.84
Proposed Security   30000                     24.02
Shamir’s [24]       10000                     10.15
Shamir’s [24]       15000                     13.39
Shamir’s [24]       20000                     20.86
Shamir’s [24]       25000                     24.66
Shamir’s [24]       30000                     30.23


Conclusions

In this paper, we constructed and investigated a lightweight data aggregation scheme
to be used in smart grid metering infrastructure. The proposed scheme is resilient to both
filtering and true value attacks. In the case study, the Holt-Winters and STL prediction
methods were employed to show that attackers cannot obtain the metering data accurately. On
the other hand, the processing units can reconstruct the same data without loss even though
perturbation is applied.
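
The lossless property can be illustrated with a deliberately simplified sketch. This is not the perturbation method of the proposed scheme, and it makes no attempt to preserve the spectral properties of the data; it only shows how a party that shares a secret with the meter can remove integer noise exactly. The shared `seed`, the noise bound `max_abs`, and all function names are assumptions introduced here, and Python's `random` module is used only for reproducibility, since it is not a cryptographically secure generator.

```python
import random


def noise_stream(seed, length, max_abs=50):
    """Deterministic non-zero integer noise derived from a shared seed.

    Illustrative only: random.Random is NOT cryptographically secure.
    """
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) * rng.randint(1, max_abs) for _ in range(length)]


def perturb(readings, seed):
    """Meter side: add the seeded noise to integer readings."""
    noise = noise_stream(seed, len(readings))
    return [r + n for r, n in zip(readings, noise)]


def reconstruct(perturbed, seed):
    """Processing-unit side: regenerate the same noise and subtract it."""
    noise = noise_stream(seed, len(perturbed))
    return [p - n for p, n in zip(perturbed, noise)]


if __name__ == "__main__":
    readings = [412, 398, 455, 510]       # hypothetical readings in Wh
    sent = perturb(readings, seed=2018)   # what an eavesdropper would see
    print(reconstruct(sent, seed=2018) == readings)
```

Because the noise is integer valued and regenerated from the same seed, subtraction recovers the readings exactly, with no rounding error.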
For proper operation, smart grid applications should not experience any performance
degradation or information leakage. The proposed scheme is suitable for these applications
because the smart meters and the task scheduler perform only lightweight computations, while
complex operations are done by scalable processing units. The performance evaluation
demonstrates the computational efficiency of the scheme. To the best of our knowledge, the
proposed scheme is the first attempt to propose a lightweight and lossless solution by using a
combination of encryption and perturbation techniques.

References

[1] Hawk, C., Kaushiva, A. ‘Cybersecurity and the smarter grid’, The Electricity Journal, 2014, 27(8),
pp. 84-95.

[2] Chim, T. W., Yiu, S. M., Hui, L. C. K., Li, V. O. K. ‘Privacy-preserving advance power
reservation’, IEEE Communications Magazine, 2012, 50(8), pp. 18-23.

[3] Mármol, F. G., Sorge, C., Ugus, O., Pérez, G. M. ‘Do not snoop my habits: Preserving privacy in
the smart grid’, IEEE Communications Magazine, 2012, 50(5), pp. 166-172.

[4] Yang, L., Xue, H., Fengjun L. ‘Privacy-Preserving Data Sharing in Smart Grid Systems’, IEEE
International Conference on Smart Grid Systems, 2014, pp. 878-883.

[5] Sharma, K., Saini, L. M. ‘Performance analysis of smart metering for smart grid: An
overview’, Renewable and Sustainable Energy Reviews, 2015, 49, pp. 720-735.

[6] Farhangi, H. ‘The path of the smart grid’, IEEE Power and Energy Magazine, 2010, 8(1), pp. 18-28.

[7] Siano, P. ‘Demand response and smart grids - A survey’, Renewable and Sustainable Energy
Reviews, 2014, 30, pp. 461-478.

[8] Rottondi, C., Verticale, G., Capone, A. ‘Privacy-preserving smart metering with multiple data
consumers’, Computer Networks, 2013, 57, pp. 1699-1713.

[9] Benhamouda, F., Joye, M., Libert, B. ‘A new framework for privacy-preserving aggregation of
time-series data’, ACM Transactions on Information and System Security, 2016, 18(3), pp. 21.

[10] Leontiadis, I., Elkhiyaoui, K., Molva, R. ‘Private and dynamic time-series data aggregation with
trust relaxation’, In Proceedings of Cryptology and Network Security: 13th International Conference,
CANS 2014, 2014, pp. 305-320.

[11] Lu, R., Liang, X., Li, X., Lin, X., Shen, X. S. ‘Eppa: An efficient and privacy-preserving
aggregation scheme for secure smart grid communication’, IEEE Transactions on Parallel and
Distributed Systems, 2012, 23(9), pp. 1621-1631.

[12] Chim, T., Yiu, S., Hui, L. C. K., Li, V. K. ‘Pass: Privacy-preserving authentication scheme for
smart grid network’, In IEEE International Conference on Smart Grid Communications
(SmartGridComm), 2011, pp. 196-201.

[13] Bae, M., Kim, K., Kim, H. ‘Preserving privacy and efficiency in data communication and
aggregation for AMI network’, Journal of Network and Computer Applications, 2016, 59, pp. 333-344.

[14] Laforet, F., Buchmann, E., Böhm, K. ‘Individual privacy constraints on time-series data’,
Information Systems, 2015, 54, pp. 74-91.


[15] Erdogdu, M. A., Fawaz, N., Montanari, A. ‘Privacy-utility trade-off for time-series with
application to smart-meter data’, In Proceedings of the AAAI Conference on Artificial Intelligence,
Workshop on Computational Sustainability, AAAI 2015 Workshop, 2015, pp. 32-36.

[16] Hong, S. K., Gurjar, K., Kim, H. S., Moon, Y. S. ‘A survey on privacy preserving time-series data
mining’, In 3rd International Conference on Intelligent Computational Systems (ICICS’2013),
2013, pp. 44-48.

[17] Bao, H., Lu, R. ‘A New Differentially Private Data Aggregation with Fault Tolerance for Smart
Grid Communications’, IEEE Internet of Things Journal, 2015, 2(3), pp. 248-258.

[18] Yang, X., Ren, X., Lin, J., Yu, W. ‘On Binary Decomposition based Privacy-preserving
Aggregation Schemes in Real-time Monitoring Systems’, IEEE Transactions on Parallel and
Distributed Systems, 2016, 27(10), pp. 2967-2983.

[19] Huang, Z., Wenliang, D., Chen, B. ‘Deriving Private Information from Randomized Data’, In
Proceedings of International Conference on Management of Data, ACM SIGMOD, 2005, pp. 37-48.

[20] Alharbi, K., Lin, X., Shao, J. ‘A Framework for Privacy-Preserving Data Sharing in the Smart
Grid’, IEEE/CIC ICCC 2014 Symposium on Privacy and Security in Communications, 2014, pp. 214-219.

[21] Gentry, C. ‘Fully homomorphic encryption using ideal lattices’, STOC’09, ACM, 2009, pp. 169-178.

[22] Jia, W., Zhu, H., Cao, Z., Dong, X., Xiao, C. ‘Human-Factor-Aware Privacy-Preserving
Aggregation in Smart Grid’, IEEE Systems Journal, 2014, 8(2), pp. 598-607.

[23] Papadimitriou, S., Feifei, L., Kollios, G., Yu, P. S. ‘Time series compressibility and privacy’,
VLDB '07 Proceedings of the 33rd international conference on Very large data bases, 2007, pp. 459-470.

[24] Shamir, A. ‘How to share a secret’, Communications of the ACM, 1979, 22, pp. 612-613.

[25] Mai, V., Khalil, I. ‘Design and implementation of a secure cloud-based billing model for smart
meters as an Internet of things using homomorphic cryptography’, Future Generation Computer
Systems, 2017, 72, pp. 327-338.

[26] Nitaj, A. ‘Cryptanalysis of NTRU with two public keys’, International Journal of Network
Security, 2014, 16(2), pp. 112-117.

[27] Abdallah, A., Shen, X. ‘Lightweight Security and Privacy Preserving Scheme for Smart Grid
Customer-Side Networks’, IEEE Transactions on Smart Grid, 2015, PP(99), pp. 1-1.

[28] Paillier, P. ‘Public-Key Cryptosystems Based on Composite Degree Residuosity Classes’, In
Proceedings of Eurocrypt, 1999, pp. 223-238.

[29] Li, H., Lin, X., Yang, H., Liang, X., Lu, R., Shen, X. ‘EPPDR: An Efficient Privacy-Preserving
Demand Response Scheme with Adaptive Key Evolution in Smart Grid’, IEEE Transactions on
Parallel and Distributed Systems, 2014, 25(8), pp. 2053-2064.

[30] Mahmood, K., Chaudhry, S.A., Naqvi, H., Kumari, S., Li, X., Sangaiah, A.K. (in press) ‘An
Elliptic Curve Cryptography based Lightweight Authentication Scheme for Smart Grid
Communication’, Future Generation Computer Systems. doi: 10.1016/j.future.2017.05.002


[31] Dwork, C. ‘Differential privacy’, Automata, languages and programming, 2006, 4052, pp. 1-12.

[32] Li, P., Li, J., Huang, Z., Li, T., Gao, C.Z., Yiu, S.M., Chen, K. ‘Multi-key privacy-preserving
deep learning in cloud computing’, Future Generation Computer Systems, 2017, 74, pp. 76-85.

[33] Winters, P. R. ‘Forecasting sales by exponentially weighted moving averages’, Management
Science, 1960, 6(3), pp. 324-342.

[34] Cleveland, R. B., Cleveland, W. S., McRae, J. E., Terpenning, I. ‘STL: A Seasonal-Trend
Decomposition Based on Loess’, Journal of Official Statistics, 1990, 6(1), pp. 3-33.
