Abstract
In this paper, we focus on false data injection attacks (FDIAs) on state estimation and the corresponding
countermeasures for data recovery in the smart grid. Without information about the topology and
parameters of the system, two data-driven attacks (DDAs) are constructed from noisy measurements, and
both can escape the residue-based bad data detection (BDD) in the state estimator. Moreover,
in view of the limited energy of adversaries, the feasibility of the proposed DDAs is improved: they are
sparser and lower in cost than the DDAs in existing work. In addition, a new algorithm for measurement data
recovery is introduced, which converts the data recovery problem against the DDAs into a problem
of low rank approximation with corrupted and noisy measurements. In particular, an online low rank
approximation algorithm is employed to improve the real-time performance. Finally, the
14-bus power system is used for the simulation experiments. The results show that
the constructed DDAs are stealthy under BDD but can be eliminated by the proposed data recovery
algorithms, which improve the resilience of the state estimator against the attacks.
© 2018 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
1. Introduction
The smart grid is a typical cyber physical system (CPS) [1,2] because it is increasingly
tightly integrated with information and communication technology (ICT) to collect and
∗ Corresponding author.
E-mail addresses: auliqinxue@mail.scut.edu.cn (Q. Li), aubgxu@scut.edu.cn (B. Xu).
https://doi.org/10.1016/j.jfranklin.2018.10.022
0016-0032/© 2018 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
Please cite this article as: Q. Li, S. Li and B. Xu et al., Data-driven attacks and data recovery with noise on state
estimation of smart grid, Journal of the Franklin Institute, https:// doi.org/ 10.1016/ j.jfranklin.2018.10.022
JID: FI
ARTICLE IN PRESS [m1+;December 6, 2018;7:30]
2 Q. Li, S. Li and B. Xu et al. / Journal of the Franklin Institute xxx (xxxx) xxx
process the physical meter data over a wide geographical range. However, ICT depends
on complex network connections as well as the Internet, which are vulnerable to
malicious adversaries. Therefore, the security and reliability of the smart grid are hard to
guarantee under cyber attacks, which has drawn the attention of more and more researchers
[3–5].
In order to assess the vulnerability of the smart grid and study countermeasures against cyber
attacks, we first study feasible attack strategies in the smart grid. False data injection attacks
(FDIAs) are one of the main cyber attacks. They are implemented by injecting bias values
into the measurements transmitted between the remote terminal units (RTUs) and the state
estimator in Supervisory Control and Data Acquisition (SCADA) (false data injected into
control data is not within the scope of this paper), as shown in Fig. 1. Note that successful
FDIAs usually depend on detailed knowledge of the system topology and parameters; with such
knowledge, they have been proved to be undetectable by the residue-based bad data detection (BDD)
in the state estimator (SE) [6]. However, owing to protection settings of the system or the limited
technical ability of adversaries, this system information is
usually hard to acquire. Therefore, many FDIA strategies under different constraints have been
proposed.
According to the system information obtained by adversaries, the FDIAs can be roughly
classified into three categories: FDIAs with full information of the system topology, FDIAs
with partial information of the system topology, FDIAs without the system information but
only with the measurement data [6–11]. The FDIAs with full information of the system
topology are firstly introduced in [6], and then the FDIA with information about the impedance
of transmission lines in at least one cut set of the network topology can still be constructed in
[7]. Afterwards, the research in [8] shows that the topology and parameter information of the
local attacking region are enough to construct the attack. Different from [6–8], without any
information of the system topology, the FDIAs are constructed by independent component
analysis (ICA) and principal component analysis (PCA) approximation based on linear DC
power flow model, which can acquire the estimated topological information to launch the
attacks by analysing the power flow measurement data [9,10], where the FDIA in [10] is
tested to be still valid for the nonlinear AC power data. Obviously, the FDIAs in [9,10] are
both data-driven attacks. In addition, data-driven attacks constructed by subspace methods
are proposed in [11], which can acquire the subspace structure of the system through full or
even partial measurement data. However, the attacks in [9–11] need to corrupt all of
the measurement data, which is impractical due to the limited energy [12] or constrained
capability of adversaries. Therefore, FDIA strategies with an energy constraint are proposed in
this paper, which are lower cost or more targeted data-driven attacks (DDAs) than those in existing
work. Countermeasures against FDIAs have also been studied
and have attracted great interest in recent years.
In general, the countermeasures against the FDIAs on state estimation can be divided into
four categories: self-protection strategy, attack detection, state reconstruction, and
measurements recovery.
Self-protection strategy: the protected measurements are carefully selected to increase
the attack cost and improve the precision of the state estimation, since it is too expensive
to protect all of the measurements [13–15]. Nevertheless, a self-protection strategy is still
expensive and depends on an accurate dynamic grid topology. Combining it with other
anti-attack technologies is a better solution.
Attack detection: once being attacked, attack detection is used to detect the attacked mea-
surement data based on a prior probability distribution on the grid states or at least the grid
topology and parameter information [16,17]. The system theories, graph theories and statis-
tical structure learning are partly or wholly adopted to complete attack detection. However,
the corrupted measurement data are always discarded and can not be recovered.
State reconstruction: if the estimated state values with the corrupted measurements can
be reconstructed on the system model, the anti-attack state estimator is used. Recently, state
reconstruction technologies are proposed to solve the problem for state estimation under
attacks [4,18–22], which omit the step of attack detection and identification. However, the
state reconstruction depends on the system model, especially on state equation of systems.
It should be noted that it is difficult to model the exact state equation for complex power
systems.
Measurements recovery: without the state equation of systems but with the measurement
data, the low rank matrix factorization and nuclear norm minimization [17,23] are employed to
complete not only the detection of FDIAs but also the identification of proper operation states
in smart grid [24], which are the problem of measurement data recovery and have been widely
applied to image processing [23,25]. Especially, with the scale expansion of power systems,
low rank matrix factorization becomes more practical due to higher computational efficiency
than singular value decomposition (SVD) in nuclear norm minimization. Obviously, the two
algorithms for measurements recovery are all data-driven methods, which do not depend on
the exact state equation of the smart grid. However, in engineering practice, measurement data
collected over a period of time and corrupted by gross noise are no longer exactly low rank, which
can degrade the performance of the algorithms proposed in [24]. Therefore, the measurement
noise is not negligible.
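The effect of noise on the low rank property can be illustrated with a small numerical sketch. It is not taken from the paper: the sizes `m`, `n`, `K`, the random stand-ins for the measurement matrix and the states, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, K = 20, 5, 100                 # hypothetical sizes: meters, states, samples

H = rng.standard_normal((m, n))      # stand-in for the measurement matrix
X = rng.standard_normal((n, K))      # stand-in for the state trajectory
Z_clean = H @ X                      # noise-free measurements, rank at most n
Z_noisy = Z_clean + 0.5 * rng.standard_normal((m, K))  # gross Gaussian noise

rank_clean = np.linalg.matrix_rank(Z_clean)
rank_noisy = np.linalg.matrix_rank(Z_noisy)
print("rank without noise:", rank_clean)
print("rank with noise:   ", rank_noisy)
```

Under these assumptions the noise-free matrix has rank `n`, while the noisy matrix is numerically full rank, which is why an approximation (rather than an exact low rank model) is needed.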
Different from attacks and anti-attack countermeasures under the ideal assumptions, not
only the data-driven attack strategies but also the corresponding countermeasures for mea-
surements recovery in smart grid are presented, both of which are data-driven under the
measurement noise. In this paper, the contributions are summarized as follows:
(1) It is difficult for adversaries to acquire the topological structure and parameters of the smart
grid, and adversaries also have limited energy. Therefore,
two kinds of low cost data-driven FDIAs are constructed without any knowledge about
the system but with the noisy measurement data in this paper. Furthermore, the stealth
of the two attacks is proved.
(2) The problem of measurement data recovery against attacks is converted into the problem
of low rank approximation. Moreover, the Gaussian noise is considered, since it is more
coincident with the actual situation.
(3) The improved online low rank approximation algorithm is used to complete the
recovery of measurement data more efficiently. The proposed algorithms are shown to
remain effective under an attack that is both low rank and sparse, a case that has not been reported
before. The experiment results show that both the real-time performance and the accuracy of the proposed
data recovery algorithms are better than those of the algorithms used in [24]. Moreover, the
online version of the proposed algorithm compensates for the deficiency of the off-line
algorithm.
The remainder of the paper is organized as follows. The preliminaries are described in
Section 2, including the state estimation model of power systems, the state estimator and
residue-based BDD (Section 2.1), and the general stealth attack model (Section 2.2). The general
data-driven attacks are presented in Section 3.1, and then two kinds of data-driven attacks are
proposed in Section 3.2, which are consistent with the actual situation and respect the limited
energy of adversaries. Furthermore, data-driven countermeasures for data recovery are
proposed in Section 4, whose off-line and online versions are elaborated in Section 4.1 and
Section 4.2, respectively. Then, the simulation experiments for the proposed attacks on the IEEE
14-bus power system are shown in Section 5.1, and the proposed data-driven countermeasures
are evaluated in Section 5.2. Finally, Section 6 concludes the paper.
Notations. $n$ denotes the state dimension and $\mathbb{R}^n$ denotes the $n$-dimensional real vector space.
$I$ denotes the identity matrix. $\|\cdot\|_0$ denotes the number of nonzero elements of a vector or
matrix; $\|\cdot\|_1$, $\|\cdot\|_2$ and $\|\cdot\|_F$ denote the 1-norm, 2-norm and Frobenius norm,
respectively. In addition, $\mathrm{diag}(b)$ denotes a matrix whose diagonal elements are the entries of the
vector $b$. $N(\cdot)$ denotes the null space of a matrix and $\mathrm{sign}(\cdot)$ denotes the signum function.
$\mathrm{vec}(\cdot)$ denotes the linear transformation that converts a matrix into a column vector. $\mathrm{rank}(\cdot)$
denotes the rank of a matrix.
2. Preliminaries
We consider the commonly used DC model of the power system for state estimation
(SE) [26,27] to approximate the nonlinear relationship between measurements and states,
since the proposed attacks are a kind of FDIAs proposed in [6]. Specifically, the linear DC
approximation model for the measurement equation is expressed as following by assuming
that all branch resistances and shunt elements are neglected and the bus voltage magnitudes
are all equal to 1 p.u.:
z = H x + e, (1)
where the states $x = \theta \in \mathbb{R}^n$ denote the voltage phase angles, and $z \in \mathbb{R}^m$ is the power measurement
data, which could be the real power flow between two buses or the real power injection at
a bus (node). $H$ denotes the measurement Jacobian matrix and $e$ denotes the measurement
noise.
In the SCADA system of the smart grid, we assume that a state estimator based on weighted least
squares (WLS) [26,27] is employed, which can be formulated as the following problem:
$$\hat{x} = \arg\min_{x} J(x), \qquad J(x) = (z - Hx)^T R^{-1} (z - Hx), \tag{2}$$
where $\hat{x} \in \mathbb{R}^n$ denotes the estimated value of the state $x$, $J(x)$ denotes the objective function
and $R$ denotes the covariance matrix of $e$. Then the residue vector $r \in \mathbb{R}^m$ is defined
as in Eq. (3):
$$r = z - H\hat{x}. \tag{3}$$
Assumption 1 [27]. If $H$ has full column rank, then Eq. (2) has the following
unique solution:
$$\hat{x} = (H^T R^{-1} H)^{-1} H^T R^{-1} z. \tag{4}$$
According to Eq. (4), the state estimator needs to collect enough measurements to ensure the
observability of the system. The number of measurements required is therefore large, and their
transmission makes the system vulnerable.
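As a minimal sketch of Eqs. (2)–(4), the WLS estimate and the residue can be computed with NumPy as follows. The function name `wls_estimate`, the problem sizes and the noise variance are assumptions made for this illustration and do not come from the paper.

```python
import numpy as np

def wls_estimate(H, R, z):
    """Closed-form WLS solution of Eq. (4): (H^T R^-1 H)^-1 H^T R^-1 z."""
    R_inv = np.linalg.inv(R)
    G = H.T @ R_inv @ H              # invertible when H has full column rank
    return np.linalg.solve(G, H.T @ R_inv @ z)

rng = np.random.default_rng(1)
m, n = 8, 3                          # hypothetical numbers of meters and states
H = rng.standard_normal((m, n))      # stand-in for the measurement Jacobian
R = 0.01 * np.eye(m)                 # assumed noise covariance
x_true = rng.standard_normal(n)
z = H @ x_true + 0.1 * rng.standard_normal(m)   # Eq. (1) with Gaussian noise

x_hat = wls_estimate(H, R, z)
r = z - H @ x_hat                    # residue vector of Eq. (3)
print("estimation error:", np.linalg.norm(x_hat - x_true))
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse of the gain matrix explicitly, which is a common numerical choice rather than something prescribed by the text.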
In addition, as the bad data detection (BDD) method, the chi-square ($\chi^2$) test used to detect
bad measurement data [26] during the state estimation is as follows:
$$\max_{i} \frac{|r_i|}{R_{ii}} > \tau, \quad i \in \{1, 2, \ldots, m\}, \tag{5}$$
where $R_{ii}$ denotes the $i$-th diagonal element of $R$, $r_i$ denotes the $i$-th component of $r$, and $\tau$ is the
detection threshold for bad measurement data.
Moreover, we employ the largest normalized residual (LNR) test to identify the corrupted
data if the bad data exists according to Eq. (5). Otherwise, the measurement data can pass
the BDD and will be adopted to estimate the system states. The detection task continues until
making sure that all of the bad measurement data have been removed, and then the states are
re-estimated.
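The detection rule of Eq. (5) and the identification of the largest normalized residual can be sketched as below. The function name `bdd_alarm` and the numerical values are hypothetical; the normalization by $R_{ii}$ follows Eq. (5) as printed.

```python
import numpy as np

def bdd_alarm(r, R, tau):
    """Return (alarm, index): alarm is True when max_i |r_i| / R_ii > tau,
    and index points at the measurement with the largest normalized residue."""
    scores = np.abs(r) / np.diag(R)
    return bool(scores.max() > tau), int(scores.argmax())

R = np.diag([0.5, 0.5, 0.5])              # assumed noise covariance
r_clean = np.array([0.1, -0.2, 0.05])     # residue within the threshold
r_bad = np.array([0.1, -0.2, 3.0])        # third measurement is corrupted

print(bdd_alarm(r_clean, R, tau=0.5))
print(bdd_alarm(r_bad, R, tau=0.5))
```

In an iterative scheme, the flagged measurement would be removed and the state re-estimated until no alarm is raised.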
being detected by BDD on the measurement residue [6]. However, in fact, the adversaries
always can not obtain the full system configuration information (e.g., smart grid topology and
transmission-line admittances) due to their own limited ability (limited resources and energy)
and system safeguard but only on the intercepted measurement data [10,12]. Therefore, the
research on data-driven attacks is meaningful and more consistent with the actual situation.
In general, stealth attacks can be constructed under the assumption that the adversaries
can acquire full information about the system topology of the smart grid [3,15,28]. However,
this full information is usually hard to acquire, which motivates data-driven attacks.
Actually, the information from the measurement data is very rich. Therefore, inspired by
the idea of the data mining, subspace analysis or component pursuit methods are employed to
acquire the key information and then launch the valid attacks by adversaries [9–11]. With the
full or partial measurement data, the former namely subspace analysis is introduced to find the
nonzero vector in the column space of H and then construct an unobservable attack. While in
the latter, PCA or ICA approximation methods are proposed to transform each measurement
vector into the linear combination of a vector with principal or independent components, as
shown in the following form:
$$z = H_{apx}\tilde{x}, \tag{7}$$
where $\tilde{x} \in \mathbb{R}^n$ denotes the principal or independent components, and $H_{apx} \in \mathbb{R}^{m \times n}$ denotes the linear
relationship between the measurements $z$ and $\tilde{x}$. Therefore, if $\tilde{x}$ is acquired by analyzing
the data $z$, $H_{apx}$ can be calculated by Eq. (7). In other words, both $\tilde{x}$ and $H_{apx}$ can be
produced from $z$ by the PCA or ICA approximation method, as follows (taking
the PCA approximation method as the example):
$$[H_{apx}, \tilde{x}] = \mathrm{PCA}(Z, n), \quad Z = (z_1, z_2, \ldots, z_K), \tag{8}$$
where $K$ is the total number of samples over a time period.
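One possible reading of the PCA step in Eq. (8) is sketched below. It assumes that PCA is realized by a truncated singular value decomposition of the uncentered measurement matrix, which is an assumption of this illustration; the function name `pca_approx` and the synthetic data are hypothetical as well.

```python
import numpy as np

def pca_approx(Z, n):
    """Return (H_apx, x_tilde) with Z approximately equal to H_apx @ x_tilde.
    H_apx holds the n leading left singular vectors of Z (assumed PCA variant)."""
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    H_apx = U[:, :n]                 # m x n orthonormal basis
    x_tilde = H_apx.T @ Z            # n x K component scores
    return H_apx, x_tilde

rng = np.random.default_rng(2)
m, n, K = 10, 4, 200                 # hypothetical sizes
H = rng.standard_normal((m, n))      # unknown to the adversary
Z = H @ rng.standard_normal((n, K))  # noise-free measurements for clarity

H_apx, x_tilde = pca_approx(Z, n)
recon_err = np.linalg.norm(Z - H_apx @ x_tilde) / np.linalg.norm(Z)
print("relative reconstruction error:", recon_err)
```

In this noise-free setting the columns of `H_apx` span the same subspace as the columns of `H`, even though the two matrices differ entry by entry.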
Note that the measurement data in smart grid can be eavesdropped in the form of the
node by adversaries, including injection power data of the node (bus) and power flow data of
branches directly connected to the node. Therefore, the adversaries can acquire n according
to the number of measurement data packets. Then the stealth attacks can be produced by this
new topology matrix Hapx , which are described as the following Lemma [9,10].
Lemma 1. The attack vector $a \in \mathbb{R}^m$ can be constructed in the following form:
$$a = H_{apx} c_1, \tag{9}$$
where $c_1 \in \mathbb{R}^n$ denotes an arbitrary nonzero vector. Note that $a$ is almost stealthy if the
relation $H_{apx} \approx H P_x$ holds, where $P_x \in \mathbb{R}^{n \times n}$ denotes a projection matrix from the
principal or independent components $\tilde{x}$ to the original state vector $x$, namely, $x \approx P_x \tilde{x}$.
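The statement of Lemma 1 can be checked numerically with a short sketch. It assumes an SVD-based estimate of $H_{apx}$, an identity noise covariance in the estimator, and synthetic data; none of these choices come from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, K = 10, 4, 200                       # hypothetical sizes
H = rng.standard_normal((m, n))            # true matrix, used only by the estimator
Z = H @ rng.standard_normal((n, K)) + 0.01 * rng.standard_normal((m, K))

# Adversary side: estimate the measurement subspace from data alone.
U, _, _ = np.linalg.svd(Z, full_matrices=False)
H_apx = U[:, :n]
c1 = rng.standard_normal(n)                # arbitrary nonzero vector
a = H_apx @ c1                             # attack vector of Eq. (9)

def residual(z):
    """Residue of the least squares estimator (R = I assumed)."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return z - H @ x_hat

z = Z[:, 0]
shift = np.linalg.norm(residual(z + a) - residual(z))
b = rng.standard_normal(m)                 # unstructured injection for comparison
shift_random = np.linalg.norm(residual(z + b) - residual(z))
print("residue change, data-driven attack:", shift)
print("residue change, random injection:  ", shift_random)
```

The data-driven attack changes the residue only slightly because it lies almost entirely in the column space of `H`, whereas an unstructured injection changes it substantially.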
Remark 2. Lemma 1 shows that the valid data-driven attack can be injected into smart grid
and pass the BDD successfully, which is a data-driven False Data-Injection Attack (DDFDIA)
in essence. Once the Hapx is calculated through analysis of measurement data, the attack can
be constructed with arbitrary nonzero entries of $c_1$. The goal of the data-driven attacks
in [9,10] is to find stealthy data-driven FDIAs without the topological information of the
smart grid, not to search for lower cost or optimal data-driven FDIAs. Therefore,
research on lower cost or optimal data-driven attacks building on [9,10] is necessary, since
adversaries do not have the ability to corrupt all or most of the sensor data in the smart
grid.
Thus, stealthy and sparse attacks can be constructed by solving the following problem, where $B = P - I$:
$$\min_{a} \|a\|_0 \quad \text{s.t.} \quad Ba = 0,\ a \neq 0. \tag{10}$$
Obviously, the Eq. (10) is difficult to solve since the formula denotes a non-convex problem.
Therefore, this problem can be translated into solving v by searching the null space of B [13]:
In order to lower the cost of attacks, we search for a sparser attack
vector in the null space of $B$. Generally, small entries of an attack vector can be
tolerated as long as their average energy is within the variance of the measurement noise;
injecting them is meaningless and they can be neglected. Instead, the large entries of the attack
vector $v$ are our focus. Therefore, we introduce a shrinkage operator $S_\delta$ here:
$$S_\delta(v) = \begin{cases} v, & |v| \ge \delta \\ 0, & |v| < \delta \end{cases}, \tag{12}$$
where δ is a key threshold that decides the stealth and sparsity of the attack vector, which
should be adjusted along with the tolerable noise level. M denotes the maximum value of pro-
posed attack. Thus, the random low cost DDAs based on measurement data can be constructed
as the following Algorithm.
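The shrinkage operator of Eq. (12) amounts to entrywise hard thresholding, which can be sketched as follows. The function name `shrink` and the example vector are hypothetical; the threshold value 3 is only an example.

```python
import numpy as np

def shrink(v, delta):
    """Entrywise operator of Eq. (12): keep entries with |v| >= delta, zero the rest."""
    v = np.asarray(v, dtype=float)
    return np.where(np.abs(v) >= delta, v, 0.0)

v = np.array([0.4, -5.0, 2.9, 3.0, -0.1, 12.0])   # hypothetical candidate attack
a = shrink(v, delta=3.0)
print(a)
print("nonzero entries:", np.count_nonzero(a))
```

A larger threshold gives a sparser attack vector, at the price of moving the vector further from the null space it was drawn from.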
To testify the stealth of the attack in Algorithm 1, we have the following Proposition 1 to
show that this random data-driven attack is feasible.
Proposition 1. $a = H_{apx} c_1$ if and only if $B_{apx} a = 0$, where $B_{apx} = H_{apx}(H_{apx}^T H_{apx})^{-1} H_{apx}^T - I$.
Proof: According to Theorem 3.2 in [6], $a = Hc \Leftrightarrow Ba = 0$, where $a$ is a stealthy attack.
With the relationship $H_{apx} \approx H P_x$ in Lemma 1, we have:
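Proposition 2. $a = H_{apx} c_1$ if and only if $B_{apx}^{\bar{\mathcal{A}}} a = y$, where
$B_{apx}^{\bar{\mathcal{A}}} = H_{apx}^{\bar{\mathcal{A}}} [(H_{apx}^{\bar{\mathcal{A}}})^T H_{apx}^{\bar{\mathcal{A}}}]^{-1} (H_{apx}^{\bar{\mathcal{A}}})^T - I$.
$H_{apx}^{\bar{\mathcal{A}}}$ denotes the sub-matrix of $H_{apx}$ whose column indices are not in
the set $\mathcal{A}$, and $y = B_{apx}^{\bar{\mathcal{A}}} b$.
The proof of Proposition 2 is similar to that of Proposition 1 and is based on the proof of
Lemma 2 in [6].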
Thus, the construction of the attack can be formulated as the following optimization problem:
$$\min_{a} \|a\|_1 \quad \text{s.t.} \quad y = B_{apx}^{\bar{\mathcal{A}}} a, \tag{13}$$
where $\|a\|_1$ denotes the $\ell_1$ relaxation of $\|a\|_0$ [31]. Then, problem (13) can be further reformulated
as a regressor selection problem, since the adversaries have limited resources and can tamper with
at most $k$ measurements:
$$\min_{a} \left\| y - B_{apx}^{\bar{\mathcal{A}}} a \right\|_2^2 \quad \text{s.t.} \quad \|a\|_0 \le k, \tag{14}$$
where $\| y - B_{apx}^{\bar{\mathcal{A}}} a \|_2^2$ denotes the cost function, which should be minimized to reduce the
probability of being detected. In addition, the alternating direction method of multipliers (ADMM)
[32] is employed to solve Eq. (14). We define the augmented Lagrangian
parameter $\rho$ and the maximum number of iterations $t_{max}$; Algorithm 2 is then used to
construct the sparser and targeted DDA, where the update of $a$ involves a regularized
least squares (RLS) problem. Therefore, through the input $k$, Algorithm 2 controls the
trade-off between the RLS error and sparsity, which is a feature selection problem.
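(1) Initialize: number of iterations $t = 0$; optimization variable $\beta = 0$ and dual variable
$u = 0$ in ADMM [32].
(2) $H_{apx} = \mathrm{PCA}(Z, n)$, $Z = (z_1, z_2, \ldots, z_i, \ldots, z_K)$, where $K$ denotes the maximum sampling number.
(3) Calculate $B_{apx}^{\bar{\mathcal{A}}} = H_{apx}^{\bar{\mathcal{A}}} [(H_{apx}^{\bar{\mathcal{A}}})^T H_{apx}^{\bar{\mathcal{A}}}]^{-1} (H_{apx}^{\bar{\mathcal{A}}})^T - I$.
(4) The vector $a$ is updated by ridge regression:
$$a^{t+1} = \left[ (B_{apx}^{\bar{\mathcal{A}}})^T B_{apx}^{\bar{\mathcal{A}}} + \rho I \right]^{-1} \left[ (B_{apx}^{\bar{\mathcal{A}}})^T y + \rho(\beta^t - u^t) \right]$$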
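The steps that follow the ridge update are not shown in the text above. The sketch below therefore completes the iteration with the standard ADMM heuristic for cardinality-constrained least squares (keep the $k$ largest-magnitude entries, then update the scaled dual variable), which is an assumption of this illustration and may differ from Algorithm 2. The function names and the random test data are hypothetical.

```python
import numpy as np

def keep_k_largest(v, k):
    """Project v onto k-sparse vectors by keeping its k largest-magnitude entries."""
    out = np.zeros_like(v)
    if k <= 0:
        return out
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def admm_sparse_ls(B, y, k, rho=1.8, t_max=1000):
    """Heuristic ADMM for min ||y - B a||_2^2 subject to ||a||_0 <= k."""
    p = B.shape[1]
    beta = np.zeros(p)                     # sparse copy of a
    u = np.zeros(p)                        # scaled dual variable
    M = B.T @ B + rho * np.eye(p)
    Bty = B.T @ y
    for _ in range(t_max):
        a = np.linalg.solve(M, Bty + rho * (beta - u))   # ridge update, step (4)
        beta = keep_k_largest(a + u, k)                  # assumed sparsity step
        u = u + a - beta                                 # assumed dual update
    return beta

rng = np.random.default_rng(4)
B = rng.standard_normal((12, 12))          # stand-in for the matrix in Eq. (14)
y = rng.standard_normal(12)
a_sparse = admm_sparse_ls(B, y, k=2, t_max=200)
print("nonzero entries:", np.count_nonzero(a_sparse))
```

Because the cardinality constraint is non-convex, this iteration is a heuristic without a convergence guarantee; returning `beta` ensures that the output has at most $k$ nonzero entries.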
Table 1
Characteristics of three attacks and true measurement data.
According to Algorithms 1 and 2, the essential difference between the two data-driven
attacks is that the RLCDDA is low rank while the STDDA is sparse but not low rank;
more details are given in Table 1 and its related discussion.
The powerful ability of adversaries and the vulnerability of the smart grid have been shown in
Section 3, even if the adversaries can only acquire the transmitted measurement data. Therefore,
research on countermeasures against the proposed data-driven attacks is necessary.
Similar to the data-driven attacks, and without a prior probability distribution on the grid states,
new data-driven countermeasures for measurement data recovery are proposed from the defender's side
in this section, which not only detect the attacks but also separate the injected data and
measurement noise from the true measurement data.
End while
Output: Ztrue , A
As mentioned before, the received measurement data must be collected
over a period of time to form a measurement matrix $Z_a$, which means that the early
opportunity to resist the malicious attacks is missed (e.g., [24]). On the one hand, when an observed
measurement vector $z_a^i$, $i = 1, 2, \ldots, K$, arrives, we want the new data to be processed as soon
as possible. On the other hand, as more and more measurement data are collected, the
proposed data-driven LRAM runs more and more slowly due to the growth
of $K$. Therefore, a new problem appears: how can the real-time performance of the proposed LRAM be
improved and its storage burden reduced? To solve these problems, the online framework for LRAM is proposed
as Algorithm 4. According to Algorithm 4, at most $m$ measurement vectors are
stored and processed, which is an acceptable amount of data and can still reflect the
characteristics of the data as $K$ grows.
Else
[$Z_{true}^{(i-m+1):i}$, $A^{(i-m+1):i}$] = LRAM($Z_a^{(i-m+1):i}$, $r$, $\lambda$, $\varepsilon$, $q$);
End
(3) $i = i + 1$.
(4) If the new measurement vector $z_a^i$ is received, go to step (2); Else, break.
End while
Output: $z_{true}^i$, $a^i$
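The sliding-window idea behind the online framework can be sketched as follows. The decomposition routine itself is not reproduced here: `decompose` is a hypothetical callable standing in for LRAM, and the stub used in the example simply returns its input with a zero attack matrix.

```python
from collections import deque
import numpy as np

def online_recover(stream, m, decompose):
    """Process measurement vectors one by one, keeping at most m recent vectors.
    `decompose` is a placeholder for a routine returning (Z_true, A) for a window."""
    window = deque(maxlen=m)               # older vectors are dropped automatically
    for z_a in stream:
        window.append(z_a)
        Z_a = np.column_stack(window)      # window matrix with at most m columns
        Z_true, A = decompose(Z_a)
        yield Z_true[:, -1], A[:, -1]      # recovered vector and attack at time i

widths = []

def stub_decompose(Z_a):
    """Hypothetical stand-in for LRAM: records the window width, removes nothing."""
    widths.append(Z_a.shape[1])
    return Z_a, np.zeros_like(Z_a)

stream = [np.full(3, float(i)) for i in range(7)]
outputs = list(online_recover(stream, m=3, decompose=stub_decompose))
print("window widths:", widths)
```

Bounding the window keeps the per-step cost and memory independent of the total number of samples, which is the motivation stated for the online version.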
5. Simulation experiments
Firstly, the proposed two attacks (RLCDDA and STDDA) shown in Fig. 2 are lower cost
and sparser than the DDA in [10], where the parameters in RLCDDA are set to $M = 50$, $\delta = 3$
and the parameters in STDDA are set to $k = 2$, $\rho = 1.8$, $t_{max} = 1000$, $y = B_{apx}^{\bar{\mathcal{A}}} b$
and $b = \sum_{j \in \mathcal{A}} h_j c_j$.
Moreover, the set $\mathcal{A}$ is randomly generated according to $k$. In addition, the sparsity of the
attack matrix $A$ is defined to exhibit the characteristics of attacks:
$$S_p = \frac{\|A\|_0}{m \times n}, \quad 0 \le S_p \le 1. \tag{18}$$
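The sparsity measure of Eq. (18) is the fraction of nonzero entries of the attack matrix. A minimal sketch, with a hypothetical function name and a toy matrix, is:

```python
import numpy as np

def sparsity(A):
    """Fraction of nonzero entries of A, i.e. ||A||_0 divided by the number of entries."""
    A = np.asarray(A)
    return np.count_nonzero(A) / A.size

A = np.zeros((4, 5))                 # toy attack matrix, not from the paper
A[0, 1] = 2.0
A[3, 4] = -1.0
sp = sparsity(A)
print("sparsity:", sp)
```

A value of 1 means that every entry is tampered with, while values close to 0 correspond to attacks that touch only a few measurements.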
Thus, the characteristics of the attacks and the true measurements, including the sparsity and rank
of the matrix and the maximum (max) and minimum (min) values of the matrix entries, are shown
in Table 1. According to Table 1 and Fig. 2, the sparsity of the DDA in [10] is 1, which
means that all of the measurement data are tampered with, requiring high cost and capability from
adversaries. In other words, for adversaries, the DDA in [10] is hard to launch. Conversely,
[Fig. 2 plot: two 3-D panels, (a) and (b); axes: Amplitude (MW), m, Sampling time.]
Fig. 2. (a) Random Lower Cost Data-driven Attack (RLCDDA); (b) Sparse Targeted Data-driven Attacks (STDDA).
[Fig. 3 plot: curves for RLCDDA and STDDA; horizontal axis: Sampling time (0 to 100).]
Fig. 3. The true positive rate by BDD under the proposed attacks (RLCDDA and STDDA).
the RLCDDA and STDDA are much easier to implement due to their lower cost and sparsity:
the sparsities of RLCDDA and STDDA are 0.241 and 0.037, respectively. Specifically, the rank
is the main difference between RLCDDA and STDDA: the rank of RLCDDA is 1 since
the RLCDDA depends on the update of $H_{apx}$, while the rank of STDDA is 47. Overall,
the RLCDDA is a sparse and low rank DDA, and the STDDA is a sparse but not low rank
DDA. Furthermore, to show the stealth of the two attacks proposed in the paper, we adopt the
following definition of the true positive rate $R_{tp}$ ($0 \le R_{tp} \le 1$):
$$R_{tp} = \frac{N_{hit}}{N_{hit} + N_{miss}}, \tag{19}$$
where Nhit is the number of detected attack entries by BDD, and Nmiss is the number of
undetected attack entries. The detected threshold of bad data in BDD is set in the experiment:
τ = 0.5, which is much smaller than amplitude of two attacks shown in Table 1 and Fig. 2.
In addition, Fig. 3 exhibits the Rtp under the proposed attacks (RLCDDA and STDDA).
According to the Fig. 3, we observe that the STDDA is absolutely stealthy (Rt p=0), while
the maximum value of Rtp under RLCDDA is 0.077, which means that the most elements of
the malicious attack are injected into measurement data successfully except very few elements
detected and discarded. In other words, large amounts of the undetected corrupted data are
used to complete state estimation and then mislead the control center to make erroneous
judgments and instructions.
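The true positive rate of Eq. (19) can be computed from an attack matrix and a mask of entries flagged by BDD, as in the sketch below. The function name, the toy matrices, and the convention of returning 0 when no entry is attacked are assumptions of this illustration.

```python
import numpy as np

def true_positive_rate(A, flagged):
    """R_tp = N_hit / (N_hit + N_miss) over the attacked (nonzero) entries of A."""
    attacked = np.asarray(A) != 0
    flagged = np.asarray(flagged, dtype=bool)
    n_hit = np.count_nonzero(attacked & flagged)     # attacked and detected
    n_miss = np.count_nonzero(attacked & ~flagged)   # attacked but undetected
    total = n_hit + n_miss
    return n_hit / total if total else 0.0           # assumed convention for no attack

A = np.array([[0.0, 5.0, 0.0],
              [3.0, 0.0, -4.0]])                     # toy attack matrix
flagged = np.array([[False, True, False],
                    [False, False, False]])          # toy BDD decisions
rtp = true_positive_rate(A, flagged)
print("true positive rate:", rtp)
```

A rate of 0 corresponds to an attack that is never detected, and a rate close to 0 means that only a few injected entries are caught and discarded.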
To acquire the true measurement data, the LRAM and online LRAM based on measurement
data are proposed as data recovery algorithms. The parameters of the LRAM and online
LRAM are both set to $r = \min(m, K)$, $\lambda = 8$, $\varepsilon = 10^{-6}$, $q = 3$. In addition, $E_{Dec}$ is defined as the
error of decomposition for the proposed data recovery algorithms:
$$E_{Dec} = \frac{\| Z_a - Z_{true}^t - A^t \|_F^2}{\| Z_a \|_F^2}. \tag{20}$$
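The decomposition error of Eq. (20) is a relative squared Frobenius norm, which can be evaluated as in the following sketch. The function name and the synthetic matrices are hypothetical.

```python
import numpy as np

def decomposition_error(Z_a, Z_true, A):
    """E_Dec = ||Z_a - Z_true - A||_F^2 / ||Z_a||_F^2 as in Eq. (20)."""
    num = np.linalg.norm(Z_a - Z_true - A, "fro") ** 2
    den = np.linalg.norm(Z_a, "fro") ** 2
    return num / den

rng = np.random.default_rng(5)
Z_true = rng.standard_normal((6, 8))       # toy recovered measurement matrix
A = np.zeros((6, 8))
A[2, 3] = 10.0                             # toy recovered attack matrix
Z_a = Z_true + A                           # observed data explained exactly

print("exact split:   ", decomposition_error(Z_a, Z_true, A))
print("nothing removed:", decomposition_error(Z_a, np.zeros_like(Z_a), np.zeros_like(Z_a)))
```

The error is 0 when the two recovered parts add up to the observed matrix and 1 when nothing has been explained.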
[Fig. 4 plot: $E_{Dec}$ versus Iterations (0 to 50) for Inexact-alm, LMaFit and LRAM (proposed); annotated times: 0.17803, 0.047648, 0.0022798.]
Fig. 4. EDec and decomposition time for inexact-alm, LMaFit and LRAM (off-line) under noise (SNR=1.137) and
RLCDDA.
[Fig. 5 plot: $E_{Dec}$ versus Sampling time (0 to 100) in three panels; data tips: (a) X: 99, Y: 2.805e−14; (b) X: 22, Y: 0.002039; (c) X: 100, Y: 1.187e−15.]
Fig. 5. EDec for three algorithms under noise (SNR=1.137) and RLCDDA. (a) By inexact-alm; (b) By LMaFit; (c)
By LRAM (online).
[Fig. 6 plot: Decomposition Time versus Sampling time (0 to 100) in three panels; data tips: (a) X = 100, Y = 0.0839; (b) X = 100, Y = 0.00719; (c) X = 100, Y = 0.000808.]
Fig. 6. Decomposition time for three algorithms under noise (SNR=1.137) and RLCDDA. (a) By inexact-alm; (b)
By LMaFit; (c) By LRAM (online).
that the proposed off-line and online LRAM can be applied to accomplish not only the de-
composition of the low rank measurement matrix mixed with sparse attack matrix in large
noisy environment, but also the decomposition of the low rank measurement matrix mixed
with sparse and low rank attack matrix.
[Fig. 7 plot: $E_{Dec}$ versus Iterations (0 to 50) for Inexact-alm, LMaFit and LRAM (proposed); annotated times: 0.17323, 0.048585.]
Fig. 7. EDec and decomposition time for inexact-alm, LMaFit and LRAM (off-line) under noise (SNR=1.137) and
STDDA.
[Fig. 8 plot: $E_{Dec}$ versus Sampling time (0 to 100) in three panels; data tips: (a) X: 99, Y: 2.514e−14; (b) X: 7, Y: 0.001764; (c) X: 96, Y: 1.043e−15.]
Fig. 8. EDec for three algorithms under noise (SNR=1.137) and STDDA. (a) By inexact-alm; (b) By LMaFit; (c)
By LRAM (online).
In other words, the precision of data recovery by the proposed DDC under STDDA should be
higher than that under RLCDDA in theory. Moreover, the SNR is also 1.137 in this scenario.
According to Figs. 7–9, the LRAM clearly converges faster and is more precise than
inexact-alm and LMaFit. Moreover, the off-line LRAM converges within
very few iterations. Hence, the decomposition time of the off-line LRAM is $1.734 \times 10^{-3}$ s in
Fig. 7, which is the shortest among the three off-line algorithms. Furthermore, the online
LRAM also performs well in Figs. 8 and 9. The maximum errors of decomposition
$E_{Dec}$ for the three online algorithms are $2.514 \times 10^{-14}$, $1.764 \times 10^{-3}$ and $1.043 \times 10^{-15}$, respectively.
Similarly, the decomposition times of the corrupted measurement data at 100 s (sampling time) for the
three online algorithms are 0.182 s, 0.019 s and $1.500 \times 10^{-3}$ s, respectively.
[Fig. 9 plot: Decomposition Time versus Sampling time (0 to 100) in three panels; data tips: (a) X = 100, Y = 0.182; (b) X = 100, Y = 0.019; (c) X = 100, Y = 0.0015.]
Fig. 9. Decomposition time for three algorithms under noise (SNR=1.137) and STDDA. (a) By inexact-alm; (b) By
LMaFit; (c) By LRAM (online).
Table 2
Influence of noise on the three online algorithms under RLCDDA.
Comparing Fig. 5 with Fig. 8, the maximum errors of decomposition for online LRAM
under RLCDDA and STDDA are $1.187 \times 10^{-15}$ and $1.043 \times 10^{-15}$, respectively. Therefore, because
both RLCDDA and the true measurement matrix $Z_{true}$ are low rank, the performance of
online LRAM under STDDA in the paper is slightly better than that under RLCDDA.
Table 3
Influence of noise on the three online algorithms under STDDA.
of LMaFit are always less than that of inexact-alm. Hence, inexact-alm wastes the longest
time and its performances on decomposition accuracy are always in the middle of the three
algorithms.
6. Conclusion
In this paper, two kinds of data-driven attacks (DDAs) on the state estimation of the smart
grid and the corresponding countermeasures are presented. In particular, the proposed DDAs are
launched successfully without knowledge of the system topology and parameters, using only the
noisy measurements, and they are lower in cost and sparser in view of the limited energy and
capacity of adversaries. The proposed DDAs are therefore more realistic than general FDIAs and
easier to implement. In addition, a new algorithm for measurement data recovery against
the proposed data-driven attacks is presented, which exploits the low rank of the measurement
data over time and is realized by the techniques of low rank pursuit and matrix
decomposition. In particular, gross measurement noise is taken into account. Moreover,
an online version of this algorithm is used to improve the real-time performance.
The simulation experiments on the 14-bus power system show that the constructed
DDAs are stealthy under the BDD of SE. Meanwhile, the constructed DDAs can be eliminated by the
proposed data recovery algorithms, which improves the resilience of state estimation against
attacks. It should be noted that the constructed attacks and the algorithms for measurement data
recovery are both data-driven: they are free of the state model and the topology and no longer
depend on the exact state equation. Hence, provided the proposed algorithms for measurement data
recovery are feasible and efficient, the design of a complex attack-resilient state estimator can be
avoided.
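The stealth property summarized above can be illustrated with a minimal sketch. It assumes a linear (DC) measurement model z = Hx + e and an unweighted least-squares estimator, neither of which is restated in this section; the matrix H, the sizes m and n, the noise level sigma and the helper residual_norm are hypothetical names chosen for illustration. The sketch uses the classical construction a = Hc, which requires H, rather than the data-driven construction of the paper, and only shows why an attack lying in the column space of H leaves the BDD residual unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n = 20, 5                      # hypothetical sizes: m measurements, n states
H = rng.standard_normal((m, n))   # hypothetical measurement matrix
x = rng.standard_normal(n)        # true state
sigma = 0.01                      # assumed measurement noise level
z = H @ x + sigma * rng.standard_normal(m)

def residual_norm(H, z):
    """Least-squares state estimate, then the 2-norm of the measurement residual."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return np.linalg.norm(z - H @ x_hat)

a_stealthy = H @ rng.standard_normal(n)   # attack inside the column space of H
a_random = rng.standard_normal(m)         # unstructured attack of comparable size

r_clean = residual_norm(H, z)
r_stealthy = residual_norm(H, z + a_stealthy)
r_random = residual_norm(H, z + a_random)

print(f"clean residual:    {r_clean:.4f}")
print(f"stealthy residual: {r_stealthy:.4f}")
print(f"random residual:   {r_random:.4f}")
```

Under these assumptions the stealthy residual coincides with the clean one up to rounding, so any threshold test on the residual treats both cases identically, while the unstructured attack inflates the residual and would be flagged.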
In particular, the proposed algorithms for measurement data recovery are shown to be
equally effective against the low rank and sparse attacks in the simulation experiments, but there is
no theoretical support for this situation yet, which is necessary future work. In addition, event-triggered
or self-triggered schemes such as [37–39] can reduce the communication and
computation burden of the proposed data recovery method, and should be considered in our
next work. Furthermore, applying data-driven algorithms to resist other types of attacks in CPS
is also part of our future work.
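The separation of a measurement matrix into a low rank part and a sparse attack part, on which the data recovery idea rests, can be sketched as follows. This is not the LRAM, LMaFit or inexact-alm algorithm compared in the paper; it is a plain alternating scheme in the spirit of the GoDec decomposition cited as [34], with an exact truncated SVD in place of randomized projections. The function name low_rank_sparse_split, the parameters rank and card, and the synthetic data are all hypothetical.

```python
import numpy as np

def low_rank_sparse_split(Z, rank, card, n_iter=50):
    """Alternately fit a rank-`rank` matrix L and a `card`-sparse matrix S to Z."""
    S = np.zeros_like(Z)
    history = []
    for _ in range(n_iter):
        # Best rank-`rank` approximation of Z - S via truncated SVD.
        U, s, Vt = np.linalg.svd(Z - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Keep only the `card` largest-magnitude entries of Z - L.
        R = Z - L
        flat_idx = np.argpartition(np.abs(R).ravel(), -card)[-card:]
        rows, cols = np.unravel_index(flat_idx, Z.shape)
        S = np.zeros_like(Z)
        S[rows, cols] = R[rows, cols]
        history.append(np.linalg.norm(Z - L - S))
    return L, S, history

# Synthetic example: rank-2 "true" measurements, 60 corrupted entries, small noise.
rng = np.random.default_rng(1)
Z_true = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 40))
S_true = np.zeros((30, 40))
hit_rows, hit_cols = np.unravel_index(
    rng.choice(30 * 40, size=60, replace=False), (30, 40)
)
S_true[hit_rows, hit_cols] = 5.0 * rng.choice([-1.0, 1.0], size=60)
Z = Z_true + S_true + 0.01 * rng.standard_normal((30, 40))

L_hat, S_hat, history = low_rank_sparse_split(Z, rank=2, card=60)
rel_err = np.linalg.norm(L_hat - Z_true) / np.linalg.norm(Z_true)
print(f"relative error of recovered low rank part: {rel_err:.3e}")
print(f"final fitting residual: {history[-1]:.3e}")
```

Each half-step is the exact minimizer of the fitting residual over its own block, so the residual recorded in history cannot increase from one iteration to the next. Whether the true measurements are actually recovered depends on the rank, the sparsity and the magnitude of the corruption, which is why the printed error is reported rather than assumed.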
Acknowledgment
This work was supported by the Natural Science Foundation of China [NSFC]-Guangdong
Joint Foundation Key Project [Grant no. U1401253], NSFC [Grant nos. 61573153 and
61672174], Foundation of Guangdong Provincial Science and Technology Projects [Grant
no. 2013B010401001], Fundamental Research Funds for the Central Universities [Grant no.
2015ZZ099], Guangzhou Science and Technology plan project [Grant no. 201510010132],
Maoming science and technology plan project [Grant no. MM2017000004], and the National
Natural Science Foundation of Guangdong Province [Grant no. 2016A030313510].
References
[1] G. Liang, J. Zhao, F. Luo, et al., A review of false data injection attacks against modern power systems, IEEE
Trans. Smart Grid 8 (4) (2017) 1630–1638, doi:10.1109/TSG.2015.2495133.
[2] X. Cao, L. Liu, W. Shen, A. Laha, J. Tang, Y. Cheng, Real-time misbehavior detection and mitigation in
cyber-physical systems over WLANs, IEEE Trans. Ind. Inf. 13 (1) (2017) 186–197, doi:10.1109/TII.2015.2499123.
[3] J. Liang, L. Sankar, O. Kosut, Vulnerability analysis and consequences of false data injection attack on power
system state estimation, IEEE Trans. Power Syst. 31 (5) (2016) 3864–3872, doi:10.1109/pesgm.2017.8273736.
[4] H. Fawzi, P. Tabuada, S. Diggavi, Secure estimation and control for cyber-physical systems under adversarial
attacks, IEEE Trans. Autom. Control 59 (6) (2014) 1454–1467, doi:10.1109/TAC.2014.2303233.
[5] Q. Li, B. Xu, S. Li, Y. Liu, D. Cui, Reconstruction of measurements in state estimation strategy against
deception attacks for cyber physical systems, Control Theory Technol. 16 (1) (2018) 1–13,
doi:10.1007/s11768-018-7080-y.
[6] Y. Liu, P. Ning, M.K. Reiter, False data injection attacks against state estimation in electric power grids, ACM Trans.
Inf. Syst. Secur. (TISSEC) 14 (1) (2011) 13, doi:10.1145/1952982.1952995.
[7] M.A. Rahman, H. Mohsenian-Rad, False data injection attacks with incomplete information against smart power
grids, in: Global Communications Conference (GLOBECOM), 2012 IEEE, IEEE, 2012, pp. 3153–3158,
doi:10.1109/GLOCOM.2012.6503599.
[8] X. Liu, Z. Bao, D. Lu, Z. Li, Modeling of local false data injection attacks with reduced network information,
IEEE Trans. Smart Grid 6 (4) (2017) 1686–1696, doi:10.1109/tsg.2015.2394358.
[9] M. Esmalifalak, H. Nguyen, R. Zheng, Z. Han, Stealth false data injection using independent component analysis
in smart grid, in: Proceedings of the IEEE International Conference on Smart Grid Communications, 2011,
pp. 244–248, doi:10.1109/smartgridcomm.2011.6102326.
[10] Z.H. Yu, W.L. Chin, Blind false data injection attack using PCA approximation method in smart grid, IEEE
Trans. Smart Grid 6 (3) (2015) 1219–1226, doi:10.1109/tsg.2014.2382714.
[11] J. Kim, L. Tong, R.J. Thomas, Subspace methods for data attack on state estimation: A data driven approach,
IEEE Trans. Signal Process. 63 (5) (2015) 1102–1114, doi:10.1109/tsp.2014.2385670.
[12] H. Zhang, P. Cheng, L. Shi, J. Chen, Optimal denial-of-service attack scheduling with energy constraint, IEEE
Trans. Autom. Control 60 (11) (2015) 3023–3028, doi:10.1109/TAC.2015.2409905.
[13] J. Hao, R.J. Piechocki, D. Kaleshi, W.H. Chin, Z. Fan, Sparse malicious false data injection attacks and defense
mechanisms in smart grids, IEEE Trans. Ind. Inf. 11 (5) (2017) 1–12, doi:10.1109/TII.2015.2475695.
[14] R. Deng, G. Xiao, R. Lu, Defending against false data injection attacks on power system state estimation, IEEE
Trans. Ind. Inf. 13 (1) (2017) 198–207, doi:10.1109/tii.2015.2470218.
[15] Q. Yang, J. Yang, W. Yu, D. An, N. Zhang, W. Zhao, On false data-injection attacks against power system state
estimation: modeling and countermeasures, IEEE Trans. Parallel Distr. Syst. 25 (3) (2014) 717–729,
doi:10.1109/tpds.2013.92.
[16] F. Pasqualetti, F. Dörfler, F. Bullo, Attack detection and identification in cyber-physical systems, IEEE Trans.
Autom. Control 58 (11) (2013) 2715–2729, doi:10.1109/tac.2013.2266831.
[17] S. Tan, W.Z. Song, M. Stewart, J. Yang, L. Tong, Online data integrity attacks against real-time electrical market
in smart grid, IEEE Trans. Smart Grid 9 (1) (2018) 313–322, doi:10.1109/tsg.2016.2550801.
[18] M.S. Chong, M. Wakaiki, J.P. Hespanha, Observability of linear systems under adversarial attacks, in: Proceed-
ings of the American Control Conference, 2015, pp. 2439–2444, doi:10.1109/acc.2015.7171098.
[19] Q. Hu, D. Fooladivanda, Y.H. Chang, C.J. Tomlin, Secure state estimation and control for cyber security of the
nonlinear power systems, IEEE Trans. Control Netw. Syst. PP (99) (2017) 1–1, doi:10.1109/tcns.2017.2704434.
[20] A. Wei, Y. Song, C. Wen, Adaptive cyber-physical system attack detection and reconstruction with application
to power systems, IET Control Theory Appl. 10 (12) (2016) 1458–1468, doi:10.1049/iet-cta.2015.1147.
[21] Y. Shoukry, P. Tabuada, Event-triggered state observers for sparse sensor noise/attacks, IEEE Trans. Autom.
Control 61 (8) (2016) 2079–2091, doi:10.1109/tac.2015.2492159.
[22] M.A. Sid, S. Chitraganti, K. Chabir, Medium access scheduling for input reconstruction under deception attacks, J.
Frankl. Inst. 354 (9) (2017) 3678–3689, doi:10.1016/j.jfranklin.2016.08.023.
[23] Z. Lin, M. Chen, Y. Ma, The augmented Lagrange multiplier method for exact recovery of corrupted low-rank
matrices (2010) arXiv preprint arXiv:1009.5055, doi:10.1016/j.jsb.2012.10.010.
[24] L. Liu, M. Esmalifalak, Q. Ding, V.A. Emesih, Z. Han, Detecting false data injection attacks on power grid by
sparse optimization, IEEE Trans. Smart Grid 5 (2) (2014) 612–621, doi:10.1109/tsg.2013.2284438.
[25] Y. Shen, Z. Wen, Y. Zhang, Augmented Lagrangian alternating direction method for matrix separation based on
low-rank factorization, Optim. Methods Softw. 29 (2) (2014) 239–263, doi:10.1080/10556788.2012.700713.
[26] A. Gomez-Exposito, A. Abur, Power system state estimation: theory and implementation, CRC press, 2004.
[27] A. Minot, N. Li, A fully distributed state estimation using matrix splitting methods, in: Proceedings of the
American Control Conference, 2015, pp. 2488–2493, doi:10.1109/acc.2015.7171105.
[28] M. Ozay, I. Esnaola, F.T.Y. Vural, S.R. Kulkarni, H.V. Poor, Sparse attack construction and state estimation in
the smart grid: Centralized and distributed models, IEEE J. Select. Areas Commun. 31 (7) (2013) 1306–1318,
doi:10.1109/jsac.2013.130713.
[29] L. Peng, L. Shi, X. Cao, C. Sun, Optimal attack energy allocation against remote state estimation, IEEE Trans.
Autom. Control PP (99) (2017) 1–1, doi:10.1109/tac.2017.2775344.
[30] H. Zhang, W.X. Zheng, Denial-of-service power dispatch against linear quadratic control via a fading channel,
IEEE Trans. Autom. Control (2018), doi:10.1109/tac.2018.2789479.
[31] G. Kutyniok, Theory and applications of compressed sensing, GAMM-Mitteilungen 36 (2013) 79–101,
doi:10.1002/gamm.201310005.
[32] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, Distributed optimization and statistical learning via the
alternating direction method of multipliers, Found. Trends Mach. Learn. 3 (1) (2010) 1–122,
doi:10.1561/2200000016.
[33] H. Huang, Q. Yan, Y. Zhao, W. Lu, Z. Liu, Z. Li, False data separation for data security in smart grids, Knowl.
Inf. Syst. 52 (3) (2017) 815–834, doi:10.1007/s10115-016-1019-8.
[34] T. Zhou, D. Tao, GoDec: randomized low-rank and sparse matrix decomposition in noisy case, in: Proceedings
of the International Conference on Machine Learning, Omnipress, 2011.
[35] T. Zhou, D. Tao, Greedy bilateral sketch, completion and smoothing, in: Proceedings of the International Con-
ference on Artificial Intelligence and Statistics, JMLR. org, 2013.
[36] R.D. Zimmerman, C.E. Murillo-Sanchez, R.J. Thomas, Matpower: Steady-state operations, planning, and analysis
tools for power systems research and education, IEEE Trans. Power Syst. 26 (1) (2011) 12–19,
doi:10.1109/tpwrs.2010.2051168.
[37] H. Yan, H. Zhang, F. Yang, X. Zhan, C. Peng, Event-triggered asynchronous guaranteed cost control for Markov
jump discrete-time neural networks with distributed delay and channel fading, IEEE Trans. Neural Netw. Learn.
Syst. PP (99) (2017) 1–11, doi:10.1109/TNNLS.2017.2732240.
[38] H. Li, W. Yan, Y. Shi, Triggering and control co-design in self-triggered model predictive control of constrained
systems: with guaranteed performance, IEEE Trans. Autom. Control PP (99) (2018) 1–1,
doi:10.1109/TAC.2018.2810514.
[39] J. Liu, J. Xia, E. Tian, S. Fei, Hybrid-driven-based H∞ filter design for neural networks subject to deception
attacks, Appl. Math. Comput. 320 (2018) 158–174, doi:10.1016/j.amc.2017.09.007.