Detecting False Data Injection Attacks in Smart Grids: A Semi-Supervised Deep Learning Approach

IEEE TRANSACTIONS ON SMART GRID, VOL. 12, NO. 1, JANUARY 2021, p. 623
Ying Zhang, Student Member, IEEE, Jianhui Wang, Senior Member, IEEE, and Bo Chen, Member, IEEE
Abstract—The dependence on advanced information and communication technology increases the vulnerability of smart grids to cyber-attacks. Recent research on unobservable false data injection attacks (FDIAs) reveals a high risk to secure system operation, since these attacks can bypass current bad data detection mechanisms. To mitigate this risk, this paper proposes a data-driven learning-based algorithm for detecting unobservable FDIAs in distribution systems. We use autoencoders for efficient dimension reduction and feature extraction of measurement datasets. Further, we integrate the autoencoders into an advanced generative adversarial network (GAN) framework, which successfully detects anomalies under FDIAs by capturing the unconformity between abnormal and secure measurements. Also, considering that the datasets collected from practical power systems are partially labeled due to expensive labeling costs and missing labels, the proposed method only requires a few labeled measurement data in addition to unlabeled data for training. Numerical simulations in three-phase unbalanced IEEE 13-bus and 123-bus distribution systems validate the detection accuracy and efficiency of this method.

Index Terms—Cyberattack detection, false data injection attacks, generative adversarial networks, deep learning, state estimation, phasor measurement units.

ABBREVIATION
PMU Phasor measurement unit
FDIA False data injection attack
BDD Bad data detection
DBN Deep belief network
SVM Support vector machine
DNN Deep neural network
PCA Principal component analysis
GAN Generative adversarial network
AAE Adversarial autoencoder
WLS Weighted least square
DSSE Distribution system state estimation
DG Distributed generation
LNR Largest normalized residual
SGD Stochastic gradient descent
ReLU Rectified linear unit
GPU Graphics processing unit
S3VM Semi-supervised support vector machine
SS-AE Semi-supervised autoencoders
AC Alternating current
DC Direct current

Manuscript received November 6, 2019; revised May 20, 2020; accepted July 10, 2020. Date of publication July 20, 2020; date of current version December 21, 2020. Paper no. TSG-01691-2019. (Corresponding author: Jianhui Wang.)
Ying Zhang and Jianhui Wang are with the Department of Electrical and Computer Engineering, Southern Methodist University, Dallas, TX 75275 USA (e-mail: yzhang1@smu.edu; jianhui@smu.edu).
Bo Chen is with the Energy Systems Division, Argonne National Laboratory, Lemont, IL 60439 USA (e-mail: bo.chen@anl.gov).
Color versions of one or more of the figures in this article are available online at https://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSG.2020.3010510

I. INTRODUCTION
POWER distribution systems are transforming into smart grids with the development of advanced communication devices, such as PMUs and smart meters, which facilitate system monitoring and control [1], [2]. However, the high dependence on information technology also increases vulnerability to malicious cyber-attacks [3]. Among common attacks in cyber-physical systems, FDIAs are regarded as one of the most challenging threats against secure system operation. An unobservable FDIA can circumvent the conventional BDD mechanism based on measurement residuals of state estimation. Without the aid of an effective detection mechanism, attackers can stealthily launch the FDIA multiple times, which degrades the performance of the state estimation algorithm and may pose a significant threat to the grids [4].

A great deal of research effort is being devoted to investigating the construction and defense mechanisms of FDIAs since Liu et al. proposed that an attacker can launch FDIAs against state estimation to avoid being detected by the estimation residual based BDD methods [5]. Some research on FDIA construction is reported in different application scenarios in DC power systems, while recent work in AC transmission systems emerges due to their relatively accurate analytical models [6], [7]. Liang et al. [8] conducted a comprehensive survey on construction methods for FDIAs. On the other hand, many results using various statistical and probabilistic techniques are reported to defend against FDIAs in DC system state estimation, such as sparse optimization [9] and the Kalman filter [10]. However, these methods require information on measurement data distributions and system operation states, and once these prerequisites change, detection of FDIAs may become ineffective and outdated.

Recently, with the fast development of advanced metering infrastructure that collects a massive volume of data, machine-learning and data-driven techniques are being widely applied to power system operation because of their powerful capability of extracting useful information and flexible extensibility [11], [12]. Also, various learning-based techniques for detecting FDIAs in transmission systems have emerged,
including DBN [13], SVM [14], [15], and DNN [16]. For instance, He et al. [13] proposed a conditional DBN-based method with a restricted Boltzmann machine for detecting compromised data in DC power systems. The authors of [14] proposed a semi-supervised learning approach based on mixture Gaussian distribution and SVM for detecting FDIAs against state estimation in DC transmission systems, and since a linear system model is used to generate the measurement data, PCA is used for extracting the data feature. However, real-world utilities widely employ AC power system models, and these algorithms, such as [9], [13]–[15], performed on DC power systems, ignore the complexity of power systems or the sophistication of attacks that are unobservable by the conventional BDD mechanism. To overcome this deficiency when dealing with unobservable attacks in AC transmission systems, [16] uses wavelet transform and DNN techniques to capture the inconsistency of abnormal and normal measurements by analyzing the state dynamics. Nevertheless, the method in [16] requires measurements with labels from continuous samplings that may be unavailable in practical operation and leads to a high computational burden. Note that most existing machine-learning algorithms for detecting FDIAs such as [13] and [16] are supervised and test the abnormal data that differ in some manner from the labeled data available during training. However, the datasets collected from practical cyber-physical systems are partially labeled due to expensive labeling costs [17]. Moreover, the scale of unlabeled data is usually much larger than that of the labeled data in practice, and these extensive unlabeled data seldom take part in the supervised learning process. This absence leads to the loss of useful information, and even to the failure of this process.

In contrast to the work in transmission systems, there are only a handful of studies related to FDIAs at the distribution level, although the vulnerability of distribution systems has been discovered over the years [8]. For instance, Dai et al. [18] presented two simple yet powerful cyber-attack methods targeting feeder automation and introduced a search theory-based method for modeling the probability of feeders being attacked. Deng et al. [19] proposed an FDIA model with limited knowledge of system states, which exposes the feasibility of attacks without being detected by the current BDD mechanism. Then, they extended this work focusing on balanced networks to unbalanced distribution systems in [20]; these systems are more consistent with practical models. Motivated by these studies on constructing unobservable FDIAs in distribution systems, reliable system operation urgently demands countermeasures against these FDIAs.

We conclude that the existing BDD or cyberattack detection solutions have the following limitations.
1) As we mentioned, the traditional residual based BDD methods represented by the LNR test cannot find unobservable FDIAs such as those in [5], [19], and [20].
2) Other model-based detection methods have poor extensibility, resulting from two aspects: high dependence on specific information of the topology and line parameters, and unpredictable attack locations and magnitudes due to the hackers' subjective activities.
3) The model-free methods usually use supervised learning, which is trained on numerous labeled data. In real-world power systems, the labeled measurement data are difficult or expensive to obtain [17], [21], and some newly emerging cyberattacks are naturally unlabeled. Moreover, it is not easy to update the trained model from supervised learning. As a result, the recent focus of machine learning is moving to generative models [22]–[24] and semi-supervised learning [21], [25]. For instance, in power system operation research, [24] realizes missing data discovery by GAN.
4) Distribution system operation poses higher requirements on the learning-based detection methods. Practical distribution systems are characterized by less labeled data with lower-level measurement accuracy and more complex nonlinear relationships (including node-to-node and phase-to-phase) in the dataset. Such large-scale distribution systems call for a sophisticated learning algorithm for effective monitoring.

Confronted with the vulnerability of cyber-physical systems, this paper proposes an AAE-based detection algorithm for unobservable FDIAs in distribution systems. Considering the high dimensionality and nonlinear correlated nature of measurements, we apply autoencoders to dimension reduction and feature extraction of measurement datasets in the three-phase unbalanced networks. Further, we integrate the autoencoders into an advanced GAN framework [22], which successfully detects abnormal measurements under FDIAs by capturing the unconformity between anomalies and secure measurements. Also, because of the expensive labeling costs and potential missing labeled data in practical systems, this method only requires unlabeled data and a few labeled data from measuring instruments by leveraging the powerful generation capability of GAN and thus is semi-supervised learning. The main contributions of the proposed method are listed:
• This paper presents a novel learning-based FDIA detection algorithm for unobservable attacks or outliers that bypass the conventional BDD mechanism. This method enables the detection of these attacks within milliseconds and thus can be implemented online.
• In contrast to supervised learning, the proposed semi-supervised detection method only requires a limited number of labeled data to detect the attacked measurement data. Specifically, with as few as 1,000 labeled training data, this method self-learns with an accurate detection ability.
• The proposed algorithm is fully data-driven and thus extensible and does not depend on the information of network topology and parameters in distribution systems.
• We investigate the impact of the amount of the labeled data, and the effective detection performance of the proposed method can be maintained with only 2% of the labeled data under FDIAs in a training dataset. In other words, the proposed method can use the few to detect the many, thanks to the generative models.

II. STATE ESTIMATION AND FDIA
A. State Estimation
In classical state estimators [1], the relationship between redundant measurements and state variables is depicted as:
$$z = h(x) + e \tag{1}$$
where $x$ is an $N$-dimension state vector, and $z$ is an $M$-dimension measurement vector; $h(x)$ is the measurement function about $x$; the measurement noise vector $e$ obeys a Gaussian distribution $e \sim \mathcal{N}(0, R)$; $R$ is a covariance matrix and is usually considered diagonal (for instance, see [20]), $R = \mathrm{diag}[\sigma_1^2, \sigma_2^2, \ldots, \sigma_M^2]$, and $\sigma_i^2$ denotes the variance of the $i$th measurement error, $i = 1, 2, \ldots, M$.

The state variables are obtained via a WLS criterion that minimizes the sum of weighted measurement residuals $J$ as:
$$\hat{x} = \arg\min J = \arg\min\, [z - h(x)]^T W [z - h(x)] \tag{2}$$
where $W$ is a weight matrix of measurements to quantify the trust levels of diverse measurements, and $W = R^{-1}$.

Optimal estimated states are solved iteratively by the Gauss-Newton method until each component of the vector $\Delta x$ at each iteration is sufficiently small:
$$\partial J / \partial x = H(x)^T W [z - h(x)] = 0 \tag{3}$$
$$H(x)^T W H(x)\, \Delta x = H(x)^T W [z - h(x)] \tag{4}$$
$$x^{(t+1)} = x^{(t)} + \Delta x \tag{5}$$
where $H(x)$ is the Jacobian matrix of the measurement function with respect to $x$, and $H(x) = \partial h(x) / \partial x$.

B. DSSE and BDD
In three-phase unbalanced distribution systems, the system states are chosen as the voltage phasors at all buses and expressed as
$$x = [v_1^a, v_1^b, \ldots, v_n^c] \tag{6}$$
where $v_j^{\phi}$ denotes the voltage phasor at bus $j$, $\phi \in \{a, b, c\}$ and $j = 1, \ldots, n$; $n$ is the number of buses in the system.

The measurement vector, $z$, in DSSE generally includes 1) voltage and current phasors from distribution-level PMUs, 2) power flows recorded by supervisory control and data acquisition systems, and 3) power injections from smart meters or pseudo-measurements, including load consumption and DG [26]. The detailed formulation of the measurement functions $h(x)$ can be found in the Appendix, and the Jacobian matrix can be found in [20] and references therein. The DSSE model is nonlinear since PMUs are not installed at each node in a practical distribution system [1]. Then, the DSSE model is solved iteratively based on (3)-(5).

Considering the sampling errors of various measuring instruments and potential malicious cyberattacks, the conventional BDD mechanism of state estimation applies the LNR test [8]. The LNR test calculates the measurement residual $r$ and the normalized measurement residual vector $r^N$ to test the existence of bad measurements:
$$r^N = \frac{|r|}{\sqrt{\mathrm{diag}(SR)}} \tag{7}$$
where $r = z - h(\hat{x})$; $S$ represents the measurement sensitivity matrix of this estimator, and $S = I - H(H^T W H)^{-1} H^T W$. The LNR test can be used to detect the presence of bad data due to malicious cyber-attacks, faulty sensors, or topological errors, expressed as:
$$\max_i\, r_i^N \underset{H_0}{\overset{H_1}{\gtrless}} \lambda \tag{8}$$
where $r_i^N$ denotes the $i$th element of $r^N$, and $i = 1, 2, \ldots, M$; $H_0$ and $H_1$ represent the hypotheses without and with bad data or false data injection, respectively, and the threshold $\lambda$ is set to some predetermined significance level.

C. Unobservable FDIA
The objective of FDIAs is to mislead system operators to consider $x_a = x + c$ as the estimated state vector, where $c$ is the deviation from the normal system states $x$ [5]. The attackers can manipulate the received measurements at a control center into $z_a = z + a$, where $a$ is an injected attack vector. Also, the measurement residual vector of $z_a$ is expressed as
$$r_a = z_a - h(x + c) = z + a - h(x + c). \tag{9}$$
To circumvent the detection method in (7), the attack vector should be constructed as
$$a = h(x + c) - h(x). \tag{10}$$
The attack can bypass the residual-based detection if the normalized residual vector $r_a^N$ satisfies the condition
$$\max_i\, r_{a,i}^N \le \lambda. \tag{11}$$
In this paper, we refer to the attack as an unobservable FDIA with a well-structured attack vector $a$ that enforces (11). By launching such an FDIA, the attacker can inject estimation errors without being detected by the conventional LNR test.

III. CONSTRUCTION OF FDIA
To evaluate the FDIA detection capability, this section introduces an unobservable attack construction method in distribution systems. Assuming that an attacker has limited ability to hack into meters, they can use the attack method proposed in [20] for constructing FDIAs without paying high calculation costs.

Emerging research indicates that approximately linear models for state estimation in distribution systems exist and make it convenient for hackers to launch unobservable FDIAs, e.g., [19]. Specifically, the linear formulas in these DSSE methods approximate the nodal voltages in equivalent current measurements as the substation voltage $V^{\mathrm{slack}}$. This approximation originates from two observations for distribution systems: 1) the voltage magnitudes are close to each other, i.e., 0.95 ∼ 1.05 p.u., and 2) the voltage phase angle differences are very small, such as 0.1 degrees per mile [26]. Take the three-phase power flow measurement at branch $i$-$j$, $S_{ij}$, as an example: convert the power measurement into an equivalent current measurement by $I_{ij\_eq} \approx (S_{ij} / V^{\mathrm{slack}})^*$, and then the measurement function can be expressed as
$$h(x) = Y_{ij}(v_i - v_j) \approx I_{ij\_eq} \tag{12}$$
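The size of the error introduced by the equivalent-current approximation in (12) can be checked numerically. The following is a minimal sketch under stated assumptions, not the paper's three-phase model: it uses a single-phase, per-unit branch, and the voltages, the admittance `y_ij`, and the helper `equivalent_current` are hypothetical values and names chosen only to fall inside the ranges quoted above.

```python
import cmath
import math


def equivalent_current(s_ij, v_slack):
    """Return the equivalent current (S_ij / V_slack)^* used in (12)."""
    return (s_ij / v_slack).conjugate()


# Hypothetical single-phase, per-unit values (assumptions for illustration only).
v_slack = complex(1.0, 0.0)                   # substation (slack) voltage
v_i = cmath.rect(0.97, math.radians(-0.5))    # sending-end voltage, inside 0.95-1.05 p.u.
v_j = cmath.rect(0.965, math.radians(-0.6))   # receiving-end voltage
y_ij = 1 / complex(0.02, 0.04)                # assumed series admittance of branch i-j

i_true = y_ij * (v_i - v_j)        # exact branch current Y_ij (v_i - v_j)
s_ij = v_i * i_true.conjugate()    # power flow a meter at node i would report

# Replace the nodal voltage by the slack voltage, as the linear DSSE model does.
i_eq = equivalent_current(s_ij, v_slack)

# With v_slack = 1 p.u., i_eq = conj(v_i) * i_true, so the relative error
# depends only on how far v_i is from the slack voltage.
rel_err = abs(i_eq - i_true) / abs(i_true)
print(f"relative error of the equivalent current: {rel_err:.3%}")
```

In this toy setting the error is governed entirely by the gap between the nodal voltage and the slack voltage, which is why the two observations about voltage magnitudes and angles are what make the linear form usable.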
In (12), $Y_{ij} \in \mathbb{C}^{3\times 3}$ denotes the nodal admittance between nodes $i$ and $j$, which is constant, and hence, $h(x)$ is linear about the state variables $x = \{v_i, v_j\}$; $[\cdot]^*$ denotes the complex conjugate. More details of the linear approximation can be found in Appendix B. Then, due to the existence of the nearly linear relationship in DSSE, the estimator (1) can be expressed in a linear form as
$$z = H\hat{x} + e \tag{13}$$
where $z$ and $\hat{x}$ denote the measurement and the closed-form estimated vector.

When the attack vector $a$ is injected, the compromised measurement residual $r_a$ can be expressed as [20]
$$r_a = z_a - H\hat{x}_a = z + a - H\left[\hat{x} + (H^T W H)^{-1} H^T W a\right] \tag{14}$$
If $a = Hc$, the compromised measurement residual $r_a$ after the attack is the same as the measurement residual $r$ before the attack, as follows.
$$r_a = z - H\hat{x} + Hc - H(H^T W H)^{-1} H^T W H c = z - H\hat{x} = r \tag{15}$$
If the residual $r$ can bypass the LNR test, the compromised residual $r_a$ with malicious data can also bypass this test by (11). Furthermore, to construct an attack vector that meets $a = Hc$, let $A = H(H^T H)^{-1} H^T$ and $A \in \mathbb{R}^{M\times M}$, and solve $a$ by the following equation:
$$Aa = AHc \leftrightarrow Aa = a \leftrightarrow Ba = 0 \tag{16}$$
where $B = A - I$ and $I$ is an identity matrix. Assume that the maximum number of measurements that hackers can compromise is $K$, and express the attack vector as $a = [0, a_{i_1}, \ldots, 0, a_{i_k}, \ldots, 0]^T$ with $0 < k \le K$. Also, the elements $a_i$ with $i \in \{i_1, \ldots, i_k\}$ are the unknown variables to solve. Then $Ba = 0$ is equivalent to $B'a' = 0$, where $a'$ is the $k$-dimension vector that removes the zero-value elements in $a$, and $B'$ is the matrix that removes the corresponding columns at the locations related to these zero-value elements in $a$, $B' \in \mathbb{R}^{M\times k}$. If $\mathrm{rank}(B') < k$, there is at least one non-zero solution of $B'a' = 0$. Moreover, $a'$ can be obtained by $a' = (I - B'^{+} B')d$, where $B'^{+}$ is the pseudo inverse of $B'$, and $d$ is an arbitrary non-zero vector, $d \in \mathbb{R}^{k\times 1}$.

Attackers can successfully launch unobservable attacks by the attack model (16). Therefore, the mechanism for detecting the unobservable FDIAs demands an effective solution.

IV. PROPOSED DETECTION MECHANISM
This section proposes a deep learning mechanism for detecting unobservable FDIAs in three-phase distribution systems; Fig. 1 provides an overview of the proposed detection mechanism. We define the detection problem for unobservable FDIAs or outliers represented by the attack vector $a$ as a binary classification problem with the detection indicator $\alpha$:
$$\alpha = \begin{cases} 0 & \text{if } a = 0 \\ 1 & \text{if } a \neq 0 \end{cases} \tag{17}$$

Fig. 1. An overview of the proposed FDIA detection mechanism.

In the AAE-based detection algorithm, the measurement vectors $z$, including three-phase voltages, currents, and powers, are collected as the inputs of the AAE, and only a limited number of them are labeled with $\alpha = 0$ or 1. Compared with the conventional LNR test, the learning-based detection method is fully data-driven and does not require knowledge of the system topology and parameters, i.e., $h(x)$. Furthermore, we train the AAE network to extract the node-to-node and phase-to-phase features of normal and abnormal data, and then detect the presence of an unobservable FDIA. To evaluate the detection performance, we generate unobservable attacks by the method in Section III, while the normal dataset comes from the nonlinear distribution system model in Appendix A.

A. Autoencoders and GAN
1) Autoencoders: Autoencoders are widely used for dimension reduction and feature extraction of high-dimensional and correlated data [25]. Fig. 2 shows a classical structure of autoencoders, including an encoder and a decoder. Specifically, the autoencoders learn a mapping from an input $X$ to a hidden code $Y$, and the mapping is parameterized as $q(Y|X) = q(Y|X; \theta)$ with the parameters $\theta$ that we want to learn.

Define $p(X)$ as the prior distribution and $p(Y)$ as the prior distribution that we want to impose on the code. The encoding function of the autoencoders $q(Y|X)$ defines an aggregated posterior distribution $q(Y)$ on the hidden code vector as
$$q(Y) = \int_X q(Y|X)\, p_d(X)\, dX \tag{18}$$
where $p_d(X)$ denotes the data distribution of $X$.

Encoder: The mapping $f_\theta$ transforms the input $X$ into $Y$ and is expressed as
$$f_\theta(X) = s(wX_j + b) \tag{19}$$
where $w$ and $b$ denote the weight matrix and offset vector, respectively, and $\theta = \{w, b\}$, $w \in \mathbb{R}^{h\times M}$, and $b \in \mathbb{R}^{h\times 1}$; $h$
denotes the number of the hidden units; $X_j \in \mathbb{R}^{M\times 1}$ is the $j$th vector of the input samples $X$, $j \in \{1, 2, \ldots, N_d\}$, and $X \in \mathbb{R}^{M\times N_d}$, where $N_d$ is the number of these samples; $s(\cdot)$ denotes the squashing nonlinearity of the neural network.

Decoder: The hidden code $Y$ is then mapped back to a reconstruction $\hat{X}$ in the input space, i.e., $\hat{X} = g_{\theta'}(Y)$. This mapping $g_{\theta'}$ is called the decoder, and based on the parameters $\theta' = \{w', b'\}$, it is shown as
$$g_{\theta'}(Y) = s(w'Y + b'). \tag{20}$$
The autoencoders optimize the network parameters by minimizing the mean square error between $X$ and $\hat{X}$ as the reconstruction cost, $L_R$ [25]:
$$\arg\min_{\theta, \theta'} L_R = \frac{1}{N_d}\left\|X - \hat{X}\right\|^2 \tag{21}$$
The parameters $\theta$ and $\theta'$ are usually updated by backpropagation with SGD in the training process. The convergence proof of autoencoders can be found in [27].

Fig. 2. Structure of autoencoders with three fully connected hidden layers.

2) GAN: GAN establishes a min-max adversarial game between two neural networks, a generator, $G$, and a discriminator, $D$, as shown in Fig. 3. The generator produces the measurement data samples (fake samples) that follow the distribution of the original training data (real samples), while the discriminator distinguishes between the generated data samples and these real samples. In a nutshell, GAN is alternately trained in two stages: a) update the discriminator with fixed generator parameters to distinguish the real samples from the generated samples, and b) update the generator with the fixed discriminator parameters to fool the discriminator with its generated fake samples. The solution of the two-player game is globally optimal, and [22] provides the proof of the optimality and convergence analysis.

Fig. 3. The learning process of GAN.

Because of limited labeled measurements available for training in practical power systems, we use GAN to aid the autoencoders in shaping the hidden code for accurately detecting FDIAs, which is detailed later.

B. Adversarial Autoencoder
The structure of AAE and its training procedure for FDIA detection are introduced. Here, the input $X$ is a measurement dataset with $P$ labeled samples $\{(z_1, \alpha_1), (z_2, \alpha_2), \ldots, (z_P, \alpha_P)\}$ and $Q$ unlabeled samples $\{z_{P+1}, z_{P+2}, \ldots, z_{P+Q}\}$, where $\alpha_p = 0$ or 1 denotes the label of the $p$th set of measurements, $p = 1, \ldots, P$ and $P \ll Q$. As the inputs of AAE, $X \in \mathbb{R}^{M\times N_d}$, where $M$ is the number of measurements in (1) and here the number of the samples $N_d = P + Q$. Each sample of $z$ is further represented in the neural networks by $X_j$, $j \in \{1, 2, \ldots, N_d\}$, shown in (19).

In AAE, the encoder learns how to encode given data into a prior distribution, while the decoder learns a deep generative model that matches the aggregated posterior distribution of the hidden representation from the encoder to an arbitrary prior distribution. Fig. 4 shows the combination and division of work of the autoencoders and GAN in the attack detection task.

In Fig. 4, $Y_l$ and $Y_u$ denote the hidden representation for the labeled and unlabeled inputs, and $q_\theta(Y_u, Y_l | X)$ and $p_{\theta'}(X | Y_u, Y_l)$ denote the encoder and decoder in this semi-supervised learning, respectively. Assume the data generated by $Y_l$ and $Y_u$, named $\tilde{Y}_l$ and $\tilde{Y}_u$, follow a two-dimensional categorical distribution for the binary classification problem in (17) and Gaussian distributions, respectively, i.e., $\tilde{Y}_l \sim \mathrm{Cat}(2)$ and $\tilde{Y}_u \sim \mathcal{N}(0, I)$. Here, adding the Gaussian noise is to stabilize the GAN training [22]. Also, we assume that the aggregated posteriors $p(Y_l)$ and $p(Y_u)$ obey Gaussian distributions. To match the aggregated posterior to the prior distributions of the mixture data, the encoder $q_\theta(Y_u, Y_l | X)$ works as the generator of GAN. In the meantime, the adversarial network has two discriminators, $D_{cat}$ and $D_{gauss}$, for the labeled and unlabeled inputs, respectively.

We train the AAE network in three stages: the reconstruction phase, the adversarial phase, and the supervised phase [28]. The batch normalization technique is used to improve the training speed, performance, and stability of neural networks. We introduce the training procedure of AAE below and show it in pseudo-code.

1) Reconstruction Phase: The AAE detector first works as traditional autoencoders in this phase, shown in Fig. 4(a), and both the encoder and decoder are trained to minimize the total reconstruction loss, $L_R$, for the labeled and unlabeled inputs $X$ as in (21).

2) Adversarial Phase: In this phase, the encoder $q_\theta(Y_u, Y_l | X)$ is reserved for training the discriminators and
generator, illustrated in Fig. 4(b). GAN updates the discriminators $D_{cat}$ and $D_{gauss}$ to distinguish the true samples of the categorical and Gaussian priors from the generated samples. Here, the goal of a discriminator is to maximize the probability that $Y_l$ or $Y_u$ comes from the generated data rather than from the true sample distribution, i.e., its confidence score. Hence, we formulate the loss function of $D_{cat}$ for the labeled data as
$$\max_{D_{cat}} V(D_{cat}, G) = E_{\tilde{Y}_l \sim \mathrm{Cat}(2)}\left[\log D_{cat}(\tilde{Y}_l)\right] + E_{Y_l \sim p(Y_l)}\left[\log\left(1 - D_{cat}(G(Y_l))\right)\right] \tag{22}$$
where $E_{\tilde{Y}_l}$ and $E_{Y_l}$ denote the expectations under the corresponding distributions.

Then, we express the loss function of the generator $G$ as $\max_G V(D_{cat}, G) = E_{Y_l \sim p(Y_l)}[\log D_{cat}(G(Y_l))]$, which is equivalently written as
$$\min_G V(D_{cat}, G) = E_{Y_l \sim p(Y_l)}\left[\log\left(1 - D_{cat}(G(Y_l))\right)\right]. \tag{23}$$
A two-player minimax game for the labeled data is formulated by combining (23) with (22) as
$$\min_G \max_{D_{cat}}\; E_{\tilde{Y}_l \sim \mathrm{Cat}(2)}\left[\log D_{cat}(\tilde{Y}_l)\right] + E_{Y_l \sim p(Y_l)}\left[\log\left(1 - D_{cat}(G(Y_l))\right)\right] \tag{24}$$
Similarly, the objective function for the unlabeled data is expressed as follows:
$$\min_G \max_{D_{gauss}}\; E_{\tilde{Y}_u \sim \mathcal{N}(\mu, \Sigma)}\left[\log D_{gauss}(\tilde{Y}_u)\right] + E_{Y_u \sim p(Y_u)}\left[\log\left(1 - D_{gauss}(G(Y_u))\right)\right]. \tag{25}$$

3) Supervised Phase: Using only the labeled data, the autoencoders continue to update the encoder network, shown in Fig. 4(c). Train the encoder for the labeled data by minimizing the cross-entropy as the supervised cost by
$$\min_\theta L_S = E_{q(Y_l)}\left[-\log p(Y_l)\right] \tag{26}$$
where the aggregated posterior distribution $q(Y_l)$ is calculated by (18), and $p(Y_l)$ is the distribution of $Y_l$ inherited from the results in the adversarial phase.

During this training, we use the Adam optimization technique to compute adaptive learning rates for each parameter. Adam is straightforward to implement, computationally efficient, and has low memory requirements [29]. This technique is widely used as a replacement for SGD in the application research of GAN [22], especially for the optimization of objective functions with high-dimensional parameter spaces. The pseudo-code of Adam and its hyperparameters can be found in [29].

Fig. 4. Semi-supervised AAE architecture: (a) in reconstruction phase, (b) in adversarial phase, and (c) in supervised phase.

Algorithm 1 AAE Training Process
1: Input: Learning rate $\gamma$, mini-batch size, and the number of epochs.
2: for $t = 1$ to $N_{ep}$ do
3:   Sample mini-batch.
4:   Reconstruction Phase
5:   Calculate $L_R$ by (21) and update $\theta$ and $\theta'$ by descending the gradients:
       $f_\theta \leftarrow \nabla_\theta L_R$, $\theta \leftarrow \theta - \gamma\,\mathrm{Adam}(f_\theta)$
       $g_{\theta'} \leftarrow \nabla_{\theta'} L_R$, $\theta' \leftarrow \theta' - \gamma\,\mathrm{Adam}(g_{\theta'})$
6:   Adversarial Phase
       Obtain the hidden representation of the encoder $q_\theta(Y_u, Y_l | X)$, sample from the prior distributions, and calculate the confidence scores of $D_{gauss}$ and $D_{cat}$.
       Discriminator: Train the discriminators to update their network parameters while fixing the generator parameters.
       Generator: Update $q_\theta(Y_u, Y_l | X)$ as the generator with the fixed discriminator parameters.
7:   Supervised Phase: Using only the labeled data, update the encoder to minimize $L_S$ by (26).
8: end for
9: Output: the encoder $q_\theta(\cdot)$.

V. CASE STUDY
We test the proposed AAE-based algorithm on three-phase unbalanced benchmarks: IEEE 13-bus and 123-bus distribution systems [30]. These systems are modified by adding DG units; more details about the location and types of these DG units are provided in [26]. Fig. 5 shows the unbalanced 13-bus network, and the measurement arrangement of these systems is listed in Table I. The true values of measurements and states
are obtained by running a power flow program and DSSE in MATLAB, and the proposed AAE-based algorithm runs in Python. Measurements with noise consist of voltage phasors, current phasors, and complex powers, and the measurement noise obeys Gaussian distributions [31]. Specifically, the maximum meter noises of PMUs [32] are 1% of the true values for voltage/current magnitudes and 0.01 rad for the phase angles, and we assume that a PMU measures the nodal voltage and the currents at the branches connected to this bus; the measurement errors of power data at limited branches and all load/DG nodes from smart meters are 3% of the true values [33].

Fig. 5. The unbalanced 13-bus network.

TABLE I
MEASUREMENT LOCATIONS IN TEST SYSTEM

Dataset Structure: The input of the proposed AAE detector is the collection of the measurement vector $z$. In the modified 13-bus system, there are 11 nodes by closing the switch installed at the branch 671-692; the state vector, $x \in \mathbb{R}^{66\times 1}$, is composed of the three-phase voltage magnitudes and voltage phase angles, and 17 measurement phasors in Table I produce a measurement vector, $z \in \mathbb{R}^{102\times 1}$. In the modified 123-bus system, there are 119 nodes due to three normally closed switches and one normally open switch; $x \in \mathbb{R}^{714\times 1}$ and $z \in \mathbb{R}^{870\times 1}$. We record 5,000 sets of measurements from Monte Carlo simulations and generate another 5,000 sets of measurements under unobservable FDIAs by the attack construction methods proposed in Section III and [5]. In the training process, 80% of measurements are chosen as the training dataset and the rest are used for evaluating detection performance. Further, for semi-supervised learning, we label 1,000 sets of measurements with a ratio of 1:1 as the secure and attacked data. These secure and attacked data are fed to the proposed AAE detector for offline training.

DNN Specification: The learning rate is chosen as 0.0001, and the number of epochs is 400. The encoder, decoder, and discriminators have two layers of 1,000 hidden units with a ReLU activation function, and a sigmoid activation function is used in the output layer of the autoencoder. $Y_l$ and $Y_u$ are two-dimensional for the binary classification problem. Adam is used to train these neural networks with mini-batches of 64 samples for optimizing all the loss functions.

A. Unobservable FDIA
We investigate the detection performance of the conventional BDD method under unobservable attacks to show its insufficiency. Assuming that an attacker has access only to at most $K$ measurements from half the number of all meters, we randomly choose $k$ in $(0, K]$ to generate a $k$-sparse attack vector $a$ in these systems. Also, the DSSE method in Section III runs 5,000 Monte Carlo simulations under no attacks, and we choose the maximum of $r^N$ in all trials as the detection threshold, $\lambda_0$.

Fig. 6. The LNR results under unobservable FDIAs and no attacks.

Fig. 6 lists the results of the LNR test under these FDIAs in the 13-bus system, and we compare these results with those with no attacks. We find that these FDIAs are unobservable by the LNR test, since all the residuals are located under the detection threshold [20]. For instance, we construct FDIAs targeting the A-phase and C-phase voltages at buses 611, 671, and 680 by constructing a sparse attack vector with $k = 4$, and Fig. 7 shows the estimated states under this unobservable attack. The unobservable FDIA can stealthily compromise the state estimation for voltage magnitudes to make them violate the operation ranges, e.g., below 0.95 p.u. at some buses in Fig. 7.

Fig. 7. Estimation results of voltages under no attacks and an unobservable attack.

Estimated states with such significant biases may
mislead the decisions made by system operators for voltage regulation.

TABLE II
DEFINITIONS OF PERFORMANCE INDICES

TABLE III
DETECTION PERFORMANCE OF PROPOSED ALGORITHM

Fig. 8. The LNR results under unobservable FDIAs (worst case) and no attacks.

TABLE IV
COMPARISON WITH OTHER SEMI-SUPERVISED METHODS
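The threshold selection and the LNR test described in this subsection can be illustrated with a minimal sketch. It is not the three-phase nonlinear DSSE used in the paper: it assumes a toy linear model $z = Hx + e$ with a random matrix `H`, hypothetical sizes `m` and `n`, and a single noise level `sigma`. The attack $a = Hc$ follows the classical construction in [5]; because it lies in the column space of $H$, the WLS residual is unchanged, whereas a gross error on a single measurement is flagged.

```python
import numpy as np

def lnr(z, H, sigma):
    """Largest normalized residual of a linear WLS estimate (toy model)."""
    W = np.eye(len(z)) / sigma**2                  # weights = inverse noise variance
    G = H.T @ W @ H                                # gain matrix
    x_hat = np.linalg.solve(G, H.T @ W @ z)        # WLS state estimate
    r = z - H @ x_hat                              # measurement residuals
    # Residual covariance: R - H G^{-1} H^T, with R = sigma^2 * I
    omega = sigma**2 * np.eye(len(z)) - H @ np.linalg.solve(G, H.T)
    return np.max(np.abs(r) / np.sqrt(np.diag(omega)))

rng = np.random.default_rng(0)
m, n, sigma = 12, 4, 0.01                          # hypothetical sizes and noise level
H = rng.normal(size=(m, n))
x_true = rng.normal(size=n)

# Threshold lambda0: maximum LNR over attack-free Monte Carlo trials.
clean = [H @ x_true + sigma * rng.normal(size=m) for _ in range(500)]
lambda0 = max(lnr(z, H, sigma) for z in clean)

z = clean[0]
a_stealthy = H @ rng.normal(size=n)                # a = Hc, in the column space of H
a_naive = np.zeros(m)
a_naive[3] = 1.0                                   # gross error on one meter, not of the form Hc

r_clean = lnr(z, H, sigma)
r_stealthy = lnr(z + a_stealthy, H, sigma)         # equals r_clean up to round-off
r_naive = lnr(z + a_naive, H, sigma)               # far above lambda0

print("stealthy attack flagged:", r_stealthy > lambda0 + 1e-6)
print("naive attack flagged:   ", r_naive > lambda0)
```

In the nonlinear DSSE setting the attack is built on a linear approximation, so the residual is only approximately preserved; the sketch shows the idealized linear case.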
We test the worst case by setting $k = K$. Fig. 8 shows the results of the LNR test. In this case, the residuals of 82% of the attacks in all 100 test cases of this figure are below the detection threshold, i.e., these attacks circumvent the LNR test and are unobservable FDIAs. Compared with Fig. 6, the probability that a constructed FDIA is unobservable decreases as the number of attacked measurements increases. Since the unobservable FDIAs are built on the linear approximation of the nonlinear DSSE model, the more measurements a hacker attacks, the more likely the attack is to be detected by the LNR test, owing to the growing accuracy loss from this approximation.

B. Detection Performance
To find these FDIAs, we evaluate the detection performance of the proposed method in the 13-bus and 123-bus distribution systems. All simulation and training are conducted on a computer with a 2.5 GHz Intel Core i5 CPU and 8 GB of RAM.
We calculate the true positive (tp), the true negative (tn), the false positive (fp), and the false negative (fn) rates, which are defined in Table II. For instance, tp denotes the probability that a measurement classified as attacked is actually exposed to an attack. We evaluate the learning ability of the proposed method by the values of precision (Prec), recall (Rec), and accuracy (Acc) [15]. The precision values are used to evaluate the classification performance for the attacked measurements, while the recall values measure the probability that the attacked measurements are not misclassified as secure. The overall detection performance is measured by the index Acc. Furthermore, we calculate these three indices by

$$\mathrm{Prec} = \frac{tp}{tp + fp} \tag{27}$$

$$\mathrm{Rec} = \frac{tp}{tp + fn} \tag{28}$$

$$\mathrm{Acc} = \frac{tp + tn}{tp + tn + fp + fn} \tag{29}$$

Table III shows the detection accuracy, training time, and detection time of the proposed method in two test systems. The proposed algorithm used for detecting the attacked metering data has a detection error of 3.75% in the 13-bus system, while the detection error is 2.15% in the 123-bus system. Also, the average computation time of the proposed method is 9.30 and 14.81 milliseconds in the 13-bus and 123-bus systems, respectively, which can satisfy the requirement of online detection. Without the use of GPU, the training time reaches about four hours in the 123-bus test system. Moreover, we try the fine-tuning technique [34] for neural network training on the original hardware configuration; the new training process shortens this time (four hours) to less than two hours.

C. Other Semi-Supervised Learning Techniques
To evaluate the detection performance of the proposed AAE method, we compare the proposed method with other data-driven detection techniques. We employ the S3VM proposed in [15] for attack detection as the baseline. Moreover, for a fair comparison, we adopt the SS-AE and update the k-nearest neighbor (k-NN) method [35] into a semi-supervised version.
Table IV lists the detection performance of these methods. With limited labeled data for training, our approach has a higher detection accuracy owing to the powerful combination of autoencoders and GAN. For instance, the proposed algorithm achieves a high detection accuracy of up to 95%, while the S3VM-based scheme performs worse, with a detection accuracy of less than 80%. Our conclusion is similar to that of Ozay et al. [15]: extensive unlabeled samples in the training dataset largely degrade the classification performance of the SVM method. Moreover, owing to the use of the advanced generative models in AAE, the detection accuracy of the proposed method is higher than that of the individual SS-AE algorithm without the use of generative models. Therefore, the proposed method is more competitive.

D. Impacts of Measurement Noises
To test the robustness against measurement noises in the proposed method, we investigate the impact of various noise
levels of measurement data on the detection accuracy in the 123-bus distribution system. In the case with no installation or malfunction of smart meters at load/DG nodes, we use pseudo-measurements with higher errors (e.g., 10% of the true values [26]), which are obtained from the historical or forecasting data of customer loads and DG production, to realize the system observability. In experiments, the maximum errors of the power measurements are set to vary from 4% to 12%. Fig. 9 shows the detection results of the proposed method, SS-AE, and S3VM under these noise levels.

Fig. 9. Detection accuracy with maximum measurement errors ranging from 4% to 12%.

Fig. 9 implies that the proposed algorithm achieves a detection accuracy of more than 94% even with the maximum measurement errors of up to 12% of the true values; this finding illustrates the robustness of the proposed detection method against measurement noises. In comparison, the detection accuracy of the SS-AE and S3VM approaches decreases when dealing with higher measurement errors. We conclude that the detection accuracy of the proposed method remains high when the noise level of the test dataset increases in distribution systems. This is because the adopted generative model is better able to shape the hidden code of autoencoders to make the measurement data distinguishable.

E. Sensitivity Analysis
1) Impact of the Amount of Labeled Data: We investigate the detection performance by using relatively fewer labeled data during the training. We vary the amount of labeled data from 500 to 1,250 sets in the 13-bus test system. Table V provides the confusion matrix and evaluation indices of the proposed algorithm, in which the number of training samples is 8,000. More labeled data during the training leads to a more accurate detection performance. However, with 500 sets of data labeled, the proposed method detects the unobservable FDIAs with precision, recall, and accuracy values of about 91.17%, 92.26%, and 91.70%, respectively, which illustrates the detection effectiveness of this method.

TABLE V
FDIA DETECTION WITH FEWER OR MORE LABELED DATA

2) Impact of Different Attacks: Considering that there are some potential FDIAs that are not fully investigated and thus are not labeled in the training stage, we test the detection performance towards new attacked samples. New attacked samples here are defined as those that are not labeled in the training stage and are produced by attack construction methods different from those of the historical known FDIAs. This case study can be summarized as "using few attacked samples to detect more new samples" by adopting the generative models.
Specifically, the attacked samples with labels in the training dataset are only from the construction method in [20], and the ratio of these samples to all the training samples is low, i.e., 2%, as shown in Table VI. Furthermore, we use the method in [5] to construct attacks that differ from those labeled samples, and these attacked samples, without labels attached, are randomly chosen and put in the training and test datasets as new attacks. The details of the adopted training and test datasets are shown in Table VI, and here only 160 attacks are labeled. Other settings are the same as in Section V-A.

TABLE VI
DATA STRUCTURE IN SENSITIVITY ANALYSIS

TABLE VII
COMPARISON OF DETECTION PERFORMANCE IN TWO CASES

TABLE VIII
CONFUSION MATRIX OF FDIA DETECTION WITH FEWER LABELED DATA

As shown in Table VII, the detection accuracy of the unlabeled data in the training decreases to 93.15%, compared with 95.75% in the case study where 400 attacks are labeled in the total 800 labeled data. In the test stage, the proposed method detects the unobservable FDIAs with an accuracy of 91.60% in the 123-bus system. We conclude that the limited attacked data that are labeled in the training process degrade the detection performance of the semi-supervised learning. More details about the test performance can be found in Table VIII. These "new attacks" influence the recall value more noticeably, and this index is 90.49%. However, the detection performance of the proposed algorithm might be acceptable, since the labeled attacked samples make up only 2% of the training data.
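The indices in (27)-(29) that are reported throughout this section can be computed directly from confusion counts. The following minimal sketch assumes that attacked samples are the positive class; the function name `detection_indices` and the example counts are hypothetical and are not taken from the paper's tables.

```python
def detection_indices(tp, tn, fp, fn):
    """Return (Prec, Rec, Acc) following (27)-(29).

    tp, tn, fp, fn may be counts or rates, as long as they are consistent.
    """
    prec = tp / (tp + fp)                    # (27): flagged samples that are truly attacked
    rec = tp / (tp + fn)                     # (28): attacked samples that are flagged
    acc = (tp + tn) / (tp + tn + fp + fn)    # (29): overall fraction classified correctly
    return prec, rec, acc

# Hypothetical confusion counts for a 2,000-sample test set.
prec, rec, acc = detection_indices(tp=905, tn=920, fp=80, fn=95)
print(f"Prec={prec:.4f}, Rec={rec:.4f}, Acc={acc:.4f}")
```

The sketch does not guard against empty denominators, which occur if no sample is flagged or no attacked sample is present.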
VI. CONCLUSION
This paper proposes a semi-supervised AAE-based algorithm for detecting FDIAs in smart distribution systems. In the case of only a small fraction of labeled measurement data, the proposed method leverages a state-of-the-art GAN framework to realize the effective detection of unobservable FDIAs that bypass the conventional BDD method. Compared with other semi-supervised learning techniques, the proposed algorithm has a high and robust detection accuracy owing to the powerful combination of autoencoders and GAN. The proposed method is fully data-driven and does not depend on specific estimation methods or system knowledge. Numerical simulations validate the detection performance of this method.
The cyberattack or outlier detection methods require collecting data from the same topology structure [14], [15]. When the topology structure changes, the corresponding measurements and states should be stored according to different topology labels for the subsequent data analysis. If this change is unknown, effective topology identification methods should be used in advance, and a clustering method for grouping the data from different topologies is another potential solution. The data-driven FDIA detection method with a varying topology is left for our future work. Varying DER penetration may lead to greatly different system operation features in a distribution system. If the measurement data with various system operation features are the input of neural networks, the cyberattack detection will not be a binary classification problem. Before detecting, a clustering method to group the operation modes is necessary.

APPENDIX

A. Nonlinear DSSE Algorithm
The real and imaginary parts of all nodal voltages are chosen as the state vector $x$, shown in (6). Denote the voltage at node $k$ as $V_k$ and the current at branch $i$-$j$ as $I_{ij}$, $k \in \psi_V$ and $\{i, j\} \in \psi_I$; $\psi_V$ and $\psi_I$ are the sets of nodes and branches with voltage/current measurements from limited PMUs installed in the distribution system. Power measurements exist at node $k$ or branch $i$-$j$ with $k \in \psi_S$ or $\{i, j\} \in \psi_S$, where $\psi_S$ is the set of load/DG nodes or branches installed with a meter.
The measurement function of the three-phase voltage measurement at node $k$, $k \in \psi_V$, can be depicted as

$$V_k = [V_k^a, V_k^b, V_k^c]^T = v_k \tag{30}$$

The relationship between the current measurement at branch $i$-$j$ and the states can be expressed as:

$$I_{ij} = [I_{ij}^a, I_{ij}^b, I_{ij}^c]^T = Y_{ij}(v_i - v_j) \tag{31}$$

where $Y_{ij}$ denotes the line admittance at this branch, the same as (12).
The complex power measurements at node $k$ or at branch $i$-$j$ can be expressed as a nonlinear relationship about the states as

$$S_k = v_k \cdot (I_k)^* \tag{32}$$

$$S_{ij} = v_i \cdot (I_{ij})^* \tag{33}$$

where the power flow measurement $S_{ij} \in \mathbb{C}^{3 \times 1}$ and the power injection measurement $S_k \in \mathbb{C}^{3 \times 1}$; $v_k$ and $v_i$ denote the estimated voltages at nodes $k$ and $i$, respectively, and come from the corresponding elements in $x$. Further, $I_{ij}$ can be obtained by (31), and the current injection at node $k$ can be expressed as $I_k = [I_k^a, I_k^b, I_k^c]^T = Y_k v_k$, where $Y_k \in \mathbb{C}^{3 \times 3}$ denotes the nodal admittance.
The DSSE model in the complex form is expressed as

$$z = [V_k, I_{ij}, S_k, S_{ij}]^T = h(x) + e \tag{34}$$

Due to the nonlinear relationships between the voltages and the power measurements, the model (34) is nonlinear. The DSSE process in the three-phase distribution system is iteratively implemented in the following steps [2]:
1) Backward Sweep: Get initial values of branch currents by a backward approach. An initial voltage at each node is set as the substation voltage, i.e., $v_i = V_{slack}$, and (35) is used to calculate the current injections through nodal power injections:

$$I_{k\_eq} = (S_k / v_i)^* \tag{35}$$

Next, these injections are used to obtain branch currents.
2) Forward Sweep: The branch currents in step 1) and the substation voltage are used to calculate initial nodal voltages, $x^0$.
3) Obtain $h(x)$ by (30)-(33) with the latest states $x^t$, and then update the system state variables as

$$\Delta x^t = \left[ H^T(x^t) W H(x^t) \right]^{-1} H^T(x^t) W \left[ z - h(x^t) \right] \tag{36}$$

4) Update the nodal voltages by $x^{t+1} = x^t + \Delta x^t$.
5) If $\Delta x^t$ is less than a pre-set tolerance, stop the iterative process. Otherwise, go to step 3).

B. On the Existence of Linear Approximation
The power flow equations contain the linear relationships (30) and (31) between $x$ and PMU measurements, together with nonlinear relationships between $x$ and power measurements. The authors of [36] proposed linear approximation theorems and the error analysis to establish a linear model between the voltages and powers. Based on [36], [20] proposes an FDIA model in the distribution system.
In [20], the complex power measurements at node $k$ or at branch $i$-$j$ can be converted to the equivalent currents as

$$I_{k\_eq} = (S_k / V_{slack})^* \tag{37}$$

$$I_{ij\_eq} = (S_{ij} / V_{slack})^* \tag{38}$$

where $V_{slack}$ denotes the voltage at the substation measured by a PMU.
The DSSE model with this approximation is expressed as

$$z = [V_k, I_{ij}, I_{k\_eq}, I_{ij\_eq}]^T = [I, Y_{br}, Y_{bus}, Y_{br\_eq}]^T x + e \tag{39}$$

where the Jacobian matrix $H = [I, Y_{br}, Y_{bus}, Y_{br\_eq}]^T$ and $I$ denotes the identity matrix; $Y_{br}$ is a matrix composed of all the $Y_{ij}$ at $\{i, j\} \in \psi_I$ and zero elements, $Y_{bus}$ (or $Y_{br\_eq}$) is a
matrix composed of all the $Y_k$ (or $Y_{ij}$) at $k \in \psi_S$ (or $\{i, j\} \in \psi_S$) and zero elements.
This linear approximation solution is closer to the nonlinear solution provided by (34), compared with a linear solution obtained by simplifying the AC distribution system as a DC model. The conclusion is validated by the case study in [20]. Based on this linear approximation, an unobservable FDIA in three-phase distribution systems can be constructed by the method presented in Section III.

REFERENCES
[1] K. Dehghanpour, Z. Wang, J. Wang, Y. Yuan, and F. Bu, “A survey on state estimation techniques and challenges in smart distribution systems,” IEEE Trans. Smart Grid, vol. 10, no. 2, pp. 2312–2322, Mar. 2019.
[2] W. H. Kersting, Distribution System Modeling and Analysis. Boca Raton, FL, USA: CRC Press, 2006.
[3] S. Sridhar, A. Hahn, and M. Govindarasu, “Cyber–physical system security for the electric power grid,” Proc. IEEE, vol. 100, no. 1, pp. 210–224, Jan. 2012.
[4] Y. Zhang, J. Wang, and Z. Li, “Uncertainty modeling of distributed energy resources: Techniques and challenges,” Current Sustain./Renew. Energy Rep., vol. 6, no. 2, pp. 42–51, 2019.
[5] Y. Liu, P. Ning, and M. K. Reiter, “False data injection attacks against state estimation in electric power grids,” ACM Trans. Inf. Syst. Security, vol. 14, no. 1, pp. 1–33, 2011.
[6] G. Chaojun, P. Jirutitijaroen, and M. Motani, “Detecting false data injection attacks in AC state estimation,” IEEE Trans. Smart Grid, vol. 6, no. 5, pp. 2476–2483, Sep. 2015.
[7] X. Liu and Z. Li, “False data attacks against AC state estimation with incomplete network information,” IEEE Trans. Smart Grid, vol. 8, no. 5, pp. 2239–2248, Sep. 2017.
[8] G. Liang, J. Zhao, F. Luo, S. R. Weller, and Z. Y. Dong, “A review of false data injection attacks against modern power systems,” IEEE Trans. Smart Grid, vol. 8, no. 4, pp. 1630–1638, Jul. 2017.
[9] L. Liu, M. Esmalifalak, Q. Ding, V. A. Emesih, and Z. Han, “Detecting false data injection attacks on power grid by sparse optimization,” IEEE Trans. Smart Grid, vol. 5, no. 2, pp. 612–621, Mar. 2014.
[10] K. Manandhar, X. Cao, F. Hu, and Y. Liu, “Detection of faults and attacks including false data injection attack in smart grid using Kalman filter,” IEEE Trans. Control Netw. Syst., vol. 1, no. 4, pp. 370–379, Dec. 2014.
[11] M. Cui, M. Khodayar, C. Chen, X. Wang, Y. Zhang, and M. E. Khodayar, “Deep learning-based time-varying parameter identification for system-wide load modeling,” IEEE Trans. Smart Grid, vol. 10, no. 6, pp. 6102–6114, Nov. 2019.
[12] Y. Lin and J. Wang, “Probabilistic deep autoencoder for power system measurement outlier detection and reconstruction,” IEEE Trans. Smart Grid, vol. 11, no. 2, pp. 1796–1798, Mar. 2020.
[13] Y. He, G. J. Mendis, and J. Wei, “Real-time detection of false data injection attacks in smart grid: A deep learning-based intelligent mechanism,” IEEE Trans. Smart Grid, vol. 8, no. 5, pp. 2505–2516, Sep. 2017.
[14] S. A. Foroutan and F. R. Salmasi, “Detection of false data injection attacks against state estimation in smart grids based on a mixture Gaussian distribution learning method,” IET Cyber Phys. Syst. Theory Appl., vol. 2, no. 4, pp. 161–171, Dec. 2017.
[15] M. Ozay, I. Esnaola, F. T. Yarman Vural, S. R. Kulkarni, and H. V. Poor, “Machine learning methods for attack detection in the smart grid,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 8, pp. 1773–1786, Aug. 2016.
[16] J. J. Q. Yu, Y. Hou, and V. O. K. Li, “Online false data injection attack detection with wavelet transform and deep neural networks,” IEEE Trans. Ind. Informat., vol. 14, no. 7, pp. 3271–3280, Jul. 2018.
[17] L. Yao and Z. Ge, “Scalable semisupervised GMM for big data quality prediction in multimode processes,” IEEE Trans. Ind. Electron., vol. 66, no. 5, pp. 3681–3692, May 2019.
[18] Q. Dai, L. Shi, and Y. Ni, “Risk assessment for cyberattack in active distribution systems considering the role of feeder automation,” IEEE Trans. Power Syst., vol. 34, no. 4, pp. 3230–3240, Jul. 2019.
[19] R. Deng, P. Zhuang, and H. Liang, “False data injection attacks against state estimation in power distribution systems,” IEEE Trans. Smart Grid, vol. 10, no. 3, pp. 2871–2881, May 2019.
[20] P. Zhuang, R. Deng, and H. Liang, “False data injection attacks against state estimation in multiphase and unbalanced smart distribution systems,” IEEE Trans. Smart Grid, vol. 10, no. 6, pp. 6000–6013, Nov. 2019.
[21] Y. Zhao, R. Ball, J. Mosesian, J. de Palma, and B. Lehman, “Graph-based semi-supervised learning for fault detection and classification in solar photovoltaic arrays,” IEEE Trans. Power Electron., vol. 30, no. 5, pp. 2848–2858, May 2015.
[22] I. Goodfellow et al., “Generative adversarial nets,” in Advances in Neural Information Processing Systems. Red Hook, NY, USA: Curran, 2014, pp. 2672–2680.
[23] M. Khodayar, J. Wang, and M. Manthouri, “Interval deep generative neural network for wind speed forecasting,” IEEE Trans. Smart Grid, vol. 10, no. 4, pp. 3974–3989, Jul. 2019.
[24] C. Ren and Y. Xu, “A fully data-driven method based on generative adversarial networks for power system dynamic security assessment with missing data,” IEEE Trans. Power Syst., vol. 34, no. 6, pp. 5044–5052, Nov. 2019.
[25] J. Deng, X. Xu, Z. Zhang, S. Frühholz, and B. Schuller, “Semisupervised autoencoders for speech emotion recognition,” IEEE/ACM Trans. Audio, Speech, Language Process., vol. 26, no. 1, pp. 31–43, Jan. 2018.
[26] Y. Zhang, J. Wang, and Z. Li, “Interval state estimation with uncertainty of distributed generation and line parameters in unbalanced distribution systems,” IEEE Trans. Power Syst., vol. 35, no. 1, pp. 762–772, Jan. 2020.
[27] A. Radhakrishnan, K. Yang, M. Belkin, and C. Uhler, “Memorization in overparameterized autoencoders,” 2018. [Online]. Available: arXiv:1810.10333.
[28] A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey, “Adversarial autoencoders,” 2015. [Online]. Available: arXiv:1511.05644.
[29] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 2014. [Online]. Available: arXiv:1412.6980.
[30] (2017). IEEE Test Feeder Specifications. [Online]. Available: http://sites.ieee.org/pes-testfeeders/resources
[31] Y. Zhang, J. Wang, and J. Liu, “Attack identification and correction for PMU GPS spoofing in unbalanced distribution systems,” IEEE Trans. Smart Grid, vol. 11, no. 1, pp. 762–773, Jan. 2020.
[32] IEEE Standard for Synchrophasor Measurements for Power Systems, IEEE Standard C37.118-2005, Dec. 2011.
[33] (Mar. 2011). Smart Meters and Smart Meter Systems: A Metering Industry Perspective. [Online]. Available: https://aeic.org/smartmetersfinal032511/
[34] (Oct. 2016). A Comprehensive Guide to Fine-Tuning Deep Learning Models in Keras (Part I). [Online]. Available: https://flyyufelix.github.io/2016/10/03/finetuning-in-keras-part1.html
[35] L. Cai, N. F. Thornhill, S. Kuenzel, and B. C. Pal, “Wide-area monitoring of power systems using principal component analysis and k-nearest neighbor analysis,” IEEE Trans. Power Syst., vol. 33, no. 5, pp. 4913–4923, Sep. 2018.
[36] S. Bolognani and S. Zampieri, “On the existence and linear approximation of the power flow solution in power distribution networks,” IEEE Trans. Power Syst., vol. 31, no. 1, pp. 163–172, Jan. 2016.

Ying Zhang (Student Member, IEEE) received the M.S. degree in electrical engineering from Shandong University, Jinan, China, in 2017, and the Ph.D. degree from the Department of Electrical and Computer Engineering, Southern Methodist University, Dallas, TX, USA, in 2020. Her research interests include situational awareness for power system monitoring and control via optimization and machine learning. She received the Frederick E. Terman Award at SMU. She serves as a Reviewer for the IEEE TRANSACTIONS ON POWER SYSTEMS, the IEEE TRANSACTIONS ON SMART GRID, and IEEE POWER ENGINEERING LETTERS.
Jianhui Wang (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from the Illinois Institute of Technology, Chicago, IL, USA, in 2007. He is a Professor with the Department of Electrical and Computer Engineering, Southern Methodist University, Dallas, TX, USA. He has authored and/or coauthored more than 300 journal and conference publications, which have been cited for more than 20 000 times by his peers with an H-index of 76. He has been invited to give tutorials and keynote speeches at major conferences, including IEEE ISGT, IEEE SmartGridComm, IEEE SEGE, IEEE HPSC, and IGEC-XI. He is the recipient of the IEEE PES Power System Operation Committee Prize Paper Award in 2015 and the Premium Award for Best Paper in IET Cyber-Physical Systems: Theory & Applications in 2018. He is the Clarivate Analytics Highly Cited Researcher for production of multiple highly cited papers that rank in the top 1% by citations for field and year in Web of Science in 2018 and 2019. He is the past Editor-in-Chief of the IEEE TRANSACTIONS ON SMART GRID and an IEEE PES Distinguished Lecturer. He is also a Guest Editor of the PROCEEDINGS OF THE IEEE special issue on power grid resilience.

Bo Chen (Member, IEEE) received the B.S. and M.S. degrees from North China Electric Power University in 2008 and 2011, respectively, and the Ph.D. degree in electrical engineering from Texas A&M University, College Station, TX, USA, in 2017. In 2017, he worked as a Postdoctoral Researcher with the Argonne National Laboratory, Lemont, IL, USA, where he is currently an Energy Systems Scientist with the Energy Systems Division. His research interests include modeling, control, and optimization of power systems, cybersecurity, and cyber–physical systems.

