Unrolled Spatiotemporal Graph Convolutional Network For Distribution System State Estimation and Forecasting

IEEE TRANSACTIONS ON SUSTAINABLE ENERGY, VOL. 14, NO. 1, JANUARY 2023, p. 297
Huayi Wu, Student Member, IEEE, Zhao Xu, Senior Member, IEEE, and Minghao Wang, Member, IEEE
Abstract—Timely perception of distribution system states is critical for the control and operation of power grids. Recently, it has been seriously challenged by the dramatic voltage fluctuations induced by high renewable penetration. To address this issue, an Unrolled Spatiotemporal Graph Convolutional Network (USGCN) is proposed for distribution system state estimation (DSSE) and forecasting with augmented consideration of the underlying complex spatiotemporal correlations of renewable energy sources (RES). Specifically, the interconnection among individual spatial graphs of adjacent time steps leads to an unrolled spatiotemporal graph and benefits the synchronous capture of spatial and temporal correlations to achieve enhanced accuracy. On top of this, the node-embedding technique is employed in the unrolled spatiotemporal convolutional layer to reveal the hidden nonlinear spatiotemporal correlations of RES outputs without relying on full prior knowledge. Moreover, the proposed USGCN stacks the unrolled spatiotemporal convolutional layers, leading to the perception of long-term correlations to obtain effective ahead-of-time state forecasting results robustly. Simulation results are provided to verify the accuracy and efficiency of the proposed model in 118-node and 1746-node distribution systems.

Index Terms—Distribution system state estimation and forecasting, spatiotemporal correlation, node embedding, graph convolutional network, renewable energy.

Manuscript received 3 May 2022; revised 25 July 2022; accepted 11 September 2022. Date of publication 4 October 2022; date of current version 19 December 2022. This work was supported in part by the PolyU under Grants G-SB4D and 1-YY4T. The work of Minghao Wang was supported in part by the National Natural Science Foundation of China under Grant 62101473 and in part by Environment and Conservation Fund & Woo Wheelock Green Fund under Grant 108/2021. Paper no. TSTE-00452-2022. (Corresponding author: Zhao Xu.)
Huayi Wu and Zhao Xu are with the Research Institute for Smart Energy (RISE) and the Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong, SAR, China (e-mail: huayi22.wu@polyu.edu.hk; eezhaoxu@polyu.edu.hk).
Minghao Wang is with the Research Institute for Smart Energy (RISE), and Power Electronics Research Center, The Hong Kong Polytechnic University, Hong Kong, SAR, China, and also with the Shenzhen Research Institute of the Hong Kong Polytechnic University, Shenzhen, China (e-mail: minghao.wang@polyu.edu.hk).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TSTE.2022.3211706.
Digital Object Identifier 10.1109/TSTE.2022.3211706
1949-3029 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

ACTIVE distribution systems generally have dramatically uncertain system states due to the continuous variations rendered by the increasingly deep penetration of renewable generators [1]. Such a surge of uncertainties has brought significant challenges to the reliable monitoring, stable control and economic operation of power systems [2]. The DSSE receives much academic attention as it plays a pivotal role in the real-time perception of grid states for delivering enhanced monitoring, control, and management functionalities [3]. Correspondingly, it is necessary to build an effective and efficient model to implement DSSE in power system industries.

Since locations of RES in distribution systems are geographically close to each other and the associated environments are generally similar, RES generations, including wind power generators and solar panels, in these distribution systems will have interrelated power patterns [4]. Therefore, the patterns of renewable generation will have significant and complex spatiotemporal correlations, leading to hidden complex spatiotemporal correlations among nodal measurements of distribution systems [5]. Firstly, each node of the distribution system can be directly influenced by its neighbor nodes at the current time step due to spatial correlation. Secondly, each node will be self-influenced at the next time step due to the temporal correlations in the time series. Thirdly, each node can be further influenced by its neighbor nodes at the next time step due to the synchronous spatiotemporal correlations. This phenomenon brings significant bias to system state estimation, rendering an urgent challenge in state estimation and forecasting of distribution systems, and thus poses a threat to distribution system operation [6]. However, such spatiotemporal correlations are usually neglected in most DSSE methods, and an independent Gaussian distribution of RES output is predominantly assumed. It is becoming increasingly critical to develop a suitable model to achieve accurate distribution system estimation.

The most classical algorithms for DSSE are based on weighted least squares (WLS), where the measurement residual is minimized. The Gauss-Newton algorithm is applied to solve this nonconvex optimization problem iteratively [7]. WLS-based state estimation generally assumes independent measurements. However, such a method may not adapt to inevitably correlated measurements, which may introduce a significant error in system estimates [8]. To this end, in [9], the linear correlation coefficient matrix is utilized as the weighting matrix to alleviate the correlation impact on DSSE in the WLS estimation, leading to improved accuracy. Similarly, the correlation among nodal measurements is employed in [10]
to form the weighting matrix to improve the estimation accuracy. An artificial neural network is applied in [11] to obtain the spatial correlations of measurements to construct the correlation coefficient matrix, which is utilized to enhance the WLS-based state estimation for microgrids. However, temporal correlations are not considered in this WLS-correlation coefficient matrix, which ignores the impact of the dramatic time-series fluctuations of the renewable energy resources. Therefore, the current system operation states will not be predicted accurately.

Traditional DSSE is conducted based on instantaneous measurements without incorporating historical data. Considering the temporal correlations, the valuable information extracted from time-series measurement data will facilitate the conventional DSSE methods. In [12], the temporal correlations among distributed generators are considered in an augmented unscented Kalman filter based state estimation method, leading to enhanced accuracy. An augmented Complex Kalman filter (CKF) is proposed in [13] to model time-series measurement changes to reduce estimation errors. However, the reported works only consider single temporal correlations of measurements, which is non-adaptive to the stochastic uncertainty of RES in real-time state estimation with spatial correlations. To address this issue, conditional multivariate complex Gaussian distributions are adopted in [6] to model the spatiotemporal correlations of renewable resources and unbalanced loads to improve the accuracy of traditional DSSE, where full knowledge of such correlations is required. A vector autoregressive model is proposed in [14] to model characteristics of both the spatial and temporal correlation among loads and DGs to aid the classical DSSE, improving state forecasting accuracy. However, these methods are computationally intensive and challenged by growing system scale and uncertainties.

To alleviate the computational burden, learning-based methods are becoming popular to forecast the power system states. An artificial neural network is proposed in [15] to reduce the computational cost and improve the accuracy. The full data-driven physics-guided deep learning model is introduced in [16] to learn the state estimation model considering the temporal correlations of nodal measurements. Physics-inspired deep neural networks (DNNs) are reported in [17] for real-time state estimation of transmission systems, achieving improved accuracy. A deep learning model is proposed in [18] to achieve real-time state estimations. These data-driven approaches can enhance calculation speed. Nevertheless, the spatiotemporal correlations of measurements are generally ignored. The spatial-temporal graph convolutional network (STGCN) has attracted more and more attention in different fields, such as traffic forecasting [19], social pedestrian behavior prediction [20], skeleton-based action recognition [21], and so on. They capture the correlations from the data to achieve enhanced performance. However, these works treat the spatial and temporal correlations separately, which ignores the spatiotemporal correlations and may lead to unsatisfactory results. To address this issue, a spatial-temporal synchronous graph convolutional network is proposed for network data forecasting to consider the spatiotemporal correlations simultaneously [22]. The graphical structure properties of the power system and the spatial-temporal graph convolutional network techniques inspire a new way to consider the correlations' impact on distribution system state estimation. Therefore, it is essential to leverage simultaneous spatiotemporal correlations to enhance the real-time state estimation performance.

In this study, a USGCN model is proposed for distribution system state estimation and forecasting considering the spatiotemporal correlations of power injections. Specifically, a novel unrolled spatiotemporal graph model (USG) formed by splicing the spatial graphs of adjacent time steps is proposed to capture three aspects of spatiotemporal correlations simultaneously. In this way, enhanced state estimation accuracy and computational efficiency can be achieved. Based on USG, an adaptive USG with the node-embedding technique leveraged to characterize the unrolled graphs is further proposed to capture the nonlinear spatiotemporal correlations automatically. Then, on top of the adaptive USG model, a novel framework of USGCN is proposed for state estimation. Furthermore, the USGCN with multiple adaptive USG layers is proposed to capture long-range spatiotemporal correlations so that state forecasting can be achieved effectively. The experiments are conducted on 118-node and 1746-node distribution systems. The contributions are threefold.

1) Different from the previous works that treat the spatial and temporal correlations of the measurement data separately, a USG formed by splicing the spatial graphs of adjacent time steps is proposed to capture the spatiotemporal correlations synchronously so that improved state estimation results can be achieved.

2) Instead of utilizing a fixed correlation coefficient matrix to characterize the linear correlations, the adaptive USG model leverages the node embedding to characterize the unrolled spatiotemporal graphs automatically to capture the dynamic correlations to achieve high accuracy in state estimation.

3) Unlike the traditional DSSE implemented based on the current time measurements, leveraging multi-module layers, the USGCN can capture long-range spatiotemporal correlations to gain enhanced ahead-of-horizon states effectively.

The remaining sections are organized as follows. The state estimation model of the distribution system based on the Bayesian rule is introduced in Section II. The USGCN model is proposed in Section III, and the framework of the USGCN is proposed in Section IV. The results and the analyses are presented in Section V. Finally, the conclusion of this study is summarized in Section VI.

II. STATE ESTIMATION BASED ON BAYESIAN RULE

The classical DSSE model can be expressed as

$$z = h(x) + e \tag{1}$$

In (1), $z$ is the vector of measurements obtained by the measurement devices, $h$ is the measurement function mapping the true state variable vector $x$ to the measurement vector, and $e$ denotes the vector of measurement errors. The classical state
estimation can be modeled as the WLS optimization problem:

$$\hat{x} = \arg\min_{x} \, (z - h(x))^{T} W (z - h(x)) \tag{2}$$

In (2), $\hat{x}$ is the estimated state vector, $T$ is the matrix transposition operation and $W$ denotes the weight matrix. However, this method can only deal with an independent Gaussian distribution of measurements; it is not feasible for DSSE with the complex correlations of high-penetration RES [23]. To address this issue, the Bayesian rule can be used to construct the posterior distribution for estimating states given the measurements:

$$f(x|z) = \frac{f(z|x)\, f(x)}{f(z)} \tag{3}$$

In (3), $f(x|z)$ is the posterior probability density function (PDF) of the state variables. $f(z|x)$ is the conditional PDF of the measurements that can be obtained by the maximum likelihood function of the state variables under the given measurements set. $f(x)$ is the prior distribution of the state variables, and $f(z)$ is the distribution of the measurements. However, due to the uncertain and intermittent nature of the RES outputs, it is highly difficult to obtain a complex distribution function to describe $f(x|z)$. Besides, a large number of historical measurements are required to construct $f(z)$, which is not practical due to the low observability of the distribution system. These difficulties inspire the practice of obtaining the estimation states based on the posterior conditional distribution. Specifically, given the measurements $z$, the estimation state $\hat{x}$ is equivalent to the expectation of the posterior conditional distribution $f(x|z)$. This is calculated by the following formulation.

$$\hat{x} = E[x|z] = \int \alpha \, f_{\alpha|z}(\alpha|z) \, d\alpha \tag{4}$$

In (4), the integral operation is performed over the whole state space and $\alpha$ represents the integration variable that can span all the values of $x$. Thus, the state estimation can be transferred to calculating the expectation of this posterior distribution, which can be approximated by the proposed model.

$$\hat{x} = E[x|z] = h(x) \tag{5}$$

In (5), $h$ also denotes the several layers $[h^{0}, \ldots, h^{l}, \ldots, h^{out}]$ of the proposed USGCN model.

III. UNROLLED SPATIOTEMPORAL GRAPH CONVOLUTIONAL NETWORKS

A. Unrolled Spatiotemporal Graph

Conventionally, the previous models introduced the convolutional neural network (CNN) into the recurrent neural network (RNN) to capture the spatial and temporal correlations, while CNN can only feature the Euclidean data structure [24]. As a result, it may not be efficient for state estimation due to the non-Euclidean graphical structure of the power system. To this end, the graph neural network (GNN) is combined with the RNN to improve the non-Euclidean data feature extraction [25]. However, these models capture the spatial and temporal correlations separately without exploring correlations across time among different nodes, which cannot fully leverage the information hidden in the measurements and thus leads to unsatisfactory accuracy. Thus, an unrolled spatiotemporal graph model is proposed to capture the spatial and temporal correlations simultaneously. Specifically, the unrolled operation means that spatial graphs of adjacent time steps are spliced into one bigger graph. In this way, the impact on each node of its neighbors within the current and the adjacent time steps can be extracted directly and synchronously. On the contrary, without the unrolled operation, the spatiotemporal correlations among different nodes across different time steps cannot be intuitively captured. The concept of the unrolled operation is illustrated in Fig. 1.

Fig. 1. Unrolled Spatiotemporal Graph Structure.

To embed the nodal measurements through a graph structure, each measuring site can be described by a node, and the correlations among different nodes can be depicted as edges. A graph $G = (V, E)$ represents the spatial correlations between nodes, where $V = \{v_1, \ldots, v_i, \ldots, v_N\}$ is the set of all $N$ nodes and $E$ is the set of $L$ edges [26]. To describe this graph structure quantitatively, the adjacency matrix $A \in \mathbb{R}^{N \times N}$ is introduced such that:

$$A_{ij} = \begin{cases} 1, & \text{if } v_i, v_j \in V \text{ and } (v_i, v_j) \in E \\ 0, & \text{else} \end{cases} \tag{6}$$

Since the aim is to model spatial and temporal correlations synchronously across time steps, the intuitive practice is to connect all graphs representing the spatial correlation with themselves at the adjacent time steps, as shown in Fig. 1(b). By connecting the graphs of all the nodes belonging to the previous, current, and next time slots, a spatiotemporal graph is obtained, which is denoted by the adjacency matrix $\tilde{A} \in \mathbb{R}^{3N \times 3N}$. For each node in the spatial graph, its new index can be calculated by $(t - 1)N + i$, where $t$ ($1 \le t \le 3$) indicates the time step number in the spatiotemporal graph model. In Fig. 1(b), the elements $A_{ij}$ in $A_{adp}^{t_1}$ represent the spatial correlation between node $i$ and node $j$ in time step $t_1$. The diagonal elements $A_{ii}$ in $A_{adp}^{(t_1 - t_2)}$ denote the temporal correlation of node $i$ across the time steps $t_1$ and $t_2$. The off-diagonal elements $A_{ij}$ in $A_{adp}^{(t_1 - t_2)}$ denote the spatiotemporal correlation of node $i$ and node $j$ across the time steps $t_1$ and $t_2$. In short, the unrolled spatiotemporal graph spans three consecutive time steps. The diagonal adjacency matrices denote the spatial correlation of each node with others in the current time step. The diagonal elements of the off-diagonal adjacency matrix represent the temporal correlation of each node with itself. The non-diagonal elements of the off-diagonal adjacency matrix represent the
spatiotemporal correlation of each node with other nodes across time steps. This process shows that the correlations between each node and its spatiotemporal neighbors can be captured directly via the unrolled spatiotemporal graph model.

B. Self-Adaptive Unrolled Spatiotemporal Graph

It is crucial to build a graph model to describe the adjacency matrix representing the correlations. Traditionally, the historical nodal measurements can be used to model the deterministic spatial correlations using the static correlation coefficient matrix. However, spatial and temporal correlations are usually dynamic over time. The deterministic spatial representation model, the adjacency matrix $A \in \mathbb{R}^{N \times N}$, may not be appropriate to characterize such dynamics. The strong dependence-representation ability of the node-embedding technique [27] motivates this study to employ it to formulate the learning model of implicit spatiotemporal correlation. Thus, a self-adaptive adjacency matrix $\tilde{A} \in \mathbb{R}^{3N \times 3N}$ is proposed for each time step without requiring any prior knowledge. The hidden complex spatiotemporal correlations can be automatically discovered through this model during the learning process. This self-adaptive adjacency matrix can be achieved by randomly initializing two node embeddings with learnable parameters $E_1, E_2 \in \mathbb{R}^{3N \times k}$, where $k$ denotes the dimension of the embedding vector.

$$\tilde{A}_{adp} = \mathrm{SoftMax}(\mathrm{ReLU}(E_1 E_2^{T})) \tag{7}$$

The node embedding vectors $E_1$ and $E_2$ represent two measured nodes, respectively. The product of $E_1$ and $E_2$ is used to denote the spatial correlation weight between the corresponding two nodes. The ReLU activation function is utilized to eliminate weak connections, while the SoftMax activation function is employed to normalize the self-adaptive adjacency matrix.

C. Unrolled Spatiotemporal Graph Convolution Layer

Different from the traditional convolutional operation only implemented on the Euclidean pixel structure, the graph convolution is a typical operation to extract a node's features over a given non-Euclidean graph structure [28]. From a spatial perspective, it can aggregate and transform the nodes' arbitrary neighbourhood information to achieve efficiency for implementation in most graph structures [29]. Let $\tilde{A}^{(n)} \in \mathbb{R}^{3N \times 3N}$ denote the normalized adjacency of the spatiotemporal graph model with self-loops, $h^{(l-1)} \in \mathbb{R}^{3N \times D}$ the input of the $l$-th graph convolutional layer, and $h^{(l)} \in \mathbb{R}^{3N \times D}$ the output of the $l$-th graph convolutional layer, where $D$ is the dimension of the output. $W \in \mathbb{R}^{D \times D}$ and $b \in \mathbb{R}^{D}$ denote the learnable parameters, and $\sigma$ denotes the activation function, such as ReLU. The unrolled spatiotemporal graph convolutional layer can be defined as:

$$h^{(l)} = \mathrm{USGCN}(h^{(l-1)}) = \sigma(\tilde{A}^{(n)} h^{(l-1)} W + b) \in \mathbb{R}^{3N \times D} \tag{8}$$

A diffusion convolution layer proposed by [30] has been shown to handle spatiotemporal modeling effectively. In such an operation, the node's information is transferred to its neighbors via certain transition probabilities and the model can converge after a finite number of diffusion steps. Generally, the diffusion convolutional operation is formulated as:

$$h^{(l)} = \sum_{m=0}^{M} P^{m} h^{(l-1)} W_{m} \in \mathbb{R}^{3N \times D} \tag{9}$$

In (9), $P^{m}$ represents the power series of the transition matrix. Since the nodal measurement data can be modeled as an undirected graph, $P^{m} = \tilde{A}/\mathrm{rowsum}(\tilde{A})$, where rowsum computes the sums along the rows of $\tilde{A}$. By combining the pre-defined self-adaptive spatiotemporal graph model and the diffusion operation, the spatiotemporal graph convolutional layer can be defined as:

$$h^{(l)} = \sum_{m=0}^{M} P^{m} h^{(l-1)} W_{m1} + \tilde{A}_{adp}^{(n)} h^{(l-1)} W_{m2} \in \mathbb{R}^{3N \times D} \tag{10}$$

When the graph structure is not available, only the self-adaptive spatiotemporal graph term remains in the spatiotemporal graph convolutional layer:

$$h^{(l)} = \tilde{A}_{adp}^{(n)} h^{(l-1)} W_{m2} \in \mathbb{R}^{3N \times D} \tag{11}$$

D. Gated Operation

Gated linear units (GLU) are a typical mechanism in neural networks. They are powerful in controlling which node's information flow can be passed to the next layer during graph convolutional operations [31]. The gated graph convolutional layer can be described as follows:

$$h^{(l)} = \left( \sum_{m=0}^{M} P^{m} h^{(l-1)} W_{m1} + \tilde{A}_{adp}^{(n)} h^{(l-1)} W_{m2} \right) \otimes \sigma\!\left( \sum_{m=0}^{M} P^{m} h^{(l-1)} W_{m1} + \tilde{A}_{adp}^{(n)} h^{(l-1)} W_{m2} \right) \in \mathbb{R}^{3N \times D} \tag{12}$$

In (12), $W_1 \in \mathbb{R}^{D \times D}$, $W_2 \in \mathbb{R}^{D \times D}$, $b_1 \in \mathbb{R}^{D}$, $b_2 \in \mathbb{R}^{D}$ are learnable parameters, $\sigma$ denotes the sigmoid activation function ($\mathrm{sigmoid}(x) = 1/(1 + e^{-x})$), and $\otimes$ represents the element-wise product. To increase the receptive field of the graph convolutional operation to capture the spatiotemporal correlations, the proposed graph convolutional operations are stacked multiple times.

IV. FRAMEWORK OF USGCN

A. USGCN for DSSE

The historical measurement data with spatial and temporal correlations can be used to facilitate the DSSE. The proposed USGCN is utilized to learn the expectation of the posterior conditional distribution (5) that accounts for the spatial and temporal correlations hidden in the measurement data. To achieve this goal, the proposed unrolled spatiotemporal graph convolution operation and the gated operation are utilized to construct the USGCN for state estimation and state forecasting. Unlike the popular graph convolutional network that can only capture

the spatial correlation, the proposed USG model can take the spatiotemporal correlations into account to enhance the state estimation accuracy.

The framework of USGCN for state estimation is shown in Fig. 2(a). This framework consists of a USGCN layer, three CNN layers, and a fully connected neural network layer as the output layer. The USGCN layer is utilized to capture the spatiotemporal correlations and form the latent features for each node in the unrolled graph. These features are aggregated by the following CNN layers to map the extracted features to the system states.

Fig. 2. (a) The structure of the USGCN for DSSE. (b) The structure of the USGCN for state forecasting.

The measurement data includes the active and reactive power injections, denoted by $Z \in \mathbb{R}^{N \times C \times T}$, where $N$ is the number of system nodes, $T$ is the sequence length, and $C$ is the hidden dimension, which represents the measurement data at each node; e.g., four dimensions comprise the nodal active/reactive power injections and the active/reactive line flows. Note that the measurement data consists of the active/reactive power injections and the active/reactive line power flows, which are available from smart meters. Since the estimated states are the voltage magnitudes and the phase angles, the number of measurements can satisfy observability without fully deploying instruments at all nodes, e.g., with measurements at above 50% of the system nodes. Generally, the branch number in the distribution system is less than the node number due to the radial structure, so the first dimension $N$ is enough for nodal and line measurements. Note that nodal and line inputs without measurements are replaced with zero values.

The measurement data $Z \in \mathbb{R}^{N \times C \times T}$ is then utilized to extract the slices of the unrolled spatiotemporal graph series as $h_s^{0} \in \mathbb{R}^{N \times C \times 3}$, which can be reshaped to $h_s^{0} \in \mathbb{R}^{3N \times C \times 1}$. In each slice, each node aggregates its own features and those of its neighbors at the current and adjacent time steps in the unrolled spatiotemporal graph convolutional operation. The aggregate function is described by embeddings whose weights equal the correlations among the nodes and their neighbors. Based on this process, the slices of $T-2$ unrolled spatiotemporal graphs of different time steps can be obtained. By splicing these spatiotemporal graphs by row, the input $h^{0} \in \mathbb{R}^{3N \times C \times (T-2)}$ of USGCN can be obtained.

Based on the definition of the input and the USGCN, the output of the unrolled spatiotemporal graph convolutional layer is $h^{l} \in \mathbb{R}^{3N \times C' \times (T-2)}$, where $C'$ is the hidden dimension of the layer. To obtain the target channel dimension of features at each node, 2-D convolutional neural network layers are also utilized to transfer $h^{l} \in \mathbb{R}^{3N \times C' \times (T-2)}$ into $h^{l} \in \mathbb{R}^{3N \times C'' \times (T-2)}$. Besides, the fully connected network is employed to transfer the $3N$ dimension into $N_{out}$ so that the target output dimension can be achieved. Therefore, the output of the proposed USGCN is $h^{out} \in \mathbb{R}^{N_{out} \times C'' \times (T-2)}$. $C''$ is the hidden dimension of the layer, e.g., $C'' = 2$ means two channels, representing the voltage magnitude and the voltage phase angle. $N_{out}$ is the number of voltage magnitudes and phase angles.

The Huber loss [32] is utilized as the loss function, which is less sensitive to outliers than the mean squared error loss and has continuous derivatives compared with the mean absolute error. Although the Log-Cosh loss shares these advantages of the Huber loss, it is less adaptive due to a fixed gradient for significant errors. In (13), $h$ denotes the ground truth, $\hat{h}$ denotes the prediction of the model, and $\delta$ is the threshold that controls the range of the squared error loss.

$$\mathrm{Loss}(\hat{h}, h) = \frac{1}{TNC} \sum_{i=1}^{N} \sum_{j=4}^{T} \sum_{k=1}^{C} \begin{cases} 0.5\,\big(\hat{h}_{ik}^{(t+j)} - h_{ik}^{(t+j)}\big)^{2}, & \big|\hat{h}_{ik}^{(t+j)} - h_{ik}^{(t+j)}\big| \le \delta \\ \delta\,\big|\hat{h}_{ik}^{(t+j)} - h_{ik}^{(t+j)}\big| - 0.5\,\delta^{2}, & \text{otherwise} \end{cases} \tag{13}$$

B. USGCN for State Forecasting

Generally, the predicted states are corrected relying on complete real-time measurements. Then the corrected states are utilized to predict the following single-step system states [17], such that the predicted states rely on the accuracy of the previously estimated states. Instead of utilizing the previously estimated states, the proposed method can directly forecast the system states based on the previous measurements. In this way, the proposed method can fully leverage the spatial and temporal correlation features in measurements to forecast the states directly with available measurements so that improved awareness of system states ahead of time can be achieved. Furthermore, instead of using the recurrent neural network to capture only temporal correlations, the proposed USG model can intuitively extract the spatial and temporal correlations synchronously, as characterized by the graphical structure. In this way, the correlation information is extracted according to their statistical dependence to achieve improved forecasting accuracy.

The framework of USGCN for state forecasting is shown in Fig. 2(b). Different from the USGCN for DSSE, it consists of two stacked spatiotemporal graph convolutional layers, three CNN layers, and a fully connected neural network layer. By stacking multiple spatiotemporal graph convolutional layers, USGCN can perceive spatiotemporal correlations across adjacent time steps as well as across more distant time steps.

Different from the previous works like [24] that generate the time series $h^{t}$ recursively through $T$ time steps, the USGCN can directly output $h^{(t+1):(t+T)}$ for all $T$ time steps as a whole. In this way, the problem of inconsistency between training and testing data, caused by the fact that a model learns to forecast a single time slot during the training process, can be addressed. Specifically, to achieve this, the temporal dimension of the last spatiotemporal layer is set to equal $T - 3$. In this way, the input of the USGCN for state forecasting is $h^{0} \in \mathbb{R}^{3N \times C \times (T-3)}$ and the output is $h^{out} \in \mathbb{R}^{N_{out} \times C'' \times (T-3)}$.
It is noted that the learning parameters of the spatiotemporal convolutional layers are designed to be shared across all time steps to achieve enhanced efficiency.
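
As an illustration of the Huber loss used for training in Section IV-A, the following Python sketch evaluates it over a flat list of predictions. It assumes the standard piecewise form (quadratic for absolute errors up to $\delta$, linear beyond) and averages over all entries rather than over the separate node, time, and channel indices of (13); the function name `huber_loss` and the sample values are hypothetical.

```python
def huber_loss(pred, truth, delta=1.0):
    """Mean Huber loss between two equally long sequences of floats.

    Errors with |e| <= delta are penalised quadratically (0.5 * e^2);
    larger errors are penalised linearly (delta * |e| - 0.5 * delta^2),
    which limits the influence of outliers.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for p, h in zip(pred, truth):
        err = abs(p - h)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * err - 0.5 * delta ** 2
    return total / len(pred)


# Assumed sample values: one exact prediction and one outlier.
loss_mixed = huber_loss([0.0, 3.0], [0.0, 0.0], delta=1.0)
loss_small = huber_loss([0.5], [0.0], delta=1.0)
```

In the first call the outlier with error 3 contributes only $1 \cdot 3 - 0.5 = 2.5$ rather than the $4.5$ a squared-error term would give, which is the robustness property the text attributes to this loss.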
V. CASE STUDY
A 11 kV 118-node distribution system with six feeders [33],
and a 130.8 kV 1746-node distribution system with 14 feed-
ers [34] are employed to investigate the performance of the
Fig. 3. Historical data of wind power and solar power output with each curve
proposed model. Since it is difficult to collect the historical depicting one day.
measurements and states of real power systems, the real wind,
and solar power output data are extracted from the 2012 Global
Energy Forecasting Competition, which is used to produce the
training and testing datasets. The Matpower is used to perform
the AC power flow equations to obtain the voltage magnitudes
(V m ) and the voltage phase angles (V a ) and produce measure-
ments. The measurement data includes the active/reactive power
injections and the active/reactive line power flows, which can be
collected from smart meters. The number of the measurement
data collected from the smart meters is set at 55% of the system
node number, which is available due to the proliferation of smart
meters. These measurement data are the input of the proposed
model, and the voltage magnitudes and phase angles are the output. The measurement noise is Gaussian with zero mean and a standard deviation of 1% of the expected measurement value. Specifically, the noise of each measurement collected by the same device takes the same error value across time steps. After this data preparation process, the data sizes of the 118-node and 1746-node distribution systems are set to about 1500 K and 20000 K, respectively, with 90% used as training data and 10% as test data. The training process is conducted on an NVIDIA GeForce RTX 3090 with 24 GB of memory, using Python as the programming language and running on Windows 10. The proposed model is implemented in Python 3.7 with PyTorch 1.7. To alleviate randomness in the obtained weight parameters, all proposed models are trained and tested independently 20 times.

1) Performance indexes of the proposed model: The performance of the proposed model is verified by comparing it with various state-of-the-art methods using the following evaluation indexes: Mean Absolute Error, $\mathrm{MAE} = \frac{1}{N_{sam}} \sum_{i=1}^{N_{sam}} |\hat{h}_i - h_i|$; Mean Absolute Percentage Error, $\mathrm{MAPE} = \frac{1}{N_{sam}} \sum_{i=1}^{N_{sam}} \left| \frac{\hat{h}_i - h_i}{h_i} \right| \times 100\%$; and Root Mean Square Error, $\mathrm{RMSE} = \sqrt{\frac{1}{N_{sam}} \sum_{i=1}^{N_{sam}} (\hat{h}_i - h_i)^2}$, where $\hat{h}_i$ denotes the value estimated by the proposed model or a baseline method, $h_i$ is the corresponding ground-truth value, and $N_{sam}$ is the number of samples.

2) Hyperparameters setting: The channels of the 2-D CNN are set at 4, 8, and 2, and the corresponding kernel sizes are set to (5,1) to ensure that the CNN can aggregate the information across 5 nodes. Three CNN layers are used to avoid overfitting. The dimension of the node embedding is set at 10. The batch size is set at 1000. The learning rate of the proposed model is set at 0.01 at the beginning and decays exponentially with an increasing number of epochs. The regularization parameter λ is set to $10^{-12}$. These hyperparameters are optimized by random search. The number of epochs is determined by observing the training and test error during the training process of the proposed model and is selected to be 10000.

3) Renewable Energy Sources setting: Historical data of the total output of several RES units, including wind turbine and photovoltaic outputs, are shown in Fig. 3. The rated output of one location with wind turbines is 0.06 MW, and the rated power of one photovoltaic location is 0.08 MW. The average correlation matrix of the wind power and photovoltaic power is depicted in Fig. 4, which shows a high positive correlation (close to 1) among wind turbine outputs and among photovoltaic outputs, respectively. The negative correlation coefficient between wind turbine output and photovoltaic output reflects the complexity of the prediction task.

Fig. 4. The average correlation coefficient of the RES output.

TABLE I
THE LOCATIONS OF RENEWABLE ENERGY

A. The Performance of USGCN in State Estimation
1) In Comparison With Traditional Methods: To investigate the impact of the correlation of measurement data on traditional distribution state estimation, the conventional DSSE methods Weighted Least Squares (WLS) [35] and Least Absolute Value (LAV) [36] are employed, both solved by the Gauss-Newton algorithm. The reference voltage phase angle is set at 0 rad.
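To make the conventional baseline concrete, the following is a minimal sketch of a WLS estimate solved by Gauss-Newton iterations. It is an illustration under stated assumptions, not the authors' implementation: the two-state measurement function `h`, its Jacobian, the starting point, and the noise-free demo measurements are all hypothetical stand-ins for the AC power flow measurement equations, and only the 1% standard deviation mirrors the setting described in the text.

```python
"""Toy Gauss-Newton solver for a weighted least squares (WLS) state estimate."""


def h(x):
    """Hypothetical nonlinear measurement function (NOT the AC power flow equations)."""
    a, b = x
    return [a, a * b, a * a + b * b]


def jacobian(x):
    """Analytic Jacobian of h with respect to the two state variables."""
    a, b = x
    return [[1.0, 0.0], [b, a], [2.0 * a, 2.0 * b]]


def gauss_newton_wls(z, sigma, x0, max_iter=20, tol=1e-10):
    """Minimise sum_k ((z_k - h_k(x)) / sigma_k)^2 by Gauss-Newton iterations."""
    x = list(x0)
    w = [1.0 / (s * s) for s in sigma]  # weights are inverse noise variances
    m = len(z)
    for _ in range(max_iter):
        r = [zk - hk for zk, hk in zip(z, h(x))]  # measurement residuals
        J = jacobian(x)
        # Gain matrix G = J^T W J and right-hand side g = J^T W r (a 2x2 system).
        G = [[sum(w[k] * J[k][i] * J[k][j] for k in range(m)) for j in range(2)]
             for i in range(2)]
        g = [sum(w[k] * J[k][i] * r[k] for k in range(m)) for i in range(2)]
        # Solve G dx = g in closed form for the 2x2 case.
        det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
        dx = [(G[1][1] * g[0] - G[0][1] * g[1]) / det,
              (G[0][0] * g[1] - G[1][0] * g[0]) / det]
        x = [x[0] + dx[0], x[1] + dx[1]]
        if max(abs(d) for d in dx) < tol:
            break
    return x


x_true = [1.02, -0.05]              # assumed "true" state for the demo
z = h(x_true)                       # noise-free measurements, for illustration only
sigma = [0.01 * abs(v) for v in z]  # 1% of the expected measurement value
x_hat = gauss_newton_wls(z, sigma, x0=[1.0, 0.0])
```

Each iteration rebuilds and solves the gain-matrix system, which is the repeated cost that the later timing comparison attributes to the traditional methods.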
Since the historical data of the wind and solar power outputs shown in Fig. 3 are correlated, state estimation with the traditional methods and the proposed USGCN is investigated both with and without correlations. The uncorrelated outputs of wind turbines and photovoltaics are generated from independent normal distributions with the rated power as the mean and 15% of it as the standard deviation. Then, similarly, the power flow is solved to produce the measurement data with 1% noise. The state estimation is conducted in 480 scenarios for WLS and LAV, respectively. The corresponding average MAE, MAPE, and RMSE results of the conventional methods and USGCN are listed in Table II. Without correlations, the traditional methods WLS and LAV yield relatively low MAE, MAPE, and RMSE. The evaluation results of USGCN are consistent with those of the traditional methods, which demonstrates the effectiveness of both the traditional methods and USGCN in state estimation. However, with correlations, the traditional methods give higher MAE and RMSE for the voltage phase angle than without correlations. Specifically, in the 118-node distribution system with correlations, the USGCN reduces the MAE of voltage magnitudes by 91.77% and 91.79% compared with WLS and LAV, respectively. The MAPE is also reduced by 91.82% and 91.82% compared with WLS and LAV. Besides, for the 1746-node distribution system, the USGCN also outperforms WLS and LAV by 85.64% and 85.64% in MAE, and by 85.21% and 85.21% in MAPE of voltage magnitudes, respectively. The RMSE of voltage magnitudes is also lower than that of the traditional methods. The MAE, MAPE, and RMSE of the voltage phase angles estimated by the USGCN, with or without correlations, are also lower than those of the traditional methods. This is due to the excellent ability of the USGCN in modeling complex spatiotemporal correlations.

TABLE II
THE EVALUATION INDEX OF USGCN COMPARED WITH TRADITIONAL METHODS WITH AND WITHOUT CORRELATIONS

The estimated voltage magnitudes and phase angles of node 121 of the 1746-node distribution system are shown in Fig. 5. The bias between the voltage magnitudes estimated from highly correlated measurements and the real value is larger than that obtained from uncorrelated measurements. Similarly, the deviation of the estimated voltage phase angle from the true value follows the same pattern. This demonstrates that the correlations brought by the dependent pattern of the RES output can significantly impact the distribution system state estimation function and thus have a negative influence on system operation. Besides, the voltage magnitudes and phase angles estimated by the USGCN are also depicted in Fig. 5. It intuitively shows that the voltage magnitudes and phase angles from the proposed USGCN have a smaller bias with respect to the corresponding real values than those from the traditional WLS and LAV methods.

Fig. 5. The estimated value of voltage magnitudes and phase angles of node 121 of the 1746-node distribution system.

Furthermore, the running times of the USGCN and the traditional methods for the 118-node and 1746-node systems are presented in Fig. 6, which shows that the computational times of WLS, LAV, and USGCN are 0.1758, 0.0624, and 0.0006 s for the 118-node system, and 0.0928, 0.5201, and 0.0064 s for the 1746-node system, respectively. For the 118-node system, the computation time of the USGCN is less than 1 ms, while that of the traditional methods is at least 60 ms, so the proposed model is about 100 times faster, indicating that it can be implemented in real time for distribution systems. This is because the proposed USGCN can learn the expectation of the posterior conditional distribution that describes the complex relationship between the measurements and the system states. In this way, the USGCN performs state estimation directly, unlike traditional methods that involve many time-consuming iterations, so the computational time is reduced. Therefore, the conventional techniques have limited performance in state estimation accuracy and efficiency, whereas the USGCN can handle the distribution system state estimation efficiently.

Fig. 6. The estimation time of USGCN and traditional methods.

TABLE III
THE EVALUATION INDEX IN STATE ESTIMATION OF USGCN IN COMPARISON WITH DATA-DRIVEN METHODS WITH CORRELATIONS

Fig. 7. The state estimation of voltage magnitudes and phase angles of node 15-30 of the 118-node distribution system at hour 12.
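The comparisons against WLS and LAV are expressed through the MAE, MAPE, and RMSE indexes defined in the experimental settings and through percentage reductions relative to a baseline. The sketch below shows one way to compute them. The sample values are made up for illustration and are not taken from the paper's tables, and the `reduction_percent` formula is an assumption about how the reported reductions are derived, since the text does not state it explicitly.

```python
import math


def mae(h_hat, h):
    """Mean Absolute Error between estimates h_hat and ground truth h."""
    return sum(abs(e - t) for e, t in zip(h_hat, h)) / len(h)


def mape(h_hat, h):
    """Mean Absolute Percentage Error, in percent (ground truth must be nonzero)."""
    return 100.0 * sum(abs((e - t) / t) for e, t in zip(h_hat, h)) / len(h)


def rmse(h_hat, h):
    """Root Mean Square Error between estimates h_hat and ground truth h."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(h_hat, h)) / len(h))


def reduction_percent(baseline_error, model_error):
    """Assumed definition: relative error reduction of a model versus a baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Illustrative values only (hypothetical, not from the paper).
h_true = [1.0, 2.0, 4.0]
h_baseline = [1.1, 1.8, 4.4]
h_model = [1.01, 1.98, 4.04]

baseline_mae = mae(h_baseline, h_true)
model_mae = mae(h_model, h_true)
mae_reduction = reduction_percent(baseline_mae, model_mae)
```

Note that MAPE divides by the ground-truth value, so it is well behaved for voltage magnitudes near 1 p.u. but can become very large for phase angles close to zero.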
2) In Comparison With Data-Driven Methods: To investigate the effectiveness of the proposed USGCN, the Fully Connected Network (FCN), Convolutional Neural Network (CNN), Graph Convolutional Network (GCN) [28], and Spatiotemporal Graph Convolutional Network (SGCN) are employed as the baseline data-driven methods. The FCN includes five fully connected layers with 1000, 2000, 1000, 500, and 118 neurons, respectively. The GCN consists of two graph convolutional layers, where the adjacency matrix with self-loops is formed from the distribution system topology. The CNN approach includes three 2-D convolutional layers, each with a kernel size of 5 and with channel sizes of 4, 8, and 2, respectively, followed by three fully connected layers with 1000, 500, and 118 neurons. The difference between USGCN and SGCN is that the latter has only one graph without splicing the spatial graphs of adjacent time steps, so that spatial and temporal correlations are considered separately. Note that the distribution system states are estimated based on the measurements at the current time step and the previous two time steps, such that here the sequence length is set to 22.

The state estimation results are listed in Table III. For the IEEE 118-node system, the proposed USGCN improves the voltage magnitude MAE by 82.18%, 92.48%, 71.82%, and 75.20% in comparison to FCN, GCN, CNN, and SGCN, respectively. Besides, USGCN also improves the voltage magnitude RMSE by 82.23%, 32.34%, 74.58%, and 78.43% compared with FCN, GCN, CNN, and SGCN, respectively. Moreover, for the 1746-node system, the USGCN reduces the voltage phase angle MAE by 74.62%, 96.58%, 95.24%, and 68.82% compared with FCN, GCN, CNN, and SGCN, respectively. This is because the USGCN can extract the spatiotemporal correlation features in the measurements to facilitate the state estimation and thus achieve a more accurate result. The estimated voltage magnitudes and phase angles in nodes 5-30 of the 118-node system are depicted in Fig. 7. The curves show that the voltage magnitudes and phase angles estimated by the proposed USGCN are more consistent with the real values than those of the baseline deep learning methods. This illustrates that the proposed USGCN model performs better in reducing state estimation errors.

Fig. 8. (a) Correlations of RES. (b) Adjacent matrix of topology. (c) Parameters in the unrolled spatiotemporal graph.

Furthermore, to intuitively present the correlations learned by the USGCN, the parameters in the self-adaptive unrolled spatiotemporal graph are depicted. Each element in this graph is calculated by the inner product of the corresponding node embeddings. Fig. 8(a) shows the correlations of RES. Fig. 8(b) denotes the adjacent matrix of topology. Fig. 8(c) represents the correlations learned by the self-adaptive unrolled spatiotemporal graph. In comparison
with the heatmaps in (a) and (b), (c) presents more complex correlations in the unrolled spatiotemporal graph model between nodes across time steps. This indicates that the correlation coefficient matrix and the adjacent matrix of topology cannot fully represent the complex correlations between nodes. However, the proposed node embedding can capture the correlations from the measurement data automatically. In this way, the state estimation accuracy can be improved.

Since the number of convolutional layers affects the state estimation results, different numbers of CNN layers are tested to verify their effect on the performance of the USGCN. Fig. 9 depicts the MAE and MAPE of state estimation for USGCN with different numbers of CNN layers. The gap between the results on the training data and the testing data increases with the number of CNN layers. This indicates that overfitting grows with the increasing number of CNN layers. Besides, three CNN layers achieve the lowest MAE and MAPE, so three CNN layers are chosen in the proposed model.

Fig. 9. The state estimation for a different number of CNN layers in USGCN.

3) The Performance of USGCN With Colored Noise: Since non-Gaussian measurement error is not rare [23], the performance of the USGCN with colored noise is investigated, with pink, blue, brown, and purple noise with a deviation of ±1% set as the cases. The corresponding results are listed in Table IV. In comparison with the results in Table III, the results show that the USGCN model can also achieve high accuracy with different colored noise in state estimation. This is because the self-adaptive unrolled graph model can learn the complex features from the measurements. Therefore, the proposed model can also adapt to non-Gaussian noise.

TABLE IV
THE RESULTS WITH DIFFERENT COLORED NOISE IN 118-NODE SYSTEM

B. The Performance of USGCN in State Forecasting
To investigate the effectiveness of the proposed USGCN, the FCN, CNN, GCN, Long Short-Term Memory (LSTM) [37], CNN-LSTM [38], and SGCN are employed as the baseline models. The parameters of FCN, CNN, and GCN are the same as above. The LSTM approach includes three layers. The CNN-LSTM approach combines CNN and LSTM. Note that state forecasting refers to the single step following the last observed time step, so that the sequence length is set to 21.

TABLE V
THE STATE FORECASTING EVALUATION INDEX OF USGCN COMPARED WITH TRADITIONAL METHODS WITH AND WITHOUT CORRELATIONS

Table V shows that the MAE, MAPE, and RMSE of voltage magnitudes and phase angles predicted by the proposed USGCN in the 118-node distribution system are lower than those of the baseline methods. This indicates that the proposed USGCN model is better than the other deep learning methods. Quantitatively, compared to the FCN, GCN, CNN, LSTM, CNN-LSTM, and SGCN methods, the MAE of predicted voltage magnitudes from USGCN is reduced by 11.7%, 27.3%, 56.9%, 77.8%, 92.2%, and 41.28%, respectively. Besides, the USGCN also reduces the MAPE of the voltage phase angle relative to these baseline methods by 51.8%, 59.3%, 51.6%, 76.4%, 82.8%, and 70.59%, respectively. The prediction results of voltage magnitudes and phase angles of node 55 are shown in Fig. 10. The state variables predicted by USGCN follow the real values more closely than those of the baseline methods. In particular, LSTM and CNN-LSTM can only capture the temporal trend of the system states, and their accuracy is relatively low. Besides, the SGCN captures the spatial and temporal correlations separately, leading to less accurate state forecasting results. This demonstrates the excellent performance of the USGCN in terms of prediction effectiveness, which is because the spatiotemporal correlation is fully captured by the unrolled spatiotemporal graph model to achieve enhanced accuracy. Furthermore, the predicted voltage from node 14 to node 30 at hour 12 is also shown in Fig. 11. This indicates that USGCN can give more accurate predicted values than the baseline methods, owing to the spatiotemporal correlations captured adaptively by the proposed model. Therefore, the proposed USGCN can handle the ahead-of-time DSSE effectively.

C. The Performance of USGCN in Large-Scale Distribution System
To investigate the performance of the USGCN in a large-scale distribution system, a 1746-node distribution system is employed with 18 locations deployed with RES units. Since such a system scale is too large to be learned by FCN and GCN
methods, and LSTM yields relatively poor MAE and MAPE in the 118-node system, the CNN is retained as the baseline method. The average prediction results of USGCN and the baseline methods are shown in Table VI. It shows that the MAE, MAPE, and RMSE of voltage magnitudes predicted by USGCN are lower than those of CNN by 86.26%, 86.08%, and 63.35%, respectively. Besides, the MAE, MAPE, and RMSE of voltage phase angles forecasted by the USGCN are also lower than those of SGCN by 76.92%, 53.87%, and 51.64%, respectively. This demonstrates that the proposed USGCN outperforms other data-driven methods for large-scale distribution system state estimation in terms of accuracy and efficiency.

TABLE VI
THE USGCN EVALUATION RESULTS IN 1746-NODE SYSTEM

The predicted voltage magnitudes and phase angles of node 500 of the 1746-node distribution system are depicted in Fig. 12. It also shows that the bias between the predicted and real values produced by the baseline methods is larger than that produced by the USGCN model. This reflects the consistency of the predicted state variables with the real values, with highly acceptable accuracy in a large-scale system. Therefore, the proposed USGCN can fully handle the large-scale DSSE in terms of effectiveness and efficiency.

Fig. 10. The forecasting results of voltage magnitudes and phase angles of node 55 in the 118-node distribution system.

Fig. 11. The forecasting results of voltage magnitudes and phase angles of node 14-30 of the 118-node distribution system at hour 12.

Fig. 12. The prediction results of voltage magnitudes and phase angles for bus 500 of the 1746-node distribution system.

VI. CONCLUSION
A novel USGCN model is proposed to timely estimate and forecast the states of distribution systems with high penetration of renewable energy that exhibits complex spatiotemporal correlations. Compared with the traditional methods, at least a 16.42% improvement in the accuracy of estimated states can be achieved both without and with correlations. This is because the innovative unrolled spatiotemporal graph model can capture the complex spatial correlations among measurements to achieve enhanced effectiveness. Besides, the USGCN also has a much shorter computation time than traditional DSSE methods. Furthermore, the USGCN also outperforms the state-of-the-art deep learning
methods with more accurate ahead-of-time state forecasts, owing to the long-term spatiotemporal correlations perceived by the multiple stacking of the USG model. The results show the tremendous potential of the USGCN in spatiotemporal state estimation for power system applications.

REFERENCES
[1] H. Wu, P. Dong, and M. Liu, “Random fuzzy power flow of distribution network with uncertain wind turbine, PV generation, and load based on random fuzzy theory,” IET Renewable Power Gener., vol. 12, no. 10, pp. 1180–1188, 2018.
[2] H. Wu, M. Wang, Z. Xu, and Y. Jia, “Graph attention enabled convolutional network for distribution system probabilistic power flow,” IEEE Trans. Ind. Appl., early access, Aug. 26, doi: 10.1109/TIA.2022.3202159.
[3] H. Wu, P. Dong, and M. Liu, “Distribution network reconfiguration for loss reduction and voltage stability with random fuzzy uncertainties of renewable energy generation and load,” IEEE Trans. Ind. Informat., vol. 16, no. 9, pp. 5655–5666, Sep. 2020.
[4] S. Chai, Z. Xu, Y. Jia, and W. K. Wong, “A robust spatiotemporal forecasting framework for photovoltaic generation,” IEEE Trans. Smart Grid, vol. 11, no. 6, pp. 5370–5382, Nov. 2020.
[5] E. Caro, A. J. Conejo, and R. Minguez, “Power system state estimation considering measurement dependencies,” IEEE Trans. Power Syst., vol. 24, no. 4, pp. 1875–1885, Nov. 2009.
[6] M. Shafiei, G. Nourbakhsh, A. Arefi, G. Ledwich, and H. Pezeshki, “Single iteration conditional based DSSE considering spatial and temporal correlation,” Int. J. Elect. Power Energy Syst., vol. 107, pp. 644–655, 2019.
[7] J. Zhao, G. Zhang, M. La Scala, and Z. Wang, “Enhanced robustness of state estimator to bad data processing through multi-innovation analysis,” IEEE Trans. Ind. Informat., vol. 13, no. 4, pp. 1610–1619, Aug. 2017.
[8] K. Dehghanpour, Z. Wang, J. Wang, Y. Yuan, and F. Bu, “A survey on state estimation techniques and challenges in smart distribution systems,” IEEE Trans. Smart Grid, vol. 10, no. 2, pp. 2312–2322, Mar. 2019.
[9] C. Muscas, M. Pau, P. A. Pegoraro, and S. Sulis, “Impact of input data correlation on distribution system state estimation,” in Proc. IEEE Int. Workshop Appl. Meas. Power Syst., 2013, pp. 114–119.
[10] G. Valverde, A. T. Saric, and V. Terzija, “Stochastic monitoring of distribution networks including correlated input variables,” IEEE Trans. Power Syst., vol. 28, no. 1, pp. 246–255, Feb. 2013.
[11] A. Ranković, B. M. Maksimović, A. T. Sarić, and U. Lukič, “ANN-based correlation of measurements in micro-grid state estimation,” Int. Trans. Elect. Energy Syst., vol. 25, no. 10, pp. 2181–2202, 2015.
[12] L. Dang, B. Chen, S. Wang, W. Ma, and P. Ren, “Robust power system state estimation with minimum error entropy unscented Kalman filter,” IEEE Trans. Instrum. Meas., vol. 69, no. 11, pp. 8797–8808, Nov. 2020.
[13] M. Shafiei, G. Ledwich, G. Nourbakhsh, A. Arefi, and H. Pezeshki, “One-step multilayer unbalanced three-phase DSE for LV distribution networks,” IEEE Syst. J., vol. 15, no. 3, pp. 4403–4412, Sep. 2021.
[14] J. Zhao, G. Zhang, Z. Y. Dong, and M. La Scala, “Robust forecasting aided power system state estimation considering state correlations,” IEEE Trans. Smart Grid, vol. 9, no. 4, pp. 2658–2666, Jul. 2018.
[15] B. Zargar, A. Angioni, F. Ponci, and A. Monti, “Multiarea parallel data-driven three-phase distribution system state estimation using synchrophasor measurements,” IEEE Trans. Instrum. Meas., vol. 69, no. 9, pp. 6186–6202, Sep. 2020.
[16] L. Wang, Q. Zhou, and S. Jin, “Physics-guided deep learning for power system state estimation,” J. Modern Power Syst. Clean Energy, vol. 8, no. 4, pp. 607–615, 2020.
[17] L. Zhang, G. Wang, and G. B. Giannakis, “Real-time power system state estimation and forecasting via deep unrolled neural networks,” IEEE Trans. Signal Process., vol. 67, no. 15, pp. 4069–4077, Aug. 2019.
[18] N. Bhusal, R. M. Shukla, M. Gautam, M. Benidris, and S. Sengupta, “Deep ensemble learning-based approach to real-time power system state estimation,” Int. J. Elect. Power Energy Syst., vol. 129, 2021, Art. no. 106806.
[19] L. Zhao et al., “T-GCN: A temporal graph convolutional network for traffic prediction,” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 9, pp. 3848–3858, Sep. 2020.
[20] A. Mohamed, K. Qian, M. Elhoseiny, and C. Claudel, “Social-STGCNN: A social spatio-temporal graph convolutional neural network for human trajectory prediction,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit., 2020, pp. 14412–14420.
[21] P. Ghosh, Y. Yao, L. Davis, and A. Divakaran, “Stacked spatio-temporal graph convolutional networks for action segmentation,” in Proc. IEEE/CVF Winter Conf. Appl. Comput. Vis., 2020, pp. 576–585.
[22] C. Song, Y. Lin, S. Guo, and H. Wan, “Spatial-temporal synchronous graph convolutional networks: A new framework for spatial-temporal network data forecasting,” in Proc. AAAI Conf. Artif. Intell., 2020, pp. 914–921.
[23] P. A. Pegoraro et al., “Bayesian approach for distribution system state estimation with non-Gaussian uncertainty models,” IEEE Trans. Instrum. Meas., vol. 66, no. 11, pp. 2957–2966, Nov. 2017.
[24] B. Yu, H. Yin, and Z. Zhu, “Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting,” in Proc. 27th Int. Joint Conf. Artif. Intell., 2018, pp. 3634–3640.
[25] L. Ruiz, F. Gama, and A. Ribeiro, “Gated graph recurrent neural networks,” IEEE Trans. Signal Process., vol. 68, pp. 6303–6318, 2020.
[26] H. Wu, Z. Xu, Y. Jia, and X. Xu, “Adaptive distributed graph model for multiple line outage identification in large-scale power system,” IEEE Syst. J., 2022.
[27] H. Wu, Z. Xu, J. Zhao, and S. Chai, “Gridtopo-GAN for distribution system topology identification,” IEEE Trans. Ind. Informat., early access, Mar. 2022, doi: 10.1109/TII.2022.3158614.
[28] T. N. Kipf and M. Welling, “Semi-supervised classification with graph convolutional networks,” in Proc. Int. Conf. Learn. Representations, 2017. [Online]. Available: https://arxiv.org/pdf/1609.02907.pdf
[29] S. Zhang, H. Tong, J. Xu, and R. Maciejewski, “Graph convolutional networks: A comprehensive review,” Comput. Social Netw., vol. 6, no. 1, pp. 1–23, 2019.
[30] Y. Li, R. Yu, C. Shahabi, and Y. Liu, “Diffusion convolutional recurrent neural network: Data-driven traffic forecasting,” in Proc. Int. Conf. Learn. Representations, 2018. [Online]. Available: https://openreview.net/forum?id=SJiHXGWAZ
[31] Z. Wu, S. Pan, G. Long, J. Jiang, and C. Zhang, “Graph wavenet for deep spatial-temporal graph modeling,” in Proc. 28th Int. Joint Conf. Artif. Intell. (IJCAI), Int. Joint Conf. Artif. Intell. Org., 2019, pp. 1907–1913.
[32] P. J. Huber, “Robust Estimation of a Location Parameter,” in Breakthroughs in Statistics. New York, NY, USA: Springer, 1992, pp. 492–518.
[33] I. Pena, C. B. Martinez-Anido, and B.-M. Hodge, “An extended IEEE 118-bus test system with high renewable penetration,” IEEE Trans. Power Syst., vol. 33, no. 1, pp. 281–289, Jan. 2018.
[34] H. Ahmadi and J. R. Martí, “Distribution system optimization based on a linear power-flow formulation,” IEEE Trans. Power Del., vol. 30, no. 1, pp. 25–33, Feb. 2015.
[35] M. Meriem et al., “Study of state estimation using weighted-least-squares method (WLS),” in Proc. Int. Conf. Elect. Sci. Technol. Maghreb, 2016, pp. 1–5.
[36] M. Göl and A. Abur, “LAV based robust state estimation for systems measured by PMUs,” IEEE Trans. Smart Grid, vol. 5, no. 4, pp. 1808–1814, Jul. 2014.
[37] T. Wu et al., “Hydrogen energy storage system for demand forecast error mitigation and voltage stabilization in a fast-charging station,” IEEE Trans. Ind. Appl., vol. 58, no. 2, pp. 2718–2727, Mar.–Apr. 2022.
[38] B. Yu, H. Yin, and Z. Zhu, “Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting,” in Proc. 27th Int. Joint Conf. Artif. Intell., 2018, pp. 3634–3640.

Huayi Wu (Student Member, IEEE) received the B.Eng. and M.S. degrees in electrical engineering from the South China University of Technology, Guangzhou, China, in 2016 and 2019, respectively, and the Ph.D. degree in electrical engineering from The Hong Kong Polytechnic University, Hong Kong, in 2022. She is currently a Postdoctoral Fellow working under Prof. Zhao Xu with the Department of Electrical Engineering, The Hong Kong Polytechnic University. Her research interests include artificial intelligence application in power engineering and power system optimal operation with renewables.
Zhao Xu (Senior Member, IEEE) received the B.Eng. degree from Zhejiang University, Hangzhou, China, in 1996, the M.Eng. degree from the National University of Singapore, Singapore, and the Ph.D. degree in electrical engineering from The University of Queensland, Brisbane, QLD, Australia, in 2006. He was an Assistant and then Associate Professor with the Department of Electrical Engineering, Technical University of Denmark, Kongens Lyngby, Denmark, during 2006–2010. Since 2010, he has been with The Hong Kong Polytechnic University, Hong Kong, where he is currently a Professor with the Department of Electrical Engineering and Leader of Smart Grid Research Area. He is also a Foreign Associate Staff of Centre for Electric Technology, Technical University of Denmark. His research interests include smart grid, renewable energy and applications of AI and Big Data analytics. He has extensive research project experiences involving collaborations with academia, industrial and business sectors. He was the recipient of several awards for research excellence including 2017 State Award in Nature Science from MOE, PR China. He is currently the Chairman of IEEE Hong Kong Joint Chapter of PES/IELS/IAS/PELS. He also holds editorships for several top journals, e.g., IEEE TRANSACTIONS ON SMART GRID.

Minghao Wang (Member, IEEE) received the B.Eng. (with Hons.) degree in electrical and electronic engineering from the Huazhong University of Science and Technology, Wuhan, China, and the University of Birmingham, Birmingham, U.K., in 2012, and the M.Sc. and Ph.D. degrees in electrical and electronic engineering from The University of Hong Kong, Hong Kong, in 2013 and 2017, respectively. Since 2018, he has been with the Department of Electrical Engineering, Hong Kong Polytechnic University, Hong Kong. He is currently a Research Assistant Professor with the Department of Electrical Engineering, The Hong Kong Polytechnic University. His research interests include power systems and power electronics.
