
Received: 16 February 2021 Accepted: 27 June 2021

DOI: 10.1002/2050-7038.13036

RESEARCH ARTICLE

A distribution network state estimation method based on distribution generation output mode discrimination

Zhaoyang Jin1 | Deyu Cai1 | Chen Wang2 | Lei Ding1

1 School of Electrical Engineering, Shandong University, Jinan, China
2 Weifang Power Supply Company of State Grid Corporation of China, Weifang, China

Correspondence
Deyu Cai, School of Electrical Engineering, Shandong University, 17923 Jingshi Road, Jinan, China.
Email: deyucai@sdu.edu.cn

Summary
In recent years, the penetration of distributed generation (DG) in distribution networks has been increasing, which significantly increases the uncertainty of state estimation. To tackle this problem, this paper proposes a state estimation method based on DG output mode discrimination. The historical output data of DG are analyzed offline by the k-means clustering algorithm, and the output modes are identified from the pre-state estimation results before the measurements are updated. The resulting pseudo measurements are then used to perform a secondary state estimation based on the model information and the measurements. The proposed method can effectively improve the state estimation of distributed generation buses, which is verified by MATLAB simulations on the PE&G-69 bus system.

KEYWORDS
distributed generator, distribution system, k-means cluster method, state estimation

1 | INTRODUCTION

In recent years, renewable energy technologies have developed rapidly, and abundant research results have been achieved.1 In the distribution network, most renewable energy takes the form of distributed generation (DG), such as small combined heat and power (CHP) units, photovoltaic power generation systems2 distributed on household roofs and in open spaces, and small-scale wind and hydro power plants.3 The emergence of DG has greatly changed the power flow pattern of the distribution network, and some branches even carry reverse power flow, which makes the operation and maintenance of the power grid considerably more difficult. To cope with the increasing penetration of DG in the distribution network,4 the concept of an active distribution network has gradually formed. An active distribution network is a shared distribution network with a flexible topology that actively manages distributed generation, energy storage equipment, and bi-directional customer load. Its purpose is to strengthen the capacity of the distribution network to accommodate renewable energy, improve its asset utilization rate, defer upgrade investment, and improve the power quality and supply reliability for users.

List of Symbols and Abbreviations: $\mathrm{cen}_j$, the jth cluster center; CHP, combined heat and power; $d_{70\%}$, the cluster radius in each cluster; $\overline{\mathrm{data}}$ and $\overline{\mathrm{data}}_j$, the mean values of all data and the jth cluster; $\mathrm{data}_i$, the ith data; DG, distributed generation; $d_{i,70\%}$, the $d_{70\%}$ corresponding to the ith cluster; $D_i$, the Euclidean distance from the pre-state estimation output state information $x_{DG}$ to the ith cluster center; ERL, exponential recovery load; trace($G^{-1}$), the trace of the inverse matrix of the gain matrix; GMM, Gaussian mixture model; GVF, goodness of variance fit; p.d.f., probability density function; PMU, phasor measurement unit; RMSE, root mean square error; RTU, remote terminal unit; SM, smart meter; var, the variance of all the data given; $\mathrm{var}_{\mathrm{sum}}$, the sum of the variances of the data in each cluster; WT, wind turbine; $x_{pre}$, the distributed generator output information obtained by the pre-state estimation; $x_{sm}$, the DG output estimation result at the time of SM data update; $x_t$ and $x_{t,\mathrm{true}}$, the estimated value and true value of a certain state quantity.

Int Trans Electr Energ Syst. 2021;31:e13036. wileyonlinelibrary.com/journal/etep © 2021 John Wiley & Sons Ltd. 1 of 11
https://doi.org/10.1002/2050-7038.13036
20507038, 2021, 11, Downloaded from https://onlinelibrary.wiley.com/doi/10.1002/2050-7038.13036 by South Dakota State University, Wiley Online Library on [06/03/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
2 of 11 JIN ET AL.

The daily operation and maintenance of the power system requires accurate perception of the real-time state of the
power system. In the power system, the system situational awareness is realized via state estimation. The research of
state estimation began in the 1970s. The transmission network is characterized by its high measurement redundancy.
Therefore, the main functions of state estimation include identifying and eliminating the bad data, and converting the
raw data with a certain amount of errors into reliable accurate data.5-8 In contrast, the distribution network has low
measurement redundancy. Hence, distribution network state estimation only has the latter function.9
There are usually only a small number of real-time measurements in distribution networks. At present, real-time measurements are deployed only on the important buses of the main feeder, and the network observability required for state estimation is generally achieved through pseudo measurements. The pseudo measurements are generated from an approximate load model built from customer billing data and electricity usage surveys. A real-time load modeling technique is proposed in Reference 10. The modeling method considers different kinds of customer load curves, makes full use of the billing information, and gives the corresponding uncertainty of the model. However, this method requires accurate daily load curves, whereas DG output is almost random and cannot be well described by a daily load curve. In References 11-14, considering the correlation between observable buses and unobservable buses, a pseudo measurement load model was constructed, and its uncertainty was modeled by a Gaussian mixture model (GMM). The underlying assumption of this method is that the uncertainty of the load can be described by a fixed probability density function (p.d.f.). However, the p.d.f. of DG output varies with working conditions, including solar irradiance, weather, temperature, and wind speed. In Reference 15, an exponential recovery load (ERL) model was constructed to approximate the dynamic process of load after a voltage disturbance. The shortcomings of this method are that it cannot predict the load behavior on time scales longer than that of the dynamic process (up to tens of milliseconds) and that it requires highly accurate real-time measurement devices such as PMUs, which are scarce in distribution networks. In Reference 16, a load estimation method based on the k-means clustering algorithm is proposed using data from smart meters (SMs).17 The load profiles are divided into different clusters and estimated separately. Although that work shows that the load estimated with the k-means method is more accurate than that obtained with conventional daily load curve-based methods, it only considers conventional load and neglects DG, which usually has different output modes under different working conditions.
The output of DG is highly intermittent and random, which introduces considerable uncertainty into distribution network state estimation, since its power output can hardly be described by a fixed p.d.f. In a distribution network with few real-time measurements, the accuracy of state estimation is very low and the real-time states of the system cannot be obtained accurately. In addition, the reporting interval of the measuring devices in current distribution networks is too long: the interval of the most numerous measuring device, the SM, is about 15 minutes,18,19 which cannot reflect the state of the system in time when the output mode of the distributed generation changes greatly within a short period. Moreover, the traditional pseudo measurement modeling methods cannot accurately summarize the load information for DG, as explained in the previous paragraph. Therefore, estimating the system state accurately between SM data updates remains an important open problem.
In this paper, a state estimation method based on DG output mode discrimination is proposed. In this method, the historical output data of DG are processed by the k-means clustering method and divided into different output modes. During state estimation, the current output mode of DG is identified from the results of a pre-state estimation. Then, a new pseudo measurement, generated by combining the mode information and the measurement information with certain weights, is used for a secondary estimation. Because pseudo measurements constitute the majority of the measurements in distribution networks, more accurate pseudo measurements allow the proposed method to improve the state estimation accuracy.
The structure of this paper is as follows: Section 2 introduces the k-means clustering algorithm and its implementation; Section 3 discusses the Gaussianity of the clustered data; Section 4 presents the proposed state estimation method based on the DG output mode; Section 5 shows the simulation results; Section 6 concludes the paper.

2 | K-MEANS CLUSTERING ALGORITHM

This section introduces the principles, advantages, and disadvantages of the k-means algorithm, and briefly describes its implementation.
The k-means clustering method, a classical partition-based clustering method, was proposed independently in different scientific fields by Steinhaus in 1955, Lloyd in 1957, Ball and Hall in 1965, and MacQueen in 1967.20-23

The k-means clustering method divides a set of data into k clusters by minimizing the within-cluster sum of squares. It is easy to implement, converges quickly, and handles large amounts of data efficiently, which makes it one of the most common and widely used clustering methods in data mining and knowledge discovery.
In the proposed method, clustering is achieved by minimizing the squared distance between each data point and the center of the cluster to which it belongs, and maximizing the squared distance between each data point and the centers of the other clusters. The n data points are divided into k clusters by the k-means method. The objective functions of clustering are as follows:
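$$J_{\mathrm{intra}} = \min\left( \sum_{j=1}^{k} \sum_{i=1,\, i \in j}^{n} \left\| \mathrm{data}_i - \mathrm{cen}_j \right\|^2 \right) \tag{1}$$

$$J_{\mathrm{inter}} = \max\left( \sum_{j=1}^{k} \sum_{i=1,\, i \notin j}^{n} \left\| \mathrm{data}_i - \mathrm{cen}_j \right\|^2 \right) \tag{2}$$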

where $\mathrm{data}_i$ is the ith data point, $\mathrm{cen}_j$ is the jth cluster center, and n and k are the numbers of data points and clusters, respectively. Note that the data can be obtained from one or several time series. The calculation steps of the k-means algorithm are shown in Figure 1. The calculation process assumes that the optimal number of clusters k is determined in advance. The general approach is to specify a maximum number of clusters $k_{\max}$, run the k-means clustering algorithm for each k in the range $k \in [2, k_{\max}]$, and then determine the optimal number of clusters with an evaluation index of clustering effectiveness.23

Begin
1. Input the data and the optimal number of clusters k.
2. Randomly select k values in the data as the initial cluster centers.
3. Calculate the distance from each data point to each cluster center.
4. Add each data point to the nearest cluster.
5. Calculate the average of all data in each cluster as the new cluster center.
6. If the iteration difference is less than the threshold, continue to step 7; otherwise, return to step 3.
7. Get the final clustering result.
Finish

FIGURE 1 Simplified process of k-means algorithm
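
The process in Figure 1 can be sketched in a few lines of Python for one-dimensional data. This is only an illustrative sketch, not the authors' MATLAB implementation: the function names `kmeans_1d` and `assign`, the tolerance `tol`, the iteration cap `max_iter`, and the fixed random seed are all assumptions introduced here.

```python
import random


def assign(data, centers):
    """Return, for each point, the index of its nearest cluster center."""
    return [min(range(len(centers)), key=lambda j: abs(x - centers[j])) for x in data]


def kmeans_1d(data, k, tol=1e-6, max_iter=100, seed=0):
    """Lloyd-style k-means for one-dimensional data (illustrative sketch)."""
    rng = random.Random(seed)
    # Randomly pick k values from the data as the initial cluster centers.
    centers = rng.sample(list(data), k)
    for _ in range(max_iter):
        labels = assign(data, centers)
        # Recompute each center as the mean of the points assigned to it.
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(data, labels) if lab == j]
            # An empty cluster keeps its previous center (a choice made for this sketch).
            new_centers.append(sum(members) / len(members) if members else centers[j])
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # iteration difference below the threshold
            break
    return centers, assign(data, centers)


if __name__ == "__main__":
    centers, labels = kmeans_1d([1.0, 1.2, 0.8, 10.0, 10.2, 9.8], k=2)
    print(sorted(centers), labels)
```

Because the initial centers are random, different seeds can lead to different local optima on real data; running the sketch several times and keeping the best result is a common remedy.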



In this section, k-means clustering is applied to the output data of distributed generators. Since the active and reactive power output data of distributed generators are one-dimensional, the goodness of variance fit (GVF), which is well suited to one-dimensional data, is selected to determine the optimal number of clusters. Its mathematical expression is:

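$$\mathrm{var} = \frac{1}{n} \sum_{i=1}^{n} \left( \mathrm{data}_i - \overline{\mathrm{data}} \right)^2 \tag{3}$$

$$\mathrm{var}_{\mathrm{sum}} = \sum_{j=1}^{k} \frac{1}{n_j} \sum_{i=1}^{n_j} \left( \mathrm{data}_i - \overline{\mathrm{data}}_j \right)^2 \tag{4}$$

$$\mathrm{GVF} = 1 - \frac{\mathrm{var}_{\mathrm{sum}}}{\mathrm{var}} \tag{5}$$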

where $\overline{\mathrm{data}}$ and $\overline{\mathrm{data}}_j$ are the mean values of all data and of the jth cluster, respectively, $n_j$ is the number of data points in the jth cluster, var is the variance of all the given data, and $\mathrm{var}_{\mathrm{sum}}$ is the sum of the variances of the data in each cluster. For a given data set, var is a constant, whereas $\mathrm{var}_{\mathrm{sum}}$ depends on the number of clusters k. The larger the number of clusters, the smaller $\mathrm{var}_{\mathrm{sum}}$ becomes, and the closer the GVF is to 1. When the number of clusters k equals the number of data points n, $\mathrm{var}_{\mathrm{sum}} = 0$ and GVF = 1.
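
As a rough illustration of (3) to (5), the following Python sketch computes the GVF for a given cluster assignment. The function name `gvf` and the representation of the clustering as a list of integer labels are assumptions made for this example only.

```python
def gvf(data, labels, k):
    """Goodness of variance fit following Eqs. (3)-(5) (illustrative sketch)."""
    n = len(data)
    mean_all = sum(data) / n
    var = sum((x - mean_all) ** 2 for x in data) / n  # Eq. (3)
    if var == 0:
        raise ValueError("all data points are identical, so the GVF is undefined")
    var_sum = 0.0
    for j in range(k):
        members = [x for x, lab in zip(data, labels) if lab == j]
        if not members:
            continue  # an empty cluster contributes no variance
        mean_j = sum(members) / len(members)
        # Variance of cluster j, added to the running sum of Eq. (4).
        var_sum += sum((x - mean_j) ** 2 for x in members) / len(members)
    return 1.0 - var_sum / var  # Eq. (5)


if __name__ == "__main__":
    data = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]
    print(gvf(data, [0, 0, 0, 1, 1, 1], k=2))  # two tight clusters: close to 1
    print(gvf(data, [0, 1, 2, 3, 4, 5], k=6))  # one point per cluster: exactly 1
```

Evaluating this function for each k in the tested range and plotting the values gives a GVF curve of the kind used to select the number of clusters.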
The optimal number of clusters is obtained by analyzing three months of offline output data from an actual DG unit in a distribution network. To demonstrate the effectiveness of the proposed method, a wind power generation system is considered, since its output uncertainty is much higher than that of a photovoltaic generation system. The resulting GVF curve is shown in Figure 2.
Figure 2 shows that the growth rate of the GVF slows down significantly once the number of clusters reaches 4. Therefore, in this simulation, the optimal number of clusters for wind turbine output is 4.

3 | ASSESSING THE GAUSSIANITY OF THE CLUSTERED DATA

This section discusses how to use the k-means clustering algorithm to cluster the one-dimensional output data of distributed generator buses.
The data source is the active power output of a wind power plant over the past 3 months. Based on the GVF curve in Figure 2, the optimal number of clusters for the wind farm active power output data is 4. Table 1 shows the cluster centers of the wind turbine output data after clustering. The value of each cluster center represents the average output of the DG bus within that cluster. The obtained cluster centers therefore give a more accurate summary of the corresponding output modes.

FIGURE 2 GVF of a wind farm

TABLE 1 Distribution center of the wind turbine output data clusters.

Power type          Cluster 1   Cluster 2   Cluster 3   Cluster 4
Wind turbine (kW)   256.31      75.22       483.16      797.89

The weighted least squares method used in state estimation requires the error of each measurement in the system to follow a Gaussian distribution. The proposed method uses the cluster center to represent an output mode of the distributed generators in the state estimation calculation. Data that do not coincide with the cluster center can be regarded as measurements composed of the cluster center plus a random error. Therefore, it is necessary to verify whether the data in each cluster follow a Gaussian distribution. This article uses the skewness and kurtosis of the data for this purpose. Skewness characterizes the asymmetry of the data relative to the cluster center. A skewness of 0 means that the data are distributed symmetrically on both sides of the cluster center. The mathematical expression of the skewness is24:

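$$\mathrm{skew} = \frac{\frac{1}{n_j} \sum_{i=1}^{n_j} \left( \mathrm{data}_i - \mathrm{cen}_j \right)^3}{\left[ \frac{1}{n_j} \sum_{i=1}^{n_j} \left( \mathrm{data}_i - \mathrm{cen}_j \right)^2 \right]^{3/2}} \tag{6}$$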

The kurtosis expresses the steepness of the distribution of the data and is judged by comparison with the Gaussian distribution. The absolute difference between the kurtosis and 3 indicates how far the steepness deviates from that of the Gaussian distribution. A value of 3 means that the distribution is as steep as the Gaussian distribution. A value greater than 3 means that the distribution is more sharply peaked than the Gaussian distribution and has heavier ("thick") tails; a value smaller than 3 means that it is flatter and has lighter ("thin") tails. Its mathematical expression is as follows:

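$$\mathrm{kurt} = \frac{\frac{1}{n_j} \sum_{i=1}^{n_j} \left( \mathrm{data}_i - \mathrm{cen}_j \right)^4}{\left[ \frac{1}{n_j} \sum_{i=1}^{n_j} \left( \mathrm{data}_i - \mathrm{cen}_j \right)^2 \right]^{2}} \tag{7}$$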

where $\mathrm{data}_i$ and $\mathrm{cen}_j$ belong to the same cluster, and $n_j$ is the number of data points in the cluster.
The skewness of the data in each cluster of the above-mentioned wind turbine (WT), calculated according to (6), is shown in Table 2.
The kurtosis of the data in each cluster of the WT, calculated according to (7), is shown in Table 3.
The ranges of skewness and kurtosis that conform to the Gaussian distribution are:

$$\mathrm{skew} \in \left[ -2\sqrt{6/n_j},\; 2\sqrt{6/n_j} \right] \tag{8}$$

$$\mathrm{kurt} \in \left[ 3 - 2\sqrt{24/n_j},\; 3 + 2\sqrt{24/n_j} \right] \tag{9}$$

TABLE 2 Skewness values in different clusters of wind turbine data

Power type     Cluster 1   Cluster 2   Cluster 3   Cluster 4
Wind turbine   0.2476      0.2252      0.3695      0.0463

TABLE 3 Kurtosis values in different clusters of wind turbine data

Power type     Cluster 1   Cluster 2   Cluster 3   Cluster 4
Wind turbine   2.6896      2.4930      2.5088      2.4610


From the data in Tables 2 and 3 and the standards in (8) and (9), the data in each cluster obey a Gaussian
distribution.
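
The Gaussianity check described by (6) to (9) can be sketched as follows. The function names `skew_kurt` and `looks_gaussian` are hypothetical, and the moments are taken about the supplied cluster center, as in (6) and (7).

```python
import math


def skew_kurt(cluster, center):
    """Skewness and kurtosis about the cluster center, as in Eqs. (6) and (7)."""
    n = len(cluster)
    m2 = sum((x - center) ** 2 for x in cluster) / n
    m3 = sum((x - center) ** 3 for x in cluster) / n
    m4 = sum((x - center) ** 4 for x in cluster) / n
    if m2 == 0:
        raise ValueError("cluster has zero spread about its center")
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def looks_gaussian(cluster, center):
    """Check the ranges of Eqs. (8) and (9) for one cluster."""
    n = len(cluster)
    skew, kurt = skew_kurt(cluster, center)
    skew_ok = abs(skew) <= 2 * math.sqrt(6 / n)       # Eq. (8)
    kurt_ok = abs(kurt - 3) <= 2 * math.sqrt(24 / n)  # Eq. (9)
    return skew_ok and kurt_ok


if __name__ == "__main__":
    symmetric = [-2, -1, -1, 0, 0, 0, 1, 1, 2]
    print(skew_kurt(symmetric, 0), looks_gaussian(symmetric, 0))
    lopsided = [0] * 99 + [100]
    print(looks_gaussian(lopsided, 1.0))
```

Note that the admissible ranges shrink as $n_j$ grows, so the same skewness value may pass for a small cluster and fail for a large one.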

4 | THE PROPOSED STATE ESTIMATION METHOD BASED ON DETERMINING THE DG OUTPUT MODE

After verifying that the data of each cluster follow a Gaussian distribution, each cluster can be regarded as an output mode of the DG and added to the state estimator. However, adding an output mode to the state estimation process requires the uncertainty associated with that mode. The cluster data corresponding to each output mode can be approximately considered Gaussian, and for a Gaussian distribution 68.3% ≈ 70% of the data lie in the interval $[\mu - \sigma, \mu + \sigma]$. Therefore, the weight used in the weighted least squares method can be expressed as:

$$\omega = \frac{1}{d_{70\%}^{2}} \tag{10}$$

where $d_{70\%}$ is 70% of the cluster radius in each cluster, and the cluster radius is the distance from the cluster center to the furthest data point (all data can be sorted by their distance from the cluster center). It has been verified that the numerical approximation obtained by the two selection methods has little effect on the state estimation accuracy.
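
A minimal sketch of the weight in (10) is given below. The text leaves some room for interpretation of $d_{70\%}$; this sketch assumes it is the distance from the cluster center within which 70% of the sorted cluster data fall. The alternative reading, 0.7 times the distance to the furthest point, would simply be `0.7 * max(dists)`. The function names `d70` and `wls_weight` are hypothetical.

```python
import math


def d70(cluster, center, fraction=0.7):
    """Distance from the center that covers `fraction` of the cluster data.

    Assumption for this sketch: d70% is read as a percentile of the sorted
    distances, not as 0.7 times the largest distance.
    """
    dists = sorted(abs(x - center) for x in cluster)
    # Index of the smallest sorted distance that covers the requested fraction;
    # the small offset guards against floating-point noise in fraction * n.
    idx = max(0, math.ceil(fraction * len(dists) - 1e-9) - 1)
    return dists[idx]


def wls_weight(cluster, center):
    """Weight of the mode pseudo measurement, Eq. (10)."""
    radius = d70(cluster, center)
    if radius == 0:
        raise ValueError("d70 is zero, so the weight is undefined")
    return 1.0 / radius ** 2


if __name__ == "__main__":
    cluster = [float(v) for v in range(1, 11)]  # distances 1..10 from center 0
    print(d70(cluster, 0.0), wls_weight(cluster, 0.0))
```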
With the Gaussianity of the clustered modes verified, the specific implementation of the method is described below. The method is based on a measurement system consisting of PMUs, RTUs, and SMs. Figure 3 shows the calculation process of the method.

FIGURE 3 The process of determining the state estimation method through the DG output model

One of the most important steps of the algorithm is to run a conventional static state estimator (SE) at the moment when all measurement information in the system is updated. Because the reporting interval of the SM is long, the SM measurement is assumed to remain unchanged between SM updates. The SM data are added to the state estimator to perform a pre-state estimation, from which the output state information $x_{DG}$ of the distributed generation bus is obtained. Note that if an output mode change of the DG is detected, the weight of the SM data needs to be reduced. The current output mode of the DG is determined by the Euclidean distance, which is calculated as:
$$D = \sqrt{\sum_{i=1}^{n} \left( \mathrm{data}_i - \mathrm{cen}_j \right)^2} \tag{11}$$

When the pre-state estimated output information satisfies (12), the current DG is considered to be in the corresponding output mode.

$$D_i \le d_{i,70\%} \tag{12}$$

where $D_i$ is the Euclidean distance from the pre-state estimation output state information $x_{DG}$ to the ith cluster center, and $d_{i,70\%}$ is the $d_{70\%}$ corresponding to the ith cluster. $d_{70\%}$ is adopted to make sure that the data lie reliably within the considered cluster. If no cluster satisfies the above relationship, the two clusters with the smallest Euclidean distances are taken and combined according to $z_{cen} = \frac{D_i}{D_i + D_j}\,\mathrm{cen}_i + \frac{D_j}{D_i + D_j}\,\mathrm{cen}_j$. Similarly, the variance becomes:

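$$\sigma_{cen}^{2} = \frac{D_i}{D_i + D_j}\, d_{i,70\%}^{2} + \frac{D_j}{D_i + D_j}\, d_{j,70\%}^{2} \tag{13}$$

$$\sigma_{cen}^{2} = \frac{D_i}{D_i + D_j}\, d_{i,70\%}^{2} + \frac{D_j}{D_i + D_j}\, d_{j,70\%}^{2} \tag{14}$$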
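
The mode discrimination rule of (11) to (13) can be illustrated with the short Python sketch below for a scalar DG output. The cluster centers are taken from Table 1, but the radii in `radii` are made-up values used only for illustration, and the function name `discriminate_mode` is hypothetical. When several clusters satisfy (12), the sketch picks the nearest one, which is an assumption not stated in the text.

```python
def discriminate_mode(x_dg, centers, d70s):
    """Return (z_cen, var_cen) for a pre-estimated DG output x_dg (sketch)."""
    # Eq. (11) reduces to an absolute difference for one-dimensional data.
    dists = [abs(x_dg - c) for c in centers]
    # Eq. (12): clusters whose 70% radius contains the pre-estimated output.
    matches = [i for i, (d, r) in enumerate(zip(dists, d70s)) if d <= r]
    if matches:
        i = min(matches, key=lambda m: dists[m])  # nearest match (assumption)
        return centers[i], d70s[i] ** 2
    # No cluster matches: combine the two nearest clusters as in the text.
    i, j = sorted(range(len(centers)), key=lambda m: dists[m])[:2]
    total = dists[i] + dists[j]
    w_i, w_j = dists[i] / total, dists[j] / total
    z_cen = w_i * centers[i] + w_j * centers[j]
    var_cen = w_i * d70s[i] ** 2 + w_j * d70s[j] ** 2  # Eq. (13)
    return z_cen, var_cen


if __name__ == "__main__":
    centers = [256.31, 75.22, 483.16, 797.89]  # kW, from Table 1
    radii = [40.0, 30.0, 50.0, 60.0]           # hypothetical d70 values
    print(discriminate_mode(260.0, centers, radii))  # inside cluster 1
    print(discriminate_mode(370.0, centers, radii))  # between clusters 1 and 3
```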

The pseudo measurement of the DG bus output is constructed from the mode determination result and the measurement $z_{sm}$ taken when the SM of the DG bus was last updated. It is constructed as follows:

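$$z_{DG} = \omega_{cen}\, z_{cen} + \omega_{sm}\, z_{sm} \tag{15}$$

where:

$$\begin{aligned} \omega_{cen} &= \rho - 0.1 \\ \omega_{sm} &= 1.1 - \rho \end{aligned} \tag{16}$$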

where ρ represents the relative change between the distributed generator output $x_{pre}$ obtained by the pre-state estimation and the DG output estimate $x_{sm}$ at the time of the last SM data update. The value 0.1 p.u. is the largest relative change for which the DG output is considered unchanged; it was obtained through a large number of simulations. The mathematical expression of ρ is:

$$\rho = \frac{x_{pre} - x_{sm}}{x_{sm}} \tag{17}$$

In this method, when $0.9\,x_{sm} \le x_{pre} \le 1.1\,x_{sm}$, the state of the DG bus can be considered approximately unchanged. In other cases, the mode information needs to be combined with the SM measurement from the last SM data update to form the pseudo measurement.
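
The construction in (15) to (17) can be sketched as below. Three choices here are assumptions of this sketch rather than statements from the paper: ρ is taken as an absolute relative change so that decreases in output are treated like increases, the SM value is returned unchanged inside the ±10% band, and the mode weight is clipped at 1 so that the SM weight never becomes negative. The function name `pseudo_measurement` and the parameter `threshold` are hypothetical.

```python
def pseudo_measurement(x_pre, x_sm, z_cen, z_sm, threshold=0.1):
    """Blend the mode center and the last SM value, Eqs. (15)-(17) (sketch)."""
    if x_sm == 0:
        raise ValueError("x_sm must be nonzero to compute the relative change")
    # Eq. (17); the absolute value is an assumption of this sketch.
    rho = abs(x_pre - x_sm) / abs(x_sm)
    if rho <= threshold:
        # Output considered unchanged: keep the last SM measurement.
        return z_sm
    # Eq. (16); clipping at 1 is an assumption that keeps both weights in [0, 1].
    w_cen = min(1.0, rho - threshold)
    w_sm = 1.0 - w_cen  # equals 1.1 - rho when no clipping occurs
    return w_cen * z_cen + w_sm * z_sm  # Eq. (15)


if __name__ == "__main__":
    # 5% change: the SM value is kept as the pseudo measurement.
    print(pseudo_measurement(105.0, 100.0, 256.31, 100.0))
    # 50% change: 40% weight on the mode center, 60% on the SM value.
    print(pseudo_measurement(150.0, 100.0, 256.31, 100.0))
```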

After the pseudo measurements of the DG bus are constructed, they are added to the measurement vector for the secondary state estimation, which finally yields the system state.

5 | SIMULATION RESULTS

This section presents the simulation settings and analyzes the results of the proposed state estimation method in comparison with the static state estimator and the GMM-based state estimator, a conventional distribution state estimator that considers only a fixed DG output mode. All simulations are performed in MATLAB. The majority of the measurements come from SMs, and only the important buses are equipped with RTUs and/or PMUs.17 In this paper, the phasor measurements from the PMUs are converted into rectangular coordinates. The standard deviation (SD) of the PMU measurement error is assumed to be 0.1% of the measured value. RTU measurements include voltage amplitudes, current amplitudes, branch power flows, and bus power injections. In this paper, only branch power measurements are used, and the SD of the RTU measurement error is 1% of the measured value. SM measurements include bus voltage amplitudes, branch power flows, and bus injection powers. For simplicity, this paper uses only bus injection power measurements, and the SD of the SM measurement error is 10% of the measured value. As for the update rate, it is assumed that the PMU measurements update every 0.1 second, the RTU measurements every 10 seconds, and the SM measurements every 15 minutes, and that state estimation is performed every 1 minute.
The PE&G-69 bus system is used to simulate the state estimation method based on DG output mode determination. The measurement configuration is shown in Table 4, and the single line diagram of the PE&G-69 bus system is presented in Figure 4.
Two small wind turbines are added at bus 27 and bus 69, and their output data and cluster centers are the same as those shown in Table 1 of Section 3. The comparison group uses the traditional static SE method, which is performed every 1 minute. Since the SM measurement cycle is 15 minutes, the traditional static SE method keeps the SM measurement unchanged between SM updates. To preserve the accuracy of the state estimation, the weight of the SM data in the state estimator is therefore reduced to 1/10 of the value used in the original state estimation calculation.
The proposed method is compared with the method that models the DG with the GMM in Reference 12. The active power estimates at DG buses 27 and 69 are shown in Figures 5 and 6, respectively. The figures show that, at both DG buses, the active power estimated by the proposed method is the closest to the true value, the static SE has the second highest estimation accuracy, and the GMM method has the lowest estimation accuracy. The reason is that the state estimation method using the GMM in Reference 12 depends on real-time measurements, so it performs worst when the number of real-time measurements is small. The proposed method can judge the output mode of the DG bus more accurately when there are few real-time measurements in the system, and thus obtains a more accurate estimation result.

TABLE 4 PE&G-69 bus system measurement configuration

Measuring device Configured bus


PMU 0, 12
RTU 2, 3
SM All injection buses in the system

FIGURE 4 Single line diagram of PE&G-69 bus system



FIGURE 5 Estimated injected power situation at bus 27

FIGURE 6 Estimated injected power situation at bus 69

To quantify the estimation accuracy of the DG output of the three methods, the root mean square error (RMSE) is used. The RMSE is calculated as:

$$R = \sqrt{\frac{1}{T} \sum_{t=1}^{T} \left( x_t - x_{t,\mathrm{true}} \right)^2} \tag{18}$$

where T is the total calculation time, and $x_t$ and $x_{t,\mathrm{true}}$ are the estimated value and true value of a certain state quantity, respectively. The quantified results are shown in Table 5.

TABLE 5 RMSE of the estimations on the active power of DG buses

Node number   Static SE      GMM method     K-means method
27            5.44 × 10^-4   1.10 × 10^-3   2.96 × 10^-4
69            1.10 × 10^-3   1.20 × 10^-3   5.21 × 10^-4

TABLE 6 Convergence comparison of the three methods

Method         Static SE      GMM method     K-means method
Trace (G^-1)   7.75 × 10^-7   8.57 × 10^-7   6.58 × 10^-7

TABLE 7 Estimation accuracy comparison of the three methods

                  Vm               Va       Im               Pi               Qi
Static method     1.9157 × 10^-4   0.0025   1.6821 × 10^-4   3.4732 × 10^-4   2.1954 × 10^-5
GMM method        1.91 × 10^-4     0.0026   1.70 × 10^-4     3.49 × 10^-4     2.28 × 10^-5
K-means method    1.91 × 10^-4     0.0016   1.29 × 10^-4     3.38 × 10^-4     1.79 × 10^-5

The results in Table 5 confirm the conclusions drawn from Figures 5 and 6. When few real-time measurements are configured, the proposed method can effectively improve the estimation accuracy of the voltage, power, and other states of DG buses.
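
For reference, the RMSE of (18) amounts to the following short Python function. The name `rmse` and the use of plain lists for the estimated and true time series are illustrative choices.

```python
import math


def rmse(estimates, truths):
    """Root mean square error over T time steps, Eq. (18)."""
    if not estimates or len(estimates) != len(truths):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((x - t) ** 2 for x, t in zip(estimates, truths))
    return math.sqrt(total / len(estimates))


if __name__ == "__main__":
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect estimate
    print(rmse([0.0, 0.0], [3.0, 4.0]))            # sqrt((9 + 16) / 2)
```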
The convergence of the estimators is measured by the trace of the inverse of the gain matrix, and the results are shown in Table 6.
Table 6 shows that, when real-time measurements are lacking, the proposed state estimation method based on k-means DG mode determination has the lowest value of trace($G^{-1}$) among the three methods, meaning that the proposed method has the best convergence.
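
To make the index concrete, the sketch below evaluates trace($G^{-1}$) for a toy two-state linear measurement model, assuming the standard weighted least squares gain matrix $G = H^{T} W H$ with a diagonal weight matrix W. The Jacobian `H`, the weights, and the function names are hypothetical and are not taken from the simulated system.

```python
def gain_matrix(H, weights):
    """G = H^T W H for a Jacobian H (one row per measurement) and diagonal W."""
    n = len(H[0])
    G = [[0.0] * n for _ in range(n)]
    for row, w in zip(H, weights):
        for a in range(n):
            for b in range(n):
                G[a][b] += w * row[a] * row[b]
    return G


def trace_inverse_2x2(G):
    """Trace of the inverse of a 2x2 gain matrix."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if det == 0:
        raise ValueError("gain matrix is singular")
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det,
    # so its trace is (a + d) / det.
    return (G[0][0] + G[1][1]) / det


if __name__ == "__main__":
    H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy Jacobian, two states
    print(trace_inverse_2x2(gain_matrix(H, [1.0, 1.0, 1.0])))
    print(trace_inverse_2x2(gain_matrix(H, [100.0, 100.0, 100.0])))
```

In this toy model, raising every weight by a factor of 100 lowers the trace by the same factor, which illustrates why more accurate (more heavily weighted) pseudo measurements reduce the index.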
The estimation accuracy of the system states is represented by the average RMSE values of the voltage magnitudes (Vm) and angles (Va), current magnitudes (Im), and active and reactive power injections (Pi and Qi, respectively); the results are shown in Table 7.
Table 7 shows that the proposed method reduces the average RMSE of Va, Im, Pi, and Qi relative to both reference methods, while the RMSE of Vm remains essentially the same.

6 | CONCLUSION

This paper has proposed a state estimation method based on the determination of the DG output mode using the k-means clustering method. The estimation method is divided into two steps. The first step is to establish several DG output modes from the historical output data of DG using the k-means method, and to determine the mode in which the DG is currently operating from the pre-state estimation results. In the second step, the measurement information and the output mode information are weighted according to the amount of change to construct a new pseudo measurement, which is used in the final state estimation to produce a more accurate estimate. The method is therefore particularly suitable for active distribution networks with few real-time measurements. The simulation results demonstrate that the proposed method has better estimation accuracy for the states related to the DG buses, and better convergence, than two existing state estimation methods, namely the static SE and the GMM method. Future research will focus on implementing the proposed method in a forecasting-aided state estimator, where not only the states of the DG buses but also the states of the other buses need to be predicted. In addition, parallel processing techniques will be adopted to deal with the large amount of historical data.

PEER REVIEW
The peer review history for this article is available at https://publons.com/publon/10.1002/2050-7038.13036.

DATA AVAILABILITY STATEMENT


Data available upon request.

ORCID
Zhaoyang Jin https://orcid.org/0000-0003-1876-2153

REFERENCES
1. Bao W, Ding L, Yin S, Wang K, Terzija V. Active rotor speed protection for DFIG synthetic inertia control. Paper presented at:
Mediterranean Conference on Power Generation, Transmission, Distribution and Energy Conversion; 2016.
2. Fang Z, Lin Y, Song S, Li C, Lin X, Chen Y. State estimation for situational awareness of active distribution system with photovoltaic
power plants. IEEE Trans Smart Grid. 2021;12(1):239-250. https://doi.org/10.1109/TSG.2020.3009571
3. Falaghi H, Singh C, Haghifam MR, Ramezani M. DG integrated multistage distribution system expansion planning. Int J Electr Power
Energy Syst. 2011;33(8):1489-1497.
4. Zhou W, Ardakanian O, Zhang H, Yuan Y. Bayesian learning-based harmonic state estimation in distribution systems with smart meter
and DPMU data. IEEE Trans Smart Grid. 2020;11(1):832-845. https://doi.org/10.1109/TSG.2019.2938733
5. Samuelsson O, Repo S, Jessler R, et al. Active distribution network: demonstration project ADINE. Paper presented at: 2010 IEEE
PES Innovative Smart Grid Technologies Conference Europe (ISGT Europe) IEEE; 2010.
6. Lu CN, Teng JH, Liu WHE. Distribution system state estimation. IEEE Trans Power Syst. 1995;10(1):229-240.
7. Baran ME, Kelley AW. State estimation for real-time monitoring of distribution systems. IEEE Trans Power Syst. 1994;9(3):1601-1609.
8. Baran ME, Kelley AW. A branch-current-based state estimation method for distribution systems. IEEE Trans Power Syst. 1995;10(1):
483-491.
9. Wu J, He Y, Jenkins N. A robust state estimator for medium voltage distribution networks. IEEE Trans Power Syst. 2013;28(2):1008-
1016.
10. Ghosh AK, Lubkeman DL. Load modeling for distribution circuit state estimation. IEEE Trans Power Deliv. 1997;12(2):999-1005.
11. Manitsas E, Singh R, Pal B, Strbac G. Modelling of pseudo-measurements for distribution system state estimation. Paper presented at:
SmartGrids for Distribution, IET-CIRED Seminar; 2008.
12. Liu Q, Ding L, Wang X, et al. Distribution system state estimation considering the uncertainty of DG output. Paper presented at:
2017 IEEE Conference on Energy Internet and Energy System Integration (EI2) IEEE; 2018.
13. Valverde G, Saric AT, Terzija V. Probabilistic load flow with non-Gaussian correlated random variables using Gaussian mixture models.
IET Gener Transm Distrib. 2012;6(7):701-709.
14. Valverde G, Saric AT, Terzija V. Stochastic monitoring of distribution networks including correlated input variables. IEEE Trans Power
Syst. 2013;28(1):246-255.
15. Zhang B, Ma L, Liu X, et al. A distribution system state estimation analysis considering the dynamic load effect. Paper presented at:
TENCON 2018 - 2018 IEEE Region 10 Conference IEEE; 2018.
16. Al-Wakeel A, Wu J, Jenkins N. k-means based load estimation of domestic smart meter measurements. Appl Energy. 2017;194:333-342.
https://doi.org/10.1016/j.apenergy.2016.06.046
17. Liu Y, Li J, Wu L. State estimation of three-phase four-conductor distribution systems with real-time data from selective smart meters.
IEEE Trans Power Syst. 2019;34(4):2632-2643. https://doi.org/10.1109/TPWRS.2019.2892726
18. Alimardani A, Therrien F, Atanackovic D, Jatskevich J, Vaahedi E. Distribution system state estimation based on nonsynchronized
smart meters. IEEE Trans Smart Grid. 2015;6(6):2919-2928.
19. Al-Wakeel A, Wu J, Jenkins N. State estimation of medium voltage distribution networks using smart meter measurements. Appl
Energy. 2016;184:207-218.
20. Abonyi J, Feil B. Cluster Analysis for Data Mining and System Identification. Heidelberg, Germany: Springer-Verlag; 2007.
21. Hannu O. Introduction to clustering large and high-dimensional data by Jacob Kogan. Int Stat Rev. 2007;75(3):434-435.
22. Gan G, Ma C, Wu J. Data Clustering: Theory, Algorithms, and Applications. Philadelphia, USA: SIAM; 2007.
23. Oyelade OJ, Oladipupo OO, Obagbuwa IC. Application of k-means clustering algorithm for prediction of students' academic
performance. Int J Inf Comput Sci Info Secur. 2010;7(1):292-295.
24. Joanes DN, Gill CA. Comparing measures of sample skewness and kurtosis. J R Stat Soc D. 1998;47(1):183-189.

How to cite this article: Jin Z, Cai D, Wang C, Ding L. A distribution network state estimation method based
on distribution generation output mode discrimination. Int Trans Electr Energ Syst. 2021;31(11):e13036.
https://doi.org/10.1002/2050-7038.13036
