
Using k-Means Clustering Method in Determination of the Optimal Placement of Distributed Generation Sources in Electrical Distribution Systems

Florina Scarlatache*, Gheorghe Grigoraş*, Gianfranco Chicco**, Gheorghe Cârţină*
*Technical University “Gheorghe Asachi” of Iasi, Romania **Politecnico di Torino, Italy
E-mail: flr_rotaru@yahoo.com

Abstract - This paper proposes a method that uses k-means clustering, based on operational characteristics (loss sensitivity factors and nodal voltage values), to determine the optimal location of distributed generation sources in an electrical distribution system. The method is tested on a real 20 kV rural distribution network with 91 nodes. In the optimal nodes of the analysed network, DG is installed at different power levels, considering the evolution in time of the energy losses. The results show that the methodology can be used to reduce the real power losses, to improve the nodal voltage values and, more importantly, to determine the optimal placement of DG without violating any of the system constraints under any operating conditions.

I. INTRODUCTION

The definition of distributed generation (DG) takes different forms in different contexts and countries, and according to the indications of different agencies. The International Energy Agency (IEA) defines distributed generation as generating plant serving a customer on-site or providing support to a distribution network, connected to the grid at distribution-level voltages. CIGRE defines DG as generation that is not centrally planned, not centrally dispatched at present, usually connected to the distribution network, and smaller than 50–100 MW. Other organizations, such as the Electric Power Research Institute, define distributed generation as generation from a few kilowatts up to 50 MW. According to the Standard IEEE 1547, distributed generation is an electric power source connected directly to the distribution network or at the customer side of the meter, with a maximum limit of 10 MVA. Thus, in general, DG refers to small-scale generation connected to Medium Voltage and Low Voltage grids, excluding the large production plants connected to the High Voltage grid. The term DG also implies the use of any modular technology that is sited throughout a utility's service area (interconnected to the distribution or sub-transmission system) to reduce the cost of service.

DG location is a key aspect emerging with the diffusion of local generation sources connected to the distribution networks. The incorporation of DG in the distribution systems means that the net power injected in a node can take either positive values (i.e., generation exceeds demand) or negative values (i.e., demand exceeds generation). The variety of DG sources implies that the amount and sign of the injected power vary in time. DG location therefore has to be considered by analysing the evolution of the electrical quantities in the time domain.

Studies have indicated that an inappropriate selection of the DG location may lead to system losses greater than the losses without DG [1][2]. A method capable of indicating the best solution for a given distribution network can be very useful for the system planning engineer. Due to high efficiency, small size, low investment cost, and modularity in the construction and exploitation of the units, DG has increasingly become an attractive alternative to network reinforcement and expansion. Therefore, the choice of the best location is one of the important issues in the implementation of DG sources in the distribution system.

With proper planning, the integration of DG sources in electrical distribution networks would enhance the network performance in terms of: voltage profile improvement, reduction in system losses, improved power quality and reliability of supply, reduction of the electricity cost, and the possibility to assist congestion management in transmission lines. It is therefore expected that DG sources will make an increasing contribution to supplying the electricity demanded by customers in the future [3]. In addition, the installation of DG can decrease the costs related to the transmission of electricity to remote places. Because DG units can use new and renewable energy, they can also help protect the environment. Interconnecting DG to the distribution feeders can have significant effects on the system in terms of power flow, voltage regulation and reliability. A DG installation changes the traditional characteristics of the distribution system. Most distribution systems are designed such that the power flows in one direction, whereas the installation of a DG introduces other sources in the system.

Determining the optimal location of DG units in distribution systems under various objective functions has been studied continuously in recent years, with many solutions proposed. The impact of DG sources on losses is an aspect of great interest, as shown by the number of studies performed on this subject. Most of these studies either concentrated on a specific case (feeder and DG plant connection) or developed a methodology to assess specific losses for particular scenarios of DG penetration. Regarding
the placement of DG sources, in the literature there are many solutions proposed, namely using fuzzy techniques [4][5], analytical techniques [6], heuristic optimization algorithms [7][8], mixed integer nonlinear programming [9], the Evolutionary Programming (EP) optimization technique [10], and others. In this sense, a new approach is proposed in this paper. Using the k-means clustering method, the optimal placement of DG in the electrical distribution network is determined based on the operational characteristics (the loss sensitivity factors (LSFs) and the nodal voltage magnitudes). In general, data analysis is based on the evaluation of several applications, either at the design stage or in the on-line management of complex processes. In such processes, the use of clustering techniques for representing data, measuring the similarity between the elements representative of a group, can be very suggestive and attractive. The k-means clustering method is tested by a series of simulations on a 20 kV rural distribution network to show its effectiveness in determining the optimal nodes where DG sources can be placed. Compared with the methods presented in [1][2] and [4], which used the same variables to find the optimal place for DG in a distribution network, the new method based on k-means clustering obtains the results very quickly and is particularly effective.

II. LOSS SENSITIVITY FACTORS

The sensitivity factor method applies the principle of linearization of the original non-linear equations around the initial operating point, reducing the dimension of the solution space. The loss sensitivity factor method has been used to solve some of the problems referring to distribution systems, for instance capacitor location, and can be successfully used to determine the optimal place of DG [1][2][4]. To calculate the loss sensitivity factors in a radial distribution network, it is necessary to know the real and reactive power loads and the real and reactive power losses in the network nodes and branches, which can be obtained from power flow calculations.

Let us consider a radial network in which each branch is numbered with the number of its receiving node (the one located at the opposite side with respect to the root node). The real power losses P_lineloss,b and the reactive power losses Q_lineloss,b of branch b, with resistance R_b and reactance X_b, are calculated with the following expressions [4]:

$$P_{lineloss,b} = \frac{R_b (P_b^2 + Q_b^2)}{V_b^2}, \qquad Q_{lineloss,b} = \frac{X_b (P_b^2 + Q_b^2)}{V_b^2} \tag{1}$$

where, indicating with “beyond node b” the portion of the network located at the opposite side of node b with respect to the root node:
• Pb is the sum of the real power loads of all the nodes beyond node b plus the real power load of node b itself plus the sum of the real power losses of all the branches beyond node b;
• Qb is the sum of the reactive power loads of all the nodes beyond node b plus the reactive power load of node b itself plus the sum of the reactive power losses of all the branches beyond node b;
• Vb is the voltage value at the end node b.

From the loss expressions in (1), the loss sensitivity factors (LSFs), that is, the derivatives of the real power loss P_lineloss,b with respect to the effective reactive power Qb, are determined with equation (2):

$$LSF = \frac{\partial P_{lineloss,b}}{\partial Q_b} = \frac{2 Q_b R_b}{V_b^2} \tag{2}$$

The loss sensitivity factors are then normalized to the [0, 1] range, with the largest sensitivity having a value of 1 and the smallest one having a value of 0.

The normalization of the LSFs is carried out by determining the LSF values obtained in the network nodes and the sums of the LSFs in all the feeders f of the network analysed [11]. The normalized loss sensitivity factors l, expressed with respect to the ending node of branch b in order to use the system nodes as a common reference with the normalized voltage magnitudes defined below, are calculated with the relation:

$$l = \frac{LSF - LSF_{min}}{LSF_{max} - LSF_{min}} \tag{3}$$

The normalized nodal voltage magnitudes u are obtained from the voltage magnitudes U at the network nodes and the nominal voltage Un of the network analysed:

$$u = \frac{U}{U_n} \tag{4}$$

This normalization assumes that the voltage magnitude U does not exceed the value Un in the network analysed (as in the case study presented here). More generally, the upper limit for the network voltages in normal conditions could be used as the normalizing factor.

The normalized LSFs and nodal voltage magnitudes are used as inputs to the k-means clustering method illustrated in Section III, which determines the nodes more suitable for DG installation.

III. K-MEANS CLUSTERING METHOD

Clustering methods have been used with success in different applications for distribution systems, like in
evaluation of the energy losses [12], to find the optimal placement of phasor measurement units (PMU) [13], or to find the optimal nodes to install DG sources, which is the method proposed in this paper. The k-means clustering is an algorithm to classify or to group objects, based on their attributes, into a number of groups K (a positive integer). The grouping is done by minimizing the sum of squares of distances between the data and the corresponding cluster centroid [12]:

$$\min\{E\} = \min\left\{ \sum_{i=1}^{K} \sum_{x \in C_i} d(x, z_i) \right\} \tag{5}$$

where zi is the centre of cluster Ci, while d(x, zi) is the Euclidean distance between points x and zi. Thus, the criterion function E attempts to minimize the distance of each point from the centre of the cluster to which the point belongs. More specifically, the algorithm begins by initializing a set of K cluster centres. Then, it assigns each object of the dataset to the cluster whose centre is the nearest, and recomputes the centres. The process continues until the centres of the clusters stop changing.

The steps of the algorithm are the following [14][15]:

Step 1. Choose K initial cluster centres z1(0), z2(0), …, zK(0), for instance, at random among the points to be analysed.

Step 2. At the k-th iterative step, distribute the samples {x} among the K clusters by using the relation:

$$x \in C_i^{(k)} \quad \text{if} \quad d(x, z_i^{(k)}) < d(x, z_j^{(k)}), \qquad j = 1, 2, \ldots, K;\ j \neq i \tag{6}$$

where Ci(k) denotes the set of samples whose cluster centre is zi(k).

Step 3. Compute the new cluster centres zi(k+1), i = 1, 2, …, K. The new cluster centre is given by equation (7), where ni is the number of objects in Ci(k).

$$z_i^{(k+1)} = \frac{1}{n_i} \sum_{x \in C_i^{(k)}} x, \qquad i = 1, 2, \ldots, K \tag{7}$$

Step 4. Repeat steps 2 and 3 until convergence is achieved, that is, until a pass through the training sample causes no new assignment. The final clusters in this algorithm depend on the choice of the initial cluster centres and on the value of K.

To define the optimal number of clusters Nc,opt the following algorithm can be used [15][16]:
1. Determine the maximum number of clusters Nc,max. It should be set to satisfy 2 ≤ Nc,max ≤ √n, where n is the number of clustered objects in the database.
2. For the set of objects in the database, apply the k-means clustering method with a given Nc (2 ≤ Nc ≤ Nc,max).
3. Evaluate the quality of the partition for the cluster structure obtained. In this paper, this is achieved through the silhouette global coefficient.
4. Increase the number of clusters Nc up to Nc,max to see if the k-means method finds a better grouping of the data (repeat steps 2 and 3).
5. Report the number of clusters corresponding to the optimal value of the silhouette global coefficient.

Using this approach each cluster can be represented by the so-called silhouette, which is based on the comparison of its tightness and separation. The silhouette validation technique calculates the silhouette width for each sample, the average silhouette width for each cluster and the overall average silhouette width for the total data set.

The average silhouette width is applied to evaluate the clustering validity and to determine the optimal number of clusters:

$$SC = \frac{1}{N_c} \sum_{j=1}^{N_c} S_j \tag{8}$$

where Sj is the silhouette local coefficient, defined as

$$S_j = \frac{1}{r_j} \sum_{i=1}^{r_j} s_i \tag{9}$$

in which si is the silhouette width index for the i-th object

$$s_i = \frac{b_i - a_i}{\max\{b_i, a_i\}} \tag{10}$$

and:
rj is the number of objects in cluster j;
ai is the mean distance between object i and the other objects of the same class j;
bi is the minimum mean distance between object i and the objects in the class closest to class j.

In (10), if object i is the single object of a cluster, then si = 0.

The following interpretation of the silhouette coefficient is proposed in [17]: 0.71 – 1.0, a strong structure has been found; 0.51 – 0.7, a reasonable structure has been found; 0.26 – 0.5, the structure is weak and could be artificial; ≤ 0.25, no substantial structure has been found.

IV. OPTIMAL DG ALLOCATION METHOD WITH CLUSTERING TECHNIQUES

Regarding the placement of DG sources, in the literature there are many solutions proposed, namely using fuzzy and real coded genetic algorithms, particle swarm optimization (PSO) and others.
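
As an illustration of Section III, the k-means steps 1 to 4 (equations (6) and (7)), the silhouette global coefficient (equations (8) to (10)) and the search for the number of clusters can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the initial centres are chosen with a deterministic farthest-first rule instead of the random choice that Step 1 mentions as one possibility, so that the result is reproducible.

```python
import math


def initial_centres(points, k):
    """Step 1, with a deterministic farthest-first choice (an assumption;
    the text only suggests picking centres, for instance, at random)."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centres)))
    return centres


def kmeans(points, k, max_iter=100):
    """Plain k-means on a list of tuples; returns (labels, centres)."""
    centres = initial_centres(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each point to its nearest centre, equation (6)
        new_labels = [min(range(k), key=lambda i: math.dist(p, centres[i]))
                      for p in points]
        if new_labels == labels:      # Step 4: no new assignment, stop
            break
        labels = new_labels
        # Step 3: each centre becomes the mean of its members, equation (7)
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:               # an emptied cluster keeps its old centre
                centres[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centres


def silhouette_global(points, labels):
    """SC of equation (8); needs at least two non-empty clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    local = []
    for lab, members in clusters.items():
        widths = []
        for i in members:
            if len(members) == 1:     # single-object cluster: s_i = 0
                widths.append(0.0)
                continue
            a = mean_dist(i, members)
            b = min(mean_dist(i, other)
                    for o, other in clusters.items() if o != lab)
            denom = max(a, b)
            widths.append((b - a) / denom if denom > 0 else 0.0)  # equation (10)
        local.append(sum(widths) / len(widths))                   # equation (9)
    return sum(local) / len(local)                                # equation (8)


def best_cluster_count(points, nc_max):
    """Sweep Nc = 2..Nc,max and keep the Nc with the largest SC."""
    scores = {}
    for nc in range(2, nc_max + 1):
        labels, _ = kmeans(points, nc)
        scores[nc] = silhouette_global(points, labels)
    return max(scores, key=scores.get), scores
```

In the setting of this paper, `points` would hold one pair (l, u) per network node, and `nc_max` would follow the bound on Nc,max given in Section III. Note that the silhouette global coefficient here averages the per-cluster means, as in equation (8), rather than averaging over all objects.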
To determine the optimal placement of DG sources in an electrical distribution system we propose a new solution, with the following steps:
1. Regime (steady-state power flow) calculations. In this step the real and reactive power loads, the real and reactive power losses in nodes and branches, and the variations of the voltage values in the nodes are calculated.
2. The second step is to determine the loss sensitivity factors (LSFs) in the nodes of the electrical system analysed. Relation (2) is used to calculate these coefficients. The normalized LSFs and voltages are then computed from equations (3) and (4).
3. The k-means clustering method is used to group the nodes from the viewpoint of the operational characteristics (l and u).
4. Based on the clusters obtained in the previous step, the pilot node where DG sources can be located is established for each cluster.

The choice of the type of DG source is a key aspect. The possible set of locations of each type of DG plant is restricted to the locations at which the corresponding resource is available. This aspect is dealt with at the time of data definition.

V. CASE STUDY

To show the capability of the proposed method to solve the problem of the optimal placement of DG sources, a real 20 kV rural distribution network with 91 nodes has been considered (Fig. 1).

Fig. 1. The schematic diagram of the 20 kV rural distribution system analysed.

For this network, information is available about the lines (length, number of transformer points and circuit type), which made it possible to perform the regime calculations. Based on the parameters computed in the regime calculations step, the loss sensitivity factors (LSFs) were determined, and the LSFs and the voltage values in the nodes were then normalized to be used in the clustering process. The algorithm for determining the optimal number of clusters was then applied. In the first step, the maximum number of clusters Nc,max was calculated (Nc,max ≤ 9). In the next step, for the set of feeders from the database, the k-means clustering method with a given Nc (2 ≤ Nc ≤ Nc,max) was used. In the third step, the silhouette global coefficient was calculated to assess the partition quality. The results are presented in Fig. 2.

Fig. 2. The silhouette global coefficient for different values of the number of clusters Nc.

As shown in Fig. 2, the criterion gives acceptable results with Nc = 5. For this value, the silhouette plot is presented in Fig. 3.

Fig. 3. The silhouette plot for Nc = 5.

The nodes that belong to each cluster of the real rural distribution network tested are presented in Fig. 4. Each cluster is represented by a pilot node that is characterized by the average group values of the normalized LSF, l, and of the voltage, u, the parameters that were used in the clustering process.
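
The preparation of the clustering inputs used in the case study (equations (2) to (4)) can be sketched as follows. This is a minimal illustration, assuming that the branch quantities Qb, Rb and Vb are already available from the power flow calculation and are expressed in mutually consistent units; the function names and the sample numbers are hypothetical and are not taken from the network analysed.

```python
def loss_sensitivity(q_b, r_b, v_b):
    """Equation (2): LSF = 2 * Q_b * R_b / V_b**2 for one branch b."""
    return 2.0 * q_b * r_b / v_b ** 2


def normalise_lsf(lsf_values):
    """Equation (3): min-max scaling, largest LSF -> 1, smallest -> 0."""
    lo, hi = min(lsf_values), max(lsf_values)
    if hi == lo:                      # all branches equally sensitive
        return [0.0 for _ in lsf_values]
    return [(x - lo) / (hi - lo) for x in lsf_values]


def normalise_voltage(u_values, u_nominal):
    """Equation (4): u = U / Un."""
    return [u / u_nominal for u in u_values]


# Hypothetical data for three branches: (Q_b, R_b, V_b) in per-unit values
branches = [(0.5, 0.1, 1.0), (0.2, 0.3, 0.98), (0.1, 0.2, 0.97)]
lsf = [loss_sensitivity(q, r, v) for q, r, v in branches]
l = normalise_lsf(lsf)
u = normalise_voltage([v for _, _, v in branches], 1.0)

# One (l, u) pair per node: the two features fed to the clustering step
features = list(zip(l, u))
```

Each entry of `features` corresponds to the receiving node of one branch, in line with the convention of Section II that branches carry the number of their receiving node.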
Table I shows the average group values of the normalized LSF, l, and of the voltage, u, and the representative node for each cluster.

Fig. 4. The nodes that belong to each cluster of the 20 kV network analysed.

TABLE I
THE AVERAGE CLUSTER VALUES OF THE NORMALIZED LSF AND VOLTAGE VALUES, AND THE PILOT NODE FOR EACH CLUSTER

Cluster no.   No. of feeders   Pilot node   l (mean)   u (mean)
1             55               19           0.0017     0.9676
2             15               65           0.0019     0.9916
3             12               80           0.0234     0.9679
4             4                30           0.1145     0.9779
5             4                -            0.035      0.9915

Cluster no. 5 presents a normalized LSF of medium to maximum value and a normalized voltage of high value; the nodes included in this cluster are situated close to the root node and are not of interest for installing DG sources. In the pilot nodes of the other clusters, namely 19, 30, 65 and 80, DG sources of different sizes were installed to analyse the evolution of the power losses and the voltage variations in the 20 kV rural distribution network. The DG types considered in the area are combined heat and power (CHP), photovoltaic systems (PV), and small hydro plants (SH, in a restricted set of nodes, from node 25 to node 33, and on the main feeder from node 66 to node 84). Table II presents the representative node for each cluster, the type of DG source and the levels of power injected in these nodes, in order to analyse the evolution of the losses when DG sources are installed in some pilot nodes of the network.

TABLE II
THE ALTERNATIVE DG SIZES TESTED IN THE 20 KV DISTRIBUTION SYSTEM NODES

Pilot node   Distributed generation type   Levels of power injected [kW]
19           CHP                           300, 400, 500, 600
65           PV                            50, 100, 150, 200
80           SH                            300, 400, 500, 600
30           SH                            100, 200, 300, 400

The best solution for the levels of power injected in the pilot nodes of the 20 kV rural distribution network is presented in Fig. 5. Thus, 600 kW are injected in nodes 19 and 80 using combined heat and power sources (CHP) and a small hydro power plant (SH), respectively, 400 kW are injected in node 30 from a small hydro power plant (SH), and 200 kW are injected in node 65 using photovoltaic systems (PV).

In the base case (no DG sources installed) the real power losses are 0.19 MW, and in the best case the real power losses are 0.13 MW in the real 20 kV rural distribution network.

Fig. 5. Best solution for DG sizes in the 20 kV rural distribution network.

The active power losses in the network analysed, for the different cases in which different values of power were injected in the pilot nodes, are shown in Fig. 6.

Fig. 6. The active power losses in the different cases analysed in the 20 kV distribution network.

The evolution of the voltage variation in the 20 kV rural distribution network nodes in the base case (no DG sources installed) and in the best case is presented in Fig. 7. Blue
colour is used to represent the voltage profile in the network nodes in the base case and red colour for the best case. An improvement of the voltage values in the 20 kV distribution system is observed where the DG sources are installed. The node voltage does not exceed the rated voltage in any node, and this result is consistent with using the rated voltage for voltage normalization.

Fig. 7. Voltage values evolution in 20 kV rural distribution network nodes.

VI. CONCLUSIONS

The methodology described in this paper determines the nodes for the optimal placement of DG sources based on the k-means clustering method.

Optimization is based on searching for the best clustering solutions by using the operational characteristics of the network nodes (normalized factors l and u) as clustering variables. The silhouette factor is considered as the relevant entry for determining the optimal number of clusters. The proposed combination of clustering variables and k-means application allows determining the set of nodes where DG sources can be successfully installed in the network.

The optimal DG placement obtained has operational effects on the improvement of the voltage levels and the reduction of the power losses. In the case studied, using this methodology the losses in the 20 kV rural distribution network were decreased by 30% of the initial power losses (considering as base case the one with no DG sources installed) and the voltage profile in the network nodes was significantly improved. In the solution obtained, all DG units are exploited at their maximum sizes. This result indicates that the network studied has not yet reached the maximum levels of DG penetration (after which the total losses could increase and the voltage profile could reach excessively high values in some nodes), and leaves room for further deployment of local generation sources in that network.

REFERENCES

[1] J. A. Greatbanks, D. H. Popovic, M. Begovic, A. Pregelj, and T. C. Green, “On optimization for security and reliability of power systems with distributed generation,” Proc. IEEE Bologna Power Tech Conf., Bologna, Italy, Jun. 2003.
[2] D. H. Popovic, J. A. Greatbanks, M. Begovic, and A. Pregelj, “Placement of distributed generations and reclosers for distribution network security and reliability,” Int. J. Elect. Power Energy Syst., Vol. 27, No. 5–6, pp. 398–408, Jun. 2005.
[3] V. Calderaro, A. Piccolo and P. Siano, “Maximizing DG penetration in distribution networks by means of GA based reconfiguration,” Future Power Systems International Conference, Amsterdam, The Netherlands, 18 November 2005.
[4] R. Varikuti and M. Damodar Reddy, “Optimal placement of DG units using fuzzy and real coded genetic algorithm,” Journal of Theoretical and Applied Information Technology, Vol. 7, No. 2, pp. 145–151, 2009.
[5] M. T. Ameli, V. Shokri and S. Shokri, “Using fuzzy logic & full search for distributed generation allocation to reduce losses and improve voltage profile,” International Conference on Computer Information Systems and Industrial Management Applications (CISIM), pp. 626–630, 2010.
[6] C. Wang and M. H. Nehrir, “Analytical approaches for optimal placement of distributed generation sources in power systems,” IEEE Transactions on Power Systems, Vol. 19, No. 4, pp. 2068–2076, Nov. 2004.
[7] G. Celli and F. Pilo, “MV network planning under uncertainty on distributed generation penetration,” Proc. IEEE PES Summer Meeting, Jul. 2001, Vol. 1.
[8] G. Carpinelli, G. Celli, F. Pilo and A. Russo, “Distributed generation siting and sizing under uncertainty,” in Proc. IEEE Porto Power Tech Conf., Porto, Portugal, Sep. 2001.
[9] Y. M. Atwa, E. F. El-Saadany, M. M. A. Salama and R. Seethapathy, “Optimal renewable resources mix distribution system energy loss minimization,” IEEE Transactions on Power Systems, Vol. 25, No. 1, February 2010.
[10] S. R. A. Rahim, T. K. A. Rahman, I. Musirin, S. A. Azmi, M. F. Mohammed, M. H. Hussain and M. Faridun, “Comparing the network performance between the installation of DG and compensating capacitor using EP,” International Journal of Power, Energy and Artificial Intelligence, Vol. 1, pp. 14–20, August 2008.
[11] Fl. Rotaru, G. Chicco, G. Grigoras and G. Cartina, “Two-stage distributed generation optimal sizing with clustering-based node selection,” Electrical Power Systems & Energy Systems, in press.
[12] G. Grigoras, G. Cartina and Fl. Rotaru, “Using k-Means Clustering Method in Determination of the Energy Losses Levels from Electric Distribution Systems,” World Scientific and Engineering Academy and Society (WSEAS), Section: Mathematical Methods and Computational Techniques in Electrical Engineering, pp. 52–56, ISSN: 1792-5967, ISBN: 978-960-474-238-7, Timisoara, 2010.
[13] G. Grigoras, G. Cartina and M. Gavrilas, “Using of Clustering Techniques in Optimal Placement of Phasor Measurements Units,” Proceedings of the 9th WSEAS/IASME International Conference on Electric Power Systems, High Voltage, Electric Machines, ISSN: 1790-5117, ISBN: 978-960-474-130-4, pp. 104–109, October 2009.
[14] G. Cartina, G. Grigoras and E. C. Bobric, “Tehnici de clustering in modelarea fuzzy. Aplicatii in electroenergetica” (in Romanian), Publishing House Venus, Iasi, 2005.
[15] S. Ray and R. H. Turi, “Determination of Number of Clusters in K-Means Clustering and Application in Colour Image Segmentation,” Available: www.csse.monash.edu.au/~roset/papers/cal99.pdf.
[16] I. Yatskiv and L. Gusarova, “The Methods of Cluster Analysis Results Validation,” Proceedings of International Conference RelStat’04, Vol. 6, No. 1, pp. 75–80, 2005.
[17] P. J. Rousseeuw, “Silhouettes: a Graphical Aid to the Interpretation and Validation of Cluster Analysis,” Journal of Comp. Appl. Math., Vol. 20, 1987.
