Clustering Load Profiles for Demand Response Applications

Shunfu Lin, Member, IEEE, Fangxing Li, Fellow, IEEE, Erwei Tian, Yang Fu, Dongdong Li, Member, IEEE

This research was supported by the National Natural Science Foundation of China (51207088); the Science and Technology Commission of Shanghai (14DZ1201602); the Shanghai Green Energy Grid Connected Technology Engineering Research Center (13DZ2251900); and the Shanghai Municipal Education Commission (15SG50).
S. Lin, Y. Fu (corresponding author) and D. Li are with the College of Electrical Engineering, Shanghai University of Electric Power, Shanghai, 200090 China (e-mail: shunfu.lin@163.com).
E. Tian is with the State Grid Zhejiang Zhuji Electric Power Company, Zhuji, Zhejiang, 311800 China.
F. Li is with the Department of EECS, The University of Tennessee, Knoxville, TN, 37996 USA (e-mail: fli6@utk.edu).

Abstract--With the development of smart grid technologies, residential and commercial loads have large potential to participate in demand response (DR) programs. This makes data dimension reduction and classification techniques critical to the success of DR development. A novel load profile clustering method is proposed for load data classification, based on information entropy (IE), piecewise aggregate approximation (PAA) and spectral clustering (SC). A variable temporal resolution technique is presented to model typical daily load datasets, and an improved spectral clustering based on multi-scale similarities of distance and shape characteristics is then proposed to obtain a reasonable load classification. A case study analyzing one hundred commercial heating, ventilation and air conditioning (HVAC) units illustrates the approach. The results show that the proposed method performs well in terms of data dimension reduction, reasonable profile selection and classification, and operational stability.

Index Terms--demand response, piecewise aggregate approximation, information entropy, multi-scale similarities, spectral clustering.

I. INTRODUCTION

Demand response is one of the key technologies in the smart grid: it contributes to reducing peak loads and reshaping load profiles, thereby saving additional investments in costly standby generation units. Many studies have addressed residential and commercial loads participating in DR programs [1]-[3]. Typical load profiles are important to assess the schedulable load capacity, to develop price-based or incentive-based DR programs, and to decide the scheduling scheme. Load classification separates an enormous number of load profiles into several typical clusters. In recent years, researchers have proposed a variety of clustering methods.

Several kinds of clustering methods have been used to process load profiles, such as partition-based algorithms including fuzzy C-means and K-means, hierarchical clustering, network-based clustering with self-organizing maps, density-based clustering, and model-based clustering [4]-[6]. With the development of data mining technologies, new clustering methods have emerged for classifying electricity consumption patterns. To acquire the optimal number of clusters, [7] proposes a clustering method based on ant colony optimization, combining the clustering algorithm with optimization theory. Reference [8] presents an effective application of support vector clustering to electrical load profile clustering analysis. Hierarchical clustering has high accuracy but low efficiency, while partitioning clustering has high efficiency but low accuracy, as shown in [9], which therefore proposes an ensemble algorithm combining hierarchical and partitioning clustering. However, load shape variability, which reflects customers' different behaviors and characteristics, is essential in load profiling. Traditional clustering methods based on the Euclidean distance measure have two disadvantages: 1) they cannot recognize shape patterns, because they only consider point-to-point distances and ignore piecewise trend information [10]; in other words, they assume that the order of the data points is unimportant, return the same result when the time points are permuted, and may therefore lose important information about the profile shape pattern; 2) the clustering of load profiles has to consider all dimensions of the dataset and thus focuses on global properties, and as the number of dimensions grows, distance-based similarity metrics become less and less meaningful [11].

The results of clustering depend on both the algorithm and the resolution of the data [12]. However, few clustering methods consider the effect of data granularity on the performance of power demand profile analysis. Processing the raw data directly is inefficient, since the data may be very large and contain many redundant details. Traditionally, the approximation of time-series load profile data uses a fixed temporal resolution, usually 15, 30 or 60 minutes, according to clustering experiments [13]. It is essential to determine a resolution that trades off the level of detail representing the characteristics of the load profiles against the need to process the data. Several methods have been proposed to deal with the problem of temporal resolution, for example principal component analysis, the Sammon map, the self-organizing map, piecewise aggregate approximation and symbolic aggregate approximation [14]-[16]. These methods adopt fixed uniform or non-uniform temporal resolutions based on the characteristics of the datasets, which may miss important patterns in certain kinds of load data because of the nature of the reduction techniques, such as mean-value-based approximation [17].

Therefore, this paper proposes a novel clustering method to group load profiles for load grouping control or heterogeneous aggregated load modeling. The main contributions are: 1) a piecewise aggregate approximation method with a variable temporal resolution is proposed to trade off data detail against data dimension; 2) a spectral clustering algorithm with multi-scale similarities is applied to load profile clustering analysis, which improves the accuracy of the similarity measures among load profiles and guarantees a high clustering quality.

This paper is organized as follows. Section II presents the main steps of the proposed load classification method. Section III describes the piecewise aggregate approximation algorithm based on information entropy and illustrates its effectiveness with cases. Section IV introduces an improved spectral clustering algorithm and verifies its effectiveness. Section V summarizes the main findings of this paper.

II. LOAD CLASSIFICATION METHOD

With the development of demand-side response programs and the huge amount of data from advanced metering infrastructure (AMI) systems, load profile clustering techniques are applied to classify customers according to their electricity consumption patterns, as well as to evaluate their overall energy consumption trends at a glance. Generally, the clustering of load profiles can be divided into three stages: load data preparation, load profile classification, and application of the results to DR programs, as shown in Fig. 1. Load data preparation usually includes data cleaning, filling in missing values and data transformation.

Fig. 1. The flowchart of load profile clustering (raw load data -> load data preparation -> averaging -> normalization -> IEPAA -> spectral clustering -> applications to DR programs).

This work mainly focuses on the load profile classification stage, which consists of the following steps.

A. Averaging

The typical daily profile of an individual load is usually represented by the mean values of its load data over certain working days. This yields the typical electricity consumption pattern while reducing the impact of abnormal load data.

B. Normalization

Load profile clustering groups "similar" customers together, usually according to their profile shapes and Euclidean distances. The load data are usually normalized before clustering so that the distance measure accords equal weight to each variable.

Two commonly used normalization approaches are statistical normalization and scaling normalization. The former, such as the Z-score method, normalizes according to the mean and standard deviation of the original data and converts the data to a standard normal distribution. Scaling normalization, such as the Min-Max method, linearly compresses the data amplitude to between 0 and 1. The Min-Max method is used in this work because the Euclidean distance is sensitive to differences in load data amplitude. For a profile X consisting of n records x_i (i = 1, 2, ..., n), it is defined as:

x_i' = (x_i - x_min) / (x_max - x_min)    (1)

where x_i' is the ith record after the extreme-value normalization, and x_min and x_max are the minimum and maximum records of the profile, respectively.
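As a concrete illustration of the averaging and Min-Max steps above, the following Python sketch builds typical daily profiles and applies Eq. (1); the array shapes, random stand-in data and function names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def typical_daily_profile(readings):
    """Average per-day readings (shape: days x samples) into one typical daily profile."""
    return readings.mean(axis=0)

def min_max_normalize(profiles):
    """Eq. (1): scale each profile to [0, 1] via (x - x_min) / (x_max - x_min).

    A completely flat profile would need a guard against division by zero.
    """
    p_min = profiles.min(axis=1, keepdims=True)
    p_max = profiles.max(axis=1, keepdims=True)
    return (profiles - p_min) / (p_max - p_min)

# Example: 100 customers, 288 five-minute samples per typical day (random stand-in data).
rng = np.random.default_rng(0)
DS = rng.random((100, 288)) * 60.0
DS_norm = min_max_normalize(DS)
```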
C. Information Entropy Based Piecewise Aggregate Approximation (IEPAA)

The IEPAA algorithm approximates high-dimensional data by low-dimensional data with a novel variable temporal resolution, preserving the fundamental characteristics of an individual customer's typical daily load profile. It adopts the information entropy to measure the fluctuation degree of the daily load profiles.

D. Spectral Clustering

In this work, spectral clustering that considers distance, shape fluctuation and shape trend is used to group "similar" customers together.

To verify the proposed algorithms, this paper takes the electricity data of one hundred HVAC units in commercial buildings as an example. The data are obtained from the energy consumption monitoring system of public buildings in Shanghai, China. Fig. 2 shows the typical daily load profiles of the 100 HVAC units.

Fig. 2. Typical daily load profiles of 100 HVAC units (power in kW over one day).

The sampling interval is 5 minutes, and the data are rounded to the nearest tenth. The averaging algorithm is performed on the typical daily load profiles of the 100 HVAC units to obtain a dataset named DS. Each typical daily load profile consists of 288 data points, so the dimensions of the dataset DS are 100×288.

III. PIECEWISE AGGREGATE APPROXIMATION BASED ON INFORMATION ENTROPY

A. Measurement of Load Profile Fluctuation Degree

The information entropy is adopted to measure the fluctuation degree of the load profiles. Assume a data X consisting of n possible records x_1, x_2, ..., x_n, whose probabilities are p_1, p_2, ..., p_n, respectively. The information entropy H_n of the data X is defined as:

H_n(X) = -\sum_{i=1}^{n} p_i \ln p_i    (2)

where 0 < p_i < 1, \sum_{i=1}^{n} p_i = 1, and i = 1, 2, ..., n.

The information entropy H_n reflects the fluctuation degree of the load profiles: the bigger H_n is, the bigger the fluctuation of the load profile, and vice versa. It is known that the maximum value H_max of the entropy H_n equals ln(n) when p_1 = p_2 = ... = p_n.

The average information entropy can be expressed as:

\bar{H}_n = \sum_{i=2}^{n} \frac{1}{i}    (3)

Define η_j as the fluctuation degree of the jth load profile during a certain time duration. η_j is approximated by the following equation:

η_j = 1,  if  κ \bar{H}_{n_j}(X_j)/\ln(n_j) ≤ H_{n_j}(X_j)/\ln(n_j) ≤ 1
η_j = 0,  if  0 ≤ H_{n_j}(X_j)/\ln(n_j) < κ \bar{H}_{n_j}(X_j)/\ln(n_j)    (4)

where j = 1, 2, ..., N; N is the number of load profiles; κ is a proportionality coefficient (κ = 1 in this paper); and n_j is the number of possible records of the jth load profile.

Define the factor ρ as the ratio of the number of load profiles with fluctuation degree η_j = 1 to the total number of load profiles within a certain time duration:

ρ = \frac{1}{N} \sum_{j=1}^{N} η_j    (5)

In the following IEPAA algorithm, the factor ρ is compared with a preset threshold σ. If ρ is bigger than σ, the corresponding data segment is considered to fluctuate strongly and is further divided into two data segments.
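The following Python sketch shows one way to evaluate Eqs. (2)-(5) on a data segment. Because the exact decision rule of Eq. (4) is reconstructed here, the sketch simply flags a profile as fluctuating when its entropy reaches the κ-scaled average entropy; the rounding precision and helper names are assumptions.

```python
import numpy as np

def information_entropy(values, decimals=1):
    """Eq. (2): H = -sum(p_i ln p_i) over the empirical distribution of rounded values."""
    _, counts = np.unique(np.round(values, decimals), return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def fluctuation_degree(segment, kappa=1.0, decimals=1):
    """Eqs. (3)-(4), as reconstructed: eta = 1 when H >= kappa * H_avg, else 0."""
    n_j = len(np.unique(np.round(segment, decimals)))
    if n_j <= 1:
        return 0
    h = information_entropy(segment, decimals)
    h_avg = sum(1.0 / i for i in range(2, n_j + 1))   # Eq. (3): average information entropy
    return 1 if h >= kappa * h_avg else 0

def fluctuation_ratio(segments, kappa=1.0):
    """Eq. (5): share of profiles whose segment is flagged as fluctuating."""
    flags = [fluctuation_degree(seg, kappa) for seg in segments]
    return sum(flags) / len(flags)
```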
B. Information Entropy Piecewise Aggregate Approximation (IEPAA)

The PAA obtains an approximation of high-dimensional data by low-dimensional data [18]. Assume a data series of n elements expressed as X = {x_1, x_2, ..., x_n}. X can be approximated by a series of m elements expressed as Y = {y_1, y_2, ..., y_m}. The kth element of Y is calculated by the following equation:

y_k = \frac{m}{n} \sum_{i=\frac{n}{m}(k-1)+1}^{\frac{n}{m}k} x_i    (6)

where i ∈ {1, ..., n}, k ∈ {1, ..., m}, m < n, and n is exactly divisible by m.
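A minimal PAA routine implementing Eq. (6) is sketched below; it assumes, as the text does, that the series length n is exactly divisible by the target length m.

```python
import numpy as np

def paa(x, m):
    """Eq. (6): replace each block of n/m consecutive samples with its mean."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if n % m != 0:
        raise ValueError("n must be exactly divisible by m")
    return x.reshape(m, n // m).mean(axis=1)

# Example: compress a 288-point daily profile into 48 half-hour means.
profile = np.sin(np.linspace(0.0, 2.0 * np.pi, 288)) + 1.0
coarse = paa(profile, 48)
```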
j 1
= (5) K1=48 and i=0. The data length of the daily profile is
N 288. The time window T is 30 minutes, which means
In the following IEPAA algorithm, the factor ρ is compared that the dataset DS is divided into 48 segments. The
with a preset threshold . If ρ is bigger than , it considers that dimension of each segment is 100×6. K1 is the number
the corresponding data has big fluctuation and should be of segments. i is a temporary integer variable.
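A sketch of the switching-event count of Eq. (7) is given below; the power threshold value is a placeholder to be chosen per load type, not a figure from the paper.

```python
import numpy as np

def switching_events(segment_data, power_threshold):
    """Eq. (7): count profiles whose max-min power span within the window exceeds the threshold.

    segment_data has shape (N_profiles, samples_per_window). Returns (S, mask),
    where mask marks the profiles with a switching event (s_i = 1).
    """
    span = segment_data.max(axis=1) - segment_data.min(axis=1)
    mask = span > power_threshold
    return int(mask.sum()), mask
```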
C.3 Main procedure of IEPAA

Fig. 3 shows the flowchart of the proposed IEPAA algorithm. In Fig. 3, K1 and K2 are the numbers of data segments; σ and Γ are preset thresholds; i, j and k are temporary integer variables. The thresholds σ and Γ are adjustable: the bigger the threshold Γ is, the more profiles are removed from the dataset. In this paper, σ and Γ are set as 0.06 and 4, respectively.

Fig. 3. The flowchart of the proposed IEPAA algorithm.

The dataset DS is used to illustrate the main procedure of the proposed IEPAA algorithm, which includes the following steps:

Step 1) Initialize DSD=0.1, MTR=5 minutes, T=30 minutes, K1=48 and i=0. The data length of each daily profile is 288. The time window T is 30 minutes, which means that the dataset DS is divided into 48 segments; the dimension of each segment is 100×6. K1 is the number of segments and i is a temporary integer variable.
Step 2) The purpose of this step is to screen out the profiles that are suitable to participate in DR. For each segment, the total number of load switching events S is calculated. If 0<S<Γ, the load profiles with a switching event (s_i=1) within the time window are removed from the dataset DS. It is assumed that R load profiles in total are removed from the dataset DS, where R is an integer.
Step 3) After Step 2, a new dataset DATA is obtained. The dimension of the dataset DATA is (100-R)×288. The temporary variable i is reset to 0.
Step 4) The time window is set as 60 minutes, which means that the dataset DATA is divided into 24 segments. Each segment is named SegA, with dimension (100-R)×12.
Step 5) Select the ith segment, expressed as SegA_i.
Step 6) Calculate the factor ρ in (5) and S in (7) for the segment SegA_i. If ρ<σ or S<Γ, go to Step 7; otherwise go to Step 8.
Step 7) Use the mean value of the corresponding 12 data points to approximate the ith segment of each profile. This achieves data dimension reduction by using 1 data point to replace 12 data points.
Step 8) Divide SegA_i into two equal segments (named SegB). The dimension of each segment SegB is (100-R)×6. A temporary integer variable j is set to 0.
Step 9) Select the jth segment, expressed as SegB_j.
Step 10) Calculate the factor ρ in (5) and S in (7) for the segment SegB_j. If ρ<σ or S<Γ, go to Step 11; otherwise go to Step 12.
Step 11) Use the mean value of the corresponding 6 data points to approximate the jth segment of each profile. This achieves data dimension reduction by using 1 data point to replace 6 data points.
Step 12) Divide SegB_j into two equal segments (named SegC). The dimension of each segment SegC is (100-R)×3. A temporary integer variable k is set to 0.
Step 13) Select the kth segment, expressed as SegC_k.
Step 14) Use the mean value of the corresponding 3 data points to approximate the kth segment of each profile. This achieves data dimension reduction by using 1 data point to replace 3 data points.
Step 15) Obtain the representation data.

It is seen that the algorithm adopts variable temporal resolutions for the approximation of data segments with different fluctuation levels.
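The variable-resolution refinement of Steps 4-15 can be sketched as below, reusing the fluctuation_ratio and switching_events helpers from the earlier sketches. The variables sigma and gamma stand for the thresholds σ and Γ, and the power threshold is a placeholder; this is an illustrative reading of the steps, not the authors' code.

```python
import numpy as np

def iepaa_window(window, sigma=0.06, gamma=4, power_threshold=5.0):
    """Approximate one 60-minute window (shape: N_profiles x 12) at a variable resolution.

    Mirrors Steps 6-14: keep a single hourly mean if the window is calm, otherwise
    split it into 30-minute halves and, where needed, 15-minute quarters before averaging.
    """
    def calm(block):
        rho = fluctuation_ratio(list(block))             # Eq. (5)
        s, _ = switching_events(block, power_threshold)  # Eq. (7)
        return rho < sigma or s < gamma

    if calm(window):                                     # Steps 6-7
        return window.mean(axis=1, keepdims=True)
    out = []
    for half in np.split(window, 2, axis=1):             # Step 8
        if calm(half):                                   # Steps 10-11
            out.append(half.mean(axis=1, keepdims=True))
        else:                                            # Steps 12-14
            out.extend(q.mean(axis=1, keepdims=True) for q in np.split(half, 2, axis=1))
    return np.hstack(out)

def iepaa(profiles, **kwargs):
    """Steps 4-15: apply the window-wise approximation to all 24 hourly windows."""
    windows = np.split(profiles, profiles.shape[1] // 12, axis=1)
    return np.hstack([iepaa_window(w, **kwargs) for w in windows])
```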
D. Application of the Proposed IEPAA Algorithm

The proposed IEPAA algorithm is performed on the given dataset DS to obtain a representation dataset DS1. The load profiles with index numbers 12, 21, 46, 49, 52, 57, 69, 71, 73, 82, and 91 are removed in Step 2 of Fig. 3. The profiles of dataset DS1 are shown in Fig. 4.

The traditional PAA algorithm is also performed on the dataset DS with fixed temporal resolutions of 10 and 15 minutes to obtain the corresponding datasets DS2 and DS3, as shown in Fig. 5.

Fig. 4. The representation profiles with the proposed IEPAA algorithm.

Fig. 5. The representation profiles with the traditional PAA algorithm: (a) DS2 with MTR=10 mins; (b) DS3 with MTR=15 mins.

An effective representation of time-series data should not only reduce the data dimension but also maintain the distinguishing features of the raw data. An index of average distinguished information (ADI) is introduced here to evaluate the representation effectiveness of the proposed IEPAA algorithm. The bigger the ADI of a representation dataset, the better the algorithm maintains the distinguishing features.

Assume y_{i,j} is an element of a two-dimensional representation dataset Y, where i=1, 2, ..., M; j=1, 2, ..., L; M is the number of representation load profiles; and L is the length of the load representation data. The ADI of dataset Y is defined as

ADI = \frac{1}{L} \sum_{j=1}^{L} \sqrt{ \frac{1}{M} \sum_{i=1}^{M} \left( y_{i,j} - \mathrm{mean}(y_{1,j}, y_{2,j}, ..., y_{M,j}) \right)^2 }    (8)

The ADI of DS1, DS2 and DS3 is calculated with (8), as shown in TABLE I. The data length of DS1 is smaller than that of DS2 and DS3, which means that the proposed IEPAA achieves a better data dimension reduction than the traditional PAA. The ADI of DS1 is obviously bigger than that of DS2 and DS3, which shows that the proposed IEPAA maintains the distinguishing features better than the traditional PAA. In addition, the approximation error of DS1 is smaller than that of DS2 and DS3. In short, the proposed IEPAA algorithm has a significant advantage in the representation of load profile data.

TABLE I
COMPARISON BETWEEN IEPAA AND TRADITIONAL PAA

Metric              | Traditional PAA (DS2) | Traditional PAA (DS3) | Proposed IEPAA (DS1)
Data length         | 96                    | 144                   | 94
ADI                 | 0.19                  | 0.18                  | 0.31
Approximation error | 0.52                  | 0.36                  | 0.34
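Eq. (8) translates directly into the short routine below (essentially the mean, over time positions, of the column-wise standard deviation); the function name is an assumption.

```python
import numpy as np

def average_distinguished_information(Y):
    """Eq. (8): ADI of a representation dataset Y with shape (M profiles, L samples)."""
    Y = np.asarray(Y, dtype=float)
    col_spread = np.sqrt(((Y - Y.mean(axis=0)) ** 2).sum(axis=0) / Y.shape[0])
    return float(col_spread.mean())
```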
IV. SPECTRAL CLUSTERING

A. Similarity Metrics

The similarity metrics are crucial to the clustering analysis. The representation dataset Y defined in Section III-D is taken as the example to explain the similarity measurement. Before the calculation of the similarity metrics, median smoothing is performed on dataset Y.

A.1 Distance metric

The Euclidean distance is commonly used as the distance metric. The Euclidean distance d_e(i,k) between two load representation profiles Y_i and Y_k is expressed as:

d_e(i,k) = \sqrt{ \sum_{j=1}^{L} (y_{i,j} - y_{k,j})^2 }    (9)

where i, k = 1, 2, ..., M and j = 1, 2, ..., L.

A.2 Shape fluctuation metric

The correlation distance is adopted as the shape fluctuation metric in this paper. The correlation coefficient between two load representation profiles Y_i and Y_k can be expressed as:

cov(i,k) = \frac{ \sum_{j=1}^{L} (y_{i,j} - \bar{Y}_i)(y_{k,j} - \bar{Y}_k) }{ \sqrt{ \sum_{j=1}^{L} (y_{i,j} - \bar{Y}_i)^2 } \sqrt{ \sum_{j=1}^{L} (y_{k,j} - \bar{Y}_k)^2 } }    (10)

The correlation distance d_c(i,k) is defined as:

d_c(i,k) = 1 - cov(i,k)    (11)

A.3 Shape trend metric

The maximum distance is used to describe the shape trend. The point-wise distance between two load representation profiles Y_i and Y_k is d_m(j) = |y_{i,j} - y_{k,j}|. Sorting the elements of d_m in descending order gives an array d'_m, and the shape trend metric is defined as:

d_t(i,k) = \frac{1}{s} \sum_{j=1}^{s} d'_m(j)    (12)

where s = ε·L, and ε is a factor that can be adjusted according to the DR program.

A.4 Multi-scale similarity metric

Based on the distance metric and the two shape metrics, a multi-scale similarity metric is introduced in this paper. From the dataset Y, the Euclidean distance matrix d_e, the shape fluctuation metric matrix d_c and the shape trend metric matrix d_t are obtained. The multi-scale similarity metric matrix D is defined as:

D = α·d_e + β·d_c + χ·d_t,   α + β + χ = 1    (13)

where α, β and χ are weighting coefficients that can be adjusted for different DR programs. The dimensions of the matrix D are M×M.
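The three metrics and their weighted combination in Eqs. (9)-(13) can be assembled as in the sketch below; the ε fraction for the shape-trend metric and the default weights are illustrative values, not prescriptions from the paper.

```python
import numpy as np

def multiscale_similarity(Y, alpha=0.2, beta=0.4, chi=0.4, eps=0.25):
    """Eqs. (9)-(13): build the M x M multi-scale metric matrix D from the dataset Y (M x L)."""
    M, L = Y.shape
    s = max(1, int(eps * L))                  # number of largest point-wise gaps kept in Eq. (12)
    d_e = np.zeros((M, M))
    d_c = np.zeros((M, M))
    d_t = np.zeros((M, M))
    for i in range(M):
        for k in range(M):
            d_e[i, k] = np.linalg.norm(Y[i] - Y[k])           # Eq. (9): Euclidean distance
            d_c[i, k] = 1.0 - np.corrcoef(Y[i], Y[k])[0, 1]   # Eqs. (10)-(11): correlation distance
            gaps = np.sort(np.abs(Y[i] - Y[k]))[::-1]
            d_t[i, k] = gaps[:s].mean()                       # Eq. (12): shape trend metric
    return alpha * d_e + beta * d_c + chi * d_t               # Eq. (13)
```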

B. Spectral Clustering Algorithm

In recent years, spectral clustering has become one of the most popular modern clustering algorithms [19]. The commonly used Ng-Jordan-Weiss (NJW) algorithm is adopted for the spectral clustering analysis in this paper.

B.1 Adjacency matrix construction

The Gaussian kernel function is adopted to construct the adjacency matrix W as:

W(i,k) = \exp\left( -\frac{D(i,k)^2}{2\sigma^2} \right)    (14)

where σ is the scale parameter, which is crucial to the clustering. The following steps show how to determine the scale parameter σ.

Step 1) Sort each row of the matrix D in descending order to obtain a matrix D';
Step 2) Obtain an (M-1)×M matrix E, where E(i,j) = D'(i+1,j) - D'(i,j);
Step 3) Find the maximum element E(i_m, j_m) of each column of matrix E;
Step 4) Find the corresponding element D'(i_m, j_m), where E(i_m, j_m) = D'(i_m+1, j_m) - D'(i_m, j_m);
Step 5) Compute w_max as the mean of these elements over the columns:
w_max = \frac{1}{M} \sum_{j_m=1}^{M} D'(i_m, j_m)
Step 6) The scale parameter σ is then calculated from the following equation:

\exp\left( -\frac{w_{max}^2}{2\sigma^2} \right) = δ    (15)

where δ is the maximum membership degree of the similarity metric.
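A hedged sketch of the kernel construction follows. Solving Eq. (15) for σ is straightforward; the estimate of w_max, however, simplifies Steps 1-5 to the mean of each row's largest sorted gap, since the published procedure is only partially recoverable here.

```python
import numpy as np

def scale_parameter(D, delta=0.01):
    """Solve Eq. (15), exp(-w_max**2 / (2 * sigma**2)) = delta, for sigma.

    w_max is approximated as the mean, over rows of D, of the value at each row's
    largest descending-sorted gap (a simplified reading of Steps 1-5).
    """
    D_sorted = np.sort(D, axis=1)[:, ::-1]
    gaps = D_sorted[:, :-1] - D_sorted[:, 1:]
    idx = gaps.argmax(axis=1)
    w_max = D_sorted[np.arange(D.shape[0]), idx].mean()
    return w_max / np.sqrt(-2.0 * np.log(delta))

def adjacency_matrix(D, sigma):
    """Eq. (14): Gaussian-kernel adjacency built from the multi-scale metric matrix D."""
    return np.exp(-(D ** 2) / (2.0 * sigma ** 2))
```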
B.2 Optimal number of clusters

The optimal number k of clusters is determined with a method based on matrix perturbation theory introduced in [20]. The main steps are as follows:

Step 1) Calculate the eigenvalues of the adjacency matrix W;
Step 2) Sort the calculated eigenvalues in descending order to obtain an array λ;
Step 3) k = max{ i | λ(i) > 0.01, i = 1, 2, ..., M }.
B.3 Steps of the proposed spectral clustering

The proposed spectral clustering algorithm consists of the following steps:

Step 1) Construct the multi-scale similarity metric matrix D;
Step 2) Determine the scale parameter σ of the Gaussian kernel function and calculate the adjacency matrix W of size M×M;
Step 3) Calculate the normalized Laplacian matrix L of the adjacency matrix W;
Step 4) Determine the optimal number k of clusters;
Step 5) Compute the first k eigenvectors u_1, u_2, ..., u_k of L;
Step 6) Construct a matrix T of dimension M×k that contains the vectors u_1, u_2, ..., u_k as columns;
Step 7) Let s_i be the vector corresponding to the ith row of T;
Step 8) Cluster the points (s_i), i = 1, ..., M, with the K-means method;
Step 9) Obtain the clustering results.
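Putting Steps 1-9 together gives the end-to-end sketch below. It reuses scale_parameter, adjacency_matrix and optimal_cluster_count from the earlier sketches, uses scikit-learn's KMeans for Step 8 (an assumed dependency), and adds the row normalization of the eigenvector matrix that the standard NJW algorithm prescribes even though the step list above does not spell it out.

```python
import numpy as np
from sklearn.cluster import KMeans

def njw_spectral_clustering(D, delta=0.01):
    """Steps 1-9: multi-scale metric matrix -> adjacency -> normalized Laplacian -> K-means."""
    sigma = scale_parameter(D, delta)                      # Step 2
    W = adjacency_matrix(D, sigma)
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = D_inv_sqrt @ W @ D_inv_sqrt                    # Step 3: normalized (NJW) Laplacian
    k = optimal_cluster_count(W)                           # Step 4
    _, eigvecs = np.linalg.eigh(L_sym)
    T = eigvecs[:, -k:]                                    # Steps 5-6: top-k eigenvectors as columns
    T = T / np.linalg.norm(T, axis=1, keepdims=True)       # NJW row normalization
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(T)   # Steps 7-8
    return labels                                          # Step 9
```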
C. Case Study

The dataset DS1 is taken as an example to illustrate the proposed spectral clustering algorithm.

C.1 K-means clustering

The K-means algorithm [21], [22] is the most commonly used algorithm in load profile problems due to its simplicity, straightforward operation and efficiency. K-means clustering aims to partition observations into several clusters so as to minimize the within-cluster sum of squares; for a full mathematical description the reader is referred to [21]. The K-means algorithm is performed on the dataset DS1. The optimal number of clusters is 6 according to the Davies-Bouldin index (DBI) [23]. The six clusters obtained from DS1 are shown in Fig. 6.

Fig. 6. Clustering results with K-means for DS1.
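For reference, the K-means baseline with Davies-Bouldin model selection [23] can be run with scikit-learn as sketched below; the candidate range of cluster counts is an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

def kmeans_with_dbi(X, k_range=range(2, 11)):
    """Fit K-means for each candidate k and keep the labelling with the smallest DBI [23]."""
    best = None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(X)
        dbi = davies_bouldin_score(X, labels)
        if best is None or dbi < best[0]:
            best = (dbi, k, labels)
    return best  # (dbi, optimal_k, labels)
```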
C.2 Applications of the proposed spectral clustering

The multi-scale similarity metric matrix comprises the Euclidean distance, shape fluctuation and shape trend, and the weighting coefficients can be adjusted according to the DR program. Both the morphological characteristics and the amplitude of load profiles are important to DR programs, whereas load forecasting and load modeling pay more attention to the profile amplitude than to the morphological characteristics. Therefore, if the proposed clustering is applied to DR programs, we recommend setting a smaller α and bigger β and χ. If it is applied to load forecasting or load modeling, we recommend setting a bigger α and smaller β and χ.

C.2.1 Case 1

In this case, we choose a smaller weighting coefficient for the Euclidean distance and bigger weighting coefficients for the shape fluctuation and shape trend. The weighting coefficients of the multi-scale similarity metric matrix are set as α=0.2, β=0.4, χ=0.4. The maximum membership degree is δ=0.01 and the corresponding scale parameter is σ=0.1609.

The proposed spectral clustering algorithm is performed on the dataset DS1. The optimal number of clusters is 7. The clustering results are shown in Fig. 7. The numbers of profiles in clusters C1, C2, C3, C4, C5, C6 and C7 are 21, 8, 7, 13, 11, 13 and 16, respectively.

Fig. 7. Clustering results with the proposed spectral clustering for DS1.

Comparing Fig. 7 with Fig. 6, it is seen that:

(1) The profiles in clusters C1 and C6 in Fig. 6 have good similarities in terms of time duration and shape, regardless of amplitude. The K-means clustering, based only on the Euclidean distance metric, classified those profiles into the two different clusters C1 and C6 in Fig. 6. If the HVAC units participate in price-based DR programs, the profiles in clusters C1 and C6 in Fig. 6 would be better classified into one cluster. The proposed spectral clustering algorithm based on multi-scale metrics is able to classify those profiles into the single cluster C1 in Fig. 7.

(2) The profiles in cluster C5 in Fig. 6 correspond to those in clusters C3, C5 and C7 in Fig. 7. The profiles in clusters C3, C5 and C7 in Fig. 7 have similar amplitudes but quite different occurrence times. The corresponding loads have a high feasibility of participating in short-time DR programs, but they should use different load control schemes because of their different occurrence times. The K-means algorithm cannot distinguish them and classifies them into one cluster.

C.2.2 Case 2

In this case, we choose a bigger weighting coefficient for the Euclidean distance and smaller weighting coefficients for the shape fluctuation and shape trend. The weighting coefficients of the multi-scale similarity metric matrix are set as α=0.4, β=0.2, χ=0.2. The maximum membership degree is δ=0.01 and the corresponding scale parameter is σ=0.15.

The results of the proposed spectral clustering algorithm performed on DS1 are shown in Fig. 8. The optimal number of clusters is 8. The numbers of profiles in clusters C1, C2, C3, C4, C5, C6, C7 and C8 are 11, 13, 13, 16, 15, 7, 8 and 6, respectively.

Fig. 8. Clustering results with the proposed spectral clustering for DS1.

Comparing Fig. 8 with Fig. 6, it is observed that the profiles in clusters C5 and C8 in Fig. 8 correspond to those in cluster C6 in Fig. 6. In addition, the profiles in cluster C5 in Fig. 6 correspond to those in clusters C1, C4 and C6 in Fig. 8. It is found that the proposed spectral clustering can classify the profiles properly by setting the multi-scale weighting coefficients.

C.2.3 Applications in DR programs

The two major DR categories are price-based DR and incentive-based DR. In the first, customers face time-varying prices based on the market price. In the second, customers are offered payments to motivate the reduction of their electricity usage. Appropriate and effective load profile clustering is important for DR programs, and the clustering results are potentially useful for designing reasonable DR programs and load control schemes. Taking the clustering results in Fig. 8 as an example, eight typical profiles can be obtained by calculating the mean value of the corresponding profiles in each cluster, as shown in Fig. 9. It is observed that clusters C2, C5 and C8 are flat-top profiles; cluster C3 is a multimodal peak profile; clusters C1, C4 and C6 are short-time peak profiles; and cluster C7 is a dual-peak profile. Based on the typical profile patterns and the number of loads in each cluster, utility companies or load aggregators can design reasonable DR programs and evaluate the loads' potential for participation in different DR programs.

Fig. 9. Typical profiles of the 8 clusters.

The traditional clustering method based on a single-scale distance similarity metric does not consider the distance metric and the shape metrics at the same time. The proposed spectral clustering based on multi-scale similarity metrics gives full consideration to both, so it achieves better clustering results than the traditional clustering method.

D. Performance Comparison

The performance of the proposed spectral clustering is compared with that of the primitive K-means clustering and the advanced ant colony clustering [7] in terms of computation time, clustering stability and clustering validity. The ant colony algorithm, introduced by M. Dorigo and motivated by the intelligent behavior of ant systems, has been applied to solve many problems.

D.1 Computational time

The input of K-means clustering consists of vectors in the N-dimensional Euclidean space, whereas the input of spectral clustering is the similarity matrix between the data, so the computational complexity of spectral clustering is lower than that of K-means clustering. To compare the computation time, the K-means, ant colony and proposed spectral clustering algorithms are each performed on the dataset DS1 for thirty loop runs. The computation times of the K-means, ant colony and proposed spectral clustering are 13.21, 7.56 and 1.10 seconds, respectively. The computing platform is a desktop computer running Microsoft Windows 8.1 with an Intel Pentium G2350 CPU and 4.00 GB of RAM. As the number of load profiles increases, the computational time difference between the traditional clustering algorithms and the proposed spectral clustering algorithm becomes bigger. Reducing the computational time is critical when residential and commercial loads participate in ancillary DR programs such as dynamic frequency regulation.

D.2 Clustering stability

The multiple-run results of most clustering algorithms are usually different. The stability of a clustering algorithm means the consistency of its results over multiple runs.

In this paper, the clustering stability of the K-means, ant colony and proposed spectral clustering algorithms is compared. Fig. 10 shows the ten-run results of the K-means, ant colony and proposed spectral clustering algorithms for DS1.

Fig. 10. Ten-run results of the K-means, ant colony and spectral clustering algorithms (number of load profiles in each cluster over ten runs).

It is seen that the consistency of the ten-run results of the proposed spectral clustering algorithm is better than that of the K-means algorithm.

The stability index (TSI) defined in [24] for evaluating the stability of clustering algorithms is employed to compare the three clustering algorithms quantitatively. The TSI is defined as:

TSI = (number of groups of load profiles, grouped by their load patterns over multiple runs) / (optimal number of clusters)    (16)

The smaller the TSI, the better the stability of the clustering algorithm. Based on (16), the TSI of the K-means, ant colony and proposed spectral clustering is 1.70, 1.83 and 1.00, respectively, for ten runs on the dataset DS1. The TSI of the proposed spectral clustering is smaller than that of the K-means and ant colony clustering algorithms.

D.3 Clustering validity

A general purpose of a clustering algorithm is to make the objects in the same cluster as similar as possible. The validity index (TVI) of a clustering algorithm defined in [25] is adopted in this paper to assess the similarity between objects quantitatively. The smaller the TVI, the better the clustering validity. With the weighting factor μ=0.5, the TVI of the K-means, ant colony and proposed spectral clustering is 0.38, 0.26 and 0.15 for DS1, respectively. The proposed spectral clustering has better clustering validity than the K-means and ant colony clustering algorithms.

V. CONCLUSIONS

An algorithm combining information entropy, piecewise aggregate approximation and spectral clustering is proposed for load data approximation and classification in this paper. The results are demonstrated on one hundred commercial HVAC units. The following conclusions can be drawn from this research work:

1) The proposed IE-based PAA algorithm can effectively select the loads suitable for DR programs in the event of a sharp increase of load, such as a large number of loads switched on within a short time.

2) Traditional dimensionality reduction techniques tend to lose much of the distinguishing characteristic information, while the proposed IE-based PAA not only reduces the dimension but also maintains the fundamental characteristics of the original load data with higher accuracy.

3) The improved spectral clustering algorithm computes the similarity among load profiles in terms of distance, morphological fluctuation characteristics and morphological trend characteristics, and the number of clusters is obtained with a method based on matrix perturbation theory. It also greatly reduces the amount of calculation. The integrated clustering quality of the proposed method is better than that of the K-means and ant colony clustering, as verified in the case study.

Besides DR programs, the clustering results are also potentially useful to utility companies or load aggregators for load forecasting, aggregated load modeling, tariff design and recommendation. Further research will address the implementation of the proposed techniques in specific DR programs such as dynamic frequency regulation.
is defined as:
[1] M. Liu, Y. Shi, and X. Liu, “Distributed MPC of aggregated
Number of groups of load profiles heterogeneous thermostatically controlled loads in smart grid,” IEEE
by their load patterns of multiple runs Trans. Industrial Electronics, vol.63, no. 2, pp. 1120–1129, Feb. 2016.
TSI  (16)
Optimal number of clusters [2] Q. Cui, X. Wang, X. Wang, and Y. Zhang, “Residential appliances direct
load control in real-time using cooperative game,” IEEE Trans. Power
The smaller the TSI is, the better the stability of the Systems, vol. 31, no. 1, pp. 226–233, Feb. 2016.
clustering algorithm is. Based on (16), the TSI of the K-means, [3] M. Muratori and G. Rizzoni, “Residential demand response: dynamic
ant colony and proposed spectral clustering is equal to 1.70, energy management and time-varying electricity pricing,”IEEE Trans.
Power Systems, vol. 31, no. 2, pp. 1108-1117, Feb. 2016.
1.83 and 1.00 for ten-runs on the dataset DS1, respectively. [4] R. Li, C. Gu, F. Li, G. Shaddick, and M. Dale, “Development of low
The TSI of the proposed spectral clustering is smaller than that voltage network templates — part I: Substation clustering and
of the K-means and ant colony clustering algorithms. classification,” IEEE Trans. Power Systems, vol. 30, no. 6, pp. 3026-
3044, Feb. 2015.
D.3 Clustering validity [5] R. Gulbinas, A. Khosrowpour, and J. Taylor, “Segmentation and
classification of commercial building occupants by energy-use efficiency
One purpose of a clustering algorithm is generally to and predictability,” IEEE Trans. Smart Grid, vol.6, no. 3, pp. 1414-1424,
improve the similarity between the objects in the same cluster Feb. 2015.
as soon as possible. The validity index (TVI) of a clustering [6] G. Chicco, R. Napoli, and F. Piglione, “Comparisons among clustering
algorithm defined in literature [25] is adopted in this paper to techniques for electricity customer classification,” IEEE Trans. Power
Systems, vol.21, no. 2, pp. 933-940, Feb. 2006.
assess the similarity between objects quantitatively. The [7] G. Chicco, O. M. Ionel, and R. Porumb, “Electrical load pattern grouping
smaller TVI is, the better the clustering validity is. The TVI of based on centroid model with ant colony clustering,” IEEE Trans. Power
the K-means, ant colony and proposed spectral clustering with Systems, vol.28, no. 2, pp. 1706-1715, Feb. 2013.
[8] G. Chicco, and I. S. Ilie, “Support vector clustering of electrical load
the weighting factor μ=0.5 is 0.38, 0.26 and 0.15 for DS1, pattern data,” IEEE Trans. Power Systems, vol.24, no. 3, pp. 1619-1628,
respectively. The proposed spectral clustering has a better Feb. 2009.
clustering validity than the K-means and ant colony clustering [9] B. Zhang, C. Zhuang, and J. Hu, “Ensemble clustering algorithm
combined with dimension reduction techniques for power load profiles,”
algorithms. Proceedings of the CSEE, vol.35, no. 15, pp.3741-3749, Feb, 2015.
[10] H. Shatkay and S. B. Zdonik, “Approximate queries and representations
V. CONCLUSIONS for large data sequences,” Proceedings of the 12th International
Conference on Data Engineering, New Orleans, Louisiana, Feb. 1996.
The algorithm by combining the concept of information [11] M. Piao, H. S. Shon, J. Y. Lee, and K. H. Ryu, “Subspace projection
entropy, the piecewise approximation, and spectral clustering is method based clustering analysis in load profiling,” IEEE Trans. Power
proposed for load data approximation and classification in this Systems, vol. 29, no. 6, pp. 2628-2635, Feb. 2014.
[12] R. Granell, C. J. Axon and D. C. H. Wallom, “Impacts of raw data
paper. The results are demonstrated in one hundred commercial temporal resolution using selected clustering methods on residential
HVAC systems. The following conclusions can be drawn from electricity load profiles,” IEEE Trans. Power Systems, vol.30, no. 6, pp.
this research work: 3217-3224, Feb. 2015.
[13] G. Chicco, “Overview and performance assessment of the clustering
1) The proposed PAA based on IE algorithm can effectively methods for electrical load pattern grouping,” Energy, vol. 42, no. 1, pp.
select out the loads suitable for DR programs in the event of a 68-80, Feb. 2014.
[14] A. Notaristefano, G. Chicco, and F. Piglione, "Data size reduction with symbolic aggregate approximation for electrical load pattern grouping," IET Gener. Transm. Distrib., vol. 7, no. 2, pp. 108-117, Feb. 2013.
[15] J. W. Sammon, "A nonlinear mapping for data structure analysis," IEEE Trans. Computers, vol. 18, no. 5, pp. 401-409, Feb. 1969.
[16] S. D. Backer, A. Naud, and P. Scheunders, "Non-linear dimensionality reduction techniques for unsupervised feature extraction," Pattern Recognition Letters, vol. 19, no. 8, pp. 711-720, Feb. 1998.
[17] E. Keogh, K. Chakrabarti, M. Pazzani, and S. Mehrotra, "Dimensionality reduction for fast similarity search in large time series databases," Knowledge & Information Systems, vol. 3, no. 3, pp. 263-286, Feb. 2001.
[18] E. Keogh and K. Chakrabarti, "Dimensionality reduction for fast similarity search in large time series databases," Knowledge and Information Systems, vol. 3, no. 3, pp. 263-286, Feb. 2001.
[19] U. von Luxburg, "A tutorial on spectral clustering," Statistics and Computing, vol. 17, no. 4, pp. 395-416, Feb. 2007.
[20] Z. Tian, X. B. Li, and Y. W. Ju, "Spectral clustering based on matrix perturbation theory," Science in China Series F: Information Sciences, vol. 50, no. 1, pp. 63-81, Feb. 2007.
[21] R. Xu and D. Wunsch, Clustering. New York: John Wiley & Sons, 2008.
[22] G. Chicco, R. Napoli, and F. Piglione, "Comparisons among clustering techniques for electricity customer classification," IEEE Trans. Power Systems, vol. 21, no. 2, pp. 933-940, May 2006.
[23] D. Davies and D. Bouldin, "A cluster separation measure," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. PAMI-1, no. 1, pp. 224-227, Feb. 1979.
[24] T. Zhang, G. Zhang, J. Lu, X. Feng, and W. Yang, "A new index and classification approach for load pattern analysis of large electricity customers," IEEE Trans. Power Systems, vol. 27, no. 1, pp. 153-160, Feb. 2012.
[25] Y. Wang and L. Li, "Application of clustering technique to electricity customer classification for load forecasting," Proceedings of the IEEE Conference on Information and Automation, Lijiang, China, Feb. 2015.

Shunfu Lin (M'12) received his B.S. and Ph.D. degrees from the University of Science and Technology of China in 2002 and 2007, respectively. He worked for the Corporate Technology of Siemens Limited China as a research scientist in power monitoring and control from Jul. 2007 to Sep. 2009. He was a post-doctoral fellow at the University of Alberta, Canada, from Oct. 2009 to Oct. 2010. Dr. Lin is currently a professor at the College of Electrical Engineering of Shanghai University of Electric Power. He was a visiting scholar at The University of Tennessee (UT) at Knoxville from Jun. 2016 to Jun. 2017. His research interests include power quality and smart grid technologies of LV distribution systems.

Fangxing Li (S'98-M'01-SM'05-F'17), also known as Fran Li, received the B.S.E.E. and M.S.E.E. degrees from Southeast University, Nanjing, China, in 1994 and 1997, respectively, and the Ph.D. degree from Virginia Tech, Blacksburg, VA, USA, in 2001. Currently, he is the James W. McConnell Professor in electrical engineering and the Campus Director of CURENT at the University of Tennessee, Knoxville, TN, USA. His current research interests include renewable energy integration, distributed generation, energy markets, power system computing, reactive power and voltage stability, and measurement-based technology. Prof. Li is presently serving as the Vice Chair of the IEEE PES PSOPE Committee. He also serves as an Editor or Guest Editor for IEEE Transactions on Power Systems, IEEE Transactions on Sustainable Energy, IEEE PES Letters, IEEE Transactions on Industrial Informatics, and several other IEEE and international journals.

Erwei Tian received his B.S. degree from Hebei University of Engineering in 2013 and his M.S. degree in electrical engineering from Shanghai University of Electric Power, China, in 2017. He is currently working at the State Grid Zhejiang Zhuji Electric Power Company. His current interests include demand response and smart grid technologies of LV distribution systems.

Yang Fu received his M.S. degree in power system and automation from Southeast University in 1993 and his Ph.D. degree in electrical engineering from Shanghai University, China, in 2007. He is currently a professor at the College of Electrical Engineering and Vice President of Shanghai University of Electric Power, Shanghai, China. He is the director of the Shanghai Green Energy Grid Connected Technology Engineering Research Center and of the Shanghai electrical engineering plateau discipline. His current research interests include wind power and smart grid technologies.

Dongdong Li (M'08) received his B.S. and Ph.D. degrees in electrical engineering from Zhejiang University and Shanghai Jiao Tong University in 1998 and 2005, respectively. He worked for the Wuhu Power Plant as an electrical engineer from 1998 to 2000. He is currently a professor and dean of the College of Electrical Engineering at Shanghai University of Electric Power, Shanghai, China. Dr. Li was a visiting scholar at Purdue University from 2014 to 2015. His current research interests include analysis of electric power systems, new energy systems, smart grid, and power electronization of power systems.