Applied Energy: A. Cominola, M. Giuliani, D. Piga, A. Castelletti, A.E. Rizzoli

Applied Energy 185 (2017) 331–344
A Hybrid Signature-based Iterative Disaggregation algorithm for Non-Intrusive Load Monitoring
A. Cominola a,⇑, M. Giuliani a, D. Piga b, A. Castelletti a, A.E. Rizzoli c
a Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Milano I-20133, Italy
b IMT School for Advanced Studies, Lucca 55100, Italy
c Dalle Molle Institute for Artificial Intelligence Research, SUPSI, Manno, 6928, Switzerland
Highlights
The hybrid, efficient algorithm "HSID" for Non-Intrusive Load Monitoring is proposed.
HSID outperforms benchmark techniques for residential power load disaggregation.
HSID is robust to signal noise and to a high number of appliances, and can be used in semi-supervised applications.
Article info
Article history:
Received 12 January 2016
Received in revised form 4 October 2016
Accepted 16 October 2016
Keywords:
Non-Intrusive Load Monitoring
Energy disaggregation
End-uses
Smart metering
Energy demand management

Abstract
Information on residential power consumption patterns disaggregated at the single-appliance level is an essential requirement for energy utilities and managers to design customized energy demand management strategies. Non-Intrusive Load Monitoring (NILM) techniques provide this information by decomposing the aggregated electric load measured at the household level by a single-point smart meter into the individual contribution of each end-use. Despite being defined non-intrusive, NILM methods often require an intrusive data sampling process for training purposes. This calibration intrusiveness hampers the large-scale application of NILM methods. Other NILM challenges are the limited accuracy in reproducing the end-use consumption patterns and their trajectories in time, which are key to characterizing consumers' behaviors and appliance efficiency, and the poor performance when multiple appliances are simultaneously operated. In this paper we contribute a hybrid, computationally efficient algorithm for NILM, called Hybrid Signature-based Iterative Disaggregation (HSID), based on the combination of Factorial Hidden Markov Models, which provide an initial approximation of the end-use trajectories, and Iterative Subsequence Dynamic Time Warping, which processes the end-use trajectories in order to match the typical power consumption pattern of each appliance. In order to deal with the challenges posed by intrusive training, a supervised version of the algorithm, requiring appliance-level measurements for calibration, and a semi-supervised version, retrieving appliance-level information from the aggregate smart-metered signal, are proposed. Both versions are demonstrated on a real-world power consumption dataset comprising five different appliances potentially operated simultaneously. Results show that HSID is able to accurately disaggregate the power consumption measured from a single-point smart meter, thus providing a detailed characterization of the consumers' behavior in terms of power consumption. Numerical results also demonstrate that HSID is robust with respect to noisy signals and scalable to datasets including a large set of appliances. Finally, the algorithm can be successfully used in non-intrusive experiments without requiring appliance-level measurements, ultimately opening up new opportunities to foster the deployment of large-scale smart metering networks, as well as the design and practical implementation of personalized demand management strategies.
© 2016 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. E-mail address: andrea.cominola@polimi.it (A. Cominola).
http://dx.doi.org/10.1016/j.apenergy.2016.10.040

1. Introduction
The effectiveness of customized energy consumption feedbacks and, broadly, demand management strategies in the energy sector, such as economic incentives to upgrade poorly efficient energy
consuming devices [1], hourly dynamic energy pricing to reduce demand in peak hours [2], and awareness campaigns to inform energy consumers about their broken-down consumption and savings [3], has been demonstrated to benefit from appliance-specific information [4,5]. The knowledge of timings, peak hours, and frequencies of use of electric devices is key to understanding consumers' behaviors, identifying consumption anomalies, and, ultimately, designing personalized demand management strategies (e.g., deferring the use of some appliances to off-peak hours). Appliance-specific personalized recommendations are potentially worth more than 12% reduction in annual domestic consumption and can bring multiple benefits to energy consumers, utilities, and research and development centres [6]. In the last two decades, this has been prompting big investments for the deployment of smart metering networks [7–9], along with the development of Non-Intrusive Load Monitoring (NILM) techniques. The main advantage of NILM [10] is that it allows decomposing the aggregated electric load measured at the household level by a single, high-frequency smart meter into the individual contribution of each appliance, the so-called end-uses. Although alternative options exist for monitoring residential energy consumption at the appliance level (e.g., smart appliances, distributed sensing networks for direct measurement, and smart plugs [11,12]), NILM methods, coupled with single-point sensors, are so far the most promising decomposition approach, as they reduce hardware costs (sensor cost and related costs for installation, maintenance, battery and sensors replacement) as well as intrusiveness into users' houses, even though many require an intrusive calibration phase. Also, installing a unique high-resolution sensor per house significantly reduces the amount of data to manage, compared with collecting records from multiple sensors. Another reason promoting the suitability of NILM methods for large-scale energy disaggregation applications and market penetration consists in the overall economic advantages of disaggregation software technologies: a business case by Carrie Armel et al. [6] shows that the benefits per kWh in terms of potentially avoided energy generation and distribution outweigh the costs of disaggregation technologies by a factor of four. This is further demonstrated by the fact that NILM methods are currently used in domains other than energy consumption, including water and gas, and many companies such as General Electric, Opower and Belkin are working on their development closely with smart meter producers [13,6]. Yet, the problem of disaggregating an electric signal into its sub-components poses a twofold challenge. On the one hand, disaggregation techniques should be able to maximize the appliance-specific information extracted from the aggregate signal. On the other hand, the algorithms should allow for scalability, while minimizing economic and privacy costs related to disaggregation activities (i.e., sensors installation, data collection, and data analysis).
Several NILM algorithms have been proposed in the literature (see Zoha et al. [13] and Zeifman et al. [14] and references therein for a review). Yet, a number of research and operational challenges are under debate and have emerged in recent works. The first, most important issue is related to the rate of intrusiveness of the data sampling process [15]. In fact, after the seminal work by Hart [10], a first class of supervised algorithms has been developed, which requires large appliance-level data sets for the initial off-line training phase (e.g., Singh et al. [16], Elhamifar and Sastry [17], Kolter et al. [18]). Although a certain level of intrusiveness is unavoidable to ensure accuracy in the subsequent stages of data disaggregation, the challenge is to keep it at a minimum. This challenge has been motivating the recent emergence of a second class of unsupervised algorithms, which generally avoid collecting appliance-level data (see Bonfigli et al. [19] and references therein).
Second, the definition of consistent accuracy metrics against which NILM algorithms can be evaluated and compared is another domain challenge. According to Butner et al. [20], Barker et al. [21] and Batra et al. [22], no consistent conventions and standards are currently in place for measuring the accuracy of NILM technologies. Many algorithms tend to focus only on accurately detecting the on/off status of each appliance (e.g., [23–25]) and their accuracy is hence evaluated using metrics accounting for on/off detection, such as the F-score [22]. Only a few studies also consider the accuracy in reproducing the consumption patterns of single end-uses in time, which is evaluated either by visual inspection or by means of specific quantitative metrics [26–29,19,30,31]. While limiting the extent of NILM algorithms to only the detection of on/off events allows retrieving information on appliances' time and frequencies of use, a correct reproduction of end-use patterns would support water utilities and demand management with more exhaustive information regarding consumers' behavior and energy usage efficiency. Accurate estimates of appliances' power consumption patterns enable a better identification of peak hours, a more accurate quantification of the power load contributed by each appliance during peak and off-peak hours, as well as assessments of the efficiency levels of different appliances. This is key information to understand consumers' behavior and, ultimately, design personalized demand management strategies targeted at improving power consumption efficiency and reducing costs, for instance through demand peak-shifting and retrofitting of low-efficiency devices.
Finally, a third challenge to energy disaggregation algorithms consists in the number of simultaneously operating appliances that can be identified by NILM algorithms [32,20,21]. This is a double challenge because an increasing number of simultaneously operating appliances not only raises the variety of appliance-specific consumption patterns to be identified, but also increases the combinations of overlapping uses, and, consequently, signal distortion [33].
In this work, we address these three challenges by contributing a novel Hybrid Signature-based Iterative Disaggregation (HSID) algorithm for NILM. A supervised and a semi-supervised version of the algorithm are proposed, in order to deal both with applications involving intrusive measurements at single-appliance level and with non-intrusive ones. Both versions combine Factorial Hidden Markov Models (FHMMs) and Iterative Subsequence Dynamic Time Warping (ISDTW) to accurately characterize end-use trajectories for a number of simultaneously operating appliances and reduce the intrusiveness of the off-line training. More precisely, the FHMM module of the algorithm initially disaggregates the total power consumption signal into 2-state single-appliance piecewise constant trajectories. Thus, FHMM provides a rough approximation of the end-use trajectories. ISDTW is then applied, in order to reshape them according to the typical power consumption pattern of each specific end-use, and include the intrinsic variability of the latter in terms of power range and appliance usage duration. After being processed through ISDTW, the estimated end-use trajectories describe more accurately and realistically the power consumption time series of each appliance. The two versions of HSID are independent and differ with respect to the information needed for algorithm training: the supervised version of HSID requires appliance-level load measurements, while the semi-supervised version exploits aggregate measurements from the smart meter to retrieve appliance-level information.
The paper is organized as follows. We formalize the disaggregation problem in Section 2 and describe the two versions of the new HSID algorithm in Section 3. In Sections 4 and 5, we comparatively analyze through a diverse set of metrics the performance of HSID against a state-of-the-art benchmark on real world power consumption data and test the sensitivity of the results with respect to the level of noise in the metered consumption, as well as the number of metered appliances. In the final semi-supervised experiment,
we also demonstrate the usability of the algorithm without the need of gathering a training dataset at the appliance level.

2. NILM problem formulation and related work
NILM end-use disaggregation algorithms estimate the power consumption of each appliance contributing to the total consumption of a household as measured by a single-point smart meter at sub-daily frequency, namely either low-frequency (e.g., 10 s or 1 min) or high-frequency (e.g., hundreds of Hz). This problem can be classified as a blind identification problem [34] where, given the observed output of the whole system (i.e., the household total power consumption), the unobserved sub-states (i.e., the power consumption of each appliance) should be estimated. More formally, we can write the total power consumption of a house at time step t as:
\[ Y_t = \sum_{i=1}^{N} y_t^i + e_t \tag{1} \]
where $Y_t$ is the total, observed power consumption at each time step t; $y_t^i$ the consumption of appliance i at time step t; N the total number of appliances, and $e_t$ the measurement noise.
The consumption of the i-th appliance is written, for each time step, as:
\[
\begin{aligned}
y_t^i &= B^i (x_t^i)^{\top} + \epsilon_t^i, \qquad y_t^i \in \mathbb{R}^{+} \\
B^i &= [\, b^{i,1}, b^{i,2}, \ldots, b^{i,M} \,], \qquad i = 1, \ldots, N \\
x_t^i &= [\, x_t^{i,1}, x_t^{i,2}, \ldots, x_t^{i,M} \,], \qquad i = 1, \ldots, N
\end{aligned}
\tag{2}
\]
where
• $B^i$ is a vector containing the power consumption basis $b^{i,j}$ for each appliance i, i.e., the power consumption related to each operating state j (e.g., on/off) of the appliance. M is the number of potential power states (in this formulation it is assumed that all the appliances have the same number of states);
• $x_t^i$ represents the activation vector for the states of appliance i, at time t. It is a binary-valued vector indicating which power levels of vector $B^i$ are operating for the i-th appliance at each time step, therefore $x_t^{i,j} \in \{0, 1\} \; \forall i, j, t$. Also, each appliance can only operate in one state at a time, thus the following constraint holds: $\sum_{j=1}^{M} x_t^{i,j} = 1 \; \forall i, t$;
• $\epsilon_t^i$ is the noise affecting the consumption of appliance i at time step t. It may be due to intrinsic characteristics of appliances or mutual interference among appliances on the same network.
Based on the elements introduced so far, we can formulate the general disaggregation problem as a minimization problem, searching for the consumption trajectories of each appliance that minimize the error between estimated and real power trajectories, for each time step of the considered time horizon H:
\[
[\hat{B}, \hat{x}_t] = \arg\min_{B,\, x_t} \left[ \sum_{t=1}^{H} \left( Y_t - \hat{Y}_t \right)^2 \right] = \arg\min_{B,\, x_t} \left[ \sum_{t=1}^{H} \sum_{i=1}^{N} \left( y_t^i - \hat{y}_t^i \right)^2 \right] \tag{3}
\]
The above formulation defines the general problem of the NILM methods classified in Zoha et al. [13] as optimization methods. In principle, the solution to Problem 3 may be computed by standard least-squares techniques. However, the problem is overparameterized (i.e., the number of parameters to be estimated is higher than the number of available measures) and, in practice, alternative optimization methods such as genetic algorithms [35], integer optimization [36] or sparse optimization [18,17,31] are used to search for the best match between a combination of appliances sampled from a known database and the vector of total measured consumption. Despite showing good accuracy in cases with a limited number of appliance combinations, the computational complexity and the lack of inclusion of the temporal continuity of power signals constitute two major drawbacks for such techniques. The issues of temporal structure, continuity, and state transition are tackled by the so-called pattern recognition methods [13]. The algorithms belonging to this class do not approach Problem 3 as an independent problem for each time step. In contrast, they include information about the temporal structure of power signals and search for the sequence of appliance states that is optimal with respect to those features that show temporal continuity (e.g., state transition probabilities). Techniques such as Artificial Neural Networks (ANN) [37] or Hidden Markov Models (HMMs) [38] have been tested in this context, successfully showing the value of temporal information in learning the consumption patterns of appliances. In particular, several HMM-based load disaggregation algorithms have been widely used and discussed in NILM literature [39], showing the potential of achieving disaggregation accuracies higher than 70% and up to 99% on relevant appliances, when they are trained on data from the same household used for testing. Given these promising results, HMM-family methods [26,22] are often adopted as benchmarks for algorithm testing and comparison. Another family of recently proposed pattern recognition methods relies on Dynamic Time Warping (DTW), a well-established pattern matching technique used in the field of speech recognition, which allows comparisons and matching between traces of different length [40]. In the field of electricity smart grids, DTW has been mainly used for electric profiles clustering [41].
All these algorithms rely on a preliminary supervised learning phase, which requires intrusively collecting key information, such as the number of appliances N or the typical end-use power consumption pattern of each appliance (the so-called signatures), for the calibration of the power consumption bases B and other method-specific parameters. The supervised version of the proposed HSID algorithm can hence be classified as a supervised pattern matching method, which combines Factorial Hidden Markov Models and Iterative Subsequence Dynamic Time Warping, and requires the knowledge of N and of the appliances' signatures. In particular, N can be either retrieved intrusively through door-to-door appliance surveys or reported directly by householders. Especially in the latter case, or when the number of electric appliances is large, the reported values of N may be affected by errors, which will then be propagated along the disaggregation process if no cross-checks are considered. However, the intrusiveness of collecting appliance-level data for training the disaggregation algorithms often hampers the wide usability of such tools in real world applications. A non-intrusive alternative is represented by completely unsupervised algorithms (see Bonfigli et al. [19] and references therein), which do not require any a priori information about N and B. These methods have been benchmarked against well known datasets and supervised methods (see, for instance, [15,42–48,25]). Yet, they are the subject of ongoing research in the field of electrical power disaggregation, and, given the recent development of most of them, comprehensive studies cross-comparing the performance of promising unsupervised methods against common datasets and performance metrics are in progress [19]. An alternative approach for reducing the intrusiveness of supervised methods is to extract appliance-level signatures from the aggregate-level smart metered trace [45]. In practice, the availability of consumption diaries [49], where the time-of-use of each appliance is recorded for part of the monitoring period, would support this approach. Recently, Elafoudi et al. [50] proposed a NILM algorithm based on DTW and customers' daily diaries, which is demonstrated to attain promising results in end-use disaggregation, with an overall disaggregation accuracy in terms of F-score higher than 85%. Indeed, power use diaries information would allow retrieving appliances' signatures from portions of the total consumption trace when only one appliance is operating (i.e., no other appliances are simultaneously in operation). We developed the second, semi-supervised version of the proposed HSID algorithm upon this idea, and show that the algorithm is portable in almost non-intrusive cases, where only the knowledge of the number of appliances N is needed, thus removing the initial intrusiveness and costs associated with the metering of fixture-level data. The two versions of the algorithm are described in the next section.

Fig. 1. Typical power load signatures for five indoor appliances. (For a better interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
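
As an illustration of the formulation in Section 2, the following minimal Python sketch builds a synthetic aggregate signal from appliance-level power bases and one-hot state activations (Eqs. (1) and (2)) and evaluates the aggregate squared error that appears in the first form of Problem 3. All numeric values (power levels, on/off schedule, noise range), the uniform noise model, the omission of appliance-level noise, and the helper name `aggregate_sse` are assumptions made only for illustration; they are not taken from the dataset or from the implementation used in this work.

```python
import numpy as np

# Hypothetical example values; none of these numbers come from the paper.
rng = np.random.default_rng(0)
N, M, H = 3, 2, 6  # appliances, states per appliance, time steps

# B[i, j]: power basis (W) of appliance i in state j (state 0 = off, 1 = on).
B = np.array([[0.0, 100.0],
              [0.0, 1500.0],
              [0.0, 60.0]])

# Assumed on/off schedule: states[t, i] is the active state of appliance i at t.
states = np.array([[0, 0, 0],
                   [1, 0, 0],
                   [1, 1, 0],
                   [1, 1, 1],
                   [0, 1, 1],
                   [0, 0, 1]])

# One-hot activation vectors x[t, i, :], so that sum_j x[t, i, j] = 1.
x = np.eye(M)[states]  # shape (H, N, M)
assert np.all(x.sum(axis=2) == 1)

# Eq. (2), with the appliance-level noise omitted: y[t, i] = B[i] . x[t, i]
y = np.einsum("ij,tij->ti", B, x)

# Eq. (1): aggregate signal = sum over appliances + measurement noise.
# Uniform noise in [-5, 5] W is an assumption of this sketch.
e = rng.uniform(-5.0, 5.0, size=H)
Y = y.sum(axis=1) + e


def aggregate_sse(Y_obs, y_hat):
    """Sum over time of (Y_t - Y_hat_t)^2, with Y_hat_t = sum_i y_hat[t, i]."""
    Y_hat = y_hat.sum(axis=1)
    return float(np.sum((Y_obs - Y_hat) ** 2))


# The true trajectories leave only the measurement noise as residual,
# while an "all appliances off" estimate yields a much larger error.
print(aggregate_sse(Y, y))
print(aggregate_sse(Y, np.zeros_like(y)))
```

The sketch only evaluates the objective for given candidate trajectories; searching over bases and activations, which is what makes the problem overparameterized, is not attempted here.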
3. Methodology
The development of HSID, Hybrid Signature-based Iterative Disaggregation, is based on two assumptions: firstly, each electrical device contributing to the total household consumption can be recognized from its specific consumption pattern, i.e., each fixture has a typical signature [51,52], such as the example in Fig. 1. Secondly, the consumption level time series of each appliance can be modeled as a Markovian state sequence, and can be represented with a limited number of states (e.g., state 1: fixture on/operating; state 2: fixture off/not operating). The two assumptions might not hold for appliances with extremely noisy and scattered behavior or unrealistic situations characterized by random shifts in single-appliance operating range and operating status transitions. Operationally, the workflow of the algorithm is composed of the following three steps (Fig. 2):
A. appliances signatures identification;
B. disaggregation of household power load through a 2-state Factorial Hidden Markov Model (FHMM);
C. end-use trace patterns correction through Iterative Subsequence Dynamic Time Warping (ISDTW).
Two versions of the HSID algorithm are presented in this paper: the first one, supervised, requires initial intrusive appliance-level load measurements for algorithm training, while the second, semi-supervised, retrieves appliance-level information directly from the aggregate measurements provided by the smart meter. We provide details on each step of the HSID algorithm for both settings in the next paragraphs.

3.1. Supervised HSID algorithm
3.1.1. Appliances signatures identification
This first step of the supervised version of HSID aims at creating a database containing the signature $s^i$ of each appliance i (i = 1, ..., N) contributing to the total measured consumption at the household level. In order to gather all the needed signatures $s^i$, the algorithm requires a training dataset $D_T$, consisting of power consumption observations retrieved intrusively for each appliance/fixture during a short training period. The length of the training period T should be kept as short as possible, in order to reduce intrusiveness and costs. In this study we considered a 2-week long training period, in order to take into account possible consumption differences between weekdays and weekend days and gather a meaningful sample of power consumption events. We implemented the signature identification in two steps. In the first, the consumption trajectory $y^{i,T} = \{y_t^i\}_{t=0,\ldots,T}$ of each appliance during the training period is retrieved from $D_T$. As a second step, each signature $s^i$ is obtained by removing the trace noise [52] from the lowest values of $y^{i,T}$. Those values are identified through a 3-cluster K-means algorithm [53] and are all set equal to the centroid of the lowest cluster, so that no background noise affects the quality of signatures.

3.1.2. 2-state FHMM load disaggregation
The purpose of this step is to perform the power load disaggregation. The total power consumption trace of each household is partitioned into simplified two-state consumption trajectories for each fixture, in order to identify their on/off operating states. We used Factorial Hidden Markov Models (FHMMs) for this purpose. FHMMs [54] are a well established technique in machine learning and have already been applied in the field of power load disaggregation [22]. FHMMs allow the identification of the most probable sequence of states of a Markovian process when the considered system is composed of different sub-components, and the state of the whole system (i.e., the only measured element) is a combination of the hidden states of each sub-component. In FHMM, each of the N appliances is characterized by a finite number of hidden states (i.e., not observed), the latter described by a prior probability distribution, and a matrix with probabilities of transition between pairs of states. Each appliance can thus be modeled with a Hidden Markov Model, and then single-appliance HMMs are combined in a FHMM, considering that there is a specific probabilistic relation between the observation and combinations of hidden states (i.e., emission probability). In this work, we use the NILM toolkit proposed by Batra et al. [22], which implements a FHMM with Gaussian emission probabilities and exact inference [54]. (Footnote 1: The HMM module of the Python scikit-learn machine learning library is exploited for FHMM resolution.) FHMM performs power load disaggregation according to the following two-step procedure:
B.1 Training. The training phase of FHMM considers the same training dataset $D_T$ used for the initial signatures extraction phase, and consists in the calibration of the three main elements of hidden Markov models: (i) the marginal initial probability distribution $P(x_0^i)$ for each appliance, i.e., the probability of occurrence of each state in the initial time step, (ii) the transition probability distribution $P(x_t^i \mid x_{t-1}^i)$ for the states of each appliance, i.e., the transition probability among the different operating modes of each appliance between two sequential time steps, and (iii) the emission probability distribution $P(y_t^i \mid x_t^i)$ for the states of each appliance, i.e., the probability of observing a particular output of the system depending on its operating state.
B.2 Power load disaggregation. Once the above three probability distributions are calibrated, FHMM performs a disaggregation of the total consumption data over the validation time horizon H. In short, FHMM solves the following instance of Problem 3:
\[
\left[ \hat{P}(x_t^i \mid x_{t-1}^i),\; \hat{P}(y_t^i \mid x_t^i) \right] = \arg\min_{P(x_t^i \mid x_{t-1}^i),\; P(y_t^i \mid x_t^i)} \left[ \sum_{t=1}^{H} \left( Y_t - \hat{Y}_t \right)^2 \right] \tag{4}
\]
and then it uses the Viterbi algorithm [55] to identify the most probable sequence of (hidden) states associated with the measured output.
For computational reasons, our model assumes that the number of states each appliance has in the FHMM is equal to 2 (for more details, see Section 4.3). As a consequence, the consumption trajectories estimated for each appliance by FHMM assume the shape of piecewise constant lines, i.e., only the on/off operating states are detected, while an accurate reproduction of power consumption patterns is missing at this stage. In addition, this two-state outcome is not adequate to accurately reproduce the consumption patterns of multi-state appliances, such as washing machines, or Continuously Variable Devices [13], whose behavior cannot be captured by a two-state sequence. The challenge of retrieving the variety of consumption patterns for such appliances and avoiding estimation error propagation due to oversimplified trajectories is tackled by the last component of our algorithm.

Fig. 2. Flowchart of the HSID algorithm. Boxes enumeration is consistent with the one adopted in Section 3.

3.1.3. Trace patterns correction through Iterative Subsequence Dynamic Time Warping
In this phase, we iteratively use Subsequence Dynamic Time Warping (SDTW, [40,56]) in HSID to integrate the information on the variety of consumption patterns given by the signatures extracted at the beginning of the procedure (Step A), and to correct the 2-state trajectories produced as output in the FHMM step (Step B). We integrated the SDTW pattern-matching technique in our algorithm according to the following procedure:
C.1 Event partitioning. The total consumption trajectory and the single appliances trajectories estimated by FHMM are split into time windows of equal length, henceforth called events. The length of the event $L_e$ is tuned as an average of the durations of the pulses in the total consumption trajectory, in order to be consistent with the typical power usage durations in the dataset.
C.2 Appliance FHMM ranking. For each event, the appliances are ranked in decreasing order according to the values of the 90-th percentile of their FHMM trajectories within the event. Each appliance is labeled with an ordinal value $R_F^i$. This rank gives an idea of the contribution each appliance brings to the total event, so that the trajectories of the highest ranked appliances, i.e., the ones with a larger power contribution in the considered event, can be corrected first, as they have the largest impact on total consumption and "hide" the trajectories of less contributing appliances.
C.3 Appliance SDTW ranking. For each event, appliances are assigned a second ranking $R_S^i$. This is computed by evaluating the similarity between the observed total power consumption trace of the considered event and the signature $s^i$ of each appliance. The similarity is given by the distance between the two trajectories as evaluated by DTW: the larger the distance, the lower the similarity of the two trajectories. Therefore, highest ranked signatures are the ones closest to the power trace of the event. DTW is applied as a Subsequence Dynamic Time Warping (SDTW) [56], because the length of events is usually much shorter than the total length of the signature. Indeed, signatures are extracted from appliance-level traces as long as the training period, as explained in Section 3.1.1, while consumption events usually last a few minutes. Therefore, signatures can potentially contain more than one consumption event: this means that the signature is not entirely compared to the total power consumption trace of the event, but it is scanned in order to find the best event-matching subsequence of $s^i$.
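
The subsequence matching and distance-based ranking described in step C.3 can be sketched as follows. This is a minimal, textbook subsequence DTW with an absolute-difference local cost and the three standard step directions; the function name `subsequence_dtw`, the toy signatures, and the toy event are hypothetical, and the local cost and step pattern actually adopted in HSID are not specified in this section, so the sketch should not be read as the implementation used in this work.

```python
import numpy as np


def subsequence_dtw(query, reference):
    """Best DTW match of `query` inside `reference`.

    Returns (cost, start, stop) so that reference[start:stop] is the matched
    subsequence. Assumed local cost: absolute difference between samples.
    """
    q = np.asarray(query, dtype=float)
    r = np.asarray(reference, dtype=float)
    n, m = len(q), len(r)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = 0.0  # the match may start at any position of the reference
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(q[i - 1] - r[j - 1])
            D[i, j] = local + min(D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    end = int(np.argmin(D[n, 1:])) + 1  # the match may end at any position
    # Backtrack to the first query sample to find where the match starts.
    i, j = n, end
    while i > 1:
        step = int(np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return float(D[n, end]), j - 1, end


# Hypothetical signatures (W) spanning several usage events, and one short event.
signatures = {
    "fridge": [0, 0, 100, 100, 100, 0, 0],
    "kettle": [0, 0, 1500, 1500, 0, 0, 1500, 1500, 1500, 0],
}
event = [1500, 1500, 1500]

# Rank appliances by increasing DTW distance (smaller distance = more similar).
distances = {name: subsequence_dtw(event, sig)[0] for name, sig in signatures.items()}
ranking = sorted(distances, key=distances.get)
print(ranking, distances)
```

Setting the first row of the accumulated-cost matrix to zero and taking the minimum over the last row is what lets a short event be matched against any portion of a much longer signature, rather than against the signature as a whole.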
C.4 SDTW pattern correction. In this very last phase of the algorithm, the 2-state power load trajectories estimated by FHMM are corrected taking into account the information given by the signatures and the ranking vectors $R^i_F$ and $R^i_S$. SDTW pattern matching is iteratively applied according to the following three-case heuristics:
1. True positive detection. If $R^1_F = R^1_S$, i.e., an appliance is ranked first both by FHMM and DTW rankings, the on/off operating state detected by FHMM is assumed to be correct. Given that, the estimated consumption pattern for that appliance can be refined by replacing the piecewise constant Markov state with the best matching portion of signature $s^{1,*}_s$ found at point C.3.
2. Possible false positive detection. If $R^1_F \neq R^1_S$, meaning the first ranked appliance by FHMM is not the one with the most similar signature to the total load power, FHMM may have generated a false on event. In this uncertain situation, we assume that the state activation is inertial [18], i.e., if an appliance was on at time $t-1$ it is likely to be on again at time $t$ and vice versa. This assumption penalizes unrealistic frequent state transitions (e.g., continuously turning on and off an appliance in a short time interval) and is implemented as follows: if the fixture ranked as $R^1_F$ had a larger normalized contribution to the total power consumption over the previous k time steps than the appliance ranked as $R^1_S$, it is kept as active and corrected with its signature $s^{i,*}_s$ (see step 1). Otherwise, it is switched off according to the next case.
3. False positive detection. If none of the previous cases is met, it is assumed that the on state generated by FHMM for the considered appliance is a false positive, therefore the appliance is switched off and its consumption trajectory is set to its lowest Markov state.
After the first ranked appliance is corrected, the residual total power consumption is updated and step C.4 is repeated recursively for the remaining appliances. Thus, the procedure is iterated in order to correct the signal of all the simultaneously operating appliances, without requiring only one appliance operating at each time step. It is worth mentioning that an exception holds for corrections 1 and 2: signature correction is not implemented when its implementation would introduce noise on the signal estimated by FHMM, i.e., when the total consumption signal is more similar to the estimated 2-state appliance signal than to any signature $s^i$. A further exception holds for appliances with a training trajectory confined in a very narrow power interval (i.e., lower or equal to the lowest positive Markov state in the state space of the problem): those appliances are corrected before the other ones and set constantly equal to the average of their signature, because the latter is noisy but varies within a narrow interval. Finally, it is worth noticing that no false negative cases are explicitly considered, as they are automatically solved by difference, given that each iteration considers the updated residual of the total power load for ranking and similarity with signatures.
Overall, we expect HSID to benefit from ISDTW in order to produce accurate disaggregated end-use trajectories. In a previous work by Elafoudi et al. [50], DTW was successfully adopted to classify energy consumption events by matching with labeled templates stored in a reference library. This was demonstrated to achieve high performance in appliance usage detection. In HSID, we extend the use of DTW so that it exploits signature information to shape and correct FHMM-estimated trajectories and increase the accuracy in end-use trajectory estimation, consequently providing a better estimate of the actual amount of energy used by each appliance, as well as reducing event detection errors. The ISDTW we include in HSID can process long signatures comprising multiple usage events, in order to find the best matching portion of each signature (subsequence) and, afterwards, exploit the latter to correct and refine the disaggregated end-use trajectories. As HSID iteratively repeats this process for each event on the residual power load until all the load of the event has been disaggregated among different appliances, we adopt ISDTW in a decomposition, rather than classification, mode.

3.2. Semi-supervised HSID algorithm

Following the limitations posed by the intrusiveness of the supervised learning phase, in this paper we also propose a semi-supervised (i.e., appliance-level measure free) version of the HSID algorithm. This second version of our algorithm does not require appliance-level ground-truth training data and manipulates only the total energy consumption metered at the household level. In this semi-supervised scenario, we assume that a single-event signature, i.e., the signature of each appliance for a single event, can be retrieved from the total power consumption pattern upon knowledge of a time window in which the considered appliance is working without interference from other appliances. This is equivalent to a situation in which no on-device smart sensors are installed, thus avoiding the intrusiveness from the point of view of measurements and hardware, but energy activity diaries [57] filled by energy consumers for a very limited time are available. This situation is realistic, as many energy utilities and multi-utilities worldwide are developing web portals to interact with their customers and provide them with customized services. Energy consumption diaries can be easily included in such portals and users can be allowed to insert information on their consumption (timing and type of device used) as an opt-in, therefore their privacy is safeguarded and intrusiveness avoided. This scenario translates into the following modifications of HSID with respect to the supervised version described in the previous section: (i) appliance signatures to feed the ISDTW module of HSID are not retrieved from an intrusively gathered appliance-level training dataset as explained in step A of Section 3, but each signature $s^i$ consists of a single-event signature extracted from the total household power consumption trace, when no other appliances are simultaneously working and (ii) the input dataset $D_T$ for training the FHMM module (see step B of Section 3) is the union of such signatures, $D_T = \{s^1, s^2, \ldots, s^N\}$.

4. Experimental settings

4.1. Data

We tested the HSID algorithm against the AMPds dataset [58], which contains the power consumption readings of a single house located in the Vancouver region in British Columbia (Canada). Data metered at the end-use level at 1 min resolution are available for 1-year time, from April 1st 2012 to March 31st 2013. We considered only the appliances contributing more than 5% of the total indoor consumption, i.e., heat pump, forced air furnace, clothes dryer, kitchen fridge, and security/network equipment. No outdoor uses, i.e., outside plugs, office uses and "hybrid/undefined" appliances, e.g., room aggregate consumption, were taken into account,
in order to focus only on the most important residential activities and on traces measured at the end-use level for calibration, rather than those measured at the room level.
We tested the supervised version of HSID against two sub-periods, extracted from the available 1-year dataset in order to account for possible seasonality effects on energy use: (i) 6 Spring/Summer weeks of data from May 16th 2012 to June 30th 2012, with the first two weeks used for appliance signature extraction and FHMM calibration, and the remaining month for validation; (ii) 6 weeks from the Winter period, from November 16th 2012 to December 31st 2012, again divided into one third of the dataset for calibration and two thirds for validation. These proportions are in line with those adopted in other state-of-the-art end-uses or energy conservation studies (e.g., [59,5,18,26]). Since in this section we first apply the algorithm on a supervised case, we assumed data measured at the end-use level to be available for FHMM calibration purposes and signature extraction only during the 2-week training period, while for the validation period we used them only as ground-truth data for assessing the accuracy of the model outputs. Tests with the semi-supervised version of HSID (Section 5.4) did not consider the 2-week training period, but only single-event signatures for each appliance were assumed as input to FHMM and ISDTW. The parameters of both versions of the HSID algorithm were set as follows: (i) $L_e$ (event length for SDTW iteration) equal to 20 min; (ii) k (previous time step window for false positive event detection) equal to 30 min.
Finally, we assess the sensitivity of the algorithm with respect to different levels of signal noise (Section 5.2) and with respect to the number of appliances (Section 5.3). For this second experiment, we tested the HSID algorithm against the REDD dataset [26], considering house number 4, which includes power consumption readings for up to 11 appliance types in the same household. Given the presence of missing readings and for consistency with previous experiments, we down-sampled the original metering resolution (i.e., 3–5 s) to 1 min, using 50% of the data for calibration (i.e., data with timestamps from 17.04.2011 - h. 1:16 to 29.04.2011 - h. 20:00, which cover the length of approximately 9 days, as data gaps are disregarded) and the remaining 50% for validation (i.e., data with timestamps from 29.04.2011 - h. 20:01 to 04.06.2011 - h. 00:45, which, again, cover the length of approximately 9 days after data gaps are removed).2

2 Data present two big gaps between 02.05.2011 and 22.05.2011, and between 29.05.2011 and 03.06.2011, plus other minor gaps.

4.2. Performance metrics

The evaluation of the outcomes from disaggregation algorithms against a set of comprehensive and consistent metrics has been mentioned as one of the main challenges in the literature on NILM [20–22]. We contribute an assessment of the quality of the disaggregation results from HSID against ground-truth data according to the following set of metrics, selected among the others because they overall cover the characteristics of the estimated end-use signals that should be considered for a complete evaluation. The F-score (Fs), as introduced in Batra et al. [22], is evaluated for each appliance i according to the following formula

$$Fs_i = \frac{2 \cdot PC_i \cdot RC_i}{PC_i + RC_i} \qquad (5)$$

where $RC_i$ and $PC_i$ are the recall and precision, respectively, evaluated for appliance i as $RC_i = \frac{TP_i}{TP_i + FN_i}$ and $PC_i = \frac{TP_i}{TP_i + FP_i}$. $TP_i$, $FP_i$, and $FN_i$ are the number of events correctly classified when appliances are on (true positive), the number of events classified as on being the appliance actually off (false positive), and the number of events classified as off being the appliance actually on (false negative). The precision can be interpreted as a measure of how many detected events are relevant and the recall as a measure of how many relevant events are properly detected. The F-score indicator evaluates how good the algorithm is in classifying the operating states of the considered appliances. It ranges from 0 (0% accuracy on state detection) to 1 (100% accuracy on state detection).
The assigned power contribution error (PCE) gives information on the model accuracy in assigning the power consumption share to each appliance i, according to the following formula:

$$PCE_i = \frac{\left| \sum_{t=1}^{H} y^i_t - \sum_{t=1}^{H} \hat{y}^i_t \right|}{\sum_{t=1}^{H} Y_t} \qquad (6)$$

where $y^i$ and $\hat{y}^i$ are the ground-truth and estimated power consumption for appliance i respectively. An accurate algorithm would produce PCE values close to 0.
The R2 score assesses the accuracy of end-use trajectory reproduction. It is defined as

$$R^2_i = 1 - \frac{\left( \sum_{t=1}^{H} y^i_t - \sum_{t=1}^{H} \hat{y}^i_t \right)^2}{\sum_{t=1}^{H} \left( y^i_t - \sum_{t=1}^{H} y^i_t / H \right)^2} \qquad (7)$$

This set of metrics assesses the performance of NILM algorithms from different viewpoints, corresponding to increasing levels of information provided on the signal characteristics and, correspondingly, different value for electric utilities and decision makers interested in designing energy demand management strategies.
F-score gives a basic level of information about the capability of NILM algorithms to properly detect the appliances' operating states but does not provide any information about the power consumption. In turn, PCE gives an overall indication of how well NILM models estimate the power consumption assigned to each appliance. This is more informative than F-score to design customized feedback and other demand management strategies, since a model with a low PCE would be able to inform decision makers about major power uses and power ratios. Finally, the R2 score evaluates the finest aspects of NILM outputs, i.e., the accuracy in characterizing power consumption trajectories, which becomes essential for improving the information level on energy use. This makes it possible to evaluate values of power consumption during peak periods, retrieve information about use frequencies and timing for major uses and, through the analysis of consumption patterns, identify changes in the electric equipment of a house or potential savings from equipment renewal.

4.3. Number of states for FHMM

The computational complexity of FHMM algorithms grows exponentially in the number of states considered. It is in the order of $O(T M^{2N})$ for a problem with M states for each system element (appliance), N appliances and T time instances [54]. On the other hand, the higher the number of states, the higher the number of appliances potentially extracted by the disaggregation, in principle. Yet, it is not easy to decide a priori which number of states is suitable for accurately describing the consumption pattern of different fixtures just considering traditional FHMM, as each fixture has its own consumption pattern and multi-state or continuously variable devices might potentially require a very high number of states to be properly modeled. In order to explore how big such a number of states should be, we performed a preliminary sensitivity analysis by disaggregating a 1-month power consumption contributed by 4 appliances with distinct signatures that should maximize the potential of FHMM with many states. We considered an increasing number of states (2, 3, 4 and 7) and evaluated the variation in performance through the F-score and R2 metrics, which
range from 0 to 1 in the best case [22]. The values obtained for the two performance metrics are represented in the two radar plots in Fig. 3, where each axis reports the performance metric for one appliance and each colored line connects the performance given by a specific setting of FHMM states. In principle, as the two metrics should be maximized, one would like to obtain a colored line connecting all the vertices of the radar plot, with value 1. Results show that the performance does not monotonically increase with the number of states, suggesting that increasing the number of states raises the computational cost of the algorithm, but this does not guarantee an improvement of the disaggregation accuracy. Consequently, alternative solutions to the increase of FHMM states need to be formulated in order to get accurate disaggregation results at reduced cost. Our HSID algorithm addresses this challenge from the point of view of computational complexity. In the FHMM module, the number of possible states allowed for each appliance to perform disaggregation is limited to 2, as mentioned in Section 3.1.2, thus representing the on/off state of each appliance. The FHMM module is then coupled with the ISDTW module explained in Section 3.1.3 to correct the end-use trace patterns. This combination of 2-state FHMM and ISDTW yields a complexity in the order of $O(T\, 2^{2N} + N \lceil T/L_e \rceil)$, where $L_e$ is the time length defined for each event (therefore lower than T). From the point of view of disaggregation outputs, we expect the iterative correction of ISDTW on 2-state FHMM outputs to significantly improve the disaggregation accuracy.

5. Numerical results

5.1. Supervised HSID disaggregation

We compared the HSID algorithm with the 2-state FHMM benchmark algorithm developed in Batra et al. [22] (Fig. 4). The radar plot in the figure must be read similarly to Fig. 3 (the performance index relative to each appliance is reported on each axis and different colors refer to different algorithms) keeping in mind that, unlike the F-score and R2 metrics, the PCE shows good performance for values close to zero.
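To make the reading of these plots concrete, the F-score of Eq. (5) and the PCE of Eq. (6) can be sketched as below. This is a hedged illustration rather than the evaluation code used in the paper: the function names, the `on_threshold` parameter used to turn a power trace into on/off states, and the choice of counting time steps as "events" are assumptions. R2 is left out because its exact per-time-step form is not restated here.

```python
import numpy as np


def f_score(y_true, y_est, on_threshold=10.0):
    """F-score of Eq. (5), counting each time step as one event (assumption).

    An appliance is treated as "on" when its power exceeds `on_threshold`,
    a hypothetical parameter introduced only for this sketch.
    """
    true_on = np.asarray(y_true, dtype=float) > on_threshold
    est_on = np.asarray(y_est, dtype=float) > on_threshold
    tp = int(np.sum(true_on & est_on))    # on, detected as on
    fp = int(np.sum(~true_on & est_on))   # off, detected as on
    fn = int(np.sum(true_on & ~est_on))   # on, detected as off
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def pce(y_true, y_est, total):
    """Assigned power contribution error of Eq. (6) for one appliance."""
    num = abs(float(np.sum(y_true)) - float(np.sum(y_est)))
    return num / float(np.sum(total))


# Toy traces (W): the estimate misses one of the two "on" time steps.
y_true = [0, 100, 100, 0]
y_est = [0, 90, 0, 0]
total = [50, 150, 150, 50]
print(f_score(y_true, y_est), pce(y_true, y_est, total))
```

The toy traces show why the two metrics point in opposite directions: a missed "on" step lowers the F-score towards 0, while the same mistake raises the PCE away from 0.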
The HSID algorithm overall achieves very good performance on
all the three metrics considered. The F-score shows that the algo-
rithm is able to correctly detect the operating states of each appli-
ance with a rate higher than 95% on four appliances out of five
simultaneously operating (and always higher than 70%). Slightly
lower, but still very good, results are achieved also by the 2-state
FHMM benchmark. Yet, the HSID algorithm significantly outperforms the benchmark on the other two metrics, attaining PCE values lower than 2% for all five appliances and an R2 close to 1 for three out of five appliances. These results demonstrate that HSID is
able to detect the operating states of the appliances, while also
providing information on the contribution by each appliance to
the total power consumption and on the consumption patterns
for most of the appliances (Fig. 5), despite multiple appliances
operating simultaneously and, thus, overlapping end-uses. It is
worth noticing that the largest improvement in correctly assigning
power (Fig. 4, middle panel) is mainly on the two most contribut-
ing appliances, thus the result is even more meaningful. Finally,
another numerically relevant result is obtained on the estimation
accuracy of the power consumption trajectories, which can be seen
in the R2 metric (Fig. 4, bottom panel) and by visual inspection of
the trajectories contrasted against ground-truth data (Fig. 5). A
careful analysis of Fig. 5 also clarifies why for two appliances (i.e., the security equipment and the forced air furnace) the performance in terms of R2 is relatively low while the PCE is kept very low. The reason is that the trajectory of those two appliances is very noisy, and thus difficult to predict. However, it varies in a narrow range compared to those of the other appliances, thus allowing for a correct estimation of its power contribution with average values. As mentioned in Section 3, the HSID algorithm does not perform signature correction for such appliances, as noise is
prevailing and, therefore, trajectories cannot be reproduced accu-
rately (even though visual analysis shows improvements with
respect to the trajectories produced by the benchmark 2-state
FHMM). Still, the algorithm is able to filter out the noise for such
appliances and estimates average consumption for each of them
as captured by the low values of PCE.
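The correction logic referred to above (the three-case heuristic of step C.4 in Section 3, plus the exception for narrow-range appliances) can be sketched as follows. This is only an illustrative reading of the text, not the authors' code: the function names, the dictionary of normalized contributions over the previous k time steps, and the action labels are all hypothetical.

```python
from statistics import mean


def correct_first_ranked(rank_fhmm, rank_sdtw, recent_share):
    """Decide how to treat the appliance ranked first by FHMM.

    rank_fhmm, rank_sdtw: appliance names ordered best-first by each module.
    recent_share: normalized contribution of each appliance to the total load
    over the previous k time steps (hypothetical input for this sketch).
    """
    first_f, first_s = rank_fhmm[0], rank_sdtw[0]
    if first_f == first_s:
        # Case 1: both rankings agree, refine with the matched signature.
        return first_f, "apply_signature"
    if recent_share.get(first_f, 0.0) > recent_share.get(first_s, 0.0):
        # Case 2: rankings disagree, but inertia keeps the appliance on.
        return first_f, "apply_signature"
    # Case 3: treated as a false positive, set to the lowest Markov state.
    return first_f, "switch_off"


def narrow_range_estimate(signature, horizon):
    """Flat estimate for noisy appliances confined to a narrow power range."""
    return [mean(signature)] * horizon


print(correct_first_ranked(["fridge", "dryer"], ["dryer", "fridge"],
                           {"fridge": 0.1, "dryer": 0.5}))
print(narrow_range_estimate([10, 12, 14], 3))
```

In the full procedure this decision would be repeated on the residual load after each corrected appliance is subtracted; the sketch covers a single iteration only.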
The promising results we obtained for the supervised experiment showed the potential information content of appliance signatures, which appears to be key to allow for an accurate estimation of end-use power consumption trajectories. This finding opens space for a discussion on the usability of the HSID algorithm under conditions where the availability of signatures is reduced, for instance when just a few end-use ground-truth data are available for algorithm training, or when the quality of the signal to disaggregate is altered, due for instance to measurement noise or a high number of appliances. We address these questions with three ad hoc experiments in the next paragraphs. Firstly, the sensitivity of the algorithm to aggregate signal noise level is assessed. Secondly, we test the sensitivity of HSID with respect to a high number of appliances. Finally, opportunities for exploiting the information content of signatures collected without intrusive measurements at the appliance level, while relying on the total consumption coupled with consumption diaries, are analyzed and quantitatively discussed.

Fig. 3. F-score and R2 disaggregation accuracy on four appliances with increasing number of FHMM states. (For a better interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5.2. Algorithm sensitivity to signal noise

The results discussed in Section 5.1 suggest that the HSID algorithm successfully attains good performance in supervised disaggregation problems, assuming the availability of a training dataset to accurately define the signatures of each appliance. The corrections introduced in HSID through signature matching allow end-use consumption trajectories to be accurately approximated. Yet, no signal noise was considered, as the total electricity consumption trace to disaggregate consisted of the sum of appliance-level traces. In real-world applications, the quality of the signal as measured by the smart meter can be affected by noise, due to measurement errors or interference among appliances. We expect this noise to affect the quality of the approximated disaggregated traces and, in order to evaluate the impacts of less accurate signatures on the disaggregation performance, we ran the following sensitivity analysis. First, we estimated the measurement error from the AMPds dataset as the difference between the total power load trace measured by the house-level smart meter and the sum of appliance-level power load traces. Afterwards, we repeated the supervised disaggregation experiment discussed in the previous section after adding that noise to the total power trace to disaggregate. More specifically, we considered noise ratios of 0.5, 1, 1.25, and 1.5, one for each algorithm run, in order to evaluate the sensitivity of the HSID performance with respect to different noise magnitudes. The results of this sensitivity analysis are represented in Fig. 6. The algorithm shows good robustness on the F-score metric under noisy signal conditions, as the introduction of signal noise significantly affected only the disaggregation accuracy of the kitchen fridge trace, which is nevertheless kept to values around 0.6. As expected, signal noise caused a performance degradation for those metrics more strictly linked to trace shape and pattern. Indeed, PCE increases to values higher than 5% for all appliances as soon as noise is considered, and R2 drops to zero for the kitchen fridge appliance. Still, PCE is lower than 20% for all appliances when the actual noise of the dataset is introduced (i.e., noise ratio equal to 1) and it is lower than 40% for all appliances in the worst case when the actual noise is artificially increased by 50% (i.e., noise ratio 1.5), being overall lower than 6% for 3 appliances out of 5, in all noise scenarios considered. The performance in terms of R2 also decreases as the noise level increases, as signal noise makes it harder to reproduce accurate trajectories through signature matching. Yet, for noise ratios lower than 1.25 its values are still higher than 0.6 for heat pump and clothes dryer, i.e., two out of the three appliances that presented positive values of R2 in the supervised experiment without signal noise. We can conclude that the disaggregation performance of HSID can be significantly affected by signal noise, but overall the algorithm shows good performance robustness in terms of event detection (F-score) and appliance-level contributions for noise levels in the range of the actual measured noise.

Fig. 4. Disaggregation performance metrics for supervised HSID and 2-state FHMM algorithm on 5 simultaneously operating appliances. From top: F-score, PCE, and R2 score. Data refer to the Summer period. The performance on the Winter period was found to be very similar.

Fig. 5. Consumption patterns estimated at the end use by the supervised HSID algorithm, compared with observed values. A stacked area with all the overlapping end-uses is represented in the top part of the figure. The four charts in the bottom part represent a detailed comparison of model output vs observed values of the 5 considered end-uses. (For a better interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5.3. Algorithm sensitivity to higher number of appliances

The results discussed in the previous sections demonstrate that the HSID algorithm is able to accurately disaggregate the end-use
consumption of the major appliances (i.e., devices contributing more than 5% of the total power consumption), also showing sufficient robustness against increasing levels of noise. In this section, we assess the sensitivity of the algorithm's performance when tested against a larger set of appliances. Specifically, we test the disaggregation on one household from the REDD dataset [26], which includes 11 different appliance types.
The results of this experiment, computed on the validation dataset, are reported in Table 1, with the algorithm performance measured for each appliance in terms of the three metrics formalized in Section 4.2. Although HSID is trained on a shorter period than in the previous experiment, it attains good performance in terms of F-score metric, with an average value across the 11 appliances equal to 0.61. More precisely, the F-score metric is greater than 0.8 for 6 out of 11 appliances (and equal to 1 for 4 appliances), and only 4 out of 11 appliances are below 0.5. These results are comparable to those presented in the recent works by Zhao et al. [46,49], even though these works considered different houses of the REDD dataset, which include at most 9 appliances. This high disaggregation accuracy is confirmed by the values of PCE. HSID attains an average value in this metric below 4%, with the maximum error equal to 14% for the disaggregation of the lighting power consumption, thus outperforming the FHMM disaggregation method of Kolter and Johnson [26] which reported an average error of over 50%. These results are also in line with those presented in the recent work by Zhao et al. [49], where pie charts from the disaggregation of power load from a 7-appliance home from REDD show an end-use estimate error below or equal to 5%. In our case, PCE is lower than 3% for 8 appliances out of 11, demonstrating that HSID achieves good results for a comparable number of appliances, despite the complexity introduced by the overall higher total number of appliances. Finally, the performance evaluated in terms of R2 shows the largest degradation, with positive values attained for only 3 appliances. This low performance can be explained by the increased variability in the total metered consumption when numerous heterogeneous appliances are considered and overlap. It is worth noting that HSID attains positive values of R2 for both the dishwasher and the fridge (included in the kitchen outlets), which are the appliances contributing the most to the total consumption as well as the ones characterized by a regular signature. These results suggest that increasing the number of appliances impacts negatively on the ability of HSID to reproduce the single end-use trajectories, as shown by the low values of R2. However, the algorithm attains good performance in the other two metrics, thus showing good scalability to a large set of appliances for both event detection and recognition of appliance-level contributions.

Table 1
Disaggregation performance metrics for HSID tested on a house comprising 11 appliances from the REDD dataset.

Appliance          F-score   PCE (%)   R2
Kitchen outlets    1         8.12      0.17
Lighting           1         14.15     –
Furnace            1         2.80      –
Dishwasher         0.61      0.73      0.81
Stove              1         2.74      0.03
Washer-dryer       0.05      8.96      –
Miscellaneous      0.99      0.01      –
Bathroom GFI       0.82      1.96      –
Outlets unknown    0.20      1.96      –
Air conditioning   0.03      0.81      –
Smoke alarms       0.01      0.13      –

Fig. 6. Disaggregation performance metrics for HSID under different levels of noise on the smart-metered trace to disaggregate. From top: F-score, PCE, and R2 score. (For a better interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

5.4. Semi-supervised HSID disaggregation

In order to test the performance of the semi-supervised extension of HSID as described in Section 3.2, we ran an experiment similar to the one described in Section 5.1, but considering that only a single-event signature for each appliance is available, as retrieved from the total power consumption pattern. Fig. 7 reports the results of this semi-supervised application. Not surprisingly, results show that the semi-supervised use of the HSID algorithm underperforms the supervised, partially intrusive experiment. However, some important and promising aspects do emerge. First, despite the small amount of information available, the performance is still acceptable: only one appliance has an F-score lower than 0.7, all appliances have PCE lower than 10%, and two appliances still present trajectory estimation accuracy (R2) higher than 0.8. Second, it is worth examining the reason for this degradation of performance, which is probably twofold. The one-event signatures produce a loss of information on the potential variability of the appliance signatures, which is instead represented in the 2-week training dataset used in the supervised application. Moreover, the one-event signatures fail by definition in characterizing the behavioral component of the power consumption, i.e., the information about energy use habits that can only be retrieved a priori through the study of end-use consumption patterns for a significantly longer period. We took into account such information in the supervised application by the FHMM module, where the data retrieved from an intrusive measurement period allowed for training the probabilities of initial operating state and state transition for each appliance. These probabilities
constitute the statistical expression of energy consumers' habits. In contrast, this information is missing in the semi-supervised experiment, causing a performance decline. However, the results of this semi-supervised application of the HSID algorithm suggest that the contribution given by the use of signature patterns is essential to accurately perform power load disaggregation.

Fig. 7. Disaggregation performance metrics for semi-supervised HSID. From top: F-score, PCE, and R2 score.

6. Final remarks

The knowledge about residential end-use power consumption patterns is essential to design and implement customized energy demand management strategies and reduce costs for electric utilities. Moreover, literature claims that consumption feedback to users, tailored on information at the appliance level, has a high potential for fostering high energy savings in the residential and commercial sector. The availability of high-resolution, smart metered, power consumption data, jointly with NILM algorithms, allows characterizing the distribution of the power consumption over different end-uses with low hardware and maintenance costs, and reduced intrusiveness from on-device meters. Yet, state-of-the-art NILM techniques present several challenges that so far are preventing their deployment at large scale. In this paper we proposed the novel Hybrid Signature-based Iterative Disaggregation (HSID) algorithm for Non-Intrusive Load Monitoring, combining Factorial Hidden Markov Models and Iterative Subsequence Dynamic Time Warping. More specifically, we presented a supervised version of the algorithm, requiring appliance-level measurements for calibration, and a semi-supervised version of the algorithm, retrieving appliance-level information from the aggregate smart-metered signal. We tested the algorithm on real power consumption datasets and evaluated the performance according to a set of different selected metrics. Results from the supervised disaggregation of the power load contributed by five indoor end-uses simultaneously operating show that the algorithm is able to tackle the disaggregation for multiple operating appliances, outperforming the accuracy of a state-of-the-art FHMM benchmark without adding to the computational burden. Moreover, we demonstrated that the algorithm accurately reproduces the consumption trajectory of each end-use, which is a major challenge in the field of energy disaggregation since many NILM algorithms can only afford the detection of operating states. The accuracy demonstrated by HSID in reproducing the consumption trajectories overcomes this challenge and allows supporting the design of more informed demand management strategies on the basis of the additional information on single appliance contribution to total consumption, use frequencies, and timing, which are essential to estimate and control peak and base load demand.
The robustness shown by the HSID algorithm to altered quality of smart metered signals and increasing number of appliances, and its potential to perform NILM just accounting for the information given by single-event signatures (i.e., without an intrusive training period) are among the key findings of this research. Indeed, results suggest that HSID has the potential for avoiding the intrusive measurement periods required by other approaches, which are the main burden for large-scale applications of existing supervised NILM algorithms.
The semi-supervised HSID approach requires further testing, particularly considering its application to appliances showing a variety of signatures (e.g., a washing machine having different washing programs), which cannot be captured by a single-event signature extraction. In general, the obtained outcome represents an opportunity for further development of NILM methods in a truly non-intrusive perspective, especially if supported by social computing, consumers' involvement techniques and ICT platforms. Indeed, platforms for increasing consumers' awareness and engagement in demand management are undergoing a period of significant development, representing promising tools to pursue the usability and scalability of algorithms like HSID at reduced cost.3

3 See, for instance, the Opower 3.0 platform at http://opower.com/company/news-press/press_releases/10.

Ongoing and future research should focus, at a first stage, on a
further testing of the algorithm against other datasets, possibly gath- international conference on environment and electrical engineering
(EEEIC). IEEE; 2015. p. 1175–80.
ered in different spatial contexts and with different meter resolu-
[20] Butner RS, Reid DJ, Hoffman M, Sullivan GP, Blanchard J. Non-intrusive load
tions. Also, potential use for completely non-intrusive applications monitoring assessment: literature review and laboratory protocol. Pacific
should be further explored and tested on real cases, coupled with Northwest National Laboratory; 2013.
the design, implementation and monitoring of demand-side man- [21] Barker S, Kalra S, Irwin D, Shenoy P. NILM redux: the case for emphasizing
applications over accuracy. In: NILM-2014 workshop.
agement strategies [2]. Finally, it would be worth assessing the suit- [22] Batra N, Kelly J, Parson O, Dutta H, Knottenbelt W, Rogers A, et al. NILMTK: an
ability of the algorithm to perform end-use disaggregation in fields open source toolkit for non-intrusive load monitoring. In: Proceedings of the
other than power consumption, such as residential water consump- 5th international conference on future energy systems. ACM; 2014. p. 265–76.
[23] Giri S, Bergés M. An energy estimation framework for event-based methods in
tion [60,61] or combined water-energy or gas data [62]. Multi- non-intrusive load monitoring. Energy Convers Manage 2015;90:488–98.
utilities as well as energy, water and gas providers would benefit [24] Makonin S, Popowich F. Nonintrusive load monitoring (NILM) performance
from the knowledge about the end-use share of consumption at dif- evaluation. Energy Efficiency 2015;8:809–14.
[25] Bernard T, Marx M. Unsupervised learning algorithm using multiple electrical
ferent levels: demand management, network maintenance, and low and high frequency features for the task of load disaggregation. In:
strategic planning. Proceedings of the 3rd international workshop on NILM.
[26] Kolter JZ, Johnson MJ. REDD: a public data set for energy disaggregation
research. In: Workshop on data mining applications in sustainability
Acknowledgment (SIGKDD), San Diego, CA, Citeseer. p. 59–62.
[27] Gabaldón A, Ortiz-García M, Molina R, Valero-Verdú S. Disaggregation of the
electric loads of small customers through the application of the Hilbert
The research leading to these results has received funding from transform. Energy Efficiency 2014;7:711–28.
the European Union’s Seventh Framework Programme (FP7/2007– [28] Kelly J, Knottenbelt W. Neural NILM: deep neural networks applied to energy
2013) under grant agreement No. 619172 (SmartH2O: an ICT Plat- disaggregation. In: Proceedings of the 2nd ACM international conference on
embedded systems for energy-efficient built environments. ACM; 2015. p.
form to leverage on Social Computing for the efficient management 55–64.
of Water Consumption). The authors would like to thank the editor [29] Amenta V, Tina GM. Load demand disaggregation based on simple load
and the anonymous reviewer for their useful suggestions that con- signature and user’s feedback. Energy Proc 2015;83:380–8.
[30] Mueller JA, Kimball JW. An accurate method of energy use prediction for
tributed to improve the manuscript. systems with known composition. In: Proceedings of the 3rd international
workshop on NILM.
[31] Piga D, Cominola A, Giuliani M, Castelletti A, Rizzoli AE. Sparse optimization
References for automated energy end use disaggregation. IEEE Trans Control Syst Technol
2016;24:1044–51.
[1] Geller H, Harrington P, Rosenfeld AH, Tanishima S, Unander F. Polices for [32] Froehlich J, Larson E, Gupta S, Cohn G, Reynolds M, Patel S. Disaggregated end-
increasing energy efficiency: thirty years of experience in OECD countries. use energy sensing for the smart grid. IEEE Pervas Comput 2011;10:28–39.
Energy Policy 2006;34:556–73. http://dx.doi.org/10.1016/j.enpol.2005.11.010. [33] Liang J, Ng SK, Kendall G, Cheng JW. Load signature study - Part II:
Hong Kong Editorial Board meeting presentations. Disaggregation framework, simulation, and applications. IEEE Trans Power
[2] Gaiser K, Stroeve P. The impact of scheduling appliances and rate structure on Deliv 2010;25:561–9.
bill savings for net-zero energy communities: application to West Village. Appl [34] Abed-Meraim K, Qiu W, Hua Y. Blind system identification. Proc IEEE
Energy 2014;113:1586–95. http://dx.doi.org/10.1016/j.apenergy.2013.08.075. 1997;85:1310–22.
[3] Vassileva I, Campillo J. Increasing energy efficiency in low-income households [35] Baranski M, Voss J. Genetic algorithm for pattern detection in NIALM systems.
through targeting awareness and behavioral change. Renew Energy In: 2004 IEEE international conference on systems, man and cybernetics. IEEE;
2014;67:59–63. http://dx.doi.org/10.1016/j.renene.2013.11.046. Renewable 2004. p. 3462–8.
energy for sustainable development and decarbonisation. [36] Suzuki K, Inagaki S, Suzuki T, Nakamura H, Ito K. Nonintrusive appliance load
[4] Newborough M, Probert S. Intelligent automatic electrical-load management monitoring based on integer programming. In: SICE annual conference,
for networks of major domestic appliances. Appl Energy 1990;37:151–68. 2008. IEEE; 2008. p. 2742–7.
http://dx.doi.org/10.1016/0306-2619(90)90073-M. [37] Srinivasan D, Ng W, Liew A. Neural-network-based signature recognition for
[5] Fischer C. Feedback on household electricity consumption: a tool for saving harmonic source identification. IEEE Trans Power Deliv 2006;21:398–405.
energy? Energy Efficiency 2008;1:79–104. [38] Kolter JZ, Jaakkola T. Approximate inference in additive factorial HMMs with
[6] Carrie Armel K, Gupta A, Shrimali G, Albert A. Is disaggregation the holy grail of application to energy disaggregation. In: International conference on artificial
energy efficiency? The case of electricity. Energy Policy 2013;52:213–34. intelligence and statistics. p. 1472–82.
[7] Neenan B, Hemphill RC. Societal benefits of smart metering investments. Electr [39] Mauch L, Barsim KS, Yang B. How well can HMM model load signals. In:
J 2008;21:32–45. Proceedings of the 3rd international workshop on NILM.
[8] Chou JS, Yutami IGAN. Smart meter adoption and deployment strategy for [40] Sakoe H, Chiba S. Dynamic programming algorithm optimization for spoken
residential buildings in Indonesia. Appl Energy 2014;128:336–49. http://dx. word recognition. IEEE Trans Acoust Speech Signal Process 1978;26:43–9.
doi.org/10.1016/j.apenergy.2014.04.083. [41] Gullo F, Ponti G, Tagarelli A, Ruffolo M, Labate D, et al. Low-voltage electricity
[9] Colak I, Fulli G, Sagiroglu S, Yesilbudak M, Covrig CF. Smart grid projects in customer profiling based on load data clustering. In: Proceedings of the 2009
Europe: current status, maturity and future scenarios. Appl Energy international database engineering & applications symposium. ACM; 2009. p.
2015;152:58–70. http://dx.doi.org/10.1016/j.apenergy.2015.04.098. 330–3.
[10] Hart GW. Nonintrusive appliance load monitoring. Proc IEEE [42] Kim H, Marwah M, Arlitt MF, Lyon G, Han J. Unsupervised disaggregation of
1992;80:1870–91. low frequency power measurements. In: SDM. SIAM; 2011. p. 747–58.
[11] Morsali H, Shekarabi S, Ardekani K, Khayami H, Fereidunian A, Ghassemian M, [43] Shao H, Marwah M, Ramakrishnan N. A temporal motif mining approach to
et al. Smart plugs for building energy management systems. In: 2012 2nd unsupervised energy disaggregation. In: Proceedings of the 1st international
Iranian conference on smart grids (ICSG). p. 1–5. workshop on non-intrusive load monitoring, Pittsburgh, PA, USA.
[12] Kobus CB, Klaassen EA, Mugge R, Schoormans JP. A real-life assessment on the [44] Liao J, Elafoudi G, Stankovic L, Stankovic V. Non-intrusive appliance load
effect of smart appliances for shifting households’ electricity demand. Appl monitoring using low-resolution smart meter data. In: 2014 IEEE international
Energy 2015;147:335–43. http://dx.doi.org/10.1016/j.apenergy.2015.01.073. conference on smart grid communications (SmartGridComm). IEEE; 2014. p.
[13] Zoha A, Gluhak A, Imran MA, Rajasegarar S. Non-intrusive load monitoring 535–40.
approaches for disaggregated energy sensing: a survey. Sensors [45] Parson O, Ghosh S, Weal M, Rogers A. An unsupervised training method for
2012;12:16838–66. non-intrusive appliance load monitoring. Artif Intell 2014;217:1–19. http://dx.
[14] Zeifman M, Akers C, Roth K. Nonintrusive appliance load monitoring (NIALM) doi.org/10.1016/j.artint.2014.07.010.
for energy control in residential buildings: review and outlook. In: IEEE [46] Zhao B, Stankovic L, Stankovic V. Blind non-intrusive appliance load
transactions on consumer electronics, Citeseer. monitoring using graph-based signal processing. In: 2015 IEEE global
[15] Goncalves H, Ocneanu A, Berges M, Fan R. Unsupervised disaggregation of conference on signal and information processing (GlobalSIP). IEEE; 2015. p.
appliances using aggregated consumption data. In: The 1st KDD workshop on 68–72.
data mining applications in sustainability (SustKDD). [47] Pöchacker M, Egarter D, Elmenreich W. Proficiency of power values for load
[16] Singh S, Gulati M, Majumdar A. Greedy deep disaggregating sparse coding. In: disaggregation. IEEE Trans Instrum Meas 2016;65:46–55.
Proceedings of the 3rd international workshop on NILM. [48] Liu B, Luan W, Yu Y. A fully unsupervised appliance modelling framework for
[17] Elhamifar E, Sastry S. Energy disaggregation via learning powerlets and sparse NILM. In: Proceedings of the 3rd international workshop on NILM.
coding. In: AAAI. p. 629–35. [49] Zhao B, Stankovic L, Stankovic V. On a training-less solution for non-intrusive
[18] Kolter JZ, Batra S, Ng AY. Energy disaggregation via discriminative sparse appliance load monitoring using graph signal processing. IEEE Access
coding. In: Advances in neural information processing systems. p. 1153–61. 2016;4:1784–99.
[19] Bonfigli R, Squartini S, Fagiani M, Piazza F. Unsupervised algorithms for non- [50] Elafoudi G, Stankovic L, Stankovic V. Power disaggregation of domestic smart
intrusive load monitoring: an up-to-date overview. In: 2015 IEEE 15th meter readings using dynamic time warping. In: 2014 6th international
symposium on communications, control and signal processing (ISCCSP). IEEE. [58] Makonin S, Popowich F, Bartram L, Gill B, Bajic IV. AMPds: a public dataset for
p. 36–9. load disaggregation and eco-feedback research. In: Proceedings of the 2013
[51] Ruzzelli AG, Nicolas C, Schoofs A, O’Hare GM. Real-time recognition and IEEE electrical power and energy conference (EPEC).
profiling of appliances through a single electricity sensor. In: 2010 7th annual [59] Farinaccio L, Zmeureanu R. Using a pattern recognition approach to
IEEE communications society conference on sensor mesh and ad hoc disaggregate the total electricity consumption in a house into the major
communications and networks (SECON). IEEE; 2010. p. 1–9. end-uses. Energy Build 1999;30:245–59. http://dx.doi.org/10.1016/S0378-
[52] Dong M, Meira PC, Xu W, Chung C. Non-intrusive signature extraction for 7788(99)00007-9.
major residential loads. IEEE Trans Smart Grid 2013;4:1421–30. [60] Nguyen KA, Zhang H, Stewart RA. Development of an intelligent model to
[53] MacQueen J et al. Some methods for classification and analysis of multivariate categorise residential water end use events. J Hydro-environ Res
observations. In: Proceedings of the fifth Berkeley symposium on 2013;7:182–201. http://dx.doi.org/10.1016/j.jher.2013.02.004.
mathematical statistics and probability, Oakland, CA, USA. p. 281–97. [61] Cominola A, Giuliani M, Piga D, Castelletti A, Rizzoli AE. Benefits and challenges
[54] Ghahramani Z, Jordan MI. Factorial hidden Markov models. Mach Learn of using smart meters for advancing residential water demand modeling and
1997;29:245–73. management: a review. Environ Model Software 2015;72:198–214.
[55] Viterbi AJ. Error bounds for convolutional codes and an asymptotically [62] Tewolde M, Longtin J, Das S, Sharma S. Determining appliance energy usage
optimum decoding algorithm. IEEE Trans Inform Theory 1967;13:260–9. with a high-resolution metering system for residential natural gas meters.
[56] Müller M. Dynamic time warping. Information retrieval for music and motion, Appl Energy 2013;108:363–72. http://dx.doi.org/10.1016/j.
2007. p. 69–84. apenergy.2013.03.032.
[57] Desmedt J, Vekemans G, Maes D. Ensuring effectiveness of information to
influence household behaviour. J Clean Prod 2009;17:455–62.
