
Benjamin Schaule
Disaggregation of Smart Meter Data into Specific Load Components
Master Thesis
PSL 1727
EEH – Power Systems Laboratory

ETH Zurich
Examiner: Prof. Dr. Gabriela Hug

Supervisors: M.Sc. Thierry Zufferey, Dr. Stephan Koch (Adaptricity)
Zurich, January 31, 2018

Abstract
In this master’s thesis, methods were developed for disaggregating and modeling heat pumps and boilers from coarse smart meter active power measurements. The disaggregated time series were matched to appropriate models that best describe the behavior of the individual appliances. For this purpose, a physical, dynamic heat pump model and a parameter estimation technique were developed. The disaggregation methods are designed for two different resolution levels, i.e., in the range of one minute and in the range of 15 minutes per data point. Unlike existing disaggregation methods, which focus on the shape of measurement signals in higher resolution, these disaggregation methods combine information from the shapes of signals with model-based Bayesian and moving horizon estimation approaches. The disaggregation method produces good results in one-minute resolution, and even for data with 15-minute resolution. Similarly, a disaggregation method for electric boilers was developed. The disaggregated time series then result in a statistical hot water consumption pattern. Finally, a method was designed to add a disturbance pattern to the dynamic model of the boiler or the heat pump. This allows forward simulation of a dynamic model of both device types that not only has the same behavior on average, but also shares the same noise characteristics.
Acknowledgements
This thesis was carried out as a cooperation with Adaptricity and the Power
Systems Laboratory at ETH Zurich. I am grateful to Prof. Dr. Gabriela Hug
for making this possible and to Dr. Stephan Koch and Thierry Zufferey for
their supervision. I would like to thank them for stimulating discussions and
helpful feedback.
Contents
List of Acronyms
List of Symbols
1 Introduction
2 Heat Pump Modeling and Parameter Estimation
2.1 Heat Pump Model
2.1.1 Heat Pump Behavior
2.1.2 Physical Model
2.2 Estimation of Heat Pump Parameters
2.2.1 Definition of Optimization Problem
2.3 Illustration and Analysis
3 Heat Pump Disaggregation
3.1 Structure of Disaggregation and Estimation
3.1.1 Summary of Approach
3.1.2 Overview of Disaggregation Methods
3.2 Method A: High Resolution Data
3.2.1 Initial Guess
3.2.2 Disaggregation Process
3.2.3 Probability Components and Constraints
3.2.4 Calculation of Disaggregated Time Series
3.2.5 Interpretation and Examples
3.3 Moving Horizon Method
3.3.1 Notation
3.3.2 Theoretical Background: Moving Horizon Estimation
3.3.3 Initial Guess
3.3.4 Disaggregation Process
3.3.5 Analogy to Moving Horizon Estimation
3.4 Evaluation
3.4.1 Motivation for Choice of Evaluation
3.4.2 Examples of Specific Houses
3.4.3 Overall Disaggregation
3.5 Synthesis of Heat Pump Time Series
3.6 Implementation and Further Improvement
4 Boiler Modeling and Disaggregation
4.1 Boiler Behavior
4.2 Disaggregation Algorithm
4.2.1 Compensation Period
4.2.2 Independent Period
4.3 Evaluation of Boiler Disaggregation
4.4 Dynamic Boiler Model
5 Combined Disaggregation of Boiler and Heat Pump
6 Conclusion and Outlook
Bibliography
List of Acronyms
ASHP Air Source Heat Pump

BESS Battery Electric Storage System
COP Coefficient of Performance
DSO Distribution System Operator
GSHP Ground Source Heat Pump
HHPS House-Heat Pump System
KDE Kernel Density Estimation
MHE Moving Horizon Estimation
NC Non-Controllable
NILM Non-Intrusive Load Monitoring
SOC State of Charge
TCL Thermostatically Controlled Load
List of Symbols
A Surface area of body

a Rate at which the house temperature approaches Tamb
b Effect of the heat pump
c Capacity
c Solar gain coefficient
∆Pi Size of the ith jump
fOpt Objective function
h Heat transfer coefficient
I Set of upward jumps
J Jump
K Set of downward jumps
m Mass
ω Noise parameter
Ψ Objective value
P Power
Pboiler Boiler power
Pboiler,r Rated boiler power
Pjump,min Minimum threshold for jump to be detected as jump
PHP Heat pump power
PHP,r Rated heat pump power
PHP,disagg Disaggregated heat pump time series
PHP,meas Measured heat pump power
PMain,meas Measured time series of power consumption / generation of
the house
PNC,disagg Disaggregated time series of non-controllable loads
Psolar Solar irradiance
PHP,Son Heat pump power when switching on
PHP,Soff Heat pump power when switching off
p̂On,jump Probability density function from kernel density estimation
for upward jumps occurring in time series PMain,meas
p̂Off,jump Probability density function from kernel density estimation
for downward jumps occurring in time series PMain,meas
pOn,jump Probability density function from kernel density estimation

for upward jumps for heat pump switches
pOff,jump Probability density function from kernel density estimation
for downward jumps for heat pump switches
pOn,Temp Probability density function from kernel density estimation
for effective switching temperature (switch on)
pOff,Temp Probability density function from kernel density estimation
for effective switching temperature (switch off)
pBL Probability density function for the relationship between the
base load before and after the switching process
Qbody Heat contained in body
QHP Heat added to house by heat pump
r Radiation
Son/off Sets of segments where heat pump is on or off
T Temperature
tnext,on Next known time the pump switches on
Tamb Ambient Temperature
$T_{\mathrm{th}}^{\mathrm{low}}$ Temperature threshold of heat pump for switching back on
$T_{\mathrm{th}}^{\mathrm{high}}$ Temperature threshold of heat pump for switching back off
TiOn,fin Temperature at the end of the ith on-segment based on a
noiseless heat pump model
TiOff,fin Temperature at the end of the ith off-segment based on a
noiseless heat pump model
u Binary input: heat pump on or off
v Disturbance term
w Measurement noise
xt Model based HHPS temperature at time t
x̂t Measured HHPS temperature at time t
xitfin House temperature at the end of the ith off segment
xError Difference between HHPS model temperature at time of
switch and threshold temperature
Chapter 1
Introduction
In recent years, distribution grids have become significantly more dynamic

due to technological advancements and reduced costs of many technologies
in the energy sector. Some examples are [1]:
Generation: Costs for photovoltaics are continuing to decline and the
technology is becoming increasingly prevalent in large scale plants, as well
as in distributed generation on smaller scales (e.g., rooftop solar). In many
countries, there is political pressure to reduce dependence on traditional
power plants, such as coal and nuclear power, in favor of renewable sources
[2].
Storage: Battery electric storage systems (BESS) can be used within
the grid as a buffer for inequalities in supply and demand of electricity.
On a system-wide scale, BESS can be used for frequency control. Locally,
BESS can decrease effects of local demand or supply spikes that can lead to
congestion, overloading or voltage problems within a distribution grid. As
with photovoltaics, the costs for BESS are decreasing rapidly [3], [4].
Monitoring and Control: More and more data is becoming avail-
able at lower voltage levels in the grid because of the deployment of smart
meters and other measurement technologies throughout the power system.
As an increasing amount of the data becomes available in (near) real time,
actuators can apply changes to the grid that can alleviate issues such as
overloading or voltage quality [5].
These developments have a variety of effects: While there are more ele-
ments in the grid that can cause issues such as overloading, there is also an
increasing number of ways to solve them. Stress on the grid can be reduced
by using the flexibility of loads, commonly referred to as Demand Response
(DR) or demand side response. The basic idea of DR is to adjust the specific
time when certain loads consume or produce power. By increasing the load
in time of excess generation or by decreasing the load when consumption is
too large, voltage and current spikes and sags can be reduced. As a result,
costly cabling replacements can be avoided in some cases.
Electric thermal loads such as heat pumps or boilers are particularly well
suited for DR schemes for multiple reasons:
• They are present in many residential and industrial buildings
• They have high power consumption when active
• Their time of active operation can be adjusted without significant impact on consumer comfort and needs
For residential use, consumer comfort can be maintained even when heat-
ing and cooling devices adjust their switching times to accommodate grid
needs. For example, if a user would like his house to be kept at a temperature
of 21◦ C, a heat pump would switch on when the temperature in the house
reaches a low threshold (e.g., 20.5◦ C) and off again when reaching a high
threshold (e.g., 22◦ C) when using a typical “bang bang” control scheme. For
a DR scheme, the heat pump could be blocked from switching on even when
the temperature falls below the lower threshold. As long as this state is not
maintained for an extended period of time, the effect on the consumers is
small. The same applies to industrial thermal loads, e.g., industrial ovens or
cooling devices. If orchestrated well, such a scheme can reduce consumption
peaks and therefore reduce the need for grid upgrades. DR schemes can
also be applied for reducing energy costs by avoiding consumption when the
electricity price is high.
For Distribution System Operators (DSOs), a key question is whether
or not DR schemes are able to alleviate problems in the grid and thereby
avoid costlier methods. In order to evaluate the effectiveness of demand
response in a specific grid area, the behavior of the grid needs to be sim-
ulated, using measurement data from the grid. However, loads of interest
are often not measured individually, but contained in an aggregate mea-
surement at a household level. Therefore, one must identify which parts
of the measured load belong to the controllable devices and which belong
to the uncontrollable devices, which is defined as the disaggregation of the
main load. Simultaneously, one can obtain a model describing the behav-
ior of the controllable devices in their uncontrolled behavior. With these
two components, the grid behavior can be simulated under various control
schemes.
Disaggregation is only possible when adequate data is available. With smart meters being rolled out in many countries, the amount of data available to grid operators is increasing rapidly. However, in many cases the data is only available at 15-minute or hourly resolution [6], [7]. There has been a large amount of research in the field of disaggregation, often called Non-Intrusive Load Monitoring (NILM). Most of this research, however, uses data at significantly higher resolutions and often relies on device signatures in the range of milliseconds [8]. Therefore, these methods are not suited for coarse data in the range of minutes. The low resolution of the data presents a particular challenge: Heat pumps are typically on for durations of 20 to 50 minutes¹, so at 15-minute resolution an on-period spans only a few data points and can be difficult to detect based solely on the signal shape.
The goal of this thesis is to develop methods to disaggregate the con-
sumption of heat pumps and boilers from smart meter measurements and
find an appropriate model that describes the behavior of the individual heat
pumps and water heaters.
This report is structured as follows: First, the heat pump model and pa-
rameter estimation will be introduced. Second, two disaggregation methods
for heat pumps will be detailed, using data in one minute and 15 minute res-
olution, followed by a disaggregation method and dynamic model for boilers
and a combined disaggregation of heat pumps and boilers. Finally, a method
for synthesizing realistic heat pump and boiler time series will be described
and a conclusion and outlook will be provided.
¹ Based on the data in this thesis; durations between 7 and 180 minutes have been observed.
Chapter 2
Heat Pump Modeling and Parameter Estimation
2.1 Heat Pump Model

2.1.1 Heat Pump Behavior
As a first step, measurement data of the heat pump was analyzed, in order
to find a plausible model to describe heat pump behavior.
The work in this thesis was based on a set of active power measurements from 40 houses in one-minute resolution. In addition, the active power of 20 heat pumps and 40 boilers was measured directly.
Most heat pumps show regular “on-off” behavior, meaning that they
either draw power in the range of their rated power or they consume a small
stand-by amount of power. Figure 2.1 shows three examples of heat pump
behavior: In the subplot on the left side, the heat pump shows a brief spike at
random times while on, which occurs similarly with some other heat pumps
as well. The middle subplot shows the most typical behavior, whereas the
plot on the right hand side shows a heat pump that occasionally draws twice
as much power. The large increase in power occurs only at few times in the
data set. For this thesis, behavior similar to the first two subplots will be
assumed. Trying to consider behavior as shown in the right subplot would
require more similar data than is currently available to the author.
2.1.2 Physical Model

The goal of this subsection is to identify a model that describes the relevant variables of a system consisting of a house and a heat pump, i.e., a House-Heat Pump System (HHPS). Heat pumps and air conditioning devices are thermostatically controlled loads (TCLs) and are usually controlled using so-called “bang-bang” control based on temperature [9]. Bang-bang control is a control strategy where the actuator or input is always at one of its limits, e.g., on-off or 0 and 100%. Therefore, a model describing the behavior of the house temperature as a dynamic state was chosen.

Figure 2.1: Heat pump behavior examples from three houses (three panels; x-axis: time in minutes, y-axis: power PHP in kW)
According to Newton’s law of cooling, the rate of change of the thermal energy $Q_{\mathrm{body}}$ of a body is proportional to the temperature difference between the body and its surroundings. In mathematical terms,

$$\frac{dQ_{\mathrm{body}}}{dt} = hA\,\bigl(T_{\mathrm{amb}}(t) - T_{\mathrm{body}}(t)\bigr) = hA\,\Delta T(t), \tag{2.1}$$

where $Q_{\mathrm{body}}$ is the thermal energy stored in the body, $h$ is the heat transfer coefficient, $A$ is the heat transfer surface area, $T_{\mathrm{body}}(t)$ and $T_{\mathrm{amb}}(t)$ are the body and ambient temperatures, respectively, and $\Delta T(t)$ is the difference between these temperatures. Furthermore, the relationship between change in heat and temperature in the body is given by the specific heat capacity $c$ and the mass $m$ as

$$\Delta Q_{\mathrm{body}} = c\,m\,\Delta T_{\mathrm{body}}, \tag{2.2}$$

where $\Delta Q_{\mathrm{body}}$ is the heat transferred to the body and $\Delta T_{\mathrm{body}}$ is the resulting change in temperature of the body.
Figure 2.2: Working principle of a heat pump, adapted from [10], simplified
Combining Equations (2.1) and (2.2), one can describe the rate of change of the body temperature as a function of the difference between the ambient and body temperatures as

$$m c\,\frac{dT_{\mathrm{body}}}{dt} = hA\,\bigl(T_{\mathrm{amb}}(t) - T_{\mathrm{body}}(t)\bigr) = hA\,\Delta T(t), \tag{2.3}$$

or equivalently

$$\frac{dT_{\mathrm{body}}}{dt} = \frac{hA}{mc}\,\bigl(T_{\mathrm{amb}}(t) - T_{\mathrm{body}}(t)\bigr) = \frac{hA}{mc}\,\Delta T(t) = \hat{a}\,\Delta T(t), \tag{2.4}$$

where $\hat{a} = \frac{hA}{mc}$ is a decay parameter describing the cooling effect that occurs when the outside air is colder than the inside air. This is the main component of the temperature decay.
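As a short worked step, and only under the simplifying assumption of a constant ambient temperature, Equation (2.4) with $\Delta T = T_{\mathrm{amb}} - T_{\mathrm{body}}$ as in Equation (2.1) can be solved in closed form. The sampling interval $\Delta t$ below is a symbol introduced for this illustration:

```latex
\begin{align}
\frac{dT_{\mathrm{body}}}{dt} &= \hat{a}\,\bigl(T_{\mathrm{amb}} - T_{\mathrm{body}}(t)\bigr), \\
T_{\mathrm{body}}(t) &= T_{\mathrm{amb}} + \bigl(T_{\mathrm{body}}(0) - T_{\mathrm{amb}}\bigr)\,e^{-\hat{a} t}, \\
T_{\mathrm{body}}(t + \Delta t) - T_{\mathrm{amb}} &= e^{-\hat{a}\,\Delta t}\,\bigl(T_{\mathrm{body}}(t) - T_{\mathrm{amb}}\bigr).
\end{align}
```

The last line suggests why a discrete-time decay factor of the form $e^{-\hat{a}\,\Delta t}$, which lies strictly between 0 and 1 for $\hat{a} > 0$, is a natural counterpart to $\hat{a}$ when the data is sampled at regular intervals.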
2.1.2.1 Heat Pump Physics

Unlike resistive electric heaters which convert electric energy to thermal
energy via resistive losses, heat pumps work as heat exchangers and therefore
can add more output heat to a space per unit of electric power than resistive
heaters. The basic principle is depicted in Figure 2.2.
A closed piping system contains a refrigerant, which absorbs heat while evaporating in the evaporator (3) in Figure 2.2. The compressor (4) then increases the pressure of the vapor, which subsequently condenses and releases heat (1). The cooled liquid is then returned through the expansion valve (2) to the evaporator (3), where the cycle begins again. The input power is consumed by the compressor, and the heat transferred to the heat sink is taken from the heat source. Since the electric energy added to the system is not directly the source of the heat added to the heat sink, a measure for efficiency must consider the energy extracted from the heat source and not only the electrical energy consumed by the heat pump. Therefore, one defines the Coefficient of Performance (COP) as

$$\mathrm{COP} = \frac{Q_{\mathrm{HP}}}{P_{\mathrm{HP}}}, \tag{2.5}$$

with $Q_{\mathrm{HP}}$ being the heat the heat pump adds to the house and $P_{\mathrm{HP}}$ being the electric power consumed by the heat pump.
The location of the heat source differs by heat pump type: Two examples
of heat pumps are Ground Source Heat Pumps (GSHP) and Air Source Heat
Pumps (ASHP). The heat source for GSHPs is the ground, i.e., the piping
system is located under ground, such that it can extract heat from the
ground. ASHPs extract heat from the ambient air. The amount of energy
that can be extracted depends on the temperature difference between the
heat source and sink, thus leading to temperature dependent COPs. For
GSHPs, the COP does not vary much since the temperature of the ground
does not change significantly. ASHPs, however, have significantly varying
COPs for changing ambient temperatures.
The heat added to the house is therefore given by

$$Q_{\mathrm{HP}} = P_{\mathrm{HP}}\,\mathrm{COP}(T), \tag{2.6}$$

where the dependence on temperature is denoted in the COP. Using the assumption from Equation (2.2) of a constant heat capacity, the temperature changes linearly with the heat $Q_{\mathrm{HP}}$. Assuming there is no temperature decay driven by ambient temperatures, i.e., $Q_{\mathrm{body}} = Q_{\mathrm{HP}}$, the model is then given by

$$\Delta T_{\mathrm{body}} = \frac{Q_{\mathrm{body}}}{cm} = \frac{P_{\mathrm{HP}}\,\mathrm{COP}}{cm}. \tag{2.7}$$

As a result, the differential equation describing the rate of change in temperature is obtained by combining Equations (2.7) and (2.4) as

$$\frac{dT_{\mathrm{body}}}{dt} = \frac{hA\,\mathrm{COP}}{c^2 m^2}\,P_{\mathrm{HP}} = \hat{b}\,P_{\mathrm{HP}}. \tag{2.8}$$
In addition to the ambient temperature, solar irradiance influences the house temperature, an effect called “solar gains”. The heating of the body due to solar gains can be calculated in the same manner (ignoring the COP) as

$$\frac{dT_{\mathrm{body}}}{dt} = \frac{hA}{c^2 m^2}\,P_{\mathrm{solar}} = \hat{c}\,P_{\mathrm{solar}}, \tag{2.9}$$

where $P_{\mathrm{solar}}$ is the solar irradiance.

Using linearity and the principle of superposition, the differential equation describing the entire system is given as

$$\frac{dT_{\mathrm{body}}}{dt} = \hat{a}\,\Delta T(t) + \hat{b}\,P_{\mathrm{HP}} + \hat{c}\,P_{\mathrm{solar}} \tag{2.10}$$

with an initial condition for $T_{\mathrm{body}}$ [10].
The work conducted in this thesis is performed on data in regular 1-
or 15-minute intervals (mean power), making a discrete description of the
system desirable. For the derivation of a discrete time state space model of
this shape, the reader is referred to resources on control theory such as [11].
In discrete time, the system is given as

$$x_{t+1} = a\,(x_t - T_{\mathrm{amb},t}) + T_{\mathrm{amb},t} + b\,u_t + c\,P_{\mathrm{solar},t} + \omega_t, \tag{2.11}$$

with an initial state

$$x_0 = x_{\mathrm{init}}, \tag{2.12}$$

where $x_t$ denotes the state, namely the HHPS temperature at time $t$, and $a$ denotes a decay parameter describing the rate at which the HHPS temperature approaches the ambient temperature $T_{\mathrm{amb}}$. The factor $b$ represents the effect of the heat pump, $u$ is the binary input variable that takes the value one when the heat pump is on and zero when it is off, and $c$ is the coefficient for solar gains due to the solar irradiance $P_{\mathrm{solar}}$. Finally, $\omega$ is a disturbance term.
The solution of the model at time $t$ when ignoring the noise $\omega$ is given as

$$x_t = a^t x_0 + \sum_{m=0}^{t-1} a^{t-1-m}\,\bigl(b\,u_m + c\,P_{\mathrm{solar},m} + (1-a)\,T_{\mathrm{amb},m}\bigr), \tag{2.13}$$

when starting at time zero. This equation can be obtained by expanding $x_1 = f(x_0)$, $x_2 = f(f(x_0))$, etc. and then generalizing. For clarity, note that $a^t$ refers to the $t$-th power of $a$.
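The relationship between the one-step update of Equation (2.11) and the expanded sum of Equation (2.13) can be checked numerically. The following is a minimal sketch; the function names and all numeric values are assumptions chosen for illustration and do not come from the measurement data.

```python
def hhps_step(x, t_amb, u, p_solar, a, b, c):
    """Noise-free one-step update of Equation (2.11)."""
    return a * (x - t_amb) + t_amb + b * u + c * p_solar


def hhps_closed_form(x0, t_amb, u, p_solar, a, b, c):
    """Temperature after len(u) steps using the expanded sum of Equation (2.13)."""
    t = len(u)
    total = a ** t * x0
    for m in range(t):
        total += a ** (t - 1 - m) * (b * u[m] + c * p_solar[m] + (1 - a) * t_amb[m])
    return total


# Assumed illustrative inputs: five one-minute steps.
a, b, c = 0.995, 0.05, 0.05
t_amb = [2.0, 2.0, 1.5, 1.5, 1.0]
u = [0, 1, 1, 0, 1]
p_solar = [0.0, 0.1, 0.2, 0.2, 0.0]

# Iterate the one-step model ...
x = 20.0
for m in range(len(u)):
    x = hhps_step(x, t_amb[m], u[m], p_solar[m], a, b, c)

# ... and compare with the expanded sum.
x_closed = hhps_closed_form(20.0, t_amb, u, p_solar, a, b, c)
```

Both routes should agree up to floating-point rounding, which is a quick sanity check before the expanded form is used inside an objective function.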
Furthermore, one assumes that the HHPS temperature is controlled by a hysteresis controller, such that $x$ remains within a specified temperature band given by an upper limit $T_{\mathrm{th}}^{\mathrm{high}}$ and a lower limit $T_{\mathrm{th}}^{\mathrm{low}}$ with

$$T_{\mathrm{th}}^{\mathrm{high}} > T_{\mathrm{th}}^{\mathrm{low}}. \tag{2.14}$$
The switching algorithm can be described as follows:

$$u_t = \begin{cases} 1 & \text{if } x_t \le T_{\mathrm{th}}^{\mathrm{low}} \\ 0 & \text{if } x_t \ge T_{\mathrm{th}}^{\mathrm{high}} \\ u_{t-1} & \text{else,} \end{cases} \tag{2.15}$$

thus heating whenever the temperature drops below the lower threshold and staying on until the upper threshold is reached. Analogously, the heat pump switches off when the temperature $x$ reaches the upper threshold and stays off until the lower threshold $T_{\mathrm{th}}^{\mathrm{low}}$ is reached.

Figure 2.3: Synthesized heat pump behavior (upper panel: power PHP in kW; lower panel: temperatures Tamb and x in °C; x-axis: time from 06:00 to 21:00)
An example of heat pump behavior according to the model is shown
in Figure 2.3. The heat pump, visualized above, switches on and off more
frequently when the ambient temperature is cold and less frequently during
the day when it is warmer.
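The interplay of the temperature model (2.11) and the hysteresis rule (2.15) can be sketched as a short noise-free simulation. This is an illustration under assumed values (constant ambient temperature, no solar irradiance, parameters picked so that the heat pump can reach the upper threshold), not the code used to produce Figure 2.3.

```python
def simulate_hhps(x0, u0, t_amb, p_solar, a, b, c, t_low, t_high):
    """Noise-free simulation of Equations (2.11) and (2.15).

    Returns the temperature and the on/off input at every time step.
    """
    x, u = x0, u0
    temps, inputs = [], []
    for amb, sol in zip(t_amb, p_solar):
        # Hysteresis controller, Equation (2.15); u is kept inside the band
        if x <= t_low:
            u = 1
        elif x >= t_high:
            u = 0
        temps.append(x)
        inputs.append(u)
        # Temperature update, Equation (2.11) with omega = 0
        x = a * (x - amb) + amb + b * u + c * sol
    return temps, inputs


# Assumed illustrative setup: 600 one-minute steps, 5 °C outside, no sun.
n = 600
temps, inputs = simulate_hhps(
    x0=20.0, u0=0, t_amb=[5.0] * n, p_solar=[0.0] * n,
    a=0.995, b=0.12, c=0.05, t_low=19.0, t_high=21.0,
)
# Count how often the heat pump switches from off to on.
switch_ons = sum(1 for t in range(1, n) if inputs[t] == 1 and inputs[t - 1] == 0)
```

Because the controller only acts at discrete steps, the simulated temperature slightly overshoots each threshold before the switch takes effect; lowering the assumed ambient temperature shortens the off-periods, in line with the behavior described above.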
2.1.2.2 Definitions and Terminology
For the sake of clarity, some simplifications and terminology are introduced in the following.
According to the data used for the thesis, some heat pumps do not
instantaneously switch from standby power to their rated power, but have a
period of switching on and off that can last up to four minutes. In order to
avoid ambiguities while integrating ramp up and ramp down periods into a
model that assumes instantaneous switches, the heat pump can be defined
as being off (u = 0) when its power is less than 80% of the rated power and
on when above 80% (u = 1). As the only exception, brief downward spikes
while the heat pump is otherwise on (cf. Figure 2.1) are also counted as the
heat pump being on.
A continuous time period when a heat pump is switched on is called an
on-segment and denoted as SOn . Off-segments SOff are defined analo-
gously. The actual HHPS temperature is defined as x̂t , for which there is no
direct measurement.
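The 80% rule and the notion of on- and off-segments can be sketched as follows. The function names and the sample power values are hypothetical, and the exception for brief downward spikes during operation is deliberately not handled in this sketch.

```python
def binarize(p_hp, p_rated, threshold=0.8):
    """Return u_t = 1 where the power reaches 80% of the rated power, else 0."""
    return [1 if p >= threshold * p_rated else 0 for p in p_hp]


def segments(u):
    """Split a binary series into (state, start, end) runs; end is exclusive."""
    runs = []
    start = 0
    for t in range(1, len(u) + 1):
        # Close the current run at the end of the series or when the state changes
        if t == len(u) or u[t] != u[start]:
            runs.append((u[start], start, t))
            start = t
    return runs


# Assumed example: a heat pump rated at 2.0 kW with ramp-up and ramp-down samples.
p_hp = [0.1, 0.1, 1.2, 2.0, 2.1, 2.0, 0.9, 0.1]
u = binarize(p_hp, p_rated=2.0)
runs = segments(u)
```

In this example the ramp samples at 1.2 kW and 0.9 kW fall below 80% of the rated power and are therefore counted as off, so the series splits into an off-segment, an on-segment and another off-segment.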
2.2 Estimation of Heat Pump Parameters

This section introduces and defines an optimization problem in order to find
optimal parameters a∗ , b∗ and c∗ for the HHPS system (cf. Equation (2.11))
when the switching behavior of a heat pump is known.
2.2.1 Definition of Optimization Problem

2.2.1.1 Input Variables
The known input variables are given by the ambient temperature Tamb and
the solar irradiance Psolar , which can be obtained from weather services.
Furthermore, one knows the switching behavior, i.e., ut for all t, which
can be obtained directly from the measurement PHP,meas . Note that the
direct measurement of the heat pump’s active power is not used in the
disaggregation, but only serves the purpose of validation and evaluation in
this thesis.
2.2.1.2 Assumptions
The basic assumption is that the heat pump switches according to the temperature-based hysteresis control as defined in Equation (2.15), based on a local measurement of the HHPS temperature. The switching thresholds $T_{\mathrm{th}}^{\mathrm{low}}$ and $T_{\mathrm{th}}^{\mathrm{high}}$ are assumed to be known and constant. For practical purposes, plausible thresholds were chosen, e.g., $T_{\mathrm{th}}^{\mathrm{low}} = 19\,^{\circ}\mathrm{C}$ and $T_{\mathrm{th}}^{\mathrm{high}} = 21\,^{\circ}\mathrm{C}$. The choice of the switching thresholds as well as a justification for fixed thresholds will be discussed in Section 2.3.
2.2.1.3 Definitions
The entire time series of the active heat pump power can be divided into on- and off-segments. Let $i$ denote the $i$th off-segment and $k$ the $k$th on-segment. Furthermore, let $x_{t_{\mathrm{fin}},i}$ denote the HHPS temperature at the end of the $i$th off-segment, i.e., when the heat pump switches on again, as calculated by the model (2.11). Analogously, $x_{t_{\mathrm{fin}},k}$ is the HHPS model temperature at the end of the $k$th on-segment. As explained earlier, the real HHPS temperature at the switching times is assumed to be the respective threshold temperature, i.e., $\hat{x}_{t_{\mathrm{fin}},i} = T_{\mathrm{th}}^{\mathrm{low}}$ and $\hat{x}_{t_{\mathrm{fin}},k} = T_{\mathrm{th}}^{\mathrm{high}}$. The model temperatures $x_{t_{\mathrm{fin}},i}$ and $x_{t_{\mathrm{fin}},k}$ may differ from the threshold temperatures $T_{\mathrm{th}}^{\mathrm{low}}$ and $T_{\mathrm{th}}^{\mathrm{high}}$ due to incorrect assumptions, process noise or measurement noise. These differences are defined as error terms $\Delta x^{i}_{\mathrm{Error}}$ and $\Delta x^{k}_{\mathrm{Error}}$ with

$$\Delta x^{k}_{\mathrm{Error}} = x_{t_{\mathrm{fin}},k} - T_{\mathrm{th}}^{\mathrm{high}} \tag{2.16}$$

and

$$\Delta x^{i}_{\mathrm{Error}} = x_{t_{\mathrm{fin}},i} - T_{\mathrm{th}}^{\mathrm{low}}. \tag{2.17}$$

Figure 2.4: Heat pump and temperature model behavior with error (upper panel: measured power PHP,meas in kW; lower panel: model temperature x in °C with the two thresholds and the error terms ∆xError; x-axis: time from 20:00 to 23:00)
This is visualized in Figure 2.4, which shows a measured heat pump consumption profile above and the temperature development according to a set of HHPS model parameters $a$, $b$, and $c$, assuming that the temperature $x_t$ is at the threshold temperatures $T_{\mathrm{th}}^{\mathrm{low}}$ and $T_{\mathrm{th}}^{\mathrm{high}}$ at all switching times. The green background corresponds to the times when the heat pump is switched on. As shown, the heat pump switches before the model temperature reaches the threshold in some cases and after passing the threshold in others. If all assumptions were accurate and there were no noise, the resulting error terms $\Delta x_{\mathrm{Error}}$ would be zero for all segments. Therefore, the goal should be to find HHPS parameters $a$, $b$ and $c$ that minimize the size of the error terms.
2.2.1.4 Optimization Problem

Using a least square error approach, the optimization problem to obtain the HHPS parameters can be defined as follows: The optimization vector consists of the HHPS parameters and is defined as

$$\Psi = (a, b, c). \tag{2.18}$$

The objective function is given by the sum of the squares of all error terms as

$$f_{\mathrm{Opt}}(\Psi) = \sum_{i=1}^{|S_{\mathrm{Off}}|} \bigl(\Delta x^{i}_{\mathrm{Error}}\bigr)^2 + \sum_{k=1}^{|S_{\mathrm{On}}|} \bigl(\Delta x^{k}_{\mathrm{Error}}\bigr)^2, \tag{2.19}$$

corresponding to minimizing the squares of the blue error terms $\Delta x_{\mathrm{Error}}$ in Figure 2.4.
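The objective of Equation (2.19) can be sketched as a small function that restarts the noise-free model at the start of every segment and compares the model temperature at the end of the segment with the threshold at which the switch is assumed to occur. The sketch assumes, following the hysteresis rule (2.15), that an off-segment starts at the upper threshold and an on-segment at the lower threshold; the segment boundaries and parameter values are invented for illustration.

```python
def objective(params, runs, t_amb, p_solar, t_low=19.0, t_high=21.0):
    """Sum of squared error terms over all segments, as in Equation (2.19).

    runs holds (state, start, end) tuples with state 1 = on, 0 = off.
    """
    a, b, c = params
    total = 0.0
    for state, start, end in runs:
        # Assumed start and target thresholds for on- and off-segments
        x = t_low if state == 1 else t_high
        target = t_high if state == 1 else t_low
        for m in range(start, end):
            # Noise-free update of Equation (2.11); u equals the segment state
            x = a * (x - t_amb[m]) + t_amb[m] + b * state + c * p_solar[m]
        total += (x - target) ** 2
    return total


# Assumed example: off for 27 minutes, on for 45 minutes, off for 27 minutes.
runs = [(0, 0, 27), (1, 27, 72), (0, 72, 99)]
t_amb = [5.0] * 99
p_solar = [0.0] * 99

f_plausible = objective((0.995, 0.12, 0.05), runs, t_amb, p_solar)
f_flat = objective((1.0, 0.0, 0.0), runs, t_amb, p_solar)  # temperature never moves
```

With `a = 1` and `b = c = 0` the model temperature stays at its starting threshold, so every segment contributes the squared width of the temperature band, whereas parameters that roughly match the segment lengths give a much smaller objective value.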
The error terms depend on all elements in the HHPS model, and therefore also on the parameters $a$, $b$ and $c$ (cf. Equation (2.11)). In order to provide a better understanding of the shape of the objective function, the individual terms will be written out in more detail in the following. When inserting Equations (2.16) and (2.17) into Equation (2.19), the objective becomes

$$f_{\mathrm{Opt}}(\Psi) = \sum_{i=1}^{|S_{\mathrm{Off}}|} \bigl(x_{t_{\mathrm{fin}},i} - T_{\mathrm{th}}^{\mathrm{low}}\bigr)^2 + \sum_{k=1}^{|S_{\mathrm{On}}|} \bigl(x_{t_{\mathrm{fin}},k} - T_{\mathrm{th}}^{\mathrm{high}}\bigr)^2. \tag{2.20}$$
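The switching temperatures $x_{t_{\mathrm{fin}},i}$ and $x_{t_{\mathrm{fin}},k}$ can be expressed explicitly using Equation (2.13), resulting in

$$x_{t_{\mathrm{fin}},i} = a^{t_{\mathrm{fin},i} - t_{\mathrm{start},i}}\,T_{\mathrm{th}}^{\mathrm{high}} + \sum_{m=t_{\mathrm{start},i}}^{t_{\mathrm{fin},i}-1} a^{t_{\mathrm{fin},i}-1-m}\,\bigl(b\,u_m + c\,P_{\mathrm{solar},m} + (1-a)\,T_{\mathrm{amb},m}\bigr) \tag{2.21}$$

for off-segments, where $u_m = 0$, and

$$x_{t_{\mathrm{fin}},k} = a^{t_{\mathrm{fin},k} - t_{\mathrm{start},k}}\,T_{\mathrm{th}}^{\mathrm{low}} + \sum_{m=t_{\mathrm{start},k}}^{t_{\mathrm{fin},k}-1} a^{t_{\mathrm{fin},k}-1-m}\,\bigl(b\,u_m + c\,P_{\mathrm{solar},m} + (1-a)\,T_{\mathrm{amb},m}\bigr) \tag{2.22}$$

for on-segments, where $u_m = 1$. Here, $t_{\mathrm{start},i}$ and $t_{\mathrm{start},k}$ denote the beginning of the respective off- and on-segment. The starting temperature $x_0$ is $T_{\mathrm{th}}^{\mathrm{high}}$ for off-segments, which begin when the heat pump switches off at the upper threshold, and $T_{\mathrm{th}}^{\mathrm{low}}$ for on-segments. The multiplication of $b$ and $c$ with different powers of $a$ makes the objective nonlinear.

In order to formalize the constraints, $h_1$ and $h_2$ are defined such that $h_1(\Psi) = 0$ and $h_2(\Psi) = 0$ are equivalent to Equations (2.21) and (2.22):

$$h_1(\Psi) = -x_{t_{\mathrm{fin}},i} + a^{t_{\mathrm{fin},i} - t_{\mathrm{start},i}}\,T_{\mathrm{th}}^{\mathrm{high}} + \sum_{m=t_{\mathrm{start},i}}^{t_{\mathrm{fin},i}-1} a^{t_{\mathrm{fin},i}-1-m}\,\bigl(b\,u_m + c\,P_{\mathrm{solar},m} + (1-a)\,T_{\mathrm{amb},m}\bigr) \tag{2.23}$$

and

$$h_2(\Psi) = -x_{t_{\mathrm{fin}},k} + a^{t_{\mathrm{fin},k} - t_{\mathrm{start},k}}\,T_{\mathrm{th}}^{\mathrm{low}} + \sum_{m=t_{\mathrm{start},k}}^{t_{\mathrm{fin},k}-1} a^{t_{\mathrm{fin},k}-1-m}\,\bigl(b\,u_m + c\,P_{\mathrm{solar},m} + (1-a)\,T_{\mathrm{amb},m}\bigr). \tag{2.24}$$

Furthermore, one defines $h(\Psi)$ as

$$h(\Psi) = \bigl(h_1(\Psi), h_2(\Psi)\bigr)^{T}. \tag{2.25}$$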
With the terms as defined in Equations (2.18) to (2.25), the optimization problem is given by

$$\begin{aligned} \min_{a,b,c} \quad & f_{\mathrm{Opt}}(\Psi) \\ \text{s.t.} \quad & h(\Psi) = 0 \\ & a, b, c > 0 \\ & a < 1 \end{aligned} \tag{2.26}$$

and $a^{*}$, $b^{*}$, $c^{*}$ are the optimal parameters that minimize the square error as defined in Equation (2.19).
The constraints are explained as follows: b and c both need to be pos-
itive for the heat pump and solar irradiance to have a positive impact on
the change in temperature (cf. Equation (2.11)). Furthermore, a must be
less than one for the HHPS temperature x to decay towards the ambient
temperature Tamb . Negative values of a are excluded because they would
cause the HHPS temperature to oscillate between being larger and smaller
than Tamb in consecutive steps.
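As a hedged illustration of how such parameters could be estimated, the sketch below uses a technique that is swapped in for simplicity and is not the solver used in this thesis: for a fixed $a$, the segment-end temperature is linear in $b$ and $c$, so these two follow from a 2x2 least-squares problem, and $a$ is then chosen from a coarse grid. The synthetic data, the daily temperature and irradiance curves, the "true" parameters and all function names are assumptions; off-segments are assumed to start at the upper threshold, and the positivity constraints on $b$ and $c$ are not enforced.

```python
import math


def simulate(n, a, b, c, t_low, t_high):
    """Noise-free synthetic data from Equations (2.11) and (2.15)."""
    x, u = t_high, 0
    inputs, t_amb, p_solar = [], [], []
    for t in range(n):
        amb = 5.0 + 3.0 * math.sin(2 * math.pi * t / 1440)  # assumed daily cycle
        sol = max(0.0, math.sin(2 * math.pi * t / 1440))    # assumed, arbitrary units
        u = 1 if x <= t_low else 0 if x >= t_high else u
        inputs.append(u)
        t_amb.append(amb)
        p_solar.append(sol)
        x = a * (x - amb) + amb + b * u + c * sol
    return inputs, t_amb, p_solar


def runs_of(u):
    """Split a binary series into (state, start, end) runs; end is exclusive."""
    runs, start = [], 0
    for t in range(1, len(u) + 1):
        if t == len(u) or u[t] != u[start]:
            runs.append((u[start], start, t))
            start = t
    return runs


def residual_terms(a, run, t_amb, p_solar, t_low, t_high):
    """Write a segment's error term as r + beta * b + gamma * c for fixed a."""
    state, start, end = run
    x_start, target = (t_low, t_high) if state == 1 else (t_high, t_low)
    r = a ** (end - start) * x_start - target
    beta = gamma = 0.0
    for m in range(start, end):
        w = a ** (end - 1 - m)
        r += w * (1 - a) * t_amb[m]
        beta += w * state
        gamma += w * p_solar[m]
    return r, beta, gamma


def fit_for_fixed_a(a, runs, t_amb, p_solar, t_low, t_high):
    """Least-squares b and c for a fixed a via the 2x2 normal equations."""
    rows = [residual_terms(a, run, t_amb, p_solar, t_low, t_high) for run in runs]
    sbb = sum(be * be for _, be, _ in rows)
    sbc = sum(be * ga for _, be, ga in rows)
    scc = sum(ga * ga for _, _, ga in rows)
    sbr = sum(be * r for r, be, _ in rows)
    scr = sum(ga * r for r, _, ga in rows)
    det = sbb * scc - sbc * sbc
    b = (scr * sbc - sbr * scc) / det
    c = (sbr * sbc - scr * sbb) / det
    cost = sum((r + be * b + ga * c) ** 2 for r, be, ga in rows)
    return cost, b, c


T_LOW, T_HIGH = 19.0, 21.0
TRUE = (0.995, 0.12, 0.05)  # assumed parameters used to create the synthetic data
u, t_amb, p_solar = simulate(2880, *TRUE, T_LOW, T_HIGH)
runs = runs_of(u)[:-1]  # drop the last segment, which is cut off by the data end
grid = [k / 10000 for k in range(9900, 9996, 5)]  # candidate values of a below 1
best_cost, best_b, best_c, best_a = min(
    fit_for_fixed_a(a, runs, t_amb, p_solar, T_LOW, T_HIGH) + (a,) for a in grid
)
```

Because the simulated controller overshoots the thresholds by up to one time step, the estimates are not expected to match the assumed parameters exactly; the sketch only illustrates the structure of the problem, namely that the nonlinearity enters solely through the powers of $a$.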
2.3 Illustration and Analysis

Shape of Objective Function As stated earlier, this optimization problem is nonlinear. Additionally, the objective function is non-convex, as can be seen in Figure 2.5. This figure shows the value of the objective function for data from one measured heat pump when varying the parameter $a$, with parameters $b$ and $c$ fixed at their respective optimal values. Note that the minimum is obtained for $a$ strictly less than 1 ($a^{*} = 0.997$). A formal proof of non-convexity is beyond the scope of this report; however, a visual indication is given: a straight line connecting the points of the curve at $a = 0$ and $a = 1$ crosses the objective function, which cannot occur for a convex function.
Figure 2.6 shows evaluations of the objective function for different parameters $a$ and $b$, indicating that the objective function is well-behaved for plausible heat pump parameters. There is a unique minimum and the objective is shaped such that gradient descent methods can identify the minimum when given an appropriate initial guess. In practice, values in the range of $a \approx 0.995$, $b \approx 0.05\,^{\circ}\mathrm{C}$ and $c \approx 0.05\,\frac{^{\circ}\mathrm{C}}{\mathrm{kW}}$ worked well as initial guesses for the optimization methods on the available data set¹. The parameter $c$ has a similar effect on the objective function as $b$.

¹ For data in one minute resolution

Figure 2.5: Objective value as function of a (x-axis: a from 0 to 1; y-axis: objective value in (°C)², scale ×10⁵)
Switching Thresholds The HHPS switching thresholds $T_{\mathrm{th}}^{\mathrm{high}}$ and $T_{\mathrm{th}}^{\mathrm{low}}$ are unknown in practice, which is why fixed values are assumed, as stated previously. Including these parameters in the optimization vector is not possible, because the optimization either does not converge or converges to implausible values. Although a formal proof is out of the scope of the thesis, an informal explanation can be given as follows: Let the model parameters used in Figure 2.4 be the optimal parameters for the optimization function using threshold temperatures $T_{\mathrm{th}}^{\mathrm{high}} = 21\,^{\circ}\mathrm{C}$ and $T_{\mathrm{th}}^{\mathrm{low}} = 19\,^{\circ}\mathrm{C}$. If instead the thresholds were chosen to be $T_{\mathrm{th}}^{\mathrm{high}} = 19.1\,^{\circ}\mathrm{C}$ and $T_{\mathrm{th}}^{\mathrm{low}} = 19\,^{\circ}\mathrm{C}$, the optimal parameters would be adjusted such that the temperature would increase and decay more slowly. The absolute value of the error terms $\Delta x_{\mathrm{Error}}$ would then be in a smaller range than in the original problem. Therefore, the squares of the errors $\Delta x_{\mathrm{Error}}^2$ would also be significantly smaller than in the original problem and consequently the parameters $T_{\mathrm{th}}^{\mathrm{high}} = 19.1\,^{\circ}\mathrm{C}$ and $T_{\mathrm{th}}^{\mathrm{low}} = 19\,^{\circ}\mathrm{C}$ would be considered closer to the optimal value than the original parameters. Such a result is physically implausible and therefore assumptions must be made for the temperature thresholds.
Switching thresholds near the standard room temperature of 20◦ C were
used with the difference between thresholds being between 1◦ C and 2◦ C [12].
The specific choice of the thresholds showed little influence on the resulting
heat pump behavior for thresholds in this range. Therefore, variations in
the threshold temperatures were not considered in detail.
Coefficient of Performance The coefficient of performance (COP) of a heat
pump can be temperature dependent for ASHP, as explained in Section
2.1.2. The model introduced in Equation (2.11) implicitly has a constant
COP because b is constant. With the data used in this thesis, it was not
possible to detect any heat pumps with a temperature-varying COP. This
was analyzed by performing the optimization from Equation (2.26), though
replacing b with (b0 + b1 Tamb,t) in the objective function. This resulted in
very small or negative values for b1, indicating that the heat pump COP is
not clearly proportional to the ambient temperature. Therefore, a constant
COP is assumed for all considerations in this thesis.
Figure 2.6: Objective value as function of a and b (axes: a, b and objective
value [(°C)²])
Chapter 3
Heat Pump Disaggregation
3.1 Structure of Disaggregation and Estimation

In Chapter 2, a method was described for estimating HHPS parameters
given the direct active power measurements of the heat pump. In most
cases, direct measurements of the heat pump are not available, whereas
measurements from the whole house (“main”) are available to the grid op-
erator when there is a smart meter installed. Therefore, the goal of this
chapter is to provide a method to disaggregate the heat pump components
from the main measurement.
Multiple approaches were developed and tested in the course of this
thesis, all of which follow the same basic procedure:
1. Initial guess: Find easily detectable components with high certainty
2. Estimation of HHPS parameters based on the initial guess
3. Forward simulation of the HHPS model beginning with known switching points
4. Selection of most likely switching points to identify new on- and off-segments
5. Updated estimation of HHPS parameters based on the disaggregation from step 4 (optional)
6. Return to step 3 or finish
This section provides a brief explanation of the motivation for each step,
a comparative description of the different methods as well as a brief summary
of the elements contained within each disaggregation method.
3.1.1 Summary of Approach

Initially, the only information that is available is the measurement of the
main as well as the rated power of the heat pump¹. Given that heat pumps
have a regular pattern of behavior, both the dynamic model and the signal
shape are considered for the disaggregation process.
A partial disaggregation as an initial guess (step 1) can be obtained
without a model by taking advantage of time periods when the heat pump
signal is clearly visible in the main measurement due to low other activity
and high consumption from the heat pump. Such times occur frequently
at night or during times when the inhabitants of the house are inactive or
absent. This partial disaggregation is then used for the parameter
estimation (step 2) as described in Section 2.2. When the initial guess is
performed on time periods lasting at least one week, it provides a sufficient
number of detectable heat pump segments to obtain heat pump parameters that
are similar to the parameters based on the direct measurements. Given the
parameters of the heat pump and switching times from the initial guess, the
HHPS model is simulated forward from these known switching points (step 3).
Combining the simulated HHPS temperature and probability measures for
switches, the most likely next switching point or series of switching
points is identified (step 4). The probability estimates for switching
consider the HHPS temperature as well as other factors which differ between
the developed methods. This step identifies a new on- or off-segment, which
can then be used to calculate new HHPS parameters (step 5). This process
is repeated until there is an estimate for the heat pump at all times, i.e.,
the procedure returns to step 3 until completed (step 6).
¹ When unknown, the rated power can be identified by analyzing the histogram
of the main power measurements.
3.1.2 Overview of Disaggregation Methods

In this work, methods were developed for data in one minute resolution and
15 minute resolution, i.e., each data point contains the averaged power
consumption over the respective time period.
Table 3.1 shows an overview of which components are contained in the
different methods. Method A is applied to data with one minute resolution,
which is significantly shorter than the on-duration of heat pumps. Method
B is applied to data in 15 minute resolution. Method A uses the time of
the jumps directly (jump based), while Method B considers the shape of
the coarse 15-minute signal (shape based). All methods use the HHPS
temperature model (cf. Equation (2.11)) and only allow on-segments where
the total consumption is large enough for the heat pump to potentially be on
(plausibility check). The size of a jump can only be detected in high
resolution data and therefore it is only considered for Method A. There are
two versions of Method B: Method B2 contains all components of B1 and
considers additional factors for calculating the most likely next switching
point. In particular, Method B2 prefers switching times that do not result
in the Non-Controllable (NC) load having large increases or decreases at the
time of a switch. Furthermore, Method B2 only considers potential on-segments
where PMain,meas increases when the heat pump is estimated to switch
on and decreases when it is estimated to switch off (activity factor). For
a more detailed description of the components, the reader is referred to the
respective sections.
Table 3.1: Overview of disaggregation methods
                                      Method A     Method B1        Method B2
Data Resolution                       1 minute     15 minutes       15 minutes
Estimation Period                     1 step       Moving Horizon   Moving Horizon
Initial Guess Method                  Jump based   Shape based      Shape based
Temperature Model                     ✓            ✓                ✓
Plausibility Check                    ✓            ✓                ✓
Jump Size                             ✓            ×                ×
Minimization of changes to NC load    ×            ×                ✓
Activity factor                       ×            ×                ✓
3.1.2.1 Method A: Estimation Based on Jumps in Power Consumption

Method A was developed as an initial model for input data with a resolution
of one data point per minute, which allows spikes from large consumers such
as heat pumps to be identified. The initial guess can be obtained by finding
spikes in the size of the rated heat pump power at times where there is little
other consumption. Furthermore, the probability estimates for heat pump
switches are based on the HHPS model temperature as well as the size of
spikes in the time series. The temperature probability model provides a
probability for a heat pump switch at a certain HHPS temperature, which
is obtained using the initial guess. In addition, physical constraints are
considered (plausibility check), i.e., the heat pump cannot be considered on
when the total power consumption is too low for the heat pump to be on.
3.1.2.2 Method B1: Moving Horizon Estimation

Unlike Method A, Method B1 uses data in 15 minute intervals, which is
a common resolution of smart meter measurement data. In this data set,
the most common duration of heat pump on-segments is in the range of
20 to 50 minutes with the switching processes lasting approximately one to
three minutes. Therefore, the switching spikes cannot be detected directly
and on-segments can have different shapes due to the measurement signal
being averaged over 15 minutes. An initial guess is possible in spite of the
low resolution because heat pumps often consume more power than other loads
and they are switched on and off in time intervals that are usually long
enough to have a significant impact on the resulting main measurement.
The probability estimation for heat pump switches is based on an HHPS
temperature model, as in Method A. Furthermore, the physical constraints
are also considered.
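To illustrate why on-segments take different shapes at 15 minute resolution, the following minimal sketch averages a synthetic one-minute heat pump signal into 15 minute blocks. The rated power of 3 kW, the on-duration of 30 minutes and the helper names `downsample_mean` and `synthetic_on_segment` are assumptions made purely for illustration; they are not taken from the data set.

```python
def downsample_mean(power_1min, block=15):
    """Average a 1-minute power series into consecutive blocks of `block` minutes."""
    n_blocks = len(power_1min) // block
    return [
        sum(power_1min[k * block:(k + 1) * block]) / block
        for k in range(n_blocks)
    ]


def synthetic_on_segment(start_minute, duration=30, rated_kw=3.0, total_minutes=60):
    """Idealized heat pump signal: `rated_kw` during the on-segment, 0 otherwise."""
    return [
        rated_kw if start_minute <= t < start_minute + duration else 0.0
        for t in range(total_minutes)
    ]


# The same 30-minute on-segment with two different alignments to the 15-minute grid.
aligned = downsample_mean(synthetic_on_segment(start_minute=15))
shifted = downsample_mean(synthetic_on_segment(start_minute=10))
print(aligned)  # [0.0, 3.0, 3.0, 0.0]
print(shifted)  # [1.0, 3.0, 2.0, 0.0]
```

In the shifted case the switching instants are smeared over the neighboring intervals, which is why the shape of the coarse signal, rather than individual jumps, has to be evaluated.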
Method B1 differs from Method A in the disaggregation process: rather
than only calculating the next possible switching point, possible sequences of
switching points are considered. From each plausible next switching point,
the respective plausible switching points after that are calculated, etc.
3.1.2.3 Method B2: Modified Moving Horizon Estimation

Method B2 is an extension of Method B1: all components of Method B1
are included in Method B2 as well; however, Method B2 includes more
components for estimating the switching probability as well as additional
constraints. In particular, the probability estimate of a switch is modified
to prefer points that cause the resulting time series of non-controllable
consumption to be smooth (minimization of changes to NC load). Additionally,
the method only considers switching points at times where the change in
the main measurement is larger than a certain threshold (activity factor).
3.2 Method A: High Resolution Data

For this method, a resolution in the range of one minute will be considered.
3.2.1 Initial Guess

As described in Section 3.1.1, an initial disaggregation guess is necessary to
obtain a first set of parameters for the HHPS. The approach used for this
method will be detailed in the following:
The goal is to identify time periods where the heat pump is on with a
high likelihood or where it is certainly off, while ignoring segments where it
is not possible to detect the state easily. The first step in acquiring an initial
guess is to find jumps Jup and Jdown in the measurement data PMain,meas that
are likely caused by the heat pump as follows (cf. Algorithm 1):
Algorithm 1: Detection of jumps

Data: PMain,meas
Result: Tuples Jup and Jdown with Jump Size and Time Stamp
foreach element in PMain,meas do
    if Jump up occurs then
        Jup.append(∆Pjump,up, tjump,up)
    if Jump down occurs then
        Jdown.append(∆Pjump,down, tjump,down)
Figure 3.1: Histogram of jumps detected in main measurement of house AEK32.
(a) Detected upward jumps, (b) detected downward jumps; horizontal axes:
power [kW], vertical axes: number of occurrences.
Some heat pumps increase their power consumption over one to four
minutes, making it necessary to consider multiple consecutive data points
to identify a switch. The event of a jump occurring is therefore defined as
a set of up to four measurement points where the largest difference between
these data points is larger than a certain threshold (e.g., 1 kW) and the
direction of the jump is given by the sign of the difference.
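The jump definition above can be sketched in Python as follows. This is only one possible reading of the definition: consecutive changes in the same direction are grouped into a ramp of at most four samples, and the ramp is recorded as a jump if its total change reaches the threshold. The function name `detect_jumps`, the `noise` floor of 0.05 kW and the choice of the ramp start as time stamp are assumptions for illustration, not details given in the thesis.

```python
def detect_jumps(power, threshold=1.0, max_points=4, noise=0.05):
    """Return (jumps_up, jumps_down) as lists of (size_kw, start_index).

    A jump is a run of same-direction changes spanning at most `max_points`
    samples whose total change is at least `threshold` (in kW).
    """
    jumps_up, jumps_down = [], []
    i = 0
    while i < len(power) - 1:
        step = power[i + 1] - power[i]
        if abs(step) <= noise:
            i += 1  # no change worth tracking at this sample
            continue
        # Extend the ramp while the signal keeps moving in the same direction,
        # using at most `max_points` samples in total.
        j = i + 1
        while j < len(power) - 1 and j - i < max_points - 1:
            nxt = power[j + 1] - power[j]
            if nxt * step <= 0 or abs(nxt) <= noise:
                break
            j += 1
        size = power[j] - power[i]
        if abs(size) >= threshold:
            (jumps_up if size > 0 else jumps_down).append((size, i))
        i = j  # continue after the ramp so it is not counted twice
    return jumps_up, jumps_down


# Synthetic main measurement in kW: a three-minute ramp up, then a sharp drop.
main_kw = [0.5, 0.5, 1.5, 3.0, 3.5, 3.5, 3.5, 0.5, 0.5]
j_up, j_down = detect_jumps(main_kw)
print(j_up)    # [(3.0, 1)]
print(j_down)  # [(-3.0, 6)]
```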
In some cases, DSOs know the rated power of the heat pumps that are
installed. When the rated power is not known, it can usually be detected
due to the high occurrence of heat pump switching relative to other large
devices. A typical example is shown in Figure 3.1, which shows the sizes of
all detected jumps in the time series (absolute value). In order to obtain an
estimate for the rated power, a probability density function is fitted to the
relative frequency of jump sizes using Kernel Density Estimation (KDE).
Kernel density estimation is one possible way of obtaining density functions
from discrete data sets². Details on such methods can be found in literature
on pattern recognition, such as [13]. A density function is used for estimating
the rated power in order to be independent from quantization errors caused
by discrete bins. The probability density function corresponding to Figure
3.1 is shown in Figure 3.2. The jump size at which the density is largest is
then chosen as the most likely jump size for the heat pump, i.e.,
P_{HP,S_{on}} = \arg\max_x \, \hat{p}_{On,jump}(x)    (3.1)
and
P_{HP,S_{off}} = \arg\max_x \, \hat{p}_{Off,jump}(x),    (3.2)
where p̂On,jump and p̂Off,jump are the probability density functions
extracted from the KDE and PHP,Son and PHP,Soff are the switching powers
for turning on and off, respectively. For some heat pumps the
power consumption does not stay constant while the heat pump is switched on
but rather increases slightly over time, leading to PHP,Soff being slightly
larger than PHP,Son in some cases. For the data used in this thesis, the most
frequent large jump was always caused by the heat pump.
² Gaussian mixture models are also often used for the same purpose and can
also be used here. The choice of method has little impact on the overall result.
Figure 3.2: Kernel density estimation of jumps detected in main measurement
of house AEK32. (a) Upward jumps, (b) downward jumps; vertical axes:
probability density.
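Equations (3.1) and (3.2) can be illustrated with a small, dependency-free sketch. It assumes a Gaussian kernel with a fixed bandwidth and evaluates the density on a regular grid; the thesis does not state which kernel, bandwidth or maximization procedure was used, so `gaussian_kde`, `most_likely_jump_size`, the bandwidth of 0.1 kW and the sample values are all illustrative assumptions.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function x -> estimated density, using Gaussian kernels."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def most_likely_jump_size(jump_sizes, bandwidth=0.1, grid_step=0.01):
    """Approximate arg max of the KDE by evaluating it on a regular grid."""
    density = gaussian_kde(jump_sizes, bandwidth)
    lo, hi = min(jump_sizes), max(jump_sizes)
    n_steps = int(round((hi - lo) / grid_step))
    grid = [lo + k * grid_step for k in range(n_steps + 1)]
    return max(grid, key=density)


# Hypothetical absolute upward jump sizes in kW; most of them cluster near 3 kW.
upward_jumps_kw = [1.2, 3.0, 3.1, 2.9, 3.05, 2.95, 1.8, 6.0]
p_hp_s_on = most_likely_jump_size(upward_jumps_kw)
print(round(p_hp_s_on, 2))
```

The same function applied to the downward jumps would give the estimate for PHP,Soff.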
The goal of the initial guess is to obtain time periods Son and Soff where
the heat pump is very likely to be on or off. Therefore, on-segments for the
initial guess are selected as time periods that
• begin with a jump that is close to the expected upward jump size (i.e.,
the jump is within ±10% of PHP,Son ),
• end with a jump that is close to the expected downward jump size
and
• do not contain any jumps between the start and the end.
The initial guess for the off-segments is obtained by selecting time periods
between two on-segments that do not contain any jumps. The algorithm is
shown in mathematical terms in Algorithm 2.
Algorithm 2: Get conservative on- and off-segments

Data: Jup and Jdown
Result: Son, Soff: Sets of segments where the heat pump is on or off
foreach (∆Pjump, tjump,up) ∈ Jup do
    if ∆Pmin < |∆Pjump| < ∆Pmax then
        find first switch down (∆Pnext, tnext) ∈ Jdown
            s.t. tnext > tjump,up and ∆Pmin < |∆Pnext| < ∆Pmax;
        if ∄ (∆Pother, tother) ∈ Jup ∪ Jdown s.t. tjump,up < tother < tnext then
            Son.append((tjump,up, tnext))
foreach ((tstart1, tend1), (tstart2, tend2)) ∈ Son s.t. both segments follow each other do
    if ∄ (∆Pjump, tjump) ∈ Jup ∪ Jdown s.t. tend1 < tjump < tstart2 then
        Soff.append((tend1, tstart2))
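A minimal Python sketch of Algorithm 2 is given below. It assumes that jumps are stored as (size, time) tuples and that the admissible band [∆Pmin, ∆Pmax] is a relative tolerance `tol` around the expected switching power; the function name `conservative_segments` and all numbers in the example are hypothetical.

```python
def conservative_segments(jumps_up, jumps_down, p_on, p_off, tol=0.1):
    """Select on- and off-segments that are very likely caused by the heat pump.

    jumps_up / jumps_down: lists of (size_kw, time) tuples.
    Returns (on_segments, off_segments) as lists of (start, end) tuples.
    """
    all_times = sorted(t for _, t in jumps_up + jumps_down)

    def close(size, expected):
        # Jump size lies within +/- tol of the expected switching power.
        return abs(abs(size) - expected) <= tol * expected

    def jump_between(t0, t1):
        return any(t0 < t < t1 for t in all_times)

    downs_by_time = sorted(jumps_down, key=lambda jump: jump[1])
    on_segments = []
    for size_up, t_up in sorted(jumps_up, key=lambda jump: jump[1]):
        if not close(size_up, p_on):
            continue
        # First downward jump of matching size after the upward jump.
        candidates = [t for size, t in downs_by_time if t > t_up and close(size, p_off)]
        if candidates and not jump_between(t_up, candidates[0]):
            on_segments.append((t_up, candidates[0]))

    off_segments = []
    for (_, end1), (start2, _) in zip(on_segments, on_segments[1:]):
        if not jump_between(end1, start2):
            off_segments.append((end1, start2))
    return on_segments, off_segments


# Hypothetical jumps (size in kW, time in minutes).
ups = [(3.0, 10), (3.1, 60), (2.9, 120), (1.2, 130)]
downs = [(-3.0, 40), (-3.0, 95), (-3.1, 150), (-1.2, 135)]
s_on, s_off = conservative_segments(ups, downs, p_on=3.0, p_off=3.0)
print(s_on)   # [(10, 40), (60, 95)]
print(s_off)  # [(40, 60)]
```

The candidate segment from minute 120 to 150 is discarded because another appliance switches on and off in between, which is exactly the conservative behavior intended for the initial guess.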
The obtained on- and off-segments are conservative in the sense that
they are very likely to be caused by the heat pump. Because there
are many time periods where large consumers are not switched on or off
(e.g., when residents are sleeping, out of the house or at home and not using
high-power electric appliances), a sufficient number of on- and off-segments
can be obtained³.
Figure 3.3 shows an example of an initial guess, as well as the detected
jumps during one evening. It shows the behavior of the total power
measurement PMain,meas as well as the measurement of the heat pump PHP,meas.
As a first step, all on-segments (green) are detected by selecting the areas
that
• begin with a jump up within a band around the expected switching
power PHP,Son ± ∆P for ∆P = pth PHP,Son for some threshold pth ≈ 0.1,
• end with a jump down of analogous size PHP,Soff ± ∆P and
• do not contain any other significant jumps, i.e., only changes to the
main measurement that are significantly smaller than PHP,Soff and PHP,Son.
The off-segments (red) are added next, selected as areas between adjacent
on-segments, where the heat pump definitely does not switch on. The areas
left white are unidentified time periods, where the behavior of the heat pump
is obscured by other electric activity. In the first unknown time period the
heat pump jumps were detected (black dots); however, the time period is
marked as unknown because another jump occurs between the start and end
of the on-segment. In the second unknown time period, the beginning of
the heat pump on-segment coincides precisely with another large appliance
switching on, making it impossible to detect the on-segment purely by shape.
³ Assuming there are multiple days where heating is necessary within the data set.
Figure 3.3: Example of initial guess and detected spikes (house AEK32).
Legend: PHP,meas, PMain,meas, HP on-segments, HP off-segments, ∆PJump;
horizontal axis: time (20:00 to 00:00), vertical axis: power [kW].
For a typical example from the same house as in Figure 3.3, Subfigure 3.4a
shows the change in the main measurement PMain,meas during the detected
on-segments and Subfigure 3.4b shows the direct heat pump measurement
PHP,meas during the same segments. In comparing the two subfigures, one
can see that the profiles are very similar and the mean signals are nearly
identical.
In order to quantify this observation, the number of “true positives” was
evaluated. An on-segment detected by the disaggregation is counted as a
true positive if its beginning and end coincide with the beginning and end
of a measured on-segment with a margin of error of at most ±2 minutes. The
margin of error is allowed in order to ignore minor deviations caused by the
ramp up time. For all houses, the initial guess had a true positive rate of
at least 94%, showing that the initial guess is an accurate starting point.
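The true positive criterion can be written down compactly. The sketch below assumes that the rate is the share of detected on-segments that match a measured on-segment; the thesis does not state the denominator explicitly, so this normalization, the function name `true_positive_rate` and the example segments are assumptions.

```python
def true_positive_rate(detected, measured, margin=2):
    """Share of detected on-segments that match a measured on-segment.

    A detected segment matches if both its start and its end lie within
    `margin` minutes of the start and end of the same measured segment.
    """
    if not detected:
        return 0.0
    hits = sum(
        any(
            abs(d_start - m_start) <= margin and abs(d_end - m_end) <= margin
            for m_start, m_end in measured
        )
        for d_start, d_end in detected
    )
    return hits / len(detected)


# Hypothetical segments as (start_minute, end_minute).
detected_on = [(10, 40), (60, 95), (200, 230)]
measured_on = [(9, 41), (60, 96), (150, 170), (200, 240)]
rate = true_positive_rate(detected_on, measured_on)
print(f"{rate:.2f}")  # 0.67
```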
Using the resulting on- and off-segments, one can then calculate the
parameters a∗ , b∗ and c∗ for the heat pump model as described in Section 2.2.
This provides the basis for the disaggregation of the heat pump consumption
profile.
Figure 3.4: Heat pump signal during on-segments in initial guess.
(a) Extracted from PMain,meas, (b) extracted from PHP,meas; horizontal axes:
time [minutes], vertical axes: power [kW]; the mean signals are labeled
PMean HP, init. Guess and PMean HP, init. Guess,Meas.
3.2.2 Disaggregation Process

One now has multiple components that are indicative of switching behavior,
namely the segments Son,init and Soff,init from the initial guess as well as
the parameters a∗, b∗ and c∗, which together provide the necessary initial
information for the disaggregation process shown in Algorithm 3.
Algorithm 3: Disaggregation based on 1-minute data

Data: PMain,meas in 1-minute resolution, Son,init, Soff,init
Result: Disaggregated time series PHP,disagg and PNC,disagg
1 Set Son,known = Son,init and Soff,known = Soff,init;
2 while Disaggregation is incomplete do
3     foreach son in Son,known do
4         Calculate most likely next on-off-switching cycle beginning
          with the end of the current on-segment son by applying
          Algorithm 4;
5         Merge new switching cycle with the known on- and
          off-segments Son,known and Soff,known;
6     Calculate new parameters a∗, b∗ and c∗ for the HHPS model and
      new probability density functions pOn,jump, pOff,jump, pOn,Temp
      and pOff,Temp based on the updated segments Son,known and
      Soff,known
7 Calculate disaggregated time series PHP,disagg and PNC,disagg from
  PMain,meas, Son,known and Soff,known;
The basic idea is that, from the end of every on-segment in the known
segments (the initial guess in the first iteration), the most likely next
switching cycle is calculated using the HHPS model and probability components
which will be introduced in the following sections (lines 2-4). A heat pump
cycle is the process of switching on and off again, with the beginning of a
cycle being when the heat pump switches off. After iterating through all known
on-segments and merging the new switching points (lines 3,5), the HHPS
parameters and probability components are updated (line 6). This process
is repeated until the disaggregation is complete. Finally, the resulting time
series are calculated (line 7).
Son,known describes the on-segments that have been detected so far and
the pOn,jump and pOn,Temp describe the probability density functions that are
obtained based on Son,known (analogous for off-segments).
Algorithm 4 shows the estimation of the most likely next heat pump
cycle. The core idea is to perform a forward simulation of the HHPS
temperature model beginning at a known time at which the heat pump switches off. Then
the next possible points are identified where the heat pump could switch on
again. From each of those points, the subsequent next points are identified
where the heat pump could switch off. Each switching process is attributed
a probability measure taking into account the model temperature at the
time of the switch, the size of the jump and constraints (i.e., the probability
is zero when constraints are violated). It is to be noted that the probability
measures are not strictly probability density functions, but their values are
indicative of the likelihood of a heat pump switching (cf. next section for
more details). For easier legibility, this type of probability indication will be
referred to as probability, probability density or probability estimate, even
though it is not strictly accurate in a mathematical sense. With this method,
all possible combinations of the next switching cycle, i.e., switching on and
off again, are identified. The switching points with the largest probability
are then returned as the most likely next heat pump switching cycle.
In Algorithm 4, xti denotes the HHPS temperature at the time of the
ith jump and Pjump,i denotes the size of the ith jump. xtj is the HHPS
temperature assuming that the temperature at the previous jump was the
switching threshold, i.e., xti = T_th^low. pbest denotes the best probability
for all cycles evaluated during the algorithm and pOn,jump denotes the
probability density function describing the likelihood of an upward jump in
the main measurement caused by the heat pump (analogous for pOff,jump ).
pOn,Temp is the probability density of the heat pump switching on when at a
certain HHPS temperature (analogous for pOff,Temp ). pConstraint is a binary
variable that is 0 when feasibility constraints are violated and 1 otherwise.
The probability densities will be discussed in the following sections in more
detail.
Algorithm 4: Calculation of next switching cycle (on and off)

Data: Beginning time t1 (heat pump switches off)
Result: Switching points tup,best and tdown,best with highest
likelihood for the next on-off switching cycle
1 pbest = 0;
2 Calculate HHPS model temperature xt from Equation (2.11) for time
  period [t1, t∗];
3 foreach time ti of an upward jump during [t1, t∗] do
4     Calculate likelihood of switching on at time ti as
      pHP,on,i = pOn,jump(Pjump,i) · pOn,Temp(xti) · pConstraint
5     foreach time tj of a downward jump during [ti, t∗∗] do
6         pHP,on,off,i,j = pHP,on,i · pOff,jump(Pjump,j) · pOff,Temp(xtj) · pConstraint
7         if pbest < pHP,on,off,i,j then
8             Define tup,best := ti and tdown,best := tj
9             Define pbest = pHP,on,off,i,j
10 Return tup,best and tdown,best as the best choice for switching points of
   the next heat pump cycle beginning at time t1.
In the first for loop (line 4), pHP,on,i is the probability estimate for the
heat pump switching on at time ti after switching off at time t1. If the jump
Pjump,i has a similar size as PHP,Son, the HHPS model temperature at time
ti is close to the lower temperature threshold T_th^low and no constraints are
violated, the respective probability densities will be large and therefore the
resulting probability of a heat pump switch pHP,on,i is large. Note that the
jump size and the switching temperature are assumed to be independent.
The second for loop (line 5) traverses all possible times for switching off
given that the heat pump previously switched off at t1 and on at ti.
pHP,on,off,i,j is therefore the probability estimate for the heat pump
switching at times t1, ti and tj.
The choice of the time periods considered for potential future jumps,
defined by t∗ and t∗∗, will be discussed after introducing the probability
components and constraints, followed by an example visualizing this process.
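The nested search of Algorithm 4 can be sketched as follows. All model components are passed in as functions, because their actual form (HHPS temperature model, density estimates, constraints) is defined elsewhere in the thesis; the toy functions in the example, such as the linear temperature drift and the Gaussian-shaped scores, are placeholders chosen only to make the sketch runnable. A single fixed `t_star2` is also a simplification.

```python
import math


def next_switching_cycle(t1, up_jumps, down_jumps, t_star, t_star2,
                         temp_off, temp_on,
                         p_on_jump, p_off_jump, p_on_temp, p_off_temp,
                         p_constraint):
    """Return ((t_up_best, t_down_best), p_best) for the cycle starting at t1.

    up_jumps / down_jumps: lists of (size_kw, time).
    temp_off(t1, ti): model temperature at ti after switching off at t1.
    temp_on(ti, tj): model temperature at tj after switching on at ti.
    """
    p_best, best = 0.0, None
    for size_i, t_i in up_jumps:
        if not t1 < t_i <= t_star:
            continue
        p_on = p_on_jump(size_i) * p_on_temp(temp_off(t1, t_i)) * p_constraint(t1, t_i)
        if p_on == 0.0:
            continue  # infeasible switch-on candidate
        for size_j, t_j in down_jumps:
            if not t_i < t_j <= t_star2:
                continue
            p_cycle = (p_on * p_off_jump(abs(size_j))
                       * p_off_temp(temp_on(t_i, t_j)) * p_constraint(t_i, t_j))
            if p_cycle > p_best:
                p_best, best = p_cycle, (t_i, t_j)
    return best, p_best


def bell(center, width):
    """Unnormalized Gaussian-shaped score, used as a stand-in for the densities."""
    return lambda x: math.exp(-0.5 * ((x - center) / width) ** 2)


# Toy setup (all values hypothetical): thresholds 19 and 21 degC, rated power 3 kW.
best_cycle, p_best = next_switching_cycle(
    t1=0,
    up_jumps=[(1.0, 15), (3.0, 40), (2.9, 70)],
    down_jumps=[(-3.0, 50), (-3.1, 65), (-1.0, 45), (-3.0, 100)],
    t_star=80, t_star2=120,
    temp_off=lambda t0, t: 21.0 - 0.05 * (t - t0),  # linear cooling placeholder
    temp_on=lambda t0, t: 19.0 + 0.08 * (t - t0),   # linear heating placeholder
    p_on_jump=bell(3.0, 0.3), p_off_jump=bell(3.0, 0.3),
    p_on_temp=bell(19.0, 0.5), p_off_temp=bell(21.0, 0.5),
    p_constraint=lambda t0, t: 1.0 if t - t0 >= 10 else 0.0,
)
print(best_cycle)  # (40, 65)
```

The product form reflects the independence assumption between jump size and switching temperature stated above.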
3.2.3 Probability Components and Constraints

This section describes the probability components used for calculating the
most likely next switching points.
3.2.3.1 Temperature Distribution

One of the key assumptions made is that the heat pump always switches
when the temperature is at one of the switching thresholds T_th^high or
T_th^low. Since the temperature in the house is not measured, the HHPS
model is likely to be inaccurate. This can be caused by incorrect threshold
assumptions, an inaccurate choice of model, disturbances from open windows
and doors, heating from cooking, non-homogeneous heat distribution, imprecise
measurement of ambient temperature, etc. The only (indirect) measurements
that exist are the HHPS model temperature at the time the heat pump is
estimated to switch xSwitch as well as the size of the switching spike ∆PSwitch.
Therefore, the approach was chosen to obtain distributions of these two
variables and use these distributions as a measure of probability for whether
an unknown spike in PMain,meas is caused by the heat pump or not.
Using the on- and off-segments Son,init and Soff,init from the initial guess,
one calculates the temperature behavior during each segment, assuming that
each segment begins with the temperature at the switching threshold T_th^low
or T_th^high, respectively. The temperature behavior is obtained by forward
simulation of Equation (2.11). This process is visualized in Figure 3.5,
additionally showing the initial guess in the bottom plot. The corresponding
measurements of the main and heat pump are displayed above, including
the actual heat pump on- and off-segments.
Figure 3.5: HHPS disaggregation behavior, based on initial guess. Top plot:
PMain,meas, PHP,meas and the true HP on- and off-segments (power [kW]);
bottom plot: model house temperature with the disaggregated HP on- and
off-segments and the thresholds T_th^high and T_th^low (temperature [°C]);
horizontal axis: time (22:00 to 02:00).
Assuming that the real temperature in the house is precisely at the
switching thresholds T_th^low and T_th^high when the heat pump switches and
that both the model and the parameters are accurate, the temperature in the
HHPS behaves as shown in black. In this case, the threshold temperatures are
assumed to be 19 °C and 21 °C, as indicated by the black dashed lines.
The household temperature changes over time and the HHPS temperature
according to the model behaves differently from the true temperature.
Since there is no way to determine the cause, the following assumption is
made for all future considerations:
Assumption The model and threshold temperatures are assumed to be
accurate. All deviations between the true HHPS temperature and the model
HHPS temperature are attributed to one unknown disturbance in the HHPS
model, i.e., ωt in Equation (2.11). In this context “disturbance” and “noise”
refer to the combination of these effects, unless noted otherwise.
This assumption is practical and justified because it combines all un-
known factors into one variable without predefining the shape of the dis-
turbance. The only way of estimating the behavior of the disturbance ωt
between two switching points is based on the model temperature at the
switching points, since the switching times are the only indications of the
temperature. Assuming additionally that the disturbance is characteristic of
the HHPS, one can use its distribution as a measure of the probability of
the heat pump having switched at a certain time. This approach is similar
to Bayesian estimation or Bayesian tracking.
As can be seen in Figure 3.5, the model temperature at the end of each
segment has a small deviation from the “target” temperature, i.e., T_th^low
or T_th^high. One is now interested in how large these deviations are in
order to use this for estimations of switching behavior during unknown
segments.
Formally, this means that one needs to extract the behavior of the
cumulative noise between the beginning and the end of each segment by
observing the temperature that the house would have if the model from
Equation (2.11) were noiseless for all segments using Equation (2.13),
choosing the appropriate threshold temperature as the starting temperature.
Let T_i^On,fin and T_j^Off,fin represent the temperature at the end of the
ith on-segment or jth off-segment, respectively. One can now create a
probability density function from the discrete sets
\{ T_i^{On,fin} \mid i \in 1, \dots, |S_{on,init}| \}    (3.3)
and
\{ T_j^{Off,fin} \mid j \in 1, \dots, |S_{off,init}| \}    (3.4)
using kernel density estimation. The results are shown in Figure 3.6, which
shows both the histogram of the “end” temperatures T_i^On,fin and T_j^Off,fin
and their respective probability distributions pOn,Temp and pOff,Temp.
The shape of the occurrences and the probability density functions is a
confirmation of the assumptions made so far: Ignoring individual outliers,
the distributions are both unimodal and shaped like a Gaussian centered
near the respective assumed switching temperature. Observing the subfigures
in Figure 3.6, one can see that the largest probability densities occur very
near the assumed switching thresholds T_th^low and T_th^high. Furthermore,
the standard deviations are around 0.5 °C for this house (and similar for
other houses in the data set). This shows that the underlying model is
appropriate for the data and the disturbance at the time of a switch can be
approximated by Gaussian noise. A further observation is that the total
number of occurrences for switching off is significantly smaller than for
switching on. This is caused by the fact that on-segments can be detected
independently, whereas off-segments need to be detected between two
on-segments. Therefore there are far fewer off-segments than on-segments.
Figure 3.6: Histogram and probability density function of end-of-segment
temperatures T_i^On,fin and T_j^Off,fin. (a) Switch-on temperature,
(b) switch-off temperature; horizontal axes: temperature [°C], vertical axes:
number of occurrences and probability density.
3.2.3.2 Jump Height
Furthermore, a modified probability density function is used describing the
probability of a jump having been caused by the heat pump. In order to
obtain such a measure, the jump sizes from all known jumps are considered.
In the beginning, these are the jumps from the initial guess. With only the
initial guess, it is not possible to estimate the probability of a jump larger
or smaller than the rated power belonging to the heat pump because all
detected heat pump jumps are close to the rated power. However, there
may be times when the heat pump switches on when another device with
large electrical consumption switches on or off, e.g. the compressor of a
refrigerator. Therefore, larger or smaller jumps can also be caused partially
by the heat pump. During the disaggregation process more jumps belonging to
the heat pump are detected; however, there are still not enough data points
to estimate how likely it is that a large jump is caused by the heat pump.
Figure 3.7: Original and modified probability estimations for the jump sizes
in house AEK32. (a) Upward jumps, (b) downward jumps; curves: original
estimate p̂Jump,orig and modified estimate pJump,mod; vertical axes:
probability density.
The primary goal for such a probability estimate based on the jump size
is to select a preference between different jumps under consideration. There-
fore, one can use the KDE for jumps that are close in size to the heat pump
and use a synthetic estimate for probability for larger and smaller jumps. As-
suming that small non-heat pump consumption occurs frequently and large
non-heat pump consumption is relatively rare, small superpositions of heat
pump jumps and other consumption are more likely than superpositions of
heat pump jumps with other large devices. Therefore, the probability of a
jump belonging to the heat pump decreases with distance from the rated
heat pump power. One possible choice of representing this is with the linear
modifications as shown in in Figure 3.7 in blue. As a result, jumps that are
very similar to known heat pump jumps are strongly preferred over jumps
with other sizes. When there are no similar heat pump jumps, the jumps
closer to the rated power are preferred. The choice of such a probability
function can be further evaluated in future research with larger amounts of
data available. Using a Gaussian estimate for all jump sizes is not beneficial.
For example, for a Gaussian model the probability estimate of a jump that
is twice as large as the rated power would be multiple orders of magnitude
smaller than the likelihood of the rated power. As a result, the disaggregation
would only select jumps that are very similar to the rated power and ignore
all jumps where significant superposition with other large devices occurred.
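The combination of a KDE near the known jump sizes with linearly decaying synthetic tails can be sketched as follows. This is a minimal illustration and not the implementation used in this thesis: the Gaussian kernel, the bandwidth and the tail parameters `tail_peak` and `tail_span` are assumed values chosen only for the example, and the resulting function is not normalized, which is sufficient for ranking candidate jumps.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate as a function."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

def modified_jump_density(known_jumps, p_rated, bandwidth=0.1,
                          tail_peak=1e-3, tail_span=6.0):
    """KDE near known jumps, linear tail elsewhere (hypothetical parameters).

    The tail equals tail_peak at p_rated and decays linearly to zero at a
    distance of tail_span (in kW) from p_rated.
    """
    kde = gaussian_kde(known_jumps, bandwidth)

    def density(jump):
        tail = max(0.0, tail_peak * (1.0 - abs(jump - p_rated) / tail_span))
        # Close to known jumps the KDE dominates; far away only the tail remains.
        return max(kde(jump), tail)

    return density

# Known heat pump jumps around a rated power of 3 kW (illustrative values).
p_jump = modified_jump_density([2.9, 3.0, 3.1], p_rated=3.0)
```

With this choice, a jump of 4 kW is still preferred over a jump of 6 kW, whereas a purely Gaussian estimate would rate both as practically impossible.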
3.2.3.3 Constraints
The constraint variable pConstraint is zero when constraints are violated and
one otherwise. Its function is to eliminate all infeasible switching candidates
from consideration.
The following constraining factors are considered:
• Physical feasibility: The power consumption between switching on and
switching off must be larger than the rated power of the heat pump
PHP,r.⁴
• Minimum duration: From the initial guess, the minimum on-duration
and off-duration are calculated. All segments must last at least as
long as the shortest ones detected in the initial guess or only slightly
shorter.
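The two constraints can be sketched as a single feasibility check for an on-segment candidate. The function name, the one-minute sampling and the tolerance `duration_slack` for "only slightly shorter" are assumptions made for this illustration; the allowance of downward spikes of up to two minutes follows the footnote to the first constraint.

```python
def p_constraint(p_main_segment, p_rated, min_duration,
                 max_dip_minutes=2, duration_slack=0.9):
    """Return 1.0 for a feasible on-segment candidate and 0.0 otherwise.

    p_main_segment: main power samples (one per minute, assumed) between the
    candidate switch-on and switch-off.
    """
    # Physical feasibility: the main power must cover the rated heat pump
    # power, tolerating short downward spikes.
    longest_dip = run = 0
    for p in p_main_segment:
        run = run + 1 if p < p_rated else 0
        longest_dip = max(longest_dip, run)
    if longest_dip > max_dip_minutes:
        return 0.0
    # Minimum duration: only slightly shorter than the shortest known segment
    # (duration_slack is a hypothetical tolerance).
    if len(p_main_segment) < duration_slack * min_duration:
        return 0.0
    return 1.0
```

Because the result is either zero or one, multiplying it with the other probability terms removes infeasible candidates without changing the ranking of the feasible ones.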
3.2.3.4 Time Period Considered for Possible Next Switches

For a reasonable estimation, all points at which a jump is likely according
to the HHPS temperature model must be considered as possible jumps. For
finding the next switch-on, i.e., lines 2-4 from Algorithm 4, the temperature
decays between t1 and t∗. Let T Off,fin,min be the lowest switching
temperature at the end of all known off-segments according to the HHPS
model.⁵ For most cases, it is practical to consider all points up to the time
at which xt falls below T Off,fin,min − ∆T for some small ∆T ≈ 0.5 °C, in
order to cover all common switching temperatures. However, if xt only
reaches T Off,fin,min after the next known on-segment begins (at time
tnext,on), only jumps until tnext,on are evaluated in order to avoid
unnecessary calculations. Considering points beyond tnext,on would cause a
conflict with the next on-segment. Mathematically, one can define t∗ as
\[
\begin{aligned}
\max \quad & t^{*} \\
\text{s.t.} \quad & t^{*} \le t_{\mathrm{next,on}}, \\
& x_{t^{*}} > T^{\mathrm{Off,fin,min}} - \Delta T.
\end{aligned}
\tag{3.5}
\]
This means one considers all points forward until the heat pump is known
to turn on or it has become too cold for the heat pump to realistically
remain inactive. One can therefore expect that the heat pump will switch
on during this time period.
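A direct evaluation of Equation (3.5) can be sketched as below. The sketch assumes that the modeled temperature is available as a list `x` indexed in minutes from t1 and that `t_next_on` is given as an index into that list; both names are chosen for the illustration only.

```python
def last_candidate_index(x, t_next_on, t_off_fin_min, delta_t=0.5):
    """Return t* of Eq. (3.5) as an index into x, or None if no index qualifies.

    x: modeled HHPS temperature per minute, starting at t1 (assumed layout).
    """
    t_star = None
    for t, temperature in enumerate(x):
        if t > t_next_on:
            break  # points beyond the next known on-segment are not evaluated
        if temperature > t_off_fin_min - delta_t:
            t_star = t  # still warm enough for the heat pump to remain off
    return t_star

# Decaying temperature; the lowest known switching temperature is 19.0 degrees.
x_model = [20.0, 19.6, 19.2, 18.8, 18.4, 18.0]
```

All switching candidates between index 0 and the returned index are then evaluated for the next switch-on.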
3.2.4 Calculation of Disaggregated Time Series

Once the sets of on- and off-segments Son,known and Soff,known have been
determined, the final steps are to create the heat pump time series
PHP,disagg and the corresponding time series of the remaining load PNC.
One can create the heat pump time series as a rectangular signal either
at zero power or its rated power. However, this data also allows for more
detailed reconstruction.
⁴ Individual downward spikes of up to 2 minutes are allowed; cf. Chapter 2.
⁵ Assuming the temperature at the beginning of the off-segment is T_th^high.
[Figure 3.8: Time series of disaggregated main and heat pump time series.
Panels: (a) Without smoothing, (b) With smoothing. Vertical axis: Power [kW];
horizontal axis: time (10/15, 18:00 to 21:00).]
For each detected jump, both the beginning and the end point of the
jump were detected. The heat pump signal can therefore be reconstructed
by linearly interpolating between the beginning of the “switch-on-jump”
tOn,start, the end of the “switch-on-jump” tOn,end, the beginning of the
“switch-off-jump” tOff,start and the end of the “switch-off-jump” tOff,end.
For the end of the ramp-up and the beginning of the ramp-down, one can
choose the most likely size of the upward or downward jump according to
the probability density functions pOn,jump and pOff,jump, denoted P1 and
P2, respectively. In mathematical terms, the heat pump signal is then
given as
\[
P_{\mathrm{HP,disagg}}(t) =
\begin{cases}
\dfrac{P_1}{t_{\mathrm{On,end}} - t_{\mathrm{On,start}} + 1}\,(t - t_{\mathrm{On,start}}), & t_{\mathrm{On,start}} \le t < t_{\mathrm{On,end}} \\[2ex]
\dfrac{P_2 - P_1}{t_{\mathrm{Off,start}} - t_{\mathrm{On,end}} + 1}\,(t - t_{\mathrm{On,end}}) + P_1, & t_{\mathrm{On,end}} \le t < t_{\mathrm{Off,start}} \\[2ex]
\dfrac{-P_2}{t_{\mathrm{Off,end}} - t_{\mathrm{Off,start}} + 1}\,(t - t_{\mathrm{Off,start}}) + P_2, & t_{\mathrm{Off,start}} \le t < t_{\mathrm{Off,end}}
\end{cases}
\tag{3.6}
\]
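The piecewise-linear reconstruction can be sketched as a small function. The sketch reads the interpolation such that the plateau runs from P1 at tOn,end to P2 at tOff,start and the ramp-down starts at P2; times are assumed to be integer minute indices, and the function name is chosen for the illustration.

```python
def hp_power(t, t_on_start, t_on_end, t_off_start, t_off_end, p1, p2):
    """Piecewise-linear heat pump power at minute t (zero outside the cycle)."""
    if t_on_start <= t < t_on_end:
        # Ramp-up towards P1.
        return p1 / (t_on_end - t_on_start + 1) * (t - t_on_start)
    if t_on_end <= t < t_off_start:
        # Plateau, drifting linearly from P1 towards P2.
        return (p2 - p1) / (t_off_start - t_on_end + 1) * (t - t_on_end) + p1
    if t_off_start <= t < t_off_end:
        # Ramp-down starting at P2.
        return -p2 / (t_off_end - t_off_start + 1) * (t - t_off_start) + p2
    return 0.0

# One cycle: ramp-up over minutes 0-2, plateau until minute 10, ramp-down until 12.
series = [hp_power(t, 0, 2, 10, 12, 3.0, 2.8) for t in range(14)]
```

Evaluating the function for every minute of a cycle and concatenating the cycles yields the full time series PHP,disagg.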
The next logical step is to subtract the resulting heat pump time series
from the main time series, i.e., PMain,meas − PHP,disagg. However, the result
typically has unrealistic spikes around the times of switches, which are
caused by incorrectly estimated ramping behavior, as illustrated in
Subfigure 3.8a.
In order to obtain a reasonable result, it is necessary to clean the data
around the switching times. For the process of switching on, one way of
doing this is by linearly interpolating all points between the last point prior
to the switch and the first point where the heat pump is no longer in its
switching process. This must be done such that the resulting heat pump
power is within its range of power (i.e., between zero and its maximum
power). The result of this smoothing process can be seen in Subfigure 3.8b,
which shows behavior that is plausible even during the switching process.
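One possible reading of this cleaning step is sketched below: the remaining load is interpolated linearly between the last point before the switch and the first point after the switching process, and the heat pump power is recomputed and clipped to its physical range. Whether the thesis interpolates the remaining load or the heat pump signal itself is not specified here, so this choice, the function name and the index arguments are assumptions.

```python
def smooth_switch(p_main, p_hp, i_before, i_after, p_hp_max):
    """Return smoothed (p_hp, p_nc) lists around one switching process.

    i_before: index of the last point prior to the switch.
    i_after:  index of the first point after the switching process.
    """
    p_hp = list(p_hp)
    nc_before = p_main[i_before] - p_hp[i_before]
    nc_after = p_main[i_after] - p_hp[i_after]
    for i in range(i_before + 1, i_after):
        frac = (i - i_before) / (i_after - i_before)
        nc_interp = nc_before + frac * (nc_after - nc_before)
        # Keep the heat pump within its range of power.
        p_hp[i] = min(max(p_main[i] - nc_interp, 0.0), p_hp_max)
    p_nc = [m - h for m, h in zip(p_main, p_hp)]
    return p_hp, p_nc

# The estimated ramp is one minute too early, which causes negative spikes.
p_main = [0.5, 0.5, 2.0, 3.5, 3.5]
p_hp_raw = [0.0, 1.5, 3.0, 3.0, 3.0]
p_hp_smooth, p_nc_smooth = smooth_switch(p_main, p_hp_raw, 0, 3, 3.0)
```

In this example the raw remaining load would drop to −1.0 kW during the ramp, whereas the smoothed remaining load stays at 0.5 kW.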
3.2.5 Interpretation and Examples

This section provides some examples visualizing the behavior of the algo-
rithm over time. For this purpose, the disaggregation process will be dis-
cussed for one house during a time period of about one day, while summa-
rizing the process.
For the first step, an initial guess is detected as previously shown in
Figure 3.5. With the initial guess, one calculates the new HHPS parame-
ters, as described in Section 2.2. These parameters are used to calculate the
HHPS temperature behavior, shown in black, assuming the temperature at
the beginning of each segment is the respective switching threshold. Using
the HHPS temperature at the time of the known switches, one can calculate
the probability density function describing the probability of a switch at a
certain temperature, pOn,Temp and pOff,Temp , and analogously, the probabil-
ity density function of a jump of a certain size belonging to the heat pump,
i.e., pOn,Jump and pOff,Jump .
In the subsequent steps, one calculates the probability of the next heat
pump cycle beginning with the end of each on-segment. The most likely
switches are added to the known segments SOn,known and SOff,known . The
result of this step is shown in Figure 3.9. One heat pump cycle that was
added in this iteration occurs between 23:00 and 23:30, when there briefly
is large non-controllable consumption. Let the time of the large power jump
be tlarge. Furthermore, let t1 := 23:00. From t1, the HHPS temperature
model is calculated forward, as shown in black below. At tlarge, the HHPS
temperature is still significantly higher than the lower switching threshold,
meaning that pOn,Temp (tlarge ) is small. Furthermore, the jump is significantly
larger than the typical heat pump jump, leading to a small value for pOn,Jump .
Therefore, the probability of switching on at tlarge is small overall. From
tlarge , the next possible switch-off-time is evaluated similarly.
Using t1 = 23:00 as a starting point, the other possible switch-on time
is when the heat pump actually switches on next. The size of the
respective jump is typical for the heat pump and the HHPS temperature is
very close to the switching threshold. Therefore, it has a high probability.
The probability of the heat pump then switching off again around 23:30 is
large by the same reasoning. The large non-controllable jump can therefore
be disregarded.
After this iteration of calculating the next heat pump cycles is complete,
new HHPS parameters a, b and c are calculated and the process is com-
pleted, merging the newly detected on-segments into the set of known on-
and off-segments. The result is shown in Figure 3.10. The unknown period
after approximately 1:15am was reduced by one heat pump cycle. The on-
segment at 1:45 was detected accurately in spite of the large upward jump.
Even though the large upward jump results in a lower jump probability, the
temperature probability is relatively large because the HHPS temperature at
1:45 is close to the switching threshold. This entire process continues until
no further heat pump segments are added (not visualized).
[Figure 3.9: HHPS disaggregation behavior (first iteration). Upper panel:
Power [kW] with PMain,meas, PHP,meas and the true heat pump on-segments;
lower panel: Temperature [°C] with the thresholds T_th^high and T_th^low;
time axis from 22:00 to 03:00.]
[Figure 3.10: HHPS disaggregation behavior (second iteration). Same panels
and axes as Figure 3.9.]
3.3 Moving Horizon Method

This section introduces Method B, which was created for coarser data
where large jumps cannot necessarily be located directly. In this thesis,
data in 15-minute resolution was investigated, though the method can be
applied to data with similar resolutions as well. The distinction between
Method B1 and B2 will be introduced later in the section.
Heat pump behavior can be difficult to detect in coarse data due to the
fact that the measurement interval is significantly longer than the switch-
ing process of a heat pump. For illustration, Figure 3.11 shows different
examples of how heat pump behavior can translate into 15-minute data.
The duration that a heat pump is switched on differs between heat pumps
(and other conditions, e.g., weather), ranging from 8 minutes to 120 minutes.
Heat pumps that are switched on for a shorter time period (“short” heat
pumps, less than 30 minutes) are most difficult to detect because there are
no characteristics observable in the main measurement that are shared by
all heat pump switching processes. Indeed, if the on-segment of a heat pump
cycle is entirely within one sampling segment of the main measurement, it
causes a spike in one individual data point. When the heat pump on-segment
occurs over the course of two sampling segments, the resulting power can
show a “staircase” shape, a flat plateau or be hidden entirely due to
activity from other spikes (cf. Figure 3.11). In particular, heat pumps with
short on-segments and low rated power (i.e., in the range of 2 kW) are the
hardest to identify. Conversely, heat pumps that are switched on for longer
time segments can be detected more easily in time periods with low activity:
If PMain,meas(tprior) is the power before the heat pump switches on, there
should be a measurement point during the on-segment where the main
measurement is around PMain,meas(tprior) + PHP,r, as shown in Figure 3.11
between 150 and 200 minutes.
[Figure 3.11: Example of Heat Pump in 15-minute Main Measurement. Curves:
PMain,meas and PHP,meas; vertical axis: Power [kW]; horizontal axis:
Time [Minutes], 0 to 250.]
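The effect of the 15-minute averaging on short on-segments can be reproduced with an idealized rectangular heat pump signal. The following sketch uses made-up switching times and a rated power of 3 kW; it only illustrates how a spike, a staircase and a plateau arise and ignores all other consumption.

```python
def block_means(p_minute, block=15):
    """Average a one-minute series into consecutive blocks of `block` minutes."""
    n_blocks = len(p_minute) // block
    return [sum(p_minute[i * block:(i + 1) * block]) / block for i in range(n_blocks)]

def rectangular_hp(n_minutes, t_on, t_off, p_rated):
    """Idealized heat pump: p_rated for t_on <= t < t_off, zero otherwise."""
    return [p_rated if t_on <= t < t_off else 0.0 for t in range(n_minutes)]

# On-segment entirely inside the second block: a single spike.
spike = block_means(rectangular_hp(45, 17, 27, 3.0))
# On-segment starting late in the first block: a staircase.
staircase = block_means(rectangular_hp(45, 12, 25, 3.0))
# On-segment split evenly across two blocks: a flat plateau.
plateau = block_means(rectangular_hp(45, 25, 35, 3.0))
```

In all three cases the heat pump never appears at its rated power in the 15-minute data, which is why the jump-based detection used for one-minute data cannot be applied directly.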
3.3.1 Notation
In this section, algorithms are introduced that work on 15-minute input
data but nonetheless estimate the switching behavior in one-minute
resolution, which increases notational complexity. The following
conventions are introduced for easier legibility:
1. 15-minute time periods are referred to as “15-minute blocks” or “blocks”.
2. t15 refers to an entire 15-minute block, i.e., PMain,meas(t15) refers to
the mean power measurement during the entire block t15 (which is
constant).
3. t15 + 1 refers to the following block, i.e., if t15 := [17:30, 17:45], then
t15 + 1 = [17:45, 18:00].
4. t + 1 refers to the next point in one-minute resolution, i.e., t := 17:45
implies t + 1 = 17:46.
5. \underline{t}^{15} (t15 underlined) refers to the beginning of a 15-minute
block, e.g., if t15 := [17:30, 17:45] then \underline{t}^{15} = 17:30.
6. \overline{t}^{15} (t15 overlined) refers to the end of a 15-minute block,
e.g., if t15 := [17:30, 17:45] then \overline{t}^{15} = 17:45.
This will become useful for algorithms that work with 15-minute input
data, but mix 15-minute values and one-minute values within individual
steps of the algorithms.
3.3.2 Theoretical Background: Moving Horizon Estimation

Method B borrows a core idea from Moving Horizon Estimation (MHE),
an established method in control theory and estimation. Therefore a brief
introduction to the theory of MHE is provided [14], [15].
The concept of MHE takes into account behavior of a known system,
past measurements and a cost function. The general idea is as follows and
is summarized in Algorithm 5.
Algorithm 5: Moving horizon estimation
Data: Parameters, Inputs, Measurements, System Model
Result: State Trajectory x∗(t)
while estimation process is incomplete do
    Minimize cost function to find most likely trajectory x̃(t) for
    t ∈ [t̂ − T : t̂];
    Update final trajectory: x∗(t̂ − T) = x̃(t̂ − T);
    Increment t̂ by Tinc with Tinc < T;
end
Using measurements, inputs and known parameters for a time period
[t̂ − T : t̂] (one estimation window) as well as a model for the system, the
estimator calculates the trajectory x̃(t), t ∈ [t̂ − T : t̂] of the system that
provides the best explanation of the measurements during that time period.
The resulting trajectory of the overall estimation process, x∗ , only uses the
first value x̃(t̂ − T ) of the trajectory from the estimation horizon, i.e.,
x∗ (t̂ − T ) := x̃(t̂ − T ). (3.7)
x∗ (t) is undefined for all t > t̂ − T . The process is repeated after incre-
menting t̂ until x∗ (t) is completed. The cost function of the estimator can
be designed in many ways, e.g., minimizing the sum of square differences
between measurements and the model. The choice of cost function depends
highly on the application.
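The loop of Algorithm 5 can be illustrated with a deliberately simple toy system. The sketch below assumes a scalar model x(k+1) = a·x(k) with measurements y(k) = x(k) + noise and a sum-of-squares cost, for which the minimization has a closed-form solution. It contains neither an arrival cost nor constraints and is unrelated to the heat pump model of this thesis; all names are chosen for the illustration.

```python
def mhe_scalar(y, a, horizon):
    """Moving horizon estimate for x[k+1] = a * x[k], y[k] = x[k] + noise.

    For each window [t_hat - horizon, t_hat], the initial state x0 minimizing
    sum_j (y[j] - a**j * x0)**2 is computed in closed form, and only this
    first value of the window trajectory is kept as the final estimate.
    """
    estimates = []
    t_hat = horizon
    while t_hat < len(y):  # estimation process is incomplete
        window = y[t_hat - horizon:t_hat + 1]
        num = sum(a ** j * y_j for j, y_j in enumerate(window))
        den = sum(a ** (2 * j) for j in range(len(window)))
        estimates.append(num / den)  # x*(t_hat - T) = x_tilde(t_hat - T)
        t_hat += 1                   # T_inc = 1 < T
    return estimates

# Noise-free measurements of a decaying state with x(0) = 2 and a = 0.9.
y_meas = [2.0 * 0.9 ** k for k in range(10)]
x_star = mhe_scalar(y_meas, a=0.9, horizon=3)
```

For noise-free data the estimate reproduces the true state exactly; with noisy data every estimate is supported by all measurements in its window rather than by a single point.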
By taking into account a time period [t̂ − T : t̂] rather than just
subsequent measurement points, multiple measurements can be integrated in
every estimation step, increasing accuracy. By limiting the size of the time
period, computational effort is reduced and online measurements become
possible.
[Figure 3.12: Initial guess for short heat pump with indicator. Upper strip:
indicator (h, s, l); lower panel: Power [kW] over time (17:00 to 23:00) with
PMain,meas, PHP,meas and the initial on- and off-segments.]
3.3.3 Initial Guess

For the initial guess, it is necessary to identify on-segments without any prior
information except for the rated power. In particular, there is no indication
whether the heat pump is typically on for time periods that are rather long
(> 30 minutes) or short (< 30 minutes). In order to identify both short and
long on-segments in the initial guess, two methods were developed and then
combined.
3.3.3.1 Initial Guess for Short Segments

First, the initial guess for short time periods will be discussed.
As shown in Figure 3.11, heat pumps with short on-durations often show
a stair-like shape or an individual spike due to averaging of the main mea-
surement. Therefore, this algorithm is aimed at identifying time periods with
such shapes. One difficulty is that other devices may also cause similar
characteristics in the measurement. Therefore, the on-segment candidates are
restricted to time periods where PMain,meas does not increase significantly
above the rated heat pump power PHP,r and where there is a significant
change in PMain,meas.
As with the initial guess for higher resolution data described in
Section 3.2.1, one is interested in obtaining segments with a high rate of
“true positives”, i.e., on-segments that accurately reflect the actual heat
pump behavior. Segments with reduced certainty are ignored. To specify the
algorithm, let l denote a point in PMain,meas that is a local minimum, h a
local maximum and s a point between a local minimum and a local maximum.
The segments of interest occur in areas where the power level is low,
i.e., PMain,meas(t) < PHP,r + ∆P for some ∆P ≪ PHP,r and for all t in the
segment of interest. The relative power levels can be described as either
• l − s − h − l,
• l − h − l or
• l − h − s − l.
A last condition is that the difference between the values PMain,meas(tl)
and PMain,meas(th) needs to surpass a certain threshold, e.g., 750 W.
A time period meeting these criteria will be referred to as a heat pump
candidate. Note that segments where PMain,meas increases over
more than 2 consecutive steps are not included.
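The candidate search can be sketched by labeling each 15-minute block and matching the three patterns on the resulting label string. The handling of equal neighboring values, the default thresholds (in kW) and the use of the largest power difference within the pattern as the rise are assumptions of this sketch.

```python
def label_blocks(p):
    """Label interior blocks as 'l' (local min), 'h' (local max) or 's' (in between)."""
    labels = ["?"] * len(p)  # the first and last block have only one neighbor
    for i in range(1, len(p) - 1):
        if p[i] <= p[i - 1] and p[i] <= p[i + 1]:
            labels[i] = "l"
        elif p[i] >= p[i - 1] and p[i] >= p[i + 1]:
            labels[i] = "h"
        else:
            labels[i] = "s"
    return labels

def find_candidates(p, p_rated, delta_p=0.3, min_rise=0.75):
    """Return (start index, pattern) for every heat pump candidate in p (kW)."""
    labels = "".join(label_blocks(p))
    found = []
    for pattern in ("lshl", "lhl", "lhsl"):
        start = labels.find(pattern)
        while start != -1:
            idx = range(start, start + len(pattern))
            low_enough = all(p[i] < p_rated + delta_p for i in idx)
            rise = max(p[i] for i in idx) - min(p[i] for i in idx)
            if low_enough and rise > min_rise:
                found.append((start, pattern))
            start = labels.find(pattern, start + 1)
    return sorted(found)
```

For example, the block series [0.4, 0.4, 2.4, 0.5, 0.4, 0.9, 2.6, 0.4, 0.4] with a rated power of 2.5 kW contains one l-h-s-l candidate starting at block 1 and one l-s-h-l candidate starting at block 4.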
For an accurate disaggregation, it is useful to identify heat pump patterns
more precisely than in 15-minute steps. Figure 3.12 shows an example of
a “short” heat pump in 15-minute data, with the heat pump measurements
shown in one-minute resolution and the main measurement shown in 15-minute
resolution, i.e., the input data for the disaggregation. The indicator above
shows whether the respective segment is a local minimum, a local maximum
or an interim step in between, while being below a threshold of PHP,r + ∆P ≈
3000 W and having the appropriate shape.
After having identified the plausible candidates, one needs to identify
precisely where the heat pump cycle occurred. The calculations of these
switching times are based on the idea that the power of the non-controllable
load is known or can be estimated and remains constant while the heat pump
switches on. Consequently, all variations in the PMain,meas level are caused
by the heat pump. While this assumption clearly can be wrong at times,
it results in only very small errors in most cases. Most devices are small
relative to the heat pump and most of the time, people do not switch large
electric appliances on or off (in particular at night or when absent).
Note that large electric devices such as stoves and ovens generally are not
included in heat pump candidates, because the measurement PMain,meas(t)
typically increases significantly above PHP,r + ∆P.
For any heat pump candidate, two local minima exist in the measurement
by design. One assumes that there is no heat pump activity during the
smaller of the two local minima in order to obtain a reference for the
non-controllable load. Next, two different types of candidates are
differentiated:
For the case l-h-l, assume that PMain,meas(tl1) < PMain,meas(tl2), with
t15_li indicating the respective 15-minute block. Furthermore, one assumes
static consumption behavior in the house, i.e., that the non-controllable
consumption is constant at PMain,meas(tl1) during the candidate time period
and therefore all consumption beyond that is caused by the heat pump. With
that assumption and knowing the heat pump’s rated power PHP,r,⁶ one can
calculate how long the heat pump was switched on during the other
15-minute blocks as
\[
T_{\mathrm{on}}(t_h) = 15\,\mathrm{min} \cdot \frac{P_{\mathrm{Main,meas}}(t_h) - P_{\mathrm{Main,meas}}(t_{l_1})}{P_{\mathrm{HP,r}}}
\tag{3.8}
\]
and
\[
T_{\mathrm{on}}(t_{l_2}) = 15\,\mathrm{min} \cdot \frac{P_{\mathrm{Main,meas}}(t_{l_2}) - P_{\mathrm{Main,meas}}(t_{l_1})}{P_{\mathrm{HP,r}}},
\tag{3.9}
\]
where Ton(·) indicates the duration of the heat pump being on in the
respective 15-minute block. Should the calculation result in a negative value
or a value greater than 15 minutes, it must be replaced by zero or 15 minutes,
respectively, i.e.,
\[
0\,\mathrm{min} \le T_{\mathrm{on}}(\cdot) \le 15\,\mathrm{min}
\tag{3.10}
\]
must hold. More formally, with f(PMain,meas) denoting the unconstrained
result of Equation (3.8), this can be described as
\[
T_{\mathrm{on}}(t_h) =
\begin{cases}
0\,\mathrm{min} & \text{if } f(P_{\mathrm{Main,meas}}) < 0 \\
15\,\mathrm{min} & \text{if } f(P_{\mathrm{Main,meas}}) > 15\,\mathrm{min} \\
f(P_{\mathrm{Main,meas}}) & \text{else.}
\end{cases}
\tag{3.11}
\]
For the sake of legibility in the remainder of this section, the conditions
on Ton described by Equation (3.10) are summarized by the comment “is
constrained by”.
The time ton,start when the heat pump switches on can be calculated as
\[
t_{\mathrm{on,start}} =
\begin{cases}
t_{h,\mathrm{start}} - T_{\mathrm{on}}(t_{l_2}), & \text{if } t_{l_2} < t_{l_1} \\
t_{h,\mathrm{end}} - T_{\mathrm{on}}(t_h), & \text{otherwise}
\end{cases}
\tag{3.12}
\]
and the switch-off time as
\[
t_{\mathrm{on,end}} =
\begin{cases}
t_{h,\mathrm{start}} + T_{\mathrm{on}}(t_h), & \text{if } t_{l_2} < t_{l_1} \\
t_{h,\mathrm{end}} + T_{\mathrm{on}}(t_{l_2}), & \text{otherwise.}
\end{cases}
\tag{3.13}
\]
⁶ If the heat pump’s on-segments are long enough, it can be possible to
estimate the rated power from the distribution of PMain,meas as done in
Method A.
An example of the result is shown in Figure 3.12, during the first
on-segment. In that example, the heat pump in fact switches on and off in the
same 15-minute block, contradicting an earlier assumption. However, the
assumption that there is little or no changing non-controllable consumption
during this time holds true. As a result, Ton(tl2 = [16:30, 16:45]) ≈ 0,
meaning that ton,start = 16:45 = th,start. The estimated duration
Ton(th) of the on-component of the h-block is accurately calculated, with
the estimated heat pump therefore switching on and off slightly earlier than
in the measurements.
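The calculation for an l-h-l candidate can be sketched as follows. The sketch assumes that the on-time is contiguous, i.e., that the partially covered l-block is directly adjacent to the on-part of the h-block, and measures all times in minutes; the function and argument names are chosen for the illustration.

```python
BLOCK_MIN = 15.0  # block length in minutes

def on_time(p_block, p_ref, p_rated):
    """Minutes the heat pump was on in one block, clipped to [0, 15]."""
    return min(max(BLOCK_MIN * (p_block - p_ref) / p_rated, 0.0), BLOCK_MIN)

def lhl_switch_times(p_first_l, p_h, p_last_l, t_h_start, p_rated):
    """Return (t_on_start, t_on_end) in minutes for an l-h-l candidate.

    t_h_start is the start of the h-block; the lower of the two l-blocks
    serves as the reference for the non-controllable load.
    """
    p_ref = min(p_first_l, p_last_l)
    t_h_end = t_h_start + BLOCK_MIN
    t_on_h = on_time(p_h, p_ref, p_rated)
    if p_first_l > p_last_l:
        # The higher l-block precedes the h-block: the cycle starts there.
        t_on_l2 = on_time(p_first_l, p_ref, p_rated)
        return t_h_start - t_on_l2, t_h_start + t_on_h
    # The higher l-block follows the h-block: the cycle ends there.
    t_on_l2 = on_time(p_last_l, p_ref, p_rated)
    return t_h_end - t_on_h, t_h_end + t_on_l2
```

With a rated power of 3 kW, block powers of 0.4 kW, 2.4 kW and 1.0 kW and an h-block starting at minute 30, the heat pump is estimated to run from minute 35 to minute 48.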
The second type of on-segment candidate is a stair-shaped candidate,
as shown in the second detected on-segment (around t = 17:30) in Figure
3.12. The calculation of the switching times is very similar and based on
the assumption that the heat pump is fully switched off during both of the
l-segments. The s-block takes the place of the l2-block, i.e., the larger of
the two l-segments, and the mean power of the two l-segments is used as a
reference value for the non-controllable load.
In mathematical terms, using
\[
P_{\mathrm{Ref}} = \frac{P_{l_1} + P_{l_2}}{2}
\]
this leads to
\[
T_{\mathrm{on}}(t_h) = 15\,\mathrm{min} \cdot \frac{P_{\mathrm{Main,meas}}(t_h) - P_{\mathrm{Ref}}}{P_{\mathrm{HP,r}}}
\tag{3.14}
\]
and
\[
T_{\mathrm{on}}(t_s) = 15\,\mathrm{min} \cdot \frac{P_{\mathrm{Main,meas}}(t_s) - P_{\mathrm{Ref}}}{P_{\mathrm{HP,r}}},
\tag{3.15}
\]
with the constraints from Equation (3.10) holding as well. Calculations
of the switching times ton,start and ton,end are analogous and given by
\[
t_{\mathrm{on,start}} =
\begin{cases}
t_{h,\mathrm{end}} - T_{\mathrm{on}}(t_h), & \text{if } t_s > t_h \\
t_{h,\mathrm{start}} - T_{\mathrm{on}}(t_s), & \text{otherwise}
\end{cases}
\tag{3.16}
\]
and
\[
t_{\mathrm{on,end}} =
\begin{cases}
t_{h,\mathrm{end}} + T_{\mathrm{on}}(t_s), & \text{if } t_s > t_h \\
t_{h,\mathrm{start}} + T_{\mathrm{on}}(t_h), & \text{otherwise.}
\end{cases}
\tag{3.17}
\]
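As a worked example with made-up values, consider a stair-shaped candidate l-s-h-l with l-block powers of 0.4 kW and 0.6 kW, an s-block at 1.5 kW, an h-block at 2.5 kW and a rated power of 3 kW. Taking the reference as the mean of the two l-blocks, as described in the text, gives:

```latex
\begin{align*}
P_{\mathrm{Ref}} &= \tfrac{1}{2}\,(0.4 + 0.6)\,\mathrm{kW} = 0.5\,\mathrm{kW}, \\
T_{\mathrm{on}}(t_h) &= 15\,\mathrm{min}\cdot\frac{2.5 - 0.5}{3.0} = 10\,\mathrm{min}, \\
T_{\mathrm{on}}(t_s) &= 15\,\mathrm{min}\cdot\frac{1.5 - 0.5}{3.0} = 5\,\mathrm{min}.
\end{align*}
```

If the s-block is [17:15, 17:30] and the h-block is [17:30, 17:45], and the on-time is assumed to be contiguous, the heat pump is estimated to switch on at 17:25 and off at 17:40.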
Figure 3.12 also shows noteworthy examples where there is no initial
guess: the period around t ∈ [18:30, 18:45] is omitted because there is a
significant amount of energy that must be consumed by another device
during that time segment. Furthermore, the period between 19:00 and 20:00 is
omitted because there is more than one step between the respective periods.
Longer on-segments are detected with the algorithm shown in the following
section.
For a complete initial guess, one must additionally find off-segments. As
with the initial guess in Section 3.2.1, off-segments are found as segments
between adjacent on-segments, assuming that it is not possible for the heat
pump to have switched on in the meantime. Maintaining the mindset
of only taking into account conservative segments that are true with high
likelihood, one counts off-segments only in areas where the power level is
small relative to the heat pump and the change in power is also small, which
is shaded in red in Figure 3.12. The time period between 20:30 and 21:30
is not characterized as an off-segment because consumption is too large at
21:00.
3.3.3.2 Initial Guess for Long Segments

Many heat pumps are switched on for longer periods of time that span
more than two 15-minute blocks. Such on-segments are not detected by the
algorithm described in the previous section by design, because it aims at
detecting short on-segments which have different characteristics than long
on-segments. Therefore a separate algorithm was developed which recog-
nizes characteristics of heat pumps with longer on-segments.
The algorithm is designed as a finite state machine (FSM) as shown in
Figure 3.13 which detects when the heat pump is in any of the states S = {
“Low” (L), “High” (H), “Rising” (R), “Falling” (F), “Unknown” (U)}. The
FSM switches between states once for every 15-minute block. The general
principle is to detect rectangular signals while allowing minor deviations.
The key difference to the short initial guess is that the signal can remain
at the “high” indicator for multiple 15-minute blocks and short signals are
ignored. Because the quality of the long initial guess heavily impacts the
quality of the total disaggregation, the algorithm is presented in detail
in the following. For the sake of this algorithm, the following constants
and variables are introduced:
• Pth,low: Threshold for switching to a “low” state from any state. A
good choice can be in the range of PHP,r/3.
• PBL: The assumed or estimated power consumption of the NC devices
(base load), used as a reference for detecting heat pump activity. This
value is updated during some states and transitions.
• PBL,init: The value of PBL at the beginning of the current on-segment
candidate.
• wupper: Upper threshold for heat pump jump, i.e., if PMain,meas(t) <
PBL(t) + wupper PHP,r, the heat pump might be on at time t. This
threshold is slightly larger than one.
• wlower: Lower threshold for heat pump jump, i.e., if PMain,meas(t) >
PBL(t) + wlower PHP,r, the heat pump might be on at time t. This
threshold is slightly smaller than one.
• tstart,off: The starting time (in one-minute resolution) of the most
recent off-segment candidate.
• tstart,on: The starting time (in one-minute resolution) of the most
recent on-segment.
• PMain,meas,min: Minimal load during the entire time series.
• Pmin,∆: Threshold for switching from “Low” to “Rising”. A good choice
can be in the range of PHP,r/5.
• Phigh: The estimated power level when the heat pump is switched on
during an ongoing on-segment candidate, i.e., Phigh ≈ PNC(t15) + PHP,r.
• vdown: A threshold parameter in the range of 0.2 to ensure that a
downward jump is large enough for a transition.
[Figure 3.13: Simplified finite state machine for detection of long segments
for initial guess. States: Unknown (start), Low, Rising, High, Falling.
Transitions shown: C(U,U), C(U,L), C(L,L), C(L,R), C(L,H), C(R,H), C(H,H),
C(H,F), C(H,L)1,2, C(F,L).]
The following paragraphs describe the transitions, actions and their re-
spective conditions that occur in the FSM. Transition conditions are denoted
as C(First, Next), with First and Next being the present and next node.
Furthermore, A(First, Next) denotes the action taken during the respective
transition. The FSM is initialized in the state “Unknown”.
Oversimplifying, one finds an on-segment when going through the states
“Low”, (“Rising”,) “High”, (“Falling”) and “Low” in that order, possibly
omitting the states “Rising” and “Falling”. Off-segments are added between
consecutive on-segments if the “Unknown” state was not entered between the
on-segments. Note that the segments are not fully finalized until the initial
guesses for both the “short” and “long” heat pumps are merged (e.g., an
off-segment from the “long” initial guess may be interrupted by an on-segment
from the “short” initial guess).
Transition and Action in “Unknown” State As the name suggests,

it is not known if the heat pump is on or not during the respective 15-
minute block in this state. Since one has no indication whether the power
consumption PMain,meas (t15 ) is caused by the heat pump or not, the only
way of obtaining some certainty is when the power level falls far enough so
that the heat pump cannot be on during that segment.
Condition C(U,L) is therefore given as
PMain,meas (t) < Pth,low (3.18)

and in the action step A(U,L) the current time is stored as the potential
beginning time of an on-segment as
\[
t^{15}_{\mathrm{start,on}} = t^{15},
\tag{3.19}
\]
and the ending time of the previous on-segment is cleared, i.e.,
ton,end = undefined, (3.20)
which later serves as a signal that the “Unknown” state has been entered
since the last on-segment.
Lastly, the baseline power is updated as
PBL = min{PMain,meas (t15 − 1), PMain,meas (t15 )}, (3.21)
and
PBL,init = PBL (3.22)
where t15 − 1 refers to the 15-minute block before t15, as described in Section
3.3.1, and PBL,init will serve as a reference for calculating the precise time
where the heat pump switches on.
The minimum of these two measurements in Equation (3.21) is chosen
rather than PMain,meas (t15 ) in case the new “Low” state is actually a “Rising”
state. In many cases, the assumption holds that there is little other change
in consumption by other devices of a relevant size (compared to the heat
pump). Furthermore, small changes in the base load often fluctuate around
a lower level, i.e., the minor power increases often are followed by decreases
(e.g., switching on a toaster for a couple of minutes). Using the more con-
servative minimum can have the effect that a subsequent “High” state would
not be recognized; in such a case one would continue to the “Unknown”
state and the on-segment would be discarded. This does not impact the
disaggregation any further, whereas mistaking a “High” or “Rising” state for
a “Low” state can lead to an on-segment being classified as an off-segment.
Therefore the minimum is the more conservative choice, which is better for
the initial guess. As stated in previous sections, it is more important for the
initial guess to be accurate (most positive estimations are accurate) rather
than complete (many detected segments, however with a larger error rate).
Transition and Action in “Low” State Next, the heat pump can go
from a “Low” to a “High” state, i.e., where the heat pump is likely to be
on. The condition is that the power level increases by roughly PHP,r , i.e.,
mathematically speaking C(L,H) implies that
PMain,meas (t15 ) > Pth,low (3.23)

and
PMain,meas (t15 ) > PBL + wlower PHP,r (3.24)
and
PMain,meas (t15 ) < PBL + wupper PHP,r (3.25)
must all hold.
In the action step A(L,H) the high power level is updated as Phigh =
PMain,meas(t15) and the beginning time as t^15_start,on = t^15 − 1. It is
possible for
the on-segment to actually begin during the 15-minute block of the “High”
state, however the resulting error would be small: Given that the difference
between the two power levels is close to the rated power PHP,r , the heat
pump would either turn on towards the end of the previous 15-minute block
(t15 − 1) or at the beginning of the “High”-block (t15 ).
A transition from “Low” to “Rising” is also possible when PMain,meas (t15 )
increases, yet not enough for the heat pump to be on entirely, and the
subsequent 15-minute block seems to contain the heat pump entirely. In
mathematical terms C(L,R) is given as
PMain,meas (t15 ) > PBL + Pmin,∆ (3.26)

and
PMain,meas (t15 + 1) > PBL + wlower PHP,r (3.27)
and
PMain,meas (t15 + 1) < PBL + wupper PHP,r . (3.28)
One can also remain in the low state if there is not enough change in the
power level, i.e., the transition condition C(L,L) is given if
PMain,meas (t15 ) ≤ PBL + Pmin,∆ (3.29)

or if both
PMain,meas (t15 ) > PBL + Pmin,∆ (3.30)
and
PMain,meas (t15 + 1) < PBL + wlower PHP,r (3.31)
hold.
The second and third part of this condition (Equations 3.30 and 3.31)
basically state that one stays in the “Low” state if the power level increases,
yet not enough for a block that is on with high likelihood. Note that this
condition makes it impossible to detect “short” on-segments, as they would
be ignored by this condition. It is quite useful for detecting more on- and off-
segments with “long” heat pumps because fluctuations in non-controllable
power can be identified as base load. Merging the two initial guesses will be
done in the following section.
Lastly, if none of the conditions C(L,H), C(L,L) or C(L,R) hold, then
C(L,U) is true and one moves to the state “Unknown”. All non-measured
variables are reset, in particular PBL = min{PMain,meas } as one can no longer
estimate the non-controllable load.
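The conditions for leaving or remaining in the “Low” state can be collected in one function. Since the text does not state which transition takes priority when several conditions hold at once, the sketch only reports the set of satisfied conditions; the default values for wlower and wupper are assumptions ("slightly smaller" and "slightly larger" than one), and all names are chosen for the illustration.

```python
def low_state_conditions(p_now, p_next, p_bl, p_rated, p_th_low, p_min_delta,
                         w_lower=0.8, w_upper=1.2):
    """Return the subset of {'H', 'R', 'L'} whose transition conditions hold.

    p_now and p_next are the main measurements of blocks t15 and t15 + 1.
    An empty set corresponds to C(L,U), i.e., a move to the "Unknown" state.
    """
    lo = p_bl + w_lower * p_rated
    hi = p_bl + w_upper * p_rated
    holds = set()
    if p_now > p_th_low and lo < p_now < hi:                  # Eqs. (3.23)-(3.25)
        holds.add("H")
    if p_now > p_bl + p_min_delta and lo < p_next < hi:       # Eqs. (3.26)-(3.28)
        holds.add("R")
    if p_now <= p_bl + p_min_delta or (                       # Eqs. (3.29)-(3.31)
            p_now > p_bl + p_min_delta and p_next < lo):
        holds.add("L")
    return holds
```

For a base load of 0.4 kW and a rated power of 3 kW, a block at 1.5 kW followed by a block at 3.4 kW satisfies only the “Rising” condition, whereas two consecutive blocks at 6 kW satisfy none of the conditions and lead to the “Unknown” state.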
Transition and Action in “Rising” State As a condition for enter-

ing the “Rising” state, the subsequent state must be “High” (see Equation
(3.27)), since the following power level must be near PBL + PHP,r . The only
action A(R,H) taken is to set the starting block of the on-segment candidate
to t^15_start,on = t^15 − 1, i.e., the on-segment is likely to start between
the previous 15-minute block (“Low” state) and the subsequent block (“High”
state). For the sake of consistency, one can define C(R,H) as TRUE.
Transition and Action in “High” State From the “High” state, i.e.,
where the heat pump is believed to be on, it can remain on during the entire
next 15-minute block t15 + 1, i.e., a transition C(H,H). The condition for
this transition is that the power level PMain,meas (t15 ) remains similar to the
previous step PMain,meas (t15 − 1), i.e.
PMain,meas (t15 ) > PBL + PHP,r wlower (3.32)

and
PMain,meas (t15 ) < PBL + PHP,r wupper . (3.33)

This condition allows for some fluctuation in the consumption for two
reasons: First, the heat pump can fluctuate around its rated power or in-
crease over time, as described in Section 2.1.1. These fluctuations can appear
in the 15-minute data as well. Secondly and more importantly, when the heat
pump is switched on for long durations (i.e., in the range of 60 minutes),
fluctuations in the NC-load are likely as well. Therefore the assumption
that the NC power level remains constant while the heat pump is on is too
strong. With this condition, such changes do not obstruct the detection of
the initial guess.
In order to take into account minor fluctuations in the non-controllable
load in the power reference, the PBL is updated as
PBL = max{PMain,meas (t15 ) − PHP,r , PMain,meas,min } (3.34)
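The check for remaining in the “High” state and the baseline update can be sketched as follows. This is a minimal illustration of Equations (3.32) to (3.34), not the thesis implementation; the function names and all numeric values (including the band factors 0.8 and 1.2) are assumptions chosen for the example.

```python
def stays_high(p_main, p_bl, p_hp_r, w_lower, w_upper):
    """Condition C(H,H), Equations (3.32) and (3.33)."""
    return p_bl + p_hp_r * w_lower < p_main < p_bl + p_hp_r * w_upper


def update_baseline(p_main, p_hp_r, p_main_min):
    """Baseline update while the heat pump is on, Equation (3.34)."""
    return max(p_main - p_hp_r, p_main_min)


# Illustrative values in kW (assumed, not taken from the data set).
p_bl, p_hp_r = 0.5, 3.0
w_lower, w_upper = 0.8, 1.2

still_on = stays_high(3.6, p_bl, p_hp_r, w_lower, w_upper)  # inside the band
dropped = stays_high(1.0, p_bl, p_hp_r, w_lower, w_upper)   # below the band
new_p_bl = update_baseline(3.6, p_hp_r, p_main_min=0.2)
```

The lower bound in `update_baseline` keeps the baseline estimate from falling below the smallest measured main power.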
Next, the FSM state can move from the “High” state directly to the
“Low” state if there is a sufficiently large downward jump, i.e., if the heat
pump is on until near the end of the prior 15-minute segment and then
switches off, causing a drop between PMain,meas (t15 − 1) and PMain,meas (t15 )
of approximately the rated heat pump power PHP,r . In mathematical terms,
C(H,L)1 therefore is
PMain,meas (t15 ) < PBL + PHP,r vdown , (3.35)

with vdown a parameter in the range of 0.2 ensuring that the downward
step is large enough.
In this case, one has likely identified a heat pump on-segment and one
can add the on-segment to the set Son,init . The start and end times are
calculated analogously to Subsection 3.3.3.1, assuming that the increase of
power beyond the baseline is caused by the heat pump. The start time
tstart,on is calculated as
$$t_{\mathrm{start,on}} = t^{15}_{\mathrm{start,on}} - \frac{P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{start,on}}) - P_{\mathrm{BL,init}}}{P_{\mathrm{HP,r}}}, \tag{3.36}$$
using the assumption that the non-controllable load remains at PBL,init .
The resulting time is therefore found by attributing any energy above the baseline in the 15-minute block to the heat pump and assuming it consumes the full rated
power PHP,r .
Thirdly, if the power level decreases by a smaller amount in 15-minute
block t15 , but large enough to ensure that the heat pump is not fully switched
on during the entire 15-minute block, one differentiates three different cases:
(a) Further reduction of the power level to a level near PBL at time t15 + 1,
i.e., the heat pump is likely turned off during block t15 and then
remains off for the entire block t15 + 1. This means that the FSM
enters the “Falling” state.
(b) Return to power level near Phigh in the following 15-minute block (t15 +
1): The heat pump likely either switched off and on again during the
15-minute block or reduced power for a few minutes. This sequence of
power levels is “high” (PMain,meas (t15 − 1)), “medium” (PMain,meas (t15 )),
“high” (PMain,meas (t15 + 1)), i.e., the FSM returns to the state “Rising”.
(c) The subsequent 15-minute block is at a similar “medium” power level,
i.e., PMain,meas (t15 ) ≈ PMain,meas (t15 + 1), and the block after that
returns to the level near Phigh , i.e., “high” (PMain,meas (t15 −
1)), “medium” (PMain,meas (t15 )), “medium” (PMain,meas (t15 + 1)), “high”
(PMain,meas (t15 + 2)). Here the heat pump is likely to switch off in block
t15 and switch on again in block t15 + 1. The FSM enters state “Low”.
The calculation of the start time is the same as Equation (3.36) for all
of these cases.
The condition for case (a) is C(H,L)1 and is given as
PMain,meas (t15 ) > (Phigh − PBL )vdown (3.37)

and
PMain,meas (t15 + 1) < PBL + PHP,r vdown . (3.38)
The ending time is calculated analogously to (3.17), i.e., assuming that
the non-controllable load remains constant and the heat pump is on with
full power when it is on. The ending time for this case is then
$$t_{\mathrm{end,on}} = t^{15} + \frac{P_{\mathrm{Main,meas}}(t^{15}) - P_{\mathrm{BL}}}{P_{\mathrm{HP,r}}} \cdot 15\,\mathrm{min}. \tag{3.39}$$
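The end-time calculation of Equation (3.39) can be illustrated with a short sketch. It assumes that blocks are indexed from zero, that block k starts at minute 15k, and that the on-fraction is clamped to the block; these conventions and the function name are assumptions made for the example, not statements about the thesis code.

```python
BLOCK_MINUTES = 15  # resolution of the coarse measurement


def on_segment_end_minutes(block_index, p_main, p_bl, p_hp_r):
    """End time of an on-segment in minutes, following Equation (3.39)."""
    # Fraction of the block during which the heat pump ran at rated power.
    fraction_on = (p_main - p_bl) / p_hp_r
    # Assumption: clamp to [0, 1] so the end time stays inside the block.
    fraction_on = min(max(fraction_on, 0.0), 1.0)
    return (block_index + fraction_on) * BLOCK_MINUTES


# Illustrative values in kW: half of the rated energy is found in block 4.
end_minute = on_segment_end_minutes(block_index=4, p_main=2.0, p_bl=0.5, p_hp_r=3.0)
```

With these values the heat pump is estimated to switch off halfway through the block.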
Case (b) is the same as C(H,R), i.e., the power level only decreases for one
15-minute block. In this case it is not possible to identify where in the block
the heat pump switches off and on again, because there is no reference for an
end or a beginning. Therefore, one assumes the time when the heat pump
is switched off is in the middle of the 15-minute block.
C(H,R) is given by Equation (3.37) (PMain,meas (t15 ) is somewhat
reduced) and
PMain,meas (t15 + 1) > (Phigh − PBL )vdown (3.40)

and

and
The action A(H,R) consists of multiple components:
One can assume that this is a true on-segment given this shape, and
therefore one can calculate the start time of the on-segment tstart,on with
Equation (3.36). If there was no “Unknown” state since the previous on-
segment, the time since the heat pump switched off can be counted as a
certain off-segment, therefore (tend,on , tstart,on ) is added to Soff . Note that
at this point, tend,on is the ending time of the previous on-segment and it is
discarded if the “Unknown” state was entered.
The end time of this on-segment tend,on is then calculated as
$$t_{\mathrm{end,on}} = t^{15}_{\mathrm{end,on}} + \frac{P_{\mathrm{Main,meas}}(t^{15}) - P_{\mathrm{BL}}}{2 P_{\mathrm{HP,r}}} \cdot 15\,\mathrm{min}, \tag{3.43}$$
the only difference from Equation (3.39) being the factor 2 in the denominator. This causes only half of the power attributed to the heat pump
within PMain,meas (t15 ) to be placed at the beginning of the block t15 . The remaining heat pump portion of PMain,meas (t15 ) is allocated to the end of t15 ,
leading to the start time of the next on-segment being calculated as
$$t_{\mathrm{start,on,next}} = t^{15}_{\mathrm{end,on}} - \frac{P_{\mathrm{Main,meas}}(t^{15}) - P_{\mathrm{BL}}}{2 P_{\mathrm{HP,r}}} \cdot 15\,\mathrm{min}. \tag{3.44}$$
In this case, two on-segments are located very close to each other. How-
ever, this behavior can also be caused by a momentary drop in heat pump
power while the heat pump is switched on. Therefore, it is reasonable to
identify it as switching behavior only if the estimated switching times are
located far enough apart, i.e., add the on-segment (tstart,on , tend,on ) to the
set of known on-segments Son only if the difference is larger than a certain
threshold ton,min (e.g., 5 minutes):
tend,on − tstart,on > ton,min (3.45)
If the condition from Equation (3.45) holds, an on-segment is concluded
and one can add (tstart,on , tend,on ) to the set of known on-segments Son . Fur-
thermore, the brief off-segment (tend,on , tstart,on,next ) is added to Soff . Lastly,
the start time of the on-segment is updated as tstart,on = tstart,on,next .
For case (c), two 15-minute blocks with “medium” power levels are fol-
lowed by a block where PMain,meas ≈ Phigh . This indicates that the heat
pump switches off during the block t15 and on again during block t15 + 1.
In mathematical terms, the condition C(H,L)2 is given as Equation (3.37)
(PMain,meas (t15 ) is somewhat reduced) and
PMain,meas (t15 ) > PBL + (Phigh − PBL )vdown (3.46)

and
and
The end time tend,on is then calculated the same way as in Equation
(3.39) and the start time tstart,on as in Equation (3.36). One can then add the
on-segment (tstart,on , tend,on ) to Son .
Lastly, the baseline power needs to be updated in a different fashion here:
No direct reference of the base load exists because all of the blocks contain a heat pump component. Therefore, the non-controllable load is estimated
using the next block where the heat pump is assumed to be on (PMain,meas (t15 +
2)), and the next reference load is set as
PBL = PMain,meas (t15 + 2) − PHP,r . (3.49)
The FSM continues to the “Low” state. As with the other states, if
none of the described transition conditions apply, the FSM transitions to
the “Unknown” state.
3.3.3.3 Merging Both Initial Guesses

The two initial guesses can include contradicting segments, i.e., a short on-
segment may have been detected during a period where a long off-segment
was found. For the disaggregation in the later step, it is necessary to merge
the two initial guesses. Neither approach can guarantee accurate detection of
the segments, and therefore there are multiple approaches that can provide
a reasonable initial guess.
The approach chosen is as follows:
• Undetected periods remain undetected.
• When an on-segment and an off-segment overlap, the on-segment is
chosen. This occurs only when the long disaggregation was unable to
identify an on-segment that the short disaggregation identified.
• When there is partial overlap, all times that are part of any on-segment
are chosen as part of the final on-segment.
Errors due to this method can be as follows: There may be off-segments

detected in the long initial guess where there are in fact some short on-
segments. The long initial guess does not detect the short on-segments by
design, however, the short initial guess may ignore some of the heat pump
segments. Discarding an off-segment whenever it coincides with an “Un-
known” period from the short initial guess would greatly diminish the num-
ber of detected off-segments, possibly allowing too few segments to remain.
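The merging rules can be sketched with plain interval arithmetic. The sketch below is one possible reading of the rules and makes two assumptions that the text does not fix: segments are (start, end) tuples in minutes, and an off-segment that overlaps an on-segment is trimmed rather than discarded as a whole. All function names are hypothetical.

```python
def union_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract_intervals(intervals, blockers):
    """Remove the parts of `intervals` covered by sorted, disjoint `blockers`."""
    result = []
    for start, end in intervals:
        cursor = start
        for b_start, b_end in blockers:
            if b_end <= cursor or b_start >= end:
                continue  # blocker does not touch the remaining part
            if b_start > cursor:
                result.append((cursor, b_start))
            cursor = max(cursor, b_end)
        if cursor < end:
            result.append((cursor, end))
    return result


def merge_initial_guesses(on_short, off_short, on_long, off_long):
    """On-segments win over off-segments; undetected periods stay undetected."""
    on_merged = union_intervals(on_short + on_long)
    off_merged = subtract_intervals(union_intervals(off_short + off_long), on_merged)
    return on_merged, off_merged


# Toy example in minutes: a short on-segment (60, 65) falls inside a long off-segment.
on_merged, off_merged = merge_initial_guesses(
    on_short=[(10, 20), (60, 65)], off_short=[],
    on_long=[(15, 40)], off_long=[(0, 10), (40, 100)],
)
```

Times that belong to neither result list remain undetected, which matches the first rule.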
3.3.4 Disaggregation Process

Method B combines the temperature-based Bayesian approach from Method
A with a modified MHE approach. The overall method can be summarized
as follows:
Beginning with the initial guess, a forward simulation of the HHPS tem-
perature model is performed from the first switching point tstart , identifying
potential heat pump candidates. Unlike Method A, Method B does not
only calculate the candidates for the next heat pump switch, but finds
candidates for the next set of chorizon switching processes, similar to an esti-
mation window in MHE (cf. Subsection 3.3.2). Each switching process has
a certain probability depending on the timing of the previous switch, the
HHPS temperature model and other factors. As a result, one can find the
most likely set of switching points tstart , t1 , t2 , . . . , tchorizon , given the starting
point tstart 7 . The first new switching point t1 from this set of switches is
used to add segment (tstart , t1 ) to the set of known segments Son,known or
Soff,known . After updating tstart as t1 , the same process is then repeated
until the disaggregation is complete.
3.3.4.1 General Approach and Algorithms

First, an overview of the components and terminology in the relevant algo-
rithms will be provided:
1. A node n contains a time stamp n.t and a probability factor n.p.
2. Each node has a unique parent node nparent = n.parent
3. A node n may have child nodes (i.e., n is the parent node of its child
nodes)
4. A node with no child nodes is called a leaf
5. A path is an ordered set of nodes (n1 , n2 , . . . , nk ), where ni is the
unique parent node of ni+1 , i.e., ni+1 .parent = ni
7
Two consecutive switching times refer to switches in the opposite direction
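The terminology above maps naturally onto a small data structure. The following sketch is only an illustration; the class name `Node` and the helper `path_to` are assumptions and not taken from the thesis code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    t: int                           # time stamp n.t
    p: float                         # probability factor n.p
    parent: Optional["Node"] = None  # unique parent, None for the start node


def path_to(node: Node) -> List[Node]:
    """Return the path (n1, ..., nk) that ends in `node` by following parents."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current)
        current = current.parent
    return path[::-1]


root = Node(t=0, p=1.0)
child = Node(t=12, p=0.5, parent=root)
leaf = Node(t=23, p=0.125, parent=child)
times_on_path = [n.t for n in path_to(leaf)]
```

Because every node stores only its parent, a path is fully determined by its final node.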
The core of this algorithm is the MHE. The overall algorithm is described
in Algorithm 6, the process within an estimation window is described in
Algorithm 7 and the calculation of the probabilities is detailed in Algorithm
8. For ease of understanding, Algorithm 7 will be explained in more detail first.
Let Nstart = {nstart,1 } with nstart,1 .p = 1 and nstart,1 .t = t0 be the set containing one starting node. This starting node is assumed to be a true switching
point, and the goal is to calculate the possible times of the next
chorizon switches of the heat pump and find the most likely path.
Algorithm 6: Method B
Result: PNC , PHP,disagg , a∗ , b∗ , c∗
1 Get total initial guess Son,init , Soff,init (cf. Section 3.3.3);
2 Get HHPS parameters a∗ , b∗ , c∗ from initial guess (cf. Section 2.2 and Equation (2.26));
3 Obtain temperature probability density function pOn,Temp , pOff,Temp (cf. Section 3.2.3.1);
4 Let nstart be node for the first switching point in Son,init ;
5 while Disaggregation is not complete do
6     Get path (nstart , n1 , n2 , . . . , nm , n∗ ) from Algorithm 7 with input N = {nstart };
7     Add segment (nstart .t, n1 .t) to SOn,known or SOff,known ;
8     Let nstart = n1 ;
9 Set PHP,disagg (t) = PHP,r if t is contained in an on-segment, and 0 else;
10 Set PNC (t) = PMain,meas (t) − PHP,disagg (t) ∀t;
For the sake of easier explanation, a simplified version of the implemented
method will be explained first, adding constraining and optimizing elements
later. From the initial point nstart , all possible next switching points are
calculated, which are child nodes nnext of the initial node and are added to
the set Nadded . Subsequently, the same step is repeated with each node in
Nadded as a starting node nstart , i.e., all following possible switching points are
calculated. This process is repeated chorizon times. This results in a tree
structure with chorizon + 1 layers. While calculating each of the next possible
switching nodes, its probability factor is calculated as well, which can be
interpreted as a relative likelihood of the child node following the parent
node. By design, each leaf nleaf,end in the final layer is the end of a unique
path from n1 to nleaf,end . The resulting probability factor nleaf,end .p of each
leaf is then the product of the transition probabilities along the path from
n1 to nleaf,end . In order to choose the most likely path, the leaf in the last layer
with the highest probability factor is selected, and the path leading to it is the best path. Each node
along the path indicates switching behavior of the heat pump; therefore, this
results in the most likely switching behavior as measured by the probability
calculations in Algorithm 8.
Algorithm 7: Tree for MHE
Data: Set of nodes of starting points Nstart
Result: Path (nstart , n1 , n2 , . . . , n∗ )
1 Initialize the counter variable cctr := 0;
2 Initialize empty set Nadded for switching points added during the algorithm;
3 while cctr < chorizon do
4     foreach node ∈ Nstart do
5         Apply Algorithm 8: Get possible next nodes and add them to Nadded ;
6     Remove inferior nodes from Nadded ;
7     Increment cctr ;
8     Set Nstart = Nadded ;
9     Set Nadded = {};
10 From Nstart choose leaf n∗ with highest probability;
11 Get path (nstart , n1 , n2 , . . . , nm , n∗ ), where n∗ .parent = nm , ni+1 .parent = ni and n1 .parent = nstart
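Algorithm 7 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the thesis implementation: the callable `candidates` is a hypothetical stand-in for Algorithm 8 that returns (time stamp, transition probability) pairs, and the toy transition table is only loosely based on the example tree of Figure 3.14.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    t: int
    p: float
    parent: Optional["Node"] = None


def best_path(start: Node, horizon: int,
              candidates: Callable[[Node], List[Tuple[int, float]]]) -> List[Node]:
    """Expand `horizon` layers and return the most likely path from `start`."""
    layer = [start]
    for _ in range(horizon):
        added = {}  # time stamp -> dominating node of the new layer
        for node in layer:
            for t_next, p_trans in candidates(node):
                child = Node(t_next, node.p * p_trans, node)
                # Removing inferior nodes: keep one node per time stamp.
                if t_next not in added or child.p > added[t_next].p:
                    added[t_next] = child
        if not added:
            break  # no feasible switch left, stop early
        layer = list(added.values())
    leaf: Optional[Node] = max(layer, key=lambda n: n.p)
    path = []
    while leaf is not None:
        path.append(leaf)
        leaf = leaf.parent
    return path[::-1]


# Toy transition table (assumed values): time stamp -> possible next switches.
TABLE = {
    0: [(11, 1 / 4), (12, 1 / 2)],
    11: [(19, 1 / 6), (23, 1 / 4)],
    12: [(23, 1 / 4), (24, 1 / 6)],
    19: [(30, 3 / 16), (31, 3 / 8)],
    23: [(31, 1 / 2), (32, 1 / 4)],
    24: [(32, 1 / 4), (33, 1 / 2)],
}

path = best_path(Node(0, 1.0), horizon=3, candidates=lambda n: TABLE.get(n.t, []))
path_times = [n.t for n in path]
```

Storing the new layer in a dictionary keyed by time stamp performs the removal of inferior nodes while the layer is built, which is equivalent to removing them after the inner loop.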
Figure 3.14 visualizes this process: It shows each node with the time
stamp t and probability factor p, as well as its unique node identification
number (above the nodes; note that the first number corresponds to the layer
number and the second number is counted continuously from the top). The
transition probability of moving from one node to the next is indicated along
the edges. There is one initial node (node 1), which has two child nodes
(nodes 21 and 22), each of which has further children. The interpretation
is as follows: It is known that the heat pump switches on (or off) at t = 0,
represented by node 1. The next possible heat pump switching times are at
t = 11 and t = 12. These switches have a probability factor of n2 .p = 1/4 and
n10 .p = 1/4 , which indicate the probability of the heat pump switching at the
respective time after having switched on at t = 0. If the heat pump switches
at time t = 11 (i.e., node 21), it then has three possible switches after that
at t = 19, t = 23 or t = 24 (i.e., nodes 31,32 and 33).
Figure 3.14: Simplified example of tree for horizon length chorizon = 3

Figure 3.15: Simplified example of reduced tree
The probability factors are calculated as

nnext .p = nprior .p · ptrans,prior,next , (3.50)
with ptrans,prior,next indicating the transition probability. The goal of finding
the best path is obtained by selecting the largest probability of the leaves
in the final layer. The resulting optimal path is indicated in green in the
example in Figure 3.14.
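As a worked instance of Equation (3.50), the factors along the path through nodes 1, 22, 34 and 45 can be multiplied out. The transition probabilities used below are the values as read from the edges of Figure 3.14 and should be checked against the figure.

```latex
\begin{align*}
n_{22}.p &= n_{1}.p \cdot \tfrac{1}{2} = 1 \cdot \tfrac{1}{2} = \tfrac{1}{2}, \\
n_{34}.p &= n_{22}.p \cdot \tfrac{1}{4} = \tfrac{1}{8}, \\
n_{45}.p &= n_{34}.p \cdot \tfrac{1}{2} = \tfrac{1}{16}.
\end{align*}
```

Under these values, no other leaf in the final layer reaches a factor of 1/16, so this path is the one selected.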
Algorithm 7 is a modified version of the description above: The overall
process is repeated for chorizon steps (while-loop in line 3): For each node in
the set Nstart the next possible switching points and their respective probabilities are calculated and saved. In the tree, this means adding one layer to
the tree. In practice, a layer will often contain many nodes with the same time
stamp, as shown in Figure 3.14, where nodes 32, 33, 34 and 35
have the time stamps t = 23 and t = 24. Such duplicates cause unnecessary
computational effort and can be removed without changing the final result.
Using the principle of inferiority, any node ni can be removed if there
exists a different node nj in the same layer with the same time stamp such that ni .p < nj .p. This principle is used in path planning as well: Let the goal be to find the shortest
path from location A to location Z and let two possible paths traverse location C. If the first path takes 10 units of time to reach C and the second path
takes 15 units of time, the second path cannot be the shortest path from
A to Z. This example is analogous to Figure 3.17: the path (n1 , n21 , n32 )
ends at the same time stamp as the path (n1 , n22 , n34 ), however the second path has
a larger probability factor. Because the possible future switching behavior
is identical for both nodes n32 and n34 , as they share the time
stamp t = 23, no path following node n32 can have the highest final probability factor; all such paths are inferior solutions. Therefore, nodes that cause
inferior solutions are called inferior. All inferior nodes are shown in gray in
the figure and the dominant nodes are shown in cyan. Note that node n35
dominates n33 and node n45 dominates n42 . When one removes the inferior
nodes after calculating all possible next nodes (i.e., after the for-loop in line
4 of Algorithm 7), the example from Figure 3.14 can be reduced as shown
in Figure 3.15.
Figure 3.16: Simplified example of normalized reduced tree
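The removal of inferior nodes within one layer can be sketched as follows. The dictionary layout and the function name are assumptions for illustration; the node numbers and values are those read from the third layer of the example tree and should be checked against Figure 3.14.

```python
from fractions import Fraction as F


def remove_inferior(layer):
    """Keep, per time stamp, only the node with the largest probability factor.

    `layer` maps a node id to a (time stamp, probability factor) tuple.
    Returns the sorted ids of the dominating nodes.
    """
    best = {}  # time stamp -> id of the dominating node seen so far
    for node_id, (t, p) in layer.items():
        if t not in best or p > layer[best[t]][1]:
            best[t] = node_id
    return sorted(best.values())


# Third layer of the example tree (values as read from the figure).
layer_3 = {
    31: (19, F(1, 24)),
    32: (23, F(1, 16)),
    33: (24, F(1, 18)),
    34: (23, F(1, 8)),
    35: (24, F(1, 12)),
}
survivors = remove_inferior(layer_3)
```

Nodes 32 and 33 are dropped because nodes 34 and 35 carry the same time stamps with larger factors.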
In practice, the trees are significantly larger than those shown in the
example figures, because there are only a limited number of
constraints on when the heat pump can switch. Furthermore, the horizon
should be chosen to be at least 4 to maximize the benefit of
the MHE. As a result, the probability factors of the leaves in the final layer
can be small enough to create numerical issues, and it is useful to
normalize the probability factors in each layer. In order to find the best
path, only nodes in the final layer are compared to each other. Therefore,
one can scale each node in a layer with a constant without changing the
result.
Formally, this means setting the probability factor to
$$n_{\mathrm{added,new}}.p = \frac{n_{\mathrm{added,original}}.p}{\max_{n' \in N_{\mathrm{added}}} n'.p} \quad \forall n \in N_{\mathrm{added}} \tag{3.51}$$
after line 5 in Algorithm 7. As an example, see Figure 3.16, which shows
the reduced and normalized tree of the same example as in the previous figures.
Algorithm 8: Next nodes and probabilities
Data: nstart , HHPS parameters a∗ , b∗ , c∗
Result: Nnext with next nodes nnext
1 foreach t ∈ [nstart .t, t∗ ] do
2     Get trajectory of temperature model xt with t0 = nstart .t;
3     Evaluate constraints: pconstr (t) = 0 if constraint violated, 1 else;
4     Evaluate pOn/Off,Temp at time t;
5     Evaluate pOn/Off,BL at time t;
6     nnext .t = t;
7     Set nnext .p = nstart .p · pconstr (t) · pfeas (t) · pOn/Off,Temp (t) · pOn/Off,BL (t) · pOn/Off,BL · pact (t);
8     if nnext .p > 0 then
9         Add nnext to Nnext
As can be seen in that example, the relations between the probability factors
of nodes in one layer remain the same.
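The per-layer normalization can be sketched as below. The sketch assumes that the scaling constant is the largest probability factor in the layer, which is consistent with the values shown in the normalized example tree; the function name and the dictionary layout are hypothetical.

```python
from fractions import Fraction as F


def normalize_layer(probs):
    """Scale all probability factors of one layer by the layer maximum."""
    top = max(probs.values())
    return {node_id: p / top for node_id, p in probs.items()}


# Reduced third layer of the example tree (node id -> probability factor).
layer_3 = {31: F(1, 24), 34: F(1, 8), 35: F(1, 12)}
normalized = normalize_layer(layer_3)

# Ratios between nodes of the same layer are unchanged by the scaling.
ratio_before = layer_3[31] / layer_3[35]
ratio_after = normalized[31] / normalized[35]
```

Since only nodes within the same layer are compared, the scaling avoids very small numbers without affecting which leaf is selected.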
One application of Algorithm 7 completes one estimation window.
Next, the overall disaggregation as shown in Algorithm 6 will be dis-
cussed, which makes use of the previously described Algorithm 7. As a first
step, the initial guess, as described in Section 3.3.3 is obtained in order to
be able to estimate HHPS parameters as described in Section 2.2. The tem-
perature probability function is also obtained the same way as in Method A
(cf. 3.2.3.1).
To find the next switching point from tstart = nstart .t, the optimal path
is calculated with Algorithm 7. The resulting path contains one known
switching point (nstart ). Only the next switching point, i.e., n1 , is then
considered a known switching point. This calculation then repeats with n1
as the beginning point. For the example in Figure 3.14, node 1 is nstart
and node 22 is then n1 on the optimal path (with node 45 being n∗ ). The
segment (t = 0, t = 12) is then added to Son,known or Soff,known .
This then serves as the basis for updated HHPS parameters and tem-
perature probability density functions. Finally, the MHE is performed for
the entire time series and the resulting on- and off-segments Son,known and
Soff,known are used to create PHP,disagg , with
$$P_{\mathrm{HP,disagg}}(t) = \begin{cases} P_{\mathrm{HP,r}}, & \text{if } t \text{ is contained in an on-segment} \\ 0, & \text{else} \end{cases} \tag{3.52}$$
The resulting disaggregated main time series PMain,disagg (t) is given as
PMain,disagg (t) = PMain,meas (t) − PHP,disagg (t) ∀t. (3.53)
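Equations (3.52) and (3.53) translate directly into code. The sketch below assumes that the main measurement has already been brought to the same time grid as the on-segments and that segment end points are inclusive; both are assumptions for the example, as is the function name.

```python
def build_disaggregated_series(p_main_meas, on_segments, p_hp_r):
    """Return (P_HP,disagg, P_Main,disagg) following Equations (3.52) and (3.53)."""
    n = len(p_main_meas)
    p_hp = [0.0] * n
    for start, end in on_segments:
        # Inclusive segment bounds, clipped to the length of the series.
        for t in range(max(start, 0), min(end, n - 1) + 1):
            p_hp[t] = p_hp_r
    p_main_disagg = [m - h for m, h in zip(p_main_meas, p_hp)]
    return p_hp, p_main_disagg


# Toy series in kW with one on-segment covering samples 1 and 2.
p_hp, p_rest = build_disaggregated_series([1.0, 4.0, 4.0, 1.0], [(1, 2)], p_hp_r=3.0)
```

A negative entry in the second list would indicate a physically implausible disaggregation.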
Figure 3.17: Simplified example of tree showing inferior nodes in gray and
dominating nodes in cyan
Unlike with Method A, the issues of implausible power measurements
(e.g., negative PMain,disagg ) or unrealistic spikes near the beginning and end
of on-segments (cf. Section 3.2.2) do not appear in the disaggregation due to
the lower resolution of the data.
3.3.4.2 Constraints
To eliminate implausible switching points from consideration, the following
constraints are added to the model and used in Algorithm 8:
• Physical feasibility: The disaggregation cannot be chosen such that
the non-controllable power is negative
• Minimum duration of on- and off-segments
• Activity: Heat pump switching processes can only occur when there
is sufficient change in PMain,meas in the appropriate direction
• Certain initial guess: Switches that conflict with the initial guess
are discarded
Physical feasibility The physical feasibility constraint is applied as defined in Method A, i.e., removing all potential switching times from consideration where the total load is too small for the heat pump to be switched
on.
Duration The duration a heat pump is switched on or off can vary significantly, however it does not become arbitrarily small. Therefore, one can
place a constraint on the duration. In most cases, the number of segments
from the initial guess is large enough to assume that they are representative
of typical behavior. Therefore, the shortest heat pump segments are also
representative of the minimum duration. In order to choose a lower duration threshold that still leaves some margin for uncertainty, the thresholds are chosen
slightly shorter than the shortest duration in the respective segments, i.e.
$$t_{\mathrm{on,min,th}} = \min_{i \in 1,\dots,|S_{\mathrm{on,init}}|} \left(t_{i,\mathrm{end,on,init}} - t_{i,\mathrm{start,on,init}}\right) - t_{\Delta} \tag{3.54}$$
and
$$t_{\mathrm{off,min,th}} = \min_{i \in 1,\dots,|S_{\mathrm{off,init}}|} \left(t_{i,\mathrm{end,off,init}} - t_{i,\mathrm{start,off,init}}\right) - t_{\Delta}, \tag{3.55}$$
where ton,min,th and toff,min,th are the new thresholds and t∆ is a buffer
variable ensuring the estimate is not too conservative (e.g., if the shortest
on-segment in the initial guess lasts 15 minutes, it is reasonable to set a
looser constraint on on-segments such as 13 minutes).
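The threshold from Equations (3.54) and (3.55) is a one-line computation. The sketch below assumes segments are (start, end) tuples in minutes; the function name and the example segments are hypothetical.

```python
def min_duration_threshold(segments, t_delta):
    """Shortest segment duration from the initial guess minus the buffer t_delta."""
    return min(end - start for start, end in segments) - t_delta


# Example from the text: the shortest on-segment lasts 15 minutes, buffer of 2 minutes.
on_init = [(0, 15), (40, 70), (100, 118)]
t_on_min_th = min_duration_threshold(on_init, t_delta=2)
```

The same function applies to the off-segments of the initial guess to obtain the second threshold.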
Activity constraint Lastly, the activity constraint is an optional constraint that requires a significant change between $P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}})$
and adjacent points, i.e.
$$\max\left\{P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}} + 1) - P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}}),\; P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}}) - P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}} - 1)\right\} > P_{\mathrm{th,act,on}} \tag{3.56}$$
when switching on or
$$\max\left\{P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}} - 1) - P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}}),\; P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}}) - P_{\mathrm{Main,meas}}(t^{15}_{\mathrm{switch}} + 1)\right\} > P_{\mathrm{th,act,off}} \tag{3.57}$$
when switching off. Whenever the heat pump activity is the largest
component in the measurement, this condition holds. However, there can also
be time periods when accurate estimation of heat pump switching is prevented, e.g., if a large consumer such as a boiler switches on when the heat pump
switches off. Then PMain,meas increases at the time of the switch
if the boiler power is larger than the heat pump power. Given that boiler
disaggregation can be performed relatively successfully in many cases, this
problem occurs only infrequently. The probability of other devices (e.g., an oven) switching
on at the same time and overcompensating for the heat pump switching off
is low.
The activity constraint can be counterproductive in some cases: When
the heat pump switches in short intervals, the measurement PMain,meas (t)
does not necessarily increase (cf. Figure 3.12). Therefore, the activity con-
straint is best omitted when on- and off-segments are often short. For the
simulations in this data set, the activity constraint was omitted when the
median duration of both on- and off-segments from the initial guess was less
than 20 minutes and included otherwise.
For any point where a constraint is violated, its probability factor is set
to zero and it is therefore discarded.
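The activity constraint of Equations (3.56) and (3.57) can be sketched as a simple check on three neighboring 15-minute blocks. The function name, the argument layout and the numeric values are assumptions for illustration.

```python
def activity_ok(p_main, k, switching_on, p_threshold):
    """Check whether block k shows enough change for a switch in the given direction."""
    if not 1 <= k <= len(p_main) - 2:
        raise ValueError("block k needs a neighbor on both sides")
    if switching_on:
        # Equation (3.56): the power must rise into or out of block k.
        change = max(p_main[k + 1] - p_main[k], p_main[k] - p_main[k - 1])
    else:
        # Equation (3.57): the power must fall into or out of block k.
        change = max(p_main[k - 1] - p_main[k], p_main[k] - p_main[k + 1])
    return change > p_threshold


# Toy 15-minute series in kW with a rising edge around block 1.
p_main = [0.5, 2.0, 3.5, 3.4]
on_allowed = activity_ok(p_main, k=1, switching_on=True, p_threshold=1.0)
off_allowed = activity_ok(p_main, k=1, switching_on=False, p_threshold=1.0)
```

A candidate that fails this check would receive a probability factor of zero, as described above.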
Certainty of the initial guess The certainty of the initial guess can
be incorporated into this algorithm by removing all switching paths that conflict
with the initial guess, e.g., an on-segment that begins before the start of an
initial guess-on-segment, but ends after it.
To sum up, Methods B1 and B2 are differentiated based on the
constraints chosen: For Method B2, all constraints are active, whereas for
Method B1 the activity constraint is inactive. Method B2 shows better
results than B1 in most cases. When the heat pump duration is short, e.g.,
less than 15 minutes on average, the activity constraint in Method B2 is
counterproductive: short heat pump on-segments do not necessarily have
a sufficient impact on the total signal and can therefore be removed from
consideration by the activity constraint. Method B2 shows better results and
decreased runtime for heat pumps with longer on-segments because unlikely
segments are removed from consideration.
3.3.4.3 Probability Components

Two probability components are included in this method:
• Probability factor for the switching temperature according to the model,
pOn,Temp and pOff,Temp
• Probability factor for the change of the non-controllable load around
the switching time, pBL
The fundamental basis of the algorithm, the temperature model and
switching assumptions are the same in this method as in Method A (cf.
Chapter 2 and Section 3.2.3.1).
The second probability component is reasoned as follows: Given that
in most cases the heat pump behavior dominates other consumption in
residential homes, it is reasonable to attribute a higher probability factor
to a switching point that results in a smoother non-controllable load than
a switching point which would then cause a large spike or drop in non-
controllable load after disaggregation. Unlike with the switching temper-
ature distribution, there is no way to extract an appropriate probability
density function from the initial guess, because the initial guess is obtained
based on the same principle (cf. Subsection 3.3.3, in particular Equations
(3.8) to (3.13)). Therefore, the switching points from the initial guess are
chosen for the change in non-controllable load to be minimal before and af-
ter the switch. This factor needs to be quantified for it to be useful for the
MHE. Figure 3.18 shows one possible choice of a baseline probability density
function. The choice of its shape and its relative importance in the MHE
(cf. Algorithm 8, line 8) can further be optimized. Switches that cause the
baseload power immediately before the switch and in the block of the switch
to be similar are strongly preferred.
The choice of a discontinuous function is caused by the ratio
of baseload powers falling into different categories: When it is close to one, the
baseload shows little change at the time of the heat pump switching. Beyond
certain thresholds, a switch is only possible if a second large electric device
switched on or off at the same time. While this is possible, it occurs less
frequently than small changes in electric consumption at the time of the
switch. When there are only points under consideration where PBL,switch / PBL,prior is
very large, e.g., larger than 1.5, smaller ratios should be preferred, however
not by large margins, because such a case indicates that a different large
device switched on or off at the same time. The size of the other device is
less significant. A Gaussian model is also possible, however it cannot combine
the strong preference for values near one with a small slope for larger
values.
Figure 3.18: Probability factor for minimization of change in non-controllable load (horizontal axis: ratio PBL,switch / PBL,prior ; vertical axis: probability estimation)
3.3.5 Analogy to Moving Horizon Estimation

With standard MHE, one performs an optimization for times [t0 , t∗ ] to obtain
the optimal trajectory in this time period. Then the horizon moves and the
trajectory is calculated for [t0 + t∆ , t∗ + t∆ ] such that the state x(t0 + t∆ )
is on the optimal trajectory calculated in the prior step. Repeating this
process until the end of the estimation time series results in an estimated
overall trajectory.
In this moving horizon method, the horizon is not defined by a time pe-
riod, but by the number of switching processes. The general idea is therefore
to calculate the optimal switching trajectory8 from a known switching point
forward for a certain number of switches. In analogy to standard MHE, the
first point on this optimal switching trajectory after the starting point is
then assumed to be certain. Then, the process is repeated, beginning with
this new starting point, resulting in another optimal trajectory. The first
new node of this optimal trajectory is assumed to be certain again and one
repeats the entire process until the end of the time series has been reached.
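The moving horizon over switching events can be sketched as follows. This is a simplified illustration, not the thesis implementation: the callable `candidates` is a hypothetical stand-in for the probability calculation, the search is exhaustive (no removal of inferior nodes), and sequences that cannot reach the full horizon are discarded.

```python
def best_sequence(t_start, horizon, candidates):
    """Return (probability, [t1, ..., tk]) of the most likely switch sequence."""
    if horizon == 0:
        return 1.0, []
    best = (0.0, [])
    for t_next, p_trans in candidates(t_start):
        p_rest, tail = best_sequence(t_next, horizon - 1, candidates)
        if p_trans * p_rest > best[0]:
            best = (p_trans * p_rest, [t_next] + tail)
    return best


def moving_horizon(t_start, t_end, horizon, candidates):
    """Commit only the first switch of each optimal sequence, then move on."""
    switches = [t_start]
    while switches[-1] < t_end:
        _, sequence = best_sequence(switches[-1], horizon, candidates)
        if not sequence:
            break  # no complete sequence left within the horizon
        switches.append(sequence[0])
    return switches


# Toy table (assumed values): the locally best first switch (t = 5) leads to an
# unlikely follow-up, so a horizon of 2 prefers t = 6 instead.
TABLE = {0: [(5, 0.6), (6, 0.4)], 5: [(9, 0.1)], 6: [(10, 0.9)]}
greedy = best_sequence(0, 1, lambda t: TABLE.get(t, []))[1]
lookahead = best_sequence(0, 2, lambda t: TABLE.get(t, []))[1]

# Time-invariant toy rule: the next switch follows after 10 or 12 minutes.
committed = moving_horizon(0, 30, 3, lambda t: [(t + 10, 0.6), (t + 12, 0.4)])
```

The comparison of `greedy` and `lookahead` shows why a horizon longer than one switch can change the first committed switching point.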
3.4 Evaluation
This subsection aims to provide understanding of how well this method
performs.
8
A map between the number of the switch and the time at which the switch occurs
3.4.1 Motivation for Choice of Evaluation

There are many possible ways to quantify the performance of the disaggre-
gation. For method B the input data, i.e., the main measurement PMain,meas ,
is provided in 15-minute intervals, while the output data PHP,disagg and the
validation data PHP,meas are provided in 1-minute resolution. Therefore it is
necessary to assess the disaggregation quality by other means than minute
by minute matching.
The developed evaluation methods aim to answer two questions:
• How precisely does a disaggregated on- or off-segment match with a
measured segment?
• Does the disaggregated time series share characteristic attributes with

the measured time series?
Figure 3.19: Heat pump measurement and disaggregation (example 1; upper panel: PHP,meas , lower panel: PHP,disagg ; power in kW over time in minutes)
Figure 3.19, with actual on-segments above and disaggregated on-segments

below, shows an example that clearly can be categorized as a good disag-
gregation: Two subsequent on-segments from the measurement can clearly
be assigned to the disaggregation and the difference between the start and
end times of the disaggregation is small.
However, Figure 3.20 is less clear: Although the timing of the disaggre-
gated on-segment is not in line with the measured on-segment, the argument
can be made that the disaggregation nonetheless performed somewhat well
in this case: The amount of energy consumed by the heat pump in the de-
picted time period is similar in both cases, even though there is very little
overlap.
Figure: heat pump measurement PHP,meas (upper panel) and disaggregation PHP,disagg (lower panel); power in kW over time in minutes
This motivates the following choice for quantifying the performance. In
order to quantify how many on- and off-segments match, and how well, between
the measurement and the disaggregation, the percentage by which the
on-segments overlap can be calculated. This results in two separate
measures, because both the duration of the disaggregation and the duration
of the measurement can be chosen as a basis:
Precision by disaggregation $p_{bd}$ uses the disaggregation as a reference,
i.e., let the $i$th on-segment from the disaggregation correspond to the
time period $[t_{i,start,d}, t_{i,end,d}]$ and contain
$d_{i,dis} = t_{i,end,d} - t_{i,start,d} + 1$ data points. Let $\hat{d}_i$
be the number of data points within the time period
$[t_{i,start,d}, t_{i,end,d}]$ where the heat pump is on according to the
measurement; these data points may stem from multiple measured on-segments.
Precision by disaggregation is then defined as
\[
p_{bd,i} = \frac{\hat{d}_i}{d_{i,dis}}, \tag{3.58}
\]
which is given in percent and ranges from zero to 100% by definition.
For the case in Figure 3.21, pbd,i = 100%, because for every point in the
disaggregated on-segment the actual heat pump is on as well.
Precision by measurement is analogous, with the disaggregated and
measured values switching roles: The $i$th measured on-segment is defined by
the time period $[t_{i,start,m}, t_{i,end,m}]$ and contains
$d_{i,meas} = t_{i,end,m} - t_{i,start,m} + 1$ data points. Then
$\tilde{d}_i$ is the number of data points within the time period
$[t_{i,start,m}, t_{i,end,m}]$ where the heat pump is on according to the
disaggregation; these data points may stem from multiple disaggregated
on-segments. Precision by measurement is then defined as
\[
p_{bm,i} = \frac{\tilde{d}_i}{d_{i,meas}}. \tag{3.59}
\]
Figure 3.21: PHP,meas (upper panel) and PHP,disagg (lower panel); Power [kW]
over Time [Minutes].
For the example in Figure 3.21, pbm,i = 50%, because the disaggregation
is only on during half of the actual on-segment. These measures are large
(i.e., close to 1) when there is large overlap and zero when there is no overlap.
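The two precision measures can be illustrated with a short sketch. It
assumes that both time series are available as boolean on/off arrays in
1-minute resolution. The function names on_segments and precision_scores are
chosen for illustration only and are not taken from the implementation used
in this thesis; the example data are made up such that the disaggregation
covers only the second half of a measured on-segment.

```python
import numpy as np

def on_segments(on):
    """Return (start, end) index pairs (inclusive) of consecutive on-runs."""
    on = np.asarray(on, dtype=bool)
    padded = np.concatenate(([False], on, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2] - 1
    return list(zip(starts.tolist(), ends.tolist()))

def precision_scores(reference_on, other_on):
    """Share of each reference on-segment during which the other series is on."""
    other_on = np.asarray(other_on, dtype=bool)
    return [float(other_on[s:e + 1].mean()) for s, e in on_segments(reference_on)]

# Made-up example: measured segment of 4 minutes, disaggregated segment of 2 minutes.
meas = np.array([0, 1, 1, 1, 1, 0, 0, 0], dtype=bool)
disagg = np.array([0, 0, 0, 1, 1, 0, 0, 0], dtype=bool)

p_bd = precision_scores(disagg, meas)   # reference: disaggregated on-segments
p_bm = precision_scores(meas, disagg)   # reference: measured on-segments
```

With these inputs the single disaggregated on-segment lies entirely within
the measured one, while only half of the measured on-segment is covered.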
Duration score: The third measure is intended to accommodate segments such
as the one shown in Figure 3.20, in which the scores $p_{bm,i}$ and
$p_{bd,i}$ would be zero for all depicted segments. Consider a time period
$[t_i, t_j]$ and let $d_{m,(i,j)}$ be the number of data points within
$[t_i, t_j]$ where the heat pump is switched on according to the
measurement. Analogously, $d_{d,(i,j)}$ is the number of points in the time
period where the heat pump is on according to the disaggregation. The
duration score $p_{dur,(i,j)}$ is then defined as
\[
p_{dur,(i,j)} = \frac{d_{d,(i,j)}}{d_{m,(i,j)}}. \tag{3.60}
\]
The score is similar to a comparison of the amount of energy consumed
according to the measurement and the disaggregation, respectively. Time
periods investigated were 30 minutes, 1 hour and 2 hours. Some heat pumps
display behavior with non-constant power consumption when switched on
(cf. Section 2.1.1), which is not detected by this algorithm and is very
difficult or even impossible to identify from 15-minute data. In order to
remove impact from these effects from the evaluation, a relation of time
rather than energy was chosen.
3.4.2 Examples of Specific Houses

In order to support the intuitive understanding of the evaluation methods,
visualizations and interpretations from some specific houses are provided in
this section.
Figure 3.22: Evaluation Method A, house AEK4. (a) Precision by measurement
pbm, (b) Precision by disaggregation pbd.
3.4.2.1 Method A, House 4
This house has a heat pump with approximately 2.5 kW rated power that is
typically on between 33 and 65 minutes (25th and 75th percentiles). Figure
3.22b shows the precision estimates based on disaggregation. pbd is typically
close to one, indicating that the detected heat pump segments were very
accurate. However, a significant number of actual on-segments were not
detected, as shown in Figure 3.22a. This is largely caused by the way in
which the density function describing the height pJump was modified (cf.
Figure 3.7). When the linear components are small relative to the maximum,
the disaggregation strongly prefers jumps of the “correct” size. When the
linear components are of a similar order of magnitude, more heat pump
segments are detected; however, the number of segments incorrectly
attributed to the heat pump increases as well.
3.4.2.2 Method B2, House 4
As shown in Figure 3.23a, the heat pump can be on for both short and long
time periods, with on-segments lasting as short as 15 minutes and as long as
three hours. The overall distributions of the segment durations from the
measurement and from the disaggregation are similar, indicating that the
disaggregation shares similar stochastic properties. The durations of the
off-segments, shown in Figure 3.23b, also show similar profiles; however,
the disaggregation tends to overestimate the duration of the off-segments
because some on-segments are not detected.
Figure 3.23: Histograms of segment durations for house AEK4 (Method B2),
measurement and disaggregation. (a) On-segments, (b) Off-segments; Number of
Occurrences over Duration [Minutes].
The precision measures, as defined previously in this section, are shown in
Figure 3.24 and give the percentage of overlap between the disaggregated
and measured on-segments. This shows that the disaggregation was
successful: Based on both measures, over 100 on-segments have over 90%
overlap.
This indicates that approximately 2/3 of the detected on-segments match
with high precision. Furthermore, for 91% of all on-segments from the disag-
gregation there is overlap, i.e., only 9% of disaggregated on-segments occur
when the heat pump is not actually switched on (cf. Figure 3.24a). 82% of
all measured on-segments overlap with disaggregated on-segments (cf. Fig-
ure 3.24b). This means that there are more measured on-segments that are
not detected at all than there are disaggregated on-segments that do not
correspond to the measurement.
The duration score pdur,1h is shown in Figure 3.25. When pdur,1h is
smaller than one, this indicates that the disaggregation was not active
enough, i.e., the heat pump in the disaggregated time series within the re-
spective hour is on for less time than the measured heat pump. The opposite
applies for pdur,1h > 1. The median is centered slightly below one, indicating
that the disaggregation shows similar behavior as the actual heat pump on
average. Because the heat pump can always be switched off, but switching
on is constrained, pdur,1h < 1 occurs more frequently than pdur,1h > 1.
3.4.3 Overall Disaggregation

For the sake of comparing the performance of different methods and houses,
the following aggregate measures are defined:
• pbd,80 is the percentage of disaggregated on-segments where pbd,i is
larger than 80%.
• pbd,>0 is the percentage of disaggregated on-segments that overlap an
actual on-segment, i.e., where pbd,i > 0.
• pbm,80 and pbm,>0 are defined analogously for pbm.
• pdur,20,1h is the percentage of one-hour time intervals where pdur,(i,j)
is between 0.8 and 1.2, i.e., time intervals where the actual and the
disaggregated heat pump were on for similar durations.
Figure 3.24: Precision evaluation for house AEK4 (Method B2). (a) Precision
by disaggregation (precision score pbd), (b) Precision by measurement
(precision score pbm).
Figure 3.25: Duration score for House AEK4 (Method B2); horizontal axis
p1h,rel.
All evaluations in this section are applied to on-segments, but can also be
applied to off-segments. As indicated earlier, all evaluations are conducted
on a 1-minute basis.
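The aggregate measures can be computed from the per-segment and per-window
scores as sketched below. The function name aggregate_measures and the
dictionary keys are hypothetical. Two details are assumptions of this sketch
rather than statements from the text: the bounds 0.8 and 1.2 are treated as
inclusive, and windows with an undefined duration score (NaN) are excluded.

```python
import numpy as np

def aggregate_measures(p_bd, p_bm, p_dur):
    """Aggregate per-segment precision scores and per-window duration scores."""
    p_bd, p_bm, p_dur = (np.asarray(x, dtype=float) for x in (p_bd, p_bm, p_dur))
    valid = p_dur[~np.isnan(p_dur)]  # assumption: ignore undefined windows
    return {
        "p_bd_80": float(np.mean(p_bd > 0.8)),
        "p_bd_gt0": float(np.mean(p_bd > 0.0)),
        "p_bm_80": float(np.mean(p_bm > 0.8)),
        "p_bm_gt0": float(np.mean(p_bm > 0.0)),
        "p_dur_20_1h": float(np.mean((valid >= 0.8) & (valid <= 1.2))),
    }

# Made-up scores for illustration only.
measures = aggregate_measures(
    p_bd=[1.0, 0.9, 0.5, 0.0],
    p_bm=[0.85, 0.0],
    p_dur=[1.0, 0.7, float("nan"), 1.1],
)
```

All values are returned as fractions between zero and one, which matches
the way the scores are listed in the summary tables of this section.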
3.4.3.1 Method A
In case of high-resolution data (1-minute), Method A is used, which can
detect typical heat pump jumps well. Therefore, the precision by disaggre-
gation pbd,80 shows particularly high scores, as can be seen in Figure 3.26a.
Furthermore, pbd,80 is typically only slightly smaller than pbd,>0 , indicating
that when there is overlap, the overlap usually covers more than 80% of
the segment. Additionally, three houses with larger heat pumps (houses 7,
20 and 23) showed better performance than the smaller heat pumps. This
observation and the data for the figures in this section are shown in Table
3.2 at the end of this section.
Furthermore, the on-segments tend to be shorter in the disaggregation than
for the actual heat pump, as shown in Figure 3.27⁹. This is likely caused by
the method of detecting the initial guess for Method A: Only time periods
are considered where no other jumps occur between the potential heat pump
jumps. The longer an on-segment is, the higher the likelihood of a jump
occurring that is caused by a different electric device. Therefore, the
initial guess tends to detect shorter heat pump segments rather than longer
ones. Given that the HHPS model is very simple, it is plausible that an
initial guess consisting primarily of short on-segments reduces the
likelihood estimates of longer on-segments.
The data for this figure are shown in Table 3.3 at the end of this section.
3.4.3.2 Method B
The performance of Method B2 is evaluated in more detail in the following
and compared in some aspects to Method B1 and Method A.
In spite of using data in significantly lower resolution than Method A, the
overall performance is similar in many measures, as can be seen from Figure
3.26. In particular, the percentage of on-segments with overlaps pbd,>0 is
only slightly smaller than in Method A, and the number of actual on-segments
that are at least partially detected (pbm,>0) is generally larger. This is
caused by the fact that Method A strongly prefers segments with “correct”
jump size. If there is an on-segment candidate where the jump sizes are too
large or too small due to superpositions with activity from other electric
devices, Method A often ignores the segment entirely and instead chooses
the subsequent on-segment. Method B, however, has larger flexibility with
regard to the specific starting time of on-segments and therefore omits
fewer on-segments. pbm,80 is better in many cases for Method B than for
Method A, indicating that the energy consumption may be a stronger indicator
for heat pump activity than jump sizes.
Figure 3.26: Aggregated precision scores for all houses (pbm,80, pbd,80,
pbm,>0, pbd,>0). (a) Method A, (b) Method B2.
Figure 3.27: Duration of on-segments, aggregated for all houses, by
percentile 25, 50 and 75 (Method A, Method B2, Measurement); Duration
[Minutes] over Percentile.
⁹ Interpretation of the first boxplot: This boxplot shows the distribution
of the 25th percentile of on-durations of all houses based on Method A.
Method B performs significantly worse than Method A for pbd,80 . The
on-segments detected by Method A have a high likelihood of matching pre-
cisely with the actual on-segment because it prefers segments that begin and
end with the correct jump size. Without access to jump sizes, Method B
is more prone to selecting on-segments that do not match the actual on-
segments perfectly than Method A. As a consequence, Method A can possibly
be improved by taking the overall signal shape into account more and
reducing the strong dependence on the jump sizes.
The data for these observations and for Figure 3.26b are shown in Table 3.4
at the end of this section.
When comparing durations of the on-segments, the median durations
and the 25th and 75th percentiles of the on-durations are closer to the actual
values from the measurement with Method B than Method A. This is caused
by Method B generalizing better to heat pumps with longer on-segments.
In particular, the algorithm for the initial guess of Method B allows long
segments to be detected better, which provides better HHPS parameters
than Method A. Method B provides good performance for percentiles 25,
50 and 75 in most cases, but still tends to underestimate outliers with very
long durations.
The performance measures for Method B1 are relatively similar to Method
B2 or worse, except for heat pumps with very short on-durations. For such
cases, the overlap percentage is less relevant because of the frequent switch-
ing. In terms of the duration score, B1 performs significantly better with
heat pumps that switch frequently if their average behavior is similar. In this
data set, this applies to house 32, which has an actual median on-duration
of 8 minutes. This duration is too short to be detected by the disaggre-
gation algorithm consistently. The activity constraint is counterproductive
because it declares actual switching points as infeasible due to the changes in
PMain,meas being too small. Therefore, these constraints should be omitted
for heat pumps with short on-durations.
Table 3.2: Summary of algorithm performance: Method A
House  pbm,80  pbd,80  pbm,>0  pbd,>0  pdur,20,1h  PHP,r [W]
2      0.60    0.91    0.63    0.92    0.82        2350
4      0.59    0.96    0.62    0.98    0.89        2500
6      0.49    0.88    0.54    0.91    0.68        2800
7      0.74    0.92    0.81    0.96    0.92        4000
10     0.64    0.84    0.70    0.89    0.81        2300
15     0.57    0.86    0.65    0.89    0.79        2200
16     0.48    0.91    0.56    0.94    0.76        2500
20     0.80    0.94    0.84    0.95    0.91        3600
23     0.83    0.94    0.86    0.96    0.96        3370
32     0.67    0.48    0.69    0.97    0.66        2500
Table 3.3: Comparison of durations of on-segments [minutes] (percentile 25
/ 50 / 75)
House  Disaggregation  Measurement
2      20 / 26 / 39    23 / 36 / 49
4      29 / 40 / 55    33 / 48 / 65
6      40 / 55 / 60    64 / 76 / 104
7      12 / 26 / 40    14 / 29 / 53
10     26 / 30 / 40    29 / 36 / 49
15     18 / 24 / 35    21 / 28 / 40
16     25 / 33 / 55    33 / 42 / 69
20     14 / 15 / 18    14 / 18 / 24
23     17 / 19 / 23    19 / 25 / 32
32     7 / 8 / 10      9 / 11 / 16
Table 3.4: Summary of algorithm performance: Method B2
House  pbm,80  pbd,80  pbm,>0  pbd,>0  pdur,20,1h  PHP,r [W]
2      0.71    0.84    0.92    0.94    0.70        2350
4      0.78    0.78    0.91    0.90    0.76        2500
6      0.62    0.89    0.91    0.99    0.65        2800
7      0.72    0.85    0.83    0.98    0.82        4000
10     0.61    0.66    0.82    0.85    0.71        2300
15     0.52    0.68    0.79    0.84    0.54        2200
16     0.82    0.92    0.97    0.98    0.77        2500
20     0.69    0.74    0.81    0.97    0.7         3600
23     0.77    0.82    0.91    0.96    0.7         3370
32     0.33    0.39    0.56    0.96    0.38        2500
32ᵃ    0.36    0.42    0.64    0.95    0.67        2500
ᵃ Method B1
Table 3.5: Comparison of durations of on-segments, Method B2 [minutes]
(percentile 25 / 50 / 75)
House  Disaggregation  Measurement
2      24 / 35 / 44    23 / 36 / 49
4      32 / 44 / 58    33 / 48 / 65
6      57 / 67 / 79    64 / 76 / 104
7      21 / 39 / 55    14 / 29 / 53
10     26 / 35 / 49    29 / 36 / 49
15     20 / 28 / 39    21 / 28 / 40
16     32 / 42 / 65    33 / 42 / 69
20     17 / 21 / 25    14 / 18 / 24
23     20 / 25 / 32    19 / 25 / 32
32     8 / 14 / 18     9 / 11 / 16
3.5 Synthesis of Heat Pump Time Series

One of the motivations for disaggregating heat pump time series from smart
meter measurements is to simulate different control strategies for heat pumps
in the grid in order to evaluate whether grid congestion or other issues can
be solved by demand response. For this purpose, a model emulating normal
heat pump behavior needs to be used that can then be controlled by a control
algorithm. The model from Equation (2.11) is useful; however, the resulting
switching behavior is more regular than that of real heat pumps because
there is no noise in the model.
The noise that occurs in the HHPS, e.g. from opening windows or measurement
noise in the HHPS thermostat, is captured in the distribution of the
switching temperatures. One can synthesize a heat pump time series with
similar statistical properties as the original heat pump by performing a
forward simulation of the HHPS model with one modification: Rather than
using the constant switching thresholds $T_{th}^{high}$ and $T_{th}^{low}$,
the thresholds are sampled. At every switching time, a random sample is
drawn from the distribution of the switching temperature pon,Temp or
poff,Temp and used as the switching threshold for that segment (cf. Figure
3.6 in Section 3.2.3.1).
Additionally, one can sample from distributions restricted to certain time
periods, e.g., the distribution of switching temperatures at certain times
of day, on weekdays or at a certain time of year, to obtain more realistic
noise patterns. However, in the data that was used, the switching
temperatures appeared to be distributed relatively homogeneously over
different time periods. Investigating patterns in these distributions could
provide more insights when using larger data sets.
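The sampling of switching thresholds can be sketched as follows. Since the
model from Equation (2.11) is not reproduced in this section, the sketch
uses a simple first-order thermal model as a stand-in; the parameters tau
and gain, the function name synthesize_on_off and all numerical values are
assumptions for illustration and do not correspond to identified HHPS
parameters. The lists on_temps and off_temps stand in for empirical samples
of pon,Temp and poff,Temp.

```python
import numpy as np

def synthesize_on_off(t_amb, on_temps, off_temps, tau=600.0, gain=0.05,
                      t_start=21.0, seed=0):
    """Forward-simulate on/off states with thresholds resampled at every switch.

    t_amb: ambient temperature per 1-minute step [deg C]
    on_temps / off_temps: empirical samples standing in for p_on,Temp / p_off,Temp
    tau [min], gain [K/min]: assumed parameters of the stand-in thermal model
    """
    rng = np.random.default_rng(seed)
    temp, on = t_start, False
    threshold = rng.choice(on_temps)            # temperature of the next switch-on
    states = np.zeros(len(t_amb), dtype=bool)
    for k, amb in enumerate(t_amb):
        if not on and temp <= threshold:
            on = True
            threshold = rng.choice(off_temps)   # sample the switch-off threshold
        elif on and temp >= threshold:
            on = False
            threshold = rng.choice(on_temps)    # sample the next switch-on threshold
        states[k] = on
        # Stand-in dynamics (assumption): loss to ambient plus heating when on.
        temp += -(temp - amb) / tau + gain * on
    return states

t_amb = np.zeros(1440)                          # one day at 0 deg C (made-up input)
states = synthesize_on_off(t_amb, [20.4, 20.5, 20.6], [21.4, 21.5, 21.6], seed=1)
```

Because the thresholds are redrawn at every switch, the on- and
off-durations vary from cycle to cycle even under constant ambient
conditions, which is the effect the sampling is intended to introduce.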
3.6 Implementation and Further Improvement

Method A is significantly simpler and computationally less expensive than
Method B. The most time consuming component of Method A is the op-
timization problem for finding the optimal HHPS parameters. Instead of
updating the parameters after every step, one can reduce the number of up-
dates. The more progress has been made in the disaggregation, the less the
parameters change, meaning that the updates become less important over
time.
Method B is computationally relatively expensive (e.g., 3 to 15 minutes
for disaggregating approximately 6 weeks). However, computation time can
be significantly reduced with little or no loss in accuracy with the following
adjustments.
• Reduction of output resolution (Methods A and B): Rather than
specifying the disaggregation in a one-minute resolution, a coarser
resolution could be chosen (e.g., 3 minutes or 5 minutes). This
adjustment needs to be performed carefully when considering constraints.
Let a 5-minute resolution be chosen such that only time stamps divisible
by 5 are allowed and let t = 15 be a certain switching point. Using the
model and the constraints, it could be that t ∈ {26, 27, 28, 29} are the
only allowed next switching points. In such a case, the problem would
become infeasible because there are no candidate switching times that are
divisible by 5.
• Saving the HHPS temperature trajectory (Method B): The

HHPS temperature is calculated highly redundantly if one directly
applies the described algorithms, i.e., once for every node in every tree.
Instead, every HHPS temperature trajectory that has been calculated
can be saved and then retrieved when necessary.
• Approximating the HHPS temperature trajectory (Method B): According to
the algorithm, two possible heat pump segments that have similar starting
points each require the HHPS temperature trajectory to be calculated.
Trajectories beginning at similar times show very similar behavior due to
the slow change in ambient temperature and solar irradiance. Therefore, a
practical approximation is to reuse any previously calculated trajectory
for other starting points in a similar time period, e.g., use the same
trajectory for all potential segments beginning between 4:00 pm and
4:15 pm on a certain day. This modification can be applied to Method A as
well, but shows significantly less benefit because Method A has far fewer
HHPS calculations.
Applying this concept to similar weather conditions would likely not
increase speed because one would need to compare the weather conditions
throughout the entire relevant time period.¹⁰
• Save probability and constraint calculations (Method B): The

calculations of the probability estimation (transition probability in the
tree in Method B) are also highly redundant. If two switching paths
both traverse t1 and t2 , the transition probabilities will be the same.
These results can instead be stored and reused.
• Parallelization (Method B): In principle, it is possible to parallelize

Method B because all nodes in the same layer are independent from
each other. By parallelizing the algorithm, redundant calculations (cf.
the previous two bullets) cannot be avoided. In practice, this disag-
gregation is primarily useful for performing multiple disaggregations
for an entire grid area. Therefore, it is more useful to perform disag-
gregations of different houses in parallel (i.e., one house per core) and
optimize total runtime with the other adjustments.
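Two of the adjustments listed above can be sketched briefly: reusing one
trajectory per time bucket, and checking whether a coarser output grid
leaves any feasible switching time. Here simulate_trajectory is a
hypothetical placeholder for the HHPS temperature calculation, and the
15-minute bucket width is an assumption taken from the example of segments
starting within the same quarter of an hour; an actual implementation may be
organized differently.

```python
from functools import lru_cache

BUCKET_MINUTES = 15  # assumed bucket width for reusing trajectories

def simulate_trajectory(start_minute):
    """Hypothetical stand-in for the (expensive) HHPS temperature calculation."""
    simulate_trajectory.calls += 1
    return tuple(20.0 - 0.01 * k for k in range(60))

simulate_trajectory.calls = 0  # counter used only to show the effect of caching

@lru_cache(maxsize=None)
def _trajectory_for_bucket(bucket):
    return simulate_trajectory(bucket * BUCKET_MINUTES)

def cached_trajectory(start_minute):
    """Reuse one trajectory for all starting points in the same time bucket."""
    return _trajectory_for_bucket(start_minute // BUCKET_MINUTES)

def feasible_on_grid(candidates, step):
    """Keep only candidate switching times that lie on the coarser output grid."""
    return [t for t in candidates if t % step == 0]

# Starting points at minute 960, 965 and 974 of a day share one bucket.
for start in (960, 965, 974):
    cached_trajectory(start)

# The infeasibility example from the text: no candidate is divisible by 5.
remaining = feasible_on_grid([26, 27, 28, 29], step=5)
```

An empty list of remaining candidates corresponds to the infeasible case
described for the reduced output resolution, so a practical implementation
would need a fallback, e.g., relaxing the grid for that step.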
Additionally, Method B does not perform well for extreme temperatures,

i.e., when the heat pump is on for very long time periods (very cold ambi-
ent temperature) or when it is on only very rarely (close to the threshold
temperature). In such cases, the HHPS temperature is within a close range
of the switching threshold for a long time. This then causes the tree in the
MHE to become very large and results in very many plausible switching
points. Therefore, the temperature model is not very indicative of switches
in such cases and could be ignored in favor of a shape-based disaggregation.
Furthermore, Methods A and B could be performed backwards as well, i.e.,
beginning with the end of the time series. It may be possible to obtain a
more robust result by combining forward and backward disaggregations;
however, a method would need to be developed for making an appropriate
final selection where the results differ.
¹⁰ For example, if the ambient temperatures at 8pm and 8am are the same, the
HHPS trajectories beginning at 8am and 8pm are likely different because the
ambient temperature will likely increase after 8am and decrease after 8pm.
Comparing the trajectories of weather conditions for similarity may be
computationally more complex than directly calculating a new HHPS
trajectory.
Chapter 4
Boiler Modeling and Disaggregation
Electric water boilers are used to provide hot water supply and are common
in many European countries. In Switzerland in particular, there are an
estimated 1 million electric water boilers installed with a total yearly
energy consumption of approximately 2.2 TWh, therefore providing large
potential for DR [16].
In analogy to Chapters 2 and 3, this chapter aims to explain the typical
behavior and control of boilers, a suitable disaggregation algorithm and a
boiler model that describes their electric behavior. Furthermore, the
quality of the disaggregation is evaluated.
4.1 Boiler Behavior

Typically, electric boilers are controlled by ripple control in such a way
that the boilers are on during times with lower power tariffs. The ripple
control signal is interpreted as a “blocking” or “free” signal, preventing
boiler activity or allowing the boiler to behave according to its internal
controls. Boilers consist of a heating element and a storage tank, thus
permitting the analogy of a battery with a certain state of charge (SOC).
Discharging corresponds to thermal losses and hot water usage, whereas
charging corresponds to operating the heating element. The heating elements
display “on” or “off” behavior with a relatively constant power draw, i.e.,
bang-bang control behavior.
When boilers move from an extended period of being blocked to the
uncontrolled state, the amount of thermal energy in the tank has decreased
since the last heating period (thermal losses and water usage). Water in
the boilers is then heated up until its temperature has reached an internal
threshold analogous to a full SOC or until a charging limit has been
reached, e.g., a maximum heating duration. This process typically lasts for
approximately 30 to 150 minutes¹. In many cases, after an initial long
period at full power, a second segment follows, implying that there may be
effects such as charging limits or non-homogeneous temperature in the tank.
Figure 4.1: Boiler behavior example: Independent and Compensation Period
(PMain,meas and PBoiler,meas; Power [kW] over Time).
During extended periods of being uncontrolled, e.g., on weekends, boilers
switch on in irregular intervals for short periods of time, ranging from 3 to
30 minutes. At night, the boilers tend to switch on less often than during the
day, implying that the switching frequency and duration depend on the hot
water consumption. Furthermore, hot water consumption is highly volatile,
making predictions about the timing of the next heating period of a boiler
difficult.
For the disaggregation algorithm, it is useful to define two time periods
with different characteristic behavior:
• Compensation Period: Night time at the end of a day in which the boiler
was blocked previously. The boiler must compensate for the energy loss
over the course of the day when the boiler was blocked.
• Independent Period: The remaining uncontrolled times, e.g., during the
day on holidays. The boiler behaves independently of any external
control, e.g., ripple control, and there are no remaining effects from
the last period when the boiler was blocked.
Figure 4.2: Histogram of the on-duration of a boiler during the Independent
Periods for one house (horizontal axis: On-Duration [minutes]).
¹ Based on the data used for this thesis.
The two types of behavior are shown in Figure 4.1. Furthermore, the
on-durations of boilers in the Independent Period are usually consistent, as
shown in Figure 4.2. The periods are results of the grid operator’s blocking
signal, which is known to the grid operator. The boiler periods during the
Compensation Period can also be detected easily by evaluating a histogram
of the main measurement signal over the course of all days of the week,
i.e., for all Mondays, all Tuesdays etc. Large power levels that recur at
certain times of day are likely to be caused by the boiler, e.g. when
PMain,meas ≈ Pboiler,r between 00:30 and approximately 01:30 on all days
except Sunday and Monday. The Compensation Period can be chosen as
beginning at 00:30 on those days and ending several hours later. The ending
time of the Compensation Period must be chosen to be later than the latest
detected boiler on-segment during the possible Compensation Periods. The
Independent Periods then occur on days when there is no detectable
Compensation Period at the typical time of day for a Compensation Period.
For the previous example, the Independent Period begins on Saturday around
5am, or when the Compensation Period ends, and ends on Monday morning
around 5am. This is chosen because there is a Compensation Period in the
night from Monday to Tuesday, i.e., the boiler is blocked during the day on
Mondays, and there are no Compensation Periods in the nights from Saturday
to Sunday or Sunday to Monday. For more irregular ripple control schemes, a
more detailed estimation algorithm would need to be developed.
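The detection of recurring boiler activity by weekday and time of day could
be sketched as follows. The sketch assumes 15-minute data with a known
weekday index (0 = Monday) and slot index (0 to 95) per sample. The function
name recurring_boiler_slots, the tolerance tol and the minimum share
min_share are hypothetical choices, and the synthetic data only mimic the
example from the text; this is not the procedure used to produce the results
in this thesis.

```python
import numpy as np

def recurring_boiler_slots(p_main, weekday, slot, p_boiler_r,
                           tol=0.2, min_share=0.8):
    """Mark (weekday, slot) cells where power near the rated boiler power recurs.

    Returns a boolean array of shape (7, 96).
    """
    p_main = np.asarray(p_main, dtype=float)
    weekday = np.asarray(weekday, dtype=int)
    slot = np.asarray(slot, dtype=int)
    # Assumption: the main signal counts as "boiler-like" above (1 - tol) * P_r.
    high = (p_main >= (1.0 - tol) * p_boiler_r).astype(float)
    hits = np.zeros((7, 96))
    counts = np.zeros((7, 96))
    np.add.at(hits, (weekday, slot), high)
    np.add.at(counts, (weekday, slot), 1.0)
    share = np.divide(hits, counts, out=np.zeros_like(hits), where=counts > 0)
    return share >= min_share

# Synthetic example: four weeks, boiler on from 00:30 to 01:30 except Sun/Mon.
n_days = 28
weekday = np.repeat(np.arange(n_days) % 7, 96)
slot = np.tile(np.arange(96), n_days)
p_main = np.full(n_days * 96, 0.3)                       # base load in kW
boiler_on = np.isin(weekday, [1, 2, 3, 4, 5]) & (slot >= 2) & (slot < 6)
p_main[boiler_on] += 5.0                                 # rated boiler power in kW
mask = recurring_boiler_slots(p_main, weekday, slot, p_boiler_r=5.0)
```

Days without any marked slot at the typical time of night would then be
candidates for Independent Periods, in line with the rule described above.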
Similar to a heat pump, a boiler is intended to keep water at a certain
temperature level, and therefore the assumption of a switching model based
on threshold temperatures is justified. Unlike the air temperature in
residential homes, the set point temperature cannot be estimated well
because it can vary significantly. Online resources provide temperature
references that vary between 45 °C and 65 °C, e.g., [17]. Furthermore, hot
water consumption in a house cannot be observed directly from the electric
consumption of the boiler.
4.2 Disaggregation Algorithm

Disaggregation is only necessary during time periods where ripple control
gives a “free” signal to the boiler. Given that there are two distinct types of
behavior shown by boilers, it is useful to define two separate disaggregation
algorithms for the Compensation Periods and the Independent Periods.
As with Method B for the heat pump, the disaggregation algorithm uses
15-minute data from the main measurement PMain,meas as input data and
provides a resulting disaggregated boiler power time series Pboiler,disagg in
1-minute resolution. The evaluation is also performed on a 1-minute basis.
4.2.1 Compensation Period

During the Compensation Period, the disaggregation can be performed in
a very simple manner for multiple reasons: Human activity rarely causes
any significant surges in electric power consumption in the Compensation
Period, because it occurs at night. Therefore the non-controllable compo-
nent of the main signal is smooth and small in most cases. Since the boiler
tank usually loses a large amount of energy between two heating periods,
the first on-segment of a boiler during a Compensation Period lasts for ap-
proximately one to 2.5 hours. Given that the rated boiler power is typically
larger than that of all other electric devices, this signal is easy to detect.
The disaggregation process is given by Algorithm 9.
Algorithm 9: Disaggregation of boiler in Compensation Period
Data: Time series PMain,meas
Result: Disaggregated time series Pboiler,disagg
if PMain,meas jumps from PBL,previous to approximately
   Pboiler + PBL,previous then
    calculate beginning and ending time by minimizing change of NC load;
Define Pboiler,disagg(t) = Pboiler,r if boiler is switched on at time t,
                           0 else
The main idea of Algorithm 9 is to identify all points in time when the
main measurement jumps up from its previous base load level PBL,previous by
approximately Pboiler. The specific starting time of the boiler on-segment
is then calculated in 1-minute resolution such that the NC load remains
constant in the 15-minute block before and during the switch. The
assumption that the NC load stays constant is plausible because the
Compensation Period typically takes place at night when people rarely
operate other devices with large electric consumption. The ending time of
the on-segment is calculated analogously, and the resulting disaggregated
time series Pboiler,disagg(t) is calculated as Pboiler,r when on and zero
when off.
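Under the assumption of a constant NC load, the switching minute within a
15-minute block follows directly from the block averages. The sketch below
uses this closed form as a simplified stand-in for minimizing the change of
the NC load; the function names and the example values are hypothetical.

```python
def switch_on_minute(p_block_before, p_block_switch, p_boiler_r, block_len=15):
    """Minute within the switching block at which the boiler turns on.

    Assumes the NC load in the switching block equals the mean of the
    preceding block, so the excess power is caused by the boiler alone.
    """
    frac_on = (p_block_switch - p_block_before) / p_boiler_r
    frac_on = min(max(frac_on, 0.0), 1.0)       # clip to a valid share of the block
    return round(block_len * (1.0 - frac_on))

def switch_off_minute(p_block_switch, p_block_after, p_boiler_r, block_len=15):
    """Minute within the switching block at which the boiler turns off.

    Analogous assumption: the NC load equals the mean of the following block.
    """
    frac_on = (p_block_switch - p_block_after) / p_boiler_r
    frac_on = min(max(frac_on, 0.0), 1.0)
    return round(block_len * frac_on)

# Made-up example in kW: base load 0.4, rated boiler power 5.0.
start = switch_on_minute(0.4, 2.4, 5.0)   # block mean 2.4 -> boiler on for 6 of 15 min
end = switch_off_minute(1.4, 0.4, 5.0)    # block mean 1.4 -> boiler on for 3 of 15 min
```

In the example the boiler is estimated to switch on 9 minutes into the
first block and to switch off 3 minutes into the last block.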
4.2.2 Independent Period

Disaggregation is significantly more difficult during the Independent Periods,
due to the fact that the power spikes of the boiler are usually significantly
shorter. Furthermore, the spike duration varies from boiler to boiler, ranging
from 3 minutes to 30 minutes, which is lower than or in the range of the data
resolution. Boiler activity depends significantly on hot water consumption,
which is highly stochastic. Therefore, an approach based on signal shape
was chosen. The algorithm is identical to the detection of the initial guess
for heat pumps (cf. Section 3.3.3), with two modifications:
• Pboiler replaces PHP
• On-segments longer than 30 minutes are discarded
The basic idea is to detect possible candidates with stair-shaped periods
or jumps in the main measurement that are likely to be caused by the boiler
with known rated power. Longer potential boiler on-segments are ignored
because they do not correspond to typical boiler behavior during the
Independent Period and are likely caused by other devices. The boiler
usually has short on-segments during the Independent Period because it only
compensates for the energy lost from the tank since the last on-segment. By
definition, the energy lost since the last on-segment during the
Independent Period must be significantly smaller than the energy lost
before a Compensation Period. Therefore, an upper threshold for the maximum
duration was chosen. All boiler on-segments in the Independent Periods in
the data set used are shorter than 30 minutes. The algorithm is based solely
on the shape of the signal.
4.3 Evaluation of Boiler Disaggregation

As with the disaggregation of the heat pump, the boiler disaggregation is
evaluated by comparing the measured and disaggregated boiler time series
in 1-minute resolution. Disaggregation of boiler activity in the Independent
Period improves significantly when the duration of the spikes is sufficiently
long and the boiler is the largest individual consumer that operates
regularly. For boilers with short spikes, it is very difficult to
detect when they may have been on. For example, if a boiler is on for 3 minutes
and draws 4 kW, the resulting increase in the 15-minute main measurement
block is merely 800 W. If the boiler activity is spread across two 15-minute
blocks, the increase in each block may be barely noticeable. If, additionally,
a heat pump is contained in the main measurement, such small changes
cannot be attributed to the boiler. In many cases, the boiler is the
largest consumer in a household, often larger than the heat pump. However,
in households with smaller boilers (approximately 3.5 kW to 4.5 kW) where
heat pumps exist as well, the resulting characteristics of the boiler and the
heat pump in the main measurement PMain,meas can become similar to each
other and the quality of the disaggregation decreases.
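The 800 W figure follows from averaging the rated power over one 15-minute block. The following sketch reproduces that arithmetic; the function name is hypothetical, and the even split of the on-segment across two blocks is an assumption chosen only to illustrate the second case.

```python
def block_increase_w(
    rated_power_w: float, on_minutes: float, block_minutes: float = 15.0
) -> float:
    """Average power increase in one block caused by on_minutes of boiler activity."""
    return rated_power_w * on_minutes / block_minutes


# A 4 kW boiler that is on for 3 minutes, entirely inside one 15-minute block.
single_block = block_increase_w(4000.0, 3.0)

# The same 3 minutes split evenly across two neighbouring blocks (assumed split).
split_blocks = [block_increase_w(4000.0, 1.5), block_increase_w(4000.0, 1.5)]
```

With the even split, each block rises by only half of the single-block value, which is easily masked by other loads.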
Therefore, a practical measure of the disaggregation accuracy is whether
or not there is overlap between the disaggregated and the measured on-
segments, i.e., pbd>0 and pbm>0 . Since some boilers switch on only for
very short time periods, the percentage of overlap is less crucial: If the
boiler is on from t=0 until t=8 according to the measurement but from t=3
until t=11 according to the disaggregation, the percentage of overlap would
imply the disaggregation is mediocre. However, in a time series in 15-minute
resolution, the disaggregation would likely be identical or very similar to
the measurement. Furthermore, for most disaggregated
on-segments, the duration is very close to the measured duration (i.e., within
a few minutes).
Figure 4.3 shows the precision measures pbd>0 and pbm>0 in relation to
the median duration of a boiler spike². The disaggregation was performed
on all houses in the data set with the minimum duration of a spike set to
8 minutes; houses that did not show a significant number of on-segments
were omitted.
Subfigure 4.3a shows the percentage of on-segments from the disaggre-
gation with at least one point where the boiler is on according to the mea-
surement, with each point corresponding to one house. The blue data points
correspond to houses where there either is no heat pump or the boiler’s rated
power is significantly larger than that of the heat pump. In these cases, the
longer the median duration of a boiler spike is, the better the disaggregation
becomes. The orange data points correspond to houses where the difference
between the rated boiler power Pboiler,r and the rated heat pump power PHP,r
is less than approximately 1.5 kW. This causes some heat pump activity to be
interpreted as boiler activity. With these particular heat pumps, the
activity pattern contains
brief periods with significantly increased power consumption. While this is
an error in the sense that heat pump activity is mistakenly attributed to
the boiler, it is a good disaggregation in the sense that segments from
controllable thermal loads are identified as such. In summary, the on-segments
detected with this method are usually accurate.
Subfigure 4.3b shows the percentage of measured on-segments that were
detected by the disaggregation algorithm (i.e., where overlap exists). For
boilers with a high median duration, most of the measured on-segments
are detected. However, for boilers with shorter median on-durations, a large
part of the measured on-segments are not detected.
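The two overlap-based scores can be sketched as follows. This is an illustrative reading of the definitions above, not the evaluation code of the thesis: segments are assumed to be (start, end) pairs in minutes with an exclusive end, the function names are hypothetical, and the example segments are invented.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (start, end) in minutes, end exclusive


def overlaps(a: Segment, b: Segment) -> bool:
    """True if the two segments share at least one point in time."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_score(segments: List[Segment], reference: List[Segment]) -> float:
    """Fraction of `segments` that overlap at least one segment in `reference`."""
    if not segments:
        return float("nan")
    hits = sum(any(overlaps(s, r) for r in reference) for s in segments)
    return hits / len(segments)


measured = [(0, 8), (100, 104), (200, 212), (300, 303)]
disaggregated = [(3, 11), (198, 210), (400, 409)]

# Share of disaggregated on-segments confirmed by the measurement (Subfigure 4.3a).
p_bd = overlap_score(disaggregated, measured)
# Share of measured on-segments found by the disaggregation (Subfigure 4.3b).
p_bm = overlap_score(measured, disaggregated)
```

Here two of the three disaggregated segments are confirmed by the measurement, while only two of the four measured segments are found, which mirrors the observation that short on-segments tend to be missed.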
The conclusion is that the boiler disaggregation is very accurate in
the on-segments that it detects, in particular when the boiler's rated power
is significantly larger than that of other thermal loads that may be present.
However, obtaining a complete disaggregation where most of the on-segments
are detected is only consistently possible if the boiler is on for a
sufficient duration.
² The median duration according to the measurement, not the disaggregation.
[Figure 4.3: Correlation between precision of boiler disaggregation and
duration of spike. (a) Precision by disaggregation: precision score pbd,>0
versus duration in minutes. (b) Precision by measurement: precision score
pbm,>0 versus duration in minutes. Legend: No HP interference, HP interference.]
4.4 Dynamic Boiler Model

In this context, the purpose of a dynamic boiler model is to create realis-
tic consumption patterns in one minute resolution and provide means for
exercising differing control schemes. The boiler model is based on the dis-
aggregated boiler time series from the Independent Period and therefore
reflects the consumption pattern of the device and the behavior of the in-
habitants. A model for boiler behavior during the Compensation Period is
not beneficial. Synthesized time series in the Compensation Period can be
generated by “copying” disaggregated boiler behavior in such periods.
The boiler is modeled analogously to a battery. It contains a certain
amount of thermal energy which is at its maximum at the temperature set
point (analogous to SOC=100%). Energy can be extracted, either because
of the tank’s thermal losses or hot water consumption (analogous to dis-
charging). In the Independent Period, it heats the water again until the
temperature set point is reached (analogous to charging when SOC=0%).
Along the lines of this approach, the following assumptions are made:
• When boilers switch on during the Independent Period, the amount of
energy added to the water tank during the on-segment is equal to the
amount of energy extracted from the boiler since it last switched off.
• The boiler switches on whenever a certain amount of energy has been
extracted since it last switched off.
• Energy extracted from the water tank between two on-segments of the
boiler is consumed with constant power.
The first assumption is justified by physics and the assumption that the
boiler is set to keep water at a fixed set point temperature. The second
assumption is plausible given the distribution of on-durations (cf. Figure
4.2). In most cases, the boiler is on for approximately the same duration,
i.e., a similar amount of energy is added to the boiler every time. The
deviations of the on-durations can be modelled as a disturbance to the SOC
measurement. The third assumption is false whenever there is hot water
consumption; however, it is a practical simplification that makes it
possible to calculate a hot water consumption pattern.
With these assumptions, one can calculate the mean power extracted
from the boiler during the Independent Periods:
Let toff,i be the end of the ith on-segment of the boiler (i.e., switching
off), ton,i+1 be the beginning of the subsequent (i + 1)th on-segment and
toff,i+1 be the end of the (i + 1)th on-segment. The power consumption of the
boiler can then be calculated as
\[ P_{boiler}(t) = \begin{cases} P_{boiler,r} & \text{if } t_{on,i+1} \le t \le t_{off,i+1} \\ 0 & \text{if } t_{off,i} \le t < t_{on,i+1} \end{cases} \tag{4.1} \]
and the electric energy consumed in the (i + 1)th on-segment as
\[ E_{boiler,i+1} = P_{boiler,r} \, (t_{off,i+1} - t_{on,i+1}). \tag{4.2} \]
With the assumptions as stated above, this value is equal to the energy
extracted from the boiler, i.e.,
\[ E_{boiler,ext,i+1} = E_{boiler,i+1}. \tag{4.3} \]
Eboiler,ext,i+1 is therefore the amount of energy extracted from the boiler
between toff,i and toff,i+1 due to thermal losses and hot water consumption.
The mean power extracted from the boiler in this time period is then
given as
\[ P_{boiler,i+1} = \frac{E_{boiler,i+1}}{t_{off,i+1} - t_{off,i}}. \tag{4.4} \]
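As a worked illustration of Equations (4.2) and (4.4), the following sketch computes the mean extracted power for each off-on cycle from a list of boiler on-segments. The function name, the use of hours and kW as units, and the example values are assumptions made for this sketch.

```python
from typing import List, Tuple


def mean_extracted_power(
    on_segments: List[Tuple[float, float]], rated_power_kw: float
) -> List[Tuple[float, float, float]]:
    """Return (t_off_i, t_off_next, mean extracted power in kW) per off-on cycle.

    on_segments: chronologically sorted (t_on, t_off) pairs in hours.
    """
    cycles = []
    for (_, t_off_prev), (t_on, t_off) in zip(on_segments, on_segments[1:]):
        energy_kwh = rated_power_kw * (t_off - t_on)  # Eq. (4.2)
        mean_kw = energy_kwh / (t_off - t_off_prev)   # Eq. (4.4)
        cycles.append((t_off_prev, t_off, mean_kw))
    return cycles


# Hypothetical on-segments of a 4 kW boiler: 15, 15 and 30 minutes long.
cycles = mean_extracted_power(
    [(0.0, 0.25), (2.0, 2.25), (5.0, 5.5)], rated_power_kw=4.0
)
```

In the first cycle, 1 kWh is recharged after a cycle length of 2 hours, so the mean extracted power is 0.5 kW.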
Matching all such time periods to the respective time of day, a daily
consumption pattern can be extracted as shown in Figure 4.4. It displays
the mean power consumed Pboiler,i of all off-on-cycles at the respective time
of day. The black line shown is the mean of all consumption occurring at
that time of day.
[Figure 4.4: Daily Consumption Pattern. Power in kW (0.00 to 1.50) versus
time of day (0:00 to 21:00); legend: Pext,mean.]
A boiler model is designed as
\[ P_{boiler}(t) = P_{boiler,r} \cdot u(E_{boiler,ext,recent}(t)), \tag{4.5} \]
with the input u ∈ {0, 1} depending on the cumulative hot water
consumption since the last time the boiler switched off. The energy
extracted since the boiler last switched off is given as
\[ E_{boiler,ext,recent}(t) = \int_{t_{last}}^{t} P_{extr}(\hat{t}) \, d\hat{t}, \tag{4.6} \]
where tlast is the end of the last on-segment and Pextr is the power
extracted from the boiler. To simulate a realistic time series, Pextr is obtained
by sampling the consumption profiles from one entire day (cf. Figure 4.4).
The reason for taking daily profiles is that hot water consumption at differ-
ent times of day is not independent of each other. For example, a person
may take a shower at 6pm on one day and at 8pm on another day, but will
likely not shower at 6pm and 8pm on the same day.
The switching behavior and thereby also the synthesized boiler time
series is then given by Algorithm 10. The algorithm acts as a load profile
generator which produces similar behavior to the original boiler based
on the disaggregation from the Independent Period.
In line 3, the variation in the on-durations is sampled in order to obtain
the same stochastic behavior as the original boiler. Line 4 is the equality
between the extracted energy and the added energy and line 5 calculates the
time when the recently extracted energy is equal to the energy added in the
next on-segment. Finally, the switching points are calculated assuming that
the duration of the off-on-cycle is t∗ − toff,i and the boiler is on for tj.
Algorithm 10: Boiler switching algorithm
Data: Set of boiler on-durations Tboiler,on, boiler consumption model
from Independent Period, rated boiler power Pboiler,r
Result: Synthesized boiler time series Pboiler,synth(t)
1 Let toff,i be the time the boiler last switched off or the beginning of
  the time series;
2 while time series incomplete do
3     Get random sample tj as duration of next boiler on-segment from
      distribution of boiler on-durations;
4     Set Eextr,i+1 := Pboiler,r tj as energy consumed in next on-segment;
5     Get thermal boiler consumption Pboiler,ext(t) from consumption
      model until t∗, s.t. \int_{t_{off,i}}^{t^*} P_{boiler,ext}(t) \, dt = E_{extr,i+1};
6     Set Pboiler,synth(t) = 0 for t ∈ [toff,i, t∗ − tj] and
      Pboiler,synth(t) = Pboiler,r for t ∈ [t∗ − tj, t∗];
7     Update toff,i := t∗;
8     if time is at the end of a day then
9         Draw a new sample from the daily consumption patterns Pext(t)
The resulting time series is very similar to the original time series. By
design, the energy consumption over the course of any day is identical to
the energy consumption of the day that was selected from the consumption
pattern³. However, because the on-durations of the next on-segments are
sampled, the specific switching points differ.
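A possible discrete-time reading of Algorithm 10 is sketched below. It is an illustration under several assumptions that do not come from the thesis: a time step of one minute, daily extraction profiles given as lists of 1440 per-minute values in kW, one profile drawn per simulated day in advance (instead of at run time in lines 8 and 9), and hypothetical function and parameter names. If extraction is faster than the rated power allows, the on-segment is clipped so that it does not start before the previous switch-off.

```python
import random
from typing import List, Sequence

MINUTES_PER_DAY = 1440


def synthesize_boiler(
    on_durations_min: Sequence[int],
    daily_profiles_kw: Sequence[Sequence[float]],
    rated_power_kw: float,
    n_days: int,
    seed: int = 0,
) -> List[float]:
    """Synthesize a per-minute boiler power series in kW (sketch of Algorithm 10)."""
    rng = random.Random(seed)
    # Lines 8-9: one daily extraction profile is drawn for every simulated day.
    p_ext: List[float] = []
    for _ in range(n_days):
        p_ext.extend(rng.choice(daily_profiles_kw))
    n = len(p_ext)
    synth = [0.0] * n
    t_off = 0                                   # line 1
    while t_off < n:                            # line 2
        t_j = rng.choice(on_durations_min)      # line 3
        e_target = rated_power_kw * t_j         # line 4, in kW * min
        # Line 5: advance t_star until the extracted energy reaches e_target.
        extracted, t_star = 0.0, t_off
        while t_star < n and extracted < e_target:
            extracted += p_ext[t_star]
            t_star += 1
        if extracted < e_target:
            break  # the horizon ends before the next on-segment is complete
        start = max(t_off, t_star - t_j)
        for t in range(start, t_star):          # line 6
            synth[t] = rated_power_kw
        t_off = t_star                          # line 7
    return synth


# Hypothetical input: constant extraction of 0.5 kW, 10-minute on-segments, 4 kW boiler.
profile = [0.5] * MINUTES_PER_DAY
synth = synthesize_boiler([10], [profile], rated_power_kw=4.0, n_days=1)
```

With these inputs, 40 kW·min are extracted every 80 minutes, so the boiler is on during the last 10 minutes of each 80-minute cycle and the daily energy added equals the daily energy extracted.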
In order to apply different control schemes to the boiler, Algorithm 10
can be modified and phrased as a dynamic model as
\[ E_{state}(t+1) = E_{state}(t) - P_{boiler,ext}(t) + P_{boiler,r} \, u(t), \tag{4.7} \]
where Estate is the energy contained in the boiler and u(t) is the input
which indicates whether the boiler is active or inactive. The thermal boiler
consumption Pboiler,ext(t) is taken from the boiler profile in the same way
as in Algorithm 10. A control scheme can be designed as
\[ u(t+1) = \begin{cases} 1 & \text{if } E_{state}(t) < E_{th} \text{ and } u_{block} = 0 \\ 0 & \text{if } E_{state}(t) > 0 \\ u(t) & \text{else,} \end{cases} \tag{4.8} \]
for a threshold Eth and a blocking variable ublock. The threshold Eth can
be chosen as Pboiler,r tboiler,on,ind with tboiler,on,ind being the average duration
the boiler is on during the Independent Period. Furthermore, the initial
state of the boiler energy is defined as Estate(0) = 0 as an arbitrary initial
point. The boiler can be controlled externally with the variable ublock, a
binary variable which allows or blocks the boiler from switching on.
The resulting boiler behavior is given by
\[ P_{boiler}(t) = \begin{cases} 0 & \text{if } u(t) = 0 \\ P_{boiler,r} & \text{else.} \end{cases} \tag{4.9} \]
In this manner, DR simulations can be performed with differing boiler
control schemes. The resulting boiler is designed to heat up until it reaches
its initial temperature, referenced equivalently by its initial energy content
Estate(0).
³ Strictly speaking, there may be small quantization errors due to energy
"debt" incurred from a previous day.
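A forward simulation of Equations (4.7) to (4.9) might look like the following sketch. Several points are assumptions of this illustration rather than statements from the thesis: the time step is one minute, energies are expressed in kW·min, and the threshold is passed as a signed energy level relative to the initial state Estate(0) = 0. Because extraction drives the state below zero, the example uses a negative threshold (a deficit of Pboiler,r times 10 minutes). All function and parameter names are hypothetical.

```python
from typing import List, Sequence


def simulate_boiler(
    p_ext_kw: Sequence[float],
    rated_power_kw: float,
    e_th: float,
    u_block: Sequence[int],
    e0: float = 0.0,
    u0: int = 0,
) -> List[float]:
    """Simulate boiler power per minute; e_th is a signed level (assumption)."""
    e, u = e0, u0
    power = []
    for p_ext, blocked in zip(p_ext_kw, u_block):
        power.append(rated_power_kw if u == 1 else 0.0)  # Eq. (4.9)
        e_next = e - p_ext + rated_power_kw * u          # Eq. (4.7), unit time step
        # Eq. (4.8): the cases are evaluated in order on the current state.
        if e < e_th and blocked == 0:
            u_next = 1
        elif e > 0:
            u_next = 0
        else:
            u_next = u
        e, u = e_next, u_next
    return power


steps = 100
free = simulate_boiler([0.5] * steps, 4.0, e_th=-40.0, u_block=[0] * steps)
blocked = simulate_boiler([0.5] * steps, 4.0, e_th=-40.0, u_block=[1] * steps)
```

In the unblocked run the boiler switches on once the deficit exceeds 40 kW·min and switches off after the state has risen above zero. In the blocked run it never switches on, which is the mechanism a DR scheme would use to shift the on-segment.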
Chapter 5
Combined Disaggregation of
Boiler and Heat Pump
Boilers and heat pumps are two of the most common large electric consumers
in many households, with battery electric storage systems and electric car
chargers becoming more frequent as well.
Disaggregating the smart meter measurements becomes more difficult
when multiple large consumers are present, because they can have similar
characteristics in the time series and cause superpositions. Alleviating this,
however, are the following aspects:
• Boilers in Switzerland are controlled by ripple control, causing them
to be blocked most of the time.
• Boilers often have higher rated power than heat pumps.
• Most boilers are on for shorter time periods than heat pumps when
they are on in their uncontrolled mode.
As described in Chapter 4, the disaggregation of the boiler time series in
the Compensation Period is usually relatively simple. However, there still
remains the issue of disaggregating both the heat pump and the boiler in
the Independent Period, i.e., typically from Saturday morning until Monday
morning in most weeks.
Algorithm 11 shows the approach for the combined disaggregation. As
input variables, the rated power of the heat pump and the boiler as well as
the measured main time series are used. The results of the algorithm are
disaggregated time series for the heat pump PHP,disagg, the boiler Pboiler,disagg
and the remaining non-controllable component PNC. The elements
detectable with the highest certainty should be disaggregated first as this helps
to reduce noise for the following disaggregations. The following disaggregations
are then performed based on the part of the main measurement that has not
yet been attributed to a device.
As described in Subsection 4.2.1, the boiler behavior in the Compensation
Period is very easily detected. Furthermore, the initial guess of the heat
pump is also designed to have high certainty. Therefore, these two components
are disaggregated first. Further auxiliary variables are introduced in
the algorithm.
If the boiler power is larger than the heat pump power by a margin of at
least ∆P ≈ 0.1 Pboiler,r, the next most characteristic on-segments are boiler
spikes during the Independent Period, as long as they are sufficiently long.
When the detected boiler segments are shorter than a certain threshold, they
are discarded because they are not detectable with sufficient certainty and an
incorrect boiler disaggregation may then have a negative impact on the heat
pump disaggregation, e.g., when the boiler disaggregation algorithm detects
a boiler spike during what actually is heat pump activity. In this case, the
power level of the disaggregated main measurement may be too low for heat
pump activity to be feasible. Next, the heat pump is disaggregated from the
remaining part of the main measurement that has not been attributed to a
device yet.
When the heat pump rated power is similar to or larger than the boiler
rated power (else-block), the heat pump is disaggregated before the boiler
because its larger size makes it detectable with higher certainty than the
boiler. Finally, the non-controllable time series is the remainder of the power
measurement.
If the disaggregated boiler time series is accurate, the heat pump disag-
gregation provides slightly better results after the boiler disaggregation than
with the boiler included, because the noise has been reduced.
Algorithm 11: Combined disaggregation of heat pump and boiler
Data: Rated powers PHP,r and Pboiler,r, time series PMain,meas
Result: Disaggregated time series PNC, PHP,disagg and Pboiler,disagg
1 Get initial guess for heat pump PHP,Init.Guess and set the temporary
  auxiliary time series P̃Main,meas = PMain,meas − PHP,Init.Guess;
2 Disaggregate boiler in Compensation Period as Pboiler,disagg,Comp from
  P̃Main,meas;
3 Set P̃Main,meas = PMain,meas − PHP,Init.Guess − Pboiler,disagg,Comp;
4 if PHP,r < Pboiler,r − ∆P then
5     Disaggregate boiler in Independent Period from P̃Main,meas if
      possible and combine with Pboiler,disagg,Comp to obtain the total
      boiler time series Pboiler,disagg;
6     Set P̃Main,meas = PMain,meas − Pboiler,disagg;
7     Disaggregate heat pump PHP,disagg from P̃Main,meas;
8 else
9     Disaggregate heat pump from PMain,meas − Pboiler,disagg,Comp to
      obtain PHP,disagg;
10    Set P̃Main,meas = PMain,meas − Pboiler,disagg,Comp − PHP,disagg;
11    Disaggregate boiler in Independent Period if possible from
      P̃Main,meas, combine with Pboiler,disagg,Comp to obtain total boiler
      time series Pboiler,disagg;
12 Set PNC = PMain,meas − PHP,disagg − Pboiler,disagg;
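The control flow of Algorithm 11 can be sketched as an orchestration function that receives the individual disaggregation steps as callables. This is only a structural illustration: the four callables stand in for the methods of Chapters 3 and 4 and are replaced here by trivial threshold stubs, and all function names, the list-based time series and the example values are assumptions.

```python
from typing import Callable, List, Tuple

Series = List[float]
Step = Callable[[Series], Series]


def subtract(a: Series, *others: Series) -> Series:
    """Element-wise a minus all series in others."""
    out = list(a)
    for o in others:
        out = [x - y for x, y in zip(out, o)]
    return out


def add(a: Series, b: Series) -> Series:
    return [x + y for x, y in zip(a, b)]


def combined_disaggregation(
    p_main: Series, p_hp_r: float, p_boiler_r: float,
    hp_initial_guess: Step, boiler_comp: Step, boiler_indep: Step, hp_disagg: Step,
    margin_factor: float = 0.1,
) -> Tuple[Series, Series, Series]:
    delta_p = margin_factor * p_boiler_r
    p_hp_init = hp_initial_guess(p_main)                               # line 1
    p_b_comp = boiler_comp(subtract(p_main, p_hp_init))                # line 2
    if p_hp_r < p_boiler_r - delta_p:                                  # line 4
        p_b_ind = boiler_indep(subtract(p_main, p_hp_init, p_b_comp))  # lines 3, 5
        p_boiler = add(p_b_comp, p_b_ind)
        p_hp = hp_disagg(subtract(p_main, p_boiler))                   # lines 6-7
    else:
        p_hp = hp_disagg(subtract(p_main, p_b_comp))                   # line 9
        p_b_ind = boiler_indep(subtract(p_main, p_b_comp, p_hp))       # lines 10-11
        p_boiler = add(p_b_comp, p_b_ind)
    p_nc = subtract(p_main, p_hp, p_boiler)                            # line 12
    return p_nc, p_hp, p_boiler


# Trivial stand-in steps (hypothetical), values in kW.
p_main = [1.0, 5.0, 1.0, 3.0]
p_nc, p_hp, p_boiler = combined_disaggregation(
    p_main, p_hp_r=2.0, p_boiler_r=4.0,
    hp_initial_guess=lambda s: [0.0] * len(s),
    boiler_comp=lambda s: [4.0 if x >= 4.5 else 0.0 for x in s],
    boiler_indep=lambda s: [0.0] * len(s),
    hp_disagg=lambda s: [2.0 if 2.5 <= x < 4.5 else 0.0 for x in s],
)
```

By construction, the three returned series always sum to the main measurement, independent of which branch is taken.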
Chapter 6
Conclusion and Outlook
In this thesis, novel methods for disaggregating and modeling heat pumps
and boilers were developed using coarse smart meter measurements. For
this purpose, a physical, dynamic heat pump model and a parameter esti-
mation technique were designed. These methods constitute a novel approach
in the realm of non-intrusive load monitoring due to their ability to cope
with very coarse measurement data (average active power in one minute
or fifteen minute intervals). Unlike existing disaggregation methods, which
often focus on transient signals in the range of milliseconds, this method
combines information from the shapes of signals with model based Bayesian
and moving horizon estimation approaches. This is well suited for extracting
useful information in spite of the data resolution being multiple orders of
magnitude lower than for typical approaches in non-intrusive load monitor-
ing. Finally, a method for synthesizing realistic heat pump and boiler time
series was developed by detecting characteristic noise patterns specific to
the respective appliances. This enables realistic forward simulations which
can then be used to evaluate DR control strategies in distribution grids.
In spite of these benefits, there are limitations to consider: due to the small
amount of measurement data from the disaggregated devices, it is uncertain
whether other heat pumps might behave differently. Two of the heat pumps
in the data set showed surges in power consumption or occasionally a second
rated power level. However, the number of data points was too small to iden-
tify a model for that behavior. In order to compensate for the small amount
of available data, the methods were developed to be very general rather than
tailored to individual characteristics from individual heat pumps.
There is great potential for future research expanding on this work by
including other large consumers such as electric vehicle chargers or bat-
tery storage systems. Additionally, this method may aid in evaluating the
benefit of DR for large grid areas. As more data becomes available, more
components can be included into the model, such as differentiating between
weekends and weekdays, or periods with low or high non-controllable electric
consumption. Furthermore, if more data with direct measurements of
individual large appliances becomes available, machine learning methods could
be applied. Continuing research in this direction can provide a further
improved basis for more cost-efficient grid planning and operation.
List of Figures
2.1 Heat pump behavior examples from three houses
2.2 Working principle of a heat pump, adapted from [10], simplified
2.3 Synthesized heat pump behavior
2.4 Heat pump and temperature model behavior with error
2.5 Objective value as function of a
2.6 Objective value as function of a and b
3.1 Histogram of jumps detected in main measurement of house AEK32
3.2 Kernel density estimation of jumps detected in main measurement of house AEK32
3.3 Example of initial guess and detected spikes (house AEK32)
3.4 Heat pump signal during on-segments in initial guess
3.5 HHPS disaggregation behavior, based on initial guess
3.6 Histogram and probability density function of end-of-segment temperatures TiOn,fin and TjOff,fin
3.7 Original and modified probability estimations for the jump sizes in house AEK32
3.8 Time series of disaggregated main and heat pump time series
3.9 HHPS disaggregation behavior (first iteration)
3.10 HHPS disaggregation behavior (second iteration)
3.11 Example of Heat Pump in 15-minute Main Measurement
3.12 Initial guess for short heat pump with indicator
3.13 Simplified finite state machine for detection of long segments for initial guess
3.14 Simplified example of tree for horizon length chorizon = 3
3.15 Simplified example of reduced tree
3.16 Simplified example of normalized reduced tree
3.17 Simplified example of tree showing inferior nodes in gray and dominating nodes in cyan
3.18 Probability factor for minimization of change in non-controllable load
3.19 Heat pump measurement and disaggregation (example 1)
3.22 Evaluation Method A, house AEK4
3.23 Histograms of segment durations for house AEK4 (Method B2)
3.24 Precision evaluation for house AEK4 (Method B2)
3.25 Duration score for House AEK4 (Method B2)
3.26 Aggregated precision scores for all houses
3.27 Duration of on-segments, aggregated for all houses, by percentile 25, 50 and 75
4.1 Boiler behavior example: Independent and Compensation Period
4.2 Histogram of the on-duration of a boiler during the Independent Periods for one house
4.3 Correlation between precision of boiler disaggregation and duration of spike
4.4 Daily Consumption Pattern
Bibliography
[1] Rainer Bacher. Liberalized Electricity Power Systems and SmartGrids. 2017.
[2] Ran Fu, David J. Feldman, Robert M. Margolis, Michael A. Woodhouse, and Kristen B. Ardani. U.S. Solar Photovoltaic System Cost Benchmark: Q1 2017. Technical Report September, National Renewable Energy Laboratory (NREL), Golden, CO (United States), Sep. 2017.
[3] Christophe Pillot. The Rechargeable Battery Market and Main Trends 2015-2025. In Advanced Automotive Battery Conference, 2016.
[4] Bolun Xu, Alexandre Oudalov, Jan Poland, Andreas Ulbig, and Göran Andersson. BESS Control Strategies for Participating in Grid Frequency Regulation. IFAC Proceedings Volumes, 47(3):4024–4029, 2014.
[5] Jens Koeppen. Beschlussempfehlung und Bericht des Ausschusses für Wirtschaft und Energie (9. Ausschuss) zu dem Gesetzentwurf der Bundesregierung. Drucksache 18/7555. Entwurf eines Gesetzes zur Digitalisierung der Energiewende. Technical report, Deutscher Bundestag, 2016.
[6] Xiufeng Liu, Lukasz Golab, Wojciech Golab, and Ihab F. Ilyas. Benchmarking Smart Meter Data Analytics. Proceedings of the 18th International Conference on Extending Database Technology, pages 385–396, 2015.
[7] Bundesamt für Energie (BFE). Grundlagen der Ausgestaltung einer Einführung intelligenter Messsysteme beim Endverbraucher in der Schweiz. Technical report, 2014.
[8] Michael Zeifman and Kurt Roth. Nonintrusive appliance load monitoring: Review and outlook. IEEE Transactions on Consumer Electronics, 57(1):76–84, Feb. 2011.
[9] Michael Chertkov and Vladimir Chernyak. Ensemble of Thermostatically Controlled Loads: Statistical Physics Approach. Scientific Reports, 7(1):8673, Dec. 2017.
[10] Ioan Sarbu and Calin Sebarchievici. General review of ground-source heat pump systems for heating and cooling of buildings. Energy and Buildings, 70:441–454, Feb. 2014.
[11] William L. Brogan. Modern Control Theory. Prentice Hall, 1991.
[12] Ted Doiron. A Short History of the Standard Reference Temperature for Industrial Dimensional Measurements. Journal of Research of the National Institute of Standards and Technology, 112(1):1–23, 2007.
[13] Christopher M. Bishop. Pattern Recognition and Machine Learning, volume 16. Jan. 2007.
[14] Henrik Sandberg. CDS 270-2: Lecture 4-2 Moving Horizon Estimation. California Institute of Technology, 2006.
[15] Peter Al Hokayem and Eduardo Gallestey. Lecture Notes on Nonlinear Systems and Control. ETH Zurich, spring sem edition, 2016.
[16] Christof Bucher. Eigenverbrauchsoptimierung durch Lastmanagement. Technical report, 2014.
[17] Energie-Experten and EKZ Energieberatung. Energieeffizienz-Forum. https://www.energie-experten.ch/de/forum/beitraege/andere/boilertemperatur.html. Date accessed: 2017-12-05.
