
DEGREE PROJECT IN MECHANICAL ENGINEERING,

SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2020

Graph theory applications in


the energy sector
From the perspective of electric utility companies

KRISTOFER ESPINOSA

TAM VU

KTH
SCHOOL OF INDUSTRIAL ENGINEERING AND MANAGEMENT
Abstract

Graph theory is the mathematical study of objects and their pairwise relations,
also known as nodes and edges. The birth of graph theory is often considered to
take place in 1736 when Leonhard Euler tried to solve a problem involving seven
bridges of Königsberg in Prussia. In more recent times, graphs have caught the
attention of companies from many industries due to their power to model and
analyse large networks.
This thesis investigates the usage of graph theory in the energy sector for a utility
company, in particular Fortum, whose activities include, but are not limited to,
the production and distribution of electricity and heat. The output of the thesis is a
wide overview of graph-theoretic concepts and their applications, as well as an
evaluation of energy-related use-cases where some concepts are put into deeper
analysis. The chosen use-case within the scope of this thesis is feature selection
for electricity price forecasting. Feature selection is a process for reducing the
number of features, also known as input variables, typically before a regression
model is built to avoid overfitting and to increase model interpretability.
Five graph-based feature selection methods with different points of view are
studied. Experiments are conducted on realistic data sets with many features to
verify the legitimacy of the methods. One of the data sets is owned by Fortum
and used for forecasting the electricity price, among other important quantities.
The obtained results look promising according to several evaluation metrics and
can be used by Fortum as a support tool to develop prediction models. In
general, a utility company can likely take advantage of graph theory in many ways
and add value to their business with enriched mathematical knowledge.
Keywords: graph theory, feature selection, energy industry

Summary

Graph theory is a mathematical field in which objects and their pairwise
relations, known as nodes and edges respectively, are studied. The birth of
graph theory is often considered to have taken place in 1736, when Leonhard
Euler tried to solve a problem involving seven bridges in Königsberg in Prussia.
In more recent times, graphs have attracted attention from companies in several
industries because of their power to model and analyse large networks.
This thesis investigates the use of graph theory in the energy sector for a
utility company, namely Fortum, whose activities include, but are not limited
to, the production and distribution of electricity and heat. The work results in
a broad survey of graph-theoretic concepts and their applications, both in
general technical contexts and in the energy sector in particular, together with
a case study in which some concepts are subjected to deeper analysis. The case
study chosen within the scope of this thesis is feature selection for electricity
price forecasting. Feature selection is a process for reducing the number of
input variables, typically carried out before a regression model is built, in
order to avoid overfitting and to increase the interpretability of the model.
Five graph-based feature selection methods with different points of view are
studied. Experiments are conducted on realistic data sets with many input
variables to verify the validity of the methods. One of the data sets is owned
by Fortum and is used for forecasting the electricity price, among other
important quantities. The obtained results look promising according to several
evaluation metrics and can be used by Fortum as a support tool for developing
prediction models. In general, an energy company can likely benefit from graph
theory in many ways and create value in its business with the help of enriched
mathematical knowledge.
Keywords: graph theory, feature selection, energy industry

Declaration

This study was conducted by the students Kristofer Espinosa and Tam Vu and
commissioned by Fortum Sverige AB.

Acknowledgements

We want to thank our supervisors from Fortum, Alexandra Bådenlid and Linda
Marklund Ramstedt, as well as our manager and mentor, Hans Bjerhag, for their
constant support and engagement. We are also thankful for the help we have
received from our respective supervisors at KTH, Elena Malakhatka and
Xiaoming Hu.

Contents

1 Introduction 1
1.1 Purpose and research question . . . . . . . . . . . . . . . . . . . . . 2
1.2 Research contribution . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.3 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Delimitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.5 Disposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

2 Market trends for utility companies 5


2.1 Uncertain growth in electricity demand . . . . . . . . . . . . . . . . 5
2.2 A more complex portfolio . . . . . . . . . . . . . . . . . . . . . . . 6
2.3 Evolving technology . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.4 Evolving conditions in the energy markets . . . . . . . . . . . . . . 7
2.5 Customer trends . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.6 Digitalisation of utility companies . . . . . . . . . . . . . . . . . . . 9
2.7 The emergence of graph analytics . . . . . . . . . . . . . . . . . . . 10
2.7.1 Relational databases . . . . . . . . . . . . . . . . . . . . . . 11
2.7.2 Graph databases . . . . . . . . . . . . . . . . . . . . . . . . 13
2.7.3 The relevance of graph analytics for utility companies . . . . 14

3 Methodology 15
3.1 Research design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Research approach . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3 Research layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
3.3.1 The preparation phase . . . . . . . . . . . . . . . . . . . . . 17
3.3.2 The idea generation and suggestion phases . . . . . . . . . . 17
3.3.3 The evaluation phase . . . . . . . . . . . . . . . . . . . . . . 21
3.3.4 The implementation phase . . . . . . . . . . . . . . . . . . . 24
3.4 Literature review . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26


3.5 Data collection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27


3.6 Semi-structured interviews . . . . . . . . . . . . . . . . . . . . . . . 27

I Graph theory and applications 28


4 Elementary terminology 29
4.1 Graph and subgraph . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4.2 Graph traversal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.3 Trees and connectivity . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.4 Matching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.5 Colouring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
4.6 Directed graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.7 Weighted graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

5 Selected applications of graphs 38


5.1 University timetabling . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.2 Staff assignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.3 Cost-effective railway building . . . . . . . . . . . . . . . . . . . . . 40
5.4 Logistic network and optimal routing . . . . . . . . . . . . . . . . . 41
5.5 Planar embedding of graphs . . . . . . . . . . . . . . . . . . . . . . 42
5.6 Winter road maintenance . . . . . . . . . . . . . . . . . . . . . . . . 43
5.7 Social network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.8 Tournament ranking system . . . . . . . . . . . . . . . . . . . . . . 46
5.9 Image segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . 47

Summary 49

II Selection of use-case 50
6 Idea generation 51
6.1 Inventory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
6.1.1 Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
6.1.2 Heat map . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

7 Assessment of use-case clusters 54


7.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
7.1.1 Presentation of the use-case clusters . . . . . . . . . . . . . . 54


7.1.2 Scoring system . . . . . . . . . . . . . . . . . . . . . . . . . 55


7.1.3 Assessment results per criterion . . . . . . . . . . . . . . . . 55
7.2 Optimisation of hydropower operations . . . . . . . . . . . . . . . . 57
7.3 Hydropower operation and maintenance . . . . . . . . . . . . . . . 60
7.4 Operation and maintenance for nuclear power . . . . . . . . . . . . 63
7.5 Design and maintenance for wind power . . . . . . . . . . . . . . . 66
7.6 Electric vehicle applications . . . . . . . . . . . . . . . . . . . . . . 69
7.7 Market intelligence for energy trading . . . . . . . . . . . . . . . . . 72
7.8 Storage solutions for the distribution grid . . . . . . . . . . . . . . . 75
7.9 Master data management . . . . . . . . . . . . . . . . . . . . . . . . 78
7.10 Knowledge graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
7.11 Results of use-case cluster assessment . . . . . . . . . . . . . . . . . 83
7.11.1 Assessment results . . . . . . . . . . . . . . . . . . . . . . . 83
7.11.2 A note on Energy trading . . . . . . . . . . . . . . . . . . . 85
7.12 Discussion on the selection of use-case cluster . . . . . . . . . . . . 87

8 Assessment of use-cases in Energy trading 88


8.1 Natural gas market analysis with visibility graphs . . . . . . . . . . 90
8.2 Smart meter clustering for short-term load forecasting . . . . . . . . 92
8.3 Feature selection for electricity price forecasting . . . . . . . . . . . 95
8.4 Results of the use-case assessment . . . . . . . . . . . . . . . . . . . 98
8.4.1 A note on Electricity price forecasting . . . . . . . . . . . . 98
8.5 Discussion on the selection of use-case . . . . . . . . . . . . . . . . 100

9 Conclusion and discussion on PART II 101


9.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
9.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

III Graphs, feature selection and electricity price forecasting 104
10 Graph theory in electricity price forecasting 105
10.1 Background on the electricity market . . . . . . . . . . . . . . . . . 105
10.1.1 Roles in the market . . . . . . . . . . . . . . . . . . . . . . . 105
10.1.2 The day-ahead spot-market . . . . . . . . . . . . . . . . . . 106
10.1.3 The balancing market . . . . . . . . . . . . . . . . . . . . . 107
10.2 Electricity price forecasting . . . . . . . . . . . . . . . . . . . . . . 108


10.2.1 Forecasting methods . . . . . . . . . . . . . . . . . . . . . . 108


10.2.2 Feature selection in electricity price forecasting . . . . . . . 111
10.2.3 The case of intra-day . . . . . . . . . . . . . . . . . . . . . . 113
10.2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
10.3 Motivations on the proposed methodology . . . . . . . . . . . . . . 115

11 Introduction to feature selection 116


11.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
11.2 Purpose and outline . . . . . . . . . . . . . . . . . . . . . . . . . . 117

12 Preliminaries 118
12.1 Laplacian matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
12.2 Nearest neighbour graph . . . . . . . . . . . . . . . . . . . . . . . . 119
12.3 Graph clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
12.4 Comparison of two clusterings . . . . . . . . . . . . . . . . . . . . . 120
12.4.1 Clustering accuracy . . . . . . . . . . . . . . . . . . . . . . . 121
12.4.2 Normalised mutual information . . . . . . . . . . . . . . . . 121
12.4.3 Adjusted mutual information . . . . . . . . . . . . . . . . . 122
12.5 Similarity comparison of two sets . . . . . . . . . . . . . . . . . . . 122

13 Feature selection methods 123


13.1 Laplacian score (LS) . . . . . . . . . . . . . . . . . . . . . . . . . . 125
13.1.1 Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
13.1.2 Optimisation problem . . . . . . . . . . . . . . . . . . . . . 125
13.1.3 Feature selection algorithm . . . . . . . . . . . . . . . . . . . 127
13.2 Multi-cluster feature selection (MCFS) . . . . . . . . . . . . . . . . 127
13.2.1 Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
13.2.2 Optimisation problem . . . . . . . . . . . . . . . . . . . . . 128
13.2.3 Feature selection algorithm . . . . . . . . . . . . . . . . . . . 129
13.3 Non-negative discriminative feature selection (NDFS) . . . . . . . . 129
13.3.1 Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
13.3.2 Optimisation problem . . . . . . . . . . . . . . . . . . . . . 130
13.3.3 Feature selection algorithm . . . . . . . . . . . . . . . . . . . 133
13.4 Feature selection via non-negative spectral analysis and redundancy
control (NSCR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
13.4.1 Optimisation problem . . . . . . . . . . . . . . . . . . . . . 134
13.4.2 Feature selection algorithm . . . . . . . . . . . . . . . . . . . 136


13.5 Feature selection via adaptive similarity learning and subspace
clustering (SCFS) . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
13.5.1 Optimisation problem . . . . . . . . . . . . . . . . . . . . . 137
13.5.2 Feature selection algorithm . . . . . . . . . . . . . . . . . . . 137

14 Experiments and results 139


14.1 Line of action . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
14.2 Data sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
14.3 Parameter setting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
14.4 Experiment: Eight public data sets . . . . . . . . . . . . . . . . . . 141
14.4.1 Clustering accuracy . . . . . . . . . . . . . . . . . . . . . . . 141
14.4.2 Normalised and adjusted mutual information . . . . . . . . . 141
14.4.3 Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
14.5 Experiment: CELEBI of Fortum . . . . . . . . . . . . . . . . . . . . 142

15 Discussion 144
15.1 Convergence speed . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
15.2 Parameter sensitivity . . . . . . . . . . . . . . . . . . . . . . . . . . 145
15.3 Jaccard index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
15.4 Conceivable challenges . . . . . . . . . . . . . . . . . . . . . . . . . 146
15.5 Implications for forecasting activities . . . . . . . . . . . . . . . . . 146
15.6 Scalability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

16 Research conclusion 148

Bibliography 150

Appendices 176

A Part I & II: Figures and tables


A.1 List of interviews . . . . . . . . . . . . . . . . . . . . . . . . . . . .
A.2 Inventory of use-cases . . . . . . . . . . . . . . . . . . . . . . . . . .
A.3 Evaluation tool . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
A.3.1 Graph applicability . . . . . . . . . . . . . . . . . . . . . . .
A.3.2 Technical feasibility . . . . . . . . . . . . . . . . . . . . . . .
A.3.3 Economic potential . . . . . . . . . . . . . . . . . . . . . . .
A.3.4 Workability . . . . . . . . . . . . . . . . . . . . . . . . . . .

B Part III: Figures and tables


B.1 Experiments and results . . . . . . . . . . . . . . . . . . . . . . . .


B.2 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

C Glossary

List of Figures

2.1 European Net electricity generation, EU-28, 1990-2017. Source:
Eurostat 2019 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Evolution of the estimated impact of technology on utility
companies between 2015 and 2018. Source: Monitor Deloitte (2018) 9
2.3 The economic impacts of digitalisation on utility earnings. Source:
McKinsey (2016) [4] . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.4 A friendship directed social graph. Source: AWS (2020) . . . . . . . 14

3.1 The idea generation phase . . . . . . . . . . . . . . . . . . . . . . . 21


3.2 The idea selection phase . . . . . . . . . . . . . . . . . . . . . . . . 25

4.1 Examples of graphs. . . . . . . . . . . . . . . . . . . . . . . . . . . 30


4.2 Various kinds of walks in a graph. . . . . . . . . . . . . . . . . . . . 32
4.3 Examples of graphs which are not trees. . . . . . . . . . . . . . . . 33
4.4 Differently configured trees with 5 vertices and 4 edges each. . . . . 33
4.5 Examples of spanning trees in a connected graph. . . . . . . . . . . 33
4.6 Examples of matchings. . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.7 Examples of vertex colourings using as few colours as possible. . . . 36
4.8 Weighted directed graph and a small fictitious kingdom far, far away. 37

5.1 The complete bipartite graph K3,3 is not planar. . . . . . . . . . . . 42


5.2 Graph vertices can be most central in different aspects. . . . . . . . 45

6.1 Heat map of past use-cases per cluster and per graph-related concept. 53

7.1 Hydropower optimisation . . . . . . . . . . . . . . . . . . . . . . . . 59


7.2 Assessment of Hydropower: Operation and maintenance . . . . . . 63
7.3 Assessment of Nuclear power operation and maintenance . . . . . . 65
7.4 Assessment of Wind power operation and maintenance . . . . . . . 68


7.5 Assessment of EV Applications . . . . . . . . . . . . . . . . . . . . 71


7.6 Assessment of Energy trading . . . . . . . . . . . . . . . . . . . . . 75
7.7 Assessment of Energy storage solutions . . . . . . . . . . . . . . . . 78
7.8 Assessment of Master Data Management . . . . . . . . . . . . . . . 81
7.9 Assessment of Knowledge graphs . . . . . . . . . . . . . . . . . . . 83
7.10 Graph-score vs strategy-score of each use-case cluster . . . . . . . . 84

8.1 Assessment of Natural gas market analysis with visibility graphs . . 92


8.2 Assessment of Short-term load forecasting - clustering . . . . . . . . 95
8.3 Assessment Electricity price forecasting - feature selection . . . . . 98
8.4 Comparison of the use-case assessments . . . . . . . . . . . . . . . . 99

10.1 Formation of electricity prices . . . . . . . . . . . . . . . . . . . . . 107

14.1 Average Jaccard index. . . . . . . . . . . . . . . . . . . . . . . . . . 143

B.1 Jaccard index for eight different data sets. . . . . . . . . . . . . . .


B.2 Relative change of the objective value in iterative methods. . . . . .
B.3 Clustering quality for different α and β, obtained with the features
selected using SCFS. . . . . . . . . . . . . . . . . . . . . . . . . . .

List of Tables

3.1 Evaluation dimensions and criteria. . . . . . . . . . . . . . . . . . . 22

5.1 The centrality measures of the vertices in Figure 5.2. . . . . . . . . 46

6.1 Number of past use-cases per cluster . . . . . . . . . . . . . . . . . 52

7.1 Scoring of use-case clusters . . . . . . . . . . . . . . . . . . . . . . . 56

8.1 Scoring of use-cases in Energy trading . . . . . . . . . . . . . . . . . 89

13.1 Notations associated with a given data set. . . . . . . . . . . . . . . 124

14.1 Data sets with their numbers of samples (m) and features (n). . . . 140

A.1 List of interviewees. . . . . . . . . . . . . . . . . . . . . . . . . . . .


A.2 Use-cases in Hydropower optimisation . . . . . . . . . . . . . . . . .
A.3 Use-cases in Hydropower operation and maintenance . . . . . . . .
A.4 Use-cases in Nuclear power operation and maintenance . . . . . . .
A.5 Use-cases in Wind power design, operation and maintenance. . . . .
A.6 Use-cases for Electric vehicle applications . . . . . . . . . . . . . . .
A.7 Use-cases in Energy trading. . . . . . . . . . . . . . . . . . . . . . .
A.8 Use-case in Energy storage . . . . . . . . . . . . . . . . . . . . . . .

B.1 Clustering accuracy (ACC) [%] corresponding to different data sets


and feature selection methods. . . . . . . . . . . . . . . . . . . . . .
B.2 Normalised mutual information (NMI) [%] corresponding to
different data sets and feature selection methods. . . . . . . . . . .
B.3 Adjusted mutual information (AMI) [%] corresponding to different
data sets and feature selection methods. . . . . . . . . . . . . . . .


B.4 Clustering quality measures for feature selection of CELEBI. The


first column shows the indices of the subsets. . . . . . . . . . . . . .

CHAPTER 1
Introduction

Graph theory is the mathematical study of objects and their pairwise relations,
also known as nodes and edges, respectively. The birth of graph theory is often
considered to have taken place in 1736, when the Swiss mathematician Leonhard
Euler tried to solve a routing problem involving the seven bridges of Königsberg
in Prussia. In more recent times, the increase in data and computing power has
given rise to computational intelligence modelling, and, perhaps more discreetly,
graph theory has been applied to several services at the foundation of a digital
society. Google’s PageRank search algorithm and Google Maps are based on
graphs, as are Facebook’s and Twitter’s social networks. Whether it is a link
from one website
pointing to another or adding someone as a friend, the relations (edges) bear the
fundamental information. These applications of graph theory in the digital
sphere, generally with large amounts of data, are referred to as graph analytics.
Graph analytics has the advantage of being a fast analysis tool that scales to
exceptionally large networks. Thus, it has caught the attention of companies
from all types of industries and is commonly used as a modelling tool for
networks.
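Of these examples, PageRank is the most graph-native: it ranks nodes by the importance flowing along directed edges. The core idea can be sketched in a few lines of power iteration; this is a simplified illustration on an invented four-node link graph, not Google’s production algorithm:

```python
import numpy as np

def pagerank(edges, n, damping=0.85, tol=1e-10):
    """Power-iteration PageRank on a directed graph given as (src, dst) pairs."""
    out_deg = np.zeros(n)
    for src, _ in edges:
        out_deg[src] += 1
    # Column-stochastic transition matrix: M[j, i] = 1/outdeg(i) for each edge i -> j.
    M = np.zeros((n, n))
    for src, dst in edges:
        M[dst, src] = 1.0 / out_deg[src]
    # Dangling nodes (no outgoing links) are treated as linking to every node.
    M[:, out_deg == 0] = 1.0 / n
    r = np.full(n, 1.0 / n)
    while True:
        r_next = damping * M @ r + (1.0 - damping) / n
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next

# A small invented link graph: node 0 is linked to by all the others,
# so it ends up with the highest rank.
edges = [(1, 0), (2, 0), (3, 0), (0, 1), (2, 1), (3, 2)]
ranks = pagerank(edges, 4)
```

The rank vector sums to one and converges regardless of the starting distribution, which is what makes the method scale to very large networks.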
Societies are currently undergoing a transition toward a low-carbon energy
system. The main drivers of this change, electric utility companies, are
digitalising and innovating on new products to stay or become more competitive.
The liberalisation of the power markets and the increasing level of renewable
sources of electricity fed into the grid have brought perspectives of diminishing
electricity prices in many countries, complicating the situation for the electricity

generators. On the other hand, the technological advancements with respect to
the energy and information technology sectors provide new opportunities for
utilities to optimise their operations or find new revenue streams. In this regard,
graph theory appears as a potentially helpful technology to support such
endeavours. It has been used in the contexts of Internet of Things (IoT), for
routing, fraud detection, customer analysis, advanced search, scheduling and
many more. Less publicised is the set of useful applications specific to the
power market. As with machine learning or blockchain, utilities have an
interest in gaining deeper insight into the technology to assess where and
how it can be used to their benefit.

1.1 Purpose and research question


This paper aims to give the interested reader guidelines as to where graph theory
is applicable in the energy sector and more particularly in the power market. A
thorough background of graph theory and its theoretical applications is followed
by a comprehensive presentation of previous case studies revolving around the
operational markets of a typical utility company, giving an indication of potential
applications. These applications, also called use-cases, are evaluated to indicate
where and how they could be beneficial. Finally, a use-case is conceptualised and
a proof-of-concept realised.
More specifically, we will treat the following questions:
◦ What are the main benefits of graph theory in engineering and science
applications?
◦ How can graph theory be useful in the energy sector and more particularly
within the context of an electric utility company?
◦ Given a use-case relevant for the business of Fortum, what can a
graph-based model look like and what algorithms can be used to solve
problems?
Relevant theoretical backgrounds will be covered for a deeper understanding of
the topics in question, namely graph theoretical and energy-specific ones.


1.2 Research contribution


This research aims to bring contributions in the following ways:
◦ An overview of applications of graph theory within the energy sector with a
detailed theoretical background for pedagogical purposes, which to our
knowledge does not exist to date.
◦ A generic evaluation tool for a face-value assessment of the commercial
potential of graph theory, aimed at practitioners in any industry.
◦ A review of graph-based feature selection methods and their application to
a commercial use-case.

1.3 Limitations
In this study, a trade-off is necessary between the number of applications
evaluated and the amount of detail provided for each. Each application covered
is in fact a subject for a study in itself. Consequently, the assessments are to be
considered at face value, as a guideline for practitioners as to where attention
should be directed.

1.4 Delimitations
This study was commissioned by Fortum. Hence, the verticals and geographies of
the energy sector considered for the applications are aligned with the operations
of Fortum, with the exception of some of the more corporate activities, including
strategic and financial decision-making as well as regulatory and compliance
activities. The reason for this is to contain the research mainly to energy
industry-specific challenges.

1.5 Disposition
This paper comprises three parts.
Part I contains a wide overview of elementary concepts of graph theory and
selected applications, supported by recent research. This part aims to provide
an inspiring mathematical background for practitioners needing mathematical
foundations for implementing graph analytics applications.
Part II details the assessment of the identified use-cases of graph theory in the
energy sector. The evaluation tool is explained in further detail and a
comparison between use-cases is made, in order to find a relevant application for
a proof-of-concept as well as guiding decision-making for which areas to focus on
for future proofs-of-concept.
Part III focuses on a specific area relevant for Fortum to bring the general
concepts into deeper analyses. Within the scope of this thesis, the case study is
about feature selection for electricity price forecasting using graph-based
methods. Five methods with different points of view are presented, each with its
main idea, derivation, algorithm and quality validation through experiments on
real-world data.

CHAPTER 2
Market trends for utility
companies

Utility companies have a central role in facilitating the coming transition to a


sustainable energy system. Active across the value chain from power generation
to its delivery to the end-user, they face increasing political, economic and social
pressure to decarbonise the power sector and scale the integration of renewable
sources of energy. However, given the scale of the economic, technological and
organisational challenges to this, some argue that this evolution can in fact be
likened more to a transformation. Below are some fundamental trends affecting
the power sector that utilities need to account for when elaborating their
strategies.

2.1 Uncertain growth in electricity demand


Historically, utility companies have been able to invest in new power generating
assets thanks to an ever-growing demand in electricity. However, utilities in most
developed economies have been seeing stagnating or even declining electricity
demand, mainly because the electrification rate is already so high.
Other reasons for declining electricity demand are the deindustrialisation of
developed economies and the energy efficiency improvements, notably in the
residential sector. For instance, in New York, the electricity demand is projected
to grow at an average annual rate of 0.16 % through 2024. For the case of
Europe, the figure below summarises the evolution of electricity consumption
since 1990 [1].

Figure 2.1: European Net electricity generation, EU-28, 1990-2017. Source:
Eurostat 2019.

Despite this recent trend, the electrification of the residential and transportation
sectors is adding uncertainty to future electricity demand projections, particularly
in terms of peak demand, potentially driving up the required total power system
capacity without necessarily increasing overall demand [13].
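For orientation, an annual growth rate like the quoted 0.16 % compounds multiplicatively over the projection horizon. The sketch below uses an invented 150 TWh starting level purely for illustration; it is not a figure from the source:

```python
def project_demand(current_twh, annual_growth, years):
    """Compound-growth projection: demand is multiplied by (1 + g) each year."""
    return current_twh * (1.0 + annual_growth) ** years

# At 0.16 % per year, even a five-year horizon moves demand by well under
# one percent: 150 TWh grows to roughly 151.2 TWh.
future = project_demand(150.0, 0.0016, 5)
```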

2.2 A more complex portfolio


Renewable energy generation differs from conventional power sources in some
fundamental ways, which alter the companies’ operations all along the asset life
cycle, from the investment and the operation and maintenance to the
decommissioning. As solar and wind are the growth drivers among the
distributed energy resources (DERs), focus will be put on them.
Different asset types
As opposed to conventional power sources, DERs have on average significantly
lower capacities, are generally connected to the distribution grid (low to medium
voltage) and are scattered across geographical regions. As such, their design,
placement and sizing are not only subject to the electricity grid and pricing
signals, but also vary largely with topography, irradiation, wind maps and land
ownership. To be competitive, utilities now need to reach the same level of
expertise on these matters as they have accumulated over time regarding
conventional power sources [6].
Evolving needs in operation and maintenance


The intermittent nature of certain DERs such as wind and solar makes their
operation different from that of conventional power sources. The uncertainty of
the weather conditions adds complexity to the day-ahead bidding strategy of
utilities, notably in terms of the residual demand of the energy system. Not only
is their own power output less predictable, but so is that of their competitors,
leading to larger margins of error induced by inaccurate weather modelling [14].
As wind power plants and solar farms are more distributed, complexity is also
added on the maintenance side. The portfolios of utilities now span a wider
range of asset locations, ages and life cycles. Even though OEMs oftentimes
provide warranties for wind and solar power plants, utilities are incentivised to
build their own intelligence on these matters to attain a competitive edge.

2.3 Evolving technology


Whereas conventional sources such as hydropower and nuclear are more mature
technologies, the technology in wind and solar, as well as energy storage, is
rapidly evolving. This complicates the investment decision, not only in terms of
the right timing to commission a new renewable power plant, but also in their
design (selection of equipment, addition of energy storage).

2.4 Evolving conditions in the energy markets


Statkraft projects solar power to become the largest source of power generation
on a global basis and cover almost 30 % of all electricity generation in 2040, with
wind power covering 20 % [7]. This has two main effects on the economics of the
market: declining electricity prices during peak production hours and increasing
need for grid stability services [2].
Because renewables of the same type generate power simultaneously within the
same geographical area, when a higher proportion of renewables is fed into the
system, the increased supply of generation with near-zero marginal cost drives
down prices, sometimes even leading to negative prices. This can have a notable
impact on investment decisions in renewable generation, as adding renewable
capacity hampers the profitability of the installed renewable generation [6].
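The price-depressing mechanism described above is often called the merit-order effect, and it can be sketched with a toy market clearing. All offer and demand numbers below are invented for illustration; this is not a model used in the thesis:

```python
def clearing_price(offers, demand_mw):
    """Merit-order dispatch: accept offers from cheapest to most expensive
    until demand is met; the last accepted offer sets the market price."""
    for cost, capacity in sorted(offers):
        demand_mw -= capacity
        if demand_mw <= 0:
            return cost
    raise ValueError("demand exceeds total offered capacity")

# Hypothetical offers as (marginal cost in EUR/MWh, capacity in MW):
# zero-marginal-cost hydro/wind first, then coal, gas and peakers.
offers = [(0, 300), (20, 400), (45, 300), (80, 200)]
price_before = clearing_price(offers, 800)              # 45 EUR/MWh
# Adding 300 MW of zero-marginal-cost wind pushes cheaper plants up the
# merit order and lowers the clearing price.
price_after = clearing_price(offers + [(0, 300)], 800)  # 20 EUR/MWh
```

Because all accepted generators are paid the marginal price, even a modest addition of free-fuel capacity can cut revenues for the entire installed fleet, which is exactly the profitability effect noted above.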
The stability of the grid is impacted by the increasing share of intermittent
power generation, combined with the reduction in the system inertia otherwise
provided by heavy rotating equipment. Thus, increasing attention is channelled
towards providing ancillary services to the grid operator on the intra-day market
as new revenue sources for asset-owning utilities [2].
On top of this, the energy commodities market, particularly coal, oil and gas, is
becoming all the more volatile and financialised, subject to recurring global
political and economic crises. In particular, natural gas is increasingly decoupled
from oil prices and subject to more fundamental forces. This raises the need for
forecasting, whether or not a utility itself uses coal or gas as fuel, as these prices
affect both the input cost of electricity generation and the utility's electricity
price projections [8].

2.5 Customer trends


There is a growing engagement from customers to participate in the power
sector in various ways. First of all, the lower barriers to investment in solar
power have attracted many private households across the globe to invest in their
own power production, and net metering schemes have been widely deployed [12].
Secondly, growing attention is being paid to demand response programmes,
allowing customers to adapt their consumption in response to electricity price
signals. This is to a large extent enabled by the roll-out of smart metering
systems, allowing for a timely, two-way communication between the end-user and
the central system [11].
Utilities are thus facing a growing interest from customers for higher energy
independence and more elaborate and efficient energy solutions. This forces them
to innovate in terms of product offering in order to reduce churn or grid defection
(i.e. “going off the grid”). There is a risk that the situation evolves into what is
called the “death spiral”, whereby grid defection raises the network costs borne
by the remaining customers, prompting further defection.
However, this trend is to be taken with a grain of salt, as it has not yet been
observed on a large scale and could be contained to some specific geographical
regions where the conditions are met (e.g. high solar irradiation) [5].
Despite all this, electricity is still seen as a commodity by end-users, and price
elasticity is low [9]. Interest in and understanding of the energy system remain
low among most customers, which makes the competitiveness of a utility
dependent on product pricing rather than on product differentiation, hampering
the efforts of utilities to engage customers and innovate rapidly [6].
The aforementioned conditions put pressure on utilities to adapt their
operations and competences and to reimagine their role in the system through
technological and business-model innovations, while optimising their asset fleets
in various ways. As industry incumbents, utilities have strong inertia; this
change is at once a technical, economic and organisational challenge, requiring
timely decisions and a balanced explore-exploit trade-off [10].
As can be seen in the figure below, the timing and impact of underlying
energy-industry trends for utilities are difficult to assess [3].

Figure 2.2: Evolution of the estimated impact of technology on utility companies
between 2015 and 2018.
Source: Monitor Deloitte (2018)

2.6 Digitalisation of utility companies


New digital tools can prove useful in supporting utility companies in tackling
some of the challenges mentioned above. Managing and utilising the data
generated from the different components of the electrical system represent great
opportunities for utility companies to respond to the evolving environment they
operate in. As seen in the figure below, digitalisation can have a positive effect
all along the electricity value chain [4].

Figure 2.3: The economic impacts of digitalisation on utility earnings.
Source: McKinsey (2016) [4]

Utility companies have already started to invest heavily in digitalisation,
particularly enabled by the emergence of the Internet of Things and the
possibility to process big data with increased computing power and Artificial
Intelligence (AI). Since 2014, investments in digital electricity infrastructure and
software have grown at an annual rate above 20 %, reaching US$ 47 billion in
2016. By increasing connectivity as well as modelling and monitoring
capabilities, digitalisation is expected to decrease generation costs by 10 % to
20 % in the oil and gas industry and by 5 % in the power sector, and to help
integrate more intermittent renewable generation. On the consumption side,
digitalisation could cut energy use by about 10 % in buildings, help reshape the
mobility sector, and facilitate "smart demand response" programmes as well as
smart-charging technologies for electric vehicles [15].

2.7 The emergence of graph analytics


According to Gartner, digitalisation "is the use of digital technologies to change a
business model and provide new revenue and value-producing opportunities".

Digitalisation is enabled by digitisation, which is the mere process of "changing
from analog to digital form" [16]. The traditional way of digitising real-world processes
is through tabular databases, where elements from the real world are stored in
tables. Increasingly, however, graph databases are used to better capture the
relationships between the digitised elements in a network. The latter type of
database is a stepping stone to harnessing the power of graph theory. A quick
look at the different forms of databases can help in understanding how graph
theory can become a useful analysis tool for organisations undergoing
digitalisation.

2.7.1 Relational databases


Tabular, or relational, databases store information in tables. Each table
represents data elements, typically called a model, where the columns list the
attributes of the model and where each row is an instance of the model,
identified by a unique ID.
Example: a Person model, shown as a table where each row is one person,
identified by a unique ID, and the columns hold attributes such as first name
and last name.
There are three types of relationships between models: 1-to-1, 1-to-many and
many-to-many.
1-to-1 relationships describe a relationship between A and B in which one
element of A may only be linked to one element of B and vice versa. A country
and its capital city have a 1-to-1 relationship.
1-to-many relationships are a parent-child type of relationship, where A can
have several instances of B but B can only have one instance of A. We say that
B belongs to A. An example of this is an organisation which has several
employees. In this case, common practice is to assign an organisation foreign key
to an instance in the employee table.
Many-to-many relationships are where A can be connected to several
instances of B and vice versa. This can be exemplified by the relationship
between a doctor and a patient: A doctor can have several patients and a patient
can have several doctors. A solution for storing this information is to connect
them through a common table, for example Booking, the instances of which
contain both a doctor and a patient foreign key.
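The three relationship types can be sketched with a concrete schema. The following minimal example uses Python's built-in sqlite3 module; the table and column names are our own illustrative choices, not taken from any particular system:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# 1-to-many: each employee row holds a foreign key to its organisation.
cur.execute("CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE employees (
    id INTEGER PRIMARY KEY, name TEXT,
    organisation_id INTEGER REFERENCES organisations(id))""")

# Many-to-many: doctors and patients are connected through a common
# 'bookings' table carrying one foreign key to each side.
cur.execute("CREATE TABLE doctors (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""CREATE TABLE bookings (
    id INTEGER PRIMARY KEY,
    doctor_id INTEGER REFERENCES doctors(id),
    patient_id INTEGER REFERENCES patients(id))""")

cur.executemany("INSERT INTO doctors VALUES (?, ?)", [(1, "Dr A"), (2, "Dr B")])
cur.execute("INSERT INTO patients VALUES (1, 'P1')")
# One patient seeing two doctors: two rows in the join table.
cur.executemany("INSERT INTO bookings VALUES (?, ?, ?)", [(1, 1, 1), (2, 2, 1)])

# Traversing the relationship requires a join on the foreign keys.
rows = cur.execute("""SELECT doctors.name FROM bookings
                      JOIN doctors ON doctors.id = bookings.doctor_id
                      WHERE bookings.patient_id = 1
                      ORDER BY doctors.id""").fetchall()
print(rows)  # [('Dr A',), ('Dr B',)]
```

Note that even this simple traversal already requires a join; the cost of such joins is exactly the drawback discussed below.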
Query language


SQL has remained a consistently popular choice for database administrators over
the years primarily due to its ease of use and the highly effective manner in which
it queries, manipulates, aggregates data and performs a wide range of other
functions to turn massive collections of structured data into usable information.
Benefits and drawbacks
Tabular databases, however, have two inherent drawbacks. The first is that
relationships can only be expressed through foreign keys and as such carry no
description of their own: the nature of the relationship cannot be captured in full.
Another drawback of tabular databases is the complexity of traversing the
database. If a connection needs to be found between tables which are indirectly
linked, the user first needs to join all tables based on the foreign keys.
A basic example can help illustrate this. To find students in a given geographical
continent, the database manager would have to join the tables Student,
University, City, Country, Continent on their respective foreign keys. The query
would look as follows:
SELECT * FROM students
JOIN universities ON universities.id = students.university_id
JOIN cities ON cities.id = universities.city_id
JOIN countries ON countries.id = cities.country_id
JOIN continents ON continents.id = countries.continent_id
WHERE continents.name = 'Europe';
This query has two implications:
◦ The complexity of the queries can quickly increase as the distance between
A and B increases.
◦ The complexity of the calculation for the server can also increase
dramatically as the size of the database or the distance between A and B
increases, slowing down the response time and hence the productivity of
the organisation.
SQL is not the best choice for all database applications. For one thing, while
SQL had been effective at data scales up through the 1990s and beyond, it
started to falter at the hyperscale levels at the turn of the century. Some users
also complain of its sharding limitations hampering the ability to break large
databases into smaller, more manageable ones.


These drawbacks are what led to the creation of NoSQL and, more recently,
NewSQL, which attempt to enhance traditional SQL's scalability without
sacrificing its inherent atomicity, consistency, isolation and durability (ACID)
guarantees, critical components of stable databases.

2.7.2 Graph databases


Graph databases differ from relational databases in that they place as much
importance on the relationships between elements (edges) as on the elements
themselves (nodes). They are particularly useful for modelling networks of highly
connected components. As with relational databases, the entities can be
described with a set of attributes, most often described through a dictionary
data-structure (key-value pairs), often stored as JSON objects. The edges of the
graph are stored in the same way as the nodes, described by key-value pairs.
A graph database can be illustrated by the classical example of social graphs.
The nodes in this example are people, who might have a set of attributes (e.g.
first name, last name). The edges represent the friendships between people and
can also contain more information (e.g. date of friendship established, strength of
the bond). The ability to attach information to relationships is a cornerstone of
modern social-media applications such as Facebook, Twitter and Instagram, and
is the foundation of the algorithms that personalise content for end-users (e.g.
promoted posts).
An interesting nuance between Facebook and Instagram lies in whether the
relationships are undirected or directed (friends vs. followers). This is modelled
simply by the choice of undirected or directed edges connecting the nodes.
Figure 2.4 is an example of a directed social graph.
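Such a property graph can be sketched in Python with the networkx library (our choice of illustration tool; the thesis does not prescribe one, and the names are invented):

```python
import networkx as nx

# Directed edges model "follower"-style relationships (as on Instagram);
# an undirected nx.Graph would model mutual friendships (as on Facebook).
G = nx.DiGraph()

# Nodes and edges both carry key-value attributes, as in a graph database.
G.add_node("alice", first_name="Alice", last_name="Andersson")
G.add_node("bob", first_name="Bob", last_name="Berg")
G.add_edge("alice", "bob", since="2018-05-01", strength=0.9)

print(G.has_edge("alice", "bob"))           # True
print(G.has_edge("bob", "alice"))           # False: the edge is directed
print(G.edges["alice", "bob"]["strength"])  # 0.9
```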
On top of the potential improvements in query response times and the higher
level of intuition afforded by choosing a graph database for modelling more or
less complex networks, a large set of algorithms and analysis tools from the field
of graph theory becomes available for analytics purposes. Determining the
density of the network, identifying influential nodes, assessing the similarity or
interdependence of nodes or visualising the propagation of some phenomena
through a network are all facilitated by graph-related concepts such as clustering,
centrality, connectivity, label propagation, link prediction, etc.
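Several of these measures become one-liners once the network is in graph form. A toy sketch, again assuming the networkx library:

```python
import networkx as nx

# A small undirected network: a triangle with one pendant node.
G = nx.Graph([("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")])

# Density: the fraction of possible edges that are present (4 of 6 here).
print(nx.density(G))

# Degree centrality identifies the most connected node ("c").
centrality = nx.degree_centrality(G)
print(max(centrality, key=centrality.get))  # 'c'

# Connectivity: is there a path between every pair of nodes?
print(nx.is_connected(G))  # True
```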


Figure 2.4: A directed social graph of friendships.
Source: AWS (2020)

2.7.3 The relevance of graph analytics for utility companies
As companies digitalise their assets and operations, the question of the modelling
approach arises. As was briefly discussed in the previous section, alternatives to
the traditional tabular databases are available, among others graph-databases.
Given the fundamental trend in society toward higher levels of connectivity, be
it between physical objects via the roll-out of sensors, between people and
companies, or between energy commodity prices, a graph-based approach can
prove useful.
This observation motivates electric utility companies to gain a deeper
understanding of the foundations of the mathematical field of graph theory, to
better harness the potential of graph analytics.

CHAPTER 3
Methodology

3.1 Research design


Management research methodology literature generally classifies research
purposes into three types: exploratory, descriptive or explanatory. Depending on its purpose,
research can belong to one or more of the research types and can even include
them sequentially as the research evolves [17]. The objectives of the respective
approaches are:
◦ Exploratory: the problem is loosely defined and the area potentially
unexplored and the researcher needs to gain a clearer understanding of the
topic.
◦ Descriptive: an observed phenomenon, person or situation needs to be
analysed and described objectively, generally without inferring conclusions.
◦ Explanatory: a situation or a problem is observed and a causal relationship
between the variables is sought.
An exploratory approach was deemed fit for the purpose of this study, considering
the need to deeply understand the domains of graph theory and of the energy
sector's operations, as well as the novelty of the work. This research was carried out in
accordance with exploratory research praxis, both with respect to the research
approach and the information gathering. Indeed, as Saunders et al., point out,
exploratory research needs to be dynamic and flexible to changes, as new data
and new insights can alter the direction of the study [17]. In general, the research
starts with a broad transversal perspective and narrows down as insights are
gained. Regarding information gathering, common approaches include literature
review, interviews with subject experts and the conducting of focus groups.

3.2 Research approach


Research can be carried out either inductively, deductively or abductively [17].
◦ Inductive: the research starts with observations and theories are proposed
as a result of the observations, toward the end of the research process [18].
◦ Deductive: this approach is concerned with "developing a hypothesis (or
hypotheses) based on existing theory and then designing a research
strategy to test the hypothesis" [19].
◦ Abductive: the research process is devoted to finding the best possible
explanation to "surprising" observations from a range of possible theories,
reaching a plausible yet not necessarily universally true conclusion [20].
This research adopts a hybrid inductive-deductive approach. Firstly, an inductive
approach is taken as there is no hypothesis to verify. The aim is to discover the
potential benefits of implementing graph-based solutions by a deeper
understanding of graphs and perceived problems from the industry. In fact,
hypotheses are created as use-cases are elaborated. The retained hypothesis is
that of the selected use-case, namely that it can create value for practitioners.
This hypothesis is verified in the subsequent part of the
research, in a qualitative and quantitative manner. Thus, this latter phase takes
a deductive approach.

3.3 Research layout


Finding applications of graph theory in one’s sector is essentially an Innovation
Management endeavour. Innovation Management is a set of processes companies
implement to continuously introduce new technological solutions, products and
services to their markets. These processes typically revolve around generating
new ideas and solutions, which are then evaluated, prioritised, tested and finally
implemented and rolled out [21].


This research focuses on the first steps of this process, that is, from the
generation of new ideas to the development of a proof-of-concept for a selected
idea. The research layout therefore drew inspiration from the narrower field of
idea management. Idea Management (IM) is a sub-process
of innovation management, aiming at structuring and streamlining the idea
generation, evaluation and selection processes [32]. It is also referred to as the
“front-end” of innovation, as the ideas are generally generated by the employees
themselves to address their specific problems and needs. The systematic
approach adopted in IM is being implemented by many large firms to cope with
an otherwise “fuzzy” nature of the front-end of innovation management, bearing
a high level of informality and uncertainty [23, 24].
The research layout is explained in more detail below. It followed the main
steps proposed by Gerlach and Brem in their guidelines for IM practitioners [25],
namely the preparation phase, the idea generation and suggestion phases, the
evaluation phase and the implementation phase.

3.3.1 The preparation phase


This phase defines the overall objective and scope of the idea management
programme. An idea manager formulates the ideation strategy and plans how to
generate, improve and evaluate ideas. The defined rules of the preparation phase
can be seen as the first of various filters toward a potential commercialisation
[27].
Among the problem types defined by Gerlach and Brem in the IM process, this
research addresses “a new technology looking for a new application”, in this case,
graph theory [25]. Therefore, the preparation phase entailed an extensive
background research on graph theoretical concepts and applications, as well as
the design of a nuanced approach from traditional IM praxis for the idea
generation and selection phase.

3.3.2 The idea generation and suggestion phases


This phase is usually of distributed nature, insofar as people from the
organisation submit and present their ideas. A critical factor to consider is the
reward system to engage employees for submission. Due to the novelty of the
technology constraint (i.e. the use of graph theory) and the lack of reward
possibilities, this research proposes a data-driven approach for the idea generation
phase. The interviews conducted helped identify critical problems in each sector
of activity of the firm, seeding the search for past use-cases of graph-based
solutions and contributing to assessing their relevance to the organisation.
A vast inventory of use-cases was compiled across the core activities at the utility
company, spanning all possible types of graph-theoretic solutions, both from
academia and the industry. In parallel, interviews were conducted to understand
the challenges and opportunities in the various sectors Fortum operates in,
providing deeper insights in the relevancy of the inventoried use-cases and their
implications. The use-cases were clustered into areas of applications and filtered
by relevancy to the company’s value chain as part of the “idea improvement”
process suggested by several authors in the IM literature [23, 28, 29].
The approach to generating application ideas is inspired by the concepts of
“market pull” and “technology push” [38]. In the “market pull”, the source of the
innovation is an unmet need of the customer (in this case, it could be both the
end-customer and the end-user of the solution). This results in new demands for
problem-solving (‘invent-to-order’ a product for a certain need). The impulse
comes from individuals or groups who (are willing to) articulate their subjective
demands.
In the “technology push”, the stimulus for new products and processes comes
from (internal or external) research; the goal is to make commercial use of new
know-how. The impulse is caused by the application push of a technical
capability. Therefore, it does not matter if a certain demand already exists or
not.
In this research, the pull approach consists in identifying the needs in the
respective sectors, whereas the push approach considers the benefits and
drawbacks of the technology itself and the solution it has permitted in academia
and various industries.
Technology push
The push approach consists in identifying classic, verified use-cases of the
technology in order to understand the inherent benefits of the technology in
practice. This inventory of use-case domains provides an important benchmark
for tying the utility company's inventory of areas of improvement to graph
analytics. The two sources of benchmark use-cases are academia and the industry.
Academia
Graph theory is a mature and vastly studied mathematical area. A Scopus
search for “graph theory” yields about 100,000 results; as a comparison, a
“blockchain” search returns about 15,000 hits. In addition, graph-theoretic
methods have a high degree of adaptability across engineering tasks, and thus
potentially across a utility company's operations.
Therefore, a systematic, programmatic search for research articles relating graph
theory to energy activities was performed in order to guarantee a comprehensive
scan for potential applications. The first phase consisted in searching a
combination of keywords representing graph theory: “graph theory”,
“graph-based”, “network theory”, in combination with each energy sector the
company is operating in (e.g. hydropower, nuclear power, wind power, etc.).
The second phase consisted in listing graph-related concepts and algorithms
(about 70) and identifying their presence in academia across the energy sectors.
Due to the higher dimensionality of this search (70 × 15 entries), the search was
done programmatically with the Scopus API (Application Programming
Interface), on the TITLE-ABSTRACT-KEYWORDS fields. In total, about 3,000
articles were found, stored and curated, first automatically according to various
filters and then manually with the help of the abstracts.
Some interesting articles could only be found thanks to the higher specificity of
the graph-theoretic concepts used in the search: graph theory is a rather broad
area of mathematics, and the term itself is often too generic to appear in an
article's abstract or keywords on Scopus.
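The combinatorial keyword search described above can be sketched as follows. The keyword lists are abbreviated placeholders (the real search used roughly 70 concepts and 15 sectors), and the actual call to the Scopus Search API is omitted; only the query-string generation is shown:

```python
from itertools import product

# Illustrative subsets of the two keyword lists (the real lists were longer).
graph_concepts = ["minimum spanning tree", "centrality", "community detection"]
energy_sectors = ["hydropower", "wind power", "electricity price"]

# Build one TITLE-ABS-KEY query string per (concept, sector) pair,
# in the field syntax accepted by the Scopus Search API.
queries = [
    f'TITLE-ABS-KEY("{concept}" AND "{sector}")'
    for concept, sector in product(graph_concepts, energy_sectors)
]

print(len(queries))  # 3 x 3 = 9 query strings
print(queries[0])    # TITLE-ABS-KEY("minimum spanning tree" AND "hydropower")
```

Each query string would then be submitted programmatically and the results stored for the filtering steps described above.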
Industry
As described in the background, graph-based solutions have attracted increasing
interest from industry, particularly among highly digitalised companies.
Benchmark use-cases from industry were mainly extracted from classical
use-cases published online, as well as publicised case studies from graph-database
providers (Neo4j, Expero). The industry benchmark use-cases have a lower a
priori relevance to the energy industry specifically and are less detailed in their
implementation than the examples from academia, but they give a
complementary view of what is being worked on in a more practical manner.
Demand pull

The pull approach enables a technology-agnostic way of addressing the
company's needs, taking a user-centric rather than a technology-centric
perspective and thereby helping to avoid shoehorning a technology into a
problem. In this process, the activities of the company are mapped and an
inventory of the pain-points the company faces is compiled.
The pull approach consisted in interviews with experts in various domains of
activities at Fortum:
◦ Hydropower (asset management, plant optimisation, scheduling)
◦ Data science (worked on various projects)
◦ Project management
◦ Digitalisation
◦ Energy trading
◦ Wind power
◦ New ventures
◦ Charge and Drive
In this phase, the use-cases were clustered according to their domain of
application and their graph-related concepts and algorithms. This provides a
detailed picture of which problems have previously been solved and with which
approach. In this phase of the research, the question of where graph theory can
be used is considered more important than which graph theory concepts are most
useful. As such, the domains of applications are evaluated in the next step, in
other words, a problem-centric approach was taken. A graph-centric evaluation
approach would also have been possible, given the transferability of graph
methods to multiple domains of application. However, we believe that a
problem-centric evaluation gives more value to practitioners, since there is more
organisational friction in applying a graph-based solution across sectors than in
applying different graph-based algorithms within the same domain of application.
Figure 3.1 summarises the idea generation phase process.


Figure 3.1: The idea generation phase

3.3.3 The evaluation phase


In this research, the selection of ideas consisted of two successive steps: first, an
application domain was prioritised; second, a specific use-case within that
domain was selected.
One of the key issues of an idea management programme is selecting, from a
large pool, the ideas that offer the greatest potential for the future success of the
organisation [32]. To structure this high information load, suitable selection
criteria are required. However, there is no single dominant set of criteria as every
organisation has its own goals, needs and culture as well as individual budgets
and timetables [33]. Therefore, the evaluation criteria are generally chosen by the
organisation itself [30, 31]. The total score of an idea on these criteria indicates
whether it should be accepted, deferred, or rejected.
In their guidelines, Gerlach et al. have assembled the evaluation criteria most
commonly proposed in IM literature [25]. They span a wide array of dimensions,
such as technology, organisational culture, strategy, business, etc. The evaluation
tool in this research consists of four dimensions: graph applicability, technical
feasibility, economic potential and workability, whereby each of the dimensions
further has four criteria, as per the table below. The clusters were scored from 0
to 2 on each of the criteria and the dimension score is the aggregate score of its
underlying criteria.
Graph applicability | Technical feasibility | Economic potential | Workability
Underlying graph structure | Simplicity of model | Sector size | Relevance
Richness of relationships | Homogeneity of tools | Sector growth | Data alignment
Identified concepts and algorithms | Computational constraint | Substitute tools | Human alignment
Availability of supporting use-cases | Risk | Scalability | Integrability and maintainability

Table 3.1: Evaluation dimensions and criteria.
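The scoring arithmetic can be sketched as a simple aggregation. The scores below are made-up illustration values, not actual evaluation results from the study:

```python
# Each criterion is scored 0-2; a dimension's score is the sum of its
# four criterion scores, so each dimension ranges from 0 to 8.
cluster_scores = {
    "Graph applicability": {
        "Underlying graph structure": 2,
        "Richness of relationships": 1,
        "Identified concepts and algorithms": 2,
        "Availability of supporting use-cases": 1,
    },
    "Technical feasibility": {
        "Simplicity of model": 1,
        "Homogeneity of tools": 2,
        "Computational constraint": 1,
        "Risk": 1,
    },
}

dimension_scores = {dim: sum(crit.values())
                    for dim, crit in cluster_scores.items()}
print(dimension_scores)  # {'Graph applicability': 6, 'Technical feasibility': 5}
```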

The evaluation tool and set of criteria were chosen under the guidance of our
supervisors at KTH and experts from Fortum, with the following objectives in
mind:
◦ It needs to span a wide space of decision-making dimensions, otherwise the
evaluation would risk being misleading.
◦ Redundancy is minimised, meaning that factors where all clusters would
score equally are omitted (for example, the presence of a database,
potential presence of big data, etc.)
◦ The size of each evaluation block should be balanced, in order to avoid a
bias arising from a dimension including more criteria than others.
◦ A trade-off is needed between the simplicity and comprehensiveness of the
evaluation, to make it both user-friendly and complete.
◦ The evaluation needs to comprise both industry-specific and
company-specific criteria, to be both reproducible and relevant for the
company.
The difficulty of evaluating an idea often stems from the lack of background
information available to the ideator or examiner when describing or assessing the
potential or limitations of an idea [34]. To master an information-extensive
evaluation process, suitable evaluation criteria are crucial for ensuring reliability.
The criteria represent guiding factors that can enable transparency,
comparability and repeatability. To ensure this, formal definitions of each grade
in the scoring system were made and attached in the appendix.
As was proposed by Edeland et al. in their assessment of the potential of
blockchain technology in the energy sector, an alignment of each cluster with the
company’s strategic objectives was made. This allowed for a more contextualised
prioritisation in accordance with the corporate objectives. In fact, strategy
alignment is often cited as one of the most important dimensions in IM literature.
Strategic alignment
How well does the use-case cluster align with the strategic objectives expressed by
the company?
In this section, all four main strategy aspects share the same score definitions,
as follows:
◦ 0: No impact on strategy point.
◦ 1: Indirect impact for the time being; potential impact in the long run,
subject to changes endogenous or exogenous to the model.
◦ 2: Direct and immediate impact.
The four strategy points in Fortum’s CEO’s Business Review are outlined below
[40]:
◦ Pursue operational excellence and increased flexibility
Fortum attaches high priority to extracting value from its current business
portfolio. Increased flexibility refers to the flexible generation assets and the
demand response of large customers, used to balance the volatility caused by
renewable generation. Operational excellence refers to minimising operational
costs, either through productivity increases or through distinction in asset
management.
◦ Ensure value creation from investments and portfolio
optimisation
Fortum aims to consolidate its sizeable investments from the recent years
to further improve its financial performance. This includes the investment
in Uniper, a German generator with a portfolio mainly comprising flexible
power plants (gas, coal, hydro, oil and nuclear). In addition, a continuous
review of Fortum's portfolio of power plants is made, with an emphasis on
CO2-free assets, flexibility and low operating costs.
◦ Drive focused growth in the power value chain
Fortum aims to grow its CO2-free power generation portfolio in an
asset-light manner, for example through partnerships or co-ownership. This
medium-term strategy aspires to extract value from its long-standing
expertise, which it could capitalise upon through an increasingly
service-oriented value capture. Digitalisation is an enabler for delivering
such services to households, the grid operator, industrial customers and
cities alike.
◦ Build options for significant new businesses
The uncertainty of the energy sector in the longer run will create new
business opportunities which Fortum wants to seize, aiming at greater
independence from power prices and high profit contributions. This strategy
point includes areas such as circular economy, waste and recycling and
bio-economy. It also includes collaborations with startups and new
ventures.
The mapping of the operational score against the strategic score was used to
divide the use-case clusters into three groups, as is suggested by Gerlach and
Brem in their conceptual model [25]:
◦ Accepted use-case clusters.
◦ Deferred use-case clusters.
◦ Rejected use-case clusters.
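The mapping from scores to groups can be sketched as a simple threshold rule. The threshold values and the helper function below are illustrative assumptions, as the exact cut-offs used in the evaluation are not specified here:

```python
def classify(operational: int, strategic: int,
             accept_at: int = 10, reject_below: int = 5) -> str:
    """Classify a use-case cluster from its operational and strategic
    scores. The threshold values are hypothetical illustrations."""
    total = operational + strategic
    if total >= accept_at:
        return "accepted"
    if total < reject_below:
        return "rejected"
    return "deferred"

print(classify(operational=7, strategic=5))  # accepted
print(classify(operational=4, strategic=2))  # deferred
print(classify(operational=2, strategic=1))  # rejected
```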
The second phase of the selection process is concerned with selecting a specific
use-case. Three use-cases are proposed, which were deemed to best represent the
accepted use-case cluster and be in line with Fortum’s operations. The use-cases
were scored according to the same evaluation criteria, though not the strategy
alignment, as they all belong to the same cluster and thus have the same
strategic score. When the selection of use-case was finally made, the modelling
and implementation phase was started.
Figure 3.2 summarises the idea selection phase.

Figure 3.2: The idea selection phase

3.3.4 The implementation phase


The actual implementation of the idea is important to demonstrate the feasibility
of an idea management programme [35]. Thus, it is a crucial motivational factor
for ideators as well. To successfully carry out the implementation process, clear
responsibilities and teamwork are required [36]. Such an implementation team
can consist of project managers, developers and subcontractors [37].
A full-scale implementation is outside the scope of this research; however, a
feasibility study was performed with a more detailed conceptual model and a
proof-of-concept. It consisted of the following steps:
◦ A more detailed background on the selected use-case and of the
graph-technology chosen is necessary to know how to find the best fit
possible.
◦ Data gathering and pre-processing. A data set was provided by Fortum for
analysis.
◦ Implementation of algorithms on benchmark, open source data sets as well
as on the actual studied data set.
◦ Implementation of evaluation criteria for the performance of the algorithms
chosen.
◦ Model improvements and extensions, as well as key implications for
practitioners, were proposed.

3.4 Literature review


Literature reviews were conducted to gain a deeper understanding of the various
dimensions of this study:
◦ The mathematical fundamentals of graph theory: the concepts, common
graph-based problems and applications both in academia and industry.
◦ The techno-economic challenges and opportunities of the sectors studied.
◦ The research advancements within the selected use-case cluster.
In the first phase of the research, the literature covered came both from academia
(research articles and books) and “grey literature” from the industry (white
papers, reports, blog-posts) as both sources would complement each other.
The “grey” literature gives business insights and inspiration in potential areas of
application more or less connected with a utility company’s operations, and it
provides a certain degree of confidence in the commercialisation possibilities and
benefits. However, in contrast with academic papers, such sources generally lack
technical details, giving little information on how graphs were employed.
Another limitation of grey literature was the scarce information from commercial
implementation of graphs within the energy sector and the potential lack of
objectivity and verifiability of their solutions and their benefits.
On the other hand, academic research in energy-related applications is vast and
rich in technical details, giving the researcher a clear understanding of the
implementation process, as well as a more objective evaluation of the
graph-based solution. The drawback of these resources is that little knowledge
can be derived regarding the commercialisation possibilities of the solutions,
hence a higher implementation risk.
The “grey literature” was found from graph-database providers and graph
analytics companies (AWS, Neo4j, Google, Expero). The academic literature search was
made in a systematic manner, by performing a grid search on Scopus combining
all graph-related concepts with all potential sectors of application.


In the second phase of the research, another literature review was conducted.
The scope of sources was narrowed to academic papers related to the selected
use-case to gain insight in the challenges and opportunities of possible methods
and find an interesting angle to tackle the problem.

3.5 Data collection


As suggested by Saunders et al., data for exploratory research should in practice
be collected both from primary and secondary sources [17]. Primary data were
collected by exploratory discussions, semi-structured interviews and group
discussions. Both qualitative and quantitative secondary data were collected. The
qualitative data came both from external sources (papers and reports,
presentations, videos) and internal sources (internal documents and
presentations). The quantitative data were used for implementing the
proof-of-concept of the selected use-case.

3.6 Semi-structured interviews


Semi-structured interviews are well suited for exploratory research. Although
they are non-standardised and adapted to the background of the participant, a
common list of themes and questions to be covered is predetermined by the
researcher. The objective is to gather the opinions of respondents regarding
complex and sometimes even sensitive issues.
The semi-structured interviews were made with experts from the fields covered
by the research. The overall objective was to understand the particular
challenges and methods used in their activities, as well as discuss the potential
benefits that graph theory could generate in this field. They were, however,
adapted to the particular domain of expertise of the expert (data scientist, asset
manager, innovation manager, etc.), hence some specific questions could be
omitted and flexibility was allowed to enable exploratory discussions.

Part I

Graph theory and applications

CHAPTER 4
Elementary terminology

This chapter conveys a number of selected concepts in graph theory which are
typically presented in literature about graph theory on an introductory level.
The definitions and properties are intentionally written in running text to create
coherent paragraphs and a relaxed reading experience. If no specific reference is mentioned,
the mathematical definition, theorem, property, example or statement in
question is common and can be found in such textbooks as [41], [42] and [43].

4.1 Graph and subgraph


A graph in graph theory is, simply put, a collection of vertices (synonym: nodes)
and edges, where an edge can be drawn between two vertices to show that these
vertices are somehow related to each other. In engineering problem solving,
vertices can represent elements in a system, such as people, animals, buildings,
electric components and financial assets.
If a graph consists of ten vertices representing ten people in a
neighbourhood, one can choose to draw an edge between two vertices if the
corresponding people are neighbours, or if they own the same number of pets.
There is a lot of freedom for one to define what vertices and edges represent,
depending on what is interesting to study. In the next chapter, some real-life
applications of graphs will be presented.
Mathematically speaking, a graph G is an ordered pair (V, E), where V is a
finite set and E is a set of pairs of elements in V. The set V consists of vertices
and E of edges in G. A graph H is called a subgraph of G if each vertex and
each edge in H is also in G. In Figure 4.1 (a), the graph G = (V, E) has 7 vertices
and 9 edges, where V = {a, b, c, d, e, f, g} and
E = {{a, c}, {a, e}, {b, c}, {b, d}, {c, e}, {c, f }, {d, e}, {e, g}, {f, g}}.

(a) Graph G = (V, E)          (b) Adjacency matrix of G:

        a  b  c  d  e  f  g
    a   0  0  1  0  1  0  0
    b   0  0  1  1  0  0  0
    c   1  1  0  0  1  1  0
    d   0  1  0  0  1  0  0
    e   1  0  1  1  0  0  1
    f   0  0  1  0  0  0  1
    g   0  0  0  0  1  1  0

(c) Graph H = (V, E′)          (d) Graph K = (V, E′′)

Figure 4.1: Examples of graphs, where H is a subgraph of G and K is another
subgraph of G. The graph K is not a subgraph of H since the edge {b, c} in K is
not in H.

Two vertices x, y ∈ V are called neighbours if there is an edge between them,


i.e. if {x, y} ∈ E. In this case, one can equivalently say that x and y are adjacent
vertices and that each of x and y is incident to the edge {x, y}. Two edges are
adjacent edges if they have a common vertex. The structure of a graph,
regarding whether the n vertices v1 , v2 , · · · , vn are adjacent, can be algebraically
represented with an adjacency matrix A of size n × n, where the matrix
element Aij = 1 if the vertices vi and vj are adjacent, and Aij = 0 otherwise.
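The adjacency matrix can be built directly from an edge list. As a minimal sketch (not part of the thesis implementation), the following Python code constructs the matrix A of the graph G from Figure 4.1 and reads off the vertex degrees as row sums:

```python
# Build the adjacency matrix of the graph G from Figure 4.1.
vertices = ["a", "b", "c", "d", "e", "f", "g"]
edges = [("a", "c"), ("a", "e"), ("b", "c"), ("b", "d"), ("c", "e"),
         ("c", "f"), ("d", "e"), ("e", "g"), ("f", "g")]

index = {v: i for i, v in enumerate(vertices)}
n = len(vertices)
A = [[0] * n for _ in range(n)]
for x, y in edges:
    A[index[x]][index[y]] = 1  # undirected edge: the matrix is symmetric
    A[index[y]][index[x]] = 1

# The degree of a vertex equals the sum of its row in A.
deg = {v: sum(A[index[v]]) for v in vertices}
```

For an undirected graph the matrix is always symmetric, which the construction above guarantees by setting both entries for each edge.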
A basic and important metric in a graph is the degree (synonym: valency) of a
vertex x, denoted deg(x), which is the number of edges incident to x. A graph
where each vertex has degree k is called a k-regular graph. The maximum
degree of a graph, denoted ∆(G), is the maximum degree of its vertices. In the
same manner, the minimum degree of a graph, denoted δ(G), is the minimum
degree of its vertices.

4.2 Graph traversal


If a graph represents for example a network of buildings (vertices) and roads
(edges), it is natural to define some kinds of walks through the vertices:
◦ A walk is a sequence of vertices v1 v2 . . . vk , where v1 , v2 , · · · , vk ∈ V and
{vi , vi+1 } ∈ E for all i = 1, 2, · · · , k − 1. Not all vertices in V need to be
included in a walk.
◦ A trail is a walk which does not go through any edge more than once.
◦ A circuit is a closed trail, meaning that it starts and ends at the same
vertex.
◦ A path is a trail which does not go through any vertex more than once.
◦ A cycle is a closed path.
By these definitions, a cycle is a kind of path, which in turn is a kind of trail;
every path is a trail, but not every trail is a path. Do note that these names
are not standardised and different authors use them differently. In this paper, the
terms will be used consistently as described above.
Two special walks of interest in many applications are Eulerian trail and
Hamiltonian path. These walks are named after the Swiss mathematician
Leonhard Euler (1707-1783) and the Irish mathematician William Rowan
Hamilton (1805–1865), respectively. An Eulerian trail is a trail which goes
through each edge in E exactly once. A Hamiltonian path is a path which goes
through each vertex in V exactly once. If an Eulerian trail or a Hamiltonian
path is closed, it is called an Eulerian circuit or a Hamiltonian cycle,
respectively.
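The distinctions between walks, trails and paths can be checked mechanically. The Python sketch below classifies vertex sequences in a small made-up graph (not one of the figures in this thesis):

```python
# Classify a vertex sequence in an undirected graph as a walk, trail and/or path.
edges = {("a", "b"), ("b", "c"), ("c", "d"), ("b", "d"), ("d", "e")}

def has_edge(x, y):
    return (x, y) in edges or (y, x) in edges

def is_walk(seq):
    # a walk: consecutive vertices must be joined by an edge
    return all(has_edge(seq[i], seq[i + 1]) for i in range(len(seq) - 1))

def is_trail(seq):
    # a trail is a walk that repeats no edge
    used = [frozenset((seq[i], seq[i + 1])) for i in range(len(seq) - 1)]
    return is_walk(seq) and len(used) == len(set(used))

def is_path(seq):
    # a path is a trail that repeats no vertex
    return is_trail(seq) and len(seq) == len(set(seq))
```

For example, the sequence a, b, c, d, b is a trail but not a path, since the vertex b is visited twice.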


(a) Graph G    (b) A trail in G    (c) A path in G
(d) A circuit in G    (e) A cycle in G

Figure 4.2: Various kinds of walks in a graph. Note that the circuit in (d) is not a
cycle since it passes the vertex b twice. The cycle in (e) is, however, also a circuit.

4.3 Trees and connectivity


A graph is said to be connected if there is at least one path from any vertex to
any other vertex in the graph. A graph which is not connected is called
non-connected or disconnected. An important type of connected graph is a
tree, which is defined as a connected graph without cycles.
If a graph G = (V, E) is connected, it has at least one subgraph which is a tree
containing all vertices in V. Such a tree is called a spanning tree 1 of G. A graph
may have multiple spanning trees.
1
The term “spanning tree” has been translated to “spännande träd” (“exciting tree”) in some
Swedish textbooks, which has a rather humorous effect. Those who wish to be extra serious can
instead say “uppspännande träd”.

(a) Graph G    (b) Graph H

Figure 4.3: The graph G is connected but is not a tree since it has a cycle. The
graph H has no cycles but is not a tree either since it is non-connected.

Figure 4.4: Differently configured trees with 5 vertices and 4 edges each.

(a) A connected graph G    (b) A spanning tree of G    (c) Another spanning tree of G

Figure 4.5: Examples of spanning trees in a connected graph.


Two important metrics about the notion of connectivity in a graph G = (V, E)
are vertex connectivity and edge connectivity. The vertex connectivity of G,
denoted κ(G), is the least number of vertices in V which need to be removed
(together with their incident edges) to make G disconnected. The edge
connectivity of G, denoted λ(G), is the least number of edges in E which need
to be removed to make G disconnected. It can be shown that κ(G) ≤ λ(G) ≤ δ(G)
in any graph. As explained in Section 4.1, the notation δ(G) stands for the
minimum degree of G.
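Connectivity can be tested with a breadth-first search from an arbitrary start vertex: the graph is connected exactly when the search reaches every vertex. A minimal Python sketch, with made-up example graphs:

```python
from collections import deque

def is_connected(adj):
    """adj: dict mapping each vertex to a set of its neighbours."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    # connected exactly when the search reached every vertex
    return len(seen) == len(adj)

# A tree: connected and acyclic, hence |E| = |V| - 1.
tree = {"a": {"b", "c"}, "b": {"a"}, "c": {"a", "d"}, "d": {"c"}}
# Removing the edge {c, d} from the tree would disconnect it; here the graph
# below is already split into two components.
broken = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c"}}
```

Running the check repeatedly after trial removals of vertices or edges gives a brute-force way to estimate κ(G) and λ(G) on small graphs.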

4.4 Matching
Consider a group of people where some of them are mutually in love with each
other. If two people, who are already in love with each other, get married to each
other, this can be considered a matching. Typically, each person can take part in
at most one marriage (matching). In graph theory, a matching M in a graph
G = (V, E) is defined as a subset of E, where the edges in M are pairwise
disjoint, meaning that any two edges in M do not share a vertex.
If a matching contains as many edges as possible, it is called a
maximum matching. If a matching covers all vertices in V , it is called a
perfect matching. Some graphs do, however, not have a perfect matching,
which aligns with the reality that not all people can in the end get married to
their dream partners.

4.5 Colouring
Given a graph G = (V, E) drawn 2 on a surface, one may sometimes want to put
a colour on each vertex in V such that no two adjacent vertices get the same
colour. More formally speaking, a proper vertex colouring for G is a function
c : V → N such that if {x, y} ∈ E, then c(x) ≠ c(y). If nothing else is stated, a
vertex colouring should always be interpreted as a proper vertex colouring.
The least number of colours required to vertex colour a graph G is called the
chromatic number (synonym: vertex colouring number) of G, denoted χ(G) or
simply χ if there is no ambiguity. The Greek letter χ (khi, chi) has to do with
the fact that the Greek word for “colour” is “χρώμα” (khróma, chróma).
2
A more mathematical term for “drawn” in this case is “embedded”.


(a) Graph G plainly drawn    (b) Graph G with a matching M1
(c) Graph G with another matching M2    (d) Graph G with another matching M3

Figure 4.6: Examples of matchings, where the edges in each matching are
marked red. The matchings M1 and M2 are not maximum. The matching M3
is both maximum and perfect.

In the same manner, a proper edge colouring is a colouring of edges where no
two edges which share a vertex get the same colour. The least number of colours
required to edge colour a graph G is called the chromatic index (synonym: edge
colouring number) of G, denoted χ′(G) or simply χ′.

4.6 Directed graph


So far in this paper, the term “edge” has been used to refer to undirected
edge, which represents some kind of symmetric relationship between two
vertices. If two vertices represent two people x and y, and an edge is drawn
between these if they are friends, then this edge {x, y} does not need a direction
if one considers the friendship as symmetric. Here, “symmetric” means that if x
is a friend of y, then y is also a friend of x.
Nevertheless, not all relationships need to be symmetric. Imagine two vertices


(a) Graph G with χ(G) = 3    (b) Graph H with χ(H) = 4

Figure 4.7: Examples of vertex colourings using as few colours as possible.
The graphs have the same number of vertices and the same number of edges,
but different chromatic numbers.

representing two objects x and y in a power grid, where x is a generating plant
and y is a household. If x delivers electricity to y, one can choose to describe this
flow with an arrow from x to y, implying at the same time that there is no
electricity sent from y to x. This arrow is called a directed edge, alternatively
an arc.
A graph where each edge is directed is called a directed graph. The
contracted term digraph is also used for convenience.

4.7 Weighted graph


In some applications, one also wants to label the edges to associate the
relationships between vertices with numerical values, also called weights.
A graph containing weighted edges is called a weighted graph.
An (undirected) edge between a city x and a city y can be labelled with a
positive number representing the geographic distance between them. Other
conceivable numerical values are for example resource flow between two factories,
correlation coefficient between two financial assets and expected travel time
between two train stations.

Figure 4.8: A weighted directed graph showing the number of people (red edge
labels) travelling between some cities (round vertices) at some point of time, in a
small fictitious kingdom far, far away.

CHAPTER 5
Selected applications of graphs

The applications described in this chapter are selected from a wide range of
areas, where graph theory has proven to be an effective tool to model and solve
problems. Each application begins with a couple of keywords, followed by an
example which can be adapted to the activities of Fortum as a utility company
and ends with some suggestions for adaptation possibilities.

5.1 University timetabling


Keywords: vertex colouring, chromatic number
An administrator needs to book classrooms for all the students having exams in
a day. There are m student groups g1 , g2 , · · · , gm and n available classrooms
r1 , r2 , · · · , rn . Not all groups start or finish at the same time point, and the exam
times for some groups may overlap partially or completely. It is also required
that at most one group can be in a classroom at any time. What should the
administrator do to book as few classrooms as possible? One motivation can be
that the fewer rooms booked, the lower the fee paid to cleaning companies at the
end of the day.
This situation can be modelled with a graph, where the vertex set is
V = {g1 , g2 , · · · , gm }. An undirected edge is drawn between two vertices if the
exam times for the corresponding student groups are going to overlap. The
timetabling problem is now equivalent to a vertex colouring problem with as few

colours as possible, where a classroom is considered as a colour and two adjacent
vertices in a proper vertex colouring cannot have the same colour [44]. The least
number of necessary classrooms is equal to the chromatic number of the graph.
In reality, there may be more parameters to take into consideration. Some
classrooms may have too few seats for some student groups, and some classrooms
may have different booking prices even if they have the same number of seats.
The aforementioned graph model would require more parameters to solve the
problem.
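The booking model above can be sketched with a greedy colouring, which assigns each student group the lowest-numbered free classroom. The overlap data below is made up for illustration, and greedy colouring only gives an upper bound on the chromatic number (computing it exactly is NP-hard in general):

```python
# Greedy vertex colouring for the classroom-booking example.
overlaps = {  # an edge means that two groups' exam times overlap
    "g1": {"g2", "g3"},
    "g2": {"g1", "g3"},
    "g3": {"g1", "g2", "g4"},
    "g4": {"g3"},
}

colour = {}  # colour index = classroom index
for group in sorted(overlaps):  # fixed order for reproducibility
    taken = {colour[n] for n in overlaps[group] if n in colour}
    colour[group] = min(c for c in range(len(overlaps)) if c not in taken)

rooms_needed = len(set(colour.values()))
```

Here g1, g2 and g3 overlap mutually (a triangle), so at least three classrooms are unavoidable, and the greedy solution happens to be optimal.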
Adaptation: A utility company with many employees such as Fortum can
make use of graph colouring to solve timetabling tasks effectively. The power
systems, such as hydropower and electricity power systems, that Fortum
frequently works with can also be monitored using graph colouring methods. For
example, the connectivity information [45] and the maintenance scheduling [46]
can be formulated as a vertex colouring of the system elements and transmission
lines, respectively.

5.2 Staff assignment


Keywords: perfect matching
A company has n workers w1 , w2 , · · · , wn and n tasks t1 , t2 , · · · , tn which need to
be done. The goal is to assign each worker to exactly one task, where each
worker is qualified for at least one task. Is it possible to come up with such an
assignment?
This is a brilliant opportunity to make use of matching in graphs. Build a graph
with 2n vertices representing all the workers and the tasks, meaning that the
vertex set is V = {w1 , w2 , · · · , wn } ∪ {t1 , t2 , · · · , tn }. An edge in this graph goes
between a worker wi and a task tj , but not between two workers or two tasks, if
wi is qualified for tj . The staff assignment problem is solved when a perfect
matching is found.
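A matching in such a worker-task graph can be found with the classical augmenting-path method: if a desired task is taken, try to re-assign its current holder to another task. A Python sketch with made-up qualification data (`try_assign` is a name chosen here for illustration):

```python
# Find a matching of workers to tasks with the augmenting-path method.
qualified = {  # worker -> tasks the worker is qualified for
    "w1": ["t1", "t2"],
    "w2": ["t1"],
    "w3": ["t2", "t3"],
}

def try_assign(worker, match, visited):
    for task in qualified[worker]:
        if task in visited:
            continue
        visited.add(task)
        # take the task if it is free, or try to re-assign its current holder
        if task not in match or try_assign(match[task], match, visited):
            match[task] = worker
            return True
    return False

match = {}  # task -> worker
for worker in qualified:
    try_assign(worker, match, set())

is_perfect = len(match) == len(qualified)
```

In this small instance every worker ends up with a task, so the matching is perfect; in general the method finds a maximum matching.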
Adaptation: Fortum can make use of matchings to effectively assign tasks to
their employees. In the task of monitoring electrical systems such as smart grids,
anomalies in an electric network database can be detected with a graph matching
approach [47]. Although a smart grid is advantageous compared to a traditional
grid considering the integration of renewable energy sources and bidirectional
energy flows among other things, its data are highly vulnerable, making the
power system exposed to security risks [48]. It is interesting to note that graph
matching is not the only approach to detect anomalies in smart grid problems.
For instance, normative subgraphs can also be considered [49].

5.3 Cost-effective railway building


Keywords: weighted graph, minimum spanning tree
An engineer has got a mission to design a railroad network connecting the n
neighbourhoods h1 , h2 , · · · , hn , where an underground station is located at each
neighbourhood. The cost of building a railroad between every pair of
neighbourhoods {hi , hj } is given as a positive number cij = cji . Help the engineer
design a railroad network which costs as little as possible.
If h1 , h2 , · · · , hn are the vertices in a graph G, one can draw all the conceivable
edges to visualise all the potential costs of construction between two stations.
Each edge is a weighted undirected edge with weight cij > 0, where
i, j ∈ {1, 2, · · · , n}. Since the railroad network needs to connect all the
neighbourhoods, one wants to determine a spanning tree in G with the minimum
total weight. In other words, one wants to find a minimum spanning tree in
the graph.
A good question here is why one would look for a spanning tree, which is a
connected spanning subgraph of G with no cycles. The “connected spanning”
part is about connecting all neighbourhoods, but why “no cycles”? The reason is
simple. If one has a subgraph which goes through all the vertices in G and
contains a cycle, one or more edges in the cycle can always be removed to create
a new subgraph which still goes through all the vertices in G but has a smaller
total weight, since each individual edge weight is positive. This explains why a
spanning tree is better than any other connected spanning subgraph with cycles.
This kind of cost-effective railway network may, however, not be time-effective
for the residents in some neighbourhoods.
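One standard method, Kruskal's algorithm, builds a minimum spanning tree by scanning edges in order of increasing weight and keeping each edge that does not close a cycle. A Python sketch with made-up construction costs:

```python
# Kruskal's algorithm for a minimum spanning tree.
costs = [  # (cost, neighbourhood i, neighbourhood j)
    (4, "h1", "h2"), (1, "h1", "h3"), (3, "h2", "h3"),
    (2, "h2", "h4"), (5, "h3", "h4"),
]

parent = {h: h for h in ("h1", "h2", "h3", "h4")}

def find(x):  # union-find root lookup with path halving
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

tree, total = [], 0
for cost, i, j in sorted(costs):  # cheapest edges first
    ri, rj = find(i), find(j)
    if ri != rj:  # the edge joins two components, so no cycle is created
        parent[ri] = rj
        tree.append((i, j))
        total += cost
```

The resulting tree has exactly n − 1 = 3 edges, as any spanning tree on four vertices must.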
Adaptation: Although a utility company does not constantly deal with
infrastructure projects, the concept of minimum spanning tree is interesting since
it is essentially about reduction in networks to reduce cost, decrease complexity,
extract the most important links and so on. In a power distribution network with
numerous switches, power flow paths can be more easily monitored using minimum
spanning trees combined with Kruskal’s algorithm to identify these [50]. When
studying financial markets, such as stock markets [51] and energy commodity
markets [52], a minimum spanning tree can be a simple, yet powerful, tool to
uncover the characteristics and dynamics of financial instruments and
institutions.

5.4 Logistic network and optimal routing


Keywords: shortest path
One of the most typical and important questions each logistic network faces is to
find the shortest path between two stations, which can be train stations, petrol
filling stations or pick-up and delivery stations. Take an undirected weighted
graph G = (V, E) where V is the set of all stations and each edge between two
vertices represents a physical road between two stations. There can be multiple
edges between two vertices in this case, and an edge weight measures how far two
stations are from each other.
Three elementary problems about shortest path in a logistic network are as
follows:
◦ The single-pair shortest path problem: find the shortest path between
two given vertices x, y ∈ V .
◦ The single-source shortest path problem: find the shortest path
between a given source x ∈ V to every other vertex in V .
◦ The all-pairs shortest path problem: find the shortest path between
each pair of vertices in V .
Numerous algorithms have been proposed to effectively solve these kinds of
optimal routing problems. Depending on the purpose and the graph structure,
different algorithms are suitable. For instance, a single-source shortest path
problem described by a graph with non-negative edge weights can be solved by
Dijkstra’s algorithm. If the edge weights are allowed to be negative, the
Bellman-Ford algorithm is applicable [53]. If the graph is known to have almost
no cycles, Takaoka’s algorithm can be used to accelerate the search [54].
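Dijkstra's algorithm maintains a priority queue of tentative distances and repeatedly settles the closest unvisited vertex. A Python sketch of the single-source variant, with made-up station distances:

```python
import heapq

graph = {  # station -> {neighbour: distance}
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2, "D": 8},
    "D": {"B": 5, "C": 8},
}

def dijkstra(source):
    """Shortest distances from source, assuming non-negative edge weights."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for w, weight in graph[v].items():
            nd = d + weight
            if nd < dist.get(w, float("inf")):
                dist[w] = nd
                heapq.heappush(heap, (nd, w))
    return dist
```

From station A the shortest route to B goes via C (distance 1 + 2 = 3), not along the direct edge of weight 4, which is exactly the kind of detour the algorithm discovers.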
Adaptation: Fortum has a business unit called Charge & Drive which manages

charging infrastructure for electric vehicles [55]. How to guide vehicles to
charging stations is a relevant and non-trivial optimisation problem. Not only
the geographic distance but also the driving time, waiting time, charging time,
longevity of the battery and so on need to be considered. Logistic graphs and shortest
path algorithms can come into the picture and offer efficient solutions [56].
Despite not being about optimal routing, another related problem about
charging infrastructure is siting and sizing of charging stations. A seemingly
natural choice for reasonable placements is the vicinity of fast food stores or
petrol stations. However, more insights and motivated decisions to minimise
costs for both the company and the users can be gained with a mathematical
study using graph-based models [57, 58].

5.5 Planar embedding of graphs


Keywords: planar graph
In design problems for radioelectronic circuits, utility lines and underground
passageways, one may find it important to come up with a graph representation
whose edges do not intersect each other, or intersect each other as little as possible. Two
intersecting edges in a circuit may for example lead to destructive interference. If
some underground stations are placed at disadvantageous locations, one may
need to build their connecting tunnels on different levels, which causes a higher
construction expense.

a    b    c

E    G    W

Figure 5.1: The complete bipartite graph K3,3 is not planar. A real-life
interpretation is as follows: Given three households V1 = {a, b, c} and three
utility sources V2 = {E, G, W } (electricity, gas, water) on a plane ground, draw a
line connecting each household with each utility source (nine lines in total).
However one places the households or the utility sources and bends the lines,
there will always be at least two lines crossing each other.

There are theorems in graph theory which reveal in advance whether a graph is

planar or not, meaning it can be drawn (synonym: embedded) without
intersecting edges. Loosely speaking, if a graph has in some way too many edges,
there is a high probability that the graph is non-planar. A famous example is the
utility graph K3,3 , which is a complete bipartite graph
G = (V1 ∪ V2 , E), where each of the vertex sets V1 and V2 consists of
three vertices and there is an edge connecting each vertex in V1 with each vertex
in V2 .
Adaptation: Planar graphs have appeared in several applications involving the
design of networks, both physical networks and communication networks,
and tools that assist in creating planar embeddings have been of interest
[59, 60]. However, there seems to be little research on the design of such
networks that is relevant for Fortum.
On the other hand, as pointed out in [61, 62], knowledge about planar graphs
and their properties can help in developing customised problem-solving algorithms
which are highly suitable for planar graphs but not for graphs in general. In that
case, graph planarity can be considered a valuable supporting concept to further
study network models and develop algorithms, rather than a way to solve
drawing problems.

5.6 Winter road maintenance


Keywords: graph traversal, Eulerian trail
Being a prominent graph theoretician, a technician is asked to design a route
plan for a snowplough to clean all the local streets on a winter day. The task is to
find an optimal route where the snowplough visits every street in a district, such
that the total travel distance is as short as possible.
The street network can be modelled with a connected graph whose edges are the
streets and the vertices are the intersections. Each edge has a weight
representing the geographic length of the corresponding street. Under the
assumption that the snowplough is allowed to go both ways, the edges are undirected.
The original task is now equivalent to finding a walk in this graph such that the
total edge weight is minimised.
The fact is that if at most two vertices have odd degrees, there exists an Eulerian
trail which passes every edge exactly once and we have our solution. Otherwise,
the snowplough needs to visit some edges more than once if all the streets need
to be traversed. It turns out that it is possible to come up with a shortest trail
which passes each edge exactly once or at most twice [63].
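The degree criterion can be checked in a few lines. The Python sketch below counts odd-degree intersections in a made-up street network; an Eulerian trail exists in a connected graph precisely when at most two vertices have odd degree, and an Eulerian circuit when none do:

```python
# Check whether a connected street network admits an Eulerian trail, i.e.
# whether the snowplough can traverse every street exactly once.
streets = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "b")]

degree = {}
for x, y in streets:
    degree[x] = degree.get(x, 0) + 1
    degree[y] = degree.get(y, 0) + 1

odd = [v for v, d in degree.items() if d % 2 == 1]
has_eulerian_trail = len(odd) <= 2      # assuming the graph is connected
has_eulerian_circuit = len(odd) == 0
```

In this example exactly two intersections (b and c) have odd degree, so a trail covering every street once exists, but it cannot start and end at the same place.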
Adaptation: The related concept of a Hamiltonian cycle can be useful in
monitoring of smart grids, or distribution networks in general. State estimation
is essential for sustaining distribution management systems. In [64], a calculation
scheme based on Hamiltonian cycle is proposed to quickly obtain the network
states. From a wider perspective, traversing in graphs can be a powerful
technique for searching in graph databases where data is represented and stored
as a graph structure [65], if Fortum wishes to model their data sets this way. To
give a concrete example, graph traversals can be used to perform a recovery
sequence in automated fault management for medium voltage direct current
shipboard power systems [66].

5.7 Social network


Keywords: vertex degree, centrality
The degree of a vertex in a personal network can be directly interpreted as how
many relationships a person x has, and to some extent the popularity of that
person. A simple graph model consists of vertices corresponding to people and
an undirected, unweighted edge between two vertices means that those people
know each other. The higher degree a vertex has, or in other words the higher
the degree centrality of a vertex is, the more people x has a relationship with.
However, one cannot from this really see how influential a person is. Even if x
knows many other people, the relationships between x and each of these may be
weak and x may not have much influence on others. A politician may in this case
be interested in another useful graph metric called closeness centrality, which
is defined as the reciprocal of the distance sum Σy∈V, y≠x dist(x, y). Here,
dist(x, y) denotes the shortest distance between two people x and y, if one
assigns each edge in the graph a weight which indicates the relationship
distance between two people, meaning how well they know each other.
In general, it is often desirable to detect which vertices have the most
important role in a graph, or in other words which vertex is the most central.
There are some different ways to mathematically define what “most central”
means, depending on the real-life interpretation. Besides degree centrality and closeness
centrality, some other useful types of centrality are betweenness centrality and
eigenvector centrality. A famous variant of eigenvector centrality is PageRank
centrality 1 used by Google for ranking web pages in their search engine [67]. A
small but illustrative example is given in Figure 5.2 and Table 5.1. Note that the most
central vertices are different depending on what centrality measures are used.
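Degree centrality and PageRank can be computed with elementary code. The sketch below uses a small made-up social graph (not the graph of Figure 5.2) and a plain power iteration for PageRank:

```python
# Degree centrality and a simple PageRank power iteration on a social graph.
adj = {
    "anna": {"bo", "carl"},
    "bo": {"anna", "carl", "dina"},
    "carl": {"anna", "bo"},
    "dina": {"bo"},
}

degree_centrality = {v: len(ns) for v, ns in adj.items()}

def pagerank(adj, damping=0.85, iterations=50):
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iterations):
        new = {}
        for v in adj:
            # each neighbour w passes on an equal share of its rank
            inflow = sum(rank[w] / len(adj[w]) for w in adj if v in adj[w])
            new[v] = (1 - damping) / n + damping * inflow
        rank = new
    return rank

rank = pagerank(adj)
most_central = max(rank, key=rank.get)
```

On this symmetric graph both measures agree that bo is the most central person, but on larger graphs the different centrality notions can and do disagree, as Table 5.1 illustrates.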
Adaptation: Developers working with strategies involving customers can use
graph-based social networks, with centrality among other concepts, as a powerful
decision support tool. A company can gain more customers and profits by for
example identifying influential customers and creating suitable marketing
campaigns [68]. Some interesting ideas for analysing consumer behaviours and
purchasing patterns are proposed in [69]. A graph-based approach for valuing
the explicit financial worth of paying customers as well as the implicit social
value of non-paying customers is suggested in [70].
An application of centrality measures worth mentioning for a utility company is
management of power systems. A plan for efficient placement of security sensors
in a large network can be achieved with the notion of graph centrality [71].
Hydraulic centrality metrics can be derived from the conventional centrality
metrics to extract useful information of a hydropower plant or a water
distribution network [72].


Figure 5.2: Graph vertices can be most central in different aspects.

1
The word “PageRank”, spelled with a capital P, a capital R and with no space, was named
after Larry Page, one of the founders of Google.


most central


Centrality a b c d e f g vertices


y
Degree 2 3 1 1 3 2 2 b, e
Closeness 0.10 0.09 0.06 0.06 0.09 0.07 0.07 a
Betweenness 9 9 0 0 8 0 0 a, b
PageRank 0.14 0.23 0.09 0.09 0.19 0.13 0.13 b

Table 5.1: The centrality measures of the vertices in Figure 5.2.
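As an illustration, the centrality measures of Table 5.1 can be reproduced with a short script. The edge set below is an assumption: a seven-vertex graph reconstructed so that its degrees match the degree row of the table, since the exact drawing of Figure 5.2 is not recoverable from text. Closeness is computed as 1 divided by the total distance to all other vertices, which is the convention that reproduces the table's values.

```python
from collections import deque
from itertools import combinations

# Assumed edge set, chosen to reproduce the degree row of Table 5.1.
edges = [("a", "b"), ("b", "c"), ("b", "d"), ("a", "e"),
         ("e", "f"), ("e", "g"), ("f", "g")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
nodes = sorted(adj)

def distances(s):
    """Breadth-first-search distances from s to every vertex."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def shortest_paths(s, t):
    """Enumerate all shortest s-t paths (feasible here: the graph is tiny)."""
    ds, dt = distances(s), distances(t)
    paths = [[s]]
    for _ in range(ds[t]):
        paths = [p + [w] for p in paths for w in adj[p[-1]]
                 if ds[w] == ds[p[-1]] + 1 and ds[w] + dt[w] == ds[t]]
    return paths

degree = {v: len(adj[v]) for v in nodes}
closeness = {v: 1 / sum(distances(v).values()) for v in nodes}  # 1 / total distance

# Betweenness: for each pair of other vertices, the fraction of shortest paths
# that pass through the vertex.
betweenness = dict.fromkeys(nodes, 0.0)
for s, t in combinations(nodes, 2):
    paths = shortest_paths(s, t)
    for v in nodes:
        if v not in (s, t):
            betweenness[v] += sum(v in p for p in paths) / len(paths)

# PageRank by power iteration with the usual damping factor 0.85.
pr = dict.fromkeys(nodes, 1 / len(nodes))
for _ in range(100):
    pr = {v: 0.15 / len(nodes) + 0.85 * sum(pr[u] / degree[u] for u in adj[v])
          for v in nodes}
```

In this particular graph all shortest paths are unique, so the betweenness values come out as integers.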

5.8 Tournament ranking system


Keywords: directed graph, Hamilton path, spectral ranking
A judge group has received a list of n participants (players) p1 , p2 , · · · , pn in a
tournament, where one participant competes against another at a time, together
with the results of the matches (win, lose or draw). Assume that each player
competes in exactly n − 1 matches, one against each of the other n − 1 players.
The judges need to suggest a ranking of all the participants, from the best to the
least good one. The results can easily be visualised with a directed graph with n
vertices p1 , p2 , · · · , pn . If player pi wins against player pj , one draws a directed
edge from vertex pi to vertex pj , and no directed edge from pj to pi . If the match concludes
in a draw, two directed edges between pi and pj are drawn: one from pi to pj and
one from pj to pi . This way, if a directed Hamilton path, which visits each
vertex exactly once, can be found, then one can follow this path to
rank the players, where the best player corresponds to the first vertex in the
directed Hamilton path.
The hard part is that a tournament graph may not have a unique directed
Hamilton path, meaning that this approach can give contradictory rankings
where a player pi is considered better than pj in one Hamilton path but worse
than pj in another Hamilton path [73]. A more reasonable ranking procedure is
spectral ranking using eigenvalues and eigenvectors in linear algebra applied to
the adjacency matrix of the tournament graph [74].
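A minimal sketch of spectral ranking on a hypothetical four-player tournament (players and results invented for illustration, with no draws): power iteration drives the score vector towards the dominant eigenvector of the adjacency matrix, so a win against a strong player is worth more than a win against a weak one.

```python
# Hypothetical tournament; a draw would add an entry in both players' win sets.
players = ["A", "B", "C", "D"]
wins = {"A": {"B", "D"}, "B": {"C", "D"}, "C": {"A"}, "D": {"C"}}

# Power iteration towards the dominant eigenvector of the adjacency matrix:
# a player's strength is the sum of the strengths of the players beaten.
v = {p: 1.0 for p in players}
for _ in range(200):
    v = {p: sum(v[q] for q in wins[p]) for p in players}
    top = max(v.values())
    v = {p: x / top for p, x in v.items()}  # normalise to avoid overflow

ranking = sorted(players, key=lambda p: v[p], reverse=True)
```

Note that C and D both have one win, yet C ranks above D because C beat the top-ranked player: exactly the distinction that plain win-counting misses.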
Adaptation: A similar directed graph model can work as a general-purpose
decision support tool to rank a number of choices from the most to the least
favourable [75, 76]. In data mining, spectral ranking can also be useful for
anomaly detection. The main advantage of an anomaly ranking compared to a
binary classification is that a measure of relative abnormality is also provided,
which forms a basis for cost–benefit analysis [77].

5.9 Image segmentation


Keywords: graph partitioning, minimum cut
Image segmentation is a part of digital image processing where an image is
divided into segments, or in other words pixel clusters, which often depict
essentially different entities in the image. When contemplating a classical fruity
still-life painting, our eyes can easily segment the whole painting of hundreds of
shades (or thousands of pixels) into three main regions: the fruit bowl, the
background and the tablecloth. In other situations, a viewer may only want to
study a specific part of the whole image for professional identification purposes:
sharp objects (airport security control), human faces (criminal recognition),
breast cancer (mammography), vehicle registration plates (traffic control), letters
(optical character recognition) or rare birds (birdwatching).
An image can be viewed as a graph where each pixel is a vertex with edges to its
nearest neighbouring pixels: up to eight (four in cardinal directions and four in
ordinal directions), fewer for pixels on the border, down to three at a corner of
the image. Each edge is associated with a weight reflecting how similar two
neighbouring pixels are, with respect to some chosen measure, for example
colour similarity. A relatively low weight means relatively high dissimilarity.
A graph-based approach for image segmentation can make use of so-called graph
cuts [78]. Given a graph G = (V, E), a cut C = (S1 , S2 ) is a partition of V into
two disjoint subsets S1 and S2 , i.e. S1 ∪ S2 = V and S1 ∩ S2 = ∅. In a weighted
graph, a minimum cut or simply min-cut is defined as a cut such that the sum
of the weights of the edges between the subsets S1 and S2 is minimised. Such a
min-cut can be used to segment an image into two essentially different regions,
and the process can be repeated to obtain more segments if desired. Although
the idea is simple, constrained variants of the problem (for example cuts that
must produce balanced segments) can turn out to be computationally
intractable, meaning that they can be solved in theory but would take too much
time (or other resources) in practice to be cost-effective.
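To make the definition concrete, the sketch below finds a min-cut by brute-force enumeration over all bipartitions of a tiny hypothetical "image" graph: two tightly connected pixel clusters joined by two low-similarity edges. The weights and vertex names are invented for illustration; brute force is only feasible for toy instances, and practical implementations rely on max-flow or related algorithms instead.

```python
from itertools import combinations

# Hypothetical weighted graph: clusters {a, b, c} and {d, e, f} with high
# internal similarity, joined by two weak (low-weight) edges.
weights = {("a", "b"): 5, ("b", "c"): 5, ("a", "c"): 4,
           ("d", "e"): 5, ("e", "f"): 5, ("d", "f"): 4,
           ("c", "d"): 1, ("b", "e"): 1}
nodes = ["a", "b", "c", "d", "e", "f"]

def cut_weight(side):
    """Total weight of the edges crossing the partition (side, rest)."""
    return sum(w for (u, v), w in weights.items() if (u in side) != (v in side))

best_cut, best_side = None, None
# Fix "a" on one side so that each bipartition is enumerated exactly once.
for r in range(len(nodes)):
    for extra in combinations(nodes[1:], r):
        side = {"a", *extra}
        if len(side) == len(nodes):  # skip the trivial cut with an empty side
            continue
        w = cut_weight(side)
        if best_cut is None or w < best_cut:
            best_cut, best_side = w, side
```

The minimum cut found here severs only the two weak edges, separating the graph exactly along the cluster boundary, which is the behaviour exploited in graph-based segmentation.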
Adaptation: Graph segmentation using minimum cuts can be useful in electric
power networks. A power network experiencing high loads or various types of
interruptions can suffer serious cascading failures. A preventive solution is to
partition the network into smaller networks called islands, where the imbalance
between generation and load in each island is to be minimised [79, 80]. On the
other hand, a security index problem about false data injection attacks can also
be formulated as a graph partitioning problem, providing an efficient tool for
security analysis of power transmission networks [81].
From the perspective of exploratory data mining, various kinds of data clustering
and data segmentation are desirable to more easily detect similar patterns. This study
area is highly relevant for a company working with large-scale data sets such as
Fortum. Besides minimum cuts, many graph clustering techniques have been
proposed, such as evolving sets, heat kernel and random walks [82].

Summary

Graph theory is a part of mathematical combinatorics with many interesting
concepts and applications highly connected to reality in general and energy
technology in particular. With proper investment in exploring this discipline, a
utility company can likely gain business value from profound insights on their
operations. Whether it concerns energy production, database management or
administrative decision making, graph theory can provide novel perspectives to
model and assess possibilities. It is important to understand that graphs in
graph theory are not merely delightful visual representations: a graph is a
mathematical combinatorial structure for describing pairwise relations and for
solving engineering problems effectively with dedicated methods.

Part II

Selection of use-case

CHAPTER 6
Idea generation

6.1 Inventory
While the search for use-cases from academia resulted in 15,000 papers, the
search for industrial use-cases was less prolific: about 100 use-cases were gathered. An
inventory of use-cases was built after the filtering processes, with information on
their authors, energy-cluster and graph-cluster belongings. The inventory was
continuously updated along the research, as more keywords were found useful
and relevant. A curated set of use-cases can be found in Appendix A.2 and the
full inventory can be found online¹.
As mentioned earlier, several types of academic use-cases were not saved in the
inventory, for the reasons listed below.
From academia:
◦ Solar power: too few use-cases found.
◦ Waste management: too distant from the classic digitalisation challenges a
classic utility company intends to solve.
◦ District heating: the district heating operations in Sweden are now part of
Stockholm Exergi.
¹ Inventory of use cases of graph theory in the energy sector

◦ Transmission and distribution grid: Fortum is no longer active in this
segment.
◦ Smart homes: too few use-cases found.
In parallel, several types of industrial use-cases were not selected for the final
inventory, mainly because of their apparent low relevance to the main
challenges specific to an electric utility.
From industry:
◦ Fraud and financial services.
◦ Real-time recommendations.
◦ Social graphs.
◦ Supply chain analysis.

6.1.1 Clustering
The table below summarises the proposed clusters and the number of use-cases
in each.
Use-case cluster Number of past use-cases
Hydropower: optimisation 13
Hydropower: operation and maintenance 12
Nuclear power: operation and maintenance 47
Wind power: design, operation and maintenance 65
Electric vehicle applications 42
Energy trading 110
Master data management 16
Energy storage 21
Knowledge graphs 20

Table 6.1: Number of past use-cases per cluster

6.1.2 Heat map


The heat map below illustrates the number of past use-cases per cluster and per
graph-related concept. This heat map does not comprise knowledge graphs and
master data management, as they could not be tied to specific energy sectors. It
provides insights not only on which clusters have most use-cases, but also on
which graph-related concepts (graph-type, problem-type or algorithm) are most
recurrent in literature.

Figure 6.1: Heat map of past use-cases per cluster and per graph-related concept.

CHAPTER 7
Assessment of use-case clusters

7.1 Preliminaries
7.1.1 Presentation of the use-case clusters
In this section, a face-value assessment will be made of each of the selected
clusters. The structure of the assessment is as follows:
◦ Background: the main challenges and objectives in the clusters are
covered.
◦ Case study: a specific case study from the inventory was selected and
explained in further detail to give practitioners a closer view of the
methods applied and their results.
◦ Expert opinion: for most clusters, feedback from industry experts on the
challenges and opportunities in this cluster, notably with graphs, was
gathered.
◦ Assessment: the cluster is graded on each of the dimensions detailed in
the methodology. The overall grade and the grades in each dimension are
given in brackets next to the dimension.
◦ Further reading: a few more articles are recommended for the reader to
have a wider perspective of the potential other solutions of graphs in the
cluster.


7.1.2 Scoring system


Graph-score
In each of the first four dimensions, the score is out of a total of 8 points.
Strategy-score
The score on each strategy point is weighted by the importance of the strategy
point in question in Fortum’s overall strategy, from 1 to 4, according to the
following:
◦ Strategy point 1: 4.
◦ Strategy point 2: 3.
◦ Strategy point 3: 2.
◦ Strategy point 4: 1.
As such, the total score on the strategy alignment is out of a total of 20 points.
Final score
The final score of each use-case cluster is the sum of the graph-score and the
strategy-score. For legibility purposes, each score is out of 10 points. Thus, the
final score is out of a total of 20 points.
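As a worked example of the arithmetic, the grades of the hydropower optimisation cluster in Section 7.2 (3, 4, 3 and 5 points over the four dimensions) rescale to its stated graph-score of 4.7 out of 10:

```python
# Grades taken from the assessment in Section 7.2; the rescaling to 10 points
# follows the scoring system described above (four dimensions, each out of 8).
dimension_points = [3, 4, 3, 5]
graph_score = sum(dimension_points) / (4 * 8) * 10  # rescaled to be out of 10
```

The same arithmetic reproduces the stated grades of the other clusters, e.g. 18/32 rescales to 5.6 for hydropower operation and maintenance in Section 7.3.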

7.1.3 Assessment results per criterion


Table 7.1 summarises the results of the scoring of the use-case clusters in each
dimension and along the strategy points of Fortum.


Table 7.1: Scoring of use-case clusters

Legend:
C1: Hydropower: optimisation
C2: Hydropower: operation and maintenance
C3: Nuclear power: operation and maintenance
C4: Wind power: design, operation and maintenance
C5: Electric vehicle applications
C6: Energy trading
C7: Energy storage solutions
C8: Master Data Management
C9: Knowledge graphs

The strategic alignments can be found in more detail in Section 3.3.3.


7.2 Optimisation of hydropower operations


Background
Due to their participation both in the electrical system and the water system,
hydropower plants are subject to a wide set of constraints, which can be
classified into environmental, operational and regulatory [84]. Conventional
hydropower stations and, to a lesser extent, run-of-river hydropower stations
(without a reservoir) directly impact the water systems they are connected to.
They can affect, among other things, the quality of the water, its temperature
and its flow rate, which can in turn impact both the human populations
downstream and the aquatic species around the stations. As electricity
generators, hydropower plants operate similarly to traditional power plants with
respect to the injection of power into the grid: minimum and maximum
generation levels, ramping, provision of reserves and optimal ranges of turbine
operations, among others.
Finally, as water is a common good, hydroelectric facilities can be subject to
regulations to satisfy certain social and environmental objectives, which can
sometimes directly guide the allowed output of a hydropower facility.
As a result, dam and reservoir operation rules are considered one of the most
complicated optimisation engineering problems [83]. In fact, most models do not
comprehensively represent the constraints on hydropower operations for various
reasons, including a lack of computational resources, the modelling time required
with increasingly complex models, or a lack of data to properly account for
hydrological considerations.
In addition, the hydrological conditions are expected to be altered by climate
change, strongly impacting the availability of water, one of the
main constraints in hydropower operations. Correctly assessing reservoir inflows
accounting for the dynamicity and stochasticity of the hydrological conditions is
an increasingly prioritised area for improving hydropower operations [84].
Case study Finding multiple optimal solutions to optimal load distribution
problem in hydropower plant [85]
The optimal load distribution of hydropower generation units represents a
central aspect of optimising hydropower plants. For a given unit commitment,
i.e. a pre-determined power output based on the scheduling of the plant, plant
operators dispatch the water through the plants’ generating units seeking the
turbines’ respective optimum energy conversion rates. The objective is to
minimise the water discharge while respecting the desired generation. Traditional
methods used by plant operators to solve the oftentimes non-convex piecewise
linear I/O (input / output) functions of the generation units, such as mixed
integer linear programming (MILP) or dynamic programming (DP), determine a
single optimal or near-optimal solution.
In this case study, the authors argue that this problem is of a multiple optimal
solutions (MOS) nature. They transform the optimal load distribution (OLD)
problem into a shortest-path problem by means of discretisation. All the
optimal and near-optimal solutions
are sought with the near-shortest path algorithm, keeping track of the optimal
sub-paths. The graph is constructed with nodes representing the accumulative
loads and the edges the load distribution and the costs, corresponding to the
water discharge. They found promising results on the study of the Geheyan
power plant (China, 1.2 GW) providing decision makers with more alternatives
for running their units. This can be beneficial for reducing occurrences of
running units through vibration areas.
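The discretised shortest-path formulation can be sketched as a dynamic program over a layered graph: each layer corresponds to one generating unit, each node to an accumulated load, and each edge cost to the water discharge of one feasible output level. The unit discharge curves and the target output below are hypothetical, not the Geheyan data from the case study.

```python
# discharge[i][p]: hypothetical water discharge of unit i at output level p (MW).
discharge = [
    {0: 0.0, 10: 12.0, 20: 21.0, 30: 34.0},
    {0: 0.0, 10: 11.0, 20: 23.0, 30: 33.0},
    {0: 0.0, 10: 13.0, 20: 20.0, 30: 31.0},
]
target = 50  # required total plant output in MW

# Layered shortest path: a node is the accumulated load after assigning output
# levels to the units processed so far; we keep only the cheapest path per node.
best = {0: (0.0, [])}  # accumulated load -> (lowest total discharge, per-unit loads)
for curve in discharge:
    layer = {}
    for acc, (cost, loads) in best.items():
        for p, q in curve.items():
            nacc, ncost = acc + p, cost + q
            if nacc <= target and (nacc not in layer or ncost < layer[nacc][0]):
                layer[nacc] = (ncost, loads + [p])
    best = layer

cost, loads = best[target]
```

Keeping all optimal and near-optimal sub-paths per node, instead of only the cheapest, is what turns this dynamic program into the near-shortest-path enumeration used in the case study.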
Expert opinion
The existing optimisation models used by hydropower operators constitute a relatively
mature technology with established processes and tools. Their operations could
however benefit from the simplification of the optimisation models without losing
detail quality, given that computation time is a primary constraint. Their main
limitation is to be able to use more dynamic data, better reflecting the real
operating conditions, without excessively impacting the computation time. In
particular, more precise hydrological data would be beneficial, as well as a better
understanding of the relationship between water flow and power output. Moreover,
the hydrological conditions are expected to be impacted by climate change,
adding an incentive to use more dynamic data to input in their models [86].
Assessment (4.7/10)
Graph applicability (3/8)
Hydropower operations can be modelled through graphs, both on a macro-level
with generation units as nodes and the water flow as edges, but even on more
micro-levels by shifting the focus to the turbines within the generation unit, or
by modelling the decision process. The availability of use cases is not large, but
they confirm a certain graph applicability, with identified algorithms (e.g.
shortest path, ant colony optimisation).
Technical feasibility (4/8)
This domain of application is predominantly concerned with setting up the
appropriate constraints to improve the operations. Therefore, a homogeneous
set of disciplines is particularly relevant, namely within mathematical
optimisation. The model setup is considered a challenge, given the large number
of constraints to account for. In addition, the computation constraint is generally
a severe limitation in hydropower operation optimisation models.
Economic potential (3/8)
To give an order of magnitude of the potential economic gains of optimisation
improvements of hydropower, the total hydropower generation in Europe in 2019
was 643 TWh [87]. With a 1 % increase in power output at an average sales price
of 35 €/MWh [88], the economic gains could amount to roughly €225 million
annually (643 TWh × 1 % ≈ 6.43 million MWh). If successful, a graph
application could be replicated to a large number of hydropower plants. This has
to be balanced by the fact that these optimisation problems
lie at the core of a utility’s operations, increasing the competition from
alternative tools, such as the Water Use Optimization Toolset (WUOT), Powel,
GAMS, Hydrogrid, etc. The economic benefits from a graph-based solution are
only created if the new model outperforms the existing, more established models.
Workability (5/8)
Despite the high competence alignment and the relevance of a graph approach
in this cluster, obtaining data is a potential issue. Moreover, the mature processes
and the novelty of modelling the system through graphs can represent a
barrier to integrating the technology and maintaining it in production by the
numerous plant operators [86].

Figure 7.1: Hydropower optimisation


Further reading
Arc-based constrained ant colony optimisation algorithms for the optimal solution
of hydropower reservoir operation problems [89]
Maximizing power production in path and tree riverine networks [90]

7.3 Hydropower operation and maintenance


Background
Hydropower plants are complex systems with a large number of different physical
components (generators, turbines, gates, etc.), whose health status it is essential
to monitor over time [91]. Moreover, hydropower is characterised by large
and ageing fleets, requiring significant reinvestments in equipment and
maintenance. Condition monitoring focuses on early detection of failures, faults,
wear and tear of machinery, helping asset managers avoid potentially large
damages and scheduling the maintenance. A timely detection of anomalies can
be valuable for decision-makers to act proactively to the failure of components,
with positive economic consequences.
Graph-based solutions have been proposed for assessing the health and potential
failures of several subsystems of hydropower stations, such as the
governor-turbine-hydraulic systems [92], dam safety [93] and at an asset-level [94].
Case study Unsupervised anomaly detection based on minimum spanning tree
approximated distance measures and its application to hydropower turbines [95].
Turbine systems are central components in hydropower plants and have large
sub-components affecting their operation (bearing systems, a generator, filters).
The data collected is of a varied nature (temperature of the bearings and of the
generator coils, vibrations, etc.).
One fundamental issue in anomaly detection is proving the distinctness of
anomalous observations relative to the normal observations in the absence of a
learning rule. The most commonly used metric to measure the dissimilarity of
observations is the Euclidean distance. Data considered distant from the
majority of data points is considered an anomaly. However, the space embedding
of structures forming a nonlinear manifold are not a good fit with Euclidean
distances [96].
The authors propose to measure the geodesic distance rather than the Euclidean
distance, as it allows measuring distances on curved surfaces. Minimum
Spanning Trees (MSTs) are capable of approximating geodesic distances in high
dimensions, embedding complex structures. The authors map the data
observations as a network of nodes, with edges representing the distance between
them, capturing their relative connectedness. The most disconnected clusters are
considered anomalies. With the same methodology, they can detect pointwise
anomalies within an anomalous cluster, a complexity often encountered in
anomaly detection, by using a Local Minimum Spanning Tree (LoMST). They
compared their method with a wide variety of methods on 20 data sets and found
a superior performance. The knowledge generated from the anomaly detection
analyses helps service engineers continuously monitor the turbine operation and
potentially diagnose and predict the malfunctions of turbines in time.
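A toy sketch of the underlying MST idea (a simplification, not the LoMST method of [95]): points far from the bulk of the data attach to the minimum spanning tree through unusually long edges, and the attachment length can serve as an anomaly score. The two-dimensional data points below are invented for illustration.

```python
import math

# Hypothetical observations: a tight cluster plus one distant point.
points = {"p1": (0.0, 0.0), "p2": (0.1, 0.2), "p3": (0.2, 0.1),
          "p4": (0.15, 0.15), "outlier": (3.0, 3.0)}

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Prim's algorithm on the complete graph of pairwise distances.
names = list(points)
in_tree, mst_edge = {names[0]}, {}
while len(in_tree) < len(names):
    # Cheapest edge from the tree to a vertex outside it.
    u, v = min(((u, v) for u in in_tree for v in names if v not in in_tree),
               key=lambda e: dist(points[e[0]], points[e[1]]))
    in_tree.add(v)
    mst_edge[v] = dist(points[u], points[v])  # cost of attaching v to the tree

# The point with the longest attachment edge is the most anomalous.
anomaly = max(mst_edge, key=mst_edge.get)
```

In real turbine data the observations are high-dimensional sensor vectors rather than 2-D points, but the mechanism is the same: the MST approximates geodesic distances, and disconnected clusters or points stand out through long tree edges.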
Expert opinion
Operation and maintenance of hydropower stations is a core activity at Fortum,
which for example seeks to reduce operational costs. Criticality analysis, i.e. the
identification of components with a high risk of failure, which is key to scheduling
maintenance tasks, is of interest. In this context, an automated anomaly
detection could be valuable. Although there certainly is an opportunity in a
more efficient and automated maintenance scheduling, there are organisational
considerations beyond the scope of a graph-based application, such as risk
aversion towards fully automated systems and training of service personnel [97].
Several other challenges in this area were raised, such as:
◦ Detecting external risks, such as the formation of ice, is an improvement
area. It depends on temperatures of water and air, of the flow and the form
of the river [86].
◦ An expressed need for understanding the actual cost of the turbine wear,
for which models are currently lacking [98].
◦ Hydropower documentation is distributed; centralising it, together with a
higher level of hydropower station connectivity, would be an important
enabler for data analytics implementations [99].
Assessment (5.6/10)
Graph applicability (6/8)
The connectivity between the large number of components in a hydropower
station and the large sets of unlabelled time-series data are positive indicators for
implementing graphs. Moreover, most applications of graphs in the hydropower
sector were in the condition monitoring area, benefiting from well-established
algorithms: MSTs [95], state-space equations [92].
Technical feasibility (3/8)
Setting up a model requires the extraction and streaming of data points from
many different components and their integration into a common, streaming
database, making the construction of the architecture a barrier [100]. Another barrier is the
computational constraint for the case of online algorithms. The analysis requires
tight coordination between data scientists, who establish the infrastructure,
component experts and maintenance schedulers. Setting the setup time aside,
however, a graph-based solution could be applicable to more hydropower plants
and, more generally, to maintenance activities overall.
Economic potential (5/8)
The large, ageing fleet of hydropower plants is an indicator of large potential
economic gains from early failure detection. However, there is a wide array of
available tools for O & M, as it is a central aspect of a utility’s operations. The
economic and environmental risks associated with equipment failure such as
downtime or oil leakage can be large and the model would need considerable
training to become robust and safe, increasing the costs of investment in this
technology [101].
Workability (4/8)
The workability is the biggest barrier in rolling out a graph-based solution for
maintenance. The low level of connectivity of equipment in hydropower stations
negatively impacts the data alignment and the potential for a fast roll-out
scheme for more stations, despite the high relevancy of the issue for asset
managers. In addition, the expertise of monitoring equipment pertains more to
the sphere of the O & M workforce, making this expertise difficult to reach.
Finally, there is a reported problem with data availability [86]. Such data is
gathered through a variety of meters and sensors, including level at the
trash-racks, water temperature, pressure at the spiral casing, guide vanes
percentage, active and apparent power output [83].
Further reading
A hydrogenerator model-based failure detection framework to support asset
management [94]
Stability analysis of governor-turbine-hydraulic system by state space method and
graph theory [92]


Figure 7.2: Assessment of Hydropower: Operation and maintenance

7.4 Operation and maintenance for nuclear power

Background
Safety is of utmost importance in the nuclear industry. Increasing the
performance of existing Nuclear Power Plants (NPPs) and extending the life of
ageing NPPs are also of great interest for the nuclear power industry [102]. NPPs
are safety-critical systems, in that faults in an NPP system may potentially
compromise plant safety. It is necessary that the conditions of an NPP system
are correctly monitored and maintained and that problems are correctly
diagnosed at the earliest possible stage so that corrective actions can be taken.
In addition to preventing equipment failure and thus a plant interruption, Ma et
al. suggest that Fault Detection and Diagnostics (FDD) can optimise condition
monitoring and maintenance by reducing the manual and oftentimes unnecessary
safety-induced calibration of process instruments [102]. Hines et al. have shown
that in fact, less than 5 % of the calibrations actually require any correction
[103]. Finally, FDD can lead to higher plant efficiency, by either avoiding
performance degradation of some equipment or reducing uncertainties in the plant
surveillance models. Due to safety measures, NPPs are operated at conservative
levels, and performant FDD could help improve surveillance results and relax
some safety-caused constraints.
Several previous studies have applied graphs for fault detection in NPPs.
Dabrowski et al. have used graphs to improve the Discrete Time Markov Chains
(DTMCs), which are useful in analysing scenarios in which a system’s operations
go from normal to failure [104]. Wu et al. use Signed Directed Graphs (SDG) to
reveal a fault propagation path [105]. SDGs are commonly used for fault
detection, notably in the process industry, as they show the complex relationships
between parameters in a model-free, flexible manner. Others have used graphs to
model Pressurised Water Reactors (PWRs), a very common type of NPP
[106, 107], or for plant piping design [108].
Case study Fault diagnosis and severity estimation in nuclear power plants [105]
SDG models require neither precise mathematical descriptions nor complete
operational data, and can be constructed with only partial information and from
the experience of operators. According to the authors of this study,
SDGs reveal the latent dangers and the propagation rules in a simple and
effective way, making it particularly suitable for fault diagnosis in nuclear power
plants.
The authors constructed an SDG model with nodes representing parameters of
measure or fault root cause (e.g. pressure points, temperatures, flows) and
labelled the directed edges “+” for a positive causal impact between one node
and another and “−” in case of a negative impact. They converted this graph
into a fault diagnosis decision procedure with a set of rules to more easily detect
the root cause of recurring faults.
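The rule-based reasoning on an SDG can be sketched as sign propagation along the labelled edges. The variables, edges and observed deviations below are hypothetical and not taken from the NPP model of [105].

```python
# Edge (u, v, s): a deviation of u propagates to v with sign s
# (+1: same direction, -1: opposite direction). All names are hypothetical.
edges = [("coolant_flow", "core_temp", -1),      # lower flow -> higher temperature
         ("core_temp", "coolant_pressure", +1),  # higher temperature -> higher pressure
         ("leak", "coolant_flow", -1)]           # a leak reduces the flow

# Observed deviations: +1 above normal, -1 below normal.
observed = {"core_temp": +1, "coolant_pressure": +1, "coolant_flow": -1}

def propagate(root, sign):
    """Predict the deviation of every reachable variable if root deviates with sign."""
    state = {root: sign}
    changed = True
    while changed:
        changed = False
        for u, v, s in edges:
            if u in state and v not in state:
                state[v] = state[u] * s
                changed = True
    return state

# A candidate root cause is consistent if it explains every observed deviation.
candidates = [root for root in ("leak", "coolant_flow", "core_temp")
              if all(propagate(root, +1).get(var) == dev
                     for var, dev in observed.items())]
```

Here only the leak hypothesis reproduces all three observed deviations, so it is retained as the root cause; this is the kind of rule that the converted decision procedure encodes.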
This SDG is only a sub-component in their fault diagnosis tool, as it is preceded
by a feature extraction method for faster computation and followed by a neural
network assessing the severity of a fault (e.g. the size of a break) with Back
Propagation (BP), an algorithm aiming at fine-tuning the weights of a neural net
based on the error rates [109]. The SDG in this case is confined to identifying
the fault type. They verified their model on sub-systems such as loss of coolant
accidents and the steam generator tube rupture, successfully detecting all the
considered faults and determining the fault propagation chains.
Assessment (5/10)
Graph applicability (7/8)
As for the case of operation and maintenance for hydropower stations,
graph-based solutions can be beneficial to nuclear power plant operators.
Although the two systems express some similarities in what should be analysed
(equipment health monitoring, decision support tools), nuclear power stations
present a more elaborate set of components (coolant systems, pressurised water
reactors, pumps, etc.).
Technical feasibility (1/8)
The complexity of nuclear power stations and the expertise required from
different fields (data science, nuclear physics, fluid mechanics, etc.) are strong
constraints on the model setup. Moreover, the risks associated with a failing
graph-based model are high economically, environmentally and in terms of plant
safety.
Economic potential (4/8)
Given the large size of nuclear power in the electric system and the high level of
risk aversion, solutions increasing the efficiency of nuclear power stations and
helping to reduce the failures have a large economic potential. Although the
listed problems are an ongoing research area, nuclear power O & M is a highly
competitive field and OEMs and utilities alike use sophisticated internal or
external tools (e.g. Fortum, Alstom, GE).
Workability (4/8)
Plant and equipment safety are of utmost relevance for the asset managers.
However, as for the case with hydropower, the connectivity of the equipment is a
hindrance, as well as the complexity of integrating, deploying and maintaining
new methods in well-established operations with such a high level of protocols.

Figure 7.3: Assessment of Nuclear power operation and maintenance

Further reading
Unsupervised clustering of vibration signals for identifying anomalous conditions
in a nuclear turbine [110]
An adaptive decision method using structure feature analysis on dynamic fault
propagation model [111]

7.5 Design and maintenance for wind power


Background
A central aspect of wind power planning is Wind Farm Layout Optimisation
(WFLO), also called the micro-siting problem, which is the selection of locations
of turbines in a wind farm. Placing the turbines is optimised based on factors
such as the wind patterns, soil conditions, aviation restrictions, land agreements,
constructability, topology [112]. This makes the optimisation problem
site-dependent and a generic solution difficult.
In addition, when a turbine is activated by the wind, a deficit of velocity is
created downstream, called the wake effect, leading to a reduced energy
production [113]. Although there have been multiple approaches to the WFLO
such as artificial intelligence [112] and computational fluid dynamics [114],
several studies have approached it through graph theory. These include
optimising the location of the electrical collector system [115], the cable layout
[116] or the connection topology of an offshore wind farm network [117].
As for the clusters of hydropower and nuclear power, the operation and
maintenance of wind power can also be studied by graphs. Studies have used
fuzzy digraph models [118], probabilistic signed directed graphs for wind turbines
[119], bond graphs [120] or feature selection [121].
Case study Cuckoo search for wind farm optimisation with auxiliary
infrastructure [122]
The authors of this study seek to optimise the auxiliary infrastructure when
designing a wind farm. They considered the following constraints: the minimum
distance between turbines for safe operation, forbidden zones for installation and
the existing infrastructure. Additionally, the optimisation includes an algorithm
to find the least expensive layout of the wind farm roads and the electrical
collector system minimising cable length, for which they apply Dijkstra’s shortest
path and Prim’s minimum spanning tree algorithms.
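A minimal sketch of the spanning-tree part of this step is shown below; the turbine coordinates and the Euclidean cable-cost model are illustrative assumptions, not data from the study (the road layout would analogously use Dijkstra's algorithm). Prim's algorithm grows the tree from one node, always adding the cheapest cable to an unconnected turbine:

```python
import heapq
import math

def prim_mst(points):
    """Minimum spanning tree over a fully connected Euclidean graph
    (Prim's algorithm). Returns the tree edges and total cable length."""
    n = len(points)
    dist = lambda a, b: math.hypot(points[a][0] - points[b][0],
                                   points[a][1] - points[b][1])
    visited = {0}
    heap = [(dist(0, j), 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    edges, total = [], 0.0
    while len(visited) < n:
        d, i, j = heapq.heappop(heap)
        if j in visited:
            continue  # skip edges that would close a cycle
        visited.add(j)
        edges.append((i, j))
        total += d
        for k in range(n):
            if k not in visited:
                heapq.heappush(heap, (dist(j, k), j, k))
    return edges, total

# Hypothetical turbine positions (km); edge weights model cable length.
turbines = [(0, 0), (1, 0), (0, 1), (2, 2)]
edges, cable_length = prim_mst(turbines)
```

On these four hypothetical positions, the tree connects the turbines with three cables of total length 2 + √5 ≈ 4.24 km.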
The cuckoo search (CS) algorithm is applied to seek the most optimal layout.
Inspired by the Lévy flight (a random-walk process) search pattern of cuckoo
species, CS is a relatively new optimisation algorithm that has successfully been
applied to various problems, such as scheduling and the travelling salesman
problem.
Their test results indicate that the infrastructure cost has a significant effect on
the optimum wind farm solution. Compared with the genetic algorithm (GA), a
commonly used meta-heuristic algorithm in wind farm optimisation, CS was
slower, but found a better solution.
Expert opinion
As WFLO is a central aspect of wind power projects, there are already a number
of well-performing tools developed by OEMs, which also include load calculation
(i.e. the maximum load supported by the turbines) and life-time assessment.
However, due to the particularly diverse constraints and bureaucracy of wind
power projects, a certain level of ad hoc, manual processing remains, making a
fully automated tool less powerful.
A widely researched topic in wind power nowadays is estimating the remaining
life-time of the turbines and how to operate them optimally given this
operational cost. Although more automated scheduling of maintenance is indeed
an interesting topic, it is subject to internal organisational capacities and should
be in accordance with the optimal timing with regards to low (or even negative)
prices.
Finally, even if wind power projects are partnerships and much of the
computation work is made by sub-contractors, there is still an advantage in
analysing this system in-house as a complement for better negotiations [123].
Assessment (4.1/10)
Graph applicability (7/8)
As shown by the availability of the use-cases, graph theory is indeed applicable
to the problems of designing and monitoring wind power plants, on different
scales: farm-level, auxiliary equipment-level and turbine-level. In all these system
levels, significant information lies in the relationships between components, and
graphs can support cost minimisation, clustering and fault-propagation analysis.
This leads to a good graph-applicability score.
Technical feasibility (1/8)
The setup of the model is considered complex, given the wide set of constraints
of different nature. Coordination is needed between finance, project development
and wind power, as well as with the WFLO partners, and between mechanical
engineering and data science for the equipment monitoring. The
granularity of the time-series is also currently a barrier for proper life-time
estimation of the turbines.
Economic potential (4/8)
The global wind power market size is approximately US$ 70 to US$ 100 billion
[124, 125] and is expected to grow at a CAGR of 7.20 % until 2023. The turbine
operations and maintenance market size was valued at US$ 12 billion in 2018,
with an expected CAGR of 8.54 % until 2025, amounting to US$ 21 billion
[126]. However, due to the high competitiveness of existing software solutions of
both WFLO and condition monitoring as well as the relatively low
generalisability of WFLO due to the many local constraints, the economic
potential was considered low.
Workability (1/8)
Potential graph-based solutions have low relevance to the current workflow as
Fortum is working with sub-contractors for wind power projects. Similarly, the
integrability and maintainability of the tools are considered too burdensome and
resource-costly for the workforce.

Figure 7.4: Assessment of Wind power operation and maintenance

Further reading
Identification of critical components of wind turbines using FTA over the time
[127]


Criticality analysis of wind turbine energy system using fuzzy digraph models and
matrix method [118]

7.6 Electric vehicle applications


Background
The electric vehicle (EV) industry is conditioned by the vehicle's actual
technology advancements, particularly the range. Range anxiety, i.e. the fear of
insufficient range to reach one's destination, is a well-documented phenomenon
and impacts all aspects of the EV industry's development: the vehicle side (e.g.
battery investments, vehicle structure), the service side (e.g. routing) and the
infrastructure side (placement of charging stations) [128].
There is a lot of ongoing research in both infrastructure placement optimisation
and routing problems using graphs, although routing is certainly the most classic
case of graph applications (Google Maps uses A*). Routing optimisations are
complex due to the various constraints to consider, among others energy usage,
distance, driver preferences, queuing time at the station, topology, battery state
of charge, regenerative braking [129]. Campana et al. have studied a theoretical
charging station placement based on traffic density in smart cities with the help
of OpenStreetMap data [130]. To consider the impacts on the electrical
distribution network, another important constraint, as well as traffic conditions,
Phonrattanasak et al. propose an ant colony optimisation (ACO) [131].
Case study Route optimisation for an electric vehicle with priority destinations
[132]
The authors of this study address the problem of finding a route of an electric
vehicle considering cost minimisation and the driver’s priority destinations. Here,
the driver needs to visit all the customer locations and some of them, defined as
priority, have to be visited within a specified time duration of the day. During
the trip, each location has a charging station which may be utilised by the EV, if
necessary.
Thus, the objective of this problem is to minimise the total cost incurred by
recharging the battery. They propose a heuristic search algorithm based on
breadth-first search to solve the aforementioned problem and demonstrate its
performance on various test cases.
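The breadth-first idea can be sketched as follows. The cost matrix, the rule of expanding priority destinations first (a crude stand-in for the paper's time-window constraint) and all numbers are illustrative assumptions, not the authors' algorithm:

```python
from collections import deque

def cheapest_tour(cost, start, priority):
    """Breadth-first search over (location, visited-set) states to find the
    cheapest order in which to visit all locations, expanding priority
    locations before the others."""
    n = len(cost)
    best = {}  # cheapest known cost per (location, visited-set) state
    queue = deque([(start, frozenset([start]), 0.0, [start])])
    best_cost, best_route = float("inf"), None
    while queue:
        loc, seen, c, route = queue.popleft()
        if c >= best.get((loc, seen), float("inf")):
            continue  # a cheaper path to this state was already found
        best[(loc, seen)] = c
        if len(seen) == n:
            if c < best_cost:
                best_cost, best_route = c, route
            continue
        pending = [j for j in range(n) if j not in seen]
        if any(j in priority for j in pending):
            pending = [j for j in pending if j in priority]
        for j in pending:
            queue.append((j, seen | {j}, c + cost[loc][j], route + [j]))
    return best_cost, best_route

# Hypothetical symmetric travel/charging costs between four locations.
C = [[0, 2, 9, 1],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [1, 4, 3, 0]]
cost, route = cheapest_tour(C, start=0, priority={2})
```

With location 2 marked as priority, the search is forced through it first and then completes the tour at minimal cost.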
Though the charging of the battery is affected by its charging characteristics, this
aspect is not taken into consideration, but could be subject to future work.
Expert opinion
Placement of charging stations
Although the charging station placement and routing problems are interesting in
theory, in practice the nature of the needs is quite different. Investment in
charging point stations is capital intensive and not particularly profitable.
Outside Norway, the EV charging recurrence is considered too low for a roll-out
of infrastructure. The most important challenge is to increase return on
investment by maximising usage of the charging stations. To obtain this, a
deeper understanding of the users' behaviours is needed to answer the
question "What makes people charge at a particular station?". For example,
there are areas where a Charging Point Operator's (CPO) coverage overlaps
the users' addresses yet utilisation is low, whereas other CPOs in areas with
lower user density experience higher utilisation rates [133, 134].
Graphs could potentially help better understand the usage rates of a charging
station by identifying critical success factors. One hypothesis which can be tested
is the "neighbourhood" factor, i.e. identifying the points of interest around the
CPO and finding a pattern. Charging stations and other landmarks could then be
used as nodes, with the edges between them representing physical distance.
Valuable attributes could be given to the charging stations, such as the usage
rate, the average cost of charging and the time of day the charges were
completed. Stations could then be clustered based on their usage rate and
neighbourhood to detect whether a pattern can be identified.
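A toy sketch of this neighbourhood hypothesis: stations and landmarks become nodes, an edge is drawn when the distance falls below a radius, and stations are grouped by their landmark neighbourhood. All coordinates, usage rates and the radius are invented for illustration:

```python
import math

# Hypothetical charging stations: (x, y, usage_rate) — illustrative values.
stations = {"S1": (0.0, 0.0, 0.9), "S2": (0.2, 0.1, 0.8),
            "S3": (5.0, 5.0, 0.2), "S4": (5.3, 4.9, 0.3)}
# Hypothetical points of interest near the stations.
landmarks = {"mall": (0.1, 0.2), "office": (5.1, 5.1)}

def neighbourhood(station, radius=1.0):
    """Edges to landmarks within `radius` model the 'neighbourhood' factor."""
    x, y, _ = stations[station]
    return {name for name, (lx, ly) in landmarks.items()
            if math.hypot(x - lx, y - ly) <= radius}

# Group stations sharing the same neighbourhood; compare average usage per group.
groups = {}
for s in stations:
    groups.setdefault(frozenset(neighbourhood(s)), []).append(s)
avg_usage = {tuple(sorted(k)): sum(stations[s][2] for s in v) / len(v)
             for k, v in groups.items()}
```

On this invented data, the stations near the "mall" form one group and those near the "office" another, each with its own average usage rate, which is the kind of pattern the hypothesis would look for.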
Routing
Likewise, charging price may not necessarily be the most important constraint in
the routing problem, as it is neither the prime concern of EV-owners, who could
otherwise charge at home, nor of business customers, who need availability.
Assessment (6.3/10)
Graph applicability (8/8)
The EV cluster has the most case studies of all the clusters. Many algorithms
have been tried and implemented, among others shortest-path search (Dijkstra,
A*), minimum spanning trees (Kruskal), breadth-first search and matching. The graph
applicability in this case study is thus maximal.
Technical feasibility (7/8)


Generally speaking, the routing problem is a classic graph problem and thus
setting up a simple model does not require expertise from other domains.
However, the problem can quickly expand to large and complex systems with
mixed data sources as we add more constraints to the optimisation to make it
realistic. The commercialisation possibility of the ongoing research is debatable,
as most studies focus on a small set of constraints. Moreover, most studies do
not take into account the actual behaviour of the users, which is still a big
question mark in this industry.
Economic potential (3/8)
The electric vehicle charger market size was valued at US$ 3.8 billion in 2019 and
is projected to reach US$ 25.5 billion by 2027, registering a CAGR of 26.8 %
from 2020 to 2027 [135]. There is significant opportunity for utilities: Boston
Consulting Group estimates that the rise of EVs could create US$ 3 billion to
US$ 10 billion of new value for the average utility [136]. This has to be balanced
against the still low maturity of EVs, the currently low charging station
utilisation rate and thus their low return on investment.
Workability (2/8)
For the moment, the problems that graphs solve are not the highest priority for
practitioners. The usage is so low that there is little need for automation of the
services, notably for the charging stations. Moreover, the data availability is low,
given the low interaction level between the user and the station. Therefore, it has
a low workability currently.

Figure 7.5: Assessment of EV Applications


Further reading
On calendar-based scheduling for user-friendly charging of plug-in electric
vehicles [137].
A location model for electric vehicle (EV) public charging stations based on
drivers’ existing activities [128].

7.7 Market intelligence for energy trading


Background
The deregulation of the electricity market and the introduction of competitive
markets have put pressure on electric utilities to improve their forecasting
capabilities in the energy commodities. Given its peculiarities, the electricity
market is particularly challenging and has been subject to much research recently
[200]. Electricity is a non-storable commodity, and power system stability requires
a constant balance between production and consumption [138]. In addition,
electricity demand depends on weather (temperature, wind speed, precipitation,
etc.) and the intensity of industrial activities (on-peak vs. off-peak hours,
weekdays vs. weekends, holidays and near-holidays, etc.). Although electricity
prices fundamentally follow the principles of supply and demand, they are also
largely affected by power grid constraints, fuel and carbon prices, renewable
energy feed-in, market power strategies of generators, among others [139].
Fuel prices (e.g. oil, coal, natural gas) are themselves largely determined by
supply and demand conditions, but exogenous factors such as politics, storage
capabilities, supply-chain and weather also come into play. Additionally, the
prices display interdependencies as they are in many cases substitutes, for
example when used as a fuel for power generation. The introduction of financial
agents, such as commodity index funds, in the natural gas market has led to its
so-called financialisation, and a significant increase in volatility has been witnessed
[140].
Given the complex fundamentals and relationships within the energy
commodities, graph theoretical analysis has been introduced to increase the
understanding of the price evolution. Lee et al. have applied minimal spanning
trees to analyse structural transformation in commodity markets [141]. Ji et al.
propose a directed acyclic graph model to identify the factors behind crude oil
price evolution from a system analysis approach [205]. Given the vast amount of
potential factors influencing prices, feature selection has attracted a lot of
attention by the research community to reduce the dimension of predictors for
more accurate and faster forecasting computation [144, 145, 146].
Case study Causal modelling and inference for electricity markets [147]
The authors of this study are interested in understanding the price dynamics
between electricity prices and major fuel sources (oil, gas and coal) in the Nordic
and German electricity markets. Using time series models combined with new
advances in causal inference, they estimate a causal model for the price
dynamics, both for contemporaneous and lagged relationships.
They build their model to compensate for a limitation of the approach of Park et
al., who used a directed acyclic graph to model instantaneous causal influences
between the energy market prices from their correlation matrix [154]. Directed
acyclic graphs, or DAGs, are used to represent causal relations (by directed
edges) between variables (nodes) (Spirtes et al., 2000). However, that approach
assumes that the variables are jointly normally distributed. Because several
DAGs then correspond to the same joint distribution, only a class of DAGs is
obtained rather than one single DAG, and within this class the edges might not
always have the same direction. As such, Ferkingstad et al. use a linear
non-Gaussian acyclic model (LiNGAM), introduced by Shimizu et al., allowing
the identification of a single DAG [155]. This also enables them to integrate both
contemporaneous and time-lagged causal relationships. The LiNGAM model is
preceded by a vector autoregression (VAR) model and a vector error correction
model (VECM).
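The VAR step can be sketched as an ordinary-least-squares regression of each series on the lagged values of all series; the residuals are what a LiNGAM analysis would then decompose into contemporaneous causal effects. The synthetic two-series data below is purely illustrative, not the authors' data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "price" series: y depends on its own lag and on x's lag.
T = 500
x = np.zeros(T)
y = np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.6 * x[t - 1] + rng.normal()

# VAR(1): regress [x_t, y_t] on [x_{t-1}, y_{t-1}] by ordinary least squares.
Y = np.column_stack([x[1:], y[1:]])        # responses, shape (T-1, 2)
X = np.column_stack([x[:-1], y[:-1]])      # lagged regressors
B, *_ = np.linalg.lstsq(X, Y, rcond=None)  # B[i, j]: effect of series i's lag on series j

# The residuals Y - X @ B are what LiNGAM would use to identify the
# contemporaneous (instantaneous) causal structure.
residuals = Y - X @ B
```

On this synthetic data, the estimated lag coefficients recover the generating values (roughly 0.5, 0.6 and 0.3) up to sampling noise.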
Among other things, they find a strong connection between gas and electricity
prices and a lagged causal link from Zeebrugge gas prices (the Belgian oil and gas
terminal) to the Nordic electricity market.
Expert opinion
Whereas natural gas is a strong price driver in continental Europe, due to the
high regulation power of gas in that market, the hydrological situation and
weather data are more prominent price determinants in the Nordics. A
fundamental analysis adding more internal market insights would be of
high value, given the large data sets made available by financial institutions [156].
Assessment (7.2/10)
Graph applicability (7/8)
Several fundamental elements within the energy markets can be represented
through graphs, among others the topology of the power grid itself [157], the price
relationships between the energy commodities [159] and smart-grid data [158].
Fewer traditional graph algorithms (e.g. shortest path, MST) were found within
this cluster. However, many of the problems that need to be solved can be
addressed with a graph approach, such as clustering of electricity loads [160] and
feature selection for electricity price forecasting [161].
Technical feasibility (5/8)
As opposed to most other applications considered in this research, market
intelligence is not a cyber-physical model, which facilitates the model setup.
There is a certain homogeneity in the expertise necessary, namely data science,
mathematics and trading, as the fields are merging. The computational
constraint of forecasting is high, since trading decisions need to be made fast.
Likewise, even though the physical risk of model failures is low, considerable
financial risks come into play.
Economic potential (5/8)
According to Zareipour et al., there are significant gains to be made by higher
forecasting accuracy: a 1 % improvement in the mean absolute percentage error
(MAPE) would result in about 0.10 % – 0.35 % cost reductions from short-term
electricity price forecasting [162]. In dollar terms, this would translate into
savings of circa US$ 1.5 million per year for a typical medium-size utility with a
5 GW peak load [163].
Workability (6/8)
Deeper market insights are highly relevant for Fortum’s trading operations. Most
of the data needed in this area are made readily available by third parties, either
in the public domain (e.g. weather, commodity prices, power grid operator data)
or through third-party vendors. The integration of a graph-based solution is
considered less of a challenge than other areas given the proximity of graph
analytics and machine learning models.


Figure 7.6: Assessment of Energy trading

Further reading
Evolution of the world crude oil market integration - A graph theory analysis
[164].
Grid topology identification using electricity prices [165].

7.8 Storage solutions for the distribution grid


Background
With the increasing number of renewable energy producers at medium and low
voltage level, power flows may become bidirectional at times when renewable
generation peaks and consumption dips, calling for active grid management and
more capital investment in new lines, transformers and power stations to ensure
system stability and reliability. By aggregating negative and positive generation
resources into the representational profile of a single power plant, the virtual
power plants (VPP) are able to offer market participation to small-scale
producers which they otherwise would not have owing to minimum size
requirements.
Besides supporting renewable integration in this manner, VPPs are also able to
support the grid in times of congestion through flexible control and management
of distributed generation and load resources [166]. As such, the optimal siting of
VPPs is closely tied to the topological structure of the grid, for which graphs
provide a useful approach [167]. In addition, the optimisation of VPP operations
implies operating a network of assets which are communicating with each other,
where graphs have also been applied. This is the subject of the presented case
study.
Case study Optimisation of virtual power plant topology with distributed
generation sources [167]
This case study proposes a hierarchical control strategy to coordinate battery
energy storage devices based on a multi-agent system. A multi-agent system can
cope with complex problems effectively in a power system and improve the
robustness, reliability and flexibility of the system [168]. Therefore, many papers
bring a multi-agent method into the controller of an energy storage system.
However, as the authors point out, most existing studies do not consider the
geographical feature. As such, they design a distributed step-by-step consensus
algorithm based on a droop controller (for grid frequency stabilisation),
considering geographical location, aiming to improve the cooperative control of
the energy storage. According to their geographical distribution, Battery
Energy Storage Systems (BESSs) can be divided into several clusters with
Kruskal's algorithm, a version of the MST algorithm.
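Clustering with Kruskal's algorithm typically means building the MST and then discarding the longest edges so that k connected components remain; the sketch below uses hypothetical BESS coordinates, and the exact clustering rule in the paper may differ:

```python
import math
from itertools import combinations

def kruskal_clusters(points, k):
    """Build an MST with Kruskal's algorithm, then keep only the shortest
    |V| - k MST edges so the remaining forest has k components."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    edges = sorted(combinations(range(len(points)), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    mst = []
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:          # add the edge only if it joins two components
            parent[ri] = rj
            mst.append((i, j))
    # Kruskal appends edges in increasing length, so truncation keeps
    # the shortest ones; re-run union-find on the reduced edge set.
    mst = mst[: len(points) - k]
    parent = list(range(len(points)))
    for i, j in mst:
        parent[find(i)] = find(j)
    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())

# Hypothetical BESS locations forming two geographic groups.
bess = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
groups = kruskal_clusters(bess, k=2)
```

On these invented coordinates, the two geographically separated groups of storage units are recovered as the two clusters.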
Graph Laplacians, or Laplacian matrices, and their spectral properties play a
vital role in the convergence analysis of consensus and
alignment algorithms. In particular, the stability properties of the
distributed consensus algorithms for networked multi-agent systems are
completely determined by the location of the Laplacian eigenvalues of the
network. Considering the location feature of each agent, an improved Laplacian
matrix with the weight corresponding to each edge is adopted. The edge set
describes the relative distance between agents in the communication topology,
implying their communication intensity.
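This can be illustrated numerically: for a connected weighted graph, the Laplacian L = D − W has smallest eigenvalue 0, and the second-smallest eigenvalue (the algebraic connectivity) bounds the convergence rate of the consensus dynamics ẋ = −Lx. The inverse-distance weighting below is an assumed form of the "communication intensity", not necessarily the paper's:

```python
import numpy as np

# Hypothetical agent coordinates; communication intensity decays with distance.
pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
n = len(pos)

# Weighted adjacency: w_ij = 1 / d_ij for i != j (one possible weighting).
W = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            W[i, j] = 1.0 / np.linalg.norm(pos[i] - pos[j])

L = np.diag(W.sum(axis=1)) - W          # weighted graph Laplacian
eig = np.sort(np.linalg.eigvalsh(L))    # eigenvalues in increasing order

# eig[0] is ~0 for a connected graph; eig[1] (the algebraic connectivity)
# governs how fast the consensus dynamics x' = -L x converge.
```

A larger second eigenvalue means faster agreement among the agents, which is why the weighting of the edges directly affects the stability analysis.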
This paper proposes a step-by-step consensus algorithm with several active
leaders, or a virtual leader. In fact, active leaders are internal agents while the
virtual leader is an external command. All active leaders can reach consensus by
exchanging information with each other and other agents can reach consensus
according to their leader. The virtual leader can force all agents to follow the
leader.
Assessment (5.6/10)
Graph applicability (5/8)
The application of graphs for energy storage solutions in the previous case
studies relies to a large extent on the topological analysis of the distribution
network. In this context, graphs are deemed fit, as topological analyses of power
grids are a classic use-case for graphs, where edges are the power lines and nodes
are power sources, load sources and interconnections between the lines. In fact,
there have been various studies modelling the power grid via graph databases,
finding promising results [169, 170].
Technical feasibility (3/8)
The cases studied were mostly based on graph theory, which simplifies the model
setup and reduces the diversity of expertise needed. As opposed to the optimal
siting of BESSs, an operational optimisation would require online models and
fast response times. Given the smaller scale of a BESS, the risk of failure is
considered low, relative to the larger-scale systems considered in this study.
Economic potential (4/8)
Adding the renewable energy technology, batteries and other components, the
virtual power plant market could grow from US$ 1.5 billion in annual revenue in
2016 to a US$ 5.3 billion market by 2023. Europe would stand for US$ 1.3 billion
(Gallucci, 2016).
That said, grid services with energy storage remain a niche market, still to a
large degree in the research phase. This leads to a lower competitive barrier and
gives a certain window of opportunity for software building.
Workability (6/8)
Graphs have been used for VPP-siting, which is a critical component in the
battery energy storage systems. Fortum is in fact already providing services to
the DSO through their Smart City Solutions (Fortum, 2020), which indicates a
certain alignment in terms of human capital. However, more information is
needed regarding the data and the internal software being used in order to assess
the data and maintainability alignments.
Figure 12: Evaluation of storage solutions for the grid
Further reading
A stochastic shortest path framework for quantifying the value and lifetime of
battery energy storage under dynamic pricing [171]
Consensus design for heterogeneous battery energy storage systems with droop
control considering geographical factor [172]


Figure 7.7: Assessment of Energy storage solutions

7.9 Master data management


Background
Fragmentation and inconsistency of data collected across multiple business
entities creates manifold issues such as supply chain inefficiencies and work
redundancy. Master Data Management (MDM) is a business-critical method
that aggregates all corporate data and is gaining ground in IT landscapes across
the world [173].
Master Data Management (MDM) provides complete, consistent access and
visibility of product, customer, location, employee and supplier data. This
centralisation can enable organisations to make data-driven decisions providing
operational agility, time-to-value and additional revenue generation [174].
Given the high degree of connection within the different data elements, for
example the interaction of a customer with different points of entries with a
company, graph databases have emerged as an efficient solution for mastering
data. They enable fast database traversals, clustering and community-detection
possibilities and interesting visuals [175]. Most database vendors (Neo4j,
TigerGraph, AWS Neptune) propose such solutions, which have the advantage of
being industry-agnostic.
Case study Using a 360° view of customers for segmentation [176]
Customer segmentation is widely used by companies to better target their service
or product offerings to their customers' needs. It can be used for marketing
or sales purposes, or for prospecting future trends.


The authors of this study look at how a 360° view of customers can help a
pharmaceutical company perform a multi-dimensional customer segmentation to
gain insights in the intricacies of their sales. They collected data across the
organisation to create profiles of customers based on both qualitative attributes
(e.g. preferences, attitude, interaction with sales representative) and quantitative
attributes (e.g. number of calls, frequency, age, sex). Once the characteristics of
a promising segment are identified, the information can be used to target new
customers with similar profiles.
They note that while this sounds simple, many organisations get stuck
somewhere in the process. Merging together attributes and behavioural data
collected across the organisation is a challenging task, requiring different
departments within the organisation to work together and share information
beyond their departmental mandate. They might need a leadership mandate to
lift potential internal barriers and have a capable group to work across
departments to inventory and integrate customer attributes and behavioural
data, to develop and test the customer segmentation.
Their study resulted in discovering that the highest volume of sales over a two
year period was attributable to a particular demographic segment of doctors.
This segment was clustered based on a mix of attributes, including common ones
such as geographic disposition and speciality, as well as more complex ones such
as ethnicity, age, gender and competitor product prescription mix.
Expert opinion
Experts at Fortum have worked with a graph-based MDM proof-of-concept.
They confirm that most of the work consists in data cleaning and consolidation
(e.g. an organisation number was not entered in the database). Also, the actual
graph building can become a constraint, as there is a need to design the
connections and define the wide variety of nodes and edges, which requires some
human verification. The more disparate the sources are, the harder it becomes to
gather data and engage the people who need to be involved. However, they have
found the visualisation helpful [177].
Assessment (6.6/10)
Graph applicability (8/8)
MDM is a classic industry case for graph applications and there is a large
number of previous use-cases, although not from the energy sector specifically.
Since there are numerous cases of graph-based MDMs in commercial settings,
the graph applicability is considered maximal.
Technical feasibility (7/8)
The technical feasibility is high, as the main task is about merging the data.
Creating the architecture to integrate data from different sources and keeping the
model updated is the highest source of technical complexity. Data-cleaning can
however become a liability, as the data sources can have different structures and
be incomplete. The risk associated with MDM is mainly a waste of resources: an
inconclusive model (e.g. because of incomplete or poor data) or a low adoption
rate by the users within the company. However, these risks are common to all
applications. Another risk is a breach of the privacy policy of customers, but it is
seen rather as a limitation for increasing the richness of the graph.
Economic potential (4/8)
The global MDM industry is projected to grow to US$ 8.6 billion by 2022
(Persistence Market Research, 2017). Given the generality of this use-case, many
graph database-vendors offer this service where minimal customisation is
required from the company (Neo4j, TigerGraph, AWS, etc.). There are several
industrial use-cases having reported economic benefits from their implementation
(Cisco, Pitney Bowes, the German Center for Diabetes Research). However, a
graph-based solution is not the only alternative, as there is a myriad of
non-graph-based MDM (e.g. IBM).
Workability (2/8)
Although the problem can be of high overall relevance for companies with large
customer bases and a wide product offering with data spread out across the
organisation, the perceived relevance for the end-users of this application is low.
This can in turn affect the integration and maintenance of the application over
time (i.e., keeping the data updated). Moreover, MDM is a query-driven graph,
responding to specific questions such as “What is the churn rate of customers?”,
generally posed by the sales or the marketing department, making
engagement from the end-users particularly critical. Another challenge is the
data alignment, which is logical given that data integration is the very purpose of
this application.
Further reading
Graph-based customer entity resolution [178].


Figure 7.8: Assessment of Master Data Management

Analytics-Aware Graph Database Modeling. [179].

7.10 Knowledge graphs


Background
The purpose of a knowledge graph is to centralise information from various
sources and formats (documents, images, videos, etc.) aiming at making
information quickly accessible, searchable and digestible. It is a model of a
knowledge domain, created by experts in the subject.
Knowledge graphs became popularised with the advent of Google’s knowledge
graph and are today used in most highly valued information-centric internet
firms (Facebook, Amazon, Microsoft, IBM). However, as is the case with Master
Data Management, knowledge graphs are industry-agnostic and have been used
in companies in industries such as aerospace (NASA), telecom (Comcast) and
retail (Ebay) [180].
A knowledge graph provides a structure and common interface for all of a company's data and
enables the creation of smart multilateral relations throughout the databases. It
is structured as an additional data layer, on top of the existing databases or data
sets, linking all data together at scale – be it structured or unstructured [181].
Case study A method for systematically developing the knowledge base of
reactor operators in nuclear power plants to support cognitive modelling of
operator performance [182]


This paper proposes a method for systematically developing the knowledge base
of nuclear power plant operators. The method starts with a systematic literature
review of a predefined topic. Then, the many collected publications are reduced
to summaries. Relevant knowledge is then extracted from the summaries using
an improved qualitative content analysis method to generate a large number of
pieces of knowledge. Lastly, the pieces of knowledge are integrated in a
systematic way to generate a knowledge graph consisting of nodes and links.
The proposed method is applied to develop the knowledge base of reactor
operators pertaining to severe accidents in nuclear power plants. The results
show that the proposed method has advantages over conventional methods,
including reduced reliance on expert knowledge and improved traceability of the
process.
Expert opinion
Fortum already has some experience with knowledge graphs and have found
value in studying this further [99].
Assessment (6.6/10)
Graph applicability (8/8)
As with MDM, knowledge graphs are a widespread use-case for graph analytics
applications. There is a clear set of algorithms which can provide value in
organising the information within a firm (similarity algorithms, link prediction,
clustering, search algorithms).
Technical feasibility (5/8)
Knowledge graphs are a technically more complex endeavour than MDM, as they
involve domain experts for creating the ontology and the set of rules in the
graph, natural language processing or image-recognition algorithms for the
documents, and the creation of a scalable database architecture. However, the
strong presence of use cases from the industry and the availability of existing
machine learning tools give a strong indication of the maturity of the technology
for commercial use.
Economic potential (3/8)
Assessing the economic potential of implementing a knowledge graph is complex,
as it does not pertain to a specific industry. Therefore, the score for this criterion
is mainly influenced by the scalability and competitiveness from other tools, according to the following reasoning. Within a specific domain (e.g. hydropower), the scalability is considered high, because once the structure is
established, all new documents can be seeded into the graph and automatically
processed. Across domains, although the structure can be inspired from an
existing graph, it still needs the coordination of stakeholders (e.g. domain
experts, asset managers). There are also out-of-the box alternative tools using
machine learning which can provide similar value (e.g. Amazon Kendra).
Workability (5/8)
The relevance of knowledge graphs for large corporations, where information can
be disparate, is high. The constraints for workability are the data alignment,
where considerable knowledge of the current information structure within the
domain of application is needed, in addition to the domain expertise needed and
the onboarding of the end-users.

Figure 7.9: Assessment of Knowledge graphs

Further reading
Hydro-graph: A knowledge graph for hydrogen research trends and relations [183].
A Potential Solution for Intelligent Energy Management—Knowledge [184].

7.11 Results of use-case cluster assessment


7.11.1 Assessment results
The graph-score of each use-case cluster was mapped against its
strategy-score, as illustrated in Figure 7.10.


Figure 7.10: Graph-score vs strategy-score of each use-case cluster

The graph-score is plotted on the vertical axis and the strategy-score on the
horizontal axis. To better discern the clusters, both axes have been rescaled
according to the following rule: the minimum of each axis is set to the value of
the cluster with the minimum score along that axis minus one. Similarly, the
maximum of each axis is set to the maximum score plus one. A more
fundamental reason for this is that in order to select a cluster, it is more
appropriate to evaluate the clusters in relative terms than in absolute terms.
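Expressed as code, the rescaling rule is simply (the scores below are illustrative, not the thesis's actual values):

```python
def axis_bounds(scores):
    """Axis limits per the rule above: the minimum cluster score
    along the axis minus one, the maximum score plus one."""
    return min(scores) - 1, max(scores) + 1

graph_scores = [5, 7, 6, 8]  # illustrative cluster graph-scores
low, high = axis_bounds(graph_scores)
```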
As such, four quadrants can be identified by drawing a boundary set at the mean of each axis.
As explained in the methodology, the use-case clusters were categorised in three
groups:
◦ The first group is comprised of the use-case clusters located in the
top-right-hand quadrant and represents the accepted clusters: Trading -
Market Intelligence, Knowledge Graph, Master Data Management and
Hydropower O & M.
◦ The second group is comprised of the use-case clusters located in the
top-left-hand and bottom-right-hand quadrants and represents the deferred
clusters: EV Applications, Storage solutions for the grid, Hydropower
optimisation and Nuclear power O & M.
◦ The third group consists of the use-case clusters in the bottom-left-hand
quadrant and represents the rejected clusters: Wind power design and O &
M.
The selected cluster for a proof-of-concept implementation was that of Energy
trading: Market intelligence, as it obtained the highest score on both dimensions.

7.11.2 A note on Energy trading


It can be interesting to give deeper insight into what set the Energy trading
cluster apart in the evaluation dimensions.
Graph applicability: graphs are particularly applicable in the context of
trading, for their ability to capture the dynamics between prices of stocks or
commodities and to represent time-series. According to structural models, the prices
of the energy commodities are highly interdependent and have large sets of
explanatory variables such as weather or supply and demand conditions. Many
concepts and algorithms were identified, particularly in the case of natural gas
and crude oil. With regards to the electricity markets, graphs have not been used
extensively, hence the lower score on the availability of supporting use-cases.
However, similar types of problem are solved, notably feature selection and
clustering, for which graphs are considered particularly powerful.
Technical feasibility: the technical feasibility is moderate, as there are
computational constraints given the need for fast decisions. If applied for
structural modelling purposes, graphs can suffer from the same modelling challenges as structural models, namely a time-consuming, manual model construction considering many parameters, possibly introducing model biases.
The risk from wrong model output is economic rather than technological,
arising from wrong trading decisions. However, there is a certain inherent
technological risk in energy trading given the stability constraints of the power
grid. The homogeneity of expertise needed in model construction is however
advantageous.
Economic potential: the energy markets are not projected to grow as fast as
other sectors considered in this study. In fact, energy consumption growth is
constrained by the efforts of societies to reduce carbon emissions in the coming
decades, particularly in the developed economies. Nevertheless, the markets are large enough to be of
huge interest for utility companies. Some commercial tools are available, but
utilities have an interest in developing internal tools to increase competitiveness.
Finally, as the energy markets are interconnected and because much of the
problems in the previous case studies found in this cluster can be transferred to
other sectors (e.g. feature selection, clustering), the cluster scores high on
scalability.
Workability: the problems solved in the use-cases in this cluster are of central
importance for better trading decisions. High data alignment was also an
argument for its high workability. In fact, much of the research in this cluster is
made on openly available data sets. The integrability and human capital
alignment were determined to be moderate as they vary among the use-cases
within the cluster.
Strategic alignment: since the liberalisation of the electricity market and
considering the increasing volatility of prices due to renewable generation, energy
trading has become a key activity of any actor on the market and more
specifically electric utilities. As such, the cluster scored high on the first strategic
goal of the organisation. In addition, introducing graphs in the energy trading
cluster can indirectly open possibilities within other strategic ambitions of
Fortum, such as new services for large industrial customers and private
households with regards to consumption optimisation or demand response.
Finally, the learnings from this cluster can prove useful for financial modelling in
the areas of risk or portfolio optimisation.


7.12 Discussion on the selection of use-case cluster
Given the constraints in this research, namely the limited time and information,
the assessment was made at face value. It provides an indication of which
clusters are to be prioritised relative to the other clusters, based on an equal set
of criteria and evaluation metrics. Therefore, it does not represent the potential
of clusters in absolute terms and each practitioner might have different
interpretations.
Another point to make is that all but two use-case clusters (knowledge graph and
master data management) are elaborated solely based on academic papers, given
the lack of publicised commercial use-cases in the energy sector. Their
workability score might have looked different if actual commercial benchmarks
had been found. Hence, the limited knowledge and experience in this dimension
brings about a higher uncertainty on the actual score and might evolve as
progress is made in the commercial sphere.
The introduction of a workability dimension in the graph-score has a high
impact. The results of the graph-score would have looked quite different if it only
contained the organisation-independent dimensions of graph applicability,
technical feasibility and economic potential. While such an approach is also
recommendable, we believe it would not represent the reality of a new technology
implementation process within an organisation. As indicated by experts in the
domain of graph theory, for organisations not accustomed to this novel modelling
approach, a more agile approach with consideration to the path of least friction
and the learning curve is recommended [185].

CHAPTER 8
Assessment of use-cases in Energy trading

In this section, three use-cases within the Trading cluster are presented in a
similar fashion as the use-case clusters were presented in the previous section.
The use-cases were chosen so as to best meet the needs of a utility company like
Fortum, and thus relate to the domains of natural gas, short-term load forecasting
and electricity price forecasting.
The difference in evaluating specific use-cases as opposed to use-case clusters is
that since a specific application is sought, a higher level of detail is required.
Each proposal is inspired by the use-case from the inventory which best
highlights the benefits of graph theory within the aforementioned selected
domains.
Table 8.1 summarises the scoring of each use-case. The same scoring system on
each dimension is applied as for the use-case clusters, detailed in 7.1.2. Note that
the strategy-score was not computed, as the use-cases all pertain to the same
cluster and thus score similarly.

Table 8.1: Scoring of use-cases in Energy trading

Legend:
UC1: Natural gas: visibility graph
UC2: Short-term load forecasting: clustering
UC3: Electricity price forecasting: feature selection


8.1 Natural gas market analysis with visibility graphs
Background
Global natural gas trade concentrates on three regional markets: the North
American market, the European market and the Asia-Pacific market. Due to the
morphological characteristics of natural gas, a unified global natural gas spot
market does not currently exist; prices are instead set regionally. The North
American market has gradually become the most mature of the three, with a
competitive market system in which market risk can be controlled through cash
and derivative instruments.
The high volatility of natural gas prices means that energy producers and
distributors are often faced with high volatility risk. Therefore, in order to avoid
market risks and improve energy security, it is of great importance to study the
characteristics of price fluctuations of the North American natural gas market,
which is exemplary to Europe and Asia markets [186].
Case study: Visibility graph network analysis of natural gas price: The case of
North American market [186]
The authors of this study constructed visibility graphs to study the North
American natural gas spot market. Visibility graphs have grown in popularity for
their capability to convert time series into complex networks, helping to
investigate their overall and local features. First proposed by Lacasa et al. [187],
the method has been applied to gold prices [188], financial markets [189] and
exchange rate fluctuations [190]. Findings from empirical records for stock
markets in the USA (S&P 500 and Nasdaq) and artificial series generated by
means of fractional Gaussian motions have shown that this method can provide
rich information benefiting both short-term and long-term predictions.
In a visibility graph, the data points of a univariate time series are converted into
nodes, and two nodes are visible to each other if the line segment connecting them
is not obstructed by any intermediate data point, in which case an edge is drawn
between them. The
resulting graph type can then be used to provide information on the
characteristics of the original time series. For example, periodic series convert into a regular graph, whereas random series convert into exponential random
graphs. In addition, the temporal characteristics and the internal evolution
mechanisms of time series can also be explained by the visibility graph algorithm.
This allowed the authors to model the natural gas price fluctuations by observing
the evolution characteristics of its corresponding graph. Namely, three attributes
of the visibility graphs were discussed: the degree and degree distribution, the
average shortest path length and the community structure.
The main contributions of this paper are as follows:
◦ It provides a new method to study the spot price fluctuations of natural
gas in North America, which can also be applied to other energy prices,
such as coal, oil or solar.
◦ Combined with the degree distribution characteristics of the natural gas
price visibility graph network (NGP-VGN) and an appropriate window size,
the overall natural gas price series is divided into five 6-year time windows,
which makes the research more detailed and targeted.
◦ It characterises the degree and degree distribution, small-world
characteristics and community structure of the network.
Proposal: Construction of a visibility graph for the European natural gas
market.
Assessment (7.2/10)
Graph applicability (8/8)
Graph theory is a tool of growing interest in the signal processing community for
data representation and analysis. Its structures offer a new perspective, often
unveiling non-trivial properties of the data they represent. In particular, time
series analysis has greatly benefited from graph representations, as they provide a
mapping able to deal with non-linearities and scaling issues present in multiple
applications. In the case studied, classical graph-theoretic algorithms were used
(e.g. degree distribution, shortest path).
Technical feasibility (6/8)
The model uses univariate time series of prices, the set of rules for the graph
construction is simple, and the graph gives a lot of freedom for analyses. This
freedom can also turn out to be the model's very drawback, as it requires
considerable intuition and understanding of networks to know what to analyse
and how to make business decisions.


Economic potential (4/8)
Deeper understanding of and analysis tools for the price fluctuations of natural gas
are relevant both for reducing operational risk (e.g. identifying the propagation of
external events onto prices) and for reducing operational costs. However, the
direct consequences of this model on trading decisions are not fully clear. It
remains that this is a highly scalable approach, applicable to any time-series.
Workability (5/8)
The workability for this type of application is seen as high, as the data needed is
simple, the relevance of increased knowledge on natural gas fluctuations is high
and there is an alignment with the human capital within the trading department.

Figure 8.1: Assessment of Natural gas market analysis with visibility graphs

8.2 Smart meter clustering for short-term load forecasting
Background
Short-term load forecasting (STLF) aims to predict electricity loads over a short
time horizon (hours or days ahead). The global roll-out of smart meters
has created new opportunities for further improvement of forecast accuracy.
Before the mass adoption of smart meters, consumers’ electricity usage was
typically read in intervals ranging from one to six months. Consequently, there
were usually at most 12 readings per year per customer available. Smart meters,
however, can measure and record each consumer's energy usage every 15, 30 or
60 minutes. The general method for improving the forecast using smart meter
data is based on clustering.
Assume that the electricity demand of a city with a number of consumers is to
be predicted for the next 24 hours. A model can be trained to generate a forecast
for the next day, using inputs like temperature, load at the same hour on the
current day, load at the same hour on the same day in the last week, etc. Any of
the methods mentioned above can be used for creating such a model.
Research on using smart meter data for improving load prediction is still in its
early stages. As a result, there are not many papers addressing this issue. The
existing works can be categorised according to their feature set selection
methods, clustering methods and the applied prediction methods. Alzate and
Sinn use wavelet analysis for feature extraction, a method called kernel spectral
clustering for clustering and a prediction method called PARX as their forecast
method. They report a more than 20 % improvement in forecast accuracy using
these techniques, which is the subject of the case study below [191]. Gajowniczek et
al. simulate clustering approaches for residential electricity demand profiles [192].
Case study: Improved electricity load forecasting via kernel spectral clustering of
smart meters [191]
The authors of this study aim to improve forecasts of aggregated demand with
smart meter data, by exploring the possibilities from kernel spectral clustering.
Their goal is to cluster the smart meter data in order to build a forecasting
model for each cluster separately, instead of building a unified model fitted to
the total aggregate of all meters.
The purpose of spectral clustering is to identify communities of nodes based on
the edges connecting them. It uses information from the eigenvalues (spectrum)
of adjacency matrices, whether built from a graph or a data set. As opposed to
k-means clustering, which results in convex sets, spectral clustering can solve
problems such as intertwined spirals, because it does not make assumptions on
the form of the cluster. This property comes from the mapping of the original
space to an eigenspace. Given a sparse similarity graph, spectral clustering can
be implemented efficiently even for large data sets [193]. Kernel Spectral
Clustering (KSC) is an extension of the model described above. It allows for
out-of-sample extensions and model selection in a learning setting with training, validation and test stages.
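The intuition behind the spectral step can be illustrated on a toy non-convex problem, here two concentric circles, where k-means would fail. This is a plain normalised-Laplacian sketch, not the KSC model of the paper:

```python
import numpy as np

def spectral_bipartition(X, sigma=0.5):
    """Two-way spectral clustering via the symmetric normalised
    Laplacian: split on the sign of the Fiedler (second) eigenvector."""
    # Gaussian (RBF) similarity graph over all point pairs
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(X)) - D_inv_sqrt @ W @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)

# Two concentric circles: a non-convex cluster structure that k-means
# cannot separate but spectral clustering can.
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
inner = np.c_[np.cos(t), np.sin(t)]
outer = 3 * np.c_[np.cos(t), np.sin(t)]
X = np.vstack([inner, outer])
labels = spectral_bipartition(X)
```

The mapping into the eigenspace of the Laplacian is what lets the method recover the two rings, exactly the property described above.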


The approach followed in this work is to improve the forecasting of a global load
time series (e.g., total regional or national demand) by first performing clustering
on the smart meter time series, fitting a forecasting model on each cluster and
summing the per-cluster predictions to obtain the total forecast.
Thereafter, they use a periodic autoregressive model with exogenous variables
(PARX) to model the aggregated load.
The data used in the experiments of this paper are 6,000 smart meter series
collected by the Irish Commission for Energy Regulation. Their model shows a
20 % improvement in forecasting accuracy, achieved by selecting a kernel based on
Spearman's distance.
Proposal: Graph-based spectral clustering of smart-meter data for STLF.
Expert opinion
In the case of load forecasting, the large number of diverse customers leads
to less volatility, by the law of large numbers. Thus, regular linear regression has
acceptable performance, reducing the need for computational intelligence-based
models. In fact, much of the performance improvement potential lies in more
accurate weather models, which is beyond the scope of this research [98].
Assessment (5.3/10)
Graph applicability (6/8)
Although many variants of clustering exist, particularly in the machine learning
domain, spectral clustering has its roots in graph theory as it is based on the
Laplacian matrix.
Technical feasibility (4/8)
The roll-out of smart meters has spurred research in the field of load
disaggregation.
Economic potential (4/8)
Load forecasting using smart-meter data is a nascent field but considered to be
an economic opportunity for electric utilities. There is a window of opportunity
to develop such tools, but there is available software developed by major
companies to fit this purpose.
Workability (3/8)
As suggested by the expert opinion section, the relevance of this proposal is rather low. Data from smart meters can also be tricky to obtain, especially for the larger geographical areas for which load forecasting is necessary.

Figure 8.2: Assessment of Short-term load forecasting - clustering

8.3 Feature selection for electricity price forecasting
Background
As has been discussed in previous sections, the electricity price depends on a
large set of fundamental drivers, including system loads (demand, consumption
figures), weather variables (temperatures, wind speed, precipitation, solar
radiation), fuel costs (oil and natural gas and to a lesser extent coal), reserve
margin (surplus generation, i.e., available generation minus/over predicted
demand) and the scheduled maintenance or forced outages of important power
grid components. Their historical (past) values and (market or expert)
predictions for the forecasting horizon considered are valuable for the
construction and proper calibration of the models [200].
EPF models are strongly dependent on the quality of the features provided as input.
To a large extent, methods have leveraged expert knowledge to guide the search
for the right variables, such as specific time-lags, oftentimes with time-varying
windows. Brute-force exhaustive search for variables is computationally
expensive, so there is interest in automated procedures for precise
dimensionality reduction [145].


Almost all models share a common, critical phase: selecting the right
combination of parameters. It is important to note that though there have been
many studies focusing on feature selection, the development of an objective
method of selecting a minimum set of the most effective input variables would be
very valuable [200].
Case study: A hybrid short-term electricity price forecasting framework: Cuckoo
search-based feature selection with singular spectrum analysis and SVM [194]
This study proposes a hybrid forecasting framework for short-term electricity
price forecasting to cope with the non-linear dynamics of the electricity price.
The first step in their model is a singular spectrum analysis (SSA), to exploit the
important information hidden in the electricity price signal, such as trend,
periodicity and noise components. SSA can be an aid in the decomposition of
time-series into a sum of components, each having a meaningful interpretation.
The standard SSA algorithm consists of four stages: embedding, Singular Value
Decomposition (SVD), grouping and diagonal averaging. A great advantage of
this methodology is the fact that it is non-parametric, or model-free, meaning
that it can adapt itself to the underlying data set, dismissing the necessity of a
priori models. They decompose the price into a trend, a seasonal and a residual
component. This step is then followed by a cuckoo-search (CS) algorithm to
facilitate the development of feature selection and finally a support vector
machine (SVM) as a forecasting engine.
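A bare-bones version of the four SSA stages can be sketched as follows. This is an illustrative decomposition on synthetic data, not the paper's implementation:

```python
import numpy as np

def ssa_components(x, L, n_components=3):
    """Basic Singular Spectrum Analysis: embedding, SVD, grouping
    (one group per singular value) and diagonal averaging."""
    N = len(x)
    K = N - L + 1
    # 1) Embedding: trajectory (Hankel) matrix of lagged windows
    X = np.column_stack([x[i:i + L] for i in range(K)])
    # 2) SVD of the trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    comps = []
    for k in range(n_components):
        Xk = s[k] * np.outer(U[:, k], Vt[k])
        # 3+4) Grouping and diagonal averaging (Hankelisation)
        # back into a series: average each anti-diagonal of Xk
        comp = np.array([np.mean(Xk[::-1].diagonal(i - (L - 1)))
                         for i in range(N)])
        comps.append(comp)
    return np.array(comps)

# Toy "price" series: trend + daily-like seasonality + noise
rng = np.random.default_rng(1)
t = np.arange(200)
x = 0.05 * t + np.sin(2 * np.pi * t / 24) + 0.1 * rng.standard_normal(200)
comps = ssa_components(x, L=48, n_components=3)
recon = comps.sum(axis=0)
```

On this toy series, the leading components recover the trend and most of the periodicity, leaving mainly noise in the residual, mirroring the trend, seasonal and residual decomposition used in the paper.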
Benchmarking their model against a set of commonly used forecasting
techniques, among others LASSO, SVM and ARIMA, they find superior model
performance. They attribute this to the hybrid feature selection (HFS), which can
generate the optimal lag order for alternative electricity price series, after which
the hybrid of CS, SSA and SVM can more effectively capture the relationship
between inputs and output, resulting in the desired forecasting accuracy.
The authors point out that a combination of models can obtain improved
robustness and be more generalisable than single methods, because they
integrate the strengths of a set of individual methods.
Proposal: A graph-based feature selection for EPF.
Expert opinion
There is reported interest in an automated feature selection process to improve
forecasting results and computational efficiency, and to reduce manual input.
Although various feature selection approaches have been tested, graph-based
models have not been among them. There is also an interest in a model which
could reduce the dimension of very large data sets [195].
Assessment (8.1/10)
Graph applicability (7/8)
The authors of this study have not directly used a graph. They implement a
method from spectral theory, namely the singular spectrum analysis (SSA), in
order to understand the inherent structure of the data and reduce the dimension
of the data set in a non-linear fashion. As graphs are commonly used for time
series, a method based on spectral graph theory can be elaborated, whereby the
structural information of a graph can be obtained by studying its spectrum [196].
For example, the cluster structure of a data set can be extracted from the leading
eigenvectors of its corresponding Laplacian matrix. We therefore consider that
graphs are applicable in this context.
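One simple way to make this concrete is a redundancy filter on a feature-correlation graph. The sketch below is purely illustrative (the feature names and thresholds are hypothetical), not the method ultimately developed in this thesis:

```python
import numpy as np

def graph_feature_selection(X, names, threshold=0.9):
    """Features are nodes; an edge connects two features whose absolute
    Pearson correlation exceeds the threshold. One representative (the
    most central member) is kept per connected component."""
    C = np.abs(np.corrcoef(X, rowvar=False))
    n = C.shape[0]
    adj = (C > threshold) & ~np.eye(n, dtype=bool)
    seen, kept = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, component = [start], []
        while stack:  # depth-first search over the similarity graph
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            component.append(v)
            stack.extend(np.flatnonzero(adj[v]).tolist())
        # Keep the member most correlated with the rest of its component
        sub = C[np.ix_(component, component)]
        kept.append(component[int(np.argmax(sub.sum(axis=1)))])
    return [names[i] for i in sorted(kept)]

rng = np.random.default_rng(0)
load = rng.standard_normal(500)
X = np.column_stack([
    load,                                    # system load
    load + 0.01 * rng.standard_normal(500),  # near-duplicate of load
    rng.standard_normal(500),                # temperature (independent)
])
selected = graph_feature_selection(X, ["load", "load_dup", "temp"])
```

In a forecasting pipeline, the retained features would then feed the regression model, in the spirit of the pre-processing step proposed above.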
Technical feasibility (6/8)
The model would only require a set of time-series data. Expertise is needed in
the fields of mathematics, energy trading and data science, which represents a
rather homogeneous set of competences. The computational constraint is not
seen as an issue, as the solution's very purpose is higher scalability.
Economic potential (6/8)
The economic potential of improving price forecasting accuracy is extensively
covered in EPF literature [200]. An automated feature selection in particular has
high potential for reducing errors and the time spent on manual selection. Since
it is still a blooming research topic, few commercial software tools are readily
available; rather, these models are built internally. The advantage of a model-free
feature selection approach is its scalability to larger data sets [144] and to other
domains [197].
Workability (7/8)
The relevance of an automated feature selection for large data sets is high. Data
for this application is readily available and to a certain extent pre-processed.
Given its model-free nature, the solution can be incorporated as a pre-processing
step for a hybrid forecasting engine.


Figure 8.3: Assessment Electricity price forecasting - feature selection

8.4 Results of the use-case assessment


While all three use-cases are deemed interesting for a proof-of-concept, the
leading use-case was Electricity price forecasting - feature selection. It scored
higher on the dimensions of technical feasibility, economic potential and
workability. The results are illustrated in Figure 8.4.
Note that since the above three use-cases belong to the same cluster of Energy
trading, a mapping against the strategic alignments was not necessary as they
would score equally.

8.4.1 A note on Electricity price forecasting


Below is a more detailed description of the criteria setting Electricity price
forecasting apart.
Graph applicability: the graph applicability was not the highest in the cluster,
as there were use-cases where specific graphs and algorithms were more prevalent
in the literature. However, it is still considered high, as graphs can be created on
time-series and are reported to perform well in feature selection.
Technical feasibility: the simplicity of model set-up is high, given the low
number of modelling definitions and assumptions needed. So is the homogeneity
of expertise, due to the intimate relationship between graph-based feature
selection and machine learning forecasting.


Figure 8.4: Comparison of the use-case assessments

Economic potential: the economic potential is considered high, because of the large volumes of electricity traded and the relative size of
power sales in utility companies’ earnings. Feature selection in electricity price
forecasting is very much research in progress, so there are few commercially
available tools which are robust and stable enough. Finally, the scalability of the
solution comes from the capability of graph-based feature selection to be
generalisable to more domains as it only looks at the data structure. The sector
is however not considered a growing market.
Workability: feature selection methods are considered workable in the context
of electricity trading activities. They are a widely utilised pre-processing stage
and are very relevant for improving the forecasting accuracy. Furthermore, much
of the data is readily available through public domains or through third parties.
The integrability and maintainability are considered decent: feature selection
is per se easily integrable as a pre-processing step, but there is a risk of
non-complementarity or redundancy, which could introduce more sources of error,
adding a level of complexity in managing the current architectures.


8.5 Discussion on the selection of use-case


The differences introduced in this process compared to the assessment of the
use-case clusters are two-fold.
Firstly, use-case proposals were made. They are inspired by the case study
presented, aiming to combine its utilised concept from graph theory with the
sector challenge at hand. However, they intentionally have a level of vagueness,
to leave room for potential nuances in the approach, as the specific opportunities
and challenges are further explored before the implementation phase.
Secondly, another mapping on the strategy score could have been possible. There
are possibly differences in the strategic scores of use-cases within the same
cluster. However, the present use-cases in Energy trading were chosen according
to their good representation of key challenges within this cluster. Furthermore, it
is not believed that the small variations would have altered the final result, as
the graph-score contains more information regarding an immediate
implementation of a proof-of-concept.

CHAPTER 9
Conclusion and discussion on PART II

9.1 Conclusion
As was covered in Part I of this research, graph theory has a wide array of
applications across engineering fields. Modelling physical or digital systems as
nodes and edges enables the usage of many graph-related algorithms and tools of
analysis.
Part II aimed at illustrating where in the energy sector graphs can bring
additional value. A systematic search through academic and industrial use-cases
was made to obtain a more formalised manner of identifying where graphs have
previously been implemented, more particularly within the domains of
activity relevant for a utility company.
Inspired by established Idea Management theory and guidelines, an idea
selection process was designed. Firstly, a filtering and clustering was made. The
use-cases were clustered in two dimensions, namely by domain of activity and by
graph-related concept to more easily identify which graph-based methods are
most prevalent in the respective domains. Then, an evaluation framework was
elaborated in coordination with organisational and academic supervisors. It
consists of two sequential evaluations, one of use-case clusters and then of specific
use-cases within the chosen cluster. This way, utilities can have a wider selection of solutions to solve related challenges within a given cluster.


The clusters were evaluated with the help of interviews conducted with industry
experts from the respective clusters and upon criteria rooted in IM literature.
They were then compared upon their denoted graph-score and strategic score.
The chosen criteria span the dimensions of graph applicability, technological
feasibility, economic potential and workability, as well as how well they align with
the organisation’s strategic goals. Finally, three specific use-cases from the
selected use-case cluster, Energy trading, were evaluated upon their graph-score.
Namely, the potential applications of the natural gas market, electricity price
forecasting and short-term load forecasting were considered. The selected
use-case was feature selection for electricity price forecasting, for its particularly
challenging and economically crucial aspects for electric utilities.

9.2 Discussion
The contributions of Part II of this research are the following:
◦ A data-driven idea generation approach for technological scouting within
the innovation processes of electric utility companies. It can be extended to
more domains of applications or replicated for other technologies.
◦ A framework for selecting a use-case of graph theory within the energy
sector.
◦ An index of possible use-cases within the energy sector, which can be used
as basis for further, more detailed studies in this context.
As the assessment of the use-cases is made at face-value, it aims at providing
guidelines and indications as to the potential value of graph theory in a subset of
domains of activity within the energy sector. This research contains certain
inherent limitations in objectively examining each use-case cluster,
namely in terms of time, scope, expertise and data. A suggestion for a more
objective evaluation would be to distribute the assessment process across the
organisation's relevant experts and have the use-cases selected by a voting
system. However, this could introduce subjective biases.
Another possible direction would be to develop the idea generation tool further
to index all documents and do natural language processing for creating
knowledge from the information, in line with the concept of knowledge graphs.
This has been done to improve nuclear power operation efficiency, to enrich
smart grid information and to assess the status of hydrogen research.
elements of the previous case studies could be detected and compared
systematically and objectively. But this process could become time-consuming
and there will still remain some sources of uncertainty, be they organisational,
technical or economic. Taking into account the value of an agile approach,
consisting of incremental and iterative development and factoring in compounded
learnings throughout the process, rapid prototyping of a proof-of-concept of a
use-case might contribute to a more realistic assessment of an application.
As this research is exploratory, another literature review will be conducted to
gather a more profound understanding of the challenges with electricity price
forecasting and the various graph-based approaches which can be adopted. Given
the versatility of graph theory, some flexibility from the original use-case is
reserved for the proof-of-concept proposal.

Part III

Graphs, feature selection and


electricity price forecasting

CHAPTER 10
Graph theory in electricity price
forecasting

10.1 Background on the electricity market


10.1.1 Roles in the market
TSO (Transmission System Operator)
The operator of the national power system. Oftentimes the grid owner, it is
responsible for the physical balance between supply and demand and the
development of the network.
Generators
Market participants responsible for the supply of electricity.
Power exchange
The marketplace where the electricity is traded between the market players
through an auction system.
BRP (Balance responsible party)
A market participant, or the chosen representative, which is responsible for its
power imbalances in the system. A power imbalance refers to the difference
between their bid and their actual production or consumption. The BRP is
financially responsible for these imbalances and is incurred a cost proportional to
its imbalance.


Imbalance settlement administrator


Market participant settling the imbalances between the scheduled quantity of
power supply or consumption and the actual values exchanged on the market. It
is responsible for setting the price of the imbalance and dispatches the costs or
incomes to the BRPs. In most markets, the imbalance settlement administrator
is the TSO itself.
Traders
Also called retailers, these market participants buy electricity from the producers,
either bilaterally or on the power exchange, to sell it to their customers.

10.1.2 The day-ahead spot-market


The spot market is organised through auctions: each working day, a market
player makes a bid to buy or sell electricity on the power exchange for the
following day. The TSO is responsible for settling the selling and buying bids,
thus creating a dispatching schedule for the generators and establishing the
market-clearing price for each power period (generally an hour) for the following
day.
In their most generic form, the bids consist of a power quantity and a price for
each power period. If the bid is accepted, the market player engages in selling or
buying the quantity from the bid for each power period auctioned. Sellers bid the
lowest price at which they are willing to be paid for a certain quantity of power
(in other words, the marginal cost) and buyers state the highest price at which
they are willing to consume a certain quantity of power (the marginal utility).
Upon market closure (generally 12:00 the day before), the TSO constructs the
buying and selling curves, in other words, sorts the selling and buying bids by
price and quantity for each period. The bids are accepted in order of increasing
price until the total demand is met, according to Figure 10.1 [198].
As per fundamental microeconomic rules, the intersection of the supply and
demand curves determines the price, called the market clearing price. This way,
market equilibrium is reached and social welfare is maximised. All supply bids at
a price point lower than the clearing price are accepted, so are all purchasing
bids at a price point higher than the clearing price. This process is called the
generation dispatch, as it dictates the production schedules of each generator.
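The merit-order clearing described above can be sketched in a few lines of Python. The bid prices and quantities below are invented for illustration, and the matching is deliberately simplified (uniform pricing, no block bids or network constraints).

```python
def clearing_price(supply_bids, demand_bids):
    """Find the market clearing price from lists of (price, quantity) bids.

    Supply bids are sorted by increasing price (the supply curve) and
    accepted until the cumulative supplied quantity covers the demand
    still willing to pay at least that price level.
    """
    supply = sorted(supply_bids)                # cheapest generation first
    demand = sorted(demand_bids, reverse=True)  # highest willingness to pay first

    supplied = 0.0
    for price, qty in supply:
        # demand still willing to buy at this price level
        demanded = sum(q for p, q in demand if p >= price)
        supplied += qty
        if supplied >= demanded:
            return price                        # the marginal bid sets the price
    return None                                 # demand cannot be met

# Hypothetical bids: (price in EUR/MWh, quantity in MWh)
supply = [(10, 50), (25, 40), (40, 30), (80, 20)]
demand = [(100, 60), (45, 40), (20, 30)]
print(clearing_price(supply, demand))  # → 40
```

Here the bid at 40 EUR/MWh is the marginal (price-setting) unit: cheaper supply alone cannot cover the demand willing to pay at least that price.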
Figure 10.1: Formation of electricity prices

Because each bid is based on forecasts (of either generation or consumption), it is
logical that there is some margin of error and the actual values will look
different. These variations are remedied in various ways in the balancing market.
It is worth mentioning that the biggest portion of power sales is still
made through bilateral contracts with large customers, also called futures.
Suppliers and buyers agree on a future delivery of electricity at a pre-determined
rate, hedging against the risk caused by the daily price volatility.

10.1.3 The balancing market


During the day of delivery, adjustments in production and
consumption need to be made continuously in order to keep the physical balance
of the grid. For this reason, it is also referred to as the real-time market, where
suppliers and bidders continuously place bids on the power exchange. The
market clearing takes place during the hour of operation: the TSO selects the
relevant bids and calls the entities in real time. The TSO uses this real-time
auction mechanism for the delivery of energy for tertiary regulation and for the
management of network congestions in real time [199].
◦ If the regulation is upward (the system needs more power), generally the
TSO must pay the entity that generates power, while
◦ If the regulation is downward (the system needs less power), generally the
entity that consumes power must pay the TSO.
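As an illustration of this settlement logic, a BRP's imbalance cost under a simplified single-price scheme might be sketched as follows; the function name, prices and volumes are hypothetical.

```python
def imbalance_cost(scheduled_mwh, actual_mwh, up_price, down_price):
    """Cost charged to a BRP for deviating from its scheduled position.

    A shortfall (actual < scheduled) forces upward regulation, settled at
    up_price; a surplus forces downward regulation, settled at down_price.
    Sign convention: a positive result is a cost for the BRP, a negative
    result is an income.
    """
    imbalance = actual_mwh - scheduled_mwh
    if imbalance < 0:          # produced less than sold: buys back at up_price
        return -imbalance * up_price
    else:                      # produced more than sold: paid the down_price
        return -imbalance * down_price

# A generator sold 100 MWh day-ahead but produced only 90 MWh;
# the upward regulation price is 60 EUR/MWh.
print(imbalance_cost(100, 90, up_price=60, down_price=20))  # → 600
```

The asymmetry between `up_price` and `down_price` is what makes imbalances costly: shortfalls are bought back at a high price, surpluses are paid a low one.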
This market has two economic consequences for traders. Firstly, it is rare that
generators produce exactly the quantity from their bid in the day-ahead market.
The TSO charges a cost for this imbalance, either for upward regulation if the
system needs more electricity or for downward regulation in the opposite scenario.
Secondly, market participants can choose to trade on this market instead of the
day-ahead market, if they predict that it would be more beneficial to them.
As soon as buy- and sell-orders are matched, the trade is executed. Electricity
can be traded up to 5 minutes before delivery and through hourly, half-hourly or
quarter-hourly contracts. As this allows for a high level of flexibility, market
participants can also make last minute adjustments to balance their positions
closer to real time to reduce their imbalance [201].

10.2 Electricity price forecasting


10.2.1 Forecasting methods
As earlier indicated, electricity price forecasting (EPF) is not trivial and there is
a rapidly growing literature on the topic. Most studies report similar challenges,
namely non-linear dynamics, strong self-correlation, time-varying means and
deviations, as well as spikes and
seasonality. The integration of more volatile renewable energy in the system adds
to the complexity of accurately predicting prices.
Weron has proposed the following taxonomy of the different types of models
proposed in literature [200]:
Multi-agent models simulate the behaviour of agents in a system, in other
words, the interaction between the various market players is analysed. This
approach is justified by the fact that the electricity price is determined through a
bidding process, where market participants can adopt various strategies to
maximise their revenues (e.g. by exerting market power). They can be used for
qualitative studies in the market.
Structural models take into account the fundamental relationship between the
elements of the electrical system. Parameters affecting the electricity price in the
physical sense are being modelled to create a realistic electricity price
representation. Examples include the capacities of the transmission and
distribution lines, the geographical placement of the generators, the feed-in from
renewables, the load sources and weather variables.


Graphs can be applied in the context of structural equation modelling (SEM)


[202]. SEM is increasingly being used in engineering and natural sciences to
respond to questions about complex systems. It puts emphasis on the causal
process in dynamic systems, by providing a framework inferring cause-effect
relationships. The use of graphical modelling methods to analyse multivariate
data permits the identification of causal hypotheses [203]. Identifying causal
relationships in the energy markets by means of graphs has been the subject of
several studies, though mainly for understanding complex networks of the crude
oil and natural gas markets [204, 205, 206] and their causality relationship with
electricity prices [147]. Graph-based solutions for structural modelling in the
electricity market mainly focus on problems related to the power grid topology,
for instance for assessing the reliability of power grids [149] or power system
islanding [148]. With the integration of more renewable generation, graphs are
also emerging as an approach to manage power flows in distribution grids [150].
Though this modelling approach can give a more fundamental understanding of
the price-creation, it generally fails to capture the volatility of electricity prices in
time. It is also complex to set up and is more prone to arbitrary assumptions to
compensate for the lack of some timely parameters, for example the time
granularity of the natural gas data. Therefore, hybrid models combining
fundamental factors with statistical, reduced-form or computational intelligence
are often used [200].
Reduced form models, or stochastic models, are inspired from the financial
markets and do not aim at forecasting the electricity price per se, but rather seek
to replicate their main characteristics to capture the price dynamics (e.g.
marginal distributions, correlations). They are commonly used for derivatives
valuation and risk analysis. Several approaches including jump-diffusion models
or Markov regime-switching models have been proposed.
Graphs can help capture the dynamics between prices in the stock market. In
fact, there is a type of graph usually employed for such modelling, the market
graph. It has been used on various occasions for modelling the correlations
between stocks [151] and between energy commodities [152]. As the markets of
natural gas and crude oil are becoming more financialised and are known to
impact the electricity price, a combination of graphical and financial insights in
characteristics such as causality, centrality, degree distribution and impact of
exchange rates can be of interest [153]. Although the simplicity of this type of
model can be a serious limitation and few attempts have been made, it performs
reasonably well for capturing volatility or price spike forecasts [200].
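A market graph of the kind referenced above can be sketched by thresholding a correlation matrix: assets become nodes, and an edge is drawn whenever the absolute correlation between two return series exceeds a threshold. The data below is random and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.standard_normal((250, 5))   # 250 days of returns for 5 assets
# make assets 0 and 1 strongly correlated for illustration
returns[:, 1] = returns[:, 0] + 0.1 * rng.standard_normal(250)

corr = np.corrcoef(returns.T)             # 5 x 5 correlation matrix

threshold = 0.6
# adjacency of the market graph: edge if |correlation| exceeds the threshold
adjacency = (np.abs(corr) > threshold).astype(int)
np.fill_diagonal(adjacency, 0)            # no self-loops

edges = [(i, j) for i in range(5) for j in range(i + 1, 5) if adjacency[i, j]]
print(edges)                              # only the engineered pair survives
```

The choice of threshold controls the sparsity of the resulting graph; graph-theoretic quantities such as degree distribution or centrality can then be computed on `adjacency`.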
Statistical models build upon the seasonality of electricity prices and capture
their auto-regression at the same time as exogenous factors can be considered.
They have been exhaustively applied to EPF thanks to their simplicity and
satisfying accuracy. Traditionally used models include autoregression, transfer
functions, autoregressive integrated moving average (ARIMA) and Generalised
Autoregressive Conditional Heteroskedasticity (GARCH) [194]. However, they
fail to capture price spikes because of the spikes' non-linear nature, so hybrid models are
often used, for example with various combinations of ARIMA and GARCH
models [210, 211].
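As a minimal illustration of the autoregressive family (not the full ARIMA or GARCH machinery), an AR(2) model can be fitted to a price series by ordinary least squares; the series below is synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
# synthetic "price" series with persistence (a random walk shifted upward)
prices = np.cumsum(rng.standard_normal(500)) + 50

p = 2  # AR order
# design matrix of lagged values: prices[t-1], prices[t-2], plus an intercept
X = np.column_stack([prices[p - 1:-1], prices[p - 2:-2], np.ones(len(prices) - p)])
y = prices[p:]

coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)

# one-step-ahead forecast from the two most recent observations
one_step_forecast = coeffs @ [prices[-1], prices[-2], 1.0]
print(coeffs, one_step_forecast)
```

For a random walk, the fitted lag-1 coefficient should be close to one, illustrating the strong self-correlation mentioned above.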
Computational intelligence (CI) models aim at overcoming the
aforementioned weakness of statistical models as they are effective for capturing
the non-linear behaviour of electricity prices [212]. Artificial neural networks
[213], support vector machines [214] and fuzzy inference systems [215] have often
been suggested, but a vast set of models is being tried (e.g., genetic algorithms,
regression trees). Their ability to handle complexity and non-linearity makes
them a promising approach for short-term predictions and authors have reported
excellent performance. They can furthermore accept much larger amounts of
data than traditional statistical regression models. However, this can also be
their very drawback, as there is a risk of overfitting the model to the training set
and producing inaccurate point forecasts [200]. The accuracy of the models is
dependent on the calibration of the hyperparameters and the initial conditions,
which is to a large extent done by computationally intensive exploratory means
and through trial and error, reducing replicability. Also, they can
suffer from a low level of explainability as they generally adopt a black-box
model, i.e. what happens in the neural networks is not fully understood.
EPF models are strongly dependent on the quality of the features provided as
input. To a large extent, methods have leveraged expert knowledge to guide the
search for the right variables, such as specific time-lags, oftentimes with
time-varying windows. Brute-force exhaustive search for variables is
computationally expensive. Still, there is interest in automated procedures for
precise dimensionality reduction [145].
Graphs are widely adopted in combination with neural networks, e.g. through
graph neural networks, graph convolutional networks or graph embeddings, for
performing computational intelligence models on graph-structured data or to
reduce the dimensionality of data while preserving its inherent structure. For
instance, graph embedding enables a representation of entire graphs, or subsets of
graph data, in a numerical format adapted for machine learning tasks [234]. The
problem of graph embedding is related to two traditional research problems, i.e.,
graph analytics [233] and representation learning [209]. Particularly, graph
embedding aims to represent a graph as low dimensional vectors while the graph
structures are preserved. Graph analytics can mine useful information from
graph data, whereas representation learning obtains data representations making
it easier to extract useful information when building classifiers or regressors [235].
The singular spectrum analysis performed in [194] is a type of graph embedding.
Because of the high performance of computational intelligence models and the
ability of graphs to provide additional understanding, particularly in the area
where CI models encounter the greatest limitations, namely feature selection, this
will be the subject of study for this research. Some of the main challenges in
feature selection and an intuition of the benefits of graphs are outlined below.

10.2.2 Feature selection in electricity price forecasting


Almost all models share a common, critical phase: selecting the right
combination of parameters. The development of an objective method of selecting
a minimum set of the most effective input variables would be valuable [200].
As has been discussed in previous sections, the electricity price depends on a
large set of fundamental drivers, including system loads, weather variables
(temperatures, wind speed, precipitation, solar radiation), fuel costs (oil and
natural gas and to a lesser extent coal), reserve margin (surplus generation, i.e.,
available generation minus/over predicted demand) and the scheduled
maintenance or forced outages of important power grid components. Their
historical (past) values and (market or expert) predictions for the forecasting
horizon considered are valuable for the construction and proper calibration of the
models [200].
The overall goal of feature selection is three-fold [236]:
◦ Model simplification, to make them more easily interpretable.
◦ Reduce the curse of dimensionality, which refers to the increase in
computation time as we go toward very large data sets.

◦ Reduce overfitting, namely the risk that a model corresponds too closely or
exactly to a particular set of data and fails to fit additional data or predict
future observations reliably.
There are three families of feature selection methods: filter, wrapper and
embedded methods [143].
Filter methods apply some statistical measure to assess the importance of
features. Their main disadvantage is that, as the specific model performance is
not evaluated and the relations between features are not considered, they may
select redundant information or miss some important features. Their
main advantage is that, as a model does not have to be estimated, they are very
fast [212].
By contrast, wrapper methods perform a search across several feature sets,
evaluating the performance of a given set by first estimating the prediction model
and then using the predictive accuracy of the model as the performance measure
of the set [208]. Their main advantage is that they consider a more realistic
evaluation of the performance and interrelations of the features; their drawback
is a long computation time.
Embedded methods, e.g. regularisation, learn the feature selection at the same
time the model is estimated. Their advantage is that, while being less
computationally expensive than wrapper methods, they still consider the
underlying model. However, a drawback is that they are specific to a learning
algorithm and thus, they cannot always be applied.
In recent years, LASSO has emerged as a valuable feature selection technique for
EPF [220]. Uniejewski et al. reported an empirical study involving
state-of-the-art parsimonious expert models as benchmarks, data sets from three
major power markets and five classes of automated selection and shrinkage
procedures; LASSO and elastic nets have proved to bring significant accuracy
gains [163]. However, most of the LASSO-related algorithms focus on variable
selection and shrinkage for linear problems and there are few works studying
nonlinear variable selection with LASSO [145].
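A LASSO-based selection step of the kind reported above can be sketched with scikit-learn; the data is synthetic, and the target is constructed so that only three of fifty features are relevant.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n_samples, n_features = 200, 50
X = rng.standard_normal((n_samples, n_features))
# the target depends on only three of the fifty features, plus small noise
y = (3.0 * X[:, 0] - 2.0 * X[:, 7] + 1.5 * X[:, 21]
     + 0.1 * rng.standard_normal(n_samples))

X_scaled = StandardScaler().fit_transform(X)  # LASSO is scale-sensitive
model = Lasso(alpha=0.1).fit(X_scaled, y)

# features with non-zero coefficients survive the L1 shrinkage
selected = np.flatnonzero(model.coef_)
print(selected)  # expected to recover features 0, 7 and 21
```

The regularisation strength `alpha` plays the role of the shrinkage parameter: larger values yield sparser (smaller) feature sets, and in practice it would be chosen by cross-validation.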
It is important to remember the particular challenges of electricity price
forecasting, namely the high amount of data and the non-linearity, which
restricts the usage of some linear analysis models (correlation analysis, principal
component analysis, etc.) [194]. Some information-theoretic, non-linear
approaches have been proposed to tackle this, such as mutual information (MI)
[144, 222] and the relief algorithm [223] which are well-known filters. But
according to Zhang et al. they fail to evaluate the importance of alternative
variables due to the independence of the filter methods from the forecasting
model [194].
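A mutual-information filter can be sketched with scikit-learn's `mutual_info_regression`; the synthetic target below depends non-linearly on a single feature, the kind of relationship a purely linear correlation filter would largely miss.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

rng = np.random.default_rng(7)
X = rng.standard_normal((300, 10))
# non-linear dependence on feature 3 only; sin(3x) is nearly uncorrelated
# with x, so a linear correlation filter would score it close to zero
y = np.sin(3.0 * X[:, 3]) + 0.05 * rng.standard_normal(300)

mi = mutual_info_regression(X, y, random_state=0)
ranking = np.argsort(mi)[::-1]   # features sorted by estimated MI with y
print(ranking[:3])               # feature 3 should rank first
```

Being model-free, the filter only scores each feature against the target; as noted above, it cannot account for redundancy between the selected features themselves.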
According to Yang et al., hybrid models have caught the attention of researchers
in recent years, for their ability to aggregate the benefits from each method [224].
Abedinia et al. propose a filter-wrapper feature selection for load and price
forecasts. The filter model is able to select a minimum subset of features with
regards to relevancy, redundancy and synergy, which is the interaction between
the input parameters. Subsequently, they applied a wrapper to fine-tune the
results from the first step [225].

10.2.3 The case of intra-day


Across Europe, there has been a shift in volumes from DA to intra-day [226].
However, the literature regarding intra-day EPF is still scarce, as most of it has
focused on establishing day-ahead forecasting models [227]. But there are a few
significant differences in the explanatory factors compared to the day-ahead
market.
The impact of renewables on intra-day prices has been widely studied and is
considered strong [228]. Kulakov et al. have found that in the German EPEX
market, the reserve margins (the excess capacity compared to the demand),
although having a great impact on day-ahead price forecasting, only had a
marginal effect on intra-day electricity prices, as do the trade values of the
day-ahead markets [229].
However, studies made by Monteiro et al. and Andrade et al. argue that market
fundamentals, such as generation and weather, did not play an important role in
the intra-day forecasts on the Spanish market, as DA prices already included
such information [?].
A notable difference in intra-day markets is that prices are not set once a day,
because information is continuously updated (e.g. weather forecasts, trading
decisions). This enables forecasters to continuously update their models upon
more accurate, real-time data [232]. Combining the naïve forecast with a
LASSO model, Marcjasz et al. have assessed that using more recent forecasts
reduces forecasting errors by more than 2 %. Their inclusion
of exogenous variables in addition to their baseline model with lagged versions of
intra-day and day-ahead prices is motivated by the results of [163], who stressed
that fundamental variables had an explanatory value when forecasting day-ahead
electricity prices. It contradicted the observations made by Monteiro et al.
regarding the impact of market fundamentals.
Feature selection is a particularly important challenge to solve in the case of
intra-day trading because of the large amounts of data available and the
currently lower level of understanding of the price movements. Prices are
continuously updated for all hours of the day along with the trading decisions of
generators and loads and the updated weather forecasts. Compared with the
DA-models where the dispatch is optimised by the TSO, line congestions can
further increase the volatility of prices.

10.2.4 Summary
To summarise the discussion above, feature selection is key to increase the
accuracy and speed of any electricity price forecasting model, and more
specifically in the case of computational intelligence models. To approach this,
researchers have tried filter, wrapper and embedded methods, each having their
advantages and drawbacks. Whereas filter methods such as mutual information
are model-free, scalable and can perform non-linear dimensionality reduction,
they have a lower accuracy level. Embedded and wrapper methods, on the other
hand, can contribute to more accurate predictions, but can become too
computationally intensive. To cope with large dimensions of data, sparse
regularisation methods have been employed successfully, particularly LASSO and
elastic nets. However, they can still lack in performance for non-linear mappings.
Hybrid models are receiving increasing attention for their ability to aggregate the
inherent benefits of each method, for instance filter-wrapper methods.
Feature selection is particularly important in the context of intra-day trading,
given the large amounts of data available, the higher volatility of prices and the
impact of fundamental factors such as line congestions. As trade volumes are
being shifted to the intra-day market, research on the mechanisms in this market
is still lagging and the varying conclusions inferred by the models proposed in
literature suggest that there is a need to better understand the price dynamics.
Graphs have a reported value for enriching computational intelligence models,
either by modelling neural networks through graphs or by pre-processing the
high amounts of chaotic data for feature selection purposes through graph
embeddings. Graph embeddings are able to reduce the dimension of data sets.
They can be applied directly on graph structures, but can also be employed on
regular data sets after they have been converted into a graph. This procedure
will be employed in this research and described in further detail in the coming
section.

10.3 Motivations on the proposed methodology


Based on the above findings, a model rooted in the concept of hybrid models is
proposed to benefit from the value of graphs as an initial filtering step for very
large data sets to capture information from the internal structure of the data.
This initial step can be integrated with more accurate feature selection processes
embedded in the forecasting engine. The recent successes of sparse regularisation
motivate our selection of algorithms, which in fact include a sparse regularisation
step.
The aim is to apply five graph-based feature selection algorithms, presented in
the section below, on a large intra-day data set. The algorithms will be evaluated
by three metrics assessing the quality of dimensionality reduction (NMI, AMI,
ACC), by comparing their clustering accuracy with the Baseline scenario. This
evaluation will be benchmarked with open-source data sets, in order to verify the
generalisability of the feature selection.
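Two of the evaluation metrics mentioned here (NMI and AMI) are available in scikit-learn; a minimal sketch of comparing two hypothetical cluster label assignments:

```python
from sklearn.metrics import (adjusted_mutual_info_score,
                             normalized_mutual_info_score)

# hypothetical cluster labels: a reference grouping vs. the labels obtained
# after clustering on the reduced feature set
true_labels      = [0, 0, 0, 1, 1, 1, 2, 2, 2]
predicted_labels = [0, 0, 1, 1, 1, 1, 2, 2, 2]

nmi = normalized_mutual_info_score(true_labels, predicted_labels)
ami = adjusted_mutual_info_score(true_labels, predicted_labels)
print(round(nmi, 3), round(ami, 3))
```

Both scores equal 1 for identical partitions; AMI additionally corrects for chance agreement, so it does not exceed NMI and is more reliable when the number of clusters is large.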
The data set includes hourly intra-day variables from the year 2019 containing
different lags of the price, system load, available generation, fuel costs, calendar
indicators, transmission congestion, as well as combinations of the above
(synthetic features). After the pre-processing and normalising stages, the data
was partitioned into approximately week-long batches.
If the dimensionality reduction achieves satisfactory results, this procedure could
be applied as a pre-processing step to the prediction engine for processing large to
very large data sets. The actual prediction performance with the new feature set
is outside the scope of this study, as the primary purpose is to identify whether
the proposed feature selection approach can perform well in this context.

CHAPTER 11
Introduction to feature selection

11.1 Background
Large-scale data with many variables, also called features, come with both
rich information and challenges in engineering modelling problems. Not only
do high-dimensional data cause long computation times and memory errors for
computing systems, they also lead to a problem called overfitting. This means
that the mathematical model is fitted too closely to many variables, some of
which are noisy, inaccurate or irrelevant. A consequence is that the model
is too complicated to analyse and generalise. It is therefore suitable to reduce the
number of features before the modelling phase, for example by removing the
redundant features and keeping the most representative ones.
Although feature selection can be done by empirical expert knowledge in the
respective fields, this approach can be loaded with personal biases, unclear
evaluation measures, time-consuming manual work and most of all uncertainty
when discriminating the importance of different features. Various methods for
automatic feature selection have been proposed to solve this challenge. Some
typical methods in traditional regression analysis are stepwise regression, Akaike
information criterion and principal component analysis [237]. In information
theory, features can be selected using informative fragments [238], conditional
infomax learning [239] and fast correlation-based filter [240], among many other
methods.

In the task of feature selection, a major contribution of a graph representation is
to capture the non-linear inherent geometric structure of the data, also known as
the data manifold [241]. In many real-life applications, high dimensional data
points are actually samples from a low-dimensional manifold which is a subspace
of a high-dimensional space [242]. In other words, it is usually assumed that a
data point given by hundreds of features can be described by fewer features while
preserving the structure of the original data manifold. This is the intuition
behind several graph-based methods for feature selection. A feature is considered
good if it in some sense respects the data manifold.

11.2 Purpose and outline


In this thesis, five graph-based methods for feature selection, proposed in earlier
research, are studied to understand how a feature selection problem can be
formulated and solved as a mathematical optimisation problem. The methods
start with constructing a graph where each vertex represents a data sample and
the edge weights show the similarities between the samples. Criteria for good
features are then formulated as a minimisation problem which ultimately results
in a ranking where the top features are to be selected. Different approaches have
different views on how a feature is considered good, which will be described in
detail in Chapter 13.
The presentation outline is as follows. Chapter 12 presents the definitions and
properties of some important concepts in graph theory and information theory.
In Chapter 13, five feature selection methods are presented with their aim,
construction of data graph, optimisation problem and the whole idea is
summarised with an algorithm, whose output is a suggested ranking of all
features. The algorithms are then experimented on several data sets with at least
500 features, collected from public online databases and the internal database of
Fortum. The results are reported in Chapter 14 and discussed in Chapter 15.

CHAPTER 12
Preliminaries

In this chapter, some mathematical notions recurrently used in
feature selection are briefly presented. How these notions relate to selecting
features and assessing the selected features will be described in detail in
Chapter 13 and Chapter 14.

12.1 Laplacian matrix


Given an undirected graph G = (V, E, W) with m vertices v1 , · · · , vm , the degree
matrix D of G is the m × m-matrix defined as

deg(v
i) if i = j

Dij =
0
 if i 6= j

where deg(vi ) is the degree of the vertex vi . In other words,


D = diag deg(v1 ), · · · , deg(vm ) , where diag(v) denotes the diagonal


m × m-matrix with the elements of the vector v ∈ Rm on the main diagonal.


Recall that the adjacency matrix A of G is the m × m-matrix defined as

1

if vertices vi and vj are adjacent
Aij =
0
 otherwise

118
12.2. NEAREST NEIGHBOUR GRAPH

Then, the unweighted Laplacian matrix L of G is the symmetric m × m matrix
defined as

$$L = D - A$$

If the edge weights given by the weight matrix W are considered, the formula
above is modified to

$$L = \operatorname{diag}(W\mathbf{1}) - W$$

where $\mathbf{1} = [1 \ 1 \ \cdots \ 1]^T \in \mathbb{R}^m$.
By construction, L is a symmetric matrix. It has also been shown that L is
positive semidefinite and therefore has real, non-negative eigenvalues. In spectral
graph theory, the spectrum¹ of L can be used to understand the underlying data
structure of a graph, for example its connectivity [243] and cluster structure [244].
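As a quick illustration (not part of the thesis), the Laplacian can be computed directly from a weight matrix; a minimal NumPy sketch, where the function name is our own:

```python
import numpy as np

def laplacian(W):
    """Weighted Laplacian L = diag(W 1) - W of a symmetric weight matrix W."""
    d = W.sum(axis=1)      # degree vector W @ 1
    return np.diag(d) - W

# Unweighted example: path graph v1 - v2 - v3, so W is the 0/1 adjacency matrix
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
L = laplacian(A)
# L is symmetric and positive semidefinite, so its eigenvalues are real and >= 0
eigenvalues = np.linalg.eigvalsh(L)
```

Every row of L sums to zero, which is why the eigenvalue 0 always appears with the all-ones eigenvector.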

12.2 Nearest neighbour graph


Assume a graph G with m vertices representing the m points x_1, ..., x_m ∈ R^n,
where edges between the vertices need to be defined. In applications
where one wishes to link a vertex to its most similar vertices, given some measure
of similarity, it is reasonable to connect two vertices v_i and v_j with an edge if
the corresponding points x_i and x_j are close to each other according to some
distance metric. A common metric is the Euclidean distance

$$\|x_i - x_j\| = \sqrt{(x_{i1} - x_{j1})^2 + (x_{i2} - x_{j2})^2 + \cdots + (x_{in} - x_{jn})^2}$$

There are two suggested ways of defining the edges [245]:

1. Let v_i and v_j be connected if the distance between x_i and x_j is small
enough, that is, if ‖x_i − x_j‖ < ε for some chosen threshold ε > 0.
Although this version seems intuitive, it is difficult to choose ε for different
sets of points and the resulting graph is often disconnected.
2. Let v_i and v_j be connected if v_i is among the p nearest neighbours of v_j or
if v_j is among the p nearest neighbours of v_i. Although this version is less
intuitive than the previous one, it is easier to choose p and the resulting
graph is almost always connected. In the rest of this paper, this kind of
graph will be called the p-nearest neighbour graph².
¹ The spectrum of a matrix is another word for the set of the eigenvalues of the matrix.
² That is to say, not “p-nearest neighbours graph”.
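The second construction can be sketched in a few lines of NumPy (our own code, not from the thesis; for large m, libraries such as scikit-learn provide faster neighbour searches):

```python
import numpy as np

def p_nearest_neighbour_graph(X, p):
    """0/1 adjacency of the p-nearest neighbour graph: v_i ~ v_j if either
    point is among the p nearest neighbours of the other (Euclidean distance)."""
    m = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(D2, np.inf)         # a point is not its own neighbour
    A = np.zeros((m, m))
    for i in range(m):
        A[i, np.argsort(D2[i])[:p]] = 1  # mark the p nearest neighbours of v_i
    return np.maximum(A, A.T)            # the "or" rule makes the graph undirected

rng = np.random.default_rng(0)
A = p_nearest_neighbour_graph(rng.normal(size=(20, 3)), p=5)
```

The symmetrisation in the last step is what makes each vertex end up with at least p neighbours.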


12.3 Graph clustering


In some applications in data analysis, the m vertices in a graph G representing m
data vectors need to be grouped into κ disjoint clusters C_1, ..., C_κ. The grouping
should be such that vertices assigned to the same cluster are more similar to one
another than to vertices assigned to other clusters, given some measure of similarity.
A clustering of a graph can be described with a cluster label vector y ∈ R^m such
that

$$y_i = C_j$$

if the vertex v_i is assigned to the cluster C_j, for i = 1, ..., m and j = 1, ..., κ. A
clustering can alternatively be described with a cluster indicator matrix
Y ∈ {0, 1}^(m×κ) such that

$$Y_{ij} = \begin{cases} 1 & \text{if vertex } v_i \text{ is assigned to cluster } C_j \\ 0 & \text{otherwise} \end{cases}$$

There are different ways to perform a clustering on a set of vertices without
necessarily considering what the corresponding data stand for in real life. Some
common methods are k-means clustering³, mean shift clustering and expectation
maximisation using Gaussian mixture models.

12.4 Comparison of two clusterings


A typical situation where two clusterings of the same vertex set with m vertices
need to be compared is when one clustering is manually defined and the other is
computed by an algorithm. Two commonly used comparison metrics are
clustering accuracy (ACC) and normalised mutual information (NMI). In this
paper, a third metric called adjusted mutual information (AMI) is also included.
The range of all three metrics is [0, 1]. In general, the higher ACC, NMI and
AMI are, the higher the clustering quality.
³ The k in “k-means clustering” is an inherent part of the name and is not the same as the k in Table 13.1.


12.4.1 Clustering accuracy


Assume two cluster label vectors y and ỹ describing two clusterings C and C̃
respectively. The corresponding clustering accuracy is defined as

$$\mathrm{ACC}(\mathcal{C}, \tilde{\mathcal{C}}) = \frac{1}{m} \sum_{i=1}^{m} \delta\left(\tilde{y}_i, \mathrm{Map}(y_i)\right)$$

where Map is the permutation mapping function which maps each cluster label
y_i to an equivalent label in ỹ, and the Kronecker delta function δ is defined as

$$\delta(a, b) = \begin{cases} 1 & \text{if } a = b \\ 0 & \text{if } a \neq b \end{cases}$$

The latter function is named after the German mathematician Leopold
Kronecker (1823–1891)⁴.
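The Map function is in practice computed with the Hungarian algorithm on the confusion matrix of the two label vectors; a sketch using SciPy (our own code, not from the thesis):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def clustering_accuracy(y, y_tilde):
    """ACC: fraction of agreeing labels after the best permutation mapping
    (Map) between the cluster labels of y and y_tilde."""
    labels_a, labels_b = np.unique(y), np.unique(y_tilde)
    # confusion[i, j] = number of samples with y == labels_a[i] and y_tilde == labels_b[j]
    confusion = np.array([[np.sum((y == a) & (y_tilde == b)) for b in labels_b]
                          for a in labels_a])
    rows, cols = linear_sum_assignment(-confusion)   # maximise total agreement
    return confusion[rows, cols].sum() / len(y)

y  = np.array([0, 0, 1, 1, 2, 2])
yt = np.array([1, 1, 2, 2, 0, 0])   # the same partition, relabelled
acc = clustering_accuracy(y, yt)    # ACC = 1.0
```

Relabelling the clusters does not change ACC, which is exactly the role of Map in the definition above.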

12.4.2 Normalised mutual information


In order to define normalised mutual information, two fundamental concepts
from the field of information theory, entropy and mutual information, need to be
introduced first.
Assume a random variable X with outcome space Ω_X and probability mass
function p_X(x). The entropy, also known as information entropy, of X is denoted
H(X) and defined as

$$H(X) = -\sum_{x \in \Omega_X} p_X(x) \log_2 p_X(x)$$

A way to interpret this quantity is as how unexpected, uncertain and
unpredictable the information contained in X is [246]. Interestingly enough, the
expression for the entropy does not depend on the values of X but rather on the
probability distribution of X.
For two random variables X and Y, the mutual information between them,
I(X, Y), is defined as

$$I(X, Y) = \sum_{y \in \Omega_Y} \sum_{x \in \Omega_X} p_{(X,Y)}(x, y) \log_2\left(\frac{p_{(X,Y)}(x, y)}{p_X(x)\, p_Y(y)}\right)$$


⁴ Kronecker is famous for having said “Die ganzen Zahlen hat der liebe Gott gemacht, alles andere ist Menschenwerk.”, meaning “God made the integers, all else is the work of man.”, among other things.


where Ω_Y is the outcome space of Y, p_Y(y) the probability mass function of Y
and p_(X,Y)(x, y) the joint probability mass function of X and Y. Intuitively, the
mutual information of two random variables tells how much one can know about
one variable given information about the other [247].
The normalised mutual information of two clusterings C and C̃ with cluster label
vectors y and ỹ respectively can now be defined as

$$\mathrm{NMI}(\mathcal{C}, \tilde{\mathcal{C}}) = \frac{2\, I(y, \tilde{y})}{H(y) + H(\tilde{y})}$$

where I(y, ỹ) is the mutual information between y and ỹ and H(y) the
information entropy of y.

12.4.3 Adjusted mutual information


This version of mutual information contains an adjustment for situations
where two clusterings agree by coincidence, which is not accounted for in the
normalised mutual information [248]. The adjusted mutual information of two
clusterings C and C̃ is defined as

$$\mathrm{AMI}(\mathcal{C}, \tilde{\mathcal{C}}) = \frac{I(y, \tilde{y}) - E\left[I(y, \tilde{y})\right]}{\max\left(H(y), H(\tilde{y})\right) - E\left[I(y, \tilde{y})\right]}$$

where E[I(y, ỹ)] denotes the expectation of the mutual information between C
and C̃.
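Both entropy-based metrics can be computed from empirical label distributions. A small self-contained sketch for H, I and NMI (our own code; the expectation term in AMI is more involved, and in practice a library implementation such as scikit-learn's `adjusted_mutual_info_score` is used):

```python
import numpy as np

def entropy(y):
    """H(y) = -sum_c p(c) log2 p(c) for the empirical distribution of labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return float(-np.sum(p * np.log2(p)))

def mutual_information(y, yt):
    """I(y, yt) from the empirical joint distribution of two label vectors."""
    y, yt = np.asarray(y), np.asarray(yt)
    mi = 0.0
    for a in np.unique(y):
        for b in np.unique(yt):
            p_ab = np.mean((y == a) & (yt == b))
            if p_ab > 0:
                mi += p_ab * np.log2(p_ab / (np.mean(y == a) * np.mean(yt == b)))
    return mi

def nmi(y, yt):
    """NMI = 2 I(y, yt) / (H(y) + H(yt))."""
    return 2 * mutual_information(y, yt) / (entropy(y) + entropy(yt))

# Identical partitions with different label names still give NMI = 1
score = nmi([0, 0, 1, 1], [1, 1, 0, 0])
```

For independent labelings the mutual information is zero, so NMI is zero as well.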

12.5 Similarity comparison of two sets


Assume two non-empty sets A and B, not necessarily with equally many elements.
A measure of how similar these sets are is the Jaccard index J, defined as

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$

or in other words, the ratio between the number of elements in common and the
total number of distinct elements. This index is named after the French
professor of botany Paul Jaccard (1868–1944).
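In code, the index is a one-liner on Python sets (our own sketch):

```python
def jaccard(A, B):
    """Jaccard index |A intersect B| / |A union B| of two non-empty sets."""
    A, B = set(A), set(B)
    return len(A & B) / len(A | B)

j = jaccard({1, 2, 3, 4}, {3, 4, 5})   # |{3, 4}| / |{1, 2, 3, 4, 5}| = 0.4
```

The index is 1 exactly when the sets are equal and 0 when they are disjoint.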

CHAPTER 13
Feature selection methods

Five graph-based methods for feature selection with different points of view are
presented in this chapter. Given a data set with m samples and n features, the
aim is to select the best k ≪ n features according to certain criteria. In
chronological order, the five methods are as follows:
1. Laplacian score (LS), proposed 2005 [249]: This method favours the
features which have large variances and preserve the local manifold
structure of the data.
2. Multi-cluster feature selection (MCFS), proposed 2010 [251]: This method
favours the features which preserve the multiple cluster structure of the
data. The optimisation problem involves spectral regression with ℓ1-norm
regularisation.
3. Non-negative discriminative feature selection (NDFS), proposed 2012 [252]:
This method favours the most discriminative features. The optimisation
problem involves non-negative spectral analysis and regression with
ℓ2,1-norm regularisation.
4. Feature selection via non-negative spectral analysis and redundancy control
(NSCR), proposed 2015 [253]: This method is an extension of NDFS which
additionally controls the redundancy between features. The optimisation
problem involves non-negative spectral analysis and regression with
ℓ2,q-norm regularisation. Abbreviation: “NS” for “non-negative spectrum”
and “CR” for “constrained redundancy”.
5. Feature selection via adaptive similarity learning and subspace clustering
(SCFS), proposed 2019 [250]: This method favours the most discriminative
features which also preserve the similarity structure of the data. The
optimisation problem involves regression with ℓ2,1-norm regularisation.
Abbreviation: “SC” for “subspace clustering”.
Of all five methods, Laplacian score is the only one which involves neither
iterations, regression, regularisation nor clustering of the data.
For simplicity, some notations will be used consistently in the descriptions of all
methods according to Table 13.1.

X ∈ R^(m×n)                  data matrix
m                            number of data samples
n                            number of original features
k                            number of selected features
f_1, f_2, ..., f_n           the n original features
x_1, x_2, ..., x_m ∈ R^n     the m sample vectors
f_1, f_2, ..., f_n ∈ R^m     the n feature vectors

Table 13.1: Notations associated with a given data set.

Furthermore, bold uppercase letters A, B, C, ... in general denote matrices
and bold lowercase letters a, b, c, ... denote vectors. For an arbitrary matrix
A, the notation with two subscript indices A_ij means the element at row i and
column j of A, the notation with one subscript index a_i means the row vector at
row i of A, while the notation with one superscript index A^t means A at
iteration step t.
The notation ‖x‖ always refers to the Euclidean norm of the vector x, while
‖A‖_F means the Frobenius norm of the matrix A. For an arbitrary matrix
A ∈ R^(m×n) with real elements, the Frobenius norm is defined as

$$\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} A_{ij}^2} = \sqrt{\operatorname{tr}(A^T A)}$$

where the trace tr(A^T A) is the sum of the main diagonal elements of A^T A.


13.1 Laplacian score (LS)


The main idea of this approach is to rank the features according to their power
of preserving locality, by a number called Laplacian score. Selecting the k best
features is equivalent to selecting the k features with the lowest Laplacian scores
[249].

13.1.1 Preparation
Create the p-nearest neighbour graph G = (V, E, W) with m vertices
corresponding to the m sample vectors x_1, x_2, ..., x_m ∈ R^n, for some choice
of p. There are two ways to define the edge weight matrix W [245]:
1. Let W_ij = 1 if the vertices v_i and v_j are adjacent and W_ij = 0 otherwise.
2. Let

$$W_{ij} = \psi(x_i, x_j) = \begin{cases} e^{-\omega \|x_i - x_j\|^2} & \text{if } v_i \text{ and } v_j \text{ are adjacent} \\ 0 & \text{otherwise} \end{cases}$$

for some chosen parameter ω > 0 which needs tuning in practice.

The latter suggested edge weight matrix reflects the similarity between the data
samples. The smaller the distance ‖x_i − x_j‖, the larger W_ij and the more similar
the data points x_i and x_j are. As explained in [245], this particular choice of W
is advantageous due to a related partial differential equation about heat
distribution, where ψ is also called the heat kernel. A motivated choice for the
parameter ω is ω = (ln 2)/d̄, where d̄ is the arithmetic mean value of the
distances ‖x_i − x_j‖ [254]. By construction, W is a symmetric matrix.
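The heat-kernel weights and the suggested choice of ω can be sketched as follows (our own NumPy code; a dense toy computation, not tuned for large m):

```python
import numpy as np

def heat_kernel_weights(X, A):
    """W_ij = exp(-omega ||x_i - x_j||^2) on the edges given by the 0/1
    adjacency A, with omega = ln(2) / mean pairwise distance as in the text."""
    m = X.shape[0]
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    dist = np.sqrt(D2)
    d_bar = dist[~np.eye(m, dtype=bool)].mean()   # mean of ||x_i - x_j||, i != j
    omega = np.log(2) / d_bar
    return np.where(A > 0, np.exp(-omega * D2), 0.0)

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 2))
A = np.ones((10, 10)) - np.eye(10)   # toy case: complete graph on 10 vertices
W = heat_kernel_weights(X, A)
```

Since A and the pairwise distances are symmetric, W is symmetric, with entries in (0, 1] on the edges and 0 elsewhere.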

13.1.2 Optimisation problem


When the graph has been constructed, the next step is to define the requirements
for the best features. In this approach, a good feature should preserve the local
structure of the data manifold, meaning that two data points should be close to
each other along the feature only if the corresponding vertices are adjacent.
This leads to an optimisation problem where

$$(\star): \quad \frac{\sum_{i,j=1}^{m} \left((f_r)_i - (f_r)_j\right)^2 W_{ij}}{\operatorname{Var}(f_r)}$$


shall be minimised. The weight W_ij in the expression ((f_r)_i − (f_r)_j)² W_ij can be
interpreted as a kind of penalty to ensure the locality-preserving ability of a
feature. To be more concrete, the closer x_i and x_j are to each other, meaning the
larger W_ij is, the smaller ((f_r)_i − (f_r)_j)² should be. With Var(f_r), denoting
the weighted variance of the feature f_r, in the denominator, LS(f_r) is minimised
by maximising Var(f_r). This also allows the most representative features to be
selected.
Algebraic simplification: To simplify (⋆), use the re-writings

$$\sum_{i,j=1}^{m} \left((f_r)_i - (f_r)_j\right)^2 W_{ij} = \sum_{i,j=1}^{m} \left( \left[(f_r)_i\right]^2 + \left[(f_r)_j\right]^2 - 2 (f_r)_i (f_r)_j \right) W_{ij}$$

$$= 2 \sum_{i,j=1}^{m} \left[(f_r)_i\right]^2 W_{ij} - 2 \sum_{i,j=1}^{m} (f_r)_i W_{ij} (f_r)_j = 2 f_r^T D f_r - 2 f_r^T W f_r = 2 f_r^T L f_r$$

and

$$\operatorname{Var}(f_r) \approx \sum_{i=1}^{m} \left[(f_r)_i - \mu_r\right]^2 D_{ii} = \sum_{i=1}^{m} \left[(f_r)_i - \frac{f_r^T D \mathbf{1}}{\mathbf{1}^T D \mathbf{1}}\right]^2 D_{ii}$$

shown in [245], where $\mathbf{1} = [1 \ 1 \ \cdots \ 1]^T$, D = diag(W1) and L = D − W is
the Laplacian matrix of the graph G.
the Laplacian matrix of the graph G.
Further improvement: As pointed out in [249], there is a risk that the vector
f_r is a non-zero constant vector such as 1. This leads to

$$f_r^T L f_r = \mathbf{1}^T L \mathbf{1} = 0 \quad \Rightarrow \quad (\star): \ \frac{\sum_{i,j=1}^{m} \left((f_r)_i - (f_r)_j\right)^2 W_{ij}}{\operatorname{Var}(f_r)} = \frac{2 f_r^T L f_r}{\operatorname{Var}(f_r)} = 0$$

which gives no information about the feature f_r. This problem can be avoided by
introducing

$$\tilde{f}_r = f_r - \frac{f_r^T D \mathbf{1}}{\mathbf{1}^T D \mathbf{1}} \mathbf{1}$$

which gives

$$f_r^T L f_r = \tilde{f}_r^T L \tilde{f}_r, \qquad \operatorname{Var}(f_r) \approx \tilde{f}_r^T D \tilde{f}_r$$

Hence, the optimisation problem is about minimising the quotient

$$(\star): \quad \frac{2 \tilde{f}_r^T L \tilde{f}_r}{\tilde{f}_r^T D \tilde{f}_r}$$


which is equivalent to minimising the objective function

$$\mathrm{LS}(f_r) = \frac{\tilde{f}_r^T L \tilde{f}_r}{\tilde{f}_r^T D \tilde{f}_r}$$

where LS(f_r) stands for the Laplacian score of the feature f_r. At this point, one
can note that a good feature f_r has a low Laplacian score. An algorithm for
ranking the features according to their Laplacian scores can now be formulated.

13.1.3 Feature selection algorithm


Given a data matrix X ∈ R^(m×n) with m samples and n features, the best k
features can be found as follows:
1. Construct a p-nearest neighbour graph G with m vertices corresponding to
the m data vectors as described in 13.1.1.
2. Compute the weight matrix W ∈ R^(m×m), in this case also known as the
similarity matrix or affinity matrix, with the heat kernel as described in 13.1.1.
3. For each feature f_r with corresponding feature vector f_r, define

$$\tilde{f}_r = f_r - \frac{f_r^T D \mathbf{1}}{\mathbf{1}^T D \mathbf{1}} \mathbf{1}$$

where $\mathbf{1} = [1 \ 1 \ \cdots \ 1]^T \in \mathbb{R}^m$ and D = diag(W1).
4. Compute the Laplacian matrix L = D − W of the graph G and the
Laplacian score of each feature f_r as

$$\mathrm{LS}(f_r) = \frac{\tilde{f}_r^T L \tilde{f}_r}{\tilde{f}_r^T D \tilde{f}_r}$$

5. Select the k features with the lowest Laplacian scores.
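The five steps translate almost line by line into NumPy (our own sketch, not from the thesis; `W` can come from either weighting scheme in 13.1.1):

```python
import numpy as np

def laplacian_scores(X, W):
    """Laplacian score of each feature (column of X) given a symmetric
    weight matrix W; lower score = better locality-preserving feature."""
    m, n = X.shape
    d = W.sum(axis=1)                      # D = diag(W 1)
    D, L = np.diag(d), np.diag(d) - W
    scores = np.empty(n)
    for r in range(n):
        f = X[:, r]
        f_t = f - (f @ d) / d.sum()        # f~_r = f_r - (f_r^T D 1 / 1^T D 1) 1
        den = f_t @ D @ f_t
        scores[r] = (f_t @ L @ f_t) / den if den > 0 else np.inf
    return scores

# Two 3-cliques; feature 0 is constant on each clique (score 0), feature 1 is not
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
X = np.array([[0, 0], [0, 1], [0, 0], [1, 1], [1, 0], [1, 1]], dtype=float)
scores = laplacian_scores(X, W)
best = np.argsort(scores)[:1]   # select the k = 1 best feature
```

A feature that is constant within each graph cluster attains the minimal score 0, which is exactly the locality-preservation property the method rewards.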

13.2 Multi-cluster feature selection (MCFS)


This approach ranks the features according to their power of preserving the
cluster structure of the data as well as covering all the possible clusters [251].
The higher the score MCFS(fr ), the better the feature fr .


13.2.1 Preparation
A graph is constructed here in the same way as described in 13.1.1 for the
Laplacian score method. After obtaining a p-nearest neighbour graph and the
Laplacian matrix L = D − W, solve the generalised eigenvalue problem

$$L u = \lambda D u$$

to obtain a flat embedding of the data points. According to [245], the equation
Lu = λDu is derived from an optimisation problem of constructing a
representation for data lying on a low-dimensional manifold embedded in a
high-dimensional space, where local neighbourhood information is preserved.
Let U = [u_1, ..., u_κ] be a matrix containing the eigenvectors of the above
eigenvalue problem corresponding to the κ smallest eigenvalues. Here,
κ can be interpreted as the intrinsic dimensionality of the data, and each
eigenvector reflects how the data is distributed along the corresponding
dimension, or in other words the corresponding cluster [251]. If the number of
clusters of the data set is known, κ can be chosen to be this number.

13.2.2 Optimisation problem


For each vector u_i in U, where i = 1, ..., κ, find a vector a_i ∈ R^n which
minimises the fitting error

$$\|u_i - X a_i\|^2 + \beta |a_i|$$

where X is the data matrix, $|a_i|$ denotes the sum $\sum_{r=1}^{n} |(a_i)_r|$ and β is some
parameter. This optimisation problem assesses the importance of each
feature in differentiating the clusters. The solution a_i contains the combination
coefficients of the n different features in approximating u_i. One can then select
the most relevant features with respect to u_i, namely the features
corresponding to the non-zero coefficients in a_i.
Using the ℓ1-norm regularisation term β|a_i| as a penalty, a sparse
vector a_i is obtained if the parameter β is chosen large enough. The
larger β is, the more elements in a_i are forced to shrink to zero. The
sparsity of a_i helps prevent the case where too many features get selected, some
of which may be noisy or irrelevant.


For each feature f_r, define its multi-cluster feature selection score as

$$\mathrm{MCFS}(f_r) = \max_i \left|(a_i)_r\right|$$

The k of n features which should be selected according to this approach are
those with the highest multi-cluster feature selection scores.

13.2.3 Feature selection algorithm


Given a chosen number of clusters κ, by default κ = 5, the following algorithm
returns the k best features according to their multi-cluster feature selection
scores.
1. Construct a p-nearest neighbour graph as described in 13.1. Compute the
matrices D and W as described in the same section and the Laplacian
matrix L = D − W.
2. Solve the generalised eigenvalue problem

$$L u = \lambda D u$$

and obtain the κ eigenvectors u_1, ..., u_κ corresponding to the κ smallest
eigenvalues.
3. Solve the minimisation problem

$$\min_{a_i} \ \|u_i - X a_i\|^2 + \beta |a_i|$$

for each u_i and obtain the κ vectors {a_i}.
4. Compute the score of each feature as

$$\mathrm{MCFS}(f_r) = \max_i \left|(a_i)_r\right|$$

5. Select the k features with the highest scores.
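A compact sketch of the algorithm (our own code, assuming NumPy/SciPy; the ℓ1-regularised regressions in step 3 are solved here with plain ISTA, a simple proximal-gradient method, rather than the LARS solver used in the original paper):

```python
import numpy as np
from scipy.linalg import eigh

def mcfs_scores(X, W, kappa, beta, n_iter=500):
    """MCFS sketch: embed the data via L u = lambda D u, then fit one sparse
    vector a_i per eigenvector and score each feature by max_i |(a_i)_r|."""
    m, n = X.shape
    d = W.sum(axis=1)
    L, D = np.diag(d) - W, np.diag(d)
    _, U = eigh(L, D, subset_by_index=[0, kappa - 1])  # kappa smallest eigenpairs
    step = 0.5 / np.linalg.norm(X.T @ X, 2)            # ISTA step = 1 / Lipschitz
    A = np.zeros((n, kappa))
    for i in range(kappa):
        a = np.zeros(n)
        for _ in range(n_iter):                        # min ||u_i - X a||^2 + beta |a|_1
            a = a - step * 2.0 * (X.T @ (X @ a - U[:, i]))
            a = np.sign(a) * np.maximum(np.abs(a) - step * beta, 0.0)
        A[:, i] = a
    return np.abs(A).max(axis=1)

# Toy data: feature 0 separates the two graph clusters, features 1-2 are noise
rng = np.random.default_rng(0)
f0 = np.array([1.0] * 10 + [-1.0] * 10)
X = np.column_stack([f0, 0.01 * rng.normal(size=20), 0.01 * rng.normal(size=20)])
W = np.zeros((20, 20))
W[:10, :10] = 1.0
W[10:, 10:] = 1.0
np.fill_diagonal(W, 0.0)
scores = mcfs_scores(X, W, kappa=2, beta=0.01)
```

On this toy example the cluster-aligned feature receives the largest MCFS score, since it alone can reproduce the cluster-indicating eigenvector.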

13.3 Non-negative discriminative feature


selection (NDFS)
This approach aims to select the most discriminative features with the aid of the
cluster labels of the data [252]. In practice, a data set may have well-defined
clusters, commonly known as categories, considering the real-life interpretation of
the data. If this is not the case, pseudo cluster labels can be generated using
cluster detection methods such as the k-means clustering algorithm.

13.3.1 Preparation
A p-nearest neighbour graph G is created as described in 13.1. In this proposed
method, instead of the Laplacian matrix L = D − W, the corresponding
normalised Laplacian matrix

$$\hat{L} = D^{-1/2} (D - W) D^{-1/2}$$

is considered. This particular choice of the Laplacian matrix for the task
of feature selection is not motivated in [252]. However, it is pointed out in [243]
that L̂ in general has theoretical advantages, since many results can be
generalised to all graphs and not only regular graphs.
Also define a cluster indicator matrix Y ∈ {0, 1}^(m×κ) based on a clustering of the
vertices in G, where κ is the number of clusters. Then, the corresponding scaled
cluster indicator matrix F ∈ R^(m×κ) is defined as

$$F = Y \left( Y^T Y \right)^{-1/2}$$

According to this definition, the columns of F are orthonormal since

$$F^T F = \left( Y (Y^T Y)^{-1/2} \right)^T \left( Y (Y^T Y)^{-1/2} \right) = (Y^T Y)^{-1/2} \left( Y^T Y \right) (Y^T Y)^{-1/2} = I$$

where I ∈ R^(κ×κ) is the identity matrix.

13.3.2 Optimisation problem


The idea is to minimise the sum

$$\operatorname{tr}(F^T \hat{L} F) + \alpha \|XZ - F\|_F^2 + \beta \|Z\|_{2,1}$$

over F and Z ∈ R^(n×κ) for some parameters α and β, where the scaled cluster
indicator matrix F and the feature selection matrix Z are optimised simultaneously.
The first term tr(F^T L̂ F) models the local structure of the data
manifold, in a similar manner to what is done in LS. This structure is important
for clustering the data points. The second term α‖XZ − F‖²_F minimises the
fitting error when the original data X is embedded in a low-dimensional subspace
with the transformation matrix Z. In the third term, the ℓ2,1-norm ensures the
row sparsity of Z in order to filter out noisy features. If the feature f_j is not
highly correlated with the pseudo cluster labels described by F, it is likely
that the row z_j will be shrunk to the zero vector if β is large enough.
Since F is a scaled cluster indicator matrix, it is necessary that F ≥ 0. This gives
the optimisation problem

$$\min_{F, Z} \ \operatorname{tr}(F^T \hat{L} F) + \alpha \|XZ - F\|_F^2 + \beta \|Z\|_{2,1} \quad \text{s.t.} \quad F = Y (Y^T Y)^{-1/2} \ \text{and} \ F \geq 0$$

Relaxation: This problem is, however, intractable because the constraint
F = Y(Y^T Y)^(−1/2) forces the elements in F to be discrete values. A relaxed,
tractable continuous optimisation problem can be obtained using the constraint
F^T F = I for orthogonality of F [255]:

$$\min_{F, Z} \ \operatorname{tr}(F^T \hat{L} F) + \alpha \|XZ - F\|_F^2 + \beta \|Z\|_{2,1} \quad \text{s.t.} \quad F^T F = I \ \text{and} \ F \geq 0$$

After one additional relaxation, the problem is simplified further to

$$(\mathrm{e}) \quad \min_{F, Z} \ \phi(F, Z) = \operatorname{tr}(F^T \hat{L} F) + \alpha \|XZ - F\|_F^2 + \beta \|Z\|_{2,1} + \gamma \|F^T F - I\|_F^2 \quad \text{s.t.} \quad F \geq 0$$

where γ is in practice a large constant to make sure the orthogonality constraint
F^T F = I is respected.
Finding optimal Z: The matrix Z for which φ(F, Z) becomes minimal can now
be found by differentiating φ partially with respect to Z and setting the
derivative to zero. For simplification, introduce the auxiliary diagonal matrix G
with the main diagonal elements

$$G_{ii} = \frac{1}{2 \|z_i\|}$$

for i = 1, ..., n. Then, some algebraic re-writings give

$$\frac{\partial \phi}{\partial Z} = 0 \ \Rightarrow \ 2\alpha X^T (XZ - F) + 2\beta G Z = 0 \ \Rightarrow \ Z = \left( \alpha X^T X + \beta G \right)^{-1} \alpha X^T F = E^{-1} X^T F$$

where

$$E := X^T X + \frac{\beta}{\alpha} G$$

Since both X^T X and G are symmetric, E is also symmetric and thus
Z^T = F^T X E^(−1). Substituting this Z^T into φ(F, Z), while using the properties
‖Z‖_{2,1} = tr(Z^T G Z) and ‖A‖²_F = tr(A^T A) for an arbitrary matrix A, yields

$$\alpha \|XZ - F\|_F^2 + \beta \|Z\|_{2,1} = \alpha \operatorname{tr}\left( (XZ - F)^T (XZ - F) \right) + \beta \operatorname{tr}(Z^T G Z)$$

$$= \alpha \operatorname{tr}\left( Z^T X^T X Z - Z^T X^T F - F^T X Z + F^T F \right) + \beta \operatorname{tr}(Z^T G Z)$$

$$= \underbrace{\alpha \operatorname{tr}(F^T F) - 2\alpha \operatorname{tr}(Z^T X^T F)}_{\text{depends on } F} + \underbrace{\alpha \operatorname{tr}(Z^T X^T X Z) + \beta \operatorname{tr}(Z^T G Z)}_{\text{does not depend on } F}$$

$$= \operatorname{tr}\left( F^T \left( \alpha I - 2\alpha X E^{-1} X^T \right) F \right) + \operatorname{tr}\left( Z^T \left( \alpha X^T X + \beta G \right) Z \right)$$

The optimisation problem (e) now has the form

$$\min_{F, Z} \ \phi(F, Z) = \operatorname{tr}(F^T M F) + \gamma \|F^T F - I\|_F^2 + \operatorname{tr}\left( Z^T \left( \alpha X^T X + \beta G \right) Z \right) \quad \text{s.t.} \quad F \geq 0$$

where the matrix M ∈ R^(m×m) stands for M = L̂ + αI − 2αX E^(−1) X^T.


Finding optimal F: The matrix F for which φ(F, Z) becomes minimal can be
computed iteratively with the Lagrange multiplier method. Let Λ be the matrix
containing the Lagrange multipliers Λ_ij corresponding to the constraints F_ij ≥ 0.
This gives the Lagrange function

$$\mathcal{L}(F, \Lambda) = \operatorname{tr}(F^T M F) + \gamma \|F^T F - I\|_F^2 + \operatorname{tr}\left( \Lambda F^T \right)$$

Setting the partial derivative of L with respect to F to zero yields

$$2 M F + 4\gamma F \left( F^T F - I \right) + \Lambda = 0 \ \Rightarrow \ 2 \left( M F + 2\gamma F F^T F - 2\gamma F \right) + \Lambda = 0$$

Combining this with the Karush-Kuhn-Tucker condition Λ_ij F_ij = 0 for
complementary slackness leads to

$$2 \left( M F + 2\gamma F F^T F - 2\gamma F \right)_{ij} F_{ij} + \Lambda_{ij} F_{ij} = 0 \ \Rightarrow \ \left( M F + 2\gamma F F^T F - 2\gamma F \right)_{ij} F_{ij} = 0$$

A reasonable updating rule for iterative computation of F is therefore

$$F_{ij} \leftarrow F_{ij} \, \frac{(2\gamma F)_{ij}}{\left( M F + 2\gamma F F^T F \right)_{ij}}$$

At this point, the whole idea can be summed up in an algorithm.

13.3.3 Feature selection algorithm


For some chosen parameters α, β, γ, κ, k and p, the NDFS algorithm can be
formulated as follows [252]:
1. Construct a p-nearest neighbour graph. Compute the Laplacian matrix
L = D − W and normalise it to L̂ = D^(−1/2)(D − W)D^(−1/2).
2. At the iteration step t = 1, initialise F^t ∈ R^(m×κ) and let G^t ∈ R^(n×n) be the
identity matrix I.
3. Iterate as follows until a chosen convergence criterion is met:

$$E^t = X^T X + \frac{\beta}{\alpha} G^t$$

$$M^t = \hat{L} + \alpha \left( I - 2 X (E^t)^{-1} X^T \right)$$

$$F_{ij}^{t+1} = F_{ij}^t \, \frac{(2\gamma F^t)_{ij}}{\left( M^t F^t + 2\gamma F^t (F^t)^T F^t \right)_{ij}}$$

$$Z^{t+1} = (E^t)^{-1} X^T F^{t+1}$$

$$G^{t+1} = \operatorname{diag}\left( \frac{1}{2\|z_1^{t+1}\|}, \ \cdots, \ \frac{1}{2\|z_n^{t+1}\|} \right)$$

4. Select the k features with the highest values ‖z_i‖, where i = 1, ..., n and z_i
denotes the i-th row of Z.


The convergence criterion can be chosen as

$$\frac{\left| \phi^t - \phi^{t-1} \right|}{\phi^t} < \theta$$

for some threshold θ > 0, meaning that the relative change of the objective value
is below a threshold. According to the convergence analysis in [252], the objective
function φ is decreasing and the algorithm indeed converges.
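The iteration loop can be sketched as follows (our own NumPy code with a fixed iteration count and small numerical safeguards, `eps` and a denominator clamp, that are not part of the thesis; the parameter defaults are arbitrary):

```python
import numpy as np

def ndfs(X, W, kappa, k, alpha=1.0, beta=1.0, gamma=1e6, n_iter=50, seed=0):
    """NDFS sketch: alternate the multiplicative update of F with the closed
    form Z = E^{-1} X^T F, then rank the features by the row norms ||z_i||."""
    m, n = X.shape
    d = W.sum(axis=1)                      # assumes no isolated vertices
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    L_hat = D_isqrt @ (np.diag(d) - W) @ D_isqrt
    rng = np.random.default_rng(seed)
    F = np.abs(rng.normal(size=(m, kappa)))   # non-negative initialisation
    G = np.eye(n)
    eps = 1e-12
    for _ in range(n_iter):
        E = X.T @ X + (beta / alpha) * G
        M = L_hat + alpha * (np.eye(m) - 2.0 * X @ np.linalg.solve(E, X.T))
        denom = np.maximum(M @ F + 2.0 * gamma * F @ (F.T @ F), eps)
        F = F * (2.0 * gamma * F) / denom
        Z = np.linalg.solve(E, X.T @ F)
        G = np.diag(1.0 / (2.0 * np.linalg.norm(Z, axis=1) + eps))
    return np.argsort(-np.linalg.norm(Z, axis=1))[:k]

# Toy check: feature 0 carries the two-cluster structure, features 1-2 are noise
rng = np.random.default_rng(1)
f0 = np.array([1.0] * 10 + [-1.0] * 10)
X = np.column_stack([f0, 0.01 * rng.normal(size=20), 0.01 * rng.normal(size=20)])
W = np.zeros((20, 20))
W[:10, :10] = 1.0
W[10:, 10:] = 1.0
np.fill_diagonal(W, 0.0)
selected = ndfs(X, W, kappa=2, k=1)
```

Rows of Z belonging to uninformative features are driven towards zero by the growing diagonal of G, so the discriminative feature survives the ranking.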

13.4 Feature selection via non-negative spectral


analysis and redundancy control (NSCR)
This approach is an extension of NDFS with further conditions to control the
feature redundancy, taking into account the correlation between the features
[253]. Additionally, the row sparsity of the feature selection matrix Z is
regularised by the ℓ2,q-norm for some 0 < q ≤ 1.

13.4.1 Optimisation problem


One way to control redundancy is to prevent two highly correlated features f_i
and f_j from being selected together, even if both of them are discriminative.
This can be formulated by introducing a penalty containing the correlation
coefficient C_ij between f_i and f_j. A suitable minimisation problem is therefore

$$(\textrm{¦}) \quad \min_{F, Z} \ \phi(F, Z) = \operatorname{tr}(F^T \hat{L} F) + \alpha \|XZ - F\|_F^2 + \beta \|Z\|_{2,q}^q + \rho \sum_{i,j=1}^{n} \|z_i\| \|z_j\| C_{ij} + \gamma \|F^T F - I\|_F^2 \quad \text{s.t.} \quad F \geq 0$$

where C ∈ R^(n×n) is the correlation matrix between the n features and ρ is a
parameter. Note also that instead of the ℓ2,1-norm as in NDFS, this method
uses the ℓ2,q-norm to regularise the row sparsity of Z, where

$$\|Z\|_{2,q} = \left( \sum_{i=1}^{n} \|z_i\|_2^q \right)^{1/q} = \left( \sum_{i=1}^{n} \left( \sqrt{\sum_{j=1}^{\kappa} Z_{ij}^2} \right)^{q} \right)^{1/q}$$

for some 0 < q ≤ 1. Even though this choice may seem strange and complicated,
the ℓ2,q-norm regularisation leads to a sparser matrix than the usual
ℓ2,1-norm. According to an empirical investigation by [253] on several data sets,
the best result is achieved with q = 1/2. A more extensive study is provided by
[256].
Finding optimal Z: As in NDFS, the matrix Z for which φ(F, Z) becomes
minimal can be found by differentiating φ partially with respect to Z:

$$\frac{\partial \phi}{\partial Z} = 0 \ \Rightarrow \ 2\alpha X^T (XZ - F) + 2\beta G Z + 2\rho H Z = 0$$

$$\Rightarrow \ Z = \left( \alpha X^T X + \beta G + \rho H \right)^{-1} \alpha X^T F = E^{-1} X^T F, \qquad E := X^T X + \frac{\beta}{\alpha} G + \frac{\rho}{\alpha} H$$

Here, G and H are the diagonal matrices satisfying

$$G_{ii} = \frac{q}{2 \|z_i\|^{2-q}} \qquad \text{and} \qquad H_{ii} = \frac{\sum_{j=1}^{n} \|z_j\| C_{ij}}{2 \|z_i\|}$$

for i = 1, ..., n.
Finding optimal F: As in NDFS, the matrix F for which φ(F, Z) becomes
minimal can once more be computed iteratively with the Lagrange multiplier
method applied to the Lagrange function

$$\mathcal{L}(F, \Lambda) = \operatorname{tr}(F^T M F) + \gamma \|F^T F - I\|_F^2 + \operatorname{tr}\left( \Lambda F^T \right)$$

where M = L̂ + αI − 2αX E^(−1) X^T. Then, the Karush-Kuhn-Tucker condition for
complementary slackness leads to

$$\left( M F + 2\gamma F F^T F - 2\gamma F \right)_{ij} F_{ij} = 0$$

Because of the matrices C and H in this method, the elements in M may have
mixed signs and affect the sign of the elements in F, which in turn may violate
the constraint F ≥ 0. An intervention can be made [253] by introducing the
auxiliary matrices M⁺ and M⁻ where

$$(M^+)_{ij} = \frac{|M_{ij}| + M_{ij}}{2} \qquad \text{and} \qquad (M^-)_{ij} = \frac{|M_{ij}| - M_{ij}}{2}$$

These matrices satisfy M = M⁺ − M⁻ and the updating rule can be written as

$$F_{ij} \leftarrow F_{ij} \, \frac{\left( M^- F + 2\gamma F \right)_{ij}}{\left( M^+ F + 2\gamma F F^T F \right)_{ij}}$$

The whole process can now be summed up in an algorithm.


13.4.2 Feature selection algorithm


For some chosen parameters α, β, γ, ρ, κ, k, p and q, the NSCR algorithm can be
formulated as follows [253]:
1. Construct a p-nearest neighbour graph. Compute the Laplacian matrix
L = D − W and normalise it to L̂ = D^(−1/2)(D − W)D^(−1/2).
2. Compute the correlation matrix C ∈ R^(n×n) for the n features.
3. At the iteration step t = 1, initialise F^t ∈ R^(m×κ). Let also G^t ∈ R^(n×n) be the
identity matrix and H^t ∈ R^(n×n) the zero matrix.
4. Iterate as follows until a chosen convergence criterion is met:

$$E^t = X^T X + \frac{\beta}{\alpha} G^t + \frac{\rho}{\alpha} H^t$$

$$M^t = \hat{L} + \alpha \left( I - 2 X (E^t)^{-1} X^T \right)$$

$$M^t_+ = \frac{|M^t| + M^t}{2}, \qquad M^t_- = \frac{|M^t| - M^t}{2}$$

$$F_{ij}^{t+1} = F_{ij}^t \, \frac{\left( M^t_- F^t + 2\gamma F^t \right)_{ij}}{\left( M^t_+ F^t + 2\gamma F^t (F^t)^T F^t \right)_{ij}}$$

$$Z^{t+1} = (E^t)^{-1} X^T F^{t+1}$$

$$G^{t+1} = \operatorname{diag}\left( \frac{q}{2\|z_1^{t+1}\|^{2-q}}, \ \cdots, \ \frac{q}{2\|z_n^{t+1}\|^{2-q}} \right)$$

$$H^{t+1} = \operatorname{diag}\left( \frac{\sum_j \|z_j^{t+1}\| C_{1j}}{2\|z_1^{t+1}\|}, \ \cdots, \ \frac{\sum_j \|z_j^{t+1}\| C_{nj}}{2\|z_n^{t+1}\|} \right)$$

5. Select the k features with the highest values ‖z_i‖, where i = 1, ..., n and z_i
denotes the i-th row of Z.


Note that NDFS is a special case of this approach with ρ = 0 and q = 1.
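The ingredients that NSCR adds on top of NDFS, namely the diagonal matrices G and H and the split M = M⁺ − M⁻, can be sketched directly from the formulas above (our own code; `eps` is a small safeguard that is not in the thesis):

```python
import numpy as np

def nscr_auxiliaries(Z, C, q=0.5, eps=1e-12):
    """Diagonal matrices G and H from the NSCR update:
    G_ii = q / (2 ||z_i||^(2-q)),  H_ii = sum_j ||z_j|| C_ij / (2 ||z_i||)."""
    norms = np.linalg.norm(Z, axis=1) + eps
    G = np.diag(q / (2.0 * norms ** (2.0 - q)))
    H = np.diag((C @ norms) / (2.0 * norms))
    return G, H

def split_pm(M):
    """Split M into non-negative parts with M = M_plus - M_minus."""
    M_plus = (np.abs(M) + M) / 2.0    # positive entries of M, zeros elsewhere
    M_minus = (np.abs(M) - M) / 2.0   # magnitudes of the negative entries
    return M_plus, M_minus

M = np.array([[1.0, -2.0],
              [0.0,  3.0]])
M_plus, M_minus = split_pm(M)         # M_plus - M_minus reconstructs M
```

Keeping both factors of the update non-negative is what guarantees that the multiplicative rule never drives F below zero.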

13.5 Feature selection via adaptive similarity


learning and subspace clustering (SCFS)
13.5.1 Optimisation problem
This approach has similar ideas and processes to NDFS. The proposed
optimisation problem is

$$(\ell) \quad \min_{F, Z} \ \phi(F, Z) = \|X - F F^T X\|_F^2 + \alpha \|XZ - F\|_F^2 + \beta \|Z\|_{2,1} + \gamma \|F F^T \mathbf{1} - \mathbf{1}\|_F^2 \quad \text{s.t.} \quad F \geq 0$$

where F, according to the authors, can be interpreted both as a similarity matrix
and as a cluster label matrix. Additionally, the product FF^T represents “the
pairwise sample similarities in terms of the clustering values” [250]. In other
words, each element (FF^T)_ij indicates whether the two data samples x_i and x_j
are assigned to the same cluster or not.
Following the same steps as described in 13.3, the optimal Z is computed as

$$Z = \left( X^T X + \frac{\beta}{\alpha} G \right)^{-1} X^T F$$

and the updating rule for the optimal F is shown to be

$$F_{ij} \leftarrow F_{ij} \, \frac{\left( 2N + \alpha X Z \right)_{ij}}{\left( N F^T F + F F^T N + \alpha F \right)_{ij}}$$

where $N = \left( X X^T + \gamma \mathbf{1}_m \right) F$ and 1_m ∈ R^(m×m) denotes the all-ones matrix.

13.5.2 Feature selection algorithm


For some chosen parameters α, β, γ, κ and k, the SCFS algorithm can be
formulated as follows [250]:
1. At the iteration step t = 1, initialise F^t ∈ R^(m×κ). Let also G^t ∈ R^(n×n) be the
identity matrix.
2. Iterate as follows until a chosen convergence criterion is met:

$$Z^{t+1} = \left( X^T X + \frac{\beta}{\alpha} G^t \right)^{-1} X^T F^t$$

$$N^t = \left( X X^T + \gamma \mathbf{1}_m \right) F^t$$

$$F_{ij}^{t+1} = F_{ij}^t \, \frac{\left( 2 N^t + \alpha X Z^{t+1} \right)_{ij}}{\left( N^t (F^t)^T F^t + F^t (F^t)^T N^t + \alpha F^t \right)_{ij}}$$

and update

$$G^{t+1} = \operatorname{diag}\left( \frac{1}{2\|z_1^{t+1}\|}, \ \cdots, \ \frac{1}{2\|z_n^{t+1}\|} \right)$$

3. Select the k features with the highest values ‖z_i‖, where i = 1, ..., n and z_i
denotes the i-th row of Z.

CHAPTER 14
Experiments and results

14.1 Line of action


The algorithms are run on different data sets obtained from open sources online
and from internal intra-day data. To highlight the effect of systematic feature
selection, the five presented algorithms are benchmarked with Baseline and
Maximum variance. Baseline, below abbreviated as BL, is simply that all
features are included, without any feature selection. Maximum variance, below
abbreviated as MV, is an intuitive statistics-based approach where the features
with the largest variances are selected.
The selected features are evaluated with respect to clustering quality, using the
measures clustering accuracy (ACC), normalised mutual information (NMI) and
adjusted mutual information (AMI). The two clusterings compared are one
clustering C using all the n original features and one clustering C̃ using only the k
selected features. For brevity, such a C will be called the true clustering of a data
set, while the measures ACC, NMI and AMI will be called the three clustering
quality measures. A good feature selection is expected to give higher values of
the clustering quality measures than BL does.
Furthermore, the stability of a feature selection algorithm on a given data set is
tested by randomly splitting the data set into two subsets of equal cardinality
and observing whether the selected features are the same. The measure used for
the comparison is the Jaccard index.
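The stability test can be sketched generically for any selector function (our own code; `max_variance` plays the role of the MV benchmark):

```python
import numpy as np

def jaccard_stability(select, X, k, seed=0):
    """Split the samples into two random halves, run the feature selection
    select(X_half, k) on each half, and compare the selections with Jaccard."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    half = X.shape[0] // 2
    s1 = set(select(X[idx[:half]], k))
    s2 = set(select(X[idx[half:]], k))
    return len(s1 & s2) / len(s1 | s2)

def max_variance(X, k):
    """The MV benchmark: the k features with the largest variances."""
    return np.argsort(-X.var(axis=0))[:k]

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 10)) * np.arange(1, 11)   # column j has std j + 1
stability = jaccard_stability(max_variance, X, k=3)
```

A stable selector should return (nearly) the same feature set on both halves, giving a Jaccard index close to 1.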


14.2 Data sets


The algorithms are first run on eight public data sets [?] collected from studies
on biotechnology, text mining, face recognition and spoken letter recognition,
typically with many features. In Table 14.1, these data sets are called ALLAML,
BASEHOCK, GLIOMA, LSOLET, LYMPHOMA, NCI9, PIE10P and PROSTATE.
Afterwards, the algorithms are run on a data set owned by Fortum. This data
set is used by Fortum for forecasting the electricity price, among other important
quantities. In Table 14.1, and only in this paper, this data set is for the sake of
reference called CELEBI as an abbreviation of “colossal data of electricity price
for bibliophiles”. The types of features included in CELEBI were presented in
Section 10.3.

Data set    m        n
ALLAML      72       7129
BASEHOCK    1993     4862
GLIOMA      50       4434
LSOLET      2600     500
LYMPHOMA    96       4026
NCI9        60       9712
PIE10P      210      2420
PROSTATE    102      5966
CELEBI      941 442  576

Table 14.1: Data sets with their numbers of samples (m) and features (n).

14.3 Parameter setting


Some parameters need to be chosen in advance. Inspired by the methodology
used in [249], [250], [251], [252] and [253], the setting is chosen as follows. The
parameters α and β in NDFS, NSCR and SCFS are empirically set to 1 and 10^3
respectively. The punishment parameter γ should be significantly large and is
therefore chosen as γ = 10^6. The threshold θ in the convergence criterion for
these three methods is set to θ = 10^-4. For NSCR, ρ = 10^3 and q = 1/2 are
used. The parameter β in MCFS is also set to 10^3. For all methods, the number
p in the p-nearest neighbour graph is set to p = 5.
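All five methods start from a p-nearest-neighbour graph over the samples. A pure-NumPy sketch of its construction with binary edge weights is given below; the actual methods typically weight the edges with a heat kernel instead:

```python
import numpy as np

def knn_graph(X, p=5):
    """Symmetric adjacency matrix of the p-nearest-neighbour graph,
    using Euclidean distance and binary edge weights."""
    m = X.shape[0]
    # pairwise squared Euclidean distances between all samples
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)               # no self-loops
    W = np.zeros((m, m))
    nbrs = np.argsort(d2, axis=1)[:, :p]       # p closest samples per row
    for i in range(m):
        W[i, nbrs[i]] = 1.0
    return np.maximum(W, W.T)                  # symmetrise the graph

W = knn_graph(np.random.default_rng(0).standard_normal((10, 4)), p=5)
```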


Since the k-means clustering algorithm returns a different clustering in each run
due to the random choice of initial centroids, 50 clusterings C˜ are generated
for computing the three clustering quality measures. After that, the mean and
the standard deviation of each measure are computed.
Because the optimal number of selected features k is normally unknown in
practice, an empirical grid search is performed over the value set {50, 100,
150, 200, 250, 300} for k for the public data sets and over the value set
{25, 50, 75, 100, 125, 150, 175, 200} for CELEBI, since CELEBI has far fewer
features than seven of the public data sets. After that, the k-values
corresponding to the largest ACC, NMI and AMI respectively are reported.
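The repeated clustering and the grid search over k can be sketched as follows; scikit-learn is assumed, and `ranked` stands for the list of feature indices ordered by the importance scores a selection method produces (only NMI is shown, the other measures follow the same pattern):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

def nmi_over_runs(X, y, ranked, k, n_runs=50):
    """Mean and standard deviation of NMI over repeated k-means
    clusterings on the k top-ranked features."""
    n_clusters = len(np.unique(y))
    scores = []
    for r in range(n_runs):
        pred = KMeans(n_clusters=n_clusters, n_init=1,
                      random_state=r).fit_predict(X[:, ranked[:k]])
        scores.append(normalized_mutual_info_score(y, pred))
    return np.mean(scores), np.std(scores)

def best_k(X, y, ranked, grid=(50, 100, 150, 200, 250, 300)):
    """Empirical grid search: the k giving the highest mean NMI."""
    return max(grid, key=lambda k: nmi_over_runs(X, y, ranked, k)[0])
```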

14.4 Experiment: Eight public data sets


The results are shown in Tables B.1, B.2 and B.3. The numbers in parentheses
show the number of selected features for which the corresponding evaluation
metric is obtained. The best value for each data set is shown in bold. The last
row shows the difference in percentage points between the best value and the
one obtained from the Baseline (BL) in the same column.

14.4.1 Clustering accuracy


For each of the eight data sets, at least one of the five presented feature
selection methods leads to a better clustering accuracy than BL, where no
feature selection is performed. Overall, the three iterative methods NDFS, NSCR
and SCFS give the best values. Depending on the data set, the clustering accuracy
can increase by about 30 percentage points when the number of features is reduced.
Note that fewer features can also decrease the clustering accuracy, depending on
the data set.

14.4.2 Normalised and adjusted mutual information


The results look promising here as well, since there is always at least one of the
five presented feature selection methods yielding better normalised and adjusted
mutual information than BL. Overall, the three iterative methods NDFS, NSCR
and SCFS give the best values. Depending on the data set, both the normalised
and the adjusted mutual information can increase by about 80 percentage points
when fewer features are considered.


14.4.3 Stability
The result is shown in Figure B.1. Except for LSOLET, none of the data sets has
a Jaccard index close to 80 %. This is not surprising [?] considering that the
other seven data sets have large numbers of features (thousands) and six of them
have relatively few samples (below 250). A low Jaccard index does not necessarily
mean that a feature selection algorithm is unstable; the numbers of samples and
features also play a role. This experiment shows empirically that a high Jaccard
index can be obtained if a data set has many samples (over a thousand) and
relatively few features (less than a thousand).

14.5 Experiment: CELEBI of Fortum


It is clearly not reasonable to run feature selection algorithms on CELEBI with
nearly one million data samples. Firstly, the memory required would be
enormous even for powerful computers. Secondly, the data set contains daily
time series data since 2015, and likely not all of it is still relevant. The
factors having a strong impact on the electricity price may change regularly.
According to experts at Fortum, the behaviour of the electricity price can differ
between years, even when observing the same month.
Hence, the chosen focus of this study is the data from 2019. The master data
set CELEBI is empirically split into 48 subsets X_1, ..., X_48. Each ensemble of
four subsets with indices {4i − 3, 4i − 2, 4i − 1, 4i}, where i = 1, ..., 12, makes
up the i-th month of 2019. The subsets have the same number of features but
slightly different numbers of samples (about 4500 each), depending on the
availability of data.
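The split can be sketched as follows, assuming pandas; the daily index and random values are stand-ins for the real CELEBI columns, whose actual resolution is intra-day:

```python
import numpy as np
import pandas as pd

# Stand-in for the 2019 portion of CELEBI: one row per observation,
# indexed by date, with a few dummy feature columns.
idx = pd.date_range("2019-01-01", "2019-12-31", freq="D")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((len(idx), 3)),
                  index=idx, columns=["f1", "f2", "f3"])

# Four subsets per calendar month give the 48 subsets X_1, ..., X_48;
# subsets within a month may differ slightly in length.
subsets = []
for _, month in df.groupby(df.index.month):
    subsets.extend(np.array_split(month, 4))
```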
Due to the excessive computational time, only the most recent algorithm, SCFS,
is run on CELEBI to demonstrate a proof-of-concept. In the experiment with the
eight public data sets, SCFS proved to be a good method, giving high values for
the three clustering quality measures with quick convergence.
The results regarding the three clustering quality measures are reported in Table
B.4. The average stability is shown as a plot in Figure 14.1. The similarity
between the features selected for the 48 subsets would be most fully presented
as a 48 × 48 matrix showing every pairwise similarity; however, only the average
stability is reported here for readability reasons.


Figure 14.1: Average Jaccard index.

Just as in the case with the eight public data sets, the values of ACC, NMI and
AMI for all 48 subsets increase when fewer features are used. Figure 14.1
suggests that the most influential features for the electricity price vary the most
during the months January, February and March and the least during the
months April, May and June.
This experiment shows that the SCFS algorithm is a promising candidate for
feature selection on large intra-day data sets with both endogenous and
exogenous parameters. The variation in the selected features suggests that
shorter or longer time intervals for selecting features are preferable depending
on the period of the year.

CHAPTER 15
Discussion

15.1 Convergence speed


It is interesting to also study how the threshold θ in the convergence criterion

    relative change = (φ_t − φ_{t−1}) / φ_t < θ

for the three methods NDFS, NSCR and SCFS affects the number of necessary
iterations. An experiment is done where all three methods are executed for 15
iterations on the eight public data sets and the variation of the relative change
is observed. The result is shown in Figure B.2.
As observed from the plots, the relative change in SCFS tends to decrease
linearly after four iterations and seems able to get as low as 10^-10.
Meanwhile, the relative changes in NDFS and NSCR do not seem to come down
to 10^-8. It is therefore important to keep in mind that too low a value of θ may
cause some iterative methods to run endlessly. This problem is easily solved by
setting a limit on the number of iterations, such as 15.
All three methods seem to converge quickly: the relative change can get down
to about 10^-4 within 10 iterations.
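The criterion with a hard iteration cap can be sketched as follows; the function names are ours, with `step` performing one update of an iterative method and `phi` evaluating its objective value:

```python
def run_until_converged(step, phi, theta=1e-4, max_iter=15):
    """Iterate until the relative change of the objective drops below
    theta, with a cap on the number of iterations as a safeguard."""
    prev = phi()
    for t in range(1, max_iter + 1):
        step()
        cur = phi()
        rel_change = abs(cur - prev) / abs(cur)  # relative change
        if rel_change < theta:
            return t        # converged after t iterations
        prev = cur
    return max_iter         # cap reached without meeting the criterion
```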


15.2 Parameter sensitivity


An additional interesting aspect is how the parameters α and β in the three
methods NDFS, NSCR and SCFS affect the evaluation values ACC, NMI and
AMI. An experiment is done with the value set {10^-4, 10^-2, 1, 10^2, 10^4}
for both α and β, and three-dimensional bar charts are plotted showing 75
evaluation values for each data set; see Figure B.3. Due to the excessive
computational time, only the most recent method, SCFS, has been used for this
sensitivity investigation.
Overall, the best evaluation values are obtained for large β. This is expected
since the parameter β controls the sparsity of the feature selection matrix Z.
The optimal α, however, differs between data sets. It can also be noted that the
clustering accuracy seems relatively insensitive to changes in α and β for the
data sets BASEHOCK, GLIOMA and LSOLET, while it increases substantially
with larger α for the other data sets.

15.3 Jaccard index


In this paper, the Jaccard index has been used to compare two feature selections
for a data set split into two or more subsets, for the purpose of demonstrating a
way to work with large scale data sets. Even though the defining formula for
the Jaccard index is simple and intuitive, it may not be the most suitable
measure for every data set. Assume two selections where the five best features
are given by the sets A = {5, 14, 2, 67, 11} and B = {65, 11, 14, 3, 29}. Then, the
corresponding Jaccard index is
    J = |A ∩ B| / |A ∪ B| = 2/8 = 0.25 = 25 %
However, if for example the features with indices 65 and 67 turn out in practice
to be highly correlated or completely dependent on each other, it may be
reasonable to consider them as the same choice of feature. This would lead to a
new possible Jaccard index, namely
    J′ = |A ∩ B| / |A ∪ B| = 3/7 ≈ 0.43 = 43 %
which is about 72 % higher than J. In other words, just because the indices of
the selected features differ numerically does not mean that the corresponding


feature selections are essentially different or that the feature selection algorithm
is unstable. A more suitable measure than the Jaccard index may be symmetrical
uncertainty [?], which accounts for the information similarity of the features
and not only their indices.
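The adjustment described above can be sketched as follows; the grouping of correlated features is supplied by hand here, while in practice it would come from a correlation analysis:

```python
def jaccard(a, b):
    """Plain Jaccard index of two sets of feature indices."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def merged_jaccard(a, b, groups):
    """Jaccard index after mapping each feature to a representative of
    its group of correlated features (groups: iterable of index sets)."""
    rep = {}
    for g in groups:
        r = min(g)                      # group representative
        for f in g:
            rep[f] = r
    a = {rep.get(f, f) for f in a}
    b = {rep.get(f, f) for f in b}
    return len(a & b) / len(a | b)

A = {5, 14, 2, 67, 11}
B = {65, 11, 14, 3, 29}
```

With the correlated pair {65, 67} treated as one feature, `merged_jaccard(A, B, [{65, 67}])` gives 3/7 instead of the plain 2/8.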

15.4 Conceivable challenges


A non-trivial challenge is clearly the long computational time for data sets with
high sample dimension despite low feature dimension, which is typical for time
series data spanning a long period. The empirical experiments show the need for
effective alternatives for treating large scale data, in order for the presented
methods to be useful. This is a problem particularly for the methods which
require iterations. One suggestion is to use graphics processing units to
accelerate the computations. This can be combined with numerical methods for
matrix computations when it comes to solving linear systems of equations,
eigenvalue problems and matrix functions.
Another direction for treating large scale data is to split the data to reduce the
number of samples per batch. This approach requires an analysis of whether the
selected features are essentially different among the batches, which can mean
that the algorithms are unstable, sensitive to changes in the data, or that the
target variables depend on different predictor variables during different time
periods. It is possible that the relevance of features differs over time due to
sporadic factors such as local weather, grid congestion or seasonal variability of
demand.

15.5 Implications for forecasting activities


The results can be used to provide information on the potential redundancies in
large available data sets. This knowledge can be used to assess the quality of a
data set independently of a forecasting model. The evaluation performed in this
research is primarily aimed at assessing the performance of a graph-based feature
selection method on intra-day data. As has been experienced in previous
research, the appropriate set of features can sometimes be non-intuitive, so a
data-driven method is preferable for assessing the actual accuracy of a
subsequent forecasting model.


15.6 Scalability
An interesting point to raise is the potential scalability of this model, both
within the trading sector and in others. As already discussed at the end of
Part I, the inclusion of other parameters from the energy markets and from the
grid is conceivable. In the use-case gathering, feature selection was part of the
maintenance use-cases for hydropower, nuclear power and wind power, as it is
technically applicable to any time series data and to almost any classification
purpose. As filter-based feature selection has the advantage of being model-free
and faster than the alternatives, it should generalise to more domain areas as
the pre-processing phase of computational or statistical models in the presence
of large scale or chaotic data. A proper evaluation of the actual scalability of
graph-based feature selection methods compared with industry standards would
be necessary to confirm this hypothesis.

CHAPTER 16
Research conclusion

Graph analytics is emerging as a promising technology across industries.


Applicable use-cases can range from operation of power systems to management
of databases as well as decision support tools for administrative staff. Spurred by
the changing market conditions and the decarbonisation of the energy sector,
electric utility companies are screening the opportunities offered by digitalisation.
Because of its ability to analyse networks, graph analytics is a good candidate
for this endeavour.
This research proposes a methodical approach to capture the essence of the
underlying mathematical foundations of graph theory and to screen, evaluate and
prioritise potential applications tied to common operations of electric utility
companies. The best contextual fit for graph theory was found to be the sector of
energy trading and more particularly, the problem of feature selection for
electricity price forecasting. A proof-of-concept was elaborated to demonstrate
the added value graphs can bring.
Electricity price forecasting has been a key challenge for actors in the energy
markets since the liberalisation of the industry, and increasingly so due to the
growing volatility in electricity prices with the integration of more renewable
sources of electricity. As
data is becoming more pervasive, effective methods for automatically selecting
the most explanatory endogenous and exogenous variables impacting the
electricity prices are highly relevant. Mathematical methods for automatic
feature selection can reduce the time-consuming manual workload and also the

possible bias in decision making. Graph-based feature selection methods having
the ability to reduce the dimensionality of data sets by preserving the geometric
structure have been chosen in order to tackle the challenges brought about by
the sometimes non-linear impact of variables on the electricity prices as well as
the high dimensionality of the data sets. They were applied after obtaining the
nearest neighbour graph from the observations in the intra-day time series.
The presented methods LS, MCFS, NDFS, NSCR and SCFS are fast on data
sets of high feature dimension and low sample dimension, but take considerably
more time on data sets of high sample dimension, even when the feature
dimension is low. Metrics such as the clustering accuracy, the normalised
mutual information, the adjusted mutual information and the Jaccard index can
be used to evaluate and compare the selected features. Because of its high
performance in the experiments on the open-source benchmark data sets, SCFS
was applied to intra-day data consisting of over 500 features. Its higher scores
than the Baseline (original data set) on all three metrics suggest that the
reduced data set could improve forecasting accuracy.
Prototypical regression models can be built after the number of original features
has been reduced. From this, the quality of the models can be assessed with
respect to prediction errors, which may even hint at whether potential sets of
already selected features need adjustment. The independence of the feature
selection and its evaluation from the forecasting model makes it possible to
isolate the performance of the graph-based method. Thus, this approach can be
generalised to more applications benefiting from feature selection.
For efficiency, one should consider splitting the data, utilising graphics
processing units or looking into other computing methods suitable for
large-scale data as a complement. Another direction is to improve graph-based
feature selections by customising some functions and measures depending on the
characteristics of the data set and what the problem solver wishes to achieve. For
example, alternative ways for computing the similarity matrix, for clustering the
data points and for comparing selected features in case of split data are
interesting subjects.

Bibliography

[1] Eurostat (2019). Electricity production, consumption and market overview.
Fetched at https://ec.europa.eu/eurostat/statistics-explained/index.php/
Electricity_production,_consumption_and_market_overview
[2] IVA, Royal Swedish Academy of Engineering Sciences (2016) Future
Electricity Production in Sweden - A project report.
https://www.iva.se/globalassets/rapporter/vagval-el/
201705-iva-vagvalel-framtidens-elproduktion-english-c.pdf
[3] Deloitte Monitor (2018). Power Market Study 2030 - A new outlook for the
energy industry. https://www2.deloitte.com/content/dam/Deloitte/de/
Documents/energy-resources/Power_Market_EN_Komplett_web.pdf
[4] McKinsey (2016). The digital utility: New opportunities and challenges.
https://www.mckinsey.com/industries/
electric-power-and-natural-gas/our-insights/
the-digital-utility-new-opportunities-and-challenges
[5] Costello, K.W., and R.C. Hemphill (2014). Electric Utilities’ "Death Spiral":
Hyperbole or Reality?. The Electricity Journal 27 (10): 7–26.
doi:10.1016/j.tej.2014.09.011.
[6] Pérez-Arriaga, Ignacio ; Knittel, Christopher (2016). Utility of the future -
An MIT Energy Initiative response to an industry in transition.
https://energy.mit.edu/wp-content/uploads/2016/12/
Utility-of-the-Future-Full-Report.pdf
[7] Statkraft (2018). Global energy trends - Statkrafts low emissions scenario
2018. https://www.statkraft.com/globalassets/explained/
statkrafts-low-emissions-scenario-report-2018.pdf


[8] Tsafos, Nikos (2020). How will natural gas fare in the energy transition.
Center for strategic and international studies (CSIS). https://csis-prod.
s3.amazonaws.com/s3fs-public/publication/200114_Tsafos_How_Will_
Natural_Gas_Fare.pdf?_oyuJJTiixKn0sDY1y7ZeQ3c3C5ijhnG
[9] Burke, Paul J., Abayasekera, Ashani (2017). The price elasticity of electricity
demand in the United States: A three-dimensional analysis. Center for
Applied Macroeconomic Analysis (CAMA), Crawford School of Public Policy.
https:
//cama.crawford.anu.edu.au/sites/default/files/publication/cama_
crawford_anu_edu_au/2017-08/50_2017_burke_abayasekara_0.pdf
[10] Wagner, Marcus (2011). To explore or to exploit? An empirical investigation
of acquisitions by large incumbents. Research Policy, Volume 40, Issue 9,
November 2011, Pages 1217-1225. doi.org/10.1016/j.respol.2011.07.006
[11] Albadi, M.H. ; El-Saadany, E.F. (2007). Demand Response in Electricity
Markets: An Overview.
[12] Poullikas, Andreas (2013). A comparative assessment of net metering and
feed in tariff schemes for residential PV systems. Sustainable Energy
Technologies and Assessments 3:1–8. DOI: 10.1016/j.seta.2013.04.001.
[13] International Energy Agency (IEA) (2019). World Energy Outlook 2019.
[14] Staffell, Iain ; Pfenninger, Stefan (2017). The increasing impact of weather
on electricity supply and demand. Energy, Volume 145, 15 February 2018,
Pages 65-78. doi.org/10.1016/j.energy.2017.12.051.
[15] International Energy Agency (2017). Digitalization & Energy.
http://www.iea.org/publications/freepublications/publication/
DigitalizationandEnergy3.pdf.
[16] Bloomberg, J. (2018) Digitization, Digitalization, And Digital
Transformation: Confuse Them At Your Peril. Forbes Magazine
https://www.forbes.com/sites/jasonbloomberg/2018/04/29/
digitization-digitalization-and-digital-transformation-confuse-them-at-your-per
#7b166c3d2f2c.
[17] Saunders, M. N. K., Lewis, P., & Thornhill, A. (2009). Research methods for
business students (5th ed). Prentice Hall.


[18] Goddard, W. & Melville, S. (2004). Research Methodology: An
Introduction, 2nd edition. Blackwell Publishing.
[19] Wilson, J. (2010). Essentials of Business Research: A Guide to Doing Your
Research Project. SAGE Publications, p.7.
[20] Kovács, G., & Spens, K. M. (2005). Abductive reasoning in logistics
research. International Journal of Physical Distribution & Logistics
Management, 35(2), 132–144. https://doi.org/10.1108/09600030510590318.
[21] Kelly, P.; Kranzburg M. (1978). Technological Innovation: A Critical Review
of Current Knowledge. San Francisco: San Francisco Press.
[22] Eling, K., & Herstatt, C. (2017). Managing the Front End of
Innovation—Less Fuzzy, Yet Still Not Fully Understood. Journal of Product
Innovation Management, 34(6), 864–874. https://doi.org/10.1111/jpim.12415
[23] Boeddrich, H.-J. (2004). Ideas in the Workplace: A New Approach Towards
Organizing the Fuzzy Front End of the Innovation Process. Creativity and
Innovation Management, 13(4), 274–285.
https://doi.org/10.1111/j.0963-1690.2004.00316.x
[24] Frishammar, J., Florén, H., & Wincent, J. (2011). Beyond Managing
Uncertainty: Insights from Studying Equivocality in the Fuzzy Front-End of
Product and Process Innovation Projects. IEEE Transactions on Engineering
Management, 58(3), 551–563. https://doi.org/10.1109/TEM.2010.2095017
[25] Gerlach, S., & Brem, A. (2017). Idea management revisited: A review of the
literature and guide for implementation. International Journal of Innovation
Studies, 1(2), 144–161. https://doi.org/10.1016/j.ijis.2017.10.004
[26] Edeland, C., & Mörk, T. (2018). Blockchain Technology in the Energy
Transition: An Exploratory Study on How Electric Utilities Can Approach
Blockchain Technology.
[27] Alexe,C.G., Alexe, C.M., Militaru, G. (2014), Idea Management in the
innovation process, Network Intelligence Studies, Volume 11, Issue 2(4), 2014.
[28] Brem, A., & Voigt, K.-I. (2009). Integration of market pull and technology
push in the corporate front end and innovation management—Insights from
the German software industry. Technovation, 29(5), 351–367.
https://doi.org/10.1016/j.technovation.2008.06.003


[29] Vahs and Brem (2013). Innovations management. Schäffer-Poeschel Verlag,
Stuttgart.
[30] L.N. Neagoe, V.M. Klein (2009). Employee suggestion system (Kaizen
Teian) - the bottom-up approach for productivity improvement International
conference on economic engineering and manufacturing systems, Vol. 10
(2009), pp. 361-366
[31] Sandström, C. ; Björk, J. (2010). Idea management systems for a changing
innovation landscape. International Journal of Product Development, 11 (3)
(2010), pp. 310-324.
[32] Brem, A. ; Voigt, K.-I. (2007). Innovation management in emerging
technology ventures - the concept of an integrated idea management.
International Journal of Technology, Policy and Management, 7 (3) (2007),
pp. 304-321.
[33] El Bassiti, L. ; Ajhoun, R. (2013). Toward an innovation management
framework: A life-cycle model with an idea management focus International
Journal of Innovation, Management and Technology, 4 (6) (2013), pp.
551-559.
[34] Nilsson, L. ; Elg, M. ; Bergman, B. (2002). Managing ideas for the
development of new products. International Journal of Technology
Management (Published 1 January 2002).
https://doi.org/10.1504/IJTM.2002.003067
[35] Lloyd, G.C. (1996). Fostering an environment of employee contribution to
increase commitment and motivation. Empowerment in Organizations, 4 (1)
(1996), pp. 25-28.
[36] Gamlin, J.N. ; Yourd, R. ; Patrick, V. (2007). Unlock creativity with
“active” idea management Research Technology Management (January-Fe)
(2007), pp. 13-16.
[37] Westerski, A. ; Iglesias, C.A. ; Nagle, T. (2011). The road from community
ideas to organisational innovation: A life cycle survey of idea management
systems. International Journal of Web Based Communities, 7 (4) (2011), pp.
493-506


[38] Brem, A. ; Voigt, K.-I. (2007). Integration of Market Pull and Technology
Push in the Corporate Front End and Innovation Management - Insights
from the German Software Industry. Technovation 29(5):351-367. DOI:
10.1016/j.technovation.2008.06.003
[39] Dean, D.L. ; Hender, J.M. ; Rodgers, T.L. ; Santanen, E.L. (2006).
Identifying quality, novel, and creative ideas: Constructs and scales for idea
evaluation. Journal of the Association for Information Systems, 7 (10) (2006),
pp. 646-698.
[40] Fortum (2019). CEO’s Business Review.
[41] Diestel, R. (2016). Graph Theory, 5th edition.
[42] Bondy, J. & Murty, U. (1982). Graph Theory with Applications, 5th edition.
[43] Wilson, R. (1998). Introduction to Graph Theory, 4th edition.
[44] Nandhini, V. (2019). A study on course timetable scheduling and exam
timetable scheduling using graph coloring approach. International Journal for
Research in Applied Science and Engineering Technology, 7(3), 1999–2006.
doi: 10.22214/ijraset.2019.3368
[45] Imhof, K., & Arias, C. (1990). On the use of colours for presenting power
system connectivity information. Conference Papers Power Industry
Computer Application Conference. doi: 10.1109/pica.1989.38989
[46] Yu, H., Han, X., Ma, Y. & Xing, X. (2016). Transmission line maintenance
scheduling based on graph coloring. Proceedings of the 2016 International
Conference on Education, Management, Computer and Society. doi:
10.2991/emcs-16.2016.37
[47] Anwar, A., & Mahmood, A. N. (2016). Anomaly detection in electric
network database of smart grid: Graph matching approach. Electric Power
Systems Research, 133, 51–62. doi: 10.1016/j.epsr.2015.12.006
[48] Elmrabet, Z., Elghazi, H., Kaabouch, N. & Elghazi, H. (2018).
Cyber-security in smart grid: Survey and challenges. Computer and Electrical
Engineering, 67. doi: 10.1016/j.compeleceng.2018.01.015


[49] Mookiah, L., Dean, C. & Eberle, W. (2017). Graph-based anomaly detection
on smart grid data. flairs conference. Proceedings of the Thirtieth
International Florida Artificial Intelligence Research Society Conference.
[50] Mohamad, H., Faezy Wan Zalnidzham, W., Ashida Salim, N., Shahbudin,
S., & Mat Yasin, Z. (2019). Power system restoration in distribution network
using minimum spanning tree - Kruskal’s algorithm. Indonesian Journal of
Electrical Engineering and Computer Science, 16 (1). doi:
10.11591/ijeecs.v16.i1.pp1-8
[51] Zhang, Y. (2009). Stock market network topology analysis based on a
minimum spanning tree approach. Bowling Green State University.
[52] Lautier, D., Ling, J., & Raynaud, F. (2014). Systemic risk in commodity
markets: What do trees tell us about crises? SSRN Electronic Journal. doi:
10.2139/ssrn.2430167
[53] Erickson, J. (2019). Algorithms. Retrieved: 2020.05.29.
https://jeffe.cs.illinois.edu/teaching/algorithms
[54] Takaoka, T. (1998). Shortest path algorithms for nearly acyclic directed
graphs. Theoretical Computer Science, 203 (1), 143–150. doi:
10.1016/s0304-3975(97)00292-2
[55] Fortum. (2020). Laddinfrastruktur för elfordon med Fortum Charge &
Drive. Retrieved: 2020.05.29.
https://www.fortum.se/foretag/ladda-elbil/om-fortum-charge-drive
[56] Baum, M., Dibbelt, J., Gemsa, A., Wagner, D., & Zündorf, T. (2019).
Shortest feasible paths with charging stops for battery electric vehicles.
Transportation Science, 53 (6), 1627–1655. doi: 10.1287/trsc.2018.0889
[57] Parastvand, H., Moghaddam, V., Bass, O., Masoum, M. A. S., Chapman,
A., & Lachowicz, S. (2020). A graph automorphic approach for placement
and sizing of charging stations in EV network considering traffic. IEEE
Transactions on Smart Grid, 1–1. doi: 10.1109/tsg.2020.2984037
[58] Jia, L., Hu, Z., Song, Y., & Luo, Z. (2012). Optimal siting and sizing of
electric vehicle charging stations. 2012 IEEE International Electric Vehicle
Conference. doi: 10.1109/ievc.2012.6183283


[59] Chauhan, V., Gutfraind, A., & Safro, I. (2019). Multiscale planar graph
generation. Applied Network Science, 4 (1). doi: 10.1007/s41109-019-0142-3
[60] Chauhan, V. (2018). Planar graph generation with application to water
distribution networks. Clemson University. doi: 10.13140/RG.2.2.18915.81445
[61] Ansari, M. H., Vakili, V. T., Bahrak, B., & Tavassoli, P. (2018). Graph
theoretical defense mechanisms against false data injection attacks in smart
grids. Journal of Modern Power Systems and Clean Energy, 6 (5), 860–871.
doi: 10.1007/s40565-018-0432-2
[62] Nussbaum, Y. (2014). Network flow problems in planar graphs. Tel-Aviv:
Tel-Aviv University.
[63] Fujie, F., & Zhang, P. (2014). Covering walks in graphs. SpringerBriefs in
Mathematics. doi: 10.1007/978-1-4939-0305-4
[64] Leite, J. B., & Mantovani, J. R. S. (2016). Distribution system state
estimation using the Hamiltonian cycle theory. 2016 IEEE Power and Energy
Society General Meeting (PESGM). doi: 10.1109/pesgm.2016.7741360
[65] Robinson, I., Webber, J. & Eifrem, E. (2015). Graph databases. Sebastopol,
CA: O’Reilly & Associates.
[66] Diendorfer, C., Haslwanter, J. D. H., Stanovich, M., Schoder, K.,
Sloderbeck, M., Ravindra, H., & Steurer, M. (2017). Graph traversal-based
automation of fault detection, location, and recovery on MVDC shipboard
power systems. 2017 IEEE Second International Conference on DC
Microgrids (ICDCM). doi: 10.1109/icdcm.2017.8001032
[67] Liao, H., Mariani, M., Medo, M., Zhang, Y. & Zhou, M. (2017). Ranking in
evolving complex networks. Physics Reports, 689. doi:
10.1016/j.physrep.2017.05.001
[68] Kiss, C., & Bichler, M. (2008). Identification of influencers - Measuring
influence in customer networks. Decision Support Systems, 46 (1), 233–253.
doi: 10.1016/j.dss.2008.06.007
[69] Yada, K., Motoda, H., Washio, T., & Miyawaki, A. (2006). Consumer
behavior analysis by graph mining technique. New Mathematics and Natural
Computation, 02 (01), 59–68. doi: 10.1142/s1793005706000294


[70] Green, T. & Hartley, N. (2015). Using graph theory to value paying and
nonpaying customers in a social network: Linking customer lifetime value to
word-of-mouth social value. Journal of Relationship Marketing, 14 (4). doi:
10.1080/15332667.2015.1095008
[71] Sami, M. (2018). Effective placement of sensors for efficient early warning
system in water distribution network. Chalmers University of Technology.
[72] Simone, A., Ridolfi, L., Laucelli, D., Berardi, L., & Giustolisi, O. (2018).
Centrality metrics for water distribution networks. doi: 10.29007/7lxd
[73] Hochbaum, D. S. (2006). Ranking sports teams and the inverse equal paths
problem. Lecture Notes in Computer Science Internet and Network
Economics, 307–318. doi: 10.1007/11944874_28
[74] Vigna, S. (2019). Spectral ranking. Retrieved: 2020.05.29.
https://arxiv.org/pdf/0912.0238.pdf
[75] Darvish, M., Yasaei, M., & Saeedi, A. (2009). Application of the graph
theory and matrix methods to contractor ranking. International Journal of
Project Management, 27 (6), 610–619. doi: 10.1016/j.ijproman.2008.10.004
[76] Hakimi-Asl, A., Amalnick, M. S., & Hakimi-Asl, M. (2016). Proposing a
graph ranking method for manufacturing system selection in high-tech
industries. Neural Computing and Applications, 29 (1), 133–142. doi:
10.1007/s00521-016-2420-7
[77] Nian, K., Zhang, H., Tayal, A., Coleman, T. & Li, Y. (2016). Unsupervised
spectral ranking for anomaly and application to auto insurance fraud
detection. The Journal of Finance and Data Science, 2 (1). doi:
10.1016/j.jfds.2016.03.001
[78] Lempitsky, V., Blake, A., & Rother, C. (2012). Branch-and-mincut: Global
optimization for image segmentation with high-level priors. Journal of
Mathematical Imaging and Vision, 44 (3), 315–329. doi:
10.1007/s10851-012-0328-0
[79] Sen, A., Ghosh, P., Vittal, V., & Yang, B. (2009). A new min-cut problem
with application to electric power network partitioning. European
Transactions on Electrical Power, 19(6), 778–797. doi: 10.1002/etep.255


[80] Sun, K., Hou, Y., Sun, W. & Qi, J. (2019). Power system control under
cascading failures: Understanding, mitigation, and system restoration.
Wiley-IEEE Press.
[81] Sou, K. C., Sandberg, H., & Johansson, K. H. (2011). Electric power
network security analysis via minimum cut relaxation. IEEE Conference on
Decision and Control and European Control Conference. doi:
10.1109/cdc.2011.6160456
[82] Salem, R., Moneim, W. & Hassan, M. (2019). Graph mining techniques for
graph clustering: Starting point. Journal of Theoretical and Applied
Information Technology, 97 (15).
[83] Ehteram, M., Binti Koting, S., Afan, H. A., Mohd, N. S., Malek, M. A.,
Ahmed, A. N., El-shafie, A. H., Onn, C. C., Lai, S. H., & El-Shafie, A.
(2019). New Evolutionary Algorithm for Optimizing Hydropower Generation
Considering Multireservoir Systems. Applied Sciences, 9(11), 2280.
https://doi.org/10.3390/app9112280.
[84] Stoll, B., Andrade, J., Cohen, S., Brinkman, G., & Brancucci
Martinez-Anido, C. (2017). Hydropower Modeling Challenges
(NREL/TP-5D00-68231). https://doi.org/10.2172/1353003.
[85] Liu, P., Nguyen, T.-D., Cai, X., & Jiang, X. (2012). Finding Multiple
Optimal Solutions to Optimal Load Distribution Problem in Hydropower
Plant. Energies, 5(5), 1413–1432. https://doi.org/10.3390/en5051413.
[86] Fortum Interviewee M (2020).
[87] International Hydropower Association (2020). 2020 Hydropower Status
Report. https://www.hydropower.org/country-profiles/region-europe
[88] Fortum (2019). Financials 2019.
[89] Moeini, R., & Afshar, M. H. (2011). Arc-based constrained ant colony
optimisation algorithms for the optimal solution of hydropower reservoir
operation problems. 38, 14.
[90] Johnson, M. P., Rybalov, L., Zhao, L., & Bald, S. (2019). Maximizing power
production in path and tree riverine networks. Sustainable Computing:
Informatics and Systems, 22, 300–310.
https://doi.org/10.1016/j.suscom.2017.10.004.
[91] Ahmed, I., Dagnino, A., Bongiovi, A., & Ding, Y. (2018). Outlier Detection
for Hydropower Generation Plant. 2018 IEEE 14th International Conference
on Automation Science and Engineering (CASE), 193–198.
https://doi.org/10.1109/COASE.2018.8560424.
[92] Yu, X., Zhang, J., Fan, C., & Chen, S. (2016). Stability analysis of
governor-turbine-hydraulic system by state space method and graph theory.
Energy, 114, 613–622. https://doi.org/10.1016/j.energy.2016.07.164.
[93] Xu, W.-Y. (2007). Slope stability analysis of limit equilibrium finite element
method based on the Dijkstra algorithm. Yantu Gongcheng Xuebao/Chinese
Journal of Geotechnical Engineering, 29(8), 1159–1172.
[94] Blancke, O., Tahan, A., Komljenovic, D., Amyot, N., Hudon, C., &
Levesque, M. (2016). A hydrogenerator model-based failure detection
framework to support asset management. 2016 IEEE International
Conference on Prognostics and Health Management (ICPHM), 1–6.
https://doi.org/10.1109/ICPHM.2016.7542867.
[95] Ahmed, I., Dagnino, A., & Ding, Y. (2019). Unsupervised Anomaly
Detection Based on Minimum Spanning Tree Approximated Distance
Measures and its Application to Hydropower Turbines. IEEE Transactions on
Automation Science and Engineering, 16(2), 654–667.
https://doi.org/10.1109/TASE.2018.2848198
[96] Tenenbaum, J. B. (1998). Mapping a Manifold of Perceptual Observations.
In M. I. Jordan, M. J. Kearns, & S. A. Solla (Eds.), Advances in Neural
Information Processing Systems 10 (pp. 682–688). MIT Press.
http://papers.nips.cc/paper/1332-mapping-a-manifold-of-perceptual-
observations.pdf.
[97] Fortum Interviewee A (2020).
[98] Fortum Interviewee C (2020).
[99] Fortum Interviewee K (2020).
[100] Selak, L., Butala, P., Sluga, A. (2014). Condition monitoring and fault
diagnostics for hydropower plants. Computers in Industry 65 (2014), pp.
924–936.
[101] Åstrand, S. (2008). Environmental Effects of Turbine Oil Spills from
Hydro Power Plants to Rivers. Stockholm: The Royal Institute of Technology.
[102] Ma, J., & Jiang, J. (2009). Applications of Fault Diagnosis in Nuclear
Power Plants: An Introductory Survey. IFAC Proceedings Volumes, 42(8),
1150–1161. https://doi.org/10.3182/20090630-4-ES-2003.00189.
[103] Hines, J. W., & Davis, E. (n.d.). Lessons learned from the U.S. nuclear
power plant on-line monitoring programs.
[104] Dabrowski, C., & Hunt, F. (2011). Identifying failure scenarios in complex
systems by perturbing Markov chain models. Proceedings of the 2011
Pressure Vessels & Piping Division (PVPD) Conference, July 7–11, 2011,
Baltimore, MD, USA.
[105] Wu, M., Liu, Y., Wen, Z., Peng, M., Xia, H., Li, W., Wang, H., Abiodun,
A., & Yu, W. (2017). Fault diagnosis and severity estimation in nuclear power
plants.
[106] Chen, J., Zhou, T., & Ran, K. (2010). The Energy-Saving Diagnosis of
PWR Nuclear Power Station Based on the Thermo-Economic Analysis
Model. 18th International Conference on Nuclear Engineering: Volume 1,
137–144. https://doi.org/10.1115/ICONE18-29454.
[107] Ashaari, A., Ahmad, T., Shamsuddin, M., Mohamad, W. M. W., & Omar,
N. (2015). Graph Representation for Secondary System of Pressurized Water
Reactor with Autocatalytic Set Approach. Journal of Mathematics and
Statistics, 11(4), 107–112. https://doi.org/10.3844/jmssp.2015.107.112.
[108] Yamada, Y., & Teraoka, Y. (1998). An optimal design of piping route in a
CAD system for power plant. Computers & Mathematics with Applications,
35(6), 137–149. https://doi.org/10.1016/S0898-1221(98)00025-X.
[109] Al-Masri, A. (2019). How Does Back-Propagation in Artificial Neural
Networks Work? Towards Data Science (Medium). shorturl.at/lwxyI.
[110] Baraldi, P., Di Maio, F., Rigamonti, M., Zio, E., & Seraoui, R. (2015).
Unsupervised clustering of vibration signals for identifying anomalous
conditions in a nuclear turbine. Journal of Intelligent & Fuzzy Systems, 28(4),
1723–1731. https://doi.org/10.3233/IFS-141459.
[111] Dong, C.-L., Zhang, Q., & Zhao, Y. (2013). An adaptive decision
method using structure feature analysis on dynamic fault propagation model.
2013 International Conference on Machine Learning and Cybernetics,
664–669. https://doi.org/10.1109/ICMLC.2013.6890373.
[112] Dutta, S., & Overbye, T. J. (2012). Optimal Wind Farm Collector System
Topology Design Considering Total Trenching Length. IEEE Transactions on
Sustainable Energy, 3(3), 339–348.
https://doi.org/10.1109/TSTE.2012.2185817.
[113] Pouyaei, A., Golgouneh, A., & Kholghi, B. (2019). A New Local Searching
Approach for Wind Farm Layout Optimization Problem. 2019 IEEE 2nd
International Conference on Renewable Energy and Power Engineering
(REPE), 114–119. https://doi.org/10.1109/REPE48501.2019.9025115.
[114] Esfahanian, V., Pour, A. S., Harsini, I., Haghani, A., Pasandeh, R.,
Shahbazi, A., & Ahmadi, G. (2013). Numerical analysis of flow field around
NREL Phase II wind turbine by a hybrid CFD/BEM method. Journal of Wind
Engineering and Industrial Aerodynamics, 120, 29–36.
[115] Xu, C., Chen, D., Han, X., Pan, H., & Shen, W. (2016). The Collection of
The Main Issues for Wind Farm Optimisation in Complex Terrain. Journal of
Physics: Conference Series, 753, 032066.
https://doi.org/10.1088/1742-6596/753/3/032066.
[116] Li, J., Hu, W., Wu, X., Huang, Q., Liu, Z., Chen, Z., & Blaabjerg, F.
(2019). A Hybrid Cable Connection Structure for Wind Farms With
Reliability Consideration. IEEE Access, 7, 144398–144407.
https://doi.org/10.1109/ACCESS.2019.2944888.
[117] Dahmani, O., Bourguet, S., Machmoum, M., et al. (2015). Optimization of
the connection topology of an offshore wind farm network. IEEE Systems
Journal, 9(4), 1519–1528.
[118] Loganathan, M.K., Besbaurah, I., Gandhi, O.P., Borah, R.C. (2018).
Criticality analysis of wind turbine energy system using fuzzy digraph models
and matrix method. Safety and Reliability – Safe Societies in a Changing
World. Proceedings of the 28th International European Safety and
Reliability Conference (ESREL 2018), Trondheim, Norway, 17–21 June 2018.
[119] Jing, Y., Ning, L., & Shaoyuan, L. (2017). Two-layer PSDG based fault
diagnosis for wind turbines. 2017 36th Chinese Control Conference (CCC),
7148–7154. https://doi.org/10.23919/ChiCC.2017.8028484.
[120] Djeziri, M. A., Benmoussa, S., & Sanchez, R. (2018). Hybrid method for
remaining useful life prediction in wind turbine systems. Renewable Energy,
116, 173–187. https://doi.org/10.1016/j.renene.2017.05.020
[121] Marti-Puig, P., Blanco-M, A., Cárdenas, J., Cusidó, J., & Solé-Casals, J.
(2019). Feature Selection Algorithms for Wind Turbine Failure Prediction.
Energies, 12(3), 453. https://doi.org/10.3390/en12030453.
[122] Afanasyeva, S., Saari, J., Pyrhönen, O., & Partanen, J. (2018). Cuckoo
search for wind farm optimization with auxiliary infrastructure. Wind
Energy, 21(10), 855–875. https://doi.org/10.1002/we.2199.
[123] Fortum Interviewee F (2020).
[124] Global Market Insights (2019). Wind Energy Market Report.
[125] Allied Market Research (2016). Wind Turbine Market by Type of Wind
Farm (Onshore and Offshore) and Application (Industrial, Commercial, and
Residential) - Global Opportunity Analysis and Industry Forecast, 2017-2023.
[126] Zion Market Research (2019). Wind Turbine Operations and Maintenance
Market by Application (Offshore and Onshore): Global Industry Perspective,
Comprehensive Analysis, and Forecast, 2018–2025.
[127] Márquez, F. P. G., Pérez, J. M. P., Marugán, A. P., & Papaelias, M.
(2016). Identification of critical components of wind turbines using FTA over
the time. Renewable Energy, 87, 869–883.
https://doi.org/10.1016/j.renene.2015.09.038.
[128] Pan, L., Yao, E., Yang, Y., & Zhang, R. (2020). A location model for
electric vehicle (EV) public charging stations based on drivers’ existing
activities. Sustainable Cities and Society, 59, 102192.
https://doi.org/10.1016/j.scs.2020.102192.
[129] Baum, M., Dibbelt, J., Pajor, T., Sauer, J., Wagner, D., & Zündorf, T.
(2020). Energy-Optimal Routes for Battery Electric Vehicles. Algorithmica,
82(5), 1490–1546. https://doi.org/10.1007/s00453-019-00655-9.
[130] Campana, M., & Inga, E. (2019). Optimal Allocation of Public Charging
Stations based on Traffic Density in Smart Cities. 2019 IEEE Colombian
Conference on Applications in Computational Intelligence (ColCACI), 1–6.
https://doi.org/10.1109/ColCACI.2019.8781986.
[131] Phonrattanasak, P., & Leeprechanon, N. (n.d.). Optimal placement of EV
fast charging stations considering the impact on electrical distribution and
traffic condition.
[132] Raagapriya, S., Ain, A., & Dasgupta, P. (2017). Route optimization for an
electric vehicle with priority destinations. 2017 International Conference On
Smart Technologies For Smart Nation (SmartTechCon), 1244–1249.
https://doi.org/10.1109/SmartTechCon.2017.8358565.
[133] Fortum Interviewee G (2020).
[134] Fortum Interviewee J (2020).
[135] Allied Market Research (2020). Electric Vehicle Charger Market by Vehicle
Type (Battery Electric Vehicle (BEV), Plug-in Hybrid Electric Vehicle
(PHEV), and Hybrid Electric Vehicle (HEV)), Charging Type (On-board
Chargers, and Off-board Chargers), and End User (Residential and
Commercial): Global Opportunity Analysis and Industry Forecast, 2020-2027.
[136] Baker, T., Aibino, S., Belsito, E., Aubert, G., Sahoo, A. (2019). Electric
Vehicles Are a Multibillion-Dollar Opportunity for Utilities. Published by
Boston Consulting Group (BCG).
https://www.bcg.com/publications/2019/
electric-vehicles-multibillion-dollar-opportunity-utilities.
aspx.
[137] Schwenk, K., Faix, M., Mikut, R., Hagenmeyer, V., & Appino, R. R.
(2019). On Calendar-Based Scheduling for User-Friendly Charging of Plug-In
Electric Vehicles. 2019 IEEE 2nd Connected and Automated Vehicles
Symposium (CAVS), 1–5. https://doi.org/10.1109/CAVS.2019.8887782.
[138] Bunn, D. W., & Karakatsani, N. (2003). Forecasting Electricity Prices.
London Business School.
[139] Aggarwal, S. K., Saini, L. M., & Kumar, A. (2009). Electricity price
forecasting in deregulated markets: A review and evaluation. International
Journal of Electrical Power & Energy Systems, 31(1), 13–22.
https://doi.org/10.1016/j.ijepes.2008.09.003.
[140] Polikarpova, M. (n.d.). Financialization in the US gas market and its
influence on the price dynamics.
[141] Lee, J. W., & Nobi, A. (2018). Structural Transformation of Minimal
Spanning Trees in World Commodity Market. Acta Physica Polonica A,
133(6), 1414–1416. https://doi.org/10.12693/APhysPolA.133.1414.
[142] Ji, Q. (2012). System analysis approach for the identification of factors
driving crude oil prices. Computers & Industrial Engineering, 63(3), 615–625.
https://doi.org/10.1016/j.cie.2011.07.021.
[143] Guyon I, Elisseeff A. (2003). An introduction to variable and feature
selection. J Mach Learn Res 2003;3:1157–82.
[144] Amjady, N., & Keynia, F. (2009). Day-ahead price forecasting of electricity
markets by a new feature selection algorithm and cascaded neural network
technique. Energy Conversion and Management, 50(12), 2976–2982.
https://doi.org/10.1016/j.enconman.2009.07.016.
[145] Brusaferri, A., Fagiano, L., Matteucci, M., & Vitali, A. (2019). Day ahead
electricity price forecast by NARX model with LASSO based features
selection. 2019 IEEE 17th International Conference on Industrial Informatics
(INDIN), 1051–1056. https://doi.org/10.1109/INDIN41052.2019.8972263.
[146] Noorie, Z., & Afsari, F. (2020). Sparse feature selection: Relevance,
redundancy and locality structure preserving guided by pairwise constraints.
Applied Soft Computing, 87, 105956.
https://doi.org/10.1016/j.asoc.2019.105956.
[147] Ferkingstad, E., Løland, A., & Wilhelmsen, M. (2011). Causal modeling
and inference for electricity markets. Energy Economics, 33(3), 404–412.
https://doi.org/10.1016/j.eneco.2010.10.006.
[148] Ding, T., Sun, H., Sun, K., Li, F., & Zhang, X. (2015). Graph theory based
splitting strategies for power system islanding operation. 2015 IEEE Power
Energy Society General Meeting, 1–5.
https://doi.org/10.1109/PESGM.2015.7286392.
[149] Hines, P., & Blumsack, S. (2008). A Centrality Measure for Electrical
Networks. Proceedings of the 41st Annual Hawaii International Conference on
System Sciences (HICSS 2008), 185–185.
https://doi.org/10.1109/HICSS.2008.5.
[150] Nguyen, P. H., Kling, W. L., Georgiadis, G., Papatriantafilou, M., Tuan,
L. A., & Bertling, L. (2012). Application of the Graph Theory in Managing
Power Flows in Future Electric Networks. In Y. Zhang (Ed.), New Frontiers
in Graph Theory. InTech. https://doi.org/10.5772/35527.
[151] Abrams, J. (2017). Analysis of Equity Markets: A Graph Theory
Approach. SIAM Undergraduate Research Online, 10.
https://doi.org/10.1137/16S015632.
[152] Lautier, D., & Raynaud, F. (2012). Systemic risk in energy derivative
markets: A graph theory analysis.
[153] Liu, S., Huang, S., Chi, Y., Feng, S., Li, Y., & Sun, Q. (2020). Three-level
network analysis of the North American natural gas price: A multiscale
perspective. International Review of Financial Analysis, 67, 101420.
https://doi.org/10.1016/j.irfa.2019.101420.
[154] Park, H., Mjelde, J. W., & Bessler, D. A. (2006). Price dynamics among
U.S. electricity spot markets. Energy Economics, 28(1), 81–101.
https://doi.org/10.1016/j.eneco.2005.09.009.
[155] Shimizu, S., Hoyer, P. O., Hyvarinen, A., & Kerminen, A. (n.d.). A Linear
Non-Gaussian Acyclic Model for Causal Discovery.
[156] Fortum Interviewee L (2020).
[157] Ackerman, E., Fox, J., Pach, J., & Suk, A. (2014). On grids in topological
graphs. Computational Geometry, 47(7), 710–723.
https://doi.org/10.1016/j.comgeo.2014.02.003.
[158] Cuadra, L., Pino, M., Nieto-Borge, J., & Salcedo-Sanz, S. (2017).
Optimizing the Structure of Distribution Smart Grids with Renewable
Generation against Abnormal Conditions: A Complex Networks Approach
with Evolutionary Algorithms. Energies, 10(8), 1097.
https://doi.org/10.3390/en10081097.
[159] Huang, S.-C., & Wu, C.-F. (2018). Energy Commodity Price Forecasting
with Deep Multiple Kernel Learning. Energies, 11(11), 3029.
https://doi.org/10.3390/en11113029.
[160] Markovič, R., Gosak, M., Grubelnik, V., Marhl, M., & Virtič, P. (2019).
Data-driven classification of residential energy consumption patterns by
means of functional connectivity networks. Applied Energy, 242, 506–515.
https://doi.org/10.1016/j.apenergy.2019.03.134.
[161] Srivastava, A. K., Singh, D., Pandey, A. S., & Maini, T. (2019). A Novel
Feature Selection and Short-Term Price Forecasting Based on a Decision Tree
(J48) Model. Energies, 12(19), 3665. https://doi.org/10.3390/en12193665.
[162] Zareipour, H., Canizares, C. A., & Bhattacharya, K. (2010). Economic
impact of electricity market price forecasting errors: A demand-side analysis.
IEEE Transactions on Power Systems, 25, 254–262.
[163] Uniejewski, B., Nowotarski, J., & Weron, R. (2016). Automated Variable
Selection and Shrinkage for Day-Ahead Electricity Price Forecasting.
Energies, 9(8), 621. https://doi.org/10.3390/en9080621.
[164] Ji, Q., & Fan, Y. (2016). Evolution of the world crude oil market
integration: A graph theory analysis. Energy Economics, 53, 90–100.
https://doi.org/10.1016/j.eneco.2014.12.003
[165] Kekatos, V., Giannakis, G. B., & Baldick, R. (2014). Grid topology
identification using electricity prices. 2014 IEEE PES General Meeting |
Conference Exposition, 1–5. https://doi.org/10.1109/PESGM.2014.6939474.
[166] Lerch, E., Bokhari, M., & Jennrich, F. (2018). A Framework for Developing
VPP Conceptual Models: From Multiple Dimensions and Stakeholders,
Towards a Unified Perspective. 2018 International Conference and Utility
Exhibition on Green Energy for Sustainable Development (ICUE), 1–6.
https://doi.org/10.23919/ICUE-GESD.2018.8635673
[167] Sosnina, E. N., Shalukho, A. V., Lipuzhin, I. A., Kechkin, A. Yu., &
Voroshilov, A. A. (2018). Optimization of Virtual Power Plant Topology with
Distributed Generation Sources. 2018 International Conference and Utility
Exhibition on Green Energy for Sustainable Development (ICUE), 1–7.
https://doi.org/10.23919/ICUE-GESD.2018.8635749.
[168] Dimeas, A. L., & Hatziargyriou, N. D. (2005). Operation of a Multiagent
System for Microgrid Control. IEEE Transactions on Power Systems, 20(3),
1447–1455. https://doi.org/10.1109/TPWRS.2005.852060.
[169] Qiu, H., Zhou, A., Hu, B., Chai, B., Song, Y., & Chen, R. (2019). Design
and Implementation of Power Grid Graph Data Management Platform Based
on Distributed Storage. IOP Conference Series: Earth and Environmental
Science, 234, 012026. https://doi.org/10.1088/1755-1315/234/1/012026.
[170] Pan, Z., & Jing, Z. (2018). Modeling methods of big data for power grid
based on graph database. 2018 International Conference on Power System
Technology (POWERCON), 4340–4348.
https://doi.org/10.1109/POWERCON.2018.8602074.
[171] Tan, X., Wu, Y., & Tsang, D. H. K. (2015). A Stochastic Shortest Path
Framework for Quantifying the Value and Lifetime of Battery Energy Storage
Under Dynamic Pricing. IEEE Transactions on Smart Grid, 1–1.
https://doi.org/10.1109/TSG.2015.2478599.
[172] Zhang, Y., Song, Y., & Fei, S. (2020). Consensus Design for Heterogeneous
Battery Energy Storage Systems with Droop Control Considering
Geographical Factor. Applied Sciences, 10(2), 726.
https://doi.org/10.3390/app10020726.
[173] Persistence Market Research (2017). Global Market Study on Master Data
Management: Public Cloud Deployment of MDM Solutions to Gain Traction
during 2017-2022. https://www.persistencemarketresearch.com/
market-research/master-data-management-market.asp
[174] StiboSystems (2018). What is Master Data Management? And Why You
Need It. White Paper.
[175] Salah, K. (2017). Does MDM Need Graph?. Semarchy Blog.
https://blog.semarchy.com/does-mdm-need-graph.
[176] Khandelwal, S., & Mathias, A. (2011). Using a 360° view of customers for
segmentation. Journal of Medical Marketing: Device, Diagnostic and
Pharmaceutical Marketing, 11(3), 215–220.
https://doi.org/10.1177/1745790411408853.
[177] Fortum Interviewees B, E and I (2020). Group Interview.
[178] Yu, X., Xinyu, C. (2018). Graph-based customer entity resolution.
TigerGraph Blog. https://www.tigergraph.com/2018/12/18/
graph-based-customer-entity-resolution/.
[179] Ghrab, A., Romero, O., Skhiri, S., & Zimanyi, E. (n.d.). Analytics-Aware
Graph Database Modeling.
[180] Neo4j Website. https://neo4j.com/use-cases/knowledge-graph/
[181] PoolParty (2020). White Paper: Knowledge Graphs - AI Enriched with
Semantics.
[182] Zhao, Y., & Smidts, C. (2019). A method for systematically developing the
knowledge base of reactor operators in nuclear power plants to support
cognitive modeling of operator performance. Reliability Engineering &
System Safety, 186, 64–77. https://doi.org/10.1016/j.ress.2019.02.014.
[183] Lin, R., Fang, Q., & Wu, B. (2020). Hydro-graph: A knowledge graph for
hydrogen research trends and relations. International Journal of Hydrogen
Energy, S0360319919345367. https://doi.org/10.1016/j.ijhydene.2019.12.036.
[184] Wang, X., Ma, C., Liu, P., Pan, B., & Kang, Z. (2018). A Potential
Solution for Intelligent Energy Management—Knowledge Graph. 2018 IEEE
International Conference on Energy Internet (ICEI), 281–286.
https://doi.org/10.1109/ICEI.2018.00058.
[185] Neo4J Interviewee N (2020).
[186] Sun, M., Wang, Y., & Gao, C. (2016). Visibility graph network analysis of
natural gas price: The case of North American market. Physica A: Statistical
Mechanics and Its Applications, 462, 1–11.
https://doi.org/10.1016/j.physa.2016.06.051.
[187] Lacasa, L., Luque, B., Ballesteros, F., Luque, J., & Nuño, J. C. (2008).
From time series to complex networks: The visibility graph. Proceedings of
the National Academy of Sciences, 105(13), 4972–4975.
https://doi.org/10.1073/pnas.0709247105.
[188] Yu, L. (2013). Visibility graph network analysis of gold price time series.
Physica A, 392(16), 3374–3384.
[189] Zhang, B., Wang, J., & Fang, W. (2015). Volatility behavior of visibility
graph EMD financial time series from Ising interacting system. Physica A,
432, 301–314.
[190] Yang, Y., Wang, J., Yang, H., & Mang, J. (2009). Visibility graph approach
to exchange rate series. Physica A, 388(20), 4431–4437.
[191] Alzate, C., & Sinn, M. (2013). Improved Electricity Load Forecasting via
Kernel Spectral Clustering of Smart Meters. 2013 IEEE 13th International
Conference on Data Mining, 943–948.
https://doi.org/10.1109/ICDM.2013.144.
[192] Gajowniczek, K., & Ząbkowski, T. (2018). Simulation Study on Clustering
Approaches for Short-Term Electricity Forecasting. Complexity, 2018, 1–21.
https://doi.org/10.1155/2018/3683969.
[193] Verma, D., & Meila, M. (2003). A comparison of spectral clustering
algorithms. University of Washington, uw-cse-03-05-01.
[194] Zhang, X., Wang, J., & Gao, Y. (2019). A hybrid short-term electricity
price forecasting framework: Cuckoo search-based feature selection with
singular spectrum analysis and SVM. Energy Economics, 81, 899–913.
https://doi.org/10.1016/j.eneco.2019.05.026.
[195] Fortum Interviewee D (2020).
[196] Casper, W. R., & Nadiga, B. (2017). A New Spectral Clustering
Algorithm. ArXiv:1710.02756 [Physics]. http://arxiv.org/abs/1710.02756.
[197] Bolón-Canedo, V., Rego-Fernández, D., Peteiro-Barral, D.,
Alonso-Betanzos, A., Guijarro-Berdiñas, B., & Sánchez-Maroño, N. (2018).
On the scalability of feature selection methods on high-dimensional data.
Knowledge and Information Systems, 56(2), 395–442.
https://doi.org/10.1007/s10115-017-1140-3.
[198] Amjady, N., & Hemmati, M. (2006). Energy price forecasting problems and
proposals for such predictions. IEEE Power & Energy Magazine, March/April
2006.
[199] Kiener, E. (2006). Analysis of balancing markets. Master’s Thesis at
Electric Power Systems Lab, KTH.
[200] Weron, R. (2014). Electricity price forecasting: A review of the
state-of-the-art with a look into the future. International Journal of
Forecasting, 30 (4). doi: 10.1016/j.ijforecast.2014.08.008
[201] van der Veen, R. A. C., & Hakvoort, R. A. (2016). The electricity
balancing market: Exploring the design challenge. Utilities Policy, 43,
186–194. https://doi.org/10.1016/j.jup.2016.10.008.
[202] Grace, J. B., Schoolmaster, D. R., Guntenspergen, G. R., Little, A. M.,
Mitchell, B. R., Miller, K. M., & Schweiger, E. W. (2012). Guidelines for a
graph-theoretic implementation of structural equation modeling. Ecosphere,
3(8), art73. https://doi.org/10.1890/ES12-00048.1.
[203] Shipley, B. (2000). Cause and correlation in biology. Cambridge University
Press, Cambridge, UK.
[204] Yin, L., & Ma, X. (2018). Causality between oil shocks and exchange rate:
A Bayesian, graph-based VAR approach. Physica A: Statistical Mechanics
and Its Applications, 508, 434–453.
https://doi.org/10.1016/j.physa.2018.05.064.
[205] Ji, Q. (2012). System analysis approach for the identification of factors
driving crude oil prices. Computers & Industrial Engineering, 63(3), 615–625.
https://doi.org/10.1016/j.cie.2011.07.021.
[206] Ji, Q., Zhang, H.-Y., & Geng, J.-B. (2018). What drives natural gas prices
in the United States? – A directed acyclic graph approach. Energy
Economics, 69, 79–88. https://doi.org/10.1016/j.eneco.2017.11.002.
[207] Maciejowska, K., & Weron, R. (2016). Short- and Mid-Term Forecasting of
Baseload Electricity Prices in the U.K.: The Impact of Intra-Day Price
Relationships and Market Fundamentals. IEEE Transactions on Power
Systems, 31(2), 994–1005. https://doi.org/10.1109/TPWRS.2015.2416433.
[208] Carta, J. A., Cabrera, P., Matías, J. M., & Castellano, F. (2015).
Comparison of feature selection methods using ANNs in MCP-wind speed
methods. A case study. Applied Energy, 158, 490–507.
https://doi.org/10.1016/j.apenergy.2015.08.102.
[209] Bengio, Y., Courville, A., & Vincent, P. (2014). Representation Learning:
A Review and New Perspectives. ArXiv:1206.5538 [Cs].
http://arxiv.org/abs/1206.5538.
[210] Conejo, A. J., Contreras, J., Espínola, R., & Plazas, M. A. (2005).
Forecasting electricity prices for a day-ahead pool-based electric energy
market. International Journal of Forecasting, 21, 435–462.
[211] Tan, Z., Zhang, J., Wang, J., & Xu, J. (2010). Day-ahead electricity price
forecasting using wavelet transform combined with ARIMA and GARCH
models. Applied Energy, 87(11), 3606–3610.
[212] Lago, J., Ridder, F. D., & Schutter, B. D. (2018). Forecasting spot
electricity prices: Deep learning approaches and empirical comparison of
traditional algorithms. Applied Energy, 221, 386–405.
[213] Singh, N., Mohanty, S.R., Shukla, R.D., 2017. Short term electricity price
forecast based on environmentally adapted generalized neuron. Energy 125,
127–139.
[214] Che, J., Wang, J., 2010. Short-term electricity prices forecasting based on
support vector regression and auto-regressive integrated moving average
modeling. Energy Conversion & Management 51 (10), 1911–1917.
[215] Catalao, J.P.S., Pousinho, H.M.I., Mendes, V.M.F. (2011). Hybrid
wavelet-PSO-ANFIS approach for short-term electricity prices forecasting.
IEEE Trans. Power Syst. 26 (1), 137–144.
[216] Stevenson, M. (2001). Filtering and forecasting spot electricity prices in the
increasingly deregulated Australian electricity market. QFRC research paper
series, no. 63. Quantitative Finance Research Centre, University of
Technology, Sydney.
[217] Cruz, A., Muñoz, A., Zamora, J., Espínola, R. (2011). The effect of wind
generation and week-day on Spanish electricity spot price forecasting. Electr.
Power Syst. Res. 81 (10), 1924–1935.
[218] Rodriguez, C.P., Anders, G.J., 2004a. Energy price forecasting in the
Ontario competitive power system market. IEEE Trans. Power Syst. 19,
366–374.
[219] Hong, Y., Wu, C., 2012. Day-ahead electricity price forecasting using a
hybrid principal component analysis network. Energies 5 (11), 4711–4725.
[220] Ludwig, N., Feuerriegel, S., & Neumann, D. (2015). Putting Big Data
analytics to work: Feature selection for forecasting electricity prices using the
LASSO and random forests. Journal of Decision Systems, 24(1), 19–36.
https://doi.org/10.1080/12460125.2015.994290.
[221] Ziel, F. (2016). Forecasting Electricity Spot Prices Using Lasso: On
Capturing the Autoregressive Intraday Structure. IEEE Transactions on
Power Systems, 31(6), 4977–4987.
https://doi.org/10.1109/TPWRS.2016.2521545.
[222] Shao, Z., Yang, S., Gao, F., Zhou, K., & Lin, P. (2017). A new electricity
price prediction strategy using mutual information-based SVM-RFE
classification. Renewable and Sustainable Energy Reviews, 70, 330–341.
https://doi.org/10.1016/j.rser.2016.11.155.
[223] Amjady, N., & Daraeepour, A. (2008). Day-ahead electricity price
forecasting using the relief algorithm and neural networks. 2008 5th
International Conference on the European Electricity Market, 1–7.
https://doi.org/10.1109/EEM.2008.4579109.
[224] Yang, W., Wang, J., Niu, T., & Du, P. (2019). A hybrid forecasting system
based on a dual decomposition strategy and multi-objective optimization for
electricity price forecasting. Applied Energy, 235, 1205–1225.
https://doi.org/10.1016/j.apenergy.2018.11.034.
[225] Abedinia, O., Amjady, N., & Zareipour, H. (2017). A New Feature
Selection Technique for Load and Price Forecast of Electrical Power Systems.
IEEE Transactions on Power Systems, 32(1), 62–74.
https://doi.org/10.1109/TPWRS.2016.2556620.
[226] Urro, C., Niciejewska, K. (2018). EPEX SPOT 2017/2018. EEX Group.
https://www.eex.com/blob/83774/5b3b05a55934b0970baba79d2d872deb/
20180628-workshop-4-epex-spot-2017-2018-data.pdf.
[227] Uniejewski, B., Marcjasz, G., & Weron, R. (2019). Understanding intraday
electricity markets: Variable selection and very short-term price forecasting
using LASSO. International Journal of Forecasting, 35(4), 1533–1547.
https://doi.org/10.1016/j.ijforecast.2019.02.001.
[228] Pape, C. (2017). The impact of intraday markets on the market value of
flexibility– decomposing effects on profile and the imbalance costs.
[229] Kulakov, S., & Ziel, F. (2019). The impact of renewable energy forecasts on
intraday electricity prices.
[230] Monteiro, C., Ramirez-Rosado, I., Fernandez-Jimenez, L., & Conde, P.
(2016). Short-Term Price Forecasting Models Based on Artificial Neural
Networks for Intraday Sessions in the Iberian Electricity Market. Energies,
9(9), 721. https://doi.org/10.3390/en9090721.
[231] Andrade, J., Filipe, J., Reis, M., & Bessa, R. (2017). Probabilistic Price
Forecasting for Day-Ahead and Intraday Markets: Beyond the Statistical
Model. Sustainability, 9(11), 1990. https://doi.org/10.3390/su9111990.
[232] Marcjasz, G., Uniejewski, B., & Weron, R. (2020). Beating the
Naïve—Combining LASSO with Naïve Intraday Electricity Price Forecasts.
Energies, 13(7), 1667. https://doi.org/10.3390/en13071667.
Maciejowska, K., Nitka, W., & Weron, T. (2019). Day-ahead vs.
Intraday—Forecasting the price spread to maximize economic benefits.
Energies, 12, 631.
[233] Satish, N., Sundaram, N., Patwary, M. M. A., Seo, J., Park, J., Hassaan,
M. A., Sengupta, S., Yin, Z., & Dubey, P. (2014). Navigating the maze of
graph analytics frameworks using massive graph datasets. SIGMOD 2014,
979–990.
[234] Needham, M., & Hodler, A. E. (2019). Graph Algorithms Practical
Examples in Apache Spark and Neo4j. O’Reilly.
[235] Cai, H., Zheng, V. W., & Chang, K. C.-C. (2018). A Comprehensive
Survey of Graph Embedding: Problems, Techniques and Applications.
ArXiv:1709.07604 [Cs]. http://arxiv.org/abs/1709.07604.
[236] Meier, J.-H., Schneider, S., Le, C., & Schmidt, I. (2020). Short-Term
Electricity Price Forecasting: Deep ANN vs GAM. In V. Ermolayev, F.
Mallet, V. Yakovyna, H. C. Mayr, & A. Spivakovsky (Eds.), Information and
Communication Technologies in Education, Research, and Industrial
Applications (Vol. 1175, pp. 257–276). Springer International Publishing.
https://doi.org/10.1007/978-3-030-39459-2_12.
[237] Montgomery, D., Peck, E. & Vining, G. (2012). Introduction to linear
regression analysis. Oxford: Wiley-Blackwell.
[238] Vidal-Naquet, M. & Ullman, S. (2003). Object recognition with informative
features and linear classification. Proceedings Ninth IEEE International
Conference on Computer Vision. doi: 10.1109/iccv.2003.1238356
[239] Lin, D. & Tang, D. (2006). Conditional infomax learning: An integrated
framework for feature extraction and fusion. ECCV’06: Proceedings of the 9th
European conference on Computer Vision - Volume Part I. doi:
10.1007/11744023_6.
[240] Yu, L. & Liu, H. (2003). Feature selection for high-dimensional data: A
fast correlation-based filter solution. Proceedings, Twentieth International
Conference on Machine Learning.
[241] Ma, Y. & Fu, Y. (eds.) (2012). Manifold learning theory and applications.
Hoboken: Taylor and Francis.
[242] Blockeel, H., Kersting, K., Nijssen, S. & Železný, F. (eds.) (2013). Machine
Learning and Knowledge Discovery in Databases: European Conference,
ECML PKDD 2013, Prague, Czech Republic, September 23-27, 2013,
Proceedings, Part III.
[243] Chung, F. R. K. (2009). Spectral graph theory. Providence, RI: Published
for the Conference Board of the mathematical sciences by the American
Mathematical Society.
[244] von Luxburg, U. (2007). A tutorial on spectral clustering. Statistics and
Computing, 17 (4). doi: 10.1007/s11222-007-9033-z
[245] Belkin, M. & Niyogi, P. (2002). Laplace eigenmaps and spectral techniques
for embedding and clustering. Advances in Neural Information Processing
Systems 14. doi: 10.7551/mitpress/1120.003.0080
[246] Shannon, C. E. & Weaver, W. (1999). The mathematical theory of
communication. Urbana: University of Illinois Press.
[247] Learned-Miller, E. (2013). Entropy and mutual information. Retrieved:
2020.05.29.
https://people.cs.umass.edu/~elm/Teaching/Docs/mutInf.pdf
[248] Nguyen, V., Epps, J. & Bailey, J. (2009). Information theoretic measures
for clusterings comparison: Is a correction for chance necessary? Proceedings

174
BIBLIOGRAPHY

of the 26th Annual International Conference on Machine Learning - ICML


09. doi: 10.1145/1553374.1553511
[249] He, X., Cai, D. & Nigoyi, P. (2005). Laplacian score for feature selection.
Retrieved: 2020.05.29. https://papers.nips.cc/paper/
2909-laplacian-score-for-feature-selection.pdf
[250] Parsa, M., Zare, H., & Ghatee, M. (2019). Unsupervised feature selection
based on adaptive similarity learning and subspace clustering. Retrieved:
2020.05.29. https://arxiv.org/pdf/1912.05458.pdf
[251] Cai, D., Zhang, C., & He, X. (2010). Unsupervised feature selection for
multi-cluster data. Proceedings of the 16th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining - KDD 10. doi:
10.1145/1835804.1835848
[252] Li, Z., Yang, Y., Liu, J., Zhou, X. & Lu, H. (2012). Unsupervised Feature
Selection Using Nonnegative Spectral Analysis. Proceedings of the
Twenty-Sixth AAAI Conference on Artificial Intelligence. Retrieved:
2020.05.29. http://www.nlpr.ia.ac.cn/2012papers/gjhy/gh27.pdf
[253] Li, Z. & Tang, J. (2015). Unsupervised Feature Selection via Nonnegative
Spectral Analysis and Redundancy Control. IEEE Transactions on Image
Processing, 24 (12). doi: 10.1109/tip.2015.2479560
[254] Dash, M. & Liu, H. (2010). Feature selection for clustering. Retrieved:
2020.05.29.
http://www.public.asu.edu/~huanliu/papers/pakdd00clu.pdf
[255] Yu, S. & Shi, J. (2010). Multiclass spectral clustering. Proceedings Ninth
IEEE International Conference on Computer Vision. doi:
10.1109/iccv.2003.1238361
[256] Wang, L. & Chen, S. (2013). `2,p -matrix norm and its application in feature
selection. Retrieved: 2020.05.29. https://arxiv.org/pdf/1303.3987.pdf

175
Appendices

APPENDIX A
Part I & II: Figures and tables

A.1 List of interviews


Interviewee Role
A Head of Hydro Operational Excellence
B Data scientist
C Power portfolio manager
D Short-term analysis manager
E Data scientist
F Senior technical expert in wind power
G Data scientist
H Data scientist
I Data scientist
J UX designer
K Business developer
L Market analyst
M Head of Hydro Operations
N Global head of Business Design (Neo4j)

Table A.1: List of interviewees.



A.2 Inventory of use-cases

N | Title | Graph concept | Author (Year)
1 | Arc-based constrained ant colony optimisation algorithms for the optimal solution of hydropower reservoir operation problems | Ant colony optimisation | Moeini R. (2011)
2 | A multi-time-scale power prediction model of hydropower station considering multiple uncertainties | Bayesian networks | Chen J. (2019)
3 | Scheduling the hydropower system of a municipal power grid considering multistage congestion problems | Depth-first search | Chen F. (2015)
4 | Application of an ant colony optimization algorithm for optimal operation of reservoirs: A comparative study of three proposed formulations | Shortest path | Moeini R. (2009)
5 | A decoupled solution of hydro-thermal optimal power flow problem by means of interior point method and network programming | Shortest path | Wei H. (1997)
6 | Multi-objective optimized scheduling for hydro-thermal power system | Euclidean distance | Liu W.J. (2015)
7 | The multi-object optimal dispatcher research of cascade hydropower stations based on MSM-SANGA | Euclidean distance | Zhao T. (2013)
8 | Optimize dispatchers of cascade hydropower station based on IS-NGA | Euclidean distance | Tinghong Z. (2013)
9 | Key technologies to design and develop optimized operation system for large-scale hydropower stations of China | Directed graph | Liao S. (2013)
10 | Key technologies to optimize operation system for large-scale hydropower stations in provincial power grid | Directed graph | Cheng C.-T. (2010)

Table A.2: Use-cases in Hydropower optimisation.



N | Title | Graph concept | Author (Year)
1 | Selection of features for analysis of reliability of performance in hydropower plants: a multi-criteria decision making approach | Feature selection | Majumder P. (2020)
2 | Outlier Detection for Hydropower Generation Plant | Feature selection | Ahmed I. (2018)
3 | Identification of Shaft Orbit Based on the Grey Wolf Optimizer and Extreme Learning Machine | Feature selection | Xiao J. (2018)
4 | Risk Analysis of Dam Overtopping for Cascade Reservoirs Based on Bayesian Network | Bayesian networks | Lin P. (2018)
5 | The logistics risk assessment studies of the large equipment of hydropower project based on three-dimensional structure | Minimum cut | Jia D.H. (2013)
6 | Bayesian network model for fault diagnosis of hydropower equipment | Bayesian networks | Zhang X.-D. (2006)
7 | Fault diagnosis model for hydropower generating unit based on directed acyclic graph support vector machine | Directed acyclic graph | Lan F. (2010)
8 | A hybrid multi-objective optimization model for vibration tendency prediction of hydropower generators | Feature selection | Zhou K.-B. (2019)
9 | Fault Warning of Hydro-Generator Based on Fisher Criterion | Feature selection | Gao G. (2018)
10 | Identification of shaft orbit of hydropower unit by simultaneous optimization of feature parameters and support vector machine based on hybrid artificial bee colony | Feature selection | Xiao J. (2013)

Table A.3: Use-cases in Hydropower operation and maintenance.



N | Title | Graph concept | Author (Year)
1 | Evaluation of vulnerable path: Using heuristic path-finding algorithm in physical protection system of nuclear power plant | Shortest path | Zou B. (2018)
2 | Evacuation Assisting Strategies in Vehicular Ad Hoc Networks | Shortest path | Pu C. (2018)
3 | Identification of Pivotal Causes and Spreaders in the Time-Varying Fault Propagation Model to Improve the Decision Making under Abnormal Situation | Centrality | Dong C.-L. (2016)
4 | An adaptive decision method using structure feature analysis on dynamic fault propagation model | Centrality | Dong C.-L. (2013)
5 | Enhanced graph-based fault diagnostic system for nuclear power plants | Signed directed graph | Liu Y.-K. (2019)
6 | Semisupervised classification for fault diagnosis in nuclear power plants | Spectral clustering | Jianping M. (2015)
7 | Fault Diagnosis of Nuclear Power Plant Based on Simplified Signed Directed Graph with Principal Component Analysis and Support Vector Machine | Directed graph | Xin M. (2019)
8 | Intelligent Maintenance Design of Nuclear Power System Based on PHM | Directed graph | Shi L. (2019)
9 | Unsupervised ensemble clustering for transients classification in a nuclear power plant turbine | Spectral clustering | Al-Dahidi S. (2015)
10 | Dynamic accident scenario generation, modeling and post-processing for the integrated deterministic and probabilistic safety analysis of nuclear power plants | Dynamic event tree | Di Maio F. (2018)

Table A.4: Use-cases in Nuclear power operation and maintenance.



N | Title | Graph concept | Author (Year)
1 | Identification of critical components of wind turbines using FTA over the time | Breadth-first search | Marquez F.P.G. (2016)
2 | A Hybrid Cable Connection Structure for Wind Farms with Reliability Consideration | Minimum spanning tree | Li J. (2019)
3 | Optimal Design for Offshore Wind Farm considering Inner Grid Layout and Offshore Substation Location | Minimum spanning tree | Shin J.-S. (2017)
4 | Optimal Cable Design of Wind Farms: The Infrastructure and Losses Cost Minimization Case | Minimum spanning tree | Cerveira A. (2016)
5 | New sequential approach based on graph traversal algorithm to investigate cascading outages considering correlated wind farms | Graph traversal | Mehdizadeh M. (2019)
6 | Heuristics-based design and optimization of offshore wind farms collection systems | Prim’s algorithm | Perez-Rua J.-A. (2019)
7 | A Two-steps Splitting Strategy Based On Laplacian Eigenmap Algorothm For Large Power Grid With Power Wind Connected | Spectral clustering | Qing M. (2019)
8 | A tight upper bound for quadratic knapsack problems in grid-based wind farm layout optimization | Knapsack problem | Quan N. (2018)
9 | Optimal wind farm collector system topology design considering total trenching length | Minimum spanning tree | Dutta S. (2012)
10 | Research on Collection System Optimal Design of Wind Farm with Obstacles | Prim’s algorithm | Huang W. (2017)

Table A.5: Use-cases in Wind power design, operation and maintenance.
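Several of the wind-farm use-cases above treat cable-layout design as a minimum spanning tree problem: connect every turbine with the least total cable length. A minimal sketch of Prim's algorithm over hypothetical turbine coordinates (the names and positions below are illustrative, not taken from any of the cited studies):

```python
import heapq
import math

# Hypothetical turbine coordinates (x, y) in metres.
turbines = {"T1": (0, 0), "T2": (120, 40), "T3": (60, 150), "T4": (200, 90)}

def prim_mst(points):
    """Prim's algorithm: MST edges minimising total Euclidean cable length."""
    start = next(iter(points))
    visited = {start}
    mst = []
    # Candidate edges from the visited set, keyed by length.
    heap = [(math.dist(points[start], points[v]), start, v)
            for v in points if v != start]
    heapq.heapify(heap)
    while heap and len(visited) < len(points):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # already connected via a cheaper edge
        visited.add(v)
        mst.append((u, v, w))
        for x in points:
            if x not in visited:
                heapq.heappush(heap, (math.dist(points[v], points[x]), v, x))
    return mst
```

On these four points the tree connects T1–T2, T2–T4 and T2–T3, for roughly 346 m of cable.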



N | Title | Graph concept | Author (Year)
1 | An efficient scheme for wireless charging of electric vehicles using RFID with an optimal path planning | Shortest path | Arora S. (2019)
2 | Path finding algorithm in consideration of EVs’ charging behavior on multi-agent based traffic simulation | Shortest path | Uchida H. (2017)
3 | Optimal energy/time routing in battery-powered vehicles | Shortest path | Faraj M. (2016)
4 | Route optimization for an electric vehicle with priority destinations | Breadth-first search | Raagapriya S. (2018)
5 | Breadth-first search-based remaining range estimation and representation for electric vehicle | Breadth-first search | Potarusov R. (2014)
6 | Dynamic EV Charging Pricing Methodology for Facilitating Renewable Energy with Consideration of Highway Traffic Flow | Shortest path | Zhou S. (2020)
7 | Electric vehicle travel planning with lazy evaluation of recharging times | Shortest path | Cuchy M. (2019)
8 | A Complex Network Approach for the Estimation of the Energy Demand of Electric Mobility | Centrality | Mureddu M. (2018)
9 | Critical behaviour in charging of electric vehicles | Max-flow | Carvalho R. (2015)
10 | An autonomous electric vehicle based charging system: Matching and charging strategy | Bipartite graph | Cao Z. (2018)

Table A.6: Use-cases for Electric vehicle applications.
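Most of the routing use-cases above reduce to single-source shortest paths. A minimal Dijkstra sketch on a toy road network (the node names and weights are hypothetical; in an EV setting the weights could encode energy consumption rather than distance):

```python
import heapq

# Hypothetical road network as an adjacency dict; weights could be
# metres, seconds or watt-hours depending on the application.
graph = {
    "depot":   {"A": 4, "B": 1},
    "A":       {"charger": 1},
    "B":       {"A": 2, "charger": 5},
    "charger": {},
}

def dijkstra(graph, source):
    """Cheapest cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a cheaper path was found earlier
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

From "depot" the cheapest route to "charger" goes via B and A, at a total cost of 4 rather than the direct 1 + 5.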



N | Title | Graph concept | Author (Year)
1 | Dynamic analysis on the topological properties of the complex network of international oil prices | Centrality | Chen W.-D. (2010)
2 | Co-movement of coherence between oil prices and the stock market from the joint time-frequency perspective | Complex network | Huang S. (2018)
3 | Reconstructing time series into a complex network to assess the evolution dynamics of the correlations among energy prices | Complex network | Fang W. (2018)
4 | Discovering hidden knowledge in carbon emissions data: A multilayer network approach | Multi-graph | Bhardwaj K. (2017)
5 | Price dynamics among U.S. electricity spot markets | Directed graph | Park H. (2006)
6 | Data-driven classification of residential energy consumption patterns by means of functional connectivity networks | Minimum spanning tree | Markovic R. (2019)
7 | Application of the weighted k-nearest neighbor algorithm for short-term load forecasting | Euclidean distance | Fan G.-F. (2019)
8 | A hybrid short-term load forecasting with a new input selection framework | Euclidean distance | Ghofrani M. (2015)
9 | Causal modeling and inference for electricity markets | Directed acyclic graph | Ferkingstad E. (2011)
10 | Forecasting method study on chaotic load series with high embedded dimension | Euclidean distance | Jiang C. (2005)

Table A.7: Use-cases in Energy trading.
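Several of the forecasting use-cases above use Euclidean distance between historical feature vectors, for instance in k-nearest-neighbour load forecasting. A minimal (unweighted) k-NN sketch; the feature choice and all values below are hypothetical, not taken from the cited studies:

```python
import math

# Hypothetical history of (temperature in degrees C, hour of day) -> load in MW.
history = [((0.0, 0.0), 10.0), ((0.0, 1.0), 12.0),
           ((10.0, 10.0), 100.0), ((10.0, 11.0), 110.0)]

def knn_forecast(history, query, k=3):
    """Predict load as the mean over the k historically nearest feature vectors."""
    nearest = sorted(history, key=lambda item: math.dist(item[0], query))[:k]
    return sum(load for _, load in nearest) / len(nearest)
```

A weighted variant, as in use-case 7, would weight each neighbour's load by the inverse of its distance instead of taking a plain mean.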



N | Title | Graph concept | Author (Year)
1 | Consensus design for heterogeneous battery energy storage systems with droop control considering geographical factor | Minimum spanning tree | Zhang Y. (2020)
2 | Directed-Graph-Observer-Based Model-Free Cooperative Sliding Mode Control for Distributed Energy Storage Systems in DC Microgrid | Directed graph | Xu D. (2020)
3 | Optimal Control of Energy Storage Devices Based on Pontryagin’s Minimum Principle and the Shortest Path Method | Shortest path | Zargari N. (2019)
4 | Optimal positioning of storage systems in microgrids based on complex networks centrality measures | Centrality | Korjani S. (2018)
5 | Modeling, Control, and Simulation of a New Topology of Flywheel Energy Storage Systems in Microgrids | Shortest path | Saleh A. (2019)
6 | Remaining energy estimation for lithium-ion batteries via Gaussian mixture and Markov models for future load prediction | Clustering | Niri M.F. (2020)
7 | Development of operational strategies of energy storage system using classification of customer load profiles under time-of-use tariffs in South Korea | Clustering | Jeong H.C. (2020)
8 | Markov Model-Based Energy Storage System Planning in Power Systems | Clustering | Hong Y.-Y. (2019)
9 | Residential battery sizing model using net meter energy data clustering | Clustering | Tang R. (2019)
10 | Collaboration strategy and optimization model of wind farm-hybrid energy storage system for mitigating wind curtailment | Similarity | Liu J. (2019)

Table A.8: Use-cases in Energy storage.



A.3 Evaluation tool


A.3.1 Graph applicability
Measures the ability of graphs to model the system in each cluster, and their
relevance and efficiency in solving the defined problems.
Underlying graph structure
Can the system be modelled with nodes and edges?
0: There is no physical or abstract connection between the elements
1: There is an abstract connection between the elements in the system, but there
is some level of customisation necessary to construct the graph.
2: There is a clear way to connect the elements studied in the system, either in
an abstract or a physical sense, or both and clear graph types can be identified.

Richness of relationships
How much information is stored in the relationships between the elements of the
system?
0: The system components have no concrete or theoretical influence on each
other.
1: System components a priori impact each other, but in a way that is abstract
or not yet perceivable.
2: There is a high degree of connection between the system components, essential
to capture to solve the problems.

Identified concepts and algorithms
Can specific graph-theoretic algorithms be identified to solve the use-case
cluster’s needs?
0: Few or no algorithms found.
1: Some algorithms found but without a proven marginal benefit identified.
2: Some algorithms have a demonstrable marginal benefit.

Availability of supporting use-cases
How many benchmark use-cases are available to get inspiration or information
from?
0: Benchmarking possibilities are low and previous use-cases sparse, giving
little proof of the relevance of a graph-based solution.
1: There are a few benchmark use-cases to go from, more or less directly
connected to graph theory.


2: There is a considerable number of moderately to highly relevant previous
studies regarding this topic, confirming the relevance of a graph-based solution.

A.3.2 Technical feasibility


This covers the technical aspects of implementing a graph-based application in
each cluster.

Model setup
How resource-light can the setup of a proof of concept be, in terms of time,
money and competence?
0: A proof of concept cannot easily be created, given the diversity of the
competences and additional tools needed.
1: A proof of concept is possible to build within a short time frame but will need
some level of support from experts.
2: A proof of concept can be quickly set up, without need for additional support.

Computational constraints
How critical is it for the application to yield results quickly?
0: High computational constraint (seconds to minutes), typically online
applications.
1: Medium computational constraint (minutes to hours).
2: Low computational constraint (days).

Risk
How impactful would an error in the graph-based model be in terms of social,
technical or economic aspects?
0: The model’s output would potentially be central in the decision-making
process and small errors would have great consequences with low to no
risk-mitigation possibilities, such as human verification (usually the case with
prescriptive models).
1: Errors in the model would have a moderately to large impact, which can
however be mitigated to some extent.
2: Errors in the model would have relatively low impact.

A.3.3 Economic potential


Potential economic impact of applying a graph-based solution in each cluster.

Sector size
What is the economic size of the sector in the markets within which the company
is operating?
0: Small
1: Medium
2: Big

Sector growth
How fast is the sector corresponding to each cluster growing in the markets the
company is operating in?
0: Slow
1: Medium
2: Fast

Substitute solutions
Is there a gap in available technology to solve the problem at hand?
0: High industry maturity and consolidation, with established tools of proven
performance.
1: Good amount of available applications, but no dominant design established.
2: Low level of maturity, low availability of commercial applications and in-house
development necessary.

Scalability
To what extent can a pilot be generalised to more applications, either within the
same vertical or in other domains?
0: “One-shot” application, very specific and contained to the cluster.
1: A pilot could be replicated to other applications but would require significant
adaptations.
2: High level of modularity and deployability of graph application within the
cluster or in other clusters with minimal adaptation.

A.3.4 Workability
How well do the clusters’ solutions align with the perceived problems and
available resources?
Relevance
How central is the problem at hand in the sector’s operations, either from an IT
or a mathematical perspective?
0: The potential application would solve a need only of secondary nature for the
business.
1: The potential application is directly in line with the activities within the
sector.
2: A solution to the pain-point is of critical importance and could bring a high
competitive advantage for the company in this sector.

Data alignment
How well aligned is the data with a graph-based solution, in terms of quantity,
quality and availability?
0: Data too sparse or not ready to be extracted for analyses.
1: Smaller datasets potentially available and could be analysed with few
restrictions.
2: Large datasets readily available.

Human alignment
How well aligned are the competences with the potential graph-based application, in
terms of quantity, quality and availability?
0: There is little or no relevant expertise available in this domain in-house and
making it available would be subject to managerial decisions.
1: A certain level of competence potentially available in-house, requiring minor
organisational changes.
2: Competence in this sector is readily available and can potentially support the
establishment of the model.

Integrability in technical environment
Is there low friction between a potential graph-based application and the
surrounding technology stack?
0: A graph-based application has redundancy or friction with existing
applications and an integration would require a prohibitive amount of
additional work.
1: A graph-based application could be plugged into the technological stack.
2: A graph-based application would empower the technological stack of the
company within the use-case cluster or could be implemented as a stand-alone.
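The rubric above scores each use-case cluster 0, 1 or 2 per criterion. A minimal sketch of how such scores could be averaged per dimension; the example scores are hypothetical and the unweighted mean is one possible aggregation, not necessarily the one used in the thesis:

```python
# Criteria grouped by dimension, following the structure of Appendix A.3.
dimensions = {
    "graph_applicability": ["underlying_structure", "richness", "algorithms", "use_cases"],
    "technical_feasibility": ["model_setup", "computational", "risk"],
    "economic_potential": ["sector_size", "sector_growth", "substitutes", "scalability"],
    "workability": ["relevance", "data", "human", "integrability"],
}

# Hypothetical scores (0, 1 or 2) for a single cluster.
scores = {"underlying_structure": 2, "richness": 2, "algorithms": 1, "use_cases": 2,
          "model_setup": 1, "computational": 1, "risk": 2,
          "sector_size": 2, "sector_growth": 1, "substitutes": 1, "scalability": 2,
          "relevance": 2, "data": 1, "human": 1, "integrability": 1}

def dimension_averages(dimensions, scores):
    """Mean score per dimension, on the rubric's 0-2 scale."""
    return {dim: sum(scores[c] for c in crits) / len(crits)
            for dim, crits in dimensions.items()}
```

Clusters can then be compared dimension by dimension, or via a weighted total if some dimensions matter more to the company.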
APPENDIX B
Part III: Figures and tables

B.1 Experiments and results


ALLAML BASEHOCK GLIOMA ISOLET
BL 66.46 ± 6.76 50.17 ± 0.14 58.64 ± 3.79 85.00 ± 12.99
MV 73.68 ± 4.56 (250) 50.22 ± 0.13 (150) 52.31 ± 5.89 (300) 80.99 ± 9.78 (200)
LS 57.36 ± 8.90 (200) 50.25 ± 0.21 (50) 50.74 ± 5.28 (300) 79.79 ± 9.28 (150)
MCFS 71.53 ± 3.36 (200) 50.52 ± 0.82 (250) 49.36 ± 6.47 (300) 86.61 ± 11.12 (250)
NDFS 98.61 ± 0.00 (300) 51.67 ± 1.93 (250) 61.13 ± 4.66 (300) 83.65 ± 10.15 (300)
NSCR 97.64 ± 0.65 (300) 51.48 ± 1.71 (250) 60.39 ± 9.39 (300) 87.33 ± 11.21 (250)
SCFS 96.39 ± 8.00 (250) 52.87 ± 0.32 (100) 60.92 ± 8.72 (150) 92.96 ± 5.54 (250)
% 32.15 2.70 2.49 7.96

LYMPHOMA NCI9 PIE10P PROSTATE


BL 54.53 ± 8.06 38.83 ± 4.56 25.93 ± 2.40 58.38 ± 0.50
MV 55.10 ± 9.62 (200) 40.58 ± 3.56 (250) 28.88 ± 2.67 (300) 62.35 ± 0.49 (200)
LS 53.85 ± 9.02 (250) 38.67 ± 3.32 (50) 29.43 ± 2.09 (50) 56.86 ± 0.00 (300)
MCFS 54.95 ± 11.25 (300) 43.83 ± 3.63 (100) 37.31 ± 3.45 (50) 61.86 ± 0.3 (50)
NDFS 61.41 ± 8.82 (200) 63.33 ± 6.73 (150) 40.90 ± 2.62 (50) 58.82 ± 0.00 (100)
NSCR 65.63 ± 11.32 (200) 63.58 ± 6.20 (100) 31.57 ± 2.68 (50) 78.38 ± 15.21 (50)
SCFS 62.19 ± 8.87 (250) 55.00 ± 5.90 (300) 41.81 ± 4.62 (50) 66.18 ± 12.16 (50)
% 11.10 24.75 15.88 20.00

Table B.1: Clustering accuracy (ACC) [%] corresponding to different data sets
and feature selection methods.

ALLAML BASEHOCK GLIOMA ISOLET


BL 9.30 ± 4.75 1.28 ± 0.70 49.60 ± 5.64 82.56 ± 12.90
MV 15.76 ± 5.45 (250) 1.66 ± 0.71 (100) 27.39 ± 7.90 (300) 73.74 ± 6.13 (200)
LS 11.70 ± 4.35 (50) 1.72 ± 0.60 (250) 32.41 ± 5.45 (300) 71.95 ± 8.81 (200)
MCFS 13.39 ± 3.27 (100) 1.74 ± 1.28 (150) 25.50 ± 6.04 (250) 79.26 ± 8.98 (250)
NDFS 90.19 ± 0.00 (300) 2.96 ± 0.90 (150) 51.12 ± 5.80 (100) 76.38 ± 9.63 (300)
NSCR 83.45 ± 4.53 (300) 2.93 ± 0.95 (150) 41.99 ± 11.27 (300) 78.87 ± 8.66 (250)
SCFS 82.88 ± 19.95 (250) 2.53 ± 1.02 (250) 41.27 ± 8.48 (150) 83.04 ± 3.97 (250)
% 80.89 1.68 1.52 0.48

LYMPHOMA NCI9 PIE10P PROSTATE


BL 62.43 ± 5.67 40.54 ± 4.56 24.86 ± 3.84 2.24 ± 0.36
MV 65.53 ± 5.07 (250) 42.63 ± 3.34 (250) 25.56 ± 2.19 (300) 6.34 ± 0.19 (200)
LS 62.85 ± 4.51 (200) 39.04 ± 2.98 (50) 28.24 ± 3.41 (200) 1.34 ± 0.00 (300)
MCFS 63.91 ± 4.43 (150) 45.21 ± 3.50 (250) 45.42 ± 3.85 (50) 5.06 ± 0.75 (50)
NDFS 72.17 ± 5.41 (200) 64.74 ± 3.82 (200) 48.26 ± 2.55 (50) 7.42 ± 0.89 (50)
NSCR 74.21 ± 5.56 (200) 65.18 ± 5.07 (250) 34.84 ± 3.30 (50) 35.55 ± 24.73 (50)
SCFS 72.82 ± 4.07 (200) 55.98 ± 6.06 (300) 48.57 ± 4.44 (50) 18.21 ± 14.36 (50)
% 11.78 24.64 23.71 33.31

Table B.2: Normalised mutual information (NMI) [%] corresponding to different
data sets and feature selection methods.

ALLAML BASEHOCK GLIOMA ISOLET


BL 7.45 ± 5.22 1.28 ± 0.70 42.86 ± 4.85 81.92 ± 13.32
MV 14.08 ± 6.46 (250) 1.66 ± 0.71 (100) 20.66 ± 8.69 (300) 73.36 ± 6.52 (200)
LS 7.87 ± 3.46 (50) 1.72 ± 0.60 (250) 26.08 ± 6.23 (300) 71.40 ± 9.04 (200)
MCFS 12.00 ± 3.33 (100) 1.74 ± 1.28 (250) 18.51 ± 6.55 (250) 78.42 ± 9.31 (250)
NDFS 89.43 ± 0.00 (300) 2.96 ± 0.90 (150) 44.79 ± 5.27 (100) 75.42 ± 9.86 (300)
NSCR 83.07 ± 4.27 (300) 2.93 ± 0.95 (150) 35.83 ± 12.51 (300) 77.58 ± 9.62 (250)
SCFS 82.29 ± 20.10 (250) 2.53 ± 1.02 (250) 35.50 ± 9.47 (150) 82.44 ± 4.06 (250)
% 75.62 1.68 1.93 0.52

LYMPHOMA NCI9 PIE10P PROSTATE


BL 49.32 ± 6.76 18.00 ± 5.72 15.82 ± 4.29 1.43 ± 0.32
MV 52.04 ± 5.82 (250) 20.28 ± 4.41 (250) 16.59 ± 2.26 (300) 4.99 ± 0.26 (200)
LS 48.68 ± 4.82 (200) 15.87 ± 3.66 (50) 19.50 ± 3.79 (200) 0.62 ± 0.00 (300)
MCFS 50.36 ± 4.92 (150) 23.75 ± 4.80 (250) 37.31 ± 4.45 (50) 4.01 ± 0.53 (50)
NDFS 59.76 ± 6.67 (300) 50.14 ± 5.25 (200) 41.24 ± 3.03 (50) 4.52 ± 0.96 (50)
NSCR 62.01 ± 7.11 (200) 50.22 ± 6.88 (250) 26.68 ± 3.81 (50) 34.53 ± 25.50 (50)
SCFS 61.01 ± 4.51 (200) 38.21 ± 8.27 (300) 41.45 ± 4.94 (50) 15.19 ± 15.65 (50)
% 12.69 32.22 25.63 33.10

Table B.3: Adjusted mutual information (AMI) [%] corresponding to different
data sets and feature selection methods.
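The scores in Tables B.1–B.3 are clustering accuracy and (normalised or adjusted) mutual information. A stdlib-only sketch of ACC and NMI; library implementations such as scikit-learn's are preferable in practice, and the permutation-based accuracy below assumes the predicted labelling has at most as many clusters as the ground truth and is only tractable for a handful of clusters:

```python
import itertools
import math
from collections import Counter

def clustering_accuracy(true, pred):
    """Best agreement over all one-to-one relabellings of the predicted clusters."""
    p_labels, t_labels = sorted(set(pred)), sorted(set(true))
    best = 0
    for perm in itertools.permutations(t_labels, len(p_labels)):
        mapping = dict(zip(p_labels, perm))
        best = max(best, sum(mapping[p] == t for t, p in zip(true, pred)))
    return best / len(true)

def _entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

def nmi(a, b):
    """NMI with geometric-mean normalisation; other normalisations
    (min, max, arithmetic mean) are also common in the literature."""
    n = len(a)
    ca, cb = Counter(a), Counter(b)
    mi = sum(c / n * math.log(n * c / (ca[x] * cb[y]))
             for (x, y), c in Counter(zip(a, b)).items())
    denom = math.sqrt(_entropy(a) * _entropy(b))
    return mi / denom if denom else 1.0
```

Note that ACC is invariant to relabelling the clusters, which is why both metrics score a perfectly inverted labelling as 1.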
Figure B.1: Jaccard index for eight different data sets: (a) ALLAML,
(b) BASEHOCK, (c) GLIOMA, (d) ISOLET, (e) LYMPHOMA, (f) NCI9, (g) PIE10P,
(h) PROSTATE.



N | ACC BL | ACC SCFS | NMI BL | NMI SCFS | AMI BL | AMI SCFS
1 | 78.31 ± 12.38 | 90.29 ± 9.01 (200) | 71.90 ± 15.07 | 87.55 ± 7.08 (200) | 71.18 ± 15.39 | 87.26 ± 6.98 (200)
2 | 87.57 ± 13.30 | 89.50 ± 13.77 (175) | 85.11 ± 14.30 | 87.94 ± 13.07 (175) | 84.88 ± 14.57 | 87.82 ± 13.25 (175)
3 | 78.22 ± 13.3 | 87.02 ± 11.78 (100) | 74.33 ± 14.73 | 81.75 ± 10.62 (100) | 73.23 ± 15.29 | 80.73 ± 11.57 (100)
4 | 81.68 ± 16.25 | 93.62 ± 5.63 (100) | 78.01 ± 17.19 | 86.89 ± 2.24 (100) | 77.73 ± 17.43 | 86.85 ± 2.24 (100)
5 | 73.53 ± 8.15 | 89.52 ± 11.24 (100) | 67.3 ± 9.54 | 84.2 ± 9.7 (100) | 66.1 ± 9.39 | 83.63 ± 10.44 (100)
6 | 76.14 ± 14.17 | 86.7 ± 9.87 (125) | 74.98 ± 14.02 | 82.65 ± 7.95 (125) | 74.43 ± 14.29 | 82.28 ± 8.11 (125)
7 | 83.15 ± 13.69 | 95.99 ± 8.13 (100) | 78.16 ± 16.05 | 91.96 ± 8.66 (100) | 77.95 ± 16.14 | 91.94 ± 8.65 (100)
8 | 66.66 ± 11.66 | 79.93 ± 14.82 (200) | 63.23 ± 9.68 | 72.73 ± 14.04 (200) | 62.90 ± 9.74 | 72.34 ± 14.14 (200)
9 | 74.70 ± 9.55 | 81.54 ± 8.85 (50) | 73.48 ± 8.68 | 76.4 ± 11.02 (100) | 73.48 ± 8.68 | 76.40 ± 11.05 (100)
10 | 66.36 ± 7.63 | 79.55 ± 9.73 (50) | 54.34 ± 7.33 | 68.68 ± 6.37 (50) | 53.72 ± 7.27 | 68.65 ± 6.36 (50)
11 | 89.14 ± 8.05 | 90.85 ± 9.26 (125) | 80.49 ± 12.83 | 82.86 ± 12.95 (175) | 80.26 ± 12.95 | 82.74 ± 13.02 (175)
12 | 83.02 ± 10.23 | 86.95 ± 9.95 (150) | 73.27 ± 11.41 | 78.95 ± 9.23 (150) | 72.71 ± 11.86 | 78.61 ± 9.64 (150)
13 | 65.48 ± 7.51 | 78.11 ± 8.69 (75) | 61.13 ± 8.81 | 73.72 ± 5.85 (75) | 58.87 ± 8.69 | 71.74 ± 6.85 (75)
14 | 56.26 ± 7.74 | 63.71 ± 10.28 (75) | 54.5 ± 6.86 | 59.06 ± 11.37 (100) | 53.91 ± 7.08 | 58.56 ± 7.53 (75)
15 | 63.09 ± 9.41 | 72.81 ± 16.26 (200) | 59.75 ± 9.1 | 68.31 ± 14.58 (200) | 58.88 ± 9.12 | 67.52 ± 15.07 (200)
16 | 73.04 ± 12.38 | 78.19 ± 10.36 (150) | 66.23 ± 12.5 | 74.13 ± 7.08 (50) | 65.68 ± 12.62 | 73.45 ± 7.18 (50)
17 | 76.39 ± 11.18 | 83.46 ± 8.78 (175) | 67.88 ± 13.07 | 75.65 ± 10.54 (175) | 66.67 ± 13.32 | 74.49 ± 11.12 (175)
18 | 64.27 ± 12.99 | 74.85 ± 10.68 (75) | 56.46 ± 14.02 | 63.49 ± 13.21 (125) | 55.97 ± 14.18 | 63.11 ± 13.26 (125)
19 | 72.34 ± 11.78 | 86.57 ± 13.19 (125) | 64.71 ± 13.77 | 79.76 ± 12.66 (125) | 64.27 ± 13.92 | 79.37 ± 13.09 (125)
20 | 64.21 ± 9.99 | 76.31 ± 11.47 (100) | 53.97 ± 10.24 | 69.17 ± 9.16 (100) | 53.56 ± 10.51 | 68.58 ± 9.39 (100)
21 | 79.68 ± 11.27 | 86.16 ± 8.62 (125) | 78.22 ± 11.06 | 80.84 ± 8.4 (125) | 77.03 ± 11.5 | 80.21 ± 9.12 (125)
22 | 67.03 ± 5.16 | 80.33 ± 10.95 (50) | 64.65 ± 7.75 | 77.26 ± 5.83 (75) | 64.65 ± 7.75 | 77.26 ± 5.83 (75)
23 | 56.24 ± 10.82 | 71.39 ± 8.77 (75) | 51.26 ± 12.14 | 69.04 ± 8.31 (50) | 50.38 ± 11.95 | 68.01 ± 8.43 (50)
24 | 80.39 ± 14.64 | 86.75 ± 11.71 (150) | 74.06 ± 17.8 | 81.00 ± 13.52 (150) | 73.36 ± 18.21 | 80.29 ± 14.03 (150)
25 | 79.48 ± 12.69 | 89.12 ± 8.02 (100) | 75.71 ± 9.79 | 82.49 ± 8.20 (200) | 75.32 ± 10.09 | 82.32 ± 8.32 (200)
26 | 81.11 ± 11.61 | 85.99 ± 11.4 (175) | 72.28 ± 12.98 | 77.89 ± 14.08 (175) | 72.23 ± 13.02 | 77.89 ± 14.08 (175)
27 | 73.49 ± 11.87 | 76.59 ± 15.04 (200) | 67.06 ± 11.67 | 71.81 ± 15.05 (200) | 66.58 ± 11.93 | 71.15 ± 15.19 (200)
28 | 79.70 ± 9.45 | 86.10 ± 11.28 (200) | 76.91 ± 9.77 | 83.63 ± 10.67 (200) | 75.86 ± 10.19 | 83.06 ± 11.11 (200)
29 | 85.01 ± 9.31 | 88.20 ± 9.39 (200) | 79.78 ± 10.37 | 84.21 ± 9.65 (200) | 79.69 ± 10.37 | 84.09 ± 9.68 (200)
30 | 77.7 ± 6.12 | 80.11 ± 5.92 (125) | 76.43 ± 7.72 | 79.22 ± 4.84 (125) | 74.51 ± 7.95 | 77.64 ± 4.86 (125)
31 | 80.76 ± 10.58 | 87.69 ± 10.27 (100) | 77.58 ± 10.98 | 86.37 ± 10.75 (150) | 77 ± 11.48 | 86.16 ± 11.02 (150)
32 | 81.92 ± 13.84 | 85.18 ± 13.63 (200) | 76.45 ± 14.95 | 80.09 ± 14.36 (175) | 75.48 ± 15.76 | 79.05 ± 15.35 (175)
33 | 81.07 ± 12.38 | 92.95 ± 8.2 (175) | 74.36 ± 14.42 | 89.44 ± 9.58 (175) | 73.92 ± 14.74 | 89.39 ± 9.69 (175)
34 | 71.42 ± 11.91 | 81.97 ± 11.79 (50) | 65.24 ± 9.51 | 76.09 ± 6.82 (100) | 63.96 ± 9.32 | 74.87 ± 6.16 (100)
35 | 79.94 ± 12.04 | 91.34 ± 8.56 (100) | 70.24 ± 14.22 | 83.67 ± 4.49 (100) | 69.81 ± 14.38 | 83.29 ± 4.65 (100)
36 | 63.81 ± 8.96 | 72.98 ± 12.32 (100) | 67.29 ± 6.46 | 74.01 ± 10.08 (75) | 65.97 ± 6.29 | 73.16 ± 10.63 (75)
37 | 83.72 ± 12.63 | 89.79 ± 10.06 (75) | 77.72 ± 12.00 | 81.82 ± 9.95 (125) | 77.52 ± 12.24 | 81.68 ± 10.17 (125)
38 | 68.18 ± 6.90 | 70.73 ± 9.27 (175) | 63.83 ± 7.30 | 67.05 ± 4.06 (150) | 63.83 ± 7.30 | 67.05 ± 4.06 (150)
39 | 83.55 ± 13.28 | 92.53 ± 12.07 (175) | 76.04 ± 16.80 | 88.93 ± 13.60 (175) | 75.67 ± 16.96 | 88.85 ± 13.73 (175)
40 | 65.09 ± 6.60 | 78.32 ± 7.19 (100) | 60.95 ± 5.04 | 71.71 ± 6.74 (125) | 60.81 ± 4.99 | 71.40 ± 6.60 (125)
41 | 74.91 ± 11.58 | 82.57 ± 14.29 (125) | 68.96 ± 11.60 | 77.33 ± 11.72 (125) | 68.42 ± 11.56 | 77.05 ± 11.89 (125)
42 | 91.20 ± 13.68 | 95.10 ± 5.20 (75) | 90.59 ± 11.31 | 92.76 ± 10.27 (200) | 90.24 ± 11.81 | 92.45 ± 10.80 (200)
43 | 77.12 ± 11.69 | 87.27 ± 13.67 (200) | 75.64 ± 10.41 | 85.90 ± 13.41 (200) | 74.46 ± 11.16 | 85.48 ± 13.89 (200)
44 | 77.13 ± 12.32 | 83.22 ± 10.92 (150) | 71.81 ± 10.70 | 79.08 ± 11.01 (150) | 71.36 ± 10.73 | 78.69 ± 11.15 (150)
45 | 70.52 ± 16.56 | 76.72 ± 13.59 (100) | 68.4 ± 15.39 | 77.59 ± 10.51 (100) | 67.90 ± 15.65 | 76.95 ± 10.70 (100)
46 | 74.54 ± 8.73 | 84.34 ± 11.36 (125) | 75.98 ± 5.11 | 82.68 ± 8.46 (125) | 75.58 ± 5.11 | 82.12 ± 8.52 (125)
47 | 69.41 ± 10.29 | 74.14 ± 14.27 (125) | 67.35 ± 8.85 | 71.21 ± 11.8 (125) | 65.79 ± 9.33 | 70.16 ± 12.38 (125)
48 | 76.76 ± 12.05 | 82.23 ± 12.14 (75) | 70.38 ± 14.17 | 76.79 ± 13.03 (75) | 69.56 ± 14.42 | 75.75 ± 13.78 (75)

Table B.4: Clustering quality measures for feature selection of CELEBI. The first
column shows the indices of the subsets.

B.2 Discussion

Figure B.2: Relative change of the objective value in iterative methods for
eight data sets: (a) ALLAML, (b) BASEHOCK, (c) GLIOMA, (d) ISOLET,
(e) LYMPHOMA, (f) NCI9, (g) PIE10P, (h) PROSTATE. A mark ζ on the vertical
axis means that the relative change is 10^ζ.
Figure B.3: Clustering quality for different α and β, obtained with the
features selected using SCFS, for eight data sets: (a) ALLAML, (b) BASEHOCK,
(c) GLIOMA, (d) ISOLET, (e) LYMPHOMA, (f) NCI9, (g) PIE10P, (h) PROSTATE.
A mark ζ on the axes for α and β corresponds to a value of 10^ζ.
APPENDIX C
Glossary

accuracy – noggrannhet
adjacency matrix – grannmatris
adjusted – justerad
affinity – affinitet
arc – båge
betweenness centrality – mellancentralitet
bipartite graph – bipartit graf
centrality – centralitet
chromatic index – kromatiskt index
chromatic number – kromatiskt tal
circuit – krets
closeness centrality – närhetscentralitet
cluster – kluster
clustering – klustring
colouring – färgning
complete graph – komplett graf
connected – sammanhängande
connectivity – konnektivitet
cut – snitt
cycle – cykel
degree, valency – grad, valens
degree centrality – gradcentralitet
directed edge – riktad kant
directed graph – riktad graf
edge – kant
edge colouring – kantfärgning
eigenvector centrality, eigencentrality – egenvektorscentralitet, egencentralitet
embed – inbädda
Euclidean distance – euklidiskt avstånd
Eulerian circuit – Eulerkrets
Eulerian trail – Eulerstig
feature selection – variabelselektering
flow – flöde
forecasting – prognostisering
graph, subgraph – graf, delgraf
graph theory – grafteori
Hamiltonian cycle – Hamiltoncykel
Hamiltonian path – Hamiltonväg
heat kernel – värmekärna
Laplacian matrix – Laplacematris
manifold – mångfald
matching – matchning
measure – mått
metric – metrik
minimum spanning tree – minsta (upp)spännande träd
mutual information – ömsesidig information
neighbour – granne
neighbourhood – omgivning
node, vertex – nod, hörn
non-connected – icke-sammanhängande
norm – norm
normalised – normerad
overfitting – överanpassning
path – väg
percentage point – procentenhet
perfect matching – fullständig matchning
planar graph – planär graf
preserve – bevara
ranking – rangordning
regular graph – reguljär graf
regularisation – regularisering
relaxation – relaxering
set, subset – mängd, delmängd
shortest path – kortaste väg
similarity – similäritet
spanning tree – (upp)spännande träd
sparse – gles
spectral graph theory – spektralgrafteori
spectrum – spektrum
subspace – delrum, underrum
threshold – tröskel
trace – spår
tractable – traktabel, hanterbar
trail – stig
tree – träd
vertex colouring – hörnfärgning
walk – vandring
weight – vikt
weighted graph – viktad graf
TRITA ITM-EX 2020:343

www.kth.se
