
Bachelor Degree Project

Securing electronic health records


- A blockchain solution.

Author: Oscar Andersson


Supervisor: Hemant Ghayvat
Semester: VT 2021
Subject: Computer Science
Abstract

Blockchain is a rapidly developing technology, with new projects emerging every day
since it first came to light in 2008. A growing body of research finds blockchain
useful in several sectors. One of these sectors is healthcare, specifically
electronic health records (EHR). EHR contain highly sensitive data that is critical
to protect, yet in 2019 alone 41,232,527 records were reported stolen.
Blockchain can provide several benefits for EHR, such as increased security,
availability, and privacy; however, it needs to be applied correctly. Because
blockchain is a rather novel technology, there is room for improvement in
integrating it with EHR. In this thesis, a framework for EHR in the
healthcare sector is proposed, using Ethereum-based smart contracts together with
decentralized off-chain storage using the InterPlanetary File System (IPFS) and strong
symmetric encryption. The framework secures the records and provides a scalable
solution. Furthermore, the thesis discusses and evaluates several security aspects
in which the framework excels, as well as aspects in which it could be improved.

Keywords: Electronic health records, blockchain, Ethereum, smart contracts,
InterPlanetary File System
Preface

I would like to thank everybody around me for supporting me and cheering me on as I
worked day and night on this thesis. Furthermore, I would like to give a huge thanks
to the course coordinator, Daniel Toll, for giving great support through the
difficulties during this thesis.
Contents

1 Introduction
1.1 Background
1.2 Related work
1.3 Problem formulation
1.4 Motivation
1.5 Objectives
1.6 Scope/Limitation
1.7 Target group
1.8 Outline

2 Method
2.1 Design Science
2.1.1 Step 1 - problem identification and motivation
2.1.2 Step 2 - define the objectives for a solution
2.1.3 Step 3-4 - design, development and demonstration
2.1.4 Step 5 - evaluation
2.1.5 Step 6 - communication
2.2 Reliability and Validity
2.3 Ethical considerations

3 Theoretical Background
3.1 Blockchain
3.2 Ethereum
3.3 Smart Contracts
3.4 IPFS
3.5 State of the art
3.6 Evaluation of state of the art
3.7 Blockchain for finance vs healthcare

4 Implementation
4.1 Requirements
4.2 Technology Stack
4.3 System architecture
4.4 System Design
4.4.1 Contract layer
4.4.2 Blockchain layer
4.4.3 Data layer
4.4.4 UI layer

5 Evaluation & Discussion
5.1 Research questions
5.2 Requirements
5.3 Challenges
5.3.1 Confidentiality challenges
5.3.2 Privacy challenges
5.4 System comparison

6 Conclusion
6.1 Future Work

References
1 Introduction

The healthcare domain deals with massive amounts of data daily, parts of which are highly
sensitive, such as medical histories, diagnoses, vital signs, medications, and much
more [1]. With the tremendous advance in technology, the healthcare sector has
widely embraced electronic health records (EHR). By 2017, 96% of all non-federal
acute hospitals in the US had adopted EHR [2]. EHR together with well-established
health information exchange (HIE) systems provide benefits such as reduced healthcare
costs and improved quality of care [3]. However, concerns arise when digital systems
are used to transfer highly sensitive personal information. Privacy and security are
two major concerns: when the information is made available to more healthcare
organizations, it also becomes more obtainable by the public, specifically by hackers
who perform targeted attacks [4].
This thesis project sets out to secure electronic health records efficiently through
decentralization. The proposed system makes use of the Ethereum blockchain and the
distributed file system InterPlanetary File System (IPFS). Together, they create a
scalable decentralized solution for storing and exchanging electronic health records.

1.1 Background

Securing the information exchange within the healthcare domain is vital, and it is a
difficult challenge. Technology progresses every day, and so do the challenges of
securing new technologies. Between 2014 and 2019, 157.40 million people were affected
by data breaches within the healthcare domain [5], and in 2019 alone 41,232,527 records
were exposed. Given that the average global mitigation cost of one exposed record in
the healthcare domain was roughly $150 in 2019, this amounts to a cost of
$6,184,879,050 due to exposed health records, based on figures from HIPAA Journal
(footnotes 1 and 2). This is an extreme amount of money, lost due to a lack of
security in the healthcare sector. A promising solution for information exchange is
blockchain. Blockchain is a technology that makes use of a distributed ledger of
transactions, meaning that each participant node stores an exact copy of the same
ledger. With the growth of blockchain since its introduction in 2008 [6], more use
cases have emerged, enabled by the technology that has developed around it.
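
The cost estimate above is a single multiplication, which can be checked with a short Python sketch. The function name `total_breach_cost` is introduced here for illustration only; the two input figures are the ones quoted in the paragraph above.

```python
# Figures quoted in the paragraph above (2019).
EXPOSED_RECORDS = 41_232_527   # health records exposed in 2019
COST_PER_RECORD_USD = 150      # approximate average cost per exposed record


def total_breach_cost(records: int, cost_per_record: int) -> int:
    """Return the estimated total cost in whole US dollars."""
    return records * cost_per_record


total = total_breach_cost(EXPOSED_RECORDS, COST_PER_RECORD_USD)
print(f"${total:,}")  # $6,184,879,050
```

The result is only as precise as the rounded per-record cost, so it should be read as an order-of-magnitude estimate.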
While blockchain provides significant opportunities for the healthcare domain,
specifically for electronic health records, there are still issues with using
blockchain. Scalability, transparency and key management are three major issues.
Scalability becomes an issue because EHR data can be large; storing it on the
blockchain causes long and slow transactions, which is not ideal. Transparency is an
issue because all transactions are public and possible to view; sensitive information
therefore needs to stay private while still being logged and recorded on the
blockchain. When incorporating EHR with blockchain, the records need to be encrypted;
otherwise, they are publicly available to anyone. Therefore, patients and providers
need to securely store blockchain account credentials as well as secret keys for
encrypting and decrypting. This puts pressure on providers and patients, as they need
to manage their keys securely, which can be difficult for individuals inexperienced
with cryptography.

1 https://www.hipaajournal.com/december-2019-healthcare-data-breach-report/
2 https://www.hipaajournal.com/2019-cost-of-a-data-breach-study-healthcare-data-breach-costs/

1.2 Related work

Solving the issues regarding the storage and exchange of EHR is a hot topic, and
several approaches and techniques can be used to do it. Matos et al. proposed
a system architecture and solution for managing EHR with the use of cloud services and
granular access control [7]. The goal was to create a secure and scalable solution for
EHR in which a patient or provider can access the needed record anywhere in
the world. The EHR are stored in an intercloud, which is formed by chaining
individual clouds together into a mass of clouds, also referred to as a cloud of
clouds. Interclouds provide benefits such as support for end-to-end privacy
for cloud applications and ease of migrating data between providers. Furthermore,
Matos et al. describe their access control process as dependent on authentication and a
privilege check. While the end goal of the system was to guarantee a patient's
privacy, the system is still susceptible to exploits that end with a perpetrator having
access to data he or she should not have.
Action-EHR is a framework proposed by Dubovitskaya et al. [8]. It runs on Hyperledger
Fabric, a private/permissioned blockchain framework, which gives stronger
authentication and authorization than an open public blockchain, where any node can
freely connect and view transactions [8]. Similar to the Ethereum network, Hyperledger
Fabric allows smart contracts to be created. Action-EHR makes use of the logic in the
smart contracts in the same way as comparable frameworks (see section 3.5), where the
main purpose is to store the state variables regarding access control to a certain
patient’s health record. Action-EHR uses HIPAA-compliant cloud storage instead of a
local database. A HIPAA-compliant cloud storage provider commits to preserving, to the
best of its capacity, the integrity, confidentiality, and availability of the stored
information. The specific cloud storage service used in Action-EHR is Amazon S3.
Before a record is uploaded to the storage, it is encrypted with a symmetric key; on
the cloud, the symmetric key is in turn encrypted with the public key of the patient
and of the doctor to whom the patient has allowed access. To decrypt a record, the
doctor uses his private key to retrieve the symmetric key, which is then used to
decrypt the data. The researchers discuss using only a symmetric key, as this would
ease the key management, and a patient with more than one doctor to share records with
would only have to upload once, whereas with the approach described above the patient
would have to upload data for each of the doctors that need access [8]. They conclude
their paper by stating that the prototype meets the requirements from a medical
perspective and that their next step is setting up a test network and testing it from
the point of view of a real healthcare institution.

1.3 Problem formulation

Current EHR exchange systems have security and privacy issues that are exploited:
data breaches occur almost every day in the healthcare sector, and the records of
patients are compromised. It is difficult to balance the privacy of the patients
against the availability of the records. Blockchain technology can increase
security and privacy, as well as the availability of records for both patients and
providers. However, blockchain technology is novel, and more research is needed on
integrating it with EHR to solve certain issues that blockchain poses, such as
scalability, transparency and key management. The framework proposed in this thesis is
intended to securely store and exchange electronic health records while addressing
these issues.
The following research questions were formulated to guide the implementation of the
proposed framework.

1. What is the state of the art of storing and exchanging electronic health records with
the integration of blockchain technology?

2. How can confidentiality, integrity and availability be maintained with the use of
decentralization for sensitive healthcare data?

3. How can users of the system maintain privacy while the blockchain network
remains public and transparent?

4. How can the system provide a solution to solve the scalability issue of big data in
blockchain?

5. How can patients' and providers' lack of experience with key management for
blockchain credentials and encryption keys be addressed while keeping the system
easy to use?

1.4 Motivation

As previously mentioned, properly exchanging EHR brings multiple benefits to the
healthcare domain. Blockchain may be the solution for managing EHR securely and
efficiently, but as it is a novel technology there is room for improvement. If more
research on the integration of EHR and blockchain is done, domains other than
healthcare can also reap the benefits of standardized methods for big data and
blockchain.

1.5 Objectives

The objectives for this thesis have been developed by closely following the six-step
design science methodology. Information regarding design science and its six steps is
given in chapter 2. The chapter contains a general description of the six steps of
design science, which are then explained and tailored to fit this thesis and its
objectives later in the same chapter.

1.6 Scope/Limitation

This thesis sets out to secure the storage and exchange of health records using
blockchain and off-chain storage. The main focus lies on the development of smart
contracts and on incorporating off-chain storage and encryption. Therefore, no changes
to or further implementation of the blockchain technology itself are made in this
project. The blockchain technology is already established, and a test/personal
blockchain based on Ethereum is used. Furthermore, due to the time limit, the
assumption is made that each provider and patient knows how to distribute and safely
store secret keys as well as the credentials for their blockchain account.

1.7 Target group

As this is a bachelor thesis, the intended target group includes students who can
continue to develop and research the framework. The target group also includes
developers and researchers in the field of healthcare security, who can make use of
the results and future work suggestions as well as extract important information from
several parts of the thesis.

1.8 Outline

This bachelor thesis report is structured as follows. Chapter 2 describes the
methodological approach chosen for this project. Chapter 3 contains a theoretical
background that provides the reader with background knowledge of prominent
technologies and approaches regarding blockchain and electronic health records. In
chapter 4, the system implementation is described and discussed. Chapter 5 contains
the evaluation and discussion of the proposed system from the perspective of the
research questions listed in section 1.3. Lastly, chapter 6 concludes the thesis and
suggests future work.

2 Method

This chapter explains the research methodology used to answer research questions 1
through 5, which can be found in section 1.3. This thesis made use of the design
science methodology. Furthermore, the chapter discusses the proposed system's
reliability and validity and, lastly, ethical considerations.

2.1 Design Science

Design science methodology was used for developing the system described in this
thesis. Peffers et al. [9] describe design science as a methodology for creating and
evaluating IT artifacts with the intent to solve a specified problem. Design science
is divided into six key steps. These steps are listed and briefly explained below;
they are then described from the point of view of this thesis, including how each
step helped to solve the problem and challenges at hand.

1. Problem identification and motivation: this step defines a specific problem for
the research and motivates why a solution to the problem is beneficial and of value.

2. Define the objectives for a solution: this step defines what has to be done to
reach a solution to the problem identified in step 1.

3. Design and development: the step of creating and designing the artifact by
defining its functionality and architecture, and then finally developing it.

4. Demonstration: this step demonstrates the intended functionality of the artifact
through experimentation, case studies and other similar activities.

5. Evaluation: this step involves an in-depth evaluation of the developed artifact.
How well does the functionality of the design deal with the problems defined?

6. Communication: the last step of the design science methodology. Communicate the
problem, the necessities and the process of solving it.

2.1.1 Step 1 - problem identification and motivation

This step included an initial contact and discussion with the supervisor for this thesis.
Together the author and supervisor defined a challenge and the importance of addressing
it. This step of design science is written and discussed in chapter 1.

2.1.2 Step 2 - define the objectives for a solution

Defining the objectives for a solution was approached by conducting a limited
literature study. The literature study was later translated into a state of the art
regarding the integration of blockchain and electronic health records, which can be
viewed in section 3.5. This step is important for gathering the knowledge required to
understand the requirements and necessities, which eases the process of developing a
system. The studies used were published between 2015 and 2020, which gives a proper
view of what has been done and of certain standards that should be followed in order
to develop a proper framework. The databases used to search for the studies were
Google Scholar and OneSearch, and through the university the author had access to
subscription-protected articles. Certain criteria determined whether a study was
included in or excluded from the literature study. The criteria are as follows.

• The article needs to be written in English.

• The publishing date needs to be between 2015-2020.

• The title has to contain the keywords “blockchain” and “health”.

• The study needs to be regarding a proposed framework.
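
The four criteria above can be read as a simple filter over candidate studies. The following Python sketch is illustrative only: the `Study` record, its field names and the example entries are hypothetical, and matching the keywords as case-insensitive substrings of the title is an assumption about how the criterion was applied.

```python
from dataclasses import dataclass


@dataclass
class Study:
    """Hypothetical record describing one candidate study."""
    title: str
    language: str
    year: int
    proposes_framework: bool


def meets_criteria(study: Study) -> bool:
    """Return True if the study satisfies all four inclusion criteria."""
    title = study.title.lower()
    return (
        study.language == "English"
        and 2015 <= study.year <= 2020
        and "blockchain" in title
        and "health" in title          # also matches e.g. "healthcare"
        and study.proposes_framework
    )


# Made-up candidates, used only to exercise the filter.
candidates = [
    Study("A blockchain framework for health records", "English", 2019, True),
    Study("A blockchain framework for supply chains", "English", 2019, True),
    Study("Blockchain in healthcare: a survey", "English", 2014, False),
]
included = [s for s in candidates if meets_criteria(s)]
print(len(included))  # 1
```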

2.1.3 Step 3-4 - design, development and demonstration

Steps 3 and 4 could be started once step 2 was complete and the objectives for the
system were clear. The objectives/requirements were then transformed into a system
architecture and design, which was revised and discussed between the author and the
supervisor until a final agreement was reached. The requirements, design and
implementation of the artifact are described in depth in chapter 4.

2.1.4 Step 5 - evaluation

The evaluation of this system can be found in chapter 5. It has been done
conceptually, meaning that the evaluation is based on theory and analysis of the
system's intended functionality.

2.1.5 Step 6 - communication

Communication of this thesis has been done in the form of an oral presentation and this
written report. The report has undergone reviews and several readings by examiners to
ensure its quality.

2.2 Reliability and Validity

The system's main purpose is to securely exchange and store electronic health records.
The logic and technology stack behind the system are transparent and presented in
chapter 4 using algorithms and explanatory text; furthermore, the source code can be
viewed on GitHub for recreation and further testing, as suggested in section 6.1. The
conceptual security and scalability aspects are based on well-known and established
facts regarding the techniques/technologies used in the system, and the theoretical
concept is largely based on peer-reviewed white papers. The complete evaluation can be
found in chapter 5. These two concepts are the main factors for the reliability and
validity of this system. To further increase reliability and validity, the scalability
and security aspects should be tested; however, due to the time limit this was deemed
not feasible for this thesis.

2.3 Ethical considerations

The usage of decentralized technology in the proposed framework raises certain
ethical considerations. Data that has been part of a transaction within a public
blockchain cannot be deleted, and neither can data that has been uploaded to IPFS.
This poses problems with regard to certain data protection regulations. According to
the General Data Protection Regulation (GDPR), a private person has, under certain
circumstances, the right to be forgotten, meaning that sensitive personal information
should be removed. The proposed framework does not allow deletion of records or of
any other information stored on IPFS or the blockchain.
Furthermore, all the data used in the system as of now is simulated and fictional. The
system should not be used for real data until it has been tested in all its aspects,
and until the relevant regulations and laws have been established and the design
complies with them.

3 Theoretical Background

This chapter contains a brief technical background that explains the most prominent
technologies forming the proposed system, in order to give the reader the background
needed to understand the report. The chapter furthermore contains a literature
study in the form of a state of the art, where the latest systems regarding blockchain
and EHR are presented. The state of the art is then evaluated and discussed, and
lastly blockchain for finance is compared with blockchain for healthcare.
This has been done to gather knowledge for defining the objectives/requirements needed
to develop a proper solution, with respect to steps 1 and 2 of the design science
methodology.

3.1 Blockchain

Many misunderstand blockchain and immediately relate it to Bitcoin, because Bitcoin
was the first real use case for blockchain technology [10]. However, the original idea
behind blockchain was introduced back in 1991 [11]. The idea was to time-stamp digital
documents in such a way that nobody could back-date or forward-date them. Therefore,
it is important to note that blockchain serves use cases not only in finance but also
in any other industry that needs tracking of transactions.
As mentioned in section 1.1, blockchain is a technology that uses distributed ledgers
that hold records of all transactions carried out over the blockchain. When a
transaction is made between two peers in the blockchain, the transaction is recorded
and added to a block. Each transaction contains a hash signature. Because of this hash
and the distribution of the ledger between all the nodes in the chain, an unauthorized
change in the blocks would be visible and disregarded, which makes it practically
impossible to tamper with data within the blockchain. Figure 3.1 shows how the signing
of hashes during transactions works, and figure 3.2 shows how the blocks are linked
together; together they provide the immutable system.

Figure 3.1: Hash signing as shown in [6]

Figure 3.2: Linking of blocks as shown in [10]
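
The linking of blocks can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: blocks are plain dictionaries, SHA-256 from the standard library stands in for the hash function, and there is no consensus, mining or digital signing. The function names and the example transactions are made up for illustration.

```python
import hashlib
import json

GENESIS_PREVIOUS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, transactions, previous_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(batches):
    """Create one block per batch of transactions, each linked to its predecessor."""
    chain = []
    previous_hash = GENESIS_PREVIOUS_HASH
    for index, transactions in enumerate(batches):
        current_hash = block_hash(index, transactions, previous_hash)
        chain.append({
            "index": index,
            "transactions": transactions,
            "previous_hash": previous_hash,
            "hash": current_hash,
        })
        previous_hash = current_hash
    return chain


def is_valid(chain):
    """Recompute every hash and check that each block points at its predecessor."""
    previous_hash = GENESIS_PREVIOUS_HASH
    for block in chain:
        if block["previous_hash"] != previous_hash:
            return False
        expected = block_hash(block["index"], block["transactions"], block["previous_hash"])
        if block["hash"] != expected:
            return False
        previous_hash = block["hash"]
    return True


chain = build_chain([["A pays B 5"], ["B pays C 2"], ["C pays A 1"]])
print(is_valid(chain))                        # True
chain[1]["transactions"][0] = "B pays C 200"  # tamper with an earlier block
print(is_valid(chain))                        # False
```

Because every block's hash covers the previous block's hash, changing one transaction invalidates that block, and repairing its hash would in turn break the link from every later block.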

Since its first real use case in Bitcoin, blockchain technology has developed further,
and many more cryptocurrencies have emerged. Apart from Bitcoin, Ethereum is another
major cryptocurrency; it was released in 2015 as a project based on a research article
published two years earlier by Vitalik Buterin [12]. With Ethereum, a new blockchain
emerged with technology similar to Bitcoin’s; however, Ethereum introduced smart
contracts, which allow logical code execution on top of the blockchain [12].

3.2 Ethereum

The blockchain technology/network used in the proposed system is Ethereum. The
Ethereum mainnet was introduced with the development of the cryptocurrency Ethereum,
with the general idea of creating an open-source smart contract platform. Ethereum has
its own currency called Ether, which is the currency used when performing transactions
on the Ethereum blockchain [13]. Ethereum also has its own programming language,
Solidity, which is used to develop smart contracts for the blockchain.

3.3 Smart Contracts

A blockchain may host smart contracts, which are pieces of logical code deployed to
the blockchain that encode the terms of an agreement between the peers involved in a
specific transaction. A smart contract consists of functions and variables; the
functions are executed, and the variables possibly changed, upon interaction with the
smart contract’s address in the blockchain. How the variables change depends on the
logic of the smart contract. Integrating smart contracts with blockchain gives room
for flexible and logical designs for real-world problems such as the storage and
exchange of electronic health records [14].
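
To make the idea of functions and state variables concrete, the following sketch models an access-control contract as an ordinary Python class. It is an in-memory illustration only and not a real smart contract: actual Ethereum contracts are written in a language such as Solidity and executed by the network. The class name, method names and account strings are hypothetical, and the `sender` argument stands in for the account that signs a transaction.

```python
class RecordAccessContract:
    """Toy in-memory model of an access-control smart contract (not on-chain code)."""

    def __init__(self, patient: str):
        self.patient = patient   # state variable: account that owns the record
        self.authorized = set()  # state variable: providers granted access

    def _require_patient(self, sender: str) -> None:
        """Reject calls that do not come from the record owner."""
        if sender != self.patient:
            raise PermissionError("only the patient may change access rights")

    def grant_access(self, sender: str, provider: str) -> None:
        """State-changing function: add a provider to the access list."""
        self._require_patient(sender)
        self.authorized.add(provider)

    def revoke_access(self, sender: str, provider: str) -> None:
        """State-changing function: remove a provider from the access list."""
        self._require_patient(sender)
        self.authorized.discard(provider)

    def has_access(self, account: str) -> bool:
        """Read-only function: check whether an account may read the record."""
        return account == self.patient or account in self.authorized


contract = RecordAccessContract(patient="0xPatient")
contract.grant_access("0xPatient", "0xDoctor")
print(contract.has_access("0xDoctor"))  # True
contract.revoke_access("0xPatient", "0xDoctor")
print(contract.has_access("0xDoctor"))  # False
```

The state variables change only through the contract's own functions, and each function enforces who is allowed to call it, which is the property that access control in this setting relies on.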

3.4 IPFS

InterPlanetary File System (IPFS) is a peer-to-peer (P2P) hypermedia protocol that
allows for a distributed way of storing and accessing files. In other words, IPFS was
designed for storing hypermedia in a decentralized manner. What distinguishes IPFS
from other P2P protocols such as BitTorrent is the way the content is addressed. When
data is uploaded to IPFS, it is split into small chunks, which are distributed and
added as a Merkle DAG (footnote 3). The upload returns a cryptographic hash called the
content identifier (CID), which is the value used to retrieve files uploaded to IPFS.
This is also what allows IPFS to avoid duplication of data and changes to data already
uploaded: duplicate data is disregarded, and any change to the data leads to a
completely new CID [15]. IPFS thus gives all uploaded data integrity, meaning it
cannot be deleted or changed, similar to blockchain. It also provides greater
availability: because it is distributed, the data may be accessed from any of the
nodes participating in the network, instead of through one point of access as with a
centralized database [16]. This furthermore creates a more secure environment for the
data, as several attacks have minimal effect on distributed systems as opposed to a
non-distributed database.
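
Content addressing can be sketched in a few lines of Python. This is a deliberately simplified model and not the IPFS implementation: real CIDs use a self-describing multihash format and a multi-level Merkle DAG, whereas the sketch uses plain SHA-256 hex digests and a single level of chunks. The class name `ContentStore` and the tiny chunk size are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 8  # bytes; unrealistically small so that the example produces several chunks


class ContentStore:
    """Toy content-addressed store: data is located by the hash of its content."""

    def __init__(self):
        self.chunks = {}  # chunk digest -> chunk bytes
        self.roots = {}   # root identifier -> ordered list of chunk digests

    def add(self, data: bytes) -> str:
        """Split data into chunks, store each under its hash, return a root identifier."""
        chunk_ids = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # identical chunks are stored once
            chunk_ids.append(digest)
        # The root identifier is the hash of the ordered chunk hashes.
        root = hashlib.sha256("".join(chunk_ids).encode("ascii")).hexdigest()
        self.roots[root] = chunk_ids
        return root

    def get(self, root: str) -> bytes:
        """Reassemble the original data from its chunks."""
        return b"".join(self.chunks[digest] for digest in self.roots[root])


store = ContentStore()
first = store.add(b"blood pressure: 120/80")
second = store.add(b"blood pressure: 120/80")  # same content, same identifier
changed = store.add(b"blood pressure: 180/80")  # changed content, new identifier
print(first == second)   # True
print(first == changed)  # False
```

Uploading identical content twice yields the same identifier and stores nothing new, while any change to the content yields a different identifier, which mirrors the deduplication and integrity properties described above.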

3.5 State of the art

While blockchain is a rather novel technology, there have been some research and
framework proposals for using blockchain to secure and exchange big data in several
domains. However, this state of the art section focuses primarily on electronic health
records and blockchain. Therefore, other domains, and the technologies and techniques
used in them, are not covered.
One of the first major framework proposals for electronic health records using
blockchain is MedRec [17]. MedRec is built on the Ethereum network, with smart
contracts created in the programming language Python. MedRec takes a unique approach
by encouraging the healthcare community to mine, advance in trust, and gain access to
anonymized medical data as rewards [17]. Furthermore, MedRec makes use of off-chain
storage for the records with the help of a local SQL-based database. The database
is accessed through permissions stored on the blockchain together with a unique hash
for each of the records. This design makes it easier for providers to integrate the
system with their existing local database/storage. MedRec does not, however, integrate
any encryption with the off-chain storage, although the paper states that it is
“unarguably crucial for future development” [17]. MedRec is still being improved, and
developers are working on MedRec 2.0, in which a switch from Python to Go-ethereum and
Solidity has been made.
Ancile and BHEEM are two more framework proposals that take a similar approach to
MedRec: the health records are stored off-chain in a local database, and the
permissions/access control are managed through the Ethereum network/blockchain
[18][19]. While [18] makes use of two different encryption methods to store and
distribute the records, [19] does not reveal what encryption its framework uses. For
storage, the Ancile framework uses symmetric encryption, because it is more efficient
for larger files [18]. For distribution, however, it uses asymmetric encryption
together with proxy re-encryption. This makes it possible to recover the encrypted
message with a user’s private key, even if it was not that user’s public key that was
used for encryption. Proxy re-encryption works as follows: person A generates a
re-encryption key, which he/she delegates to a proxy. When the proxy later receives
person A’s encrypted data and information regarding who should have the right to
decrypt it, the proxy re-encrypts the data with the delegated key and sends it on to
person B. Person B decrypts it using his/her private key [20]. Figure 3.3 gives an
overview of how proxy re-encryption is carried out.

3 https://docs.ipfs.io/concepts/merkle-dag/

Figure 3.3: Proxy re-encryption example
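
The flow of proxy re-encryption can be illustrated with a toy Python sketch in the style of the ElGamal-based scheme by Blaze, Bleumer and Strauss. It is not claimed to be the scheme used by Ancile [18], and it is not secure: the group parameters are tiny, messages are single small integers, and the re-encryption key is derived here from both secret keys, which in practice would require a trusted setup or a different scheme. All names and parameters are assumptions made for illustration.

```python
import secrets

# Toy parameters (insecure): P = 2*Q + 1 with P and Q prime; G generates the
# subgroup of order Q modulo P.
P = 1019
Q = 509
G = 4


def keygen():
    """Return a (secret, public) key pair with public = G^secret mod P."""
    secret = secrets.randbelow(Q - 1) + 1  # 1 .. Q-1
    return secret, pow(G, secret, P)


def encrypt(message, public_key):
    """Encrypt an integer 1 <= message < P for the owner of public_key."""
    k = secrets.randbelow(Q - 1) + 1
    return (message * pow(G, k, P)) % P, pow(public_key, k, P)


def re_encryption_key(secret_a, secret_b):
    """Key that turns ciphertexts for A into ciphertexts for B: b / a mod Q."""
    return (secret_b * pow(secret_a, -1, Q)) % Q


def re_encrypt(ciphertext, rk):
    """Run by the proxy, which never sees the plaintext or any secret key."""
    c1, c2 = ciphertext
    return c1, pow(c2, rk, P)


def decrypt(ciphertext, secret):
    """Recover the message with the secret key the ciphertext is addressed to."""
    c1, c2 = ciphertext
    g_k = pow(c2, pow(secret, -1, Q), P)  # strips the secret from the exponent
    return (c1 * pow(g_k, -1, P)) % P


a_secret, a_public = keygen()  # person A
b_secret, b_public = keygen()  # person B
message = 42
for_a = encrypt(message, a_public)
for_b = re_encrypt(for_a, re_encryption_key(a_secret, b_secret))
print(decrypt(for_a, a_secret) == message)  # True
print(decrypt(for_b, b_secret) == message)  # True
```

The proxy only raises one ciphertext component to the power of the re-encryption key, so it transforms the ciphertext from A's key to B's key without ever decrypting it. Modular inverses via `pow(x, -1, m)` require Python 3.8 or later.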

Shahnaz, Qamar, and Khalid contributed a framework for electronic health records using
blockchain technology in 2019 [21]. Like the previous frameworks, they made use of the
Ethereum network, and with smart contracts they defined access control for a patient’s
health records. Furthermore, they also make use of off-chain storage, as this improves
the scalability of blockchain, which is one of the biggest issues, as previously
mentioned. However, their framework took a different approach to off-chain storage
than the previously described frameworks. Instead of using a SQL-based local database,
they used the InterPlanetary File System, which they state is a favorable choice for
storing critical and sensitive data because of the cryptographic identifier that
protects the data from being altered. While some of the previously mentioned
frameworks make use of encryption to protect the sensitive data, [21] does not mention
any encryption before the data is uploaded to IPFS. This makes their solution arguably
less secure, as the data is mainly protected against alteration. They conclude their
research with three performance tests: execution time, throughput, and latency.
Madine et al. [22] proposed a patient-centric framework for personal health records
(PHR) using blockchain. Like the others, they use the Ethereum network, and with smart
contracts they create the access control foundation of their system. To solve the
scalability issues, they make use of IPFS together with proxy re-encryption. The main
difference in their framework [22] is the patient-centric approach. They argue that
letting patients control their records, by being responsible for uploading them and
for keeping track of providers' requests for them, offers an advantageous realignment
of the relationship between patient and doctor. To conclude their research, they
compare their solution with existing cloud-based PHR systems, where the main aspects
taken into consideration are privacy, decentralized storage, decentralized execution,
patient-centeredness, provenance, immutability, and trustfulness. Their final
comparison can be viewed in table 3.1.

Aspect                     Cloud-based PHR Management    Their solution
Privacy                    Yes                           Yes
Decentralized storage      No                            Yes
Decentralized Execution    No                            Yes
Patient-centered           Partially                     Yes
Provenance                 Partially                     Yes
Immutability               Partially                     Yes
Trustful                   Partially                     Yes

Table 3.1: Recreation of the table presented in [22]

3.6 Evaluation of state of the art

There are a few common factors that appear in many of the proposed frameworks, such as
the use of smart contracts to contribute to granular access control, and the use of off-chain
storage instead of storing the data on the blockchain. Furthermore, they lean towards a
more patient-centric solution where each patient controls the access privileges to his/her
electronic health records.
Three out of the five presented frameworks use centralized databases to store electronic
health records. One of the major issues with this solution is that a centralized database
introduces a single point of failure [23], meaning that once a breach happens all data
may be compromised. In decentralized storage, by contrast, one compromised file does not
imply that the other files have been compromised. This may make a
distributed/decentralized database the favorable choice when dealing with highly
sensitive data such as electronic health records. Furthermore, due to the
decentralization, it is infeasible to modify a file within the decentralized database,
as the change would have to be confirmed by several nodes. This provides further
protection for the data stored in the decentralized database.
Using encryption to keep the sensitive data inside electronic health records safe is
vital, especially as data breaches in the healthcare domain are increasing. A hacker who
gets their hands on the information in a record could potentially commit identity theft
and use the information to commit other serious crimes. [17] and [21] do not provide any
encryption options for their stored data, which arguably makes their frameworks less
secure. Furthermore, [19] does not disclose whether they have a finalized integration of
encryption or if it is merely a suggestion for others to implement. However, [18] and
[22] do encrypt their electronic health records. Both use proxy re-encryption, which
provides confidentiality to the data and can prevent serious damage in the case of a
breach. Proxy re-encryption is a type of asymmetric key encryption, which raises the
question of whether asymmetric or symmetric encryption is better for encrypting health
records. Symmetric encryption algorithms are nearly 1000 times faster than asymmetric
algorithms due to the processing power needed for the computations [24]. This makes a
strong case for symmetric encryption, as electronic health records can be very large.
Another strong point for symmetric encryption of electronic health records is the reduced
need for uploading files. With asymmetric encryption, each record would have to be
uploaded once for every person it should be shared with, because it is encrypted for one
recipient’s key at a time [8]. This could cause a significant delay and, depending on the
storage solution, a massive increase in volume due to the possibly large size of records
[21].
The framework in [22] uses personal health records instead of traditional electronic
health records. Personal health records contain the same information as traditional
electronic health records. The main difference between the two is that personal health
records are maintained by the patient, while electronic health records are maintained by
the providers. The purpose of using personal health records is to encourage patients to
be more engaged in decision-making regarding their healthcare and to let patients correct
errors or uncertainties in their records [25]. Studies of personal health records have
shown both positive and negative results. The positive results are better healthcare
outcomes and corrected records. The negative results show that personal health records
can be difficult for patients to use and understand, and healthcare providers also
expressed concern regarding the patient’s understanding of medical records and their
legal liability [25]. Further research on the difficulties of adopting personal health
records suggests that the benefits vary greatly depending on the patient’s technical and
health knowledge, and that the results also vary depending on the individual’s
socioeconomic status [26]. Therefore, the better solution may be to adopt electronic
health records where the patient decides who is authorized to view and update their
records. This would give the patient control over the records without the concern of
whether the patient is capable of handling all the technicalities.

3.7 Blockchain for finance vs healthcare

The first intended use of blockchain was for the cryptocurrency Bitcoin, which was
presented as a peer-to-peer electronic cash system [6]. The main goal was to create a
financial system where no third party had to be used and trusted for a transaction. For
this to be possible, the transactions have to be immutable and public, and each node on
the blockchain keeps a copy of each block of transactions [6]. The transparency and
public availability of the transaction data in the blocks of the blockchain may work for
cryptocurrencies, but not for domains such as healthcare.
Healthcare is a domain that involves highly sensitive data in large volumes, which makes
storing such information on the blockchain far from ideal. However, this does not mean
that blockchain technology is of no use for domains that deal with sensitive data.
Blockchain can be integrated into such domains by storing non-sensitive values on the
blockchain that map to the more sensitive data, which in turn is stored off-chain in a
database. This is a common approach for electronic health records: the data of the record
itself is not stored on the blockchain, but a corresponding hash of the data is, in order
to create a mapping from point A to point B without giving away the data in B. It is
important to note that, to keep the data secure, the data has to be encrypted before it
is stored, as an attacker could find the related hash, since it is public, and retrieve
the record, depending on the database and the logic of the smart contracts. Figure 3.4
shows an example of the integration of blockchain and electronic health records when
retrieving a record using a hash stored on the blockchain.

Figure 3.4: Blockchain and electronic health record example
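The mapping from a public on-chain hash to an off-chain record can be sketched in a few lines of JavaScript. This is only an illustrative model under stated assumptions: the `offChainStore` map, the `onChainHashes` array and the function names are hypothetical stand-ins for a real database and blockchain, and SHA-256 is an assumed hash function.

```javascript
const crypto = require("crypto");

// Hypothetical stand-ins: a Map plays the off-chain database,
// an array plays the public list of hashes kept on the blockchain.
const offChainStore = new Map();
const onChainHashes = [];

function sha256Hex(data) {
  return crypto.createHash("sha256").update(data).digest("hex");
}

// Store the (already encrypted) record off-chain and publish only its hash.
function storeRecord(encryptedRecord) {
  const hash = sha256Hex(encryptedRecord);
  offChainStore.set(hash, encryptedRecord);
  onChainHashes.push(hash);
  return hash;
}

// Look the record up by its public hash and check that it was not altered.
function retrieveRecord(hash) {
  const record = offChainStore.get(hash);
  if (record === undefined) throw new Error("No record stored for this hash");
  if (sha256Hex(record) !== hash) throw new Error("Record does not match its hash");
  return record;
}
```

Anyone who can read the chain learns the hash, which is why the sketch assumes the record has been encrypted before it is passed to `storeRecord`.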

4 Implementation

This chapter describes the different aspects and details of the implementation of the
proposed system. The chapter first gives a broad overview of the system’s architecture
and design, and then goes into deeper detail regarding the technologies and design it
contains.

4.1 Requirements

The requirements described in this section were gathered during the second step of the
design science methodology shown in chapter 2. Note that a fully working system needs
several more requirements, but listing everything would be superfluous. The following
requirements are therefore the absolute necessities for creating secure storage and
exchange of electronic health records. An evaluation of the requirements can be found in
chapter 5.

• Requirement 1: Registration abilities for various entities.

• Requirement 2: Patients need the ability to provide access for their records.

• Requirement 3: Patients need the ability to revoke access for their records.

• Requirement 4: Providers need the ability to upload records according to their
access.

• Requirement 5: Providers and Patients need the ability to download records ac-
cording to their access.

• Requirement 6: Each uploaded record needs to be encrypted.

Requirement    Algorithm
1              1, 2, 3
2              4
3              5
4              8
5              9
6              8

4.2 Technology Stack

The proposed framework is a decentralized application, or dApp. The dApp consists of a
back-end and a front-end, which are interconnected using Node.js, a JavaScript runtime
environment. The back-end of the framework contains the smart contracts, which have been
developed in the Solidity programming language together with Truffle and Ganache. Truffle
is a development framework for Ethereum which eases the process of writing and deploying
smart contracts (see footnote 4). Ganache is a personal Ethereum blockchain that makes it
easy to test and execute smart contract functions (see footnote 5). The front-end of the
system is built upon React.js, a JavaScript library for creating user interfaces that
integrates easily with HTML and JavaScript. Furthermore, the front-end requires a
cryptocurrency wallet. MetaMask has been used in this system; it makes it possible to
connect an account to the blockchain network, which is necessary to be part of the system
and to complete transactions.

4.3 System architecture

This section gives an overview of the architecture of the proposed framework. The
architecture was developed and updated as the implementation progressed. Figure 4.5 shows
the complete architecture of the proposed framework.

Figure 4.5: Architecture of proposed framework

Administrator - The architecture starts at the Administrator node, which is created and
owned by the deployer of the smart contract. The role of the Administrator node is to
assign Hospital nodes to specified Ethereum addresses. Without the Administrator node no
hospital can be assigned, which leaves all the other nodes on the blockchain without a
function. When the Administrator has assigned one or more hospitals, the state variables
holding the information about each hospital, such as hospital name, city, and Ethereum
address, are stored on the blockchain.
Footnote 4: https://www.trufflesuite.com/truffle
Footnote 5: https://www.trufflesuite.com/ganache

Hospital - Hospitals are responsible for assigning Ethereum addresses to either providers
or patients. Just as when a hospital is assigned, the patient or provider state variables
are stored on the blockchain.
Patient - The patient node contains identifiers such as Ethereum address and gender. The
patient also contains a mapping of which providers are allowed access to the patient’s
health records. The patient is responsible for updating the mapping by either allowing or
revoking access for a certain provider.
Provider - A provider node contains identifiers such as Ethereum address and speciality.
The provider node can create and update health records and view a patient’s existing
health records. However, the provider first needs to be added to the patient’s access
list, which is the patient’s responsibility.
The essential outcome of integrating blockchain with electronic health records is that
the data in the records is stored in a secure manner. The core of any proposed framework
for blockchain and health records is therefore how the data is treated. In the proposed
framework, the creation of a record starts with the patient giving permission to a
provider. The provider who has gained permission from the patient can now create a
record. Once the record is created, it is encrypted using the symmetric encryption
algorithm AES and uploaded to IPFS, and the hash of the uploaded record is stored on the
blockchain. If a provider or the patient wants to view the record, they use the hash of
the file to download the encrypted version of the record, and can then decrypt it with
the same key that was generated during the encryption phase. This key has to be kept safe
and only shared in a secure manner with the providers and the patient who should access
the record.

4.4 System Design

The system can be broken down into four main layers: the UI layer, the Blockchain layer,
the Contract layer, and the Data layer. This section explains each layer’s role in the
system, starting from the bottom according to figure 4.6, where the four layers can be
viewed.

Figure 4.6: System Design/Model

4.4.1 Contract layer

Each smart contract deployed onto the blockchain is part of the contract layer. The main
functionality of the smart contracts is to carry out transactions between two peers on
the blockchain, where certain requirements need to be met for the functions within the
smart contract to fully execute. The logic of the smart contracts defines granular access
control over the system’s transactions to make sure that no unauthorized access occurs.
The transactions possible within the system are the following:
1. Addition of node/user.

2. Give/Revoke access to patient’s record.

3. Set hash of patient’s record.

4. Get hash of patient’s record.


Algorithm 1 explains the logic behind the function for adding a hospital, which is
available to the administrator node after the smart contracts have been deployed. The
algorithm requires three inputs: the public key address of the MetaMask account, a city,
and a name. If the caller of the function is not the admin, the function is aborted and a
corresponding error message is shown. Likewise, if the given public key is already in the
mapping of existing hospitals, the function is aborted and a fitting error message is
shown.
Each hospital node can assign a patient role or provider role to a public key on the
blockchain. Algorithm 2 shows the addition of a provider node, which needs the public key
and speciality of the provider as input. The public key cannot already be assigned, and
the caller of the function has to be a hospital. Algorithm 3, the addition of a patient
node, follows the same logic: the caller of the function has to be a hospital and the
public key cannot already be assigned.

Algorithm 1: Addition of hospital node
Input: Public Key, City, Name
Output: Hospital Node
if msg.sender == Admin then
    if Public Key already exists in hospital mapping then
        Abort function and output Error
    else
        Create hospital node using input data
        Add Public Key to hospital mapping
else
    Abort function and output Error

Algorithm 2: Addition of provider node
Input: Public Key, Speciality
Output: Provider Node
if msg.sender == Hospital then
    if Public Key already exists in provider mapping then
        Abort function and output Error
    else
        Create provider node using input data
        Add Public Key to provider mapping
else
    Abort function and output Error

Algorithm 3: Addition of patient node
Input: Public Key, Gender
Output: Patient Node
if msg.sender == Hospital then
    if Public Key already exists in patient mapping then
        Abort function and output Error
    else
        Create patient node using input data
        Add Public Key to patient mapping
else
    Abort function and output Error
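The registration logic of Algorithms 1 to 3 can be modelled in plain JavaScript as a rough sketch. This is not the Solidity contract of the thesis: the `Registry` class, its method names and the error messages are hypothetical, and the `sender` argument is assumed to play the role of `msg.sender`.

```javascript
// Plain JavaScript model of Algorithms 1-3 (hypothetical names throughout).
// The `sender` argument stands in for msg.sender in a smart contract.
class Registry {
  constructor(adminAddress) {
    this.admin = adminAddress;
    this.hospitals = new Map();
    this.providers = new Map();
    this.patients = new Map();
  }

  // Algorithm 1: only the admin may add a hospital, once per address.
  addHospital(sender, address, city, name) {
    if (sender !== this.admin) throw new Error("Caller is not the admin");
    if (this.hospitals.has(address)) throw new Error("Hospital already exists");
    this.hospitals.set(address, { city, name });
  }

  // Algorithm 2: only a registered hospital may add a provider.
  addProvider(sender, address, speciality) {
    if (!this.hospitals.has(sender)) throw new Error("Caller is not a hospital");
    if (this.providers.has(address)) throw new Error("Provider already exists");
    this.providers.set(address, { speciality });
  }

  // Algorithm 3: only a registered hospital may add a patient.
  addPatient(sender, address, gender) {
    if (!this.hospitals.has(sender)) throw new Error("Caller is not a hospital");
    if (this.patients.has(address)) throw new Error("Patient already exists");
    this.patients.set(address, { gender, accessList: new Map(), hashes: [] });
  }
}
```

Throwing an error here corresponds to "Abort function and output Error" in the algorithms; in a contract the same effect would typically come from a reverted transaction.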

With a patient node and a provider node added to the system, each patient obtains the
ability to give or revoke access for a provider. Once given access, the provider can set
and retrieve hashes of the patient’s records; once access is revoked, the provider can no
longer do so. Algorithm 4 shows how access is given, using the public key of the provider
node that should be given access. Algorithm 5 shows how access is revoked, using the same
input of a provider’s public key.

Algorithm 4: Give access
Input: Public Key
Output: Provider added to Patient’s access list
if msg.sender == Patient then
    if Public Key already exists in access list mapping then
        Abort function and output Error
    else
        Add Public Key to access list mapping
else
    Abort function and output Error

Algorithm 5: Revoke access
Input: Public Key
Output: Provider removed from Patient’s access list
if msg.sender == Patient then
    if Public Key does not exist in access list mapping then
        Abort function and output Error
    else
        Remove Public Key from access list mapping
else
    Abort function and output Error
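Algorithms 4 and 5 can be illustrated with a small JavaScript sketch in which the access list is a map from provider address to true or false. The patient object and all function names are hypothetical and only model the logic; they are not taken from the actual contract.

```javascript
// Hypothetical patient model: accessList maps a provider address to true/false.
function createPatient(address) {
  return { address, accessList: new Map() };
}

// Algorithm 4: only the patient may grant access, and only if not granted yet.
function giveAccess(patient, sender, providerAddress) {
  if (sender !== patient.address) throw new Error("Caller is not the patient");
  if (patient.accessList.get(providerAddress) === true) {
    throw new Error("Provider already has access");
  }
  patient.accessList.set(providerAddress, true);
}

// Algorithm 5: only the patient may revoke access that is currently granted.
function revokeAccess(patient, sender, providerAddress) {
  if (sender !== patient.address) throw new Error("Caller is not the patient");
  if (patient.accessList.get(providerAddress) !== true) {
    throw new Error("Provider has no access to revoke");
  }
  patient.accessList.set(providerAddress, false);
}

// Helper used by later checks: unknown addresses count as "no access".
function hasAccess(patient, providerAddress) {
  return patient.accessList.get(providerAddress) === true;
}
```

Treating a missing entry the same as `false` mirrors how an unset boolean mapping entry behaves in Solidity, which is an assumption about the contract rather than something the algorithms state.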

The last algorithms of the smart contract layer are the set hash and get hash functions.
They are simple setter and getter methods that serve the off-chain storage in the Data
layer, which is further explained in section 4.4.3. Algorithm 6 shows set hash, which
takes as input a public key and a hash that is stored in the patient’s list of hashes.
Algorithm 7 shows get hash, which takes as input a public key and an index that is used
to look up the hash at that position in the list of hashes.

Algorithm 6: Set Hash
Input: Hash, Public Key
Output: Hash added to patient’s hash list
if msg.sender == Provider then
    if Public Key does not exist in access list mapping then
        Abort function and output Error
    else
        Append hash to patient’s hash list
else
    Abort function and output Error

Algorithm 7: Get Hash
Input: Index, Public Key
Output: Hash at Index returned
if msg.sender == Provider || msg.sender == Patient then
    if Public Key does not exist in access list mapping then
        Abort function and output Error
    else
        Retrieve hash from patient’s hash list at position index
else
    Abort function and output Error
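A minimal JavaScript sketch of the setter and getter in Algorithms 6 and 7 is given here. It assumes a hypothetical patient object with an `accessList` map and a `hashes` array, leaves out the separate role check, and adds two things the algorithms do not state: `setHash` returns the index of the stored hash, and `getHash` rejects indexes outside the list.

```javascript
// Algorithm 6 (sketch): a provider on the access list appends a record hash.
// Returning the new index is an addition made for convenience.
function setHash(patient, sender, hash) {
  if (patient.accessList.get(sender) !== true) {
    throw new Error("Provider has no access");
  }
  patient.hashes.push(hash);
  return patient.hashes.length - 1;
}

// Algorithm 7 (sketch): the patient, or a provider on the access list,
// reads the hash stored at a given index.
function getHash(patient, sender, index) {
  const isOwner = sender === patient.address;
  if (!isOwner && patient.accessList.get(sender) !== true) {
    throw new Error("Caller has no access");
  }
  // Bounds check added for the sketch; not part of Algorithm 7.
  if (!Number.isInteger(index) || index < 0 || index >= patient.hashes.length) {
    throw new Error("No hash at this index");
  }
  return patient.hashes[index];
}
```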

4.4.2 Blockchain layer

The proposed system uses Ganache as explained in section 4.2, which makes it possible to
set up a test blockchain network similar to Ethereum Mainnet. Each transaction performed
between two peers is controlled and recorded in the blockchain layer. The blockchain
layer further holds the deployed smart contracts.

4.4.3 Data layer

The InterPlanetary File System (IPFS), a distributed file system described in more detail
in section 3.4, is used as off-chain storage for the proposed system. When a file is
uploaded to IPFS through the UI layer, the cryptographic hash that IPFS calculates for
the uploaded file is sent through a transaction and stored as a patient variable using
smart contract logic, as shown in Algorithm 6. This hash is later used when providers or
patients wish to retrieve a specific record, through Algorithm 7 and further
functionality handled in the UI layer.

4.4.4 UI layer

The UI layer refers to the front-end of the system, which has been built using React.
The purpose of the UI layer is to ease the process for users and to collect the arguments
needed for the system’s functions. The UI layer also contains the integration of
MetaMask, making it possible for each node assigned to the system to execute various
functions depending on the role of the user. Lastly, the UI layer handles the
communication with the Data layer (IPFS) and the encryption and decryption of the
electronic health records when uploading to and downloading from IPFS.
The encryption and decryption process is a vital part of the system. When an electronic
health record is first encrypted, a secret key for AES has to be generated, and this same
key is later used for decryption. The system generates the key with a random bytes
generator, which prevents the human error of choosing a weak, non-random key. With a key
generated, the record that the provider wishes to upload is automatically encrypted and
the key is displayed to the provider. The UI layer then adds the file to IPFS using an
API call, which returns a hash that is stored on the blockchain by calling Algorithm 6.
An example flow of the process can be viewed in figure 4.7. A full overview of the upload
process is given in Algorithm 8.

Figure 4.7: Uploading record to IPFS

Algorithm 8: Upload record
Input: EHR, Public Key
Output: Uploaded record
if msg.sender == Provider then
    if Public Key does not exist in Provider’s access list mapping then
        Abort function and output Error
    else
        Capture EHR as a buffer
        Generate 32 random bytes as key and 16 random bytes as iv, using Crypto API
        Encrypt buffer using AES-256 together with generated key and iv, and append
        the iv value to the encrypted EHR
        if Original buffer == Encrypted buffer then
            Abort function and output Error
        else
            Upload encrypted buffer using IPFS API, and retrieve CID
            if CID == null then
                Abort function and output Error
            else
                Call Algorithm 6 using retrieved CID
else
    Abort function and output Error
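The encryption steps of Algorithm 8 could look roughly as follows with the Node.js `crypto` module. This is a sketch under assumptions: the source only says AES-256 with a 16-byte iv, so the CBC mode used here is a guess, the function name `encryptRecord` is hypothetical, and the IPFS upload and the call to Algorithm 6 are left out.

```javascript
const crypto = require("crypto");

// Encrypt a record buffer with a freshly generated key and iv.
// Assumption: AES-256 in CBC mode; the thesis does not name the mode.
function encryptRecord(plainBuffer) {
  const key = crypto.randomBytes(32); // 32 random bytes = 256-bit key
  const iv = crypto.randomBytes(16);  // 16 random bytes = one AES block
  const cipher = crypto.createCipheriv("aes-256-cbc", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plainBuffer), cipher.final()]);

  // Sanity check from Algorithm 8: abort if nothing was encrypted.
  if (ciphertext.equals(plainBuffer)) {
    throw new Error("Record was not encrypted");
  }

  // Append the iv so that it can be recovered when decrypting.
  const payload = Buffer.concat([ciphertext, iv]);
  return { key, payload };
}
```

The returned `payload` is what would be handed to the IPFS API, while `key` is what would be shown to the provider. The iv is not secret, so storing it next to the ciphertext is common practice.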

A similar process occurs when a patient or provider wishes to download and decrypt a
specific record, except that no key is generated. The user calls Algorithm 7 to retrieve
the hash of the record he or she intends to download and decrypt. The user then receives
a downloaded file, which is decrypted correctly if the user has provided the same key as
the one used in the encryption process. An example flow of the decryption process is
shown in figure 4.8, while the algorithm for downloading and decrypting a record is shown
in Algorithm 9.

Figure 4.8: Downloading record from IPFS

Algorithm 9: Download record
Input: Index, Public Key
Output: Downloaded Record
if msg.sender == Provider || msg.sender == Patient then
    if Public Key does not exist in Provider’s access list mapping || Patient’s public
    key != Public Key input then
        Abort function and output Error
    else
        Call Algorithm 7 using both input data
        Download encrypted EHR using IPFS API with retrieved hash
        if Retrieved hash == null then
            Abort function and output Error
        else
            Capture downloaded EHR as a buffer
            Retrieve appended iv
            Decrypt buffer using iv and key
            Convert decrypted buffer into Blob and save Blob object into preferred
            file format
else
    Abort function and output Error
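The decryption steps of Algorithm 9 can be sketched with the Node.js `crypto` module as well. As before, this is only an assumption-laden illustration: CBC mode is assumed because the source does not name a mode, the iv is assumed to be the last 16 bytes of the downloaded buffer, and `decryptRecord` is a hypothetical name. Fetching the file from IPFS and saving it as a Blob are not shown.

```javascript
const crypto = require("crypto");

// Decrypt a downloaded payload laid out as: ciphertext followed by a 16-byte iv.
// Assumption: AES-256 in CBC mode, matching the mode used when encrypting.
function decryptRecord(payload, key) {
  if (payload.length <= 16) {
    throw new Error("Payload is too short to contain a record and an iv");
  }
  const iv = payload.subarray(payload.length - 16);            // appended iv
  const ciphertext = payload.subarray(0, payload.length - 16); // encrypted EHR
  const decipher = crypto.createDecipheriv("aes-256-cbc", key, iv);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]);
}
```

With a wrong key, `decipher.final()` will usually throw because the padding does not check out, but CBC gives no guarantee of this. An authenticated mode would be needed to detect a wrong key or a modified file reliably.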

5 Evaluation & Discussion

This chapter aims to answer research questions 2-5 listed in section 1.3 and to evaluate
the proposed system from different security perspectives and against the requirements
listed in section 4.1. The research questions and the evaluation address the proposed
system and its functionality in a theoretical manner. The chapter also describes the
possible challenges the proposed design faces and finishes with a comparison to prior
research.

5.1 Research questions

RQ 2. How can confidentiality, integrity and availability be maintained with the use
of decentralization for sensitive healthcare data?
Confidentiality, integrity and availability are the three main concepts in information
security, often referred to as the CIA triad. The proposed system maintains
confidentiality through two strategies. First, each patient has granular access control
over their own electronic health records by maintaining an access list of the providers
who should have access to the patient’s records, as defined in the smart contract and
shown in Algorithm 4 and Algorithm 5. This ensures that the hashes of a patient’s records
cannot be retrieved or set without authorization. Second, each electronic health record
is encrypted using AES with a randomly generated key. Generating the key prevents users
from choosing weak passphrases as secret keys, which gives stronger protection against
brute-force attacks. AES was chosen as the encryption algorithm because of its strength,
which has led to its wide adoption for classified information within the US government
[27]. Together, these two strategies provide the confidentiality of the records.
The integrity of the records is maintained by the use of blockchain technology and the
distributed file system IPFS. Blockchain provides integrity to each transaction that has
occurred by chaining each block of transactions to the previous one with the help of a
cryptographically irreversible hash function. All new blocks are appended to the end of
the chain and form an immutable storage of transactions [28]. The blockchain network also
serves as an immutable log: as each transaction is recorded, all transactions are
available for auditing when needed. This gives the artifact accountability, which is a
common extension to the CIA triad [29]. Furthermore, the records maintain integrity when
uploaded to IPFS. Each file on IPFS is given an irreversible hash value, similar to the
transactions on the blockchain. Due to this hash value, the files uploaded to IPFS are
also immutable [16]. If a record were changed, its hash would no longer be the same, and
the changed record would not match the hash stored on the blockchain.
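How chaining blocks with a hash function makes later changes detectable can be shown with a toy example. The sketch below is a simplified illustration and not how Ethereum structures its blocks: the block layout, the use of SHA-256 over a JSON string and all function names are assumptions.

```javascript
const crypto = require("crypto");

const GENESIS_HASH = "0".repeat(64); // placeholder "previous hash" of the first block

// Hash of a block covers its transactions and the hash of the previous block.
function hashBlock(prevHash, transactions) {
  return crypto
    .createHash("sha256")
    .update(prevHash + JSON.stringify(transactions))
    .digest("hex");
}

// Build a chain in which every block stores the hash of the block before it.
function buildChain(batches) {
  const chain = [];
  let prevHash = GENESIS_HASH;
  for (const transactions of batches) {
    const hash = hashBlock(prevHash, transactions);
    chain.push({ prevHash, transactions, hash });
    prevHash = hash;
  }
  return chain;
}

// Recompute every hash; any edited transaction breaks the chain from there on.
function isValidChain(chain) {
  let prevHash = GENESIS_HASH;
  for (const block of chain) {
    if (block.prevHash !== prevHash) return false;
    if (block.hash !== hashBlock(block.prevHash, block.transactions)) return false;
    prevHash = block.hash;
  }
  return true;
}
```

Changing a transaction in an early block alters that block's hash, which no longer matches the `prevHash` stored in the next block, so an auditor recomputing the chain notices the change.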
Lastly, the availability of the records and the system is maintained because the system
is partially decentralized. The transactions and back-end of the artifact are completely
decentralized by the use of blockchain and smart contracts. Thus, if one peer/node on the
network goes down, others remain up and running, which lets the system persist through
distributed denial-of-service (DDoS) attacks and similar attacks that would cause issues
for services running with a single point of access, such as a locally hosted server [30].
Because the smart contracts and transactions stay available as long as other nodes are
running, blockchain is a robust solution against DDoS attacks that would be detrimental
to a centralized hospital system. In addition, IPFS maintains high availability for the
records already in storage. As mentioned in section 3.4, IPFS is a distributed file
system where the files can be accessed through multiple nodes. This provides similar
resilience against DDoS attacks as blockchain, but it also provides higher availability
because data can be accessed quickly regardless of whether the user is located in
America, Europe or any other continent. With a centralized database, Europeans could for
example experience considerable delays when accessing a file that is stored in the US,
and vice versa.
RQ 3. How can users of the system maintain privacy, while the blockchain network
remains public and transparent?
Maintaining privacy for users involved in transactions regarding sensitive data is of high
priority. The proposed artifact runs on a public blockchain, which means that all
transactions are publicly visible and possible to track. These transactions show when a
function has been executed by any user of the system. For instance, if a provider has
executed Algorithm 8, the transaction of Algorithm 6 will be publicly visible, which
shows which patient it concerned and the corresponding hash of the record. It is
therefore highly important to protect the privacy of the users. Each individual in the
proposed framework remains private by only being identified by their address on the
blockchain. This is achieved by using very few identifiers when adding a provider or
patient to the system: the only attributes of a patient are the Ethereum address and
gender, while those of a provider are the Ethereum address and speciality. This makes it
considerably harder for attackers to identify an individual, as the Ethereum address
alone has no connection to the user’s real-life identity. Each participant in the system
thereby retains a high degree of privacy, despite blockchain and IPFS being publicly
available for outsiders to view.
RQ 4. How can the system provide a solution to solve the scalability issue of big
data in blockchain?
Scalability is a common issue with blockchain, especially when integrated with
healthcare, because healthcare tends to involve massive amounts of data. This big data is
not optimal to store on the blockchain, as introduced in section 1.1, and therefore the
proposed framework makes use of IPFS. With IPFS integrated as off-chain storage, big data
is no longer stored in the transactions occurring over the blockchain network, as each
record is only stored on-chain as a reference variable. Furthermore, there is no need for
a local database to carry the volume that can be expected from the electronic health
records. While there are several distributed storage alternatives to IPFS, IPFS
integrates well with JavaScript. This means that developers can code the exchange with
IPFS in a straightforward manner, which in turn makes storing data easy for users of the
system. This is why IPFS is the choice of storage for the proposed artifact. These
features of the artifact address the major scalability issue, as there is no single
location of storage and the big transactions that would be needed to store the data on
the blockchain are avoided.
RQ 5. How can patients’ and providers’ lack of experience and ease of use be
solved regarding key management for blockchain credentials as well as encryption
keys?
Dealing with cryptography with zero experience and a lack of general technical knowledge
can be rather overwhelming. Symmetric encryption was therefore chosen for the proposed
framework in order to minimize the number of records that have to be uploaded, as well as
the number of keys that have to be dealt with. With asymmetric encryption, each record
has to be encrypted for one recipient for that recipient to be able to decrypt it using
their private key. This means that each record has to be uploaded as many times as there
are recipients, and that each participant’s public key has to be exchanged and verified.
The proposed framework avoids this through symmetric encryption, where each record has to
be uploaded only once and can be shared with as many providers as needed without
uploading a new record. However, the downside lies in the distribution of the secret key.
Due to the time limit, an effective way of sharing the key has not been implemented in
the framework. As of now, the secret key has to be shared in a way chosen by the
participants. This is not ideal, as it could compromise the security of the key, since
not every individual will have the knowledge or experience to handle a secret key for
symmetric encryption and decryption. However, a key management service could solve this
issue by letting secret keys be generated and shared in a secure and easy way for
inexperienced users. The framework proposes the use of AWS Key Management Service, which
allows for usage outside of cloud services.

5.2 Requirements

Requirement 1: The system lets the various kinds of entities register as users. However,
allowing any user to register as any entity would not be acceptable, so registration
needs gatekeeping. The proposed system handles this by giving the owner of the contract
the ability to assign hospital nodes; the hospitals in turn gain the ability to assign
patient nodes and provider nodes. This is controlled through a granular mapping of
access, which gives the system control over who is allowed to register and in what role.
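
The registration hierarchy described for Requirement 1 can be illustrated with a small
in-memory model. This is only a sketch written in JavaScript, not the smart contract of
the thesis: the names Registry, Role, addHospital and addMember are hypothetical, and
plain strings stand in for Ethereum addresses.

```javascript
// Illustrative model of the registration gatekeeping; all names are hypothetical.
const Role = Object.freeze({
  NONE: 'none',
  HOSPITAL: 'hospital',
  PATIENT: 'patient',
  PROVIDER: 'provider',
});

class Registry {
  constructor(owner) {
    this.owner = owner;      // address of the contract owner
    this.roles = new Map();  // address -> Role
  }

  roleOf(address) {
    return this.roles.get(address) ?? Role.NONE;
  }

  // Only the contract owner may assign hospital nodes.
  addHospital(sender, hospital) {
    if (sender !== this.owner) {
      throw new Error('only the owner can assign hospitals');
    }
    this.roles.set(hospital, Role.HOSPITAL);
  }

  // Only a registered hospital may assign patient or provider nodes.
  addMember(sender, member, role) {
    if (this.roleOf(sender) !== Role.HOSPITAL) {
      throw new Error('only hospitals can assign patients and providers');
    }
    if (role !== Role.PATIENT && role !== Role.PROVIDER) {
      throw new Error('hospitals can only assign patients or providers');
    }
    this.roles.set(member, role);
  }
}
```

In this model a patient or provider address cannot register further users, because
every call checks the role of the sender before the mapping is changed.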
Requirement 2,3: In the proposed system the patient is in charge of who can and who
cannot access their records. This gives the patients enough insight into their records
without giving them too much control over the creation and deletion of records, which
could be problematic as described in section 3.6. Each patient has a mapping attribute
that maps an address to a boolean value. When a patient gives access to a provider, the
entry for the provider's address is set to true, giving the provider full access to view
and upload records. If a patient revokes a provider's access, the entry is set to false,
and the provider can neither upload nor download the patient's records. Due to this
logic, each record is protected by an access list that is controlled by the patient.
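
The boolean access mapping can be sketched as follows. The sketch is a JavaScript model
of the logic only; the class name PatientAccess and its methods are hypothetical and are
not taken from the contract, and strings stand in for Ethereum addresses.

```javascript
// Illustrative model of a patient's access list (address -> boolean).
class PatientAccess {
  constructor(patient) {
    this.patient = patient;   // address of the patient who owns the list
    this.access = new Map();  // provider address -> true / false
  }

  // Only the patient may change the access list.
  requirePatient(sender) {
    if (sender !== this.patient) {
      throw new Error('only the patient can change access');
    }
  }

  grant(sender, provider) {
    this.requirePatient(sender);
    this.access.set(provider, true);
  }

  revoke(sender, provider) {
    this.requirePatient(sender);
    this.access.set(provider, false);
  }

  // Addresses that were never granted access are treated as false.
  hasAccess(provider) {
    return this.access.get(provider) === true;
  }
}
```

Treating a missing entry as false mirrors the default value of a boolean mapping, so a
provider has no access until the patient explicitly grants it.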
Requirement 4: An exchange system without the ability to upload and share data is not an
exchange system. The ability for a provider to upload records according to their access
is therefore vital. When a provider uploads a record, the provider first undergoes two
access controls, which verify that the provider is in fact a provider and that the
provider has been granted access by the patient. This ensures that no unauthorized user
manages or uploads a patient's record. Once the access control is finished, the
encryption process starts. As shown in algorithm 8, the record being uploaded is
converted into a buffer, the buffer is then encrypted, and a check is carried out to
verify that the buffer has in fact been encrypted. This step is necessary, as it ensures
that no record passes the upload unencrypted and thereby compromises the confidentiality
of the record. The record is now ready for upload, and the IPFS API is used to push the
encrypted buffer onto IPFS. If IPFS returns a hash value the upload was successful; if
IPFS returns no hash value the upload has failed and the function is interrupted. This
prevents the provider from updating a patient's hash values without any record having
been uploaded, which would lead to confusion among the patient and the providers.
Lastly, algorithm 6 is called, which carries out yet another access control for
protection against unauthorized actions.
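
The order of the upload steps can be summarized in a short JavaScript sketch. It is not
algorithm 8 or algorithm 6 themselves: the helpers in deps (isProvider, hasAccess,
ipfsAdd, storeHash) are hypothetical stand-ins for the contract calls and the IPFS API,
and AES-256 in CBC mode is an assumption, since only the use of AES with a secret key
and iv is stated. The way the encryption check is done here is likewise an assumption.

```javascript
const crypto = require('node:crypto');

// Sketch of the upload path. `deps` bundles hypothetical helpers:
//   isProvider(address)              -> boolean
//   hasAccess(patient, provider)     -> boolean
//   ipfsAdd(buffer)                  -> Promise<string | undefined>
//   storeHash(provider, patient, h)  -> Promise<void>
async function uploadRecord(deps, provider, patient, record) {
  // Two access controls: the sender is a provider and the patient granted access.
  if (!deps.isProvider(provider)) {
    throw new Error('sender is not a provider');
  }
  if (!deps.hasAccess(patient, provider)) {
    throw new Error('access not granted by the patient');
  }

  // Convert the record into a buffer and encrypt it with a fresh key and iv.
  const plain = Buffer.from(record);
  const key = crypto.randomBytes(32); // assumed key size (AES-256)
  const iv = crypto.randomBytes(16);  // AES block size
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const encrypted = Buffer.concat([cipher.update(plain), cipher.final()]);

  // Refuse to continue if the buffer is identical to the plaintext.
  if (encrypted.equals(plain)) {
    throw new Error('record was not encrypted');
  }

  // Push the encrypted buffer to IPFS; a missing hash means the upload failed.
  const hash = await deps.ipfsAdd(encrypted);
  if (!hash) {
    throw new Error('IPFS upload failed');
  }

  // Only now is the patient's hash value updated (a further access check
  // would take place inside the contract call).
  await deps.storeHash(provider, patient, hash);
  return { hash, key, iv };
}
```

Because storeHash is only reached after a hash has been returned, a failed upload can
never leave a hash value on-chain that points to no record.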
Requirement 5: Both patients and providers need to be able to view the records that are
relevant to them according to their access list. A patient has the right to view their
own records, as these contain information about their own health. A provider has the
right to view records according to their access list, because this is required for them
to do their job and provide better healthcare for the patients. If a provider wishes to
view or download a patient's record, the provider needs to pass the access control
check, which requires that the provider has been granted access by the patient; this is
controlled by a boolean mapping. If the provider passes the access control, the provider
needs to supply the system with the secret key and iv generated during the upload
process. As of now, the system does not provide any key distribution solution, which may
affect the security of the keys generated during upload. However, as mentioned, a key
management service such as AWS KMS could be beneficial and help patients and providers
securely share and store keys.
Requirement 6: As the system is built on a public blockchain and a public file system,
the records need to be encrypted to maintain confidentiality. The encryption of the
records is handled by a JavaScript library called crypto. Crypto provides several
cryptographic primitives, such as the symmetric encryption algorithm AES, which is the
algorithm used for encrypting and decrypting records in this system. As described in
section 5.1, AES was chosen due to its wide use for classified information within the US
government. Before a record is encrypted, the crypto library is used to randomly
generate a secret key and iv. By generating the key randomly instead of letting users
decide on a key, the human error factor is taken out and weak keys are avoided, which
protects the records from being breached due to a poor choice of key.
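
A minimal sketch of random key generation and an AES round trip with the Node.js crypto
module is given below. The mode (CBC) and key size (256 bits) are assumptions made for
the example, and the function names encryptRecord and decryptRecord are hypothetical.

```javascript
const crypto = require('node:crypto');

// Encrypt a record with a randomly generated secret key and iv.
function encryptRecord(record) {
  const key = crypto.randomBytes(32); // 256-bit key, never chosen by the user
  const iv = crypto.randomBytes(16);  // one fresh iv per record
  const cipher = crypto.createCipheriv('aes-256-cbc', key, iv);
  const encrypted = Buffer.concat([cipher.update(record), cipher.final()]);
  return { encrypted, key, iv };
}

// Decrypt a record; requires the same key and iv that were used for encryption.
function decryptRecord(encrypted, key, iv) {
  const decipher = crypto.createDecipheriv('aes-256-cbc', key, iv);
  return Buffer.concat([decipher.update(encrypted), decipher.final()]);
}
```

Since both the key and the iv are needed for decryption, they have to be stored and
shared securely by whoever performed the upload.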

5.3 Challenges

While the previous subsection highlights the strengths of the proposed design with
respect to several security aspects, this subsection gives an overview of a few
challenges that the design faces.

5.3.1 Confidentiality challenges

Even with the use of strong encryption algorithms and granular access control for each
electronic health record, some challenges arise which can compromise the confidentiality
of a record. Figure 5.9 shows an attack tree which demonstrates how the confidentiality
of the records could be compromised. An attack tree consists of nodes: the root node
represents the end goal, while the leaf nodes represent the different possible ways of
reaching that goal [31].

Figure 5.9: Attack tree against records confidentiality

The first possible danger to the confidentiality of the records is unauthorized
decryption, which can be achieved through a man-in-the-middle (MITM) attack. A
man-in-the-middle attack occurs when an attacker positions themselves between two
communicating parties; the attacker can then eavesdrop and collect valuable information,
or disrupt or alter the communication to their advantage. In this situation, the goal of
the MITM attack is to retrieve the information needed to decrypt records. Furthermore,
there is a possibility that an attacker could use brute force to retrieve the necessary
information.
The second possible danger is human error. Humans are not flawlessly coded robots; we
tend to make mistakes, and these mistakes may put the proposed design and the records in
danger. A user of the system may fall victim to malware, or may accidentally leak
valuable information, either of which could lead to a compromised record. Lastly, there
is the insider attack, which is not a human error, as it is carried out with intent.
Insider attacks are a major concern in information security and are among the toughest
attacks to deal with [32]. An insider could abuse their privileges and compromise the
confidentiality of the records without raising suspicion.
The third danger the design may be vulnerable to is impersonation attacks. An
impersonation attack is based on impersonating one or more individuals who possess
certain authorization in order to obtain valuable information. In the case of the
proposed artifact, an attacker may impersonate a user of the system in order to gain
information which the attacker should not have.

5.3.2 Privacy challenges

As described in section 5.1, privacy is granted to the users of the system through the
way their identity as a user is defined: all users are anonymous and are represented
only by an Ethereum address. However, there are still ways in which a real-life identity
could be pinned to a specific address. An insider who has dealt with the health record
of a patient has access to real identification information from the record as well as to
the Ethereum address it is tied to, and could therefore reveal the identity behind that
address. Furthermore, frequency analysis could be used to pin an individual to an
Ethereum address, based on hospital visits and the frequency of transactions on the
blockchain.

5.4 System comparison

The proposed system for blockchain-integrated electronic health records differs from
existing proposals in its choice of storage and encryption. Section 3.5 makes clear that
a centralized database, whether hosted in a cloud or locally, is the general approach
for storing electronic health records. The system proposed in this thesis instead makes
use of a decentralized file system called IPFS. The advantage of IPFS over centralized
storage lies in the distribution of the data. When a file is uploaded to IPFS, the data
is distributed and shared across a network of nodes, while centralized storage has a
single point of access. This distribution brings key benefits: higher security in terms
of integrity, higher availability due to the multiple access points, and higher speed,
since the multiple access points prevent the bottleneck that comes with centralized
storage. Combining these benefits with strong encryption makes IPFS an excellent storage
solution for electronic health records.
Moving on to the difference in encryption, a few of the frameworks in the state of the
art section did not implement any encryption, although they do state that encryption is
needed for a strong framework. The rest made use of encryption in the form of proxy
re-encryption. What the framework proposed in this thesis does differently is to use
symmetric encryption, which brings the advantages of fewer keys to handle, faster
operation and lower computational requirements. A last, arguable advantage is that fewer
uploads are needed: as symmetric encryption uses a single secret key, each record only
has to be uploaded once, and the secret key is shared with those who should have access.
With public-key encryption schemes, each record would have to be uploaded as many times
as there are providers it should be shared with, which is a major inconvenience and
could take up a massive volume of storage, depending on the choice of storage. However,
this framework does not integrate any key distribution solution. What could potentially
be implemented in the future is the use of AWS Key Management Service (KMS)6. KMS
provides the option of storing asymmetric and symmetric keys with granular permission
control. The service would be handled by the administrator node, where each provider
would be added to the service and be able to store keys and share them, in a secure and
fast manner, with the patient and with the providers that have access. Key management is
something many researchers have neglected, and it is often assumed that the users will
know how to properly store and share credentials.

6 https://aws.amazon.com/kms/

6 Conclusion

This project set out to solve challenges regarding the security of electronic health
records by using the novel technology of blockchain. The project formulated five
research questions, which helped to set the direction and objectives for the system. The
five research questions addressed (1) the state of the art, (2) how the system handles
the CIA triad, (3) how the scalability issues of blockchain may be resolved, (4) the
privacy of the users and (5) how the usability of the cryptographic measures could be
addressed. The first research question, regarding the state of the art, was answered
through a limited literature study. Research questions 2 through 5 were answered through
the development of the system.
The proposed system makes use of Ethereum smart contracts, decentralized storage using
IPFS and symmetric encryption as the fundamentals of a strong and robust solution for
securely exchanging electronic health records between patients and providers. It does so
by storing reference variables on the blockchain, while the sensitive information of the
health records is encrypted and stored off-chain on IPFS. The users of the system are
kept private by minimizing the number of identifiers each user carries.
Specific security aspects are discussed and evaluated in the form of research questions.
The overall result shows that the integration of blockchain and distributed off-chain
storage brings several benefits which support confidentiality, integrity, availability
and privacy, as well as scalability, since the bulk of the data is stored off-chain.

6.1 Future Work

As the project came to an end, further opportunities became clear. There are various
upgrades and additions that could be made to the system which would improve the security
and overall performance of the artifact. They are listed below with a brief explanation
of their importance for the future of the system.

• As discussed in section 5.1, incorporating a key management service into the system
would benefit every user and could decrease the misuse of credentials and encryption
keys, which in turn would mitigate several nodes of the attack tree in figure 5.9.

• Conduct experimental security tests that simulate various attacks which could result
in leaked records or privacy breaches, in order to gain further knowledge about the
strength of the system. As of now, the security aspects of the artifact are evaluated
solely in a conceptual manner.

• With the development of blockchain 3.0, the system could potentially be converted to
and tested on such a network, which would presumably decrease transaction fees and
transaction times considerably. This would be beneficial if the artifact were to reach
mass adoption and better overall performance were needed.

• Migration towards the Ceramic network7 could be a potential upgrade. The Ceramic
network is built on top of IPFS and provides more granular access control and similar
features, which IPFS lacked at the time of developing this artifact.

7 https://ceramic.network/

References

[1] Office of the National Coordinator for Health Information Technology, “What
information does an electronic health record (EHR) contain?” 2019. [Online]. Available:
https://www.healthit.gov/faq/what-electronic-health-record-ehr (Accessed August
21, 2021).

[2] Office of the National Coordinator for Health Information Technology, “Non-federal
acute care hospital electronic health record adoption,” Health IT Quick-Stat 47,
09 2017. [Online]. Available:
https://dashboard.healthit.gov/quickstats/pages/FIG-Hospital-EHR-Adoption.php
(Accessed August 21, 2021).

[3] N. Menachemi, S. Rahurkar, C. A. Harle, and J. R. Vest, “The benefits of health
information exchange: an updated systematic review,” Journal of the American
Medical Informatics Association, vol. 25, no. 9, pp. 1259–1265, 04 2018. [Online].
Available: https://doi.org/10.1093/jamia/ocy035 (Accessed August 21, 2021).

[4] J. Goodman, L. Gorman, and D. Herrick, “Health information technology: Benefits
and problems,” 2010. [Online]. Available:
https://www.ncpathinktank.org/pdfs/st327.pdf (Accessed August 21, 2021).

[5] Privacy Rights Clearinghouse, “Data breaches.” [Online]. Available:
https://privacyrights.org/data-breaches (Accessed August 21, 2021).

[6] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” 2009. [Online].
Available: http://www.bitcoin.org/bitcoin.pdf (Accessed August 21, 2021).

[7] D. R. Matos, M. L. Pardal, P. Adão, A. R. Silva, and M. Correia, “Securing
electronic health records in the cloud,” in Proceedings of the 1st Workshop on
Privacy by Design in Distributed Systems, ser. W-P2DS’18. New York, NY,
USA: Association for Computing Machinery, 2018. [Online]. Available:
https://doi-org.proxy.lnu.se/10.1145/3195258.3195259 (Accessed August 21, 2021).

[8] A. Dubovitskaya, F. Baig, Z. Xu, R. Shukla, P. S. Zambani, A. Swaminathan,
M. M. Jahangir, K. Chowdhry, R. Lachhani, N. Idnani, M. Schumacher,
K. Aberer, S. D. Stoller, S. Ryu, and F. Wang, “Action-ehr: Patient-centric
blockchain-based electronic health record data management for cancer care,” J
Med Internet Res, vol. 22, no. 8, p. e13598, Aug 2020. [Online]. Available:
http://www.jmir.org/2020/8/e13598/ (Accessed August 21, 2021).

[9] K. Peffers, T. Tuunanen, M. A. Rothenberger, and S. Chatterjee, “A design science
research methodology for information systems research,” Journal of Management
Information Systems, vol. 24, no. 3, pp. 45–77, 2007. [Online]. Available:
https://doi.org/10.2753/MIS0742-1222240302 (Accessed August 21, 2021).

[10] M. Gupta, Blockchain For Dummies, 3rd IBM Limited Edition. John Wiley & Sons,
Inc, 2020.

[11] S. Haber and W. S. Stornetta, “How to time-stamp a digital document,”
J. Cryptol., vol. 3, no. 2, pp. 99–111, Jan. 1991. [Online]. Available:
https://doi-org.proxy.lnu.se/10.1007/BF00196791 (Accessed August 21, 2021).

[12] V. Buterin, “Ethereum: A next-generation smart contract and decentralized
application platform,” 2013. [Online]. Available:
https://github.com/ethereum/wiki/wiki/White-Paper (Accessed August 21, 2021).

[13] “What is ether (eth)?” 2021. [Online]. Available: https://ethereum.org/en/eth/
(Accessed August 21, 2021).

[14] A. Bahga and V. Madisetti, “Blockchain platform for industrial internet of things,”
Journal of Software Engineering and Applications, no. 9, pp. 533–546, 2016.
[Online]. Available: https://doi.org/10.4236/jsea.2016.910036 (Accessed August
21, 2021).

[15] J. Benet, “IPFS - content addressed, versioned, P2P file system,” CoRR, vol.
abs/1407.3561, 2014. [Online]. Available: http://arxiv.org/abs/1407.3561 (Accessed
August 21, 2021).

[16] “What is ipfs?” [Online]. Available:
https://docs.ipfs.io/concepts/what-is-ipfs/#decentralization (Accessed August 21, 2021).

[17] A. Ekblaw, A. Azaria, J. D. Halamka, and A. Lippman, “A case study for blockchain
in healthcare: “medrec” prototype for electronic health records and medical research
data,” in Proceedings of IEEE open & big data conference, vol. 13, 2016, p. 13.

[18] G. G. Dagher, J. Mohler, M. Milojkovic, and P. B. Marella, “Ancile:
Privacy-preserving framework for access control and interoperability of electronic
health records using blockchain technology,” Sustainable Cities and Society, vol. 39,
pp. 283–297, 2018. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S2210670717310685
(Accessed August 21, 2021).

[19] J. Vora, A. Nayyar, S. Tanwar, S. Tyagi, N. Kumar, M. S. Obaidat, and J. J. P. C.
Rodrigues, “Bheem: A blockchain-based framework for securing electronic health
records,” in 2018 IEEE Globecom Workshops (GC Wkshps), 2018, pp. 1–6.

[20] T. Matsuo, “Proxy re-encryption systems for identity-based encryption,” in
Pairing-Based Cryptography – Pairing 2007, T. Takagi, T. Okamoto, E. Okamoto, and
T. Okamoto, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp. 247–267.

[21] A. Shahnaz, U. Qamar, and A. Khalid, “Using blockchain for electronic health
records,” IEEE Access, vol. 7, pp. 147 782–147 795, 2019.

[22] M. M. Madine, A. A. Battah, I. Yaqoob, K. Salah, R. Jayaraman, Y. Al-Hammadi,
S. Pesic, and S. Ellahham, “Blockchain for giving patients control over their medical
records,” IEEE Access, vol. 8, pp. 193 102–193 115, 2020.

[23] T. Gabriel, A. Cornel-Cristian, M. Arhip-Calin, and A. Zamfirescu, “Cloud storage.
A comparison between centralized solutions versus decentralized cloud storage
solutions using blockchain technology,” in 2019 54th International Universities Power
Engineering Conference (UPEC), 2019, pp. 1–5.

[24] T. Hardjono and L. R. Dondeti, Security In Wireless LANS And MANS (Artech House
Computer Security). USA: Artech House, Inc., 2005.

[25] M. Lester, S. Boateng, J. Studeny, and A. Coustasse, “Personal health records:
Beneficial or burdensome for patients and healthcare providers?” Perspectives in
health information management, vol. 13, no. Spring, Apr 2016. [Online]. Available:
https://pubmed.ncbi.nlm.nih.gov/27134613 (Accessed August 21, 2021).

[26] C. Showell, “Barriers to the use of personal health records by patients: a
structured review,” PeerJ, vol. 5, pp. e3268–e3268, Apr 2017. [Online]. Available:
https://pubmed.ncbi.nlm.nih.gov/28462058 (Accessed June 6, 2021).

[27] Z. J. Chowdhury, D. Pishva, and G. G. D. Nishantha, “AES and confidentiality from
the inside out,” in 2010 The 12th International Conference on Advanced
Communication Technology (ICACT), vol. 2, 2010, pp. 1587–1591.

[28] B. Liu, X. L. Yu, S. Chen, X. Xu, and L. Zhu, “Blockchain based data integrity
service framework for iot data,” in 2017 IEEE International Conference on Web
Services (ICWS), 2017, pp. 468–475.

[29] M. Warkentin and C. Orgeron, “Using the security triad to assess blockchain
technology in public sector applications,” International Journal of Information
Management, vol. 52, p. 102090, 2020. [Online]. Available:
https://www.sciencedirect.com/science/article/pii/S026840121930060X
(Accessed August 21, 2021).

[30] T. M. Fernández-Caramés and P. Fraga-Lamas, “A review on the use of blockchain
for the internet of things,” IEEE Access, vol. 6, pp. 32 979–33 001, 2018.

[31] E. G. Amoroso, Fundamentals of Computer Security Technology. USA:
Prentice-Hall, Inc., 1994.

[32] C. W. Probst, R. R. Hansen, and F. Nielson, “Where can an insider attack?” in
Formal Aspects in Security and Trust, T. Dimitrakos, F. Martinelli, P. Y. A. Ryan,
and S. Schneider, Eds. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007, pp.
127–142.
