
Received: 19 October 2022 Revised: 19 January 2023 Accepted: 3 March 2023

DOI: 10.1111/exsy.13283

ORIGINAL ARTICLE

PriMed: Private federated training and encrypted inference on medical images in healthcare

Aparna Gopalakrishnan 1 | Narayan P. Kulkarni 1 | Chethan B. Raghavendra 1 | Raghavendra Manjappa 1 | Prasad Honnavalli 1 | Sivaraman Eswaran 2

1 Department of Computer Science and Engineering, PES University, Bengaluru, India
2 Department of Electrical and Computer Engineering, Curtin University, Miri, Malaysia

Correspondence
Sivaraman Eswaran, Department of Electrical and Computer Engineering, Curtin University, Miri, Malaysia.
Email: sivaraman.eswaran@gmail.com

Abstract

In healthcare, patient information is a sparse, critical asset that is considered private data and is often protected by law. It is also the domain least explored in the field of Machine Learning. The main reason for this is that building efficient artificial intelligence (AI) based models for preliminary diagnosis of various diseases requires a large corpus of data, which can be obtained by pooling patient information from multiple sources. However, for these sources to agree to share their data across distributed systems for training algorithms and models, there has to be an assurance that there will be no disclosure of the personally identifiable information (PII) of the respective Data Owners. This paper proposes PriMed, an approach to build robust privacy preserving additions to convolutional neural networks (CNN) for training and performing inference on medical images without compromising privacy. Since privacy of the data is preserved, large amounts of data can be effectively accumulated to increase the accuracy and efficiency of AI models in the field of healthcare. This involves implementing a hybrid of privacy-enhancing techniques like Federated Learning, Differential Privacy, and Homomorphic Encryption to provide a private and secure environment for learning through data.

KEYWORDS
convolutional neural networks, differential privacy, federated learning, homomorphic
encryption, privacy preserving machine learning

1 | INTRODUCTION

While developing Machine Learning models, the algorithm is trained over a stationary dataset, held in a single location. Evolving expert systems in
the domains of E-Commerce, Social Media, and Finance have been very successful, but classification and detection models for healthcare have
been largely unexplored.
Although there have been Artificial Intelligence (AI)-based models built before for preliminary diagnosis of diseases like diabetic retinopathy
by Malekzadeh et al. (2021), breast cancer, and tuberculosis based on medical image scans, there is still a lot of scope for detection of other dis-
eases. The major reason this domain has not been researched to its full potential is that the patient data on which models need to train is not

only local to the medical institution having rights to it, but is also considered private and hence, closely governed by privacy laws. Large amounts
of classified and labelled data are required to train AI-based models, and such requirements are usually not met by data from a single institution.
To be able to build efficient models, data from various sources need to be collected, but there is a risk to the privacy of the individual in the data
record if there is a misuse of the information (Seh et al., 2020).
Algorithms can be trained effectively if multiple medical institutions trust data analysts enough to share data, and this trust can be established by ensuring that patient information in data records cannot be traced back to the original patient. One way to do this is through anonymization (Majeed & Lee, 2020), i.e., removing all personally identifiable information (PII) from the dataset and keeping only the information necessary to train the model. However, anonymization is not infallible and it is still possible to re-create PII by making use of correlated
data (Narayanan & Shmatikov, 2008).
Another possible solution to the data privacy problem is to make use of a group of techniques to train models without compromising on the
privacy of the model as well as that of the data used to train the model. As a whole, these techniques are known as Privacy Preserving
Techniques.

1.1 | Federated learning

Federated learning (FL) is a machine learning technique for distributed, decentralized devices or servers (Lian et al., 2022; McMahan et al., 2017;
Pandya et al., 2023). This approach can be used to train algorithms on the different datasets located on different edge devices instead of
uploading all the local datasets to a single server on which the algorithm is trained. In this way, each entity retains complete control and ownership of its data while still obtaining the benefits of a model developed on the larger corpus of data held across the different servers. It is important to note that
any of the edge devices/nodes involved in training can perform model inversion attacks (Hidano et al., 2018) on any of the other nodes, since all
of them will have the updated model parameters.

1.2 | Differential privacy

Differential privacy (DP; Abadi et al., 2016) is a method used to provide access to a dataset of individuals without revealing information specific
to a single individual in the dataset (Arachchige et al., 2019). This is done by adding calibrated noise (or making a single substitution) so that the result of a query is essentially unchanged by the presence or absence of any one individual's record, therefore preserving privacy. One main drawback of DP is that for complex
datasets, one needs to add more noise in order to preserve privacy, which may lead to reduction in accuracy (Bagdasaryan et al., 2019).
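
To make the mechanism concrete, the following toy sketch (not part of PriMed, which uses DP-SGD instead) applies the Laplace mechanism to a simple mean query over values assumed to lie in [0, 1]; the noise scale is the query's sensitivity divided by the privacy budget epsilon.

```python
import numpy as np

def dp_mean(values, epsilon):
    """Laplace-mechanism mean for values assumed to lie in [0, 1].

    Changing one record shifts the mean by at most 1/n, so the Laplace
    noise scale is (1/n) / epsilon.
    """
    values = np.clip(np.asarray(values, dtype=float), 0.0, 1.0)
    sensitivity = 1.0 / len(values)
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return values.mean() + noise

# Smaller epsilon => more noise => stronger privacy but lower utility.
print(dp_mean([0.2, 0.8, 0.4, 0.9], epsilon=0.5))
```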

1.3 | Homomorphic encryption

Homomorphic encryption (HE; Fan & Vercauteren, 2012; Gentry, 2009) is a method of encryption that lets users perform operations on encrypted data without decrypting it, obtaining an encrypted output which, when decrypted, is identical to the result of executing the same operations directly on the unencrypted data (Acar et al., 2018; Ishiyama et al., 2020). Hence, the accuracy of the operation
remains the same, but the privacy of the operands is increased. Computing on encrypted data, however, that is, on ciphertext rather than plaintext, increases the computational complexity significantly (Moore et al., 2014).
In this work, the shortcomings of each of these techniques when used individually are overcome with the proposal of a hybrid approach that makes use of Federated Learning, Differential Privacy, and Homomorphic Encryption over the training and inference processes of convolutional neural networks (CNN).
The remainder of this paper is structured as follows: Section 2 briefly reviews related works which make use of the above techniques both
individually and in simple hybrid approaches and how the idea of PriMed proposed in this paper contrasts them, Section 3 describes the proposed
methodology and algorithms for secure and accurate training and inference phases, Section 4 presents the implementation of said methodology,
Section 5 displays an experimental evaluation of the implementation and the final sections discuss conclusions and future work.

2 | RELATED WORK

McMahan et al. (2017), Lian et al. (2022) and Pandya et al. (2023) show implementations of FL, which have since been improved upon by Shokri and Shmatikov (2015), Truex et al. (2019), Malekzadeh et al. (2021) and Wei et al. (2020) through the addition of DP during the aggregation process.

Truex et al. (2019) were the first to introduce a hybrid approach to this problem. They make use of multiple clients for local training and
secure federated aggregation done at a central aggregator. Weights sent from the clients to the centre are homomorphically encrypted; these encrypted weights are averaged and then decrypted using a majority of the various clients' keys to update the central model. Similarly, Malekzadeh et al. (2021) implement
this hybrid algorithm taking into account the trade-off in (Truex et al., 2019) for using momentum while training and multiple local stochastic gra-
dient descent (SGD) steps.
Stripelis et al. (2021) and Sav et al. (2020) present hybrid approaches in which FL processes are used along with HE to securely train
models. However, in (Sav et al., 2020), the model is encrypted as a whole and the entire training is done on the encrypted model parameters,
which is computationally expensive. (Stripelis et al., 2021) overcomes this by local training on unencrypted weights and encrypting the weights
only when they need to be sent to a centralized controller for aggregation. Even so, it does not prevent model inversion attacks in the event of cli-
ent compromise as every client has the aggregated un-noisy (without DP) model weights in plain text.
Gilad-Bachrach et al. (2016), Hesamifard et al. (2017), and Disabato et al. (2020) have provided efficient ways to
use HE in the inference phase including substitution of operations not supported by HE with polynomial approximations (Gilad-Bachrach
et al., 2016).
As described in Section 1, medical data is regarded as highly private and large amounts of data from various institutions are required to train
models efficiently. The governance laws and risk involved need to be considered for institutions to share private data to support research. If data
provided is encrypted, processing it for training neural networks is highly compute-intensive and slow. These are the current limitations pertaining
to developments in this field. All the currently documented approaches either protect the privacy of an individual while training a model
(Malekzadeh et al., 2021; Sav et al., 2020; Truex et al., 2019; Wei et al., 2020) or while using the model for predictions (Disabato et al., 2020;
Gilad-Bachrach et al., 2016; Hesamifard et al., 2017), rarely describing privacy preservation in both scenarios as a whole.

3 | PROPOSED APPROACH

The various approaches discussed in the previous section have tried to find a balance between protecting privacy and training/accuracy costs.
Weighing the pros and cons, this work proposes PriMed, which involves using: Federated Learning, to allow various organizations to keep their data local and still participate in the training process; Differential Privacy, to make the model weights in transit and at the local clients noisy, safe and uninferrable while training; and Homomorphic Encryption of prediction input data during inference, as shown in Figure 1. Unlike other implementations, this novel approach intends to cover the privacy of the entire machine learning process—training and inference.

FIGURE 1 PriMed training and inference phase flow



3.1 | Training

The PriMed training setup consists of N medical institutions/Data Owners c1, c2, c3, …, cN, each having their own private dataset d1, d2, d3, …, dN as
shown in Figure 2. The owners of the data wish to collectively train a model on their dispersed data, but do not trust one another enough to pro-
vide direct data access. There is a Global Server, a third party which coordinates the training process among the institutions.
Assuming the server is untrusted, the prevention of inference attacks can be ensured with the usage of Differentially Private Federated
Learning, similar to (Malekzadeh et al., 2021). For a more cautious approach, there would be a need to use encrypted user data and model weights
like in (Sav et al., 2020; Stripelis et al., 2021) which is a huge overhead.
Additional network security can be provided for protecting the model weights in transit using public key cryptography, where each owner encrypts the model weights with the Global Server's public key before sending them, and the server then decrypts them with its corresponding private key. This is, however, not implemented in this work.
The training process, described in Algorithm 1, begins with the Global Server randomly initializing the weights of a central model m with the specified model architecture and sending a copy of it to each owner for local Federated Training on their private datasets. After an epoch of training, noise is added to the model using Differentially-Private Stochastic Gradient Descent (DP-SGD; Abadi et al., 2016) and the privacy loss is calculated before the updated model is sent back to the server.
Adding noise only to the final model parameters does not provide much benefit, as the model can still memorize its training data. Additionally, if a third party is able to ascertain the model weights, the input data can be inferred back and thus privacy guarantees will be lost. Instead, if noise is introduced in the training process while calculating SGD during backpropagation, the model weights will be safe from inversion attacks (Abadi et al., 2016; Malekzadeh et al., 2021). The amount of noise added is proportional to the privacy obtained. However, as the noise increases, the accuracy of the model decreases. This process has to strike a balance in the privacy versus accuracy tradeoff by adding an optimum amount of noise.

FIGURE 2 PriMed training phase methodology



ALGORITHM 1 Training Phase

Input: Global Server g, initialized Global Server model m, set of N Data Owners ci ∈ C, each with a remote dataset di for i = 1 to N, number of epochs E, set of Differential Privacy parameters dp.
Output: Fully trained Global Server model m′

Initialisation of client models:
1: for i = 1 to N do
2:   Send m to each ci as remote model mci
3: end for
Training process:
4: for epoch e = 1 to E do
5:   Initialise the aggregate of all remote models, param_sum = 0
6:   for each remote client ci ∈ C do
7:     Remote_Train(mci, e, dp):
8:       New parameters Pmci = SGD(parameters of mci + differential privacy noise)
9:     param_sum += Pmci
10:  end for
Federated averaging:
11:  federated_average = param_sum / N
12:  Update parameters of m′ = federated_average
Update of client models:
13:  for i = 1 to N do
14:    Send m′ to each ci as remote model mci
15:  end for
16: end for
17: return m′

DP-SGD (Abadi et al., 2016) is the technique used to add noise during backpropagation. The method computes per-sample gradients for a random subset of the data, clips them to a maximum threshold, aggregates the per-sample values, and adds noise to the aggregate to protect privacy. The optimiser then takes a step in the direction opposite to the aggregated noisy gradient. It mainly takes into account two parameters:

• Maximum Gradient Norm—the gradients are clipped to this threshold to limit the overall sensitivity within a bound, preventing outlier influence.
• Noise scale (σ)—directly proportional to the amount of noise added to the aggregated gradient.

Once the final noisy parameters are calculated, the server accumulates the various parameters from each owner and averages them to obtain
a new approximate value for the parameters. This is similar to the process of federated aggregation described in (Truex et al., 2019). The Global
Server updates its central model with the newly averaged parameters and then resends the new model to each client for another round of train-
ing. This noise addition and aggregation process happens for a specific number of epochs before the central model can be said to be fully trained
and available for inference.
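
As a concrete illustration of the federated averaging step described above, the following sketch (not the authors' code) averages the state dictionaries of locally trained, DP-noised PyTorch client models and loads the result into the global model. It assumes every state-dict entry is a floating-point tensor, which holds for the purely convolutional and linear models used in this paper.

```python
import copy
import torch

def federated_average(global_model, client_models):
    """One federated-averaging round: average client parameters into the global model."""
    client_states = [m.state_dict() for m in client_models]
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        # Stack the corresponding tensor from every client and take the element-wise mean.
        avg_state[key] = torch.stack(
            [state[key] for state in client_states], dim=0
        ).mean(dim=0)
    global_model.load_state_dict(avg_state)
    return global_model

# Example: global_model = federated_average(global_model, [remote_model_1, remote_model_2])
```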

3.2 | Inference

The final model obtained after the training phase can be used for plain inference by various institutions. Most works (Malekzadeh et al., 2021; Stripelis et al., 2021; Truex et al., 2019) end their implementations here. However, PriMed intends to protect the privacy of a user not only while
training as a part of the dataset, but also, during inference when a client sends a medical image to the Global Server for prediction. The clients for
our model, that is, medical institutions, are responsible for protecting the privacy of their patients. Sending patient image data for predictions in
plaintext, over the network to the Global Server may be susceptible to attacks. The Server itself may be compromised and image data may be
exposed. For similar reasons, patient data was maintained at the owners' end only throughout the training process. At the same time, the Global
Server may not prefer to send the model weights to every client that requests its services, as this does not scale, even though the model weights are differentially private. In short, the clients cannot be trusted in the way a common third-party Global Server, bound by law to maintain privacy, can be. Considering these factors, encrypting the image data before sending it to the server, and ensuring that it remains encrypted throughout the entire process at the server, helps maintain the privacy of the patient as well as of the model weights, as shown in Figure 3.
If the image has to remain encrypted throughout the prediction process, the model's convolutional operations have to be performed on
encrypted images. HE is a method, which allows computing on encrypted data providing a guarantee that decrypting the result of the operation
performed on ciphertext would be the same as the result of the operations performed on plaintext. At the core of it, convolutions and every other
layer, excluding some non-linear activation functions, in a model can be implemented as additions and multiplications. To support both these oper-
ations, there is a need to make use of a fully homomorphic encryption (FHE; Gentry, 2009) scheme. The operations allowed by FHE are shown in Equations (1) and (2).

a + b = HE_dec(HE_enc(a) + HE_enc(b))    (1)

a × b = HE_dec(HE_enc(a) × HE_enc(b))    (2)

A partially homomorphic encryption scheme, although more efficient, would only support one type of operation (e.g., only additions). After the discovery of Gentry's FHE (Gentry, 2009) scheme, the Brakerski/Fan-Vercauteren (BFV; Fan & Vercauteren, 2012) and Cheon-Kim-Kim-Song (CKKS; Cheon et al., 2017) schemes were also developed, based on the Ring Learning with Errors (LWE; Regev, 2009) problem. The privacy of the data is assured by the addi-
tion of noise through these schemes during encryption, which increases in magnitude as operations are performed on the ciphertext. There is a
limit to the noise which can be added to the ciphertext called a noise budget, which if exceeded, implies that a valid decrypted output cannot be
obtained. This noise budget can be replenished by performing implementation specific bootstrapping (Chillotti et al., 2021) or recryption methods,
resulting in a new ciphertext corresponding to the initial ciphertext without noise. There are libraries like SEAL (Microsoft, 2021), which implements these schemes in C++, and Pyfhel (Ibarrondo & Viand, 2021), which extends SEAL to Python.
The approach described in Algorithm 2 makes use of asymmetric keys to encrypt the data using HE. A client uses their public key to encrypt their data and sends it to the server for prediction. The prediction happens on the encrypted data and the encrypted result is sent back to the sender, where the output is decrypted using the client's private key.

FIGURE 3 PriMed inference phase methodology



ALGORITHM 2 Inference Phase

Input: Fully trained Global Server model m, input image i, Sender s, Global Server g
Output: Prediction o

At Sender:
1: s sends encrypted image i_enc = Enc_bfv(i) to g
At Global Server:
2: g sends encrypted output o_enc = m(i_enc) to s
At Sender:
3: s obtains decrypted output o = Dec_bfv(o_enc)
4: return o

There are non-linear activation functions and layers like Rectified Linear Unit (ReLU), Sigmoid, Max pooling, Softmax, and so on which cannot
be directly applied on homomorphically encrypted data. They have to be converted into their corresponding polynomial approximations (Ishiyama
et al., 2020), as a combination of addition and multiplication operations only. This approximation is used while implementing the model architec-
ture during the training phase itself to ensure correct inference phase results. Implementations like (Malekzadeh et al., 2021; Truex et al., 2019)
use these non-linear activation functions while training, so there is no scope to extend HE to performing predictions.

4 | IMPLEMENTATION

4.1 | Convolutional neural networks

CNNs (Albawi et al., 2017) are widely used to perform predictions over images as they are able to differentiate and recognize patterns in images.
The same can be used to obtain high accuracies if trained over medical images and remains majorly unexplored due to issues with privacy and
security, which can be overcome using the proposed approach.
Arbitrary models are created for the training and inference of the datasets described in the next subsection. The generic layers used in devel-
oping CNNs for the evaluation of PriMed are: convolutional layer, fully connected layer, average pooling layer instead of a max pooling layer, and
a square activation function instead of ReLU activation function as suggested by (Gilad-Bachrach et al., 2016; Ishiyama et al., 2020).
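
As a sketch (not the authors' exact code), the square activation used in place of ReLU can be written as a small PyTorch module; average pooling is then the standard nn.AvgPool2d, so every layer reduces to the additions and multiplications that HE can evaluate.

```python
import torch
from torch import nn

class Square(nn.Module):
    """HE-friendly polynomial substitute for ReLU: f(x) = x * x."""
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * x

# Use Square() wherever nn.ReLU() would normally appear, and
# nn.AvgPool2d(...) wherever nn.MaxPool2d(...) would normally appear.
```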

4.2 | Dataset

The implementation is done on MNIST (Deng, 2012) with 60,000 grey-scale 28 x 28 sized images for benchmarking and MedMNIST (Yang, Shi,
Wei, et al., 2021)—a collection of standardized colour medical images of size 28 x 28 for analysis. BreastMNIST, PneumoniaMNIST, RetinaMNIST
and BloodMNIST image datasets are preprocessed and dataset-specific models are developed to validate the approach. The model architectures
used for each dataset are described in the Appendix.

4.3 | Training

Python-based open source libraries PySyft (Ryffel et al., 2018), Opacus (Yousefpour et al., 2021), and Pyfhel (Ibarrondo & Viand, 2021) are used
to implement Privacy Preserving Machine Learning techniques (PPML).
PySyft (Ryffel et al., 2018) is an open-source library developed by OpenMined to support remote data training and simulate the presence of
multiple Data Owners or medical institutions who want to participate in the private data training process with their remote, sparsely distributed
datasets. PySyft provides a session type called Duet (Jain & Jain, 2021), which is used to coordinate the Data Owners and the Global Server, from where the training on private data is orchestrated; both are implemented as Jupyter notebooks. Remote references to the datasets
at each owner are obtained using Duet along with remote references to torch modules of every owner, at the server. These references are only
used to coordinate the process from the server and the data always remains local to the owner.

A PyTorch model architecture is defined at the Global Server and is sent to each of the Data Owners using the send() function which returns
a pointer reference to each remote model for every Data Owner. These model references will be used to train the models remotely from the
Global Server.
During each epoch of training, the remote model is trained and noise is added using Opacus (Yousefpour et al., 2021) library's DP-SGD imple-
mentation. Opacus is integrated with PyTorch and allows the addition of DP. It implements the parameters Noise Scale as noise_multiplier, in the range 0.1–2 (Yousefpour et al., 2021), and Maximum Gradient Norm as max_grad_norm, also providing fine-grained control over them.
The (ϵ, δ) pair in Table 1 represents the upper bound on the privacy spent (Abadi et al., 2016); the smaller the corresponding values, the more privacy is preserved. Experimentation with these DP parameters was done to find the right balance between the accuracy of a baseline MNIST
(Deng, 2012) model run for 5 epochs with FL on two clients and DP versus the noise multiplier used in DP-SGD. These parameters are used by a
privacy engine object, directly attached to the optimizer, implementing DP-SGD during backpropagation. Privacy is achieved at the expense of
the accuracy of the model following the magnitude of the noise added through DP-SGD.
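
To make this wiring concrete, the sketch below shows the attach-style Opacus interface that the description above corresponds to (pre-1.0 releases); the exact constructor arguments vary across Opacus versions, and the model, learning rate and sample rate here are placeholder values rather than the paper's configuration.

```python
import torch
from torch import nn
from opacus import PrivacyEngine

model = nn.Linear(16, 2)                        # stand-in for a remote client model
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# Argument names (sample_rate vs. batch_size/sample_size) differ between Opacus releases.
privacy_engine = PrivacyEngine(
    model,
    sample_rate=0.01,        # fraction of the local dataset sampled per step (assumed value)
    noise_multiplier=0.5,    # noise scale used in the paper's experiments
    max_grad_norm=1.0,       # per-sample gradient clipping threshold
)
privacy_engine.attach(optimizer)                # DP-SGD now runs inside optimizer.step()

# After training, the privacy budget spent can be queried for a target delta.
epsilon, best_alpha = privacy_engine.get_privacy_spent(1e-5)
print(f"(epsilon, delta) = ({epsilon:.2f}, 1e-05)")
```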
After the epoch, federated aggregation is implemented at the Global Server, which uses the remote model references to average their
corresponding parameters and update both the initial global model as well as the remote model states. This process continues till the global model
is fully trained.

4.4 | Inference

Pyfhel (Ibarrondo & Viand, 2021), an open-source library written in Python to provide wrappers for SEAL homomorphic encryption capabilities,
was used to implement PriMed's inference phase. PyCrCNN (Disabato et al., 2020) library, built on top of Pyfhel, provides the APIs required to
implement the layers of a CNN using only additions and multiplications. Simulating the process using socket programming, the model weights
from training are used to make predictions. A client has to homomorphically encrypt an image before sending it to the server for inference.
The levelled HE scheme, BFV (Fan & Vercauteren, 2012), provided by Pyfhel is used to do the same. To generate the set of keys to encrypt a
value, BFV scheme requires certain encryption parameters (Disabato et al., 2020; Ibarrondo & Viand, 2021): Plaintext Modulus ( p), Polynomial
Modulus Degree (m) and Ciphertext Modulus (q, which is internally determined by the other two parameters in the supporting library). Using the
random keys generated with these parameters, the input image, a multidimensional vector of integers/pixels, is converted into a ciphertext, repre-
sented as a polynomial, which is transparently a multidimensional vector of encrypted integers. This implementation also involves enhancing
PyCrCNN's approach to encrypting and decrypting a matrix using multiprocessing to increase speed of computation.
Depending on the value of these parameters, a certain maximum allowed noise budget is allocated to each ciphertext. After every operation on the ciphertext, the amount of noise in it increases and accordingly the remaining noise budget decreases. If the noise budget reaches 0, the values cannot be decrypted accurately. The higher the value of the polynomial modulus degree (m), the larger the initial noise budget allocated to the ciphertext, which means that more encrypted operations can be performed on it. However, as the value of m increases, the amount of memory taken by the ciphertext also increases, which leads to a drop in performance in terms of execution time and memory. Additionally, the higher the value of the plaintext modulus (p), the better the precision obtained between results on plain data and encrypted data. However, this also means that there is an increased loss in the noise budget after every operation.
Disabato et al. (2020) suggest certain values for these parameters, but various experiments performed helped deduce a set of parameter
values, E, for PriMed, providing an optimum between performance and accuracy, which are:

• Plaintext modulus, p: 15974401


• Polynomial modulus degree, m: 4096
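
The sketch below generates a BFV context with these parameters and exercises the homomorphic operations of Equations (1) and (2). It follows the Pyfhel 2.x-style interface (contextGen(p=..., m=...), encryptInt, decryptInt); argument and method names differ in later Pyfhel releases, so this should be read as illustrative rather than as the paper's implementation.

```python
from Pyfhel import Pyfhel

# Assumed Pyfhel 2.x-style API; later versions take a 'scheme' argument and NumPy arrays.
HE = Pyfhel()
HE.contextGen(p=15974401, m=4096)   # plaintext modulus and polynomial modulus degree from above
HE.keyGen()                         # generates the client's public/secret key pair

c1 = HE.encryptInt(7)
c2 = HE.encryptInt(5)

# Homomorphic addition and multiplication on ciphertexts (Equations 1 and 2).
print(HE.decryptInt(c1 + c2))       # 12
print(HE.decryptInt(c1 * c2))       # 35
```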

Operands used in any operation with homomorphically encrypted data need to be encoded into a plaintext polynomial to support HE-based
operations consistently. In this case, the model weights must be encoded with the same parameters to follow the underlying BFV scheme. The

TABLE 1 Privacy versus accuracy

Noise multiplier Accuracy Privacy spent


0.1 95.99 (ϵ = 5863.95, δ = 1e-05)
0.2 95.92 (ϵ = 155.10, δ = 1e-05)
0.5 92.28 (ϵ = 5.65, δ = 1e-05)
1 82.2 (ϵ = 0.73, δ = 1e-05)
2 35.35 (ϵ = 0.17, δ = 1e-05)

client uses the parameter set, E, to generate a set of HE keys, encrypts the image and sends the encrypted image along with E, to the Global
Server without the actual keys. This is done so that the server can encode the CNN using its own set of keys generated by the same parameters.
The keys are different from that of the client and hence, cannot be used to decrypt the image.
Furthermore, to ensure that the noise budget does not decrease drastically during the inference process, which would otherwise lead to inac-
curate decryption, after a certain number of layers, the intermediate encrypted result is sent back to the client for re-encryption. The new cipher-
text with a replenished noise budget is sent back to the server for the remaining inference without any compromise on the accuracy. After
performing the remaining operations on the encrypted data, the encrypted result in its homomorphic form is sent back to the client. The client
decrypts it to get the final prediction.
There is no loss of privacy in the entire process as the decryption keys exist only at the client, ensuring that no third party can decrypt it and
any encryption, decryption or re-encryption operations happen only on the client.

5 | EXPERIMENTAL RESULTS

All results are obtained on Jupyter notebooks run on a 16GB RAM, CPU-based system with an Intel Core i5 processor and Windows/Ubuntu
18.04 (backward and forward compatible) operating systems.
The proposed approach is validated against the baseline: training and inference processes without any PPML techniques. Figure 4 compares
the classification accuracies of the model with no PPML techniques, the model implementing only DP (i.e., with 1 Data Owner), and PriMed's
training approach, implementing DP and FL (i.e., with 2 Data Owners and a Global Server) trained on the MNIST dataset.
PriMed obtains an accuracy comparable to the baseline model and better than if only DP is used. The noise multiplier value used throughout
the implementation is 0.5, providing an optimum between privacy and accuracy previously shown in Table 1.
The training approach is verified on several MedMNIST medical image datasets, which are enumerated in Table 2. Their respective models
have been trained remotely on two clients with DP loss calculation for a varying number of epochs. The model architectures described in the Appendix have at most two convolutional layers with corresponding activation functions and multiple fully connected layers. Keeping the time and memory consumption of HE during the inference phase in mind, the models developed are relatively simple in complexity. The accura-
cies obtained are restricted by the limitations of the libraries used. However, they can be further improved by varying batch sizes, training for
more epochs and overall using deeper neural networks.
To verify PriMed's inference approach, the evaluation metrics are accuracy, time, data transfer across the network, and privacy of the data as
well as of the model. The first three are described in Table 2. All the data gathered is per image. The time taken for inference is directly propor-
tional to the complexity of the model architecture and image size. MNIST consists of single channel, black and white images and MedMNIST
datasets are preprocessed to three channels (RGB). Hence the encrypted input size and encryption time are approximately threefold. The metrics
obtained are also dependent on the underlying hardware as multiprocessing is used to parallelize the encryption of the image and decryption of
the output.
Table 2 also shows the time taken for re-encryption. Some model architectures require the use of re-encryption to replenish the noise budget
of encrypted images in order to keep the value of homomorphically encrypted predictions and plain predictions as close to each other as possible.

FIGURE 4 Comparison of validation accuracies for different approaches



TABLE 2 Evaluation metrics of PriMed's PPML approach

Criteria                                MNIST            PneumoniaMNIST   BreastMNIST      BloodMNIST       RetinaMNIST
                                        (2-conv + 2-FC)  (2-conv + 2-FC)  (1-conv + 3-FC)  (1-conv + 3-FC)  (2-conv + 3-FC)
Accuracy (in %)
  Train accuracy                        94.14            94               90.28            97.23            92
  Inference accuracy                    94.22            95               87.58            91.65            87
Time (in seconds)
  Encryption time                       1.5              4.1              4.3              4.38             4
  Inference time                        82.9             127.9            171.4            336.7            806.9
  Decryption time                       0.3              0.35             0.5              0.45             0.3
  Re-encryption time                    1.7              0.9              2.35             4.07             8.3
Inference phase data transfer (in MB)
  Encrypted input size                  98               294.2            294.2            294.2            294.2
  Re-encryption data size               163              64               162              432.2            1058.6
  Encrypted result size                 1.2              0.2              0.2              2                0.62

Note: N-Conv, N Convolutional Layers; N-FC, N Fully Connected Layers.

TABLE 3 Comparison of PriMed with state-of-the-art encrypted inference methodology for MNIST

Criteria                              PriMed   CryptoNets (Gilad-Bachrach et al., 2016)   CryptoDL (Hesamifard et al., 2017)
Encryption scheme                     BFV      YASHE                                      BGV
Data transfer over network (in MB)    99.2     595.5                                      336.7
Encryption time (in seconds)          1.5      122                                        15.7
Decryption time (in seconds)          0.3      5                                          1
Inference time (in seconds)           264      697                                        320

Note: The values are made bold to highlight our results.

Without re-encryption, neither the absolute value nor the order of magnitude of the encrypted prediction result were comparable to that of the
plain prediction result. The re-encryption process happens at different layers for every model in accordance with the complexity of operations
used in each model layer. This causes the varying re-encryption data transfer amount and the re-encryption time for the different models. Using
this functionality, the time taken for prediction increases marginally, but the encrypted predictions obtained are similar to their corresponding
plain predictions with a precision of up to three decimal places.
PriMed is compared with two state-of-the-art solutions for encrypted inference, CryptoNets (Gilad-Bachrach et al., 2016) and CryptoDL (Hesamifard et al., 2017). A different model architecture is used for MNIST, similar to the one used by Hesamifard et al. (2017), for an accurate evaluation against CryptoDL and CryptoNets, and the metrics are compared. CryptoNets metrics are gathered per batch of 8192 images, but the framework takes the same amount of time to run predictions with a batch size of 1, that is, on one image, as it does with a batch size of 8192. As shown in Table 3, the proposed inference approach outperforms CryptoNets and CryptoDL in all aspects.
The privacy of training data is maintained by keeping it local to their corresponding trusted owners and the model parameters are differen-
tially private to prevent indirect inference. The client data for inference can only be viewed in plaintext by the client themselves. Even if the
Global Server or other parties access the encrypted image, they cannot infer or decrypt it without the right keys, which are only present at the cli-
ent. The model remains at the Global Server, so the client will never know about the classification model used.

6 | CONCLUSION

Through this research, PriMed, a private, secure, and accurate approach to using machine learning models in healthcare has been developed. The
training setup depicts the real-world scenario in which patient data is dispersed among several hospitals that do not trust one another enough to share their data for training an ML model. DP and FL are used to solve that problem. FL helps keep the dataset local-
ized at hospitals and DP helps prevent model inversion attacks that can be performed by various third parties. Hospitals and other institutes can
make use of the trained model for performing predictions on patient data by homomorphically encrypting it to protect the privacy of the patient
from the central server and other attack vectors during transit. At the same time, using HE allows the server to carry out basic operations on the
data for inference. The necessary changes to be made for the model to support it are also discussed. This work will instill confidence in hospitals
and other medical institutions to promote the usage of their data in the field of AI. Although there are certain drawbacks of computational com-
plexity due to HE, given better memory and compute resources, this can be used on a larger scale for medical image training across various
institutes.

7 | FUTURE WORK

This novel approach is still an open area of research and can be significantly worked upon to extend its implementation to models other than
CNNs and to larger standardized datasets. An additional feature for enhanced protection of privacy in the training of model weights can be
implemented involving the encryption of these weights initially at the server, before sending them to the different clients in the training phase.
The final decryption, after multiple epochs of remote training, will happen at the server after the model weights are completely trained. Another
enhancement to this is to use secure multiparty computation with public key cryptography, similar to (Stripelis et al., 2021; Truex et al., 2019)
approaches to enhance protection of data in transit over the network. All these improvements can be used to guarantee better privacy of individ-
uals, albeit they will increase the computational cost of the implementation and may also negatively affect its accuracy.

ACKNOWLEDGEMENT
The authors would like to thank the Computer Science and Engineering Department of PES University for providing the opportunity and
resources to complete this work. Open access publishing facilitated by Curtin University, as part of the Wiley - Curtin University agreement via
the Council of Australian University Librarians.

DATA AVAILABILITY STATEMENT


Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

ORCID
Sivaraman Eswaran https://orcid.org/0000-0003-0858-148X

REFERENCES
Acar, A., Aksu, H., Uluagac, A. S., & Conti, M. (2018). A survey on homomorphic encryption schemes: Theory and implementation. ACM Computing Surveys
(Csur), 51(4), 1–35.
Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., & Zhang, L. (2016). Deep learning with differential privacy. In Proceedings of the
2016 ACM SIGSAC Conference on Computer and Communications Security (pp. 308–318).
Albawi, S., Mohammed, T. A., & Al-Zawi, S. (2017). Understanding of a convolutional neural network. In 2017 International Conference on Engineering and
Technology (ICET) (pp. 1–6). IEEE.
Arachchige, P. C. M., Bertok, P., Khalil, I., Liu, D., Camtepe, S., & Atiquzzaman, M. (2019). Local differential privacy for deep learning. IEEE Internet of Things
Journal, 7(7), 5827–5842.
Bagdasaryan, E., Poursaeed, O., & Shmatikov, V. (2019). Differential privacy has disparate impact on model accuracy. Advances in Neural Information
Processing Systems, 32, 15453–15462.
Cheon, J. H., Kim, A., Kim, M., & Song, Y. (2017). Homomorphic encryption for arithmetic of approximate numbers. In International Conference on the Theory
and Application of Cryptology and Information Security (pp. 409–437). Springer.
Chillotti, I., Joye, M., & Paillier, P. (2021). Programmable bootstrapping enables efficient homomorphic inference of deep neural networks. In International
Symposium on Cyber Security Cryptography and Machine Learning (pp. 1–19). Springer.
Deng, L. (2012). The mnist database of handwritten digit images for machine learning research. IEEE Signal Processing Magazine, 29(6), 141–142.
Disabato, S., Falcetta, A., Mongelluzzo, A., & Roveri, M. (2020). A privacy-preserving distributed architecture for deep-learning-as-a-service. In 2020
International Joint Conference on Neural Networks (IJCNN) (pp. 1–8). IEEE.
Fan, J., & Vercauteren, F. (2012). Somewhat practical fully homomorphic encryption. Cryptology ePrint Archive.
Gentry, C. (2009). Fully homomorphic encryption using ideal lattices. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing
(pp. 169–178).
Gilad-Bachrach, R., Dowlin, N., Laine, K., Lauter, K., Naehrig, M., & Wernsing, J. (2016). Cryptonets: Applying neural networks to encrypted data with high
throughput and accuracy. In International conference on machine learning (pp. 201–210). PMLR.
Hesamifard, E., Takabi, H., & Ghasemi, M. (2017). Cryptodl: Deep neural networks over encrypted data. arXiv preprint arXiv:1711.05189.
Hidano, S., Murakami, T., Katsumata, S., Kiyomoto, S., & Hanaoka, G. (2018). Model inversion attacks for online prediction systems: Without knowledge of
non-sensitive attributes. IEICE Transactions on Information and Systems, 101(11), 2665–2676.
Ibarrondo, A., & Viand, A. (2021). Pyfhel: Python for homomorphic encryption libraries. In Proceedings of the 9th on Workshop on Encrypted Computing &
Applied Homomorphic Cryptography (pp. 11–16).

Ishiyama, T., Suzuki, T., & Yamana, H. (2020). Highly accurate CNN inference using approximate activation functions over homomorphic encryption. In
2020 IEEE International Conference on Big Data (Big Data) (pp. 3989–3995). IEEE.
Jain, T., & Jain, T. (2021). Duet demo—how to do data science on data owned by a different organization. OpenMined Blog. https://blog.openmined.org/
duet-demo-how-to-do-data-science-on-data-owned-by-a-different-organization/
Lian, Z., Zeng, Q., Wang, W., Gadekallu, T. R., & Su, C. (2022). Blockchain-based two-stage federated learning with non-IID data in IoMT system. IEEE Trans-
actions on Computational Social Systems., 1–10.
Majeed, A., & Lee, S. (2020). Anonymization techniques for privacy preserving data publishing: A comprehensive survey. IEEE Access, 9, 8512–8545.
Malekzadeh, M., Hasircioglu, B., Mital, N., Katarya, K., Ozfatura, M. E., & Gündüz, D. (2021). Dopamine: Differentially private federated learning on medical
data. arXiv.
McMahan, B., Moore, E., Ramage, D., Hampson, S., & Arcas, B. A. (2017). Communication-efficient learning of deep networks from decentralized data. In
Artificial intelligence and statistics (pp. 1273–1282). PMLR.
Microsoft. (2021). Microsoft SEAL: Fast and easy-to-use homomorphic encryption library. Microsoft Research https://www.microsoft.com/en-us/research/
project/microsoft-seal/
Moore, C., O'Neill, M., O'Sullivan, E., Doröz, Y., & Sunar, B. (2014). Practical homomorphic encryption: A survey. In 2014 IEEE International Symposium on
Circuits and Systems (ISCAS) (pp. 2792–2795). IEEE.
Narayanan, A., & Shmatikov, V. (2008). Robust de-anonymization of large sparse datasets. In 2008 IEEE Symposium on Security and Privacy (SP 2008)
(pp. 111–125). IEEE.
Pandya, S., Srivastava, G., Jhaveri, R., Babu, M. R., Bhattacharya, S., Maddikunta, P. K. R., … Gadekallu, T. R. (2023). Federated learning for smart cities: A
comprehensive survey. Sustainable Energy Technologies and Assessments, 55, 102987.
Regev, O. (2009). On lattices, learning with errors, random linear codes, and cryptography. Journal of the ACM, 56(6), 1–40.
Ryffel, T., Trask, A., Dahl, M., Wagner, B., Mancuso, J., Rueckert, D., & Passerat-Palmbach, J. (2018). A generic framework for privacy preserving deep learn-
ing. arXiv preprint arXiv:1811.04017.
Sav, S., Pyrgelis, A., Troncoso-Pastoriza, J. R., Froelicher, D., Bossuat, J. P., Sousa, J. S., & Hubaux, J. P. (2020). POSEIDON: Privacy-preserving federated
neural network learning. arXiv preprint arXiv:2009.00349.
Seh, A. H., Zarour, M., Alenezi, M., Sarkar, A. K., Agrawal, A., Kumar, R., & Ahmad Khan, R. (2020). Healthcare data breaches: Insights and implications. In
Healthcare (Vol. 8, p. 133). Multidisciplinary Digital Publishing Institute.
Shokri, R., & Shmatikov, V. (2015). Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications
Security (pp. 1310–1321).
Stripelis, D., Saleem, H., Ghai, T., Dhinagar, N., Gupta, U., Anastasiou, C., … Ambite, J. L. (2021). Secure neuroimaging analysis using federated learning with
homomorphic encryption. In 17th International Symposium on Medical Information Processing and Analysis (Vol. 12088, pp. 351–359). SPIE.
Truex, S., Baracaldo, N., Anwar, A., Steinke, T., Ludwig, H., Zhang, R., & Zhou, Y. (2019). A hybrid approach to privacy-preserving federated learning. In Pro-
ceedings of the 12th ACM Workshop on Artificial Intelligence and Security (pp. 1–11).
Wei, K., Li, J., Ding, M., Ma, C., Yang, H. H., Farokhi, F., … Poor, H. V. (2020). Federated learning with differential privacy: Algorithms and performance anal-
ysis. IEEE Transactions on Information Forensics and Security, 15, 3454–3469.
Yang, J., Shi, R., & Ni, B. (2021). Medmnist classification decathlon: A lightweight automl benchmark for medical image analysis. In 2021 IEEE 18th Inter-
national Symposium on Biomedical Imaging (ISBI) (pp. 191–195). IEEE.
Yang, J., Shi, R., Wei, D., Liu, Z., Zhao, L., Ke, B., & Ni, B. (2021). Medmnist v2: A large-scale lightweight benchmark for 2d and 3d biomedical image classifi-
cation. arXiv preprint arXiv:2110.14795.
Yousefpour, A., Shilov, I., Sablayrolles, A., Testuggine, D., Prasad, K., Malek, M., … Mironov, I. (2021). Opacus: User-friendly differential privacy library in
PyTorch. arXiv preprint arXiv:2109.12298.

How to cite this article: Gopalakrishnan, A., Kulkarni, N. P., Raghavendra, C. B., Manjappa, R., Honnavalli, P., & Eswaran, S. (2023). PriMed:
Private federated training and encrypted inference on medical images in healthcare. Expert Systems, e13283. https://doi.org/10.1111/
exsy.13283

APPENDIX: MODEL ARCHITECTURE

The following model architectures were developed and used to evaluate the approach described in this paper. The modifications made to support
HE include Square approximation instead of ReLU activation layer and Average Pooling Layer instead of a Max Pooling Layer.

MNIST

This dataset (Deng, 2012) consists of 60,000 training images and 10,000 testing images of size 28 × 28 trained over a model with the following
architecture:

1. Convolutional Layer: in_channels = 1, out_channels = 8, kernel_size = 3, padding = 0, stride = 1



2. Square Activation Layer


3. Average Pooling Layer: kernel_size = 2, stride = 3, padding = 0
4. Convolutional Layer: in_channels = 8, out_channels = 16, kernel_size = 3, padding = 0, stride = 2
5. Square Activation Layer
6. Average Pooling Layer: kernel_size = 2, stride = 2, padding = 0
7. Linear Layer: in_channels = 64, out_channels = 16
8. Linear Layer: in_channels = 16, out_channels = 10
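
For illustration, this MNIST architecture can be expressed directly in PyTorch as follows (a sketch, not the authors' code); an explicit Flatten layer is added between the last pooling layer and the first linear layer, and the square activation is the same x² module described in Section 4.1.

```python
from torch import nn

class Square(nn.Module):
    def forward(self, x):
        return x * x

# 1x28x28 input -> 8x26x26 -> 8x9x9 -> 16x4x4 -> 16x2x2 -> 64 -> 16 -> 10
mnist_model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=0),
    Square(),
    nn.AvgPool2d(kernel_size=2, stride=3, padding=0),
    nn.Conv2d(8, 16, kernel_size=3, stride=2, padding=0),
    Square(),
    nn.AvgPool2d(kernel_size=2, stride=2, padding=0),
    nn.Flatten(),
    nn.Linear(64, 16),
    nn.Linear(16, 10),
)
```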

PneumoniaMNIST

This chest x-ray images dataset (Yang, Shi, & Ni, 2021) consists of 5856 images with 4708 used for training, 524 for validation and 624 for testing,
trained over a model with the following architecture:

1. Convolutional Layer: in_channels = 3, out_channels = 8, kernel_size = 3, padding = 0, stride = 1


2. Average Pooling Layer: kernel_size = 2, stride = 3, padding = 0
3. Convolutional Layer: in_channels = 8, out_channels = 16, kernel_size = 3, padding = 0, stride = 2
4. Square activation layer
5. Average Pooling Layer: kernel_size = 2, stride = 2, padding = 0
6. Linear Layer: in_channels = 64, out_channels = 16
7. Linear Layer: in_channels = 16, out_channels = 2

BloodMNIST

This microscopic peripheral blood cell images dataset (Yang, Shi, & Ni, 2021) contains a total of 17,092 images with 11,959 used for training,
1712 used for validation and 3421 used for testing, trained over a model with the following architecture:

1. Convolutional Layer: in_channels = 3, out_channels = 12, kernel_size = 3, padding = 0, stride = 1


2. Square activation layer
3. Average Pooling Layer: kernel_size = 3, stride = 2, padding = 0
4. Linear Layer: in_channels = 1728, out_channels = 96
5. Linear Layer: in_channels = 96, out_channels = 64
6. Linear Layer: in_channels = 64, out_channels = 8

BreastMNIST

This breast ultrasound images dataset (Yang, Shi, & Ni, 2021) consists of 780 images with 546 used for training, 78 used for validation and
156 used for testing, trained over a model with the following architecture:

1. Convolutional Layer: in_channels = 3, out_channels = 8, kernel_size = 3, padding = 0, stride = 1


2. Square Activation layer
3. Average Pooling Layer: kernel_size = 2, stride = 3, padding = 0
4. Linear Layer: in_channels = 648, out_channels = 64
5. Linear Layer: in_channels = 64, out_channels = 16
6. Linear Layer: in_channels = 16, out_channels = 2

RetinaMNIST

This retina fundus images dataset (Yang, Shi, & Ni, 2021) consists of 1600 images with 1080 used for training, 120 for validation and 400 for test-
ing, trained over a model with the following architecture:

1. Convolutional Layer: in_channels = 3, out_channels = 12, kernel_size = 3, padding = 0, stride = 1


2. Convolutional Layer: in_channels = 12, out_channels = 8, kernel_size = 4, padding = 0, stride = 1
3. Square Activation Layer
4. Average Pooling Layer: kernel_size = 3, stride = 4, padding = 0
5. Linear Layer: in_channels = 288, out_channels = 64
6. Linear Layer: in_channels = 64, out_channels = 16
7. Linear Layer: in_channels = 16, out_channels = 5

AUTHOR BIOGRAPHIES

Aparna Gopalakrishnan is a student at PES University, Bangalore pursuing a Bachelors of Technology in Computer Science and Engineering
with a specialization in Machine Intelligence and Data Science. She has worked as a Technical Intern at Visa, Inc. As an undergraduate, she is
currently taking up projects in her preferred domains of Cybersecurity and Machine Learning.

Narayan P. Kulkarni is a student at PES University, pursuing a Bachelor's Degree in Computer Science and Engineering with a specialization
in Systems and Core Computing. He works as a Technical Intern in Morgan Stanley. His interests include data structures, algorithms and
machine learning.

Chethan B. Raghavendra is a student at PES University, pursuing a Bachelor's Degree in Computer Science and Engineering with a specializa-
tion in Systems and Core Computing. He has worked as a software engineer intern at Akamai Technologies and made efficient contributions.
His interests include data structures, algorithms, and machine learning.

Raghavendra Manjappa is an undergraduate pursuing a Bachelors of Technology in Computer Science and Engineering at PES University,
Bengaluru. He works as a Software Developer Intern in Commvault Systems undertaking projects in UnixFS and Laptop Backup/Recovery
services.

Prasad Honnavalli is a professor in Computer Science and Engineering of PES University. He is the Director for PESU Centre for Information
Security, Forensics and Cyber Resilience and PESU Centre for Internet of Things with a focus on Security. He is an accomplished executive
with over 30+ years of professional experience in end-to-end programme management, IT technology transformation, IT Infrastructure, com-
plex cloud engagements—IaaS, PaaS & SaaS, software development, automation, IT security, and managed services.

Sivaraman Eswaran received PhD degree in Computer Science from Bharathiar University, India, in 2019. He is currently working as senior
lecturer with the Electrical and Computer Engineering department, Curtin University, Malaysia. Prior to that, he was working at PES Univer-
sity, Bangalore. He is a CompTIA Security+ certified professional. He is also a Microsoft Certified Professional and EMC Academic Associate.
He is a senior member of IEEE. His research interests include cloud computing and cyber security.
