Breast Cancer Classification in a Federated Learning Framework
ABSTRACT
• Many applications now use AI to diagnose breast cancer. However, most new research has only been
conducted in centralized learning (CL) environments, which entails the risk of privacy breaches. Moreover,
the accurate identification and localization of lesions and tumor prediction using AI technologies is expected
to increase patients’ likelihood of survival. To address these difficulties, we developed a federated learning
(FL) facility that extracts features from participating environments rather than a CL facility. This study’s
novel contributions include (i) the application of transfer learning to extract data features from the region
of interest (ROI) in an image, which aims to enable careful pre-processing and data enhancement for data
training purposes; (ii) the use of synthetic minority oversampling technique (SMOTE) to process data, which
aims to more uniformly classify data and improve diagnostic prediction performance for diseases; (iii) the
application of FeAvg-CNN + MobileNet in an FL framework to ensure customer privacy and personal security;
and (iv) the presentation of experimental results from different deep learning, transfer learning, and FL
models with balanced and imbalanced mammography datasets, which demonstrate that our solution leads to
superior classification performance.
EXISTING SYSTEM
• Today, breast cancer research mainly focuses on detecting and diagnosing breast tumors using deep learning
algorithms [3], [4], [16], [17], [18], [19]. However, most studies still focus on data processing, tumor prediction,
and diagnosis using distributed learning models on a server. In such an environment, patient data and information
must be shared for centralized processing on the server, which negatively affects privacy. Moreover, centralized
processing entails more work when not all parties have access to a highly configured server. A study [16] proposed a
novel and efficient DL model based on transfer learning to automatically detect and diagnose breast cancer. The
specific feature of this approach is that knowledge gained while solving one problem is reused in another, related
problem. Furthermore, in the proposed model, the features are extracted from the Mammographic Image Analysis
Society (MIAS) dataset using pre-trained CNNs such as InceptionV3, ResNet50, and Visual Geometry Group (VGG)
networks.
DISADVANTAGES
1. Domain Shift: The distribution of medical images from different sources or hospitals may vary due to differences in
imaging protocols, equipment, patient demographics, etc. This can lead to a domain shift, causing the transferred
knowledge to be less effective or even detrimental.
2. Privacy and Security Concerns: Federated learning involves training models across decentralized data sources, often
without centralizing the raw data. However, medical data, including breast cancer images, are sensitive, raising privacy
and security concerns during data sharing and aggregation.
3. Model Interpretability: Transfer learning models can be complex, making it challenging to interpret their decisions,
especially in critical applications like medical diagnoses. Interpretable models are crucial in healthcare to gain trust
and insight into the decision-making process.
4. Fine-Tuning Complexity: Fine-tuning the transferred model on the breast cancer dataset requires careful parameter
tuning. Selecting appropriate layers for transfer and deciding on the extent of fine-tuning can be time-consuming and
require domain expertise.
5. Lack of Transparency: Pre-trained models might come from diverse sources, and their architectures and training data
might not be transparent. This lack of transparency could potentially raise ethical concerns in healthcare, where
understanding the model's biases and limitations is essential.
6. Data Heterogeneity: In a federated learning framework, participating institutions might have differing data qualities,
leading to varying model performance across different nodes. Ensuring consistent data quality across nodes can be
challenging.
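The data-heterogeneity issue in point 6 is often simulated by giving each client a skewed label distribution. A minimal, NumPy-only sketch of such a non-IID split follows (the label array is a toy stand-in for per-institution data; the Dirichlet-based partition is a common simulation technique, not a method claimed by this document):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy labels for 1000 mammography patches: 0 = benign, 1 = malignant.
labels = rng.integers(0, 2, size=1000)

def dirichlet_partition(labels, n_clients=4, alpha=0.5, rng=rng):
    """Split sample indices across clients with label skew.

    alpha controls heterogeneity: a small alpha yields highly non-IID
    clients, a large alpha yields a near-uniform label mix per client.
    """
    clients = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Draw per-client proportions for this class from a Dirichlet prior.
        props = rng.dirichlet([alpha] * n_clients)
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in zip(clients, np.split(idx, cuts)):
            client.extend(part.tolist())
    return [np.array(c) for c in clients]

parts = dirichlet_partition(labels)
```

With a small `alpha`, some simulated clients end up dominated by one class, which is exactly the node-to-node quality variation described above.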
PROPOSED SYSTEM
• Design of an FL framework for breast cancer classification that includes a global server, which acts as a weight aggregator, and mobile
edge clients that train deep learning (DL) models locally. This solution is useful for AI healthcare applications and can be widely deployed.
• Pioneering use of transfer learning from a pre-training dataset in FL for breast cancer classification. Several classifiers were
selected for performance evaluation on the extracted features, including k-nearest neighbors (kNN), AdaBoost, and eXtreme Gradient Boosting (XGB).
First, each image’s features are extracted using the convolutional layers (ConvNet) of the pre-trained model, and a linear
classifier is used to classify the images. Next, we applied class-balancing techniques such as SMOTE and data augmentation, in
combination with ImageNet pre-training, to enrich and further optimize the training data.
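The class-balancing step can be sketched with a minimal, NumPy-only SMOTE implementation. This is an illustrative sketch of the interpolation idea, not the imbalanced-learn library's SMOTE; the toy feature vectors stand in for CNN-extracted features:

```python
import numpy as np

rng = np.random.default_rng(42)

def smote_oversample(X_min, n_new, k=5, rng=rng):
    """Minimal SMOTE: synthesize n_new minority samples by linear
    interpolation between a picked sample and one of its k nearest
    minority-class neighbours."""
    n = len(X_min)
    # Pairwise distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                # exclude self-matches
    neighbours = np.argsort(d, axis=1)[:, :k]
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)
        j = neighbours[i, rng.integers(min(k, n - 1))]
        gap = rng.random()                     # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# Imbalanced toy feature set: 40 majority vs 8 minority feature vectors.
X_minor = rng.normal(2.0, 1.0, size=(8, 16))
X_syn = smote_oversample(X_minor, n_new=32)    # balance the classes
```

Because each synthetic point lies on a segment between two real minority samples, the oversampled class stays inside its original feature region rather than being duplicated verbatim.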
• With both balanced and imbalanced settings, experimental results on the Digital Database for Screening Mammography (DDSM)
dataset demonstrate that our solution’s FeAvg-CNN + MobileNet clearly outperforms centralized learning, with more than a 5%
improvement in recall [5]. Moreover, the accuracy of our research results reached nearly 98%; by comparison, the best previous results
were only 88.67% for the two-class case (calcifications vs. masses) and 94.92% (benign mass vs. malignant mass and benign
calcification vs. malignant calcification).
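The server-side aggregation behind a FedAvg-style scheme such as this can be sketched as a sample-size-weighted average of client parameters (a minimal NumPy sketch under that assumption; the two toy clients and their weights are hypothetical):

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """FedAvg aggregation: average client model parameters,
    weighted by each client's local sample count."""
    total = sum(client_sizes)
    aggregated = []
    # Each client's model is a list of parameter arrays (layer by layer).
    for layer_group in zip(*client_weights):
        weighted = [n_k / total * w for w, n_k in zip(layer_group, client_sizes)]
        aggregated.append(np.sum(weighted, axis=0))
    return aggregated

# Two toy clients, each with one weight matrix and one bias vector.
w_a = [np.ones((2, 2)), np.zeros(2)]
w_b = [np.full((2, 2), 3.0), np.ones(2)]
global_w = fed_avg([w_a, w_b], client_sizes=[100, 300])
# Weighted mean of the matrix entries: 0.25 * 1 + 0.75 * 3 = 2.5
```

Only these aggregated weights travel between clients and the server; the raw mammograms never leave the participating institutions, which is what provides the privacy benefit described above.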
ADVANTAGES
1. Improved Performance: Transfer learning leverages pre-trained models on large datasets, typically from related tasks. This initialization allows the model to capture relevant features and patterns, leading to faster convergence and potentially better performance on the breast cancer classification task.
2. Data Efficiency: Breast cancer datasets may be limited in size, especially for specific subtypes or rare cases. Transfer learning enables the model to benefit from knowledge gained from larger datasets, making efficient use of available data.
3. Reduced Computational Requirements: Training deep neural networks from scratch requires significant computational resources. Transfer learning reduces this burden by starting with pre-trained weights, making it feasible to train on less powerful devices or in resource-constrained environments.
4. Faster Convergence: The pre-trained model has already learned low-level features, reducing the time needed for convergence during fine-tuning on the breast cancer dataset. This can be especially beneficial when time is a critical factor.
5. Generalization: Transfer learning enhances model generalization by leveraging knowledge from diverse sources. It can perform well even with limited labeled breast cancer data, helping to address the challenges of medical image classification tasks.
• HARDWARE REQUIREMENTS:
• System : Pentium i3 processor
• Hard Disk : 500 GB
• Monitor : 15" LED
• Input Devices : Keyboard, Mouse
• RAM : 4 GB
SOFTWARE REQUIREMENTS:
CONCLUSION
• In this study, we presented a solution for classifying breast cancer images using feature extraction from multiple participating environments.
• The federated environment consisted of an international group of hospitals and medical imaging centers that joined collaborative
efforts to train the model in a completely data-decentralized way, without sharing any data between hospitals.
• Moreover, we focused on analyzing recall performance more than accuracy, because false negatives can be life-threatening. By
contrast, false positives can be re-examined by human experts before a final diagnosis. The results demonstrate that the accuracy
reached nearly 98%.
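The preference for recall over accuracy can be made concrete with a small sketch: on an imbalanced test set, a trivial classifier that never flags cancer can still score high accuracy while missing every tumour (the toy labels below are hypothetical):

```python
def recall_accuracy(y_true, y_pred):
    """Return (recall, accuracy) for binary labels, 1 = malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    recall = tp / (tp + fn) if tp + fn else 0.0
    return recall, correct / len(y_true)

# 95 benign (0) and 5 malignant (1) cases; predicting "benign" for
# everything scores 95% accuracy but 0% recall -- every tumour is missed.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
r, a = recall_accuracy(y_true, y_pred)
```

This is why, on imbalanced mammography data, recall (missed malignancies) is the safety-critical metric, while accuracy alone can be misleading.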
• In the future, we plan to create a system that will scan an entire mammogram as input, segment it, and analyze each segment to
yield results for the whole image, making the pipeline a complete end-to-end mammogram analysis.
• In addition to improved data processing, the simulations can be extended with multiple clients or groups of clients on separate devices.