
1. INTRODUCTION

In the world of medical imaging, the ability to make precise diagnoses is crucial. However,
images often suffer from blurriness or noise, making it challenging to discern important details.
To address this, various denoising methods are employed, each with its own strengths and
limitations. These methods range from Non-Local Means Denoising to Wavelet-Based
Denoising, offering different approaches to reduce image noise while preserving essential
features. Despite the effectiveness of traditional denoising techniques, they may not always fully
address the issue. This is where our project comes in. We're leveraging cutting-edge technology
called autoencoders, specifically the U-Net architecture, to enhance medical images.
Autoencoders act as intelligent filters, capable of cleaning up images by removing unwanted
fuzziness while keeping critical information intact. Imagine viewing a blurry photo on your
phone. Similar blurriness or specks can occur in medical images like X-rays or MRI scans,
which we refer to as "noise." This noise can stem from various factors such as hardware
limitations, signal interference, or patient movement during imaging. In our project, we're
employing smart computer tools, namely autoencoders, to tackle this noise issue. These tools
function akin to skilled photo editors, learning from examples to identify and eliminate unwanted
blur or specks in medical images. By doing so, we aim to provide clearer and more accurate
images, empowering doctors to make informed decisions about patient care. In essence, our
project is focused on harnessing the capabilities of autoencoders to improve the clarity and
accuracy of medical images. By mitigating noise and enhancing image quality, we strive to
positively impact the diagnostic process, ultimately leading to better healthcare outcomes for
patients.
2. RELATED WORK

Numerous methods have been suggested and documented in various journals. Each paper
referenced in this section addresses the challenge in a distinct manner.

[1] (Sapienza, Davide, et al., 2022) proposed a Deep Image Prior technique whose main focus is to
analyze the robustness of these networks with respect to different types of initialization, specifically
related to Batch Normalization and Convolutional layers. The ultimate goal is to apply the
acquired knowledge to denoise computed tomography images, which are essential diagnostic
tools in neuroscience and oncology. Using an autoencoder model, the overall accuracy achieved
is 95.67%.

[2] (El-Shafai, Walid, et al., 2022) propose CADTra, an automatic detection model for medical
diagnosis based on classification, a denoising autoencoder, and transfer learning. The model
achieves high precision in binary and multi-class classifications of chest CT and X-ray images.
Using the FCNN-based CADTra model, the overall accuracy achieved is 98.34%.

[3] (Khader, Firas, et al., 2022) discuss the growing importance of deep learning in medical
imaging and highlight the potential of generating synthetic 3D medical images using diffusion
models. Synthetic 3D images have various applications, such as data sharing, education, and
disease progression prediction. Most image generation studies have focused on 2D images, despite
modern medical imaging techniques like MRI and CT providing 3D data; therefore, the need to
develop methods for generating synthetic 3D medical images is emphasized. Using a diffusion
model, the overall accuracy achieved is 97.34%.

[4] (Patil, Rajesh, and Surendra Bhosale, 2022) address the difficulty of detecting pathological
brain lesions in medical images due to the lack of comprehensive data and annotations. To
overcome this limitation, the focus is shifted to unsupervised anomaly detection, where only
healthy data is used for training and the goal is to detect unseen anomalies during testing. Using
an autoencoder model, the overall accuracy achieved is 96.08%.
[5] (Yang, Chensheng, et al., 2022) propose a novel framework that leverages the Masked
Autoencoder (MAE) model to learn the structure of normal samples without relying on labeled
abnormal data. The combination of the MAE model, an anomaly classifier, and a pseudo-abnormal
module offers a novel and promising approach for unsupervised anomaly detection in medical
images. Using a convolutional autoencoder model, the overall accuracy achieved is 96.86%.

[6] (Mehr, O. Mahmoudi, M. R. Mohammadi, and M. Soryani, 2023) note that the ECG is a crucial
tool for cardiology research and patient assessment due to its non-invasiveness, efficiency, and
low cost. To address noise in ECG recordings, they present a noise reduction method based on a
disentangled autoencoder with a fully convolutional neural network, where the objective is to
separate the clean ECG signal from noisy components, improving the signal-to-noise ratio and
enhancing diagnostic visualization and analysis. Using the speckle noise-based inception
convolutional denoising neural network (SNICDNN) model, the overall accuracy achieved is 97.05%.

[7] (Müller-Franzese, Gustav, et al., 2023) introduce a proposed method called the "speckle
noise-based inception convolutional denoising neural network" (SNICDNN), which aims to
remove speckle noise that degrades image quality and hinders accurate diagnosis. Using a
denoising diffusion probabilistic model (DDPM), the overall accuracy achieved is 95.72%.

[8] (Ghahremani, Morteza, et al., 2022) describe the Medfusion model, a conditional latent
diffusion probabilistic model (DDPM) for medical images. It is a generative model that can be
used to produce realistic images of medical conditions, such as glaucoma and pneumonia. Using
an Efficient-UNet model, the overall accuracy achieved is 98.09%.

[9] (Chauhan, Nishant, and Byung-Jae Choi, 2019) discuss Masked-DDPM (MDPPM), the Masked
Diffusion Probabilistic Model, a generative model that can be used to produce realistic images of
medical conditions. It extends the Medfusion model by introducing a masking mechanism to
improve the quality of the generated images. Using a fuzzy filtering model, the overall accuracy
achieved is 97.87%.

[10] (Dhahri, Habib, et al., 2021) present an unsupervised technique to remove noise from medical
images by learning noise characteristics and using patch-based dictionaries and residual learning.
It works for both 2D and 3D images and shows promising results on MRI/CT datasets. Using the
FD-VGG model, the overall accuracy achieved is 96.38%.
3. PROBLEM STATEMENT

Fig.1, X-ray image of the lungs exhibiting noise. Fig.2, MRI scan of the brain exhibiting noise.
In medical imaging, clear pictures are crucial for accurate diagnoses. However, images like
X-rays and MRI scans often have unwanted fuzzy spots known as "noise" that can obscure
important details. Traditional methods for cleaning up these images sometimes struggle to keep
the important information intact while removing the noise. Our project addresses this challenge
by using smart computer tools called convolutional autoencoders. These autoencoders act like
expert cleaners for medical images, learning from examples to distinguish between noise and
critical features. By training these autoencoders, we aim to enhance image clarity while
preserving essential information. This means that doctors can see the pictures more clearly and
make better decisions about patients' health. By providing clearer visuals, our approach has the
potential to improve diagnostic accuracy and enable more informed decision-making in medical
practice. Ultimately, our goal is to enhance patient care outcomes by ensuring that healthcare
professionals have access to the clearest and most accurate medical images possible.
4. REQUIREMENT ANALYSIS

Software Requirements:

Python Environment:
Ensure Python is installed; if not, download it from the official Python website
(https://www.python.org).

Development Platforms:
Google Colab is used for collaborative Python coding and machine learning. Make sure to change
the runtime from CPU to GPU.

Python Libraries:
To begin working on your Python project, it's important to install additional tools that help with
image processing and machine learning tasks. These tools include popular frameworks like
TensorFlow, Keras, and scikit-learn. They provide useful functions and interfaces for creating,
training, and deploying machine learning models. Installing these frameworks is as simple as
running a few commands in your Python environment. You can install them by running the
below commands:
!pip install tensorflow
!pip install keras
!pip install scikit-learn

Browser:
You will need a modern web browser (e.g., Chrome, Firefox) to view the web application.
5. RISK ANALYSIS

Risk: Model Inaccuracy
Description: There's a chance that the model might make mistakes when looking at medical images. This could cause it to clean up the images incorrectly and maybe even give the wrong diagnosis.
Mitigation: Keep checking and adjusting the model to make it more accurate over time. Update the training data regularly with different kinds of medical images to make sure the model keeps getting better.

Risk: Data Bias
Description: If the training data doesn't cover a wide range of examples, the model might become biased. This could make it less effective at cleaning up certain types of medical images.
Mitigation: Make sure the dataset used for training is balanced and varied to reduce bias. Work together with medical professionals and researchers to gather a diverse set of data that includes a wide range of medical conditions.

Risk: Model Development Complexity
Description: Difficulties in creating and putting the model into use might cause delays or even cause the project to not succeed.
Mitigation: Involve skilled machine learning engineers and data scientists to streamline the development process. Keep thorough documentation for both the model's structure and the steps needed for deployment.

Risk: Environmental Conditions
Description: Differences in the environment when taking images, like changes in equipment or lighting, could affect how well the model cleans up medical images.
Mitigation: Create models and data collection techniques that consider changes in environmental conditions. Use real-time monitoring of the environment to ensure accurate performance.

Risk: Expert Involvement
Description: Without input from medical professionals, the project's success may suffer, as their expertise is essential for comprehending and assessing denoised medical images.
Mitigation: Work closely with medical professionals to incorporate their insights into the project. Hold regular training and knowledge-sharing sessions to equip medical staff with the skills they need.

Risk: Resulting Actions
Description: To keep the model accurate, maintain a diverse and up-to-date training dataset covering a wide range of medical conditions.
Mitigation: To reduce bias, assemble a well-balanced and top-notch training dataset that accurately reflects the variety of medical images encountered in real-world practice.

Table-1. Risk analysis and corresponding mitigation strategies.
6. FEASIBILITY ANALYSIS

1. Technical Feasibility:
Data Collection and Analysis: Using modern tools and techniques, gathering and examining
medical images for denoising is technically possible.
Model Training and Deployment: In the realm of machine learning, training convolutional
autoencoder models for image denoising is a well-established practice.
2. Economic Feasibility:
Cost-Effective Approach: The project is budget-friendly since it doesn't demand extensive
infrastructure or resources. Key expenses involve data collection, model development, and
potential cloud computing fees.
Opportunity for Funding: Due to its potential to enhance medical image diagnostics, the project
could attract funding and backing from medical research organizations and institutions.
3. Legal and Ethical Feasibility:
Legal Compliance: It's crucial for the project to follow legal rules concerning patient data
privacy, medical imaging usage rights, and healthcare regulations. Legal experts can ensure
everything is compliant.
Ethical Considerations: The project respects ethical principles by seeking to improve medical
diagnostics, safeguarding patient privacy, and adhering to ethical guidelines in medical research.
4. Social Impact Feasibility:
Patient Outcomes: The project improves patient outcomes by enhancing the precision of medical
image diagnostics, which could result in improved treatment decisions.
Medical Progress: It plays a role in advancing medical technology, demonstrating how AI-driven
solutions can enhance healthcare.
Healthcare Accessibility: By offering a more effective and precise diagnostic tool, the project
improves healthcare accessibility, particularly in areas with fewer resources.
7. PROPOSED APPROACH

The provided algorithm outlines the crucial steps in our project for cleaning up medical images
using convolutional autoencoders. This algorithm serves as the foundation of our project
methodology.
Step 1: Data Collection
• Gather a varied collection of medical images, encompassing X-rays, MRI scans, CT scans, and ultrasounds, sourced from reputable medical institutions.
• Maintain a balanced dataset, containing images that depict a range of medical conditions and body parts.

Step 2: Data Augmentation
• Utilize image augmentation methods to enrich the dataset, such as rotation, flipping, zooming, and contrast adjustment (a minimal augmentation sketch follows below).
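As a concrete illustration of this step, the snippet below is a minimal augmentation sketch, assuming Keras' ImageDataGenerator is used; the train_images array and the parameter values are illustrative placeholders rather than the project's exact settings.

import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation settings covering rotation, flipping, and zooming; contrast adjustment
# could be added via brightness_range or a custom preprocessing_function.
augmenter = ImageDataGenerator(
    rotation_range=15,      # random rotations of up to 15 degrees
    horizontal_flip=True,   # random left-right flips
    zoom_range=0.1,         # random zoom in/out by up to 10%
)

# train_images is a placeholder array of shape (num_images, height, width, channels).
train_images = np.random.rand(8, 128, 128, 1).astype("float32")

# Draw one augmented batch; in practice the generator feeds the training loop.
augmented_batch = next(augmenter.flow(train_images, batch_size=8, shuffle=True))
print(augmented_batch.shape)  # (8, 128, 128, 1)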

Step 3: Data Pre-processing
• Ensure uniformity by standardizing the image dimensions and resolution.
• Maintain consistent input for the model by normalizing pixel values to a common scale (see the preprocessing sketch below).
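For illustration, here is a minimal preprocessing sketch, assuming OpenCV is available; the 128x128 target size and the raw_images list are illustrative assumptions, not the project's actual settings.

import cv2
import numpy as np

def preprocess(image, size=(128, 128)):
    # Resize to a common resolution and scale 8-bit pixel values to the range [0, 1].
    resized = cv2.resize(image, size, interpolation=cv2.INTER_AREA)
    return resized.astype("float32") / 255.0

# raw_images is a placeholder for images loaded from disk as 8-bit grayscale arrays.
raw_images = [np.random.randint(0, 256, (256, 256), dtype=np.uint8) for _ in range(4)]
dataset = np.array([preprocess(img) for img in raw_images])
print(dataset.shape, dataset.min(), dataset.max())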

Step 4: Model Architecture
• Develop a convolutional autoencoder structure tailored for denoising medical images.
• Create encoder and decoder layers with suitable activation functions and filter sizes.

Step 5: Model Training
• Train the convolutional autoencoder model using the augmented dataset.
• Employ a portion of the dataset for validation to track model performance during training (a minimal training sketch follows below).
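The sketch below illustrates this training step with a deliberately tiny Keras model; the layer sizes, epoch count, and the randomly generated noisy_images/clean_images arrays are placeholders, not the project's actual configuration.

import numpy as np
from tensorflow.keras import layers, models

# Placeholder data: pairs of noisy inputs and clean targets, scaled to [0, 1].
noisy_images = np.random.rand(32, 128, 128, 1).astype("float32")
clean_images = np.random.rand(32, 128, 128, 1).astype("float32")

# A deliberately small denoising autoencoder, just to show the training call.
model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(2, padding="same"),
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.UpSampling2D(2),
    layers.Conv2D(1, 3, activation="sigmoid", padding="same"),
])
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])

# validation_split holds out 20% of the data to track performance during training.
history = model.fit(noisy_images, clean_images,
                    epochs=5, batch_size=8, validation_split=0.2)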
Step 6: Model Evaluation and Selection
• Assess the denoising effectiveness of the trained model using metrics like Mean Squared Error (MSE) and Peak Signal-to-Noise Ratio (PSNR); a short sketch of these metrics follows below.
• Compare the model's performance with conventional denoising techniques.
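For reference, the snippet below is a minimal sketch of the two evaluation metrics named above, computed with NumPy; the clean and denoised arrays are placeholders for a test image and the corresponding model output.

import numpy as np

def mse(clean, denoised):
    # Mean Squared Error between a clean reference image and a denoised output.
    return np.mean((clean.astype("float64") - denoised.astype("float64")) ** 2)

def psnr(clean, denoised, max_val=1.0):
    # Peak Signal-to-Noise Ratio in dB; higher means the output is closer to the reference.
    error = mse(clean, denoised)
    if error == 0:
        return float("inf")
    return 10 * np.log10((max_val ** 2) / error)

# Placeholder arrays scaled to [0, 1].
clean = np.random.rand(128, 128)
denoised = clean + np.random.normal(0, 0.02, clean.shape)
print("MSE: ", mse(clean, denoised))
print("PSNR:", psnr(clean, denoised), "dB")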

8. ARCHITECTURE
Fig.3, Flowchart of proposed approach.

Fig.4, Architecture of Autoencoders.


9. IMPLEMENTATION

Our new system combines two powerful concepts: the Autoencoder and the U-Net model.
Let's break it down. Firstly, an Autoencoder is like a smart translator. It takes information,
compresses it into a simpler form, and then translates it back into its original shape. It's handy
because it can learn the most important features of data while getting rid of unnecessary details.
Now, the U-Net model is a special kind of Autoencoder, built specifically for understanding
images. Imagine you have a picture and you want to understand every tiny detail of it. That's
where the U-Net comes in. It's designed like a "U" shape, hence the name. One side of the "U"
learns the big picture of the image, while the other side zooms in to focus on the small details.
This way, it can understand both the overall picture and the little nuances. This architecture is
perfect for tasks like medical image segmentation. What's that? Well, imagine you have an X-ray
image, and you want to know exactly where the bones are. Medical image segmentation helps
with that. The U-Net is great at this because it can understand both the general shape of the
bones and the tiny details, like cracks or breaks. So, in our system, we're taking advantage of the
U-Net's special abilities. It's like having a detective who can see the big picture of a crime scene
while also noticing the tiniest clues. By combining the power of Autoencoders and the precision
of the U-Net model, we're aiming to make medical image analysis more accurate and helpful for
doctors.
Autoencoders consist of three main components:
1. Encoder: This part encodes the input data into a compressed representation, which is typically
much smaller than the original data.
2. Bottleneck: The compressed representation of knowledge, which is the core component
of the network.
3. Decoder: This module decodes the information representation to reconstruct the original
data.
The autoencoder's objective is to produce an output similar to the input, which is then
compared to the ground truth. Autoencoders are trained similarly to artificial neural networks
(ANNs) through backpropagation. The encoder, the first component, compresses the raw data into
a fixed low-dimensional format. The bottleneck, the most critical part, holds the compressed data.
The decoder module reconstructs the data from this compressed representation, and the
reconstruction is then compared to the ground truth. Multiple layers and activation functions
enable the autoencoder to learn varied features, with convolutional layers being particularly useful
for images and sequential data. Each layer of the autoencoder produces its own intermediate
representation of the input. Deep learning techniques can further enhance autoencoder performance.
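To make the encoder-bottleneck-decoder structure described above concrete, the following is a minimal Keras sketch of a convolutional autoencoder with U-Net-style skip connections; the filter counts and input size are illustrative assumptions, not the exact architecture used in the project.

from tensorflow.keras import layers, models

def build_denoising_autoencoder(input_shape=(128, 128, 1)):
    inputs = layers.Input(shape=input_shape)

    # Encoder: compress the image into a lower-dimensional representation.
    e1 = layers.Conv2D(32, 3, activation="relu", padding="same")(inputs)
    p1 = layers.MaxPooling2D(2)(e1)
    e2 = layers.Conv2D(64, 3, activation="relu", padding="same")(p1)
    p2 = layers.MaxPooling2D(2)(e2)

    # Bottleneck: the compressed representation at the centre of the network.
    b = layers.Conv2D(128, 3, activation="relu", padding="same")(p2)

    # Decoder: reconstruct the image from the compressed representation.
    u1 = layers.UpSampling2D(2)(b)
    u1 = layers.Concatenate()([u1, e2])   # U-Net-style skip connection
    d1 = layers.Conv2D(64, 3, activation="relu", padding="same")(u1)
    u2 = layers.UpSampling2D(2)(d1)
    u2 = layers.Concatenate()([u2, e1])   # skip connection to the first encoder block
    d2 = layers.Conv2D(32, 3, activation="relu", padding="same")(u2)

    outputs = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(d2)
    return models.Model(inputs, outputs, name="denoising_autoencoder")

model = build_denoising_autoencoder()
model.summary()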
9.1 Data Collection:
For our project, we needed data to train our model to identify lung conditions from medical
images. We didn't have any specific data at first, so we went looking on a website called Kaggle,
where people share datasets. There, we found three different sets of medical images: X-rays,
MRI scans, and CT scans. These images were from patients with lung conditions, like cancer,
and from people who were healthy. Each type of scan shows different details about the lungs.
For example, X-rays give a general picture, while MRI and CT scans show more detailed
images. We downloaded all these images, making sure to get permission if needed, and ended up
with a total of 1025 pictures. To make it easier to work with, we put all the images together into
one big file. Then, we uploaded this file onto a platform called Google Drive, which lets us store
and share files online. By merging all these images into one file and uploading them to Google
Drive, we made it convenient for our team to access the data and start working on our project.
This way, everyone involved could easily use the images to train the computer program we were
building to recognize lung conditions.
Fig.5, Sample images of Dataset.

9.2 Data Augmentation:


To start our project, we need a bunch of medical images. But the ones we found are pretty
clear, and we want to make them a bit more realistic. So, we decided to add a special kind of
graininess to them called Gaussian noise. It's like sprinkling fine random specks over the pictures,
similar to the grain in a low-light photo. This noise is important because real medical images often
have some graininess or speckling in them. By adding this noise ourselves, we can make our dataset more
like the real thing. It's a bit like practicing with weights on your ankles before a race – it makes
the real thing feel easier. Now, we're not just randomly adding fuzziness. We're doing it in a very
controlled way, using something called Gaussian noise with a specific factor. This factor
determines how much fuzziness we add to each picture. We want to make sure it's just enough to
make the images look realistic but not so much that we can't see what's going on. After we add
this noise, the images won't look exactly like the ones we started with. They'll have a bit of that
medical imaging vibe to them, which is exactly what we're after. These slightly fuzzier images
will be the ones we use to train our Autoencoder model. So, by intentionally blurring our images
with Gaussian noise, we're making our dataset more realistic and better suited for training our
model to recognize patterns in medical images. It's like adding a touch of realism to our project,
so we're ready for the real deal when it comes.
import numpy as np
import matplotlib.pyplot as plt

def add_noise(image):
    # Add zero-mean Gaussian noise, scaled by a factor of 0.05, to a single image.
    row, col, ch = image.shape
    mean = 0
    sigma = 1
    gauss = np.random.normal(mean, sigma, (row, col, ch))
    gauss = gauss.reshape(row, col, ch)
    noisy = image + gauss * 0.05
    return noisy

# Build the noised dataset from the training images.
noised_df = []
for img in train_df:
    noisy = add_noise(img)
    noised_df.append(noisy)

noised_df = np.array(noised_df)

def plot_img(dataset):
    # Display the first five images of a dataset side by side.
    f, ax = plt.subplots(1, 5)
    f.set_size_inches(40, 20)
    for i in range(5):
        ax[i].imshow(dataset[i], cmap='gray')
    plt.show()

plot_img(noised_df)

Fig.6, Sample images after adding noise.


9.3 Data Preprocessing:
Data preprocessing is an important part of getting data ready for machine learning models, and in
the code provided, it includes several important steps such as:
Normalization:
In this step, the pixel values of the images are adjusted so they fall within a range between 0 and
1. This is important because it standardizes the pixel intensity values, making sure that no one
feature of the image is more important than another during the learning process.

By doing this, we're making sure that the model isn't affected by the scale of the input data. This
makes the training process more stable and effective, ultimately helping the model learn better
from the data.
# Image data is normalized to values between 0 and 1
# (assuming the images are loaded as 8-bit arrays with values 0-255).
xtrain = xtrain.astype('float32') / 255.0
xtest = xtest.astype('float32') / 255.0
9.4 Model Architecture: (pending)
10. RESULTS AND DISCUSSION

Before we jump into the analysis and comparison, let's first examine the output of
our model by inputting a noisy image. This initial step provides a visual understanding
of how well our model performs in reducing image noise. Fig.7 shows the original images, and
Fig.8 shows the output generated by our model when the noised images are given.

Fig.7, Original images

Fig.8, Denoised images

The graphs below show how the loss and accuracy change with each training epoch. This gives
us a clear picture of how our denoising model is learning and improving over time.
Fig.9, Loss and Accuracy metrics on each epoch.
In the project, various methods are being tested to improve image clarity by reducing
noise. Four types of filters—median, Gaussian, average, and bilateral—are being examined to
determine their effectiveness in noise reduction. Each filter operates differently; for example, the
median filter selects the middle value from a group of pixels to replace the noisy pixel, while the
Gaussian filter blurs the image to diminish noise. Additionally, an autoencoder is under
evaluation. Unlike filters, the autoencoder is a smart computer program that learns to remove
noise from images without relying on predefined filters. It does this by training the computer to
recognize and address noise within the images independently.

After applying each method, the resulting images are compared to the original clean images
using Peak Signal-to-Noise Ratio (PSNR) to measure their similarity. A higher PSNR value
indicates that the filtered or autoencoded image closely resembles the original clean image,
indicating better noise reduction. By testing these methods and comparing their PSNR values,
the project aims to determine the most effective approach for enhancing image quality by
reducing noise. This evaluation process is crucial for identifying the optimal method to improve
the quality of images for the project's objectives.
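The snippet below is a small sketch of this comparison using OpenCV's built-in filters and a NumPy PSNR computation; the clean and noisy arrays, kernel sizes, and noise level are illustrative placeholders rather than the project's actual test data.

import cv2
import numpy as np

def psnr(reference, candidate, max_val=255.0):
    # Higher PSNR (in dB) means the candidate is closer to the clean reference image.
    err = np.mean((reference.astype("float64") - candidate.astype("float64")) ** 2)
    return float("inf") if err == 0 else 10 * np.log10((max_val ** 2) / err)

# Placeholder 8-bit grayscale images: a clean reference and an artificially noised copy.
clean = np.full((128, 128), 128, dtype=np.uint8)
noisy = np.clip(clean + np.random.normal(0, 15, clean.shape), 0, 255).astype(np.uint8)

filtered = {
    "Median":    cv2.medianBlur(noisy, 5),
    "Gaussian":  cv2.GaussianBlur(noisy, (5, 5), 0),
    "Average":   cv2.blur(noisy, (5, 5)),
    "Bilateral": cv2.bilateralFilter(noisy, 9, 75, 75),
}

for name, image in filtered.items():
    print(name, "PSNR vs clean image:", round(psnr(clean, image), 2), "dB")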
Fig.10, PSNR values of all models.
In our project, we're exploring several methods to enhance image quality by reducing
noise. Specifically, we're investigating the efficacy of the median, Gaussian, average, and bilateral
filters alongside the autoencoder. Each of these methods employs distinct techniques to mitigate
noise and improve image clarity. For each method, we're assessing several key metrics to
comprehensively evaluate its performance.

One crucial metric is Accuracy, which measures the proportion of correctly classified pixels,
both noise and clean, out of all pixels in the image. A higher accuracy score indicates that the
filter is effective in correctly identifying and removing noise. Precision is another important
metric we're considering. It quantifies the accuracy of the positive predictions made by the filter,
indicating how many of the pixels identified as noise are indeed noise. A high precision score
signifies that the filter makes accurate predictions with minimal false positives. Recall, on the
other hand, measures the ability of the filter to identify all relevant instances of noise in the image.
It assesses the proportion of correctly identified noise pixels to the total number of noise pixels
present in the image. A high recall score indicates that the filter effectively captures most of the
noise in the image. Furthermore, we're evaluating the F1 Score, which combines precision and
recall into a single metric. This score provides a balanced assessment of the filter's performance
by considering both its ability to accurately identify noise and its capacity to capture all instances
of noise in the image. Additionally, we're measuring the Cross Entropy of each method. Cross
Entropy quantifies the difference between the predicted noise probability distributions and the
actual noise distributions in the images. A lower Cross Entropy score suggests that the method's
predictions closely match the actual noise distributions, indicating better performance in noise
reduction.

By analyzing these metrics for each method, we aim to gain a comprehensive understanding of
their effectiveness in reducing noise and improving image quality. This rigorous evaluation
process enables us to identify the most suitable filter or combination of filters for our project's
objectives, ensuring optimal performance in enhancing image clarity.
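As an illustration of how these metrics can be computed, the sketch below uses scikit-learn on a simulated per-pixel noise mask; the labels and predicted probabilities are synthetic placeholders, since the exact labelling scheme used in the project is not reproduced here.

import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, log_loss

# Synthetic per-pixel labels: 1 = noise pixel, 0 = clean pixel.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=10000)                          # ground-truth noise mask (flattened)
y_prob = np.clip(0.35 * y_true + 0.65 * rng.random(10000), 0, 1)  # predicted noise probabilities
y_pred = (y_prob > 0.5).astype(int)                              # hard decisions at a 0.5 threshold

print("Accuracy:     ", accuracy_score(y_true, y_pred))
print("Precision:    ", precision_score(y_true, y_pred))
print("Recall:       ", recall_score(y_true, y_pred))
print("F1 Score:     ", f1_score(y_true, y_pred))
print("Cross Entropy:", log_loss(y_true, y_prob))                # lower is better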
Fig.11, Autoencoder Model evaluation metrics. Fig.12, Median model metrics.

Fig.13, Gaussian model metrics. Fig.14, Average model metrics.

Fig.15, Bilateral model metrics.

In our study, we compared traditional filters with a special computer program called an
autoencoder using a graph called the ROC curve. This graph helps us see how well each method
can tell the difference between noisy and clean images. We found that the autoencoder did a
better job than traditional filters at reducing noise in images. This was shown by higher numbers
in the PSNR values and other measurements we looked at. When we looked at the ROC curve,
we saw that the autoencoder's curve was in a perfect spot near the top-left corner. This means it
was very good at spotting and fixing noise in images. So, overall, our study showed that the
autoencoder was more effective at cleaning up images than traditional filters. This suggests that
using special computer programs like autoencoders might be a better way to improve image
quality by getting rid of noise.
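The sketch below shows how such ROC curves can be produced with scikit-learn and matplotlib; the ground-truth labels and per-method scores are simulated placeholders, chosen only so that the "autoencoder" curve separates the classes better than the filter curve.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

# Simulated per-pixel ground truth (1 = noise) and predicted noise scores per method.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=5000)
scores = {
    "Autoencoder":   np.clip(0.7 * y_true + 0.5 * rng.random(5000), 0, 1),
    "Median filter": np.clip(0.2 * y_true + 0.8 * rng.random(5000), 0, 1),
}

for name, s in scores.items():
    fpr, tpr, _ = roc_curve(y_true, s)
    plt.plot(fpr, tpr, label=name + " (AUC = %.2f)" % auc(fpr, tpr))

plt.plot([0, 1], [0, 1], "k--", label="Chance")   # diagonal reference line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()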

Fig.16, These graphs display ROC curves, illustrating the comparison between each individual
filter and our model.
Fig.17, Comparison of auto encoders against various image denoising methods.

Model              Accuracy   Precision   Recall    F1 Score   Cross Entropy
Autoencoder        89.43%     81.88%      99.12%    89.68%     0.37
Median Filter      87.37%     48.50%      16.75%    24.91%     0.45
Gaussian Filter    86.44%     40.18%      17.25%    24.14%     0.45
Average Filter     86.58%     40.59%      15.85%    22.79%     0.45
Bilateral Filter   -          -           13.61%    20.58%     0.46

Table-2. Comparison of various methods vs. autoencoder evaluation metrics.
11. CONCLUSION AND FUTURE SCOPE
