A PROJECT PHASE-1 REPORT SUBMITTED ON

LUNG NODULE DETECTION USING DEEP LEARNING


A report submitted in partial fulfillment of the requirements for the Award of Degree of

BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
By
BOINPALLY RAVITEJA 20671A0503
KONAPURAM SIDDA 20671A0522
L. SHARATH KUMAR 20671A0525
GURU DEVENDER 21671A0503

Under the esteemed guidance of

Mrs. S. PAVANI
ASSISTANT PROFESSOR

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING


J.B. INSTITUTE OF ENGINEERING & TECHNOLOGY
(UGC Autonomous)
Approved by AICTE, Autonomous, accredited by NBA & NAAC, permanently affiliated
to JNTUH, Hyderabad, Telangana.
2023-2024


CERTIFICATE
This is to certify that the Major Project stage-1 report entitled “LUNG NODULE DETECTION
USING DEEP LEARNING”, submitted to the Department of Computer Science and Engineering,
J.B. Institute of Engineering & Technology, in accordance with Jawaharlal Nehru Technological
University regulations as partial fulfillment required for successful completion of Bachelor of
Technology, is a record of bona fide work carried out during the academic year 2023-2024 by

BOINPALLY RAVITEJA 20671A0503
KONAPURAM SIDDA 20671A0522
L. SHARATH KUMAR 20671A0525
GURU DEVENDER 21671A0503

Internal Guide: Mrs. S. PAVANI, Assistant Professor
Head of the Department: Mr. G. SREENIVASULU, Associate Professor
Department of CSE


DECLARATION

We hereby declare that the Major Project stage-1 report entitled “LUNG NODULE DETECTION
USING DEEP LEARNING”, carried out by us under the guidance of Mrs. S. Pavani, Assistant
Professor, is submitted in partial fulfillment of the requirements for the award of the degree of Bachelor
of Technology in Computer Science and Engineering. This is a record of bona fide work carried out
by us, and the results embodied in this project report have not been reproduced or copied from any
source. The results embodied in this project report have not been submitted to any other university
or institute for the award of any other degree or diploma.

Date: 13 / 12 / 2023

ACKNOWLEDGEMENT

At the outset, we express our gratitude to the Almighty for showering his grace and blessings upon us
to complete this Major Project stage-1. Although our names appear on the cover of this report, many
people have contributed in some form or other to the development of this project. We could not have
done this project without the assistance and support of each of the following.

First of all, we are highly indebted to Dr. P. C. KRISHNAMACHARY, Principal, for giving us
permission to carry out this Major Project stage-1.

We would like to thank Dr. G. SREENIVASULU, Professor & Head of the Department of
COMPUTER SCIENCE AND ENGINEERING, for his moral support throughout the period of
study in the Department.

We are grateful to Mrs. S. GAYATHRI DEVI, Associate Professor, COMPUTER SCIENCE AND
ENGINEERING, for her valuable suggestions and guidance during the execution of
this project work.

We would like to thank the Teaching and Non-Teaching Staff of the Department of Computer
Science & Engineering for sharing their knowledge with us.

TABLE OF CONTENTS

SL. NO   NAME OF TOPIC

1. INTRODUCTION
2. LITERATURE SURVEY
3. SYSTEM ANALYSIS
3.1 Existing System
3.2 Disadvantages
3.3 Proposed System
3.4 Advantages
4. REQUIREMENT SPECIFICATIONS
4.1 Functional Requirements
4.2 Non-Functional Requirements
4.3 Software Requirements
4.4 Hardware Requirements
5. SYSTEM DESIGN
5.1 System Architecture
5.2 Data Flow Diagram
5.3 Sequence Diagram
6. BIBLIOGRAPHY

LIST OF FIGURES

Sl. No   Description of Figures

1. System Architecture
2. Data Flow Diagram
3. Sequence Diagram

ABSTRACT

Lung cancer is a high-risk disease that affects people all over the world, and lung nodules are the most
common sign of early lung cancer. Since early identification of lung cancer can considerably improve
a patient’s chance of survival, an accurate and efficient nodule detection system is
essential. Automatic lung nodule recognition reduces radiologists' workload, as well as the risk of
misdiagnosis and missed diagnoses. Hence, this work develops a new lung nodule detection model
with four stages: image pre-processing, segmentation, feature extraction, and classification. In
the first stage, the input image is subjected to a series of pre-processing
operations. In the second stage, the Otsu threshold model is used to segment the pre-processed images. In
the third stage, Local Binary Pattern (LBP) features are extracted, and in the fourth stage they are
classified by an optimized Convolutional Neural Network (CNN). The activation function and the number of
convolutional layers of the CNN are tuned by a proposed algorithm known as Improved Moth Flame
Optimization (IMFO). Finally, the improvement offered by the scheme is validated by analysis in terms
of several performance measures. In particular, the proposed work achieves good accuracy.
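
The Otsu segmentation stage named above can be illustrated with a minimal sketch. Otsu's method picks the grey level that maximises the between-class variance of the image histogram. The code below is plain Python written for illustration only, not the project's implementation; the function names otsu_threshold and segment and the toy 8-bit image are assumptions made for this example.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises the between-class variance.

    Pixels with value <= t form the lower class, pixels above t the upper one.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # pixel count of the lower class
    sum0 = 0  # grey-level sum of the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class is still empty
        if w1 == 0:
            break     # upper class is empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def segment(image, t):
    """Binarise a 2D image: 1 where the pixel is brighter than t, else 0."""
    return [[1 if p > t else 0 for p in row] for row in image]


# Toy 8-bit image (hypothetical values): dark background with a bright blob.
image = [[10, 12, 11, 200],
         [9, 210, 205, 11]]
t = otsu_threshold([p for row in image for p in row])
mask = segment(image, t)
```

On real CT slices a library routine such as scikit-image's threshold_otsu would normally be preferred over a hand-written loop.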

1. INTRODUCTION

Lung cancer and COVID-19 are two severe pulmonary diseases that cause millions of
deaths globally each year. Lung cancer is the second most common form of cancer in both
women and men, and it is the leading cause of cancer deaths in the US. The best
chances of survival come from early detection and diagnosis, which can be aided by improved
automated malignant nodule recognition techniques. A lung nodule is a small, roughly round
growth of tissue found in the chest cavity. Nodules are usually below 30 mm in size; larger
growths are termed masses and are presumed to be malignant.

Nodules between 5 and 30 mm may be malignant or benign, with the probability of malignancy rising
with size. Spiculated or lobulated nodule edges may indicate malignancy, whereas smooth nodules
with signs of calcification are expected to be benign. The two most important chest
imaging methods are conventional X-ray radiography and computed tomography (CT).

Radiographs, or chest X-ray images, offer a single view of the chest cavity. The posteroanterior view,
where the X-ray beam passes through the patient's chest from back to front, is the most common. CT scans are
3-D images reconstructed from X-ray images obtained from several orientations. They offer
a complete view of the internal parts of the chest and can therefore be used to determine
the shapes, sizes, locations, and densities of lung nodules. Nevertheless, CT scan equipment is
expensive and is often not available in rural areas or smaller hospitals.

Radiographs, by contrast, are comparatively fast and cheap, and patients are exposed to little
radiation; hence they are typically the initial diagnostic step for identifying any abnormalities in the
chest. Computer-aided detection (CAD) methods were developed to identify lung nodules more precisely
and quickly. Conventional nodule recognition approaches use image processing schemes to find areas of
the chest radiograph that contain a bright object with the expected texture, shape, and size of a lung
nodule. With recent advances in CNNs, some researchers have aimed at using these
techniques to classify lung nodules. Unfortunately, the datasets available in medical imaging are
comparatively small. The literature on detecting and diagnosing lung nodules is extensive.

To date, the general technique for lung nodule diagnosis in existing CAD systems has been to use
a candidate identification stage. While some of these studies use low-level appearance-based
features to drive this identification task, others use shape and size information. Ypsilantis et
al. proposed using recurrent neural networks in a patch-based strategy to improve nodule detection,
an approach in line with deep learning-based methodologies. Krishnamurthy et al. presented
a 2D multi-step segmentation approach to discover candidates.

There have also been in-depth studies on extracting high-level discriminative information with
deep networks to improve false-positive (FP) reduction. Setio et al. trained nine independent
2D convolutional neural networks on nine different views of each candidate and then fused
their outputs to perform FP reduction. For candidate detection, another study used a modified version of
Faster R-CNN, which was the state-of-the-art object detector at the time, and a patch-based 3D CNN
for the FP reduction step. All of these approaches, however, are computationally inefficient.
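
The idea of fusing several per-view classifiers for FP reduction can be sketched as a simple late fusion. The exact fusion rule used by Setio et al. is not stated here, so averaging the per-view probabilities is an assumption made only for illustration, as are the function name fuse_view_scores, the 0.5 threshold, and the example scores.

```python
def fuse_view_scores(view_scores, threshold=0.5):
    """Average per-view nodule probabilities and apply a decision threshold.

    view_scores: one probability per view, each produced by a separate
    2D classifier (hypothetical inputs in this sketch).
    Returns (fused_probability, is_nodule).
    """
    if not view_scores:
        raise ValueError("at least one view score is required")
    fused = sum(view_scores) / len(view_scores)
    return fused, fused >= threshold


# Nine hypothetical scores for one candidate, one per view.
likely_nodule = fuse_view_scores([0.9, 0.8, 0.95, 0.7, 0.85, 0.9, 0.6, 0.8, 0.9])
likely_false_positive = fuse_view_scores([0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.1, 0.6])
```

A candidate that looks suspicious in only one view, like the second example, is rejected because the other views pull the fused score down.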

2. LITERATURE SURVEY
In the realm of lung cancer detection and prediction, several studies have explored various
methodologies and algorithms to enhance accuracy and efficiency.

Detection of Lung Cancer in CT Images using Image Processing (Nidhi.s,
2019) introduced an image processing approach coupled with a support vector
machine (SVM) for lung cancer detection. However, this study did not utilize
a dedicated database for its analysis.

Multi-Stage Lung Cancer Detection and Prediction Using Multi-class
SVM Classifier (Janee Alam, 2019) emphasized the application of a multi-class
SVM classifier for lung cancer detection and prediction across multiple
stages. The study highlighted challenges in achieving accurate detection,
particularly in scenarios with low accuracy rates.

A Comparative study of Lung Cancer detection using supervised neural
network (Ahana Gangly, 2019) conducted a comparative analysis of lung
cancer detection techniques employing a supervised neural network,
specifically utilizing the SURF (Speeded Up Robust Features) algorithm.
However, the reported accuracy rates were relatively low.

Multi-Layer Perceptron Based Lung Tumor Classification (Sneha
Potaganam, 2018) explored lung tumor classification using a Multi-Layer
Perceptron (MLP) algorithm within an image processing framework. One of
the identified challenges was the time-consuming nature of the process.

Robustness-Driven Feature Selection in Classification of Fibrotic
Interstitial Lung Disease Patterns in Computed Tomography Using 3D
Texture Features (Dainel Y Hung, 2016) proposed a robustness-driven
feature selection approach for classifying fibrotic interstitial lung disease
patterns in CT images. However, the study reported slow processing times
as a limitation.

Automatic Detection and Segmentation of Lung Nodule on CT Images
(Yangchuran, 2018) focused on automatic detection and segmentation of
lung nodules in CT images using a fully convolutional network (FCN).
Nevertheless, the study reported poor detection rates.

Segmentation and Analysis of CT Chest Images for Early Lung Cancer
Detection (Rachid Sammoda, 2016) utilized segmentation and analysis of
CT chest images for early lung cancer detection, employing an Artificial
Neural Network classifier. This approach built upon previous techniques.

Efficient edge detection method for diagnosis of 2D and 3D lung and
liver images (P. Sutha, 2017) proposed an efficient edge detection method
for diagnosing lung and liver images, emphasizing its potential for less
invasive surgery and increased survival rates.

3. SYSTEM ANALYSIS

3.1 EXISTING SYSTEM

K-means clustering
Wavelet transform and Principal Component Analysis (PCA)
K-Nearest Neighbours (KNN) classifier
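
As a point of reference for the existing approach, a k-nearest neighbours classifier can be sketched in a few lines: a query is assigned the majority label of its k closest training samples. This is a generic illustration, not the code of any existing system; the function name knn_predict, the two-dimensional feature vectors, and the labels "benign" and "nodule" are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Pair each training sample with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors, e.g. two texture statistics per region.
features = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
            (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
labels = ["benign", "benign", "benign", "nodule", "nodule", "nodule"]

prediction_a = knn_predict(features, labels, (1.1, 1.0))
prediction_b = knn_predict(features, labels, (5.0, 4.9))
```

Because every prediction scans the whole training set, the cost grows with the number of stored samples.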

3.2 DISADVANTAGES OF EXISTING SYSTEM

Difficult to get accurate results
Not suitable for segmenting lesions in multiple images in a short time
Poor discriminatory power and low classification accuracy

3.3 PROPOSED SYSTEM

Pre-processing
Feature extraction
CNN
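
The abstract names Local Binary Pattern (LBP) features for the feature extraction stage. A minimal sketch of the basic 3x3 LBP operator follows: each pixel is encoded by comparing its eight neighbours with the centre value, and the image is summarised by a histogram of the resulting codes. The neighbour ordering, the function names lbp_code and lbp_histogram, and the toy patch are assumptions for this example; the variant used in the project may differ.

```python
# Neighbour offsets (row, col), clockwise starting from the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP code of the pixel at (r, c); the pixel must not lie on the border."""
    center = image[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


# Toy 3x3 patch (hypothetical grey levels) with a single interior pixel.
patch = [[5, 9, 1],
         [4, 6, 7],
         [2, 3, 8]]
code = lbp_code(patch, 1, 1)
features = lbp_histogram(patch)
```

The histogram is the feature vector that would be passed on to the classifier. For real images, scikit-image's local_binary_pattern offers this and other LBP variants.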

3.4 ADVANTAGES OF PROPOSED SYSTEM

High Accuracy
Automation and Efficiency
Early Detection

4. REQUIREMENT SPECIFICATIONS

4.1 FUNCTIONAL REQUIREMENTS

1. Image Acquisition and Input Handling: The system should be able to handle DICOM images
obtained from various medical imaging devices. It should support multiple input formats and
resolutions.
2. Preprocessing: The system must preprocess images to enhance nodule features and ensure
consistency. Preprocessing steps should include resizing, normalization, and noise reduction.
3. Nodule Detection: The system should accurately detect and localize lung nodules within CT scan
images. It must be able to distinguish nodules from other anatomical structures. The detection
should provide information about the size, shape, and location of nodules.
4. Model Training and Validation: The system must train deep learning models using annotated
datasets. It should support various deep learning architectures suitable for nodule detection.
5. Integration and Deployment: The system should integrate with existing PACS or other healthcare
systems for seamless deployment.
6. Reporting and Visualization: The system should generate comprehensive reports detailing
detected nodules and their characteristics.
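
The preprocessing steps listed in requirement 2 (noise reduction, resizing, normalization) can be sketched as follows. This is an illustration in plain Python on small nested lists, under the assumption that a slice has already been decoded into a 2D array of intensities; reading DICOM files is not shown. The choice of a 3x3 median filter, nearest-neighbour resizing, min-max scaling, and all function names are assumptions for this example.

```python
from statistics import median


def median_filter3(image):
    """3x3 median filter for noise reduction; border pixels are left unchanged."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            window = [image[r + dr][c + dc]
                      for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = median(window)
    return out


def resize_nearest(image, out_h, out_w):
    """Resize to out_h x out_w by nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [[image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]


def normalize(image):
    """Min-max scale intensities to the range [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # avoid division by zero on constant images
    return [[(p - lo) / span for p in row] for row in image]


def preprocess(image, size=(4, 4)):
    """Denoise, resize, then normalise a single 2D slice."""
    return normalize(resize_nearest(median_filter3(image), *size))


# Hypothetical slice with one isolated bright pixel (impulse noise).
noisy = [[10, 10, 10],
         [10, 250, 10],
         [10, 10, 10]]
denoised = median_filter3(noisy)
```

In a real system these steps would normally be applied with array libraries to full-resolution CT slices.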

4.2 NON-FUNCTIONAL REQUIREMENTS

1. Accuracy and Performance: The system must achieve high accuracy in nodule detection to
minimize false positives and false negatives. It should be able to process images within a
reasonable time frame to support clinical workflows.
2. Scalability: The system should be scalable to handle large volumes of image data efficiently.
It should accommodate future growth in the dataset size and user base.
3. Security and Privacy: The system must comply with relevant healthcare data security and
privacy regulations (e.g., HIPAA, GDPR).
4. Reliability and Availability: The system should be reliable, with minimal downtime and robust
error handling mechanisms. It should have built-in redundancy and failover capabilities to
ensure continuous availability.
5. Interoperability: The system should be interoperable with other healthcare IT systems, allowing
seamless data exchange and integration. It should support standard data formats and
communication protocols used in the healthcare industry.
6. Usability: The system interface should be intuitive and easy to use, even for non-technical
clinicians. It should provide clear feedback and guidance to users during operation.
7. Regulatory Compliance: The system must comply with applicable medical device regulations
and standards. It should undergo rigorous testing and validation to ensure safety and efficacy
in clinical use.
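
The accuracy requirement above refers to false positives and false negatives. A minimal sketch of how these counts turn into the usual evaluation measures is given below. The function name detection_metrics and the example labels are hypothetical and do not represent results of this project.

```python
def detection_metrics(y_true, y_pred):
    """Compute basic binary detection measures (1 = nodule, 0 = no nodule)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # nodules found
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correct rejections
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed nodules
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


# Hypothetical ground truth and predictions for eight candidates.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
```

Sensitivity reflects missed nodules (false negatives), while specificity and precision reflect false alarms (false positives), so reporting accuracy alone can hide either kind of error.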

4.3 SOFTWARE REQUIREMENTS

⚫ Coding Language: Python
⚫ Operating System: Windows 10

4.4 HARDWARE REQUIREMENTS

⚫ Processor: i5 and above
⚫ RAM: 4 GB
⚫ Disk Space: 16 GB

5. SYSTEM DESIGN

5.1 SYSTEM ARCHITECTURE


The architecture for lung nodule detection using deep learning encompasses several
interconnected components designed to facilitate accurate and efficient detection of nodules
within CT scan images. It begins with the acquisition of DICOM-format lung images, which
undergo preprocessing steps such as resizing, normalization, and noise reduction to enhance
relevant features and ensure consistency. Annotated data, detailing the location and characteristics
of nodules, serves as the foundation for training deep learning models, typically employing
architectures like Convolutional Neural Networks (CNNs) tailored for medical imaging tasks.
Following model training and validation to ensure robust performance metrics, the system
integrates seamlessly into clinical workflows through deployment, often within Picture Archiving
and Communication Systems (PACS) or standalone applications. In the operational phase, the
system applies the trained model to new images for nodule detection, providing clinicians with
valuable insights into nodule size, shape, and location. Continuous feedback loops, supported by
post-processing techniques and comprehensive reporting functionalities, enable iterative
improvements to both model performance and system usability.
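
The core operation of the CNN mentioned in this architecture can be illustrated with a single convolution followed by a ReLU activation. The sketch below is plain Python for readability and is not the project's model; the kernel values, the toy image, and the function names conv2d_valid and relu are assumptions. A trained network learns its kernels from annotated data instead of using a fixed one.

```python
def conv2d_valid(image, kernel):
    """2D cross-correlation with stride 1 and no padding, as in a CNN layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Element-wise ReLU: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy image (hypothetical values): a bright 2x2 blob on a dark background.
image = [[0, 0, 0, 0],
         [0, 9, 9, 0],
         [0, 9, 9, 0],
         [0, 0, 0, 0]]
# Hand-picked kernel that responds to dark-to-bright transitions from left to right.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

response = conv2d_valid(image, kernel)
activated = relu(response)
```

The positive responses mark the left edge of the blob, while the right edge produces negative values that the ReLU suppresses.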

Fig 5.1: Block Diagram

5.2 DATA FLOW DIAGRAM

1. The DFD is also called a bubble chart. It is a simple graphical formalism that can be used to
represent a system in terms of the input data to the system, the various processing carried out on this
data, and the output data generated by the system.
2. The data flow diagram (DFD) is one of the most important modeling tools. It is used to model
the system components. These components are the system processes, the data used by the
processes, the external entities that interact with the system, and the information flows in the
system.
3. The DFD shows how information moves through the system and how it is modified by a series
of transformations. It is a graphical technique that depicts information flow and the
transformations that are applied as data moves from input to output.
4. A DFD may be used to represent a system at any level of
abstraction. It may be partitioned into levels that represent increasing information flow and
functional detail.

6. BIBLIOGRAPHY

1. Paul. Key Statistics for Lung Cancer. Version 1.6.0. Available
online: https://www.cancer.org/cancer/non-small-cell-lung-cancer/about/key-statistics.html
(accessed on 15 May 2019).

2. Zhou, Z.H.; Jiang, Y.; Yang, Y.B.; Chen, S.F. Lung cancer cell identification based on artificial
neural network ensembles. Artif. Intell. Med. 2002, 24, 25–36.

3. Boroczky, L.; Zhao, L.; Lee, K.P. Feature subset selection for improving the performance of false
positive reduction in lung nodule CAD. IEEE Trans. Inf. Technol. Biomed. 2006, 10, 504–511.

4. Tajbakhsh, N.; Suzuki, K. Comparing two classes of end-to-end machine-learning models in lung
nodule detection and classification: MTANNs vs. CNNs. Pattern Recognit. 2017, 63, 476–486.

5. Sivakumar, S.; Chandrasekar, C. Lung nodule detection using fuzzy clustering and support vector
machines. Int. J. Eng. Technol. 2013, 5, 179–185.

6. Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach; Pearson Education Limited:
Malaysia, 2016.

7. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.;
Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach.
Learn. Res. 2011, 12, 2825–2830.

8. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436.
