
DAYANANDA SAGAR UNIVERSITY

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
SCHOOL OF ENGINEERING
DAYANANDA SAGAR UNIVERSITY
KUDLU GATE
BANGALORE - 560068

MINI PROJECT REPORT
ON
“Breast cancer classification using CNN”

SUBMITTED TO THE 6th SEMESTER DATA SCIENCE (19CS3613)

BACHELOR OF TECHNOLOGY IN COMPUTER SCIENCE & ENGINEERING

Submitted by
DEEKSHA J REDDY - ENG19CS0081
DONA S MANJUNATH - ENG19CS0090

Under the supervision of
Prof. Meenakshi Malhotra
<Project Title>

CERTIFICATE

This is to certify that Ms. Deeksha J Reddy and Ms. Dona S Manjunath, bearing USNs
ENG19CS0081 and ENG19CS0090, have satisfactorily completed their Mini Project
as prescribed by the University for the VI semester B.Tech. program in Computer
Science & Engineering during the year 2021-22 at the School of Engineering,
Dayananda Sagar University, Bangalore.

Date:

Signature of the faculty in charge

Max Marks Marks Obtained


DECLARATION
We hereby declare that the work presented in this mini project entitled “Breast cancer
classification using CNN” has been carried out by us and has not been submitted for the
award of any degree, diploma, or mini project at any other college or university.

DEEKSHA J REDDY - ENG19CS0081

DONA S MANJUNATH - ENG19CS0090


ACKNOWLEDGMENT

The satisfaction that accompanies the successful completion of a task would be
incomplete without the mention of the people who made it possible and whose
constant guidance and encouragement crown all the efforts with success.

We are especially thankful to our Chairman, Dr. Girisha Rao, for providing the necessary
departmental facilities, moral support, and encouragement.

We are very thankful to our Course Advisor, Meenakshi Malhotra, for providing
help and suggestions in completing this mini project successfully.

We have received a great deal of guidance and co-operation from our friends, and we
wish to thank all who have directly or indirectly helped us in the successful completion
of this project work.


TABLE OF CONTENTS
ABSTRACT

CHAPTER 1 INTRODUCTION
1.1. ABOUT THE PROJECT
1.2. SCOPE OF THE PROJECT
CHAPTER 2 PROBLEM DEFINITION
CHAPTER 3 REQUIREMENTS
3.1. FUNCTIONAL REQUIREMENTS
3.2. NON-FUNCTIONAL REQUIREMENTS
3.3. SOFTWARE REQUIREMENTS
3.4. HARDWARE REQUIREMENTS
CHAPTER 4 PROJECT DESIGN
4.1. USE CASE DIAGRAM
4.2. DATA FLOW DIAGRAM
4.3. SEQUENCE DIAGRAM


ABSTRACT

Convolutional Neural Networks (CNNs) have become established as a powerful class of models
for image recognition problems. A CNN is a deep learning model that extracts features from an
image and uses those features to classify it. Other classification algorithms need a separate
feature extraction step, such as the Gray Level Co-occurrence Matrix, before an image can be
classified. CNNs are a class of deep, feed-forward artificial neural networks that have been
applied successfully to image recognition, and they are also widely used in video
recognition, image classification, recommender systems, natural language processing, and
speech recognition. In this project, we use a dataset of 7,909 breast cancer histopathology
images acquired from 82 patients. The images belong to two classes, benign and malignant.
We extract patches from the images to train the network, and the trained network then takes
an image as input and classifies it.
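
The patch extraction step mentioned above can be sketched as a sliding window over the image. This is only a minimal illustration in plain Python: the function name `extract_patches`, the patch size, the stride, and the toy grayscale image are all assumptions, since the report does not state the values that were actually used.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel intensities


def extract_patches(image: Image, patch_size: int, stride: int) -> List[Image]:
    """Slide a square window over the image and collect every full patch."""
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            # Take patch_size rows, and patch_size columns from each row.
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(patch)
    return patches


# Toy 4x6 "image" (assumed); real histopathology images are much larger and in colour.
toy_image = [[r * 6 + c for c in range(6)] for r in range(4)]
patches = extract_patches(toy_image, patch_size=2, stride=2)
print(len(patches))  # 6 (2 rows of patches x 3 columns of patches)
print(patches[0])    # [[0, 1], [6, 7]]
```

With a stride equal to the patch size the patches do not overlap; a smaller stride would produce overlapping patches and therefore more training samples per image.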


CHAPTER 1

INTRODUCTION


Breast cancer is the second leading cause of cancer death among women worldwide. It occurs
when malignant cells grow abnormally into a mass of tissue. These cells originate from the
milk glands of the breast. The cells are classified by the rate at which they progress and by
their effect on other, normal cells, which can eventually affect the whole body. According to
World Health Organization (WHO) statistics, breast cancer is the most common cancer in
women globally, with approximately 2.1 million new cases each year. In 2018, an estimated
627,000 women died from breast cancer, approximately 15% of all cancer deaths among
women. Earlier studies in the literature detect and classify breast cancer using conventional
methods that do not rely on machine learning.

Breast cancer is usually detected when specific symptoms appear. However, many women who
have breast cancer do not have any symptoms. Primary prevention of breast cancer is not yet a
reality, except through prophylactic mastectomy for those who carry a BRCA1 or BRCA2 gene
mutation, which is known to greatly increase the risk of breast cancer. The treatments
administered to a patient also affect the prognosis of the disease and the possibility of
relapse. Surgery, chemotherapy, radiotherapy, and hormone therapy are among the classical
medical treatments used on breast cancer patients. Screening is therefore recommended so
that breast cancer is detected at an early stage, when lives can still be saved. Early detection
allows early diagnosis and treatment, which is vital for the prognosis and for long-term
survival, whereas delays in treatment allow the illness to spread. According to one study,
patients who started their treatment within 3 months of the appearance of breast cancer
symptoms had a higher chance of survival and a reduced proliferation of malignant cells in
the body.

Advances in image processing, especially medical image processing, raise hopes of building
automated applications for breast cancer detection and classification. Deep learning
algorithms use layers of neural networks to recognize patterns, and such algorithms are
becoming very useful in the medical field. Despite continuous research in automated breast
cancer applications, correctly identifying or classifying breast abnormalities is still a
challenge. In addition, deep learning requires large amounts of training data, which are very
difficult to obtain in the medical domain. There is therefore scope for further work on
automated breast cancer detection to improve the accuracy of cancer screening. In this
project, we use a CNN to address this problem.


1.1. ABOUT THE PROJECT:

Classifying breast tumor biopsies as benign or malignant remains a largely manual and
subjective process, carried out by specially trained pathologists. Breast cancer is the second
leading cause of cancer deaths in women globally. Advancing the fight against cancer
requires early detection, which an efficient classification system could aid. The goal
of this system is to train and tune a CNN to accurately classify breast cancer biopsies as
malignant or benign.

1.2. SCOPE OF THE PROJECT


Medical image processing has advanced to the point where automated breast cancer detection
and classification is a realistic goal, and deep learning models that recognize patterns
through layers of neural networks are increasingly useful in the medical field. Correct
identification and classification of breast abnormalities nevertheless remains a challenge,
and the large training sets that deep learning requires are very difficult to obtain in the
medical domain. This leaves scope for further work on automated breast cancer detection to
improve the accuracy of cancer screening. In this work, we use a deep learning model to
classify abnormalities. Our project can help with the early detection of breast cancer and so
support the fight against cancer. Automated classification can greatly reduce the time needed
to detect breast cancer and can reduce the dependence on an experienced professional.


CHAPTER 2
PROBLEM DEFINITION


Breast cancer is the second most frequent cancer globally among women and men. In 2020,
2.3 million women were diagnosed with breast cancer and there were 685,000 deaths globally.
As of the end of 2020, 7.8 million women were alive who had been diagnosed with breast
cancer in the past 5 years, making it the world's most prevalent cancer. Breast cancer arises
when cells in the breast start to grow out of control. These cells usually form a tumor that
can often be seen on an x-ray or felt as a lump. The tumor is malignant (cancerous) if the
cells can grow into (invade) surrounding tissues or spread (metastasize) to distant parts of
the body.


CHAPTER 3
REQUIREMENTS


3.1 FUNCTIONAL REQUIREMENTS

Fig 1.1 Functional Requirements

The above diagram shows the functional requirements of the project (breast cancer
classification), that is, what the system does:

1. Extract features from the given dataset using a CNN. The input image is passed to a
feature extraction network, and the extracted feature signals are then used by a second
neural network for classification.
2. Apply deep learning algorithms to classify the given medical images as benign or
malignant.
3. Evaluate model performance using metrics.
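
The report does not name the metrics used in step 3. As a hedged sketch, the following plain-Python code computes the metrics most commonly reported for a binary benign/malignant classifier from true and predicted labels. The label encoding (0 for benign, 1 for malignant), the function names, and the example labels are assumptions made for illustration, not results from this project.

```python
from typing import Dict, List

BENIGN, MALIGNANT = 0, 1  # assumed label encoding; malignant is the positive class


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Dict[str, int]:
    """Count true/false positives and negatives for 0/1 labels."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == MALIGNANT and pred == MALIGNANT:
            counts["tp"] += 1
        elif truth == BENIGN and pred == BENIGN:
            counts["tn"] += 1
        elif truth == BENIGN and pred == MALIGNANT:
            counts["fp"] += 1
        else:  # malignant sample predicted as benign
            counts["fn"] += 1
    return counts


def safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    c = confusion_counts(y_true, y_pred)
    precision = safe_div(c["tp"], c["tp"] + c["fp"])
    recall = safe_div(c["tp"], c["tp"] + c["fn"])  # also called sensitivity
    return {
        "accuracy": safe_div(c["tp"] + c["tn"], sum(c.values())),
        "precision": precision,
        "recall": recall,
        "specificity": safe_div(c["tn"], c["tn"] + c["fp"]),
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Made-up labels purely to exercise the functions.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a screening task, recall on the malignant class is usually watched most closely, because a false negative means a missed cancer.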


There are two classes, namely:

1. Benign
2. Malignant

Fig 1.2 Benign and malignant

Malignant tumors are cancerous, that is, they can invade other sites.
Benign tumors are noncancerous.

● The system will present test results clearly, in a format that users can understand.
● The system must be able to perform predictions accurately on demand.
● The system should be able to display results on the result form.
● The system should aid the early detection of breast cancer to increase survival rates.
● The system should store, process, and analyze a large amount of data efficiently using
low-cost memory and inexpensive computing power.


3.2 NON-FUNCTIONAL REQUIREMENTS

1. The training folder has 1,000 images in each category, while the validation folder has
250 images in each category.
2. Total number of images: 7,909, of which 2,480 are benign and 5,429 are malignant.
3. Algorithm used: CNN

4. User-Related Requirements:

● The system should provide consistency and standards.
● The system should provide an aesthetic and minimalist design.
● The system should help users recognize, diagnose, and recover from errors: error
messages should almost always be expressed in plain language to ensure nothing gets
lost in translation.
● The system should speak the users’ language, with words, phrases, and concepts
familiar to the user, rather than system-oriented terms.

5. Quality-Related Requirements:

● Maintainability: the system should be able to undergo changes without impact on
components, services, features, and interfaces when adding or changing
functionality, fixing errors, and meeting new business requirements.
● Reusability: the components and subsystems should be suitable for use in other
applications and in other scenarios, minimizing the duplication of components and
the implementation time.
● Interoperability: the system should be able to operate successfully by communicating
and exchanging information with external systems written and run by external
parties.

6. Performance-Related Requirements:

● Response time: the total time to send a request and get a response should not be
more than 30 seconds.
● CPU utilization: processing a request should not use more than 50% of the CPU.
● Memory utilization: processing a request should not use more than 20% of the main
memory while the application is running.
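
The folder sizes in item 1 and the class totals in item 2 imply that a balanced subset is drawn from an imbalanced dataset. One possible way to build such folders is sketched below. The function name `balanced_split`, the random seed, and the placeholder file names are assumptions; the report does not describe how the images were actually selected.

```python
import random
from typing import Dict, List, Tuple

TRAIN_PER_CLASS = 1000  # per-class training folder size stated in the report
VAL_PER_CLASS = 250     # per-class validation folder size stated in the report


def balanced_split(
    files_by_class: Dict[str, List[str]],
    train_n: int,
    val_n: int,
    seed: int = 0,
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Randomly pick train_n + val_n files per class, with no overlap between splits."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train: Dict[str, List[str]] = {}
    val: Dict[str, List[str]] = {}
    for label, files in files_by_class.items():
        needed = train_n + val_n
        if len(files) < needed:
            raise ValueError(f"class {label!r} has {len(files)} files, needs {needed}")
        chosen = rng.sample(files, needed)  # sampling without replacement
        train[label] = chosen[:train_n]
        val[label] = chosen[train_n:]
    return train, val


# Placeholder file names standing in for the real image paths (assumed).
files_by_class = {
    "benign": [f"benign_{i}.png" for i in range(2480)],
    "malignant": [f"malignant_{i}.png" for i in range(5429)],
}
train, val = balanced_split(files_by_class, TRAIN_PER_CLASS, VAL_PER_CLASS)
for label in files_by_class:
    print(label, len(train[label]), len(val[label]))
```

A per-image random split like this one can place images from the same patient in both folders. A patient-level split would avoid that, though the report does not say which approach was taken.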


3.3 SOFTWARE REQUIREMENTS

Server-side requirements include:

1. Language: Python
2. Model building: TensorFlow framework, Keras, DenseNet201
3. Plotting: Matplotlib
4. Image processing: OpenCV for Python (cv2)
5. Dependencies: NumPy and scikit-learn (sklearn) libraries
6. Operating system: Microsoft Windows 7 or later
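
Before running the project, it can be useful to confirm that the Python packages in the list above are installed. The following is a small sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the module names are the usual import names for these packages (`cv2` for OpenCV, `sklearn` for scikit-learn). DenseNet201 is not a separate package; it is provided through the Keras applications module.

```python
import importlib.util
from typing import Iterable, List

# Import names for the packages listed in the server-side requirements.
REQUIRED_MODULES = ["tensorflow", "keras", "matplotlib", "cv2", "numpy", "sklearn"]


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```

The check only looks the modules up without importing them, so it runs quickly even when a heavy framework such as TensorFlow is installed.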

Client-side Requirements include:

1. A web browser such as Internet Explorer, Mozilla Firefox, Opera, or Google Chrome
2. Operating system: Microsoft Windows 7 or later


3.4 HARDWARE REQUIREMENTS

Server-side requirements include:

1. Core i5 or higher processor
2. 4 GB RAM
3. 100 GB free storage space
4. High-performance processor

Client-side requirements include:

1. Any desktop or laptop device
2. 2 GB RAM
3. 1 processor
4. 10 GB free storage space


CHAPTER 4
PROJECT DESIGN


4.1. USE CASE DIAGRAM

Fig 4.1 Use case diagram

4.2. DATA FLOW DIAGRAM

Fig 4.2 Level 0: Data flow diagram


Fig 4.3 Level 1: Data flow diagram

4.3. SEQUENCE DIAGRAM

Fig 4.4 Sequence diagram
