
CHAPTER 1

BREAST TUMOR DETECTION

1.1 INTRODUCTION

Breast cancer is the most commonly occurring type of cancer. It is reported to affect
over 2 million women annually. Survival rates are generally reported to be higher than
55 percent in most places. There are no established prevention techniques for breast
cancer. However, early detection and diagnosis are essential in determining the chances
of survival. The following are the general methods used in diagnosing breast cancer.

Breast ultrasound: A machine that uses sound waves to make detailed pictures,
called sonograms, of areas inside the breast.

Diagnostic mammogram: If you have a problem in your breast, such as lumps, or if an
area of the breast looks abnormal on a screening mammogram, doctors may have you get
a diagnostic mammogram. This is a more detailed X-ray of the breast.

Magnetic resonance imaging (MRI): A body scan that uses a magnet linked to a
computer. The MRI scan makes detailed pictures of areas inside the breast.

Biopsy: This test removes tissue or fluid from the breast to be examined under a
microscope and subjected to further testing. There are various types of biopsies.

1.1.1 Objectives and Scope

Breast cancer is the most frequently diagnosed cancer in women and the second leading
cause of cancer deaths. Doctors often use additional tests to find and diagnose breast
cancer. They may refer women to a breast specialist or another specialist.

Objective: Our objective is to detect tumors in the breast using mammographic images.
We were provided with a dataset of 2216 mammographic images, all of which are
unlabeled but assumed to show cancer. The goal is to use a model to decide whether a
given image from the unlabeled dataset contains a tumor.

The following is an image from the dataset (Figure 1.1):

Figure 1.1: Breast Mammogram

1.2 Literature Review

In Gopee, the author works on an image classification task on the CIFAR-10 dataset,
where each image belongs to one of ten classes. The classes are mutually exclusive and
are mostly objects and animals. The images are small (32 × 32 pixels), uniform in size
and shape, and RGB colored. The author implements a proposed image preprocessing
scheme to learn and capture the salient features of the images. The method was shown
to improve classification performance significantly: the author reports an improvement
of over 15 percent over the baseline (i.e., without preprocessing). Further experiments
vary the parameters and settings of the proposed technique to tune the preprocessing,
and try a variety of linear classifiers on the preprocessed images. A simple SVM
classifier with a linear kernel performs best. Finally, the author experiments with
ensemble learning by combining a simple SVM with multinomial logistic regression. The
ensemble improves on the simple SVM at a high computational cost.

In Sharma et al. (2018), the paper presents a comparison of widely used ML algorithms
and techniques for breast cancer prediction, namely Random Forest, kNN
(k-Nearest-Neighbor), and Naïve Bayes. The Wisconsin Diagnosis Breast Cancer dataset
was used as a training set to compare the performance of the different ML techniques
in terms of key metrics such as precision and accuracy. The reported results are strong
and can be used for detection and treatment.

1.3 Work & Methodology

1.3.1 Workflow:

1. Sending DICOM images to MongoDB.

2. Converting DICOM images to JPEG images.

3. Using the dataset to fine-tune a pre-trained ResNet-50 architecture.

4. Extracting the feature vectors from ResNet-50 and classifying them using a One-Class
SVM.

5. Serving the model using BentoML.

Figure 1.2: Workflow

1.3.2 Framework Used:

BentoML: BentoML is a machine learning model serving, management, and deployment
system. It aims to close the gap between data science and DevOps, allowing different
teams to produce prediction services in a fast, repeatable, and scalable manner.

Flask: Flask is a Python-based micro web framework. It is referred to as a
microframework because it does not require particular tools or libraries. It has no
database abstraction layer, form validation, or other components where third-party
libraries already provide common functions.

1.3.3 Tools Used:

MongoDB: MongoDB is a source-available, cross-platform, document-oriented database
program. It is a NoSQL database that works with JSON-like documents and optional
schemas. MongoDB is developed by MongoDB Inc. and is distributed under the Server
Side Public License.

REST API: Representational state transfer (REST) is a software architectural style that
makes use of a subset of HTTP. It is often used to build interactive web service
applications. A web service that follows these constraints is called RESTful.
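
As a rough illustration of what a RESTful prediction endpoint might look like in Flask, the sketch below exposes a single POST route. The route name /predict, the JSON field "features", and the placeholder score function are assumptions made for this example; the report does not describe the actual service interface.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def score(features):
    """Hypothetical stand-in for the real model: +1 if the mean is positive."""
    return 1 if sum(features) / len(features) > 0 else -1


@app.route("/predict", methods=["POST"])
def predict():
    # silent=True returns None instead of raising on a missing or bad JSON body.
    payload = request.get_json(silent=True) or {}
    features = payload.get("features")
    if not isinstance(features, list) or not features:
        # Reject malformed requests with a client-error status code.
        return jsonify({"error": "expected a non-empty 'features' list"}), 400
    return jsonify({"label": score(features)})
```

A client would POST a JSON body such as {"features": [0.5, 1.5]} to /predict and receive a JSON object with a "label" field in return.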

1.3.4 Architecture and Models Used:

ResNet-50: ResNet-50 is a convolutional neural network that is 50 layers deep. In our
case, we use a version of the network pre-trained on more than a million images from
the ImageNet database. The pre-trained network can classify images into 1000 object
categories, such as keyboard, mouse, pencil, and many animals. As a result, it has
learned rich feature representations for a wide range of images. It has an image input
size of 224-by-224 (Ashabb, 2020).

One-Class SVM: One-class SVM is an unsupervised algorithm that learns a decision
function for novelty detection: classifying new data as similar to or different from
the training set.
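
A minimal sketch of novelty detection with scikit-learn's OneClassSVM is given below. The synthetic 8-dimensional vectors and the kernel, gamma, and nu settings are assumptions chosen for illustration; they are not the report's data or hyperparameters.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-ins for feature vectors: training data from a single cluster.
train = rng.normal(loc=0.0, scale=1.0, size=(200, 8))

# nu is an upper bound on the fraction of training points treated as outliers.
clf = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
clf.fit(train)

similar = rng.normal(loc=0.0, scale=1.0, size=(20, 8))
different = rng.normal(loc=8.0, scale=1.0, size=(20, 8))

# predict returns +1 for points resembling the training set and -1 otherwise.
pred_similar = clf.predict(similar)
pred_different = clf.predict(different)
```

Points drawn far from the training cluster come back as -1, while points from the same distribution are mostly accepted as +1.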

1.3.5 Methodology:

1. The mammographic images are converted into JPEG format, and the images are divided
into train and test batches.

2. The weights of a pre-trained ResNet-50 model are taken and fine-tuned on the images
to obtain their feature vectors.

3. Once we have the feature vectors of the images, they are passed to a One-Class SVM,
and the SVM model is fitted on them.

4. Finally, the complete model is served with the help of BentoML.
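
Step 1 requires mapping raw DICOM pixel values, which usually exceed 8 bits, onto the 0-255 range that JPEG stores. The sketch below shows simple min-max scaling with NumPy; the helper name to_uint8 and the tiny sample array are hypothetical, and the report does not say which scaling was actually used.

```python
import numpy as np


def to_uint8(pixels):
    """Min-max scale a raw pixel array to the 0-255 range used by JPEG."""
    pixels = np.asarray(pixels, dtype=np.float64)
    lo, hi = pixels.min(), pixels.max()
    if hi == lo:
        # A constant image carries no contrast; map it to all zeros.
        return np.zeros(pixels.shape, dtype=np.uint8)
    scaled = (pixels - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)


# Stand-in for a 12-bit mammogram. With pydicom, this array would come from
# pydicom.dcmread(path).pixel_array, where path is a hypothetical file name.
raw = np.array([[0, 1024], [2048, 4095]], dtype=np.uint16)
image8 = to_uint8(raw)
print(image8.tolist())  # [[0, 64], [128, 255]]
```

The resulting array could then be written out with Pillow, for example Image.fromarray(image8).save("out.jpg"), where the output file name is again only an example.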

1.4 Results and Discussions

Finally, we implemented the entire technique and achieved an accuracy of 67 percent
with the available dataset, although this accuracy is not up to the mark. We took this
approach because the data is labeled with a single class. We have also developed a user
interface for the entire algorithm.

Figure 1.3: User Interface
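
The report does not state how the accuracy figure was computed. One plausible reading, assuming every held-out image is treated as cancerous and the One-Class SVM outputs +1 for images it accepts as similar to the training set, is the fraction of held-out images predicted +1. The function name and the sample predictions below are illustrative only.

```python
def one_class_accuracy(predictions):
    """Fraction of predictions equal to +1, i.e. accepted as the known class."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    accepted = sum(1 for p in predictions if p == 1)
    return accepted / len(predictions)


# Made-up predictions for four held-out images (not the report's data).
sample = [1, 1, -1, 1]
print(one_class_accuracy(sample))  # 0.75
```

Under this reading the number measures how often known-class images are accepted. It says nothing about how well healthy images would be rejected, since none are available in the dataset.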

1.5 SUMMARY AND CONCLUSION

The breast cancer data provided to us contains a single class of images, all assumed to
be cancerous. Since the images are uni-labeled, we followed a new approach: we
fine-tuned the architecture, extracted the feature vectors, and then used a One-Class
SVM to check similarity and determine whether an image is cancerous.

We read the data from MongoDB because the dataset is very large; using MongoDB, we can
compress the images and then load them over the network without downloading them
manually. The approach we used can also be applied to many other problems with
uni-labeled data.

REFERENCES
1. M. Ali (2020). Time Series Anomaly Detection with PyCaret.
https://towardsdatascience.com/time-series-anomaly-detection-with-pycaret-706a6e2b2427.

2. Ashabb (2020). Image recognition with RESNET50 model.
https://medium.com/@ashabb/image-recognition-with-model-resnet50-d89bce852c24.

3. A. Ayanzadeh (2018). Canny Edge Detection Method.
https://a-ayanzadeh.medium.com/canny-edge-detection-method-23a23b282ac0.

4. J. Brownlee (2017). How to Handle Missing Data with Python.
https://machinelearningmastery.com/handle-missing-data-python/.

5. S. Das and U. M. Cakmak, Hands-On Automated Machine Learning: A beginner’s
guide to building automated machine learning systems using AutoML and Python. Packt
Publishing Ltd, 2018.

6. J. Dias, P. Godinho, and P. Torres, Machine learning for customer churn prediction in
retail banking. In International Conference on Computational Science and Its
Applications. Springer, 2020.

7. M. Feurer and F. Hutter, Towards further automation in automl. In ICML AutoML
workshop. 2018.

8. Y. G. (2017). The 7 Steps of Machine Learning.
https://towardsdatascience.com/the-7-steps-of-machine-learning-2877d7e5548e.

9. K. Gautam, Indian currency detection using image recognition technique. In 2020
International Conference on Computer Science, Engineering and Applications (ICCSEA).
IEEE, 2020.

10. N. Gopee (). Classifying cifar-10 images using unsupervised feature & ensemble
learning.

11. B. Jyenis (2020). Anomaly Detection in Time Series Sensor Data.
https://towardsdatascience.com/anomaly-detection-in-time-series-sensor-data-86fd52e62538.

12. N. Laptev, S. Amizadeh, and I. Flint, Generic and scalable framework for automated
time-series anomaly detection. In Proceedings of the 21th ACM SIGKDD international
conference on knowledge discovery and data mining. 2015.

13. Y. Li and B. Wang, A study on customer churn of commercial banks based on
learning from label proportions. In 2018 IEEE International Conference on Data Mining
Workshops (ICDMW). IEEE, 2018.

14. V. Meel (). YOLOv3 Overview.

15. M. Munir, S. A. Siddiqui, A. Dengel, and S. Ahmed (2018). Deepant: A deep learning
approach for unsupervised anomaly detection in time series. IEEE Access, 7, 1991–2005.

16. M. Olafenwa and J. Olafenwa (2021). Custom Object Detection: Training and
Inference. https://imageai.readthedocs.io/en/latest/customdetection/index.html.

17. D. Pereira (2020). A brief introduction to AutoML.
https://towardsdatascience.com/a-brief-introduction-to-automl-4854c76877b6.

18. S. Pulagam (2020). A Simplified approach using PyCaret for Anomaly.
https://towardsdatascience.com/a-simplified-approach-using-pycaret-for-anomaly-detection-7d33aca3f0

19. E. Real, C. Liang, D. So, and Q. Le, Automl-zero: evolving machine learning
algorithms from scratch. In International Conference on Machine Learning. PMLR, 2020.

20. J. Redmon and A. Farhadi (2018). Yolov3: An incremental improvement. arXiv
preprint arXiv:1804.02767.

21. S. Sharma, A. Aggarwal, and T. Choudhury, Breast cancer detection using machine
learning algorithms. In 2018 International Conference on Computational Techniques,
Electronics and Mechanical Systems (CTEMS). IEEE, 2018.

22. P. K. Singh, A. K. Kar, Y. Singh, M. H. Kolekar, and S. Tanwar, Proceedings of
ICRIC 2019: Recent Innovations in Computing, volume 597. Springer Nature, 2019.

23. U. Subbiah, D. K. Kumar, S. K. Thangavel, and L. Parameswaran, An extensive
study and comparison of the various approaches to object detection using deep learning.
In 2020 International Conference on Smart Electronics and Communication (ICOSEC).
IEEE, 2020.

24. M. Teng, Anomaly detection on time series. In 2010 IEEE International Conference on
Progress in Informatics and Computing, volume 1. IEEE, 2010.

25. Z. Thorat, B. Sumanth, V. Agawane, and S. Bhosale (). Smart traffic
control using object detection based on image ai.

26. C. Zhang, D. Song, Y. Chen, X. Feng, C. Lumezanu, W. Cheng, J. Ni, B. Zong,
H. Chen, and N. V. Chawla, A deep neural network for unsupervised anomaly detection
and diagnosis in multivariate time series data. In Proceedings of the AAAI Conference
on Artificial Intelligence, volume 33. 2019.

27. J. Zhao and X.-H. Dang, Bank customer churn prediction based on support vector
machine: Taking a commercial bank’s vip customer churn as the example. In 2008
4th International Conference on Wireless Communications, Networking and Mobile
Computing. IEEE, 2008.
