1.1 INTRODUCTION
Breast cancer is the most commonly occurring type of cancer. It is estimated to affect
over 2 million women annually. Survival rates are reported to be higher than 55%
in most regions. There are no known prevention techniques for breast cancer.
However, early detection and diagnosis are essential in determining the chances of
survival. The following are the general methods used to diagnose breast cancer.
Breast ultrasound: A machine that utilises sound waves to make detailed pictures,
called sonograms, of areas inside the breast.
Diagnostic mammogram: If you have a problem in your breast, such as lumps, or if an
area of the breast looks abnormal on a screening mammogram, doctors may have you
get a diagnostic mammogram. This is a more detailed X-ray of the breast.
Magnetic resonance imaging (MRI): A body scan that uses a magnet linked to a
computer. The MRI scan produces detailed pictures of areas inside the breast.
Biopsy: This test removes tissue or fluid from the breast to be examined under a
microscope and subjected to further testing. There are various types of biopsies.
Breast cancer is the most frequently diagnosed cancer in women and the second leading
cause of cancer deaths. Doctors often use additional tests to find and diagnose breast
cancer. They may refer women to a breast specialist or a surgeon.
Objective: Our objective is to detect tumors in the breast using mammographic images.
We were provided with a dataset of 2216 mammographic images, all of which are
unlabeled but assumed to be cancerous. Using a model, we aim to determine whether a
given image from the unlabeled dataset contains a tumor or not.
1.2 Literature Review
In Gopee, the author works on an image classification task on the CIFAR-10 dataset,
where each image belongs to one of ten classes. The classes are mutually exclusive and
are mostly objects and animals. The images are small (32 × 32 pixels), uniform in size
and shape, and RGB colored. The author implements a proposed image preprocessing
scheme to learn and extract the salient features of the images. The method was shown
to improve classification performance significantly, with a reported improvement of
over 15 percent over the baseline (i.e., without preprocessing). The author further
experiments with different parameters and settings of the proposed technique to tune
the preprocessing scheme, and with a variety of linear classifiers on the preprocessed
images, finding that a simple SVM classifier with a linear kernel performs best.
Finally, the author experiments with ensemble learning by combining a simple SVM
with a multinomial logistic regression. The ensemble improves on the simple SVM, but
at a high computational cost.
In Sharma et al. (2018), the paper presents a comparison of widely used ML algorithms
and techniques commonly applied to breast cancer prediction, namely Random Forest,
kNN (k-Nearest-Neighbor), and Naïve Bayes. The Wisconsin Diagnosis Breast Cancer
data set was used as a training set to compare the performance of the different ML
techniques in terms of key metrics such as precision and accuracy. The reported results
are strong and can be used for detection and treatment.
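For reference, the two metrics named above can be computed from confusion-matrix counts. The following is a minimal sketch; the function names and the example counts are hypothetical and are not taken from Sharma et al. (2018).

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision(tp: int, fp: int) -> float:
    """Fraction of positive (e.g. malignant) predictions that are truly positive."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical counts for illustration only (not results from any paper).
example = {"tp": 40, "tn": 50, "fp": 5, "fn": 5}
example_accuracy = accuracy(**example)
example_precision = precision(example["tp"], example["fp"])
```

Accuracy alone can be misleading when one class dominates, which is why it is usually reported alongside precision.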
1.3.1 WorkFlow:
Figure 1.2: Workflow
REST API: Representational State Transfer (REST) is a software architectural style that
makes use of a subset of HTTP. It is often used to build interactive web service applications.
RESTful refers to a web service that conforms to the REST constraints.
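As an illustration of the REST style, the sketch below exposes a prediction as a resource addressed by URL and returned as JSON, using only Python's standard library. The URL layout (`/images/<image_id>/prediction`), the names `route`, `classify_stub`, and `PredictHandler`, and the placeholder response are all assumptions made for this example; the report does not specify the actual endpoints.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def classify_stub(image_id: str) -> dict:
    """Hypothetical placeholder for the real model call."""
    return {"image_id": image_id, "label": "unknown"}


def route(method: str, path: str) -> Tuple[int, dict]:
    """Map an HTTP method and path to a (status code, JSON payload) pair."""
    parts = path.strip("/").split("/")
    if (method == "GET" and len(parts) == 3
            and parts[0] == "images" and parts[2] == "prediction"):
        return 200, classify_stub(parts[1])
    return 404, {"error": "not found"}


class PredictHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper around route(); keeps no state between requests."""

    def do_GET(self):
        status, payload = route("GET", self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the operating system pick a free port."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling `make_server(8000).serve_forever()` would serve the endpoint on port 8000. Keeping the routing logic in a plain function makes it possible to check the behaviour without opening a socket.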
1.3.5 Methodology:
1. The mammographic images are converted to JPEG format, and the images are then
divided into train and test batches.
2. The weights of a pre-trained ResNet-50 model are loaded and fine-tuned on the
images to obtain feature vectors for them.
3. Once we have the feature vectors of the images, these vectors are given to a one-class
SVM, and the SVM model is fitted with the feature vectors.
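The steps above can be sketched as follows. This is a minimal illustration, not the project's code: random vectors stand in for the ResNet-50 feature vectors of step 2, scikit-learn's `OneClassSVM` is assumed as the one-class SVM implementation, and the helper `split_train_test`, the 16-dimensional features, and the `nu` and `gamma` values are hypothetical choices.

```python
import random

import numpy as np
from sklearn.svm import OneClassSVM


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of `items` and split it into train and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for step 2: random vectors replace real ResNet-50 features.
rng = np.random.default_rng(0)
features = rng.normal(loc=0.0, scale=1.0, size=(200, 16))

# Step 1 (split), applied here to row indices of the feature matrix.
train_idx, test_idx = split_train_test(range(len(features)))
train_features = features[train_idx]
test_features = features[test_idx]

# Step 3: fit a one-class SVM on the training feature vectors only.
# nu is an upper bound on the fraction of training points treated as outliers.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.1)
model.fit(train_features)

# predict() returns +1 for points resembling the training class, -1 otherwise.
test_predictions = model.predict(test_features)

# A point far from every training vector should be flagged as an outlier.
far_point = np.full((1, 16), 10.0)
far_prediction = model.predict(far_point)
```

Because the model is fitted on a single class, a prediction of +1 only means "similar to the training images"; it is the assumption that all training images are cancerous that turns this into a cancer/no-cancer decision.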
Figure 1.3: User Interface
The breast cancer data provided to us contains a single class of images, all assumed to
be cancerous. Since the images are uni-labeled, we followed a new approach: we
fine-tuned the architecture, extracted the feature vectors, and then used a One-Class
SVM to check similarity and determine whether an image is cancerous.
We read the data from MongoDB because the dataset is very large. Using MongoDB,
we can compress the images and then access them over the network without
downloading them physically. The approach we used can also be applied to
many other problem statements with uni-labeled data.
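The storage idea can be sketched with the standard library alone. The example below packs image bytes into a dictionary shaped like a MongoDB document and recovers them again; the use of `zlib`, the field names (`_id`, `size`, `data`), and the helper names are assumptions for illustration, since the report does not say how the images were compressed or stored.

```python
import zlib


def to_document(image_id: str, raw: bytes) -> dict:
    """Pack image bytes into a dict shaped like a MongoDB document."""
    return {"_id": image_id, "size": len(raw), "data": zlib.compress(raw)}


def from_document(doc: dict) -> bytes:
    """Recover the original bytes from a stored document."""
    raw = zlib.decompress(doc["data"])
    if len(raw) != doc["size"]:
        raise ValueError("size mismatch after decompression")
    return raw


# Placeholder bytes standing in for the contents of an image file.
fake_image = bytes(range(256)) * 64
doc = to_document("image-0001", fake_image)
restored = from_document(doc)
```

With the `pymongo` driver, such a dictionary could be written with `collection.insert_one(doc)` and read back with `collection.find_one({"_id": "image-0001"})`. A single MongoDB document is limited to 16 MB, so larger files are normally stored through GridFS instead. JPEG data is already compressed, so the gain from a general-purpose compressor on real images is likely to be much smaller than on this placeholder.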
REFERENCES
1. M. Ali (2020). Time Series Anomaly Detection with PyCaret.
https://towardsdatascience.com/time-series-anomaly-detection-with-pycaret-706a6e2b2427.
6. J. Dias, P. Godinho, and P. Torres, Machine learning for customer churn prediction in
retail banking. In International Conference on Computational Science and Its
Applications. Springer, 2020.
10. N. Gopee (). Classifying cifar-10 images using unsupervised feature & ensemble
learning.
12. N. Laptev, S. Amizadeh, and I. Flint, Generic and scalable framework for automated
time-series anomaly detection. In Proceedings of the 21th ACM SIGKDD international
conference on knowledge discovery and data mining. 2015.
13. Y. Li and B. Wang, A study on customer churn of commercial banks based on
learning from label proportions. In 2018 IEEE International Conference on Data Mining
Workshops (ICDMW). IEEE, 2018.
15. M. Munir, S. A. Siddiqui, A. Dengel, and S. Ahmed (2018). Deepant: A deep learning
approach for unsupervised anomaly detection in time series. IEEE Access, 7, 1991–2005.
19. E. Real, C. Liang, D. So, and Q. Le, Automl-zero: evolving machine learning
algorithms from scratch. In International Conference on Machine Learning. PMLR, 2020.
21. S. Sharma, A. Aggarwal, and T. Choudhury, Breast cancer detection using machine
learning algorithms. In 2018 International Conference on Computational Techniques,
Electronics and Mechanical Systems (CTEMS). IEEE, 2018.
24. M. Teng, Anomaly detection on time series. In 2010 IEEE International Conference on
Progress in Informatics and Computing, volume 1. IEEE, 2010.
27. J. Zhao and X.-H. Dang, Bank customer churn prediction based on support vector
machine: Taking a commercial bank’s vip customer churn as the example. In 2008
4th International Conference on Wireless Communications, Networking and Mobile
Computing. IEEE, 2008.