I HAVE HEARD THERE ARE
TROUBLES OF MORE THAN ONE KIND.
SOME COME FROM AHEAD,
SOME COME FROM BEHIND.
BUT I’VE BOUGHT A BIG BAT,
AND I’M ALL READY TO SEE YOU.
NOW MY TROUBLES ARE GOING TO HAVE
TROUBLES WITH ME.
BREAST CANCER
PREDICTION
USING DATA SCIENCE (ML & PYTHON)

Under the guidance of:
Ms. Poonam Sharma
Ms. Ashima Narang
Assistant Professor
Department of Computer Science And Engineering, ASET
Amity University, Haryana

BY:- PALAK ARORA
BTECH CSE Sec A
A50105218032
INTRODUCTION
Breast cancer!!
Breast cancer is cancer that forms in the cells of the breasts.

Breast cancer is the most common cancer diagnosed in women. It occurs in both men and women, but it is far more common in women.

Breast tumors are of two types: benign and malignant.

The WHO estimates that breast cancer affects over 2.1 million women each year and causes the greatest number of cancer-related deaths among women.
Early diagnosis can increase the survival rate up to 95%.
PROBLEM STATEMENT

Breast cancer prediction:


Diagnosis of breast cancer is performed when an abnormal lump is found
(from self-examination or x-ray) or a tiny speck of calcium is seen (on an
x-ray). After a suspicious lump is found, the doctor will conduct a
diagnosis to determine whether it is Malignant or Benign.

OBJECTIVE:-
On the basis of the given attributes, we predict whether a person's tumor is malignant or benign using machine learning, and analyse which machine learning algorithm works best for this problem.
TECHNOLOGY
MACHINE LEARNING WITH PYTHON
ML ALGORITHM
CLASSIFIERS :-
• DECISION TREE
• RANDOM FOREST
• SVM
REGRESSION :-
• LOGISTIC REGRESSION
DATASET
The breast cancer dataset was obtained from the
University of Wisconsin Hospitals, Madison from Dr.
William H. Wolberg.
ATTRIBUTES:
• diagnosis: The diagnosis of breast tissues (1 = malignant, 0 = benign)
• mean_radius: mean of distances from center to points on the perimeter
• mean_texture: standard deviation of gray-scale values
• mean_perimeter: mean size of the core tumor
• mean_area
• mean_smoothness: mean of local variation in radius lengths

ALSO IT HAS TARGET ATTRIBUTE (SELF ADDED) WHICH TELLS DIAGNOSIS IN CATEGORICAL FORM (M or B)
Basic steps of Machine Learning :
• Exploring the dataset.
• Data pre-processing and cleaning.
• Breaking the dataset into input (x) and output (y).
• Splitting x and y into training and testing data: X_train, y_train and X_test, y_test.
• Applying the regression/classification models on the dataset.
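
The input/output and train/test steps above can be sketched in plain Python. This is only an illustration, not the project's actual code: the rows are made-up values laid out in the column order of the DATASET slide, and `split_xy` and `simple_train_test_split` are hypothetical helper names. In practice, scikit-learn's `train_test_split` is the usual tool for the second step.

```python
import random

# Column order assumed from the DATASET slide; diagnosis is the last column.
FEATURES = ["mean_radius", "mean_texture", "mean_perimeter",
            "mean_area", "mean_smoothness"]

# Made-up rows for illustration only (1 = malignant, 0 = benign).
rows = [
    (17.9, 10.4, 122.8, 1001.0, 0.118, 1),
    (20.6, 17.8, 132.9, 1326.0, 0.085, 1),
    (19.7, 21.3, 130.0, 1203.0, 0.110, 1),
    (13.5, 14.4, 87.5, 566.3, 0.098, 0),
    (13.1, 15.7, 85.6, 520.0, 0.108, 0),
    (9.5, 12.4, 60.3, 273.9, 0.102, 0),
    (15.3, 14.3, 102.4, 704.4, 0.107, 1),
    (12.0, 16.6, 78.0, 449.3, 0.103, 0),
    (11.4, 20.4, 77.6, 386.1, 0.143, 1),
    (8.2, 16.8, 51.7, 201.9, 0.086, 0),
]


def split_xy(rows):
    """Break the dataset into input x (features) and output y (diagnosis)."""
    x = [list(row[:-1]) for row in rows]
    y = [row[-1] for row in rows]
    return x, y


def simple_train_test_split(x, y, test_size=0.2, seed=0):
    """Shuffle row indices reproducibly, then hold out a test fraction."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(x) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    X_train = [x[i] for i in train_idx]
    X_test = [x[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    return X_train, X_test, y_train, y_test


x, y = split_xy(rows)
X_train, X_test, y_train, y_test = simple_train_test_split(x, y, test_size=0.2)
print(len(X_train), "training rows,", len(X_test), "testing rows")
```

Fixing the seed keeps the split reproducible, so different models can be compared on the same held-out rows.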
Confusion matrix
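
As a minimal sketch of how a confusion matrix is built for this two-class problem, the snippet below counts each (actual, predicted) pair. The label lists are hypothetical and are not results from this project; the function name and the [[TN, FP], [FN, TP]] layout are choices made for the illustration.

```python
def confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for labels 0 = benign, 1 = malignant."""
    matrix = [[0, 0], [0, 0]]
    for actual, predicted in zip(y_true, y_pred):
        # Row index is the actual class, column index is the predicted class.
        matrix[actual][predicted] += 1
    return matrix


# Hypothetical test labels and model predictions, for illustration only.
y_test = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]

cm = confusion_matrix(y_test, y_pred)
(tn, fp), (fn, tp) = cm
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```

For a diagnosis task, the FN cell (malignant tumors predicted as benign) is usually the count to watch most closely.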
CLASSIFICATION REPORT :-
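
A classification report summarises precision, recall and F1-score per class, together with overall accuracy. The sketch below computes these for the malignant class from hypothetical labels (not this project's results); `classification_metrics` is an illustrative helper name, not a library function.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and accuracy for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = correct / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical test labels and predictions (1 = malignant, 0 = benign).
y_test = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]

report = classification_metrics(y_test, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.2f}")
```

Computing the same metrics for every candidate model on the same test split gives a like-for-like basis for choosing among them.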
CONCLUSION:-
BASED ON ACCURACY, CONFUSION MATRIX AND CLASSIFICATION REPORT, WE CAN SAY THAT THE RANDOM FOREST CLASSIFIER IS BEST FOR OUR PROBLEM.
REFERENCES:
• https://www.kaggle.com
• https://en.wikipedia.org/wiki/Breast_cancer
• https://en.wikipedia.org/wiki/Machine_learning
• https://www.kaggle.com/merishnasuwal/breast-cancer-prediction-dataset
THANK YOU