Professional Documents
Culture Documents
WHO estimated that over 2.1 million women are affected by Breast
cancer each year also causes the greatest number of cancer related deaths
among women.
Early diagnosis can increase the survival rate up to 95%.
PROBLEM STATEMENT
OBJECTIVE:-
On The Basis Of Given Attributes We’ll predict whether the person
tumor is Malignant or Benign using Machine Learning and analyse
which machine learning algorithm will best for our model
TECHNOLOGY
MACHINE LEARNING WITH PYTHON
ML ALGORITHM
CLASSIFIERS :-
DECISION TREE
RANDOM FOREST
SVM
REGRESSION :-
LOGISTIC REGRESSION
DATASET
The breast cancer dataset was obtained from the
University of Wisconsin Hospitals, Madison from Dr.
William H. Wolberg.
ATTRIBUTES:
• diagnosis: The diagnosis of breast tissues (1 =
malignant, 0 = benign)
• mean_radius: mean of distances from center to
ALSO IT HAS TARGET ATTRIBUTE
points on the perimeter
(SELF ADDED) WHICH TELLS • mean_texture: standard deviation of gray-scale
DIAGNOSIS IN CATEGORIAL FORM (M values
or B) • mean_perimeter: mean size of the core tumor
• mean_area
• mean_smoothness: mean of local variation in
radius lengths
Basic steps of Machine Learning :
Exploring the dataset Data pre-processing and Cleaning.
Breaking dataset in input : x and output : y.
Splitting x and y in training and testing data.