
The remaining columns (Pregnancies, PlasmaGlucose, DiastolicBloodPressure, and so on) are the features we will use to predict the Diabetic label.


Let’s separate the features from the labels - we’ll call the features X and the label y:

[ ]: # Separate features and labels
     features = ['Pregnancies', 'PlasmaGlucose', 'DiastolicBloodPressure', 'TricepsThickness',
                 'SerumInsulin', 'BMI', 'DiabetesPedigree', 'Age']
     label = 'Diabetic'
     X, y = diabetes[features].values, diabetes[label].values

     for n in range(0, 4):
         print("Patient", str(n+1), "\n Features:", list(X[n]), "\n Label:", y[n])

Now let’s compare the feature distributions for each label value.

[ ]: from matplotlib import pyplot as plt
     %matplotlib inline

     features = ['Pregnancies', 'PlasmaGlucose', 'DiastolicBloodPressure', 'TricepsThickness',
                 'SerumInsulin', 'BMI', 'DiabetesPedigree', 'Age']

     for col in features:
         # Box plot of this feature, grouped by label value
         diabetes.boxplot(column=col, by='Diabetic', figsize=(6, 6))
         plt.title(col)
         plt.show()

For some of the features, there’s a noticeable difference in the distribution for each label value. In
particular, Pregnancies and Age show markedly different distributions for diabetic patients than
for non-diabetic patients. These features may help predict whether or not a patient is diabetic.
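Beyond eyeballing box plots, a quick numeric check is to compare the mean of each feature grouped by the label value. The sketch below uses a small synthetic frame as a stand-in for the diabetes data (the values are illustrative, not from the real dataset):

```python
import pandas as pd

# Synthetic stand-in for the diabetes DataFrame (illustrative values only)
diabetes = pd.DataFrame({
    'Pregnancies': [0, 1, 5, 7, 2, 8],
    'Age':         [22, 25, 45, 50, 30, 55],
    'Diabetic':    [0, 0, 1, 1, 0, 1]
})

# Mean of each feature for each label value - a large gap between the two
# rows suggests the feature may help discriminate the classes
print(diabetes.groupby('Diabetic')[['Pregnancies', 'Age']].mean())
```

A wide separation between the per-class means echoes what the box plots show visually for Pregnancies and Age.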

1.1.2 Split the data


Our dataset includes known values for the label, so we can use this to train a classifier so that it
finds a statistical relationship between the features and the label value; but how will we know if
our model is any good? How do we know it will predict correctly when we use it with new data
that it wasn’t trained with? Well, we can take advantage of the fact that we have a large dataset with
known label values, use only some of it to train the model, and hold back some to test the trained
model - enabling us to compare the predicted labels with the already known labels in the test set.
In Python, the scikit-learn package contains a large number of functions we can use to build a
machine learning model - including a train_test_split function that ensures we get a statistically
random split of training and test data. We’ll use that to split the data into 70% for training and
hold back 30% for testing.

[ ]: from sklearn.model_selection import train_test_split

     # Split data 70%-30% into training set and test set
     X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30,
                                                         random_state=0)
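It's worth sanity-checking that the split produced the proportions we asked for. A minimal sketch with synthetic arrays standing in for the real X and y (which come from the earlier cell):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 100 samples, 3 features each
X = np.arange(300).reshape(100, 3)
y = np.array([0, 1] * 50)

# Same call as above: 70% training, 30% held back for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30,
                                                    random_state=0)

print(len(X_train), len(X_test))  # 70 30
```

Fixing `random_state` makes the split reproducible, so re-running the notebook always yields the same training and test rows.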
