Practical No.03
Aim: To extract features from given data set and establish training data.
Objectives:
Package Used:- Python 3
Problem Definition:-
Diabetes is a disease that occurs when blood glucose, also called blood sugar, is too
high. We work with the Pima Indians Diabetes Dataset (PIDD). PIDD consists of several medical
parameters and one dependent (outcome) parameter with binary values. The dataset covers
female patients and is described as follows:
9 columns, with 8 independent parameters and one outcome parameter, and 768 uniquely identified
observations, of which 268 are positive for diabetes (1) and 500 are negative for diabetes (0).
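The class balance described above can be checked with a few lines of pandas. This is only a minimal sketch: class_balance is a hypothetical helper name, and the label column is rebuilt here from the stated counts (500 zeros, 268 ones) rather than read from the actual file.

```python
import pandas as pd


def class_balance(labels: pd.Series) -> pd.DataFrame:
    """Return the count and percentage of each class in a label column."""
    counts = labels.value_counts().sort_index()
    percent = (counts / len(labels) * 100).round(1)
    return pd.DataFrame({"count": counts, "percent": percent})


# Stand-in for the outcome column, built from the counts stated above:
# 500 negative (0) and 268 positive (1) observations.
labels = pd.Series([0] * 500 + [1] * 268, name="Label")

summary = class_balance(labels)
print(summary)
```

On the real data, the same helper could be applied to the last column of the loaded DataFrame to confirm that the file matches this description.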
Target Variable:
Label is the target variable.
Input data
1. Dataset given in the form of a .csv (comma-separated values) file.
Sample rows (row number, 8 independent parameters, outcome):
    4   1   89   66   23   94   28.1   0.167   21   0
    7   3   78   50   32   88   31     0.248   26   1
   10   8  125   96    0    0    0     0.232   54   1
   12  10  168   74    0    0   38     0.537   34   1
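As an illustration of how such rows are read, the sketch below loads the four sample rows from an in-memory CSV string. Several things here are assumptions: the first value in each row is taken to be a row number (given the hypothetical name Row), the feature names follow those commonly used for PIDD, and the outcome is called Label as stated above. The actual file may use different headers or none at all.

```python
import io

import pandas as pd

# Assumed column names (not given in the source).
columns = [
    "Row", "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age", "Label",
]

# The four sample rows, written out as comma-separated values.
csv_text = """4,1,89,66,23,94,28.1,0.167,21,0
7,3,78,50,32,88,31,0.248,26,1
10,8,125,96,0,0,0,0.232,54,1
12,10,168,74,0,0,38,0.537,34,1
"""

# header=None tells pandas the text has no header line of its own.
sample = pd.read_csv(io.StringIO(csv_text), header=None, names=columns)
print(sample.shape)
print(sample.dtypes)
```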
Program:-
# Load libraries
import pandas as pd
# Import train_test_split function
from sklearn.model_selection import train_test_split
# load dataset
data=pd.read_csv(r"E:\pythonProject_experiment\pima-indians-diabetes.csv")
print(data)
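# split dataset into features and target variable
# The last column is the outcome; the 8 columns before it are the features.
# Negative indices keep this correct whether or not the file has a leading
# row-number column.
X = data.iloc[:, -9:-1]
Y = data.iloc[:, -1]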
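The program above stops after separating the features from the target. A minimal sketch of the train/test split that train_test_split performs is given below; it runs on a small synthetic DataFrame so that it is self-contained. The 70/30 ratio (test_size=0.3), random_state=1, the stratify argument, and the column names are all assumptions, not values taken from the original program.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: 8 feature columns and a binary label.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.random((100, 8)), columns=[f"f{i}" for i in range(1, 9)])
data["Label"] = [0] * 65 + [1] * 35

X = data.iloc[:, :-1]  # all columns except the last (features)
Y = data.iloc[:, -1]   # last column (target)

# Hold out 30% of the rows for testing (assumed ratio). stratify=Y keeps the
# class proportions approximately equal in the training and testing parts.
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=1, stratify=Y
)

print(X_train.shape, X_test.shape)
```

Fixing random_state makes the split reproducible between runs, which is convenient when results from a practical need to be compared.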
Conclusion:-
In this practical, the diabetes dataset is divided into training and testing data.
The “train_test_split” function from the “sklearn” library is used to separate the
dataset.