
Experiment – 2.

Student Name: Kartik Bhardwaj
UID: 20BCS7251
Branch: BE-CSE
Section/Group: 20BCS_WM_702/A
Semester: 5th
Subject Name: ML Lab

Aim: Implement SVM on any data set.

Software/Hardware Requirements: Windows 7 or later.

Tools to be used:
1. Anaconda Jupyter Notebook
2. Kaggle

Code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Data extraction: load the Iris CSV (adjust the path for your machine)
iris=pd.read_csv("C:\\Users\\Asus\\Downloads\\iris.csv")

# Checking the dataset and doing EDA


iris.head()
print(iris.shape)
print(iris.info())
print(iris.columns)

# Creating a pairplot to visualize the similarities and especially the differences between the species
sns.pairplot(data=iris, hue='species', palette='Set2')

#Train Test Split


from sklearn.model_selection import train_test_split

# Separating the independent variables from dependent variables


x=iris.iloc[:,:-1]
y=iris.iloc[:,4]
x_train,x_test, y_train, y_test=train_test_split(x,y,test_size=0.30)
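The split above draws a different random sample on every run, so the reported metrics will vary between runs. A minimal sketch of a repeatable, class-balanced split follows. It assumes scikit-learn's bundled Iris data as a stand-in for the CSV, and the names `X_demo`, `y_demo`, and the seed value 42 are illustrative choices, not part of the original experiment.

```python
from collections import Counter

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: the bundled Iris data (150 rows, 50 per class) replaces the CSV.
X_demo, y_demo = load_iris(return_X_y=True)

# stratify keeps the class proportions the same in both parts;
# random_state makes the split repeatable across runs.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.30, stratify=y_demo, random_state=42
)

train_counts = Counter(y_tr.tolist())
test_counts = Counter(y_te.tolist())
print("train:", dict(train_counts))
print("test:", dict(test_counts))
```

With 50 samples per class and a 30% test share, each class contributes the same number of rows to the test set, which keeps the per-class figures in the classification report comparable.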

#Training and Fitting the model


from sklearn.svm import SVC
model=SVC()

model.fit(x_train, y_train)

#Predictions from the trained model


pred=model.predict(x_test)

# Importing the classification report and confusion matrix


from sklearn.metrics import classification_report, confusion_matrix

print(confusion_matrix(y_test,pred))

print(classification_report(y_test, pred))
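To make the two printed reports easier to read, the sketch below shows how overall accuracy and per-class recall can be recovered from a confusion matrix. The labels in `y_true` and `y_hat` are hypothetical values made up for illustration; they are not results from this experiment.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels, chosen only to illustrate the layout of the matrix.
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_hat = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

# Rows are true classes, columns are predicted classes (labels in sorted order).
cm_demo = confusion_matrix(y_true, y_hat)

# Correct predictions sit on the diagonal, so accuracy is trace / total.
acc_from_cm = np.trace(cm_demo) / cm_demo.sum()

# Recall per class is the diagonal entry divided by its row sum.
recall_per_class = np.diag(cm_demo) / cm_demo.sum(axis=1)

print(cm_demo)
print("accuracy:", acc_from_cm)
print("recall per class:", recall_per_class)
```

Off-diagonal entries show which classes are confused with each other; in this made-up example one versicolor sample is predicted as virginica.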

sns.pairplot(iris)

#Encoding the categorical column


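# The label is the last column; it is addressed by position so the encoding stays
# consistent with the 'species' column name used in the pairplot above.
iris = iris.replace({iris.columns[-1]: {"Iris-setosa":1,"Iris-versicolor":2, "Iris-virginica":3}})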
#Visualize the new dataset
iris.head()

plt.figure(1)
sns.heatmap(iris.corr())
plt.title('Correlation On iris Classes')

X = iris.iloc[:,:-1]
y = iris.iloc[:, -1].values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

#Create the SVM model


from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
#Fit the model for the data

classifier.fit(X_train, y_train)

#Make the prediction


y_pred = classifier.predict(X_test)

from sklearn.metrics import confusion_matrix


cm = confusion_matrix(y_test, y_pred)
print(cm)

from sklearn.model_selection import cross_val_score


accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10)
print("Accuracy: {:.2f} %".format(accuracies.mean()*100))
print("Standard Deviation: {:.2f} %".format(accuracies.std()*100))
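As a rough illustration of what the 10-fold cross-validation above computes, the sketch below repeats it by hand. It assumes scikit-learn's bundled Iris data in place of the CSV, and relies on the documented behaviour that an integer `cv` with a classifier uses stratified folds without shuffling. The names `clf`, `auto_scores`, and `manual_scores` are illustrative.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.svm import SVC

# Assumption: the bundled Iris data replaces the CSV used in the experiment.
X_demo, y_demo = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=0
)

clf = SVC(kernel="linear", random_state=0)

# One call: ten accuracy scores, one per held-out fold.
auto_scores = cross_val_score(clf, X_tr, y_tr, cv=10)

# The same procedure written out: fit on nine folds, score on the tenth.
manual_scores = []
for fit_idx, val_idx in StratifiedKFold(n_splits=10).split(X_tr, y_tr):
    fold_model = clone(clf).fit(X_tr[fit_idx], y_tr[fit_idx])
    manual_scores.append(fold_model.score(X_tr[val_idx], y_tr[val_idx]))
manual_scores = np.array(manual_scores)

print("mean accuracy: {:.2f} %".format(auto_scores.mean() * 100))
print("std deviation: {:.2f} %".format(auto_scores.std() * 100))
```

Only the training portion is cross-validated, so the held-out test set stays untouched for the final confusion matrix.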
Output:

Learning Outcomes:

1. SVM works relatively well when there is a clear margin of separation between classes.
2. SVM is effective in high-dimensional spaces.
3. SVM remains effective when the number of dimensions is greater than the number of samples.
4. SVM is relatively memory efficient, because the decision function uses only a subset of the training points (the support vectors).
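The memory-efficiency outcome can be checked by counting how many training points a fitted model keeps as support vectors. The sketch below is a minimal example under stated assumptions: it uses scikit-learn's bundled Iris data rather than the CSV, fits on the full data set for simplicity, and the names `X_demo`, `y_demo`, and `n_sv` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Assumption: the bundled Iris data replaces the CSV used in the experiment.
X_demo, y_demo = load_iris(return_X_y=True)

clf = SVC(kernel="linear").fit(X_demo, y_demo)

# support_vectors_ holds the training points that define the decision function;
# n_support_ gives the count of support vectors for each class.
n_sv = clf.support_vectors_.shape[0]
print(f"{n_sv} of {X_demo.shape[0]} training samples are support vectors")
print("per class:", clf.n_support_)
```

Only the support vectors are needed at prediction time, so the stored model is usually smaller than the training set when the classes are well separated.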
