
Machine Learning 21BEC505

Experiment-5
Objective: Implementation of Naïve Bayes Classifier
Task #1
Code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv(r"E:\Jay\NIRMA\Sem6\ML\Exp5\Naive-Bayes-Classification-Data.csv")
x = df.iloc[:,:-1].values
y = df.iloc[:,2].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 42)
model = GaussianNB()
model1 = model.fit(x_train,y_train)
y_pred = model.predict(x_test)
print(y_pred)
print('\n')
accuracy = accuracy_score(y_test, y_pred)*100
print(accuracy)
print('\n')

sns.scatterplot(x="glucose",y="bloodpressure", data=df, hue="diabetes").set(title="Full Data")


plt.show()
sns.scatterplot(x=x_test[:,0],y=x_test[:,1], hue=y_test).set(title="Testing Data")
plt.show()
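For reference, what GaussianNB does when it fits can be sketched by hand: it estimates a per-class prior, and a per-class mean and variance for each feature, then scores each class by its log prior plus the summed log Gaussian likelihoods (a tiny `var_smoothing` term is added to the variances for numerical stability). The toy arrays below are made up for illustration and are not the experiment's dataset.

```python
import numpy as np

# Two well-separated toy clusters, one per class (illustrative only).
X = np.array([[1.0, 2.0], [1.2, 1.9], [3.0, 4.0], [3.2, 4.1]])
y = np.array([0, 0, 1, 1])

classes = np.unique(y)
priors = {c: np.mean(y == c) for c in classes}                 # P(class)
means = {c: X[y == c].mean(axis=0) for c in classes}           # per-feature mean
variances = {c: X[y == c].var(axis=0) + 1e-9 for c in classes} # + var_smoothing

def predict(x):
    # Pick the class with the largest joint log-probability.
    scores = {}
    for c in classes:
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * variances[c])
                                + (x - means[c]) ** 2 / variances[c])
        scores[c] = np.log(priors[c]) + log_lik
    return max(scores, key=scores.get)

print(predict(np.array([1.1, 2.0])))  # near class 0's cluster -> 0
print(predict(np.array([3.1, 4.0])))  # near class 1's cluster -> 1
```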

Output:

Task #2
Code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import preprocessing

weather = ['sunny','sunny','overcast','rainy','rainy','rainy','overcast','sunny','sunny','rainy','sunny','overcast','overcast','rainy']
temp =['hot','hot','hot','mild','cool','cool','cool','mild','cool','mild','mild','mild','hot','mild']
play=['no','no','yes','yes','yes','no','yes','no','yes','yes','yes','yes','yes','no']
le = preprocessing.LabelEncoder()
weather_encoded = le.fit_transform(weather)
temp_encoded = le.fit_transform(temp)
label = le.fit_transform(play)
print("Weather Encoded : ",weather_encoded)
print('\n')
print("Temperature Encoded : ",temp_encoded)
print('\n')
print("play: ",label)
print('\n')
features = list(zip(weather_encoded,temp_encoded))
print(features)
print('\n')
model = GaussianNB()
model.fit(features,label)
predicted = model.predict([[0,2]]) # 0 : overcast , 2 : mild
print("Predicted value : ",predicted)
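A note on the encoded values used in the prediction above: `LabelEncoder` assigns integer codes in sorted (alphabetical) order, which is why 0 means overcast and 2 means mild. A quick self-contained check of the mapping, using the same category names as the task:

```python
from sklearn import preprocessing

# Weather categories: sorted order is overcast < rainy < sunny.
le = preprocessing.LabelEncoder()
le.fit(['sunny', 'overcast', 'rainy'])
print(list(le.classes_))       # ['overcast', 'rainy', 'sunny'] -> codes 0, 1, 2

# Temperature categories: sorted order is cool < hot < mild.
le_temp = preprocessing.LabelEncoder()
le_temp.fit(['hot', 'mild', 'cool'])
print(list(le_temp.classes_))  # ['cool', 'hot', 'mild'] -> codes 0, 1, 2
```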

Output:

Task #3
Code:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

wine = datasets.load_wine()
print("Features : ",wine.feature_names)
print('\n')
print("Labels : ",wine.target_names)
print('\n')
print("Data shape : ",wine.data.shape)

print('\n')
print("Top 5 records",wine.data[0:5],sep="\n")
print('\n')
print("0:Class_0, 1:class_1, 2:class_2 \n",wine.target)
print('\n')
X_train, X_test, y_train, y_test = train_test_split(wine.data,wine.target,test_size = 1/3,random_state = 109)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy : ",accuracy*100)
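Overall accuracy hides which of the three wine classes get confused with each other. As an optional check, a confusion matrix for the same split and model can be computed as sketched below (this re-creates the task's setup rather than extending it):

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix

# Same split as the task above.
wine = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=1/3, random_state=109)

model = GaussianNB().fit(X_train, y_train)
cm = confusion_matrix(y_test, model.predict(X_test))
print(cm)  # rows = true class, columns = predicted class
print(cm.trace(), "of", cm.sum(), "test samples classified correctly")
```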

Output:

Exercise:

1. Apply Naïve Bayes classifier on the Iris Flower Species Dataset.

The Iris Flower Dataset involves predicting the flower species given measurements of iris flowers. It is
a multiclass classification problem. The number of observations for each class is balanced. There are
150 observations with 4 input variables and 1 output variable. The variable names are as follows:

- Sepal length in cm
- Sepal width in cm
- Petal length in cm
- Petal width in cm
- Class
Code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv(r"iris.data", header=None)  # iris.data has no header row
print("Features and their labels:")
print(df.iloc[:,:].values)
X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size = 0.25, random_state = 42)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy : ",accuracy)
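As an optional extension, a single 75/25 split can be optimistic or pessimistic by chance; k-fold cross-validation averages the score over several splits. The sketch below uses scikit-learn's bundled copy of the Iris dataset (`load_iris`) instead of the local `iris.data` file, so it runs without any files on disk:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# 5-fold cross-validation: fit on 4/5 of the data, score on the
# remaining 1/5, rotating through all five folds.
scores = cross_val_score(GaussianNB(), X, y, cv=5)
print("Fold accuracies:", scores.round(3))
print("Mean accuracy  :", scores.mean().round(3))
```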

Output:

2. Apply kNN on all three of the above tasks and compare the output of kNN with that of the Naïve Bayes classifier.

Code: For task 1

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix,accuracy_score

df = pd.read_csv(r"E:\Jay\NIRMA\Sem6\ML\Exp5\Naive-Bayes-Classification-Data.csv")

X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size = 1/3, random_state = 42)
model = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)

model.fit(X_train,y_train)
y_pred = model.predict(X_test)
cm= confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 1 using KNN : ",accuracy)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 1 using Naive Bayes : ",accuracy)
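For intuition, what `KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)` does for a single query point is: compute the Euclidean distance to every training sample, take the 3 nearest, and majority-vote their labels. A toy sketch (the data here is made up for illustration, not the task's dataset):

```python
import numpy as np
from collections import Counter

# Five toy training points: two of class 0, three of class 1.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1, 1])

def knn_predict(x, k=3):
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean = minkowski, p=2
    nearest = y_train[np.argsort(dists)[:k]]     # labels of the k nearest
    return Counter(nearest).most_common(1)[0][0] # majority vote

print(knn_predict(np.array([1.0, 0.95])))  # surrounded by class-1 points -> 1
print(knn_predict(np.array([0.05, 0.1])))  # nearest two are class 0 -> 0
```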

Output:

Code: For task 2

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix,accuracy_score
from sklearn import preprocessing

weather = ['sunny','sunny','overcast','rainy','rainy','rainy','overcast','sunny','sunny','rainy','sunny','overcast','overcast','rainy']
temp =['hot','hot','hot','mild','cool','cool','cool','mild','cool','mild','mild','mild','hot','mild']
play=['no','no','yes','yes','yes','no','yes','no','yes','yes','yes','yes','yes','no']
le = preprocessing.LabelEncoder()
weather_encoded = le.fit_transform(weather)
temp_encoded = le.fit_transform(temp)
label = le.fit_transform(play)
features = list(zip(weather_encoded,temp_encoded))
X_train, X_test, y_train, y_test = train_test_split(features,label,test_size = 1/3, random_state = 42)
model = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
cm= confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 2 using KNN : ",accuracy)

model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 2 using Naive Bayes : ",accuracy)

Output:

Code: For task 3

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix,accuracy_score
from sklearn import preprocessing
from sklearn import datasets
wine = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(wine.data,wine.target,test_size = 1/3, random_state = 109)
model = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
cm= confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 3 using KNN : ",accuracy)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 3 using Naive Bayes : ",accuracy)
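One caveat when comparing kNN with Naïve Bayes on the wine data: kNN is distance-based, and the wine features span very different numeric ranges (proline is in the hundreds while hue is near 1), so the largest-scale features dominate the distances. Standardising the features usually changes kNN's result substantially; a sketch using `StandardScaler` in a pipeline:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Same split as the task above.
wine = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=1/3, random_state=109)

# kNN on raw features vs. kNN on standardised features.
plain = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)
scaled = make_pipeline(StandardScaler(),
                       KNeighborsClassifier(n_neighbors=3)).fit(X_train, y_train)

print("kNN without scaling:", accuracy_score(y_test, plain.predict(X_test)))
print("kNN with scaling   :", accuracy_score(y_test, scaled.predict(X_test)))
```

With scaling, kNN becomes a much stronger baseline on this dataset, so the comparison against Naïve Bayes looks quite different.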

Output:

For Task 1, kNN achieves higher accuracy, so kNN is preferred for this dataset.
For Task 2, both models show very low accuracy, and the dataset contains too few samples, so we cannot judge which model is better.
For Task 3, Naïve Bayes achieves higher accuracy, so Naïve Bayes is preferred.

Conclusion:
In this experiment I learned about the Naive Bayes classification method. I applied Naive Bayes classification to several datasets, predicted the outcomes, and measured the accuracy of each model. I also trained kNN on the same datasets to compare the two classifiers and see which one suits each dataset better. The Naive Bayes classifier is a simple but efficient algorithm that can be applied to document classification, sentiment analysis, spam filtering, and similar tasks. Its simplicity and speed make it a popular choice for many machine-learning problems, and its accuracy and performance make it a valuable tool for data analysis and decision-making.
