Experiment-5
Objective: Implementation of Naïve Bayes Classifier
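Before using sklearn's implementation, it helps to see what Gaussian Naive Bayes actually computes: per-class priors plus a per-class normal distribution for each feature, with prediction by the largest log-posterior. A minimal from-scratch sketch on synthetic data (function names and the synthetic setup are my own, not part of the tasks below):

```python
import numpy as np

def gaussian_nb_fit(X, y):
    """Estimate per-class priors, feature means, and feature variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # small additive term plays the role of sklearn's var_smoothing
    vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
    return classes, priors, means, vars_

def gaussian_nb_predict(X, classes, priors, means, vars_):
    # log N(x; mu, var) summed over features (conditional independence),
    # plus the log prior; pick the class with the largest log-posterior
    log_lik = -0.5 * (np.log(2 * np.pi * vars_[None]) +
                      (X[:, None, :] - means[None]) ** 2 / vars_[None]).sum(axis=2)
    return classes[np.argmax(np.log(priors) + log_lik, axis=1)]

# two well-separated Gaussian blobs as toy data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
classes, priors, means, vars_ = gaussian_nb_fit(X, y)
pred = gaussian_nb_predict(X, classes, priors, means, vars_)
print((pred == y).mean())  # well-separated classes, so accuracy near 1.0
```

sklearn's GaussianNB does the same estimation with extra numerical care; the tasks below use it directly.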
Task #1
Code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv(r"E:\Jay\NIRMA\Sem6\ML\Exp5\Naive-Bayes-Classification-Data.csv")
x = df.iloc[:,:-1].values
y = df.iloc[:,-1].values
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 42)
model = GaussianNB()
model.fit(x_train,y_train)
y_pred = model.predict(x_test)
print(y_pred)
print('\n')
accuracy = accuracy_score(y_test, y_pred)*100
print(accuracy)
print('\n')
Output:
Machine Learning 21BEC505
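The CSV above lives on a local drive, so the pipeline cannot be rerun elsewhere as-is; the same steps can be exercised on synthetic stand-in data, and the confusion matrix gives more detail than accuracy alone. A sketch (make_classification is a stand-in for the real dataset, not a reproduction of it):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix, accuracy_score

# stand-in for the local Naive-Bayes-Classification-Data.csv:
# two numeric features, one binary label
X, y = make_classification(n_samples=300, n_features=2, n_informative=2,
                           n_redundant=0, random_state=42)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=42)
model = GaussianNB().fit(x_train, y_train)
y_pred = model.predict(x_test)
print(confusion_matrix(y_test, y_pred))          # rows: true class, cols: predicted
print("accuracy:", accuracy_score(y_test, y_pred) * 100)
```

The off-diagonal counts of the confusion matrix show which class the model confuses, which a single accuracy number hides.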
Task #2
Code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import preprocessing
weather=['sunny','sunny','overcast','rainy','rainy','rainy','overcast','sunny','sunny','rainy','sunny','overcast','overcast','rainy']
temp =['hot','hot','hot','mild','cool','cool','cool','mild','cool','mild','mild','mild','hot','mild']
play=['no','no','yes','yes','yes','no','yes','no','yes','yes','yes','yes','yes','no']
le = preprocessing.LabelEncoder()
weather_encoded = le.fit_transform(weather)
temp_encoded = le.fit_transform(temp)
label = le.fit_transform(play)
print("Weather Encoded : ",weather_encoded)
print('\n')
print("Temperature Encoded : ",temp_encoded)
print('\n')
print("play: ",label)
print('\n')
features = list(zip(weather_encoded,temp_encoded))
print(features)
print('\n')
model = GaussianNB()
model.fit(features,label)
predicted = model.predict([[0,2]]) # 0 : overcast , 2 : mild
print("Predicted value : ",predicted)
Output:
Task #3
Code:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
wine = datasets.load_wine()
print("Features : ",wine.feature_names)
print('\n')
print("Labels : ",wine.target_names)
print('\n')
print("Data shape : ",wine.data.shape)
print('\n')
print("Top 5 records",wine.data[0:5],sep="\n")
print('\n')
print("0:Class_0, 1:class_1, 2:class_2 \n",wine.target)
print('\n')
X_train, X_test, y_train, y_test = train_test_split(wine.data,wine.target,test_size = 1/3,random_state = 109)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy : ",accuracy*100)
Output:
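The accuracy above comes from one particular split (random_state = 109); k-fold cross-validation averages over several splits and gives a more stable estimate. A small sketch of that check:

```python
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

wine = datasets.load_wine()
# 5-fold cross-validation: each sample is in the test fold exactly once
scores = cross_val_score(GaussianNB(), wine.data, wine.target, cv=5)
print("fold accuracies:", scores.round(3))
print("mean accuracy : %.1f%%" % (scores.mean() * 100))
```

If the fold accuracies vary widely, a single train/test split is not a trustworthy estimate of the model.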
Exercise:
The Iris Flower Dataset involves predicting the flower species given measurements of iris flowers. It is
a multiclass classification problem. The number of observations for each class is balanced. There are
150 observations with 4 input variables and 1 output variable. The variable names are as follows:
Sepal length in cm
Sepal width in cm
Petal length in cm
Petal width in cm
Class
Code:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv(r"iris.data", header=None, names=["sepal_length","sepal_width","petal_length","petal_width","class"]) # the UCI file has no header row
print("features and their labels")
print(df.iloc[:,:].values)
X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size = 0.25, random_state = 42)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy : ",accuracy)
Output:
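sklearn also ships the same iris data, which sidesteps file-path and missing-header issues entirely; a sketch of the exercise using that built-in copy:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

# built-in copy of the UCI iris data: 150 samples, 4 features, 3 classes
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target,
                                                    test_size=0.25, random_state=42)
model = GaussianNB().fit(X_train, y_train)
acc = accuracy_score(y_test, model.predict(X_test))
print("Accuracy :", acc)
```

Both routes should give essentially the same result; the built-in loader just removes the parsing pitfalls.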
2. Apply kNN to all three tasks above and compare the outputs of the kNN and Naïve Bayes classifiers.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix,accuracy_score
df = pd.read_csv(r"E:\Jay\NIRMA\Sem6\ML\Exp5\Naive-Bayes-Classification-Data.csv")
X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size = 1/3, random_state = 42)
model = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
cm= confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 1 using KNN : ",accuracy)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 1 using Naive Bayes : ",accuracy)
Output:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix,accuracy_score
from sklearn import preprocessing
weather =['sunny','sunny','overcast','rainy','rainy','rainy','overcast','sunny','sunny','rainy','sunny','overcast','overcast','rainy']
temp =['hot','hot','hot','mild','cool','cool','cool','mild','cool','mild','mild','mild','hot','mild']
play=['no','no','yes','yes','yes','no','yes','no','yes','yes','yes','yes','yes','no']
le = preprocessing.LabelEncoder()
weather_encoded = le.fit_transform(weather)
temp_encoded = le.fit_transform(temp)
label = le.fit_transform(play)
features = list(zip(weather_encoded,temp_encoded))
X_train, X_test, y_train, y_test = train_test_split(features,label,test_size = 1/3, random_state = 42)
model = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
cm= confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 2 using KNN : ",accuracy)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 2 using Naive Bayes : ",accuracy)
Output:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix,accuracy_score
from sklearn import preprocessing
from sklearn import datasets
wine = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(wine.data,wine.target,test_size = 1/3, random_state = 109)
model = KNeighborsClassifier(n_neighbors=3, metric='minkowski', p=2)
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
cm= confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 3 using KNN : ",accuracy)
model = GaussianNB()
model.fit(X_train,y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test,y_pred)
print("Accuracy for task 3 using Naive Bayes : ",accuracy)
Output:
For task 1, kNN achieves higher accuracy, so kNN is preferred for that dataset.
For task 2, both models show very low accuracy; with only 14 samples the dataset is too small to reliably judge which model is better.
For task 3, Naïve Bayes achieves higher accuracy, so Naïve Bayes is preferred.
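The comparison above fixes n_neighbors=3 and uses raw features, but kNN is sensitive to both the choice of k and feature scaling (the wine features span very different numeric ranges). A sketch that sweeps k on the wine split, with and without StandardScaler, to show how much these choices move the result:

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

wine = datasets.load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=1/3, random_state=109)

for k in (1, 3, 5, 7):
    # raw features: distances are dominated by large-scale features like proline
    raw = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    # standardized features: every feature contributes comparably to the distance
    scaled = make_pipeline(StandardScaler(),
                           KNeighborsClassifier(n_neighbors=k)).fit(X_train, y_train)
    print(k,
          round(accuracy_score(y_test, raw.predict(X_test)), 3),
          round(accuracy_score(y_test, scaled.predict(X_test)), 3))
```

On this data, scaling typically helps kNN far more than the exact choice of k does, which is worth keeping in mind before concluding that one classifier beats the other.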
Conclusion:
In this experiment I learned the Naive Bayes classification method, applied it to several datasets, predicted the outcomes, and measured the accuracy of each model. I also trained kNN on the same data to compare the two classifiers and see which one suits each dataset. The Naive Bayes classifier is a simple but efficient algorithm that can be applied to document classification, sentiment analysis, spam filtering, and other tasks. Its simplicity and speed make it a popular choice for many machine learning problems, and its accuracy and performance make it a valuable tool for data analysis and decision-making.