
Unit 4: Naïve Bayes Classifiers

Bayes Theorem

Prof. Sachin S. Patil
D.Y. Patil University, Ambi, Pune
Bayes Theorem
Bayes' Theorem states that the conditional probability of an event, given the occurrence of another event, is equal to the likelihood of the second event given the first event, multiplied by the probability of the first event and divided by the probability of the second event.
Bayes Theorem
• The Naïve Bayes algorithm is a supervised learning algorithm, based on Bayes' theorem and used for solving classification problems.
• It is mainly used for text classification with high-dimensional training datasets.
• The Naïve Bayes classifier is one of the simplest and most effective classification algorithms; it helps build fast machine learning models that can make quick predictions.
Bayes Theorem
• It is a probabilistic classifier, which means it predicts on the basis of the probability of an object.

• Popular examples of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and classifying articles.

• https://codinginfinite.com/naive-bayes-classification-numerical-example/
Why is it called Naïve Bayes?
• The Naïve Bayes algorithm is made up of two words, Naïve and Bayes, which can be described as:

• Naïve: It is called naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of other features.

• For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is recognized as an apple.

• Hence each feature individually contributes to identifying it as an apple, without depending on the others.

• Bayes: It is called Bayes because it depends on the principle of Bayes' theorem.
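Under this naïve independence assumption, the joint likelihood factorizes into a product of per-feature probabilities. For the apple example above, the comparison the classifier makes can be sketched as:

P(apple | red, spherical, sweet) ∝ P(red | apple) × P(spherical | apple) × P(sweet | apple) × P(apple)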
Bayes Theorem
• Bayes' theorem is also known as Bayes' rule or Bayes' law. It is used to determine the probability of a hypothesis with prior knowledge, and it depends on conditional probability:

P(A|B) = P(B|A) × P(A) / P(B)

Where,
P(A|B) is Posterior probability: Probability of hypothesis A given the observed event B.
P(B|A) is Likelihood probability: Probability of the evidence given that hypothesis A is true.
P(A) is Prior Probability: Probability of hypothesis before observing the evidence.
P(B) is Marginal Probability: Probability of Evidence.
Bayes Theorem

• P(B|A) is the conditional probability of event B given that event A has already occurred.

• P(A) represents the prior probability that event A will take place.

• P(B) represents the probability that event B will occur.
Bayes Theorem

• The probability of an event A occurring given evidence B is calculated by multiplying the likelihood of evidence B given the occurrence of event A by the prior probability of A, and dividing the result by the prior probability of B.
Compare Bayes' Theorem vs. Conditional Probability
Problem: If the weather is sunny, should the player play or not?

Outlook Play
0 Rainy Yes
1 Sunny Yes
2 Overcast Yes
3 Overcast Yes
4 Sunny No
5 Rainy Yes
6 Sunny Yes
7 Overcast Yes
8 Rainy No
9 Sunny No
10 Sunny Yes
11 Rainy No
12 Overcast Yes
13 Overcast Yes
Frequency table for the weather conditions:

Weather    Yes   No
Overcast   5     0
Rainy      2     2
Sunny      3     2
Total      10    4
Likelihood table for the weather conditions:

Weather    No            Yes            P(Weather)
Overcast   0             5              5/14 = 0.35
Rainy      2             2              4/14 = 0.29
Sunny      2             3              5/14 = 0.35
All        4/14 = 0.29   10/14 = 0.71
Applying Bayes' theorem:
P(Yes|Sunny) = P(Sunny|Yes) × P(Yes) / P(Sunny)
P(Sunny|Yes) = 3/10 = 0.30
P(Sunny) = 0.35
P(Yes) = 0.71
So P(Yes|Sunny) = 0.30 × 0.71 / 0.35 = 0.60
Applying Bayes' theorem:
P(No|Sunny) = P(Sunny|No) × P(No) / P(Sunny)
P(Sunny|No) = 2/4 = 0.50
P(No) = 0.29
P(Sunny) = 0.35
So P(No|Sunny) = 0.50 × 0.29 / 0.35 = 0.41

As we can see from the above calculations, P(Yes|Sunny) > P(No|Sunny). Hence, on a sunny day, the player can play the game.
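A minimal Python sketch (standard library only) that reproduces the calculation above from the frequency table; the variable names are illustrative:

# counts taken from the frequency table above
sunny_yes, total_yes = 3, 10    # 3 of the 10 "Yes" days are sunny
sunny_no, total_no = 2, 4       # 2 of the 4 "No" days are sunny
total = 14                      # total number of days

p_yes = total_yes / total                   # P(Yes) = 10/14 ≈ 0.71
p_no = total_no / total                     # P(No) = 4/14 ≈ 0.29
p_sunny = (sunny_yes + sunny_no) / total    # P(Sunny) = 5/14 ≈ 0.35

p_yes_given_sunny = (sunny_yes / total_yes) * p_yes / p_sunny   # ≈ 0.60
p_no_given_sunny = (sunny_no / total_no) * p_no / p_sunny       # ≈ 0.40

print("Play" if p_yes_given_sunny > p_no_given_sunny else "Don't play")   # Play

(The exact value of P(No|Sunny) is 0.40; the 0.41 above comes from rounding 0.29 and 0.35 before dividing.)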
Applications of Naïve Bayes Classifier:
• Common applications include spam detection, medical diagnosis, image recognition, and natural language processing.

• Spam detection
• Spam detection is among the machine learning tasks where the Bayes theorem is most frequently used. By calculating the likelihood that a message is spam using the Bayes theorem, machine learning algorithms can precisely detect unwanted emails and block them from reaching a user's mailbox.
Advantages of Naïve Bayes Classifier:
• Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a dataset.

• It can be used for binary as well as multi-class classification.

• It performs well in multi-class predictions compared to the other algorithms.

• It is the most popular choice for text classification problems.
Disadvantages of Naïve Bayes Classifier:
• Naive Bayes assumes that all features are independent or
unrelated, so it cannot learn the relationship between features.
Applications of Naïve Bayes Classifier:
• It is used for credit scoring.

• It is used in medical data classification.

• It can be used for real-time predictions because the Naïve Bayes classifier is an eager learner.

• It is used in text classification such as spam filtering and sentiment analysis.
Applications of Naïve Bayes Classifier:
Medical Diagnosis
• The Bayes theorem is also used in healthcare to determine the likelihood that a patient has a specific condition based on their symptoms and medical history. This can help medical professionals make more accurate diagnoses and prescribe the best therapies.
Applications of Naïve Bayes Classifier:
• Image Recognition
• The Bayes theorem is used for identifying objects in photographs. Machine learning algorithms can classify photos and identify objects by calculating the likelihood that an object appears in a photograph based on its features.
Applications of Naïve Bayes Classifier:
• Natural Language Processing
• In natural language processing, the Bayes theorem is widely used to calculate the likelihood that a certain word or phrase will be used in a given situation.

• Programs that need to process natural language, such as speech recognition and machine translation, can benefit from this.
Types of Naïve Bayes Model:
• Bernoulli Naïve Bayes
• Multinomial Naïve Bayes
• Gaussian Naïve Bayes
Gaussian Naive Bayes classifier
• In Gaussian Naive Bayes, continuous values associated with each feature are assumed to be distributed according to a Gaussian distribution. A Gaussian distribution is also called a Normal distribution. When plotted, it gives a bell-shaped curve which is symmetric about the mean of the feature values.
Gaussian Naive Bayes classifier

• The likelihood of the features is assumed to be Gaussian; hence, the conditional probability is given by:

P(x_i | y) = (1 / √(2π σ_y²)) × exp( −(x_i − μ_y)² / (2σ_y²) )

where μ_y and σ_y² are the mean and variance of feature x_i for class y.
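As a rough sketch of how this formula is applied per feature, the Gaussian likelihood can be computed directly in Python; the mean and standard deviation below are illustrative stand-ins for statistics estimated from training data:

import math

def gaussian_likelihood(x, mu, sigma):
    # P(x | y) for one feature under the Gaussian assumption,
    # with mu and sigma estimated from the training samples of class y
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

print(gaussian_likelihood(5.0, mu=4.8, sigma=0.6))   # likelihood density of x = 5.0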
Naïve Bayes in Scikit-learn
# load the iris dataset
from sklearn.datasets import load_iris
iris = load_iris()

# store the feature matrix (X) and response vector (y)
X = iris.data
y = iris.target

# splitting X and y into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1)

# training the model on the training set
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# making predictions on the testing set
y_pred = gnb.predict(X_test)
Naïve Bayes in Scikit-learn

# comparing actual response values (y_test) with predicted response values (y_pred)
from sklearn import metrics
print("Gaussian Naive Bayes model accuracy(in %):", metrics.accuracy_score(y_test, y_pred)*100)
Naïve Bayes in Scikit-learn
# Fitting Naive Bayes to the Training set
from sklearn.naive_bayes import GaussianNB   # import GaussianNB from sklearn.naive_bayes
classifier = GaussianNB()                    # create a GaussianNB classifier object
classifier.fit(x_train, y_train)             # fit the classifier to the training dataset

# Predicting the Test set results
y_pred = classifier.predict(x_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
Bernoulli Naïve Bayes

• In the multivariate Bernoulli event model, features are independent Booleans (binary variables) describing inputs. Like the multinomial model, this model is popular for document classification tasks, where binary term-occurrence features (i.e. whether a word occurs in a document or not) are used rather than term frequencies (i.e. how often a word occurs in the document).

https://iq.opengenus.org/bernoulli-naive-bayes/
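A minimal sklearn sketch of the Bernoulli model, assuming a tiny made-up term-occurrence matrix (each entry marks whether a word occurs in a document):

import numpy as np
from sklearn.naive_bayes import BernoulliNB

# rows = documents, columns = binary word-occurrence indicators
X = np.array([[1, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 1, 1],
              [0, 1, 0, 1]])
y = np.array([0, 0, 1, 1])   # document class labels

clf = BernoulliNB()
clf.fit(X, y)
print(clf.predict(np.array([[1, 0, 0, 1]])))   # class of a new document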
Question: Using Naïve Bayes, find the probability of buys_computer = yes or no for the instance X = (age = youth, income = medium, student = yes, credit_rating = fair); that is, predict its class label (yes or no).
Solution:
• P(C1) = P(buys_computer = yes) = 9/14 = 0.643 (since 9 of the 14 training rows are yes)

• P(C2) = P(buys_computer = no) = 5/14 = 0.357

• Here we have x1 = age, x2 = income, x3 = student, x4 = credit_rating

• P(age = youth | buys_computer = yes) = 2/9 = 0.222

• P(age = youth | buys_computer = no) = 3/5 = 0.600

• P(income = medium | buys_computer = yes) = 4/9 = 0.444

• P(income = medium | buys_computer = no) = 2/5 = 0.400
Solution:
• P(student = yes | buys_computer = yes) = 6/9 = 0.667

• P(student = yes | buys_computer = no) = 1/5 = 0.200

• P(credit_rating = fair | buys_computer = yes) = 6/9 = 0.667

• P(credit_rating = fair | buys_computer = no) = 2/5 = 0.400
Solution:
• P(X | buys_computer = yes) =
P(age = youth | buys_computer = yes) ×
P(income = medium | buys_computer = yes) ×
P(student = yes | buys_computer = yes) ×
P(credit_rating = fair | buys_computer = yes)
= 0.222 × 0.444 × 0.667 × 0.667 = 0.044

Similarly, P(X | buys_computer = no) = 0.600 × 0.400 × 0.200 × 0.400 = 0.019
Solution:
• P(X | buys_computer = yes) × P(buys_computer = yes) = 0.044 × 0.643 = 0.028

• P(X | buys_computer = no) × P(buys_computer = no) = 0.019 × 0.357 = 0.007

• P(X | buys_computer = yes) × P(buys_computer = yes) > P(X | buys_computer = no) × P(buys_computer = no)

• Therefore, the naive Bayesian classifier predicts buys_computer = yes for instance X = (age = youth, income = medium, student = yes, credit_rating = fair).
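The arithmetic in this solution can be verified with a few lines of Python; the probabilities are copied from the counts above:

# class priors
p_yes, p_no = 9/14, 5/14

# conditional probabilities for X = (youth, medium, student=yes, fair)
likelihood_yes = (2/9) * (4/9) * (6/9) * (6/9)   # ≈ 0.044
likelihood_no = (3/5) * (2/5) * (1/5) * (2/5)    # ≈ 0.019

print(likelihood_yes * p_yes)   # ≈ 0.028
print(likelihood_no * p_no)     # ≈ 0.007
print("yes" if likelihood_yes * p_yes > likelihood_no * p_no else "no")   # yes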
Multinomial Naïve Bayes
• Feature vectors represent the frequencies with which certain events have been generated by a multinomial distribution. This is the event model typically used for document classification.

• Application:

• To find the number of times a particular word is repeated in a text document, use the multinomial distribution.

https://www.upgrad.com/blog/multinomial-naive-bayes-explained/
Multinomial Naïve Bayes

• Application:

• To find the number of times a particular word is repeated in a text document

• To find the count of a particular word in a text document

• To find the frequency of a particular word in a text document

• To find the number of occurrences of a word in a text document
Multinomial Naïve Bayes
import numpy as np

# 8 samples with 100 count-valued features (random integers in [0, 8))
X = np.random.randint(8, size=(8, 100))
y = np.array([1, 2, 3, 4, 5, 6, 7, 8])   # one class label per sample

from sklearn.naive_bayes import MultinomialNB
MNBclf = MultinomialNB()
MNBclf.fit(X, y)   # fit the multinomial model to the count data
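The fitted model can then be used to score new count vectors; since X above is random data, this prediction is only an illustrative check that the pipeline runs:

print(MNBclf.predict(X[:1]))   # predict the class of the first training sample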
Gaussian Naïve Bayes

• Gaussian Naive Bayes (GNB) is a classification technique used in Machine Learning (ML) based on the probabilistic approach and the Gaussian distribution. Gaussian Naive Bayes assumes that each parameter (also called a feature or predictor) has an independent capacity to predict the output variable.
Gaussian Naïve Bayes
• Gaussian Naive Bayes is a variant of Naive Bayes that follows the Gaussian (normal) distribution and supports continuous data.

• Naive Bayes classifiers are a group of supervised machine learning classification algorithms based on the Bayes theorem. They are simple classification techniques, but have high functionality. They find use when the dimensionality of the inputs is high. Complex classification problems can also be handled by a Naive Bayes classifier.
Gaussian Naïve Bayes
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(x_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(x_test)

# Accuracy score
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)
Gaussian Naïve Bayes
• Good examples:

• https://www.youtube.com/watch?v=kufuBE6TJew

• https://levelup.gitconnected.com/classification-using-gaussian-naive-bayes-from-scratch-6b8ebe830266
Gaussian Naïve Bayes using sklearn
# fitting naive bayes to the training set
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# predicting test set results
y_pred = classifier.predict(X_test)

# making the confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cm
Thank You
