
Naive Bayes, Fall-19


What Is Naive Bayes?


Naive Bayes is one of the simplest, yet most powerful, algorithms for classification. It is based
on Bayes' Theorem with an assumption of independence among predictors. The Naive Bayes
model is easy to build and particularly useful for very large data sets. There are two parts to this
algorithm:

 Naive
 Bayes

The Naive Bayes classifier assumes that the presence of a feature in a class is unrelated to the
presence of any other feature. For example, a fruit may be considered an apple if it is red, round,
and about 3 inches in diameter. Even if these features depend on each other or upon the existence
of the other features, each of these properties independently contributes to the probability that a
particular fruit is an apple, an orange, or a banana, and that is why the classifier is known as "Naive."

Bayes' theorem provides a way of calculating the posterior probability P(c|x) from P(c), P(x), and P(x|c).
Look at the equation below:

P(c|x) = P(x|c) * P(c) / P(x)

Above,
 P(c|x) is the posterior probability of the class (c, target) given the predictor (x, attributes).
 P(c) is the prior probability of the class.
 P(x|c) is the likelihood, which is the probability of the predictor given the class.
 P(x) is the prior probability of the predictor.
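
To make the roles of these four quantities concrete, here is a minimal Python sketch of the formula. The function name is illustrative, and the sample values are the ones used in the worked example later in this tutorial.

```python
def bayes_posterior(prior, likelihood, evidence):
    """Return the posterior P(c|x) = P(x|c) * P(c) / P(x)."""
    return likelihood * prior / evidence

# P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny), as computed later on:
print(bayes_posterior(prior=9/14, likelihood=3/9, evidence=5/14))  # ~0.60
```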

How Does the Naive Bayes Algorithm Work? [Simulation]

 Let’s understand it using an example. Suppose we have a training data set of weather
conditions and a corresponding target variable ‘Play’ (indicating whether a game was
played). We need to classify whether players will play or not based on the weather
condition. Let’s follow the steps below to perform it.
 Step 1: Convert the data set into a frequency table.
 Step 2: Create a likelihood table by finding the probabilities, e.g., the probability of
Overcast = 4/14 ≈ 0.29 and the probability of playing = 9/14 ≈ 0.64. (Steps 1 and 2 are
sketched in the code below.)


 Step 3: Now, use the Naive Bayes equation to calculate the posterior probability for each
class. The class with the highest posterior probability is the outcome of the prediction.
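
Since the handout's weather table is not reproduced here, the sketch below uses hypothetical (outlook, play) counts chosen to match the probabilities quoted in Step 2; it shows what Steps 1 and 2 amount to in code.

```python
from collections import Counter

# Hypothetical (outlook, play) records consistent with the quoted figures:
# 14 days in total, 9 of them "Yes", 4 of them "Overcast".
data = [("Sunny", "Yes")] * 3 + [("Sunny", "No")] * 2 \
     + [("Overcast", "Yes")] * 4 \
     + [("Rainy", "Yes")] * 2 + [("Rainy", "No")] * 3

# Step 1: frequency table of (outlook, play) pairs.
freq = Counter(data)

# Step 2: likelihood table entries, i.e. the marginal probabilities.
n = len(data)
p_overcast = sum(c for (o, _), c in freq.items() if o == "Overcast") / n
p_yes = sum(c for (_, p), c in freq.items() if p == "Yes") / n
print(round(p_overcast, 2), round(p_yes, 2))  # 0.29 0.64
```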

 Problem: Players will play if the weather is sunny. Is this statement correct?


 We can solve this using the method of posterior probability discussed above.
 P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
 Here we have P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, and P(Yes) = 9/14 = 0.64.
 Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60, which is higher than P(No | Sunny) =
P(Sunny | No) * P(No) / P(Sunny) = (2/5) * (5/14) / (5/14) = 0.40, so the statement is
correct (see the code sketch after this list).
 Naive Bayes uses a similar method to predict the probability of different classes based on
various attributes. This algorithm is mostly used in text classification and in problems
having multiple classes.
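
As a sketch of Step 3 on this problem, the snippet below plugs in the probabilities quoted above. P(Sunny | No) = 2/5 is inferred from the same hypothetical counts used earlier (2 of the 5 "No" days are Sunny).

```python
# Probabilities quoted in the worked example above.
p_sunny_yes, p_yes, p_sunny = 3/9, 9/14, 5/14
p_sunny_no, p_no = 2/5, 5/14  # assumed counts: 2 of the 5 "No" days are Sunny

p_yes_given_sunny = p_sunny_yes * p_yes / p_sunny  # 0.60
p_no_given_sunny = p_sunny_no * p_no / p_sunny     # 0.40

# Step 3: pick the class with the highest posterior probability.
print("Play" if p_yes_given_sunny > p_no_given_sunny else "Don't play")
```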
Extra Information
What Is Bayes' Theorem:
In statistics and probability theory, Bayes' theorem describes the probability of an event, based
on prior knowledge of conditions that might be related to the event. It serves as a way to figure
out conditional probability.
Given a hypothesis (H) and evidence (E), Bayes' Theorem states that the relationship between
the probability of the hypothesis before getting the evidence, P(H), and the probability of the
hypothesis after getting the evidence, P(H|E), is:

P(H|E) = P(E|H) * P(H) / P(E)

For this reason, P(H) is called the prior probability, while P(H|E) is called the posterior
probability. The factor that relates the two, P(E|H)/P(E), is called the likelihood ratio. Using
these terms, Bayes' theorem can be rephrased as:

"The posterior probability equals the prior probability times the likelihood ratio."

A little confused? Don't worry. Let's continue our Naive Bayes tutorial and understand this
concept with a simple example.

Bayes' Theorem Example


Let's suppose we have a deck of cards and we wish to find the probability that a card we pick
at random is a king, given that it is a face card. We can solve this problem using Bayes'
Theorem. First, we need to find the following probabilities:

 P(King), which is 4/52, as there are 4 Kings in a deck of cards.


 P(Face|King), which is equal to 1, as all Kings are face cards.
 P(Face), which is equal to 12/52, as there are 3 face cards in a suit of 13 cards and there
are 4 suits in total.

Now, putting all the values into Bayes' equation, we get P(King|Face) = P(Face|King) * P(King) / P(Face) = (1 * 4/52) / (12/52) = 4/12 = 1/3.
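
The same substitution, as a quick Python check:

```python
# P(King | Face) = P(Face | King) * P(King) / P(Face)
print(1.0 * (4 / 52) / (12 / 52))  # 0.333..., i.e. 1/3
```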

Game Prediction Using Bayes' Theorem


Let's continue our Naive Bayes tutorial and predict the future with some weather data.
Our data comprises the day, outlook, humidity, and wind conditions. The final column is 'Play,'
i.e., whether we can play outside, which is what we have to predict.

 First, we will create a frequency table using each attribute of the dataset.

 For each frequency table, we will generate a likelihood table.


 Likelihood of ‘Yes’ given ‘Sunny’ is:
o P(c|x) = P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny) = (0.33 x 0.64)
/ 0.36 = 0.59
 Similarly, the likelihood of ‘No’ given ‘Sunny’ is:
o P(c|x) = P(No|Sunny) = P(Sunny|No) * P(No) / P(Sunny) = (0.40 x 0.36) / 0.36 =
0.40
 Now, in the same way, we need to create the likelihood table for the other attributes as
well.

Suppose we have a day with the following values:

 Outlook = Rain
 Humidity = High
 Wind = Weak
 Play = ?

So, with this data, we have to predict whether we can play on that day or not.

 Likelihood of 'Yes' on that day = P(Outlook = Rain|Yes) * P(Humidity = High|Yes) *
P(Wind = Weak|Yes) * P(Yes)
o = 2/9 * 3/9 * 6/9 * 9/14 = 0.0318
 Likelihood of 'No' on that day = P(Outlook = Rain|No) * P(Humidity = High|No) *
P(Wind = Weak|No) * P(No)
o = 2/5 * 4/5 * 2/5 * 5/14 = 0.0457

Now, when we normalize these values so that they sum to 1, we get:

 P(Yes) = 0.0318 / (0.0318 + 0.0457) = 0.41


 P(No) = 0.0457 / (0.0318 + 0.0457) = 0.59

Our model predicts that there is a 59% chance of no game on that day, so the prediction for 'Play' is 'No.'
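
Here is a minimal Python sketch of this whole calculation, using the class-conditional probabilities quoted above (the underlying frequency tables are not reproduced in this handout):

```python
# Class-conditional probabilities for Outlook=Rain, Humidity=High, Wind=Weak.
likelihood_yes = (2/9) * (3/9) * (6/9) * (9/14)  # ~0.0318
likelihood_no = (2/5) * (4/5) * (2/5) * (5/14)   # ~0.0457

# Normalize so that the two posteriors sum to 1.
total = likelihood_yes + likelihood_no
p_yes = likelihood_yes / total  # ~0.41
p_no = likelihood_no / total    # ~0.59

print(f"P(Yes) = {p_yes:.2f}, P(No) = {p_no:.2f}")
```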

4 Applications of Naive Bayes Algorithms


 Real-time prediction: Naive Bayes is an eager learning classifier and it is fast.
Thus, it can be used for making predictions in real time.
 Multi-class prediction: This algorithm is also well known for its multi-class prediction
capability. Here we can predict the probabilities of multiple classes of the target variable.
 Text classification / spam filtering / sentiment analysis: Naive Bayes classifiers,
mostly used in text classification (due to better results in multi-class problems and the
independence assumption), have a higher success rate compared to other algorithms. As a result,
they are widely used in spam filtering (identifying spam e-mail) and sentiment analysis (in
social media analysis, to identify positive and negative customer sentiment); see the
sketch after this list.
 Recommendation systems: A Naive Bayes classifier and collaborative filtering together
build a recommendation system that uses machine learning and data mining techniques
to filter unseen information and predict whether a user would like a given resource or not.
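
To illustrate the text-classification use case, here is a hypothetical toy spam filter built with scikit-learn's MultinomialNB; the messages, labels, and test sentence are made up for this sketch.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Tiny made-up corpus, for illustration only.
messages = [
    "win a free prize now",
    "limited offer, claim your free gift",
    "meeting rescheduled to monday",
    "please review the attached report",
]
labels = ["spam", "spam", "ham", "ham"]

# Turn each message into word counts, then fit a multinomial Naive Bayes model.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)
model = MultinomialNB().fit(X, labels)

test = vectorizer.transform(["claim your free prize"])
print(model.predict(test))  # ['spam'] on this toy data
```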
