
Naive Bayes

Teaching Team
Machine Learning Course
Department of Information Technology, 2021
Disclaimer
▪ This presentation material, including examples, images, and references,
is provided for informational and explanatory purposes only.

▪ The names of actual products and companies mentioned herein, if
any, may be the trademarks of their respective owners.

▪ Credit shall be given for images taken from open sources; they
cannot be used for promotional activities.

Machine Learning 2021 - Materi 06 - Naive Bayes


Outline Naive Bayes

• Bayes’ Theorem
• Generative and Discriminative Models
• Naive Bayes Computation



Recalling Classifier
• A machine learning model that is used to discriminate between different
objects based on certain features.

• Naive Bayes Classifier.



What is Naive Bayes?

• A probabilistic machine learning model that is used for classification tasks.

• It is based on Bayes' theorem.


Bayes’ Theorem
• A formula for calculating the probability of an event using prior knowledge of
related conditions.
• The theorem was discovered by an English statistician and minister named
Thomas Bayes in the 18th century.

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)}$$

• A and B are events.
• P(A) is the probability of observing event A.
• P(B) is the probability of observing event B.
• P(A|B) is the conditional probability of observing A given that B was observed.
• In classification tasks, the goal is to map features of explanatory variables to a discrete
response variable.
• We must find the most likely label, A, given the features, B.
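
Bayes' theorem can be sketched as a small Python function. The numbers below are hypothetical and only illustrate the arithmetic; P(B) is obtained here from the law of total probability, which is an assumption about how the evidence term is available.

```python
def bayes_posterior(prior_a, likelihood_b_given_a, evidence_b):
    """Return P(A|B) = P(A) * P(B|A) / P(B)."""
    if evidence_b <= 0:
        raise ValueError("P(B) must be positive")
    return prior_a * likelihood_b_given_a / evidence_b


# Hypothetical values (not from the slides)
p_a = 0.2              # P(A)
p_b_given_a = 0.5      # P(B|A)
p_b_given_not_a = 0.1  # P(B|not A)

# P(B) = P(B|A) P(A) + P(B|not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes_posterior(p_a, p_b_given_a, p_b)
print(round(posterior, 4))
```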



Discriminative and Generative Model
• Discriminative models learn a decision boundary that is used to discriminate
between classes.
• They predict P(y|x), the probability of y given x, directly, without
calculating P(x, y), the joint probability of x and y.
• Generative models model the joint probability distribution of the features
and the classes, P(x, y).
• A discriminative model does not care how the data is generated; it only cares
about P(y|x).


Discriminative vs. Generative Model
• Imagine we are trying to classify dogs and cats based on the animals' weight and height
using a generative model.
– We will have to compute the following probabilities for each data point:
• P(cat, weight)
• P(cat, height)
• P(dog, weight)
• P(dog, height)
– If we have 1,000 data points to train our model, we will need to
compute at least 4,000 probabilities.


Discriminative vs. Generative Model
• Imagine we are trying to classify dogs and cats based on the animals' weight and height
using a discriminative model.
– We just need to compute P(y|x) for each data point.
– That means calculating 2,000 probabilities if the data set has 1,000 data points.




Generative Model
• Equivalent to modelling the probabilities of the classes and the
probabilities of the features given the classes.
• Generative models model how the classes generate features or new
examples of the data. This intermediate step introduces more assumptions
to the model, so generative models can be more biased.
• When these assumptions hold, generative models are more robust to noisy
training data and may perform better when training data is scarce (the data is
difficult to obtain, or small compared to the amount needed).
• The disadvantage is that these assumptions can prevent generative models from learning.


Naive Bayes Computation
• Rewrite Bayes' theorem for a classification task:

$$P(y \mid x_1, \dots, x_n) = \frac{P(x_1, \dots, x_n \mid y)\,P(y)}{P(x_1, \dots, x_n)}$$

• y is the positive class, x_1 is the first feature for the instance, and n is the number of features.
• P(x_1, …, x_n) is the same for every candidate class y, so we can omit it when comparing
classes; the probability of observing a particular feature in the training set does not vary
for different test instances.
• This leaves two terms: the prior class probability, P(y), and the conditional probability,
P(x_1, …, x_n | y). Naive Bayes estimates these terms using maximum a posteriori estimation
(MAP).
• P(y) is simply the relative frequency of each class in the training set.
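
As a minimal sketch of the last point, the class priors can be estimated by counting labels. The label list is hypothetical, and `class_priors` is an illustrative helper name, not part of any library.

```python
from collections import Counter


def class_priors(labels):
    """Estimate P(y) as the relative frequency of each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical training labels
labels = ["yes", "yes", "no", "yes", "no", "yes"]
priors = class_priors(labels)
print(priors)
```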



Naive Bayes Computation
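• Performing maximum a posteriori estimation (MAP) in Naive Bayes:

$$P(y \mid x_1, \dots, x_n) = \frac{P(x_1, \dots, x_n \mid y)\,P(y)}{P(x_1, \dots, x_n)}$$

$$P(y \mid x_1, \dots, x_n) \propto P(y)\,P(x_1 \mid y)\,P(x_2 \mid y) \cdots P(x_n \mid y)$$

$$P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)$$

• The predicted class is given by:

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)$$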
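
The decision rule can be sketched in Python as follows. The priors and conditional probabilities are hypothetical, and summing logarithms instead of multiplying probabilities is a common implementation choice assumed here to avoid numerical underflow; it does not change which class wins.

```python
import math


def safe_log(p):
    """Log of a probability, mapping 0 to negative infinity."""
    return math.log(p) if p > 0 else -math.inf


def predict(priors, likelihoods, features):
    """Return argmax_y P(y) * prod_i P(x_i | y), evaluated in log space."""
    scores = {}
    for label, prior in priors.items():
        score = safe_log(prior)
        for i, value in enumerate(features):
            score += safe_log(likelihoods[label][i].get(value, 0.0))
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical model with two classes and two categorical features
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": [{"yes": 0.8, "no": 0.2}, {"yes": 0.3, "no": 0.7}],
    "ham": [{"yes": 0.1, "no": 0.9}, {"yes": 0.5, "no": 0.5}],
}

# spam: 0.4 * 0.8 * 0.7 = 0.224, ham: 0.6 * 0.1 * 0.5 = 0.03
predicted = predict(priors, likelihoods, ("yes", "no"))
print(predicted)
```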



Assumptions of NB
• The features are conditionally independent given the response
variable.

• Training instances are independent and identically distributed
(i.i.d.); this means that training instances are independent of
each other and are drawn from the same probability
distribution.


Naive Bayes Types
• Multinomial Naive Bayes – Mostly used for document classification
problems, i.e. whether a document belongs to the category of sports,
politics, technology, etc. The features/predictors used by the classifier are
the frequencies of the words present in the document.

• Bernoulli Naive Bayes – Similar to the multinomial one, but the predictors
are Boolean variables. The parameters that are used to predict the class
variable take only the values yes or no, for example whether a word occurs in the
text or not.

• Gaussian Naive Bayes – When the predictors take continuous values
and are not discrete, we assume that these values are sampled from a
Gaussian distribution.
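
The difference between the multinomial and Bernoulli feature representations can be illustrated with a short sketch. The vocabulary, the document, and the whitespace tokenization are all assumptions made for illustration.

```python
from collections import Counter


def multinomial_features(text, vocabulary):
    """Word counts per vocabulary term (multinomial representation)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def bernoulli_features(text, vocabulary):
    """Word presence per vocabulary term (Bernoulli representation)."""
    words = set(text.lower().split())
    return [word in words for word in vocabulary]


# Hypothetical vocabulary and document
vocabulary = ["goal", "match", "election"]
document = "Goal after goal in the match"

count_vector = multinomial_features(document, vocabulary)   # [2, 1, 0]
binary_vector = bernoulli_features(document, vocabulary)    # [True, True, False]
print(count_vector, binary_vector)
```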



Let’s Watch a Film
https://youtu.be/O2L2Uv9pdDA



EXAMPLE
• A survey conducted by a health agency states that 90% of the world's
population suffers from lung disease. Of the people with lung disease,
60% are smokers, and of the people without lung disease, 20% are smokers.
• These facts can be defined with X = has lung disease and Y = smoker.
• Then:
– P(X) = 0.9
– P(~X) = 0.1
– P(Y|X) = 0.6 → P(~Y|X) = 0.4
– P(Y|~X) = 0.2 → P(~Y|~X) = 0.8



EXAMPLE
• Using Bayes' method we can compute:
– P(X|Y) ∝ P(Y|X) . P(X) = (0.6) . (0.9) = 0.54
– P(~X|Y) ∝ P(Y|~X) . P(~X) = (0.2) . (0.1) = 0.02
• If a person is known to smoke, they are classified as having
lung disease, because 0.54 is greater than 0.02.

$$P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$
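
The same arithmetic can be reproduced in a few lines of Python, using P(X) = 0.9, P(Y|X) = 0.6 and P(Y|~X) = 0.2 from the example. The final normalization step, which turns the two products into a posterior probability, is an addition for illustration and is not shown on the slide.

```python
p_x = 0.9              # P(X): has lung disease
p_y_given_x = 0.6      # P(Y|X): smoker given lung disease
p_y_given_not_x = 0.2  # P(Y|~X): smoker given no lung disease

score_x = p_y_given_x * p_x                # P(Y|X) P(X), about 0.54
score_not_x = p_y_given_not_x * (1 - p_x)  # P(Y|~X) P(~X), about 0.02

# Dividing by P(Y) = score_x + score_not_x gives the posterior P(X|Y)
posterior_x = score_x / (score_x + score_not_x)

prediction = "lung disease" if score_x > score_not_x else "no lung disease"
print(prediction, round(posterior_x, 4))
```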



EXAMPLE
#  Weather  Temperature  Wind Speed  Exercise
1  Sunny    Normal       Light       Yes
2  Sunny    Normal       Light       Yes
3  Rainy    High         Light       No
4  Sunny    Normal       Strong      Yes
5  Rainy    High         Strong      No
6  Sunny    Normal       Light       Yes

• Assumptions:
– Y = exercise,
– X1 = weather,
– X2 = temperature,
– X3 = wind speed
• Based on the data:
– P(Y=yes) = 4/6 → P(Y=no) = 2/6
– P(X1=sunny|Y=yes) = 4/4 = 1,
P(X1=sunny|Y=no) = 0
– P(X3=strong|Y=yes) = 1/4,
P(X3=strong|Y=no) = 1/2



EXAMPLE
• If the weather is sunny and the wind is strong, will the person exercise?
• Probability for yes
– P( X1=sunny, X3=strong | Y=yes ) . P(Y=yes)
– = { P(X1=sunny|Y=yes) . P(X3=strong|Y=yes) } . P(Y=yes)
– = { (1) . (1/4) } . (4/6) = 1/6
• Probability for no
– P( X1=sunny, X3=strong | Y=no ) . P(Y=no)
– = { P(X1=sunny|Y=no) . P(X3=strong|Y=no) } . P(Y=no)
– = { (0) . (1/2) } . (2/6) = 0
• Since 1/6 > 0, the predicted class is yes:

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)$$
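
The worked example can be checked with a small counting-based sketch. The rows are the six records of the example table with values translated to English (sunny = cerah, rainy = hujan, light = pelan, strong = kencang); the `score` helper is an illustrative name, and exact fractions are used so the results match the hand calculation.

```python
from fractions import Fraction

# (weather, temperature, wind speed, exercise)
rows = [
    ("sunny", "normal", "light", "yes"),
    ("sunny", "normal", "light", "yes"),
    ("rainy", "high", "light", "no"),
    ("sunny", "normal", "strong", "yes"),
    ("rainy", "high", "strong", "no"),
    ("sunny", "normal", "light", "yes"),
]


def score(rows, label, observed):
    """Return P(label) * prod_i P(x_i | label) for observed {column: value}."""
    in_class = [r for r in rows if r[-1] == label]
    result = Fraction(len(in_class), len(rows))
    for col, value in observed.items():
        matches = sum(1 for r in in_class if r[col] == value)
        result *= Fraction(matches, len(in_class))
    return result


observed = {0: "sunny", 2: "strong"}  # X1 = sunny, X3 = strong
scores = {label: score(rows, label, observed) for label in ("yes", "no")}
prediction = max(scores, key=scores.get)
print(scores, prediction)
```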
Gaussian Naive Bayes
• The Naive Bayes classifier can also handle continuous attributes.
• One way to do this is to use the Gaussian distribution.
• This distribution is characterized by two parameters:
the mean (μ) and the variance (σ²).
• For each class Yj, the class-conditional probability of
attribute Xi is expressed by the Gaussian distribution equation.

Gaussian Naive Bayes
• The density function expresses relative probability.
• For data with mean μ and standard deviation σ, the probability
density function is:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

• μ and σ can be estimated from the data, for each class.

• This density is used to compute the likelihood P(X | C).
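
A minimal sketch of the Gaussian likelihood computation follows. The attribute values are hypothetical, and using the sample standard deviation (`statistics.stdev`) rather than the population one is an assumption; the slides do not say which estimator is intended.

```python
import math
import statistics


def gaussian_pdf(x, mean, std):
    """Density at x of a normal distribution with the given mean and std."""
    if std <= 0:
        raise ValueError("std must be positive")
    exponent = -((x - mean) ** 2) / (2 * std ** 2)
    return math.exp(exponent) / (std * math.sqrt(2 * math.pi))


# Hypothetical values of one continuous attribute within one class
values_in_class = [100.0, 200.0, 300.0]
mu = statistics.mean(values_in_class)      # 200.0
sigma = statistics.stdev(values_in_class)  # sample standard deviation

# Likelihood P(X = 250 | C) under the fitted Gaussian
likelihood = gaussian_pdf(250.0, mu, sigma)
print(mu, sigma, likelihood)
```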



Gaussian Naive Bayes
• Probability of occurrence of each value
for the attribute Land Price (C1)


Gaussian Naive Bayes
• Probability of occurrence of each value
for the attribute Distance from City Center (C2)


Gaussian Naive Bayes
• Probability of occurrence of each value
for the attribute Public Transport (C3)


Gaussian Naive Bayes
• Probability of occurrence of each value
for the attribute Selected for Housing
(C4)


Gaussian Naive Bayes
• Given C1 = 300, C2 = 17, C3 = no, then:

• P(C3=no|C4=yes) = 4/5 , P(C3=no|C4=no) = 2/5


Gaussian Naive Bayes
• Probability (likelihood) for yes
– P( C1=300, C2=17, C3=no | C4=yes ) . P(C4=yes)
– { P(C1=300|C4=yes) . P(C2=17|C4=yes) . P(C3=no|C4=yes) } . P(C4=yes)
– 0.0021 * 0.0009 * 4/5 * 1/2
– 0.000000756
• Probability (likelihood) for no
– P( C1=300, C2=17, C3=no | C4=no ) . P(C4=no)
– { P(C1=300|C4=no) . P(C2=17|C4=no) . P(C3=no|C4=no) } . P(C4=no)
– 0.0013 * 0.0633 * 2/5 * 1/2
– 0.000016458

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)$$
Gaussian Naive Bayes
• Probability values can be obtained by normalizing
these likelihoods so that the resulting values
sum to 1.
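
The normalization can be sketched with the two likelihood values from the preceding computation (0.000000756 for yes and 0.000016458 for no). The variable names are illustrative.

```python
# Products from the example: densities * categorical probability * prior
likelihood_yes = 0.0021 * 0.0009 * (4 / 5) * (1 / 2)  # about 0.000000756
likelihood_no = 0.0013 * 0.0633 * (2 / 5) * (1 / 2)   # about 0.000016458

# Dividing each value by their sum makes the two results add up to 1
total = likelihood_yes + likelihood_no
prob_yes = likelihood_yes / total
prob_no = likelihood_no / total

prediction = "yes" if prob_yes > prob_no else "no"
print(round(prob_yes, 4), round(prob_no, 4), prediction)
```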



Exercise
• How do we predict "land price EXPENSIVE, distance
from the city center MEDIUM, and public transport
AVAILABLE"?
• C1
• P(c1=cheap|c4=yes)=2/5
• P(c1=cheap|c4=no)=1/5
• P(c1=medium|c4=yes)=2/5
• P(c1=medium|c4=no)=1/5
• P(c1=expensive|c4=yes)=1/5
• P(c1=expensive|c4=no)=3/5



Exercise
• Probability of occurrence of each
value for the attribute Distance from
City Center (C2)


Exercise
• Probability of occurrence of each
value for the attribute Public
Transport (C3)


Exercise
• Probability of occurrence of each
value for the attribute Selected for
Housing (C4)


Exercise
• Predict "land price EXPENSIVE, distance from the city center
MEDIUM, and public transport AVAILABLE".
• Probability (likelihood) for yes
YES = P(Land=EXPENSIVE|Yes) . P(Distance=MEDIUM|Yes) .
P(Transport=AVAILABLE|Yes) . P(Yes)
= 1/5 x 2/5 x 1/5 x 5/10 = 1/125 = 0.008
• Probability (likelihood) for no
NO = P(Land=EXPENSIVE|No) . P(Distance=MEDIUM|No) .
P(Transport=AVAILABLE|No) . P(No)
= 3/5 x 1/5 x 3/5 x 5/10 = 9/250 = 0.036

$$\hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i \mid y)$$
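
The arithmetic of this exercise can be checked with exact fractions. The factors are the ones listed in the two products above (1/5, 2/5, 1/5, 5/10 for yes and 3/5, 1/5, 3/5, 5/10 for no); the variable names are illustrative.

```python
from fractions import Fraction

# Conditional probabilities times the class prior, per class
score_yes = Fraction(1, 5) * Fraction(2, 5) * Fraction(1, 5) * Fraction(5, 10)
score_no = Fraction(3, 5) * Fraction(1, 5) * Fraction(3, 5) * Fraction(5, 10)

# The class with the larger score is the prediction
prediction = "yes" if score_yes > score_no else "no"
print(score_yes, float(score_yes))
print(score_no, float(score_no))
print(prediction)
```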

Homework
• Classify whether the day is suitable for playing golf, given
the features of the day.

• The columns represent these features and the rows
represent individual entries.

• Example: If we take the first row of the dataset, we can
observe that it is not suitable for playing golf if the outlook is
rainy, temperature is hot, humidity is high and it is not
windy.

• How do we predict whether to play golf, given the information
provided by the features?
• (today = Sunny, Hot, Normal, False)?


Homework
1. Create Python code to make a
prediction for the case above.

2. Write the manual computation in a
spreadsheet application to compare
the results with the automatic
computation in point 1.
