
UNIT II PROBABILISTIC REASONING
• Acting under uncertainty – Bayesian inference – naïve Bayes models. Probabilistic reasoning – Bayesian networks – exact inference in BN – approximate inference in BN – causal networks.
• Uncertainty
• When we are not sure about a situation, we say the situation is uncertain.
• Example: final semester examination results
• You are not sure whether you have passed or failed until the results are published.
• Example: COVID-19
• A good example of uncertainty: we were unaware of the disease until December 2019.
Acting under uncertainty

• Causes of uncertainty in the real world
• Power failure
• Equipment failure
• Sudden variations in climatic conditions
• Information provided by unreliable sources
• Hacking
• Sudden spread of disease
• Uncertainty in AI – an agent may never know for certain what state it is in now or where it will end up after a sequence of actions.
• Problem-solving and logical agents handle uncertainty by keeping track of a belief state.
• Belief state
• A representation of the set of all possible world states the agent might be in; from it the agent generates a contingency plan that accounts for all potential outcomes.
• Belief state for a chess game: the list of all possible board positions.
• Drawbacks of a belief state:
• The agent must consider every possible explanation for its sensor observations, which leads to a large belief state full of unlikely possibilities.
• A correct contingency plan that accounts for every possibility can grow indefinitely.
• Sometimes no plan is guaranteed to achieve the goal, yet the agent must still act.
Bayesian inference
• Bayes' Rule:
• Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian reasoning.
• It determines the probability of an event given uncertain knowledge.
• In probability theory, it relates the conditional probability and the marginal probability of two random events:
P(A | B) = P(B | A) P(A) / P(B)
• Bayesian inference is an application of Bayes' theorem.
PROBABILITIES
• Marginal Probability
• Joint Probability
• Conditional Probability
• Marginal Probability: the probability of an event irrespective of the outcome of another variable; the probability of a single event occurring.
• Example
• An exam was taken by a group of male and female candidates. The probabilities of the pass and fail events are given in a table. Let us find the marginal probabilities for the given table of data.
• Adding the values along each row gives the total proportion of male and of female candidates who took the exam.
• Adding the values along each column gives the total proportion of candidates who passed and who failed.
• The totals so obtained are called marginal probabilities. The sum of the marginal probability values must be 1.
• Joint Probability: the probability that two events occur together, written P(A ∩ B).
• From the example table: what is the probability that a candidate is male and passed the exam?
• i.e. P(MALE ∩ PASS) = 0.27
Conditional Probability: the probability of one event occurring given that a second event has occurred:
P(A | B) = P(A ∩ B) / P(B)
What is the probability that a randomly selected candidate passed the exam, given that the candidate is female?
The general form of Bayes' rule with normalization is
P(Y | X) = α P(X | Y) P(Y)
where α = 1/P(X) is the normalization constant.
Example
A doctor knows that the disease meningitis causes a patient to have a stiff neck, say, 70% of the time. The doctor also knows some unconditional facts: the prior probability that any patient has meningitis is 1/50,000, and the prior probability that any patient has a stiff neck is 1%. What is the probability that a patient has meningitis, given that the patient has a stiff neck?
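The arithmetic for this example can be checked with a short script (a sketch; the variable names are ours, the numbers come from the problem statement):

```python
# Bayes' rule: P(meningitis | stiff neck)
#            = P(stiff neck | meningitis) * P(meningitis) / P(stiff neck)
p_s_given_m = 0.7      # a patient with meningitis has a stiff neck 70% of the time
p_m = 1 / 50_000       # prior probability of meningitis
p_s = 0.01             # prior probability of a stiff neck

p_m_given_s = p_s_given_m * p_m / p_s
print(round(p_m_given_s, 4))  # 0.0014
```

So only about 1 in 700 patients with a stiff neck actually has meningitis, because the prior for meningitis is so small.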
In a neighbourhood, 90% of children were falling sick due to flu and 10% due to measles, and no other disease. The probability of observing rashes for measles is 0.95 and for flu is 0.08. If a child develops rashes, find the probability that the child has flu.
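This one needs the law of total probability in the denominator. A worked check (variable names are ours):

```python
# P(flu | rash) = P(rash | flu) P(flu) / [P(rash | flu) P(flu) + P(rash | measles) P(measles)]
p_flu, p_measles = 0.9, 0.1          # priors: only flu and measles occur
p_rash_given_flu = 0.08
p_rash_given_measles = 0.95

p_rash = p_rash_given_flu * p_flu + p_rash_given_measles * p_measles
p_flu_given_rash = p_rash_given_flu * p_flu / p_rash
print(round(p_flu_given_rash, 3))  # 0.431
```

Despite flu's 90% prior, the child is slightly more likely to have measles, because rashes are so much more likely under measles.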
Apply Bayes' rule to the following:
From a standard deck of playing cards, a single card is drawn. The probability that the card is a king is 4/52. Calculate the posterior probability P(King | Face), i.e. the probability that the drawn card is a king given that it is a face card.
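A worked check, using the fact that a standard deck has 12 face cards (J, Q, K in each of 4 suits) and that every king is a face card:

```python
# P(King | Face) = P(Face | King) * P(King) / P(Face)
p_king = 4 / 52            # 4 kings in a 52-card deck
p_face = 12 / 52           # 12 face cards (J, Q, K in each of 4 suits)
p_face_given_king = 1.0    # every king is a face card

p_king_given_face = p_face_given_king * p_king / p_face
print(round(p_king_given_face, 4))  # 0.3333
```

The answer is 1/3, as expected: one of the three face ranks is king.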
naïve Bayes models
• The naïve Bayes algorithm is a supervised machine-learning algorithm based on Bayes' theorem, used mainly for classification problems.
• It is a probabilistic classifier: it predicts on the basis of the probability of an object, assuming the features are conditionally independent given the class. Popular applications of the naïve Bayes algorithm include spam filtering, sentiment analysis, and classifying articles.
Naïve Bayes models example problem

Given the data for symptoms and whether the patient has flu or not, classify the following:
x = (chills = Y, runny nose = N, headache = mild, fever = Y)

Priors:  P(Flu = Yes) = 5/8,  P(Flu = No) = 3/8

CHILLS        Flu = Yes   Flu = No
Yes           3/4         1/4
No            2/4         2/4

RUNNY NOSE    Flu = Yes   Flu = No
Yes           4/5         1/5
No            1/3         2/3

HEADACHE      Flu = Yes   Flu = No
Strong        2/3         1/3
Mild          2/3         1/3
No            1/2         1/2

FEVER         Flu = Yes   Flu = No
Yes           4/5         1/5
No            1/3         2/3
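Under the naïve (conditional-independence) assumption, each class score is the prior times the per-symptom likelihoods read off the tables above. A minimal sketch (the dictionary layout and names are ours):

```python
# Naive Bayes: score(class) = P(class) * product of P(symptom | class)
prior = {"Yes": 5/8, "No": 3/8}
likelihood = {
    "Yes": {"chills=Y": 3/4, "runny_nose=N": 1/3, "headache=mild": 2/3, "fever=Y": 4/5},
    "No":  {"chills=Y": 1/4, "runny_nose=N": 2/3, "headache=mild": 1/3, "fever=Y": 1/5},
}

scores = {}
for c in prior:
    s = prior[c]
    for p in likelihood[c].values():
        s *= p            # multiply in each symptom's likelihood
    scores[c] = s

total = sum(scores.values())
posterior = {c: s / total for c, s in scores.items()}
print(max(posterior, key=posterior.get), round(posterior["Yes"], 3))  # Yes 0.952
```

The unnormalized scores are 1/12 for Yes and 1/240 for No, so x is classified as Flu = Yes with posterior 20/21 ≈ 0.952.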
Pros of Naïve Bayes
• It is easy and fast to predict the class of a test data set.
• The naïve Bayes classifier can perform better than other models such as logistic regression, and it needs less training data.
• It performs well with categorical input variables compared to numerical variables.
• Highly scalable: it scales linearly with the number of predictors and data points.
• Handles both continuous and discrete data.
• Not sensitive to irrelevant features.
Bayesian network
• A Bayesian network is a graph with:
• A set of random variables as nodes
• A set of directed links connecting pairs of nodes
• A conditional probability table for each node that quantifies the effect the parents have on the node
• No directed cycles (the graph is a directed acyclic graph, DAG)
A Bayesian Network
A Bayesian network is made up of:
1. A Directed Acyclic Graph

[Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]

2. A set of tables for each node in the graph

A      P(A)        A      B      P(B|A)
false  0.6         false  false  0.01
true   0.4         false  true   0.99
                   true   false  0.7
                   true   true   0.3

B      C      P(C|B)      B      D      P(D|B)
false  false  0.4         false  false  0.02
false  true   0.6         false  true   0.98
true   false  0.9         true   false  0.05
true   true   0.1         true   true   0.95
A Directed Acyclic Graph
Each node in the graph is a random variable.

A node X is a parent of another node Y if there is an arrow from node X to node Y, e.g. A is a parent of B.

[Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]

Informally, an arrow from node X to node Y means X has a direct influence on Y.

Weng-Keen Wong, Oregon State University ©2005
A Set of Tables for Each Node
Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node. The parameters are the probabilities in these conditional probability tables (CPTs).

[Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]

A      P(A)        A      B      P(B|A)
false  0.6         false  false  0.01
true   0.4         false  true   0.99
                   true   false  0.7
                   true   true   0.3

B      C      P(C|B)      B      D      P(D|B)
false  false  0.4         false  false  0.02
false  true   0.6         false  true   0.98
true   false  0.9         true   false  0.05
true   true   0.1         true   true   0.95
A Set of Tables for Each Node
Conditional probability distribution for C given B:

B      C      P(C|B)
false  false  0.4
false  true   0.6
true   false  0.9
true   true   0.1

For a given combination of values of the parents (B in this example), the entries P(C = true | B) and P(C = false | B) must add up to 1,
e.g. P(C = true | B = false) + P(C = false | B = false) = 1.

If you have a Boolean variable with k Boolean parents, its table has 2^(k+1) probabilities (but only 2^k of them need to be stored).
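The 2^(k+1) versus 2^k count can be illustrated by enumerating the parent-value combinations (a sketch; `cpt_sizes` is our name):

```python
from itertools import product

def cpt_sizes(k):
    """For a Boolean node with k Boolean parents: one row per parent
    combination; each row holds P(X=true | ...) and P(X=false | ...),
    but only one of the pair needs storing since they sum to 1."""
    rows = list(product([False, True], repeat=k))   # 2**k parent combinations
    return 2 * len(rows), len(rows)                 # (all probabilities, stored)

print(cpt_sizes(1))  # (4, 2) -- matches the P(C|B) table above
```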
Bayesian Networks
Two important properties:
1. Encodes the conditional independence relationships between the variables in the graph structure
2. Is a compact representation of the joint probability distribution over the variables
Conditional Independence
The Markov condition: given its parents (P1, P2), a node (X) is conditionally independent of its non-descendants (ND1, ND2).

[Figure: node X with parents P1 and P2, non-descendants ND1 and ND2, and children C1 and C2]
Using a Bayesian Network Example
Using the network in the example, suppose you want to calculate:
P(A = true, B = true, C = true, D = true)
= P(A = true) * P(B = true | A = true) * P(C = true | B = true) * P(D = true | B = true)
= (0.4) * (0.3) * (0.1) * (0.95)
= 0.0114

[Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]
Using a Bayesian Network Example
In the calculation
P(A = true, B = true, C = true, D = true)
= P(A = true) * P(B = true | A = true) * P(C = true | B = true) * P(D = true | B = true)
= (0.4) * (0.3) * (0.1) * (0.95)
the factorization comes from the graph structure, while the numbers come from the conditional probability tables.

[Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]
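The product above can be evaluated directly (a sketch; the variable names are ours):

```python
# Chain-rule factorization read off the graph; numbers from the CPTs.
p_a = 0.4            # P(A=true)
p_b_given_a = 0.3    # P(B=true | A=true)
p_c_given_b = 0.1    # P(C=true | B=true)
p_d_given_b = 0.95   # P(D=true | B=true)

joint = p_a * p_b_given_a * p_c_given_b * p_d_given_b
print(round(joint, 4))  # 0.0114
```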
Inference
• Using a Bayesian network to compute probabilities is called inference
• In general, inference involves queries of the form:
P( X | E )
where X = the query variable(s) and E = the evidence variable(s)
Inference
[Figure: network with node HasAnthrax as parent of HasCough, HasFever, HasDifficultyBreathing, and HasWideMediastinum]

• An example of a query would be:
P( HasAnthrax = true | HasFever = true, HasCough = true)
• Note: even though HasDifficultyBreathing and HasWideMediastinum are in the Bayesian network, they are not given values in the query (i.e. they do not appear either as query variables or evidence variables)
• They are treated as unobserved variables and summed out.
Inference Example
Suppose we know that A = true. Which is more probable: C = true or D = true?
For this we need to compute P(C = true | A = true) and P(D = true | A = true). Let us compute the first one.

[Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]

A      P(A)        A      B      P(B|A)
false  0.6         false  false  0.01
true   0.4         false  true   0.99
                   true   false  0.7
                   true   true   0.3

B      C      P(C|B)      B      D      P(D|B)
false  false  0.4         false  false  0.02
false  true   0.6         false  true   0.98
true   false  0.9         true   false  0.05
true   true   0.1         true   true   0.95
What is P(A = true)?
Directly from the P(A) table: P(A = true) = 0.4.

What is P(C = true, A = true)?
Sum out B:
P(C = true, A = true)
= P(A = true) * [ P(B = true | A = true) * P(C = true | B = true)
                + P(B = false | A = true) * P(C = true | B = false) ]
= 0.4 * (0.3 * 0.1 + 0.7 * 0.6)
= 0.4 * 0.45 = 0.18

[Figure: DAG with nodes A, B, C, D and edges A → B, B → C, B → D]

A      P(A)        A      B      P(B|A)
false  0.6         false  false  0.01
true   0.4         false  true   0.99
                   true   false  0.7
                   true   true   0.3

B      C      P(C|B)      B      D      P(D|B)
false  false  0.4         false  false  0.02
false  true   0.6         false  true   0.98
true   false  0.9         true   false  0.05
true   true   0.1         true   true   0.95
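The inference example can be finished by summing over B's two values for both queries (a sketch; the dictionaries hold the CPT entries from the tables above):

```python
# P(C=true | A=true) and P(D=true | A=true), summing over B.
p_b_given_a = {True: 0.3, False: 0.7}     # P(B | A=true)
p_c_given_b = {True: 0.1, False: 0.6}     # P(C=true | B)
p_d_given_b = {True: 0.95, False: 0.98}   # P(D=true | B)

p_c = sum(p_b_given_a[b] * p_c_given_b[b] for b in (True, False))
p_d = sum(p_b_given_a[b] * p_d_given_b[b] for b in (True, False))
print(round(p_c, 3), round(p_d, 3))  # 0.45 0.971
```

So P(D = true | A = true) = 0.971 is much larger than P(C = true | A = true) = 0.45: D = true is the more probable event.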
