
INT404:

ARTIFICIAL
INTELLIGENCE
UNIT IV
Statistical reasoning:
Probability & Bayes' theorem,
Bayesian networks,
Dempster-Shafer theory,
Certainty factors & rule-based systems

Weak slot and filler structures: Semantic nets, Frames

Strong slot and filler structures: Conceptual dependency, Scripts


Dempster Shafer Theory
Dempster-Shafer Theory (DST) was introduced by Arthur P. Dempster in 1967 and extended by his student Glenn Shafer in 1976.

The theory was proposed to address the following limitations of Bayesian reasoning:


◦ Bayesian theory is concerned only with single pieces of evidence.
◦ Bayesian probability cannot represent ignorance.

DST is a theory of evidence: it combines all possible outcomes of a problem, and is therefore used for problems where different pieces of evidence may lead to different results.
Dempster Shafer Theory
Uncertainty in this model is handled as follows:
1. Consider all possible outcomes.
2. Belief supports a possibility by bringing in evidence for it.
3. Plausibility measures how compatible the evidence is with each possible outcome.


Dempster Shafer Theory
In the murder example worked out below, measures of belief and plausibility over the available evidence identify the murderer. Using that example, we can say:

Set of possible conclusions (P): {p1, p2, …, pn}

where P is the set of possible conclusions and must be exhaustive,
i.e. at least one pi must be true.
The pi must be mutually exclusive.
The power set contains 2^n elements, where n is the number of elements in P.
For example:
If P = {a, b, c}, the power set is
{∅, {a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c}} = 2^3 = 8 elements.
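As a quick sanity check, the power set can be enumerated with a short Python sketch (the `power_set` helper is illustrative, not from any standard DST library):

```python
from itertools import chain, combinations

def power_set(elements):
    """Enumerate every subset of `elements`, from the empty set upward."""
    s = list(elements)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

# P = {a, b, c} yields 2^3 = 8 subsets, including the empty set.
subsets = power_set({"a", "b", "c"})
print(len(subsets))  # 8
```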
Dempster Shafer Theory
Mass function m(K): the mass assigned to a set, e.g. m({K or B}), represents evidence for {K or B} that cannot be divided among the more specific beliefs K and B.

Belief in K: the belief in an element K of the power set is the sum of the masses of the elements that are subsets of K.
This can be explained through an example. Let K = {a, b, c}:
Bel(K) = m(a) + m(b) + m(c) + m(a, b) + m(a, c) + m(b, c) + m(a, b, c)

Plausibility of K: the sum of the masses of all sets that intersect K, e.g.
Pl({a}) = m(a) + m(a, b) + m(a, c) + m(a, b, c)
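Both definitions translate directly into code. Below is a minimal sketch, with a mass assignment stored as a dict keyed by frozenset; the mass values are made-up numbers used only for illustration:

```python
def bel(m, K):
    """Belief: total mass of the non-empty subsets of K."""
    return sum(v for A, v in m.items() if A and A <= K)

def pl(m, K):
    """Plausibility: total mass of the sets that intersect K."""
    return sum(v for A, v in m.items() if A & K)

# Hypothetical masses over P = {a, b, c}, summing to 1.
m = {frozenset("a"): 0.3, frozenset("ab"): 0.5, frozenset("abc"): 0.2}
print(round(bel(m, frozenset("ab")), 2))  # 0.8 = m(a) + m(a,b)
print(round(pl(m, frozenset("c")), 2))    # 0.2: only {a,b,c} meets {c}
```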
Characteristics of Dempster Shafer Theory
• Ignorance is reduced in this theory by adding more and more evidence.

• A combination rule is used to combine various types of possibilities.


Dempster Shafer Theory - Example
Let us consider a room with four people: B, J, S and K. Suddenly the lights go out, and when they come back on, K has been stabbed in the back with a knife, killing him. No one entered or left the room, and we know that K did not commit suicide. We now have to find the murderer.
The possibilities are:
• Either {B}, {J} or {S} killed him.
• Either {B, J}, {J, S} or {B, S} killed him.
• All three of them killed him: {B, J, S}.
• None of them killed him: ∅.
Dempster Shafer Theory - Example
Detectives, after reviewing the crime scene, assign mass probabilities to the elements of the power set:

Event                        Mass
No one is guilty (∅)         0
B is guilty                  0.1
J is guilty                  0.2
S is guilty                  0.1
Either B or J is guilty      0.1
Either J or S is guilty      0.3
Either B or S is guilty      0.1
One of the three is guilty   0.1

If P = {B, J, S}, the power set is
{∅, {B}, {J}, {S}, {B, J}, {J, S}, {S, B}, {B, J, S}} = 2^3 = 8 elements.
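The detectives' table can be captured as a Python dict keyed by frozenset (a sketch; the empty set, with mass 0, is simply omitted). A valid mass assignment must sum to 1 over the power set:

```python
# Mass assignment from the detectives' table.
m = {
    frozenset("B"): 0.1, frozenset("J"): 0.2, frozenset("S"): 0.1,
    frozenset("BJ"): 0.1, frozenset("JS"): 0.3, frozenset("BS"): 0.1,
    frozenset("BJS"): 0.1,
}
# The seven masses above total exactly 1.
total = sum(m.values())
print(round(total, 2))  # 1.0
```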
Dempster Shafer Theory - Example
Given the mass assignments made by the detectives:

K     {B}  {J}  {S}  {B,J} {J,S} {B,S} {B,J,S}
m(K)  0.1  0.2  0.1  0.1   0.3   0.1   0.1

Recall that the mass m(K) represents evidence for K that cannot be divided among its more specific subsets. The masses sum to 1:
m(B) + m(J) + m(S) + m(B,J) + m(J,S) + m(B,S) + m(B,J,S) = 1

Dempster Shafer Theory - Example


Belief in K: the belief in an element K of the power set is the sum of the masses of the elements that are subsets of K.
E.g., given P = {B, J, S}, for the singletons:
Bel(B) = m(B) = 0.1
Bel(J) = m(J) = 0.2
Bel(S) = m(S) = 0.1

K      {B}  {J}  {S}  {B,J} {J,S} {B,S} {B,J,S}
m(K)   0.1  0.2  0.1  0.1   0.3   0.1   0.1
Bel(K) 0.1  0.2  0.1  0.4   0.6   0.3   1.0

Dempster Shafer Theory - Example


Bel(B,J) = m(B) + m(J) + m(B,J) = 0.1 + 0.2 + 0.1 = 0.4
Bel(J,S) = m(J) + m(S) + m(J,S) = 0.2 + 0.1 + 0.3 = 0.6
Bel(B,S) = m(B) + m(S) + m(B,S) = 0.1 + 0.1 + 0.1 = 0.3
Bel(B,J,S) = m(B) + m(J) + m(S) + m(B,J) + m(J,S) + m(B,S) + m(B,J,S)
           = 0.1 + 0.2 + 0.1 + 0.1 + 0.3 + 0.1 + 0.1 = 1.0

K      {B}  {J}  {S}  {B,J} {J,S} {B,S} {B,J,S}
m(K)   0.1  0.2  0.1  0.1   0.3   0.1   0.1
Bel(K) 0.1  0.2  0.1  0.4   0.6   0.3   1.0
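These belief sums can be reproduced mechanically. A plain-Python sketch (dict keyed by frozenset, nothing library-specific) that recomputes the Bel(K) row:

```python
# The detectives' mass assignment.
m = {
    frozenset("B"): 0.1, frozenset("J"): 0.2, frozenset("S"): 0.1,
    frozenset("BJ"): 0.1, frozenset("JS"): 0.3, frozenset("BS"): 0.1,
    frozenset("BJS"): 0.1,
}

def bel(m, K):
    """Belief: total mass of all subsets of K."""
    return sum(v for A, v in m.items() if A <= K)

for name in ("B", "J", "S", "BJ", "JS", "BS", "BJS"):
    print(name, round(bel(m, frozenset(name)), 2))
# Matches the Bel(K) row: 0.1 0.2 0.1 0.4 0.6 0.3 1.0
```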

Dempster Shafer Theory - Example


Plausibility of K: the plausibility Pl(K) is the sum of the masses of all the sets that intersect K:
Pl(B) = m(B) + m(B,J) + m(B,S) + m(B,J,S) = 0.1 + 0.1 + 0.1 + 0.1 = 0.4
Pl(J) = m(J) + m(B,J) + m(J,S) + m(B,J,S) = 0.2 + 0.1 + 0.3 + 0.1 = 0.7
Pl(S) = m(S) + m(J,S) + m(B,S) + m(B,J,S) = 0.1 + 0.3 + 0.1 + 0.1 = 0.6

K     {B}  {J}  {S}  {B,J} {J,S} {B,S} {B,J,S}
m(K)  0.1  0.2  0.1  0.1   0.3   0.1   0.1
Pl(K) 0.4  0.7  0.6  0.9   0.9   0.8   1.0

Dempster Shafer Theory - Example


Pl(B,J) = m(B) + m(J) + m(B,J) + m(J,S) + m(B,S) + m(B,J,S) = 0.1 + 0.2 + 0.1 + 0.3 + 0.1 + 0.1 = 0.9

Pl(J,S) = m(J) + m(S) + m(B,J) + m(J,S) + m(B,S) + m(B,J,S) = 0.2 + 0.1 + 0.1 + 0.3 + 0.1 + 0.1 = 0.9

Pl(B,S) = m(B) + m(S) + m(B,J) + m(J,S) + m(B,S) + m(B,J,S) = 0.1 + 0.1 + 0.1 + 0.3 + 0.1 + 0.1 = 0.8

Pl(B,J,S) = m(B) + m(J) + m(S) + m(B,J) + m(J,S) + m(B,S) + m(B,J,S) = 0.1 + 0.2 + 0.1 + 0.1 + 0.3 + 0.1 + 0.1 = 1.0

K     {B}  {J}  {S}  {B,J} {J,S} {B,S} {B,J,S}
m(K)  0.1  0.2  0.1  0.1   0.3   0.1   0.1
Pl(K) 0.4  0.7  0.6  0.9   0.9   0.8   1.0
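Likewise, the whole Pl(K) row follows from one small function over the same mass dict (a sketch, not library code):

```python
# The detectives' mass assignment.
m = {
    frozenset("B"): 0.1, frozenset("J"): 0.2, frozenset("S"): 0.1,
    frozenset("BJ"): 0.1, frozenset("JS"): 0.3, frozenset("BS"): 0.1,
    frozenset("BJS"): 0.1,
}

def pl(m, K):
    """Plausibility: total mass of all sets that intersect K."""
    return sum(v for A, v in m.items() if A & K)

for name in ("B", "J", "S", "BJ", "JS", "BS", "BJS"):
    print(name, round(pl(m, frozenset(name)), 2))
# Matches the Pl(K) row: 0.4 0.7 0.6 0.9 0.9 0.8 1.0
```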
Dempster Shafer Theory - Example
The certainty associated with a given subset K is defined by the belief interval:
[Bel(K), Pl(K)]
E.g., the belief interval of {B,S} is [0.3, 0.8].
The probability of K falls somewhere between Bel(K) and Pl(K):
- Bel(K) represents the evidence we have directly for K, so Prob(K) cannot be less than this value.
- Pl(K) represents the maximum share of the evidence we could possibly have: for every set that intersects K, the intersecting part might turn out to be valid, so Pl(K) is the maximum possible value of Prob(K).

Belief ≤ Plausibility:
K      {B}  {J}  {S}  {B,J} {J,S} {B,S} {B,J,S}
m(K)   0.1  0.2  0.1  0.1   0.3   0.1   0.1
Bel(K) 0.1  0.2  0.1  0.4   0.6   0.3   1.0
Pl(K)  0.4  0.7  0.6  0.9   0.9   0.8   1.0
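Combining the two measures gives the belief interval for any subset. A self-contained sketch for {B,S}:

```python
# The detectives' mass assignment.
m = {
    frozenset("B"): 0.1, frozenset("J"): 0.2, frozenset("S"): 0.1,
    frozenset("BJ"): 0.1, frozenset("JS"): 0.3, frozenset("BS"): 0.1,
    frozenset("BJS"): 0.1,
}

def bel(m, K):
    """Belief: total mass of all subsets of K."""
    return sum(v for A, v in m.items() if A <= K)

def pl(m, K):
    """Plausibility: total mass of all sets that intersect K."""
    return sum(v for A, v in m.items() if A & K)

# The probability of {B,S} lies inside [Bel, Pl].
K = frozenset("BS")
interval = (round(bel(m, K), 2), round(pl(m, K), 2))
print(interval)  # (0.3, 0.8)
```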
Dempster Shafer Theory - Example
Belief intervals allow Dempster-Shafer Theory to reason about the degree of certainty of our beliefs:

• A small difference between belief and plausibility shows that we are certain about our belief.
• A large difference shows that we are uncertain about our belief.
Dempster Shafer Theory
Advantages:
• As we add more information, the uncertainty interval shrinks.
• DST has a much lower level of ignorance.
• Diagnostic hierarchies can be represented using it.
• A person dealing with such problems is free to reason about the evidence.

Disadvantages:
• The computational effort is high, as we have to deal with 2^n sets.
Dempster Shafer Theory - Example
The disbelief in K, dis(K), can also be written Bel(¬K):
it is calculated by summing the masses of all elements that do not intersect K.

The plausibility of K is 1 minus the disbelief:
Pl(K) = 1 - dis(K)
so
dis(K) = 1 - Pl(K)
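The identity dis(K) = 1 - Pl(K) can be checked numerically: summing the masses of the sets disjoint from K gives the same value as 1 minus the plausibility. A sketch using the murder example's masses:

```python
# The detectives' mass assignment.
m = {
    frozenset("B"): 0.1, frozenset("J"): 0.2, frozenset("S"): 0.1,
    frozenset("BJ"): 0.1, frozenset("JS"): 0.3, frozenset("BS"): 0.1,
    frozenset("BJS"): 0.1,
}

def pl(m, K):
    """Plausibility: total mass of all sets that intersect K."""
    return sum(v for A, v in m.items() if A & K)

def dis(m, K):
    """Disbelief: total mass of the sets that do not intersect K."""
    return sum(v for A, v in m.items() if not (A & K))

K = frozenset("B")
print(round(dis(m, K), 2))     # 0.6 = m(J) + m(S) + m(J,S)
print(round(1 - pl(m, K), 2))  # 0.6, the same value
```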

You might also like