
UNIT-VI

Syllabus:

Uncertainty measure: probability theory: Introduction, probability theory, Bayesian belief
networks, certainty factor theory, Dempster-Shafer theory. Fuzzy sets and fuzzy logic:
Introduction, fuzzy sets, fuzzy set operations, types of membership functions, multi-valued logic,
fuzzy logic, linguistic variables and hedges, fuzzy propositions, inference rules for fuzzy
propositions, fuzzy systems.

UNCERTAINTY MEASURE

Uncertainty as used here means the range of possible values within which the true value
of the measurement lies. This definition changes the usage of some other commonly used terms.
For example, the term accuracy is often used to mean the difference between a measured result
and the actual or true value.

A certainty factor (CF) is a numerical value that expresses a degree of subjective belief that
a particular item is true. The item may be a fact or a rule. When probabilities are used, attention
must be paid to the underlying assumptions and probability distributions in order to show
validity.

PROBABILITY THEORY

Probability theory is the mathematical framework that allows us to analyze chance events
in a logically sound manner. ... If we assign numbers to the outcomes, say 1 for heads and 0 for tails,
then we have created the mathematical object known as a random variable. Agents do not
encounter generic events but have to make a decision based on uncertainty about the particular
circumstances they face. Probability theory can be defined as the study of how knowledge affects
belief. Belief in some proposition, α, can be measured in terms of a number between 0 and 1.

The theory of probability provides the means to rationally model, analyze and solve
problems where future events cannot be foreseen with certitude. ... Thus, probability theory is
indispensable for rational decision making.

There are three basic rules associated with probability: the addition, multiplication, and
complement rules. The addition rule is used to calculate the probability of event A or event B
happening; we express it as: P(A or B) = P(A) + P(B) - P(A and B).
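
The three rules can be checked numerically on a small example. The sketch below assumes a fair six-sided die (an illustration that is not part of these notes), with A = "the roll is even" and B = "the roll is greater than 3". The helper names prob and cond_prob are hypothetical, and the multiplication rule is written in its general form P(A and B) = P(A) * P(B | A).

```python
from fractions import Fraction

# Assumed sample space: one roll of a fair six-sided die.
OUTCOMES = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & OUTCOMES), len(OUTCOMES))

def cond_prob(event, given):
    """Conditional probability P(event | given)."""
    return prob(event & given) / prob(given)

A = {2, 4, 6}  # the roll is even
B = {4, 5, 6}  # the roll is greater than 3

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_or = prob(A) + prob(B) - prob(A & B)

# Multiplication rule: P(A and B) = P(A) * P(B | A)
p_and = prob(A) * cond_prob(B, A)

# Complement rule: P(not A) = 1 - P(A)
p_not_a = 1 - prob(A)

print(p_or, p_and, p_not_a)
```

Exact fractions are used instead of floats so that each rule can be compared with a direct count of outcomes without rounding error.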

BAYESIAN BELIEF NETWORKS

A Bayesian network, Bayes network, belief network, decision network, Bayes(ian) model
or probabilistic directed acyclic graphical model is a probabilistic graphical model (a type of
statistical model) that represents a set of variables and their conditional dependencies via a
directed acyclic graph (DAG). Bayesian networks are ideal for taking an event that occurred and
predicting the likelihood that any one of several possible known causes was the contributing
factor. For example, a Bayesian network could represent the probabilistic relationships between
diseases and symptoms. Given symptoms, the network can be used to compute the probabilities
of the presence of various diseases.

Efficient algorithms can perform inference and learning in Bayesian networks. Bayesian
networks that model sequences of variables (e.g. speech signals or protein sequences) are called
dynamic Bayesian networks. Generalizations of Bayesian networks that can represent and solve
decision problems under uncertainty are called influence diagrams.

Example: Imagine you have a dog that really enjoys barking at the window whenever it’s
raining outside. Not necessarily every time, but still quite frequently. You also own a sensitive
cat that hides under the couch whenever the dog starts barking. Again, not always, but she tends
to do it often.

The reason I’m emphasizing the uncertainty of your pets’ actions is that most real-world
relationships between events are probabilistic. You rarely observe straightforward links like “If
X happens, Y happens with complete certainty”. To continue the example above, if you’re
outside your house and it starts raining, there will be a high probability that the dog will start
barking. This, in turn, will increase the probability that the cat will hide under the couch. You see
how information about one event (rain) allows you to make inferences about a seemingly
unrelated event (the cat hiding under the couch). You can also make the inverse inference. If you
see the cat hiding under the couch, this will increase the probability that the dog is currently
barking. And that, in turn, will increase the probability that it’s currently raining. Bayesian
networks are very convenient for representing similar probabilistic relationships between
multiple events.
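
The rain, dog and cat example can be sketched as a three-node chain (rain → bark → hide). All probabilities below are hypothetical numbers chosen only for illustration, and the function names are assumptions. Inference is done by brute-force enumeration of the joint distribution, which is fine for three variables but is not one of the efficient algorithms mentioned above.

```python
from itertools import product

# Hypothetical numbers chosen only for illustration.
P_RAIN = 0.2
P_BARK_GIVEN_RAIN = {True: 0.8, False: 0.1}    # P(bark | rain)
P_HIDE_GIVEN_BARK = {True: 0.7, False: 0.05}   # P(hide | bark)

def bernoulli(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true

def joint(rain, bark, hide):
    """Chain-rule factorisation: P(rain) * P(bark | rain) * P(hide | bark)."""
    return (bernoulli(P_RAIN, rain)
            * bernoulli(P_BARK_GIVEN_RAIN[rain], bark)
            * bernoulli(P_HIDE_GIVEN_BARK[bark], hide))

def query(target, evidence):
    """P(target is True | evidence), summing the joint over all consistent worlds."""
    numerator = denominator = 0.0
    for rain, bark, hide in product([True, False], repeat=3):
        world = {"rain": rain, "bark": bark, "hide": hide}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(rain, bark, hide)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator

# Forward inference: rain makes the cat more likely to hide.
p_hide_given_rain = query("hide", {"rain": True})
# Inverse inference: a hiding cat makes rain more likely than its prior.
p_rain_given_hide = query("rain", {"hide": True})
print(p_hide_given_rain, p_rain_given_hide)
```

With these assumed numbers, observing the cat under the couch raises the probability of rain above its prior of 0.2, which is the inverse inference described in the example.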

CERTAINTY FACTOR THEORY

Certainty factor theory is an alternative to Bayesian reasoning when reliable statistical
information is not available or the independence of evidence cannot be assumed. It introduces
a certainty factor calculus based on the heuristics of human experts.

Certainty factors are a compromise. The good news is that a system based on rules with
certainty factors requires the expert to come up with only a small set of numbers (one for each
rule) and will allow fast computation of answers. The bad news is that the answer computed may
lead to irrational decisions.
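
These notes do not spell out the calculus itself. As a hedged illustration, the sketch below uses a commonly cited form of the MYCIN-style rule for combining two certainty factors, each in the range -1 to 1, that bear on the same hypothesis. The function name combine_cf is an assumption.

```python
def combine_cf(cf1, cf2):
    """Combine two certainty factors in [-1, 1] for the same hypothesis.

    Commonly cited MYCIN-style rule (an assumption, not taken from these notes).
    Undefined when one factor is +1 and the other is -1 (division by zero).
    """
    if cf1 >= 0 and cf2 >= 0:
        # Two supporting pieces of evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two opposing pieces of evidence reinforce the disbelief.
        return cf1 + cf2 * (1 + cf1)
    # Mixed evidence partially cancels out.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))

supporting = combine_cf(0.6, 0.4)    # about 0.76
conflicting = combine_cf(0.6, -0.4)  # about 0.33
print(supporting, conflicting)
```

Each rule contributes a single number and the combination is one arithmetic step, which is the fast computation referred to above.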

Certainty factors have been justified by their performance (Mycin performed as well or
better than expert doctors) and by intuitive appeal. However, they are subject to paradoxes where
they compute bizarre results. If the rules that make up the knowledge base are designed in a
modular fashion, then problems usually do not arise, but it is certainly worrisome that the
answers may be untrustworthy.

Before Mycin, most reasoning with uncertainty was done using probability theory. The
laws of probability—in particular, Bayes’s law—provide a well-founded mathematical
formalism that is not subject to the inconsistencies of certainty factors. Indeed, probability theory
can be shown to be the only formalism that leads to rational behavior, in the sense that if you
have to make a series of bets on some uncertain events, combining information with probability
theory will give you the highest expected value for your bets. Despite this, probability theory
was largely set aside in the mid-1970s. The argument made by Shortliffe and Buchanan (1975)
was that probability theory required too many conditional probabilities, and that people were not
good at estimating these. They argued that certainty factors were intuitively easier to deal with.
Other researchers of the time shared this view. Dempster, with later refinements by Shafer,
created a theory of belief functions that, like certainty factors, represented a combination of the
belief for and against an event. Instead of representing an event by a single probability or
certainty, Dempster-Shafer theory maintains two numbers, which are analogous to the lower and
upper bounds on the probability: an interval rather than a single number like 0.5.

DEMPSTER-SHAFER THEORY

The theory of belief functions, also referred to as evidence theory or Dempster–Shafer
theory (DST), is a general framework for reasoning with uncertainty, with understood
connections to other frameworks such as probability, possibility and imprecise probability
theories. First introduced by Arthur P. Dempster in the context of statistical inference, the theory
was later developed by Glenn Shafer into a general framework for modeling epistemic
uncertainty—a mathematical theory of evidence. The theory allows one to combine evidence
from different sources and arrive at a degree of belief (represented by a mathematical object
called belief function) that takes into account all the available evidence.

In a narrow sense, the term Dempster–Shafer theory refers to the original conception of
the theory by Dempster and Shafer. However, it is more common to use the term in the wider
sense of the same general approach, as adapted to specific kinds of situations. In particular, many
authors have proposed different rules for combining evidence, often with a view to handling
conflicts in evidence better. The early contributions have also been the starting points of many
important developments, including the transferable belief model and the theory of hints.

Dempster–Shafer theory is a generalization of the Bayesian theory of subjective
probability. Belief functions base degrees of belief (or confidence, or trust) for one question on
the probabilities for a related question. The degrees of belief themselves may or may not have
the mathematical properties of probabilities; how much they differ depends on how closely the
two questions are related. Put another way, it is a way of representing epistemic plausibilities but
it can yield answers that contradict those arrived at using probability theory.

Often used as a method of sensor fusion, Dempster–Shafer theory is based on two ideas:
obtaining degrees of belief for one question from subjective probabilities for a related question,
and Dempster's rule for combining such degrees of belief when they are based on independent
items of evidence. In essence, the degree of belief in a proposition depends primarily upon the
number of answers (to the related questions) containing the proposition, and the subjective
probability of each answer. Also contributing are the rules of combination that reflect general
assumptions about the data. In this formalism a degree of belief (also referred to as a mass) is
represented as a belief function rather than a Bayesian probability distribution. Probability values
are assigned to sets of possibilities rather than single events: their appeal rests on the fact they
naturally encode evidence in favor of propositions.

Dempster–Shafer theory assigns its masses to all of the subsets of the propositions that
compose a system—in set-theoretic terms, the power set of the propositions. For instance,
assume a situation where there are two related questions, or propositions, in a system. In this
system, any belief function assigns mass to the first proposition, the second, both or neither.
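
Dempster's rule of combination and the resulting lower and upper bounds (belief and plausibility) can be sketched as follows. The frame {flu, cold} and the mass values are hypothetical, and the function names are assumptions; a mass function is represented as a dictionary from frozenset to mass.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset (a subset of the frame) to its mass.
    Mass falling on the empty intersection is treated as conflict and
    removed by normalisation.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

def belief(m, hypothesis):
    """Lower bound: total mass of the subsets contained in the hypothesis."""
    return sum(w for s, w in m.items() if s <= hypothesis)

def plausibility(m, hypothesis):
    """Upper bound: total mass of the subsets that intersect the hypothesis."""
    return sum(w for s, w in m.items() if s & hypothesis)

# Hypothetical frame and evidence from two independent sources.
FLU, COLD = frozenset({"flu"}), frozenset({"cold"})
EITHER = FLU | COLD
m1 = {FLU: 0.6, EITHER: 0.4}    # source 1 mostly supports flu
m2 = {COLD: 0.3, EITHER: 0.7}   # source 2 weakly supports cold

m12 = combine(m1, m2)
print(belief(m12, FLU), plausibility(m12, FLU))
```

Mass left on the whole frame represents ignorance, which is why belief and plausibility differ and form an interval around the unknown probability.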

FUZZY SETS, FUZZY LOGIC, SET OPERATIONS

Fuzzy sets can be considered an extension of classical sets. A fuzzy set is best understood in
the context of set membership: it allows partial membership, which means that it contains
elements that have varying degrees of membership in the set. From this, we can understand the
difference between a classical set and a fuzzy set. A classical set contains elements that satisfy
precise properties of membership, while a fuzzy set contains elements that satisfy imprecise
properties of membership.

Representation of fuzzy set

Let us now consider two cases of universe of information and understand how a fuzzy set
can be represented.

Case 1

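When the universe of information U is discrete and finite −

$$\tilde{A} = \left\{ \frac{\mu_{\tilde{A}}(y_1)}{y_1} + \frac{\mu_{\tilde{A}}(y_2)}{y_2} + \frac{\mu_{\tilde{A}}(y_3)}{y_3} + \dots \right\} = \left\{ \sum_{i=1}^{n} \frac{\mu_{\tilde{A}}(y_i)}{y_i} \right\}$$

Case 2

When the universe of information U is continuous and infinite −

$$\tilde{A} = \left\{ \int \frac{\mu_{\tilde{A}}(y)}{y} \right\}$$

In the above representations, the summation and integral symbols denote the collection of
all elements. The fraction bar pairs each element with its membership value; it is not a division.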

Operations on Fuzzy Sets

Given two fuzzy sets $\tilde{A}$ and $\tilde{B}$, the universe of information U and an element y
of the universe, the following relations express the union, intersection and complement operations
on fuzzy sets.

Union/Fuzzy ‘OR’

Let us consider the following representation to understand how the Union/Fuzzy
‘OR’ relation works −

$$\mu_{\tilde{A} \cup \tilde{B}}(y) = \mu_{\tilde{A}}(y) \vee \mu_{\tilde{B}}(y) \quad \forall y \in U$$

Here ∨ represents the ‘max’ operation.
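
Only the union relation survives in these notes. The sketch below assumes the standard definitions for the other two operations (minimum for intersection, one minus the membership value for complement) and represents a discrete fuzzy set as a dictionary from element to membership value. The element names and membership values are hypothetical.

```python
def fuzzy_union(a, b):
    """Union / fuzzy OR: pointwise maximum of the membership values."""
    return {y: max(a.get(y, 0.0), b.get(y, 0.0)) for y in a.keys() | b.keys()}

def fuzzy_intersection(a, b):
    """Intersection / fuzzy AND: pointwise minimum (standard definition, assumed here)."""
    return {y: min(a.get(y, 0.0), b.get(y, 0.0)) for y in a.keys() | b.keys()}

def fuzzy_complement(a, universe):
    """Complement / fuzzy NOT: one minus the membership value (standard definition, assumed here)."""
    return {y: 1.0 - a.get(y, 0.0) for y in universe}

# Hypothetical discrete universe and membership values.
U = {"y1", "y2", "y3"}
A = {"y1": 0.2, "y2": 0.7, "y3": 1.0}
B = {"y1": 0.5, "y2": 0.3, "y3": 0.0}

union_ab = fuzzy_union(A, B)
intersection_ab = fuzzy_intersection(A, B)
complement_a = fuzzy_complement(A, U)
print(union_ab, intersection_ab, complement_a)
```

Elements missing from a dictionary are treated as having membership 0, so the same functions also work for sets defined on only part of the universe.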

TYPES OF MEMBERSHIP FUNCTIONS

A membership function (MF) is a curve that defines how each point in the input space is
mapped to a membership value (or degree of membership) between 0 and 1.

Definition: a membership function for a fuzzy set A on the universe of discourse X is
defined as µA: X → [0,1], where each element of X is mapped to a value between 0 and 1. This
value, called the membership value or degree of membership, quantifies the grade of membership of
the element in X to the fuzzy set A.

a. Membership functions were first introduced in 1965 by Lotfi A. Zadeh in his
research paper “Fuzzy Sets”.
b. Membership functions characterize fuzziness (i.e., all the information in a fuzzy
set), whether the elements in the fuzzy set are discrete or continuous.
c. Membership functions can be defined as a technique to solve practical problems
by experience rather than knowledge.
d. Membership functions are represented by graphical forms.
e. Rules for defining fuzziness are fuzzy too.
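
The notes do not list specific shapes at this point. As an illustration, the sketch below implements three shapes that are widely used in practice (triangular, trapezoidal and Gaussian). The function names and parameter conventions are assumptions rather than definitions from these notes.

```python
import math

def triangular(x, a, b, c):
    """Rises linearly from a to the peak at b, then falls to c (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def trapezoidal(x, a, b, c, d):
    """Rises from a to b, stays at 1 between b and c, falls to d (requires a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def gaussian(x, mean, sigma):
    """Bell-shaped curve with its peak of 1 at the mean (requires sigma > 0)."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)

print(triangular(2.5, 0, 5, 10), trapezoidal(5, 0, 2, 4, 6), gaussian(1, 0, 1))
```

Every function maps any input to a value in [0, 1], which is the only requirement the definition above places on a membership function.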

Two Example Methods

MULTI VALUED LOGIC and FUZZY LOGIC

Multi-valued logics are logical calculi in which there are more than two possible truth
values. Traditionally, logical calculi are bivalent—that is, there are only two possible truth values
for any proposition, true and false (which generally correspond to our intuitive notions of truth
and falsity).

Fuzzy Logic (FL) is a method of reasoning that resembles human reasoning. The
approach of FL imitates the way of decision making in humans that involves all intermediate
possibilities between digital values YES and NO. The fuzzy logic works on the levels of
possibilities of input to achieve the definite output.

Fuzzy logic includes 0 and 1 as extreme cases of truth (or "the state of matters" or "fact")
but also includes the various states of truth in between so that, for example, the result of a
comparison between two things could be not "tall" or "short" but "0.38 of tallness".

A fuzzy control system is a control system based on fuzzy logic, a mathematical system
that analyzes analog input values in terms of logical variables that take on continuous values
between 0 and 1, in contrast to classical or digital logic, which operates on discrete values of
either 1 or 0 (true or false, respectively).

Fuzzy Knowledge Representation

The experience of the expert physician regarding the set of considered diseases D is captured in a
set of fuzzy tables, each of which specifies the profile for one disease. We consider three fuzzy
sets Yes, May Be, and No as shown in fig.1 to represent the certainty of disease presence.
Entries in the disease profile tables would be selected from these fuzzy sets.

[Figure 1: Fuzzy sets No, MayBe and Yes for the certainty of disease presence. Horizontal axis: certainty level (0 to 100); vertical axis: truth value (0 to 1).]

For a given disease there will be a set R of k ≤ n relevant features which is a subset of the
collective features set F. Table 1 shows an empty fuzzy table for the disease profile. It shows
five fuzzy values for each relevant feature ri , i=1, … , k.

LINGUISTIC VARIABLES AND HEDGES

Terms whose values are words rather than numbers are referred to as linguistic or fuzzy
variables. A linguistic hedge is an operation that modifies the meaning of a fuzzy set; hedges can
be understood as terms that modify the shapes of fuzzy sets by using adverbs such as very, quite,
more, less and slightly.

By a linguistic variable we mean a variable whose values are words or sentences in a
natural or artificial language. For example, Age is a linguistic variable if its values are linguistic
rather than numerical, i.e., young, not young, very young, quite young, old, not very old and not
very young, etc., rather than 20, 21, 22, 23, .... In more specific terms, a linguistic variable is
characterized by a quintuple (L, T(L), U, G, M) in which L is the name of the variable; T(L) is
the term-set of L, that is, the collection of its linguistic values; U is a universe of discourse; G is
a syntactic rule which generates the terms in T(L); and M is a semantic rule which associates
with each linguistic value X its meaning, M(X), where M(X) denotes a fuzzy subset of U. The
meaning of a linguistic value X is characterized by a compatibility function, c: U → [0,1], which
associates with each u in U its compatibility with X. Thus, the compatibility of age 27 with
young might be 0.7, while that of 35 might be 0.2.

The function of the semantic rule is to relate the compatibilities of the so-called primary
terms in a composite linguistic value (e.g., young and old in not very young and not very old) to
the compatibility of the composite value. To this end, hedges such as very, quite, extremely,
etc., as well as the connectives "and" and "or", are treated as nonlinear operators which modify the
meaning of their operands in a specified fashion. The concept of a linguistic variable provides a
means of approximate characterization of phenomena which are too complex or too ill-defined to
be amenable to description in conventional quantitative terms.
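
Hedges can be sketched as operators on membership values. The choices below are conventional ones and are assumptions here, not definitions from these notes: "very" squares the membership value, "more or less" takes its square root, "not" is one minus the value, and "and" is the minimum. The compatibility function young is hypothetical: a straight line through the two example values quoted above (0.7 at age 27 and 0.2 at age 35), clipped to [0, 1].

```python
import math

def young(age):
    """Hypothetical compatibility function for 'young'.

    A straight line through the two example values (0.7 at age 27,
    0.2 at age 35), clipped to the interval [0, 1].
    """
    return min(1.0, max(0.0, 1.0 - (age - 22.2) / 16.0))

def very(mu):
    """Concentration: squaring lowers every partial membership value."""
    return mu ** 2

def more_or_less(mu):
    """Dilation: the square root raises every partial membership value."""
    return math.sqrt(mu)

def fuzzy_not(mu):
    """Negation: one minus the membership value."""
    return 1.0 - mu

def fuzzy_and(mu1, mu2):
    """Conjunction: the smaller of the two membership values."""
    return min(mu1, mu2)

very_young_27 = very(young(27))                 # about 0.49
not_very_young_27 = fuzzy_not(very(young(27)))  # about 0.51
print(very_young_27, not_very_young_27)
```

Squaring and taking square roots are nonlinear, which matches the description of hedges as nonlinear operators acting on the meaning of the primary term.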

In particular, treating Truth as a linguistic variable with values such as true, very true,
completely true, not very true, untrue, etc., leads to what is called fuzzy logic. By providing a
basis for approximate reasoning, that is, a mode of reasoning which is neither exact nor very inexact,
such logic may offer a more realistic framework for human reasoning than the traditional
two-valued logic. Probabilities, too, can be treated as linguistic variables with values
such as likely, very likely, unlikely, etc. Computation with linguistic probabilities requires the
solution of nonlinear programs and leads to results which are imprecise to the same degree as the
underlying probabilities. The main applications of the linguistic approach lie in the realm of
humanistic systems, especially in the fields of artificial intelligence, linguistics, human decision
processes, pattern recognition, psychology, law, medical diagnosis, information retrieval,
economics and related areas.

FUZZY PROPOSITIONS & INFERENCE RULES

Fuzzy propositions are assigned to fuzzy sets. Suppose a fuzzy proposition P is assigned
to a fuzzy set A; then the truth value of the proposition is given by T(P) = μA(x), where 0 ≤
μA(x) ≤ 1. Therefore, the truth of a proposition P is the membership value of x in the fuzzy set A.

1. Conjunction
P ∧ Q : x is A and B
T(P ∧ Q) = min[T(P), T(Q)]
2. Negation
T(P^c) = 1 − T(P), where P^c denotes the negation of P
3. Disjunction
P ∨ Q : x is A or B
T(P ∨ Q) = max[T(P), T(Q)]
4. Implication
P → Q : if x is A then x is B
T(P → Q) = T(P^c ∨ Q) = max[T(P^c), T(Q)]
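
The four rules above translate directly into code. In the sketch below the function names are assumptions, and the truth values 0.7 and 0.3 are arbitrary examples.

```python
def t_not(p):
    """Negation: T(P^c) = 1 - T(P)."""
    return 1.0 - p

def t_and(p, q):
    """Conjunction: min[T(P), T(Q)]."""
    return min(p, q)

def t_or(p, q):
    """Disjunction: max[T(P), T(Q)]."""
    return max(p, q)

def t_implies(p, q):
    """Implication: T(P^c or Q) = max[1 - T(P), T(Q)]."""
    return t_or(t_not(p), q)

# Arbitrary example truth values for P and Q.
p, q = 0.7, 0.3
print(t_and(p, q), t_or(p, q), t_not(p), t_implies(p, q))
```

When the inputs are restricted to 0 and 1, these functions reproduce the classical truth tables, so the fuzzy rules extend two-valued logic rather than replace it.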


If P is a proposition defined on set A on universe of discourse X and
Q is another proposition defined on set B on universe of discourse Y,
then the implication P → Q can be represented by the relation R

R = (A × B) ∪ (A^c × Y), which reads "if A then B":
if x ∈ A, where x ∈ X and A ⊂ X,
then y ∈ B, where y ∈ Y and B ⊂ Y.

Implication of Classical Logic:-

Propositions P and Q are given by
P : x ∈ A, where A is defined on X.
Q : y ∈ B, where B is defined on Y.

Then the implication P → Q is represented in set theoretic form by a relation R as

R = (A × B) ∪ (A^c × Y)
The implication is equivalent to the linguistic rule form: if x ∈ A then y ∈ B.
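
For classical (crisp) sets the relation R can be built explicitly and compared with the rule it encodes. The universes X and Y and the sets A and B below are hypothetical, and rule_holds is an illustrative name.

```python
from itertools import product

# Hypothetical universes and crisp sets.
X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
A = {2, 3}   # A is a subset of X
B = {"a"}    # B is a subset of Y

# R = (A x B) U (A^c x Y), where A^c is the complement of A in X.
R = set(product(A, B)) | set(product(X - A, Y))

def rule_holds(x, y):
    """Classical reading of 'if x is in A then y is in B'."""
    return (x not in A) or (y in B)

# The pairs in R are exactly the pairs for which the rule is not violated.
pairs_satisfying_rule = {(x, y) for x, y in product(X, Y) if rule_holds(x, y)}
print(R == pairs_satisfying_rule)
```

The second term, A^c × Y, captures the fact that the rule says nothing about y when x is outside A, so every y is allowed in that case.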

For the classical predicate logic rule (P → Q) ∨ (P^c → S), the linguistic rule form is:
if x is A then y is B, else y is C,
where C is defined by
S : y is C, C ⊂ Y.

FUZZY SYSTEMS

Defining Fuzzy Sets

In mathematics a set, by definition, is a collection of things that satisfy some
definition. Any item either belongs to that set or does not belong to that set. Let us look at
an example: the set of tall men. We shall say that people taller than or equal to 6 feet are
tall. This set can be represented graphically as follows:

The function shown above describes the membership of the 'tall' set: you are either in it or you
are not in it. This sharp-edged membership function works nicely for binary operations and
mathematics, but it does not work as nicely in describing the real world. The membership
function makes no distinction between somebody who is 6'1" and someone who is 7'1"; they are
both simply tall. Clearly there is a significant difference between the two heights. The other side
of this lack of distinction is the difference between a 5'11" and a 6' man. This is only a difference
of one inch, yet this membership function just says one is tall and the other is not tall.

The fuzzy set approach to the set of tall men provides a much better representation of the tallness
of a person. The set, shown below, is defined by a continuously inclining function.

The membership function defines the fuzzy set for the possible values beneath it on the
horizontal axis. The vertical axis, on a scale of 0 to 1, provides the membership value of the
height in the fuzzy set. So for the two people shown above, the first person has a membership of
0.3 and so is not very tall. The second person has a membership of 0.95 and so he is definitely
tall. He does not, however, belong to the set of tall men in the way that bivalent sets work; he has
a high degree of membership in the fuzzy set of tall men.
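
The contrast between the two membership functions can be sketched in code. The crisp threshold of 6 feet (72 inches) comes from the text; the fuzzy ramp from 5 feet to 7 feet is a hypothetical choice, since the original figure and its exact curve are not reproduced here.

```python
def crisp_tall(height_in):
    """Bivalent membership: 1 at or above 6 feet (72 inches), otherwise 0."""
    return 1.0 if height_in >= 72 else 0.0

def fuzzy_tall(height_in, low=60.0, high=84.0):
    """Hypothetical linear ramp: 0 at 5 feet (60 in), rising to 1 at 7 feet (84 in)."""
    return min(1.0, max(0.0, (height_in - low) / (high - low)))

# Heights in inches: 5'11", 6'0", 6'1" and 7'1".
for height in (71, 72, 73, 85):
    print(height, crisp_tall(height), round(fuzzy_tall(height), 3))
```

Under the crisp function a one-inch difference around 6 feet flips the answer from 0 to 1, while under the ramp it changes the membership value only slightly.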

Defining Fuzzy Sets Mathematically

Fuzzy sets were first proposed by Lotfi A. Zadeh in his 1965 paper, aptly titled
"Fuzzy Sets". This paper laid the foundation for all fuzzy logic that followed by mathematically
defining fuzzy sets and their properties. The definition of a fuzzy set, from Zadeh's paper, is:

Let X be a space of points, with a generic element of X denoted by x.
Thus X = {x}.

A fuzzy set A in X is characterized by a membership function fA(x)
which associates with each point in X a real number in the interval
[0,1], with the values of fA(x) at x representing the "grade of
membership" of x in A. Thus, the nearer the value of fA(x) to unity,
the higher the grade of membership of x in A.

                                                             

This definition of a fuzzy set is like a superset of the definition of a set in the ordinary
sense of the term. The grades of membership of 0 and 1 correspond to the two possibilities of
false and true in an ordinary set. The ordinary Boolean operators that are used to combine sets
no longer apply directly; we know that 1 AND 1 is 1, but what is 0.7 AND 0.3? This is addressed
by the fuzzy set operations and the min and max rules for fuzzy propositions given earlier.
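
As a short worked example, assuming the min rule for conjunction given in the fuzzy propositions section, the question posed above has a direct answer, and the same rule reproduces the Boolean case:

```latex
\[
0.7 \;\text{AND}\; 0.3 = \min(0.7,\ 0.3) = 0.3,
\qquad
1 \;\text{AND}\; 1 = \min(1,\ 1) = 1 .
\]
```

Other conjunction operators exist in the fuzzy logic literature; min is simply the one used in these notes.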

Membership functions for fuzzy sets can be defined in any number of ways as long as
they follow the rules of the definition of a fuzzy set. The shape of the membership function used
defines the fuzzy set, and so the decision on which type to use depends on the purpose. The
membership function choice is the subjective aspect of fuzzy logic; it allows the desired values
to be interpreted appropriately. The most common membership functions are shown below:
