
Fuzzy Sets and Systems 55 (1993) 255-271

North-Holland


On the issue of defuzzification and selection based on a fuzzy set
Ronald R. Yager and Dimitar Filev
Machine Intelligence Institute, Iona College, New Rochelle, NY 10801, USA

Received November 1991; Revised May 1992
Abstract: We are concerned with the problem of selecting a crisp element based on information provided by a fuzzy set, a
problem which manifests itself in the defuzzification step in fuzzy logic controllers. We provide a unifying approach to this
selection process. Among other characteristics this unification puts the defuzzification methods of mean of maxima and center of
gravity in the same framework. We show that this selection can be viewed as a three step operation: transformation of the
decision fuzzy set; normalization to a probability distribution; selection based on the probability distribution. A number of different
procedures for selection are discussed.
Keywords: Fuzzy logic control; defuzzification; probability; decision; possibility.

1. Introduction
A problem of great importance for the application of fuzzy set theory is the selection of a specific
element from the universe of discourse based upon a fuzzy subset. This issue arises in the use of fuzzy
subsets in decision making [1]. In this case we have a set of alternatives X and a fuzzy subset F over X
indicating the degree to which each $x \in X$ satisfies our decision criteria and goals. We must use the set F
to guide us in the selection of an element x* from X as our choice. The issue also arises in the problem
of defuzzification associated with the use of fuzzy logic controllers [12, 15, 16]. In this case if V is a
control variable whose value the knowledge base portion of the controller provides as a fuzzy subset F
over the real line, the membership grades indicate the appropriateness of each value as the discrete
controller value. We must again in this case use the set F to help guide our selection.
In this paper we show that the selection problem can be implemented by converting the fuzzy subset
into a 'probability' distribution and using this probability distribution to select the element, either via the
performance of an experiment or by calculation of an expected value.
A crucial observation made in this work is that the process of converting the fuzzy subset into the
appropriate probability distribution appears to be mediated by the degree of confidence we have in the
fuzzy subset F being used.
We provide some general conditions required of the transformation from a fuzzy subset into a
probability distribution and then suggest some specific formulations for accomplishing this
transformation.

2. Selecting an element based on a fuzzy subset


Assume V is a variable whose value is to be determined as some element in the set X called the
universe of discourse of V. The selection of the value for V is to be guided by a fuzzy subset A of X
which denotes the result of some procedure providing the suitability of each $x \in X$ to be the chosen
Correspondence to: Prof. R.R. Yager, Machine Intelligence Institute, Iona College, New Rochelle, NY 10801, USA.

0165-0114/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved


R.R. Yager, D. Filev / Defuzzification and selection based on a fuzzy set

element. We shall denote the process of selecting the element x* from X based upon A as
$$\mathrm{Select}(x^* \in X \mid A).$$
In order to get an understanding of the process involved in the Select operation we shall consider
some prototypical situations.
Assume $X = \{x_1, x_2, x_3, x_4, x_5\}$. Let $A = \{x_1\}$. It appears quite natural that in this situation the
obvious choice is to select $x_1$.
Consider next the case where $A = \{x_1, x_2\}$. It appears quite natural that in this case the problem
comes down to selecting either $x_1$ or $x_2$. In this case the only way to distinguish between these elements
is to perform a random experiment in which
$$P(x_1) = \tfrac{1}{2}, \qquad P(x_2) = \tfrac{1}{2}.$$
Thus we see that Select can be a random probabilistic experiment.


In the more general setting in which $X = \{x_1, x_2, \ldots, x_n\}$ and $A = \{x_1, \ldots, x_m\}$ where $m \leq n$, then
again the only way to select an element from X compatible with A is to perform a random experiment
in which
$$P(x_i) = \frac{1}{m}, \qquad x_i \in A.$$

Let us now consider the more complex environment in which we allow $A(x_i) \in [0, 1]$ rather than simply
being in the binary set {0, 1}. In this situation it seems that, as in the preceding examples, the selection
process should be based upon a random experiment. In particular the fuzzy subset A should be used to
generate a probability distribution on X where $P_i = P(x_i)$ indicates the probability of selecting the
element $x_i$. This probability distribution should then be used to select x*. A crucial issue is that of
deciding how the fuzzy subset A determines the probability distribution P. We note that a few authors
[2-5, 8, 9, 11, 13] have looked at the problem of converting a possibility distribution (fuzzy subset) into
a probability distribution. Rather than suggesting at this point a procedure for going from A to the
probability distribution P on X, we indicate what we feel are required properties:
(I) If $A(x_i) = A(x_j)$ then $P(x_i) = P(x_j)$.
(II) If $A(x_i) > A(x_j)$ then $P(x_i) \geq P(x_j)$.
Thus we see that it is required that if two alternatives score the same in the set A then they must have
the same probability of selection. The second condition indicates that a form of monotonicity holds in that
if $x_i$ scores better than $x_j$ in A then $x_j$ cannot have a higher probability of selection.
Lest it be not obvious, we note that two of the most commonly used procedures for selecting x*
satisfy the above two conditions. The case of simply selecting the x with the largest value for A falls into
this category. In this case $P(x_i) = 0$ if $A(x_i) \neq \max_x A(x)$, and $P(x_j) = 1/m$ if $A(x_j) = \max_x A(x)$, where m
is the number of elements of X which attain the maximum membership in A. We shall call this
procedure M1.
The second common procedure is to set
$$P(x_i) = \frac{A(x_i)}{\sum_j A(x_j)};$$
we shall call this method M2. We see that this approach also satisfies the two conditions stated above.
One can also suggest another method of obtaining the probabilities from A, which we shall denote as
M3. In this case
$$P(x_i) = P(x_j) \quad \text{for all } i, j.$$
Here $P(x_j) = 1/n$ where n is the cardinality of X. It can easily be seen that the condition $P(x_i) = P(x_j)$
guarantees satisfaction of the two previously mentioned conditions.
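As an illustration, the three procedures M1, M2 and M3 can be sketched in a few lines of Python. This is a minimal sketch rather than anything prescribed by the paper: the function names `m1`, `m2`, `m3` and the sample grades are assumptions chosen for the example, and A is represented as a plain list of membership grades.

```python
def m1(A):
    """M1: uniform probability over the elements attaining the maximum grade."""
    top = max(A)
    m = sum(1 for a in A if a == top)
    return [1.0 / m if a == top else 0.0 for a in A]

def m2(A):
    """M2: probabilities proportional to the membership grades."""
    total = sum(A)
    return [a / total for a in A]

def m3(A):
    """M3: uniform probability over the whole universe, ignoring the grades."""
    return [1.0 / len(A)] * len(A)

# Illustrative membership grades (hypothetical, not taken from the paper).
A = [1.0, 0.5, 0.5, 0.0]
P1, P2, P3 = m1(A), m2(A), m3(A)
```

All three outputs assign equal probability to equal grades and never give a lower-graded element a higher probability, which is what conditions (I) and (II) ask for.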
Whatever method we use for obtaining the probability distribution, it gives us a set of probabilities
$P_1, \ldots, P_n$ on the elements $x_1, \ldots, x_n$. The actual selection of the optimal x* is obtained by the
performance of a random experiment. We can perform this experiment as follows. Let
$$S_0 = 0, \qquad S_i = S_{i-1} + P_i, \quad i = 1, \ldots, n.$$

Fig. 1. Data → procedure to determine A (step 1) → determination of probability distribution P (step 2) → random experiment (step 3) → x*.

Assume we have a random number generator Rand, such that $\mathrm{Rand} \in [0, 1]$. To obtain the value x* we
proceed as follows:
1. Run the random number generator to obtain the value Rand.
2. If $\mathrm{Rand} \in [S_{i-1}, S_i]$ select $x^* = x_i$.
Figure 1 shows the whole process. The two boxes enclosed in the dashed line constitute the process
$\mathrm{Select}(x^* \in X \mid A)$. At present our main interest is the middle step, the determination of P.
Given the only requirements in going from A to P are the satisfaction of conditions (I) and (II), there
exists an infinite number of possible procedures for accomplishing this [9].
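The random experiment described above can be sketched as follows. The function name `select` and the optional `rand` argument (used to make the draw reproducible) are assumptions of this example. The sketch uses half-open intervals $[S_{i-1}, S_i)$ so that a boundary value is never attributed to a zero-probability element; this is one possible convention, not one stated in the text.

```python
import random
from itertools import accumulate

def select(xs, P, rand=None):
    """Return xs[i] for the first i whose cumulative sum S_i exceeds rand."""
    r = random.random() if rand is None else rand
    for x, s in zip(xs, accumulate(P)):   # accumulate yields S_1, S_2, ..., S_n
        if r < s:
            return x
    return xs[-1]  # guard against round-off in the final cumulative sum

xs = ["x1", "x2", "x3"]
P = [0.2, 0.5, 0.3]
choice = select(xs, P)   # a random draw distributed according to P
```

Passing an explicit `rand` value, for instance `select(xs, P, rand=0.1)`, makes the outcome deterministic, which is convenient for checking the interval logic.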
We feel that there exists another consideration that must be included in the process of obtaining the
probability distribution P from the fuzzy subset A. In particular in determining the probability
distribution P used to select the optimal element x* some measure of the confidence we have in the
process used to obtain A should affect the probabilities used to decide P. For example if
$A = \{1/x_1, 0.99/x_2\}$, one would have to be extremely confident in the process used to obtain A to
clearly select $x_1$ as the choice without at all considering $x_2$. Thus we see that the use of M1 requires
extreme confidence in the correctness of the process used in step one to obtain A. At the other extreme
if $A = \{1/x_1, 0.01/x_2\}$, the use of M3, $P(x_1) = P(x_2)$, would indicate a significant lack of confidence in
the procedure used to obtain A.
The essential fact manifested in the above example is that as our confidence in the process used to
determine A gets less, our uncertainty in the knowledge of the optimal choice should increase.
The above observation can be more formally expressed in a further condition on the procedure of
going from A to P. In the following we shall use $\alpha$ to indicate our degree of confidence in the process used
to get A.
Let $\mathcal{T}$ be some formal procedure used to obtain the probability distribution P from the fuzzy subset A
and confidence $\alpha$. Let us denote
$$P = \mathcal{T}(A, \alpha).$$
Let H(P) be the entropy of the probability distribution P. We recall that entropy measures the degree
of uncertainty manifested by the probability distribution P; the more uncertain P the larger H(P). Let
$\alpha_1$ and $\alpha_2$ be two measures of confidence in A such that $\alpha_1 > \alpha_2$. Then the above observations can be
manifested in the requirement that
$$H(\mathcal{T}(A, \alpha_1)) \leq H(\mathcal{T}(A, \alpha_2)).$$


Theorem. Under the requirement of conditions (I) and (II) in going from a fuzzy subset A to a
probability distribution P, the uniform probability distribution, P(xi) = 1/n, has the highest possible
entropy.
Proof. Consider a set X of n elements. It is well established that $P(x_i) = 1/n$ gives the highest entropy.
Furthermore since $P(x_i) = P(x_j)$ for all i and j the conditions (I) and (II) are also satisfied.


The consequence of this theorem is that the situation of lowest confidence corresponds to an
assumption of equal likelihood for all possible alternatives. Effectively this corresponds to completely
discounting the information contained in A. Thus M3 leads to the highest entropy and hence for any A
and $\alpha$,
$$H(\mathcal{T}(A, \alpha)) \leq H(\mathrm{M3}(X)).$$
Theorem. Under the requirements of conditions (I) and (II) in going from a fuzzy subset A to a
probability distribution P, the use of method M1 leads to the probability distribution with the lowest
entropy; we denote this M1(A).
Proof. Assume $X = \{x_1, \ldots, x_n\}$; without loss of generality let A be such that $x_i$ for $i = 1, \ldots, m$,
with $m \geq 1$, are the elements with the highest membership grade in A. Then under M1 we get
$$P^*(x_i) = \begin{cases} 1/m, & i = 1, \ldots, m, \\ 0, & i > m. \end{cases}$$
In this case
$$H(P^*) = -\sum_{i=1}^{m} \frac{1}{m} \ln \frac{1}{m} = \ln m.$$
Consider any other process satisfying conditions (I) and (II) leading to P. In this case
$$P(x_i) = \begin{cases} a, & i = 1, \ldots, m, \\ b_i, & i = m + 1, \ldots, n, \end{cases}$$
where (1) $a < 1/m$, (2) $a \geq b_i$, and (3) $m a + \sum_{i=m+1}^{n} b_i = 1$. In this case
$$H(P) = -\sum_i P(x_i) \ln P(x_i) = -\Bigl(a m \ln a + \sum_{i=m+1}^{n} b_i \ln b_i\Bigr).$$
Since $b_i \leq a < 1/m$ then (1) $\ln a \leq \ln(1/m)$ and (2) $\ln b_i \leq \ln(1/m)$. This means
$$H(P) \geq -\Bigl(a m \ln \frac{1}{m} + \sum_{i=m+1}^{n} b_i \ln \frac{1}{m}\Bigr) \geq -\Bigl(a m + \sum_{i=m+1}^{n} b_i\Bigr) \ln \frac{1}{m} \geq -\ln \frac{1}{m} = \ln m$$
and thus $H(P) \geq H(P^*)$.


The consequence of this theorem is the observation that any process $\mathcal{T}(A, \alpha)$ must lead to a
distribution having entropy no lower than that given by the use of M1. Since we have indicated that high confidence in
A leads to least uncertainty in P it follows that using M1 corresponds to the highest possible confidence
in A. We see that for any $\alpha$ and A,
$$H(\mathrm{M3}(X)) \geq H(\mathcal{T}(A, \alpha)) \geq H(\mathrm{M1}(A))$$


where M3 indicates the lowest confidence situation and M1 the highest confidence.
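The ordering of entropies can be checked numerically on a small case. The following is a sketch under stated assumptions: the membership grades are made up for illustration, M2 stands in for an intermediate procedure satisfying (I) and (II), and the helper name `entropy` is hypothetical.

```python
import math

def entropy(P):
    """Shannon entropy in nats, with the convention 0 * ln 0 = 0."""
    return -sum(p * math.log(p) for p in P if p > 0)

# Illustrative grades: two elements attain the maximum membership.
A = [1.0, 1.0, 0.6, 0.2]
top = max(A)
m = A.count(top)

P_m1 = [1 / m if a == top else 0.0 for a in A]   # M1: uniform on the maxima
P_m2 = [a / sum(A) for a in A]                    # M2: proportional to grades
P_m3 = [1 / len(A)] * len(A)                      # M3: uniform on X

h1, h2, h3 = entropy(P_m1), entropy(P_m2), entropy(P_m3)
```

Here `h1` equals ln m and `h3` equals ln n, matching the two theorems, with the M2 value lying between them.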
If we introduce a similarity relation [21] S on the space X of alternative solutions we can extend the
possible procedures which can be used for the selection of the best alternative. In particular we can
provide for approaches other than the use of a random experiment to obtain x* given the probability
distribution P. We recall that a similarity relation S is a mapping $S : X \times X \to [0, 1]$ such that
(1) $S(x, x) = 1$,
(2) $S(x, y) = S(y, x)$,
(3) $S(x, z) \geq \max_y [S(x, y) \wedge S(y, z)]$.


A reasonable way to select x in this case is to select an x that is similar to most good solutions. Formally
we can consider a proposition

Most good solutions are similar to x


and select the x that has the highest truth value for this proposition. More generally we can replace
most by any monotonic linguistic quantifier Q and consider the proposition
Q good solutions are similar to x.
Since $A(x_i)$ indicates the degree to which $x_i$ is a good solution we can formally express this condition as
Q A's are $S_x$
where $S_x(x_i) = S(x, x_i)$. As suggested by Zadeh [23] the degree of truth of this proposition, $T_x$, is
obtained as follows:
(1) Calculate $r_x = \sum_i A(x_i) S_x(x_i) / \sum_i A(x_i)$.
(2) Calculate $T_x = Q(r_x)$.
Then we would select the x having the greatest value for $T_x$.
A few observations and generalizations are in order. Since Q is assumed monotone we have that
$Q(r_x) \geq Q(r_y)$ implies $r_x \geq r_y$. It is enough to select the x* having the greatest value for $r_x$. Let us look at
$r_x$ in more detail:
$$r_x = \sum_i S_x(x_i) P_i$$
where $P_i = A(x_i) / \sum_j A(x_j)$. We note that $P_i$ is actually a probability value for $x_i$ obtained by the use of
method M2. This suggests a further generalization as
$$r_x = \sum_i S_x(x_i) P_i$$
where $P_i$ is a probability distribution obtained by using a general procedure described previously,
$$P = \mathcal{T}(A, \alpha).$$
It is interesting to note that in the special case of similarity where $S(x, y) = 0$ for $x \neq y$ then
$$r_{x_i} = P_i$$
and we select the element with the largest probability, i.e. largest $A(x_i)$. At the other extreme if
$S(x, y) = 1$ for all x and y then
$$r_x = \sum_i P_i = 1$$
and hence no distinction is made.
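The two extreme cases just described can be reproduced with a short sketch. The function name `similarity_scores`, the matrix representation of S, and the sample probabilities are assumptions made for this example only.

```python
def similarity_scores(P, S):
    """Compute r_x = sum_i S[x][i] * P[i] for every candidate index x."""
    n = len(P)
    return [sum(S[x][i] * P[i] for i in range(n)) for x in range(n)]

P = [0.5, 0.3, 0.2]   # illustrative probabilities over three alternatives

# Extreme 1: S(x, y) = 0 for x != y, so r_x reduces to P_x.
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
# Extreme 2: S(x, y) = 1 for all pairs, so every r_x equals sum_i P_i.
all_ones = [[1.0] * 3 for _ in range(3)]

r_identity = similarity_scores(P, identity)
r_all_ones = similarity_scores(P, all_ones)
best = max(range(len(P)), key=lambda x: r_identity[x])   # index of the chosen x*
```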
Closely related to a similarity measure is a metric m; a metric has properties complementary to a
similarity relation. Consider an environment in which we have a set of alternatives drawn from the real
line. Assume we have some fuzzy set F indicating the degree to which each $y \in Y$ is a desirable solution.
We can as described earlier transform this into a probability distribution P on Y. Then for each $y \in Y$
we can calculate
$$M_y = \sum_{y_i \in Y} m(y_i, y) P_i.$$
In the above $m(y_i, y)$ is the distance from $y_i$ to y and $P_i$ is the probability of $y_i$. We would then select as
our choice the y with the smallest $M_y$ value; it is the one nearest to the good solutions.
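A minimal sketch of this metric-based selection follows. It assumes the absolute difference $|y_i - y|$ as the metric m, which the text does not specify, and the name `nearest_to_good` and the sample values are hypothetical.

```python
def nearest_to_good(ys, P):
    """Return the candidate y minimising M_y = sum_i |y_i - y| * P_i."""
    def M(y):
        return sum(abs(yi - y) * p for yi, p in zip(ys, P))
    return min(ys, key=M)

ys = [0.0, 1.0, 2.0, 3.0]      # illustrative alternatives on the real line
P = [0.1, 0.2, 0.6, 0.1]       # illustrative probabilities
best = nearest_to_good(ys, P)
```

With the absolute-difference metric the minimiser is a weighted median of the alternatives; a squared distance would instead favour the candidate closest to the expected value.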


3. BADD transformations for decision and defuzzification


In [6] Filev and Yager introduced a general approach to defuzzification based upon the BADD
(BAsic Defuzzification Distributions) transformation. Figure 2 shows the typical process involved in the
fuzzy controller.
The output for the fuzzy controller F is a fuzzy subset of the real line; for simplicity we shall assume
the support set Y is finite, $Y = \{y_1, \ldots, y_n\}$. For $y_i \in Y$, $F(y_i) = w_i$ indicates the degree to which each $y_i$
is suggested as a good output value by the rule base under the current input. The defuzzifier unit uses F
to select a best value y* to be the output of the controller.
Two commonly used methods for defuzzification are the center of area (COA) method and the mean
of maximum (MOM) method [10]. In the COA method one calculates the output of the defuzzifier,
$y^{\mathrm{COA}}$, as follows:
$$y^{\mathrm{COA}} = \frac{\sum_i y_i w_i}{\sum_i w_i}. \tag{I}$$
In the MOM method one calculates the output of the controller
$$y^{\mathrm{MOM}} = \frac{1}{m} \sum_{y_i \in A} y_i \tag{II}$$
where A is the set of elements in Y which provide the maximum value of F(y) and m is the cardinality
of A. A closer look at (I) shows that
$$y^{\mathrm{COA}} = \sum_i y_i v_i$$
where
$$v_i = \frac{w_i}{\sum_j w_j}.$$
A closer look at (II) shows that
$$y^{\mathrm{MOM}} = \sum_i y_i u_i$$
with
$$u_i = \begin{cases} 1/m & \text{for } y_i \in F_{\max}, \\ 0 & \text{for } y_i \notin F_{\max}, \end{cases}$$
where $F_{\max}$ is the maximal non-null level set of F and $m = \mathrm{card}\, F_{\max}$.
We see that both these methods can be viewed as based on a similar process. The processes of
obtaining the $u_i$'s and $v_i$'s, which we denote generically as $q_i$, have a number of properties in common:
(1) With the knowledge of F transform each $w_i$ into a new value $q_i$.
(2) Take as output a weighted average $y^* = \sum_i y_i q_i$.
The process of obtaining the $u_i$'s and $v_i$'s, it should be remarked, is not a pointwise operation but is
based upon knowledge of the whole set F.
The process of obtaining the $q_i$'s from the $w_i$'s has a number of properties which are reminiscent of
those mentioned in the earlier section.
The process of obtaining the $q_i$'s from the $w_i$'s in both the COA and MOM method is such that
(i) for any i and j, if $w_i = w_j$, then $q_i = q_j$,
(ii) for any i and j, if $w_i > w_j$, then $q_i \geq q_j$.
Furthermore, we note that in both cases $q_i$ has the basic probability distribution property that (a)
$q_i \in [0, 1]$ and (b) $\sum_i q_i = 1$.
Based upon these observations one can view the defuzzification process under the COA and MOM
methods as first converting the fuzzy subset F of Y into a probability distribution on Y, in the spirit
described in the earlier section, and then taking the expected value as our output. Keeping with this
probabilistic interpretation we shall in the following use $P_i$ instead of $q_i$ to denote the transformation
values. Figure 3 shows this view of the defuzzification process.

Fig. 2. Fuzzy controller (input → fuzzy rule base → defuzzifier).

Fig. 3. Fuzzy controller → convert F into a probability distribution → take expected value → y*.
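The reading of COA and MOM as expected values under two different probability distributions can be sketched as follows. The function names `coa` and `mom` and the sample support and grades are assumptions made for illustration.

```python
def coa(ys, w):
    """Center of area: expected value under v_i = w_i / sum_j w_j."""
    total = sum(w)
    v = [wi / total for wi in w]
    return sum(y * vi for y, vi in zip(ys, v))

def mom(ys, w):
    """Mean of maximum: expected value under u_i, uniform on the maxima of w."""
    top = max(w)
    m = w.count(top)
    u = [1.0 / m if wi == top else 0.0 for wi in w]
    return sum(y * ui for y, ui in zip(ys, u))

ys = [1.0, 2.0, 3.0, 4.0]   # illustrative support of F
w = [0.0, 1.0, 1.0, 0.5]    # illustrative membership grades
y_coa, y_mom = coa(ys, w), mom(ys, w)
```

For these values COA gives 2.8 (up to floating-point round-off), pulled toward the partially supported point 4.0, while MOM gives 2.5, the midpoint of the two maximal points.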
Even more significantly, in [6] Filev and Yager have shown that the process of obtaining the $v_i$'s, the
COA probability values, and the $u_i$'s, the MOM probability values, are special cases of a continuum of
possible values.
In [6] Filev and Yager introduced the BAsic Defuzzification Distribution (BADD) transform for
going from fuzzy subsets to probability distributions. This transformation is defined as
$$P_i = \frac{w_i^{\alpha}}{\sum_j w_j^{\alpha}}$$
where $\alpha$ is a parameter such that $\alpha \in [0, \infty]$. Using this transformation we note that
(1) If $\alpha = 1$ then $P_i = v_i$ and we recover the COA method.
(2) If $\alpha \to \infty$ then $P_i = u_i$ and we recover the MOM method.
To see that this is the case we note that
$$P_i = \frac{w_i^{\alpha}}{\sum_j w_j^{\alpha}} = \frac{(w_i / w_{\max})^{\alpha}}{\sum_j (w_j / w_{\max})^{\alpha}}$$
where $w_{\max}$ is the largest membership grade in F. We see that as $\alpha \to \infty$, $(w_i / w_{\max})^{\alpha} \to 0$ for $w_i < w_{\max}$
and $(w_{\max} / w_{\max})^{\alpha} = 1$.
(3) If $\alpha = 0$ then $w_i^{\alpha} = 1$ and hence $P_i = P_j = 1/n$ where n is the cardinality of Y.
Thus we see that all three of the methods noted in the earlier section for transforming a fuzzy subset into a
probability distribution are special cases of the continuum based upon the parameter $\alpha$.
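A minimal sketch of the BADD transformation shows the three special cases side by side. The function name `badd` and the sample grades are assumptions; the limit $\alpha \to \infty$ is only approximated here by a large finite exponent.

```python
def badd(w, alpha):
    """BADD transform: P_i = w_i**alpha / sum_j w_j**alpha, for finite alpha >= 0."""
    powered = [wi ** alpha for wi in w]   # Python evaluates 0.0 ** 0 as 1.0
    total = sum(powered)
    return [p / total for p in powered]

w = [0.0, 1.0, 1.0, 0.5]      # illustrative membership grades

P_uniform = badd(w, 0)        # alpha = 0: uniform distribution (M3)
P_coa = badd(w, 1)            # alpha = 1: COA weights (M2)
P_mom_like = badd(w, 200)     # large alpha: close to the MOM weights (M1)
```

Note that with $\alpha = 0$ even an element of grade zero receives probability 1/n, which is consistent with case (3) in the text.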
One immediate implication of the introduction of this transformation is that we can provide for an
adaptive learning scheme to obtain the optimal defuzzification parameter $\alpha$.
Let us denote $P = \mathrm{BADD}(F, \alpha)$ as the probability distribution resulting from transforming F under $\alpha$
into a probability distribution using the BADD transformation. It can be shown [6] that the entropy of
the P's, H(P), obtained under this transformation satisfies the following property.

Theorem. If $\alpha_i > \alpha_j$ then
$$H(\mathrm{BADD}(F, \alpha_i)) \leq H(\mathrm{BADD}(F, \alpha_j)).$$
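The monotonic behaviour of the entropy can be observed numerically. This sketch is only a spot check on one made-up fuzzy set, not a proof; the helper names `badd` and `entropy` and the chosen values of $\alpha$ are assumptions of the example.

```python
import math

def badd(w, alpha):
    """BADD transform: P_i = w_i**alpha / sum_j w_j**alpha."""
    powered = [wi ** alpha for wi in w]
    total = sum(powered)
    return [p / total for p in powered]

def entropy(P):
    """Shannon entropy in nats, with the convention 0 * ln 0 = 0."""
    return -sum(p * math.log(p) for p in P if p > 0)

w = [0.2, 0.6, 1.0, 0.6]              # illustrative membership grades
alphas = [0, 0.5, 1, 2, 5, 20]        # increasing confidence levels
H = [entropy(badd(w, a)) for a in alphas]
```

The list `H` starts at ln 4 for $\alpha = 0$ and decreases as $\alpha$ grows, approaching zero here because a single element attains the maximum grade.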
Based upon this theorem and our discussion in the earlier section on the effect of model confidence
we see that the value of $\alpha$ used in the transformation can be interpreted as some kind of measure of
confidence in the controller rule base portion. We see that $\alpha = 0$ corresponds to no confidence, for in
this case we completely discount the information supplied by the rule base. If $\alpha = 1$ we take the
information supplied by the controller at its face value, which can be interpreted as normal confidence.
If $\alpha = \infty$ then we are placing extremely high confidence in the information supplied by the controller.
The view of $\alpha$ as a confidence measure can be further enhanced by the following view of the
process of obtaining the probabilities from the fuzzy subset. Starting with the output of the rule portion
of the controller, the fuzzy subset F, we convert this into a new fuzzy subset, E, based upon our
confidence. We then simply obtain the $P_i$'s by normalization:
$$P_i = \frac{E(x_i)}{\sum_j E(x_j)}.$$

Fig. 4. Input → controller rule base → F → transform F into E based on confidence → E → obtain probabilities from E by normalization → obtain output by expected values → y*.

Under the BADD transformation E is defined as
$$E = F^{\alpha}$$
where
$$E(x) = (F(x))^{\alpha}.$$
As described by Zadeh [22] the operation of raising a fuzzy subset to a power $\alpha > 1$ is called
concentration and has the property $E(x) \leq F(x)$. For a power $\alpha < 1$ the operation is called dilation and
has the property $E(x) \geq F(x)$.
In [19] Yager has used the operation of raising a fuzzy subset to a power to model the concept of
importance as well as credibility. Thus we see that $\alpha$ can correspond to a measure of how important we
consider the output.
Figure 4 shows a schematic diagram of this view of the fuzzy control process. The defuzzification
operation is enclosed within the dashed lines. We can describe the process occurring in the second box
in a more general fashion. Let F* be a normalized version of F,
$$F^*(x) = \frac{F(x)}{F_{\max}}.$$
Let $\beta$ be some parameter on the scale [a, b] and let e be some special intermediate value on this scale.
Then the transformation to the fuzzy set E can be obtained by some function $G_\beta : I \to I$ such that for
$i \in I$,
$$G_a(i) = 1, \qquad G_e(i) = i, \qquad G_b(i) = \begin{cases} 1 & \text{if } i = 1, \\ 0 & \text{if } i \neq 1. \end{cases}$$
Furthermore for $\beta_1 > \beta_2$, $G_{\beta_1}(i) \leq G_{\beta_2}(i)$.

4. Dilation and concentration operators


In the previous section we showed that the defuzzification operation can be viewed as consisting of
three steps. The first step is a transformation of the output of the controller rule base, F, into another
fuzzy subset E. The second step is the conversion of E into a probability distribution by a normalization
process, dividing each element by the cardinality of E. The third step consists of using the probability to
get an expected value. The first step described above is very much affected by the confidence that one
has in the rule base. In this section we shall provide some alternative methodologies for transforming F
into E. In the following we shall assume that F is a normal fuzzy subset, having at least one element
with membership grade one. This normalization can always be accomplished by dividing each
membership grade by the maximum membership grade in F.
The process of obtaining E from F must satisfy the two conditions of consistency and monotonicity:
(1) Consistency: if F(yi) = F(yj) then E(yi) = E(yj).
(2) Monotonicity: if $F(y_i) > F(y_j)$ then $E(y_i) \geq E(y_j)$.
As we have indicated, the process of obtaining E from F depends upon the confidence or weight we are
willing to give to the knowledge provided by the rule base. If we have low confidence, we assign a low
weight to the information provided by the rule base and use a dilation type of operation to transform F
into E. If we have a strong confidence, we assign a high weight to the information provided by the rule
base and use a concentration type of operation [20, 22].

Fig. 5.
Let us now look at the dilation operation. Assuming F is a normal fuzzy subset of Y, the process of
dilation consists of generating a new fuzzy subset of Y, E, such that for any $y \in Y$:
(3) $E(y) \geq F(y)$.
(4) If the confidence is minimal, $E(y) = 1$ for all y, and if there is no dilation then $E(y) = F(y)$.
Consider $\alpha \in [0, 1]$ to be our measure of confidence in the rule base. Zero means no confidence and
one means normal confidence (no dilation). One procedure for accomplishing the dilation operation is
to transform F into E using the following rule:
$$E(y) = \bar{\alpha} \vee F(y),$$
where $\bar{\alpha} = 1 - \alpha$.
It is easy to show that this transformation satisfies conditions (1)-(4). Figure 5 shows the effect of this
operation. Effectively we have raised all membership grades below $\bar{\alpha}$ to $\bar{\alpha}$. We can see that $\bar{\alpha}$ can be
viewed as the amount of dilation.
More generally we can replace the max operator by any t-conorm operator S [7]; thus
$$E(y) = S(\bar{\alpha}, F(y)).$$
Because max is the smallest of the t-conorm operators the use of max provides the least dilation.
One special case worth noting is where $S(a, b) = a + b - ab$; in this case
$$E(y) = 1 - \alpha \bar{F}(y),$$
where $\bar{F}(y) = 1 - F(y)$.
We see this operation is an inversion of F, followed by multiplication by $\alpha$, followed by another
inversion.
A semantics that can be associated with this general dilation procedure is that we are enforcing the
rule: if the confidence is not too low then use F(y). This semantics follows since $\alpha \to F(y)$ can be
implemented as $S(\bar{\alpha}, F(y))$.
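The two dilation rules discussed above, max and the probabilistic sum $S(a, b) = a + b - ab$, can be compared in a short sketch. It reads the garbled symbol in the rule as the complement $1 - \alpha$, so that $\alpha = 0$ gives $E(y) = 1$ and $\alpha = 1$ gives $E(y) = F(y)$ as condition (4) requires; the function names and sample values are assumptions.

```python
def dilate_max(F, alpha):
    """E(y) = max(1 - alpha, F(y)): grades below 1 - alpha are raised to 1 - alpha."""
    a_bar = 1.0 - alpha
    return [max(a_bar, f) for f in F]

def dilate_prob_sum(F, alpha):
    """E(y) = S(1 - alpha, F(y)) with S(a, b) = a + b - ab, i.e. 1 - alpha * (1 - F(y))."""
    return [1.0 - alpha * (1.0 - f) for f in F]

F = [1.0, 0.5, 0.0]           # illustrative normal fuzzy subset
E_max = dilate_max(F, 0.75)
E_sum = dilate_prob_sum(F, 0.75)
```

Every entry of `E_sum` is at least as large as the matching entry of `E_max`, in line with the remark that max provides the least dilation.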
From a systems point of view the process of dilation performed here can be seen as adding noise to
the output of the fuzzy rule base (see Figure 6).
Let us now look at the concentration operation. Here again we transform F into a new fuzzy subset E
which must satisfy the conditions of consistency and monotonicity stated earlier in this section.
However in the concentration process conditions three and four are replaced by
(3) $E(y) \leq F(y)$.
(4) If the confidence is maximal, then $E(y) = 0$ if $F(y) \neq 1$ and $E(y) = 1$ if $F(y) = 1$.

Fig. 6. Dilation operation as noise introduction.


J Actually the condition is E(y) > 1 if F(y) = 1.


Fig. 7. Concentration.
Furthermore we assume that if there is no concentration, $E(y) = F(y)$.
We shall allow $\alpha \in [1, \infty]$ to be our measure of confidence, or weight we want to assign to the
information provided by the rule base. The larger $\alpha$ the more confidence. In the following we shall find
it more convenient to use $\beta = 1 - 1/\alpha$. We see that $\beta$ is monotonic in $\alpha$ but that $\beta \in [0, 1]$. The higher $\beta$
the more concentration required.
One form of concentration can be implemented by an operation which reduces to zero all
membership grades below $\beta$ while leaving those above $\beta$ alone (see Figure 7).
To formally capture this kind of concentration operation we introduce a new operation, which is a
kind of intuitionistic negation [17, 18], denoted $\mathrm{Neg}_\beta$ where $\beta$ is a parameter such that $\beta \in [0, 1]$. We
define $\mathrm{Neg}_\beta : I \to I$ by
$$\mathrm{Neg}_\beta(a) = \begin{cases} 1 & \text{if } a < \beta, \\ 0 & \text{if } a \geq \beta. \end{cases}$$

Figure 8 shows this new negation operation as well as the ordinary negation, $\bar{a} = 1 - a$.
Using this new negation operation we can implement the above described concentration operation as
$$E(y) = \overline{\mathrm{Neg}_\beta(F(y))} \wedge F(y)$$
where the bar indicates the ordinary negation operation. We see that this is a kind of not(not F) and F.
We further note that the operation $\overline{\mathrm{Neg}_\beta(F(y))}$ can be viewed as some kind of crisper (or defuzzifier)
around $\beta$, where
$$\mathrm{Crisper}_\beta(a) = \begin{cases} 1 & \text{if } a > \beta, \\ 0 & \text{if } a \leq \beta. \end{cases}$$
The operation $\mathrm{Neg}_\beta(a)$ can actually be generalized to be of the form shown by the dotted line in
Figure 9. This figure shows that any negation lying between the strong $\mathrm{Neg}_\beta$ and the usual negation $\bar{a}$
will suffice for the concentration operation.
We can formally define a whole family of these kinds of negation operations which we shall denote as
$\eta_{\beta,p}$ where $\beta \in [0, 1]$ and p is any number greater than one. We define
$$\eta_{\beta,p}(a) = \begin{cases} 1 - \dfrac{a^p}{\beta^{p-1}}, & 0 \leq a < \beta, \\[2ex] \dfrac{(1 - a)^p}{(1 - \beta)^{p-1}}, & \beta \leq a \leq 1. \end{cases}$$

Fig. 8.

Fig. 9.
We first observe that for $p = 1$ we get $\eta_{\beta,p}(a) = 1 - a$; hence for $p = 1$ we get $\bar{a}$. Next we see that for
$p = \infty$ we get the following:
(1) For $a < \beta$, $a^p / \beta^{p-1} \to 0$ and hence we get $\eta_{\beta,p}(a) = 1$.
(2) For $a > \beta$, $(1 - a) < (1 - \beta)$ and hence $(1 - a)((1 - a)/(1 - \beta))^{p-1} \to 0$; thus for $p = \infty$ we get
$\eta_{\beta,p}(a) = 0$.
We further see that for $a = \beta$, $\eta_{\beta,p}(a) = 1 - \beta$.
Using this new function for negation we can define the concentration operation as
$$E(y) = \overline{\eta_{\beta,p}(F(y))} \wedge F(y).$$
To simplify the notation we shall let $S_{\beta,p}$, called an S-function, be defined as $1 - \eta_{\beta,p}$. Thus
$$S_{\beta,p}(a) = \begin{cases} \dfrac{a^p}{\beta^{p-1}}, & 0 \leq a < \beta, \\[2ex] 1 - \dfrac{(1 - a)^p}{(1 - \beta)^{p-1}}, & \beta \leq a \leq 1, \end{cases}$$
and therefore the concentration is
$$E(y) = S_{\beta,p}(F(y)) \wedge F(y).$$


We should note that for $a < \beta$, $S_{\beta,p}(a) < a$, thus $E(y) = S_{\beta,p}(F(y))$. For $a > \beta$, $S_{\beta,p}(a) \geq a$ and therefore
$E(y) = F(y)$.
We see that for the two parameters $\beta$ and p, the bigger their value the more concentration.
When $\beta = 1$, we get $S_{\beta,p}(a) = a^p$, $p > 1$, which was the type of operation used in the earlier section
because $a^p \wedge a = a^p$.
When $\beta = 0$, $S_{\beta,p}(a) = 1 - (1 - a)^p = 1 - (\bar{a})^p$ and hence $E(y) = F(y) \wedge (1 - (1 - F(y))^p)$.
When $p = 1$, $S_{\beta,p}(a) = a$ and $E(y) = F(y)$. When $p = \infty$, then $S_{\beta,p}(a) = 0$ if $a < 1$ and $S_{\beta,p}(a) = 1$ if
$a = 1$; therefore $E(y) = 1$ if $F(y) = 1$, otherwise $E(y) = 0$.
A natural situation is to define $\beta$ and p in a related way, that is to assume $p = \alpha$, the degree of
confidence, and $\beta = 1 - 1/\alpha$. Making this equivalence we can replace $S_{\beta,p}(a)$ by $S_\alpha$.
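The S-function concentration can be sketched as below, following the two-branch definition of $S_{\beta,p}$ and taking the minimum with F(y). The function names, the explicit guard for $\beta = 1$ (where the second branch would otherwise divide zero by zero), and the sample grades are assumptions of this example.

```python
def s_function(a, beta, p):
    """S_{beta,p}(a): a**p / beta**(p-1) below beta, 1 - (1-a)**p / (1-beta)**(p-1) above."""
    if a < beta:
        return a ** p / beta ** (p - 1)
    if beta == 1.0:
        return 1.0   # only a == 1 reaches this point; avoids a 0/0 division
    return 1.0 - (1.0 - a) ** p / (1.0 - beta) ** (p - 1)

def concentrate(F, beta, p):
    """E(y) = min(S_{beta,p}(F(y)), F(y))."""
    return [min(s_function(f, beta, p), f) for f in F]

F = [0.25, 0.5, 0.75, 1.0]          # illustrative normal fuzzy subset
E = concentrate(F, beta=0.5, p=2)    # grades below beta shrink, the rest are kept
E_power = concentrate(F, beta=1.0, p=2)   # beta = 1 reduces to F(y)**p
```

Coupling the parameters as in the text amounts to calling `concentrate(F, 1 - 1/alpha, alpha)` for a chosen confidence level `alpha` greater than one.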
A further generalization of this procedure can be accomplished by replacing the min operator by a
t-norm operator T; thus
$$E(y) = T(S_\alpha(F(y)), F(y)).$$
We can see from a systems point of view that the process of concentration can be seen as a
filtering-like operation on the output of the rule base (see Figure 10).

Fig. 10. (Rule base output F passed through a filter.)
5. Transformation under consonant belief structures

In a number of papers Dubois and Prade [3] stressed the representation of a fuzzy subset as a
consonant belief structure [14]. This representation is based upon the Dempster-Shafer theory of
evidence.
Assume Y is a finite set. A belief structure m has associated with it a collection of non-null subsets of
Y, $A_1, \ldots, A_n$, called focal elements and a set of weights denoted $m(A_i)$ where (1) $m(A_i) \in [0, 1]$ and
(2) $\sum_i m(A_i) = 1$.
One measure associated with a belief structure is the measure of plausibility, Pl. Pl is defined such that
for any subset B of Y, $\mathrm{Pl}(B) \in [0, 1]$ is given by
$$\mathrm{Pl}(B) = \sum_{i \text{ s.t. } A_i \cap B \neq \emptyset} m(A_i).$$
A consonant belief structure is defined as one in which the focal elements are nested: $Y \supseteq A_1 \supset A_2 \supset \cdots \supset A_n$.
It can be shown [14] that when the belief structure is consonant Pl(B) becomes a possibility
measure, $\mathrm{Pl}(B) = \max_{y \in B} \{\mathrm{Pl}(y)\}$. In particular this implies a unique correspondence between a fuzzy
subset F and a consonant belief structure.
In particular if m is a consonant belief structure then it can be seen to induce a fuzzy subset F on Y
where the membership grade F(y) is the plausibility, i.e.
$$F(y) = \sum_{A_i \text{ s.t. } y \in A_i} m(A_i).$$

Alternatively if F is a normal fuzzy subset of Y we can use it to induce a belief structure as described
in the following. Assume $w_1, \ldots, w_n$ are the different non-zero membership grades that appear in F,
where $w_1 < w_2 < \cdots < w_n = 1$. Let $F_{w_i}$ be the $w_i$ level set,
$$F_{w_i} = \{y \mid F(y) \geq w_i\}.$$
We then define our belief function m as follows. The focal elements of m are the discrete level sets of
F, $A_i = F_{w_i}$, and the weights associated with these focal elements are
$$m(A_1) = w_1, \qquad m(A_i) = w_i - w_{i-1}, \quad i = 2, \ldots, n.$$
For our purposes we need to include the whole space Y as one of the focal elements. If it is not equal
to $A_1$ we shall include it as $A_0$ and assign its weight equal to zero. If all the elements are in $A_1$ then we
need not make this addition.
Example. Let $X = \{x_1, x_2, \ldots, x_7\}$ and assume
$F = \{1/x_1, 0.7/x_2, 0.6/x_3, 0.6/x_4, 0.3/x_5, 0.1/x_6, 0/x_7\}$.
In this case

$$\begin{aligned}
A_1 &= \{x_1, x_2, x_3, x_4, x_5, x_6\}, & m(A_1) &= 0.1,\\
A_2 &= \{x_1, x_2, x_3, x_4, x_5\}, & m(A_2) &= 0.2,\\
A_3 &= \{x_1, x_2, x_3, x_4\}, & m(A_3) &= 0.3,\\
A_4 &= \{x_1, x_2\}, & m(A_4) &= 0.1,\\
A_5 &= \{x_1\}, & m(A_5) &= 0.3,\\
A_0 &= \{x_1, x_2, x_3, x_4, x_5, x_6, x_7\}, & m(A_0) &= 0.
\end{aligned}$$
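The construction of the belief structure from the level sets can be sketched in Python as follows. This is a minimal illustration under the assumption that a fuzzy subset is stored as a dictionary from element to membership grade; the function name `induce_belief_structure` is hypothetical. The data are those of the example above.

```python
def induce_belief_structure(F):
    """Build [(focal_element, weight), ...] from a normal fuzzy subset F.

    F maps each element of the universe to its membership grade.
    """
    # Distinct non-zero grades, w_1 < w_2 < ... < w_n.
    grades = sorted({g for g in F.values() if g > 0})
    if not grades or grades[-1] != 1:
        raise ValueError("F must be normal (largest grade equal to 1)")

    belief, previous = [], 0.0
    for w in grades:
        level_set = frozenset(y for y, g in F.items() if g >= w)
        belief.append((level_set, w - previous))  # m(A_i) = w_i - w_{i-1}
        previous = w

    # Add the whole space as A_0 with zero weight if A_1 does not cover it.
    whole_space = frozenset(F)
    if belief[0][0] != whole_space:
        belief.insert(0, (whole_space, 0.0))
    return belief


F = {"x1": 1.0, "x2": 0.7, "x3": 0.6, "x4": 0.6, "x5": 0.3, "x6": 0.1, "x7": 0.0}
belief = induce_belief_structure(F)
```

The returned list starts with $A_0$ and then runs from the largest level set to the smallest, so the weights sum to one up to floating-point rounding.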

Given this consonant belief structure view of a fuzzy subset let us look at how the normal fuzzy set F
is transformed into a new fuzzy subset E. That is, we are interested in seeing how we transform a belief
structure m into a new belief structure $\hat{m}$, to reflect dilation (lack of confidence) and concentration
(increase of confidence).


In the following we shall assume that F is represented by a belief structure m with consonant focal
elements $A_0 \supset A_1 \supset \cdots \supset A_n$ where $A_0 = Y$ and $\sum_{i=0}^{n} m(A_i) = 1$. From normality we note that all the
elements in $A_n$ have membership grades of one in F.
The transformation we are interested in is that of getting a new belief structure $\hat{m}$, corresponding to
the fuzzy subset E. We recall that $F(y) = \sum_{i:\, y \in A_i} m(A_i)$. In transforming from F to E the following
two conditions must be satisfied:
(1) $F(x) = F(y)$ requires $E(x) = E(y)$,
(2) $F(x) \geq F(y)$ implies $E(x) \geq E(y)$.
The requirement to satisfy these conditions implies that the only allowable operation is to modify the
weights of already existing focal elements. In particular we cannot introduce any new focal element
with non-zero weight.
Theorem. The only allowable operation which modifies the set F to give E is to change the weights of the
already existing focal elements, including the element $A_0$.


We leave the proof to the reader.
Thus we see that the only way to modify the belief structure m, corresponding to the fuzzy subset F,
to get the new belief structure $\hat{m}$, corresponding to E, is to exchange weights between already existing
focal elements.
Let m be a belief structure with focal elements $A_0, A_1, \ldots, A_n$, with weights $m(A_i) \neq 0$ for
$i = 1, \ldots, n$ ($m(A_0) = 0$ or $\neq 0$) and of course $\sum_{i=0}^{n} m(A_i) = 1$ and $m(A_i) \in [0, 1]$. The new belief
structure $\hat{m}$ can only have the same focal elements $A_0, \ldots, A_n$ with weights $\hat{m}(A_i)$, which must satisfy
$\sum_{i=0}^{n} \hat{m}(A_i) = 1$ and $\hat{m}(A_i) \in [0, 1]$, but any of the $\hat{m}(A_i)$ can be zero.
Let us now look at the distinction between dilation and concentration. The process of dilation requires
that for any y, $E(y) \geq F(y)$. This condition can be guaranteed by the following. Let
$\Delta A_i = \hat{m}(A_i) - m(A_i)$, the change in the weight assigned to the focal element $A_i$. The dilation condition
is guaranteed if for any $q = 0, \ldots, n$

$$\sum_{i=0}^{q} \hat{m}(A_i) \geq \sum_{i=0}^{q} m(A_i).$$

This condition implies that $\sum_{i=0}^{q} \Delta A_i \geq 0$ so that

$$\sum_{i=0}^{q-1} \Delta A_i \geq -\Delta A_q.$$

Thus we see that if the sum of the current change is greater than what is next lost we are okay. We
should note that $\sum_{i=0}^{n} \Delta A_i = 0$.
Let us look at some dilation operations.
Example. In [14] Shafer suggests a discounting operation which has all the properties of a dilation. In
this case we select a value $\alpha \in [0, 1]$ and obtain a new belief structure as

$$\hat{m}(A_i) = \alpha\, m(A_i), \quad i = 1, \ldots, n, \qquad \hat{m}(A_0) = \alpha\, m(A_0) + 1 - \alpha.$$

Thus in this case we see that $E(y) = \bar{\alpha} + \alpha F(y)$, where $\bar{\alpha} = 1 - \alpha$. It is easy to see that $E(y) \geq F(y)$.
We notice that when $\alpha = 0$ all the weights are put in $A_0$ and $E(y) = 1$ for all y.
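A minimal Python sketch of this discounting operation is given below. It assumes, for illustration only, that a belief structure is a list of (focal element, weight) pairs whose first entry is the whole space $A_0$; the universe, the weights, and the function names are hypothetical.

```python
def membership(belief, y):
    """Grade of y: sum of the weights of the focal elements containing y."""
    return sum(m for focal, m in belief if y in focal)


def discount(belief, alpha):
    """Discounting: scale every weight by alpha, move 1 - alpha onto A_0.

    Assumes belief[0] is the whole space A_0.
    """
    (A0, m0), rest = belief[0], belief[1:]
    return [(A0, alpha * m0 + 1 - alpha)] + [(A, alpha * m) for A, m in rest]


# Hypothetical consonant structure on Y = {a, b, c}; here F(a)=1, F(b)=0.5, F(c)=0.2.
belief = [
    (frozenset({"a", "b", "c"}), 0.2),
    (frozenset({"a", "b"}), 0.3),
    (frozenset({"a"}), 0.5),
]
dilated = discount(belief, alpha=0.5)
E = {y: membership(dilated, y) for y in "abc"}
F = {y: membership(belief, y) for y in "abc"}
```

With $\alpha = 0.5$ every grade moves halfway toward one, in line with $E(y) = (1 - \alpha) + \alpha F(y)$, so no grade decreases.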
For concentration operations we require that

$$F(y) \geq E(y).$$

In this case

$$\sum_{i=0}^{q-1} \hat{m}(A_i) \leq \sum_{i=0}^{q-1} m(A_i).$$


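Since $\sum_{i=0}^{n} \hat{m}(A_i) = \sum_{i=0}^{n} m(A_i) = 1$ we see that

$$\sum_{i=0}^{n} \hat{m}(A_i) - \sum_{i=0}^{q-1} \hat{m}(A_i) \geq \sum_{i=0}^{n} m(A_i) - \sum_{i=0}^{q-1} m(A_i)$$

and hence $\sum_{i=q}^{n} \Delta A_i \geq 0$, so that

$$\sum_{i=q+1}^{n} \Delta A_i \geq -\Delta A_q.$$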

Thus if all the increases beyond q are greater than the decrease of q we get concentration. One case
that satisfies this is to place all the weights into the An focal element.
A family of these concentration operations can be obtained by the following. Select $\alpha \in [0, 1]$ and let

$$\hat{m}(A_i) = (1 - \alpha)\, m(A_i), \quad i = 0, \ldots, n - 1, \qquad \hat{m}(A_n) = (1 - \alpha)\, m(A_n) + \alpha.$$

In this case

$$E(y) = \begin{cases} (1 - \alpha) F(y) & \text{for } F(y) \neq 1, \\ 1 & \text{for } F(y) = 1. \end{cases}$$
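This family of concentration operations can be sketched in Python in the same way. The sketch assumes a belief structure stored as a list of (focal element, weight) pairs whose last entry is the innermost focal element $A_n$; the data and function names are hypothetical.

```python
def membership(belief, y):
    """Grade of y: sum of the weights of the focal elements containing y."""
    return sum(m for focal, m in belief if y in focal)


def concentrate(belief, alpha):
    """Scale every weight by 1 - alpha and move alpha onto A_n.

    Assumes belief[-1] is the innermost focal element A_n.
    """
    *rest, (An, mn) = belief
    return [(A, (1 - alpha) * m) for A, m in rest] + [(An, (1 - alpha) * mn + alpha)]


# Hypothetical consonant structure on Y = {a, b, c}; here F(a)=1, F(b)=0.5, F(c)=0.2.
belief = [
    (frozenset({"a", "b", "c"}), 0.2),
    (frozenset({"a", "b"}), 0.3),
    (frozenset({"a"}), 0.5),
]
concentrated = concentrate(belief, alpha=0.5)
E = {y: membership(concentrated, y) for y in "abc"}
F = {y: membership(belief, y) for y in "abc"}
```

Elements with grade one keep it, while all other grades shrink by the factor $1 - \alpha$, so the structure sharpens around $A_n$.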
An alternative form closely related to this, where $\alpha \in [1, \infty]$, is

$$E(y) = \begin{cases} \ldots & \text{for } F(y) \neq 1, \\ \ldots & \text{for } F(y) = 1. \end{cases}$$

6. A level set approach to defuzzification


In the previous sections we considered the process of defuzzification as consisting of three steps: 1.
Transform F into E based upon confidence. 2. Formulate the probabilities as $P_i = E(x_i) / \sum_j E(x_j)$. 3.
Calculate $y^* = \sum_i P_i y_i$. In this section we shall look at an approach which combines all three steps.
This approach is based upon a level set method.
Again assume F is a normal fuzzy subset of Y. We recall that the w-level set of F is the crisp subset,
$F_w$, of Y such that $F_w = \{y \mid F(y) \geq w\}$ for $w \in [0, 1]$. We shall let $M(F_w)$ be the mean value of the
elements in $F_w$.
We shall introduce two possible methods for obtaining the defuzzified value. One method is an
optimistic one and yields $y^*$ and the other is a pessimistic one and yields $y_*$ as the defuzzified value:

$$y^* = \frac{1}{1 - \alpha} \int_{\alpha}^{1} M(F_w)\, dw \qquad \text{and} \qquad y_* = \frac{1}{\beta} \int_{0}^{\beta} M(F_w)\, dw.$$

We note that if $\alpha = 1$ (taken as the limit $\alpha \to 1$) we get the MOM value. If $\beta = 0$ (again taken as a limit) we get the average of all of the set Y. Thus these
two approaches cover the whole spectrum of defuzzification procedures.
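Because $M(F_w)$ is piecewise constant in w, both integrals can be evaluated exactly by summing over the intervals between consecutive membership grades. The following Python sketch does this under the assumption that F is stored as a dictionary from numeric element to grade; the function names and the sample data are hypothetical.

```python
def level_set_mean(F, w):
    """M(F_w): mean of the elements whose grade is at least w."""
    members = [y for y, g in F.items() if g >= w]
    return sum(members) / len(members)


def integrate_level_means(F, lo, hi):
    """Exact integral of M(F_w) for w from lo to hi (0 <= lo < hi <= 1)."""
    cuts = sorted({lo, hi} | {g for g in F.values() if lo < g < hi})
    total = 0.0
    for a, b in zip(cuts, cuts[1:]):
        # No grade lies strictly between a and b, so F_w = F_b on (a, b].
        total += level_set_mean(F, b) * (b - a)
    return total


def optimistic(F, alpha):
    """y^* for alpha in [0, 1)."""
    return integrate_level_means(F, alpha, 1.0) / (1.0 - alpha)


def pessimistic(F, beta):
    """y_* for beta in (0, 1]."""
    return integrate_level_means(F, 0.0, beta) / beta


# Hypothetical normal fuzzy subset of the reals: element -> grade.
F = {2.0: 1.0, 1.0: 0.5, 4.0: 0.25}
y_opt = optimistic(F, 0.0)
y_near_mom = optimistic(F, 0.5)    # alpha at or above the second-largest grade
y_average = pessimistic(F, 0.25)   # beta at or below the smallest grade
```

In this sample, any $\alpha$ at or above the second-largest grade already returns the mean of maxima, and any $\beta$ at or below the smallest grade returns the plain average of Y, which matches the two limiting cases noted above.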
Let us first look at the optimistic method. In the following we shall denote $w_i = F(y_i)$. Without loss of
generality we shall assume $w_i \geq w_j$ for $i < j$. Because of the normality of F it is necessary that $w_1 = 1$.
Since $F_w = \{y \mid F(y) \geq w\}$ then

$$F_w = \begin{cases}
Y, & w = 0, \\
\{y_1, \ldots, y_n\}, & 0 < w \leq w_n, \\
\{y_1, \ldots, y_{n-1}\}, & w_n < w \leq w_{n-1}, \\
\quad\vdots \\
\{y_1, \ldots, y_{n-i}\}, & w_{n-i+1} < w \leq w_{n-i}, \\
\quad\vdots \\
\{y_1\}, & w_2 < w \leq w_1 = 1.
\end{cases}$$


We shall initially look at the case of $y^*$ when $\alpha = 0$:

$$y^* = \int_{0}^{w_n} \sum_{i=1}^{n} \frac{y_i}{n}\, dw + \int_{w_n}^{w_{n-1}} \sum_{i=1}^{n-1} \frac{y_i}{n-1}\, dw + \cdots + \int_{w_2}^{w_1} y_1\, dw.$$

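Doing the integration and gathering the terms we get

$$y^* = y_n \left( \frac{w_n}{n} \right) + y_{n-1} \left( \frac{w_n}{n} + \frac{w_{n-1} - w_n}{n-1} \right) + y_{n-2} \left( \frac{w_n}{n} + \frac{w_{n-1} - w_n}{n-1} + \frac{w_{n-2} - w_{n-1}}{n-2} \right) + \cdots.$$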

In general we see that

$$y^* = \sum_{i=1}^{n} y_i q_i$$

where $q_i$ is defined as

$$q_n = \frac{w_n}{n}, \qquad q_{i-1} = q_i + \frac{w_{i-1} - w_i}{i-1}, \quad i = n, n-1, \ldots, 2.$$

It can be shown that the $q_i$'s have the properties of a probability distribution: $q_i \in [0, 1]$ and $\sum_i q_i = 1$.
Thus we see that in this case the defuzzified value is defined as an expected value over the set of
possible values, the $y_i$'s. We also note that the probabilities are again obtained from the membership
grades of the fuzzy set F. However, in this situation the probabilities are obtained in a recursive manner.
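The recursion for the $q_i$ can be sketched in Python as follows. The sketch assumes the grades are given as a list sorted in non-increasing order with the first grade equal to one, so that list position i - 1 holds the paper's $w_i$; the function name and the sample data are hypothetical.

```python
def optimistic_weights(w):
    """Probabilities q_1..q_n for the optimistic method with alpha = 0.

    w[0] >= w[1] >= ... >= w[n-1] and w[0] == 1 (the paper's w_1..w_n).
    """
    n = len(w)
    q = [0.0] * n
    q[n - 1] = w[n - 1] / n                      # q_n = w_n / n
    for i in range(n, 1, -1):                    # paper's index i = n, ..., 2
        q[i - 2] = q[i - 1] + (w[i - 2] - w[i - 1]) / (i - 1)
    return q


# Hypothetical data: elements y_1..y_3 with grades w_1..w_3.
y = [2.0, 1.0, 4.0]
w = [1.0, 0.5, 0.25]
q = optimistic_weights(w)
y_star = sum(yi * qi for yi, qi in zip(y, q))
```

Each difference $w_{i-1} - w_i$ is shared equally among the $i - 1$ elements still present in the level set, which is why the weights add up to $w_1 = 1$.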
Let us now look at the more general case where

$$y^* = \frac{1}{1 - \alpha} \int_{\alpha}^{1} M(F_w)\, dw.$$

Here $\alpha$ is some discounting from the MOM.


Let $w_{n-r}$ be the smallest membership grade in F that is larger than $\alpha$. In this case

$$y^* = \frac{1}{1 - \alpha} \left[ \int_{\alpha}^{w_{n-r}} \sum_{i=1}^{n-r} \frac{y_i}{n-r}\, dw + \int_{w_{n-r}}^{w_{n-(r+1)}} \sum_{i=1}^{n-(r+1)} \frac{y_i}{n-(r+1)}\, dw + \cdots \right].$$

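Doing the integration and gathering the terms we get

$$y^* = y_{n-r} \times \frac{1}{1 - \alpha} \times \left( \frac{w_{n-r} - \alpha}{n-r} \right) + y_{n-(r+1)} \times \frac{1}{1 - \alpha} \times \left( \frac{w_{n-r} - \alpha}{n-r} + \frac{w_{n-(r+1)} - w_{n-r}}{n-(r+1)} \right) + \cdots.$$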

In general we see that in this case

$$y^* = \sum_{i=1}^{n-r} y_i p_i$$

where $p_i$ is defined as

$$p_{n-r} = \frac{1}{1 - \alpha} \times \frac{w_{n-r} - \alpha}{n-r},$$

$$p_{i-1} = p_i + \frac{1}{1 - \alpha} \left( \frac{w_{i-1} - w_i}{i-1} \right) \quad \text{for } i = n-r, \ldots, 2,$$

$$p_i = 0 \quad \text{elsewhere}.$$

We see that in this method we cut off the elements $y_i$ with membership grade less than $\alpha$, since our
summation of $y^*$ goes only up to $n - r$.
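The weights for a general $\alpha$ can be sketched in Python in the same spirit. The sketch assumes $\alpha < 1$ and grades sorted in non-increasing order with the first grade equal to one; the variable `k` plays the role of $n - r$, and the function name and sample data are hypothetical.

```python
def optimistic_weights_alpha(w, alpha):
    """Probabilities p_1..p_n for the optimistic method, 0 <= alpha < 1.

    w[0] >= w[1] >= ... >= w[n-1] and w[0] == 1 (the paper's w_1..w_n).
    """
    n = len(w)
    # k = n - r: 1-based index of the smallest grade that is larger than alpha.
    k = max(i for i in range(n) if w[i] > alpha) + 1
    p = [0.0] * n                                  # p_i = 0 for i > k
    p[k - 1] = (w[k - 1] - alpha) / ((1 - alpha) * k)
    for i in range(k, 1, -1):                      # paper's index i = n-r, ..., 2
        p[i - 2] = p[i - 1] + (w[i - 2] - w[i - 1]) / ((1 - alpha) * (i - 1))
    return p


# Hypothetical data: elements y_1..y_3 with grades w_1..w_3.
y = [2.0, 1.0, 4.0]
w = [1.0, 0.5, 0.25]
p = optimistic_weights_alpha(w, alpha=0.3)
y_star = sum(yi * pi for yi, pi in zip(y, p))
```

With $\alpha = 0.3$ the element of grade 0.25 receives zero probability, and the remaining weights are renormalized by the factor $1/(1 - \alpha)$.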
Let us now look at the pessimistic approach. Here

$$y_* = \frac{1}{\beta} \int_{0}^{\beta} M(F_w)\, dw.$$


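Let $w_{r-1}$ be the smallest membership grade larger than $\beta$. Then

$$y_* = \frac{1}{\beta} \left[ \int_{0}^{w_n} \sum_{i=1}^{n} \frac{y_i}{n}\, dw + \int_{w_n}^{w_{n-1}} \sum_{i=1}^{n-1} \frac{y_i}{n-1}\, dw + \cdots + \int_{w_r}^{\beta} \sum_{i=1}^{r-1} \frac{y_i}{r-1}\, dw \right].$$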

Doing the integration we again get

$$y_* = \sum_{i=1}^{n} y_i q_i.$$

However in this case

$$q_n = \frac{1}{\beta} \times \frac{w_n}{n},$$

$$q_{i-1} = q_i + \frac{1}{\beta} \left( \frac{w_{i-1} - w_i}{i-1} \right) \quad \text{for } i = n, \ldots, r+1,$$

$$q_{r-1} = q_r + \frac{1}{\beta} \left( \frac{\beta - w_r}{r-1} \right),$$

$$q_i = q_{r-1} \quad \text{for } i < r-1.$$

In this case we see that as $\beta$ decreases, the higher membership grades account for less of the
probability.
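A compact Python sketch of the pessimistic weights is given below. Rather than branching on the index r, it clips each grade at $\beta$, which is an equivalent rearrangement of the recursion under the assumption that each level interval is shared equally by the elements in its level set. It assumes $\beta > 0$ and grades sorted in non-increasing order; the function name and sample data are hypothetical.

```python
def pessimistic_weights(w, beta):
    """Probabilities q_1..q_n for the pessimistic method, 0 < beta <= 1.

    w[0] >= w[1] >= ... >= w[n-1] and w[0] == 1 (the paper's w_1..w_n).
    """
    n = len(w)
    q = [0.0] * n
    running = 0.0
    lower = 0.0                                  # clipped w_{i+1}, with w_{n+1} = 0
    for i in range(n, 0, -1):                    # paper's index i = n, ..., 1
        upper = min(w[i - 1], beta)              # level interval cut off at beta
        running += (upper - lower) / (beta * i)  # shared by y_1, ..., y_i
        q[i - 1] = running
        lower = upper
    return q


# Hypothetical data: elements y_1..y_3 with grades w_1..w_3.
y = [2.0, 1.0, 4.0]
w = [1.0, 0.5, 0.25]
q = pessimistic_weights(w, beta=0.4)
y_low = sum(yi * qi for yi, qi in zip(y, q))
```

In this sample the two elements with grades above $\beta = 0.4$ receive the same probability, so the distinction between their grades no longer affects the result.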

7. Conclusion

In this paper we provide a comprehensive framework for the understanding of the problem of
element selection based on a fuzzy set. This problem is central to the defuzzification step in the fuzzy
logic controllers now in use. One important conclusion from our work is the role that our confidence in
the fuzzy set being used plays in the selection of the procedure used. We hope to use the results of this
work to provide the basis for the development of adaptive algorithms for learning the optimal
defuzzification procedure for a given application of fuzzy logic control.

References

[1] R.E. Bellman and L.A. Zadeh, Decision-making in a fuzzy environment, Management Sci. 17 (4) (1970) 141-164.
[2] M. Delgado and S. Moral, On the concept of possibility-probability consistency, Fuzzy Sets and Systems 21 (1987) 311-318.
[3] D. Dubois and H. Prade, On several representations of an uncertain body of evidence, in M.M. Gupta, E. Sanchez, Eds.,
Fuzzy Information and Decision Processes (North-Holland, Amsterdam, 1982) 309-322.
[4] D. Dubois and H. Prade, Unfair coins and necessary measures: A possible interpretation of histograms, Fuzzy Sets and
Systems 10 (1983) 15-20.
[5] D. Dubois and H. Prade, Fuzzy sets and statistical data, Europ. J. Oper. Res. 25 (1986) 345-356.
[6] D. Filev and R.R. Yager, A generalized defuzzification method under BAD distributions, Internat. J. Intelligent Systems 6
(1991) 687-697.
[7] E.P. Klement, Characterization of fuzzy measures constructed by means of triangular norms, J. Math. Anal. Appl. 86 (1982)
345-358.
[8] G.J. Klir, Probability-possibility conversion, Proc. Third IFSA Congress, Seattle (1989) 408-411.
[9] G.J. Klir, A principle of uncertainty and information invariance, Internat. J. General Systems 17 (1990) 249-275.
[10] L.I. Larkin, A fuzzy logic controller for aircraft flight control, in: M. Sugeno, Ed., Industrial Applications of Fuzzy Control
(North-Holland, Amsterdam, 1985) 87-104.
[11] Y. Leung, Maximum entropy estimation with inexact information, in: R.R. Yager, Ed., Fuzzy Set and Possibility Theory
(Pergamon Press, New York, 1982) 32-37.
[12] E.H. Mamdani and S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller, Internat. J. Man-Machine
Stud. 7 (1975) 1-13.
[13] S. Moral, Construction of a probability distribution from a fuzzy information, in: A. Jones, A. Kaufmann and H.-J.
Zimmerman, Eds., Fuzzy Sets Theory and Applications (Reidel, Dordrecht, 1986) 51-60.
[14] G. Shafer, A Mathematical Theory of Evidence (Princeton University Press, Princeton, NJ, 1976).
[15] M. Sugeno, An introductory survey of fuzzy control, Inform. Sci. 36 (1985) 59-83.
[16] M. Sugeno, Industrial Applications of Fuzzy Control (North-Holland, Amsterdam, 1985).


[17] R.R. Yager, On the measure of fuzziness and negation, Part I: Membership in the unit interval, Internat. J. General Systems
5 (1979) 221-229.
[18] R.R. Yager, On the measure of fuzziness and negation, Part II: Lattices, Inform. and Control 44 (1980) 236-260.
[19] R.R. Yager, Credibility discounting in the theory of approximate reasoning, Proc. of the Sixth Conference on Uncertainty in
Artificial Intelligence, Cambridge, MA (1990) 301-306.
[20] L.A. Zadeh, Fuzzy sets, Inform. and Control 8 (1965) 338-353.
[21] L.A. Zadeh, Similarity relations and fuzzy orderings, Inform. Sci. 3 (1971) 177-200.
[22] L. Zadeh, Outline of a new approach to the analysis of complex systems and decision processes, IEEE Trans. Systems Man
Cybernet. 3 (1973) 28-44.
[23] L.A. Zadeh, A computational approach to fuzzy quantifiers in natural languages, Comput. and Math. Appl. 9 (1983)
149-184.
