To cite this article: James F. Geer & George J. Klir (1992) A mathematical analysis of information-preserving transformations between probabilistic and possibilistic formulations of uncertainty, International Journal of General Systems, 20:2, 143-176, DOI: 10.1080/03081079208945024
Int. J. General Systems, Vol. 20, pp. 143-176 © 1992 Gordon and Breach Science Publishers S.A.
Reprints available directly from the publisher. Printed in the United Kingdom.
Photocopying permitted by license only.
A MATHEMATICAL ANALYSIS OF
INFORMATION-PRESERVING
TRANSFORMATIONS BETWEEN PROBABILISTIC
AND POSSIBILISTIC FORMULATIONS OF
UNCERTAINTY
It is now generally recognized that uncertainty can be formalized in different mathematical theories.
Two of these theories. on which we focus in this paper, are probability theory and possibility theory.
The paper deals with transformations from probabilistic formalizations of uncertainty into their possibilistic counterparts that contain the same amount of uncertainty and, consequently, the same amount of information (expressed as a reduction of uncertainty) as well; it also deals with the inverse uncertainty and information preserving transformations. Since well-justified and unique measures of uncertainty (and information) are now well established in both probability theory and possibility theory, the transformations are well defined. Mathematical properties of the transformations are analyzed in the paper under
the assumption that probabilities and possibilities are connected via interval or log-interval scales. The
primary results are: (i) the interval scale transformation that preserves information exists and is unique
only from probability theory to possibility theory, but the inverse transformation does not always exist;
(ii) the log-interval scale transformation exists and is unique in both directions; and (iii) the log-interval
scale transformation satisfies the probability-possibility consistency requirement.
1. INTRODUCTION
Consider a mathematical system that was constructed to model some aspect of real-
ity. Assume that the purpose for which the system was constructed (such as predic-
tion, retrodiction, or prescription) involves some uncertainty. This uncertainty (pre-
dictive, retrodictive, or prescriptive) is formalized within some mathematical theory.
This means, in essence, that the uncertainty is expressed in relevant numerical values
(degrees of belief, weights of evidence, etc.). These values conform to the axiomatic
constraints of the theory and are derived by certain rules (including, possibly, sub-
jective judgements) from some inconclusive evidence regarding the phenomenon of
concern. When the model is utilized for answering pertinent questions (e.g. giving
The work on this paper was partially supported by the National Science Foundation under Grant No. IRI-9015675.
justified way, the amount of uncertainty associated with each possible characteriza-
tion of uncertainty within the theory. Moreover, the measured amount must be unique
when we choose a particular measurement unit.
The concept of uncertainty is intuitively connected with the concept of informa-
tion. For example, when uncertainty in predicting an experimental outcome is re-
solved by observing the actual outcome, the amount of information obtained by the
observation may be measured by the amount of uncertainty prior to the observation.
Similarly, when uncertainty regarding some historical event is reduced by finding a
relevant historical document, the amount of information in the document (with re-
spect to the event of concern) may be measured by the amount of uncertainty re-
duced. In general, the amount of information obtained by an action may be measured
by the reduction of uncertainty that results from the action.
Consider now that the action by which we reduce uncertainty is the construction
of a system that models some aspect of the real world. Such a system must be
formulated within a particular experimental frame. A question posed to the system
may be answered, in general, with some uncertainty. If no model were available
within the experimental frame, the question would be answered with the highest
possible uncertainty allowed by the experimental frame, expressing thus our total
ignorance regarding the phenomenon involved. Information obtained by constructing
the model (or information contained in the model) with respect to the question of
concern is measured by the difference between the highest uncertainty associated
with the experimental frame and the actual uncertainty expressed by the model.
Some of the theories of uncertainty are more general than others, while some are
not comparable in this respect. The theories also differ from one another in their
meaningful interpretations, computational complexity, robustness, and other aspects.
It has recently been argued on several occasions⁷,⁸,¹⁴,²³ that none of the theories is
superior in all situations. Each theory seems to have some advantages and some
disadvantages when compared with the other theories. Furthermore, this comparison
is context-dependent; each theory is suitable for some types of evidence and unfit
for others. Which theory to use in each application context should be decided by
appropriate metarules on the basis of the type of evidence involved, estimated com-
putational complexity, and other criteria.
In order to utilize opportunistically advantages of the various theories of uncer-
tainty in knowledge-based systems, we need the capability of moving from one the-
ory to another as appropriate. These moves, or transformations, from one theory to
another should satisfy some justifiable requirements. As argued previously,¹¹ we should
require that the numbers expressing uncertainty in one theory (probabilities, possi-
bilities, weights of evidence, etc.) be transformed into the corresponding numbers
in another theory by an appropriate scale. In addition, we should also require that
the amount of uncertainty (and information) be preserved under the transformation.
Transformations that satisfy these two requirements (scaling and uncertainty in-
variance) are called information preserving transformations. These transformations
guarantee that no information is unwittingly added or eliminated solely by moving
from one mathematical theory of uncertainty into another within the same experi-
mental frame.
Two theories of uncertainty, on which we focus in this paper, are probability
theory and possibility theory. The paper deals with information-preserving transformations between the two theories, which were first proposed at the 1989 IFSA Congress in Seattle¹⁰ and discussed more thoroughly in a subsequent paper.¹¹ It was shown that ratio and difference scales are too rigid and, consequently, not applicable
to these transformations. Ordinal scales, on the other hand, are too loose and, hence,
lead to ambiguous transformations. Interval and log-interval scales were shown to
be appropriate candidates for unique information-preserving transformations, but their
mathematical properties were not adequately scrutinized. A mathematical analysis
of these transformations, especially their existence and uniqueness, is the subject of
this paper. The primary results are: (i) the interval scale transformation that preserves
information exists and is unique only from probability theory to possibility theory,
but the inverse transformation does not always exist; (ii) the log-interval scale trans-
formation exists and is unique in both directions; and (iii) the log-interval scale trans-
formation satisfies the probability-possibility consistency requirement.
2. RELEVANT BACKGROUND
In this paper, we deal with probability and possibility theories defined only on finite
sets. We assume that the reader is familiar with these theories at least at the level
at which they are covered in the text by Klir and Folger.¹² However, to introduce
the relevant terminology, notation, and conventions, we briefly overview in this sec-
tion basic notions of the two theories. For this purpose, it seems appropriate to over-
view the theories from a broader perspective of the Dempster-Shafer theory, under
which they appear as special cases.
Let X denote a universal set under consideration, assumed here to be finite, and
let P(X) denote the power set of X. One way of formulating the Dempster-Shafer
theory is to define a function
m : P(X) → [0, 1]
such that
m(∅) = 0   and   Σ_{A ∈ P(X)} m(A) = 1.
This function is called a basic probability assignment; the value m(A) expresses the degree of belief (based on relevant evidence) in a proposition that is represented by the set A, but it does not include possible degrees of belief in additional propositions represented by the various subsets of A. That is, m(A) expresses the degree of belief that is committed exactly to set A, not the degree of total belief committed to A. To obtain the latter, we have to add to m(A) the values m(B) for all proper subsets B of A. The total belief committed to set A, Bel(A), is thus an aggregate calculated from the basic evidential claims by the formula

Bel(A) = Σ_{B ⊆ A} m(B)

for all A ∈ P(X).
Every set A ∈ P(X) for which m(A) ≠ 0 is called a focal element. The pair (F, m), where F is the set of all focal elements associated with m, is called a body of evidence.
When all focal elements are singletons, belief and plausibility measures collapse into a single measure, a classical additive probability measure. Any probability measure, Pro, on a finite set X is uniquely determined by a probability distribution function

p : X → [0, 1]   such that   Pro(A) = Σ_{x ∈ A} p(x)   for all A ∈ P(X).
When all of the focal elements are nested (ordered by set inclusion), the body of evidence is called possibilistic. In this case, we obtain special belief and plausibility measures, which are called necessity and possibility measures, respectively. A possibility measure, Pos, is conveniently (and uniquely) determined by a possibility distribution function

r : X → [0, 1]   such that   Pos(A) = max_{x ∈ A} r(x)   for all nonempty A ∈ P(X),

with the normalization

max_{x ∈ X} r(x) = 1.
A₁ ⊂ A₂ ⊂ ... ⊂ A_n

fully characterize the basic assignment and the possibility distribution, respectively, by which the possibility measure Pos is defined. The possibility distribution r is ordered in the sense that r_i ≥ r_{i+1} for all i = 1, 2, ..., n − 1. It is well known that the α-cut of a fuzzy set with membership function μ is

A_α = {x | μ(x) ≥ α}.

Since, clearly, A_α ⊆ A_β when α > β, the α-cuts of every fuzzy set are nested.
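As a small illustration of these notions, the sketch below (our own illustrative helpers, not from the paper) computes the total belief Bel(A) from a basic probability assignment and checks whether a body of evidence is possibilistic, i.e. whether its focal elements are nested:

```python
def belief(m, A):
    """Total belief Bel(A): the sum of m(B) over all focal elements B that
    are subsets of A."""
    return sum(v for B, v in m.items() if B <= A)

def is_possibilistic(m):
    """A body of evidence is possibilistic iff its focal elements are nested
    (totally ordered by set inclusion)."""
    focals = sorted(m, key=len)
    return all(a <= b for a, b in zip(focals, focals[1:]))

# A nested body of evidence on X = {1, 2, 3}; the m-values sum to 1.
m = {frozenset({1}): 0.5, frozenset({1, 2}): 0.3, frozenset({1, 2, 3}): 0.2}
print(belief(m, frozenset({1, 2})))   # m({1}) + m({1,2}) = 0.8
print(is_possibilistic(m))            # True
```

Focal elements are represented as `frozenset`s so that they can serve as dictionary keys and be compared with the subset operator `<=`.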
3. MEASURES OF UNCERTAINTY
It is well known¹² that the Dempster-Shafer theory is capable of capturing two types
of uncertainty. One type is connected with evidential claims focusing on propositions
that are not specific, the other with propositions that conflict with one another. Mea-
sures of both types of uncertainty are now well established.
Given a body of evidence (m, F), its nonspecificity, N(m), is expressed by the formula

N(m) = Σ_{A ∈ F} m(A) log₂ |A|,

where |A| denotes the cardinality of A. Function N, which was proven unique under appropriate requirements, measures nonspecificity in units that are called bits: one bit of uncertainty (in this case nonspecificity) is equivalent to the total ignorance regarding the truth or falsity of one proposition. The range of the function is

0 ≤ N(m) ≤ log₂ |X|.
This function, which is referred to as discord, measures the conflict among evidential claims in bits. Its range is

0 ≤ D(m) ≤ log₂ |X|;

D(m) = 0 if m(A) = 1 for one particular set A; D(m) = log₂ |X| iff m is the uniform probability distribution on X.¹³
We may also define the amount of total uncertainty, T(m), as the sum of the amounts of the two types of uncertainty, N(m) and D(m), that coexist in the theory. That is,

T(m) = N(m) + D(m).

Function T measures uncertainty again in bits, and it has already been proven¹⁸ that again

0 ≤ T(m) ≤ log₂ |X|.
Observing that the nested focal elements of a possibilistic body of evidence can be embedded in a complete sequence of nested subsets that contains all focal elements, it is easy to derive the following possibilistic forms of N and D:

N(r) = Σ_{i=2}^{n} r_i log₂ [i/(i−1)],
D(r) = −Σ_{i=1}^{n−1} (r_i − r_{i+1}) log₂ [1 − i Σ_{j=i+1}^{n} r_j/(j(j−1))].
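These possibilistic forms can be transcribed directly. The sketch below assumes the reconstructed formulas above (the helper names are ours, not the paper's); total ignorance (all components equal to 1) should give N = log₂(n) and D = 0:

```python
import math

def nonspecificity(r):
    """N(r) = sum_{i=2}^{n} r_i * log2(i/(i-1)) for an ordered possibility
    distribution r = (r_1, ..., r_n) with r_1 = 1 and r_i >= r_{i+1}."""
    return sum(r[i - 1] * math.log2(i / (i - 1)) for i in range(2, len(r) + 1))

def discord(r):
    """D(r) = -sum_{i=1}^{n-1} (r_i - r_{i+1})
                  * log2(1 - i * sum_{j=i+1}^{n} r_j / (j*(j-1)))."""
    n = len(r)
    total = 0.0
    for i in range(1, n):
        y = i * sum(r[j - 1] / (j * (j - 1)) for j in range(i + 1, n + 1))
        total -= (r[i - 1] - r[i]) * math.log2(1 - y)
    return total

# Sanity check: total ignorance gives N = log2(4) and D = 0.
print(nonspecificity([1.0, 1.0, 1.0, 1.0]), discord([1.0, 1.0, 1.0, 1.0]))
```

Note that the inner sum telescopes: for r identically 1, i · Σ_{j=i+1}^{n} 1/(j(j−1)) = 1 − i/n, so every logarithm argument stays in (0, 1].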
4. INFORMATION-PRESERVING TRANSFORMATIONS
Let the n-tuples p = (p₁, p₂, ..., p_n) and r = (r₁, r₂, ..., r_n) denote, respectively, ordered probability and possibility distributions (defined on a finite set X with n or more elements) that do not contain zero components. That is,
(a) p_i ∈ (0, 1] and r_i ∈ (0, 1] for all i = 1, 2, ..., n;
(b) p_i ≥ p_{i+1} and r_i ≥ r_{i+1} for all i = 1, 2, ..., n − 1;
(c) p₁ + p₂ + ... + p_n = 1 (probabilistic normalization);
(d) r₁ = 1 (possibilistic normalization);
(e) n ≤ |X|.
It is assumed that only one of the distributions, p or r, is initially given. The other distribution is to be determined from the given one by an information-preserving transformation. That is, values p_i are assumed to correspond to values r_i for all i = 1, 2, ..., n by some scale (at least ordinal) and both p and r are required to contain the same amount of uncertainty (and information). Furthermore, if n < |X|, we also assume that p_i = r_i = 0 for all i = n + 1, n + 2, ..., |X|.
The requirement that p and r contain the same amount of uncertainty is formally expressed by the equation

−Σ_{i=1}^{n} p_i log₂ p_i = Σ_{i=2}^{n} r_i log₂ [i/(i−1)] − Σ_{i=1}^{n−1} (r_i − r_{i+1}) log₂ [1 − i Σ_{j=i+1}^{n} r_j/(j(j−1))].   (16)
The expression on the left-hand side of this equation represents the Shannon entropy
(probabilistic uncertainty); the expression on the right-hand side represents the total possibilistic uncertainty (the sum of nonspecificity and discord). When a given p is to be transformed into r, the left-hand side of the equation becomes a constant (the value of the Shannon entropy for the given p) and, similarly, when a given r is to be transformed into p, the right-hand side of the equation becomes a constant.
Basic ideas regarding information-preserving transformations between probabilities and possibilities are discussed for different scales in a previous paper.¹¹ It is
shown that ratio and difference scales are too rigid and, consequently, not applicable.
Either of them possesses only one free coefficient, which is uniquely determined by
the probabilistic or possibilistic normalization requirement. No freedom is then left
for imposing the requirement of uncertainty equivalence expressed by Eq. (16). Or-
dinal scales, on the other hand, are too loose. Since they preserve only the order of the given numbers, but not any additional structure involved, ordinal scale transformations are ambiguous.
Under an interval scale, values r_i are obtained from values p_i by the transformation r_i = α p_i + β for i = 1, 2, ..., n, where α and β are constants (α > 0). From the possibilistic normalization, we obtain β = 1 − α p₁ and, hence,

r_i = 1 + α (p_i − p₁),   i = 1, 2, ..., n.   (17)
To preserve the amount of uncertainty, α must satisfy the equation g(α) = 0, where

g(α) = Σ_{i=2}^{n} r_i(α) log₂ [i/(i−1)] − Σ_{i=1}^{n−1} (r_i(α) − r_{i+1}(α)) log₂ [1 − y_i(α)] − H(p),

with r_i(α) = 1 + α(p_i − p₁), y_i(α) = i Σ_{j=i+1}^{n} r_j(α)/(j(j−1)), and H(p) = −Σ_{i=1}^{n} p_i log₂ p_i.
In order to satisfy the requirements that each r_i must lie in the interval [0, 1] and must satisfy the inequality r_i ≥ p_i, which expresses the general possibility-probability consistency condition, we must require that

0 < α ≤ α₀,   (18)

where α₀ denotes the largest value of α for which these requirements hold.
In Fig. 1 we have plotted g(α) for the special case when the p_i are given by the formula

p_i = 1/n + a (n − 2i + 1)/(n − 1),   i = 1, 2, ..., n,

where 0 < a < 1/n and n = 10. This corresponds to probabilities p_i which are equally spaced between 1/n − a and 1/n + a. The figure illustrates that, for this particular probability distribution, the function g(α) has a unique positive root, which lies in the interval (0, α₀).
We now prove some facts about the function g(α), which we state in the form of lemmas, that will eventually allow us to prove (in general) a theorem concerning the existence and uniqueness of a solution to the equation g(α) = 0.
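Anticipating the lemmas below (g is positive near 0 and strictly decreasing), the root can be located numerically by bisection. The sketch below is our own and assumes the reconstructed forms of N, D, and r_i(α) = 1 + α(p_i − p₁) given above; the upper end of the bracket is chosen so that every r_i stays nonnegative, and for the sample distribution g is already negative there:

```python
import math

def H(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p)

def total_uncertainty(r):
    """N(r) + D(r) for an ordered possibility distribution (Sec. 3 forms)."""
    n = len(r)
    N = sum(r[i - 1] * math.log2(i / (i - 1)) for i in range(2, n + 1))
    D = 0.0
    for i in range(1, n):
        y = i * sum(r[j - 1] / (j * (j - 1)) for j in range(i + 1, n + 1))
        D -= (r[i - 1] - r[i]) * math.log2(1 - y)
    return N + D

def interval_p_to_r(p):
    """Bisection for the root of g(alpha) = T(r(alpha)) - H(p), where
    r_i(alpha) = 1 + alpha*(p_i - p_1).  Maintains a bracket [lo, hi]
    with g(lo) > 0 > g(hi); g(0) = log2(n) - H(p) > 0 for non-uniform p."""
    r_of = lambda a: [1 + a * (pi - p[0]) for pi in p]
    g = lambda a: total_uncertainty(r_of(a)) - H(p)
    lo, hi = 0.0, 1.0 / (p[0] - p[-1])   # hi keeps every r_i >= 0
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return (lo + hi) / 2, r_of((lo + hi) / 2)

alpha, r = interval_p_to_r([0.4, 0.3, 0.2, 0.1])
print(alpha, r)   # at the root, r_1 = 1 and T(r) = H(p)
```

Bisection is used instead of a faster method because g is cheap to evaluate and, once Lemma 5.3 establishes monotonicity, the bracket is guaranteed to shrink onto the unique root.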
LEMMA 5.1 If p_i = 1/n for all 1 ≤ i ≤ n, then g(α) = 0 for any choice of α in the interval [0, ∞).

Figure 1 Plots of function g(α) for several special probability distributions and interval scale transformation p → r (Sec. 5.1).
In view of Lemma 5.1, we shall assume from now on that p₁ > 1/n and p_n < 1/n. An immediate consequence of this assumption is that

H(p) < log₂(n),   (19)

since it is easy to show that H, regarded as a function of the n "variables" p_i (with the constraints that the p_i's are positive and sum to one), has its unique maximum when p₁ = p₂ = ... = p_n = 1/n.
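This maximum-entropy property is easy to confirm numerically (a quick self-check of ours, not part of the paper's argument):

```python
import math
import random

def H(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p)

n = 8
uniform = [1.0 / n] * n
random.seed(0)
for _ in range(1000):
    # Draw a random positive vector and normalize it to a distribution.
    w = [random.random() + 1e-12 for _ in range(n)]
    p = [x / sum(w) for x in w]
    # No randomly drawn distribution beats the uniform one.
    assert H(p) <= H(uniform) + 1e-12
print(H(uniform))   # log2(8) = 3.0
```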
Proof Letting α → 0⁺ in the definition of g(α) and using (19), we find g(0⁺) = log₂(n) − H > 0.
Q.E.D.
Setting α = 0 in this expression and using the fact that y_i(0) = 1 − i/n, we find
The second term in this expression is obviously negative for any α > 0, since each term in the summation is non-negative and at least one term is strictly positive. Also, since

y_i(α) ≤ y_i(0) = 1 − i/n   for α > 0,

we see that 1 − y_i(α) ≥ i/n and hence each of the logarithm terms in the first summation is also negative. Thus, g′(α) < 0 for any α > 0.
Q.E.D.
LEMMA 5.4 g(α) → −∞ as α → +∞ and hence g(α) < 0 for all sufficiently large values of α.
Proof Using the definition of g(α), for large (positive) values of α we can write
By Lemma 5.3, g is strictly decreasing on the interval (0, ∞) and hence it has at most one real root on this interval. Q.E.D.
Theorem 1 establishes that g(α) has exactly one positive real root, say α = α̂, but it does not establish that α̂ lies in the interval (0, α₀) (see condition (18)). Thus we have not shown that the unique possibility distribution {r_i}, defined by (17) with α = α̂, satisfies the possibility-probability consistency conditions (r_i ≥ p_i, for i = 1, 2, ..., n). In fact, we discovered an example in which the possibility-probability consistency condition is not satisfied. However, we have not pursued this point since, as we shall show in the next subsection, it is not always possible to transform from possibilities to probabilities using an interval scale.
For any real number α > 0, we define the function f(α) by

f(α) = Σ_{i=1}^{n} [(r_i − r̄)/α + 1/n] log₂ [(r_i − r̄)/α + 1/n] + Σ_{i=2}^{n} r_i log₂ [i/(i−1)] − Σ_{i=1}^{n−1} (r_i − r_{i+1}) log₂ [1 − i Σ_{j=i+1}^{n} r_j/(j(j−1))],

where r̄ = (1/n) Σ_{j=1}^{n} r_j. We note that this expression makes sense (i.e. is real) only when α > n(r̄ − r_n),
since the argument of each logarithm function must be strictly positive. In Fig. 2
we have plotted f(α) for the special case when the {r_i} are given by the formula

r_i = 1 − (1 − a)(i − 1)/(n − 1)   for i = 1, 2, ..., n,
with n = 10 and for several values of a in the range 0 < a < 1. This corresponds
Figure 2 Plots of function f(α) for several special possibility distributions and interval scale transformation r → p (Sec. 5.2).
to a distribution of {r_i} which are equally spaced between 1 and a. In particular, the figure illustrates that there is no (real) solution for α when a is less than about 0.4.
From this example we conclude that some possibilistic bodies of evidence cannot
be converted by interval scale transfornations to their probabilistic counterparts that
contain the same amount of uncertainty and information. Hence, interval scale trans-
formations are not acceptable for our purpose.
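This failure can be reproduced numerically. The sketch below is ours and relies on the reconstructions used above (p_i(α) = (r_i − r̄)/α + 1/n and the Sec. 3 forms of N and D); for the a = 0.3 member of the family, f(α) = T(r) − H(p(α)) stays strictly negative over its whole admissible domain, so no uncertainty-preserving interval scale transformation exists:

```python
import math

def total_uncertainty(r):
    """N(r) + D(r) for an ordered possibility distribution (Sec. 3 forms)."""
    n = len(r)
    N = sum(r[i - 1] * math.log2(i / (i - 1)) for i in range(2, n + 1))
    D = 0.0
    for i in range(1, n):
        y = i * sum(r[j - 1] / (j * (j - 1)) for j in range(i + 1, n + 1))
        D -= (r[i - 1] - r[i]) * math.log2(1 - y)
    return N + D

n, a = 10, 0.3
# Possibilities equally spaced between 1 and a.
r = [1 - (1 - a) * (i - 1) / (n - 1) for i in range(1, n + 1)]
rbar = sum(r) / n
alpha_min = n * (rbar - r[-1])   # p_n > 0 requires alpha > alpha_min

def f(alpha):
    p = [(ri - rbar) / alpha + 1.0 / n for ri in r]
    return total_uncertainty(r) + sum(pi * math.log2(pi) for pi in p)

# Scan the admissible range from just above alpha_min to very large alpha.
worst = max(f(alpha_min * (1 + 10.0 ** k)) for k in range(-6, 7))
print(worst < 0)   # True: f never reaches zero, so no root exists
```

Intuitively, every admissible p(α) already carries more Shannon entropy than the total possibilistic uncertainty of this rather specific r, so the two sides of Eq. (16) cannot be balanced.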
Under a log-interval scale, values r_i are obtained from values p_i by the transformation r_i = β(p_i)^α, where α and β are positive constants. From the possibilistic normalization requirement, we obtain 1 = β(p₁)^α and, hence, β = 1/(p₁)^α and

r_i = (p_i/p₁)^α,   i = 1, 2, ..., n.   (20)
and each r_i on the right-hand side is replaced, according to Eq. (20), with (p_i/p₁)^α. Thus, for any real number α > 0, we now define the function g(α) by

g(α) = Σ_{i=2}^{n} q_i^α log₂ [i/(i−1)] − Σ_{i=1}^{n−1} (q_i^α − q_{i+1}^α) log₂ [1 − y_i(α)] − H(p),

where q_i = p_i/p₁ and y_i(α) = i Σ_{j=i+1}^{n} q_j^α/(j(j−1)).
Any α for which g(α) = 0 is clearly a solution of Eq. (16). Questions regarding the existence and uniqueness of the solution of Eq. (16) can thus be studied by examining the behavior of this function.
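Anticipating the lemmas proved below (g(0⁺) = log₂(n) − H > 0, g(1) < 0, and g′ < 0), the unique root in (0, 1) can be computed by bisection. A sketch of ours under the reconstructed formulas above:

```python
import math

def H(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p)

def total_uncertainty(r):
    """N(r) + D(r) for an ordered possibility distribution (Sec. 3 forms)."""
    n = len(r)
    N = sum(r[i - 1] * math.log2(i / (i - 1)) for i in range(2, n + 1))
    D = 0.0
    for i in range(1, n):
        y = i * sum(r[j - 1] / (j * (j - 1)) for j in range(i + 1, n + 1))
        D -= (r[i - 1] - r[i]) * math.log2(1 - y)
    return N + D

def log_interval_p_to_r(p):
    """Unique alpha in (0, 1) with T(r(alpha)) = H(p), r_i = (p_i/p_1)**alpha.
    g decreases from log2(n) - H(p) > 0 near 0 to g(1) < 0, so bisect."""
    g = lambda a: total_uncertainty([(pi / p[0]) ** a for pi in p]) - H(p)
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    alpha = (lo + hi) / 2
    return alpha, [(pi / p[0]) ** alpha for pi in p]

alpha, r = log_interval_p_to_r([0.4, 0.3, 0.2, 0.1])
print(alpha, r)
```

Note that for α ∈ (0, 1] and p₁ ≤ 1 we have (p_i/p₁)^α ≥ p_i/p₁ ≥ p_i, so the computed r automatically satisfies the possibility-probability consistency condition.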
LEMMA 6.1 If p_i = 1/n for 1 ≤ i ≤ n, then g(α) = 0 for all α > 0.

Proof To see this, we note first that, in this case, each q_i = 1, so that for any α > 0 the right-hand side of (16) equals log₂(n) = H(p), and hence g(α) = 0. Q.E.D.

In Fig. 3 we have plotted g(α) for a family of probability distributions indexed by a parameter a which satisfies 0 < a < 1. Here the {p_i} are uniformly distributed between 2a/(n(1 + a)) and 2/(n(1 + a)). In particular, when a = 1, each p_i = 1/n. The figure illustrates for this particular probability distribution that g(α) has a unique simple root which lies in the interval (0, 1).
Figure 3 Plots of function g(α) for several special probability distributions and log-interval scale transformation p → r (Sec. 6.1).
We shall now prove three lemmas which will establish the existence and uniqueness of a root of g(α) in the interval 0 < α < 1 for a general probability distribution p (assumed to be ordered).
LEMMA 6.2 As α → 0⁺, g(α) → g(0⁺) = log₂(n) − H > 0.

Proof Since each ratio p_i/p₁ > 0, it follows that each term (p_i/p₁)^α → 1 as α → 0⁺. Hence g(0⁺) = log₂(n) − H, which is positive by (19). Q.E.D.
where

[…] − log₂(s) − Σ_{i=1}^{n−1} (r_i − r_{i+1}) log₂(1 − y_i),

s ≡ Σ_{i=1}^{n} r_i = (1/p₁) Σ_{i=1}^{n} p_i = 1/p₁,   and   y_i = i Σ_{j=i+1}^{n} r_j/(j(j−1)),
for all 0 < x < 1. Thus, h₂(x) < 0 for 0 < x < 1 and h₂(x) = 0 only when x = 0 or x = 1. This establishes (22) for the special case when n = 2.
To show (22) holds for n ≥ 3, we shall demonstrate first that […].   (23)

To see that (23) holds, we first use (21) to express the difference h_n − h₂ (after a little manipulation) in the form

h_n − h₂ = −(1/s) Σ_{i=2}^{n−1} r_i log(r_i) + (1/S − 1/s) r_n log(r_n) + […].   (25)
Using our inequality (a) above with z = r_k in each term in the first summation above and then using inequality (c) in the second term above, we can write […]. Thus, the next to last term in (25) is also non-negative. The first term in (25) will be non-negative if we can show that (n − 1)(1 − y₁)s/(S(1 − y_{n−1})) > 1, or equivalently, that […].
Now D will be non-negative if we can show that each B_k ≥ 0. To see that this is the case, we note that, for 2 ≤ k ≤ n − 1, […] and […] for 2 ≤ i ≤ n − 2. Using inequality (a) above with z = r_k and inequality (c) with z = i(1 − y_{n−1})S/(s(n − 1)(1 − y_i)) − 1, we can write
where C_k, 2 ≤ k ≤ n − 1, is defined by […].

Now, to show that each term is nonnegative, we only need to demonstrate that each C_k ≥ 0. But, from their definitions, we have, for 2 ≤ k ≤ n − 1, […] and […]. Thus, each term on the right side of (25) is non-negative and hence (23) is established.
Finally, to show that (24) holds, we use equation (21) with r_n = x and r_{n−1} = r_{n−2} = ... = r₁ = 1 to write […].   (26)

We now use inequality (a) above with z = x in the first term in (26), then use inequality (b) with z = x(1 − x)/(k(k − 1)) in the second term, and finally inequality (c) with z = (1 − x)/(k − 1) in the third term, to write
where

y_i(α) = i Σ_{j=i+1}^{n} q_j^α/(j(j−1))   and   y_i′(α) = i Σ_{j=i+1}^{n} q_j^α log₂(q_j)/(j(j−1)).
We shall now show that g′(α) < 0 by showing that each term in each of the summations in (27) above is nonpositive and that at least one of these terms is strictly negative. To do this, we examine first the last summation in (27) and note first that y_i(α) < i Σ_{j=i+1}^{n} 1/(j(j−1)) = 1 − i/n, since each q_j ≤ 1 and q_n < 1. Then 1 − y_i(α) > 1 − (1 − i/n) = i/n > 0. Also, y_i′(α) < 0, since log₂(q_j) ≤ 0 (1 ≤ j ≤ n − 1) and log₂(q_n) < 0. Thus each term in the last summation is nonpositive (since each term (q_i^α − q_{i+1}^α) is nonnegative) and the last term is strictly negative.

The single term just before the last summation is strictly negative since […]. Finally, each term in the first summation will be nonpositive if we can show that […].
and each p_i on the left-hand side is replaced with the expression on the right side of Eq. (29), i.e. with P_i = r_i^{1/α}/P, where P = Σ_{j=1}^{n} r_j^{1/α}.

For any real number α > 0, we define the function f(α) by

f(α) = Σ_{i=1}^{n} P_i log₂(P_i) + T,

where T = N(r) + D(r).
if r₁ = r₂ = ... = r_n = 1, then T = T_M = log₂(n).
we can write
Figure 4 Plots of function f(α) for several special possibility distributions and log-interval scale transformation r → p (Sec. 6.2).
Furthermore, if T = T_M, then each term in the summation in (30) must vanish and we are led to the conditions that each r_i = 1, for 1 ≤ i ≤ n. Thus, we have shown that

T ≤ T_M = log₂(n),

and, furthermore, T = T_M if and only if r₁ = r₂ = ... = r_n = 1.
We shall now prove the existence and uniqueness of a solution to the equation f(α) = 0, which lies in the interval 0 < α < 1. To this end, we prove the following lemmas.

LEMMA 6.6 If r_n = 1, then f(α) = 0 for all α > 0.

Proof To see this, just note that, if r_n = 1, then r₁ = r₂ = ... = r_n = 1. Consequently, for any α > 0, we have T = log₂(n), P = n, P_i = 1/n, and hence f(α) = log₂(n) − log₂(n) = 0.
Q.E.D.
As α → 0⁺, letting N denote the number of components r_i that are equal to 1 (N < n since r_n < 1), we have

r_i^{1/α} → 1 if 1 ≤ i ≤ N,   r_i^{1/α} → 0 if N < i ≤ n,

so that P → N and

P_i → 1/N if 1 ≤ i ≤ N,   P_i → 0 if N < i ≤ n.

Thus, as α → 0⁺, f(α) tends to a positive limit, since each term in the first summation is strictly positive and each term in the second summation is non-negative. Q.E.D.
LEMMA 6.8 When α = 1, the function f is strictly negative, i.e. f(1) < 0.

Proof From the definition of f(α), we can write
where p_i = r_i/s and s = Σ_{i=1}^{n} r_i. Inserting the definition of P_i into this expression, we find that we can express f(1) as g(1), where g(1) is defined in the beginning of the proof of Lemma 6.3. Thus, since by Lemma 6.3 we know that g(1) < 0, we see that we must also have f(1) < 0.
Q.E.D.
We now make the following observation. Let α̂ be any root of f in the interval (0, 1). Suppose we could show that f′(α̂) < 0, i.e. that the derivative of f is strictly negative at a root. Then it would follow that f has only one root in the interval (0, 1). The reason for this is the following. Suppose that f has k roots at α = α_i, where 0 < α₁ ≤ α₂ ≤ ... ≤ α_k < 1. Since f′(α̂) ≠ 0, it follows that the roots of f are all simple and hence the derivative of f must alternate in sign at these roots, i.e. f′(α_i) f′(α_{i+1}) < 0 for i = 1, 2, ..., k − 1. But this would be impossible if f′ is negative at each root.
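Taken together, Lemmas 6.7 and 6.8 bracket a root of f in (0, 1), so the inverse transformation can also be computed by bisection. The sketch below is ours and assumes the reconstructions used above, namely f(α) = T(r) − H(p(α)) with p_i = r_i^{1/α}/Σ_j r_j^{1/α}; it requires r_n < 1, since otherwise any α works (Lemma 6.6):

```python
import math

def total_uncertainty(r):
    """N(r) + D(r) for an ordered possibility distribution (Sec. 3 forms)."""
    n = len(r)
    N = sum(r[i - 1] * math.log2(i / (i - 1)) for i in range(2, n + 1))
    D = 0.0
    for i in range(1, n):
        y = i * sum(r[j - 1] / (j * (j - 1)) for j in range(i + 1, n + 1))
        D -= (r[i - 1] - r[i]) * math.log2(1 - y)
    return N + D

def log_interval_r_to_p(r):
    """Bisect f(alpha) = T(r) - H(p(alpha)) on (0, 1), where
    p_i = r_i**(1/alpha) / sum_j r_j**(1/alpha); f > 0 near 0 (Lemma 6.7)
    and f(1) < 0 (Lemma 6.8).  Assumes r is ordered with r_1 = 1 > r_n."""
    T = total_uncertainty(r)
    def p_of(a):
        w = [ri ** (1.0 / a) for ri in r]
        return [x / sum(w) for x in w]
    def f(a):
        # Skip underflowed zero components: x*log2(x) -> 0 as x -> 0.
        return T + sum(x * math.log2(x) for x in p_of(a) if x > 0)
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    a = (lo + hi) / 2
    return a, p_of(a)

alpha, p = log_interval_r_to_p([1.0, 0.8, 0.5, 0.2])
print(alpha, p)   # p sums to 1 and carries the same uncertainty as r
```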
We now prove the following lemma.
Substituting this expression for P_i′ into our expression for f′(α), we find
So far we have just computed an expression for f′ for a general value of α. Now, if α̂ is a root of f(α) = 0, then

Σ_{i=1}^{n} P_i log₂(P_i) = −T,

from the definition of f(α). Using this expression, we find that our expression for the derivative of f at a root of f is negative provided that

Σ_{i=1}^{n} P_i [log₂(P_i)]² − T² > 0

at a root of f(α) = 0, i.e. when

Σ_{i=1}^{n} P_i log₂(P_i) = −T.
0 < Σ_{i=1}^{n} P_i [log₂(P_i) + T]²   (since each term is non-negative and at least one is positive)
  = Σ_{i=1}^{n} P_i [log₂(P_i)]² + 2T Σ_{i=1}^{n} P_i log₂(P_i) + T² = Σ_{i=1}^{n} P_i [log₂(P_i)]² − T²,

using Σ_{i=1}^{n} P_i log₂(P_i) = −T.
7. EXAMPLES
Figure 5 Overview of the log-interval scale transformation between probabilities and possibilities: p → r via Eq. I, r_i = (p_i/p₁)^α, with α determined from Eq. II, H(p) = N(r) + D(r); the inverse transformation r → p uses the exponent 1/α.
Table I Values of the transformation constant α: (a) transformation p → r; (b) transformation r → p.

(a): any value, 0.5888859, 0.5425401, 0.5086888, 0.4805572, 0.4557550, 0.4331166, 0.4119632, 0.3918519, 0.3724626
(b): any value, 0.3724626, 0.3918519, 0.4119632, 0.4331166, 0.4557550, 0.4805572, 0.5086888, 0.5425401, 0.5888859, any value
and, similarly,
As a second example, let us consider the problem of estimating the age of a per-
son, say Joe Doe, from two independent sources of incomplete information. From
the first source, we know that "Joe is around 21, but certainly not less than 19 or
more than 23." From the second source, we know that he is a student at a particular
college and we happen to have information on the age distribution of current students
at the college. How can we combine these two sources of information to obtain the
best estimate of the age of Joe Doe?
Assume that the age of students at the college varies from 17 to 26. That is, our universal set, X, is the set of integers {17, 18, ..., 26}. Information from the first source is almost perfectly expressed by a fuzzy set F on X whose membership function μ_F is specified in Figure 7a. As mentioned at the end of Sec. 2, values of this function can directly be interpreted as possibilities. That is, the information can readily be expressed by the possibility distribution r₁. A graph of this distribution is identical with the graph of μ_F in Fig. 7a. The corresponding nested body of evidence is shown in Fig. 7b.
Information from the second source is of a statistical nature and, hence, it can be
expressed directly in terms of a probability distribution on X. Assume that the prob-
ability distribution derived from the statistical data is
The two pieces of information are now represented in different theories, possibility theory and probability theory, and, hence, they are not compatible. To combine them, we need either to convert the probability distribution p₂ to its meaningful possibilistic counterpart, r₂, or to convert the possibility distribution r₁ to its meaningful probabilistic counterpart, p₁. Since either conversion should preserve information contained in the given distribution, we can utilize the main results of this paper and employ the log-interval information-preserving transformation.
Applying the transformation to the probability distribution p₂, we obtain for α = 0.3302797 the possibility distribution r₂.
A graph of this distribution is shown in Fig. 7c, and the corresponding nested
body of evidence is given in Fig. 7d.
Figure 7 Possibility distributions and the corresponding nested bodies of evidence derived from two
incompatible sources of incomplete information (the example of estimated age).
This distribution and the associated nested body of evidence are shown in Fig.
8a,b, respectively. From r₂, we can easily calculate values of the basic assignment, m₂, the necessity measure, Nec₂, and the possibility measure, Pos₂, for the four focal elements. These are given in Table II. For other subsets of X, clearly, the necessity measure is always zero.
It is not obvious how to combine p₁ and p₂, but, again, this is not our concern in
this paper.
8. DISCUSSION
The major results of this paper, which are expressed by Theorem 2 (supplemented with Lemma 6.1) and Theorem 3 (supplemented with Lemma 6.6), are: the log-interval scale allows us to transform any given probability distribution on a finite set to a unique possibilistic counterpart that contains the same amount of information and, similarly, the scale facilitates a unique information-preserving inverse transformation for any given possibility distribution; furthermore, the range of the transformation constant α (Fig. 5) is the open interval (0, 1), which implies that the possibility-probability consistency condition is satisfied by the transformation. It is interesting to observe that the log-interval transformation for α = 1, which is used
Table 11  Necessity and possibility values for the focal elements defined by the possibility distribution r given in Fig. 8.

Focal elements     m        Nec      Pos
{21}               0.042    0.042    1
Since the value of possibilistic discord is severely restricted and often negligible when compared with nonspecificity, the term D(r) in Eq. 11 in Fig. 5 plays a relatively minor role and may often be neglected, especially for large bodies of evidence. This saves computing time. Let us mention in this context that the existence and uniqueness of the transformation are not affected when it is simplified by excluding the term D(r) from Eq. 11. We fully analyzed the simplified transformation and found that existence and uniqueness hold in both directions.
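The simplified transformation just described amounts to choosing α so that the nonspecificity N(r) alone matches the Shannon entropy of p. A hedged sketch of this fit, under our assumptions that entropy is measured in bits and that N is the standard U-uncertainty N(r) = Σ_{i≥2} r_i log2(i/(i−1)) of an ordered possibility distribution; since N decreases as α grows, simple bisection on (0, 1) suffices (the function names are ours):

```python
import math

def shannon(p):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def nonspecificity(r):
    """U-uncertainty of an ordered possibility distribution (r[0] = 1)."""
    return sum(r[i] * math.log2((i + 1) / i) for i in range(1, len(r)))

def fit_alpha(p):
    """Solve N(r(alpha)) = H(p) for the simplified transformation.

    Assumes p is sorted in non-increasing order and non-uniform, so the
    target entropy lies strictly between N at alpha -> 1 and alpha -> 0.
    """
    h = shannon(p)
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(200):  # bisection: N(r(alpha)) is decreasing in alpha
        mid = (lo + hi) / 2
        r = [(pi / p[0]) ** mid for pi in p]
        if nonspecificity(r) > h:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

The full transformation of Eq. 11 would instead match H(p) against N(r) + D(r); dropping D(r) merely shifts the fitted α slightly, in line with the remark above.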
After the mathematical analysis presented in this paper, which establishes the log-interval scale as the only meaningful scale for the desired transformation, the next stage of research in this area should focus on investigating the pragmatic value of this mathematically sound transformation. Covering as many application areas as possible, the purpose of this investigation in each area is to compare, by relevant performance criteria, the use of the log-interval information-preserving transformation with that of other transformations.
REFERENCES
12. G. J. Klir and T. A. Folger, Fuzzy Sets, Uncertainty, and Information. Prentice Hall, Englewood Cliffs, N.J., 1988.
13. G. J. Klir and A. Ramer, "Uncertainty in the Dempster-Shafer theory: a critical re-examination." Intern. J. of General Systems, 18, No. 2, 1990, pp. 155-166.
14. N. S. Lee, Y. L. Grize, and K. Dehnad, "Quantitative models for reasoning under uncertainty in knowledge-based expert systems." Intern. J. of Intelligent Systems, 2, 1987, pp. 15-38.
15. G. Matheron, Random Sets and Integral Geometry. John Wiley, New York, 1975.
16. H. T. Nguyen, "On random sets and belief functions." J. of Math. Analysis and Appl., 65, 1978, pp. 531-542.
17. Z. Pawlak, "Rough sets." Intern. J. of Computer and Information Sciences, 11, 1982, pp. 341-356.
18. A. Ramer, "Inequalities in evidence theory based on concordance and conflict." Proc. 4th IFSA Congress, Brussels, 1991.
19. A. Ramer and G. J. Klir, "Measures of conflict and discord." Information Sciences, 1992 (to appear).
20. G. Shafer, A Mathematical Theory of Evidence. Princeton Univ. Press, Princeton, N.J., 1976.
21. E. H. Shortliffe, Computer-Based Medical Consultations: MYCIN. Elsevier, New York, 1976.
22. J. R. Sims and Z. Wang, "Fuzzy measures and fuzzy integrals: an overview." Intern. J. of General Systems, 17, Nos. 2-3, 1990, pp. 157-189.
23. H. E. Stephanou and A. P. Sage, "Perspectives on imperfect information processing." IEEE Trans. on Systems, Man, and Cybernetics, SMC-17, No. 5, 1987, pp. 780-798.
24. M. Sugeno, "Fuzzy measures and fuzzy integrals: a survey." In: Fuzzy Automata and Decision Processes, edited by M. M. Gupta, G. N. Saridis, and B. R. Gaines. North-Holland, Amsterdam and New York, 1977, pp. 89-102.
25. P. Walley, Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London and New York, 1991.
26. P. Walley and T. L. Fine, "Varieties of modal (classificatory) and comparative probability." Synthese, 41, 1979, pp. 321-374.
27. Z. Wang and G. J. Klir, Fuzzy Measure Theory. Plenum Press, New York, 1992.
28. R. R. Yager et al. (eds.), Fuzzy Sets and Applications: Selected Papers by L. A. Zadeh. Wiley-Interscience, New York, 1987.
29. L. A. Zadeh, "Fuzzy sets." Information and Control, 8, 1965, pp. 338-353.
30. L. A. Zadeh, "Fuzzy sets as a basis for a theory of possibility." Fuzzy Sets and Systems, 1, No. 1, 1978, pp. 3-28.
For biographies and photographs of James F. Geer and George J. Klir, please see Vol. 19, No. 2, 1991, p. 132, and Vol. 17, Nos. 2-3, 1990, p. 275, respectively.