
Fuzzy Sets and Systems 132 (2002) 11–32

www.elsevier.com/locate/fss

Possibilistic information theory: a coding theoretic approach☆

Andrea Sgarro\u2217

Department of Mathematical Sciences (DSM), University of Trieste, 34100 Trieste, Italy

Received 20 April 2001; accepted 19 November 2001

Abstract

We define information measures which pertain to possibility theory and which have a coding-theoretic meaning. We put forward a model for information sources and transmission channels which is possibilistic rather than probabilistic. In the case of source coding without distortion we define a notion of possibilistic entropy, which is connected to the so-called Hartley's measure; we also tackle the case of source coding with distortion. In the case of channel coding we define a notion of possibilistic capacity, which is connected to a combinatorial notion called graph capacity. In the probabilistic case Hartley's measure and graph capacity are relevant quantities only when the allowed decoding error probability is strictly equal to zero, while in the possibilistic case they are relevant for whatever value of the allowed decoding error possibility; as the allowed error possibility becomes larger the possibilistic entropy decreases (one can reliably compress data to smaller sizes), while the possibilistic capacity increases (one can reliably transmit data at a higher rate). We put forward an interpretation of possibilistic coding which is based on distortion measures. We discuss an application where possibilities are used to cope with uncertainty as induced by a "vague" linguistic description of the transmission channel.

© 2001 Elsevier Science B.V. All rights reserved.

Keywords: Measures of information; Possibility theory; Possibilistic sources; Possibilistic entropy; Possibilistic channels; Possibilistic capacity; Zero-error information theory; Graph capacity; Distortion measures

1. Introduction

When one speaks of possibilistic information theory, usually one thinks of possibilistic information measures, like U-uncertainty, say, and of their use in uncertainty management; the approach which one takes is axiomatic, in the spirit of the validation of Shannon's entropy which is obtained by using Hinčin's axioms; cf. e.g. [8,12–14]. In this paper we take a different approach: we define information measures which pertain to possibility theory and which have a coding-theoretic meaning. This kind of operational approach to information measures was first taken by Shannon when he laid down the foundations of information theory in his seminal paper of 1948 [18], and has proved to be quite successful; it has led to such important probabilistic functionals as source entropy or channel capacity. Below we shall adopt a model for information sources and transmission channels which is possibilistic rather than probabilistic (it is based on logic rather than statistics); this will lead us to define a notion of possibilistic entropy and a notion of possibilistic capacity in much the same way as one arrives at the corresponding probabilistic notions. An interpretation of possibilistic coding is discussed, which is based on distortion measures, a notion which is currently used in probabilistic coding.

☆ Partially supported by MURST and GNIM-CNR. Part of this paper, based mainly on Section 5, has been submitted for presentation at Ecsqaru-2001, to be held in September 2001 in Toulouse, France.

∗ Corresponding author. Tel.: +40-6762623; fax: +40-6762636.

E-mail address: sgarro@univ.trieste.it (A. Sgarro).

0165-0114/01/$ – see front matter © 2001 Elsevier Science B.V. All rights reserved.

PII: S0165-0114(01)00245-7


We are confident that our operational approach may help to enlighten, if not to disentangle, the vexed question of defining adequate information measures in possibility theory.

We recall that both the entropy of a probabilistic source and the capacity of a probabilistic channel are asymptotic parameters; more precisely, they are limit values for the rates of optimal codes: compression codes in the case of sources, and error-correction codes in the case of channels. The codes one considers are constrained to satisfy a reliability criterion of the type: the decoding-error probability of the code should be at most equal to a tolerated value ε, 0 ≤ ε ≤ 1. A streamlined description of source codes and channel codes will be given below in Sections 4 and 5; even from our fleeting hints it is however apparent that, at least a priori, both the entropy of a source and the capacity of a channel depend on the value ε which has been chosen to specify the reliability criterion. If in the probabilistic models the mention of ε is usually omitted, the reason is that the asymptotic values for the optimal rates are the same whatever the value of ε, provided however that ε is strictly positive.¹ Zero-error reliability criteria lead instead to quite different quantities, zero-error entropy and zero-error capacity. Now, the problem of compressing information sources at zero error is so trivial that the term zero-error entropy is seldom used, if ever.² Instead, the zero-error problem

¹ The entropy and the capacity relative to a positive error probability ε allow one to construct sequences of codes whose probability of a decoding error is actually infinitesimal; it will be argued below that this point of view does not make much sense for possibilistic coding; cf. Remark 4.3.

² No error-free data compression is feasible for probabilistic sources if one insists, as we do below, on using block-codes, i.e., codes whose codewords all have the same length; this is why one has to resort to variable-length codes, e.g., to Huffman codes. As for variable-length coding, the possibilistic theory appears to lack a counterpart for the notion of average length; one would have to choose one of the various aggregation operators which have been proposed in the literature (for the very broad notion of aggregation operators, and of "averaging" aggregations in particular, cf., e.g., [12] or [16]). Even if one insists on using block-codes, the problem of data compression at zero error is far from trivial when a distortion measure is introduced; cf. Appendix B. In this paper we deal only with the basics of Shannon's theory, but extensions are feasible to more involved notions, compound channels, say, or multi-user communication (as for these information-theoretic notions cf., e.g., [3] or [4]).

of data protection in noisy channels is devilishly difficult, and has led to a new and fascinating branch of coding theory, and more generally of information theory and combinatorics, called zero-error information theory, which has recently been overviewed and extensively referenced in [15]. In particular, the zero-error capacity of a probabilistic channel is expressed in terms of a remarkable combinatorial notion called Shannon's graph capacity (graph-theoretic preliminaries are described in Appendix A).

So, to be fastidious, even in the case of probabilistic entropy and probabilistic capacity one deals with two step-functions of ε, which can assume only two distinct values, one for ε = 0 and the other for ε > 0. We shall adopt a model of the source and a model of the channel which are possibilistic rather than probabilistic, and shall choose a reliability criterion of the type: the decoding-error possibility should be at most equal to ε, 0 ≤ ε ≤ 1. As shown below, the possibilistic analogues of entropy and capacity exhibit quite a perspicuous step-wise behaviour as functions of ε, and so the mention of ε cannot be disposed of. As for the "form" of the functionals one obtains, it is of the same type as in the case of the zero-error probabilistic measures, even if the tolerated error possibility is strictly positive. In particular, the capacities of possibilistic channels are always expressed in terms of graph capacities; in the possibilistic case, however, as one loosens the reliability criterion by allowing a larger error possibility, the relevant graph changes and the capacity of the possibilistic channel increases.
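The step-wise behaviour can be sketched numerically. As a hedged illustration only — we assume here that the possibilistic entropy at tolerated error possibility ε is the Hartley measure (base-2 logarithm of the count) of the symbols whose possibility strictly exceeds ε, a formula the paper makes precise only in Section 4 — the entropy is then a decreasing step function of ε:

```python
import math

def hartley_entropy(poss, eps):
    """Hartley measure (log2 of the cardinality) of the set of symbols
    whose possibility strictly exceeds the tolerated error possibility eps.
    Illustrative assumption; the formal definition is given in Section 4."""
    support = [p for p in poss if p > eps]
    return math.log2(len(support))

# A normalized possibility vector: the maximal component must equal 1.
pi = [1.0, 0.8, 0.8, 0.3]

# As eps grows, fewer symbols need to be distinguished, so the entropy drops
# in steps: 2.0 bits, then log2(3) bits, then 0 bits.
for eps in (0.0, 0.5, 0.9):
    print(eps, hartley_entropy(pi, eps))
```

Note the contrast with the probabilistic case, where only the two values ε = 0 and ε > 0 matter: here every threshold crossed by some πi produces a new step.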

We describe the contents of the paper. In Section 2, after some preliminaries on possibility theory, possibilistic sources and possibilistic channels are introduced. Section 3 contains two simple lemmas, Lemmas 3.1 and 3.2, which are handy tools apt to "translate" probabilistic zero-error results into the framework of possibility theory. Section 4 is devoted to possibilistic entropy and source coding; we have decided to deal in Section 4 only with the problem of source coding without distortion, and to relegate the more taxing case of source coding with distortion to an appendix (Appendix B); this way we are able to make many of our points in an extremely simple way. In Section 5, after giving a streamlined description of channel coding, possibilistic capacity is defined and a coding theorem is provided. Section 6 explores


the consequences of changing the reliability criterion used in Section 5; one requires that the average error possibility should be small, rather than the maximal error possibility.³ Up to Section 6, our point of view is rather abstract: the goal is simply to understand what happens when one replaces probabilities by possibilities in the standard models for data transmission. A discussion of the practical meaning of our proposal is instead deferred to Section 7: we put forward an interpretation of the possibilistic model which is based on distortion measures. We discuss an application to the design of error-correcting telephone keyboards; in the spirit of "soft" mathematics possibilities are seen as numeric counterparts for linguistic labels, and are used to cope with uncertainty as induced by "vague" linguistic information.

Section 7 also points to future work, which does not simply aim at a possibilistic translation and generalization of the probabilistic approach. Open problems are mentioned, which might prove to be stimulating also from a strictly mathematical viewpoint. In this paper we take the asymptotic point of view which is typical of Shannon theory, but one might prefer to take the constructive point of view of algebraic coding, and try to provide finite-length code constructions, such as those hinted at in Section 7. We deem that the need for a solid theoretical foundation of "soft" coding, as possibilistic coding basically is, is proved by the fact that several ad hoc coding algorithms are already successfully used in practice, e.g., those for compressing images, which are not based on probabilistic descriptions of the source or of the channel (an exhaustive list of source coding algorithms is to be found in [21]). Probabilistic descriptions, which are derived from statistical estimates, are often too costly to obtain, or even unfeasible, and at the same time they are uselessly detailed.

The paper aims at a minimum level of self-containment, and so we have briefly re-described certain notions of information theory which are quite

³ The new possibilistic frame includes the traditional zero-error probabilistic frame, as argued in Section 3: it is enough to take possibilities which are equal to zero when the probability is zero, and equal to one when the probability is positive, whatever its value. However, the consideration of possibility values which are intermediate between zero and one does enlarge the frame; cf. Theorem 6.1 in Section 6, and the short comment made there just before giving its proof.

standard; for more details we refer the reader, e.g., to [3] or [4]. As for possibility theory, and in particular for a clarification of the elusive notion of non-interactivity, which is often seen as the natural possibilistic analogue of probabilistic independence (cf. Section 2), we mention [5,6,9,11,12,16,23].

2. Possibilistic sources and possibilistic channels

We recall that a possibility distribution Π over a finite set 𝒜 = {a1, …, ak}, called the alphabet, is defined by giving a possibility vector Π = (π1, π2, …, πk) whose components πi are the possibilities Π(ai) of the k singletons ai (1 ≤ i ≤ k, k ≥ 2):

Π(ai) = πi,  0 ≤ πi ≤ 1,  max_{1 ≤ i ≤ k} πi = 1.

The possibility⁴ of each subset A ⊆ 𝒜 is the maximum of the possibilities of its elements:

Π(A) = max_{ai ∈ A} πi.  (2.1)

In particular Π(∅) = 0 and Π(𝒜) = 1. In logical terms, taking a maximum means that event A is ε-possible when at least one of its elements is so, in the sense of a logical disjunction.
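The maxitive rule (2.1) is a one-liner in code. The sketch below (with hypothetical names, not taken from the paper) also checks the normalization condition max πi = 1 and the convention Π(∅) = 0:

```python
def possibility(pi, A, alphabet):
    """Possibility of the event A, a subset of the alphabet, under the
    possibility vector pi, computed by the maxitive rule (2.1)."""
    assert max(pi) == 1.0, "a possibility vector must be normalized"
    index = {a: i for i, a in enumerate(alphabet)}
    # The maximum over an empty event is 0, matching Π(∅) = 0.
    return max((pi[index[a]] for a in A), default=0.0)

alphabet = ["a1", "a2", "a3"]
pi = [0.3, 1.0, 0.6]

print(possibility(pi, {"a1", "a3"}, alphabet))   # max(0.3, 0.6) = 0.6
print(possibility(pi, set(), alphabet))          # 0.0
print(possibility(pi, set(alphabet), alphabet))  # 1.0, by normalization
```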

Instead, probability distributions are defined through a probability vector P = (p1, p2, …, pk), with P(ai) = pi, 0 ≤ pi ≤ 1 and Σ_{1 ≤ i ≤ k} pi = 1, and have an additive nature:

P(A) = Σ_{ai ∈ A} pi.
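The contrast between the two calculi shows up already on a union of two disjoint singleton events: probabilities add, possibilities take a maximum. A minimal check, with hypothetical numeric values:

```python
# Disjoint events A = {a1} and B = {a2} over a three-letter alphabet.
p  = [0.2, 0.5, 0.3]   # probability vector: components sum to 1
pi = [0.2, 1.0, 0.6]   # possibility vector: max component equals 1

P_union  = p[0] + p[1]        # additive: P(A ∪ B) = P(A) + P(B) = 0.7
Pi_union = max(pi[0], pi[1])  # maxitive: Π(A ∪ B) = max(Π(A), Π(B)) = 1.0

# Sanity-check the two normalization conditions.
assert abs(sum(p) - 1.0) < 1e-12 and max(pi) == 1.0
print(P_union, Pi_union)
```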

Compared with probabilities, an empirical interpretation of possibilities is less clear. The debate on the meaning and the use of possibilities is an ample and long-standing one; the reader is referred to standard texts on possibility theory, e.g., those quoted at the

⁴ The fact that the symbol Π is used both for vectors and for distributions will cause no confusion; below the same symbol will also be used to denote a stationary and non-interactive source, since the behaviour of the latter is entirely specified by the vector Π. Similar conventions will be tacitly adopted also in the case of probabilistic sources, and of probabilistic and possibilistic channels.
