Research Article
A Joint Representation of Rényi’s and Tsallis’ Entropy with Application in Coding Theory
Copyright © 2017 Litegebe Wondie and Satish Kumar. This is an open access article distributed under the Creative Commons
Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
We introduce a quantity called the Rényi–Tsallis entropy of order ξ and discuss some of its major properties in relation to Shannon entropy and other entropies in the literature. Further, we give its application in coding theory, and a coding theorem analogous to the ordinary coding theorem for a noiseless channel is proved. The theorem states that the proposed entropy is a lower bound on the mean code word length.
tions. For any probability distribution A = (a_1, a_2, ..., a_n) ∈ Δ_n, Shannon [1] defined an entropy given as

H(A) = -\sum_{i=1}^{n} a_i \log(a_i),  (1)

where the convention 0 log(0) = 0 is adopted (see Shannon [1]). Throughout this paper, logarithms are taken to the base D (D > 1). A number of parametric generalizations of Shannon entropy have been proposed by many authors in the literature, which reduce to (1) for specific values of the parameters. The presence of parameters makes an entropy more flexible from the point of view of applications. One of the first generalizations of (1) was proposed by Rényi [2] as

H_{Re}(A; \xi) = \frac{1}{1 - \xi} \log \left( \sum_{i=1}^{n} a_i^{\xi} \right); \quad \xi > 0 \ (\ne 1).  (2)

Another well-known entropy was proposed by Havrda and Charvát [3]:

H_{hc}(A; \xi) = \left( 2^{1-\xi} - 1 \right)^{-1} \left( \sum_{i=1}^{n} a_i^{\xi} - 1 \right); \quad \xi > 0 \ (\ne 1).  (3)

Tsallis [4] proposed the entropy

H_{T}(A; \xi) = \frac{1}{1 - \xi} \left( \sum_{i=1}^{n} a_i^{\xi} - 1 \right); \quad \xi > 0 \ (\ne 1).  (4)

Equations (3) and (4) have essentially the same expression except for the normalizing factor. The Havrda–Charvát entropy is normalized to 1; that is, if A = (1/2, 1/2), then H_hc(A; ξ) = 1, whereas Tsallis’ entropy is not normalized. Both entropies yield the same result otherwise, and we call these entropies the Tsallis–Havrda–Charvát entropy. Equations (2), (3), and (4) reduce to (1) when ξ → 1.

N. R. Pal and S. K. Pal [5, 6] have proposed an exponential entropy as

H_{pp}(A) = \sum_{i=1}^{n} a_i \left( e^{1-a_i} - 1 \right).  (5)

These authors claim that the exponential entropy has some advantages over Shannon’s entropy, especially in the context of image processing. One such claim is that the exponential entropy has a fixed upper bound: for the uniform distribution (1/n, 1/n, ..., 1/n), the entropy in (5) satisfies

\lim_{n \to \infty} H_{pp}\left( \frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n} \right) = e - 1.  (6)
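As a quick numerical illustration of (1)–(5), the sketch below (the function names are ours, not the paper’s; logarithms are taken to the base D = 2) computes each entropy and checks two of the stated facts: H_hc((1/2, 1/2); ξ) = 1, and Rényi’s and Tsallis’ entropies tend to Shannon’s as ξ → 1.

```python
import math

def shannon(a, D=2):
    # H(A) = -sum a_i log_D(a_i), with the convention 0 log 0 = 0 -- Eq. (1)
    return -sum(p * math.log(p, D) for p in a if p > 0)

def renyi(a, xi, D=2):
    # H_Re(A; xi) = (1/(1 - xi)) log_D(sum a_i^xi) -- Eq. (2)
    return math.log(sum(p ** xi for p in a), D) / (1 - xi)

def havrda_charvat(a, xi):
    # H_hc(A; xi) = (2^(1-xi) - 1)^(-1) (sum a_i^xi - 1) -- Eq. (3)
    return (sum(p ** xi for p in a) - 1) / (2 ** (1 - xi) - 1)

def tsallis(a, xi):
    # H_T(A; xi) = (1/(1 - xi)) (sum a_i^xi - 1) -- Eq. (4)
    return (sum(p ** xi for p in a) - 1) / (1 - xi)

def exponential_pal(a):
    # H_pp(A) = sum a_i (e^(1 - a_i) - 1) -- Eq. (5)
    return sum(p * (math.exp(1 - p) - 1) for p in a)

A = (0.5, 0.3, 0.2)
print(round(havrda_charvat((0.5, 0.5), 0.7), 6))    # 1.0: normalized to 1
print(abs(renyi(A, 1.000001) - shannon(A)) < 1e-4)  # True: Renyi -> Shannon (bits)
# Tsallis' limit is Shannon's entropy in nats, hence the ln 2 factor:
print(abs(tsallis(A, 1.000001) - shannon(A) * math.log(2)) < 1e-4)  # True
print(abs(exponential_pal([1 / 1000] * 1000) - (math.e - 1)) < 0.005)  # True: Eq. (6)
```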
2 International Journal of Mathematics and Mathematical Sciences
Theorem 1. The parametric entropy H_ξ(A), where A = (a_1, a_2, ..., a_n), 0 < a_i ≤ 1, \sum_{i=1}^{n} a_i = 1, has the following properties.

H_\xi(a_1, a_2, \ldots, a_n; 0) = H_\xi(a_1, a_2, \ldots, a_n).  (9)

(4) Decisive:

H_\xi(0, 1) = H_\xi(1, 0) = 0.  (10)

since 0 < ξ < 1 implies ξ^{-1} - ξ > 0. Therefore, we get

Case 2 (1 < ξ < ∞). The proof follows the same lines as in Case 1. (Note that the inequality in (14) is reversed for 1 < ξ < ∞.)

Property (6). Now, we prove that H_ξ(A) is a concave function of A ∈ Δ_n.
\frac{\partial^2 H_\xi(A)}{\partial a_i^2} = \frac{\xi}{\xi^{-1} - \xi} \left( \frac{(\xi - 1) \sum_{i=1}^{n} a_i^{\xi} \sum_{i=1}^{n} a_i^{\xi-2} \left( \sum_{i=1}^{n} a_i^{\xi} + 1 \right) - \xi \left( \sum_{i=1}^{n} a_i^{\xi-1} \right)^{2}}{\left( \sum_{i=1}^{n} a_i^{\xi} \right)^{2}} \right).  (16)

Now, for 0 < ξ < 1,

(\xi - 1) \sum_{i=1}^{n} a_i^{\xi} \sum_{i=1}^{n} a_i^{\xi-2} \left( \sum_{i=1}^{n} a_i^{\xi} + 1 \right) - \xi \left( \sum_{i=1}^{n} a_i^{\xi-1} \right)^{2} < 0, \qquad \frac{\xi}{\xi^{-1} - \xi} > 0.  (17)

This implies that H_ξ(A) is a concave function of A ∈ Δ_n.

3. A Measure of Length

Let a finite set of n input symbols X = (x_1, x_2, ..., x_n) be encoded using an alphabet of D symbols; then it has been shown by Feinstein [8] that there is a uniquely decipherable code with lengths N_1, N_2, ..., N_n if and only if Kraft’s inequality holds; that is,

\sum_{i=1}^{n} D^{-N_i} \le 1,  (18)

where D is the size of the code alphabet. Furthermore, if

L = \sum_{i=1}^{n} N_i a_i  (19)

is the average code word length, then for a code satisfying (18) the inequality

L \ge H(A)  (20)

is also fulfilled, and the equality L = H(A) holds if and only if

N_i = -\log_D(a_i) \quad \forall i = 1, 2, \ldots, n, \qquad \sum_{i=1}^{n} D^{-N_i} = 1.  (21)

If L < H(A), then by suitably encoding long sequences of symbols, the average length can be made arbitrarily close to H(A) (see Feinstein [8]). This is Shannon’s noiseless coding theorem. By considering Rényi’s entropy [2], a coding theorem analogous to the above noiseless coding theorem has been established by Campbell [9], who obtained bounds for it in terms of H_Re(A; ξ). Kieffer [10] defined a class of rules and showed that H_Re(A; ξ) is the best decision rule for deciding which of two sources can be coded with expected cost of sequences of length N when N → ∞, where the cost of encoding a sequence is assumed to be a function of length only. Further, Jelinek [11] showed that coding with respect to Campbell’s mean length is useful in minimizing the problem of buffer overflow, which occurs when the source symbols are produced at a fixed rate and the code words are stored temporarily in a finite buffer.

There are many different codes whose lengths satisfy the constraint (18). To compare different codes and pick out an optimum code it is customary to examine the mean length \sum_{i=1}^{n} N_i a_i and to minimize this quantity. This is a good procedure if the cost of using a sequence of length N_i is directly proportional to N_i. However, there may be occasions when the cost is more nearly an exponential function of N_i. This could be the case, for example, if the cost of encoding and decoding equipment were an important factor. Thus, in some circumstances, it might be more appropriate to choose a code which minimizes the quantity

C = \xi \log_D \left( \sum_{i=1}^{n} a_i D^{-N_i (\xi-1)/\xi} \right) + \left( \sum_{i=1}^{n} a_i D^{-N_i (\xi-1)/\xi} \right)^{\xi} - 1; \quad \xi > 0 \ (\ne 1),  (22)

where ξ is a parameter related to the cost. For reasons which will become evident later, we prefer to minimize a monotonic function of C; clearly, this will also minimize C.

In order to make the result of this paper more directly comparable with the usual coding theorem, we introduce a quantity which resembles the mean length. Let the code length of order ξ be defined by

L_\xi(A) = \frac{1}{\xi^{-1} - \xi} \left[ \xi \log_D \left( \sum_{i=1}^{n} a_i D^{-N_i (\xi-1)/\xi} \right) + \left( \sum_{i=1}^{n} a_i D^{-N_i (\xi-1)/\xi} \right)^{\xi} - 1 \right], \quad \xi > 0 \ (\ne 1).  (23)

Remark 2. As ξ → 1, (23) reduces to the well-known mean length studied by Shannon.

Remark 3. If all N_i are the same, say N_i = N, then (23) becomes

L_\xi(A) = \frac{1}{\xi^{-1} - \xi} \left[ N (1 - \xi) + D^{N(1-\xi)} - 1 \right]; \quad \xi > 0 \ (\ne 1).  (24)

This is a reasonable property for any measure of length to possess.
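The length measure (23) is straightforward to compute. The sketch below (our helper, not the paper’s; D = 2 by default, with illustrative code lengths satisfying Kraft’s inequality (18)) implements L_ξ(A) and numerically confirms Remark 3: substituting a common length N_i = N into (23) yields the closed form of (24).

```python
import math

def code_length_order_xi(a, N, xi, D=2):
    # L_xi(A) per Eq. (23): S = sum_i a_i D^(-N_i (xi - 1)/xi)
    S = sum(p * D ** (-n * (xi - 1) / xi) for p, n in zip(a, N))
    return (xi * math.log(S, D) + S ** xi - 1) / (1 / xi - xi)

# Illustrative binary lengths satisfying Kraft's inequality (18)
N = [2, 2, 2, 3, 4, 4]
assert sum(2 ** -n for n in N) <= 1

# Remark 3: with all N_i = N, substituting into (23) gives
# (N(1 - xi) + D^(N(1 - xi)) - 1) / (xi^(-1) - xi)  -- Eq. (24)
a = (0.3, 0.25, 0.2, 0.1, 0.1, 0.05)
xi, Nconst = 0.5, 3
lhs = code_length_order_xi(a, [Nconst] * len(a), xi)
rhs = (Nconst * (1 - xi) + 2 ** (Nconst * (1 - xi)) - 1) / (1 / xi - xi)
print(abs(lhs - rhs) < 1e-9)  # True
```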
Table 1: Relation between the entropy H_ξ(A) and the average code word length L_ξ(A), for ξ > 0 (≠ 1) (taking ξ = 0.5). N_i denote the lengths of the Huffman code words, η = H_ξ(A)/L_ξ(A) is the code efficiency, and ν = 1 − η is the redundancy of the code.
In the following theorem, we give a lower bound for L_ξ(A) in terms of H_ξ(A).

Theorem 4. If N_1, N_2, ..., N_n denote the lengths of a uniquely decipherable code satisfying (18), then

L_\xi(A) \ge H_\xi(A).  (25)

Proof. By Hölder’s inequality,

\left[ \sum_{i=1}^{n} a_i D^{-N_i (\xi-1)/\xi} \right] \ge \left[ \sum_{i=1}^{n} a_i^{\xi} \right]^{1/\xi}.  (30)

From (30), we get

\xi \log_D \left( \sum_{i=1}^{n} a_i D^{-N_i (\xi-1)/\xi} \right) + \left( \sum_{i=1}^{n} a_i D^{-N_i (\xi-1)/\xi} \right)^{\xi} - 1 \ge \log_D \left( \sum_{i=1}^{n} a_i^{\xi} \right) + \sum_{i=1}^{n} a_i^{\xi} - 1.  (31)

Dividing both sides of (31) by ξ^{-1} − ξ, which is positive for 0 < ξ < 1, yields (25); for ξ > 1 both (30) and the sign of ξ^{-1} − ξ are reversed, and (25) again follows.

In Table 1, we have developed the relation between the entropy H_ξ(A) and the average code word length L_ξ(A).
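Theorem 4 can be checked numerically. In the sketch below (our code, not the paper’s) we take the joint entropy in the form implied by (31), H_ξ(A) = [log_D(∑ a_i^ξ) + ∑ a_i^ξ − 1]/(ξ^{-1} − ξ) — an assumption, since the defining equation falls outside this excerpt — build binary Huffman code word lengths for the distribution P = (0.3, 0.25, 0.2, 0.1, 0.1, 0.05) used in Section 4, and verify L_ξ(A) ≥ H_ξ(A) for several ξ. At ξ = 0.5 the computed L_ξ(A) reproduces the value 1.87 reported in Table 2.

```python
import heapq
import math

def joint_entropy(a, xi, D=2):
    # Assumed joint form: H_xi(A) = (log_D(sum a_i^xi) + sum a_i^xi - 1) / (xi^(-1) - xi)
    s = sum(p ** xi for p in a)
    return (math.log(s, D) + s - 1) / (1 / xi - xi)

def code_length(a, N, xi, D=2):
    # L_xi(A) per Eq. (23)
    S = sum(p * D ** (-n * (xi - 1) / xi) for p, n in zip(a, N))
    return (xi * math.log(S, D) + S ** xi - 1) / (1 / xi - xi)

def huffman_lengths(a):
    # Binary Huffman code word lengths N_i for probabilities a (Huffman [13])
    heap = [(p, [i]) for i, p in enumerate(a)]
    heapq.heapify(heap)
    depth = [0] * len(a)
    while len(heap) > 1:
        p1, s1 = heapq.heappop(heap)
        p2, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            depth[i] += 1  # every symbol under a merge moves one level deeper
        heapq.heappush(heap, (p1 + p2, s1 + s2))
    return depth

P = (0.3, 0.25, 0.2, 0.1, 0.1, 0.05)
N = huffman_lengths(P)                 # length multiset {2, 2, 2, 3, 4, 4}
assert sum(2 ** -n for n in N) <= 1    # Kraft's inequality (18)
for xi in (0.1, 0.5, 0.9, 2, 10):
    assert code_length(P, N, xi) >= joint_entropy(P, xi)  # Theorem 4
print(round(code_length(P, N, 0.5), 2))   # 1.87, as in Table 2
```

The same computation at ξ = 10 gives approximately 2.1618, matching Table 3, which supports the reconstruction of (23) used here.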
From Table 1, we can observe that the average code word length L_ξ(A) exceeds the entropy H_ξ(A).

4. Monotonic Behaviour of Mean Code Word Length

In this section, we study the monotonic behaviour of the mean code word length (23) with respect to the parameter ξ. Let P = (0.3, 0.25, 0.2, 0.1, 0.1, 0.05) be the set of probabilities. For different values of ξ, the calculated values of L_ξ(A) are displayed in Tables 2 and 3.

Table 2: Values of L_ξ(A) for ξ < 1.
ξ        0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9
L_ξ(A)   1.25  1.67  1.78  1.82  1.87  1.92  1.96  1.99  2.01

Table 3: Values of L_ξ(A) for ξ > 1.
ξ        10      20      30      40      50      60      70      80      90      100
L_ξ(A)   2.1618  2.2037  2.2199  2.2285  2.2337  2.2373  2.2399  2.2418  2.2433  2.2446

The graphical representation of the monotonic behaviour of L_ξ(A) for ξ < 1 is shown in Figure 1, and that for ξ > 1 is shown in Figure 2.

[Figure 1: Monotonic behaviour of mean code word length L_ξ(A) (ξ < 1).]

[Figure 2: Monotonic behaviour of mean code word length L_ξ(A) (ξ > 1).]

Figures 1 and 2 explain the monotonic behaviour of L_ξ(A) for ξ < 1 and ξ > 1, respectively. From the figures, it is clear that L_ξ(A) increases monotonically with ξ, for ξ < 1 as well as for ξ > 1.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

References

[1] C. E. Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, 1948.
[2] A. Rényi, “On measures of entropy and information,” in Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 547–561, University of California Press, 1961.
[3] J. Havrda and F. S. Charvát, “Quantification method of classification processes. Concept of structural α-entropy,” Kybernetika, vol. 3, pp. 30–35, 1967.
[4] C. Tsallis, “Possible generalization of Boltzmann-Gibbs statistics,” Journal of Statistical Physics, vol. 52, no. 1-2, pp. 479–487, 1988.
[5] N. R. Pal and S. K. Pal, “Object-background segmentation using new definitions of entropy,” IEE Proceedings Part E: Computers and Digital Techniques, vol. 136, no. 4, pp. 284–295, 1989.
[6] N. R. Pal and S. K. Pal, “Entropy: a new definition and its applications,” IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, no. 5, pp. 1260–1270, 1991.
[7] T. O. Kvalseth, “On exponential entropies,” in Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 4, pp. 2822–2826, 2000.
[8] A. Feinstein, Foundations of Information Theory, McGraw-Hill, New York, NY, USA, 1956.
[9] L. L. Campbell, “A coding theorem and Rényi’s entropy,” Information and Control, vol. 8, no. 4, pp. 423–429, 1965.
[10] J. C. Kieffer, “Variable-length source coding with a cost depending only on the code word length,” Information and Control, vol. 41, no. 2, pp. 136–146, 1979.
[11] F. Jelinek, “Buffer overflow in variable length coding of fixed rate sources,” IEEE Transactions on Information Theory, vol. 14, no. 3, pp. 490–501, 1968.
[12] E. F. Beckenbach and R. Bellman, Inequalities, Springer, New York, NY, USA, 1961.
[13] D. A. Huffman, “A method for the construction of minimum-redundancy codes,” Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, 1952.