
# Haberleşme Matematiği (The Mathematics of Communication)


The entropy of a discrete set of probabilities $p_1, \ldots, p_n$ has been defined as:

$$H = -\sum p_i \log p_i.$$

In an analogous manner we define the entropy of a continuous distribution with the density distribution function $p(x)$ by:

$$H = -\int_{-\infty}^{\infty} p(x) \log p(x)\, dx.$$
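These two definitions can be checked numerically. The sketch below is illustrative only (it assumes base-2 logarithms, so entropies come out in bits): it computes the discrete entropy of a small probability set, and approximates the continuous integral by a Riemann sum for a uniform density on $[0, 4]$, where the integral evaluates to $\log 4$.

```python
import math

def discrete_entropy(probs):
    """H = -sum p_i log2 p_i for a discrete set of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def differential_entropy(p, a, b, n=100_000):
    """Midpoint Riemann-sum approximation of -int p(x) log2 p(x) dx on [a, b]."""
    dx = (b - a) / n
    h = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        px = p(x)
        if px > 0:
            h -= px * math.log2(px) * dx
    return h

# Discrete case: H(1/2, 1/4, 1/4) = 1.5 bits.
print(discrete_entropy([0.5, 0.25, 0.25]))             # 1.5

# Continuous case: the uniform density 1/4 on [0, 4] has H = log2 4 = 2 bits.
print(differential_entropy(lambda x: 0.25, 0.0, 4.0))  # ~2.0
```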

With an $n$ dimensional distribution $p(x_1, \ldots, x_n)$ we have

$$H = -\int \cdots \int p(x_1, \ldots, x_n) \log p(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$$

If we have two arguments $x$ and $y$ (which may themselves be multidimensional) the joint and conditional entropies of $p(x, y)$ are given by

$$H(x, y) = -\iint p(x, y) \log p(x, y)\, dx\, dy$$

and

$$H_x(y) = -\iint p(x, y) \log \frac{p(x, y)}{p(x)}\, dx\, dy$$

$$H_y(x) = -\iint p(x, y) \log \frac{p(x, y)}{p(y)}\, dx\, dy$$

where

$$p(x) = \int p(x, y)\, dy, \qquad p(y) = \int p(x, y)\, dx.$$

The entropies of continuous distributions have most (but not all) of the properties of the discrete case.
In particular we have the following:

1. If $x$ is limited to a certain volume $v$ in its space, then $H(x)$ is a maximum and equal to $\log v$ when $p(x)$ is constant ($1/v$) in the volume.


2. With any two variables $x$, $y$ we have

$$H(x, y) \le H(x) + H(y)$$

with equality if (and only if) $x$ and $y$ are independent, i.e., $p(x, y) = p(x)\,p(y)$ (apart possibly from a set of points of probability zero).

3. Consider a generalized averaging operation of the following type:

$$p'(y) = \int a(x, y)\, p(x)\, dx$$

with

$$\int a(x, y)\, dx = \int a(x, y)\, dy = 1, \qquad a(x, y) \ge 0.$$

Then the entropy of the averaged distribution $p'(y)$ is equal to or greater than that of the original distribution $p(x)$.
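This property has a simple discrete analogue in which the kernel $a(x, y)$ becomes a doubly stochastic matrix (every row and every column sums to 1). The sketch below is an illustration of that analogue, with entropies taken in bits:

```python
import math

def entropy(probs):
    """Discrete entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def average(a, p):
    """p'(y) = sum_x a[y][x] p(x): apply the averaging kernel a to p."""
    return [sum(a[y][x] * p[x] for x in range(len(p))) for y in range(len(a))]

# Doubly stochastic kernel: rows and columns each sum to 1, the discrete
# analogue of  int a dx = int a dy = 1  with  a >= 0.
a = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]

p = [0.8, 0.1, 0.1]        # a fairly concentrated distribution
p_avg = average(a, p)      # [0.45, 0.275, 0.275], still a distribution

print(entropy(p), entropy(p_avg))
assert entropy(p_avg) >= entropy(p)  # averaging never decreases entropy
```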

4. We have

$$H(x, y) = H(x) + H_x(y) = H(y) + H_y(x)$$

and

$$H_x(y) \le H(y).$$
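Properties 2 and 4 can be checked numerically on a discrete joint distribution standing in for $p(x, y)$, with sums replacing the integrals. A minimal sketch (entropies in bits, joint table chosen arbitrarily for illustration):

```python
import math

# A correlated joint distribution p(x, y) on a 2x2 grid.
p_xy = [[0.4, 0.1],
        [0.1, 0.4]]

p_x = [sum(row) for row in p_xy]                       # p(x) = sum_y p(x, y)
p_y = [sum(p_xy[i][j] for i in range(2)) for j in range(2)]

H_xy = -sum(p * math.log2(p) for row in p_xy for p in row if p > 0)
H_x = -sum(p * math.log2(p) for p in p_x)
H_y = -sum(p * math.log2(p) for p in p_y)

# Conditional entropies: H_x(y) = -sum p(x,y) log [ p(x,y) / p(x) ], etc.
H_x_of_y = -sum(p_xy[i][j] * math.log2(p_xy[i][j] / p_x[i])
                for i in range(2) for j in range(2))
H_y_of_x = -sum(p_xy[i][j] * math.log2(p_xy[i][j] / p_y[j])
                for i in range(2) for j in range(2))

# Property 4: H(x,y) = H(x) + H_x(y) = H(y) + H_y(x), and H_x(y) <= H(y).
assert abs(H_xy - (H_x + H_x_of_y)) < 1e-12
assert abs(H_xy - (H_y + H_y_of_x)) < 1e-12
assert H_x_of_y <= H_y
# Property 2: H(x,y) <= H(x) + H(y); strict here since x and y are correlated.
assert H_xy < H_x + H_y
```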

5. Let $p(x)$ be a one-dimensional distribution. The form of $p(x)$ giving a maximum entropy subject to the condition that the standard deviation of $x$ be fixed at $\sigma$ is Gaussian. To show this we must maximize

$$H(x) = -\int p(x) \log p(x)\, dx$$

with

$$\sigma^2 = \int p(x)\, x^2\, dx \quad \text{and} \quad 1 = \int p(x)\, dx$$

as constraints. This requires, by the calculus of variations, maximizing

$$\int \bigl[ -p(x) \log p(x) + \lambda p(x) x^2 + \mu p(x) \bigr]\, dx.$$

The condition for this is

$$-1 - \log p(x) + \lambda x^2 + \mu = 0$$

and consequently (adjusting the constants to satisfy the constraints)

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2}.$$
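As an illustrative check of this maximum property (in natural logarithms, chosen here so the constants come out simply): a uniform density with the same standard deviation $\sigma$ has entropy $\log 2\sqrt{3}\,\sigma$, strictly less than the Gaussian's $\log \sqrt{2\pi e}\,\sigma$.

```python
import math

sigma = 1.0

# Entropy (nats) of the maximizing Gaussian: log sqrt(2 pi e) sigma.
h_gauss = math.log(math.sqrt(2 * math.pi * math.e) * sigma)

# A competing density with the same standard deviation: uniform on
# [-L, L] with L = sigma * sqrt(3), whose entropy is log 2L.
h_uniform = math.log(2 * sigma * math.sqrt(3))

print(h_gauss, h_uniform)
assert h_gauss > h_uniform  # the Gaussian maximizes entropy at fixed sigma
```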

Similarly in $n$ dimensions, suppose the second order moments of $p(x_1, \ldots, x_n)$ are fixed at $A_{ij}$:

$$A_{ij} = \int \cdots \int x_i x_j\, p(x_1, \ldots, x_n)\, dx_1 \cdots dx_n.$$

Then the maximum entropy occurs (by a similar calculation) when $p(x_1, \ldots, x_n)$ is the $n$ dimensional Gaussian distribution with the second order moments $A_{ij}$.


6. The entropy of a one-dimensional Gaussian distribution whose standard deviation is $\sigma$ is given by

$$H(x) = \log \sqrt{2\pi e}\,\sigma.$$

This is calculated as follows:

$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2}$$

$$-\log p(x) = \log \sqrt{2\pi}\,\sigma + \frac{x^2}{2\sigma^2}$$

$$\begin{aligned}
H(x) &= -\int p(x) \log p(x)\, dx \\
&= \int p(x) \log \sqrt{2\pi}\,\sigma\, dx + \int p(x)\, \frac{x^2}{2\sigma^2}\, dx \\
&= \log \sqrt{2\pi}\,\sigma + \frac{\sigma^2}{2\sigma^2} \\
&= \log \sqrt{2\pi}\,\sigma + \log \sqrt{e} \\
&= \log \sqrt{2\pi e}\,\sigma.
\end{aligned}$$
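The closed form $\log \sqrt{2\pi e}\,\sigma$ can also be verified by carrying out $-\int p \log p\, dx$ numerically. A sketch in natural logarithms (the quadrature routine is an illustration, not part of the derivation):

```python
import math

def gaussian(x, sigma):
    """p(x) = (1 / (sqrt(2 pi) sigma)) exp(-x^2 / 2 sigma^2)."""
    return math.exp(-x * x / (2 * sigma * sigma)) / (math.sqrt(2 * math.pi) * sigma)

def entropy_nats(p, a, b, n=200_000):
    """Midpoint Riemann-sum approximation of -int p log p dx (natural log)."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        px = p(x)
        if px > 0:
            total -= px * math.log(px) * dx
    return total

sigma = 2.0
h_numeric = entropy_nats(lambda x: gaussian(x, sigma), -10 * sigma, 10 * sigma)
h_closed = math.log(math.sqrt(2 * math.pi * math.e) * sigma)
print(h_numeric, h_closed)
assert abs(h_numeric - h_closed) < 1e-4
```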

Similarly the $n$ dimensional Gaussian distribution with associated quadratic form $a_{ij}$ is given by

$$p(x_1, \ldots, x_n) = \frac{|a_{ij}|^{1/2}}{(2\pi)^{n/2}} \exp\Bigl( -\tfrac{1}{2} \sum a_{ij} x_i x_j \Bigr)$$

and the entropy can be calculated as

$$H = \log\, (2\pi e)^{n/2}\, |a_{ij}|^{-\frac{1}{2}}$$

where $|a_{ij}|$ is the determinant whose elements are $a_{ij}$.

7. If $x$ is limited to a half line ($p(x) = 0$ for $x \le 0$) and the first moment of $x$ is fixed at $a$:

$$a = \int_0^\infty p(x)\, x\, dx,$$

then the maximum entropy occurs when

$$p(x) = \frac{1}{a}\, e^{-x/a}$$

and is equal to $\log ea$.
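Likewise, the value $\log ea = 1 + \log a$ for this exponential density can be checked by numerical integration (a sketch in natural logarithms, illustrative only):

```python
import math

def exp_density(x, a):
    """Maximum-entropy density on the half line with first moment a: (1/a) e^{-x/a}."""
    return math.exp(-x / a) / a

def entropy_nats(p, a, b, n=200_000):
    """Midpoint Riemann-sum approximation of -int p log p dx (natural log)."""
    dx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx
        px = p(x)
        if px > 0:
            total -= px * math.log(px) * dx
    return total

mean = 3.0
h_numeric = entropy_nats(lambda x: exp_density(x, mean), 0.0, 60 * mean)
h_closed = math.log(math.e * mean)  # log(ea) = 1 + log a
print(h_numeric, h_closed)
assert abs(h_numeric - h_closed) < 1e-3
```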

8. There is one important difference between the continuous and discrete entropies. In the discrete case
the entropy measures in an absolute way the randomness of the chance variable. In the continuous
case the measurement is relative to the coordinate system. If we change coordinates the entropy will
in general change. In fact if we change to coordinates $y_1, \ldots, y_n$ the new entropy is given by

$$H(y) = -\int \cdots \int p(x_1, \ldots, x_n)\, J\!\Bigl(\frac{x}{y}\Bigr) \log \Bigl[ p(x_1, \ldots, x_n)\, J\!\Bigl(\frac{x}{y}\Bigr) \Bigr]\, dy_1 \cdots dy_n$$

where $J\bigl(\frac{x}{y}\bigr)$ is the Jacobian of the coordinate transformation. On expanding the logarithm and changing the variables to $x_1, \ldots, x_n$, we obtain:

$$H(y) = H(x) - \int \cdots \int p(x_1, \ldots, x_n) \log J\!\Bigl(\frac{x}{y}\Bigr)\, dx_1 \cdots dx_n.$$


Thus the new entropy is the old entropy less the expected logarithm of the Jacobian. In the continuous case the entropy can be considered a measure of randomness relative to an assumed standard, namely the coordinate system chosen with each small volume element $dx_1 \cdots dx_n$ given equal weight. When we change the coordinate system the entropy in the new system measures the randomness when equal volume elements $dy_1 \cdots dy_n$ in the new system are given equal weight.

In spite of this dependence on the coordinate system the entropy concept is as important in the continuous case as the discrete case. This is due to the fact that the derived concepts of information rate and channel capacity depend on the difference of two entropies and this difference does not depend on the coordinate frame, each of the two terms being changed by the same amount.

The entropy of a continuous distribution can be negative. The scale of measurements sets an arbitrary zero corresponding to a uniform distribution over a unit volume. A distribution which is more confined than this has less entropy and it will be negative. The rates and capacities will, however, always be non-negative.

9. A particular case of changing coordinates is the linear transformation

$$y_j = \sum_i a_{ij} x_i.$$

In this case the Jacobian is simply the determinant $|a_{ij}|^{-1}$ and

$$H(y) = H(x) + \log |a_{ij}|.$$

In the case of a rotation of coordinates (or any measure preserving transformation) $J = 1$ and $H(y) = H(x)$.
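This shift by $\log |a_{ij}|$ can be illustrated with Gaussian distributions, whose entropy has the closed form $\tfrac{1}{2} \log\bigl((2\pi e)^n |\Sigma|\bigr)$ in nats, with $\Sigma$ the covariance matrix (the $n$-dimensional counterpart of item 6). The sketch below assumes NumPy is available; the particular matrices are arbitrary examples.

```python
import math
import numpy as np

def gaussian_entropy(cov):
    """Entropy (nats) of an n-dim Gaussian: (1/2) log((2 pi e)^n det cov)."""
    n = cov.shape[0]
    return 0.5 * math.log((2 * math.pi * math.e) ** n * np.linalg.det(cov))

cov = np.array([[2.0, 0.5],
                [0.5, 1.0]])           # covariance of x
A = np.array([[3.0, 1.0],
              [0.0, 2.0]])             # linear transformation y = A x

h_x = gaussian_entropy(cov)
h_y = gaussian_entropy(A @ cov @ A.T)  # covariance of y is A cov A^T

# H(y) = H(x) + log |det A|
print(h_y - h_x, math.log(abs(np.linalg.det(A))))
assert abs((h_y - h_x) - math.log(abs(np.linalg.det(A)))) < 1e-9
```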
