
Introduction to Information Theory

Information theory deals with the problem of efficient and reliable transmission of information. It specifically encompasses theoretical and applied aspects of
- coding, communications and communications networks
- complexity and cryptography
- detection and estimation
- learning, Shannon theory, and stochastic processes

Some of the successes of Information Theory


- Satellite communications: Reed-Solomon codes (also used in CD players), Viterbi algorithm
- Public-key cryptosystems
- Compression algorithms: Huffman, Lempel-Ziv, MP3, JPEG, MPEG
- Modem design with coded modulation
- Codes for recording (CD, DVD)

Information is knowledge that can be used; that is, data is not necessarily information.

We:
1) specify a set of messages of interest to a receiver
2) select a message to be transmitted
3) sender and receiver build a pair

INFORMATION AND INFORMATION THEORY


Information in its most restricted technical sense is an ordered sequence of symbols that records or transmits a message. It can be recorded as signs, or conveyed as signals by waves. Information theory is a branch of applied mathematics and electrical engineering involving the quantification of information. It deals with the analysis of communications systems.

The higher the likelihood of a particular outcome, the less information that outcome conveys. Consider tossing a coin. If the coin is biased such that it lands heads up 99% of the time, not much information is conveyed when we flip the coin and it lands on heads.

The information conveyed by an outcome $X_j$ with probability $P_j = P(X_j)$ is

$$I(X_j) = \log \frac{1}{P(X_j)} = \log \frac{1}{P_j} = -\log P_j$$

1. $\log_2$: units are bits (from 'binary')
2. $\log_3$: units are trits (from 'trinary')
3. $\log_e$: units are nats (from 'natural logarithm')
4. $\log_{10}$: units are Hartleys
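The definition and the units above can be tried out numerically. The following is a minimal Python sketch, not part of the original slides: the function name `self_information` and the 99%/1% coin are illustrative assumptions based on the biased-coin example in the text.

```python
import math

def self_information(p, base=2.0):
    """Return I = log_base(1/p) for an outcome with probability p, 0 < p <= 1.

    `self_information` is a hypothetical helper name used for illustration.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return math.log(1.0 / p, base)

# Biased coin from the text: heads with probability 0.99, tails with 0.01.
heads_bits = self_information(0.99)   # very likely outcome: about 0.0145 bits
tails_bits = self_information(0.01)   # rare outcome: about 6.64 bits
fair_bits = self_information(0.5)     # fair coin flip: 1 bit

# The same rare outcome measured in other units (only the log base changes).
tails_nats = self_information(0.01, math.e)
tails_hartleys = self_information(0.01, 10)

print(f"heads: {heads_bits:.4f} bits, tails: {tails_bits:.4f} bits")
print(f"tails: {tails_nats:.4f} nats, {tails_hartleys:.4f} Hartleys")
```

The likely outcome (heads) carries almost no information, while the rare outcome (tails) carries several bits, which matches the statement that more probable outcomes convey less information.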

ENTROPY
Entropy is a measure of the uncertainty associated with a random variable.
The term usually refers to the Shannon entropy, which quantifies the expected value of the information contained in a message, usually in units such as bits. The concept was introduced by Claude E. Shannon in his 1948 paper A Mathematical Theory of Communication. Shannon's entropy represents an absolute limit on the best possible lossless compression of any communication.

Entropy is a measure of the average uncertainty of information. The entropy of a message is thus a measure of how much information it really contains.
Here H is the entropy, p is the probability, and X is a random variable with a discrete set of possible outcomes $(x_0, x_1, x_2, \ldots, x_{n-1})$, where n is the total number of possibilities.

$$H(X) = -\sum_{j=0}^{n-1} p(x_j) \log p(x_j) = \sum_{j=0}^{n-1} p(x_j) \log \frac{1}{p(x_j)}$$

The entropy of X can also be interpreted as the expected value of $\log \frac{1}{p(X)}$, where X is drawn according to the probability mass function p(x). Thus

$$H(X) = E_p\left[\log \frac{1}{p(X)}\right]$$
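As a rough illustration of the entropy sum, the sketch below computes H(X) for a few small distributions. The function name `entropy` and the example distributions are assumptions made for this illustration. Skipping outcomes with zero probability follows the common convention 0 log 0 = 0, which the slides do not state.

```python
import math

def entropy(probs, base=2.0):
    """Shannon entropy H(X) = sum_j p_j * log(1/p_j) of a discrete distribution.

    `probs` is a list of outcome probabilities that should sum to 1.
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with p = 0 are skipped (assumed convention: 0 log 0 = 0).
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

fair_coin = entropy([0.5, 0.5])       # 1 bit
biased_coin = entropy([0.99, 0.01])   # about 0.08 bits
fair_die = entropy([1 / 6] * 6)       # log2(6), about 2.585 bits

print(f"fair coin:   {fair_coin:.4f} bits")
print(f"biased coin: {biased_coin:.4f} bits")
print(f"fair die:    {fair_die:.4f} bits")
```

The heavily biased coin has far lower entropy than the fair coin: on average, each flip tells us much less.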

Properties of Entropy
1. Entropy is never negative: $H(X) \ge 0$, since $0 \le p(x) \le 1$ for all x, so $\log \frac{1}{p(x)} \ge 0$.
2. We can change bases freely: $H_b(X) = (\log_b a)\, H_a(X)$, since $\log_b p = \log_b a \cdot \log_a p$.
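Both properties can be checked numerically. The sketch below is only an illustration: the distribution `probs` is a made-up example, and the base change is checked for the particular case b = e, a = 2 (nats versus bits).

```python
import math

def entropy(probs, base):
    """Entropy of a discrete distribution in the given logarithm base."""
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)

probs = [0.5, 0.25, 0.125, 0.125]  # hypothetical example distribution

h_bits = entropy(probs, 2)        # H_2(X), in bits
h_nats = entropy(probs, math.e)   # H_e(X), in nats

# Property 1: entropy is never negative.
is_non_negative = h_bits >= 0

# Property 2 with b = e and a = 2: H_e(X) = (log_e 2) * H_2(X).
converted = math.log(2) * h_bits

print(f"H_2(X) = {h_bits:.4f} bits, H_e(X) = {h_nats:.4f} nats")
print(f"(ln 2) * H_2(X) = {converted:.4f} nats")
```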

Q1. Find the entropy of X.

Q2. Find the entropy of X.


JOINT ENTROPY


CONDITIONAL ENTROPY
The conditional entropy H(Y|X) is

$$H(Y|X) = \sum_{x} p(x)\, H(Y|X=x) = -\sum_{x} \sum_{y} p(x,y) \log p(y|x)$$


CHAIN RULE
The entropy of a pair of random variables is the entropy of one plus the conditional entropy of the other:

$$H(X,Y) = H(X) + H(Y|X)$$

$$H(X,Y) = H(Y) + H(X|Y)$$

Also, equating the two expressions gives

$$H(X) - H(X|Y) = H(Y) - H(Y|X)$$
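The chain rule can be checked on a small example. The sketch below is not from the slides. It assumes the standard definitions: joint entropy as the entropy of the joint distribution p(x, y), and conditional entropy H(Y|X) as the sum of p(x, y) log(p(x) / p(x, y)). The joint distribution `joint` and the helper names are made up for illustration.

```python
import math
from collections import defaultdict

def H(probs):
    """Entropy in bits of an iterable of probabilities."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Hypothetical joint distribution p(x, y) with X in {0, 1} and Y in {0, 1}.
joint = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

# Marginal distributions p(x) and p(y), obtained by summing out the other variable.
px, py = defaultdict(float), defaultdict(float)
for (x, y), p in joint.items():
    px[x] += p
    py[y] += p

def conditional_entropy(joint, marginal, given_index):
    """H(other | given) = sum over pairs of p(x, y) * log2(p(given) / p(x, y)).

    given_index is 0 to condition on X and 1 to condition on Y.
    """
    return sum(p * math.log2(marginal[pair[given_index]] / p)
               for pair, p in joint.items() if p > 0)

h_xy = H(joint.values())                         # H(X,Y)
h_x, h_y = H(px.values()), H(py.values())        # H(X), H(Y)
h_y_given_x = conditional_entropy(joint, px, 0)  # H(Y|X)
h_x_given_y = conditional_entropy(joint, py, 1)  # H(X|Y)

print(f"H(X,Y)        = {h_xy:.4f}")
print(f"H(X) + H(Y|X) = {h_x + h_y_given_x:.4f}")
print(f"H(Y) + H(X|Y) = {h_y + h_x_given_y:.4f}")
print(f"H(X) - H(X|Y) = {h_x - h_x_given_y:.4f}")
print(f"H(Y) - H(Y|X) = {h_y - h_y_given_x:.4f}")
```

For this distribution the first three printed values agree, and so do the last two, as the chain rule predicts.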