Jörgen Ahlberg, Div. of Sensor Technology, Swedish Defence Research Agency (FOI)
Today
1. What is information theory about?
2. Stochastic (information) sources.
3. Information and entropy.
4. Entropy for stochastic sources.
5. The source coding theorem.
Shannon's 1948 paper, A Mathematical Theory of Communication, is sometimes referred to as Shannon-Weaver, since the standalone publication has a foreword by Weaver. Be careful!
[Figure: block diagram of the communication system: Source → Source coder → Channel coder → Channel → Channel decoder → Source decoder → Sink (receiver).]
- Source: any source of information.
- Source coder: change to an efficient representation, i.e., data compression.
- Channel coder: change to an efficient representation for transmission, i.e., error control coding.
- Channel: anything transmitting or storing information: a radio link, a cable, a disk, a CD, a piece of paper, ...
- Channel decoder: recover from channel distortion.
- Source decoder: uncompress.
Fundamental Entities
- H: the information content of the source.
- R: the rate from the source coder.
- C: the channel capacity.
Fundamental Theorems
With these entities, Shannon's two fundamental results can be stated:
- The source coding theorem: error-free representation of the source requires R ≥ H.
- The channel coding theorem: error-free transmission over the channel is possible provided R ≤ C, and impossible if R > C.
[Figure: state diagrams of Markov sources; states a, b, c with transition probabilities such as 0.5, 1.0, 0.3, 0.2 and 0.3.]
Self Information
So, let's look at it the way Shannon did. Assume a memoryless source with
alphabet A = (a1, ..., an) and symbol probabilities (p1, ..., pn).
How much information do we get when finding out that the next symbol is ai? According to Shannon, the self-information of ai is
i(ai) = -log pi.
Why?
Assume two independent events A and B, with probabilities P(A) = pA and P(B) = pB. The probability that both events happen is pA·pB, but the amounts of information should add, not multiply.
Logarithms satisfy this: log(pA·pB) = log pA + log pB. Also, we want the information to increase with decreasing probability, so let's use the negative logarithm.
Example 1: A symbol with probability pi = 1/2 gives i(ai) = -log2(1/2) = 1 bit.
Example 2: A symbol with probability pi = 1/4 gives i(ai) = -log2(1/4) = 2 bits.
Which logarithm? Pick the one you like! If you pick the natural log, you'll measure in nats; if you pick the 10-log, you'll get Hartleys; if you pick the 2-log (like everyone else), you'll get bits.
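To make the units concrete, here is a minimal Python sketch (the function name self_information is mine, for illustration only) evaluating i(a) = -log p in the three bases:

```python
import math

def self_information(p: float, base: float = 2.0) -> float:
    """Self-information of a symbol with probability p, in the given base."""
    return -math.log(p, base)

p = 0.5
print(self_information(p, 2))        # 1.0 bit
print(self_information(p, math.e))   # ~0.693 nats
print(self_information(p, 10))       # ~0.301 Hartleys
```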
On average over all the symbols, we get
H = Σ pi · i(ai) = -Σ pi log pi.
This can be regarded as the degree of uncertainty about the following symbol.
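As a quick sketch of this average (the function name entropy is mine; zero-probability symbols are skipped since they contribute nothing to the sum):

```python
import math

def entropy(probs, base: float = 2.0) -> float:
    """H = -sum of p_i * log(p_i) over all symbols with p_i > 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A four-symbol source: H = 0.5*1 + 0.25*2 + 2*0.125*3 = 1.75 bits/symbol.
print(entropy([0.5, 0.25, 0.125, 0.125]))
```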
Entropy
Example: Binary Memoryless Source
A BMS emits a stream of binary symbols, e.g., 01101000...
Let P(1) = p and P(0) = 1 - p.
Then H = -p log2 p - (1-p) log2(1-p).
[Figure: plot of H versus p; H rises from 0 at p = 0 to a maximum of 1 bit at p = 0.5, and falls back to 0 at p = 1.]
The uncertainty (information) is greatest when p = 0.5.
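A small sketch tabulating this binary entropy function at a few points (the function name is mine):

```python
import math

def binary_entropy(p: float) -> float:
    """H = -p log2 p - (1-p) log2 (1-p); zero at p = 0 and p = 1."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.1, 0.3, 0.5, 0.9):
    print(f"p = {p}: H = {binary_entropy(p):.3f} bits")  # peak at p = 0.5
```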
For a source with memory, the entropy is obtained from the block probabilities:
H = lim n→∞ (1/n) · (-Σ P(x1, ..., xn) log P(x1, ..., xn)).
That is, the summation is done over all possible combinations of n symbols.
Averaging over all states, we get the entropy for the Markov source as
HM = Σk wk Hk,
where wk is the stationary probability of state k and Hk is the entropy of the transitions out of state k.
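A sketch with an assumed two-state transition matrix (the numbers in P are made up for illustration): the stationary distribution is found by power iteration, and HM is the weighted average of the per-state entropies.

```python
import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical transition matrix: P[i][j] = P(next state is j | current state is i).
P = [[0.9, 0.1],
     [0.4, 0.6]]

# Stationary distribution w = w·P, via power iteration.
w = [0.5, 0.5]
for _ in range(1000):
    w = [sum(w[i] * P[i][j] for i in range(2)) for j in range(2)]

H_k = [entropy(row) for row in P]           # entropy of each state's transitions
H_M = sum(w[k] * H_k[k] for k in range(2))  # averaged over the stationary distribution
print(f"w = {w}, H_M = {H_M:.4f} bits/symbol")
```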
[Figure: two-state Markov source modelling runs, with exit probability γ and self-transition probability 1-γ.]
Probability for a burst of length r: P(r) = γ(1-γ)^(r-1).
Entropy: HR = -Σ r=1..∞ P(r) log P(r).
If the average run length is r̄, then HR / r̄ = HM.
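A numerical check of the relation HR / r̄ = HM for this two-state source (γ = 0.2 is an assumed value; since both states here have the same exit probability γ, HM reduces to the binary entropy of γ):

```python
import math

gamma = 0.2  # exit probability (assumed value)

# Run-length distribution P(r) = gamma * (1-gamma)^(r-1), truncated
# where the remaining terms are negligible.
H_R = 0.0    # entropy of the run-length distribution
r_bar = 0.0  # average run length, should come out as 1/gamma
for r in range(1, 10_000):
    P_r = gamma * (1 - gamma) ** (r - 1)
    H_R -= P_r * math.log2(P_r)
    r_bar += r * P_r

# Per-symbol entropy of the underlying two-state Markov source.
H_M = -gamma * math.log2(gamma) - (1 - gamma) * math.log2(1 - gamma)

print(f"H_R / r_bar = {H_R / r_bar:.6f}")  # matches H_M below
print(f"H_M         = {H_M:.6f}")
```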
The Source Coding Theorem
The entropy is the smallest number of bits per symbol allowing error-free representation of the source.
Typical Sequences
Assume a long sequence from a binary memoryless source with P(1) = p. Among n bits, there will be approximately w = n·p ones. Thus, there are about M = (n over w) such typical sequences! Only these sequences are interesting; all other sequences appear with vanishing probability as n grows.
Enumerating the typical sequences therefore takes log2 M bits per sequence, i.e., (1/n) log2 (n over w) ≈ H(X) bits/symbol for large n.
Thus, we need H(X) bits per symbol to code any typical sequence!
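This convergence is easy to check numerically (math.comb is the standard-library binomial coefficient; p = 0.3 is an arbitrary choice):

```python
import math

def binary_entropy(p: float) -> float:
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.3
for n in (100, 1_000, 10_000):
    w = round(n * p)                                  # typical number of ones
    bits_per_symbol = math.log2(math.comb(n, w)) / n  # (1/n) log2 (n over w)
    print(f"n = {n:6d}: {bits_per_symbol:.4f} vs H(X) = {binary_entropy(p):.4f}")
```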
Summary
- The mathematical model of communication: source, source coder, channel coder, channel, channel decoder, source decoder, sink.
- Fundamental entities: rate R, entropy H, channel capacity C.
- Source models: BMS, Markov, run-length (RL).