
Information theory in communication systems

References:
Communication Systems by Simon Haykin
Chapters 9 and 10

Information theory in communication systems
Communication systems are designed to transmit information. In any
communication system there exists an information source that produces the
information, and the purpose of the communication system is to transmit the output
of the source to the destination.
In this presentation, we investigate the mathematical modeling of information sources
and provide a measure of information. Then we will see how the output of an
information source can be made more compact and, therefore, easier to transmit or
store.

Contents
• Fundamental elements of communication systems
• Information theory
• Source coding theorem
• Information transmission through channel
• Noise and Modeling of noisy channel
• Performance parameter: S/N ratio
• Channel coding theorem
Elements of an Electrical Communication System
Figure 1.2 illustrates the functional diagram and the basic elements of a digital communication
system. The source output may be either an analog signal, such as an audio or video signal, or a
digital signal, such as the output of a computer, which is discrete in time and has a finite
number of output characters.

In a digital communication system, the messages produced by the source are usually converted
into a sequence of binary digits. Ideally, we would like to represent the source output
(message) by as few binary digits as possible. In other words, we seek an efficient
representation of the source output that results in little or no redundancy. The process of
efficiently converting the output of either an analog or a digital source into a sequence of
binary digits is called source encoding or data compression.

The sequence of binary digits from the source encoder, which we call the information
sequence, is passed to the channel encoder. The purpose of the channel encoder is to introduce,
in a controlled manner, some redundancy in the binary information sequence which can be
used at the receiver to overcome the effects of noise and interference encountered in the
transmission of the signal through the channel.

The binary sequence at the output of the channel encoder is passed to the digital modulator,
which serves as the interface to the communications channel.
Elements of an Electrical Communication System
At the receiving end of a digital communications system, the digital demodulator processes
the channel-corrupted transmitted waveform and reduces each waveform to a single number
that represents an estimate of the transmitted data symbol (binary or M-ary).

As a final step, when an analog output is desired, the source decoder accepts the output
sequence from the channel decoder and, from knowledge of the source encoding method used,
attempts to reconstruct the original signal from the source. Due to channel-decoding errors and
possible distortion introduced by the source encoder and, perhaps, the source decoder, the
signal at the output of the source decoder is an approximation to the original source output.
The difference or some function of the difference between the original signal and the
reconstructed signal is a measure of the distortion introduced by the digital communications
system.

Information Theory
 The intuitive and common notion of information refers to any new knowledge about
something. One can obtain information via hearing, seeing, or other means of perception. The
information source, therefore, produces outputs which are of interest to the receiver of
information, who does not know these outputs in advance.

 The communication-system designer designs a system that transmits the output of a random
process (information source) to a destination via a random medium (channel) and ensures low
distortion.

 Information sources can be modeled by random processes, and the properties of the random
process depend on the nature of the information source. For example, when modeling speech
signals, the resulting random process has all its power in a frequency band of approximately 300–
4000 Hz. Therefore, the power-spectral density of the speech signal also occupies this band of
frequencies.

Typical power spectrum of speech signal.

Information Theory
 Band-limited information signals can be sampled at or above the Nyquist rate and
reconstructed from the sampled values. Therefore, it makes sense to confine ourselves to discrete-
time random processes in this presentation, because all information sources of interest can be
modeled by such processes.

 The mathematical model for an information source is shown in the figure below. Here the source
is modeled by a discrete-time random process {Xi}. The alphabet over which the random
variables Xi are defined can be either discrete (in transmission of binary data, for instance) or
continuous (e.g., sampled speech). The statistical properties of the discrete-time random process
depend on the nature of the information source.

 The simplest model for the information source that we study is the discrete memoryless source
(DMS). A DMS is a discrete-time, discrete-amplitude random process in which all Xi ’s are
generated independently and with the same distribution.

Information Theory

 In order to give a quantitative measure of information, we will start with the basic
model of an information source and try to define the information content of the source
in such a way that certain intuitive properties are satisfied. A rational measure of
information for an output of an information source should be a decreasing function of the
probability of that output. A second intuitive property of a measure of information is that
a small change in the probability of a certain output should not change the information
delivered by that output by a large amount. In other words, the information measure
should be a decreasing and continuous function of the probability of the source output.

 The information revealed by each source output ai is defined as the self-
information of that output, given by −log(pi). We can then define the information content of
the source as the weighted average of the self-information of all source outputs. This is
justified by the fact that various source outputs appear with their corresponding
probabilities. Therefore, the information revealed by an unidentified source output is the
weighted average of the self-information of the various source outputs. The information
content of the information source is known as the entropy of the source and is denoted by
H(X).

H(X) = −Σ pi log2(pi)     (9.9)
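As a quick illustration (not from the slides), the entropy of a DMS follows directly from Eq. (9.9); the four-symbol alphabet and its probabilities below are made-up values.

```python
import math

def entropy(probabilities):
    """Entropy H(X) = -sum(p_i * log2(p_i)) of a discrete memoryless source, in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical four-symbol DMS; the probabilities must sum to 1.
p = [0.5, 0.25, 0.125, 0.125]
print(entropy(p))   # 1.75 bits per source output
```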
Source Coding Theorem

Shannon's first theorem (the source-coding theorem): given a discrete memoryless source of
entropy H(X), the average codeword length L̄ of any distortionless source-encoding scheme is
bounded as L̄ ≥ H(X).
Source Coding Theorem
In the preceding section, we observed that H, the entropy of a source, gives a sharp bound
on the rate at which a source can be compressed for reliable reconstruction.
This means that at rates above entropy it is possible to design a code with an error
probability as small as desired, whereas at rates below entropy such a code does not exist.
This important result however does not provide specific algorithms to design codes
approaching this bound. Algorithms designed to generate codes that perform very close to
the entropy bound are –
 Prefix coding algorithm
 Huffman coding algorithm
 Lempel-Ziv coding algorithm.

Source Coding Theorem
Huffman coding algorithm

1. Sort source outputs in decreasing order of their probabilities.
2. Merge the two least-probable outputs into a
single output whose probability is
the sum of the corresponding probabilities.
3. If the number of remaining outputs is 2,
then go to the next step, otherwise go to
step 1.
4. Arbitrarily assign 0 and 1 as code words for
the two remaining outputs.
5. If an output is the result of the merger of
two outputs in a preceding step, append
a 0 and a 1 to the current code word to
obtain the code words for the two
preceding outputs, and then repeat step 5.
If no output is preceded by another output
in a preceding step, then stop.

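The merge-and-assign procedure above can be sketched in code. This is a minimal illustration rather than the textbook's program: the symbol probabilities are made-up, and a heap replaces the explicit re-sorting of step 1.

```python
import heapq
import itertools

def huffman_code(probabilities):
    """Build a binary Huffman code for symbols 0..n-1 with the given probabilities."""
    counter = itertools.count()   # unique tie-breaker so the heap never compares dicts
    heap = [(p, next(counter), {symbol: ""}) for symbol, p in enumerate(probabilities)]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least-probable outputs; the bit chosen at each merge becomes
        # the leading bit of every code word in the merged group.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in group0.items()}
        merged.update({sym: "1" + code for sym, code in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(counter), merged))
    return heap[0][2]

# Hypothetical five-output source (probabilities are made-up and must sum to 1).
probs = [0.4, 0.2, 0.2, 0.1, 0.1]
code = huffman_code(probs)
avg_len = sum(probs[sym] * len(word) for sym, word in code.items())
print(code)
print("average code word length:", avg_len)   # 2.2 bits, close to the entropy (~2.12 bits)
```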
Communication Channel
In the design of communication systems for transmitting information through physical
channels, we find it convenient to construct mathematical models that reflect the most
important characteristics of the transmission medium.

The Additive Noise Channel:


The simplest mathematical model for a communication channel is the additive noise
channel, illustrated in Figure 1.8. In this model the transmitted signal s(t) is corrupted by an
additive random noise process n(t). Physically, the additive noise process may arise from
electronic components and amplifiers at the receiver of the communication system, or from
interference encountered in transmission, as in the case of radio signal transmission.
If the noise is introduced primarily by electronic components and amplifiers at the receiver, it
may be characterized as thermal noise. This type of noise is characterized statistically as a
Gaussian noise process. Hence, the resulting mathematical model for the channel is usually
called the additive Gaussian noise channel. This is the predominant channel model used in
our communication system analysis and design. When the signal undergoes attenuation in
transmission through the channel, the received signal is

r(t) = a s(t) + n(t)

where a represents the attenuation factor.
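A rough discrete-time simulation of the attenuated additive Gaussian noise model r(t) = a s(t) + n(t), assuming numpy; the attenuation factor, noise power, and test signal are made-up values.

```python
import numpy as np

rng = np.random.default_rng(0)

def awgn_channel(s, attenuation=0.5, noise_power=0.01):
    """Attenuated additive Gaussian noise channel: r = a*s + n, with n ~ N(0, noise_power)."""
    n = rng.normal(0.0, np.sqrt(noise_power), size=s.shape)
    return attenuation * s + n

t = np.arange(0.0, 1.0, 1e-3)
s = np.cos(2 * np.pi * 5 * t)   # example transmitted signal s(t)
r = awgn_channel(s)             # channel-corrupted received signal r(t)
```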
Communication Channel
The Linear Filter Channel:
In some physical channels such as wireline telephone channels, filters are used to ensure that
the transmitted signals do not exceed specified bandwidth limitations and, thus, do not
interfere with one another. Such channels are generally characterized mathematically as
linear filter channels with additive noise, as illustrated in Figure 1.9. Hence, if the channel
input is the signal s(t), the channel output is the signal

r(t) = s(t) * h(t) + n(t)

where h(t) is the impulse response of the linear filter and * denotes convolution.
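Similarly, a sketch of the linear filter channel r(t) = s(t) * h(t) + n(t) in discrete time, again assuming numpy; the two-tap impulse response is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(1)

def linear_filter_channel(s, h, noise_power=0.01):
    """Linear filter channel: convolve s(t) with h(t), then add Gaussian noise n(t)."""
    filtered = np.convolve(s, h)[:len(s)]   # s(t) * h(t), truncated to the input length
    return filtered + rng.normal(0.0, np.sqrt(noise_power), size=len(filtered))

s = rng.choice([-1.0, 1.0], size=100)   # example transmitted symbol sequence
h = np.array([1.0, 0.4])                # hypothetical channel impulse response
r = linear_filter_channel(s, h)
```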
Communication Channel
The Linear Time-Variant Filter Channel:
Physical channels such as underwater acoustic channels and ionospheric radio channels
which result in time-variant multipath propagation of the transmitted signal may be
characterized mathematically as time-variant linear filters. Such linear filters are
characterized by time-variant channel impulse response h(τ ; t) where h(τ ; t) is the response
of the channel at time t, due to an impulse applied at time (t − τ). Thus, τ represents the “age”
(elapsed time) variable.
The linear time-variant filter channel with additive noise is illustrated in Figure 1.10. For an
input signal s(t), the channel output signal is

r(t) = s(t) * h(τ; t) + n(t) = ∫ h(τ; t) s(t − τ) dτ + n(t)

The three mathematical models described above adequately characterize a large majority of
physical channels encountered in practice.
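A discrete-time sketch of the time-variant case, where a matrix h[n, k] stands in for h(τ; t); the slowly varying second path is a made-up example.

```python
import numpy as np

def time_variant_channel(s, h):
    """r[n] = sum_k h[n, k] * s[n - k]: h[n, k] plays the role of h(tau; t)."""
    n_samples, n_taps = h.shape
    r = np.zeros(n_samples)
    for n in range(n_samples):
        for k in range(n_taps):
            if n - k >= 0:
                r[n] += h[n, k] * s[n - k]
    return r

s = np.ones(8)                                           # example input samples
h = np.column_stack([np.ones(8),                         # fixed direct path
                     0.5 * np.cos(0.3 * np.arange(8))])  # slowly varying second path
print(time_variant_channel(s, h))   # additive noise n(t) could be added as before
```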
Channel Capacity
Measuring effect of noise:
The signal-to-noise ratio (SNR) at the output of the receiver that demodulates the amplitude-
modulated signal can be defined as

S/N = (average signal power) / (average noise power)

We need to consider the average signal power and the average noise power because these
may change with time. The figure below shows the idea of SNR.
SNR is actually the ratio of what is wanted (signal) to what is not wanted (noise). A high SNR
means the signal is less corrupted by noise; a low SNR means the signal is more corrupted by
noise.
Because SNR is the ratio of two powers, it is often described in decibel units, SNRdB, defined
as

SNRdB = 10 log10(SNR)
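A quick numeric check of the decibel conversion, using made-up signal and noise powers.

```python
import math

signal_power = 2.0e-3   # hypothetical average signal power, in watts
noise_power = 1.0e-6    # hypothetical average noise power, in watts

snr = signal_power / noise_power
snr_db = 10 * math.log10(snr)
print(snr, snr_db)      # 2000.0 and about 33 dB
```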

Channel Capacity
How fast can we transmit information over a communication channel?
Suppose a source sends r messages per second, and the entropy is H bits per message. The
information rate is R = r H bits/second.
One can intuitively reason that, for a given communication system, as the information rate
increases the number of errors per second will also increase. Surprisingly, however, this is not
the case.

Shannon’s theorem:
 A given communication system has a maximum rate of information C known as the
channel capacity.
 If the information rate R is less than C, then it is possible to achieve an arbitrarily small error
probability by using intelligent coding techniques.
 To get lower error probabilities, the encoder has to work on longer blocks of signal data.
This entails longer delays and higher computational requirements.
Thus, if R ≤ C, then transmission may be accomplished without error in the presence of noise.

Unfortunately, Shannon’s theorem is not a constructive proof—it merely states that such a
coding method exists. The proof can therefore not be used to develop a coding method that
reaches the channel capacity.
The converse of this theorem is also true: if R > C, then errors cannot be avoided regardless of
the coding technique used.
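A small numeric illustration of the rate condition R ≤ C, with made-up values for r, H, and C.

```python
r = 1000      # hypothetical source rate, messages per second
H = 1.75      # hypothetical entropy, bits per message
R = r * H     # information rate, bits per second
C = 2000      # hypothetical channel capacity, bits per second

print(R, "bits/s:", "reliable transmission is possible" if R <= C else "errors cannot be avoided")
```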
Channel Capacity
An application of the channel capacity concept to an additive white Gaussian noise (AWGN)
channel with B Hz bandwidth and signal-to-noise ratio S/N is the Shannon–Hartley theorem.

Shannon-Hartley theorem:
The Shannon-Hartley theorem states that the channel capacity is given by C = B log2(1 + S/N),
where C is the capacity in bits per second, B is the bandwidth of the channel in Hertz, and
S/N is the signal-to-noise ratio.

The expression of the channel capacity of the Gaussian channel makes intuitive sense:
As the bandwidth of the channel increases, it is possible to make faster changes in the
information signal, thereby increasing the information rate.
As S/N increases, one can increase the information rate while still preventing errors due to
noise. For no noise, S/N→∞ and an infinite information rate is possible irrespective of
bandwidth.
Thus we may trade off bandwidth for SNR. For example, if S/N = 7 and B = 4 kHz, then the
channel capacity is C = 12 × 10^3 bits/s. If the SNR increases to S/N = 15 and B is decreased to
3 kHz, the channel capacity remains the same.
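The bandwidth/SNR trade-off in the example above can be checked directly from C = B log2(1 + S/N):

```python
import math

def channel_capacity(bandwidth_hz, snr):
    """Shannon-Hartley capacity C = B * log2(1 + S/N), in bits per second."""
    return bandwidth_hz * math.log2(1 + snr)

print(channel_capacity(4000, 7))    # 12000.0 bits/s
print(channel_capacity(3000, 15))   # 12000.0 bits/s: same capacity with less bandwidth
```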
Error in data transmission

The bit error rate or bit error ratio (BER) is the number of bit errors divided by the total
number of transferred bits during a studied time interval. BER is a unitless performance
measure, often expressed as a percentage.

As an example, assume this transmitted bit sequence:

0 1 1 0 0 0 1 0 1 1

and the following received bit sequence:

0 0 1 0 1 0 1 0 0 1

The number of bit errors (the second, fifth, and ninth bits) is in this case 3. The BER is 3 incorrect bits
divided by 10 transferred bits, resulting in a BER of 0.3 or 30%.
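The same BER computation, written out for the two bit sequences above:

```python
transmitted = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1]
received    = [0, 0, 1, 0, 1, 0, 1, 0, 0, 1]

bit_errors = sum(t != r for t, r in zip(transmitted, received))
ber = bit_errors / len(transmitted)
print(bit_errors, ber)   # 3 errors, BER = 0.3 (30%)
```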

Error Control Method for Data Transmission
1. Automatic Repeat reQuest (ARQ)
2. Forward Error Correction (FEC)
Automatic Repeat reQuest (ARQ), also known as Automatic Repeat Query, is an error-
control method for data transmission that uses acknowledgements (messages sent by the
receiver indicating that it has correctly received a data frame or packet) and timeouts (specified
periods of time allowed to elapse before an acknowledgment is to be received) to achieve
reliable data transmission over an unreliable service. If the sender does not receive an
acknowledgment before the timeout, it usually re-transmits the frame/packet until the sender
receives an acknowledgment or exceeds a predefined number of re-transmissions.

The types of ARQ protocols include:
 Stop-and-wait ARQ
 Go-Back-N ARQ
 Selective Repeat ARQ
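A minimal sketch of the stop-and-wait variant. The send_frame and wait_for_ack functions are hypothetical stand-ins for the link layer (here a lost frame simply shows up as a timeout), and the loss probability and retry limit are made-up values.

```python
import random

MAX_RETRANSMISSIONS = 5

def send_frame(frame):
    """Hypothetical unreliable link: the frame is lost 30% of the time."""
    return random.random() > 0.3      # True means the frame reached the receiver

def wait_for_ack(frame_delivered):
    """Hypothetical ACK path: an undelivered frame produces a timeout (False)."""
    return frame_delivered

def stop_and_wait_send(frame):
    """Send one frame, re-transmitting until it is acknowledged or the retry limit is hit."""
    for attempt in range(1 + MAX_RETRANSMISSIONS):
        delivered = send_frame(frame)
        if wait_for_ack(delivered):
            return True                # acknowledgment received
    return False                       # predefined number of re-transmissions exceeded

print(stop_and_wait_send(b"frame 0"))
```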

Forward Error Correction / Channel Coding
The central idea of Forward Error Correction (FEC) is that the sender encodes the message in
a redundant way by using an error-correcting code (ECC). The redundancy allows the
receiver to detect a limited number of errors that may occur anywhere in the message,
and often to correct these errors without retransmission. FEC gives the receiver the
ability to correct errors without needing a reverse channel to request retransmission of
data, but at the cost of a fixed, higher forward channel bandwidth.

A simplistic example of FEC is to transmit each data bit 3 times, which is known as a
(3,1) repetition code. Through a noisy channel, a receiver might see any of the
2^3 = 8 possible three-bit words.
This allows an error in any one of the three
samples to be corrected by "majority vote" or
"democratic voting".
The two main categories of FEC codes
are block codes and convolutional codes.
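A sketch of the (3,1) repetition code described above: each data bit is transmitted three times and decoded by majority vote, so a single error within any triplet is corrected.

```python
def repetition_encode(bits):
    """(3,1) repetition code: transmit each data bit 3 times."""
    return [b for bit in bits for b in (bit, bit, bit)]

def repetition_decode(received):
    """Majority vote over each group of 3 received bits."""
    return [1 if sum(received[i:i + 3]) >= 2 else 0
            for i in range(0, len(received), 3)]

data = [1, 0, 1, 1]
coded = repetition_encode(data)            # [1,1,1, 0,0,0, 1,1,1, 1,1,1]
coded[1] ^= 1                              # flip one bit inside the first triplet
print(repetition_decode(coded) == data)    # True: the single error is corrected
```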
