CHAPTER 2
2.1 Introduction
At the start of Sec. 1.1.2, we indicated that one of the possible ways of classifying signals is: deterministic or random. By random we mean unpredictable; that is, in the case of a random signal, we cannot predict its future value with certainty, even if the entire past history of the signal is known. If the signal is of the deterministic type, no such uncertainty exists. Consider the signal x(t) = A cos(2π f1 t + θ). If A, θ and f1 are known, then (we are assuming them to be constants) we know the value of x(t) for all t. (A, θ and f1 can be calculated by observing the signal over a short period of time.) Now, assume that x(t) is the output of an oscillator with very poor frequency stability and calibration. Though it was set to produce a sinusoid of frequency f = f1, the frequency actually put out may be f1′ where f1′ ∈ (f1 − Δf1, f1 + Δf1).
Even this value may not remain constant and could vary with time. Then, observing the output of such a source over a long period of time would not be of much use in predicting the future values. We say that the source output varies in a random manner.
Another example of a random signal is the voltage at the terminals of a receiving antenna of a radio communication scheme. Even if the transmitted
Indian Institute of Technology Madras
Principles of Communication
(radio) signal is from a highly stable source, the voltage at the terminals of a receiving antenna varies in an unpredictable fashion. This is because the conditions of propagation of the radio waves are not under our control.
But randomness is the essence of communication. Communication theory involves the assumption that the transmitter is connected to a source whose output the receiver is not able to predict with certainty. If the students know ahead of time what the teacher (source + transmitter) is going to say (and what jokes he is going to crack), then there is no need for the students (the receivers) to attend the class!
Although less obvious, it is also true that there is no communication problem unless the transmitted signal is disturbed during propagation or reception by unwanted (random) signals, usually termed as noise and interference. (We shall take up the statistical characterization of noise in Chapter 3.)
However, quite a few random signals, though their exact behavior is unpredictable, do exhibit statistical regularity. Consider again the reception of radio signals propagating through the atmosphere. Though it would be difficult to know the exact value of the voltage at the terminals of the receiving antenna at any given instant, we do find that the average values of the antenna output over two successive one minute intervals do not differ significantly. If the conditions of propagation do not change very much, it would be true of any two averages (over one minute) even if they are well spaced out in time. Consider even a simpler experiment, namely, that of tossing an unbiased coin (by a person without any magical powers). It is true that we do not know in advance whether the outcome on a particular toss would be a head or tail (otherwise, we stop tossing the coin at the start of a cricket match!). But, we know for sure that in a long sequence of tosses, about half of the outcomes would be heads (If this does not happen, we suspect either the coin or tosser (or both!)).
Statistical regularity of averages is an experimentally verifiable phenomenon in many cases involving random quantities. Hence, we are tempted to develop mathematical tools for the analysis and quantitative characterization of random signals. To be able to analyze random signals, we need to understand random variables. The resulting mathematical topics are: probability theory, random variables and random (stochastic) processes. In this chapter, we shall develop the probabilistic characterization of random variables. In Chapter 3, we shall extend these concepts to the characterization of random processes.
2.2.1. Terminology
Def. 2.1: Outcome
The end result of an experiment. For example, if the experiment consists of throwing a die, the outcome would be any one of the six faces, F1, ......, F6.
Def. 2.2: Random experiment
An experiment whose outcomes are not known in advance. (e.g. tossing a coin, throwing a die, measuring the noise voltage at the terminals of a resistor etc.)
Def. 2.3: Random event
A random event is an outcome or set of outcomes of a random experiment that share a common attribute. For example, considering the experiment of throwing a die, an event could be 'the face F1' or 'even indexed faces' (F2, F4, F6). We denote events by upper case letters such as A, B or A1, A2.
Def. 2.4: Sample space

The sample space of a random experiment is a mathematical abstraction used to represent all possible outcomes of the experiment. We denote the sample space by S. Each outcome of the experiment is an element of S and is called a sample point. We use s (with or without a subscript) to denote a sample point. An event on the sample space is represented by an appropriate collection of sample point(s).
Def. 2.5: Mutually exclusive (disjoint) events
Two events A and B are said to be mutually exclusive if they have no common elements (or outcomes). Hence, if A and B are mutually exclusive, they cannot occur together.
Def. 2.6: Union of events
(A
This concept can be generalized to the union of more than two events.
Def. 2.7: Intersection of events
The intersection of two events A and B is the set of all outcomes which belong to A as well as B. The intersection of A and B is denoted by (A ∩ B) or simply (AB). The intersection of A and B is also referred to as a joint event.
An event A of a random experiment is said to have occurred if the experiment terminates in an outcome that belongs to A .
The complement of an event A, denoted by Ā, is the event comprising all the sample points in S but not in A. The null event, denoted ∅, is an event with no sample points; thus ∅ is the complement of S (note that if A and B are disjoint events, then AB = ∅, and vice versa).
Def. 2.11: The relative frequency definition

Suppose that a random experiment is repeated n times. If the event A occurs nA times, then the probability of A, denoted by P(A), is defined as

P(A) = lim (n → ∞) nA/n      (2.1)

For small values of n, it is likely that the ratio nA/n will fluctuate quite badly. But as n becomes larger and larger, we expect nA/n to tend to a definite limiting value. For example, let the event A be 'outcome of a toss is Head'. If n is of the order of 100, nA/n may not deviate from
1/2 by more than, say, ten percent, and as n becomes larger and larger, we expect nA/n to converge to 1/2.
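This limiting behavior is easy to see numerically. Below is a minimal Python sketch; the fair coin model and the sample sizes are illustrative choices, not part of the text:

```python
import random

def relative_frequency(n, seed=1):
    """Toss a fair coin n times and return nA/n for the event 'Head'."""
    rng = random.Random(seed)
    n_A = sum(1 for _ in range(n) if rng.random() < 0.5)
    return n_A / n

# nA/n fluctuates for small n but settles near 1/2 as n grows.
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```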
Def. 2.12: The classical definition:
The relative frequency definition given above has an empirical flavor. In the classical approach, the probability of the event A is found without experimentation. This is done by counting the total number N of the possible outcomes of the experiment. If NA of those outcomes are favorable to the occurrence of the event A, then
P(A) = NA/N      (2.2)
Whatever be the definition of probability, we require the probability measure (assigned to the various events on the sample space) to obey the following postulates or axioms:
P1) P(A) ≥ 0      (2.3a)

P2) P(S) = 1      (2.3b)

P3) If (AB) = ∅, then P(A + B) = P(A) + P(B)      (2.3c)
(Note that in Eq. 2.3(c), the symbol + is used to mean two different things: to denote the union of A and B, and to denote the addition of two real numbers.) Using Eq. 2.3, it is possible for us to derive some additional relationships:

i) If (AB) ≠ ∅, then P(A + B) = P(A) + P(B) − P(AB)      (2.4)
ii) Let A1, A2, ......, An be random events such that:

a) Ai Aj = ∅, for i ≠ j, and      (2.5a)
b) A1 + A2 + ...... + An = S      (2.5b)

Then

P(A) = P(A A1) + P(A A2) + ...... + P(A An)      (2.6)

where A is any event on the sample space.

Note: A1, A2, ......, An are said to be mutually exclusive (Eq. 2.5a) and exhaustive (Eq. 2.5b).

iii) P(Ā) = 1 − P(A)      (2.7)

The derivation of Eq. 2.4, 2.6 and 2.7 is left as an exercise.
Quite often we require the probability of an event B, given that another event A has occurred. In a real world random experiment, it is quite likely that the occurrence of the event B is very much influenced by the occurrence of the event A. To give a simple example, let a bowl contain 3 resistors and 1 capacitor, and let two components be drawn from it, one after the other, without replacement. The occurrence of the event 'capacitor at the second draw' is very much dependent on what has been drawn at the first instant. Such dependencies between events are brought out using the notion of conditional probability. The conditional probability P(B | A) can be written in terms of the joint probability P(AB) and the probability of the event P(A). This relation can be arrived at by using either the relative frequency definition of probability or the classical definition. Using the former, we have
P(AB) = lim (n → ∞) nAB/n

P(A) = lim (n → ∞) nA/n
where nAB is the number of times AB occurs in n repetitions of the experiment. As P ( B | A ) refers to the probability of B occurring, given that A has occurred, we have
P(B | A) = lim (n → ∞) nAB/nA = lim (n → ∞) (nAB/n)/(nA/n)

or

P(B | A) = P(AB)/P(A),  P(A) ≠ 0      (2.8a)

Interchanging the roles of A and B,

P(A | B) = P(AB)/P(B),  P(B) ≠ 0      (2.8b)

Eq. 2.8(a) can be rewritten as

P(AB) = P(B | A) P(A)      (2.9)

In view of Eq. 2.8 and 2.9, we also have

P(B | A) = P(B) P(A | B) / P(A),  P(A) ≠ 0      (2.10a)

Similarly,

P(A | B) = P(A) P(B | A) / P(B),  P(B) ≠ 0      (2.10b)
Eq. 2.9 expresses the probability of the joint event AB in terms of the conditional probability P(B | A) and the (unconditional) probability P(A). Similar relations can be derived for the joint probability of an event involving the intersection of three or more events. For example, P(ABC) can be written as
P(ABC) = P(AB) P(C | AB) = P(A) P(B | A) P(C | AB)      (2.11)
Exercise 2.1
Let A1, A2, ......, An be n mutually exclusive and exhaustive events, and let B be another event defined on the same sample space. Show that

P(Aj | B) = P(B | Aj) P(Aj) / Σ (i = 1 to n) P(B | Ai) P(Ai)      (2.12)
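Eq. 2.12 can be exercised with a small numeric sketch. The priors P(Ai) and likelihoods P(B | Ai) below are made-up illustrative numbers, not from the text:

```python
# Hypothetical priors P(Ai) and likelihoods P(B | Ai) for three
# mutually exclusive, exhaustive events A1, A2, A3.
priors = [0.5, 0.3, 0.2]
likelihoods = [0.1, 0.4, 0.8]          # P(B | Ai)

# Total probability (Eq. 2.6 applied to B): P(B) = sum_i P(B | Ai) P(Ai)
p_B = sum(l * p for l, p in zip(likelihoods, priors))

# Eq. 2.12: P(Aj | B) = P(B | Aj) P(Aj) / P(B)
posteriors = [l * p / p_B for l, p in zip(likelihoods, priors)]

print(p_B)          # ≈ 0.33
print(posteriors)   # the posteriors sum to 1
```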
Another useful probabilistic concept is that of statistical independence. Suppose the events A and B are such that
P(B | A) = P(B)      (2.13)
That is, knowledge of occurrence of A tells us no more about the probability of occurrence of B than we knew without that knowledge. Then, the events A and B are said to be statistically independent. Alternatively, combining Eq. 2.13 with Eq. 2.9, we have
P(AB) = P(A) P(B)      (2.14)
Either Eq. 2.13 or Eq. 2.14 can be used to define the statistical independence of two events. Note that if A and B are independent, then P(A | B) = P(A) as well. The notion of statistical independence can be generalized to the case of more than two events. A set of k events A1, A2, ......, Ak are said to be statistically independent if and only if (iff) the probability of every intersection of k or fewer events equals the product of the probabilities of its constituents. Thus three events A, B, C are independent when
P(AB) = P(A) P(B)
P(AC) = P(A) P(C)
P(BC) = P(B) P(C)

and

P(ABC) = P(A) P(B) P(C)
We shall illustrate some of the concepts introduced above with the help of two examples.
Example 2.1
Priya (P1) and Prasanna (P2), after seeing each other for some time (and after a few tiffs), decide to get married, much against the wishes of the parents on both sides. They agree to meet at the office of the registrar of marriages at 11:30 a.m. on the ensuing Friday (looks like they are not aware of Rahu Kalam, or they don't care about it).
However, both are somewhat lacking in punctuality and their arrival times are equally likely to be anywhere in the interval 11 to 12 hrs on that day. Also arrival of one person is independent of the other. Unfortunately, both are also very short tempered and will wait only 10 min. before leaving in a huff never to meet again.
a) Picture the sample space.

b) Let the event A stand for 'P1 and P2 meet'. Mark this event on the sample space.

c) Find the probability that the lovebirds will get married and (hopefully) will live happily ever after.
a)
Fig. 2.1(a): The sample space of Example 2.1
b)
The diagonal OP represents the simultaneous arrival of Priya and Prasanna. Assuming that P1 arrives at 11:x, the meeting between P1 and P2 would take place if P2 arrives within the interval a to b, as shown in the figure. The event A, indicating the possibility of P1 and P2 meeting, is shown in Fig. 2.1(b).
c) Probability of marriage = (area of the event A)/(total area of the sample space) = 1 − (5/6)² = 11/36.
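The result can be cross-checked by simulation. A Monte Carlo sketch of the meeting problem (arrival times in minutes; the trial count is an arbitrary illustrative choice):

```python
import random

def meeting_probability(n_trials=200_000, wait=10.0, seed=7):
    """Estimate P(the two arrivals differ by at most `wait` minutes),
    with both arrival times uniform over a 60-minute interval."""
    rng = random.Random(seed)
    meet = sum(1 for _ in range(n_trials)
               if abs(rng.uniform(0, 60) - rng.uniform(0, 60)) <= wait)
    return meet / n_trials

print(meeting_probability())   # close to 11/36 ≈ 0.3056
```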
Example 2.2:
Let two honest coins, marked 1 and 2, be tossed together. The four possible outcomes are T1 T2, T1 H2, H1 T2, H1 H2. (T1 indicates toss of coin 1 resulting in tails; similarly T2 etc.) We shall treat all these outcomes as equally likely; that is, the probability of occurrence of any of these four outcomes is 1/4. (Treating each of these outcomes as an event, we find that these events are mutually exclusive and exhaustive.) Let the event A be 'not H1 H2' and B be the event 'match'. (Match comprises the two outcomes T1 T2, H1 H2.) Find
P ( B | A ) . Are A and B independent?
We know that P(B | A) = P(AB)/P(A).

AB is the event 'not H1 H2' and 'match'; i.e., it represents the outcome T1 T2. Hence P(AB) = 1/4. The event A comprises the outcomes T1 T2, T1 H2 and H1 T2; therefore,

P(A) = 3/4

so that P(B | A) = (1/4)/(3/4) = 1/3. As P(B) = 1/2 and P(B | A) ≠ P(B), the events A and B are dependent. The result is intuitively satisfying: given 'not H1 H2', the toss would have resulted in any one of the three other outcomes, which can be treated as equally likely, and only one of them (T1 T2) is a match.
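The same answer can be obtained by brute-force enumeration of the four outcomes; a small sketch using exact fractions:

```python
from fractions import Fraction
from itertools import product

# The four equally likely outcomes of tossing two honest coins.
outcomes = list(product("HT", repeat=2))
p = Fraction(1, 4)

A = [o for o in outcomes if o != ("H", "H")]      # 'not H1 H2'
B = [o for o in outcomes if o[0] == o[1]]         # 'match'
AB = [o for o in A if o in B]                     # joint event

P_A = p * len(A)
P_B = p * len(B)
P_AB = p * len(AB)
P_B_given_A = P_AB / P_A

print(P_B_given_A)         # 1/3
print(P_B_given_A == P_B)  # False: A and B are dependent
```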
2.3 Random Variables

Let us introduce a transformation, say X, that assigns a real number X(si) to each sample point si; that is, X is a function whose domain is the sample space S and whose range is a set of real numbers. This is illustrated in Fig. 2.2.
The figure shows the transformation of a few individual sample points as well as the transformation of the event A , which falls on the real line segment
[a1 , a2 ] .
2.3.1 Distribution function:
Taking a specific case, let the random experiment be that of throwing a die. The six faces of the die can be treated as the six sample points in S; that is, si = Fi, i = 1, 2, ......, 6. If we use the transformation X(si) = i,
then the events on the sample space will become transformed into appropriate segments of the real line. Then we can enquire into the probabilities such as
P{s : X(s) < a1},  P{s : b1 < X(s) ≤ b2}  or  P{s : X(s) = c}
These and other probabilities can be arrived at, provided we know the
Distribution Function of X, denoted by FX(⋅), which is given by

FX(x) = P{s : X(s) ≤ x}      (2.15)
That is, FX(x) is the probability of the event comprising all those sample points which are transformed by X into real numbers less than or equal to x. (Note that, for convenience, we use x as the argument of FX(⋅); FX(⋅) is a function whose domain is the real line and whose range is the interval [0, 1].)
As an example of a distribution function (also called the Cumulative Distribution Function, CDF), let X be a discrete random variable taking the values −1/2, 1/2, 3/2 and 5/2; its staircase CDF is shown in Fig. 2.3.
The distribution function has the following properties:

i) 0 ≤ FX(x) ≤ 1,  −∞ < x < ∞
ii) FX(−∞) = 0
iii) FX(∞) = 1
iv) If a > b, then FX(a) − FX(b) = P{s : b < X(s) ≤ a}
v) If a > b, then FX(a) ≥ FX(b)

The first three properties follow from the fact that FX(⋅) represents the probability of an event and P(S) = 1. Properties iv) and v) follow from the relation

{s : X(s) ≤ b} ∪ {s : b < X(s) ≤ a} = {s : X(s) ≤ a}
Referring to Fig. 2.3, note that FX(x) = 0 for x < −0.5 whereas FX(−0.5) = 1/4. In other words, there is a discontinuity of 1/4 in FX at the point x = −0.5. In general, there is a discontinuity of magnitude Pa in FX at a point x = a if and only if

P{s : X(s) = a} = Pa      (2.16)
Functions for which a distribution function exists are called Random Variables (RV). In other words, for any real x, {s : X(s) ≤ x} should be an event in the sample space with some assigned probability. (The term random variable is somewhat misleading because an RV is a well defined function from S into the real line.) However, not every function from S into the real line need be a random variable. For example, let S consist of six sample points, s1 to s6. The only events that have been identified on the sample space are: A = {s1, s2}, B = {s3, s4, s5} and C = {s6}, and their probabilities are P(A) = 2/6, P(B) = 1/2 and P(C) = 1/6. We see that the probabilities for the various unions and intersections of A, B and C can be obtained. Let the transformation X be X(si) = i. Then the distribution function fails to exist because

P[s : 3.5 < X(s) ≤ 4.5] = P(s4) is not known, as s4 is not an event on the sample space.
We intuitively feel that as ε → 0, the limit of the set {s : a < X(s) ≤ a + ε} is the null set, and this can be proved to be so. Hence,

lim (ε → 0) [FX(a + ε) − FX(a)] = 0,  ε > 0

That is, FX(x) is continuous to the right.
Exercise 2.2
Let S be a sample space with six sample points, s1 to s6. The events identified on S are the same as above, namely, A = {s1, s2}, B = {s3, s4, s5} and C = {s6}, with P(A) = 1/3, P(B) = 1/2 and P(C) = 1/6. Let Y(⋅) be the transformation,

Y(si) = 1,  i = 1, 2
        2,  i = 3, 4, 5
        3,  i = 6

Show that Y(⋅) is a random variable, and find and sketch FY(y).
The probability density function (PDF) of X, denoted fX(x), is the derivative of the distribution function; that is,

fX(x) = d FX(x)/dx      (2.17a)

or

FX(x) = ∫ (−∞ to x) fX(α) dα      (2.17b)
The distribution function FX(x) may fail to have a continuous derivative at a point x = a for one of the following reasons:
i) FX(x) has a discontinuous slope at x = a, or
ii) FX(x) has a step discontinuity at x = a.
The situation is illustrated in Fig. 2.4.
Fig. 2.4: A CDF without a continuous derivative

As can be seen from the figure, FX(x) has a discontinuous slope at
x = 1 and a step discontinuity at x = 2 . In the first case, we resolve the
ambiguity by taking fX to be the derivative on the right. (Note that FX(x) is continuous to the right.) The second case is taken care of by introducing an impulse in the probability domain. That is, if there is a discontinuity in FX at x = a of magnitude Pa, we include an impulse Pa δ(x − a) in the PDF. For
example, for the CDF shown in Fig. 2.3, the PDF will be,
fX(x) = (1/4) δ(x + 1/2) + (1/8) δ(x − 1/2) + (1/8) δ(x − 3/2) + (1/2) δ(x − 5/2)      (2.18)

Some care is needed in interpreting the impulse of weight 1/8 at x = 1/2. It cannot be taken as the symmetric impulse (1/8) δ(x − 1/2) because a symmetric impulse contributes only half its area to an integral terminating at its location; that is,

∫ (−∞ to 1/2) (1/8) δ(x − 1/2) dx = 1/16

However, we require ∫ (−∞ to 1/2) fX(x) dx = 1/4 + 1/8 = 3/8, so that the CDF is reproduced correctly:

FX(x) = 3/8,  1/2 ≤ x < 3/2

That is, the full weight 1/8 must be picked up when the integration runs up to x = 1/2. Such an impulse is referred to as a left-sided delta function.

As FX is non-decreasing and FX(∞) = 1, we have

i) fX(x) ≥ 0      (2.19a)

ii) ∫ (−∞ to ∞) fX(x) dx = 1      (2.19b)
Based on the behavior of the CDF, a random variable can be classified as: (i) continuous, (ii) discrete and (iii) mixed. If the CDF, FX(x), is a continuous function of x for all x, then X is a continuous random variable. If FX(x) is a staircase, then X corresponds to a discrete random variable. We say that X is a mixed random variable if FX(x) is discontinuous but not a staircase. Later on in this lesson, we shall discuss some examples of these three types of variables.
We can induce more than one transformation on a given sample space. If we induce k such transformations, then we have a set of k co-existing random variables. For simplicity, let k = 2 and let the two random variables be X and Y. Their joint distribution function is defined as

FX,Y(x, y) = P{s : X(s) ≤ x, Y(s) ≤ y}      (2.20)

(For convenience, we hereafter denote the quantity X(s) simply by X.)
That is, FX ,Y ( x , y ) is the probability associated with the set of all those sample points such that under X , their transformed values will be less than or equal to x and at the same time, under Y , the transformed values will be less than or equal to y . In other words, FX ,Y ( x1 , y1 ) is the probability associated with the set of all sample points whose transformation does not fall outside the shaded region in the two dimensional (Euclidean) space shown in Fig. 2.5.
Looked at another way, let A be the set of all sample points s ∈ S such that X(s) ≤ x1. Similarly, if B is comprised of all those sample points s ∈ S such that Y(s) ≤ y1, then FX,Y(x1, y1) is the probability associated with the event AB.
Properties of the two dimensional distribution function are:

i) FX,Y(x, y) ≥ 0,  −∞ < x < ∞, −∞ < y < ∞
ii) FX,Y(−∞, y) = FX,Y(x, −∞) = 0
iii) FX,Y(∞, ∞) = 1
iv) FX,Y(∞, y) = FY(y)
v) FX,Y(x, ∞) = FX(x)
The joint PDF of X and Y, fX,Y(x, y), is given by

fX,Y(x, y) = ∂² FX,Y(x, y) / ∂x ∂y      (2.21a)

or

FX,Y(x, y) = ∫ (−∞ to x) ∫ (−∞ to y) fX,Y(α, β) dα dβ      (2.21b)
The notion of joint CDF and joint PDF can be extended to the case of k random variables, where k ≥ 3.
Given the joint PDF of random variables X and Y , it is possible to obtain the one dimensional PDFs, f X ( x ) and fY ( y ) . We know that,
FX ,Y ( x1 , ) = FX ( x1 ) .
That is,

FX(x1) = ∫ (−∞ to x1) [ ∫ (−∞ to ∞) fX,Y(α, β) dβ ] dα

fX(x1) = d FX(x)/dx |_(x = x1) = d/dx1 ∫ (−∞ to x1) [ ∫ (−∞ to ∞) fX,Y(α, β) dβ ] dα      (2.22)

= ∫ (−∞ to ∞) fX,Y(x1, β) dβ

or

fX(x) = ∫ (−∞ to ∞) fX,Y(x, y) dy      (2.23a)

Similarly, fY(y) = ∫ (−∞ to ∞) fX,Y(x, y) dx      (2.23b)
(In the study of several random variables, the statistics of any individual variable are called marginal. Hence it is common to refer to FX(x) as the marginal distribution function of X and to fX(x) as the marginal density function of X.) Given that Y has taken the value y1, the conditional PDF of X is denoted by fX|Y(x | y1) and is given by
fX|Y(x | y1) = fX,Y(x, y1) / fY(y1)      (2.24)
where it is assumed that fY(y1) ≠ 0. Once we understand the meaning of conditional PDF, we might as well drop the subscript on y and denote it by
f X |Y ( x | y ) . An analogous relation can be defined for fY | X ( y | x ) . That is, we have
fX|Y(x | y) = fX,Y(x, y) / fY(y)      (2.25a)

and

fY|X(y | x) = fX,Y(x, y) / fX(x)      (2.25b)

or

fX,Y(x, y) = fX|Y(x | y) fY(y) = fY|X(y | x) fX(x)
The function f X |Y ( x | y ) may be thought of as a function of the variable x with variable y arbitrary, but fixed. Accordingly, it satisfies all the requirements of an ordinary PDF; that is,
i) fX|Y(x | y) ≥ 0      (2.26a)

and

ii) ∫ (−∞ to ∞) fX|Y(x | y) dx = 1      (2.26b)
The random variables X1, X2, ......, Xk are said to be statistically independent iff their joint PDF factors into the product of the marginals; that is, iff

fX1, ..., Xk(x1, ......, xk) = Π (i = 1 to k) fXi(xi)      (2.27)

Equivalently, in terms of distribution functions,

FX1, ..., Xk(x1, ......, xk) = Π (i = 1 to k) FXi(xi)      (2.28)

In particular, for two random variables X and Y,

FX,Y(x, y) = FX(x) FY(y)      (2.29)
Statistical independence can also be expressed in terms of conditional PDF. Let X and Y be independent. Then,
fX,Y(x, y) = fX(x) fY(y)

or

fX|Y(x | y) = fX(x)      (2.30a)

Similarly, fY|X(y | x) = fY(y)      (2.30b)
Eq. 2.30(a) and 2.30(b) are alternate expressions for the statistical independence between X and Y . We shall now give a few examples to illustrate the concepts discussed in sec. 2.3.
Example 2.3
Let X be a random variable with the CDF

FX(x) = 0,      x < 0
        K x²,   0 ≤ x ≤ 10
        100 K,  x > 10

i) Find the constant K.
ii) Evaluate P(X ≤ 5) and P(5 < X ≤ 7).
iii) What is fX(x)?

i) FX(∞) = 100 K = 1, so K = 1/100.

ii) P(X ≤ 5) = FX(5) = 25/100 = 0.25, and P(5 < X ≤ 7) = FX(7) − FX(5) = 0.49 − 0.25 = 0.24.
iii) fX(x) = d FX(x)/dx = 0,       x < 0
                         0.02 x,  0 ≤ x ≤ 10
                         0,       x > 10
Note: Specification of f X ( x ) or FX ( x ) is complete only when the algebraic expression as well as the range of X is given.
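The numbers in Example 2.3 can be verified with a few lines of Python (a sketch; the helper name F_X is ours):

```python
def F_X(x, K=0.01):
    """CDF of Example 2.3 with K = 1/100."""
    if x < 0:
        return 0.0
    if x <= 10:
        return K * x * x
    return 100 * K

p_le_5 = F_X(5)                  # P(X <= 5)
p_5_to_7 = F_X(7) - F_X(5)       # P(5 < X <= 7)
print(p_le_5, p_5_to_7)          # ≈ 0.25 and 0.24
```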
Example 2.4

Let fX(x) = a e^(−b |x|), −∞ < x < ∞, where a and b are positive constants.

i) Determine the relation between a and b so that fX(x) is a PDF.
ii) Determine the corresponding FX(x).
iii) Find P[1 < X ≤ 2].
i) As can be seen, fX(x) ≥ 0 for −∞ < x < ∞. In order for fX(x) to represent a legitimate PDF, we require

∫ (−∞ to ∞) a e^(−b |x|) dx = 2 a ∫ (0 to ∞) e^(−b x) dx = 1

That is, a/b = 1/2; hence b = 2a, and

fX(x) = a e^(b x),   x < 0
        a e^(−b x),  x ≥ 0

ii) For x < 0, FX(x) = ∫ (−∞ to x) a e^(b α) dα = (1/2) e^(b x). For x ≥ 0,
FX(x) = ∫ (−∞ to x) fX(α) dα = 1 − ∫ (x to ∞) fX(α) dα = 1 − (1/2) e^(−b x)

We can now write the complete expression for the CDF as

FX(x) = (1/2) e^(b x),       x < 0
        1 − (1/2) e^(−b x),  x ≥ 0

iii) P[1 < X ≤ 2] = FX(2) − FX(1) = (1/2) (e^(−b) − e^(−2b))
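As a numeric sanity check of Example 2.4, take a = 1 (so b = 2) and integrate the density with a simple midpoint rule (an illustrative sketch, not part of the text):

```python
import math

a = 1.0
b = 2 * a          # the constraint b = 2a derived above

def f_X(x):
    return a * math.exp(-b * abs(x))

def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule quadrature."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f_X, -20, 20)                  # tails beyond ±20 are negligible
p_1_2 = integrate(f_X, 1, 2)                     # P(1 < X <= 2)
exact = 0.5 * (math.exp(-b) - math.exp(-2 * b))  # (1/2)(e^-b − e^-2b)
print(total, p_1_2, exact)
```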
Example 2.5

Let X and Y have the joint PDF

fX,Y(x, y) = 1/2,  0 ≤ x ≤ y ≤ 2
             0,    otherwise

a) Find fY|X(y | 1). b) Are X and Y independent?

a) fY|X(y | x) = fX,Y(x, y) / fX(x), where fX(x) is obtained by integrating fX,Y(x, y)
over the appropriate range. The maximum value taken by y is 2; in addition, for any given x, y ≥ x. Hence,

fX(x) = ∫ (x to 2) (1/2) dy = 1 − x/2,  0 ≤ x ≤ 2

so that fX(1) = 1/2, and

fY|X(y | 1) = (1/2)/(1/2) = 1,  1 ≤ y ≤ 2
              0,               otherwise

b)
The dependence between the random variables X and Y is evident from the statement of the problem: given a value X = x1, Y has to be greater than or equal to x1 for the joint PDF to be nonzero. Also, we see that fY|X(y | x) depends on x, whereas if X and Y were to be independent, then we would have fY|X(y | x) = fY(y).
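The marginal fX(x) = 1 − x/2 used above can be recovered numerically from the joint density (a sketch; the midpoint grid size is arbitrary):

```python
# Marginal f_X(x) = ∫ f_XY(x, y) dy for the joint density of Example 2.5:
# f_XY(x, y) = 1/2 on 0 <= x <= y <= 2, zero elsewhere.
def f_XY(x, y):
    return 0.5 if 0 <= x <= y <= 2 else 0.0

def f_X(x, n=20_000):
    # Midpoint rule in y over [0, 2]
    step = 2.0 / n
    return sum(f_XY(x, (i + 0.5) * step) for i in range(n)) * step

print(f_X(1.0))   # ≈ 1 − 1/2 = 0.5
```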
Exercise 2.3
For the two random variables X and Y , the following density functions have been given. (Fig. 2.6)
2.4 Transformation of Variables

Quite often we are given the PDF (or CDF) of a random variable X, and a new random variable Y = g(X) is defined through some transformation g(⋅). Our interest is to obtain fY(y). This can be obtained with the help of the following theorem (Thm. 2.1).
Theorem 2.1: Let Y = g(X). To find fY(y), solve the equation y = g(x). Denoting its real roots by xn, we have

fY(y) = Σ_n fX(xn) / |g′(xn)|      (2.31)

where g′(x) is the derivative of g(x).
Proof: Consider the transformation shown in Fig. 2.7. We see that the equation
y1 = g(x) has three roots, namely, x1, x2 and x3.
Fig. 2.7: X → Y transformation used in the proof of theorem 2.1

We know that fY(y) dy = P[y < Y ≤ y + dy]. Therefore, for any given y1, we need to find the set of values x such that y1 < g(x) ≤ y1 + dy and the probability that X is in this set. As we see from the figure, this set consists of the following intervals:
x1 < x ≤ x1 + dx1,  x2 + dx2 < x ≤ x2,  x3 < x ≤ x3 + dx3
where dx1 > 0, dx3 > 0, but dx2 < 0. From the above, it follows that

P[y1 < Y ≤ y1 + dy] = P[x1 < X ≤ x1 + dx1] + P[x2 + dx2 < X ≤ x2] + P[x3 < X ≤ x3 + dx3]
This probability is the shaded area in Fig. 2.7.
P[x1 < X ≤ x1 + dx1] = fX(x1) dx1,   dx1 = dy / g′(x1)

P[x2 + dx2 < X ≤ x2] = fX(x2) |dx2|,  dx2 = dy / g′(x2)

P[x3 < X ≤ x3 + dx3] = fX(x3) dx3,   dx3 = dy / g′(x3)
We conclude that
fY(y) dy = [ fX(x1)/|g′(x1)| + fX(x2)/|g′(x2)| + fX(x3)/|g′(x3)| ] dy      (2.32)
and Eq. 2.31 follows by canceling dy from Eq. 2.32. Note that if g(x) = y1 = constant for every x in the interval (x0, x1), then we have P[Y = y1] = P(x0 < X ≤ x1) = FX(x1) − FX(x0); that is, FY(y) is discontinuous at y = y1. Hence fY(y) contains an impulse, δ(y − y1), of area FX(x1) − FX(x0).
Example 2.6
Y = g ( X ) = X + a , where a is a constant. Let us find fY ( y ) .
We have g′(x) = 1 and x = y − a. For a given y, there is a unique x satisfying the above transformation. Hence,
fY(y) dy = [ fX(x)/|g′(x)| ] dy, with x = y − a; that is,

fY(y) = fX(y − a)
Let fX(x) = 1 − |x|,  |x| ≤ 1
            0,        elsewhere

and a = −1. Then fY(y) = 1 − |y + 1|. As y = x − 1 and x ranges over the interval (−1, 1), the range of y is (−2, 0).
Hence fY(y) = 1 − |y + 1|,  −2 ≤ y ≤ 0
              0,            elsewhere
fX(x) and fY(y) are shown in Fig. 2.8.
As can be seen, the transformation of adding a constant to the given variable simply results in the translation of its PDF.
Example 2.7

Let Y = g(X) = b X, where b is a constant. As g′(x) = b, we have

fY(y) = (1/|b|) fX(y/b)

Let fX(x) = x/2,  0 ≤ x ≤ 2
            0,    otherwise

and b = −2; then
fY(y) = (1/2) (|y|/4) = |y|/8,  −4 ≤ y ≤ 0
        0,                      otherwise
Exercise 2.4

Let Y = a X + b, where a and b are constants. Show that

fY(y) = (1/|a|) fX((y − b)/a)
Example 2.8
Y = a X², a > 0. Let us find fY(y).

For y ≥ 0, the equation y = a x² has two roots, x1 = √(y/a) and x2 = −√(y/a); at these roots, |g′(x)| = 2 a |x| = 2 √(a y). Hence

fY(y) = [ fX(√(y/a)) + fX(−√(y/a)) ] / (2 √(a y)),  y ≥ 0
        0,                                          otherwise
Let a = 1 , and f X ( x ) =
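As an illustrative check of the result for Y = a X², take a = 1 and (an assumed density, chosen for illustration) X uniform over (−1, 1); then fY(y) = 1/(2√y) for 0 < y ≤ 1, so P(Y ≤ 0.25) = √0.25 = 0.5, which a short simulation reproduces:

```python
import random

rng = random.Random(42)

# X uniform over (−1, 1): f_X(x) = 1/2. With Y = X² (a = 1), the formula
# gives f_Y(y) = [f_X(√y) + f_X(−√y)] / (2√y) = 1/(2√y), 0 < y <= 1,
# hence P(Y <= 0.25) = ∫ dy/(2√y) from 0 to 0.25 = √0.25 = 0.5.
n = 200_000
count = sum(1 for _ in range(n) if rng.uniform(-1, 1) ** 2 <= 0.25)
print(count / n)   # ≈ 0.5
```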
Example 2.9
Y = 0,  X ≤ 0
    X,  X > 0
a) Let us find the general expression for fY(y).

b) Let fX(x) = 1/2,  −1/2 < x < 3/2
              0,    otherwise

We shall compute and sketch fY(y).
a)
Note that g(X) is a constant (equal to zero) for X in the range (−∞, 0]. Hence, there is an impulse in fY(y) at y = 0 whose area is equal to FX(0). As Y is nonnegative, fY(y) = 0 for y < 0. As Y = X for x > 0,
fY(y) = fX(y) w(y) + FX(0) δ(y)

where w(y) = 1,  y ≥ 0
             0,  otherwise
b) Specifically, let fX(x) = 1/2,  −1/2 < x < 3/2
                            0,    elsewhere

Then FX(0) = 1/4, and

fY(y) = (1/4) δ(y),  y = 0
        1/2,         0 < y < 3/2
        0,           otherwise
Example 2.10
Let Y = −1,  X < 0
        +1,  X ≥ 0
a) Let us find the general expression for fY(y).
b) We shall compute and sketch fY(y), assuming that fX(x) is the same as that of Example 2.9.

a) In this case, Y assumes only two values, namely ±1. Hence the PDF of Y has only two impulses. Let us write fY(y) as

fY(y) = P1 δ(y − 1) + P−1 δ(y + 1)

where P−1 = P[X < 0] and P1 = P[X ≥ 0].
b) Taking fX(x) of Example 2.9, we have P1 = 3/4 and P−1 = 1/4. Fig. 2.11 has the sketches of fX(x) and fY(y).
Note that this transformation has converted a continuous random variable X into a discrete random variable Y .
Exercise 2.5
Let a random variable X with the PDF shown in Fig. 2.12(a) be the input to a device with the input-output characteristic shown in Fig. 2.12(b). Compute and sketch fY ( y ) .
Fig. 2.12: (a) Input PDF for the transformation of exercise 2.5 (b) Input-output transformation
Exercise 2.6
We now assume that the random variable X is of the discrete type, taking on the values xk with probabilities Pk. In this case, the RV Y = g(X) is also discrete, assuming the value yk = g(xk) with probability Pk. If yk = g(x) for only one x = xk, then P[Y = yk] = P[X = xk] = Pk. If, however, yk = g(x) for x = xk and x = xm, then P[Y = yk] = Pk + Pm.
Example 2.11
Let Y = X².

a) If fX(x) = (1/6) Σ (i = 1 to 6) δ(x − i), find fY(y).

b) If fX(x) = (1/6) Σ (i = −2 to 3) δ(x − i), find fY(y).
a) If X takes the values (1, 2, ......, 6) with probability 1/6 each, then Y takes the values 1², 2², ......, 6² with probability 1/6 each. That is,

fY(y) = (1/6) Σ (i = 1 to 6) δ(y − i²)
b) If X takes the values (−2, −1, 0, 1, 2, 3) with probability 1/6 each, then Y takes the values 0, 1, 4, 9 with probabilities 1/6, 1/3, 1/3, 1/6, respectively. That is,

fY(y) = (1/6) [δ(y) + δ(y − 9)] + (1/3) [δ(y − 1) + δ(y − 4)]
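The bookkeeping in part b), probability masses pooling at a common y, can be mirrored directly in code (a sketch using exact fractions):

```python
from fractions import Fraction
from collections import defaultdict

# Part (b): X takes −2, −1, 0, 1, 2, 3, each with probability 1/6.
pmf_X = {i: Fraction(1, 6) for i in range(-2, 4)}

pmf_Y = defaultdict(Fraction)
for x, p in pmf_X.items():
    pmf_Y[x * x] += p        # values mapping to the same y pool their mass

# Y takes 0, 1, 4, 9 with probabilities 1/6, 1/3, 1/3, 1/6.
print(dict(pmf_Y))
```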
So far we have considered functions of a single random variable. Consider next two random variables X and Y, and two new random variables Z = g(X, Y) and W = h(X, Y) obtained from them. The Jacobian of the transformation is

J(z, w / x, y) = det | ∂z/∂x  ∂z/∂y | = (∂z/∂x)(∂w/∂y) − (∂z/∂y)(∂w/∂x)
                     | ∂w/∂x  ∂w/∂y |

That is, the Jacobian is the determinant of the appropriate partial derivatives. We shall now state the theorem which relates fZ,W(z, w) and fX,Y(x, y).
Theorem 2.2: Let Z = g(X, Y) and W = h(X, Y). Suppose the pair of equations

g(x, y) = z1      (2.33a)

h(x, y) = w1      (2.33b)

has the single solution (x1, y1). Then

fZ,W(z1, w1) = fX,Y(x1, y1) / |J(z, w / x1, y1)|      (2.34)
Proof of this theorem is given in appendix A2.1. For a more general version of this theorem, refer [1].
Example 2.12

Let the joint PDF of X and Y be

fX,Y(x, y) = 1/4,  −1 ≤ x ≤ 1, −1 ≤ y ≤ 1
             0,    otherwise

and let Z = X + Y, W = X − Y. a) Find fZ,W(z, w). b) Find fZ(z).

a) From the given transformations, we obtain x = (1/2)(z + w) and y = (1/2)(z − w). Fig. 2.14(a) shows the space A on which fX,Y(x, y) is non-zero.
Fig. 2.14: (a) The space where f X ,Y is non-zero (b) The space where fZ ,W is non-zero
The boundaries of the space B (on which fZ,W is non-zero) are obtained by transforming the boundaries of A:

A space              B space
x = 1:   (1/2)(z + w) = 1    ⟹  w = −z + 2
x = −1:  (1/2)(z + w) = −1   ⟹  w = −(z + 2)
y = 1:   (1/2)(z − w) = 1    ⟹  w = z − 2
y = −1:  (1/2)(z − w) = −1   ⟹  w = z + 2

The space B is shown in Fig. 2.14(b).
The Jacobian of the transformation is

J(z, w / x, y) = det | 1   1 | = −2
                     | 1  −1 |

and |J| = 2. Hence

fZ,W(z, w) = (1/4)/2 = 1/8,  (z, w) ∈ B
             0,              otherwise
b) fZ(z) = ∫ (−∞ to ∞) fZ,W(z, w) dw

From Fig. 2.14(b), we can see that, for a given z with |z| ≤ 2, w can take values only in the range −(2 − |z|) to (2 − |z|). Hence

fZ(z) = ∫ (−(2 − z) to (2 − z)) (1/8) dw = (1/4)(2 − z),  0 ≤ z ≤ 2

fZ(z) = ∫ (−(2 + z) to (2 + z)) (1/8) dw = (1/4)(2 + z),  −2 ≤ z ≤ 0
Hence fZ(z) = (1/4)(2 − |z|),  |z| ≤ 2
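The triangular density just obtained can be checked by simulation; for instance, P(|Z| ≤ 1) = 2 ∫ (0 to 1) (2 − z)/4 dz = 3/4 (illustrative trial count):

```python
import random

rng = random.Random(3)

# X, Y independent and uniform over (−1, 1); Z = X + Y should follow the
# triangular density f_Z(z) = (2 − |z|)/4, |z| <= 2, of Example 2.12.
# Analytically, P(|Z| <= 1) = 3/4.
n = 200_000
hits = sum(1 for _ in range(n)
           if abs(rng.uniform(-1, 1) + rng.uniform(-1, 1)) <= 1)
print(hits / n)   # ≈ 0.75
```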
Example 2.13
Let R = √(X² + Y²) and Φ = tan⁻¹(Y/X), where

fX,Y(x, y) = (1/2π) exp[−(x² + y²)/2],  −∞ < x, y < ∞

Let us find fR,Φ(r, φ).
As the given transformation is from cartesian to polar coordinates, we can write x = r cos φ and y = r sin φ, and the transformation is one-to-one;

J(r, φ / x, y) = det | ∂r/∂x  ∂r/∂y | = det | cos φ       sin φ     | = 1/r
                     | ∂φ/∂x  ∂φ/∂y |       | −(sin φ)/r  (cos φ)/r |
Hence, fR,Φ(r, φ) = (r/2π) exp(−r²/2),  0 ≤ r < ∞, −π < φ ≤ π
                    0,                  otherwise

It is left as an exercise to find fR(r) and fΦ(φ), and to show that R and Φ are independent variables.
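A short simulation confirms the Rayleigh behavior of R implied by the above; for instance, FR(1) = 1 − e^(−1/2) (an illustrative sketch):

```python
import math
import random

rng = random.Random(11)

# X, Y independent zero-mean, unit-variance Gaussians; R = sqrt(X² + Y²)
# should be Rayleigh distributed: F_R(r) = 1 − exp(−r²/2),
# so P(R <= 1) = 1 − e^{−1/2} ≈ 0.3935.
n = 200_000
hits = sum(1 for _ in range(n)
           if math.hypot(rng.gauss(0, 1), rng.gauss(0, 1)) <= 1.0)
print(hits / n, 1 - math.exp(-0.5))   # the two should be close
```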
Theorem 2.2 can also be used when only one function Z = g ( X , Y ) is specified and what is required is fZ ( z ) . To apply the theorem, a conveniently chosen auxiliary or dummy variable W is introduced. Typically W = X or
W = Y ; using the theorem fZ ,W ( z , w ) is found from which fZ ( z ) can be
obtained.
For example, let Z = X + Y and take W = Y as the auxiliary variable; then x = z − w and y = w. As |J| = 1,

fZ,W(z, w) = fX,Y(z − w, w)
and

fZ(z) = ∫ (−∞ to ∞) fX,Y(z − w, w) dw      (2.35)

If X and Y are independent, this becomes

fZ(z) = ∫ (−∞ to ∞) fX(z − w) fY(w) dw      (2.36a)

That is, fZ = fX ∗ fY      (2.36b)
Example 2.14
Let X and Y be two independent random variables, with

fX(x) = 1/2,  −1 ≤ x ≤ 1
        0,    otherwise

fY(y) = 1/3,  −2 ≤ y ≤ 1
        0,    otherwise

If Z = X + Y, let us find P[Z ≤ −2]. From Eq. 2.36(b), fZ(z) is the convolution of fX(z) and fY(z). Carrying out the convolution, we obtain fZ(z) as shown in Fig. 2.15.
P[Z ≤ −2] is the area of fZ(z) to the left of z = −2; from the figure, this works out to be 1/12.
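The same number falls out of a direct numerical convolution of the two densities per Eq. 2.36(a) (a sketch; the grid sizes are arbitrary):

```python
# Numerically convolve f_X (uniform on [−1, 1]) with f_Y (uniform on [−2, 1])
# per Eq. 2.36(a), then integrate f_Z over z <= −2.
def f_X(x):
    return 0.5 if -1 <= x <= 1 else 0.0

def f_Y(y):
    return 1/3 if -2 <= y <= 1 else 0.0

def f_Z(z, n=3000):
    # f_Z(z) = ∫ f_X(z − w) f_Y(w) dw, midpoint rule over the support [−2, 1]
    step = 3.0 / n
    return sum(f_X(z - w) * f_Y(w)
               for i in range(n)
               for w in [-2 + (i + 0.5) * step]) * step

# P[Z <= −2] = ∫ f_Z(z) dz from −3 (the left edge of the support) to −2
m = 400
step = 1.0 / m
p = sum(f_Z(-3 + (i + 0.5) * step) for i in range(m)) * step
print(p)   # ≈ 1/12
```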
Example 2.15

Let Z = X/Y, where

fX,Y(x, y) = (1 + x y)/4,  |x| ≤ 1, |y| ≤ 1
             0,            otherwise

Let us find fZ(z). Taking W = Y as the auxiliary variable, we have x = z w and y = w. As J(z, w / x, y) = 1/w, we have

fZ(z) = ∫ (−∞ to ∞) |w| fX,Y(z w, w) dw = ∫ |w| (1 + z w²)/4 dw
where fX,Y(x, y) is non-zero if (x, y) ∈ A, the square |x| ≤ 1, |y| ≤ 1.
Fig. 2.16: (a) The space where fX,Y is non-zero (b) The space where fZ,W is non-zero

To obtain fZ(z) from fZ,W(z, w), we have to integrate out w over the appropriate ranges.

i) Let |z| ≤ 1; then

fZ(z) = ∫ (−1 to 1) |w| (1 + z w²)/4 dw = 2 ∫ (0 to 1) w (1 + z w²)/4 dw = (1/4)(1 + z/2)

ii) For z > 1,

fZ(z) = 2 ∫ (0 to 1/z) w (1 + z w²)/4 dw = (1/4)(1/z² + 1/(2 z³))

iii) For z < −1,

fZ(z) = 2 ∫ (0 to −1/z) w (1 + z w²)/4 dw = (1/4)(1/z² + 1/(2 z³))
Hence

f_Z(z) = (1/4)(1 + z/2),  |z| ≤ 1
f_Z(z) = (1/4)(1/z² + 1/(2 z³)),  |z| > 1
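The piecewise density derived in Example 2.15 can be checked by evaluating the defining integral numerically at a few points (a sketch under the same assumptions as the example, i.e. f_{X,Y}(x, y) = (1 + xy)/4 on the unit square):

```python
# Numerical cross-check of Example 2.15: Z = X/Y.
# f_Z(z) = integral of |w| f_XY(z*w, w) dw should match the closed forms.
def f_XY(x, y):
    return (1.0 + x * y) / 4.0 if abs(x) <= 1.0 and abs(y) <= 1.0 else 0.0

def f_Z_numeric(z, dw=1e-4):
    n = int(2.0 / dw)
    s = 0.0
    for i in range(n):
        w = -1.0 + (i + 0.5) * dw           # midpoint rule on w in (-1, 1)
        s += abs(w) * f_XY(z * w, w) * dw
    return s

def f_Z_closed(z):
    if abs(z) <= 1.0:
        return 0.25 * (1.0 + z / 2.0)
    return 0.25 * (1.0 / z**2 + 1.0 / (2.0 * z**3))

for z in (-2.0, -0.5, 0.5, 2.0):
    assert abs(f_Z_numeric(z) - f_Z_closed(z)) < 1e-3
```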
Exercise 2.7

Let Z = X Y, where X and Y are independent random variables with

f_X(x) = 1/(π √(1 − x²)), |x| < 1; 0, otherwise

f_Y(y) = y exp(−y²/2), y ≥ 0; 0, otherwise

Using

f_Z(z) = ∫_{−∞}^{∞} (1/|w|) f_{X,Y}(z/w, w) dw

show that

f_Z(z) = (1/√(2π)) exp(−z²/2),  −∞ < z < ∞.
In transformations involving two random variables, we may encounter a situation where one of the variables is continuous and the other is discrete; such cases are handled better by making use of the distribution function approach, as illustrated below.
Example 2.16

Let X be a discrete random variable with P[X = 0] = P[X = 1] = 1/2, and let Y be independent of X, with the PDF

f_Y(y) = (1/√(2π)) exp(−y²/2),  −∞ < y < ∞.

If Z = X + Y, let us find f_Z(z).

Let us first compute the distribution function of Z, from which the density function can be derived.

P[Z ≤ z] = P[Z ≤ z | X = 0] P[X = 0] + P[Z ≤ z | X = 1] P[X = 1]

As Z = X + Y, we have

P[Z ≤ z | X = 0] = F_Y(z) and P[Z ≤ z | X = 1] = F_Y(z − 1)

Hence F_Z(z) = (1/2)[F_Y(z) + F_Y(z − 1)], and on differentiating,

f_Z(z) = (1/(2√(2π))) [exp(−z²/2) + exp(−(z − 1)²/2)]
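The density obtained in Example 2.16 is a two-term Gaussian mixture; a quick numerical check (under the reading above that Y is N(0, 1) and X takes 0 or 1 equiprobably) confirms that it integrates to 1:

```python
# Example 2.16 check: f_Z(z) = (1/2)[phi(z) + phi(z - 1)], with phi the
# standard normal density, should be a valid PDF.
import math

def phi(t):
    return math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)

def f_Z(z):
    return 0.5 * (phi(z) + phi(z - 1.0))

# Riemann sum over a wide interval; the tails beyond |z| = 8 are negligible
dz = 0.001
area = sum(f_Z(-8.0 + k * dz) * dz for k in range(int(16.0 / dz)))
assert abs(area - 1.0) < 1e-5
```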
The distribution function method, as illustrated by means of example 2.16, is a very basic method and can be used in all situations. (In this method, if Y = g(X), we compute F_Y(y) and then obtain f_Y(y) = dF_Y(y)/dy. Similarly for transformations involving more than one random variable.) Of course, computing the CDF may prove to be quite difficult in certain situations. The method of obtaining PDFs based on theorem 2.1 or 2.2 is called the change of variable method.
Example 2.17

Let Z = X + Y. Let us obtain f_Z(z) by the distribution function method.

F_Z(z) = P[Z ≤ z] = P[X + Y ≤ z] = P[Y ≤ z − X]

This probability is the probability of (X, Y) lying in the shaded area shown in Fig. 2.17. That is,

F_Z(z) = ∫_{−∞}^{∞} dx ∫_{−∞}^{z−x} f_{X,Y}(x, y) dy

Hence,

f_Z(z) = (d/dz) ∫_{−∞}^{∞} dx ∫_{−∞}^{z−x} f_{X,Y}(x, y) dy = ∫_{−∞}^{∞} f_{X,Y}(x, z − x) dx    (2.37a)
Similarly,

f_Z(z) = ∫_{−∞}^{∞} f_{X,Y}(z − y, y) dy    (2.37b)

If X and Y are independent, we have f_Z(z) = f_X(z) ∗ f_Y(z). We note that Eq. 2.37(b) is the same as Eq. 2.35.
So far we have considered transformations involving one or two variables. This can be generalized to the case of functions of n variables. Details can be found in [1, 2].
2.5 Statistical Averages

The mean value (also called the expected value or mathematical expectation) of a random variable X is defined as

m_X = E[X] = ∫_{−∞}^{∞} x f_X(x) dx    (2.38)

where E denotes the expectation operator. Note that m_X is a constant. Similarly, the expected value of a function of X, g(X), is defined by

E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx    (2.39)
Remarks: The terminology expected value or expectation has its origin in games of chance. This can be illustrated as follows: three small similar discs, numbered 1, 2 and 2 respectively, are placed in a bowl and are mixed. A player is to be
blindfolded and is to draw a disc from the bowl. If he draws the disc numbered 1, he will receive nine dollars; if he draws either disc numbered 2, he will receive 3 dollars. It seems reasonable to assume that the player has a '1/3 claim' on the 9 dollars and a '2/3 claim' on the 3 dollars. His total claim is 9(1/3) + 3(2/3), or five dollars. If we take X to be the (discrete) random variable with the PDF

f_X(x) = (1/3) δ(x − 1) + (2/3) δ(x − 2)

and g(X) = 15 − 6X, then

E[g(X)] = ∫_{−∞}^{∞} (15 − 6x) f_X(x) dx = 5
That is, the mathematical expectation of g(X) is precisely the player's claim or expectation [3]. (Note that g(x) is such that g(1) = 9 and g(2) = 3.)

For the special case of g(X) = X^n, we obtain the n-th moment of the probability distribution of the RV X; that is,

E[X^n] = ∫_{−∞}^{∞} x^n f_X(x) dx    (2.40)

The most widely used moments are the first moment (n = 1, which results in the mean value of Eq. 2.38) and the second moment (n = 2, resulting in the mean square value of X):

E[X²] = ∫_{−∞}^{∞} x² f_X(x) dx    (2.41)

The n-th central moment of X is the moment of X taken about its mean:

E[(X − m_X)^n] = ∫_{−∞}^{∞} (x − m_X)^n f_X(x) dx    (2.42)
For a function of two random variables, g(X, Y), we have

E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dx dy    (2.43)

For example, if Z = X + Y, then

E[Z] = ∫∫ (x + y) f_{X,Y}(x, y) dx dy = ∫∫ x f_{X,Y}(x, y) dx dy + ∫∫ y f_{X,Y}(x, y) dx dy

Integrating out the variable y in the first term and the variable x in the second term, we have

E[Z] = ∫ x f_X(x) dx + ∫ y f_Y(y) dy = m_X + m_Y
2.5.1 Variance

Coming back to the central moments, the first central moment is always zero because

E[X − m_X] = ∫_{−∞}^{∞} (x − m_X) f_X(x) dx = m_X − m_X = 0

Considering the second central moment,

σ_X² = E[(X − m_X)²] = E[X²] − m_X²
The second central moment of a random variable is called the variance, and its (positive) square root is called the standard deviation. The symbol σ² is generally used to denote the variance. (If necessary, we use a subscript on σ².)

The variance provides a measure of the variable's spread or randomness. Specifying the variance essentially constrains the effective width of the density function. This can be made more precise with the help of the Chebyshev inequality, which we develop starting from the following theorem.

Theorem 2.3: Let g(X) be a non-negative function of the random variable X. If E[g(X)] exists, then for every positive constant c,

P[g(X) ≥ c] ≤ E[g(X)]/c    (2.44)
Proof: Let A = {x : g(x) ≥ c}. Then,

E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx ≥ ∫_A g(x) f_X(x) dx

If x ∈ A, then g(x) ≥ c; hence

E[g(X)] ≥ c ∫_A f_X(x) dx

But

∫_A f_X(x) dx = P[x ∈ A] = P[g(X) ≥ c]

That is, E[g(X)] ≥ c P[g(X) ≥ c], which is the desired result.

Note: The kind of manipulation used in proving theorem 2.3 is useful in establishing similar inequalities involving random variables.
To see how innocuous (or weak, perhaps) the inequality 2.44 is, let g(X) represent the height of a randomly chosen human being, with E[g(X)] = 1.6 m. Then Eq. 2.44 states that the probability of choosing a person over 16 m tall is at most 1/10! (In a population of 1 billion, at most 100 million would be 16 m or taller!)
Chebyshev inequality: Let X be a random variable with mean m_X and variance σ_X². Then, for any positive number k,

i) P[(X − m_X)² ≥ k² σ_X²] ≤ 1/k²    (2.45a)

ii) P[|X − m_X| ≥ k σ_X] ≤ 1/k²    (2.45b)

To establish 2.45, let g(X) = (X − m_X)² and c = k² σ_X² in theorem 2.3. We then have

P[(X − m_X)² ≥ k² σ_X²] ≤ σ_X²/(k² σ_X²) = 1/k²

In other words,

P[|X − m_X| ≥ k σ_X] ≤ 1/k²

which is the desired result. Naturally, we would take the positive number k to be greater than one to have a meaningful result. The Chebyshev inequality can be interpreted as: the probability of observing any RV outside ±k standard deviations of its mean value is no larger than 1/k². With k = 2, for example, the probability of |X − m_X| ≥ 2σ_X does not exceed 1/4 or 25%. By the same token, we expect X to occur within the range (m_X ± 2σ_X) for more than 75% of the observations. That is, the smaller the standard deviation, the smaller is the width of the interval around m_X where the required probability is concentrated. The Chebyshev
inequality thus enables us to give a quantitative interpretation to the statement 'variance is indicative of the spread of the random variable'.
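To see how conservative the bound typically is, it can be compared with an exactly computable tail probability; the sketch below uses X uniform in (−1, 1) (an arbitrary choice, any finite-variance PDF would do), for which P[|X| ≥ a] = 1 − a when 0 ≤ a ≤ 1.

```python
# Chebyshev bound (Eq. 2.45b) versus the exact tail probability for
# X ~ uniform(-1, 1): m_X = 0 and sigma_X^2 = 1/3.
import math

sigma = math.sqrt(1.0 / 3.0)
for k in (1.2, 1.5, 1.7):
    exact = max(0.0, 1.0 - k * sigma)   # exact P[|X| >= k*sigma] for this PDF
    bound = 1.0 / k**2                  # Chebyshev upper bound
    assert exact <= bound               # the bound always holds, often loosely
    print(k, exact, bound)
```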
Note that it is not necessary that the variance exists for every PDF. For example, if

f_X(x) = (α/π) · 1/(α² + x²),  −∞ < x < ∞, α > 0

(the Cauchy PDF), then m_X = 0 but E[X²] is not finite.
2.5.2 Covariance

An important joint expectation is the quantity called covariance, which is obtained by letting g(X, Y) = (X − m_X)(Y − m_Y) in Eq. 2.43. We use the symbol λ to denote the covariance. That is,

λ_XY = cov[X, Y] = E[(X − m_X)(Y − m_Y)]    (2.46a)

= E[XY] − m_X m_Y    (2.46b)

The normalized covariance, called the correlation coefficient, is

ρ_XY = λ_XY/(σ_X σ_Y)    (2.47)

The correlation coefficient is a measure of dependency between the variables. Suppose X and Y are independent. Then,

E[XY] = ∫∫ x y f_{X,Y}(x, y) dx dy = ∫∫ x y f_X(x) f_Y(y) dx dy

= [∫ x f_X(x) dx] [∫ y f_Y(y) dy] = m_X m_Y
Thus, we have λ_XY (and ρ_XY) being zero. Intuitively, this result is appealing. Assume X and Y to be independent. When the joint experiment is performed many times, and given X = x₁, then Y would occur sometimes positive with respect to m_Y and sometimes negative with respect to m_Y. In the course of many trials of the experiment, with the outcome X = x₁, the sum of the numbers x₁(y − m_Y) would be very small, and the quantity tends to zero as the number of trials keeps increasing.

On the other hand, let X and Y be dependent. Suppose, for example, the outcome y is conditioned on the outcome x in such a manner that there is a greater possibility of (y − m_Y) having the same sign as (x − m_X). Then we expect ρ_XY to be positive. Similarly, if the probability of (x − m_X) and (y − m_Y) being of opposite sign is quite large, then we expect ρ_XY to be negative. Taking the extreme case of this dependency, let X and Y be so conditioned that, given X = x₁, then Y = ± α x₁, α being a positive constant. Then ρ_XY = ± 1. That is, for X and Y independent we have ρ_XY = 0, and for the totally dependent case |ρ_XY| = 1. If the variables are neither independent nor totally dependent, then |ρ_XY| will have a magnitude between 0 and 1.

Two random variables X and Y are said to be orthogonal if E[XY] = 0. If ρ_XY (or λ_XY) = 0, the random variables X and Y are said to be uncorrelated. If the random variables are independent, they are uncorrelated. However, the fact that they are uncorrelated does not ensure that they are independent. As an example, let X be uniformly distributed in the interval (−1, 1),
and Y = X². Then,

E[XY] = E[X³] = (1/2) ∫_{−1}^{1} x³ dx = 0

As m_X = 0, we have λ_XY = 0. That is, X and Y are uncorrelated even though Y is completely determined by X.
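This uncorrelated-but-dependent situation is easy to confirm numerically (a small sketch using the same X uniform in (−1, 1) and Y = X²):

```python
# X ~ uniform(-1, 1) and Y = X^2 are uncorrelated (cov = 0) although Y is
# completely determined by X: uncorrelated does not imply independent.
dx = 1e-4
n = int(2.0 / dx)
e_x = e_y = e_xy = 0.0
for i in range(n):
    x = -1.0 + (i + 0.5) * dx
    w = 0.5 * dx                 # f_X(x) dx for the uniform density
    e_x += x * w
    e_y += (x * x) * w
    e_xy += x * (x * x) * w      # E[X*Y] with Y = X^2
cov = e_xy - e_x * e_y
assert abs(cov) < 1e-9
```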
Let Y be a linear combination of the two random variables X₁ and X₂. That is, Y = k₁X₁ + k₂X₂, where k₁ and k₂ are constants. Let E[X_i] = m_i and var[X_i] = σ_i², i = 1, 2. Then,

E[Y] = k₁ m₁ + k₂ m₂    (2.48a)

E[Y²] = E[k₁² X₁² + k₂² X₂² + 2 k₁ k₂ X₁ X₂]    (2.48b)
From Eqs. 2.48(a) and (b), the variance of Y works out to be

σ_Y² = k₁² σ₁² + k₂² σ₂² + 2 k₁ k₂ ρ₁₂ σ₁ σ₂

Generalizing, if Y = Σ_{i=1}^{n} k_i X_i, then

σ_Y² = Σ_{i=1}^{n} k_i² σ_i² + 2 Σ Σ_{i<j} k_i k_j ρ_{ij} σ_i σ_j    (2.49)

where the meaning of the various symbols on the RHS of Eq. 2.49 is quite obvious.
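The variance formula for a linear combination can be sanity-checked on any small joint distribution; the sketch below uses a made-up two-point joint PMF (the particular probabilities and weights are arbitrary).

```python
# Spot-check of Eq. 2.49 (two-variable case): var(k1*X1 + k2*X2) equals
# k1^2 var(X1) + k2^2 var(X2) + 2 k1 k2 cov(X1, X2).
pmf = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}   # illustrative PMF
k1, k2 = 2.0, -3.0

def E(g):
    return sum(g(x1, x2) * p for (x1, x2), p in pmf.items())

m1, m2 = E(lambda a, b: a), E(lambda a, b: b)
v1 = E(lambda a, b: (a - m1) ** 2)
v2 = E(lambda a, b: (b - m2) ** 2)
cov = E(lambda a, b: (a - m1) * (b - m2))

var_y = E(lambda a, b: (k1 * a + k2 * b - (k1 * m1 + k2 * m2)) ** 2)
assert abs(var_y - (k1**2 * v1 + k2**2 * v2 + 2 * k1 * k2 * cov)) < 1e-12
```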
We shall now give a few examples based on the theory covered so far in this section.
Example 2.18
Let X have the PDF

f_X(x) = 1/8, 0 ≤ x ≤ 2; x/8, 2 < x ≤ 4; 0, otherwise.

We shall find a) m_X and b) σ_X².

a) E[X] = (1/8) ∫_0^2 x dx + (1/8) ∫_2^4 x² dx
= 1/4 + 7/3 = 31/12

b) σ_X² = E[X²] − (E[X])²

E[X²] = (1/8) ∫_0^2 x² dx + (1/8) ∫_2^4 x³ dx = 1/3 + 15/2 = 47/6

Hence,

σ_X² = 47/6 − (31/12)² = 167/144
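The moments of Example 2.18 can be verified by numerical integration (assuming the piecewise density as read above: 1/8 on [0, 2] and x/8 on (2, 4]):

```python
# Cross-check of Example 2.18: expect E[X] = 31/12, E[X^2] = 47/6,
# var = 167/144.
def f(x):
    if 0.0 <= x <= 2.0:
        return 1.0 / 8.0
    if 2.0 < x <= 4.0:
        return x / 8.0
    return 0.0

dx = 1e-4
m1 = m2 = 0.0
for i in range(int(4.0 / dx)):
    x = (i + 0.5) * dx            # midpoint rule over the support [0, 4]
    m1 += x * f(x) * dx
    m2 += x * x * f(x) * dx
assert abs(m1 - 31.0 / 12.0) < 1e-4
assert abs(m2 - 47.0 / 6.0) < 1e-3
assert abs((m2 - m1 * m1) - 167.0 / 144.0) < 1e-3
```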
Example 2.19

Let Y = cos(πX/2), where X is uniformly distributed in the interval (−1, 1). Let us find E[Y] and σ_Y².

E[Y] = (1/2) ∫_{−1}^{1} cos(πx/2) dx = 2/π ≃ 0.636

E[Y²] = (1/2) ∫_{−1}^{1} cos²(πx/2) dx = 1/2

Hence σ_Y² = 1/2 − 4/π² ≃ 0.095
Example 2.20

Let X and Y have the joint PDF

f_{X,Y}(x, y) = x + y, 0 ≤ x ≤ 1, 0 ≤ y ≤ 1; 0, otherwise.

Let us find a) E[XY²] and b) ρ_XY.

a) E[XY²] = ∫_0^1 ∫_0^1 x y² (x + y) dx dy = 17/72

b) To find ρ_XY, we require E[XY], m_X, m_Y, σ_X and σ_Y. By symmetry, m_X = m_Y = 7/12 and E[X²] = E[Y²] = 5/12, so that σ_X² = σ_Y² = 5/12 − (7/12)² = 11/144. Also,

E[XY] = ∫_0^1 ∫_0^1 x y (x + y) dx dy = 1/3

λ_XY = E[XY] − m_X m_Y = 1/3 − 49/144 = −1/144

Hence ρ_XY = (−1/144)/(11/144) = −1/11
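The numbers in Example 2.20 can be cross-checked with a double Riemann sum over the unit square:

```python
# Cross-check of Example 2.20: f_XY(x, y) = x + y on the unit square.
# Expect E[X Y^2] = 17/72 and rho_XY = -1/11.
d = 2e-3
n = int(1.0 / d)
e_xy2 = e_x = e_x2 = e_xy = 0.0
for i in range(n):
    x = (i + 0.5) * d
    for j in range(n):
        y = (j + 0.5) * d
        w = (x + y) * d * d       # f_XY(x, y) dx dy
        e_xy2 += x * y * y * w
        e_x += x * w
        e_x2 += x * x * w
        e_xy += x * y * w
var_x = e_x2 - e_x ** 2           # by symmetry, var_y = var_x and m_Y = m_X
rho = (e_xy - e_x * e_x) / var_x
assert abs(e_xy2 - 17.0 / 72.0) < 1e-4
assert abs(rho - (-1.0 / 11.0)) < 1e-3
```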
Another statistical average that will be found useful in the study of communication theory is the conditional expectation. The quantity

E[g(X) | Y = y] = ∫_{−∞}^{∞} g(x) f_{X|Y}(x | y) dx    (2.50)

is called the conditional expectation of g(X), given Y = y. If g(X) = X, then we have the conditional mean, namely E[X | Y]:

E[X | Y = y] = ∫_{−∞}^{∞} x f_{X|Y}(x | y) dx    (2.51)

Similarly, we can define the conditional variance, etc. We shall illustrate the calculation of the conditional mean with the help of an example.
Example 2.21

Let X and Y have the joint PDF

f_{X,Y}(x, y) = 1/x, 0 < y < x < 1; 0, otherwise.

Let us compute E[X | Y]. To do so, we require the conditional PDF f_{X|Y}(x | y) = f_{X,Y}(x, y)/f_Y(y).

f_Y(y) = ∫_y^1 (1/x) dx = −ln y, 0 < y < 1

f_{X|Y}(x | y) = (1/x)/(−ln y) = −1/(x ln y), y < x < 1

Hence

E[X | Y = y] = ∫_y^1 x · (−1/(x ln y)) dx = −(1/ln y) ∫_y^1 dx = (y − 1)/ln y
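The conditional-mean formula of Example 2.21 can be verified by integrating the conditional density numerically for a few values of y:

```python
# Cross-check of Example 2.21: with f_{X|Y}(x|y) = -1/(x ln y) on (y, 1),
# E[X | Y = y] should equal (y - 1)/ln y.
import math

def cond_mean(y, n=50000):
    dx = (1.0 - y) / n
    s = 0.0
    for i in range(n):
        x = y + (i + 0.5) * dx
        s += x * (-1.0 / (x * math.log(y))) * dx    # x * f_{X|Y}(x|y) dx
    return s

for y in (0.1, 0.5, 0.9):
    assert abs(cond_mean(y) - (y - 1.0) / math.log(y)) < 1e-9
```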
Consider a random experiment in which our interest is in the occurrence or nonoccurrence of an event A . That is, we are interested in the event A or its complement, say, B . Let the experiment be repeated n times and let p be the
probability of occurrence of A on each trial, the trials being independent. Let X denote the random variable 'the number of occurrences of A in n trials'. X can take the values 0, 1, 2, ..., n. If we can compute P[X = k], k = 0, 1, ..., n, then we can write f_X(x).

Taking a special case, let n = 5 and k = 3. The sample space (representing the outcomes of these five repetitions) has 32 sample points, say s₁, ..., s₃₂. The sample point s₁ could represent the sequence ABBBB. Sample points such as ABAAB, AAABB, etc. will map into the real number 3, as shown in Fig. 2.18. (Each sample point is actually an element of the five-dimensional Cartesian product space.) Hence,

P[X = 3] = C(5, 3) p³ (1 − p)²

where C(5, 3) = 5!/(3! 2!) is the binomial coefficient. Generalizing this to arbitrary n and k, we have the binomial density, given by
f_X(x) = Σ_{i=0}^{n} P_i δ(x − i)    (2.52)

where P_i = C(n, i) p^i (1 − p)^{n−i}.

As can be seen, f_X(x) ≥ 0 and

∫_{−∞}^{∞} f_X(x) dx = Σ_{i=0}^{n} P_i = Σ_{i=0}^{n} C(n, i) p^i (1 − p)^{n−i} = [(1 − p) + p]^n = 1

It is left as an exercise to show that E[X] = n p and σ_X² = n p (1 − p). (Though the formulae for the mean and the variance of a binomial PDF are simple, the algebra to derive them is laborious.) We write X is b(n, p) to indicate that X has a binomial PDF with parameters n and p.
The following example illustrates the use of the binomial PDF in a communication problem.
Example 2.22
A digital communication system transmits binary digits over a noisy channel in blocks of 16 digits. Assume that the probability of a binary digit being in error is 0.01 and that errors in various digit positions within a block are statistically independent.

i) Find the expected number of errors per block.
ii) Find the probability that the number of errors per block is greater than or equal to 3.

Let X be the random variable representing the number of errors per block. Then X is b(16, 0.01).

i) E[X] = n p = 16 × 0.01 = 0.16
ii) P[X ≥ 3] = 1 − P[X ≤ 2]

= 1 − Σ_{i=0}^{2} C(16, i) (0.01)^i (0.99)^{16−i} ≃ 0.0005
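Example 2.22 can be worked out directly in a few lines:

```python
# Example 2.22 numerically: X ~ b(16, 0.01).
import math

n, p = 16, 0.01
mean = n * p                                   # expected errors per block
tail = 1.0 - sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(3))
assert abs(mean - 0.16) < 1e-12
assert abs(tail - 5.08e-4) < 1e-5              # P[X >= 3] ~ 0.0005
```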
Exercise 2.8

Show that, for example 2.22, the Chebyshev inequality results in the bound P[X ≥ 3] ≤ 0.0196. (Note that the Chebyshev bound is much larger than the actual probability; it is not a tight bound.)
ii) Poisson: X is said to have a Poisson PDF with parameter λ (λ > 0) if

f_X(x) = Σ_{m=0}^{∞} e^{−λ} (λ^m/m!) δ(x − m)    (2.53)

We have ∫_{−∞}^{∞} f_X(x) dx = 1 because Σ_{m=0}^{∞} λ^m/m! = e^λ.

Since

Σ_{m=0}^{∞} λ^m/m! = e^λ,

we have, on differentiating both sides with respect to λ,

e^λ = Σ_{m=1}^{∞} m λ^{m−1}/m!

Hence

E[X] = Σ_{m=0}^{∞} m (λ^m e^{−λ}/m!) = λ e^{−λ} Σ_{m=1}^{∞} λ^{m−1}/(m − 1)! = λ e^{−λ} e^λ = λ
Similarly, it can be shown that E[X²] = λ² + λ. Hence σ_X² = λ.
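These Poisson moments are easy to confirm numerically by summing the series (λ = 2.5 below is arbitrary; the series is truncated where the terms are negligible):

```python
# Poisson PDF of Eq. 2.53: the probabilities sum to 1, and the mean and
# variance both equal the parameter lam.
import math

lam = 2.5
probs = [math.exp(-lam) * lam**m / math.factorial(m) for m in range(60)]
mean = sum(m * pm for m, pm in enumerate(probs))
second = sum(m * m * pm for m, pm in enumerate(probs))
assert abs(sum(probs) - 1.0) < 1e-12
assert abs(mean - lam) < 1e-9
assert abs(second - (lam**2 + lam)) < 1e-9     # E[X^2] = lam^2 + lam
```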
i) Uniform: Let X be uniformly distributed in the interval (a, b), i.e. f_X(x) = 1/(b − a), a < x < b; 0, otherwise. Then m_X = (a + b)/2 and

σ_X² = (b − a)²/12    (2.54)

Note that the variance of the uniform PDF depends only on the width of the interval (b − a). Therefore, whether X is uniform in (−1, 1) or in (2, 4), it has the same variance, namely 1/3.
ii) Rayleigh: R is said to be Rayleigh distributed if

f_R(r) = (r/b) exp(−r²/(2b)), r ≥ 0; 0, otherwise    (2.55)

where b > 0 is a parameter. A typical sketch of the Rayleigh PDF is given in Fig. 2.20. (f_R(r) of example 2.12 is a Rayleigh PDF with b = 1.)

The Rayleigh PDF frequently arises in radar and communication problems. We will encounter it later in the study of narrow-band noise processes.
Exercise 2.9

Let R have the Rayleigh PDF of Eq. 2.55.

a) Show that ∫_0^∞ f_R(r) dr = 1. (Use the substitution z = r²/(2b), so that dz = (r/b) dr.)

b) Show that E[R] = √(πb/2) and E[R²] = 2b.
iii) Gaussian

By far the most widely used PDF in the context of communication theory is the Gaussian (also called normal) density, specified by

f_X(x) = (1/(√(2π) σ_X)) exp[−(x − m_X)²/(2 σ_X²)],  −∞ < x < ∞    (2.56)

where m_X is the mean value and σ_X² the variance. That is, the Gaussian PDF is completely specified by the two parameters m_X and σ_X². We use the symbol N(m_X, σ_X²) to denote the Gaussian density¹. In appendix A2.3, we show that f_X(x), as given by Eq. 2.56, is a valid PDF.

As can be seen from Fig. 2.21, the Gaussian PDF is symmetrical with respect to m_X.

¹ In this notation, N(0, 1) denotes the Gaussian PDF with zero mean and unit variance. Note that if X is N(m_X, σ_X²), then Y = (X − m_X)/σ_X is N(0, 1).
Hence

F_X(m_X) = ∫_{−∞}^{m_X} f_X(x) dx = 0.5

Consider P[X ≥ a]. We have,

P[X ≥ a] = ∫_a^∞ (1/(√(2π) σ_X)) exp[−(x − m_X)²/(2 σ_X²)] dx

Making the substitution z = (x − m_X)/σ_X, we have

P[X ≥ a] = ∫_{(a − m_X)/σ_X}^{∞} (1/√(2π)) exp(−z²/2) dz = Q((a − m_X)/σ_X)

where

Q(y) = (1/√(2π)) ∫_y^∞ exp(−x²/2) dx    (2.57)

The Q function is tabulated extensively in the literature on communication theory as well as in standard mathematical tables. A small list is given in appendix A2.2 at the end of the chapter.
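In numerical work, Q(y) is conveniently computed from the complementary error function via the standard identity Q(y) = (1/2) erfc(y/√2) (also noted in appendix A2.2); a quick spot-check against the tabulated values:

```python
# Q(y) via the complementary error function, checked against table entries.
import math

def Q(y):
    return 0.5 * math.erfc(y / math.sqrt(2.0))

assert abs(Q(0.0) - 0.5) < 1e-12     # F_X(m_X) = 0.5
assert abs(Q(1.0) - 0.1587) < 1e-4
assert abs(Q(3.0) - 0.0013) < 1e-4
```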
The importance of the Gaussian density in communication theory is due to a theorem called the central limit theorem, which essentially states that: if the RV X is the weighted sum of N independent random components, where each component makes only a small contribution to the sum, then F_X(x) approaches the Gaussian distribution as N becomes large, regardless of the distribution of the individual components.

For a more precise statement and a thorough discussion of this theorem, you may refer to [1-3]. The electrical noise in a communication system is often due to the cumulative effect of a large number of randomly moving charged particles, each particle making an independent contribution of the same amount to the total. Hence the instantaneous value of the noise can be fairly adequately modeled as a Gaussian variable. We shall develop Gaussian random processes in detail in Chapter 3 and, in Chapter 7, we shall make use of this model.
Example 2.23

A random variable Y is said to have a log-normal PDF if X = ln Y has a Gaussian (normal) PDF. Let Y have the PDF f_Y(y) given by

f_Y(y) = (1/(√(2π) σ y)) exp[−(ln y − α)²/(2σ²)], y ≥ 0; 0, otherwise

where α and σ are given constants.

a) Show that Y is log-normal.
b) Find E[Y].
c) If m is such that F_Y(m) = 0.5, find m.

a) Let X = ln Y, or x = ln y. (Note that the transformation is one-to-one.) As dx = dy/y,

f_X(x) = y f_Y(y) = (1/(√(2π) σ)) exp[−(x − α)²/(2σ²)],  −∞ < x < ∞

Note that X is N(α, σ²); hence Y is log-normal.

b) E[Y] = E[e^X] = ∫_{−∞}^{∞} e^x (1/(√(2π) σ)) exp[−(x − α)²/(2σ²)] dx

Completing the square in the exponent,

= e^{α + σ²/2} [ ∫_{−∞}^{∞} (1/(√(2π) σ)) exp[−(x − (α + σ²))²/(2σ²)] dx ]

As the bracketed quantity, being the integral of a Gaussian PDF between the limits (−∞, ∞), is 1, we have

E[Y] = e^{α + σ²/2}

c) P[Y ≤ m] = P[X ≤ ln m] = 0.5, which requires ln m = α; that is, m = e^α.
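The log-normal mean of part (b) can be confirmed by integrating E[e^X] numerically (the values α = 0.3, σ = 0.8 below are arbitrary illustrative choices):

```python
# Log-normal check: if X = ln Y is N(alpha, sigma^2), then
# E[Y] = E[e^X] = exp(alpha + sigma^2 / 2).
import math

alpha, sigma = 0.3, 0.8
dx = 1e-3
lo, hi = alpha - 10.0 * sigma, alpha + 10.0 * sigma    # +-10 sigma covers the mass
total = 0.0
for i in range(int((hi - lo) / dx)):
    x = lo + (i + 0.5) * dx
    pdf = math.exp(-(x - alpha) ** 2 / (2.0 * sigma**2)) / (sigma * math.sqrt(2.0 * math.pi))
    total += math.exp(x) * pdf * dx                    # integrand of E[e^X]
assert abs(total - math.exp(alpha + sigma**2 / 2.0)) < 1e-4
```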
As an example of a two-dimensional density, we will consider the bivariate Gaussian PDF, f_{X,Y}(x, y), −∞ < x, y < ∞, given by

f_{X,Y}(x, y) = (1/k₁) exp{ −(1/k₂) [ (x − m_X)²/σ_X² − 2ρ (x − m_X)(y − m_Y)/(σ_X σ_Y) + (y − m_Y)²/σ_Y² ] }    (2.58)

where

k₁ = 2π σ_X σ_Y √(1 − ρ²)

k₂ = 2(1 − ρ²)

ρ = ρ_XY = correlation coefficient between X and Y

Some of the important properties of the bivariate Gaussian PDF are:

P1) The marginal densities f_X(x) and f_Y(y) are Gaussian¹.

P2) f_{X,Y}(x, y) = f_X(x) f_Y(y) iff ρ = 0. That is, if the Gaussian variables are uncorrelated, then they are independent. That is not true, in general, with respect to non-Gaussian variables (we have already seen an example of this in Sec. 2.5.2).

P3) If Z = αX + βY, where α and β are constants and X and Y are jointly Gaussian, then Z is Gaussian. Therefore f_Z(z) can be written after computing m_Z and σ_Z² with the help of the formulae given in section 2.5.

Figure 2.22 gives the plot of a bivariate Gaussian PDF for the case of ρ = 0 and σ_X = σ_Y.
¹ Note that the converse is not necessarily true. Let f_X and f_Y be obtained from f_{X,Y}, and let f_X and f_Y be Gaussian. This does not imply that f_{X,Y} is jointly Gaussian, unless X and Y are independent. We can construct examples of a joint PDF f_{X,Y} which is not Gaussian but results in marginals f_X and f_Y that are Gaussian.
Fig. 2.22: Bivariate Gaussian PDF (σ_X = σ_Y and ρ = 0)

For ρ = 0 and σ_X = σ_Y, f_{X,Y} resembles a (temple) bell, with, of course, the striker missing! For ρ ≠ 0, we have two cases: (i) ρ > 0 and (ii) ρ < 0.
Example 2.24

Let X and Y be jointly Gaussian with m_X = −m_Y = 1, σ_X² = σ_Y² = 1 and ρ_XY = −1/2. Let us find the probability of (X, Y) lying in the region D shown in Fig. 2.23.

Fig. 2.23: The region D of example 2.24

The region D lies between the parallel lines y = −x/2 + 1 and y = −x/2 + 2; hence the required probability is

P[Y + X/2 ≥ 1] − P[Y + X/2 ≥ 2]

Let Z = Y + X/2. Then Z is Gaussian (property P3), with

m_Z = m_Y + m_X/2 = −1 + 1/2 = −1/2

σ_Z² = (1/4)σ_X² + σ_Y² + 2 · (1/2) · λ_XY = 1/4 + 1 − 2 · (1/2) · (1/2) = 3/4

Hence W = (Z + 1/2)/√(3/4) is N(0, 1), and

P[Z ≥ 1] = P[W ≥ √3] = Q(√3)

P[Z ≥ 2] = P[W ≥ 5/√3] = Q(5/√3)

Hence the required probability = Q(√3) − Q(5/√3) ≃ 0.0416 − 0.0019 ≃ 0.04
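Evaluating the two Q-function terms of Example 2.24 numerically:

```python
# Example 2.24: required probability = Q(sqrt(3)) - Q(5/sqrt(3)).
import math

def Q(y):
    return 0.5 * math.erfc(y / math.sqrt(2.0))

p = Q(math.sqrt(3.0)) - Q(5.0 / math.sqrt(3.0))
assert abs(p - 0.0397) < 5e-4     # about 0.04
```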
Exercise 2.10

X and Y are independent, identically distributed (iid) random variables, each being N(0, 1). Find the probability of (X, Y) lying in the region A shown in Fig. 2.25.

Fig. 2.25: The region A of Exercise 2.10

Note: It would be easier to calculate this kind of probability if the region were a product space. From example 2.12, we feel that if we transform (X, Y) into (Z, W) using the polar-coordinate transformation, the region A will transform into a region B over which the required probability is easier to compute.
Exercise 2.11

Two random variables X and Y are obtained by means of the transformation given below:

X = (−2 log_e U₁)^{1/2} cos(2π U₂)    (2.59a)

Y = (−2 log_e U₁)^{1/2} sin(2π U₂)    (2.59b)

U₁ and U₂ are independent random variables, uniformly distributed in the range 0 < u₁, u₂ < 1. Show that X and Y are independent and each is N(0, 1).

Hint: Let X₁ = (−2 log_e U₁)^{1/2}, so that X = X₁ cos Θ and Y = X₁ sin Θ, where Θ = 2π U₂.

Note: The transformation given by Eq. 2.59 is called the Box-Muller transformation and can be used to generate two Gaussian random number sequences from two independent uniformly distributed (in the range 0 to 1) sequences.
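A minimal sketch of Eq. 2.59 in code, checking the sample mean and variance of the generated sequence against the N(0, 1) values:

```python
# Box-Muller transformation (Eq. 2.59): two uniform streams -> two N(0,1) streams.
import math
import random

random.seed(1)                               # fixed seed for reproducibility

def box_muller():
    u1, u2 = random.random(), random.random()
    r = math.sqrt(-2.0 * math.log(u1))       # Rayleigh-distributed radius
    return r * math.cos(2 * math.pi * u2), r * math.sin(2 * math.pi * u2)

samples = [v for _ in range(20000) for v in box_muller()]
n = len(samples)
mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
assert abs(mean) < 0.05 and abs(var - 1.0) < 0.05
```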
(the variables x and y can be replaced by their inverse transformation quantities, namely, x = g⁻¹(z, w) and y = h⁻¹(z, w)). Let the transformation be one-to-one. (This can be generalized to the case of one-to-many.) Consider the mapping shown in Fig. A2.1.

Fig. A2.1: A typical transformation between the (x, y) plane and the (z, w) plane

An infinitesimal rectangle ABCD in the z-w plane maps into a parallelogram in the x-y plane. (We may assume that the vertex A transforms to A′, B to B′, etc.) We shall now find the relation between the differential area of the rectangle and the differential area of the parallelogram.
Fig. A2.2: Typical parallelogram

Let (x, y) be the co-ordinates of P₁. Then the co-ordinates of P₂ and P₃ are given by

P₂ = ( x + (∂x/∂z) dz, y + (∂y/∂z) dz )

P₃ = ( x + (∂x/∂w) dw, y + (∂y/∂w) dw )

Consider the vectors V₁ and V₂ shown in Fig. A2.2. That is,

V₁ = (∂x/∂z) dz i + (∂y/∂z) dz j

V₂ = (∂x/∂w) dw i + (∂y/∂w) dw j

where i and j are the unit vectors in the appropriate directions. Then, the area A of the parallelogram is

A = |V₁ × V₂| = | (∂x/∂z)(∂y/∂w) − (∂y/∂z)(∂x/∂w) | dz dw

That is,

A = |J( x, y / z, w )| dz dw

where J(x, y / z, w) is the Jacobian of (x, y) with respect to (z, w). Hence,

f_{Z,W}(z, w) = f_{X,Y}(x, y) |J( x, y / z, w )| = f_{X,Y}(x, y) / |J( z, w / x, y )|
Appendix A2.2
Q(·) Function Table

Q(α) = (1/√(2π)) ∫_α^∞ exp(−x²/2) dx

  y     Q(y)       y     Q(y)       y     Q(y)
 0.05  0.4801     1.05  0.1469     2.10  0.0179
 0.10  0.4602     1.10  0.1357     2.20  0.0139
 0.15  0.4405     1.15  0.1251     2.30  0.0107
 0.20  0.4207     1.20  0.1151     2.40  0.0082
 0.25  0.4013     1.25  0.1056     2.50  0.0062
 0.30  0.3821     1.30  0.0968     2.60  0.0047
 0.35  0.3632     1.35  0.0885     2.70  0.0035
 0.40  0.3446     1.40  0.0808     2.80  0.0026
 0.45  0.3264     1.45  0.0735     2.90  0.0019
 0.50  0.3085     1.50  0.0668     3.00  0.0013
 0.55  0.2912     1.55  0.0606     3.10  0.0010
 0.60  0.2743     1.60  0.0548     3.20  0.00069
 0.65  0.2578     1.65  0.0495     3.30  0.00048
 0.70  0.2420     1.70  0.0446     3.40  0.00034
 0.75  0.2266     1.75  0.0401     3.50  0.00023
 0.80  0.2119     1.80  0.0359     3.60  0.00016
 0.85  0.1977     1.85  0.0322     3.70  0.00010
 0.90  0.1841     1.90  0.0287     3.80  0.00007
 0.95  0.1711     1.95  0.0256     3.90  0.00005
 1.00  0.1587     2.00  0.0228     4.00  0.00003
Some tables list the complementary error function,

erfc(α) = 1 − erf(α) = (2/√π) ∫_α^∞ e^{−x²} dx

where the error function is

erf(α) = (2/√π) ∫_0^α e^{−x²} dx

Hence Q(α) = (1/2) erfc(α/√2).
Appendix A2.3

We shall show that the Gaussian density of Eq. 2.56 is a valid PDF by establishing

∫_{−∞}^{∞} f_X(x) dx = 1

Let I = ∫_{−∞}^{∞} exp(−v²/2) dv. Then,

I² = [ ∫_{−∞}^{∞} exp(−v²/2) dv ] [ ∫_{−∞}^{∞} exp(−y²/2) dy ] = ∫∫ exp[−(v² + y²)/2] dv dy

Converting to polar coordinates (v = r cos θ, y = r sin θ),

I² = ∫_0^{2π} ∫_0^∞ exp(−r²/2) r dr dθ = 2π

or I = √(2π). That is,

(1/√(2π)) ∫_{−∞}^{∞} exp(−v²/2) dv = 1    (A2.3.1)

Let

v = (x − m_X)/σ_X    (A2.3.2)

Then, dv = dx/σ_X    (A2.3.3)

Using Eq. A2.3.2 and Eq. A2.3.3 in Eq. A2.3.1, we have the required result:

(1/(√(2π) σ_X)) ∫_{−∞}^{∞} exp[−(x − m_X)²/(2σ_X²)] dx = 1
References

1) Papoulis, A., Probability, Random Variables and Stochastic Processes, McGraw-Hill (3rd edition), 1991.

2) Wozencraft, J. M. and Jacobs, I. M., Principles of Communication Engineering, John Wiley, 1965.

3) Hogg, R. V. and Craig, A. T., Introduction to Mathematical Statistics, The Macmillan Co. (2nd edition), 1965.