# Principles of Communication

Prof. V. Venkata Rao
Indian Institute of Technology Madras

CHAPTER 2

Probability and Random Variables

2.1 Introduction
At the start of Sec. 1.1.2, we indicated that one of the possible ways of classifying signals is: deterministic or random. By random we mean unpredictable; that is, in the case of a random signal, we cannot predict its future value with certainty, even if the entire past history of the signal is known. If the signal is of the deterministic type, no such uncertainty exists. Consider the signal x(t) = A cos(2π f1 t + θ). If A, θ and f1 are known (we are assuming them to be constants), then we know the value of x(t) for all t. (A, θ and f1 can be calculated by observing the signal over a short period of time.) Now, assume that x(t) is the output of an oscillator with very poor frequency stability and calibration. Though it was set to produce a sinusoid of frequency f = f1, the frequency actually put out may be f1', where f1' ∈ (f1 ± ∆f1).

Even this value may not remain constant and could vary with time. Then, observing the output of such a source over a long period of time would not be of much use in predicting the future values. We say that the source output varies in a random manner.

Another example of a random signal is the voltage at the terminals of a receiving antenna of a radio communication scheme. Even if the transmitted (radio) signal is from a highly stable source, the voltage at the terminals of the receiving antenna varies in an unpredictable fashion. This is because the conditions of propagation of the radio waves are not under our control.

But randomness is the essence of communication. Communication theory involves the assumption that the transmitter is connected to a source whose output the receiver is not able to predict with certainty. If the students know ahead of time what the teacher (source + transmitter) is going to say (and what jokes he is going to crack), then there is no need for the students (the receivers) to attend the class!

Although less obvious, it is also true that there is no communication problem unless the transmitted signal is disturbed during propagation or reception by unwanted (random) signals, usually termed as noise and interference. (We shall take up the statistical characterization of noise in Chapter 3.)

However, quite a few random signals, though their exact behavior is unpredictable, do exhibit statistical regularity. Consider again the reception of radio signals propagating through the atmosphere. Though it would be difficult to know the exact value of the voltage at the terminals of the receiving antenna at any given instant, we do find that the average values of the antenna output over two successive one minute intervals do not differ significantly. If the conditions of propagation do not change very much, it would be true of any two averages (over one minute) even if they are well spaced out in time. Consider even a simpler experiment, namely, that of tossing an unbiased coin (by a person without any magical powers). It is true that we do not know in advance whether the outcome on a particular toss would be a head or tail (otherwise, we stop tossing the coin at the start of a cricket match!). But, we know for sure that in a long sequence of tosses, about half of the outcomes would be heads (If this does not happen, we suspect either the coin or tosser (or both!)).
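The statistical regularity described above is easy to reproduce numerically. The short simulation below (a sketch; the seed and toss counts are arbitrary choices) shows the fraction of heads in repeated fair-coin tosses fluctuating for small n but settling near 1/2 as n grows:

```python
import random

random.seed(42)  # arbitrary seed, kept fixed for repeatability

def head_fraction(n_tosses):
    """Toss a fair coin n_tosses times; return the fraction of heads."""
    heads = sum(random.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The fraction fluctuates for small n but settles near 1/2 as n grows.
for n in (10, 100, 10_000, 100_000):
    print(n, head_fraction(n))
```

This is exactly the behaviour that the relative frequency definition of probability, introduced below, turns into a definition.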


Statistical regularity of averages is an experimentally verifiable phenomenon in many cases involving random quantities. Hence, we are tempted to develop mathematical tools for the analysis and quantitative characterization of random signals. To be able to analyze random signals, we need to understand random variables. The resulting mathematical topics are: probability theory, random variables and random (stochastic) processes. In this chapter, we shall develop the probabilistic characterization of random variables. In Chapter 3, we shall extend these concepts to the characterization of random processes.

2.2 Basics of Probability
We shall introduce some of the basic concepts of probability theory by defining some terminology relating to random experiments (i.e., experiments whose outcomes are not predictable).

2.2.1 Terminology
Def. 2.1: Outcome

The end result of an experiment. For example, if the experiment consists of throwing a die, the outcome would be any one of the six faces, F1, ⋅ ⋅ ⋅, F6.
Def. 2.2: Random experiment

An experiment whose outcomes are not known in advance. (e.g. tossing a coin, throwing a die, measuring the noise voltage at the terminals of a resistor etc.)
Def. 2.3: Random event

A random event is an outcome or a set of outcomes of a random experiment that share a common attribute. For example, considering the experiment of throwing a die, an event could be 'face F1' or 'even indexed faces' (F2, F4, F6). We denote events by upper case letters such as A, B or A1, A2, ⋅ ⋅ ⋅


Def. 2.4: Sample space

The sample space of a random experiment is a mathematical abstraction used to represent all possible outcomes of the experiment. We denote the sample space by S. Each outcome of the experiment is represented by a point in S and is called a sample point. We use s (with or without a subscript) to denote a sample point. An event on the sample space is represented by an appropriate collection of sample point(s).
Def. 2.5: Mutually exclusive (disjoint) events

Two events A and B are said to be mutually exclusive if they have no common elements (or outcomes). Hence, if A and B are mutually exclusive, they cannot occur together.
Def. 2.6: Union of events

The union of two events A and B, denoted A ∪ B {also written as (A + B) or (A or B)}, is the set of all outcomes which belong to A or B or both.

This concept can be generalized to the union of more than two events.
Def. 2.7: Intersection of events

The intersection of two events A and B is the set of all outcomes which belong to A as well as B. The intersection of A and B is denoted by (A ∩ B) or simply (AB); it is also referred to as the joint event A and B. This concept can be generalized to the case of the intersection of three or more events.
Def. 2.8: Occurrence of an event

An event A of a random experiment is said to have occurred if the experiment terminates in an outcome that belongs to A .
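The set operations of Defs. 2.5 to 2.8 map directly onto set types in most languages. A minimal Python sketch, using the die-throwing experiment with faces numbered 1 to 6 (the third event C is an illustrative choice, not from the text):

```python
# Sample space and some events of the die-throwing experiment,
# represented as Python sets of face numbers.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}      # 'even indexed faces'
B = {1}            # 'face F1'
C = {1, 2, 3}      # an extra event, chosen here for illustration

print(A | C)       # union A + C (Def. 2.6)
print(A & C)       # intersection AC (Def. 2.7)
print(A & B)       # empty set: A and B are mutually exclusive (Def. 2.5)
print(S - A)       # all points in S but not in A
```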


Def. 2.9: Complement of an event

The complement of an event A, denoted by Ā, is the event containing all points in S but not in A.

Def. 2.10: Null event

The null event, denoted φ, is an event with no sample points. Thus φ = S̄ (note that if A and B are disjoint events, then AB = φ, and vice versa).

2.2.2 Probability of an Event

The probability of an event has been defined in several ways. Two of the most popular definitions are: i) the relative frequency definition, and ii) the classical definition.

Def. 2.11: The relative frequency definition

Suppose that a random experiment is repeated n times. If the event A occurs n_A times, then the probability of A, denoted by P(A), is defined as

P(A) = lim_{n → ∞} (n_A / n)        (2.1)

The ratio (n_A / n) represents the fraction of occurrence of A in n trials. For small values of n, it is likely that (n_A / n) will fluctuate quite badly. But as n becomes larger and larger, we expect (n_A / n) to tend to a definite limiting value. For example, let the experiment be that of tossing a coin and A the event 'outcome of a toss is Head'. If n is of the order of 100, (n_A / n) may not deviate from

1/2 by more than, say, ten percent; and as n becomes larger and larger, we expect (n_A / n) to converge to 1/2.

Def. 2.12: The classical definition

The relative frequency definition given above has an empirical flavor. In the classical approach, the probability of the event A is found without experimentation. This is done by counting the total number N of the possible outcomes of the experiment. If N_A of those outcomes are favorable to the occurrence of the event A, then

P(A) = N_A / N        (2.2)

where it is assumed that all outcomes are equally likely!

Whatever may be the definition of probability, we require the probability measure (assigned to the various events on the sample space) to obey the following postulates or axioms:

P1) P(A) ≥ 0        (2.3a)
P2) P(S) = 1        (2.3b)
P3) If (AB) = φ, then P(A + B) = P(A) + P(B)        (2.3c)

(Note that in Eq. 2.3(c), the symbol + is used to mean two different things: namely, to denote the union of A and B and to denote the addition of two real numbers.) Using Eq. 2.3, it is possible for us to derive some additional relationships:

i) If AB ≠ φ, then

P(A + B) = P(A) + P(B) − P(AB)        (2.4)

ii) Let A1, A2, ..., An be random events such that:

a) Ai Aj = φ, for i ≠ j, and        (2.5a)

b) A1 + A2 + ... + An = S        (2.5b)

Then

P(A) = P(A A1) + P(A A2) + ... + P(A An)        (2.6)

where A is any event on the sample space.

Note: A1, A2, ⋅ ⋅ ⋅, An are said to be mutually exclusive (Eq. 2.5a) and exhaustive (Eq. 2.5b).

iii) P(Ā) = 1 − P(A)        (2.7)

The derivation of Eq. 2.6 and 2.7 is left as an exercise.

A very useful concept in probability theory is that of conditional probability, denoted P(B|A); it represents the probability of B occurring, given that A has occurred. In a real world random experiment, it is quite likely that the occurrence of the event B is very much influenced by the occurrence of the event A. To give a simple example, let a bowl contain 3 resistors and 1 capacitor. The occurrence of the event 'the capacitor on the second draw' is very much dependent on what has been drawn at the first instant. Such dependencies between events are brought out using the notion of conditional probability. The conditional probability P(B|A) can be written in terms of the joint probability P(AB) and the probability of the event P(A). This relation can be arrived at by using either the relative frequency definition of probability or the classical definition. Using the former, we have

P(AB) = lim_{n → ∞} (n_AB / n)

P(A) = lim_{n → ∞} (n_A / n)

where n_AB is the number of times AB occurs in n repetitions of the experiment. As P(B|A) refers to the probability of B occurring, given that A has occurred, we have

Def. 2.13: Conditional Probability

P(B|A) = lim_{n → ∞} (n_AB / n_A) = lim_{n → ∞} [(n_AB / n) / (n_A / n)] = P(AB) / P(A),  P(A) ≠ 0        (2.8a)

or

P(AB) = P(B|A) P(A)        (2.8b)

Interchanging the role of A and B, we have

P(A|B) = P(AB) / P(B),  P(B) ≠ 0

Eq. 2.8(b) can be written as

P(AB) = P(B|A) P(A) = P(B) P(A|B)        (2.9)

Eq. 2.9 expresses the probability of the joint event AB in terms of conditional probability, say P(B|A), and the (unconditional) probability P(A). Similar relations can be derived for the joint probability of a joint event involving the intersection of three or more events. In view of Eq. 2.9, we can also write Eq. 2.8(a) as

P(B|A) = P(B) P(A|B) / P(A),  P(A) ≠ 0        (2.10a)

Similarly,

P(A|B) = P(A) P(B|A) / P(B),  P(B) ≠ 0        (2.10b)

Eq. 2.10(a) or 2.10(b) is one form of Bayes' rule or Bayes' theorem. For example, P(ABC) can be written as

P(ABC) = P(AB) P(C|AB) = P(A) P(B|A) P(C|AB)        (2.11)

Exercise 2.1
Let A1, A2, ..., An be n mutually exclusive and exhaustive events, and let B be another event defined on the same space. Show that

P(Aj | B) = P(B | Aj) P(Aj) / Σ_{i = 1}^{n} P(B | Ai) P(Ai)        (2.12)

Eq. 2.12 represents another form of Bayes' theorem.

Another useful probabilistic concept is that of statistical independence. Suppose the events A and B are such that

P(B|A) = P(B)        (2.13)

That is, knowledge of the occurrence of A tells no more about the probability of occurrence of B than we knew without that knowledge. Then the events A and B are said to be statistically independent. Alternatively, if A and B satisfy Eq. 2.13, then

P(AB) = P(A) P(B)        (2.14)

Either Eq. 2.13 or Eq. 2.14 can be used to define the statistical independence of two events. Note that if A and B are independent, then P(AB) = P(A) P(B), whereas if they are disjoint, then P(AB) = 0. The notion of statistical independence can be generalized to the case of more than two events. A set of k events A1, A2, ..., Ak are said to be statistically independent if and only if (iff) the probability of every intersection of k or fewer events equals the product of the probabilities of its constituents. Thus three events A, B, C are independent when

P(AB) = P(A) P(B)
P(AC) = P(A) P(C)
P(BC) = P(B) P(C)

and

P(ABC) = P(A) P(B) P(C)

We shall illustrate some of the concepts introduced above with the help of two examples.

Example 2.1
Priya (P1) and Prasanna (P2), after seeing each other for some time (and after a few tiffs), decide to get married, much against the wishes of the parents on both sides. They agree to meet at the office of the registrar of marriages at 11:30 a.m. on the ensuing Friday (looks like they are not aware of Rahu Kalam, or they don't care about it). However, both are somewhat lacking in punctuality, and their arrival times are equally likely to be anywhere in the interval 11 to 12 hrs on that day. Also, the arrival of one person is independent of the other. Unfortunately, both are also very short tempered and will wait only 10 min before leaving in a huff, never to meet again.

a) Picture the sample space.
b) Let the event A stand for "P1 and P2 meet". Mark this event on the sample space.
c) Find the probability that the lovebirds will get married and (hopefully) will live happily ever after.

a) The sample space is the rectangle shown in Fig. 2.1(a).

Fig. 2.1(a): S of Example 2.1

b) The diagonal OP represents the simultaneous arrival of Priya and Prasanna. Assuming that P1 arrives at 11:x, a meeting between P1 and P2 would take place if P2 arrives within the interval a to b, as shown in the figure. The event A, indicating the possibility of P1 and P2 meeting, is shown in Fig. 2.1(b).

Fig. 2.1(b): The event A of Example 2.1

c) Probability of marriage = (Shaded area) / (Total area) = 11/36

Example 2.2
Let two honest coins, marked 1 and 2, be tossed together. The four possible outcomes are T1 T2, T1 H2, H1 T2 and H1 H2. (T1 indicates toss of coin 1 resulting in tails; similarly T2, etc.) We shall treat all these outcomes as equally likely; that is, the probability of occurrence of any of these four outcomes is 1/4. (Treating each of these outcomes as an event, we find that these events are mutually exclusive and exhaustive.) Let the event A be 'not H1 H2' and B be the event 'match'. (Match comprises the two outcomes T1 T2 and H1 H2.) Find P(B|A). Are A and B independent?

We know that P(B|A) = P(AB) / P(A). AB is the event 'not H1 H2' and 'match'; that is, it represents the outcome T1 T2. Hence P(AB) = 1/4. The event A comprises the outcomes T1 T2, T1 H2 and H1 T2; therefore,

P(A) = 3/4 and P(B|A) = (1/4) / (3/4) = 1/3

Intuitively, the result P(B|A) = 1/3 is satisfying because, given 'not H1 H2', the toss would have resulted in any one of the three other outcomes, which can be

treated as equally likely. This implies that the outcome T1 T2, given 'not H1 H2', has a probability of 1/3. As P(B) = 1/2 whereas P(B|A) = 1/3, A and B are dependent events.

2.3 Random Variables
Let us introduce a transformation or function, say X, whose domain is the sample space (of a random experiment) and whose range is in the real line; that is, to each si ∈ S, X assigns a real number, X(si), as shown in Fig. 2.2. The figure shows the transformation of a few individual sample points as well as the transformation of the event A, which falls on the real line segment [a1, a2].

Fig. 2.2: A mapping X( ) from S to the real line

2.3.1 Distribution function:
Taking a specific case, let the random experiment be that of throwing a die. The six faces of the die can be treated as the six sample points in S; that is,

Fi = si, i = 1, 2, ..., 6. Let X(si) = i. Once the transformation is induced, the events on the sample space become transformed into appropriate segments of the real line. Then we can enquire into probabilities such as

P[{s : X(s) < a1}], P[{s : b1 < X(s) ≤ b2}] or P[{s : X(s) = c}]

These and other probabilities can be arrived at, provided we know the distribution function of X, denoted by FX( ), which is given by

FX(x) = P[{s : X(s) ≤ x}]        (2.15)

That is, FX(x) is the probability of the event comprising all those sample points which are transformed by X into real numbers less than or equal to x. (Note that, for convenience, we use x as the argument of FX( ), but it could be any other symbol; we may use FX(α), FX(a1), etc.) Evidently, FX( ) is a function whose domain is the real line and whose range is the interval [0, 1].

As an example of a distribution function (also called the Cumulative Distribution Function, CDF), let S consist of four sample points, s1 to s4, each sample point representing an event, with the probabilities P(s1) = 1/4, P(s2) = P(s3) = 1/8 and P(s4) = 1/2. If X(si) = i − 1.5, i = 1, 2, ..., 4, then the distribution function FX(x) will be as shown in Fig. 2.3.

Fig. 2.3: An example of a distribution function

FX( ) satisfies the following properties:

i) FX(x) ≥ 0, − ∞ < x < ∞
ii) FX(− ∞) = 0
iii) FX(∞) = 1
iv) If a > b, then [FX(a) − FX(b)] = P[{s : b < X(s) ≤ a}]
v) If a > b, then FX(a) ≥ FX(b)

The first three properties follow from the fact that FX( ) represents the probability and P(S) = 1. Properties iv) and v) follow from the fact that

{s : X(s) ≤ b} ∪ {s : b < X(s) ≤ a} = {s : X(s) ≤ a}

Referring to Fig. 2.3, note that FX(x) = 0 for x < − 0.5, whereas FX(− 0.5) = 1/4; that is, there is a discontinuity of magnitude 1/4 at the point x = − 0.5. In general, there is a discontinuity in FX of magnitude Pa at a point x = a if and only if

P[{s : X(s) = a}] = Pa        (2.16)
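A staircase CDF of this kind can be evaluated directly from the point probabilities. The sketch below assumes point probabilities 1/4, 1/8, 1/8 and 1/2 at x = −0.5, 0.5, 1.5 and 2.5, consistent with the Fig. 2.3 discussion; note how the `<=` comparison makes the CDF continuous to the right:

```python
# Discrete RV of the Fig. 2.3 example: assumed values and probabilities.
values = [-0.5, 0.5, 1.5, 2.5]
probs  = [1/4, 1/8, 1/8, 1/2]

def F_X(x):
    """F_X(x) = P[X <= x]; '<=' makes the staircase right-continuous."""
    return sum(p for v, p in zip(values, probs) if v <= x)

print(F_X(-0.6))   # 0.0:  below the smallest value
print(F_X(-0.5))   # 0.25: the step of magnitude 1/4 right at x = -0.5
print(F_X(1.5))    # 0.5
print(F_X(10.0))   # 1.0
```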

The properties of the distribution function are summarized by saying that FX( ) is monotonically non-decreasing, is continuous to the right at each point x(1), and has a step of size Pa at the point a iff Eq. 2.16 is satisfied. Functions such as X( ) for which distribution functions exist are called Random Variables (RV). (The term "random variable" is somewhat misleading, because an RV is a well defined function from S into the real line.) In other words, for X( ) to be a random variable, {s : X(s) ≤ x} should be an event in the sample space with some assigned probability, for every real x. Hence, every transformation from S into the real line need not be a random variable.

Consider, for example, a sample space S consisting of six sample points, s1 to s6. The only events that have been identified on the sample space are A = {s1, s2}, B = {s3, s4, s5} and C = {s6}, and their probabilities are P(A) = 2/6, P(B) = 1/2 and P(C) = 1/6. We see that the probabilities for the various unions and intersections of A, B and C can be obtained. Let the transformation X be X(si) = i. Then the distribution function fails to exist, because P[s : 3.5 < X(s) ≤ 4.5] = P(s4) is not known, as s4 is not an event on the sample space.

(1) Let x = a + ∆, with ∆ > 0. Then lim_{∆ → 0} P[a < X(s) ≤ a + ∆] = lim_{∆ → 0} [FX(a + ∆) − FX(a)]. We intuitively feel that as ∆ → 0, the limit of the set {s : a < X(s) ≤ a + ∆} is the null set, and this can be proved to be so. Hence FX(a+) − FX(a) = 0, where a+ = lim_{∆ → 0} (a + ∆). That is, for any real x, FX(x) is continuous to the right.

Exercise 2.2
Let S be a sample space with six sample points, s1 to s6. The events identified on S are the same as above, namely A = {s1, s2}, B = {s3, s4, s5} and C = {s6}, with P(A) = 1/3, P(B) = 1/2 and P(C) = 1/6. Let Y( ) be the transformation

Y(si) = 1, i = 1, 2
        2, i = 3, 4, 5
        3, i = 6

Show that Y( ) is a random variable by finding FY(y). Sketch FY(y).

2.3.2 Probability density function
Though the CDF is a very fundamental concept, in practice we find it more convenient to deal with the Probability Density Function (PDF). The PDF, fX(x), is defined as the derivative of the CDF; that is,

fX(x) = d FX(x) / dx        (2.17a)

or

FX(x) = ∫_{− ∞}^{x} fX(α) dα        (2.17b)

The distribution function may fail to have a continuous derivative at a point x = a for one of two reasons:

i) the slope of FX(x) is discontinuous at x = a
ii) FX(x) has a step discontinuity at x = a

The situation is illustrated in Fig. 2.4.

Fig. 2.4: A CDF without a continuous derivative

As can be seen from the figure, FX(x) has a discontinuous slope at x = 1 and a step discontinuity at x = 2. In the first case, we resolve the ambiguity by taking fX to be a derivative on the right. (Note that FX(x) is continuous to the right.) The second case is taken care of by introducing an impulse in the probability domain. That is, if there is a discontinuity in FX at x = a of magnitude Pa, we include an impulse Pa δ(x − a) in the PDF. For example, for the CDF shown in Fig. 2.3, the PDF will be

fX(x) = (1/4) δ(x + 1/2) + (1/8) δ(x − 1/2) + (1/8) δ(x − 3/2) + (1/2) δ(x − 5/2)        (2.18)

In Eq. 2.18, fX(x) has an impulse of weight 1/8 at x = 1/2, as P[X = 1/2] = 1/8. This impulse function cannot be taken as the limiting case of an even function (such as (1/ε) ga(x/ε)), because an even impulse would yield

lim_{ε → 0} ∫_{1/2 − ε}^{1/2} (1/8) δ(x − 1/2) dx = 1/16

However, for FX(x) to be continuous to the right, we require

lim_{ε → 0} ∫_{1/2 − ε}^{1/2} fX(x) dx = 1/8

as can be seen from Fig. 2.3, where FX(x) = 2/8 for − 1/2 ≤ x < 1/2 and FX(x) = 3/8 for 1/2 ≤ x < 3/2. Such an impulse is referred to as the left-sided delta function.

As the domain of the random variable X( ) is known, it is convenient to denote the variable simply by X. As FX is non-decreasing and FX(∞) = 1, we have

i) fX(x) ≥ 0        (2.19a)
ii) ∫_{− ∞}^{∞} fX(x) dx = 1        (2.19b)

Based on the behavior of the CDF, a random variable can be classified as: (i) continuous, (ii) discrete, and (iii) mixed. If the CDF, FX(x), is a continuous function of x for all x, then X is a continuous random variable. If FX(x) is a staircase, then X corresponds to a discrete variable. We say that X is a mixed random variable if FX(x) is discontinuous, but not a staircase. Later on in this lesson, we shall discuss some examples of these three types of variables.

2.3.3 Joint distribution and density functions
Consider the case of two random variables, say X and Y. We can induce more than one transformation on a given sample space; if we induce k such transformations, then we have a set of k co-existing random variables. Two such random variables can be characterized by the (two-dimensional) joint distribution function, given by

FX,Y(x, y) = P[{s : X(s) ≤ x, Y(s) ≤ y}]        (2.20)

That is, FX,Y(x, y) is the probability associated with the set of all those sample points such that, under X, their transformed values will be less than or equal to x and, at the same time, under Y, the transformed values will be less than or equal to y. In other words, FX,Y(x1, y1) is the probability associated with the set of all sample points whose transformation does not fall outside the shaded region in the two-dimensional (Euclidean) space shown in Fig. 2.5.

Fig. 2.5: Space of {X(s), Y(s)} corresponding to FX,Y(x1, y1)

Looking at the sample space, let A be the set of all those sample points s ∈ S such that X(s) ≤ x1. Similarly, if B is comprised of all those sample points s ∈ S such that Y(s) ≤ y1, then FX,Y(x1, y1) is the probability associated with the event AB.

Properties of the two-dimensional distribution function are:

i) FX,Y(x, y) ≥ 0, − ∞ < x < ∞, − ∞ < y < ∞
ii) FX,Y(− ∞, y) = FX,Y(x, − ∞) = 0
iii) FX,Y(∞, ∞) = 1
iv) FX,Y(∞, y) = FY(y)
v) FX,Y(x, ∞) = FX(x)

vi) If x2 > x1 and y2 > y1, then

FX,Y(x2, y2) ≥ FX,Y(x2, y1) ≥ FX,Y(x1, y1)

We define the two-dimensional joint PDF as

fX,Y(x, y) = ∂² FX,Y(x, y) / (∂x ∂y)        (2.21a)

or

FX,Y(x, y) = ∫_{− ∞}^{y} ∫_{− ∞}^{x} fX,Y(α, β) dα dβ        (2.21b)

The notion of joint CDF and joint PDF can be extended to the case of k random variables, where k ≥ 3.

Given the joint PDF of random variables X and Y, it is possible to obtain the one-dimensional PDFs, fX(x) and fY(y). We know that FX,Y(x1, ∞) = FX(x1). That is,

FX(x1) = ∫_{− ∞}^{x1} [ ∫_{− ∞}^{∞} fX,Y(α, β) dβ ] dα        (2.22)

Eq. 2.22 involves the derivative of an integral. Hence,

fX(x1) = d FX(x) / dx |_{x = x1} = (d/dx1) { ∫_{− ∞}^{x1} [ ∫_{− ∞}^{∞} fX,Y(α, β) dβ ] dα } = ∫_{− ∞}^{∞} fX,Y(x1, β) dβ

or

fX(x) = ∫_{− ∞}^{∞} fX,Y(x, y) dy        (2.23a)

Similarly,

fY(y) = ∫_{− ∞}^{∞} fX,Y(x, y) dx        (2.23b)

(In the study of several random variables, the statistics of any individual variable are called marginal. Hence it is common to refer to FX(x) as the marginal

distribution function of X and to fX(x) as the marginal density function; FY(y) and fY(y) are similarly referred to.)

2.3.4 Conditional density
Given fX,Y(x, y), we know how to obtain fX(x) or fY(y). We might also be interested in the PDF of X given a specific value of y = y1. This is called the conditional PDF of X, given y = y1, denoted fX|Y(x|y1) and defined as

fX|Y(x|y1) = fX,Y(x, y1) / fY(y1)        (2.24)

where it is assumed that fY(y1) ≠ 0. Once we understand the meaning of the conditional PDF, we might as well drop the subscript on y and denote it by fX|Y(x|y). An analogous relation can be defined for fY|X(y|x). That is, we have the pair of equations

fX|Y(x|y) = fX,Y(x, y) / fY(y)        (2.25a)

and

fY|X(y|x) = fX,Y(x, y) / fX(x)        (2.25b)

or

fX,Y(x, y) = fX|Y(x|y) fY(y)        (2.25c)
           = fY|X(y|x) fX(x)        (2.25d)

The function fX|Y(x|y) may be thought of as a function of the variable x, with the variable y arbitrary but fixed. Accordingly, it satisfies all the requirements of an ordinary PDF; that is,

fX|Y(x|y) ≥ 0        (2.26a)

and

∫_{− ∞}^{∞} fX|Y(x|y) dx = 1        (2.26b)
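Eq. 2.23(a) and Eq. 2.26(b) are easy to verify numerically for a concrete joint density. In the sketch below, the density f_{X,Y}(x, y) = x + y on the unit square is a hypothetical choice (not from the text), and the crude midpoint-rule integrator stands in for the integrals:

```python
# Hypothetical joint PDF on the unit square: f_{X,Y}(x, y) = x + y.
def f_xy(x, y):
    return x + y if (0 <= x <= 1 and 0 <= y <= 1) else 0.0

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Marginal via Eq. 2.23(a): f_X(x) = integral over y, analytically x + 1/2.
x0 = 0.3
fx0 = integrate(lambda y: f_xy(x0, y), 0.0, 1.0)
print(fx0)    # close to 0.8

# Conditional PDF via Eq. 2.25(a) integrates to 1 over x (Eq. 2.26b).
y0 = 0.7
fy0 = integrate(lambda x: f_xy(x, y0), 0.0, 1.0)          # f_Y(y0)
area = integrate(lambda x: f_xy(x, y0) / fy0, 0.0, 1.0)
print(area)   # close to 1.0
```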

2.3.5 Statistical independence
In the case of random variables, the definition of statistical independence is somewhat simpler than in the case of events. We call k random variables X1, X2, ..., Xk statistically independent iff the k-dimensional joint PDF factors into the product

∏_{i = 1}^{k} fXi(xi)        (2.27)

Hence, two random variables X and Y are statistically independent iff

fX,Y(x, y) = fX(x) fY(y)        (2.28)

and three random variables X, Y, Z are independent iff

fX,Y,Z(x, y, z) = fX(x) fY(y) fZ(z)        (2.29)

Statistical independence can also be expressed in terms of the conditional PDF. Let X and Y be independent. Then fX,Y(x, y) = fX(x) fY(y). Making use of Eq. 2.25(c), we have

fX|Y(x|y) fY(y) = fX(x) fY(y)

or

fX|Y(x|y) = fX(x)        (2.30a)

Similarly,

fY|X(y|x) = fY(y)        (2.30b)

Eq. 2.30(a) and 2.30(b) are alternate expressions for the statistical independence between X and Y. We shall now give a few examples to illustrate the concepts discussed in Sec. 2.3.

Example 2.3
A random variable X has

FX(x) = { 0,      x < 0
        { K x²,   0 ≤ x ≤ 10
        { 100 K,  x > 10

i) Find the constant K.
ii) Evaluate P(X ≤ 5) and P(5 < X ≤ 7).

iii) What is fX(x)?

i) FX(∞) = 100 K = 1 ⇒ K = 1/100.

ii) P(X ≤ 5) = FX(5) = (1/100) × 25 = 0.25
P(5 < X ≤ 7) = FX(7) − FX(5) = 0.49 − 0.25 = 0.24

iii) fX(x) = d FX(x) / dx, that is,

fX(x) = { 0,       x < 0
        { 0.02 x,  0 ≤ x ≤ 10
        { 0,       x > 10

Note: Specification of fX(x) or FX(x) is complete only when the algebraic expression as well as the range of X is given.

Example 2.4
Consider the random variable X defined by the PDF

fX(x) = a e^{− b |x|}, − ∞ < x < ∞

where a and b are positive constants.

i) Determine the relation between a and b so that fX(x) is a PDF.
ii) Determine the corresponding FX(x).
iii) Find P[1 < X ≤ 2].

i) As can be seen, fX(x) ≥ 0 for − ∞ < x < ∞. In order for fX(x) to represent a legitimate PDF, we require

∫_{− ∞}^{∞} a e^{− b |x|} dx = 2 ∫_{0}^{∞} a e^{− b x} dx = 1

that is, 2a/b = 1; hence b = 2a.

ii) The given PDF can be written as

fX(x) = { a e^{b x},    x < 0
        { a e^{− b x},  x ≥ 0

For x < 0, we have

FX(x) = ∫_{− ∞}^{x} a e^{b α} dα = (1/2) e^{b x}, x < 0

Consider x > 0. Take a specific value, say x = 2:

FX(2) = ∫_{− ∞}^{2} fX(x) dx = 1 − ∫_{2}^{∞} fX(x) dx

But for the problem on hand, ∫_{2}^{∞} fX(x) dx = ∫_{− ∞}^{− 2} fX(x) dx. Therefore,

FX(x) = 1 − (1/2) e^{− b x}, x > 0

We can now write the complete expression for the CDF as

FX(x) = { (1/2) e^{b x},        x < 0
        { 1 − (1/2) e^{− b x},  x ≥ 0

iii) P[1 < X ≤ 2] = FX(2) − FX(1) = (1/2) (e^{− b} − e^{− 2b})

Example 2.5
Let

fX,Y(x, y) = { 1/2,  0 ≤ x ≤ y, 0 ≤ y ≤ 2
            { 0,    otherwise

Find (a) i) fY|X(y|1) and ii) fY|X(y|1.5); (b) Are X and Y independent?

(a) fY|X(y|x) = fX,Y(x, y) / fX(x)

fX(x) can be obtained from fX,Y by integrating out the unwanted variable y over the appropriate range. That is,

fX(x) = ∫_{x}^{2} (1/2) dy = 1 − x/2, 0 ≤ x ≤ 2

Hence,

i) fY|X(y|1) = (1/2) / (1/2) = { 1,  1 ≤ y ≤ 2
                              { 0,  otherwise

ii) fY|X(y|1.5) = (1/2) / (1/4) = { 2,  1.5 ≤ y ≤ 2
                                 { 0,  otherwise

b) The dependence between the random variables X and Y is evident from the statement of the problem: given a value X = x1, Y should be greater than or equal to x1 for the joint PDF to be non-zero. Also, we see that fY|X(y|x) depends on x, whereas if X and Y were to be independent, then fY|X(y|x) = fY(y).
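The results of Example 2.5 can be cross-checked numerically. The midpoint integrator below is only a sketch, but the quantities it computes follow the example exactly: the marginal f_X(x) = 1 − x/2 and the conditional f_{Y|X}(y|1.5) = 2 on [1.5, 2]:

```python
# Joint PDF of Example 2.5: 1/2 on the triangle 0 <= x <= y <= 2.
def f_xy(x, y):
    return 0.5 if (0 <= x <= y <= 2) else 0.0

def integrate(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Marginal f_X(x) = 1 - x/2 on [0, 2] (integrate out y).
for x0 in (0.5, 1.0, 1.5):
    print(x0, integrate(lambda y: f_xy(x0, y), 0.0, 2.0))

# Conditional f_{Y|X}(y | 1.5) = f_{X,Y}(1.5, y) / f_X(1.5) = 2 on [1.5, 2].
fx = integrate(lambda y: f_xy(1.5, y), 0.0, 2.0)   # f_X(1.5) = 0.25
print(f_xy(1.5, 1.8) / fx)                          # close to 2.0
```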

Exercise 2.3
For the two random variables X and Y, the density functions shown in Fig. 2.6 have been given.

Fig. 2.6: PDFs for the exercise 2.3

Find
a)   f_{X,Y}(x, y)
b)   Show that

     f_Y(y) = { y/100,        0 ≤ y ≤ 10
              { 1/5 − y/100,  10 < y ≤ 20.

2.4 Transformation of Variables

The process of communication involves various operations such as modulation, detection, filtering etc. In performing these operations, we typically generate new random variables from the given ones. We will now develop the necessary theory for the statistical characterization of the output random variables, given the transformation and the input random variables.

2.4.1 Functions of one random variable

Assume that X is the given random variable of the continuous type with the PDF f_X(x). Let Y be the new random variable, obtained from X by the transformation Y = g(X). By this we mean that the number Y(s1) associated with any sample point s1 is

     Y(s1) = g(X(s1)).

Our interest is to obtain f_Y(y). This can be obtained with the help of the following theorem (Thm. 2.1).

Theorem 2.1
Let Y = g(X). To find f_Y(y), solve the equation y = g(x) for the specific y. Denoting its real roots by x_n, that is,

     y = g(x1) = ... = g(x_n) = ...,

we will show that

     f_Y(y) = f_X(x1)/|g'(x1)| + ... + f_X(x_n)/|g'(x_n)| + ...        (2.31)

where g'(x) is the derivative of g(x).

Proof: Consider the transformation shown in Fig. 2.7. We see that the equation y1 = g(x) has three roots, namely x1, x2 and x3.

Fig. 2.7: X − Y transformation used in the proof of theorem 2.1

We know that f_Y(y) dy = P[y < Y ≤ y + dy]. Therefore, for any given y1, we need to find the set of values x such that y1 < g(x) ≤ y1 + dy, and the probability that X is in this set. As we see from the figure, this set consists of the following intervals:

     x1 < x ≤ x1 + dx1,   x2 + dx2 < x ≤ x2,   x3 < x ≤ x3 + dx3,

where dx1 > 0 and dx3 > 0, but dx2 < 0. From the above, it follows that

     P[y1 < Y ≤ y1 + dy] = P[x1 < X ≤ x1 + dx1] + P[x2 + dx2 < X ≤ x2] + P[x3 < X ≤ x3 + dx3].

This probability is the shaded area in Fig. 2.7, with

     P[x1 < X ≤ x1 + dx1] = f_X(x1) dx1,    dx1 = dy / g'(x1)
     P[x2 + dx2 < X ≤ x2] = f_X(x2) |dx2|,  dx2 = dy / g'(x2)
     P[x3 < X ≤ x3 + dx3] = f_X(x3) dx3,    dx3 = dy / g'(x3).

We conclude that

     f_Y(y) dy = [f_X(x1)/|g'(x1)|] dy + [f_X(x2)/|g'(x2)|] dy + [f_X(x3)/|g'(x3)|] dy        (2.32)

and, by canceling dy from Eq. 2.32, Eq. 2.31 follows.

Note that if g(x) = y1 = constant for every x in the interval (x0, x1), then we have

     P[Y = y1] = P(x0 < X ≤ x1) = F_X(x1) − F_X(x0);

that is, F_Y(y) is discontinuous at y = y1. Hence f_Y(y) contains an impulse, δ(y − y1), of area F_X(x1) − F_X(x0).

We shall take up a few examples.

Example 2.6
Y = g(X) = X + a, where a is a constant. Let us find f_Y(y).

Let

     f_X(x) = { 1 − |x|,  |x| ≤ 1
              { 0,        elsewhere

and a = −1. For a given y, there is a unique x satisfying the above transformation. Hence f_Y(y) dy = [f_X(x)/|g'(x)|] dy, and as g'(x) = 1, we have f_Y(y) = f_X(y − a). As y = x − 1 and x ranges over the interval (−1, 1), the range of y is (−2, 0), with f_Y(y) = 1 − |y + 1|.

Hence

     f_Y(y) = { 1 − |y + 1|,  −2 ≤ y ≤ 0
              { 0,            elsewhere.

F_X(x) and F_Y(y) are shown in Fig. 2.8.

Fig. 2.8: F_X(x) and F_Y(y) of example 2.6

As can be seen, the transformation of adding a constant to the given variable simply results in the translation of its PDF.

Example 2.7
Let Y = b X, where b is a constant. Let us find f_Y(y).

Let

     f_X(x) = { 1 − x/2,  0 ≤ x ≤ 2
              { 0,        otherwise

and b = −2. Again, for a given y, there is a unique x. Solving for X, we have X = Y/b. As g'(x) = b, then

     f_Y(y) = (1/|b|) f_X(y/b).

Hence

     f_Y(y) = { (1/2)[1 + y/4],  −4 ≤ y ≤ 0
              { 0,               otherwise.

f_X(x) and f_Y(y) are sketched in Fig. 2.9.

Fig. 2.9: f_X(x) and f_Y(y) of example 2.7

Exercise 2.4
Let Y = a X + b, where a and b are constants. Show that

     f_Y(y) = (1/|a|) f_X((y − b)/a).

If f_X(x) is as shown in Fig. 2.8, compute and sketch f_Y(y) for a = −2 and b = 1.

Example 2.8
Y = a X², a > 0. Let us find f_Y(y).

We have g'(x) = 2 a x. If y < 0, then the equation y = a x² has no real solution; hence f_Y(y) = 0 for y < 0. If y ≥ 0, then it has two solutions,

     x1 = √(y/a)  and  x2 = −√(y/a),

and Eq. 2.31 yields

     f_Y(y) = { [1/(2√(a y))] [f_X(√(y/a)) + f_X(−√(y/a))],  y ≥ 0
              { 0,                                           otherwise.

Let a = 1 and

     f_X(x) = (1/√(2π)) exp(−x²/2),   −∞ < x < ∞.

(Note that exp(α) is the same as e^α.) Then

     f_Y(y) = (1/(2√y)) (1/√(2π)) [exp(−y/2) + exp(−y/2)]

            = { (1/√(2πy)) exp(−y/2),  y ≥ 0
              { 0,                     otherwise.

Sketching of f_X(x) and f_Y(y) is left as an exercise.

Example 2.9
Consider the half wave rectifier transformation given by

     Y = { 0,  X ≤ 0
         { X,  X > 0.

a)   Let us find the general expression for f_Y(y).
b)   Let

     f_X(x) = { 1/2,  −1/2 < x < 3/2
              { 0,    otherwise.

We shall compute and sketch f_Y(y).

a)   Note that g(X) is a constant (equal to zero) for X in the range (−∞, 0]. Hence there is an impulse in f_Y(y) at y = 0, whose area is equal to F_X(0). As Y = X for x > 0, we have f_Y(y) = f_X(y) for y > 0; and as Y is nonnegative, f_Y(y) = 0 for y < 0. Hence

     f_Y(y) = f_X(y) w(y) + F_X(0) δ(y),   where w(y) = { 1,  y ≥ 0
                                                        { 0,  otherwise.

b)   Specifically, let

     f_X(x) = { 1/2,  −1/2 < x < 3/2
              { 0,    otherwise.

Then F_X(0) = 1/4, and

     f_Y(y) = { (1/4) δ(y),  y = 0
              { 1/2,         0 < y ≤ 3/2
              { 0,           otherwise.

f_X(x) and f_Y(y) are sketched in Fig. 2.10.

Fig. 2.10: f_X(x) and f_Y(y) for the example 2.9
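A quick Monte Carlo check of Example 2.9 (an illustrative sketch assuming NumPy; the seed is arbitrary): the rectifier output should carry a discrete mass of F_X(0) = 1/4 exactly at y = 0, and a density of 1/2 over (0, 3/2):

```python
import numpy as np

rng = np.random.default_rng(0)
# X uniform on (-1/2, 3/2), as in part b) of Example 2.9, so F_X(0) = 1/4.
x = rng.uniform(-0.5, 1.5, 1_000_000)
y = np.maximum(x, 0.0)          # half wave rectifier: Y = 0 for X <= 0, Y = X for X > 0

p_zero = np.mean(y == 0.0)      # discrete mass at y = 0, expect ~ 1/4
# Away from the origin the density should equal f_X(y) = 1/2, so a band of
# width 0.5 inside (0, 3/2) should carry probability ~ 0.5 * 0.5 = 0.25:
p_band = np.mean((y > 0.5) & (y <= 1.0))
print(p_zero, p_band)
```

The exact equality test `y == 0.0` is what exposes the impulse: a purely continuous variable would essentially never produce repeated exact values.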

Note that X, a continuous RV, is transformed into a mixed RV, Y.

Example 2.10
Let

     Y = { −1,  X < 0
         { +1,  X ≥ 0.

a)   Let us find the general expression for f_Y(y).
b)   We shall compute and sketch f_Y(y), assuming that f_X(x) is the same as that of Example 2.9.

a)   In this case, Y assumes only two values, namely ±1. Hence the PDF of Y has only two impulses. Let us write f_Y(y) as

     f_Y(y) = P1 δ(y − 1) + P−1 δ(y + 1),

where P−1 = P[X < 0] and P1 = P[X ≥ 0].

b)   Taking f_X(x) of example 2.9, we have P1 = 3/4 and P−1 = 1/4. Fig. 2.11 has the sketches of f_X(x) and f_Y(y).

Fig. 2.11: f_X(x) and f_Y(y) for the example 2.10

Note that this transformation has converted a continuous random variable X into a discrete random variable Y.
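The two impulse areas of Example 2.10 can likewise be estimated by simulation (a sketch assuming NumPy; seed arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-0.5, 1.5, 1_000_000)   # the f_X of Example 2.9
y = np.where(x < 0, -1.0, 1.0)          # hard limiter of Example 2.10

p_minus = np.mean(y == -1.0)   # expect P[X < 0] = 1/4
p_plus = np.mean(y == 1.0)     # expect P[X >= 0] = 3/4
print(p_minus, p_plus)
```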

Exercise 2.5
Let a random variable X with the PDF shown in Fig. 2.12(a) be the input to a device with the input-output characteristic shown in Fig. 2.12(b). Compute and sketch f_Y(y).

Fig. 2.12: (a) Input PDF for the transformation of exercise 2.5 (b) Input-output transformation

Exercise 2.6
The random variable X of exercise 2.5 is applied as input to the X − Y transformation shown in Fig. 2.13. Compute and sketch f_Y(y).

We now assume that the random variable X is of discrete type, taking on the value x_k with probability P_k. In this case, the RV Y = g(X) is also discrete, assuming the value y_k = g(x_k) with probability P_k. If y_k = g(x) for only one x = x_k, then P[Y = y_k] = P[X = x_k] = P_k. If, however, y_k = g(x) for x = x_k and x = x_m, then P[Y = y_k] = P_k + P_m.

Example 2.11
Let Y = X².

a)   If f_X(x) = (1/6) Σ_{i=1}^{6} δ(x − i), find f_Y(y).
b)   If f_X(x) = (1/6) Σ_{i=−2}^{3} δ(x − i), find f_Y(y).

a)   If X takes the values (1, 2, ..., 6) with probability 1/6, then Y takes the values 1², 2², ..., 6², each with probability 1/6. That is,

     f_Y(y) = (1/6) Σ_{i=1}^{6} δ(y − i²).

b)   If, however, X takes the values −2, −1, 0, 1, 2, 3, each with probability 1/6, then Y takes the values 0, 1, 4, 9 with probabilities 1/6, 1/3, 1/3, 1/6, respectively. That is,

     f_Y(y) = (1/6)[δ(y) + δ(y − 9)] + (1/3)[δ(y − 1) + δ(y − 4)].

2.4.2 Functions of two random variables

Given two random variables X and Y (assumed to be of the continuous type), two new random variables Z and W are defined using the transformation

     Z = g(X, Y)  and  W = h(X, Y).

Given the above transformation and f_{X,Y}(x, y), we would like to obtain f_{Z,W}(z, w). For this, we require the Jacobian of the transformation, denoted J((z, w)/(x, y)), where

     J((z, w)/(x, y)) = | ∂z/∂x  ∂z/∂y |
                        | ∂w/∂x  ∂w/∂y |

                      = (∂z/∂x)(∂w/∂y) − (∂z/∂y)(∂w/∂x).

That is, the Jacobian is the determinant of the appropriate partial derivatives. We shall now state the theorem which relates f_{Z,W}(z, w) and f_{X,Y}(x, y). We shall assume that the transformation is one-to-one. That is, given

     g(x, y) = z1        (2.33a)
     h(x, y) = w1        (2.33b)

then there is a unique set of numbers (x1, y1) satisfying Eq. 2.33.

Theorem 2.2: To obtain f_{Z,W}(z, w), solve the system of equations g(x, y) = z1, h(x, y) = w1, for x and y. Let (x1, y1) be the result of the solution. Then,

     f_{Z,W}(z1, w1) = f_{X,Y}(x1, y1) / |J((z, w)/(x1, y1))|.        (2.34)

Proof of this theorem is given in appendix A2.1. For a more general version of this theorem, refer [1].
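As a numerical illustration of Theorem 2.2 (a hedged sketch, assuming NumPy), consider the linear pair Z = X + Y, W = X − Y with X and Y independent and uniform on (−1, 1), the setup taken up in Example 2.12 below. Here |J| = 2, so the joint density of (Z, W) should equal (1/4)/2 = 1/8 wherever it is non-zero:

```python
import numpy as np

# Jacobian of the linear map z = x + y, w = x - y:
jac = np.linalg.det(np.array([[1.0, 1.0], [1.0, -1.0]]))   # expect -2

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, 2_000_000)
y = rng.uniform(-1, 1, 2_000_000)
z, w = x + y, x - y

# Empirical density over a small cell around (z, w) = (0.5, 0.5),
# a point interior to the region where the joint density is non-zero:
cell = (np.abs(z - 0.5) < 0.05) & (np.abs(w - 0.5) < 0.05)
f_zw = np.mean(cell) / (0.1 * 0.1)     # expect ~ 1/8 = 0.125
print(jac, f_zw)
```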

Example 2.12
X and Y are two independent RVs with the PDFs

     f_X(x) = 1/2,  |x| ≤ 1
     f_Y(y) = 1/2,  |y| ≤ 1.

If Z = X + Y and W = X − Y, let us find (a) f_{Z,W}(z, w) and (b) f_Z(z).

a)   From the given transformations, we obtain

     x = (z + w)/2  and  y = (z − w)/2.

We see that the mapping is one-to-one. Fig. 2.14(a) depicts the (product) space A on which f_{X,Y}(x, y) is non-zero.

Fig. 2.14: (a) The space where f_{X,Y} is non-zero (b) The space where f_{Z,W} is non-zero

We can obtain the space B (on which f_{Z,W}(z, w) is non-zero) as follows:

     A space             B space
     The line x = 1   :  (z + w)/2 = 1   ⇒  w = −z + 2
     The line x = −1  :  (z + w)/2 = −1  ⇒  w = −(z + 2)
     The line y = 1   :  (z − w)/2 = 1   ⇒  w = z − 2
     The line y = −1  :  (z − w)/2 = −1  ⇒  w = z + 2

The space B is shown in Fig. 2.14(b). The Jacobian of the transformation is

     J = | 1   1 |
         | 1  −1 |  = −2,

and |J| = 2. Hence

     f_{Z,W}(z, w) = { (1/4)/2 = 1/8,  (z, w) ∈ B
                     { 0,              otherwise.

b)   f_Z(z) = ∫_{−∞}^{∞} f_{Z,W}(z, w) dw.

From Fig. 2.14(b), we can see that, for a given z (z ≥ 0), w can take values only in the range (z − 2) to (2 − z). Hence, for 0 ≤ z ≤ 2,

     f_Z(z) = ∫_{z−2}^{2−z} (1/8) dw = (1/4)(2 − z).

For z negative, that is −2 ≤ z ≤ 0, w ranges over (−(2 + z)) to (2 + z), and

     f_Z(z) = ∫_{−(2+z)}^{2+z} (1/8) dw = (1/4)(2 + z).

Hence

     f_Z(z) = (1/4)(2 − |z|),   |z| ≤ 2.

Example 2.13
Let the random variables R and Φ be given by

     R = √(X² + Y²)  and  Φ = arc tan(Y/X),

where we assume R ≥ 0 and −π < Φ < π. It is given that

     f_{X,Y}(x, y) = (1/2π) exp[−(x² + y²)/2],   −∞ < x, y < ∞.

Let us find f_{R,Φ}(r, φ).

As the given transformation is from cartesian-to-polar coordinates, we can write x = r cos φ and y = r sin φ, and the transformation is one-to-one. Then

     ∂r/∂x = cos φ         ∂r/∂y = sin φ
     ∂φ/∂x = −(sin φ)/r    ∂φ/∂y = (cos φ)/r

so that J = 1/r. Hence

     f_{R,Φ}(r, φ) = { (r/2π) exp(−r²/2),  0 ≤ r < ∞,  −π < φ < π
                     { 0,                  otherwise.

It is left as an exercise to find f_R(r), f_Φ(φ), and to show that R and Φ are independent random variables.

Theorem 2.2 can also be used when only one function Z = g(X, Y) is specified and what is required is f_Z(z). To apply the theorem, a conveniently chosen auxiliary or dummy variable W is introduced; typically W = X or W = Y. Using the theorem, f_{Z,W}(z, w) is found, from which f_Z(z) can be obtained.
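Example 2.13 lends itself to a Monte Carlo check (illustrative only; assumes NumPy, seed arbitrary): for independent standard Gaussians, R should follow the Rayleigh law P(R ≤ r0) = 1 − e^{−r0²/2}, and Φ should be uniform on (−π, π):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(1_000_000)
y = rng.standard_normal(1_000_000)
r = np.hypot(x, y)              # R = sqrt(X^2 + Y^2)
phi = np.arctan2(y, x)          # angle in (-pi, pi]

p_r = np.mean(r <= 1.0)         # Rayleigh CDF at 1: expect 1 - e^{-1/2} ~ 0.3935
p_phi = np.mean(phi <= np.pi / 2)   # uniform angle: expect 3/4
rho = np.corrcoef(r, phi)[0, 1]     # independence suggests correlation ~ 0
print(p_r, p_phi, rho)
```

A near-zero sample correlation is of course only consistent with, not proof of, the independence asserted in the example.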

Example 2.14
Let Z = X + Y and we require f_Z(z). Let us introduce the dummy variable W = Y. Then X = Z − W and Y = W. As J = 1,

     f_{Z,W}(z, w) = f_{X,Y}(z − w, w)

and

     f_Z(z) = ∫_{−∞}^{∞} f_{X,Y}(z − w, w) dw.        (2.35)

If X and Y are independent, then Eq. 2.35 becomes

     f_Z(z) = ∫_{−∞}^{∞} f_X(z − w) f_Y(w) dw;        (2.36a)

that is,

     f_Z = f_X ∗ f_Y.        (2.36b)

Let X and Y be two independent random variables, with

     f_X(x) = { 1/2,  −1 ≤ x ≤ 1
              { 0,    otherwise

     f_Y(y) = { 1/3,  −2 ≤ y ≤ 1
              { 0,    otherwise.

If Z = X + Y, let us find P[Z ≤ −2]. From Eq. 2.36(b), f_Z(z) is the convolution of f_X(z) and f_Y(z). Carrying out the convolution, we obtain f_Z(z) as shown in Fig. 2.15.
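The convolution of Eq. 2.36(b) can also be carried out numerically; the following sketch (assuming NumPy; the grid spacing is an arbitrary choice) approximates f_Z and the tail probability for the two uniform densities above:

```python
import numpy as np

# Numerical convolution f_Z = f_X * f_Y for Example 2.14:
# X ~ uniform(-1, 1), Y ~ uniform(-2, 1), Z = X + Y.
dz = 0.001
t = np.arange(-4, 4, dz)
f_x = np.where(np.abs(t) <= 1, 0.5, 0.0)
f_y = np.where((t >= -2) & (t <= 1), 1.0 / 3.0, 0.0)

f_z = np.convolve(f_x, f_y, mode="same") * dz   # samples of f_Z on (roughly) the same grid

area = f_z.sum() * dz                 # should integrate to 1
p_tail = f_z[t <= -2].sum() * dz      # P[Z <= -2]
print(area, p_tail)
```

The tail value agrees with the analytical result quoted next in the text.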

Fig. 2.15: PDF of Z = X + Y of example 2.14

P[Z ≤ −2] is the shaded area, which equals 1/12.

Example 2.15
Let Z = X/Y. Let us find an expression for f_Z(z).

Introducing the dummy variable W = Y, we have

     X = ZW  and  Y = W.

As J = 1/w, we have

     f_Z(z) = ∫_{−∞}^{∞} |w| f_{X,Y}(zw, w) dw.

Let

     f_{X,Y}(x, y) = { (1 + xy)/4,  |x| ≤ 1,  |y| ≤ 1
                     { 0,           elsewhere.

Then

     f_Z(z) = ∫ |w| (1 + zw²)/4 dw,

the range of integration being determined below.

w ) .Y is non-zero (b) The space where fZ . Let fZ .W ( z .W is non-zero To obtain fZ ( z ) from fZ . Under x ≤ 1 and y ≤ 1 (Fig. we have to integrate out w over the appropriate ranges. V. Venkata Rao f X . B will be as shown in Fig. y ) ∈ A . we have fZ ( z ) = 2 iii) 1 z 1 + zw 2 1⎛ 1 1 ⎞ ∫ 4 w dw = 4 ⎜ z2 + 2 z3 ⎟ ⎝ ⎠ 0 For z < − 1. w ) ∈ the given transformation.16a). we have fZ ( z ) = 2 − 1 z ∫ 0 1 + zw 2 1⎛ 1 1 ⎞ w dw = ⎜ 2 + ⎟ 4 4 ⎝z 2 z3 ⎠ 2. where A is the product space B . y ) is non-zero if (x.W ( z .16: (a) The space where f X .16(b).Y ( x . then 1 + zw 2 1 + zw 2 fZ ( z ) = ∫ w dw = 2 ∫ w dw 4 4 0 −1 1 1 = z⎞ 1⎛ ⎜1 + 2 ⎟ 4⎝ ⎠ ii) For z > 1 .Principles of Communication Prof. w ) be non-zero if ( z . i) Let z < 1. Fig. 2.2. 2.44 Indian Institute of Technology Madras .

Hence

     f_Z(z) = { (1/4)(1 + z/2),         |z| ≤ 1
              { (1/4)(1/z² + 1/(2z³)),  |z| > 1.

Exercise 2.7
Let Z = X Y and W = Y.

a)   Show that

     f_Z(z) = ∫_{−∞}^{∞} (1/|w|) f_{X,Y}(z/w, w) dw.

b)   Let X and Y be independent, with

     f_X(x) = (1/π) · 1/√(1 − x²),   |x| ≤ 1

and

     f_Y(y) = { y e^{−y²/2},  y ≥ 0
              { 0,            otherwise.

Show that

     f_Z(z) = (1/√(2π)) e^{−z²/2},   −∞ < z < ∞.

In transformations involving two random variables, we may encounter a situation where one of the variables is continuous and the other is discrete. Such cases are handled better by making use of the distribution function approach, as illustrated below.

Example 2.16
The input to a noisy channel is a binary random variable with P[X = 0] = P[X = 1] = 1/2. The output of the channel is given by Z = X + Y,

given f X . V. 1 2π e − y2 2 .Principles of Communication Prof.17 Let Z = X + Y .1 or 2.46 Indian Institute of Technology Madras . Example 2. is a very basic method and can be used in all situations.Y ( x . Of course computing the CDF may prove to be quite difficult in certain situations. 2. P ( Z ≤ z ) = P [ Z ≤ z | X = 0] P [ X = 0 ] + P [ Z ≤ z | X = 1] P [ X = 1] As Z = X + Y .2 is called change of variable method. − ∞ < y < ∞ . As fZ ( z ) = 2 2 dz ⎛ z − 1 2 ⎞⎤ ( ) ⎟⎥ exp ⎜ − ⎜ ⎟⎥ 2 2π ⎝ ⎠⎦ 1 ⎡ ⎛ z2 ⎞ 1⎢ 1 ⎟+ exp ⎜ − fZ ( z ) = ⎜ 2 ⎟ 2 ⎢ 2π ⎝ ⎠ ⎣ The distribution function method.16. Venkata Rao where Y is the channel noise with fY ( y ) = fZ ( z ) . if Y = g ( X ) . Similarly.) We shall illustrate the distribution function method with another example. for transformations involving more than one random variable. we have FY ( z ) + FY ( z − 1) . we compute FY ( y ) and then obtain fY ( y ) = d FY ( y ) dy . The method of obtaining PDFs based on theorem 2. we have P [ Z ≤ z | X = 0 ] = FY ( z ) Similarly P [ Z ≤ z | X = 1] = P ⎡Y ≤ ( z − 1) ⎤ = FY ( z − 1) ⎣ ⎦ Hence FZ ( z ) = 1 1 d FZ ( z ) . as illustrated by means of example 2. y ) . (In this method. Obtain fZ ( z ) . Find Let us first compute the distribution function of Z from which the density function can be derived.

Let us first compute the distribution function of Z:

     P[Z ≤ z] = P[X + Y ≤ z] = P[Y ≤ z − X].

This probability is the probability of (X, Y) lying in the shaded area shown in Fig. 2.17.

Fig. 2.17: Shaded area is the P[Z ≤ z]

That is,

     F_Z(z) = ∫_{−∞}^{∞} dx [ ∫_{−∞}^{z−x} f_{X,Y}(x, y) dy ].

Hence

     f_Z(z) = ∂/∂z ∫_{−∞}^{∞} dx [ ∫_{−∞}^{z−x} f_{X,Y}(x, y) dy ]

            = ∫_{−∞}^{∞} dx [ ∂/∂z ∫_{−∞}^{z−x} f_{X,Y}(x, y) dy ]

            = ∫_{−∞}^{∞} f_{X,Y}(x, z − x) dx.        (2.37a)

It is not too difficult to see the alternative form for f_Z(z), namely,

     f_Z(z) = ∫_{−∞}^{∞} f_{X,Y}(z − y, y) dy.        (2.37b)

If X and Y are independent, we have f_Z(z) = f_X(z) ∗ f_Y(z). We note that Eq. 2.37(b) is the same as Eq. 2.35.

So far we have considered transformations involving one or two variables. This can be generalized to the case of functions of n variables. Details can be found in [1, 2].

2.5 Statistical Averages

The PDF of a random variable provides a complete statistical characterization of the variable. However, we might be in a situation where the PDF is not available but we are able to estimate (with reasonable accuracy) certain (statistical) averages of the random variable. Some of these averages do provide a simple and fairly adequate (though incomplete) description of the random variable. We now define a few of these averages and explore their significance.

The mean value (also called the expected value, mathematical expectation or simply expectation) of the random variable X is defined as

     m_X = E[X] = X̄ = ∫_{−∞}^{∞} x f_X(x) dx        (2.38)

where E denotes the expectation operator. Note that m_X is a constant. Similarly, the expected value of a function of X, g(X), is defined by

     E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx.        (2.39)

Remarks: The terminology expected value or expectation has its origin in games of chance. This can be illustrated as follows. Three small similar discs, numbered 1, 2 and 2 respectively, are placed in a bowl and are mixed. A player is to be

blindfolded and is to draw a disc from the bowl. If he draws the disc numbered 1, he will receive nine dollars; if he draws either disc numbered 2, he will receive 3 dollars. It seems reasonable to assume that the player has a '1/3 claim' on the 9 dollars and a '2/3 claim' on the three dollars. His total claim is 9(1/3) + 3(2/3), that is, five dollars. If we take X to be a (discrete) random variable with the PDF

     f_X(x) = (1/3) δ(x − 1) + (2/3) δ(x − 2)

and g(X) = 15 − 6X, then

     E[g(X)] = ∫_{−∞}^{∞} (15 − 6x) f_X(x) dx = 5.

Note that g(x) is such that g(1) = 9 and g(2) = 3. That is, the mathematical expectation of g(X) is precisely the player's claim or expectation [3].

For the special case of g(X) = X^n, we obtain the n-th moment of the probability distribution of the RV X; that is,

     E[X^n] = ∫_{−∞}^{∞} x^n f_X(x) dx.        (2.40)

If g(X) = (X − m_X)^n, then E[g(X)] gives the n-th central moment:

     E[(X − m_X)^n] = ∫_{−∞}^{∞} (x − m_X)^n f_X(x) dx.        (2.41)

The most widely used moments are the first moment (n = 1, which results in the mean value of Eq. 2.38) and the second moment (n = 2, resulting in the mean square value of X):

     E[X²] = ∫_{−∞}^{∞} x² f_X(x) dx.        (2.42)

We can extend the definition of expectation to the case of functions of k (k ≥ 2) random variables.
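Returning briefly to the disc-drawing illustration, the claim E[g(X)] = 5 can be checked directly (a small sketch, not part of the original text; plain Python suffices):

```python
# X takes the value 1 with probability 1/3 and the value 2 with probability 2/3;
# the payoff is g(X) = 15 - 6X, so g(1) = 9 dollars and g(2) = 3 dollars.
probs = {1: 1.0 / 3.0, 2: 2.0 / 3.0}
g = lambda x: 15 - 6 * x

payoffs = {x: g(x) for x in probs}
expected = sum(p * g(x) for x, p in probs.items())   # the player's total claim
print(payoffs, expected)
```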

Consider a function of two random variables, g(X, Y). Then,

     E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f_{X,Y}(x, y) dx dy.        (2.43)

An important property of the expectation operator is linearity; that is, if Z = g(X, Y) = αX + βY, where α and β are constants, then Z̄ = αX̄ + βȲ. This result can be established as follows. From Eq. 2.43, we have

     E[Z] = ∫∫ (αx + βy) f_{X,Y}(x, y) dx dy
          = ∫∫ (αx) f_{X,Y}(x, y) dx dy + ∫∫ (βy) f_{X,Y}(x, y) dx dy.

Integrating out the variable y in the first term and the variable x in the second term, we have

     E[Z] = ∫_{−∞}^{∞} αx f_X(x) dx + ∫_{−∞}^{∞} βy f_Y(y) dy = αX̄ + βȲ.

2.5.1 Variance

Coming back to the central moments, we have the first central moment being always zero, because

     E[X − m_X] = ∫_{−∞}^{∞} (x − m_X) f_X(x) dx = m_X − m_X = 0.

Consider the second central moment,

     E[(X − m_X)²] = E[X² − 2 m_X X + m_X²].

From the linearity property of expectation,

     E[X² − 2 m_X X + m_X²] = E[X²] − 2 m_X E[X] + m_X²
                            = E[X²] − 2 m_X² + m_X²
                            = E[X²] − m_X²;

that is, the second central moment is the mean square value minus the square of the mean.
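The identity E[(X − m_X)²] = E[X²] − m_X² derived above can be confirmed on sampled data (an illustrative sketch assuming NumPy; the uniform density on (0, 2) is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 2, 1_000_000)   # uniform on (0, 2): mean 1, variance 1/3

mean = x.mean()
second_moment = (x ** 2).mean()
var_direct = ((x - mean) ** 2).mean()        # second central moment, computed directly
var_identity = second_moment - mean ** 2     # E[X^2] - m_X^2

print(mean, var_direct, var_identity)
```

The two variance computations agree to floating-point precision, as the algebra above demands.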

The second central moment of a random variable is called the variance, and its (positive) square root is called the standard deviation. The symbol σ² is generally used to denote the variance. (If necessary, we use a subscript on σ².)

The variance provides a measure of the variable's spread or randomness. Specifying the variance essentially constrains the effective width of the density function. This can be made more precise with the help of the Chebyshev inequality, which follows as a special case of the following theorem.

Theorem 2.3: Let g(X) be a non-negative function of the random variable X. If E[g(X)] exists, then for every positive constant c,

     P[g(X) ≥ c] ≤ E[g(X)]/c.        (2.44)

Proof: Let A = {x : g(x) ≥ c} and let B denote the complement of A. Then

     E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx = ∫_A g(x) f_X(x) dx + ∫_B g(x) f_X(x) dx.

Since each integral on the RHS above is non-negative, we can write

     E[g(X)] ≥ ∫_A g(x) f_X(x) dx.

If x ∈ A, then g(x) ≥ c; hence

     E[g(X)] ≥ c ∫_A f_X(x) dx.

But ∫_A f_X(x) dx = P[x ∈ A] = P[g(X) ≥ c]. That is, E[g(X)] ≥ c P[g(X) ≥ c], which is the desired result.

Note: The kind of manipulation used in proving theorem 2.3 is useful in establishing similar inequalities involving random variables.

To see how innocuous (or weak, perhaps) the inequality 2.44 is, let g(X) represent the height of a randomly chosen human being, with E[g(X)] = 1.6 m. Then Eq. 2.44 states that the probability of choosing a person over 16 m tall is at most 1/10! (In a population of 1 billion, at most 100 million would be as tall as a full-grown Palmyra tree!)

The Chebyshev inequality can be stated in two equivalent forms:

i)   P[|X − m_X| ≥ k σ_X] ≤ 1/k²,  k > 0        (2.45a)
ii)  P[|X − m_X| < k σ_X] > 1 − 1/k²        (2.45b)

where σ_X is the standard deviation of X.

To establish 2.45(a), let g(X) = (X − m_X)² and c = k² σ_X² in theorem 2.3. We then have

     P[(X − m_X)² ≥ k² σ_X²] ≤ 1/k²;

in other words,

     P[|X − m_X| ≥ k σ_X] ≤ 1/k²,

which is the desired result. Naturally, we would take the positive number k to be greater than one to have a meaningful result. The Chebyshev inequality can thus be interpreted as: the probability of observing any RV outside ± k standard deviations off its mean value is no larger than 1/k². With k = 2 for example, the probability of |X − m_X| ≥ 2 σ_X does not exceed 1/4 or 25%. That is, we expect X to occur within the range (m_X ± 2 σ_X) for more than 75% of the observations. Smaller the standard deviation, smaller is the width of the interval around m_X where the required probability is concentrated. The Chebyshev inequality thus enables us to give a quantitative interpretation to the statement 'variance is indicative of the spread of the random variable'.

Note that it is not necessary that the variance exist for every PDF. For example, if

     f_X(x) = (α/π) · 1/(α² + x²),   −∞ < x < ∞ and α > 0

(this is called Cauchy's PDF), then X̄ = 0 but E[X²] is not finite.

2.5.2 Covariance

An important joint expectation is the quantity called covariance, which is obtained by letting g(X, Y) = (X − m_X)(Y − m_Y) in Eq. 2.43. We use the symbol λ to denote the covariance. That is,

     λ_XY = cov[X, Y] = E[(X − m_X)(Y − m_Y)].        (2.46a)

Using the linearity property of expectation, we have

     λ_XY = E[XY] − m_X m_Y.        (2.46b)

The cov[X, Y], normalized with respect to the product σ_X σ_Y, is termed the correlation coefficient, and we denote it by ρ. That is,

     ρ_XY = (E[XY] − m_X m_Y) / (σ_X σ_Y).        (2.47)

The correlation coefficient is a measure of dependency between the variables. Suppose X and Y are independent. Then,

     E[XY] = ∫∫ x y f_{X,Y}(x, y) dx dy = ∫∫ x y f_X(x) f_Y(y) dx dy

           = [ ∫_{−∞}^{∞} x f_X(x) dx ] [ ∫_{−∞}^{∞} y f_Y(y) dy ] = m_X m_Y.

That is. Venkata Rao α α ⎧1 < x < ⎪ . 2 . σ2 i = σ2 . X Y = 0 which means λ X Y = X Y − X Y = 0 . ⎪0 .48b) Note: Let X1 and X 2 be uncorrelated. then Y = x1 ! Let Y be the linear combination of the two random variables X1 and X 2 . That is. Let ρ12 be the correlation coefficient between X1 and X 2 . we can show that 2 2 σY = k1 σ1 + k 2 σ2 + 2 ρ12 k1 k 2 σ1 σ 2 2 (2. if 2. Then. then 2 2 σY = k1 σ1 + k2 σ2 2 (2.55 Indian Institute of Technology Madras .Principles of Communication Prof. if X = x1 . the sum as well as the difference random Z 2 2 variables have the same variance which is larger than σ1 or σ 2 ! 2 The above result can be generalized to the case of linear combination of n variables. X i 2 We will now relate σY to the known quantities. V. − fX ( x ) = ⎨ α 2 2. 2 σY = E ⎡Y 2 ⎤ − ⎡E [Y ]⎤ ⎦ ⎣ ⎦ ⎣ 2 E [Y ] = k1 m1 + k 2 m2 2 E ⎡Y 2 ⎤ = E ⎡ k1 X12 + k 2 X 2 + 2 k1 k 2 X1 X 2 ⎤ ⎣ ⎦ ⎣ ⎦ With a little manipulation. i = 1. But X and Y are 2 not independent because. That is Y = k1 X1 + k 2 X 2 where k1 and k2 are constants. XY = X3 1 1 = ∫ x3 d x = α α −∞ ∞ α −α ∫ 2 x3 d x = 0 2 As X = 0 . 2 2 Then σ2 = σW = σ1 + σ2 . otherwise ⎩ and Y = X 2 .48a) If X1 and X 2 are uncorrelated. Let E [ X i ] = mi . and let Z = X1 + X 2 and W = X1 − X 2 .

E[X] = 1 1 2 ∫8x dx + ∫8x dx 0 2 2 4 2. .56 Indian Institute of Technology Madras .49) where the meaning of various symbols on the RHS of Eq. .49 is quite obvious. 2 ≤ x ≤ 4 4 ≤ x We shall find a) X and b) σ2 X a) The given CDF implies the PDF ⎧0 ⎪1 ⎪ ⎪8 fX ( x ) = ⎨ ⎪1 x ⎪8 ⎪0 ⎩ . x < 0 . x < 0 . We shall now give a few examples based on the theory covered so far in this section.Principles of Communication Prof. V. 0 ≤ x ≤ 2 . Example 2. 2 ≤ x ≤ 4 4 ≤ x Therefore. Venkata Rao Y = 2 σY = i = 1 ∑k n n i X i .18 Let a random variable X have the CDF ⎧0 ⎪x ⎪ ⎪8 FX ( x ) = ⎨ 2 ⎪x ⎪16 ⎪ ⎩1 . 2. then 2 i i = 1 ∑k σ2 + 2 ∑∑ k i k j ρi j σi σ j i i j i < j (2. 0 ≤ x ≤ 2 .

otherwise ⎩ 2 Let us find E [Y ] and σY . E [Y ] = 1 2 − 1 2 ∫ cos ( π x ) d x = 1 2 2 = 0.0636 π 1 = 0. V. From Eq.20 Let X and Y have the joint PDF ⎧ x + y . where 1 1 ⎧ ⎪1.38 we have. Venkata Rao = 1 7 31 + = 4 3 12 2 b) σ 2 = E ⎡ X 2 ⎤ − ⎡ E [ X ]⎤ X ⎦ ⎣ ⎦ ⎣ E ⎡X2⎤ = ⎣ ⎦ = σ 2 X 1 2 1 3 ∫8x dx + ∫8x dx 0 2 1 15 47 + = 3 2 6 2 2 4 47 ⎛ 31 ⎞ 167 = −⎜ ⎟ = 6 144 ⎝ 12 ⎠ Example 2. 0 < y < 1 f X . 2. − < x < fX ( x ) = ⎨ 2 2 ⎪0.Y ( x .5 2 E ⎡Y ⎤ = ⎣ ⎦ 2 2 Hence σY = − 1 2 ∫ cos2 ( π x ) d x = 1 4 − 2 = 0. elsewhere ⎩0 2.19 Let Y = cos π X .Principles of Communication Prof.57 Indian Institute of Technology Madras . y ) = ⎨ .96 2 π Example 2. 0 < x < 1.

51) Similarly. given Y = y . we require E [ X Y ].43. E [Y ] .50) is called the conditional expectation of g ( X ) .58 Indian Institute of Technology Madras . E ⎡g ( X ) | Y = y ⎤ = ⎣ ⎦ ∞ −∞ ∫ g ( x ) f ( x | y ) dx X |Y (2. then we have the conditional mean. The quantity. E [ X ]. V. 2.21 Let the joint PDF of the random variables X and Y be 2. Venkata Rao Let us find a) E ⎡ X Y 2 ⎤ and ⎣ ⎦ b) ρ X Y a) From Eq. We shall illustrate the calculation of conditional mean with the help of an example. E [ X | Y ] . If g ( X ) = X . we can define the conditional variance etc.Y ( x . σ2 = σY = X 144 12 144 1 11 Hence ρ X Y = − Another statistical average that will be found useful in the study of communication theory is the conditional expectation. σ X and σY . namely. we have E ⎡XY 2⎤ = ⎣ ⎦ ∞ − ∞− ∞ ∫ ∫ ∞ x y 2 f X .Principles of Communication Prof. We can easily show that E [ X ] = E [Y ] = 7 11 48 2 and E [ X Y ] = . y ) dx dy = ∫ ∫ x y ( x + y ) dx dy 2 0 0 1 1 = 17 72 b) To find ρ X Y . E[X | Y = y] = E[X |Y ] = ∞ −∞ ∫ x f X |Y ( x | y ) dx (2. Example 2.

f_X,Y(x, y) = 1/x for 0 < y < x < 1, and 0 outside.

Let us compute E[X | Y]. To find E[X | Y], we require the conditional PDF f_X|Y(x | y) = f_X,Y(x, y) / f_Y(y):

f_Y(y) = ∫_y^1 (1/x) dx = − ln y,   0 < y < 1

f_X|Y(x | y) = (1/x) / (− ln y) = − 1/(x ln y),   y < x < 1

Hence

E[X | Y] = ∫_y^1 x [− 1/(x ln y)] dx = − (1 − y)/ln y = (y − 1)/ln y

Note that E[X | Y = y] is a function of y.
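Since the integrand above simplifies, the closed form E[X | Y = y] = (y − 1)/ln y is easy to check numerically. The following Python sketch (ours, not part of the original notes; the function name is arbitrary) integrates x f_X|Y(x | y) over (y, 1) with the midpoint rule and compares the result with the closed form.

```python
import math

def conditional_mean(y, steps=100_000):
    """Midpoint-rule approximation of E[X | Y = y] = ∫_y^1 x * (-1/(x ln y)) dx."""
    h = (1.0 - y) / steps
    total = 0.0
    for i in range(steps):
        x = y + (i + 0.5) * h
        f_cond = -1.0 / (x * math.log(y))   # f_{X|Y}(x | y) on y < x < 1
        total += x * f_cond * h
    return total

y = 0.25
print(conditional_mean(y), (y - 1.0) / math.log(y))   # both ≈ 0.541
```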
2.6 Some Useful Probability Models

In the concluding sections of this chapter, we shall discuss certain probability distributions which are encountered quite often in the study of communication theory. We will begin our discussion with discrete random variables.

2.6.1 Discrete random variables

i) Binomial

Consider a random experiment in which our interest is in the occurrence or non-occurrence of an event A. That is, we are interested in the event A or its complement, say B. Let the experiment be repeated n times and let p be the probability of occurrence of A on each trial; the trials are independent. Let X denote the random variable 'number of occurrences of A in n trials'. X can take the values 0, 1, 2, ..., n. If we can compute P[X = k], k = 0, 1, ..., n, then we can write f_X(x).

Taking a special case, let n = 5 and k = 3. The sample space (representing the outcomes of these five repetitions) has 32 sample points, say s_1, ..., s_32. (Each sample point is actually an element of the five-dimensional Cartesian product space.) The sample point s_1 could represent the sequence ABBBB. Sample points such as ABAAB, AAABB, etc. will map into the real number 3, as shown in Fig. 2.18.

Fig. 2.18: Binomial RV for the special case n = 5

As the trials are independent,

P(ABAAB) = P(A) P(B) P(A) P(A) P(B) = p (1 − p) p p (1 − p) = p³ (1 − p)²

There are C(5, 3) = 10 sample points for which X(s) = 3, where C(n, k) = n!/[k! (n − k)!] is the binomial coefficient. In other words, for n = 5 and k = 3,

P[X = 3] = C(5, 3) p³ (1 − p)²

Generalizing this to arbitrary n and k, we have the binomial density, given by
f_X(x) = Σ_{i=0}^{n} P_i δ(x − i)        (2.52)

where P_i = C(n, i) p^i (1 − p)^(n−i).

As can be seen, f_X(x) ≥ 0 and

∫_{−∞}^{∞} f_X(x) dx = Σ_{i=0}^{n} P_i = Σ_{i=0}^{n} C(n, i) p^i (1 − p)^(n−i) = [(1 − p) + p]^n = 1

It is left as an exercise to show that E[X] = n p and σ_X² = n p (1 − p). (Though the formulae for the mean and the variance of a binomial PDF are simple, the algebra needed to derive them is laborious.) We write "X is b(n, p)" to indicate that X has a binomial PDF with the parameters n and p defined above.

The following example illustrates the use of the binomial PDF in a communication problem.

Example 2.22

A digital communication system transmits binary digits over a noisy channel in blocks of 16 digits. Assume that the probability of a binary digit being in error is 0.01 and that errors in various digit positions within a block are statistically independent.

i) Find the expected number of errors per block.
ii) Find the probability that the number of errors per block is greater than or equal to 3.

Let X be the random variable representing the number of errors per block. Then X is b(16, 0.01).

i) E[X] = n p = 16 × 0.01 = 0.16
ii) P[X ≥ 3] = 1 − P[X ≤ 2] = 1 − Σ_{i=0}^{2} C(16, i) (0.01)^i (0.99)^(16−i) ≈ 5 × 10⁻⁴

Exercise 2.8

Show that for Example 2.22, the Chebyshev inequality gives P[X ≥ 3] ≤ 0.0196. Note that the Chebyshev inequality is not very tight.
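The two answers of Example 2.22 can be verified directly with a few lines of Python (a sketch we add here; math.comb is the standard-library binomial coefficient):

```python
from math import comb

n, p = 16, 0.01
mean = n * p                                              # i) expected errors per block
p_tail = 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i)   # ii) P[X >= 3]
                   for i in range(3))
print(mean)     # 0.16
print(p_tail)   # ≈ 0.0005
```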
ii) Poisson

A random variable X which takes on only integer values is Poisson distributed if

f_X(x) = Σ_{m=0}^{∞} δ(x − m) λ^m e^(−λ) / m!        (2.53)

where λ is a positive constant. Evidently f_X(x) ≥ 0, and ∫_{−∞}^{∞} f_X(x) dx = 1 because Σ_{m=0}^{∞} λ^m/m! = e^λ.

We will now show that E[X] = λ = σ_X². Since e^λ = Σ_{m=0}^{∞} λ^m/m!, we have

d(e^λ)/dλ = e^λ = Σ_{m=1}^{∞} m λ^(m−1)/m! = (1/λ) Σ_{m=1}^{∞} m λ^m/m!

so that

E[X] = Σ_{m=1}^{∞} m λ^m e^(−λ)/m! = λ e^λ e^(−λ) = λ

Differentiating the series again, we obtain E[X²] = λ² + λ. Hence σ_X² = E[X²] − (E[X])² = λ.

2.6.2 Continuous random variables

i) Uniform

A random variable X is said to be uniformly distributed in the interval a ≤ x ≤ b if

f_X(x) = 1/(b − a) for a ≤ x ≤ b, and 0 elsewhere        (2.54)

A plot of f_X(x) is shown in Fig. 2.19.

Fig. 2.19: Uniform PDF

It is easy to show that

E[X] = (a + b)/2  and  σ_X² = (b − a)²/12

Note that the variance of the uniform PDF depends only on the width of the interval (b − a). Therefore, whether X is uniform in (− 1, 1) or in (2, 4), it has the same variance, namely 1/3.
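The remark that the variance depends only on the width (b − a) is easy to check by simulation. Below is a short Python sketch (ours, not from the notes) comparing sample variances for the intervals (− 1, 1) and (2, 4):

```python
import random

random.seed(0)
N = 100_000

def sample_variance(a, b):
    """Sample variance of N draws from uniform(a, b)."""
    xs = [random.uniform(a, b) for _ in range(N)]
    m = sum(xs) / N
    return sum((x - m) ** 2 for x in xs) / N

print(sample_variance(-1, 1), sample_variance(2, 4))   # both ≈ (b − a)²/12 = 1/3
```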
V. x ≥ 0 fX ( x ) = ⎨ b ⎝ 2 b⎠ ⎪ . 2.12 is Rayleigh PDF.Principles of Communication Prof.20: Rayleigh PDF Rayleigh PDF frequently arises in radar and communication problems.2.64 Indian Institute of Technology Madras . ( fR ( r ) of example 2. (2.) Fig. elsewhere ⎩0 where b is a positive constant. Venkata Rao ⎧x ⎛ x2 ⎞ exp ⎜ − ⎪ ⎟.2.55) A typical sketch of the Rayleigh PDF is given in Fig. We will encounter it later in the study of narrow-band noise processes.20.

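A Rayleigh variable with parameter b can be generated from a uniform variable by inverse-transform sampling, which also gives a quick empirical check of E[X²] = 2b (the subject of Exercise 2.9 below). A Python sketch with our own variable names:

```python
import math
import random

random.seed(1)
b = 2.0
N = 100_000

# If U is uniform in (0, 1], then X = sqrt(-2*b*ln U) satisfies
# P[X <= x] = 1 - exp(-x²/(2b)), i.e. X is Rayleigh with parameter b.
samples = [math.sqrt(-2.0 * b * math.log(1.0 - random.random()))
           for _ in range(N)]

mean_square = sum(x * x for x in samples) / N
print(mean_square)   # ≈ 2*b = 4.0
```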
Exercise 2.9

a) Let f_X(x) be as given in Eq. 2.55. Show that ∫_0^∞ f_X(x) dx = 1. (Hint: make the change of variable x² = z; then x dx = dz/2.)

b) Show that if X is Rayleigh distributed, then E[X²] = 2b.

iii) Gaussian

By far the most widely used PDF in the context of communication theory is the Gaussian (also called normal) density, specified by

f_X(x) = [1/(√(2π) σ_X)] exp[− (x − m_X)²/(2 σ_X²)],   − ∞ < x < ∞        (2.56)

where m_X is the mean value and σ_X² the variance. In appendix A2.3, we show that f_X(x) as given by Eq. 2.56 is a valid PDF. The Gaussian PDF is completely specified by the two parameters m_X and σ_X². We use the symbol N(m_X, σ_X²) to denote the Gaussian density¹. Note that if X is N(m_X, σ_X²), then Y = (X − m_X)/σ_X is N(0, 1).

¹ In this notation, N(0, 1) denotes the Gaussian PDF with zero mean and unit variance.
Fig. 2.21: Gaussian PDF

As can be seen from Fig. 2.21, the Gaussian PDF is symmetrical with respect to m_X. Hence

F_X(m_X) = ∫_{−∞}^{m_X} f_X(x) dx = 0.5

Consider P[X ≥ a]. We have

P[X ≥ a] = ∫_a^∞ [1/(√(2π) σ_X)] exp[− (x − m_X)²/(2 σ_X²)] dx

This integral cannot be evaluated in closed form. By making the change of variable z = (x − m_X)/σ_X, we have

P[X ≥ a] = ∫_{(a − m_X)/σ_X}^{∞} (1/√(2π)) e^(− z²/2) dz = Q[(a − m_X)/σ_X]

where

Q(y) = ∫_y^∞ (1/√(2π)) exp(− x²/2) dx        (2.57)

Note that the integrand on the RHS of Eq. 2.57 is N(0, 1). A table of Q() values is available in most textbooks on communication theory, as well as in standard mathematical tables; a small list is given in appendix A2.2 at the end of the chapter.
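Although Q(y) has no closed form, it can be computed from the complementary error function through the standard identity Q(y) = (1/2) erfc(y/√2). A Python sketch (the helper names below are ours):

```python
import math

def Q(y):
    """Gaussian tail probability Q(y) = P[Z >= y] for Z ~ N(0, 1),
    via the identity Q(y) = 0.5 * erfc(y / sqrt(2))."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))

def prob_at_least(a, m, sigma):
    """P[X >= a] for X ~ N(m, sigma²), using P[X >= a] = Q((a - m)/sigma)."""
    return Q((a - m) / sigma)

print(Q(0.0))                         # 0.5
print(prob_at_least(2.0, 1.0, 1.0))   # Q(1) ≈ 0.1587
```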
The importance of the Gaussian density in communication theory is due to a theorem called the central limit theorem, which essentially states that:

If the RV X is the weighted sum of N independent random components, where each component makes only a small contribution to the sum, then F_X(x) approaches Gaussian as N becomes large, regardless of the distribution of the individual components.

For a more precise statement and a thorough discussion of this theorem, you may refer to [1-3]. The electrical noise in a communication system is often due to the cumulative effects of a large number of randomly moving charged particles, each particle making an independent contribution of the same amount to the total. Hence the instantaneous value of the noise can be fairly adequately modeled as a Gaussian variable. We shall develop Gaussian random processes in detail in Chapter 3 and, in Chapter 7, we shall make use of this theory in our studies on the noise performance of various modulation schemes.

Example 2.23

A random variable Y is said to have a log-normal PDF if X = ln Y has a Gaussian (normal) PDF. Let Y have the PDF f_Y(y) given by

f_Y(y) = [1/(√(2π) β y)] exp[− (ln y − α)²/(2β²)] for y ≥ 0, and 0 otherwise

where α and β are given constants.

a) Show that Y is log-normal.
b) Find E[Y].
c) If m is such that F_Y(m) = 0.5, find m.

Let X = ln Y, or x = ln y. (Note that the transformation is one-to-one.)
a) dx/dy = 1/y, so the Jacobian of the transformation is J = 1/y. Also, as y → 0, x → − ∞, and as y → ∞, x → ∞. Hence

f_X(x) = [1/(√(2π) β)] exp[− (x − α)²/(2β²)],   − ∞ < x < ∞

That is, X is N(α, β²); hence Y is log-normal.

b) E[Y] = E[e^X] = ∫_{−∞}^{∞} e^x [1/(√(2π) β)] exp[− (x − α)²/(2β²)] dx

Completing the square in the exponent,

E[Y] = e^(α + β²/2) [ ∫_{−∞}^{∞} (1/(√(2π) β)) exp{− [x − (α + β²)]²/(2β²)} dx ]

As the bracketed quantity, being the integral of a Gaussian PDF between the limits (− ∞, ∞), is 1, we have

E[Y] = e^(α + β²/2)

c) P[Y ≤ m] = P[X ≤ ln m]. Hence, if P[Y ≤ m] = 0.5, then P[X ≤ ln m] = 0.5. That is, ln m = α, or m = e^α.

iv) Bivariate Gaussian

As an example of a two-dimensional density, we will consider the bivariate Gaussian PDF, f_X,Y(x, y), − ∞ < x, y < ∞, given by

f_X,Y(x, y) = (1/k₁) exp{ − (1/k₂) [ (x − m_X)²/σ_X² − 2ρ (x − m_X)(y − m_Y)/(σ_X σ_Y) + (y − m_Y)²/σ_Y² ] }        (2.58)

where

k₁ = 2π σ_X σ_Y √(1 − ρ²)
k₂ = 2 (1 − ρ²)

ρ = correlation coefficient between X and Y

The following properties of the bivariate Gaussian density can be verified:

P1) If X and Y are jointly Gaussian, then the marginal density of X or Y is Gaussian; that is, X is N(m_X, σ_X²) and Y is N(m_Y, σ_Y²)¹.

P2) f_X,Y(x, y) = f_X(x) f_Y(y) iff ρ = 0. That is, if the Gaussian variables X and Y are uncorrelated, then they are independent. This is not true, in general, for non-Gaussian variables (we have already seen an example of this in Sec. 2.2).

P3) If Z = αX + βY, where α and β are constants and X and Y are jointly Gaussian, then Z is Gaussian. Therefore f_Z(z) can be written down after computing m_Z and σ_Z² with the help of the formulae given in Section 2.5.

¹ Note that the converse is not necessarily true: let f_X and f_Y be obtained from f_X,Y, and let f_X and f_Y be Gaussian. This does not imply that f_X,Y is jointly Gaussian, unless X and Y are independent. We can construct examples of a joint PDF f_X,Y which is not Gaussian but results in marginals f_X and f_Y that are Gaussian.

Fig. 2.22 gives the plot of a bivariate Gaussian PDF for the case of ρ = 0 and σ_X = σ_Y.
Fig. 2.22: Bivariate Gaussian PDF (σ_X = σ_Y and ρ = 0)

For ρ = 0 and σ_X = σ_Y, f_X,Y resembles a (temple) bell, with, of course, the striker missing! For ρ ≠ 0, we have two cases: (i) ρ positive and (ii) ρ negative. If ρ > 0, imagine the bell being compressed along the X = − Y axis so that it elongates along the X = Y axis. Similarly for ρ < 0.

Example 2.24

Let X and Y be jointly Gaussian with E[X] = − E[Y] = 1, σ_X² = σ_Y² = 1 and ρ_XY = − 1/2. Let us find the probability of (X, Y) lying in the shaded region D shown in Fig. 2.23.
Fig. 2.23: The region D of Example 2.24

Let A be the shaded region shown in Fig. 2.24(a), and let B be the shaded region shown in Fig. 2.24(b).

Fig. 2.24: (a) Region A and (b) Region B used to obtain region D

The required probability is P[(x, y) ∈ A] − P[(x, y) ∈ B]. For the region A we have y ≥ − x/2 + 1, and for the region B we have y ≥ − x/2 + 2. Hence the required probability is
P[Y + X/2 ≥ 1] − P[Y + X/2 ≥ 2]

Let Z = Y + X/2. Then Z is Gaussian with the parameters

E[Z] = E[Y] + E[X]/2 = − 1 + 1/2 = − 1/2

σ_Z² = (1/4) σ_X² + σ_Y² + 2 (1/2) ρ_XY σ_X σ_Y = 1/4 + 1 − 2 × (1/2) × (1/2) = 3/4

That is, Z is N(− 1/2, 3/4). Then W = (Z + 1/2)/√(3/4) is N(0, 1), and

P[Z ≥ 1] = P[W ≥ √3]

P[Z ≥ 2] = P[W ≥ 5/√3]

Hence the required probability = Q(√3) − Q(5/√3) ≈ 0.0416 − 0.0020 ≈ 0.040
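The final number can be cross-checked numerically using the identity Q(y) = (1/2) erfc(y/√2) (a small sketch we add here; it is not part of the original solution):

```python
import math

def Q(y):
    # Gaussian tail probability: Q(y) = 0.5 * erfc(y / sqrt(2))
    return 0.5 * math.erfc(y / math.sqrt(2.0))

prob = Q(math.sqrt(3.0)) - Q(5.0 / math.sqrt(3.0))
print(prob)   # ≈ 0.04, in agreement with the value obtained above
```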
Exercise 2.10

X and Y are independent, identically distributed (iid) random variables, each being N(0, 1). Find the probability of (X, Y) lying in the region A shown in Fig. 2.25.

Fig. 2.25: Region A of Exercise 2.10

Note: It would be easier to calculate this kind of probability if the space were a product space. From Example 2.12, we feel that if we transform (X, Y) into (Z, W) such that Z = X + Y and W = X − Y, then the transformed region B would be a square. Find f_Z,W(z, w) and compute the probability P[(Z, W) ∈ B].
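As an empirical sanity check on the suggested transformation (this is not the analytic solution the exercise asks for), the Python sketch below estimates the variances and the covariance of Z = X + Y and W = X − Y; for X and Y iid N(0, 1) one expects Var(Z) = Var(W) = 2 and Cov(Z, W) = 0, consistent with Z and W being independent Gaussians:

```python
import random

random.seed(2)
N = 100_000
zs, ws = [], []
for _ in range(N):
    x = random.gauss(0.0, 1.0)
    y = random.gauss(0.0, 1.0)
    zs.append(x + y)   # Z = X + Y
    ws.append(x - y)   # W = X - Y

mz, mw = sum(zs) / N, sum(ws) / N
var_z = sum((z - mz) ** 2 for z in zs) / N
var_w = sum((w - mw) ** 2 for w in ws) / N
cov_zw = sum((z - mz) * (w - mw) for z, w in zip(zs, ws)) / N
print(var_z, var_w, cov_zw)   # ≈ 2, 2, 0
```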
Exercise 2.11

Two random variables X and Y are obtained by means of the transformation

X = (− 2 ln U₁)^(1/2) cos(2π U₂)        (2.59a)
Y = (− 2 ln U₁)^(1/2) sin(2π U₂)        (2.59b)

where U₁ and U₂ are independent random variables, uniformly distributed in the range 0 < u₁, u₂ < 1. Show that X and Y are independent and each is N(0, 1).

Hint: Let X₁ = − 2 ln U₁ and Y₁ = √X₁. Show that Y₁ is Rayleigh. Find f_X,Y(x, y) using X = Y₁ cos Θ and Y = Y₁ sin Θ, where Θ = 2π U₂.

Note: The transformation given by Eq. 2.59 is called the Box-Muller transformation; it can be used to generate two Gaussian random number sequences from two independent, uniformly distributed (in the range 0 to 1) sequences.
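The Box-Muller transformation of Exercise 2.11 can be used directly as a Gaussian random number generator. A Python sketch (ours; the 1 − random.random() guard keeps the argument of the logarithm strictly positive):

```python
import math
import random

random.seed(3)

def box_muller(u1, u2):
    """Map two independent uniform(0,1) samples to two independent
    N(0, 1) samples, per Eq. 2.59."""
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

N = 50_000
xs, ys = [], []
for _ in range(N):
    x, y = box_muller(1.0 - random.random(), random.random())
    xs.append(x)
    ys.append(y)

mean_x = sum(xs) / N
var_x = sum((x - mean_x) ** 2 for x in xs) / N
print(mean_x, var_x)   # ≈ 0, 1
```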
Appendix A2.1
Proof of Eq. 2.34

The proof of Eq. 2.34 depends on establishing a relationship between the differential area dz dw in the z−w plane and the differential area dx dy in the x−y plane. We know that

f_Z,W(z, w) dz dw = P[z < Z ≤ z + dz, w < W ≤ w + dw]

If we can find dx dy such that

f_Z,W(z, w) dz dw = f_X,Y(x, y) dx dy

then f_Z,W can be found. (Note that the variables x and y can be replaced by their inverse transformation quantities, namely x = g⁻¹(z, w) and y = h⁻¹(z, w).)

Let the transformation be one-to-one. (This can be generalized to the case of one-to-many.) Consider the mapping shown in Fig. A2.1.

Fig. A2.1: A typical transformation between the (x − y) plane and the (z − w) plane

An infinitesimal rectangle ABCD in the z−w plane is mapped into a parallelogram in the x−y plane. (We may assume that the vertex A transforms to A′, B to B′, etc.) We shall now find the relation between the differential area of the rectangle and the differential area of the parallelogram.