Gururajan
Assistant Professor
Department of Mathematics
Malnad College of Engineering, Hassan 573 201
UNIT 8:
Jointly Distributed Random Variables
Introduction: So far, the study was restricted to a one-dimensional random variable, its distribution function, and other related aspects. However, in the real world, one comes across a number of situations involving two or more correlated random variables. For example, consider the experiment of finding how popular the e-learning programme initiated by VTU Belgaum has become among engineering students. To find this, say, the authorities collect feedback from the students by visiting their institutions. Let the problem be about finding the opinion of students regarding two parameters: (i) the quality of transmission from the studio situated at Bangalore, which we call X, and (ii) the students' interest in this kind of programme, which we shall refer to as Y. For convenience of discussion, the authorities visit seven colleges located in different parts of the state, and the results are recorded in the following table. We assume that these are given in terms of percentages.
Engg. College :   PGA    SSIT    BVB    GSSIT    AIT    SBMJCE    KLE
X             :   x1     x2      x3     x4       x5     x6        x7
Y             :   y1     y2      y3     y4       y5     y6        y7
In problems like this, the authorities are certainly interested in learning the mood of the students and teachers about the e-learning programme initiated, of course, at huge cost. It is known to you that one satellite channel has been completely dedicated to this purpose in India, and many people are involved in this programme to reach the unreached and needy.
One comes across many illustrative examples like this. Therefore, there is a necessity to extend the study beyond a single random variable.
This chapter is devoted to a discussion of jointly distributed random variables, their distribution functions, and other important characteristics. First we shall discuss the discrete case.
(Dr. K. Gururajan, MCE, Hassan
page 1)
Consider a random experiment and let S denote its sample space. Let X and Y be two discrete random variables defined on S, with image sets

X : x1, x2, x3, . . . , xm
Y : y1, y2, y3, . . . , yn

Suppose that there exists a correlation between the random variables X and Y. Then X and Y are said to be jointly distributed random variables. Note that X and Y together assume m n pairs of values, which can be shown by means of a matrix or a table.
            y1          y2          y3       . . .       yn
x1      (x1, y1)    (x1, y2)    (x1, y3)    . . .    (x1, yn)
x2      (x2, y1)    (x2, y2)    (x2, y3)    . . .    (x2, yn)
x3      (x3, y1)    (x3, y2)    (x3, y3)    . . .    (x3, yn)
. .        . .         . .         . .      . . .       . .
xm      (xm, y1)    (xm, y2)    (xm, y3)    . . .    (xm, yn)
Let h(xi, yj) = P[X = xi, Y = yj] denote the joint probability function of X and Y. It satisfies the following properties:
1. h(xi, yj) ≥ 0 for all i, j
2. 0 ≤ h(xi, yj) ≤ 1
3. Σi Σj h(xi, yj) = 1
Note: One caution with discrete random variables is that probabilities of events must be calculated individually. From the preceding section, it is clear that in the current setting there are in total m n events. Thus, it is necessary to compute the probability of each and every event. This can also be shown by means of a table.
            y1            y2            y3         . . .       yn
x1      h(x1, y1)     h(x1, y2)     h(x1, y3)     . . .    h(x1, yn)
x2      h(x2, y1)     h(x2, y2)     h(x2, y3)     . . .    h(x2, yn)
x3      h(x3, y1)     h(x3, y2)     h(x3, y3)     . . .    h(x3, yn)
. .        . .           . .           . .        . . .       . .
xm      h(xm, y1)     h(xm, y2)     h(xm, y3)     . . .    h(xm, yn)
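The defining properties of a joint probability function can be verified mechanically. The sketch below, using a small hypothetical table (not one from these notes), checks that every entry lies in [0, 1] and that all entries sum to 1:

```python
# Verify the defining properties of a discrete joint probability function
# h(xi, yj): every entry lies in [0, 1] and all entries sum to 1.

def is_valid_joint_pmf(h, tol=1e-9):
    """h is a dict mapping (xi, yj) pairs to probabilities."""
    all_in_range = all(0.0 <= p <= 1.0 for p in h.values())
    sums_to_one = abs(sum(h.values()) - 1.0) < tol
    return all_in_range and sums_to_one

# A hypothetical uniform joint table over 4 cells, used only as a demo.
h = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(is_valid_joint_pmf(h))  # → True
```

Any candidate table failing either check cannot be a joint probability function.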
(Continuing the example of tossing a fair coin three times, where Y denotes the number of heads obtained in the three tosses and X = 1 if the first toss results in a tail, X = 0 otherwise.) Here [Y = 0] = {TTT} and [Y = 3] = {HHH}, and the joint probability function works out to

            Y = 0    Y = 1    Y = 2    Y = 3
X = 0         0       1/8      2/8      1/8
X = 1        1/8      2/8      1/8       0

For instance, h(1, 1) = 2/8 because exactly two of the eight equally likely outcomes, THT and TTH, begin with a tail and contain exactly one head.
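Tables like the one above can be generated by direct enumeration of the sample space. A minimal sketch, assuming (as the table indicates) that Y counts the heads in three tosses of a fair coin and X = 1 exactly when the first toss is a tail:

```python
from itertools import product
from collections import Counter

# Enumerate the 8 equally likely outcomes of three fair coin tosses and
# tabulate h(x, y) = P[X = x, Y = y], where X = 1 iff the first toss is a
# tail and Y is the total number of heads.
counts = Counter()
for outcome in product("HT", repeat=3):
    x = 1 if outcome[0] == "T" else 0
    y = outcome.count("H")
    counts[(x, y)] += 1

h = {xy: n / 8 for xy, n in counts.items()}
print(h[(1, 1)])  # → 0.25, i.e. 2/8, matching the table entry
```

Note that impossible pairs such as (X = 0, Y = 0) simply never appear as keys, which corresponds to the zero entries in the table.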
Two cards are selected at random from a box which contains 5 cards numbered 1, 1, 2, 2, and 3. Let X denote the sum of the two numbers drawn, and Y the maximum of the two numbers drawn. Determine the joint probability function of X and Y.
Solution: It is known that 2 cards can be selected from the available 5 in C(5, 2) = 10 ways. With the cards numbered 1, 1, 2, 2, and 3, clearly S = {(1, 1), (1, 2), (1, 2), (1, 2), (1, 2), (1, 3), (1, 3), (2, 2), (2, 3), (2, 3)}. As X denotes the sum of the numbers chosen, X takes the values 2, 3, 4, and 5. Y is defined as the maximum of the two numbers, so Y takes the values 1, 2, and 3. Therefore, (X, Y) assumes in total 12 values. This can be shown as
(2, 1)    (2, 2)    (2, 3)
(3, 1)    (3, 2)    (3, 3)
(4, 1)    (4, 2)    (4, 3)
(5, 1)    (5, 2)    (5, 3)
Let h(x, y) denote the joint probability function. Now [X = 2] = {(1, 1)} and [Y = 1] = {(1, 1)}; therefore h(2, 1) = 1/10 = 0.1. Observe that [Y = 2] = {(1, 2), (1, 2), (1, 2), (1, 2), (2, 2)}, thus [X = 2, Y = 2] = { }, so h(2, 2) = 0.0; a similar argument shows that h(2, 3) = 0.0, since [Y = 3] = {(1, 3), (1, 3), (2, 3), (2, 3)}. Next, [X = 3] = {(1, 2), (1, 2), (1, 2), (1, 2)}; clearly [X = 3, Y = 1] = { }, so h(3, 1) = 0.0, while [X = 3, Y = 2] = {(1, 2), (1, 2), (1, 2), (1, 2)}, thus h(3, 2) = 4/10 = 0.4. However, h(3, 3) = 0.0, because nothing is common between the events [X = 3] and [Y = 3]. By following the above arguments, it may be seen that h(4, 1) = 0.0, h(4, 2) = 0.1, h(4, 3) = 0.2, h(5, 1) = 0.0, h(5, 2) = 0.0 and h(5, 3) = 0.2. Thus, the joint probability function of X and Y may be given as
            Y = 1    Y = 2    Y = 3
X = 2        0.1      0.0      0.0
X = 3        0.0      0.4      0.0
X = 4        0.0      0.1      0.2
X = 5        0.0      0.0      0.2

The joint (cumulative) distribution function is obtained from the joint probability function as

H(xt, yu) = P[X ≤ xt, Y ≤ yu] = Σ over x ≤ xt, y ≤ yu of h(x, y).
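The hand computation above can be checked by enumerating the C(5, 2) = 10 equally likely selections; the sketch below also evaluates the joint cumulative distribution H(xt, yu) by summing h over x ≤ xt, y ≤ yu:

```python
from itertools import combinations
from collections import Counter

cards = [1, 1, 2, 2, 3]

# Each of the C(5, 2) = 10 index pairs is an equally likely selection.
counts = Counter()
for a, b in combinations(cards, 2):
    counts[(a + b, max(a, b))] += 1        # (X, Y) = (sum, maximum)

h = {xy: n / 10 for xy, n in counts.items()}

def H(xt, yu):
    """Joint cumulative distribution: P[X <= xt, Y <= yu]."""
    return sum(p for (x, y), p in h.items() if x <= xt and y <= yu)

print(h[(3, 2)])   # → 0.4, as computed in the text
print(H(3, 2))     # → 0.5, i.e. h(2, 1) + h(3, 2)
```

Because `combinations` works on positions, the two cards labelled 1 (and the two labelled 2) are counted as distinct, exactly as in the listing of S above.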
Returning to the coin-tossing example, the joint probability table is

            Y = 0    Y = 1    Y = 2    Y = 3
X = 0         0       1/8      2/8      1/8
X = 1        1/8      2/8      1/8       0

The marginal distribution of X is obtained by adding the probabilities row-wise. Thus,

xi      :    0      1
f(xi)   :   1/2    1/2

and the marginal distribution of Y is obtained by adding the probabilities column-wise. Therefore,

yj      :    0      1      2      3
g(yj)   :   1/8    3/8    3/8    1/8
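Row-wise and column-wise summation is easy to express in code. A small sketch, using the coin-tossing table above:

```python
from collections import defaultdict

# Joint probability table of the coin-tossing example: keys are (x, y);
# pairs with probability zero are simply omitted.
h = {(0, 1): 1/8, (0, 2): 2/8, (0, 3): 1/8,
     (1, 0): 1/8, (1, 1): 2/8, (1, 2): 1/8}

# Marginal of X: sum each row; marginal of Y: sum each column.
f = defaultdict(float)
g = defaultdict(float)
for (x, y), p in h.items():
    f[x] += p
    g[y] += p

print(f[0], f[1])              # → 0.5 0.5
print(g[0], g[1], g[2], g[3])  # → 0.125 0.375 0.375 0.125
```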
The covariance of X and Y is defined as Cov(X, Y) = E(XY) − E(X) E(Y). The correlation coefficient between X and Y can then be written as

ρ(x, y) = Cov(X, Y) / (σx σy).

Here, σx and σy are the standard deviations of the X and Y series respectively.
ILLUSTRATIVE EXAMPLES:
Consider the joint distribution of X and Y

            Y = −4    Y = 2    Y = 7    f(xi)
X = 1        1/8       2/8      1/8      1/2
X = 5        2/8       1/8      1/8      1/2
g(yj)        3/8       3/8      2/8       1

E(X) = 1 (1/2) + 5 (1/2) = 3
E(X²) = 1 (1/2) + 25 (1/2) = 13
Var(X) = E(X²) − [E(X)]² = 13 − 9 = 4, and σx = 2.
Similarly,
E(Y) = −4 (3/8) + 2 (3/8) + 7 (2/8) = 1.0
E(Y²) = 16 (3/8) + 4 (3/8) + 49 (2/8) = 19.75
Var(Y) = E(Y²) − [E(Y)]² = 19.75 − 1 = 18.75, and σy = √18.75 = 4.3301.
Consider
E(XY) = Σ Σ x y h(x, y)
      = (1)(−4)(1/8) + (1)(2)(2/8) + (1)(7)(1/8) + (5)(−4)(2/8) + (5)(2)(1/8) + (5)(7)(1/8)
      = 1.5.
Therefore Cov(X, Y) = E(XY) − E(X) E(Y) = 1.5 − 3 = −1.5, and

ρ(x, y) = Cov(X, Y) / (σx σy) = −1.5 / ((2.0)(4.3301)) = −0.1732.
As ρ(x, y) ≠ 0, we conclude that X and Y are correlated random variables.
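The moment computations above follow a fixed recipe, sketched below for the same joint table (as reconstructed here):

```python
from math import sqrt

# Joint table of the example: X in {1, 5}, Y in {-4, 2, 7}.
h = {(1, -4): 1/8, (1, 2): 2/8, (1, 7): 1/8,
     (5, -4): 2/8, (5, 2): 1/8, (5, 7): 1/8}

def expect(fn):
    """E[fn(X, Y)] for a discrete joint probability table."""
    return sum(fn(x, y) * p for (x, y), p in h.items())

EX  = expect(lambda x, y: x)          # E(X)  = 3.0
EY  = expect(lambda x, y: y)          # E(Y)  = 1.0
EXY = expect(lambda x, y: x * y)      # E(XY) = 1.5
var_x = expect(lambda x, y: x * x) - EX ** 2   # Var(X) = 4
var_y = expect(lambda x, y: y * y) - EY ** 2   # Var(Y) = 18.75

cov = EXY - EX * EY                        # Cov(X, Y) = -1.5
rho = cov / (sqrt(var_x) * sqrt(var_y))    # correlation coefficient
print(round(rho, 4))  # → -0.1732
```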
Note: If it is known that X and Y are independent random variables, then it is very easy to compute the joint probability function: it is obtained by just multiplying the respective marginal probabilities in the corresponding row and column of the table. For example, consider the marginal distributions

xi      :    3      5      6
f(xi)   :   0.2    0.5    0.3

yj      :    0      1      2      3
g(yj)   :   0.2    0.2    0.5    0.1

Then h(xi, yj) = f(xi) g(yj), giving the joint table

            Y = 0    Y = 1    Y = 2    Y = 3
X = 3       0.04     0.04     0.10     0.02
X = 5       0.10     0.10     0.25     0.05
X = 6       0.06     0.06     0.15     0.03
A stochastic process is a family of random variables {X(t) | t ∈ T} defined on a common sample space, where T is the index (parameter) set and the set I of values assumed by X(t) is called the state space, with states s ∈ I. When both s and t are varied, we generate a family of random variables constituting a stochastic process.
Types of stochastic processes:
1. If the state space I and the index set T of a stochastic process are both discrete, then it is called a discrete state, discrete parameter (time) process.
2. On the other hand, if the state space is continuous and the index set T is discrete, then we have a continuous state, discrete parameter process.
3. Similarly, one can have a discrete state, continuous parameter process, and
4. a continuous state, continuous parameter process.
The theory of queues provides a number of examples of stochastic processes. Among the various processes, the Markov process is seen to be the most useful.
Markov Process (Memoryless process):
A stochastic process {X(t) | t ∈ T} is called a Markov process if, for any t0 < t1 < t2 < . . . < tn < t, the conditional distribution of X(t), given the values X(t0), X(t1), . . . , X(tn), depends only on X(tn); that is, the probability of occurrence of an event in the future depends only on the present state and not on the past states.
That is, the behaviour of the stochastic process is such that the probability distributions for its future development depend only on the present state and not on how the process arrived at that state. Here the state space I is discrete in nature.
Equivalently, in a Markov chain (a Markov process whose state space I takes discrete values), the past history is completely summarized in the current state, and the future is independent of the past, depending only on the present state.
First we shall discuss a few basic concepts pertaining to Markov chains. A vector u = (u1, u2, . . . , un) is called a probability vector if its components are non-negative and their sum is 1; a square matrix is called a stochastic matrix if each of its rows is a probability vector.
A stochastic matrix A is said to be regular if all the entries of some power A^m of A are positive, where m is a positive integer.
Result 1: Let A be a regular stochastic matrix. Then A has a unique fixed probability vector t; i.e., one can find a unique probability vector t such that t = t A.
Result 2: The sequence A, A², A³, . . . of powers of A approaches the matrix T whose rows are each the fixed point t.
Result 3: If q is any probability vector, then the sequence q A, q A², q A³, . . . approaches the fixed point t.
Transition Matrix of a Markov Chain:
Consider a Markov chain, a finite stochastic process consisting of a finite sequence of trials whose outcomes, say x1, x2, x3, . . ., satisfy the following properties. Each outcome belongs to the state space I = {a1, a2, a3, . . . , an}, and the outcome of any trial depends, at most, on the outcome of the immediately preceding trial and not on any other previous outcomes. If the outcome of the n-th trial is ai, then we say that the system is in state ai at the n-th stage. Thus, with each pair of states (ai, aj), we associate a probability value pij, which indicates the probability of the system moving from the state ai to the state aj in one step. These probabilities form a matrix called the transition probability matrix, or just the transition matrix. This may be written as

        | p11   p12   p13   . . .   p1n |
        | p21   p22   p23   . . .   p2n |
M =     | . . . . . . . . . . . . . . . |
        | pn1   pn2   pn3   . . .   pnn |

Note: Here, the i-th row of M represents the probabilities that the system will change from ai to a1, a2, a3, . . . , an. Equivalently, pij = P[xn = j | xn−1 = i].
n-step transition probabilities:
The probability that a Markov chain will move from state i to state j in exactly n steps is denoted by pij(n) = P[x(m+n) = j | xm = i].
Evaluation of the n-step transition probability matrix:
If M is the transition matrix of a Markov chain, then the n-step transition matrix may be obtained by taking the n-th power of M. Suppose that at time t = 0 the system is in one of the states a1, a2, a3, . . . , an, where the process begins; then the corresponding probability vector

a(0) = (a1(0), a2(0), a3(0), . . . , an(0))

denotes the initial probability distribution. Similarly, the distribution after successive steps is obtained by repeated multiplication by M:

a(1) = a(0) M
a(2) = a(1) M = a(0) M²
a(3) = a(2) M = a(0) M³

and, in general, a(n) = a(0) Mⁿ.
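The relations above can be sketched directly in code; the two-state matrix below is hypothetical, chosen only to illustrate a(n) = a(0) Mⁿ:

```python
# Compute a(n) = a(0) M^n by repeated vector-matrix multiplication.
# M is a hypothetical two-state transition matrix; row i holds p_ij.

def step(a, M):
    """One step of the chain: returns the row vector a M."""
    n = len(M)
    return [sum(a[i] * M[i][j] for i in range(n)) for j in range(n)]

M = [[0.8, 0.2],
     [0.3, 0.7]]
a = [1.0, 0.0]          # a(0): start in state 1 with certainty
for _ in range(2):      # two steps give a(2) = a(0) M^2
    a = step(a, M)

print([round(v, 10) for v in a])  # → [0.7, 0.3]
```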
Example: Determine which of the following are stochastic matrices.

(i)  | 1/3   2/3 |      (ii)  | 1/4   3/4 |      (iii)  | 1/2    1/2 |
     | 3/4   1/4 |            | 2/3   2/3 |             | 4/3   −1/3 |

Solution: (i) is a stochastic matrix, as all the entries are non-negative and the sum of the values in each of the two rows equals 1. (ii) is not a stochastic matrix, since the sum of the second row exceeds 1. (iii) is again not a stochastic matrix, because one of the entries is negative.
Example: Determine whether each of the following stochastic matrices is regular.

           | 0     1     0   |             | 0   0   1 |
(i)  A =   | 1/2   0     1/2 |   (ii) B =  | 1   0   0 |
           | 1/4   1/4   1/2 |             | 0   1   0 |

Solution: A is regular, since every entry of A³ is positive. B is not regular: B is a permutation matrix, so its powers merely permute the rows and some entries remain zero in every power.
Example: Find the unique fixed probability vector of each of the following regular stochastic matrices.

           | 1/3   2/3 |            | 3/4   1/4   0   |             | 0     1     0   |
(i)  A =   |  1     0  |  (ii) A =  |  0    1/2   1/2 |  (iii) A =  | 1/6   1/2   1/3 |
                                    |  1     0     0  |             |  0    2/3   1/3 |

Solution: (i) We seek a probability vector t = (x, 1 − x) such that t = t A; that is,

(x, 1 − x) = (x, 1 − x) | 1/3   2/3 |
                        |  1     0  |

A multiplication of the matrices on the right-hand side yields

x = (1/3) x + (1 − x)   and   1 − x = (2/3) x.

Thus one can find that x = 0.6; hence the required unique fixed probability vector is t = (0.6, 0.4).
(ii) We shall set up t = (x, y, z) as the unique fixed probability vector. Since this is a probability vector, t = (x, y, 1 − x − y). Now consider the matrix equation

                                      | 3/4   1/4   0   |
(x, y, 1 − x − y) = (x, y, 1 − x − y) |  0    1/2   1/2 |
                                      |  1     0     0  |

Multiplying out and equating components gives equations in x and y, from which the fixed probability vector follows.
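Result 3 above also suggests a numerical route to the fixed probability vector: start from any probability vector and multiply by A repeatedly. A sketch, for the matrix of part (i) as reconstructed here:

```python
# Approximate the unique fixed probability vector t of a regular stochastic
# matrix by power iteration: q, qA, qA^2, ... approaches t (Result 3).

def fixed_vector(A, iters=200):
    n = len(A)
    q = [1.0 / n] * n                  # any starting probability vector
    for _ in range(iters):
        q = [sum(q[i] * A[i][j] for i in range(n)) for j in range(n)]
    return q

A = [[1/3, 2/3],
     [1.0, 0.0]]
t = fixed_vector(A)
print([round(v, 6) for v in t])  # → [0.6, 0.4], agreeing with part (i)
```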
Example: Suppose that if a boy studies one night, the probability that he studies the next night is 0.3, while if he does not study one night, the probability that he studies the next night is 0.4. In the long run, how often does he study? The transition probabilities may be tabulated as follows.

                                    II night
                           Studying      Not studying
I night   Studying           0.3             0.7
          Not studying       0.4             0.6

Therefore, it is clear that the required stochastic matrix is

M = | 0.3   0.7 |
    | 0.4   0.6 |

To solve the given problem, it is sufficient to compute the unique fixed probability vector t = (x, 1 − x) such that t = t M. Using the above, one obtains x = 0.3 x + 0.4 (1 − x). From this equation, we can calculate x = 4/11. Thus, the chance of the boy studying in the long run is about 36.4%.
6. A person's pattern of playing chess or tennis is as follows: if he plays chess one week, then he switches to playing tennis the next week with probability 0.2. On the other hand, if he plays tennis one week, then there is a probability of 0.7 that he will play tennis in the next week as well. In the long run, how often does he play chess?
Solution: Like the previous case, this one too is based on a Markov process. Here the states are the player playing either chess or tennis, and the game played in a given week depends only on the game played in the previous week. As usual, first we obtain the transition probability matrix.
                               II week
                        Chess         Tennis
I week   Chess           0.8           0.2
         Tennis          0.3           0.7

Clearly, the transition probability matrix is

M = | 0.8   0.2 |
    | 0.3   0.7 |

Here, too, the problem is to find the unique fixed probability vector t = (x, 1 − x) such that t = t M. Consider the matrix equation

(x, 1 − x) = (x, 1 − x) | 0.8   0.2 |
                        | 0.3   0.7 |

A simple multiplication results in x = 0.8 x + 0.3 (1 − x), from which we get x = 0.6. Hence we conclude that, in the long run, the person plays chess about 60% of the time.
7. A salesman S sells in only three cities, A, B, and C. Suppose that S never sells in the same city on successive days. If S sells in city A, then the next day S sells in city B. However, if S sells in either B or C, then the next day S is twice as likely to sell in city A as in the other city. Find out how often, in the long run, S sells in each city.
Solution: First we shall obtain the transition probability matrix, which will have 3 rows and 3 columns. As the salesman does not sell in the same city on consecutive days, clearly the main diagonal consists of zeros. It is given that if S sells in city A on the first day, then the next day he sells in city B; therefore, the first row is 0, 1, 0. Next, consider two cases: (i) S selling in city B, and (ii) S selling in city C. If S sells in city B, then the next day he sells in city C with some probability p and in city A with probability 2p, and there is no way he sells in city B again. Thus 2p + 0 + p = 1, implying that p = 1/3, and the middle row of the matrix is 2/3, 0, 1/3. Similarly, if S sells in city C, then the next day he sells in city B with some probability q and in city A with probability 2q, and the chance of selling in city C again is 0. Thus the last row is 2/3, 1/3, 0. The transition matrix is
                     Next day
                  A     B     C
          A   |   0     1     0   |
I day     B   |  2/3    0    1/3  |
          C   |  2/3   1/3    0   |

Setting t = (x, y, 1 − x − y) and solving t = t M, we obtain

x = (2/3) y + (2/3)(1 − x − y)   and   y = x + (1/3)(1 − x − y).

Solving these two equations, one obtains x = 0.4, y = 0.45 and z = 0.15. Hence the chances of the salesman selling in cities A, B, and C are 40%, 45% and 15% respectively.
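Instead of solving the equations by hand, the long-run distribution can be found numerically; power iteration works here as well, assuming the transition matrix derived above:

```python
# Long-run distribution of the salesman's chain over states A, B, C.
# Power iteration q -> q M converges to the fixed vector of a regular chain.

def fixed_vector(M, iters=500):
    n = len(M)
    q = [1.0 / n] * n
    for _ in range(iters):
        q = [sum(q[i] * M[i][j] for i in range(n)) for j in range(n)]
    return q

M = [[0.0, 1.0, 0.0],      # from A: always to B
     [2/3, 0.0, 1/3],      # from B: A with 2/3, C with 1/3
     [2/3, 1/3, 0.0]]      # from C: A with 2/3, B with 1/3

t = fixed_vector(M)
print([round(v, 4) for v in t])  # → [0.4, 0.45, 0.15]
```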
8. Mary's gambling luck follows a pattern. If she wins a game, the probability of winning the next game is 0.6. However, if she loses a game, the probability of losing the next game is 0.7. There is an even chance that she wins the first game. (a) Find the transition matrix M of the Markov process. (b) Find the probability that she wins the second game. (c) Find the probability that she wins the third game. Find out how often, in the long run, she wins.
10. Each year Rohith trades his car for a new car. If he has a Maruti, he trades it in for a Santro. If he has a Santro, then he trades it in for a Ford. However, if he has a Ford, he is just as likely to trade it in for a new Ford as to trade it in for a Maruti or for a Santro. In 1995, he bought his first car, which was a Ford.
(a) Find the probability that he bought (i) a 1997 Maruti, (ii) a 1998 Santro, (iii) a 1998 Ford.
(b) Find out how often, in the long run, he will have a Ford.
PLEDGE
I commit to excel, in all I do.
I will apply myself actively to making a difference at
Malnad College of Engineering, Hassan.
I will use every opportunity presented to me by my
superiors from this moment to make that difference.
For myself, for colleagues, and for my students
(Dr. K. Gururajan)