FIGURE 6-32  The product f(x | x > t) dx equals the probability that the system fails in the interval (x, x + dx), assuming that it functions at time t.
EXAMPLE 6-44. If f(x) = ce^{-cx}U(x), then F(t) = 1 - e^{-ct} and (6-223) yields
f(x | x > t) = ce^{-cx}/e^{-ct} = ce^{-c(x-t)} = f(x - t)    x > t
This shows that the probability that a system functioning at time t fails in the interval (x, x + dx) depends only on the difference x - t (Fig. 6-32). We show later that this is true only if f(x) is an exponential density.
The product β(t) dt is the probability that a system functioning at time t fails in the interval (t, t + dt). In Sec. 7-1 (Example 7-3) we interpret the function β(t) as the expected failure rate. From (6-223) it follows that
β(t) = f(t | x > t) = f(t)/[1 - F(t)] = -d ln R(t)/dt    where    R(t) = 1 - F(t)    (6-224)
Integrating from 0 to x and using the fact that ln R(0) = 0, we obtain
f(x) = β(x) exp{-∫_0^x β(t) dt}    (6-225)
EXAMPLE 6-46 (MEMORYLESS SYSTEMS). A system is called memoryless if the probability that it fails in an interval (t, x), assuming that it functions at time t, depends only on the length of this interval. In other words, if the system works a week, a month, or a year after it was put into operation, it is as good as new. This is equivalent to the assumption that f(x | x > t) = f(x - t) as in Fig. 6-32. From this and (6-224) it follows that, with x = t,
β(t) = f(t | x > t) = f(t - t) = f(0) = c
and (6-225) yields f(x) = ce^{-cx}. Thus a system is memoryless iff x has an exponential density.
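As a numerical illustration (not from the text; the rate c, the time t, and the sample size below are arbitrary choices), the following sketch checks that the residual lifetime x - t of an exponential system that has survived to time t has the same distribution as x itself:

```python
import numpy as np

# Sketch: check the memoryless property of the exponential density f(x) = c*exp(-c*x).
# The parameters c, t and the sample size are arbitrary choices for illustration.
rng = np.random.default_rng(0)
c, t, n = 0.5, 2.0, 1_000_000

x = rng.exponential(scale=1.0 / c, size=n)   # lifetimes with density c*exp(-c*x)
residual = x[x > t] - t                      # remaining life of systems still working at t

# If the system is memoryless, the residual life has the same distribution as x itself.
print("mean of x        :", x.mean())          # ~ 1/c = 2.0
print("mean of residual :", residual.mean())   # ~ 1/c as well
print("P{x > 1}         :", (x > 1).mean())
print("P{residual > 1}  :", (residual > 1).mean())
```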
EXAMPLE 6-47. A special form of β(t) of particular interest in reliability theory is the function
β(t) = ct^{b-1}
This is a satisfactory approximation of a variety of failure rates, at least near the origin. The corresponding f(x) is obtained from (6-225):
f(x) = cx^{b-1} exp{-cx^b/b}
This function is called the Weibull density. (See (4-43) and Fig. 4-16.)
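A quick numerical cross-check of (6-225) — my own sketch, with arbitrary values of c and b — reconstructs f(x) from β(t) = ct^{b-1} by numerical integration and compares it with the closed-form Weibull density:

```python
import numpy as np

# Sketch: recover the Weibull density from the failure rate beta(t) = c*t**(b-1) via (6-225).
# c and b are arbitrary illustrative values.
c, b = 1.5, 2.0
x = np.linspace(1e-6, 3.0, 601)

beta = c * x**(b - 1)
cum = np.cumsum(beta) * (x[1] - x[0])        # crude numerical integral of beta from 0 to x
f_numeric = beta * np.exp(-cum)              # f(x) = beta(x) * exp(-integral of beta)
f_closed = c * x**(b - 1) * np.exp(-c * x**b / b)

print("max abs difference:", np.max(np.abs(f_numeric - f_closed)))  # small, limited by the grid
```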
We conclude with the observation that the function β(t) equals the value of the conditional density f(x | x > t) for x = t; however, β(t) is not a density because its area is not one. In fact, its area is infinite. This follows from (6-224) because R(∞) = 1 - F(∞) = 0.
Parallel: We say that the two systems are connected in parallel if S fails when both systems fail. Denoting by z the time to failure of S, we conclude that z = t when the larger of the numbers x and y equals t. Hence
z = max(x, y)    F_z(z) = F_{xy}(z, z)
FIGURE 6-33
Series: We say that the two systems are connected in series if S fails when at least one of the two systems fails. Denoting by w the time to failure of S, we conclude that w = t when the smaller of the numbers x and y equals t. Hence [see (6-80)-(6-81)]
w = min(x, y)    F_w(w) = F_x(w) + F_y(w) - F_{xy}(w, w)
If the random variables x and y are independent,
1 - F_w(w) = [1 - F_x(w)][1 - F_y(w)]    and    β_w(t) = β_x(t) + β_y(t)
where β_x(t), β_y(t), and β_w(t) are the conditional failure rates of systems S1, S2, and S, respectively.
Standby: We put system S1 into operation, keeping S2 in reserve. When S1 fails, we put S2 into operation. The system S so formed fails when S2 fails. If t1 and t2 are the times of operation of S1 and S2, then t1 + t2 is the time of operation of S. Denoting by s the time to failure of system S, we conclude that
s = x + y
The distribution of s equals the probability that the point (x, y) is in the triangular shaded region of Fig. 6-33c. If the random variables x and y are independent, the density of s equals
f_s(s) = ∫ f_x(x) f_y(s - x) dx
as in (6-45).
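The three interconnections can be illustrated by simulation. The sketch below is not part of the text; it assumes exponential subsystem lifetimes with arbitrary rates and compares empirical probabilities with the formulas for the minimum, maximum, and sum:

```python
import numpy as np

# Sketch: failure times of parallel (max), series (min) and standby (sum) connections
# for two independent exponential subsystems. Rates are arbitrary illustrative values.
rng = np.random.default_rng(1)
n = 500_000
x = rng.exponential(scale=1.0, size=n)    # lifetime of S1
y = rng.exponential(scale=2.0, size=n)    # lifetime of S2

z = np.maximum(x, y)     # parallel: fails when both have failed
w = np.minimum(x, y)     # series: fails when either one fails
s = x + y                # standby: S2 starts when S1 fails

t = 1.0
# F_w(t) = F_x(t) + F_y(t) - F_x(t)*F_y(t) for independent x, y
Fx, Fy = 1 - np.exp(-t / 1.0), 1 - np.exp(-t / 2.0)
print("series   P{w <= 1}:", (w <= t).mean(), "vs", Fx + Fy - Fx * Fy)
print("parallel P{z <= 1}:", (z <= t).mean(), "vs", Fx * Fy)
print("standby  mean of s:", s.mean(), "vs", 1.0 + 2.0)
```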
The conditional mean
E{g(y) | M} = ∫ g(y) f(y | M) dy
can be used to define the conditional mean and the conditional variance of y. Using a limit argument as in (6-205), we can also define the conditional mean
E{g(y) | x} = ∫ g(y) f(y | x) dy    (6-227)
In particular,
E{y | x} = ∫ y f(y | x) dy    (6-228)
is the conditional mean of y assuming x = x; the conditional variance is defined similarly. We shall illustrate these concepts through an example.
The random variables x and y are uniform in the triangle 0 < |y| < x < 1 of Fig. 6-34, so that f_{xy}(x, y) = 1 in this region. Hence
f_x(x) = 2x    0 < x < 1        f_y(y) = 1 - |y|    |y| < 1
This gives
f_{x|y}(x | y) = f_{xy}(x, y)/f_y(y) = 1/(1 - |y|)    0 < |y| < x < 1    (6-231)
FIGURE 6-34
and
f_{y|x}(y | x) = f_{xy}(x, y)/f_x(x) = 1/(2x)    0 < |y| < x < 1    (6-232)
Hence
E{x | y} = ∫ x f_{x|y}(x | y) dx = ∫_{|y|}^{1} x/(1 - |y|) dx = [x²/2(1 - |y|)]_{|y|}^{1}
         = (1 - |y|²)/[2(1 - |y|)] = (1 + |y|)/2    |y| < 1    (6-233)
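A Monte Carlo spot-check of (6-233) — an illustrative sketch of my own, with an arbitrary conditioning point y0 and window width — samples uniformly on the triangle and averages x over the samples whose y falls near y0:

```python
import numpy as np

# Sketch: Monte Carlo check of E{x | y} = (1 + |y|)/2 for the triangle 0 < |y| < x < 1.
rng = np.random.default_rng(2)
n = 2_000_000
x = np.sqrt(rng.random(n))                    # f_x(x) = 2x on (0, 1)
y = rng.uniform(-x, x)                        # y | x uniform on (-x, x), so f_xy = 1 on the triangle

y0, delta = 0.4, 0.01                         # conditioning window around y0 (arbitrary)
sel = np.abs(y - y0) < delta
print("simulated E{x | y ~ y0}:", x[sel].mean())
print("formula (1 + |y0|)/2   :", (1 + abs(y0)) / 2)
```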
For a given x, the integral in (6-228) is the center of gravity of the masses in the vertical strip (x, x + dx). The locus of these points, as x varies from -∞ to ∞, is the function
φ(x) = E{y | x} = ∫_{-∞}^{∞} y f(y | x) dy
known as the regression line of y on x.
Note  If the random variables x and y are functionally related, that is, if y = g(x), then the probability masses on the xy plane are on the line y = g(x) (see Fig. 6-5b); hence E{y | x} = g(x).
Galton's law. The term regression has its origin in the following observation attributed to the geneticist Sir Francis Galton (1822-1911): "Population extremes regress toward their mean." This observation, applied to parents and their adult children, implies that children of tall (or short) parents are on the average shorter (or taller) than their parents. In statistical terms, it can be phrased in terms of conditional expected values:
Suppose that the random variables x and y model the heights of parents and their children respectively. These random variables have the same mean and variance, and they are positively correlated:
FIGURE 6-36
According to Galton's law, the conditional mean E{y | x} of the height of children whose parents' height is x is smaller (or larger) than x if x > η (or x < η):
E{y | x} = φ(x)    with    φ(x) < x if x > η,    φ(x) > x if x < η
This shows that the regression line φ(x) is below the line y = x for x > η and above this line if x < η, as in Fig. 6-36. If the random variables x and y are jointly normal, then [see (6-236) below] the regression line is the straight line φ(x) = rx. For arbitrary random variables, the function φ(x) does not obey Galton's law. The term regression is used, however, to identify any conditional mean.
EXAMPLE 6-49. If the random variables x and y are normal as in Example 6-41, then the function
E{y | x} = η₂ + rσ₂(x - η₁)/σ₁    (6-236)
is a straight line with slope rσ₂/σ₁ passing through the point (η₁, η₂). Since for normal random variables the conditional mean E{y | x} coincides with the maximum of f(y | x), we conclude that the locus of the maxima of all profiles of f(x, y) is the straight line (6-236).
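The line (6-236) can be visualized by simulation. The following sketch (illustrative only; all parameter values are arbitrary) compares binned conditional averages of y with the formula:

```python
import numpy as np

# Sketch: for jointly normal x, y, the conditional mean E{y | x} lies on the line
# eta2 + r*(sigma2/sigma1)*(x - eta1).  Parameter values are arbitrary.
rng = np.random.default_rng(3)
eta1, eta2, s1, s2, r = 1.0, -0.5, 2.0, 1.5, 0.6
n = 1_000_000

x = rng.normal(eta1, s1, n)
# y | x is normal with mean eta2 + r*s2*(x - eta1)/s1 and variance s2**2*(1 - r**2)
y = eta2 + r * s2 * (x - eta1) / s1 + rng.normal(0, s2 * np.sqrt(1 - r**2), n)

for x0 in (-1.0, 1.0, 3.0):
    sel = np.abs(x - x0) < 0.05
    print(x0, "binned mean:", y[sel].mean(),
          "  line:", eta2 + r * s2 * (x0 - eta1) / s1)
```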
From theorems (6-159) and (6-227) it follows that
E{g(x, y) | M} = ∫∫ g(x, y) f(x, y | M) dx dy    (6-237)
This expression can be used to determine E{g(x, y) | x}; however, the conditional density f(x, y | x) consists of line masses on the line x = constant. To avoid dealing with line masses, we shall define E{g(x, y) | x} as a limit:
E{g(x, y) | x} = lim_{Δx→0} E{g(x, y) | x < x ≤ x + Δx}
As we have shown in Example 6-39, the conditional density f(x, y | x < x ≤ x + Δx) is 0 outside the strip (x, x + Δx) and in this strip it is given by (6-203) with x₁ = x and x₂ = x + Δx. It follows, therefore, from (6-237) with M = {x < x ≤ x + Δx} that
E{g(x, y) | x} = ∫_{-∞}^{∞} g(x, y) f(y | x) dy
We also note that
E{g(x₀, y) | x = x₀} = ∫ g(x₀, y) f(y | x₀) dy
because g(x₀, y) is a function of the random variable y, with x₀ a parameter; hence its conditional expected value is given by (6-227). Thus
E{g(x, y) | x = x₀} = E{g(x₀, y) | x = x₀}    (6-240)
One might be tempted from the above to conclude that (6-240) follows directly from (6-227); however, this is not so. The functions g(x, y) and g(x₀, y) have the same expected value, assuming x = x₀, but they are not equal. The first is a function g(x, y) of the random variables x and y, and for a specific ξ it takes the value g[x(ξ), y(ξ)]. The second is a function g(x₀, y) of the real number x₀ and the random variable y, and for a specific ξ it takes the value g[x₀, y(ξ)], where x₀ is an arbitrary number.
The mean of the regression line φ(x) = E{y | x} equals E{y}. Indeed,
E{φ(x)} = ∫_{-∞}^{∞} φ(x) f(x) dx = ∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x) y f(y | x) dy dx
Since f(x, y) = f(x) f(y | x), the last equation yields
E{φ(x)} = ∫_{-∞}^{∞} ∫_{-∞}^{∞} y f(x, y) dy dx = E{y}
that is, E{E{y | x}} = E{y}.
EXAMPLE 6-50. Suppose that the random variables x and y are N(0, 0, σ₁², σ₂², r). As we know,
E{x⁴} = 3σ₁⁴
Furthermore, f(y | x) is a normal density with mean rσ₂x/σ₁; hence E{y | x} = rσ₂x/σ₁. We shall show that
E{x³y} = 3rσ₁³σ₂
Proof.
E{x³y} = E{x³E{y | x}} = E{x³ · rσ₂x/σ₁} = (rσ₂/σ₁)E{x⁴} = 3rσ₁³σ₂
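A Monte Carlo check of the result — my own sketch with arbitrary σ₁, σ₂, r — estimates E{x³y} directly and compares it with 3rσ₁³σ₂:

```python
import numpy as np

# Sketch: Monte Carlo check of E{x**3 * y} = 3*r*sigma1**3*sigma2 for zero-mean
# jointly normal x, y.  Parameter values are arbitrary.
rng = np.random.default_rng(4)
s1, s2, r, n = 1.2, 0.8, 0.5, 4_000_000

x = rng.normal(0, s1, n)
y = r * s2 * x / s1 + rng.normal(0, s2 * np.sqrt(1 - r**2), n)   # y | x normal as in the text

print("simulated E{x^3 y} :", np.mean(x**3 * y))
print("formula 3 r s1^3 s2:", 3 * r * s1**3 * s2)
```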
PROBLEMS
6-1  The random variables x and y are independent, identically distributed (i.i.d.) random variables with common density f(x) = e^{-x}U(x). Find the p.d.f. of the following: (a) x + y, (b) x - y, (c) …, (d) …, (e) min(x, y), (f) max(x, y).
6-2  x and y are independent and uniform in the interval (0, a). Find the p.d.f. of (a) x/y, (b) y/(x + y), (c) |x - y|.
6-3  The joint p.d.f. of the random variables x and y is given by
f_{xy}(x, y) = 1 in the shaded area of Fig. P6-3, and 0 otherwise
FIGURE P6-3
The joint p.d.f. of the random variables x and y is
f_{xy}(x, y) = x + y    0 ≤ x ≤ 1, 0 ≤ y ≤ 1,    and 0 otherwise
Show that (a) x + y has density f₁(z) = z², 0 < z < 1, f₁(z) = z(2 - z), 1 < z < 2, and 0 elsewhere. (b) xy has density f₂(z) = 2(1 - z), 0 < z < 1, and 0 elsewhere. (c) y/x has density f₃(z) = (1 + z)/3, 0 < z < 1, f₃(z) = (1 + z)/3z³, z > 1, and 0 elsewhere. (d) y - x has density f₄(z) = 1 - |z|, |z| < 1, and 0 elsewhere.
6-8  Suppose x and y have joint density
f_{xy}(x, y) = 1    0 ≤ x ≤ 2, 0 ≤ y ≤ 1, 2y ≤ x,    and 0 otherwise
Show that z = x + y has density
f_z(z) = (1/3)z, 0 < z < 2;    f_z(z) = 2 - (2/3)z, 2 < z < 3;    and 0 elsewhere
6-9  x and y are uniformly distributed on the triangular region 0 ≤ y ≤ x ≤ 1. (a) Show that z = x/y has density f_z(z) = 1/z², z ≥ 1, and f_z(z) = 0 otherwise. (b) Determine the density of xy.
6-10  x and y are uniformly distributed on the triangular region 0 < x ≤ y ≤ x + y ≤ 2. Find the p.d.f. of x + y and x - y.
6-11  x and y are independent gamma random variables with common parameters α and β. Find the p.d.f. of (a) x + y, (b) x/y, (c) x/(x + y).
6-12  x and y are independent uniformly distributed random variables on (0, d). Find the joint p.d.f. of x + y and x - y.
6-13  x and y are independent Rayleigh random variables with common parameter σ². Determine the density of x/y.
6-14  The random variables x and y are independent and z = x + y. Find f_y(y) if
f_x(x) = ce^{-cx}U(x)        f_z(z) = c²ze^{-cz}U(z)
6-15  The random variables x and y are independent and y is uniform in the interval (0, 1). Show that, if z = x + y, then
f_z(z) = F_x(z) - F_x(z - 1)
6-16  (a) The function g(x) is monotone increasing and y = g(x). Show that
F_{xy}(x, y) = F_x(x)  if y > g(x);    F_{xy}(x, y) = F_y(y)  if y < g(x)
(b) Find F_{xy}(x, y) if g(x) is monotone decreasing.
6-17  The random variables x and y are N(0, 4) and independent. Find f_z(z) and F_z(z) if (a) z = 2x + 3y, and (b) z = x/y.
6-18  The random variables x and y are independent with …
6-27  Let x and y be independent identically distributed exponential random variables with common parameter λ. Find the p.d.f.s of (a) z = y/max(x, y), (b) w = x/min(x, 2y).
6-28  If x and y are independent exponential random variables with common parameter λ, show that x/(x + y) is a uniformly distributed random variable in (0, 1).
6-29  x and y are independent exponential random variables with common parameter λ. Show that …
6-…  x and y are independent normal random variables. Define
u = (x² - y²)/√(x² + y²)        v = xy/√(x² + y²)
(a) Find the joint p.d.f. f_{uv}(u, v) of the random variables u and v. (b) Show that u and v are independent normal random variables. (c) Show that [(x - y)² - …]/√(x² + y²) is also a normal random variable. Thus nonlinear functions of normal random variables can lead to normal random variables! (This result is due to Shepp.)
6-35  Suppose z has an F distribution with (m, n) degrees of freedom. (a) Show that 1/z also has an F distribution with (n, m) degrees of freedom. (b) Show that mz/(mz + n) has a beta distribution.
6-36  Let the joint p.d.f. of x and y be given by
f_{xy}(x, y) = …    0 < y ≤ x < ∞,    and 0 otherwise
Define z = x + y, w = x - y. Find the joint p.d.f. of z and w. Show that z is an exponential random variable.
6-37  Let
f_{xy}(x, y) = 2e^{-(x+y)}    0 < x < y < ∞,    and 0 otherwise
Define z = x + y, w = y/x. Determine the joint p.d.f. of z and w. Are z and w independent random variables?
6-38  The random variables x and θ are independent and θ is uniform in the interval (-π, π). Show that if z = x cos(ωt + θ), then …
6-40  The random variables x and y are of discrete type, independent, with P{x = n} = a_n, P{y = n} = b_n, n = 0, 1, .... Show that, if z = x + y, then
P{z = n} = Σ_{k=0}^{n} a_k b_{n-k}    n = 0, 1, ...
6-42  x and y are independent random variables with geometric p.m.f.
P{x = k} = pq^k,  k = 0, 1, 2, ...        P{y = m} = pq^m,  m = 0, 1, 2, ...
Find the p.m.f. of (a) x + y and (b) x - y.
6-43  Let x and y be independent identically distributed nonnegative discrete random variables with
P{x = k} = P{y = k} = p_k    k = 0, 1, 2, ...
Suppose …
…
f(x₁, x₂) = (1/(2π√Δ)) exp{-½ X C⁻¹ Xᵗ}        C = [μ₁₁  μ₁₂ ; μ₂₁  μ₂₂]
6-48  Show that if the random variables x and y are normal and independent, then …
6-55  Let x represent the number of successes and y the number of failures of n independent Bernoulli trials with p representing the probability of success in any one trial. Find the distribution of z = x - y. Show that E{z} = n(2p - 1), Var{z} = 4np(1 - p).
6-56  x and y are zero-mean independent random variables with variances σ_x² and σ_y², respectively; that is, x ~ N(0, σ_x²), y ~ N(0, σ_y²). Let
z = ax + by + c    c ≠ 0
(a) Find the characteristic function Φ_z(u) of z. (b) Using Φ_z(u), conclude that z is also a normal random variable. (c) Find the mean and variance of z.
6-57  Suppose the conditional distribution of x given y = n is binomial with parameters n and p₁. Further, y is a binomial random variable with parameters M and p₂. Show that the distribution of x is also binomial. Find its parameters.
6-58  The random variables x and y are jointly distributed over the region 0 < x < y < 1 as
f_{xy}(x, y) = kx    0 < x < y < 1,    and 0 otherwise
for some k. Determine k. Find the variances of x and y. What is the covariance between x and y?
6-59  x is a Poisson random variable with parameter λ and y is a normal random variable with mean μ and variance σ². Further, x and y are given to be independent. (a) Find the joint characteristic function of x and y. (b) Define z = x + y. Find the characteristic function of z.
6-60  x and y are independent exponential random variables with common parameter λ. Find (a) E[min(x, y)], (b) E[max(2x, y)].
6-61  The joint p.d.f. of x and y is given by
f_{xy}(x, y) = 6x    x > 0, y > 0, 0 < x + y ≤ 1,    and 0 otherwise
Define z = x - y. (a) Find the p.d.f. of z. (b) Find the conditional p.d.f. of y given x. (c) Determine Var{x + y}.
6-62  Suppose x represents the inverse of a chi-square random variable with one degree of freedom, and the conditional p.d.f. of y given x is N(0, x). Show that y has a Cauchy distribution.
6-63  For any two random variables x and y, let σ_x² = Var{x}, σ_y² = Var{y}, and σ_{x+y}² = Var{x + y}. (a) Show that
σ_{x+y}/(σ_x + σ_y) ≤ 1
6-64  x and y are jointly normal with parameters N(μ_x, μ_y, σ_x², σ_y², ρ_xy). Find (a) E{y | x = x}, and (b) E{x² | y = y}.
6-65  For any two random variables x and y with E{x²} < ∞, show that (a) Var{x} ≥ E[Var{x | y}], (b) Var{x} = Var[E{x | y}] + E[Var{x | y}].
6-66  Let x and y be independent random variables with variances σ₁² and σ₂², respectively. Consider the sum
z = ax + (1 - a)y
Find a that minimizes the variance of z.
6-67  Show that, if the random variable x is of discrete type taking the values x_n with P{x = x_n} = p_n and z = g(x, y), then
E{z} = Σ_n E{g(x_n, y)} p_n        f_z(z) = Σ_n f_z(z | x_n) p_n
6-68  Show that, if the random variables x and y are N(0, 0, σ², σ², r), then …
6-78  Show that the random variables x and y are independent iff for any a and b:
E{U(a - x)U(b - y)} = E{U(a - x)}E{U(b - y)}
6-79  Show that …
CHAPTER 7
SEQUENCES OF RANDOM VARIABLES
where
f(X) = f(x₁, ..., x_n) = ∂ⁿF(x₁, ..., x_n)/(∂x₁ ⋯ ∂x_n)    (7-3)
Note  We have just identified various functions in terms of their independent variables. Thus f(x₁, x₃) is the joint density of the random variables x₁ and x₃ and it is in general different from the joint density f(x₂, x₄) of the random variables x₂ and x₄. Similarly, the density f_i(x_i) of the random variable x_i will often be denoted by f(x_i).
TRANSFORMATIONS. Given k functions
g₁(X), ..., g_k(X)
of the random vector X = [x₁, ..., x_n], we form the random variables y₁ = g₁(X), ..., y_k = g_k(X). … In this case, the masses in the k space are singular and can be determined in terms of the joint density of y₁, ..., y_n. It suffices, therefore, to assume that k = n.
To find the density f_y(y₁, ..., y_n) of the random vector Y = [y₁, ..., y_n] for a specific set of numbers y₁, ..., y_n, we solve the system
g₁(X) = y₁, ..., g_n(X) = y_n    (7-7)
If this system has no solutions, then f_y(y₁, ..., y_n) = 0. If it has a single solution X = [x₁, ..., x_n], then
f_y(y₁, ..., y_n) = f_x(x₁, ..., x_n) / |J(x₁, ..., x_n)|    (7-8)
where J(x₁, ..., x_n) is the Jacobian of the transformation, that is, the determinant of the matrix with elements ∂g_i/∂x_j.
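As an illustration of (7-7)-(7-8) — a sketch of my own, not an example from the text — consider the linear transformation y₁ = x₁ + x₂, y₂ = x₁ - x₂ of two independent N(0, 1) variables; the system has a single solution and |J| = 2:

```python
import numpy as np

# Sketch of (7-8): y1 = x1 + x2, y2 = x1 - x2 has the single solution
# x1 = (y1 + y2)/2, x2 = (y1 - y2)/2 and |J| = 2, so
# f_y(y1, y2) = f_x((y1+y2)/2, (y1-y2)/2) / 2.
rng = np.random.default_rng(5)
n = 2_000_000
x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
y1, y2 = x1 + x2, x1 - x2

def fx(a, b):                      # joint density of two independent N(0, 1) variables
    return np.exp(-(a**2 + b**2) / 2) / (2 * np.pi)

p1, p2 = 0.5, -0.3                 # arbitrary point at which to compare
h = 0.05
emp = np.mean((np.abs(y1 - p1) < h) & (np.abs(y2 - p2) < h)) / (2 * h)**2
print("empirical f_y :", emp)
print("formula (7-8) :", fx((p1 + p2) / 2, (p1 - p2) / 2) / 2)
```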
Independence
The random variables x₁, ..., x_n are called (mutually) independent if the events {x₁ ≤ X₁}, ..., {x_n ≤ X_n} are independent. From this it follows that
F(x₁, ..., x_n) = F(x₁) ⋯ F(x_n)        f(x₁, ..., x_n) = f(x₁) ⋯ f(x_n)    (7-10)
EXAMPLE 7-1. Given n independent random variables x_i with respective densities f_i(x_i), we form the random variables
y_k = x₁ + ⋯ + x_k    k = 1, ..., n
…
From (7-10) it follows that any subset of the set x_i is a set of independent random variables. Suppose, for example, that
f(x₁, x₂, x₃) = f(x₁)f(x₂)f(x₃)
Integrating with respect to x₃, we obtain f(x₁, x₂) = f(x₁)f(x₂). This shows that the random variables x₁ and x₂ are independent. Note, however, that if the random variables x_i are independent in pairs, they are not necessarily independent. For example, it is possible that
f(x₁, x₂) = f(x₁)f(x₂)    f(x₁, x₃) = f(x₁)f(x₃)    f(x₂, x₃) = f(x₂)f(x₃)
but f(x₁, x₂, x₃) ≠ f(x₁)f(x₂)f(x₃) (see Prob. 7-2).
Reasoning as in (6-21), we can show that if the random variables x_i are independent, then the random variables y_i = g_i(x_i) are also independent.
EXAMPLE 7-2 (ORDER STATISTICS). The order statistics of the random variables x_i are n random variables y_k defined as follows: For a specific outcome ξ, the random variables x_i take the values x_i(ξ). Ordering these numbers, we obtain the sequence
x_{r1}(ξ) ≤ ⋯ ≤ x_{rk}(ξ) ≤ ⋯ ≤ x_{rn}(ξ)
and we define the random variable y_k such that
y₁(ξ) = x_{r1}(ξ) ≤ ⋯ ≤ y_k(ξ) = x_{rk}(ξ) ≤ ⋯ ≤ y_n(ξ) = x_{rn}(ξ)    (7-13)
We note that for a specific i, the values x_i(ξ) of x_i occupy different locations in the above ordering as ξ changes.
We maintain that the density f_k(y) of the kth order statistic y_k is given by
f_k(y) = [n!/((k - 1)!(n - k)!)] F_x^{k-1}(y) f_x(y) [1 - F_x(y)]^{n-k}    (7-14)
where F_x(x) is the distribution of the i.i.d. random variables x_i and f_x(x) is their density.
Proof. As we know,
f_k(y) dy = P{y < y_k ≤ y + dy}
The event B = {y < y_k ≤ y + dy} occurs iff exactly k - 1 of the random variables x_i are less than y and one is in the interval (y, y + dy) (Fig. 7-1). In the original experiment S, the events
A₁ = {x ≤ y}    A₂ = {y < x ≤ y + dy}    A₃ = {x > y + dy}
form a partition and
P(A₁) = F_x(y)    P(A₂) = f_x(y) dy    P(A₃) = 1 - F_x(y)
In the experiment Sⁿ of n independent repetitions, the event B occurs iff A₁ occurs k - 1 times, A₂ occurs once, and A₃ occurs n - k times. With k₁ = k - 1, k₂ = 1, k₃ = n - k, it follows from (4-102) that
P{B} = [n!/((k - 1)! 1! (n - k)!)] P^{k-1}(A₁) P(A₂) P^{n-k}(A₃)
Special case  If the random variables x_i are i.i.d. exponential with density f_x(x) = λe^{-λx}U(x), then the density of y₁ = min(x₁, ..., x_n) is
f₁(y) = nλe^{-nλy}U(y)
FIGURE 7-1
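Formula (7-14) is easy to verify numerically. The sketch below (illustrative; the choice of uniform samples and of n, k, y0 is arbitrary) compares a histogram estimate of the density of y_k with the formula:

```python
import numpy as np
from math import comb

# Sketch: check formula (7-14) for the k-th order statistic of n i.i.d. U(0, 1) samples.
rng = np.random.default_rng(6)
n, k, trials = 7, 3, 400_000

samples = np.sort(rng.random((trials, n)), axis=1)
yk = samples[:, k - 1]                     # k-th smallest value in each trial

y0, h = 0.4, 0.01                          # evaluate the density near y0 (arbitrary)
emp = np.mean(np.abs(yk - y0) < h) / (2 * h)

# f_k(y) = n!/((k-1)!(n-k)!) * F(y)**(k-1) * f(y) * (1 - F(y))**(n-k); for U(0,1), F(y)=y, f(y)=1
coef = comb(n, k) * k                      # = n!/((k-1)!(n-k)!)
formula = coef * y0**(k - 1) * (1 - y0)**(n - k)
print("empirical:", emp, "  formula:", formula)
```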
EXAMPLE 7-3. A system consists of m components and the time to failure of the ith component is a random variable x_i with distribution F_i(x). Thus
1 - F_i(t) = P{x_i > t}
is the probability that the ith component is good at time t. We denote by n(t) the number of components that are good at time t. Clearly,
n(t) = n₁ + ⋯ + n_m
where
n_i = 1 if x_i > t,    n_i = 0 if x_i ≤ t        E{n_i} = 1 - F_i(t)
Hence the mean E{n(t)} = η(t) of n(t) is given by
η(t) = [1 - F₁(t)] + ⋯ + [1 - F_m(t)]
We shall assume that the random variables x_i have the same distribution F(t). In this case,
η(t) = m[1 - F(t)]
Failure rate  The difference η(t) - η(t + dt) is the expected number of failures in the interval (t, t + dt). The derivative -η′(t) = mf(t) of -η(t) is the rate of failure. The ratio
β(t) = -η′(t)/η(t) = f(t)/[1 - F(t)]    (7-15)
is called the relative expected failure rate. As we see from (6-221), the function β(t) can also be interpreted as the conditional failure rate of each component in the system. Assuming that the system is put into operation at t = 0, we have n(0) = m; hence η(0) = E{n(0)} = m. Solving (7-15) for η(t), we obtain
η(t) = m exp{-∫_0^t β(τ) dτ}
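For i.i.d. exponential components β(t) is the constant λ, so the last formula gives η(t) = m e^{-λt}. The sketch below (illustrative values of m and λ, my own example) compares this with the simulated mean of n(t):

```python
import numpy as np

# Sketch: mean number of good components eta(t) = m*exp(-integral of beta), checked for
# i.i.d. exponential lifetimes (constant failure rate beta(t) = lam).  Values are arbitrary.
rng = np.random.default_rng(7)
m, lam, trials = 50, 0.3, 20_000

lifetimes = rng.exponential(scale=1.0 / lam, size=(trials, m))
for t in (1.0, 3.0, 5.0):
    n_good = (lifetimes > t).sum(axis=1)          # n(t) for each simulated system
    print(t, "simulated E{n(t)}:", n_good.mean(),
          "  eta(t) = m*exp(-lam*t):", m * np.exp(-lam * t))
```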
EXAMPLE 7-4 (MEASUREMENT ERRORS). We measure an object of length η with n instruments of varying accuracies. The results of the measurements are n random variables
x_i = η + ν_i    E{ν_i} = 0    E{ν_i²} = σ_i²
where ν_i are the measurement errors, which we assume independent with zero mean. We shall determine the unbiased, minimum variance, linear estimation of η. This means the following: We wish to find n constants a_i such that the sum
η̂ = a₁x₁ + ⋯ + a_nx_n
is a random variable with mean E{η̂} = a₁E{x₁} + ⋯ + a_nE{x_n} = η and its variance
V = a₁²σ₁² + ⋯ + a_n²σ_n²
is minimum. Thus our problem is to minimize the above sum subject to the constraint
a₁ + ⋯ + a_n = 1    (7-16)
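The constrained minimization can be carried out with a Lagrange multiplier; the standard result is that the optimal weights are proportional to 1/σ_i². The sketch below (the σ_i values are arbitrary, not taken from the text) compares the variance of that weighting with equal weighting:

```python
import numpy as np

# Sketch: unbiased linear estimate eta_hat = sum(a_i * x_i) with sum(a_i) = 1.
# Minimizing sum(a_i**2 * sigma_i**2) gives weights proportional to 1/sigma_i**2
# (standard Lagrange-multiplier result).  The sigma values below are arbitrary.
sigma = np.array([1.0, 2.0, 0.5, 3.0])

a_opt = (1 / sigma**2) / np.sum(1 / sigma**2)     # optimal weights
a_eq = np.full(len(sigma), 1 / len(sigma))        # equal weights, also unbiased

var = lambda a: np.sum(a**2 * sigma**2)
print("optimal weights:", a_opt, "  variance:", var(a_opt))
print("equal weights  :", a_eq,  "  variance:", var(a_eq))
```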
Group independence. We say that the group G_x of the random variables x₁, ..., x_n is independent of the group G_y of the random variables y₁, ..., y_k if
f(x₁, ..., x_n, y₁, ..., y_k) = f(x₁, ..., x_n) f(y₁, ..., y_k)    (7-18)
The mean of a function g(x₁, ..., x_n) is given by
E{g(x₁, ..., x_n)} = ∫_{-∞}^{∞} ⋯ ∫_{-∞}^{∞} g(x₁, ..., x_n) f(x₁, ..., x_n) dx₁ ⋯ dx_n    (7-20)
If the random variables z_i = x_i + jy_i are complex, then the mean of g(z₁, ..., z_n) equals the corresponding integral taken with respect to the joint density f(x₁, y₁, ..., x_n, y_n) of their real and imaginary parts.
CORRELATION AND COVARIANCE MATRICES. The covariance C_ij of two real random variables x_i and x_j is defined as in (6-163). For complex random variables,
C_ij = E{(x_i - η_i)(x_j - η_j)*} = E{x_i x_j*} - E{x_i}E{x_j*}
by definition. The variance of x_i is given by
σ_i² = C_ii = E{|x_i - η_i|²} = E{|x_i|²} - |E{x_i}|²
The random variables x_i are called (mutually) uncorrelated if C_ij = 0 for every i ≠ j. In this case, if
x = x₁ + ⋯ + x_n    (7-21)
then
η_x = η₁ + ⋯ + η_n        σ_x² = σ₁² + ⋯ + σ_n²    (7-22)
Proof. The first equation in (7-22) follows from the linearity of expected values; the second follows from (7-21) because C_ij = 0 for i ≠ j:
σ_x² = E{[(x₁ - η₁) + ⋯ + (x_n - η_n)]²} = Σ_i Σ_j C_ij = C₁₁ + ⋯ + C_nn
For n i.i.d. random variables x_i with variance σ² and fourth central moment μ₄, the sample variance v = [1/(n - 1)] Σ_i (x_i - x̄)², where x̄ = (1/n) Σ_i x_i, satisfies
E{v} = [1/(n - 1)] Σ_{i=1}^{n} E{(x_i - x̄)²} = σ²        σ_v² = (1/n) [μ₄ - ((n - 3)/(n - 1)) σ⁴]
If the random variables x₁, ..., x_n are independent, they are also uncorrelated. This follows as in (6-171) for real random variables. For complex random variables the proof is similar: If the random variables z₁ = x₁ + jy₁ and z₂ = x₂ + jy₂ are independent, then f(x₁, x₂, y₁, y₂) = f(x₁, y₁) f(x₂, y₂). Hence
E{z₁z₂*} = ∫∫∫∫ z₁ z₂* f(x₁, y₁) f(x₂, y₂) dx₁ dy₁ dx₂ dy₂ = E{z₁}E{z₂*}
Rₙ = [R₁₁ ⋯ R₁ₙ ; ⋯ ; Rₙ₁ ⋯ Rₙₙ]        Cₙ = [C₁₁ ⋯ C₁ₙ ; ⋯ ; Cₙ₁ ⋯ Cₙₙ]
where
R_ij = E{x_i x_j*} = R_ji*        C_ij = R_ij - η_iη_j* = C_ji*
The first is the correlation matrix of the random vector X = [x₁, ..., x_n] and the second its covariance matrix. Clearly,
Rₙ = E{XᵗX*}
where Xᵗ is the transpose of X (a column vector). We shall discuss the properties of the matrix Rₙ and its determinant Δₙ. The properties of Cₙ are similar because Cₙ is the correlation matrix of the "centered" random variables x_i - η_i.
If (7-25) is strictly positive, that is, if Q > 0 for any A ≠ 0, then Rₙ is called positive definite.¹ The difference between Q ≥ 0 and Q > 0 is related to the notion of linear dependence.
The correlation determinant. The determinant Δₙ is real because R_ij = R_ji*. We shall show that it is also nonnegative,
Δₙ ≥ 0    (7-29)
with equality iff the random variables x_i are linearly dependent. The familiar inequality Δ₂ = R₁₁R₂₂ - |R₁₂|² ≥ 0 is a special case [see (6-169)].
Suppose, first, that the random variables x_i are linearly independent. We maintain that, in this case, the determinant Δₙ and all its principal minors are positive:
Δₖ > 0    k = 1, ..., n    (7-30)
¹We shall use the abbreviation p.d. to indicate that Rₙ satisfies (7-25). The distinction between Q ≥ 0 and Q > 0 will be understood from the context.
Proof. This is true for n = 1 because Δ₁ = R₁₁ > 0. Since the random variables of any subset of the set {x_i} are linearly independent, we can assume that (7-30) is true for k ≤ n - 1 and we shall show that Δₙ > 0. For this purpose, we form the system
R₁₁a₁ + ⋯ + R₁ₙaₙ = 1
R₂₁a₁ + ⋯ + R₂ₙaₙ = 0    (7-31)
⋯⋯⋯
Rₙ₁a₁ + ⋯ + Rₙₙaₙ = 0
Solving for a₁, we obtain a₁ = Δₙ₋₁/Δₙ, where Δₙ₋₁ is the correlation determinant of the random variables x₂, ..., xₙ. Thus a₁ is a real number. Multiplying the jth equation by a_j* and adding, we obtain
Q = Σ_i Σ_j a_i a_j* R_ij = a₁ = Δₙ₋₁/Δₙ    (7-32)
In this, Q > 0 because the random variables x_i are linearly independent and the left side of (7-27) equals Q. Furthermore, Δₙ₋₁ > 0 by the induction hypothesis; hence Δₙ > 0.
We shall now show that, if the random variables x_i are linearly dependent, then
Δₙ = 0    (7-33)
Proof. In this case, there exists a vector A ≠ 0 such that a₁x₁ + ⋯ + aₙxₙ = 0. Multiplying by x_i* and taking expected values, we obtain
a₁R_{i1} + ⋯ + aₙR_{in} = 0    i = 1, ..., n
This is a homogeneous system satisfied by the nonzero vector A; hence Δₙ = 0.
Note, finally, that
Δₙ ≤ R₁₁R₂₂ ⋯ Rₙₙ    (7-34)
with equality iff the random variables x_i are (mutually) orthogonal, that is, if the matrix Rₙ is diagonal.
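A numeric sanity check of (7-29) and (7-34) — my own illustration using a randomly generated data matrix — shows that the determinant of a sample correlation matrix is positive for linearly independent components, collapses to zero under an exact linear dependence, and never exceeds the product of the diagonal entries:

```python
import numpy as np

# Sketch: Delta_n >= 0 and Delta_n <= R11*R22*...*Rnn for a correlation matrix R = E{X' X*}.
rng = np.random.default_rng(8)
n_vars, n_samples = 4, 100_000

A = rng.standard_normal((n_vars, n_vars))
X = A @ rng.standard_normal((n_vars, n_samples))   # linearly independent components

R = X @ X.T / n_samples                            # sample correlation matrix R_ij = E{x_i x_j}
print("Delta_n          :", np.linalg.det(R))       # positive (components linearly independent)
print("prod of diagonals:", np.prod(np.diag(R)))    # upper bound (7-34)

# With linearly dependent components the determinant collapses to 0.
X_dep = np.vstack([X[:-1], X[0] + X[1]])
R_dep = X_dep @ X_dep.T / n_samples
print("Delta_n, dependent case:", np.linalg.det(R_dep))   # ~ 0 up to rounding error
```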
F(x_{k+1}, ..., x_n | x_k, ..., x₁) = ∫_{-∞}^{x_{k+1}} ⋯ ∫_{-∞}^{x_n} f(α_{k+1}, ..., α_n | x_k, ..., x₁) dα_{k+1} ⋯ dα_n    (7-36)
For example,
f(x₁ | x₂, x₃) = f(x₁, x₂, x₃)/f(x₂, x₃) = ∂F(x₁ | x₂, x₃)/∂x₁
Chain rule  From (7-35) it follows that
f(x₁, ..., x_n) = f(x_n | x_{n-1}, ..., x₁) ⋯ f(x₂ | x₁) f(x₁)    (7-37)
We have shown [see (5-41)] that if x is a random variable with distribution F(x), then the random variable y = F(x) is uniform in the interval (0, 1). The following is a generalization.
Given n arbitrary random variables x_i, we form the random variables
y_k = F(x_k | x_{k-1}, ..., x₁)    k = 1, ..., n    (7-38)
We shall show that these random variables are independent and each is uniform in the interval (0, 1).
SOLUTION
The random variables y_i are functions of the random variables x_i obtained with the transformation (7-38). For 0 ≤ y_i ≤ 1, the system
y₁ = F(x₁)    y₂ = F(x₂ | x₁)    ⋯    y_n = F(x_n | x_{n-1}, ..., x₁)
has a unique solution, and since ∂y_i/∂x_j = 0 for j > i, its Jacobian is a triangular determinant:
J = (∂y₁/∂x₁)(∂y₂/∂x₂) ⋯ (∂y_n/∂x_n)
From (7-5) and (7-35) it follows that …
Generalizing, we obtain the following rule for removing variables on the left or on the right of the conditional line: To remove any number of variables on the left of the conditional line, we integrate with respect to them. To remove any number of variables to the right of the line, we multiply by their conditional density with respect to the remaining variables on the right, and we integrate the product. The following special case is used extensively (Chapman-Kolmogorov equation, see also Chapter 16):
f(x₁ | x₃) = ∫_{-∞}^{∞} f(x₁ | x₂, x₃) f(x₂ | x₃) dx₂    (7-39)
Discrete type  The above rule holds also for discrete type random variables provided that all densities are replaced by probabilities and all integrals by sums. We mention as an example the discrete form of (7-39): If the random variables x₁, x₂, x₃ take the values a_i, b_k, c_r respectively, then
P{x₁ = a_i | x₃ = c_r} = Σ_k P{x₁ = a_i | b_k, c_r} P{x₂ = b_k | c_r}    (7-40)
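The discrete form (7-40) can be verified on a small example. The sketch below (the three-variable p.m.f. is randomly generated, purely for illustration) evaluates both sides:

```python
import numpy as np

# Sketch: check P{x1=a_i | x3=c_r} = sum_k P{x1=a_i | b_k, c_r} * P{x2=b_k | c_r}  (7-40)
# on an arbitrary (randomly generated) joint p.m.f. of three discrete random variables.
rng = np.random.default_rng(9)
p = rng.random((3, 4, 2))
p /= p.sum()                                  # joint p.m.f. P{x1=a_i, x2=b_k, x3=c_r}

i, r = 1, 0                                   # arbitrary values a_i and c_r
p13 = p.sum(axis=1)                           # P{x1, x3}
p23 = p.sum(axis=0)                           # P{x2, x3}
p3 = p.sum(axis=(0, 1))                       # P{x3}

lhs = p13[i, r] / p3[r]
rhs = sum(p[i, k, r] / p23[k, r] * p23[k, r] / p3[r] for k in range(p.shape[1]))
print("left side :", lhs)
print("right side:", rhs)
```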
The conditional mean of x₁, assuming x₂, ..., x_n, is given by the integral
E{x₁ | x₂, ..., x_n} = ∫_{-∞}^{∞} x₁ f(x₁ | x₂, ..., x_n) dx₁    (7-41)
This is a function of x₂, ..., x_n; it defines, therefore, the random variable E{x₁ | x₂, ..., x_n}. Multiplying (7-41) by f(x₂, ..., x_n) and integrating, we conclude that
E{E{x₁ | x₂, ..., x_n}} = E{x₁}    (7-42)
This leads to the following generalization: To remove any number of variables on the right of the conditional expected value line, we multiply by their conditional density with respect to the remaining variables on the right and we integrate the product. For example,
E{x₁ | x₂, x₃} = ∫_{-∞}^{∞} E{x₁ | x₂, x₃, x₄} f(x₄ | x₂, x₃) dx₄    (7-43)
EXAMPLE 7-7. Given a discrete type random variable n taking the values 1, 2, ... and a sequence of random variables x_k independent of n, we form the sum
s = x₁ + ⋯ + x_n = Σ_{k=1}^{n} x_k    (7-46)
This sum is a random variable specified as follows: For a specific ξ, n(ξ) is an integer and s(ξ) equals the sum of the numbers x_k(ξ) for k from 1 to n(ξ). We maintain that if the random variables x_k have the same mean, then
E{s} = ηE{n}    where    E{x_k} = η    (7-47)
Clearly, E{x_k | n = n} = E{x_k} because x_k is independent of n. Hence
E{s | n = n} = E{Σ_{k=1}^{n} x_k | n = n} = Σ_{k=1}^{n} E{x_k} = ηn
From this and (6-239) it follows that
E{s} = E{E{s | n}} = E{ηn}
and (7-47) results.
We show next that if the random variables x_k are uncorrelated with the same variance σ², then
E{s²} = η²E{n²} + σ²E{n}    (7-48)
Reasoning as above, we have
E{s² | n = n} = Σ_{i=1}^{n} Σ_{k=1}^{n} E{x_i x_k}    (7-49)
where
E{x_i x_k} = σ² + η²  if i = k        E{x_i x_k} = η²  if i ≠ k
The double sum in (7-49) contains n terms with i = k and n² - n terms with i ≠ k; hence it equals
(σ² + η²)n + η²(n² - n) = η²n² + σ²n
This yields (7-48) because
E{s²} = E{E{s² | n}} = E{η²n² + σ²n}
Special Case. The number n of particles emitted from a substance in t seconds is a Poisson random variable with parameter λt. The energy x_k of the kth particle has a Maxwell distribution with mean 3kT/2 and variance 3k²T²/2 (see Prob. 7-5). The sum s in (7-46) is the total emitted energy in t seconds. As we know, E{n} = λt, E{n²} = λ²t² + λt [see (5-64)]. Inserting into (7-47) and (7-48), we obtain
E{s} = (3kT/2)λt        E{s²} = (3kT/2)²(λ²t² + λt) + (3k²T²/2)λt
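A simulation sketch of (7-47) and (7-48) — illustrative only; exponential summands stand in for the Maxwell energies, and the parameter values are arbitrary:

```python
import numpy as np

# Sketch: random sum s = x_1 + ... + x_n with n Poisson, checking
# E{s} = eta*E{n} and E{s^2} = eta^2*E{n^2} + sigma^2*E{n}.
rng = np.random.default_rng(10)
lam_t, eta, trials = 4.0, 1.5, 200_000         # E{x_k} = eta; exponential, so sigma^2 = eta^2

n = rng.poisson(lam_t, size=trials)
s = np.array([rng.exponential(eta, size=k).sum() for k in n])

En, En2, sigma2 = lam_t, lam_t**2 + lam_t, eta**2
print("E{s}   sim:", s.mean(),        "  formula:", eta * En)
print("E{s^2} sim:", np.mean(s**2),   "  formula:", eta**2 * En2 + sigma2 * En)
```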
If the random variables x_i are independent and z = x₁ + ⋯ + x_n, then the density of z equals the convolution of their densities:
f_z(z) = f_{x₁}(z) ∗ ⋯ ∗ f_{x_n}(z)    (7-51)
where ∗ denotes convolution.
Proof. Since the random variables x_i are independent and e^{jω_ix_i} depends only on x_i, we conclude from (7-24) that
E{e^{j(ω₁x₁ + ⋯ + ω_nx_n)}} = E{e^{jω₁x₁}} ⋯ E{e^{jω_nx_n}}
Hence
Φ_z(ω) = Φ₁(ω) ⋯ Φ_n(ω)    (7-52)
where Φ_i(ω) is the characteristic function of x_i. Applying the convolution theorem for Fourier transforms, we obtain (7-51).
EXAMPLE 7-8. (a) Bernoulli trials: Using (7-52) we shall rederive the fundamental equation (3-13). We define the random variables x_i as follows: x_i = 1 if heads shows at the ith trial and x_i = 0 otherwise. Thus
P{x_i = 1} = P{h} = p        P{x_i = 0} = P{t} = q    (7-53)
The random variable z = x₁ + ⋯ + x_n takes the values 0, 1, ..., n and {z = k} is the event {k heads in n tossings}. Furthermore,
Φ_{x_i}(ω) = E{e^{jωx_i}} = pe^{jω} + q
The random variables x_i are independent because x_i depends only on the outcome of the ith trial and the trials are independent. Hence [see (7-52) and (7-53)]
Φ_z(ω) = (pe^{jω} + q)ⁿ = Σ_{k=0}^{n} [n!/(k!(n - k)!)] p^k q^{n-k} e^{jkω}
so that P{z = k} equals the coefficient of e^{jkω}, which is (3-13).
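A quick numeric check (my own sketch; n, p, and ω are arbitrary) that the product rule (7-52) and the binomial expansion agree:

```python
import numpy as np
from math import comb

# Sketch: characteristic function of z = x_1 + ... + x_n for Bernoulli(p) trials.
n, p = 10, 0.3
q = 1 - p
omega = 0.7                                    # arbitrary frequency

phi_product = (p * np.exp(1j * omega) + q) ** n                    # (7-52): product of n identical factors
phi_binomial = sum(comb(n, k) * p**k * q**(n - k) * np.exp(1j * omega * k) for k in range(n + 1))

print("product rule   :", phi_product)
print("binomial p.m.f.:", phi_binomial)        # identical, which rederives P{z=k} = C(n,k) p^k q^(n-k)
```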
DEFINITION. The random variables x_i are jointly normal iff the sum a₁x₁ + ⋯ + a_nx_n is a normal random variable for any a₁, ..., a_n.