1. Random Vector
A random vector is a vector of random variables,
$$X(s) = \begin{pmatrix} X_1(s) \\ \vdots \\ X_n(s) \end{pmatrix}$$
A joint pmf or pdf is obtained in the same way as in the random variable case, and so is a marginal pmf or pdf. The relationships between the joint and marginal cdf/pdf are
$$F_X(z) = \int_{x \le z} f_X(x)\,dx = \int_{-\infty}^{z_1} \cdots \int_{-\infty}^{z_n} f_{X_1,\cdots,X_n}(x_1, \cdots, x_n)\,dx_n \cdots dx_1$$
$$F_{X_1}(z_1) = \int_{-\infty}^{z_1} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_{X_1,\cdots,X_n}(x_1, \cdots, x_n)\,dx_n \cdots dx_1$$
$$f_{X_1}(x_1) = \frac{dF_{X_1}(x_1)}{dx_1} = \int \cdots \int_{(x_2,\cdots,x_n) \in \mathbb{R}^{n-1}} f_X(x)\,dx_2 \cdots dx_n$$
(e.g.) $f(x_1, x_2) = 12x_1x_2(1 - x_2)$ when $0 < x_1, x_2 < 1$, and 0 elsewhere.
$$f_{X_1}(x_1) = \int_0^1 12x_1x_2(1 - x_2)\,dx_2 = 2x_1$$
when $0 < x_1 < 1$, and 0 elsewhere.
$$f_{X_2}(x_2) = \int_0^1 12x_1x_2(1 - x_2)\,dx_1 = 6x_2(1 - x_2)$$
when $0 < x_2 < 1$, and 0 elsewhere.
$$f_{X_2|X_1=x_1}(x_2) = \frac{f(x_1, x_2)}{f_{X_1}(x_1)} = \frac{12x_1x_2(1 - x_2)}{2x_1} = 6x_2(1 - x_2)$$
when $0 < x_2 < 1$, and 0 elsewhere.
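The marginal computation above can be checked numerically; the sketch below (not part of the original notes) integrates the joint pdf over $x_2$ with a midpoint Riemann sum and compares against $f_{X_1}(x_1) = 2x_1$.

```python
# Numerical check: integrating f(x1, x2) = 12*x1*x2*(1 - x2) over x2 in (0, 1)
# should recover the marginal f_X1(x1) = 2*x1.

def joint_pdf(x1, x2):
    return 12 * x1 * x2 * (1 - x2)

def marginal_x1(x1, n=10_000):
    """Midpoint-rule approximation of the integral over x2 in (0, 1)."""
    h = 1.0 / n
    return sum(joint_pdf(x1, (j + 0.5) * h) for j in range(n)) * h

for x1 in (0.25, 0.5, 0.9):
    assert abs(marginal_x1(x1) - 2 * x1) < 1e-6
```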
Consider a random vector $Y = g(X)$, where $g : \mathbb{R}^n \to \mathbb{R}^n$. Let $f_X(x)$ be the pdf of $X$. What would be the pdf of $Y$? For $n = 2$, it is
$$f_{Y_1,Y_2}(y_1, y_2) = f_{X_1,X_2}[w_1(y_1, y_2), w_2(y_1, y_2)] \left|\det \begin{pmatrix} \frac{\partial w_1}{\partial y_1}(y_1, y_2) & \frac{\partial w_1}{\partial y_2}(y_1, y_2) \\ \frac{\partial w_2}{\partial y_1}(y_1, y_2) & \frac{\partial w_2}{\partial y_2}(y_1, y_2) \end{pmatrix}\right|$$
when $(y_1, y_2) \in S_Y = g(S_X)$, and 0 otherwise, where $w_1$ and $w_2$ are coordinate-wise inverse functions of $g$. Generally,
$$f_Y(y) = f_X[w_1(y), \cdots, w_n(y)] \left|\det \begin{pmatrix} \frac{\partial w_1}{\partial y_1}(y) & \cdots & \frac{\partial w_1}{\partial y_n}(y) \\ \vdots & & \vdots \\ \frac{\partial w_n}{\partial y_1}(y) & \cdots & \frac{\partial w_n}{\partial y_n}(y) \end{pmatrix}\right|$$
when $y \in S_Y = g(S_X)$, and 0 otherwise, where $w_1, \cdots, w_n$ are similarly defined.
2006 PS2 Q2. $f_X(x_1, x_2) = 2e^{-x_1 - x_2}$ when $0 < x_1 < x_2 < \infty$, and 0 elsewhere. Let $Y_1 = 2X_1$ and $Y_2 = X_2 - X_1$. Find the joint pdf of $Y_1$ and $Y_2$.

Solution. Note first that the support of $Y_1$ is $(0, \infty)$, and that of $Y_2$ is $(0, \infty)$ for every value of $y_1$, so the support of $Y$ is $S_Y = \{y \mid y_1 > 0, y_2 > 0\}$. Obtain the inverse functions as
$$X_1 = \frac{1}{2}Y_1, \qquad X_2 = \frac{1}{2}Y_1 + Y_2$$
Therefore, by the transformation theorem,
$$f_Y(y_1, y_2) = 2e^{-(\frac{1}{2}y_1) - (\frac{1}{2}y_1 + y_2)} \left|\det \begin{pmatrix} \frac{1}{2} & 0 \\ \frac{1}{2} & 1 \end{pmatrix}\right| = e^{-y_1 - y_2}$$
when $y_1, y_2 > 0$, and 0 elsewhere.
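The answer $f_Y(y_1, y_2) = e^{-y_1 - y_2}$ says $Y_1$ and $Y_2$ are i.i.d. Exp(1). A Monte Carlo sketch (the sampling scheme below is derived here, not taken from the notes): the marginal of $X_1$ works out to $2e^{-2x_1}$, i.e. Exp(2), and $X_2 - X_1 \mid X_1$ is Exp(1), so we can sample $X$ that way and check the means of $Y_1$ and $Y_2$.

```python
# Sample (X1, X2) from f_X(x1, x2) = 2 e^{-x1 - x2} on 0 < x1 < x2 and check
# that Y1 = 2 X1 and Y2 = X2 - X1 both have mean ~ 1, as Exp(1) variables do.
import random

random.seed(0)
n = 200_000
y1_sum = y2_sum = 0.0
for _ in range(n):
    x1 = random.expovariate(2.0)        # marginal of X1 is 2 e^{-2 x1}
    x2 = x1 + random.expovariate(1.0)   # X2 - X1 given X1 is Exp(1)
    y1_sum += 2 * x1
    y2_sum += x2 - x1

assert abs(y1_sum / n - 1.0) < 0.02
assert abs(y2_sum / n - 1.0) < 0.02
```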
(e.g.) $f_X(x) = 10x_1x_2^2$ when $0 < x_1 < x_2 < 1$, and 0 elsewhere. Let $Y_1 = X_1/X_2$ and $Y_2 = X_2$. Note that the support of $Y$ is $S_Y = \{y \mid 0 < y_1, y_2 < 1\}$. Since $X_1 = Y_1Y_2$ and $X_2 = Y_2$, $|J| = y_2$, and therefore
$$f_Y(y_1, y_2) = 10(y_1y_2)y_2^2 \cdot y_2 = 10y_1y_2^4$$
when $0 < y_1, y_2 < 1$, and 0 elsewhere.
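As a sanity check on the transformed density (a sketch, not part of the original notes), $10y_1y_2^4$ should integrate to 1 over the unit square; a midpoint Riemann sum confirms it.

```python
# Verify that the transformed density 10 * y1 * y2**4 on (0,1)^2
# integrates to 1, using a midpoint Riemann sum on an n-by-n grid.

def f_y(y1, y2):
    return 10 * y1 * y2 ** 4

n = 400
h = 1.0 / n
total = sum(
    f_y((i + 0.5) * h, (j + 0.5) * h)
    for i in range(n)
    for j in range(n)
) * h * h

assert abs(total - 1.0) < 1e-4
```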
2. Expectation
Expectation of a random vector $X$ is similarly defined as
$$E[X] = \sum_{x \in S_X} x\,p_X(x)$$
if $X$ is discrete, and
$$E[X] = \int \cdots \int_{x \in \mathbb{R}^n} x f_X(x)\,dx$$
if $X$ is continuous. Of course, it is defined only when it indeed exists. We can easily verify that
$$E[X] = \begin{pmatrix} E[X_1] \\ \vdots \\ E[X_n] \end{pmatrix}$$
For example, for continuous $X$, the first coordinate of $E[X]$ is nothing but
$$\int \cdots \int_{x \in \mathbb{R}^n} x_1 f_X(x)\,dx = \int_{-\infty}^{\infty} x_1 \left( \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f_X(x)\,dx_n \cdots dx_2 \right) dx_1 = \int_{-\infty}^{\infty} x_1 f_{X_1}(x_1)\,dx_1 = E[X_1]$$
Important properties of expectation

(1) Expectation is a linear operator. In other words, for any scalars $a$ and $b$, and functions $g$ and $h$,
$$E[ag(X) + bh(X)] = aE[g(X)] + bE[h(X)]$$
In particular, for a nonstochastic vector $a$, $E[a'X] = a'E[X]$.

(3) Let $X$ be an $m \times n$ random matrix. For any nonstochastic $l \times m$ matrix $A$ and $n \times k$ matrix $B$,
$$E[AXB] = AE[X]B$$
and thus, for a random vector $X$,
$$Var[AX] = E\big[(AX - E[AX])(AX - E[AX])'\big] = E\big[A(X - EX)(X - EX)'A'\big] = A\,Var(X)\,A'$$
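The identity $Var[AX] = A\,Var(X)\,A'$ can be verified exactly on a small example (the Bernoulli setup and the matrix $A$ below are illustrative choices, not from the notes), by enumerating all outcomes of a discrete random vector.

```python
# For X1, X2 i.i.d. Bernoulli(1/2) and A = [[1, 1], [1, -1]], check
# Var[AX] = A Var(X) A' by enumerating the four equally likely outcomes.
from itertools import product

A = [[1, 1], [1, -1]]
outcomes = list(product([0, 1], repeat=2))  # each with probability 1/4

def mat_vec(M, v):
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

ys = [mat_vec(A, x) for x in outcomes]
mean = [sum(y[i] for y in ys) / 4 for i in range(2)]

# Var[AX] computed directly from the outcomes of AX.
var_ax = [[sum((y[i] - mean[i]) * (y[j] - mean[j]) for y in ys) / 4
           for j in range(2)] for i in range(2)]

# A Var(X) A' with Var(X) = (1/4) I, since the coordinates are
# independent Bernoulli(1/2); this equals (1/4) A A'.
aat = [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(2)]
       for i in range(2)]
expected = [[aat[i][j] / 4 for j in range(2)] for i in range(2)]

assert var_ax == expected
```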
(4) Consider $Y = X_2|_{X_1 = x_1}$. This is a random variable and its expectation is
$$E[X_2|X_1 = x_1] = \int_{-\infty}^{\infty} x_2 f_{X_2|X_1=x_1}(x_2)\,dx_2$$
which is a constant. What, then, is $E[X_2|X_1]$? It is a function of $X_1$, because it varies with $X_1$; in fact, it is a random variable. What is its expectation? By the law of iterated expectations,
$$E\big[E[X_2|X_1]\big] = E[X_2]$$
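The law of iterated expectations can be checked by direct computation on a small discrete distribution (the pmf below is a made-up example, not from the notes).

```python
# Verify E[E[X2|X1]] = E[X2] for a small discrete joint pmf.

pmf = {  # p(x1, x2)
    (0, 0): 0.10, (0, 1): 0.30,
    (1, 0): 0.25, (1, 1): 0.35,
}

# Marginal of X1.
p_x1 = {}
for (x1, _), p in pmf.items():
    p_x1[x1] = p_x1.get(x1, 0.0) + p

def cond_exp_x2(x1):
    """E[X2 | X1 = x1], a constant for each x1."""
    return sum(x2 * p for (a, x2), p in pmf.items() if a == x1) / p_x1[x1]

# E[E[X2|X1]]: average the conditional expectations over the marginal of X1.
lhs = sum(cond_exp_x2(x1) * p for x1, p in p_x1.items())
rhs = sum(x2 * p for (_, x2), p in pmf.items())  # E[X2] directly

assert abs(lhs - rhs) < 1e-12
```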
3. Independence

Random variables $X_1$ and $X_2$ are independent if
$$p_{X_1,X_2}(x_1, x_2) = p_{X_1}(x_1)p_{X_2}(x_2) \quad \text{for all } (x_1, x_2) \in S_X$$
in the discrete case, and
$$f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1)f_{X_2}(x_2) \quad \text{for all } (x_1, x_2) \in \mathbb{R}^2$$
in the continuous case.
Note that the following are equivalent (TFAE).¹

(1) $f_{X_1,X_2}(x_1, x_2) = f_{X_1}(x_1)f_{X_2}(x_2)$ for all $(x_1, x_2)$
(2) $f_{X_2|X_1=x_1}(x_2) = f_{X_2}(x_2)$ for all $(x_1, x_2)$
(3) There exist functions $g$ and $h$ such that $f_{X_1,X_2}(x_1, x_2) = g(x_1)h(x_2)$ for all $(x_1, x_2)$
(4) $F_{X_1,X_2}(x_1, x_2) = F_{X_1}(x_1)F_{X_2}(x_2)$ for all $(x_1, x_2)$
(5) $\Pr(a < X_1 \le b, c < X_2 \le d) = \Pr(a < X_1 \le b)\Pr(c < X_2 \le d)$ for all $a, b, c, d$
(6) $M_X(t_1, t_2) = M_X(t_1, 0)M_X(0, t_2)$

Independence also implies the following two conditions, though neither is sufficient on its own:

(7) $Cov(X_1, X_2) = 0$
(8) The support of $X_1$ doesn't vary with $X_2$, and vice versa.
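That zero covariance does not imply independence is worth seeing concretely. The sketch below works through the standard counterexample $X$ uniform on $\{-1, 0, 1\}$, $Y = X^2$ (a textbook example, not from these notes).

```python
# X uniform on {-1, 0, 1} and Y = X^2 have Cov(X, Y) = 0 yet are dependent.
from fractions import Fraction

support = [-1, 0, 1]
p = Fraction(1, 3)

ex  = sum(p * x for x in support)            # E[X]  = 0
ey  = sum(p * x * x for x in support)        # E[Y]  = E[X^2] = 2/3
exy = sum(p * x * (x * x) for x in support)  # E[XY] = E[X^3] = 0

cov = exy - ex * ey
assert cov == 0

# Dependence: P(X = 1, Y = 0) = 0, but P(X = 1) * P(Y = 0) = 1/9 != 0.
p_joint = Fraction(0)
p_prod = p * p
assert p_joint != p_prod
```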
4. Miscellaneous

The correlation coefficient of $X$ and $Y$ is
$$\rho_{XY} = \frac{E[(X - EX)(Y - EY)]}{\sqrt{E[(X - EX)^2]}\sqrt{E[(Y - EY)^2]}} = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$$
where $\mu_X = E[X]$ and $\sigma_X = \sqrt{Var(X)} = std(X)$. Note that $|\rho| \le 1$.
2006 mid Q5. Use the Cauchy-Schwarz inequality to prove that $|\rho| \le 1$.
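One way the argument goes (a sketch, assuming $\sigma_X, \sigma_Y > 0$):

```latex
% Cauchy-Schwarz: for random variables U, V with finite second moments,
%   E[UV]^2 <= E[U^2] E[V^2].
% Apply it with U = X - EX and V = Y - EY:
\[
  Cov(X, Y)^2 = E\big[(X - EX)(Y - EY)\big]^2
  \le E\big[(X - EX)^2\big]\,E\big[(Y - EY)^2\big] = \sigma_X^2 \sigma_Y^2
\]
% Dividing both sides by sigma_X^2 sigma_Y^2 gives rho^2 <= 1, i.e. |rho| <= 1.
```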
A symmetric matrix $A$ is positive semidefinite if
$$x'Ax \ge 0 \quad \text{for all } x,$$
and positive definite if
$$x'Ax > 0 \quad \text{for all } x \ne 0.$$
(e.g.) For $B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$,
$$x'Bx = (x_1 + x_2)^2 \ge 0$$
so $B$ is positive semidefinite (but not positive definite, since $x'Bx = 0$ for $x = (1, -1)'$).
The determinant of an $n \times n$ matrix $A$ can be defined by cofactor expansion along any row $i$,
$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_{ij} \det A_{ij}$$
where $A_{ij}$ is the matrix obtained by deleting the $i$'th row and $j$'th column. It is also equivalent to define
$$\det A = \sum_{(k_1,\cdots,k_n)} \varepsilon_{(k_1,\cdots,k_n)} a_{1k_1} \cdots a_{nk_n}$$
where $(k_1, \cdots, k_n)$ is a permutation from $(1, \cdots, n)$ to $(1, \cdots, n)$ and $\varepsilon_{(k_1,\cdots,k_n)}$ is 1 if it is an even permutation, and $-1$ if it is an odd permutation.
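The equivalence of the two definitions can be checked directly on a small matrix (the matrix below is an arbitrary example); the sketch implements both the cofactor expansion and the permutation sum.

```python
# Compute det A by cofactor expansion along the first row and by the
# permutation-sum formula, and confirm they agree.
from itertools import permutations

A = [[2, 0, 1],
     [1, 3, 2],
     [0, 1, 4]]

def minor(M, i, j):
    """Delete row i and column j."""
    return [row[:j] + row[j+1:] for k, row in enumerate(M) if k != i]

def det_cofactor(M):
    if len(M) == 1:
        return M[0][0]
    # Expansion along row 0: sum_j (-1)^j a_{0j} det(M_{0j}).
    return sum((-1) ** j * M[0][j] * det_cofactor(minor(M, 0, j))
               for j in range(len(M)))

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (count inversions)."""
    inv = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
              if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def det_perm(M):
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

assert det_cofactor(A) == det_perm(A) == 21
```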
Theorem 8. Let $X$ be a random variable with continuous cdf $F_X$. Then $Y = F_X(X)$ is a random variable whose distribution is uniform on $[0, 1]$, i.e.,
$$F_Y(y) = \begin{cases} 0 & \text{if } y < 0 \\ y & \text{if } y \in [0, 1] \\ 1 & \text{if } y > 1 \end{cases}$$
Conversely, let $F$ be a cdf and $Y$ be a random variable whose distribution is uniform on $[0, 1]$. Then $Z = F^{-1}(Y)$ has cdf $F$.
Exercise. $E[F_X(X)] = \frac{1}{2}$.
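Theorem 8 and the exercise can be illustrated together by simulation (the choice of $X \sim \text{Exp}(1)$ below is just an example): $F_X(x) = 1 - e^{-x}$, so $Y = F_X(X)$ should be uniform on $[0, 1]$ with mean $1/2$.

```python
# Probability integral transform check: for X ~ Exp(1), Y = F_X(X) = 1 - e^{-X}
# should be Uniform[0, 1], so its sample mean should be close to 1/2.
import math
import random

random.seed(1)
n = 200_000
total = 0.0
for _ in range(n):
    x = random.expovariate(1.0)
    total += 1.0 - math.exp(-x)   # Y = F_X(X)

mean_y = total / n
# Uniform[0,1] has mean 1/2; with n = 200,000 the Monte Carlo error is tiny.
assert abs(mean_y - 0.5) < 0.01
```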
2006 mid Q6. Let $X$ be a random variable whose pdf is $f_X(x)$. Let $Y = \frac{X - \mu}{\sigma}$. What is the pdf of $Y$?
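A sketch of the answer (assuming $\sigma > 0$), using the one-dimensional transformation theorem:

```latex
% Y = (X - mu)/sigma has inverse X = sigma Y + mu, with dx/dy = sigma, so
\[
  f_Y(y) = f_X(\sigma y + \mu)\,\left|\frac{dx}{dy}\right| = \sigma f_X(\sigma y + \mu).
\]
```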
2006 mid Q7. Show that X and Y −E[Y |X] are uncorrelated, in other words, Cov(X, Y −E[Y |X]) =
0.
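One possible line of attack (a sketch using the law of iterated expectations from the expectation section):

```latex
\[
  Cov\big(X, Y - E[Y|X]\big) = E\big[X(Y - E[Y|X])\big] - E[X]\,E\big[Y - E[Y|X]\big]
\]
% The second term vanishes because E[Y - E[Y|X]] = E[Y] - E[E[Y|X]] = 0.
% For the first term, note
\[
  E\big[X\,E[Y|X]\big] = E\big[E[XY|X]\big] = E[XY],
\]
% so E[X(Y - E[Y|X])] = E[XY] - E[XY] = 0, and hence the covariance is zero.
```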
¹ You should be able to prove it without this assumption.