2. Theorem: E(XY) = E(X)E(Y), when X is independent of Y.

Proof:

For discrete random variables X and Y,

    E(XY) = Σ_i Σ_j x_i y_j f_xy(x_i, y_j) = Σ_i Σ_j x_i y_j f_x(x_i) f_y(y_j)
          = ( Σ_i x_i f_x(x_i) ) ( Σ_j y_j f_y(y_j) ) = E(X)E(Y).

If X is independent of Y, the second equality holds, i.e., f_xy(x_i, y_j) = f_x(x_i) f_y(y_j).

For continuous random variables X and Y,

    E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x y f_xy(x, y) dx dy = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x y f_x(x) f_y(y) dx dy
          = ( ∫_{−∞}^{∞} x f_x(x) dx ) ( ∫_{−∞}^{∞} y f_y(y) dy ) = E(X)E(Y).

When X is independent of Y, we have f_xy(x, y) = f_x(x) f_y(y) in the second equality.
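As a numerical aside (not part of the original notes), the discrete case of this theorem can be verified exactly with a short script; the two marginal probability functions below are assumed purely for illustration:

```python
# Hypothetical marginal pmfs for two independent discrete random variables.
from itertools import product

x_pmf = {1: 0.2, 2: 0.5, 3: 0.3}   # f_x(x_i), assumed for illustration
y_pmf = {0: 0.4, 4: 0.6}           # f_y(y_j), assumed for illustration

E_X = sum(x * p for x, p in x_pmf.items())
E_Y = sum(y * p for y, p in y_pmf.items())

# Under independence, f_xy(x_i, y_j) = f_x(x_i) f_y(y_j), so E(XY) is the
# double sum of x_i y_j over the product pmf.
E_XY = sum(x * y * px * py
           for (x, px), (y, py) in product(x_pmf.items(), y_pmf.items()))

assert abs(E_XY - E_X * E_Y) < 1e-12
```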
3. Theorem: Cov(X, Y) = E(XY) − μ_x μ_y.

Proof:

    Cov(X, Y) = E((X − μ_x)(Y − μ_y))
              = E(XY − μ_x Y − μ_y X + μ_x μ_y)
              = E(XY) − μ_x E(Y) − μ_y E(X) + μ_x μ_y
              = E(XY) − μ_x μ_y.

In the second equality, E(aX) = aE(X) and E(X + Y) = E(X) + E(Y) are used, i.e., E(μ_x Y) = μ_x E(Y) and E(μ_y X) = μ_y E(X).
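The equivalence of the two covariance expressions can be checked exactly on a small (hypothetical) joint pmf, here one with dependent X and Y:

```python
# Hypothetical joint pmf f_xy for dependent X and Y, assumed for illustration.
joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())

# Definition: Cov(X, Y) = E((X - mu_x)(Y - mu_y)).
cov_def = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# Theorem: Cov(X, Y) = E(XY) - mu_x * mu_y.
cov_short = sum(x * y * p for (x, y), p in joint.items()) - mu_x * mu_y

assert abs(cov_def - cov_short) < 1e-12
```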
4. Theorem: Cov(X, Y) = 0, when X is independent of Y.

Proof:

When X is independent of Y, we have E(XY) = E(X)E(Y) = μ_x μ_y from the theorem above. Therefore, Cov(X, Y) = E(XY) − μ_x μ_y = 0 is obtained when X is independent of Y.

5. Definition: The correlation coefficient (相関係数) between X and Y, denoted by ρ_xy, is defined as:

    ρ_xy = Cov(X, Y) / ( √V(X) √V(Y) ).

    ρ_xy > 0 ⟹ positive correlation between X and Y
    ρ_xy ⟶ 1 ⟹ strong positive correlation
    ρ_xy < 0 ⟹ negative correlation between X and Y
    ρ_xy ⟶ −1 ⟹ strong negative correlation
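The definition can be exercised on a small joint pmf (the same hypothetical, positively dependent pmf as before); the computed ρ_xy is positive and lies in [−1, 1]:

```python
# Hypothetical joint pmf, assumed for illustration (positively dependent).
from math import sqrt

joint = {(0, 0): 0.3, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.4}

mu_x = sum(x * p for (x, y), p in joint.items())
mu_y = sum(y * p for (x, y), p in joint.items())
var_x = sum((x - mu_x) ** 2 * p for (x, y), p in joint.items())
var_y = sum((y - mu_y) ** 2 * p for (x, y), p in joint.items())
cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in joint.items())

# rho_xy = Cov(X, Y) / (sqrt(V(X)) sqrt(V(Y)))
rho = cov / (sqrt(var_x) * sqrt(var_y))

assert -1.0 <= rho <= 1.0
assert rho > 0   # this pmf puts extra mass on (0,0) and (1,1)
```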
6. Theorem: ρ_xy = 0, when X is independent of Y.

Proof:

When X is independent of Y, we have Cov(X, Y) = 0. We obtain the result:

    ρ_xy = Cov(X, Y) / ( √V(X) √V(Y) ) = 0.

However, note that ρ_xy = 0 does not mean the independence between X and Y.

7. Theorem: V(X ± Y) = V(X) ± 2Cov(X, Y) + V(Y).

Proof:

For both discrete and continuous random variables, V(X ± Y) is rewritten as follows:

    V(X ± Y) = E( ((X ± Y) − E(X ± Y))² )
             = E( ((X − μ_x) ± (Y − μ_y))² )
             = E( (X − μ_x)² ± 2(X − μ_x)(Y − μ_y) + (Y − μ_y)² )
             = E((X − μ_x)²) ± 2E((X − μ_x)(Y − μ_y)) + E((Y − μ_y)²)
             = V(X) ± 2Cov(X, Y) + V(Y).

8. Theorem: −1 ≤ ρ_xy ≤ 1.

Proof:

Consider the following function of t: f(t) = V(Xt − Y), which is always greater than or equal to zero because of the definition of variance. Therefore, for all t, we have f(t) ≥ 0. f(t) is rewritten as follows:

    f(t) = V(Xt − Y) = V(X)t² − 2Cov(X, Y)t + V(Y).

Since f(t) ≥ 0 for all t, the discriminant of this quadratic function of t must be nonpositive, which implies:

    Cov(X, Y)² − V(X)V(Y) ≤ 0,

i.e., ρ_xy² ≤ 1, and therefore −1 ≤ ρ_xy ≤ 1.
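The identity V(X ± Y) = V(X) ± 2Cov(X, Y) + V(Y) can be confirmed exactly on a hypothetical dependent joint pmf (no simulation needed):

```python
# Hypothetical joint pmf of (X, Y), assumed for illustration.
joint = {(0, 0): 0.3, (0, 2): 0.1, (1, 0): 0.2, (1, 2): 0.4}

def ev(g):
    """Expectation of g(X, Y) under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x, mu_y = ev(lambda x, y: x), ev(lambda x, y: y)
var_x = ev(lambda x, y: (x - mu_x) ** 2)
var_y = ev(lambda x, y: (y - mu_y) ** 2)
cov = ev(lambda x, y: (x - mu_x) * (y - mu_y))

# V(X + Y) and V(X - Y) computed directly from the definition of variance.
var_sum = ev(lambda x, y: (x + y - mu_x - mu_y) ** 2)
var_diff = ev(lambda x, y: (x - y - mu_x + mu_y) ** 2)

assert abs(var_sum - (var_x + 2 * cov + var_y)) < 1e-12
assert abs(var_diff - (var_x - 2 * cov + var_y)) < 1e-12
```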
9. Theorem: V(X ± Y) = V(X) + V(Y), when X is independent of Y.

Proof:

From the theorem above, V(X ± Y) = V(X) ± 2Cov(X, Y) + V(Y) generally holds. When random variables X and Y are independent, we have Cov(X, Y) = 0. Therefore, V(X ± Y) = V(X) + V(Y) holds, when X is independent of Y.

10. Theorem: For n random variables X_1, X_2, ···, X_n,

    E( Σ_i a_i X_i ) = Σ_i a_i μ_i,
    V( Σ_i a_i X_i ) = Σ_i Σ_j a_i a_j Cov(X_i, X_j),

where E(X_i) = μ_i and a_i is a constant value. Especially, when X_1, X_2, ···, X_n are mutually independent, we have the following:

    V( Σ_i a_i X_i ) = Σ_i a_i² V(X_i).
Proof:

For the mean of Σ_i a_i X_i, the following representation is obtained:

    E( Σ_i a_i X_i ) = Σ_i E(a_i X_i) = Σ_i a_i E(X_i) = Σ_i a_i μ_i.

The first and second equalities come from the previous theorems on the mean.

For the variance of Σ_i a_i X_i, we can rewrite as follows:

    V( Σ_i a_i X_i ) = E( ( Σ_i a_i (X_i − μ_i) )² )
                     = E( Σ_i a_i (X_i − μ_i) Σ_j a_j (X_j − μ_j) )
                     = E( Σ_i Σ_j a_i a_j (X_i − μ_i)(X_j − μ_j) )
                     = Σ_i Σ_j a_i a_j E( (X_i − μ_i)(X_j − μ_j) )
                     = Σ_i Σ_j a_i a_j Cov(X_i, X_j).
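The variance formula in Theorem 10 can be checked exactly on a small example; the three random variables and constants below are assumed for illustration, defined on an equally likely sample space:

```python
# Values of (X1, X2, X3) on four equally likely outcomes (hypothetical).
outcomes = [(1, 2, 0), (2, 0, 1), (0, 1, 2), (1, 1, 1)]
p = 1.0 / len(outcomes)
a = [2.0, -1.0, 3.0]                # constants a_i, assumed for illustration

mu = [sum(o[i] * p for o in outcomes) for i in range(3)]

def cov(i, j):
    """Cov(X_i, X_j) computed from the joint distribution."""
    return sum((o[i] - mu[i]) * (o[j] - mu[j]) * p for o in outcomes)

# Left-hand side: V(sum_i a_i X_i) directly from the definition of variance.
s_mu = sum(a[i] * mu[i] for i in range(3))
lhs = sum((sum(a[i] * o[i] for i in range(3)) - s_mu) ** 2 * p
          for o in outcomes)

# Right-hand side: double sum of a_i a_j Cov(X_i, X_j).
rhs = sum(a[i] * a[j] * cov(i, j) for i in range(3) for j in range(3))

assert abs(lhs - rhs) < 1e-9
```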
When X_1, X_2, ···, X_n are mutually independent, we obtain Cov(X_i, X_j) = 0 for all i ≠ j from the previous theorem. Therefore, we obtain:

    V( Σ_i a_i X_i ) = Σ_i a_i² V(X_i).

Note that Cov(X_i, X_i) = E((X_i − μ_i)²) = V(X_i).

11. Theorem: n random variables X_1, X_2, ···, X_n are mutually independently and identically distributed with mean μ and variance σ². That is, for all i = 1, 2, ···, n, E(X_i) = μ and V(X_i) = σ² are assumed. Consider the arithmetic average X̄ = (1/n) Σ_{i=1}^{n} X_i. Then, the mean and variance of X̄ are given by:

    E(X̄) = μ,    V(X̄) = σ²/n.
Proof:

The mathematical expectation of X̄ is given by:

    E(X̄) = E( (1/n) Σ_{i=1}^{n} X_i ) = (1/n) E( Σ_{i=1}^{n} X_i ) = (1/n) Σ_{i=1}^{n} E(X_i)
          = (1/n) Σ_{i=1}^{n} μ = (1/n) nμ = μ.

E(aX) = aE(X) in the second equality and E(X + Y) = E(X) + E(Y) in the third equality are utilized, where X and Y are random variables and a is a constant value.

The variance of X̄ is given by:

    V(X̄) = V( (1/n) Σ_{i=1}^{n} X_i ) = (1/n²) V( Σ_{i=1}^{n} X_i ) = (1/n²) Σ_{i=1}^{n} V(X_i)
          = (1/n²) Σ_{i=1}^{n} σ² = (1/n²) nσ² = σ²/n.

We use V(aX) = a²V(X) in the second equality and V(X + Y) = V(X) + V(Y) for X independent of Y in the third equality, where X and Y denote random variables and a is a constant value.
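Theorem 11 can be verified exactly for one tractable case: if the X_i are i.i.d. Bernoulli(p) (a hypothetical choice), then X̄ = S/n with S ~ Binomial(n, p), and E(X̄) = μ, V(X̄) = σ²/n follow by enumeration:

```python
# X_i i.i.d. Bernoulli(p): mu = p, sigma^2 = p(1-p).  Assumed n and p.
from math import comb

n, prob = 5, 0.3
mu, sigma2 = prob, prob * (1 - prob)

# Exact pmf of S = X_1 + ... + X_n ~ Binomial(n, p).
pmf = {s: comb(n, s) * prob**s * (1 - prob)**(n - s) for s in range(n + 1)}

E_xbar = sum((s / n) * p for s, p in pmf.items())
V_xbar = sum((s / n - E_xbar) ** 2 * p for s, p in pmf.items())

assert abs(E_xbar - mu) < 1e-9          # E(Xbar) = mu
assert abs(V_xbar - sigma2 / n) < 1e-9  # V(Xbar) = sigma^2 / n
```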
Let f_x(x) and F_x(x) be the probability density function and the distribution function of X, respectively. Note that F_x(x) = P(X ≤ x) and f_x(x) = F′_x(x).

When X = ψ(Y), we want to obtain the probability density function of Y. Let f_y(y) and F_y(y) be the probability density function and the distribution function of Y, respectively.

In the case of ψ′(X) > 0, the distribution function of Y, F_y(y), is rewritten as follows:

    F_y(y) = P(Y ≤ y) = P( ψ(Y) ≤ ψ(y) ) = P( X ≤ ψ(y) ) = F_x( ψ(y) ).

The first equality is the definition of the cumulative distribution function. The second equality holds because of ψ′(Y) > 0. Therefore, differentiating F_y(y) with respect to y, we can obtain the following expression:

    f_y(y) = F′_y(y) = ψ′(y) F′_x( ψ(y) ) = ψ′(y) f_x( ψ(y) ).    (4)
Next, in the case of ψ′(X) < 0, the distribution function of Y, F_y(y), is rewritten as follows:

    F_y(y) = P(Y ≤ y) = P( ψ(Y) ≥ ψ(y) ) = P( X ≥ ψ(y) )
           = 1 − P( X < ψ(y) ) = 1 − F_x( ψ(y) ).

Thus, in the case of ψ′(X) < 0, pay attention to the second equality, where the inequality sign is reversed. Differentiating F_y(y) with respect to y, we obtain:

    f_y(y) = F′_y(y) = −ψ′(y) f_x( ψ(y) ).    (5)

Note that −ψ′(y) > 0.

Thus, summarizing the above two cases, i.e., ψ′(X) > 0 and ψ′(X) < 0, equations (4) and (5) indicate the following result:

    f_y(y) = |ψ′(y)| f_x( ψ(y) ),

which is called the transformation of variables.
Example 1.9: When X ∼ N(0, 1), we derive the probability density function of Y = μ + σX. Since we have:

    X = ψ(Y) = (Y − μ)/σ,

ψ′(y) = 1/σ is obtained. Therefore, f_y(y) is given by:

    f_y(y) = |ψ′(y)| f_x( ψ(y) ) = (1/(σ√(2π))) exp( −(1/(2σ²)) (y − μ)² ),

which indicates the normal distribution with mean μ and variance σ², denoted by N(μ, σ²).

On the Distribution of Y = X²: As an example, when we know the distribution function of X as F_x(x), we want to obtain the distribution function of Y, F_y(y), where Y = X².

Using F_x(x), F_y(y) is rewritten as follows:

    F_y(y) = P(Y ≤ y) = P(X² ≤ y) = P(−√y ≤ X ≤ √y) = F_x(√y) − F_x(−√y).

The probability density function of Y is obtained as follows:

    f_y(y) = F′_y(y) = (1/(2√y)) ( f_x(√y) + f_x(−√y) ).
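Example 1.9 can be checked numerically at a few points: applying the transformation formula with ψ(y) = (y − μ)/σ to the standard normal density should reproduce the N(μ, σ²) density written out directly (μ and σ below are assumed values for illustration):

```python
from math import exp, pi, sqrt

def f_x(x):
    """Standard normal density, X ~ N(0, 1)."""
    return exp(-x * x / 2) / sqrt(2 * pi)

mu, sigma = 1.5, 2.0              # assumed values for illustration

def f_y_transform(y):
    """Transformation formula: |psi'(y)| f_x(psi(y)) with psi'(y) = 1/sigma."""
    return (1 / sigma) * f_x((y - mu) / sigma)

def f_y_direct(y):
    """N(mu, sigma^2) density written out explicitly."""
    return exp(-((y - mu) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

points = (-3.0, 0.0, 1.5, 4.2)
max_err = max(abs(f_y_transform(y) - f_y_direct(y)) for y in points)
assert max_err < 1e-12
```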
4.2 Multivariate Cases

Multivariate Case: Let f_x(x_1, x_2, ···, x_n) be a joint probability density function of X_1, X_2, ···, X_n. Suppose that a one-to-one transformation from (X_1, X_2, ···, X_n) to (Y_1, Y_2, ···, Y_n) is given by:

    X_1 = ψ_1(Y_1, Y_2, ···, Y_n),
    X_2 = ψ_2(Y_1, Y_2, ···, Y_n),
    ⋮
    X_n = ψ_n(Y_1, Y_2, ···, Y_n).

Then, we obtain a joint probability density function of Y_1, Y_2, ···, Y_n, denoted by f_y(y_1, y_2, ···, y_n), as follows:

    f_y(y_1, y_2, ···, y_n) = |J| f_x( ψ_1(y_1, ···, y_n), ψ_2(y_1, ···, y_n), ···, ψ_n(y_1, ···, y_n) ),

where J is called the Jacobian of the transformation, which is the determinant of the matrix of partial derivatives ∂x_i/∂y_j, i = 1, 2, ···, n, j = 1, 2, ···, n.
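A minimal multivariate sketch (hypothetical linear map, not from the text): with X_1 = Y_1 + Y_2 and X_2 = Y_1 − Y_2, the Jacobian is J = det [[1, 1], [1, −1]] = −2, so |J| = 2. Taking f_x uniform on the unit square, the transformed density f_y = |J| f_x(ψ_1, ψ_2) should still integrate to 1, which a coarse Riemann sum confirms approximately:

```python
# Hypothetical linear transformation X1 = Y1 + Y2, X2 = Y1 - Y2.
def f_x(x1, x2):
    """Uniform joint density on the unit square [0,1] x [0,1]."""
    return 1.0 if 0 <= x1 <= 1 and 0 <= x2 <= 1 else 0.0

abs_J = 2.0                           # |det [[1, 1], [1, -1]]| = |-2|

def f_y(y1, y2):
    """Transformation-of-variables formula: |J| f_x(psi_1, psi_2)."""
    return abs_J * f_x(y1 + y2, y1 - y2)

# Midpoint Riemann sum of f_y over a grid covering the image region;
# the result should be close to 1 (a density integrates to 1).
m, lo, hi = 800, -1.0, 2.0
h = (hi - lo) / m
total = sum(f_y(lo + (i + 0.5) * h, lo + (j + 0.5) * h)
            for i in range(m) for j in range(m)) * h * h
assert abs(total - 1.0) < 0.05
```

The image region has area 1/2 (the unit square shrunk by |J|), and f_y equals 2 there, so the integral is exactly 1; the tolerance only absorbs grid discretization error at the region's boundary.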