Introduction
These notes reflect the material in Chapter 2 of the textbook. We will cover these topics during the period from August 24 to 31.
3. For any finite or countable collection, $\{A_k : k = 1, 2, \ldots\}$, the union is also an event:
$$A = \bigcup_k A_k \in \mathcal{A}.$$
1.2 Probability
The probability is assigned to events $A \in \mathcal{A}$ in such a way that the following conditions are satisfied.
1. $P[A] \geq 0$ for every event $A \in \mathcal{A}$.
2. $P[\Omega] = 1$.
3. For any finite or countable collection of pairwise disjoint events, $\{A_k : k = 1, 2, \ldots\}$, the probability of their union equals the sum of the probabilities:
$$P\Big[\bigcup_k A_k\Big] = \sum_k P[A_k].$$
In particular, if two events, $A$ and $B$, are disjoint, which means that $A \cap B = \emptyset$, then
$$P[A \cup B] = P[A] + P[B].$$
$$P[A \mid B] = P[A \mid B'],$$
which means that, whether $B$ occurs or does not, the conditional probability of $A$ does not depend on it. Using the definition of conditional probability, this can be rephrased in a symmetric form:
$$P[A \cap B] = P[A] \cdot P[B].$$
1.4 Partitions and Bayes' Theorems
A finite or countable collection of pairwise disjoint events,
$$\{B_k : k = 1, 2, \ldots\},$$
is called a partition of $\Omega$ if $\bigcup_k B_k = \Omega$.
Using the rules of probability and the definitions presented above, we conclude that for any event $A$ the following statement is valid.
Theorem.
$$P[A] = \sum_k P[B_k] \cdot P[A \mid B_k] \tag{2}$$
The probabilities assigned to the events $B_k$ in the partition are called prior. The other theorem, discovered by Bayes, allows us to introduce the posterior probabilities as follows.
Theorem. For any event $A$ with $P[A] > 0$ and for any partition $\{B_k : k = 1, 2, \ldots\}$, the conditional probabilities of the events in the partition can be evaluated as
$$P[B_k \mid A] = \frac{P[B_k] \cdot P[A \mid B_k]}{\sum_j P[B_j] \cdot P[A \mid B_j]}. \tag{3}$$
The probabilities described in this formula are posterior: given that event $A$ occurred, we recalculate the probabilities of the partition according to the observed event.
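As a quick numerical illustration of (2) and (3), the following Python sketch computes $P[A]$ and the posterior probabilities for a three-event partition; the priors and conditional probabilities are made-up example numbers, not taken from the text.

```python
# Numerical illustration of the total probability rule (2) and Bayes' formula (3).
# The priors and conditional probabilities below are made-up example numbers.

priors = [0.5, 0.3, 0.2]          # P[B_k] for a partition {B_1, B_2, B_3}
likelihoods = [0.9, 0.5, 0.1]     # P[A | B_k]

# Total probability rule (2): P[A] = sum_k P[B_k] * P[A | B_k]
p_a = sum(p * l for p, l in zip(priors, likelihoods))

# Bayes' formula (3): posterior P[B_k | A] = P[B_k] * P[A | B_k] / P[A]
posteriors = [p * l / p_a for p, l in zip(priors, likelihoods)]

print(f"P[A] = {p_a:.4f}")                               # 0.5*0.9 + 0.3*0.5 + 0.2*0.1 = 0.62
print("posteriors:", [round(p, 4) for p in posteriors])  # they sum to 1
```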
2 Joint, Marginal, and Conditional Distributions
Consider a pair of random variables, $(X, Y)$. Their joint distribution is explored in this section.
If both variables are discrete, their joint probability function, $f^{X,Y}(x, y) = P[X = x, Y = y]$, satisfies two conditions.
1. $f^{X,Y}(x, y) \geq 0$, and $f^{X,Y}(x, y) > 0$ at a finite or countable set of points on the plane.
2. $\sum_x \sum_y f^{X,Y}(x, y) = 1.$
Notice that this double sum formally is taken over the entire plane, $\mathbb{R}^2$.
Using (2), it is possible to introduce conditional distributions as follows:
$$f^{X|Y}(x \mid y) = P[X = x \mid Y = y] = \frac{f^{X,Y}(x, y)}{f^Y(y)}, \tag{4}$$
provided the denominator is positive. If $f^Y(y) = 0$, we set the conditional probability function to zero.
Similarly, under the same convention about a zero denominator,
$$f^{Y|X}(y \mid x) = P[Y = y \mid X = x] = \frac{f^{X,Y}(x, y)}{f^X(x)}.$$
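The following Python sketch illustrates these definitions on a small joint p.m.f. stored as a dictionary; the table values are arbitrary examples.

```python
# A minimal sketch: recovering marginal and conditional p.m.f.s from a joint
# p.m.f. stored as a dictionary {(x, y): probability}. The numbers are made up.

joint = {(0, 0): 0.10, (0, 1): 0.20,
         (1, 0): 0.30, (1, 1): 0.40}   # f^{X,Y}(x, y); the values sum to 1

def marginal_y(y):
    """f^Y(y) = sum over x of f^{X,Y}(x, y)."""
    return sum(p for (x, yy), p in joint.items() if yy == y)

def conditional_x_given_y(x, y):
    """f^{X|Y}(x|y) = f^{X,Y}(x, y) / f^Y(y); zero when f^Y(y) = 0."""
    fy = marginal_y(y)
    return joint.get((x, y), 0.0) / fy if fy > 0 else 0.0

print(conditional_x_given_y(0, 1))  # 0.20 / 0.60 = 1/3
print(conditional_x_given_y(1, 1))  # 0.40 / 0.60 = 2/3
```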
The conditional densities follow the same scenario step by step; hence the distribution of $X$, given $Y = y$, is described by the formula
$$f^{X|Y}(x \mid y) = \frac{f^{X,Y}(x, y)}{f^Y(y)}. \tag{5}$$
Regardless of the distribution type, the rule "marginal times conditional" is valid:
$$f^{X,Y}(x, y) = f^Y(y) \cdot f^{X|Y}(x \mid y) = f^X(x) \cdot f^{Y|X}(y \mid x).$$
Remark. Conditional distributions, discrete or continuous, will be used when we want to proceed with conditional expectations. Formally, equations (4) and (5) look alike, yet they apply to different types of variables.
3 Mixed Case
A mixed situation occurs when the two components, now denoted as $(X, Y = N)$, are such that $X$ is continuous and $N$ is discrete. Assume that $Y = N$ takes integer values.
The joint distribution of $(X, N)$ can be described in terms of the probability function $f^{X,N}(x, n) \geq 0$, which combines the properties of a density and a mass function: it sums over $n$ and integrates over $x$ to one,
$$\sum_n \int_{-\infty}^{\infty} f^{X,N}(x, n) \, dx = 1.$$
The conditional density of $(X \mid N = n)$ is
$$f^{X|N}(x \mid n) = \frac{f^{X,N}(x, n)}{f^N(n)},$$
for $x \in \mathbb{R}$. Notice that in this case the given value, $N = n$, can be viewed as a parameter of this distribution.
Similarly,
$$f^{N|X}(n \mid x) = \frac{f^{X,N}(x, n)}{f^X(x)},$$
for integer $n$, while the given value of $X = x$ is viewed as a parameter of this distribution.
Similar to (4) and (5), restoring the initial notation, $Y = N$, we conclude that
$$f^{X|Y}(x \mid y) = \frac{f^{X,Y}(x, y)}{f^Y(y)}. \tag{6}$$
Remarks
• Equations (4), (5) and the just mentioned (6) all look similar to each other, despite the differences
in types of random variables involved.
4 Conditional Expectations
When the variables $(X, Y)$ are considered as a single pair, their joint distribution can be either determined explicitly, via $f^{X,Y}(x, y)$, or implicitly, using the marginal for one and the conditional for the other. In all these situations, equations (4), (5), and (6) help us proceed with conditional and marginal expectations.
4.1 Purely Discrete Case
We begin with the conditional expectation of $Y$, given $X = x$, which will help establish the important general double expectation rule. Conditioning on $X$ results in
$$E[Y \mid X = x] = \sum_y y \cdot f^{Y|X}(y \mid x) = \frac{\sum_y y \cdot f^{X,Y}(x, y)}{\sum_y f^{X,Y}(x, y)}.$$
Similarly, if $T = T(X, Y)$ is a transformed variable, its conditional expectation is
$$E[T \mid X = x] = \sum_y T(x, y) \cdot f^{Y|X}(y \mid x). \tag{7}$$
In other words, given $X = x$, the conditional expectation of $T$ coincides with that of $T(x, Y)$ with respect to the distribution of $(Y \mid X = x)$. In particular, the conditional variance of $(Y \mid X = x)$ is
$$\mathrm{Var}[Y \mid X = x] = E[Y^2 \mid X = x] - (E[Y \mid X = x])^2.$$
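A short Python sketch of the discrete formulas above, computing $E[Y \mid X = x]$ and $\mathrm{Var}[Y \mid X = x]$ directly from a joint p.m.f. table; the table values are made up for illustration.

```python
# Conditional mean and variance from a joint p.m.f. table (made-up numbers).

joint = {(0, 0): 0.10, (0, 1): 0.20,
         (1, 0): 0.30, (1, 2): 0.40}   # f^{X,Y}(x, y)

def cond_moment(x, power=1):
    """E[Y^power | X = x] = sum_y y^power f(x,y) / sum_y f(x,y)."""
    num = sum((y ** power) * p for (xx, y), p in joint.items() if xx == x)
    den = sum(p for (xx, y), p in joint.items() if xx == x)
    return num / den

e_y = cond_moment(1)                   # E[Y | X = 1]
var_y = cond_moment(1, 2) - e_y ** 2   # Var[Y | X = 1] = E[Y^2|X=1] - (E[Y|X=1])^2
print(e_y, var_y)                      # E ≈ 1.143, Var ≈ 0.980
```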
4.2 Purely Continuous Case
Again, $(X, Y)$ are of the same nature, and the considerations are presented for conditioning on $X$ only. Let $f(x, y)$ denote the joint density function. Recall the identity (5); the conditional expectation of $(Y \mid X = x)$ is
$$E[Y \mid X = x] = \int_{-\infty}^{\infty} y \cdot f^{Y|X}(y \mid x) \, dy = \frac{\int_{-\infty}^{\infty} y \cdot f^{X,Y}(x, y) \, dy}{\int_{-\infty}^{\infty} f^{X,Y}(x, y) \, dy}.$$
Similarly, if $T = T(X, Y)$ is a transformed variable, its conditional expectation is
$$E[T \mid X = x] = \int_{-\infty}^{\infty} T(x, y) \cdot f^{Y|X}(y \mid x) \, dy = \frac{\int_{-\infty}^{\infty} T(x, y) \cdot f(x, y) \, dy}{\int_{-\infty}^{\infty} f(x, y) \, dy}.$$
Again, the interpretation is similar to the purely discrete case and leads to
$$E[T(X, Y) \mid X = x] = E[T(x, Y) \mid X = x]. \tag{8}$$
As before, the conditional variance of $(Y \mid X = x)$ is
$$\mathrm{Var}[Y \mid X = x] = E[Y^2 \mid X = x] - (E[Y \mid X = x])^2.$$
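The ratio-of-integrals formula can be approximated numerically. The sketch below uses the made-up joint density $f(x, y) = x + y$ on the unit square (a valid density there) and a simple Riemann-sum approximation; it is an illustration, not part of the notes' derivations.

```python
# Numerical sketch of E[Y | X = x] as a ratio of integrals over y.
# The joint density f(x, y) = x + y on the unit square is a made-up example.
import numpy as np

def joint_density(x, y):
    return x + y  # integrates to 1 over 0 < x, y < 1

def cond_expectation(x, n=100_000):
    y = np.linspace(0.0, 1.0, n)
    fxy = joint_density(x, y)
    # Ratio of Riemann sums approximates the ratio of integrals
    # (the common grid spacing dy cancels).
    return float(np.sum(y * fxy) / np.sum(fxy))

# Exact value: (x/2 + 1/3) / (x + 1/2); at x = 0.5 this is 7/12 ≈ 0.5833
print(cond_expectation(0.5))
```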
4.3 Mixed Case
4.3.1 Conditioning on N
Given $N = n$, the distribution of $(X \mid N = n)$ has the density
$$f^{X|N}(x \mid n) = \frac{f^{X,N}(x, n)}{f^N(n)},$$
which was mentioned as (6). Thus, given $N = n$, the (conditional) expected value of $X$ is
$$E[X \mid N = n] = \int_{-\infty}^{\infty} x \cdot f^{X|N}(x \mid n) \, dx = \frac{\int_{-\infty}^{\infty} x \cdot f^{X,N}(x, n) \, dx}{\int_{-\infty}^{\infty} f^{X,N}(x, n) \, dx}.$$
4.3.2 Conditioning on X
In this case, the conditional distribution of $(N \mid X = x)$ is needed, so the evaluation of conditional expectations is performed in terms of
$$f^{N|X}(n \mid x) = P[N = n \mid X = x] = \frac{f^{X,N}(x, n)}{f^X(x)}.$$
Writing $p(x, n) = f^{X,N}(x, n)$ for brevity, the conditional expectation of a transformed variable $T = T(X, N)$ is
$$E[T \mid X = x] = \frac{\sum_n T(x, n) \cdot p(x, n)}{\sum_n p(x, n)}.$$
5 Examples for Self-Studies
5.1 Poisson and Conditional Binomial
A discrete random variable $N$ is Poisson distributed with intensity (or rate) parameter $\lambda > 0$; therefore,
$$f^N(n) = P[N = n] = \frac{\lambda^n}{n!} \cdot e^{-\lambda},$$
where $n = 0, 1, \ldots$. Given $N = n$, the random variable $X \mid N = n$ has the binomial distribution, $\mathrm{Bin}[n, q]$, where the parameter $q > 0$ is specified. Assume that when $N = 0$, the value of $X$ is $0$ with probability 1. Also notice that $X$ can take values $0 \leq x \leq n$.
Marginal Distribution of X. To determine $P[X = x]$ one needs to evaluate the following sum:
$$P[X = x] = E\big[P[X = x \mid N]\big] = \sum_{n \geq x} \frac{\lambda^n}{n!} \cdot e^{-\lambda} \cdot \frac{n!}{x! \, (n - x)!} \cdot q^x \cdot (1 - q)^{n - x}.$$
Since the sum is taken over $n \geq x$, substitute the summation variable by $m = n - x \geq 0$. After carrying out common factors that do not depend on the summation variable, this equation results in
$$P[X = x] = \frac{(\lambda q)^x}{x!} \cdot e^{-\lambda} \cdot \sum_{m \geq 0} \frac{(\lambda (1 - q))^m}{m!} = \frac{(\lambda q)^x}{x!} \cdot e^{-\lambda q},$$
so that $X \sim \mathrm{Poisson}[\lambda q]$.
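This marginal result (Poisson thinning) is easy to check by simulation; the sketch below uses arbitrary test values of $\lambda$ and $q$.

```python
# Simulation check: if N ~ Poisson(lam) and (X | N = n) ~ Bin[n, q],
# then X ~ Poisson(lam * q). Parameter values are arbitrary test choices.
import numpy as np

rng = np.random.default_rng(0)
lam, q, trials = 5.0, 0.3, 200_000

n = rng.poisson(lam, size=trials)   # N ~ Poisson(lam)
x = rng.binomial(n, q)              # X | N = n ~ Bin[n, q]

print(x.mean(), x.var())            # both should be close to lam*q = 1.5
```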
Conditional Distribution of (N|X). Notice that the joint p.m.f. of the pair $(X, N)$ is the product of the marginal function $f^N(n)$ and the conditional p.m.f. $f^{X|N}(x \mid n)$, which after simplifications results in
$$p(x, n) = f^N(n) \cdot f^{X|N}(x \mid n) = \frac{\lambda^n}{n!} \cdot e^{-\lambda} \cdot \frac{n!}{x! \, (n - x)!} \cdot q^x \cdot (1 - q)^{n - x},$$
where the integer values are $n \geq 0$ and $0 \leq x \leq n$. Given $X = x$, use the same observation that $N$ can vary from $x$ to $\infty$; therefore, its conditional distribution is characterized as follows:
$$f^{N|X}(n \mid x) = \frac{p(x, n)}{P[X = x]} = \frac{(\lambda (1 - q))^{n - x}}{(n - x)!} \cdot e^{-\lambda (1 - q)},$$
valid for $n \geq x$. This can be interpreted in terms of the variable $W = N - X$, which conditionally, given $X = x$, has the Poisson distribution with rate equal to $\lambda (1 - q)$. In other words, the distribution of $(W = N - X \mid X = x)$ does not depend on $x$.
Conditional Expectation and Variance of (N|X = x). First notice that conditioning on $N$ results in
$$E[X \mid N = n] = n \cdot q,$$
because $(X \mid N = n)$ has the binomial distribution, $\mathrm{Bin}[n, q]$. However, as was shown before, $(W = N - X \mid X = x)$ has the Poisson distribution with rate equal to $\lambda (1 - q)$, which allows us to immediately find the conditional expectation as follows:
$$E[N \mid X = x] = x + E[W \mid X = x] = x + \lambda (1 - q).$$
The conditional variance of $(N - x \mid X = x)$ equals its mean, which is $\lambda(1 - q)$. Finally, the conditional variance of $(N \mid X = x)$ is the same, that is, $\lambda(1 - q)$.
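These conditional moments can also be checked empirically by filtering simulated pairs on $X = x$; the values of $\lambda$, $q$, and $x$ below are arbitrary test choices.

```python
# Simulation check of E[N | X = x] = x + lam*(1 - q) and
# Var[N | X = x] = lam*(1 - q), conditioning by filtering on X = x0.
import numpy as np

rng = np.random.default_rng(1)
lam, q, x0, trials = 5.0, 0.3, 2, 400_000

n = rng.poisson(lam, size=trials)
x = rng.binomial(n, q)

n_given_x = n[x == x0]     # empirical distribution of (N | X = x0)
print(n_given_x.mean())    # ≈ x0 + lam*(1 - q) = 2 + 3.5 = 5.5
print(n_given_x.var())     # ≈ lam*(1 - q) = 3.5
```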
Assume now that $X \sim \mathrm{Poisson}[\lambda]$ and $Y \sim \mathrm{Poisson}[\mu]$ are independent, and set $Z = X + Y$.
Joint Distribution of (X, Z). Notice that $Z = X + Y \geq X$ due to the fact that $Y \geq 0$. Therefore the joint p.m.f. of $(X, Z)$ is positive only for $z \geq x$. Also, for $z \geq x$,
$$P[(X = x) \cap (Z = z)] = P[(X = x) \cap (Y = z - x)] = P[X = x] \cdot P[Y = z - x],$$
so that
$$f^{X,Z}(x, z) = \Big[\frac{\lambda^x}{x!} \cdot e^{-\lambda}\Big] \cdot \Big[\frac{\mu^{z - x}}{(z - x)!} \cdot e^{-\mu}\Big].$$
Conditional Distribution of (X|Z). Recall that $Z = X + Y \sim \mathrm{Poisson}[\lambda + \mu]$. Division of the joint p.m.f. by the p.m.f. of $Z$ results in
$$f^{X|Z}(x \mid z) = \frac{z!}{x! \, (z - x)!} \cdot \Big(\frac{\lambda}{\lambda + \mu}\Big)^x \cdot \Big(\frac{\mu}{\lambda + \mu}\Big)^{z - x},$$
which means that conditionally $(X \mid Z = z)$ has the binomial distribution, $\mathrm{Bin}[z, q]$, with
$$q = \frac{\lambda}{\lambda + \mu}.$$
Conditional Covariance between (X, Y). Recall that the covariance is defined as
$$\mathrm{Cov}[X, Y] = E[X \cdot Y] - E[X] \cdot E[Y].$$
Conditioning on $Z = z$ can be handled similarly: since $(X \mid Z = z)$ has the binomial distribution, its conditional expectation is $z \cdot q$, and similarly, $E[Y \mid Z = z] = z \cdot (1 - q)$. To find the conditional expectation of the product, $(X \cdot Y \mid Z = z)$, notice that given $Z = z$, it is the same as for the product $X(z - X)$. Thus we obtain:
$$E[X \cdot Y \mid Z = z] = z^2 \cdot q(1 - q) - z \cdot q(1 - q) = z(z - 1) \, q(1 - q).$$
Finally, the covariance is
$$\mathrm{Cov}[(X, Y) \mid Z = z] = z(z - 1) \, q(1 - q) - zq \cdot z(1 - q) = -z \, q(1 - q).$$
Alternatively, notice that $\mathrm{Var}[Z \mid Z = z] = 0$, and using the identity valid for the variance of a sum, conclude that
$$\mathrm{Cov}[(X, Y) \mid Z = z] = -\frac{1}{2} \cdot \big(\mathrm{Var}[X \mid Z = z] + \mathrm{Var}[Y \mid Z = z]\big) = -z \, q(1 - q).$$
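A simulation sketch of the last two results, conditioning on $Z = z$ by filtering; $\lambda$, $\mu$, and $z$ are arbitrary test values.

```python
# Simulation check: with X ~ Poisson(lam), Y ~ Poisson(mu) independent and
# Z = X + Y, (X | Z = z) should be Bin[z, q] with q = lam/(lam + mu), and
# Cov[(X, Y) | Z = z] should equal -z*q*(1 - q).
import numpy as np

rng = np.random.default_rng(2)
lam, mu, z0, trials = 2.0, 3.0, 6, 500_000

x = rng.poisson(lam, size=trials)
y = rng.poisson(mu, size=trials)
mask = (x + y) == z0                # condition on Z = z0

q = lam / (lam + mu)
print(x[mask].mean())               # ≈ z0*q = 2.4
xc = x[mask] - x[mask].mean()
yc = y[mask] - y[mask].mean()
print((xc * yc).mean())             # ≈ -z0*q*(1 - q) = -1.44
```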
5.3 Gamma-Distributions
Assume first that (X, Y ) are independent with densities
$$f^X(x) = \frac{1}{\Gamma(a)} \cdot x^{a-1} \cdot e^{-x} \quad \text{and} \quad f^Y(y) = \frac{1}{\Gamma(b)} \cdot y^{b-1} \cdot e^{-y},$$
which means that X ∼ Gamma [a, 1] and Y ∼ Gamma [b, 1], where a > 0 and b > 0 are given shape
parameters.
Distribution of the Sum. Since $(X, Y)$ are independent, their joint density is simply the product of the marginal ones:
$$f^{X,Y}(x, y) = \frac{1}{\Gamma(a) \cdot \Gamma(b)} \cdot x^{a-1} \cdot y^{b-1} \cdot e^{-(x + y)}$$
for any $x > 0$, $y > 0$, and zero elsewhere. The density of $Z = X + Y$ was derived in the Probability Theory course and led to $Z \sim \mathrm{Gamma}[a + b, 1]$, or
$$f^Z(z) = \frac{1}{\Gamma(a + b)} \cdot z^{a+b-1} \cdot e^{-z}.$$
For those willing to validate it, use the substitution
$$u = \frac{x}{z}, \qquad 1 - u = \frac{y}{z},$$
and integrate the joint density $f^{X,Y}(x, y)$ in $x$ from $0$ to $z = x + y > 0$.
Conditional Distribution of (X|Z = z). The distribution of $(X \mid Z = z)$ is obtained as the ratio
$$f^{X|Z}(x \mid z) = \frac{f^{X,Y}(x, z - x)}{f^Z(z)}.$$
In terms of the rescaled variable $T = X/Z$, this gives
$$f^{T|Z}(t \mid z) = \frac{\Gamma(a + b)}{\Gamma(a) \cdot \Gamma(b)} \cdot t^{a-1} \cdot (1 - t)^{b-1}$$
for $0 < t < 1$. Since it does not depend on $z$, the variables $T$ and $Z$ are independent.
Conditional Distribution of (Z|X = x). Since $Z - X = Y$ is independent of $X$, the variable $(Z - x \mid X = x)$ has the same distribution as $Y$, which is $\mathrm{Gamma}[b, 1]$.
More generally, if $X \sim \mathrm{Gamma}[a, \lambda]$ and $Y \sim \mathrm{Gamma}[b, \lambda]$ are independent, then the rescaled variables
$$X_1 = \lambda \cdot X \quad \text{and} \quad Y_1 = \lambda \cdot Y$$
are still independent, each of them having the scale parameter equal to 1. Therefore, $Z = X + Y \sim \mathrm{Gamma}[a + b, \lambda]$. Also,
$$T = \frac{X}{X + Y} = \frac{X_1}{X_1 + Y_1} \sim \mathrm{Beta}[a, b],$$
and $T$ is independent of $Z$.
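A quick simulation sketch of these three facts; the shape parameters below are arbitrary test values.

```python
# Simulation check: for independent X ~ Gamma[a, 1] and Y ~ Gamma[b, 1],
# Z = X + Y ~ Gamma[a+b, 1] and T = X/(X+Y) ~ Beta[a, b], with T independent of Z.
import numpy as np

rng = np.random.default_rng(3)
a, b, trials = 2.0, 5.0, 500_000

x = rng.gamma(a, 1.0, size=trials)
y = rng.gamma(b, 1.0, size=trials)
z, t = x + y, x / (x + y)

print(z.mean(), z.var())        # both ≈ a + b = 7 (moments of Gamma[7, 1])
print(t.mean())                 # ≈ a/(a + b) = 2/7 ≈ 0.2857 (Beta mean)
print(np.corrcoef(t, z)[0, 1])  # ≈ 0, consistent with independence
```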
5.5 Two Exponentially Distributed Independent Variables
If $X$ and $Y$ are independent with a common exponential density,
$$f(x) = \lambda \cdot e^{-\lambda x}, \qquad x > 0,$$
then, referring to the previous case with $a = b = 1$, the following conclusions can be made.
1. $Z = X + Y \sim \mathrm{Gamma}[2, \lambda]$, having the density
$$f^Z(z) = \lambda^2 \cdot z \cdot e^{-\lambda z}.$$
2. $T = X/(X + Y) \sim \mathrm{Beta}[1, 1]$, that is, $T$ is uniform on $(0, 1)$ and independent of $Z$.
Assume now that $X \sim \mathrm{Exp}[\lambda]$ and that, given $X = x$, the discrete variable $N$ has the Poisson distribution with rate $x$.
Conditional Distribution of (X|N = n). Conditionally, given $N = n$, the variable $X$ has the p.d.f. equal to the ratio
$$f^{X|N}(x \mid n) = \frac{1}{P[N = n]} \cdot \Big[\frac{x^n}{n!} \cdot e^{-x}\Big] \cdot \lambda \, e^{-\lambda x} = \frac{(\lambda + 1)^{n+1}}{n!} \cdot x^n \cdot e^{-(\lambda + 1) x},$$
which is $\mathrm{Gamma}[n + 1, \lambda + 1]$.
Expectation and Variance. The conditional expectation of $(X \mid N = n)$ is
$$E[X \mid N = n] = \frac{n + 1}{\lambda + 1}.$$
The conditional variance of $(X \mid N = n)$ is
$$\mathrm{Var}[X \mid N = n] = \frac{n + 1}{(\lambda + 1)^2}.$$
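These posterior moments can be verified by simulating the mixed model and filtering on $N = n$; $\lambda$ and $n$ below are arbitrary test values.

```python
# Simulation check of the mixed model: X ~ Exp(lam) and, given X = x,
# N ~ Poisson(x). Then (X | N = n) ~ Gamma[n+1, lam+1].
import numpy as np

rng = np.random.default_rng(4)
lam, n0, trials = 1.5, 3, 500_000

x = rng.exponential(1.0 / lam, size=trials)  # numpy parametrizes Exp by scale = 1/lam
n = rng.poisson(x)                           # N | X = x ~ Poisson(x)

x_given_n = x[n == n0]
print(x_given_n.mean())   # ≈ (n0 + 1)/(lam + 1) = 4/2.5 = 1.6
print(x_given_n.var())    # ≈ (n0 + 1)/(lam + 1)^2 = 4/6.25 = 0.64
```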
Assume next that, given $Q = q$, the count $N$ has the geometric distribution, $P[N = n \mid Q = q] = (1 - q)^n \cdot q$ for $n = 0, 1, \ldots$.
Marginal Distribution of N. Since $Q$ is uniformly distributed over the unit interval, its density is $f^Q(q) = 1$ for $0 < q < 1$ and zero elsewhere. So the marginal p.m.f. of $N$ is
$$f^N(n) = P[N = n] = \int_0^1 (1 - q)^n \cdot q \, dq = \frac{1}{(n + 1)(n + 2)}.$$
Expectation and Variance. The conditional expectation and variance of $(Q \mid N = n)$ can be found directly by using the facts about the Beta distribution, since $(Q \mid N = n) \sim \mathrm{Beta}[2, n + 1]$. Thus the expectation is
$$E[Q \mid N = n] = (n + 1)(n + 2) \int_0^1 q^2 (1 - q)^n \, dq = \frac{2}{2 + n + 1} = \frac{2}{n + 3}.$$
The variance can be found as
$$\mathrm{Var}[Q \mid N = n] = \frac{2 \, (n + 1)}{(n + 3)^2 \, (n + 4)}.$$
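A simulation sketch of this Beta posterior, again conditioning by filtering; the value of $n$ is an arbitrary test choice. Note that numpy's geometric generator counts trials up to the first success, so the number of failures is obtained by subtracting one.

```python
# Simulation check: Q ~ Uniform(0, 1) and, given Q = q, N counts failures
# before the first success, P[N = n | Q = q] = (1-q)^n * q.
# Then (Q | N = n) ~ Beta[2, n+1].
import numpy as np

rng = np.random.default_rng(5)
n0, trials = 2, 500_000

q = rng.uniform(0.0, 1.0, size=trials)
n = rng.geometric(q) - 1        # shift: numpy's geometric counts trials, not failures

q_given_n = q[n == n0]
print(q_given_n.mean())         # ≈ 2/(n0 + 3) = 0.4
print(q_given_n.var())          # ≈ 2*(n0 + 1)/((n0 + 3)^2 * (n0 + 4)) = 0.04
```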
5.9 Random Sums
By now, we can slightly simplify our common notation. Notice that when two random variables $(X, Y)$ are considered, the conditional distribution and moments of $T = T(X, Y)$, given $X = x$ or $Y = y$, are simply transformations of the variable that defines the condition. A common situation arises when the variables $N$ and $\{X_k : k \geq 1\}$ are independent and the $X$-values share the same distribution with finite first and second moments.
Assume that
$$\mu = E[X_k] \quad \text{and} \quad \sigma^2 = \mathrm{Var}[X_k]$$
are finite. Also make a similar assumption about the moments of $N$, so that
$$\nu = E[N] \quad \text{and} \quad \tau^2 = \mathrm{Var}[N]$$
are finite. Consider the random sum
$$S = \sum_{k=1}^{N} X_k,$$
so its randomness is due to the two possible reasons: that in the $X$-values and that in the value of $N$. When $N = 0$ with a positive probability, set $S = 0$ by default.
5.9.1 Expectation of a Random Sum
Conditioning on $N = n$ fixes the number of terms, so
$$E[S \mid N = n] = n \cdot \mu, \qquad \text{or} \qquad E[S \mid N] = N \cdot \mu. \tag{11}$$
By the double expectation rule, it follows that $E[S] = E\big[E[S \mid N]\big] = \mu \cdot \nu$.
5.9.2 Variance of a Random Sum
In the Probability Theory course, for any two variables, say $(T, W)$, the fundamental identity was established:
$$\mathrm{Var}[T] = E\big[\mathrm{Var}[T \mid W]\big] + \mathrm{Var}\big[E[T \mid W]\big].$$
Applying it with $T = S$ and $W = N$, and using $E[S \mid N] = N \cdot \mu$ and $\mathrm{Var}[S \mid N] = N \cdot \sigma^2$, we conclude that
$$\mathrm{Var}[S] = \sigma^2 \cdot \nu + \mu^2 \cdot \tau^2.$$
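A simulation sketch of both random-sum formulas. Here the terms $X_k$ are $\mathrm{Exp}(1)$ (so $\mu = \sigma^2 = 1$) and $N \sim \mathrm{Poisson}[4]$ (so $\nu = \tau^2 = 4$); both are arbitrary test choices.

```python
# Simulation check of E[S] = mu*nu and Var[S] = sigma^2*nu + mu^2*tau^2
# for S = X_1 + ... + X_N with N independent of the X's.
import numpy as np

rng = np.random.default_rng(6)
nu, trials = 4.0, 100_000

n = rng.poisson(nu, size=trials)
# Each S is a sum of a random number of Exp(1) terms (empty sum gives 0).
s = np.array([rng.exponential(1.0, size=k).sum() for k in n])

print(s.mean())   # ≈ mu*nu = 4
print(s.var())    # ≈ sigma^2*nu + mu^2*tau^2 = 4 + 4 = 8
```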
5.10 Exercises
5.10.1 Random Poisson Sum
In addition to the common assumptions, suppose that $N \sim \mathrm{Poisson}[\lambda]$ and consider
$$S = \sum_{k=1}^{N} X_k.$$
Show that
$$E[S] = \mu \cdot \lambda \quad \text{and} \quad \mathrm{Var}[S] = (\sigma^2 + \mu^2) \cdot \lambda.$$
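A quick numerical check of the claimed answers (not a substitute for the proof): with uniform terms $X_k \sim \mathrm{Uniform}(0, 2)$ we have $\mu = 1$ and $\sigma^2 = 1/3$; $\lambda = 3$ is an arbitrary test value.

```python
# Numerical check for the Random Poisson Sum exercise.
import numpy as np

rng = np.random.default_rng(7)
lam, trials = 3.0, 100_000

n = rng.poisson(lam, size=trials)
s = np.array([rng.uniform(0.0, 2.0, size=k).sum() for k in n])

print(s.mean())   # ≈ mu*lam = 3
print(s.var())    # ≈ (sigma^2 + mu^2)*lam = (1/3 + 1)*3 = 4
```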