
Statistics for Data Science - 2

Notes - Week 1 and 2


Multiple Random Variables

1. Joint probability mass function: Suppose X and Y are discrete random variables
defined in the same probability space. Let the range of X and Y be TX and TY ,
respectively. The joint PMF of X and Y , denoted fXY , is a function from TX × TY
to [0, 1] defined as

fXY (t1 , t2 ) = P (X = t1 and Y = t2 ), t1 ∈ TX , t2 ∈ TY

• The joint PMF is usually written as a table or a matrix.


• P (X = t1 and Y = t2 ) is denoted P (X = t1 , Y = t2 )

2. Marginal PMF: Suppose X and Y are jointly distributed discrete random variables
with joint PMF fXY. The PMFs of the individual random variables X and Y are
called marginal PMFs. It can be shown that

fX(t1) = P(X = t1) = Σ_{t2 ∈ TY} fXY(t1, t2)

fY(t2) = P(Y = t2) = Σ_{t1 ∈ TX} fXY(t1, t2)

Note: Given the joint PMF, the marginal is unique.
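Marginalisation is mechanical when the joint PMF is stored as a table. A minimal Python sketch (the joint PMF values below are made up for illustration):

```python
import numpy as np

# Hypothetical joint PMF of X (rows) and Y (columns); all entries sum to 1.
f_XY = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])

f_X = f_XY.sum(axis=1)   # marginal of X: sum over t2 in T_Y, gives ≈ [0.4, 0.6]
f_Y = f_XY.sum(axis=0)   # marginal of Y: sum over t1 in T_X, gives ≈ [0.35, 0.35, 0.30]

print(f_X, f_Y, f_X.sum())   # each marginal is a valid PMF summing to 1
```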

3. Conditional distribution given an event: Suppose X is a discrete random variable


with range TX , and A is an event in the same probability space. The conditional PMF
of X given A is defined as the PMF

fX|A(t) = P(X = t | A), t ∈ TX
We will denote the conditional random variable by X|A. (Note that X|A is a valid
random variable with PMF fX|A ).

• fX|A(t) = P((X = t) ∩ A) / P(A)
• Range of (X|A) can be different from TX and will depend on A.
4. Conditional distribution of one random variable given another:
Suppose X and Y are jointly distributed discrete random variables with joint PMF
fXY. The conditional PMF of Y given X = x is defined as the PMF

fY|X=x(y) = P(X = x, Y = y) / P(X = x) = fXY(x, y) / fX(x)

We will denote the conditional random variable by Y |(X = x). (Note that Y |(X = x)
is a valid random variable with PMF fY|X=x.)

• Range of (Y |X = x) can be different from TY and will depend on x.


• fXY(x, y) = fY|X=x(y) · fX(x) = fX|Y=y(x) · fY(y)

• Σ_{y ∈ TY} fY|X=x(y) = 1
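Numerically, conditioning on X = x picks out a row of the joint PMF table and renormalises it by the marginal. A sketch with an assumed joint PMF:

```python
import numpy as np

# Hypothetical joint PMF of X (rows) and Y (columns).
f_XY = np.array([[0.10, 0.20, 0.10],
                 [0.25, 0.15, 0.20]])

x = 0                              # condition on X = 0 (row index)
f_X = f_XY.sum(axis=1)             # marginal PMF of X
f_Y_given_x = f_XY[x] / f_X[x]     # f_{Y|X=x}(y) = f_XY(x, y) / f_X(x)

print(f_Y_given_x)        # ≈ [0.25, 0.50, 0.25]
print(f_Y_given_x.sum())  # 1.0 — the conditional PMF sums to 1
```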

5. Joint PMF of more than two discrete random variables:


Suppose X1, X2, . . . , Xn are discrete random variables defined in the same probability
space. Let the range of Xi be TXi. The joint PMF of X1, X2, . . . , Xn, denoted by
fX1X2...Xn, is a function from TX1 × TX2 × . . . × TXn to [0, 1] defined as

fX1 X2 ...Xn (t1 , t2 , . . . , tn ) = P (X1 = t1 , X2 = t2 , . . . , Xn = tn ); ti ∈ TXi

6. Marginal PMF in case of more than two discrete random variables:


Suppose X1, X2, . . . , Xn are jointly distributed discrete random variables with joint
PMF fX1X2...Xn. The PMFs of the individual random variables X1, X2, . . . , Xn are
called marginal PMFs. It can be shown that

fX1(t1) = P(X1 = t1) = Σ_{t2 ∈ TX2, t3 ∈ TX3, . . . , tn ∈ TXn} fX1X2...Xn(t1, t2, . . . , tn)

fX2(t2) = P(X2 = t2) = Σ_{t1 ∈ TX1, t3 ∈ TX3, . . . , tn ∈ TXn} fX1X2...Xn(t1, t2, . . . , tn)

⋮

fXn(tn) = P(Xn = tn) = Σ_{t1 ∈ TX1, t2 ∈ TX2, . . . , tn−1 ∈ TXn−1} fX1X2...Xn(t1, t2, . . . , tn)

7. Marginalisation: Suppose X1 , X2 , . . . , Xn are jointly distributed discrete random


variables with joint PMF fX1 X2 ...Xn . The joint PMF of the random variables Xi1 , Xi2 , . . . Xik ,
denoted by fXi1 Xi2 ...Xik is given by
fXi1Xi2...Xik(ti1, ti2, . . . , tik) = Σ_{tj ∈ TXj for all j ∉ {i1, i2, . . . , ik}} fX1X2...Xn(t1, t2, . . . , tn)

• Sum over everything you don’t want.
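In code, "sum over everything you don't want" is a sum over the unwanted axes of a multi-dimensional array. A sketch with a randomly generated three-variable joint PMF:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((2, 3, 4))
f /= f.sum()                  # a valid joint PMF of (X1, X2, X3)

f_X1X3 = f.sum(axis=1)        # marginalise out X2: sum over t2
f_X2 = f.sum(axis=(0, 2))     # marginalise out X1 and X3

print(f_X1X3.shape)           # (2, 4) — joint PMF of (X1, X3)
print(f_X2.sum())             # 1.0 (up to floating-point error)
```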

8. Conditioning with multiple discrete random variables:

• A wide variety of conditioning is possible when there are many random variables.
Some examples are:
• Suppose X1, X2, X3, X4 ∼ fX1X2X3X4 and xi ∈ TXi, then
– fX1|X2=x2(x1) = fX1X2(x1, x2) / fX2(x2)
– fX1,X2|X3=x3(x1, x2) = fX1X2X3(x1, x2, x3) / fX3(x3)
– fX1|X2=x2,X3=x3(x1) = fX1X2X3(x1, x2, x3) / fX2X3(x2, x3)
– fX1X4|X2=x2,X3=x3(x1, x4) = fX1X2X3X4(x1, x2, x3, x4) / fX2X3(x2, x3)
9. Conditioning and factors of the joint PMF:
Let X1, X2, X3, X4 ∼ fX1X2X3X4, ti ∈ TXi.

fX1X2X3X4(t1, t2, t3, t4) = P(X1 = t1 and (X2 = t2, X3 = t3, X4 = t4))
                          = fX1|X2=t2,X3=t3,X4=t4(t1) P(X2 = t2 and (X3 = t3, X4 = t4))
                          = fX1|X2=t2,X3=t3,X4=t4(t1) fX2|X3=t3,X4=t4(t2) P(X3 = t3 and X4 = t4)
                          = fX1|X2=t2,X3=t3,X4=t4(t1) fX2|X3=t3,X4=t4(t2) fX3|X4=t4(t3) fX4(t4).
• Factoring can be done in any sequence.
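The factorisation can be verified numerically. A sketch for three variables (a random joint PMF stands in for a real one); the four-variable case adds one more factor:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.random((2, 2, 2))
f /= f.sum()               # joint PMF of (X1, X2, X3)

f_23 = f.sum(axis=0)       # f_{X2 X3}
f_3 = f.sum(axis=(0, 1))   # f_{X3}

for t1 in range(2):
    for t2 in range(2):
        for t3 in range(2):
            # f(t1,t2,t3) = f_{X1|X2=t2,X3=t3}(t1) * f_{X2|X3=t3}(t2) * f_{X3}(t3)
            prod = (f[t1, t2, t3] / f_23[t2, t3]) * (f_23[t2, t3] / f_3[t3]) * f_3[t3]
            assert np.isclose(f[t1, t2, t3], prod)
print("chain-rule factorisation holds at every point")
```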
10. Independence of two random variables:
Let X and Y be two random variables defined in a probability space with ranges TX
and TY , respectively. X and Y are said to be independent if any event defined using
X alone is independent of any event defined using Y alone. Equivalently, if the joint
PMF of X and Y is fXY, then X and Y are independent if and only if

fXY(x, y) = fX(x) fY(y)

for all x ∈ TX and y ∈ TY.

• X and Y are independent if and only if

fX|Y=y(x) = fX(x)
fY|X=x(y) = fY(y)

for all x ∈ TX and y ∈ TY (wherever the conditional PMFs are defined).

• To show that X and Y are independent, verify that

fXY(x, y) = fX(x) fY(y)

for all x ∈ TX and y ∈ TY.

• To show that X and Y are dependent, find some x ∈ TX and y ∈ TY for which

fXY(x, y) ≠ fX(x) fY(y)

– Special case: if fXY(t1, t2) = 0 at a point where fX(t1) ≠ 0 and fY(t2) ≠ 0,
then X and Y are dependent (see the sketch below).
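Since TX × TY is finite here, independence can be checked exhaustively by comparing every entry of the table against the product of the marginals. A minimal sketch (both example tables are made up):

```python
import numpy as np

def are_independent(f_XY, tol=1e-12):
    """True iff f_XY(x, y) = f_X(x) f_Y(y) for every cell of the table."""
    f_X = f_XY.sum(axis=1)
    f_Y = f_XY.sum(axis=0)
    return np.allclose(f_XY, np.outer(f_X, f_Y), atol=tol)

# Independent: the table is exactly the outer product of its marginals.
print(are_independent(np.outer([0.4, 0.6], [0.5, 0.5])))    # True

# Dependent: zeros where both marginals are nonzero (the special case above).
print(are_independent(np.array([[0.5, 0.0],
                                [0.0, 0.5]])))              # False
```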
11. Independence of multiple random variables:
Let X1 , X2 , . . . , Xn be random variables defined in a probability space with range of
Xi denoted TXi . X1 , X2 , . . . , Xn are said to be independent if events defined using
different Xi are mutually independent. Equivalently, X1, X2, . . . , Xn are independent
if and only if

fX1X2...Xn(t1, t2, . . . , tn) = fX1(t1) fX2(t2) . . . fXn(tn)

for all ti ∈ TXi.
• Every subset of a collection of independent random variables is itself independent.
12. Independent and Identically Distributed (i.i.d.) random variables:
Random variables X1 , X2 , . . . , Xn are said to be independent and identically distributed
(i.i.d.), if
(i) they are independent.
(ii) the marginal PMFs fXi are identical.
Examples:
• Repeated independent trials of an experiment create an i.i.d. sequence of random variables:
– Toss a coin multiple times.
– Throw a die multiple times.
• Let X1, X2, . . . , Xn be i.i.d., each distributed as X ∼ Geometric(p).
X takes values in {1, 2, . . .} with
P(X = k) = (1 − p)^{k−1} p

Since the Xi's are independent and identically distributed, we can write

P(X1 > j, X2 > j, . . . , Xn > j) = P(X1 > j) P(X2 > j) . . . P(Xn > j)
                                 = [P(X > j)]^n

P(X > j) = Σ_{k=j+1}^{∞} (1 − p)^{k−1} p
         = (1 − p)^j p + (1 − p)^{j+1} p + (1 − p)^{j+2} p + . . .
         = (1 − p)^j p [1 + (1 − p) + (1 − p)^2 + . . .]
         = (1 − p)^j p · 1 / (1 − (1 − p))
         = (1 − p)^j

⇒ P(X1 > j, X2 > j, . . . , Xn > j) = [P(X > j)]^n = (1 − p)^{jn}
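A simulation sketch supporting this calculation (p, n and j are arbitrary choices; numpy's geometric generator uses the same support {1, 2, . . .} as above):

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, j, trials = 0.3, 4, 2, 200_000

# Each row is one realisation of (X1, ..., Xn) with Xi ~ Geometric(p).
X = rng.geometric(p, size=(trials, n))
estimate = (X > j).all(axis=1).mean()   # fraction of trials with every Xi > j

print(estimate)                  # close to the exact value below
print((1 - p) ** (j * n))        # (1 - p)^{jn} = 0.7^8 ≈ 0.0576
```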

13. Functions of a random variable: Let X be a random variable with PMF fX.
Then f(X) is a random variable whose PMF is given as follows:

ff(X)(a) = P(f(X) = a) = P(X ∈ {t : f(t) = a}) = Σ_{t : f(t) = a} fX(t)

• PMF of f (X) can be found using PMF of X.
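A minimal sketch of this computation (the PMF of X is made up): probabilities of all t with the same image f(t) are accumulated into one value of the new PMF.

```python
from collections import defaultdict

# Hypothetical PMF of X on {-2, -1, 0, 1, 2}.
f_X = {-2: 0.1, -1: 0.2, 0: 0.4, 1: 0.2, 2: 0.1}

def pmf_of_function(f_X, f):
    """PMF of f(X): add up f_X(t) over all t with the same value f(t)."""
    pmf = defaultdict(float)
    for t, prob in f_X.items():
        pmf[f(t)] += prob
    return dict(pmf)

print(pmf_of_function(f_X, lambda t: t * t))
# {4: 0.2, 1: 0.4, 0: 0.4} — e.g. P(X^2 = 4) = P(X = -2) + P(X = 2)
```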


14. Functions of multiple random variables (g(X1 , X2 , . . . , Xn )):
Suppose X1 , X2 , . . . , Xn have joint PMF fX1 X2 ...Xn with TXi denoting the range of Xi .
Let g : TX1 × TX2 × . . . × TXn → R be a function with range Tg. The PMF of
X = g(X1, X2, . . . , Xn) is given by

fX(t) = P(g(X1, X2, . . . , Xn) = t) = Σ_{(t1, . . . , tn) : g(t1, . . . , tn) = t} fX1X2...Xn(t1, t2, . . . , tn)

• Sum of two random variables taking integer values:


X, Y ∼ fXY, Z = X + Y. Let z be some integer. Then

P(Z = z) = P(X + Y = z)
         = Σ_{x=−∞}^{∞} P(X = x, Y = z − x)
         = Σ_{x=−∞}^{∞} fXY(x, z − x)
         = Σ_{y=−∞}^{∞} fXY(z − y, y)


• Convolution: If X and Y are independent, fX+Y(z) = Σ_{x=−∞}^{∞} fX(x) fY(z − x)
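For PMFs stored as arrays over consecutive integers, this convolution is exactly what numpy.convolve computes. A sketch with two fair dice (assumed uniform PMFs):

```python
import numpy as np

# PMFs of two independent fair dice, supported on {1, ..., 6}.
f_X = np.full(6, 1 / 6)
f_Y = np.full(6, 1 / 6)

# f_{X+Y}(z) = sum_x f_X(x) f_Y(z - x); the result is supported on {2, ..., 12}.
f_Z = np.convolve(f_X, f_Y)

print(f_Z[7 - 2])   # P(X + Y = 7) = 6/36 ≈ 0.1667 (index 0 corresponds to z = 2)
```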

• Let X ∼ Poisson(λ1) and Y ∼ Poisson(λ2) be independent, and let Z = X + Y,
z ∈ {0, 1, 2, . . .}. Then

Z ∼ Poisson(λ1 + λ2)

(X | Z = n) ∼ Binomial(n, λ1/(λ1 + λ2)),  (Y | Z = n) ∼ Binomial(n, λ2/(λ1 + λ2))
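Both facts can be sanity-checked by simulation (the λ values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
lam1, lam2, trials = 2.0, 3.0, 200_000

X = rng.poisson(lam1, trials)
Y = rng.poisson(lam2, trials)
Z = X + Y

print(Z.mean(), Z.var())   # both close to lam1 + lam2 = 5, as for Poisson(5)

# Conditional on Z = n, X should look Binomial(n, lam1 / (lam1 + lam2)).
print(X[Z == 5].mean())    # close to 5 * (2 / 5) = 2.0
```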
15. CDF of a random variable:
Cumulative distribution function of a random variable X is a function FX : R → [0, 1]
defined as
FX (x) = P (X ≤ x)

16. Minimum of two random variables:
Let X, Y ∼ fXY and let Z = min{X, Y }, then

fZ(z) = P(Z = z) = P(min{X, Y} = z)
      = P(X = z, Y = z) + P(X = z, Y > z) + P(X > z, Y = z)
      = fXY(z, z) + Σ_{t2 > z} fXY(z, t2) + Σ_{t1 > z} fXY(t1, z)

FZ(z) = P(Z ≤ z) = P(min{X, Y} ≤ z)
      = 1 − P(min{X, Y} > z)
      = 1 − P(X > z, Y > z)
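The PMF formula translates line by line into code. A sketch with a random joint PMF on {0, 1, 2, 3} × {0, 1, 2, 3} standing in for a real one:

```python
import numpy as np

rng = np.random.default_rng(4)
f_XY = rng.random((4, 4))
f_XY /= f_XY.sum()          # a valid joint PMF; rows index X, columns index Y

def pmf_min(f_XY, z):
    # P(X = z, Y = z) + P(X = z, Y > z) + P(X > z, Y = z)
    return f_XY[z, z] + f_XY[z, z + 1:].sum() + f_XY[z + 1:, z].sum()

print(sum(pmf_min(f_XY, z) for z in range(4)))   # 1.0 — a valid PMF for min(X, Y)
```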

17. Maximum of two random variables:


Let X, Y ∼ fXY and let Z = max{X, Y }, then

fZ(z) = P(Z = z) = P(max{X, Y} = z)
      = P(X = z, Y = z) + P(X = z, Y < z) + P(X < z, Y = z)
      = fXY(z, z) + Σ_{t2 < z} fXY(z, t2) + Σ_{t1 < z} fXY(t1, z)

FZ(z) = P(Z ≤ z) = P(max{X, Y} ≤ z)
      = P(X ≤ z, Y ≤ z)

18. Maximum and Minimum of n i.i.d. random variables

• Let X ∼ Geometric(p) and Y ∼ Geometric(q) be independent, and let
Z = min(X, Y). Then

Z ∼ Geometric(1 − (1 − p)(1 − q))
• Maximum of 2 independent geometric random variables is not geometric.
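A simulation sketch supporting the claim about the minimum (p and q are arbitrary): Z > k requires both X > k and Y > k, which has probability [(1 − p)(1 − q)]^k, exactly the tail of a Geometric(1 − (1 − p)(1 − q)).

```python
import numpy as np

rng = np.random.default_rng(5)
p, q, trials = 0.3, 0.5, 200_000

Z = np.minimum(rng.geometric(p, trials), rng.geometric(q, trials))
r = 1 - (1 - p) * (1 - q)                # = 0.65

print((Z == 1).mean(), r)                # P(Z = 1) ≈ r
print((Z > 2).mean(), (1 - r) ** 2)      # tail of Z matches the Geometric(r) tail
```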

Important Points:

1. Let N ∼ Poisson(λ) and X|N = n ∼ Binomial(n, p), then X ∼ Poisson(λp)
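This is sometimes called Poisson "thinning". A simulation sketch (λ and p are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)
lam, p, trials = 4.0, 0.25, 200_000

N = rng.poisson(lam, trials)
X = rng.binomial(N, p)      # X | N = n ~ Binomial(n, p), vectorised over n

print(X.mean(), X.var())    # both close to lam * p = 1.0, as for Poisson(lam * p)
```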

2. Memoryless property of Geometric(p):
If X ∼ Geometric(p), then

P (X > m + n|X > m) = P (X > n)
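A quick numerical check of this identity using P(X > k) = (1 − p)^k (values of p, m, n assumed):

```python
p, m, n = 0.3, 2, 3

# For X ~ Geometric(p) on {1, 2, ...}: P(X > k) = (1 - p)^k.
lhs = (1 - p) ** (m + n) / (1 - p) ** m   # P(X > m + n | X > m)
rhs = (1 - p) ** n                        # P(X > n)

print(abs(lhs - rhs) < 1e-12)             # True
```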

3. Sum of n independent Bernoulli(p) trials is Binomial(n, p).

4. Sum of 2 independent Uniform random variables is not Uniform.

5. Sum of independent Binomial(n, p) and Binomial(m, p) is Binomial(n + m, p).

6. Sum of r i.i.d. Geometric(p) is Negative-Binomial(r, p).

7. Sum of independent Negative-Binomial(r, p) and Negative-Binomial(s, p) is
Negative-Binomial(r + s, p).

8. If X and Y are independent, then g(X) and h(Y ) are also independent.

