
CHAPTER 2

Discrete Random Variables

§ 1. Random Variable and Expectation

Let $(\Omega, \mathcal{F}, P)$ be a probability space. We assume that $\Omega$ is discrete, i.e., countable (finite or infinite). Note that $\mathcal{F}$ can be any $\sigma$-field contained in the power set $2^{\Omega}$. A random variable $X$ is a "function" $X : \Omega \to \mathbb{R}$ such that $\{\omega \in \Omega : X(\omega) = r\} \in \mathcal{F}$ for any $r \in \mathbb{R}$. Since $\Omega$ is discrete, $X$ takes at most countably many values. For most of this chapter, we take $X$ to be integer valued.
A random variable $X$ is characterised by its distribution, i.e., by the probabilities $P(X = x)$ for the different values of $x$. Two random variables are said to be identically distributed if their distributions are the same, i.e.,
$$
P(X = x) = P(Y = x)
$$
for each $x \in \mathbb{R}$. In such a case we write $X \stackrel{d}{=} Y$.

Two random variables $X$ and $Y$ can be equal as "functions", i.e., $X(\omega) = Y(\omega)$ for each $\omega \in \Omega$. However, the most useful concept, apart from the distributional equivalence above, is almost sure equivalence. Two random variables $X$ and $Y$ are said to be almost surely equal if
$$
P\{\omega \in \Omega : X(\omega) \ne Y(\omega)\} = 0,
$$
and we write $X = Y$ a.s.


An important point to note here is that distributional equivalence does not require the random variables to be defined on the same sample space.
The smallest $\sigma$-field $\mathcal{F}_X \subseteq \mathcal{F}$ containing the events $\{\omega \in \Omega : X(\omega) = x\}$ for each $x \in \mathbb{R}$ is called the $\sigma$-field generated by $X$ (or information space generated by $X$).
Two random variables $X$ and $Y$ are said to be independent if $\mathcal{F}_X$ and $\mathcal{F}_Y$ are independent, i.e., $P(A \cap B) = P(A)P(B)$ for any $A \in \mathcal{F}_X$ and $B \in \mathcal{F}_Y$. In other words,
$$
P(X = x, Y = y) = P(X = x)\,P(Y = y)
$$
for any $x, y \in \mathbb{R}$.
Independence of random variables extends naturally to a family of random variables. A
family of random variables is said to be pairwise independent if any two of the random variables
are independent.
The expectation of the random variable $X$ is defined by
$$
EX = \sum_{x \in \mathbb{R}} x\, P(X = x).
$$
Since $X$ takes at most countably many values, the above sum makes "sense" (whenever it converges absolutely). The variance of a random variable is given by
$$
Var(X) = E(X - EX)^2.
$$
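As a quick numerical companion, here is a minimal Python sketch of these two definitions for a finitely supported pmf; the dictionary representation and the fair-die example are our own illustration, not part of the text.

```python
# Sketch: expectation and variance of a discrete random variable whose
# distribution is given as a dict {value: probability}.

def expectation(pmf):
    # EX = sum_x x * P(X = x)
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    # Var(X) = E(X - EX)^2
    mu = expectation(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

die = {x: 1 / 6 for x in range(1, 7)}  # a fair die, as a running example
print(expectation(die))  # 3.5
print(variance(die))     # 35/12 ~ 2.9167
```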

(1.1). Theorem. Let $X, Y$ be two random variables and $c \in \mathbb{R}$. Then $E(X + Y) = EX + EY$ and $E(cX) = cEX$.

Proof. Consider
$$
\begin{aligned}
E(X + Y) &= \sum_z z\, P(X + Y = z) = \sum_z \sum_y z\, P(X + Y = z \mid Y = y)\, P(Y = y) \\
&= \sum_z \sum_y z\, P(X = z - y)\, P(Y = y) \\
&= \sum_z \sum_y (z - y)\, P(X = z - y)\, P(Y = y) + \sum_z \sum_y y\, P(X = z - y)\, P(Y = y) \\
&= \sum_y \Bigl[\sum_z (z - y)\, P(X = z - y)\Bigr] P(Y = y) + \sum_y y \Bigl[\sum_z P(X = z - y)\Bigr] P(Y = y) \\
&= \sum_y (EX)\, P(Y = y) + \sum_y y \cdot 1 \cdot P(Y = y) \\
&= EX + EY.
\end{aligned}
$$
(The second equality replaces $P(X + Y = z \mid Y = y)$ by $P(X = z - y)$, which is valid when $X$ and $Y$ are independent; in general one sums $x + y$ over the joint distribution of $(X, Y)$.) Next,
$$
E(cX) = \sum_x x\, P(cX = x) = \sum_x c\,\frac{x}{c}\, P\Bigl(X = \frac{x}{c}\Bigr) = c\,EX \qquad (c \ne 0;\ \text{the case } c = 0 \text{ is trivial}). \qquad \square
$$
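To see the theorem in action, here is a small numerical check (a sketch; the two-dice example is our own): the pmf of $X + Y$ is assembled from the product pmf of two independent dice, and $E(X+Y)$ and $E(3X)$ are compared with $EX + EY$ and $3EX$.

```python
# Sketch checking linearity of expectation on two independent fair dice.
from collections import defaultdict

die = {x: 1 / 6 for x in range(1, 7)}
E = lambda pmf: sum(x * p for x, p in pmf.items())

pmf_sum = defaultdict(float)
for x, px in die.items():
    for y, py in die.items():
        pmf_sum[x + y] += px * py  # P(X + Y = z) under independence

print(E(pmf_sum), E(die) + E(die))                        # 7.0 and 7.0
print(E({3 * x: p for x, p in die.items()}), 3 * E(die))  # 10.5 and 10.5
```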

An important property of nonnegative integer valued random variables is the following:

(1.2). Proposition. Let $X$ be a nonnegative integer valued random variable. Then
$$
E(X) = \sum_{n=1}^{\infty} P(X \ge n).
$$
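A quick sanity check of the proposition (a sketch; the $Bin(10, 0.3)$ example is our own choice):

```python
# Sketch: E X versus the tail sum  sum_{m>=1} P(X >= m)  for X ~ Bin(10, 0.3).
from math import comb

n, p = 10, 0.3
pmf = {k: comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

EX = sum(k * w for k, w in pmf.items())
tail_sum = sum(sum(w for k, w in pmf.items() if k >= m) for m in range(1, n + 1))
print(EX, tail_sum)  # both equal np = 3.0, up to floating-point error
```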

An important property of convex functions is the following:

(1.3). Theorem (Jensen's Inequality). For any convex function $f$,
$$
f(E[X]) \le E[f(X)].
$$
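For instance, with the convex function $f(x) = x^2$ the inequality reads $(EX)^2 \le EX^2$; a two-point sketch (our own example distribution):

```python
# Sketch of Jensen's inequality with f(x) = x^2: f(EX) <= E f(X).
pmf = {0: 0.5, 4: 0.5}
EX = sum(x * p for x, p in pmf.items())     # 2.0
Ef = sum(x**2 * p for x, p in pmf.items())  # 8.0
print(EX**2, Ef)  # 4.0 <= 8.0, as Jensen predicts
```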
(1.4). Example. Bernoulli random variable: $X \sim Ber(p)$ if
$$
X = \begin{cases} 1 & \text{with probability } p \\ 0 & \text{with probability } 1 - p. \end{cases}
$$
Note that $EX = p$. Find $Var(X)$.
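A two-line numerical sketch for the exercise (the value $p = 0.3$ is our own choice); compare the printed variance with your closed-form answer.

```python
# Sketch: EX and Var(X) for X ~ Ber(p), computed directly from the pmf.
p = 0.3
pmf = {1: p, 0: 1 - p}
EX = sum(x * w for x, w in pmf.items())
VarX = sum((x - EX) ** 2 * w for x, w in pmf.items())
print(EX, VarX)  # EX = 0.3; compare VarX with your formula
```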

(1.5). Example. Binomial random variable: $X \sim Bin(n, p)$ if
$$
P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \qquad k = 0, 1, \ldots, n.
$$
Observe that $X$ can be realised as $X_1 + X_2 + \cdots + X_n$ where $X_i$, $i = 1, 2, \ldots, n$, are i.i.d. (independent and identically distributed) with Bernoulli distribution $Ber(p)$. Find the expectation and variance.
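The representation as a sum of Bernoullis also suggests a simulation (a sketch with parameters of our own choosing); the sample mean and variance can be compared with the answers to the exercise.

```python
# Sketch: simulate X ~ Bin(n, p) as a sum of n independent Bernoulli(p)
# trials, then estimate EX and Var(X) empirically.
import random

random.seed(0)
n, p, trials = 20, 0.4, 100_000
samples = [sum(1 for _ in range(n) if random.random() < p) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((s - mean) ** 2 for s in samples) / trials
print(mean, var)  # compare with the expectation and variance you computed
```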

An important consequence of the independence of two random variables is the following:

(1.6). Theorem. Let $X, Y$ be two independent random variables. Then
$$
E(XY) = EX \cdot EY.
$$
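A direct check on small (hypothetical) pmfs, using the product form $P(X = x, Y = y) = P(X = x)P(Y = y)$ of independence:

```python
# Sketch checking E(XY) = EX * EY for independent X and Y.
pX = {1: 0.2, 2: 0.5, 5: 0.3}
pY = {0: 0.4, 3: 0.6}

EXY = sum(x * y * px * py for x, px in pX.items() for y, py in pY.items())
EX = sum(x * w for x, w in pX.items())
EY = sum(y * w for y, w in pY.items())
print(EXY, EX * EY)  # equal, up to floating-point error
```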

§ 2. Bernoulli’s Law of Large Numbers

Let $X \sim Bin(n, p)$ where $0 < p < 1$. Bernoulli proved the following: for every $\varepsilon > 0$,
$$
\lim_{n \to \infty} P\Bigl(\Bigl|\frac{X}{n} - p\Bigr| > \varepsilon\Bigr) = 0.
$$
This is known as Bernoulli's law of large numbers. In modern language, it is the weak law of large numbers for Bernoulli random variables.

Let $p_X(i) := P(X = i)$ and write $q := 1 - p$. Then
$$
\frac{p_X(i)}{p_X(i-1)} = \frac{(n - i + 1)\,p}{i\,q}.
$$
Thus $p_X(i) \ge p_X(i-1)$ if and only if $(n+1)p \ge i$.
Let $r \ge (n+1)p$ and $k \ge 1$. Then
$$
\frac{p_X(r+k)}{p_X(r+k-1)} \le \frac{(n-r)\,p}{(r+k)\,q} \le \frac{(n-r)\,p}{r\,q}.
$$
Set $R = \frac{(n-r)p}{rq}$; since $r \ge (n+1)p > np$, we have $R < 1$. Now
$$
\begin{aligned}
P(X \ge r) &= \sum_{k=0}^{n-r} p_X(r+k) \\
&= \sum_{k=0}^{n-r} p_X(r)\,\frac{p_X(r+1)}{p_X(r)} \cdots \frac{p_X(r+k)}{p_X(r+k-1)} \\
&\le \sum_{k=0}^{n-r} p_X(r)\, R^k \\
&\le p_X(r)\,\frac{1}{1-R}.
\end{aligned}
$$
Since $r \ge (n+1)p$, we have $r \ge \lfloor (n+1)p \rfloor$, and hence $p_X$ is non-increasing on $[\lfloor (n+1)p \rfloor, r]$. Therefore, for $\lfloor (n+1)p \rfloor \le i \le r$, $p_X(r) \le p_X(i)$. Now
$$
1 = \sum_{i=0}^{n} p_X(i) \ge \sum_{i=\lfloor (n+1)p \rfloor}^{r} p_X(i) \ge \bigl(r - \lfloor (n+1)p \rfloor + 1\bigr)\, p_X(r).
$$
This implies that $p_X(r) \le \frac{1}{r - np}$ (since $r - \lfloor (n+1)p \rfloor + 1 \ge r - np$), and since $\frac{1}{1-R} = \frac{rq}{r - np}$, we conclude that
$$
P(X \ge r) \le \frac{rq}{(r - np)^2}.
$$

For large $n$ we have $n\varepsilon > p$ (by the Archimedean property), so that $r := \lceil np + n\varepsilon \rceil \ge (n+1)p$. Then
$$
P\Bigl(\frac{X}{n} - p > \varepsilon\Bigr) = P(X > np + n\varepsilon) \le P\bigl(X \ge \lceil np + n\varepsilon \rceil\bigr) \le \frac{\lceil np + n\varepsilon \rceil\, q}{\bigl(\lceil np + n\varepsilon \rceil - np\bigr)^2} \le \frac{pq}{\varepsilon^2 n} + \frac{q}{\varepsilon n} + \frac{q}{\varepsilon^2 n^2}.
$$
Hence $P\bigl(\frac{X}{n} - p > \varepsilon\bigr) \to 0$ as $n \to \infty$.

For the other tail, consider $Y = n - X$. Then $Y \sim Bin(n, q)$ and
$$
P\Bigl(\frac{X}{n} - p < -\varepsilon\Bigr) = P\Bigl(\frac{n - Y}{n} - p < -\varepsilon\Bigr) = P\Bigl(\frac{Y}{n} - q > \varepsilon\Bigr).
$$
Applying the above estimate to $Y$ (with $p$ and $q$ interchanged), we obtain
$$
P\Bigl(\frac{X}{n} - p < -\varepsilon\Bigr) \le \frac{pq}{\varepsilon^2 n} + \frac{p}{\varepsilon n} + \frac{p}{\varepsilon^2 n^2}.
$$
Combining the two estimates (and using $p + q = 1$), we have
$$
P\Bigl(\Bigl|\frac{X}{n} - p\Bigr| > \varepsilon\Bigr) \le \frac{2pq}{\varepsilon^2 n} + \frac{1}{\varepsilon n} + \frac{1}{\varepsilon^2 n^2} \to 0 \quad \text{as } n \to \infty.
$$
This proves Bernoulli's law of large numbers.
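The final bound can be compared with the exact binomial tail probability; a sketch (our parameters):

```python
# Sketch: the bound 2pq/(eps^2 n) + 1/(eps n) + 1/(eps^2 n^2) from the proof
# versus the exact probability P(|X/n - p| > eps) for X ~ Bin(n, p).
from math import comb

def exact(n, p, eps):
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k)
               for k in range(n + 1) if abs(k / n - p) > eps)

p, q, eps = 0.3, 0.7, 0.1
for n in (50, 200, 1000):
    bound = 2 * p * q / (eps**2 * n) + 1 / (eps * n) + 1 / (eps**2 * n**2)
    print(n, exact(n, p, eps), bound)  # both tend to 0; the bound is crude
```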

§ 3. Moment and Markov Inequality

(3.1). Theorem (Markov's Inequality). Let $X$ be a nonnegative random variable. Then
$$
P(X \ge a) \le \frac{EX}{a}
$$
for any $a > 0$.

Proof. For an event $A$, define the indicator random variable by
$$
\mathbf{1}_A(\omega) = \begin{cases} 1 & \text{if } \omega \in A \\ 0 & \text{otherwise.} \end{cases}
$$

Let $A = \{X \ge a\}$. Then $X = X\mathbf{1}_A + X\mathbf{1}_{A^c}$. Now
$$
\begin{aligned}
EX &= E(X\mathbf{1}_A) + E(X\mathbf{1}_{A^c}) \\
&\ge E(a\mathbf{1}_A) + E(X\mathbf{1}_{A^c}) \\
&= a\,P(A) + E(X\mathbf{1}_{A^c}) \\
&\ge a\,P(A).
\end{aligned}
$$
The last inequality holds since $E(X\mathbf{1}_{A^c}) \ge 0$ (recall that $X$ is nonnegative), and Markov's inequality follows. $\square$
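A small numerical check of Markov's inequality (the pmf below is our own example):

```python
# Sketch: P(X >= a) versus the Markov bound EX / a for a nonnegative pmf.
pmf = {0: 0.3, 1: 0.4, 4: 0.2, 10: 0.1}
EX = sum(x * p for x, p in pmf.items())  # 2.2
for a in (1, 2, 5):
    tail = sum(p for x, p in pmf.items() if x >= a)
    print(a, tail, EX / a)  # tail <= EX / a in each case
```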

An important consequence of Markov's inequality is Chebyshev's inequality.


(3.2). Theorem (Chebyshev's Inequality). Let $X$ be a random variable. Then, for any $a > 0$,
$$
P(|X - EX| \ge a) \le \frac{Var(X)}{a^2}.
$$
Proof. First note that
$$
P(|X - EX| \ge a) = P(|X - EX|^2 \ge a^2).
$$
Applying Markov's inequality to the nonnegative random variable $|X - EX|^2$, we now have
$$
P(|X - EX| \ge a) \le \frac{E|X - EX|^2}{a^2} = \frac{Var(X)}{a^2},
$$
proving Chebyshev's inequality. $\square$
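Again a quick check, this time of Chebyshev's inequality on a fair die (our example):

```python
# Sketch: P(|X - EX| >= a) versus the Chebyshev bound Var(X) / a^2.
pmf = {x: 1 / 6 for x in range(1, 7)}
EX = sum(x * p for x, p in pmf.items())               # 3.5
Var = sum((x - EX) ** 2 * p for x, p in pmf.items())  # 35/12
for a in (1, 2, 3):
    tail = sum(p for x, p in pmf.items() if abs(x - EX) >= a)
    print(a, tail, Var / a**2)  # tail <= Var / a^2 in each case
```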

One important corollary of Chebyshev's inequality is the weak law of large numbers.
(3.3). Theorem (Weak Law of Large Numbers). Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed (i.i.d.) random variables with finite variance. Then, for every $\varepsilon > 0$,
$$
P\Bigl(\Bigl|\frac{X_1 + X_2 + \cdots + X_n}{n} - EX_1\Bigr| \ge \varepsilon\Bigr) \to 0
$$
as $n \to \infty$.

Proof. Since $X_1, X_2, \ldots$ are i.i.d., we have $EX_i = EX_1$ and $Var(X_i) = Var(X_1)$ for each $i = 1, 2, \ldots$. Consequently, by independence,
$$
Var\Bigl(\frac{X_1 + X_2 + \cdots + X_n}{n}\Bigr) = \frac{1}{n^2}\bigl(Var(X_1) + Var(X_2) + \cdots + Var(X_n)\bigr) = \frac{Var(X_1)}{n}.
$$
Now applying Chebyshev's inequality, we have, for every $\varepsilon > 0$,
$$
P\Bigl(\Bigl|\frac{X_1 + X_2 + \cdots + X_n}{n} - EX_1\Bigr| \ge \varepsilon\Bigr) \le \frac{Var(X_1)}{n\varepsilon^2} \to 0
$$
as $n \to \infty$. This proves the weak law. $\square$
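A closing simulation sketch of the weak law for i.i.d. die rolls ($EX_1 = 3.5$; parameters our own):

```python
# Sketch: empirical frequency of |(X_1 + ... + X_n)/n - 3.5| >= eps
# for i.i.d. fair die rolls, for increasing n.
import random

random.seed(2)
eps, trials = 0.1, 500
for n in (10, 100, 1000, 10000):
    bad = sum(1 for _ in range(trials)
              if abs(sum(random.randint(1, 6) for _ in range(n)) / n - 3.5) >= eps)
    print(n, bad / trials)  # tends to 0, as the weak law asserts
```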
