
2. Random Variables and Distributions



2.1. Random objects and random variables

Definition. A random object is a measurable function x̃ : (Ω, F) → (Ω′, F′), where Ω is a sample space and F is the σ-algebra of events.

Definition. A (real-valued) random variable is a measurable function x̃ : (Ω, F) → (R, B), where Ω is a sample space and F is the σ-algebra of events.


Similarly, when Ω is a sample space and F is the σ-algebra of events:

x̃ : (Ω, F) → (R̄, B̄), where R̄ = R ∪ {−∞, +∞} is the extended real line, is an extended (real-valued) random variable;

x̃ : (Ω, F) → (Rⁿ, B) is a "(real-valued) random vector" or a "(real-valued) multivariate random variable";

x̃ : (Ω, F) → (R̄ⁿ, B̄) is an "extended (real-valued) random vector" or an "extended (real-valued) multivariate random variable".

A random vector is just a vector of random variables:

x̃ = (x̃₁, x̃₂, ..., x̃ₙ).



2.2. Probability distributions

Let (Ω, F, P) be a probability space.

Definition. The probability distribution (or distribution) of a random object x̃ : (Ω, F, P) → (Ω′, F′) is the probability measure Px̃ on (Ω′, F′) defined by

Px̃(B) = P(x̃⁻¹(B)) for all B ∈ F′,

or

Px̃(B) = P{ω ∈ Ω | x̃(ω) ∈ B} = P{x̃ ∈ B} for all B ∈ F′.

Obviously,

Px̃(B) = ∫_B 1 dPx̃ ≡ ∫_B dPx̃ ≡ ∫_{Ω′} I_B(x) dPx̃(x) for all B ∈ F′,

or

Px̃(B) = ∫_{x̃⁻¹(B)} 1 dP ≡ ∫_{x̃⁻¹(B)} dP ≡ ∫_Ω I_{x̃⁻¹(B)}(ω) dP(ω) for all B ∈ F′.



Example: We roll a balanced die, Ω = {1, 2, 3, 4, 5, 6}, and consider the random variable x̃ : (Ω, 2^Ω, P) → (R, B) defined as

x̃(ω) = 1 if ω = 1, 2, 3, 4,
x̃(ω) = 7 if ω = 5, 6.

The induced probability Px̃ on (R, B) (or distribution of x̃) satisfies

Px̃{1} = P{1, 2, 3, 4} = 2/3,        Px̃{7} = P{5, 6} = 1/3,
Px̃{12} = P(∅) = 0,                  Px̃(−3, 1) = P(∅) = 0,
Px̃[−3, 1] = P{1, 2, 3, 4} = 2/3,    Px̃[5, 8] = Px̃{7} = 1/3,
Px̃[π, √13] = P(∅) = 0,              Px̃(−∞, 12] = P(Ω) = 1,
Px̃[10, ∞) = P(∅) = 0,               Px̃(1, ∞) = P{5, 6} = 1/3,
Px̃(−∞, 2] = P{1, 2, 3, 4} = 2/3,    etc.

Moreover, using the properties of the probability, we obtain the distribution for all Borel sets in R.
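A minimal computational sketch of this example (not part of the slides; names are illustrative): it builds the induced distribution Px̃(B) = P(x̃⁻¹(B)) by pulling each set B, passed as a predicate, back to the sample space.

```python
from fractions import Fraction

omega = range(1, 7)                        # sample space of the balanced die
P = {w: Fraction(1, 6) for w in omega}     # uniform probability on Omega
x = lambda w: 1 if w <= 4 else 7           # the random variable of this example

def P_x(B):
    """Induced probability Px(B) = P{w in Omega : x(w) in B}; B is a predicate."""
    return sum(P[w] for w in omega if B(x(w)))

print(P_x(lambda t: t == 1))          # 2/3
print(P_x(lambda t: -3 <= t <= 1))    # 2/3
print(P_x(lambda t: 5 <= t <= 8))     # 1/3
print(P_x(lambda t: t <= 12))         # 1
print(P_x(lambda t: t >= 10))         # 0
```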
Definition. The support supp(Px̃) of the distribution of the random vector x̃ : (Ω, F) → (Rⁿ, B) is the smallest closed subset of Rⁿ whose complement has zero probability,

Px̃{[supp(Px̃)]ᶜ} = 0.

Definition. Two random objects x̃ and ỹ defined on (Ω, F, P) and taking values in (Ω′, F′) are equivalent (or equal) in distribution (x̃ =ᵈ ỹ) if they have the same distribution, Px̃ = Pỹ.
Example: We toss a balanced coin and consider x̃ : (Ω, 2^Ω, P) → (R, B) and ỹ : (Ω, 2^Ω, P) → (R, B) defined as

x̃(ω) = −1 if ω = H,   x̃(ω) = 1 if ω = T,

and

ỹ(ω) = −1 if ω = T,   ỹ(ω) = 1 if ω = H.

Thus, x̃ =ᵈ ỹ.
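The following hedged Python sketch (assuming the uniform coin measure above; not from the slides) contrasts equality in distribution with almost-sure equality for x̃ and ỹ:

```python
from fractions import Fraction

P = {"H": Fraction(1, 2), "T": Fraction(1, 2)}   # balanced coin
x = {"H": -1, "T": 1}
y = {"H": 1, "T": -1}

dist_x = {v: sum(p for w, p in P.items() if x[w] == v) for v in (-1, 1)}
dist_y = {v: sum(p for w, p in P.items() if y[w] == v) for v in (-1, 1)}
print(dist_x == dist_y)                               # True: same distribution
print(sum(p for w, p in P.items() if x[w] != y[w]))   # 1: P{x != y} = 1, so not equal a.s.
```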



An event A is sure if A = Ω.

An event A is almost sure (a.s.) if P (A) = 1.

An event A is negligible if P (A) = 0.



Definition. We say that two random objects defined on (Ω, F, P) and taking values in (Ω′, F′) are equal, x̃ = ỹ, if x̃(ω) = ỹ(ω) for all ω ∈ Ω.

Definition. We say that two random objects defined on (Ω, F, P) and taking values in (Ω′, F′) are equal almost surely (a.s.), x̃ = ỹ a.s., if

P{x̃ = ỹ} = P{ω ∈ Ω | x̃(ω) = ỹ(ω)} = 1,

or, equivalently, if

P{x̃ ≠ ỹ} = P{ω ∈ Ω | x̃(ω) ≠ ỹ(ω)} = 0.

Note that the concept of "a.s." is the same as that of "a.e." The only difference is that "a.e." applies to functions defined on measure spaces, whereas "a.s." applies to random objects defined on probability spaces.

Obviously, x̃ = ỹ ⟹ x̃ = ỹ a.s. Moreover, x̃ = ỹ a.s. ⟹ x̃ =ᵈ ỹ, but the converse is not true (see the coin example above, where x̃ =ᵈ ỹ but x̃ and ỹ are not equal a.s., since P{x̃ ≠ ỹ} = 1).
2.3. Distribution function of a random variable

Note that the distribution Px̃ of a random variable x̃ : (Ω, F, P) → (R, B) is a probability measure on (R, B) and, thus, is a finite measure.

Therefore, the distribution Px̃ of a random variable x̃ : (Ω, F, P) → (R, B) is a Lebesgue-Stieltjes measure on R satisfying Px̃(R) = 1.



Definition. The (cumulative) distribution function (cdf) Fx̃ : R → R of a random variable x̃ : (Ω, F, P) → (R, B) is the distribution function associated with the distribution Px̃, i.e.,

Px̃(a, b] = P{a < x̃ ≤ b} = Fx̃(b) − Fx̃(a),

where we make the normalization lim_{x→−∞} Fx̃(x) = 0.

Therefore,

Px̃(−∞, x] = P{x̃ ≤ x} = Fx̃(x) − lim_{t→−∞} Fx̃(t) = Fx̃(x).

Moreover,

lim_{x→∞} Fx̃(x) = Px̃(−∞, ∞) = P{x̃ ∈ R} = 1.

Thus, the distribution function of a random variable x̃ is increasing, right-continuous, and satisfies lim_{x→−∞} Fx̃(x) = 0 and lim_{x→∞} Fx̃(x) = 1.
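As a hedged illustration (not from the slides), the cdf of the die example of Section 2.2, where x̃ takes the values 1 and 7 with probabilities 2/3 and 1/3, is a right-continuous step function:

```python
from fractions import Fraction

pmf = {1: Fraction(2, 3), 7: Fraction(1, 3)}   # distribution of the die example

def F(t):
    """F(t) = P{x <= t}, a right-continuous, increasing step function."""
    return sum(p for v, p in pmf.items() if v <= t)

print(F(0), F(1), F(6.9), F(7), F(100))   # 0, 2/3, 2/3, 1, 1
```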



2.4. Discrete random variables

Definition. A random variable x̃ : (Ω, F) → (R, B) is discrete if its range is countable (either finite or countably infinite).

If Ω is discrete then x̃ is discrete. The converse is not true.

Let {x₁, x₂, ...} be the range x̃(Ω) of the discrete random variable x̃.

If x̃ is discrete there is a countable partition A = {A₁, A₂, ...} of Ω with

Aₙ = {ω ∈ Ω | x̃(ω) = xₙ}, for all xₙ ∈ x̃(Ω).

Therefore, Aₙ = x̃⁻¹(xₙ), for all xₙ ∈ x̃(Ω).



The σ-algebra σ(A) generated by the partition A is the smallest σ-algebra that makes the random variable x̃ measurable.

The distribution of a discrete random variable x̃ (which is said to have a discrete distribution) satisfies:

Px̃{xₙ} = P{x̃ = xₙ} = P(Aₙ), for all xₙ ∈ x̃(Ω).

Definition. The probability mass function (pmf) (or just probability function), fx̃ : x̃(Ω) → [0, 1], of a discrete random variable x̃ (or of a discrete distribution Px̃) is given by:

fx̃(x) = P{x̃ = x} = Px̃{x}, for all x ∈ x̃(Ω).



Properties of the probability and distribution functions of a discrete random variable:

1. Σ_{x∈x̃(Ω)} fx̃(x) = 1.

2. Any function f : x̃(Ω) → [0, 1], where x̃(Ω) is countable, satisfying Σ_{x∈x̃(Ω)} f(x) = 1 can serve as a probability function of a discrete distribution.

3. Fx̃(x) = Σ_{t≤x} fx̃(t), with t ∈ x̃(Ω).



4. Px̃(B) = P{x̃ ∈ B} = Σ_{x∈B} fx̃(x), for all B ∈ B, with x ∈ x̃(Ω).

5. fx̃(x) = Fx̃(x) − lim_{t→x⁻} Fx̃(t), for x ∈ x̃(Ω).

In particular, if the range of x̃ can be ordered so that x₁ < x₂ < ... < xᵢ₋₁ < xᵢ < xᵢ₊₁ < ..., then fx̃(x₁) = Fx̃(x₁) and fx̃(xᵢ) = Fx̃(xᵢ) − Fx̃(xᵢ₋₁) for i = 2, 3, ...



Example: Let x̃ be the number of heads when tossing 4 coins.

fx̃(0) = 1/16,  fx̃(1) = 4/16,  fx̃(2) = 6/16,  fx̃(3) = 4/16,  fx̃(4) = 1/16,

or

fx̃(x) = (1/16) · (4 choose x), for x = 0, 1, 2, 3, 4 (the range x̃(Ω)).
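A small sketch (illustrative, not from the slides) that recovers this pmf both by enumerating the 16 equally likely outcomes and from the closed form (1/16)·(4 choose x):

```python
from fractions import Fraction
from itertools import product
from math import comb

outcomes = list(product("HT", repeat=4))   # the 16 equally likely outcomes
pmf = {k: Fraction(sum(o.count("H") == k for o in outcomes), 16) for k in range(5)}
print(pmf)   # probabilities 1/16, 4/16, 6/16, 4/16, 1/16 for k = 0, ..., 4
print(all(pmf[k] == Fraction(comb(4, k), 16) for k in range(5)))   # True: matches (1/16)*C(4, k)
```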



Probability Histogram: [figure]



Probability Bar Chart: [figure]



Distribution function: [figure]



2.5. Continuous and absolutely continuous random variables

Definition 1. A random variable x̃ is continuous if its range x̃(Ω) is continuous.

Definition 2. A random variable x̃ is continuous if its distribution function Fx̃ is continuous, that is, if Px̃{x} = P{x̃ = x} = 0 for all x ∈ R.

Continuity according to Definition 2 implies continuity according to Definition 1.

Definition. A random variable x̃ : (Ω, F) → (R, B) is absolutely continuous if its distribution function Fx̃ is absolutely continuous, i.e., there exists a Borel measurable function fx̃ : (R, B) → (R̄, B̄) that is integrable with respect to Lebesgue measure such that

Fx̃(x) − Fx̃(a) = ∫_{[a,x]} fx̃(t) dt, for all a ∈ R, x ∈ R, with a ≤ x.



Absolute continuity implies continuity.

Random variables that are neither discrete nor absolutely continuous are called "mixed".

Equivalent definition: A random variable x̃ is absolutely continuous if its distribution Px̃ is absolutely continuous with respect to Lebesgue measure.

Therefore, thanks to the Radon-Nikodym theorem, there exists a Borel measurable function fx̃ : (R, B) → (R̄, B̄) such that

Px̃(B) = ∫_B fx̃(x) dx, for all B ∈ B.



2.6. Density

The Borel measurable function fx̃ : (R, B) → (R̄, B̄) such that

Px̃(B) = ∫_B fx̃(x) dx, for all B ∈ B,

is called the probability density function (pdf) (or density function, or just "density") of the random variable x̃ (or of the distribution Px̃).

Since Px̃(R) = 1, the density function fx̃ is integrable with respect to Lebesgue measure on (R, B).

Moreover, the density fx̃ is finite a.e. with respect to Lebesgue measure on (R, B).

The density function fx̃ of the random variable x̃ is the Radon-Nikodym derivative of its distribution with respect to Lebesgue measure, fx̃ = dPx̃/dx.



Note: If x̃ is absolutely continuous, then

Px̃(a, b] = Px̃(a, b) = Px̃[a, b] = Px̃[a, b) = Fx̃(b) − Fx̃(a) = ∫_{[a,b]} fx̃(x) dx.

Notation: If the random variable x̃ has the distribution Px̃, we write x̃ ∼ Px̃, x̃ ∼ Fx̃, or x̃ ∼ fx̃, where Fx̃ is the corresponding distribution function and fx̃ is the corresponding probability or density function.



Px̃[a, b] is given by the area under the density fx̃ over the interval [a, b]. [figure]
Properties of the density:

1. ∫_R fx̃(x) dx = 1.

2. Fx̃(x) = ∫_{(−∞,x]} fx̃(t) dt.

3. Any non-negative (a.e. w.r.t. Lebesgue measure) Borel measurable function f : (R, B) → (R̄, B̄) satisfying ∫_R f(x) dx = 1 can serve as a density of an absolutely continuous distribution on (R, B).

4. If x̃ is absolutely continuous, then fx̃ = Fx̃′ when the derivative of Fx̃ exists. Moreover, the derivative Fx̃′ exists a.e. w.r.t. Lebesgue measure. If fx̃ is continuous at x, then Fx̃ is differentiable at x and fx̃(x) = Fx̃′(x).
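A hedged sketch of properties 1 and 2 with an assumed density f(t) = 2t on (0, 1) (an illustration, not an example from the slides), using sympy for the integrals; here Fx̃(x) = x² on (0, 1), whose derivative 2x recovers f wherever f is continuous (property 4).

```python
import sympy as sp

t = sp.symbols("t", real=True)
# Assumed density for illustration: f(t) = 2t on (0, 1), 0 elsewhere.
f = sp.Piecewise((2*t, (t > 0) & (t < 1)), (0, True))

print(sp.integrate(f, (t, -sp.oo, sp.oo)))              # 1      (property 1)
print(sp.integrate(f, (t, -sp.oo, sp.Rational(1, 2))))  # 1/4 = F(1/2)  (property 2)
```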



2.7. Random vectors

x̃ : (Ω, F) → (Rⁿ, B).

x̃ = (x̃₁, x̃₂, ..., x̃ₙ) or x̃ = (x̃₁, x̃₂, ..., x̃ₙ)ᵀ.

x̃ᵢ = pᵢ(x̃), where pᵢ : Rⁿ → R is the projection to the ith coordinate.

The distribution of the random vector x̃ is a probability measure on (Rⁿ, B) given by

Px̃(B) = P(x̃⁻¹(B)) for all B ∈ B(Rⁿ).

The distribution function (cdf) of the random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ), Fx̃ : Rⁿ → R, is given by

Fx̃(x₁, x₂, ..., xₙ) = P{x̃ᵢ ≤ xᵢ, for i = 1, 2, ..., n}, for x = (x₁, ..., xₙ) ∈ Rⁿ.
The distribution function of a random vector x̃ is (i) increasing,...

(Increasing: a ≤ b ⟹ F(a) ≤ F(b), where a and b belong to Rⁿ and ≤ is componentwise.)

(ii) right-continuous,...

(Right-continuous at x₀: lim_{x→x₀⁺} F(x) ≡ F(x₀⁺) = F(x₀), where the limit is taken over x ≥ x₀ in Rⁿ.)

(iii) Fx̃(x) → 0 if at least one of the components xᵢ of x ∈ Rⁿ tends to −∞, and

(iv) Fx̃(x) → 1 if all the components xᵢ, i = 1, ..., n, of x ∈ Rⁿ tend to ∞.



The random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ) is discrete if its range x̃(Ω) is countable (or discrete).

The probability function (pmf), fx̃ : x̃₁(Ω) × x̃₂(Ω) × ... × x̃ₙ(Ω) → [0, 1], of a discrete random vector x̃ is given by:

fx̃(x) = Px̃{x} = P{(x̃₁, x̃₂, ..., x̃ₙ) = (x₁, x₂, ..., xₙ)} = P{x̃ᵢ = xᵢ, for i = 1, 2, ..., n},

for all x ∈ x̃₁(Ω) × x̃₂(Ω) × ... × x̃ₙ(Ω).

Note: x̃(Ω) ⊆ x̃₁(Ω) × x̃₂(Ω) × ... × x̃ₙ(Ω).



Properties of the probability and distribution functions of a discrete random vector:

1. Σ_{x∈x̃(Ω)} fx̃(x) = 1  or  Σ_{x∈x̃₁(Ω)×x̃₂(Ω)×...×x̃ₙ(Ω)} fx̃(x) = 1.

2. Fx̃(x) = Σ_{t≤x} fx̃(t), with t = (t₁, t₂, ..., tₙ) ∈ x̃(Ω), where t ≤ x means that tᵢ ≤ xᵢ for i = 1, 2, ..., n.

3. Px̃(B) = P{x̃ ∈ B} = Σ_{x∈B} fx̃(x), for all B ∈ B(Rⁿ).


The random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ) (or its distribution) is absolutely continuous if there exists a Borel measurable function fx̃ : (Rⁿ, B) → (R̄, B̄), called the density (pdf), that is integrable with respect to Lebesgue measure on (Rⁿ, B), such that

Px̃(B) = ∫_B fx̃(x₁, x₂, ..., xₙ) d(x₁, x₂, ..., xₙ), for all B ∈ B(Rⁿ).

Properties of the density of a random vector:

1. ∫_{Rⁿ} fx̃(x) dx = ∫_R ... ∫_R fx̃(x₁, ..., xₙ) dx₁ ... dxₙ = 1.

2. Fx̃(x) = ∫_{(−∞,xₙ]} ∫_{(−∞,xₙ₋₁]} ... ∫_{(−∞,x₁]} fx̃(t₁, t₂, ..., tₙ) dt₁ dt₂ ... dtₙ.


3. Any non-negative (a.e. w.r.t. Lebesgue measure) Borel measurable function f : (Rⁿ, B) → (R̄, B̄) satisfying

∫_R ... ∫_R f(x₁, x₂, ..., xₙ) dx₁ dx₂ ... dxₙ = 1

can serve as a density of an absolutely continuous distribution on (Rⁿ, B).

4. If the random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ) is absolutely continuous, then

fx̃(x₁, x₂, ..., xₙ) = ∂ⁿFx̃(x₁, x₂, ..., xₙ) / (∂x₁ ∂x₂ ... ∂xₙ),

when this nth crossed partial derivative of Fx̃ exists. Moreover, this derivative exists a.e. w.r.t. Lebesgue measure on (Rⁿ, B).



2.8. Marginal distributions

Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a random vector with distribution Px̃. The marginal distribution of x̃ᵢ, for i = 1, ..., n, is given by

Px̃ᵢ(B) = Px̃(R × ... × B × ... × R), for all B ∈ B(R),

where B occupies the ith coordinate.

Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a discrete random vector with the probability function fx̃. The marginal probability function of x̃ᵢ, for i = 1, ..., n, is given by

fx̃ᵢ(xᵢ) = Σ_{x₁∈x̃₁(Ω)} ... Σ_{xᵢ₋₁∈x̃ᵢ₋₁(Ω)} Σ_{xᵢ₊₁∈x̃ᵢ₊₁(Ω)} ... Σ_{xₙ∈x̃ₙ(Ω)} fx̃(x₁, ..., xᵢ₋₁, xᵢ, xᵢ₊₁, ..., xₙ),

for all xᵢ ∈ x̃ᵢ(Ω).



Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be an absolutely continuous random vector with the density fx̃. The marginal density of x̃ᵢ, for i = 1, ..., n, is given by

fx̃ᵢ(xᵢ) = ∫_R ... ∫_R fx̃(x₁, ..., xᵢ₋₁, xᵢ, xᵢ₊₁, ..., xₙ) dx₁ ... dxᵢ₋₁ dxᵢ₊₁ ... dxₙ,

for all xᵢ ∈ R.

Note: From the marginal probability or density functions we can construct the marginal distributions in the usual way.



Example 1: The discrete random vector (x̃, ỹ), where x̃ is the number of points when rolling a die and ỹ is the number of heads when tossing a coin, has a probability function fx̃,ỹ(x, y) summarized in the following table:

 y \ x  |   1     2     3     4     5     6   | fỹ(y)
   0    | 1/12  1/12  1/12  1/12  1/12  1/12  |  1/2
   1    | 1/12  1/12  1/12  1/12  1/12  1/12  |  1/2
 fx̃(x)  |  1/6   1/6   1/6   1/6   1/6   1/6  |   1

The marginal probability functions of x̃ and ỹ are summarized in the "margins".



Example 2: The absolutely continuous random vector (x̃, ỹ) has the following density:

fx̃,ỹ(x, y) = (2/3)(x + 2y) for 0 < x < 1 and 0 < y < 1,
fx̃,ỹ(x, y) = 0 otherwise.



Marginal densities:

fx̃(x) = ∫_{−∞}^{∞} fx̃,ỹ(x, y) dy = ∫₀¹ (2/3)(x + 2y) dy = (2/3)[xy + y²]₀¹ = (2/3)(x + 1), for 0 < x < 1.

Therefore,

fx̃(x) = (2/3)(x + 1) for 0 < x < 1,
fx̃(x) = 0 otherwise.



Similarly,

fỹ(y) = ∫_{−∞}^{∞} fx̃,ỹ(x, y) dx = ∫₀¹ (2/3)(x + 2y) dx = (2/3)[x²/2 + 2xy]₀¹ = (2/3)(1/2 + 2y) = (1/3)(1 + 4y), for 0 < y < 1.

Therefore,

fỹ(y) = (1/3)(1 + 4y) for 0 < y < 1,
fỹ(y) = 0 otherwise.
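The two marginal densities can be checked symbolically; the following sketch (assuming sympy) simply repeats the integrations above:

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
f = sp.Rational(2, 3) * (x + 2*y)     # joint density on 0 < x < 1, 0 < y < 1

f_x = sp.integrate(f, (y, 0, 1))      # marginal density of x~
f_y = sp.integrate(f, (x, 0, 1))      # marginal density of y~
print(sp.factor(f_x))                 # 2*(x + 1)/3
print(sp.expand(f_y))                 # 4*y/3 + 1/3, i.e. (1/3)*(1 + 4*y)
```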



2.9. Independent random variables

Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a random vector defined on (Ω, F, P) with the distribution Px̃ on (Rⁿ, B(Rⁿ)). The random variables x̃₁, x̃₂, ..., x̃ₙ are said to be independent if, for all collections of sets B₁, B₂, ..., Bₙ belonging to B(R), we have

P{x̃₁ ∈ B₁, ..., x̃ₙ ∈ Bₙ} = P{x̃₁ ∈ B₁} · ... · P{x̃ₙ ∈ Bₙ},

or, equivalently, if the distribution of the random vector x̃ is equal to the product measure of the marginal distributions,

Px̃ = Px̃₁ × ... × Px̃ₙ ≡ ∏_{i=1}^{n} Px̃ᵢ.

Definition. Let x̃ᵢ : (Ω, F, P) → (Ωᵢ, Fᵢ), for i = 1, ..., n. The random objects x̃₁, x̃₂, ..., x̃ₙ are said to be independent if, for all sets B₁ ∈ F₁, ..., Bₙ ∈ Fₙ,

P{x̃₁ ∈ B₁, ..., x̃ₙ ∈ Bₙ} = P{x̃₁ ∈ B₁} · ... · P{x̃ₙ ∈ Bₙ}.



Equivalent definition. Let x̃₁, x̃₂, ..., x̃ₙ be a collection of random objects on the probability space (Ω, F, P),

x̃ᵢ : (Ω, F) → (Ωᵢ, Fᵢ), for i = 1, 2, ..., n.

The random objects x̃₁, x̃₂, ..., x̃ₙ are said to be independent if the joint distribution

P(x̃₁,x̃₂,...,x̃ₙ) : ⊗_{i=1}^{n} Fᵢ → [0, 1]

of these n random objects is equal to the product measure of the marginal distributions,

P(x̃₁,x̃₂,...,x̃ₙ) = ∏_{i=1}^{n} Px̃ᵢ,

where Px̃ᵢ : Fᵢ → [0, 1] is the marginal distribution of the random object x̃ᵢ, i = 1, ..., n, and ⊗_{i=1}^{n} Fᵢ is the product σ-algebra.


Proposition. Let x̃ᵢ : (Ω, F, P) → (Ωᵢ, Fᵢ), for i = 1, ..., n, be a collection of independent random objects and gᵢ : (Ωᵢ, Fᵢ) → (Ωᵢ′, Fᵢ′), for i = 1, ..., n, be measurable functions. Then, the random objects gᵢ(x̃ᵢ) : (Ω, F, P) → (Ωᵢ′, Fᵢ′), for i = 1, ..., n, are independent.



Proof. If

P{x̃₁ ∈ B₁, ..., x̃ₙ ∈ Bₙ} = P{x̃₁ ∈ B₁} · ... · P{x̃ₙ ∈ Bₙ},

for all sets B₁ ∈ F₁, ..., Bₙ ∈ Fₙ, then

P{x̃₁ ∈ g₁⁻¹(B₁′), ..., x̃ₙ ∈ gₙ⁻¹(Bₙ′)} = P{x̃₁ ∈ g₁⁻¹(B₁′)} · ... · P{x̃ₙ ∈ gₙ⁻¹(Bₙ′)},

for all sets B₁′ ∈ F₁′, ..., Bₙ′ ∈ Fₙ′, since g₁⁻¹(B₁′) ∈ F₁, ..., gₙ⁻¹(Bₙ′) ∈ Fₙ due to the measurability of gᵢ, for i = 1, ..., n. Therefore,

P{g₁(x̃₁) ∈ B₁′, ..., gₙ(x̃ₙ) ∈ Bₙ′} = P{g₁(x̃₁) ∈ B₁′} · ... · P{gₙ(x̃ₙ) ∈ Bₙ′},

for all sets B₁′ ∈ F₁′, ..., Bₙ′ ∈ Fₙ′, which proves the independence of the random objects gᵢ(x̃ᵢ) : (Ω, F, P) → (Ωᵢ′, Fᵢ′), for i = 1, ..., n. Q.E.D.



Proposition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a random vector with the distribution function Fx̃ : Rⁿ → [0, 1] and let Fᵢ : R → [0, 1] be the marginal distribution function of x̃ᵢ, for i = 1, ..., n. Then, the random variables x̃₁, x̃₂, ..., x̃ₙ are independent if and only if

Fx̃(x₁, ..., xₙ) = F₁(x₁) · F₂(x₂) · ... · Fₙ(xₙ),

for all x = (x₁, ..., xₙ) ∈ Rⁿ.



Proposition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a discrete random vector with the probability function fx̃ : x̃₁(Ω) × ... × x̃ₙ(Ω) → [0, 1] and let fᵢ : x̃ᵢ(Ω) → [0, 1] be the marginal probability function of x̃ᵢ, for i = 1, ..., n. Then, the random variables x̃₁, x̃₂, ..., x̃ₙ are independent if and only if

fx̃(x₁, ..., xₙ) = f₁(x₁) · f₂(x₂) · ... · fₙ(xₙ),

for all x = (x₁, ..., xₙ) ∈ x̃₁(Ω) × ... × x̃ₙ(Ω).

Proposition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be an absolutely continuous random vector with the density function fx̃ : Rⁿ → R and let fᵢ : R → R be the marginal density function of x̃ᵢ, for i = 1, ..., n. Then, the random variables x̃₁, x̃₂, ..., x̃ₙ are independent if and only if

fx̃(x₁, ..., xₙ) = f₁(x₁) · f₂(x₂) · ... · fₙ(xₙ),

for all x = (x₁, ..., xₙ) ∈ Rⁿ.
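As a hedged illustration of the density criterion (a sketch assuming sympy, not part of the slides): the joint density of Example 2 does not factor into the product of its marginals, so x̃ and ỹ are not independent there, whereas the die/coin table of Example 1 clearly does factor (each cell 1/12 = 1/6 · 1/2).

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
f = sp.Rational(2, 3) * (x + 2*y)     # joint density of Example 2 on (0,1)x(0,1)
f_x = sp.integrate(f, (y, 0, 1))      # marginal of x~: (2/3)*(x + 1)
f_y = sp.integrate(f, (x, 0, 1))      # marginal of y~: (1/3)*(1 + 4*y)

# Factorization test: independence would require f == f_x * f_y on the unit square.
print(sp.simplify(f - f_x * f_y) == 0)   # False: x~ and y~ are dependent
```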



2.10. Generalized conditional probability

Let x̃ : (Ω, F, P) → (Ω′, F′) and let us fix the event B ∈ F. From the Radon-Nikodym theorem we know that there exists a Borel measurable function g : (Ω′, F′) → (R, B) such that

P({x̃ ∈ A} ∩ B) = ∫_A g(x) dPx̃(x), for all A ∈ F′,

since λ ≪ Px̃, where λ(A) ≡ P({x̃ ∈ A} ∩ B). The function g is called the conditional probability of B given x̃ = x and is written as g(x) = P(B | x̃ = x). The conditional probability is essentially unique for a given B ∈ F (i.e., if there exists another such function h, then g = h a.e. [Px̃]). Therefore,

P({x̃ ∈ A} ∩ B) = ∫_A P(B | x̃ = x) dPx̃(x),

with P(B | x̃ = ·) : (Ω′, F′) → (R, B).

However, sometimes the conditional probability is viewed as a measure on (Ω, F),

P(· | x̃ = x) : F → R.

Moreover, if A = Ω′, then

P({x̃ ∈ Ω′} ∩ B) = P(Ω ∩ B) = P(B) = ∫_{Ω′} P(B | x̃ = x) dPx̃(x),

which is a generalization of the theorem of total probability.
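A discrete sketch of this total-probability identity (using the die example of Section 2.2 and an illustrative event B = "the roll is even"; not from the slides): summing P(B | x̃ = v) against the distribution of x̃ recovers P(B).

```python
from fractions import Fraction

omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
x = lambda w: 1 if w <= 4 else 7          # the die-example random variable
B = {2, 4, 6}                             # illustrative event: "the roll is even"

total = Fraction(0)
for v in (1, 7):
    A_v = [w for w in omega if x(w) == v]                              # the event {x~ = v}
    P_B_given_v = sum(P[w] for w in A_v if w in B) / sum(P[w] for w in A_v)
    total += P_B_given_v * sum(P[w] for w in A_v)
print(total, sum(P[w] for w in B))        # both 1/2
```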

Note that, if x̃ is an absolutely continuous random variable, then P(B | x̃ = x) is a conditional probability given an event ({x̃ = x}) that has zero probability!



2.11. Conditional distributions

Definition. Let (x̃, ỹ) be a vector of two random objects x̃ : (Ω, F, P) → (Ω_x, F_x) and ỹ : (Ω, F, P) → (Ω_y, F_y), and let C ∈ F_y be a fixed measurable set. The conditional distribution of ỹ given x̃ = x is the Borel measurable function Pỹ|x̃(C | x̃ = ·) : (Ω_x, F_x) → (R, B) given by

Pỹ|x̃(C | x) = P{ỹ ∈ C | x̃ = x}, for all x ∈ Ω_x,

which is essentially unique w.r.t. Px̃.

However, sometimes the conditional distribution is viewed as a measure on (Ω_y, F_y),

Pỹ|x̃(· | x) : F_y → R.



Assume that the random vector (x̃, ỹ) is discrete with the probability function fx̃,ỹ : x̃(Ω) × ỹ(Ω) → [0, 1]. Then, the conditional distribution Pỹ|x̃(y | x) = P{ỹ = y | x̃ = x} must satisfy

P{x̃ ∈ A, ỹ ∈ C} = Σ_{x∈A} Pỹ|x̃(C | x) P{x̃ = x} = Σ_{x∈A} Σ_{y∈C} Pỹ|x̃(y | x) fx̃(x), for all A ∈ B(R) and C ∈ B(R),   (∗)

where P{x̃ = x} = fx̃(x) and Pỹ|x̃(C | x) = Σ_{y∈C} Pỹ|x̃(y | x).

Let us define the conditional distribution Pỹ|x̃(y | x) as follows:

Pỹ|x̃(y | x) = P{x̃ = x, ỹ = y} / P{x̃ = x} = fx̃,ỹ(x, y) / fx̃(x) ≡ fỹ|x̃(y | x),

for all (x, y) ∈ x̃(Ω) × ỹ(Ω) with fx̃(x) > 0.


The function fỹ|x̃(· | x) : ỹ(Ω) → [0, 1], for all x ∈ x̃(Ω) such that fx̃(x) > 0, is the conditional probability function of ỹ given x̃ = x.

The previous definition of the conditional probability function (or conditional distribution) of ỹ given x̃ = x is the right one since expression (∗) becomes

P{x̃ ∈ A, ỹ ∈ C} = Σ_{x∈A} Σ_{y∈C} fỹ|x̃(y | x) fx̃(x) = Σ_{x∈A} Σ_{y∈C} [fx̃,ỹ(x, y) / fx̃(x)] fx̃(x) = Σ_{x∈A} Σ_{y∈C} fx̃,ỹ(x, y),

for all A ∈ B(R) and C ∈ B(R).



Assume that the random vector (x̃, ỹ) is absolutely continuous with the density fx̃,ỹ : R² → R. Then, we would like to have an expression like this:

P{x̃ ∈ A, ỹ ∈ C} = ∫_A Pỹ|x̃(C | x) dPx̃(x) = ∫_A Pỹ|x̃(C | x) fx̃(x) dx = ∫_A [∫_C fỹ|x̃(y | x) dy] fx̃(x) dx,   (∗∗)

for all A ∈ B(R) and C ∈ B(R), where Pỹ|x̃(C | x) = ∫_C fỹ|x̃(y | x) dy.



Let us define the conditional density of ỹ given x̃ = x, fỹ|x̃(· | x) : R → R, for all x ∈ R such that fx̃(x) > 0, as follows:

fỹ|x̃(y | x) = fx̃,ỹ(x, y) / fx̃(x), for all (x, y) ∈ R² with fx̃(x) > 0.

The previous definition of the conditional density of ỹ given x̃ = x is the right one since expression (∗∗) becomes

P{x̃ ∈ A, ỹ ∈ C} = ∫_A ∫_C fỹ|x̃(y | x) fx̃(x) dy dx = ∫_A ∫_C [fx̃,ỹ(x, y) / fx̃(x)] fx̃(x) dy dx = ∫_A ∫_C fx̃,ỹ(x, y) dy dx,

for all A ∈ B(R) and C ∈ B(R).



If the discrete (absolutely continuous) random variables x̃ and ỹ are independent, then

fỹ|x̃(y | x) = fx̃,ỹ(x, y) / fx̃(x) = [fx̃(x) · fỹ(y)] / fx̃(x) = fỹ(y), for fx̃(x) > 0.

That is, the conditional probability function (density function) is equal to the corresponding unconditional probability function (density function).



Note that from the conditional probability and density functions we can obtain the conditional distribution in the usual way, namely,

Pỹ|x̃(C | x) = P{ỹ ∈ C | x̃ = x} = Σ_{y∈C} fỹ|x̃(y | x), for all C ∈ B,

or

Pỹ|x̃(C | x) = P{ỹ ∈ C | x̃ = x} = ∫_C fỹ|x̃(y | x) dy, for all C ∈ B,

where

Pỹ|x̃(C | ·) : (R, B) → (R, B)

or, sometimes,

Pỹ|x̃(· | x) : B(R) → R.

Note again that, if x̃ is an absolutely continuous random variable, then Pỹ|x̃(C | x) is a conditional distribution (and, hence, a conditional probability P{ỹ ∈ C | x̃ = x}) given the event {x̃ = x}, which has zero probability! This conditional distribution is well defined when the marginal density of x̃ evaluated at x, fx̃(x), is strictly positive.
Example: The absolutely continuous random vector (x̃, ỹ) has the following density:

fx̃,ỹ(x, y) = (2/3)(x + 2y) for 0 < x < 1 and 0 < y < 1,
fx̃,ỹ(x, y) = 0 otherwise.

We have already proved that the marginal density of the random variable ỹ is

fỹ(y) = (1/3)(1 + 4y) for 0 < y < 1,
fỹ(y) = 0 otherwise.


Therefore, the conditional density of x̃ given ỹ = y is

fx̃|ỹ(x | y) = [(2/3)(x + 2y)] / [(1/3)(1 + 4y)] = (2x + 4y)/(1 + 4y) for 0 < x < 1,
fx̃|ỹ(x | y) = 0 otherwise,

for 0 < y < 1.

Thus, the conditional density of x̃ given ỹ = 1/4 is

fx̃|ỹ(x | 1/4) = (2x + 1)/2 = (1/2)(2x + 1) for 0 < x < 1,
fx̃|ỹ(x | 1/4) = 0 otherwise.


Then,

Px̃|ỹ((−∞, 1/3] | 1/4) = P{x̃ ≤ 1/3 | ỹ = 1/4} = ∫_{−∞}^{1/3} fx̃|ỹ(x | 1/4) dx
= ∫₀^{1/3} (1/2)(2x + 1) dx = (1/2)[x² + x]₀^{1/3} = (1/2)(1/9 + 1/3) = (1/2) · (4/9) = 4/18 = 2/9,

while

Px̃(−∞, 1/3] = ∫_{−∞}^{1/3} [∫_{−∞}^{∞} fx̃,ỹ(x, y) dy] dx = ∫₀^{1/3} (2/3)(x + 1) dx = 7/27,

where the inner integral is the marginal density fx̃(x) = (2/3)(x + 1).
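The two probabilities just computed can be reproduced by symbolic integration; a sketch assuming sympy (not part of the slides):

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
f = sp.Rational(2, 3) * (x + 2*y)                           # joint density on (0,1)x(0,1)

f_y = sp.integrate(f, (x, 0, 1))                            # marginal of y~: (1/3)*(1 + 4*y)
f_cond = sp.simplify((f / f_y).subs(y, sp.Rational(1, 4)))  # conditional density given y = 1/4
print(f_cond)                                               # x + 1/2, i.e. (2x + 1)/2
print(sp.integrate(f_cond, (x, 0, sp.Rational(1, 3))))      # 2/9

f_x = sp.integrate(f, (y, 0, 1))                            # marginal of x~: (2/3)*(x + 1)
print(sp.integrate(f_x, (x, 0, sp.Rational(1, 3))))         # 7/27
```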



If we have more than two random variables, we can generalize the previous conditional probability and density functions.

Example:

fx̃₁,x̃₃|x̃₂,x̃₄(x₁, x₃ | x₂, x₄) = fx̃₁,x̃₂,x̃₃,x̃₄(x₁, x₂, x₃, x₄) / fx̃₂,x̃₄(x₂, x₄),

where

fx̃₂,x̃₄(x₂, x₄) = Σ_{x₁∈x̃₁(Ω)} Σ_{x₃∈x̃₃(Ω)} fx̃₁,x̃₂,x̃₃,x̃₄(x₁, x₂, x₃, x₄) > 0,

or

fx̃₂,x̃₄(x₂, x₄) = ∫_R ∫_R fx̃₁,x̃₂,x̃₃,x̃₄(x₁, x₂, x₃, x₄) dx₁ dx₃ > 0.
