
2. Random Variables and Distributions



2.1. Random objects and random variables

Definition. A random object is a measurable function x̃ : (Ω, F) → (Ω′, F′), where Ω is a sample space and F is the σ-algebra of events.

Definition. A (real-valued) random variable is a measurable function x̃ : (Ω, F) → (R, B), where Ω is a sample space and F is the σ-algebra of events.


Similarly, when Ω is a sample space and F is the σ-algebra of events:

x̃ : (Ω, F) → (R̄, B̄), where R̄ = R ∪ {−∞, +∞} is the extended real line, is an extended (real-valued) random variable;

x̃ : (Ω, F) → (Rⁿ, B) is a "(real-valued) random vector" or a "(real-valued) multivariate random variable";

x̃ : (Ω, F) → (R̄ⁿ, B̄) is an "extended (real-valued) random vector" or an "extended (real-valued) multivariate random variable".

A random vector is just a vector of random variables:

x̃ = (x̃₁, x̃₂, ..., x̃ₙ).



2.2. Probability distributions

Let (Ω, F, P) be a probability space.

Definition. The probability distribution (or distribution) of a random object x̃ : (Ω, F, P) → (Ω′, F′) is the probability measure Px̃ on (Ω′, F′) defined by

Px̃(B) = P(x̃⁻¹(B)) for all B ∈ F′,

or

Px̃(B) = P{ω ∈ Ω | x̃(ω) ∈ B} = P{x̃ ∈ B} for all B ∈ F′.

Obviously,

Px̃(B) = ∫_B 1 dPx̃ ≡ ∫_B dPx̃ ≡ ∫_{Ω′} I_B(x) dPx̃(x) for all B ∈ F′,

or

Px̃(B) = ∫_{x̃⁻¹(B)} 1 dP ≡ ∫_{x̃⁻¹(B)} dP ≡ ∫_Ω I_{x̃⁻¹(B)}(ω) dP(ω) for all B ∈ F′.



Example: We roll a balanced die, Ω = {1, 2, 3, 4, 5, 6}, and consider the random variable x̃ : (Ω, 2^Ω, P) → (R, B) defined as

x̃(ω) = 1 if ω = 1, 2, 3, 4,
x̃(ω) = 7 if ω = 5, 6.

The induced probability Px̃ on (R, B) (or distribution of x̃) satisfies

Px̃{1} = P{1, 2, 3, 4} = 2/3,        Px̃{7} = P{5, 6} = 1/3,
Px̃{12} = P(∅) = 0,                  Px̃(−3, 1) = P(∅) = 0,
Px̃[−3, 1] = P{1, 2, 3, 4} = 2/3,    Px̃[5, 8] = Px̃{7} = 1/3,
Px̃[π, √13] = P(∅) = 0,              Px̃(−∞, 12] = P(Ω) = 1,
Px̃[10, ∞) = P(∅) = 0,               Px̃(1, ∞) = P{5, 6} = 1/3,
Px̃(−∞, 2] = P{1, 2, 3, 4} = 2/3,    etc.

Moreover, using the properties of the probability, we obtain the distribution for all Borel sets in R.
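A minimal computational sketch of this example (not part of the slides; names are illustrative): it builds the induced distribution Px̃(B) = P(x̃⁻¹(B)) by pulling each set B, passed as a predicate, back to the sample space.

```python
from fractions import Fraction

omega = range(1, 7)                        # sample space of the balanced die
P = {w: Fraction(1, 6) for w in omega}     # uniform probability on Omega
x = lambda w: 1 if w <= 4 else 7           # the random variable of this example

def P_x(B):
    """Induced probability Px(B) = P{w in Omega : x(w) in B}; B is a predicate."""
    return sum(P[w] for w in omega if B(x(w)))

print(P_x(lambda t: t == 1))          # 2/3
print(P_x(lambda t: -3 <= t <= 1))    # 2/3
print(P_x(lambda t: 5 <= t <= 8))     # 1/3
print(P_x(lambda t: t <= 12))         # 1
print(P_x(lambda t: t >= 10))         # 0
```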
Definition. The support supp(Px̃) of the distribution of the random vector x̃ : (Ω, F) → (Rⁿ, B) is the smallest closed subset of Rⁿ whose complement has zero probability,

Px̃{[supp(Px̃)]ᶜ} = 0.

Definition. Two random objects x̃ and ỹ defined on (Ω, F, P) and taking values in (Ω′, F′) are equivalent (or equal) in distribution (x̃ =ᵈ ỹ) if they have the same distribution, Px̃ = Pỹ.
Example: We toss a balanced coin and consider x̃ : (Ω, 2^Ω, P) → (R, B) and ỹ : (Ω, 2^Ω, P) → (R, B) defined as

x̃(ω) = −1 if ω = H,   x̃(ω) = 1 if ω = T,

and

ỹ(ω) = −1 if ω = T,   ỹ(ω) = 1 if ω = H.

Thus, x̃ =ᵈ ỹ.
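The following hedged Python sketch (assuming the uniform coin measure above; not from the slides) contrasts equality in distribution with almost-sure equality for x̃ and ỹ:

```python
from fractions import Fraction

P = {"H": Fraction(1, 2), "T": Fraction(1, 2)}   # balanced coin
x = {"H": -1, "T": 1}
y = {"H": 1, "T": -1}

dist_x = {v: sum(p for w, p in P.items() if x[w] == v) for v in (-1, 1)}
dist_y = {v: sum(p for w, p in P.items() if y[w] == v) for v in (-1, 1)}
print(dist_x == dist_y)                               # True: same distribution
print(sum(p for w, p in P.items() if x[w] != y[w]))   # 1: P{x != y} = 1, so not equal a.s.
```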



An event A is sure if A = Ω.

An event A is almost sure (a.s.) if P (A) = 1.

An event A is negligible if P (A) = 0.



Definition. We say that two random objects defined on (Ω, F, P) and taking values in (Ω′, F′) are equal, x̃ = ỹ, if x̃(ω) = ỹ(ω) for all ω ∈ Ω.

Definition. We say that two random objects defined on (Ω, F, P) and taking values in (Ω′, F′) are equal almost surely (a.s.), x̃ = ỹ a.s., if

P{x̃ = ỹ} = P{ω ∈ Ω | x̃(ω) = ỹ(ω)} = 1,

or, equivalently, if

P{x̃ ≠ ỹ} = P{ω ∈ Ω | x̃(ω) ≠ ỹ(ω)} = 0.

Note that the concept of "a.s." is the same as that of "a.e." The only difference is that "a.e." applies to functions defined on measure spaces, whereas "a.s." applies to random objects defined on probability spaces.

Obviously, x̃ = ỹ ⟹ x̃ = ỹ a.s. Moreover, x̃ = ỹ a.s. ⟹ x̃ =ᵈ ỹ, but the converse is not true (see the coin example above, where x̃ =ᵈ ỹ but x̃ and ỹ are not equal a.s., since P{x̃ ≠ ỹ} = 1).
2.3. Distribution function of a random variable

Note that the distribution Px̃ of a random variable x̃ : (Ω, F, P) → (R, B) is a probability measure on (R, B) and, thus, is a finite measure.

Therefore, the distribution Px̃ of a random variable x̃ : (Ω, F, P) → (R, B) is a Lebesgue-Stieltjes measure on R satisfying Px̃(R) = 1.



Definition. The (cumulative) distribution function (cdf) Fx̃ : R → R of a random variable x̃ : (Ω, F, P) → (R, B) is the distribution function associated with the distribution Px̃, i.e.,

Px̃(a, b] = P{a < x̃ ≤ b} = Fx̃(b) − Fx̃(a),

where we make the normalization lim_{x→−∞} Fx̃(x) = 0.

Therefore,

Px̃(−∞, x] = P{x̃ ≤ x} = Fx̃(x) − lim_{t→−∞} Fx̃(t) = Fx̃(x).

Moreover,

lim_{x→∞} Fx̃(x) = Px̃(−∞, ∞) = P{x̃ ∈ R} = 1.

Thus, the distribution function of a random variable x̃ is increasing, right-continuous, and satisfies lim_{x→−∞} Fx̃(x) = 0 and lim_{x→∞} Fx̃(x) = 1.
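As a hedged illustration (not from the slides), the cdf of the die example of Section 2.2, where x̃ takes the values 1 and 7 with probabilities 2/3 and 1/3, is a right-continuous step function:

```python
from fractions import Fraction

pmf = {1: Fraction(2, 3), 7: Fraction(1, 3)}   # distribution of the die example

def F(t):
    """F(t) = P{x <= t}, a right-continuous, increasing step function."""
    return sum(p for v, p in pmf.items() if v <= t)

print(F(0), F(1), F(6.9), F(7), F(100))   # 0, 2/3, 2/3, 1, 1
```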



2.4. Discrete random variables

Definition. A random variable x̃ : (Ω, F) → (R, B) is discrete if its range is countable (either finite or countably infinite).

If Ω is discrete then x̃ is discrete. The converse is not true.

Let {x₁, x₂, ...} be the range x̃(Ω) of the discrete random variable x̃.

If x̃ is discrete there is a countable partition A = {A₁, A₂, ...} of Ω with

Aₙ = {ω ∈ Ω | x̃(ω) = xₙ}, for all xₙ ∈ x̃(Ω).

Therefore, Aₙ = x̃⁻¹(xₙ), for all xₙ ∈ x̃(Ω).



The σ-algebra σ(A) generated by the partition A is the smallest σ-algebra that makes the random variable x̃ measurable.

The distribution of a discrete random variable x̃ (which is said to have a discrete distribution) satisfies:

Px̃{xₙ} = P{x̃ = xₙ} = P(Aₙ), for all xₙ ∈ x̃(Ω).

Definition. The probability mass function (pmf) (or just probability function), fx̃ : x̃(Ω) → [0, 1], of a discrete random variable x̃ (or of a discrete distribution Px̃) is given by:

fx̃(x) = P{x̃ = x} = Px̃{x}, for all x ∈ x̃(Ω).



Properties of the probability and distribution functions of a discrete random variable:

1. Σ_{x∈x̃(Ω)} fx̃(x) = 1.

2. Any function f : x̃(Ω) → [0, 1], where x̃(Ω) is countable, satisfying Σ_{x∈x̃(Ω)} f(x) = 1 can serve as a probability function of a discrete distribution.

3. Fx̃(x) = Σ_{t≤x} fx̃(t), with t ∈ x̃(Ω).



4. Px̃(B) = P{x̃ ∈ B} = Σ_{x∈B} fx̃(x), for all B ∈ B, with x ∈ x̃(Ω).

5. fx̃(x) = Fx̃(x) − lim_{t→x⁻} Fx̃(t), for x ∈ x̃(Ω).

In particular, if the range of x̃ can be ordered so that x₁ < x₂ < ... < xᵢ₋₁ < xᵢ < xᵢ₊₁ < ..., then fx̃(x₁) = Fx̃(x₁) and fx̃(xᵢ) = Fx̃(xᵢ) − Fx̃(xᵢ₋₁) for i = 2, 3, ...



Example: Let x̃ be the number of heads when tossing 4 coins.

fx̃(0) = 1/16,  fx̃(1) = 4/16,  fx̃(2) = 6/16,  fx̃(3) = 4/16,  fx̃(4) = 1/16,

or

fx̃(x) = (1/16) · (4 choose x), for x = 0, 1, 2, 3, 4 (the range x̃(Ω)).
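A small sketch (illustrative, not from the slides) that recovers this pmf both by enumerating the 16 equally likely outcomes and from the closed form (1/16)·(4 choose x):

```python
from fractions import Fraction
from itertools import product
from math import comb

outcomes = list(product("HT", repeat=4))   # the 16 equally likely outcomes
pmf = {k: Fraction(sum(o.count("H") == k for o in outcomes), 16) for k in range(5)}
print(pmf)   # probabilities 1/16, 4/16, 6/16, 4/16, 1/16 for k = 0, ..., 4
print(all(pmf[k] == Fraction(comb(4, k), 16) for k in range(5)))   # True: matches (1/16)*C(4, k)
```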



Probability Histogram: [figure]



Probability Bar Chart: [figure]



Distribution function: [figure]



2.5. Continuous and absolutely continuous random variables

Definition 1. A random variable x̃ is continuous if its range x̃(Ω) is continuous.

Definition 2. A random variable x̃ is continuous if its distribution function Fx̃ is continuous, that is, if Px̃{x} = P{x̃ = x} = 0 for all x ∈ R.

Continuity according to Definition 2 implies continuity according to Definition 1.

Definition. A random variable x̃ : (Ω, F) → (R, B) is absolutely continuous if its distribution function Fx̃ is absolutely continuous, i.e., there exists a Borel measurable function fx̃ : (R, B) → (R̄, B̄) that is integrable with respect to Lebesgue measure such that

Fx̃(x) − Fx̃(a) = ∫_{[a,x]} fx̃(t) dt, for all a ∈ R, x ∈ R, with a ≤ x.



Absolute continuity implies continuity.

Random variables that are neither discrete nor absolutely continuous are called "mixed".

Equivalent definition: A random variable x̃ is absolutely continuous if its distribution Px̃ is absolutely continuous with respect to Lebesgue measure.

Therefore, thanks to the Radon-Nikodym theorem, there exists a Borel measurable function fx̃ : (R, B) → (R̄, B̄) such that

Px̃(B) = ∫_B fx̃(x) dx, for all B ∈ B.



2.6. Density

The Borel measurable function fx̃ : (R, B) → (R̄, B̄) such that

Px̃(B) = ∫_B fx̃(x) dx, for all B ∈ B,

is called the probability density function (pdf) (or density function, or just "density") of the random variable x̃ (or of the distribution Px̃).

Since Px̃(R) = 1, the density function fx̃ is integrable with respect to Lebesgue measure on (R, B).

Moreover, the density fx̃ is finite a.e. with respect to Lebesgue measure on (R, B).

The density function fx̃ of the random variable x̃ is the Radon-Nikodym derivative of its distribution with respect to Lebesgue measure, fx̃ = dPx̃/dx.



Note: If x̃ is absolutely continuous, then

Px̃(a, b] = Px̃(a, b) = Px̃[a, b] = Px̃[a, b) = Fx̃(b) − Fx̃(a) = ∫_{[a,b]} fx̃(x) dx.

Notation: If the random variable x̃ has the distribution Px̃, we write x̃ ∼ Px̃, x̃ ∼ Fx̃, or x̃ ∼ fx̃, where Fx̃ is the corresponding distribution function and fx̃ is the corresponding probability or density function.



Px̃[a, b] is given by the area under the density fx̃ over the interval [a, b]. [figure]
Properties of the density:

1. ∫_R fx̃(x) dx = 1.

2. Fx̃(x) = ∫_{(−∞,x]} fx̃(t) dt.

3. Any non-negative (a.e. w.r.t. Lebesgue measure) Borel measurable function f : (R, B) → (R̄, B̄) satisfying ∫_R f(x) dx = 1 can serve as a density of an absolutely continuous distribution on (R, B).

4. If x̃ is absolutely continuous, then fx̃ = Fx̃′ when the derivative of Fx̃ exists. Moreover, the derivative Fx̃′ exists a.e. w.r.t. Lebesgue measure. If fx̃ is continuous at x, then Fx̃ is differentiable at x and fx̃(x) = Fx̃′(x).
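A hedged sketch of properties 1 and 2 with an assumed density f(t) = 2t on (0, 1) (an illustration, not an example from the slides), using sympy for the integrals; here Fx̃(x) = x² on (0, 1), whose derivative 2x recovers f wherever f is continuous (property 4).

```python
import sympy as sp

t = sp.symbols("t", real=True)
# Assumed density for illustration: f(t) = 2t on (0, 1), 0 elsewhere.
f = sp.Piecewise((2*t, (t > 0) & (t < 1)), (0, True))

print(sp.integrate(f, (t, -sp.oo, sp.oo)))              # 1      (property 1)
print(sp.integrate(f, (t, -sp.oo, sp.Rational(1, 2))))  # 1/4 = F(1/2)  (property 2)
```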



2.7. Random vectors

x̃ : (Ω, F) → (Rⁿ, B).

x̃ = (x̃₁, x̃₂, ..., x̃ₙ) or x̃ = (x̃₁, x̃₂, ..., x̃ₙ)ᵀ.

x̃ᵢ = pᵢ(x̃), where pᵢ : Rⁿ → R is the projection to the ith coordinate.

The distribution of the random vector x̃ is a probability measure on (Rⁿ, B) given by

Px̃(B) = P(x̃⁻¹(B)) for all B ∈ B(Rⁿ).

The distribution function (cdf) of the random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ), Fx̃ : Rⁿ → R, is given by

Fx̃(x₁, x₂, ..., xₙ) = P{x̃ᵢ ≤ xᵢ, for i = 1, 2, ..., n}, for x = (x₁, ..., xₙ) ∈ Rⁿ.
The distribution function of a random vector x̃ is (i) increasing,...

(Increasing: a ≤ b ⟹ F(a) ≤ F(b), where a and b belong to Rⁿ and ≤ is componentwise.)

(ii) right-continuous,...

(Right-continuous at x₀: lim_{x→x₀⁺} F(x) ≡ F(x₀⁺) = F(x₀), where the limit is taken over x ≥ x₀ in Rⁿ.)

(iii) Fx̃(x) → 0 if at least one of the components xᵢ of x ∈ Rⁿ tends to −∞, and

(iv) Fx̃(x) → 1 if all the components xᵢ, i = 1, ..., n, of x ∈ Rⁿ tend to ∞.



The random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ) is discrete if its range x̃(Ω) is countable (or discrete).

The probability function (pmf), fx̃ : x̃₁(Ω) × x̃₂(Ω) × ... × x̃ₙ(Ω) → [0, 1], of a discrete random vector x̃ is given by:

fx̃(x) = Px̃{x} = P{(x̃₁, x̃₂, ..., x̃ₙ) = (x₁, x₂, ..., xₙ)} = P{x̃ᵢ = xᵢ, for i = 1, 2, ..., n},

for all x ∈ x̃₁(Ω) × x̃₂(Ω) × ... × x̃ₙ(Ω).

Note: x̃(Ω) ⊆ x̃₁(Ω) × x̃₂(Ω) × ... × x̃ₙ(Ω).



Properties of the probability and distribution functions of a discrete random vector:

1. Σ_{x∈x̃(Ω)} fx̃(x) = 1  or  Σ_{x∈x̃₁(Ω)×x̃₂(Ω)×...×x̃ₙ(Ω)} fx̃(x) = 1.

2. Fx̃(x) = Σ_{t≤x} fx̃(t), with t = (t₁, t₂, ..., tₙ) ∈ x̃(Ω), where t ≤ x means that tᵢ ≤ xᵢ for i = 1, 2, ..., n.

3. Px̃(B) = P{x̃ ∈ B} = Σ_{x∈B} fx̃(x), for all B ∈ B(Rⁿ).


The random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ) (or its distribution) is absolutely continuous if there exists a Borel measurable function fx̃ : (Rⁿ, B) → (R̄, B̄), called the density (pdf), that is integrable with respect to Lebesgue measure on (Rⁿ, B), such that

Px̃(B) = ∫_B fx̃(x₁, x₂, ..., xₙ) d(x₁, x₂, ..., xₙ), for all B ∈ B(Rⁿ).

Properties of the density of a random vector:

1. ∫_{Rⁿ} fx̃(x) dx = ∫_R ... ∫_R fx̃(x₁, ..., xₙ) dx₁ ... dxₙ = 1.

2. Fx̃(x) = ∫_{(−∞,xₙ]} ∫_{(−∞,xₙ₋₁]} ... ∫_{(−∞,x₁]} fx̃(t₁, t₂, ..., tₙ) dt₁ dt₂ ... dtₙ.


3. Any non-negative (a.e. w.r.t. Lebesgue measure) Borel measurable function f : (Rⁿ, B) → (R̄, B̄) satisfying

∫_R ... ∫_R f(x₁, x₂, ..., xₙ) dx₁ dx₂ ... dxₙ = 1

can serve as a density of an absolutely continuous distribution on (Rⁿ, B).

4. If the random vector x̃ = (x̃₁, x̃₂, ..., x̃ₙ) is absolutely continuous, then

fx̃(x₁, x₂, ..., xₙ) = ∂ⁿFx̃(x₁, x₂, ..., xₙ) / (∂x₁ ∂x₂ ... ∂xₙ),

when this nth crossed partial derivative of Fx̃ exists. Moreover, this derivative exists a.e. w.r.t. Lebesgue measure on (Rⁿ, B).



2.8. Marginal distributions

Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a random vector with distribution Px̃. The marginal distribution of x̃ᵢ, for i = 1, ..., n, is given by

Px̃ᵢ(B) = Px̃(R × ... × B × ... × R), for all B ∈ B(R),

where B occupies the ith coordinate.

Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a discrete random vector with the probability function fx̃. The marginal probability function of x̃ᵢ, for i = 1, ..., n, is given by

fx̃ᵢ(xᵢ) = Σ_{x₁∈x̃₁(Ω)} ... Σ_{xᵢ₋₁∈x̃ᵢ₋₁(Ω)} Σ_{xᵢ₊₁∈x̃ᵢ₊₁(Ω)} ... Σ_{xₙ∈x̃ₙ(Ω)} fx̃(x₁, ..., xᵢ₋₁, xᵢ, xᵢ₊₁, ..., xₙ),

for all xᵢ ∈ x̃ᵢ(Ω).



Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be an absolutely continuous random vector with the density fx̃. The marginal density of x̃ᵢ, for i = 1, ..., n, is given by

fx̃ᵢ(xᵢ) = ∫_R ... ∫_R fx̃(x₁, ..., xᵢ₋₁, xᵢ, xᵢ₊₁, ..., xₙ) dx₁ ... dxᵢ₋₁ dxᵢ₊₁ ... dxₙ,

for all xᵢ ∈ R.

Note: From the marginal probability or density functions we can construct the marginal distributions in the usual way.



Example 1: The discrete random vector (x̃, ỹ), where x̃ is the number of points when rolling a die and ỹ is the number of heads when tossing a coin, has a probability function fx̃,ỹ(x, y) summarized in the following table:

 y \ x  |   1     2     3     4     5     6   | fỹ(y)
   0    | 1/12  1/12  1/12  1/12  1/12  1/12  |  1/2
   1    | 1/12  1/12  1/12  1/12  1/12  1/12  |  1/2
 fx̃(x)  |  1/6   1/6   1/6   1/6   1/6   1/6  |   1

The marginal probability functions of x̃ and ỹ are summarized in the "margins".



Example 2: The absolutely continuous random vector (x̃, ỹ) has the following density:

fx̃,ỹ(x, y) = (2/3)(x + 2y) for 0 < x < 1 and 0 < y < 1,
fx̃,ỹ(x, y) = 0 otherwise.



Marginal densities:

fx̃(x) = ∫_{−∞}^{∞} fx̃,ỹ(x, y) dy = ∫₀¹ (2/3)(x + 2y) dy = (2/3)[xy + y²]₀¹ = (2/3)(x + 1), for 0 < x < 1.

Therefore,

fx̃(x) = (2/3)(x + 1) for 0 < x < 1,
fx̃(x) = 0 otherwise.



Similarly,

fỹ(y) = ∫_{−∞}^{∞} fx̃,ỹ(x, y) dx = ∫₀¹ (2/3)(x + 2y) dx = (2/3)[x²/2 + 2xy]₀¹ = (2/3)(1/2 + 2y) = (1/3)(1 + 4y), for 0 < y < 1.

Therefore,

fỹ(y) = (1/3)(1 + 4y) for 0 < y < 1,
fỹ(y) = 0 otherwise.
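The two marginal densities can be checked symbolically; the following sketch (assuming sympy) simply repeats the integrations above:

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
f = sp.Rational(2, 3) * (x + 2*y)     # joint density on 0 < x < 1, 0 < y < 1

f_x = sp.integrate(f, (y, 0, 1))      # marginal density of x~
f_y = sp.integrate(f, (x, 0, 1))      # marginal density of y~
print(sp.factor(f_x))                 # 2*(x + 1)/3
print(sp.expand(f_y))                 # 4*y/3 + 1/3, i.e. (1/3)*(1 + 4*y)
```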



2.9. Independent random variables

Definition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a random vector defined on (Ω, F, P) with the distribution Px̃ on (Rⁿ, B(Rⁿ)). The random variables x̃₁, x̃₂, ..., x̃ₙ are said to be independent if, for all collections of sets B₁, B₂, ..., Bₙ belonging to B(R), we have

P{x̃₁ ∈ B₁, ..., x̃ₙ ∈ Bₙ} = P{x̃₁ ∈ B₁} · ... · P{x̃ₙ ∈ Bₙ},

or, equivalently, if the distribution of the random vector x̃ is equal to the product measure of the marginal distributions,

Px̃ = Px̃₁ × ... × Px̃ₙ ≡ ∏_{i=1}^{n} Px̃ᵢ.

Definition. Let x̃ᵢ : (Ω, F, P) → (Ωᵢ, Fᵢ), for i = 1, ..., n. The random objects x̃₁, x̃₂, ..., x̃ₙ are said to be independent if, for all sets B₁ ∈ F₁, ..., Bₙ ∈ Fₙ,

P{x̃₁ ∈ B₁, ..., x̃ₙ ∈ Bₙ} = P{x̃₁ ∈ B₁} · ... · P{x̃ₙ ∈ Bₙ}.



Equivalent definition. Let x̃₁, x̃₂, ..., x̃ₙ be a collection of random objects on the probability space (Ω, F, P),

x̃ᵢ : (Ω, F) → (Ωᵢ, Fᵢ), for i = 1, 2, ..., n.

The random objects x̃₁, x̃₂, ..., x̃ₙ are said to be independent if the joint distribution

P(x̃₁,x̃₂,...,x̃ₙ) : ⊗_{i=1}^{n} Fᵢ → [0, 1]

of these n random objects is equal to the product measure of the marginal distributions,

P(x̃₁,x̃₂,...,x̃ₙ) = ∏_{i=1}^{n} Px̃ᵢ,

where Px̃ᵢ : Fᵢ → [0, 1] is the marginal distribution of the random object x̃ᵢ, i = 1, ..., n, and ⊗_{i=1}^{n} Fᵢ is the product σ-algebra.


Proposition. Let x̃ᵢ : (Ω, F, P) → (Ωᵢ, Fᵢ), for i = 1, ..., n, be a collection of independent random objects and gᵢ : (Ωᵢ, Fᵢ) → (Ωᵢ′, Fᵢ′), for i = 1, ..., n, be measurable functions. Then, the random objects gᵢ(x̃ᵢ) : (Ω, F, P) → (Ωᵢ′, Fᵢ′), for i = 1, ..., n, are independent.



Proof. If

P{x̃₁ ∈ B₁, ..., x̃ₙ ∈ Bₙ} = P{x̃₁ ∈ B₁} · ... · P{x̃ₙ ∈ Bₙ},

for all sets B₁ ∈ F₁, ..., Bₙ ∈ Fₙ, then

P{x̃₁ ∈ g₁⁻¹(B₁′), ..., x̃ₙ ∈ gₙ⁻¹(Bₙ′)} = P{x̃₁ ∈ g₁⁻¹(B₁′)} · ... · P{x̃ₙ ∈ gₙ⁻¹(Bₙ′)},

for all sets B₁′ ∈ F₁′, ..., Bₙ′ ∈ Fₙ′, since g₁⁻¹(B₁′) ∈ F₁, ..., gₙ⁻¹(Bₙ′) ∈ Fₙ due to the measurability of gᵢ, for i = 1, ..., n. Therefore,

P{g₁(x̃₁) ∈ B₁′, ..., gₙ(x̃ₙ) ∈ Bₙ′} = P{g₁(x̃₁) ∈ B₁′} · ... · P{gₙ(x̃ₙ) ∈ Bₙ′},

for all sets B₁′ ∈ F₁′, ..., Bₙ′ ∈ Fₙ′, which proves the independence of the random objects gᵢ(x̃ᵢ) : (Ω, F, P) → (Ωᵢ′, Fᵢ′), for i = 1, ..., n. Q.E.D.



Proposition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a random vector with the distribution function Fx̃ : Rⁿ → [0, 1] and let Fᵢ : R → [0, 1] be the marginal distribution function of x̃ᵢ, for i = 1, ..., n. Then, the random variables x̃₁, x̃₂, ..., x̃ₙ are independent if and only if

Fx̃(x₁, ..., xₙ) = F₁(x₁) · F₂(x₂) · ... · Fₙ(xₙ),

for all x = (x₁, ..., xₙ) ∈ Rⁿ.



Proposition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be a discrete random vector with the probability function fx̃ : x̃₁(Ω) × ... × x̃ₙ(Ω) → [0, 1] and let fᵢ : x̃ᵢ(Ω) → [0, 1] be the marginal probability function of x̃ᵢ, for i = 1, ..., n. Then, the random variables x̃₁, x̃₂, ..., x̃ₙ are independent if and only if

fx̃(x₁, ..., xₙ) = f₁(x₁) · f₂(x₂) · ... · fₙ(xₙ),

for all x = (x₁, ..., xₙ) ∈ x̃₁(Ω) × ... × x̃ₙ(Ω).

Proposition. Let x̃ = (x̃₁, x̃₂, ..., x̃ₙ) be an absolutely continuous random vector with the density function fx̃ : Rⁿ → R and let fᵢ : R → R be the marginal density function of x̃ᵢ, for i = 1, ..., n. Then, the random variables x̃₁, x̃₂, ..., x̃ₙ are independent if and only if

fx̃(x₁, ..., xₙ) = f₁(x₁) · f₂(x₂) · ... · fₙ(xₙ),

for all x = (x₁, ..., xₙ) ∈ Rⁿ.
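As a hedged illustration of the density criterion (a sketch assuming sympy, not part of the slides): the joint density of Example 2 does not factor into the product of its marginals, so x̃ and ỹ are not independent there, whereas the die/coin table of Example 1 clearly does factor (each cell 1/12 = 1/6 · 1/2).

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
f = sp.Rational(2, 3) * (x + 2*y)     # joint density of Example 2 on (0,1)x(0,1)
f_x = sp.integrate(f, (y, 0, 1))      # marginal of x~: (2/3)*(x + 1)
f_y = sp.integrate(f, (x, 0, 1))      # marginal of y~: (1/3)*(1 + 4*y)

# Factorization test: independence would require f == f_x * f_y on the unit square.
print(sp.simplify(f - f_x * f_y) == 0)   # False: x~ and y~ are dependent
```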



2.10. Generalized conditional probability

Let x̃ : (Ω, F, P) → (Ω′, F′) and let us fix the event B ∈ F. From the Radon-Nikodym theorem we know that there exists a Borel measurable function g : (Ω′, F′) → (R, B) such that

P({x̃ ∈ A} ∩ B) = ∫_A g(x) dPx̃(x), for all A ∈ F′,

since λ ≪ Px̃, where λ(A) ≡ P({x̃ ∈ A} ∩ B). The function g is called the conditional probability of B given x̃ = x and is written as g(x) = P(B | x̃ = x). The conditional probability is essentially unique for a given B ∈ F (i.e., if there exists another such function h, then g = h a.e. [Px̃]). Therefore,

P({x̃ ∈ A} ∩ B) = ∫_A P(B | x̃ = x) dPx̃(x),

with P(B | x̃ = ·) : (Ω′, F′) → (R, B).

However, sometimes the conditional probability is viewed as a measure on (Ω, F),

P(· | x̃ = x) : F → R.

Moreover, if A = Ω′, then

P({x̃ ∈ Ω′} ∩ B) = P(Ω ∩ B) = P(B) = ∫_{Ω′} P(B | x̃ = x) dPx̃(x),

which is a generalization of the theorem of total probability.
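A discrete sketch of this total-probability identity (using the die example of Section 2.2 and an illustrative event B = "the roll is even"; not from the slides): summing P(B | x̃ = v) against the distribution of x̃ recovers P(B).

```python
from fractions import Fraction

omega = range(1, 7)
P = {w: Fraction(1, 6) for w in omega}
x = lambda w: 1 if w <= 4 else 7          # the die-example random variable
B = {2, 4, 6}                             # illustrative event: "the roll is even"

total = Fraction(0)
for v in (1, 7):
    A_v = [w for w in omega if x(w) == v]                              # the event {x~ = v}
    P_B_given_v = sum(P[w] for w in A_v if w in B) / sum(P[w] for w in A_v)
    total += P_B_given_v * sum(P[w] for w in A_v)
print(total, sum(P[w] for w in B))        # both 1/2
```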

Note that, if x̃ is an absolutely continuous random variable, then P(B | x̃ = x) is a conditional probability given an event ({x̃ = x}) that has zero probability!



2.11. Conditional distributions

Definition. Let (x̃, ỹ) be a vector of two random objects x̃ : (Ω, F, P) → (Ω_x, F_x) and ỹ : (Ω, F, P) → (Ω_y, F_y), and let C ∈ F_y be a fixed measurable set. The conditional distribution of ỹ given x̃ = x is the Borel measurable function Pỹ|x̃(C | x̃ = ·) : (Ω_x, F_x) → (R, B) given by

Pỹ|x̃(C | x) = P{ỹ ∈ C | x̃ = x}, for all x ∈ Ω_x,

which is essentially unique w.r.t. Px̃.

However, sometimes the conditional distribution is viewed as a measure on (Ω_y, F_y),

Pỹ|x̃(· | x) : F_y → R.



Assume that the random vector (x̃, ỹ) is discrete with the probability function fx̃,ỹ : x̃(Ω) × ỹ(Ω) → [0, 1]. Then, the conditional distribution Pỹ|x̃(y | x) = P{ỹ = y | x̃ = x} must satisfy

P{x̃ ∈ A, ỹ ∈ C} = Σ_{x∈A} Pỹ|x̃(C | x) P{x̃ = x} = Σ_{x∈A} Σ_{y∈C} Pỹ|x̃(y | x) fx̃(x), for all A ∈ B(R) and C ∈ B(R),   (∗)

where P{x̃ = x} = fx̃(x) and Pỹ|x̃(C | x) = Σ_{y∈C} Pỹ|x̃(y | x).

Let us define the conditional distribution Pỹ|x̃(y | x) as follows:

Pỹ|x̃(y | x) = P{x̃ = x, ỹ = y} / P{x̃ = x} = fx̃,ỹ(x, y) / fx̃(x) ≡ fỹ|x̃(y | x),

for all (x, y) ∈ x̃(Ω) × ỹ(Ω) with fx̃(x) > 0.


The function fỹ|x̃(· | x) : ỹ(Ω) → [0, 1], for all x ∈ x̃(Ω) such that fx̃(x) > 0, is the conditional probability function of ỹ given x̃ = x.

The previous definition of the conditional probability function (or conditional distribution) of ỹ given x̃ = x is the right one since expression (∗) becomes

P{x̃ ∈ A, ỹ ∈ C} = Σ_{x∈A} Σ_{y∈C} fỹ|x̃(y | x) fx̃(x) = Σ_{x∈A} Σ_{y∈C} [fx̃,ỹ(x, y) / fx̃(x)] fx̃(x) = Σ_{x∈A} Σ_{y∈C} fx̃,ỹ(x, y),

for all A ∈ B(R) and C ∈ B(R).



Assume that the random vector (x̃, ỹ) is absolutely continuous with the density fx̃,ỹ : R² → R. Then, we would like to have an expression like this:

P{x̃ ∈ A, ỹ ∈ C} = ∫_A Pỹ|x̃(C | x) dPx̃(x) = ∫_A Pỹ|x̃(C | x) fx̃(x) dx = ∫_A [∫_C fỹ|x̃(y | x) dy] fx̃(x) dx,   (∗∗)

for all A ∈ B(R) and C ∈ B(R), where Pỹ|x̃(C | x) = ∫_C fỹ|x̃(y | x) dy.



Let us define the conditional density of ỹ given x̃ = x, fỹ|x̃(· | x) : R → R, for all x ∈ R such that fx̃(x) > 0, as follows:

fỹ|x̃(y | x) = fx̃,ỹ(x, y) / fx̃(x), for all (x, y) ∈ R² with fx̃(x) > 0.

The previous definition of the conditional density of ỹ given x̃ = x is the right one since expression (∗∗) becomes

P{x̃ ∈ A, ỹ ∈ C} = ∫_A ∫_C fỹ|x̃(y | x) fx̃(x) dy dx = ∫_A ∫_C [fx̃,ỹ(x, y) / fx̃(x)] fx̃(x) dy dx = ∫_A ∫_C fx̃,ỹ(x, y) dy dx,

for all A ∈ B(R) and C ∈ B(R).



If the discrete (absolutely continuous) random variables x̃ and ỹ are independent, then

fỹ|x̃(y | x) = fx̃,ỹ(x, y) / fx̃(x) = [fx̃(x) · fỹ(y)] / fx̃(x) = fỹ(y), for fx̃(x) > 0.

That is, the conditional probability function (density function) is equal to the corresponding unconditional probability function (density function).



Note that from the conditional probability and density functions we can obtain the conditional distribution in the usual way, namely,

Pỹ|x̃(C | x) = P{ỹ ∈ C | x̃ = x} = Σ_{y∈C} fỹ|x̃(y | x), for all C ∈ B,

or

Pỹ|x̃(C | x) = P{ỹ ∈ C | x̃ = x} = ∫_C fỹ|x̃(y | x) dy, for all C ∈ B,

where

Pỹ|x̃(C | ·) : (R, B) → (R, B)

or, sometimes,

Pỹ|x̃(· | x) : B(R) → R.

Note again that, if x̃ is an absolutely continuous random variable, then Pỹ|x̃(C | x) is a conditional distribution (and, hence, a conditional probability P{ỹ ∈ C | x̃ = x}) given the event {x̃ = x}, which has zero probability! This conditional distribution is well defined when the marginal density of x̃ evaluated at x, fx̃(x), is strictly positive.
Example: The absolutely continuous random vector (x̃, ỹ) has the following density:

fx̃,ỹ(x, y) = (2/3)(x + 2y) for 0 < x < 1 and 0 < y < 1,
fx̃,ỹ(x, y) = 0 otherwise.

We have already proved that the marginal density of the random variable ỹ is

fỹ(y) = (1/3)(1 + 4y) for 0 < y < 1,
fỹ(y) = 0 otherwise.


Therefore, the conditional density of x̃ given ỹ = y is

fx̃|ỹ(x | y) = [(2/3)(x + 2y)] / [(1/3)(1 + 4y)] = (2x + 4y)/(1 + 4y) for 0 < x < 1,
fx̃|ỹ(x | y) = 0 otherwise,

for 0 < y < 1.

Thus, the conditional density of x̃ given ỹ = 1/4 is

fx̃|ỹ(x | 1/4) = (2x + 1)/2 = (1/2)(2x + 1) for 0 < x < 1,
fx̃|ỹ(x | 1/4) = 0 otherwise.


Then,

Px̃|ỹ((−∞, 1/3] | 1/4) = P{x̃ ≤ 1/3 | ỹ = 1/4} = ∫_{−∞}^{1/3} fx̃|ỹ(x | 1/4) dx
= ∫₀^{1/3} (1/2)(2x + 1) dx = (1/2)[x² + x]₀^{1/3} = (1/2)(1/9 + 1/3) = (1/2) · (4/9) = 4/18 = 2/9,

while

Px̃(−∞, 1/3] = ∫_{−∞}^{1/3} [∫_{−∞}^{∞} fx̃,ỹ(x, y) dy] dx = ∫₀^{1/3} (2/3)(x + 1) dx = 7/27,

where the inner integral is the marginal density fx̃(x) = (2/3)(x + 1).
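The two probabilities just computed can be reproduced by symbolic integration; a sketch assuming sympy (not part of the slides):

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
f = sp.Rational(2, 3) * (x + 2*y)                           # joint density on (0,1)x(0,1)

f_y = sp.integrate(f, (x, 0, 1))                            # marginal of y~: (1/3)*(1 + 4*y)
f_cond = sp.simplify((f / f_y).subs(y, sp.Rational(1, 4)))  # conditional density given y = 1/4
print(f_cond)                                               # x + 1/2, i.e. (2x + 1)/2
print(sp.integrate(f_cond, (x, 0, sp.Rational(1, 3))))      # 2/9

f_x = sp.integrate(f, (y, 0, 1))                            # marginal of x~: (2/3)*(x + 1)
print(sp.integrate(f_x, (x, 0, sp.Rational(1, 3))))         # 7/27
```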



If we have more than two random variables, we can generalize the previous conditional probability and density functions.

Example:

fx̃₁,x̃₃|x̃₂,x̃₄(x₁, x₃ | x₂, x₄) = fx̃₁,x̃₂,x̃₃,x̃₄(x₁, x₂, x₃, x₄) / fx̃₂,x̃₄(x₂, x₄),

where

fx̃₂,x̃₄(x₂, x₄) = Σ_{x₁∈x̃₁(Ω)} Σ_{x₃∈x̃₃(Ω)} fx̃₁,x̃₂,x̃₃,x̃₄(x₁, x₂, x₃, x₄) > 0,

or

fx̃₂,x̃₄(x₂, x₄) = ∫_R ∫_R fx̃₁,x̃₂,x̃₃,x̃₄(x₁, x₂, x₃, x₄) dx₁ dx₃ > 0.
