Chapter 0: Revision
Probability Space:

Let Ω denote the sample space in an experiment.

Definition: A σ-field is a non-empty class of subsets of Ω that is closed under the formation of
countable unions and complements and contains the null set ∅.

A σ-field of great interest in probability is the Borel σ-field of subsets of the real line ℜ. This is the σ-field generated by the class of all bounded semi-closed intervals of the form (a, b], and it is denoted by B.

Examples:

- Toss a coin once: Ω = {H, T}


The σ-field S is the class of all subsets of Ω, namely S = {∅, {H}, {T}, {H, T}}.
- Toss a coin twice: Ω = {(H, H), (H, T), (T, H), (T, T)}. S has 2⁴ = 16 elements.

Definition: The triplet (Ω, B, P) is called a probability space, while Ω or (Ω, B) is called the sample space.

Definition: Let Ω be a space and B a Borel field of subsets of Ω. A probability measure P(·) on B is a numerically valued function with domain B satisfying the following axioms:

1) For all A ∈ B, P(A) ≥ 0.

2) If A₁, A₂, A₃, … are pairwise disjoint events in B, then
P(A₁ ∪ A₂ ∪ ⋯) = P(A₁) + P(A₂) + ⋯, i.e. P(⋃_{i=1}^∞ Aᵢ) = ∑_{i=1}^∞ P(Aᵢ).

3) P(Ω) = 1.

Axiom (2) is called “countable additivity”. The corresponding axiom restricted to a finite collection {A₁, A₂, …, Aₙ} is called “finite additivity”.
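
For example, for a fair die with Ω = {1, 2, …, 6} and P({k}) = 1/6, finite additivity gives P({2, 4, 6}) = P({2}) + P({4}) + P({6}) = 3 · (1/6) = 1/2.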

Recall that:

n n! n  n  n n n


 = . Note that  =  ,  = 1=   , and  =n
 r  r !(n − r )! r n−r n 0 1

 n  n!
 =
 n1 , n2 , , nk  n1 !n2 ! nk !
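
For instance, (5 choose 2) = 5!/(2! 3!) = 10, and the multinomial coefficient (5 choose 1, 2, 2) = 5!/(1! 2! 2!) = 120/4 = 30 counts the ways to split five items into groups of sizes 1, 2, and 2.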


Conditional Probability and Independence:

Definition: The conditional probability of the event A given that B occurred is

P(A | B) = P(A ∩ B) / P(B), provided that P(B) ≠ 0. Therefore:

P(A ∩ B) = P(B | A) P(A) = P(A | B) P(B).

The law of total probability (LOTP):


Let {B₁, B₂, …, Bₙ} be a partition of the sample space Ω and let A be any event. Then P(A) = ∑_{i=1}^n P(A | Bᵢ) P(Bᵢ).

Bayes' Rule: P(A | B) = P(B | A) P(A) / P(B), P(B) ≠ 0.
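
A quick worked illustration (numbers chosen here for illustration, not from the notes): suppose a disease has prevalence P(D) = 0.01, a test detects it with P(+ | D) = 0.99, and gives a false positive with P(+ | Dᶜ) = 0.05. By the LOTP, P(+) = 0.99 · 0.01 + 0.05 · 0.99 = 0.0594, so by Bayes' rule P(D | +) = 0.0099/0.0594 ≈ 0.167. Note how the small prevalence keeps the posterior probability low.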

Independent Events: P(A ∩ B) = P(A) P(B), which implies P(A | B) = P(A) and P(B | A) = P(B).

Random Variables:

Definition: A real-valued point function X(·) defined on the space (Ω, B, P) is called a random variable (r.v.) (or measurable) if the set {ω: X(ω) ≤ x} ∈ B for every x ∈ ℜ. X is a mapping of Ω into ℜ.

Example: Toss two coins and let X represent the number of heads.

Definition: p(x) is a discrete probability function if the following two conditions hold:

i) p(x) ≥ 0
ii) ∑_x p(x) = 1

Examples: uniform, Bernoulli, binomial, geometric, negative binomial, and hypergeometric distributions.

Expectation: µ = E(X) = ∑_x x p(x)
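
For example, for a fair die, µ = E(X) = ∑_{x=1}^6 x · (1/6) = 3.5.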


RMK: E(g(X)) = ∑_x g(x) p(x)

Definition: The rth moment of a random variable X is defined as E(X^r) = ∑_x x^r p(x).

Definition: The rth central moment of a random variable X is defined as E[(X − µ)^r] = ∑_x (x − µ)^r p(x).

Definition: The second central moment is called the variance of X, denoted σ² or Var(X) = E[(X − µ)²]. The standard deviation is σ = √Var(X).

RMK: Var(X) = E(X²) − µ².
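
This follows by expanding the square: E[(X − µ)²] = E(X² − 2µX + µ²) = E(X²) − 2µE(X) + µ² = E(X²) − µ², using E(X) = µ.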

Definition: f(x) is a continuous probability density function if the following two conditions hold:

i) f(x) ≥ 0

ii) ∫_{−∞}^{∞} f(x) dx = 1

Definition: Let X be a continuous r.v.; then P(a < X < b) = ∫_a^b f(x) dx.

Examples: uniform, exponential, gamma, chi-square, beta, Cauchy, and normal distributions.

RMK: The cumulative distribution function (CDF) F(x) satisfies the following:

i) F(x) = ∑_{t ≤ x} p(t) in the discrete case (and F(x) = ∫_{−∞}^x f(t) dt in the continuous case)

ii) lim_{x→−∞} F(x) = 0

iii) lim_{x→∞} F(x) = 1

iv) If a ≤ b then F(a) ≤ F(b), i.e. F(x) is a non-decreasing function.
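
For example, for the exponential distribution with mean β, F(x) = 1 − e^{−x/β} for x ≥ 0 (and 0 otherwise): it rises from 0 to 1 and is non-decreasing, as required.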


Random Vector:

Definition: A random vector X = (X₁, X₂, …, Xₙ) is a vector of jointly distributed random variables. Its expectation is taken componentwise:

E(X) = (E X₁, E X₂, …, E Xₙ)′

Bivariate Probability Distribution:

Definition: p(x₁, x₂) is a joint probability function of the discrete random variables X₁ and X₂ if p(x₁, x₂) satisfies the following two conditions:

i) p(x₁, x₂) ≥ 0 for all x₁ and x₂.  ii) ∑_{x₁} ∑_{x₂} p(x₁, x₂) = 1

Definition: f(x₁, x₂) is a joint probability density function of the continuous random variables X₁ and X₂ if f(x₁, x₂) satisfies the following two conditions:

i) f(x₁, x₂) ≥ 0 for all x₁ and x₂.  ii) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x₁, x₂) dx₂ dx₁ = 1

Marginal distribution:

In the discrete case, the marginal distribution of X is defined as p_X(x) = ∑_y p(x, y), summing over all values of y. In the continuous case, the marginal distribution of X is defined as f_X(x) = ∫_{−∞}^{∞} f(x, y) dy.

Definition: The conditional density of X given that Y = y is defined as

f(x | y) = f(x, y) / f_Y(y), f_Y(y) ≠ 0.

Definition: X and Y are independent random variables if and only if F(x, y) = F_X(x) F_Y(y) for all x and y. Equivalently:

a) In the discrete case: X and Y are independent if and only if p(x, y) = p_X(x) pY(y) for all x and y, where pY is written p_Y.

b) In the continuous case: X and Y are independent if and only if f(x, y) = f_X(x) f_Y(y) for all x and y.


Covariance:

Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = E(XY) − E(X) E(Y)

The correlation coefficient is ρ = Cov(X, Y) / (σ_X σ_Y).

Note that ρ is unit-free and lies between −1 and 1, i.e. −1 ≤ ρ ≤ 1.

Conditional Expectations:

E(X₁ | X₂) = ∫_{−∞}^{∞} x₁ f(x₁ | x₂) dx₁

Law of iterated expectations: E(X) = E(E(X | Y)).

Theorem (law of total variance): Var(X) = E(Var(X | Y)) + Var(E(X | Y)).
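
As a sanity check, the decomposition can be verified by simulation. The sketch below uses a toy model chosen purely for illustration (Y ~ Poisson(4), X | Y ~ Normal(Y, 1)), for which E(X | Y) = Y and Var(X | Y) = 1, so both sides should be close to 1 + Var(Y) = 5:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

# Toy model (an assumption for this sketch): Y ~ Poisson(4), X | Y ~ N(Y, 1)
y = rng.poisson(4, size=n)
x = rng.normal(loc=y, scale=1.0)

# Left-hand side: Var(X) estimated directly
print(x.var())        # ~ 5

# Right-hand side: E(Var(X|Y)) + Var(E(X|Y)) = E(1) + Var(Y)
print(1 + y.var())    # ~ 5
```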


Distribution of Function of Random Variables

Goal: If X is a random variable with p.d.f. f(x), what is the distribution of the random variable U = g(X)?

That is, we have a function of the random variable X and we want to describe the distribution of the new random variable U. Here the function g transforms X into the new random variable U.

The distribution of the new random variable U can be found using one of the following methods:

1) The method of distribution function
2) The method of transformations
3) The method of moment-generating functions

The Method of Distribution Function:

Summary of distribution function method:

- Find the region U = u in the (x₁, x₂, …, xₙ) space.
- Find the region U ≤ u.
- Find F_U(u) = P(U ≤ u) = ∫∫⋯∫_{U ≤ u} f(x₁, x₂, …, xₙ) dx₁ dx₂ ⋯ dxₙ.
- Find the density function f_U(u) = (d/du) F_U(u).

Examples (Univariate distributions):

1) Let X be a random variable that follows the uniform distribution on the interval (0, 1). Find the density function of the random variable U = −β ln(1 − X), where β > 0.
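
A sketch of the solution by the distribution function method (assuming β > 0): for u > 0,

F_U(u) = P(−β ln(1 − X) ≤ u) = P(ln(1 − X) ≥ −u/β) = P(X ≤ 1 − e^{−u/β}) = 1 − e^{−u/β},

since X is Uniform(0, 1). Differentiating gives f_U(u) = (1/β) e^{−u/β} for u > 0, i.e. U follows the exponential distribution with mean β.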


2) Let X be a random variable which follows the normal distribution with parameters µ and σ. Show that the random variable Z = (X − µ)/σ follows the standard normal distribution.
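
A sketch: F_Z(z) = P((X − µ)/σ ≤ z) = P(X ≤ µ + σz) = F_X(µ + σz). Differentiating with respect to z gives f_Z(z) = σ f_X(µ + σz) = σ · (1/(σ√(2π))) e^{−(σz)²/(2σ²)} = (1/√(2π)) e^{−z²/2}, the standard normal density.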

3) Let X be any continuous random variable with CDF F. Show that the random variable U = F(X) always follows the uniform distribution on (0, 1).
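
A sketch of the standard argument, assuming F is strictly increasing (so F⁻¹ exists): for 0 < u < 1,

F_U(u) = P(F(X) ≤ u) = P(X ≤ F⁻¹(u)) = F(F⁻¹(u)) = u,

which is the CDF of the Uniform(0, 1) distribution. This is the probability integral transform. A quick simulation check (the exponential CDF here is just an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ Exponential with mean 2; its CDF is F(x) = 1 - exp(-x/2)
x = rng.exponential(scale=2.0, size=10**6)
u = 1 - np.exp(-x / 2.0)

# If U = F(X) ~ Uniform(0, 1), its mean and variance are 1/2 and 1/12
print(u.mean())   # ~ 0.5
print(u.var())    # ~ 0.0833
```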


Example (Bivariate distributions):

4) Let X₁ and X₂ denote a random sample of size n = 2 from the uniform distribution on the interval (0, 1). Find the probability density function of U = X₁ + X₂.
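
A sketch of the solution: since (X₁, X₂) is uniform on the unit square, F_U(u) = P(X₁ + X₂ ≤ u) is the area of the square below the line x₁ + x₂ = u. For 0 ≤ u ≤ 1 that area is u²/2, and for 1 < u ≤ 2 it is 1 − (2 − u)²/2. Differentiating gives the triangular density

f_U(u) = u for 0 ≤ u ≤ 1, and f_U(u) = 2 − u for 1 < u ≤ 2.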


The Method of Transformations

Let X be a random variable with density function f_X(x). Let U = h(X), where h(x) is a one-to-one function; then the density function of U can be obtained as

f_U(u) = f_X(h⁻¹(u)) |dx/du|,

where x = h⁻¹(u).

Examples (Univariate Distributions):

1) Let X ~ gamma(α, β). What is the density function of U = e^X?
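
A sketch, assuming the parameterization f_X(x) = x^{α−1} e^{−x/β} / (Γ(α) β^α) for x > 0 (scale parameter β): since u = e^x is one-to-one with inverse x = ln u and dx/du = 1/u, the transformation method gives, for u > 1,

f_U(u) = f_X(ln u) · (1/u) = (ln u)^{α−1} u^{−1/β − 1} / (Γ(α) β^α),

using e^{−(ln u)/β} = u^{−1/β}. This is sometimes called a log-gamma distribution.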


Examples (Bivariate Distributions):

Let (X₁, X₂) have a jointly continuous distribution with pdf f_{X₁,X₂}(x₁, x₂). Let Y₁ = u₁(X₁, X₂) and Y₂ = u₂(X₁, X₂), where the functions y₁ = u₁(x₁, x₂) and y₂ = u₂(x₁, x₂) define a one-to-one transformation. If we can write x₁ = w₁(y₁, y₂) and x₂ = w₂(y₁, y₂), then

f_{Y₁,Y₂}(y₁, y₂) = f_{X₁,X₂}(w₁(y₁, y₂), w₂(y₁, y₂)) |J|,

where J is the Jacobian determinant

J = | ∂x₁/∂y₁  ∂x₁/∂y₂ |
    | ∂x₂/∂y₁  ∂x₂/∂y₂ | .
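
A standard illustration (chosen here as an example; it is not worked in the notes): let X₁ and X₂ be independent standard normal random variables and set Y₁ = X₁ + X₂, Y₂ = X₁ − X₂. Then x₁ = (y₁ + y₂)/2 and x₂ = (y₁ − y₂)/2, so |J| = |(1/2)(−1/2) − (1/2)(1/2)| = 1/2, and

f_{Y₁,Y₂}(y₁, y₂) = (1/2) f_{X₁,X₂}((y₁ + y₂)/2, (y₁ − y₂)/2) = (1/(4π)) e^{−(y₁² + y₂²)/4},

which factors into two N(0, 2) densities, so Y₁ and Y₂ are independent.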
Transformation in ℜⁿ: Let Y = g(X), where X = (X₁, X₂, …, Xₙ) is a continuous random vector and g: ℜⁿ → ℜⁿ is one-to-one. Then the joint pdf of Y is f_Y(y) = f_X(x) |J|, where x = g⁻¹(y) and J = det(dx/dy) is the determinant of the matrix of partial derivatives:

J = | ∂x₁/∂y₁  ∂x₂/∂y₁  ⋯  ∂xₙ/∂y₁ |
    |    ⋮        ⋮            ⋮    |
    | ∂x₁/∂yₙ  ∂x₂/∂yₙ  ⋯  ∂xₙ/∂yₙ | .

2) Supp # 1


The Method of Moment-Generating Functions

For any random variable X, the moment generating function M_X(t) = E(e^{tX}), when it exists, is unique: it determines the distribution of X.

Theorem: Let X₁, X₂, …, Xₙ be independent random variables with moment generating functions M_{X₁}(t), M_{X₂}(t), …, M_{Xₙ}(t). If U = X₁ + X₂ + ⋯ + Xₙ, then

M_U(t) = M_{X₁}(t) × M_{X₂}(t) × ⋯ × M_{Xₙ}(t).

Proof:
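
A sketch of the standard argument: M_U(t) = E(e^{tU}) = E(e^{t(X₁ + ⋯ + Xₙ)}) = E(e^{tX₁} e^{tX₂} ⋯ e^{tXₙ}), and since the Xᵢ are independent, the expectation of the product is the product of the expectations, giving M_{X₁}(t) M_{X₂}(t) ⋯ M_{Xₙ}(t).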

Examples:

1) Let X₁, X₂, …, Xₙ be independent normally distributed random variables with E(Xᵢ) = µᵢ and Var(Xᵢ) = σᵢ² for i = 1, 2, …, n. Show that U = ∑_{i=1}^n Xᵢ = X₁ + X₂ + ⋯ + Xₙ is normally distributed with mean ∑_{i=1}^n µᵢ and variance ∑_{i=1}^n σᵢ².
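
A sketch of the solution, using the normal MGF M_{Xᵢ}(t) = exp(µᵢ t + σᵢ² t²/2): by the theorem,

M_U(t) = ∏_{i=1}^n exp(µᵢ t + σᵢ² t²/2) = exp((∑ µᵢ) t + (∑ σᵢ²) t²/2),

which is the MGF of a normal distribution with mean ∑ µᵢ and variance ∑ σᵢ²; uniqueness of the MGF completes the argument.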


2) Let X₁, X₂, …, Xₙ be independent gamma distributed random variables with parameters αᵢ and fixed β. Show that U = ∑_{i=1}^n Xᵢ has a gamma distribution with parameters ∑_{i=1}^n αᵢ and β.
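
A sketch, assuming the scale parameterization with MGF M_{Xᵢ}(t) = (1 − βt)^{−αᵢ} for t < 1/β: by the theorem,

M_U(t) = ∏_{i=1}^n (1 − βt)^{−αᵢ} = (1 − βt)^{−∑ αᵢ},

which is the gamma(∑ αᵢ, β) MGF, so U ~ gamma(∑ αᵢ, β) by uniqueness.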

3) Let X₁, X₂, …, Xₙ be independent normally distributed random variables with E(Xᵢ) = µᵢ and Var(Xᵢ) = σᵢ² for i = 1, 2, …, n. Let Zᵢ = (Xᵢ − µᵢ)/σᵢ. Show that U = ∑_{i=1}^n Zᵢ² has a chi-square distribution with n degrees of freedom.
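
A sketch: each Zᵢ is standard normal, and a direct calculation gives M_{Zᵢ²}(t) = E(e^{t Zᵢ²}) = (1 − 2t)^{−1/2} for t < 1/2. Since the Zᵢ² are independent,

M_U(t) = ∏_{i=1}^n (1 − 2t)^{−1/2} = (1 − 2t)^{−n/2},

which is the MGF of the chi-square distribution with n degrees of freedom.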
