Chapter 0: Revision
Probability Space:

Let Ω denote the sample space in an experiment.

Definition: A σ-field is a non-empty class of subsets of Ω that is closed under the formation of
countable unions and complements and contains the null set ∅.

A σ-field of great interest in probability is the Borel σ-field of subsets of the real line ℜ. This is the σ-field generated by the class of all bounded semi-closed intervals of the form (a, b], and it is denoted by B.

Examples:

- Toss a coin once: Ω = {H, T}


The σ-field S is the class of all subsets of Ω, namely S = {∅, {H}, {T}, {H, T}}.
- Toss a coin twice: Ω = {(H, H), (H, T), (T, H), (T, T)}. S has 2⁴ = 16 elements.

Definition: The triplet (Ω, B, P) is called a probability space, while Ω or (Ω, B) is called the sample space.

Definition: Let Ω be a space and B a Borel field of subsets of Ω. A probability measure P(·) on B is a numerically valued function with domain B satisfying the following axioms:

1) For all A ∈ B, P(A) ≥ 0.

2) If A₁, A₂, A₃, … are pairwise disjoint events in B, then
P(A₁ ∪ A₂ ∪ ⋯) = P(A₁) + P(A₂) + ⋯, i.e. P(⋃_{i=1}^∞ Aᵢ) = ∑_{i=1}^∞ P(Aᵢ).

3) P(Ω) = 1.

Axiom (2) is called “countable additivity”. The corresponding axiom restricted to a finite collection {A₁, A₂, …, Aₙ} is called “finite additivity”.
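
For example, for a fair die with Ω = {1, 2, …, 6} and P({k}) = 1/6, finite additivity gives P({2, 4, 6}) = P({2}) + P({4}) + P({6}) = 3 · (1/6) = 1/2.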

Recall that:

n n! n  n  n n n


 = . Note that  =  ,  = 1=   , and  =n
 r  r !(n − r )! r n−r n 0 1

 n  n!
 =
 n1 , n2 , , nk  n1 !n2 ! nk !
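
For instance, (5 choose 2) = 5!/(2! 3!) = 10, and the multinomial coefficient (5 choose 1, 2, 2) = 5!/(1! 2! 2!) = 120/4 = 30 counts the ways to split five items into groups of sizes 1, 2, and 2.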


Conditional Probability and Independence:

Definition: The conditional probability of the event A given that B occurred is

P(A | B) = P(A ∩ B) / P(B), provided that P(B) ≠ 0. Therefore:

P(A ∩ B) = P(B | A) P(A) = P(A | B) P(B).

The law of total probability (LOTP):


Let {B₁, B₂, …, Bₙ} be a partition of the sample space Ω and let A be any event. Then P(A) = ∑_{i=1}^n P(A | Bᵢ) P(Bᵢ).

Bayes' Rule: P(A | B) = P(B | A) P(A) / P(B), P(B) ≠ 0.
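
A quick worked illustration (numbers chosen here for illustration, not from the notes): suppose a disease has prevalence P(D) = 0.01, a test detects it with P(+ | D) = 0.99, and gives a false positive with P(+ | Dᶜ) = 0.05. By the LOTP, P(+) = 0.99 · 0.01 + 0.05 · 0.99 = 0.0594, so by Bayes' rule P(D | +) = 0.0099/0.0594 ≈ 0.167. Note how the small prevalence keeps the posterior probability low.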

Independent Events: P(A ∩ B) = P(A) P(B), which implies P(A | B) = P(A) and P(B | A) = P(B).

Random Variables:

Definition: A real-valued point function X(·) defined on the space (Ω, B, P) is called a random variable (r.v.) (or measurable) if the set {ω: X(ω) ≤ x} ∈ B for every x ∈ ℜ. X is a mapping of Ω into ℜ.

Example: Toss two coins and let X represent the number of heads.

Definition: p(x) is a discrete probability function if the following two conditions hold:

i) p(x) ≥ 0
ii) ∑_x p(x) = 1

Examples: uniform, Bernoulli, binomial, geometric, negative binomial, and hypergeometric distributions.

Expectation: µ = E(X) = ∑_x x p(x)
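
For example, for a fair die, µ = E(X) = ∑_{x=1}^6 x · (1/6) = 3.5.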


RMK: E(g(X)) = ∑_x g(x) p(x)

Definition: The rth moment of a random variable X is defined as E(X^r) = ∑_x x^r p(x).

Definition: The rth central moment of a random variable X is defined as E[(X − µ)^r] = ∑_x (x − µ)^r p(x).

Definition: The second central moment is called the variance of X, denoted σ² or Var(X) = E[(X − µ)²]. The standard deviation is σ = √Var(X).

RMK: Var(X) = E(X²) − µ².
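
This follows by expanding the square: E[(X − µ)²] = E(X² − 2µX + µ²) = E(X²) − 2µE(X) + µ² = E(X²) − µ², using E(X) = µ.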

Definition: f(x) is a continuous probability density function if the following two conditions hold:

i) f(x) ≥ 0

ii) ∫_{−∞}^{∞} f(x) dx = 1

Definition: Let X be a continuous r.v.; then P(a < X < b) = ∫_a^b f(x) dx.

Examples: uniform, exponential, gamma, chi-square, beta, Cauchy, and normal distributions.

RMK: The cumulative distribution function (CDF) F(x) satisfies the following:

i) F(x) = ∑_{t ≤ x} p(t) in the discrete case (and F(x) = ∫_{−∞}^x f(t) dt in the continuous case)

ii) lim_{x→−∞} F(x) = 0

iii) lim_{x→∞} F(x) = 1

iv) If a ≤ b then F(a) ≤ F(b), i.e. F(x) is a non-decreasing function.
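
For example, for the exponential distribution with mean β, F(x) = 1 − e^{−x/β} for x ≥ 0 (and 0 otherwise): it rises from 0 to 1 and is non-decreasing, as required.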


Random Vector:

Definition: A random vector X = (X₁, X₂, …, Xₙ) is a vector of jointly distributed random variables. Its expectation is taken componentwise:

E(X) = (E X₁, E X₂, …, E Xₙ)′

Bivariate Probability Distribution:

Definition: p(x₁, x₂) is a joint probability function of the discrete random variables X₁ and X₂ if p(x₁, x₂) satisfies the following two conditions:

i) p(x₁, x₂) ≥ 0 for all x₁ and x₂.  ii) ∑_{x₁} ∑_{x₂} p(x₁, x₂) = 1

Definition: f(x₁, x₂) is a joint probability density function of the continuous random variables X₁ and X₂ if f(x₁, x₂) satisfies the following two conditions:

i) f(x₁, x₂) ≥ 0 for all x₁ and x₂.  ii) ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x₁, x₂) dx₂ dx₁ = 1

Marginal distribution:

In the discrete case, the marginal distribution of X is defined as p_X(x) = ∑_y p(x, y), summing over all values of y. In the continuous case, the marginal distribution of X is defined as f_X(x) = ∫_{−∞}^{∞} f(x, y) dy.

Definition: The conditional density of X given that Y = y is defined as

f(x | y) = f(x, y) / f_Y(y), f_Y(y) ≠ 0.

Definition: X and Y are independent random variables if and only if F(x, y) = F_X(x) F_Y(y) for all x and y. Equivalently:

a) In the discrete case: X and Y are independent if and only if p(x, y) = p_X(x) pY(y) for all x and y, where pY is written p_Y.

b) In the continuous case: X and Y are independent if and only if f(x, y) = f_X(x) f_Y(y) for all x and y.


Covariance:

Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = E(XY) − E(X) E(Y)

The correlation coefficient is ρ = Cov(X, Y) / (σ_X σ_Y).

Note that ρ is unit-free and lies between −1 and 1, i.e. −1 ≤ ρ ≤ 1.

Conditional Expectations:

E(X₁ | X₂) = ∫_{−∞}^{∞} x₁ f(x₁ | x₂) dx₁

Law of iterated expectations: E(X) = E(E(X | Y)).

Theorem (law of total variance): Var(X) = E(Var(X | Y)) + Var(E(X | Y)).
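
As a sanity check, the decomposition can be verified by simulation. The sketch below uses a toy model chosen purely for illustration (Y ~ Poisson(4), X | Y ~ Normal(Y, 1)), for which E(X | Y) = Y and Var(X | Y) = 1, so both sides should be close to 1 + Var(Y) = 5:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

# Toy model (an assumption for this sketch): Y ~ Poisson(4), X | Y ~ N(Y, 1)
y = rng.poisson(4, size=n)
x = rng.normal(loc=y, scale=1.0)

# Left-hand side: Var(X) estimated directly
print(x.var())        # ~ 5

# Right-hand side: E(Var(X|Y)) + Var(E(X|Y)) = E(1) + Var(Y)
print(1 + y.var())    # ~ 5
```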


Distribution of Function of Random Variables

Goal: If X is a random variable with p.d.f. f(x), what is the distribution of the random variable U = g(X)?

That is, we have a function of the random variable X and we want to describe the distribution of the new random variable U. Here the function g transforms X into the new random variable U.

The distribution of the new random variable U can be found using one of the following methods:

1) The method of distribution function
2) The method of transformations
3) The method of moment-generating functions

The Method of Distribution Function:

Summary of distribution function method:

- Find the region U = u in the (x₁, x₂, …, xₙ) space.
- Find the region U ≤ u.
- Find F_U(u) = P(U ≤ u) = ∫∫⋯∫_{U ≤ u} f(x₁, x₂, …, xₙ) dx₁ dx₂ ⋯ dxₙ.
- Find the density function f_U(u) = (d/du) F_U(u).

Examples (Univariate distributions):

1) Let X be a random variable that follows the uniform distribution on the interval (0, 1). Find the density function of the random variable U = −β ln(1 − X), where β > 0.
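
A sketch of the solution by the distribution function method (assuming β > 0): for u > 0,

F_U(u) = P(−β ln(1 − X) ≤ u) = P(ln(1 − X) ≥ −u/β) = P(X ≤ 1 − e^{−u/β}) = 1 − e^{−u/β},

since X is Uniform(0, 1). Differentiating gives f_U(u) = (1/β) e^{−u/β} for u > 0, i.e. U follows the exponential distribution with mean β.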


2) Let X be a random variable which follows the normal distribution with parameters µ and σ. Show that the random variable Z = (X − µ)/σ follows the standard normal distribution.
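
A sketch: F_Z(z) = P((X − µ)/σ ≤ z) = P(X ≤ µ + σz) = F_X(µ + σz). Differentiating with respect to z gives f_Z(z) = σ f_X(µ + σz) = σ · (1/(σ√(2π))) e^{−(σz)²/(2σ²)} = (1/√(2π)) e^{−z²/2}, the standard normal density.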

3) Let X be any continuous random variable with CDF F. Show that the random variable U = F(X) always follows the uniform distribution on (0, 1).
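
A sketch of the standard argument, assuming F is strictly increasing (so F⁻¹ exists): for 0 < u < 1,

F_U(u) = P(F(X) ≤ u) = P(X ≤ F⁻¹(u)) = F(F⁻¹(u)) = u,

which is the CDF of the Uniform(0, 1) distribution. This is the probability integral transform. A quick simulation check (the exponential CDF here is just an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ Exponential with mean 2; its CDF is F(x) = 1 - exp(-x/2)
x = rng.exponential(scale=2.0, size=10**6)
u = 1 - np.exp(-x / 2.0)

# If U = F(X) ~ Uniform(0, 1), its mean and variance are 1/2 and 1/12
print(u.mean())   # ~ 0.5
print(u.var())    # ~ 0.0833
```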


Example (Bivariate distributions):

4) Let X₁ and X₂ denote a random sample of size n = 2 from the uniform distribution on the interval (0, 1). Find the probability density function of U = X₁ + X₂.
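
A sketch of the solution: since (X₁, X₂) is uniform on the unit square, F_U(u) = P(X₁ + X₂ ≤ u) is the area of the square below the line x₁ + x₂ = u. For 0 ≤ u ≤ 1 that area is u²/2, and for 1 < u ≤ 2 it is 1 − (2 − u)²/2. Differentiating gives the triangular density

f_U(u) = u for 0 ≤ u ≤ 1, and f_U(u) = 2 − u for 1 < u ≤ 2.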


The Method of Transformations

Let X be a random variable with density function f_X(x). Let U = h(X), where h(x) is a one-to-one function; then the density function of U can be obtained as

f_U(u) = f_X(h⁻¹(u)) |dx/du|,

where x = h⁻¹(u).

Examples (Univariate Distributions):

1) Let X ~ gamma(α, β). What is the density function of U = e^X?
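
A sketch, assuming the parameterization f_X(x) = x^{α−1} e^{−x/β} / (Γ(α) β^α) for x > 0 (scale parameter β): since u = e^x is one-to-one with inverse x = ln u and dx/du = 1/u, the transformation method gives, for u > 1,

f_U(u) = f_X(ln u) · (1/u) = (ln u)^{α−1} u^{−1/β − 1} / (Γ(α) β^α),

using e^{−(ln u)/β} = u^{−1/β}. This is sometimes called a log-gamma distribution.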


Examples (Bivariate Distributions):

Let (X₁, X₂) have a jointly continuous distribution with pdf f_{X₁,X₂}(x₁, x₂). Let Y₁ = u₁(X₁, X₂) and Y₂ = u₂(X₁, X₂), where the functions y₁ = u₁(x₁, x₂) and y₂ = u₂(x₁, x₂) define a one-to-one transformation. If we can write x₁ = w₁(y₁, y₂) and x₂ = w₂(y₁, y₂), then

f_{Y₁,Y₂}(y₁, y₂) = f_{X₁,X₂}(w₁(y₁, y₂), w₂(y₁, y₂)) |J|,

where J is the Jacobian determinant

J = | ∂x₁/∂y₁  ∂x₁/∂y₂ |
    | ∂x₂/∂y₁  ∂x₂/∂y₂ | .
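
A standard illustration (chosen here as an example; it is not worked in the notes): let X₁ and X₂ be independent standard normal random variables and set Y₁ = X₁ + X₂, Y₂ = X₁ − X₂. Then x₁ = (y₁ + y₂)/2 and x₂ = (y₁ − y₂)/2, so |J| = |(1/2)(−1/2) − (1/2)(1/2)| = 1/2, and

f_{Y₁,Y₂}(y₁, y₂) = (1/2) f_{X₁,X₂}((y₁ + y₂)/2, (y₁ − y₂)/2) = (1/(4π)) e^{−(y₁² + y₂²)/4},

which factors into two N(0, 2) densities, so Y₁ and Y₂ are independent.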
Transformation in ℜⁿ: Let Y = g(X), where X = (X₁, X₂, …, Xₙ) is a continuous random vector and g: ℜⁿ → ℜⁿ is one-to-one. Then the joint pdf of Y is f_Y(y) = f_X(x) |J|, where x = g⁻¹(y) and J = det(dx/dy) is the determinant of the matrix of partial derivatives:

J = | ∂x₁/∂y₁  ∂x₂/∂y₁  ⋯  ∂xₙ/∂y₁ |
    |    ⋮        ⋮            ⋮    |
    | ∂x₁/∂yₙ  ∂x₂/∂yₙ  ⋯  ∂xₙ/∂yₙ | .

2) Supp # 1


The Method of Moment-Generating Functions

For any random variable X, the moment generating function M_X(t) = E(e^{tX}), when it exists, is unique: it determines the distribution of X.

Theorem: Let X₁, X₂, …, Xₙ be independent random variables with moment generating functions M_{X₁}(t), M_{X₂}(t), …, M_{Xₙ}(t). If U = X₁ + X₂ + ⋯ + Xₙ, then

M_U(t) = M_{X₁}(t) × M_{X₂}(t) × ⋯ × M_{Xₙ}(t).

Proof:
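
A sketch of the standard argument: M_U(t) = E(e^{tU}) = E(e^{t(X₁ + ⋯ + Xₙ)}) = E(e^{tX₁} e^{tX₂} ⋯ e^{tXₙ}), and since the Xᵢ are independent, the expectation of the product is the product of the expectations, giving M_{X₁}(t) M_{X₂}(t) ⋯ M_{Xₙ}(t).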

Examples:

1) Let X₁, X₂, …, Xₙ be independent normally distributed random variables with E(Xᵢ) = µᵢ and Var(Xᵢ) = σᵢ² for i = 1, 2, …, n. Show that U = ∑_{i=1}^n Xᵢ = X₁ + X₂ + ⋯ + Xₙ is normally distributed with mean ∑_{i=1}^n µᵢ and variance ∑_{i=1}^n σᵢ².
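
A sketch of the solution, using the normal MGF M_{Xᵢ}(t) = exp(µᵢ t + σᵢ² t²/2): by the theorem,

M_U(t) = ∏_{i=1}^n exp(µᵢ t + σᵢ² t²/2) = exp((∑ µᵢ) t + (∑ σᵢ²) t²/2),

which is the MGF of a normal distribution with mean ∑ µᵢ and variance ∑ σᵢ²; uniqueness of the MGF completes the argument.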


2) Let X₁, X₂, …, Xₙ be independent gamma distributed random variables with parameters αᵢ and fixed β. Show that U = ∑_{i=1}^n Xᵢ has a gamma distribution with parameters ∑_{i=1}^n αᵢ and β.
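
A sketch, assuming the scale parameterization with MGF M_{Xᵢ}(t) = (1 − βt)^{−αᵢ} for t < 1/β: by the theorem,

M_U(t) = ∏_{i=1}^n (1 − βt)^{−αᵢ} = (1 − βt)^{−∑ αᵢ},

which is the gamma(∑ αᵢ, β) MGF, so U ~ gamma(∑ αᵢ, β) by uniqueness.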

3) Let X₁, X₂, …, Xₙ be independent normally distributed random variables with E(Xᵢ) = µᵢ and Var(Xᵢ) = σᵢ² for i = 1, 2, …, n. Let Zᵢ = (Xᵢ − µᵢ)/σᵢ. Show that U = ∑_{i=1}^n Zᵢ² has a chi-square distribution with n degrees of freedom.
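
A sketch: each Zᵢ is standard normal, and a direct calculation gives M_{Zᵢ²}(t) = E(e^{t Zᵢ²}) = (1 − 2t)^{−1/2} for t < 1/2. Since the Zᵢ² are independent,

M_U(t) = ∏_{i=1}^n (1 − 2t)^{−1/2} = (1 − 2t)^{−n/2},

which is the MGF of the chi-square distribution with n degrees of freedom.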
