You are on page 1of 82

STA 227: PROBABILITY AND STATISTICS III

Course outline
 Multivariate distribution theory: marginal and conditional distributions
 Transformation of random variables; use of generating function technique
 Chebyshev’s inequality
 The central limit theorem
 The weak and strong law of large numbers
2 2
 χ− distribution; Distribution of the sum and ratio of two independent χ random
variate.
 Student’s t – distribution for comparison of two sample means.
 F- distribution
 Order Statistics
 Characteristics functions

REFERENCES
1. Mathematical statistics
By John ‘E’ Freund
2. Introduction to Mathematical Statistics
By Robert .V. Hogg and Allan .T. Craig
3. Mathematical statistics
By Irwin Miller and Marylees Miller

COURSE LECTURER
PROF. J.K. KINYANJUI

Page 1 of 82
MULTIVARIATE DISTRIBUTION THEORY
We earlier defined random variable as a real-valued function over a sample space with a probability
measure, and it stands to reason that many different random variables can be defined over one and
the same sample space.
In this section we shall be concerned first with the bivariate case, i.e. with situation where we are
interested at the same time in a pair of random variables defined over a joint sample space. Later we
shall extend this discussion to the Multivariate case, covering any finite number of random variables.
If X and Yare discrete random variables, we write the probability that X will take on the value x
and Y will take on the value pr ( X =x ,Y = y ) .Thus, p ( X=x ,Y = y ) is the probability of the
y as

intersection of the event X =x and Y= y

Example 1
Two caplets are selected at random from a bottle containing three aspirins, two sedative, and four
laxative caplets.
If X and Y are, respectively, the numbers of aspirin and sedative caplets included among the two
caplets drawn from the bottle, find the probabilities associated with all possible pairs of values of X
and Y.
Solution
The possible pair are ( 0,0 ) , ( 0,1 ) , ( 1,0 ) , ( 1,1 ) , ( 0,2 ) and ( 2,0 )
To find the probability associated with ( 1,0 ) , for example, observe that we are concerned with the
event of getting one of the three aspirin caplets, none of the two sedative caplet, and hence, one of the
four laxative caplets.
( 3 ¿) ¿ ¿ ¿
The no. of ways in which this can be done is ¿ and the total number of
( 9 ¿ ) ¿ ¿¿
ways in which two of the nine caplets can be selected is ¿
Since these possibilities are all equally likely by virtue of the assumption that the selection is random,
12 1
=
it follows that the probability associated with ( 1,0 ) is 36 3 ,
Similarly, the probability associated with ( 1,1 ) is:

( 3 ¿) ¿ ¿ ¿ ¿ ¿
¿ ,
and continuing this way, we obtain the values shown in the following table:

x
0 1 2
0 1 1 1
6 3 12
1 2 1 0
y 9 6

Page 2 of 82
2 1 0 0
36

It is generally preferably to represent probabilities such as these by means of formula. In other words,
it is preferable to express the probabilities by means of a function with the values
f ( x , y ) =Pr ( X=x ,Y = y ) for any pair of values ( x, y ) within the range of the random variables
X and Y .
For instance, for the two random variables of the above example, we can write

3
f ( x , y )= ׿ ( 2 ¿ ) ¿ ¿ ¿ ¿ ¿
()
x ¿
Definition 1
If X and Yare discrete random variables, the function given by f ( x , y ) =Pr ( X=x ,Y = y ) for each pair
of values ( x, y ) within the range of X and Y is called the joint probability distribution of X and
Y.
Theorem 1
A bivariate function can serve as the joint probability distribution of a pair of discrete random
variable X and Y if and only if its values, f ( x , y ) , satisfy the conditions:
(1) f ( x , y ) ≥0 , for each pair of values ( x, y ) within its domain;
∑ ∑ f ( x, y )=1 ( x, y ) within
(2) x y , where the double summation extends over all possible pairs
its domain.

Example 2
Determine the value of k for which the function given by f ( x , y ) =kxy for x=1,2,3 ; y=1,2,3 can
serve as a joint probability distribution.
Solution
Substituting the various values of x and y , we get
f ( 1,1 )=k , f ( 1,2 )=2 k , f ( 1,3 )=3 k , f ( 2,1 ) =2 k , f ( 2,2 )=4 k , f ( 2,3 )=6 k , f ( 3,1 )=3 k , f ( 3,2 )=6 k , and
f ( 3,3 ) =9 k .
To satisfy the first condition of Theorem 1 above, the constant k must be nonnegative, and to
satisfy the second condition,
k +2 k +3 k +2 k +4 k +6 k +3 k +6 k +9 k=1
1
⇒ 36 k =1 ⇒ k =
36

Definition 2

Page 3 of 82
The X and Y are discrete random variables, the function given by
F ( x , y )= pr ( X ≤x , Y ≤ y )=∑ ∑ f ( s , t )
s≤x t≤ y , for −∞< x <∞ ; −∞< y< ∞
where f ( s, t ) is the value of the joint probability distribution of X and Y at ( s ,t ) , is called the
joint distribution function or the joint Cumulative Distribution Function, of X and Y.
Example 3
With reference to example 1 find F ( 1,1 ) .
Solution
F ( 1,1 )=Pr ( X ≤1 ,Y ≤1 )
1 2 1 1 8
=f ( 0,0)+ f ( 0,1)+ f ( 1,0)+ f ( 1,1)= + + + =
6 9 3 6 9
As in the univariate case, the joint distribution function of two random variables is defined for all real
numbers.
For instance, for example 1, we get F (−2,1 )= pr ( X ≤−2 ,Y ≤1 )=0 and
F ( 3.7,4 .5 ) =pr ( X ≤3 .7,Y ≤4.5 )=1
Definition 3
A bivariate function with values f ( x , y ) , defined over the XY − plane is called a joint probability
density function of a continuous random variable X and Y if and only if
pr [ ( X , Y ) ∈ A ] =∬ f ( x , y ) dxdy
A
for any region A in the XY − plane .
Theorem 2
A bivariate function can serve as a joint probability density function of a pair of continuous random
variables X and Y if its values, f ( x , y ) , satisfy the conditions:
f ( x , y ) ≥0 for −∞< x <∞ , −∞< y <∞ ;
(1)
∫∫ f ( x , y ) dx dy = 1
x y
(2)
Example 4
Given the joint probability density function
3
{
f ( x , y ) =¿ x ( y+x ) for 0<x<1, 0<y<2 ¿ ¿¿¿
5
of two random variables X and Y , find Pr [ ( x , y ) ∈ A ] where A is the region
{( x , y ) / 0< x < 12 , 1< y< 2 }
Solution
Pr [ ( x , y ) ∈ A ] = Pr 0< X < 12 , 1< Y <2
[ ]
1
2 2
3
∫ ∫ 5
x ( y +x ) dxdy
= y =1 x=0

Page 4 of 82
2 1
3 x2 y 3x 3 2 2

= y=1
[ 10
+
x =0 15 ]
= y =1
2
dy ∫(
y =1
3y 1
+
40 40
dy=
3 y2 y
) [ +
80 40 ] =
11
80
Analogous to definition 2, we have the following definition of the joint distribution function of two
continuous random variables.

Definition 4
If X and Y are continuous random variables, the function given by
y x
F ( x , y )=Pr ( X≤x ,Y ≤ y )= ∫ ∫ f ( s, t ) dsdt
−∞ −∞ , for −∞< x <∞ ; −∞< y <∞
where f ( s, t ) is the value of the joint probability density of X and Y at ( s ,t ) is called the joint
Cumulative Density Function of X and Y .
We shall limit our discussion here to random variable whose joint cumulative density function is
continuous everywhere and partially differentiable with respect to each variable for all but a finite set
of values of two random variables.
d
f (x )= [ F ( x )]
Analogous to the relationship dx for univariate case, partial differentiation in
2
f ( x , y ) = ∂ F ( x, y )
definition 4leads to ∂ x∂ y
wherever these partial derivatives exist.
The joint cumulative density function of two continuous random variables determines their joint
probability density function at all points ( x, y ) where the joint density is continuous.
Example 5
If the joint probability density of X and Y is given by

f ( x,y ) =¿ { x+y for 0<x<1; 0<y<1¿¿¿¿


Find the cumulative density function of these two random variables.
Solution
The cumulative density function of X and Y is defined as;
y x
F ( x , y )=Pr ( X≤x ,Y ≤ y )= ∫ ∫ f ( s , t ) dsdt
−∞ −∞
y x y 2 x y
s x2
= ∫ ∫ ( s+t ) dsdt= ∫
t=0 s=0 t=0
[ ] 2
+st
s=0
dt = ∫
t=0
[ 2 ]
+ xt−0 dt
y
x 2 t xt 2 x 2 y xy 2 1
=
2 [ +
2 ]
t =0
=
2
+ = xy ( x+ y )
2 2

Example 6
Find the joint probability density of the two random variables X and Y whose joint cumulative
density function is given by

F ( x , y )=¿ {( 1−e−x )( 1−e− y ) ¿ ¿¿¿


Page 5 of 82
Also we use the joint probability density function to determine Pr ( 1< X <3 , 1<Y <2 )
Solution
(a) Since the partial differentiation yields
∂2 F ( x , y ) = ∂2 [ ( 1−e−x )( 1−e− y ) ]
∂ x∂ y ∂ x∂ y
= ∂ ∂ ( 1−e−x )( 1−e−y ) = ∂ ( 1−e−x )× ∂ ( 1−e− y )
∂x ∂ y[ ∂x ] ∂y
= ∂ ( 1−e−x )×(−e− y ) = −e− y ∂ ( 1−e− x )
∂x ∂x =−e− y (−e− x )=e−x e− y =e−x− y
=e− ( x+ y )
2
f (x , y)= ∂
− ( x+ y )
F ( x, y )=e for x>0, and y>0
∂ x∂ y
i.e. = 0, elsewhere
We find that the joint probability density of X and Y is given by:
f (x , y )=e−( x+ y ) for x>0 , and y>0
= 0 , elsewhere
2 3
Pr ( 1< X <3 , 1<Y <2 )= ∫ ∫ e−( x+ y ) dxdy
y=1 x=1
(b)
2 2 2
3
=∫ [ −e−( x+ y ) x=1
] dy = ∫ −( e − (3+ y ) − ( 1+ y )
−e ) dy= ∫ [−e−( 3+ y ) +e−( 1+ y ) ] dy
y=1 y=1 y=1
2
2 2
e −(1+ y )
= − ∫ e−( 3+ y ) + ∫ e− ( 1+ y ) dy
y=1
−5 −4
y=1
−3 −2 −5
−1 −( 3+ y ) 2
= (e
−1
−4
] y=1 +
−3
−1
−2
[ ] y=1 =[ e
− ( 3+ y ) 2
] y=1−[e
−( 1+ y ) 2
] y=1
=e −e −( e −e )=e −e −e +e =0. 074

For two random variables the joint probability is geometrically speaking, a surface and the
probability that we calculated in the preceding example is given by the volume under this surface as
shown in the figure (2) below;

All the definitions of the section can be generalized to the multivariate case, where there are n
random variables. Corresponding to definition1, the values of the joint probability distribution of n
discrete random variables 1 2 X , X ,⋯, X
n are given by
f ( x 1 , x 2 ,⋯, x n ) =Pr ( X 1= x 1 , X 2 =x 2 ,⋯, X n= x n )

Page 6 of 82
x , x ,. .. , x n )
for each n-tuple ( 1 2 within the range of the random variables; and corresponding to
definition 2, the values of their joint distribution function are given by
F ( x 1 , x 2 ,⋯x n ) =Pr ( X 1≤x 1 , X 2 ≤x 2 ,⋯, X n≤x n )
for
−∞< x 1 <∞ , −∞< x 2 <∞ ,⋯,−∞< x n <∞

Example 7
If the joint probability distribution of the three random variable X ,Y and Z is given by

( x+y ) z
f ( x , y , x )=¿
63 { for x=1,2; y=1,2,3; z=1,2¿ ¿¿¿
Find Pr ( X=2 , Y +Z≤3 )
Solution
3 6 4 13
Pr ( X=2 , Y +Z≤3 )=f ( 2,1,1 )+ f ( 2,1,2 )+ f ( 2,2,1 ) = + + =
63 63 63 63
In the continuous case, probabilities are again obtained by integrating the joint probability density,
and the joint cumulative density function is given by
xn x 2 x1

F ( x 1 , x 2 ,⋯, x n ) =∫ .. . ∫ ∫ f ( t 1, t 2 . .. , t n ) dt 1 dt 2 ⋯dt n
−∞ −∞ −∞

for
−∞<x 1 <∞ ,−∞< x 2 <∞ ,⋯,−∞< x n <∞ analogous to Definition 4.
Also partial differentiation yields
n
f ( x 1 , x 2 ,⋯, x n ) = ∂ F ( x 1 , x2 ,. .. , x n)
∂ x 1 ∂ x 2 .. . ∂ x n
wherever this partial derivative exist.
EXAMPLE 8
If the trivariate probability density of
X 1 , X 2 , and X 3 is given by
−x3
{
f ( x 1 , x2 , x 3)=¿ ( x 1+x 2)e for 0<x 1 ¿1, 0<x 2 ¿1 x3 ¿0 ¿ ¿¿¿
Pr [ ( X 1 , X 2 X 3 ) ∈ A ]
Find where A is the region
1 1
{( x 1 , x 2 , x 3 ) / 0< x1 < 2 , 2 < x 2 < 1, x 3 <1 }
Solution
Pr [ ( X 1 , X 2 , X 3 ) ∈ A ] =Pr 0< X 1 < 12 , 12 < X 2 < 12 <1 , X 3 <1
[ ]
1
1 2 1
2 1 1 − x3 1
= ∫ ∫ ∫ ( x1+ x2) e
− x3
dx 1 dx 2 dx 3 =∫ ∫ −[ ( x 1+x 2) e ] x 3=0
dx 1 dx 2
x1 =0 x = 1 x =0 x1 =0 x = 1
2 2 3 2 2
1 1
2 1 2 1
0
= ∫ ∫ [−( x1 + x 2 ) e
1
−1
−( x 1 + x 2 e ) ] dx 1 dx 2 = ∫ ∫ [−( x1 + x 2) e−1+ ( x 1+ x 2 ) ] dx 1 dx 2
x1 =0 x 2= 2 x1 =0 x = 1
2 2

Page 7 of 82
1 1
2 1 2 1

=−e−1 ∫ ∫ ( x 1 + x 1 ) dx 1 dx 2 + ∫ ∫ ( x 1 + x 2 ) dx 1 dx 2
x1 =0 x = 1 x1 =0 x = 1
2 2 2 2
1 1
1 1
2 2
x2
=−e −1

x1 =0
1
[ x 1 x2 +
x

2
2
2

] x2 =
1
2
dx 1 + ∫
x 1=0
1
[ x1 x2 + 2
2 ] x =
2
1
2
dx 1

2 2
1 1 1 1 1
=−e
−1
∫ ( x 1+ 12 )−( 2 x 1 + 8 ) dx 1 + ∫ ( x1 + 2 )−( 2 x 1 + 8 ) dx 1
x1=0 x 1=0
1 1 1 1
2 2 2 2
1 3
=−e
−1
∫ ( x 1 + 12 − 12 x 1 − 18 ) dx 1 + ∫ ( x 1 + 12 − 12 x 1− 18 ) dx 1 =−e
−1
∫ ( 12 x1 + 38 ) dx 1 + ∫ ( 2 x 1 + 8 )dx 1
x1 =0 x1 =0 x1=0 x 1=0
1 1
1 3 1 3
=−e−1
[ x 2 + x1
4 1 8 ] [
2

x1=0
+
4
x 1 + x1
8
2
] 2

x 1=0
=−e−1
1 1 3 1
4 [
( 4 ) 8
1
4
3
8][
+ ( 2 )−0 + ( 14 )+ ( 12 )−0 ]
1 3 1 3 4 4 1 1
=−e−1
[ + + +
16 16 16 16 ]( ) =−e−1 ( ) + =−e−1 +
16 16 4 4 ()
1
= [ 1−e−1 ]=0 . 158
4

EXERCISE
1. If the values of the joint probability distribution of X and Y are as shown in the table below:
x

0 1 2
1 1 1
0 12 6 24
1 1 1
y
1 4 4 40
1 1
2 8 20 0
1
3 120 0 0

Find
a) Pr ( X=1 , Y =2 ) ; [ 1 20 ]
b) Pr ( X = 0 , 1≤Y <3 ) → [38]
c) Pr ( X +Y ≤1 ) [12] →

Pr ( X >Y ) → [ 7 30 ]
d)
F ( 1. 2 , 0. 9 ) →[ 1 4 ]
e)
f) F (−3 , 1. 5 ) → [ 0 ]
F ( 2,0 ) → [ 7 24 ]
g)
F ( 4 , 2. 7 ) → [ 119 120 ]
h)
2. If the joint probability density of X and Y is given by;
Page 8 of 82
f ( x,y ) =¿ {c ( x2+y2) for x=−1, 0, 1, 3; y=−1, 2, 3 ¿ ¿¿¿
1

Find the value of c.


[ ]
c=
89

Hence find:
Pr ( X ≤1 , Y >2 ) → [ 29 89 ]
(a)
Pr ( X = 0 ,Y ≤2 ) → [ 5 89 ]
(b)
Pr ( X +Y >2 ) → [ 55 89 ]
(c)
(3) Determine k so that

f ( x,y ) =¿ {kx ( x −y ) ¿¿¿¿ for 0<x <1 , −x < y <x


can serve as a joint probability density function. [ k=2 ]
(4) Find k if the joint probability distribution of X ,Y and Z is given by

f ( x,y,z )=¿ {kxyz for x=1,2; y=1,2,3 ; z=1,2 ¿¿¿¿ [ k=1 54 ]


Hence find:
1
Pr ( X=1 , Y ≤2 , Z =1 ) → 18
a)
7
Pr ( X=2 , Y +Z=4 ) → 27
b)
c) F ( 1,0,1 ) →0
1
F ( 2,1,2 ) →
d) 6

e) F ( 4,4,4 ) →1
(5) Find k if the joint probability density of X ,Y and Z is given by

f ( x,y,z )=¿ {kxy ( 1−z ) for 0<x<1, 0< y<1, 0<z<1, x+y+z<1¿¿¿¿ ⇒ [ k =144 ]
1
( 2) Pr x + y <
Hence find
(6) Use the result of examples 5 to verify that the joint distribution function of the random variable
X 1 , X 2 and X 3 of example 8 is given

1 −x 3 1 −3
F ( x 1 2 x3 ) = ¿ { 0 , 1 − x3
{ ( )( ) { { ( )( )
for x1≤0, x2≤0, or x3≤0¿ 2x1 2 x1+x2 1−e , for 0<x1¿1, 0<x2¿1, x3¿0¿ x2 x2+1 1−e , for x1¿1, 0<x2¿1, x3¿0¿ x1 x1+1 1−e , for 0<x1¿1, x2¿1, x3¿0¿ ¿
2 2
( )( )
Page 9 of 82
(7) If the joint probability density of x, yandz is given by
1
{
f ( x,y,z )=¿ 3 ( 2x+3 y+z ) for 0<x<1, 0<y<1, 0<z<1 ¿ ¿¿¿
Find:
Pr ( X= 12 , Y = 12 , Z= 12 )
a)
1 1 1
Pr ( X < 2 , Y < 2 , Z < 2 )
b)
(8) A certain college gives attitude test in the sciences and the humanities to all
entering fresher if X and Yare, respectively, the proportions of correct answers a student gets
on the tests in the two subjects, the joint probability distribution of these random variables can
be approximated with the joint probability density
2
5{
f ( x , y ) =¿ (2x+3 y) for 0<x<1, 0<y<1 ¿ ¿¿¿
What are the probabilities that a student will get:
(a) Less than 0.40 on both tests; [ 0.064 ]
(b) More than 0.08 on the science test and lee than 0.50 on the humanities test?
[ 0.102 ]
(9) Suppose that P, the price of a certain commodity (in dollar), and S, its total sales (in 10,000
units), are random variables, whose joint probability density can be approximated closely with
the joint probability density.

f ( p,s )=¿ {5 pe−ps for 0.20<p<0.40, s>0 ¿ ¿¿¿


Find the probabilities that;
a) The price will be less than 30 cents and sale will exceed 20,000 units
b) The price will be between 25 cents and 30 cents and sales will be less than 20,000 units

MARGINAL DISTRIBUTIONS
Definition 5
If X and Y are discrete random variables and f ( x , y ) is the value of their joint probability
distribution at ( x, y ) , the function given by:
g ( x ) =∑ f ( x , y )
y for each x within the range of X is called the marginal distribution of X.
Correspondingly the function given by:
h ( y )=∑ f ( x , y )
x for each y within the range of Y is called the marginal distribution of Y.
Example 9
The joint probability distribution function of random variables X and Y are given in the table below:
x

Page 10 of 82
1 2 3
1 1
0
1 6 6
1 1
y 0
2 6 6
1 1
0
3 6 6

Determine the:
i. Marginal distribution of X
ii. Marginal distribution of Y
Solution
x
g( y)
1 2 3
1 1 1
0
1 6 6 3
1 1 1
y 0
2 6 6 3
1 1 1
0
3 6 6 3

f (x) 1 1 1
3 3 3

(i) The marginal distribution of X is given by


3
f ( x )= ∑ f ( x , y )
y=1
=f ( x , 1 )+f ( x, 2 ) +f ( x , 3 )
When X =1
1 1 1
f ( 1 )=f ( 1,1 )+ f ( 1,2 ) +f ( 1,3 )=0+ 6 + 6 = 3
When X =2
1 1 1
f ( 2 )= f ( 2,1 ) +f ( 2,2 ) + f ( 2,3 ) = 6 + 0+ 6 = 3
When X =3
1 1 2 1
f ( 3 ) =f ( 3,1 ) +f ( 3,2 ) + f ( 3,3 )= 6 + 6 + 0= 3 = 3
Hence the marginal probability distribution of Xis:
x 1 2 3
1 1 1
f (x ) 3 3 3

(ii) The marginal probability distribution of Y is given by


3
g ( y )= ∑ f ( x , y )
x=1
=f (1 , y ) +f ( 2 , y )+f ( 3 , y )
When Y =1
1 1 1
=0+ + =
g (1 )=f ( 1,1 ) +f ( 2,1 ) +f (3,1 ) 6 6 3
When Y =2
Page 11 of 82
1 1 1
= +0+ =
g ( 2 )=f ( 1,2 ) +f ( 2,2 ) + f ( 3,2 ) 6 6 3
When Y =3
1 1 1
= + +0=
g ( 3 )=f ( 1,3 )+f ( 2,3 ) +f ( 3,3 ) 6 6 3
Hence the marginal probability distribution of Y is:
y 1 2 3
1 1 1
g( y) 3 3 3

When X and Y are continuous random variables, the probability distributions are replaced by
probability densities, summations are replaced by integrals and we get
Definition 6
If X and Y are continuous random variables and f ( x , y ) is the value of their joint probability density
at ( x, y ) , the function given by:
g ( x ) =∫ f ( x , y ) dy
y is called the marginal density of X.
Correspondingly, the function given by:
h ( y )=∫ f ( x , y) dx
x is called the marginal density of Y.
Example 10
Given the joint probability density
2
{
f ( x, y ) =¿ ( x+2 y ) for 0<x<1, 0< y<1 ¿ ¿¿¿
3
Find the marginal densities of X and Y.
Solution
Performing the necessary integrations, we get
1
2 2
g ( x ) =∫ f ( x , y ) dy= ∫ ( x +2 y ) dy= ( x +1 )
y y=0 3 3

g ( x ) =¿ 23 ( x+1 ) for 0<x<1 ¿ ¿¿¿


{
i.e. is the marginal density function of X.
Likewise;
1
2 1
h ( y )=∫ f ( x , y ) dx= ∫ ( x +2 y ) dx= (1+4 y )
x x =0 3 3

h ( y )=¿ 13 ( 1+4 y ) , for 0<y<1 ¿ ¿¿¿


{
i.e.

Page 12 of 82
When we are dealing with more than two random variables, we can speak not only of the marginal
distributions of the individual random variables, but also of the joint marginal distributions of
several of the random variables.
X 1 , X 2 ,⋯, X n
If the joint probability of the discrete random variables has the value
f (x 1 , x 2 ,⋯, x n ) X1
the marginal distribution of alone is given by
g ( x 1 ) =∑ ∑ ⋯∑ f ( x 1 , x 2 ,. .. x n )
x
2
x
3 xn X1
for all values within the range of .
X1 , X2 X3
The joint distribution of and is given by
m ( x1 , x 2 , x 3 ) =∑ ∑ ⋯∑ f ( x 1 , x 2 ,⋯, x n )
x x xn
4 5 for all the values within the range of
X 1 , X 2 , and X 3 , other marginal distributions can be defined in the same way.
For the continuous
case, probability distributions are replaced by probability densities, summations are replaced by
X 1 , X 2 ,⋯, X n has
integrals, and if the joint probability density of the continuous random variables
f x , x .. . , x n )
the values ( 1 2 , the marginal density of
X 2 alone is given by
h( x 2 )=∫ ∫⋯∫ f ( x 1 , x2 ,⋯, x n )dx 1 dx 3 ⋯dxn
x1 x3 xn for −∞< x 2 <∞ .
The joint marginal density of
X 1 and X n is given by
m( x1 , x n )=∫ ∫ ⋯ ∫ f ( x 1 , x 2 ,⋯, x n )dx 2 dx 3 ⋯dx n−1
x2 x 3 x n−1 for
−∞< x 1 <∞ and −∞< x n <∞ , and so
forth.
Example 11
Considering again the trivariate probability density given by
−x 3
{
f ( x 1 , x 2, x3 )= ¿ ( x1 +x 2) e for 0<x 1 ¿1, 0<x 2 ¿1, x 3 ¿0¿¿¿¿
Find the joint marginal density of
X 1 and X 3 and the marginal density of X 1 alone.
Solution
X 1 and X 3 is
Performing the necessary integration, we find that the joint marginal density of
given by
1
−x
m ( x1 , x 3 ) =∫ f ( x1 , x 2 , x 3 , ) dx 2 = ∫ ( x 1 +x 2 ) e 3
dx 2
x2 x 2=0
1
1
x 22
=e
− x3
∫ ( x 1 + x2 ) dx 2 =e
x =0
2
−x
3
[ x1 x2 +
2 ] x 2=0 =e
− x3
[ x1 +
1
2 ] (
−0 = x 1 +
1
2 )−x
e 3

Page 13 of 82
1 −x3
i.e.
m ( x1 ,x 3)=¿ ( { )
x1 + 2 e , for 0<x 1 ¿1, x3 ¿0 ¿ ¿¿¿
Using this result, we find that the marginal density of
X 1 alone is given by
∞ 1 ∞
g ( x1)= ∫ ∫ f ( x 1 , x 2 , x 3 ) dx 2 dx 3 = ∫ m ( x 1 , x 3 ) dx 3
x 3=0 x =0 x3 =0
2
∞ ∞
1
= ∫ ( x1+ 12 ) e
x3=0
−x 3
(
dx 3 = x1 +
2 )∫ e
x =0
3
−x 3
dx 3 −x 3 ∞
=( x+ 2 ) [ −e
1
] x =0
3
1 1
=− ( x 1 + 2 ) [ 0−1 ] =x1 + 2

1
i.e
{
g ( x1)=¿ x1+ 2 , for 0<x1<1 ¿ ¿¿¿
CONDITIONAL DISTRIBUTIONS
We define the conditional probability of event A given event B as
Pr ( A∩B )
Pr ( A /B )=
Pr ( B ) , provided Pr( B)≠0
Suppose now that A and B are the events X =x and Y = y , so that we can write
X=x Pr ( X =x , Y = y ) f ( x , y )
Pr ( =
Y = y Pr ( Y = y ) ) =
h( y )
Provided Pr ( Y = y )=h ( y )≠0 , where f ( x , y ) is the value of the joint probability distribution of X
and Y at ( x, y ) and h ( y ) is the value of the marginal distribution of Y at y.

Denoting the conditional probability by f (x / y ) to indicate that x is a variable and Yis fixed, let
us now make the following definition:
Definition 7
If f ( x , y ) is the value of the joint probability distribution of the discrete random variables X and
Y at ( x, y ) and h ( y ) is the value of the marginal distribution of Y at y, the function given by
x f (x , y)
f ( )
y
=
h( y)
, h ( y )≠0
, for each x within the range of X is called the conditional
distribution of X given by Y= y .
Correspondingly, if g ( x ) is the value of the marginal distribution of X at x, the function given by
y f (x , y)
w
x( )
=
g(x)
, g (x )≠0
for each y within the range of Y is called the conditional distribution
of Y given X =x .

Definition 8

Page 14 of 82
If f ( x , y ) is the value of joint density of the joint density of the continuous random variables X
and Y at ( x, y ) and h ( y ) is the value of marginal density of Y at y, the function given by
x f (x , y)
f ( )
y
=
h( y)
, h ( y )≠0
for −∞< x <∞ , is called the conditional density of X given Y= y .
Correspondingly, if g ( x ) is the value of the marginal density of X at x, the function given by
y f (x , y)
w
x ( )
=
g(x)
, g( x )≠0
for −∞< x <∞ , is called condition density of Y given X =x .

Example 12
Given the joint probability density

f ( x,y )=¿ {4 xy for 0<x<1, 0<y<1¿¿¿¿


Find the marginal densities of X and Y and the conditional density of X given Y = y.
Solution
Performing the necessary integrations, we get
1
1
g ( x ) =∫ f ( x , y ) dy= ∫ 4 xy dy =[ 2 xy 2 ] y=0 =2 x
y y=0

i.e.
g (x )=¿ {2 x, 0<x<1¿¿¿¿
Also
1
1
h ( y )=∫ f ( x , y ) dx= ∫ 4 xy dx=[ 2 x 2 y ]x =0= 2 y
x x =0

i.e.
h ( y )=¿ {2 y, 0<y<1¿¿¿¿
Then, substituting into the formula for a conditional density, we get
x f ( x , y ) 4 xy
f ( )y
=
h( y)
=
2y
=2 x ,

i.e.,
f (x y )=¿ {2 x , 0<x<1 ¿¿¿¿
Example 13
With reference to example 1, find the conditional distribution of:
(i) X given Y=1 (ii) Y given X = 0.
Solution
The results of Example 1 are shown in the following table, together with the marginal totals, that is,
the totals of the respective rows and columns:

x
0 1 2 h (y)

Page 15 of 82
0 1 1 1 7
6 3 12 12
1 2 1 0 7
y 9 6 18
2 1 0 0 1
36 36
g (x) 5 1 1
12 2 12

We need to get the marginal densities of X and Y.


The column totals are the probabilities that X will take on the values 0, 1, and 2. In other words, they
are the values
2
g ( x ) = ∑ f ( x , y ) , for x=0,1,2
y =0

which is the marginal distribution of X . Hence


g( x )=f ( x , 0 ) +f ( x , 1 )+f ( x , 2 )
When X = 0;
1 2 1 6+8+1 15 5
g ( 0 )=f ( 0,0 ) +f ( 0,1 ) + f ( 0,2 )= + + = = =
6 9 36 36 36 12
When X = 1;
1 1 2+1 3 1
g (1 )=f ( 1,0 ) + f (1,1 )+f ( 1,2 )= + +0= = =
3 6 6 6 2
When X = 2;
1 1
g ( 2 )=f ( 2,0 )+ f ( 2,1 )+ f ( 2,2 )= +0+ 0=
12 12
Hence the marginal probability distribution of Xis:
x 0 1 2
g( x) 5 1 1
12 2 12

The marginal probability distribution of Y is given by


2
h ( y )= ∑ f ( x , y )
x=0
=f ( 0 , y ) +f ( 1, y ) +f ( 2 , y )
When Y = 0;
1 1 1 6 +12+ 3 21 7
h ( 0 )=f ( 0,0 ) + f ( 1,0 )+ f ( 2,0 )= + + = = =
6 3 12 36 36 12
When Y = 1;
2 1 8+6+ 0 14 7
h ( 1 )=f ( 0,1 )+ f ( 1,1 )+ f ( 2,1 )= + + 0= = =
9 6 36 36 18
When Y = 2;
1 1
h ( 2 )=f ( 0,2 )+ f ( 1,2 )+ f ( 2,2 )= +0+0=
36 36
Hence the marginal probability distribution of Y is:
Page 16 of 82
y 0 1 2
h( y ) 7 7 1
12 18 36

(i) Then, substituting into the formula for a conditional density, we get
X= x f ( x , 1)
f (Y =1
=
h ( 1) ), x=0,1,2
When x = 0;
X=0 f ( 0,1 ) 2/9 4
f (Y =1
= )=
h (1 ) 7/18 7
=
When x = 1;
X=1 f ( 1,1 ) 1/6 3
f (Y =1
= =) =
h ( 1 ) 7 /18 7
When x = 2;
X=2 f ( 2,1 ) 0
f (Y =1
= )=
h ( 1 ) 7 /18
=0
Hence, the conditional density of X given Y = 1 is:
x 0 1 2
X= x 4 3 0
f ( Y =1 ) 7 7

(ii) The conditional density of Y given X = 0 is given by:


Y=y f (0 , y )
f (
X =0
=
g ( 0) )
, y=0,1,2
When Y = 0;
Y =0 f ( 0,0 ) 1/6 2
f (
X=0
= = )
g ( 0 ) 5 /12 5
=
When Y = 1;
Y =1 f ( 0,1 ) 2/9 8
f (
X =0
= =)
g ( 0 ) 5/12 15
=
When Y = 2;
Y =2 f ( 0,2 ) 1/36 1
f (X =0
= )
=
g ( 0 ) 5/12 15
=
Hence, the conditional density of X given Y = 1 is:

y 0 1 2
2 8 1
f (Y = y X =0 ) 5 15 15

When we are dealing with more than two random variables, whether continuous or discrete, we can
consider various different kinds of conditional distributions or densities. For instance, if
f ( x 1 , x 2 , x3 , x 4 )
is the value of the joint distribution of the discrete random variables
X 1 , X 2 , X 3 , and X 4 at ( x 1 , x2 , x 3 , x 4 ) , we can write

Page 17 of 82
x3 f ( x1 , x2 , x3 , x4 )
f
( x1 , x2 , x4 ) =
g ( x1 , x2 , x4)
, g ( x 1 , x 2 , x 4 , )≠0
for the value of the conditional distribution of
X 3 at x 3 given X 1 =x 1 , X 2 =x 2 , and X 4 =x 4 , where g ( x 1 , x 2 , x 4 ) is the value of the joint

marginal distribution of
X 1 , X 2 , and X 4 at ( x 1 , x2 , x 4 ) . We can also write
x2 , x4 f ( x1 , x2 , x3 , x4 )
f
( x1 , x3 ) =
g ( x1 , x 3 )
, g ( x1 , x 3 ) ≠0
for the value of the joint conditional distribution
of X 2 and X 4 at ( 2 4 ) given X 1 =x 1 and X 3 =x 3 , or
x ,x
x ,x , x f ( x1 , x2 , x3 , x4 )
(
f 2 3 4
x1
=
) g ( x1 )
, g ( x1 )≠0
for the value of the joint conditional distribution of
X 2 , X 3 , and X 4 at ( x 2 , x3 , x 4 ) given X 1 =x 1 .

When we are dealing with two or more random variables, questions of independence are usually of
great importance.
x
In Example 12, we see that
f
y
=2 x ( )
, does not depend on the given value Y= y . Whenever the
values of the condition distribution of X is given Y = y do not depend on y, it follows that
x
f ( )
y
=g ( x )
, and hence the formula definition 7 and 8 yield
f ( x , y ) =f ( x y )×h ( y )= g ( x )×h( y )
.
That is, the values of the joint distribution are given by the products of the corresponding value of the
two marginal distributions.
Generalizing from this observation, let us now make the following definition:
Definition 9
f ( x 1 , x 2 ,⋯, x n )
If is the value of joint density of the joint probability distribution function of the n
discrete random variables
X 1 , X 2 ,⋯, X n at x 1 , x 2 ,⋯, x n ,and f i ( xi ) is the value of the marginal

distribution of
X i at x i for i=1,2 ,⋯, n , then the n random variables are independent if and
only if

f ( x 1 , x 2 ,⋯, x n ) =f 1 ( x 1 )×f 2 ( x 2 )×⋯×f n ( x n ) ( x 1 , x2 ,⋯, x n )


for all within their range.

To give a corresponding definition for continuous random variables, we simply substitute the word
density for the word distribution.

Definition 10
f ( x 1 , x 2 ,⋯, x n )
If is the value of joint density of the joint probability density function of the n
continuous random variables
X 1 , X 2 ,⋯, X n at x 1 , x 2 ,⋯, x n ,and f i ( xi ) is the value of the

marginal density of
X i at x i for i=1,2 ,⋯, n , then the n random variables are independent if
and only if
Page 18 of 82
f ( x 1 , x 2 ,⋯, x n ) =f 1 ( x 1 )×f 2 ( x 2 )×⋯×f n ( x n ) ( x 1 , x2 ,⋯, x n )
for all within their range.

Example 14
If the joint probability density of X and Y is given by:

f (x, y)=¿ {24y(1−x−y) for x>0, y>0, x+y<1¿¿¿¿


Find:
(a) The marginal density of X
(b) The marginal density of Y
(c) Determine whether the two random variables are independent.

Solution
(a) Performing the necessary integrations, we get the marginal density function of X is given by:
1− x 1− x 1−x
y 2 xy 2 y 3
g ( x ) =∫ f ( x , y ) dy= ∫ 24 y (1−x− y ) dy=24 ∫ ( y−xy − y ) dy=24 − −
y y=0 y=0 2 2 3
2
[ ] y=0

(1−x )2 x(1−x )2 (1−x )3 3−3 x−2(1−x )


=24
2[ 3

2

3
=24(1−x )2 − −
2 2 3]
1 x (1−x )
= 24(1−x )2 [
6 ] [ ]
24(1−x )
= =4 (1−x )3
6

i.e.
g ( x ) =¿ {4 (1−x)3 , 0<x<1 ¿ ¿¿¿
(b) Also the marginal density function of Y is given by:

1− y 1− y 1− y
x2 y
h ( y )=∫ f ( x , y ) dx = ∫ 24 y (1−x− y ) dx=24 ∫ ( y−xy− y ) dx=24 xy−
x x=0 x =0 2
2
−xy 2 [ ] x=0

y (1− y )2 (1− y )
[
=24 y (1− y )−
2 ]
− y 2 (1− y ) =24 y (1− y ) 1−
2 [
− y =24 y (1− y )
2−1+ y−2 y
2 ] [ ]
24 y (1− y )(1− y )
= =12 y (1− y )2
2

i.e.
h ( y )=¿ {12y(1−y)2 , 0<y<1 ¿ ¿¿¿
(c) The two random variables are independent if and only if
f ( x , y ) = g( x)×h( y) . In our case;
g( x )× h( y )=4 (1−x )3 ×12 y(1− y )2 =48 y (1−x )3 (1− y )2≠f ( x , y )
Hence, the two random variables are not independent.

Example 15
With reference to example 1, determine whether the two random variables X and Y are independent.
Solution

Page 19 of 82
The results of Example 1 are shown in the following table, together with the marginal totals, that is,
the totals of the respective rows and columns:
x
0 1 2 h (y)
0 1 1 1 7
6 3 12 12
1 2 1 0 7
y 9 6 18
2 1 0 0 1
36 36
g (x) 5 1 1
12 2 12

In example 10, we obtained the marginal distributions of X and Y as:


The marginal probability distribution of Xis:
x 0 1 2
g( x) 5 1 1
12 2 12

The marginal probability distribution of Y is:


y 0 1 2
h( y ) 7 7 1
12 18 36

The two random variables are independent if and only if


f ( x , y ) = g( x )×h( y ) for all x=0,1,2 and y=0,1,2 . In our case;

5 7 35 1
g(0 )×h (0) = × = ≠f (0,0 )=
12 12 144 6
1 7 7 1
g(1)×h(1) = × = ≠f (1,1 )=
2 18 36 6
5 1 5 1
g(0 )×h (2 ) = × = ≠f (0,2 )=
12 36 432 36

Hence, the two random variables are not independent.

Example 16
X
Considering n independent flips of a balanced coin, let i be the number of heads (0 or
th
1)obtained in the i flip for i=1,2 ,⋯, n .
Find the joint probability distribution of these n random variables.
Solution
Since each of the random variables
X i , for i=1,2 ,⋯, n , has the probability distribution

Page 20 of 82
1
f i ( xi )= 2 for x i=0,1 and the n random variables are independent, their joint probability
distribution is given by
n
f ( x 1 , x 2 ,⋯, x n ) =f 1 ( x 1 ) ×f 2 ( x 2 )×⋯×f n ( x n )
= ( 12 )×( 12 ) ×⋯×( 12 )=( 12 ) where
x i=0 or 1 for
i=1,2 ,⋯, n
Example 17
Given the independent random variable X 1 , X 2 , and X 3 with the probability densities
−x1
{
f 1 ( x1 )= ¿ e , for x1 ¿0 ¿ ¿¿¿
−2x
f 2( x2)=¿ {2 e , for x ¿0 ¿ ¿¿¿
2
2
−3x
f 3 ( x3)=¿ {3 e , for x ¿0 ¿ ¿¿¿
3
3
find their joint probability density, and use it to evaluate the probability
Pr ( X 1 + X 2 ≤1 , X 3 >1 )
.
Solution
The joint probability density function is given by
−x −2 x −3 x
f ( x 1 , x 2 , x3 )=f 1 ( x 1 ) . f 2 ( x 2 ) . f 3 ( x3 ) =e 1×2e 2 ¿ 3 e 3
− x 1−2 x 2 −3 x 3
=6 e
−x1−2x2−3x 3
i.e.
f ( x 1 ,x 2 ,x3 )= ¿ 6 e { for x1 ¿0, x2 ¿0, x3 ¿0 ¿ ¿¿¿
Thus
1 1− x1

−x 1−2 x 2−3 x3
P ( X 1 + X 2 ≤1, X 3 >1 ) = ∫ ∫ ∫ 6e dx 1 dx 2 dx 3
x 1=0 x2 =0 x 3=1
1 1−x 1 1 1− x1
− x 1−2 x 2−3 x 3 ∞ −x 1−2 x2−3
=6 ∫ ∫ ( )[e − 13 ]
x3 =1 dx 1 dx 2 =−2 ∫ ∫ [ 0−e ] dx dx
1 2
x1 =0 x =0 x1=1 x 2=0
2
1 1− x1 1 1
−x 1−2 x2−3 1−x 1
=2 ∫ ∫e
−x 1−2 x2−3
dx 1 dx 2 =
−2

2 x1 =0
[ e ]x2=0 dx1 =− ∫ e [ − x −2(1−x )−3−e ] dx
1 1
− x −3
1
1
x 1=0 x 2=0 x1 =0
1 1
−x 1−2+2 x 1−3
= ∫ [e ] dx = ∫ [e ] dx
−x −3 −x 1−3 −x −5
1 1
−e 1 −e 1
x1 =0 x1 =0
1 1
−x1 −3 x −5
=∫ e dx 1− ∫ e 1 dx1 −x1 −3 1 x1−5 1
x1 =0 x1 =0 =−[ e ] x1=0 −[ e ]
x 1=0

=−[ e−4 −e−3 ]− [ e−4 −e−5 ] −4


=−e +e −e +e
−3 −4 −5
= e−3−2e−4 +e−5=0.020

Page 21 of 82
EXERCISE
(1) Given the values of the joint probability distribution of X and Y shown in the table:

x
-1 1
-1 1 1
8 2
y 0 0 1
4
1 1 0
8

Find the:
Marginal distribution of X.
(a)
(b) Marginal distribution of Y.
(c) Conditional distribution of X given Y = 1.
(d) Conditional distribution of Y given X = 0.

(2) Given the joint probability distribution


xyz
f ( x , y , z )=
108 for x=1,2,3 ; y=1,2,3; z=1,2
Find
xy
a) The joint marginal distribution of X and Y ⇒ g ( x , y )= 36 for x=1,2,3 and y=1,2,3
xz
b) The joint marginal distribution of X and Z ⇒ g ( x , z )= 18 for x=1,2,3 and z=1,2
x
c) The marginal distribution of X ⇒ g ( x ) = 6 for x=1,2,3
d) The conditional distribution of Z given X =1 and Y =2

[ ⇒f ( z X=1 , Y =2)= 3z for z=1,2]


e) The joint conditional distribution of Y and Z given X =3 .
y,z yz
⇒f ( X=3 )=
18
for y=1,2,3 , and z=1,2
(3) Given the values of the joint probability distribution of X and Y shown in the table:

x
0 1 2
0 1 1 1
12 6 24
y 1 1 1 1
4 4 40
2 1 1 0
8 20

Page 22 of 82
3 1 0 0
120

Find the:
1
[ g (−1 )= and g ( 1 )= 34 ]
Marginal distribution of X. ⇒ 4
(a)
5
[h (−1 ) = , h( 0 )= 14 , and h( 1)= 18 ]
Marginal distribution of Y. ⇒ 8 .
(b)
1
⇒ [ f (−1/−1)= 5
and f (1 /−1 )= 45 ]
(c) Conditional distribution of X given Y = -1.

(4) Check whether X and Yare independent, if their joint probability distribution is given by:
) 1
a) f x , y = 4 for x=−1 and y=−1 ; x=−1 and y=1 , x=1 and y=−1 , x=1 and y=1
(
⇒ independent
) 1
b) f x , y = 3 for x=0 and y=0 ; x=0 and y=1 , x=1 and y=1 , ⇒ not independent
(
(5) If the joint probability density of X and Y is given by:

f ( x, y ) =¿ 14 ( 2 x+y ) for 0<x<1, 0<y<2 ¿ ¿¿¿


{
Find the:
1
[
⇒¿ g ( y ) = 4 (1+ y ) , 0< y<2 ¿ ¿ ¿ ]
a) Marginal density of Y ¿
x
⇒¿ f[( ) y
= 12 ( 2 x+1 ) , 0< x<1 ¿ ¿ ¿
]
b) Conditional density of X given Y =1 ¿
(6) If X is the proportion of persons who will respond to one kind of mail-order solicitation, Y is the
proportion of persons who will respond to another kind of mail order solicitation, and the joint

f ( x,y ) =¿ 25 ( x+4 y ) for 0<x<1, 0<y<1 ¿ ¿¿¿


{
probability of X and Y is given by:
Find the probabilities that:
a) At least 30% will respond to the first kind of mail order solicitation. ⇒ [ 0 . 742 ]
b) At most 50% will respond to the second kind of mail-order solicitation given that there has
been a 20%responce to the first kind of mail order solicitation ⇒ [ 0 . 273 ]
(7) If X is the amount of money (in dollars)that a sales person spends on a gasoline during a day and
Y is the corresponding amount of money (in dollars) for which he or she is reimbursed, the joint

f ( x , y ) =¿ 251 ( 20−x
{ )x for 10<x<20 , 2x < y<x ¿ ¿¿¿
density of these two random variables is given by:
Find the:
20−x
[
⇒¿ g ( x )= 50 for 10<x<20 ¿ ¿ ¿ ]
a) Marginal density of X ; ¿

Page 23 of 82
y 1
b) Condition density of Y given X =12 → [(
f
x=12 6 ]
)=

c) Probability that the sales person will be reimbursed at least $8 when spending $12
→ [ 1 3]
(8) The useful life (in hours) of a certain kind of vacuum tube is a random variable having the
probability density:
20,000
f ( x )= ¿ { ( x+100 )3
, x>0 ¿ ¿¿¿
If three of these tubes operate independently, find
a) The joint probability of
X 1 , X 2 , and X 3 representing the lengths of their useful lives.

( 20,000 )3
⇒f ( x 1 ,x 2 , x 3) =¿
{
3 3
( x1+100 ) ( x 2+100 ) ( x3+100 )3
for x 1 >0, x 2 >0, x 3 >0 ¿ ¿¿¿

1
Pr ( X 1 < 100 , X 2 <100 , X 3 ≥200 ) →
(b) The value of 16

Page 24 of 82
TRANSFORMATION OF RANDOM VARIABLES
We shall concern ourselves with the problem of finding the probability distributions or densities of
functions of one or more random variables.
That is, given a set of random variables 1 2
X , X ,⋯, X
n and their joint probability or density, we shall
be interested in finding of joint probability distribution of some random variable
Y =u ( X 1 , X 2 ,⋯, X n )
.
This means that the values of Y are related to those of X ' s by means of the equation
y=u ( x 1 , x 2 ,⋯, x n )
.
Several methods are available for solving this kind of problems. The ones we shall discuss are the
distribution function technique, the transformation technique and the moment – generating
function technique.

THE DISTRIBUTION FUNCTION TECHNIQUE


A straight forward method of obtaining the probability density of a function of continuous random
variables consists of first finding its distribution function and then its probability density by
differentiation.
Thus, if
X 1 , X 2 ,⋯, X n are continuous random variables with a given joint probability density, the
Y =u ( X 1 , X 2 ,⋯, X n )
probability density of is obtained by first determining an expression for the
probability
F ( y )=Pr ( Y ≤ y )=Pr [ u ( X 1 , X 2 ,⋯, X n ) ≤ y ]
and then differentiating to get
d
f ( y )= [ F ( y )]
dy .
Example 1
If the probability density of X is given by

f (x)=¿ {6 x ( 1−x ) , for 0<x<1¿¿¿¿


3
Find the probability density of Y = X .
Solution
Letting G ( y ) denote the value of the distribution function of Y at y , we can write
1

G ( y )=Pr ( Y ≤ y ) =Pr ( X 3≤ y ) =Pr ( X ≤ y ) 3

1 1
y3 y3
1
=∫ 6 x ( 1−x ) dx=∫ ( 6 x−6 x 2 ) dx y3
0 0 =[ 3 x 2 −2 x 3 x =0
]
2

=3 y −2 y
3

And, hence,
d d 2
( )
g ( y )= [ G ( y ) ] = 3 y 3
−2 y
dy dy
−1 −1

=2 y 3
−2=2 y( 3
−1 )

Page 25 of 82
−1

i.e.
{( )
g ( y )=¿ 2 y 3−1 , for 0<y<1 ¿ ¿¿¿
Example 2
If Y=|X|, show that

g ( y )=¿ { f ( y ) +f (− y ) for y>0¿¿¿¿


where f ( x ) is the value of the probability density of X at x and g ( y ) is the value of probability
density of Y at y .
Solution
For y>0 we have
G ( y )=Pr ( Y ≤ y )
=Pr (|X|≤ y )
= pr (− y≤X ≤ y )
=F ( y ) −F (− y ) since Pr ( a≤ X≤b )=F ( b )−F ( a )
and, upon differentiation
d
g ( y )= [ F ( y )−F ( y ) ]
dy
=f ( y )+f (− y )
also, since |X| cannot be negative, g ( y )=0 for y<0 .
Arbitrarily, letting g ( 0 )=0 , we can thus write

g ( y )=¿ { f ( y ) +f (− y ) for y>0¿¿¿¿


Example 3
If the joint density of X 1 and X 2 is given by
−3 x1−2x2
{
f ( x1 ,x 2)=¿ 6 e , for x 1 ¿0, x 2 ¿0 ¿ ¿¿¿
find the probability density of Y = X 1 + X 2

Solution
X2

x 1 +x 2= y

Page 26 of 82
Integrating the joint density over the shaded region, we get
y −3 x1−2 x2 y− x1

[ ]
y−x 1
y
−3 x 1−2 x2 e
F ( y )= ∫ ∫ 6e dx 1 dx 2 =6 ∫ dx 1
x 1=0 x 2=0 x1 =0 −2 x2=0
y y y
−3 x1 −2 ( y −x1 ) −3 x −3 x 1−2 y +2 x1 −3 x −x 1−2 y −3 x
=−3 ∫ [e −e 1
] dx 1 =−3 ∫ [e −e 1
] dx 1 =−3 ∫ [e −e 1
] dx
1
x1 =0 x1 =0 x1 =0
y y
−x1 −2 y −3x y
=−3 ∫e +3 ∫ e 1
dx1 −x −2 y 3 −3 x 1 y
=3 [ e 1
] x 1=0 + [ e ] x1=0
x1 =0 x1=0 −3 =3 [ e− y−2 y −e ]− [ e−3 y −e 0 ]
−3 y −2 y −3 y
=3e −3e −e +1
−3 y −2 y
i.e. F ( y )=1+2e −3 e
and, differentiating with respect to y , we obtain
d d
f ( y )= [ f ( y ) ]= [ 1+2 e−3 y −3 e 2 y ]
dy dy
f ( y )=−6e−3 y +6e−2 y
=¿ {6 ( e−2 y −e−3 y ) , y>0 ¿ ¿¿ ¿
¿
EXERCISE
1. If the probability density of X is given by

−x2
f ( x )=¿ { 2 xe , for x>0 ¿ ¿¿¿
2
and Y = X , find

a. The distribution function of Y


⇒G ( y )=¿ {1−e−y ,y>0 ¿ ¿¿¿
b. The probability density of Y
( ) { −y
⇒g y =¿ e ,y>0 ¿ ¿¿¿
2. If X has an exponential distribution with the parameters θ , use the distribution function
technique to find the probability density of the random variable Y =ln X .

3. If X has the uniform density with the parameters α=0 and β=1 , use the distribution
function technique to find the probability density of the random variable Y =√X .

⇒g ( y )=¿ {2 y, 0<y<1¿¿¿¿
4. If the joint probability density of X and Y is given by

Page 27 of 82
−(x 2 +y 2 )
f ( x,y ) =¿ {4 xye , for x>0, y>0 ¿ ¿¿¿
and Z =√ X 2 +Y 2 , find

(a) The distribution function of Z.

(b) The probability density of Z.

5. If X 1 and X 2 are independent random variables having the exponential densities with the
parameters θ1 and θ2 , use the distribution function technique to find the probability density of
Y = X 1 + X 2 when
y y
1 −θ −θ
(a)
θ1 ≠θ2 ,
⇒f ( y )=¿
{θ1−θ2
(
e −e , y>0 ¿ ¿¿¿
1 2
)
y
1 −

(b)
θ1 =θ2 , {
⇒f ( y )=¿ 2 ye θ , y>0 ¿ ¿¿¿
θ
6. With reference to the two random variables of exercise 3 above show that if
θ1 =θ2 =1 , the random variable
X1
Z=
X 1+ X 2
has the uniform density with α=0 and β=1 .
7. If the joint density of X and Y is given by

(f x,y ) =¿ {e−(x+ y) , for x>0, y>0 ¿ ¿¿¿


X+Y
Z=
and 2 , find the probability density of Z by the distribution function technique.
8. The percentages of copper and iron in a certain kind of ore are, respectively, X 1 and X 2 . If the
joint density of these two random variables is given by

f ( x 1 ,x 2)=¿ 113 ( 5x 1+x2 ) for x1 >0, x 1>0, and x 1+2x 2<2 ¿ ¿¿¿
{
Use the distribution function to find the probability of Y = X 1 + X 2 .
Also find E ( Y ) ,the expected total percentage of copper and iron in the ore.

Page 28 of 82
92 3(2−y)(7 y−4)
1 { {
⇒g( y)=¿ y , for 0<y≤1, ¿ , for 1<y<2¿¿¿¿
2
TRANFORMATION TECHNIQUE: ONE VARIABLE
Let us show how the probability distribution or density of a function of a random variable can be
determined without first getting its distribution function.
In the discrete case there is no real problem so long as the relationship between the values of X and
Y =u ( X ) is one- to –one; all we have to do is make the appropriate substitution.
Example 4
If X is the number of heads obtained in four tosses of a balanced coin, find the probability
1
Y=
distribution of 1+X
Solution
1
θ=
Using the formula for the binomial distribution with n=4 and 2 , we find the probability
distribution of X is given by

x 0 1 2 3 4
1 4 6 4 1
f (x) 16 16 16 16 16

1
y=
Then, using the relationship to 1+x to substitute values of Y for valuesof X ,we find that
the probability distribution of Y is given by

1 1 1 1
y 1
2 3 4 5

1 4 6 4 1
g( y) 16 16 16 16 16

If we had wanted to make the substitution directly in the formula for the binomial distribution with

1
θ= 2
1
x= y−1
f ( x )=¿ ( 4 ¿ ) ¿ ¿ ¿
n=4 and we would have substituted for x in ¿
getting

g ( y )=f ( 1y −1 )=¿ ( 4 ¿ ) ¿ ¿ ¿
¿
Note
Note that in the preceding example the probabilities remain unchanged; the only difference is that in
the result they are associated with various values of Y instead of corresponding value of X . That
Page 29 of 82
is all there is to transformation or (change- of-variable) technique in the discrete case so long as the
relationship is one-to-one.
If the relationship is not one-to-one we may proceed as the in the following example.
Example 5
With reference to the example 4 above; find the probability distribution of the random variable
2
Z =( X−2 )
Solution
Calculating the probabilities h ( z ) associated with the various values of z , we get
6
h ( 0 )=f ( 2 )= 16
4 4 8
h ( 1 )=f ( 1 ) + f ( 3 )= 16 + 16 = 16
1 1 2
h ( 4 )=f ( 0 ) +f ( 4 )= 16 + 16 = 16 and hence,

z 0 1 4
3 1 1
h(z) 8 2 8

To perform a transformation of variable in the continuous case, we shall assume that the function
given by y=u ( x ) is differentiable and either increasing or decreasing for all values within the
−1
range of X for which f ( x )≠0 , so that the inverse function given by x=u ( y ) =w ( y ) , exists
for all the corresponding values of y and is differentiable except where u' ( x )=0 .
Under these conditions, we can prove the following theorem:

Theorem 1
Let f ( x ) be the value of probability density, of the continuous random variable X at x if the
function given by y=u ( x ) is differentiable and either increasing or decreasing for all values within
the range of X for which f ( x )≠0 , then for the value of x , the equation y=u ( x ) can be
uniquely solved for x to give x=w ( y ) , and for the corresponding values of y the probability
'
density of y=u ( x ) is given by g ( y )=f [ w ( y ) ] .|w ( y )| provided u' ( x )≠0
=0,elsewhere
Proof

Page 30 of 82
y=u ( x ) is increasing. As can be seen from
First let us prove the case where the function given by
the figure above, X must take on a value between w ( a ) and w ( b ) when Y takes on the value
between a and b . Hence
pr [ a< y< b ] = pr [ w ( a ) < X< w ( b ) ]
w (b ) b
= ∫ f ( x ) dx = ∫ f [ w ( y ) ] w ' ( y ) dy
x=w ( a ) y=a

where we performed the change of variable y=u ( x ) , or equivalently x=w ( y )


'
⇒ dx=w ( y ) dy ,
in the integral.
'
Thus, so long as w ( y ) exists, we can write
'
g ( y )=f [ w ( y ) ] w ( y ) .
When the function given by y=u ( x ) decreasing, it can be seen from the figure above that X must
take on the value between w ( b ) and w ( a ) when Y takes on a value between a and b. Hence,
Pr [ a<Y <b ] =Pr [ w ( b ) < X < w ( a ) ]
w (a ) a b
'
= ∫ f ( x ) dx = ∫ f [ w ( y ) ] w ( y ) dy =− ∫ f [ w ( y ) ] w ' ( y ) dy
x=w ( b ) y=b y=a
where we performed the same change of variable as before, and it follows that
dx 1
w ' ( y )= =
dy dy
dx is positive when the function given by y=u ( x ) is
'
g ( y )=−f [ w ( y ) ] w ( y ) since
increasing, and −w ( y ) is positive when the function given by y=u ( x ) is decreasing, we can
'

combine the two cases by writing g ( y )=f [ w ( y ) ] .|w ' ( y )| .


Example 6

If X has the exponential distribution given by


f ( x )=¿ { e−x , for x>0 ¿ ¿¿¿
Find the probability density of the random variable Y =√X .
Solution
The equation Y = √ X relating the values of X and Y has a unique inverse x= y 2 , which
dx
w ' ( y )= =2 y
yields dy . Therefore,
2 2
−y
g ( y )=e− y |2 y|=2 ye
−y 2
i.e.
g ( y )=¿ {2 ye , for y>0 ¿ ¿¿¿
Example 7

Page 31 of 82
If the double arrow of the fig.1 below is span so that the random variable θ has the uniform density

f ( θ ) =¿ π1 , for − π2 <θ< π2 ¿ ¿¿¿


{
determine the probability density of X , the abscissa of the point on the X − axis to which the
arrow will point

Page 32 of 82
Solution
As is apparent from the diagram, the relationship between x and θ is given by x=a tanθ , so
dθ a
= 2 2
that dx a + x
and it follows that
1 a
g ( x ) = .| 2 2 |
π a +x
1 a
= . 2 2 , for −∞< x< ∞
π a +x

TRANSFORMATION TECHNIQUE: SEVERAL VARIABLES


Suppose that we are given the joint distribution of two random variables, X 1 and X 2 and that we
want to determine the probability distribution of the probability or the probability density of the
random variable
Y =u ( X 1 , X 2 ) x x
. If the relationship between y and 1 with 2 held constant or
the relationship between y and
x 2 with x 1 held constant permits, we can proceed in the discrete
case as in example 4 to find the joint distribution of Y and X 2 or that of X 1 and Y and then sum
on the values of the other random variable to get the marginal distribution of Y .
In the continuous case, we first use Theorem 1 with the transformation formula written as
∂ x1
g ( y , x 2 )=f ( x 1 , x 2 ) .| |
∂y or as
∂ x2
g ( x 1 , y ) =f ( x1 , x 2 ) .| | f ( x 1 , x 2)
∂y where and the partial derivative must be expressed in terms of
y x
and 2 , or x 1 and y . Then we integrate out the other variable to get the marginal density
of Y .
Example 8
If X 1 and X 2 are independent random variables having Poisson distribution with the parameters
λ1 and λ2 , find the probability distribution of the random variable Y = X 1 + X 2 .
Solution
Since X 1 and X 2 are independent, their joint distribution is given by
−λ 1 x1 − λ2 x
e ( λ1 ) e ( λ2) 2
f ( x 1 , x 2)= ¿
x1 ! x2 !
− ( λ 1 +λ 2 )
e
x
( λ 1 ) 1 ( λ2 ) 2
x x 1=0,1,2 ,⋯
= ,
x1 ! x2 ! x 2=0,1,2,⋯
Since y=x 1 +x 2 and, hence, x 1= y −x 2 , we can substitute y−x 2 for x 1 , getting
− ( λ 1 +λ 2 )
e
x
( λ 2) 2 ( λ1 )
y− x2
y=0,1,2 ,⋯
g ( y , x 2 )=
x 2 ! ( y−x 2 ) ! , x 2=0,1,2,⋯
for the joint distribution of Y and X 2 .
Page 33 of 82
Then, summing on x 2 from 0 to y , we get
− ( λ 1+ λ 2) x y−x 2
y
e ( λ2 ) 2 ( λ 1 )
h ( y )= ∑
x =0
2
x 2 ! ( y− x2 ) !
− ( λ 1 +λ 2 ) y
e y! x y−x
= ∑ ( λ2 ) 2 ( λ 1 ) 2
y! x 2=0 x2 ! ( y−x 2 ) !
y
Identifying the summation at which we arrived as the binomial expansion of ( λ 1 + λ2 ) , we finally
get
− ( λ 1 +λ 2 ) y
e ( λ 1 + λ 2)
h ( y )= , y=0,1,2 ,⋯
y!
and we have, thus, shown that the sum of two independent random variables having Poisson
distribution, with parameters λ1 and λ2 has a Poisson with the distribution parameter λ=λ1 + λ2 .

In general, we begin with the joint distribution of two random variables X 1 and X 2 and
Y 1 =u1 ( X 1 , X 2 )
determine the joint distribution of two new random variables and
Y 2 =u2 ( X 1 , X 2 )
. Then we can find the marginal distribution of Y 1 or Y 2 by summation or
integration.
This method is used mainly in the continuous case, where we need the following theorem; which is a
direct generalization of Theorem 1.
Theorem 2
Let ( 1 2 ) be the value of the joint probability density of the continuous random variables X 1
f x ,x
x ,x
and X 2 at ( 1 2 ) . If the functions given by
y 1 =u1 ( x 1 , x2 ) y 2 =u2 ( x 1 , x 2 )
and are partially
differentiable with respect to both x 1 and x 2 and represent a one-to –one transformation for all
values within the range of
X 1 and X 2 for which f ( x 1 , x 2 ) ≠0 ,then for these values of x 1 and
x 2 , the equation y 1 =u1 ( x 1 , x2 ) and y 2 =u2 ( x 1 , x 2 ) can be uniquely solved for x 1 and x 2 to give
x 1=w1 ( y 1 , y 2 ) x 2=w2 ( y1 , y 2 )
and , and for the corresponding values of y 1 and y 2 the joint
Y 1 =u1 ( X 1 , X 2 ) Y 2 =u2 ( X 1 , X 2 )
probability density of and is given by
g ( y 1 , y 2 ) =f [ w 1 ( y 1 , y 2 ) , w 2 ( y 1 , y 2 ) ] ×|J|
Here, J , called the Jacobian of the transformation, is the determinant
∂ x1 ∂ x1
∂ y1 ∂ y2
J =| |
∂ x2 ∂ x2
∂ y1 ∂ y2

Example 9
If the joint probability density of X 1 and X 2 is given by

Page 34 of 82
−( x1 +x 2)
{
f ( x 1 ,x2)=¿ e , x 1 ¿0, x2 ¿0 ¿ ¿¿¿
Find the:
X1
Y 2=
a. Joint probability density of Y 1 =X 1 + X 2 and X 1+ X 2

b. Marginal density of Y 2 .

Page 35 of 82
Solution
x1
y 2=
a. Solving
y 1 =x1 + x2 and for
x 1 and x 2 , we get
x 1+ x 2
x 1= y 1 y 2 and x2 = y 1 (1− y 2 ) and it follows that
∂ x1 ∂ x1
∂ y1 ∂ y2 y y1
J =| |=| 2 |= − y 1
∂ x2 ∂ x 2 1− y 2 − y 1
∂ y1 ∂ y2

Since the transformation is one-to-one, mapping the region x 1 >0 and x 2 >0 in the
x 1 x 2 − plane

into the region 1 y >0 and 0< y <1 2 in the


y 1 y 2 − plane, we can use theorem 2 and it follows
that
g ( y 1 , y 2 ) =f [ w 1 ( y 1 , y 2 ) , w 2 ( y 1 , y 2 ) ] ×|J|
−y − y1
= e |− y1|= y 1 e
1

−y1
i.e.
{
g ( y1 , y 2)=¿ y1 e , y1 ¿0, 0<y2 ¿1 ¿ ¿¿¿
(b) Using the joint density obtained in part (a) and integrating out y 1 , we get


h ( y 2 )= ∫ g ( y1 , y2) dy 1 −y
= ∫ y 1 e 1 dy 1
y1=0 y =0
1 = Γ(2)=1!=1

h ( y2 )=¿ {1 , 0<y2<1 ¿ ¿¿¿ [ 0, 1 ]


i.e. which is the p.d.f. of uniform distribution on the interval .
Example 10
If the joint density of X 1 and X 2 is given by

f ( x1 ,x2)=¿ {1 , 0<x1 <1, 0<x2<1 ¿ ¿¿¿


Find the:
(a) Joint density of Y = X 1 + X 2 and
Z =X 2
(b) Marginal density Y .
Solution
a. Solving y=x 1 +x 2 and
z=x 2 for x 1 and x 2 , we get x 1= y −z and x 2=z , so that
∂ x1 ∂ x1
∂ y1 ∂ y 2 1 −1
J =| |=| |= 1
∂ x2 ∂ x2 0 1
∂ y1 ∂ y2

Page 36 of 82
Since the transformation is one-to-one, mapping the region 0<x 1 <1 and 0<x 2 <1 in the
x 1 x 2 − plane into the region z< y< z+1 and 0< z <1 in the yz− plane, we get
g ( y , z )=f [ w1 ( y , z ) , w 2 ( y , z ) ]|J|=1×|1|=1

i.e.
g ( y,z )=¿ {1 , z<y<z+1, 0<z<1¿¿¿¿
b. Integrating out z separately for y≤0 , 0< y<1, 1< y<2 , and y≥2 , we get

y 1

{ {
h(y)=¿{0, for y≤0¿ ∫ 1dz=y, for 0<y<1¿ ∫ dz=2−y, for 1<y<2¿ ¿
z=0 z=y−1
and to make the density function continuous, we let h ( 1 )=1 . We have thus shown that the sum of
the given random variables has the triangular probability density, whose graph is shown in the
figure below;

Page 37 of 82
So far we have considered only functions of two random variables, but the method based on theorem
2 can easily be generalized to functions of three or more random variables. For instance, if we are
given the joint probability density of three random variables X 1 , X 2 , and
X
3 and we want to find
Y =u X , X , X Y 2 =u2 ( X 1 , X 2 , X 3 )
the joint probability density of the random variable 1 1 ( 1 2 3 ) , , and
Y 3 =u3 ( X 1 , X 2 , X 3 )
, the general approach is the same but the , but the Jocabian is now the 3×3
determinant
∂ x1 ∂ x1 ∂ x1
∂ y1 ∂ y2 ∂ y3
∂ x2 ∂ x2 ∂ x2
J =| |
∂ y1 ∂ y2 ∂ y3
∂ x3 ∂ x3 ∂ x3
∂ y1 ∂ y2 ∂ y3
Once we have determined the joint probability density of the three new random variables, we can
find the marginal density of any of the two of the random variables, or any one, by integration.
Example 11
If the joint probability density of X 1 , X 2 , and X 3 is given by
−( x1 +x2 +x3 )
{
f ( x 1 , x 2 , x3 )= ¿ e , for x1 ¿0, x 2 ¿0,x 3 ¿0 ¿ ¿¿¿
Find the:
a. Joint density of
Y 1 =X 1 + X 2 + X 3 , Y 2 =X 2 , and Y 3 =X 3 ;
b. Marginal density of Y 1 .
Solution
a) Solving the system of the equations
y 1 =x1 + x2 +x 3 , y 2 =x2 and y 3 =x3 for x 1 ,x 2 and x 3 ,

we get
x 1= y 1 − y 2 − y 3 , x 2= y 2 x =y
and 3 3 . It follows that

∂ x1 ∂ x1 ∂ x1
∂ y1 ∂ y2 ∂ y3
∂ x 2 ∂ x 2 ∂ x 2 1 −1 −1
J =| |=|0 1 0 |=1
∂ y1 ∂ y2 ∂ y3 0 0 1
∂ x3 ∂ x3 ∂ x3
∂ y1 ∂ y2 ∂ y3
and, since the transformation is one-to –one, then
− y1 − y1
g ( y 1 , y 2 , y 3 ) =e ×|1|=e
−y1
i.e.
{
g ( y1 , y 2 , y3)=¿ e , y2 ¿0, y 3 ¿0, and y1 ¿ y2 +y3 ¿ ¿¿¿¿
b) Integrating out y 2 and y 3 , we get

Page 38 of 82
y1 y1− y3 y1 y1
− y1 −y y1 − y3 −y
h ( y 1) = ∫ ∫ e dy 2 dy 3 =e 1
∫ [ y 2 ] y =0 dy 3 =e ∫ [ y 1− y 3 ] dy 3
1
2
y 3=0 y 2=0 y3 =0 y3 =0
2 y1
y y 21 1 2 − y 1
=e
− y1
[ y 1 y 3−
2
3
] y 3=0
=e
− y1
[ ]
2
y − = y1 e ,
1
2 2
y1 ¿ 0

Observe that we have shown that the sum of three independent random variables having the
α=1 and β=1 is a random variable having the gamma distribution
gamma distribution with
with α =3 and β=1 .

EXERCISE
m=3 , N=6 and n=2 , find the probability
1. If X has a hypergeometric distribution with
3 2
2
distribution of the random variable Z =( X−1 ) .
⇒ h ( 0 )= and h (1 )=
5 5 [ ]
1
2. If X has a binomial distribution with
n=3 and p= 3 , find the probability distributions of:
X
Y= 8
, g( 12 )= 12 , g( 32 )= 27
6
, g( 34 )= 27
1
(a) 1+ X [ ⇒ g ( 0)= 27 27 ]
(b) m=3, N=6 [⇒ g (0)= 12
27
, g(1 )= 14
27
, g( 16 )= 271 ]
3. If X =ln( y) has a normal distribution with the mean μ and the standard deviation σ , find
the probability density of Y , which is said to have the log-normal distribution.
2
( ln yσ−μ ) ,
[ ]
1
1 1 −2
⇒ g ( y )= ⋅
2 y
⋅e y >0
√ 2 πσ
4. If the probability density of X is given by

kx 3
f ( x )=¿
{
( 1+2 x ) 6
, x>0 ¿ ¿¿¿
where k is an appropriate constant, find the probability density of the random variable
2X
Y=
1+2 X . Identify the distribution of Y , and thus determine the value of k .

k =320
¿¿ [This is a beta distribution with α =4 and β=2 ];

Y = −2ln X
5. If X has a uniform density with α =0 and β=1 , show that the random variable ,
has a gamma distribution. What are its parameters?

Page 39 of 82
1

6. If X has a uniform density with α =0 and β=1 , show that Y=X


−α
, with α > 0 has the

α
Pareto distribution
f ( y )= ¿
{ y α+1
, y>1 ¿ ¿¿¿

7. Consider the random variable X with the uniform density having α =1 and β=3 .
(a) Use the result of Example 2 to find the probability density of Y=|X| .
⇒ g ( y )= 12 , for 0< y <1 and g( y )= 14 , for 1< y <3
[ ]

(b) Find the probability density of


8. If the joint probability distribution of X 1 and X 2 is given by
2
Z =X (=Y ) .
2
¿¿
x1 x2
f ( x 1 , x 2) = , x 1 =1,2,3 x 2=1,2,3 , find the probability distribution of:
36 and
a)
X1 X 2 ;

b)
X1 / X2 .
9. If the joint probability distribution of X 1 and X 2 is given by
x1 x2
f ( x 1 , x 2) = , x 1 =1,2,3
x 2=1,2,3 , find
36 and
(a) The joint distribution of Y 1 =X 1 + X 2 and Y 2 =X 1 −X 2 ;
1 2 2 3 4 3
⇒¿ [ f ( 2,0 )= 36
, f ( 3,−1 ) = 36
,f ( 3,1 ) = 36
,f ( 4 ,−2 ) = 36
,f ( 4,0 ) = 36
,f ( 4,2 ) = ,
36 ¿ ]¿¿
¿
(b) The marginal distribution of Y 1 .
1 4
[
⇒ g ( 2 ) = 36 , g ( 3 )= 36 , g ( 4 ) = 10
36
, g (5 )= 12
36
9
, g ( 6 )= 36 ]
1 1 5
10. If
X 1 , X 2 , and X 3 have the multinomial distribution with n=2, θ1 = 4 , θ 2= 3 , and θ3 = 12 ,
find the joint probability distribution of
Y 1 =X 1 + X 2 , Y 2= X 1− X 2 , and Y 3 =X 3 .
25 5 5
[
⇒ g ( 0,0,2 )= 144 , g (1 ,−1,1 )= 18 , g ( 1,1,1 )= 24 , g ( 2 ,−2,0 ) = 19 , g ( 2,0,0 )= 16 , and g ( 2,2,0 )= 16
1
]
11. If
X 1 and X 2 are independent random variables having binomial distributions with the

respective parameters
n1 and θ and n2 and θ , show that Y = X 1 + X 2 has the binomial

distribution with the parameters


n1 +n2 and θ . Hint: use the Theorem:
k
∑ (mr )( k−r
n = m+
) (k)
r=0

Page 40 of 82
12. If X and Y are independent random variables having the standard normal distribution, show
that the random variable Z =X +Y is also normally distributed. (Hint: Complete the square in
the exponent). What are the mean and the variance of this normal distribution?
[ ⇒ μ=0 and σ 2=2 ]
13. Consider the two random variables X and Y with the joint probability density

f ( x,y ) =¿ {12xy ( 1−y ) , 0<x<1, 0<y<1¿¿¿¿


2
a) Find the joint probability density of Z =XY and U=Y .

⇒¿ ¿
b) Find the marginal density of Z.

⇒¿ ¿ 14. Consider two independent random variable X 1


and X 2 having the same Cauchy distribution
1
f ( x )=¿
{ 2
π ( 1+x )
, −∞<x<∞ ¿ ¿¿¿

a) Find the probability density of Y 1 =X 1 + X 2 and Y 2 =X 1 −X 2

b) Find the marginal density of Y 1 . ⇒¿ ¿


15. Let X and Y be one continuous random variables having the joint probability density

f ( x,y )=¿ {24xy, for 0<x<1, 0<y<1, x+y<1¿¿¿¿


Find the joint probability of Z =X +Y and W = X .

⇒¿ ¿
16. Let X and Y be two independent random variables having identical gamma distributions.
X
U= and V = X +Y
(a) Find the joint probability density of the random variables X +Y .
(b) Find and identify the marginal density of U.

17. According to the Maxwell – Boltzmann law of theoretical Physics, the probability density of V ,
the velocity of a gas molecule, is

Page 41 of 82
2 −βv 2
f ( v )=¿ {kv e , v>0 ¿ ¿¿¿
where β depends on its mass and the absolute temperature and k is an appropriate constant.
1 2
Show that the kinetic energy E= 2 m V is a random variable having a gamma distribution.

Page 42 of 82
MOMENT GENERATING FUNCTION TECHNIQUE
Moment-generating functions can play an important role in determining the probability distribution or density of a function of random variables when the function is a linear combination of n independent random variables.
The method is based on the theorem that the moment-generating function of the sum of n independent random variables equals the product of their moment-generating functions, namely:
Theorem 3
If X₁, X₂, ⋯, Xₙ are independent random variables and Y = X₁ + X₂ + ⋯ + Xₙ, then
M_Y(t) = ∏_{i=1}^{n} M_{X_i}(t),
where M_{X_i}(t) is the value of the moment-generating function of X_i at t.
Proof
Making use of the fact that the random variables are independent and, hence,
f(x₁, x₂, ⋯, xₙ) = f₁(x₁)·f₂(x₂)⋯fₙ(xₙ),
we have
M_Y(t) = E(e^{tY}) = E[e^{(X₁+X₂+⋯+Xₙ)t}]
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} e^{(x₁+x₂+⋯+xₙ)t} f(x₁, x₂, ⋯, xₙ) dx₁ dx₂ ⋯ dxₙ
= ∫_{−∞}^{∞} e^{x₁t} f₁(x₁) dx₁ · ∫_{−∞}^{∞} e^{x₂t} f₂(x₂) dx₂ ⋯ ∫_{−∞}^{∞} e^{xₙt} fₙ(xₙ) dxₙ
= M_{X₁}(t)·M_{X₂}(t)⋯M_{Xₙ}(t) = ∏_{i=1}^{n} M_{X_i}(t),
which proves the theorem for the continuous case. To prove it for the discrete case, we have only to replace all of the integrals by sums.
Note that if we want to use Theorem 3 above to find the probability distribution or the probability density of the random variable Y = X₁ + X₂ + ⋯ + Xₙ, we must be able to identify whatever probability distribution or density corresponds to M_Y(t).


Example 12
Find the probability distribution of the sum of n independent random variables X₁, X₂, ⋯, Xₙ having Poisson distributions with the respective parameters λ₁, λ₂, ⋯, λₙ.
Solution
M_{X_1}(t) = e^{λ₁(e^t − 1)}, M_{X_2}(t) = e^{λ₂(e^t − 1)}, ⋯, M_{X_n}(t) = e^{λₙ(e^t − 1)},
i.e. M_{X_i}(t) = e^{λ_i(e^t − 1)}, and hence, for Y = X₁ + X₂ + ⋯ + Xₙ we obtain
M_Y(t) = ∏_{i=1}^{n} M_{X_i}(t) = ∏_{i=1}^{n} e^{λ_i(e^t − 1)}
= e^{λ₁(e^t − 1)}·e^{λ₂(e^t − 1)}⋯e^{λₙ(e^t − 1)} = e^{(λ₁+λ₂+⋯+λₙ)(e^t − 1)},
which can readily be identified as the moment-generating function of the Poisson distribution with the parameter λ = λ₁ + λ₂ + ⋯ + λₙ.
Thus, the distribution of the sum of n independent random variables having Poisson distributions with the parameters λ_i is a Poisson distribution with the parameter λ = λ₁ + λ₂ + ⋯ + λₙ.
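The result in Example 12 is easy to check numerically. The following is a minimal sketch (not part of the original notes); the parameter values λ₁ = 1.2, λ₂ = 0.5, λ₃ = 2.3 are assumptions chosen only for illustration.

# Sketch: empirical check that a sum of independent Poissons is Poisson.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
lams = [1.2, 0.5, 2.3]            # assumed lambda_1, lambda_2, lambda_3
n_sim = 200_000

# Simulate Y = X1 + X2 + X3 with independent Poisson components.
y = sum(rng.poisson(lam, n_sim) for lam in lams)

# Compare the empirical pmf of Y with the Poisson(lambda_1 + lambda_2 + lambda_3) pmf.
total = sum(lams)
for k in range(10):
    empirical = np.mean(y == k)
    theoretical = stats.poisson.pmf(k, total)
    print(f"k={k}: empirical {empirical:.4f}  Poisson({total}) pmf {theoretical:.4f}")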
Example 13
If X₁, X₂, ⋯, Xₙ are independent random variables having an exponential distribution with the same parameter θ, find the probability density of the random variable Y = X₁ + X₂ + ⋯ + Xₙ.
Solution
Since the exponential distribution is a gamma distribution with α = 1 and β = θ, we have
M_{X_1}(t) = M_{X_2}(t) = ⋯ = M_{X_n}(t) = 1/(1 − θt) = (1 − θt)^{−1}
and hence
M_Y(t) = ∏_{i=1}^{n} M_{X_i}(t) = ∏_{i=1}^{n} (1 − θt)^{−1} = (1 − θt)^{−n}.
Identifying the moment-generating function of Y as that of a gamma distribution with α = n and β = θ, we conclude that the distribution of the sum of n independent random variables having exponential distributions with the same parameter θ is a gamma distribution with the parameters α = n and β = θ.
Theorem 3 above also provides an easy and elegant way of deriving the moment-generating function of the binomial distribution.
Suppose that X₁, X₂, ⋯, Xₙ are independent random variables having the same Bernoulli distribution f(x; θ) = θ^x (1 − θ)^{1−x} for x = 0, 1. Then
M_X(t) = E(e^{tX}) = ∑_{x=0}^{1} e^{tx} θ^x (1 − θ)^{1−x}
= e^{t(0)} θ⁰(1 − θ) + e^{t(1)} θ(1 − θ)⁰
= (1 − θ) + e^t θ = 1 + θ(e^t − 1),
i.e. M_{X_i}(t) = 1 + θ(e^t − 1),
so that Theorem 3 yields, for Y = X₁ + X₂ + ⋯ + Xₙ,
M_Y(t) = ∏_{i=1}^{n} M_{X_i}(t) = ∏_{i=1}^{n} [1 + θ(e^t − 1)] = [1 + θ(e^t − 1)]^n.
This moment-generating function is readily identified as that of the binomial distribution with the parameters n and θ. Of course, Y = X₁ + X₂ + ⋯ + Xₙ is the total number of successes in n trials, since X₁ is the number of successes on the first trial, X₂ is the number of successes on the second trial, ⋯, and Xₙ is the number of successes on the nth trial.
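As a quick numerical sanity check, the moment-generating function of a binomial random variable can be estimated directly as an average of e^{tY} over simulated values and compared with [1 + θ(e^t − 1)]^n. The sketch below is not from the notes; n = 8, θ = 0.3 and t = 0.4 are illustrative assumptions.

# Sketch: compare the simulated MGF of a binomial with [1 + theta*(e^t - 1)]^n.
import numpy as np

rng = np.random.default_rng(1)
n, theta, t = 8, 0.3, 0.4         # assumed values

y = rng.binomial(n, theta, size=500_000)        # Y = X1 + ... + Xn, Xi ~ Bernoulli(theta)
mgf_simulated = np.mean(np.exp(t * y))          # E[e^{tY}] estimated by an average
mgf_formula = (1 + theta * (np.exp(t) - 1)) ** n

print(f"simulated M_Y({t}) = {mgf_simulated:.4f}")
print(f"formula   M_Y({t}) = {mgf_formula:.4f}")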

EXERCISE

1. If X 1 and X 2 are independent random variables having binomial distribution with respective
parameters n1 and θ and n2 and θ , show that Y = X 1 + X 2 has the binomial distribution
with the parameter n1 +n2 and θ [Hint: Use the moment – generating function technique]
2. If n independent random variables have the same gamma distribution with the parameters α
and β , find the moment-generating function of their sum and, if possible, identify its
distribution.
[it is a gamma distribution with parameters nα and β ]

3. If n independent random variables X_i have normal distributions with the means μ_i and the standard deviations σ_i, find the moment-generating function of their sum and identify the corresponding distribution, its mean and its variance.
[⇒ M_Y(t) = e^{t ∑ μ_i + (1/2) t² ∑ σ_i²},  Y ~ N(∑ μ_i, ∑ σ_i²)]

4. Prove the following generalization of Theorem 3: if X₁, X₂, ⋯, Xₙ are independent random variables and Y = a₁X₁ + a₂X₂ + ⋯ + aₙXₙ, then
M_Y(t) = ∏_{i=1}^{n} M_{X_i}(a_i t),
where M_{X_i}(t) is the value of the moment-generating function of X_i at t.

5. A lawyer has an unlisted number on which she receives on average 2.1 calls every half-hour and a listed number on which she receives on average 10.9 calls every half-hour. If it can be assumed that the numbers of calls she receives on these phones are independent random variables having Poisson distributions, what are the probabilities that in half an hour she will receive altogether:
a) 14 calls; ⇒ [0.1021]
b) At most six calls? ⇒ [0.0259]
6. The number of fish a person catches per hour at Lake Victoria is a random variable having the Poisson distribution with parameter λ = 1.6.
What are the probabilities that a person fishing there will catch:
a) Four fish in two hours; ⇒ [0.1781]
b) At least two fish in 3 hours; ⇒ [0.9523]
c) At most three fish in four hours? ⇒ [0.1189]
7. If the number of minutes a doctor spends with a patient is a random variable having an exponential distribution with the parameter θ = 9, what are the probabilities that it will take the doctor at least 20 minutes to treat:
a) One patient; ⇒ [0.1084]
b) Two patients; ⇒ [0.3492]
c) Three patients? ⇒ [0.6168]

CHEBYSHEV’S THEOREM
To demonstrate how σ or σ² is indicative of the spread or dispersion of the distribution of a random variable, we prove the following theorem, called Chebyshev's theorem, named after the 19th-century Russian mathematician P.L. Chebyshev.
Theorem (Chebyshev's Theorem)
If μ and σ are the mean and the standard deviation of a random variable X, then for any positive constant k, the probability is at least 1 − 1/k² that X will take on a value within k standard deviations of the mean; symbolically,
Pr(|X − μ| < kσ) ≥ 1 − 1/k².
Proof
By definition,
σ² = E[(X − μ)²] = ∫_{−∞}^{∞} (x − μ)² f(x) dx.
Then, dividing the integral into three parts, we get
σ² = ∫_{−∞}^{μ−kσ} (x − μ)² f(x) dx + ∫_{μ−kσ}^{μ+kσ} (x − μ)² f(x) dx + ∫_{μ+kσ}^{∞} (x − μ)² f(x) dx.
Since the integrand (x − μ)² f(x) is non-negative, we can form the inequality
σ² ≥ ∫_{−∞}^{μ−kσ} (x − μ)² f(x) dx + ∫_{μ+kσ}^{∞} (x − μ)² f(x) dx
by deleting the second integral. Now, since (x − μ)² ≥ k²σ² for x ≤ μ − kσ or x ≥ μ + kσ, it follows that
σ² ≥ ∫_{−∞}^{μ−kσ} k²σ² f(x) dx + ∫_{μ+kσ}^{∞} k²σ² f(x) dx
and, hence, that
1/k² ≥ ∫_{−∞}^{μ−kσ} f(x) dx + ∫_{μ+kσ}^{∞} f(x) dx, provided σ² ≠ 0.
Since the sum of the two integrals on the right-hand side is the probability that X will take on a value less than or equal to μ − kσ or greater than or equal to μ + kσ, we have thus shown that
Pr(|X − μ| ≥ kσ) ≤ 1/k²
and it follows that
Pr(|X − μ| < kσ) ≥ 1 − 1/k².
Clearly, the probability given by Chebyshev's theorem is only a lower bound; whether the probability that a given random variable will take on a value within k standard deviations of the mean is actually greater than 1 − 1/k², and if so by how much, we cannot say, but Chebyshev's theorem assures us that this probability cannot be less than 1 − 1/k².
Only when the distribution of a random variable is known can we calculate the exact probability.

Example 1
If the probability density of X is given by
f(x) = 630x⁴(1 − x)⁴ for 0 < x < 1;  f(x) = 0 elsewhere,
find the probability that it will take on a value within two standard deviations of the mean and compare this probability with the lower bound provided by Chebyshev's theorem.
Solution
Straightforward integration shows that μ = 1/2 and σ² = 1/44, so that σ = 1/√44 ≈ 0.15.
Thus, the probability that X will take on a value within two standard deviations of the mean is the probability that it will take on a value between 0.20 and 0.80, namely,
Pr(0.20 < X < 0.80) = ∫_{0.20}^{0.80} 630x⁴(1 − x)⁴ dx = 0.96.
Observe that the statement "the probability is 0.96" is a much stronger statement than "the probability is at least 0.75", which is provided by Chebyshev's theorem.
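The integral in Example 1 is easy to evaluate numerically; the following is a small sketch (not part of the original notes) that computes Pr(0.20 < X < 0.80) with scipy and prints the Chebyshev lower bound for k = 2 alongside it.

# Sketch: exact probability for Example 1 versus the Chebyshev bound with k = 2.
import numpy as np
from scipy import integrate

f = lambda x: 630 * x**4 * (1 - x)**4        # density of X on (0, 1)

mu = 0.5
sigma = np.sqrt(1 / 44)
k = 2

exact, _ = integrate.quad(f, mu - k * sigma, mu + k * sigma)
chebyshev_bound = 1 - 1 / k**2

print(f"exact probability  {exact:.4f}")          # about 0.96
print(f"Chebyshev bound    {chebyshev_bound:.4f}")  # 0.75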
Example 2
Let X have the p.d.f.
f(x) = 1/(2√3) for −√3 < x < √3;  f(x) = 0 elsewhere.
Here μ = 0 and σ² = 1. If k = 3/2, we have the exact probability
Pr(|X − μ| ≥ kσ) = Pr(|X| ≥ 3/2) = 1 − Pr(|X| ≤ 3/2)
= 1 − ∫_{−3/2}^{3/2} 1/(2√3) dx = 1 − √3/2.
By Chebyshev's inequality, the preceding probability has the upper bound 1/k² = 4/9.
Since 1 − √3/2 = 0.134, the exact probability in this case is considerably less than the upper bound 4/9.

EXERCISE
1. What is the smallest value of k in Chebyshev's theorem for which the probability that a random variable will take on a value between μ − kσ and μ + kσ is
a) At least 0.95; ⇒ [k = √20]
b) At least 0.99? ⇒ [k = 10]
2. A study of the nutritional value of a certain kind of bread shows that the amount of thiamine (vitamin B1) in a slice may be looked upon as a random variable with μ = 0.260 mg and σ = 0.005 mg. According to Chebyshev's theorem, between what values must the thiamine content lie for
a) At least 35/36 of all the slices of this bread; [between 0.230 and 0.290]
b) At least 143/144 of all the slices of this bread? [between 0.200 and 0.320]
3. If X is a random variable such that E(X) = 3 and E(X²) = 13, use Chebyshev's theorem to determine a lower bound for the probability Pr(−2 < X < 8). [0.84]

SAMPLING DISTRIBUTION
INTRODUCTION
Statistics concerns itself mainly with conclusions and predictions resulting from chance outcomes that occur in carefully planned experiments or investigations.
In the finite case, these chance outcomes constitute a subset, or sample, of measurements or observations from a larger set of values called the population.
In the continuous case they are usually values of identically distributed random variables, whose distribution we refer to as the population distribution, or the infinite population sampled. The word "infinite" implies that there is, logically speaking, no limit to the number of values we could observe.
Not all samples lend themselves to valid generalizations about the populations from which they came. Most of the methods of inference are based on the assumption that we are dealing with random samples.

Definition 1
If X₁, X₂, ⋯, Xₙ are independent and identically distributed random variables, we say that they constitute a random sample from the infinite population given by their common distribution.
If f(x₁, x₂, ⋯, xₙ) is the value of the joint distribution of such a set of random variables at (x₁, x₂, ⋯, xₙ), we can write
f(x₁, x₂, ⋯, xₙ) = ∏_{i=1}^{n} f(x_i),
where f(x_i) is the value of the population distribution at x_i.
Statistical inferences are usually based on statistics, i.e., on random variables that are functions of a set of random variables X₁, X₂, ⋯, Xₙ.
Definition 2
If X₁, X₂, ⋯, Xₙ constitute a random sample, then
X̄ = (∑_{i=1}^{n} X_i)/n is called the sample mean, and
S² = ∑_{i=1}^{n} (X_i − X̄)²/(n − 1) is called the sample variance.
For observed sample data we might calculate
x̄ = (∑_{i=1}^{n} x_i)/n and s² = ∑_{i=1}^{n} (x_i − x̄)²/(n − 1)
and refer to these statistics as the sample mean and the sample variance.
Here, the x_i, x̄, and s² are values of the corresponding random variables X_i, X̄, and S².

THE DISTRIBUTION OF THE MEAN


Since statistics are random variables, their values will vary from sample to sample, and it is customary to refer to their distributions as sampling distributions.
Theorem 1
If X₁, X₂, ⋯, Xₙ constitute a random sample from an infinite population with the mean μ and the variance σ², then
E(X̄) = μ and Var(X̄) = σ²/n.
Proof
X̄ = (∑_{i=1}^{n} X_i)/n, so
E(X̄) = E[(1/n) ∑_{i=1}^{n} X_i] = (1/n) ∑_{i=1}^{n} E(X_i) = (1/n)·nμ = μ, since E(X_i) = μ.
Similarly,
Var(X̄) = Var[(1/n) ∑_{i=1}^{n} X_i] = (1/n²) ∑_{i=1}^{n} Var(X_i) = (1/n²)·nσ² = σ²/n,
i.e. Var(X̄) = σ²/n.
It is customary to write E(X̄) as μ_X̄ and Var(X̄) as σ²_X̄, and to refer to σ_X̄ as the standard error of the mean. The formula for the standard error of the mean, σ_X̄ = σ/√n, shows that the standard error of the distribution of X̄ decreases when n, the sample size, is increased. This means that when n becomes larger and we actually have more information (the values of more random variables), we can expect values of X̄ to be closer to μ, the quantity they are intended to estimate.
If we refer to Chebyshev's theorem, we can express this formally in the following way.

Theorem 2
For any positive constant c, the probability that X̄ will take on a value between μ − c and μ + c is at least 1 − σ²/(nc²). When n → ∞, this probability approaches 1.
This result, called a law of large numbers, is primarily of theoretical interest.
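The shrinking standard error can be seen directly in a simulation. The sketch below is illustrative only and is not part of the original notes; the exponential population with θ = 4 and the sample sizes are assumptions. It draws many samples of each size and compares the standard deviation of the simulated sample means with σ/√n.

# Sketch: the standard deviation of X-bar behaves like sigma / sqrt(n).
import numpy as np

rng = np.random.default_rng(2)
theta = 4.0                      # assumed exponential parameter, so sigma = 4
n_rep = 20_000                   # number of simulated samples per sample size

for n in (5, 25, 100, 400):
    samples = rng.exponential(theta, size=(n_rep, n))
    means = samples.mean(axis=1)
    print(f"n={n:4d}: sd of sample means {means.std(ddof=1):.4f}   sigma/sqrt(n) {theta/np.sqrt(n):.4f}")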
Of more practical value is the central limit theorem, one of the most important theorems of statistics,
which concerns the limiting distribution of the standardized mean of n random variables when
n→ ∞ .

Theorem 3 (Central Limit Theorem)
If X₁, X₂, ⋯, Xₙ constitute a random sample from an infinite population with the mean μ, the variance σ², and the moment-generating function M_X(t), then the limiting distribution of
Z = (X̄ − μ)/(σ/√n)
as n → ∞ is the standard normal distribution.
Proof
M_Z(t) = E(e^{tZ}) = E[e^{t(X̄ − μ)/(σ/√n)}]
= E[e^{(√n t/σ)X̄}·e^{−√n μt/σ}] = e^{−√n μt/σ} E[e^{(√n t/σ)X̄}] = e^{−√n μt/σ} M_X̄(√n t/σ)
= e^{−√n μt/σ} M_{nX̄}(t/(σ√n)).
Since nX̄ = X₁ + X₂ + ⋯ + Xₙ, it follows that
M_Z(t) = e^{−√n μt/σ} [M_X(t/(σ√n))]^n
and hence that
ln M_Z(t) = −√n μt/σ + n·ln M_X(t/(σ√n)).
Expanding M_X(t/(σ√n)) as a power series in t, we obtain
ln M_Z(t) = −√n μt/σ + n·ln[1 + μ₁′ t/(σ√n) + μ₂′ t²/(2σ²n) + μ₃′ t³/(6σ³n√n) + ⋯],
where μ₁′, μ₂′, and μ₃′ are the moments about the origin of the population distribution, namely, those of the original random variables X_i.
If n is sufficiently large, we can use the expansion of ln(1 + x) as a power series in x, getting
ln M_Z(t) = −√n μt/σ + n{[μ₁′ t/(σ√n) + μ₂′ t²/(2σ²n) + μ₃′ t³/(6σ³n√n) + ⋯]
− (1/2)[μ₁′ t/(σ√n) + μ₂′ t²/(2σ²n) + μ₃′ t³/(6σ³n√n) + ⋯]²
+ (1/3)[μ₁′ t/(σ√n) + μ₂′ t²/(2σ²n) + μ₃′ t³/(6σ³n√n) + ⋯]³ − ⋯}.
Then, collecting powers of t, we obtain
ln M_Z(t) = (−√n μ/σ + √n μ₁′/σ)t + (μ₂′/(2σ²) − μ₁′²/(2σ²))t² + (μ₃′/(6σ³√n) − μ₁′μ₂′/(2σ³√n) + μ₁′³/(3σ³√n))t³ + ⋯,
and since μ₁′ = μ and μ₂′ − μ₁′² = σ², this reduces to
ln M_Z(t) = (1/2)t² + (μ₃′/6 − μ₁′μ₂′/2 + μ₁′³/3)·t³/(σ³√n) + ⋯.
Finally, observing that the coefficient of t³ is a constant times 1/√n and, in general, for r ≥ 2 the coefficient of t^r is a constant times 1/(√n)^{r−2}, we get
lim_{n→∞} ln M_Z(t) = (1/2)t²
and hence
lim_{n→∞} M_Z(t) = e^{t²/2},
since the limit of a logarithm equals the logarithm of the limit (provided these limits exist).
We identify the limiting moment-generating function at which we have arrived as that of the standard normal distribution; this completes the proof of the CLT.
Sometimes, the central limit theorem is interpreted incorrectly as implying that the distribution of X̄ approaches a normal distribution when n → ∞. This is incorrect because Var(X̄) → 0 when n → ∞; on the other hand, the central limit theorem does justify approximating the distribution of X̄ with a normal distribution having the mean μ and the variance σ²/n when n is large.
In practice, this approximation is used when n ≥ 30 regardless of the actual shape of the population sampled.

Example 1
A soft drink vending machine is set so that the amount of drink dispensed is a random variable with a mean of 200 ml and a standard deviation of 15 ml. What is the probability that the average (mean) amount dispensed in a random sample of size 36 is at least 204 ml?
Solution
According to Theorem 1, the distribution of X̄ has the mean μ_X̄ = 200 and the standard deviation σ_X̄ = 15/√36 = 2.5, and according to the central limit theorem, this distribution is approximately normal. Hence
Pr(X̄ ≥ 204) = Pr((X̄ − μ)/(σ/√n) ≥ (204 − 200)/(15/√36)) = Pr(Z ≥ 1.6)
= 1 − Pr(Z ≤ 1.6) = 1 − 0.9452 = 0.0548.
It is of interest to note that when the population we are sampling is normal, the distribution of X̄ is a normal distribution regardless of the size of n.
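The normal-approximation step in this example maps directly onto a one-line computation with scipy; the sketch below (not part of the original notes) reproduces the 0.0548 figure.

# Sketch: Pr(X-bar >= 204) for mu = 200, sigma = 15, n = 36 under the CLT approximation.
from math import sqrt
from scipy import stats

mu, sigma, n = 200, 15, 36
se = sigma / sqrt(n)                          # standard error of the mean = 2.5

prob = stats.norm.sf(204, loc=mu, scale=se)   # upper-tail probability
print(f"Pr(X-bar >= 204) ~ {prob:.4f}")       # about 0.0548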

Example 1
A random sample of size 72 is taken from a population with the probability density function given by
f(x) = (1/16) x e^{−x/4} for 0 < x < ∞;  f(x) = 0 elsewhere.
Use the Central Limit Theorem (CLT) to compute an approximate probability that the mean of the random sample will exceed 9.
Solution
For a gamma distribution with parameters α, β,
f(x) = 1/(Γ(α)β^α) x^{α−1} e^{−x/β} for 0 < x < ∞;  f(x) = 0 elsewhere.
In our case, α − 1 = 1 ⇒ α = 2 and β = 4, so
μ = E(X) = αβ = 2 × 4 = 8 and Var(X) = αβ² = 2 × 4² = 32.
Hence
Pr(X̄ > 9) = 1 − Pr(X̄ < 9) = 1 − Pr((X̄ − μ)/(σ/√n) < (9 − μ)/(σ/√n))
= 1 − Pr(Z < (9 − 8)/(√32/√72)) = 1 − Pr(Z < 1.5) = 1 − 0.933 = 0.067.

Theorem 4
If X̄ is the mean of a random sample of size n from a normal population with the mean μ and the variance σ², its sampling distribution is a normal distribution with the mean μ and the variance σ²/n.
Proof
M_X̄(t) = E(e^{tX̄}) = E[e^{t(X₁ + X₂ + X₃ + ⋯ + Xₙ)/n}]
= E[e^{(t/n)X₁}]·E[e^{(t/n)X₂}]⋯E[e^{(t/n)Xₙ}]
= M_{X₁}(t/n)·M_{X₂}(t/n)⋯M_{Xₙ}(t/n).
But M_{X_i}(t/n) = M_X(t/n) for i = 1, 2, ⋯, n. Hence
M_X̄(t) = [M_X(t/n)]^n.
Since the moment-generating function of a normal distribution with the mean μ and the variance σ² is given by M_X(t) = e^{μt + σ²t²/2}, we get
M_X̄(t) = [e^{μ(t/n) + σ²(t/n)²/2}]^n = e^{μt + (σ²/n)t²/2}.
This moment-generating function is readily seen to be that of a normal distribution with the mean μ and the variance σ²/n.

EXERCISE
1. A random sample of size n = 100 is taken from an infinite population with the mean μ = 75 and the variance σ² = 256.
(a) Based on Chebyshev's theorem, with what probability can we assert that the value we obtain for X̄ will fall between 67 and 83? [0.96]
(b) Based on the Central Limit Theorem, with what probability can we assert that the value we obtain for X̄ will fall between 67 and 83? [0.9999994]
2. A random sample of size n = 81 is taken from an infinite population with the mean μ = 128 and the standard deviation σ = 6.3. With what probability can we assert that the value we obtain for X̄ will not fall between 126.6 and 129.4 if we use:
a) Chebyshev's theorem; [the probability is at most 0.25]
b) The central limit theorem? ⇒ [0.0456]
3. A random sample of size 64 is taken from a normal population with mean μ = 51.4 and standard deviation σ = 6.8. What is the probability that the mean of the sample will:
a) Exceed 52.9; ⇒ [0.0388]
b) Fall between 50.5 and 52.3; ⇒ [0.7108]
c) Be less than 50.6? ⇒ [0.1736]
4. A random sample of size n = 225 is to be taken from an exponential population with θ = 4. Based on the central limit theorem, what is the probability that the mean of the sample will exceed 4.5?
5. A random sample of size n = 200 is to be taken from a uniform population with α = 24 and β = 48. Based on the central limit theorem, what is the probability that the mean of the sample will be less than 35? [0.0207]
6. A random sample of size n = 100 is taken from a normal population with σ = 25. What is the probability that the mean of the sample will differ from the mean of the population by 3 or more either way? [0.2302]
7. Let X̄ denote the mean of a random sample of size 100 from a distribution which is χ²(50). Compute an approximate value of Pr(49 < X̄ < 51).
8. Let f(x) = 1/x² for 1 < x < ∞; f(x) = 0 elsewhere, be the p.d.f. of a random variable X.
Consider a random sample of size 72 from the distribution having this p.d.f.
Compute approximately the probability that more than 50 of the items of the random sample are less than 3. ⇒ [0.267]
9. Let X̄ denote the mean of a random sample of size 128 from a gamma distribution with α = 2 and β = 4. Approximate Pr(7 < X̄ < 9). ⇒ [0.954]
10. Compute an approximate probability that the mean of a random sample of size 15 from a distribution having p.d.f.
f(x) = 3x² for 0 < x < 1;  f(x) = 0 elsewhere,
is between 3/5 and 4/5. ⇒ [0.840]

THE CHI-SQUARE DISTRIBUTION


A random variable X has the chi-square distribution with r degrees of freedom if its probability density function is given by
f(x) = 1/(2^{r/2} Γ(r/2)) x^{r/2 − 1} e^{−x/2} for x > 0;  f(x) = 0 elsewhere.
The chi-square distribution is often denoted by "χ²-distribution", where χ is the lower-case Greek letter chi. We write, for brevity, that X is χ²(r) to mean that the random variable X has a chi-square distribution with r degrees of freedom.

Moment-generating Function of a Chi-square Distribution
By definition,
M_X(t) = E(e^{tX}) = ∫ e^{tx} f(x) dx
= ∫_{0}^{∞} e^{tx} 1/(2^{r/2} Γ(r/2)) x^{r/2 − 1} e^{−x/2} dx
= 1/(2^{r/2} Γ(r/2)) ∫_{0}^{∞} x^{r/2 − 1} e^{−x(1/2 − t)} dx.
Let y = x(1/2 − t), so that dy/dx = 1/2 − t, i.e. dx = dy/(1/2 − t). Then
M_X(t) = 1/(2^{r/2} Γ(r/2)) ∫_{0}^{∞} [y/(1/2 − t)]^{r/2 − 1} e^{−y} dy/(1/2 − t)
= 1/(2^{r/2} Γ(r/2)) · 1/(1/2 − t)^{r/2} ∫_{0}^{∞} y^{r/2 − 1} e^{−y} dy
= 1/(2^{r/2} Γ(r/2)) · [2/(1 − 2t)]^{r/2} · Γ(r/2)
= (1 − 2t)^{−r/2}.
∴ M_X(t) = (1 − 2t)^{−r/2} is the moment-generating function of a chi-square distribution.

Mean and Variance of a Chi-square Distribution
Mean = M_X′(0).
M_X(t) = (1 − 2t)^{−r/2} ⇒ M_X′(t) = d/dt (1 − 2t)^{−r/2} = −(r/2)(1 − 2t)^{−r/2 − 1}(−2) = r(1 − 2t)^{−r/2 − 1}.
At t = 0: M_X′(0) = r(1 − 0)^{−r/2 − 1} = r.
∴ mean = μ = r.
M_X″(t) = d/dt [M_X′(t)] = d/dt [r(1 − 2t)^{−r/2 − 1}] = r(−r/2 − 1)(1 − 2t)^{−r/2 − 2}(−2) = r(r + 2)(1 − 2t)^{−r/2 − 2}.
At t = 0: M_X″(0) = r(r + 2) = r² + 2r.
∴ Var(X) = M_X″(0) − [M_X′(0)]² = r² + 2r − r² = 2r.
Thus the mean and the variance of the chi-square distribution with r degrees of freedom are r and 2r respectively.
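These moments are easy to confirm numerically. The sketch below is not in the original notes; the degrees of freedom are chosen arbitrarily. It checks the mean r and variance 2r against scipy's chi-square distribution and against simulated data.

# Sketch: mean r and variance 2r of the chi-square distribution, checked two ways.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
for r in (1, 5, 20):
    mean_th, var_th = stats.chi2.stats(df=r, moments="mv")   # theoretical moments
    sample = rng.chisquare(df=r, size=200_000)
    print(f"r={r:2d}: theory mean {float(mean_th):.2f}, var {float(var_th):.2f}; "
          f"simulated mean {sample.mean():.2f}, var {sample.var(ddof=1):.2f}")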

The chi-square distribution has several important mathematical properties, which are given below.

Theorem 5
If X has the standard normal distribution, then X² has the chi-square distribution with r = 1 degree of freedom.

Proof
Let Z = X². Then
M_Z(t) = E(e^{tZ}) = E(e^{tX²}) = ∫_{−∞}^{∞} e^{tx²} f(x) dx.
For a standard normal distribution the p.d.f. is given by f(x) = (1/√(2π)) e^{−x²/2}, so
M_Z(t) = ∫_{−∞}^{∞} e^{tx²} (1/√(2π)) e^{−x²/2} dx = (1/√(2π)) ∫_{−∞}^{∞} e^{−x²(1/2 − t)} dx
= (1/√(2π)) ∫_{−∞}^{∞} e^{−x²(1 − 2t)/2} dx = (2/√(2π)) ∫_{0}^{∞} e^{−x²(1 − 2t)/2} dx.
Let y = x²(1 − 2t)/2, so that x = [2y/(1 − 2t)]^{1/2} and
dx = 2^{−1/2} y^{−1/2} (1 − 2t)^{−1/2} dy.
Then
M_Z(t) = (2/√(2π)) ∫_{0}^{∞} e^{−y} · 2^{−1/2} y^{−1/2} (1 − 2t)^{−1/2} dy
= 1/(√π (1 − 2t)^{1/2}) ∫_{0}^{∞} y^{1/2 − 1} e^{−y} dy
= 1/((1 − 2t)^{1/2} √π) Γ(1/2) = 1/((1 − 2t)^{1/2} √π) · √π = (1 − 2t)^{−1/2}.
Hence M_Z(t) = (1 − 2t)^{−1/2}, which is the m.g.f. of a chi-square distribution with r = 1 degree of freedom.

Theorem 6
If X₁, X₂, ⋯, Xₙ are independent random variables having the standard normal distribution, then Y = ∑_{i=1}^{n} X_i² has the chi-square distribution with r = n degrees of freedom.
Proof
Using the moment-generating function given above with r = 1 and Theorem 5, we find that
M_{X_i²}(t) = (1 − 2t)^{−1/2}, i = 1, 2, ⋯, n,
and it follows that
M_Y(t) = ∏_{i=1}^{n} (1 − 2t)^{−1/2} = (1 − 2t)^{−n/2}.
This moment-generating function is readily identified as that of the chi-square distribution with r = n degrees of freedom.

Theorem 7
If X₁, X₂, ⋯, Xₙ are independent random variables having chi-square distributions with r₁, r₂, ⋯, rₙ degrees of freedom, then Y = ∑_{i=1}^{n} X_i has the chi-square distribution with r₁ + r₂ + ⋯ + rₙ degrees of freedom.
Proof (exercise)

Theorem 8
If X₁ and X₂ are independent random variables, X₁ has a chi-square distribution with r₁ degrees of freedom, and X₁ + X₂ has a chi-square distribution with r > r₁ degrees of freedom, then X₂ has a chi-square distribution with r − r₁ degrees of freedom.
Proof (exercise)

The χ²-distribution has many important applications, of which several will be discussed in later courses. Foremost are those based, directly or indirectly, on the following theorem.
Theorem 9
If X̄ and S² are the mean and the variance of a random sample of size n from a normal population with the mean μ and the standard deviation σ, then
a) X̄ and S² are independent;
b) the random variable (n − 1)S²/σ² has a chi-square distribution with (n − 1) degrees of freedom.
Proof
a) A detailed proof of part (a) would go beyond the scope of this course, so we shall assume the independence of X̄ and S² in our proof of part (b).
b) We begin with the identity
∑_{i=1}^{n} (X_i − μ)² = ∑_{i=1}^{n} [(X_i − X̄) + (X̄ − μ)]²
= ∑_{i=1}^{n} [(X_i − X̄)² + 2(X_i − X̄)(X̄ − μ) + (X̄ − μ)²]
= ∑_{i=1}^{n} (X_i − X̄)² + 2(X̄ − μ) ∑_{i=1}^{n} (X_i − X̄) + (X̄ − μ)² ∑_{i=1}^{n} 1
= ∑_{i=1}^{n} (X_i − X̄)² + 2(X̄ − μ)[∑_{i=1}^{n} X_i − ∑_{i=1}^{n} X̄] + n(X̄ − μ)²
= ∑_{i=1}^{n} (X_i − X̄)² + 2(X̄ − μ)[nX̄ − nX̄] + n(X̄ − μ)²
= ∑_{i=1}^{n} (X_i − X̄)² + n(X̄ − μ)²,
i.e. ∑_{i=1}^{n} (X_i − μ)² = ∑_{i=1}^{n} (X_i − X̄)² + n(X̄ − μ)².
Now, if we divide each term by σ² and substitute (n − 1)S² for ∑_{i=1}^{n} (X_i − X̄)², it follows that
∑_{i=1}^{n} (X_i − μ)²/σ² = ∑_{i=1}^{n} (X_i − X̄)²/σ² + n(X̄ − μ)²/σ²,
⇒ ∑_{i=1}^{n} [(X_i − μ)/σ]² = (n − 1)S²/σ² + [(X̄ − μ)/(σ/√n)]².
With regard to the three terms of this identity, we know from Theorem 6 that the one on the left-hand side of the equation is a random variable having a chi-square distribution with n degrees of freedom.
Also, according to Theorems 4 and 5, the second term on the right-hand side of the equation is a random variable having a chi-square distribution with 1 degree of freedom.
Now, since X̄ and S² are assumed to be independent, it follows that the two terms on the right-hand side of the equation are independent, and we conclude by Theorem 8 that (n − 1)S²/σ² is a random variable having a χ²-distribution with (n − 1) degrees of freedom.
Since the χ²-distribution arises in many important applications, integrals of its density have been extensively tabulated.
The tabulated value χ²_α(ν) is such that if X is a random variable having a chi-square distribution with ν degrees of freedom, then
Pr(X ≥ χ²_α(ν)) = α.
When ν is greater than 30, the chi-square tables cannot be used and probabilities related to chi-square distributions are usually approximated with normal distributions.

Example 2
Suppose that the thickness of a part used in a semiconductor is its critical dimension, and that the process of manufacturing these parts is considered to be under control if the true variation among the thicknesses of the parts is given by a standard deviation not greater than σ = 0.60 thousandth of an inch. To keep a check on the process, random samples of size n = 20 are taken periodically, and the process is regarded to be "out of control" if the probability that S² will take on a value greater than or equal to the observed sample value is 0.01 or less (even though σ = 0.60). What can one conclude about the process if the standard deviation of such a periodic random sample is s = 0.84 thousandth of an inch?

Solution
The process will be declared "out of control" if (n − 1)s²/σ² with n = 20 and σ = 0.60 exceeds χ²_{0.01}(19) = 36.191. Since
(n − 1)s²/σ² = 19(0.84)²/(0.60)² = 37.24
exceeds 36.191, we reject H₀ and conclude that the process is out of control.
Of course, it is assumed here that the sample may be regarded as a random sample from a normal population.
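The decision rule in Example 2 is straightforward to reproduce with scipy; the following sketch (not part of the original notes) computes the test statistic, the critical value χ²₀.₀₁(19), and the tail probability Pr(S² greater than or equal to the observed value).

# Sketch: chi-square check on a sample standard deviation (Example 2).
from scipy import stats

n, sigma0, s = 20, 0.60, 0.84
statistic = (n - 1) * s**2 / sigma0**2            # (n-1)S^2 / sigma^2 = 37.24

critical = stats.chi2.ppf(0.99, df=n - 1)         # chi-square value cutting off 0.01 on the right
p_value = stats.chi2.sf(statistic, df=n - 1)      # Pr(chi-square >= statistic)

print(f"statistic {statistic:.2f}, critical value {critical:.3f}, p-value {p_value:.4f}")
print("out of control" if p_value <= 0.01 else "in control")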

THE t – DISTRIBUTION
In Theorem 4 we showed that for random samples from a normal population with the mean μ and the variance σ², the random variable X̄ has a normal distribution with the mean μ and the variance σ²/n; in other words,
(X̄ − μ)/(σ/√n) ~ N(0, 1).
This is an important result, but the major difficulty in applying it is that in most realistic applications the population standard deviation σ is unknown. This makes it necessary to replace σ with an estimate, usually with the value of the sample standard deviation S. Thus, the theory which follows leads to the exact distribution of (X̄ − μ)/(S/√n) for random samples from normal populations.
Theorem 10
If Y and Z are independent random variables, Y has a chi-square distribution with r degrees of freedom, and Z has the standard normal distribution, then the distribution of
T = Z/√(Y/r)
is given by
f(t) = Γ((r + 1)/2)/(√(πr) Γ(r/2)) · (1 + t²/r)^{−(r+1)/2} for −∞ < t < ∞,
and it is called the t-distribution with r degrees of freedom.
Proof
Since Y and Z are independent, their joint probability density is given by
f(y, z) = (1/√(2π)) e^{−z²/2} · 1/(2^{r/2} Γ(r/2)) y^{r/2 − 1} e^{−y/2} for y > 0 and −∞ < z < ∞;  f(y, z) = 0 elsewhere,
= 1/(√(2π) Γ(r/2) 2^{r/2}) y^{r/2 − 1} e^{−(y + z²)/2}.
Then, to use the change-of-variable technique, we solve t = z/√(y/r) for z, getting z = t√(y/r) and, hence, dz/dt = √(y/r). Thus the joint density of Y and T is given by
g(y, t) = 1/(√(2π) Γ(r/2) 2^{r/2}) y^{r/2 − 1} e^{−(1/2)[y + t²y/r]} · (y/r)^{1/2}
= 1/(√(2πr) Γ(r/2) 2^{r/2}) y^{(r − 1)/2} e^{−(y/2)(1 + t²/r)} for y > 0 and −∞ < t < ∞;  g(y, t) = 0 elsewhere.
And, integrating out y, we get
f(t) = ∫_{0}^{∞} g(y, t) dy = ∫_{0}^{∞} 1/(√(2πr) Γ(r/2) 2^{r/2}) y^{(r − 1)/2} e^{−(y/2)(1 + t²/r)} dy.
Let w = (y/2)(1 + t²/r), so that y = 2w/(1 + t²/r) and dy = 2/(1 + t²/r) dw. Therefore
f(t) = 1/(√(2πr) Γ(r/2) 2^{r/2}) ∫_{0}^{∞} [2w/(1 + t²/r)]^{(r − 1)/2} e^{−w} · 2/(1 + t²/r) dw
= 2^{(r + 1)/2} / [√(2πr) Γ(r/2) 2^{r/2} (1 + t²/r)^{(r + 1)/2}] ∫_{0}^{∞} w^{(r + 1)/2 − 1} e^{−w} dw
= 1/[√(πr) Γ(r/2) (1 + t²/r)^{(r + 1)/2}] · Γ((r + 1)/2).
∴ f(t) = Γ((r + 1)/2)/(√(πr) Γ(r/2)) (1 + t²/r)^{−(r + 1)/2} for −∞ < t < ∞.

The t-distribution was introduced originally by W.S. Gosset, who published his scientific writings under the pen-name "Student", since the company for which he worked, a brewery, did not permit publication by employees.
Thus, the t-distribution is also known as the Student t-distribution, or Student's t-distribution.
In view of its importance, the t-distribution has been tabulated extensively.
The t-tables, for example, contain values t_α(ν) for α = 0.10, 0.05, 0.025, 0.01, 0.005 and r = ν = 1, 2, ⋯, 29, where t_α(ν) is such that the area to its right under the curve of the t-distribution with ν degrees of freedom is equal to α. That is, t_α(ν) is such that if T is a random variable having a t-distribution with ν degrees of freedom, then
Pr(T ≥ t_α(ν)) = α.

Fig 2: t-distribution

The table does not contain values of t_α(ν) for α = 0.50, since the density is symmetrical about t = 0 and hence t_{1−α}(ν) = −t_α(ν).
When ν is 30 or more, probabilities related to the t-distribution are usually approximated with the use of normal distributions.
Among the many applications of the t-distribution, its major application (for which it was originally developed) is based on the following theorem.

Theorem 11
If X̄ and S² are the mean and the variance of a random sample of size n from a normal population with the mean μ and the variance σ², then
T = (X̄ − μ)/(S/√n)
has the Student's t-distribution with (n − 1) degrees of freedom.
Proof
By Theorems 9 and 4, the random variables
Y = (n − 1)S²/σ² and Z = (X̄ − μ)/(σ/√n)
have, respectively, a chi-square distribution with (n − 1) degrees of freedom and the standard normal distribution.
Since they are also independent by part (a) of Theorem 9, substitution into the formula for T of Theorem 10 yields
T = Z/√(Y/(n − 1)) = [(X̄ − μ)/(σ/√n)]/√(S²/σ²) = (X̄ − μ)/(S/√n),
and this completes the proof.

Example 3
In 16 one-hour test runs, the gasoline consumption of an engine averaged 16.4 gallons with a standard deviation of 2.1 gallons.
Test the claim that the average gasoline consumption of this engine is 12.0 gallons per hour.

Solution
Substituting n = 16, μ = 12.0, X̄ = 16.4 and S = 2.1 into the formula for T in Theorem 11, we get
t = (X̄ − μ)/(S/√n) = (16.4 − 12.0)/(2.1/√16) = 8.38.
From the t-tables, t_{0.005}(15) = 2.947.
Since the computed value t = 8.38 > t_{0.005}(15) = 2.947, we reject the claim.
Thus, it would seem reasonable to conclude that the true average hourly gasoline consumption of the engine exceeds 12.0 gallons.
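The same t statistic and its tail probability can be obtained from scipy; the sketch below (not part of the original notes) reproduces the computation in Example 3.

# Sketch: one-sample t statistic for Example 3 and the corresponding tail probability.
from math import sqrt
from scipy import stats

n, xbar, s, mu0 = 16, 16.4, 2.1, 12.0
t = (xbar - mu0) / (s / sqrt(n))                  # about 8.38

critical = stats.t.ppf(0.995, df=n - 1)           # t_{0.005}(15) = 2.947
p_value = stats.t.sf(t, df=n - 1)                 # one-sided tail probability

print(f"t = {t:.2f}, critical value = {critical:.3f}, one-sided p-value = {p_value:.2e}")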

THE F-DISTRIBUTION
Another distribution which plays an important role in connection with sampling from normal populations is the F-distribution, named after Sir Ronald A. Fisher, one of the most prominent statisticians of this century.
Originally, it was studied as the sampling distribution of the ratio of two independent random variables with chi-square distributions, each divided by its respective degrees of freedom.
Theorem 12
If U and V are independent random variables having chi-square distributions with n₁ and n₂ degrees of freedom, then
F = (U/n₁)/(V/n₂)
is a random variable having an F-distribution, namely, a random variable whose probability density is given by
g(f) = [Γ((n₁ + n₂)/2)/(Γ(n₁/2) Γ(n₂/2))] (n₁/n₂)^{n₁/2} f^{n₁/2 − 1} [1 + (n₁/n₂)f]^{−(n₁ + n₂)/2} for f > 0;  g(f) = 0 elsewhere.
Proof
The joint density of U and V is given by
f(u, v) = 1/(2^{n₁/2} Γ(n₁/2)) u^{n₁/2 − 1} e^{−u/2} · 1/(2^{n₂/2} Γ(n₂/2)) v^{n₂/2 − 1} e^{−v/2}
= 1/(2^{(n₁+n₂)/2} Γ(n₁/2) Γ(n₂/2)) u^{n₁/2 − 1} v^{n₂/2 − 1} e^{−(u + v)/2}, u > 0, v > 0.
Then, to use the change-of-variable technique, we solve
f = (u/n₁)/(v/n₂)
for u, getting u = (n₁/n₂)vf and hence ∂u/∂f = (n₁/n₂)v.
Thus the joint density of F and V is given by
g(f, v) = 1/(2^{(n₁+n₂)/2} Γ(n₁/2) Γ(n₂/2)) [(n₁/n₂)vf]^{n₁/2 − 1} v^{n₂/2 − 1} e^{−(v/2)[(n₁/n₂)f + 1]} · (n₁/n₂)v
= (n₁/n₂)^{n₁/2} f^{n₁/2 − 1} v^{(n₁+n₂)/2 − 1} e^{−(v/2)[(n₁/n₂)f + 1]} / [2^{(n₁+n₂)/2} Γ(n₁/2) Γ(n₂/2)] for f > 0 and v > 0.
Now, integrating out v, we have
g(f) = ∫_{0}^{∞} g(f, v) dv
= (n₁/n₂)^{n₁/2} f^{n₁/2 − 1} / [2^{(n₁+n₂)/2} Γ(n₁/2) Γ(n₂/2)] ∫_{0}^{∞} v^{(n₁+n₂)/2 − 1} e^{−(v/2)[(n₁/n₂)f + 1]} dv.
Let w = (v/2)[(n₁/n₂)f + 1], so that v = 2w/[(n₁/n₂)f + 1] and dv = 2/[(n₁/n₂)f + 1] dw. Then
g(f) = (n₁/n₂)^{n₁/2} f^{n₁/2 − 1} / [2^{(n₁+n₂)/2} Γ(n₁/2) Γ(n₂/2)] · 2^{(n₁+n₂)/2} / [(n₁/n₂)f + 1]^{(n₁+n₂)/2} ∫_{0}^{∞} w^{(n₁+n₂)/2 − 1} e^{−w} dw
= [Γ((n₁ + n₂)/2)/(Γ(n₁/2) Γ(n₂/2))] (n₁/n₂)^{n₁/2} f^{n₁/2 − 1} [1 + (n₁/n₂)f]^{−(n₁+n₂)/2}.
That is,
g(f) = [Γ((n₁ + n₂)/2)/(Γ(n₁/2) Γ(n₂/2))] (n₁/n₂)^{n₁/2} f^{n₁/2 − 1} [1 + (n₁/n₂)f]^{−(n₁+n₂)/2} for f > 0;  g(f) = 0 elsewhere.

In view of its importance, the F- distribution has been tabulated extensively.

Table VI, for example, contains values F_α(ν₁, ν₂) such that the area to the right of F_α(ν₁, ν₂) under the curve of the F-distribution with ν₁ and ν₂ degrees of freedom is equal to α (see Fig. 3);
i.e. F_α(ν₁, ν₂) is such that Pr[F ≥ F_α(ν₁, ν₂)] = α.

Fig 3: F-distribution

Applications of the F-distribution arise in problems in which we are interested in comparing the variances σ₁² and σ₂² of two normal populations; for instance, in problems in which we want to estimate the ratio σ₁²/σ₂² or perhaps test whether σ₁² = σ₂².
We base such inferences on independent random samples of sizes n₁ and n₂ from the two populations and Theorem 9, according to which
χ₁² = (n₁ − 1)S₁²/σ₁² and χ₂² = (n₂ − 1)S₂²/σ₂²
are values of random variables having chi-square distributions with (n₁ − 1) and (n₂ − 1) degrees of freedom.
By "independent random samples" we mean that the n₁ + n₂ random variables constituting the two random samples are all independent, so that the two chi-square random variables are independent, and the substitution of their values for U and V in Theorem 12 yields the following result.

Theorem 13
If S₁² and S₂² are the variances of independent random samples of sizes n₁ and n₂ from normal populations with the variances σ₁² and σ₂², then
F = (S₁²/σ₁²)/(S₂²/σ₂²) = σ₂²S₁²/(σ₁²S₂²)
is a random variable having an F-distribution with (n₁ − 1) and (n₂ − 1) degrees of freedom.
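Under the hypothesis σ₁² = σ₂² the statistic in Theorem 13 reduces to S₁²/S₂², which can be compared with quantiles of the F-distribution. The sketch below is illustrative only and is not from the notes; the data are simulated under assumed parameters.

# Sketch: comparing two sample variances with the F-distribution (Theorem 13).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n1, n2 = 12, 15
x = rng.normal(loc=0, scale=2.0, size=n1)     # assumed sigma_1 = 2
y = rng.normal(loc=0, scale=2.0, size=n2)     # assumed sigma_2 = 2 (equal variances)

f_ratio = x.var(ddof=1) / y.var(ddof=1)       # S1^2 / S2^2
upper = stats.f.ppf(0.975, dfn=n1 - 1, dfd=n2 - 1)
lower = stats.f.ppf(0.025, dfn=n1 - 1, dfd=n2 - 1)

print(f"F ratio {f_ratio:.3f}; central 95% region ({lower:.3f}, {upper:.3f})")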

EXERCISE
1. Use Theorem 9 to show that for random samples of size n from a normal population with the variance σ², the sampling distribution of S² has the mean σ² and the variance 2σ⁴/(n − 1).
2. Show that if X₁, X₂, ⋯, Xₙ are independent random variables having the χ²(1) distribution and Yₙ = X₁ + X₂ + ⋯ + Xₙ, the limiting distribution of
Z = (Yₙ/n − 1)/√(2/n)
as n → ∞ is the standard normal distribution.
3. Based on the result of Exercise 2 above, show that if X is a random variable having a chi-square distribution with n degrees of freedom and n is large, the distribution of (X − n)/√(2n) can be approximated with the standard normal distribution.
4. Use the method of Exercise 3 above to find the approximate value of the probability that a random variable having a chi-square distribution with ν = 50 will take on a value greater than 68.0.
5. Show that for n > 2 the variance of the t-distribution with n degrees of freedom is n/(n − 2).
[Hint: make the substitution 1 + t²/n = 1/u]
6. Verify that if T has a t-distribution with n degrees of freedom, then X = T² has an F-distribution with n₁ = 1 and n₂ = n degrees of freedom.
7. Verify that if X has an F-distribution with n₁ and n₂ degrees of freedom and n₂ → ∞, the distribution of Y = n₁X approaches the chi-square distribution with n₁ degrees of freedom.
8. If X has an F-distribution with n₁ and n₂ degrees of freedom, show that Y = 1/X has an F-distribution with n₂ and n₁ degrees of freedom.
9. Verify that if Y has a beta distribution with α = n₁/2 and β = n₂/2, then
X = n₂Y/(n₁(1 − Y))
has an F-distribution with n₁ and n₂ degrees of freedom.
10. Let X be a random variable whose probability density function is the F-distribution with m and n degrees of freedom, i.e., F(m, n). Prove that
E(X) = n/(n − 2) for n > 2 and Var(X) = 2n²(m + n − 2)/[m(n − 2)²(n − 4)] for n > 4.
11. A random sample of size n = 12 from a normal population has the mean X̄ = 27.8 and the variance S² = 3.24. If we base our decision on the statistic of Theorem 11, can we say that the given information supports the conjecture that the mean of the population is μ = 28.5?
[t = −1.347; since |t| = 1.347 is fairly small (compared with t_{0.10}(11) = 1.363), the data tend to support the claim.]

ORDER STATISTICS
Consider a random sample of size n from an infinite population with a continuous density, and suppose we arrange the values of X₁, X₂, ⋯, and Xₙ according to size. If we look upon the smallest of the x's as a value of the random variable Y₁, the next largest as a value of the random variable Y₂, the next largest after that as a value of the random variable Y₃, ⋯, and the largest as a value of the random variable Yₙ, we refer to these Y's as order statistics. In particular, Y₁ is the first order statistic, Y₂ is the second order statistic, Y₃ is the third order statistic, and so on. (We are limiting this discussion to infinite populations with continuous densities so that there is zero probability that any two of the x's will be alike.)
To be more explicit, consider the case where n = 2 and the relationship between the values of the X's and the Y's is
y₁ = x₁ and y₂ = x₂ when x₁ < x₂;
y₁ = x₂ and y₂ = x₁ when x₂ < x₁.
Similarly, for n = 3 the relationship between the values of the respective random variables is
y₁ = x₁, y₂ = x₂, y₃ = x₃ when x₁ < x₂ < x₃;
y₁ = x₁, y₂ = x₃, y₃ = x₂ when x₁ < x₃ < x₂;
y₁ = x₃, y₂ = x₂, y₃ = x₁ when x₃ < x₂ < x₁.
Let us now derive a formula for the probability density of the rth order statistic for r=1,2 ,⋯,n .
Theorem 14
For random samples of size n from an infinite population that has the value f(x) at x, the probability density of the rth order statistic Y_r is given by
g_r(y_r) = n!/[(r − 1)!(n − r)!] [∫_{−∞}^{y_r} f(x) dx]^{r−1} f(y_r) [∫_{y_r}^{∞} f(x) dx]^{n−r} for −∞ < y_r < ∞.
Proof
Suppose that the real axis is divided into three intervals, one from −∞ to y_r, a second from y_r to y_r + h (where h is a positive constant), and the third from y_r + h to ∞. Since the population we are sampling has the value f(x) at x, the probability that r − 1 of the sample values fall into the first interval, one falls into the second interval, and n − r fall into the third interval is
n!/[(r − 1)! 1! (n − r)!] [∫_{−∞}^{y_r} f(x) dx]^{r−1} [∫_{y_r}^{y_r + h} f(x) dx] [∫_{y_r + h}^{∞} f(x) dx]^{n−r}
according to the formula for the multinomial distribution. Using the law of the mean from calculus, we have
∫_{y_r}^{y_r + h} f(x) dx = f(ξ)·h, where y_r < ξ ≤ y_r + h,
and if we let h → 0, we finally get
g_r(y_r) = n!/[(r − 1)!(n − r)!] [∫_{−∞}^{y_r} f(x) dx]^{r−1} f(y_r) [∫_{y_r}^{∞} f(x) dx]^{n−r} for −∞ < y_r < ∞
for the probability density of the rth order statistic.

In particular, the sampling distribution of Y₁, the smallest value in a random sample of size n, is given by
g₁(y₁) = n·f(y₁) [∫_{y₁}^{∞} f(x) dx]^{n−1} for −∞ < y₁ < ∞,
while the sampling distribution of Yₙ, the largest value in a random sample of size n, is given by
gₙ(yₙ) = n·f(yₙ) [∫_{−∞}^{yₙ} f(x) dx]^{n−1} for −∞ < yₙ < ∞.
Also, in a random sample of size n = 2m + 1 the sample median X̃ is Y_{m+1}, whose sampling distribution is given by
h(x̃) = (2m + 1)!/(m! m!) [∫_{−∞}^{x̃} f(x) dx]^m f(x̃) [∫_{x̃}^{∞} f(x) dx]^m for −∞ < x̃ < ∞.
[For random samples of size n = 2m, the median is defined as (1/2)(Y_m + Y_{m+1}).]
In some instances it is possible to perform the integrations required to obtain the densities of the various order statistics; for other populations there may be no choice but to approximate these integrals by using numerical methods.
Example 4
Show that for random samples of size n from an exponential population with the parameter θ, the sampling distributions of Y₁ and Yₙ are given by
g₁(y₁) = (n/θ) e^{−ny₁/θ} for y₁ > 0;  g₁(y₁) = 0 elsewhere,
and
gₙ(yₙ) = (n/θ) e^{−yₙ/θ} [1 − e^{−yₙ/θ}]^{n−1} for yₙ > 0;  gₙ(yₙ) = 0 elsewhere,
and that, for random samples of size n = 2m + 1 from this kind of population, the sampling distribution of the median is given by
h(x̃) = (2m + 1)!/(m! m! θ) e^{−x̃(m+1)/θ} [1 − e^{−x̃/θ}]^m for x̃ > 0;  h(x̃) = 0 elsewhere.
Solution
A random variable X has an exponential distribution if its probability density is given by
f(x) = (1/θ) e^{−x/θ} for x > 0;  f(x) = 0 elsewhere.
The sampling distribution of Y₁, the smallest value in a random sample of size n, is given by
g₁(y₁) = n·f(y₁) [∫_{y₁}^{∞} (1/θ) e^{−x/θ} dx]^{n−1}.
Now
∫_{y₁}^{∞} (1/θ) e^{−x/θ} dx = [−e^{−x/θ}]_{y₁}^{∞} = −[0 − e^{−y₁/θ}] = e^{−y₁/θ}.
Substituting this value into the above equation, we have
g₁(y₁) = n·(1/θ) e^{−y₁/θ} [e^{−y₁/θ}]^{n−1} = (n/θ) e^{−y₁/θ − y₁(n−1)/θ} = (n/θ) e^{−ny₁/θ}.
Thus we have shown that
g₁(y₁) = (n/θ) e^{−ny₁/θ} for y₁ > 0;  g₁(y₁) = 0 elsewhere.
Similarly, the sampling distribution of Yₙ, the largest value in a random sample of size n, is given by
gₙ(yₙ) = n·f(yₙ) [∫_{0}^{yₙ} f(x) dx]^{n−1} for 0 < yₙ < ∞.
Now
∫_{0}^{yₙ} (1/θ) e^{−x/θ} dx = [−e^{−x/θ}]_{0}^{yₙ} = −[e^{−yₙ/θ} − 1] = 1 − e^{−yₙ/θ}.
Substituting this value into the above equation, we have
gₙ(yₙ) = n·(1/θ) e^{−yₙ/θ} [1 − e^{−yₙ/θ}]^{n−1} = (n/θ) e^{−yₙ/θ} [1 − e^{−yₙ/θ}]^{n−1}.
Thus we have shown that
gₙ(yₙ) = (n/θ) e^{−yₙ/θ} [1 − e^{−yₙ/θ}]^{n−1} for yₙ > 0;  gₙ(yₙ) = 0 elsewhere.
Also, in a random sample of size n = 2m + 1 the sample median X̃ is Y_{m+1}, whose sampling distribution is given by
h(x̃) = (2m + 1)!/(m! m!) [∫_{0}^{x̃} f(x) dx]^m f(x̃) [∫_{x̃}^{∞} f(x) dx]^m for 0 < x̃ < ∞.
Here
∫_{0}^{x̃} f(x) dx = [−e^{−x/θ}]_{0}^{x̃} = 1 − e^{−x̃/θ} and ∫_{x̃}^{∞} f(x) dx = [−e^{−x/θ}]_{x̃}^{∞} = e^{−x̃/θ}.
Hence
h(x̃) = (2m + 1)!/(m! m!) [1 − e^{−x̃/θ}]^m (1/θ) e^{−x̃/θ} [e^{−x̃/θ}]^m
= (2m + 1)!/(m! m! θ) [1 − e^{−x̃/θ}]^m e^{−x̃(m+1)/θ}.
Thus we have shown that
h(x̃) = (2m + 1)!/(m! m! θ) e^{−x̃(m+1)/θ} [1 − e^{−x̃/θ}]^m for x̃ > 0;  h(x̃) = 0 elsewhere.
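The density g₁ obtained above is that of an exponential distribution with mean θ/n, which is easy to check by simulation. The sketch below is not part of the original notes; θ = 3 and n = 10 are assumed values chosen only for illustration.

# Sketch: the minimum of n exponential(theta) observations is exponential with mean theta/n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
theta, n, n_rep = 3.0, 10, 100_000      # assumed values

samples = rng.exponential(theta, size=(n_rep, n))
y1 = samples.min(axis=1)                 # simulated first order statistic

print(f"simulated mean of Y1 {y1.mean():.4f}   theory theta/n {theta/n:.4f}")
# Kolmogorov-Smirnov comparison with the exponential(theta/n) distribution implied by g1.
ks = stats.kstest(y1, stats.expon(scale=theta / n).cdf)
print(f"KS statistic {ks.statistic:.4f}")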

The following is an interesting result about the sampling distribution of the median, which holds when the population density is continuous and nonzero at the population median μ̃, which is such that
∫_{−∞}^{μ̃} f(x) dx = 1/2.
Theorem 15
For large n, the sampling distribution of the median for random samples of size 2n + 1 is approximately normal with the mean μ̃ and the variance 1/(8[f(μ̃)]²n).
Proof (Exercise)

Note that for random samples of size 2n + 1 from a normal population we have μ = μ̃, so f(μ̃) = f(μ) = 1/(σ√(2π)), and the variance of the median is approximately πσ²/(4n).
If we compare this with the variance of the mean, which for random samples of size 2n + 1 from an infinite population is σ²/(2n + 1), we find that for large samples from normal populations the mean is more reliable than the median; that is, the mean is subject to smaller chance fluctuations than the median.

EXERCISE
1. Find the sampling distributions of Y₁ and Yₙ for random samples of size n from a continuous uniform population with α = 0 and β = 1.
2. Find the sampling distribution of the median for random samples of size 2m + 1 from the population of Exercise 1 above.
[⇒ h(x̃) = (2m + 1)!/(m! m!) x̃^m (1 − x̃)^m for 0 < x̃ < 1;  h(x̃) = 0 elsewhere]
3. Find the mean and the variance of the sampling distribution of Y₁ for random samples of size n from the population of Exercise 1.
4. Find the sampling distributions of Y₁ and Yₙ for random samples of size n from a population having the beta distribution with α = 3 and β = 2.
5. Find the sampling distribution of the median for random samples of size 2m + 1 from the population of Exercise 4 above.
6. Duplicate the method used in the proof of Theorem 14 to show that the joint density of Y₁ and Yₙ is given by
g(y₁, yₙ) = n(n − 1) f(y₁) f(yₙ) [∫_{y₁}^{yₙ} f(x) dx]^{n−2} for −∞ < y₁ < yₙ < ∞;  g(y₁, yₙ) = 0 elsewhere.
(a) Use this result to find the joint density of Y₁ and Yₙ for random samples of size n from an exponential distribution.
(b) Use this result to find the joint density of Y₁ and Yₙ for the population of Exercise 1.
7. With reference to part (b) of Exercise 6, find the covariance of Y₁ and Yₙ.
[⇒ 1/((n + 1)²(n + 2))]
8. Use the formula for the joint density of Y₁ and Yₙ shown in Exercise 6 and the transformation technique discussed earlier to find an expression for the joint density of Y₁ and the sample range R = Yₙ − Y₁.
9. Use the result of Exercise 8 and that of part (a) of Exercise 6 to find the sampling distribution of R for random samples of size n from an exponential population.
10. Use the result of Exercise 8 to find the sampling distribution of R for random samples of size n from the continuous uniform population of Exercise 1.
11. Use the result of Exercise 10 to find the mean and the variance of the sampling distribution of R for random samples of size n from the continuous uniform population of Exercise 1.
[⇒ E(R) = (n − 1)/(n + 1);  σ² = 2(n − 1)/((n + 1)²(n + 2))]
CHARACTERISTIC FUNCTION OF A RANDOM VARIABLE
In a more advanced course, we would not work with the moment-generating function, because so many distributions do not have moment-generating functions.
Instead, we would let i denote the imaginary unit and t an arbitrary real number, and would define E(e^{itX}). This expectation exists for every distribution and it is called the characteristic function of the distribution.
Every distribution has a unique characteristic function, and to each characteristic function there corresponds a unique distribution of probability.
Definition
If X is a random variable, then the characteristic function of X is the function ψ: ℝ → ℂ, defined for t ∈ ℝ as the expectation of e^{itX} (this expectation exists for all t ∈ ℝ):
ψ_X(t) = E(e^{itX}) = ∑_x e^{itx} Pr(X = x) if X is discrete, and ψ_X(t) = ∫ e^{itx} f(x) dx if X is continuous.
PROPERTIES
1. Uniqueness: X and Y have the same characteristic function if and only if they have the same distribution function.
2. If E(Xⁿ) exists, then the nth derivative of ψ_X(t) exists and ψ_X^{(n)}(t) = iⁿ E(Xⁿ e^{itX}).
Moreover, ψ_X^{(k)}(0) = i^k E(X^k) for k ≤ n.
3. If E(Xⁿ) exists, then
ψ_X(t) = ∑_{j=0}^{n} (it)^j E(X^j)/j! + o(tⁿ),
where o(tⁿ) is a function such that lim_{t→0} o(tⁿ)/tⁿ = 0.
MEAN AND VARIANCE USING CHARACTERISTIC FUNCTION
ψ_X(t) = E(e^{itX})
ψ_X′(t) = d/dt [E(e^{itX})] = E[d/dt (e^{itX})] = i E(X e^{itX})
ψ_X″(t) = d/dt [ψ_X′(t)] = d/dt [i E(X e^{itX})] = i E[d/dt (X e^{itX})] = i² E(X² e^{itX})
At t = 0: ψ_X′(0) = i E(X e⁰) = i E(X).
∴ Mean = E(X) = ψ_X′(0)/i.
Var(X) = E(X²) − [E(X)]² = ψ_X″(0)/i² − [ψ_X′(0)/i]² = (ψ_X″(0) − [ψ_X′(0)]²)/i².
Students who are familiar with complex-valued functions may write ψ_X(t) = M_X(it).

Multivariate Characteristic Function
If X = (X₁, X₂, ⋯, X_k) is a random vector, then the characteristic function of X is the function ψ: ℝ^k → ℂ, defined for t ∈ ℝ^k as the expectation of e^{i t·X} (this expectation exists for all t ∈ ℝ^k):
ψ_{(X₁, X₂, ⋯, X_k)}(t₁, t₂, ⋯, t_k) = E(e^{i t·X})
= ∑_{x₁} ∑_{x₂} ⋯ ∑_{x_k} e^{i(t₁x₁ + t₂x₂ + ⋯ + t_k x_k)} Pr(X₁ = x₁, X₂ = x₂, ⋯, X_k = x_k) if X is discrete,
with the corresponding multiple integral if X is continuous.
Properties
1. The uniqueness property holds (as in the case k = 1).
2. If X₁, X₂, ⋯, X_k are independent, then
ψ_{(X₁, X₂, ⋯, X_k)}(t₁, t₂, ⋯, t_k) = ψ_{X₁}(t₁)·ψ_{X₂}(t₂)⋯ψ_{X_k}(t_k).

Example 1
If X is a random variable having a binomial distribution, i.e.,
f(x) = (n choose x) p^x q^{n−x}, x = 0, 1, 2, ⋯, n;  f(x) = 0 elsewhere,
a) find the characteristic function of X, and hence
b) find the mean and the variance of X.
Solution
(a) The characteristic function of X is defined as
ψ_X(t) = E(e^{itX}) = ∑_x e^{itx} Pr(X = x) = ∑_x (n choose x) p^x q^{n−x} e^{itx} = ∑_x (n choose x) (pe^{it})^x q^{n−x}
= (n choose 0)(pe^{it})⁰ qⁿ + (n choose 1)(pe^{it})¹ q^{n−1} + (n choose 2)(pe^{it})² q^{n−2} + ⋯ + (n choose n)(pe^{it})ⁿ q⁰.
From the Binomial Theorem,
(a + b)ⁿ = aⁿ + (n choose 1) a^{n−1} b + (n choose 2) a^{n−2} b² + ⋯ + bⁿ.
Hence, in our case, a = q and b = pe^{it}, so
qⁿ + (n choose 1)(pe^{it})¹ q^{n−1} + (n choose 2)(pe^{it})² q^{n−2} + ⋯ + (pe^{it})ⁿ = (q + pe^{it})ⁿ.
Hence the characteristic function of X is
ψ_X(t) = (q + pe^{it})ⁿ.
(b) ψ_X′(t) = d/dt [(q + pe^{it})ⁿ] = n(q + pe^{it})^{n−1}·pie^{it}.
At t = 0: ψ_X′(0) = n(q + p)^{n−1}·pi = npi, since p + q = 1.
∴ Mean = E(X) = ψ_X′(0)/i = npi/i = np.
ψ_X″(t) = d/dt [npie^{it}(q + pe^{it})^{n−1}].
Using the product rule of differentiation, ψ_X″(t) = V dU/dt + U dV/dt, where we let
U = npie^{it} ⇒ dU/dt = npi²e^{it},
V = (q + pe^{it})^{n−1} ⇒ dV/dt = (n − 1)(q + pe^{it})^{n−2}·pie^{it} = (n − 1)pie^{it}(q + pe^{it})^{n−2}.
Hence
ψ_X″(t) = (q + pe^{it})^{n−1}·npi²e^{it} + npie^{it}·(n − 1)pie^{it}(q + pe^{it})^{n−2}
= npi²e^{it}(q + pe^{it})^{n−2}[(q + pe^{it}) + (n − 1)pe^{it}].
At t = 0:
ψ_X″(0) = npi²(q + p)^{n−2}[(q + p) + (n − 1)p] = npi²[1 + (n − 1)p], since p + q = 1.
Therefore
Var(X) = (ψ_X″(0) − [ψ_X′(0)]²)/i² = (npi²[1 + (n − 1)p] − n²p²i²)/i²
= np[1 + (n − 1)p] − n²p² = np[1 + np − p − np] = np(1 − p) = npq.
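The closed form (q + pe^{it})ⁿ can be verified numerically; the sketch below is not part of the original notes, and n = 10, p = 0.35 and t = 0.7 are assumed values. It compares the closed form with the defining sum ∑ e^{itx} f(x) and prints the mean np and variance npq.

# Sketch: numerical check of the binomial characteristic function and its first two moments.
import numpy as np
from scipy import stats

n, p, t = 10, 0.35, 0.7          # assumed values
q = 1 - p
x = np.arange(n + 1)
pmf = stats.binom.pmf(x, n, p)

cf_sum = np.sum(np.exp(1j * t * x) * pmf)    # definition: E[e^{itX}]
cf_formula = (q + p * np.exp(1j * t)) ** n   # closed form from Example 1

print("E[e^{itX}] by summation :", complex(cf_sum))
print("(q + p e^{it})^n        :", complex(cf_formula))
print(f"mean np = {n*p},  variance npq = {n*p*q}")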
Example 2
If X is a random variable having an exponential distribution with parameter λ, i.e.,
f(x) = λe^{−λx} for x > 0;  f(x) = 0 elsewhere,
a) find the characteristic function of X, and hence find
b) the mean and variance of X.
Solution
(a) The characteristic function of X is defined as
ψ_X(t) = E(e^{itX}) = ∫ e^{itx} f(x) dx = ∫_{0}^{∞} e^{itx} λe^{−λx} dx = λ ∫_{0}^{∞} e^{−x(λ − it)} dx
= λ [e^{−x(λ − it)}/(−(λ − it))]_{0}^{∞} = −λ/(λ − it)·[0 − 1] = λ/(λ − it) = λ(λ − it)^{−1}.
Hence the characteristic function of X is ψ_X(t) = λ(λ − it)^{−1}.
(b) ψ_X′(t) = d/dt [λ(λ − it)^{−1}] = λ(−1)(λ − it)^{−2}·(−i) = λi(λ − it)^{−2}.
At t = 0: ψ_X′(0) = λi·λ^{−2} = iλ^{−1}.
∴ Mean = E(X) = ψ_X′(0)/i = λ^{−1} = 1/λ.
ψ_X″(t) = d/dt [λi(λ − it)^{−2}] = λi(−2)(λ − it)^{−3}·(−i) = 2λi²(λ − it)^{−3}.
At t = 0: ψ_X″(0) = 2λi²·λ^{−3} = 2i²λ^{−2}.
Var(X) = (ψ_X″(0) − [ψ_X′(0)]²)/i² = (2i²λ^{−2} − i²λ^{−2})/i² = λ^{−2}.
∴ Var(X) = λ^{−2} = 1/λ².

EXERCISE
1. If X is a random variable having a discrete uniform probability density function, i.e.
Pr(X = x) = f(x) = 1/N for x = 0, 1, 2, ⋯, N − 1;  f(x) = 0 elsewhere,
(a) Find the characteristic function of X, and hence; [⇒ ψ_X(t) = (1/N)·(1 − e^{itN})/(1 − e^{it})]
(b) Find the mean and variance of X. [⇒ E(X) = (N − 1)/2;  Var(X) = (N² − 1)/12]
2. If X is a random variable having a Poisson probability density function, i.e.
Pr(X = x) = f(x) = e^{−λ}λ^x/x! for x = 0, 1, 2, 3, ⋯;  f(x) = 0 elsewhere,
(a) Find the characteristic function of X, and hence; [⇒ ψ_X(t) = e^{−λ(1 − e^{it})}]
(b) Find the mean and variance of X. [⇒ E(X) = λ;  Var(X) = λ]
3. If X is uniform on the interval [a, b], then X has a probability density function given by
f(x) = 1/(b − a) for a ≤ x ≤ b;  f(x) = 0 elsewhere.
(a) Find the characteristic function of X, and hence; [⇒ ψ_X(t) = (e^{ibt} − e^{iat})/(it(b − a))]
(b) Find the mean and variance of X. [⇒ E(X) = (a + b)/2;  Var(X) = (b − a)²/12]
4. If X is a random variable having a chi-squared distribution with n degrees of freedom, its probability density function is given by
f(x) = 1/(2^{n/2}Γ(n/2)) x^{n/2 − 1} e^{−x/2} for x > 0;  f(x) = 0 elsewhere.
(a) Find the characteristic function of X, and hence; [⇒ ψ_X(t) = (1 − 2it)^{−n/2}]
(b) Find the mean and variance of X. [⇒ E(X) = n;  Var(X) = 2n]
5. If X is a random variable having a normal distribution with mean μ and variance σ², its probability density function is given by
f(x) = 1/√(2πσ²) e^{−(x − μ)²/(2σ²)} for −∞ < x < ∞.
(a) Find the characteristic function of X, and hence; [⇒ ψ_X(t) = e^{iμt − σ²t²/2}]
(b) Find the mean and variance of X. [⇒ E(X) = μ;  Var(X) = σ²]
