
A random variable $X$ has a probability density function if there is a function $f : \mathbb{R} \to \mathbb{R}$ so that

\[ P(a < X \le b) = \int_a^b f(x)\,dx \]

for any $a < b$. A random variable with a density is called continuous. We remind the reader that the distribution of a continuous random variable is determined by its density function.
The goal of this discussion is to determine the distribution of the random variable $Y = h(X)$, where $h$ is a differentiable function and $X$ is a continuous random variable with density function $f$.
We first consider the case where the following hold:
1. $h : D \to \mathbb{R}$ is differentiable on its domain $D$, which is an open interval of $\mathbb{R}$.
2. $h$ is one-to-one, which means that $h(x) = h(y)$ implies that $x = y$.
3. The support of $X$, defined as

\[ \mathrm{supp}(f) \overset{\text{def}}{=} \{x : f(x) > 0\} , \tag{1} \]

is contained in $D$.

For any set $S \subset D$, define $h(S) = \{y : y = h(x) \text{ for some } x \in S\}$.


Theorem 1. Suppose that the conditions above hold. Then $Y = h(X)$ is a continuous random variable with density function

\[ f_Y(y) = f_X(h^{-1}(y)) \left| \frac{d}{dy} h^{-1}(y) \right| \mathbf{1}\{y \in h(D)\} , \tag{2} \]

for all points $y$ so that $f_X$ is continuous at $h^{-1}(y)$.


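Proof. Since $h$ is continuous and one-to-one on the interval $D$, it is either strictly increasing or strictly decreasing on $D$. Assume that $h$ is increasing; the other case is similar. We begin by computing the distribution function of $Y$:

\begin{align*}
P(Y \le y) &= P(h(X) \le y) \\
&= P\big(X \le h^{-1}(y)\big) \\
&= \int_{-\infty}^{h^{-1}(y)} f_X(x)\,dx \\
&= \int_{-\infty}^{y} f_X(h^{-1}(z)) \, \frac{d}{dz} h^{-1}(z)\,dz && \text{by change of variables} \\
&= \int_{-\infty}^{y} f_X(h^{-1}(z)) \left| \frac{d}{dz} h^{-1}(z) \right| dz && \text{since $h$ is increasing.}
\end{align*}

Here the integrand is taken to be zero for $z \notin h(D)$.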

Now suppose that $y$ is such that $h^{-1}(y)$ is a point of continuity for $f_X$. Then by the Fundamental Theorem of Calculus, $F_Y$ is differentiable at $y$ with derivative

\[ f_Y(y) = \frac{d}{dy} F_Y(y) = f_X(h^{-1}(y)) \left| \frac{d}{dy} h^{-1}(y) \right| . \]
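As a quick check of formula (2), the following Python sketch builds $f_Y$ from $f_X$, $h^{-1}$ and the derivative of $h^{-1}$. The function name `transformed_density` and the choice $h(x) = e^x$ with $X$ exponential(1) are illustrative assumptions, not part of these notes. In that case $h^{-1}(y) = \log y$, $h(D) = (1, \infty)$, and (2) gives $f_Y(y) = 1/y^2$ for $y > 1$.

```python
import math

def transformed_density(f_x, h_inv, h_inv_prime, in_range):
    """Return f_Y for Y = h(X) via formula (2), with h one-to-one.

    f_x         -- density of X
    h_inv       -- inverse of h
    h_inv_prime -- derivative of the inverse of h
    in_range    -- indicator of h(D): True when y lies in h(D)
    """
    def f_y(y):
        if not in_range(y):
            return 0.0
        return f_x(h_inv(y)) * abs(h_inv_prime(y))
    return f_y

# Assumed example: X ~ exponential(1), h(x) = exp(x) on D = (0, inf).
f_x = lambda x: math.exp(-x) if x > 0 else 0.0
f_y = transformed_density(
    f_x,
    h_inv=math.log,
    h_inv_prime=lambda y: 1.0 / y,
    in_range=lambda y: y > 1.0,   # h(D) = (1, inf)
)

def midpoint_integral(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width

# P(Y <= 3) computed two ways: integrating f_Y, and directly as P(X <= log 3).
cdf_from_density = midpoint_integral(f_y, 1.0, 3.0)
cdf_direct = 1.0 - math.exp(-math.log(3.0))
print(cdf_from_density, cdf_direct)
```

The two printed numbers should agree up to the error of the midpoint rule, which is what the proof above predicts.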

Now we consider the case where $h$ is not one-to-one. We will assume the following: For each $y \in h(D)$, the set $h^{-1}(\{y\}) = \{x : h(x) = y\}$ is a finite set.
We recall the following theorem from calculus:
Theorem (Inverse Function Theorem). Let $h : D \to \mathbb{R}$ be a continuously differentiable function. Let $y = h(x)$ for some $x \in D$. Suppose that $h'(x) \ne 0$. Then there is an open interval $I$ containing $x$ and an open interval $J$ containing $y$, so that $h$ restricted to $I$ is one-to-one, and there is a differentiable inverse $h_x^{-1} : J \to I$.
Thus, for each $x \in h^{-1}(\{y\})$ with $h'(x) \ne 0$, there is a function $g_x$ defined in a neighborhood of $y$ so that $h(g_x(y')) = y'$ for all $y'$ in a neighborhood of $y$, and $g_x(h(x')) = x'$ for all $x'$ in a neighborhood of $x$.
Assume that for each $x \in h^{-1}(\{y\})$, we have $h'(x) \ne 0$. Now, since $h^{-1}(\{y\})$ is finite, say equal to $\{x_1, \ldots, x_r\}$, letting $g_i = g_{x_i}$ there is an interval $J$ containing $y$ on which each of the $g_i$ is defined. We can take $J$ small enough so that the sets $g_i(J)$ are disjoint intervals.
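We have for $a \le y \le b$ with $a < b$ and $a, b \in J$,

\begin{align*}
P(a \le Y \le b) &= P\Big( \bigcup_{x \in h^{-1}(\{y\})} \{X \in g_x([a,b])\} \Big) \\
&= \sum_{x \in h^{-1}(\{y\})} P(X \in g_x([a,b])) \\
&= \sum_{x \in h^{-1}(\{y\})} \int_{g_x([a,b])} f(u)\,du \\
&= \sum_{x \in h^{-1}(\{y\})} \int_a^b f(g_x(s)) \, |g_x'(s)|\,ds \\
&= \int_a^b \sum_{x \in h^{-1}(\{y\})} f(g_x(s)) \, |g_x'(s)|\,ds .
\end{align*}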

FIGURE 1. A many-to-one function (points $x_1$, $x_2$, $x_3$ marked).


Then taking $a = y$ and $b = y + \Delta y$, differentiating with respect to $\Delta y$, and evaluating at $\Delta y = 0$ yields

\[ f_Y(y) = \sum_{x \in h^{-1}(\{y\})} f(g_x(y)) \, |g_x'(y)| . \tag{3} \]

Figure 1 shows a many-to-one function. Note how a little neighborhood around $y$ maps to neighborhoods surrounding the three points in $h^{-1}(\{y\})$. For this $y$, the sum in (3) will have three terms.
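Formula (3) can be sketched in code as a sum over the inverse branches. The helper `many_to_one_density` and the test case below ($X$ standard normal, $h(x) = x^2$, with branches $g_1(y) = \sqrt{y}$ and $g_2(y) = -\sqrt{y}$) are assumptions chosen for illustration. The result should match the well-known chi-square density with one degree of freedom, $e^{-y/2}/\sqrt{2\pi y}$.

```python
import math

def many_to_one_density(f_x, branches):
    """Return f_Y via formula (3).

    branches(y) must return a list of (g(y), g'(y)) pairs, one for each
    point in the preimage of y; an empty list means y is not in h(D).
    """
    def f_y(y):
        return sum(f_x(x) * abs(dx) for x, dx in branches(y))
    return f_y

def std_normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def square_branches(y):
    """Inverse branches of the assumed example h(x) = x**2."""
    if y <= 0:
        return []
    root = math.sqrt(y)
    slope = 1.0 / (2.0 * root)            # derivative of sqrt(y)
    return [(root, slope), (-root, -slope)]

f_y = many_to_one_density(std_normal_pdf, square_branches)

def chi_square_1_pdf(y):
    """Chi-square density with one degree of freedom, for y > 0."""
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)

for y in (0.1, 1.0, 2.5):
    print(y, f_y(y), chi_square_1_pdf(y))
```

Passing the branches as a function of $y$ keeps the sketch general: the number of terms in the sum may change with $y$, as it does in the examples that follow.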
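Let us consider an example. Suppose that $X$ has an exponential(1) distribution, and let

\[ h(x) = \begin{cases} 3x & \text{if } 0 < x \le \frac{1}{3} , \\ 1 - 5\left(x - \frac{1}{3}\right) & \text{if } \frac{1}{3} < x < \frac{8}{15} , \\ 2\left(x - \frac{8}{15}\right) & \text{if } x \ge \frac{8}{15} . \end{cases} \]

The reader should graph this function. Let $0 < y < 1$. Then there are three $x$ so that $h(x) = y$. Namely,

\begin{align*}
x &= \tfrac{1}{3} y , \\
x &= -\tfrac{1}{5} y + \tfrac{8}{15} , \\
x &= \tfrac{1}{2} y + \tfrac{8}{15} .
\end{align*}

The three functions on the right of the above equations are then $g_1(y)$, $g_2(y)$ and $g_3(y)$. Thus we have

\[ f_Y(y) = \frac{e^{-y/3}}{3} + \frac{e^{y/5 - 8/15}}{5} + \frac{e^{-y/2 - 8/15}}{2} . \]

Here is another example. Suppose that $X$ has the density

\[ f(x) = \frac{x}{2\pi^2} \mathbf{1}\{0 < x < 2\pi\} , \]

and consider the random variable $Y = \sin X$. We will now find the density of $Y$.
First take $0 < y < 1$. Then

\[ \sin^{-1}(\{y\}) \cap (0, 2\pi) = \{\arcsin(y), \ \pi - \arcsin(y)\} . \]

This follows since, by convention, $\arcsin(y)$ is defined to take values in $[-\frac{\pi}{2}, \frac{\pi}{2}]$ for $y \in [-1, 1]$, and $\sin(\pi - x) = \sin x$.
Thus using the notation as above, we have

\[ g_1(y) = \arcsin(y) , \qquad g_2(y) = \pi - \arcsin(y) . \]

Since $|g_1'(y)| = |g_2'(y)| = 1/\sqrt{1 - y^2}$, we have

\[ f_Y(y) = \frac{\arcsin(y)}{2\pi^2} \cdot \frac{1}{\sqrt{1 - y^2}} + \frac{\pi - \arcsin(y)}{2\pi^2} \cdot \frac{1}{\sqrt{1 - y^2}} = \frac{1}{2\pi \sqrt{1 - y^2}} . \]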
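Both examples can be checked numerically. The sketch below is an illustration under stated assumptions: for the piecewise-linear $h$ it takes the inverse branches to be $y/3$, $8/15 - y/5$ and $y/2 + 8/15$ (obtained by solving $h(x) = y$ on each piece), and for $Y = \sin X$ with $0 < y < 1$ it takes them to be $\arcsin(y)$ and $\pi - \arcsin(y)$. In each case the integral of the density from formula (3) is compared with the probability computed directly from the distribution function of $X$. All function names are hypothetical.

```python
import math

def midpoint_integral(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width

# --- Piecewise-linear example, X ~ exponential(1) ---

def exp_pdf(x):
    return math.exp(-x) if x > 0 else 0.0

def f_y_piecewise(y):
    """Density of Y = h(X) from formula (3)."""
    if y <= 0:
        return 0.0
    total = exp_pdf(y / 2 + 8 / 15) / 2        # third branch, any y > 0
    if y < 1:                                  # first two branches need y < 1
        total += exp_pdf(y / 3) / 3
        total += exp_pdf(8 / 15 - y / 5) / 5
    return total

def cdf_piecewise(t):
    """P(h(X) <= t) for 0 < t < 1: X <= t/3 or 8/15 - t/5 <= X <= 8/15 + t/2."""
    F = lambda x: 1.0 - math.exp(-x)
    return F(t / 3) + F(8 / 15 + t / 2) - F(8 / 15 - t / 5)

# --- Sine example, X with density x / (2 pi^2) on (0, 2 pi) ---

def f_y_sine(y):
    """Density of Y = sin X for 0 < y < 1 from formula (3)."""
    f = lambda x: x / (2 * math.pi ** 2) if 0 < x < 2 * math.pi else 0.0
    a = math.asin(y)
    return (f(a) + f(math.pi - a)) / math.sqrt(1.0 - y * y)

def prob_sine(t):
    """P(0 < sin X <= t) for 0 < t < 1, using F_X(x) = x^2 / (4 pi^2)."""
    F = lambda x: x * x / (4 * math.pi ** 2)
    a = math.asin(t)
    return F(a) + (F(math.pi) - F(math.pi - a))

print(midpoint_integral(f_y_piecewise, 0.0, 0.7), cdf_piecewise(0.7))
print(midpoint_integral(f_y_sine, 0.0, 0.5), prob_sine(0.5))
```

In each printed pair the two numbers should agree closely. A mismatch would point to a wrong inverse branch or a missing absolute value on the derivative.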

Let us review some facts from multivariate calculus. Let $h : D \to R$ be a one-to-one function, where $D \subset \mathbb{R}^n$ and $R \subset \mathbb{R}^n$. We write

\[ h(x_1, \ldots, x_n) = (h_1(x_1, \ldots, x_n), \ldots, h_n(x_1, \ldots, x_n)) . \]
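The total derivative of $h$ at $x = (x_1, \ldots, x_n)$ is defined as the matrix

\[ Dh(x) = \begin{pmatrix}
\frac{\partial h_1}{\partial x_1}(x) & \frac{\partial h_1}{\partial x_2}(x) & \cdots & \frac{\partial h_1}{\partial x_n}(x) \\
\frac{\partial h_2}{\partial x_1}(x) & \frac{\partial h_2}{\partial x_2}(x) & \cdots & \frac{\partial h_2}{\partial x_n}(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial h_n}{\partial x_1}(x) & \frac{\partial h_n}{\partial x_2}(x) & \cdots & \frac{\partial h_n}{\partial x_n}(x)
\end{pmatrix} . \]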

The Jacobian of $h$ at $x$, which we will denote by $J_h(x)$, is defined as

\[ J_h(x) \overset{\text{def}}{=} \det Dh(x) . \tag{4} \]

Now the change of variables formula says the following: Let $h : D \to R$ be a function which has continuous first partial derivatives. Then

\[ \int_E f(y)\,dy = \int_{h^{-1}(E)} f(h(x)) \, |J_h(x)|\,dx . \]

Now we can state the formula for finding the density of $Y = h(X)$, where $X$ is a random vector in $\mathbb{R}^n$, and $h$ is a one-to-one function defined on $D \subset \mathbb{R}^n$, where the support of $f_X$ is contained in $D$.

Theorem 2. Let $h$ be as above, and let $X$ be a continuous random vector in $\mathbb{R}^n$ with density $f_X$. Then the density of $Y$ is given by

\[ f_Y(y) = f_X(h^{-1}(y)) \, |J_{h^{-1}}(y)| . \]
A fact which is often very useful is that

\[ |J_{h^{-1}}(y)| = \frac{1}{|J_h(h^{-1}(y))|} . \tag{5} \]
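Identity (5) can be checked numerically with finite differences. The map $h(x_1, x_2) = (e^{x_1}, x_1 + x_2)$, its inverse, and the helper `jacobian_det` below are assumptions made up for this sketch. For this map $J_h(x) = e^{x_1}$ and $J_{h^{-1}}(y) = 1/y_1$.

```python
import math

def jacobian_det(func, point, eps=1e-6):
    """Determinant of the 2x2 total derivative of func at point,
    approximated by central finite differences."""
    x1, x2 = point
    # Column 1: partial derivatives with respect to the first coordinate.
    plus, minus = func(x1 + eps, x2), func(x1 - eps, x2)
    d11 = (plus[0] - minus[0]) / (2 * eps)
    d21 = (plus[1] - minus[1]) / (2 * eps)
    # Column 2: partial derivatives with respect to the second coordinate.
    plus, minus = func(x1, x2 + eps), func(x1, x2 - eps)
    d12 = (plus[0] - minus[0]) / (2 * eps)
    d22 = (plus[1] - minus[1]) / (2 * eps)
    return d11 * d22 - d12 * d21

def h(x1, x2):
    """Assumed one-to-one map on R^2."""
    return (math.exp(x1), x1 + x2)

def h_inv(y1, y2):
    """Inverse of h, defined for y1 > 0."""
    return (math.log(y1), y2 - math.log(y1))

x = (0.3, -1.2)
y = h(*x)

lhs = abs(jacobian_det(h_inv, y))              # |J_{h^{-1}}(y)|
rhs = 1.0 / abs(jacobian_det(h, h_inv(*y)))    # 1 / |J_h(h^{-1}(y))|
print(lhs, rhs, math.exp(-0.3))
```

All three printed values should agree up to finite-difference error. In practice (5) is useful because one of the two Jacobians is often much easier to compute by hand than the other.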

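Let $X, Y$ be independent standard Normal random variables. Let

\[ (D, \Theta) = h(X, Y) = (X^2 + Y^2, \ \arctan(Y/X)) , \]

where $\arctan(Y/X)$ is understood as the polar angle of the point $(X, Y)$, taking values in $[0, 2\pi)$. Then

\[ h^{-1}(d, \theta) = (\sqrt{d} \cos\theta, \ \sqrt{d} \sin\theta) . \]

We have

\[ Dh^{-1}(d, \theta) = \begin{pmatrix} \frac{1}{2\sqrt{d}} \cos\theta & -\sqrt{d} \sin\theta \\ \frac{1}{2\sqrt{d}} \sin\theta & \sqrt{d} \cos\theta \end{pmatrix} . \]

Thus

\[ |J_{h^{-1}}(d, \theta)| = \frac{1}{2} \cos^2\theta + \frac{1}{2} \sin^2\theta = \frac{1}{2} . \]

Then we have for $d > 0$ and $\theta \in [0, 2\pi)$:

\[ f_{D,\Theta}(d, \theta) = \frac{1}{2\pi} e^{-\frac{1}{2}(d \cos^2\theta + d \sin^2\theta)} \cdot \frac{1}{2} = \frac{1}{2} e^{-\frac{1}{2} d} \cdot \frac{1}{2\pi} . \]

This shows that $D$ and $\Theta$ are independent, and $D$ is exponential(1/2), and $\Theta$ is Uniform$[0, 2\pi)$. [Why?]
Note we could run this in reverse: Suppose we start with $D$ an exponential(1/2) random variable, and $\Theta$ an independent Uniform$[0, 2\pi)$ random variable. Then let

\[ g(d, \theta) = (\sqrt{d} \cos\theta, \ \sqrt{d} \sin\theta) . \]

Then $g^{-1}(x, y) = h(x, y)$, where $h$ is defined as above. Now finding the Jacobian of $g^{-1}$ itself can be done, but it is perhaps easier to use (5):

\[ J_{g^{-1}}(x, y) = J_h(x, y) = \frac{1}{J_{h^{-1}}(h(x, y))} = \frac{1}{1/2} = 2 . \]

Thus,

\[ f_{X,Y}(x, y) = \frac{1}{2} e^{-(x^2 + y^2)/2} \cdot \frac{1}{2\pi} \cdot 2 = \frac{1}{2\pi} e^{-(x^2 + y^2)/2} . \]

Thus $g(D, \Theta)$ gives a pair of independent standard normal random variables. [Why?]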

This gives a method of simulating a pair of Normal random variables. It is relatively easy to simulate a uniform and an exponential random variable. Then applying the function $g$ to them gives a pair of Normal random variables.
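A minimal simulation sketch of this method, using only Python's standard `random` module. The function name `normal_pair` and the sample size are assumptions for illustration. Here `expovariate(0.5)` draws an exponential with rate 1/2 (mean 2), which is the sense in which exponential(1/2) is used in these notes.

```python
import math
import random

def normal_pair(rng):
    """Return g(D, Theta) for D ~ exponential(1/2), Theta ~ Uniform[0, 2*pi)."""
    d = rng.expovariate(0.5)               # rate 1/2, so mean 2
    theta = 2.0 * math.pi * rng.random()   # uniform on [0, 2*pi)
    r = math.sqrt(d)
    return r * math.cos(theta), r * math.sin(theta)

def mean(values):
    return sum(values) / len(values)

def covariance(us, vs):
    """Sample covariance (dividing by n) of two equal-length lists."""
    mu, mv = mean(us), mean(vs)
    return sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / len(us)

rng = random.Random(0)                     # fixed seed for reproducibility
pairs = [normal_pair(rng) for _ in range(50000)]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]

# For standard normals we expect means near 0, variances near 1,
# and covariance near 0.
print("means:     ", mean(xs), mean(ys))
print("variances: ", covariance(xs, xs), covariance(ys, ys))
print("covariance:", covariance(xs, ys))
```

The sample moments only give a rough sanity check. They do not by themselves show that the output is normal or that the two coordinates are independent; that is what the density computation above establishes.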
