
4. FUNCTIONS OF RANDOM VARIABLES


Given Y = g(X), or Y = g(X1, X2) = a1X1 + a2X2:
if the PDF of X is known, can the PDF of Y be obtained?

4.1 SINGLE VARIABLE CASE


Suppose Y = g(X), where g is monotonically increasing, e.g. Y = 5X², X ≥ 0.
If the relationship is deterministic and X is known exactly (e.g. X = 2),
then Y is also known exactly (e.g. Y = 20).

If X is random, then Y is random.


If P(1 < X < 2) = 0.1, then P(5 < Y < 20) = 0.1 since Y = 5X².
More generally, if P(x0 < X < x0+dx) = p, then P(y0 < Y < y0+dy) = p, where y0 = g(x0).

For a monotonically increasing function:

[Figure: Y = g(X) monotonically increasing; the probability of occurrence within interval dx, fX(x)dx, maps to the probability of occurrence within the corresponding interval dy, fY(y)dy]
PDF of X = fX(x), PDF of Y = fY(y).
Probability in variable X is mapped to variable Y. Hence,

$$f_Y(y)\,dy = f_X(x)\,dx \quad\Rightarrow\quad f_Y(y) = f_X(x)\frac{dx}{dy}, \quad x = g^{-1}(y) \text{ (inverse function)}$$

For a monotonically decreasing function, dx/dy is negative:

[Figure: Y = g(X) monotonically decreasing; probability fX(x)dx of occurrence within interval dx maps to probability fY(y)dy within interval dy]
Probability in variable X is again mapped to variable Y,
but probabilities must be positive, hence

$$|f_Y(y)\,dy| = |f_X(x)\,dx| \quad\Rightarrow\quad f_Y(y) = f_X(x)\left|\frac{dx}{dy}\right|, \quad x = g^{-1}(y)$$

This is the change of variable theorem for a monotonic function
(either always increasing or always decreasing).
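As a numerical check (not part of the original notes), the sketch below compares a histogram of simulated Y against fY(y) = fX(g⁻¹(y))|dx/dy| for the monotonic example Y = 5X², taking X ~ Exponential(1) purely for illustration:

```python
import numpy as np
from scipy import stats

# Monotonic transform Y = g(X) = 5*X**2 on X >= 0 (the example above), so
# x = g^{-1}(y) = sqrt(y/5) and |dx/dy| = 1/(2*sqrt(5*y)).
# X ~ Exponential(1) is an arbitrary illustrative choice of f_X.
x = stats.expon.rvs(size=200_000, random_state=0)
y = 5.0 * x**2

def f_Y(y):
    x_inv = np.sqrt(y / 5.0)               # x = g^{-1}(y)
    dxdy = 1.0 / (2.0 * np.sqrt(5.0 * y))  # |dx/dy|
    return stats.expon.pdf(x_inv) * dxdy   # f_Y(y) = f_X(g^{-1}(y)) |dx/dy|

# Unconditional histogram density of the simulated Y vs the formula.
counts, edges = np.histogram(y, bins=100, range=(1.0, 20.0))
density = counts / (y.size * np.diff(edges))
centers = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(density - f_Y(centers))))  # small (sampling noise)
```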

Example 4.1
Given Y = ln X. If fX(x) is LN(λ, ζ), find fY(y).

Y = ln X ⇒ x = e^y ⇒ dx/dy = e^y

Recall:

$$f_X(x) = \frac{1}{x\,\zeta\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln x - \lambda}{\zeta}\right)^2\right] \sim LN(\lambda, \zeta), \quad 0 \le x < \infty$$

$$f_Y(y) = f_X(x)\left|\frac{dx}{dy}\right| = \frac{1}{e^y\,\zeta\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{\ln(e^y) - \lambda}{\zeta}\right)^2\right]e^y = \frac{1}{\zeta\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{y - \lambda}{\zeta}\right)^2\right] \sim N(\lambda, \zeta)$$

(All x must be converted to y by putting x = e^y.)
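A minimal sketch of Example 4.1, assuming scipy's lognorm parameterization (shape s = ζ, scale = e^λ):

```python
import numpy as np
from scipy import stats

# Illustrative LN(lambda, zeta) parameters (not from the notes).
lam, zeta = 1.0, 0.4
# scipy's lognorm uses shape s = zeta and scale = exp(lambda).
x = stats.lognorm.rvs(s=zeta, scale=np.exp(lam), size=100_000, random_state=1)
y = np.log(x)  # Y = ln X

# Sample moments of Y should match N(lambda, zeta), as derived above.
print(y.mean(), y.std())  # ~1.0 and ~0.4
```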


For a discrete r.v.,
if Y = g(X) and P(X = x0) = pX(x0) = p,
then P(Y = g(x0)) = p.
Hence, given pX(x), the PMF of Y is
pY(y) = pX(x) = pX[g⁻¹(y)]

Example
X = no. of functional bulldozers, Y = X²
e.g. pY(4) = pX(2) = 0.384

  X    Y = X²    P(X = xi)
  3      9        0.512
  2      4        0.384
  1      1        0.096
  0      0        0.008
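A tiny sketch of this PMF mapping using the table above:

```python
# PMF of X (no. of functional bulldozers) from the table above.
p_X = {3: 0.512, 2: 0.384, 1: 0.096, 0: 0.008}

# Y = g(X) = X^2; collect probability mass at each y = g(x).
p_Y = {}
for x, p in p_X.items():
    y = x**2
    p_Y[y] = p_Y.get(y, 0.0) + p  # summing handles a many-to-one g

print(p_Y[4])  # 0.384 = p_X(2)
```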

4.2 MULTI-VALUED SINGLE VARIABLE FUNCTIONS


Given Y = g(X), the inverse X = g⁻¹(Y) can take multiple values,
e.g. Y = X² ⇒ X = ±√Y. How do we find fY(y)?

For a two-valued function, if Y = y, then the inverse is
X = x1 or X = x2. For the above, X = +√y or X = −√y.
Hence, P(Y = y) = P(X = x1 or X = x2) = P(X = x1 ∪ X = x2)
= P(X = x1) + P(X = x2)

Therefore, if Y = g(X) and X = g⁻¹(y) = x1, x2, x3, ..., xn, then

$$p_Y(y) = \sum_{i=1}^{n} p_X(x_i) \quad \text{for discrete r.v.}$$

$$f_Y(y) = \sum_{i=1}^{n} f_X(x_i)\left|\frac{dx_i}{dy}\right| \quad \text{for continuous r.v.}$$
Two-valued function (non-monotonic)

[Figure: two-valued Y = g(X); the interval dy maps back to two intervals dx1 and dx2, with probabilities fX(x1)dx1 and fX(x2)dx2]

Probability fY(y)dy is mapped to two regions,
fX(x1)dx1 and fX(x2)dx2.
Hence, |fY(y)dy| = |fX(x1)dx1| + |fX(x2)dx2|

$$f_Y(y) = f_X(x_1)\left|\frac{dx_1}{dy}\right| + f_X(x_2)\left|\frac{dx_2}{dy}\right|$$

Multi-valued function (non-monotonic)
(Don't worry, functions with 3 or more values are not tested.)

[Figure: multi-valued Y = g(X); the interval dy maps back to intervals dx1, dx2, dx3 with probabilities fX(x1)dx1, fX(x2)dx2, fX(x3)dx3]

Probability fY(y)dy is mapped to multiple regions. Hence,

$$f_Y(y) = \sum_{i=1}^{n} f_X(x_i)\left|\frac{dx_i}{dy}\right|$$
Example 4.2 - Y = X², X ~ N(0,1), find fY(y).

If Y = y, then X = x1 = +√y (first root)
or X = x2 = −√y (second root).

$$f_Y(y) = \sum_{i=1}^{n} f_X(x_i)\left|\frac{dx_i}{dy}\right| \quad \text{for multi-valued functions and continuous r.v.}$$

a) x1 = +√y ⇒ dx1/dy = 1/(2√y)

$$f_Y(y) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x_1^2}{2}\right)\cdot\frac{1}{2\sqrt{y}} = \frac{1}{2\sqrt{2\pi y}}\exp\left(-\frac{y}{2}\right)$$

b) x2 = −√y ⇒ dx2/dy = −1/(2√y)

$$f_Y(y) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x_2^2}{2}\right)\cdot\left|-\frac{1}{2\sqrt{y}}\right| = \frac{1}{2\sqrt{2\pi y}}\exp\left(-\frac{y}{2}\right)$$

c) Combining (a) and (b) gives

$$f_Y(y) = \frac{1}{2\sqrt{2\pi y}}\exp\left(-\frac{y}{2}\right) + \frac{1}{2\sqrt{2\pi y}}\exp\left(-\frac{y}{2}\right) = \frac{1}{\sqrt{2\pi y}}\exp\left(-\frac{y}{2}\right)$$
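A Monte Carlo sketch of Example 4.2; the derived fY(y) is the chi-square density with 1 degree of freedom, and a histogram of simulated X² should match it:

```python
import numpy as np
from scipy import stats

x = stats.norm.rvs(size=200_000, random_state=2)  # X ~ N(0, 1)
y = x**2                                          # Y = X^2

def f_Y(y):
    # Result of Example 4.2 (chi-square density with 1 degree of freedom).
    return np.exp(-y / 2.0) / np.sqrt(2.0 * np.pi * y)

counts, edges = np.histogram(y, bins=100, range=(0.5, 8.0))
density = counts / (y.size * np.diff(edges))      # unconditional density estimate
centers = 0.5 * (edges[:-1] + edges[1:])
print(np.max(np.abs(density - f_Y(centers))))     # small (sampling noise)
```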
Scaling a random variable

How does multiplying (or dividing) by a constant affect a random variable?

Consider Y = bX, where b is a constant.

Both the mean and standard deviation are multiplied by b, i.e.

μY = bμX, σY = |b|σX (note b can be negative!)

The distribution type remains unchanged, i.e.
if X ~ normal, then Y ~ normal;
if X ~ lognormal, then Y ~ lognormal.

The coefficient of variation (σ/μ) also remains unchanged.
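A short numerical sketch of these scaling rules (illustrative numbers, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(loc=10.0, scale=2.0, size=100_000)  # X ~ N(10, 2)
b = -3.0
y = b * x                                          # Y = bX

print(y.mean(), y.std())        # ~ -30 = b*mu_X and ~ 6 = |b|*sigma_X
print(abs(y.std() / y.mean()))  # ~ 0.2, same coefficient of variation as X
```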

Covariance of two random variables

Recall that for a single random variable, the variance is

$$\sigma_X^2 = E[(X - \mu_X)^2]$$

For two random variables X and Y, the covariance is

$$\text{cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$$

It is convenient to normalize the covariance as follows:

$$\rho_{XY} = \frac{\text{cov}(X, Y)}{\sigma_X \sigma_Y}$$

ρXY is the Pearson product moment correlation coefficient (no units),
or simply the correlation coefficient, with −1 ≤ ρ ≤ 1.
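A minimal numpy sketch of these definitions, estimating the correlation from simulated data (the construction of y is an arbitrary illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=50_000)
y = 0.8 * x + 0.6 * rng.normal(size=50_000)  # correlated with x by construction

cov = np.mean((x - x.mean()) * (y - y.mean()))  # sample covariance
rho = cov / (x.std() * y.std())                 # normalized: correlation
print(rho, np.corrcoef(x, y)[0, 1])             # both ~ 0.8
```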
Correlation coefficient
(r = sample correlation coefficient)

[Figure: scatter plots of negative correlation (points falling along a downward straight line), positive correlation (points along an upward straight line), and zero correlation (uncorrelated cloud)]

Correlation vs Dependence (background, not tested)

X and Y are independent if
P(XY) = P(X)P(Y) (discrete), or
fXY(x, y) = fX(x)fY(y) (joint probability density function, for continuous r.v.)

X and Y are uncorrelated if

$$E[(X - \mu_X)(Y - \mu_Y)] = 0$$

• Independent implies uncorrelated.
• However, uncorrelated variables may not necessarily be independent!
• Special case: for two jointly normal variables X and Y, uncorrelated implies independent (this does not apply to other distributions).
Correlation vs Dependence (background, not tested)

• Correlation is a measure of linear dependence.
• Uncorrelated implies no linear dependence, but there can be nonlinear dependence!

For example, take Y = X². X and Y are completely dependent
(Y is fully specified by X). However, ρXY = 0
(uncorrelated, i.e. no linear relationship):

$$\rho_{XY} = \frac{E[(X - \mu_X)(X - \mu_X)^2]}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)^3]}{\sigma_X \sigma_Y} = 0 \quad \text{(symmetric distribution)}$$

[Figure: parabola Y = X²; scatter of (X, Y) points with X symmetric about zero]

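The Y = X² point is easy to check numerically; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=100_000)    # symmetric about zero
y = x**2                        # completely dependent on x

print(np.corrcoef(x, y)[0, 1])  # ~ 0: uncorrelated despite full dependence
```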
4.3 FUNCTION OF MULTIPLE RANDOM VARIABLES


(more advanced, so we only consider special cases)

4.3.1 Sum (or Difference) of Normal Random Variables

Consider a case of two normal variates X1 and X2, where

$$X_1 \sim N(\mu_{X_1}, \sigma_{X_1}), \quad X_2 \sim N(\mu_{X_2}, \sigma_{X_2}) \quad \text{with correlation } \rho_{X_1 X_2}$$

What is the distribution of Y = a1X1 + a2X2?

The distribution of Y will be N(μY, σY)
(a sum of normal variables is also normal), where

$$\mu_Y = E(Y) = E(a_1 X_1 + a_2 X_2) = a_1\mu_{X_1} + a_2\mu_{X_2} = \sum_{i=1}^{2} a_i \mu_{X_i}$$

$$\sigma_Y^2 = E[(Y - \mu_Y)^2] = E[\{a_1(X_1 - \mu_{X_1}) + a_2(X_2 - \mu_{X_2})\}^2]$$
$$= E[a_1^2(X_1 - \mu_{X_1})^2 + 2a_1 a_2 (X_1 - \mu_{X_1})(X_2 - \mu_{X_2}) + a_2^2(X_2 - \mu_{X_2})^2]$$
$$= a_1^2\sigma_{X_1}^2 + 2a_1 a_2 E[(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})] + a_2^2\sigma_{X_2}^2$$
$$= a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + 2a_1 a_2 \rho_{X_1 X_2}\sigma_{X_1}\sigma_{X_2} = \sum_{i=1}^{2}\sum_{j=1}^{2} a_i a_j \rho_{X_i X_j}\sigma_{X_i}\sigma_{X_j}$$

using

$$E[(X_1 - \mu_{X_1})(X_2 - \mu_{X_2})] = \text{cov}(X_1, X_2), \quad \rho_{X_1 X_2} = \frac{\text{cov}(X_1, X_2)}{\sigma_{X_1}\sigma_{X_2}}$$

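A Monte Carlo sketch of these formulas for a linear combination of correlated normals (parameters are illustrative):

```python
import numpy as np

a1, a2 = 2.0, -1.0
mu = np.array([5.0, 3.0])    # means of X1, X2
sd = np.array([1.0, 2.0])    # standard deviations
rho = 0.5                    # correlation between X1 and X2

# Covariance matrix and closed-form moments of Y = a1*X1 + a2*X2.
cov = np.array([[sd[0]**2, rho * sd[0] * sd[1]],
                [rho * sd[0] * sd[1], sd[1]**2]])
mu_Y = a1 * mu[0] + a2 * mu[1]
var_Y = a1**2 * sd[0]**2 + a2**2 * sd[1]**2 + 2 * a1 * a2 * rho * sd[0] * sd[1]

rng = np.random.default_rng(6)
x = rng.multivariate_normal(mu, cov, size=200_000)
y = a1 * x[:, 0] + a2 * x[:, 1]
print(mu_Y, y.mean())  # ~ 7.0 for both
print(var_Y, y.var())  # ~ 4.0 for both
```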
Correlation does not imply causation!!!


B causes A (reverse causation or reverse causality)
Observation: The faster that windmills are observed to rotate, the more
wind is observed.
Wrong conclusion: Wind is caused by the rotation of windmills.

Third factor causes both A and B


Observation: As ice cream sales increase, drowning deaths increase.
Wrong conclusion: Ice cream consumption causes drowning.
Actual explanation: Ice cream is sold in hot summer months, and during
summer, people are more likely to swim.

Relationship is coincidental
Observation: Russian state leaders have alternated between bald and non-bald
for 200 years.
Actual explanation: Purely coincidental

Littlewood's law states that a person can expect to experience events with
odds of one in a million (i.e. a "miracle") at the rate of about one per month.
https://en.wikipedia.org/wiki/Littlewood%27s_law
Example 4.3 - Combined load on column (dead + live + wind load)

Given S = D + L + W, where D ~ N(4.2, 0.3), L ~ N(6.5, 0.8)
and W ~ N(3.4, 0.7), with ρDL = 0.1, ρLW = 0 and ρDW = 0.
If the strength of the column is R ~ N(21.15, 3.1725),
find the probability of failure of the column.
Assume R and S are uncorrelated.

The distribution of S will be N(μS, σS), where

$$\mu_S = \sum_{i=1}^{3} a_i\mu_{X_i} = \mu_D + \mu_L + \mu_W = 4.2 + 6.5 + 3.4 = 14.1$$

$$\sigma_S^2 = \sum_{i=1}^{3}\sum_{j=1}^{3} a_i a_j \rho_{X_i X_j}\sigma_{X_i}\sigma_{X_j} = \sigma_D^2 + \sigma_L^2 + \sigma_W^2 + 2\rho_{DL}\sigma_D\sigma_L$$
$$= 0.3^2 + 0.8^2 + 0.7^2 + 2(0.1)(0.3)(0.8) = 1.268 \;\Rightarrow\; \sigma_S = 1.126$$


Let X = strength − load = R − S. X will be N(μX, σX), where

$$\mu_X = \mu_R - \mu_S = 21.15 - 14.1 = 7.05$$

$$\sigma_X^2 = \sigma_R^2 + \sigma_S^2 = 3.1725^2 + 1.126^2 = 11.333 \;\Rightarrow\; \sigma_X = 3.366$$

Failure occurs when load > strength, i.e. S > R,
or R − S < 0, or equivalently X < 0:

$$P(X < 0) = \Phi\left(\frac{0 - \mu_X}{\sigma_X}\right) = \Phi\left(\frac{0 - 7.05}{3.366}\right) = \Phi(-2.094) = 0.018$$
Expanding the double sum for three variables:

$$\sum_{i=1}^{3}\sum_{j=1}^{3} a_i a_j \rho_{X_i X_j}\sigma_{X_i}\sigma_{X_j}
= a_1 a_1 \rho_{X_1X_1}\sigma_{X_1}\sigma_{X_1} + a_1 a_2 \rho_{X_1X_2}\sigma_{X_1}\sigma_{X_2} + a_1 a_3 \rho_{X_1X_3}\sigma_{X_1}\sigma_{X_3}$$
$$+\; a_2 a_1 \rho_{X_2X_1}\sigma_{X_2}\sigma_{X_1} + a_2 a_2 \rho_{X_2X_2}\sigma_{X_2}\sigma_{X_2} + a_2 a_3 \rho_{X_2X_3}\sigma_{X_2}\sigma_{X_3}$$
$$+\; a_3 a_1 \rho_{X_3X_1}\sigma_{X_3}\sigma_{X_1} + a_3 a_2 \rho_{X_3X_2}\sigma_{X_3}\sigma_{X_2} + a_3 a_3 \rho_{X_3X_3}\sigma_{X_3}\sigma_{X_3}$$
$$=\; a_1^2\sigma_{X_1}^2 + a_2^2\sigma_{X_2}^2 + a_3^2\sigma_{X_3}^2 + 2a_1 a_2 \rho_{X_1X_2}\sigma_{X_1}\sigma_{X_2} + 2a_1 a_3 \rho_{X_1X_3}\sigma_{X_1}\sigma_{X_3} + 2a_2 a_3 \rho_{X_2X_3}\sigma_{X_2}\sigma_{X_3}$$

Note that $\rho_{X_1X_1} = \rho_{X_2X_2} = \rho_{X_3X_3} = 1$ and $\rho_{X_1X_2} = \rho_{X_2X_1}$, etc.
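The double sum is just the quadratic form aᵀΣa with Σ the covariance matrix; a sketch using the numbers of Example 4.3:

```python
import numpy as np
from scipy import stats

# Example 4.3 in matrix form: S = D + L + W, so a = (1, 1, 1).
a = np.array([1.0, 1.0, 1.0])
sd = np.array([0.3, 0.8, 0.7])          # sigma_D, sigma_L, sigma_W
R = np.array([[1.0, 0.1, 0.0],          # rho_DL = 0.1, other pairs 0
              [0.1, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

var_S = a @ (R * np.outer(sd, sd)) @ a  # the double sum = a^T Sigma a
mu_S = 4.2 + 6.5 + 3.4

# Failure probability P(R - S < 0), R ~ N(21.15, 3.1725), R and S uncorrelated.
mu_X = 21.15 - mu_S
sd_X = np.sqrt(3.1725**2 + var_S)
print(var_S)                            # 1.268
print(stats.norm.cdf(-mu_X / sd_X))     # ~ 0.018
```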

More complex example

S = 1 − 2D + 3L − 4W, with D ~ N(4.2, 0.3), L ~ N(6.5, 0.8) and
W ~ N(3.4, 0.7), with ρDL = 0.1, ρLW = −0.2 and ρDW = −0.3.

The distribution of S will be N(μS, σS), where

$$\mu_S = 1 + \sum_{i=1}^{3} a_i\mu_{X_i} = 1 - 2\mu_D + 3\mu_L - 4\mu_W = 1 - 2(4.2) + 3(6.5) - 4(3.4) = \ldots$$

(the constant 1 shifts the mean but does not affect the variance)

$$\sigma_S^2 = \sum_{i=1}^{3}\sum_{j=1}^{3} a_i a_j \rho_{X_i X_j}\sigma_{X_i}\sigma_{X_j}$$
$$= 2^2\sigma_D^2 + 3^2\sigma_L^2 + 4^2\sigma_W^2 + 2(-2)(3)\rho_{DL}\sigma_D\sigma_L + 2(3)(-4)\rho_{LW}\sigma_L\sigma_W + 2(-2)(-4)\rho_{DW}\sigma_D\sigma_W$$
$$= 4(0.3^2) + 9(0.8^2) + 16(0.7^2) - 2(2)(3)(0.1)(0.3)(0.8) + 2(3)(4)(0.2)(0.8)(0.7) - 2(2)(4)(0.3)(0.3)(0.7)$$
$$= \ldots$$

(watch the signs: aD = −2, aL = 3, aW = −4, and ρLW, ρDW are negative)
4.3.2 Product (or Quotient) of Lognormal Random Variables

Consider $Y = a_0 X_1^{a_1} X_2^{a_2}$, where

$$X_1 \sim LN(\lambda_{X_1}, \zeta_{X_1}), \quad X_2 \sim LN(\lambda_{X_2}, \zeta_{X_2}) \quad \text{with correlation } \rho_{X_1 X_2}$$

$$\ln Y = \ln a_0 + a_1 \ln X_1 + a_2 \ln X_2$$

$$\ln X_1 \sim N(\lambda_{X_1}, \zeta_{X_1}) \quad \text{and} \quad \ln X_2 \sim N(\lambda_{X_2}, \zeta_{X_2})$$

ln Y is therefore a sum of normal r.v. with mean and variance:

$$E(\ln Y) = \ln a_0 + a_1 E(\ln X_1) + a_2 E(\ln X_2) = \ln a_0 + a_1\lambda_{X_1} + a_2\lambda_{X_2}$$

$$\text{var}(\ln Y) = a_1^2\zeta_{X_1}^2 + a_2^2\zeta_{X_2}^2 + 2a_1 a_2 \rho_{\ln X_1 \ln X_2}\zeta_{X_1}\zeta_{X_2}$$

Assume $\rho_{\ln X_1 \ln X_2} \approx \rho_{X_1 X_2}$.

The distribution of Y is LN(λY, ζY), where

$$\lambda_Y = E(\ln Y) = \ln a_0 + \sum_{i=1}^{2} a_i\lambda_{X_i} \quad \text{and} \quad \zeta_Y^2 = \text{var}(\ln Y) = \sum_{i=1}^{2}\sum_{j=1}^{2} a_i a_j \rho_{X_i X_j}\zeta_{X_i}\zeta_{X_j}$$

In summary:

Y = a0 X1^a1 X2^a2 (product of lognormals ⇒ Y is lognormal)

Take the log:

ln Y = ln a0 + a1 ln X1 + a2 ln X2 (sum of normals ⇒ ln Y is normal)

Then calculate E[ln Y] = λY and Stdev(ln Y) = ζY.
Example 4.4 - Settlement of footing on sand, S = PBI/M

P (applied pressure): LN, λ = −0.005, ζ = 0.1
B (footing dimension): LN, λ = 1.792, ζ = 0
I (influence factor): LN, λ = −0.516, ζ = 0.1
M (modulus of compressibility): LN, λ = 3.455, ζ = 0.15

Assume P, B, I and M are independent LN variates;
find (a) the mean settlement, (b) P(S < 0.2).

S is a product/quotient of lognormals, hence S will be LN(λS, ζS), where

$$\lambda_S = \lambda_P + \lambda_B + \lambda_I - \lambda_M = -0.005 + 1.792 - 0.516 - 3.455 = -2.184$$

$$\zeta_S^2 = \zeta_P^2 + \zeta_B^2 + \zeta_I^2 + \zeta_M^2 = 0.0425 \;\Rightarrow\; \zeta_S = 0.206$$

(a) Mean settlement: $\mu_S = \exp(\lambda_S + 0.5\zeta_S^2) = 0.115$

(b) $P(S < 0.2) = \Phi\left(\dfrac{\ln 0.2 - (-2.184)}{0.206}\right) = \Phi(2.789) = 0.9974$

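A short sketch reproducing Example 4.4 with numpy/scipy:

```python
import numpy as np
from scipy import stats

# Example 4.4: S = P*B*I/M, all independent lognormals.
lam = np.array([-0.005, 1.792, -0.516, 3.455])  # lambda of P, B, I, M
zeta = np.array([0.1, 0.0, 0.1, 0.15])          # zeta of P, B, I, M
a = np.array([1.0, 1.0, 1.0, -1.0])             # exponents (division => -1)

lam_S = a @ lam                                 # -2.184
zeta_S = np.sqrt(np.sum((a * zeta)**2))         # 0.206 (independent: no cross terms)

print(np.exp(lam_S + 0.5 * zeta_S**2))          # (a) mean settlement ~ 0.115
print(stats.norm.cdf((np.log(0.2) - lam_S) / zeta_S))  # (b) ~ 0.9974
```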
More complex example

S = 2P³B⁴I⁵/M⁶ (physically wrong, just for example)

Assume P and B are correlated with ρPB = 0.1,
and I and M are correlated with ρIM = −0.2.

S will be LN(λS, ζS), where

$$\lambda_S = \ln 2 + 3\lambda_P + 4\lambda_B + 5\lambda_I - 6\lambda_M = \ldots$$

$$\zeta_S^2 = 3^2\zeta_P^2 + 4^2\zeta_B^2 + 5^2\zeta_I^2 + 6^2\zeta_M^2 + 2(3)(4)\rho_{PB}\zeta_P\zeta_B + 2(5)(-6)\rho_{IM}\zeta_I\zeta_M = \ldots$$

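A sketch of the same bookkeeping for this more complex case, reusing the λ and ζ values of Example 4.4 (an assumption, since the example does not restate them):

```python
import numpy as np

# S = 2 P^3 B^4 I^5 / M^6 with rho_PB = 0.1, rho_IM = -0.2,
# all other pairs uncorrelated.
lam = np.array([-0.005, 1.792, -0.516, 3.455])  # from Example 4.4
zeta = np.array([0.1, 0.0, 0.1, 0.15])
a = np.array([3.0, 4.0, 5.0, -6.0])             # exponents of P, B, I, M

R = np.eye(4)
R[0, 1] = R[1, 0] = 0.1                         # rho_PB
R[2, 3] = R[3, 2] = -0.2                        # rho_IM

lam_S = np.log(2.0) + a @ lam                   # a_0 = 2 enters through ln(a_0)
zeta_S = np.sqrt(a @ (R * np.outer(zeta, zeta)) @ a)
print(lam_S, zeta_S)
```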