
Advanced Statistics

Joint probability distributions

Economics
University of Manchester

Joint Probability Distributions

• Lecture focuses on two variables
  - relationship to individual variables
  - correlation and independence
• Ideas generalise to more than two variables
• Use discrete case for illustrative purposes
  - principal results apply also to continuous variables
• Expectations & combinations of variables: lecture notes, pp. 4-10

Joint Probability Distributions
• For two discrete random variables, X and Y, we have
  Pr(X = x ∩ Y = y) = p(x, y)
• Joint probability distribution requirements:
  - 0 ≤ p(x, y) ≤ 1
  - Σx Σy p(x, y) = 1
    Note: the order of summation can be switched

Example I
• H and W: weekly incomes of husbands and wives (£100s)

                h
          0      1      2
     0   0.05   0.15   0.10
  w  1   0.10   0.10   0.30
     2   0.05   0.05   0.10

• For example: Pr(H = 2, W = 1) = 0.30
• The table entries are the joint probability distribution
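A minimal illustrative sketch in Python, assuming the joint table is stored as a dictionary keyed by (h, w); the names here are hypothetical, not part of the lecture notes. The probabilities should all lie in [0, 1] and sum to 1, and individual entries such as Pr(H = 2, W = 1) can be read off directly.

```python
# Joint probability distribution p(h, w) for the husband/wife income example (£100s)
joint = {
    (0, 0): 0.05, (1, 0): 0.15, (2, 0): 0.10,   # w = 0 row
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.30,   # w = 1 row
    (0, 2): 0.05, (1, 2): 0.05, (2, 2): 0.10,   # w = 2 row
}

# Every probability lies in [0, 1] and they sum to 1
assert all(0 <= p <= 1 for p in joint.values())
print(sum(joint.values()))   # 1.0 (up to floating-point rounding)
print(joint[(2, 1)])         # Pr(H = 2, W = 1) = 0.30
```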
Marginal Probabilities
• Marginal probability: the probability of an outcome for one variable alone.

                h
          0      1      2     p(w)
     0   0.05   0.15   0.10   0.30
  w  1   0.10   0.10   0.30   0.50
     2   0.05   0.05   0.10   0.20
   p(h)  0.20   0.30   0.50   1.00

• That is:  pH(h) = Σw p(h, w)   (column sum)
            pW(w) = Σh p(h, w)   (row sum)
Marginal Probability Distributions
• Marginal probability distribution: the probabilities of all outcomes for one variable.

     w    p(w)           h    p(h)
     0    0.3            0    0.2
     1    0.5            1    0.3
     2    0.2            2    0.5
          1.0                 1.0
   μW = 0.9            μH = 1.3

• Each satisfies the usual requirements for a probability distribution:
  - 0 ≤ pH(h) ≤ 1  and  Σh pH(h) = 1
  - 0 ≤ pW(w) ≤ 1  and  Σw pW(w) = 1

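A short sketch (same hypothetical `joint` dictionary as above) showing how the marginal distributions and their means are obtained by summing the joint probabilities over the other variable:

```python
joint = {
    (0, 0): 0.05, (1, 0): 0.15, (2, 0): 0.10,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.30,
    (0, 2): 0.05, (1, 2): 0.05, (2, 2): 0.10,
}

# Marginals: sum the joint probabilities over the other variable
p_h = {h: sum(p for (hh, w), p in joint.items() if hh == h) for h in (0, 1, 2)}
p_w = {w: sum(p for (h, ww), p in joint.items() if ww == w) for w in (0, 1, 2)}

mu_h = sum(h * p for h, p in p_h.items())   # 1.3
mu_w = sum(w * p for w, p in p_w.items())   # 0.9
print(p_h, p_w, mu_h, mu_w)
```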
Functions of Two Variables
• Total income of couple: T = H + W, "household income"

                h   (value of t in parentheses)
            0           1           2
     0   0.05 (0)   0.15 (1)   0.10 (2)
  w  1   0.10 (1)   0.10 (2)   0.30 (3)
     2   0.05 (2)   0.05 (3)   0.10 (4)

• E[T] can be obtained from either the joint distribution or the distribution of T:

  E[T] = E[H + W] = Σh Σw (h + w) p(h, w) = 2.2
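As an illustrative check (hypothetical `joint` dictionary again), E[T] can be computed directly from the joint distribution:

```python
joint = {
    (0, 0): 0.05, (1, 0): 0.15, (2, 0): 0.10,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.30,
    (0, 2): 0.05, (1, 2): 0.05, (2, 2): 0.10,
}

# E[T] = E[H + W]: weight (h + w) by the joint probability of each cell
expected_T = sum((h + w) * p for (h, w), p in joint.items())
print(expected_T)   # ≈ 2.2, which equals E[H] + E[W] = 1.3 + 0.9
```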
Independence
• Random variables X and Y are independent if and only if
  p(x, y) = pX(x) × pY(y)  for all x and y
• Example: for H and W,  p(0, 0) = 0.05
  pH(0) = 0.2, pW(0) = 0.3, so pH(0) × pW(0) = 0.06
  hence p(0, 0) ≠ pH(0) × pW(0)
• H and W are NOT independent: p(h, w) ≠ pH(h) pW(w)
• One needs to verify p(x, y) = pX(x) × pY(y) for ALL x and y to establish
  independence; a single failing cell is enough to rule it out
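A minimal sketch (hypothetical names, same `joint` dictionary) that checks every cell against the product of the marginals:

```python
joint = {
    (0, 0): 0.05, (1, 0): 0.15, (2, 0): 0.10,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.30,
    (0, 2): 0.05, (1, 2): 0.05, (2, 2): 0.10,
}
p_h = {h: sum(p for (hh, w), p in joint.items() if hh == h) for h in (0, 1, 2)}
p_w = {w: sum(p for (h, ww), p in joint.items() if ww == w) for w in (0, 1, 2)}

# Independence requires p(h, w) == pH(h) * pW(w) in EVERY cell
independent = all(abs(joint[(h, w)] - p_h[h] * p_w[w]) < 1e-12
                  for (h, w) in joint)
print(independent)   # False: e.g. p(0, 0) = 0.05 but pH(0) * pW(0) = 0.06
```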
Covariance

• The covariance between two random variables is
  cov[X, Y] = E[(X – μX)(Y – μY)]

• To compute it for our table we need E[(W – μW)(H – μH)]; the bracketed entries
  below are (w – 0.9)(h – 1.3), e.g. w = 0, h = 2 gives (0 – 0.9)(2 – 1.3) = –0.63

                    h
            0              1              2
     0   0.05 [ 1.17]   0.15 [ 0.27]   0.10 [–0.63]
  w  1   0.10 [–0.13]   0.10 [–0.03]   0.30 [ 0.07]
     2   0.05 [–1.43]   0.05 [–0.33]   0.10 [ 0.77]

• cov[H, W] = Σw Σh (w – 0.9)(h – 1.3) p(h, w) = 0.03
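A brief sketch (same hypothetical `joint` dictionary) computing the covariance directly from its definition:

```python
joint = {
    (0, 0): 0.05, (1, 0): 0.15, (2, 0): 0.10,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.30,
    (0, 2): 0.05, (1, 2): 0.05, (2, 2): 0.10,
}
mu_h, mu_w = 1.3, 0.9   # marginal means from the earlier slides

# cov[H, W] = E[(H - mu_h)(W - mu_w)]: a probability-weighted sum over all cells
cov_hw = sum((h - mu_h) * (w - mu_w) * p for (h, w), p in joint.items())
print(round(cov_hw, 2))   # 0.03
```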
Correlation
• Correlation is a unit-free measure of association:

  ρXY = cov[X, Y] / √(var[X] var[Y]) = σXY / (σX σY),   with –1 ≤ ρXY ≤ 1

• The closer |ρXY| is to 1, the closer the two variables are to perfect linear
  association
  - that is, the relationship is closer to Y = bX + c
• For the husband/wife income example:

  ρHW = σHW / (σW σH) = 0.03 / (0.7 × 0.781) ≈ 0.055

  - positive but extremely weak
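An illustrative sketch (hypothetical names) recomputing the correlation from the joint table, including the two variances:

```python
joint = {
    (0, 0): 0.05, (1, 0): 0.15, (2, 0): 0.10,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.30,
    (0, 2): 0.05, (1, 2): 0.05, (2, 2): 0.10,
}
mu_h, mu_w = 1.3, 0.9

var_h = sum((h - mu_h) ** 2 * p for (h, w), p in joint.items())            # 0.61
var_w = sum((w - mu_w) ** 2 * p for (h, w), p in joint.items())            # 0.49
cov_hw = sum((h - mu_h) * (w - mu_w) * p for (h, w), p in joint.items())   # 0.03

rho = cov_hw / (var_h ** 0.5 * var_w ** 0.5)
print(round(rho, 3))   # ≈ 0.055: positive but extremely weak
```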
Independence & Correlation
• For independent X & Y:
  - NO association, linear or nonlinear
  - cov[X, Y] = 0
  - ρXY = 0
• BUT: ρXY = 0 does NOT imply independence
  - a nonlinear relationship may exist

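As an illustrative counterexample (not from the income example): take X uniform on {–1, 0, 1} and Y = X². Y is completely determined by X, yet the covariance, and hence the correlation, is zero.

```python
# X uniform on {-1, 0, 1}; Y = X**2 is a deterministic (nonlinear) function of X
p_x = {-1: 1/3, 0: 1/3, 1: 1/3}

mu_x = sum(x * p for x, p in p_x.items())        # E[X]  = 0
mu_y = sum(x**2 * p for x, p in p_x.items())     # E[X²] = 2/3

cov_xy = sum((x - mu_x) * (x**2 - mu_y) * p for x, p in p_x.items())
print(round(cov_xy, 12))   # 0.0 -> uncorrelated, even though Y depends entirely on X
```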
Linear Combinations

• Examine properties of linear functions of variables
  - e.g., V = aX + bY + c, with constants a, b, c
  - mean and variance
  - covariance (two or more variables)
  - distributions (Normal)
  - extension of the linear transformation Y = bX + c
• Expectations & combinations of variables
  - Lecture Notes pp. 11-17 (pp. 1-3 for Linear Transformations)

Linear Transformations

• Y = bX + c, with constants b, c
  - could be a change of units
  - e.g., b ≈ 1.20, c = 0 to convert £ to €
• Effects on mean and variance:
  - E[Y] = E[bX + c] = b E[X] + c
  - var[Y] = var[bX + c] = b² var[X]
• Effect on distribution:
  - if X ~ N(µ, σ²) then Y ~ N(bµ + c, b²σ²)

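A small illustrative check (using the wife's marginal income distribution from earlier and the hypothetical £-to-€ conversion b ≈ 1.20) that the mean scales as bE[X] + c and the variance as b²var[X]:

```python
p_w = {0: 0.3, 1: 0.5, 2: 0.2}   # marginal distribution of W (£100s)
b, c = 1.20, 0.0                 # illustrative £ -> € conversion

mu_w = sum(w * p for w, p in p_w.items())                    # 0.9
var_w = sum((w - mu_w) ** 2 * p for w, p in p_w.items())     # 0.49

# Transform Y = bW + c: recompute directly and compare with the formulas
mu_y = sum((b * w + c) * p for w, p in p_w.items())
var_y = sum((b * w + c - mu_y) ** 2 * p for w, p in p_w.items())
print(mu_y, b * mu_w + c)        # both ≈ 1.08
print(var_y, b ** 2 * var_w)     # both ≈ 0.7056
```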
Mean of Linear Combination

• V = aX + bY + c, with constants a, b, c

• E[V] = E[aX + bY + c]
       = a E[X] + b E[Y] + c

  i.e.  μV = a·μX + b·μY + c

Mean: Generalisation
• We have n random variables X1, ..., Xn and constants c, a1, ..., an:

  E[c + Σi ai Xi] = E[c + a1X1 + a2X2 + ... + anXn]
                  = c + a1E[X1] + E[a2X2 + ... + anXn]       (apply E[.] to X1)
                  = c + a1E[X1] + a2E[X2] + ... + anE[Xn]    (apply E[.] to each variable in turn)
                  = c + Σi ai E[Xi]

• Note that the E[.] operation is taken inside the summation
Mean: Example
• P = 2X1 + 5X2 – 3X3 + 4
• E[X1] = 2, E[X2] = –1, E[X3] = 3

• Mean of P:

  E[P] = E[2X1 + 5X2 – 3X3 + 4]
       = 2E[X1] + 5E[X2] – 3E[X3] + 4     (E[.] operates on each Xi)
       = 2×2 + 5×(–1) – 3×3 + 4           (plug the numbers in)
       = –6

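A one-line sketch (hypothetical variable names) confirming the arithmetic:

```python
a = (2, 5, -3)    # coefficients on X1, X2, X3
mu = (2, -1, 3)   # E[X1], E[X2], E[X3]
c = 4

# E[P] = sum of a_i * E[X_i], plus the constant
print(sum(ai * mi for ai, mi in zip(a, mu)) + c)   # -6
```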
Calculating Variances
• Alternative formula for calculating a variance:
  var[X] = E[X²] – {E[X]}² = E[X²] – μ²

• Proof:

  var[X] = E[(X – μ)²]
         = E[X² – 2Xμ + μ²]       (expand the square)
         = E[X²] – 2μE[X] + μ²    (apply E[.] term by term; remember, μ is NOT random)
         = E[X²] – 2μ² + μ²       (since E[X] = μ)
         = E[X²] – μ²             (simplify: –2μ² + μ² = –μ²)

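A brief check, using the wife's marginal distribution as an illustration, that the shortcut formula reproduces var[W] = 0.49:

```python
p_w = {0: 0.3, 1: 0.5, 2: 0.2}

mu = sum(w * p for w, p in p_w.items())          # E[W]  = 0.9
ex2 = sum(w ** 2 * p for w, p in p_w.items())    # E[W²] = 1.3

print(round(ex2 - mu ** 2, 2))                   # 0.49, matching the direct definition
```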
Calculating Covariances
• Similarly for covariances:
  cov[X, Y] = E[(X – μX)(Y – μY)]
            = E[XY] – μX μY

• Proof:

  cov[X, Y] = E[(X – μX)(Y – μY)]
            = E[XY – μX Y – X μY + μX μY]         (expand the product)
            = E[XY] – μX E[Y] – E[X]μY + μX μY    (apply E[.] to the random variables)
            = E[XY] – 2μX μY + μX μY              (since E[X] = μX, E[Y] = μY)
            = E[XY] – μX μY                       (simplify)
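As a quick check against the income example (hypothetical `joint` dictionary as before), E[HW] can be computed from the joint table and compared with cov[H, W] = E[HW] – μH μY = 0.03:

```python
joint = {
    (0, 0): 0.05, (1, 0): 0.15, (2, 0): 0.10,
    (0, 1): 0.10, (1, 1): 0.10, (2, 1): 0.30,
    (0, 2): 0.05, (1, 2): 0.05, (2, 2): 0.10,
}
mu_h, mu_w = 1.3, 0.9

e_hw = sum(h * w * p for (h, w), p in joint.items())   # E[HW] = 1.20
print(round(e_hw - mu_h * mu_w, 2))                    # 0.03, as before
```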
The Implication of Zero Correlation
• cov[X, Y] = E[(X – μX)(Y – μY)] = E[XY] – μX μY
• Correlation: ρXY = cov[X, Y] / (σX σY)
• For uncorrelated X & Y we have cov[X, Y] = 0
• This means E[XY] – μX μY = 0, which implies E[XY] = μX × μY
  - the expected value of the product of uncorrelated random variables is the
    product of their individual expected values

Variance of Linear Combination
• Key result:

  var[aX + bY + c] = a²var[X] + b²var[Y] + 2ab·cov[X, Y]

• Combinations involving more than one random variable require covariance terms
• Notes provide the proof; for V = aX + bY + c
  - use the definition var[V] = E[(V – μV)²]
  - and the results for the mean of a linear combination

Variance: Example
• Combination formula:  a²σ²H + b²σ²W + 2ab·σHW

• Husband & wife income example:
  var[H] = 0.61, var[W] = 0.49, cov[H, W] = 0.03

• For T = H + W, "household income" (a = b = 1):

  var[T] = var[H] + var[W] + 2 × 1 × 1 × cov[H, W]
         = 0.61 + 0.49 + 2 × 0.03 = 1.16

• For D = H – W, "gender pay gap" (a = 1, b = –1):

  var[D] = var[H] + (–1)² var[W] + 2 × 1 × (–1) × cov[H, W]
         = 0.61 + 0.49 – 2 × 0.03 = 1.04
Variance: Uncorrelated Variables
• For n uncorrelated random variables X1, X2, ..., Xn and constants a1, a2, ..., an:

  var[Σi ai Xi] = Σi ai² var[Xi]

  - if the random variables are uncorrelated we can take var[.] inside the summation
• For uncorrelated X and Y:  σ²X+Y = σ²X + σ²Y   (i.e. var[X + Y] = var[X] + var[Y])
  - except in very special cases, σX+Y ≠ σX + σY
  - calculate the variance first, then take the standard deviation
Linear Combinations of Normal Variables
• If the variables follow a normal distribution, so does any linear combination of them
  - this is an important result for the normal case
• Hence for X ~ N(μX, σ²X) and Y ~ N(μY, σ²Y):

  T = aX + bY + c ~ N(μT, σ²T), where

  μT = a·μX + b·μY + c                                 (mean formula)
  σ²T = a²var[X] + b²var[Y] + 2ab·cov[X, Y]            (variance formula)
Normal Variables: Example
• Suppose we have independent W ~ N(20, 5) and H ~ N(30, 11)
  - e.g., wife & husband annual earnings in thousands
• Then if D = W – H (which can also be viewed as a "gender pay gap"), its mean and
  variance are

  μD = μW – μH = 20 – 30 = –10                         (mean)
  σ²D = σ²W + (–1)²σ²H – 2σWH = 5 + 11 – 0 = 16        (variance; independence implies
                                                        zero covariance)
• So D ~ N(–10, 16)
The Example continued
• To find Pr[D > 0], "the probability the wife earns more than the husband",
  map D onto the standard Normal (subtract μD, divide by σD):

  Pr[D > 0] = Pr[Z > (0 – (–10)) / √16]
            = Pr[Z > 2.5]
            = 1 – 0.99379
            = 0.00621
This leads on to the idea of hypothesis testing, for answering questions like:
is the gender pay gap zero?
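A minimal sketch reproducing this probability with the standard library's `statistics.NormalDist` (the variable names are hypothetical):

```python
from statistics import NormalDist

# D = W - H ~ N(-10, 16), i.e. mean -10 and standard deviation 4
D = NormalDist(mu=-10, sigma=16 ** 0.5)

# Pr[D > 0] = 1 - Phi((0 - (-10)) / 4) = 1 - Phi(2.5)
print(1 - D.cdf(0))                 # ≈ 0.0062
print(1 - NormalDist().cdf(2.5))    # same value via the standard normal
```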
