

STATISTICAL METHODS FOR BUSINESS


Copyright © 2015 by Sujay K Mukhoti. All rights reserved.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise,
without permission of the author.

Limit of Liability/Disclaimer of Warranty: While the author has used his best efforts in
preparing this book, he makes no representations or warranties with respect to the completeness
of the contents of this book and specifically disclaims any implied warranties of merchantability
or fitness for a particular purpose. The advice and strategies contained herein may not be suitable
for your situation. You should consult with a professional where appropriate. The author
shall not be liable for any loss of profit or any other commercial damages, including but not limited
to special, incidental, consequential, or other damages.

For general information on our other products and services please contact: Sujay K Mukhoti,
OM & QT Area, IIM Indore, Rau-Pithampur Road, Indore-453556, Tel.: 0731-2439-487.
CHAPTER 1

JOINT DISTRIBUTION OF RANDOM VARIABLES

In this chapter we discuss the association between random variables. Consider
two random variables, the IBM daily stock return X and the Microsoft daily stock
return Y, whose joint probabilities are shown in the following table:

                  Y
   X       0.5     0      -0.5    Total
 -0.4     0.02    0.07    0.03    0.12
  0       0.02    0.62    0.04    0.68
  0.4     0.07    0.11    0.02    0.20
 Total    0.11    0.80    0.09    1.00

Table 1.1 Joint probability distribution of X and Y

Further, given that both stocks are priced at $100 and you are endowed
with $200, what portfolio would you construct? The choices are 2 IBM stocks,

D R A F T August 28, 2019, 5:38pm D R A F T



2 Microsoft stocks, or 1 each of IBM and Microsoft. To solve this problem,
we first note the following properties of the joint distribution of X and Y.

1. The cell values are joint probabilities P [X = x, Y = y]. For example, cell
(1,2) value is P [X = −0.4, Y = 0].

2. The row totals provide the marginal distribution of X (Table 1.2).

    X      P[X = x]
  -0.4     0.12
   0       0.68
   0.4     0.20

Table 1.2 Marginal distribution of X

Thus, the marginal expectation of X is \( E[X] = \sum_x x\,P[X = x] = 0.032 \) and the marginal variance
of X is \( V(X) = \sum_x x^2\,P[X = x] - (E[X])^2 = 0.0512 - 0.001024 \approx 0.0502 \).

3. The column totals provide the marginal distribution of Y (Table 1.3).

    Y      P[Y = y]
   0.5     0.11
   0       0.80
  -0.5     0.09

Table 1.3 Marginal distribution of Y

Thus, the marginal expectation of Y is \( E[Y] = \sum_y y\,P[Y = y] = 0.5(0.11) + 0(0.8) - 0.5(0.09) = 0.01 \) and the marginal variance
of Y is \( V(Y) = \sum_y y^2\,P[Y = y] - (E[Y])^2 = 0.05 - 0.0001 = 0.0499 \).

Functions of two random variables, X & Y:

Consider the above joint distribution of X and Y. From this table, the
distribution of a function of (X, Y), say H(X, Y), can be obtained as follows:
P[H(X, Y) = k] is computed by adding the probabilities of all the pairs (x, y) that
generate H(x, y) = k.

EXAMPLE 1.1

From Table 1.1, we find the probability distribution of max(X, Y).


max(X, Y ) P [max(X, Y ) = k]
-0.4 0.03
0 0.73
0.4 0.13
0.5 0.11

Table 1.4 Probability distribution of max(X, Y )
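The rule "add the probabilities of all pairs that generate H(x, y) = k" can be sketched in a few lines of Python. This is only an illustration: the dictionary `joint` is a transcription of Table 1.1, and the helper name `distribution_of` is an assumption introduced here, not part of the text.

```python
from collections import defaultdict

# Joint probabilities P[X = x, Y = y], transcribed from Table 1.1.
joint = {
    (-0.4, 0.5): 0.02, (-0.4, 0.0): 0.07, (-0.4, -0.5): 0.03,
    (0.0, 0.5): 0.02, (0.0, 0.0): 0.62, (0.0, -0.5): 0.04,
    (0.4, 0.5): 0.07, (0.4, 0.0): 0.11, (0.4, -0.5): 0.02,
}

def distribution_of(h, joint):
    """Return {k: P[h(X, Y) = k]} by summing the joint
    probabilities of every pair (x, y) with h(x, y) = k."""
    dist = defaultdict(float)
    for (x, y), p in joint.items():
        dist[h(x, y)] += p
    return dict(sorted(dist.items()))

# Distribution of max(X, Y), which should reproduce Table 1.4.
max_dist = distribution_of(max, joint)
for k, p in max_dist.items():
    print(f"{k:5.1f}  {p:.2f}")
```

Passing a different function, for example `lambda x, y: x + y`, gives the distribution of the return on a portfolio holding one share of each stock.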

Expectation of linear functions Suppose X and Y are two random variables
with E[X] = µx and E[Y] = µy. Then the following results hold:

E[X + Y] = µx + µy (X + Y resembles a portfolio return with two stocks)

E[X − Y] = µx − µy (X − Y resembles a portfolio return that is long one stock
X and short the other, i.e. Y is borrowed and sold)

E[aX + bY] = aµx + bµy (aX + bY resembles a portfolio return with a units
of X and b units of Y. A negative a or b indicates a short position
in the corresponding number of units.)

E[g(X) + h(Y)] = E[g(X)] + E[h(Y)]. For example, if g(X) = X² and
h(Y) = Y, then E[X² + Y] = E[X²] + µy.

For n random variables X1, X2, ..., Xn with means µ1, µ2, ..., µn,
\[ E[a_1 X_1 + a_2 X_2 + \dots + a_n X_n] = a_1 \mu_1 + a_2 \mu_2 + \dots + a_n \mu_n = \sum_{i=1}^{n} a_i \mu_i \]
(\( \sum_{i=1}^{n} a_i X_i \) resembles the return of a portfolio with a_i units of stock X_i, i = 1, 2, ..., n)
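As a numerical check of the linearity results, the sketch below computes E[aX + bY] directly from the joint table and compares it with aµx + bµy. The holdings a = 2 and b = −1 are hypothetical values chosen for illustration, and the helper `expect` is a name introduced here.

```python
# Joint probabilities P[X = x, Y = y], transcribed from Table 1.1.
joint = {
    (-0.4, 0.5): 0.02, (-0.4, 0.0): 0.07, (-0.4, -0.5): 0.03,
    (0.0, 0.5): 0.02, (0.0, 0.0): 0.62, (0.0, -0.5): 0.04,
    (0.4, 0.5): 0.07, (0.4, 0.0): 0.11, (0.4, -0.5): 0.02,
}

def expect(g, joint):
    """E[g(X, Y)]: sum of g(x, y) * P[X = x, Y = y] over all cells."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x, joint)  # marginal mean of X
mu_y = expect(lambda x, y: y, joint)  # marginal mean of Y

a, b = 2.0, -1.0  # hypothetical holdings: long 2 units of X, short 1 unit of Y
lhs = expect(lambda x, y: a * x + b * y, joint)  # E[aX + bY] from the table
rhs = a * mu_x + b * mu_y                        # a*mu_x + b*mu_y

print(mu_x, mu_y, lhs, rhs)
```

The two sides agree up to floating-point rounding, whatever the dependence between X and Y.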



Conditional Distribution The conditional distribution of X given Y = y is
given by the set of probabilities
\[ P[X = x \mid Y = y] = \frac{P[X = x, Y = y]}{P[Y = y]}. \]
For example, the conditional distribution of X given Y = 0.5 in the above
example is given in Table 1.5.

    X      P[X = x | Y = 0.5]
  -0.4     0.02/0.11 = 0.18
   0       0.02/0.11 = 0.18
   0.4     0.07/0.11 = 0.64

Table 1.5 Conditional distribution of X

In this case, \( E[X \mid Y = 0.5] = \sum_x x\,P[X = x \mid Y = 0.5] = (-0.4 \times 0.02 + 0.4 \times 0.07)/0.11 \approx 0.182 \).
Similarly one may compute E[Y | X = x] for different x's. In
particular, E[Y | X = −0.4] ≈ −0.042, E[Y | X = 0] ≈ −0.015 and E[Y | X = 0.4] = 0.125.


Thus, E[Y | X = x] varies with x and hence is a function of the random
variable, say h(X). It can be proven (proof is not required) that

E[h(X)] = E[E[Y | X]] = E[Y ]
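The identity E[E[Y | X]] = E[Y] can be checked numerically on the joint table. The following is a minimal sketch; the function names `marginal_x` and `cond_mean_y_given_x` are introduced here for illustration only.

```python
# Joint probabilities P[X = x, Y = y], transcribed from Table 1.1.
joint = {
    (-0.4, 0.5): 0.02, (-0.4, 0.0): 0.07, (-0.4, -0.5): 0.03,
    (0.0, 0.5): 0.02, (0.0, 0.0): 0.62, (0.0, -0.5): 0.04,
    (0.4, 0.5): 0.07, (0.4, 0.0): 0.11, (0.4, -0.5): 0.02,
}

def marginal_x(joint):
    """Marginal distribution of X: row totals of the joint table."""
    out = {}
    for (x, _), p in joint.items():
        out[x] = out.get(x, 0.0) + p
    return out

def cond_mean_y_given_x(joint, x0):
    """E[Y | X = x0] = sum_y y * P[X = x0, Y = y] / P[X = x0]."""
    p_x0 = sum(p for (x, _), p in joint.items() if x == x0)
    return sum(y * p for (x, y), p in joint.items() if x == x0) / p_x0

p_x = marginal_x(joint)
h = {x: cond_mean_y_given_x(joint, x) for x in p_x}  # h(x) = E[Y | X = x]

tower = sum(h[x] * p_x[x] for x in p_x)              # E[h(X)] = E[E[Y | X]]
mean_y = sum(y * p for (_, y), p in joint.items())   # E[Y]

for x in sorted(h):
    print(f"E[Y | X = {x:4.1f}] = {h[x]: .4f}")
print(tower, mean_y)
```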

EXAMPLE 1.2

A road-building company claims that its road surface is "skid
free" and promises to pay Rs. 500000 in compensation for every major skid accident leading to
hospitalization. If the monthly number of skid accidents on
that road is Poisson(5) and 25% of them usually result in hospitalization,
what is the average compensation paid by the company in a
month? Further, the company collects a toll on this road. If the traffic
of cars on this road is Poisson(100) per day, how much must it charge per vehicle
so that the company does not expect a loss even after paying the
compensation?

Solution: Let X = number of accidents on the road in a month ∼ Poisson(5) and Y = number of
major accidents. For a given number of accidents, say k, the maximum
number of major accidents possible is k. Since each accident has
probability 0.25 of being a major accident, Y | X = k ∼ Bin(k, 0.25), so that E[Y | X = k] = 0.25k.
Thus

E[Y] = E[E[Y | X]] = E[0.25X] = 0.25 × E[X] = 0.25 × 5 = 1.25.

Hence the expected compensation paid by the company during a
month is 1.25 × 500000 = Rs. 625000.

Further, let Z = monthly traffic on the road, assuming a 30-day month.
Therefore Z ∼ Poisson(3000). If an amount t is charged per vehicle,
the expected income 3000t matches the expected compensation when t = 625000/3000 =
208.33. Thus they should charge at least Rs. 208.33 as toll per vehicle.
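The arithmetic of this solution can be reproduced with a short script. It is a sketch under the example's own assumptions (Poisson(5) accidents, 25% major, 30-day month); the constant names are introduced here, and the Poisson sum is truncated at 100 terms, which is an approximation chosen for illustration.

```python
import math

LAMBDA_ACCIDENTS = 5      # mean number of accidents per month
P_MAJOR = 0.25            # probability that an accident is major
COMPENSATION = 500_000    # Rs. paid per major accident
DAILY_TRAFFIC = 100       # mean number of cars per day
DAYS_IN_MONTH = 30        # assumed month length

def poisson_pmf(k, lam):
    """P[X = k] for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# E[Y] = sum_k E[Y | X = k] * P[X = k], where E[Y | X = k] = 0.25 * k
# is the mean of Bin(k, 0.25). The sum is truncated at k = 99.
expected_major = sum(
    P_MAJOR * k * poisson_pmf(k, LAMBDA_ACCIDENTS) for k in range(100)
)

expected_payout = expected_major * COMPENSATION
expected_vehicles = DAILY_TRAFFIC * DAYS_IN_MONTH
break_even_toll = expected_payout / expected_vehicles

print(round(expected_major, 4), round(expected_payout), round(break_even_toll, 2))
```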

Independence Two random variables, X and Y, are said to be independent
iff P[X = x, Y = y] = P[X = x]P[Y = y] for all pairs of values (x, y). Since
in the above joint table P[X = −0.4, Y = 0.5] = 0.02 ≠ P[X = −0.4]P[Y =
0.5] = 0.12 × 0.11 = 0.0132, X and Y are dependent.
If X and Y are independent then

E[XY ] = E[X]E[Y ]
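The definition of independence translates directly into a cell-by-cell check of the joint table against the product of its marginals. The sketch below assumes the table lists every (x, y) pair, including any with zero probability; the names `marginals` and `is_independent` and the tolerance are choices made here.

```python
# Joint probabilities P[X = x, Y = y], transcribed from Table 1.1.
joint = {
    (-0.4, 0.5): 0.02, (-0.4, 0.0): 0.07, (-0.4, -0.5): 0.03,
    (0.0, 0.5): 0.02, (0.0, 0.0): 0.62, (0.0, -0.5): 0.04,
    (0.4, 0.5): 0.07, (0.4, 0.0): 0.11, (0.4, -0.5): 0.02,
}

def marginals(joint):
    """Return the marginal distributions (row and column totals)."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return p_x, p_y

def is_independent(joint, tol=1e-9):
    """True iff P[X = x, Y = y] = P[X = x] P[Y = y] in every cell."""
    p_x, p_y = marginals(joint)
    return all(abs(p - p_x[x] * p_y[y]) <= tol for (x, y), p in joint.items())

print(is_independent(joint))  # the table fails the check in cell (-0.4, 0.5)

# A table built as the product of the same marginals passes the check.
p_x, p_y = marginals(joint)
product_table = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}
print(is_independent(product_table))
```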

iid random variables: Two random variables are said to be independently and
identically distributed (iid) if they have the same distribution and are independent.
Similarly, a set of random variables is said to be iid if they are (mutually)
independent and identically distributed. In this case, E[X] = E[Y] = µ and
V(X) = V(Y) = σ².

Covariance The covariance between X and Y is defined as Cov(X, Y) =
E[{X − E[X]}{Y − E[Y]}] = E[XY] − E[X]E[Y], where
\[ E[XY] = \sum_x \sum_y x\,y\,P[X = x, Y = y]. \]
Covariance measures the degree of linear association between X and Y. As an
explanation, let us look at the following figure:

[Figure: scatter plots of (x, y) pairs over the four quadrants]

Notice that, with origin (0, 0), if a data point (x, y) lies in the 1st or 3rd quadrant,
then x × y is positive; the product is negative otherwise. (A) If Y increases
with X, then most of the (x, y) pairs fall in the 1st and 3rd quadrants
rather than the 2nd and 4th quadrants (as in the 2nd figure); (B) the opposite holds
when Y decreases with X (as in the 3rd figure); and (C) if there is no such
relationship, then all quadrants are visited about equally often by the data pairs.
Thus in case (A) the expected product E[XY] would be greater than
zero, in case (B) it would be negative and in case (C) it would be close to zero.
Since the center of X is E[X] and that of Y is E[Y] (not necessarily (0, 0)),
shifting the center gives the measure of degree of linearity as E[{X −
E[X]}{Y − E[Y]}] = Cov(X, Y) = E[XY] − µx µy. Cov(X, Y) is maximum
when X and Y are a perfect (linear) match, e.g. X = Y, where it equals V(Y), and it is minimum
when there is a perfect (linear) mismatch, e.g. X = −Y, where it equals −V(Y). In general, −σx σy ≤
Cov(X, Y) ≤ σx σy, so its range depends on the variances and has no fixed bound. It is also unit dependent.

Cov(X, X) = V(X)

−σx σy ≤ Cov(X, Y) ≤ σx σy, where σx = √V(X) and σy = √V(Y)

Cov(X, C) = 0, where C is a constant

Cov(X, Y) = 0 if X and Y are independent, but not vice versa

Cov(aX + bY, cX) = acV(X) + bcCov(X, Y), Cov(aX + bY, cX + dY) =
acV(X) + bdV(Y) + (ad + bc)Cov(X, Y)

The variances of linear combinations of X and Y are given by:

V(X + Y) = V(X) + V(Y) + 2Cov(X, Y) = σx² + σy² + 2σxy

V(X − Y) = V(X) + V(Y) − 2Cov(X, Y) = σx² + σy² − 2σxy

V(aX + bY) = a²V(X) + b²V(Y) + 2abCov(X, Y) = a²σx² + b²σy² + 2abσxy
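For the IBM and Microsoft table, the covariance and the variance of X + Y can be computed directly and compared with the formula V(X) + V(Y) + 2Cov(X, Y). This is a small sketch; the helper `expect` is a name introduced here for illustration.

```python
# Joint probabilities P[X = x, Y = y], transcribed from Table 1.1.
joint = {
    (-0.4, 0.5): 0.02, (-0.4, 0.0): 0.07, (-0.4, -0.5): 0.03,
    (0.0, 0.5): 0.02, (0.0, 0.0): 0.62, (0.0, -0.5): 0.04,
    (0.4, 0.5): 0.07, (0.4, 0.0): 0.11, (0.4, -0.5): 0.02,
}

def expect(g):
    """E[g(X, Y)] under the joint table above."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)

# Cov(X, Y) = E[XY] - E[X]E[Y]
cov_xy = expect(lambda x, y: x * y) - mu_x * mu_y

# V(X + Y) computed two ways: from the definition and from the formula.
var_sum_direct = expect(lambda x, y: (x + y - mu_x - mu_y) ** 2)
var_sum_formula = var_x + var_y + 2 * cov_xy

print(round(cov_xy, 5), round(var_sum_direct, 4), round(var_sum_formula, 4))
```

Both routes give a variance of about 0.1234 for the portfolio with one share of each stock.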

Correlation The correlation coefficient is given by
\[ \rho = \frac{Cov(X, Y)}{\sqrt{V(X)\,V(Y)}} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \]
and is a scaled measure of the degree of linear relationship between X and Y.
Properties of ρ:

1. −1 ≤ ρ ≤ 1

2. If X and Y are independent, then ρ = 0, but not the converse.

3. If U = a + bX and V = c + dY with b, d ≠ 0, then \( \rho_{UV} = \frac{bd}{|b||d|}\,\rho_{XY} \)

4. V(aX + bY) = a²σx² + b²σy² + 2ρabσxσy

5. \( V\left(\sum_i a_i X_i\right) = \sum_i a_i^2 \sigma_i^2 + \sum_i \sum_{j \neq i} a_i a_j \rho_{ij} \sigma_i \sigma_j \), where ρ_ij is the correlation between X_i and X_j
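Property 3 says that a linear change of units can only flip the sign of ρ. The sketch below illustrates this on the IBM and Microsoft table; the constants a, b, c, d are hypothetical values chosen so that bd is negative, and the function name `correlation` is introduced here.

```python
import math

# Joint probabilities P[X = x, Y = y], transcribed from Table 1.1.
joint = {
    (-0.4, 0.5): 0.02, (-0.4, 0.0): 0.07, (-0.4, -0.5): 0.03,
    (0.0, 0.5): 0.02, (0.0, 0.0): 0.62, (0.0, -0.5): 0.04,
    (0.4, 0.5): 0.07, (0.4, 0.0): 0.11, (0.4, -0.5): 0.02,
}

def correlation(table):
    """Correlation coefficient of a discrete joint table {(x, y): prob}."""
    mu_x = sum(x * p for (x, _), p in table.items())
    mu_y = sum(y * p for (_, y), p in table.items())
    var_x = sum((x - mu_x) ** 2 * p for (x, _), p in table.items())
    var_y = sum((y - mu_y) ** 2 * p for (_, y), p in table.items())
    cov = sum((x - mu_x) * (y - mu_y) * p for (x, y), p in table.items())
    return cov / math.sqrt(var_x * var_y)

rho_xy = correlation(joint)

# U = a + bX and V = c + dY with hypothetical constants; here bd < 0.
a, b, c, d = 1.0, -2.0, 3.0, 0.5
joint_uv = {(a + b * x, c + d * y): p for (x, y), p in joint.items()}
rho_uv = correlation(joint_uv)

print(round(rho_xy, 4), round(rho_uv, 4))
```

Because bd < 0 in this choice, the two printed values differ only in sign.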


Comparing two random variables

Consider two random variables, viz. the stock price of Reliance on the BSE and that
of CITI on the NYSE. Directly comparing the monthly prices would not be right,
as they are in two different risk groups (risk is measured by the standard deviation
of the price). Also the units (here INR and USD) are different. To make them
comparable, a technique often used is division by the standard deviation of the
corresponding variable. The ratio thus becomes (i) independent of the monetary
unit and hence of all related complications, and (ii) independent of risk and hence
comparable. To compare the returns of two risky assets (defined as \( (P_t - P_{t-1})/P_{t-1} \)),
William Sharpe defined the following measure:
\[ S = E\left[\frac{X - r_f}{\sigma}\right], \]
called the Sharpe ratio, where r_f is the risk-free rate of return and σ is the standard deviation of the return X. If a two-asset
(risky) portfolio contains a shares of a stock with return X and b
shares of a stock with return Y, then the portfolio return is aX + bY. The portfolio Sharpe
ratio is
\[ S_p = \frac{E[aX + bY] - r_f(a + b)}{\sqrt{V(aX + bY)}}, \]
where r_f = risk-free rate (constant).
The Sharpe ratio simply describes the expected return over the risk-free rate
per unit of risk, i.e. the expected excess return over risk. An investor who anticipates an increase in
risk (volatility) would expect a higher return over the risk-free rate, and hence
would invest at an increased σ only if E[X] ≥ rf. Notice that even if two
stocks have the same σ value, the one with the higher E[X − rf] (or higher Sharpe
ratio) would draw more investment. Similarly, if two assets have the same
expected return, the one with the lower risk (σ) (equivalently, higher Sharpe
ratio) will attract more investment. Hence the rule is:
Invest in the stock which has the higher Sharpe ratio.

EXAMPLE 1.3

Suppose an investor wants to decide between the choices she has for investing her
current cash. The two stocks available are Apple (µA = 2.48% and volatility
σA = 13.8%) and McDonald's (µM = 1.28% and volatility σM = 6.5%).
Which stock will be preferred for investment? Assume a 0.4% risk-free rate.

Solution: The corresponding Sharpe ratios are:

\[ S(A) = \frac{\mu_A - r_f}{\sigma_A} = \frac{2.48 - 0.4}{13.8} = 0.151 \]
\[ S(M) = \frac{\mu_M - r_f}{\sigma_M} = \frac{1.28 - 0.4}{6.5} = 0.135 \]

Since the Sharpe ratio of Apple is higher, she should prefer Apple stock for
investment.
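The comparison in Example 1.3 amounts to one division per stock. A minimal sketch, with the function name `sharpe_ratio` introduced here and all inputs in percent as in the example:

```python
def sharpe_ratio(mean_return, volatility, risk_free):
    """Expected excess return per unit of risk: (mu - r_f) / sigma."""
    return (mean_return - risk_free) / volatility

RISK_FREE = 0.4  # percent, as assumed in Example 1.3

apple = sharpe_ratio(mean_return=2.48, volatility=13.8, risk_free=RISK_FREE)
mcdonalds = sharpe_ratio(mean_return=1.28, volatility=6.5, risk_free=RISK_FREE)

# Pick the stock with the higher Sharpe ratio.
preferred = "Apple" if apple > mcdonalds else "McDonald's"
print(round(apple, 3), round(mcdonalds, 3), preferred)
```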
Returning to the IBM and Microsoft example: for 2 IBM shares, a = 2, b = 0, and the Sharpe ratio of this portfolio is
\[ S_{2IBM} = \frac{2E[X] - 2r_f}{\sqrt{V(2X)}} = \frac{E[X] - r_f}{\sqrt{V(X)}} = 0.112. \]
For 2 Microsoft shares, a = 0, b = 2, and similarly the Sharpe ratio is
\[ S_{2MS} = \frac{2E[Y] - 2r_f}{\sqrt{V(2Y)}} = \frac{E[Y] - r_f}{\sqrt{V(Y)}} = 0.1343. \]
Thus 2 IBM stocks are preferred over 2 Microsoft stocks.
However, computing the Sharpe ratio for the portfolio with 1 share of each stock
is not that straightforward. This is because of the co-movement between
Microsoft and IBM prices. Specifically, the risk in one is influenced by the
price movement of the other, and hence the portfolio risk needs to be measured
carefully, using the covariance between X and Y.
Let us go back to the problem of computing the Sharpe ratio for the portfolio
with 1 each of IBM and Microsoft stock. In this case the expected portfolio
return is E[X] + E[Y] = 0.042 and the portfolio variance is V(X + Y) = σx² + σy² +
2σxy = 0.1234, and hence the Sharpe ratio is 0.09962. Since the Sharpe ratio is
higher for the portfolio of 2 IBM stocks, it is advised to purchase 2 IBM shares.
c

Portfolio Optimization Problem

Suppose an investor has a budget of amount B. She is ready to invest the
money in n risky assets. Suppose that the purchase price of the ith stock
is \( P_{i,0} \) and she spends a proportion \( \omega_i \) of her budget on the ith stock. Thus
the number of units of the ith stock she holds is \( B\omega_i / P_{i,0} \). If she holds the
stock for 1 period of time (day/month/quarter) and then sells it at \( P_{i,1} \), her
gain on the ith stock is
\[ \frac{B\omega_i}{P_{i,0}}\,(P_{i,1} - P_{i,0}) = B\,\omega_i R_i, \]
where \( R_i = (P_{i,1} - P_{i,0})/P_{i,0} \) is the return. Thus the total expected gain is
\( B\,E\left[\sum_{i=1}^{n} \omega_i R_i\right] \) and the portfolio risk is
\[ B^2\,V\left(\sum_{i=1}^{n} \omega_i R_i\right) = B^2\left[\sum_{i=1}^{n} \omega_i^2\,V(R_i) + \sum_{i=1}^{n}\sum_{\substack{j=1 \\ j \neq i}}^{n} \omega_i \omega_j\,Cov(R_i, R_j)\right]. \]
The aim of the investor is to minimize portfolio risk by selecting the proportions
\( \omega_1, \omega_2, \dots, \omega_n \). Since the constant B does not influence the optimization
problem, we ignore it while stating the portfolio optimization problem. However,
this optimization problem is subject to the conditions that the expected return attains
a pre-determined value, i.e. \( \sum_{i=1}^{n} \omega_i E[R_i] = r_0 \), and that \( \sum_{i=1}^{n} \omega_i = 1 \). Notice that
here we do not require \( \omega_i > 0 \); \( \omega_i < 0 \) means that shorting is allowed.
Thus the problem is stated as
\[ \underset{\omega_1, \dots, \omega_n}{\arg\min}\ \sum_{i=1}^{n} \omega_i^2\,V(R_i) + \sum_{i=1}^{n}\sum_{\substack{j=1 \\ j \neq i}}^{n} \omega_i \omega_j\,Cov(R_i, R_j) \quad \text{s.t.} \quad \sum_{i=1}^{n} \omega_i E[R_i] = r_0 \ \text{ and } \ \sum_{i=1}^{n} \omega_i = 1 \]
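One standard way to solve this constrained problem is with Lagrange multipliers. The derivation below is a sketch in matrix notation, not the author's own solution. It assumes that the covariance matrix Σ of the returns (with entries Cov(R_i, R_j)) is invertible and that the vector of expected returns μ is not proportional to the vector of ones. The symbols Σ, μ, 1, λ1, λ2, A, B, C and D are introduced here for illustration; in particular B below is not the budget.

```latex
\begin{align*}
&\min_{\omega}\ \omega^\top \Sigma\, \omega
 \quad \text{s.t.} \quad \omega^\top \mu = r_0,\ \ \omega^\top \mathbf{1} = 1, \\
&\mathcal{L}(\omega, \lambda_1, \lambda_2)
 = \omega^\top \Sigma\, \omega
 - \lambda_1(\omega^\top \mu - r_0) - \lambda_2(\omega^\top \mathbf{1} - 1), \\
&\frac{\partial \mathcal{L}}{\partial \omega}
 = 2\Sigma\omega - \lambda_1 \mu - \lambda_2 \mathbf{1} = 0
 \ \Rightarrow\ \omega = \tfrac{1}{2}\,\Sigma^{-1}(\lambda_1 \mu + \lambda_2 \mathbf{1}), \\
&A = \mathbf{1}^\top \Sigma^{-1} \mathbf{1},\quad
 B = \mathbf{1}^\top \Sigma^{-1} \mu,\quad
 C = \mu^\top \Sigma^{-1} \mu,\quad
 D = AC - B^2, \\
&\tfrac{1}{2}\lambda_1 = \frac{A r_0 - B}{D},\qquad
 \tfrac{1}{2}\lambda_2 = \frac{C - B r_0}{D}, \\
&\omega^\ast = \frac{1}{D}\,\Sigma^{-1}\big[(A r_0 - B)\,\mu + (C - B r_0)\,\mathbf{1}\big].
\end{align*}
```

Substituting ω* back into the two constraints gives r_0 and 1 respectively, which is a quick way to check the multipliers.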

Value at Risk : VaR

Measurement of risk is a crucial step in taking positions in an investment
strategy. Two popular measures of risk are Value at Risk and Conditional
Value at Risk, or Expected Shortfall.
Value at Risk (VaR): The value at risk (VaR) at confidence level α within
time T is a value V such that the loss on the investment by time T does not
exceed V with probability α.
Here we are 100α% sure that the loss will not be more than V dollars in time
T. Another way to state VaR is "leaving out the bottom 100(1 − α)% of cases, P − V is
the worst value a portfolio (worth P dollars initially) can arrive at after time
T". The value at risk works as a "stop loss condition".
Notice that gain is defined as negative loss, and hence VaR can be defined in
terms of gain as well. We explain VaR using the following example.


EXAMPLE 1.4 VaR in terms of total loss

Suppose the gain from a portfolio worth $50 million during six months is
normally distributed with a mean of $2 million and a standard deviation of $10
million. Let us find the VaR (in terms of total loss/gain) at 99%.
Let G denote the gain from the portfolio in $ million, so that G ∼ N(2, 10²). Then
the loss L = −G ∼ N(−2, 10²). If V is the 99% VaR, then

\[ P[L \le V] = 0.99 \]
\[ \Rightarrow \Phi\left(\frac{V + 2}{10}\right) = \Phi(2.326) \]
\[ \Rightarrow V = -2 + 10 \times 2.326 = 21.26 \]

Hence, the loss will not exceed $21.26 million in the next six months with probability 99%,
and hence the worst value the portfolio can attain is (50 − 21.26) = $28.74 million
(barring the bottom 1% of cases) during the next 6 months.
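The same VaR can be obtained from the normal quantile function in Python's standard library. This is a minimal sketch for a normally distributed loss; the function name `normal_var` is introduced here, and amounts are in $ million as in the example.

```python
from statistics import NormalDist

def normal_var(mean_loss, sd_loss, confidence):
    """VaR of a normally distributed loss: the value V with P[L <= V] = confidence."""
    return NormalDist(mu=mean_loss, sigma=sd_loss).inv_cdf(confidence)

PORTFOLIO_VALUE = 50.0  # $ million

# Loss L = -G ~ N(-2, 10^2), as in Example 1.4.
var_99 = normal_var(mean_loss=-2.0, sd_loss=10.0, confidence=0.99)
worst_value = PORTFOLIO_VALUE - var_99

print(round(var_99, 2), round(worst_value, 2))
```

The result differs from the hand calculation only through the rounding of the critical value 2.326.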

EXAMPLE 1.5 VaR in terms of % loss

Find the VaR for an investment of $500000 at 99% over the next year.
That is, find out how low the value of this investment could be if we
rule out the lowest 1% of outcomes. The investment is expected to grow
during the year by 10% with sd 35%. Assume normality. What about
next month if the growth rate is the same across months? How small should the risk
be if the VaR needs to be reduced to $200000?

Let G denote the growth rate of the portfolio, so that G ∼ N(0.1, 0.35²).
Then the loss rate L = −G ∼ N(−0.1, 0.35²). If V is the 99% VaR, then

\[ P[L \le V] = 0.99 \]
\[ \Rightarrow \Phi\left(\frac{V + 0.1}{0.35}\right) = \Phi(2.326) \]
\[ \Rightarrow V = -0.1 + 0.35 \times 2.326 = 0.7141 \]

Hence, the loss will not exceed $500000 × 0.7141 = $357050 in the next year with
probability 99%, and hence the worst value the portfolio can attain is
$142950 (barring the bottom 1% of cases) during the next year.
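The annual calculation, and the last part of the question (how small the sd must be for a VaR of $200000), can be sketched as follows. The second part rearranges V = −µ + σz for σ while keeping the 10% mean growth fixed, which is an assumption made here; the constant names are also introduced for illustration. Because the script uses the unrounded critical value, the dollar figures differ slightly from those obtained with 2.326.

```python
from statistics import NormalDist

INVESTMENT = 500_000   # dollars
MEAN_GROWTH = 0.10     # expected annual growth rate
SD_GROWTH = 0.35       # sd of the annual growth rate
CONFIDENCE = 0.99

z = NormalDist().inv_cdf(CONFIDENCE)  # standard normal critical value

# Loss rate L = -G ~ N(-0.10, 0.35^2), so V = -0.10 + 0.35 * z.
var_rate = -MEAN_GROWTH + SD_GROWTH * z
var_dollars = INVESTMENT * var_rate
worst_value = INVESTMENT - var_dollars

# sd needed for a target VaR of $200000, holding the mean growth fixed:
# target_rate = -MEAN_GROWTH + sd * z  =>  sd = (target_rate + MEAN_GROWTH) / z
TARGET_VAR = 200_000
target_rate = TARGET_VAR / INVESTMENT
required_sd = (target_rate + MEAN_GROWTH) / z

print(round(var_rate, 4), round(var_dollars), round(required_sd, 4))
```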


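Notice that the 100(1 − α)% VaR (e.g. the 99% VaR, with α = 0.01) is essentially the upper 100α-th percentile (for the 99% VaR,
the upper 1st percentile), or critical value zα, of the loss distribution. Here α denotes the tail probability, whereas in the definition of VaR above it denoted the confidence level.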

Conditional VaR or Expected shortfall

Conditional VaR, or CVaR, or expected shortfall measures the expected loss
given that a loss greater than the VaR has been incurred. Thus CVaR measures
how much loss will be incurred if things do go that badly.
If the loss distribution is N(µ, σ²), then the CVaR at 100(1 − α)% confidence
is given by
\[ CVaR = \mu + \frac{\sigma}{\alpha}\,\varphi(z_\alpha), \]
where φ is the standard normal density and zα is the upper α critical value of the standard normal distribution.
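Applied to the loss distribution of Example 1.4, the CVaR formula can be evaluated with the standard library. This is a sketch for the normal case only; the function name `normal_cvar` is introduced here, and α is the tail probability (0.01 for 99% confidence).

```python
from statistics import NormalDist

def normal_cvar(mu, sigma, alpha):
    """CVaR of a N(mu, sigma^2) loss at 100(1 - alpha)% confidence:
    mu + sigma * phi(z_alpha) / alpha, with z_alpha the upper alpha point."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    return mu + sigma * std_normal.pdf(z_alpha) / alpha

# Loss L ~ N(-2, 10^2) in $ million, as in Example 1.4.
cvar_99 = normal_cvar(mu=-2.0, sigma=10.0, alpha=0.01)
var_99 = NormalDist(-2.0, 10.0).inv_cdf(0.99)

print(round(var_99, 2), round(cvar_99, 2))
```

The CVaR exceeds the VaR, as expected for an average over the losses beyond the VaR.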
Ex. 1 — A normal random variable X has mean 35 and standard deviation
10. Find a value x that has area 0.01 to its right.

Ex. 2 — For X ∼ N(32, 49), find x such that P(−x < X < x) = 0.99.

Ex. 3 — The daily price of coffee is approximately normally distributed
over a period of 15 days, with a mean in April 2007 of $1.35 per pound (on the
wholesale market) and a standard deviation of $0.15. Find a price such that
the probability that the price will go below it in the next 15 days is 0.90.

Ex. 4 — The outcomes for a one-year project are equally likely between
a $50 million loss and a $50 million gain. Find the 99% VaR.

Ex. 5 — The change in the value of a portfolio in three months is normally
distributed with a mean of $500000 and a standard deviation of $3 million.
Find the VaR at a confidence of 99%.

Ex. 6 — The change in the value of a portfolio in three months is normally
distributed with zero mean and a standard deviation of $2 million. Find
the VaR at a confidence of 97.5%.

    z       -1.6448   -1.96    -2.326   -2.576
    Φ(z)     0.05      0.025    0.01     0.005

Ex. 7 — X denotes the salary of a new employee, and Y denotes the cost of
the benefits, such as health insurance, that come with the position. Benefits
cost is a given percentage of the salary, so the cost of benefits is XY.


                          Salary (X)
 Benefits (Y)    80000    100000    150000
    0.3          0.3      0.3       0.15
    0.2          0.1      0.1       0.05

Table 1.6 Joint probability distribution of X and Y

a. Are X and Y independent random variables?

b. What is the expected percentage of total salary allocated to benefits?

c. What is the expected total cost of hiring a new employee?

Ex. 8 — If the variance of the sum is Var(X + Y) = 8 and Var(X) = Var(Y)
= 5, then what is the correlation between X and Y?

Ex. 9 — What is the covariance between a random variable X and a constant?
Ex. 10 — A quality control plan for an assembly line involves sampling 20
finished items per day and counting the number of defectives. The probability p
of producing a defective item is assumed to be a random variable (due
to chance causes/non-systematic error) with P[p = 0.01] = 0.2, P[p = 0.5] = 0.6,
P[p = 0.9] = 0.18, P[p = 0.95] = 0.02. Find the expected number of defectives.

Ex. 11 — Let X denote the number of Canon digital cameras sold during
a particular week by a certain store. The distribution of X is:

    x          0      1      2      3      4
    P[X = x]   0.1    0.2    0.3    0.25   0.15

Sixty percent of all customers who purchase these cameras also buy an extended
warranty. Let Y denote the number of purchasers during this week
who buy an extended warranty.

a. What is P(X = 4, Y = 2)?

b. Calculate P(X = Y).

c. Determine the joint probability table of X and Y and then the marginal
distribution of Y.

Harder Problem


Ex. 12 — The number of accidents on campus follows a Poisson distribution with
an average of per day. An accident turns serious with probability p. Find
the probability of observing k serious accidents in a day.
