Lecture 3 PDF

Understanding the inverse covariance matrix

The inverse covariance matrix I
- All allocations we have seen so far are functions of the inverse covariance matrix of returns
- This is the classic optimal weighting matrix for much of linear estimation, but it also carries important finance intuition
- It turns out that we can represent this matrix as the parameters from a system of linear regressions
Rasmus Lönn Portfolio Management FM21010 - Regression perspective and testing 7 / 48
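The inverse covariance matrix II
- For simplicity take two returns $r_1$ and $r_2$ such that,
\[
\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{1,2} \\ \sigma_{1,2} & \sigma_2^2 \end{pmatrix}
\]
- the inverse of this covariance matrix is then,
\[
\Sigma^{-1}
= \frac{1}{\sigma_1^2\sigma_2^2 - \sigma_{1,2}^2}
\begin{pmatrix} \sigma_2^2 & -\sigma_{1,2} \\ -\sigma_{1,2} & \sigma_1^2 \end{pmatrix}
= \begin{pmatrix}
\dfrac{\sigma_2^2}{\sigma_1^2\sigma_2^2 - \sigma_{1,2}^2} & \dfrac{-\sigma_{1,2}}{\sigma_1^2\sigma_2^2 - \sigma_{1,2}^2} \\[2ex]
\dfrac{-\sigma_{1,2}}{\sigma_1^2\sigma_2^2 - \sigma_{1,2}^2} & \dfrac{\sigma_1^2}{\sigma_1^2\sigma_2^2 - \sigma_{1,2}^2}
\end{pmatrix}
\]
- Notice in particular the recurring ratios of covariances to variances

The inverse covariance matrix III
- Take the simple linear regression $r_1 = a + \beta r_2 + \varepsilon_1$ and recall $V[\varepsilon_1] = \sigma_1^2 - \sigma_{1,2}^2/\sigma_2^2$; we can revise the diagonal and off-diagonal elements
\[
\frac{\sigma_2^2}{\sigma_1^2\sigma_2^2 - \sigma_{1,2}^2} = \frac{1}{V[\varepsilon_1]}
\quad\text{and}\quad
\frac{\sigma_{1,2}}{\sigma_1^2\sigma_2^2 - \sigma_{1,2}^2} = \frac{\sigma_{1,2}/\sigma_2^2}{V[\varepsilon_1]}
\]
- thus giving us a representation in terms of linear regression,
\[
\Sigma^{-1} = \begin{pmatrix}
V(\varepsilon_1)^{-1} & -\dfrac{\sigma_{1,2}}{\sigma_2^2}\, V(\varepsilon_1)^{-1} \\[2ex]
-\dfrac{\sigma_{1,2}}{\sigma_1^2}\, V(\varepsilon_2)^{-1} & V(\varepsilon_2)^{-1}
\end{pmatrix}
\]
- Hence, asymptotically we identify this inverse through 2 linear regressions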
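The two-asset identity above can be checked numerically. The following is a minimal sketch, assuming illustrative variance and covariance values that are not taken from the lecture; it builds the inverse from the two population regressions and compares it with a direct matrix inversion.

```python
import numpy as np

# Hypothetical inputs for two returns (illustrative numbers only).
s11, s22, s12 = 0.04, 0.09, 0.018
Sigma = np.array([[s11, s12], [s12, s22]])

# Population regression of r1 on r2, and of r2 on r1.
beta1 = s12 / s22            # slope of r1 on r2
beta2 = s12 / s11            # slope of r2 on r1
v1 = s11 - s12**2 / s22      # residual variance V[eps_1]
v2 = s22 - s12**2 / s11      # residual variance V[eps_2]

# Regression representation of the inverse covariance matrix.
Sigma_inv_reg = np.array([[1.0 / v1, -beta1 / v1],
                          [-beta2 / v2, 1.0 / v2]])

print(np.allclose(Sigma_inv_reg, np.linalg.inv(Sigma)))
```

Both off-diagonal entries coincide even though they come from different regressions, which is the symmetry of the inverse showing up in regression form.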
The inverse covariance matrix IV
- Expanding the set of returns to N gives us,
\[
r_i = a + \sum_{j \neq i} \beta_{i,j}\, r_j + \varepsilon_i .
\]
- Which then provides an $N \times N$ matrix,
\[
\Sigma^{-1} = \begin{pmatrix}
\theta_{1,1} & \theta_{1,2} & \dots & \theta_{1,N} \\
\theta_{2,1} & \theta_{2,2} & \dots & \theta_{2,N} \\
\vdots & \vdots & \ddots & \vdots \\
\theta_{N,1} & \theta_{N,2} & \dots & \theta_{N,N}
\end{pmatrix}
\]
- where $\theta_{i,i} = V(\varepsilon_i)^{-1}$ and $\theta_{i,j} = -\beta_{i,j} V(\varepsilon_i)^{-1}$

The inverse covariance matrix V
- For i.i.d. returns the tangency weight of asset i is proportional to $\mu_i / \sigma_i^2$
- When returns are correlated that weight is instead inversely proportional to $V(\varepsilon_i) = \sigma_i^2 (1 - R_i^2)$
- This is the non-diversifiable risk of asset i
- Linear regression minimizes the residual variance, hence this is the minimal non-diversifiable risk of i given a portfolio of the other $N - 1$ assets - this is understood as a hedging portfolio
- The off-diagonal elements $-\beta_{i,j} V(\varepsilon_i)^{-1}$ are the optimal hedging portfolio weight of asset j with respect to i
- A long position in i is optimally hedged by a short position of $\beta_{i,j} V(\varepsilon_i)^{-1}$ in j
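The N-asset construction can be sketched in code as follows. This is an illustration on simulated returns, not the lecture's data; the function name `precision_from_regressions` and the mixing matrix are assumptions made for the example. Each row of the inverse comes from one OLS regression of asset i on the remaining assets.

```python
import numpy as np

def precision_from_regressions(R):
    """Build the inverse sample covariance matrix row by row from N OLS regressions."""
    T, N = R.shape
    Theta = np.zeros((N, N))
    for i in range(N):
        others = [j for j in range(N) if j != i]
        X = np.column_stack([np.ones(T), R[:, others]])   # intercept + other returns
        coef, *_ = np.linalg.lstsq(X, R[:, i], rcond=None)
        resid = R[:, i] - X @ coef
        v = resid @ resid / (T - 1)       # residual variance, same scaling as np.cov
        Theta[i, i] = 1.0 / v             # theta_ii = V(eps_i)^{-1}
        Theta[i, others] = -coef[1:] / v  # theta_ij = -beta_ij V(eps_i)^{-1}
    return Theta

rng = np.random.default_rng(0)
# Simulated correlated returns (hypothetical mixing matrix).
R = rng.normal(size=(500, 4)) @ (np.eye(4) + 0.3)
Theta = precision_from_regressions(R)

print(np.allclose(Theta, np.linalg.inv(np.cov(R, rowvar=False))))
```

In-sample the match with the inverted sample covariance matrix is exact up to floating-point error, because OLS with an intercept uses the same sample moments.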
The inverse covariance matrix VI
- Returning to our fundamental portfolio allocations for N assets
- The global minimum variance allocation is forming a linear combination of N optimal hedging portfolios
\[
w_{gmv} = \frac{\Sigma^{-1}\iota}{\iota'\Sigma^{-1}\iota}
\]
- The tangency portfolio scales the combination of optimal hedges by their expected returns
\[
w_{tan} = \frac{\Sigma^{-1}\tilde{\mu}}{\iota'\Sigma^{-1}\tilde{\mu}}
\]
- The Efficient frontier is formed by a combination of the other two

The inverse covariance matrix VII
- Empirically we can exploit this regression representation of the inverse
- Estimating the system of linear regressions we can construct the inverse without performing the sometimes troublesome matrix inversion
- The difficulty is constructing the matrix such that it is symmetric and positive semi-definite
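As a small illustration of the two allocations, the sketch below computes both weight vectors for a hypothetical two-asset covariance matrix and excess-return vector. The numbers and the function names `gmv_weights` and `tangency_weights` are assumptions for the example.

```python
import numpy as np

def gmv_weights(Sigma):
    """Global minimum variance weights: Sigma^{-1} iota / (iota' Sigma^{-1} iota)."""
    iota = np.ones(Sigma.shape[0])
    x = np.linalg.solve(Sigma, iota)   # Sigma^{-1} iota without forming the inverse
    return x / x.sum()

def tangency_weights(Sigma, mu_excess):
    """Tangency weights: Sigma^{-1} mu / (iota' Sigma^{-1} mu)."""
    x = np.linalg.solve(Sigma, mu_excess)
    return x / x.sum()

# Hypothetical inputs (illustrative numbers only).
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
mu_excess = np.array([0.05, 0.07])

w_gmv = gmv_weights(Sigma)
w_tan = tangency_weights(Sigma, mu_excess)
print(w_gmv, w_tan)
```

Using `np.linalg.solve` rather than an explicit inverse is a common numerical choice; it does not change the weights.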
Regression Approaches to Portfolio Choice

On weights from linear regressions
- We can understand $\Sigma^{-1}$ as a system of linear regressions. Can we likewise represent the portfolio weights in linear regression form?
- Empirical portfolio weights can be computed as parameter estimates of linear regression models with/without restriction
- Advantages,
  - a battery of tools from regression analysis can be used (s.e., tests, goodness of fit measures)
  - models can be easily tested against each other
  - insights into empirical issues
  - straightforward extension to other estimation approaches (shrinkage estimation, Bayes)
Regression representation I
- One famous use of this type of regression perspective comes from Britten-Jones (1999),
- Tangency portfolio,
\[
w_{TAN} = \frac{\Sigma^{-1}\tilde{\mu}}{\iota'\Sigma^{-1}\tilde{\mu}}
\]
- Consider the following regression of a vector of ones on R, a $T \times N$ matrix of excess(!) returns,
\[
\iota = R b + u
\]
- Let $\hat{b}$ be the OLS estimate, then we have that
\[
w_{TAN} = \frac{\hat{b}}{\iota'\hat{b}}
\]

Regression representation II
- To see this, by some substitution we can write the OLS estimate as,
\[
\hat{b} = (R'R)^{-1} R'\iota = (\hat{\Sigma} + \bar{R}\bar{R}')^{-1} \bar{R}
\]
- Which by matrix inversion lemma (special case of what we briefly saw last lecture) is,
\[
\hat{b} = \left( \hat{\Sigma}^{-1} - \frac{\hat{\Sigma}^{-1}\bar{R}\bar{R}'\hat{\Sigma}^{-1}}{1 + \bar{R}'\hat{\Sigma}^{-1}\bar{R}} \right) \bar{R}
= \frac{\hat{\Sigma}^{-1}\bar{R}}{1 + \bar{R}'\hat{\Sigma}^{-1}\bar{R}}
\]
- Scale by $\iota'\hat{b}$ gives $\hat{\Sigma}^{-1}\bar{R} / (\iota'\hat{\Sigma}^{-1}\bar{R})$, which is the tangency portfolio weights
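The equivalence between the Britten-Jones regression and the plug-in tangency weights can be verified on simulated data. The sketch below is illustrative only; the return-generating numbers are assumptions, and the regression is run without an intercept, as on the slide.

```python
import numpy as np

rng = np.random.default_rng(1)
T, N = 1000, 3
# Simulated excess returns with hypothetical means and volatilities.
R = 0.01 + 0.05 * rng.normal(size=(T, N)) @ (np.eye(N) + 0.2)

# Regress a vector of ones on the excess returns (no intercept).
b_hat, *_ = np.linalg.lstsq(R, np.ones(T), rcond=None)
w_bj = b_hat / b_hat.sum()

# Plug-in tangency weights from the sample mean and covariance.
R_bar = R.mean(axis=0)
Sigma_hat = np.cov(R, rowvar=False, bias=True)   # divide by T, so R'R/T = Sigma_hat + R_bar R_bar'
x = np.linalg.solve(Sigma_hat, R_bar)
w_tan = x / x.sum()

print(np.allclose(w_bj, w_tan))
```

Because the weights are rescaled to sum to one, the choice between dividing the covariance by T or T - 1 does not affect the result.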
Regression representation III
- Although this is a really strange regression, it enables us to use standard OLS estimates to do inference on portfolio weights
- We can use standard t-tests/F-tests to test restrictions on portfolio weights
- Britten-Jones (1999) regression also plays a big role in asset pricing
- The Hansen-Jagannathan distance - misspecification in a stochastic discount factor model - is a function of this projection onto the excess return space
- Again, Asset pricing and Portfolio management are closely related

Regression representation VI
- Likewise, with a risk-free rate we can consider $\frac{1}{\gamma}\Sigma^{-1}\tilde{\mu}$
- For a linear regression setup, i.e.,
\[
y = Xw + \varepsilon .
\]
- We have the OLS estimator equal to $(X'X)^{-1}X'y$ for,
  - $X = \sqrt{\gamma}\,\Sigma^{1/2}$ is $N \times N$ matrix
  - $y = \frac{1}{\sqrt{\gamma}}\Sigma^{-1/2}\tilde{\mu}$ is $N \times 1$ vector
- Then the OLS estimator $\hat{w} = (X'X)^{-1}X'y = \frac{1}{\gamma}\Sigma^{-1}\tilde{\mu}$, i.e. it is equal to the mean variance portfolio with a risk free rate (no adding up constraint)
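The regression setup for the mean-variance portfolio with a risk-free rate can be sketched as below. The inputs are hypothetical, and writing the scaling coefficient as `gamma` is an assumption of this example; the symmetric square root of the covariance matrix is built from an eigendecomposition.

```python
import numpy as np

gamma = 4.0                                   # hypothetical scaling (risk-aversion) coefficient
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])              # hypothetical covariance matrix
mu_excess = np.array([0.05, 0.07])            # hypothetical expected excess returns

# Symmetric square root and inverse square root of Sigma.
vals, vecs = np.linalg.eigh(Sigma)
Sigma_half = vecs @ np.diag(np.sqrt(vals)) @ vecs.T
Sigma_neg_half = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

X = np.sqrt(gamma) * Sigma_half               # N x N "regressor" matrix
y = Sigma_neg_half @ mu_excess / np.sqrt(gamma)

w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
w_mv = np.linalg.solve(Sigma, mu_excess) / gamma

print(np.allclose(w_ols, w_mv))
```

Here $X'X = \gamma\Sigma$ and $X'y = \tilde{\mu}$, so the OLS formula reproduces the mean-variance weights without an adding-up constraint.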
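Regression representation IV
- Kempf-Memmel (2006) show that the GMV portfolio weights can be obtained by the following OLS regression,
\[
r_{t,1} = \mu_p + \sum_{j=2}^{N} w_j \,(r_{t,1} - r_{t,j}) + \varepsilon_{p,t} \qquad (1)
\]
- Where,
\[
\begin{aligned}
\hat{w}_j &= \text{estimated GMV weight } (j = 2, \dots, N) \\
\hat{w}_1 &= 1 - \sum_{j=2}^{N} \hat{w}_j \\
\mu_p &= w'\mu \quad \text{mean of GMVP} \\
\sigma_{\varepsilon}^2 &= w'\Sigma w \quad \text{variance of GMVP}
\end{aligned}
\]

Testing portfolio performance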
Literature
- By now we have introduced several portfolio allocations and factor models
- The models can help us estimate the inputs $(\mu, \Sigma)$
- The improved estimates should enable better performance from our portfolio allocation
- But how do we know if the gains are statistically significant?
- This is an old problem but as of now the main references are,
  - Ledoit, O. and Wolf, M. (2008). Robust performance hypothesis testing with the Sharpe ratio. Journal of Empirical Finance.
    - Matlab/R code is available on Michael Wolf's website
  - Ledoit, O. and Wolf, M. (2011). Robust performance hypothesis testing with the variance. Wilmott Magazine.
    - Matlab/R code is available on Michael Wolf's website
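Testing the Sharpe Ratio - The Problem
- Take two investment strategies i and j with excess returns $r_{t,i}$ and $r_{t,j}$ for $t = 1, \dots, T$
- Assume that $r_{t,i}$ and $r_{t,j}$ are strictly stationary processes with,
\[
\mu = \begin{pmatrix} \mu_i \\ \mu_j \end{pmatrix}
\quad\text{and}\quad
\Sigma = \begin{pmatrix} \sigma_i^2 & \sigma_{i,j} \\ \sigma_{i,j} & \sigma_j^2 \end{pmatrix}
\]
- The difference in Sharpe Ratios is given by,
\[
\Delta = SR_i - SR_j = \frac{\mu_i}{\sigma_i} - \frac{\mu_j}{\sigma_j} .
\]
- Its estimator using sample moments is given by,
\[
\hat{\Delta} = \widehat{SR}_i - \widehat{SR}_j = \frac{\hat{\mu}_i}{\hat{\sigma}_i} - \frac{\hat{\mu}_j}{\hat{\sigma}_j}
\]

Some notation
- Denote $E(r_{1,i}^2) = \gamma_i$ and $E(r_{1,j}^2) = \gamma_j$ with their estimates $\hat{\gamma}_i$ and $\hat{\gamma}_j$
- Further define $v = (\mu_i, \mu_j, \gamma_i, \gamma_j)'$ and $\hat{v} = (\hat{\mu}_i, \hat{\mu}_j, \hat{\gamma}_i, \hat{\gamma}_j)'$
- Such that $\Delta = f(v)$ and $\hat{\Delta} = f(\hat{v})$ with,
\[
f(a, b, c, d) = \frac{a}{\sqrt{c - a^2}} - \frac{b}{\sqrt{d - b^2}} \qquad (2)
\]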
The Solution I
- Under (relatively) mild regularity conditions on the higher moments of excess returns,
\[
\sqrt{T}\,(\hat{v} - v) \xrightarrow{d} N(0, \Psi)
\]
- $\Psi$ is an unknown symmetric positive semi-definite matrix
- These results largely date back to early time-series literature, e.g. Andrews (1991).
- Using the Delta method we further find,
\[
\sqrt{T}\,(\hat{\Delta} - \Delta) \xrightarrow{d} N\big(0, \nabla' f(v)\, \Psi\, \nabla f(v)\big) \qquad (3)
\]

The Solution II
- In our notation we now have the gradients,
\[
\nabla' f(a, b, c, d) = \left( \frac{c}{(c - a^2)^{1.5}},\; -\frac{d}{(d - b^2)^{1.5}},\; -\frac{1}{2}\frac{a}{(c - a^2)^{1.5}},\; \frac{1}{2}\frac{b}{(d - b^2)^{1.5}} \right)
\]
- If a consistent estimator $\hat{\Psi}$ is available, the standard error for $\hat{\Delta}$ is given by,
\[
s(\hat{\Delta}) = \sqrt{\frac{\nabla' f(\hat{v})\, \hat{\Psi}\, \nabla f(\hat{v})}{T}} . \qquad (4)
\]
- Hence, this is now a classic time-series econometrics exercise
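The gradient used in the Delta method can be sanity-checked against numerical differentiation. The sketch below is a check under assumed moment values (the vector `v` is hypothetical); it implements $f$ from Eq. (2) and its analytic gradient and compares the latter with central finite differences.

```python
import numpy as np

def f_sharpe(v):
    """Difference in Sharpe ratios as a function of (mu_i, mu_j, gamma_i, gamma_j)."""
    a, b, c, d = v
    return a / np.sqrt(c - a**2) - b / np.sqrt(d - b**2)

def grad_sharpe(v):
    """Analytic gradient of f_sharpe."""
    a, b, c, d = v
    return np.array([c / (c - a**2) ** 1.5,
                     -d / (d - b**2) ** 1.5,
                     -0.5 * a / (c - a**2) ** 1.5,
                     0.5 * b / (d - b**2) ** 1.5])

# Hypothetical first and second uncentered moments (illustrative only).
v = np.array([0.01, 0.008, 0.0026, 0.0017])

# Central finite differences, one coordinate at a time.
eps = 1e-7
numeric = np.array([(f_sharpe(v + eps * e) - f_sharpe(v - eps * e)) / (2 * eps)
                    for e in np.eye(4)])

print(np.allclose(numeric, grad_sharpe(v), rtol=1e-5))
```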
HAC Inference I
- From time-series econometrics we have a large set of estimators for $\Psi$
- A standard group of estimators is the heteroskedasticity and autocorrelation consistent (HAC) kernel estimators
- These use a kernel to weight observations in a manner that accommodates the dynamics. Common kernel choices include Bartlett (Newey-West), Parzen, Quadratic spectral
- Alternatively we can use bootstrapping - resample pairs of observations (with replacement) - to assess the sampling distribution

HAC Inference II
- Given a kernel, the standard error $s(\hat{\Delta})$ is obtained as in Eq. (4)
- A two-sided p-value for the null hypothesis $H_0 : \Delta = 0$ is given by,
\[
\hat{p} = 2\,\Phi\!\left( -\frac{|\hat{\Delta}|}{s(\hat{\Delta})} \right)
\]
where $\Phi(\cdot)$ denotes the c.d.f. of the standard normal distribution
- Alternatively, a $1 - \alpha$ confidence interval for $\Delta$ is given by,
\[
\hat{\Delta} \pm z_{1 - \alpha/2}\, s(\hat{\Delta})
\]
where $z_{\lambda}$ denotes the $\lambda$ quantile of the standard normal distribution
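The HAC test can be sketched end to end as follows. This is a simplified illustration: it uses a Bartlett (Newey-West) kernel with a fixed, hand-picked number of lags instead of a data-dependent bandwidth, and the function names and simulated returns are assumptions made for the example.

```python
import math
import numpy as np

def hac_bartlett(Y, lags):
    """Newey-West (Bartlett kernel) long-run covariance of the rows of Y."""
    T = Y.shape[0]
    Yc = Y - Y.mean(axis=0)
    Psi = Yc.T @ Yc / T
    for k in range(1, lags + 1):
        G = Yc[k:].T @ Yc[:-k] / T                 # autocovariance at lag k
        Psi += (1 - k / (lags + 1)) * (G + G.T)    # Bartlett weight
    return Psi

def sharpe_diff_test(ri, rj, lags=4):
    """Return (Delta_hat, s(Delta_hat), two-sided p-value) for H0: Delta = 0."""
    T = len(ri)
    a, b = ri.mean(), rj.mean()
    c, d = (ri**2).mean(), (rj**2).mean()
    delta = a / math.sqrt(c - a**2) - b / math.sqrt(d - b**2)
    grad = np.array([c / (c - a**2) ** 1.5, -d / (d - b**2) ** 1.5,
                     -0.5 * a / (c - a**2) ** 1.5, 0.5 * b / (d - b**2) ** 1.5])
    Psi = hac_bartlett(np.column_stack([ri, rj, ri**2, rj**2]), lags)
    se = math.sqrt(grad @ Psi @ grad / T)
    p = math.erfc(abs(delta) / se / math.sqrt(2))  # equals 2 * Phi(-|delta| / se)
    return delta, se, p

rng = np.random.default_rng(3)
ri = 0.01 + 0.05 * rng.normal(size=240)            # simulated excess returns
rj = 0.01 + 0.05 * rng.normal(size=240)
delta_hat, se, p_value = sharpe_diff_test(ri, rj)
print(delta_hat, se, p_value)
```

Swapping in a Parzen or quadratic spectral kernel only changes the weights inside `hac_bartlett`; the rest of the calculation stays the same.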
Bootstrap Method
- Kernel based HAC estimators sometimes have poor small-sample properties
- Consider studentized bootstrap confidence interval for some $\mathcal{L}$ distribution,
\[
\mathcal{L}\!\left( \frac{|\hat{\Delta}^{*} - \hat{\Delta}|}{s(\hat{\Delta}^{*})} \right)
\]
where $\hat{\Delta}^{*}$ is the bootstrap estimate and $s(\hat{\Delta}^{*})$ is the bootstrap standard error
- Remember that returns are not i.i.d. - account for time-series nature with block sampling

Ledoit and Wolf (2008) simulation study
- Ledoit and Wolf (2008) set up a simulation study to evaluate their method
- They consider 6 return DGPs; normal-iid, fat-tailed iid, normal-GARCH, fat-tailed GARCH, normal-serially correlated and fat-tailed correlated
- They use 4 variations of their test; standard HAC, pre-whitened HAC, iid bootstrap and time-series bootstrap
- As a benchmark they include one of the early tests for equal Sharpe ratios, the Jobson-Korkie test
- They generate samples of 120 observations
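The block-sampling idea from the Bootstrap Method slide can be sketched as below. This is a deliberately simplified variant: a circular block bootstrap with a fixed block length, and a non-studentized statistic (the studentized version would additionally divide by a bootstrap standard error). The function names, block length and simulated data are assumptions for illustration.

```python
import numpy as np

def sharpe_diff(ri, rj):
    """Sample difference in Sharpe ratios."""
    return ri.mean() / ri.std() - rj.mean() / rj.std()

def circular_block_indices(T, block, rng):
    """Indices for one resample made of blocks that wrap around the sample end."""
    n_blocks = -(-T // block)                       # ceil(T / block)
    starts = rng.integers(0, T, size=n_blocks)
    idx = (starts[:, None] + np.arange(block)[None, :]) % T
    return idx.ravel()[:T]

def bootstrap_pvalue(ri, rj, block=5, n_boot=499, seed=0):
    """Two-sided bootstrap p-value for H0: equal Sharpe ratios (non-studentized)."""
    rng = np.random.default_rng(seed)
    T = len(ri)
    d_hat = sharpe_diff(ri, rj)
    d_star = np.empty(n_boot)
    for m in range(n_boot):
        idx = circular_block_indices(T, block, rng)
        d_star[m] = sharpe_diff(ri[idx], rj[idx])   # resample pairs jointly
    exceed = np.sum(np.abs(d_star - d_hat) >= abs(d_hat))
    return d_hat, (exceed + 1) / (n_boot + 1)

rng = np.random.default_rng(5)
ri = 0.01 + 0.05 * rng.normal(size=120)             # simulated excess returns
rj = 0.01 + 0.05 * rng.normal(size=120)
d_hat, p_boot = bootstrap_pvalue(ri, rj)
print(d_hat, p_boot)
```

Resampling the same time indices for both series keeps the cross-correlation between the two strategies intact, while the blocks preserve short-run serial dependence.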
Ledoit and Wolf (2008) simulation study

O. Ledoit, M. Wolf / Journal of Empirical Finance 15 (2008) 850–859, p. 857

Table 1
Empirical rejection probabilities (in percent) for various data generating processes (DGPs) and inference methods; see Section 4 for a description

DGP              JKM    HAC    HACPW  Boot-IID  Boot-TS
Nominal level α = 1%
Normal-IID       1.2    1.2    1.2    1.1       1.0
t6-IID           3.5    1.9    2.1    1.4       1.3
Normal-GARCH     1.7    1.8    1.8    1.5       1.1
t6-GARCH         1.8    2.0    2.0    1.6       1.2
Normal-VAR       2.5    2.2    1.8    2.7       1.2
t6-VAR           6.4    2.6    2.2    1.8       1.1
Nominal level α = 5%
Normal-IID       5.0    5.3    5.4    4.9       4.8
t6-IID          10.7    6.7    6.9    5.2       5.0
Normal-GARCH     7.2    7.1    7.2    6.0       5.5
t6-GARCH         7.4    7.7    7.5    6.9       5.7
Normal-VAR       9.5    6.9    6.1    8.5       5.0
t6-VAR          14.5    7.9    7.3    7.3       5.1
Nominal level α = 10%
Normal-IID      10.3   10.3   10.7   10.1       9.6
t6-IID          17.9   12.4   12.5   10.3       9.9
Normal-GARCH    12.8   12.5   12.3   12.4      10.5
t6-GARCH        13.7   13.3   13.1   13.1      11.1
Normal-VAR      15.6   12.4   10.8   15.6       9.7
t6-VAR          22.5   13.3   12.0   13.3       9.8

For each DGP, the null hypothesis of equal Sharpe ratios is true and so the empirical rejection probabilities should be compared to the nominal level of the test, given by α. We consider three values of α, namely α = 1%, 5% and 10%. All empirical rejection probabilities are computed from 5000 repetitions of the underlying DGP, and the same set of repetitions is shared by all inference methods.

Table 2
Summary sample statistics for monthly log returns in excess of the risk-free rate: mean, standard deviation, Sharpe ratio, and first-order autocorrelation

Fund                         Mean    Std. dev.  Sharpe ratio  Autocorr.
Fidelity                     0.511   4.760      0.108         −0.010
Fidelity Aggressive Growth   0.098   9.161      0.011          0.090
Coast Enhanced Income        0.245   0.168      1.461          0.152
JMG Capital Partners         1.228   1.211      1.014          0.435

Testing Variances - The Problem
- Turning away from Sharpe ratios, how should we treat volatilities?
- Again, two strategies i and j with excess returns $r_{t,i}$ and $r_{t,j}$ for $t = 1, \dots, T$
- Both assumed strictly stationary with,
\[
\mu = \begin{pmatrix} \mu_i \\ \mu_j \end{pmatrix}
\quad\text{and}\quad
\Sigma = \begin{pmatrix} \sigma_i^2 & \sigma_{i,j} \\ \sigma_{i,j} & \sigma_j^2 \end{pmatrix}
\]
- The ratio of the two variances is given by
\[
\Theta = \frac{\sigma_i^2}{\sigma_j^2}
\]
- Testing $H_0 : \Theta = 1$ vs. $H_1 : \Theta \neq 1$ with an F-test is not suitable for this data
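The Solution I
- Rewrite the testing problem to define,
\[
\Delta = \log \Theta = \log \sigma_i^2 - \log \sigma_j^2
\]
- Then the previous testing problem is equivalent to $H_0 : \Delta = 0$ vs $H_1 : \Delta \neq 0$
- Using the same notation as before $v = (\mu_i, \mu_j, \gamma_i, \gamma_j)'$ and $\hat{v} = (\hat{\mu}_i, \hat{\mu}_j, \hat{\gamma}_i, \hat{\gamma}_j)'$ we find
\[
\Delta = f(v) \quad\text{and}\quad \hat{\Delta} = f(\hat{v})
\]
where $f(a, b, c, d) = \log(c - a^2) - \log(d - b^2)$

The Solution II
- Just as before, imposing relatively mild regularity conditions we have
\[
\sqrt{T}\,(\hat{v} - v) \xrightarrow{d} N(0, \Psi)
\]
- Delta method implies,
\[
\sqrt{T}\,(\hat{\Delta} - \Delta) \xrightarrow{d} N\big(0, \nabla' f(v)\, \Psi\, \nabla f(v)\big) \qquad (5)
\]
where
\[
\nabla' f(a, b, c, d) = \left( -\frac{2a}{c - a^2},\; \frac{2b}{d - b^2},\; \frac{1}{c - a^2},\; -\frac{1}{d - b^2} \right)
\]
\[
s(\hat{\Delta}) = \sqrt{\frac{\nabla' f(\hat{v})\, \hat{\Psi}\, \nabla f(\hat{v})}{T}} \qquad (6)
\]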
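The log-variance statistic and its gradient can be checked in the same way as for the Sharpe ratio. The sketch below uses hypothetical moment values and simulated returns; it compares the analytic gradient with central finite differences and confirms that $f(\hat{v})$ equals the log of the sample variance ratio.

```python
import numpy as np

def f_logvar(v):
    """Log variance ratio as a function of (mu_i, mu_j, gamma_i, gamma_j)."""
    a, b, c, d = v
    return np.log(c - a**2) - np.log(d - b**2)

def grad_logvar(v):
    """Analytic gradient of f_logvar."""
    a, b, c, d = v
    return np.array([-2 * a / (c - a**2),
                     2 * b / (d - b**2),
                     1.0 / (c - a**2),
                     -1.0 / (d - b**2)])

# Hypothetical first and second uncentered moments (illustrative only).
v = np.array([0.01, 0.008, 0.0026, 0.0017])
eps = 1e-7
numeric = np.array([(f_logvar(v + eps * e) - f_logvar(v - eps * e)) / (2 * eps)
                    for e in np.eye(4)])
print(np.allclose(numeric, grad_logvar(v), rtol=1e-5))

# On simulated returns, f(v_hat) is the log of the ratio of sample variances.
rng = np.random.default_rng(6)
ri = 0.05 * rng.normal(size=120)
rj = 0.04 * rng.normal(size=120)
v_hat = np.array([ri.mean(), rj.mean(), (ri**2).mean(), (rj**2).mean()])
delta_hat = f_logvar(v_hat)
print(np.isclose(delta_hat, np.log(ri.var() / rj.var())))
```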
HAC Inference I (similar to the Sharpe Ratio)
- Find consistent estimator $\hat{\Psi}$, for example robust kernel estimator
- A two-sided p-value for the null hypothesis $H_0 : \Delta = 0$ by
\[
\hat{p} = 2\,\Phi\!\left( -\frac{|\hat{\Delta}|}{s(\hat{\Delta})} \right),
\]
where $\Phi(\cdot)$ denotes the c.d.f. of the standard normal distribution
- Alternatively, a $1 - \alpha$ confidence interval for $\Delta$ by,
\[
\hat{\Delta} \pm z_{1 - \alpha/2}\, s(\hat{\Delta}),
\]
where $z_{\lambda}$ denotes the $\lambda$ quantile of the standard normal distribution

Bootstrap Method
- Similar to the Sharpe ratio case, kernel HAC inference is often liberal in small samples
- Consider bootstrap alternative, with proper account given to time-series nature of the data

Simulation study comparison
- Comparing the empirical rejection probabilities in simulation, (for T = 120)

Table 1 (Ledoit and Wolf, 2011, Wilmott magazine): Empirical rejection probabilities (in percent) for various data-generating processes (DGPs) and inference methods; see Section 4 for a description. For each DGP, the null hypothesis of equal variances is true and so the empirical rejection probabilities should be compared to the nominal level of the test, given by α. We consider three values of α, namely α = 1%, 5%, and 10%. All empirical rejection probabilities are computed from 5,000 repetitions of the underlying DGP, and the same set of repetitions is shared by all inference methods.

DGP              F      HAC    HACpw  Boot-IID  Boot-TS
Nominal level α = 1%
Normal-IID       0.2    1.2    1.4    0.9       0.9
t6-IID           4.2    1.5    1.7    0.8       0.8
Normal-GARCH     0.4    1.4    1.3    1.0       0.9
t6-GARCH         0.3    1.5    1.5    1.0       1.0
Normal-VAR       0.5    2.1    2.0    1.6       0.9
t6-VAR           3.8    2.1    2.0    1.1       1.0
Nominal level α = 5%
Normal-IID       2.4    6.1    6.1    5.1       4.9
t6-IID          11.5    6.8    7.0    4.9       4.7
Normal-GARCH     2.1    5.4    5.5    5.0       4.8
t6-GARCH         2.4    5.7    5.9    5.1       5.0
Normal-VAR       3.1    7.2    6.7    6.4       4.8
t6-VAR          10.9    6.9    6.5    5.3       4.9
Nominal level α = 10%
Normal-IID       5.9   11.3   11.1   10.2       9.8
t6-IID          18.3   11.4   10.4   10.1       9.7
Normal-GARCH     5.6   10.8   11.0   10.2      10.1
t6-GARCH         6.0   10.9   11.2   10.1       9.8
Normal-VAR       7.3   12.4   11.7   12.0       9.9
t6-VAR          17.8   12.4   12.0   10.2      10.0

- Again, the time-series bootstrap is generally reliable, there is some loss under HAC class estimators

An empirical example

Empirical exercise on industry sorts
- To summarize the main parts of what we have covered so far we consider 49 industry sorts from Kenneth French's data library
- The sorts give us daily value-weighted return, we consider the time period between 2010 and 2017
- We consider 2 allocations; minimum-variance, tangency
- We start by simply plugging the sample moments without using any factor structure

Value-weighted industry sorts - Basic descriptive stats
- The following table provides a general overview of the asset returns

N = 49     Mean   St.dev   Min    Median   Max    market
Average    0.14   0.06     0.17   0.14     0.26   0.13
Vol.       0.21   0.07     0.13   0.20     0.49   0.16
|SR|       0.72   0.28     0.08   0.77     1.30   0.84

- Average refers to the time series average of returns
- Vol. is the time series volatility of returns
- All values are annualized
Allocations and performance
- We form two allocations optimized daily
\[
w_{tan} = \frac{\Sigma^{-1}\tilde{\mu}}{\iota'\Sigma^{-1}\tilde{\mu}}
\qquad
w_{gmv} = \frac{\Sigma^{-1}\iota}{\iota'\Sigma^{-1}\iota}
\]
- To estimate the inputs $\Sigma^{-1}$ and $\tilde{\mu}$ we use the sample mean and sample variance-covariance matrix over a rolling window of 500 days

Performance
- Minimum-variance and mean-variance portfolio returns over time
- Annualized Sharpe ratios are $SR_{tan} = 0.72$ and $SR_{gmv} = 1.18$

Imposing a Fama-French structure
- Can we improve the gmv allocation by using the Fama-French 5-factor model?
\[
\mu^{*} = a + B\mu_f \quad\text{and}\quad \Sigma^{*} = B\Sigma_f B' + \Sigma_{\varepsilon}
\]
- Let $\Sigma_{\varepsilon}$ be diagonal such that all covariance stems from the factors
- Estimate $\Sigma_f$ with the sample covariance matrix and a and B by linear regression
- Update the allocation,
\[
w^{*}_{gmv} = \frac{\Sigma^{*-1}\iota}{\iota'\Sigma^{*-1}\iota}
\]

Performance
- Minimum-variance returns with and without factor structure over time
- Annualized Sharpe ratios are $SR_{gmvf} = 1.272$ and $SR_{gmv} = 1.18$
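The factor-structured covariance matrix can be sketched as below. The example uses simulated factors and loadings rather than the Fama-French data, and the function name `factor_covariance` is an assumption made for illustration. Loadings come from OLS of each asset on the factors, and the residual covariance is restricted to be diagonal.

```python
import numpy as np

def factor_covariance(R, F):
    """Sigma* = B Sigma_f B' + diag(residual variances), with B from OLS of R on F."""
    T = R.shape[0]
    X = np.column_stack([np.ones(T), F])            # intercept a plus factors
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)    # first row a, remaining rows loadings
    B = coef[1:].T                                  # N x K loading matrix
    resid = R - X @ coef
    Sigma_f = np.atleast_2d(np.cov(F, rowvar=False))
    Sigma_eps = np.diag(resid.var(axis=0, ddof=X.shape[1]))
    return B @ Sigma_f @ B.T + Sigma_eps

# Simulated data: 2 hypothetical factors driving 6 assets.
rng = np.random.default_rng(4)
T, N, K = 500, 6, 2
F = 0.01 * rng.normal(size=(T, K))
B_true = rng.uniform(0.5, 1.5, size=(N, K))
R = F @ B_true.T + 0.02 * rng.normal(size=(T, N))

Sigma_star = factor_covariance(R, F)
x = np.linalg.solve(Sigma_star, np.ones(N))
w_gmv_star = x / x.sum()                            # GMV weights under the factor structure
print(w_gmv_star)
```

With a diagonal residual term the resulting matrix is positive definite whenever all residual variances are positive, which makes the inversion step well behaved even when N is large relative to the window length.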
Significance test I
- There appear to be some gains from imposing the factor structure
- Are they also statistically significant?
- Let $H_0 : \Delta = 0$ against a two-sided alternative and,
\[
\hat{\Delta} = \widehat{SR}_i - \widehat{SR}_j = \frac{\hat{\mu}_i}{\hat{\sigma}_i} - \frac{\hat{\mu}_j}{\hat{\sigma}_j}
\]
\[
s(\hat{\Delta}) = \sqrt{\frac{\nabla' f(\hat{v})\, \hat{\Psi}\, \nabla f(\hat{v})}{T}}
\]
- where we estimate $\Psi$ with a HAC estimator under a Parzen kernel

Significance test II
- Without annualization we find $\hat{\Delta} = 0.0062$ and $s(\hat{\Delta}) = 0.017$
- Which gives a p-value around 0.70 so we fail to reject our null hypothesis
- What could be the next step to improve? Is the diagonal $\Sigma_{\varepsilon}$ too restrictive?
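The reported p-value can be reproduced from the two numbers on the slide with the normal approximation. This is a short arithmetic check using only the Python standard library.

```python
import math

delta_hat = 0.0062   # estimated difference in (non-annualized) Sharpe ratios
se = 0.017           # HAC standard error reported on the slide

z = delta_hat / se
# Two-sided p-value 2 * Phi(-|z|), written with the complementary error function.
p_value = math.erfc(abs(z) / math.sqrt(2))
print(round(z, 3), round(p_value, 2))
```

The test statistic is well below one in absolute value, so the p-value lands near 0.7, in line with the figure quoted on the slide.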
To think on - Regarding tangency portfolios
- It seems like the tangency allocation $\Sigma^{-1}\tilde{\mu} / (\iota'\Sigma^{-1}\tilde{\mu})$ performs below expectations. It should provide an optimal trade-off between variance and mean returns
- Could we solve this by performing the Britten-Jones regression and allocating wealth in accordance with the assets' respective $\hat{b}$?
