Cross Validation Based Transfer Learning for Financial Covariance Estimation: A Data-Driven Approach
Torsten Mörstedt
Deka Investment GmbH*, Mainzer Landstr. 16, 60325 Frankfurt, Germany, torsten.moerstedt@deka.de.
Bernhard Lutz, Dirk Neumann

University of Freiburg, Rempartstr. 16, 79085 Freiburg, Germany, bernhard.lutz@is.uni-freiburg.de,
dirk.neumann@is.uni-freiburg.de
Existing studies on covariance estimation generally assume that the future covariance matrix must be
estimated based only on the limited history of a given set of portfolio constituents. In this study, we propose
a new perspective on how to estimate the covariance matrix. We present a purely data-driven approach
that selects the estimation parameters using cross validation to be historically optimal on a disjoint transfer
set of assets according to the given objective. The proposed approach additionally uses a second shrinkage
target that is determined based on how much the sample eigenvalues are imbalanced according to their Gini
coefficient. Our empirical evaluation based on a total of six stock market indices shows that the proposed
approach outperforms established estimators in minimizing variance and maximizing risk-adjusted return.
The second shrinkage target is particularly relevant for high-dimensional covariance matrices where the
number of assets is greater than the number of historic datapoints. To the best of our knowledge, this study
is the first to apply the concept of transfer learning to the problem of covariance estimation.
Key words : Covariance estimation, transfer learning, non-linear shrinkage, second shrinkage target,
data-driven approach, portfolio optimization
* Views expressed in this paper are those of the author and do not necessarily reflect those of Deka Investment or its
employees.
Electronic copy available at: https://ssrn.com/abstract=3986993
1. Introduction
Covariance estimation is a well-studied problem in the finance and operations research literature. An
accurate estimate of the future covariance is required for standard portfolio optimization procedures
like minimizing variance and maximizing risk-adjusted return (Markowitz 1952). The task is to
estimate the future covariance of a given portfolio of N assets based on the history of
T datapoints. However, the sample covariance is generally a poor estimate of the future covariance,
which leads to deteriorated out-of-sample performance (Frost and Savarino 1986, Jorion 1986,
Michaud 1989, Olivares-Nadal and DeMiguel 2018). As a remedy, researchers have proposed several
approaches to improve the sample covariance estimate. The proposed methods can be grouped into
(i) explicit (non-)linear shrinkage (e.g., Ledoit and Wolf 2003, 2017b, Engle et al. 2019), (ii) implicit
shrinkage by combining sample and reference weights (e.g., Frahm and Memmel 2010, Bodnar
et al. 2018), (iii) thresholding (e.g., Bickel and Levina 2008, Fan et al. 2013) and signal/noise
separation (e.g., Laloux et al. 1999, Plerou et al. 2002, Zhao et al. 2019), (iv) factor models (e.g.,
Fan et al. 2011, 2018, De Nard et al. 2019), or (v) regularization of the inverse covariance matrix
(e.g., Friedman et al. 2008, Lam and Fan 2009, Cai et al. 2011, Nguyen et al. 2021). Other studies
have proposed estimation error reduction through constrained portfolio optimization in contrast to
covariance adjustments (e.g., Jagannathan and Ma 2003, DeMiguel et al. 2009a, Fan et al. 2012).
Extensive literature reviews and empirical studies of covariance estimation can be found in Fan
et al. (2016), Ledoit and Wolf (2017a) and Ledoit and Wolf (2021a).
However, all of the aforementioned studies considered the problem of covariance estimation from
a rather restrictive perspective, where the only available real-world data for parameter selection is
given by the history of the actual portfolio constituents. To determine the estimation parameters
such as shrinkage intensities, prior studies applied loss functions with statistically derived priors for
an unknown oracle covariance, e.g., by using random matrix theory (most recently Ledoit and Wolf
2021b). In a similar fashion, using sample data only, others calculated in-sample portfolio volatility
to directly determine the shrinkage intensity (e.g., Frahm and Memmel 2010, Bodnar et al. 2018)
or to identify signal/noise separation (Zhao et al. 2019).
1.1. Our Approach
In this study, we propose an entirely different perspective on financial covariance estimation.
Following prior studies, we also assume that the future covariance must be estimated for N assets
based on only T datapoints. However, we assume that we are additionally provided with other
stock market time-series data from a disjoint set of assets to select the estimation parameters.
From a real-world perspective, the history of joint datapoints may be reduced due to company
mergers and acquisitions, splits, or one of the portfolio constituents being recently listed on the stock
market. In this case, the future covariance matrix must be estimated based on the comparatively short
history of the portfolio constituents. We then argue that financial investors can rely on additional
time-series data from a disjoint set of assets that are not part of the portfolio. More precisely, we
argue that we can select the estimation parameters using cross validation based on the history of
other assets. A major advantage of applying cross validation for parameter selection is that we can
apply this method to estimate the future covariance according to different optimization objectives,
such as minimum variance or maximum risk-adjusted return. The historically optimal parameters
are subsequently used to estimate the actual covariance based on the given assets and datapoints.
The concept of “transfer learning” is well known in the context of machine learning (e.g., Bastani
2021, Pan and Yang 2009) or the pricing literature (e.g., Bastani et al. 2021). Yet, we are not aware
of any approach that applies the same idea to the selection of covariance estimation parameters
such as shrinkage intensities and shrinkage targets.
We present a novel covariance estimator based on non-linear shrinkage, which we call “Cross
Validation based Transfer Learning” (CVTL). The approach can be attributed to the group of
rotation equivariant non-linear shrinkage estimators. This means that the sample eigenvectors are
left unchanged, while the sample eigenvalues are shrunk, in our case, against the mean eigenvalue
and a second non-linear shrinkage target. The second shrinkage target is determined by a two-step
process. First, we scale the sample eigenvalue imbalance measured by their Gini coefficient to a
desired quantity. Second, we search for a Beta distribution, for which the corresponding cumulative
distribution function achieves the desired Gini coefficient. All parameters are selected by cross
validation to be historically optimal on a disjoint history with respect to the given objective.
We recommend the disjoint history to contain at least one additional year of trading data. This
allows for performing at least twelve train-evaluate iterations during cross validation with monthly
rebalancing to achieve a stable parameter configuration. The objective can be chosen freely and it
can be completely tailored towards the overarching portfolio optimization goals, e.g., minimizing
variance or maximizing risk-adjusted return. Therefore, our approach is purely data-driven and
agnostic with respect to the desired eigenvalue distribution and optimization objective. In case
of maximizing risk-adjusted return, our approach also operates on the covariance estimate only,
without having to estimate future returns. Instead, the weights are always calculated according to
the Markowitz global minimum variance portfolio (Markowitz 1952) but the eigenvalues are set to
be historically optimal under the desired objective function on the disjoint dataset.
Our numerical evaluation based on several global stock indices as well as aggregated industry
portfolios shows that CVTL is superior to established covariance estimators in minimizing variance
and maximizing risk-adjusted return. The latter holds in particular when additionally accounting
for transaction costs. The second shrinkage target is assigned a non-zero shrinkage intensity when
estimating high-dimensional covariance matrices where the number of assets is greater than the
number of datapoints. However, the intensity of the second shrinkage target decreases if the number
of datapoints increases relative to the number of assets. In addition, we empirically compare the
resulting eigenspectra of CVTL with the analytically derived spectra from the non-linear shrinkage
approaches QuEST and BN. The results indicate a similar eigenspectrum, while CVTL tends to
increase instead of decrease the largest eigenvalue for high-dimensional covariance matrices.
1.2. Contributions
We provide a novel perspective on covariance estimation by proposing a purely data-driven estimator
CVTL that combines cross validation and transfer learning. We show that transfer learning allows us
to replace the traditional loss functions based on theoretically derived priors of the oracle covariance
with a flexible objective function tailored to the overarching optimization problem. In contrast
to existing approaches, CVTL is agnostic towards the desired eigenspectrum as all estimation
parameters are selected based on a disjoint dataset. We further outline a novel non-linear shrinkage
target based on the cumulative distribution function of a Beta distribution to achieve a desired
Gini coefficient. The second shrinkage target is particularly useful for high-dimensional estimation
problems with N > T . In addition, we demonstrate the flexibility of CVTL by setting the objective
function towards maximizing risk-adjusted return. However, we do not explicitly estimate future
returns as we calculate the portfolio weights solely based on the formula of the GMV portfolio,
which also presents a novel approach.
1.3. Outline
The remainder of this paper is structured as follows. Section 2 introduces the preliminaries on
financial covariance estimation. In addition, it presents a brief explanation of existing shrinkage
methods. Section 3 presents our covariance estimator that applies cross validation based transfer
learning. In particular, it presents a detailed description of how to select the shrinkage parameters.
Section 4 describes our empirical evaluation including the competing approaches and the datasets
used for parameter estimation and out-of-sample evaluation. Section 5 presents the results. We
first present the main results for the objectives minimum variance and maximum risk-adjusted
return. Second, we present several sensitivity analyses, where we show how varying the number of
cross validations and the parameter search space influences out-of-sample performance. Third, we
show how our approach is linked to established covariance estimators from an empirical perspective.
Finally, Section 6 concludes and provides an outlook on future research.
2. Preliminaries
Portfolio optimization aims at finding the optimal allocation of weights $w = (w_1, \ldots, w_N)$ for a
portfolio consisting of $N$ assets. The global minimum variance (GMV) portfolio allocation minimizes
the portfolio risk (Markowitz 1952). Let $\Sigma \in \mathbb{R}^{N \times N}$ denote the future covariance matrix and $\mathbf{1}$ the
vector consisting of $N$ ones. The weights of the GMV portfolio are given by
$$w = \frac{\Sigma^{-1} \mathbf{1}}{\mathbf{1}' \Sigma^{-1} \mathbf{1}}. \tag{1}$$
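As a small numerical illustration of (1), the following Python sketch computes GMV weights for a given covariance matrix. The use of NumPy, the function name `gmv_weights`, and the example matrix are illustrative assumptions and not part of the original study.

```python
import numpy as np


def gmv_weights(cov: np.ndarray) -> np.ndarray:
    """Global minimum variance weights w = S^{-1} 1 / (1' S^{-1} 1), cf. (1)."""
    ones = np.ones(cov.shape[0])
    # Solve S x = 1 instead of forming the inverse explicitly.
    x = np.linalg.solve(cov, ones)
    return x / (ones @ x)


# Hypothetical 3-asset covariance matrix (illustrative numbers only).
cov = np.array([
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.16],
])
w = gmv_weights(cov)
```

By construction the weights sum to one, and for a diagonal covariance matrix they are proportional to the inverse variances.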
However, the future covariance matrix Σ is an unknown parameter that needs to be estimated
based on historic data. Let $X^{T,N} = x_1, \ldots, x_T$ with $x_t \in \mathbb{R}^N$ denote the sequence of historic returns
used for covariance estimation. The average return vector is given as $\bar{x} = \frac{1}{T} \sum_{t=1}^{T} x_t$. The sample
covariance matrix estimate $\widehat{\Sigma} \in \mathbb{R}^{N \times N}$ can then be calculated as
$$\widehat{\Sigma} = \frac{1}{T-1} \sum_{t=1}^{T} (x_t - \bar{x})(x_t - \bar{x})'. \tag{2}$$
The dimension $D = N/T$ specifies the ratio of the number of assets to the number of datapoints used for
covariance estimation. If $D < 1$ (more datapoints than assets), the estimation problem is considered
low-dimensional. Here, $\widehat{\Sigma}$ is symmetric and positive semi-definite by construction. Conversely, if
$D > 1$, the problem is considered high-dimensional, which corresponds to an estimated sample
covariance that is rank deficient and thus not invertible.
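To make the rank deficiency in the high-dimensional case concrete, the following Python sketch evaluates (2) on simulated returns with $N > T$. NumPy, the helper name `sample_covariance`, and the simulated data are assumptions made purely for illustration.

```python
import numpy as np


def sample_covariance(returns: np.ndarray) -> np.ndarray:
    """Sample covariance (2) of a T x N return matrix, with divisor T - 1."""
    centered = returns - returns.mean(axis=0)
    return centered.T @ centered / (returns.shape[0] - 1)


rng = np.random.default_rng(0)
T, N = 20, 50                      # high-dimensional case: D = N / T = 2.5
returns = rng.normal(scale=0.01, size=(T, N))
cov = sample_covariance(returns)
rank = np.linalg.matrix_rank(cov)  # at most T - 1 because of the demeaning
```

Since the rank cannot exceed $T - 1 < N$, the matrix has no inverse and the GMV formula (1) cannot be applied to it directly.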
2.1. Eigendecomposition
Eigendecomposition can be used to write the sample covariance matrix as the product of an
orthonormal matrix $\Lambda \in \mathbb{R}^{N \times N}$, where column $i$ contains the eigenvector associated with eigenvalue
$\lambda_i$, and a diagonal matrix $\mathrm{diag}(\lambda_1, \ldots, \lambda_N)$ containing the eigenvalues in descending order $\lambda_i \geq \lambda_{i+1}$
$$\widehat{\Sigma} = \Lambda \, \mathrm{diag}(\lambda_1, \ldots, \lambda_N) \, \Lambda'. \tag{3}$$
The eigenvalues of the covariance matrix reflect the risk contribution of the portfolio constituents
(Roncalli and Weisang 2016). If eigenvalue $\lambda_i$ is greater than $\lambda_j$, then the eigenvector $\Lambda_i$ captures
more variance than the eigenvector $\Lambda_j$. The GMV weights (1) are calculated based on the inverse
covariance matrix $\Sigma^{-1}$. The inverse of (3) is given as
$$\widehat{\Sigma}^{-1} = \Lambda \, \mathrm{diag}\!\left(\frac{1}{\lambda_1}, \ldots, \frac{1}{\lambda_N}\right) \Lambda'. \tag{4}$$
Given the sample covariance estimate $\widehat{\Sigma}$, the GMV portfolio weights can be expressed as a
function of the eigenvalues
$$w(\lambda_1, \ldots, \lambda_N) = \frac{\Lambda \, \mathrm{diag}\!\left(\frac{1}{\lambda_1}, \ldots, \frac{1}{\lambda_N}\right) \Lambda' \mathbf{1}}{\mathbf{1}' \Lambda \, \mathrm{diag}\!\left(\frac{1}{\lambda_1}, \ldots, \frac{1}{\lambda_N}\right) \Lambda' \mathbf{1}}. \tag{5}$$
For instance, it can easily be shown that setting all eigenvalues to the same value $c \in \mathbb{R}_{>0}$ results in
the 1/N portfolio.
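The last statement follows directly from the weight formula (5). As a short worked derivation that uses only the orthonormality of the eigenvector matrix, $\Lambda \Lambda' = I$, set $\lambda_i = c$ for all $i$:

```latex
\begin{aligned}
\Lambda \,\mathrm{diag}\!\left(\tfrac{1}{c}, \ldots, \tfrac{1}{c}\right) \Lambda' \mathbf{1}
  &= \tfrac{1}{c}\, \Lambda \Lambda' \mathbf{1} = \tfrac{1}{c}\, \mathbf{1}, \\
\mathbf{1}' \Lambda \,\mathrm{diag}\!\left(\tfrac{1}{c}, \ldots, \tfrac{1}{c}\right) \Lambda' \mathbf{1}
  &= \tfrac{1}{c}\, \mathbf{1}'\mathbf{1} = \tfrac{N}{c}, \\
w(c, \ldots, c) &= \frac{\tfrac{1}{c}\,\mathbf{1}}{\tfrac{N}{c}} = \tfrac{1}{N}\,\mathbf{1}.
\end{aligned}
```

The constant $c$ cancels, so the result does not depend on the common level of the eigenvalues.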

2.2. Shrinkage Methods
Shrinkage estimators present a common approach to improve the sample covariance estimate (e.g.,
Ledoit and Wolf 2003, 2004a, Frahm and Memmel 2010, Bodnar et al. 2018). Shrinkage approaches
can be distinguished between linear and non-linear shrinkage estimators.
Linear Shrinkage. The idea of linear shrinkage (LS) is to combine the sample covariance matrix
$\widehat{\Sigma}$ with a reference covariance matrix $R$ by applying the shrinkage intensity $\delta^*$
$$\widehat{\Sigma}^{LS} = (1 - \delta^*) \, \widehat{\Sigma} + \delta^* R. \tag{6}$$
A frequent choice for $R$ is the covariance matrix implied by the 1/N portfolio, which is given
by a diagonal matrix that contains the mean sample eigenvalue (Ledoit and Wolf 2004b, Frahm
and Memmel 2010). Such approaches are usually seen as an application of “Stein-type” covariance
estimators (see a discussion in Ledoit and Wolf 2021a). The fundamental idea of “Stein-type”
estimators was originally derived to reduce mean squared error of multivariate mean estimation
by shrinking the estimate to a target vector (Stein 1956, James and Stein 1961). The shrinkage
intensity $\delta^*$ specifies how close the shrunk covariance matrix will be to the reference matrix $R$.
Setting $\delta^*$ to 0 yields the sample covariance, while setting $\delta^*$ to 1 yields the reference matrix $R$.
In the related literature, $\delta^*$ is often expressed as a function of $D$ (e.g., Frahm and Memmel 2010,
Bodnar et al. 2018). As an alternative, $\delta^*$ can be derived by implicitly optimizing a loss function
(see Ledoit and Wolf (2021c) for a review of different loss functions), e.g., the Frobenius loss
$L_F(\widehat{\Sigma}, \Sigma) = \lVert \widehat{\Sigma} - \Sigma \rVert_F^2$ measured in units of risk between the sample and the reference portfolio
(e.g., Ledoit and Wolf 2003). Since $L_F$ cannot be explicitly calculated, researchers make assumptions
about the characteristics of $\Sigma$ to minimize $L_F$ implicitly. Assumptions about $\Sigma$ are based, among
others, on an uninformed identity matrix (e.g., Ledoit and Wolf 2004b, Frahm and Memmel 2010,
Bodnar et al. 2018), an identity matrix with the average correlation on its off-diagonals (e.g., Ledoit
and Wolf 2004a) or an arbitrary factor structure (e.g., Ledoit and Wolf 2003).
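The linear shrinkage formula (6) can be sketched in a few lines of Python for the scaled-identity reference matrix mentioned above. NumPy, the function name `linear_shrinkage`, the simulated returns, and the value of the intensity are illustrative assumptions; the intensity is not estimated here.

```python
import numpy as np


def linear_shrinkage(cov: np.ndarray, delta: float) -> np.ndarray:
    """Linear shrinkage (6) towards R = mean eigenvalue times identity."""
    n = cov.shape[0]
    # trace(cov) / n equals the mean sample eigenvalue.
    target = np.trace(cov) / n * np.eye(n)
    return (1.0 - delta) * cov + delta * target


rng = np.random.default_rng(1)
returns = rng.normal(size=(10, 30))          # T = 10 < N = 30
cov = np.cov(returns, rowvar=False)          # rank deficient
shrunk = linear_shrinkage(cov, delta=0.3)    # delta chosen arbitrarily here
```

Any intensity strictly between 0 and 1 lifts the zero eigenvalues of a rank-deficient sample covariance, so the shrunk matrix becomes invertible while its trace is unchanged.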

Non-Linear Shrinkage. Non-linear shrinkage (NLS) methods apply a function
$$f^{NLS}(\lambda_1, \ldots, \lambda_N) = \lambda_1^*, \ldots, \lambda_N^* \tag{7}$$
based on the sample eigenvalues to obtain the shrunk eigenvalues $\lambda_1^*, \ldots, \lambda_N^*$. The NLS covariance
estimate is then calculated using eigendecomposition (3) as
$$\widehat{\Sigma}^{NLS} = \Lambda \, \mathrm{diag}(\lambda_1^*, \ldots, \lambda_N^*) \, \Lambda'. \tag{8}$$
For instance, $f^{NLS}$ can be defined as
$$f^{NLS}(\lambda_1, \ldots, \lambda_N) = (1 - \delta^*) \lambda + \delta^* \lambda^\theta, \tag{9}$$
with non-linear shrinkage target $\lambda^\theta$. The difference to linear shrinkage is that $\lambda^\theta$ is not a vector that
simply repeats a given value like the mean eigenvalue. Instead, the sample eigenvalues are shrunk
against different individual target eigenvalues.
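The rotation-equivariant construction in (7) to (9) can be sketched as follows: the eigenvectors are kept and only the eigenvalues pass through a shrinkage function. NumPy, the names `nonlinear_shrinkage` and `toward_target`, and the target vector are hypothetical choices for illustration only.

```python
import numpy as np
from typing import Callable


def nonlinear_shrinkage(cov: np.ndarray,
                        f: Callable[[np.ndarray], np.ndarray]) -> np.ndarray:
    """Rotation-equivariant estimator (8): keep eigenvectors, replace eigenvalues."""
    # eigh returns eigenvalues in ascending order; flip to the descending
    # convention used in (3).
    eigvals, eigvecs = np.linalg.eigh(cov)
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    shrunk = f(eigvals)
    return eigvecs @ np.diag(shrunk) @ eigvecs.T


def toward_target(target: np.ndarray, delta: float):
    """Build f as in (9): (1 - delta) * lambda + delta * lambda_theta."""
    return lambda lam: (1.0 - delta) * lam + delta * target


rng = np.random.default_rng(2)
cov = np.cov(rng.normal(size=(60, 5)), rowvar=False)
# Hypothetical individual target eigenvalues (descending), for illustration.
target = np.array([5.0, 4.0, 3.0, 2.0, 1.0])
est = nonlinear_shrinkage(cov, toward_target(target, delta=0.5))
```

Passing the identity function returns the sample covariance itself, which is a convenient sanity check for the decomposition step.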
Recent approaches set $f^{NLS}$ to the Stieltjes transform of the rescaled Marčenko-Pastur
(MP) density (e.g., De Nard et al. 2019, Bun et al. 2017, Ledoit and Wolf 2012). Intuitively, this
distribution presents the limit eigenvalue distribution of a random covariance matrix with $N \to \infty$.
Recently, Ledoit and Wolf (2021b) proposed another NLS approach that combines traditional
“Stein-type” shrinkage with MP-based estimators. The approach applies quadratic shrinkage with
two shrinkage targets, where the respective shrinkage intensity mainly depends on the dimension $D$.
The more data points are available for covariance estimation, the higher the intensity of “Stein-type”
shrinkage over MP adjustments.
Another common approach is to divide the sample eigenvalues into signal and noise regions
(Laloux et al. 1999, Plerou et al. 2002, Zhao et al. 2019). The function $f^{NLS}$ then rescales only
the noisy eigenvalues while preserving the signal eigenvalues. Similar approaches are known as
“thresholding” (Fan et al. 2013) where the noise eigenvalues are erased from the covariance estimate.

2.3. Cross Validation
The concept of cross validation is well known in the machine learning literature (e.g., Pan and Yang
2009, Ban et al. 2018). The underlying idea is to evaluate multiple train/test splits of the available
data to select the optimal model parameters. By evaluating multiple train/test splits, the resulting
parameters are more robust than the parameters obtained by using a single train/test split (e.g.,
Bergmeir and Benı́tez 2012).
In covariance estimation, the shrinkage parameters like intensity and target could, in theory, also
be selected based on cross validation. This requires a long history of datapoints to be available to
perform a sufficient number of train-evaluate iterations. However, if there was a large amount
of data available for cross validation, the same data could also directly be used to estimate the
sample covariance matrix (2). By the law of large numbers, the sample covariance estimate
generally becomes more accurate as the number of datapoints T increases. In particular for high-dimensional
estimation problems, the history of datapoints is smaller than the number of assets, so that there is
no data available to perform cross validation.
We propose an approach that applies cross validation based transfer learning to identify the
optimal shrinkage parameters. We solve the problem of limited data availability by using a sufficiently
long history of disjoint assets for parameter selection. The optimal parameters identified on the
disjoint dataset are then transferred to the actual covariance estimation problem.
3. Covariance Estimation Through Cross Validation based Transfer Learning
Our approach called “CVTL” (Cross Validation based Transfer Learning) can be attributed to
the class of rotation equivariant non-linear shrinkage estimators. That is, we shrink the sample
eigenvalues, while not changing the eigenvectors. The main idea of our approach is to select the
shrinkage parameters based on a comparatively longer but disjoint history of other assets. The
approach is fully data-driven and agnostic towards achieving a particular shrinkage target or
intensity.

Let $\mathcal{U} = \{1, \ldots, U\}$ denote the index set of all assets in the universe. Given a portfolio of $N$
assets with stock indices $\mathcal{X} \subset \mathcal{U}$, we need to estimate the future covariance matrix based on the
joint history $X^{T,N}$. We now assume that there is at least one other index set $\mathcal{V} \subset \mathcal{U}$ of size $N$ with
$\mathcal{V} \cap \mathcal{X} = \emptyset$ but longer joint history. Specifically, we assume that the history of $\mathcal{V}$ contains at least
one additional year of trading data so that we can perform at least twelve cross validations for
parameter selection given monthly rebalancing (e.g., Ackermann et al. 2017, De Nard et al. 2019,
Zhao et al. 2019). The optimal shrinkage parameters are then transferred to the actual estimation
problem based on $X^{T,N}$.
3.1. Non-Linear Shrinkage Using Second Shrinkage Target
The main challenge of non-linear shrinkage is to find an optimal parameter setting which specifies
how to adjust the sample eigenvalues. Specifically, this requires one or more shrinkage targets and
the respective shrinkage intensities.
We shrink the sample eigenvalues $\lambda = \lambda_1, \ldots, \lambda_N$ against two shrinkage targets ($\bar{\lambda}$ and $\lambda^\theta$)
$$\lambda^* = \delta_1 \lambda + \delta_2 \bar{\lambda} + (1 - \delta_1 - \delta_2) \lambda^\theta. \tag{10}$$
The first shrinkage target $\bar{\lambda}$ is given by a vector of length $N$ that simply repeats the average sample
eigenvalue $\frac{1}{N} \sum_{i=1}^{N} \lambda_i$ as suggested by Ledoit and Wolf (2003). The shrinkage parameters $\delta_1$, $\delta_2$, and
$\lambda^\theta$ are selected using cross validation based transfer learning on a disjoint history $V^{K,N}$.
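The two-target combination (10) amounts to an element-wise weighted average of three vectors. The following Python sketch shows this with plain lists; the function name `shrink_two_targets`, the input check, and the example numbers are assumptions made for illustration.

```python
from typing import List, Sequence


def shrink_two_targets(lam: Sequence[float], lam_theta: Sequence[float],
                       delta1: float, delta2: float) -> List[float]:
    """Shrunk eigenvalues (10): delta1*lam + delta2*mean(lam) + (1-delta1-delta2)*lam_theta."""
    if delta1 < 0 or delta2 < 0 or delta1 + delta2 > 1 + 1e-12:
        raise ValueError("need delta1, delta2 >= 0 and delta1 + delta2 <= 1")
    mean_lam = sum(lam) / len(lam)           # first target: average eigenvalue
    rest = 1.0 - delta1 - delta2             # weight of the second target
    return [delta1 * l + delta2 * mean_lam + rest * t
            for l, t in zip(lam, lam_theta)]


# Hypothetical eigenvalues (descending) and a second target with the same sum.
lam = [6.0, 3.0, 1.0, 0.0]
lam_theta = [4.0, 3.0, 2.0, 1.0]
shrunk = shrink_two_targets(lam, lam_theta, delta1=0.5, delta2=0.25)
```

In this example the zero eigenvalue is lifted to a positive value, which illustrates how the combination can turn a singular sample spectrum into an invertible one.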
The second shrinkage target λθ is calculated based on the imbalance of the sample eigenvalues.
We measure imbalance according to the Gini coefficient (Dorfman 1979). The values of the Gini
coefficient range between 0 and 1, where a value of 0 indicates perfect balance (λi = c ∈ R>0 ,
∀i = 1, . . . , N ), while a value of 1 indicates perfect imbalance (λ = (c, 0, . . . , 0)). We prefer the Gini
over other measures of imbalance such as entropy as the Gini coefficient is limited to the interval
[0, 1]. Since the Gini coefficient requires λi ≥ 0 for all eigenvalues, we set all negative eigenvalues to
zero (if any).

Definition 3.1 (Gini Coefficient) Given eigenvalues $\lambda = \lambda_1, \ldots, \lambda_N$ with $\lambda_i \geq 0$, $\forall i = 1, \ldots, N$,
the Gini coefficient is defined as
$$G(\lambda_1, \ldots, \lambda_N) = \frac{\sum_{i=1}^{N} \sum_{j=1}^{N} |\lambda_i - \lambda_j|}{2N \sum_{i=1}^{N} \lambda_i}. \tag{11}$$
We define $G(\widehat{\Sigma})$ as the Gini coefficient of the eigenvalues of a sample covariance estimate $\widehat{\Sigma}$.
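The definition of the Gini coefficient translates directly into code. The sketch below is a plain-Python reading of (11), including the clipping of negative eigenvalues described earlier; the function name `gini` and the example vectors are illustrative assumptions.

```python
from typing import Sequence


def gini(values: Sequence[float]) -> float:
    """Gini coefficient (11); negative inputs are set to zero as described above."""
    lam = [max(v, 0.0) for v in values]
    n, total = len(lam), sum(lam)
    if total == 0:
        raise ValueError("Gini coefficient is undefined when all values are zero")
    pairwise = sum(abs(a - b) for a in lam for b in lam)
    return pairwise / (2 * n * total)


balanced = gini([2.0, 2.0, 2.0, 2.0])       # all eigenvalues equal
concentrated = gini([5.0, 0.0, 0.0, 0.0])   # one non-zero eigenvalue
```

With the normalization $2N$ in (11), a single non-zero eigenvalue yields $(N-1)/N$ (here 0.75), which approaches the upper bound of 1 as $N$ grows.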

Given $G(\widehat{\Sigma})$, we first determine the desired Gini coefficient $G^\theta$ for the second shrinkage target.
Subsequently, we search for a cumulative Beta distribution that achieves the desired Gini coefficient.
Thereby, we are flexible in achieving almost any possible eigenspectrum in the non-linear shrinkage
target.
To obtain $G^\theta$, we introduce a parameter $\gamma \in [-1, 1]$ that adjusts the Gini coefficient of $\widehat{\Sigma}$ as
$$G^\theta = \begin{cases} \gamma + (1 - \gamma) \, G(\widehat{\Sigma}), & \text{if } \gamma \geq 0, \\ (1 - |\gamma|) \, G(\widehat{\Sigma}), & \text{otherwise.} \end{cases} \tag{12}$$
Thereby, we calculate a convex combination of $G(\widehat{\Sigma})$ and 1 to increase the imbalance of the sample
eigenvalues ($\gamma \geq 0$), or between $G(\widehat{\Sigma})$ and 0 to balance the sample eigenvalues ($\gamma < 0$). This
ensures that $G^\theta \in [0, 1]$.
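The piecewise rule (12) can be checked with a few numbers. The following Python sketch is a direct transcription; the function name `target_gini` and the sample value 0.6 are hypothetical.

```python
def target_gini(sample_gini: float, gamma: float) -> float:
    """Desired Gini coefficient (12) for gamma in [-1, 1]."""
    if not -1.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [-1, 1]")
    if gamma >= 0:
        # Move towards 1: more imbalanced than the sample eigenvalues.
        return gamma + (1.0 - gamma) * sample_gini
    # Move towards 0: more balanced than the sample eigenvalues.
    return (1.0 - abs(gamma)) * sample_gini


g = 0.6                                # hypothetical sample Gini coefficient
more_imbalanced = target_gini(g, 0.5)  # 0.5 + 0.5 * 0.6 = 0.8
more_balanced = target_gini(g, -0.5)   # 0.5 * 0.6 = 0.3
```

The boundary cases behave as expected: a parameter of 0 leaves the sample Gini unchanged, 1 gives the maximum of 1, and -1 gives 0.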
Given the desired Gini coefficient $G^\theta$, we generate artificial eigenvalues $\lambda^\theta$ so that $G(\lambda^\theta) \approx G^\theta$.
For this purpose, we follow, e.g., Ledoit and Wolf (2012, 2017a) in using the cumulative distribution
function (CDF) of a Beta distribution with parameters $\alpha, \beta$. The main argument for a Beta
distribution is its bounded support and easily adjustable shape, for which Ledoit and Wolf (2012)
call it the “best suited family” of distributions for this purpose. The eigenvalues are then calculated
as the values of $\mathrm{CDF}_{\alpha,\beta}$ over an equally spaced grid over the interval $[0, 1]$. This yields the following
optimization problem
$$\lambda^{\alpha,\beta} = \arg\min_{\lambda^{\alpha,\beta}} \left| G(\lambda^{\alpha,\beta}) - G^\theta \right| \tag{13}$$
$$\text{where } \lambda^{\alpha,\beta} = \left( \mathrm{CDF}_{\alpha,\beta}\!\left(\tfrac{N}{N}\right), \mathrm{CDF}_{\alpha,\beta}\!\left(\tfrac{N-1}{N}\right), \ldots, \mathrm{CDF}_{\alpha,\beta}\!\left(\tfrac{1}{N}\right) \right). \tag{14}$$

We optimize α over the set {1, 2, 3, 4, 5, 10, 20, . . . , 100, 150} while setting β = 1. Thereby, we can
generate a large variety of eigenvalue distributions with different Gini coefficients as suggested by
the optimal parameter setting found through cross validation. To provide an intuition, Figure 1
shows the CDFs of three Beta distributions and the corresponding Gini coefficients for N = 100.
Figure 1 Cumulative Distribution Functions of Different Beta Distributions and Corresponding Gini Coefficients
for N = 100 eigenvalues. [Three panels; x-axis: x from 0 to 1; y-axis: cumulative probability from 0 to 1.
(a) $\alpha = 2$, $\beta = 1$, $G(\lambda^{2,1}) = 0.502$; (b) $\alpha = 5$, $\beta = 1$, $G(\lambda^{5,1}) = 0.716$; (c) $\alpha = 10$, $\beta = 1$, $G(\lambda^{10,1}) = 0.834$.]
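The search over $\alpha$ in (13) and (14) can be sketched as a small grid search. The sketch below relies on the fact that for $\beta = 1$ the Beta CDF reduces to $x^\alpha$, so no statistics library is needed. The names `ALPHAS`, `beta_cdf_eigenvalues`, and `best_beta_target`, as well as the target value 0.7, are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Search set for alpha as stated in the text; beta is fixed to 1.
ALPHAS = [1, 2, 3, 4, 5] + list(range(10, 101, 10)) + [150]


def gini(values: Sequence[float]) -> float:
    """Gini coefficient as in (11)."""
    n, total = len(values), sum(values)
    return sum(abs(a - b) for a in values for b in values) / (2 * n * total)


def beta_cdf_eigenvalues(alpha: float, n: int) -> List[float]:
    """Grid (14) for beta = 1, where the Beta CDF reduces to x ** alpha."""
    return [(k / n) ** alpha for k in range(n, 0, -1)]   # descending order


def best_beta_target(gini_target: float, n: int) -> Tuple[int, List[float]]:
    """Grid search (13): alpha whose eigenvalues have Gini closest to the target."""
    best = min(ALPHAS,
               key=lambda a: abs(gini(beta_cdf_eigenvalues(a, n)) - gini_target))
    return best, beta_cdf_eigenvalues(best, n)


alpha, lam_ab = best_beta_target(gini_target=0.7, n=100)
```

Evaluated with (11) exactly as written, $\alpha = 2$ and $N = 100$ gives roughly 0.497 rather than the 0.502 shown in Figure 1. The gap is consistent with a small-sample factor $N/(N-1)$, but this is an observation from the sketch and not something stated in the text.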
We normalize the resulting eigenvalues $\lambda^{\alpha,\beta}$ from (13) in order to keep the trace of the sample
eigenvalues (e.g., Laloux et al. 2000).
$$\lambda^\theta = \lambda^{\alpha,\beta} \, \frac{\frac{1}{N} \sum_{i=1}^{N} \lambda_i}{\lVert \lambda^{\alpha,\beta} \rVert_1}. \tag{15}$$
This step is necessary in order to ensure that the second shrinkage target has a reasonable influence
on the resulting shrunk eigenvalues. If the artificial eigenvalues $\lambda^{\alpha^*,\beta^*}$ are all much larger than their
counterparts in the sample eigenvalues $\lambda$, the shrinkage operation is strongly dominated by $\lambda^{\alpha,\beta}$. Note
that normalization does not alter the Gini coefficient (i.e., $G(\lambda^{\alpha,\beta}) = G(\lambda^\theta)$) as the Gini coefficient
is invariant to scalar multiplication.
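The invariance can be verified directly from the Gini definition (11). As a short worked step, for any scalar $c > 0$:

```latex
G(c\lambda_1, \ldots, c\lambda_N)
  = \frac{\sum_{i=1}^{N}\sum_{j=1}^{N} |c\lambda_i - c\lambda_j|}{2N \sum_{i=1}^{N} c\lambda_i}
  = \frac{c \sum_{i=1}^{N}\sum_{j=1}^{N} |\lambda_i - \lambda_j|}{c \cdot 2N \sum_{i=1}^{N} \lambda_i}
  = G(\lambda_1, \ldots, \lambda_N).
```

The normalization is exactly such a multiplication by a positive constant, so the Gini coefficient of the target is unaffected by it.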
3.2. Parameter Selection Using Cross Validation
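We need to select a total of three parameters, namely the adjustment factor $\gamma$ for the Gini coefficient
and the shrinkage intensities $\delta_1, \delta_2$. The parameters are chosen such that the resulting weights
$w(\gamma, \delta_1, \delta_2)$ are historically optimal on a disjoint history $V^{K,N}$ according to the given objective
$\phi(w)$ (e.g., minimize variance or maximize risk-adjusted return). The disjoint set of assets $\mathcal{V}$ can be
selected by randomly sampling indices from $\mathcal{U} \setminus \mathcal{X}$, as done in this study. As an alternative, one
could attempt to specifically select similar portfolio constituents, e.g., from a similar stock market
index, to further improve the resulting parameter configuration.
Algorithm 1 Covariance Estimation with Given Parameters.
input: $X^{T,N} = x_1, \ldots, x_T$ history of returns for covariance estimation, $\gamma$ scaling parameter, $\delta_1, \delta_2$ shrinkage intensities
    $\widehat{\Sigma} = \frac{1}{T-1} \sum_{t=1}^{T} (x_t - \bar{x})(x_t - \bar{x})'$
    calculate eigendecomposition $\widehat{\Sigma} = \Lambda \, \mathrm{diag}(\lambda) \, \Lambda'$
    $G^\theta = \begin{cases} \gamma + (1 - \gamma) \, G(\widehat{\Sigma}), & \text{if } \gamma \geq 0, \\ (1 - |\gamma|) \, G(\widehat{\Sigma}), & \text{otherwise.} \end{cases}$
    $\lambda^{\alpha,\beta} = \arg\min_{\lambda^{\alpha,\beta}} |G(\lambda^{\alpha,\beta}) - G^\theta|$,
        where $\lambda^{\alpha,\beta} = \left( \mathrm{CDF}_{\alpha,\beta}\!\left(\tfrac{N}{N}\right), \mathrm{CDF}_{\alpha,\beta}\!\left(\tfrac{N-1}{N}\right), \ldots, \mathrm{CDF}_{\alpha,\beta}\!\left(\tfrac{1}{N}\right) \right)$,
        and $\alpha \in \{1, 2, 3, 4, 5, 10, 20, \ldots, 100, 150\}$, $\beta = 1$.
    normalization $\lambda^\theta = \lambda^{\alpha,\beta} \, \frac{\frac{1}{N} \sum_{i=1}^{N} \lambda_i}{\lVert \lambda^{\alpha,\beta} \rVert_1}$
    $\lambda^* = \delta_1 \lambda + \delta_2 \bar{\lambda} + (1 - \delta_1 - \delta_2) \lambda^\theta$
    $\Sigma^* = \Lambda \, \mathrm{diag}(\lambda^*) \, \Lambda'$
output: $\Sigma^*$ covariance estimate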
We assume monthly rebalancing so that each portfolio allocation is held for $\rho = 21$ trading days
(e.g., Ackermann et al. 2017, De Nard et al. 2019, Zhao et al. 2019). Our empirical results imply
that the parameter setting becomes stable after approximately twelve train-evaluate iterations for
cross validation (see EC.2.2). Hence, we recommend using a disjoint history $V^{K,N}$ containing at
least one additional trading year (i.e., $12 \cdot 21 = 252$ days) of data so that $K \geq T + 252$. We applied the
same methodology in our evaluation.
The process of cross validation is described by Algorithm 2. Without loss of generality, we assume
that $\phi(w)$ needs to be minimized. For each parameter configuration $\gamma, \delta_1, \delta_2$, we first check if the
resulting covariance estimate $\Sigma_X(\gamma, \delta_1, \delta_2)$ on the actual dataset $X$ is invertible. If $\Sigma_X(\gamma, \delta_1, \delta_2)$ is
not invertible, the parameter configuration is discarded. Otherwise, we perform time-series cross
validation with monthly rebalancing to calculate the historical performance of $\gamma, \delta_1, \delta_2$ under the
given objective. Let the disjoint history $V^{K,N} = v_1, \ldots, v_K$ be sorted from the oldest ($v_1$) to the
most recent datapoint ($v_K$). We first estimate a covariance matrix with the given parameters based
on $v_1, \ldots, v_T$ (see Algorithm 1). The resulting covariance matrix is then evaluated based on the
first out-of-sample period $v_{T+1}, \ldots, v_{T+\rho}$. Subsequently, we shift the estimation and out-of-sample
periods by $\rho$ days towards the future. Accordingly, the second covariance matrix is estimated
based on $v_{\rho+1}, \ldots, v_{\rho+T}$ and its performance is measured on the second out-of-sample period
$v_{\rho+T+1}, \ldots, v_{2\rho+T}$. This continues until the most recent datapoint $v_K$ becomes part of the last
out-of-sample period. We use $n$ to denote the number of train-evaluate iterations that are performed
during cross validation (it is ensured that $n \geq 12$). The performance of the current parameter
configuration is then aggregated as the mean performance over all evaluated out-of-sample periods.
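The shifting of estimation and out-of-sample periods can be sketched as a generator of index ranges. The sketch below uses zero-based indices and assumes windows of exactly $T$ estimation points followed by $\rho$ out-of-sample points, which is one reading of the description above; the function name `rolling_windows` is hypothetical.

```python
from typing import List, Tuple

Window = Tuple[range, range]


def rolling_windows(k: int, t: int, rho: int = 21) -> List[Window]:
    """Estimation/out-of-sample index ranges (0-based) for time-series CV.

    Each estimation window has t points and is followed by rho out-of-sample
    points; both are shifted by rho until the history of length k is used up.
    """
    windows = []
    start = 0
    while start + t + rho <= k:
        train = range(start, start + t)
        test = range(start + t, start + t + rho)
        windows.append((train, test))
        start += rho
    return windows


# One additional trading year (12 * 21 = 252 days) beyond T = 100 datapoints.
windows = rolling_windows(k=100 + 252, t=100, rho=21)
```

Under these assumptions, a disjoint history with 252 additional days yields exactly twelve train-evaluate iterations, and the most recent datapoint is the last element of the final out-of-sample period.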
Algorithm 2 Parameter Selection Using Cross Validation.
input: $X^{T,N}$ history of returns for covariance estimation, $V^{K,N}$ disjoint history of returns for parameter selection
with $K \geq T + 12\rho$, $\rho$ rebalancing interval (21 trading days), $P$ discrete parameter space, $\phi(w)$ objective function
to be minimized, $\psi(\gamma, \delta_1, \delta_2) \to \{\text{true}, \text{false}\}$ additional filter for admissible configurations
$r^* = \infty$
for each $(\gamma, \delta_1, \delta_2) \in P$ do
    if $\Sigma_X(\gamma, \delta_1, \delta_2)$ not invertible then
        skip configuration $\gamma, \delta_1, \delta_2$ and continue with the next one
    end if
    $t = 1$
    $n = 0$
    while $t + T + \rho - 1 \leq K$ do
        $n = n + 1$
        calculate $\widetilde{\Sigma}$ based on parameters $\gamma, \delta_1, \delta_2$ and datapoints $v_t, \ldots, v_{t+T-1}$ (see Algorithm 1)
        $w = \frac{\widetilde{\Sigma}^{-1} \mathbf{1}}{\mathbf{1}' \widetilde{\Sigma}^{-1} \mathbf{1}}$
        $r_n(\gamma, \delta_1, \delta_2)$ = evaluate $\phi(w)$ based on $v_{t+T}, \ldots, v_{t+T+\rho-1}$
        $t = t + \rho$
    end while
    $r = \frac{1}{n} \sum_{i=1}^{n} r_i(\gamma, \delta_1, \delta_2)$
    if $r < r^* \wedge \psi(\gamma, \delta_1, \delta_2)$ then
        $\gamma^*, \delta_1^*, \delta_2^* = \gamma, \delta_1, \delta_2$
        $r^* = r$
    end if
end for
output: Historically optimal parameters $\gamma^*, \delta_1^*, \delta_2^*$
For the purpose of this study, we select the estimation parameters from the following search
space P . We later provide a sensitivity analysis to assess the influence of a more fine-grained grid.
γ ∈ {−1, −0.8, . . . , 1} (16)
δ1 , δ2 ∈ {0, 0.05, . . . , 1} with δ1 + δ2 ≤ 1 (17)
We include an additional filter ψ(γ, δ1 , δ2 ) → {true, f alse}, which allows us to limit the search
space based on particular criteria, as described in the next section.
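The search space (16) and (17) and the outer selection loop of Algorithm 2 can be sketched as follows. The names `parameter_grid` and `select_parameters` are hypothetical, the cross-validated objective is passed in as a callback, and the toy score is a stand-in with no connection to the study's data. The invertibility check is assumed to be handled inside the callbacks.

```python
from typing import Callable, Iterator, Optional, Tuple

Config = Tuple[float, float, float]   # (gamma, delta1, delta2)


def parameter_grid() -> Iterator[Config]:
    """Search space (16)-(17), built from integers to avoid float drift."""
    for g in range(-5, 6):                     # gamma = -1.0, -0.8, ..., 1.0
        for i in range(21):                    # delta1 = 0.00, 0.05, ..., 1.00
            for j in range(21 - i):            # delta2 with delta1 + delta2 <= 1
                yield (g / 5, i / 20, j / 20)


def select_parameters(score: Callable[[Config], float],
                      accept: Callable[[Config], bool] = lambda c: True
                      ) -> Optional[Config]:
    """Outer loop of Algorithm 2: lowest mean score among accepted configurations."""
    best, best_score = None, float("inf")
    for config in parameter_grid():
        r = score(config)
        if r < best_score and accept(config):
            best, best_score = config, r
    return best


grid = list(parameter_grid())

# Toy stand-in for the cross-validated objective (hypothetical, not from the study).
toy_score = lambda c: (c[0] - 0.2) ** 2 + (c[1] - 0.5) ** 2 + (c[2] - 0.25) ** 2
best = select_parameters(toy_score)
```

With step sizes of 0.2 and 0.05 the grid contains 11 values of $\gamma$ and 231 admissible pairs $(\delta_1, \delta_2)$, so an exhaustive search over all 2541 configurations remains cheap.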

3.3. Objectives
The objective function $\phi(w)$ is evaluated on all out-of-sample periods that occur during cross
validation. Let $\Sigma_i$ and $\bar{v}_i$ denote the covariance matrix and mean return vector of the out-of-sample
period in the $i$th train-evaluate iteration during cross validation. To minimize variance, the objective
function must be set to
$$\phi(w) = w' \Sigma_i w. \tag{18}$$
Alternatively, we can set the objective function to maximize risk-adjusted return as
$$\phi(w) = \frac{\bar{v}_i' w}{w' \Sigma_i w}. \tag{19}$$
Note that the weights are always calculated according to the formula of the GMV portfolio (1)
for any given objective. Existence of the inverse covariance matrix is guaranteed by filtering the
estimation parameters accordingly. In particular, our approach does not require us to estimate future
returns. Instead, we select an eigenvalue distribution and shrinkage intensities to explicitly optimize
the given objective φ(w) for the disjoint history. The same parameters should then implicitly
optimize φ(w) in the future investment period. This approach has – to the best of our knowledge –
not been attempted before.
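Both objectives can be evaluated from the realized portfolio returns of one out-of-sample period, because the sample variance of $v_t' w$ equals $w' \Sigma_i w$ when $\Sigma_i$ is the sample covariance of that period. The sketch below uses only the Python standard library; the function names, the divisor of the sample variance, and the return numbers are illustrative assumptions.

```python
from statistics import fmean, variance
from typing import List, Sequence


def portfolio_returns(period: Sequence[Sequence[float]],
                      w: Sequence[float]) -> List[float]:
    """Daily portfolio returns v_t' w for one out-of-sample period."""
    return [sum(r * wi for r, wi in zip(day, w)) for day in period]


def objective_variance(period, w) -> float:
    """Objective (18): the sample variance of v_t' w equals w' Sigma_i w."""
    return variance(portfolio_returns(period, w))


def objective_risk_adjusted(period, w) -> float:
    """Objective (19) as printed: mean return divided by w' Sigma_i w.

    Negate the value when plugging it into a minimizer such as Algorithm 2.
    """
    p = portfolio_returns(period, w)
    return fmean(p) / variance(p)


# Hypothetical out-of-sample returns: 4 days, 2 assets (illustrative numbers).
period = [[0.01, 0.00], [-0.02, 0.01], [0.03, -0.01], [0.00, 0.02]]
w = [0.5, 0.5]
var_w = objective_variance(period, w)
ratio_w = objective_risk_adjusted(period, w)
```

Neither function needs a forecast of future returns; the mean return enters only when a candidate configuration is scored on historical out-of-sample data.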
When optimizing for risk-adjusted return (19), we observed overfitting in terms of heavily
dispersed portfolio weights. Specifically, the resulting portfolios exhibit high out-of-sample volatility
combined with higher out-of-sample returns or, conversely, low out-of-sample returns paired
with low out-of-sample volatility, while not achieving a stable configuration. Therefore, we limit
the search space to improve the stability of the parameter configurations. For this purpose, we
calculate the optimal $\eta^* \in [0, 1]$ that maximizes risk-adjusted return as a linear combination between
the weights of the sample covariance $w_{GMV}$ and the 1/N portfolio $w_{1/N}$ using the disjoint transfer
history. The optimal $\eta^*$ is determined based on the following optimization problem
$$\eta^* = \arg\max_{\eta} \frac{1}{n} \sum_{i=1}^{n} \frac{\bar{v}_i' w}{w' \Sigma_i w}, \tag{20}$$
$$\text{where } w = \eta \, w_{GMV} + (1 - \eta) \, w_{1/N}. \tag{21}$$

16
Subsequently, we define η∗ to be the midpoint of an interval [a, b] with a = η∗ − ζ and b = η∗ + ζ for
a given parameter ζ. We set ζ = 0.30; however, the performance is not sensitive to small changes in ζ,
as we later show in our sensitivity analyses (see Table 9). We then calculate the acceptance interval
Ψσ in terms of the out-of-sample variance of the portfolios w_a and w_b, which are obtained by combining
the sample GMV weights and the 1/N portfolio weights according to (21) with η = a and η = b, respectively
\[ \Psi_\sigma = \left[ \frac{1}{n} \sum_{i=1}^{n} w_a' \Sigma_i w_a,\; \frac{1}{n} \sum_{i=1}^{n} w_b' \Sigma_i w_b \right]. \tag{22} \]
We define σ(γ, δ1 , δ2 ) as the mean variance of the given parameter configuration over all periods
during cross validation

\[ \sigma(\gamma, \delta_1, \delta_2) = \frac{1}{n} \sum_{i=1}^{n} w(\gamma, \delta_1, \delta_2)' \, \Sigma_i \, w(\gamma, \delta_1, \delta_2). \tag{23} \]
Finally, we define the acceptance function to filter for parameters γ, δ1 , δ2 so that the resulting
mean variance during cross validation is within Ψσ



\[ \psi(\gamma, \delta_1, \delta_2) = \begin{cases} 1, & \text{if } \sigma(\gamma, \delta_1, \delta_2) \in \Psi_\sigma, \\ 0, & \text{otherwise.} \end{cases} \tag{24} \]

Thereby, we avoid the aforementioned problem of obtaining portfolios with overly low or high
out-of-sample variance, which leads to more stable parameter estimates. Note that ψ can be defined
with regard to arbitrary objectives. For instance, one could control transaction costs by only accepting
portfolios that do not exceed a given turnover during cross validation, or incorporate specific goals
and behavioral theory of individual investors directly into the covariance matrix as suggested by,
e.g., Shefrin and Statman (2000) and Das et al. (2010, 2018).
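The filtering step in (22) to (24) can be sketched as follows. This is a minimal illustration under stated assumptions: the endpoints of Ψσ are sorted so that the interval is well defined even if w_b yields the lower variance, a and b are not clipped to [0, 1], and the candidate tuples and function names are hypothetical.

```python
def quad_form(w, cov):
    """Portfolio variance w' cov w."""
    n = len(w)
    return sum(w[i] * cov[i][j] * w[j] for i in range(n) for j in range(n))


def mean_variance(w, covs):
    """Mean variance over all cross validation periods, as in (23)."""
    return sum(quad_form(w, c) for c in covs) / len(covs)


def acceptance_interval(eta_star, zeta, w_gmv, w_eq, covs):
    """Interval (22) built from the portfolios at a = eta* - zeta and b = eta* + zeta."""
    bounds = []
    for eta in (eta_star - zeta, eta_star + zeta):
        w = [eta * g + (1.0 - eta) * e for g, e in zip(w_gmv, w_eq)]
        bounds.append(mean_variance(w, covs))
    # Assumption: sort the endpoints so that membership is well defined.
    return min(bounds), max(bounds)


def accept(sigma, interval):
    """Acceptance function (24): 1 if sigma lies in the interval, else 0."""
    lower, upper = interval
    return 1 if lower <= sigma <= upper else 0


def select(candidates, interval):
    """Pick the accepted candidate with the highest mean objective.

    candidates: list of (params, mean_variance, mean_objective) tuples.
    """
    accepted = [c for c in candidates if accept(c[1], interval) == 1]
    return max(accepted, key=lambda c: c[2])[0] if accepted else None


# Toy example (illustrative numbers only).
covs = [[[0.04, 0.0], [0.0, 0.01]]]
interval = acceptance_interval(0.5, 0.30, [0.2, 0.8], [0.5, 0.5], covs)
best = select([("A", 0.009, 0.5), ("B", 0.02, 0.9), ("C", 0.010, 0.7)], interval)
```

Candidate B has the highest objective but is rejected because its mean variance falls outside the interval, which is the stabilizing effect described above.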
Altogether, we consider the following approaches for the remainder of this study:
• CVTL. CVTL with objective minimum variance (18).
• CVTLLS. CVTL with objective minimum variance (18). We require δ1 + δ2 = 1 to exclude the
non-linear shrinkage target λθ.
• CVTL µ/σ. CVTL with objective maximum risk-adjusted return (19).
• CVTLLS µ/σ. CVTL with objective maximum risk-adjusted return (19). We require δ1 + δ2 = 1
to exclude the non-linear shrinkage target λθ.

3.4. Conceptual Interpretation and Relation to Other Estimators
CVTL is related to several existing estimators such as linear shrinkage (e.g., Ledoit and Wolf 2003,
2004a,b). When setting δ1 + δ2 = 1, our approach (CVTLLS) has the same shrinkage target as Ledoit
and Wolf (2004b). Similarly, setting δ2 = 1 yields the 1/N portfolio, because all adjusted eigenvalues
then equal λ̄ and the resulting covariance matrix is a multiple of the identity matrix. The intuition
behind CVTL is to adjust the sample eigenvalues to be historically optimal under a given objective φ(w)
on the disjoint history. Thereby, CVTL implicitly performs the following steps: (i) estimating the
covariance matrix, (ii) inverting the covariance matrix, and (iii) optimizing the portfolio towards the
given objective. This can be interpreted as a flexible extension of the minimum variance loss suggested
by Engle et al. (2019), where the oracle estimator is derived by using cross validation. The difference
is that we replace the statistically derived prior for the oracle eigenvalues of Engle et al. (2019) with
the objective function based on the disjoint history. Furthermore, CVTL is related to the bounded-noise
estimator by Zhao et al. (2019) in the sense that the covariance adjustment is entirely driven by
the given objective. However, CVTL differs from the bounded-noise approach as it does not rely on
bootstrapping within the sample data for optimal parameter calibration.
Figure 2 Conceptual Difference between Established Non-Linear Shrinkage Methods (left, dark gray) and
CVTL (right, light gray).
[Figure: Both approaches start from the history X_{T,N} and the decomposition Σ̂ = Λ diag(λ) Λ′, and both
return Σ∗ = Λ diag(λ∗) Λ′. Left panel (Established Non-linear Shrinkage): implicit minimization of the
loss function L(Σ̂, Σ) with λ∗ = f_θ^NLS(λ). Right panel (Cross Validation based Transfer Learning): cross
validation on the transfer set V_{K,N} under the objective φ(w) with λ∗ = δ1 λ + δ2 λ̄ + (1 − δ1 − δ2) λθ.]
The conceptual difference between established non-linear shrinkage estimators and CVTL is
illustrated in Figure 2. While existing non-linear shrinkage methods (e.g., Ledoit and Wolf 2015, Bun
et al. 2017, Ledoit and Wolf 2020, 2021b) select the shrinkage parameters to implicitly minimize a
loss function L(Σ̂, Σ), CVTL selects the shrinkage parameters based on cross validation on a disjoint
history according to a given objective φ(w). CVTL uses cumulative Beta distribution functions to
generate the shrinkage target. However, setting δ1 + δ2 = 0 does not lead to the same non-linear
shrinkage as the estimators of Ledoit and Wolf (e.g., 2015, 2021b). Instead, we could achieve a comparable
estimator by replacing the second shrinkage target of CVTL with the non-linear function presented
in Ledoit and Wolf (2015). Using the Gini coefficient as a measure of imbalance to generate the
second (non-linear) shrinkage target is related to linear shrinkage to constant correlation (Ledoit
and Wolf 2004a): the higher the average correlation in the sample covariance, the higher the Gini
coefficient of the sample eigenvalues, and vice versa.
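To make the eigenvalue adjustment λ∗ = δ1 λ + δ2 λ̄ + (1 − δ1 − δ2) λθ and the role of the Gini coefficient more concrete, the following sketch works on eigenvalues only. It rests on assumptions: the Gini coefficient is computed with one common textbook formula for sorted non-negative values, λ̄ is taken to be the mean sample eigenvalue, and the non-linear target λθ is passed in as an argument because its construction from Beta distribution functions is defined elsewhere in the paper. All function names are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly balanced)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def shrink_eigenvalues(lam, lam_theta, delta1, delta2):
    """Blend sample eigenvalues, their mean, and a given second target."""
    lam_bar = sum(lam) / len(lam)  # assumption: grand mean of the eigenvalues
    delta3 = 1.0 - delta1 - delta2
    return [
        delta1 * l + delta2 * lam_bar + delta3 * t
        for l, t in zip(lam, lam_theta)
    ]


# Toy example (illustrative numbers only).
balanced = gini([1.0, 1.0, 1.0, 1.0])
imbalanced = gini([0.0, 0.0, 0.0, 1.0])
adjusted = shrink_eigenvalues([4.0, 1.0, 1.0], [9.0, 9.0, 9.0], 0.5, 0.5)
```

With δ1 + δ2 = 1 the second target drops out, which corresponds to the CVTLLS variant; the adjusted eigenvalues in the example keep the same sum as the sample eigenvalues.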
4. Evaluation Method
We evaluate CVTL according to the objectives minimum variance and maximum risk-adjusted
return against established covariance estimators from the literature. An overview of all considered
estimators is presented in Table 1.

Table 1 Overview of Covariance Estimators Used in Evaluation.

Approach | Description | Reference

Estimators with Objective Minimum Variance
CVTL | CVTL with second shrinkage target | –
CVTLLS | CVTL with δ1 + δ2 = 1 to exclude second shrinkage target | –
QIS | Quadratic-Inverse Shrinkage combining linear and non-linear shrinkage | (Ledoit and Wolf 2021b)
QuEST | Quantized Eigenvalues Sampling Transform: NLS based on Stieltjes transform of the MP density | (Ledoit and Wolf 2015)
LShriCC | Linear shrinkage to constant correlation | (Ledoit and Wolf 2004a)
LShri | Linear shrinkage towards identity matrix | (Ledoit and Wolf 2004b)
FMEst | Frahm Memmel Estimator: Shrinkage between the sample covariance and 1/N portfolio weights | (Frahm and Memmel 2010)
BPSEst | Bodnar Parolya Schmid Estimator: Generalized version of FMEst | (Bodnar et al. 2018)
POET I | Principal Orthogonal complEment Thresholding: constant threshold parameter / dynamic factors count | (Bai and Ng 2002, Fan et al. 2013, 2016)
POET II | Principal Orthogonal complEment Thresholding: constant threshold parameter / heuristic factors count | (Fan et al. 2013, 2016)
BN | Bounded Noise estimator: latest signal/noise separation approach with L = 1000 cross validations | (Zhao et al. 2019)
Sample | Sample covariance estimator | –

Estimators with Objective Maximum Risk-Adjusted Return
CVTL µ/σ | CVTL with second shrinkage target | –
CVTLLS µ/σ | CVTL with δ1 + δ2 = 1 to exclude second shrinkage target | –
BN VAR | BN estimator with objective maximum risk-adjusted return | (Zhao et al. 2019)
NC2R | Generalized unconstrained partial min-var portfolio trained on return of prior period | (DeMiguel et al. 2009a)
CT | Combined Talmud estimator: Combination of 50% sample covariance GMV and 50% 1/N portfolio | (Tu and Zhou 2011)
1/N | 1/N portfolio with equal weights | (DeMiguel et al. 2009b)

Dataset. Our evaluation is based on six different datasets. First, we use three stock datasets
from Bloomberg: (i) all US stocks listed on the NYSE, AMEX, and NASDAQ (US), (ii) European
stocks listed in the Stoxx 600 Europe index (EU), and (iii) a geographic mixture of stocks listed in the
MSCI World index (WO). Second, we use three publicly available aggregated industry portfolios
with (iv) 10, (v) 30, and (vi) 49 industries from French (2021) (FFI). All datasets span the time
period from January 3, 2000 to December 31, 2020. We delete all common NaN data points, e.g.,
weekends, national holidays, and other non-trading days. For each dataset, the respective disjoint
history is randomly sampled from the union of the US, EU, and WO datasets.
Rolling Window Evaluation. We perform a rolling window evaluation with one-step-ahead
predictions, where the covariance matrix is estimated based on daily returns. We apply monthly
portfolio rebalancing to keep the turnover low (e.g., De Nard et al. 2019). We analyze low- and
high-dimensional covariance estimation problems with the dimensions
D = N/T ∈ {2, 3/2 = 1.50, 4/3 ≈ 1.33, 2/5 = 0.40, 1/5 = 0.20, 1/7.5 ≈ 0.13}
to increase the robustness of our findings (Rossi and Inoue 2012).
Each result is reported as the mean over 50 evaluations. In each evaluation, we randomly select
N ∈ {100, 200, 300} portfolio constituents from the respective dataset. Industry portfolio results
are based on the full dataset with N ∈ {10, 30, 49} industry constituents. For both types of datasets, we
calculate the following performance metrics as the mean over all occurring out-of-sample periods
between monthly portfolio rebalancing.
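Since D = N/T, each combination of portfolio size and dimension implies an estimation window of T = N/D daily observations. The short sketch below computes these window lengths with exact fractions. Rounding non-integer results to the nearest integer is an assumption, as the text does not state how such cases are handled; the names are hypothetical.

```python
from fractions import Fraction

# The six dimensions D = N/T listed above; 1/7.5 is written as 2/15.
DIMENSIONS = [
    Fraction(2),
    Fraction(3, 2),
    Fraction(4, 3),
    Fraction(2, 5),
    Fraction(1, 5),
    Fraction(2, 15),
]


def window_length(n_assets, dimension):
    """Number of daily observations T implied by D = N/T (rounded, by assumption)."""
    return round(Fraction(n_assets) / dimension)


# Window lengths for every portfolio size and dimension.
windows = {
    n: [window_length(n, d) for d in DIMENSIONS]
    for n in (100, 200, 300)
}
```

For example, N = 300 with D = 1/7.5 implies T = 2250 observations, whereas N = 300 with D = 2 implies only T = 150.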
Performance Metrics. We consider two objectives, namely, minimizing variance and maximizing
risk-adjusted return. Accordingly, we provide annualized out-of-sample volatility and risk-adjusted
return as the annualized return divided by the annualized volatility. In addition, we provide the
turnover and risk-adjusted return after transaction costs assuming 0.25% per trade (Thapa and
Poshakwale 2010). Additional performance metrics are provided in EC.4. We perform two-sided
t-tests with α = 0.05 to check whether the performance of CVTL is significantly different from
existing estimators.
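The metrics above can be sketched in a few lines. The following is a hedged illustration, not the evaluation code of the study: annualization with 252 trading days, the arithmetic mean for returns, the sample standard deviation, and turnover as the sum of absolute weight changes between rebalancing dates are all assumptions, and the function names are hypothetical.

```python
import math
import statistics

TRADING_DAYS = 252       # assumption: trading days per year
COST_PER_TRADE = 0.0025  # 0.25% per trade, as stated in the text


def annualized_volatility(daily_returns):
    """Sample standard deviation of daily returns, scaled to one year."""
    return statistics.stdev(daily_returns) * math.sqrt(TRADING_DAYS)


def annualized_return(daily_returns):
    """Arithmetic mean of daily returns, scaled to one year."""
    return statistics.mean(daily_returns) * TRADING_DAYS


def risk_adjusted_return(daily_returns):
    """Annualized return divided by annualized volatility."""
    return annualized_return(daily_returns) / annualized_volatility(daily_returns)


def turnover(weight_history):
    """Sum of absolute weight changes between consecutive rebalancing dates."""
    total = 0.0
    for prev, curr in zip(weight_history, weight_history[1:]):
        total += sum(abs(c - p) for p, c in zip(prev, curr))
    return total


def transaction_costs(weight_history):
    """Proportional costs implied by the turnover."""
    return COST_PER_TRADE * turnover(weight_history)


# Toy example (illustrative numbers only).
returns = [0.01, 0.03]
weights = [[0.5, 0.5], [0.6, 0.4], [0.6, 0.4]]
ratio = risk_adjusted_return(returns)
traded = turnover(weights)
costs = transaction_costs(weights)
```

Unchanged weights between two rebalancing dates contribute nothing to turnover, so an estimator with stable weights incurs lower costs under this definition.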
5. Results
5.1. Minimum Variance
We first consider the results for the objective minimum variance as shown in Table 2. We present
the out-of-sample volatility for estimation problems with D = 4/3 ≈ 1.33 and D = 0.40. We also show
the results of the estimators with objective maximum risk-adjusted return, as a high risk-adjusted
return may also imply low volatility. The volatility of the best approach per setting is highlighted
in bold. Underlined values indicate that the respective approach is significantly outperformed
by CVTL with p < 0.05. The results show that CVTL and CVTLLS consistently achieve the lowest
volatility among competing estimators independent of the dataset and portfolio size. CVTL
outperforms CVTLLS in most settings. The relative advantage of the second shrinkage target in
CVTL is stronger in the high-dimensional estimation problems. Furthermore, we find that QuEST
consistently leads to the lowest volatility among the competing covariance estimators.

Table 2 Out-of-Sample Annualized Volatility for Different Datasets.

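Columns 1 to 9: High-Dimensional Problem (D = 1.33). Columns 10 to 18: Low-Dimensional Problem (D = 0.40).
D = N/T 1.33 1.33 1.33 1.33 1.33 1.33 1.33 1.33 1.33 0.40 0.40 0.40 0.40 0.40 0.40 0.40 0.40 0.40
N 100 100 100 200 200 200 300 300 300 100 100 100 200 200 200 300 300 300
Data US EU WO US EU WO US EU WO US EU WO US EU WO US EU WO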
CVTL 9.33 9.72 9.21 7.52 8.42 7.97 6.86 7.98 7.45 9.02 9.38 8.83 7.51 8.52 7.88 7.04 8.26 7.72
CVTLLS 9.44 9.80 9.24 7.60 8.46 7.98 6.94 7.98 7.43 9.04 9.41 8.83 7.53 8.52 7.86 7.09 8.23 7.65
QIS 9.42 9.84 9.30 7.60 8.46 8.06 6.92 7.99 7.49 9.08 9.46 8.95 7.58 8.56 7.99 7.10 8.28 7.74
QuEST 9.35 9.77 9.23 7.56 8.43 8.03 6.89 7.97 7.46 9.06 9.46 8.94 7.57 8.57 7.99 7.09 8.28 7.74
LShriCC 10.25 10.59 9.85 8.86 9.51 8.76 8.38 8.89 8.29 9.64 9.94 9.32 8.24 9.12 8.42 7.84 8.79 8.20
LShri 9.76 10.44 9.68 8.22 9.43 8.74 7.66 9.17 8.32 9.39 10.02 9.30 8.03 9.19 8.47 7.63 8.91 8.25
BPSEst 13.55 13.50 12.64 11.69 12.26 11.41 10.72 11.62 10.79 10.07 10.60 9.94 8.51 9.61 8.96 8.02 9.22 8.67
FMEst – – – – – – – – – 10.15 10.73 10.06 8.55 9.69 9.05 8.04 9.29 8.74
POET I – – – – – – – – – 9.86 10.38 10.34 12.63 10.33 14.95 13.82 10.87 26.34
POET II 9.79 10.52 9.93 7.92 8.97 8.50 7.17 8.38 7.84 9.16 9.75 9.19 7.66 8.79 8.15 7.17 8.45 7.87
BN 10.34 10.17 9.64 8.35 8.81 8.38 7.59 8.26 7.83 9.15 9.53 9.01 7.66 8.65 8.00 7.17 8.43 7.77
Sample – – – – – – – – – 10.38 11.11 10.45 8.67 9.93 9.31 8.11 9.47 8.93

CVTL µ/σ 9.72 9.95 9.43 7.83 8.61 8.14 7.15 8.17 7.66 9.56 9.75 9.22 7.97 8.93 8.10 7.49 8.56 8.05
CVTLLS µ/σ 9.77 10.06 9.69 7.89 8.65 8.18 7.20 8.20 7.66 9.85 9.92 9.36 8.13 8.91 8.20 7.77 8.53 8.06
BN VAR 11.17 11.14 10.39 9.44 9.95 9.34 9.06 9.86 9.16 10.27 10.93 10.19 8.69 9.92 9.05 7.95 9.76 8.73
NC2R 13.03 12.53 11.99 12.63 12.47 11.52 14.18 13.88 12.81 12.97 12.50 11.80 12.96 12.63 11.57 14.63 14.00 12.99
CT – – – – – – – – – 11.63 11.52 10.67 10.71 10.75 9.85 10.70 10.73 9.93
1/N 17.19 16.40 14.97 16.77 16.13 14.52 16.92 16.29 14.66 17.27 16.41 14.90 16.81 15.85 14.19 17.17 16.23 14.55
Note: Results of all evaluated covariance estimators for US, EU and WO data with N ∈ {100, 200, 300}. Each value is
given in percent as the average over 50 evaluations of random portfolio constituents. Out-of-sample volatility greater
than 50 percent due to estimation errors is denoted by “–”. The best estimator per problem setting is highlighted in
bold. Underlined values indicate significant differences from CVTL with p < 0.05.
We further assess the results for out-of-sample volatility with respect to different dimensions
D = N/T ∈ {2, 1.50, 1.33, 0.40, 0.20, 0.13} of the sample data X. The results for the US dataset are
shown in Table 3. The results for all other datasets can be found in EC.4. Again, CVTL generally
outperforms the competing estimators. The difference between CVTL and CVTLLS is significantly
larger in the high-dimensional problems with D > 1. However, for the low-dimensional problems
with D < 1, the benefit of the second shrinkage target diminishes and both CVTL models achieve
similar out-of-sample variance.

Table 3 Out-Of-Sample Annualized Volatility for US Data and Different Dimensions.

N 100 100 100 100 100 100 200 200 200 200 200 200 300 300 300 300 300 300
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13

CVTL 9.50 9.32 9.33 9.02 9.01 9.39 7.69 7.61 7.52 7.51 8.19 8.92 6.86 6.91 6.86 7.04 8.09 7.60
CVTLLS 9.66 9.44 9.44 9.04 9.01 9.38 7.81 7.70 7.60 7.53 8.19 8.89 6.95 6.99 6.94 7.09 8.07 7.62
QIS 9.59 9.42 9.42 9.08 9.07 9.39 7.76 7.67 7.60 7.58 8.21 8.89 6.90 6.97 6.92 7.10 8.07 7.59
QuEST 9.55 9.37 9.35 9.06 9.06 9.39 7.72 7.63 7.56 7.57 8.20 8.89 6.88 6.94 6.89 7.09 8.07 7.59
LShriCC 10.33 10.18 10.25 9.64 9.32 9.54 8.93 8.87 8.86 8.24 8.54 9.12 8.23 8.41 8.38 7.84 8.46 7.78
LShri 9.85 9.71 9.76 9.39 9.22 9.49 8.15 8.21 8.22 8.03 8.42 9.04 7.41 7.64 7.66 7.63 8.32 7.69
BPSEst 12.77 12.92 13.55 10.07 9.40 9.57 10.61 10.95 11.69 8.51 8.53 9.10 – 10.02 10.72 8.02 8.41 7.73
FMEst – – – 10.15 9.41 9.58 – – – 8.55 8.54 9.10 – – – 8.04 8.41 7.72
POET I – – – 9.86 9.29 9.53 – – – 12.63 8.62 9.13 – – – 13.82 8.58 7.82
POET II 11.54 10.03 9.79 9.16 9.17 9.50 9.02 8.11 7.92 7.66 8.31 9.04 7.87 7.33 7.17 7.17 8.23 7.75
BN 10.81 10.42 10.34 9.15 9.17 9.57 8.83 8.51 8.35 7.66 8.33 9.05 7.82 7.73 7.59 7.17 8.20 7.76
Sample – – – 10.38 9.49 9.61 – – – 8.67 8.56 9.10 – – – 8.11 8.41 7.72

CVTL µ/σ 9.92 9.66 9.72 9.56 9.76 10.26 7.92 7.78 7.83 7.97 8.77 9.90 7.24 7.16 7.15 7.49 9.07 7.92
CVTLLS µ/σ 10.06 9.66 9.77 9.85 9.97 10.45 7.95 7.86 7.89 8.13 9.01 9.52 7.35 7.32 7.20 7.77 8.76 8.03
BN VAR 11.41 11.13 11.17 10.27 10.40 10.64 9.81 9.50 9.44 8.69 9.36 9.83 9.14 9.12 9.06 7.95 8.88 8.42
NC2R 13.07 13.15 13.03 12.97 13.14 13.73 12.60 12.76 12.63 12.96 13.77 14.22 13.69 13.81 14.18 14.63 15.40 13.03
CT – – – 11.63 11.42 11.83 – – – 10.71 11.39 12.02 – – – 10.70 11.60 10.12
1/N 17.03 17.03 17.19 17.27 16.98 17.40 16.66 16.82 16.77 16.81 17.67 18.37 16.72 16.83 16.92 17.17 18.32 15.18
Note: Results of all evaluated covariance estimators for US data with N ∈ {100, 200, 300} and D ∈
{2, 1.50, 1.33, 0.40, 0.20, 0.13}. Each value is given in percent as the average over 50 evaluations of random portfolio
constituents. Out-of-sample volatility greater than 50 percent due to estimation errors is denoted by “–”. The best
estimator per problem setting is highlighted bold. Underlined values indicate significant differences from CVTL with
p < 0.05.
Regarding the competing estimators, we observe a general pattern that the achieved volatilities
are less dispersed for D < 1. We find that QuEST (Ledoit and Wolf 2015) is even the overall
dominant approach in three low-dimensional problems, namely N = 200, D = 0.13 and
N = 300, D ∈ {0.20, 0.13}. Other estimators that rely on the weights given by the sample covariance
matrix, namely FMEst and CT, show significantly worse performance for high D, up to the point
where the resulting portfolio yields an out-of-sample volatility above 50 percent, which is denoted
by “–”.
Interestingly, we find that CVTL, QuEST, and QIS (Ledoit and Wolf 2021b) achieve comparatively
lower variances for higher than for lower dimensions D. This seems surprising, as a greater D implies
a more deteriorated covariance matrix with greater imbalance in eigenvalues and, thus, a more
challenging estimation problem. However, this difference in volatility can be explained by the fact that the
number and timeframe of the individual out-of-sample periods in the rolling window evaluations

are different. For instance, the setting D = 0.13 and N = 300 requires 2250 datapoints, or ten years
of return history to estimate the first covariance matrix. All of these datapoints are hence not used
as out-of-sample periods. Conversely, the setting D = 2 and N = 300 requires only 150 datapoints
to estimate the first covariance matrix. Accordingly, the high-dimensional evaluation is based on
more and older datapoints than the low-dimensional evaluation.
Figure 3 Visualization of Performance.
(a) Annualized Volatility vs. Annualized Return (b) Annualized Volatility vs. Risk-Adjusted Return
Note: Visualization of out-of-sample annualized volatility, annualized return, and risk-adjusted return for all covariance
estimators based on US data with N = 100 and D = 0.20. Each value is given in percent as the average over 50
simulations with random portfolio constituents. The area under the frontier created by CVTL provides additional
diversification benefits.
Figure 3a illustrates the out-of-sample performance of each estimator along the efficient frontier
in a return versus volatility diagram for N = 100 and D = 0.20. In addition to our prior findings,
we observe that CVTL leads to comparable or better returns than most competing estimators. As
a result, it shifts the efficient frontier, which traditionally starts at the sample portfolio, to the
upper left by reducing variance and increasing return.
We also consider the results for out-of-sample volatility based on small aggregated industry
portfolios as done by Zhao et al. (2019). Note that these datasets are publicly available (see French
2021), which facilitates reproducibility of our results. Table 4 presents the results for volatility for
N ∈ {10, 30, 49} industry portfolios. CVTL now achieves only average results if the portfolio is very
small, with N = 10. However, both CVTL approaches become superior to the competing approaches
for larger portfolios with N > 10. For N = 10, POET I achieves the lowest volatility. Besides, we
find similar patterns for the otherwise superior QuEST model, which is outperformed by linear
shrinkage for N < 49. This was already suggested by Ledoit and Wolf (2021a) and is mainly due to
the complexity of fitting a non-linear function for small sample data X_{T×N}.
Table 4 Out-Of-Sample Annualized Volatility for FFI Data and Different Dimensions.
N 10 10 10 10 10 10 30 30 30 30 30 30 49 49 49 49 49 49
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13

CVTL 14.14 13.76 13.60 12.79 12.14 12.05 11.89 11.85 11.68 11.25 10.82 10.89 11.23 10.99 10.89 10.37 10.32 10.11
CVTLLS 14.00 13.58 13.42 12.27 11.77 11.85 12.05 11.89 11.71 11.21 10.78 10.86 11.16 10.99 10.85 10.34 10.31 10.08
QIS 13.43 13.49 13.63 12.36 11.98 11.96 12.12 12.15 12.14 11.41 11.09 11.08 11.33 11.12 11.14 10.63 10.50 10.24
QuEST 13.25 12.97 12.73 12.26 11.95 11.97 11.99 11.94 11.76 11.41 11.08 11.08 11.23 11.06 11.04 10.60 10.50 10.24
LShriCC 13.32 13.12 12.91 12.15 12.04 12.11 12.14 12.32 12.04 11.67 11.10 11.05 11.46 11.54 11.29 10.81 10.56 10.31
LShri 13.11 12.78 12.56 12.04 11.77 11.82 11.95 12.03 11.93 11.47 11.02 11.03 11.46 11.29 11.28 10.77 10.56 10.28
BPSEst 14.69 14.97 15.33 13.09 12.16 12.05 14.14 14.37 15.26 12.25 11.40 11.21 13.24 13.66 14.35 11.65 10.81 10.40
FMEst – 15.55 – 13.23 12.22 12.09 – – – 12.42 11.44 11.23 – – – 11.81 10.83 10.41
POET I 13.11 12.71 12.43 11.89 11.61 11.70 11.95 12.15 13.37 11.43 10.92 10.94 27.62 12.79 16.88 10.69 10.41 10.16
POET II 15.86 14.52 13.93 12.41 12.11 12.17 13.25 12.68 12.43 11.64 11.33 11.37 12.22 11.39 11.42 10.72 10.67 10.47
BN 13.31 12.83 12.63 12.01 11.85 11.89 12.03 11.94 11.69 11.36 10.97 11.07 11.39 11.21 11.03 10.60 10.47 10.25
Sample – – – 13.91 12.43 12.21 – – – 13.03 11.61 11.33 – – – 12.31 10.98 10.50

CVTL µ/σ 23.52 13.63 15.78 13.02 12.64 13.58 13.33 13.32 13.11 12.28 12.07 11.79 11.62 11.61 11.80 10.84 10.97 10.95
CVTLLS µ/σ 40.22 13.66 26.38 12.47 12.76 12.90 13.20 12.32 12.47 11.97 11.52 11.70 11.72 11.09 11.04 10.69 11.12 11.19
BN VAR 13.84 13.91 13.69 13.67 13.22 13.55 13.15 13.53 12.59 12.59 12.54 12.63 12.10 12.10 11.98 11.95 12.02 12.37
NC2R 14.26 13.81 13.80 13.55 13.18 13.24 14.00 14.20 13.85 13.58 13.29 12.99 13.44 13.14 13.45 12.96 13.24 12.91
CT – – 46.79 12.82 12.44 12.50 – – – 12.74 12.34 12.31 – – – 12.22 12.04 11.80
1/N 15.54 15.55 15.56 15.51 15.52 15.68 16.80 16.80 16.76 17.00 16.78 16.83 16.69 16.88 16.72 16.66 16.75 16.43
Note: Results of all evaluated covariance estimators for FFI industry portfolios with N ∈ {10, 30, 49} and
D ∈ {2, 1.50, 1.33, 0.40, 0.20, 0.13}. Out-of-sample volatility greater than 50 percent due to estimation errors is denoted
by “–”. The best estimator per problem setting is highlighted in bold. Underlined values indicate significant differences
from CVTL with p < 0.05.

5.2. Maximum Risk-Adjusted Return
Next, we consider all covariance estimators based on the objective maximum risk-adjusted return.
Table 5 presents the out-of-sample risk-adjusted return as annual return divided by annual volatility.
Again, we consider all estimators, including the estimators that were specifically designed to
minimize volatility as a high risk-adjusted return may also be achieved through low volatility. In
this case, underlined values indicate that the respective approach is significantly outperformed with
p < 0.05 by CVTL µ/σ.
Table 5 Out-of-Sample Risk-Adjusted Return and Different Datasets.

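Columns 1 to 9: High-Dimensional Problem (D = 1.33). Columns 10 to 18: Low-Dimensional Problem (D = 0.40).
D = N/T 1.33 1.33 1.33 1.33 1.33 1.33 1.33 1.33 1.33 0.40 0.40 0.40 0.40 0.40 0.40 0.40 0.40 0.40
N 100 100 100 200 200 200 300 300 300 100 100 100 200 200 200 300 300 300
Data US EU WO US EU WO US EU WO US EU WO US EU WO US EU WO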

CVTL 0.98 1.12 0.88 1.38 1.52 1.03 1.25 1.48 0.99 0.81 1.10 0.75 1.09 1.43 1.08 1.01 1.42 0.99
CVTLLS 0.99 1.11 0.92 1.38 1.49 1.07 1.24 1.46 1.02 0.82 1.09 0.79 1.09 1.40 1.10 1.00 1.39 1.01
QIS 0.97 1.06 0.88 1.36 1.46 1.02 1.21 1.44 0.96 0.76 1.05 0.74 1.05 1.40 1.02 0.98 1.39 0.99
QuEST 0.97 1.06 0.88 1.38 1.48 1.02 1.23 1.45 0.97 0.77 1.06 0.73 1.05 1.40 1.02 0.98 1.39 0.99
LShriCC 0.58 0.69 0.61 0.79 0.88 0.71 0.44 0.78 0.58 0.50 0.79 0.60 0.64 0.82 0.76 0.46 0.81 0.71
LShri 0.92 1.00 0.87 1.22 1.32 1.01 0.98 1.11 0.81 0.72 0.97 0.74 0.90 1.14 0.92 0.79 1.09 0.90
BPSEst 0.78 0.82 0.82 0.94 0.92 0.80 0.79 0.78 0.73 0.67 0.85 0.74 0.85 1.03 0.87 0.76 0.99 0.85
FMEst – – – – – – – – – 0.63 0.84 0.71 0.82 1.00 0.83 0.75 0.99 0.83
POET I – – – – – – – – – 0.87 1.06 0.86 0.49 1.52 0.48 1.06 1.24 0.18
POET II 0.83 0.93 0.81 1.21 1.34 1.02 1.04 1.25 0.96 0.69 1.00 0.74 0.90 1.21 1.02 0.84 1.17 0.99
BN 0.98 1.08 0.86 1.33 1.45 1.00 1.31 1.51 0.99 0.81 1.12 0.75 1.08 1.55 1.12 1.02 1.59 1.03
Sample – – – – – – – – – 0.58 0.81 0.65 0.78 0.96 0.77 0.72 0.97 0.81

CVTL µ/σ 1.00 1.12 0.92 1.39 1.49 1.03 1.26 1.51 0.99 0.89 1.11 0.83 1.24 1.49 1.15 1.09 1.59 1.02
CVTLLS µ/σ 0.98 1.09 0.96 1.39 1.46 1.06 1.25 1.43 1.02 0.90 1.05 0.83 1.22 1.45 1.17 1.07 1.43 1.01
BN VAR 0.73 0.86 0.60 0.89 1.24 0.68 0.86 1.36 0.70 0.55 1.05 0.52 0.81 1.45 1.01 0.88 1.70 1.00
NC2R 0.80 0.77 0.81 0.86 0.95 0.82 0.71 0.82 0.80 0.70 0.78 0.74 0.77 0.90 1.01 0.54 0.83 0.72
CT – – – – – – – – – 0.89 0.92 0.93 1.11 1.15 1.17 0.82 1.02 0.97
1/N 0.83 0.77 0.86 0.88 0.78 0.89 0.77 0.82 0.85 0.86 0.75 0.88 1.03 0.97 1.13 0.68 0.78 0.84
Note: Results of all evaluated covariance estimators for US, EU and WO data with N ∈ {100, 200, 300}. Each value is
given in percent as the average over 50 evaluations of random portfolio constituents. Out-of-sample volatility greater
than 50 percent due to estimation errors is denoted by “–”. The best estimator per problem setting is highlighted in
bold. Underlined values indicate significant differences from CVTL µ/σ with p < 0.05.
The results do not suggest a dominating estimator, which is somewhat expected as the estimation
of returns is usually more unstable than the estimation of risk (Michaud 1989). However, we observe
that CVTL µ/σ and CVTLLS µ/σ usually perform among the best models in the considered low- and
high-dimensional settings. The highest risk-adjusted return in a few settings (D = 1.33, N = 200, EU and
WO datasets) is achieved by CVTL and CVTLLS, although these approaches primarily minimize
variance. The largest outperformance over CVTL µ/σ is achieved by CT in the low-dimensional
problem with N = 100 on the MSCI World (WO) dataset. In addition, we find that CVTL and
CVTL µ/σ achieve similar risk-adjusted returns even though they were designed to optimize different
objectives. There is only one setting where CVTL or CVTL µ/σ do not outperform the approaches
BN VAR and NC2R, which are both designed to maximize risk-adjusted return.
Following our prior analyses, we also assess the results for risk-adjusted return based on the US
dataset and dimensions D = N/T ∈ {2, 1.50, 1.33, 0.40, 0.20, 0.13}. The results are shown in Table 6.
Here, we observe that CVTL µ/σ and CVTLLS µ/σ achieve the highest risk-adjusted return in most
settings.
Table 6 Out-Of-Sample Risk-Adjusted Return for US Data and Different Dimensions.

N 100 100 100 100 100 100 200 200 200 200 200 200 300 300 300 300 300 300
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13

CVTL 1.06 1.13 0.98 0.81 0.94 0.81 1.25 1.45 1.38 1.09 0.73 0.73 1.53 1.33 1.25 1.01 0.72 1.47
CVTLLS 1.10 1.15 0.99 0.82 0.94 0.82 1.25 1.44 1.38 1.09 0.74 0.72 1.51 1.33 1.24 1.00 0.72 1.43
QIS 1.08 1.14 0.97 0.76 0.90 0.79 1.22 1.45 1.36 1.05 0.71 0.69 1.51 1.30 1.21 0.98 0.68 1.42
QuEST 1.08 1.14 0.97 0.77 0.90 0.79 1.23 1.46 1.38 1.05 0.71 0.69 1.53 1.32 1.23 0.98 0.68 1.42
LShriCC 0.71 0.81 0.58 0.50 0.72 0.66 0.65 0.81 0.79 0.64 0.47 0.52 0.76 0.57 0.44 0.46 0.38 1.15
LShri 1.06 1.12 0.92 0.72 0.85 0.76 1.15 1.32 1.22 0.90 0.63 0.65 1.34 1.08 0.98 0.79 0.58 1.32
BPSEst 0.89 0.91 0.78 0.67 0.83 0.75 1.02 1.07 0.94 0.85 0.62 0.64 0.34 0.99 0.79 0.76 0.57 1.30
FMEst – – – 0.63 0.82 0.75 – – – 0.82 0.62 0.64 – – – 0.75 0.57 1.30
POET I – – – 0.87 0.99 0.84 – – – 0.49 0.80 0.75 – – – 1.06 0.78 1.46
POET II 0.87 1.04 0.83 0.69 0.83 0.75 1.01 1.24 1.21 0.90 0.62 0.65 1.19 1.15 1.04 0.84 0.62 1.31
BN 1.01 1.05 0.98 0.81 0.98 0.83 1.15 1.44 1.33 1.08 0.75 0.75 1.46 1.29 1.31 1.02 0.76 1.53
Sample – – – 0.58 0.79 0.73 – – – 0.78 0.60 0.64 – – – 0.72 0.56 1.30

CVTL µ/σ 1.08 1.10 1.00 0.89 1.11 0.84 1.27 1.46 1.39 1.24 0.82 0.83 1.51 1.34 1.26 1.09 0.87 1.52
CVTLLS µ/σ 1.11 1.12 0.98 0.90 1.10 0.84 1.24 1.44 1.39 1.22 0.84 0.82 1.51 1.32 1.25 1.07 0.86 1.52
BN VAR 0.80 0.82 0.73 0.55 0.74 0.74 0.79 1.04 0.89 0.81 0.58 0.62 0.80 0.72 0.86 0.88 0.68 1.58
NC2R 0.80 0.72 0.80 0.70 0.72 0.43 0.74 0.75 0.86 0.77 0.71 0.38 0.90 0.84 0.71 0.54 0.51 0.86
CT – – – 0.89 1.08 0.79 – – – 1.11 0.76 0.69 – – – 0.82 0.67 1.05
1/N 0.93 0.85 0.83 0.86 1.02 0.68 0.92 0.95 0.88 1.03 0.69 0.59 0.89 0.80 0.77 0.68 0.59 0.75
Note: The best estimator per problem setting is highlighted in bold. Underlined values indicate significant
differences from CVTL µ/σ with p < 0.05.
Figure 3b illustrates the out-of-sample performance of each estimator along the efficient frontier
in a risk-adjusted return versus volatility diagram for N = 100 and D = 0.20. We observe that, by
introducing CVTL µ/σ, we obtain an improved efficient frontier between CVTL, CVTL µ/σ, and the 1/N
portfolio that dominates existing estimators. Importantly, as shown by Figure 3a, the increase in
out-of-sample risk-adjusted return of CVTL µ/σ is not caused by strategic scaling, e.g., decreasing
variance towards zero to improve risk-adjusted return. Instead, CVTL µ/σ exhibits levels of absolute
risk and return that are comparable to, but improved over, those of the traditional efficient frontier.
The results for maximum risk-adjusted return on the FFI industry portfolios are shown in Table 7.
The results differ from those of our prior analyses on risk-adjusted return based on single stock data.
For small portfolios with N = 10, POET II leads to the highest risk-adjusted return. For N = 49,
linear shrinkage to constant correlation (LShriCC) dominates all other estimators. The CVTL
estimators perform weakly, which seems to be caused by the small portfolio sizes, in particular for
N < 49.
Table 7 Out-Of-Sample Risk-Adjusted Return for FFI Data and Different Dimensions.
N 10 10 10 10 10 10 30 30 30 30 30 30 49 49 49 49 49 49
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13
CVTL 0.63 0.77 0.62 0.68 0.89 0.96 0.78 0.82 0.83 0.84 0.94 0.92 0.75 0.81 0.88 0.89 0.73 0.95
CVTLLS 0.66 0.78 0.64 0.70 0.94 0.98 0.78 0.85 0.84 0.86 0.93 0.93 0.77 0.86 0.88 0.90 0.72 0.93
QIS 0.85 0.92 0.98 0.88 1.07 1.04 0.59 0.86 0.81 0.80 0.87 0.90 0.82 0.81 0.71 0.73 0.64 0.85
QuEST 0.92 0.94 0.92 0.91 1.11 1.05 0.62 0.82 0.76 0.79 0.86 0.91 0.79 0.78 0.71 0.78 0.65 0.85
LShriCC 0.79 0.84 0.90 0.87 0.95 0.87 0.71 0.65 0.65 0.79 0.90 0.93 0.85 0.99 0.96 0.85 0.71 0.91
LShri 0.68 0.82 0.76 0.82 1.04 1.04 0.73 0.83 0.77 0.83 0.89 0.91 0.76 0.72 0.75 0.81 0.67 0.88
BPSEst 0.62 0.72 0.69 0.81 1.05 1.02 0.50 0.82 0.72 0.78 0.85 0.87 0.79 0.74 0.80 0.59 0.60 0.81
FMEst – 0.68 – 0.84 1.08 1.04 – – – 0.74 0.85 0.88 – – – 0.55 0.60 0.81
POET I 0.67 0.89 0.89 0.85 1.02 1.01 0.78 0.80 0.72 0.60 0.83 0.86 0.36 0.65 0.16 0.72 0.62 0.85
POET II 1.16 1.30 1.27 0.99 1.11 0.96 0.61 0.84 0.78 0.72 0.79 0.81 0.82 0.75 0.79 0.87 0.63 0.74
BN 0.78 0.84 0.71 0.86 1.09 1.06 0.70 0.84 0.84 0.78 0.85 0.89 0.67 0.84 0.80 0.79 0.63 0.91
Sample – – – 0.83 1.11 1.06 – – – 0.68 0.86 0.89 – – – 0.50 0.60 0.79

CVTL µ/σ 0.74 0.80 0.27 0.55 0.79 0.75 0.65 0.84 0.95 0.85 0.94 0.89 0.75 0.86 0.82 1.00 0.73 1.02
CVTLLS µ/σ 1.07 0.81 – 0.73 0.84 0.87 0.64 0.83 0.76 0.82 1.01 0.91 0.56 0.86 0.86 0.95 0.71 0.98
BN VAR 0.67 0.93 0.95 0.86 1.04 0.97 0.70 0.73 0.92 0.78 0.94 1.02 0.71 0.93 0.88 0.95 0.73 0.86
NC2R 0.54 0.78 0.73 0.61 0.76 0.79 0.46 0.65 0.68 0.71 0.49 0.74 0.68 0.76 0.70 0.74 0.64 0.84
CT – – 0.86 0.75 0.91 0.95 – – – 0.81 0.81 0.85 – – – 0.74 0.63 0.85
1/N 0.63 0.68 0.52 0.49 0.57 0.68 0.54 0.62 0.74 0.69 0.60 0.64 0.51 0.66 0.58 0.72 0.51 0.72
Note: Results of all evaluated covariance estimators for FFI industry portfolios with N ∈ {10, 30, 49} and
D ∈ {2, 1.50, 1.33, 0.40, 0.20, 0.13}. Out-of-sample volatility greater than 50 percent due to estimation errors is denoted
by “–”. The best estimator per problem setting is highlighted in bold. Underlined values indicate significant differences
from CVTL µ/σ with p < 0.05.
We also consider out-of-sample risk-adjusted return after transaction costs of 0.25 percent per
trade (Thapa and Poshakwale 2010), for which we briefly summarize the findings. Detailed results
are provided in EC.1. We first consider the turnover rates (i.e., the sum of weight changes during
the evaluation period), as they directly influence transaction costs. Here, we find that CVTL
usually achieves the lowest turnover of all estimators with the objective of minimizing risk. Among the
estimators with objective maximum risk-adjusted return, CVTL µ/σ most often achieves the lowest
turnover rates. The only estimator that shows lower turnover for N = 300 is NC2R. This can be
expected as NC2R employs weight constraints. Regarding risk-adjusted return after transaction
costs, we can confirm the results of Kourtis (2015), which suggest that it is hard to outperform
the 1/N portfolio on a risk-adjusted basis after accounting for transaction costs. However, CVTL µ/σ
and CVTLLS µ/σ are the only estimators that outperform the 1/N portfolio in several settings. The
results on risk-adjusted return after costs for other datasets are provided in EC.4.
5.3. Sensitivity Analyses
We now assess the sensitivity of CVTL with regard to the cross validation parameters. Specifically,
we consider a higher number of disjoint histories used in cross validation and different sizes of
the search grid for the covariance estimation parameters γ, δ1, δ2. We analyze the sensitivity for
D ∈ {0.40, 1.33} and portfolio size N = 100 based on the US dataset. In addition, we analyze the
influence of the parameter ζ, which defines the function (24) that reduces the parameter
search space when maximizing risk-adjusted return.
In our main analyses, we used only one transfer dataset (i.e., one disjoint history) for cross validation
and the parameter grid from (17). In this analysis, we evaluate the difference in performance
for 1, ..., 5 transfer sets and two additional parameter grids. We consider a smaller grid $P_{\text{small}}$,
$$\gamma \in \{-1, -\tfrac{1}{3}, \ldots, 1\} \tag{25}$$
$$\delta_1, \delta_2 \in \{0, 0.1, \ldots, 1\} \quad \text{with } \delta_1 + \delta_2 \le 1 \tag{26}$$
and a larger grid $P_{\text{large}}$,
$$\gamma \in \{-1, -0.9, \ldots, 1\} \tag{27}$$
$$\delta_1, \delta_2 \in \{0, 0.02, \ldots, 1\} \quad \text{with } \delta_1 + \delta_2 \le 1 \tag{28}$$
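The grids (25)-(28) can be enumerated directly, which makes the difference in search effort concrete. The sketch below is only an illustration: `build_grid` is a hypothetical helper, equal spacing of γ is inferred from the first two listed elements, and exact fractions are used so that the constraint δ1 + δ2 ≤ 1 is not affected by floating-point rounding.

```python
from fractions import Fraction
from typing import List, Tuple

Config = Tuple[Fraction, Fraction, Fraction]  # (gamma, delta1, delta2)


def build_grid(gamma_step: Fraction, delta_step: Fraction) -> List[Config]:
    """Enumerate (gamma, delta1, delta2) with gamma in [-1, 1] and delta1 + delta2 <= 1."""
    n_gamma = int(2 / gamma_step)  # number of steps from -1 to 1
    n_delta = int(1 / delta_step)  # number of steps from 0 to 1
    gammas = [-1 + k * gamma_step for k in range(n_gamma + 1)]
    deltas = [k * delta_step for k in range(n_delta + 1)]
    return [(g, d1, d2)
            for g in gammas
            for d1 in deltas
            for d2 in deltas
            if d1 + d2 <= 1]


# Small grid: gamma step 2/3, delta step 0.1. Large grid: gamma step 0.1, delta step 0.02.
p_small = build_grid(Fraction(2, 3), Fraction(1, 10))
p_large = build_grid(Fraction(1, 10), Fraction(1, 50))
print(len(p_small), len(p_large))
```

Under these assumptions the small grid contains 264 configurations and the large grid 27,846, so the large grid requires roughly a hundred times as many cross validation runs per transfer set.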

Table 8 shows the difference in basis points (bps, 1 basis point = 0.01%) in the respective
performance metric between the modified parameter setting and the reference setting (i.e., one
transfer set and the medium-sized grid (17)). The scores are calculated such that positive values
indicate an improvement, while negative values indicate a decline in the respective performance
metric. For the objective minimum variance, increasing the number of transfer sets has a small
positive effect as it reduces out-of-sample volatility by 1 bp in the high-dimensional setting and
by 2 bps in the low-dimensional setting. Increasing the grid size from medium
to large reduces out-of-sample volatility by 5 bps in the high-dimensional setting and by 8 bps in
the low-dimensional setting. Conversely, reducing the grid size from medium to small increases
out-of-sample volatility by up to 5 bps in the high-dimensional setting and by up to 8 bps in the
low-dimensional setting. Interestingly, we find that increasing the grid and the number of transfer
sets simultaneously leads to higher out-of-sample volatility than using only one transfer set in
combination with a large grid.
Table 8 Sensitivity of CVTL Performance Towards Different Search Grids and Number of Transfer Sets.
                 High-Dimensional Setting (D = 1.33)    Low-Dimensional Setting (D = 0.40)
Search grid      Small      Medium     Large            Small      Medium     Large
Objective Minimum Variance (CVTL)
Transfer sets
1                −5         –          5                −8         –          8
2                −5         1          0                −6         2          4
3                −5         1          0                −6         2          4
4                −4         1          0                −6         2          4
5                −5         1          0                −6         2          4
Objective Risk-Adjusted Return (CVTL µ/σ)
Transfer sets
1                2          –          −3               8          –          −8
2                4          1          −2               7          2          −5
3                4          1          0                9          2          −6
4                3          0          0                11         2          −5
5                4          2          0                13         3          −5
Note: Scores are given in basis points. Positive values indicate improvements, while negative values indicate declines in
the respective performance metric. The reference setting is given by a medium-sized grid and one transfer set for cross
validation.
For CVTL µ/σ with the objective maximum risk-adjusted return, increasing the number of transfer
sets has a greater positive effect of up to 2 bps in the high-dimensional and 3 bps in the
low-dimensional setting. However, increasing the grid size from medium to large decreases the
risk-adjusted return by up to 8 bps in the low-dimensional setting. In particular, we find that it is best
to decrease the grid from medium to small as this increases out-of-sample risk-adjusted return by up
to 4 bps in the high-dimensional and 13 bps in the low-dimensional setting (see Table 8). These findings
suggest that using a very fine-grained grid leads to overfitting when optimizing the covariance estimate to
maximize risk-adjusted return. Instead, it is beneficial to use a coarser grid to achieve better
generalizability of the parameters found via cross validation.
We also analyze the influence of the ζ parameter that is used to decide whether or not a given
parameter configuration δ1, δ2, γ is admissible when optimizing for risk-adjusted return. In our
main analysis, we set ζ = 0.30. We now analyze how out-of-sample risk-adjusted return changes for
different values ζ ∈ {0.10, 0.20, 0.30, 0.40, 0.50} and when the search space reduction is removed. The
results are shown in Table 9. Evidently, it is generally beneficial to filter the parameter configurations
so that the resulting portfolios exhibit non-extreme volatility in cross validation. Not
filtering the parameter configurations can also lead to impressively good results, e.g., for N = 200 and
D = 0.13. However, we believe that such scores only appeared by chance and will not be reproducible
in the future. In addition, we find that risk-adjusted return changes only marginally for different
values of ζ. The difference in risk-adjusted return across different values of ζ is a maximum of 6 bps
in the high-dimensional problems and a maximum of 10 bps in the low-dimensional problems.
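The filter-then-select pattern described above can be sketched in a few lines of Python. Function (24) is defined earlier in the paper and is not reproduced in this section, so the admissibility rule used here (a configuration is admissible if its cross-validated volatility lies within a factor 1 + ζ of the lowest cross-validated volatility) is a hypothetical stand-in chosen only for illustration. The class name, the function name, and all scores are likewise invented.

```python
from typing import List, NamedTuple


class CvScore(NamedTuple):
    name: str          # label of a (gamma, delta1, delta2) configuration
    volatility: float  # cross-validated volatility
    sharpe: float      # cross-validated risk-adjusted return


def select_config(scores: List[CvScore], zeta: float) -> CvScore:
    """Pick the best risk-adjusted return among volatility-admissible configurations.

    ASSUMPTION: admissible means volatility <= (1 + zeta) * lowest volatility.
    This is an illustrative rule, not the paper's function (24).
    """
    floor = min(s.volatility for s in scores)
    admissible = [s for s in scores if s.volatility <= (1.0 + zeta) * floor]
    return max(admissible, key=lambda s: s.sharpe)


scores = [
    CvScore("a", volatility=0.10, sharpe=0.9),
    CvScore("b", volatility=0.12, sharpe=1.1),
    CvScore("c", volatility=0.25, sharpe=2.7),  # extreme volatility, possibly a lucky return
]
print(select_config(scores, zeta=0.30).name)          # with search space reduction
print(select_config(scores, zeta=float("inf")).name)  # without search space reduction
```

With the filter, the extreme configuration "c" is excluded and "b" is selected; without it, "c" wins on its in-sample score alone, which mirrors the concern that such scores may have appeared by chance.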
Table 9 Out-Of-Sample Risk-Adjusted Return for US Data and Different Dimensions.
N         100                                 200                                 300
D = N/T   2     1.50  1.33  0.40  0.20  0.13  2     1.50  1.33  0.40  0.20  0.13  2     1.50  1.33  0.40  0.20  0.13
CVTL µ/σ with Search Space Reduction
ζ = 0.10  1.08  1.12  1.00  0.86  1.15  0.82  1.28  1.44  1.34  1.26  0.79  0.81  1.50  1.30  1.20  0.97  0.79  1.44
ζ = 0.20  1.08  1.12  1.01  0.88  1.13  0.84  1.27  1.44  1.38  1.25  0.80  0.82  1.51  1.34  1.25  1.05  0.84  1.50
ζ = 0.30  1.08  1.10  1.00  0.89  1.11  0.84  1.27  1.46  1.39  1.24  0.82  0.83  1.51  1.34  1.26  1.09  0.87  1.52
ζ = 0.40  1.09  1.10  0.99  0.89  1.08  0.84  1.30  1.45  1.36  1.24  0.83  0.85  1.51  1.35  1.25  1.11  0.85  1.54
ζ = 0.50  1.06  1.10  0.99  0.88  1.06  0.84  1.29  1.45  1.36  1.22  0.83  0.86  1.51  1.34  1.23  1.11  0.86  1.54
CVTL µ/σ without Search Space Reduction
          0.71  1.09  0.73  0.87  1.02  0.84  1.30  1.44  1.32  1.15  0.79  2.74  1.47  1.34  1.28  1.12  0.73  0.77
constituents. The best estimator per problem setting is highlighted in bold.

5.4. Empirical Interpretation and Relation to Other Estimators
Finally, we analyze the resulting estimation parameters and eigenvalue distributions from an
empirical point of view. In addition, we compare the resulting eigenvalue distributions with the
established non-linear shrinkage estimator QuEST (Ledoit and Wolf 2015) and BN (Zhao et al.
2019).
Estimation Parameters of CVTL We start by considering the most frequent parameters
γ, δ1, δ2 that are used to estimate the future covariance based on sample data. Table 10 provides
the average parameters of CVTL and CVTL µ/σ for US data with N = 100 and different dimensions.
To ease readability, we define δ3 = 1 − δ1 − δ2 as the weight of the non-linear shrinkage target λθ.
We observe that δ3 is considerably higher in the high-dimensional settings with D > 1, which is
consistent with the recent results by Ledoit and Wolf (2021b) who also found greater usage of
the non-linear shrinkage target for D > 1. For the low-dimensional problems, δ3 decreases and δ1
increases, which implies a greater weight of the sample eigenvalues. We further observe a decreasing
scaling parameter γ for increasing D. The parameters of CVTL µ/σ show similar patterns. The weight
of the non-linear shrinkage target increases for higher D, while the weight of the sample eigenvalues
increases for lower D. Additional empirical observations about the most frequent parameters for
larger portfolios, parameter values over time, and parameter interdependence are provided in EC.2.
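To make the role of the weights concrete, the following sketch blends a sample spectrum with two shrinkage targets. It is a minimal illustration under stated assumptions: the weights are read as a convex combination with δ3 = 1 − δ1 − δ2, the scaling parameter γ is left out because its functional form is not restated in this section, and both target spectra as well as the function name `blend_eigenvalues` are hypothetical. Only the values δ1 = 0.209 and δ2 = 0.266 are taken from Table 10 (CVTL, D = 2.00).

```python
from typing import List, Sequence


def blend_eigenvalues(sample: Sequence[float],
                      first_target: Sequence[float],
                      second_target: Sequence[float],
                      delta1: float,
                      delta2: float) -> List[float]:
    """Weighted average of sample eigenvalues and two shrinkage targets.

    ASSUMPTION: the weights enter as a convex combination with
    delta3 = 1 - delta1 - delta2; the scaling parameter gamma is omitted.
    """
    if delta1 < 0 or delta2 < 0 or delta1 + delta2 > 1:
        raise ValueError("need delta1, delta2 >= 0 and delta1 + delta2 <= 1")
    delta3 = 1.0 - delta1 - delta2
    return [delta1 * s + delta2 * a + delta3 * b
            for s, a, b in zip(sample, first_target, second_target)]


sample = [4.0, 1.0, 0.0]  # singular sample spectrum, as can occur when N > T
flat = [5.0 / 3.0] * 3    # hypothetical target weighted by delta2: grand mean
smooth = [3.0, 1.5, 0.5]  # hypothetical target weighted by delta3
shrunk = blend_eigenvalues(sample, flat, smooth, delta1=0.209, delta2=0.266)
print(shrunk)
```

Because all three input spectra in this example have the same sum, the blended spectrum preserves the trace while lifting the zero eigenvalue away from zero, which is the practical reason a low δ1 is useful when D > 1.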
Table 10 Average Parameter Calibration for CVTL Estimators and Different Sample Dimensions.
         CVTL                           CVTL µ/σ
D        δ1     δ2     δ3     γ         δ1     δ2     δ3     γ
2.00     20.9   26.6   52.4   88.6      23.2   43.0   33.8   40.8
1.50     30.7   32.9   36.4   90.9      24.0   47.5   28.5   59.0
1.33     36.5   36.7   26.8   91.9      28.6   46.4   25.0   42.1
0.40     60.6   27.5   11.9   84.8      33.5   45.8   20.7   30.0
0.20     68.8   19.5   11.7   81.8      33.7   47.0   19.3   11.4
0.13     69.4   17.7   12.9   69.0      29.3   45.3   25.5   35.8
Note: Average parameter calibration for US data with N = 100 and D ∈ {2, 1.50, 1.33, 0.40, 0.20, 0.13}. Each score is
given in percent as the average over 50 evaluations with random portfolio constituents. δ3 = 1 − δ1 − δ2 .

Eigenvalue Distributions Next, we compare the resulting eigenvalue distributions of CVTL,
QuEST (Ledoit and Wolf 2015) and BN (Zhao et al. 2019). Figure 4 plots the shrunk eigenvalues
against the sample eigenvalues for CVTL (black crosses), QuEST (dark gray circles) and BN
(gray triangles). The plotted eigenvalues denote the average over 50 evaluations with N = 100 based
on the US data.
Figure 4 Sample and Shrunk Eigenvalues.
[Figure: six panels for D = 2.00, 1.50, 1.33, 0.40, 0.20 and 0.13, each plotting the shrunk eigenvalue against the sample eigenvalue on logarithmic axes from 10^−4 to 10^−2. Legend: CVTL, QuEST, BN, Sample.]
Note: The plotted eigenvalues are averaged over 50 simulations with random portfolio constituents based on the US
data with N = 100.
We make several observations regarding the shrunk eigenvalues of CVTL. First, CVTL shrinks the
eigenvalues in a clearly non-linear way. Second, CVTL shrinks small eigenvalues in a similar fashion
to QuEST, especially for lower dimensions. This is surprising given that CVTL does not impose any
statistically derived lower limit for small eigenvalues. Accordingly, this supports the use of cross
validation for parameter selection, as it leads to results comparable to those of statistically derived
methods. Third, CVTL does not reduce the largest eigenvalue in all dimensions. Instead, it increases the
first eigenvalue in the high-dimensional problems. As a result, portfolio weights in the direction of the
first eigenvector will be suppressed to a stronger extent compared to other estimators. This indicates
that it is not always necessary to decrease the first eigenvalue. Independent of the dimension, the
eigenvalues achieved by BN deviate from QuEST and CVTL, which can be expected due to the
separation of signal and noise. The same analysis for CVTL µ/σ is provided in EC.2.3.
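The suppression effect of a larger first eigenvalue can be checked with the standard closed form of the GMV weights, w = Σ⁻¹1 / (1ᵀΣ⁻¹1), written in terms of the eigenvalues and eigenvectors of Σ. The following sketch is a toy illustration with two hypothetical assets; the eigenvectors, the eigenvalues, and the helper names are assumptions made for the example and are not taken from our data.

```python
import math
from typing import List, Sequence


def gmv_weights(eigvals: Sequence[float],
                eigvecs: Sequence[Sequence[float]]) -> List[float]:
    """GMV weights w = S^-1 1 / (1' S^-1 1), with S given by its spectrum.

    eigvecs[i] is the unit-length eigenvector that belongs to eigvals[i].
    """
    n = len(eigvecs[0])
    raw = [0.0] * n
    for lam, vec in zip(eigvals, eigvecs):
        loading = sum(vec) / lam  # (v_i' 1) / lambda_i
        for j in range(n):
            raw[j] += loading * vec[j]
    total = sum(raw)  # equals 1' S^-1 1
    return [x / total for x in raw]


def exposure(weights: Sequence[float], vec: Sequence[float]) -> float:
    """Absolute projection of the weight vector onto one eigenvector."""
    return abs(sum(w * v for w, v in zip(weights, vec)))


# Two hypothetical assets; the first eigenvector plays the role of the market.
theta = math.pi / 6
v1 = [math.cos(theta), math.sin(theta)]
v2 = [-math.sin(theta), math.cos(theta)]

base = gmv_weights([0.04, 0.01], [v1, v2])
inflated = gmv_weights([0.08, 0.01], [v1, v2])  # largest eigenvalue doubled

print(exposure(base, v1), exposure(inflated, v1))
```

In this example, doubling the largest eigenvalue lowers the projection of the GMV weights onto the first eigenvector, which is the mechanism referred to in the third observation.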
Benefits of Cross Validation and Second Shrinkage Target Finally, we perform an
analysis to estimate the isolated effects of using cross validation for parameter selection and the
benefit of the proposed second shrinkage target. For this purpose, we construct a new covariance
estimator called “QuESTCV” that uses the shrinkage target of QuEST (Ledoit and Wolf 2015)
but selects the shrinkage intensity via cross validation. This allows us to calculate the individual
benefits of (i) cross validation for the selection of shrinkage intensities as the difference in bps between
QuESTCV and QuEST, and (ii) the proposed second shrinkage target λθ as the difference in bps
between CVTL and QuESTCV. Table 11 presents the results for the objective minimum variance
(we only consider this objective as QuEST specifically minimizes volatility). For
instance, the value 7 in the row for cross validation for N = 100 and D = 0.40 indicates that using
cross validation to select the shrinkage intensity of QuEST instead of applying the original QuEST
estimator reduces out-of-sample volatility by 7 bps.
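The decomposition amounts to two differences in out-of-sample volatility, expressed in bps. The following is a minimal sketch, assuming volatilities are given as decimal fractions; the function name and the three input volatilities are made up for illustration and are not results from our evaluation.

```python
from typing import Tuple


def benefit_decomposition(vol_quest: float,
                          vol_quest_cv: float,
                          vol_cvtl: float) -> Tuple[float, float]:
    """Return (cross validation benefit, second target benefit) in bps.

    Positive values mean lower out-of-sample volatility, i.e. an improvement.
    """
    bps = 10_000.0  # 1 bp = 0.01 percentage points of volatility
    cv_benefit = (vol_quest - vol_quest_cv) * bps
    target_benefit = (vol_quest_cv - vol_cvtl) * bps
    return cv_benefit, target_benefit


# Illustrative inputs only: 10.10%, 10.03% and 10.04% out-of-sample volatility.
cv, target = benefit_decomposition(0.1010, 0.1003, 0.1004)
print(round(cv, 2), round(target, 2))
```

The two components add up to the total gap between QuEST and CVTL, so a positive cross validation effect can be partly offset by a negative target effect, as in this example.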
Table 11 Individual Benefits of Cross Validation and Second Shrinkage Target for US Data.
N                    100                                 200                                 300
D = N/T              2     1.50  1.33  0.40  0.20  0.13  2     1.50  1.33  0.40  0.20  0.13  2     1.50  1.33  0.40  0.20  0.13
Cross validation     −1    0     1     7     6     3     −2    −1    2     6     3     1     1     2     3     5     3     0
Shrinkage target λθ  7     5     3     −1    0     −3    4     5     4     0     −1    −3    2     1     1     −1    −3    −1
Note: “Cross validation” denotes the benefit of cross validation as the difference in basis points when using cross
validation to select the shrinkage intensity of QuEST (Ledoit and Wolf 2015) instead of applying the original QuEST
estimator. “Shrinkage target λθ” denotes the benefit of the proposed second shrinkage target as the difference in basis
points when using CVTL instead of QuEST with cross validation to select the shrinkage intensity. Each score denotes
the difference in out-of-sample volatility in basis points, e.g., 1 denotes the difference 10.00% vs. 10.01%. Positive
values indicate an improvement in the respective performance metric. Negative values indicate a decline.

We find that using cross validation for selecting shrinkage intensities has the greatest benefit
for low-dimensional problems, while the benefit for high-dimensional problems is smaller or even
negative. Conversely, using the second shrinkage target of CVTL instead of the non-linear shrinkage
target by QuEST reduces out-of-sample volatility in particular for the high-dimensional problems,
while it often has a negative effect for low-dimensional problems. This supports the recent idea of
Ledoit and Wolf (2021b) to shrink the eigenvalues quadratically for low-dimensional problems using
a non-linear and a linear shrinkage target. Additional analyses about the benefits of cross validation
and the second shrinkage target are provided in EC.3.
6. Concluding Remarks
We proposed “CVTL”, a novel approach for covariance estimation that applies cross validation based
transfer learning. The proposed approach applies non-linear shrinkage to the eigenvalues of
the sample covariance according to a given objective function. In contrast to existing methods, which
generally rely on analytically derived values for the estimation parameters, CVTL is purely data-
driven and agnostic with respect to the resulting eigenvalue distribution and shrinkage intensities.
All estimation parameters are selected using cross validation and the given objective based on a
disjoint history of assets. The resulting parameters are subsequently used to estimate the actual
covariance matrix of the given portfolio constituents. Thereby, our study presents a novel perspective
on the problem of covariance estimation.
Although CVTL is purely data-driven, the resulting shrunk eigenvalues are similar to those
of existing non-linear shrinkage models. However, we found differences in the high-dimensional
problems with objective minimum variance, where CVTL tends to increase the largest eigenvalue,
which is the exact opposite of what existing estimators do. This is surprising as, so far, the underlying
idea of non-linear shrinkage estimators was to push small eigenvalues up and large eigenvalues
down, as recently summarized by Ledoit and Wolf (2021a). A major advantage of CVTL over
existing estimators is its flexibility, as the estimation parameters can be selected based on a
predefined objective function. All portfolio weights are directly calculated through the GMV portfolio
optimization procedure that relies only on the covariance matrix. For instance, to maximize
risk-adjusted return, CVTL does not need to estimate future returns explicitly; they are implicitly
estimated via the parameter configuration chosen to shrink the covariance eigenvalues.
This study opens up several avenues for future research. First, one could improve on the selection
of transfer datasets used during cross validation. In our study, we used random portfolio constituents
for the disjoint history. However, it should be possible to obtain more accurate estimation parameters
by specifically selecting similar stocks, e.g., from the same sectors but in different countries. Second,
the estimation parameters could be selected based on more sophisticated search procedures, like
random search (Bergstra and Bengio 2012), where each parameter value is drawn from a predefined
distribution. Third, it also seems reasonable to evaluate our approach based on other important
criteria like sustainability. Recently, portfolios with a particular focus on ESG (Environmental,
Social and Governance) criteria have received growing attention from investors. A possible objective
function could be defined as the ESG score divided by volatility, which would present a novel way
of generating ESG portfolios with low risk.
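The random search idea mentioned as the second avenue can be sketched as follows. This is not part of the evaluated approach. The sampling distributions (γ uniform on [−1, 1] and the weights uniform on the simplex, so that δ1 + δ2 ≤ 1 holds by construction), the function names, and the toy objective are all assumptions made for illustration; in practice the objective would be the cross-validated score on the transfer set.

```python
import random
from typing import Callable, Tuple

Config = Tuple[float, float, float]  # (gamma, delta1, delta2)


def sample_config(rng: random.Random) -> Config:
    """Draw gamma ~ U[-1, 1] and (delta1, delta2, delta3) uniformly on the simplex."""
    gamma = rng.uniform(-1.0, 1.0)
    # Normalized exponential draws give a flat Dirichlet(1, 1, 1) sample.
    e = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(e)
    return gamma, e[0] / total, e[1] / total


def random_search(objective: Callable[[Config], float],
                  n_draws: int,
                  seed: int = 0) -> Config:
    """Return the sampled configuration with the highest objective value."""
    rng = random.Random(seed)
    return max((sample_config(rng) for _ in range(n_draws)), key=objective)


def toy_objective(config: Config) -> float:
    """Hypothetical stand-in for a cross-validated score (higher is better)."""
    gamma, delta1, delta2 = config
    return -((gamma - 0.5) ** 2 + (delta1 - 0.3) ** 2 + (delta2 - 0.3) ** 2)


best = random_search(toy_objective, n_draws=500)
print(best)
```

Compared with a fixed grid, the number of draws controls the search effort directly, which may help to limit the overfitting observed for very fine grids.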
References
Ackermann F, Pohl W, Schmedders K (2017) Optimal and naive diversification in currency markets. Manage-
ment Sci. 63(10):3347–3360.
Bai J, Ng S (2002) Determining the number of factors in approximate factor models. Econometrica 70(1):191–
221.
Ban GY, El Karoui N, Lim AEB (2018) Machine learning and portfolio optimization. Management Sci.
64(3):1136–1154.
Bastani H (2021) Predicting with proxies: Transfer learning in high dimension. Management Sci. 67(5):2964–
2984.
Bastani H, Simchi-Levi D, Zhu R (2021) Meta dynamic pricing: Transfer learning across experiments.
Management Sci. Forthcoming.
Bergmeir C, Benı́tez JM (2012) On the use of cross-validation for time series predictor evaluation. Inf. Sci.
191:192–213.

Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13(2):281–
305.
Bickel PJ, Levina E (2008) Covariance regularization by thresholding. Ann. Stat. 36(6):2577–2604.
Bodnar T, Parolya N, Schmid W (2018) Estimation of the global minimum variance portfolio in high
dimensions. Eur. J. Oper. Res. 266(1):371–390.
Bun J, Bouchaud JP, Potters M (2017) Cleaning large correlation matrices: Tools from random matrix theory.
Phys. Rep. 666:1–109.
Cai T, Liu W, Luo X (2011) A constrained l1 minimization approach to sparse precision matrix estimation.
J. Am. Stat. Assoc. 106(494):594–607.
Das SR, Markowitz H, Scheid J, Statman M (2010) Portfolio optimization with mental accounts. J. Financial
Quant. Anal. 45(2):311–337.
Das SR, Ostrov D, Radhakrishnan A, Srivastav D (2018) A new approach to goals-based wealth management.
J. Invest. Manag. 16(3):1–27.
De Nard G, Ledoit O, Wolf M (2019) Factor models for portfolio selection in large dimensions: The good, the
better and the ugly. J. Financ. Econom. Forthcoming.
DeMiguel V, Garlappi L, Nogales F, Uppal R (2009a) A generalized approach to portfolio optimization:
Improving performance by constraining portfolio norms. Management Sci. 55(5):798–812.
DeMiguel V, Garlappi L, Uppal R (2009b) Optimal versus naive diversification: How inefficient is the 1/N
portfolio strategy? Rev. Financ. Stud. 22(5):1915–1953.
Dorfman R (1979) A formula for the Gini coefficient. Rev. Econ. Stat. 61(1):146–149.
Engle RF, Ledoit O, Wolf M (2019) Large dynamic covariance matrices. J. Bus. Econ. Stat. 37(2):363–375.
Fan J, Liao Y, Mincheva M (2011) High-dimensional covariance matrix estimation in approximate factor
models. Ann. Stat. 39(6):3320–3356.
Fan J, Liao Y, Liu H (2016) An overview of the estimation of large covariance and precision matrices. Econom.
J. 19(1):C1–C32.

Fan J, Liao Y, Mincheva M (2013) Large covariance estimation by thresholding principal orthogonal comple-
ments. J. R. Stat. Soc. Series. B 75(4):603–680.
Fan J, Liu H, Wang W (2018) Large covariance estimation through elliptical factor models. Ann. Stat.
46(4):1383–1414.
Fan J, Zhang J, Yu K (2012) Vast portfolio selection with gross-exposure constraints. J. Am. Stat. Assoc.
107(498):592–606.
Frahm G, Memmel C (2010) Dominating estimators for minimum-variance portfolios. J. Econom. 159(2):289–
302.
French KR (2021) Current research returns. https://mba.tuck.dartmouth.edu/pages/faculty/ken.
french/data_library.html.
Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso.
Biostatistics 9(3):432–441.
Frost PA, Savarino JE (1986) An empirical Bayes approach to efficient portfolio selection. J. Financial Quant.
Anal. 21(3):293–305.
Jagannathan R, Ma T (2003) Risk reduction in large portfolios: Why imposing the wrong constraints helps.
J. Finance 58(4):1651–1683.
James W, Stein C (1961) Estimation with quadratic loss. Proc. 4th Berkeley Symp. Mathematical Statistics
Probability, 361–380.
Jorion P (1986) Bayes-Stein estimation for portfolio analysis. J. Financial Quant. Anal. 21(3):68–74.
Kourtis A (2015) A stability approach to mean-variance optimization. Financial Rev. 50(3):301–330.
Laloux L, Cizeau P, Bouchaud JP, Potters M (1999) Noise dressing of financial correlation matrices. Phys.
Rev. Lett. 83(7):1467–1470.
Laloux L, Cizeau P, Potters M (2000) Random matrix theory and financial correlations. Int. J. Theor. Appl.
Finance 3(3):391–397.
Lam C, Fan J (2009) Sparsistency and rates of convergence in large covariance matrix estimation. Ann. Stat.
37(6B):4254–4278.

Ledoit O, Wolf M (2003) Improved estimation of the covariance matrix of stock returns with an application
to portfolio selection. J. Empir. Finance 10(5):603–621.
Ledoit O, Wolf M (2004a) Honey, I shrunk the sample covariance matrix. J. Portf. Manag. 30(4):110–119.
Ledoit O, Wolf M (2004b) A well-conditioned estimator for large-dimensional covariance matrices. J. Multivar.
Anal. 88:365–411.
Ledoit O, Wolf M (2012) Nonlinear shrinkage estimation of large-dimensional covariance matrices. Ann. Stat.
40(2):1024–1060.
Ledoit O, Wolf M (2015) Spectrum estimation: A unified framework for covariance matrix estimation and PCA
in large dimensions. J. Multivar. Anal. 139(2):360–384.
Ledoit O, Wolf M (2017a) Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz
meets Goldilocks. Rev. Financ. Stud. 30(12):4349–4388.
Ledoit O, Wolf M (2017b) Numerical implementation of the QuEST function. Comput. Stat. Data. Anal.
115:199–223.
Ledoit O, Wolf M (2020) Analytical nonlinear shrinkage of large-dimensional covariance matrices. Ann. Stat.
48(5):3043–3065.
Ledoit O, Wolf M (2021a) The power of (non-)linear shrinking: A review and guide to covariance matrix
estimation. J. Financ. Econom. Forthcoming.
Ledoit O, Wolf M (2021b) Quadratic shrinkage for large covariance matrices. Bernoulli. Forthcoming.
Ledoit O, Wolf M (2021c) Shrinkage estimation of large covariance matrices: Keep it simple, statistician? J.
Multivar. Anal. 186.
Markowitz HM (1952) Portfolio selection. J. Finance 7(1):77–91.
Michaud RO (1989) The Markowitz optimization enigma: Is ‘optimized’ optimal? Financial Anal. J. 45(1):31–
42.
Nguyen VA, Kuhn D, Esfahani PM (2021) Robust inverse covariance estimation: The Wasserstein shrinkage
estimator. Oper. Res. Forthcoming.

Olivares-Nadal AV, DeMiguel V (2018) Technical note - A robust perspective on transaction costs in portfolio
optimization. Oper. Res. 66(3):733–739.
Pan SJ, Yang Q (2009) A survey on transfer learning. IEEE Trans. Knowl. Data. Eng. 22(10):1345–1359.
Plerou V, Gopikrishnan P, Rosenow B, Amaral LAN, Guhr T, Stanley HE (2002) A random matrix approach
to cross-correlations in financial data. Phys. Rev. E 65(6):066126.
Roncalli T, Weisang G (2016) Risk parity portfolios with risk factors. Quant. Finance 16(3):377–388.
Rossi B, Inoue A (2012) Out-of-sample forecast tests robust to the choice of window size. J. Bus. Econ. Stat.
30(3):432–453.
Shefrin H, Statman M (2000) Behavioral portfolio theory. J. Financial Quant. Anal. 35(2):127–151.
Stein C (1956) Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. Proc.
3rd Berkeley Symp. Mathematical Statistics Probability, 197–206.
Thapa C, Poshakwale SS (2010) International equity portfolio allocations and transaction costs. J. Bank.
Financ. 34(11):2627–2638.
Tu J, Zhou G (2011) Markowitz meets Talmud: A combination of sophisticated and naive diversification
strategies. J. Financ. Econ. 99(1):204–215.
Zhao L, Chakrabarti D, Muthuraman K (2019) Portfolio construction by mitigating error amplification: The
bounded-noise portfolio. Oper. Res. 67(4):965–983.

e-companion to Mörstedt, Lutz, and Neumann: Transfer Learning for Covariance Estimation ec1
E-Companion to “Cross Validation Based Transfer Learning for Financial Covariance Estimation:
A Data-Driven Approach”
This e-companion contains a total of four appendices. EC.1 presents additional results for risk-
adjusted return after transaction costs. EC.2 provides several insights about the observed parameter
configurations of our approach. EC.3 evaluates the possible benefit of our approach over existing
methods. EC.4 presents the results for minimum variance and maximum risk-adjusted return for
several other datasets.
EC.1. Risk-Adjusted Return After Transaction Costs

In this appendix, we present the results for risk-adjusted return after transaction costs of 0.25 percent
per trade (Thapa and Poshakwale 2010). The turnover rates are shown in Table EC.1. The results
for risk-adjusted return after transaction costs for different datasets are shown in Table EC.2 for
D = 1.33 and Table EC.3 for D = 0.40. The results for all dimensions based on the US dataset are
shown in Table EC.4.
Table EC.1 Turnover for Different Portfolio Sizes and Dimensions for US Data.
N 100 200 300
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13
CVTL 21.25 18.57 17.03 9.36 6.61 5.32 17.69 15.53 14.08 8.03 6.43 5.41 15.29 13.39 12.68 7.61 6.18 5.25
CVTLLS 22.78 20.12 18.33 9.90 6.82 5.40 19.59 17.04 15.40 8.56 6.56 5.52 16.66 14.79 13.95 8.12 6.35 5.43
QIS 22.67 21.95 21.77 13.00 8.45 6.55 19.21 18.14 17.63 11.12 7.71 6.18 16.94 16.60 16.33 10.38 7.36 5.53
QuEST 20.48 19.16 18.44 12.09 8.25 6.48 17.25 15.79 15.04 10.73 7.63 6.14 15.10 14.58 14.20 10.14 7.31 5.50
LShriCC 36.50 33.01 31.53 16.52 9.82 7.28 36.30 34.03 33.05 17.00 9.98 7.30 37.01 35.64 34.46 17.68 9.80 6.78
LShri 27.85 27.20 26.99 16.62 9.90 7.31 30.39 30.53 30.58 17.39 10.04 7.34 31.40 32.95 33.20 17.94 9.83 6.76
BPSEst – – – 24.34 11.49 7.94 – – – 22.76 10.92 7.68 – – – 22.22 10.52 7.03
FMEst – – – 25.91 11.66 8.00 – – – 23.65 11.03 7.71 – – – 22.91 10.60 7.05
POET I – – – 18.63 9.52 6.86 – – – 20.89 9.61 6.91 – – – 24.42 9.75 6.52
POET II – 43.65 37.67 15.09 9.55 7.50 – 39.98 34.15 13.85 9.33 7.32 – 38.53 32.73 13.55 9.13 6.59
BN 17.02 16.49 16.00 11.78 7.45 5.87 17.11 16.24 15.71 10.98 7.20 5.98 17.08 16.09 15.45 10.66 7.31 5.78
Sample – – – 28.41 12.26 8.32 – – – 25.12 11.43 7.89 – – – 24.03 10.85 7.13

CVTL µ/σ 19.33 15.79 14.51 7.26 4.89 4.31 15.85 13.36 11.71 6.01 4.89 5.37 11.95 11.14 11.13 5.58 5.18 4.78
CVTLLS µ/σ 21.96 18.57 16.91 6.27 4.19 3.11 18.15 15.93 13.77 5.32 3.70 3.73 11.38 11.29 13.52 4.56 3.95 3.31
BN VAR 22.10 21.26 20.69 15.70 10.25 7.95 21.17 19.74 18.81 11.98 7.55 6.14 18.46 16.26 15.30 8.26 5.60 4.18
NC2R 20.08 19.79 19.12 20.03 22.25 23.26 17.06 19.23 18.10 20.58 19.24 22.39 7.01 6.56 6.26 7.82 7.70 10.07
CT – – – 14.25 6.20 4.26 – – – 12.60 5.78 4.05 – – – 12.04 5.49 3.69
1/N 1.35 1.38 1.37 1.35 1.32 1.30 1.36 1.35 1.37 1.31 1.32 1.32 1.36 1.37 1.34 1.30 1.31 1.23
Note: Empirical results based on the All US data set with N ∈ {100, 200, 300}. Each value is given in percent as the
average over 50 evaluations of random portfolio constituents. Turnover greater than 50 due to estimation errors is
denoted by “–”. The best estimator per problem setting (excluding the 1/N portfolio) is highlighted in bold.

Table EC.2 Out-Of-Sample Annualized Turnover and Risk-Adjusted Return after Cost for Different Datasets.
D Turnover Risk-Adjusted Return After Cost
N 100 200 300 100 200 300
CVTL 17.03 20.64 19.70 14.08 18.50 17.55 12.68 17.93 16.62 0.49 0.55 0.32 0.88 0.91 0.45 0.76 0.87 0.40
CVTLLS 18.33 22.10 20.96 15.40 19.71 18.85 13.95 18.87 17.80 0.47 0.50 0.32 0.83 0.85 0.45 0.70 0.81 0.39
QIS 21.77 23.21 23.36 17.63 20.04 21.04 16.33 19.42 20.72 0.35 0.42 0.22 0.74 0.82 0.33 0.59 0.78 0.24
QuEST 18.44 20.00 20.06 15.04 17.64 18.59 14.20 17.39 18.68 0.45 0.51 0.30 0.84 0.90 0.40 0.68 0.86 0.31
LShriCC 31.53 32.74 30.87 33.05 33.48 32.73 34.46 34.11 34.66 −0.20 −0.11 −0.19 −0.17 −0.03 −0.25 −0.58 −0.21 −0.47
LShri 26.99 36.12 32.16 30.58 43.38 37.90 33.20 48.24 42.25 0.19 0.09 0.01 0.24 0.09 −0.12 −0.14 −0.26 −0.48
BPSEst – – – – – – – – – – – – – – – – – –
FMEst – – – – – – – – – – – – – – – – – –
POET I – – – – – – – – – – – – – – – – – –
POET II 37.67 45.78 44.45 34.15 42.83 42.53 32.73 41.48 41.32 −0.16 −0.20 −0.33 0.08 0.07 −0.27 −0.13 −0.05 −0.38
BN 16.00 17.71 16.86 15.71 18.01 17.53 15.45 18.84 18.04 0.57 0.60 0.40 0.82 0.89 0.45 0.76 0.89 0.39
Sample – – – – – – – – – – – – – – – – – –

CVTL µ/σ 14.51 16.44 16.58 11.71 15.60 14.27 11.13 15.57 16.03 0.60 0.67 0.45 0.98 0.99 0.57 0.84 0.98 0.44
CVTLLS µ/σ 16.91 19.93 20.27 13.77 18.49 16.52 13.52 17.26 18.00 0.52 0.55 0.40 0.91 0.87 0.52 0.75 0.85 0.40
BN VAR 20.69 23.53 22.30 18.81 21.64 20.84 15.30 18.04 17.24 0.24 0.30 0.04 0.37 0.64 0.10 0.42 0.85 0.21
NC2R 19.12 20.58 17.82 18.10 11.23 11.65 6.26 4.79 4.57 0.41 0.33 0.41 0.47 0.70 0.54 0.59 0.72 0.70
CT – – – – – – – – – – – – – – – – – –
1/N 1.37 1.32 1.23 1.37 1.27 1.20 1.34 1.34 1.24 0.80 0.75 0.83 0.86 0.76 0.87 0.75 0.80 0.82
Note: Turnover and risk-adjusted return after costs (0.25% per trade) of all evaluated covariance estimators for
US, EU and WO data with N ∈ {100, 200, 300} and D = 1.33. Each score is given in percent as the average over 50
evaluations with random portfolio constituents. Out-of-sample volatility greater than 50 percent due to estimation
errors is denoted by “–”. The best estimator per problem setting (in terms of turnover, excluding the 1/N portfolio) is
highlighted in bold.

Table EC.3 Out-Of-Sample Annualized Turnover and Risk-Adjusted Return after Cost for Different Datasets.
D Turnover Risk-Adjusted Return After Cost
N 100 200 300 100 200 300
CVTL 9.36 11.27 10.41 8.03 10.84 10.00 7.61 10.65 10.03 0.54 0.78 0.44 0.80 1.07 0.74 0.73 1.06 0.65
CVTLLS 9.90 11.84 11.05 8.56 11.25 10.45 8.12 10.82 10.45 0.53 0.75 0.46 0.78 1.04 0.74 0.70 1.03 0.65
QIS 13.00 13.15 13.93 11.12 12.59 13.51 10.38 12.25 13.45 0.39 0.68 0.33 0.66 0.99 0.57 0.59 0.99 0.53
QuEST 12.09 12.36 13.04 10.73 12.12 13.19 10.14 11.97 13.46 0.41 0.71 0.35 0.68 1.01 0.58 0.60 0.99 0.53
LShriCC 16.52 16.48 16.15 17.00 17.50 17.59 17.68 18.62 18.81 0.06 0.35 0.15 0.10 0.32 0.22 −0.11 0.25 0.11
LShri 16.62 20.55 19.04 17.39 22.25 20.89 17.94 23.54 22.34 0.26 0.42 0.20 0.33 0.49 0.28 0.18 0.39 0.19
BPSEst 24.34 26.30 26.31 22.76 27.13 27.31 22.22 27.48 28.10 0.04 0.20 0.06 0.16 0.28 0.08 0.05 0.21 0.01
FMEst 25.91 28.53 28.68 23.65 28.72 29.13 22.91 28.61 29.58 −0.02 0.14 −0.03 0.11 0.22 0.00 0.01 0.18 −0.04
POET I 18.63 22.04 21.42 20.89 24.59 25.45 24.42 26.63 28.55 0.37 0.49 0.31 0.06 0.85 0.03 0.64 0.56 −0.10
POET II 15.09 18.16 18.11 13.85 17.34 17.41 13.55 17.22 16.93 0.26 0.50 0.22 0.42 0.68 0.46 0.34 0.62 0.42
BN 11.78 11.83 11.43 10.98 12.12 11.88 10.66 12.72 12.18 0.47 0.78 0.41 0.70 1.16 0.72 0.62 1.17 0.61
Sample 28.41 32.42 32.86 25.12 31.25 32.09 24.03 30.51 32.00 −0.12 0.05 −0.16 0.03 0.13 −0.11 −0.04 0.12 −0.11

CVTL µ/σ 7.26 7.84 7.57 6.01 7.33 6.48 5.58 9.07 6.31 0.69 0.89 0.61 1.03 1.26 0.93 0.89 1.29 0.81
CVTLLS µ/σ 6.27 6.46 6.10 5.32 6.28 5.33 4.56 7.76 5.27 0.73 0.87 0.65 1.05 1.26 0.99 0.91 1.18 0.83
BN VAR 15.70 15.93 15.56 11.98 12.70 12.38 8.26 9.74 8.81 0.16 0.65 0.13 0.45 1.09 0.65 0.61 1.42 0.73
NC2R 20.03 20.49 17.02 20.58 16.19 14.04 7.82 5.54 5.26 0.28 0.34 0.36 0.34 0.55 0.68 0.40 0.72 0.61
CT 14.25 16.24 16.45 12.60 15.66 16.07 12.04 15.28 16.03 0.56 0.54 0.52 0.79 0.75 0.73 0.52 0.63 0.54
1/N 1.35 1.29 1.19 1.31 1.26 1.17 1.30 1.28 1.17 0.84 0.72 0.86 1.01 0.95 1.11 0.66 0.76 0.81
Note: Turnover and risk-adjusted return after costs (0.25% per trade) of all evaluated covariance estimators for US,
EU and WO data with N ∈ {100, 200, 300} and D = 0.40. Each score is given in percent as the average over 50
evaluations with random portfolio constituents. Out-of-sample volatility greater than 50 percent due to estimation
errors is denoted by “–”. The best estimator per problem setting (in terms of turnover, excluding the 1/N portfolio) is
highlighted in bold.

Table EC.4 Out-Of-Sample Risk-Adjusted Return after Cost for US Data and Different Sample Dimensions.
N 100 200 300
D = N/T 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13
CVTL 0.46 0.59 0.49 0.54 0.74 0.66 0.63 0.89 0.88 0.80 0.52 0.57 0.93 0.81 0.76 0.73 0.52 1.28
CVTLLS 0.47 0.57 0.47 0.53 0.74 0.66 0.58 0.84 0.83 0.78 0.53 0.56 0.86 0.76 0.70 0.70 0.51 1.23
QIS 0.45 0.51 0.35 0.39 0.65 0.60 0.56 0.81 0.74 0.66 0.46 0.51 0.85 0.67 0.59 0.59 0.44 1.22
QuEST 0.51 0.59 0.45 0.41 0.66 0.60 0.63 0.89 0.84 0.68 0.47 0.51 0.93 0.76 0.68 0.60 0.45 1.22
LShriCC – – – 0.06 0.44 0.46 – – – 0.10 0.17 0.32 – – – – 0.09 0.91
LShri 0.30 0.37 0.19 0.26 0.57 0.56 0.17 0.32 0.24 0.33 0.32 0.44 0.22 – – 0.18 0.28 1.08
BPSEst – – – 0.04 0.50 0.53 – – – 0.16 0.29 0.42 0.28 – – 0.05 0.25 1.05
FMEst – 0.29 0.29 – 0.49 0.52 – – – 0.11 0.28 0.42 0.28 – – 0.01 0.24 1.05
POET I – 0.58 – 0.37 0.71 0.66 – – – 0.06 0.50 0.55 – – – 0.64 0.49 1.23
POET II – – – 0.26 0.56 0.54 – – 0.08 0.42 0.32 0.44 – – – 0.34 0.33 1.08
BN 0.58 0.62 0.57 0.47 0.76 0.66 0.63 0.91 0.82 0.70 0.53 0.57 0.86 0.73 0.76 0.62 0.53 1.33
Sample – 0.29 0.29 – 0.44 0.50 – – – 0.03 0.25 0.41 0.28 – – – 0.22 1.05

CVTL µ/σ 0.56 0.66 0.60 0.69 0.98 0.73 0.73 0.99 0.98 1.03 0.67 0.68 1.06 0.92 0.84 0.89 0.71 1.36
CVTLLS µ/σ 0.52 0.60 0.52 0.73 0.98 0.76 0.63 0.89 0.91 1.05 0.73 0.71 1.08 0.91 0.75 0.91 0.74 1.40
BN VAR 0.29 0.31 0.24 0.16 0.48 0.54 0.23 0.48 0.37 0.45 0.37 0.45 0.27 0.26 0.42 0.61 0.51 1.44
NC2R 0.39 0.32 0.41 0.28 0.27 – 0.38 0.35 0.47 0.34 0.34 – 0.76 0.71 0.59 0.40 0.38 0.65
CT – 0.24 0.25 0.56 0.92 0.69 – – 1.00 0.79 0.62 0.60 0.23 – – 0.52 0.54 0.95
1/N 0.90 0.83 0.80 0.84 1.00 0.65 0.90 0.93 0.86 1.01 0.67 0.57 0.86 0.78 0.75 0.66 0.57 0.72
Note: Empirical results of all evaluated covariance estimators for US data with N ∈ {100, 200, 300} and
D ∈ {2.0, 1.5, 1.33, 0.4, 0.2, 0.13}. Each score is given in percent as the average over 50 evaluations with random portfolio
constituents. The best estimator per problem setting is highlighted in bold.
Table EC.5 Out-Of-Sample Risk-Adjusted Return after Cost for FFI Data and Different Sample Dimensions.
N 10 30 49
D = N/T 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13
CVTL 0.37 0.49 0.34 0.34 0.59 0.73 0.05 0.11 0.15 0.45 0.66 0.68 – 0.01 0.15 0.47 0.44 0.69
CVTLLS 0.39 0.48 0.35 0.32 0.62 0.75 0.10 0.17 0.18 0.46 0.64 0.67 – 0.10 0.17 0.45 0.42 0.67
QIS 0.05 – 0.00 – 0.45 0.59 – – – 0.03 0.34 0.50 – – – – 0.13 0.45
QuEST 0.13 0.06 – 0.07 0.51 0.58 – – – 0.08 0.34 0.51 – – – 0.07 0.16 0.46
LShriCC – – – 0.18 0.50 0.54 – – – 0.09 0.41 0.56 – – – 0.02 0.20 0.51
LShri 0.10 0.21 0.14 0.17 0.56 0.67 – – – 0.01 0.34 0.52 – – – – 0.12 0.47
BPSEst 0.14 0.37 0.52 – 0.39 0.55 – – – – 0.19 0.42 – – – – – 0.35
FMEst – 0.67 0.68 – 0.37 0.54 0.44 – – – 0.17 0.41 – – 0.08 – – 0.34
POET I – 0.10 0.10 0.17 0.56 0.67 – – – – 0.36 0.52 – – – – 0.15 0.49
POET II 0.28 0.34 0.28 0.07 0.57 0.58 – – – – 0.32 0.47 – – – 0.12 0.18 0.41
BN 0.33 0.31 0.17 0.20 0.56 0.62 0.08 0.18 0.15 0.24 0.42 0.52 0.01 0.17 0.15 0.28 0.22 0.55
Sample – – 0.68 – 0.29 0.50 0.44 – – – 0.12 0.40 – – 0.08 – – 0.31

CVTL µ/σ 0.49 0.33 – 0.25 0.59 0.59 0.20 0.39 0.39 0.56 0.77 0.75 0.00 0.31 0.35 0.65 0.55 0.88
CVTLLS µ/σ 0.89 0.45 – 0.38 0.68 0.73 0.15 0.40 0.20 0.55 0.84 0.78 – 0.09 0.16 0.61 0.54 0.88
BN VAR 0.03 0.11 0.13 – 0.29 0.29 – – – 0.09 0.36 0.50 – 0.10 0.09 0.34 0.24 0.35
NC2R – – – – 0.22 0.32 – – – 0.02 – 0.19 – – – 0.20 0.13 0.35
CT – – 0.51 – 0.50 0.67 0.34 – – 0.05 0.45 0.62 – – 0.07 – 0.30 0.63
1/N 0.61 0.67 0.51 0.48 0.56 0.67 0.53 0.60 0.73 0.67 0.58 0.63 0.49 0.65 0.57 0.70 0.50 0.70
Note: Empirical results of all evaluated covariance estimators for FFI data with N ∈ {10, 30, 49} and D ∈
{2.0, 1.5, 1.33, 0.4, 0.2, 0.13}. Each score is given in percent as the average over 50 evaluations with random portfolio
constituents. The best estimator per problem setting is highlighted in bold.

EC.2. Empirical Parameters

This appendix provides additional analyses about the empirically observed parameters.
EC.2.1. Parameters in Analysis on US Data
The mean parameter values on the US dataset for different dimensions are shown in Table EC.6.
Table EC.6 Parameter Values on US Data for Different Sample Dimensions.

CVTL CVTLLS CVTL µ/σ CVTLLS µ/σ
D = N/T δ1 δ2 δ3 γ δ1 δ2 δ1 δ2 δ3 γ δ1 δ2
N = 100
2.00 20.90 26.60 52.40 88.60 61.90 38.10 23.20 43.00 33.80 40.80 57.60 42.40
1.50 30.70 32.90 36.40 90.90 63.00 37.00 24.00 47.50 28.50 59.00 56.40 43.60
1.33 36.50 36.70 26.80 91.90 62.60 37.40 28.60 46.40 25.00 42.10 55.10 44.90
0.40 60.60 27.50 11.90 84.80 75.80 24.20 33.50 45.80 20.70 30.00 45.40 54.60
0.20 68.80 19.50 11.70 81.80 83.50 16.50 33.70 47.00 19.30 11.40 44.20 55.80
0.13 69.40 17.70 12.90 69.00 85.40 14.60 29.30 45.30 25.50 35.80 43.30 56.70
N = 200
2.00 25.10 24.90 50.00 91.80 63.60 36.40 27.30 42.60 30.10 48.80 56.60 43.40
1.50 33.30 29.20 37.50 94.60 65.00 35.00 31.20 46.10 22.70 50.30 57.00 43.00
1.33 38.50 33.00 28.50 96.20 64.10 35.90 30.80 48.60 20.60 16.30 52.00 48.00
0.40 59.60 26.30 14.00 96.10 75.90 24.10 34.90 49.30 15.80 14.00 43.20 56.80
0.20 63.40 17.60 19.00 77.50 84.60 15.40 30.20 44.80 25.10 49.10 41.90 58.10
0.13 66.40 15.30 18.30 74.90 87.60 12.40 34.30 38.50 27.20 18.20 53.30 46.70
N = 300
2.00 30.00 29.30 40.60 91.00 61.50 38.50 27.90 55.90 16.20 19.30 37.90 62.10
1.50 36.10 33.00 30.90 96.00 62.50 37.50 28.90 52.00 19.10 46.30 43.90 56.10
1.33 39.60 33.80 26.60 97.30 63.60 36.40 29.70 46.80 23.50 41.30 54.40 45.60
0.40 51.10 23.60 25.30 98.50 75.20 24.80 30.20 51.90 17.90 40.80 39.40 60.60
0.20 62.90 18.40 18.70 82.70 83.60 16.40 30.50 45.80 23.70 31.00 46.10 53.90
0.13 63.20 12.60 24.20 69.70 88.90 11.10 40.10 45.20 14.70 −4.50 43.50 56.50
Note: Each parameter value is given in percent as the average over 50 evaluations with random portfolio constituents.

EC.2.2. CVTL Parameters over Time
Next, we present how the parameters of our approach change over time after model inception. We
provide the corresponding plots for N ∈ {100, 200, 300} and D ∈ {2, 1.5, 1.33, 0.40, 0.20, 0.13} for
the objectives minimum variance (Figure EC.1, Figure EC.2, and Figure EC.3), and maximum
risk-adjusted return (Figure EC.6, Figure EC.7, Figure EC.8).
EC.2.2.1. Minimum Variance
Figure EC.1 Parameter Values over Time after Model Inception for Minimum Variance and N=100.
[Figure: six panels, D = 2.00, 1.50, 1.33, 0.40, 0.20, 0.13. x-axis: months after inception (0 to 200); y-axis: parameter value (−0.5 to 1); series: δ1, δ2, γ.]
Note: Values are aggregated over 50 evaluations with random portfolio constituents for US data and N = 100.

Figure EC.2 Parameter Values over Time after Model Inception for Minimum Variance and N=200.
[Figure: six panels, D = 2.00, 1.50, 1.33, 0.40, 0.20, 0.13. x-axis: 0 to 200; y-axis: parameter value (−0.5 to 1); series: δ1, δ2, γ.]

Figure EC.3 Parameter Values over Time after Model Inception for Minimum Variance and N=300.
[Figure: six panels, D = 2.00, 1.50, 1.33, 0.40, 0.20, 0.13. x-axis: 0 to 200; y-axis: parameter value (−0.5 to 1); series: δ1, δ2, γ.]

Figure EC.4 Parameter Variation Over Time for Minimum Variance and N=100, D=1.33.
[Figure: three panels. Panels δ1 and δ2: y-axis parameter value (−0.5 to 1); third panel: y-axis parameter value (−1 to 1). x-axis in all panels: months after inception (0 to 50). Legend: δ1, δ2, γ.]
Note: Parameter mean value and standard deviations. Values are aggregated over 50 evaluations with random
portfolio constituents for US data and N = 100.

Figure EC.5 Parameter Variation Over Time for Minimum Variance and N=100, D=0.40.
[Figure: three panels. Panels δ1 and δ2: y-axis parameter value (−0.5 to 1); third panel: y-axis parameter value (−1 to 1). x-axis in all panels: 0 to 50. Legend: δ1, δ2, γ.]

EC.2.2.2. Maximum Risk-Adjusted Return
Figure EC.6 Parameter Values over Time after Model Inception for Maximum Risk-Adjusted Return and N=100.
[Figure: six panels, D = 2.00, 1.50, 1.33, 0.40, 0.20, 0.13. x-axis: 0 to 200; y-axis: parameter value (−0.5 to 1); series: δ1, δ2, γ.]

Figure EC.7 Parameter Values over Time after Model Inception for Maximum Risk-Adjusted Return and N=200.
[Figure: six panels, D = 2.00, 1.50, 1.33, 0.40, 0.20, 0.13. x-axis: 0 to 200; y-axis: parameter value (−0.5 to 1); series: δ1, δ2, γ.]

Figure EC.8 Parameter Values over Time after Model Inception for Maximum Risk-Adjusted Return and N=300.
[Figure: six panels, D = 2.00, 1.50, 1.33, 0.40, 0.20, 0.13. x-axis: 0 to 200; y-axis: parameter value (−0.5 to 1); series: δ1, δ2, γ.]

Figure EC.9 Parameter Variation Over Time for Maximum Risk-Adjusted Return and N=100, D=1.33.
[Figure: three panels. Panels δ1 and δ2: y-axis parameter value (−0.5 to 1); third panel: y-axis parameter value (−1 to 1). x-axis in all panels: 0 to 50. Legend: δ1, δ2, γ.]

Figure EC.10 Parameter Variation Over Time for Maximum Risk-Adjusted Return and N=100, D=0.40.
[Figure: three panels. Panels δ1 and δ2: y-axis parameter value (−0.5 to 1); third panel: y-axis parameter value (−1 to 1). x-axis in all panels: 0 to 50. Legend: δ1, δ2, γ.]

EC.2.3. Shrunk Eigenvalues of CVTL with Objective Maximum Risk-Adjusted Return
In the main paper, we compared the eigenvalues of CVTL against those of QuEST (Ledoit and
Wolf 2015) under the objective minimum variance. The same analysis under the objective maximum
risk-adjusted return is shown in Figure EC.11.
Figure EC.11 Sample and shrunk eigenvalues.
[Figure: six panels, D = 2.00, 1.50, 1.33, 0.40, 0.20, 0.13. Both axes logarithmic, 10^−4 to 10^−2; y-axis: shrunk eigenvalue. Series: CVTL µ/σ, QuEST, Sample.]
data with N = 100.

EC.2.4. Relations between Parameters
We also consider the relations between the parameters δ1, δ2, and γ. For convenience, we define δ3 =
1 − δ1 − δ2 as the intensity of the second shrinkage target. We consider the relations between our
estimation parameters 12 months after model inception. Figures EC.12 and EC.13 show
the relations and histograms for the objective minimum variance based on dimensions D = 1.33
and D = 0.40. Figures EC.14 and EC.15 show the same plots for the objective maximum
risk-adjusted return.
We observe that the parameter pairs (δ1, δ2), (δ1, δ3), and (δ1, γ) exhibit opposite relations in low-
and high-dimensional problems.
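As a small worked illustration of the definition δ3 = 1 − δ1 − δ2, the following sketch applies it to the mean CVTL values reported in Table EC.6 for US data with N = 100 and D = 2.00. The function name is a hypothetical helper introduced only for this example.

```python
def second_target_intensity(delta1: float, delta2: float) -> float:
    """Return delta3 = 1 - delta1 - delta2, the intensity of the second shrinkage target."""
    return 1.0 - delta1 - delta2


# Mean CVTL values from Table EC.6 (US data, N = 100, D = 2.00), converted from percent.
delta1, delta2 = 0.209, 0.266
delta3 = second_target_intensity(delta1, delta2)
print(f"delta3 = {delta3:.3f}")  # prints "delta3 = 0.525"
```

Table EC.6 reports 52.40 percent for δ3 in this setting. The small difference from 52.5 percent is presumably due to rounding of the averaged values.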
EC.2.4.1. Minimum Variance
Figure EC.12 Parameter Relations for High-Dimensional Problem with D = 1.33 under Objective Minimum
Variance.
Note: The plotted eigenvalues are averaged over 50 simulations with random portfolio constituents based on the US
data with N = 100.

Figure EC.13 Parameter Relations for Low-Dimensional Problem with D = 0.40 under Objective Minimum
Variance
data with N = 100.

EC.2.4.2. Maximum Risk-Adjusted Return
Figure EC.14 Parameter Relations for High-Dimensional Problem with D = 1.33 under Objective Maximum
Risk-Adjusted Return.
data with N = 100.

Figure EC.15 Parameter Relations for Low-Dimensional Problem with D = 0.40 under Objective Maximum
Risk-Adjusted Return.
data with N = 100.

EC.3. Performance of Grid Entries in Empirical Analysis for US Data

In this appendix, we provide additional analyses about the benefit of using cross validation and
the proposed second shrinkage target of CVTL. In particular, we illustrate how different grid
entries γ, δ1 , δ2 ∈ P relate to traditional combinations between the sample covariance portfolio and
the 1/N portfolio. In addition, we illustrate how grid entries from CVTL relate to possible linear
shrinkage combinations (EC.3.1). We further extend the analysis about individual benefits of cross
validation and the second shrinkage target from the main paper (see Section 5.4) by relating results
from CVTL to the hypothetical combination of CVTL and QuEST (QuESTCV) (EC.3.2). We
also examine the potential benefit of CVTL (over the sample covariance estimate) and the second
shrinkage target by providing the share of grid entries from P that would lead to an outperformance
versus the respective baseline model (EC.3.3).
EC.3.1. Non-Linear Grid Entries with Outperformance Potential
Figures EC.16 and EC.17 illustrate the historic results of hypothetical grid entries from P for the
US dataset with N = 100 and D ∈ {1.33, 0.40}. Bright gray transparent circles represent the grid
entries of theoretically reachable portfolios, filled gray circles represent possible grid entries in the
class of rotation equivariant estimators resulting in linear shrinkage with λ1 + λ2 = 1. Black crosses
represent linear combinations between the sample and 1/N portfolios. Figure EC.17 zooms into
non-linear grid entries that have the potential to either improve volatility or risk-adjusted return.
Comparing linear shrinkage (filled gray circles) and non-linear shrinkage (bright gray transparent
circles), Figure EC.16 indicates that the hypothetically optimal grid entries converge in terms of
the portfolio that minimizes variance. Along the efficient frontier, however, both types of shrinkage
procedures diverge within the hypothetical grid entries in the sense that our non-linear shrinkage
target has the potential to increase return for the same level of risk. Yet, the non-linear shrinkage
target also leads to hypothetical grid entries that are clearly dominated by others, especially the
linear shrinkage combination. Hence, we attribute greater potential but also greater predictive risk
to CVTL than to CVTLLS. The right side of Figure EC.16 supports the common finding in the literature (e.g.,
Ledoit and Wolf 2004a) that it is not beneficial to rely solely on the sample covariance matrix when
performing mean-variance portfolio optimization, as the sample frontier is outperformed by most
hypothetical grid entries even without a predictive model.
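To make the notion of a linear shrinkage grid entry more concrete, the following minimal sketch evaluates a few shrinkage intensities and computes the corresponding minimum-variance weights. It is not the grid P used in the paper: the scaled-identity target, the function names, the intensity values, and the simulated returns are all assumptions made for illustration.

```python
import numpy as np


def linear_shrinkage(sample_cov: np.ndarray, lam: float) -> np.ndarray:
    """Convex combination of the sample covariance and a scaled identity target."""
    n = sample_cov.shape[0]
    # Assumed target: identity scaled by the average sample variance.
    target = np.trace(sample_cov) / n * np.eye(n)
    return (1.0 - lam) * sample_cov + lam * target


def min_variance_weights(cov: np.ndarray) -> np.ndarray:
    """Fully invested minimum-variance weights, proportional to inv(cov) @ 1."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


# Simulated data (assumption): T = 30 observations of N = 40 assets, so D = N/T > 1.
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.05, size=(30, 40))
sample_cov = np.cov(returns, rowvar=False)

for lam in (0.25, 0.5, 1.0):
    w = min_variance_weights(linear_shrinkage(sample_cov, lam))
    # Distance of the resulting portfolio from the 1/N portfolio.
    print(lam, float(np.abs(w - 1 / 40).max()))
```

With an intensity of one, the shrunk matrix is a multiple of the identity and the minimum-variance weights coincide with the 1/N portfolio. Smaller intensities move the portfolio towards the one implied by the sample covariance matrix, which is singular here because N exceeds T.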
Figure EC.17 zooms into entries located near the historically optimal hypothetical grid entry that
are optimal in minimizing variance or maximizing risk-adjusted return. We observe two opposing
findings. In case of minimum variance, we observe significantly more non-linear versus linear grid
entries that lead to lower variance in the high-dimensional than in the low-dimensional problem.
This supports our finding from Table 11, where we find greater benefit of the non-linear shrinkage
target for high-dimensional problems. In case of maximum risk-adjusted return, we observe the
opposite, that is, significantly more non-linear versus linear hypothetical grid entries that lead to
greater risk-adjusted return in the low-dimensional than in the high-dimensional problems.
Figure EC.16 Performance of Grid Entries.
Note: Values are averaged over 50 simulations with random portfolio constituents based on the US data with N = 100.

Figure EC.17 Performance of Grid Entries (Zoomed In).

EC.3.2. Benefits of Cross Validation and Second Shrinkage Target
Figures EC.18 and EC.19 extend Figures EC.16 and EC.17 by also showing the grid entries of the
hypothetical estimator QuESTCV described in Section 5.4, represented by small dark filled circles.
The traditional QuEST estimator is represented by one large dark filled circle. QuESTCV is obtained
by combining the eigenvalues of QuEST with cross validation to select the shrinkage intensity.
Figure EC.18 indicates that using the QuEST-implied eigenvalues restricts the improvement
potential to minimizing variance. Along the efficient frontier, there is no hypothetical grid entry that
leads to higher returns for the same level of variance compared to the linear shrinkage entries. Although
this can be expected by construction of the QuEST model, it highlights the flexibility of our second
shrinkage target. Figure EC.19 confirms this finding by indicating that neither in the high- nor in the
low-dimensional case does any hypothetical grid entry of QuESTCV lead to a higher risk-adjusted return
than the respective linear shrinkage grid entries. It further supports the recent idea of Ledoit and
Wolf (2021b) that combining QuEST eigenvalues with a linear shrinkage target has the potential
to improve the QuEST estimator. This improvement, however, depends on finding the optimal intensity, as
several QuESTCV grid entries have the potential to outperform QuEST. Finally, Figure EC.19 supports
our second non-linear shrinkage target as it leads to potentially lower variance than any combination
of the QuEST eigenvalues with a linear shrinkage target.

Figure EC.18 Grid Entries of CVTL and QuESTCV.

Figure EC.19 Grid Entries of CVTL and QuESTCV (Zoomed In).

EC.3.3. Possible Benefit Analysis
Table EC.7 further underlines the possible benefit of CVTL over the sample portfolio when
minimizing variance, and over the 1/N portfolio when maximizing risk-adjusted return. In addition,
Table EC.7 shows the possible benefits of the second shrinkage target of CVTL over CVTLLS
(linear shrinkage). We measure the possible benefit as the number of grid entries that
outperform the respective benchmark, multiplied by 100 and divided by the grid size. As
an example, the possible benefit over the sample portfolio for N = 10 and D = 2 is 87, implying
that 87% of the hypothetical CVTL grid entries would outperform the sample portfolio in terms
of minimizing variance. The comparison using CVTL µ/σ measures the possible benefit in terms of
risk-adjusted return. Hence, Table EC.7 is the numerical representation of Figures EC.16 and EC.17.
When comparing CVTL to the sample covariance, the possible benefit decreases as D decreases.
Under the objective maximum risk-adjusted return, the possible benefit of CVTL µ/σ over the
1/N portfolio tends to increase with decreasing D and increasing N. Similar findings can be made when comparing
the volatility achieved by CVTL against CVTLLS: when D becomes small, the possible
benefit is almost zero. This is consistent with our findings in the main paper that the relative advantage
of CVTL over CVTLLS diminishes for small D.
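The possible benefit measure described above can be sketched as follows. The function name, its arguments, and the example scores are hypothetical and serve only to illustrate the counting rule (share of grid entries that outperform the benchmark, in percent).

```python
import numpy as np


def possible_benefit(grid_scores, benchmark_score, lower_is_better=True):
    """Percentage of grid entries that outperform the benchmark."""
    scores = np.asarray(grid_scores, dtype=float)
    if lower_is_better:
        wins = scores < benchmark_score  # e.g. out-of-sample volatility
    else:
        wins = scores > benchmark_score  # e.g. risk-adjusted return
    return 100.0 * wins.sum() / scores.size


# Hypothetical volatilities (in percent) of five grid entries and of a benchmark.
grid_vol = [9.1, 9.4, 10.2, 8.8, 11.0]
print(possible_benefit(grid_vol, benchmark_score=10.0))  # 3 of 5 entries win: 60.0
```

For the risk-adjusted return comparisons, the same function would be called with `lower_is_better=False`.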
Table EC.7 Possible Benefits of CVTL and CVTL µ/σ.
N CVTL CVTL µ/σ
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13
Possible benefit over sample portfolio Possible benefit over 1/N portfolio
10 87.00 87.60 87.50 75.00 39.10 27.40 7.00 11.50 27.70 35.20 49.10 48.10
30 83.00 84.10 84.40 48.00 20.40 7.50 62.30 68.90 69.40 77.30 83.00 75.80
49 77.00 80.40 79.80 41.80 14.80 7.10 71.90 69.80 77.10 81.20 83.90 92.10
100 71.00 75.20 75.80 40.30 17.40 10.70 90.50 90.50 88.40 94.40 96.60 89.90
200 66.90 69.20 70.10 36.80 12.80 6.10 90.50 90.50 90.50 95.20 95.50 94.50
300 67.70 69.80 68.90 37.70 11.30 3.90 91.00 90.50 90.50 92.70 92.40 97.20
Possible benefit over CVTLLS Possible benefit over CVTLLS µ/σ
10 0.90 0.40 0.90 0.00 0.00 0.10 0.20 4.30 6.30 0.70 3.10 10.90
30 2.40 1.70 1.40 0.00 0.20 0.40 3.00 1.30 5.50 9.20 1.10 2.10
49 7.00 5.90 3.50 0.20 0.40 0.00 2.60 2.20 3.00 2.00 1.00 4.70
100 5.30 3.50 1.80 0.30 0.30 0.20 8.00 3.30 7.50 14.50 11.00 8.90
200 2.90 1.80 1.10 0.30 0.20 0.00 3.80 1.40 3.70 1.40 6.30 8.40
300 1.90 1.20 1.30 0.20 0.00 0.00 3.70 1.10 1.80 0.20 1.80 0.50
Note: Possible benefit of CVTL over different benchmarks based on FFI data for N ∈ {10, 30, 49} and US data for
N ∈ {100, 200, 300}. For US data, each value is given in percent as the average over 50 evaluations of random portfolio
constituents.

EC.4. Additional Results on Other Datasets

This appendix presents several additional results and performance metrics on the remaining datasets.
The analyses in the main paper were primarily based on the US dataset and the metrics volatility
and risk-adjusted return. Here, we present the results on the WO and EU datasets. In addition, we
present several other performance metrics, namely, the annualized return (Ret), the maximum
drawdown (maximum relative price depreciation over the time series averaged per annum, MDD),
downside volatility (annualized volatility of negative returns only, DwVol), Calmar ratio (computed
as Ret/MDD, Calm), Sortino ratio (computed as Ret/DwVol, Sort), and portfolio
turnover (sum of actively forced changes in weights averaged per annum, Turn). Altogether, we
present the following analyses:
• US Dataset (EC.4.1)
— Other Performance Metrics (EC.4.1.1)
• WO Dataset (EC.4.2)
— Volatility (EC.4.2.1)
— Risk-Adjusted Return (EC.4.2.2)
— Risk-Adjusted Return After Cost (EC.4.2.3)
• EU Dataset (EC.4.3)
— Volatility (EC.4.3.1)
— Risk-Adjusted Return (EC.4.3.2)
— Risk-Adjusted Return After Cost (EC.4.3.3)
• FFI Industry Datasets (EC.4.4)
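The ratio definitions given at the start of this appendix can be checked against the reported values with a short sketch. The function names are hypothetical; the monthly frequency (12 periods per year) and the use of the sample standard deviation in the downside volatility helper are assumptions, not details confirmed by the text.

```python
import numpy as np


def downside_volatility(returns, periods_per_year=12):
    """Annualized standard deviation of the negative returns only (assumed monthly data)."""
    r = np.asarray(returns, dtype=float)
    negative = r[r < 0]
    return float(np.std(negative, ddof=1) * np.sqrt(periods_per_year))


def calmar_ratio(annual_return, max_drawdown):
    """Calmar ratio as defined above: Ret / MDD."""
    return annual_return / max_drawdown


def sortino_ratio(annual_return, downside_vol):
    """Sortino ratio as defined above: Ret / DwVol."""
    return annual_return / downside_vol


# CVTL row of Table EC.8 (US data, N = 100, D = 1.33), values in percent.
ret, mdd, dwvol = 9.10, 31.35, 6.00
print(f"Calmar  = {100 * calmar_ratio(ret, mdd):.2f} percent")  # 29.03, as in Table EC.8
print(f"Sortino = {sortino_ratio(ret, dwvol):.2f}")             # 1.52, as in Table EC.8
```

Both ratios reproduce the Calm and Sort entries reported for CVTL in the high-dimensional panel of Table EC.8.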

EC.4.1. US Data
EC.4.1.1. Other Performance Metrics
Table EC.8 Other Performance Metrics on US Data for N=100.

High-Dimensional Problem (D=1.33) Low-Dimensional Problem (D=0.40)
Metric Ret MDD DwVol Calm Sort VaR Turn Ret MDD DwVol Calm Sort VaR Turn
CVTL 9.10 31.35 6.00 29.03 1.52 1.23 17.03 7.32 29.88 5.83 24.49 125.56 1.20 9.36
CVTLLS 9.31 31.69 6.10 29.37 1.53 1.25 18.33 7.38 29.98 5.85 24.61 126.08 1.21 9.90
QIS 9.09 31.85 6.08 28.56 1.49 1.25 21.77 6.92 30.27 5.88 22.87 117.80 1.21 13.00
QuEST 9.08 31.64 6.05 28.71 1.50 1.24 18.44 6.94 30.19 5.86 22.98 118.44 1.21 12.09
LShriCC 5.97 34.60 6.67 17.24 0.89 1.38 31.53 4.81 32.66 6.22 14.73 77.33 1.29 16.52
LShri 8.98 32.85 6.28 27.33 1.43 1.29 26.99 6.77 31.34 6.04 21.61 112.21 1.25 16.62
BPSEst 10.61 44.59 8.53 23.79 1.24 1.76 50.00 6.74 33.73 6.48 19.99 104.03 1.34 24.34
FMEst – – – – – – – 6.44 34.06 6.51 18.92 98.96 1.35 25.91
POET I – – – – – – – 8.58 32.91 6.37 26.06 134.53 1.32 18.63
POET II 8.13 33.09 6.29 24.57 1.29 1.30 37.67 6.29 30.81 5.94 20.41 105.96 1.23 15.09
BN 10.16 34.49 6.65 29.46 1.53 1.37 16.00 7.38 30.34 5.90 24.32 125.02 1.22 11.78
Sample – – – – – – – 5.99 34.96 6.64 17.14 90.28 1.38 28.41

CVTL µ/σ 9.71 32.48 6.28 29.89 1.55 1.29 14.51 8.53 31.55 6.21 27.03 137.27 1.28 7.26
CVTLLS µ/σ 9.57 32.69 6.31 29.26 1.52 1.30 16.91 8.85 32.44 6.40 27.28 138.40 1.31 6.27
BN VAR 8.12 37.87 7.33 21.45 1.11 1.51 20.69 5.70 34.72 6.72 16.41 84.74 1.39 15.70
NC2R 10.44 43.57 8.35 23.96 1.25 1.73 19.12 9.01 42.17 8.43 21.38 106.88 1.74 20.03
CT – – – – – – – 10.32 37.73 7.42 27.34 139.13 1.53 14.25
1/N 14.23 54.70 10.63 26.01 1.34 2.21 1.37 14.82 55.16 10.80 26.86 137.19 2.24 1.35
Note: Each value is given in percent as the average over 50 evaluations of N = 100 portfolio constituents with
D ∈ {1.33, 0.40}. The covariance estimator with the best score per metric is highlighted in bold. Underlined values
indicate significant difference from CVTL with p < 0.05.


D High-Dimensional Problem (D=1.33) Low-Dimensional Problem (D=0.40)
Metric Ret MDD DwVol Calm. Sort. VaR Turn. Ret MDD DwVol Calm. Sort. VaR Turn.
CVTL 10.41 25.05 4.75 41.56 2.19 0.97 14.08 8.16 25.21 4.71 32.35 173.12 0.96 8.03
CVTLLS 10.47 25.42 4.81 41.17 2.17 0.98 15.40 8.19 25.33 4.73 32.34 173.21 0.96 8.56
QIS 10.36 25.37 4.81 40.84 2.16 0.98 17.63 7.96 25.50 4.75 31.23 167.51 0.97 11.12
QuEST 10.41 25.25 4.79 41.23 2.17 0.97 15.04 7.97 25.47 4.75 31.28 167.81 0.97 10.73
LShriCC 6.96 29.64 5.53 23.49 1.26 1.14 33.05 5.24 28.18 5.16 18.60 101.47 1.06 17.00
LShri 10.05 27.29 5.12 36.84 1.96 1.05 30.58 7.21 27.07 4.99 26.64 144.53 1.03 17.39
BPSEst 10.96 38.61 7.22 28.40 1.52 1.49 50.00 7.26 28.72 5.30 25.27 136.97 1.09 22.76
FMEst – – – – – – – 7.05 28.85 5.32 24.43 132.63 1.10 23.65
POET I – – – – – – – 6.16 35.18 8.21 17.51 75.02 1.69 20.89
POET II 9.56 26.37 4.96 36.25 1.93 1.01 34.15 6.87 26.06 4.80 26.37 143.13 0.98 13.85
BN 11.13 27.76 5.34 40.10 2.09 1.08 15.71 8.28 25.70 4.78 32.21 173.37 0.98 10.98
Sample – – – – – – – 6.74 29.27 5.38 23.02 125.19 1.11 25.12

CVTL µ/σ 10.85 26.21 4.97 41.39 2.18 1.01 11.71 9.85 26.56 5.05 37.07 194.82 1.02 6.01
CVTLLS µ/σ 10.95 26.41 5.01 41.46 2.19 1.01 13.77 9.95 27.06 5.17 36.77 192.62 1.04 5.32
BN VAR 8.43 32.05 6.31 26.32 1.34 1.28 18.81 7.04 29.20 5.54 24.11 127.02 1.14 11.98
NC2R 10.83 41.35 8.12 26.20 1.33 1.66 18.10 9.96 42.74 8.20 23.31 121.44 1.68 20.58
CT – – – – – – – 11.89 34.65 6.68 34.31 177.85 1.36 12.60
1/N 14.81 53.67 10.23 27.59 1.45 2.10 1.37 17.28 52.93 10.37 32.65 166.68 2.13 1.31
D ∈ (1.33, 0.40). The covariance estimator with best score per metric is highlighted in bold. Underlined values indicate
significant difference from CVTL with p < 0.05.

CVTL 8.60 23.37 4.52 36.81 1.90 0.92 12.68 7.13 23.68 4.42 30.10 161.42 0.90 7.61
CVTLLS 8.61 23.67 4.58 36.39 1.88 0.94 13.95 7.10 23.86 4.45 29.78 159.69 0.91 8.12
QIS 8.40 23.66 4.55 35.50 1.85 0.93 16.33 6.96 24.02 4.44 28.98 156.82 0.91 10.38
QuEST 8.48 23.54 4.54 36.01 1.87 0.93 14.20 6.93 24.00 4.43 28.86 156.27 0.91 10.14
LShriCC 3.66 29.58 5.35 12.37 0.68 1.12 34.46 3.63 27.55 4.88 13.16 74.26 1.01 17.68
LShri 7.53 26.43 4.91 28.47 1.53 1.01 33.20 6.06 26.05 4.73 23.25 128.05 0.98 17.94
BPSEst 8.49 36.23 6.72 23.43 1.26 1.39 50.00 6.13 27.41 4.99 22.36 122.75 1.03 22.22
FMEst – – – – – – – 6.02 27.51 4.99 21.88 120.55 1.03 22.91
POET I – – – – – – – 14.67 37.29 8.69 39.33 168.71 1.81 24.42
POET II 7.48 24.51 4.65 30.52 1.61 0.96 32.73 6.00 24.39 4.44 24.58 134.94 0.92 13.55
BN 9.95 25.13 5.03 39.60 1.98 1.03 15.45 7.29 24.04 4.45 30.33 163.94 0.91 10.66
Sample – – – – – – – 5.86 27.81 5.02 21.06 116.55 1.04 24.03

CVTL µ/σ 9.00 24.16 4.75 37.26 1.89 0.97 11.13 8.15 24.91 4.76 32.73 171.36 0.97 5.58
CVTLLS µ/σ 9.02 24.42 4.76 36.92 1.89 0.97 13.52 8.29 25.85 4.96 32.06 167.04 1.01 4.56
BN VAR 7.82 30.75 6.21 25.44 1.26 1.26 15.30 7.03 26.49 5.07 26.54 138.66 1.04 8.26
NC2R 10.06 45.79 9.05 21.98 1.11 1.86 6.26 7.94 46.92 9.30 16.91 85.35 1.91 7.82
CT – – – – – – – 8.74 34.54 6.70 25.29 130.35 1.38 12.04
1/N 13.02 53.99 10.48 24.11 1.24 2.17 1.34 11.69 53.82 10.58 21.73 110.56 2.19 1.30

EC.4.2. WO Data
EC.4.2.1. Volatility
Table EC.11 Out-Of-Sample Annualized Volatility for WO Data and Different Dimensions.
N 100 200 300
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13

CVTL 9.28 9.20 9.21 8.83 8.75 9.06 8.17 8.04 7.97 7.88 8.59 8.99 7.59 7.61 7.45 7.72 8.44 8.16
CVTLLS 9.30 9.22 9.24 8.83 8.73 9.02 8.14 8.03 7.98 7.86 8.54 8.92 7.57 7.58 7.43 7.65 8.35 8.05
QIS 9.28 9.25 9.30 8.95 8.85 9.12 8.16 8.07 8.06 7.99 8.65 8.97 7.62 7.65 7.49 7.74 8.40 8.11
QuEST 9.23 9.19 9.23 8.94 8.85 9.12 8.13 8.04 8.03 7.99 8.65 8.97 7.60 7.63 7.46 7.74 8.40 8.11
LShriCC 9.84 9.82 9.85 9.32 9.04 9.24 8.87 8.78 8.76 8.42 8.86 9.15 8.33 8.39 8.29 8.20 8.67 8.26
LShri 9.57 9.61 9.68 9.30 9.02 9.22 8.62 8.69 8.74 8.47 8.83 9.11 8.16 8.38 8.32 8.25 8.64 8.23
BPSEst 11.52 11.98 12.64 9.94 9.20 9.31 10.23 10.72 11.41 8.96 8.95 9.17 9.54 10.16 10.79 8.67 8.72 8.26
FMEst – – – 10.06 9.23 9.32 – – – 9.05 8.97 9.17 – – – 8.74 8.73 8.26
POET I – – – 10.34 9.04 9.23 – – – 14.95 9.15 9.11 – – – 26.34 9.21 8.25
POET II 11.55 10.08 9.93 9.19 9.06 9.31 9.50 8.63 8.50 8.15 8.79 9.17 8.53 8.08 7.84 7.87 8.56 8.30
BN 9.79 9.62 9.64 9.01 8.82 9.10 8.61 8.44 8.38 8.00 8.64 8.98 8.06 8.00 7.83 7.77 8.43 8.15
Sample – – – 10.45 9.35 9.39 – – – 9.31 9.06 9.22 – – – 8.93 8.79 8.29

CVTL µ/σ 9.54 9.39 9.43 9.22 9.17 9.65 8.25 8.14 8.14 8.10 9.17 9.69 7.72 7.77 7.66 8.05 9.25 8.44
CVTLLS µ/σ 10.21 9.77 9.69 9.36 9.20 9.59 8.95 8.17 8.18 8.20 9.01 9.20 7.76 7.87 7.66 8.06 8.68 8.41
BN VAR 10.39 10.32 10.39 10.19 10.00 10.17 9.52 9.33 9.34 9.05 9.67 9.96 9.22 9.24 9.16 8.73 9.44 8.91
NC2R 11.89 11.89 11.99 11.80 11.64 12.23 11.64 11.75 11.52 11.57 12.44 12.84 12.18 12.48 12.81 12.99 13.86 12.02
CT – – – 10.67 10.24 10.58 – – – 9.85 10.48 10.88 – – – 9.93 10.59 9.56
1/N 14.81 14.83 14.97 14.90 14.34 14.71 14.51 14.52 14.52 14.19 15.13 15.70 14.50 14.59 14.66 14.55 15.68 13.43
Note: Results of all evaluated covariance estimators for WO data with N ∈ {100, 200, 300} and D ∈
{2, 1.50, 1.33, 0.40, 0.20, 0.13}. The best estimator per problem setting is highlighted in bold. Underlined values indicate significant difference from CVTL with
p < 0.05.

EC.4.2.2. Risk-Adjusted Return
Table EC.12 Out-Of-Sample Risk-Adjusted Return for WO Data and Different Dimensions.
N 100 200 300
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13

CVTL 0.69 0.98 0.88 0.75 1.03 0.89 0.88 1.14 1.03 1.08 0.81 0.77 1.10 1.05 0.99 0.99 0.85 1.27
CVTLLS 0.76 1.01 0.92 0.79 1.04 0.90 0.96 1.19 1.07 1.10 0.83 0.76 1.14 1.07 1.02 1.01 0.85 1.32
QIS 0.71 0.99 0.88 0.74 0.98 0.86 0.91 1.14 1.02 1.02 0.79 0.76 1.07 1.01 0.96 0.99 0.86 1.32
QuEST 0.72 0.99 0.88 0.73 0.98 0.87 0.90 1.15 1.02 1.02 0.79 0.76 1.08 1.02 0.97 0.99 0.86 1.32
LShriCC 0.53 0.70 0.61 0.60 0.87 0.78 0.44 0.75 0.71 0.76 0.69 0.68 0.72 0.67 0.58 0.71 0.72 1.21
LShri 0.71 0.97 0.87 0.74 0.97 0.87 0.92 1.12 1.01 0.92 0.78 0.76 1.08 0.91 0.81 0.90 0.84 1.29
BPSEst 0.71 0.84 0.82 0.74 0.96 0.86 0.81 0.97 0.80 0.87 0.77 0.75 0.93 0.87 0.73 0.85 0.82 1.29
FMEst – – – 0.71 0.95 0.86 – – – 0.83 0.77 0.75 – – – 0.83 0.82 1.28
POET I – – – 0.86 1.06 0.93 – – – 0.48 0.85 0.82 – – – 0.18 1.03 1.42
POET II 0.57 0.91 0.81 0.74 0.97 0.86 0.81 1.10 1.02 1.02 0.79 0.80 1.02 1.00 0.96 0.99 0.89 1.29
BN 0.77 0.92 0.86 0.75 1.04 0.90 0.85 1.09 1.00 1.12 0.85 0.80 1.03 1.03 0.99 1.03 0.89 1.34
Sample – – – 0.65 0.91 0.85 – – – 0.77 0.75 0.75 – – – 0.81 0.82 1.27

CVTL µ/σ 0.79 0.99 0.92 0.83 1.15 0.92 0.91 1.13 1.03 1.15 0.83 0.75 1.09 1.05 0.99 1.02 0.80 1.27
CVTLLS µ/σ 0.55 0.91 0.96 0.83 1.13 0.91 0.67 1.18 1.06 1.17 0.84 0.75 1.09 1.05 1.02 1.01 0.79 1.23
BN VAR 0.54 0.68 0.60 0.52 0.94 0.91 0.55 0.82 0.68 1.01 0.80 0.70 0.58 0.66 0.70 1.00 0.70 1.26
NC2R 0.69 0.82 0.81 0.74 0.91 0.66 0.72 0.83 0.82 1.01 0.95 0.54 0.85 0.83 0.80 0.72 0.59 1.07
CT – – – 0.93 1.19 0.95 – – – 1.17 0.91 0.76 – – – 0.97 0.79 1.24
1/N 0.94 0.85 0.86 0.88 1.12 0.83 0.94 0.96 0.89 1.13 0.82 0.62 0.89 0.81 0.85 0.84 0.61 0.98
Note: Results of all evaluated covariance estimators for WO data with N ∈ {100, 200, 300} and D ∈
{2, 1.50, 1.33, 0.40, 0.20, 0.13}. The best estimator per problem setting is highlighted in bold. Underlined values indicate significant difference from CVTL µ/σ
with p < 0.05.

EC.4.2.3. Risk-Adjusted Return After Cost
Table EC.13 Out-Of-Sample Risk-Adjusted Return after Cost for WO Data and Different Dimensions.
N 100 200 300
D = N/T 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13
CVTL 0.01 0.37 0.32 0.44 0.80 0.72 0.18 0.49 0.45 0.74 0.58 0.60 0.38 0.44 0.40 0.65 0.63 1.09
CVTLLS 0.05 0.36 0.32 0.46 0.81 0.72 0.19 0.49 0.45 0.74 0.59 0.58 0.38 0.42 0.39 0.65 0.61 1.12
QIS 0.02 0.31 0.22 0.33 0.70 0.65 0.17 0.43 0.33 0.57 0.50 0.54 0.31 0.30 0.24 0.53 0.56 1.08
QuEST 0.09 0.39 0.30 0.35 0.71 0.66 0.23 0.50 0.40 0.58 0.50 0.54 0.38 0.37 0.31 0.53 0.56 1.08
LShriCC – – – 0.15 0.58 0.57 – – – 0.22 0.36 0.44 – – – 0.11 0.38 0.94
LShri – 0.08 0.01 0.20 0.65 0.63 – – – 0.28 0.42 0.51 – – – 0.19 0.46 1.01
BPSEst – – – 0.06 0.59 0.61 – – – 0.08 0.38 0.49 – – – 0.01 0.42 0.99
FMEst – – – – 0.57 0.60 – – – – 0.37 0.49 – – – – 0.42 0.99
POET I – – – 0.31 0.73 0.70 – – – 0.03 0.49 0.58 – – – – 0.67 1.14
POET II – – – 0.22 0.64 0.62 – – – 0.46 0.46 0.56 – – – 0.42 0.56 1.02
BN 0.29 0.43 0.40 0.41 0.78 0.69 0.28 0.52 0.45 0.72 0.57 0.58 0.40 0.42 0.39 0.61 0.60 1.09
Sample – – – – 0.51 0.58 – – – – 0.34 0.48 – – – – 0.40 0.97

CVTL µ/σ 0.25 0.51 0.45 0.61 1.00 0.80 0.32 0.63 0.57 0.93 0.68 0.59 0.56 0.60 0.44 0.81 0.62 1.10
CVTLLS µ/σ – 0.36 0.40 0.65 1.02 0.83 0.05 0.59 0.52 0.99 0.72 0.64 0.65 0.65 0.40 0.83 0.65 1.11
BN VAR – 0.10 0.04 0.13 0.63 0.67 – 0.22 0.10 0.65 0.57 0.52 0.03 0.15 0.21 0.73 0.53 1.12
NC2R 0.27 0.41 0.41 0.36 0.50 0.25 0.43 0.54 0.54 0.68 0.64 0.20 0.75 0.73 0.70 0.61 0.47 0.92
CT – – – 0.52 1.00 0.83 – – – 0.73 0.72 0.64 – – – 0.54 0.62 1.10
1/N 0.92 0.83 0.83 0.86 1.09 0.81 0.92 0.93 0.87 1.11 0.80 0.60 0.87 0.79 0.82 0.81 0.59 0.95
Note: Empirical results of all evaluated covariance estimators for WO data with N ∈ {100, 200, 300} and
D ∈ {2.0, 1.5, 1.33, 0.4, 0.2, 0.13}. Each value is given in percent and is based on the average over 50 evaluations
of random investment universes. “–” is set for annualized out-of-sample volatility > 50% and negative out-of-sample
risk-adjusted returns. The best estimator per setting is highlighted in bold.
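The suppression rule stated in the note above (a dash for volatility above 50 percent or a negative score) can be sketched together with a simple proportional cost adjustment. The cost model used here, which deducts 0.25% of annual turnover from the annualized return, is an assumption for illustration and is not confirmed by the text. The function name and the inputs are likewise hypothetical.

```python
def after_cost_score(ann_return, ann_vol, ann_turnover, cost_per_trade=0.0025):
    """Risk-adjusted return after proportional costs, or a dash if the score is suppressed.

    ann_return and ann_vol are decimals (0.09 means 9 percent). ann_turnover is the
    sum of absolute weight changes per year (1.0 means 100 percent of the portfolio).
    """
    # Assumed cost model: costs are proportional to annual turnover.
    net_return = ann_return - cost_per_trade * ann_turnover
    rar = net_return / ann_vol
    # Suppression rule from the table note: volatility above 50 percent or negative score.
    if ann_vol > 0.50 or rar < 0:
        return "–"
    return round(rar, 2)


print(after_cost_score(0.10, 0.10, 8.0))   # hypothetical inputs
print(after_cost_score(0.10, 0.60, 8.0))   # volatility above 50 percent gives a dash
print(after_cost_score(0.01, 0.10, 17.0))  # costs exceed the return, which gives a dash
```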

Table EC.14 Other Performance Metrics on WO Data for N=100.

CVTL 8.12 33.40 6.08 24.31 133.54 1.26 19.70 6.65 31.56 5.80 21.06 114.62 1.21 10.41
CVTLLS 8.52 33.34 6.10 25.56 139.65 1.26 20.96 6.93 31.48 5.78 22.03 119.86 1.20 11.05
QIS 8.17 33.73 6.14 24.21 133.12 1.27 23.36 6.60 31.96 5.85 20.64 112.66 1.22 13.93
QuEST 8.10 33.54 6.10 24.15 132.75 1.26 20.06 6.56 31.96 5.85 20.51 111.99 1.22 13.04
LShriCC 6.01 35.66 6.50 16.84 92.37 1.35 30.87 5.55 33.11 6.12 16.76 90.67 1.27 16.15
LShri 8.45 34.77 6.35 24.31 133.13 1.32 32.16 6.86 32.90 6.04 20.86 113.63 1.26 19.04
BPSEst 10.31 43.51 8.05 23.70 128.10 1.67 50.00 7.39 34.82 6.42 21.24 115.20 1.34 26.31
FMEst – – – – – – – 7.15 35.27 6.49 20.28 110.21 1.35 28.68
POET I – – – – – – – 8.92 34.80 6.61 25.63 134.98 1.37 21.42
POET II 8.08 35.52 6.47 22.75 124.88 1.34 44.45 6.79 32.56 5.98 20.86 113.52 1.25 18.11
BN 8.29 34.85 6.36 23.78 130.39 1.31 16.86 6.72 32.32 5.92 20.79 113.55 1.23 11.43
Sample – – – – – – – 6.79 36.63 6.72 18.53 100.93 1.40 32.86

CVTL µ/σ 8.64 33.88 6.22 25.50 138.89 1.29 16.58 7.65 32.78 6.05 23.34 126.54 1.25 7.57
CVTLLS µ/σ 9.31 34.34 6.37 27.11 146.25 1.31 20.27 7.75 33.25 6.14 23.30 126.16 1.27 6.10
BN VAR 6.22 37.79 6.98 16.46 89.11 1.44 22.30 5.34 36.98 6.85 14.45 77.96 1.42 15.56
NC2R 9.75 42.30 7.95 23.05 122.64 1.63 17.82 8.76 40.68 7.78 21.54 112.57 1.60 17.02
CT – – – – – – – 9.94 36.94 6.87 26.90 144.59 1.42 16.45
1/N 12.82 50.53 9.57 25.37 133.88 1.96 1.23 13.18 50.22 9.57 26.24 137.75 1.96 1.19
D ∈ (1.33, 0.40). The covariance estimator with best score per KPI is marked bold. Underlined values indicate


CVTL 8.23 28.42 5.12 28.95 160.61 1.05 17.55 8.49 28.11 5.07 30.19 167.52 1.05 10.00
CVTLLS 8.55 28.37 5.13 30.14 166.65 1.05 18.85 8.64 28.02 5.03 30.82 171.71 1.04 10.45
QIS 8.20 28.72 5.16 28.55 158.85 1.06 21.04 8.16 28.45 5.11 28.68 159.60 1.05 13.51
QuEST 8.16 28.64 5.15 28.50 158.55 1.05 18.59 8.18 28.43 5.11 28.77 159.98 1.05 13.19
LShriCC 6.18 30.91 5.53 20.00 111.72 1.15 32.73 6.40 29.91 5.38 21.41 119.09 1.11 17.59
LShri 8.80 30.44 5.51 28.91 159.60 1.13 37.90 7.82 29.77 5.36 26.27 146.03 1.11 20.89
BPSEst 9.13 39.71 7.11 22.98 128.30 1.47 50.00 7.79 31.45 5.63 24.76 138.35 1.16 27.31
FMEst – – – – – – – 7.55 31.75 5.68 23.78 132.86 1.18 29.13
POET I – – – – – – – 7.12 37.20 9.85 19.14 72.30 2.03 25.45
POET II 8.63 29.69 5.35 29.07 161.34 1.10 42.53 8.34 28.62 5.18 29.14 161.08 1.07 17.41
BN 8.41 29.91 5.40 28.13 155.95 1.11 17.53 8.95 28.46 5.14 31.47 174.11 1.06 11.88
Sample – – – – – – – 7.21 32.62 5.85 22.11 123.15 1.21 32.09

CVTL µ/σ 8.42 29.13 5.26 28.91 160.09 1.08 14.27 9.29 28.84 5.23 32.22 177.52 1.08 6.48
CVTLLS µ/σ 8.69 29.21 5.26 29.75 165.13 1.08 16.52 9.60 29.28 5.29 32.79 181.51 1.09 5.33
BN VAR 6.31 33.70 6.23 18.74 101.31 1.27 20.84 9.18 31.95 5.95 28.73 154.38 1.21 12.38
NC2R 9.41 40.06 7.47 23.49 125.98 1.53 11.65 11.70 40.27 7.46 29.04 156.88 1.52 14.04
CT – – – – – – – 11.55 34.08 6.27 33.90 184.27 1.28 16.07
1/N 12.96 50.22 9.21 25.80 140.63 1.87 1.20 16.07 48.28 9.11 33.30 176.43 1.85 1.17


CVTL 7.38 26.98 4.96 27.34 148.70 1.03 16.62 7.65 27.67 4.95 27.66 154.65 1.02 10.03
CVTLLS 7.56 26.90 4.94 28.11 153.10 1.03 17.80 7.72 27.37 4.86 28.21 158.71 1.00 10.45
QIS 7.18 27.10 4.95 26.50 145.02 1.03 20.72 7.63 27.56 4.93 27.67 154.64 1.02 13.45
QuEST 7.22 27.03 4.94 26.71 146.17 1.02 18.68 7.64 27.58 4.94 27.71 154.82 1.02 13.46
LShriCC 4.81 29.58 5.44 16.26 88.49 1.14 34.66 5.79 29.05 5.19 19.95 111.55 1.07 18.81
LShri 6.72 29.54 5.41 22.75 124.15 1.13 42.25 7.41 28.95 5.19 25.60 142.82 1.07 22.34
BPSEst 7.82 37.64 6.89 20.78 113.50 1.44 50.00 7.33 30.22 5.41 24.26 135.43 1.12 28.10
FMEst – – – – – – – 7.29 30.40 5.45 23.97 133.57 1.13 29.58
POET I – – – – – – – 4.67 38.95 18.94 12.00 24.67 3.70 28.55
POET II 7.56 27.56 5.11 27.44 147.87 1.06 41.32 7.77 27.61 5.01 28.13 155.02 1.03 16.93
BN 7.77 28.43 5.19 27.32 149.65 1.07 18.04 8.01 27.86 4.99 28.76 160.46 1.02 12.18
Sample – – – – – – – 7.25 31.01 5.58 23.37 129.93 1.15 32.00

CVTL µ/σ 7.58 27.85 5.08 27.23 149.22 1.05 16.03 8.18 28.83 5.15 28.38 158.95 1.05 6.31
CVTLLS µ/σ 7.80 27.74 5.09 28.10 153.14 1.05 18.00 8.12 28.90 5.17 28.09 157.12 1.06 5.27
BN VAR 6.38 34.18 6.26 18.67 102.00 1.28 17.24 8.76 31.32 5.68 27.98 154.34 1.16 8.81
NC2R 10.20 44.18 8.36 23.09 121.93 1.71 4.57 9.38 44.78 8.57 20.94 109.36 1.75 5.26
CT – – – – – – – 9.67 34.33 6.37 28.18 151.77 1.29 16.03
1/N 12.43 50.38 9.37 24.67 132.61 1.92 1.24 12.15 49.17 9.39 24.72 129.41 1.92 1.17

EC.4.3. EU Data
EC.4.3.1. Volatility
Table EC.17 Out-Of-Sample Annualized Volatility for EU Data and Different Dimensions.
N 100 200 300
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13

CVTL 9.87 9.71 9.72 9.38 9.51 9.85 8.54 8.53 8.42 8.52 9.19 9.61 7.93 7.95 7.98 8.26 8.95 8.87
CVTLLS 9.97 9.80 9.80 9.41 9.52 9.84 8.60 8.58 8.46 8.52 9.15 9.54 7.94 7.95 7.98 8.23 8.90 8.77
QIS 9.90 9.77 9.84 9.46 9.57 9.89 8.54 8.55 8.46 8.56 9.17 9.53 7.91 7.94 7.99 8.28 8.88 8.68
QuEST 9.85 9.72 9.77 9.46 9.57 9.89 8.51 8.52 8.43 8.57 9.18 9.53 7.89 7.91 7.97 8.28 8.88 8.68
LShriCC 10.62 10.50 10.59 9.94 9.85 10.05 9.61 9.56 9.51 9.12 9.45 9.70 8.99 8.88 8.89 8.79 9.13 8.77
LShri 10.36 10.32 10.44 10.02 9.80 10.02 9.21 9.43 9.43 9.19 9.41 9.67 8.58 8.96 9.17 8.91 9.11 8.75
BPSEst 12.50 12.94 13.50 10.60 9.95 10.09 10.90 11.54 12.26 9.61 9.49 9.69 10.09 10.72 11.62 9.22 9.16 8.77
FMEst – – – 10.73 9.97 10.10 – – – 9.69 9.50 9.70 – – – 9.29 9.17 8.77
POET I – – – 10.38 9.85 10.04 – – – 10.33 9.51 9.69 – – – 10.87 9.25 8.83
POET II 14.98 10.66 10.52 9.75 9.84 10.14 10.57 9.18 8.97 8.79 9.34 9.69 9.10 8.40 8.38 8.45 8.97 8.74
BN 10.46 10.24 10.17 9.53 9.59 9.90 9.04 8.94 8.81 8.65 9.23 9.56 8.40 8.30 8.26 8.43 8.97 8.79
Sample – – – 11.11 10.09 10.16 – – – 9.93 9.58 9.73 – – – 9.47 9.21 8.78

CVTL µ/σ 10.08 9.97 9.95 9.75 9.93 10.55 8.66 8.72 8.61 8.93 9.75 10.49 8.16 8.10 8.17 8.56 9.82 9.75
CVTLLS µ/σ 10.27 10.04 10.06 9.92 10.04 10.53 8.94 8.76 8.65 8.91 9.70 9.91 8.22 8.19 8.20 8.53 9.17 9.63
BN VAR 11.11 11.04 11.14 10.93 10.98 11.24 10.00 10.00 9.95 9.92 10.53 10.94 9.77 9.86 9.86 9.76 10.45 10.13
NC2R 12.72 12.63 12.53 12.50 12.48 13.05 12.36 12.82 12.47 12.63 13.37 13.94 13.20 13.41 13.88 14.00 15.15 13.60
CT – – – 11.52 11.20 11.58 – – – 10.75 11.40 11.95 – – – 10.73 11.58 10.83
1/N 16.38 16.42 16.40 16.41 15.89 16.35 16.17 16.15 16.13 15.85 16.90 17.64 16.06 16.16 16.29 16.23 17.55 15.50
Note: Results of all evaluated covariance estimators for EU data with N ∈ {100, 200, 300} and D ∈
estimator per problem setting is highlighted in bold. Underlined values indicate significant difference from CVTL with
p < 0.05.
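
Since the column header defines D = N/T, each (N, D) pair fixes the length T of the estimation window, and D > 1 corresponds to the high-dimensional case in which there are more assets than historic datapoints. A minimal sketch of this arithmetic follows; the function names are hypothetical, and because the reported D values appear to be rounded, the implied T should be read as approximate.

```python
def implied_window(n_assets: int, d: float) -> float:
    """Return the estimation window length T implied by D = N / T."""
    if d <= 0:
        raise ValueError("D must be positive")
    return n_assets / d


def is_high_dimensional(d: float) -> bool:
    """D > 1 means more assets than observations (N > T)."""
    return d > 1.0


if __name__ == "__main__":
    # Grid of settings as listed in the table header.
    for n in (100, 200, 300):
        for d in (2.0, 1.50, 1.33, 0.40, 0.20, 0.13):
            regime = "high-dimensional" if is_high_dimensional(d) else "low-dimensional"
            print(f"N={n:3d}  D={d:4.2f}  T≈{implied_window(n, d):7.1f}  ({regime})")
```

For example, N = 100 with D = 2 implies T = 50 observations, so the sample covariance matrix cannot have full rank; this matches the missing entries for the sample estimator in the D > 1 columns.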

EC.4.3.2. Risk-Adjusted Return
Table EC.18 Out-Of-Sample Risk-Adjusted Return for EU Data and Different Dimensions.
N 100 200 300
D = N/T 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13 2 1.50 1.33 0.40 0.20 0.13

CVTL 1.06 1.18 1.12 1.10 1.28 1.17 1.31 1.49 1.52 1.43 1.23 1.17 1.64 1.62 1.48 1.42 1.30 1.86
CVTLLS 1.07 1.16 1.11 1.09 1.27 1.15 1.32 1.46 1.49 1.40 1.23 1.14 1.59 1.58 1.46 1.39 1.26 1.82
QIS 1.03 1.12 1.06 1.05 1.24 1.14 1.27 1.44 1.46 1.40 1.24 1.17 1.57 1.58 1.44 1.39 1.30 1.85
QuEST 1.03 1.13 1.06 1.06 1.25 1.15 1.28 1.45 1.48 1.40 1.24 1.18 1.59 1.60 1.45 1.39 1.31 1.85
LShriCC 0.70 0.78 0.69 0.79 0.92 0.92 0.66 0.84 0.88 0.82 0.92 0.94 0.87 0.97 0.78 0.81 0.95 1.59
LShri 0.98 1.08 1.00 0.97 1.15 1.08 1.20 1.29 1.32 1.14 1.12 1.10 1.41 1.31 1.11 1.09 1.16 1.78
BPSEst 0.88 0.86 0.82 0.85 1.10 1.04 1.06 1.07 0.92 1.03 1.09 1.08 1.23 1.04 0.78 0.99 1.12 1.76
FMEst – – – 0.84 1.10 1.04 – – – 1.00 1.09 1.08 – – – 0.99 1.12 1.77
POET I – – – 1.06 1.32 1.21 – – – 1.52 1.39 1.30 – – – 1.24 1.50 –
POET II 0.32 1.02 0.93 1.00 1.14 1.06 0.96 1.29 1.34 1.21 1.11 1.13 1.26 1.42 1.25 1.17 1.31 1.70
BN 1.01 1.09 1.08 1.12 1.34 1.20 1.26 1.44 1.45 1.55 1.31 1.22 1.52 1.56 1.51 1.59 1.44 1.89
Sample – – – 0.81 1.07 1.04 – – – 0.96 1.08 1.09 – – – 0.97 1.13 1.77

CVTL µ/σ 1.08 1.15 1.12 1.11 1.34 1.16 1.36 1.47 1.49 1.49 1.25 1.10 1.68 1.64 1.51 1.59 1.27 1.63
CVTLLS µ/σ 1.07 1.14 1.09 1.05 1.33 1.13 1.27 1.43 1.46 1.45 1.22 1.11 1.54 1.53 1.43 1.43 1.26 1.67
BN VAR 0.84 0.96 0.86 1.05 1.29 1.31 1.07 1.25 1.24 1.45 1.40 1.28 1.23 1.36 1.36 1.70 1.41 1.99
NC2R 0.70 0.90 0.77 0.78 0.97 0.90 0.86 0.96 0.95 0.90 0.87 0.65 0.89 0.94 0.82 0.83 0.53 1.20
CT – – – 0.92 1.16 1.00 – – – 1.15 1.00 0.76 – – – 1.02 0.78 1.34
1/N 0.82 0.75 0.77 0.75 0.97 0.77 0.86 0.82 0.78 0.97 0.73 0.43 0.78 0.76 0.82 0.78 0.43 0.87
Note: Results of all evaluated covariance estimators for EU data with N ∈ {100, 200, 300} and D ∈
estimator per problem setting is highlighted in bold. Underlined values indicate significant difference from CVTL µ/σ
with p < 0.05.

EC.4.3.3. Risk-Adjusted Return After Cost
Table EC.19 Out-Of-Sample Risk-Adjusted Return after Cost for EU Data and Different Dimensions.
N 100 200 300
D = N/T 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13 2.0 1.50 1.33 0.40 0.20 0.13
CVTL 0.36 0.55 0.55 0.78 1.05 1.00 0.59 0.84 0.91 1.07 0.98 0.98 0.90 0.99 0.87 1.06 1.06 1.67
CVTLLS 0.34 0.50 0.50 0.75 1.03 0.98 0.54 0.78 0.85 1.04 0.98 0.95 0.82 0.91 0.81 1.03 1.01 1.62
QIS 0.36 0.46 0.42 0.68 0.99 0.95 0.56 0.77 0.82 0.99 0.96 0.96 0.86 0.92 0.78 0.99 1.02 1.63
QuEST 0.42 0.55 0.51 0.71 1.00 0.95 0.63 0.84 0.90 1.01 0.97 0.97 0.93 1.00 0.86 0.99 1.03 1.63
LShriCC – – – 0.35 0.65 0.72 – – – 0.32 0.60 0.71 – – – 0.25 0.62 1.33
LShri 0.06 0.15 0.09 0.42 0.83 0.85 – 0.08 0.09 0.49 0.76 0.85 0.02 – – 0.39 0.78 1.50
BPSEst – – – 0.20 0.75 0.80 – – – 0.28 0.71 0.82 – – – 0.21 0.73 1.48
FMEst – – – 0.14 0.74 0.80 – – – 0.22 0.70 0.82 – – – 0.18 0.73 1.48
POET I – – – 0.49 1.00 0.99 – – – 0.85 1.03 1.06 – – – 0.56 1.13 1.80
POET II – – – 0.50 0.82 0.82 – – 0.07 0.68 0.78 0.88 – – – 0.62 0.98 1.43
BN 0.53 0.62 0.60 0.78 1.09 0.99 0.68 0.87 0.89 1.16 1.03 1.00 0.88 0.93 0.89 1.17 1.14 1.64
Sample – – – 0.05 0.70 0.78 – – – 0.13 0.68 0.82 – – – 0.12 0.73 1.48

CVTL µ/σ 0.51 0.68 0.67 0.89 1.19 1.04 0.76 0.95 0.99 1.26 1.08 0.92 1.08 1.13 0.98 1.29 1.04 1.46
CVTLLS µ/σ 0.40 0.58 0.55 0.87 1.21 1.05 0.65 0.86 0.87 1.26 1.08 0.97 1.05 1.04 0.85 1.18 1.08 1.56
BN VAR 0.25 0.38 0.30 0.65 1.00 1.08 0.42 0.64 0.64 1.09 1.16 1.10 0.65 0.82 0.85 1.42 1.24 1.86
NC2R 0.26 0.46 0.33 0.34 0.53 0.48 0.61 0.70 0.70 0.55 0.47 0.48 0.79 0.85 0.72 0.72 0.47 1.15
CT – – – 0.54 0.99 0.89 – – – 0.75 0.82 0.65 – – – 0.63 0.62 1.22
1/N 0.80 0.72 0.75 0.72 0.94 0.75 0.84 0.80 0.76 0.95 0.71 0.41 0.76 0.74 0.80 0.76 0.41 0.84
Note: Empirical results of all evaluated covariance estimators for EU data with N ∈ {100, 200, 300} and
D ∈ {2.0, 1.5, 1.33, 0.4, 0.2, 0.13}. Each score is given in percentage and is based on the average over 50 evaluations
of random investment universes. – is set for annualized out-of-sample volatility > 50% and negative out-of-sample
risk-adjusted returns. Best estimator per setting is highlighted in bold. Underlined values indicate significant difference
from CVTL µ/σ with p < 0.05.

Table EC.20 Other Performance Metrics on EU Data for N=100.

CVTL 10.90 35.29 6.41 30.89 169.99 1.32 20.64 10.34 33.39 6.15 30.96 168.06 1.27 11.27
CVTLLS 10.87 35.58 6.47 30.55 168.07 1.33 22.10 10.24 33.48 6.16 30.60 166.20 1.27 11.84
QIS 10.39 35.90 6.50 28.93 159.76 1.34 23.21 9.95 33.74 6.19 29.50 160.89 1.28 13.15
QuEST 10.39 35.72 6.46 29.08 160.68 1.33 20.00 10.06 33.72 6.18 29.82 162.63 1.28 12.36
LShriCC 7.27 38.42 6.99 18.91 104.03 1.45 32.74 7.88 35.42 6.48 22.25 121.52 1.34 16.48
LShri 10.43 37.67 6.81 27.70 153.27 1.41 36.12 9.73 35.43 6.51 27.46 149.44 1.35 20.55
BPSEst 11.13 47.62 8.62 23.37 129.07 1.78 50.00 9.06 37.43 6.85 24.22 132.25 1.42 26.30
FMEst – – – – – – – 9.01 37.82 6.92 23.83 130.29 1.43 28.53
POET I – – – – – – – 11.05 36.49 6.73 30.29 164.12 1.39 22.04
POET II 9.75 37.86 6.83 25.76 142.69 1.41 45.78 9.75 34.56 6.34 28.20 153.65 1.31 18.16
BN 10.96 37.02 6.76 29.60 162.07 1.39 17.71 10.68 33.98 6.23 31.44 171.34 1.29 11.83
Sample – – – – – – – 9.00 39.02 7.13 23.07 126.19 1.48 32.42

CVTL µ/σ 11.16 35.96 6.57 31.04 169.81 1.36 16.44 10.78 34.57 6.41 31.18 168.14 1.32 7.84
CVTLLS µ/σ 10.94 36.54 6.63 29.94 164.86 1.37 19.93 10.43 35.30 6.53 29.56 159.81 1.35 6.46
BN VAR 9.57 40.76 7.61 23.48 125.75 1.57 23.53 11.44 39.22 7.39 29.16 154.73 1.52 15.93
NC2R 9.63 45.70 8.41 21.07 114.51 1.73 20.58 9.71 44.31 8.25 21.91 117.71 1.70 20.49
CT – – – – – – – 10.61 40.60 7.46 26.14 142.25 1.54 16.24
1/N 12.66 57.11 10.32 22.17 122.75 2.12 1.32 12.25 56.71 10.41 21.59 117.69 2.14 1.29
D ∈ (1.33, 0.40). The covariance estimator with best score per KPI is marked bold. The lowest turnover of any
optimized portfolio is marked bold. Underlined values indicate significant difference from CVTL with p < 0.05.


CVTL 12.78 29.52 5.39 43.31 237.25 1.11 18.50 12.15 29.11 5.52 41.74 219.98 1.13 10.84
CVTLLS 12.59 29.82 5.44 42.23 231.29 1.12 19.71 11.96 29.21 5.52 40.96 216.72 1.14 11.25
QIS 12.39 29.74 5.42 41.67 228.68 1.11 20.04 11.95 29.32 5.55 40.75 215.23 1.14 12.59
QuEST 12.47 29.65 5.40 42.05 230.80 1.11 17.64 12.02 29.30 5.55 41.04 216.59 1.14 12.12
LShriCC 8.40 33.72 6.03 24.90 139.26 1.24 33.48 7.51 31.67 5.81 23.72 129.35 1.20 17.50
LShri 12.42 32.43 5.97 38.29 208.11 1.23 43.38 10.44 31.39 5.85 33.25 178.55 1.21 22.25
BPSEst 11.27 42.43 7.64 26.55 147.36 1.58 50.00 9.89 33.05 6.08 29.94 162.64 1.26 27.13
FMEst – – – – – – – 9.72 33.26 6.12 29.24 158.84 1.27 28.72
POET I – – – – – – – 15.66 34.52 6.53 45.36 239.99 1.35 24.59
POET II 12.03 30.99 5.66 38.82 212.43 1.16 42.83 10.65 29.86 5.63 35.67 189.36 1.17 17.34
BN 12.77 31.26 5.70 40.85 224.09 1.17 18.01 13.38 29.44 5.60 45.44 238.74 1.15 12.12
Sample – – – – – – – 9.49 33.92 6.25 27.99 151.86 1.30 31.25

CVTL µ/σ 12.87 30.48 5.59 42.22 230.17 1.15 15.60 13.30 30.75 5.85 43.26 227.39 1.20 7.33
CVTLLS µ/σ 12.60 30.64 5.58 41.12 225.74 1.15 18.49 12.95 30.78 5.83 42.07 222.19 1.20 6.28
BN VAR 12.30 35.06 6.68 35.07 184.06 1.35 21.64 14.41 33.51 6.60 43.01 218.37 1.34 12.70
NC2R 11.86 44.00 8.16 26.95 145.35 1.67 11.23 11.40 43.96 8.09 25.94 140.97 1.66 16.19
CT – – – – – – – 12.40 37.42 6.88 33.13 180.29 1.42 15.66
1/N 12.58 57.05 10.26 22.04 122.58 2.08 1.27 15.37 54.93 10.04 27.99 153.07 2.05 1.26


CVTL 11.81 27.94 5.27 42.27 223.94 1.08 17.93 11.69 28.45 5.36 41.10 218.15 1.10 10.65
CVTLLS 11.62 28.18 5.30 41.23 219.25 1.09 18.87 11.42 28.35 5.31 40.28 214.89 1.09 10.82
QIS 11.52 28.02 5.28 41.11 218.08 1.08 19.42 11.53 28.39 5.34 40.60 215.63 1.10 12.25
QuEST 11.59 27.97 5.26 41.44 220.16 1.08 17.39 11.51 28.37 5.35 40.58 215.23 1.10 11.97
LShriCC 6.91 31.74 5.73 21.76 120.63 1.19 34.11 7.09 30.77 5.56 23.05 127.50 1.15 18.62
LShri 10.18 31.69 5.96 32.11 170.88 1.23 48.24 9.70 30.09 5.64 32.24 171.93 1.16 23.54
BPSEst 9.11 40.94 7.34 22.25 124.12 1.52 50.00 9.17 31.54 5.80 29.08 158.20 1.19 27.48
FMEst – – – – – – – 9.17 31.62 5.83 29.00 157.25 1.20 28.61
POET I – – – – – – – 13.43 33.59 6.92 39.97 194.18 1.42 26.63
POET II 10.47 29.21 5.48 35.84 191.08 1.13 41.48 9.89 28.99 5.45 34.10 181.51 1.12 17.22
BN 12.52 29.06 5.35 43.07 233.89 1.12 18.84 13.38 28.94 5.45 46.22 245.51 1.11 12.72
Sample – – – – – – – 9.20 31.99 5.93 28.75 154.99 1.22 30.51

CVTL µ/σ 12.31 28.47 5.40 43.23 228.01 1.11 15.57 13.59 29.28 5.62 46.43 241.78 1.14 9.07
CVTLLS µ/σ 11.73 28.98 5.44 40.46 215.76 1.12 17.26 12.21 29.58 5.56 41.27 219.64 1.13 7.76
BN VAR 13.39 35.33 6.73 37.89 198.88 1.37 18.04 16.63 34.06 6.52 48.84 254.94 1.32 9.74
NC2R 11.31 48.93 9.02 23.12 125.44 1.87 4.79 11.62 48.68 9.23 23.86 125.82 1.88 5.54
CT – – – – – – – 10.92 37.40 6.93 29.19 157.51 1.41 15.28
1/N 13.40 57.34 10.32 23.37 129.84 2.12 1.34 12.66 55.96 10.48 22.62 120.73 2.13 1.28

EC.4.4. Industry Data
Table EC.23 Other Performance Metrics on FFI Data for N=10.

CVTL 8.38 44.32 8.73 18.91 95.95 1.78 14.14 8.76 40.71 7.93 21.51 110.48 1.63 16.46
CVTLLS 8.58 43.57 8.60 19.70 99.77 1.75 14.66 8.63 39.09 7.59 22.07 113.58 1.57 17.47
QIS 13.33 45.37 8.66 29.39 153.96 1.78 50.00 10.83 40.13 7.65 27.00 141.56 1.57 43.42
QuEST 11.74 41.80 7.98 28.08 147.08 1.65 45.46 11.22 39.50 7.59 28.41 147.83 1.56 38.93
LShriCC 11.58 42.35 8.09 27.35 143.20 1.65 45.44 10.53 39.32 7.49 26.77 140.45 1.53 31.42
LShri 9.60 40.71 7.99 23.58 120.17 1.63 29.47 9.82 38.30 7.48 25.63 131.32 1.53 29.43
BPSEst 10.57 50.60 9.79 20.89 107.96 1.99 9.61 10.55 42.55 8.16 24.78 129.19 1.66 45.90
FMEst – – – – – – – 11.14 43.29 8.18 25.74 136.16 1.67 50.00
POET I 11.05 39.64 7.93 27.87 139.26 1.62 37.01 10.12 37.75 7.41 26.80 136.58 1.51 30.64
POET II 17.72 44.43 8.63 39.88 205.22 1.77 50.00 12.29 40.19 7.66 30.57 160.47 1.57 42.86
BN 9.01 40.92 8.14 22.02 110.70 1.67 26.18 10.32 38.35 7.44 26.92 138.69 1.53 29.78
Sample – – – – – – – 11.58 46.15 8.59 25.09 134.87 1.75 50.00

CVTL µ/σ 4.28 44.52 10.28 9.61 41.60 2.15 20.89 7.22 42.10 8.13 17.14 88.76 1.66 15.06
CVTLLS µ/σ – 62.28 17.47 – – 3.68 23.74 9.08 39.72 7.75 22.86 117.23 1.59 16.31
BN VAR 13.02 43.09 8.82 30.21 147.52 1.78 42.16 11.72 43.87 8.73 26.71 134.19 1.79 47.43
NC2R 10.08 44.78 8.83 22.51 114.17 1.78 46.29 8.23 43.10 8.55 19.09 96.18 1.74 40.59
CT – – – – – – – 9.58 41.20 8.02 23.25 119.45 1.64 38.73
1/N 8.07 51.48 9.98 15.68 80.93 2.02 0.76 7.62 50.78 9.77 15.00 77.93 1.99 0.80
Note: Values are given in percent based on N = 10 portfolio constituents (industries) with D ∈ (1.33, 0.40). The
covariance estimator with best score per metric is highlighted in bold. Underlined values indicate significant difference


CVTL 9.68 38.50 7.44 25.13 130.06 1.51 29.95 9.47 36.42 7.24 25.99 130.82 1.49 16.35
CVTLLS 9.81 38.47 7.45 25.51 131.68 1.51 28.95 9.59 36.31 7.20 26.42 133.15 1.49 16.56
QIS 9.80 40.36 7.57 24.29 129.52 1.55 50.00 9.10 37.72 7.45 24.14 122.26 1.54 33.57
QuEST 8.94 39.27 7.48 22.76 119.55 1.52 47.92 9.04 38.03 7.49 23.77 120.64 1.54 31.02
LShriCC 7.81 40.17 7.46 19.44 104.63 1.54 50.00 9.20 38.57 7.43 23.85 123.72 1.54 31.12
LShri 9.16 39.36 7.57 23.27 120.92 1.54 50.00 9.52 37.61 7.40 25.31 128.62 1.53 35.72
BPSEst 10.92 50.73 9.45 21.52 115.52 1.93 50.00 9.50 40.37 7.82 23.53 121.46 1.61 50.00
FMEst – – – – – – – 9.20 41.12 7.92 22.38 116.24 1.63 50.00
POET I 9.64 43.49 8.25 22.16 116.81 1.70 50.00 6.81 38.44 7.48 17.71 91.05 1.54 33.14
POET II 9.70 40.90 7.85 23.71 123.54 1.62 50.00 8.34 39.16 7.59 21.29 109.81 1.55 33.11
BN 9.85 38.63 7.48 25.49 131.61 1.51 30.63 8.89 37.46 7.44 23.74 119.59 1.52 23.40
Sample – – – – – – – 8.90 43.44 8.28 20.49 107.44 1.72 50.00

CVTL µ/σ 12.48 43.04 8.25 29.01 151.32 1.68 27.19 10.38 39.59 7.86 26.23 132.15 1.61 13.02
CVTLLS µ/σ 9.45 40.82 7.89 23.15 119.84 1.61 26.43 9.83 38.73 7.63 25.37 128.80 1.57 11.85
BN VAR 11.61 41.48 8.12 28.00 142.95 1.65 45.46 9.80 41.18 8.27 23.81 118.56 1.70 32.80
NC2R 9.39 46.68 9.02 20.11 104.12 1.81 50.00 9.64 44.59 8.96 21.62 107.61 1.84 35.50
CT – – – – – – – 10.29 40.89 8.25 25.17 124.70 1.68 36.77
1/N 12.47 56.56 10.45 22.04 119.27 2.13 0.95 11.70 56.59 10.94 20.68 106.99 2.24 0.89
covariance estimator with best score per metric is highlighted in bold. Underlined values indicate significant difference

CVTL 9.60 36.07 6.93 26.61 138.55 1.41 30.10 9.28 33.98 6.59 27.30 140.78 1.35 16.48
CVTLLS 9.58 35.78 6.91 26.79 138.60 1.40 29.09 9.33 33.84 6.60 27.57 141.32 1.34 17.31
QIS 7.86 38.57 7.15 20.38 109.99 1.46 50.00 7.74 36.50 6.80 21.20 113.81 1.38 31.52
QuEST 7.87 37.97 7.10 20.72 110.76 1.46 43.77 8.29 36.17 6.77 22.93 122.46 1.38 28.80
LShriCC 10.89 37.10 7.06 29.35 154.28 1.46 50.00 9.15 36.20 6.90 25.29 132.57 1.41 34.39
LShri 8.42 38.61 7.18 21.82 117.34 1.47 50.00 8.68 36.27 6.89 23.93 126.02 1.40 37.11
BPSEst 11.44 46.78 9.25 24.47 123.68 1.88 50.00 6.84 40.03 7.50 17.09 91.16 1.54 50.00
FMEst – – – – – – – 6.54 40.89 7.59 16.00 86.21 1.56 50.00
POET I – – – – – – – 7.66 36.29 6.86 21.12 111.76 1.40 32.53
POET II 8.98 38.88 7.28 23.09 123.30 1.49 50.00 9.35 36.70 6.95 25.49 134.54 1.42 30.60
BN 8.82 36.50 7.07 24.17 124.78 1.45 27.30 8.41 35.17 6.77 23.91 124.14 1.38 20.62
Sample – – – – – – – 6.21 43.16 7.97 14.39 77.96 1.63 50.00

CVTL µ/σ 9.72 38.77 7.62 25.06 127.57 1.53 20.99 10.85 34.79 6.89 31.19 157.51 1.40 14.13
CVTLLS µ/σ 9.53 36.29 7.05 26.26 135.11 1.43 29.31 10.16 34.65 6.85 29.33 148.25 1.39 13.60
BN VAR 10.59 39.40 7.86 26.89 134.82 1.60 36.21 11.36 39.13 7.85 29.02 144.63 1.60 27.22
NC2R 9.36 44.57 8.73 20.99 107.50 1.77 36.27 9.61 42.88 8.46 22.41 113.56 1.71 26.58
CT – – – – – – – 9.06 40.31 7.88 22.48 115.07 1.59 35.22
1/N 9.78 55.47 10.73 17.62 91.08 2.17 0.94 11.99 54.55 10.66 21.99 112.48 2.15 0.89
covariance estimator with best score per metric is highlighted in bold. Underlined values indicate significant difference

Electronic copy available at: https://ssrn.com/abstract=3986993
