IEEE SIGNAL PROCESSING LETTERS, VOL. 27, 2020

Improved Covariance Matrix Estimation With an Application in Portfolio Optimization

Samruddhi Deshmukh, Member, IEEE, and Amartansh Dubey, Member, IEEE

Abstract—One of the major challenges in multivariate analysis is the estimation of the population covariance matrix from the sample covariance matrix (SCM). Most recent covariance matrix estimators use either shrinkage transformations or asymptotic results from Random Matrix Theory (RMT). Both of these techniques try to achieve a similar goal, which is to remove noisy correlations and add structure to the SCM to overcome the bias-variance trade-off. Both methods have their respective pros and cons. In this paper, we propose an improved estimator which exploits the advantages of these techniques by taking an optimally weighted convex combination of covariance matrices estimated by a shrinkage transformation and a filter based on RMT. It is a generalized estimator which can adapt to changing sampling noise conditions by performing hyperparameter optimization. Using data from six of the world's biggest stock exchanges, we show that the proposed estimator outperforms the existing estimators in minimizing the out-of-sample risk of the portfolio and hence predicts population statistics more precisely. The proposed estimator can be useful in a wide range of machine learning and signal processing applications.

Index Terms—Covariance estimator, Portfolio Optimization, Principal Component Analysis, Random Matrix Theory.

I. INTRODUCTION

Covariance matrix estimation is a crucial problem in multivariate statistics [1], [2] and it finds applications across many disciplines, ranging from engineering [3]–[11] and physics [12], [13] to finance [14]–[17]. It is an active area of research in signal processing, wireless communication, machine learning and finance [3], [16]–[23]. However, the extent of innovation needed in estimating true correlations largely depends on the properties of the data and the trade-off between desired accuracy and complexity. A simple estimator like the SCM is useful if the data has desirable properties like multivariate normality, independent samples, larger sample size, etc. More involved estimators can be used for domain-specific problems when certain prior empirical constraints are given [4]–[11].

However, in many cases, multivariate datasets do not follow desirable properties and constraints and require more generalized estimators which can easily adapt to changing patterns and frequent outliers [24]–[27]. Financial data is particularly challenging for traditional estimators like the SCM because it does not follow desirable properties like multivariate normality and stationarity. This has been well established by the fact that the Nobel Prize winning minimum risk portfolio theory by Markowitz [14] could not be effectively used in practical cases for almost 50 years because it relies on accurate estimation of the covariance matrix [15], [17], [28]. His theory is also referred to as Modern Portfolio Theory (MPT) as it radically changed investment perspectives after the 1950s. The key idea is to minimize risk by avoiding investment in highly correlated stocks (a diversified portfolio). Traditional estimators like the SCM have proved to be highly ineffective due to the heavy-tailed nature of stock market data and the availability of limited samples [17], [28].

The amount of sampling noise present in the SCM depends on certain properties of the data. To understand this, let M be the number of features (e.g., stocks in a market) and N be the number of samples (daily returns of each stock). The data matrix can be represented as X ∈ R^(M×N). After removing the mean, the SCM (Σ_SCM ∈ R^(M×M)) is defined as Σ_SCM = XX^T/N. The following factors decide the extent of deviation of the SCM from the population covariance matrix (Σ_pop):

1) Dimensionality constant (c = M/N): The SCM is an asymptotically unbiased estimator, i.e. Σ_SCM → Σ_pop as N → ∞ and c = M/N ∈ (0, 1). Furthermore, the estimation error (Σ_pop − Σ_SCM) is low if c → 0, i.e. N ≫ M, and high if c → 1. Usually, c is not close to 0 in financial data (like stock market data) since M is comparable to N [17]. Hence, due to the limited samples per feature, the SCM can give highly noisy correlations (high sampling noise).
2) Normality assumption: The SCM is a maximum likelihood estimate of Σ_pop derived under the assumption of multivariate normality. But the distribution of stock returns is mostly non-Gaussian and is best modeled by heavy-tailed distributions [29]. This increases the estimation error.
3) Independence across samples: Another crucial assumption made while deriving the maximum likelihood estimator of Σ_pop is that the samples of each feature are independent and identically distributed (i.i.d.). This is not true for stock data as it can have temporal correlations.
4) Bias-variance tradeoff (structure in the covariance matrix): Deviation from the aforementioned assumptions might increase the sampling noise and cause over-fitting, resulting in a highly non-structured SCM. This results in poor estimates for out-of-sample correlation coefficients.

Hence, the scarcity of samples, deviation from multivariate normality, deviation from the i.i.d. assumption and lack of structure all make the SCM a terrible estimator for many practical applications, particularly for portfolio optimization [16]–[18].

Manuscript received April 14, 2020; accepted May 7, 2020. Date of publication May 20, 2020; date of current version June 29, 2020. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Ya-Feng Liu. (Both authors contributed equally to this work.) (Corresponding author: Amartansh Dubey.)
Samruddhi Deshmukh was with the Leonardo Machine Learning Division, SAP Labs, Bangalore 560066, India. She is now with Needl.ai, Bangalore 560066, India (e-mail: dsamruddhi21@gmail.com).
Amartansh Dubey is with the Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong (e-mail: adubey@connect.ust.hk).
Digital Object Identifier 10.1109/LSP.2020.2996060
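The definitions above (an M × N data matrix X, the SCM Σ_SCM = XX^T/N after mean removal, and the dimensionality constant c = M/N) can be illustrated with a short sketch. This snippet is not from the paper; it simply shows, for Gaussian data with Σ_pop = I, how the SCM's estimation error grows as c approaches 1:

```python
import numpy as np

def sample_covariance(X):
    """SCM: Sigma_SCM = X X^T / N for an M x N data matrix X whose rows are
    features (stocks) and whose columns are samples (daily returns)."""
    X = X - X.mean(axis=1, keepdims=True)  # remove the mean of each feature
    return X @ X.T / X.shape[1]

# With Sigma_pop = I (uncorrelated Gaussian features), the estimation error
# of the SCM grows as the dimensionality constant c = M/N approaches 1.
rng = np.random.default_rng(0)
M = 100
for N in (10000, 200):  # c = 0.01 vs. c = 0.5 (the regime used in Section IV)
    X = rng.standard_normal((M, N))
    err = np.linalg.norm(sample_covariance(X) - np.eye(M), "fro")
    print(f"c = {M / N:.2f}: Frobenius error = {err:.2f}")
```

Even though each entry of the SCM is individually unbiased, the Frobenius error at c = 0.5 is several times larger than at c = 0.01, which is the sampling-noise effect described in point 1) above.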

1070-9908 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: ETH BIBLIOTHEK ZURICH. Downloaded on March 13,2023 at 03:30:09 UTC from IEEE Xplore. Restrictions apply.

In this paper, we propose an improved covariance matrix estimator by taking an optimally weighted convex combination of covariance matrices estimated by a shrinkage transformation and an RMT based filter. This estimator, when applied to the data of major stock markets, outperformed the existing estimators in minimizing the portfolio's out-of-sample risk (test error). A lower risk implies a better estimation of the true correlations among the stocks. Since covariance analysis is a crucial step in multivariate analysis, the proposed estimator can be useful in a wide range of machine learning and signal processing applications [3], [12], [18]–[22].

Section II provides an overview of existing methods and formulates the portfolio optimization problem. Section III describes the proposed estimator, followed by the empirical results in Section IV. In all sections, upper and lower case boldfaced letters (italic or otherwise) represent matrices and vectors respectively. Italic letters represent scalars and functions.

II. SHRINKAGE TRANSFORMATIONS, RMT FRAMEWORK AND PROBLEM FORMULATION

For an effective and concise performance analysis, we have designed our comparative study around existing estimators which satisfy two important practical conditions. Firstly, we have selected existing estimators which deal with highly unpredictable real-world financial data [17], [24]–[31], where there are no prior constraints imposed on the behavior of either the data or the model. In other words, for this study, we have not considered several other domain-specific estimators which utilize prior empirical constraints like restricted condition numbers or follow specific structures in a simulated environment [4]–[11]. Secondly, we have selected those existing estimators which are shown to be effective with real-world data [24]–[31] and not with specific domain-driven simulated results [5]–[11]. However, future work for this study can also incorporate cross-domain comparisons. We have chosen the minimum risk portfolio problem for the comparative study because the complex and fast-changing properties of market data provide extreme conditions to test the robustness of a covariance estimator. Real-world data-driven problems like portfolio optimization have motivated researchers to develop better estimators to handle financial data, and these are mainly of two types: shrinkage estimators [26]–[28] and RMT based estimators [24], [25], [30]. A comprehensive review of these estimators can be found in [17], [31].

A. Shrinkage Transformations

Shrinkage estimators solve the overfitting problem by imposing structure on the SCM. The covariance values in the SCM that are extremely large due to sampling noise (and outliers) tend to contain a lot of positive error and need to be pulled downwards. Similarly, extremely small covariance values need to be pulled upwards. This is done by shrinking the SCM towards a highly structured matrix called a shrinkage target. The convex combination of the SCM (Σ_SCM) with a shrinkage target (F) gives the shrinkage estimator in (1), where ρ is the shrinkage intensity. Its value depends on the properties of the data.

Σ_shrink = ρF + (1 − ρ)Σ_SCM,  0 ≤ ρ ≤ 1    (1)

Over the last two decades, researchers have proposed various shrinkage estimators [26]–[28], [32]. Haff [32] was the first to propose using an identity matrix as the shrinkage target. Ledoit and Wolf [28] proposed a shrinkage target based on the Sharpe Index model. Another famous paper by Ledoit and Wolf [27] proposed a shrinkage target that has the sample variances as the diagonal elements and the average value of all sample covariances as the off-diagonal elements. Hence, it is called the "Sample Variance and Mean Covariance target". Previous studies [17], [27] as well as our analysis in Section IV show that this estimator is the best among all linear shrinkage estimators. Hence, we have combined it in our framework in Section III. However, a major drawback of shrinkage estimators is that they impose a uniform structure on all covariance values, and the choice of a shrinkage target is highly sensitive towards properties like non-normality and skewness.

B. Random Matrix Theory Approach

Unlike shrinkage estimators, which uniformly add bias to the SCM, the RMT based methods exploit the asymptotic properties of matrices in the eigen-space and add selective bias to the unstructured SCM. Also, they do not rely on assumptions like multivariate normality. One such technique is cleaning noisy eigenvalues of the SCM using the Marchenko-Pastur (MP) law [24], [25], [31]. The MP law provides lower and upper bounds on eigenvalues such that all eigenvalues inside the bounds are associated with sampling noise. The MP law is stated as follows: Let X ∈ R^(M×N) be a matrix such that the entries x_ij = [X]_ij are jointly i.i.d. real random variables with zero mean and finite variance (σ² < ∞) (other strict results need the first four moments to be finite). Let λ̂_1, λ̂_2, ..., λ̂_M be the sample eigenvalues of the SCM (Σ_SCM = XX^T/N). Since the entries of the original matrix X are random, this set of eigenvalues can also be viewed as random variables. Now consider a probability measure G_M(x) on these sample eigenvalues (λ̂_i) on the semi-infinite Borel set as a count function (analogous to a cumulative distribution), as shown in (2). The derivative of (2) gives the sample eigenvalue probability density in (3).

G_M(x) = (1/M) #{λ̂_i ≤ x}    (2)

g_M^(Σ_SCM)(x) = (1/M) Σ_{i=1}^{M} δ(x − λ̂_i)    (3)

This density converges to the Marchenko-Pastur (MP) distribution, g_M^(Σ_SCM) → g^MP(x), as the dimensions of the matrix X become asymptotically large (M, N → ∞ and c = M/N ∈ (0, 1)). The convergence is better if c is close to 0. The Marchenko-Pastur distribution (g^MP(x)) is given as:

g^MP(x) = √((x − λ_−)(λ_+ − x)) / (2π c σ² x),  λ_− = σ²(1 − √c)²,  λ_+ = σ²(1 + √c)²    (4)

The eigenvalues lying inside the MP law bounds [λ_−, λ_+] represent sampling noise among originally uncorrelated features and can be replaced with a constant, while keeping the eigenvalues outside these bounds intact. The eigenvectors of the SCM can be scaled with these new eigenvalues to obtain a cleaner covariance matrix
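As a rough illustration of the MP-law eigenvalue cleaning described above, the sketch below clips the SCM eigenvalues lying inside [λ_−, λ_+]. Replacing the noisy bulk with its average (which preserves the trace) is one common convention, assumed here rather than taken from the paper:

```python
import numpy as np

def eigenvalue_clipping(scm, c, sigma2=1.0):
    """Eigenvalue Clipping sketch: clean SCM eigenvalues inside the MP bounds.

    Eigenvalues falling in [lambda_-, lambda_+] are treated as sampling noise
    and replaced by their average (an assumed convention that preserves the
    trace); eigenvalues outside the bounds are kept intact. Here c = M/N and
    sigma2 is the variance of the matrix entries.
    """
    lam_lo = sigma2 * (1.0 - np.sqrt(c)) ** 2   # lambda_minus from (4)
    lam_hi = sigma2 * (1.0 + np.sqrt(c)) ** 2   # lambda_plus from (4)
    vals, vecs = np.linalg.eigh(scm)            # eigendecomposition of the SCM
    noise = (vals >= lam_lo) & (vals <= lam_hi)
    if noise.any():
        vals[noise] = vals[noise].mean()        # flatten the noisy bulk
    return vecs @ np.diag(vals) @ vecs.T        # cleaned matrix, Sigma_MP
```

The eigenvectors are left untouched; only the spectrum inside the MP bounds is modified, which is the "selective bias" contrasted with shrinkage in the text.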


(Σ_MP). This technique is called Eigenvalue Clipping [17] and, unlike shrinkage techniques, it selectively adds bias to noisy correlations. Another recent development in RMT to clean the SCM is the Rotationally Invariant Estimator [31]. However, this method does not give a significant improvement over Eigenvalue Clipping for small datasets [17] (also shown in Section IV) and requires heavier numerical computations.

There are some disadvantages of RMT based methods. For example, Eigenvalue Clipping completely overlooks the fact that extreme eigenvalues lying outside the MP law bounds can be overestimated and can increase the sampling noise [17]. Furthermore, these results are derived under asymptotic assumptions which can be misleading when the available sample size is small. The RMT based methods also require finite variance and i.i.d. nature in the data, which might not be true for heavy-tailed financial data having high temporal correlations. Thus, both shrinkage and RMT techniques have their pros and cons depending on the properties of the data.

C. Formulation of Portfolio Optimization Problem

The conventional problem of finding the minimum risk portfolio is a convex problem with linear constraints [14]. We have included an additional return constraint for our study because even a risk-averse investor would expect a minimal positive return. This gives the following problem formulation:

minimize_p   var(p^T X)  (≈ |p^T Σ_SCM p|²)
subject to   1^T p = 1,  p ⪰ 0,  g^T p ≥ r_daily    (5)

where X ∈ R^(M×N) is the stock return matrix for M stocks, each with N daily returns. The portfolio vector p ∈ R^(M×1) is the optimization variable and 'var' represents variance. The vector g ∈ R^(M×1) represents the predicted daily returns of the M stocks, and it can be estimated using recurrent neural networks [34] or simply by dividing the available data into training and test sets. The threshold r_daily is the minimum daily expected return. The objective function var(p^T X) can be approximated as |p^T Σ_SCM p|². The first constraint in (5) implies that the sum of all portfolio weights should be one. The second constraint forces the portfolio weights to be positive, since we are not considering a short-selling scenario [28]. The third constraint specifies the minimum expected return. The problem in (5) tries to find a vector p in the M dimensional feature space on which the projection of the data is minimum while satisfying the constraints. Therefore, estimation of the covariance matrix is the key step for solving (5).

III. PROPOSED ESTIMATOR

In this section, we propose an improved covariance estimator which exploits the advantages of both the shrinkage and Eigenvalue Clipping approaches by taking an optimally weighted convex combination of the high variance Σ_SCM, a highly structured shrinkage target F, and a matrix obtained by applying Eigenvalue Clipping (Σ_MP). The formulation of the proposed estimator (represented as Σ*) is shown in (6).

Σ* = α F + β Σ_MP + γ Σ_SCM,  where α + β + γ = 1 and α, β, γ ≥ 0    (6)

Here, the shrinkage target F is the Sample Variance and Mean Covariance target (explained in Section II-A). The optimization problem for finding the optimal weights is given as:

minimize_{α,β,γ}   |Σ_pop − Σ*|_F
subject to   Σ* = α F + β Σ_MP + γ Σ_SCM,  α + β + γ = 1,  α, β, γ ≥ 0    (7)

where |·|_F is the Frobenius norm. The problem with three variables (α, β, γ) in (7) can be simplified to (8) with two variables (θ, φ).

minimize_{θ,φ}   |Σ_pop − Σ*|_F
subject to   Σ* = [θF + (1 − θ)Σ_MP]φ + (1 − φ)Σ_SCM,  0 ≤ θ ≤ 1,  0 ≤ φ ≤ 1    (8)

Similarly, the proposed estimator in (6) can be rewritten as (9). The effective weights of F, Σ_MP and Σ_SCM are now θφ, (1 − θ)φ, and (1 − φ) respectively.

Σ* = θφ F + (1 − θ)φ Σ_MP + (1 − φ) Σ_SCM    (9)

In practical cases, Σ_pop is not known, and hence for given values of (θ, φ) we cannot decide how close Σ* is to Σ_pop. Hence, the problem in (8) cannot be solved directly. However, we know that the Markowitz theory (MPT) optimally minimizes the risk associated with a portfolio if a given estimator accurately predicts the population correlations (Σ* → Σ_pop) [17], [24]–[27]. Therefore, to find the optimal values of (θ, φ) which solve (8), we rewrite the MPT problem in (5) as (10) and solve it multiple times, iterating over the values of both θ and φ from 0 to 1 with sufficiently high resolution, until we get the minimum value of the objective function in (10).

minimize_p   |p^T Σ*(θ, φ) p|²
subject to   1^T p = 1,  p ⪰ 0,  g^T p ≥ r_daily    (10)

Solving (10) for fixed values of (θ, φ) requires the eigen-decomposition of a symmetric positive semi-definite matrix, which has a worst case complexity of O(M³) [35]. Furthermore, the number of times the problem in (10) needs to be solved depends on the choice of increment step used to iterate the values of both θ and φ from 0 to 1. Hence, (θ, φ) are the hyper-parameters in (10), which need to be searched on a 2-D grid. It is important to note that in portfolio optimization problems, computational complexity is not exclusively discussed because of the limited sized datasets involved [16]–[18], [30]. In fact, accuracy becomes more important than computational complexity when using limited sized datasets. Therefore, this paper focuses on accuracy rather than computational cost. However, several faster methods for decomposition of covariance matrices can be explored as part of future work [35]–[38]. Similarly, for high dimensional applications, the grid search for hyperparameters can be accelerated using various methods in the field of fast hyperparameter tuning [39]–[41].

This estimator is simple but very effective in removing the shortcomings of shrinkage estimators and the Eigenvalue Clipping filter. We know that shrinkage transformation adds bias to all
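The combination in (9) and the 2-D grid search over (θ, φ) can be sketched as follows. The `risk` callable is a placeholder for whatever score is used to compare candidate matrices (in the paper, the out-of-sample risk obtained by solving (10)); both the grid resolution and the callable interface are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from itertools import product

def combined_estimator(F, sigma_mp, scm, theta, phi):
    """Proposed estimator (9): effective weights are theta*phi for F,
    (1-theta)*phi for Sigma_MP and (1-phi) for Sigma_SCM."""
    return theta * phi * F + (1.0 - theta) * phi * sigma_mp + (1.0 - phi) * scm

def grid_search(F, sigma_mp, scm, risk, n=21):
    """Search the hyper-parameters (theta, phi) on an n x n grid over [0, 1]^2.

    `risk` is any callable scoring a candidate covariance matrix; the pair
    minimizing it is returned.
    """
    grid = np.linspace(0.0, 1.0, n)
    return min(product(grid, grid),
               key=lambda tp: risk(combined_estimator(F, sigma_mp, scm, *tp)))
```

Because φ = 0 recovers Σ_SCM exactly and lies on the grid, the selected combination can never score worse than the raw SCM under the chosen risk measure.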


sample covariance values uniformly. On the other hand, Eigenvalue Clipping selectively removes and replaces noisy eigenvalues inside the MP law bounds but ignores the noise in extreme eigenvalues. This means that Eigenvalue Clipping efficiently removes the noisy correlations between features which are originally uncorrelated, but it is ineffective in removing the noisy correlations between features which are originally highly correlated. The shrinkage estimator is the opposite in its ability. So when we take the weighted convex combination of both the shrinkage and Eigenvalue Clipping estimators, we not only remove noisy eigenvalues inside the MP law bounds, but also shrink extreme eigenvalues lying outside the bounds. Hence, noisy correlations among both correlated and uncorrelated features can now be handled. Furthermore, this provides a generalized estimator that can adapt to different datasets by changing the values of θ and φ.

IV. DATA AND EMPIRICAL RESULTS

TABLE I: ANNUALIZED OUT-OF-SAMPLE RISK FOR THE MINIMUM VARIANCE PORTFOLIO (WITH THE CONSTRAINT OF ACHIEVING AT LEAST 10% RETURN) FOR THE NSE, NIKKEI, S&P AND BSE DATASETS USING DIFFERENT ESTIMATORS (IN TERMS OF % STANDARD DEVIATION)

TABLE II: RESULTS FOR THE NASDAQ AND NYSE DATASETS

We have compared the following five estimators with our proposed estimator (Σ*): 1) the Identity Matrix (Σ_Identity) proposed by [32], which assumes that there is no correlation among stocks; 2) the Shrinkage Estimator (Σ_Shrinkage) proposed in [27], which is shown to be the most efficient linear shrinkage estimator [17], [27]; 3) the Sample Covariance Matrix (Σ_SCM); 4) the Eigenvalue Clipping based estimator (Σ_MP); 5) the Rotational Invariant Estimator (Σ_RIE) recently proposed by Bun et al. [24], [31].

For comparing these estimators, we consider stocks from six major stock exchanges: NSE, NIKKEI, BSE, S&P, NASDAQ and NYSE. We solved the problem formulated in (10) for each dataset to minimize the investment risk while satisfying the minimum 10% return constraint. We selected the 100 most liquid stocks from each of the exchanges, with 750 days (Jan. 2014 to Jan. 2016, around 2 years) of daily returns data for each stock. We also avoided selecting stocks which are common to multiple American exchanges. The daily returns for the first 200 days are used to train the initial minimum risk portfolio using the six estimators shown in Tables I and II. Hence the size of the data matrix is 100 × 200, i.e., the dimensionality constant is c = 0.5. We then shift this 200 day training window forward and update the portfolio at frequencies of 30, 60 and 90 days, and record the variance of the daily returns in each case. We have used only 200 days of daily returns for training because in finance, using recent data is preferable to capture the effects of recent trends.

The comparisons shown in Tables I and II are based on the variance (volatility) of the daily returns associated with the portfolios obtained by the various estimators. The volatility is calculated on the test data (out-of-sample risk). This variance is converted to a percentage standard deviation and annualized by multiplying it by √365. The results are shown for portfolio update frequencies of 30, 60 and 90 days. It can be seen that the proposed estimator gives the lowest out-of-sample risk for all six datasets and for all the update frequencies. A decrease in volatility even at the first decimal place is considered fairly significant in the field of portfolio optimization [17], [27], [28]. Based on this benchmark, the margin of improvement given by the proposed estimator is significant in all the datasets except for the 30 day update frequency on NASDAQ. In most cases, the identity matrix (Σ_Identity) is the worst estimator (except for BSE), which is expected as it assumes zero correlations among stocks. The SCM is the second-worst. Also, the proposed estimator gives good results even when the update frequency is low, i.e. 90 days. This shows that the proposed estimator predicts the future correlations more precisely.

V. CONCLUSION AND FUTURE WORK

In this paper, we proposed an improved covariance matrix estimator which exploits the advantages of both shrinkage and RMT based estimators. It first uses the MP law to clip noisy eigenvalues lying inside the MP law bounds, thus adding selective bias, and then uses shrinkage techniques to shrink extreme covariance values, thus reducing the error in them. Hence, noisy correlations among both correlated and uncorrelated features can now be handled. Also, this provides a generalized estimator that can adapt to different datasets by tuning its parameters using training data.

We used data from six of the world's largest stock exchanges and showed that our proposed estimator outperforms all existing estimators in minimizing the out-of-sample risk of the portfolio. This implies that it efficiently predicts the true correlations among stocks and, by extension, among any set of multivariate features. Hence it can also be useful in other fields dealing with covariance matrices, including machine learning and signal processing. The performance and computational efficiency for large dimensional problems can be investigated as part of future work.
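The annualization convention reported in Section IV (daily variance converted to a percentage standard deviation and scaled by √365) can be sketched as:

```python
import numpy as np

def annualized_pct_std(daily_returns):
    """Annualize the volatility of a series of daily portfolio returns:
    standard deviation of daily returns x 100 (percent) x sqrt(365)."""
    return 100.0 * np.std(daily_returns) * np.sqrt(365.0)
```

For example, a portfolio whose daily returns have a standard deviation of 1% corresponds to roughly 19.1% annualized volatility under this convention.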


REFERENCES

[1] J. Baik and J. W. Silverstein, "Eigenvalues of large sample covariance matrices of spiked population models," J. Multivariate Anal., vol. 97, no. 6, pp. 1382–1408, 2006.
[2] T. W. Anderson, An Introduction to Multivariate Statistical Analysis (Wiley Series in Probability and Statistics). New York, NY, USA: Wiley, 2003.
[3] K. Upadhya and S. A. Vorobyov, "Covariance matrix estimation for massive MIMO," IEEE Signal Process. Lett., vol. 25, no. 4, pp. 546–550, 2018.
[4] A. Aubry, A. De Maio, and L. Pallotta, "A geometric approach to covariance matrix estimation and its applications to radar problems," IEEE Trans. Signal Process., vol. 66, no. 4, pp. 907–922, 2017.
[5] Y. I. Abramovich and O. Besson, "Regularized covariance matrix estimation in complex elliptically symmetric distributions using the expected likelihood approach—Part 1: The over-sampled case," IEEE Trans. Signal Process., vol. 61, no. 23, pp. 5807–5818, 2013.
[6] X. Hua, Y. Cheng, H. Wang, and Y. Qin, "Robust covariance estimators based on information divergences and Riemannian manifold," Entropy, vol. 20, no. 4, p. 219, 2018.
[7] J. H. Won and S.-J. Kim, "Maximum likelihood covariance estimation with a condition number constraint," in Proc. IEEE Fortieth Asilomar Conf. Signals, Syst. Comput., 2006, pp. 1445–1449.
[8] G. Cui, N. Li, L. Pallotta, G. Foglia, and L. Kong, "Geometric barycenters for covariance estimation in compound-Gaussian clutter," IET Radar, Sonar Navigat., vol. 11, no. 3, pp. 404–409, 2016.
[9] L. Du, J. Li, and P. Stoica, "Fully automatic computation of diagonal loading levels for robust adaptive beamforming," IEEE Trans. Aerosp. Electron. Syst., vol. 46, no. 1, pp. 449–458, 2010.
[10] A. De Maio, L. Pallotta, J. Li, and P. Stoica, "Loading factor estimation under affine constraints on the covariance eigenvalues with application to radar target detection," IEEE Trans. Aerosp. Electron. Syst., vol. 55, no. 3, pp. 1269–1283, 2018.
[11] A. Aubry, A. De Maio, L. Pallotta, and A. Farina, "Maximum likelihood estimation of a structured covariance matrix with a condition number constraint," IEEE Trans. Signal Process., vol. 60, no. 6, pp. 3004–3021, 2012.
[12] S. Olivares and M. G. A. Paris, "Bayesian estimation in homodyne interferometry," J. Phys. B: At., Mol. Opt. Phys., vol. 42, no. 5, 2009, Art. no. 055506.
[13] R. Schodel et al., "A star in a 15.2-year orbit around the supermassive black hole at the centre of the Milky Way," Nature, vol. 419, no. 6908, pp. 694–697, 2002.
[14] H. M. Markowitz, "Foundations of portfolio theory," J. Finance, vol. 46, no. 2, pp. 469–477, 1991.
[15] F. J. Fabozzi, F. Gupta, and H. M. Markowitz, "The legacy of modern portfolio theory," J. Investing, vol. 11, no. 3, pp. 7–22, 2002.
[16] Y. Feng and D. P. Palomar, "Portfolio optimization with asset selection and risk parity control," in Proc. IEEE Int. Conf. Acoust., Speech Signal Process., 2016, pp. 6585–6589.
[17] J. Bun, J. Bouchaud, and M. Potters, "Cleaning correlation matrices," Risk.net. [Online]. Available: https://www.risk.net/risk-magazine/technical-paper/2452666. Accessed on: Jan. 10, 2018.
[18] L. Yang, R. Couillet, and M. R. McKay, "A robust statistics approach to minimum variance portfolio optimization," IEEE Trans. Signal Process., vol. 63, no. 24, pp. 6684–6697, Dec. 2015.
[19] L. An, S. Yang, and B. Bhanu, "Person re-identification by robust canonical correlation analysis," IEEE Signal Process. Lett., vol. 22, no. 8, pp. 1103–1107, Aug. 2015.
[20] Z. Zhang, W. Liu, W. Leng, A. Wang, and H. Shi, "Interference-plus-noise covariance matrix reconstruction via spatial power spectrum sampling for robust adaptive beamforming," IEEE Signal Process. Lett., vol. 23, no. 1, pp. 121–125, Jan. 2015.
[21] S. Liao, J. Li, Y. Liu, Q. Gao, and X. Gao, "Robust formulation for PCA: Avoiding mean calculation with L2,p-norm maximization," presented at the AAAI Conf. Artif. Intell., Apr. 2018. [Online]. Available: https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/16571/16561
[22] Q. Zhao, D. Meng, Z. Xu, W. Zuo, and L. Zhang, "Robust principal component analysis with complex noise," in Proc. Int. Conf. Mach. Learn., 2014, pp. 55–63.
[23] P. Stoica, E. G. Larsson, and A. B. Gershman, "The stochastic CRB for array processing: A textbook derivation," IEEE Signal Process. Lett., vol. 8, no. 5, pp. 148–150, 2001.
[24] J. Bun, J.-P. Bouchaud, and M. Potters, "Cleaning large correlation matrices: Tools from random matrix theory," Phys. Rep., vol. 666, pp. 1–109, 2017.
[25] J.-P. Bouchaud and M. Potters, "Financial applications of random matrix theory: A short review," 2009, arXiv:0910.1205.
[26] O. Ledoit and M. Wolf, "Nonlinear shrinkage of the covariance matrix for portfolio selection: Markowitz meets Goldilocks," Rev. Financial Studies, vol. 30, no. 12, pp. 4349–4388, 2017.
[27] O. Ledoit and M. Wolf, "Honey, I shrunk the sample covariance matrix," J. Portfolio Manag., vol. 30, no. 4, pp. 110–119, 2004.
[28] O. Ledoit and M. Wolf, "Improved estimation of the covariance matrix of stock returns with an application to portfolio selection," J. Empirical Finance, vol. 10, no. 5, pp. 603–621, 2003.
[29] S. T. Rachev, Handbook of Heavy Tailed Distributions in Finance (Handbooks in Finance), vol. 1. North Holland, The Netherlands: Elsevier, 2003.
[30] J. Bun et al., "Rotational invariant estimator for general noisy matrices," IEEE Trans. Inf. Theory, vol. 62, no. 12, pp. 7475–7490, 2016.
[31] M. Potters and J.-P. Bouchaud, A First Course in Random Matrix Theory, 2019. [Online]. Available: https://physics-complex-systems.fr/wp-content/uploads/2019/02/Notes_chap1-11.pdf
[32] L. R. Haff, "Empirical Bayes estimation of the multivariate normal covariance matrix," Ann. Statist., vol. 8, no. 3, pp. 586–597, 1980.
[33] V. A. Marčenko and L. A. Pastur, "Distribution of eigenvalues for some sets of random matrices," Math. USSR-Sbornik, vol. 1, no. 4, pp. 457–483, 1967.
[34] F. Qian and X. Chen, "Stock prediction based on LSTM under different stability," in Proc. IEEE 4th Int. Conf. Cloud Comput. Big Data Anal., 2019, pp. 483–486.
[35] A. Sharma and K. K. Paliwal, "Fast principal component analysis using fixed-point algorithm," Pattern Recognit. Lett., vol. 28, no. 10, pp. 1151–1155, 2007.
[36] S. Attallah and K. Abed-Meraim, "A fast adaptive algorithm for the generalized symmetric eigenvalue problem," IEEE Signal Process. Lett., vol. 15, pp. 797–800, 2008.
[37] A. Sharma, K. K. Paliwal, S. Imoto, and S. Miyano, "Principal component analysis using QR decomposition," Int. J. Mach. Learn. Cybernet., vol. 4, no. 6, pp. 679–683, 2013.
[38] H. Cardot and D. Degras, "Online principal component analysis in high dimension: Which algorithm to choose?," Int. Statist. Rev., vol. 86, no. 1, pp. 29–50, 2018.
[39] A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, "Fast Bayesian optimization of machine learning hyperparameters on large datasets," 2016, arXiv:1605.07079.
[40] C.-S. Foo, C. B. Do, and A. Y. Ng, "A majorization-minimization algorithm for (multiple) hyperparameter learning," in Proc. 26th Annu. Int. Conf. Mach. Learn., 2009, pp. 321–328.
[41] J. Bergstra and Y. Bengio, "Random search for hyper-parameter optimization," J. Mach. Learn. Res., vol. 13, no. 10, pp. 281–305, Feb. 2012.
[42] S. Deshmukh and A. Dubey, "Improved covariance matrix estimator using shrinkage transformation and random matrix theory," 2019, arXiv:1912.03718.
[43] Z. Wang, M. Li, H. Chen, L. Zuo, P. Zhang, and Y. Wu, "Adaptive detection of a subspace signal in signal-dependent interference," IEEE Trans. Signal Process., vol. 65, no. 18, pp. 4812–4820, 2017.

