Market Correlation and Market Volatility in US Blue Chip Stocks

Craig Mounfield (craig.mounfield@volterra.co.uk) and Paul Ormerod (pormerod@volterra.co.uk)

Volterra Consulting Ltd

The Old Power Station 121 Mortlake High Street London SW14 8SN

Crowell Prize Submission 20th March 2001

Abstract

We analyse the daily rates of return of US blue chip stocks over the 1993-2001 period. Using the technique of random matrix theory, we show that the correlation matrix of these rates of return is to a large extent dominated by noise rather than by true information. These results confirm for this data set findings recently documented in the econophysics literature.

However, the eigenvector associated with the principal eigenvalue of the correlation matrix does contain true information and shows stability over time. This, the market eigenvector, shows the extent to which the individual stocks tend to move together. We quantify the fraction of total information contained within this eigenmode, which we define as the information index.

We find a clear positive relationship between the absolute changes in the variability of the information index and the absolute changes in the variability of the market index. Further, the absolute change in the variability of the information index lagged one day has statistically significant predictive power for the absolute change in the variability of the market index.

1. Introduction

A precise quantification of the correlations between the returns of different assets traded in financial markets is of fundamental importance to risk management, where one attempts to diversify as widely as possible the character of the portfolio (reducing exposure to sector/industry specific shocks). As a consequence of this the correlation matrix is one of the cornerstones of much of modern financial engineering such as CAPM (Elton et al. 1995) and Value at Risk. However it is well understood that empirical measurements of the correlations between assets are subject to a number of significant sources of potential error. The difficulties associated with determining the true correlations between financial assets arise primarily due to:

• Non-stationary correlations between assets (due, for example, to an organisation's profile changing over time)
• A finite number of observations of asset price movements (the statistical significance of spurious measurements becomes insignificant in the limit of an infinite number of observations of asset pair price movements).

In this case the empirically measured correlations may be significantly noise dominated, masking the true correlations between asset returns.

The technique of Random Matrix Theory (RMT) has recently been applied to financial market data to analyse the true degree of information content contained within empirical correlation matrices formed from equity returns. RMT was originally developed for the study of complex quantum mechanical systems. In order to assess the degree to which an empirical correlation matrix is noise dominated we can compare the eigenspectra properties of the empirical matrix with the theoretical eigenspectra properties of a random matrix. Undertaking this analysis will identify those eigenstates of the empirical correlation matrix which contain genuine information content. The remaining eigenstates will be noise dominated and hence unstable over time.

This technique has recently been applied by a number of researchers to financial market data (for example, Mantegna et al 1999, Laloux et al 1999, Plerou et al 1999, Bouchaud et al 2000, Gopikrishnan et al 2000, Plerou 2000, Drozdz et al 2001) as well as to macroeconomic data (Ormerod et al, 2000). We apply this technique to daily returns on leading US blue chip stocks using daily data over the 1993-2001 period. The results are consistent with those of the recent literature, in that empirical financial correlation matrices are in general dominated by noise, but there do exist some significant, stable deviations of empirical financial correlation matrices from the universal predictions of RMT.

The main purpose of this paper is two-fold. First, to show that this technique may also yield an information index which characterises the degree to which the movements of assets in a portfolio are correlated. Second, that the temporal evolution of this index is well correlated with the volatility of the overall market index.

The structure of the paper is as follows. Section 2 outlines the relevant concepts of RMT. Section 3 then applies these concepts to the analysis of a portfolio of US blue chip equities. Finally the main results are summarised in section 4.

2. Random Matrix Theory

The problem of understanding the properties of matrices with stochastically fluctuating entries is one which has been studied intensively since the 1950s in the context of nuclear physics. In this context the problem was to understand the empirically observed energy spectra of complex quantum mechanical systems (specifically heavy nuclei composed of many interacting constituents).

In order to characterise these properties it was assumed that the numerous many-body interactions are in fact so complex that in the aggregate they may be considered to be random. That is, the elements of the Hamiltonian matrix $H_{ij}$ may be considered to be mutually independent random variables. Under this assumption it was possible to derive the statistics of the eigenvalue distribution of the Hamiltonian which were in remarkable agreement with experimental data (a contemporary exposition of RMT may be found in Mehta, 1991). It was also demonstrated that RMT predictions represent an average over all possible interactions. Hence RMT predictions are universal predictions that will apply to wide classes of systems. Deviations from the universal predictions of RMT identify system-specific, non-random properties of the system under consideration. These deviations provide clues about the underlying interactions within the system (Mehta 1991).

In order to assess the degree to which an empirical correlation matrix is noise dominated one may compare the eigenspectra properties of the empirical matrix with the theoretical eigenspectra properties of a random matrix. Undertaking this analysis will identify those eigenstates of the empirical matrix which contain genuine information content. The eigenstates that contain genuine information content are specific to the system under consideration and are indicative of the presence of collective modes of motion. The remaining eigenstates are understood to be noise dominated and hence potentially unstable over time.

2.1 Eigenspectra Properties of Random Matrices

Consider a matrix M of T observations of price changes of N assets (at a frequency of e.g. inter-day observations). If the inter-period logarithmic returns are defined as

$$M_i(t) = \ln P_i(t) - \ln P_i(t-1)$$

then the correlation matrix measuring the correlations between the N assets is given by

$$C = \frac{1}{T} M M^{T}$$

If the T observations are i.i.d. random variables then in the limit $N \to \infty$ and $T \to \infty$ the density of eigenvalues, $\lambda$, of the random correlation matrix C is given by (Sengupta et al 1999)

$$\rho_C(\lambda) = \frac{Q}{2\pi\sigma^2} \, \frac{\sqrt{(\lambda_{max} - \lambda)(\lambda - \lambda_{min})}}{\lambda}$$

for $\lambda \in [\lambda_{min}, \lambda_{max}]$, where $Q = T/N \geq 1$. The upper and lower bounds on the theoretical eigenvalue distribution are given by

$$\lambda_{max} = \sigma^2 \left(1 + \frac{1}{\sqrt{Q}}\right)^2, \qquad \lambda_{min} = \sigma^2 \left(1 - \frac{1}{\sqrt{Q}}\right)^2$$

($\sigma^2$ is the variance of the elements of M, usually rescaled to unity). This distribution is plotted below in figure 1 for Q = 3.22. As can be seen from this figure there is a well-defined range of non-zero eigenvalues $\lambda_{min} < \lambda < \lambda_{max}$.
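The band edges and the density of Section 2.1 are straightforward to evaluate numerically. The following is a minimal sketch, assuming NumPy and unit variance by default; the function names `mp_bounds` and `mp_density` are illustrative and not taken from the paper.

```python
import numpy as np

def mp_bounds(T, N, sigma2=1.0):
    """Lower and upper edges of the random-matrix eigenvalue band, Q = T / N >= 1."""
    Q = T / N
    lam_min = sigma2 * (1.0 - 1.0 / np.sqrt(Q)) ** 2
    lam_max = sigma2 * (1.0 + 1.0 / np.sqrt(Q)) ** 2
    return float(lam_min), float(lam_max)

def mp_density(lam, T, N, sigma2=1.0):
    """Theoretical eigenvalue density rho_C(lambda); zero outside the band."""
    Q = T / N
    lam_min, lam_max = mp_bounds(T, N, sigma2)
    lam = np.atleast_1d(np.asarray(lam, dtype=float))
    rho = np.zeros_like(lam)
    inside = (lam > lam_min) & (lam < lam_max)
    rho[inside] = (Q / (2.0 * np.pi * sigma2)
                   * np.sqrt((lam_max - lam[inside]) * (lam[inside] - lam_min))
                   / lam[inside])
    return rho

# Dimensions of the full data set analysed in Section 3: 2067 returns, 31 assets.
lam_min, lam_max = mp_bounds(T=2067, N=31)
print(round(lam_min, 2), round(lam_max, 2))  # 0.77 1.26
```

For T = 2067 and N = 31 the edges come out at roughly 0.77 and 1.26, the bounds used for the full sample in Section 3.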

This range of eigenvalues corresponds to a random, noisy subspace band where the postulates of RMT hold. That is to say, the eigenvectors corresponding to eigenvalues within $\lambda_{min} < \lambda < \lambda_{max}$ contain no genuine information. When the dimensions of the random matrix under consideration are finite (but still 'large') this has the effect of broadening the spectral distribution. However in these instances Monte-Carlo simulation can generate what the broadened eigenvalue distribution is expected to be. The eigenvalue distribution of the correlation matrices of matrices of actual data can be compared to this 'null-hypothesis' distribution and thus, in theory, if the distribution of eigenvalues of an empirically formed matrix differs from the above distribution, then that matrix will not have completely random elements. In other words, there will be structure present in the correlation matrix. Each isolated eigenstate outside of the RMT bounds represents a correlated group whose size and participants are obtained from the eigenvalue and eigenvector respectively.
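The Monte-Carlo 'null-hypothesis' spectrum for finite N and T can be sketched as follows. This is an illustration only: it assumes Gaussian i.i.d. entries and NumPy, and the function name, trial count and seed are hypothetical choices rather than the procedure used in the paper.

```python
import numpy as np

def simulated_null_eigenvalues(T, N, n_trials=200, seed=0):
    """Eigenvalues of sample correlation matrices built from pure noise.

    Returns an (n_trials, N) array; each row is sorted in ascending order.
    """
    rng = np.random.default_rng(seed)
    eigs = np.empty((n_trials, N))
    for k in range(n_trials):
        M = rng.standard_normal((N, T))   # N series of T i.i.d. observations
        C = np.corrcoef(M)                # N x N sample correlation matrix
        eigs[k] = np.linalg.eigvalsh(C)
    return eigs

eigs = simulated_null_eigenvalues(T=2067, N=31)
# The extreme simulated eigenvalues show how far the finite-size spectrum
# spreads around the asymptotic band edges.
print(eigs.min(), eigs.max())
```

A histogram of `eigs.ravel()` gives the broadened density against which an empirical spectrum can be compared.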

Figure 1 : Theoretical Density of Eigenvalues for a Random Matrix (horizontal axis: Eigenvalue; vertical axis: Density)

2.2 The Inverse Participation Ratio

To analyse the structure of the eigenvectors of the empirical correlation matrix the Inverse Participation Ratio (IPR) may be calculated. The IPR is commonly utilised in localisation theory to quantify the contribution of the different components of an eigenvector to the magnitude of that eigenvector (thus determining if an eigenstate is localised or extended) (Plerou et al 1999). Component i of an eigenvector, $v_i^{\alpha}$, corresponds to the contribution of time series i to that eigenvector. That is to say, in this context, it corresponds to the contribution of asset i to eigenvector $\alpha$. In order to quantify this we define the IPR for eigenvector $\alpha$ to be

$$I^{\alpha} = \sum_{i=1}^{N} \left(v_i^{\alpha}\right)^4$$

Hence an eigenvector with identical components $v_i^{\alpha} = 1/\sqrt{N}$ will have $I^{\alpha} = 1/N$ and an eigenvector with one non-zero component will have $I^{\alpha} = 1$. Therefore the inverse participation ratio is the reciprocal of the number of eigenvector components significantly different from zero (i.e. the number of assets contributing to that eigenvector).

2.3 Temporal Stability of the Eigenvector Structure

For those eigenvectors that deviate from the theoretically predicted bounds of RMT it is important to quantify the degree of stability of the information content of the eigenmode (i.e. the stability of the correlations between the assets). This is necessary since spurious correlations may be introduced by a particular choice of data to calculate the correlation matrix from. We may assess this stability by calculating the scalar product of eigenvectors in non-overlapping analysis periods. That is, for two analysis periods $T_A$ and $T_B$ we form the overlap matrix

$$O(T_A, T_B) = \begin{pmatrix} v^{1}(T_A) \cdot v^{1}(T_B) & \cdots & v^{1}(T_A) \cdot v^{N}(T_B) \\ \vdots & \ddots & \vdots \\ v^{N}(T_A) \cdot v^{1}(T_B) & \cdots & v^{N}(T_A) \cdot v^{N}(T_B) \end{pmatrix}$$

Hence if the eigenvector structure remains perfectly stable in time (i.e. the correlations between the assets contributing to that eigenvector remain stable from period to period) then each element of the overlap matrix would be equal to $O_{ij}(T_A, T_B) = \delta_{ij}$. No inter-period stability would imply that $O_{ij}(T_A, T_B) = 0$.

3. RMT Applied to Empirical Correlation Matrices

Having described the basic analysis tools of RMT we will now apply this technology to financial correlation matrices.

3.1 Data Analysed

The data set is for 31 US equities (blue chips, mostly Dow Jones Industrial Average constituents), daily closing data for the period 4th January 1993 to 13th March 2001. There are 2068 separate trading days (taking out holidays etc). As a control this data set is also analysed after each of the time series of the individual assets are shuffled at random 10000 times. This has the effect of destroying any temporal correlations in the data, while at the same time preserving the statistical properties of the distributions (e.g. mean and variance). This randomly shuffled portfolio will act as a control to demonstrate there exists a quantitative difference between the eigenspectra of random and empirical correlation matrices.

3.2 Analysis of the Eigenspectra Properties

To demonstrate that RMT may yield genuine information as to the true information content contained within an empirical correlation matrix we will calculate the eigenspectra properties of the two portfolios described above. That is to say we form the correlation matrix from the inter-day returns of the assets (there are thus 2067 observations of daily price changes for the 31 assets). For a matrix of this dimension the theoretical upper and lower bounds for the eigenvalue distribution are 1.26 and 0.77 respectively.

For the correlation matrix (of dimension 31 x 31) formed from the shuffled data we find that all 31 eigenvalues of the correlation matrix fall within the upper and lower bounds. We of course expect that for the shuffled data there should be no information content contained within the time series since the process of shuffling the data destroys any temporal correlations in the data. For the correlation matrix formed from the original data set, however, we observe that there are 17 eigenvalues below the lower bound, 4 eigenvalues above the upper bound and therefore 10 eigenvalues which fall between the upper and lower bounds. The observation of a significant number of eigenvalues outside the RMT bounds for the original, unshuffled, portfolio demonstrates that there do indeed exist genuine, non-random, correlated movements between groups of assets within the portfolio.

We may also examine the stability of these correlations over time. Firstly we choose two non-overlapping time periods of approximately 4 years in duration and calculate the eigenspectra properties of the correlation matrices formed from these two analysis periods. For matrices of these dimensions the theoretical upper and lower eigenvalues are 1.38 and 0.68 respectively. For the two analysis periods it is found that the numbers of eigenvalues below the theoretical minimum are 12 and 15 and above the theoretical maximum are identical (being 4). This indicates that the large scale macrostructure of the portfolio remains unchanged over the course of the 8 year total analysis period.
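The comparison between an original and a shuffled portfolio can be illustrated in a few lines. The sketch below is not the paper's data or code: it assumes NumPy, uses a synthetic one-factor return matrix as a stand-in for the 31 equities, and applies a single random permutation per series rather than the 10000 shuffles described above. All names are hypothetical.

```python
import numpy as np

def count_outside_band(returns, lam_min, lam_max):
    """Count eigenvalues of the correlation matrix (below, inside, above) the band.

    `returns` is a T x N array with one column per asset.
    """
    eig = np.linalg.eigvalsh(np.corrcoef(returns, rowvar=False))
    below = int(np.sum(eig < lam_min))
    above = int(np.sum(eig > lam_max))
    return below, len(eig) - below - above, above

def shuffle_each_series(returns, seed=0):
    """Permute every column independently, keeping each column's distribution."""
    rng = np.random.default_rng(seed)
    return np.column_stack([rng.permutation(col) for col in returns.T])

# Synthetic stand-in (an assumption): one common factor plus idiosyncratic noise.
rng = np.random.default_rng(1)
T, N = 2067, 31
returns = 0.5 * rng.standard_normal((T, 1)) + rng.standard_normal((T, N))

Q = T / N
lam_min, lam_max = (1 - Q ** -0.5) ** 2, (1 + Q ** -0.5) ** 2
shuffled = shuffle_each_series(returns)
print("original:", count_outside_band(returns, lam_min, lam_max))
print("shuffled:", count_outside_band(shuffled, lam_min, lam_max))
```

Because each column is permuted independently, returns that occurred on the same day are no longer aligned across assets, so the shuffled correlation matrix should sit close to the random-matrix band while the structured one does not.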

We can also calculate the overlap matrix between the two periods. This is shown in figure 2. For these two analysis periods the overlap between the eigenvectors corresponding to the largest eigenvalue is 0.99. In addition to this, if we repeat the analysis with 10 non-overlapping periods (each of 200 trading days in duration) we observe that the eigenvectors corresponding to the largest eigenvalue yield an average degree of overlap of 0.95. These numbers represent a significant degree of temporal stability of the eigenvector structure.
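A minimal sketch of the overlap calculation of Section 2.3 follows, assuming NumPy and a synthetic one-factor return matrix in place of the real data. Two choices here are assumptions rather than details given in the text: eigenvectors are ordered with the largest eigenvalue first, and the absolute value of each scalar product is taken because the sign of an eigenvector is arbitrary.

```python
import numpy as np

def eigenvectors_desc(returns):
    """Unit eigenvectors of the correlation matrix as columns, largest eigenvalue first."""
    _, vecs = np.linalg.eigh(np.corrcoef(returns, rowvar=False))
    return vecs[:, ::-1]

def overlap_matrix(returns_a, returns_b):
    """O[i, j] = |v^i(T_A) . v^j(T_B)| for two T x N return matrices."""
    va = eigenvectors_desc(returns_a)
    vb = eigenvectors_desc(returns_b)
    return np.abs(va.T @ vb)

# Synthetic stand-in (an assumption): one common factor plus noise.
rng = np.random.default_rng(4)
T, N = 2067, 31
returns = 0.5 * rng.standard_normal((T, 1)) + rng.standard_normal((T, N))

half = T // 2
O = overlap_matrix(returns[:half], returns[half:])
print(O[0, 0])  # overlap of the largest-eigenvalue eigenvectors in the two halves
```

With this ordering the market eigenmode is element `O[0, 0]`; a plot that places it in a different corner only reflects a different ordering of the eigenvectors.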

As can be seen. the dot product of eigenvector 1 with itself in each of the two periods . A white square corresponds to perfect overlap between the structure of the 2 eigenvectors (perfect stability of the degree of information content in that eigenmode) and black corresponds to no degree of overlap whatsoever.bottom right hand corner) is significantly different from that of any of the other overlaps.e. the degree of stability of the market eigenmode (i.Figure 2 : Colour coded plot of the degree of overlap of the eigenvectors corresponding to 2 non-overlapping analysis periods for the US blue chip portfolio. .

3.3 Analysis of the 'Market' Eigenmode

In terms of those eigenvalues which lie outside the noisy sub-space band the most important is the largest eigenvalue. In particular, for the US blue chip portfolio of 31 assets, the maximum eigenvalue of the correlation matrix is 7.05 (the remainder of the eigenvalues are in the range 2.15 to 0.34). The theoretical maximum eigenvalue is 1.26 so it is clear that the largest empirically observed eigenvalue is significantly above this threshold.

Analysis of the eigenvector corresponding to the largest eigenvalue demonstrates that each of the 31 components of the eigenvector contribute approximately an equal amount to the eigenvector. Indeed, the IPR for this eigenvector is 0.037. This is to be compared with the value of 0.032 (1/N) that we would expect if all of the assets contributed equally to the eigenvector. This indicates that this eigenmode is 'extended'. Hence the behaviour of this eigenmode is indicative of large-scale correlated movements of all of the assets within the portfolio. The application of RMT techniques to equities traded in financial markets have demonstrated that this eigenmode corresponds to the 'market' (e.g. Gopikrishnan et al, 2000).

In order to quantify this overall collective motion of the portfolio's asset price dynamics we may exploit the fact that the trace of the correlation matrix is preserved. That is, for this data set (2067 observations of daily returns for 31 assets), the trace of the correlation matrix is equal to 31 (since there are 31 independent time series). The closer the 'market' eigenmode (i.e. the maximum eigenvalue) is to this value the more information is contained within this mode and the more correlated the movements of the price changes of the assets within the portfolio are. We may therefore quantify the fraction of total information contained within this eigenmode, the information index Q, expressed as a percentage by the following formula

$$Q(t) = 100 \, \frac{\lambda_{max}}{N}$$

If the assets in the portfolio move together very closely, then we would expect $Q(t) \to 100\%$. Conversely, if the asset price movements are completely uncorrelated then we would expect $Q(t) \to 0\%$ (corresponding to no collective dynamics).

3.4 Temporal Evolution of the Market Eigenmode

We have seen that the eigenmode of the empirical correlation matrix corresponding to the maximum eigenvalue represents a collective motion of all of the assets within the portfolio. What is of interest is to determine how this eigenmode evolves temporally.

The analysis is undertaken with a fixed window of data. Within this window, the spectral properties of the correlation matrix formed from the constituent elements of the US blue chip portfolio are calculated. In particular, the maximum eigenvalue is calculated. This window is then advanced by one period (corresponding, in this data set, to one trading day) and the maximum eigenvalue noted for each period. As previously the correlation matrix is formed from the returns on the assets. A window of 250 periods, which corresponds to approximately one year in terms of elapsed time, was chosen for the analysis. The same procedure is followed for the Dow Jones Industrial Index (DJIA) itself.

Plotted in figures 3a and 3b respectively are the absolute values of the logarithmic differences of DJIA and the information index. The absolute value of the logarithmic differences represents a proxy for the volatility of the time series (Ponzi, 2000) (with a window of 250 trading days). Figure 3a (for the DJIA) demonstrates that there exist periods of 'bursts' of volatility interspersed by periods of low volatility (so-called volatility clustering characteristic of the presence of long-range temporal correlations in the volatility). In particular it is apparent that bursts of extreme volatility in the DJIA are reflected in similar bursts in the information index. Inspection of the charts suggests that the two measures exhibit a significant degree of correlation.
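The windowed calculation of the information index and the volatility proxy can be sketched as below. This is an illustration under stated assumptions: NumPy, a small synthetic return matrix instead of the 31 equities, and hypothetical function names. Only the window length of 250 and the definitions of Q and of the absolute logarithmic difference come from the text.

```python
import numpy as np

def information_index(returns, window=250):
    """Q(t) = 100 * lambda_max / N for each window of consecutive rows.

    `returns` is a T x N array; the result has T - window + 1 entries.
    """
    T, N = returns.shape
    q = np.empty(T - window + 1)
    for start in range(T - window + 1):
        C = np.corrcoef(returns[start:start + window], rowvar=False)
        q[start] = 100.0 * np.linalg.eigvalsh(C)[-1] / N   # largest eigenvalue is last
    return q

def volatility_proxy(series):
    """Absolute value of the logarithmic differences of a positive series."""
    return np.abs(np.diff(np.log(series)))

# Synthetic stand-in (an assumption): 600 days, 10 assets, one common factor.
rng = np.random.default_rng(2)
T, N = 600, 10
returns = 0.5 * rng.standard_normal((T, 1)) + rng.standard_normal((T, N))

q = information_index(returns, window=250)
vol_q = volatility_proxy(q)
print(len(q), q.min(), q.max())
```

With 2067 daily returns and a 250-day window the same loop would produce 2067 - 250 + 1 = 1818 values of Q, which appears consistent with the N = 1818 quoted for the windowed data set.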


Figure 3a : Plot of volatility of the DJIA for the period 4th January 1998 to 13th March 2001 (vertical axis: Volatility of DJIA; horizontal axis: date)

Figure 3b : Plot of volatility of the information index for the period 4th January 1998 to 13th March 2001 (vertical axis: Volatility of Information Index; horizontal axis: date)


A scatter plot of the two variables, set out in Figure 4, does suggest a positive relationship between them.

Figure 4 : Scatter plot demonstrating the relationship between the volatility of the returns on the DJIA with the volatility of the returns on the information index (axes: Volatility of DJIA; Volatility of Information Index)

Figure 4 shows that the overwhelming bulk of the data is concentrated at low values of the variables. Using the full data set from the windowing, we have N = 1818 trading days. The simple correlation coefficient, ρ, is in fact 0.462, highly statistically significantly different from zero. The significant positive correlation persists even when the large potential outliers are trimmed from the data set. For example, using only those observations where the absolute value of the maximum eigenvalue is < 0.03 gives N = 1789 and ρ = 0.371, but even choosing only those observations where the absolute value of the maximum eigenvalue is < 0.01 gives ρ = 0.283, with N = 1610.
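The correlation with and without trimming of large values can be computed as in the following sketch. It assumes NumPy and uses synthetic volatility series, so the printed numbers are not the paper's results; the function name and the way the threshold is applied to the information-index series are assumptions made for illustration.

```python
import numpy as np

def trimmed_correlation(vol_market, vol_index, threshold=None):
    """Pearson correlation, optionally keeping only pairs with |vol_index| < threshold.

    Returns the number of retained observations and the correlation coefficient.
    """
    vol_market = np.asarray(vol_market, dtype=float)
    vol_index = np.asarray(vol_index, dtype=float)
    if threshold is None:
        keep = np.ones(len(vol_index), dtype=bool)
    else:
        keep = np.abs(vol_index) < threshold
    rho = np.corrcoef(vol_market[keep], vol_index[keep])[0, 1]
    return int(keep.sum()), float(rho)

# Synthetic stand-in series (an assumption), 1818 observations each.
rng = np.random.default_rng(3)
vol_index = 0.01 * np.abs(rng.standard_normal(1818))
vol_djia = 0.5 * vol_index + 0.005 * np.abs(rng.standard_normal(1818))

for threshold in (None, 0.03, 0.01):
    print(threshold, trimmed_correlation(vol_djia, vol_index, threshold))
```

Tightening the threshold removes the largest observations first, which is how the robustness of the correlation to outliers is examined above.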

These results suggest that the volatility of the Dow Jones Industrial Average is positively correlated with the volatility of the degree of information in the eigenvector associated with the 'market' eigenvalue.

We then examined the possibility that the volatility of the degree of information in the market eigenvector (the degree to which the constituent stocks move together) might have some predictive power as far as the volatility of the overall index is concerned. We carried out classical least squares regression of the volatility of the Dow Jones index on lagged values of the volatility of the maximum eigenvalue. Empirically only the first lagged value was statistically significant. The estimated coefficient was 0.0965 with a standard error of 0.0213, so the coefficient is significantly different from zero at p < 0.0001.

We examined this relationship using the general non-linear least squares technique of local regression (available in the program S-Plus, for example). This technique fits a curve to the data points locally, so that any point on the curve at that point depends only on the observations at that point and some specified neighbouring points. For any given data point, x(t) say, we choose the k nearest neighbours of x(t), which constitute a neighbourhood N(x(t)). The number of neighbours k is specified as a percentage of the total available number of data points. This percentage is called the span. By choosing a sufficiently large value for the span, all the points in the data set are in the neighbourhood of every single point. In other words, in the limit the local regression technique is identical to that of classical least squares. This enables us to carry out standard analysis of variance on the results for different choices of the span. In this case, a value of 0.8 represents the best choice of the span.
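The classical least squares regression on the one-day lag described above can be sketched as follows. The code assumes NumPy, synthetic series and an intercept term (the text does not say whether a constant was included), and the function name is hypothetical. It does not reproduce the local regression fit, which the text attributes to S-Plus.

```python
import numpy as np

def lagged_ols(y, x, lag=1):
    """Regress y(t) on a constant and x(t - lag).

    Returns the slope, its standard error and the ratio of the two.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    y_t, x_lag = y[lag:], x[:-lag]                    # align y(t) with x(t - lag)
    X = np.column_stack([np.ones_like(x_lag), x_lag])
    beta, *_ = np.linalg.lstsq(X, y_t, rcond=None)
    resid = y_t - X @ beta
    dof = len(y_t) - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    se = float(np.sqrt(cov[1, 1]))
    return float(beta[1]), se, float(beta[1] / se)

# Synthetic stand-in series (an assumption): the market volatility depends on
# yesterday's information-index volatility with a true slope of 0.1.
rng = np.random.default_rng(5)
vol_index = 0.01 * np.abs(rng.standard_normal(1818))
vol_djia = np.empty(1818)
vol_djia[0] = 0.005
vol_djia[1:] = 0.004 + 0.1 * vol_index[:-1] + 0.003 * np.abs(rng.standard_normal(1817))

slope, se, t_ratio = lagged_ols(vol_djia, vol_index, lag=1)
print(slope, se, t_ratio)
```

For comparison, the coefficient and standard error reported in the text, 0.0965 and 0.0213, correspond to a ratio of about 4.5.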

The reduction in the residual sum of squares compared to that obtained with classical least squares is significantly different from zero at p = 0.00012. However, the degree of non-linearity is not strong. The equivalent number of parameters in the local regression model with span = 0.8 is only 2.5, indicating that the local regression model is somewhere between linear and a quadratic one in complexity [ref S-Plus Modern Statistics and Advanced Graphics, Guide to Statistics vol. 1, Mathsoft, Seattle, 2000].

4. Conclusions

The correlation matrix of returns is of fundamental importance to much modern portfolio analysis. However, recent literature in the physics journals using the technique of random matrix theory has shown that such empirical correlation matrices contain substantial amounts of noise rather than true information. The results presented here confirm these findings with a data set of daily returns on US blue chip stocks over the 1993-2001 period.

However, the correlation matrix does contain a certain amount of true information. In particular, the eigenvector associated with the principal eigenvalue of the correlation matrix enables us to identify the extent to which the individual stocks are genuinely moving together over time. We use the term 'market eigenmode' to characterise this eigenvalue and vector. We demonstrate that the market eigenmode is stable over time. We define the information index to be the fraction of total information contained within this eigenmode.

We analyse the temporal movements of variability of the information index and of the variability of the index formed from the component stocks and find a clear positive correlation between their absolute values. Further, the variability of the information index lagged one day has statistically significant power in accounting for movements in the current variability of the index.

5. References

J.-P. Bouchaud and M. Potters, Theory of Financial Risks: From Statistical Physics to Risk Management, Cambridge University Press (2000)
S. Drozdz, J. Kwapien, F. Grummer, F. Ruf and J. Speth, Quantifying the Dynamics of Financial Correlations, cond-mat/0102402 (2001)
E.J. Elton and M.J. Gruber, Modern Portfolio Theory and Investment Analysis, J. Wiley and Sons, New York (1995)
P. Gopikrishnan, B. Rosenow, V. Plerou and H.E. Stanley, Identifying Business Sectors from Stock Price Fluctuations, cond-mat/0011145 (2000)
L. Laloux, P. Cizeau, J.-P. Bouchaud and M. Potters, Noise Dressing of Financial Correlation Matrices, Phys Rev Lett 83, 1467 (1999)
R.N. Mantegna and H.E. Stanley, An Introduction to Econophysics, Cambridge University Press (2000)
M. Mehta, Random Matrices, Academic Press (1991)
P. Ormerod and C. Mounfield, Random Matrix Theory and the Failure of Macroeconomic Forecasts, Physica A 280, 497 (2000)
V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral and H.E. Stanley, Universal and Non-universal Properties of Cross-correlations in Financial Time Series, Phys Rev Lett 83, 1471 (1999)
V. Plerou, P. Gopikrishnan, B. Rosenow, L.A.N. Amaral and H.E. Stanley, A Random Matrix Theory Approach to Financial Cross-Correlations, Physica A 287, 374 (2000)
A. Ponzi, The Volatility in a Multi-share Financial Market Model, cond-mat/0012309 (2000), to appear in European Physical Journal A
A.M. Sengupta and P.P. Mitra, Phys Rev E 60, 3389 (1999)