Alex Kane Graduate School of International Relations and Pacific Studies (IR/PS) University of California, San Diego 9500 Gilman Drive, La Jolla, CA 92093-0519 (Phone) 858-534-5969 (Fax) 858-534-3939 akane@.ucsd.edu Tae-Hwan Kim School of Economics, University of Nottingham University Park, Nottingham NG7 2RD, UK (Phone) 44-115-951-5466 (Fax) 44-115-951-4159 Tae-Hwan.Kim@nottingham.ac.uk Halbert White Department of Economics, University of California, San Diego 9500 Gilman Drive, La Jolla, CA 92093-0508 (Phone) 858-534-3502 (Fax) 858-534-7040 firstname.lastname@example.org December, 2003
Abstract: The performance of active portfolio methods critically depends on the forecasting ability of the security analyst. The Treynor-Black model provides an efficient way of implementing active investment strategy. Despite its potential benefits, the Treynor-Black model appears to have had little impact on the financial community, mainly because it has been believed that the precision threshold of alpha forecasts used as inputs to the model is too high. We seek to lower the threshold of forecast precision needed to beat a passive-portfolio strategy by improving the econometric methods used in constructing portfolios from security analyst forecasts. We apply shrinkage estimation to beta coefficients and to discount functions for forecasts of stock abnormal returns. OLS estimates, Least Absolute Deviations (LAD) estimates and shrinkage LAD estimates are compared by contribution to portfolio performance. Despite correlations between forecasts and realizations of abnormal returns as low as 0.04, the shrinkage LAD methodology yields superior performance in out-of-sample experiments. Key Words: Treynor-Black Model, Abnormal Returns, Sharpe Ratio, M2-measure, Least Absolute Deviations Estimator, Shrinkage LAD estimator.
We would like to thank Clive Granger, Patrick Fitzsimmons, James Hamilton, Bruce Lehmann, Robert Trippi and Allan Timmermann for their helpful comments.
The presumption of market efficiency is inconsistent with the existence of a vast industry engaged in active portfolio management. Grossman and Stiglitz (1980) derive an information-inefficient capital market equilibrium based on the cost of information and the fact that portfolio managers cannot observe the asset allocations of competitors. Treynor and Black (1973) propose a model to construct an optimal portfolio under such conditions, when security analysts forecast abnormal returns on a limited number of securities. The optimal portfolio is achieved by mixing a benchmark portfolio with an active portfolio constructed from the securities covered by the analysts. The original model assumes that residuals from the market model are uncorrelated across stocks (the diagonal version), but it can easily be extended to account for non-zero covariance across residuals (the covariance version). The efficiency of the Treynor Black (TB) model depends critically on the ability to predict abnormal returns. Its implementation requires that security analyst forecasts be subjected to statistical analysis and that the properties of the forecasts be explicitly used when new forecasts are input to the optimization process. It follows that security analysts must submit quantifiable forecasts and that they will be exposed to continuous, rigorous tests of their individual performance. The entire portfolio is also continuously subjected to performance evaluation that may engender greater exposure of managers to outside pressures. The TB model appears to have had little impact despite encouraging reports; e.g., Hodges and Brealey (1973), Ambachtsheer (1974, 1977), Ferguson (1975), Ambachtsheer and Farrell (1979), Dunn and Theisen (1983), Ippolito (1989), Goetzman and Ibbotson (1991), Kane et al (1999), to mention a few of the listed references. Although theoretically compelling, the model has not been widely adopted by investment managers. 
We suspect that portfolio managers and security analysts are reluctant to subject their analysis to rigorous tests. This attitude may owe in no small measure to the belief of many renowned scholars that the forecasting ability of most analysts is below the threshold needed to make the model useful. This paper aims to identify this threshold, and to lower it by using effective statistical methods. Optimal portfolios are constructed with actual forecasts of abnormal returns and beta coefficients obtained from a financial institution. We apply various ways of shrinking a robust estimator toward a data-dependent point to identify and utilize predictive power. Since distributions of abnormal returns are fat-tailed, we choose the Least Absolute Deviations (LAD) estimator as a benchmark. The quality of estimates of stock betas affects the accuracy of the estimates of realized abnormal returns, needed to
measure the bias and precision of the forecasts. We also use Dimson’s (1979) aggregate coefficients method to account for infrequent trading.

The analyst forecasts in our database show correlations between forecasts and realizations of abnormal returns on the order of 0.04, and predictive ability generally declines over the sample period, perhaps due to changing market conditions. The forecasts are biased and forecast errors are asymmetric and correlated across stocks. Nevertheless, the application of shrinkage LAD estimation to the TB model results in superior performance in both the diagonal and covariance versions. Portfolios based on OLS estimates are dominated by those that use LAD and shrinkage LAD estimators.

The paper is organized as follows. Section 1 presents the TB framework and Section 2 describes the forecast data and sampling procedure. Section 3 elaborates on the estimation of beta coefficients and abnormal returns from realized stock returns, and Section 4 treats the calibration of forecasts from the history of forecasting records. Section 5 lays out the out-of-sample test procedures and Section 6 reports on portfolio performance. Section 7 provides a summary and conclusions.

1. The Treynor-Black Framework

To fix ideas and introduce notation, we briefly describe the model.¹ Treynor and Black (1973) deal with a scenario in which the mean-variance criterion (the Sharpe ratio) is used by investors, and the security analysts of a portfolio management firm cover a limited number of securities. Under these conditions, securities that are not analyzed are assumed to be efficiently priced, a specified market index is taken as the default efficient (passive) strategy, and a portfolio of only the covered securities cannot be efficient. The optimal portfolio must be a mix of the covered securities and the index portfolio. TB identify the portfolio of only the covered securities (the efficient Active Portfolio, A) that can be mixed with the index (Passive Portfolio, M) to obtain the optimal risky portfolio, P.² With a risk-free asset (or a zero beta portfolio) whose rate of return is denoted $r_f$, the weights of A and M in the optimal risky portfolio, P, which maximizes the Sharpe ratio ($S_P = R_P/\sigma_P$), are given by

$$w_A = \frac{R_A\sigma_M^2 - R_M\,\mathrm{Cov}(R_A, R_M)}{R_A\sigma_M^2 + R_M\sigma_A^2 - (R_A + R_M)\,\mathrm{Cov}(R_A, R_M)}, \qquad w_M = 1 - w_A, \tag{1}$$

where $R$ is the expected excess return, $E(r) - r_f$.

¹ A more elaborate description can be found in Bodie et al (2001).
² The existence of this portfolio can be inferred from Merton (1972) or Roll (1977).

Assuming the diagonal version of the market model we have,
$$R_i = \alpha_i + \beta_i R_M + e_i, \qquad E(e_i) = \mathrm{Cov}(e_i, e_j) = 0, \quad \mathrm{Var}(e_i) = \sigma_i^2, \quad \forall\, i \neq j, \tag{2}$$

where $\alpha_i$ is the abnormal return expected by the analyst who covers the ith security and $\sigma_i$ denotes the residual standard deviation of the ith security (or portfolio). We assume that the security analysts cover n securities (i = 1, 2, …, n). Substituting (2) into (1) yields

$$w_A = \frac{w_0}{1 + (1 - \beta_A)\,w_0}, \qquad w_0 = \frac{\alpha_A/\sigma_A^2}{R_M/\sigma_M^2}, \tag{3}$$

where $\alpha_A = \sum_{i=1}^{n} w_i\alpha_i$, $\beta_A = \sum_{i=1}^{n} w_i\beta_i$ and the weight $w_i$ is given below in (5). Note that $w_0$ is the optimal weight in the active portfolio when its beta is average, $\beta_A = 1$. The intuition of (3) is that the larger the systematic risk of the active portfolio, the less effective is diversification with the index, and hence, the larger the weight in A. With $w_A$ from (3), the Sharpe ratio of the risky portfolio, P, is given by

$$S_P^2 = \frac{\left[w_A(\alpha_A + \beta_A R_M) + (1 - w_A)R_M\right]^2}{w_A^2\left(\beta_A^2\sigma_M^2 + \sigma_A^2\right) + (1 - w_A)^2\sigma_M^2 + 2w_A(1 - w_A)\beta_A\sigma_M^2} = S_M^2 + \frac{\alpha_A^2}{\sigma_A^2}, \tag{4}$$

which reveals that the appraisal ratio ($\alpha_A/\sigma_A$) of the active portfolio determines its marginal contribution to the Sharpe ratio of the passive strategy. This appraisal ratio, in turn, is maximized by choosing the weight, $w_i$, on the ith covered security (out of n), to be

$$w_i = \frac{\alpha_i/\sigma_i^2}{\sum_{j=1}^{n}\alpha_j/\sigma_j^2}. \tag{5}$$

The reason $\beta_i$ is absent in (5) is that a correction is made for $\beta_A$ in $w_A$ of (3). Applying this solution to (4) shows that the marginal contribution of an individual security to the risky portfolio’s squared Sharpe measure is equal to its own squared appraisal ratio

$$S_P^2 = S_M^2 + \frac{\alpha_A^2}{\sigma_A^2} = S_M^2 + \sum_{i=1}^{n}\frac{\alpha_i^2}{\sigma_i^2}. \tag{4a}$$

Thus, if forecast quality exceeds some threshold, there are economies to scale in the coverage of securities that help explain large portfolios in the industry. Another important organizational implication is that portfolio management can be organized into three decentralized activities: macro forecasting to obtain $R_M$, micro forecasting to obtain $\alpha_i$, $\beta_i$ and $\sigma_i$, and statistical analysis to obtain $\sigma_M$.³

³ Obviously, except for $\sigma_M$, there is room for improvement by exchanging ideas among the staffs of the various activities.
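The diagonal-version formulas (3), (4), (4a) and (5) can be checked numerically. The following is a minimal sketch in plain Python, not the authors' implementation; the function names and the three-stock inputs are hypothetical and chosen only for illustration.

```python
def treynor_black_diagonal(alpha, beta, resid_sd, r_m, sigma_m):
    """Diagonal-version Treynor-Black weights, following equations (3) and (5).

    alpha, beta, resid_sd: per-security alpha forecasts, betas, residual SDs.
    r_m, sigma_m: expected excess return and SD of the index portfolio.
    """
    # Equation (5): weights proportional to alpha_i / sigma_i^2, summing to one.
    raw = [a / s ** 2 for a, s in zip(alpha, resid_sd)]
    total = sum(raw)
    w = [x / total for x in raw]
    # Alpha, beta and residual variance of the active portfolio A.
    alpha_a = sum(wi * ai for wi, ai in zip(w, alpha))
    beta_a = sum(wi * bi for wi, bi in zip(w, beta))
    var_a = sum((wi * si) ** 2 for wi, si in zip(w, resid_sd))
    # Equation (3): weight of A in the optimal risky portfolio P.
    w0 = (alpha_a / var_a) / (r_m / sigma_m ** 2)
    w_a = w0 / (1.0 + (1.0 - beta_a) * w0)
    return w, alpha_a, beta_a, var_a, w_a


def squared_sharpe(w_a, alpha_a, beta_a, var_a, r_m, sigma_m):
    """Squared Sharpe ratio of P, the ratio on the left of equation (4)."""
    mean = w_a * (alpha_a + beta_a * r_m) + (1.0 - w_a) * r_m
    var = (w_a ** 2 * (beta_a ** 2 * sigma_m ** 2 + var_a)
           + (1.0 - w_a) ** 2 * sigma_m ** 2
           + 2.0 * w_a * (1.0 - w_a) * beta_a * sigma_m ** 2)
    return mean ** 2 / var


# Hypothetical monthly inputs for three covered securities (illustration only).
alpha = [0.010, -0.005, 0.008]
beta = [1.2, 0.9, 1.0]
resid_sd = [0.08, 0.06, 0.10]
r_m, sigma_m = 0.006, 0.04

w, alpha_a, beta_a, var_a, w_a = treynor_black_diagonal(
    alpha, beta, resid_sd, r_m, sigma_m)
lhs = squared_sharpe(w_a, alpha_a, beta_a, var_a, r_m, sigma_m)
# Equation (4a): passive squared Sharpe plus the sum of squared appraisal ratios.
rhs = (r_m / sigma_m) ** 2 + sum((a / s) ** 2 for a, s in zip(alpha, resid_sd))
print(w_a, lhs, rhs)
```

With these inputs the two sides of (4a) agree up to rounding, which illustrates that each security adds its own squared appraisal ratio to the squared Sharpe measure.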
The assumption of the diagonal model is obviously suspect, hence we need the optimal portfolio for the generalized model. The first stage of the optimization with a non-diagonal, residual covariance matrix is unchanged. We only need to redo the maximization of the appraisal ratio of the active portfolio (see the analysis in Merton (1972), and Roll (1977)). Using matrix notation and denoting the covariance matrix of residuals by $\Omega$, the weight, $w_i$, in the active portfolio is given by the ith element of the following n × 1 vector:

$$w_c = \left[\alpha'\Omega^{-1}\iota\right]^{-1}\Omega^{-1}\alpha, \tag{6}$$

where $\alpha$ is the vector of expected abnormal returns and $\iota$ is a vector of ones. When the covariance matrix is diagonal, $w_c$ reduces to $w_i$ in (5). Yet it’s not a priori clear that even if residuals are correlated across securities, the use of the generalized model will be profitable, since we face a trade off between a somewhat flawed model (with assumed non-correlation) and a correct model with estimation errors in the covariance matrix. We will put this tradeoff to the test.

2. The Forecast Database and Sampling Procedures

The forecast data set used in our study has been provided by an investment firm active in the U.S. in the early-mid 1990’s. We shall refer to this firm as “XYZ corp” hereafter. In that period, the firm began extensively using artificial neural network-based statistical analysis to predict abnormal returns. Nevertheless, XYZ did not reveal how it went about portfolio management. They used the S&P500 as a performance benchmark and, consequently, they mostly held large company stocks that traded in relatively large volumes.

The firm (XYZ) graciously provided us with monthly⁴ forecasts of abnormal returns and beta coefficients for all stocks in its database for the period December, 1992-December, 1995, 37 sets of monthly observations in all. These are the forecasts that XYZ used in constructing their portfolios. In December, 1992, XYZ had 711 stocks in its database and additional stocks were added regularly over the sample period, ending with 771 stocks in December, 1995. To simplify the test procedure, we eliminated any stock for which one or more forecasts were missing, leaving 646 stocks for the test databank. As the table below shows, the sample-period years, 1993-1995, were quite representative for the US stock market with one each of average, bad and good years.

⁴ Forecasts were submitted by the last Friday of the month prior to the target month.
Annual Returns (%)      Large Stocks    Small Stocks
1993                         9.87           20.30
1994                         1.29           -3.34
1995                        37.71           33.21
1926-1999 Average           12.50           19.02
SD (%)                      20.39           40.44

XYZ’s monthly forecasts of alpha were constrained to integer-values of percent per month, between −12% and 14% so that any more extreme forecast would have been set to the appropriate limit.⁵ Alpha forecasts were right-skewed and negative on average. Beta forecasts were distributed around one, typical for large stocks. Figure 1 and Figure 2 show histograms of the alpha and beta forecasts for the 646 stocks in the sample over the period, and Table 1 presents summary statistics for their location and dispersion.

⁵ This is quite unusual. Most financial firms produce forecasts in the form of a ranking variable which must be converted to a scale variable, using the Information Correlation Adjustment proposed by Ambachtsheer (1977).

[Figures 1, 2 and Table 1 here]

We confine our study to 105 of the 646 stocks in the database to reduce the computation time needed for LAD estimation.⁶ Accordingly, we chose to work with a subset of 105 randomly selected stocks.⁷ Among many others, Rosenberg et al (1985) and Fama and French (1992) provide empirical evidence that the equity book-to-market ratio (BE/ME) and market capitalization (SIZE) help explain average stock returns.⁸ To account for this effect, we seek to preserve the distribution of BE/ME and SIZE in our sub-sample. Annual data for these variables were obtained from Standard and Poor’s COMPUSTAT tape. Using 7 categories of BE/ME and SIZE, respectively, we allocate the 646 stocks into 49 groups of similar BE/ME and SIZE, with about 13 stocks in each group. We randomly draw as close as possible to an equal fraction of (2 or 3) stocks from each group, arriving at a random sample of 105 stocks that reflects the databank (population) distribution of the category variables. We use this subsample in the subsequent sections. Figure 3 and Figure 4 show the population and sample histograms for SIZE (Market Value) and BE/ME (Book/Market Value).

⁶ While LAD estimators can be obtained by simplex linear programming proposed by Charnes and Lemke (1954), this method is not efficient as the parameter space grows along with the number of observations and requires a long search time. Barrodale and Roberts (1974) proposed a modified version of the simplex algorithm (BR-L1) which is more efficient and greatly reduces computation time. Nevertheless, computation time for 646 stocks with BR-L1 is excessive.
⁷ As shown in section 1, reducing the number of covered stocks significantly lowers the contribution of the active portfolio and hence does not diminish the force of our positive test results.
⁸ Fama and French argue that these variables are proxies for the part of risk premiums not captured by market beta.
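The group-wise draw described above can be sketched as follows. This is only an illustration under stated assumptions: the group labels are hypothetical (assigned mechanically rather than from BE/ME and SIZE categories), and the largest-remainder rule used to reach exactly 105 stocks is one simple way to draw "as close as possible to an equal fraction" from each group, not necessarily the procedure the authors used.

```python
import random
from collections import defaultdict


def stratified_sample(groups, n_total, seed=0):
    """Draw n_total ids so that each group contributes a near-equal fraction.

    groups: dict mapping a stock id to its group label.
    """
    rng = random.Random(seed)
    members = defaultdict(list)
    for stock, label in groups.items():
        members[label].append(stock)
    population = len(groups)
    # Ideal (fractional) number of draws per group, proportional to group size.
    ideal = {g: n_total * len(m) / population for g, m in members.items()}
    quota = {g: int(ideal[g]) for g in members}
    # Give the remaining draws to the groups with the largest remainders.
    shortfall = n_total - sum(quota.values())
    by_remainder = sorted(members, key=lambda g: ideal[g] - quota[g], reverse=True)
    for g in by_remainder[:shortfall]:
        quota[g] += 1
    sample = []
    for g in sorted(members):
        sample.extend(rng.sample(sorted(members[g]), quota[g]))
    return sample


# Hypothetical population: 646 stock ids spread over a 7 x 7 grid of groups.
groups = {i: (i % 7, (i // 7) % 7) for i in range(646)}
sample = stratified_sample(groups, 105)
print(len(sample))
```

Each of the 49 hypothetical groups holds 13 or 14 stocks, so every group contributes 2 or 3 stocks to the sample of 105.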
[Figures 3 and 4 here]

3. Estimation of Beta Coefficients and Realized Abnormal Returns

Forecasting accuracy derived from records of past forecasts is a critical input when using security analysis to optimize portfolios. To determine the accuracy of abnormal return forecasts we need a time series of realized abnormal returns.⁹ We estimate them from realized returns, the market-model equation and estimated beta coefficients.

⁹ The term ‘realized’ for ex-post abnormal returns is somewhat misleading, since they are unobservable.

3.1 Beta Coefficients

We use XYZ’s beta forecasts to compute realized abnormal returns. Nevertheless, if we confine ourselves to XYZ’s beta forecasts, we run the risk that tests of the quality of forecasts of abnormal returns will depend too greatly on the quality of these beta forecasts. To avoid this risk we also use standard estimates of betas from realized daily stock and market-index excess returns. We obtained from DATASTREAM daily returns for the 105 stocks in the sample, the S&P500 index, and 3-month yields for T-bills for the period January 1, 1990 through March 31, 1996 with 1629 observations. For the in-sample estimate of realized abnormal returns in any given month, we estimate beta from daily returns over three years where the tested month is in the middle of the period. We estimate betas with the following three alternative procedures.

(i) To account for infrequent trading we use Dimson’s (1979) aggregate coefficient method (AC). Beta estimates are the sum of the contemporaneous plus K lead and lag coefficients from the regression

$$R_{i,t} = a_i + \sum_{k=-K}^{K} b_{i,k}R_{m,t+k} + e_{i,t}, \quad t = 1, \ldots, T; \qquad b_i = \sum_{k=-K}^{K} b_{i,k}. \tag{7}$$

There is no obvious rule for selecting the number of lags and leads (K). An appropriate value for K can be inferred by regressing the market-model residuals (ex-post abnormal return minus alpha forecast from XYZ) on a constant and the market excess returns, and testing for a zero slope coefficient. If the correct number of lags and leads is used, then the slope coefficient from the regression should be zero. We pool the residuals for all stocks over all the months in the sample (approximately 37 × 105 = 3,885 observations). Table 2 shows the estimation results for 0 to 2 lags and
leads. We fail to reject the hypothesis that the slope coefficient is equal to zero only for 0 and 1 lags/leads¹⁰ and it becomes significantly different from zero for K ≥ 2. Among the two candidates 0 and 1, we choose K = 1 because our belief is that when K = 0, the infrequent trading problem may still be present.

¹⁰ This can be explained by XYZ’s concentration in large-company stocks.

[Table 2 here]

(ii) Vasicek (1973) proposes the Bayesian estimate (V):

$$b_V = wb + (1 - w)b^*, \qquad w = \frac{1/v_b^2}{1/v_{b^*}^2 + 1/v_b^2}, \tag{8}$$

where $b$ = estimated market beta, $v_b^2$ = estimate of variance of $b$, $b^*$ = mean of prior distribution of market beta, $v_{b^*}^2$ = variance of prior distribution of market beta. Vasicek suggests the cross-sectional mean and variance for $b^*$ and $v_{b^*}^2$ respectively.

(iii) An alternative approach is a James-Stein shrinkage estimator (JS): $b_{JS} = wb + (1 - w)b^*$, and the weight $w$ is now given by

$$w = \left[1 - \frac{h}{(b - b^*)\,\mathrm{Var}(b)^{-1}(b - b^*)}\right]_+, \tag{9}$$

where $h$ is a choice parameter and $[a]_+ = \max(a, 0)$. Instead of specifying the variance of the prior distribution as in Vasicek’s method, the shrinkage factor, $h$, needs to be specified. Note that $F = (b - b^*)\,\mathrm{Var}(b)^{-1}(b - b^*)$ is an F-statistic with (n-1, 1) degrees of freedom, approximately a $\chi^2$-statistic with one degree of freedom for the implicit hypothesis $H_0$: $b = b^*$, so that $w = [1 - h/F]_+$. When F is large (that is, we are likely to reject the implicit null hypothesis), $w$ is large (close to one), and we do not shrink. Noting that Prob[w ≥ 0] = Prob[F ≥ h], it is easily seen that as h → 0, Prob[w ≥ 0] → 1 while Prob[w ≥ 0] → 0 as h → ∞. We choose h = 0.45 such that Prob[w ≥ 0] = 0.5.

We apply both shrinkage methods to the AC estimates as two alternative estimates. To summarize, we use four alternatives to XYZ’s beta forecasts: OLS, AC, V and JS, where V and JS are Vasicek’s method and the JS method applied to the AC estimates. Figure 5 shows the distribution of beta estimates from the four methods. It is evident that AC shifts the OLS distribution upwards, correcting
for the downward bias due to infrequent trading. V shrinks the tails toward the AC mean, while JS leaves the tails almost unchanged; instead, it shifts the central mass of the distribution toward the mean.

[Figures 5 and 6 here]

3.2 Realized Abnormal Returns

The five methods of estimating betas yield 5 sets of estimates of realized abnormal returns. For each of the five methods, we have a 105 × 37 matrix, $b^{(\cdot)}$, of monthly betas for the 105 stocks. To each element in the matrix we obtain a corresponding element of the matrix of realized abnormal returns from

$$\alpha^{*(\cdot)}_{i,t} = R_{i,t} - b^{(\cdot)}_{i,t}R_{M,t}, \quad i = 1, \ldots, 105, \quad t = 1, \ldots, 37, \quad (\cdot) = \text{(XYZ), (OLS), (AC), (V), (JS)}. \tag{10}$$

A first glimpse at the quality of abnormal return forecasts is shown in Figure 6. The scatter of forecasts and realizations with the V specification (Dimson’s estimates with Vasicek’s correction) shows that the constraint on the range of alpha forecasts [-12, 14] may have been costly, particularly for positive values.

[Figure 6 here]

4. Calibration of Alpha Forecasts

TB point out that we must explicitly account for the quality of forecasts when optimizing the portfolio. This issue is taken up by Admati (1985), Dybvig and Ross (1985) and Kane and Marks (1990). Assume, for example, a simple case where the realization, $\alpha^*$, and a forecast, $\hat\alpha$, are related by $\alpha^* = f(\hat\alpha) + \varepsilon$ where $\varepsilon$ is white noise. Assuming that the function $f$ is linear, e.g., $\alpha^* = a + b\hat\alpha + \eta$, we obtain the unbiased forecast, $\alpha_{UB} = E(\alpha^* \mid \hat\alpha) = \rho^2\hat\alpha$, by discounting the raw forecast using the correlation, $\rho$, between $\alpha^*$ and $\hat\alpha$. We can obtain the appropriate discount function for $\hat\alpha$ by estimating a well specified regression of the forecasting record. To assess overall forecast accuracy with various specifications and beta estimates, we pool all 37 × 105 = 3885 pairs of forecasts and realizations. We use three alternative specifications:

$$\text{Linear:} \quad \alpha^*_L = a_L + b_L\hat\alpha + \eta_L \tag{11a}$$

$$\text{Parabolic:} \quad \alpha^*_P = a_P + b_{1P}\hat\alpha + b_{2P}\hat\alpha^2 + \eta_P \tag{11b}$$
$$\text{Kinked:} \quad \alpha^*_K = a_K + b_K\max[\hat\alpha, 0] + \eta_K \tag{11c}$$

The kinked-linear specification is a ‘no short sales’ alternative that may be required for many institutions; it converts all negative forecasts to zero, effectively taking these stocks out of the active portfolio. Figure 7 shows the fitted lines from the regressions of realizations on abnormal return forecasts with the three alternative specifications. The linear specification reveals a severe correction for quality. At the lowest end, the discount function would convert negative alpha forecasts into positive signals, calling for long positions in these stocks. The graph of the parabolic specification shows that the correction would be more severe at the low end of the forecast range. The kinked line specification is a milder form of such correction.

[Figure 7 here]

Pooling the sample across stocks and time to estimate the forecast discount function would affect the test results in two important ways. First, pooling observations across time entails the use of data that was unknown at the time the forecast was made. For this reason we use pooling across time only to discuss issues of overall forecast quality. When we test performance month-by-month we use only data from past months to estimate the discount function, maintaining the pooling across stocks. However, applying a uniform discount function to all stocks is likely unrealistic and would significantly reduce the potential contribution of the active portfolio. Hence, pooling across stocks will, if anything, increase the power of the performance tests.

Table 3 presents three panels for the regression results of the three specifications, with five beta estimators in each. The standard errors in parentheses in Table 3 are computed from the Heteroscedasticity-Consistent Covariance Matrix Estimator¹¹ (HCCME) proposed by White (1980). With zero P-value for all cases, White’s heteroscedasticity test rejects the homoscedasticity assumption. The estimation results are almost identical for the various beta estimation methods. The slope coefficients show the superior power of the positive alpha forecasts. The adjusted $R^2$ ($\bar R^2$) is highest for the kinked-linear specification, suggesting that the parabolic specification is not flexible enough to handle the downward bias at the low end of the forecast range. The extent of this differential overwhelms this specification.

¹¹ There are several ways of estimating the HCCME. We estimate: HCCME $= (X'X)^{-1}X'\Omega^*X(X'X)^{-1}$, where $\Omega^* = \mathrm{Diag}[e_t^2/(1 - h_t)]$, $e_t$ = regression residual and $h_t$ = tth diagonal element of the hat matrix $X(X'X)^{-1}X'$.

The inadequacy of the parabolic specification is
apparent from the insignificance of the slope coefficients on the squared forecast alpha. To complete this picture, we draw similar conclusions from the intercepts. These are significant in the linear specification and are smallest and insignificant in the kinked-linear specification. Finally, the $\bar R^2$ is quite small, never exceeding 0.00155, indicating that with overall low predictive power we must be careful in making generalizations.

[Table 3 here]

It is interesting to track the time-consistency of the quality of the forecasts. To do so, we regress (11) on the 105 stocks, one month at a time, giving 37 regressions for each specification and beta estimate. Figure 8 shows the plots of the resultant $\bar R^2$ for the V beta estimate only, as the other four plots are very similar. The three plots reveal that, with the exception of the first two months, the quality of the forecasts consistently deteriorated over time. This observation is somewhat surprising since the best year of the sample period was the last, and Table 3 shows that positive forecasts fared best.

[Figure 8 here]

Armed with the various specifications for discounting individual forecasts we now turn to the construction of optimal portfolios and performance evaluations.

5. Out-of-sample Test Procedures

The steps comprising our performance test are as follows:

(i) Estimate the discount function for forecasts of month t from paired observations of forecast and realization of abnormal returns in month t-1. Abnormal returns of month t-1 are computed from the market-model equation using beta coefficients estimated from past realized daily returns.

(ii) Obtain unbiased forecasts for month t by applying the discount function from month t-1 to the forecasts of abnormal returns for month t.

(iii) Obtain macro forecasts of the mean and variance of the index portfolio.

(iv) Estimate the covariance matrix of residuals from past daily returns to use in the diagonal and covariance versions of TB.
(v) Construct the active portfolio using the unbiased forecasts and estimates of the residual variances (covariance matrix) in the diagonal (covariance) version of TB. Construct the optimal risky portfolio from the active and index portfolios.

(vi) Compute the realized return of the optimal portfolio in month t, using stock and market realizations of excess returns in month t.

(vii) Use the realized monthly excess returns of the optimal risky TB portfolio and the market index (t = 2, ..., 37) to evaluate performance.

5.1 Discount Functions for the Monthly Forecasts of Abnormal Returns

The critical role of the discount function motivates an elaborate estimation scheme. For each test month, t, we first estimate the four sets of 105 beta coefficients from realized daily returns over three years preceding month t, $b^{(\cdot)}_{i,t-1}$, i = 1, ..., 105, t = 2, ..., 37, (⋅) = OLS, AC, V, JS. XYZ’s 105 beta forecasts plus the four sets of estimates up to month t-1 are used in the market model equation (2) to compute the realized abnormal returns, $\alpha^{*(\cdot)}_{i,t-1}$, i = 1, ..., 105, t = 2, ..., 37, (⋅) = OLS, XYZ, AC, V, JS. Each of the 3 × 5 = 15 sets of 105 abnormal returns is then paired with the 105 forecasts, $\hat\alpha_{i,t-1}$. For each set of the 15 combinations of (o) = L, P, K and (⋅) = OLS, XYZ, AC, V, JS, we run the following pooled regression (across i) for each t:

$$\alpha^{*(\cdot)}_{i,t-1} = a^{(\cdot)}_{(o),t-1} + b^{(\cdot)}_{(o),t-1}\hat\alpha_{i,t-1} + \eta^{(\cdot)}_{i,t-1}, \quad i = 1, \ldots, 105, \quad t = 2, \ldots, 37, \tag{12}$$

where L, P, K stand for the linear, parabolic and kinked specifications. We use forecasts and realizations available each month. Hence, in the above regression, we have 105 observations for t = 2, 210 observations for t = 3, and so on. Once the coefficients $a^{(\cdot)}_{(o),t-1}$, $b^{(\cdot)}_{(o),t-1}$ are estimated for each test month t, the discounted alpha forecast for the ith security is given by $\hat a^{(\cdot)}_{(o),t-1} + \hat b^{(\cdot)}_{(o),t-1}\hat\alpha_{i,t}$.

We estimate (12) in 5 different ways: (i) OLS, (ii) LAD, (iii) Non-Random Combination LAD (NRLAD), (iv) James-Stein LAD (JSLAD), and (v) Optimal Weighting Scheme LAD (OWLAD).¹² The NRLAD, JSLAD and OWLAD estimators (Kim and White (2001)) are obtained by optimally mixing the OLS and LAD estimators. These estimators have smaller mean squared error than the OLS and LAD estimators and are expected to improve out-of-sample performance of the forecasts. We evaluate performance with these five estimation methods.

¹² Let $b$ be the LAD estimator and $g$ be the OLS estimator. Then, the JSLAD, NRLAD and OWLAD estimators are defined as follows:

$$\text{JSLAD} = \left(1 - c_1/\lVert b - g\rVert^2\right)(b - g) + g$$

$$\text{NRLAD} = (1 - c_2)(b - g) + g$$

$$\text{OWLAD} = \left(1 - \lambda_1 - \lambda_2/\lVert b - g\rVert^2\right)(b - g) + g,$$

where $\lVert b - g\rVert^2 = (b - g)'Q(b - g)$ and $Q$ is a weighting matrix and the combination parameters ($c_1$, $c_2$, $\lambda_1$, $\lambda_2$) are chosen to minimize the asymptotic risk of the corresponding estimator. See Kim and White (2001) for a detailed discussion.

5.2 Residual Variances and Covariances

The optimal weights (5) of the diagonal version of TB require forecasts of residual variances, and those of the covariance version (6) require a forecast of the full covariance matrix of residuals. Ideally, these would be extrapolated from past forecasting errors. However, since we do not have a sufficient number of observations, we estimate the covariance matrix from daily returns over three years ending in the last trading day of month t-1. The five estimates of beta for each stock, obtained from the daily returns of the prior 3-year period, yield five sets of deviations from the market model,

$$e^{(\cdot)}_{i,d} = R_{i,d} - b^{(\cdot)}_{i,m}R_{M,d},$$

where d = day in the 3-year period and m = month in the 3-year period (the subscript for the month for which the estimates are prepared is dropped for clarity). These daily residuals are used to estimate the covariance matrix of daily residuals for the 3-year period prior to month t. The matrix is multiplied by $n_t$, the number of days in month t, to obtain the forecast of the residual covariance matrix in month t.

5.3 Macro Forecasts

It appears from (4a) that macro forecasts of the mean and variance of the index-portfolio are not more important than those of a single security. This appearance led Ferguson (1975) to argue that effort spent on macro forecasting should not exceed that spent on any individual security. But this argument is false. The Sharpe ratio in (4a) is conditional on optimal weights of all securities in the risky portfolio. To the extent that a forecast for a security is of low quality and not properly discounted, (4a) does not apply. The shortfall from the maximum Sharpe ratio will depend on the weight of the security in the portfolio. Since it is likely that the weight on M will be substantially higher than that of any individual security, the quality of macro forecasts will be substantially more important.
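Returning to the residual covariance forecast of Section 5.2, the scaling of a daily covariance matrix to a monthly forecast and its use in the covariance-version weights (6) can be sketched as below. This is a hedged illustration in plain Python rather than the authors' code; the function names and the numerical inputs are hypothetical, and a small Gaussian-elimination solver stands in for a matrix library.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def monthly_covariance(daily_cov, n_days):
    """Scale the daily residual covariance matrix by the days in the month."""
    return [[n_days * v for v in row] for row in daily_cov]


def active_weights_cov(alpha, omega):
    """Equation (6): w_c = Omega^{-1} alpha / (alpha' Omega^{-1} iota)."""
    z = solve(omega, alpha)   # Omega^{-1} alpha
    scale = sum(z)            # equals alpha' Omega^{-1} iota for symmetric Omega
    return [v / scale for v in z]


# Hypothetical inputs: three securities and a 21-day target month.
alpha = [0.010, -0.005, 0.008]
daily_cov = [[3.0e-4, 0.5e-4, 0.0],
             [0.5e-4, 2.0e-4, 0.2e-4],
             [0.0, 0.2e-4, 5.0e-4]]
omega = monthly_covariance(daily_cov, 21)
w_c = active_weights_cov(alpha, omega)
print(w_c, sum(w_c))
```

Because (6) normalizes the weights to sum to one, the factor $n_t$ cancels in $w_c$ itself; it still matters for the residual variance of the active portfolio, $w_c'\Omega w_c$.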
The literature on estimating the market mean is sparse (see Merton (1980)), while the literature on market volatility is quite rich. We use the AR(0)-GARCH(1,1) specification as in Engle et al (1993) to forecast the market excess return and variance as follows:

$$R_{Mt} = E(R_M) + \varepsilon_t, \qquad \sigma^2_{Mt} = w + a\varepsilon^2_{t-1} + b\sigma^2_{M,t-1}, \quad t = 1, \ldots, T. \tag{13}$$

Daily returns and a 3-year rolling estimation window are used in (13) so that T is the final day of the 3-year period window. Once the estimation step for one iteration is completed, then the k-step ahead volatility prediction ($\sigma^2_{M,T+k}$) standing at time T is generated by

$$\sigma^2_{M,T+1} = w + a\varepsilon^2_T + b\sigma^2_{M,T}, \qquad \sigma^2_{M,T+k} = w + (a + b)\sigma^2_{M,T+k-1}, \quad k = 2, \ldots, \tag{14}$$

and the target month variance estimate is $\sigma^2_{M,m} = \sum_{k=1}^{n_m}\sigma^2_{M,T+k-1}$, where $n_m$ is the number of days in the target month. Figure 9 shows the forecasts and realizations of the market index excess returns. The RMSE of the excess-return forecasts is 2.20% per month.

[Figure 9 here]

5.4 Restrictions on Portfolio Weights

Despite the discount of the forecasts, the weights on the active portfolio derived from (3) and (6) turn out to be excessive and volatile. It appears that the size and volatility of the active portfolio weights result from the dynamics of the discounted abnormal-return forecasts. While many portfolio managers may take short positions in individual stocks (shorting against the box, in Wall Street terms), fewer can use index futures to emulate short positions in the index. We therefore restrict the active portfolio weight to the range [0, 1] in order to rule out any unrealistic positions in the active portfolio that generate superior performance.

5.5 Performance Measures

The TB model maximizes the Sharpe measure. However, most users have little intuition about the value of an incremental improvement in the Sharpe measure. Modigliani and Modigliani (1997) propose a transformation of the measure to a rate of return equivalent, which has come to be known as M2. This measure is computed from the Sharpe measure of portfolio P by
14 . index and bills to compute realized daily excess returns on the risky portfolios over the 36 months of the forecast period (January. outperformed the passive strategy.1 The Diagonal Version of the TB Model Table 4 reports performance of portfolios constructed from the diagonal version of TB. the adjusted R-square of this specification (see Table 3) shows that even positive forecasts were of low quality. Portfolio Performance Evaluation 6. This indicates the existence of fat tails in the return distributions and the value of better estimation methods. 1992-December. 1995).M 2 = SPσM – RM P (15) The first term SPσM gives the excess return on a mix of P with the risk free asset that would yield the same risk (SD) as the market index portfolio M. and the power of the TB model. providing strong testimony to the value of even miniscule predictive ability of security analysts. 6. This result may be unique to the unusual nature of the XYZ forecasts that were better in the positive range. The superior performance is of economic significance. We track the monthly optimal weights of the individual stocks and the index in the risky portfolios derived from the various estimators and specifications.13 The daily excess returns of each risky portfolio are paired with the index excess returns to compute S and M2 for each portfolio over the entire period. portfolios derived from these forecasts were significantly superior to the passive strategy. As could be predicted by the poor results of the parabolic specification reported in Table 3 and Figure 7. We use the daily returns on the stocks. M2 provides better intuition than S and has become popular with practitioners. portfolios derived from the parabolic specification performed poorly and were eliminated from Table 4. The major implications from Table 4 are as follows: (i) All portfolios. (iii) The portfolios derived from the kinked specification of the discount functions uniformly outperform the linear specification. 
except those derived from OLS estimation of the forecast-discount functions. and yet. By subtracting the average excess return on M we get the risk adjusted return premium of P over M. hence we include the measure in the performance evaluation reports. We also eliminated the inferior OLS and AC beta estimators. with no short sales. Still. It shows values for the Sharpe and M2 measures for selected estimators and specifications. (ii) Portfolios derived from LAD estimators uniformly outperform portfolios from the OLS specification of the discount functions.
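The performance measures of Section 5.5 are simple to compute once a series of excess returns is in hand. The following is a minimal sketch, not the authors' code: the function names and the sample return series are hypothetical, and no annualization is applied, so the inputs and the resulting M2 share whatever period and units the returns are quoted in.

```python
from statistics import mean, stdev


def sharpe_ratio(excess_returns):
    """Sharpe measure: mean excess return over its sample standard deviation."""
    return mean(excess_returns) / stdev(excess_returns)


def m2_measure(portfolio_excess, market_excess):
    """M2 as in equation (15): S_P * sigma_M minus the average market excess return."""
    s_p = sharpe_ratio(portfolio_excess)
    sigma_m = stdev(market_excess)
    r_m = mean(market_excess)
    return s_p * sigma_m - r_m


# Hypothetical excess returns (percent per period), for illustration only.
market = [1.0, -0.5, 0.8, 0.3, -0.2, 0.6]
portfolio = [1.2, -0.3, 0.9, 0.5, -0.1, 0.8]

print("Sharpe (market):   ", sharpe_ratio(market))
print("Sharpe (portfolio):", sharpe_ratio(portfolio))
print("M2 (portfolio):    ", m2_measure(portfolio, market))
```

By construction the M2 of the index against itself is zero, which makes a convenient sanity check. Applying (15) to the rounded Table 4 entries for the JSLAD portfolio (Sharpe 1.145, index risk 8.980, index return 8.166) gives roughly 2.116, close to the reported 2.114; the gap is consistent with rounding of the Sharpe measure.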
[Table 4 here]

Table 5 presents the risk-return data of the optimal portfolio. The right-hand panel shows that with the [0,1] restriction on the active portfolio weights, the risk of the managed portfolios is only slightly larger than that of the passive strategy. Moreover, the risk of the kinked-specification portfolios is actually slightly lower than that of the index portfolio. This means that superior performance is achieved by the improvement in average returns from the identification of non-zero alpha stocks in the linear specification, and positive-alpha stocks in the 'no short sales' (kinked) specification.

[Table 5 here]

Our sub-sample of 105 stocks out of the 646 stocks in XYZ's databank was chosen randomly. This suggests that the forecasting quality of the stocks that were left out is similar to the 105 stocks we used. In that case, we can easily assess the incremental performance that would be obtained with the diagonal model by expanding the universe of covered securities from 105 to 646. Table 4 shows the value of the Sharpe measure of the JSLAD portfolio as 1.145, compared with 0.909 for the S&P500 index, resulting in M2 = 2.114%. Using (3) we obtain the contribution of the active portfolio to the squared Sharpe measure as $1.145^2 - 0.909^2 = 0.485$. This contribution would be expected to grow to $0.485 \times 646 \div 105 = 2.982$, resulting in a Sharpe measure of $(2.982 + 0.909^2)^{1/2} = 1.951$, and $M^2 = 1.951 \times 8.980 - 8.166 = 9.354\%$. Obviously, the complexity of managing an organization 6 times as large with the same production quality, and the difficulty in various statistical analyses, can substantially cut down the potential gain. At the same time, there can be a distinct diversification-like advantage in estimation procedures with a larger universe of stocks.

6.2 The Covariance Version of the TB Model

In using the covariance version we face a trade-off between a theoretically-advised improvement and (low) estimation precision of the residual covariance matrix. Table 6 presents the performance measures of the portfolios derived from the covariance version. Most portfolios (20 out of 30), and all 'no short sales' (kinked) specification portfolios, show improved performance over the portfolios from the diagonal version. Here, too, the LAD-estimator portfolios perform best. This is another indication that improved estimation will further increase the effectiveness of the model and the contribution of security analysts. Table 7 shows that risk of the managed portfolios was increased relative to portfolios from the diagonal version in most cases. This should be expected since the full covariance model provides better utilization of forecasts and hence, larger positions in the active portfolio at the expense of diversification.

[Tables 7, 8 here]

Footnote 13. The first month is lost since we have no estimate of the discount function for this month.

7. Summary and Conclusions

The objective of this paper is to identify and reduce the threshold of profitable forecasting ability for portfolio management. We suspect that the low precision of the forecasts of security analysts contributes to the dearth of portfolio managers that efficiently use the security analysis afforded by the Treynor-Black model.

We experiment with a database of monthly forecasts of abnormal returns for 105 stocks over 37 months, provided by an investment firm that actually used these forecasts to construct its portfolios. Using a database of realized returns on these stocks and the market index, we estimate forecast discount functions and apply them to the forecasts prior to the construction of the risky portfolios. In the process, we use the Dimson method to account for infrequent trading, and shrinkage Bayesian estimators for beta coefficients. Various specifications of the discount functions are used to account for the quality of the forecasts, while imposing organization-driven restrictions on the weights of the active portfolio. The discount functions are estimated with LAD estimators to account for fat tails. These methods significantly improve the performance of the risky portfolio.

We find that the threshold of profitable forecasts of abnormal returns is extremely low, that is, security analysis that results in a correlation between forecasts and realizations of abnormal returns as low as 0.04 can still be profitable in an organization that covers more than 100 stocks. Nevertheless, this requires that econometric methods are utilized. We show that the key to profitability of low-precision forecasts is the use of sophisticated econometric methods. Using OLS to estimate the market model and the forecast-discount functions will not do. Shrinkage estimators for beta and LAD estimates of the discount functions can endow a portfolio derived from low-quality forecasts with profitability.

Our experiment was performed under adverse conditions. The forecast records were short, requiring that we assign equal quality to all forecasts. Macro forecasting was not utilized and the substitute extrapolation techniques were not as powerful as they could be. Finally, the estimates of residual variances in the diagonal version and the covariance matrix in the covariance version of the TB model can be improved. All this indicates that the threshold to profitability of forecast precision can be further lowered.

These findings lend a real meaning to the concept of nearly-efficient capital markets. If XYZ was representative of competitive investment firms, this suggests that competition leads to a degree of information efficiency that reduces the forecast precision of super-marginal firms to a level as low as what we observe in this experiment.
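The back-of-envelope expansion from 105 to 646 covered stocks discussed in Section 6 can be reproduced in a few lines. This is a sketch under the text's own assumption that the active portfolio's contribution to the squared Sharpe measure scales in proportion to the number of covered stocks; the function names are hypothetical. Because the code carries unrounded intermediate values, its results differ from the figures quoted in the text (1.951 and 9.354%) in the third decimal place.

```python
import math


def scaled_sharpe(s_portfolio, s_market, n_now, n_target):
    """Sharpe measure after scaling the active contribution to S^2 by n_target / n_now."""
    active_sq = s_portfolio ** 2 - s_market ** 2      # contribution of the active portfolio
    scaled_active_sq = active_sq * n_target / n_now   # proportional-scaling assumption
    return math.sqrt(scaled_active_sq + s_market ** 2)


def m2(sharpe, sigma_market, mean_market):
    """M2 = S_P * sigma_M minus the average market excess return."""
    return sharpe * sigma_market - mean_market


# Inputs taken from the text: JSLAD Sharpe 1.145, S&P500 Sharpe 0.909,
# S&P500 risk 8.980 and return 8.166.
s_big = scaled_sharpe(1.145, 0.909, n_now=105, n_target=646)
print("Scaled Sharpe measure:", round(s_big, 4))
print("Scaled M2 (percent):  ", round(m2(s_big, 8.980, 8.166), 4))
```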
References

Admati, A. R. (1985). A Noisy Rational Expectations Equilibrium for Multi-asset Securities Markets. Econometrica, 53, 629-657.
Ambachtsheer, K. (1974). Profit Potential in an Almost Efficient Market. Journal of Portfolio Management, Fall, 84-87.
Ambachtsheer, K. (1977). Where Are the Customers' Alphas? Journal of Portfolio Management, Fall, 53-56.
Ambachtsheer, K. P. and Farrell, J. L., Jr. (1979). Can Active Management Add Value? Financial Analysts Journal, November-December, 39-47.
Barrodale, I. and Roberts, F. D. K. (1974). Algorithm 478: Solution of an Over-determined System of Equations in the L1 Norm. Communications of the Association for Computing Machinery, 17, 319-320.
Bodie, Z., Kane, A. and Marcus, A. (2001). Investments, fifth edition. McGraw-Hill.
Charnes, A. and Lemke, C. E. (1954). Computational Theory of Linear Programming: The Bounded Variables Problem. Graduate School of Industrial Administration, Carnegie Institute of Technology, Pittsburgh, Pennsylvania.
Dimson, E. (1979). Risk Measurement When Shares Are Subject to Infrequent Trading. Journal of Financial Economics, 7, 197-226.
Dunn, P. and Theisen, R. (1983). How Consistently Do Active Managers Win? Journal of Portfolio Management, 9, 47-50.
Dybvig, P. H. and Ross, S. A. (1985). The Analytics of Performance Measurement Using a Security Market Line. Journal of Finance, 40, 401-416.
Engle, R. F., Kane, A. and Noh, J. (1993). Index-Option Pricing with Stochastic Volatility and the Value of Accurate Variance Forecasts. NBER Working Paper, No. 4519.
Fama, E. F. and French, K. R. (1992). The Cross-Section of Expected Stock Returns. Journal of Finance, XLVII, 427-465.
Ferguson, R. (1975). Active Portfolio Management: How to Beat the Index Funds. Financial Analysts Journal, May-June, 63-72.
Goetzmann, W. and Ibbotson, R. (1991). Do Winners Repeat? Patterns in Mutual Fund Behavior. Working Paper, Yale School of Organization and Management.
Grossman, S. and Stiglitz, J. (1980). On the Impossibility of Informationally Efficient Markets. American Economic Review, 70, 393-408.
Hodges, S. D. and Brealey, R. A. (1973). Portfolio Selection in a Dynamic and Uncertain World. Financial Analysts Journal, March-April, 50-65.
Ippolito, R. A. (1989). Efficiency with Costly Information: A Study of Mutual Fund Performance 1965-1984. Quarterly Journal of Economics, 104, 1-23.
Kane, A. and Marks, S. (1990). The Delivery of Market Timing Services: Newsletters versus Market Timing Funds. Journal of Financial Intermediation, 1, 150-166.
Kane, A., Marcus, A. and Trippi, R. (1999). The Valuation of Security Analysis. Journal of Portfolio Management, Spring, 25-36.
Kim, T. and White, H. (2001). James-Stein Estimators in Large Samples with Application to the Least Absolute Deviations Estimator. Journal of the American Statistical Association, 96, 697-705.
Merton, R. C. (1972). An Analytical Derivation of the Efficient Portfolio Frontier. Journal of Financial and Quantitative Analysis, 7, 1851-1872.
Merton, R. C. (1980). On Estimating the Expected Return on the Market: An Exploratory Investigation. NBER Working Paper Series, No. 444.
Modigliani, F. and Modigliani, L. (1997). Risk-Adjusted Performance. Journal of Portfolio Management, Winter, 45-54.
Roll, R. (1977). A Critique of the Asset Pricing Theory's Tests: Part I: On Past and Potential Testability of the Theory. Journal of Financial Economics, 4, 129-176.
Rosenberg, B., Reid, K. and Lanstein, R. (1985). Persuasive Evidence of Market Inefficiency. Journal of Portfolio Management, 11, 9-17.
Treynor, J. L. and Black, F. (1973). How to Use Security Analysis to Improve Portfolio Selection. Journal of Business, 46, 66-86.
Vasicek, O. A. (1973). A Note on Using Cross-Sectional Information in Bayesian Estimation of Security Betas. Journal of Finance, 28, 1233-1239.
White, H. (1980). A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity. Econometrica, 48, 817-838.
087 -0.003 3.701 (0.0447 0.841 0. Lag Selection for the Aggregate Coefficient Method K 0 1 2 Intercept 0.000153 0. Summary Statistics for Alpha & Beta Forecasts Mean Std Min 25% 50% 75% Max Skewness Kurtosis JB P-value Beta 0.00041 0.052 (0.089 -12 -5 -1 0 14 0.00113 0.123) Slope 0.000012 Prob(F) 0.700 (0.32 1005.9995 R2 0.113 (0.068 (0.054) 0.143 2.054) -0.207 0. 20 .054) -0.Table 1.985 1.356 4.699 (0.982 0.369 0 Table 2.274 0.334 Mean of estimated betas 0.268 67.8810 Note: Standard errors are in parenthesis.491 3.878 0 Alpha -1.056 0.036 1.123) 0.000876 0.123) 0.00024 R2a -0.
963 1.0453) 0. Table 4.921 1.037 1.1333) 0.667 2.095 1.0462) 0.001804 0.4778 92.9599 93.1334) 0.001085 0.1026 XYZ AC V JS Beta OLS XYZ AC V JS 0.1872 (0.1940 (0.1909 (0.1823 (0.0531 (0.001851 0.0559 (0.2037 (0.235 0.001762 0.058 1.894 2.3371 (0.0948) 0. Sharpe Ratio and M2-measure: Diagonal Model (S&P500 Sharpe Ratio = 0.279 2.145 OWLAD LAD 0.001762 0.0458) 0.1330) 0.1000 (0.001841 0.9745 94.1057 (0.1911 (0.0554 (0.890 LAD 1.427 1.922 1.0462) 0.001057 Beta Intercept Alpha Alpha2 R2 R2a White’s Panel B: Parabolic Specification Heteroscedasticity Test OLS 0.0943) 0.4382 94.094 0.1200) 0.5293 0.1525) 0.001084 0.1198) 0.497 * 0.001290 99.905 1.119 0.1526) 0.921 * 1.107 1.0944) 0.1892 (0.146 0.001505 0.001337 98.3910 (0.0455) 0.001028 0.095 1.2511 92.3970 (0.1916 (0.120 1.001327 98.033 1.1225 (0.001505 0. Regression Results for the Calibration of Alpha Forecasts Beta Intercept Alpha R2 R2a White’s Heteroscedasticity Test 92.001851 0.1898 (0.0947) 0.4409 Panel C: Kinked Linear Specification 0.001781 0.001536 0.3154 (0.001286 0.4014 0.001547 0.654 -0.1978) 0.001793 0.112 1.1967) 0.1951) 0.665 0.001134 0.6631 Constant Alpha*(Alpha>0) R2 R2a White’s Heteroscedasticity Test 92.3122 (0.895 1.904 1.001337 98.1199) 0.1016 (0.1336) 0.1945 (0.0950) 0.1526) 0.920 1.145 M2-measure OLS NRLAD JSLAD OWLAD -0.668 -0.6821 92.131 0.0565 (0.001313 99.955 1.1194) 0.146 0.2257 (0.Table 3.0952) 0.3930 (0.7363 94.1979) 0.909) OLS 0.481 1.114 * Note: * indicates the best estimator in the row.001391 0.095 Sharpe Ratio NRLAD JSLAD 0.001341 0.3440 94.413 1.957 1.001314 0.001342 0.076 0.001524 Note: Heteroscedasticity-consistent standard errors are in parenthesis.3210 (0.0952) 0.501 1.3939 (0.120 1.1336) 0.1521) 0.112 Beta Line Kinked Vasicek Line Kinked JS Line Kinked 0.4132 0.1523) 0.886 1.0946) 0.940 1.050 1.123 0.235 0.0950) 0.878 0.001827 0.1877 (0.124 * 0.025 1.0942) 0. 
21 .9233 Panel A: Linear Specification OLS XYZ AC V JS 0.1936 (0.2808 (0.036 * 2.941 1.0562 (0.1960) 0.001804 0.058 1.287 1.337 * 2.1914 (0.1975 (0.1198) 0.033 0.337 * 2.3896 (0.965 1.118 0.3952 (0.
232 1.399 * 2. Sharpe Ratio and M2-measure: Covariance Model (S&P500 Sharpe Ratio = 0.818 11. TBP Return and Risk: Covariance Model (S&P500 Return = 8.030 * LAD 0.460 2. TBP Return and Risk: Diagonal Model (S&P500 Return = 8.261 1.865 1.166: S&P500 Risk = 8.744 1.872 9.618 9.985 * -0.158 10.205 12.000 8.712 8.464 10.388 2.144 9.996 8.846 8.482 0.621 Beta Vasicek JS 22 .267 M2-measure OLS NRLAD JSLAD OWLAD -1.523 10.623 9.600 9.162 11.703 * 3.800 8.442 9.323 9.019 1.581 10.210 0.758 TBP Return (Mean) NRLAD JSLAD OWLAD LAD 9.162 10.750 8.435 1.570 10.015 1.597 9.601 11.315 9.054 10.167 9.788 11.452 8.744 9.358 12.262 10.158 * 0.980) OLS Line 7.550 TBP Return (Mean) NRLAD JSLAD OWLAD LAD 8.403 11.242 0.206 12.706 10.754 10.232 0.958 Kinked 11.984 4.664 8.077 1.163 Kinked 11.866 1.632 13.638 13.897 * 4.909) OLS 0.445 10.621 10.639 11.705 10.747 1.867 1.976 11.532 Line 8.378 2.244 0.293 -1.794 11.927 8.042 * 1.717 * 2.267 0.983 11.568 8.892 8.286 10. Dev. Table 7.948 * 3.980 -1.168 9.054 -0.214 * Beta Line Kinked Line Kinked Line Kinked Vasicek JS Note: * indicates the M2-measure is greater than its counterpart in the diagonal model in Table 4.261 1.710 9.752 8.864 1.054 10.400 11.659 * 2.214 Sharpe Ratio NRLAD JSLAD OWLAD 0.732 * 3.110 LAD 11.398 9.248 1.861 9.143 Beta Line Kinked Vasicek Line Kinked JS Line Kinked Table 6.365 12.157 * 0.744 8.) NRLAD JSLAD OWLAD 10.019 0.407 3.456 10.451 10.601 11.893 9.517 TBP Risk (Std.435 1.277 9.) NRLAD JSLAD OWLAD LAD 10.307 9.833 8.216 * -0.980) OLS 8.972 8.951 * 3.378 8.205 0.170 11.925 OLS 10.792 1.574 11.390 8.571 8.638 9. Dev.137 Kinked 11.583 9.666 8.082 -0.811 11.Table 5.930 9.891 8.320 OLS 9.637 9.978 11.507 * 3.789 1.585 10.765 1.799 9.976 11.706 9.718 * 1.247 LAD 1.166: S&P500 Risk = 8.725 8.504 * 3.057 10.299 8.796 9.294 9.003 * -1.171 9.173 9.015 1.899 9.323 9.829 8.577 9.794 9.539 8.072 10.526 Line 8.567 10.914 TBP Risk (Std.077 1.760 9.317 8.106 11.641 9.307 9.898 * -1.713 8.827 11.
Figure 1. Histogram of Alpha Forecasts
Figure 2. Histogram of Beta Forecasts
Figure 3. Market Value: Population vs. Sample (panels: Market Value: Population; Market Value: Sample; 1994 data)
Figure 4. Book/Market Value: Population vs. Sample (panels: Book/Market Value: Population; Book/Market Value: Sample; 1994, unit = $/$)
Figure 5. Distribution of Beta Estimates Based on 4 Methods: OLS (dash), AC (solid), Vasicek (dashdot), JS (dotted). Grand Mean: OLS = 0.880, AC = 0.999, Vasicek = 0.989, JS = 0.996
Figure 6. Scatter Diagram of Ex-Post Abnormal Returns vs. Alpha Forecasts
Figure 7. Fitted Lines Based on 3 Specifications: Line (-), Parabola (+), Kinked Line (*)
Figure 8. Predictive Power (Adjusted R2) over time (based on Vasicek + AC Method with n=1). Horizontal axis: Date Forecasts Made (The First Adjusted R2 = 0.08)
Figure 9. Forecasting Monthly S&P 500 Index (2/93-1/96) (AR(0)-GARCH(1,1) using 3 Year Daily Return Rolling Window). Prediction Root MSE = 2.20. Series: Actual (*), Return Forecast (+), Volatility Forecast (o)
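The volatility forecasts plotted in Figure 9 come from iterating the GARCH(1,1) recursion in (13) and (14) over the days of the target month. The sketch below illustrates only that forecasting step; it assumes the parameters w, a and b have already been estimated, takes the one-step-ahead forecast directly from (13), and uses hypothetical names and parameter values throughout.

```python
def month_variance_forecast(w, a, b, eps_T, sigma2_T, n_days):
    """Sum of k-step-ahead daily variance forecasts, k = 1..n_days, standing at day T.

    w, a, b   -- GARCH(1,1) parameters (assumed already estimated)
    eps_T     -- last in-sample return innovation
    sigma2_T  -- last in-sample conditional variance
    n_days    -- number of trading days in the target month
    """
    # One-step-ahead forecast, implied by the variance equation in (13).
    sigma2 = w + a * eps_T ** 2 + b * sigma2_T
    total = sigma2
    # Steps k = 2..n_days use the recursion in (14).
    for _ in range(2, n_days + 1):
        sigma2 = w + (a + b) * sigma2
        total += sigma2
    return total


# Hypothetical parameter values, for illustration only.
monthly_var = month_variance_forecast(w=0.1, a=0.1, b=0.8,
                                      eps_T=2.0, sigma2_T=1.0, n_days=21)
print("Target-month variance forecast:", monthly_var)
print("Target-month volatility forecast:", monthly_var ** 0.5)
```

When a + b < 1 the daily forecasts decay toward the long-run variance w / (1 - a - b), so a month that starts from a volatile day is forecast to be more volatile than average but less so than that day alone would suggest.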