Not fooled by randomness: using random portfolios to analyze investment funds

Roberto Stein

rstein@fen.uchile.cl

Faculty of Economics and Business, University of Chile

August 2012

The biggest challenge in testing mutual funds for manager skill is the lack of a

probability distribution of returns under the null hypothesis of no skill. A

methodology based on randomly trading portfolios and non parametric statistical

tests is explored, and a test of skill is proposed. Simulation is used to perform an

in-depth study of the properties of this test, and to compare its power against that

of other tests of skill based on factor model alphas. Empirical tests performed on a

sample of US equity mutual funds find evidence of skill in a reduced number of

managers, but that the value added by this skill is charged away from the investors

in the form of fund fees and expenses. Overall, random portfolio based measures

are found to be more powerful and easier to interpret than tests based on traditional

and bootstrapped factor model alphas.

Electronic copy available at: http://ssrn.com/abstract=2143293

1. Introduction

Fund performance measures, while theoretically good indicators of past overperformance [1], are notoriously unreliable predictors of future performance [2]. This makes them a poor choice for investors who wish to allocate their capital in funds that, at least in expectation, will overperform in the future.

Perhaps this is why in the last few years the discussion has shifted from measuring

performance to a more elusive factor: testing fund manager skill. The implicit argument is

that, while skill in no way guarantees persisting overperformance, a fund manager that has

obtained a high level of performance in the past through skill is much more likely to repeat

such performance in the future than one who was merely lucky.

While the argument is certainly sound, performance is an observable variable and skill is not. Measuring or testing for skill therefore presents important empirical problems.

Past attempts at measuring skill are based solely on factor model alphas, first as indicators of overperformance and lately interpreted as signals of manager skill. However, as explained in Kosowski, Timmermann, Wermers and White (2006) (henceforth KTWW), the cross-sectional distribution of the resulting regression alphas exhibits strong deviations from normality, which invalidates standard statistical significance tests. Much more importantly, it is not clear what the distribution of alphas should be under the null hypothesis of no skill. KTWW claim to solve both these problems with a bootstrap methodology, which they apply to a large sample of U.S. mutual funds, as do Cuthbertson, Nitzsche and O'Sullivan (2008) in the U.K.

[1] See Kothari & Warner (2001), who critique regularly used performance measures as lacking power enough to detect economically large magnitudes of abnormal fund performance.

[2] See Carhart (1997) and most of the literature regarding persistence in mutual fund performance, as well as Goetzmann (2007) on how, in any case, most of these measures are susceptible to fund manager manipulation.

In the present paper I test these methodologies and find that both of them, standard and bootstrap alpha, are prone to be "fooled by randomness": that is, they falsely detect skill in simulated samples of portfolio returns where overperformance is a result of luck. This is not surprising, since regression alphas are also performance metrics, and as such they are highly correlated with fund performance, thus supplying little in the way of new information above regularly used measurements.

I develop a new methodology to test for skill, based on randomly trading portfolios as

proposed in Burns (2007), which are used to derive the empirical distribution of fund

returns under the null of no skill. This measure is superior to the alpha-based

methodologies in that it is powerful enough to distinguish skill from luck in all except the

most extreme cases of luck, is more intuitive in its interpretation since it relies on simple

fund returns, and is designed to be applied to the identification of skill in individual funds,

as opposed to the KTWW method and that of Barras, Scaillet and Wermers (2008), which

is designed to find only the proportion of skilled funds in a given market.

This paper is organized as follows. The next section explores previous measures of fund

manager skill and the results obtained with these. Section 3 details the methodologies

behind the alpha and random portfolio measures of skill. Section 4 presents the results of

tests of the power of the measures, using simulated samples of portfolios that are constructed

to obtain a performance above a benchmark through skill or luck. Section 5 presents results

of the application of both types of measures to a sample of U.S. equity funds, and Section 6

contains concluding remarks.

2. Measuring fund manager skill: current state of the art and new proposed measure

Factor model alphas have recently been pushed beyond the original Jensen performance

measure and considered evidence of fund manager skill. In general, this line of research

involves controlling portfolio returns for known risk factors, such as exposure to the

market, firm size, market-to-book ratio and momentum. If these models yield positive and

significant values of alpha, then this is considered evidence that the manager of the fund is

skillful (see Silli (2006) for a review).

However, these traditional regression alphas are of dubious value as tools to evaluate skill. Apart from the various criticisms inherent in regression models (see Silli (2006), Ferson and Schadt (1996), Christopherson (1998), Spiegel, Mamaysky and Zhang (2003, 2006), and others), two critical shortcomings can be identified in these models that make any inference gleaned from them unreliable. First, hypothesis testing of regression alphas relies on the assumption of normality of the alphas' distribution. KTWW find extreme deviations from normality in the distribution of alphas in the U.S. market, as do Cuthbertson, Nitzsche and O'Sullivan (2008) in the U.K. Second, in order to conduct a statistical test, we need to have the probability distribution of the variable of interest under the null hypothesis. However, we do not know what a distribution of alphas looks like if funds are managed with no skill. We therefore take a positive and significant alpha as evidence of skill, even though Kosowski et al. use basic statistical testing criteria to show that, in a relatively large sample, we can expect to observe a certain number of alphas that are positive and significant through pure random fluctuation (i.e., luck).

KTWW propose a bootstrap approach to improve the regression alpha test of skill. First they select a sample of funds that operate in a certain market. For each fund in the sample, they then regress its returns on a factor model, and obtain factor loadings and residuals. Then, they repeatedly resample the residuals and, together with the factor loadings, construct a new set of returns, where the value of alpha is set to zero. Finally, they regress the resulting returns on the same factor model and obtain the resulting alpha. Repeating this process a number of times, KTWW construct a "null distribution of no skill" for the alphas of the full market under study (a distribution as it would look if the alphas' real value were zero). They then compare the number of funds in the market for which regression alphas are positive and significant with the number of alphas that happen to be positive and significant in this null distribution, that is, by luck. Their conclusion is that since the number of "real" positive alphas exceeds that of the "lucky" alphas, some of the real alphas must be the result of fund manager skill, as random fluctuation cannot explain them all away.
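The resampling loop described above can be sketched for a single fund as follows. This is a minimal illustration and not the KTWW implementation: it assumes a one-factor model instead of the multi-factor specifications, synthetic data, and hypothetical names (`ols`, `bootstrap_null_alphas`).

```python
import random
import statistics


def ols(y, x):
    """One-factor OLS with intercept; returns (alpha, beta, residuals)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    alpha = my - beta * mx
    residuals = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
    return alpha, beta, residuals


def bootstrap_null_alphas(fund, factor, n_boot=500, seed=0):
    """Alphas re-estimated on pseudo-returns built with alpha forced to zero."""
    rng = random.Random(seed)
    _, beta, residuals = ols(fund, factor)
    null_alphas = []
    for _ in range(n_boot):
        # Resample the residuals with replacement.
        resampled = rng.choices(residuals, k=len(residuals))
        # Rebuild returns from the factor loading and residuals only (alpha = 0).
        pseudo = [beta * f + e for f, e in zip(factor, resampled)]
        null_alphas.append(ols(pseudo, factor)[0])
    return null_alphas


rng = random.Random(1)
factor = [rng.gauss(0.005, 0.04) for _ in range(72)]      # synthetic factor returns
fund = [0.9 * f + rng.gauss(0.0, 0.02) for f in factor]   # synthetic fund, true alpha zero
null_alphas = bootstrap_null_alphas(fund, factor)
```

The spread of `null_alphas` shows how large an estimated alpha can become through sampling variation alone when the true alpha is zero.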

The bootstrap methodology addresses the problem of normality, as the testing is done on

the basis of empirical distributions and there is no assumption of a parametric one. Also, by

forcing alpha to be zero in the iteration process, this bootstrap process is one way to obtain

a distribution under the null of no skill.

However, the methodology still suffers from serious drawbacks. First, the characterization of alpha as a measure of skill, even if correctly estimated, is still a matter of interpretation, as opposed to simple returns, which are unambiguous in their origin and interpretation. Second, this measure is of an absolute nature and, as is also the case with the standard alpha analysis described above, there will inevitably be a high correlation between performance and alphas. That is, funds that overperform will tend to have positive and significant alphas, irrespective of manager skill. This is because factor models regress fund returns on a number of factors, and these factors are equal for all funds tested. In essence, the factors become a type of benchmark that is equally applied to each fund to correct returns for exposure to certain risks. The consequence is that inference is unreliable: funds that perform well need not be managed by skillful managers, since luck plays a big role in performance. In fact, as is shown in Section 4, traditional and bootstrapped measures of alpha are likely to erroneously reject the null of no skill in the presence of "lucky" portfolios, which obtain a high level of performance due to luck. These measures are therefore easily "fooled by randomness". On the other hand, the possibility of finding a skillful manager whose track record shows poor results is almost zero, and so the possibility of using the alpha measure as a diagnostic tool to help identify shortcomings and improve performance is minimal.

Moreover, as a matter of application and interpretation of results, in order to estimate the bootstrap model of KTWW the alphas of all funds in the market must be estimated and bootstrapped, and inference is obtained by comparing real fund alphas to a ranked matrix of bootstrapped alphas (that is, a matrix where the resulting bootstrap alphas have been ordered from highest to lowest). Thus, the KTWW methodology requires data from all funds in the market, even if the study is focused on a single fund. While Cuthbertson et al. claim that they are able to study individual funds using this methodology, they still require the full market dataset, and their inference is based on comparing each real fund's alpha against a distribution of bootstrapped alphas that corresponds to the real fund's performance rank. That is, the best performing fund's alpha is compared to the distribution of the highest bootstrapped alphas, the second best fund's to the second best distribution, etc. The underlying assumption of this test is that the best performing fund will always obtain the best possible bootstrapped alphas, the second best fund will obtain the second best set of alphas, and so on for the rest of the funds as ranked by their performance.

The lack of a known distribution under a certain null hypothesis is addressed in a general

framework for investment fund analysis in Dawson and Young (2003). They argue that our

inability to carry out experiments with control groups makes obtaining these distributions

of the null hypothesis a complicated task, and advocate the use of samples of random

portfolios in a Monte Carlo experiment setting to generate them. Burns (2007) notes that

constrained random portfolios, that is, portfolios that are allowed to trade randomly but

within the same bounds faced by real fund managers, constitute a control group for a

measure of skill since by construction there is no skill in their trading decisions. The fund

manager's constraints or restrictions can be imposed by the firm that offers the funds, for

example in terms of the prospectus and investment goals, or self-imposed trading behavior

that the manager maintains over his career. These restrictions may be in the form of a

subset of the universe of assets in which the manager is allowed to invest (cash, fixed

income, equity and derivatives, value vs. growth stocks, small vs. large firms, etc.),

acceptable levels of risk (minimum and maximum; expressed as standard deviation, VaR,

benchmark risk, etc.), turnover ratio, number of assets in the portfolio, etc.

I will henceforth refer to the general use of constrained random portfolios as an analysis

tool as Constrained Random Portfolio Analysis, or CRPA.

The portfolio returns resulting from randomly trading portfolios are obtained purely by chance, with no value-adding (or subtracting!) intervention, and thus represent a subset of the state-space of feasible portfolios that could be attained by the fund manager. A large enough sample of CRPA portfolios will therefore generate the probability distribution of every level of performance potentially attainable by the fund manager, within the constraints she faces. Real fund returns are then compared to the distribution obtained from the random portfolios. Rejection of the null of no skill depends on a chosen significance level: for a manager to be considered skillful, her fund's returns should be at least better than a certain percentile of the random fund distribution, where that percentile corresponds to the desired level of significance. In other words, a manager is considered skillful if she is able to do better than a certain number of random portfolios.

Correctly applied, CRPA addresses all the arguments against previous measures of fund

manager skill. Being a non-parametric approach, it sidesteps all the theoretical and

econometric problems associated with factor models. The analysis is strictly individual: one

fund can be analyzed with no need of data of other funds (nor the market, macroeconomic

variables, etc.). Thus, in this sense the amount of data required for the analysis is lower than

for other measures of skill, and there is no peer group or relevant benchmark decision to

make. This means as well that the measure is relative and specific to the fund being

tested. Finally, CRPA introduces a flexible and powerful framework that can be used in

many other applications beyond testing for manager skill.

3. Empirical Methodology

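Factor model alphas, when positive and significant, are considered signs of fund manager skill or, at least, abnormal performance not attributable to known sources of risk. The three most widely used specifications for estimating these alphas are:

Jensen's alpha (Jensen 1968),

$$R_{pt} - r_f = \alpha_p + \beta_p (R_{Mt} - r_f) + e_{pt} \tag{1}$$

the Fama and French 3 factor model (Fama & French 1993),

$$R_{pt} - r_f = \alpha_p + \beta_p (R_{Mt} - r_f) + \beta_{HML,p} HML_t + \beta_{SMB,p} SMB_t + e_{pt} \tag{2}$$

and Carhart's 4 factor model (Carhart 1997),

$$R_{pt} - r_f = \alpha_p + \beta_p (R_{Mt} - r_f) + \beta_{HML,p} HML_t + \beta_{SMB,p} SMB_t + \beta_{MOM,p} MOM_t + e_{pt} \tag{3}$$

where $R_{pt}$ are the returns of portfolio $p$ at time $t$, $r_f$ is the risk-free rate, $\alpha_p$ is the regression alpha, $R_{Mt}$ is the market return at time $t$, $\beta_p$ is the fund's beta with respect to the market, $\beta_{HML,p}$ is the fund's beta with respect to Fama & French's High-Minus-Low factor, $HML_t$ is Fama & French's High-Minus-Low factor, $\beta_{SMB,p}$ is the fund's beta with respect to Fama & French's Small-Minus-Big factor, $SMB_t$ is Fama & French's Small-Minus-Big factor, $\beta_{MOM,p}$ is the fund's beta with respect to Carhart's momentum factor, $MOM_t$ is Carhart's momentum factor, and $e_{pt}$ is an error term.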

The null hypothesis of no skill is rejected if a fund's regression alpha is found to be positive

and significant.
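As an illustration of the alpha test, the one-factor (Jensen) case can be estimated by ordinary least squares in a few lines. This is a sketch on synthetic data, not the estimation code used in the paper; the function name `jensen_alpha` and all numerical inputs are assumptions made for the example.

```python
import math
import random
import statistics


def jensen_alpha(excess_fund, excess_market):
    """OLS of fund excess returns on market excess returns.

    Returns (alpha, beta, t_alpha) for the one-factor model.
    """
    n = len(excess_fund)
    mx = statistics.fmean(excess_market)
    my = statistics.fmean(excess_fund)
    sxx = sum((x - mx) ** 2 for x in excess_market)
    sxy = sum((x - mx) * (y - my) for x, y in zip(excess_market, excess_fund))
    beta = sxy / sxx
    alpha = my - beta * mx
    residuals = [y - alpha - beta * x for x, y in zip(excess_market, excess_fund)]
    s2 = sum(e * e for e in residuals) / (n - 2)            # residual variance
    se_alpha = math.sqrt(s2 * (1.0 / n + mx * mx / sxx))    # standard error of the intercept
    return alpha, beta, alpha / se_alpha


rng = random.Random(0)
market = [rng.gauss(0.005, 0.04) for _ in range(72)]             # 72 synthetic months
fund = [0.002 + 1.1 * m + rng.gauss(0.0, 0.01) for m in market]  # synthetic fund returns
alpha, beta, t_alpha = jensen_alpha(fund, market)
```

Under the standard test, the null of no skill would be rejected when `alpha` is positive and `t_alpha` exceeds the chosen critical value. The three- and four-factor models add regressors but follow the same logic.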

A criticism of these models is that the betas are unconditional and static. Conditional models with time-varying betas have been developed and estimated (see Silli (2006)), in the hope of obtaining more precise estimates of the regression coefficients. However, as far as skill testing is concerned, KTWW find that inference obtained from both conditional and unconditional models is virtually the same. Therefore, in the tests that follow this section, only unconditional models are used.

Standard factor model analysis generally identifies the alpha (or regression intercept) as

evidence of abnormal return or fund manager skill (if positive and significant). However,

standard models are incapable of differentiating between positive alphas obtained by

skillful managers, and those that result from sheer luck (good or bad), as unlikely events

that can nevertheless be observed at the tails of the distribution.

The Bootstrap Alpha technique improves upon the standard analysis. Applied first in Kosowski, Timmermann, Wermers and White (2006), then replicated in Cuthbertson, Nitzsche and O'Sullivan (2007) and (with small variations) in Fama and French (2010), this technique seeks to obtain a distribution of factor model alphas from a bootstrap process where the true alpha has been set to zero. Thus, this distribution will show the probabilities or expected frequencies of observed positive and negative alphas under the null that there exist no managers with skill. This distribution is then compared to the distribution of real investment fund alphas and, in its simplest form, the number of funds that fall in the extreme quantiles in one distribution can be compared to those of the other. For example, in KTWW one statement of their analysis reads "Panel A indicates that nine funds should have an alpha estimate higher than 10% per year by chance, whereas in reality, 29 funds achieve this alpha." This is taken as evidence that the market must contain at least some funds that obtain positive alphas by dint of their managers' skill.

CRPA: A non-parametric alternative to factor model alphas

Using the software package PortfolioProbe [3], samples of randomly trading funds can be constructed which, while devoid of skill, may still be bound by user-defined constraints. A sufficiently large number of these random funds constitutes the sample which is then used as the control group or distribution under the null to test fund manager skill, and which can be used to perform other types of analyses.

To obtain the relevant distribution under the null of no skill, Burns (2007) considers using the holding period return for each portfolio in a large sample of randomly trading funds. That is, if 1,000 portfolios are generated, the return of each portfolio is calculated, and the distribution is based on the cross-section of the 1,000 holding period returns thus obtained. The skill test then consists of comparing the real fund's return with the ranked random portfolio returns. If the real return attains a certain percentile, for example if it is better than 95% of the random returns, then we reject the null of no skill.
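The cross-sectional test can be sketched as below. This is a toy stand-in and not PortfolioProbe: the random portfolios here are equal-weighted baskets of randomly chosen synthetic assets with no turnover constraint, and every name and number (`random_portfolio_returns`, the 30-asset count, the 80% fund return) is an assumption made for illustration.

```python
import random


def holding_period_return(monthly_returns):
    """Compound a series of simple monthly returns."""
    value = 1.0
    for r in monthly_returns:
        value *= 1.0 + r
    return value - 1.0


def random_portfolio_returns(asset_returns, n_assets, rng):
    """Monthly returns of an equal-weighted portfolio of randomly chosen assets."""
    picks = rng.sample(range(len(asset_returns)), n_assets)
    n_months = len(asset_returns[0])
    return [sum(asset_returns[a][t] for a in picks) / n_assets
            for t in range(n_months)]


def cross_sectional_test(fund_hpr, random_hprs, level=0.95):
    """Reject the null of no skill if the fund beats `level` of the random portfolios."""
    beaten = sum(1 for r in random_hprs if fund_hpr > r)
    return beaten / len(random_hprs) >= level


rng = random.Random(42)
# Synthetic universe: 100 assets, 72 months of simple returns (illustrative only).
universe = [[rng.gauss(0.006, 0.08) for _ in range(72)] for _ in range(100)]
random_hprs = [holding_period_return(random_portfolio_returns(universe, 30, rng))
               for _ in range(1000)]
fund_hpr = 0.80   # hypothetical managed fund's 6-year holding period return
reject_null = cross_sectional_test(fund_hpr, random_hprs)
```

Only the fund's total return over the period enters this test, so the path by which that return was earned plays no role.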

Figure 1 shows the probability density of a sample of 1,000 random funds' holding period returns, with a (dashed) line depicting the 95th percentile threshold.

[Figure 1 about here]

The plot was made with a sample of 1,000 random funds trading S&P 500 listed securities for 6 years, from 2005 to 2010. The constraints imposed on these portfolios were on turnover and portfolio asset count (the number of different assets that could be contained in the portfolio). The values used for these constraints are consistent with the mean of these values (turnover and asset count) for real funds currently operating in the market [4]. Therefore, a fund manager at the helm of a fund trading these securities and operating under constraints similar to those simulated would have to obtain a return equal to or better than roughly 80% over the 6-year period for the null of no skill to be rejected.

[3] Actually, PortfolioProbe is a library of functions written for the R language.

[4] The data were obtained from the CRSP Mutual Fund database. The sample employed corresponds to actively managed funds for which at least 90% of net asset value consists of stocks listed in the S&P 500 index.

I propose an improvement to this methodology, in which the null distribution used for

testing is that of the time series of returns of a single random portfolio, as opposed to the

cross-sectional approach used in Burns (2007). Using a sample of random portfolios

ordered by a certain criterion (which could be, for example, mean return), I first set the

critical value for a percentile, then choose the random portfolio which occupies the position

of that percentile in the ordered sample.

Hypothesis testing is now based on the concept of stochastic order [5]. We are interested in testing whether the distribution of returns of the managed fund is stochastically greater than that of the chosen percentile random fund.

A random variable A can be said to be stochastically greater than another random variable B if

$$\Pr(A > x) > \Pr(B > x) \quad \forall x \in (-\infty, +\infty) \tag{4}$$

By this definition we could say that A is 'bigger' than B, but the financial interpretation is

far more interesting: the probability that fund A obtains a return higher than x is higher than

that of fund B attaining a similar performance. While far less powerful than the concept of

stochastic dominance, stochastic order can guide decision making in the sense that it could

point towards a fund manager who has a higher probability of obtaining a certain level of

return in future realizations. This is consistent with the argument in favor of skill over luck

described in the first section of this article.

In order to test stochastic order between two distributions, the null and the real fund's time series of returns, the non-parametric Mann-Whitney U test (also called Mann-Whitney-Wilcoxon and therefore referred to as MWW) is used. The MWW test is used to assess whether one of two samples of independent observations tends to have larger values than the other. The test involves estimation of the U statistic, which is calculated by first ranking the values of both distributions and then adding these ranks. The one-tailed test's alternative hypothesis can be stated as: the probability that an observation from sample X is higher than one from sample Y is higher than 0.5, or

$$\Pr(X > Y) + \Pr(X = Y) > 0.5 \tag{5}$$

[5] See, for example, Shaked and Shanthikumar (1994).

For robustness purposes, tests of location can also be used. A standard parametric t-test of the difference in means of both distributions is employed, as well as a non-parametric alternative based on permutation. This last test consists of calculating a certain statistic, for example, the difference between the sample means. Then, the elements of both samples are mixed together and repeatedly resampled. In each iteration, two vectors are obtained with the same number of observations as each original sample, but with elements drawn from the mixed dataset, i.e., each can contain observations from either sample. The relevant statistic is calculated, and the process is repeated. If the statistic of interest is the difference between the means, then after each iteration a difference of means is calculated between the resulting vectors. A large number of iterations will generate an empirical distribution of the difference between the means of the vectors, under the null hypothesis that both original samples were drawn from the same distribution. If the real difference in means is large enough (higher than a critical value), then the null is rejected and both samples are assumed to come from different distributions.
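The permutation procedure just described can be sketched as follows. One assumption should be flagged: the pooled data are reshuffled without replacement in each iteration, which is the usual permutation scheme, while the text does not say whether the resampling is done with or without replacement. The function name and the sample values are hypothetical.

```python
import random
import statistics


def permutation_test_mean_diff(x, y, n_iter=2000, seed=0):
    """One-sided permutation p-value for mean(x) - mean(y) > 0."""
    rng = random.Random(seed)
    observed = statistics.fmean(x) - statistics.fmean(y)
    pooled = list(x) + list(y)            # mix the two samples together
    exceed = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)               # reassign observations at random
        new_x, new_y = pooled[:len(x)], pooled[len(x):]
        if statistics.fmean(new_x) - statistics.fmean(new_y) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_iter + 1)


fund = [10, 11, 12, 13, 14, 15]           # hypothetical values, clearly larger
benchmark = [0, 1, 2, 3, 4, 5]
diff, p_value = permutation_test_mean_diff(fund, benchmark)
```

The null that both samples come from the same distribution is rejected when `p_value` falls below the chosen significance level.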

This methodology improves the quality of the test, as it analyzes the managed fund's full distribution of returns and not just the overall result achieved over a period of time. The most important consequence of this analysis is that funds that achieve impressive results by luck are much more likely to fail the test. Indeed, as is shown in simulation tests, this version of the CRPA skill test is far less likely to be "fooled by randomness" than all the previously described measures.
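The time-series version of the test described in this section can be sketched in two steps: pick the random portfolio that sits at the chosen percentile of mean return, then compare the managed fund's monthly returns against that portfolio's with a one-sided Mann-Whitney test. The code below is a minimal sketch under stated assumptions: U is computed by pairwise comparison (equivalent to the rank-sum formulation), the p-value uses the normal approximation without tie or continuity corrections, and the data and function names are hypothetical.

```python
import math
import random


def percentile_portfolio(portfolios, percentile=0.95):
    """Pick the random portfolio at the given percentile of mean return."""
    ranked = sorted(portfolios, key=lambda series: sum(series) / len(series))
    index = min(len(ranked) - 1, round(percentile * len(ranked)))
    return ranked[index]


def mann_whitney_greater(x, y):
    """One-sided Mann-Whitney U test of H1: x tends to be larger than y.

    Returns (U, p_value) using the normal approximation.
    """
    n1, n2 = len(x), len(y)
    # U counts pairs with x_i > y_j; ties count one half.
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))   # upper-tail normal probability
    return u, p_value


rng = random.Random(7)
random_funds = [[rng.gauss(0.005, 0.04) for _ in range(72)] for _ in range(1000)]
null_series = percentile_portfolio(random_funds, 0.95)
managed = [r + 0.03 for r in null_series]   # hypothetical fund: 3% extra every month
u_stat, p_value = mann_whitney_greater(managed, null_series)
reject_null = p_value < 0.05
```

With 1,000 random portfolios and the 95th percentile, `percentile_portfolio` returns the 50th best portfolio by mean return.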

4. Power of the Skill Tests

In the previous sections we encountered three measures or tests of fund manager skill (standard regression alphas, bootstrap alphas and CRPA), detailed their methodologies, and listed some of their potential shortcomings. In this section I directly test each measure in terms of its power to detect skill. Moreover, and more importantly, I test the measures' potential to differentiate between skill and luck.

Since, other than the standard alpha test, test statistics do not have parametric distributions,

analytical expressions of the tests' power are not obtainable. Hence, I proceed to estimate

the power of each test via simulation.

Power curves for each skill test are built by applying the test of skill to simulated samples of portfolios which are constructed to exhibit skill or to be simply "lucky". This is accomplished by adding an extra rate of monthly return to a baseline time series of returns. For the tests reported, the baseline vector is obtained from a sample of CRPA random portfolios. These portfolios trade S&P 500 stocks, and do so constrained to the maximum and minimum levels of turnover and number of assets in each portfolio observed in a sample of real U.S. mutual funds that invest primarily in S&P 500 stocks. The average monthly returns for these random portfolios are calculated, and then the portfolios are ranked by this variable from smallest to largest. Then, the baseline for the power test samples is chosen as a percentile from these ranked portfolios. For example, if the 95th percentile is chosen and 1,000 portfolios were generated, then the baseline is a vector consisting of the time series of returns of the 50th best performing portfolio.

Let the baseline portfolio be referred to as $b$. Its returns are $r_{bi}$, where $i$ is the time period to which the return corresponds ($i = 1, \dots, n$, where $n$ is the total number of time periods under study), and the full vector of returns is $\mathbf{r}_b$. This vector has a mean monthly return $\bar{r}_b$.

The samples used to construct the power curves are composed of 1,000 portfolios simulated for each of skill and luck. To illustrate how these samples are generated, let $\delta$ be a predetermined rate of added return, and $\mathbf{e}$ a noise term. Thus, $\mathbf{e}$ is a random variable distributed as $N(0, \sigma^2)$ [...] simulated in the luck portfolio sample.

Thus, the skill sample is generated by drawing vectors of $\mathbf{e}$ and producing portfolio time series of returns of the form

$$\mathbf{r}_b + \boldsymbol{\delta}_s + \mathbf{e} \tag{8}$$

where $\boldsymbol{\delta}_s$ is a vector of length $n$, and each element of the vector is equal to $\delta/n$. Thus, the resulting vectors of returns represent a single (smooth) monthly increase in return with respect to the baseline, plus a noise or randomizing term with zero mean.

On the other hand, the sample of lucky portfolios is constructed as

$$\mathbf{r}_b + \boldsymbol{\delta}_l + \mathbf{e} \tag{9}$$

where $\boldsymbol{\delta}_l$ is a vector of length $n$ whose elements contain $lp$ instances of $\delta/lp$ (with $\delta$ the added return), the rest being zero, and with the position of the non-zero elements chosen randomly for each portfolio in the sample. So, for example, if $lp$ is equal to 1, we are simulating a lucky fund manager that is able to match the performance of a skillful manager with a single lucky break, that is, a large added return in a single month, while during the rest of the time period the returns of his fund are, in expectation, no different than the baseline. The resulting samples have properties that make them ideal for the power tests (see appendix I).
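The construction of the two samples can be sketched as below. The names `delta` (added return), `sigma` (noise standard deviation) and `lucky_periods` (the number of lucky periods, lp) are labels chosen for this illustration, and the noise level used in any real run would have to follow the paper's appendix. The code only shows the mechanics: the same added return is either spread evenly over all n periods or concentrated in lp randomly placed periods.

```python
import random


def skill_series(baseline, delta, sigma, rng):
    """Baseline plus delta/n in every period, plus zero-mean Gaussian noise."""
    n = len(baseline)
    return [r + delta / n + rng.gauss(0.0, sigma) for r in baseline]


def luck_series(baseline, delta, lucky_periods, sigma, rng):
    """Baseline plus delta/lp in lp randomly chosen periods, plus noise."""
    n = len(baseline)
    lucky = set(rng.sample(range(n), lucky_periods))
    return [r + (delta / lucky_periods if i in lucky else 0.0) + rng.gauss(0.0, sigma)
            for i, r in enumerate(baseline)]


rng = random.Random(3)
baseline = [rng.gauss(0.005, 0.04) for _ in range(72)]   # stand-in for the percentile portfolio
skilled = [skill_series(baseline, 0.72, 0.01, rng) for _ in range(1000)]
lucky = [luck_series(baseline, 0.72, 1, 0.01, rng) for _ in range(1000)]
```

Both samples receive the same total added return, so they differ only in how that return is distributed over time.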

The power test is carried out for various given levels of added return, ranging from zero (no skill) to 4% per month, a large added return that ensures that at that end of the range the power curve converges to a probability of 1. Also, the number of "lucky periods" is allowed to vary, and can take values of 1 (the full extra return added to a single month's return), 3, 5 and 10.

It should be noted that these samples are consistent with the previously given definition of

skill vs. luck in investment funds: the skillful manager may have good and bad periods,

but overall she should be able to obtain a consistent performance that is better than the

market average. On the other hand, a lucky manager may be able to match (or surpass) the

performance of a skillful manager, but does so because of a relatively small number of

lucky breaks, or periods of exceptionally good returns, which have a small probability of

being repeated going forward.

The plots that follow show the resulting power curves for each test: standard alpha, bootstrap alpha, and CRPA in its two versions, the Burns cross-sectional measure and the 95th percentile time-series measure.

For both regression alpha tests (standard and bootstrap) only the results based on the Carhart four-factor model are shown. This is done to preserve the images' clarity, as results stemming from other models (one and three factor) are qualitatively equal and are available upon request. For the same reason, percentile distribution (time series) based CRPA testing is done using only the MWW test, as t-test and permutation test results are very similar.

Finally, in order to simplify the images, the number of power curves plotted is further reduced by introducing the concept of "net power". Since detecting skill where none exists is, in fact, a failure of the test employed, the power associated with this type of outcome is deducted from the estimated power of detecting skill in samples that do have it. Thus, net power is defined as "power to detect skill" minus "power to detect luck". This measure of net power is also consistent with the aim of skill tests, which is to separate skill from luck.
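In simulation terms, each power figure is a rejection rate, so net power is a difference of two rejection rates. A minimal sketch, assuming each simulated portfolio has already been given a p-value by whichever skill test is being evaluated (the p-values below are made up for illustration):

```python
def rejection_rate(p_values, significance=0.05):
    """Share of simulated portfolios for which the null of no skill is rejected."""
    return sum(1 for p in p_values if p < significance) / len(p_values)


def net_power(skill_p_values, luck_p_values, significance=0.05):
    """Power to detect skill minus the rate at which luck is mistaken for skill."""
    return (rejection_rate(skill_p_values, significance)
            - rejection_rate(luck_p_values, significance))


skill_p = [0.01, 0.02, 0.20, 0.03]   # hypothetical p-values, skill sample
luck_p = [0.50, 0.01, 0.60, 0.70]    # hypothetical p-values, luck sample
result = net_power(skill_p, luck_p)
```

A test that rejects equally often in both samples has a net power of zero, however high its raw power on the skill sample.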

Figure 3 shows the power curves where the luck-derived returns are constructed with a single "lucky" month in a six-year period. This is the most extreme case of "luck", and should be the easiest for the tests to identify as such. As can be observed, the Burns measure lacks power when applied to skillful and lucky samples with similar levels of return. In fact, its net power is close to zero for any level of added return, while the other measures exhibit levels of net power similar to one another. This result is due to a disproportionately large power component estimated for the luck sample, which in net terms cancels the equally large power for detecting skill in the skillful sample, which would otherwise trump the other measures. The other measures fare better, with the standard alpha and Percentile CRPA tests showing very low tendencies to be fooled by these "lucky" funds.

[Figure 3 about here]

As the number of lucky periods increases, we can see that the power curves based on simulated skillful samples remain virtually the same, but the likelihood that the measures of skill will mistake a lucky fund for a skillful one increases. This affects all measures by severely reducing their net power. However, the effect is least noticeable for the CRPA / MWW test, which becomes the most powerful test once the luck is distributed over 3 lucky periods, and remains so for all power evaluations that follow.

Figure 4 shows a generalized deterioration of power for all tests, with the extra return factor

for lucky portfolios now spread over 5 periods. As mentioned above, the MWW test is

still relatively powerful, and remains the best alternative.

[Figure 4 about here]

Once the number of lucky periods reaches 10, out of 72 total trading periods (6 years' worth of data for each sample), all net power curves show marked deterioration, with the net power of the standard alpha, bootstrap alpha and Burns tests essentially zero, as depicted in Figure 5. While the Percentile CRPA test is still the best, its net power never rises above approximately 30%, making its use in these situations questionable.

[Figure 5 about here]

This can be explained again from the point of view of our definitions of skill and luck. As

the number of extra return time periods increases, the boundary between luck and skill

starts to blur. A fund with a relatively large number of good returns in a time series of

fixed length cannot be easily dismissed as lucky, as this might be evidence of a skillful

manager at the helm.

Finally, a point could be made that the testing framework is flawed, since by construction

the samples of lucky funds have the same expected return, but higher volatility than those

of skillful portfolios. Thus, one sample stochastically dominates the other, and the

identification of skillful portfolios could be easily made by applying most measures of risk-

adjusted returns (for example, the Sharpe ratio). However, the argument made here is that

in detecting skill it is not the global rate of return that matters, but how that return is

attained over a period of time. To test the robustness of the Percentile-CRPA test to the

stochastic dominance point, I next perform the power tests using samples with no stochastic

dominance: while the added volatilities remain the same, the return factor added to the

lucky portfolios is larger than that added to the skillful portfolios. Tests are performed

where the added return factor for the lucky sample is increased with respect to the

endowment of the skillful sample by factors of 20%, 40% and 60%.

Figure 6 shows the most extreme case simulated, with the net power curves for all tests

where the sample of lucky portfolios has been endowed with a return factor which is 60%

higher than that of the skillful portfolios, and spread over 5 periods of time. As can be

observed, the Percentile-CRPA measure remains unaffected and able to separate skill from

highly performing lucky funds, while the other tests have net power measures that fall

below zero, indicating that these tests are swayed by the extra return of the lucky funds and

attribute skill to these portfolios more often than they do to truly skillful ones.

[Figure 6 about here]

5. Empirical Tests of Measures of Skill

5.1 Sample of Investment Funds and Required Data

While most performance measures require only portfolio returns, CRPA needs a wider

range of data for its implementation. The goal is to obtain as complete a picture as possible

of the constraints faced by the fund manager in her decision making process, in order to

integrate as many of these constraints into the CRPA portfolio formation process as

possible.

One of the first and most important explicit constraints placed on any fund manager is the

universe of securities which are eligible to be part of the fund, which is a subset of the

securities available in the market. This constraint is clearly defined in the fund's

prospectus, and is an integral part of the manager's mandate and investment strategy.

While CRPA can be applied to virtually any kind of investment fund, to generate the

random portfolios we require a dataset containing the time series of returns of all assets

eligible to be part of the portfolio. Thus, for example, if we wished to analyze a corporate

bond portfolio, any and all bonds that the manager might conceivably invest in must be

included in this dataset, so that random portfolios could eventually contain these assets as

well. While firms tend to have a single stock listed in one exchange, they can (and do) have

various issues of bonds trading in the markets, which invariably makes the amount of data

required far larger. The same can be said for funds which are allowed to trade derivatives

and other assets (and even simple equity funds, which can trade stocks listed in various

markets, worldwide). Again, while conceptually the process is the same, the practical

aspects become more complicated. In order to simplify the data gathering and random

portfolio generation process, I choose to analyze a sample of funds that invest primarily in

stocks of firms listed in the S&P500 index.

Funds are selected that consistently maintain positions in S&P500 stocks that equal or

exceed 90% of their assets (i.e., are mostly invested in these stocks) throughout the period

under study, which spans 6 years, from 2005 to 2010.

The data collected then includes the monthly returns of S&P500 stocks, as well as each

fund's monthly returns, and yearly measures of turnover and asset count (number of assets

in the portfolio). Table I contains the sample fund names and Nasdaq tickers, as well as the

average values observed for turnover and asset count measures (which are used as random

portfolio generation constraints, in conjunction with the sample of S&P500 stocks) for the

2005-2010 period. Investment fund quarterly holdings are collected as well. All data is

obtained from CRSP [6].

[Table I about here]

While the overall sample average turnover rate and asset count data is presented for each

fund, the algorithm that produces the samples of random portfolios required to implement

CRPA works better with bounds expressed as ranges of permissible values, as opposed to

the fixed values shown above. Thus, the average minimum and maximum turnover and

asset count for each real portfolio is calculated [7], and these are then used as random

portfolio formation restrictions, in conjunction with the eligible stocks themselves and a

diversification restriction, expressed as a maximum capital allocation to any one stock of

10%.
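
A minimal sketch of how such restrictions can enter the random portfolio generation step is given below. It is not the algorithm used to produce the results in this paper; it is a simple rejection sampler in Python, written under the assumptions that portfolios are long-only, that weights are drawn uniformly over the simplex, and that the turnover bounds, which link consecutive periods, are ignored. The function name, its parameters and the placeholder tickers are hypothetical.

```python
import random


def random_portfolio(universe, n_min, n_max, max_weight=0.10,
                     rng=random, max_tries=10_000):
    """Draw one long-only random portfolio that respects the constraints.

    Turnover limits are not modelled; they would restrict how far the
    weights may move from one period to the next.
    """
    if n_min * max_weight <= 1.0:
        raise ValueError("weight cap cannot be met with so few assets")
    for _ in range(max_tries):
        n_assets = rng.randint(n_min, n_max)         # asset count within range
        tickers = rng.sample(universe, n_assets)     # eligible stocks only
        raw = [rng.expovariate(1.0) for _ in tickers]
        total = sum(raw)
        weights = [x / total for x in raw]           # uniform on the simplex
        if max(weights) <= max_weight:               # diversification cap
            return dict(zip(tickers, weights))
    raise RuntimeError("no feasible portfolio found; relax the constraints")


universe = [f"STOCK{i:03d}" for i in range(500)]     # placeholder tickers
portfolio = random_portfolio(universe, n_min=40, n_max=60,
                             rng=random.Random(42))
print(len(portfolio), round(max(portfolio.values()), 4))
```

Rejection sampling is easy to verify but becomes slow when the permitted asset count is close to the reciprocal of the weight cap, since most draws then violate the cap.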

Although these funds compete in the same market segment, and therefore have very similar

mandates, we can already see that the restrictions faced (or imposed) by each manager can

have large variations. While the average turnover rate for the sample is 1.66, the minimum

[6] The Name column has the complete registered name of each fund, while the Name (short) column

contains an abbreviated designation, which will be used throughout the analysis.

[7] Data available upon request.

reported is 0.13 (Jensen) while the largest is above 10 (Rydex Growth). For Asset Count,

the average number of assets under management is 106, with a minimum of 26 (Jensen) and

a maximum in excess of 500 (Vanguard). This last one could conceivably be hard to

simulate with random portfolios, given that inevitably it will contain stocks not listed in the

S&P500 index. However, funds were chosen by imposing the condition that at least 90% of

their assets be invested in S&P500 stocks. Thus, random funds that only contain these stocks

will still be a close approximation of the assets eligible to the fund manager, while the other

stocks that appear on the reported list at some point must represent very minor holdings.

Table II shows descriptive statistics of each fund's time series of returns. The market

portfolio is included as a benchmark [8].

[Table II about here]

For the six year period between 2005 and 2010, the average Holding Period Return (HPR)

for the sample is 18%, while mean monthly return is close to 0.35%. As with management

restrictions, there is much variability in the sample, with the minimum return being 0.14%

per month (ProFunds) and a maximum of 0.58% per month (SunAmerica). It should be

noted that the market portfolio shows a monthly performance close to the best performing

fund, with most other actively managed funds lagging the market. The median return is

invariably higher than the mean, evincing skewed distributions, a fact which is confirmed

by a relatively high level of negative skewness. Also detected in all funds is excess

kurtosis, which explains why normality of returns is rejected for all funds, at the 1% level in

most cases and at the 5% level in a few (see last column of the table, where the statistic of a

[8] Market portfolio returns are obtained from the Fama & French dataset, which also contains their SMB and

HML factors, all of which are used later to obtain factor model regression alphas.

Jarque-Bera test is shown). The non-normality of returns immediately casts doubt on the

interpretation and accuracy of later performance measures, which rely on normality.
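
For reference, the Jarque-Bera statistic combines sample skewness S and kurtosis K as JB = n/6 (S^2 + (K - 3)^2 / 4), and is asymptotically chi-square distributed with two degrees of freedom under normality. The Python sketch below computes it from a return series; the function name and the example returns are hypothetical, and the p-value relies on the asymptotic distribution, which is only an approximation for a sample of 72 monthly observations.

```python
import math


def jarque_bera(returns):
    """Return (JB statistic, skewness, kurtosis, asymptotic p-value)."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                      # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)            # chi-square survival function, 2 d.o.f.
    return jb, skew, kurt, p_value


# Hypothetical monthly returns with one large loss (negative skew, fat tail).
sample = [0.01] * 10 + [-0.20]
jb, skew, kurt, p_value = jarque_bera(sample)
print(round(skew, 2), round(kurt, 2), p_value < 0.01)
```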

As with average monthly return, risk taking is also highly idiosyncratic in these funds, as

depicted by the standard deviation of the funds' returns. While the average is 5.1%, the

values range from a minimum of 4.22% (Jensen), to a maximum of 7.95% (Rydex Value).

The level of risk-taking will, of course, affect some performance measures, such as the

Sharpe index. It is therefore premature to draw any insight into the funds' qualities, be it

performance or management skill.

Finally, in the next section most tests are carried out using gross returns, as the measure of

skill should, in an absolute sense, be related to the overall performance that a manager can

obtain. However, the investor does not receive the full benefit of these returns, as they are

reduced by the funds' fees and other expenses. Thus, some tests are also performed using

net returns, to analyze how initial results are affected by expenses. Each fund's expenses

are shown in Table III, both the total expenses as self-reported data [9], as well as the expense

ratio obtained from CRSP.

[Table III about here]

5.2 Fund Performance and Tests of Skill

In this section performance and skill tests are applied to the sample of mutual funds, and

the results from each are analyzed and contrasted.

When contemplating an investment in a mutual fund most investors, even those with some

level of financial education, would consider past measures of return sufficient information to

base their decisions on. Thus, fund salespeople will seldom present information beyond

[9] Self-reported data is obtained from each fund's publicly available information, such as prospectuses,

brochures and web pages. These documents and web addresses are available upon request.

holding period return and/or mean monthly return, data which was presented in the

previous section, and is reproduced in Table IV. Also included are the standard deviation, as

some investors would also consider measures of risk, and the Sharpe index as a simple risk-

adjusted measure of return.
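
The statistics named above can be computed from a monthly return series as in the following Python sketch. The function name and the sample returns are hypothetical, and the Sharpe index is taken here to be the mean excess return divided by the sample standard deviation of returns, with a risk-free rate of zero by default; this is one common convention and is stated as an assumption rather than as the exact definition behind Table IV.

```python
import statistics


def performance_summary(monthly_returns, monthly_rf=0.0):
    """Holding period return, mean, standard deviation and Sharpe index."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r                     # compound the monthly returns
    mean = statistics.mean(monthly_returns)
    stdev = statistics.stdev(monthly_returns)
    return {
        "hpr": growth - 1.0,
        "mean": mean,
        "stdev": stdev,
        "sharpe": (mean - monthly_rf) / stdev,
    }


# Hypothetical funds: a higher mean return need not imply a higher Sharpe index.
funds = {
    "steady": [0.010, 0.006, 0.009, 0.007, 0.008, 0.008],
    "volatile": [0.080, -0.050, 0.070, -0.040, 0.060, -0.030],
}
ranking = sorted(funds, key=lambda f: performance_summary(funds[f])["sharpe"],
                 reverse=True)
print(ranking)
```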

[Table IV about here]

While these measures give no clue as to the managers' skills, they are by far the most

employed by management firms in fund marketing and sales, and by investors to choose

between investment options. In order to contrast the decision results based on these

statistics with those of more advanced methodologies, the last column presents a ranking of

funds by their Sharpe indexes. Although in the previous sections we saw that there is

appreciable variability in fund risk and returns, ranking by the Sharpe ratio is similar to

ranking by raw returns. This is perhaps because these funds operate under similar mandates

and in the same market niche, prompting a sufficiently similar risk-taking behavior to make

this variable have little impact when correcting returns to take it into account. Regarding

performance itself, as can be seen in the Sharpe or Ranking columns the best fund is

SunAmerica, while the market portfolio is the second best performing fund in the sample.

While this has been previously reported, it is bad news for fund management, as

passive management is consistently cheaper (in terms of transaction costs and fees) than

active management, so if the passive market portfolio performs better, then there would

seem to be very little evidence in favor of active management.

Previous studies make extensive use of factor models to estimate regression alphas. These

alphas have been interpreted as a performance measure (as is the case, for example, of

Jensen's alpha), but increasingly they have come to represent fund manager skill in the

prevalent literature. Table V shows the alphas obtained for each fund under study, using

unconditional versions of the single factor model (as used to obtain Jensen's alpha), Fama

& French's three factor model, and Carhart's four factor model [10].

[Table V about here]

Also as reported previously, most alphas turn out to be insignificant. The two exceptions

are WaMu and ProFunds, which exhibit alphas that are negative and significant. If

the standard skill interpretation of regression alphas were to be employed, then we could

say that the managers of these funds actively subtract value through their actions, as

opposed to adding value (which would be the interpretation of a positive and significant

alpha).
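
To make the alpha and its significance concrete, the single factor case can be written out by hand. The Python sketch below regresses fund excess returns on market excess returns by ordinary least squares and reports the intercept (Jensen's alpha), the slope and the conventional standard error of the intercept. The function name and the data are hypothetical; the three and four factor versions used in Table V would add further regressors and are not shown.

```python
import math


def single_factor_alpha(fund_excess, market_excess):
    """OLS of fund excess returns on market excess returns (one factor)."""
    n = len(fund_excess)
    mx = sum(market_excess) / n
    my = sum(fund_excess) / n
    sxx = sum((x - mx) ** 2 for x in market_excess)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(market_excess, fund_excess))
    beta = sxy / sxx
    alpha = my - beta * mx
    resid = [y - alpha - beta * x
             for x, y in zip(market_excess, fund_excess)]
    s2 = sum(e * e for e in resid) / (n - 2)          # residual variance
    se_alpha = math.sqrt(s2 * (1.0 / n + mx * mx / sxx))
    return {"alpha": alpha, "beta": beta, "se_alpha": se_alpha}


# Hypothetical monthly excess returns.
market = [0.02, -0.01, 0.03, -0.02, 0.01, 0.00, 0.04, -0.03]
fund = [0.021, -0.015, 0.034, -0.027, 0.010, -0.003, 0.045, -0.038]
fit = single_factor_alpha(fund, market)
t_stat = fit["alpha"] / fit["se_alpha"]               # compare with t(n - 2)
print(round(fit["alpha"], 5), round(fit["beta"], 3), round(t_stat, 2))
```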

It should also be noted that, while these two funds with negative and significant alphas have

very low Sharpe indexes compared to the rest of the sample, the correlation between

negative alpha and poor performance is not perfect, as WaMu ranks 18th but, for example,

Rochdale Value and Ameristock rank 19th and 20th respectively, and their alphas are

negative but insignificant, like those of most other funds.

Notwithstanding the popularity and extensively documented applications of factor models,

as reported in the first section, factor model alphas have been criticized and new

methodologies proposed to obtain better measures of fund manager skill, the main

contender being Kosowski et al.'s Bootstrap Alpha. Though employed to evaluate a full

market of funds, this methodology has since been applied by Cuthbertson et al. to test

individual funds for manager skill. Following their methodology, I test the 20 funds in the

sample for fund manager skill using the bootstrap alpha methodology. As in Cuthbertson et

al., I use two separate (though complementary) hypotheses to test for significance on both

tails of the resulting empirical distributions:

[10] The market portfolio is not included in the table as, by definition, its alpha should be zero.

Hypothesis A: fund manager has skill or adds value,

HA: $H_0: \alpha \le 0$, $H_1: \alpha > 0$ (15)

Hypothesis B: fund manager has negative skill or actively destroys value,

HB: $H_0: \alpha \ge 0$, $H_1: \alpha < 0$ (16)

Table VI shows the funds' real Carhart four factor alphas, as well as the empirical p-values

obtained from each fund's bootstrapped distribution for both hypotheses.
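
The logic of a residual bootstrap under the null of zero alpha can be sketched as follows. This Python example is a simplified illustration and not the procedure behind Table VI: it uses a single market factor instead of the Carhart four factor model, it bootstraps the alpha itself and not its t-statistic, and all names and data are hypothetical. Pseudo returns are built with the estimated beta, a zero intercept and resampled residuals, and the fund's actual alpha is then located in the resulting distribution of bootstrapped alphas.

```python
import random


def ols_alpha_beta(y, x):
    """Intercept and slope of a one-regressor least squares fit."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - beta * mx, beta


def bootstrap_alpha_pvalues(fund_excess, market_excess, n_boot=1000, seed=0):
    """Empirical p-values for positive skill (A) and negative skill (B)."""
    rng = random.Random(seed)
    alpha_hat, beta_hat = ols_alpha_beta(fund_excess, market_excess)
    resid = [y - alpha_hat - beta_hat * x
             for x, y in zip(market_excess, fund_excess)]
    boot_alphas = []
    for _ in range(n_boot):
        # Pseudo returns under the null: zero alpha, resampled residuals.
        pseudo = [beta_hat * x + rng.choice(resid) for x in market_excess]
        boot_alphas.append(ols_alpha_beta(pseudo, market_excess)[0])
    p_a = sum(a >= alpha_hat for a in boot_alphas) / n_boot   # right tail
    p_b = sum(a <= alpha_hat for a in boot_alphas) / n_boot   # left tail
    return alpha_hat, p_a, p_b


# Hypothetical fund that lags the market by 1% per month on average.
data_rng = random.Random(1)
market = [data_rng.gauss(0.005, 0.04) for _ in range(72)]
fund = [-0.01 + m + data_rng.gauss(0.0, 0.01) for m in market]
alpha_hat, p_a, p_b = bootstrap_alpha_pvalues(fund, market)
print(round(alpha_hat, 4), p_a, p_b)
```

A small left-tail p-value together with a large right-tail p-value corresponds to the pattern described for most funds in Table VI.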

[Table VI about here]

While it is not surprising that the null of no skill is not rejected for any of the funds (see

HA pval), the last column shows that a startling number of fund alphas (19 out of the total

20) are negative and significant at the 1% level, indicating value-destroying management.

Shocking though these results may seem, they do seem coherent in conjunction with the

previously studied statistics. Specifically, if we assume the market to have zero alpha, then

if most of these funds tend to lag the market in terms of performance (both raw and risk

adjusted), it is not surprising that their alphas should be negative. As to the statistical

significance of these alphas, regular tests are at odds with the bootstrap analysis, but the

general trend is clear and consistent.

The real question here is whether we are actually measuring skill, or these are still measures

of performance, so influenced by extraneous factors that the existence of the fund

manager's skill cannot be ascertained. That is, these are all measurements obtained from

factors related to market and other portfolios' performance, and as such are more akin to

benchmarks than true measures of individual skill, which, while related to observable

performance, would not be determined by it.

Next, I apply CRPA measures to the sample of mutual funds. Table VII contains the

resulting empirical p-values obtained from both the Burns CRPA measure and the three

tests used to determine stochastic order in the Percentile CRPA test.
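
As a reminder of how one of these tests operates, the Python sketch below implements a one-sided Mann-Whitney-Wilcoxon (MWW) test with the large-sample normal approximation. It is a generic textbook version, with average ranks for ties and no tie or continuity correction, and is not the code used for Table VII. Which random portfolio series the fund's monthly returns are compared against is defined elsewhere in the methodology, so the two input samples here are simply labelled a and b, and the data are hypothetical.

```python
import math


def mww_one_sided(sample_a, sample_b):
    """U statistic and p-value for H1: sample_a tends to exceed sample_b."""
    pooled = sorted([(v, 0) for v in sample_a] + [(v, 1) for v in sample_b])
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1                                   # block of tied values
        avg_rank = (i + 1 + j) / 2.0                 # mean of ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j)
                                     if pooled[k][1] == 0)
        i = j
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    mean_u = n_a * n_b / 2.0
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    return u_a, 0.5 * math.erfc(z / math.sqrt(2.0))  # upper-tail normal p-value


# Hypothetical monthly returns of a fund and of a comparison portfolio.
fund = [0.012, 0.009, 0.015, 0.011, 0.010, 0.013]
comparison = [0.004, -0.010, 0.008, 0.002, -0.003, 0.006]
u_stat, p_value = mww_one_sided(fund, comparison)
print(u_stat, round(p_value, 4))
```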

[Table VII about here]

As can be observed, in this sample of 20 mutual funds the null of no skill is not rejected in

most cases [11]. However, for the Jensen fund all variants of the CRPA measure reject the null

at the 5% level or better, while for 6 other funds only the Burns measure rejects the null.

The power tests in the first chapter of this dissertation show that the Burns measure can

reject the null in the presence of a fund managed with no skill, but with a sufficiently high

overall (holding period) return, since this measure analyzes only such returns, as opposed to

the way in which the return is composed, that is, the distribution of partial returns (in the

case of the referred tests, the time series of monthly returns). Thus, the recommendation

gleaned from the power tests is to reject the null only when both CRPA measures do so [12].

Looking at the results in Table VII, the obvious inference is that only the Jensen fund is

truly managed with skill, while the funds where only the Burns measure rejects the null

managed to obtain a holding period return large enough to put them at the extreme of the

random portfolio distribution.
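
The reason a holding period return alone cannot separate the two cases can be seen in a small Python sketch. It assumes, as a simplification, that the Burns-type empirical p-value is the share of random portfolios whose holding period return is at least as large as the fund's; the function names, the decision helper and all return series are hypothetical. Two funds with the same holding period return, one earned steadily and one earned in a single month, receive the same p-value, which is why the recommendation is to require agreement with the Percentile CRPA test.

```python
def holding_period_return(monthly_returns):
    """Compound return over the whole sample period."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth - 1.0


def burns_p_value(fund_returns, random_portfolio_returns):
    """Share of random portfolios that match or beat the fund's HPR."""
    fund_hpr = holding_period_return(fund_returns)
    random_hprs = [holding_period_return(p) for p in random_portfolio_returns]
    return sum(h >= fund_hpr for h in random_hprs) / len(random_hprs)


def skill_detected(p_burns, p_percentile, level=0.05):
    """Reject the null of no skill only when both CRPA measures reject it."""
    return p_burns < level and p_percentile < level


# Hypothetical random portfolios (constant monthly returns for simplicity).
random_portfolios = [[0.00] * 6, [0.01] * 6, [0.10] * 6]
lucky = [0.50, 0.0, 0.0, 0.0, 0.0, 0.0]          # one lucky month
steady = [1.5 ** (1.0 / 6.0) - 1.0] * 6          # same HPR, earned evenly
print(burns_p_value(lucky, random_portfolios),
      burns_p_value(steady, random_portfolios))
```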

Comparisons of the time series distributions of these funds' returns prove to be

enlightening.

[11] Skill in this case refers to the ability to add value for the investors. CRPA analysis is not used here to

test the other tail of the distributions, to ascertain if there is value-destroying behavior, as previously

reported with bootstrap alphas.

[12] The Vanguard fund might also be a candidate for a skillful manager, since the null is also rejected by the

MWW test, the most sensitive test used to discriminate between the random portfolio percentile and the

real fund's distribution.

In Table II we can see that while the Jensen fund has a high holding period return (the

statistic used in the Burns measure) compared to the rest of the sample, it is not the highest

(which belongs to Seligman Value). However, the Jensen fund reaches the second highest

overall return while maintaining the lowest volatility of returns, as seen in its standard

deviation (4.22% per month versus Seligman Value's 5.72%, the highest in the sample).

This is evidence that the Jensen fund achieves its performance through steady returns which

are more likely attributable to superior skill, as opposed to distributions with a few periods of

high return and more periods of low returns, which lead to higher volatility and can be

interpreted as luck. This interpretation is bolstered by the fact that these funds operate in the

same market, and have very similar mandates (which would not be the case if, for example,

we were comparing equity and bond funds).

To illustrate the above conjecture, Figure 7 displays the probability density of the time series of

fund returns, comparing the densities of the Jensen fund (full line), ranked 4th in the

sample, with that of the best ranked fund, SunAmerica (dashed line).

[Figure 7 about here]

While these distributions' centers seem to be close (mean monthly return for Jensen is

0.4%, compared to SunAmerica's 0.6%), the higher volatility of the SunAmerica fund is

clearly seen as a lower peak in the probability mass, and fatter tails, depicting a fund that

may have attained single high returns in certain periods, but is less likely to obtain similar

future performance (i.e., lower probability of obtaining a result close to its historical mean).

A final note concerns the use of the evidence presented in a hypothetical decision making

process. If the investor is presented with the usual performance statistics, the decision

would inevitably be to invest in the highest ranked fund, that is, in SunAmerica [13]. If the

decision is to be based on factor model alphas, then no fund appears to be superior to the

market portfolio, whereas if bootstrapped alphas are contemplated, then all of these funds

would definitely be discarded as potential investment alternatives. Only CRPA tests show

any glimmer of hope for these actively managed funds. Using these results, an investor

would consider investing in the Jensen fund, pending further analysis of costs and fees, to

compare it with a passive strategy.

While fund expenses should not be contemplated in a pure analysis of manager skill, they

do impinge on an investor's decision making process. Thus, the question is, how sensitive

are the previously derived measures to the addition of expenses? In Table VIII the CRPA-

based statistics presented in Table VII are reproduced, but recalculated using fund returns

net of expenses.
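
One simple way of moving from gross to net returns is sketched below in Python. It assumes that the annual expense ratio is charged in twelve equal monthly instalments, which is an approximation introduced here for illustration and not necessarily the adjustment applied to obtain Table VIII; the function name and the figures are hypothetical.

```python
def net_of_expenses(gross_monthly_returns, annual_expense_ratio):
    """Deduct one twelfth of the annual expense ratio from each monthly return."""
    monthly_charge = annual_expense_ratio / 12.0
    return [r - monthly_charge for r in gross_monthly_returns]


# Hypothetical fund with a 1.2% annual expense ratio (0.1% per month).
gross = [0.010, -0.020, 0.015]
net = net_of_expenses(gross, 0.012)
print([round(r, 4) for r in net])
```

The same tests are then applied to the net series, so any change in the resulting p-values is attributable to expenses alone.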

[Table VIII about here]

This analysis, made from the point of view of the investor, shows that where the Jensen

fund was previously identified as skillfully managed by most tests, it now only registers

significance in the Burns cross-sectional test, at first glance making inference about

manager skill unreliable. The correct interpretation of this result is that, though the manager

appears to be skillful, expenses lower the expected benefit to the investor to the point where

she is equally well off investing randomly by herself (and thus avoiding the charges). In

other words, and as has been concluded in previous articles, any overperformance of the

fund seems to be charged away from the investor, so that the benefits of the manager's skill

are enjoyed only by the brokerage firm and/or the manager himself.

[13] Unless expenses and fees are also considered, in which case perhaps a passive, market index fund would

beat all alternative strategies.

Finally, consider the analysis that could be made of this data by the fund managers

themselves. The Jensen fund seems to be the only skillfully managed portfolio in the

sample, but it does not obtain the best returns. While luck inevitably plays a part in all

financial results, further analysis can be made contrasting the management constraints faced

by the Jensen manager to those imposed on other managers of similar funds. As was seen in

the previous section, the Jensen fund has the lowest average turnover ratio of the sample, as

well as the lowest asset count. In the first chapter of this dissertation it is shown that for

samples of random funds (where all other variables are controlled for), differences in

management constraints can have an impact in the resulting return distributions. As an

example, for relatively low levels of turnover, holding either a low or a high number of assets under

management is dominated as a strategy by holding a mid-range number. This points to potential areas

of improvement worth investigating. Perhaps the only thing the Jensen fund must do to

improve performance is to use the already detected skill to manage a larger number of assets,

compared to its present level.

One further point is raised in Lisi (2011), who implements a measure of skill based on

random portfolios, but instead of the CRPA methodology, generates simple equal-weighted

portfolios from randomly selected stocks traded in the Italian market. Lisi then goes on to

apply risk adjustment and other measures to the sample of random funds before using them

to make statistical tests. However, the CRPA methodology makes these adjustments

unnecessary. Consider the fact that portfolio risk is a function of manager decisions,

coupled with management restrictions. Thus, adding fund manager constraints to the

random portfolio generation algorithm eliminates the need for further processing of the

resulting sample, be it the use of risk adjustment or factor models. That is, risk is already

restricted to the potentially available portfolios, and applying risk adjustment measures

should not alter the inference obtained from a CRPA analysis. I confirm this with

further tests in which I apply the CRPA tests to the portfolios' Sharpe measures and

regression alphas. The results, withheld for brevity [14], do not change the conclusions

described above. That is, there is no added value obtained from applying further measures,

and the correct application of CRPA to simple portfolio returns suffices.

[14] Available upon request.

6. Conclusions

A new general framework for investment fund analysis using randomly trading portfolios is

outlined and one application, a test of fund manager skill, is developed and fully studied.

The skill test based on CRPA is found to be a powerful and appealing alternative to

traditional methods. On one hand, the statistical properties of the resulting distributions are

free from various assumption problems and biases long recognized in other families of

tests, in particular, the problems of parametric regression-based measures. On the other

hand, while fund manager skill is the focus of this paper, the implications and potential

applications of this methodology are extremely varied.

As an empirical application of the CRPA-based test of skill, a sample of U.S. large cap

mutual funds is analyzed from the point of view of a prospective investor. Standard

performance measures usually employed to make investment decisions are estimated, and

tests of skill are applied, where the null hypothesis for all tests is that managers have no

skill. The results obtained from standard and bootstrap regression alpha methodologies are,

at best, inconclusive, and in the worst cases show alphas which are negative and significant,

signifying negative skill or a reduction in portfolio value attributable to the managers'

actions. CRPA skill tests of the sample of funds reveal that, while the null is not rejected

for most of them, skill can be tentatively identified in a few. Results are of further interest

because, unlike the case of regression alphas, CRPA test results are not necessarily

correlated with performance measures. In fact, the one fund where all CRPA tests reject the

null is not the best performing fund in the sample (in terms of returns and Sharpe ratio), but

the fourth best. While the CRPA skill tests detect skill in some funds, once fund fees are

deducted from their returns, the test fails to reject the null of no skill, which is consistent

with previous literature that shows that, while there may be some value added by a few

money managers, this value is charged away from the investors.

The application of the CRPA skill test described above would serve as a guide for investor

decision making. However, the same test can be used in other applications.

As a diagnostic tool for fund management firms, we can consider the case in which the null

is rejected for a fund manager, and therefore we can consider this manager as possessing

skill, but nevertheless the fund's performance lags behind a benchmark or peer group.

Further analysis of the trading constraints could, in theory, pinpoint areas where the

manager is over- or under-restricted, with respect to the competition. If these bounds can be

changed (for example, allow the manager to take on more risk, or to increase the frequency

of trades), then performance could be easily improved. More generally, if a group of

managers with certain mandates lags in performance with respect to others with a different

set of goals, then perhaps what is being uncovered is a systematic market anomaly. An

example of a known anomaly, size, might manifest as investors in small firms obtaining

better returns than those who invest in large firms, after controlling for manager skill.

Perhaps CRPA could help discover other, hitherto unreported, anomalies.

There are also implications for the fund manager job market: would hiring be based on

track record alone and other second-hand sources of data, if skill could be measured

reliably? Also, fund charges to investors could potentially be analyzed and based on

manager skill, which is a more direct link to potential performance than other measures.

CRPA provides a robust new framework for various types of investment fund analysis,

including testing for fund manager skill. The results shown here confirm that CRPA tests

are more sensitive in detecting skill than factor model based tests, and their results are

also easier to interpret. No risk adjustment is required, as in other measures, and

potential econometric problems such as non-normalities are not an issue, due to the use of

non-parametric statistics.

Further applications of CRPA include, for example, testing for sources of manager skill

once the null of no skill is rejected. Simple and intuitive statistics can be employed in

conjunction with random portfolio methodology to test for stock picking and asset

allocation abilities. Beyond fund manager skill, CRPA can be employed as a non-

parametric alternative to traditional measures which can be severely weakened in their

applicability by specification problems. One example is measures of market herding, which

suffer from the lack of a distribution under the null hypothesis of no herding. CRPA-

generated markets can serve to obtain an empirical distribution in which there is no

herding, against which to benchmark the resulting herding statistic for a more precise

measure of statistical significance.

In all, CRPA is an important addition to the finance analysis toolkit.

List of References

Burns, P., 2007, Random Portfolios for Performance Measurement, in Erricos John

Kontoghiorghes & Cristian Gatu eds.: Optimization, Econometric and Financial

Analysis (Springer).

Burns Statistics (2011). PortfolioProbe: Portfolio Probe. R package version

1.03. http://www.burns-stat.com/.

Carhart, M., 1997, On Persistence in Mutual Fund Performance, Journal of Finance,

Vol. 52, No. 1, 57-82.

Christopherson, J., Ferson, W., Glassman, D., 1998, Conditioning Manager Alphas on

Economic Information: Another Look at the Persistence of Performance, Review of

Financial Studies, Vol.11, No. 1, 111-142.

Cuthbertson, K., Nitzsche, D., O'Sullivan, N., 2008, UK mutual fund performance: skill

or luck?, Journal of Empirical Finance 15, 613-634.

Dawson, R., R. Young, 2003, Near-uniformly distributed, stochastically generated

portfolios, in Stephen Satchell & Alan Scowcroft eds.: Advances in Portfolio

Construction and Implementation (Butterworth-Heinemann Finance).

Fama, E., K. French, 2010, Luck versus skill in the cross-section of mutual fund

returns, Journal of Finance, Vol. LXV-5, 1915-1947.

Ferson, W., R. Schadt, 1996, Measuring fund strategy and performance in changing

economic conditions, Journal of Finance, Vol. 51, No. 2, 425-461.

Kosowski, R., Timmermann, A., Wermers, R., White, H., 2006, Can Mutual Fund Stars

Really Pick Stocks? New Evidence from a Bootstrap Analysis, Journal of Finance,

Vol. LXI, No. 6, 2551-2595.

Lisi, F., 2011, Dicing with the market: randomized procedures for evaluation of mutual

funds, Quantitative Finance, Vol. 11, No. 2, 163-172.

R Development Core Team (2011). R: A language and environment for statistical

computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-

900051-07-0, URL http://www.R-project.org/.

Shaked, M. and J. G. Shanthikumar, 1994, Stochastic Orders and their Applications,

Academic Press.

Silli, B., 2006, Modern Approaches in the Evaluation of Management Skill in the

Mutual Fund Industry (working paper)

Sharpe, W., 1992, Asset Allocation: Management Style and Performance Measurement,

Journal of Portfolio Management, Vol. 18, No. 2, 7-19

Appendix I: Properties of Random Portfolio Samples for Power Tests

Property 1: The expected return for all funds is the same, whether skillful or lucky.

We can see this by taking expectation in (8) and (9). For both equations we have that

() =

10

Property 2: Lucky portfolios have higher variance than skillful ones.

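I calculate the variance of each type of fund, writing $r$ for the baseline random portfolio return, $\varepsilon$ for the error term, and $k$ and $\lambda$ for the return added to skillful and lucky funds respectively. For the skillful funds we have

$$\sigma_S^2 = \sigma_r^2 + \sigma_k^2 + \sigma_\varepsilon^2 + 2\sigma_{rk} + 2\sigma_{r,\varepsilon} + 2\sigma_{k,\varepsilon} \tag{11}$$

But $\sigma_\varepsilon^2 = \sigma_r^2$ and, because the return added to a skillful fund is the same in every period, every term involving $k$ is zero; thus

$$\sigma_S^2 = 2\sigma_r^2 + 2\sigma_{r,\varepsilon} \tag{11}$$

On the other hand, for the lucky funds,

$$\sigma_L^2 = \sigma_r^2 + \sigma_\lambda^2 + \sigma_\varepsilon^2 + 2\sigma_{r\lambda} + 2\sigma_{r,\varepsilon} + 2\sigma_{\lambda,\varepsilon} \tag{12}$$

$$\sigma_L^2 = 2\sigma_r^2 + \sigma_\lambda^2 + 2\sigma_{r\lambda} + 2\sigma_{r,\varepsilon} + 2\sigma_{\lambda,\varepsilon} \tag{13}$$

The difference between these two variances is

$$\sigma_L^2 - \sigma_S^2 = \sigma_\lambda^2 + 2\sigma_{r\lambda} + 2\sigma_{\lambda,\varepsilon} \tag{14}$$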

Now, looking at the terms on the right-hand side of this equation, we can see that the first

term is always positive, since it is a variance, and the second term is also positive, as

just

plus a vector of zero or positive constants. The last term, the covariance between the added return vector and the error term, can be positive or negative. However, simulation shows that the probability that the complete expression (the sum of the three terms) is negative is very low (see footnote 15). Thus, the difference in variances tends to be positive, and therefore lucky portfolios tend to have a higher variance than skillful portfolios.

15

Full results of this Monte Carlo test are not reported in the interest of brevity, but are available upon

request. In short, the average difference of the variances obtained from 1,000 Monte Carlo iterations is

almost always positive. The only negative values appear in some samples when the added return factor is

set to zero. In this case both samples are drawn from the same distribution, and thus the difference of

variances has a 50/50 chance of being positive or negative.
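The Monte Carlo check described in this footnote can be illustrated with a minimal sketch. It is not the code used in the paper, and every modelling choice in it is an assumption: baseline returns and errors are drawn as independent normal variables with equal variance, a skillful fund receives the extra return in every month, and a lucky fund receives the same total extra return concentrated in a few randomly chosen months. The function name and all parameter values are hypothetical.

```python
import random
import statistics

def simulate_variance_gap(extra=0.01, months=72, lucky_months=5,
                          sigma=0.05, iterations=1000, seed=0):
    """Average of (lucky variance - skillful variance) over Monte Carlo draws."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(iterations):
        # Assumed data-generating process: i.i.d. normal baseline and error.
        base = [rng.gauss(0.005, sigma) for _ in range(months)]
        noise = [rng.gauss(0.0, sigma) for _ in range(months)]
        # Skill: the same extra return every month (a constant vector).
        skill = [b + extra + e for b, e in zip(base, noise)]
        # Luck: the same total extra return, packed into a few months.
        lucky_idx = set(rng.sample(range(months), lucky_months))
        bump = extra * months / lucky_months
        luck = [b + (bump if t in lucky_idx else 0.0) + e
                for t, (b, e) in enumerate(zip(base, noise))]
        gaps.append(statistics.variance(luck) - statistics.variance(skill))
    return statistics.mean(gaps)

if __name__ == "__main__":
    print(simulate_variance_gap())           # positive on average
    print(simulate_variance_gap(extra=0.0))  # 0.0: both samples coincide
```

In this sketch the two fund types share the same baseline and error draws, so with a zero added return the gap is exactly zero rather than positive or negative with equal probability, as it would be for two independently drawn samples.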

Table I

Identification data, average yearly turnover rate and asset count for funds in sample.

The yearly turnover ratio is the dollar value of all trades occurring in each year (buy and

sell) divided by the total value of assets at the beginning of the year. The figure shown is

the average turnover ratio for the 6 year period studied. Similarly, asset count is the 6 year

average of the yearly number of assets in each portfolio.

Nasdaq Ticker | Fund Name | Name (short) | Turnover Ratio | Asset Count
CBXCX | Calamos Investment Trust: CALAMOS Blue Chip Fund | Calamos | 0.46 | 108
JAMEX | Williamsburg Investment Trust: Jamestown Equity Fund | Jamestown | 0.52 | 65
JENSX | Jensen Portfolio, Inc | Jensen | 0.13 | 26
NOLVX | Northern Funds: Large Cap Value Fund | Northern | 0.45 | 48
SGRCX | Seligman Growth Fund, Inc. | Seligman Growth | 1.57 | 64
SVLCX | Seligman Value Fund Series, Inc: Seligman Large-Cap Value Fund | Seligman Value | 0.23 | 35
FDSTX | SunAmerica Focused Series, Inc: Focused Dividend Strategy Portfolio | SunAmerica | 1.69 | 30
VTGIX | Vanguard Tax-Managed Funds: Vanguard Tax-Managed Growth & Income Fund | Vanguard | 0.14 | 525
ACGKX | Van Kampen Growth & Income Fund: Growth & Income Fund | Van Kampen | 0.36 | 74
WSHCX | Washington Mutual Investors Fund, Inc | WaMu | 0.22 | 132
AMSTX | Ameristock Mutual Fund, Inc | Ameristock | 0.20 | 37
LOMAX | Advisors Series Trust: Edgar Lomax Value Fund | Lomax | 0.52 | 48
DDVCX | Delaware Group Equity Funds II: Delaware Value Fund | Delaware | 0.28 | 34
HGKEX | Advisors' Inner Circle Fund: HGK Equity Value Fund | HGK | 0.53 | 48
RIMGX | Rochdale Investment Trust: Rochdale Large Growth Portfolio | Rochdale Growth | 0.52 | 77
RIMVX | Rochdale Investment Trust: Rochdale Large Value Portfolio | Rochdale Value | 0.55 | 90
BHGSX | Baird Funds, Inc: Baird LargeCap Fund | Baird | 0.46 | 49
LVPIX | ProFunds: Large-Cap Value ProFund | ProFunds | 6.64 | 337
SFECX | Rydex Series Funds: Large-Cap Growth Fund | Rydex Growth | 10.08 | 144
SEGIX | Rydex Series Funds: Large-Cap Value Fund | Rydex Value | 7.67 | 152

Table II

Summary statistics of fund sample returns

HPR is the holding period return for the 6 year period under study. Mean, Median, Standard

Deviation (St. Dev.), Skewness and Kurtosis are calculated for each fund based on their monthly

returns. The last column shows the Jarque-Bera test of normality statistic for each fund, with

significance being denoted with *, ** and *** for a 10%, 5% and 1% level, respectively.

Portfolio        HPR   Mean    Median  St. Dev.  Skew     Kurtosis  Jarque-Bera
Market           0.32  0.0052  0.0117  0.0508    -0.8903  4.7215    18.15***
Calamos          0.21  0.0038  0.0119  0.0472    -0.8878  4.8158    19.08***
Jamestown        0.15  0.0029  0.0095  0.0444    -1.1266  5.4389    32.62***
Jensen           0.25  0.0041  0.0092  0.0422    -0.8768  5.215     23.61***
Northern         0.14  0.0032  0.0119  0.0526    -0.7402  4.5334    13.44***
Seligman Growth  0.22  0.0042  0.0066  0.0534    -0.9104  4.7767    19.15***
Seligman Value   0.26  0.0049  0.0096  0.0572    -0.737   5.0373    18.71***
SunAmerica       0.35  0.0058  0.01    0.0559    -0.0419  5.6083    20.15***
Vanguard         0.20  0.0037  0.0125  0.048     -0.8513  4.4199    14.54***
Van Kampen       0.22  0.0039  0.0082  0.0474    -0.7083  3.7443    7.58**
WaMu             0.11  0.0025  0.0106  0.0443    -1.0576  5.0582    25.77***
Ameristock       0.09  0.0022  0.0102  0.0451    -0.6409  4.1218    8.58**
Lomax            0.17  0.0034  0.0135  0.0489    -0.9321  4.4941    16.89***
Delaware         0.18  0.0034  0.011   0.0443    -1.0189  4.2503    16.91***
HGK              0.20  0.0038  0.0134  0.049     -1.1034  5.2559    29.46***
Rochdale Growth  0.15  0.0034  0.0092  0.0535    -0.632   4.589     12.2***
Rochdale Value   0.08  0.0026  0.0119  0.0545    -0.9462  5.4183    27.89***
Baird            0.14  0.0032  0.0044  0.0515    -0.7186  5.5828    25.85***
ProFunds         0.00  0.0014  0.0115  0.0521    -0.8977  4.404     15.37***
Rydex Growth     0.23  0.0044  0.0036  0.0547    -0.5716  4.9627    15.26***
Rydex Value      0.02  0.0034  0.0096  0.0795    0.1027   6.6613    39.78***
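As an illustration of how the last column relates to the others, the Jarque-Bera statistic can be computed from the sample size, skewness and kurtosis as $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$, and under normality it is asymptotically chi-squared with two degrees of freedom (critical values of roughly 4.61, 5.99 and 9.21 at the 10%, 5% and 1% levels). The sketch below assumes $n = 71$ monthly return observations; this figure is not stated in the table and is an assumption chosen because it reproduces the reported values from the rounded skewness and kurtosis. The function names are hypothetical.

```python
def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness and (non-excess) kurtosis."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def significance_stars(jb):
    """Stars from chi-squared(2) critical values at the 10%, 5% and 1% levels."""
    if jb > 9.21:
        return "***"
    if jb > 5.99:
        return "**"
    if jb > 4.61:
        return "*"
    return ""

if __name__ == "__main__":
    # Market row of Table II, with the assumed n = 71 observations.
    jb = jarque_bera(71, -0.8903, 4.7215)
    print(round(jb, 2), significance_stars(jb))  # 18.15 ***
```

With the same assumed sample size, the Calamos row (skewness -0.8878, kurtosis 4.8158) gives approximately 19.08, again matching the table.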

Table III

Fund expenses

Self Reported total expenses obtained from fund publications (prospectuses, web sites, etc.). Expense

Ratio data obtained from CRSP. Both measures reported are yearly costs, as percentage of assets.

Portfolio Self Reported Expense Ratio

Calamos 0.0235 0.0123

Jamestown 0.0113 0.0110

Jensen 0.0125 0.0107

Northern 0.0110 0.0114

Seligman Growth 0.0197 0.0163

Seligman Value 0.0215 0.0189

SunAmerica 0.0095 0.0104

Vanguard 0.0155 0.0128

Van Kampen 0.0150 0.0112

WaMu 0.0149 0.0104

Ameristock 0.0091 0.0057

Lomax 0.0099 0.0108

Delaware 0.0185 0.0137

HGK 0.0099 0.0086

Rochdale Growth 0.0150 0.0218

Rochdale Value 0.0152 0.0207

Baird 0.0100 0.0179

ProFunds 0.0273 0.0127

Rydex Growth 0.0218 0.0239

Rydex Value 0.0190 0.0167

Table IV

Standard portfolio performance measures

HPR are holding period returns obtained from portfolio data over a period of 6 years.

Mean is the portfolio's average monthly return, while St. Dev. is the standard deviation of

those returns. Sharpe is the fund's Sharpe ratio for the period under study. Rank

corresponds to the rank each fund holds in the sample, ordered by their Sharpe ratios.

Name HPR Mean St. Dev. Sharpe Rank

Market 0.32 0.005 0.051 0.064 2

Calamos 0.21 0.004 0.047 0.040 8

Jamestown 0.15 0.003 0.044 0.022 16

Jensen 0.25 0.004 0.042 0.050 4

Northern 0.14 0.003 0.053 0.024 14

Seligman Growth 0.22 0.004 0.053 0.043 6

Seligman Value 0.26 0.005 0.057 0.052 3

SunAmerica 0.35 0.006 0.056 0.068 1

Vanguard 0.20 0.004 0.048 0.037 10

Van Kampen 0.22 0.004 0.047 0.041 7

WaMu 0.11 0.003 0.044 0.013 18

Ameristock 0.09 0.002 0.045 0.006 20

Lomax 0.17 0.003 0.049 0.031 12

Delaware 0.18 0.003 0.044 0.032 11

HGK 0.20 0.004 0.049 0.038 9

Rochdale Growth 0.15 0.003 0.054 0.027 13

Rochdale Value 0.08 0.003 0.055 0.012 19

Baird 0.14 0.003 0.052 0.024 15

ProFunds 0.00 0.001 0.052 -0.010 21

Rydex Growth 0.23 0.004 0.055 0.046 5

Rydex Value 0.02 0.003 0.080 0.018 17
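The Sharpe ratio and rank columns can be sketched as follows. This is a minimal illustration, not the computation behind the table: the risk-free rate is not stated in this section, so it is left as a parameter, and the function names and the sample data in the usage example are hypothetical.

```python
import statistics

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)

def rank_by_sharpe(funds, risk_free=0.0):
    """funds maps a name to its list of monthly returns; best Sharpe ratio first."""
    return sorted(funds,
                  key=lambda name: sharpe_ratio(funds[name], risk_free),
                  reverse=True)

if __name__ == "__main__":
    # Made-up monthly returns, for illustration only.
    sample = {
        "Fund A": [0.02, 0.01, 0.03, 0.02],
        "Fund B": [0.01, -0.02, 0.03, 0.00],
    }
    for position, name in enumerate(rank_by_sharpe(sample), start=1):
        print(position, name, round(sharpe_ratio(sample[name]), 3))
```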

Table V

Factor model alphas

Regression alphas obtained from a one factor model (Jensen's alpha), Fama & French's

three factor model and Carhart's four factor model. Significance is denoted with *, ** and

*** for a 10%, 5% and 1% level, respectively.

Portfolios Jensen Fama & French Carhart

Calamos -0.0011 -0.0007 -0.0007

Jamestown -0.0018 -0.0016 -0.0016

Jensen -0.0004 -0.0002 -0.0002

Northern -0.002 -0.0019 -0.002

Seligman Growth -0.001 -0.0006 -0.0006

Seligman Value -0.0005 -0.0002 -0.0003

SunAmerica 0.0005 0.0003 0.0002

Vanguard -0.0013* -0.001* -0.001*

Van Kampen -0.001 -0.0005 -0.0005

WaMu -0.0022* -0.0017 -0.0017

Ameristock -0.0025 -0.0022 -0.0022

Lomax -0.0015 -0.001 -0.001

Delaware -0.0012 -0.0008 -0.0008

HGK -0.0012 -0.0008 -0.0008

Rochdale Growth -0.0017 -0.0015 -0.0014

Rochdale Value -0.0027 -0.0025 -0.0023

Baird -0.002 -0.0022 -0.0023

ProFunds -0.0038** -0.0035*** -0.0035***

Rydex Growth -0.0009 -0.0012 -0.0012

Rydex Value -0.0031 -0.0034 -0.0037*

Table VI

Bootstrap alphas

The first column contains the value of the regression alpha obtained from Carhart's four

factor model. The following two columns show the bootstrap p-values obtained for this

alpha when testing two hypotheses alternative to the null that alpha is zero. HA tests the

right tail, or a positive alpha, while HB tests the left tail, or a negative alpha. Significance is

denoted with *, ** and *** for a 10%, 5% and 1% level, respectively.

Portfolio        Carhart Alpha  HA pval  HB pval
Calamos          -7.0E-04       1.00     0.00***
Jamestown        -1.6E-03       1.00     0.00***
Jensen           -2.0E-04       1.00     0.00***
Northern         -2.0E-03       1.00     0.00***
Seligman Growth  -6.0E-04       1.00     0.00***
Seligman Value   -3.0E-04       1.00     0.00***
SunAmerica       2.0E-04        1.00     0.00***
Vanguard         -1.0E-03       1.00     0.00***
Van Kampen       -5.0E-04       1.00     0.00***
WaMu             -1.7E-03       1.00     0.00***
Ameristock       -2.2E-03       1.00     0.00***
Lomax            -1.0E-03       1.00     0.00***
Delaware         -8.0E-04       1.00     0.00***
HGK              -8.0E-04       1.00     0.00***
Rochdale Growth  -1.4E-03       1.00     0.00***
Rochdale Value   -2.3E-03       0.98     0.02**
Baird            -2.3E-03       0.99     0.01***
ProFunds         -3.5E-03       0.97     0.03**
Rydex Growth     -1.2E-03       1.00     0.00***
Rydex Value      -3.7E-03       0.87     0.13

Table VII

CRPA tests of fund manager skill

All values shown are empirically obtained p-values. Burns is the p-value from the Burns

(2007) cross-sectional measure of skill. The other three columns contain p-values derived

from the percentile or time-series approach to CRPA skill testing in which the distribution

of a fund's time series of returns is compared with that of a percentile distribution obtained

from a sample of random funds. The T- and Permutation tests measure significance of the

difference in the distribution means. MWW is a test of stochastic order, where the null

hypothesis is that both samples (fund and random portfolio returns) are drawn from the

same distribution, and the one-tailed alternative is that fund returns are stochastically

greater than random portfolio returns. Significance is denoted with *, ** and *** for a

10%, 5% and 1% level, respectively.

Portfolio Burns T-test MWW Test Permutation Test

Calamos 0.221 0.767 0.2607 0.747

Jamestown 0.134 0.7811 0.3631 0.752

Jensen 0.003*** 0.0387** 0.0343** 0.038**

Northern 0.129 0.6985 0.1845 0.647

Seligman Growth 0.426 0.8439 0.3253 0.825

Seligman Value 0.032** 0.4366 0.2854 0.449

SunAmerica 0.238 0.8455 0.411 0.816

Vanguard 0.092* 0.6431 0.039** 0.556

Van Kampen 0.076* 0.675 0.5352 0.678

WaMu 0.144 0.7566 0.4898 0.749

Ameristock 0.003*** 0.1854 0.3212 0.201

Lomax 0.057* 0.6124 0.6305 0.609

Delaware 0.132 0.7398 0.4671 0.682

HGK 0.11 0.7729 0.2758 0.777

Rochdale Growth 0.183 0.6612 0.1298 0.566

Rochdale Value 0.263 0.7606 0.1423 0.713

Baird 0.046** 0.5192 0.107 0.523

ProFunds 0.998 0.9243 0.9558 0.987

Rydex Growth 0.819 0.8666 0.5397 0.881

Rydex Value 0.991 0.8385 0.728 0.767
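The two non-parametric comparisons named in the caption can be sketched in a few lines. The code below is a generic illustration under stated assumptions, not the procedure used to produce the table: the permutation test is a one-sided test on the difference in means with a fixed number of random reshuffles, and the Mann-Whitney-Wilcoxon p-value uses the large-sample normal approximation without a tie correction. All function names are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def permutation_pvalue(fund, benchmark, n_perm=2000, seed=0):
    """One-sided p-value for the alternative mean(fund) > mean(benchmark)."""
    rng = random.Random(seed)
    observed = statistics.mean(fund) - statistics.mean(benchmark)
    pooled = list(fund) + list(benchmark)
    n = len(fund)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel returns at random under the null
        diff = statistics.mean(pooled[:n]) - statistics.mean(pooled[n:])
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def mann_whitney_u(x, y):
    """Number of pairs (x_i, y_j) with x_i > y_j, ties counting one half."""
    return sum((a > b) + 0.5 * (a == b) for a in x for b in y)

def mww_pvalue_greater(x, y):
    """Normal-approximation p-value for 'x is stochastically greater than y'."""
    n, m = len(x), len(y)
    mean_u = n * m / 2
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)  # no tie correction (assumption)
    z = (mann_whitney_u(x, y) - mean_u) / sd_u
    return 1 - NormalDist().cdf(z)
```

Here `fund` would be a fund's monthly returns and `benchmark` the corresponding series from the random portfolio sample; small p-values favour the alternative that the fund's returns are higher.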

Table VIII

CRPA tests of fund manager skill, adjusting returns for fund expenses

All values shown are empirically obtained p-values. Burns is the p-value from the Burns

(2007) cross-sectional measure of skill. The other three columns contain p-values derived

from the percentile or time-series approach to CRPA skill testing in which the distribution

of a fund's time series of returns is compared with that of a percentile distribution obtained

from a sample of random funds. The T- and Permutation tests measure significance of the

difference in the distribution means. MWW is a test of stochastic order, where the null

hypothesis is that both samples (fund and random portfolio returns) are drawn from the

same distribution, and the one-tailed alternative is that fund returns are stochastically

greater than random portfolio returns. Significance is denoted with *, ** and *** for a

10%, 5% and 1% level, respectively.

Portfolio Burns T-test MWW Test Permutation Test

Calamos 0.794 0.8939 0.6412 0.902

Jamestown 0.256 0.8555 0.6022 0.826

Jensen 0.012** 0.1343 0.1423 0.135

Northern 0.256 0.7792 0.2815 0.743

Seligman Growth 0.718 0.8922 0.5622 0.883

Seligman Value 0.139 0.7171 0.6726 0.709

SunAmerica 0.311 0.8685 0.5216 0.843

Vanguard 0.347 0.7845 0.2777 0.642

Van Kampen 0.24 0.8336 0.7715 0.843

WaMu 0.592 0.9073 0.7966 0.907

Ameristock 0.004*** 0.3158 0.5261 0.324

Lomax 0.103 0.7168 0.7801 0.702

Delaware 0.79 0.8988 0.8215 0.888

HGK 0.178 0.834 0.4762 0.812

Rochdale Growth 0.483 0.7418 0.2893 0.65

Rochdale Value 0.605 0.8423 0.3031 0.811

Baird 0.087* 0.6127 0.1876 0.584

ProFunds 1 0.9521 0.9973 1

Rydex Growth 0.968 0.8938 0.7998 0.956

Rydex Value 1 0.8656 0.8785 0.854

Figure 1

Random fund sample holding period return probability density

Probability density plot for a sample of 1,000 random funds trading securities listed in the

S&P 500 for 6 years, from 2005 to 2010, with constraints on turnover and portfolio asset

count consistent with the mean of these values for real funds currently operating in the

market. Density obtained from the holding period returns of each of 1,000 funds. Six year

holding period return plotted on x-axis, probability in the y-axis. 95th percentile holding

period return denoted by dotted line.
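A distribution like the one in this figure can be turned into an empirical p-value by counting how many random funds did at least as well as the fund under study. The sketch below is one plausible reading of such a cross-sectional measure and is an assumption, not a statement of how Burns (2007) or this paper defines it; the nearest-rank percentile rule and the function names are likewise hypothetical choices.

```python
import math

def empirical_pvalue(fund_hpr, random_hprs):
    """Share of random portfolios whose holding period return is at least the fund's."""
    at_least_as_good = sum(1 for r in random_hprs if r >= fund_hpr)
    return at_least_as_good / len(random_hprs)

def percentile(values, q):
    """Nearest-rank percentile of values, for q between 0 (exclusive) and 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

if __name__ == "__main__":
    # Stand-in sample: 100 evenly spaced holding period returns, 1% to 100%.
    sample = [i / 100 for i in range(1, 101)]
    print(percentile(sample, 95))          # 0.95
    print(empirical_pvalue(0.95, sample))  # 0.06
```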

Figure 2

Full random portfolio time series of returns probability densities

Time series of returns probability densities plotted for each of a set of 1,000 random fund

probability distributions, from the same sample used in Figure 1 and ordered by average monthly

return. Separate sample portfolio return densities arrayed along the Y axis, with returns on the X

axis and the probability densities of these returns visible on the Z axis.

CRPA 1000 random fund sample Probability Density Surface

Figure 3

Net Power of Skill Test, 1 Lucky Period

Net power curves of tests of skill. Extra Return Factor is the additional monthly return added to a

baseline random portfolio's returns to simulate overperformance through skill or luck, ranging

from 0% (no skill), to 4%. Net power of the test denotes the probability that the test rejects the

null of no skill when the sample has skill minus the probability that the test rejects the null of no

skill when the sample is just lucky (average number of times the test rejects out of 1,000

trials, for each level of added return).
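The net power measure in this caption can be written out directly. The sketch below assumes that each simulated trial yields one p-value for the skill sample and one for the luck sample, and that the null is rejected at a 5% level; the significance level and the function names are assumptions, not values taken from the paper.

```python
def rejection_rate(p_values, alpha=0.05):
    """Fraction of trials in which the null of no skill is rejected."""
    return sum(p < alpha for p in p_values) / len(p_values)

def net_power(p_values_skill, p_values_luck, alpha=0.05):
    """Rejection rate on skillful samples minus rejection rate on lucky samples."""
    return rejection_rate(p_values_skill, alpha) - rejection_rate(p_values_luck, alpha)

if __name__ == "__main__":
    # Four made-up trials per sample type, for illustration only.
    skill = [0.01, 0.02, 0.20, 0.03]
    luck = [0.01, 0.50, 0.60, 0.70]
    print(net_power(skill, luck))  # 0.5
```

A net power near one means the test rejects for skillful samples and almost never for lucky ones; a value near zero means it cannot tell the two apart.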

Figure 4

Net Power of Skill Test, 5 Lucky Periods

Net power curves of tests of skill. Extra Return Factor is the additional monthly return added to a

baseline random portfolio's returns to simulate overperformance through skill or luck, ranging

from 0% (no skill), to 4%. Net power of the test denotes the probability that the test rejects the

null of no skill when the sample has skill minus the probability that the test rejects the null of no

skill when the sample is just lucky (average number of times the test rejects out of 1,000

trials, for each level of added return).

Figure 5

Net Power of Skill Test, 10 Lucky Periods

Net power curves of tests of skill. Extra Return Factor is the additional monthly return added to a

baseline random portfolio's returns to simulate overperformance through skill or luck, ranging

from 0% (no skill), to 4%. Net power of the test denotes the probability that the test rejects the

null of no skill when the sample has skill minus the probability that the test rejects the null of no

skill when the sample is just lucky (average number of times the test rejects out of 1,000

trials, for each level of added return).

Figure 6

Net Power of Skill Test, 5 lucky periods, samples with no stochastic dominance: 60%

extra return for lucky portfolios

Net power curves of tests of skill. Extra Return Factor is the additional monthly return added to a

baseline random portfolio's returns to simulate overperformance through skill or luck, ranging

from 0% (no skill), to 4%. Extra return factor for luck sample is higher than for skill sample. Net

power of the test denotes the probability that the test rejects the null of no skill when the sample

has skill minus the probability that the test rejects the null of no skill when the sample is just lucky

(average number of times the test rejects out of 1,000 trials, for each level of added return).

Figure 7

Portfolio Return Probability Density

Probability densities of the time series of returns of the Jensen fund (full line) and

SunAmerica (dashed line). The x-axis denotes monthly portfolio return, while the y-axis

shows probability density. Plot obtained using kernel density estimation, with a Gaussian

kernel and automated bandwidth selection.
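A Gaussian kernel density estimate of the kind described in this caption can be sketched as below. The bandwidth rule is an assumption: the caption says only that selection was automated, and the normal-reference rule $h = 1.06\,\hat\sigma\,n^{-1/5}$ is used here as a common default, not as the rule behind the figure. The function name is hypothetical.

```python
import math
import statistics

def gaussian_kde(sample, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(sample)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumed default).
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-0.2)

    def density(x):
        total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample)
        return total / (n * bandwidth * math.sqrt(2 * math.pi))

    return density

if __name__ == "__main__":
    # Made-up monthly returns, for illustration only.
    returns = [-0.05, -0.01, 0.00, 0.01, 0.02, 0.03]
    f = gaussian_kde(returns)
    for x in (-0.05, 0.0, 0.05):
        print(x, f(x))
```

Evaluating the returned function on a grid of monthly returns gives the curve to plot, with one call to `gaussian_kde` per fund.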
