
Star Ratings and Risk Taking of Mutual Funds∗
Sanghyun (Hugh) Kim
January 5, 2022
∗ Hugh Kim (hughkim@hku.hk) is at the University of Hong Kong, K.K. Leung Building, Pok Fu Lam Rd, Hong
Kong. I thank Vikram Nanda, Qinghai Wang, Munhee Han, and Thomas Maurer for helpful comments.
Electronic copy available at: https://ssrn.com/abstract=3625493

Abstract
This paper reveals that Morningstar ratings induce mutual funds to increase portfolio risk when
funds’ rankings are in the vicinity of rating cutoffs. Compared to their distant peers, mutual
funds near rating thresholds significantly take on more risk by loading up betas relative to
category benchmarks, while holding tracking errors relatively fixed. This risk-shifting behavior
is consistent with an attempt to increase within-category rankings of a fund’s Sharpe ratio,
thus potentially improving the fund’s star ratings. The star rating effect on risk taking is more
pronounced among funds with less extreme performance, funds less than five years old, and
during the last months of calendar quarters. Placebo tests exploiting the June 2002 change in the
rating methodology and those using index funds yield expected null results.
Keywords: Morningstar ratings, managerial incentives, mutual funds, risk taking

1 Introduction
The idea that mutual fund investors use risk-adjusted returns in their capital allocation decision is
appealing and has motivated some prominent financial economists to use mutual fund flows to test asset
pricing models. Berk and van Binsbergen (2016), for instance, conclude that the capital asset pricing model
(CAPM) is the “closest to the asset pricing model investors are actually using” (see also Barber, Huang,
and Odean (2016)). Another strand of literature, however, documents that simple and readily available
Morningstar ratings have a powerful influence on investor flows, independent of the underlying continuous
performance measures (e.g., Del Guercio and Tkac (2008), Reuter and Zitzewitz (2015)).1 In a recent
study, Evans and Sun (Forthcoming) show that mutual fund investors use simple heuristics such as star
ratings, rather than asset-pricing models, for risk adjustment. Ben-David et al. (2019) further argue that
star ratings explain mutual fund investors’ behavior much better than any asset pricing models, concluding
that star ratings are the main determinant of capital allocation across mutual funds.
Given that mutual fund investors rely heavily on star ratings, it is natural to ask whether fund
managers also care about star ratings and, if so, how star ratings affect the managerial incentives of
mutual funds. Along this line of inquiry, Kim (2020) finds that mutual fund managers can inflate month-
end net asset values (NAV) of their funds through “portfolio pumping” when they are likely to finish the
month in the vicinity of rating thresholds.2 This paper studies the effect of star ratings on risk taking of
mutual funds. Unlike portfolio pumping, it is less clear whether and how star ratings would affect mutual
funds’ risk taking as a means to boost within-category rankings of Morningstar “Risk-Adjusted Returns”
(RAR), which, as implied by its name, explicitly adjusts for risk. In other words, since increasing risk
will increase both expected return and risk, the net effect of risk taking on Morningstar RAR is somewhat
ambiguous and merits further discussion.

1 Mutual fund share classes are rated every month on a scale of one star (lowest) to five stars (highest) on the
basis of Morningstar’s proprietary algorithm known as “Risk-Adjusted Returns” (RAR) over the prior three, five,
and ten years, depending on data availability. On the basis of within-category percentile rankings of Morningstar
RAR, the top 10% of funds receive five stars, the next 22.5% four stars, the middle 35% three stars, the next 22.5%
two stars, and the bottom 10% one star. Overall star ratings are then determined by taking weighted averages of
three-, five-, and ten-year star ratings, rounded to the nearest integer value.
2 Since open-end mutual funds calculate their NAVs from the closing prices of their holdings, mutual fund managers
can “pump up” the prices of their holdings through aggressive trading of stocks they already own, thereby inflating
their funds’ month-end NAVs. This high-frequency trading strategy is an illegal trading practice known as “portfolio
pumping” (Zweig (1997), Carhart et al. (2002)).

Prior studies typically focus on fund volatility to examine mutual funds’ risk taking when compen-
sation is linked to relative performance (see, e.g., Brown, Harlow, and Starks (1996), see also Aragon and
Nanda (2012) for hedge funds). While rewarding higher returns, Morningstar RAR, however, explicitly
penalizes risk as measured by volatility. In fact, Sharpe (1998) finds that ranking funds based on Morn-
ingstar RAR gives results that are similar to ranking funds on the basis of Sharpe ratio. Thus, mutual
fund managers who care about star ratings could be averse to increasing fund volatility when attempting
to boost expected returns. Furthermore, even without risk adjustment in performance evaluation, fund
managers could be cautious about increasing tracking errors around benchmarks. Indeed, Christoffersen
and Simutin (2017) find that in an attempt to outperform benchmarks, fund managers with large pension
assets tend to increase their exposure to high-beta stocks, while trying to maintain, or even reduce, tracking
errors.
The above considerations lead to the following “star rating effect” hypothesis: fund managers who
care about star ratings would increase portfolio betas relative to category benchmarks, while seeking
to avoid substantially increasing tracking errors around the benchmark, in an attempt to boost within-category
rankings of Morningstar RAR (or equivalently, Sharpe ratio), thus improving star ratings.
At the end of each month, mutual fund share classes are rated on an integer scale of one star (lowest)
to five stars (highest) on the basis of Morningstar RAR over the prior three, five, and ten years, depending
on data availability. Overall star ratings (or simply referred to as star ratings in the paper) are then
determined by taking weighted averages of three-, five-, and ten-year star ratings, rounded to the nearest
integer value. Since the evaluation period is based on a rolling window, every month a fund rolls out of the
oldest month and rolls into a new month. Thus, the variation in the distance to rating thresholds as funds
roll out of the oldest month of an evaluation period can generate the variation in managerial incentives to
shift portfolio risk in a new month that funds roll into.
To test the star rating effect hypothesis, for each mutual fund share class and month, I measure the
distance to a rating threshold as the distance between its within-category Sharpe ratio percentile rankings

and the closest rating threshold (Kim (2020)). Sharpe ratios are computed using daily returns in excess
of the risk-free rate excluding the oldest month of an evaluation period, to proxy for fund managers’
expectations about their fund rankings starting a new month. Following Morningstar, funds are ranked
within Morningstar 3 × 3 categories along the size and value dimensions starting in June 2002 and, prior to
that, all U.S. equity funds are ranked against each other. For risk-shifting measures, I compute the ratio
of betas of fund returns relative to category benchmark returns (e.g., Christoffersen and Simutin (2017),
Lee, Trzcinka, and Venkatesan (2019)) and the ratio of volatility of fund returns in excess of category
benchmark returns (e.g., Brown, Harlow, and Starks (1996), Lee, Trzcinka, and Venkatesan (2019)). All
ratios are measured as the ratio of the current month’s figure to the prior month’s figure computed using
daily returns. In line with within-category rankings, category benchmark returns are computed as average
returns of the funds that belong to the same Morningstar category starting in June 2002 and, prior to that,
average returns are computed across all U.S. equity funds. Index funds are kept in the ranking procedure,
but excluded from the main analyses and used later in placebo tests.
Since star ratings are assigned to mutual fund share classes, mutual funds with multiple share classes
may have different star ratings. In my sample, about 54% of the fund-month observations with multiple
share classes have multiple star ratings. Although star ratings are assigned at the share class level, portfolio
decisions are made at the fund level. To account for multiplicity of fund-level observations, I estimate share-
class-level regressions with the weighted least squares (WLS) estimation in which each share-class-month
observation is weighted by the ratio of share-class total net assets (TNA) to fund TNA at the end of the
prior month so that each fund-month observation is equally weighted. Alternatively, regressions could be
run at the fund level, with share-class-level variables aggregated to the fund-level, as is standard in the
mutual fund literature. For robustness checks, I estimate the same regressions at the fund level with the
ordinary least squares (OLS) estimation. Reassuringly, I find that the results remain substantially similar
and even slightly stronger at the fund level.
Consistent with the star rating effect hypothesis, I find that mutual fund managers tend to take on
more risk as measured by beta when within-category rankings of their funds are close to rating thresholds.

Compared to their distant peers, mutual funds near rating thresholds tend to load up betas relative to
category benchmarks. Specifically, the squared distance to a rating threshold in month t − 1 negatively
predicts the ratio of portfolio beta relative to category benchmarks in month t to that in month t − 1.
In contrast, funds that are close to rating thresholds do not significantly increase tracking errors around
category benchmarks, compared to funds that are farther away.
Next, I examine the star rating effect on risk taking within the context of prior literature, which
has focused on extreme cases. At the one extreme, Chevalier and Ellison (1997) find that the convex
flow-performance relation (Ippolito (1992), Sirri and Tufano (1998)) induces mutual funds with extremely
good performance to increase the riskiness of their portfolios (which I label the “tournament” effect). At
the other extreme, mutual funds with extremely poor performance substantially increase portfolio risk
(Khorana (2001)) because such performance may lead to management turnover (Khorana (1996)) (which I
label the “survival” effect). Consistent with prior studies, I find that for one- and five-star funds, the star
rating effect is dominated by the survival effect and the tournament effect, respectively.3 That is, 0th (100th)
percentile funds increase portfolio risk more than 10th (90th) percentile funds, even though the latter are
closer to rating thresholds than the former. Nevertheless, without experiencing extreme performance,
mutual fund managers can still face incentives to increase portfolio risk due to the discrete nature of star
ratings. I find that the star rating effect on risk taking of mutual funds becomes much stronger among
funds with less extreme performance. To focus on the effect of star ratings on risk taking, I exclude extreme
performance groups (one- and five-star funds) from the subsequent analyses, but including them does not
materially change the results.
The star rating effect hypothesis has a clear cross-sectional prediction regarding the length of track
record. Since how star ratings are assigned depends on data availability, it becomes much harder for funds
to influence star ratings as the track record extends further. While completely determining overall star
ratings for funds less than five years old, three-year star ratings account for only a small fraction (20 to 40
percent) of overall star ratings for funds that are at least five years old. Hence, the star rating effect on risk
taking should be more pronounced among funds for which overall star ratings are completely determined
by three-year star ratings, which are presumably the easiest to influence. My analysis confirms that the
star rating effect is concentrated among funds that are less than five years old.

3 Lee, Trzcinka, and Venkatesan (2019), who find that mutual funds with performance-based contracts and mid-
year performance close to their announced benchmark increase their portfolio risk in the second part of the year,
make a similar observation regarding mutual funds with extremely poor performance.

Even though star ratings can change from month to month, the incentives to improve star ratings
may not be equally strong for all months. Rather, last months of quarters (and years) are more prominent
than the rest of the months because mutual fund managers are likely more conscious about quarterly and
annual figures. Consistent with this calendar month prediction, the star rating effect is stronger during
March, June, September, and December, but substantially weaker during other months. Hence, mutual
fund managers shift portfolio risk dynamically as their funds’ rankings change around rating thresholds,
especially from less prominent months to more prominent months, while seeking to maintain an optimal
level of risk over time.
Next, I turn to conducting placebo tests to corroborate that the star rating effect on risk taking is
likely causal. First, I exploit the June 2002 change in the Morningstar rating methodology by reversing
this change in placebo tests (Kim (2020)). That is, funds would be ranked within Morningstar categories
prior to June 2002 when Morningstar introduced its 3 × 3 categories along the size and value dimensions
in the ranking procedure and, starting in June 2002, all U.S. equity funds would be ranked against each
other. Placebo category benchmark returns would be computed in an analogous manner and risk-shifting
measures would be computed relative to placebo category benchmark returns. A second set of placebo
tests use index funds in place of actively-managed funds because index funds have a limited ability to
actively alter portfolio risk. All placebo tests yield expected null results.
The remainder of this paper is organized as follows. Section 2 provides institutional details on Morningstar
ratings. Section 3 introduces my data sets and describes how I construct variables used in the
subsequent analyses. Section 4 presents the main results of the paper. Placebo tests are conducted in
Section 5. Section 6 concludes.

2 Morningstar Ratings
Since the introduction of its five-star rating system in 1985, Morningstar has become the undisputed
leader of the mutual fund rating industry (Del Guercio and Tkac (2008)). Star ratings have been shown to
have a strong influence on investor flows, independent of the underlying continuous performance measures
(Del Guercio and Tkac (2008), Reuter and Zitzewitz (2015)). Evans and Sun (Forthcoming) show that
mutual fund investors use simple heuristics such as star ratings, rather than asset-pricing models, for risk
adjustment. Ben-David et al. (2019) argue that star ratings explain mutual fund investors’ behavior much
better than any asset pricing models, concluding that star ratings are the main determinant of capital
allocation across mutual funds. Kim (2020) shows that the discrete nature of Morningstar ratings induces
mutual fund managers to inflate month-end net asset values (NAVs) of their funds when they are likely to
finish the month in the vicinity of rating thresholds.
At the end of each month, mutual fund share classes are rated by Morningstar on an integer scale of
one star (the lowest rating) to five stars (the highest rating). Star ratings assigned to mutual fund share
classes are determined by within-category rankings of Morningstar “Risk-Adjusted Return” (RAR), which
adjusts for risk and accounts for all sales charges, over the prior three, five, and ten years, depending on
data availability.
Morningstar classifies mutual funds into finer categories (e.g., U.S. Equity Large Growth) within
broader category groups (e.g., U.S. Equity). Whereas all U.S. equity funds were ranked against each other
until May 2002, Morningstar started ranking funds within Morningstar categories in June 2002 when
Morningstar introduced its 3 × 3 categories along the size dimension (Small, Mid-Cap, or Large) and the
value dimension (Value, Blend, or Growth). On the basis of within-category rankings of Morningstar RAR,
the top 10% of mutual fund share classes receive five stars, the next 22.5% four stars, the middle 35% three
stars, the next 22.5% two stars, and the bottom 10% receive one star.
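The percentile-to-star mapping described above can be sketched as follows. This is a minimal illustration rather than Morningstar's implementation: the function name `stars_from_percentile`, the convention that a percentile rank of 0 is the worst and 100 the best fund in the category, and the treatment of a fund sitting exactly on a cutoff are all assumptions made here.

```python
from bisect import bisect_right

# Cumulative cutoffs implied by the 10 / 22.5 / 35 / 22.5 / 10 split,
# measured from the bottom of the category (0 = worst, 100 = best).
CUTOFFS = [10.0, 32.5, 67.5, 90.0]


def stars_from_percentile(pct: float) -> int:
    """Map a within-category percentile rank to a rating of one to five stars.

    Assumption: a fund exactly on a cutoff receives the higher rating.
    """
    if not 0.0 <= pct <= 100.0:
        raise ValueError("percentile must lie in [0, 100]")
    # bisect_right counts how many cutoffs the fund has reached or passed.
    return bisect_right(CUTOFFS, pct) + 1
```

For example, a fund at the 50th percentile clears two cutoffs and is mapped to three stars.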
Overall star ratings (or simply referred to as star ratings in the paper) are determined by the weighted
averages of three-, five-, and ten-year star ratings, depending on data availability, rounded to the nearest
integer value. Share classes less than three years old are not rated. Share classes at least three years old

and less than five years old are rated based only on three-year star ratings. Share classes at least five years
old and less than ten years old are rated based on three-year star ratings (40 percent weight) and five-year
star ratings (60 percent weight). Share classes at least ten years old are rated based on three-year star
ratings (20 percent weight), five-year star ratings (30 percent weight), and ten-year star ratings (50 percent
weight).
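The age-dependent weighting scheme above can be written compactly. The sketch below is illustrative only: the function name `overall_star_rating` and the rounding of exact halves upward are assumptions, since the text says only that the weighted average is rounded to the nearest integer. Weights are kept in tenths so that the arithmetic is exact.

```python
from typing import Optional


def overall_star_rating(age_years: float, r3: int,
                        r5: Optional[int] = None,
                        r10: Optional[int] = None) -> Optional[int]:
    """Combine three-, five-, and ten-year star ratings into an overall rating."""
    if age_years < 3:
        return None                      # share classes under three years are not rated
    if age_years < 5:
        return r3                        # three-year rating only
    if age_years < 10:
        if r5 is None:
            raise ValueError("five-year rating required")
        tenths = 4 * r3 + 6 * r5         # 40% / 60% weights
    else:
        if r5 is None or r10 is None:
            raise ValueError("five- and ten-year ratings required")
        tenths = 2 * r3 + 3 * r5 + 5 * r10   # 20% / 30% / 50% weights
    # Round to the nearest integer; exact halves round up (assumption).
    return (tenths + 5) // 10
```

A seven-year-old share class with three stars over three years and four stars over five years has a weighted average of 3.6 and is mapped to four stars.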
Table 1 shows the transition matrix where each element in row i and column j represents the
probability of a mutual fund share class receiving rating j during month t conditional on its receiving
rating i during month t − 1. More than 10% of mutual fund share classes experience changes in star
ratings each month. Not surprisingly, ten-year star ratings are most persistent, followed by five-year star
ratings, and three-year star ratings are least persistent. Being weighted averages, overall star ratings are
more persistent than three-year star ratings, but less persistent than five- and ten-year star ratings. Thus,
casual empiricism suggests that changes in (overall) star ratings are most influenced by changes in
three-year star ratings.
[Insert Table 1]
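A transition matrix of this kind can be estimated by counting month-to-month rating changes and normalizing each row. The sketch below is a hypothetical illustration: the function name `transition_matrix` and the input layout (one list of consecutive monthly ratings per share class) are assumptions, not the procedure used to build Table 1.

```python
from collections import Counter, defaultdict


def transition_matrix(rating_paths):
    """Estimate P(rating j in month t | rating i in month t-1).

    rating_paths: iterable of lists, each holding one share class's
    consecutive monthly star ratings.
    Returns a nested dict: matrix[i][j] is the conditional probability.
    """
    counts = defaultdict(Counter)
    for path in rating_paths:
        # Pair each month's rating with the following month's rating.
        for prev, cur in zip(path, path[1:]):
            counts[prev][cur] += 1
    matrix = {}
    for prev, row in counts.items():
        total = sum(row.values())
        matrix[prev] = {cur: n / total for cur, n in row.items()}
    return matrix
```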
3 Data and Variable Construction
3.1 The Distance to a Rating Threshold
My primary data come from Morningstar Direct, from which I obtain data on returns, total net assets
(TNAs), Morningstar categories, star ratings, inception dates, expense ratios, turnover ratios, index-fund
indicators, and institutional share class indicators. Star ratings are updated every month on the basis of
Morningstar’s proprietary algorithm known as “Risk-Adjusted Return” (RAR) over the prior three, five,
and ten years, depending on data availability. Sharpe (1998) finds that ranking funds based on Morningstar
RAR gives results that are similar to ranking funds on the basis of Sharpe ratio. Morningstar RAR rewards
higher returns, but penalizes risk as measured by volatility (see also Ben-David et al. (2019)).
To proxy for mutual funds’ incentives to increase portfolio risk driven by star ratings, I measure the

distance to a rating threshold as the distance between a fund’s within-category percentile rankings based
on the Sharpe ratio and its closest rating threshold (Kim (2020)). Specifically, for each mutual fund share
class and month, I compute the Sharpe ratio using daily returns in excess of the risk-free rate over the prior
three, five, and ten years. I exclude the oldest month from each rolling window in the computation of Sharpe
ratio to measure incentives as mutual funds “roll” out of the oldest month and “roll” into a new month.
Following the Morningstar rating methodology, I rank all U.S. equity mutual fund share classes against
each other until May 2002 and rank mutual fund share classes within Morningstar categories starting in
June 2002 when Morningstar introduced its 3 × 3 categories along the size and value dimensions. Although
my study focuses on actively-managed mutual funds, I keep index funds in the ranking procedure to be
consistent with the Morningstar rating methodology.
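The Sharpe ratio and ranking steps above can be sketched as follows. The function names, the input layout (one list of daily excess returns per month, oldest month first), and the percentile-rank convention (share of peers strictly below, scaled to 0 to 100) are assumptions made for illustration; the text does not specify how ties or endpoints are handled.

```python
from statistics import mean, stdev


def sharpe_excluding_oldest_month(excess_by_month):
    """Daily Sharpe ratio over a rolling window with the oldest month dropped.

    excess_by_month: list of lists of daily returns in excess of the
    risk-free rate, one inner list per month, oldest month first.
    """
    if len(excess_by_month) < 2:
        raise ValueError("need at least two months of data")
    daily = [r for month in excess_by_month[1:] for r in month]
    return mean(daily) / stdev(daily)


def percentile_ranks(values):
    """Percentile rank (0 = lowest, 100 = highest) of each value among its peers.

    Assumed convention: 100 * (number of peers strictly below) / (n - 1).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two funds to rank")
    return [100.0 * sum(v < x for v in values) / (n - 1) for x in values]
```

In use, `percentile_ranks` would be applied to all share classes in the same ranking group: all U.S. equity funds before June 2002, and the fund's Morningstar category afterwards.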
Then, I aggregate three-, five-, and ten-year within-category percentile rankings to “overall” within-
category percentile rankings in a similar manner as how Morningstar aggregates three-, five-, and ten-year
star ratings to overall star ratings (simply referred to as star ratings in the paper). Share classes less than
three years old are excluded from the analysis. For share classes at least three years old and less than
five years old, overall rankings equal three-year rankings. For share classes at least five years old and less
than ten years old, overall rankings are weighted averages of three-year rankings (40 percent weight) and
five-year rankings (60 percent weight). For share classes at least ten years old, overall rankings are weighted
averages of three-year rankings (20 percent weight), five-year rankings (30 percent weight), and ten-year
rankings (50 percent weight). Finally, the distance to a rating threshold is computed for each mutual
fund share class and month as the distance between its overall within-category percentile rankings and its
nearest rating threshold. There are four rating thresholds separating the five star ratings: the 10th, 32.5th, 67.5th,
and 90th percentiles.
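The distance measure can be illustrated with a short sketch. The cutoffs below are the cumulative shares implied by the 10 / 22.5 / 35 / 22.5 / 10 rating split (10, 32.5, 67.5, and 90); the function names are hypothetical, and percentile ranks are assumed to be on a 0 to 100 scale.

```python
# Rating cutoffs in percentile-rank units (cumulative shares of the rating split).
THRESHOLDS = (10.0, 32.5, 67.5, 90.0)


def distance_to_threshold(overall_pct: float) -> float:
    """Absolute distance from an overall percentile rank to the nearest cutoff."""
    return min(abs(overall_pct - cut) for cut in THRESHOLDS)


def squared_distance(overall_pct: float) -> float:
    """Squared distance to the nearest cutoff."""
    return distance_to_threshold(overall_pct) ** 2
```

For instance, a share class at the 30th percentile is 2.5 points from the 32.5 cutoff, giving a squared distance of 6.25, whereas one at the 50th percentile is 17.5 points from either neighboring cutoff.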
For placebo tests, I compute placebo within-category percentile rankings by reversing the June 2002
change in the Morningstar rating methodology (Kim (2020)). Specifically, mutual fund share classes
would be ranked within Morningstar categories prior to June 2002 when Morningstar introduced its 3 × 3
categories along the size and value dimensions, whereas all U.S. equity mutual fund share classes would be

ranked against each other starting in June 2002. The placebo distance to a rating threshold would then be
computed based on the placebo within-category percentile rankings.
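The actual and placebo ranking schemes differ only in which side of June 2002 uses category-level ranking. A minimal sketch, with a hypothetical function name and a hypothetical label `"ALL_US_EQUITY"` for the pooled group:

```python
from datetime import date

# Month in which Morningstar began ranking within its 3 x 3 categories.
METHODOLOGY_CHANGE = date(2002, 6, 1)


def ranking_group(month_end: date, category: str, placebo: bool = False) -> str:
    """Return the peer group against which a share class is ranked.

    Actual scheme: pooled before June 2002, within category afterwards.
    Placebo scheme: the reverse.
    """
    within_category = month_end >= METHODOLOGY_CHANGE
    if placebo:
        within_category = not within_category
    return category if within_category else "ALL_US_EQUITY"
```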
3.2 Measures of Risk Shifting and Other Variables
For risk-shifting measures, I compute the ratio of betas of fund returns relative to category bench-
mark returns (Christoffersen and Simutin (2017), Lee, Trzcinka, and Venkatesan (2019)) and the ratio of
tracking errors around category benchmark returns (Brown, Harlow, and Starks (1996), Lee, Trzcinka, and
Venkatesan (2019)). Specifically, for each mutual fund share class and month, I compute portfolio betas by
regressing fund daily returns on category benchmark returns and compute tracking errors as volatility of
fund daily returns in excess of those of category benchmarks. Ratios are then taken as the current month’s
figure to the previous month’s figure. Consistent with within-category rankings, category benchmark re-
turns are computed as average returns of the funds that belong to the same Morningstar category starting
in June 2002 and, prior to that, average returns are computed across all U.S. equity funds. Consistent with
the inclusion of index funds in the ranking procedure, index funds are kept in the computation of category
benchmark returns.
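The two risk-shifting measures can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: beta is taken as the slope from a univariate OLS regression of fund daily returns on benchmark daily returns (covariance over variance), tracking error is the sample standard deviation of daily return differences, and all function names are hypothetical.

```python
from statistics import mean, stdev


def beta(fund, bench):
    """OLS slope of fund daily returns on category benchmark daily returns."""
    mf, mb = mean(fund), mean(bench)
    cov = sum((f - mf) * (b - mb) for f, b in zip(fund, bench))
    var = sum((b - mb) ** 2 for b in bench)
    return cov / var


def tracking_error(fund, bench):
    """Volatility of fund daily returns in excess of the benchmark."""
    return stdev([f - b for f, b in zip(fund, bench)])


def risk_shift_ratios(fund_cur, bench_cur, fund_prev, bench_prev):
    """Current-month figure divided by previous-month figure for both measures."""
    beta_ratio = beta(fund_cur, bench_cur) / beta(fund_prev, bench_prev)
    te_ratio = tracking_error(fund_cur, bench_cur) / tracking_error(fund_prev, bench_prev)
    return beta_ratio, te_ratio
```

A beta ratio above one indicates that the fund raised its benchmark exposure relative to the prior month; a tracking-error ratio near one indicates that its active risk was roughly unchanged.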
For placebo tests, I compute placebo category benchmark returns by reversing the June 2002 change in
the Morningstar rating methodology. Then, I use placebo category benchmark returns in place of category
benchmark returns in the computation of risk-shifting measures. For control variables, I use the following
share-class-level variables: TNA, age as measured by the difference between the last calendar date of each
month and the inception date, expense ratio, turnover ratio, and an indicator variable for institutional share
class, all obtained from Morningstar Direct. For fund-level tests, I aggregate share-class-level variables to
the fund-level by computing the sum of TNAs, the maximum of ages, and the value-weighted averages of
other variables, weighted by TNA at the end of the prior month. Indicator variables are replaced by the
proportion of TNAs from share classes for which indicator variables take the value of one. All variables
are winsorized at 1% and 99%. Table 2 reports the summary statistics.
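The fund-level aggregation rules can be illustrated with a small sketch. The dictionary keys (`tna`, `lag_tna`, `age`, `expense_ratio`, `turnover`, `institutional`) and the function name are hypothetical; `lag_tna` stands for TNA at the end of the prior month, which serves as the weight.

```python
def aggregate_to_fund(share_classes):
    """Collapse one fund-month's share classes into fund-level variables.

    share_classes: list of dicts, one per share class.
    """
    total_lag = sum(sc["lag_tna"] for sc in share_classes)

    def value_weighted(key):
        # Average weighted by prior-month TNA.
        return sum(sc[key] * sc["lag_tna"] for sc in share_classes) / total_lag

    return {
        "tna": sum(sc["tna"] for sc in share_classes),      # sum of TNAs
        "age": max(sc["age"] for sc in share_classes),      # oldest share class
        "expense_ratio": value_weighted("expense_ratio"),
        "turnover": value_weighted("turnover"),
        # The 0/1 indicator becomes the share of TNA in institutional classes.
        "institutional_share": value_weighted("institutional"),
    }
```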
[Insert Table 2]

4 Main Results
4.1 Star Ratings and Portfolio Betas
In this subsection, I test the first prediction of the star rating effect hypothesis that in an attempt to
boost within-category percentile rankings of Morningstar RAR, mutual fund managers who care about star
ratings would increase portfolio betas relative to category benchmarks, while seeking to avoid substantially
increasing tracking errors around the benchmark. Being closely related to the Sharpe ratio, Morningstar
RAR rewards higher returns, while penalizing risk as measured by volatility (Sharpe (1998), Ben-David
et al. (2019)). By increasing exposure to high-beta stocks, mutual funds can increase expected returns and
relative performance (Christoffersen and Simutin (2017), Lee, Trzcinka, and Venkatesan (2019)), thereby
boosting within-category percentile rankings and improving star ratings.
To examine whether star ratings induce mutual funds to increase portfolio betas, I estimate the
following linear regression model:
\[
\frac{\beta_{i,t}}{\beta_{i,t-1}} = \gamma \times \text{Squared distance}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t} \tag{1}
\]
where i indexes mutual fund share classes (or funds) and t indexes time in months. All regressions are run
separately at the share class (security) level and at the fund (portfolio) level. The dependent variable is
the ratio of beta of daily returns of mutual fund share class (or fund) i relative to its category benchmark
returns during month t to that during month t − 1. Squared distance_{i,t−1} is mutual fund share class (or
fund) i’s squared distance between its within-category percentile rankings based on the Sharpe ratio and its
nearest rating threshold at the end of month t − 1. Covariates_{i,t−1} is a vector of mutual fund share class
(or fund) characteristics that include the logarithm of total net assets (TNA) (in $ million), the logarithm
of age (in years), expense ratio (in percent), turnover ratio, and an indicator variable for institutional share
class, all measured at the end of month t − 1. For fund-level tests, I aggregate share-class-level variables to
the fund-level by computing the sum of TNAs, the maximum of ages, and the value-weighted averages of
other variables at the end of the prior month. An indicator variable for institutional share class is replaced
by the proportion of TNAs from institutional share classes. All regressions include time fixed effects (θ_t)
and standard errors are double-clustered by fund and by time.
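To make the role of the time fixed effects concrete, the sketch below estimates the slope on a single regressor after demeaning both variables within each month, which is equivalent to including month dummies. It is a deliberately simplified illustration: it omits the covariates, the observation weights, and the double-clustered standard errors of Eq. (1), and the function name `fe_slope` and input layout are assumptions.

```python
from collections import defaultdict


def fe_slope(rows):
    """OLS slope of y on x with month fixed effects (within-month demeaning).

    rows: iterable of (month, x, y) tuples, where x is the lagged squared
    distance and y is the beta ratio.
    """
    by_month = defaultdict(list)
    for month, x, y in rows:
        by_month[month].append((x, y))
    sxy = sxx = 0.0
    for obs in by_month.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            # Accumulate within-month cross-products and squared deviations.
            sxy += (x - mx) * (y - my)
            sxx += (x - mx) ** 2
    return sxy / sxx
```

A negative slope means that, within a given month, share classes closer to a rating threshold (smaller squared distance) have higher beta ratios than their more distant peers.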
Since star ratings are assigned to mutual fund share classes, mutual funds with multiple share classes
may have different star ratings. In my sample, about 54% of the fund-month observations with multiple
share classes have multiple star ratings. Although star ratings are assigned at the share class level, portfolio
decisions are made at the fund level. To account for multiplicity of fund-level observations, I estimate share-
class-level regressions with the weighted least squares (WLS) estimation in which each share-class-month
observation is weighted by the ratio of share-class TNA to fund TNA at the end of month t − 1 so that
each fund-month observation is equally weighted. Alternatively, regressions could be run at the fund level,
with share-class-level variables aggregated to the fund-level, as is standard in the mutual fund literature.
For robustness checks, I also estimate the same regressions at the fund level with the ordinary least squares
(OLS) estimation.
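The weighting scheme can be checked with a short sketch: dividing each share class's prior-month TNA by the fund's total prior-month TNA yields weights that sum to one within every fund-month, so each fund-month carries equal total weight. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def wls_weights(rows):
    """Share-class weights equal to share-class TNA over fund TNA.

    rows: list of (fund_id, month, lag_tna) tuples, one per share class,
    where lag_tna is TNA at the end of the prior month.
    Returns the weights in the same order as rows.
    """
    fund_tna = defaultdict(float)
    for fund, month, tna in rows:
        fund_tna[(fund, month)] += tna
    return [tna / fund_tna[(fund, month)] for fund, month, tna in rows]
```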
The regression results are presented in Table 3. In column (1) of Panel A, the estimate γ̂ is negative
and statistically significant at the 5% level and remains little changed after the inclusion of a host of share
class characteristics in column (2). Reassuringly, the results are substantially similar when regressions are
run at the fund level in Panel B. The estimate γ̂ is negative, statistically significant at conventional levels,
and even larger in magnitude compared to that in Panel A. Overall, the results are consistent with the
hypothesis that star ratings induce mutual funds to increase portfolio risk by loading up betas relative to
category benchmarks.
[Insert Table 3]
4.2 Star Ratings and Tracking Errors
In this subsection, I test the second prediction of the star rating effect hypothesis that in an attempt to
boost within-category percentile rankings of Morningstar RAR, mutual fund managers who care about star
ratings would increase portfolio betas relative to category benchmarks, while seeking to avoid substantially
increasing tracking errors around the benchmark. Prior studies typically focus on fund volatility to examine
mutual funds’ risk-taking when compensation is linked to relative performance (e.g., Brown, Harlow, and
Starks (1996)). Morningstar RAR, however, explicitly penalizes risk as measured by volatility, being closely
related to the Sharpe ratio (Sharpe (1998), Ben-David et al. (2019)). Although mutual fund managers can
boost expected returns by increasing fund volatility (e.g., Brown, Harlow, and Starks (1996), Lee, Trzcinka,
and Venkatesan (2019)), it is unclear whether higher expected returns on average can be more than offset
by higher volatility. Thus, mutual fund managers who care about star ratings are likely cautious about
increasing fund volatility when attempting to boost expected returns because their objective is to boost
within-category rankings of Morningstar RAR (or equivalently, Sharpe ratio). Furthermore, even without
risk adjustment in performance evaluation, fund managers are likely averse to increasing tracking errors
around benchmarks. Indeed, Christoffersen and Simutin (2017) find that in an attempt to outperform
benchmarks, fund managers with large pension assets tend to increase their exposure to high-beta stocks,
while trying to maintain, or even reduce, tracking errors.
To examine the effect of star ratings on tracking errors around benchmarks, I estimate the following
linear regression model:
\[
\frac{\sigma(r_{i,t} - r^{b}_{i,t})}{\sigma(r_{i,t-1} - r^{b}_{i,t-1})} = \gamma \times \text{Squared distance}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t} \tag{2}
\]
separately at the share class (security) level and at the fund (portfolio) level. The dependent variable is the
ratio of volatility of daily returns of mutual fund share class (or fund) i in excess of its category benchmark
returns during month t to that during month t − 1. The rest of the model is the same as in Eq. (1).
The regression results are presented in Table 4. In Panel A, the estimate γ̂ is negative, but sta-
tistically insignificant, with or without controls. When regressions are run at the fund level in Panel B,
the results remain substantially similar. Overall, the results in this subsection, combined with those in
the previous subsection, support the hypothesis that compared to their distant peers, mutual funds near
rating thresholds tend to increase portfolio risk by loading up betas while seeking to avoid substantially
increasing tracking errors around benchmarks. This risk-shifting behavior is consistent with an attempt
to boost within-category rankings of Morningstar RAR (or equivalently, Sharpe ratio), thereby improving
star ratings.
[Insert Table 4]
4.3 Other Managerial Incentives
In this subsection, I examine the star rating effect on risk taking of mutual funds within the context
of other managerial incentives proposed in the literature. Prior studies suggest that there could be other
forces driving mutual funds’ risk-taking behavior that are not mutually exclusive with the star rating
effect. For instance, Chevalier and Ellison (1997) find that the convex flow-performance relation (Ippolito
(1992), Sirri and Tufano (1998)) induces mutual funds to alter the riskiness of their portfolios. Hence, for
five-star funds that are in the performance spectrum where the flow-performance relation is most convex,
the tournament effect may dominate the star rating effect in shifting portfolio risk. At the other extreme
of the performance spectrum, extremely poor performance may lead to management turnover (Khorana
(1996)). As a result, mutual funds with extremely poor performance substantially increase portfolio risk
(Khorana (2001)), essentially engaging in gambling for resurrection. Thus, for one-star funds, the survival
effect may dominate the star rating effect in altering portfolio risk. To separate out other managerial
incentives driving risk-taking behavior of mutual funds, I re-estimate the linear regression model in Eq.
(1) in sub-samples split by star ratings during month t − 1.
The regression results are presented in Table 5. The first two columns of Panel A report the share-
class-level results for mutual funds with one-star ratings during month t − 1. In column (1), the estimate γ̂
is positive and statistically significant at the 1% level, and remains qualitatively similar after the inclusion
of share class characteristics in column (2). The fund-level tests yield similar results in Panel B. The results
suggest that among extremely poorly performing funds, the survival effect dominates the star rating effect.
That is, funds near the 0th percentile take on substantially more risk than funds near the 10th percentile.
Turning to the other extreme group, the last two columns of Panel A report the results for mutual
funds with five-star ratings during month t − 1. In column (5), the estimate γ̂ is positive and statistically
significant at the 1% level, and remains little changed after the inclusion of share class characteristics in
column (6). The fund-level tests yield similar results in Panel B. The results suggest that among extremely
well performing funds, the tournament effect dominates the star rating effect. That is, funds near the 100th
percentile take on substantially more risk than funds near the 90th percentile.
[Insert Table 5]
The sub-sample results suggest that among funds that are extremely well performing or extremely
poorly performing, the star rating effect is dominated by other managerial incentives to alter risk-taking
behavior of mutual funds. Hence, to focus on the star rating effect on risk shifting of mutual funds, one-
and five-star funds are excluded from the subsequent analyses; keeping them does not materially
change the results.
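To make the sub-sample interpretation concrete, the following Python sketch maps a within-category percentile ranking to a star rating and to the squared distance from the nearest rating threshold. The cutoffs (10th, 32.5th, 67.5th, and 90th percentiles) are inferred from the 10% / 22.5% / 35% / 22.5% / 10% allocation of one to five stars described with Table 1. The function names, the convention that a higher percentile means better performance, and the treatment of a fund exactly at a cutoff are assumptions made for illustration.

```python
# Percentile cutoffs, expressed as the fraction of the category ranked below.
THRESHOLDS = (0.10, 0.325, 0.675, 0.90)


def stars(pct):
    """Star rating for a within-category percentile ranking in [0, 1] (1 = best)."""
    # Assumption: a fund exactly at a cutoff receives the higher rating.
    return 1 + sum(pct >= cutoff for cutoff in THRESHOLDS)


def squared_distance(pct):
    """Squared distance between a percentile ranking and its nearest rating threshold."""
    return min(abs(pct - cutoff) for cutoff in THRESHOLDS) ** 2
```

Within the one-star group, squared distance rises as a fund moves from the 10th toward the 0th percentile, and within the five-star group it rises from the 90th toward the 100th percentile. A positive γ̂ in these sub-samples therefore corresponds to more risk taking at the extremes of the performance spectrum.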
4.4 The Length of Track Record
The star rating effect hypothesis has a clear cross-sectional prediction about the length of track
record. Overall star ratings (or simply referred to as star ratings in the paper) are determined by the
weighted averages of three-, five-, and ten-year star ratings, depending on data availability, rounded to the
nearest integer value. Share classes less than three years old are not rated. Share classes at least three
years old and less than five years old are rated based only on three-year star ratings. Share classes at least
five years old and less than ten years old are rated based on three-year star ratings (40 percent weight) and
five-year star ratings (60 percent weight). Share classes at least ten years old are rated based on three-year
star ratings (20 percent weight), five-year star ratings (30 percent weight), and ten-year star ratings (50
percent weight). As a result, it becomes much more difficult for funds to influence star ratings when overall
star ratings are also affected by five- and ten-year star ratings. Thus, if mutual fund managers who care
about star ratings increase portfolio risk in an attempt to improve star ratings, the star rating effect should
be more pronounced among funds for which three-year star ratings completely determine the overall star
ratings.
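The weighting scheme above can be sketched in a few lines of Python. The function name and signature are hypothetical, and the handling of exact half-way values (rounded up here) is an assumption, since the text only states that the weighted average is rounded to the nearest integer.

```python
def overall_star_rating(age_years, three=None, five=None, ten=None):
    """Overall star rating from three-, five-, and ten-year star ratings.

    Weights are in percent and depend on the age of the share class.
    Returns None for share classes less than three years old (not rated).
    """
    if age_years < 3:
        return None
    if age_years < 5:
        weighted = [(100, three)]
    elif age_years < 10:
        weighted = [(40, three), (60, five)]
    else:
        weighted = [(20, three), (30, five), (50, ten)]
    total = sum(weight * rating for weight, rating in weighted)
    # Integer arithmetic avoids floating-point noise; half-way values round up
    # (an assumption about tie handling).
    return (total + 50) // 100
```

For a share class that is four years old, moving the three-year rating from three to five stars moves the overall rating one for one. For a twelve-year-old share class with three-star five- and ten-year ratings, the same move leaves the overall rating at three stars, which illustrates why older share classes are harder to influence.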
To test this cross-sectional prediction, I estimate the following linear regression model with an inter-
action term:
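$$\frac{\beta_{i,t}}{\beta_{i,t-1}} = \delta \times \text{Squared distance}_{i,t-1} \times 1(\text{Age}_{i,t-1} < 5 \text{ years old}) + \gamma \times \text{Squared distance}_{i,t-1} + \rho \times 1(\text{Age}_{i,t-1} < 5 \text{ years old}) + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t} \qquad (3)$$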
where i indexes mutual fund share classes (or funds) and t indexes time in months. All regressions are
run separately at the share class (security) level and at the fund (portfolio) level. The interaction term,
1(Agei,t−1 < 5 years old), is an indicator variable that takes the value of one if three-year star ratings
completely determine the overall star ratings during month t − 1. For fund-level tests, the interaction term is replaced
by the proportion of TNAs from share classes for which three-year star ratings completely determine the
overall star ratings (% less-than-five-year-old TNA). The rest of the model is the same as in Eq. (1).
Mutual fund share classes (or funds) with one- and five-star ratings during month t − 1 are excluded from
the analysis.
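To make the role of the interaction term explicit, the sensitivity of the beta ratio to squared distance implied by Eq. (3) can be written out as follows. This is a direct algebraic consequence of the specification rather than an additional result.

```latex
\frac{\partial\,(\beta_{i,t}/\beta_{i,t-1})}{\partial\,\text{Squared distance}_{i,t-1}}
  = \gamma + \delta \times 1(\text{Age}_{i,t-1} < 5 \text{ years old})
  = \begin{cases}
      \gamma + \delta & \text{if three-year star ratings completely determine overall star ratings,} \\
      \gamma          & \text{otherwise.}
    \end{cases}
```

A negative δ therefore means that the beta ratio falls more steeply with squared distance for share classes less than five years old, that is, a stronger star rating effect for those share classes. In the fund-level tests, the indicator is replaced by the proportion of TNAs, so the sensitivity varies continuously between γ and γ + δ.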
The regression results are presented in Table 6. In column (1) of Panel A, the estimates δ̂ and
γ̂ are both negative and statistically significant at the 1% and 10% levels, respectively. The estimate δ̂
becomes slightly greater, while the estimate γ̂ becomes slightly smaller, losing its statistical significance
(t-statistic = −1.36), when share class characteristics are controlled for in column (2). The fund-level tests
yield qualitatively similar results in Panel B. Overall, the results suggest that the star rating effect on risk
taking of mutual funds is concentrated among funds for which 3-year star ratings completely determine
overall star ratings. The results are consistent with the hypothesis that mutual fund managers who care
about star ratings alter portfolio risk in an attempt to improve star ratings.
[Insert Table 6]
4.5 The Calendar Month Effect
Star ratings are updated on a monthly basis. Certain calendar months, however, might be naturally
more important than other months because mutual funds typically report on an annual, semi-annual, or
quarterly basis. If this is the case, mutual fund managers who care about star ratings are likely more
conscious about star ratings of their funds entering the last month of a calendar quarter. To test this
calendar month prediction of the star rating effect hypothesis, I re-estimate the linear regression model in
Eq. (1) in sub-samples split by calendar months.
The regression results are presented in Table 7. In columns (1) and (2), the sub-sample consists
of share-class-month (or fund-month) observations from March, June, September, and December that
are presumably more prominent than other calendar months. In columns (3) and (4), the sub-sample
consists of share-class-month (or fund-month) observations from the remaining months that are presumably less
prominent. The share-class-level results are reported in Panel A. The estimate γ̂ is larger in magnitude
and statistically more significant in columns (1) and (2) than that in columns (3) and (4). The estimate
γ̂ is still negative, but loses statistical significance in columns (3) and (4), with t-statistics of −1.56 and
−1.38, respectively. The fund-level tests yield qualitatively similar results in Panel B. The results suggest
that consistent with the calendar month prediction, the star rating effect on risk taking of mutual funds is
stronger during March, June, September, and December, but substantially weaker during less prominent
calendar months. Hence, mutual fund managers shift portfolio risk dynamically as their funds’ rankings
change around rating thresholds, especially from less prominent months to more prominent months, while
seeking to maintain an optimal level of risk over time.
[Insert Table 7]
5 Placebo Tests
5.1 Reversing the June 2002 Change in the Rating Methodology
In this subsection, I conduct placebo tests in order to provide corroborating evidence that the star
rating effect on risk taking of mutual funds documented in Section 4.1 is likely causal. To this end, I exploit
the June 2002 change in the Morningstar rating methodology, following Kim (2020). Morningstar ranked
all U.S. equity funds against each other until May 2002. Morningstar, however, started ranking funds
within Morningstar categories in June 2002 when Morningstar introduced its 3 × 3 categories along the
size dimension (Small, Mid-Cap, or Large) and the value dimension (Value, Blend, or Growth). I obtain
placebo percentile rankings by reversing this change in the Morningstar rating methodology. Specifically,
mutual fund share classes would be ranked (on the basis of Sharpe ratio) within Morningstar categories
prior to June 2002 when Morningstar introduced its 3 × 3 categories, whereas all U.S. equity funds would
be ranked against each other starting in June 2002. Placebo category benchmark returns are computed
in an analogous way. That is, placebo category benchmark returns are computed as average returns of
the funds that belong to the same Morningstar category prior to June 2002, while average returns are
computed across all U.S. equity funds starting in June 2002.
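The construction of the placebo rankings amounts to swapping the peer group used before and after June 2002. The following Python sketch selects the ranking pool for a given month; the function name, the pool label for the all-fund ranking, and the example category string are hypothetical and used only for illustration.

```python
from datetime import date

# First month in which Morningstar ranked funds within categories.
CATEGORY_START = date(2002, 6, 1)


def ranking_pool(month_end, category, placebo=False):
    """Peer group within which a share class is ranked at a given month end.

    Actual methodology: all U.S. equity funds until May 2002, the fund's
    Morningstar category from June 2002 onward. The placebo reverses this.
    """
    within_category = month_end >= CATEGORY_START
    if placebo:
        within_category = not within_category
    return category if within_category else "All U.S. equity"
```

The same pool selection applies to the placebo category benchmark returns, which are averages over the funds in the returned peer group.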
Using the placebo distance to a threshold based on placebo within-category percentile rankings, I
estimate the following linear regression model:
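$$\frac{\beta^{\text{Placebo}}_{i,t}}{\beta^{\text{Placebo}}_{i,t-1}} = \gamma \times \text{Squared distance}^{\text{Placebo}}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t} \qquad (4)$$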
separately at the share class (security) level and at the fund (portfolio) level. The dependent variable is the
ratio of beta of daily returns of mutual fund share class (or fund) i relative to its placebo category benchmark
returns during month t to that during month t − 1. Placebo category benchmark returns are computed
as average returns of the funds that belong to the same Morningstar category prior to June 2002, while
average returns are computed across all U.S. equity funds starting in June 2002. $\text{Squared distance}^{\text{Placebo}}_{i,t-1}$ is
mutual fund share class (or fund) i’s squared placebo distance between its placebo within-category percentile
rankings based on the Sharpe ratio and its nearest rating threshold at the end of month t − 1. Placebo
within-category percentile rankings are obtained by reversing the June 2002 change in the Morningstar
rating methodology. That is, mutual fund share classes would be ranked within Morningstar categories
prior to June 2002, whereas all U.S. equity funds would be ranked against each other starting in June
2002. The rest of the model is the same as in Eq. (1).
The regression results are presented in Table 8. In column (1) of Panel A, the estimate γ̂ is nega-
tive, but statistically insignificant. When share class characteristics are controlled for in column (2), the
estimate γ̂ remains statistically insignificant. In columns (3) and (4), I exclude funds with extremely good
performance or extremely poor performance, for which the star rating effect is dominated by other man-
agerial incentives. The estimate γ̂ remains little changed and statistically insignificant even when funds
with one- and five-star ratings during month t − 1 are excluded from the analysis. The fund-level tests
yield qualitatively similar results in Panel B. The estimate γ̂ is statistically insignificant in all columns.
The null results in placebo tests, combined with the significant results in Section 4.1, corroborate that the
star rating effect on risk taking of mutual funds is likely causal. Furthermore, the null results from a slight
distortion of the actual distance to a rating threshold also support the implicit assumption that mutual
fund managers have fairly precise knowledge about their funds’ within-category rankings of Morningstar
RAR (or equivalently, Sharpe ratio), which can be estimated using public information.
[Insert Table 8]
5.2 Index Funds
Unlike actively-managed mutual funds, index mutual funds that are passively tracking benchmark
indexes have a limited ability to dynamically alter portfolio risk. For this reason, index funds have thus
far been excluded from the analysis, although they are included in the ranking procedure and in the
computation of category benchmark returns, to be consistent with the Morningstar rating methodology.
In this subsection, I conduct placebo tests using index mutual funds because there is little reason to expect
star ratings to affect risk-taking behavior of index funds. I re-estimate the linear regression model in Eq.
(1) in a sample of index funds.
The regression results are presented in Table 9. In column (1) of Panel A, the estimate γ̂ is positive,
rather than being negative, and statistically insignificant. When share class characteristics are controlled
for in column (2), the estimate γ̂ remains little changed and statistically insignificant. In columns (3) and
(4), I exclude funds with extremely good performance or extremely poor performance, for which the star
rating effect is dominated by other managerial incentives. The estimate γ̂ remains statistically insignificant
even when funds with one- and five-star ratings during month t − 1 are excluded from the analysis. The
fund-level tests yield qualitatively similar results in Panel B. The null results in a sample of index mutual
funds lend further support to the claim that the star rating effect on risk taking of mutual funds is likely
causal.
[Insert Table 9]
6 Conclusion
This paper presents evidence that the discrete nature of Morningstar ratings can have a significant
effect on risk taking of mutual funds. Mutual fund managers who care about star ratings can increase
portfolio risk in an attempt to boost within-category rankings of their funds when they are in the vicinity
of rating cutoffs. Compared to their distant peers, mutual funds near rating thresholds significantly take
on more risk by loading up betas relative to category benchmarks, while holding tracking errors relatively
fixed. This risk-shifting behavior is consistent with an attempt to increase within-category rankings of
a fund’s Morningstar “Risk-Adjusted Return” (RAR) (or equivalently, Sharpe ratio), thereby potentially
improving the fund’s star ratings.
Despite mounting evidence that mutual fund investors really care about star ratings (e.g., Del
Guercio and Tkac (2008), Reuter and Zitzewitz (2015), Evans and Sun (Forthcoming), Ben-David et al.
(2019)), there is relatively little research on how star ratings incentivize mutual fund managers. Mutual
fund managers do care about star ratings and star ratings can affect managerial incentives of mutual fund
managers. For instance, Kim (2020) shows that mutual fund managers can inflate month-end net asset
values (NAVs) of their funds through an illegal trading practice known as “portfolio pumping” on the
last trading day of the month when they are likely to finish in the vicinity of rating thresholds. While
Kim (2020) focuses on high-frequency trading strategies during the last minutes of the trading session on
the last trading day of the month, this paper focuses on low-frequency portfolio strategies designed to boost
within-category rankings of Sharpe ratio, thus improving star ratings.
Unlike portfolio pumping, it is less clear how star ratings would affect risk taking of mutual funds.
This paper shows that the star rating effect on risk taking works through increasing beta rather than
volatility. That is, mutual fund managers increase portfolio betas relative to category benchmarks, without
substantially increasing tracking errors around benchmarks. This risk-shifting behavior is consistent with
an attempt to boost within-category rankings of Morningstar RAR, which rewards higher returns, but
penalizes risk as measured by volatility (Sharpe (1998), Ben-David et al. (2019)).
The literature on mutual funds’ risk taking has focused on extreme cases. At one extreme,
mutual funds with extremely good performance alter the riskiness of their portfolios (e.g., Chevalier and
Ellison (1997)) due to the convex flow-performance relation (Ippolito (1992), Sirri and Tufano (1998)).
At the other extreme, mutual funds with extremely poor performance substantially increase portfolio
risk (Khorana (2001)) because extremely poor performance may lead to management turnover (Khorana
(1996)). This paper shows that without experiencing extreme performance, mutual fund managers can
still face incentives to increase portfolio risk induced by the discrete nature of star ratings. In addition, the
star rating effect on risk taking of mutual funds is significantly more pronounced among funds for which
(overall) star ratings are completely determined by three-year star ratings, which are presumably much
easier to influence than five- and ten-year star ratings.
References
Aragon, George O., and Vikram Nanda, 2012, Tournament behavior in hedge funds: High-water marks,
fund liquidation, and managerial stake, Review of Financial Studies 25, 937–974.
Barber, Brad M., Xing Huang, and Terrance Odean, 2016, Which factors matter to investors? Evidence
from mutual fund flows, Review of Financial Studies 29, 2600–2642.
Ben-David, Itzhak, Jiacui Li, Andrea Rossi, and Yang Song, 2019, What do mutual fund investors really
care about?, Working Paper.
Berk, Jonathan B., and Jules H. van Binsbergen, 2016, Assessing asset pricing models using revealed
preference, Journal of Financial Economics 119, 1–23.
Brown, Keith C., W. V. Harlow, and Laura T. Starks, 1996, Of tournaments and temptations: An analysis
of managerial incentives in the mutual fund industry, Journal of Finance 51, 85–110.
Carhart, Mark M., Ron Kaniel, David K. Musto, and Adam V. Reed, 2002, Leaning for the tape: Evidence
of gaming behavior in equity mutual funds, Journal of Finance 57, 661–693.
Chevalier, Judith, and Glenn Ellison, 1997, Risk taking by mutual funds as a response to incentives, Journal
of Political Economy 105, 1167–1200.
Christoffersen, Susan E.K., and Mikhail Simutin, 2017, On the demand for high-beta stocks: Evidence
from mutual funds, Review of Financial Studies 30, 2596–2620.
Del Guercio, Diane, and Paula A. Tkac, 2008, Star power: The effect of Morningstar ratings on mutual
fund flow, Journal of Financial and Quantitative Analysis 43, 907–936.
Evans, Richard B., and Yang Sun, Forthcoming, Models or stars: The role of asset pricing models and
heuristics in investor risk adjustment, Review of Financial Studies.
Ippolito, Richard A., 1992, Consumer reaction to measures of poor quality: Evidence from the mutual
fund industry, Journal of Law and Economics 35, 45–70.
Khorana, Ajay, 1996, Top management turnover: An empirical investigation of mutual fund managers,
Journal of Financial Economics 40, 403–427.
Khorana, Ajay, 2001, Performance changes following top management turnover: Evidence from open-end
mutual funds, Journal of Financial and Quantitative Analysis 36, 371–393.
Kim, Sanghyun Hugh, 2020, Do mutual fund managers care about star ratings? Evidence from portfolio
pumping, Working Paper.
Lee, Jung Hoon, Charles Trzcinka, and Shyam Venkatesan, 2019, Do portfolio manager contracts contract
portfolio management?, Journal of Finance 74, 2543–2577.
Reuter, Jonathan, and Eric Zitzewitz, 2015, How much does size erode mutual fund performance? A
regression discontinuity approach, Working Paper.
Sharpe, William F., 1998, Morningstar’s risk-adjusted ratings, Financial Analysts Journal 54, 21–33.
Sirri, Erik R., and Peter Tufano, 1998, Costly search and mutual fund flows, Journal of Finance 53, 1589–
1622.
Zweig, Jason, 1997, Watch out for the year-end fund flimflam, Money Magazine November 1, 130–133.
Table 1: Transition Matrix of Star Ratings
This table reports the transition matrix where each element in row i and column j represents the probability (in
percent) of a mutual fund share class receiving star rating j during month t conditional on its receiving star rating i
during month t−1. At the end of each month, mutual fund share classes are rated by Morningstar on an integer scale
of one star (the lowest rating) to five stars (the highest rating) on the basis of Morningstar “Risk-Adjusted Return”
(RAR), which adjusts for risk and accounts for all sales charges, over the prior three, five, and ten years, depending
on data availability. Whereas all U.S. equity funds were ranked against each other until May 2002, Morningstar
started ranking funds within categories in June 2002, when Morningstar introduced its 3 × 3 categories along
the size dimension (Small, Mid-Cap, or Large) and the value dimension (Value, Blend, or Growth). On the basis of
within-category rankings of Morningstar RAR, the top 10% of mutual fund share classes receive five stars, the next
22.5% four stars, the middle 35% three stars, the next 22.5% two stars, and the bottom 10% receive one star. Overall
star ratings (or simply referred to as star ratings in the paper) are determined by the weighted averages of three-,
five-, and ten-year star ratings, depending on data availability, rounded to the nearest integer value. Share classes less
than three years old are not rated. Share classes at least three years old and less than five years old are rated based
only on three-year star ratings. Share classes at least five years old and less than ten years old are rated based on
three-year star ratings (40 percent weight) and five-year star ratings (60 percent weight). Share classes at least ten
years old are rated based on three-year star ratings (20 percent weight), five-year star ratings (30 percent weight),
and ten-year star ratings (50 percent weight).
Panel A: Three-year star ratings

                      During month t
During month t − 1    1       2       3       4       5
1                     87.08   12.63   0.24    0.04    0.01
2                     5.20    83.82   10.82   0.15    0.01
3                     0.05    8.21    84.10   7.54    0.10
4                     0.02    0.15    13.16   80.75   5.92
5                     0.004   0.04    0.38    14.93   84.65
Panel B: Five-year star ratings

                      During month t
During month t − 1    1       2       3       4       5
1                     89.50   10.33   0.13    0.03    0.01
2                     4.39    86.69   8.80    0.10    0.02
3                     0.03    6.73    87.18   6.00    0.06
4                     0.01    0.09    10.78   84.28   4.84
5                     0.003   0.01    0.21    12.30   87.47
Table 1–Continued
Panel C: Ten-year star ratings

                      During month t
During month t − 1    1       2       3       4       5
1                     92.64   7.25    0.10    0.004   0.002
2                     3.39    90.03   6.52    0.05    0.01
3                     0.02    4.98    90.87   4.11    0.03
4                     0.002   0.04    7.61    88.90   3.45
5                     0       0.002   0.05    8.94    91.00
Panel D: (Overall) star ratings

                      During month t
During month t − 1    1       2       3       4       5
1                     87.55   12.18   0.23    0.03    0.01
2                     3.81    86.34   9.73    0.11    0.01
3                     0.04    6.72    86.87   6.32    0.05
4                     0.01    0.11    10.82   84.66   4.41
5                     0.001   0.02    0.29    14.16   85.53
Table 2: Summary statistics
This table reports the summary statistics. Distance to a rating threshold is the distance between within-category
percentile rankings of a mutual fund share class (or fund) based on the Sharpe ratio and its nearest rating threshold.
Ratio of portfolio betas is the ratio of beta of daily returns of mutual fund share class (or fund) relative to its category
benchmark returns during month t to that during month t − 1. Ratio of tracking errors is the ratio of volatility of
daily returns of mutual fund share class (or fund) in excess of its category benchmark returns during month t to that
during month t − 1. Other share-class-level variables include total net assets (TNA) (in $ million), age (in years),
expense ratio (in percent), turnover ratio, an indicator variable for institutional share class, and an indicator variable
for less-than-five-year-old share class. To construct fund-level variables, I aggregate share-class-level variables to the
fund-level by computing the sum of TNAs, the maximum of ages, and the value-weighted averages of other variables,
weighted by TNA at the end of the prior month. Indicator variables are replaced by the proportion of TNAs from
share classes for which indicator variables take the value of one. All variables are winsorized at 1% and 99%. The
sample covers the period from 1990 to 2018.
Panel A: Share-class-level variables

Statistic                       N           Mean      St. Dev.  Pctl(25)  Median    Pctl(75)
Distance to a rating threshold  1,161,177   0.07      0.04      0.03      0.06      0.10
Ratio of portfolio betas        1,161,177   1.01      0.13      0.94      1.00      1.06
Ratio of tracking errors        1,161,177   1.05      0.35      0.81      1.00      1.23
TNA ($ million)                 1,161,177   481.72    1,347.78  12.07     65.13     303.48
Age (in years)                  1,161,177   11.34     8.91      5.53      8.96      14.07
Expense ratio (%)               1,150,186   1.34      0.51      0.97      1.25      1.73
Turnover ratio                  1,093,419   0.77      0.63      0.33      0.61      1.00
1(Institutional)                1,161,177   0.21      0.41      0         0         0
1(Age < 5 years old)            1,161,177   0.25      0.43      0         0         1
Panel B: Fund-level variables

Statistic                       N           Mean      St. Dev.  Pctl(25)  Median    Pctl(75)
Distance to a rating threshold  469,337     0.07      0.04      0.04      0.06      0.09
Ratio of portfolio betas        469,337     1.01      0.15      0.94      1.00      1.07
Ratio of tracking errors        469,337     1.05      0.35      0.81      1.00      1.23
TNA ($ million)                 469,337     1,191.80  2,983.77  68.39     256.75    970.92
Age (in years)                  469,337     14.90     11.93     6.62      11.41     18.74
Expense ratio (%)               463,430     1.18      0.42      0.92      1.14      1.40
Turnover ratio                  436,741     0.77      0.66      0.32      0.59      1.00
% Institutional TNA             469,337     0.27      0.38      0         0         0.6
% less-than-five-year-old TNA   469,337     0.27      0.43      0         0         1
Table 3: Star Ratings and Portfolio Betas
This table presents the results of the following linear regression model:
$$\frac{\beta_{i,t}}{\beta_{i,t-1}} = \gamma \times \text{Squared distance}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t}$$
where i indexes mutual fund share classes (or funds) and t indexes time in months. All regressions are run separately
at the share class (security) level and at the fund (portfolio) level. The dependent variable is the ratio of beta of
daily returns of mutual fund share class (or fund) i relative to its category benchmark returns during month t to
that during month t − 1. Squared distancei,t−1 is mutual fund share class (or fund) i’s squared distance between its
within-category percentile rankings based on the Sharpe ratio and its nearest rating threshold at the end of month
t − 1. Covariatesi,t−1 is a vector of mutual fund share class (or fund) characteristics that include the logarithm of
total net assets (TNA) (in $ million), the logarithm of age (in years), expense ratio (in percent), turnover ratio, and an
indicator variable for institutional share class, all measured at the end of month t−1. Share-class-level regressions are
estimated using the weighted least squares (WLS) estimation in which each share-class-month observation is weighted
by the ratio of share-class TNA to fund TNA at the end of month t − 1 so that each fund-month observation is
equally weighted. For fund-level tests, I aggregate share-class-level variables to the fund-level by computing the sum
of TNAs, the maximum of ages, and the value-weighted averages of other variables, weighted by TNA at the end
of the prior month. An indicator variable for institutional share class is replaced by the proportion of TNAs from
institutional share classes. Fund-level regressions are estimated using the ordinary least squares (OLS) estimation.
All regressions include time fixed-effects (θt ), standard errors are double-clustered by fund and by time, and the
resulting t-statistics are reported in parentheses. Statistical significance at the 10%, 5%, and 1% level is indicated
by *, **, and ***, respectively. The sample covers the period from 1990 to 2018.
Panel A: Share-class-level tests (WLS)
                    Ratio of portfolio betas
                    (1)           (2)
Squared distance    −0.080∗∗      −0.070∗∗
                    (−2.344)      (−2.129)
log(TNA)                          −0.001∗
                                  (−1.759)
log(Age)                          −0.001
                                  (−0.827)
Expense ratio                     0.004∗∗
                                  (2.222)
Turnover ratio                    −0.001
                                  (−0.756)
1(Institutional)                  −0.0002
                                  (−0.371)
Time fixed-effects  Yes           Yes
Observations        1,161,177     1,087,619
Adjusted R2         0.025         0.026
Table 3–Continued
Panel B: Fund-level tests (OLS)

(1) (2)
Squared distance −0.086∗∗ −0.075∗∗
(−2.165) (−1.975)
log(TNA) −0.001∗∗
(−2.097)
log(Age) −0.001
(−1.007)
(2.399)
(−0.842)
% Institutional TNA −0.001
(−1.072)
Observations 469,337 434,472
Adjusted R2 0.025 0.026
Table 4: Star Ratings and Tracking Errors
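$$\frac{\sigma(r_{i,t} - r^{b}_{i,t})}{\sigma(r_{i,t-1} - r^{b}_{i,t-1})} = \gamma \times \text{Squared distance}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t}$$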
at the share class (security) level and at the fund (portfolio) level. The dependent variable is the ratio of volatility
of daily returns of mutual fund share class (or fund) i in excess of its category benchmark returns during month t to
equally weighted. For fund-level tests, I aggregate share-class-level variables to the fund-level by computing the sum
of TNAs, the maximum of ages, and the value-weighted averages of other variables, weighted by TNA at the end
of the prior month. An indicator variable for institutional share class is replaced by the proportion of TNAs from
institutional share classes. Fund-level regressions are estimated using the ordinary least squares (OLS) estimation.
All regressions include time fixed-effects (θt ), standard errors are double-clustered by fund and by time, and the
resulting t-statistics are reported in parentheses. Statistical significance at the 10%, 5%, and 1% level is indicated
Ratio of tracking errors

(1) (2)
Squared distance −0.028 −0.041
(−0.625) (−0.911)
log(TNA) −0.001∗∗∗
(−3.977)
log(Age) 0.001
(1.644)
Expense ratio −0.002
(−1.043)
Turnover ratio 0.001
(0.598)
(−0.912)
Observations 1,161,177 1,087,619
Adjusted R2 0.296 0.299
Table 4–Continued
Ratio of tracking errors

(1) (2)
(−0.648) (−0.955)
log(TNA) −0.001∗∗∗
(−4.244)
log(Age) 0.001∗
(1.716)
Expense ratio −0.002
(−0.840)
Turnover ratio 0.001
(0.616)
(−0.948)
Adjusted R2 0.298 0.301
Table 5: Other Managerial Incentives
This table presents the results of the following linear regression model in sub-samples split by star ratings during month t − 1:
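$$\frac{\beta_{i,t}}{\beta_{i,t-1}} = \gamma \times \text{Squared distance}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t}$$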
where i indexes mutual fund share classes (or funds) and t indexes time in months. All regressions are run separately at the share class (security)
level and at the fund (portfolio) level. The dependent variable is the ratio of beta of daily returns of mutual fund share class (or fund) i relative
to its category benchmark returns during month t to that during month t − 1. Squared distancei,t−1 is mutual fund share class (or fund) i’s
squared distance between its within-category percentile rankings based on the Sharpe ratio and its nearest rating threshold at the end of month
t − 1. Covariatesi,t−1 is a vector of mutual fund share class (or fund) characteristics that include the logarithm of total net assets (TNA)
(in $ million), the logarithm of age (in years), expense ratio (in percent), turnover ratio, and an indicator variable for institutional share class,
all measured at the end of month t − 1. Share-class-level regressions are estimated using the weighted least squares (WLS) estimation in which
each share-class-month observation is weighted by the ratio of share-class TNA to fund TNA at the end of month t − 1 so that each fund-month
observation is equally weighted. For fund-level tests, I aggregate share-class-level variables to the fund-level by computing the sum of TNAs, the
maximum of ages, and the value-weighted averages of other variables, weighted by TNA at the end of the prior month. An indicator variable
for institutional share class is replaced by the proportion of TNAs from institutional share classes. Star ratings of a fund are computed by the
TNA-weighted averages of star ratings of share classes belonging to the same fund, rounded to the nearest integer value. Fund-level regressions are
estimated using the ordinary least squares (OLS) estimation. All regressions include time fixed-effects (θt ), standard errors are double-clustered
by fund and by time, and the resulting t-statistics are reported in parentheses. Statistical significance at the 10%, 5%, and 1% level is indicated
Table 5–Continued

                    One-star funds                      Two-, three-, and four-star funds   Five-star funds
                    (1)               (2)               (3)               (4)               (5)               (6)
Squared distance    0.667∗∗∗          0.590∗∗∗          −0.098∗∗∗         −0.090∗∗∗         0.597∗∗∗          0.559∗∗∗
                    (3.654)           (3.098)           (−3.134)          (−2.857)          (2.594)           (2.641)
log(TNA)                              −0.001                              −0.001∗                             −0.0002
                                      (−1.432)                            (−1.795)                            (−0.539)
log(Age)                              −0.0002                             −0.0001                             −0.001
                                      (−0.116)                            (−0.096)                            (−0.857)
Expense ratio                         0.005∗∗                             0.003∗                              0.003
                                      (2.320)                             (1.827)                             (1.415)
Turnover ratio                        0.0002                              −0.001                              −0.001
                                      (0.153)                             (−1.012)                            (−0.562)
1(Institutional)                      −0.00004                            −0.0001                             −0.002∗∗
                                      (−0.025)                            (−0.114)                            (−2.343)
Time fixed-effects  Yes               Yes               Yes               Yes               Yes               Yes
Observations        83,615            77,411            980,209           918,952           87,587            82,558
Adjusted R2         0.071             0.074             0.034             0.035             0.100             0.102
Table 5–Continued

                    One-star funds                      Two-, three-, and four-star funds   Five-star funds
                    (1)               (2)               (3)               (4)               (5)               (6)
Squared distance    1.139∗∗∗          0.837∗∗∗          −0.111∗∗∗         −0.105∗∗∗         0.649∗∗           0.585∗∗
                    (4.260)           (3.084)           (−2.987)          (−2.788)          (2.280)           (2.180)
log(TNA)                              −0.002∗                             −0.001∗∗                            −0.001
                                      (−1.752)                            (−2.190)                            (−1.554)
log(Age)                              −0.002                              −0.0001                             −0.001
                                      (−1.137)                            (−0.144)                            (−1.209)
Expense ratio                         0.006∗∗                             0.003∗∗                             0.004
                                      (1.977)                             (2.093)                             (1.198)
Turnover ratio                        −0.0002                             −0.002                              −0.001
                                      (−0.097)                            (−1.044)                            (−0.555)
% Institutional TNA                   −0.006∗∗∗                           −0.0004                             −0.003∗
                                      (−2.837)                            (−0.479)                            (−1.934)
Time fixed-effects  Yes               Yes               Yes               Yes               Yes               Yes
Observations        32,199            29,273            391,135           362,417           43,172            40,327
Adjusted R2         0.071             0.075             0.035             0.036             0.103             0.104
Table 6: The Length of Track Record
This table presents the results of the following linear regression model with an interaction term:
\[
\frac{\beta_{i,t}}{\beta_{i,t-1}}
= \delta \times \text{Squared distance}_{i,t-1} \times \mathbf{1}(\text{Age}_{i,t-1} < 5 \text{ years old})
+ \gamma \times \text{Squared distance}_{i,t-1}
+ \rho \times \mathbf{1}(\text{Age}_{i,t-1} < 5 \text{ years old})
+ \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t}
\]
at the share class (security) level and at the fund (portfolio) level. The dependent variable is the ratio of the beta of daily
returns of mutual fund share class (or fund) i relative to its category benchmark returns during month t to that during
month t − 1. The interacted variable, 1(Age_{i,t−1} < 5 years old), is an indicator variable that takes the value of one if
3-year star ratings completely determine overall star ratings during month t − 1. For fund-level tests, the indicator
is replaced by the proportion of TNA from share classes for which 3-year star ratings completely determine
overall star ratings (% less-than-five-year-old TNA). Squared distance_{i,t−1} is mutual fund share class (or fund) i’s
squared distance between its within-category percentile rankings based on the Sharpe ratio and its nearest rating
threshold at the end of month t − 1. Covariates_{i,t−1} is a vector of mutual fund share class (or fund) characteristics
that include the logarithm of total net assets (TNA) (in $ million), the logarithm of age (in years), expense ratio (in
percent), turnover ratio, and an indicator variable for institutional share class, all measured at the end of month
t − 1. Share-class-level regressions are estimated by weighted least squares (WLS), in which each
share-class-month observation is weighted by the ratio of share-class TNA to fund TNA at the end of month t − 1
so that each fund-month observation is equally weighted. For fund-level tests, I aggregate share-class-level variables
to the fund level by computing the sum of TNAs, the maximum of ages, and the value-weighted averages of other
variables, weighted by TNA at the end of the prior month. The indicator variable for institutional share class is
replaced by the proportion of TNA from institutional share classes. Mutual fund share classes (or funds) with
one- and five-star ratings during month t − 1 are excluded from the analysis. A fund's star rating is computed
as the TNA-weighted average of the star ratings of its share classes, rounded to the nearest
integer. Fund-level regressions are estimated by ordinary least squares (OLS). All regressions
include time fixed-effects (θ_t), standard errors are double-clustered by fund and by time, and the resulting t-statistics
are reported in parentheses. Statistical significance at the 10%, 5%, and 1% level is indicated by *, **, and ***,
respectively. The sample covers the period from 1990 to 2018.

Table 6–Continued

                                          (1)         (2)
Squared distance × 1(Age < 5 years old)   −0.234∗∗∗   −0.246∗∗∗
                                          (−2.601)    (−2.764)
Squared distance                          −0.043∗     −0.034
                                          (−1.773)    (−1.359)
1(Age < 5 years old)                      0.001       0.0001
                                          (1.553)     (0.140)
log(TNA)                                              −0.001∗
                                                      (−1.824)
log(Age)                                              −0.0004
                                                      (−0.643)
Expense ratio                                         0.003∗
                                                      (1.778)
                                                      (−1.014)
                                                      (−0.180)
Adjusted R²                               0.034       0.036

Table 6–Continued

                                                  (1)         (2)
Squared distance × % less-than-five-year-old TNA  −0.250∗∗    −0.262∗∗∗
                                                  (−2.485)    (−2.645)
                                                  (−1.545)    (−1.245)
% less-than-five-year-old TNA                     0.001       −0.0002
                                                  (1.433)     (−0.206)
log(TNA)                                                      −0.001∗∗
                                                              (−2.193)
log(Age)                                                      −0.0005
                                                              (−0.688)
                                                              (2.065)
                                                              (−1.042)
                                                              (−0.516)
Adjusted R²                                       0.035       0.036
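
The dependent variable in Table 6, the ratio of a fund's beta in month t to its beta in month t − 1, can be illustrated with a minimal sketch. It assumes that beta is the OLS slope from regressing a fund's daily returns on its category benchmark's daily returns within a month. The function names and the toy return series are hypothetical and are not taken from the paper's code or data.

```python
from statistics import mean


def ols_beta(fund_returns, benchmark_returns):
    """Slope from regressing fund daily returns on benchmark daily returns."""
    if len(fund_returns) != len(benchmark_returns) or len(fund_returns) < 2:
        raise ValueError("need two equal-length series with at least two days")
    f_bar = mean(fund_returns)
    b_bar = mean(benchmark_returns)
    cov = sum((f - f_bar) * (b - b_bar)
              for f, b in zip(fund_returns, benchmark_returns))
    var = sum((b - b_bar) ** 2 for b in benchmark_returns)
    if var == 0:
        raise ValueError("benchmark returns do not vary")
    return cov / var


def beta_ratio(fund_t, bench_t, fund_prev, bench_prev):
    """Beta in month t divided by beta in month t - 1."""
    return ols_beta(fund_t, bench_t) / ols_beta(fund_prev, bench_prev)


# Hypothetical daily returns: the fund tracks the benchmark one-for-one in
# month t - 1 and scales its benchmark exposure by 1.2 in month t.
bench_prev = [0.010, -0.020, 0.015, 0.005]
fund_prev = list(bench_prev)
bench_t = [0.020, -0.010, 0.000, 0.010]
fund_t = [1.2 * r for r in bench_t]

ratio = beta_ratio(fund_t, bench_t, fund_prev, bench_prev)
print(f"beta ratio: {ratio:.3f}")
```

A ratio above one indicates that the fund loaded up on benchmark beta relative to the prior month, which is the risk-shifting margin the regressions examine.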

Table 7: The Calendar Month Effect
This table presents the results of the following linear regression model in sub-samples split by calendar months:
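\[
\frac{\beta_{i,t}}{\beta_{i,t-1}} = \gamma \times \text{Squared distance}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t}
\]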
[...] equally weighted. For fund-level tests, I aggregate share-class-level variables to the fund level by computing the
sum of TNAs, the maximum of ages, and the value-weighted averages of other variables, weighted by TNA at the
end of the prior month. The indicator variable for institutional share class is replaced by the proportion of TNA
from institutional share classes. Mutual fund share classes (or funds) with one- and five-star ratings during month
t − 1 are excluded from the analysis. A fund's star rating is computed as the TNA-weighted average of the star
ratings of its share classes, rounded to the nearest integer. Fund-level regressions are
estimated by ordinary least squares (OLS). All regressions include time fixed-effects (θ_t), standard
errors are double-clustered by fund and by time, and the resulting t-statistics are reported in parentheses. Statistical
significance at the 10%, 5%, and 1% level is indicated by *, **, and ***, respectively. The sample covers the period
from 1990 to 2018.

                      Mar, Jun, Sep, Dec      Other months
                      (1)         (2)         (3)         (4)
Squared distance      −0.166∗∗∗   −0.155∗∗    −0.064      −0.057
                      (−2.630)    (−2.402)    (−1.559)    (−1.376)
log(TNA)                          −0.0005                 −0.001
                                  (−0.776)                (−1.552)
log(Age)                          −0.001                  0.0005
                                  (−0.994)                (0.678)
Expense ratio                     0.007∗∗                 0.001
                                  (2.525)                 (0.388)
Turnover ratio                    −0.003                  −0.001
                                  (−1.280)                (−0.386)
1(Institutional)                  0.003∗                  −0.001∗
                                  (1.728)                 (−1.720)
Time fixed-effects    Yes         Yes         Yes         Yes
Observations          327,190     306,619     653,019     612,333
Adjusted R²           0.028       0.030       0.038       0.039

Table 7–Continued

                      Mar, Jun, Sep, Dec      Other months
                      (1)         (2)         (3)         (4)
Squared distance      −0.191∗∗    −0.183∗∗    −0.071      −0.066
                      (−2.571)    (−2.397)    (−1.439)    (−1.313)
log(TNA)                          −0.001                  −0.001∗∗
                                  (−0.847)                (−1.977)
log(Age)                          −0.001                  0.0005
                                  (−0.941)                (0.603)
Expense ratio                     0.008∗∗∗                0.001
                                  (2.695)                 (0.552)
                                  (−1.468)                (−0.314)
% Institutional TNA               0.003∗                  −0.002∗∗
                                  (1.727)                 (−2.244)
Observations          130,015     120,428     261,120     241,989
Adjusted R²           0.029       0.030       0.039       0.040
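
The fund-level aggregation described in the table notes (sum of TNAs, maximum of ages, TNA-weighted averages of the other variables, and a rounded TNA-weighted star rating) can be sketched as follows. This is an illustration under stated assumptions: the dictionary keys and toy values are hypothetical, and rounding ties upward is an assumption because the notes only say "rounded to the nearest integer".

```python
import math


def aggregate_to_fund(share_classes):
    """Collapse the share classes of one fund-month into fund-level variables.

    Each share class is a dict with keys: tna (prior month-end TNA, used as
    the weight), age, expense_ratio, turnover, institutional (0 or 1), stars.
    """
    total_tna = sum(sc["tna"] for sc in share_classes)
    if total_tna <= 0:
        raise ValueError("total TNA must be positive")

    def tna_weighted(key):
        return sum(sc["tna"] * sc[key] for sc in share_classes) / total_tna

    return {
        "tna": total_tna,                                  # sum of TNAs
        "age": max(sc["age"] for sc in share_classes),     # oldest share class
        "expense_ratio": tna_weighted("expense_ratio"),
        "turnover": tna_weighted("turnover"),
        # The institutional indicator becomes the institutional share of TNA.
        "pct_institutional_tna": tna_weighted("institutional"),
        # Assumption: ties (e.g. 3.5) are rounded up.
        "stars": math.floor(tna_weighted("stars") + 0.5),
    }


# Hypothetical fund with a retail and an institutional share class.
fund = aggregate_to_fund([
    {"tna": 300.0, "age": 10.0, "expense_ratio": 1.0, "turnover": 0.5,
     "institutional": 0, "stars": 4},
    {"tna": 100.0, "age": 4.0, "expense_ratio": 0.6, "turnover": 0.5,
     "institutional": 1, "stars": 3},
])
print(fund)
```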

Table 8: Placebo tests exploiting the 2002 change in the rating methodology
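\[
\frac{\beta^{\text{Placebo}}_{i,t}}{\beta^{\text{Placebo}}_{i,t-1}} = \gamma \times \text{Squared distance}^{\text{Placebo}}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t}
\]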
[...] daily returns of mutual fund share class (or fund) i relative to its placebo category benchmark returns during month
t to that during month t − 1. Placebo category benchmark returns are computed as average returns of the funds that
belong to the same Morningstar category prior to June 2002 when Morningstar introduced its 3 × 3 categories along
the size and value dimensions, whereas average returns are computed across all U.S. equity funds starting in June
2002. Squared distance^Placebo_{i,t−1} is mutual fund share class (or fund) i’s squared placebo distance between its placebo
within-category percentile rankings based on the Sharpe ratio and its nearest rating threshold at the end of month
t − 1. Placebo within-category percentile rankings are obtained by reversing the June 2002 change in the Morningstar
rating methodology. That is, mutual fund share classes would be ranked within Morningstar categories prior to June
2002 when Morningstar introduced its 3 × 3 categories along the size and value dimensions, whereas all U.S. equity
funds would be ranked against each other starting in June 2002. Covariates_{i,t−1} is a vector of mutual fund share
class (or fund) characteristics that include the logarithm of total net assets (TNA) (in $ million), the logarithm of
age (in years), expense ratio (in percent), turnover ratio, and an indicator variable for institutional share class, all
measured at the end of month t − 1. Share-class-level regressions are estimated by weighted least squares
(WLS), in which each share-class-month observation is weighted by the ratio of share-class TNA to fund
TNA at the end of month t − 1 so that each fund-month observation is equally weighted. For fund-level tests, I
aggregate share-class-level variables to the fund level by computing the sum of TNAs, the maximum of ages, and the
value-weighted averages of other variables, weighted by TNA at the end of the prior month. The indicator variable for
institutional share class is replaced by the proportion of TNA from institutional share classes. In columns (3) and
(4), mutual fund share classes (or funds) with one- and five-star ratings during month t − 1 are excluded from the
analysis. A fund's star rating is computed as the TNA-weighted average of the star ratings of its share classes,
rounded to the nearest integer. Fund-level regressions are estimated by ordinary least
squares (OLS). All regressions include time fixed-effects (θ_t), standard errors are double-clustered by
fund and by time, and the resulting t-statistics are reported in parentheses. Statistical significance at the 10%, 5%,
and 1% level is indicated by *, **, and ***, respectively. The sample covers the period from 1990 to 2018.

Table 8–Continued
                          Ratio of portfolio betas^Placebo
                          All funds               Ex. one- and five-star funds
                          (1)         (2)         (3)         (4)
Squared distance^Placebo  −0.020      −0.011      −0.024      −0.015
                          (−0.560)    (−0.309)    (−0.629)    (−0.404)
log(TNA)                              −0.0003                 −0.0003
                                      (−1.223)                (−1.159)
log(Age)                              −0.0004                 −0.0001
                                      (−0.855)                (−0.187)
Expense ratio                         0.004∗                  0.003
                                      (1.666)                 (1.216)
                                      (−0.078)                (−0.256)
1(Institutional)                      −0.00001                0.0001
                                      (−0.012)                (0.055)
Observations              1,161,177   1,087,619   980,209     918,952
Adjusted R²               0.010       0.010       0.011       0.011

Table 8–Continued
                          Beta ratio^Placebo
                          (1)         (2)         (3)         (4)
Squared distance^Placebo  −0.016      −0.005      −0.029      −0.018
                          (−0.405)    (−0.114)    (−0.685)    (−0.431)
log(TNA)                              −0.0004                 −0.0004
                                      (−1.505)                (−1.456)
log(Age)                              −0.001                  −0.0001
                                      (−0.893)                (−0.231)
Expense ratio                         0.005∗                  0.004
                                      (1.714)                 (1.300)
                                      (−0.154)                (−0.314)
% Institutional TNA                   −0.001                  −0.0003
                                      (−0.423)                (−0.203)
Observations              469,337     434,472     391,135     362,417
Adjusted R²               0.009       0.010       0.011       0.011
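
The placebo construction in Table 8 can be sketched in a few lines: choose the placebo peer group by date, rank funds on the Sharpe ratio within that group, and take the squared distance to the nearest rating cutoff. Several details are assumptions made for illustration. The cutoffs (10th, 32.5th, 67.5th, and 90th percentiles) follow Morningstar's 10% / 22.5% / 35% / 22.5% / 10% star distribution, percentile ranks are expressed as fractions in [0, 1], and the ranking convention, function names, and toy Sharpe ratios are hypothetical.

```python
from datetime import date

# Assumed cutoffs, with percentile ranks expressed as fractions in [0, 1].
RATING_THRESHOLDS = (0.10, 0.325, 0.675, 0.90)
METHODOLOGY_CHANGE = date(2002, 6, 1)


def placebo_peer_group(month_end, category):
    """Reverse the June 2002 change: rank within category before June 2002,
    and against all U.S. equity funds from June 2002 onward."""
    return category if month_end < METHODOLOGY_CHANGE else "ALL_US_EQUITY"


def percentile_ranks(sharpe_by_fund):
    """Rank funds from lowest to highest Sharpe ratio as fractions in (0, 1].

    Assumed convention. The cutoffs are symmetric around 0.5, so ranking
    from best to worst instead shifts each rank by only 1/n."""
    ordered = sorted(sharpe_by_fund, key=sharpe_by_fund.get)
    n = len(ordered)
    return {fund: (i + 1) / n for i, fund in enumerate(ordered)}


def squared_distance(percentile):
    """Squared distance between a percentile rank and its nearest cutoff."""
    return min((percentile - cutoff) ** 2 for cutoff in RATING_THRESHOLDS)


# Hypothetical Sharpe ratios for four funds in one placebo peer group.
ranks = percentile_ranks({"A": 0.2, "B": 0.9, "C": 0.5, "D": 1.4})
distances = {fund: squared_distance(p) for fund, p in ranks.items()}
print(placebo_peer_group(date(2001, 12, 31), "Large Growth"))
print(placebo_peer_group(date(2002, 6, 30), "Large Growth"))
print(distances)
```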

Table 9: Placebo tests using index funds
This table presents the results of the following linear regression model in a sample of index funds:
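\[
\frac{\beta_{i,t}}{\beta_{i,t-1}} = \gamma \times \text{Squared distance}_{i,t-1} + \eta \times \text{Covariates}_{i,t-1} + \theta_t + \varepsilon_{i,t}
\]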
[...] by the ratio of share-class TNA to fund TNA at the end of month t − 1 so that each fund-month observation is equally
weighted. For fund-level tests, I aggregate share-class-level variables to the fund level by computing the sum of TNAs,
the maximum of ages, and the value-weighted averages of other variables, weighted by TNA at the end of the prior
month. The indicator variable for institutional share class is replaced by the proportion of TNA from institutional
share classes. In columns (3) and (4), mutual fund share classes (or funds) with one- and five-star ratings during
month t − 1 are excluded from the analysis. A fund's star rating is computed as the TNA-weighted average of the star
ratings of its share classes, rounded to the nearest integer. Fund-level regressions are
estimated by ordinary least squares (OLS). All regressions include time fixed-effects (θ_t), standard
errors are double-clustered by fund and by time, and the resulting t-statistics are reported in parentheses. Statistical
significance at the 10%, 5%, and 1% level is indicated by *, **, and ***, respectively. The sample covers the period
from 1990 to 2018.

                      (1)         (2)         (3)         (4)
Squared distance      0.007       0.029       0.006       0.017
                      (0.097)     (0.363)     (0.098)     (0.260)
log(TNA)                          −0.0003                 −0.0003
                                  (−1.116)                (−1.048)
log(Age)                          −0.0002                 −0.0001
                                  (−0.343)                (−0.106)
Expense ratio                     0.001                   0.001
                                  (0.915)                 (1.344)
Turnover ratio                    0.0002                  0.00005
                                  (0.418)                 (0.088)
1(Institutional)                  −0.001                  −0.0001
                                  (−1.641)                (−0.157)
Observations          77,109      69,951      70,945      64,402
Adjusted R²           0.207       0.209       0.211       0.210

Table 9–Continued

                      (1)         (2)         (3)         (4)
Squared distance      0.008       0.032       0.011       0.021
                      (0.092)     (0.324)     (0.148)     (0.272)
log(TNA)                          −0.0003                 −0.0002
                                  (−0.947)                (−0.901)
log(Age)                          −0.0003                 −0.0002
                                  (−0.404)                (−0.417)
Expense ratio                     0.001                   0.002
                                  (1.025)                 (1.408)
Turnover ratio                    0.0001                  −0.0004
                                  (0.135)                 (−0.726)
% Institutional TNA               −0.001∗                 −0.0001
                                  (−1.818)                (−0.301)
Observations          36,019      32,177      32,849      29,273
Adjusted R²           0.204       0.206       0.211       0.210
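
The share-class-level WLS weights described in the table notes can be sketched as below: each share-class-month is weighted by its share of fund TNA at the end of month t − 1, so the weights within a fund-month sum to one and every fund-month counts equally. The tuple layout, identifiers, and TNA values are hypothetical.

```python
from collections import defaultdict


def wls_weights(rows):
    """Return the WLS weight of each share-class-month observation.

    rows: iterable of (fund_id, month, share_class_id, tna_prev), where
    tna_prev is share-class TNA at the end of month t - 1.
    """
    rows = list(rows)
    fund_tna = defaultdict(float)
    for fund_id, month, _share_class, tna_prev in rows:
        fund_tna[(fund_id, month)] += tna_prev
    return {
        (fund_id, month, share_class): tna_prev / fund_tna[(fund_id, month)]
        for fund_id, month, share_class, tna_prev in rows
    }


# Hypothetical sample: fund F1 has two share classes, fund F2 has one.
weights = wls_weights([
    ("F1", "2010-01", "A", 300.0),
    ("F1", "2010-01", "I", 100.0),
    ("F2", "2010-01", "A", 50.0),
])
print(weights)
```

With these weights a fund with many share classes does not receive more influence in the regression than a single-class fund.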
