

Selecting Mutual Funds

Using Machine Learning Classifiers

Cyril Vanderhaeghen

May 26, 2019

Master of Science in Financial Markets

Under the supervision of Professor Christophe CROUX

EDHEC Business School does not express approval or disapproval concerning the opinions given in this paper, which are the sole responsibility of the author.


Abstract

This paper uses machine-learning-computed probabilities as fund selection signals and tests this signal in a fund-of-funds portfolio. Using time series data and alternative data, we trained several classification methods (Support Vector Machines, logistic regression, random forests and an artificial neural network) to be used as decision processes when rebalancing our portfolio of US mutual funds.

We found that the signal was relevant for accurately selecting funds; however, the models were mainly able to capture momentum information within mutual funds.

Table of Contents

1. Introduction
2. Related Work
3. Models Description
3.1 Logistic Regression
3.2 Support Vector Machines
3.3 Random Forests
3.4 Voting/Ensemble classifier
3.5 Artificial Neural Network
4. Data and Features
4.1 Return based features
4.2 Non-Net Asset Value based features
5. Calibrating the models and training results
5.1 Tuning the models' hyperparameters
5.2 Bayesian Optimization
5.3 Random Search
5.4 Results and model comparison
5.5 Relationship between the models' probabilities and the returns
6. Applying the models in a strategy
6.1 Back testing results
6.2 Predictions' accuracy
6.3 Looking at the momentum effect
7. Conclusion
References
List of Abbreviations

ANN Artificial Neural Network
AUC Area under the Curve
CV Cross-validation
IR Information Ratio
NAV Net Asset Value
ReLU Rectified Linear Unit
ROC Receiver Operating Characteristic
SVM Support Vector Machines

Table of Figures

Figure 1 Sigmoid function
Figure 2 SVM methodology
Figure 3 Random Forest methodology
Figure 4 Artificial Neural Network methodology
Figure 5 Conditional distributions of 3 months returns
Figure 6 Conditional distributions of 6 months returns
Figure 7 Conditional distributions of 12 months returns
Figure 8 Conditional distributions of the volatility
Figure 9 Distribution of the consistency feature
Figure 10 Conditional distributions of the number of days of existence
Figure 11 Average number of positive returns per state
Figure 12 Average number of positive returns per investment style
Figure 13 5-fold ROC curves
Figure 14 Regression of returns to logistic regression probabilities
Figure 15 Regression of returns to SVM probabilities
Figure 16 Regression of returns to random forest probabilities
Figure 17 Regression of returns to ensemble classifier probabilities
Figure 18 Regression of returns to ANN probabilities
Figure 19 Equal weight portfolio of all the funds over time
Figure 20 Excess returns for a quantile of 10%
Figure 21 Excess returns for a quantile of 20%
Figure 22 Excess returns for a quantile of 30%
Figure 23 Excess returns for a quantile of 40%
Figure 24 Excess returns for a quantile of 50%
Figure 25 Predictions' accuracy
Figure 26 Overall accuracy for models trained without momentum component
Figure 27 Excess returns for models trained without momentum component
Figure 28 Returns when choosing the top 10% funds
Figure 29 Strategies' value when choosing the top 10% funds
Figure 30 Returns when choosing the top 20% funds
Figure 31 Strategies' value when choosing the top 20% funds
Figure 32 Returns when choosing the top 30% funds
Figure 33 Strategies' value when choosing the top 30% funds
Figure 34 Returns when choosing the top 40% funds
Figure 35 Strategies' value when choosing the top 40% funds
Figure 36 Returns when choosing the top 50% funds
Figure 37 Strategies' value when choosing the top 50% funds

Table of Tables

Table 1 Hyperparameter tuning results and cross-validation scores
Table 2 Regressions' results
Table 3 Testing mean excess returns' significance for q = 10%
Table 4 Testing mean excess returns' significance for q = 20%
Table 5 Testing mean excess returns' significance for q = 30%
Table 6 Testing mean excess returns' significance for q = 40%
Table 7 Testing mean excess returns' significance for q = 50%
Table 8 Information ratios
Table 9 Mean accuracy of only the selected funds for different quantiles
Table 10 Testing results for higher than 50% accuracy

1. Introduction

This paper delivers an analysis of a mutual fund selection signal based on the probabilities output by machine learning classifiers. The weighting scheme of each component within the portfolio is based on the models' calculated probability that the fund yields a positive return over the investment horizon. Using common risk-adjusted measures and prediction precision measures, we will see how the strategies compare to one another when different models compute the investment signals. We will also analyze their performance with respect to a naïve momentum fund selection process.

A fund manager's expertise is mostly captured by their funds' track records. Similarly, a fund's commercial documents usually display past performance and present this information as a selling argument. However, when it comes to fund selection, studies show that, on average, actively managed funds struggle to outperform their benchmarks and other index funds (Fortin & Michelson, 2002). Yet numerous investors are willing to dedicate capital to mutual funds, and we can observe some successful funds within the industry, BlackRock or Vanguard to name a few. This lends support to the idea that some funds can consistently add value for their clients, and suggests that, using data on the funds' characteristics, one might be able to identify those performing funds.

Even the most popular fund classification and rating methods, like Morningstar's methodology, are based on historical performance and use portfolio characteristics such as asset allocation, market capitalization and value-growth score, as well as the beta and alpha of the funds. These ranking methods are widely used by investors as tools to choose among a universe of funds. Within this work, when choosing our explanatory variables, we will stray away from these classical measures and analyze the predictive power of combining usual return-based features with some non-financial features.


Because most studies use linear models such as the CAPM or factor models as the primary modelling tools in finance, more complex relationships might not be captured by these classical methods. Moreover, these financial models are regression algorithms, which aim at providing an estimate of a given target variable over a time period.

Many now-popular machine learning algorithms were developed in the late 20th century. It is only recently, given the exponential growth of computing power as well as the growing amount of data, that industries have taken an interest and have been able to efficiently leverage the power of these algorithms. In this paper, we will apply classification models that are not designed to give a continuous numerical prediction but rather a class and a prediction confidence for the target variable: in this study, a 1 for a positive return and a 0 for a negative one. We will be using both linear models like logistic regression and nonlinear models such as the multilayer perceptron, a common artificial neural network architecture.

2. Related Work

Whether fund-specific characteristics can yield predictive information is still not settled among researchers. Carhart (1997) showed that there is momentum information within mutual funds. However, results are mixed: for instance, Lakonishok, Shleifer, Vishny, Hart and Perry (1992) found no relationship between the performance of one year and the following one. Other research found, on average, no positive abnormal returns across mutual funds (Titman & Grinblatt, 1989).

Machine learning algorithms have found successful applications in various fields, image recognition (Krizhevsky, Sutskever & Hinton, 2012) or the medical sector (Deo, 2015) to name a few. Understandably, these algorithms have been a subject of great interest within the financial industry too. They are applied to a wide range of classical problems such as stock price prediction (Tarsauliya, Kant, Kala, Tiwari & Shukla, 2010), as well as more original problems like consumer credit risk modelling (Khandani, Kim & Lo, 2010), news sentiment analysis (Ho & Wang, 2016) and even pattern recognition on chart images using Convolutional Neural Networks (Gudelek, Boluk & Ozbayoglu, 2017).

When it comes to funds, Indro, Jiang, Patuwo and Zhang (1999) used an artificial neural network to predict mutual funds' performance. They found that the neural network performed better than classical linear models for growth and blend funds.

Ludwig and Piovoso (2005) applied decision trees, neural networks and naïve Bayes to compare money managers, using input features such as 1-, 2- and 5-year excess returns, percentage of outperforming quarters, tracking error and various ratios. The resulting accuracy in predicting subsequent managers' performance was above 65% and exceeded the performance of a simple scoring model.


This motivates the use, in this study, of a set of features consisting of both variables computed from past returns and non-financial indicators. Trained on this feature space, the models produce, at each given date, the predictions that drive the investment strategy in funds over the following time period.

3. Models Description

This section reviews the algorithms applied as well as the data used, and further describes how the algorithms are trained and validated.

Throughout this study, we will use the widespread open-source Scikit-Learn Python package for training, testing and validating our logistic regression, Support Vector Machine, random forest and voting ensemble models. To set up and calibrate our Artificial Neural Network (ANN), we will use both Keras and Scikit-Learn.

3.1 Logistic Regression

Logistic regression is a simple machine learning classification model that maps the result of a regression model to a sigmoid function:

\sigma(z) = \frac{1}{1 + e^{-z}}

with z being the result of a linear regression of our target variable on our explanatory variables.

Figure 1 Sigmoid function

The sigmoid function outputs a value in [0, 1], as displayed in Figure 1. Therefore, one can give a probabilistic interpretation of an element belonging to a certain class: in our case, the fund yielding positive returns in the next forecasting period. If the function outputs a value higher than 0.5, the datapoint is classified as 1, and as 0 otherwise.

3.2 Support Vector Machines

This algorithm classifies the data by constructing a hyperplane separating the training data's different classes (in our case, positive or negative returns) while maximizing the distance between this hyperplane and the training data. Thus, the hyperplane separates the class of funds yielding positive returns in the next forecasting period from those that do not. When predicting the funds' returns, the trained algorithm determines where the new fund's data points fall with respect to the hyperplane and infers a positive-return or negative-return class for the fund.

Figure 2 SVM methodology

Support Vector Machines do not originally give any probabilistic interpretation to their classifications; however, using Platt scaling (Platt, 1999), SVMs can be applied in a probabilistic setting.
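As a sketch of this, scikit-learn's SVC applies Platt scaling when instantiated with probability=True (placeholder data again; the hyperparameter values are illustrative, not the tuned ones from section 5):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 14)), rng.integers(0, 2, size=100)  # placeholders

# probability=True fits a Platt-scaling sigmoid on top of the SVM decision scores
svm = SVC(kernel="poly", degree=3, C=1.0, probability=True, random_state=42)
svm.fit(X, y)
p = svm.predict_proba(X)[:, 1]   # Platt-scaled P(positive return)
```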

3.3 Random Forests

Random forest algorithms construct several decision trees using the training data; each decision tree finds several simple binary rules to output a class. The random forest algorithm then takes the mode of the classes predicted by all the individual trees as its prediction.

Figure 3 Random Forest Methodology

Using multiple trees has the benefit of mitigating the tendency of individual trees to overfit the training dataset.

Probabilities can be inferred for the predictions by looking at the proportion of votes for each class across all the individual trees.
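The vote-proportion idea can be sketched directly from a fitted scikit-learn forest; note that scikit-learn's own predict_proba averages each tree's probabilistic estimate rather than counting hard votes, a closely related but not identical quantity (X and y are placeholders):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 14)), rng.integers(0, 2, size=100)  # placeholders

forest = RandomForestClassifier(n_estimators=200, random_state=42).fit(X, y)

# Fraction of individual trees voting for the positive class, per sample
tree_votes = np.stack([tree.predict(X) for tree in forest.estimators_])
vote_proportion = tree_votes.mean(axis=0)
```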

3.4 Voting/Ensemble classifier

This method aggregates the predictions of the previously mentioned algorithms by finding the class that maximizes the sum of the predicted probabilities from all the classifiers. On average, the ensemble model works better than single classifiers, since combining several classifiers reduces the prediction variance. The ensemble classifier will be composed of the logistic regression, SVM and random forest models.

The probabilities are simply computed as the average of the probabilities from each classifier.
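In scikit-learn this corresponds to a soft-voting VotingClassifier; the sketch below composes the three base models with placeholder hyperparameters:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 14)), rng.integers(0, 2, size=100)  # placeholders

ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(C=1.0)),
        ("svm", SVC(kernel="poly", probability=True)),   # Platt scaling enabled
        ("rf", RandomForestClassifier(n_estimators=200)),
    ],
    voting="soft",  # average the classifiers' predicted probabilities
)
ensemble.fit(X, y)
p_ensemble = ensemble.predict_proba(X)[:, 1]
```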

3.5 Artificial Neural Network

At its roots, an Artificial Neural Network (or multilayer perceptron) aims at replicating the scheme of biological brains: neurons connected to each other by synapses. The structure can incorporate any number of hidden layers and neurons per layer, each neuron from one layer being connected to all the neurons of the next layer. An ANN can in theory approximate any continuous real function.

Figure 4 Artificial Neural Network methodology

The value y_k of a neuron is the weighted sum of the values of the previous layer's neurons, defined as:

y_k = \varphi\left(\sum_{i=1}^{n} w_{k,i} x_i + b\right)

The x_i are the previous layer's neurons' values, the w_{k,i} the weights associated with each neuron, b is a bias and \varphi an activation function, typically the Rectified Linear Unit (ReLU), the hyperbolic tangent or the sigmoid function.

When fitting the model, the algorithm essentially finds all the appropriate weights between neurons. It does so using backpropagation, a method that performs gradient descent to adjust the weights between nodes toward their optimal values in order to minimise the loss function, the mean squared error for instance.

By having one neuron as the output of our ANN and choosing a sigmoid activation function for it, we obtain a probabilistic output given by the value of that neuron.
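A minimal Keras sketch of such a network follows; the layer sizes are placeholders (the tuned architecture comes from the Bayesian optimization of section 5.2), and binary cross-entropy is used as the loss, a common choice for a sigmoid output (the text mentions mean squared error as another example):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 14)), rng.integers(0, 2, size=100)  # placeholders

model = Sequential([
    Dense(15, activation="relu", input_dim=14),  # hidden layer 1 (placeholder size)
    Dense(10, activation="relu"),                # hidden layer 2 (placeholder size)
    Dense(1, activation="sigmoid"),              # single neuron: P(positive return)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Train on 66% of the data, validating on the remaining 33%, as in section 5.2
model.fit(X, y, epochs=60, validation_split=0.33, verbose=0)
```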

4. Data and Features

The analysed dataset is provided by the Wharton Research Data Services database; it stems from the "Survivor-Bias-Free US Mutual Fund" series. It contains historical information such as Net Asset Value (NAV) per share, cash percentage and 52-day low/high, as well as more diverse information such as the parent company's city/state and phone number. The data ranges from 1962 to today, with information on both active and liquidated funds.

Throughout this study we will use monthly NAV per share data, fund inception dates, as well as geographical and investment style data on the funds to conduct feature engineering.

In the next sections we provide a description of the features used as explanatory variables, which we split into two categories: features computed from the NAV per share, and alternative features not based on NAV per share. All the features used for model training are computed from 04/2000 to 04/2001 to train the models to predict the next quarter's returns, realized on 07/2001. This prediction date was chosen with the objective of having a balanced number of positive and negative returns, so that the models train as efficiently as possible. At the prediction date, there are 2247 funds with positive returns and 1543 with negative returns, so a slight imbalance with more positive returns.

Throughout this study we define the returns as follows:

r_t = \frac{NAV_t - NAV_{t-1}}{NAV_{t-1}}
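In pandas, this return definition amounts to a one-liner; the NAV values below are hypothetical:

```python
import pandas as pd

# Hypothetical monthly NAV-per-share series for two funds
nav = pd.DataFrame(
    {"fund_a": [10.0, 10.4, 10.1, 10.7], "fund_b": [25.0, 24.5, 24.9, 25.3]},
    index=pd.date_range("2000-04-30", periods=4, freq="M"),
)

# r_t = (NAV_t - NAV_{t-1}) / NAV_{t-1}, as defined above
returns = nav.pct_change().dropna()
```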

4.1 Return based features

Using the monthly NAV, we computed returns from which we defined the following 5 features (a sketch of their computation follows below):

- Consistency of returns within the last 12 months:

\text{Consistency} = \frac{\text{number of positive returns within the last 12 months}}{12}

- Annualized volatility of the monthly returns within the last 12 months, defined as the standard deviation of returns times \sqrt{12}

- Return over the last 3 months

- Return over the last 6 months

- Return over the last 12 months

The consistency and the 3-, 6- and 12-month returns were defined with the idea of capturing momentum effects.
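Here is a sketch of how these five features could be computed from a monthly returns table; compounding the 3-, 6- and 12-month returns is our assumption, as the exact computation is not spelled out above:

```python
import numpy as np
import pandas as pd

def return_features(monthly_returns: pd.DataFrame) -> pd.DataFrame:
    """monthly_returns: the last 12 monthly returns, one column per fund."""
    return pd.DataFrame({
        "consistency": (monthly_returns > 0).sum() / 12,
        "volatility": monthly_returns.std() * np.sqrt(12),       # annualized
        "ret_3m": (1 + monthly_returns.iloc[-3:]).prod() - 1,    # compounded
        "ret_6m": (1 + monthly_returns.iloc[-6:]).prod() - 1,
        "ret_12m": (1 + monthly_returns).prod() - 1,
    })
```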

The features defined above display the following distributions, conditional on the sign of the forward 3-month returns:

Figure 5 Conditional distributions of 3 months returns

Figure 6 Conditional distributions of 6 months returns

Figure 7 Conditional distributions of 12 months returns

As we can see in Figures 5, 6 and 7, the conditional distributions show a positive relationship with the next returns: more funds with positive 3-, 6- and 12-month returns yield positive returns over the next 3 months, and the distributions have observably different means.

Figure 8 displays the distribution of volatility conditional on the sign of the forward 3-month returns. We can confirm visually that the funds that will yield positive returns and those that will yield negative returns have different distributions.

Figure 8 Conditional distributions of the volatility

Like the previously plotted features, the consistency of returns, shown in Figure 9, appears to be negatively skewed and to capture momentum effects, as funds with higher past consistency tend to be more likely to yield positive returns over the next period.

Figure 9 Distribution of the consistency feature

4.2 Non-Net Asset Value based features

To capture information that might not be present in the returns, we defined one continuous

variable and eight dummy variables based on alternative data:

Length of existence Feature

The number of days of existence between the fund's inception date and the calculation date.

The distributions of forward returns displayed in Figure 10 do not differ much across the sign of the forward returns, besides a smaller variance for positive forward returns.

Figure 10 Conditional distributions of the number of days of existence

We defined four dummy variables based on the US state in which the fund's parent company is located; the rationale is that, depending on the location, a fund could have advantages such as better infrastructure, contacts or access to talent.

Location Feature

For the 47 different states within the database, we averaged the number of positive returns of all the funds located in each state, and defined the four categories using the 25%, 50% and 75% quantiles. Figure 11 displays the states' average number of positive returns, ranked from highest to lowest.

We assume that the state location features are stable and do not change over time.

Figure 11 Average number of positive returns per state

CRSP Style Feature

We follow the same methodology to define four categorical variables based on the Wiesenberger, Strategic Insight and Lipper objective codes. The CRSP Style Code consists of up to four characters, with each position defined. Reading left to right, the four codes represent an increasing level of granularity¹. For example, a code for a mutual fund is EDYG, where: E = Equity, D = Domestic, Y = Style, G = Growth.

Figure 12 shows the mean number of positive monthly returns for all funds in each CRSP Style Code. We proceeded as for the state features by creating four dummy variables based on the 25%, 50% and 75% quantiles of the mean number of positive returns.

Again, we assume that a given fund's investment methodology does not change over time.

¹ The complete descriptions of the codes are available at http://www.crsp.com/products/documentation/crsp-style-code
Figure 12 Average number of positive returns per investment style

In total there are 14 features and 3790 different funds to train our models on.
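The quantile-based dummy construction used for both the state and style features can be sketched with pandas; the per-state averages below are hypothetical:

```python
import pandas as pd

# Hypothetical per-state average counts of positive monthly returns
avg_pos = pd.Series({"NY": 7.1, "MA": 6.8, "CA": 6.5, "TX": 6.0, "OH": 5.4})

# Bucket the states at the 25%, 50% and 75% quantiles ...
buckets = pd.qcut(avg_pos, q=4, labels=["q1", "q2", "q3", "q4"])

# ... and turn the buckets into the four dummy variables
state_dummies = pd.get_dummies(buckets)
```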

5. Calibrating the models and training results

5.1 Tuning the models’ hyperparameters

Hyperparameter tuning is a very important task, as hyperparameters are the models' parameters that cannot be learned during the training process from the training data and must therefore be set by the user. Each model has its own hyperparameters; we decided to tune the following ones:

- For logistic regression: the inverse regularization strength, which accounts for overfitting.

- For SVM: the C parameter, which governs the hyperplane's margin to the classes, and the degree of the polynomial kernel.

- For random forests: the total number of trees in the forest.

- For the artificial neural network: the number of hidden layers and the number of neurons per layer.

We will conduct hyperparameter tuning with random search for logistic regression, SVM and

random forest, and use Bayesian optimization for our ANN.

To conduct the random search optimization and to validate our models, we will use cross-validation (CV). CV is a powerful method to evaluate the algorithms' predictive power while controlling for overfitting. For instance, a 5-fold cross-validation splits the training data into 5 equally sized sets; the model makes predictions on each set after training on the 4 remaining ones.

The random seed was set to 42 whenever possible.
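A 5-fold CV score as reported in Table 1 can be obtained in one call with scikit-learn (placeholder data and model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 14)), rng.integers(0, 2, size=100)  # placeholders

# Mean accuracy over the 5 held-out folds: the "mean 5-fold CV score"
cv_scores = cross_val_score(LogisticRegression(), X, y, cv=5, scoring="accuracy")
mean_cv_score = cv_scores.mean()
```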

5.2 Bayesian Optimization

We decided to use a Bayesian optimization² approach (Snoek, Larochelle & Adams, 2012) for the ANN model, as it is less computationally expensive than a grid search and more efficient than a random search.

The idea behind Bayesian optimization is to find the parameters that maximize an unknown function by evaluating it at different points while taking the previously tried values into account through a Gaussian process. Every new evaluation point is chosen as the set of parameters with the highest Expected Improvement, defined as follows:

EI(x) = \mathbb{E}[\max(f(x) - f(x^*), 0)]

with f the function to maximize and x^* the set of parameters giving the current maximum of the function.

Within the deep learning framework, the number of epochs is the number of times the data is fed into the ANN with the weights updated using gradient descent. In our case, at each epoch, the ANN trains on 66% of the data, then makes predictions and computes the accuracy on the remaining 33% of the data. With this definition in mind, we define the function to optimize as the average accuracy on the 33% hold-out set over 60 epochs. We perform the optimization with 20 evaluations on the following search space: one to five layers of 5 to 20 neurons per layer.

² To implement this, we used the Python package available at https://github.com/fmfn/BayesianOptimization
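A sketch of this search using the package referenced in the footnote follows; the objective function body is elided, as it wraps the Keras training loop described above:

```python
from bayes_opt import BayesianOptimization

def mean_validation_accuracy(n_layers, n_neurons):
    """Hypothetical objective: build an ANN with round(n_layers) hidden layers
    of round(n_neurons) neurons each, train it for 60 epochs on 66% of the
    data, and return the average accuracy on the remaining 33%."""
    n_layers, n_neurons = int(round(n_layers)), int(round(n_neurons))
    # ... build, train and evaluate the Keras model here ...
    return 0.0  # placeholder

optimizer = BayesianOptimization(
    f=mean_validation_accuracy,
    pbounds={"n_layers": (1, 5), "n_neurons": (5, 20)},  # search space above
    random_state=42,
)
optimizer.maximize(init_points=5, n_iter=15)  # 20 evaluations in total
print(optimizer.max)  # best parameters found
```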

5.3 Random Search

This method works by trying values randomly within a given search space and performing cross-validation to find the best set of tried parameters. The search spaces are as follows, with a sketch after the list:

- For logistic regression's inverse regularization strength: from 0.01 to 10.

- For SVM's C parameter and polynomial kernel degree: from 0.01 to 10 and from 1 to 10, respectively.

- For random forests' total number of trees: from 100 to 300.
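In scikit-learn, this corresponds to RandomizedSearchCV; the sketch below shows the SVM case with the stated ranges (the number of random draws and the data are placeholders):

```python
import numpy as np
from scipy.stats import randint, uniform
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 14)), rng.integers(0, 2, size=100)  # placeholders

search = RandomizedSearchCV(
    SVC(kernel="poly", probability=True),
    param_distributions={
        "C": uniform(0.01, 9.99),   # draws uniformly from [0.01, 10]
        "degree": randint(1, 11),   # integers 1 to 10
    },
    n_iter=20,  # number of random draws (placeholder value)
    cv=5, scoring="accuracy", random_state=42,
)
search.fit(X, y)
best_svm = search.best_estimator_
```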

5.4 Results and model comparison

Table 1 shows the tuned hyperparameter values as well as the mean 5-fold cross-validation score, defined as the average accuracy over the testing sets. The accuracy is computed as follows:

acc = \frac{TP + TN}{TP + TN + FP + FN}

TP are the True Positives, the correctly classified positive instances
TN are the True Negatives, the correctly classified negative instances
FP are the False Positives, the instances incorrectly classified as positive
FN are the False Negatives, the instances incorrectly classified as negative

The cross-validation scores are very high for a finance exercise; we believe this is because the models capture very well the strong relationship between past and future returns displayed in the feature description section. Indeed, if we were to choose funds solely based on the sign of the past 3-month returns, we would already achieve an accuracy of around 64%.

Judging by the mean 5-fold CV score, the best model appears to be the random forest algorithm and the worst the ANN, which overfits a bit more than the other models.

Table 1 Hyperparameter tuning results and cross validation scores

To further compare our models, we look at their 5-fold Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC). The ROC curve shows the relationship between the false positive and true positive rates as the classification threshold changes. An n-fold ROC curve is a concept similar to cross-validation: we train the model several times, each time holding out a different part of the data, compute the predictions on the held-out part, and then compute the mean ROC curve over all the iterations.

The AUC can be interpreted as the probability of the model ranking a randomly chosen positive instance higher than a randomly chosen negative one. It gives a single metric to compare our models.
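For a single train/test split, the ROC curve and AUC can be computed as follows; the n-fold version repeats this per fold and averages the curves (placeholder data and model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_curve
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X, y = rng.normal(size=(100, 14)), rng.integers(0, 2, size=100)  # placeholders

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
probs = LogisticRegression().fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, thresholds = roc_curve(y_te, probs)  # one point per threshold
roc_auc = auc(fpr, tpr)
```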

Figure 13 5-fold ROC curves

The ROC curves are plotted in Figure 13. The models have the following AUCs, the best in terms of AUC being the ANN:

- Logistic Regression: 0.85

- SVM: 0.86

- Random Forest: 0.88

- Ensemble Classifier: 0.88

- ANN: 0.91

5.5 Relationship between the models’ probabilities and the returns

The models output probabilities through the methods described in section 3. To get an idea of the relationship between the returns and the predicted probabilities, we conduct a linear regression. The target variable is the training data's target variable, i.e. the 3-month returns from 04/2001 to 07/2001; the explanatory variables are the probabilities associated with the models' predictions.

To conduct this analysis, we split the dataset into a 33% training sample and a 66% testing sample, fit the models on the training sample and compute the probabilities on the testing sample; the results are plotted below.

Figure 14 Regression of returns to logistic regression probabilities

Figure 15 Regression of returns to SVM probabilities

Figure 16 Regression of returns to random forest probabilities

Figure 17 Regression of returns to ensemble classifier probabilities

Figure 18 Regression of returns to ANN probabilities

As we can see in Figures 14 to 18, the funds' returns seem to be well captured by the models' probabilities in a linear regression. Table 2 shows the regressions' coefficients and R-squared values; the coefficients are all highly statistically significant.

The coefficients and R-squared values are all within a similar range, and these results support our idea of using the models' probabilities as fund selection signals.
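The diagnostic regression itself is a simple OLS fit; here is a sketch with statsmodels and placeholder inputs (hypothetical probabilities and returns):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
probs = rng.uniform(size=200)                      # a model's predicted probabilities
future_returns = rng.normal(0.01, 0.05, size=200)  # placeholder 3-month returns

ols = sm.OLS(future_returns, sm.add_constant(probs)).fit()
print(ols.params, ols.rsquared, ols.pvalues)  # coefficients, R-squared, significance
```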

Table 2 Regressions' results (returns against probabilities)

6. Applying the models in a strategy

We apply the models in a long-only strategy run on a universe of 10,415 funds. The strategy runs from 01/2002 to 12/2017, which corresponds to 64 quarters. The portfolio is rebalanced quarterly; at each rebalancing date, we filter the relevant funds (those with at least 1 year of data), recompute the NAV-based features using up to 1 year of data, and add the non-time-dependent features. Then, using the models trained as described previously, we make probabilistic predictions on the universe of funds.

The investment signal used is the probability output by the models that the fund yields a positive return in 3 months' time. The probabilities are ranked from highest to lowest, and the portfolio for the next quarter is composed of a chosen top quantile of the funds. We define a weighting scheme so that funds with higher probabilities have a higher weight within the portfolio, computed with the following formula:

w_i = \frac{p_i}{\sum_k p_k}

where w_i is the weight of fund i, p_i the probability given to fund i, and \sum_k p_k the sum of the selected funds' probabilities.

We also create a naïve momentum strategy in which we invest in the best performers over the past 3 months, weighted according to the magnitude of the last 3-month returns, similarly to the probability weighting:

w_i = \frac{r_i}{\sum_k r_k}

where w_i is the weight of fund i, r_i the previous quarter's return of fund i, and \sum_k r_k the sum of the selected funds' previous-quarter returns.
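Both weighting schemes reduce to normalizing the signal of the selected funds so that the weights sum to one; here is a sketch with hypothetical fund signals:

```python
import pandas as pd

def signal_weights(signal: pd.Series, quantile: float) -> pd.Series:
    """Keep the top `quantile` of funds by signal and normalize to weights."""
    selected = signal[signal >= signal.quantile(1 - quantile)]
    return selected / selected.sum()

# Works for both signals: predicted probabilities, or past 3-month returns
probs = pd.Series({"f1": 0.9, "f2": 0.7, "f3": 0.6, "f4": 0.4, "f5": 0.2})
weights = signal_weights(probs, quantile=0.4)  # top 40% by probability
```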

We analyze the excess returns of each strategy, computed against an equally weighted market portfolio made of all the mutual funds available in our dataset, whose value over the back-testing period is displayed in Figure 19.

Figure 19 Equal weight portfolio of all the funds over time

6.1 Back testing results

Figures 20 to 24 below show the quarterly excess returns of the strategy for different quantiles of the universe of funds, ranging from the top 10% to the top 50% of probabilities.

The best performing models are logistic regression and the ANN (labeled MLPClassifier); however, the models yield similar return patterns, with SVM being the worst performer. We can clearly see that the machine-learning-based strategies' excess returns are correlated with the Naïve momentum strategy's performance. This might suggest that the models mostly captured momentum effects, which we investigate in more depth in section 6.3.

As we select more and more funds with lower prediction confidence, the figures below show that the strategies' returns diminish and look more and more similar, because the strategies are more likely to pick the same funds. The Naïve strategy, on the other hand, stops changing much at some point because of its weighting scheme based on past returns.

Figure 20 Excess Returns for a quantile of 10%

Figure 21 Excess Returns for a quantile of 20%

Figure 22 Excess Returns for a quantile of 30%

Figure 23 Excess Returns for a quantile of 40%

Figure 24 Excess Returns for a quantile of 50%

Tables 3 to 7 below show the results of testing the significance of the strategies' mean excess returns as less confident predictions are added to the portfolio. Again, the more funds we choose, the lower the expected excess returns and the less significantly different from zero they are. This is in line with the regression results shown in section 5.5: the models' probabilities can be a good investment signal, as the excess returns of strategies built on more highly confident predictions are more significant.

We can also observe that the Naïve momentum strategy yields much higher and more significant mean excess returns than the strategies we implemented.

More figures displaying absolute returns and P&L are available in the appendix.

Table 3 Testing mean excess returns’ significance for q = 10%

Table 4 Testing mean excess returns’ significance for q = 20%

Table 5 Testing mean excess returns’ significance for q = 30%

Table 6 Testing mean excess returns’ significance for q = 40%

Table 7 Testing mean excess returns’ significance for q = 50%

Table 8 displays the information ratio (IR) of each strategy with respect to the market portfolio. This ratio captures the expected excess return gained per unit of risk, computed as follows:

IR = \frac{\mathbb{E}[r - r_m]}{\sqrt{\mathrm{Var}[r - r_m]}}

where r is the strategy return and r_m the market portfolio return.
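A sketch of this computation on aligned quarterly return series:

```python
import pandas as pd

def information_ratio(strategy_returns: pd.Series, market_returns: pd.Series) -> float:
    """IR = E[r - r_m] / sqrt(Var[r - r_m]), on aligned quarterly series."""
    excess = strategy_returns - market_returns
    return excess.mean() / excess.std()
```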

It appears that the ANN's IR drops more significantly than the other methods' when we choose more funds. Logistic regression, on the other hand, consistently outperforms the other algorithms, and again SVM performs the worst; but none of our methods beats the Naïve momentum strategy, which consistently has an IR above 2.

Table 8 Information ratios

6.2 Predictions’ accuracy

To investigate the predictive power of the models, we assess their prediction accuracy. The quarterly prediction accuracies over time are displayed in Figure 25. They vary significantly over time, with the following standard deviations:

- Logistic Regression: 7.72%

- SVM: 7.62%

- Random Forest: 8.09%

- Ensemble Classifier: 7.87%

- ANN: 8.39%

- Naïve: 10.38%

The machine-learning-based methods seem more reliable than the Naïve selection method, which has the highest standard deviation of accuracy.

Figure 25 Predictions' accuracy

The average accuracies are all similar and lower than the results we obtained on the training

data:

- Logistic Regression: 64.54 %

- SVM: 64.75%

- Random Forest: 64.68%

- Ensemble Classifier: 64.92%

- ANN: 65.30%

- Naïve: 65.76%

The nature of the training data is likely the reason for this difference between the average accuracy over the back-testing period and the training accuracies, as the training data had a slight imbalance between positive and negative instances.

Table 9 displays the mean accuracies when considering only the funds in our portfolio. They are much higher than the overall accuracies; however, the Naïve method seems less reliable than our machine learning models. The outperformance of the Naïve method observed previously thus suggests that, although it is less accurate at predicting whether a fund will be profitable in the next period, the funds it selects yield higher returns.

Table 9 Mean accuracy of only the selected funds for different quantiles

6.3 Looking at the momentum effect

The results observed in the previous parts suggest that the machine-learning-based strategies capture much the same excess fund performance as a Naïve momentum selection strategy. In order to examine more closely how much of the momentum effect our algorithms have captured, we retrained our models and back-tested them after removing all the features related to momentum effects. We therefore removed the 3-, 6- and 12-month returns as well as the consistency-of-returns feature, as those features were created in an effort to measure past performance, that is, momentum. Thus, the feature space for this test consists of the volatility, the number of days of existence, and the geographical and investment style features, a total of 10 variables. Volatility remains in the feature space as it is a proxy for the past riskiness and dispersion of fund returns. Nevertheless, Stivers and Sun (2010) find for stock returns that dispersion is negatively related to subsequent momentum premiums, which means for our analysis that, if their result applied to funds too, some momentum information might still be captured by volatility.

Figure 26 shows the overall accuracy of our models; compared to Figure 25, the accuracies are clearly no better than random draws, with average accuracies ranging from 46% to 53%. However, looking closely at the accuracy patterns of the ANN and the logistic regression, they appear to be more correlated with each other than with the other models, possibly because they use the same sigmoid function to infer probabilities.

Figure 26 Overall accuracy for models trained without momentum component

This visual conclusion is consolidated by Table 10 below, which summarizes the results of testing for accuracy higher than 50%.

Table 10 Testing results for higher than 50% accuracy

Figure 27 Excess returns for models trained without momentum component

In line with the previous finding, the excess returns are mostly negative throughout the back-testing period, as Figure 27 above shows.

7. Conclusion

In this paper, we applied logistic regression, random forests, support vector machines, an ensemble classifier and an artificial neural network to a fund selection problem. We defined a fund selection signal based on the probabilities given by the models, which represent the models' confidence in classifying the next period's returns as positive. The explanatory variables we defined include both past-return-based features (volatility, consistency of returns and past returns) and alternative features meant to extract information from geographical and investment style data as well as to capture the funds' length of existence.

We applied the models when back-testing a strategy building a fund-of-funds portfolio on a universe of 10,415 funds from the Survivor-Bias-Free US Mutual Fund database of Wharton Research Data Services. The models were trained to predict the 3-month forward returns from 04/2001 to 07/2001 using features computed prior to 04/2001; the accuracy on this training sample was very high. They were then used over a 16-year period, from 01/2002 to 12/2017, in a quarterly rebalanced strategy.

The probabilistic signal proved to be relevant for selecting funds. However, when testing the models without the momentum-related features, their accuracy could not be statistically distinguished from random guessing. Thus, it can be inferred that the original models we developed and trained were only able to capture the momentum information within the explanatory variables, and that no information came from the non-return-based features. The machine learning algorithms do not statistically outperform a naïve momentum fund selection strategy, but they proved better at correctly selecting funds that yield positive returns over the next time period, irrespective of the magnitude of the returns. The best performing algorithms were logistic regression and the artificial neural network, possibly because they share the same method to infer prediction confidence. On the other hand, the support vector machine performed the worst; this might be because the algorithm is not originally designed to provide a probabilistic output, making it less reliable.

The study could be extended and pushed further by improving the feature engineering and selection. With fewer momentum-capturing features, the accuracy of the models might improve, and classical financial measures such as the fund's alpha or various financial ratios might add information. Factoring the funds' fees and transaction costs into the calculation of the performance or into the models' features would also be an interesting addition to the study.

Although the alternative features we chose were not able to provide additional information, many studies in other areas suggest that information can be contained in non-financial data. For instance, one could create a feature describing the fund manager's experience and qualifications (Chevalier & Ellison, 1999) and make it a time-dependent variable, as a fund's manager can be subject to change.

More state-of-the-art machine learning techniques, such as generative models or reinforcement learning models, should also be looked at, as they stray from classical finance models and have found great success in other fields of application.

References

Brown, S. J., & Goetzmann, W. N. (1995). Performance persistence. The Journal of Finance, 50(2), 679-698.

Carhart, M. M. (1997). On persistence in mutual fund performance. The Journal of Finance, 52(1), 57-82.

Chevalier, J., & Ellison, G. (1999). Are some mutual fund managers better than others? Cross-sectional patterns in behavior and performance. The Journal of Finance, 54(3), 875-899.

Deo, R. C. (2015). Machine learning in medicine. Circulation, 132(20), 1920-1930.

Fortin, R., & Michelson, S. (2002). Indexing versus active mutual fund management. Journal of Financial Planning, 15(9), 82-95.

Gudelek, M. U., Boluk, S. A., & Ozbayoglu, A. M. (2017, November). A deep learning based stock trading model with 2-D CNN trend detection. In 2017 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1-8). IEEE.

Ho, K. Y., & Wang, W. W. (2016). Predicting stock price movements with news sentiment: An artificial neural network approach. In Artificial Neural Network Modelling (pp. 395-403). Springer, Cham.

Indro, D. C., Jiang, C. X., Patuwo, B. E., & Zhang, G. P. (1999). Predicting mutual fund performance using artificial neural networks. Omega, 27(3), 373-380.

Khandani, A. E., Kim, A. J., & Lo, A. W. (2010). Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34(11), 2767-2787.

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).

Lakonishok, J., Shleifer, A., Vishny, R. W., Hart, O., & Perry, G. L. (1992). The structure and performance of the money management industry. Brookings Papers on Economic Activity. Microeconomics, 1992, 339-391.

Ludwig, R. S., & Piovoso, M. J. (2005). A comparison of machine-learning classifiers for selecting money managers. Intelligent Systems in Accounting, Finance & Management, 13(3), 151-164.

Platt, J. (1999). Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3), 61-74.

Snoek, J., Larochelle, H., & Adams, R. P. (2012). Practical Bayesian optimization of machine learning algorithms. In Advances in Neural Information Processing Systems (pp. 2951-2959).

Stivers, C., & Sun, L. (2010). Cross-sectional return dispersion and time variation in value and momentum premiums. Journal of Financial and Quantitative Analysis, 45(4), 987-1014.

Tarsauliya, A., Kant, S., Kala, R., Tiwari, R., & Shukla, A. (2010). Analysis of artificial neural network for financial time series forecasting. International Journal of Computer Applications, 9(5), 16-22.

Titman, S., & Grinblatt, M. (1989). Mutual fund performance: An analysis of quarterly portfolio holdings. Journal of Business, 62(3).
