The Journal of Prediction Markets 2021 Vol 15 No 1 pp 3-9

PREDICTING INTRADAY CRYPTOCURRENCY RETURNS – A SPARSE SIGNALS APPROACH

Vaibhav Lalwani [1]
Indian Institute of Management Raipur, Raipur-493661, India
E-mail: vaibhavlalwani@outlook.com

Vedprakash Vasantrao Meshram
Goa Institute of Management, Goa, India

ABSTRACT

We test for the existence of sparse and short-lived signals in minute-by-minute
cryptocurrency returns. Using a large set of linear as well as non-linear
predictors and a machine learning technique called the LASSO, we generate
1-minute-ahead out-of-sample return forecasts for ten major cryptocurrencies.
The forecasts obtained from the LASSO are statistically superior to those
generated by the benchmark models. The LASSO-based estimation selects
predictors that are sparse and quite short-lived.

JEL Codes: G12, G15, G17

Keywords: sparse signals; cryptocurrencies; LASSO; machine learning; intraday
returns; financial markets

INTRODUCTION

Long-lived predictors such as dividend yields, volatility and historical prices
form the bedrock of the return predictability literature. By long-lived, we mean
that these predictors have been shown to predict returns over long horizons.
However, given the rapid and ever-changing nature of information arrival and
absorption in financial markets, the existence of short-lived predictors
cannot be ruled out. Chinco et al. (2019) report the presence of short-lived
predictors for 1-minute stock returns using machine learning methods.
They show that while these predictors are unexpected and hard to intuit, they
are nonetheless economically meaningful and significant.
In this study, our objective is to test for the existence of sparse and
short-lived signals in cryptocurrency returns. The rapid growth of these assets
has sparked considerable interest among academic researchers in the nature of
their price fluctuations. We add to the evidence by testing whether the Least
Absolute Shrinkage and Selection Operator (LASSO) of Tibshirani (1996) can be
used to extract sparse signals in the cross-section of cryptocurrency returns.

[1] Corresponding author.

Cryptocurrency markets are an interesting setting in which to study short-lived
predictors of asset returns because cryptocurrencies are volatile and highly
unpredictable [2]. Unlike fiat currencies, there is no monetary authority to
stabilise the purchasing power of cryptocurrencies. As Chinco et al. (2019) put
it, short-lived sparse predictors in stock returns are largely statistical and hard
to intuit based on economic rationale. The same could be said for cryptocurrency
returns. With very little guidance from economic theory on the valuation of
cryptocurrencies, purely statistical factors could, at the very least, uncover
some patterns of predictability in these large and complex markets.
The nature of short-lived signals makes traditional estimation methods such
as Ordinary Least Squares (OLS) unfit for the task at hand. One reason is
that while the set of candidate predictors is large, only a few are relevant for
prediction at any given point in time. This calls for an estimation method that
selects a few predictors out of many. Further, because the expected life of a
signal is short, we must extract this small number of relevant predictors from
a large set using only a handful of observations to train the model. In other
words, the number of predictors is larger than the number of observations, in
which case the OLS coefficients are not uniquely determined. Chinco et al.
(2019) argue that the LASSO is well suited to extracting sparse, short-lived and
unexpected predictors from stock returns, and subsequently show that the LASSO
improves the accuracy of 1-minute-ahead out-of-sample return forecasts for
stocks listed in the US market. Similarly, we use the LASSO to generate
1-minute-ahead forecasts for 10 major cryptocurrencies using a wide range of
candidate predictors that also include complex interactions among the returns
of other currencies.
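
To make the identification problem concrete, the following minimal sketch uses a toy design with two observations and three predictors. All numbers and names (X, y, b_a, b_b) are hypothetical and are not taken from the paper's data. Two different coefficient vectors fit the observations exactly, so least squares alone cannot choose between them, whereas an L1 penalty of the kind used by the LASSO favours the vector with the smaller sum of absolute coefficients.

```python
# Toy illustration (hypothetical numbers): 2 observations, 3 predictors.
X = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0],
]
y = [1.0, 2.0]


def fitted(X, b):
    """Return the fitted values X @ b using plain Python lists."""
    return [sum(x_ij * b_j for x_ij, b_j in zip(row, b)) for row in X]


def l1_norm(b):
    """Sum of absolute coefficients, the quantity the LASSO penalises."""
    return sum(abs(b_j) for b_j in b)


# Two different coefficient vectors that both reproduce y exactly,
# so the least squares criterion is minimised (at zero) by both.
b_a = [1.0, 2.0, 0.0]
b_b = [0.0, 1.0, 1.0]

print(fitted(X, b_a), l1_norm(b_a))
print(fitted(X, b_b), l1_norm(b_b))
```

Both vectors give a residual sum of squares of zero, but b_b has the smaller L1 norm, which is the tie-breaking information that a pure least squares fit lacks.
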
Our major findings show that unexpected, short-lived and sparse signals
extracted from a statistical selection rule can be used to predict future
cryptocurrency returns. We add to the growing literature on return predictability
in cryptocurrencies (Gregoriou (2019)) and to studies that try to understand the
relation between cryptocurrencies and other assets (Ciaian et al. (2016)).
The next section discusses the data and methods used in the study and is
followed by the results and the conclusion of our analysis.

[2] On Nov 12, 2017, Bitcoin was trading at $5,950.07. Roughly a month later,
on Dec 17, 2017, Bitcoin reached an all-time high of $19,783.06.

DATA AND METHODS

We obtain minute-by-minute closing prices of ten major cryptocurrencies
(Bitcoin, Ripple, Litecoin, Bitcoin Cash, Dash, EOS, Ethereum, Ethereum
Classic, Monero and Zcash) as well as nine fiat currency pairs (Australian
Dollar, Canadian Dollar, Chinese Yuan, Euro, Japanese Yen, New Zealand
Dollar, Pound, Swiss Franc and Swedish Krona). All currency prices are
denominated in US Dollars and all the data are obtained from the Bloomberg
database. Our sample begins on 24-Sep-2018 and ends on 02-Dec-2019, the end
date being the date of data collection. While the sample period may appear
short, the analysis is based on high-frequency data, for which a much longer
sample would not be computationally feasible. In its current state, our study
involves the estimation of more than 10 million regressions to generate
forecasts.

ESTIMATION METHODOLOGY

Our general methodology involves the estimation of a predictive model:

$$ R_i = X_{i,j}\,\beta + \varepsilon_i $$

where $R_i$ is the return on cryptocurrency $i$, $X_{i,j}$ is the list of
standardised predictors and $\beta = \{\beta_0, \beta_1, \ldots, \beta_j\}$. The
LASSO estimator is the solution to the following penalised least squares
optimization problem for some value of $\lambda > 0$:

$$ \hat{\beta}_{\text{lasso}} = \arg\min_{b} \left\{ \lVert R - Xb \rVert_2^2 + \lambda \lVert b \rVert_1 \right\} $$

The inclusion of the $\lambda$ penalty is what gives the LASSO its sparsity
property. The optimizer shrinks every coefficient towards zero and sets to
exactly zero those whose magnitude would otherwise fall below a threshold
governed by $\lambda$, and thus we are left with a handful of sparse predictors
out of many candidate predictors.
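
As a rough illustration of how the L1 penalty produces exact zeros, the sketch below minimises the objective ||R - Xb||_2^2 + lambda * ||b||_1 by cyclic coordinate descent with a soft-thresholding update. This is one standard way to solve the problem and is an assumption of this example; the paper does not state which solver or software was used. The function names (soft_threshold, lasso_coordinate_descent), the fixed number of sweeps and the toy inputs are hypothetical, and no intercept is fitted, so the inputs are assumed to be centred beforehand.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, r, lam, n_sweeps=200):
    """Minimise ||r - X b||_2^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows and r a list of returns. No intercept is fitted,
    so the inputs are assumed to be centred/standardised beforehand.
    """
    n_obs, n_pred = len(X), len(X[0])
    b = [0.0] * n_pred
    residual = list(r)  # equals r - X b while b is all zeros
    col_ss = [sum(X[i][j] ** 2 for i in range(n_obs)) for j in range(n_pred)]
    for _ in range(n_sweeps):
        for j in range(n_pred):
            if col_ss[j] == 0.0:
                continue  # an all-zero column can never be selected
            # Inner product of column j with the residual that excludes j.
            rho = sum(X[i][j] * residual[i] for i in range(n_obs)) + col_ss[j] * b[j]
            # With a squared-error term that is not halved, the band is lam / 2.
            new_bj = soft_threshold(rho, lam / 2.0) / col_ss[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n_obs):
                    residual[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


# Hypothetical toy problem with orthonormal predictors.
print(lasso_coordinate_descent([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.2], lam=1.0))
```

In this toy case the unpenalised coefficients would be 3.0 and 0.2. The first is shrunk by lambda / 2 and the second, whose magnitude lies below lambda / 2, is set exactly to zero, which is the sparsity property described above.
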
The choice of $\lambda$ is not a trivial one. Following standard practice as
discussed in Hastie et al. (2001), we use 10-fold cross-validation to estimate
the in-sample MSE at different values of $\lambda$. Following this procedure,
there are two common ways for a researcher to select $\lambda$: either choose
the value that minimises the cross-validated MSE ($\lambda_{\min}$), or choose
the largest value whose cross-validated MSE is within one standard error of
that minimum ($\lambda_{1se}$), which is a heavier penalty than $\lambda_{\min}$.
We pick $\lambda_{1se}$ for our main forecasting exercise as $\lambda_{\min}$
has been shown to overemphasise in-sample fit at the cost of poor out-of-sample
forecasting performance.
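
The two selection rules can be sketched as follows, assuming the cross-validation step has already produced, for each candidate penalty, the mean MSE across folds and its standard error. The one-standard-error rule is implemented here in its usual form (the largest penalty whose mean error is within one standard error of the minimum); the function name select_lambdas and all numbers are hypothetical and serve only as an illustration.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from cross-validation summaries.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated MSE for each candidate
    cv_se   : standard error of that mean across the folds
    """
    best = min(range(len(lambdas)), key=lambda k: cv_mean[k])
    lambda_min = lambdas[best]
    cutoff = cv_mean[best] + cv_se[best]
    # Largest penalty whose CV error is still within one SE of the minimum.
    lambda_1se = max(lam for lam, m in zip(lambdas, cv_mean) if m <= cutoff)
    return lambda_min, lambda_1se


# Hypothetical CV summaries, for illustration only.
lambdas = [0.01, 0.05, 0.10, 0.50, 1.00]
cv_mean = [1.30, 1.10, 1.00, 1.05, 1.40]
cv_se = [0.08, 0.07, 0.06, 0.06, 0.09]
print(select_lambdas(lambdas, cv_mean, cv_se))
```

Because the second rule always returns a penalty at least as large as the first, it tends to select fewer predictors, which is consistent with the preference for out-of-sample performance over in-sample fit stated above.
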
We use a 30-minute rolling estimation window to train a predictive model and
then generate one-minute-ahead out-of-sample forecasts using the parameters
obtained from the trained model. To forecast the 1-minute-ahead return for a
given cryptocurrency at time t+1, we use 5 lags of returns of 19 predictors
(ten cryptocurrencies and nine fiat currencies) along with their squares and
cross interactions with each other. We also use cumulative returns of all 19
predictors over the last 2 minutes up to the last 5 minutes to capture momentum
in high-frequency returns. We add to the analysis in Chinco et al. (2019) by
using higher-order and momentum variables as predictors. Given the
non-linearities in asset prices reported in studies such as Reboredo et al.
(2012), we consider it necessary to include non-linear terms in generating
predictions for cryptocurrency returns. Further, Moritz and Zimmermann (2016,
unpublished [3]) show that there is a non-linear relationship between past
returns and current returns that can be captured by interactions among past
returns, thus justifying the need for higher-order interactions as predictors
in the forecasting model. In total, we have 1121 predictors in our forecasting
model [4]. Missing values of predictors are set to zero. As is customary in
estimation using the LASSO, all variables are standardised to have zero mean
and unit variance. We remove from our analysis any 30-minute interval in which
fewer than 15 observations are available to train the predictive model.
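
The predictor count can be checked by enumerating the candidate set. The sketch below follows the breakdown given in the paper's footnote on the predictors; the placeholder asset names and the naming scheme are hypothetical, and pairing two different currencies at the same lag is this example's reading of the "19C2 * 5" term rather than something the paper spells out.

```python
from itertools import combinations

N_ASSETS = 19           # 10 cryptocurrencies + 9 fiat currency pairs
LAGS = range(1, 6)      # lags 1..5
HORIZONS = range(2, 6)  # cumulative returns over the last 2..5 minutes

assets = [f"asset{k:02d}" for k in range(N_ASSETS)]  # placeholder names

lagged = [f"{a}_lag{l}" for a in assets for l in LAGS]
squared = [f"{a}_lag{l}_sq" for a in assets for l in LAGS]
# Assumption: products are taken between two different assets at the same lag.
interactions = [
    f"{a}_x_{b}_lag{l}" for a, b in combinations(assets, 2) for l in LAGS
]
cumulative = [f"{a}_cum{h}" for a in assets for h in HORIZONS]

predictors = lagged + squared + interactions + cumulative
print(len(lagged), len(squared), len(interactions), len(cumulative), len(predictors))
```

The four groups contain 95, 95, 855 and 76 names respectively, giving 1121 in total. With a 30-minute estimation window, the number of candidate predictors therefore exceeds the number of observations by a factor of more than 37.
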

[3] B. Moritz and T. Zimmermann, ‘Tree-Based Conditional Portfolio Sorts: The
Relation between Past and Future Stock Returns’ (March 1, 2016). Available at
http://dx.doi.org/10.2139/ssrn.2740751
[4] These predictors are: 5 lags of returns on 19 currencies (95) + squares of
the 5 lags of returns on 19 currencies (95) + pairwise interactions (products)
of returns on the 19 currencies for all 5 lags (19C2 * 5 = 855) + cumulative
returns on all 19 currencies over the last 2 to 5 minutes (76). Thus, we have a
total set of 1121 predictors.

RESULTS

In Table 1, we report the accuracy of return forecasts (MSEs) for the ten
cryptocurrencies using the LASSO as well as five benchmark models, namely
auto-regressive (AR) models with one to five lags.
The results show that forecasts generated using the LASSO are superior to
those from benchmark models for nine out of the ten cryptocurrencies. The
LASSO forecasts have the lowest MSE values for all currencies except Ripple,
where they are only marginally worse than the AR(1) benchmark forecasts.

Table 1. Forecast Accuracy of LASSO and Benchmark Models

Currency              LASSO      AR1      AR2      AR3      AR4      AR5   DM Test (p-value)
Bitcoin                1.02     1.21     1.46     1.72     2.11     2.50   0.00
Ripple                 2.76     2.63     3.00     3.46     4.11    91.08   1.00
Litecoin               6.11     6.73     8.30    10.34    13.75    18.68   0.00
Bitcoin Cash          18.98    21.77    26.82    36.20    46.18    68.56   0.18
Dash                   4.88   159.00   426.74  1966.08  5621.66  8154.31   0.14
EOS                    1.97     2.81     3.77     5.69     9.04    41.67   0.00
Ethereum               2.24     2.34     2.74     3.33     4.14     4.99   0.01
Ethereum Classic       5.71     5.99     7.26     9.00    12.26   140.53   0.01
Monero                 6.30   386.32   425.42   555.43   770.40  2256.25   0.16
Zcash                  1.67     2.28     3.33     4.69     8.51  9013.79   0.01

Note: This table reports the mean squared error (MSE) of the one-minute
cryptocurrency return forecasts from the LASSO and the auto-regressive (AR)
benchmark models. The last column shows the p-value of the modified
Diebold-Mariano test proposed by Harvey et al. (1997). The null hypothesis of
the test is that the forecast accuracy of the AR(1) benchmark model is greater
than or equal to that of the LASSO model. MSE values are multiplied by 100 to
facilitate readability.

Among the models studied here, the LASSO and the AR(1) benchmark split the
honours of best forecasting model between them. Among the auto-regressive
benchmarks, the higher-order models have much lower forecast accuracy than the
AR(1). Therefore, we conduct the rest of the analysis using only the AR(1) as a
benchmark. We also test whether the improvement in forecast accuracy from using
the LASSO is statistically significant vis-a-vis the AR(1) benchmark, using the
modified version of the Diebold-Mariano test proposed by Harvey et al. (1997).
The p-values of the test show that the null of the benchmark (AR(1)) accuracy
being greater than or equal to that of the LASSO forecasts is rejected at the
10% level for six out of the ten cryptocurrencies. The improvement in
out-of-sample forecast accuracy from using the LASSO over the benchmark is thus
statistically significant for most, though not all, currencies. Statistically,
we observe the strongest results for Bitcoin, Litecoin and EOS. Because Bitcoin
is one of the most liquid and broadly traded currencies in the crypto universe,
its prices can be expected to react more swiftly to information, thus giving
rise to short-lived predictors. The absence of short-lived predictors in other
currencies could be due to multiple reasons. First, due to weak liquidity on an
intraday basis, observed prices may not be reliable enough to use in the
forecasting exercise. Second, the number of sparse signals could be either zero
or greater than the length of the estimation window (30). In either of these
cases, the LASSO will fail to detect sparse signals. (See Chinco et al. (2019)
for a discussion on this.)
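
For readers who wish to reproduce the significance test, the following is a minimal sketch of a one-sided modified Diebold-Mariano statistic for one-step-ahead forecasts under squared-error loss. It rests on several assumptions that are not taken from the paper: the function name modified_dm_one_step and the toy error series are hypothetical, the forecast horizon is fixed at h = 1 (so no autocovariance terms beyond lag zero enter the variance), and the p-value uses a normal tail as a large-sample stand-in for the Student-t distribution with n - 1 degrees of freedom that Harvey et al. (1997) recommend.

```python
from math import sqrt
from statistics import NormalDist, fmean


def modified_dm_one_step(err_benchmark, err_lasso):
    """One-sided modified Diebold-Mariano test for 1-step-ahead forecasts.

    H0: the benchmark is at least as accurate as the LASSO (squared loss).
    Returns (statistic, approximate p-value); large positive statistics
    are evidence against H0. Requires a non-constant loss differential.
    """
    d = [eb ** 2 - el ** 2 for eb, el in zip(err_benchmark, err_lasso)]
    n = len(d)
    d_bar = fmean(d)
    gamma0 = sum((x - d_bar) ** 2 for x in d) / n  # lag-0 autocovariance
    dm = d_bar / sqrt(gamma0 / n)
    # Harvey et al. (1997) small-sample factor, evaluated at horizon h = 1.
    h = 1
    correction = sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    stat = correction * dm
    # Assumption: normal tail in place of Student-t(n - 1); fine for large n.
    p_value = 1.0 - NormalDist().cdf(stat)
    return stat, p_value


# Hypothetical forecast errors, for illustration only.
errors_ar1 = [0.3, -0.2, 0.4, -0.5, 0.1, 0.6]
errors_lasso = [0.1, -0.1, 0.2, -0.3, 0.1, 0.2]
print(modified_dm_one_step(errors_ar1, errors_lasso))
```

With minute-level samples of the size used here, the difference between the normal and Student-t tails is negligible, but for short samples the t distribution should be used instead.
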
Further, we analyse the nature of the sparse predictors selected by the LASSO
by virtue of its feature selection property. For each forecast, we count the
number of predictors that the LASSO selects, that is, the number of predictors
with non-zero coefficients in the estimated model. The results in Table 2
suggest that in about 91% to 96% of the cases, the LASSO does not select even a
single predictor from the entire set of candidates (thus estimating only an
intercept). The frequency of predictor selection is highest for Bitcoin Cash,
for which the LASSO selects at least one predictor in about 9% of the cases.

Table 2. Analysis of Signals Generated by the LASSO Estimation

Currency             Average Signal   Average Signal (Signal > 0)   Signal Frequency   Average Life of Signal
Bitcoin                        0.31                          5.81              5.26%                     1.29
Ripple                         0.32                          5.52              5.84%                     1.31
Litecoin                       0.31                          5.28              5.88%                     1.33
Bitcoin Cash                   0.37                          4.13              8.94%                     1.54
Dash                           0.23                          5.62              4.18%                     1.32
EOS                            0.27                          5.70              4.73%                     1.29
Ethereum                       0.32                          5.48              5.89%                     1.33
Ethereum Classic               0.29                          5.21              5.54%                     1.34
Monero                         0.29                          5.29              5.49%                     1.35
Zcash                          0.26                          5.68              4.52%                     1.31

Notes: This table reports the characteristics of predictors selected by the
LASSO. Average Signal shows the average number of predictors obtained from
the LASSO model. Average Signal (Signal > 0) shows the average number of
predictors only for the cases where the LASSO selects at least one predictor.
Signal Frequency is the proportion of forecasts where the LASSO selects at
least one predictor. Average Life of Signal is the average length of a run of
a predictor with a non-zero coefficient.
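
The first three columns of Table 2 follow directly from the number of non-zero coefficients in each forecast model. A minimal sketch, assuming those counts are available as a list with one entry per forecast; the function name signal_summary and the toy counts are hypothetical.

```python
def signal_summary(n_selected):
    """Summarise per-forecast counts of non-zero LASSO coefficients."""
    total = len(n_selected)
    active = [k for k in n_selected if k > 0]
    return {
        # Mean number of selected predictors over all forecasts.
        "average_signal": sum(n_selected) / total,
        # Mean number of selected predictors when at least one is selected.
        "average_signal_given_active": sum(active) / len(active) if active else 0.0,
        # Share of forecasts in which at least one predictor is selected.
        "signal_frequency": len(active) / total,
    }


# Hypothetical counts for ten consecutive forecasts.
counts = [0, 0, 0, 5, 0, 0, 0, 0, 3, 0]
print(signal_summary(counts))
```

By construction, the first quantity equals the product of the other two. The same identity holds, up to rounding, in Table 2: for Bitcoin, 5.81 multiplied by 5.26% is approximately 0.31.
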

We also estimate the average life of a predictor. The life of a predictor is
the number of consecutive forecasts for which it has a non-zero coefficient. So,
if the LASSO selects a predictor for the forecast models for minutes t, t+1 and
t+2, but not for t+3, the life of the predictor is 3 minutes. This value can
also be understood as the average length of a run of a predictor with a non-zero
coefficient. Our results show that the average life of a predictor for any
currency is less than 2 minutes. This is much lower than the finding in Chinco
et al. (2019), who report a life of about 13 minutes for a predictor. This
difference is likely due to the nature of the asset markets considered in the
two studies. Chinco et al. (2019) focus on stock returns in a single market.
Because our study involves cryptocurrencies, which are traded across the
world 24 hours a day, we can only take predictors from other currency
markets that share the same trading characteristics as these cryptocurrencies.
This limits our choice of predictors and thus affects the comparability of
our results with previous studies.
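
The run-length definition of a predictor's life can be sketched as follows, assuming the set of selected predictor names is recorded for each forecast minute. The function name average_signal_life, the predictor labels and the treatment of runs that are still open at the end of the sample (they are counted at their current length) are assumptions of this example.

```python
def average_signal_life(selected_by_minute):
    """Average length of runs of consecutive minutes with a predictor selected.

    selected_by_minute : list of sets, one per forecast minute, holding the
    names of the predictors with non-zero coefficients in that minute.
    """
    run_lengths = []
    open_runs = {}  # predictor name -> length of its current run
    for selected in selected_by_minute:
        # Close the runs of predictors that dropped out this minute.
        for name in list(open_runs):
            if name not in selected:
                run_lengths.append(open_runs.pop(name))
        # Extend or start runs for predictors selected this minute.
        for name in selected:
            open_runs[name] = open_runs.get(name, 0) + 1
    run_lengths.extend(open_runs.values())  # runs still open at the end
    return sum(run_lengths) / len(run_lengths) if run_lengths else 0.0


# Hypothetical selections: "a" is chosen at t, t+1 and t+2 but not at t+3.
history = [{"a"}, {"a"}, {"a"}, set(), {"b"}, set()]
print(average_signal_life(history))
```

In this toy history, predictor "a" has a life of 3 minutes, matching the example in the text, and predictor "b" has a life of 1 minute, so the average life is 2 minutes.
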
In additional results not reported here (available from the authors upon
request), we repeat our estimation using the elastic net estimator instead of
the LASSO. The results on out-of-sample forecast accuracy are similar under the
elastic net penalty.

CONCLUSION

We apply the LASSO algorithm to generate 1-minute-ahead rolling forecasts
for returns of ten cryptocurrencies using returns on other cryptocurrencies as
well as fiat currencies as predictors. We find that out-of-sample forecasts
generated using the LASSO have lower errors than the forecasts from benchmark
models for nine of the ten currencies. The sparse set of predictors obtained
using the LASSO is generally short-lived, with a mean life of under 2 minutes.
Our findings confirm that unexpected, short-lived and sparse signals extracted
from a statistical selection rule can be used to predict future cryptocurrency
returns.

REFERENCES

A. Chinco, A. D. Clark-Joseph and M. Ye. ‘Sparse signals in the cross-section
of returns’ The Journal of Finance (2019), 74(1):449–492.
https://doi.org/10.1111/jofi.12733
P. Ciaian, M. Rajcaniova and d’A. Kancs. ‘The economics of BitCoin price
formation’ Applied Economics (2016), 48(19):1799–1815.
https://doi.org/10.1080/00036846.2015.1109038
A. Gregoriou. ‘Cryptocurrencies and asset pricing’ Applied Economics Letters
(2019), 26(12):995–998. https://doi.org/10.1080/13504851.2018.1527439
D. Harvey, S. Leybourne and P. Newbold. ‘Testing the equality of prediction
mean squared errors’ International Journal of Forecasting (1997),
13(2):281–291. https://doi.org/10.1016/S0169-2070(96)00719-4
T. Hastie, R. Tibshirani and J. Friedman. The elements of statistical learning
(New York, Springer series in statistics, 1st volume, 2001).
https://doi.org/10.1007/978-0-387-84858-7
J. C. Reboredo, J. M. Matias and R. Garcia-Rubio. ‘Nonlinearity in forecasting
of high-frequency stock returns’ Computational Economics (2012),
40(3):245–264. https://doi.org/10.1007/s10614-011-9288-5

Copyright of Journal of Prediction Markets is the property of University of Buckingham Press
and its content may not be copied or emailed to multiple sites or posted to a listserv without
the copyright holder's express written permission. However, users may print, download, or
email articles for individual use.
