
Research Proposal

A. Basic Information

Project Title: Predicting Altcoin Future Trends by Vector Autoregression Using the Current Value of Bitcoin

Topic: Machine Learning, Data Analysis, Forecasting, Statistics, Autoregression, Blockchain, Cryptocurrency

Proponents: John Ashly Barrientos, Rogelio Diaz Jr., Mary Loid Garcia, John Rey Serdiña

B. Technical Description

Background of the Study:

Cryptocurrency has gained considerable attention recently, in part because of non-fungible token (NFT) games that let players
exchange the currencies they earn in-game for real money. However, because cryptocurrency prices are volatile, many people remain
skeptical about their legitimacy, and some are afraid to invest in something so uncertain.

This research aims to create a simple forecasting model for alternative coins, or "altcoins", such as Ethereum, Dogecoin, and Litecoin,
using the value of Bitcoin and vector autoregression (VAR). Because some altcoins are influenced by the value of Bitcoin, the
researchers decided to use that value to predict changes in altcoin prices over time with the VAR model. The data set will be taken
from sources on the internet, specifically CoinGecko and other cryptocurrency listings. This research may serve as a guide that helps
people who are interested in buying and investing in cryptocurrency make wiser, better-informed decisions.
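
For reference, a two-variable VAR of order p relating Bitcoin and one altcoin can be written as below. The notation is illustrative and not taken from the proposal: B_t and A_t stand for the (suitably transformed) Bitcoin and altcoin values at time t, the c and a terms are coefficients to be estimated, and the epsilon terms are error terms.

```latex
\begin{aligned}
B_t &= c_1 + \sum_{i=1}^{p} \left( a_{11}^{(i)} B_{t-i} + a_{12}^{(i)} A_{t-i} \right) + \varepsilon_{1,t} \\
A_t &= c_2 + \sum_{i=1}^{p} \left( a_{21}^{(i)} B_{t-i} + a_{22}^{(i)} A_{t-i} \right) + \varepsilon_{2,t}
\end{aligned}
```

In this form, the coefficients a_{21}^{(i)} describe how past Bitcoin values enter the altcoin equation.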

Statement of the Problem: General and Specific

The volatile nature of cryptocurrency can be a major obstacle for someone who is new to this type of market. The price of any
cryptocurrency might be soaring right now, but it can fall within hours, minutes, or even seconds. This research aims to reduce
that obstacle and attempts to solve the following problems:
1. How can a prediction or forecast be drawn using the Vector Autoregression model and the gathered data?
2. How do altcoins and Bitcoin correlate with each other based on the Vector Autoregression model?
3. How accurate is the forecast?

How did others solve the problem? (10 articles)

1. Anupriya, & Garg, S. (2018). Autoregressive Integrated Moving Average Model based Prediction of Bitcoin Close Price. 473-478.

In this paper, the Bitcoin close price is predicted using the ARIMA model. The ARIMA model is found
suitable for predicting Bitcoin prices because it is designed for time series data. The
results are visualized using the R programming language. The predicted values are then compared with actual prices, and the
percent mean error is calculated. The percent mean error is found to be less than 6% for most of the values.

2. Alessandretti, L., ElBahrawy, E., Baronchelli, A. (2018). Anticipating Cryptocurrency Prices Using Machine Learning. Complexity, 2018(8983590):16

In this research, the researchers used a machine learning and AI-assisted approach to test the hypothesis that the inefficiency of
the cryptocurrency market can be exploited to generate abnormal profits. They tested the performance of three forecasting
models on daily cryptocurrency prices. Two of them (Method 1 and Method 2) were based on gradient boosting
decision trees, and one (Method 3) was based on long short-term memory recurrent neural networks. In Method 1, the same model
was used to predict the return on investment of all currencies; in Method 2, they built a different model for each currency that uses
information on the behaviour of the whole market to make a prediction on that single currency; in Method 3, they used a different
model for each currency, where the prediction is based on previous prices of the currency. They analyzed daily data for
cryptocurrencies for the period between Nov. 2015 and Apr. 2018. The study shows that simple trading strategies assisted by
state-of-the-art machine learning algorithms outperform standard benchmarks. The results showed that nontrivial, but ultimately simple,
algorithmic mechanisms can help anticipate the short-term evolution of the cryptocurrency market.

3. Abraham, J., Higdon, D., Nelson, J., Ibarra, J. (2018). Cryptocurrency Price Prediction Using Tweet Volumes and Sentiment Analysis. SMU Data
Science Review 1(3):1.

In this paper, the authors present a method for predicting changes in Bitcoin and Ethereum prices using Twitter data and
Google Trends data. Bitcoin and Ethereum, the two largest cryptocurrencies by market capitalization, represent over $160
billion in combined value. Twitter is increasingly used as a news source that influences purchase decisions by informing users
of a currency and its increasing popularity. As a result, quickly understanding the impact of tweets on price direction can provide
a purchasing and selling advantage to a cryptocurrency user or trader. By analyzing tweets, the authors found that tweet volume,
rather than tweet sentiment (which is invariably positive overall regardless of price direction), is a predictor of price direction.
Using a linear model that takes tweets and Google Trends data as input, they were able to accurately predict the direction of
price changes. With this model, a person can make better-informed purchase and selling decisions related to Bitcoin
and Ethereum.

4. Spilak, B. (2018). Deep Neural Networks for Cryptocurrencies Price Prediction.

This paper presents a neural network framework that provides a deep machine learning solution to the price prediction problem.
The framework is realized in three instances: a Multilayer Perceptron (MLP), a simple Recurrent Neural Network (RNN), and a
Long Short-Term Memory (LSTM) network, which can learn long-term dependencies. The author describes the theory of neural networks
and deep learning in order to build a reproducible method for applications in the cryptocurrency market. Since price
prediction is used to make financial decisions such as trade signals, the author compares different approaches to the prediction
problem by exploring supervised learning methods in classification tasks. The models are studied to predict out-of-sample
price directions of eight major cryptocurrencies with a rolling window regression method, and different weighted
portfolios are compared to test how an investor can benefit from fundamental indicators such as market capitalization. The author
concludes that LSTM has the best accuracy for predicting directional movements for the most important cryptocurrencies of CRIX and
that an equally weighted portfolio beats CRIX in the first quarters of 2017.

5. Sathyanarayana, S.S. & Sudhindra, G. (2019). Modeling Cryptocurrency (Bitcoin) using Vector Autoregressive (Var) Model. SDMIMD Journal of
Management. 10. 47-64.

This research investigates the relationship between five major global currencies, namely USD, GBP, Euro, CHF, and
Japanese Yen, and the prominent cryptocurrency Bitcoin. For the purpose of the study, data were collected from
yahoo.com and various websites for the period from September 2013 to March 2018. In the first
phase, the collected data were tested for normality and then for the existence of a unit root by running an ADF
test. GARCH(1,1) and EGARCH models were then estimated. Since the data were integrated of order one, a Johansen cointegration
test was conducted, and a VECM model (USD and GBP) and an unrestricted VAR (Euro and Japanese Yen) were run. In the last
phase, an impulse response function and an ARDL model were used to draw conclusions. The researchers concluded that
variance regressors such as USD, GBP, and Japanese Yen also contribute significantly to the volatility of Bitcoin.

6. Bohte, R., Rossini, L. (2019). Comparing the Forecasting of Cryptocurrencies by Bayesian Time-Varying Volatility Models. Journal of Risk and Financial
Management, 12(3):150.

This paper studies the forecasting ability of cryptocurrency time series. It covers the four most capitalized
cryptocurrencies: Bitcoin, Ethereum, Litecoin, and Ripple. Different Bayesian models are compared, including models with
constant and time-varying volatility, such as stochastic volatility and GARCH. Compared with the BVAR model, the forecasts of the
BVAR-GARCH model fall within the 95% credible interval more often for every cryptocurrency except Ripple. This would imply that
the forecasts are less volatile using the BVAR-GARCH model than using the BVAR model, with the opposite holding for Ripple. The
BVAR-SV and BVARX-SV models have the highest percentages for all the cryptocurrencies except Bitcoin. This would suggest that using
stochastic volatility will not give a good prediction overall when judged by credible intervals. The results of the BVAR model and the
BVARX model are close to each other, so there is no clear distinction between these two models. However, the BVARX-GARCH
model stands out the most, giving the most forecasts within the 95% credible interval; the only exceptions are
the BVAR-GARCH model for Ethereum and the BVARX-SV model for Bitcoin.

7. Cohen, G. (2020). Forecasting Bitcoin Trends Using Algorithmic Learning Systems.

This research examines the ability of two forecasting methods to forecast Bitcoin’s price trends. The research is based on
Bitcoin to US dollar prices from the beginning of 2012 until the end of March 2020. Particle swarm
optimization was used to find the best forecasting combinations of setups. Results show that Bitcoin’s price changes do not follow the
“Random Walk” efficient market hypothesis and that both the Darvas Box and Linear Regression techniques can help traders
predict Bitcoin’s price trends. The study concluded that both methodologies work better at predicting an uptrend than a downtrend.
The best setup for the Darvas Box strategy is six days of formation. A Darvas Box uptrend signal was found to be efficient at predicting
four sequential daily returns, while a downtrend signal faded after two days on average. The best setup for the Linear Regression
model is 42 days with 1 standard deviation.

8. Sebastião, H. & Godinho, P. (2021). Forecasting and trading cryptocurrencies with machine learning under changing market conditions. Financial
Innovation 7(3)

This study examines the predictability of three major cryptocurrencies (Bitcoin, Ethereum, and Litecoin) and the profitability of
trading strategies devised upon machine learning (ML), namely linear models, random forests (RF), and support vector machines (SVMs).
The models are validated in a period characterized by
unprecedented turmoil and tested in a period of bear markets, allowing the assessment of whether the predictions are good even
when the market direction changes between the validation and test periods. The classification and regression methods use
attributes from trading and network activity for the period from August 15, 2015 to March 03, 2019, with the test sample beginning
on April 13, 2018. For the test period, five out of 18 individual models have success rates of less than 50%. The trading strategies
are built on model assembling. The ensemble assuming that five models produce identical signals (Ensemble 5) achieves the best
performance for Ethereum and Litecoin, with annualized Sharpe ratios of 80.17% and 91.35% and annualized returns (after
proportional round-trip trading costs of 0.5%) of 9.62% and 5.73%, respectively. These positive results support the claim that
machine learning provides robust techniques for exploring the predictability of cryptocurrencies and for devising profitable trading
strategies in these markets, even under adverse market conditions.

9. Patel, M. Sudeep, T. Gupta, R. Neeraj, K. (2020). A Deep Learning-based Cryptocurrency Price Prediction Scheme for Financial Institutions. Journal of
Information Security and Applications (55)

In this paper, hybrid cryptocurrency prediction schemes based on Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU)
networks are proposed; they focus on only two cryptocurrencies, namely Litecoin and Monero. The results show that the
proposed scheme predicts prices with high accuracy, suggesting that it can be applied to price prediction for various
cryptocurrencies.

10. Awotunde J.B., Ogundokun R.O., Jimoh R.G., Misra S., Aro T.O. (2021) Machine Learning Algorithm for Cryptocurrencies Price Prediction. In: Misra S.,
Kumar Tyagi A. (eds) Artificial Intelligence for Cyber Security: Methods, Issues and Possible Horizons or Opportunities. Studies in Computational
Intelligence (972).

This study adapts Long Short-Term Memory (LSTM) to build a cryptocurrency price prediction model. The key
factors used are available price, close price, high price, low price, volume, and market cap, together with the interdependencies among
some cryptocurrencies. The study thus centers on measuring the vital features that influence the trade’s unpredictability and on
applying the model to increase the effectiveness of the process. The LSTM model outperformed other models for the Bitcoin, Ether,
and Litecoin cryptocurrencies. With 67.43% accuracy, the proposed model is found to be efficient for cryptocurrency price prediction
when compared to similar models.

How do you intend to solve the problem:

1. How can a prediction or forecast be drawn using the Vector Autoregression model and the gathered data?

The researchers will collect data from the internet and import the gathered dataset into a Python program. The program will then
model the time series of each altcoin and train a model on the dataset. The forecast is
generated by the trained model; to obtain values on the scale of the real data, the researchers will
invert the forecast back to its original scale.

2. How do altcoins and Bitcoin correlate with each other based on the Vector Autoregression model?

The researchers will examine the model fitted by Vector Autoregression in
order to determine the relationship between the altcoins and Bitcoin.

3. How accurate is the forecast?

The researchers will test the accuracy of the forecast by comparing the predicted data with the actual data
from the cryptocurrency listings.
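
The approach described above can be illustrated with a minimal sketch. It rests on several assumptions that the proposal does not state: the change of scale is taken to be first differencing, the VAR has order 1, and only one altcoin is modeled alongside Bitcoin. It uses only the Python standard library so that it runs as is; in practice a statistics library would likely be used instead. All function names are hypothetical.

```python
def difference(series):
    """First differences: d[t] = series[t + 1] - series[t]."""
    return [b - a for a, b in zip(series, series[1:])]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_var1(d_btc, d_alt):
    """Least-squares fit of a VAR(1) on differenced data.

    Returns one row [constant, coef_on_lagged_btc, coef_on_lagged_alt]
    per equation: row 0 is the bitcoin equation, row 1 the altcoin equation.
    """
    # Regressors at time t are the constant and both series at time t - 1.
    rows = [[1.0, b, a] for b, a in zip(d_btc[:-1], d_alt[:-1])]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coefs = []
    for target in (d_btc[1:], d_alt[1:]):
        xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(3)]
        coefs.append(solve(xtx, xty))
    return coefs


def forecast_levels(btc, alt, coefs, steps):
    """Forecast differences recursively, then add them back up to price levels."""
    d_b, d_a = btc[-1] - btc[-2], alt[-1] - alt[-2]
    level_b, level_a = btc[-1], alt[-1]
    out = []
    for _ in range(steps):
        d_b, d_a = [c[0] + c[1] * d_b + c[2] * d_a for c in coefs]
        level_b += d_b  # inverting the differencing: cumulative sum
        level_a += d_a
        out.append((level_b, level_a))
    return out
```

Given two equally long price lists `btc` and `alt`, calling `fit_var1(difference(btc), difference(alt))` and then `forecast_levels(btc, alt, coefs, steps)` yields forecasts on the original price scale. The coefficient on lagged Bitcoin in the altcoin row, `coefs[1][1]`, is one simple indication of how Bitcoin movements carry over into the altcoin.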

Conceptual Framework

Deliverables:

The following are the deliverables of this study:

1. Graph representation of the altcoin time series.

2. Data set of the altcoins in tables.

3. Prediction of the future values of altcoins.

4. Validation of the Vector Autoregression predictions against the actual cryptocurrency listings.
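
The validation in the last deliverable compares predicted and actual values. The proposal does not name an error metric, so the sketch below assumes two common ones, the mean absolute percentage error (MAPE) and the root mean squared error (RMSE). The function names and the sample prices are hypothetical and for illustration only.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the prices."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical prices, not real market data.
actual = [100.0, 110.0, 120.0]
predicted = [90.0, 110.0, 132.0]
print(f"MAPE: {mape(actual, predicted):.2f}%")
print(f"RMSE: {rmse(actual, predicted):.2f}")
```

MAPE is scale-free, which makes it easier to compare altcoins with very different price levels, while RMSE penalizes large misses more heavily.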

Compilation of Algorithms: For the research area identified above, list the algorithms/technologies being used. Add more rows to the
table, if necessary.
Algorithm: ARIMA & ARMA
Strength: For ARIMA and Auto-ARIMA, you can run as many forecast periods as you wish if you only use the time-series variable (Y).
Weakness: The traditional model identification techniques for identifying the correct model from the class of possible models are
difficult to understand and usually computationally expensive.
Application (Cited): The ARIMA model has been used extensively in the field of finance and economics, as it is known to be robust and
efficient and has strong potential for short-term share market prediction. (Nagesh Singh Chauhan, 2020)

Algorithm: Long Short-Term Memory (LSTM)
Strength: LSTM is well-suited to classify, process, and predict time series given time lags of unknown duration. Relative
insensitivity to gap length gives LSTM an advantage over alternative RNNs, hidden Markov models, and other sequence learning methods.
Weakness: LSTMs are prone to overfitting, and it is difficult to apply the dropout algorithm to curb this issue. Dropout is a
regularization method where input and recurrent connections to LSTM units are probabilistically excluded from activation and weight
updates while training a network.
Application (Cited): LSTM networks were studied and implemented for classification of gesture data because of their ability to learn
long-term dependencies. The designed model could classify 26 gestures with an accuracy of 98%, showing the feasibility of using
LSTM-based neural networks for the purpose of sign language translation. (Abraham, A. Nayak and A. Iqbal. 2019. "Real-Time Translation
of Indian Sign Language using LSTM")

Algorithm: Linear Regression
Strength: Linear regression is straightforward to understand and explain, and can be regularized to avoid overfitting. In addition,
linear models can be updated easily with new data using stochastic gradient descent.
Weakness: It lacks practicality since most problems in the real world aren’t “linear”. Linear regression performs poorly when there
are non-linear relationships. A strong correlation does NOT imply a cause-and-effect relationship. Linear regression is also sensitive
to outliers, that is, data points that are surprising.
Application (Cited): A proposed local linear regression model was applied to short-term traffic prediction. The performance of the
model was compared with previous results of nonparametric approaches that are based on local constant regression, such as the
k-nearest neighbor and kernel methods, by using 32-day traffic-speed data collected on US-290, in Houston, Texas, at 5-min intervals.
It was found that the local linear methods consistently showed better performance than the k-nearest neighbor and kernel smoothing
methods. (Sun H, Liu HX, Xiao H, He RR, Ran B. Use of Local Linear Regression Model for Short-Term Traffic Forecasting.)

Algorithm: Gradient Boosting Decision Trees
Strength: Generally, gradient boosting decision trees are more accurate compared to other models and train faster, especially on
larger datasets. Most implementations support handling categorical features, and some handle missing values natively.
Weakness: It is prone to overfitting; models can be computationally expensive and take a long time to train, especially on CPUs; in
addition, it is hard to interpret the final models.
Application (Cited): The article explores the applicability of crowdsourced data for mobility prediction and analysis. The authors
apply a gradient boosting trees algorithm to model individuals’ mobility decision-making processes. The applicability of the developed
model is seen as a potential platform for personalized mobility management in smart cities and a communication tool between the city
and users. (Semanjski I, Gautama S. Smart City Mobility Application—Gradient Boosting Trees for Mobility Prediction and Analysis Based
on Crowdsourced Data. Sensors. 2015)

References:
1. Anupriya, & Garg, S. (2018). Autoregressive Integrated Moving Average Model based Prediction of Bitcoin Close Price. 473-478.
http://dx.doi.org/10.1109/ICSSIT.2018.8748423

2. Alessandretti, L., ElBahrawy, E., Baronchelli, A. (2018). Anticipating Cryptocurrency Prices Using Machine Learning. Complexity, 2018(8983590):16
https://doi.org/10.1155/2018/8983590

3. Abraham, J., Higdon, D., Nelson, J., Ibarra, J. (2018). Cryptocurrency Price Prediction Using Tweet Volumes and Sentiment Analysis. SMU Data Science Review
1(3):1. https://scholar.smu.edu/datasciencereview/vol1/iss3/1

4. Spilak, B. (2018). Deep Neural Networks for Cryptocurrencies Price Prediction. https://d-nb.info/1185667245/34

5. Sathyanarayana, S.S. & Sudhindra, G. (2019). Modeling Cryptocurrency (Bitcoin) using Vector Autoregressive (Var) Model. SDMIMD Journal of Management. 10. 47-64.
http://dx.doi.org/10.18311/sdmimd/2019/23181

6. Bohte R, Rossini L. (2019) Comparing the Forecasting of Cryptocurrencies by Bayesian Time-Varying Volatility Models. Journal of Risk and Financial Management;
12(3):150. https://doi.org/10.3390/jrfm12030150

7. Cohen, G. (2020). Forecasting Bitcoin Trends Using Algorithmic Learning Systems.
https://www.mdpi.com/1099-4300/22/8/838/pdf

8. Sebastião, H. & Godinho, P. (2021). Forecasting and trading cryptocurrencies with machine learning under changing market conditions. Financial Innovation 7(3)
https://doi.org/10.1186/s40854-020-00217-x

9. Patel, M. Sudeep, T. Gupta, R. Neeraj, K. (2020). A Deep Learning-based Cryptocurrency Price Prediction Scheme for Financial Institutions. Journal of Information
Security and Applications 55
https://doi.org/10.1016/j.jisa.2020.102583

10. Awotunde J.B., Ogundokun R.O., Jimoh R.G., Misra S., Aro T.O. (2021) Machine Learning Algorithm for Cryptocurrencies Price Prediction. In: Misra S., Kumar Tyagi A.
(eds) Artificial Intelligence for Cyber Security: Methods, Issues and Possible Horizons or Opportunities. Studies in Computational Intelligence (972).
https://doi.org/10.1007/978-3-030-72236-4_17

11. https://libres.uncg.edu/ir/uncw/f/zhai2005-2.pdf

12. Morton Glantz, Johnathan Mun, in Credit Engineering for Bankers (Second Edition), 2011

13. https://iq.opengenus.org/long-short-term-memory-lstm/

14. https://www.geeksforgeeks.org/understanding-of-lstm-networks/

15. https://www.mvorganizing.org/what-are-the-advantages-and-disadvantages-of-regression-analysis/

16. https://www.analyticssteps.com/blogs/simple-linear-regression-applications-limitations-examples

17. https://medium.com/@kevinkhang2909/advantages-and-disadvantages-of-each-algorithm-use-in-machine-learning-cb973d1aee15

18. https://towardsdatascience.com/machine-learning-part-18-boosting-algorithms-gradient-boosting-in-python-ef5ae6965be4

19. https://www.kdnuggets.com/2021/04/gradient-boosted-trees-conceptual-explanation.html
