
2017 International Conference on Frontiers of Information Technology

Regression analysis for ATM cash flow prediction


Akber Rajwani, Tahir Syed, Behraj Khan, Sadaf Behlim
National University of Computer and Emerging Sciences, Karachi, Pakistan,
e-mail: {tahir.syed}@nu.edu.pk

ABSTRACT

One of the most challenging tasks for a bank is to maintain cash in its ATMs (Automated Teller Machines) so that it can serve its customers without interruption. To solve this problem, banks create a daily estimate for each of their ATMs, which can result in "Out of Cash" or "Over Stock" situations. This calls for a solution that can reasonably predict how much cash would be needed for the next day by examining and learning from past transactional data. We present results for regression techniques, including, to the best of our knowledge, the first use of the LSTM model for time series on this problem, to solve the "Cash Estimation" problem. This would allow banks to adapt to the changing needs of cash according to specific occasions, holidays, etc. The dataset we use is the transaction record of the past 2.5 years for ATMs situated in a busy district of Karachi, Pakistan. This research will help banks effectively reduce the extra cost they bear for the maintenance of their cash, as well as increase customer satisfaction.

Keywords—ATM Cash Flow, Regression, Long Short Term Memory
I. INTRODUCTION

Maintaining cash in ATMs (Automated Teller Machines) is a significant operating condition for commercial banks, in order to fulfill customer demand and avoid penalties from central banks. As per recent reports, the number of cardholders is growing day by day, and with these increasing figures, cash withdrawal transactions have been amplified to a great extent [1]. Therefore, it becomes really challenging for banks to maintain a sufficient amount of cash in their ATMs. Nevertheless, most banks try to address this by setting a constant threshold amount for their ATMs or by making lump-sum estimations for the same purpose. This way of handling cash may result in out-of-stock or over-stock situations. In both cases, there is a great decline in terms of profit. According to one survey that took place in 2013, it was observed that 21 percent of people use other banks' ATMs due to out-of-cash situations [1].

Managing cash in ATMs is very crucial for banks in terms of profit and customer satisfaction, and that is why banks have started giving special attention to this issue. Due to the increasing count of cash withdrawal transactions, a bank cannot afford to fix the replenishment amount for each of its ATMs. This replenishment amount may vary from ATM to ATM, depending on various factors such as location (where the ATM is planted), peak factor (what the peak hours of transactions are), day of week, day of month, and many more. For example, an ATM located in a mall has more transaction load on weekends as compared to an ATM located in a residential area. Hence there is a need for an effective cash management solution which enables the bank to forecast the need for cash replenishment in its ATMs. The algorithm/technique must also be flexible enough to allow the bank to re-forecast future demand and perform what-if analyses to optimize the bank's ATM network for cash distribution [9].

The key role of an ATM cash forecasting model is to collect the former data and process it, so that it can predict future demand. Many authors have contributed towards this cause; their work is discussed in the literature review section. In this paper we apply several such techniques to an indigenous dataset and suggest a novel approach for predicting ATM cash. Although much work has been done to forecast ATM cash flows, as enumerated in the literature section, no research has been done for Pakistan. We believe that the location of the data is an important feature and has a great effect on the result. Hence the purpose of our research is to propose a novel data-driven approach for ATM cash flow prediction on an indigenous dataset. Therefore, the objectives of this work are:

• To collect the dataset, preprocess it and perform exploratory data analysis.
• To apply machine learning techniques for the selection of contributive and discriminative features.
• To propose a data-driven approach for ATM cash outflow prediction on an indigenous dataset.

II. LITERATURE REVIEW

Numerous authors have worked to alleviate the ATM cash problem. Ahmadreza Ghodrati et al. presented research on ATM cash management. They used a genetic algorithm approach to determine the refill amount of each ATM [7]. The data in this research was collected via survey and consisted of transaction data for the years 2011-2012 of an Iranian bank, Ayandeh (Ayandeh Bank of Iran). On the basis of their exploration, the authors concluded that some of the bank's ATMs need to be uploaded with cash every 3 days, while others should be uploaded with cash on a daily basis.

Saad M. Darwish also worked to improve the estimation correctness of ATM cash demand. In 2013 the approach he used was an extension of the ANN, namely an Interval Type-2 Fuzzy Neural Network (IT2FNN). He used simulated data of 25 ATMs for his experiments [9]. The structure of the data he used consisted of daily, weekly and monthly seasonality with localized sudden changes (gazetted/public holidays and festival effects), used to imitate customers' money withdrawals from ATMs that are categorized by different transaction volumes. The experiment showed that the average forecast accuracy (per week) of the proposed technique is about 97.72%, while the minimum forecast accuracy is 94.15%.

Abirami et al. (2014) used data mining approaches to deal with ATM cash prediction. The key objectives of their work were to provide easy identification of ATM norms and to monitor the ATM usage (peak) times, so that an ATM is available when it is needed most. The data consisted of 30 days of transactions of one ATM for testing purposes, later extended to the transactions of 30 ATMs carried out in a month. The results in their work were predicted from the former (past) data to track the appropriate solution needed. The types of transactions which are carried out frequently are characterized, and on the basis of that characterization, services better than normal can be provided to the customers. Peak hours were also identified, demonstrating on what day of a month a customer will use a particular ATM most, so that more comfort can be provided in peak hours [3].

Simutis et al. [5] demonstrated an approach based on an artificial neural network (ANN) to forecast the daily cash need for every ATM in a bank network, and invented a procedure for the cash upload of ATMs. They discussed existing solutions for ATM cash flow prediction and observed a bank network comprising 1225 ATMs. In their technique, they discussed the most important factors for ATM maintenance, such as the cost of cash, the cost of uploading and the cost of daily services. Their work showed that, in the case of a higher interest rate and minimal cost of money uploading in ATMs, their procedure reduces ATM maintenance costs by up to 15-20%. However, they pointed out that further experimental studies are necessary for practical execution.

Erol Genevois et al. highlighted the problem of ATM location and cash management in automated teller machines in 2015. In their research they discussed two problems which banks are facing: one is finding a suitable location for an ATM and the other is the cash management strategy. The authors suggested new ways to plant new ATMs according to location. In addition, they also discussed in detail various techniques adopted by other researchers for optimal cash management [6].

Brentnall et al. [17] constructed methods for forecasting the daily amounts of withdrawals from automated teller machines (ATMs). The data they used consisted of two years of information on 190 ATMs of a bank in the United Kingdom. They applied different existing models, such as linear models, autoregressive models, structural time series models and Markov-switching models, and compared them. Furthermore, they experimented with a different model for each ATM, and used a logarithmic scoring rule in order to conclude the most suitable seasonal and distributional assumptions for each ATM. In their work they mentioned that by using a different technique for each ATM, they had a chance to examine the pros and cons of each technique. In their study, a performance indicator was used to examine each method/model on the existing data.

In their additional research [18], they used a random-effects model to predict how much a customer usually withdraws on each visit. They used a Multinomial distribution for the distribution of amounts, and to model the distribution of the random effects they used the Dirichlet distribution and an empirical distribution. They used a sample of 5,000 UK bank accounts to perform several tests and to analyze the efficiency of their models. They concluded that the empirical distribution of random effects works well with 5,000 accounts, but also noted that a bank has millions of accounts.

The objective of this research is to find the best solution for forecasting the amount of cash that needs to be deployed in an ATM for withdrawal transactions. The following is our opted methodology.

III. DATASET

In this section we discuss the exploratory data analysis (EDA): how we extracted features from the data, how we created the new feature set, and how we plotted the data to analyze patterns. We collected ATM transaction data covering two and a half years. The data consist of 120,246 rows of transaction information for transactions carried out on the ATM, including withdrawal, funds transfer, title fetch, bill payment, PIN change, balance inquiry and many more. We also managed to arrange the ATM replenishment data for the same 2.5 years, which record how much money was inserted into the ATM and how much was present before the new cash was inserted. The transaction data we received from the bank consist of 60 columns. Each column contains transaction information, such as date, amount, time, terminal ID (ATM ID), ATM location, currency code, customer ID, card number, tracks information, transaction type, channel type, transaction ID, etc. Each of these columns carries unique information: the currency code specifies in which currency the transaction was performed; the amount specifies how much money was transacted; the tracks information is the information stored on the magnetic stripe, which cannot be read by the human eye; the channel type shows the source channel from which the transaction was initiated; the transaction log ID is the unique reference number for each transaction; and the transaction type indicates the type of transaction. Table I shows the transaction codes and descriptions, along with the count of transactions carried out in the 2.5 years.

TABLE I: Transaction Code Description

Transaction Code   Transaction Description      Count
01                 Withdrawal                   86128
30                 Balance Inquiry              18274
43                 Open Ended                    1410
44                 Inter Bank Funds Transfers    2626
50                 PIN Change                       1
53                 Mini Statement                3197
54                 PIN Change                      25
56                 Cheque Book Request              3
57                 PIN Validation ATM            1553
63                 Title Fetch                   5467
71                 Bill Relationship Inquiry      615
72                 Bill Inquiry                   558
73                 Bill Payment                   389

Our problem statement was to forecast cash withdrawals, therefore we eliminated all transaction data except withdrawals. After this elimination, our dataset was reduced to around 86,129 rows. In these data we have two kinds of withdrawal transactions: approved (Response Code = 000), which guarantees that money was dispensed in the transaction, and not approved (Response Code != 000), which means the transaction was rejected for some reason.

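Purely as an illustration of this filtering step, a pandas sketch follows; the file and column names (atm_transactions.csv, TRAN_CODE, RESPONSE_CODE, TRAN_DATE) are hypothetical, since the bank's actual 60-column schema is not given:

```python
import pandas as pd

# Hypothetical file and column names; the real extract has 60 columns.
df = pd.read_csv("atm_transactions.csv", parse_dates=["TRAN_DATE"])

# Keep only withdrawals (transaction code 01 in Table I).
withdrawals = df[df["TRAN_CODE"] == 1]

# Approved withdrawals dispensed cash (response code 000, Table II);
# everything else was rejected for some reason.
approved = withdrawals[withdrawals["RESPONSE_CODE"] == 0]
declined = withdrawals[withdrawals["RESPONSE_CODE"] != 0]
print(len(withdrawals), "withdrawal rows;", len(approved), "approved")
```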
Table II shows the response codes and their descriptions, along with the counts of approved and not-approved transactions.

TABLE II: Response code and description of transaction

Response Code   Response Description       Count
000             Approved                   74951
001             Limit Exceeded              1731
002             Account Not Found            254
003             Account Inactive              62
004             Low Balance                 2879
007             Card not found                 6
009             Error in input data           10
010             Duplicate Transaction          4
014             Warm Card                    326
015             Stolen/Lost Card              14
016             Bad Card Status               23
017             Customer Cancellation        163
020             Invalid response              31
024             Bad PIN                      542
025             Expired Date                  24
028             Account not linked           199
029             Internal Error                 3
039             No Credit account              1
041             Expired date mismatch          5
045             Unable to process            272
049             Internal message Error        27
050             Host status unknown          100
051             Host not processing          569
053             No saving account             31
054             Safe transmit mode            33
055             Host link down               894
056             Sent to issuer               263
058             Timed out                   1018
060             PIN retries exhausted        127
061             HSM not responding            24
079             Honor with ID                 17
080             Message format found           3
083             No Comms Key                   1
091             Issuer reversal              263
094             TXN is not allowed             3
096             Transaction rejected          10
097             Cash has expired              21
104             Account Blocked              148
216             Routing not found              2
968             ATM reversed                  15
975             Faulty Dispense                9
976             Cash Retracted                29

The replenishment data contain the information of the denomination each cassette holds, along with the note counts. These data also contain information regarding how many notes were inserted into that particular ATM. There were 459 rows altogether.
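The experiments that follow operate on a daily series of withdrawal amounts and counts. A minimal sketch of how that series could be built, continuing the hypothetical schema of the previous snippet (AMOUNT is another assumed column name):

```python
import pandas as pd

# `approved` is the frame of approved withdrawals from the previous sketch.
daily = (approved
         .groupby(approved["TRAN_DATE"].dt.date)
         .agg(amount=("AMOUNT", "sum"),   # total withdrawn per day
              count=("AMOUNT", "size")))  # number of withdrawals per day
daily.index = pd.to_datetime(daily.index)
ts = daily["amount"]  # the daily amount series used in the experiments below
```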
IV. EXPERIMENTATION

In this section, we start by mapping the transaction pattern of the 2.5 years, as shown in figure 1. In the figure we can notice that the total of most transactions occurring every day is between 50k and 100k, indicating the transactional pattern. The correlation between transaction amount and transaction count is 0.98, which indicates that most transactions are of small amounts. We used regression for solving the ATM cash management problem. Normally, a regression analysis is carried out for two purposes: the first is to predict the value/behavior of the dependent variable for individuals for which we have some information on the explanatory variables, and the second is to estimate the influence of some explanatory variable over the dependent variable. In our case the output variable is the transaction amount (we are predicting the amount needed on the next day) and the rest of the parameters are input variables.

Fig. 1: Relationship between transaction amount and date

Figure 1 shows the relationship between transaction amount and date. The green spots show the dates in 2013, while the blue and red spots indicate the dates in 2014 and 2015, respectively. In this plot it can be noticed that the total of most transactions per day is between 50k and 100k, which clearly shows the pattern. The next plot, figure 2, shows the relationship between transaction amount and transaction count. The Pearson coefficient value of 0.94 clearly indicates that these two variables are highly correlated with each other. The maximum transaction count encountered in our data is 200, and the maximum transaction amount is 250k. The graph is linear, which means that as the transaction count increases, the amount of withdrawals also increases, as shown in figure 2.

Fig. 2: Correlation between transaction amount and transaction count

A high correlation between transaction amount and transaction count indicates that most transactions were of a small amount of money, and no single large transaction accounted for a large chunk of the money.
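As an illustration, the reported Pearson coefficient could be computed on such a daily series as follows (the daily frame is the hypothetical one sketched in Section III):

```python
# Pearson correlation between daily withdrawal amount and transaction count,
# computed on the hypothetical `daily` frame built in Section III.
r = daily["amount"].corr(daily["count"], method="pearson")
print(f"Pearson correlation: {r:.2f}")
```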
All the experiments were carried out with a 70/30 split, with 70 percent of the data used for training the model and 30 percent for testing it. We have used the Scikit-learn library to conduct our experiments. Below is a set of functions projected for regression in which the output value is expected to be a linear combination of the input variables.

In mathematical terms, if

y(w, x)                                    (1)

is the output value, then we can say

y(w, x) = w0 + w1 x1 + ... + wp xp         (2)

The above formula designates the vector w = (w1, ..., wp) as coef_ and w0 as intercept_.

Linear Regression fits a linear model with coefficients w = (w1, ..., wp) to minimize the residual sum of squares between the observed responses in the dataset and the responses predicted by the linear approximation. Mathematically, it solves a problem of the form:

min_w ||Xw - y||^2                         (3)
A. Linear Regression Model

With the Linear Regression Model with default settings, we got the following results: mean squared error: 0.05, variance score: 0.79. It can be noticed that the error is low: it shows that we are 0.05 away from the actual result on the normalized scale. The ideal value of the variance score is 1, but in our case it is 0.79, which is acceptable.
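A minimal scikit-learn sketch of this experiment follows; the feature matrix X and next-day-amount target y are synthetic placeholders, since the exact feature set is not specified:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, explained_variance_score

# X: input features (e.g. day of week, day of month, transaction count);
# y: next-day withdrawal amount. Both are synthetic placeholders here.
rng = np.random.default_rng(0)
X = rng.random((900, 4))
y = X @ np.array([0.4, 0.3, 0.2, 0.1]) + 0.05 * rng.standard_normal(900)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)  # the paper's 70/30 split

model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)
print("MSE:", mean_squared_error(y_test, pred))
print("Variance score:", explained_variance_score(y_test, pred))
```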
B. Ridge Linear Regression Model

The Ridge Regression technique is used when the data suffer from multicollinearity. Under multicollinearity, even though the least squares (OLS) estimates are unbiased, their variances are large, which deviates the observed value far from the actual value. By adding a degree of bias to the regression estimates, ridge regression reduces the standard errors. Ridge regression addresses the multicollinearity problem through the shrinkage parameter λ (lambda). Some important points about Ridge: the assumptions of this regression are the same as those of least squares regression, except that normality is not assumed; it shrinks the value of the coefficients, but they do not reach 0, so it performs no feature selection; this method regularizes using l2 regularization. After running ridge we got the following results: mean squared error: 0.03 and variance score: 0.89. It can be noticed that both the MSE and the variance score improved.
C. Experiment with Lasso Model

The Least Absolute Shrinkage and Selection Operator is a regression method that involves constraining the absolute size of the regression coefficients. By constraining the sum of the absolute values of the estimates, we end up in a condition where some of the parameter estimates may be exactly 0. The larger the penalty applied, the further the estimates are shrunk towards 0. This is convenient for some automatic variable selection, or when dealing with highly correlated predictors, where standard regression will usually have 'too large' coefficients. After running lasso we got the following results: mean squared error: 0.03 and variance score: 0.90. It can be noticed that both the MSE and the variance score improved.

D. Experiment with Bayesian Ridge Regression

Bayesian linear regression is a method of linear regression in which the statistical analysis is undertaken within the context of Bayesian inference. After running BRR we got the following results: mean squared error: 0.03 and variance score: 0.90.
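The shrinkage-based models above (plus RidgeCV, which appears in Table III) could be compared side by side as sketched below, reusing the hypothetical split from the previous snippet:

```python
from sklearn.linear_model import Ridge, Lasso, BayesianRidge, RidgeCV
from sklearn.metrics import mean_squared_error, explained_variance_score

# Reuses X_train, X_test, y_train, y_test from the previous sketch.
models = {
    "Ridge": Ridge(alpha=1.0),                    # l2 shrinkage (alpha plays the role of lambda)
    "Lasso": Lasso(alpha=0.001),                  # l1 shrinkage, can zero out coefficients
    "BayesianRidge": BayesianRidge(),             # priors over w instead of a fixed penalty
    "RidgeCV": RidgeCV(alphas=[0.01, 0.1, 1.0]),  # picks alpha by cross-validation
}
for name, m in models.items():
    pred = m.fit(X_train, y_train).predict(X_test)
    print(name, mean_squared_error(y_test, pred),
          explained_variance_score(y_test, pred))
```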
E. Time Series Regression

Time series modeling is a method for forecasting and prediction. It makes decisions by working on time (minutes, hours, days, years) and finds hidden insights in the data. It works well when the data are correlated. A time series is basically a set of data points gathered at constant time intervals. Two things make time series special and different from linear regression: first, it is time dependent, unlike linear regression, which assumes that observations are independent; second, it identifies the seasonal trends in the data, for example, that most transactions occur before gazetted holidays. In order to run a time series model, we have assumed that the time series (TS) is stationary, i.e. its statistical properties such as mean and variance remain the same over time. This is important because there is then a very high probability that the series will follow the same pattern in the future. To check stationarity we did two things. The first is to plot the Rolling Statistics: this plot contains the moving average and moving variance, and we analyze whether they vary with time. The other test for stationarity is the Dickey-Fuller Test, which consists of a 'Test Statistic' and 'Critical Values' at different confidence levels. If the 'Critical Value' is greater than the 'Test Statistic', then we can say that the series is stationary, as shown in figure 3.
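Both checks can be scripted with pandas and statsmodels, as in the following sketch on the hypothetical daily amount series:

```python
import pandas as pd
from statsmodels.tsa.stattools import adfuller

# `ts` is the daily withdrawal-amount series (a pandas Series indexed by date).
def check_stationarity(ts: pd.Series, window: int = 30) -> None:
    # Rolling statistics: a stationary series has a roughly flat
    # moving average and moving standard deviation.
    rolling_mean = ts.rolling(window).mean()
    rolling_std = ts.rolling(window).std()
    print(rolling_mean.tail(), rolling_std.tail(), sep="\n")

    # Dickey-Fuller test: compare the (signed) test statistic with
    # the critical values returned in result[4].
    result = adfuller(ts.dropna())
    print("Test Statistic:", result[0])
    print("p-value:", result[1])
    for level, value in result[4].items():
        print(f"Critical Value ({level}):", value)
```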
Fig. 3: Analyzing stationarity of time series with Rolling Statistics

Following are the results of the Dickey-Fuller Test:

Test Statistic                  -4.880407
p-value                          0.000038
Lags Used                       21.000000
Number of Observations Used    889.000000
Critical Value (5%)             -2.864797
Critical Value (1%)             -3.437727
Critical Value (10%)            -2.568504

From figure 3 we concluded that the difference in standard deviation is very small, but the mean is clearly increasing and decreasing with time, and thus we can say that it is not a stationary series. Also, the test statistic is well below the critical values; note that the signed values should be compared, not the absolute values. Hence, from the above test, we concluded that we have to make the time series stationary. To make the time series stationary we need to model the trend and seasonality in the distribution and remove them from the series to get a stationary distribution. One of the methods to make a series stationary is to calculate the moving average and test its stationarity.

In this method, we take the average of 'k' prior values depending on the frequency of the time series. Here we take the average over the past one month, i.e. the last thirty values. After running the Dickey-Fuller test again, we found the plot and distribution shown in figure 4.

Fig. 4: Distribution of the model after stationarization


Following are the results of the Dickey-Fuller Test:

Test Statistic                 -1.073524e+01
p-value                         2.910471e-19
Lags Used                       2.100000e+01
Number of Observations Used     8.880000e+02
Critical Value (5%)            -2.864800e+00
Critical Value (1%)            -3.437735e+00
Critical Value (10%)           -2.568506e+00
Figure 4 shows a much better series. The rolling values seem to vary slightly, but we can say that there is no trend in the series. In addition, the test statistic is smaller than the 5% critical value, so we can conclude that our series is stationary with 95% confidence. Now that we have made the time series stationary, we can directly run our time series forecasting experiments, and the approach we use is ARIMA (Auto-Regressive Integrated Moving Averages). This approach is like a linear equation, the only difference being that the predictors depend on the three parameters p, d and q of the ARIMA model.

Fig. 5: Analyzing stationarity of time series after calculating Moving Averages

Figure 5 indicates that our prediction is not accurate. We created the series on two variables, amount and date, which form the time series object. The blue line in the graph shows the original series, whereas the red line shows the prediction. The RSS (Residual Sum of Squares) indicates an error of 351, which is not acceptable. After normalization of the RSS we got a normalized root squared sum of 0.30. We ran the ARMA model on weekly data as well, but the results were not satisfactory either.

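A sketch of this pipeline with statsmodels follows; the log transform, the 30-day moving average and the (p, d, q) order are illustrative choices rather than the paper's stated settings:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# `ts` is the daily withdrawal-amount series from the earlier sketches.
# Log-transform and subtract a 30-day moving average to remove the trend,
# one way of stationarizing the series as described above.
ts_log = np.log(ts)
stationary = (ts_log - ts_log.rolling(30).mean()).dropna()

# Fit an ARIMA model; the order (p, d, q) would be tuned in practice.
result = ARIMA(stationary, order=(2, 1, 2)).fit()
fitted = result.predict(start=1, end=len(stationary) - 1)

# Residual sum of squares of the in-sample fit.
rss = float(((fitted.values - stationary.values[1:]) ** 2).sum())
print("RSS:", rss)
```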
F. Recurrent Neural Network - Long Short-Term Memory (LSTM) Model

The long short-term memory network is an RNN which is trained with the help of backpropagation through time. An LSTM contains memory blocks instead of neurons, connected via layers. To set up our neural network experiment we used the transaction amount feature, because we are predicting the transaction amount. That is, given the amount of transactions (in units of thousands) on a day, what is the amount of transactions the next day? We created three variables: one containing the transaction amount of the day (X), a second containing the transaction amount of the next day (Y), and a third containing the transaction amount of the day after next (Z). Figure 6 contains the plot of the test with LSTM.

Fig. 6: Experiment with Neural Network LSTM

Figure 6 shows the plotted data: the original dataset is represented in blue, the predictions for the training dataset in green, and the predictions on the unseen test dataset in red. In this experiment, we divided the training and testing datasets in the ratio 67:33 and ran 20 epochs. With this setup of variables, we got an MSE of .027 on the training dataset and an MSE of .028 on the testing dataset. This error indicates that we are about .027 away from our actual result.

G. Recurrent Neural Network - LSTM for Regression with Time Series

We have used time series regression for solving our problem. As in the window example above, we can take prior time steps in our time series as inputs to predict the output at the next time step. After changing the shape of the data, we ran our experiment on the previous variables and got an MSE of .028 on the training dataset and an MSE of .029 on the testing dataset. The MSE indicates that we are .028 away from our actual result. Figure 7 shows the test with LSTM with time series.

Fig. 7: Plot of LSTM for Regression with Time Series

Figure 7 shows the plotted data: the original dataset is represented in blue, the predictions for the training dataset in green, and the predictions on the unseen test dataset in red. In this experiment too, we divided the training and testing datasets in the ratio 67:33 and ran 20 epochs.

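A condensed Keras sketch of the LSTM experiments follows; look_back = 1 reproduces the one-day window of Section F, while a larger look_back (with input_shape=(look_back, 1)) gives the time-window regression of Section G. The layer sizes and synthetic data are assumptions, since the paper does not state its architecture:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def create_dataset(series: np.ndarray, look_back: int = 1):
    """Turn a 1-D series into (samples, look_back, 1) inputs and next-step targets."""
    X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
    y = series[look_back:]
    return X.reshape(-1, look_back, 1), y

# `amounts` stands in for the scaled daily withdrawal series.
amounts = np.random.default_rng(0).random(900).astype("float32")
amounts = MinMaxScaler().fit_transform(amounts.reshape(-1, 1)).ravel()

split = int(len(amounts) * 0.67)                 # the paper's 67:33 split
X_train, y_train = create_dataset(amounts[:split], look_back=1)
X_test, y_test = create_dataset(amounts[split:], look_back=1)

model = Sequential([LSTM(4, input_shape=(1, 1)), Dense(1)])
model.compile(loss="mean_squared_error", optimizer="adam")
model.fit(X_train, y_train, epochs=20, batch_size=1, verbose=0)
print("Test MSE:", model.evaluate(X_test, y_test, verbose=0))
```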
H. Discussion

All the information about the features is discussed in section III; we also imputed missing values in the dataset. After running the experiments, we found the following MSE and variance for each experiment.
TABLE III: Best Accuracy per Algorithm

Algorithm                                Mean Squared Error   Variance
Linear Regression Model                  0.05                 0.79
Ridge Linear Regression Model            0.03                 0.89
Lasso Model                              0.03                 0.90
RidgeCV Model                            0.03                 0.90
LassoLAR                                 0.26                 -0.01
Bayesian Ridge Regression                0.03                 0.90
Recurrent Neural Network (LSTM Model)    0.028                0.91
LSTM for Regression with Time Series     0.029                0.91
Note that the MSE (mean squared error) indicates how far we are from the actual prediction, while the variance score is about the inference on the data; a good variance score is near one. We can notice from the above table that almost all of the regression techniques provided impressive results, but the best we got was from the Bayesian Ridge Regression and RidgeCV models, with a variance score of 0.90 and an MSE of 0.03. One reason for the better result may be that RidgeCV performs cross-validation on the data, due to which the model is trained and tested more effectively. The next experiment we conducted on our data was time series regression. In the time series experiment we kept only two features, transaction amount and date, and created the stationary series in order to identify trends and seasonality in the data. We tested the stationarity of the time series with the Dickey-Fuller test and found 95% confidence in the stationary series. After making the time series stationary, we experimented with ARIMA (Auto-Regressive Integrated Moving Average). ARIMA forecasting for a stationary time series is nothing but a linear equation (like a linear regression); the predictors depend on the parameters (p, d, q) of the ARIMA model: the number of AR (Auto-Regressive) terms (p), the number of MA (Moving Average) terms (q), and the number of differencing terms (d). The RSS (Residual Sum of Squares) indicates an error of 351, which is not acceptable; after normalization of the RSS we got a normalized root squared sum of 0.30, which is still poor, therefore we conclude that the ARIMA approach does not work well on our data. Figure 3 shows the stationarity of the time series obtained after running the experiments. The reasons for these results may be: too few features, asymmetric data, improperly selected samples, under-fitting and over-fitting.
V. CONCLUSIONS

In this paper we have attempted to solve the ATM out-of-cash problem. The dataset used in this research contains the ATM withdrawal transaction data of one of the largest banks of Pakistan. The study implemented various algorithms such as Linear Regression, Ridge Linear Regression, the Lasso Model, the RidgeCV Model, LassoLAR, Bayesian Ridge Regression, Random Forest Regression, time series prediction, an RNN, an RNN with time series, and ARIMA. In the final analysis, we found that linear regression still provides the optimal solution. We report 98% accuracy with that approach on this original dataset.

ACKNOWLEDGEMENTS

We are thankful to Yameen M. Malik, Zaid M. Memon and Farrukh H. Syed for the discussions and a review of the manuscript.

REFERENCES

[1] S. Madhavi, S. Abirami, C. Bharathi, B. Ekambaram, T. Krishna Sankar, A. Nattudurai and N. Vijayarangan, "ATM Service Analysis Using Predictive Data Mining", International Journal of Computer, Electrical, Automation, Control and Information Engineering, vol. 8, no. 2, 2014.
[2] M. Erol Genevois, D. Celik and H. Z. Ulukan, "ATM Location Problem and Cash Management in Automated Teller Machines", International Journal of Social, Behavioral, Educational, Economic, Business and Industrial Engineering, vol. 9, no. 7, 2015.
[3] A. Ghodrati, H. Abyak and A. Sharifihosseini, "ATM cash management using genetic algorithm", Management Science Letters, vol. 3, pp. 2007-2014, 2013.
[4] S. M. Darwish, "A Methodology to Improve Cash Demand Forecasting for ATM Network", International Journal of Computer and Electrical Engineering, vol. 5, no. 4, August 2013.
[5] M. H. Pour Kazemi, E. Sedaght Parast and M. Amini, "Prediction of Optimal Cash Injection in Automated Teller Machines, ARMA Approach", 2014.
[6] K. Venkatesh, V. Ravi, A. Prinzie and D. Van den Poel, "Cash Demand Forecasting in ATMs by Clustering and Neural Networks", 30 September 2013.
[7] M. Zandevakili and M. Javanmard, "Using fuzzy logic (type II) in the intelligent ATMs' cash management", International Research Journal of Applied and Basic Sciences, vol. 8, no. 10, pp. 1516-1519, 2014. Available online at www.irjabs.com, ISSN 2251-838X.
[8] A. R. Brentnall, M. J. Crowder and D. J. Hand, "Predictive-sequential forecasting system development for cash machine stocking", International Journal of Forecasting, vol. 26, pp. 764-776, 2010a.
[9] A. R. Brentnall, M. J. Crowder and D. J. Hand, "Predicting the amount individuals withdraw at cash machines using a random effects multinomial model", Statistical Modeling, vol. 10, no. 2, pp. 197-214, 2010b.
