
LQ45 STOCK PREDICTION OPTIMIZATION BASED

ON LONG SHORT-TERM MEMORY


Syakur Syakur1,a, Mustafid Mustafid2,b, Kusworo Adi3,c
1 Magister Sistem Informasi, Pascasarjana, Universitas Diponegoro, Semarang, Indonesia
2 Departemen Statistik, Fakultas Sains dan Matematika, Universitas Diponegoro, Semarang, Indonesia
3 Departemen Statistik, Fakultas Sains dan Matematika, Universitas Diponegoro, Semarang, Indonesia

a syakurpragen@gmail.com
b mustafid@lecturer.undip.ac.id
c kusworoadi@lecturer.undip.ac.id

Abstract. LQ45 is an index of the Indonesia Stock Exchange (IDX) consisting of 45 companies that meet certain criteria
and serves as a guide for investors in choosing stocks. Many artificial neural network approaches, such as ANN and CNN,
have been used to predict stock market movements, but they still have weaknesses in prediction accuracy. Therefore, this
study uses Long Short-Term Memory (LSTM) networks, a type of Recurrent Neural Network (RNN), to predict stock price
movements. Compared to other Artificial Neural Networks (ANN), LSTM is more suitable for processing non-linear,
non-stationary and complex financial time series. To assess prediction accuracy, this study uses the Root Mean Square
Error (RMSE), which is well suited to time series data. This study aims to apply the Recurrent Neural Network to predict
the stock market price of LQ45 using the long short-term memory method. The data is taken from Yahoo Finance in
numerical form and shows the open price, close price, high price, and low price, which are then processed and used to
predict stock prices. The data in this study also contains the transaction level and the amount of capitalization obtained
from the Indonesia Stock Exchange (IDX). For analytical purposes, this study uses the long short-term memory classifier
to build a classification model and test its performance on the LQ45 data set. Accuracy is the performance criterion
chosen to measure this effect. The results show that long short-term memory has the best performance in predicting the
LQ45 index. This result not only enriches the literature on machine learning techniques but is also relevant to stock
market prediction, namely the prediction of stock price movements and of the LQ45 index.

1. Introduction
The stock market generates a large amount of transaction data every day, which provides deep neural
networks with a large amount of data to train on and improve their predictive ability [8]. One study used
historical data to predict stock earnings rankings through a new stock selection model based on a deep
neural network [5]. Another study trained a deep neural network on price data to predict the daily
volatility of stocks in the stock market. Its results show that normalization is very useful for increasing
accuracy and that prediction accuracy increases significantly when more price dimensions are added to the
data. The authors also suggest making predictions for different types of stocks separately, which can
further improve accuracy [6]. Other researchers built systems that leverage a deep learning architecture to
enhance features and adopt extreme learning machines to predict market impact. They concluded that deeply
learned feature representations combined with extreme learning could provide better market impact
prediction accuracy.
The LQ45 index is an index of 45 stocks selected for meeting the criteria of high liquidity, high market
capitalization, high trading frequency, growth prospects and fairly good financial condition. Because of
these criteria, the LQ45 group consists of company shares that are in demand and are the focus of
investors' attention. IDX routinely announces the stocks included in LQ45 every 6 months, namely for the
periods February - July and August - January.
Deep learning is widely used in stock price prediction. Among deep learning models, the Recurrent Neural
Network (RNN) has proven to be one of the most reliable for processing complex time series data. However,
this model has two drawbacks: the first is the vanishing gradient and the second is the exploding gradient
(Alonso et al. [1]). These problems have been addressed by Long short-term memory (LSTM), a variation of
the traditional RNN [3]. LSTM is an advanced type of RNN and is widely applied in various fields, such as
speech recognition, translation, and image classification. Another important application of LSTM is time
series prediction [2]. Previous studies show that LSTM can improve predictive performance in the stock
market and demonstrate the effectiveness of new prediction methods based on LSTM. However, some of these
studies only consider the basic features of stocks, such as price, volume and other technical indices, and
do not take into account other informative factors that directly affect price movements.
The purpose of this study is to predict the stock price of LQ45 using the Long short-term memory method.
The data is taken from Yahoo Finance for further processing and prediction of stock prices. This study
uses data retrieved in real time via the internet. The data has six attributes, namely high, low, open,
close, volume and adj close. In the testing phase, the data is also used in real time.
2. Literature Review
2.1 Related Work
Stock market prediction is a challenging area for investors seeking to make profits in financial
markets. Many of the models used in stock market forecasting are not able to provide accurate
predictions. Kelotra and Pandey [4] propose a stock market prediction system that effectively predicts
the state of the stock market. A deep convolutional long short-term memory (Deep-ConvLSTM) model acts
as the prediction module and is trained using the Rider-based monarch butterfly optimization algorithm
(Rider-MBO). The proposed Rider-MBO algorithm is an integration of the rider optimization algorithm
(ROA) and MBO. Initially, technical indicators are calculated directly from the stock market data.
From the features these indicators represent, the required features are obtained through clustering
using Sparse-Fuzzy C-Means (Sparse-FCM) followed by feature selection. The selected features are
provided to the Deep-ConvLSTM model to make accurate predictions. The evaluation is based on metrics
such as the mean squared error (MSE) and the root mean squared error (RMSE), using six sets of live
stock market data. The proposed stock market prediction model obtains a minimum MSE and RMSE of 7.2487
and 2.6923, which indicates the effectiveness of the proposed method in stock market prediction [4].
Deep learning is renowned for extracting high-level abstract features from large amounts of raw data
without relying on prior knowledge, which makes it potentially interesting for forecasting financial
time series. The long short-term memory (LSTM) network is considered a cutting-edge technique in
sequential learning. It is less commonly applied to financial time series prediction but is inherently
suitable for this domain. In [9], a new deep learning prediction methodology is proposed and, based on
it, a hybrid deep learning prediction model for the stock market is built: CEEMD-PCA-LSTM. In this
model, complementary ensemble empirical mode decomposition (CEEMD), acting as the smoothing and
sequence decomposition module, describes the fluctuations or trends of the time series at different
scales step by step, producing a series of intrinsic mode functions (IMF) with different
characteristic scales. Then, while retaining most of the information in the raw data, PCA reduces the
dimensions of the decomposed IMF components, eliminating redundant information and improving
prediction response speed. After that, the high-level abstract features of each component are fed
separately into the LSTM network to predict the closing price on the next trading day for that
component. Finally, the predicted values of the individual components are combined to obtain the final
predicted value. The empirical results on six representative stock indices from three types of markets
show that the proposed model outperforms the benchmark models in terms of predictive accuracy, i.e.,
lower test error and higher directional symmetry. Using the key research findings, trading simulations
were performed to validate that the proposed model outperforms the benchmark models in both absolute
profitability and risk-adjusted profitability. Furthermore, robustness tests show that the model is
more stable than the benchmark models [9].
2.2 Long Short-Term Memory
Long short-term memory (LSTM) is an advanced type of Recurrent Neural Network (RNN) and is widely
applied in various fields, such as speech recognition, machine translation, and image classification.
Another important application of LSTM is time series prediction [2]. LSTM can significantly improve
prediction performance in the stock market, which demonstrates the effectiveness of new prediction
methods based on LSTM. However, that study makes LSTM predictions considering only the basic features
of stocks, such as price, volume and other technical indices, and ignores other informative factors that
directly influence price movements.
Figure 1. LSTM memory cell structure.

LSTM is a type of Recurrent Neural Network (RNN) in which the RNN is modified by adding a
memory cell that can store information for a long period of time [7]. LSTM was proposed as a
solution to the vanishing gradient problem that RNNs encounter when processing long
sequential data.
Figure 1 depicts the architecture of the LSTM. An LSTM has 3 gates, namely the input gate,
the forget gate, and the output gate. The computational process in the LSTM is carried out in
the following stages.
The first stage in the LSTM method is to decide what data will be removed from the cell
state ($C_{t-1}$). This decision is made by a sigmoid layer ($\sigma$) called the forget gate
layer. The sigmoid layer processes $h_{t-1}$ and $x_t$ as input and produces an output with a
value between 0 and 1.

$$ f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \tag{1} $$

where:
$f_t$ : forget gate
$\sigma$ : sigmoid layer
$W_f$ : weight of the forget gate
$h_{t-1}$ : previous output value
$x_t$ : new input value
$b_f$ : bias of the forget gate


The second stage decides which data will be stored in the cell state. This stage has two layers,
namely the input gate layer and the tanh layer. The input gate layer processes the input, decides
which values are to be updated and produces $i_t$. The tanh layer then creates a new candidate value,
$\tilde{C}_t$, that can be added to the cell state. The next step is to combine the results of the
input gate layer and the tanh layer.

$$ i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \tag{2} $$

where:
$i_t$ : input gate
$W_i$ : weight of the input gate
$b_i$ : bias of the input gate

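The third stage is to update the old cell state, $C_{t-1}$, to the new cell state, $C_t$.

$$ C_t = f_t C_{t-1} + i_t \tilde{C}_t \tag{3} $$

where:
$C_t$ : new cell state
$C_{t-1}$ : old cell state
$\tilde{C}_t$ : new candidate value created by the tanh layer in the second stage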

The fourth and final stage in the LSTM method is to produce an output. First, the sigmoid ($\sigma$)
layer decides which part of the cell state will be output. Then, the cell state is passed through the
tanh layer (giving a value between -1 and 1) and multiplied by the output of the sigmoid gate, so that
only the selected part of the cell state is output, i.e. $h_t = o_t \tanh(C_t)$.

$$ o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \tag{4} $$

where:
$o_t$ : output gate
$W_o$ : weight of the output gate
$b_o$ : bias of the output gate
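
The four stages above can be sketched as a single time step in NumPy. This is only an illustrative sketch, not the implementation used in this study: the names `lstm_step`, `init_params`, `W_c` and `b_c` are assumptions introduced here, and the candidate value and the rule $h_t = o_t \tanh(C_t)$ follow the standard LSTM formulation rather than a formula printed in this paper.

```python
import numpy as np

def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_input, n_hidden, seed=0):
    """Small random weights and zero biases (hypothetical initialisation)."""
    rng = np.random.default_rng(seed)
    params = {}
    for gate in ("f", "i", "c", "o"):
        params[f"W_{gate}"] = rng.normal(0.0, 0.1, size=(n_hidden, n_hidden + n_input))
        params[f"b_{gate}"] = np.zeros(n_hidden)
    return params

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new output h_t and cell state C_t."""
    concat = np.concatenate([h_prev, x_t])                    # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ concat + params["b_f"])     # forget gate, Eq. (1)
    i_t = sigmoid(params["W_i"] @ concat + params["b_i"])     # input gate, Eq. (2)
    c_hat = np.tanh(params["W_c"] @ concat + params["b_c"])   # candidate value (assumed standard form)
    c_t = f_t * c_prev + i_t * c_hat                          # cell state update, Eq. (3)
    o_t = sigmoid(params["W_o"] @ concat + params["b_o"])     # output gate, Eq. (4)
    h_t = o_t * np.tanh(c_t)                                  # standard LSTM output rule
    return h_t, c_t
```

For a sequence of prices, `lstm_step` would be called once per time step, passing the returned `h_t` and `c_t` into the next call.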

Normalization is the process of transforming data so that the data is in the range [0,1].
Normalization changes the data values into a smaller range of values while still maintaining the data
pattern. Normalization is done with the following formula:

$$ y = \frac{x - \min}{\max - \min} \tag{5} $$

where $x$ is the original value and $\min$ and $\max$ are the smallest and largest values in the data.
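
As a small illustration of the normalization formula, the sketch below scales a list of prices to [0,1] and maps scaled values back to the original scale. The function names `min_max_scale` and `min_max_inverse` and the sample prices are hypothetical and chosen only for this example.

```python
import numpy as np

def min_max_scale(x):
    """Scale values to [0, 1]; also return min and max for the inverse step."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        raise ValueError("cannot scale a constant series (max equals min)")
    return (x - x_min) / (x_max - x_min), x_min, x_max

def min_max_inverse(y, x_min, x_max):
    """Map scaled values back to the original price scale."""
    return np.asarray(y, dtype=float) * (x_max - x_min) + x_min

prices = [5000.0, 5500.0, 6000.0]           # made-up example prices
scaled, lo, hi = min_max_scale(prices)
restored = min_max_inverse(scaled, lo, hi)
```

Here the middle price, which lies halfway between the minimum and the maximum, is mapped to 0.5, and the inverse step recovers the original prices.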

Adaptive Moment Estimation (Adam) is an optimization algorithm that adjusts the attributes of the
neural network, such as the weights, to minimize the error rate. During the training process, the
model attributes are updated repeatedly to minimize the loss function and thereby maximize the
model's accuracy. This research uses Adam as its optimizer. Adam can be used as a substitute for
stochastic gradient descent (SGD). Adam was chosen because it is computationally efficient, does not
require much memory, and is suitable for large datasets and models with many parameters.
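
To make the update rule concrete, the following sketch applies the standard Adam update to a one-dimensional quadratic loss. It is an illustration under stated assumptions, not the training code of this study: the function names, the toy loss $(\theta - 3)^2$ and the hyperparameter values (the commonly used defaults for the decay rates) are introduced here.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad          # moving average of the gradient
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # moving average of the squared gradient
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for the first moment
    v_hat = v / (1.0 - beta2 ** t)                # bias correction for the second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

def minimize_quadratic(start=5.0, target=3.0, steps=2000, lr=0.01):
    """Minimise the toy loss (theta - target)^2 with Adam."""
    theta = np.array([start])
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, steps + 1):
        grad = 2.0 * (theta - target)             # derivative of the toy loss
        theta, m, v = adam_step(theta, grad, m, v, t, lr=lr)
    return theta
```

Only two extra arrays (`m` and `v`) of the same size as the weights are kept between steps, which is why the memory cost of Adam is modest.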

Root Mean Square Error (RMSE) is a measure of the accuracy of a regression model. RMSE is never
negative because its formula is derived from the squared Euclidean distance, and its value approaches
zero as the error gets smaller. For this reason, RMSE is one of the most commonly used measures of
regression accuracy.

$$ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (A_t - F_t)^2} \tag{6} $$

where:
$A_t$ : actual data value
$F_t$ : forecast value
$N$ : number of data points
$\sum$ : summation (sum of all values)
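
The RMSE calculation can be sketched in a few lines of NumPy. The function name `rmse` and its argument names are chosen here for illustration.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error between actual values A_t and forecasts F_t."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    if actual.shape != forecast.shape:
        raise ValueError("actual and forecast must have the same shape")
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))
```

A perfect forecast gives an RMSE of 0, and larger deviations are penalised more heavily because the errors are squared before averaging.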

The accuracy of the prediction model is evaluated on test data retrieved online in real time. With
the trained model, it is possible to check whether the predicted prices for the last 5 days are close
to the actual prices. Because the data is normalized before the model is trained, the predictions on
the test data are also normalized, so an inverse transformation is applied to bring the values back to
the original scale before the percentage accuracy is calculated. Six attributes are used for
prediction, namely opening price, closing price, selling price, highest price, lowest price and
average.

3. System Realization
The material used in this study is LQ45 stock data taken from Yahoo Finance in real time. The
research tool used is Python with the pandas and numpy libraries. Today, Python is considered one of
the most popular languages for machine learning (ML).
Predictions are processed with a structural analysis that identifies the data features: highest
price, lowest price, open price, close price, average, volume and change. The data then passes through
the pre-processing stage, which consists of feature extraction, sorting in ascending order,
normalization and segmentation. After pre-processing, the structured data enters the Long short-term
memory (LSTM) process, where it passes through the forget gate, input gate, output gate, and dense
layer (sigmoid) to determine the weights produced by the LSTM method. Next, the model is tested using
RMSE as the regression accuracy measure to obtain the final score, which is then assigned to one of
five predefined classes. The scripts were written and edited in Anaconda, which made it lighter and
easier to add custom formulas.
Historical stock data can be retrieved using a library called 'nsepy', and other stocks can be
tracked by using the stock market ticker symbol of the company concerned. The LSTM model requires
input data in the form of x and y, where x represents the last 10 prices and y represents the price on
the 11th day. Since LSTM is a neural network based algorithm, standardizing or normalizing the data is
necessary for fast and more accurate fitting.
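
The construction of x and y described above can be sketched as a sliding window over the price series. The helper name `make_windows` and the toy series are assumptions made for this example.

```python
import numpy as np

def make_windows(prices, lookback=10):
    """Build x (the last `lookback` prices) and y (the price on the next day)."""
    prices = np.asarray(prices, dtype=float)
    if len(prices) <= lookback:
        raise ValueError("need more prices than the lookback length")
    # Row i of x holds prices[i : i + lookback]; y[i] is the price right after it.
    x = np.array([prices[i:i + lookback] for i in range(len(prices) - lookback)])
    y = prices[lookback:]
    return x, y

toy_prices = np.arange(15, dtype=float)     # stand-in for 15 daily prices
x, y = make_windows(toy_prices, lookback=10)
```

With 15 prices and a lookback of 10, this produces 5 samples: the first row of `x` holds days 1 to 10 and the first entry of `y` is day 11.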
The Long short-term memory (LSTM) method can be used to predict LQ45 stocks. The focus of the
LSTM is to determine the movement of stock prices, after which the prediction data is visualized in
the form of graphs and prediction tables. This is explained in the information system framework in
the figure below.

Figure 2. Information system framework. Input: stock data (retrieved via Yahoo Finance). Process: 80%
of the data is used as training data; the data is scaled between 0 and 1; the x_train and y_train data
are converted to arrays; LSTM with 3 hidden layers. Output: data visualization and stock price
prediction.

First, get the stock price of the company 'BNI' using the company stock ticker (BNI) from January
1, 2020 to November 10, 2020. Next, display the number of rows and columns in the dataset. Create a new
dataframe with just the closing prices and convert it into an array. Then, create a variable to store
the length of the training data set. The training data set contains about 80% of the data. The dataset
is then scaled to values between 0 and 1 inclusive, because data must be scaled before being sent to
the neural network. The training data set uses the closing prices of the last 60 days to predict the
61st closing price. The first sample in the data set 'x_train' therefore contains the values from
index 0 to index 59 (60 values in total), the second sample contains the values from index 1 to index
60 (60 values), and so on. The data set 'y_train' contains the 61st value, located at index 60, for
the first sample, the 62nd value, located at index 61, for the second sample, and so on. The
independent training data set 'x_train' and the dependent training data set 'y_train' are then
converted into numpy arrays so that they can be used to train the LSTM model. The data is reshaped
into 3 dimensions in the form [number of samples, number of time steps, number of features], because
the LSTM model expects such a 3-dimensional data set.
The LSTM model is built with two LSTM layers with 50 neurons each and two Dense layers, one with 25
neurons and the other with 1 neuron. The model is compiled using the mean squared error (MSE) loss
function and the adam optimizer, and is then trained on the training data set (fit is another name for
train). The batch size is the number of training examples in one batch, and an epoch is one full pass
of the entire training set forward and backward through the neural network. Next, create a test data
set and convert the independent test data set 'x_test' into an array so that it can be used to test
the LSTM model. Reshape the data into 3 dimensions in the form [number of samples, number of time
steps, number of features]. Then get the predicted values from the model using the test data. Finally,
compute the root mean squared error (RMSE), which is a measure of how accurate the model is. A value
of 0 indicates that the predicted values of the model match the true values of the test data set
perfectly. The lower the value, the better the model.
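
The data preparation steps described above (80% split, scaling to [0,1], 60-day windows and reshaping to 3 dimensions) can be sketched as follows. This is a minimal NumPy sketch under stated assumptions: it uses a synthetic price series in place of the downloaded BNI data, the name `prepare_lstm_data` is hypothetical, and building and fitting the LSTM model itself is left out.

```python
import math
import numpy as np

def prepare_lstm_data(close, train_fraction=0.8, lookback=60):
    """Split, scale and window a closing-price series for an LSTM."""
    close = np.asarray(close, dtype=float)
    train_len = math.ceil(len(close) * train_fraction)
    if train_len <= lookback:
        raise ValueError("training part must be longer than the lookback")

    # Scale the whole series to [0, 1], as described in the text.
    lo, hi = close.min(), close.max()
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    scaled = (close - lo) / (hi - lo)

    # Training samples: 60 past values as input, the following value as target.
    train = scaled[:train_len]
    x_train = np.array([train[i - lookback:i] for i in range(lookback, train_len)])
    y_train = train[lookback:]

    # Test inputs need the last `lookback` training values as history.
    test = scaled[train_len - lookback:]
    x_test = np.array([test[i - lookback:i] for i in range(lookback, len(test))])
    y_test = close[train_len:]          # targets kept on the original price scale

    # Reshape to [number of samples, number of time steps, number of features].
    x_train = x_train.reshape(x_train.shape[0], lookback, 1)
    x_test = x_test.reshape(x_test.shape[0], lookback, 1)
    return x_train, y_train, x_test, y_test

synthetic_close = np.linspace(100.0, 200.0, 200)   # synthetic prices, not real BNI data
x_train, y_train, x_test, y_test = prepare_lstm_data(synthetic_close)
```

Because the sketch follows the text and scales with the minimum and maximum of the whole series, the test period influences the scaling; fitting the scaler on the training part only would be a stricter alternative.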

4. Result and Discussion


In this system, the start date and the end date are entered to define the period for which data
is collected, and a stock option is selected to specify the stock whose data will be retrieved.

Figure 3. Initial system interface


Figure 4. Stock prediction results
The graph above is the result of the LSTM method, with accuracy calculated using the Root Mean
Square Error (RMSE). The mean accuracy is 96.9%. For the next day, the predicted opening price is
6001.1494, the lowest price is 5998.8784, and the highest price is 6126.2256, as shown in the image
below.

Figure 5. Prediction table and actual table


5. Conclusion
Stock prediction using the LSTM method is used to predict LQ45 stock price movements. A total of
67 data points from Bank BNI were tested, which became 57 data points after normalization, and the
results were evaluated using the Root Mean Square Error (RMSE). The mean accuracy of the results is
96.9%. Based on the experiment, the LSTM method is effective for predicting LQ45 stock price movements.

References
1. Alonso, M.N., Batres-Estrada, G. and Moulin, A., Deep Learning in Finance: Prediction of Stock Returns
with Long Short-Term Memory Networks, Big Data and Machine Learning in Quantitative Investment
(2017), 251–277.
2. Cen, Z. and Wang, J., Crude oil price prediction model with long short term memory deep learning based
on prior knowledge data transfer, Energy 169 (2019), 160–171.
3. Gers, F.A., Schmidhuber, J. and Cummins, F., Learning to forget: Continual prediction with LSTM,
Neural Computation 12 (10) (2000), 2451–2471.
4. Kelotra, A. and Pandey, P., Stock Market Prediction Using Optimized Deep-ConvLSTM Model, Big Data
8 (1) (2020), 5–24.
5. Long, J. et al., An integrated framework of deep learning and knowledge graph for prediction of stock
price trend: An application in Chinese stock exchange market, Applied Soft Computing Journal 91
(2020), 106205.
6. Li, X., Wu, P. and Wang, W., Incorporating stock prices and news sentiments for stock market prediction:
A case of Hong Kong, Information Processing and Management 57 (5) (2020), 102212.
7. Manaswi, N.K., Deep Learning with Applications Using Python (2018), 91–96.
8. Yu, K. et al., A Key Management Scheme for Secure Communications of Information Centric Advanced
Metering Infrastructure in Smart Grid, IEEE Transactions on Instrumentation and Measurement 64 (8)
(2015), 2072–2085.
9. Zhang, Y., Chu, G. and Shen, D., The role of investor attention in predicting stock prices: The long short-
term memory networks perspective, Finance Research Letters 38 (January 2021), 101484.
