PATTAREEYA PIRAVECHSAKUL
THESIS ADVISOR
(Assistant Professor Teerasit Kasetkasem, Ph.D.)
THESIS CO-ADVISOR
(Mr. Sanparith Marukatat, Ph.D.)
THESIS CO-ADVISOR
(Professor Itsuo Kumazawa, Ph.D.)
DEPARTMENT HEAD
(Assistant Professor Nithiphat Teerakawanich, Ph.D.)
DEAN
(Associate Professor Srijidtra Charoenlarpnopparut, Ph.D.)
THESIS
PATTAREEYA PIRAVECHSAKUL
ABSTRACT
Pattareeya Piravechsakul : Stock Trading Approaches using LSTM and GANs.
Master of Engineering (Information and Communication Technology for Embedded
Systems), Major Field: Information and Communication Technology for Embedded
Systems, Department of Electrical Engineering.
Thesis Advisor: Assistant Professor Teerasit Kasetkasem, Ph.D.
Academic Year 2021
ACKNOWLEDGEMENTS
I would like to take this opportunity to thank everyone who has been involved
in this project. Without them, I could not have completed this research
successfully.
Firstly, I would like to express my deep and sincere gratitude to my advisor,
Asst. Prof. Dr. Teerasit Kasetkasem, who taught me a great deal about financial
literacy, machine learning, and related topics. He has always advised and supported me.
Moreover, his vision, dynamism and motivation deeply inspired me to pursue this
project on using AI in finance.
Secondly, I would like to thank my co-advisor, Dr. Sanparith Marukatat, a researcher
at NECTEC (NSTDA), who helped me understand the deep learning models involved in this
project and suggested many valuable ideas for this work. I am truly thankful for the
chance to work with him. I would also like to thank my other co-advisor,
Prof. Itsuo Kumazawa from Tokyo Institute of Technology, who taught me the
fundamentals of artificial intelligence and machine learning.
Finally, I would like to thank the Thailand Advanced Institute of Science and
Technology and Tokyo Institute of Technology (TAIST-Tokyo Tech), Kasetsart
University, and the National Science and Technology Development Agency for
supporting my master’s degree with a full scholarship that covers tuition fees
and a living allowance.
Pattareeya Piravechsakul
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
OBJECTIVES
REVIEW OF LITERATURES
BACKGROUNDS
  1. Moving Average (MOV) strategy
  2. Moving Average Convergence Divergence (MACD) strategy
  3. Bollinger Bands (BB) Strategy
  4. Relative Strength Index (RSI) strategy
  5. Long Short-Term Memory (LSTM)
  6. Convolution Neural Network (CNN)
  7. Generative Adversarial Networks (GANs)
    7.1 Conditional Generative Adversarial Networks (CGANs)
METHODOLOGY
  1. Workflow Diagram
  2. Financial Dataset
  3. Data Preparation
    3.1 Input Data Structures
    3.2 Data Normalization
Figure 21. Examples of the long-short trading strategy of CWGANs stock price
predictor (a) past 6-day (P6), (b) past 8-day (P8), (c) past 16-day (P16) on SET50
index data between 3 July 2015 and 3 July 2019.
Figure 22. Average return comparison between our models and traditional indicators
(SET50 index data in 2020)
Figure 23. Average return comparison between our models and traditional indicators
(Top 100 Thai stocks in 2020)
Figure 24. Average return comparison between our models and traditional indicators
(Top 100 Thai stocks between 1 Jan 2021 and 15 Oct 2021)
Figure 25. Average return comparison between our models and traditional indicators
(9 picked stocks between 1 Jan 2021 and 15 Oct 2021)
INTRODUCTION
The financial market is one of the most attractive investment venues for many
people. Stock price movement prediction has been studied for
decades (Nelson, Pereira, & Oliveira, 2017), but predicting future prices remains
difficult and complex. Many technical indicators are available to traders and
investors. Technical indicators are derived from observable, measurable
characteristics of quantitative historical data, without considering financial
statements such as the balance sheet, income statement, and cash flow statement.
They are classified into two main types: lagging indicators and leading indicators.
Lagging indicators, which are based on historical data, include Moving Average
Convergence Divergence (MACD) and Bollinger Bands (BB). They give traders and
investors buy or sell signals later than the actual market trend, so investors and
traders can miss the best point to buy or sell. Leading indicators, such as the
Relative Strength Index (RSI), Stochastic Oscillator, and On-Balance Volume (OBV),
attempt to forecast future price movements in the market. Most traders use a leading
indicator to buy a stock when it is oversold and sell it when it is overbought.
In recent years, machine learning and deep learning algorithms have been applied
to a wide range of tasks, including the financial market, where deep learning
algorithms are used to predict future prices from historical price data. The deep
learning algorithms used in this work are the Long Short-Term Memory network (LSTM)
and Generative Adversarial Networks (GANs). Both models were used to predict the
future price by learning from historical Stock Exchange of Thailand SET50 index
data. The LSTM is a very popular algorithm for time-series data, and many
applications have achieved high performance with it. The GANs model has been very
successful in image generation, and many tasks that generate realistic photographs
have achieved high performance with it. Thus, we expect both models to make a profit
and outperform the four technical indicators and the Buy & Hold strategy. We also
studied different architectural designs for the generator and discriminator of the
GANs, based on LSTM and CNN. This gave 4 architectural designs: LSTM-LSTM, LSTM-CNN,
CNN-CNN, and CNN-LSTM. The best-performing LSTM and GANs models, judged by the root
mean square error (RMSE) and R-squared values, were then traded in back testing and
forward testing to assess their profitability. Additionally, the profitability of
both models was compared with five strategies, namely buy and hold, MOV, MACD,
BB, and RSI.
OBJECTIVES
1. To study the ability of LSTM and GANs to learn stock price prediction.
2. To study the different architectural designs based on CWGANs model.
3. To compare the profitability of LSTM, CWGANs, Buy & Hold and the four
technical indicators, namely MOV, MACD, BB and RSI based on the Stock
Exchange of Thailand (SET) market.
4. To find a suitable trading strategy for the deep learning models that outperforms
the four technical indicators and the buy & hold strategy.
REVIEW OF LITERATURES
number of input features. Their results show that the model performed well: the GANs
could distinguish between unseen normal cases and manipulative stock patterns with
an accuracy of 68.10%. (Zhang, Zhong, Dong, Wang, & Wang, 2019) used four
methods, namely SVR, ANN, LSTM, and GANs, for stock price prediction. Their
results show that the GANs model had the highest accuracy based on the RMSE value.
However, few research works have studied suitable trading strategies for deep
learning models on the Stock Exchange of Thailand (SET) market. In this work, we
employed the GANs and LSTM models to find a suitable trading strategy and build a
profitable model. We also compared the profitability of both models with the
buy & hold strategy and four technical indicators, namely the MOV, MACD, BB, and RSI
strategies.
BACKGROUNDS
https://forextraininggroup.com/anatomy-of-popular-moving-averages-in-forex/
$$SMA_k = \frac{1}{k}\sum_{i=n-k+1}^{n} P_i \tag{1}$$
where $SMA_k$, $P_i$ and $n$ are the simple moving average with window size $k$, the
price value, and the number of observed values, respectively.
A buy signal occurs on the first day that the short-term moving average line is
above the long-term moving average line. Conversely, a sell signal occurs on the
first day that the short-term moving average line is below the long-term moving
average line.
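The crossover rule above can be sketched in a few lines of Python. This is a minimal illustration, not the thesis implementation: the function names are hypothetical, the 5-day and 12-day windows are taken from the MOV setup reported later in this work, and "first day" is interpreted as the day on which the short average moves from at-or-below to above the long average (and the reverse for a sell).

```python
def sma(prices, k):
    """Simple moving average of equation (1); None until k prices are available."""
    out = []
    for n in range(len(prices)):
        if n + 1 < k:
            out.append(None)
        else:
            out.append(sum(prices[n - k + 1 : n + 1]) / k)
    return out


def crossover_signals(prices, short_k=5, long_k=12):
    """Return 'buy', 'sell' or None per day, firing only on the day of a crossing."""
    short, long_ = sma(prices, short_k), sma(prices, long_k)
    signals = [None] * len(prices)
    for t in range(1, len(prices)):
        if short[t - 1] is None or long_[t - 1] is None:
            continue  # not enough history for both averages yet
        if short[t] > long_[t] and short[t - 1] <= long_[t - 1]:
            signals[t] = "buy"
        elif short[t] < long_[t] and short[t - 1] >= long_[t - 1]:
            signals[t] = "sell"
    return signals
```

For example, `crossover_signals(prices, short_k=2, long_k=3)` on a short V-shaped series marks a single buy after the turn and a sell once the series falls again.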
https://www.dailyfx.com/education/technical-analysis-tools/macd-indicator
where $K$ is the smoothing factor for the EMA ($K$ = Smoothing Value/(1 + $t$)) and
$t$ is the number of days. The smoothing value can be changed according to
preference; it is typically set to 2.
A buy signal, or bullish crossover, occurs on the first day that the MACD line
crosses above the signal line. A sell signal, or bearish crossover, occurs on the
first day that the MACD line crosses below the signal line.
For example, in Figure 2, the buy signal is generated on the first day that the blue
line, which represents the MACD line, is above the red line, which represents the
Signal line. On the other hand, the sell signal is generated on the first day that
the MACD line is below the Signal line.
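A short Python sketch may help make the MACD crossover concrete. It assumes the common construction in which the MACD line is the 12-period EMA minus the 26-period EMA and the signal line is the 9-period EMA of the MACD line, with the smoothing factor K = 2/(1 + t) described above. The function names, and the choice to seed each EMA with the first value, are assumptions made for illustration only.

```python
def ema(values, period, smoothing=2.0):
    """Exponential moving average with K = smoothing / (1 + period)."""
    k = smoothing / (1 + period)
    out = [values[0]]  # assumption: seed the EMA with the first observation
    for v in values[1:]:
        # Algebraically the same as v * K + previous * (1 - K).
        out.append(out[-1] + k * (v - out[-1]))
    return out


def macd_signals(prices, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, signals) with signals firing on crossings."""
    macd = [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]
    sig = ema(macd, signal)
    out = [None] * len(prices)
    for t in range(1, len(prices)):
        if macd[t] > sig[t] and macd[t - 1] <= sig[t - 1]:
            out[t] = "buy"   # bullish crossover
        elif macd[t] < sig[t] and macd[t - 1] >= sig[t - 1]:
            out[t] = "sell"  # bearish crossover
    return macd, sig, out
```

On a series that falls for 40 days and then rises for 40 days, this sketch produces a sell near the start of the decline and a buy some days after the turn, which illustrates the lagging behaviour of the indicator.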
Bollinger Bands (BB) (Shah & Manubhai, 2015) are another widely used technical
indicator. They were developed by John Bollinger in the 1980s. The Bollinger Bands
use the N-period Simple Moving Average (SMA) and the standard deviation of the
prices over the previous N days, as given in equations (5) to (7). The BB are a type
of volatility indicator consisting of three lines, an upper band, a middle band, and
a lower band, which together form a price channel. The upper band and the lower band
are used to identify overbought and oversold conditions. When the stock price is
above the upper band, it is statistically high; in other words, the stock is
beginning to look expensive to most traders. Likewise, most traders believe that the
stock price is cheap, or statistically low, when it is below the lower band. The
distance between the bands is based on the standard deviation, so the BB tell a
trader how much the volatility swings: when the price volatility increases, the
distance between the upper band and the lower band increases as well.
https://www.tradingview.com/chart/GOLDSILVER/AQAQPvfD-Gold-Silver-Ratio-RSI-Bollinger-Bands-Buy-or-Sell/
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N}(X_i - \bar{X})^2}{N}} \tag{5}$$
$$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} \tag{6}$$
$$BB_i = \bar{X}_i \pm \sigma_i \cdot d \tag{7}$$
where $\sigma$, $\bar{X}$, $N$, and $d$ are the standard deviation, the average of
the observations, the number of observation days, and the number of standard
deviations away from the average value, respectively. Typically, $d$ and $N$ default
to 2 and 20, respectively.
Figure 3 shows the BB trading strategy. A buy signal is generated on the first day
that the stock price falls below the lower band. In contrast, a sell signal is
generated on the first day that the stock price rises above the upper band.
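The band construction and the trading rule above can be sketched as follows. This is an illustrative sketch under stated assumptions: the bands are centred on the N-day moving average, the standard deviation is the population form of equation (5), and the function names are hypothetical.

```python
import math


def bollinger(prices, n=20, d=2.0):
    """Return (lower, middle, upper) per day; None until n prices are available."""
    bands = []
    for t in range(len(prices)):
        if t + 1 < n:
            bands.append(None)
            continue
        window = prices[t - n + 1 : t + 1]
        mean = sum(window) / n
        sigma = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
        bands.append((mean - d * sigma, mean, mean + d * sigma))
    return bands


def bb_signals(prices, n=20, d=2.0):
    """'buy' on the first day below the lower band, 'sell' on the first day above the upper band."""
    bands = bollinger(prices, n, d)
    out = [None] * len(prices)
    for t in range(1, len(prices)):
        if bands[t] is None or bands[t - 1] is None:
            continue
        lower, _, upper = bands[t]
        prev_lower, _, prev_upper = bands[t - 1]
        if prices[t] < lower and prices[t - 1] >= prev_lower:
            out[t] = "buy"
        elif prices[t] > upper and prices[t - 1] <= prev_upper:
            out[t] = "sell"
    return out
```

A flat series followed by a sharp jump, for example `bb_signals([10] * 6 + [20], n=6)`, triggers a sell on the final day because the jump lands outside the upper band.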
$$RSI = 100 - \frac{100}{1 + \frac{x}{y}} \tag{8}$$
where $x$ and $y$ are the average of the up closes over N days and the average of
the down closes over N days, respectively. Typically, the default value of N is 14
days.
https://www.chartmill.com/documentation/technical-analysis-indicators/128-The-RSI-Indicator-and-Trading-Strategies
The RSI oscillates between 0% and 100%. A sell signal occurs when the value is above
70%, which indicates an overbought condition. Conversely, a buy signal is generated
when the value is below 30%, which indicates an oversold condition.
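As a rough illustration of the RSI rule, the sketch below assumes the usual form RSI = 100 - 100/(1 + x/y), with x and y computed as simple averages of the upward and downward closing-price changes over the last N days. Charting platforms often use a smoothed variant instead, so the exact values may differ; the function names are hypothetical.

```python
def rsi(closes, n=14):
    """RSI per day using simple N-day averages of gains and losses; None until n changes exist."""
    out = [None] * len(closes)
    for t in range(n, len(closes)):
        changes = [closes[i] - closes[i - 1] for i in range(t - n + 1, t + 1)]
        avg_up = sum(c for c in changes if c > 0) / n
        avg_down = sum(-c for c in changes if c < 0) / n
        if avg_down == 0:
            out[t] = 100.0  # assumption: no losses in the window is treated as RSI = 100
        else:
            out[t] = 100.0 - 100.0 / (1.0 + avg_up / avg_down)
    return out


def rsi_signal(value, overbought=70.0, oversold=30.0):
    """Map an RSI value to 'sell' (overbought), 'buy' (oversold) or None."""
    if value is None:
        return None
    if value > overbought:
        return "sell"
    if value < oversold:
        return "buy"
    return None
```

A series that alternates up and down by the same amount gives an RSI of 50, which produces no signal under the 70/30 thresholds.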
The LSTM (Chouhan et al., 2018) (Liu, Liao, & Ding, 2018) is designed to
solve the vanishing gradient problem that arises during backpropagation in recurrent
neural networks (RNNs), so the LSTM can reduce the effect of short-term memory. In
other words, the LSTM can retain past information, which is important for predicting
stock prices. There are three gates in an LSTM cell (Wei, 2019): a forget gate, an
input gate, and an output gate.
The forget gate decides which information is forgotten or kept. The current
input and the previous hidden state are passed into the sigmoid function, whose
output tells the network whether to forget or keep information. If the value is
close to 0, the gate forgets the information. On the other hand, the gate keeps the
information when the value is close to 1.
The input gate decides which information from the current step should be used to
update the cell state. The hidden state and the current input are passed into the
sigmoid function, whose output values lie between 0 and 1, where 0 means unimportant
information and 1 means important information. At the same time, the hidden state
and the current input are passed into the hyperbolic tangent (tanh) function, which
produces values between -1 and 1 as the candidate input. The new input is the
product of the output of the input gate and the candidate input.
The output gate determines the next hidden state. The hidden state and the current
input are passed into the sigmoid function, and the result is multiplied by the new
cell state passed through the tanh function. The new cell state and the new hidden
state are carried to the next LSTM cell. Thus, the new cell state and the hidden
state output are defined as:
$$C_t = f_t \cdot C_{t-1} + i_t \cdot \bar{C}_t$$
$$h_t = O_t \cdot \tanh(C_t)$$
where $C_t$ is the new cell state, $h_t$ is the hidden state output, $C_{t-1}$ is the
previous cell state, $f_t$ is the output of the forget gate, $\bar{C}_t$ is the
candidate input (computed with the tanh activation function), $i_t$ is the output of
the input gate, and $O_t$ is the output of the output gate.
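The gate computations described above can be traced with a single-unit LSTM step written in plain Python. This is a didactic sketch only: the weight layout (one `(w_x, w_h, b)` triple per gate) and the function names are assumptions, and a real model would use vectors and matrices through a deep learning library.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x_t, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell.

    w maps a gate name to (w_x, w_h, b); each gate sees the current input
    x_t and the previous hidden state h_prev.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x_t + w_h * h_prev + b

    f_t = sigmoid(pre("forget"))        # near 0 forgets, near 1 keeps
    i_t = sigmoid(pre("input"))         # how much of the candidate to admit
    c_bar = math.tanh(pre("candidate")) # candidate input in (-1, 1)
    o_t = sigmoid(pre("output"))
    c_t = f_t * c_prev + i_t * c_bar    # new cell state
    h_t = o_t * math.tanh(c_t)          # new hidden state
    return h_t, c_t
```

With all weights set to zero, every gate outputs 0.5 and the candidate is 0, so the cell state is simply halved at each step, which is a convenient sanity check.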
The Convolutional Neural Network (CNN) (O'Shea & Nash, 2015) is widely used
in image classification, image recognition, and object detection. A CNN is a type of
artificial neural network that extracts features from the input data and learns
which of them are important. A CNN takes labelled input data for training, and each
input passes through a series of convolution layers with filters (or kernels). It
learns important features from small squares of the input data.
Convolution
As seen in Fig. 6, a 7x7 matrix (I) as input data is convolved with a 3x3 matrix (K)
as a filter (or kernel). The filter slides from left to right and top to bottom,
producing a 5x5 output matrix. The output of the operation is called a “feature
map”.
The feature map (I*K) shows which features the CNN detects. The values in the
feature map depend on the filter (or kernel) values that we use.
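The 7x7 input and 3x3 filter example can be checked with a small pure-Python sketch. It slides the kernel over the input without padding and with a stride of 1, and, as is usual in CNN layers, it does not flip the kernel. The function name is hypothetical.

```python
def conv2d_valid(image, kernel):
    """Slide kernel over image (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


image = [[1] * 7 for _ in range(7)]    # 7x7 input (I)
kernel = [[1] * 3 for _ in range(3)]   # 3x3 filter (K)
feature_map = conv2d_valid(image, kernel)
```

Here `feature_map` has 5 rows and 5 columns, matching the output size stated above.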
Padding
In the example in Fig. 6, the output dimension is not equal to the input dimension,
which might cause important information to be lost. We can solve this problem by
using same padding. There are two options for padding, as shown in Fig. 7. Valid
padding means no padding, which reduces the output size. Same padding, usually
called “zero padding”, gives the output the same dimension as the input by adding
zero values to the left, right, top, and bottom of the input data.
Strides
The stride is the number of positions that the filter skips at each step. A larger
stride reduces the training time, but some important information might be lost.
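The combined effect of padding and stride on the output size follows the standard relation floor((n + 2p - k) / s) + 1 for input size n, kernel size k, padding p and stride s. The helper names below are hypothetical, and `same_padding` assumes an odd kernel size and a stride of 1.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output length along one axis of a convolution."""
    return (n + 2 * padding - k) // stride + 1


def same_padding(k):
    """Zeros to add on each side so the output matches the input (odd k, stride 1)."""
    return (k - 1) // 2


valid = conv_output_size(7, 3)                           # valid padding
same = conv_output_size(7, 3, padding=same_padding(3))   # same (zero) padding
strided = conv_output_size(7, 3, stride=2)               # larger stride
```

For the 7x7 input and 3x3 filter discussed above, valid padding gives 5, same padding gives 7, and a stride of 2 without padding gives 3.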
Many tasks involve time series data, such as daily climate change, house prices,
stock prices, and patient health monitoring. We analyze time series data to extract
its significant characteristics. Nowadays, many machine learning algorithms are
suitable for time series analysis. One of them is the Convolution Neural Network
(CNN). As mentioned previously, the CNN is best known in the image processing area.
Nevertheless, research has shown that CNN-based regression models for time series
analysis achieve a high level of accuracy and robustness in prediction.
Time series data consist of two elements: length and width. The length is the
number of timesteps that we want the network to see at a time during training. The
width is the number of features. For stock price data, the length is the number of
trading days, and the width can be the open price, high price, low price, close
price, or technical indicator data. The input data is fed to each convolution layer.
The convolution kernels within the layer specify the length of the 1D convolution
window and move from the start to the end of the time series. The dimension of the
output of the 1-D CNN is then reduced by an optional max-pooling layer or flatten
layer. After that, a fully connected layer, or dense layer, with an activation
function is used as the last layer to produce the final output.
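To make the length and width terminology concrete, the sketch below applies one 1-D convolution kernel to a small multi-feature time series and then max-pools the result. It is a plain-Python illustration with hypothetical function names; the kernel spans all features at once and moves only along the time axis.

```python
def conv1d(series, kernel):
    """series: list of timesteps, each a list of features.
    kernel: list of kernel_length rows, each with one weight per feature.
    Returns one value per valid window position along the time axis."""
    k = len(kernel)
    n_features = len(kernel[0])
    return [
        sum(series[t + i][f] * kernel[i][f]
            for i in range(k) for f in range(n_features))
        for t in range(len(series) - k + 1)
    ]


def max_pool1d(values, size):
    """Non-overlapping max pooling along the time axis."""
    return [max(values[i : i + size])
            for i in range(0, len(values) - size + 1, size)]


series = [[t, 1] for t in range(20)]     # length 20, width 2
kernel = [[1, 0], [1, 0], [1, 0]]        # window of 3 timesteps
features = conv1d(series, kernel)
pooled = max_pool1d(features, 2)
```

With a length of 20 and a kernel of 3 timesteps, the convolution yields 18 values, and pooling with size 2 reduces them to 9.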
The GANs analogy (Fig. 9) involves a pre-owned designer bag shop that buys designer
bags from customers. However, some customers are counterfeiters who bring fake
designer bags to sell to the shop. A counterfeiter tries to fool the shop owner into
believing that the bag is real, while the shop owner tries to authenticate whether
the designer bag is real or fake. Sometimes, it is very easy for the shop owner to
identify a real designer bag. Therefore, the counterfeiter must learn many
techniques to make the fake designer bag look like an authentic one. On the other
hand, the shop owner learns more authentication techniques so that the counterfeiter
cannot fool the shop too easily.
The goal of the counterfeiter, as the Generator, is to fool the shop owner. The goal
of the shop owner, as the Discriminator, is to distinguish whether a designer bag is
real or fake. The counterfeiter (Generator) and the shop owner (Discriminator)
compete with each other in a zero-sum game, and each network tries to become more
accurate at its own goal. Eventually, the competition reaches an equilibrium in
which the generator can create fake data that the discriminator is unable to
distinguish from real data.
where $x$, $z$, $D$, and $G$ are the real data, the fake data, the discriminator,
and the generator, respectively. As seen in equations (10) and (11), the objective
function of the discriminator consists of 2 terms, which come from the real data
(label 1) and the fake data (label 0). The objective function of the discriminator
should be maximized, while the objective function of the generator in equation (12)
should be minimized.
Combining equations (14) and (15) gives equation (16). This is the minimax game in
which the generator (G) and the discriminator (D) compete with each other, defined
as:
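The maximize-versus-minimize relationship can be illustrated numerically. The sketch below assumes the standard GAN value function V(D, G) = E[log D(x)] + E[log(1 - D(G(z)))] and treats the discriminator outputs as given probabilities rather than the outputs of a trained network. The function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def discriminator_objective(d_real, d_fake):
    """Value the discriminator tries to maximize.

    d_real: D's probabilities on real samples (label 1).
    d_fake: D's probabilities on generated samples (label 0).
    """
    return (_mean([math.log(p) for p in d_real])
            + _mean([math.log(1.0 - p) for p in d_fake]))


def generator_objective(d_fake):
    """Term the generator tries to minimize: mean log(1 - D(G(z)))."""
    return _mean([math.log(1.0 - p) for p in d_fake])
```

When the discriminator outputs 0.5 for every sample, meaning it cannot tell real from fake, `discriminator_objective` equals -log 4, which is the value at the equilibrium of the standard minimax game.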
The above describes the simple GANs concept. There are many more types
of GANs (Alqahtani, Kavakli-Thorne, & Kumar Ahuja, 2019), such as Conditional
GANs (CGANs), Wasserstein GANs (WGANs), Deep Convolution GANs
(DCGANs), and Cycle GANs. However, this work focuses only on the CGANs and
WGANs models.
Conditional GANs, or CGANs, address a limitation of simple GANs, which cannot
control the class of the generated output. The generated output of the simple GANs
consists of examples from random classes, but the CGANs use example from the
The CGANs are essential for time series sequence tasks such as stock price
prediction and video prediction, because a future stock price should correlate with
the most recent past stock prices. If we use the simple GANs model for a stock price
prediction task, the model might generate random plausible output for a given
dataset. In other words, the model's prediction of tomorrow's stock price would not
correlate with today's stock price.
Figure 11. Example of comparison generated output between Simple GANs model
(left picture) and CGANs model (right picture)
Training GANs is quite difficult. Sometimes the two models never converge, or mode
collapse occurs. A novel GANs framework, namely Wasserstein Generative Adversarial
Networks or WGANs, was introduced in 2017 (Arjovsky, Chintala, & Bottou, 2017).
Instead of using a discriminator model to classify whether the input data is real or
fake, the WGANs replace the discriminator with a critic, which scores the realness
or fakeness of the given data. This makes the training of WGANs more stable and less
sensitive than that of the basic GANs. Also, the weight clipping of WGANs helps to
avoid mode collapse. Therefore, this experiment combined the WGANs and the CGANs.
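The difference between a classifier-style discriminator and a critic can be sketched as follows. The critic outputs unbounded scores, its loss is the difference between the mean scores on fake and real data, and its weights are clipped to a small interval after each update. The function names are hypothetical, and the clipping threshold of 0.01 is an assumed value for illustration, not one reported in this work.

```python
def _mean(values):
    return sum(values) / len(values)


def critic_loss(scores_real, scores_fake):
    """Loss the critic minimizes: it wants high scores on real data, low on fake."""
    return _mean(scores_fake) - _mean(scores_real)


def generator_loss(scores_fake):
    """Loss the generator minimizes: it wants the critic to score fakes highly."""
    return -_mean(scores_fake)


def clip_weights(weights, c=0.01):
    """Clamp every critic weight into [-c, c] after an update step."""
    return [max(-c, min(c, w)) for w in weights]
```

For example, `clip_weights([-0.5, 0.005, 0.2])` keeps the middle weight unchanged and clamps the other two to the interval bounds.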
METHODOLOGY
1. Workflow Diagram
2. Financial Dataset
Many platforms provide free financial data. In this work, the stock price data were
downloaded from two platforms. The Stock Exchange of Thailand SET50 index data are
from the Siamchart website and cover August 2011 to June 2021, or around ten years,
for a total of 2406 observations. The other platform, Yahoo Finance, provides
historical data for individual stocks and also offers a free API. We selected the
individual stock data from the top 100 companies listed on the SET market, ranked by
market capitalization. The component stocks of the SET50 index and SET100 index can
be checked at www.set.or.th.
3. Data Preparation
All models were designed to work on a rolling window. First, the raw dataset
was downloaded from Yahoo Finance (or loaded from a CSV file). It was then reshaped
from 2 dimensions to 3 dimensions as input data for each model. The 3 dimensions are
the number of samples, the number of timesteps, and the number of features,
respectively, as shown in Fig. 11. There are 6 features: Close price, Open price,
High price, Low price, and two technical indicators, the 7-day moving average (Mov7)
and the 8-day exponential moving average (EMA8). Here, 2406 samples, 20 timesteps
(20 days of historical stock prices), and 1 output step (a 1-day stock price
prediction) were used in this work.
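The reshaping from a 2-dimensional table to 3-dimensional model input can be sketched as below. Each sample is a window of 20 consecutive rows with 6 features, and its target is the close price of the following day. The function name is hypothetical, and placing the close price in column 0 is an assumption made for illustration.

```python
def make_windows(rows, timesteps=20, target_col=0):
    """Turn a 2-D table (days x features) into rolling-window samples.

    Returns X with shape (samples, timesteps, features) as nested lists,
    and y with the target-column value of the day right after each window.
    """
    X, y = [], []
    for start in range(len(rows) - timesteps):
        X.append(rows[start : start + timesteps])
        y.append(rows[start + timesteps][target_col])
    return X, y


# Toy table: 25 days, 6 identical features per day.
rows = [[float(day)] * 6 for day in range(25)]
X, y = make_windows(rows)
```

With 25 days of data this produces 5 samples, each of 20 timesteps and 6 features, and the window slides forward one day at a time.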
There are many methods for normalizing data. Here, the min-max scaling method is
used to normalize the input data. All data are normalized to the range -1 to 1
before being fed to our models.
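Min-max scaling to the range -1 to 1 maps the minimum of a feature to -1 and its maximum to 1, and the inverse mapping is needed to turn a predicted value back into a price. The sketch below shows both directions with hypothetical function names; it assumes the minimum and maximum differ.

```python
def minmax_fit(values):
    """Return the (lo, hi) bounds used for scaling."""
    return min(values), max(values)


def minmax_scale(values, lo, hi):
    """Scale values linearly so that lo maps to -1 and hi maps to 1."""
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def minmax_inverse(scaled, lo, hi):
    """Map scaled values back to the original units."""
    return [(s + 1.0) / 2.0 * (hi - lo) + lo for s in scaled]


prices = [0.0, 5.0, 10.0]
lo, hi = minmax_fit(prices)
scaled = minmax_scale(prices, lo, hi)
restored = minmax_inverse(scaled, lo, hi)
```

Here the three prices map to -1, 0 and 1, and the inverse step recovers the original values.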
In this experiment, we used the historical daily closing data of the SET50
index between 24 August 2011 and 30 June 2021, a total of 2406 samples. Of these,
1924 samples, or 80% of the total, were used as training data (24 August 2011
– 3 July 2017), and the remaining 482 samples, or 20% of the total, were used as
testing data (4 July 2017 – 30 June 2021). We used the SET50 index data because the
SET50 index is proportional to the weighted gains of the top 50 Thai stocks by
market capitalization. Moreover, because the SET50 index contains many stocks, it
should capture the overall market behavior better than an individual stock. Since
the network learns from the overall market data, it should generalize better and be
less prone to overfitting the training samples.
We aimed to build a deep learning model that is more robust and efficient than
the popular traditional technical indicators. In this work, we chose two deep
learning models to predict the future stock price. The first model, Long Short-Term
Memory (LSTM), is one of the most widely used deep learning models for time series
prediction. The other model, Generative Adversarial Networks (GANs), has been very
successful in generating realistic images. Motivated by the quality of these
generated images, we applied the GANs to time series data to generate a future stock
price.
A network model (see Fig. 14) was designed based on the LSTM neural
network to work as a stock price predictor. The historical data from the Stock
Exchange of Thailand SET50 index were normalized and fed to the LSTM network as the
training set. The network has 4 LSTM layers, each with 50 units. Dropout with a rate
of 0.2 was applied to each LSTM layer to prevent overfitting. The optimizer for our
LSTM model is the Adam algorithm with a learning rate of 0.01. The model was trained
for 500 epochs with a batch size of 128. The linear activation function and the tanh
activation function were each tried in the last Dense layer, which produces the
output.
We took sequences of 20 days of historical data and predicted the price on the 21st
day. The rolling window step equals 1: we then moved the 20-day window forward by
one day, predicted the price on the 22nd day, and so on, iterating in this way over
all samples. The performance of the model was evaluated with a statistical
indicator, namely the root mean square error (RMSE). We trained the model several
times and selected the result with the lowest RMSE value. Finally, we found that the
linear activation function in the output layer gave better results than the tanh
activation function.
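The two evaluation measures used in this work, RMSE and R-squared, can be computed as in the sketch below. The function names are hypothetical, and R-squared is taken in its usual form of one minus the ratio of the residual sum of squares to the total sum of squares.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

A perfect prediction gives an RMSE of 0 and an R-squared of 1, while always predicting the mean of the actual values gives an R-squared of 0.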
The training process of the CWGANs model is shown in Fig. 15. Here, the latent space
(Z) was sampled from a normal distribution and concatenated with the stock price
data as the input to the generator. This combination made the generator more
accurate at predicting a future stock price than feeding only the latent space. The
generator's input had 20 timesteps and 6 features, and its output had 1 timestep. In
other words, each 20 days of historical stock prices (20 input timesteps) was used
to predict the stock price of 1 future day (1 output timestep). Next, the
generator’s output was fed to the critic (or the discriminator) as its input. The
critic scores the realness or fakeness of the given data.
64 and the learning rate was 0.00001 for the RMSProp optimizer. We also set the Z
dimension to 50.
5. Trading Operations
In our trading operations for the LSTM model, we took a long position when the price
predicted by the model was higher than the closing price N days earlier, and closed
the long position when the predicted price was lower than the closing price N days
earlier. We took a short position when the predicted price was lower than the
closing price N days earlier, and closed the short position when the predicted price
was higher than the closing price N days earlier. The LSTM trading strategy is
defined as
where $P_{t+1}$ is the price predicted for the next day by the LSTM model and
$P_{t-N}$ is the closing price N days earlier, with N set from 1 to 19 days.
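The trading rule for the LSTM predictor can be sketched as a simple position tracker. This is one possible reading of the rule described above, not the thesis code: the function name, the convention that `predicted_next[t]` is the forecast for day t+1 made on day t, and the encoding of positions as 1 (long), -1 (short) and 0 (flat) are all assumptions.

```python
def positions(predicted_next, closes, n, allow_short=False):
    """Track the position day by day.

    On day t the forecast for day t+1 is compared with the close N days
    earlier, closes[t - n]. Long-only: go long when the forecast is higher,
    go flat when it is lower. Long-short: switch to short instead of flat.
    """
    position = 0
    out = []
    for t in range(len(closes)):
        if t - n >= 0:
            reference = closes[t - n]
            if predicted_next[t] > reference:
                position = 1
            elif predicted_next[t] < reference:
                position = -1 if allow_short else 0
        out.append(position)
    return out


closes = [10, 11, 12, 11, 10]
forecasts = [11, 12, 13, 10, 9]
long_only = positions(forecasts, closes, n=1)
long_short = positions(forecasts, closes, n=1, allow_short=True)
```

In this toy example the long-only tracker is long on days 1 and 2 and flat afterwards, while the long-short tracker switches to a short position from day 3.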
The CWGANs trading strategy works in the same way, except that the CWGANs model
compares the average of its predicted price possibilities with the closing price N
days earlier. The number of predicted price possibilities equals the dimension of
the latent space. For a long position, a buy signal occurred when the average of the
predicted price possibilities was higher than the closing price N days earlier, and
a sell signal occurred when the average was lower than the closing price N days
earlier. For a short position, the buy signal occurred when the average was lower
than the closing price N days earlier, and the sell signal occurred when the average
was higher than the closing price N days earlier. The CWGANs trading strategy is
defined as
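$$\text{Long Position} = \begin{cases} \text{Buy Signal}: \dfrac{\sum_{i=1}^{Z}(P_{t+1})_i}{Z} > P_{t-N}, & N = 1, 2, 3, \dots, 19 \\ \text{Sell Signal}: \dfrac{\sum_{i=1}^{Z}(P_{t+1})_i}{Z} < P_{t-N}, & N = 1, 2, 3, \dots, 19 \end{cases} \tag{19}$$
$$\text{Short Position} = \begin{cases} \text{Buy Signal}: \dfrac{\sum_{i=1}^{Z}(P_{t+1})_i}{Z} < P_{t-N}, & N = 1, 2, 3, \dots, 19 \\ \text{Sell Signal}: \dfrac{\sum_{i=1}^{Z}(P_{t+1})_i}{Z} > P_{t-N}, & N = 1, 2, 3, \dots, 19 \end{cases} \tag{20}$$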
where $Z$ is the dimension of the latent space and $P_{t-N}$ is the closing price N
days earlier, with N set from 1 to 19 days.
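The averaging step that distinguishes the CWGANs rule from the LSTM rule can be sketched as follows. The generator is not modelled here: the function simply receives the Z sampled next-day predictions, averages them, and compares the average with the closing price N days earlier. The function name and the +1/-1/0 return convention are assumptions made for illustration.

```python
def cwgan_direction(sampled_predictions, past_close):
    """Compare the mean of the sampled next-day predictions with P_{t-N}.

    Returns +1 when the average forecast is above the reference price,
    -1 when it is below, and 0 when they are equal.
    """
    average = sum(sampled_predictions) / len(sampled_predictions)
    if average > past_close:
        return 1
    if average < past_close:
        return -1
    return 0
```

A result of +1 corresponds to the condition for the long-position buy signal in equation (19), and -1 corresponds to the condition for the long-position sell signal; how these map onto short-position orders follows equation (20).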
1. Selecting Model
To select the models for profitability testing, we evaluated the RMSE and R-squared
values. We trained the models several times to ensure that the results were
reproducible. Table 1 shows the RMSE and R-squared values of the models on the
training and testing samples. Among our models, the LSTM had the lowest RMSE value
on both the training and testing samples. The CNN and the LSTM were used as base
structures of the CWGANs. Among our architectural designs for the CWGANs, the
CNN-CNN architecture had the lowest RMSE values on both the training samples and the
testing samples, while the LSTM-LSTM and CNN-LSTM architectures failed to converge.
Therefore, this experiment selected the CNN-CNN architecture of the CWGANs as a
stock price predictor.
Root Mean Square Error (RMSE)    LSTM     CWGANs (LSTM-CNN)    CWGANs (CNN-CNN)
Training samples                 7.48     22.57                12.35
Testing samples                  13.5     28.8                 19.13
Each technical indicator tool has its own default values. We assume
that most traders and investors rarely change the default values of the technical
indicator tools provided by brokers, so the technical indicator tools were set to
the standard default values that brokers use.
Typically, the MACD indicator default setting are 12-period EMA, 26-period
EMA and 9-period EMA as a shorter-period, a longer-period, and a signal line,
respectively. There are a lot of time periods used in moving average. However, the
most common time periods that traders like to use in the MOV are 5, 10, 15, 20, 50,
100, and 200, our MOV setup selected 5-period as a short-period and 12-period as a
longer period for the default values. The Bollinger Bands value default, the number of
days in time-period and the standard deviation are 20 days and 2. The RSI indicator
uses the past 14 periods which is the default values. Also, the strategy of the MACD,
BB and RSI were applied to compare profitability with the stock price predictor
model. The trading strategies of each indicator were mentioned in Background
Section.
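The default settings above can be made concrete with a small pure-Python sketch. It assumes the common textbook formulas: an EMA with smoothing factor 2/(period + 1) seeded with the first price, a population standard deviation for the Bollinger Bands, and a simple-average RSI rather than Wilder's smoothing. Broker platforms may differ in these details, and the function names are hypothetical.

```python
from statistics import fmean, pstdev


def sma(prices, period):
    """Simple moving average of the last `period` prices."""
    return fmean(prices[-period:])


def ema_series(prices, period):
    """Exponential moving average series, seeded with the first price."""
    alpha = 2 / (period + 1)
    out = [prices[0]]
    for price in prices[1:]:
        out.append(alpha * price + (1 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return the latest (MACD line, signal line) values."""
    macd_line = [f - s for f, s in
                 zip(ema_series(prices, fast), ema_series(prices, slow))]
    return macd_line[-1], ema_series(macd_line, signal)[-1]


def bollinger_bands(prices, period=20, num_std=2):
    """Return (lower, middle, upper) bands for the last `period` prices."""
    window = prices[-period:]
    middle = fmean(window)
    spread = num_std * pstdev(window)
    return middle - spread, middle, middle + spread


def rsi(prices, period=14):
    """Simple-average RSI over the last `period` price changes."""
    window = prices[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losing periods in the window
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
```

With these helpers, the MOV setup in this section corresponds to comparing `sma(prices, 5)` with `sma(prices, 12)`.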
After selecting the best LSTM and CWGANs models based on the RMSE and R-squared
values, we evaluated their profitability in 2 cases: the long-only case and the
long-short case. The selected models were tested for long-only and long-short
execution through back testing and forward testing, following the conditions of the
trading operation. The back testing was applied to compare the viability of our
models with that of the technical indicators. To analyze the robustness of the
trading strategy, we compared the predicted price with the price of the past N days,
with N set from 1 day to 19 days. Each selected model therefore has 38 cases in
total: 19 long-only cases and 19 long-short cases. After that, we chose the
best-performing long-only case and long-short case from the back testing to trade in
the forward testing.
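The scenario grid described above can be sketched as follows. The names `enumerate_scenarios` and `best_per_mode` are hypothetical, and ranking the cases by back-test return alone is an assumption; the thesis speaks of "best performance" without fixing a single criterion.

```python
from itertools import product

PAST_DAYS = range(1, 20)                 # N = 1 .. 19
MODES = ("long-only", "long-short")


def enumerate_scenarios():
    """All (mode, N) pairs: 19 long-only and 19 long-short cases."""
    return list(product(MODES, PAST_DAYS))


def best_per_mode(scenario_returns):
    """Pick the highest-return case per mode.

    scenario_returns maps (mode, N) to a back-test return in percent.
    Returns a dict mapping each mode to its best (N, return) pair.
    """
    best = {}
    for (mode, n), ret in scenario_returns.items():
        if mode not in best or ret > best[mode][1]:
            best[mode] = (n, ret)
    return best
```

The cases chosen this way are the ones carried into forward testing.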
4. Result Comparisons
\[
ROR = \frac{V_L - V_S}{V_S} \times 100 \tag{21}
\]
where 𝑅𝑂𝑅, 𝑉𝐿, and 𝑉𝑆 are the rate of return, the last value of the investment, and
the start value of the investment, respectively.
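Eq. (21) translates directly into code. The sketch below is only an illustration, and the function name is an assumption.

```python
def rate_of_return(start_value, last_value):
    """Rate of return in percent, following Eq. (21)."""
    if start_value <= 0:
        raise ValueError("start value must be positive")
    return (last_value - start_value) / start_value * 100
```

For example, an account that grows from 100 to 110 has a rate of return of 10%.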
In fact, we should consider not only the return but also the risk-adjusted return.
To do that, four financial metrics, namely the Sharpe ratio, the Sortino ratio, the
maximum drawdown, and the Calmar ratio, were computed in the experiments. The Sharpe
ratio indicates how much wealth our approaches can earn relative to the volatility,
or risk. However, the Sharpe ratio also penalizes the good risk of the investment,
whereas the Sortino ratio penalizes only the bad risk. The Sortino ratio is a
variant of the Sharpe ratio that focuses on downside risk, or negative volatility.
The Sharpe ratio and the Sortino ratio of a portfolio are defined as
\[
Sharpe_a = \frac{E[R_a]}{\sigma_a} \tag{22}
\]
\[
Sortino_a = \frac{E[R_a]}{\sigma_{a(-)}} \tag{23}
\]
where 𝐸[𝑅𝑎], 𝜎𝑎, and 𝜎𝑎(−) are the expected return, the standard deviation of the
returns, and the standard deviation of the negative returns, respectively. The
higher the Sharpe ratio or the Sortino ratio, the better the portfolio.
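A minimal sketch of Eqs. (22) and (23) is given below. It assumes that the expected return is estimated by the sample mean of periodic returns, that the population standard deviation is used, and that the Sortino denominator is the standard deviation of the negative returns only, as the definition above states. Other texts use a downside deviation measured from zero or a target return instead, and the function names are hypothetical.

```python
from statistics import fmean, pstdev


def sharpe_ratio(returns):
    """Mean return divided by the standard deviation of all returns."""
    volatility = pstdev(returns)
    if volatility == 0:
        raise ValueError("returns have zero volatility")
    return fmean(returns) / volatility


def sortino_ratio(returns):
    """Mean return divided by the standard deviation of negative returns."""
    negative = [r for r in returns if r < 0]
    if len(negative) < 2:
        raise ValueError("need at least two negative returns")
    downside = pstdev(negative)
    if downside == 0:
        raise ValueError("negative returns have zero volatility")
    return fmean(returns) / downside
```

Because the Sortino denominator ignores positive returns, a strategy whose volatility comes mostly from gains scores higher on the Sortino ratio than on the Sharpe ratio.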
Another financial metric, the maximum drawdown (MDD), tells us the largest loss we
might incur. The MDD measures the largest past decline of a portfolio from a peak to
a trough, as shown in Fig. 17.
\[
MDD = \frac{P - L}{P} \tag{24}
\]
where 𝑃 and 𝐿 are the peak value and the trough value in the period of interest,
respectively. The lower the maximum drawdown, the lower the risk of the portfolio.
We also report the Calmar ratio, which indicates the relationship between risk and
return. It is useful for analyzing trading strategies or models based on historical
data. The Calmar ratio is defined as
\[
C_a = \frac{E[R_a]}{MDD} \tag{25}
\]
where 𝐶𝑎, 𝐸[𝑅𝑎], and 𝑀𝐷𝐷 are the Calmar ratio, the expected return, and the maximum
drawdown, respectively.
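Eqs. (24) and (25) can be sketched as below. The code scans a series of portfolio values, tracks the running peak, and records the largest relative drop from it. It returns the MDD as a positive fraction, as Eq. (24) is written, whereas the result tables in this chapter list the MDD with a negative sign. The function names are assumptions for illustration.

```python
def max_drawdown(values):
    """Largest peak-to-trough decline, (P - L) / P, as a positive fraction."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                    # running peak P
        worst = max(worst, (peak - value) / peak)  # drop from that peak
    return worst


def calmar_ratio(expected_return, mdd):
    """Expected return divided by the maximum drawdown, Eq. (25)."""
    if mdd == 0:
        raise ValueError("maximum drawdown is zero")
    return expected_return / mdd
```

For the value series 100, 120, 90, 110, 80, 130 the peak is 120 and the trough after it is 80, so the MDD is 40/120, or about 33%.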
The back testing results for the first period of each strategy, namely the
traditional technical indicators, LSTM trading in the long-only case, LSTM trading
in the long-short case, CWGANs trading in the long-only case, and CWGANs trading in
the long-short case, are displayed in Table 3, Table 4, Table 5, Table 6, and
Table 7, respectively. As demonstrated, the technical indicators, namely MOV, MACD,
BB, and RSI, did not yield a satisfactory profit. The MOV, the MACD, and the RSI
each made a profit of approximately 2%, while the BB made a loss of about 2%. None
of them could beat the buy & hold strategy: investors or traders who bought the
SET50 index and held it for 4 years would have earned a profit of 17.94%. However,
both of our models can achieve a higher return than the buy & hold strategy and the
technical indicators. The LSTM trading strategy with the past 8-day case, or LSTM-P8
for short, gained a profit of 21.26% and 15.32% in the long-only and long-short
cases, respectively. For our CWGANs trading strategy, most of the P cases
outperformed the technical indicators in the long-only case, but only the P6 and P8
cases outperformed the buy & hold strategy. The trade executions on the SET50 index
data between 3 July 2015 and 3 July 2019 based on MOV, MACD, BB, and RSI are shown
in Fig. 18.
To find a proper trading strategy for applying our models to individual stocks, or
testing samples, we picked the best-ranked cases based on investment performance. In
Table 2, the P8 case was chosen because it gave the best performance in both the
long-only and long-short cases. The P6, P8, and P16 cases were picked for the CWGANs
trading strategy. Although the P10 case ranked third in the long-only case, it was
not profitable in the long-short case, so the P16 case was picked instead. Examples
of the trade executions of the LSTM and CWGANs trading strategy cases are shown in
Fig. 19 to Fig. 21.
BB     -2.12%    0.004    0.006    -17.52%    -0.12
RSI    2.72%     0.12     0.16     -20%       0.13
Strategy
Past day (P)    Return ratio (%)    Sharpe ratio    Sortino ratio    MDD (%)    Calmar ratio
1               0.27%               0.05            0.06             -9.7       0.03
Strategy
Past day (P)    Return ratio (%)    Sharpe ratio    Sortino ratio    MDD (%)    Calmar ratio
1               -12.85%             0               0                -37.6      -0.34
After both models were back tested on the training dataset, we kept the past 8-day
case of both models for forward testing. The forward testing was separated into 4
parts as follows:
1. The first part: the SET50 index data in 2020
2. The second part: the top 100 stocks data in 2020
3. The third part: the top 100 stocks data between 1/1/2021 and 15/10/2021
4. The last part: stocks picked based on market capitalization between 1/1/2021 and
15/10/2021
take a return of 37.29% and 23.96%, respectively. Moreover, the long-short case was
the most profitable for both models, yielding 76.13% for the LSTM model and 66.60%
for the CWGANs model.
When we tried other past-day cases for the LSTM trading strategy, we found that the
past 5-day (P5) case and the past 12-day (P12) case yielded a higher profit than the
P8 case. Thus, we kept the P5 and P12 cases for the experiment with the top 100 Thai
stocks.
Figure 22. Average return comparison between our models and traditional indicators
(SET50 index data in 2020)
To apply the chosen past-day cases of the LSTM and CWGANs trading strategies to the
top 100 Thai stocks, the trading execution was separated into 2 periods: the year
2020 and 1/1/21-15/10/21. Because of the COVID-19 situation in 2020, the prices of
many stocks fell rapidly. Thus, we wanted to see how much wealth our strategies
could generate in 2020 alone.
Fig. 23 shows the annual average return of each trading strategy in 2020. In this
experiment, the LSTM trading strategy outperforms all other strategies in both
cases. The LSTM earned a return of 26.55% and 49.28% from investment in the top 100
Thai stocks in the long-only case and the long-short case, respectively. The CWGANs
trading strategy could not beat all of the traditional technical indicators: its
return was lower than that of the MACD in both the long-only case and the long-short
case. However, the CWGANs trading strategy outperformed the buy & hold, BB, and RSI
strategies in both cases.
Figure 23. Average return comparison between our models and traditional indicators
(Top 100 Thai stocks in 2020)
The Top 100 stocks data in SET100 between 1/1/2021 and 15/10/2021
Fig. 24 shows the average return of each strategy for the top 100 Thai stocks in
2021 (1/1/21-15/10/21). As shown, neither the technical indicators nor our models
could beat the buy & hold strategy. The investment performance details of each
strategy in 2020 and 2021 are shown in Table 8 and Table 9, respectively.
CWGANs long-only (P8 case)     5.62%     0.34    0.77    -12    0.69
CWGANs long-short (P8 case)    -0.44%    0.09    0.34    -25    0.18
[Figure 24: average return of each strategy for the top 100 Thai stocks, 1/1/21-15/10/21]
In practice, no one invests in 100 stocks at the same time. In addition, the
COVID-19 situation caused many stocks to gain while others lost over the same
period. Thus, for the practical deployment we selected 9 Thai stocks to evaluate
the profitability of each strategy. The 9 stocks were picked according to market
capitalization, which is divided into three groups: large-cap, mid-cap, and
small-cap. Additionally, the picked stocks should come from different sectors within
the portfolio. The investment performance of the 9 picked stocks was then
evaluated. The picked stocks were as follows:
1. Stocks in the SET50 as large-cap: AOT (Transportation and logistics), BDMS
(Health care service), STGT (Personal product and pharmaceuticals), HMPRO
(Commerce)
Here, we assume that there were no commission fees or taxes. The initial wealth
was set at 1,000,000 baht in a simulated trading account. The data were collected
from 1 January 2021 to 15 October 2021. Each stock was bought or sold according to
the signals given by the rules in the trading operation of each strategy. The
average return of our investment in this experiment is displayed in Fig. 25. The
LSTM trading and the CWGANs trading outperform the MOV, the MACD, the BB, the RSI,
and the buy & hold strategy in both cases. In particular, the LSTM trading strategy
in the long-short case gained a profit of 29%, about twice the average return of the
technical indicators and the buy & hold strategy.
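The simulated account can be illustrated with the long-only sketch below. It assumes that the whole balance is invested on a buy signal, that the whole position is sold on a sell signal, that fractional shares are allowed, and that an open position is valued at the last price. These sizing rules are assumptions for illustration only; the thesis states just the initial wealth of 1,000,000 baht and the absence of fees and taxes.

```python
def simulate_long_only(prices, signals, initial_wealth=1_000_000.0):
    """Final account value (baht) of a fee-free, all-in long-only simulation.

    prices: daily closing prices.
    signals: one of "buy", "sell", or "hold" per day, aligned with prices.
    """
    cash = initial_wealth
    shares = 0.0
    for price, signal in zip(prices, signals):
        if signal == "buy" and shares == 0:
            shares = cash / price   # invest the whole balance
            cash = 0.0
        elif signal == "sell" and shares > 0:
            cash = shares * price   # close the whole position
            shares = 0.0
    # Any open position is marked to the last available price.
    return cash + shares * prices[-1]
```

The final value can then be converted to a percentage with the rate of return in Eq. (21).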
Table 12 and Table 13 show the investment performance details of the 9 picked
stocks of the traditional technical indicators and our models, respectively.
Table 13 Investment performance of the picked 9 stocks based on our stock price
predictors (1 January 2021 – 15 October 2021)
[Figure 25: average return (%) of each strategy for the 9 picked stocks]
We summarized the average return of all trading strategies for the four forward
testing cases in Table 12. In the first part, the BB, RSI, and buy & hold
strategies could not make a profit, whereas the MACD and MOV could. Both models
achieved a higher return than the four technical indicators, except that the LSTM
past 8-day case earned about 1% less profit than the MACD. In the second part, the
BB and RSI again could not make a profit, and the buy & hold strategy barely did, at
just 0.83%. The MOV and the MACD made a profit and slightly outperformed the CWGANs
model and the LSTM past 8-day case in the long-only case. However, the LSTM past
12-day case achieved a higher average return than the MACD and MOV indicators in
both the long-only and long-short cases.
CWGANs long-only case     23.96% (P8 case)    17.17% (P8 case)    5.62% (P8 case)     14.72% (P8 case)
CWGANs long-short case    69.68% (P8 case)    22.87% (P8 case)    -0.44% (P8 case)    19.46% (P8 case)
In the third part, the LSTM and GANs trading could not beat the buy & hold, RSI,
and BB strategies. Both models achieved a higher return than the MACD and MOV in the
long-only case, but both incurred a loss in the long-short case. The third part was
the only experiment in which the LSTM trading earned less profit than the GANs
trading in the long-only case.
In the last part, both models achieved a higher average return than the four
technical indicators and the buy & hold strategy. The LSTM past 8-day case achieved
a higher return than the GANs past 8-day case in both the long-only case and the
long-short case.
Overall, the LSTM and GANs outperform the four technical indicators. In most cases,
the LSTM trading also achieves a higher average return than the GANs trading.
DISCUSSION
This thesis proposes stock trading approaches using deep learning algorithms,
namely LSTM and GANs, on the Stock Exchange of Thailand (SET). Both models use the
SET50 index data as the training dataset. To select the GANs model, we designed four
patterns of the CWGANs architecture: CNN-CNN, CNN-LSTM, LSTM-LSTM, and LSTM-CNN. On
our dataset, the CNN-CNN architecture gives a better result than the LSTM-CNN
architecture, while the CNN-LSTM and LSTM-LSTM architectures fail to converge. Thus,
the CNN-CNN architecture of the CWGANs is used for back testing. We also check the
repeatability of the LSTM and CWGANs models to ensure that both models are able to
predict future stock prices. Then, the back testing is performed to produce
scenarios of the trading strategy. There are 38 scenarios: 19 long-only scenarios
and 19 long-short scenarios. We choose the best scenario in the long-only and
long-short cases of each model. After that, the profitability of the best scenario
of each model is evaluated on the testing dataset. In addition, the top 100 stocks
are used to evaluate the profitability of the trading strategy of the models. This
process is called “forward testing”. The forward testing consists of 4 parts: the
SET50 index data in 2020, the top 100 stocks in 2020, the top 100 stocks between
1 January 2021 and 15 October 2021, and the practical deployment section. After
selecting the trading strategy from the scenarios of the LSTM and CWGANs models, we
compare its profitability with the four technical indicators, namely MOV, MACD, BB,
and RSI, and with the buy & hold strategy.
In the back testing on the SET50 index data, the most suitable trading strategy for
both the CWGANs and the LSTM model is the P8 case, in the long-only and long-short
cases. The LSTM and CWGANs outperform the four technical indicators and the buy &
hold strategy. The long-only P8 cases of the LSTM and CWGANs make a profit of 21.26%
and 19.36%, respectively, while the MOV, MACD, BB, and RSI obtain 2.35%, 2.13%,
-2.12%, and 2.72%, respectively. For the first part of the forward testing, on the
SET50 index data in 2020, we found that the LSTM and the CWGANs still outperform the
four technical indicators and the buy & hold strategy. The buy & hold, BB, and RSI
strategies have negative returns of -15.77%, -12%, and -26.70%, respectively. The
CWGANs P8 case earns a profit of 23.96% and 66.60% in the long-only and long-short
cases, respectively. The LSTM P8 case earns a profit of 17.72% and 53.30% in the
long-only and long-short cases, respectively. However, the LSTM P5 case achieves a
higher return than the LSTM P8 case, with a profit of 37.29% and 76.13% in the
long-only and long-short cases, respectively.
The second part of the forward testing uses the top 100 Thai stocks of the SET100
market in 2020. The CWGANs P8 and the LSTM P8 cases are both profitable. Although
the return of the CWGANs is about 8% lower than that of the MACD, the MDD of the
MACD is about 10% larger than that of the CWGANs. Neither the models nor the four
technical indicators can beat the buy & hold strategy between 1 January 2021 and
15 October 2021. In the practical deployment section, we selected 9 stocks based on
market capitalization, and the results show that the average returns of the LSTM P8
case and the CWGANs P8 case exceed those of the MOV, MACD, BB, RSI, and buy & hold
strategies. Looking at the return of each individual stock, neither model makes a
profit on the BEAUTY stock, whereas the RSI yields 40.90% on it. Nevertheless, the
total average return of both models still outperforms the RSI.
Based on the experiments, the LSTM trading strategy can achieve a higher return
than the CWGANs trading strategy in some cases. Both the LSTM and the CWGANs trading
strategies may help traders gain profit on the Stock Exchange of Thailand (SET).
CONCLUSION
LITERATURE CITED
Adebiyi, A., Adewumi, A., & Ayo, C. (2014). Stock price prediction using the ARIMA
model.
Alqahtani, H., Kavakli-Thorne, M., & Kumar Ahuja, D. G. (2019). Applications of
Generative Adversarial Networks (GANs): An Updated Review. Archives of
Computational Methods in Engineering, 28. doi:10.1007/s11831-019-
09388-y
Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein GAN.
Chouhan, L., Agarwal, N., Parmar, I., Saxena, S., Arora, R., Gupta, S., & Dhiman, H.
(2018). Stock Market Prediction Using Machine Learning.
de Souza, M. J. S., Ramos, D. G. F., Pena, M. G., Sobreiro, V. A., & Kimura, H.
(2018). Examination of the profitability of technical analysis based on moving
average strategies in BRICS. Financial Innovation, 4(1), 3.
doi:10.1186/s40854-018-0087-z
Eckerli, F., & Osterrieder, J. (2021). Generative Adversarial Networks in finance: an
overview.
Gurrib, I. (2014). The Moving Average Crossover Strategy: Does it Work for the
S&P500 Market Index? SSRN Electronic Journal. doi:10.2139/ssrn.2578302
Leangarun, T., Tangamchit, P., & Thajchayapong, S. (2018). Stock Price Manipulation
Detection using Generative Adversarial Networks.
Liu, S., Liao, G., & Ding, Y. (2018, 31 May-2 June 2018). Stock transaction
prediction modeling and analysis based on LSTM. Paper presented at the 2018
13th IEEE Conference on Industrial Electronics and Applications (ICIEA).
Nelson, D. M. Q., Pereira, A. M., & Oliveira, R. A. d. (2017). Stock market's price
movement prediction with LSTM neural networks. 2017 International Joint
Conference on Neural Networks (IJCNN), 1419-1426.
O'Shea, K., & Nash, R. (2015). An Introduction to Convolutional Neural Networks.
ArXiv e-prints.
Sahin, U., & Ozbayoglu, A. M. (2014). TN-RSI: Trend-normalized RSI Indicator for
Stock Trading Systems with Evolutionary Computation. Procedia Computer
Science, 36, 240-245. doi:https://doi.org/10.1016/j.procs.2014.09.086
Shah, N., & Manubhai, P. (2015). A Comparative Study on Technical Analysis by
Bollinger Band and RSI.
Vaidya, R. (2020). Moving Average Convergence-Divergence (MACD) Trading Rule:
An Application in Nepalese Stock Market "NEPSE". Quantitative
Economics and Management Studies, 1(6), 366-374.
doi:10.35877/454RI.qems197
Wang, Z., She, Q., & Ward, T. E. (2019). Generative Adversarial Networks: A Survey
and Taxonomy. ArXiv, abs/1906.01529.
Wei, D. (2019). Prediction of Stock Price Based on LSTM Neural Network. 2019
International Conference on Artificial Intelligence and Advanced Manufacturing
60
(AIAM), 544-547.
Zhang, K., Zhong, G., Dong, J., Wang, S., & Wang, Y. (2019). Stock Market
Prediction Based on Generative Adversarial Network. Procedia Computer
Science, 147, 400-406. doi:https://doi.org/10.1016/j.procs.2019.01.256
CURRICULUM VITAE
NAME Pattareeya Piravechsakul