Abstract: Machine learning algorithms such as Support Vector Machine (SVM) and Artificial Neural Network (ANN) are used for machine trading. The problem with SVM and ANN is that it is difficult to determine appropriate features for both models; moreover, performing backpropagation for ANN becomes time consuming as the number of features and the amount of data increase. However, machine trading is not confined to these two algorithms; it also includes time series models. In this study, we employ time series models, namely Autoregressive Integrated Moving Average (ARIMA) and Holt-Winters' Exponential Smoothing (HW), as the guidance for trading, since time series models require only a time series as input. We perform time series analysis to capture trading opportunities in the Stock Exchange of Thailand (SET). There are fifty companies on the list of the SET50 index; we rank them by Sharpe Ratio and select the top five as the invested assets in a simulated portfolio. Furthermore, the well-known portfolio optimization framework by Harry Markowitz is used to ensure that the combination of the invested assets is located on the efficient frontier. The result of this study is favorable, as the return generated by these activities outperforms the market return; furthermore, sampling the time series at different time lags yields a higher return, and combining the ARIMA and HW models to predict stock prices also improves the predictive power of the time series models. This study will help retail investors who have limited resources overcome bias and intuition throughout the investment decision process, ranging from finding stocks for investment to capturing market movement for trading opportunities.

I. INTRODUCTION

Investment in financial securities has become part of people's lives. Investment in financial instruments is not only a means of yield enhancement, but also a vehicle to obtain tax privileges, such as investing in a Long Term Equity Fund (LTF) or a Retirement Mutual Fund (RMF). Furthermore, corporates managing treasury operations also need investments other than bank deposits for their surplus cash.

Some institutional investors might utilize trading systems such as algorithmic trading or arbitrage to exploit market discrepancies [1]; however, retail investors find it difficult to weather the market situation because they cannot access trading information on time and have a limited budget for advanced financial technology. Besides, they have little knowledge of trading compared with large financial institutions, which invest huge amounts in their trading platforms and skilled personnel. Thus, there is a need for a guideline on portfolio construction which can be implemented on a personal computer, such that individual investors, especially retail investors who have some basic knowledge and a limited budget, can build up their portfolios efficiently, resulting in the expansion of the financial market and of interest in investing in financial instruments as alternative investments for wealth management.

Machine learning algorithms, such as Support Vector Machine (SVM) and Artificial Neural Network (ANN), are the learning methods commonly used for machine trading [2]. However, these methods must determine all factors that influence the asset price and are therefore not scalable, since they need all trading information that directly or indirectly affects a particular stock, such as GDP growth, foreign exchange and past financial performance. Time series models instead use only one factor, indexed in time order, to determine the asset price. Hence, the input for model calibration is only the price of the stock itself.

In this study, we employ time series models, namely Autoregressive Integrated Moving Average (ARIMA) and Holt-Winters' Exponential Smoothing (HW), as the guidance for trading. There are fifty companies in the list of the SET50 index announced by the Stock Exchange of Thailand (SET); only the top five amongst them, measured by Sharpe Ratio [3], are selected as invested assets in a simulated portfolio. Furthermore, the well-known portfolio optimization framework by Harry Markowitz [4] is used to ensure that the combination of the invested assets is located on the efficient frontier.

This paper is conducted at the empirical level to evaluate the extended time series models using ARIMA and HW. Using mean square error (MSE), both models are then compared in terms of robustness and accuracy. Besides, by combining ARIMA and HW, we can make better decisions in foreseeing the closing price. The body of this paper is presented in nine sections. Sections 2-5 summarize the essential prior works related to time series models. The comparison of the extended ARIMA and HW models follows the data analysis process and is then evaluated in Sections 6 and 7. Section 8 combines the ARIMA and HW models for better judgement of the closing price. The summary is discussed at the end.
II. LITERATURE REVIEWS

A number of numerical prediction models have been developed to forecast asset prices. Barack and Lawrence [5] implemented ANN to predict stock prices on the Nairobi Securities Exchange and the New York Stock Exchange; the models gave favorable results; however, they could not carry out the experiment for all sixty stocks because model tuning was time consuming: they needed around two hours to fit the model to predict the price of a single stock; besides, they had to single out the stocks qualitatively for the model's inputs. SVM has also been used to predict stock prices with substantial performance by using time series of historical data, including open price, low price, high price and closing price, as the features [6]; however, selecting features is critical and controversial [7], as both ANN and SVM can provide different results with different sets of features. In fundamental analysis, there are many aspects to which an investor can pay attention; for example, in top-down analysis [8], the factors we can take into account include the current state of the economic cycle, fiscal policy, monetary policy, the nature of the industry, regulations, and pricing and costing policy; these factors can be incorporated into both ANN and SVM and will not yield the same result. Besides, unlike ANN, time series models do not rely on the performance of the computer to perform backpropagation [9]; hence, in this study we employ time series models to capture stock movements, as time series models require fewer inputs: only the historical price is needed. Moreover, little time is needed to fit the model since we do not have to perform backpropagation; thus time series models give us more flexibility to capture the market movement and guide the trading.

III. AUTOREGRESSIVE INTEGRATED MOVING AVERAGE (ARIMA) AND HOLT-WINTERS' EXPONENTIAL SMOOTHING WITH SEASONALITY

A. Autoregressive integrated moving average (ARIMA)

The ARIMA model is a time series model which can handle non-stationary processes. The foundation of the ARIMA model stems from the Autoregressive (AR) model and the Moving Average (MA) model.

The autoregressive or AR model is a time series model which works on a stable process and can be described by the following generalized form:

$X_t = a_0 + a_1 X_{t-1} + a_2 X_{t-2} + \dots + a_p X_{t-p} + \varepsilon_t$ (1)

The AR model is a linear model in which the current value $X_t$ depends on the former values $X_{t-1}$, $X_{t-2}$ and so forth, plus Gaussian white noise ($\varepsilon_t$), where the white noise captures the effect of unexpected events on the time series. $\varepsilon_t$ is stable and follows the normal distribution with mean ($\mu$) equal to 0 and constant variance ($\sigma^2$); $\varepsilon_t$ can also be called a random shock, since it represents the unforeseen circumstances that deviate from the process.

The moving average or MA model differs from the AR model in the sense that $X_t$ depends on its white noise ($\varepsilon_t$) from the past to the present, as can be seen from the following generalized form:

$X_t = \beta_0 + \varepsilon_t - \beta_1 \varepsilon_{t-1} - \beta_2 \varepsilon_{t-2} - \dots - \beta_q \varepsilon_{t-q}$ (2)

The MA model is a linear model comprised of the series of parameters $\beta$ and white noise $\varepsilon$ from the past until the present, while $\beta_0$ is the average of the selected period of the time series.

ARMA is the combination of the AR and MA models. The generalized form of ARMA can be described as,

$X_t = a_0 + a_1 X_{t-1} + a_2 X_{t-2} + \dots + a_p X_{t-p} + \varepsilon_t - \beta_1 \varepsilon_{t-1} - \beta_2 \varepsilon_{t-2} - \dots - \beta_q \varepsilon_{t-q}$ (3)

The ARMA model incorporates both the past values of the series ($X_t$) and white noise ($\varepsilon_t$). The ARIMA model is the extension of ARMA that handles non-stationary processes, which are caused by factors like seasonality and trend within the time series. The standard form of the ARIMA model is formulated as follows:

$\alpha(L)\,\Delta^d X_t = a_0 + \beta(L)\,\varepsilon_t$ (4)

Where;
$\Delta^d$ = differencing of order $d$ between lagged values, such as $\Delta X_t = X_t - X_{t-1}$
$\alpha(L) = 1 - \alpha_1 L - \alpha_2 L^2 - \dots - \alpha_p L^p$ (AR model)
$\beta(L) = 1 - \beta_1 L - \beta_2 L^2 - \dots - \beta_q L^q$ (MA model)
and $L$ is the lag operator, $L X_t = X_{t-1}$.

B. Holt and Winters exponential smoothing model (HW)

The model was first developed by Professor Charles C. Holt for forecasting trends in production. Later, Professor Peter R. Winters improved the model by adding seasonality so that the model can handle business time series, which normally exhibit trend and seasonality. HW can accommodate time series for which the stationarity assumption does not hold. Hence, HW yields great benefits, as normal business time series do exhibit trend, seasonal, cyclical and irregular effects. The model can be written as $Y_t = T_t + S_t + C_t + I_t$, where $Y_t$ is the value of the time series at time $t$, $T_t$ is the trend at time $t$, $S_t$ denotes the seasonal effect at time $t$, $C_t$ denotes the business life cycle and $I_t$ represents irregularity at time $t$.

HW can be classified into two choices: additive and multiplicative. The difference between the two models is the calculation of the seasonal factor $S_t$ (described below), where the former handles seasonality by simply subtracting the expected value from each value in the time series, while the latter calculates the ratio between the value in the time series and its expected value; however, the multiplicative HW method is the more widely used of the two.

The model utilizes the Triple Exponential Smoothing method, which comprises three components: level, trend and seasonal. The expressions for the components can be generalized as,

Expression for additive model

$\ell_t = \alpha (Y_t - S_{t-p}) + (1 - \alpha)(\ell_{t-1} + T_{t-1})$ Level (5)
$T_t = \beta (\ell_t - \ell_{t-1}) + (1 - \beta) T_{t-1}$ Trend (6)
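As an illustration of the level and trend recursions of the additive model, the following minimal Python sketch performs a single update step. It assumes the standard textbook form of additive Holt-Winters smoothing, in which the new level is a weighted average of the deseasonalized observation and the previous level plus trend, and the new trend is a weighted average of the latest level change and the previous trend. The function name, argument names and numeric inputs are hypothetical and are not taken from the study.

```python
def additive_hw_step(y, level_prev, trend_prev, seasonal_lagged, alpha, beta):
    """One additive Holt-Winters update of the level and trend components.

    y               -- current observation
    level_prev      -- level from the previous period
    trend_prev      -- trend from the previous period
    seasonal_lagged -- seasonal component from one full season earlier
    alpha, beta     -- smoothing parameters for level and trend, each in [0, 1]
    """
    # Level: blend the deseasonalized observation with the previous projection.
    level = alpha * (y - seasonal_lagged) + (1 - alpha) * (level_prev + trend_prev)
    # Trend: blend the latest change in level with the previous trend.
    trend = beta * (level - level_prev) + (1 - beta) * trend_prev
    return level, trend


# Hypothetical inputs for illustration only (not values from the study).
level, trend = additive_hw_step(
    y=105.0, level_prev=100.0, trend_prev=1.0,
    seasonal_lagged=2.0, alpha=0.5, beta=0.2,
)
print(f"level = {level:.4f}, trend = {trend:.4f}")
```

Repeating this step over a price series, together with a seasonal update, gives the full triple exponential smoothing recursion; in practice a library implementation would normally be used instead of hand-written updates.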
2019 4th International Conference on Information Technology (InCIT), Bangkok, THAILAND
Where:
α, β and γ denote the exponential smoothing parameters and p is the number of seasons in a year, such as 12 for monthly data or 4 for quarterly data.

$\text{Sharpe Ratio} = \dfrac{R_x - R_f}{\text{StdDev}(R_x)}$ (17)

Where:
$R_x$ = Return from investment asset
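To make the ranking step concrete, the sketch below computes a Sharpe ratio per stock and keeps the top candidates. It assumes the usual definition (mean return minus a risk-free rate, divided by the standard deviation of returns) and uses the sample standard deviation; the tickers, return series and default risk-free rate of zero are hypothetical placeholders, not data from the study.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return divided by the sample standard deviation of returns.

    `returns` and `risk_free_rate` must be expressed over the same period.
    """
    return (mean(returns) - risk_free_rate) / stdev(returns)


def top_n_by_sharpe(returns_by_ticker, n, risk_free_rate=0.0):
    """Return the n tickers with the highest Sharpe ratio, best first."""
    ranked = sorted(
        returns_by_ticker,
        key=lambda ticker: sharpe_ratio(returns_by_ticker[ticker], risk_free_rate),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical periodic returns for three made-up tickers.
sample_returns = {
    "AAA": [0.01, 0.03, 0.02],
    "BBB": [0.00, 0.04, 0.02],
    "CCC": [-0.01, 0.01, 0.00],
}
print(top_n_by_sharpe(sample_returns, n=2))
```

Applied to the fifty SET50 constituents with n=5, the same ranking would yield the five invested assets; the choice of return frequency and risk-free rate affects the ordering and is left open here.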
4. Investors consider both expected return and risk; the utility function of each investor therefore depends on both expected risk and return.
5. Given the same amount of risk, investors will invest in the asset yielding the highest return, and vice versa.

The study by Markowitz suggests that, to construct an efficient portfolio, the combination of the invested assets must be located on the efficient frontier, as the efficient frontier represents the best return for a given amount of risk measured by standard deviation [12]; the construction of the portfolio maximizes the expected return $R_p$ while keeping the variance ($\sigma_p^2$) of the portfolio at the minimum, as described below,

$R_p = w^T \mu$ (18)
$\sigma_p^2 = w^T \Sigma w$ (19)

Where;
$w$ is the vector of portfolio weights, $\mu$ is the vector of expected asset returns and $\Sigma$ is the covariance matrix of asset returns.

1. Airport of Thailand PLC (AOT)
2. Energy Absolute PLC (EA)
3. Global Power Synergy (GPSC)
4. Indorama Ventures PCL (IVL)
5. Beauty Community PCL (BEAUTY)

The next step is to find the proportion of each candidate such that we obtain the optimum portfolio in accordance with Markowitz's portfolio theory; in this case, we find the proportion of each stock such that the risk (variance) of the portfolio is minimized. The result is that we invest 55%, 2.4%, 22%, 16.2% and 4.4% in AOT, EA, GPSC, IVL and BEAUTY respectively; the efficient frontier is shown next.
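Equations (18) and (19) can be evaluated directly for any weight vector. The sketch below does so in plain Python for the weights reported in the text; the expected returns and the diagonal covariance matrix are hypothetical placeholders chosen only to make the example run, and are not estimates from the study.

```python
def portfolio_return(weights, mu):
    """Expected portfolio return: the weighted sum of expected asset returns."""
    return sum(w * m for w, m in zip(weights, mu))


def portfolio_variance(weights, cov):
    """Portfolio variance w' * Sigma * w, with Sigma given as nested lists."""
    n = len(weights)
    return sum(
        weights[i] * cov[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )


# Weights reported in the text for AOT, EA, GPSC, IVL and BEAUTY.
weights = [0.55, 0.024, 0.22, 0.162, 0.044]

# Hypothetical inputs for illustration only (not from the study):
# expected returns and a diagonal covariance matrix (zero correlations).
mu = [0.010, 0.020, 0.015, 0.012, 0.030]
variances = [0.0004, 0.0025, 0.0009, 0.0016, 0.0100]
cov = [[variances[i] if i == j else 0.0 for j in range(5)] for i in range(5)]

r_p = portfolio_return(weights, mu)
var_p = portfolio_variance(weights, cov)
print(f"sum of weights = {sum(weights):.3f}")
print(f"R_p = {r_p:.6f}, variance = {var_p:.10f}")
```

Finding the minimum-variance weights themselves requires an optimizer over these two functions subject to the weights summing to one; that step is not shown here.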
TABLE II. PARAMETERS IN ARIMA MODELS FROM THE FIRST ITERATION

Stock: AOT, EA, GPSC, IVL, BEAUTY
Smoothing parameters (α, β, γ): [numeric values not recoverable]

D. Model performance and evaluation

One dimension for selecting the model is whether the model generates a large amount of error. One methodology for gauging the accuracy of the model is the mean square error (MSE): the lower the MSE, the better the performance of the model. The table below summarizes the MSE of each model for the traded stocks after 100 simulations of trading.

TABLE III. SUMMARY OF MEAN SQUARE ERRORS (MSE) AND PERFORMANCE

Stocks    ARIMA     Holt and Winters
AOT       0.57      0.81
EA        4.43      5.05
GPSC      2.03      2.95
IVL       1.18      1.17
BEAUTY    0.26      10.35
Total     8.21      20.33
Return    -9.38%    -9.65%

At first glimpse, the table shows that the ARIMA model outperforms the HW model, as the total MSE of the ARIMA model is far lower than that of the HW model; the result is consistent with the accuracy described above. From the simulation of 100 trades, we found that the portfolio using the ARIMA model as the predictor yields -9.38%, while HW yields -9.65%, whereas an investor who invested over the simulated period would have suffered an 11.43% loss; thus, only ARIMA model outperforms the market return.

We can conclude from our experiment that, even though the model fits the time series quite well, the rule that triggers the execution to buy or sell matters, especially when the prediction signals an up move but the signal is too weak, in other words just a little higher than the previous price. In this case, the gain might not compensate for the commission fees and eventually we lose because we trade too often; the loss comes not from the inaccuracy of the prediction, but from the commission fees.

VIII. ENHANCED PERFORMANCE BY MULTIPLE DAY TRADING

We implement the same strategy for both the ARIMA and HW models; however, the input time series is modified in accordance with the lag time, from two to five lags.

To be able to compare with our previous strategy, we use the same assumptions as in the daily trading strategy, which include the invested assets (AOT, EA, IVL, GPSC and BEAUTY), the trading period (from 1st January 2018 to 28th June 2018) and the weight of each stock, all of which are the same as determined earlier, except that the time series we use here is varied by its lag time.

What we mean by lag time is the following: when we say we use two lags of time to forecast the future price, we collect the closing price of our stock at every period indicated by the lag. For example, with two lags of time we collect the closing price every two days and use it as the time series for building both the ARIMA and Holt-Winters models, and with four lags of time we gather the closing price every four days, as depicted in the diagram below.

T0 T1 T2 T3 T4 T5 T6 T7 T8 T9 T10
Fig. 2. Time line with different lag time

- Members of the two-lag time series are T0, T2, T4, T6, ..., T10
- Members of the four-lag time series are T0, T4 and T8

TABLE IV. SUMMARY OF PERFORMANCE FOR EACH LAG TIME

Number    Profit    Profit          No.      Profit per Trade    Profit per Trade
of Lag    ARIMA     Holt-Winters    Trade    ARIMA               Holt-Winters
1         -9.38%    -9.65%          100      -0.093%             -0.096%
2         32.75%    20.64%          50       0.655%              0.413%
3         19.69%    14.41%          33       0.596%              0.436%
4         14.54%    1.95%           25       0.581%              0.078%
5         5.74%     4.32%           20       0.287%              0.216%

The results shown above summarize trading with different lags, up to five lags of time. We can see that using different lags of time yields different results. The most relevant criterion for evaluating the performance is profit per trade: as the name says, profit per trade removes the bias from the number of trading transactions, because the result is scaled down to a per-transaction basis to judge which model performs better.

We can see that when the time series of two-lag closing prices is used as the input, ARIMA provides the highest return, 0.655% per trade, while Holt-Winters performs best when using three-lag closing prices as the input. Seemingly, using different lag times enhances the performance of the overall model as a result of:

1. Lower commission fees, as the number of trades is drastically reduced. The first proposed methodology assumes that we make a transaction every day, and for each transaction we pay 0.25% of the traded amount; thus, implementing a time lag for trading means fewer trading transactions; two lags cut the commission fee by half.

2. Reduced volatility, which cuts unnecessary transactions. We can see on the next graph that the volatility of the AOT daily price is higher than that of the AOT four-lag price, as the latter graph is smoother than the daily one. The smoother time series enhances the model because both ARIMA and HW employ the previous trading price in building the model: the less volatile the time series, the more accurate the model. Furthermore, daily trading will overlook the power of
market correction; that is, it puts too much weight on daily volatility. There are many times when a rumor or uncertainty arises during the trading day, resulting in a spike in price. Players in the market take time to recognize all relevant information and remake the right decision. Market anxiety during a political protest is a good example: during the chaos there is always a flow of information which is more arbitrary in nature; investors respond to the perceived information and reconcile their decisions when they receive the facts. This creates fluctuation in the stock price, which could move up or down and then revert to the original level on the next trading day. Thus, using a time lag for trading removes such irregularity and lessens superfluous trading.

... instantaneously. Results from all strategies simulated in this study are summarized in Table V.

TABLE V. SUMMARY OF PERFORMANCE FROM ALL METHODS

Strategy              Profit    Profit    No.      Profit per trade    Profit per trade
                      ARIMA     HW        Trade    ARIMA               HW
Daily trading         -9.38%    -9.65%    100      -0.093%             -0.096%
2-day lags trading    32.75%    20.64%    50       0.655%              0.413%
3-day lags trading    19.69%    14.41%    33       0.596%              0.436%
4-day lags trading    14.54%    1.95%     25       0.581%              0.078%
5-day lags trading    5.74%     4.32%     20       0.287%              0.216%
Combined (both ARIMA and HW): 0.5%
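The lag sampling of Fig. 2 and the profit-per-trade figures in the tables can be reproduced with a few lines of Python. The sketch below keeps every k-th closing price and divides the total profit by the number of trades. The helper names are hypothetical, and the rule that the number of trades equals 100 divided by the lag, rounded down, is inferred from the tabulated counts (100, 50, 33, 25, 20) rather than stated in the text.

```python
def lagged_series(prices, lag):
    """Keep every `lag`-th observation, starting from the first one."""
    return prices[::lag]


def profit_per_trade(total_profit_pct, n_trades):
    """Scale a total profit (in percent) down to a per-transaction basis."""
    return total_profit_pct / n_trades


# Time line of Fig. 2, using labels in place of closing prices.
timeline = [f"T{i}" for i in range(11)]
two_lag = lagged_series(timeline, 2)
four_lag = lagged_series(timeline, 4)
print("two-lag members: ", two_lag)
print("four-lag members:", four_lag)

# Total ARIMA profit (percent) per lag, as reported in the tables.
arima_profit = {1: -9.38, 2: 32.75, 3: 19.69, 4: 14.54, 5: 5.74}
daily_trades = 100  # number of trades in the daily strategy

for lag, profit in arima_profit.items():
    n_trades = daily_trades // lag  # assumption inferred from the tables
    per_trade = profit_per_trade(profit, n_trades)
    print(f"lag {lag}: {n_trades} trades, {per_trade:.4f}% per trade")
```

The computed per-trade values agree with the tabulated ones to within about 0.001 percentage points; the small differences come from rounding in the tables.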