
Applied Soft Computing Journal 82 (2019) 105553

Contents lists available at ScienceDirect

Applied Soft Computing Journal


journal homepage: www.elsevier.com/locate/asoc

Conference

Analysis of temporal pattern, causal interaction and predictive modeling of financial markets using nonlinear dynamics, econometric models and machine learning algorithms

Indranil Ghosh a, Rabin K. Jana b,∗, Manas K. Sanyal c
a Department of Operations Management & IT, Calcutta Business School, WB 743503, India
b Operations & Quantitative Methods Area, Indian Institute of Management Raipur, CG 493661, India
c Department of Business Administration, University of Kalyani, WB 741235, India

highlights

• Predictive analytics models are developed to predict future movements of stock markets.
• Machine learning algorithms are used to enhance the predictive accuracy.
• Temporal patterns and causal interactions of the stock markets are studied.

article info

Article history:
Received 19 August 2018
Received in revised form 2 May 2019
Accepted 2 June 2019
Available online 7 June 2019

Keywords:
Predictive modeling
Machine learning
Econometric models
Nonlinear dynamics
Financial market

abstract

This paper presents a novel predictive modeling framework for forecasting the future returns of financial markets. The task is very challenging as the movements of the financial markets are volatile, chaotic, and nonlinear in nature. To accomplish this task, a three-stage approach is proposed. In the first stage, fractal modeling and recurrence analysis are used, and the efficient market hypothesis is tested to comprehend the temporal behavior and investigate autoregressive properties. In the second stage, Granger causality tests are applied in a vector auto regression environment to explore the causal interaction structures among the indexes and identify the explanatory variables for predictive analytics. In the final stage, the maximal overlap discrete wavelet transformation is carried out to decompose the stock indexes into linear and nonlinear subcomponents. Seven machine and deep learning algorithms are then applied to the decomposed components to learn the inherent patterns and predict future movements. For numerical testing, the daily closing prices of four major Asian emerging stock indexes, exhibiting non-stationary behavior, during the period January 2012 to January 2017 are considered. Statistical analyses are performed to compare predictive performance. The obtained results demonstrate the effectiveness of the proposed framework.

1. Introduction

Closer inspection of the evolutionary dynamics and inherent patterns of stock markets for estimating future movements has garnered tremendous attention because of its importance to the overall economic growth of any nation [1,2]. Stock markets are often characterized as highly complex nonlinear dynamic systems [3]. This makes the evaluation of temporal characteristics and predictive modeling very challenging. The sensitivity of stock markets to chaotic events results in a higher degree of volatility and nonlinearity in their movements, which further complicates the decoding of the trend [1]. As a result, stock market forecasting carries a degree of uncertainty that weighs on the minds of investors. This motivates us to analyze the temporal pattern to extract meaningful insights into the dynamics of stock markets, and to propose a robust predictive modeling framework to predict future movements.

There are many variables, parameters, and sets in the paper. Before reviewing the literature, the nomenclature used in this paper is summarized at the top of the next page.

There exist various traditional statistical and econometric models for understanding the behavior of financial markets exhibiting a high degree of volatility and forecasting their future movements [2–7]. The majority of these models assume that the

∗ Corresponding author.
E-mail addresses: fri.indra@gmail.com (I. Ghosh), rkjana@iimraipur.ac.in (R.K. Jana), manas_sanyal@klyuniv.ac.in (M.K. Sanyal).

https://doi.org/10.1016/j.asoc.2019.105553
1568-4946/
2 I. Ghosh, R.K. Jana and M.K. Sanyal / Applied Soft Computing Journal 82 (2019) 105553

Nomenclature

ADF         Augmented Dickey Fuller
AIC         Akaike Information Criterion
ANN         Artificial Neural Network
ANFIS       Adaptive Neural Fuzzy Inference System
ARDL        Autoregressive Distributed Lag
BSE         Bombay Stock Exchange
CCF         Cross-Correlation Function
CCI         Commodity Channel Index
D           Fractal Dimensional Index
DCC-GARCH   Dynamic Conditional Correlation-Generalized Autoregressive Conditional Heteroscedasticity
DET         Determinism Rate
DM          Diebold–Mariano
DNGC        Does not Granger Cause
DNN         Deep Neural Network
EMA         Exponential Moving Average
EN          Elastic Net
EPA         Equal Predictive Ability
ERS         Elliott–Rothenberg–Stock
ERT         Extremely Randomized Trees
FA          Firefly Algorithm
FPE         Final Prediction Error
FFNN        Feed Forward Neural Network
GA          Genetic Algorithm
H           Hurst Exponent
HQ          Hannan–Quinn Information Criterion
IR          Impulse Response
IA          Index of Agreement
JSX         Jakarta Stock Exchange
KOSPI       Korea Composite Stock Price Exchange
LAM         Laminarity
LR          Likelihood Ratio
LSTMN       Long Short-Term Memory Network
MA          Moving Average
MACD        Moving Average Convergence Divergence
MAD         Mean Absolute Deviation
MAPE        Mean Absolute Percentage Error
MCS         Model Confidence Set
M-Boosting  MODWT-Boosting
MRS         Markov Regime Switching
M-SVR       MODWT-Support Vector Regression
M-EN        MODWT-Elastic Net
M-RF        MODWT-Random Forest
M-ERT       MODWT-Extremely Randomized Trees
MODWT       Maximal Overlap Discrete Wavelet Transformation
M-DNN       MODWT-Deep Neural Network
M-LSTMN     MODWT-LSTMN
NSC         Nash Sutcliffe Coefficient
PP          Phillips–Perron
PSY         Psychological Line
REC         Recurrence Rate
RF          Random Forest
RMSE        Root Mean Squared Error
RP          Recurrence Plot
RQA         Recurrence Quantification Analysis
RSI         Relative Strength Index
RW          Random Walk
SADE        Stacked Denoising Auto Encoder
SC          Schwarz Information Criterion
SGD         Stochastic Gradient Descent
SPA         Superior Predictive Ability
STOC        Stochastic Oscillator
SVR         Support Vector Regression
TT          Trapping Time
TI          Theil Inequality
TWSE        Taiwan Stock Exchange
VAR         Vector Auto Regression
WR          Williams Overbought/Oversold Index
ZA          Zivot Andrews

variance remains unchanged while explaining the predictability. Empirical studies confirm that this assumption does not hold in general and often leads to serious deviations [8–10]. The RP and RQA are used to identify different types of crashes in the stock market [3] as well as to test the existence of Fractional Brownian Motion. The findings of these research works suggest that the markets are inefficient. Econometric techniques like DCC-GARCH [11], Granger causality [12,13], conditional Granger causality [14], ARDL [15], and Granger causality with nonlinear ARDL [16] are used to evaluate the association and interaction among homogeneous and heterogeneous assets. Wavelet-based techniques [17,18] are also used to explore such relationships. These studies are restricted to identifying the direction of volatility spillovers and generating insights for portfolio diversification.

The major drawbacks of conventional statistical and econometric models have spurred a rapid development of artificial intelligence, machine learning, and deep learning techniques for stock market prediction [9,19–24]. Ensemble empirical mode decomposition, least squares SVM optimized through PSO, and GARCH are combined for generating final forecasts [9]. In a similar manner, SVR optimized through chaotic FA is used for prediction of the daily closing prices of Intel, National Bank shares, and Microsoft. However, the entire prediction exercise was carried out on the aggregate series itself [22]. A granular framework comprising wavelet NN and rough sets is used for automatic attribute selection and for obtaining future predictions of five global stock markets [24]. This approach outperformed traditional forecasting models.

A multivariate deep learning approach comprising SADE and bagging produced superior forecasts compared to RW, MRS, FFNN, and SVR [19]. Another deep learning approach is used for predictive modeling of five-minute intraday data of the KOSPI index [20]. Fractal modeling and machine learning algorithms are used to generate statistically significant forecasts of the NIFTY 50, Hangseng, NIKKEI, and NASDAQ indexes [21]. Another machine learning based study utilized ANFIS in conjunction with GARCH and Markov switching regime for successful predictive analysis of emerging economies in Latin America [23]. These models either use a univariate framework comprising historical lagged information or deploy a set of technical indicators for forecasting. However, stock markets tend to be deeply interlinked through cross-country trade, foreign policies, etc. Hence, including other markets that have a profound causal influence as explanatory variables can improve the quality of forecasts. The present work takes this into account by carefully examining these associations in addition to using standard technical indicators.
This study explores the temporal dynamics and interrelationships of four stock indexes (KOSPI, BSE, JSX, and TWSE) and proposes a predictive analytics framework for forecasting their future movements. The random walk hypothesis is tested first. Then fractal modeling and recurrence analysis are used to detect the presence of Brownian motion in the evolutionary dynamics of the respective time series. The magnitudes of the trend, periodicity, and random components are determined with the help of RP and RQA. The causal nexus among the considered indexes is determined at different lags by using the CCF. Granger causality analysis in a VAR environment is applied to explore the causal interactions and the dependence structure. The MODWT is used to decompose the stock indexes into linear and nonlinear subcomponents. Then the machine learning algorithms SVR, EN, RF, ERT, boosting, DNN, and LSTMN are applied for recognizing patterns and predicting future movements.

This research contributes to the literature by proposing a novel research framework for exploring the temporal dynamics and interrelationships, and for predicting future movements with enhanced prediction accuracy. The predictive analytics component presents a granular forecasting structure comprising wavelet decomposition and state-of-the-art machine learning and deep learning algorithms. Usually, predictive analytics of stock markets uses technical features in a multivariate setup and wavelet-based time series decomposition in a univariate setup separately. We combine them in an integrated multivariate setup that incorporates other significant independent variables discovered through the causality analysis.

The remainder of this paper is structured as follows. Section 2 presents the research problem studied in this paper. Section 3 presents the data profile and emphasizes the key statistical properties of the dataset. Section 4 elucidates the details of the research methods employed in this study. Section 5 presents and critically analyzes the overall findings in terms of empirical inspection for gaining deeper insights into the temporal dynamics of the considered markets, association and causal interplay, and predictive performance. Finally, Section 6 concludes the article by highlighting the overall contributions, limitations, and future scope of work.

2. Problem studied

Predictive analysis of stock markets can broadly be categorized into two strands. When the objective is to estimate the absolute figures of stock prices or future returns, mostly regression-based forecasting models are deployed to accomplish the task [12,22]. The other strand of predictive modeling typically attempts to determine the direction of future movements of stock prices. In general, classification algorithms are used to tackle directional predictive modeling [25]. The target variables of the first and second categories are continuous and nominal in nature, respectively.

Our work belongs to the first category of predictive modeling. The proposed research framework aims to estimate one-day-ahead forecasts of the actual closing prices of four Asian stock indexes, namely BSE, TWSE, JSX, and KOSPI, in a multivariate setup. The chosen stock proxies represent developing countries in Asia, which makes examining the temporal characteristics of these indexes, assessing causal interaction, and building predictive modeling frameworks extremely important. Emerging markets are more attractive than their developed counterparts to traders and other market players. Hence, there lies an excellent opportunity for profit making, which eventually enhances the financial health of the nations in the long term. Thus, portfolio management and algorithmic trading can benefit immensely if precise future projections of these markets can be achieved. On the other hand, the problem is extremely challenging as external events occasionally hinder the growth of emerging economies, and this is easily transmitted to the stock indexes of these countries, reflecting the chaotic behavior. The endeavor of the present study is to present an integrated framework to critically evaluate the evolutionary temporal patterns of the considered financial time series, comprehend the structure of their interrelationships, and yield one-day-ahead forecasts of prices.

3. Data profile and characteristics

Daily closing prices of JSX, BSE, KOSPI, and TWSE for the period January 2012 to June 2017 are collected from the ‘Metastock’ data repository for investigation. Fig. 1 depicts the temporal movement of the respective indexes during the considered period.

The descriptive statistics of the respective financial time series are shown in Table 1.

Table 1
Descriptive statistics.

Measures                 BSE         TWSE       JSX        KOSPI
Mean                     23772.52    8561.78    4791.51    1983.32
Median                   25313.74    8573.72    4836.03    1983.80
Maximum                  30133.35    9973.12    5726.53    2209.46
Minimum                  15948.10    6894.66    3654.58    1769.31
Std. Dev.                4057.22     740.10     444.10     72.24
Skewness                 −0.2994     −0.1656    −0.1242    0.0375
Kurtosis                 −1.4090     −0.9167    −0.9757    0.4532
Jarque–Bera Test         122.390*    49.769*    53.085*    53.085*
Weisberg–Bingham Test    0.90493*    0.97592*   0.97499*   0.99285*
Frosini Test             2.5252*     0.8311*    1.1093*    0.5649*
Hegazy Green Test        0.095506*   0.022994*  0.024118*  0.007253*
Shapiro Wilk Test        0.90381*    0.97492*   0.97419*   0.99260*

*Significant at 1% significance level.

The test statistic values confirm that none of the series follows a normal distribution. Therefore, the use of a nonparametric research framework for making predictions is justified. The stationarity of the stock indexes is examined using the ADF, ERS, PP, and ZA unit root tests. The ZA test finds the presence of structural breaks in the time series while examining the existence of unit roots. For traditional econometric analysis, it is essential to perform this exercise. Table 2 presents the outcome of the unit root tests at level and at first order difference. The results show that each series is first order stationary.

Table 2
Stationarity check.

Stationarity check: Unit root tests
Stationarity Test   BSE         TWSE        JSX         KOSPI
ADF                 0.1815#     −0.0224#    −0.2710#    −1.8359#
PP                  −3.9397#    −4.4078#    −5.0133#    −5.7931#
ZA                  −4.8844#    −4.7650#    −4.2407#    −5.5179#
ERS                 0.1468#     −0.3758#    −0.1253#    −0.4659#

Stationarity check: First order difference
Stationarity Test   BSE         TWSE        JSX         KOSPI
ADF                 −32.6301*   −34.4738*   −22.7578*   −34.7960*
PP                  −32.5902*   −34.4751*   −32.8546*   −34.9926*
ZA                  −32.7404*   −34.5996*   −32.8307*   −35.0533*
ERS                 −8.8358*    −4.3611*    −13.7197*   −5.5534*

# Not significant.
*Significant at 1% significance level.

Identification of the presence of unit roots is essential for analyzing causal interactions through Granger causality tests. Since all four markets are found to be first order stationary, I(1) return series are considered for econometric analysis to assess the direction of causation. Evidence from the statistical tests implies that the daily closing prices of the considered stocks are
characterized by their nonparametric nature and nonstationary behavior. The presence of these properties justifies the deployment of advanced machine learning and deep learning algorithms for the predictive modeling exercise. However, such sophisticated algorithms may not succeed if the time series follows a Brownian motion. Therefore, it is important to test the random walk hypothesis.

Fig. 1. Temporal evolution.

4. Research methods

This section describes the research framework deployed to accomplish the objectives. Broadly, the research methods fall into three categories: nonlinear dynamics for empirical investigation, econometric modeling for examining interaction, and predictive analytics for forecasting. These approaches are explained in sequence.

4.1. Nonlinear dynamics

Nonlinear dynamics tools are used to check the random walk hypothesis and gain deeper insights into the temporal evolutionary patterns of the considered financial time series. Fractal modeling and recurrence analysis are performed for empirical investigation.

4.1.1. Fractal modeling
Fractal modeling is often used to test the efficient market hypothesis [21,25]. The fractal dimensional index (D) and the Hurst exponent (H) are calculated using rescaled range (R/S) analysis to discover the presence of long or short memory structures in the time series.

4.1.1.1. R/S analysis and Hurst exponent. R/S analysis was originally formulated by Hurst [26] to study the level of the river Nile for the construction of reservoirs. Mandelbrot and Wallis [27] subsequently proposed improvements to the method, which is outlined below:

a. Decompose the time series {R_N} with N observations into d subseries R(i, d), (i = 1, 2, ..., n), where n is the length of the subseries.
b. Calculate E_d, the average of the decomposed series.
c. Estimate the accumulated deviation
$$X(i, d) = \sum_{k=1}^{i} \{R(k, d) - E_d\} \tag{1}$$
d. Determine the range
$$R_d = \max\{X(i, d)\} - \min\{X(i, d)\} \tag{2}$$
e. Determine the standard deviation
$$S_d = \sqrt{\frac{1}{n} \sum_{k=1}^{n} \{R(k, d) - E_d\}^2} \tag{3}$$
f. Determine the rescaled range of all the subseries
$$(R/S)_n = \frac{1}{A} \sum_{d=1}^{D} (R_d / S_d) \tag{4}$$

The Hurst exponent and the R/S statistic have the following relationship:
$$K n^{H} = (R/S)_n, \quad K \text{ is a constant.} \tag{5}$$
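Steps a to f and the power law in Eq. (5) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the window sizes, and the synthetic test series are illustrative assumptions, and H is read off as the least-squares slope of log (R/S)_n against log n, which is Eq. (5) after taking logarithms.

```python
import math
import random


def rescaled_range(sub):
    """R/S of one subseries (steps b to e); returns None if it is constant."""
    n = len(sub)
    mean = sum(sub) / n                        # E_d
    running, deviations = 0.0, []
    for value in sub:                          # accumulated deviation X(i, d)
        running += value - mean
        deviations.append(running)
    r = max(deviations) - min(deviations)      # range R_d
    s = math.sqrt(sum((v - mean) ** 2 for v in sub) / n)  # std. dev. S_d
    return r / s if s > 0 else None


def hurst_exponent(series, window_sizes):
    """Slope of log (R/S)_n on log n over non-overlapping windows."""
    log_n, log_rs = [], []
    for n in window_sizes:
        ratios = []
        for start in range(0, len(series) - n + 1, n):
            rs = rescaled_range(series[start:start + n])
            if rs is not None and rs > 0:
                ratios.append(rs)
        if ratios:                             # step f: average over subseries
            log_n.append(math.log(n))
            log_rs.append(math.log(sum(ratios) / len(ratios)))
    mean_x = sum(log_n) / len(log_n)
    mean_y = sum(log_rs) / len(log_rs)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_n, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_n)
    return num / den


# Synthetic data only (assumption): uncorrelated noise and its running sum.
random.seed(42)
noise = [random.gauss(0.0, 1.0) for _ in range(2048)]
walk, level = [], 0.0
for step in noise:
    level += step
    walk.append(level)

windows = [16, 32, 64, 128, 256]
h_noise = hurst_exponent(noise, windows)
h_walk = hurst_exponent(walk, windows)
```

On the uncorrelated noise the estimate should fall near 0.5 (R/S estimates on short windows are known to be biased somewhat upward), while the strongly persistent running sum should give a clearly larger value.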
The magnitude of H is estimated through curve fitting:
$$\log K + H \log n = \log (R/S)_n \tag{6}$$

The values of H vary between 0 and 1. If H = 0.5, then the series elements are uncorrelated and hence there exists a pure random walk. A long memory trend is detected when the value of H is significantly greater than 0.5, and a short memory trend is detected when the value of H is less than 0.5.

4.1.1.2. Fractal dimensional index. Empirical evidence suggests that apparently random-looking financial assets have some degree of predictability embedded in their temporal dynamics. D, a non-integer dimension, represents the working principle of any chaotic system [21,28]. By estimating the magnitude of D, the underlying evolutionary characteristics of the four stock indexes can be examined. The relationship between D and H is as follows:
$$D = 2 - H \tag{7}$$
The values of D vary between 1 and 2. A value of D equal to 1.5 implies the existence of a pure random walk. Long and short memory trends are detected in the ranges 1 < D < 1.5 and 1.5 < D < 2, respectively. A value of D closer to 1 signifies long-memory dependence, also known as the ‘Joseph’s Effect’, while a value closer to 2 implies short-memory dependence, also known as the ‘Noah’s Effect’.

4.1.1.3. Correlation between periods (C_N). It is a measure for quantifying the magnitude of a persistent or anti-persistent trend [10,25]. C_N is estimated as:
$$C_N = 2^{(2H - 1)} - 1 \tag{8}$$
For an ideal random time series, the value of C_N is 0. A persistent time series is characterized by positive values of C_N, while an anti-persistent time series is characterized by negative values of C_N. If C_N = 0.85, then 85% of the dataset under investigation is influenced by its own historical information.

4.1.2. Recurrence plot (RP)
It is a graphical tool that accounts for the recurrence in higher dimensional phase space. The idea was further exploited and numerically developed by Eckmann [29]. Sometimes visualization with the help of an RP may not lead to a clear interpretation. To overcome this obstacle, recurrence quantification analysis was proposed [30].

4.1.3. Recurrence quantification analysis (RQA)
This is a data modeling process capable of delivering valuable insights beyond the capability of the RP by quantifying the true nature of dynamical systems through measures such as REC, DET, TT, and LAM. REC represents the percentage of recurrent points. A higher REC implies the existence of a periodic pattern and a smaller REC implies random behavior. The overall deterministic structure of a system is measured by DET, which estimates the recurrence points that form diagonal lines in the RP. TT reflects the average length of the vertical line structures. It is an estimate of the average duration that the system stays in a state. LAM is a measure of the quantity of recurrence points. Smaller LAM values indicate the chaotic nature of the dataset.

4.2. Econometric modeling

To comprehend the nature of the association structure, conventional Pearson’s correlation and cross-correlation tests are applied. To inspect the causal interrelationship for identifying predictors, the Granger causality test is used.

4.3. Predictive modeling

This subsection presents a granular level forecasting approach for predictive modeling of the respective stock indexes in a multivariate framework. The MODWT is used to decompose each series into linear and nonlinear components.

Seven pattern recognition algorithms belonging to the machine and deep learning paradigm, namely SVR, EN, RF, ERT, Boosting, DNN, and LSTMN, are then applied to obtain component-wise forecasts. These combined models are denoted as M-SVR, M-EN, M-RF, M-ERT, M-Boosting, M-DNN, and M-LSTMN. The final forecast is obtained by adding the forecasts of the individual decomposed components. The multivariate framework uses as input variables the stock indexes that significantly impact a particular index, together with a set of technical indicators as other independent variables.

4.3.1. MODWT
This decomposition method segregates a signal into a time varying scale and can efficiently separate nonlinearity and other random components that are embedded in the financial data while preserving inherent features like spillovers, heteroscedasticity, and volatility clustering [19,29,31]. MODWT is a highly redundant transformation technique that has several advantages over the traditional discrete wavelet transform. It translates and dilates an original function f(t) onto a father wavelet ϕ(t) and a mother wavelet ψ(t) at predefined scales. If P_{it} is the i-th time series value at time t, then it can be written as follows:
$$P_{it} = V_J^{P_i}(t) + W_J^{P_i}(t) + W_{J-1}^{P_i}(t) + \cdots + W_1^{P_i}(t), \tag{9}$$
where
$$V_J^{P_i}(t) = \sum_k \phi_{J,k}^{P_i}\, \varphi_{J,k}(t), \qquad W_j^{P_i}(t) = \sum_k \psi_{j,k}^{P_i}\, \omega_{j,k}(t),$$
$$\varphi_{J,k}(t) = 2^{-J/2}\, \varphi\!\left(\frac{t - 2^{J} k}{2^{J}}\right), \quad \text{and} \quad \omega_{j,k}(t) = 2^{-j/2}\, \omega\!\left(\frac{t - 2^{j} k}{2^{j}}\right).$$

The inverse wavelet transform can be written as:
$$f(t) = \sum_{l=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} \tilde{W}_{l,k}\, \psi_{l,k}(t) \tag{10}$$
where $\tilde{W}_{l,k} = \int_{-\infty}^{\infty} f(t)\, \psi_{l,k}(t)\, dt$ is the discrete wavelet transform of f(t).

The MODWT technique is adopted to decompose the daily closing prices of BSE, TWSE, JSX, and KOSPI. Unlike the traditional DWT, the MODWT does not need a dyadic dataset. It is invariant to a circular shift as well. Overall, the MODWT performs a robust and non-orthogonal transformation that retains the downsampled values at the respective levels of decomposition. The empirically observed MODWT estimates are more efficient than the estimates obtained from other techniques [21]. The study uses ‘haar’ wavelets to obtain six levels of decomposition for the respective stock indexes. Hence, six wavelet coefficient series and one scaling coefficient series are generated.

4.3.2. Support vector regression (SVR)

SVR is a popular machine learning algorithm used for various challenging predictive modeling problems [32]. It performs the regression through nonlinear pattern mining and discovering the linear separation boundary through quadratic optimization. The R package ‘kernlab’ is used for implementation of the model.
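To make the granular scheme of Section 4.3 concrete, the following Python sketch decomposes a series with a six-level Haar MODWT and recombines one-step-ahead component forecasts by addition. It rests on several assumptions that are not taken from the paper: circular (periodic) boundary handling, a synthetic price series, and a naive persistence forecaster that merely stands in for SVR or any of the other learners. All function names are hypothetical.

```python
import math


def haar_modwt(series, levels):
    """Circular Haar MODWT: returns ([W_1, ..., W_J], V_J)."""
    n = len(series)
    smooth = [float(v) for v in series]        # V_0 is the series itself
    details = []
    for j in range(1, levels + 1):
        lag = 2 ** (j - 1)                     # filter is upsampled at each level
        prev = smooth
        # Haar MODWT filters: scaling (1/2, 1/2), wavelet (1/2, -1/2)
        smooth = [0.5 * (prev[t] + prev[(t - lag) % n]) for t in range(n)]
        details.append([0.5 * (prev[t] - prev[(t - lag) % n]) for t in range(n)])
    return details, smooth


def persistence_forecast(component):
    """Placeholder learner (assumption): tomorrow equals today."""
    return component[-1]


def granular_forecast(series, levels=6, forecaster=persistence_forecast):
    """Forecast every component separately, then add the forecasts."""
    details, smooth = haar_modwt(series, levels)
    return sum(forecaster(component) for component in details + [smooth])


# Synthetic series for illustration only.
prices = [100.0 + 0.3 * t + 5.0 * math.sin(t / 4.0) for t in range(128)]
details, smooth = haar_modwt(prices, levels=6)

# With the Haar filter the component series add back to the input,
# mirroring the additive form of Eq. (9).
reconstructed = [smooth[t] + sum(w[t] for w in details) for t in range(len(prices))]
max_error = max(abs(a - b) for a, b in zip(prices, reconstructed))
forecast = granular_forecast(prices)
```

No downsampling takes place, so each of the seven component series has the same length as the input and works for any sample size. Replacing `persistence_forecast` with a trained regressor per component would give the M-prefixed models in spirit, although the exact feature construction used in the study is not reproduced here.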
4.3.3. Elastic net (EN)
This is a regression approach that mixes Lasso and Ridge regressions for feature selection and regularization to solve problems related to predictive modeling [33]. The approach is built upon the principle of OLS estimation by optimizing the sum of squared residuals. In a dataset containing N samples and p predictors (x_1, ..., x_p), the response (y_i) in a multiple regression framework is expressed as:
$$y_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p, \tag{11}$$
and the coefficient vector $\hat{\beta} = (\beta_0, \beta_1, \ldots, \beta_p)$ is determined as:
$$\hat{\beta} = \underset{(\beta_0, \beta) \in \mathbb{R}^{p+1}}{\operatorname{argmin}} \left[ \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^{T} \beta \right)^2 + \lambda P_\alpha(\beta) \right] \tag{12}$$
where $P_\alpha(\beta) = (1 - \alpha) \frac{1}{2} \|\beta\|_{l_2}^2 + \alpha \|\beta\|_{l_1}$, $\|\beta\|_{l_2}^2 = \sum_{j=1}^{p} \beta_j^2$, and $\|\beta\|_{l_1} = \sum_{j=1}^{p} |\beta_j|$.
The expression P_α(β) is termed the penalty of the elastic net algorithm. For α = 1, only the l1 term remains and the algorithm works as the Lasso regression, while for α = 0, only the l2 term remains and it works as the Ridge regression. The degree of shrinkage is controlled by λ, which is responsible for feature selection and shrinkage simultaneously. The R package ‘glmnet’ is used for simulating the model.

4.3.4. Random forest (RF)
RF is an ensemble machine learning tool that has garnered tremendous attention for its high precision, robustness to outliers, and applicability in predictive modeling tasks [34]. It has been extensively used for modeling arduous classification and regression tasks. The best feature for the branching operation at each node of an individual decision tree is determined from a randomly selected subset of features. The number of decision trees may vary in the range of one hundred to one thousand depending upon the complexity of the problem. The final class label for classification is determined through majority voting, and the aggregate value for continuous output through arithmetic averaging. The ‘rattle’ GUI package of R is used for simulation of RF.

4.3.5. Extremely randomized trees (ERT)
Like RF, ERT acts as an ensemble predictive analytics tool in which the thresholds for branching operations in the base learners are selected randomly, in addition to selecting a random subset of features [35]. Thus, it takes randomness to the next level to avoid overfitting. To implement the algorithm, the R package ‘extraTrees’ is used.

4.3.6. Boosting
Boosting is an ensemble predictive modeling algorithm in which each training tuple is assigned a weight value [36]. As base learners, a series of decisions are used for pattern recognition. After the successful completion of the learning process in one learner, the weights are updated to enable the rest of the learners to carry out deeper investigation on the training samples that account for a marginally higher rate of error. It computes the weighted average of the outcomes of the constituent base learners for the final prediction. In this work, the AdaBoost (Adaptive Boosting) algorithm [37] is adopted. Here, the ‘GAMBoost’ package of R is used for realizing the model. To perform regression tasks, the following steps are executed.

Step 1: Let (<x_1, y_1>, <x_2, y_2>, ..., <x_N, y_N>) be the N training samples in a set D. Initialize the weight of each sample to 1/d, where d is the cardinality of D.
Step 2: For each base learner (i) perform:
2.1: Draw bootstrapped samples to generate D_i.
2.2: Use the training set D_i to train the model M_i.
2.3: Compute the error of M_i (Err(M_i)).
2.4: If the computed error is greater than 0.5, repeat 2.1 to 2.3; else go to 2.5.
2.5: Update the weights of the samples in set D_i.
2.6: Normalize the weight values.
Step 3: Find the accuracy weights of each base learner.
Step 4: Combine the weighted outcomes of the base learners and obtain the final outcome.

4.3.7. Deep neural network (DNN)
The recent surge in deep learning has resulted in a plethora of models. DNN is an extension of the traditional shallow multi-layered feed forward neural network that incorporates multiple hidden layers for deep learning [38]. There exists a series of activation functions and optimization algorithms for the learning process of DNN [4,39]. It has lately been applied successfully in predictive analysis of time series observations. In this study, a DNN of three hidden layers with one hundred nodes in each is chosen. The rectified linear activation function is deployed. Stochastic gradient descent is used as the learning algorithm. The ‘Tensorflow’ platform and the Python programming language are utilized for executing the algorithm.

4.3.8. Long short-term memory network (LSTMN)
LSTM is another deep learning technique used for forecasting the future trends of the time series in the granular framework. It is a variant of the traditional RNN that can effectively counter the vanishing gradient problem [39]. LSTM comprises several memory cells to keep records of states and a series of gates (input, forget, and output gates) to control the flow of information. By regulating the information traffic and memory structure, it can effectively model long-range dependence. For learning the architecture, the BPTT algorithm is applied. The Python programming language is used in the ‘Tensorflow’ framework for practical implementation of the model.

4.4. Performance assessment

The forecasting models M-SVR, M-EN, M-RF, M-ERT, M-Boosting, M-DNN, and M-LSTMN are used for predicting future figures. To evaluate the efficiency of the respective models, the following six measures are used:
$$NSC = 1 - \frac{\sum_{i=1}^{N} \{Y_{act}(i) - Y_{pred}(i)\}^2}{\sum_{i=1}^{N} \{Y_{act}(i) - \bar{Y}_{act}\}^2}, \tag{13}$$
$$IA = 1 - \frac{\sum_{i=1}^{N} (Y_{act}(i) - Y_{pred}(i))^2}{\sum_{i=1}^{N} \{ |Y_{pred}(i) - \bar{Y}_{act}| + |Y_{act}(i) - \bar{Y}_{act}| \}^2}, \tag{14}$$
$$TI = \frac{\left[ \frac{1}{N} \sum_{i=1}^{N} (Y_{act}(i) - Y_{pred}(i))^2 \right]^{1/2}}{\left[ \frac{1}{N} \sum_{i=1}^{N} Y_{act}(i)^2 \right]^{1/2} + \left[ \frac{1}{N} \sum_{i=1}^{N} Y_{pred}(i)^2 \right]^{1/2}}, \tag{15}$$
$$RMSE = \left[ \frac{1}{N} \sum_{i=1}^{N} (Y_{act}(i) - Y_{pred}(i))^2 \right]^{1/2} \tag{16}$$
$$MAD = \frac{1}{N} \sum_{i=1}^{N} |Y_{act}(i) - Y_{pred}(i)| \tag{17}$$
$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{Y_{act}(i) - Y_{pred}(i)}{Y_{act}(i)} \right| \times 100 \tag{18}$$
where Y_pred(i) and Y_act(i) are the predicted and actual values, respectively, and $\bar{Y}_{act}$ is the mean of the actual values. For a good prediction, NSC values should vary between −
∞ to 1. If the value is close to 1, then the accuracy of prediction is better. Similarly, IA values closer to 1 indicate efficient predictions, while values nearer to 0 suggest poor forecasts. Unlike the other two metrics, TI should be close to 0 for efficient predictive modeling. The RMSE, MAD and MAPE should be minimum for efficient forecasting. Unlike the other measures, RMSE, MAD and MAPE can also be greater than 1, depending upon the range of the target variables. To keep uniformity, the actual and predicted values are rescaled to lie between 0 and 1 while computing RMSE, MAD and MAPE.
Fig. 2 represents a flowchart of the proposed research framework.

5. Results and discussions

This section outlines the findings and key insights obtained from nonlinear dynamics, econometric and predictive modeling frameworks, respectively. The outcome of one segment of the research method assists in triggering the subsequent methods in a systematic manner.

5.1. Empirical investigation through nonlinear dynamics

As stated earlier, the fractal dimensional index and Hurst exponent are estimated to test for the presence of a random walk in the temporal dynamics of the daily closing prices of the four stock indexes, and then meaningful behavioral properties of the temporal evolutionary pattern of the said indexes are analyzed through RP and RQA. Table 3 summarizes the outcomes of fractal analysis.

Table 3
Findings of fractal inspection.
Series  H       D       CN      NoP         Effect             RWH
BSE     0.8907  1.1093  0.7189  Persistent  ‘Joseph’s Effect’  Rejected
TWSE    0.8894  1.1116  0.7156  Persistent  ‘Joseph’s Effect’  Rejected
JSX     0.8934  1.1066  0.7252  Persistent  ‘Joseph’s Effect’  Rejected
KOSPI   0.8893  1.1107  0.7154  Persistent  ‘Joseph’s Effect’  Rejected
NoP: Nature of Pattern, RWH: Random Walk Hypothesis.

The H values in Table 3 are significantly higher than 0.5 and close to 0.9 for all the indexes. This implies that all four indexes are driven by a biased random walk. Therefore, the structural presence of long memory dependence is established, and the Efficient Market Hypothesis is rejected for these indexes. Hence, the deployment of a granular wavelet based advanced prediction model is duly justified. The higher values of CN further suggest the autoregressive characteristics of the time series, which in turn recommends employing technical indicators for constructing a predictive framework for projecting future trends, as the markets are significantly driven by their past information. For the remaining empirical analyses, RP and RQA are used to validate the findings of fractal modeling and to gain deeper insights. Fig. 3 displays the RPs of the respective stock indexes.
Except for KOSPI, thick diagonal lines can be observed in all the plots. The main diagonal of KOSPI is comparatively thinner than its counterparts. Small disjoint segments can be observed in BSE and TWSE. It is apparent from the structure of the RPs that none of the financial time series exhibits purely chaotic properties. To ascertain more about the structural behavior, it is essential to look at the RQA parameters that are tabulated in Table 4.

Table 4
Results of RQA.
Index  REC (%)  DET (%)  LAM (%)  TT (%)
BSE    1.5782   92.2571  95.7059  9.1077
TWSE   1.1106   87.0099  91.8132  6.8206
JSX    0.9208   88.3117  94.1126  6.8065
KOSPI  0.2460   70.4121  79.0691  2.8463

It can be observed that the recurrence rates of all four financial time series are not on the higher side, and hence indicate a lower degree of periodicity. The REC value of KOSPI is the lowest. The higher values of DET and LAM support the deterministic structure. Therefore, the presence of higher order deterministic chaos can be implied. Like the recurrence rate, the degree of determinism in KOSPI is lower than in the other three series, which justifies the usage of a sophisticated forecasting framework for estimating future figures. Moderate TT values imply that temporal movements are not restrained to a particular state for a long duration. State changes are not highly abrupt, as the values of DET and LAM are considerably higher than those of a completely chaotic signal. Overall, from the analysis it can be concluded that the time series are not perfectly deterministic signals, but rather exhibit higher order deterministic chaos.
The findings from recurrence analyses further justify the implications of fractal modeling. It should be noted that nonstationary and nonparametric behavior were detected beforehand. This justified the use of a wavelet based granular approach for forecasting. Furthermore, the findings of nonlinear dynamics rejected the efficient market hypothesis for the considered markets by showing evidence of autoregressive behavior. Hence, the use of technical indicators for building forecasting models is justified. This is an extremely important finding, as adding insignificant features may result in overfitting. It also sets the stage for examining the causal nexus, because a meticulous inquiry into interrelationships among markets dominated by random walks would be a futile exercise. Both fractal modeling and recurrence analyses are used for understanding the key nonparametric statistical properties of the selected stock markets and determining the need for technical indicators in the granular forecasting framework. Next, we report the outcome of tests of association and causal interaction among these indexes.

5.2. Outcome of association and causality tests

To study the interrelationship, Pearson’s correlation coefficient, cross-correlation and Granger’s causality tests are performed on the considered indexes. The results are reported in Table 5 and CCF plots are shown in Fig. 4.

Table 5
Pearson’s correlation test.
Index  BSE      TWSE     JSX      KOSPI
BSE    –
TWSE   0.8560*  –
JSX    0.7955*  0.8774*  –
KOSPI  0.5912*  0.7180*  0.6289*  –
*Significant at 1% significance level.

CCF plots display the association between a pair of time series at various lags through computation of cross-correlation functions. It is apparent from Fig. 4 that the pairs BSE-TWSE and BSE-JSX exhibit significantly high positive associations, and the pairs TWSE-JSX and TWSE-KOSPI exhibit moderately high positive associations. However, association measured through traditional correlation tests often fails to extract deeper insights. To further delve into the causal interplay, Granger causality tests are performed in a pairwise manner. The test framework requires the indexes under investigation to be stationary. In our case, the four indexes are non-stationary, and their first order differences are stationary. Therefore, the return series of the respective stock indexes (BSER, JSXR, TWSER, and KOSPIR) are calculated to accomplish this task.
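The conversion from closing prices to a return series can be sketched as follows. This is a minimal illustration that assumes simple one-period returns, r_t = (P_t − P_{t−1}) / P_{t−1}; the text does not state whether simple or logarithmic returns were used, and the function name and sample prices are hypothetical.

```python
def simple_returns(prices):
    """Return r_t = (P_t - P_{t-1}) / P_{t-1} for t = 1 .. n-1.

    Assumption: simple (not logarithmic) returns; the paper does not
    specify which definition underlies BSER, JSXR, TWSER and KOSPIR.
    """
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices, used only to exercise the function.
closing = [100.0, 102.0, 99.96, 101.9592]
returns = simple_returns(closing)
```

The resulting series has one observation fewer than the price series, which is the usual cost of first-order differencing.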
Fig. 2. Proposed research framework.

It is also well known that the outcome of causality assessment in a VAR environment is highly sensitive to the model specification. This phenomenon is also known as structural instability. To mitigate this issue, the lag length of the VAR framework is chosen according to the Akaike information criterion (AIC). Table 6 portrays the results.
In Table 6, LR denotes the test statistic value of the sequential modified LR (likelihood-ratio) test at 5% significance level, FPE denotes the final prediction error, AIC denotes the Akaike information criterion, SC denotes the Schwarz information criterion, and HQ denotes the Hannan–Quinn information criterion. The lowest value of AIC is obtained at lag order 6. So, causality analysis is carried out on this basis and the results are presented in Table 7.
It can be observed that BSE returns are only affected by KOSPI returns and the causal structure is unidirectional. TWSE returns are significantly influenced by both BSE and KOSPI returns. KOSPI returns significantly influence BSE returns; however, the influence is not bidirectional. The JSX returns are influenced by TWSE and KOSPI returns. KOSPI returns are found to be affected by none of the other three returns. Next, the IR is estimated for assessing the impact of a shock from one asset on another.
It can be noticed from Fig. 5 that in the short run, shocks of their own movement tend to play a significant role in volatility. To further justify the claim, variance decomposition is performed, and the results are presented in Table 8.
The percentages of variance in returns of all the four stock indexes are largely explained by themselves in the short run.
Indexes that are found to possess an impact on the others by Granger causality have a marginal impact on variance as well. Hence, the overall findings of causation analysis are validated through the outcomes of variance decomposition. Results of causal interactions help in comprehending the structure of interrelationships.

Fig. 3. Recurrence plots.

Table 6
Results of lag selection.
Lag  logL      LR         FPE        AIC         SC         HQ
1    16557.11  113.0513   3.20e−17   −26.62981   −26.54729  −26.59877
2    16597.22  79.64191   3.08e−17   −26.66863   −26.52010  −26.61278a
3    16628.41  61.71862   3.00e−17   −26.69309   −26.47854  −26.61240
4    16656.50  55.40771   2.94e−17   −26.71255   −26.43199  −26.60704
5    16668.70  23.98722   2.96e−17   −26.70643   −26.35985  −26.57610
6    16699.07  59.52533a  2.89e−17a  −26.72958a  −26.31698  −26.57442
7    16712.06  25.37865   2.91e−17   −26.72474   −26.24612  −26.54475
8    16722.94  21.18071   2.93e−17   −26.71649   −26.17186  −26.51168
9    16728.80  11.36934   2.98e−17   −26.70016   −26.08952  −26.47053
10   16737.66  17.14174   3.02e−17   −26.68867   −26.01201  −26.43421
11   16745.25  14.61423   3.06e−17   −26.67511   −25.93244  −26.39583
12   16756.77  22.14324   3.08e−17   −26.66791   −25.85922  −26.36380
a Selected lag order.

Table 7
Causality analysis.
Null Hypothesis        Chi-square  Probability  Result
TWSER ‘DNGC’ BSER      9.7244#     0.1367       Accepted
BSER ‘DNGC’ TWSER      37.3059*    0            Rejected
JSXR ‘DNGC’ BSER       4.0971#     0.6635       Accepted
BSER ‘DNGC’ JSXR       10.4576#    0.1067       Accepted
KOSPIR ‘DNGC’ BSER     71.9597*    0            Rejected
BSER ‘DNGC’ KOSPIR     10.2708#    0.1137       Accepted
JSXR ‘DNGC’ TWSER      4.2344#     0.6450       Accepted
TWSER ‘DNGC’ JSXR      31.1601*    0            Rejected
KOSPIR ‘DNGC’ TWSER    70.9776*    0            Rejected
TWSER ‘DNGC’ KOSPIR    5.1313#     0.5271       Accepted
KOSPIR ‘DNGC’ JSXR     16.2216**   0.0126       Rejected
JSXR ‘DNGC’ KOSPIR     7.5716#     0.2712       Accepted
DNGC: ‘does not Granger Cause’.
*Significant at 1% level.
**Significant at 2% significance level.
# Not significant.

5.3. Predictive modeling performance

As explained earlier, MODWT based hybrid predictive models are applied for making predictions. Proper selection of explanatory constructs plays a pivotal role in a predictive modeling exercise. This paper uses a few well-known technical indicators and the stock indexes found to have significant causal influence through Granger causality assessment as explanatory variables. Technical indicators are mostly computed from historical prices and trading volume information. The estimated technical indicators assist traders in making critical decisions regarding buying or selling of shares [40–42]. Some widely used technical indicators are MA, bias, MACD, EMA, RSI, CCI, Bollinger band, momentum, WR, PSY, and STOC. Table 9 summarizes the settings of the dependent and independent constructs.
Initially, the MODWT technique is used for decomposing the individual series. Figs. 6–9 show the graphical illustration of the MODWT decomposition of BSE, TWSE, JSX and KOSPI.
For the predictive exercise, the original dataset is partitioned into 85% training and 15% test data. However, unlike randomly splitting the entire data into training and test datasets, daily
observations from January 2012 to June 2016 are selected as the training dataset and July 2016 to June 2017 as the test dataset. This alignment of training and test datasets assists in judging the extrapolation capability of the respective predictive models. Several process parameters of the individual frameworks are varied to run twenty experimental trials for each model, and the average values of performance indicators are computed to determine the predictive accuracy. Systematic hyperparameter tuning of the predictive algorithms, rather than varying the process parameters over repeated trials, is generally important for obtaining higher prediction accuracy. However, because the proposed framework already yielded predictions of good accuracy, process parameter tuning in a combinatorial optimization setup that traverses the search space has not been performed in this study. Table 10 presents the average values of performance indicators for both training and test datasets for all the algorithms.

Fig. 4. CCF plots.

Fig. 5. Impulse response plots.

The obtained results show that the NSC and IA values are considerably high for all the methods on both datasets, and at the same time the TI values are considerably low. Therefore, the future values of the considered stock indexes can be estimated with a high degree of precision. Table 11 summarizes the performance of the respective frameworks in terms of RMSE, MAD, and MAPE.
Table 11 shows that the RMSE, MAD, and MAPE figures are almost negligible. Therefore, like the previous evaluation using the other three indicators (NSC, IA and TI), this assessment also implies accuracy of forecasts. This justifies the use of the predictive analytics models for the selected stock indexes. Figs. 10–13 portray the graphical representation of the performance of the respective models on selected test samples.
It is important to ascertain the statistical significance of the respective predictive models. This study adopts two statistical approaches, namely, the test for EPA and the test for SPA, to accomplish the task. To carry out the EPA of the forecasting models, the pairwise DM test is carried out on the performance of the models on the test dataset, whereas the MCS [43] is used to execute SPA in order to rank the respective models according to their performance. Tables 12 to 15 present the outcome of the test for individual stock markets. Since the test operates on a pairwise manner and

Table 8
Results of variance decomposition.
Variance decomposition of BSER Variance Decomposition of TWSER
Period S.E. BSER JSXR KOSPIR TWSER S.E. BSER JSXR KOSPIR TWSER
1 0.008982 100.0000 0.000000 0.000000 0.000000 0.007875 0.205620 0.722181 1.785545 97.28665
2 0.009036 98.99794 0.082237 0.867757 0.052062 0.007999 2.916552 0.742326 1.837086 94.50404
3 0.009229 95.52101 0.089382 4.339593 0.050018 0.008043 2.904503 0.825656 2.436984 93.83286
4 0.009303 94.02251 0.355465 5.563872 0.058151 0.008121 2.918850 0.852707 4.085395 92.14305
5 0.009335 93.56958 0.398789 5.912645 0.118989 0.008237 2.841315 0.828921 6.082163 90.24760
6 0.009341 93.44576 0.426611 5.915991 0.211634 0.008259 2.831366 0.866105 6.320251 89.98228
7 0.009369 92.89871 0.429150 5.912064 0.760073 0.008329 3.003217 0.968815 7.434779 88.59319
8 0.009373 92.84583 0.430303 5.956518 0.767349 0.008339 3.004913 0.970754 7.569805 88.45453
9 0.009375 92.80273 0.430084 5.997269 0.769920 0.008348 2.999829 0.969569 7.725106 88.30550
10 0.009375 92.79953 0.430862 5.997610 0.772002 0.008352 2.998472 0.969604 7.792235 88.23969
Variance decomposition of JSXR Variance Decomposition of KOSPIR
1 0.009630 0.028592 99.97141 0.000000 0.000000 0.007723 0.591027 0.052602 99.35637 0.000000
2 0.009708 0.336855 98.46987 0.419170 0.774108 0.007744 0.671935 0.249709 98.83037 0.247983
3 0.009814 0.614418 96.57287 0.709657 2.103056 0.007754 0.917619 0.249152 98.58560 0.247628
4 0.009975 0.702574 95.22223 1.993675 2.081518 0.007762 0.916224 0.448145 98.37640 0.259235
5 0.010013 0.697248 94.71827 2.505915 2.078566 0.007772 0.919757 0.471848 98.32993 0.278463
6 0.010047 1.128872 94.08723 2.703251 2.080653 0.007794 0.991360 0.522083 98.05634 0.430213
7 0.010062 1.139295 93.96809 2.738251 2.154362 0.007817 1.300085 0.604327 97.66169 0.433896
8 0.010069 1.150219 93.88164 2.753659 2.214481 0.007817 1.300001 0.609531 97.65406 0.436412
9 0.010071 1.158017 93.85959 2.760678 2.221711 0.007818 1.308093 0.609341 97.64107 0.441496
10 0.010076 1.166081 93.80624 2.803346 2.224330 0.007820 1.312007 0.610393 97.63421 0.443389

Table 9
Dependent and predictor variables.
Dependent  Predictor
BSE        MA (10 days), bias (20 days), EMA (10 days and 20 days), MACD (9 days), upper Bollinger band (20 days), lower Bollinger band (20 days), RSI (14 days), momentum (10 days), WR (10 days), CCI (14 days), KOSPI.
TWSE       MA (10 days), bias (20 days), EMA (10 days and 20 days), MACD (9 days), upper Bollinger band (20 days), lower Bollinger band (20 days), RSI (14 days), momentum (10 days), WR (10 days), CCI (14 days), BSE, KOSPI.
JSX        MA (10 days), bias (20 days), EMA (10 days and 20 days), MACD (9 days), upper Bollinger band (20 days), lower Bollinger band (20 days), RSI (14 days), momentum (10 days), WR (10 days), CCI (14 days), TWSE, KOSPI.
KOSPI      MA (10 days), bias (20 days), EMA (10 days and 20 days), MACD (9 days), upper Bollinger band (20 days), lower Bollinger band (20 days), RSI (14 days), momentum (10 days), WR (10 days), CCI (14 days).
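A few of the predictors in Table 9 can be sketched in code. The definitions below are common textbook forms and are assumptions rather than the formulas used in this study: MA is the arithmetic mean over the window, EMA uses the smoothing factor 2/(window + 1) seeded with the first price, bias is the percentage deviation of price from its MA, and momentum is the price change over the window. All function names and the sample prices are hypothetical.

```python
def moving_average(prices, window):
    """Simple moving average; one value per full window."""
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]


def exponential_moving_average(prices, window):
    """EMA with smoothing factor 2 / (window + 1), seeded with the first price."""
    alpha = 2.0 / (window + 1)
    ema = [prices[0]]
    for price in prices[1:]:
        ema.append(alpha * price + (1.0 - alpha) * ema[-1])
    return ema


def bias(prices, window):
    """Percentage deviation of each price from its moving average."""
    ma = moving_average(prices, window)
    return [100.0 * (p - m) / m for p, m in zip(prices[window - 1:], ma)]


def momentum(prices, window):
    """Price change over the given number of periods."""
    return [prices[i] - prices[i - window] for i in range(window, len(prices))]


# Hypothetical closing prices with a short window for readability.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
ma3 = moving_average(closes, 3)
ema3 = exponential_moving_average(closes, 3)
bias3 = bias(closes, 3)
mom3 = momentum(closes, 3)
```

With the window lengths of Table 9 (for example 10 days for MA and 20 days for bias), the first window − 1 observations of each series have no indicator value and would have to be dropped before model fitting.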

Table 10
Performance assessment in terms of NSC, IA and TI.
Performance of M-SVR Performance of M-EN
Training dataset Test dataset Training dataset Test dataset
NSC IA TI NSC IA TI NSC IA TI NSC IA TI
BSE 0.9902 0.9968 0.0067 0.989 0.994 0.008 0.990 0.995 0.0081 0.988 0.993 0.010
TWSE 0.992 0.9970 0.0069 0.990 0.994 0.009 0.991 0.996 0.007 0.988 0.994 0.008
JSX 0.991 0.9961 0.006 0.988 0.993 0.009 0.990 0.994 0.007 0.989 0.993 0.009
KOSPI 0.9871 0.994 0.009 0.983 0.991 0.012 0.988 0.994 0.010 0.984 0.990 0.012
Performance of M-ERT Performance of M-Boosting
Training dataset Test dataset Training dataset Test dataset
BSE 0.995 0.999 0.003 0.992 0.997 0.006 0.9959 0.9994 0.003 0.993 0.998 0.006
TWSE 0.996 0.999 0.003 0.993 0.998 0.006 0.997 0.999 0.003 0.994 0.998 0.005
JSX 0.995 0.999 0.004 0.991 0.997 0.008 0.995 0.999 0.003 0.992 0.997 0.007
KOSPI 0.992 0.998 0.006 0.989 0.997 0.009 0.993 0.999 0.006 0.9900 0.9979 0.0083
Performance of M-DNN Performance of M-RF
Training dataset Test dataset Training dataset Test dataset
BSE 0.996 0.9993 0.003 0.994 0.997 0.006 0.996 0.999 0.004 0.993 0.998 0.006
TWSE 0.997 0.999 0.003 0.995 0.998 0.005 0.997 0.999 0.003 0.994 0.997 0.006
JSX 0.996 0.999 0.004 0.993 0.997 0.006 0.995 0.998 0.003 0.992 0.997 0.007
KOSPI 0.992 0.999 0.005 0.9895 0.9978 0.0094 0.992 0.999 0.006 0.989 0.996 0.009
Performance of M-LSTMN
Training dataset Test dataset
BSE 0.995 0.998 0.004 0.992 0.996 0.006
TWSE 0.996 0.999 0.003 0.994 0.998 0.005
JSX 0.996 0.998 0.004 0.992 0.996 0.007
KOSPI 0.993 0.998 0.005 0.990 0.996 0.009
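The three indicators of Table 10 can be sketched as follows. The formulas used here are the standard Nash-Sutcliffe coefficient, Willmott's index of agreement, and Theil's inequality coefficient; they are assumed to match the definitions given earlier in the paper, which are not reproduced in this section. Function names and sample values are hypothetical.

```python
import math


def nsc(actual, predicted):
    """Nash-Sutcliffe coefficient: ranges from -infinity to 1 (1 is perfect)."""
    mean_a = sum(actual) / len(actual)
    num = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - num / den


def index_of_agreement(actual, predicted):
    """Willmott's index of agreement: ranges from 0 to 1 (1 is perfect)."""
    mean_a = sum(actual) / len(actual)
    num = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    den = sum((abs(p - mean_a) + abs(a - mean_a)) ** 2
              for a, p in zip(actual, predicted))
    return 1.0 - num / den


def theil_inequality(actual, predicted):
    """Theil's inequality coefficient: ranges from 0 to 1 (0 is perfect)."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    scale = (math.sqrt(sum(a * a for a in actual) / n)
             + math.sqrt(sum(p * p for p in predicted) / n))
    return rmse / scale


# Hypothetical values, used only to exercise the functions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
nsc_value = nsc(actual, predicted)
ia_value = index_of_agreement(actual, predicted)
ti_value = theil_inequality(actual, predicted)
```

A forecast that always returns the mean of the actual series gives an NSC of 0, which is why values close to 1 are read as evidence of skill beyond that naive benchmark.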

Table 11
Performance assessment in terms of RMSE, MAD and MAPE.
Performance of M-SVR Performance of M-EN
Training dataset Test dataset Training dataset Test dataset
RMSE MAD MAPE RMSE MAD MAPE RMSE MAD MAPE RMSE MAD MAPE
BSE 0.032 0.027 0.101 0.037 0.033 0.113 0.033 0.028 0.102 0.038 0.034 0.113
TWSE 0.029 0.025 0.114 0.035 0.028 0.124 0.030 0.026 0.115 0.037 0.029 0.125
JSX 0.040 0.033 0.123 0.047 0.040 0.132 0.042 0.034 0.124 0.047 0.041 0.132
KOSPI 0.036 0.030 0.120 0.041 0.034 0.127 0.036 0.031 0.121 0.042 0.035 0.128
Performance of M-ERT Performance of M-Boosting
Training dataset Test dataset Training dataset Test dataset
BSE 0.027 0.024 0.093 0.031 0.025 0.106 0.026 0.022 0.092 0.030 0.024 0.103
TWSE 0.025 0.021 0.108 0.029 0.025 0.118 0.023 0.019 0.106 0.028 0.022 0.115
JSX 0.035 0.028 0.116 0.042 0.036 0.123 0.034 0.027 0.116 0.039 0.033 0.121
KOSPI 0.031 0.023 0.113 0.036 0.028 0.120 0.029 0.023 0.112 0.034 0.027 0.119
Performance of M-DNN Performance of M-RF
Training dataset Test dataset Training dataset Test dataset
BSE 0.025 0.021 0.090 0.029 0.024 0.102 0.028 0.023 0.094 0.033 0.027 0.101
TWSE 0.023 0.019 0.104 0.027 0.0235 0.114 0.024 0.022 0.107 0.029 0.028 0.112
JSX 0.033 0.026 0.116 0.039 0.033 0.120 0.034 0.028 0.119 0.041 0.034 0.125
KOSPI 0.030 0.023 0.111 0.035 0.026 0.119 0.032 0.024 0.113 0.036 0.029 0.117
Performance of M-LSTMN
Training dataset Test dataset
BSE 0.027 0.021 0.092 0.031 0.024 0.105
TWSE 0.023 0.020 0.108 0.027 0.022 0.116
JSX 0.033 0.026 0.115 0.038 0.033 0.120
KOSPI 0.029 0.024 0.114 0.036 0.028 0.121
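The error measures of Table 11 follow Eqs. (16)–(18) and are computed on values rescaled to the range 0 to 1. The sketch below is illustrative only: the exact rescaling procedure is not described in this section, so a min-max transform with common bounds for both series is assumed, and the bounds, function names and sample values are hypothetical. MAPE is undefined when a rescaled actual value equals 0, so the sketch raises an error in that case.

```python
import math


def min_max_rescale(values, low, high):
    """Map values linearly so that low -> 0 and high -> 1 (assumed scheme)."""
    if high == low:
        raise ValueError("low and high must differ")
    return [(v - low) / (high - low) for v in values]


def rmse(actual, predicted):
    """Root mean squared error, Eq. (16)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mad(actual, predicted):
    """Mean absolute deviation, Eq. (17)."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def mape(actual, predicted):
    """Mean absolute percentage error, Eq. (18); needs nonzero actual values."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical index levels and hypothetical common bounds of 100 and 200.
raw_actual = [120.0, 140.0, 150.0, 180.0]
raw_predicted = [125.0, 135.0, 150.0, 190.0]
scaled_actual = min_max_rescale(raw_actual, 100.0, 200.0)
scaled_predicted = min_max_rescale(raw_predicted, 100.0, 200.0)

rmse_value = rmse(scaled_actual, scaled_predicted)
mad_value = mad(scaled_actual, scaled_predicted)
mape_value = mape(scaled_actual, scaled_predicted)
```

Applying the same bounds to both series keeps the errors comparable across indexes with different price levels, which is the stated purpose of the rescaling.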

is sensitive to the order of constituents in pairs, models are stacked with the numbers indicating the order in the table for the sake of understanding. A significant positive test statistic value indicates that the predictive performance of the second model is superior to that of the first model. A significant negative test statistic value infers the superiority of the first model over the second one.
From the test statistic values, it is seen that no significant difference exists in the performance of the M-SVR and M-EN models for predictive modeling of all four financial time series. Also, M-RF, M-ERT, M-Boosting, M-DNN, and M-LSTMN outperform both M-SVR and M-EN on all four occasions. The ensemble characteristics and deep learning principles of these five models largely account for their superior performance. It should be noted here that although the performance of M-SVR and M-EN is not as good as that of the other models, they can effectively be used for one day ahead forecasting. Therefore, due to the inherent inefficiency of the said markets, any one of these granular forecasting frameworks can be used for trading purposes.

Fig. 6. MODWT decomposition of BSE.

Fig. 7. MODWT decomposition of TWSE.

The MCS approach can detect the best predictive model among a set of competing models with a specified level of confidence. It can distinguish between superior and inferior models based on a loss function. The findings are presented in Table 16.
MCS has eliminated M-SVR and M-EN from evaluation as they are found to be significantly inferior to the other five models. The

Table 12
Results of DM test for BSE prediction.
Model M-SVR (1) M-EN (1) M-RF (1) M-ERT (1) M-Boosting (1) M-DNN (1) M-LSTMN (1)
M-SVR (2) –
M-EN (2) 0.2371# –
M-RF (2) 3.9234* 4.0168* –
M-ERT (2) 3.6085* 3.8860* 0.1765# –
M-Boosting (2) 3.9487* 4.0432* 0.1824# 0.1951# –
M-DNN (2) 3.9686* 4.0268* 0.1837# 0.1886# 0.1737# –
M-LSTMN (2) 3.9328* 4.0239* 0.1816# 0.1897# 0.1728# 0.1833# –

*Significant at 1% level.
# Not significant.

Table 13
Results of DM test for TWSE prediction.
Model M-SVR (1) M-EN (1) M-RF (1) M-ERT (1) M-Boosting (1) M-DNN (1) M-LSTMN (1)
M-SVR (2) –
M-EN (2) 0.2246# –
M-RF (2) 3.8678* 3.9984* –
M-ERT (2) 3.6231* 3.8771* 0.1792# –
M-Boosting (2) 3.9536* 4.0486* 0.1831# 0.1926# –
M-DNN (2) 3.9659* 4.0319* 0.1856# 0.1875# 0.1709# –
M-LSTMN (2) 3.9447* 4.0254* 0.1839# 0.1852# 0.1705# 0.1816# –

*Significant at 1% level.
# Not significant.

Table 14
Results of DM test for JSX prediction.
Model M-SVR (1) M-EN (1) M-RF (1) M-ERT (1) M-Boosting (1) M-DNN (1) M-LSTMN (1)
M-SVR (2) –
M-EN (2) 0.2312# –
M-RF (2) 3.8148* 4.0126* –
M-ERT (2) 3.6186* 3.8662* 0.1812# –
M-Boosting (2) 3.8120* 4.0439* 0.1849# 0.1892# –
M-DNN (2) 3.9445* 4.0547* 0.1846# 0.1816# 0.1745# –
M-LSTMN (2) 3.7689* 4.0368* 0.1833# 0.1784# 0.1787# 0.1937# –

*Significant at 1% level.
# Not significant.

Table 15
Results of DM test for KOSPI prediction.
Model M-SVR (1) M-EN (1) M-RF (1) M-ERT (1) M-Boosting (1) M-DNN (1) M-LSTMN (1)
M-SVR (2) –
M-EN (2) 0.2385# –
M-RF (2) 3.8775* 4.0491* –
M-ERT (2) 3.6149* 3.9714* 0.1754# –
M-Boosting (2) 3.9857* 4.0688* 0.1807# 0.1876# –
M-DNN (2) 3.9116* 4.0612* 0.1819# 0.1825# 0.1745# –
M-LSTMN (2) 3.8387* 4.0485* 0.1793# 0.1807# 0.1787# 0.1937# –

*Significant at 1% level.
# Not significant.
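The sign convention of the pairwise DM statistics in Tables 12 to 15 can be illustrated with a simplified sketch. The version below assumes squared-error loss, one-step-ahead forecasts and no autocorrelation correction of the loss differential; the loss function and variance estimator actually used in this study are not stated in this section, and all names and sample values are hypothetical.

```python
import math


def dm_statistic(actual, pred_first, pred_second):
    """Simplified Diebold-Mariano statistic for one-step-ahead forecasts.

    Assumptions: squared-error loss, no autocorrelation correction.
    A positive value means the second model has the lower average loss.
    """
    n = len(actual)
    if not (n == len(pred_first) == len(pred_second)) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    # Loss differential d_t = loss(first model) - loss(second model).
    d = [(a - p1) ** 2 - (a - p2) ** 2
         for a, p1, p2 in zip(actual, pred_first, pred_second)]
    d_bar = sum(d) / n
    variance = sum((x - d_bar) ** 2 for x in d) / n
    if variance == 0.0:
        raise ValueError("loss differential has zero variance")
    return d_bar / math.sqrt(variance / n)


def two_sided_p_value(stat):
    """Two-sided p-value under the asymptotic standard normal distribution."""
    return math.erfc(abs(stat) / math.sqrt(2.0))


# Hypothetical forecasts: the second model has smaller errors throughout.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
first = [1.5, 1.6, 3.6, 3.5, 5.4]
second = [1.1, 1.8, 3.1, 3.9, 5.2]
dm = dm_statistic(actual, first, second)
p_value = two_sided_p_value(dm)
```

Swapping the two forecast series flips the sign of the statistic, which is why the tables label each model with its position (1) or (2) in the pair.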

values in parentheses represent the respective ranks for each financial time series. It can be observed that the M-DNN model has outperformed the others in predicting BSE and JSX, while for estimating forecasts of TWSE and KOSPI, M-Boosting has emerged as the best model.

Table 16
MCS based evaluation.
Series  M-RF  M-ERT  M-Boosting  M-DNN  M-LSTMN
BSE     (4)   (5)    (2)         (1)    (3)
TWSE    (4)   (5)    (1)         (2)    (3)
JSX     (4)   (5)    (3)         (1)    (2)
KOSPI   (5)   (4)    (1)         (2)    (3)

6. Conclusions

This paper attempts to reveal the key characteristics of selected stock markets representing four emerging economies in Asia and subsequently delves into their causal interrelationship and predictive analysis. It combines technical indicators and other influential stock indexes recognized through the causal modeling as independent features for estimating future figures in a multivariate framework. This forecasting structure is unique, as the existing literature reports separate usage of technical indicators and macroeconomic variables-based forecasting models. The major findings of this research are summarized below:
• The seven granular predictive models used in this study produce extremely good performance measures on training and test samples. The average NSC and IA values of all the
seven models are close to 1. The average TI values are close to 0. This justifies the efficacy of all the predictive models. The average RMSE, MAD and MAPE values are found to be very small. These measures indicate that the predictive modeling framework is highly effective for making superior predictions for the considered stock markets. Out of the seven models, M-DNN and M-Boosting are the top two models in yielding superior forecasts according to the MCS based SPA evaluations.
• The BSE returns were found to be significantly affected by KOSPI returns, while BSE returns in turn influence TWSE returns. KOSPI returns were found not to be triggered by the other three market returns. A unidirectional causal interrelationship existed between TWSE and JSX returns, where the former impacts the latter.
• The considered markets did not react immediately to the arrival of new information due to the presence of a long-range dependence structure. The findings support the autoregressive behavior of the markets, which in turn suggests that technical indicators can be effectively used for estimating future movements and trading.

Fig. 8. MODWT decomposition of JSX.

Fig. 9. MODWT decomposition of KOSPI.

Fig. 10. Predictive performance of BSE.

Fig. 11. Predictive performance of TWSE.

The scope of this research is limited to the time duration under consideration. This integrated approach can easily be extended to a longer duration. The causal nexus can also be explored in a nonlinear framework in the future. It would be interesting to examine the nature of the behavior and the causal interaction
during the financial crisis, pre- and post-crisis periods. The performance of the presented forecasting models may be checked during crisis periods to assess their effectiveness in volatile conditions. Parameter tuning of the predictive algorithms for generating forecasts in such periods can be explored thoroughly to check its impact on prediction accuracy, as the quality of the forecasts obtained may degrade in highly volatile periods.

Fig. 12. Predictive performance of JSX.

Fig. 13. Predictive performance of KOSPI.

Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2019.105553.

References

[1] J. Zhang, S. Cui, Y. Xu, Q. Li, T. Li, A novel data-driven stock price trend prediction system, Expert Syst. Appl. 97 (2018) 60–69, http://dx.doi.org/10.1016/j.eswa.2017.12.026.
[2] R. Jammazi, R. Ferrer, F. Jareño, S.J.H. Shahzad, Time-varying causality between crude oil and stock markets: What can we learn from a multiscale perspective?, Int. Rev. Econ. Financ. 49 (2017) 453–483, http://dx.doi.org/10.1016/j.iref.2017.03.007.
[3] K. Guhathakurta, B. Bhattacharya, A.R. Chowdhury, Using recurrence plot analysis to distinguish between endogenous and exogenous stock market crashes, Physica A 389 (2010) 1874–1882, http://dx.doi.org/10.1016/j.physa.2009.12.061.
[4] X. Sun, H. Chen, Y. Yuan, Z. Wu, Predictability of multifractal analysis of Hang Seng stock index in Hong Kong, Physica A 301 (2001) 473–482, http://dx.doi.org/10.1016/S0378-4371(01)00433-2.
[5] Z. Lin, Modelling and forecasting the stock market volatility of SSE composite index using GARCH models, Future Gener. Comput. Syst. (2018) http://dx.doi.org/10.1016/j.future.2017.08.033.
[6] M.M. Rounaghi, F. Nassir Zadeh, Investigation of market efficiency and financial stability between S & P 500 and London Stock Exchange: Monthly and yearly forecasting of time series stock returns using ARMA model, Physica A (2016) http://dx.doi.org/10.1016/j.physa.2016.03.006.
[7] C. Narendra Babu, B. Eswara Reddy, Prediction of selected Indian stock using a partitioning–interpolation based ARIMA–GARCH model, Appl. Comput. Inform. (2015) http://dx.doi.org/10.1016/j.aci.2014.09.002.
[8] W. Liu, B. Morley, Volatility forecasting in the Hang Seng index using the GARCH approach, Asia-Pacific Financ. Mark. (2009) http://dx.doi.org/10.1007/s10690-009-9086-4.
[9] J.-L. Zhang, Y.-J. Zhang, L. Zhang, A novel hybrid method for crude oil price forecasting, Energy Econ. 49 (2015) 649–659, http://dx.doi.org/10.1016/j.eneco.2015.02.018.
[10] I. Ghosh, M.K. Sanyal, R.K. Jana, Fractal inspection and machine learning-based predictive modelling framework for financial markets, Arab. J. Sci. Eng. (2018) http://dx.doi.org/10.1007/s13369-017-2922-3.
[11] S. Singhal, S. Ghosh, Returns and volatility linkages between international crude oil price, metal and other stock indices in India: Evidence from VAR-DCC-GARCH models, Resour. Policy 50 (2016) 276–288, http://dx.doi.org/10.1016/j.resourpol.2016.10.001.
[12] J.B. Geng, Q. Ji, Y. Fan, The relationship between regional natural gas markets and crude oil markets from a multi-scale nonlinear Granger causality perspective, Energy Econ. 67 (2017) 98–110, http://dx.doi.org/10.1016/j.eneco.2017.08.006.
[13] C. Peng, H. Zhu, Y. Guo, X. Chen, Risk spillover of international crude oil to China’s firms: Evidence from Granger causality across quantile, Energy Econ. 72 (2018) 188–199, http://dx.doi.org/10.1016/j.eneco.2018.04.007.
[14] A. Papana, C. Kyrtsou, D. Kugiumtzis, C. Diks, Financial networks based on Granger causality: A case study, Physica A 482 (2017) 65–73, http://dx.doi.org/10.1016/j.physa.2017.04.046.
[15] E. Bouri, R. Gupta, A. Lahiani, M. Shahbaz, Testing for asymmetric nonlinear short- and long-run relationships between bitcoin, aggregate commodity and gold prices, Resour. Policy (2018) http://dx.doi.org/10.1016/j.resourpol.2018.03.008.
[16] A. Dutta, E. Bouri, D. Roubaud, Nonlinear relationships amongst the implied volatilities of crude oil and precious metals, Resour. Policy (2018) http://dx.doi.org/10.1016/j.resourpol.2018.04.009.
[17] J.C. Reboredo, M.A. Rivera-Castro, A. Ugolini, Wavelet-based test of co-movement and causality between oil and renewable energy stock prices, Energy Econ. 61 (2017) 241–252, http://dx.doi.org/10.1016/j.eneco.2016.10.015.
[18] R. Singh, D. Das, R.K. Jana, A.K. Tiwari, A wavelet analysis for exploring the relationship between economic policy uncertainty and tourist footfalls in the USA, Curr. Issues Tour. (2018) 1–8, http://dx.doi.org/10.1080/13683500.2018.1445204.
[19] Y. Zhao, J. Li, L. Yu, A deep learning ensemble approach for crude oil price forecasting, Energy Econ. 66 (2017) 9–16, http://dx.doi.org/10.1016/j.eneco.2017.05.023.
[20] E. Chong, C. Han, F.C. Park, Deep learning networks for stock market analysis and prediction: Methodology, data representations, and case studies, Expert Syst. Appl. 83 (2017) 187–205, http://dx.doi.org/10.1016/j.eswa.2017.04.030.
[21] I. Ghosh, M.K. Sanyal, R.K. Jana, Analysis of causal interactions and predictive modelling of financial markets using econometric methods, maximal overlap discrete wavelet transformation and machine learning: A study in Asian context, Lecture Notes in Comput. Sci. (2017) http://dx.doi.org/10.1007/978-3-319-69900-4_84, (Including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics).
[22] A. Kazem, E. Sharifi, F.K. Hussain, M. Saberi, O.K. Hussain, Support vector regression with chaos-based firefly algorithm for stock market price forecasting, Appl. Soft Comput. 13 (2013) 947–958, http://dx.doi.org/10.1016/j.asoc.2012.09.024.
[23] W. Kristjanpoller R., K. Michell V., A stock market risk forecasting model through integration of switching regime, ANFIS and GARCH techniques, Appl. Soft Comput. J. 67 (2018) 106–116, http://dx.doi.org/10.1016/j.asoc.2018.02.055.
[24] L. Lei, Wavelet neural network prediction method of stock price trend based on rough set attribute reduction, Appl. Soft Comput. 62 (2018) 923–932, http://dx.doi.org/10.1016/j.asoc.2017.09.029.
[25] L. Budinski-Petković, I. Lončarević, Z.M. Jakšić, S.B. Vrhovac, Fractal properties of financial markets, Physica A (2014) http://dx.doi.org/10.1016/j.physa.2014.05.017.
[26] H.E. Hurst, Long-term storage capacity of reservoirs, Trans. Am. Soc. Civ. Eng. 116 (1951) 770–799, http://dx.doi.org/10.1119/1.18629.
[27] B.B. Mandelbrot, J.R. Wallis, Noah, Joseph, and operational hydrology, Water Resour. Res. 4 (1968) 909–918, http://dx.doi.org/10.1029/WR004i005p00909.
[28] L. Xian, K. He, K.K. Lai, Gold price analysis based on ensemble empirical model decomposition and independent component analysis, Physica A (2016) http://dx.doi.org/10.1016/j.physa.2016.02.055.
[40] L.A. Laboissiere, R.A.S. Fernandes, G.G. Lage, Maximum and minimum stock price forecasting of Brazilian power distribution companies based on artificial neural networks, Appl. Soft Comput. J. (2015) http://dx.doi.org/10.1016/j.asoc.2015.06.005.
[41] C.H. Su, C.H. Cheng, A hybrid fuzzy time series model based on ANFIS and integrated nonlinear feature selection method for forecasting stock, Neurocomputing (2016) http://dx.doi.org/10.1016/j.neucom.2016.03.068.
[42] X. dan Zhang, A. Li, R. Pan, Stock trend prediction based on a new status box method and adaboost probabilistic support vector machine, Appl. Soft Comput. J. (2016) http://dx.doi.org/10.1016/j.asoc.2016.08.026.
[43] P.R. Hansen, A. Lunde, J.M. Nason, The model confidence set, 2004, http://dx.doi.org/10.2139/ssrn.522382.

Indranil Ghosh is an Assistant Professor of Operations Management & IT at Calcutta Business School, India. He has earned his B.E. degree in Information Technology and M. Tech degree in Industrial Engineering & Management from WBUT, India. His areas of interests are data mining & pattern recognition, AI techniques, statistical data analysis, design and analysis of algorithm,
[29] J.-P. Eckmann, S.O. Kamphorst, D. Ruelle, Recurrence plots of dynamical supply chain management & ERP, image processing &
systems, Europhys. Lett. 4 (1987) 973–977, http://dx.doi.org/10.1209/0295- computer vision and social network analysis.
5075/4/9/004.
[30] C.L. Webber, J.P. Zbilut, Dynamical assessment of physiological systems and
states using recurrence plot strategies, J. Appl. Physiol. 76 (1994) 965–973, Rabin K. Jana is an Assistant Professor of Operations
http://dx.doi.org/10.1152/jappl.1994.76.2.965. & Quantitative Methods at Indian Institute of Man-
[31] D. Das, P. Bhowmik, R.K. Jana, A multiscale analysis of stock return agement Raipur, India. He has earned his Ph.D. from
co-movements and spillovers: Evidence from Pacific developed markets, IIT Kharagpur, India. He was a postdoctoral researcher
Physica A 502 (2018) 379–393, http://dx.doi.org/10.1016/j.physa.2018.02. at George Mason University, USA and National Uni-
143. versity of Singapore. Dr. Jana is a senior member of
[32] V.N. Vapnik, The Nature of Statistical Learning Theory, Vol. 8, Springer, Operational Research Society of India, a member of
1995, p. 188, http://dx.doi.org/10.1109/TNN.1997.641482. Decision Sciences Institute, USA, and Indian Statistical
[33] H. Zou, T. Hastie, Regularization and variable selection via the elastic- Institute, Kolkata. His research interest includes opti-
net, J. R. Stat. Soc. 67 (2005) 301–320, http://dx.doi.org/10.1111/j.1467- mization, AI techniques, machine learning, and time
9868.2005.00503.x. series forecasting.
[34] L. Breiman, Random forests, Mach. Learn. 45 (2001) 5–32, http://dx.doi.
org/10.1023/A:1010933404324.
[35] P. Geurts, D. Ernst, L. Wehenkel, Extremely randomized trees, Mach. Learn. M.K. Sanyal is a Professor in the Department of Busi-
63 (2006) 3–42, http://dx.doi.org/10.1007/s10994-006-6226-1. ness Administration, University of Kalyani, India. He
[36] R.E. Schapire, Y. Singer, Improved boosting algorithms using confidence- has earned his M. Tech degree and Ph.D. in Com-
rated predictions, Mach. Learn. 37 (1999) 297–336, http://dx.doi.org/10. puter Science. Professor Sanyal has published several
1023/A:1007614523901. research papers in international journals of repute and
[37] M. Collins, R.E. Schapire, Y. Singer, Logistic regression AdaBoost and co-authored a number of books.
Bregman distances, Mach. Learn. 48 (2002) 253–285, http://dx.doi.org/10.
1023/A:1013912006537.
[38] G.E. Hinton, S. Osindero, Y.W. Teh, A fast learning algorithm for deep belief
nets, Neural Comput. 18 (2006) 1527–1554, http://dx.doi.org/10.1162/neco.
2006.18.7.1527.
[39] T. Fischer, C. Krauss, Deep learning with long short-term memory networks
for financial market predictions, European J. Oper. Res. (2018) http://dx.
doi.org/10.1016/j.ejor.2017.11.054.