
ABSTRACT

An accurate estimate of the exchange rate (ER) is the foundation for efficient financial management, sound monetary policy, and long-term strategic decision-making worldwide. A stable and competitive ER makes economic diversification possible. Economists, researchers, and investors have carried out a number of studies to forecast the trends and factors that drive the ER's rise or fall. This study compares the effectiveness of Long Short-Term Memory (LSTM) and Support Vector Machine (SVM) models in predicting future exchange rate values; by contrasting the two models' evaluation scores, it identifies the better-performing machine learning algorithm. According to the findings, the LSTM neural network model outperforms the SVM model in terms of mean squared error (MSE) and mean absolute percentage error (MAPE), indicating that the LSTM model predicts Ghana's exchange rate better.

KEYWORDS: Long short-term memory, support vector machine, recurrent neural network, prediction, exchange rate, mean squared error, mean absolute percentage error
1. INTRODUCTION

The price at which one country’s currency is exchanged for another country’s is known as the exchange rate. The most
significant relative price in the financial sector is the exchange rate (Rusydi and Islam, 2007). The importance of the
exchange rate stems from its role in ensuring international trade in goods and services, determining the volume of imports
and exports, setting domestic prices, and maintaining economic balance (Okonkwo et al., 2017). The determination of the
various exchange rate regimes and policies that Ghana has undergone to achieve currency stabilization remains uncertain,
owing to the volatile nature of the country’s exchange rate. There is still a lack of consensus among policymakers
regarding the key determinants of exchange rate fluctuations, and researchers have not yet reached a definitive
conclusion regarding its precise effect on Ghana’s global trade (Yussif et al., 2022). Contemporary literature pertaining to
the determination and prediction of exchange rates is consistent with postulations that were established several
decades ago. Several empirical findings suggest that the exchange rate adheres to a random walk model, and fluctuations
in exchange rate are indeterminate. Additionally, currencies of nations with high inflation rates tend to devalue over
extended periods, with the extent of depreciation being roughly equivalent to the inflation differential. The fluctuations
observed in the current exchange rate exhibit occasional instances of overshooting, subsequently leading to a gradual
realignment towards the equilibrium (Zou et al., 2017). In the financial markets, modeling and forecasting a financial
variable is an extremely important activity that can be of great benefit to a variety of different participants, including
practitioners, regulators, and policy makers. In addition, forecasting is an essential part of both the financial and
managerial decision-making processes (Majhi et al., 2009). Therefore, making a valid forecast of a variable in the financial
markets is of the utmost significance, particularly in a nation like Ghana. Technical analysis, often implemented today through machine learning, is the process of making educated predictions about the future (Chhajer et al., 2022). Machine learning is a branch of artificial intelligence that deals with developing and validating algorithms using data (Chhajer et al., 2022). Many businesses are being automated: using mathematical models, computers make rapid decisions through internet trading. This creates markets in which short-term volatility and sell-offs replace long-term forecasts (Chhajer et al., 2022). The most popular algorithms for analyzing and forecasting stock market movements are artificial neural networks (ANN) and SVM; using tick data, these systems offer up to 99.9% accuracy (Selvanuthu, 2019). The characteristics of financial forecasting include being data-intensive, noisy, non-stationary, unstructured, and having hidden linkages (Kshirsagar, 2018). In this work, two algorithms are discussed: Long Short-Term Memory (LSTM) and Support Vector Machine (SVM).

Accurate exchange rate forecasting is crucial for businesses, investors, and policymakers in the world of financial markets.
Finding accurate exchange rate prediction models is especially important in the context of Ghana’s economy, where
foreign exchange is crucial for trade and financial stability. Support Vector Machines (SVM) and Long Short-Term Memory
(LSTM) networks have become effective time series forecasting tools with the development of deep learning techniques.
In the end, the results of this comparative study will offer insightful information about how well SVM and LSTM algorithms
can forecast Ghana’s exchange rates. This research aims to assist in the selection of appropriate forecasting
methodologies in the context of Ghana’s dynamic economic landscape by highlighting the advantages and disadvantages
of each model. The results might have broader repercussions for other developing nations dealing with comparable
difficulties in managing exchange rate fluctuations.

The remainder of the paper is organized as follows: Section 2 provides an overview of relevant literature. Section 3 describes the dataset, the experimental setting, and the methods used in this research. Section 4 presents the findings of the analyses, and Section 5 discusses the results. Section 6, the final section, highlights the conclusion and recommendations based on the findings.

2. LITERATURE REVIEW

Recurrent neural networks (RNNs), a deep learning technique, have demonstrated a significant ability to capture the hidden correlations in large datasets in applications such as image captioning, voice conversion, and natural language processing (Zhang et al., 2017). The original RNN, however, suffers from the vanishing gradient problem, since later nodes' perception of earlier nodes decreases. Long short-term memory (LSTM) networks (Hochreiter and Schmidhuber, 1997) were proposed as an enhanced network architecture to address this issue. LSTM networks outperform traditional RNNs in extracting time series features across a wider time range. As an illustration, Bruin et al. (2016) suggested an LSTM-based method that achieved good diagnosis and prediction performance in cases involving complex procedures, hybrid faults, and significant noise. Based on widely used measurement signals, Yuan et al. (2016) advocated the use of the LSTM network to achieve fast fault detection and identification; the outcome demonstrated that the LSTM network was more effective than the convolutional network at identifying and locating faults in railway track circuits. Additionally, Zhao et al. (2017), who introduced a novel traffic forecast model built on the LSTM network, demonstrated its predictive power. According to Chen et al. (2021), the LSTM outperforms the autoregressive integrated moving average, support vector regression, and adaptive network fuzzy inference system in terms of predictive performance. It has also been claimed that LSTM algorithms surpass state-of-the-art methods in accuracy and noise tolerance for time-series classification (Karim et al., 2019). In general, the LSTM network is an enhanced RNN that handles longer time series better (Zhang et al., 2017). Machine learning relies heavily on LSTMs because most RNNs cannot overcome short-term memory limitations, which makes it difficult to carry information from past steps forward to future ones. A standard RNN, for instance, may retain only part of the historical information when processing a long sequence of values to be projected.

The Support Vector Machine (SVM) is a powerful machine learning model that may be applied to classification or regression problems. The SVM algorithm, most frequently used for classification, performs best in high-dimensional settings or when the number of dimensions exceeds the number of samples (Chhajer et al., 2021). The algorithm acts as a tool for risk management and offers investors abnormal profits on their investments (Henrique et al., 2018). Classification algorithms are employed for tasks involving the separation of data into two groups; SVMs are capable of classifying new data points when provided with a set of labeled training data belonging to distinct categories (Vishwanathan and Murty, 2002). SVMs excel particularly when dealing with limited data, and their primary strength lies in effectively classifying linearly separable data, a task that comes naturally to them. Additionally, when working with substantial datasets, SVMs often outperform artificial neural networks (ANN) in both speed and efficiency (Vishwanathan and Murty, 2002). SVM models are specifically designed to establish an optimal boundary that divides an n-dimensional space into distinct groups, a strategic approach that simplifies the categorization of new data in the future (Vishwanathan and Murty, 2002). SVMs can be roughly categorized into linear SVMs, which separate data that can be divided into two groups by a single straight line, and non-linear SVMs, which classify data that cannot be separated by a single line (see the sketch below). Among SVM's benefits are that it works best when there is a distinct margin of separation, is highly effective in high-dimensional settings, and remains effective even when the number of dimensions exceeds the number of samples. SVM uses only a subset of the training points, known as support vectors, and is therefore memory-efficient. SVM also has a drawback: because of longer training times, processing large amounts of data is slow (Chhajer et al., 2021). Like artificial neural networks, the SVM is a learning technique that can be applied to evaluate and map both linear and non-linear functions and to solve problems such as pattern recognition and prediction. A hyperplane, or a collection of hyperplanes, is constructed in a high-dimensional space and can then be used for classification. Polynomial, radial basis function (RBF), and MLP classifiers can all be learned using SVM. SVM is based on the structural risk minimization principle and complements regularization theory; it relies heavily on kernel functions and mathematical programming techniques to operate, and it is used both to classify and to perform regression on data (Patil et al., 2016).
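
To make the linear/non-linear distinction concrete, the following minimal sketch (illustrative only, not an experiment from this paper) contrasts a linear-kernel and an RBF-kernel SVM classifier in scikit-learn on a toy dataset that is not linearly separable:

```python
# Illustrative sketch: linear vs. RBF-kernel SVM on a non-linearly-separable toy dataset.
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Two interleaving half-circles: no single straight line separates the classes.
X, y = make_moons(n_samples=200, noise=0.15, random_state=42)

linear_svm = SVC(kernel="linear").fit(X, y)           # single separating hyperplane
rbf_svm = SVC(kernel="rbf", gamma="scale").fit(X, y)  # non-linear decision boundary

print("linear kernel accuracy:", linear_svm.score(X, y))
print("RBF kernel accuracy:", rbf_svm.score(X, y))
```

On such data the RBF kernel typically scores noticeably higher, because the kernel implicitly maps the inputs into a space where the classes become separable.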

Numerous forecasting methods have been thoroughly studied in the financial sector. These methods include a variety of
machine learning techniques, such as support vector machines, neural networks, and cutting-edge methods like deep
learning. Unfortunately, there aren’t many thorough survey studies that address these approaches in depth. Notable
authors of thorough reviews in this area are Cavalcante et al. (2016), Bahrammirzaee (2010), and Saad and Wunsch
(1998). The most recent of them, organized by Cavalcante et al. in 2016, focused mostly on stock market-specific
methodology with some discussion of their applicability to foreign exchange markets.

3. DATA AND METHODS


3.1 DATA

The data are secondary data on the USD/GHS (US Dollar/Ghanaian Cedi) exchange rate from investing.com. The dataset covers the date, daily price, opening price, high price, low price, volume, and percentage change of Ghana's exchange rate, but only the date and daily price are used in this analysis; the other variables are ignored. The data span the period from 14th January 2010 to 15th June 2023. Because of the volatility of the data over this period, preprocessing before further analysis is advised.

The data were cleaned during the pre-processing stage; the date and price variables contain no missing values, indicating the completeness of the data. Feature scaling was performed using a range (min-max) transformation, keeping all price values within the same range of 0 to 1.

To evaluate the models' performance on unseen data, it is essential to divide the data into train and test sets. After normalization, the scaled data are divided into train and test sets using the ratios 70:30, 80:20, and 90:10 (in the 80:20 split, for example, 80 percent of the data constitutes the train set and the remaining 20 percent the test set). The data are split in an orderly manner, preserving their original chronological sequence, as sketched below.
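
The preprocessing and splitting described above can be outlined as follows; this is an illustrative sketch, and the file name `usd_ghs.csv` and the column names `Date` and `Price` are assumptions rather than details confirmed in this paper:

```python
# Illustrative sketch of the preprocessing pipeline described above.
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Load the series and restore chronological order (investing.com exports are often newest-first).
df = pd.read_csv("usd_ghs.csv", parse_dates=["Date"]).sort_values("Date")
prices = df[["Price"]].values  # keep only the daily price; other columns are ignored

# Range (min-max) transformation: scale all prices into [0, 1].
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(prices)

# Chronological splits at the three ratios -- no shuffling, so time order is preserved.
for train_frac in (0.70, 0.80, 0.90):
    cut = int(len(scaled) * train_frac)
    train, test = scaled[:cut], scaled[cut:]
    print(f"{int(train_frac * 100)}:{100 - int(train_frac * 100)} split -> "
          f"{len(train)} train, {len(test)} test")
```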

3.2 METHODS

This section covers the approach this study employed to accomplish its objectives. It reviews the theoretical background of Long Short-Term Memory (LSTM) and the Support Vector Machine (SVM).

3.2.1 LONG SHORT-TERM MEMORY (LSTM)

Hochreiter and Schmidhuber first presented Long Short-Term Memory (LSTM) networks in their 1997 paper "Long Short-Term Memory". The architecture arrived as a remedy for the gradient vanishing problem of recurrent neural networks (RNNs), and numerous improvements have since been made by other scholars. An excellent illustration is how we study a manuscript: in an effort to remember the crucial facts, we discard the unimportant ones. RNNs struggle with long-term dependencies; LSTM units, however, contain features that solve this issue. The LSTM, one of the many variants of the RNN, is able to record information from earlier stages and use it to forecast the future (Patterson and Gibson, 2017). Because a plain RNN cannot maintain long-term memory, the LSTM, built around a "memory line", has been found to be very helpful in forecasting scenarios that draw on long histories of data. LSTMs with memory lines integrated into their gates can be used to remember information from earlier stages.
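
A minimal univariate LSTM forecaster consistent with this description is sketched below in Keras. The 30-day lookback window, 50 memory units, and 20 training epochs are illustrative assumptions, not the configuration reported in this study, and `train` refers to the scaled training split from the preprocessing sketch above:

```python
# Illustrative sketch of a univariate LSTM next-day forecaster.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def make_windows(series, lookback=30):
    # Turn a 1-D scaled price series into (samples, lookback, 1) inputs
    # paired with the next day's scaled price as the target.
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    return np.array(X).reshape(-1, lookback, 1), np.array(y)

# `train` is the scaled training split from the preprocessing sketch above.
X_train, y_train = make_windows(train.ravel())

model = Sequential([
    LSTM(50, input_shape=(30, 1)),  # gated memory cells capture long-range dependencies
    Dense(1),                       # next-day scaled price
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, epochs=20, batch_size=32, verbose=0)
```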

3.2.2 SUPPORT VECTOR MACHINE (SVM)

Cortes and Vapnik (1995) introduced the support vector regressor with a minor modification to, but using the same principles as, the support vector classifier (24). The SVM's basic operating principle is to nonlinearly transform the input data into a high-dimensional feature space using a kernel function and then perform linear regression inside that feature space (25). SVM algorithms can be categorized by the kernels used, the absence of local minima, or the number of support vectors, and SVM regression is regarded as a nonparametric technique because it relies on kernel functions (26). The linear epsilon-insensitive SVM (ε-SVM) regression is used in this investigation. The aim of ε-SVM is to find a function $\hat{f}_{SVM}(x_n)$ that deviates from $y_n$ by no more than ε and is also as smooth as possible. The regression function is represented by the following:

$$\hat{f}_{SVM}(X) = \omega \cdot \varphi(X) + s \qquad (3.5.1)$$


where $s$ is a scalar threshold, $\omega$ is the weight vector, and $\varphi(X)$ denotes a nonlinear mapping function that transforms the input space $X$ into a high-dimensional feature space. The SVM model performs linear regression in the high-dimensional feature space under the ε-insensitive loss. The coefficients $\omega$ and $s$ can then be estimated by minimising the regularised risk function:
$$J(\omega) = \frac{\|\omega\|^{2}}{2}, \qquad \text{s.t.}\quad \forall n:\ \begin{cases} y_n - \omega \cdot \varphi(X_n) - s \le \varepsilon \\ \omega \cdot \varphi(X_n) + s - y_n \le \varepsilon \end{cases} \qquad (3.5.2)$$

It is likely that a function $f(X)$ that perfectly satisfies Eq. 3.5.2 for all points does not exist. Therefore, slack variables $\xi_n$ and $\xi_n^{*}$ are introduced for each point to deal with otherwise infeasible constraints. These slack variables allow regression errors up to the values of $\xi_n$ and $\xi_n^{*}$ while still meeting the required conditions. The objective function is described as (27):
$$J(\omega) = \frac{\|\omega\|^{2}}{2} + C \sum_{n=1}^{N} (\xi_n + \xi_n^{*}), \qquad \text{s.t.}\quad \forall n:\ \begin{cases} y_n - \omega \cdot \varphi(X_n) - s \le \varepsilon + \xi_n \\ \omega \cdot \varphi(X_n) + s - y_n \le \varepsilon + \xi_n^{*} \\ \xi_n \ge 0,\ \xi_n^{*} \ge 0 \end{cases} \qquad (3.5.3)$$

where the positive constant $C$ is the box constraint, which denotes the degree of penalty applied to samples with errors greater than ε.

The optimization problem is solved through its dual formulation. The dual is constructed from the primal function by introducing non-negative Lagrange multipliers $\alpha_n$ and $\alpha_n^{*}$ for each observation $X_n$, and the resulting dual function is minimised:
$$L(\alpha) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})\, G(x_i, x_j) + \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^{*}) - \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^{*}) \qquad (3.5.4)$$

$$\text{s.t.}\quad \sum_{n=1}^{N} (\alpha_n - \alpha_n^{*}) = 0, \qquad 0 \le \alpha_n \le C,\ \ 0 \le \alpha_n^{*} \le C \quad \forall n$$
where the Gaussian kernel function

$$G(x_i, x_j) = \exp\!\left(-\|x_i - x_j\|^{2}\right) \qquad (3.5.5)$$

is used as the kernel function of SVM. The SVM model obtained by minimising Eq. 3.5.4 is then given as:
$$\hat{f}(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^{*})\, G(x_i, x) + s \qquad (3.5.6)$$

The sequential minimal optimisation (SMO) approach (28) is then applied to solve the dual problem in Eq. 3.5.4, and with appropriate parameters (e.g. $C$ and ε) the SVM model can finally be determined.
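
As a sketch of how the ε-insensitive SVM regression above might be implemented, scikit-learn's SVR (whose underlying libsvm solver is an SMO-type algorithm) can be fitted on lagged price features. The lookback window and the values of C and epsilon here are illustrative assumptions, not the settings reported in this study:

```python
# Illustrative sketch of ε-insensitive SVM regression with a Gaussian (RBF) kernel.
import numpy as np
from sklearn.svm import SVR

def lagged(series, lookback=30):
    # Build lagged feature vectors: predict the next value from the previous `lookback` values.
    X = np.array([series[i:i + lookback] for i in range(len(series) - lookback)])
    y = series[lookback:]
    return X, y

# `train` is the scaled training split from the preprocessing sketch above.
X_train, y_train = lagged(train.ravel())

# RBF (Gaussian) kernel; C is the box constraint, epsilon the width of the ε-tube.
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01)
svr.fit(X_train, y_train)
```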

The SVM and LSTM models will be trained and assessed using the corresponding training and testing subsets for each train-test ratio. The evaluation metrics, including Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE), will be computed and compared across the various ratios. The train-test ratios will reveal how different training-set sizes affect how well the models perform, and the outcomes will shed light on the trade-off between having a more extensive training set and being able to test the model on a wider variety of data points.

MEAN SQUARED ERROR: It calculates the mean of the squared differences between actual values and predicted values. MSE measures the model's performance in terms of prediction errors in a straightforward and understandable manner.

$$MSE = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^{2}$$

MEAN ABSOLUTE ERROR: A common metric for assessing the precision of predictions in regression tasks is mean absolute error (MAE). It calculates the mean absolute difference between actual (ground truth) values and predicted values.

$$MAE = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|$$

MEAN ABSOLUTE PERCENTAGE ERROR: The average absolute percentage difference between predicted and actual values is measured by MAPE.

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%$$

Given a dataset of N data points, where:

- $y_i$ represents the actual (true) value of the target variable for the i-th data point, and
- $\hat{y}_i$ represents the predicted value of the target variable for the i-th data point.
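
As a quick illustration (not code from this paper), the three metrics above can be computed directly with NumPy; the `y_true` and `y_pred` values below are hypothetical placeholders standing in for actual and predicted test-set exchange rates:

```python
# The three evaluation metrics defined above, computed directly with NumPy.
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error: average of squared prediction errors.
    return np.mean((y_pred - y_true) ** 2)

def mae(y_true, y_pred):
    # Mean absolute error: average magnitude of prediction errors.
    return np.mean(np.abs(y_true - y_pred))

def mape(y_true, y_pred):
    # Mean absolute percentage error, in percent; assumes y_true contains no zeros.
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Hypothetical example values, for illustration only.
y_true = np.array([11.2, 11.5, 11.8])
y_pred = np.array([11.0, 11.6, 12.1])
print(mse(y_true, y_pred), mae(y_true, y_pred), mape(y_true, y_pred))
```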

4. RESULTS

The descriptive analysis of the data is given in Table 4.1 below; according to the table, the total number of data points is 3493. Figure 4.1 below shows the yearly time series plot of the exchange rate over the years: the x-axis represents the years (2010 to 2024) and the y-axis represents the prices (in Ghana cedis). The plot suggests that the exchange rate series is not stationary; it visually depicts the changes in the exchange rate over time, indicating an overall increase. The plot also exhibits extreme volatility from around 2019 to the beginning of 2023, probably due to the pandemic that began in 2019 and the Russia-Ukraine war.

Count                  3493
Minimum value          1.410000
Maximum value          14.750000
Mean                   4.320948
25%                    1.991500
50%                    4.030000
75%                    5.630000
Standard deviation     2.568892

Table 4.1 Descriptive Analysis of the Exchange Rates

Figure 4.1 Time Series Plot of Daily USD/CEDI Exchange Rate

Split      MSE       MAE       MAPE (%)
70:30      0.2470    0.1427    1.9535
80:20      0.3034    0.2367    2.8897
90:10      0.4091    0.4376    4.1195

Table 4.2 LSTM model performance
