
Bonfring International Journal of Data Mining, Vol. 3, No. 2, June 2013 12

ISSN 2277 - 5048 | © 2013 Bonfring
Abstract--- Exchange rate prediction has been a challenging
topic in recent decades. Various studies have tried to improve
prediction accuracy in terms of level error and directional
status error. This paper introduces a methodology that uses
KNN (K-nearest neighbors) with DTW (dynamic time warping) to
improve fluctuation prediction and to achieve better evaluation
results than other studies in the financial market forecasting
literature. The study uses the USD/JPY (United States
Dollar/Japanese Yen) exchange rate time series, and the results
show an improvement in predicting the direction of the series.
USD/JPY exchange rates are gathered from 1971 to 2012 and
partitioned into 30-element segments, reflecting the monthly
cyclic behavior of the time series. These segments are then
divided into two sets with a 7:3 ratio, and KNN is used to find
the 3 nearest neighbors with DTW as the similarity function. A
function introduced in this research then predicts the
directional status of the element that follows each segment,
and the prediction result is compared with other results in the
exchange rate prediction literature.
Keywords--- Dynamic Time Warping, Time Series
Classification, Exchange Rate Prediction, KNN, USD/JPY

I. INTRODUCTION
The biggest financial market in the world is the foreign
exchange market, with a daily turnover of more than 4 trillion
dollars [1]. This shows how important prediction is for traders
in this market. Prediction in financial markets has been a
controversial topic in both the economics and the forecasting
literature [2]. Many believe in the random walk theory and the
efficient market hypothesis, which say that in a market where
all actors have the same access to information, bigger profits
come from taking more risk and not from forecasting. Many other
researchers, however, believe that forecasting and modeling
financial time series is useful to traders and their decision
making [3]. The USD/JPY exchange rate market shows random walk
characteristics more than other exchange rates in the floating
regime and is thus harder to predict [1]. In this study the
researchers believe that patterns can be found in financial
time series and that these patterns are useful to traders, and
try to forecast the fluctuations of the USD/JPY time series.

Arash Negahdari Kia, PhD Scholar, University of Tehran, Iran. E-mail:
arash.nkia@gmail.com
Dr. Saman Haratizadeh, Assistant Professor, Department of Network
Science and Technologies, University of Tehran, Iran.
Dr. Hadi Zare, Assistant Professor, Department of Network Science
and Technologies, University of Tehran, Iran.

DOI: 10.9756/BIJDM.4658
In the literature on financial time series forecasting, three
points of view can be found. Some studies build on the
economics literature and analyze charts and series with
economic indices such as those used in the technical analysis
of stock markets. The second point of view emphasizes
statistical approaches such as the ARIMA (Auto-Regressive
Integrated Moving Average) methodology. The third point of view
tries to forecast the series with newer artificial intelligence
and data mining techniques [4]. In this paper we use the KNN
methodology with DTW as the distance function between our time
series patterns to classify the series and predict the
fluctuation of a given exchange rate time series.
Yang and Shahabi studied efficient nearest neighbor search for
multivariate time series [5]. Lian and Chen studied efficient
similarity search over future stream time series using KNN and
DTW [6]. In the same year, Zhou et al. [7] presented another
efficient similarity search over multivariate time series,
which also used KNN and DTW. Jiang et al. [8] studied online
detection and prediction of special patterns over financial
data streams with the help of KNN and DTW. In 2010, Kurbalija
et al. presented a framework for time-series analysis [9]. In
the same year, Xing, Pei, and Keogh published a brief survey on
sequence classification [10]. Bankó and Abonyi studied
correlation-based dynamic time warping of multivariate time
series in 2012 and used KNN for indexing large time series
databases [11]. Also in 2012, Ando studied how to optimize
classification performance in time series with nearest neighbor
density approximation [12], and Nguyen et al. used KNN in their
study of time series classification with ensemble-based
positive unlabeled learning [13]. Many other studies, both in
financial time series and in other fields, use KNN and/or DTW
to classify, model, or predict sequences or time series.
In the next section of this paper, the methodology used for
prediction of the exchange rate time series is proposed and the
KNN and DTW methods are described in brief. Section 3 discusses
the data preparation and compilation phase. Section 4 covers
the evaluation methodologies in the financial time series
literature, and section 5 presents the results. The last
section concludes and suggests some topics for future research.

Prediction of USD/JPY Exchange Rate Time Series
Directional Status by KNN with Dynamic Time
Warping as Distance Function
Arash Negahdari Kia, Dr. Saman Haratizadeh and Dr. Hadi Zare

II. METHODOLOGY
In this section we describe our prediction methodology and give
a brief introduction to DTW for finding the distance between
two time series. There are two ways of predicting a time
series. The first predicts the values of the next elements; the
second predicts the next fluctuation, that is, whether the
series goes up or down relative to the previous value [1]. We
discuss this further in the evaluation section. We use KNN with
K = 3 (3NN) to find the three nearest neighbors of the time
series to be predicted. A function of the distances to these
three neighbors then determines the class of the series (up or
down fluctuation in our case). The distance is calculated with
the DTW technique discussed later in this section. After
finding the 3 nearest neighbors, we use the function introduced
in (1) to estimate the class label. Class label 1 says that the
next value of the time series is going to be greater than the
previous value, and label 0 says that the next value is going
to be less than the previous value. This shows the direction of
fluctuation of the time series.
(1)
Suppose $d_1$, $d_2$, and $d_3$ are the distances of the 3
nearest neighbor time series from the time series to be
predicted, and $c_1$, $c_2$, and $c_3$ are their corresponding
class labels. Using the inverse of the distances gives nearer
neighbors a greater impact on the result. The whole methodology
is presented in figure 1 as a diagram, followed by a
step-by-step description:

Figure 1: Prediction Process Diagram with KNN and DTW
i. Splitter block: Splits the exchange rate time series
into equal n-element sequences with class label 1 or 0,
depending on whether the value after the last element of
the sequence is greater or less than that last element.
ii. Partitioning block: Divides the sequences into a test
set and a training set. In our case, the test set is 20
percent of all the samples, as in the work of Yao and Tan
[1]. The class label of the last sequence in the test set
can be used as a prediction of the fluctuation of the
exchange rate and presents a buying or selling signal to
traders.
iii. DTW block: Finds the distance between every test set
instance and every training set instance with the dynamic
time warping algorithm.
iv. KNN block: Finds the K nearest neighbors of a sample
in the test set and passes them to the prediction block.
v. Prediction block: Uses the function presented in (1)
to predict the class label (the next direction of the
series in our case).
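
The KNN and prediction blocks can be illustrated with a minimal Python sketch. The exact form of function (1) is not reproduced in this text, so the sketch assumes a normalized inverse-distance-weighted vote thresholded at 0.5. The names `weighted_vote`, `knn_predict`, and `abs_diff_distance`, the `eps` guard against zero distances, and the toy data are hypothetical; the toy distance only stands in for DTW.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (sequence, class label 0 or 1)


def weighted_vote(distances: Sequence[float], labels: Sequence[int],
                  eps: float = 1e-12) -> int:
    """Assumed stand-in for function (1): inverse-distance-weighted vote.

    Nearer neighbours get larger weights 1/d. Returns 1 (up) when the
    weighted share of label-1 neighbours is at least 0.5, else 0 (down).
    """
    weights = [1.0 / (d + eps) for d in distances]  # eps avoids division by zero
    score = sum(w * c for w, c in zip(weights, labels)) / sum(weights)
    return 1 if score >= 0.5 else 0


def knn_predict(query: Sequence[float], train: List[Sample],
                dist: Callable[[Sequence[float], Sequence[float]], float],
                k: int = 3) -> int:
    """Find the k nearest training sequences under `dist` and vote."""
    scored = sorted(((dist(query, seq), label) for seq, label in train),
                    key=lambda pair: pair[0])[:k]
    return weighted_vote([d for d, _ in scored], [c for _, c in scored])


def abs_diff_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Toy distance used only for this demo; the paper uses DTW instead."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Toy training set: two "up" sequences and two "down" sequences.
train_set = [([1, 2, 3], 1), ([1, 2, 4], 1), ([9, 9, 9], 0), ([8, 9, 9], 0)]
print(knn_predict([1, 2, 3.5], train_set, abs_diff_distance))
```

Passing the distance as a parameter keeps the vote independent of the distance measure, so a DTW implementation can be substituted for the toy distance without changing the rest of the sketch.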
Dynamic Time Warping
A brief description of the DTW used in our procedure is
presented here. Dynamic time warping is a method of finding a
more accurate distance measure between time series than the
simple Euclidean distance, which compares the elements of the
two series one by one. DTW is more accurate because different
time series are not necessarily aligned in time [14]. The
algorithm is presented in figure 2.

Figure 2: DTW Algorithm in Partitioned Exchange Rate Time Series
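
As an illustration of the idea, the following Python sketch implements the textbook dynamic-programming form of DTW, with the absolute difference as local cost and no warping window. It is not necessarily identical to the algorithm of figure 2, and the function name `dtw_distance` is hypothetical.

```python
import math
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    cost[i][j] is the minimal accumulated cost of aligning the first i
    elements of `a` with the first j elements of `b`.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The second series is the first one with a repeated element. A one-by-one
# comparison would penalise the shift, while DTW aligns the two exactly.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

For the 30-element sequences used in this paper the quadratic cost per pair is small, so the sketch omits the windowing and lower-bounding optimizations discussed in [14].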
III. DATA COMPILATION
The data set comes from the Federal Reserve Economic Data
(FRED) of the economic research division of the Federal Reserve
Bank of St. Louis [15]. We used USD/JPY exchange rate data from
Dec 4th of 1971 to Dec 24th of 2012, comprising 15331
date-exchange rate records. The data was divided into
30-element time series groups, which gave 500 time series, each
with a class label derived from the next element (the 31st
element, or the first element of the next sequence).
Figure 3 shows the exchange rate time series whose fluctuation
we predict in our research. If the 31st element was higher than
the 30th element, the class label was 1, otherwise 0. This
label shows the direction of fluctuation and is what we tried
to predict in the test set. We used 30-element sequences
because of the monthly cyclic behavior of the exchange rate
time series.
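
The segmentation and labeling rule can be sketched in Python as follows. This is a minimal illustration under the assumption that the sequences are consecutive and non-overlapping; the function name `make_labeled_sequences` and the short toy series are hypothetical.

```python
from typing import List, Sequence, Tuple


def make_labeled_sequences(series: Sequence[float],
                           length: int = 30) -> List[Tuple[List[float], int]]:
    """Cut `series` into consecutive windows of `length` elements.

    Each window gets label 1 if the element right after it (the first
    element of the next window) is greater than the window's last
    element, and label 0 otherwise. A trailing window without a
    following element is dropped because it cannot be labeled.
    """
    samples = []
    for start in range(0, len(series) - length, length):
        window = list(series[start:start + length])
        next_value = series[start + length]
        samples.append((window, 1 if next_value > window[-1] else 0))
    return samples


# Toy series with windows of 3 elements instead of 30.
toy = [1, 2, 3, 4, 3, 2, 1, 5, 6, 7]
for window, label in make_labeled_sequences(toy, length=3):
    print(window, label)
```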

Figure 3: USD/JPY Exchange Rate Time Series from Dec 4th
of 1971 to Dec 24th of 2012
Data before 1971 was of no use for our research because of the
Smithsonian agreement, which adjusted the fixed exchange rates
set at the Bretton Woods conference in 1944. USD/JPY became a
floating exchange rate and the Japanese Yen was no longer
pegged to the US Dollar, so prediction of this time series
became reasonable after 1971 [16].
Some data were missing in the data set for holidays. MA3 (the
moving average of the 3 preceding days), a well-known technical
analysis index, was used to fill in the missing data for those
days. Finally, the compiled data set of 500 30-element
sequences, each with one class label, was divided into a
training set of 400 sequences and a test set of 100 sequences
by 7:3 ratio. The 7:3 ratio came from the 7:2:1 ratio of
training, validation, and test sets that Yao and Tan used in
2000 to train their multilayer perceptron neural network [1].
Another reason for the 7:3 ratio is that we compare our results
with the Yao and Tan study, which is widely cited in the
exchange rate prediction literature [1].
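
The MA3 filling step can be sketched in Python as follows. This is a minimal illustration that assumes missing days are marked with `None` and that previously filled values may take part in later averages; the function name `fill_missing_ma3` is hypothetical.

```python
from typing import List, Optional, Sequence


def fill_missing_ma3(values: Sequence[Optional[float]]) -> List[float]:
    """Replace each missing value by the mean of the 3 preceding values.

    Values that were filled earlier count as preceding values. If fewer
    than 3 values precede a gap, the mean of the available ones is used.
    """
    filled: List[float] = []
    for value in values:
        if value is None:
            previous = filled[-3:]
            if not previous:
                raise ValueError("cannot fill a gap at the start of the series")
            value = sum(previous) / len(previous)
        filled.append(value)
    return filled


# A holiday gap after three quoted days, followed by a second gap.
print(fill_missing_ma3([3.0, 6.0, 9.0, None, None, 12.0]))
```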
IV. EVALUATION METHODS
Three evaluation methodologies that are commonly used in
financial time series prediction are described in this section.
The first is the class of mean square error evaluation
parameters, which capture the error in terms of level. That is,
the mean square error shows how close the prediction is to the
real future value of the time series. The RMSE (Root Mean
Square Error) is presented in (2), where $x_t$ is the real
future value of the time series, $\hat{x}_t$ is the predicted
value, and $N$ is the number of samples.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(x_t - \hat{x}_t\right)^2} \qquad (2)$$
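
A direct Python transcription of the RMSE definition, given as a small sketch with hypothetical names, is:

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between real and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(x - p) ** 2 for x, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


# Two predictions that miss the real values 0 and 0 by 3 and 4.
print(rmse([0.0, 0.0], [3.0, 4.0]))
```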
The second methodology looks at the problem in a more practical
way. It assumes that traders trade with a prediction method and
a fixed amount of money, so the evaluation calculates how much
profit or loss the traders make over a specific period of time.
This method is attractive because of its practical point of
view, but it depends on two additional parameters: the initial
amount of money and the length of the simulated trading period
[4].
The third evaluation method measures what percentage of the
predicted values are in the correct direction. That is, if the
real future value is greater than the last value, the predicted
value should also be greater than the last value, and if the
real future value is less than the last value, the predicted
value should also be less than the last value. This is shown in
(3). This evaluation method is the most important one for a
trader. The evaluation parameter is called directional status
or directional success; it is also called gradient% in Yao and
Tan's study [1].
Assume the present value of the exchange rate is 5 and the real
future value is 6. If one predictor predicts 4 for the future
value and another predicts 100, the first predictor is better
in terms of the MSE evaluation parameter, but the second is
better in terms of directional success or gradient. The first
predictor signals selling to the trader and hence results in a
loss. This is a simple example of why directional success is a
more practical evaluation parameter than MSE.
$$\text{Directional success (\%)} = \frac{100}{N}\sum_{t=1}^{N} a_t, \qquad a_t = \begin{cases} 1 & \text{if } (x_{t+1} - x_t)(\hat{x}_{t+1} - x_t) > 0 \\ 0 & \text{otherwise} \end{cases} \qquad (3)$$

Same as before, $x_t$ is the original value of the time series,
$\hat{x}_t$ is the predicted value, $N$ is the number of
samples, and $a_t$ shows whether the direction is predicted
correctly or not.
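
The worked example with the values 5, 6, 4, and 100 can be checked with a short Python sketch. It assumes that a prediction counts as correct when the real move and the predicted move, both measured from the last value, have the same sign; the function name `directional_success` is hypothetical.

```python
from typing import Sequence


def directional_success(last_values: Sequence[float],
                        actual_next: Sequence[float],
                        predicted_next: Sequence[float]) -> float:
    """Percentage of samples whose predicted direction matches the real one."""
    if not (len(last_values) == len(actual_next) == len(predicted_next)):
        raise ValueError("inputs must have equal length")
    if not last_values:
        raise ValueError("inputs must be non-empty")
    hits = sum(
        1
        for last, actual, predicted in zip(last_values, actual_next, predicted_next)
        # Same sign of real and predicted move means a correct direction.
        if (actual - last) * (predicted - last) > 0
    )
    return 100.0 * hits / len(last_values)


present, real_future = 5.0, 6.0
for name, prediction in (("first predictor", 4.0), ("second predictor", 100.0)):
    squared_error = (real_future - prediction) ** 2
    success = directional_success([present], [real_future], [prediction])
    print(name, "squared error:", squared_error, "directional success:", success)
```

The first predictor has the smaller squared error (4 against 8836) but a directional success of 0%, while the second has the larger error and a directional success of 100%.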
V. RESULTS AND DISCUSSION
After implementing the process described in the methodology
section, we obtained the results presented in table 1, which
compares the result of our methodology with the work of Yao and
Tan [1]. Directional success increases from 46.59% to 54%,
about 7.4 percentage points. This shows that the KNN-DTW method
achieves more success in directional status prediction than
multilayer perceptron modeling for the USD/JPY exchange rate
time series.



Table 1: Comparing KNN-DTW model with Yao, Tan ANN-MLP (Artificial Neural Network – Multilayer Perceptron)
model in Directional Status Evaluation Parameter

| Time Series                      | Model                               | Normal Mean Error in Test Set | Directional Success |
|----------------------------------|-------------------------------------|-------------------------------|---------------------|
| USD/JPY [1]                      | MLP with 5-4-1 Structure (3 Layers) | 1.966195                      | 46.59%              |
| JPY/USD (Yao J., Tan C.L.; 2000) | MLP with 6-4-1 Structure (3 Layers) | 1.242099                      | 46.59%              |
| USD/JPY                          | 3NN – DTW Model                     | -                             | 54%                 |

The diagram in figure 4 presents the instances in the test set
and whether the prediction of direction was correct or not. The
upper band shows the incorrect predictions while the lower band
shows the correct ones.







Figure 4: Error and Correctness in Prediction of Direction in the Test Set
As mentioned before, the Yen-Dollar market (USD/JPY exchange
rate market) follows random walk properties more than other
exchange rate markets because it is a more efficient market and
information flows more efficiently in it [1]. This makes the
Yen-Dollar market more challenging for prediction models, so an
improvement of about 7.4 percentage points in directional
status prediction could be useful in practice and to traders.
VI. CONCLUSION AND FURTHER RESEARCHES
In this study, we found that using KNN with dynamic time
warping as the distance function in a specific procedure
improves directional status prediction results compared with
previous studies. Using a data set of 15331 USD/JPY exchange
rate records, we built 500 sequences of 30 exchange rates each
and partitioned them into a 30% test set and a 70% training
set. The results showed an improvement of about 7.4 percentage
points in directional prediction (54% against 46.59%) compared
with the study of Yao and Tan [1], which is one of the most
cited studies in the field of financial prediction using newer
artificial intelligence and data mining techniques.
For further studies, researchers may use other customized DTW
algorithms to improve the distance function. Other exchange
rates such as GBP/USD or EUR/USD can also be tested to see how
much improvement this method achieves in directional
prediction. The function used after the KNN process can also
differ from the function presented in (1), and the resulting
changes in the evaluation results can be analyzed.
REFERENCES
[1] J. Yao, Ch. L. Tan, "A case study on using neural networks to
perform technical forecasting of forex.", Neurocomputing, vol.
34, no. 1, pp. 79-98, 2000.
[2] Y. S. Abu-Mostafa, A. F. Atiya, "Introduction to financial
forecasting.", Applied Intelligence, vol. 6, no. 3, pp. 205-213,
1996.
[3] A. Timmermann, C. W.J. Granger, "Efficient market hypothesis
and forecasting.", International Journal of Forecasting, vol. 20,
no.1, pp. 15-27, 2004.
[4] M. Fathian, and A. N. Kia, "Exchange rate prediction with
multilayer perceptron neural network using gold price as
external factor.", Management Science, vol. 2, 2012.
[5] K. Yang, C. Shahabi, "An efficient k-nearest neighbor search for
multivariate time series.", Information and Computation, vol.
205, no. 1, pp. 65-98, 2007.
[6] X. Lian, L. Chen, "Efficient similarity search over future stream
time series.", Knowledge and Data Engineering, IEEE
Transactions, vol. 20, no. 1, pp. 40-54, 2008.
[7] D. Zhou, M. Li, H. Yan, "An Efficient Similarity Search For
Financial Multivariate Time Series.", Wireless Communications,
Networking and Mobile Computing, WiCOM'08. 4th
International Conference on. IEEE, 2008.
[8] T. Jiang, Y. Feng, B. Zhang, "Online detecting and predicting
special patterns over financial data streams.", Journal of
Universal Computer Science, vol. 15, no. 13, pp. 2566-2585,
2009.
[9] V. Kurbalija, et al., "A framework for time-series analysis.",
Artificial Intelligence: Methodology, Systems, and Applications.
Springer Berlin Heidelberg, pp. 42-51, 2010.
[10] Z. Xing, J. Pei, E. Keogh, "A brief survey on sequence
classification.", ACM SIGKDD Explorations Newsletter, vol. 12,
no. 1, pp. 40-48, 2010.
[11] Z. Bankó, J. Abonyi, "Correlation based dynamic time warping
of multivariate time series.", Expert Systems with Applications,
2012.
[12] Sh. Ando, "Performance-Optimizing Classification of Time-
series based on Nearest Neighbor Density Approximation.",
Data Mining Workshops (ICDMW), 12th International
Conference on, IEEE, 2012.
[13] M. N. Nguyen, X.L. Li, S.K. Ng, "Ensemble based positive
unlabeled learning for time series classification.", Database
Systems for Advanced Applications, Springer Berlin
Heidelberg, 2012.
[14] E. Keogh, Ch. A. Ratanamahatana, "Exact indexing of dynamic
time warping.", Knowledge and Information Systems, vol. 7, no.
3, pp. 358-386, 2005.
[15] [online], access date: 2013-01-15, URL:
http://research.stlouisfed.org/fred2/categories/94
[16] T. Murano, "International Currency Realignment and the
Yen.", The Developing Economies, vol. 10, no. 4, pp. 340-358,
1972.

Arash Negahdari Kia received his B.S. in Computer Science
from the Department of Mathematics, Faculty of Science,
University of Tehran (2001-2005), and his M.Sc. in IT
engineering (e-commerce) from the School of Industrial
Engineering, Iran University of Science and Technology
(2006-2009). He is presently pursuing his Ph.D. in IT
engineering in the area of Data Mining and financial
prediction under the guidance of Dr. Saman Haratizadeh,
Assistant Professor, Department of Network Science and
Technologies, University of Tehran. He was awarded first
place among postgraduate students in his M.Sc. program. His
Master's thesis was about prediction of foreign exchange rate
time series with a hybrid model of artificial neural networks
and ARIMA using gold price as an external factor.

Dr. Saman Haratizadeh received his doctorate in Artificial
Intelligence, Software Engineering from Sharif University of
Technology. He is currently an Assistant Professor in the
Department of Network Science and Technologies, University
of Tehran. His research interests, courses, and papers are in
the fields of Data Mining, Machine Learning, Decision Support
Systems, Bio-Informatics, and the development of intelligent
tools for information processing and automatic decision making.

Dr. Hadi Zare received the Ph.D. degree in Applied
Mathematics from the Amirkabir University of
Technology, Tehran, Iran in 2012. He received an
M.Sc. in Mathematical Statistics with high distinction
from the Amirkabir University of Technology in 2008.
He is currently an Assistant Professor in the Department
of Network Science and Technologies at University of
Tehran. His research interests are in Statistical Machine
Learning and include feature selection, dimensionality
reduction methods, and probabilistic models for complex networks.