Decision Tree for Financial Time Series Data
Pei-Chann Chang, Chin-Yuan Fan, Chia-Hsuan Yeh, Wan-Ling Pan
Abstract: Stock price predictions suffer from two well-known
difficulties, i.e., complicated and nonstationary variations within
the large historic data. This paper establishes a novel financial time
series forecasting model by a case based fuzzy decision tree induction
for stock price movement predictions in the Taiwan Stock Exchange
Corporation (TSEC). This forecasting model integrates a case based
reasoning technique, a Fuzzy Decision Tree (FDT), and Genetic
Algorithms (GA) to construct a decision-making system based on
historical data and technical indexes. The model is mainly based on
the idea that the historic price database can be transformed into a
smaller case base together with a group of fuzzy decision rules. As a
result, the model can react more accurately to the current tendency of
the stock price movement from these smaller case based fuzzy decision
tree inductions. Hit rate is applied as a performance measure, and the
effectiveness of the proposed CBFDT model is demonstrated by
experimental comparison with other approaches on various stocks from
TSEC. The average hit rate of the CBFDT model is 91%, the highest among
the compared approaches.
I. INTRODUCTION
Mining stock market trends is a challenging task due to the market's
high volatility and noisy environment. Many factors influence the
performance of a stock market, including political events, general
economic conditions, and traders' expectations. Although stock and
futures traders have relied heavily upon various types of intelligent
systems to make trading decisions, the performance of these systems
has been a disappointment [2].
Many attempts have been made to predict the financial markets,
ranging from traditional time series approaches to artificial
intelligence techniques such as fuzzy systems and artificial neural
network (ANN) methodologies [1]. However, the main drawback of ANNs
and other black-box techniques is the tremendous difficulty in
interpreting the results. They do not provide insight into the nature
of the interactions between the technical indicators and the stock
market fluctuations. Thus, there is a need to develop methodologies
that provide an increased understanding of market processes [7] and
[9]. In addition, the dimensionality of financial time series data
creates another challenge for ANN approaches.
The development of a timely and accurate trading decision-making
tool is the key for stock traders to make profits. Since the stock
price series is affected by a mixture of deterministic and random
factors [7], new tools and techniques are needed to deal with noise
and nonlinearity in stock price prediction. Decision trees aim at
searching for rules hidden in very large amounts of data, which makes
them a new and efficient approach for time series analysis. In
addition, decision tree techniques have already been shown to be
interpretable, efficient, problem independent and able to treat
large-scale applications. However, they are also recognized as highly
unstable classifiers with respect to minor perturbations in the
training data. Fuzzy logic provides advantages in handling these
variances due to the elasticity of the fuzzy set formalism. In this
work, the decision tree tool ID3 (Iterative Dichotomizer 3) [16] is
combined with fuzzy theory and genetic algorithms to develop a case
based fuzzy decision tree for stock trading decisions. The proposed
model is able to predict the trends of stocks more precisely and to
offer speculators a better information platform during stock trading.
II. LITERATURE REVIEW
The fuzzy decision tree is similar to the standard decision tree
methods (e.g. CART [11, 16]) based on a recursive binary partitioning
algorithm. At each node during the construction of a fuzzy decision
tree, the most stable splitting region is selected and the boundary
uncertainty is estimated with an iterative resampling algorithm. The
boundary uncertainty estimate is used within the region's fuzzy
membership function to direct new samples to each resulting partition
with a quantified confidence. The fuzzy membership function is used to
recover those samples that lie within the uncertainty of the splitting
regions. Many attempts [10, 14, 19] have been made in the past to
introduce this technology into stock prediction. Sorensen et al. [19]
use CART to partition assets into outperforming and underperforming
assets. The portfolio is then composed of uniformly weighted
outperforming assets.
A recent tendency is that combining the soft computing (SC)
technologies of NNs, fuzzy logic (FL) and genetic algorithms (GAs) may
significantly improve an analysis [1, 3, 8, 13, 15, 17]. In general,
NNs are used for learning and curve fitting, FL is used to deal with
imprecision and uncertainty, and GAs are used for search and
optimization. As Zadeh [20] pointed out, merging these technologies
results in a tolerance for imprecision, uncertainty, and partial truth
to achieve tractability, robustness, and low solution cost.
Pei-Chann Chang and Chia-Hsuan Yeh are with the Department of
Information Management, Yuan Ze University, Taoyuan 32026, Taiwan,
R.O.C. (Corresponding author's email: iepchang@saturn.yzu.edu.tw)
Chin-Yuan Fan is with the Department of Industrial Engineering and
Management, Yuan Ze University, Taoyuan 32026, Taiwan, R.O.C.
(email: S948906@mail.yzu.edu.tw).
Wan-Ling Pan is with the Department of Industrial Engineering and
Management, Yuan Ze University, Taoyuan 32026, Taiwan, R.O.C.
(email: S955408@mail.yzu.edu.tw).
978-1-4244-1819-0/08/$25.00 © 2008 IEEE
Authorized licensed use limited to: IEEE Xplore. Downloaded on February 23, 2009 at 11:34 from IEEE Xplore. Restrictions apply.
This research follows Zadeh's suggestion by combining several soft
computing techniques, namely a fuzzy decision tree, case based
weighted data clustering, and a genetic algorithm, to develop a
forecasting model for stock trading decisions. In addition to the
fuzzy decision tree, the case based clustering algorithm is applied to
cluster the data before the fuzzy decision rules are generated. A set
of fuzzy decision rules is generated for each cluster, which enables
us to determine the fuzzy terms of each variable. Finally, a GA is
applied as an evolving tool to further fine-tune the forecasted result
from the FDT model.
III. A CASE BASED FUZZY DECISION TREE
Decision tree induction is free from parametric assumptions and it
generates a reasonable tree by progressively selecting attributes to
branch the tree. A decision tree is a flowchart-like structure where
each node represents a test on an attribute (such as trading volumes),
each branch represents an outcome of the test (such as trading
volumes = high) and leaf nodes represent a classification of an
instance (such as buy). By combining technical indices, stock price
variation, and transaction volumes on stock trading, this research
applies a fuzzy decision tree to develop a forecasting model for
generating decision rules in stock trading decisions.
A novel financial time series forecasting model is developed by
clustering and evolving a fuzzy decision tree for stocks in TSEC. This
forecasting model integrates a data clustering technique, a Fuzzy
Decision Tree (FDT), and Genetic Algorithms (GA) to construct a
decision-making system based on historical data and technical indexes.
The set of historical data is divided into n sub-clusters by adopting
a case based weighted algorithm. GA is then applied to evolve the
number of fuzzy terms for each input index in the FDT, which further
improves the forecasting accuracy of the model.
The framework of CBFDT is shown in Figure 1 and it can be divided
into four major steps: 1.) screening stocks from TSEC; 2.) clustering
the case library into smaller cases; 3.) establishing the fuzzy
decision tree; and finally 4.) outputting the forecasting results. The
details of each step are further explained in the following sections.
A. The Selection of Stocks
The data for analysis are selected from stock trading data from
2000/8/10 to 2005/9/30 on TSEC (Taiwan Stock Exchange Corporation).
The following fundamental indices were applied to select stocks which
are worthy of investment. These indices are listed in Table I.
[Figure 1: framework diagram; recoverable labels: class 1, class 2, class 3]
Fig. 1. The framework of CBFDT
TABLE I
FUNDAMENTAL INDICES FOR INVESTING IN STOCK
Indices: Descriptions
Stock Capital: This index is used to estimate the scale of a company.
  The higher the amount of stock capital, the higher the circulating
  ability is.
Monthly Revenue: Monthly revenue represents the operating achievements
  of a company. A better revenue situation shows that the company has
  the ability to make more profits.
Earnings Per Share (EPS): EPS = Total profit / Total stock shares
Turnover Rate: Turnover rate is an index to be observed and it
  represents the level of investors' concern.
Net worth and market value ratio (NWMVR): NWMVR = Stock net worth /
  Market price
Price-Earnings Ratio (PER): PER = Stock price / Profit after taxes.
  A lower ratio means investors can buy the stock at a lower price.
2008 IEEE International Conference on Fuzzy Systems (FUZZ 2008) 77
B. A Case Based Weighted-clustering method
A stock historic case library derived from yahoo.com.tw is used to
develop the weighted distance metric and the similarity measure used
in the following [18]. First assume a Stock Library
$SL = \{e_1, e_2, \ldots, e_N\}$. Each case in the library can be
identified by an index of corresponding features. In addition, each
stock has an associated action to be made for its current performance,
and the action is either a hold, sell or buy decision. More formally,
we use a collection of features $\{F_j \; (j = 1, \ldots, n)\}$ to
represent the cases and a variable V to denote the action. The i-th
case $e_i$ in the library can be represented as an
(n+1)-dimensional vector, i.e.
$e_i = (x_{i1}, x_{i2}, \ldots, x_{in}, y_i)$, where $x_{ij}$
corresponds to the value of feature $F_j$ $(1 \le j \le n)$ and
$y_i$ $(i = 1, \ldots, N)$ corresponds to the action to be taken,
which will be defined later. Suppose that for each j
$(1 \le j \le n)$ a weight $w_j$ $(w_j \in [0, 1])$ has been assigned
to the j-th feature to indicate the importance of the feature. Then,
for any pair of cases $e_p$ and $e_q$ in the library, a weighted
distance metric can be defined as
\[
d_{pq}^{(w)} = \left( \sum_{j=1}^{n} w_j^2 (x_{pj} - x_{qj})^2 \right)^{1/2}
             = \left( \sum_{j=1}^{n} w_j^2 \chi_j^2 \right)^{1/2}
\tag{1}
\]
where $\chi_j^2 = (x_{pj} - x_{qj})^2$. When all the weights are equal
to 1, the distance metric defined above coincides with the Euclidean
measure, denoted by $d_{pq}^{(1)}$.
By using the weighted distance defined in equation (1), a similarity
measure between two cases, $SM_{pq}^{(w)}$, can be defined as follows:
\[
SM_{pq}^{(w)} = \frac{1}{1 + \alpha \cdot d_{pq}^{(w)}}
\tag{2}
\]
where $\alpha$ is a positive parameter. When all weights take the
value 1, the similarity measure is denoted by $SM_{pq}^{(1)}$.
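To make the weighted distance of equation (1) and the similarity measure of equation (2) concrete, the following is a minimal Python sketch. The function names `weighted_distance` and `similarity` and the sample feature vectors are illustrative assumptions, not part of the original method description.

```python
import math


def weighted_distance(x_p, x_q, weights):
    """Equation (1): sqrt(sum_j w_j^2 * (x_pj - x_qj)^2)."""
    return math.sqrt(sum((w * (a - b)) ** 2
                         for a, b, w in zip(x_p, x_q, weights)))


def similarity(x_p, x_q, weights, alpha):
    """Equation (2): 1 / (1 + alpha * d_pq^(w)), with alpha > 0."""
    return 1.0 / (1.0 + alpha * weighted_distance(x_p, x_q, weights))


# Two hypothetical cases, each described by three feature values.
e_p = [0.2, 0.5, 0.9]
e_q = [0.2, 0.1, 0.6]

# With all weights equal to 1 the metric is the Euclidean distance.
print(weighted_distance(e_p, e_q, [1.0, 1.0, 1.0]))
# A zero weight removes a feature from the comparison entirely.
print(similarity(e_p, e_q, [1.0, 0.0, 1.0], alpha=0.6))
```

Identical cases have distance 0 and therefore similarity 1, and the similarity decreases towards 0 as the weighted distance grows.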
After introducing the weighted distance metric and the similarity
measure, the weighted clustering method is further described in the
following steps:
Phase one: Finding every weighted value from important Technical
Indices.
In this step, the gradient method is applied to find the weighted
values of the important technical indices, and a feature evaluation
function is defined. The smaller the evaluation value, the better the
corresponding features. Thus we would like to find the weights for
which the evaluation function attains its minimum. The detailed
process can be described as follows:
Step 1. Select the parameter $\alpha$ and the learning rate $\eta$.
Step 2. Initialize $w_j$ with random values in [0, 1].
Step 3. Compute $\Delta w_j$ for each j using equation (3):
\[
\Delta w_j = -\eta \frac{\partial E}{\partial w_j}
\tag{3}
\]
In this equation, E is defined by equation (4):
\[
E(w) = \frac{2 \sum_{p} \sum_{q \, (q < p)}
  \left[ SM_{pq}^{(w)} \left(1 - SM_{pq}^{(1)}\right)
       + SM_{pq}^{(1)} \left(1 - SM_{pq}^{(w)}\right) \right]}
  {N (N - 1)}
\tag{4}
\]
where N is the number of cases in the SL base.
Step 4. Update $w_j$ with $w_j + \Delta w_j$ for each j.
Step 5. Repeat step 3 and step 4 until convergence, i.e., until the
value of E becomes less than or equal to a given threshold or until
the number of iterations exceeds a certain predefined number.
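Steps 1 to 5 can be sketched as plain gradient descent on the evaluation function E(w) of equation (4). In the sketch below the gradient is obtained by the chain rule through the similarity $1/(1 + \alpha d)$ and the weighted distance d; this derivation, the clipping of the weights to [0, 1], the function name `learn_weights`, and the sample cases are assumptions made for illustration and are not stated in the text.

```python
import math
import random


def learn_weights(cases, alpha, eta, max_iter=1000, tol=1e-4, seed=0):
    """Gradient descent on E(w) of equation (4); returns (weights, E)."""
    rng = random.Random(seed)
    n_cases, n_feat = len(cases), len(cases[0])
    w = [rng.random() for _ in range(n_feat)]       # Step 2: random start
    pairs = [(p, q) for p in range(n_cases) for q in range(p)]
    norm = 2.0 / (n_cases * (n_cases - 1))

    def evaluate(w):
        """Return E(w) and its gradient with respect to each w_j."""
        e_val, grad = 0.0, [0.0] * n_feat
        for p, q in pairs:
            chi2 = [(a - b) ** 2 for a, b in zip(cases[p], cases[q])]
            d_w = math.sqrt(sum(wj * wj * c for wj, c in zip(w, chi2)))
            d_1 = math.sqrt(sum(chi2))
            sm_w = 1.0 / (1.0 + alpha * d_w)
            sm_1 = 1.0 / (1.0 + alpha * d_1)
            e_val += sm_w * (1 - sm_1) + sm_1 * (1 - sm_w)
            if d_w > 0:
                # Chain rule: dE/dSM_w * dSM_w/dd * dd/dw_j
                common = (1 - 2 * sm_1) * (-alpha * sm_w ** 2) / d_w
                for j in range(n_feat):
                    grad[j] += common * w[j] * chi2[j]
        return norm * e_val, [norm * g for g in grad]

    for _ in range(max_iter):                       # Step 5: iterate
        e_val, grad = evaluate(w)
        if e_val <= tol:
            break
        # Steps 3 and 4: w_j <- w_j - eta * dE/dw_j, clipped to [0, 1]
        # (the clipping is an added assumption).
        w = [min(1.0, max(0.0, wj - eta * g)) for wj, g in zip(w, grad)]
    return w, evaluate(w)[0]


# Four hypothetical cases with two features each.
cases = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.85], [0.9, 0.2]]
weights, e_final = learn_weights(cases, alpha=0.6, eta=0.7, max_iter=50)
print(weights, e_final)
```

The defaults for `max_iter` and `tol` are placeholders; the stopping threshold and iteration limit would be chosen by experiment.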
Phase two: Dividing the SL (Stock Library) into Several Clusters
This phase partitions the stock library into several clusters by using
the weighted distance metric with the weights learned in the previous
phase. Since the features are real-valued, many methods such as
K-Means clustering [5] and Kohonen's self-organizing network [5, 15]
can be used to partition the case library. However, this paper adopts
a typical clustering approach by Shiu et al. [18], which uses only the
information of similarity between cases. This approach first
transforms the similarity matrix to an equivalent matrix and then
considers the cases that are equivalent to each other as one cluster.
The detailed process can be described as follows:
Step 1. Give a significant level (threshold) $\beta \in (0, 1]$.
Step 2. Determine the similarity matrix
$SM = \left( SM_{pq}^{(w)} \right)$ according to equations (1) and (2).
Step 3. Compute $SM1 = SM \circ SM = (s_{pq})$, where
\[
s_{pq} = \max_{k} \left( \min \left( sm_{pk}^{(w)}, sm_{kq}^{(w)} \right) \right)
\]
Step 4. If $SM1 \subseteq SM$ then go to step 5, else replace SM with
SM1 and go to step 3.
Step 5. Determine several clusters based on the rule "case p and case
q belong to the same cluster if and only if $s_{pq} \ge \beta$".
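The clustering steps above amount to computing the max-min transitive closure of the similarity matrix and then thresholding it at $\beta$. The following Python sketch illustrates this under the assumption that the similarity matrix is symmetric with ones on the diagonal (which holds for equation (2), since a case has distance 0 to itself). The function name `cluster_cases` and the sample matrix are hypothetical.

```python
def cluster_cases(sm, beta):
    """Group cases via the max-min closure of similarity matrix sm."""
    n = len(sm)
    while True:
        # Step 3: max-min composition SM1 = SM o SM
        sm1 = [[max(min(sm[p][k], sm[k][q]) for k in range(n))
                for q in range(n)] for p in range(n)]
        # Step 4: stop once SM1 is contained in SM (no entry increased)
        if all(sm1[p][q] <= sm[p][q] for p in range(n) for q in range(n)):
            break
        sm = sm1
    # Step 5: p and q share a cluster iff s_pq >= beta
    clusters, assigned = [], set()
    for p in range(n):
        if p not in assigned:
            group = [q for q in range(n) if sm[p][q] >= beta]
            clusters.append(group)
            assigned.update(group)
    return clusters


# Hypothetical similarity matrix for four cases.
sm = [
    [1.0, 0.8, 0.3, 0.2],
    [0.8, 1.0, 0.4, 0.2],
    [0.3, 0.4, 1.0, 0.7],
    [0.2, 0.2, 0.7, 1.0],
]
print(cluster_cases(sm, beta=0.65))  # [[0, 1], [2, 3]]
print(cluster_cases(sm, beta=0.4))   # [[0, 1, 2, 3]]
```

A lower threshold merges more cases into the same cluster, which is why the choice of $\beta$ controls the number of cases obtained for each stock.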
After the case library has been clustered into smaller cases, the next
section gives a brief introduction to the GAFDT forecasting model.
C. GAFDT forecasting model
This research first uses case-based reasoning methods to cluster the
stock data. It then combines our previous research on Genetic
Algorithms and Fuzzy Decision Trees (GAFDT) [4] to develop a
forecasting model for the prediction of stock price movement. The
framework of GAFDT is depicted as follows:
[Figure 2: framework diagram; recoverable labels: class 1, class 2, class 3]
Fig. 2. The framework of GAFDT
1) Data Fuzzification
The input data are fuzzified with the triangular membership function
\[
\mu(x) =
\begin{cases}
0, & x \le a \\
\dfrac{x - a}{b - a}, & a \le x \le b \\
\dfrac{c - x}{c - b}, & b \le x \le c \\
0, & c \le x
\end{cases}
\tag{5}
\]
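As an illustration of data fuzzification with a triangular membership function (0 outside [a, c], rising linearly from a to the peak b, then falling linearly to c), the sketch below maps a crisp index value to membership degrees in several fuzzy terms. The even spacing of the terms over the value range, the function names `triangular` and `fuzzify`, and the example range 0 to 100 are assumptions for illustration only; the source does not specify how the terms are placed.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x, lo, hi, n_terms):
    """Memberships of x in n_terms (>= 2) evenly spaced triangular terms.

    Even spacing over [lo, hi] is an assumption of this sketch.
    """
    step = (hi - lo) / (n_terms - 1)
    centers = [lo + i * step for i in range(n_terms)]
    return [triangular(x, c - step, c, c + step) for c in centers]


# A hypothetical index value of 25 on a 0..100 scale, with 3 fuzzy terms.
print(fuzzify(25.0, 0.0, 100.0, 3))  # [0.5, 0.5, 0.0]
```

With this layout neighbouring terms overlap, so a value between two peaks belongs partly to both terms, and the number of terms `n_terms` is exactly the quantity the genetic algorithm is later asked to tune.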
2) ID3 decision tree
The ID3 decision tree learning algorithm computes the Information Gain
G based on each attribute A, and it is defined as follows:
\[
G(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \, Entropy(S_v)
\tag{6}
\]
where S is the total input space and $S_v$ is the subset of S for
which attribute A has a value v. The Entropy(S) over c classes is
given by $\sum_{i=1}^{c} -p_i \log_2 (p_i)$, where $p_i$ represents
the probability of class "i". The attribute with the highest
information gain, say B, is chosen as the root node of the tree. Next,
a new decision tree is recursively constructed over each value of B
using the training subspace $S - \{S_B\}$. A leaf node or a decision
node is formed when all the instances within the available training
subspace are from the same class. For detecting anomalies, the ID3
decision tree outputs a binary classification decision of "0" to
indicate normal and "1" to indicate anomaly class assignments to test
instances.
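The information gain criterion can be illustrated with a short Python sketch: the entropy over classes is $-\sum_i p_i \log_2 p_i$, and the gain of an attribute is the entropy of the full set minus the size-weighted entropy of the subsets it induces. The attribute names, values and labels below are made up for illustration and do not come from the experiments.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy over classes: sum_i -p_i * log2(p_i)."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy(S) minus the weighted entropy of each subset S_v."""
    total = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Hypothetical discretized cases: each row maps attribute -> fuzzy term.
rows = [
    {"volume": "high", "rsi": "low"},
    {"volume": "high", "rsi": "high"},
    {"volume": "low", "rsi": "low"},
    {"volume": "low", "rsi": "high"},
]
labels = ["buy", "buy", "sell", "sell"]

# ID3 picks the attribute with the highest gain as the next node.
best = max(["volume", "rsi"], key=lambda a: information_gain(rows, labels, a))
print(best)  # volume
```

In this toy data "volume" separates the classes perfectly (gain 1 bit) while "rsi" carries no information (gain 0), so "volume" becomes the root.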
The fuzzy resolution concept in fuzzy set theory is applied to
transform data attributes from continuous to discrete. Then, a
decision tree classification method is further embedded to build a
stock forecasting model. In summary, the ID3 decision tree is applied
in our model as a programming tool.
3) Evolving Fuzzy Decision Tree by Genetic Algorithm
A Genetic Algorithm is used in this stage to improve the accuracy of
the FDT (fuzzy decision tree) in financial data forecasting. The
Genetic Algorithm searches for the best number of fuzzy terms for
every input (technical index), and the fitness function is
recalculated after each new setting of the numbers of fuzzy terms. In
this research, the fitness function is the forecasting accuracy of
stock price movement, i.e., of the buy, sell or hold decision. Next,
the GA continues with selection, crossover, and mutation. The process
repeats iteratively until the stopping criteria are satisfied.
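A minimal sketch of this evolving stage is given below. Each chromosome holds one number of fuzzy terms per input index, and the fitness is supplied by the caller (in the model it would be the forecasting accuracy of the resulting fuzzy decision tree). The text only names selection, crossover and mutation, so the binary tournament selection, one-point crossover, per-gene random-reset mutation, the 2 to 9 range for the number of terms, and the toy fitness used in the example are all assumptions of this sketch.

```python
import random


def evolve_fuzzy_terms(fitness, n_inputs, pop_size=20, generations=100,
                       crossover_rate=0.9, mutation_rate=0.1,
                       min_terms=2, max_terms=9, seed=0):
    """Evolve the number of fuzzy terms per input; returns the best found."""
    rng = random.Random(seed)
    pop = [[rng.randint(min_terms, max_terms) for _ in range(n_inputs)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    for _ in range(generations):
        def pick():
            # Binary tournament selection (an assumed selection scheme).
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b

        children = []
        while len(children) < pop_size:
            p1, p2 = pick()[:], pick()[:]
            if n_inputs > 1 and rng.random() < crossover_rate:
                cut = rng.randint(1, n_inputs - 1)   # one-point crossover
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for child in (p1, p2):
                for i in range(n_inputs):
                    if rng.random() < mutation_rate:  # random-reset mutation
                        child[i] = rng.randint(min_terms, max_terms)
                children.append(child)
        pop = children[:pop_size]
        best = max(pop + [best], key=fitness)         # remember best so far
    return best


# Toy stand-in for forecasting accuracy: closeness to a made-up target.
target = [9, 4, 8, 9]


def toy_fitness(chromosome):
    return -sum(abs(g - t) for g, t in zip(chromosome, target))


print(evolve_fuzzy_terms(toy_fitness, n_inputs=4, generations=30))
```

Because evaluating the real fitness means building and testing a fuzzy decision tree for every chromosome, an actual implementation would likely cache fitness values rather than recompute them as this sketch does.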
Kosko [12] used the fuzzy entropy method to revise fuzzy theory data,
and Janikow [11] used the fuzzy set's probability to replace the crisp
set probability when calculating the entropy of fuzzy set data. In
fuzzy set theory, the membership function is one of the basic
concepts; through it one is able to process quantitative fuzzy set
data and handle fuzzy information. Finding an appropriate membership
function for this purpose is therefore very important in fuzzy set
theory. However, there is no single rule that suits all kinds of
fuzzy set data. Researchers use different membership functions for
different problems; the most commonly used membership functions
include triangular, trapezoid, and Gaussian membership functions. This
research adopts the triangular membership function as its primary
membership function, as given in equation (5).
D. The judgment of output value
This research mainly applies evolutionary fuzzy decision trees to
predict the trend of stock price movement. The judgment of stock price
movement is defined as follows:
\[
y = \frac{x_t - x_{t-1}}{x_t}
\tag{7}
\]
where $x_t$ is the closing price of the individual stock in the t-th
period and $x_{t-1}$ is the closing price of the individual stock in
the (t-1)-th period.
It is a sell decision when y is greater than +0.5%. On the other
hand, it is a buy decision when y is less than -0.5%. Otherwise, it
is a hold position if y is between +0.5% and -0.5%.
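The judgment rule can be written as a small Python function. The sketch assumes the relative price change is computed as $(x_t - x_{t-1}) / x_t$, which is how equation (7) reads in this copy of the paper, and uses the +0.5% and -0.5% thresholds stated above; the function names and sample prices are hypothetical.

```python
def price_change(x_t, x_prev):
    """Relative change y = (x_t - x_{t-1}) / x_t, as read from equation (7)."""
    return (x_t - x_prev) / x_t


def trading_signal(y, threshold=0.005):
    """Map the change y to sell / buy / hold with a 0.5% threshold."""
    if y > threshold:
        return "sell"
    if y < -threshold:
        return "buy"
    return "hold"


# Hypothetical closing prices for consecutive periods.
print(trading_signal(price_change(101.0, 100.0)))  # sell
print(trading_signal(price_change(99.0, 100.0)))   # buy
print(trading_signal(price_change(100.2, 100.0)))  # hold
```

These three labels are the class values that the leaf nodes of the fuzzy decision tree predict.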
IV. EXPERIMENTAL RESULT
According to the criteria listed in Table I, three different stocks
are selected for study: Epistar Corp. (EPISTAR), Silicon Integrated
System Corp. (SiS) and UMC Corp. (UMC), which represent upward,
downward and steady-state stocks for our research purpose. The
historic data of these stocks are taken from 2000/8/10 to 2005/9/30.
The main purpose of instance selection is to emphasize the importance
of stock screening. Another purpose is to show that the proposed model
performs robustly even under different types of stock trends.
Different input factors for each stock are then selected according to
stepwise regression analysis.
A. Best Parameters Setting (Stepwise regression)
According to the stepwise regression analysis (SRA) method, the
important factors of each stock are selected from the set of input
factors, which contains 24 technical indices. The statistical software
SPSS is applied to execute the SRA procedure, and the important input
factors selected are shown in the following table.
TABLE II
INPUT FACTORS SELECTED BY STEPWISE REGRESSION ANALYSIS
Stock Names   Input Factors                     Results
EPISTAR       Technical Indices                 12RSI
              Difference of Technical Indices   10BIAS difference
                                                6RSI difference
                                                12RSI difference
SiS           Technical Indices                 12RSI
              Difference of Technical Indices   10BIAS difference
                                                6RSI difference
                                                12RSI difference
UMC           Technical Indices                 12W%R
              Difference of Technical Indices   10BIAS difference
                                                12RSI difference
                                                12W%R difference
B. A Weighted fuzzy clustering method
Experimental design is applied to decide the best parameter setting.
The parameter setting obtained from the experimental tests is shown in
Table III. In addition, the best number of cases for each stock is
also shown in this table.
TABLE III
BEST PARAMETER SETTING FROM EXPERIMENTAL METHOD
Parameter setting       EPISTAR   SIS       UMC
α                       0.6       0.6       0.6
Learning Rate           0.7       0.7       0.7
β                       0.65      0.65      0.4
Phase-one run times     1000      1000      1000
Phase-two run times     30        30        30
Best number of Cases    8 Cases   7 Cases   4 Cases
C. Best Parameters Setting (Genetic Algorithms)
Genetic Algorithms are applied to evolve the fuzzy terms of each
factor in this research. Four important factors are selected in this
experimental design: population size, number of generations,
crossover rate and mutation rate. The factor design derived for the GA
evolving applications of these three stocks is shown in Table IV.
TABLE IV
PARAMETER SETUPS OF GA FOR STOCKS EPISTAR, SIS, AND UMC
Factors                 Epistar   Sis       Umc
                        Levels    Levels    Levels
Population Size         20        20        20
Number of Generation    100       10        100
Crossover rate          0.9       0.9       0.9
Mutation rate           0.1       0.1       0.3
D. Method Comparisons
After setting up the parameters of the experiments, we compare the
output of CBFDT with those from the traditional FDT and GAFDT. As
shown in Table V, the 5-fold crossover test shows that CBFDT performs
much better than GAFDT and FDT in hit rate.
TABLE V
HIT RATE COMPARISONS OF ALL STOCKS FROM DIFFERENT FORECASTING MODELS
Average hit rate in each crossover test
Stock     Model   First   Second  Third   Fourth  Fifth
EPISTAR   FDT     0.76    0.67    0.71    0.71    0.70
          GAFDT   0.85    0.79    0.87    0.82    0.82
          CBFDT   0.91    0.90    0.91    0.91    0.93
SIS       FDT     0.76    0.69    0.68    0.75    0.68
          GAFDT   0.83    0.81    0.78    0.81    0.77
          CBFDT   0.93    0.91    0.93    0.92    0.93
UMC       FDT     0.75    0.72    0.69    0.71    0.70
          GAFDT   0.84    0.83    0.83    0.86    0.78
          CBFDT   0.93    0.93    0.92    0.94    0.95
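For readers who want to summarize Table V, the short Python sketch below averages the tabulated hit rates per model, both within each stock and over all stocks and tests. The numbers are copied from Table V; the dictionary layout and the helper name `average_hit_rate` are illustrative choices.

```python
# Hit rates from Table V: stock -> model -> five crossover tests.
hit_rates = {
    "EPISTAR": {
        "FDT":   [0.76, 0.67, 0.71, 0.71, 0.70],
        "GAFDT": [0.85, 0.79, 0.87, 0.82, 0.82],
        "CBFDT": [0.91, 0.90, 0.91, 0.91, 0.93],
    },
    "SIS": {
        "FDT":   [0.76, 0.69, 0.68, 0.75, 0.68],
        "GAFDT": [0.83, 0.81, 0.78, 0.81, 0.77],
        "CBFDT": [0.93, 0.91, 0.93, 0.92, 0.93],
    },
    "UMC": {
        "FDT":   [0.75, 0.72, 0.69, 0.71, 0.70],
        "GAFDT": [0.84, 0.83, 0.83, 0.86, 0.78],
        "CBFDT": [0.93, 0.93, 0.92, 0.94, 0.95],
    },
}


def average_hit_rate(model, stock=None):
    """Mean hit rate of a model for one stock, or over all stocks."""
    stocks = [stock] if stock else list(hit_rates)
    values = [v for s in stocks for v in hit_rates[s][model]]
    return sum(values) / len(values)


for model in ("FDT", "GAFDT", "CBFDT"):
    print(model, round(average_hit_rate(model), 3))
```

Averaged over all three stocks and five tests, the tabulated values give roughly 0.71 for FDT, 0.82 for GAFDT and 0.92 for CBFDT.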
E. Discussions
As observed in Table V, CBFDT outperforms the other FDT methods. The
reasons are: 1.) The case based clustering method splits the case
library into more homogeneous smaller cases in the data preprocessing
stage. Therefore, fuzzy rules generated from each case can react more
sensitively to the current stock price movement. 2.) An evolving FDT
can decide the number of fuzzy terms more effectively, especially when
the amount of data is increasingly large. As shown in Table VI, the
data amounts and numbers of fuzzy terms show that the more data there
are, the more fuzzy terms are used. To generate effective fuzzy rules,
the number of fuzzy terms should be evolved through GA. As a result,
the hit rate can be further improved over FDT.
TABLE VI
DATA AMOUNTS AND NUMBER OF FUZZY TERMS IN EACH TRIAL
          Data amounts
Trials    500           100           50
1         9, 4, 8, 9    7, 8, 9, 8    5, 7, 6, 6
2         9, 9, 6, 9    7, 8, 4, 9    6, 7, 2, 4
3         9, 6, 2, 9    7, 8, 9, 8    6, 5, 4, 3
4         9, 6, 6, 9    2, 8, 6, 9    5, 7, 3, 6
5         8, 6, 8, 8    8, 8, 4, 8    4, 7, 7, 8
6         9, 4, 2, 9    2, 8, 3, 9    4, 7, 8, 6
7         9, 6, 4, 9    7, 8, 2, 9    5, 7, 8, 6
8         9, 4, 9, 9    5, 8, 7, 8    2, 7, 6, 6
9         9, 6, 3, 9    8, 8, 8, 8    4, 7, 7, 7
10        9, 8, 5, 9    7, 8, 4, 9    2, 7, 7, 6
In addition, data distribution is another important factor to be
considered, since it affects the number of fuzzy terms to be
clustered. For example, 12RSI_delta is divided clearly into 9 fuzzy
terms as shown in Figure 3, and the number of data in each term is
small and evenly distributed. However, if it is divided into 3 fuzzy
terms as shown in Figure 4, there is a large number of data in each
term, and the fuzzy rules generated may not be able to react to the
real situation, which may lead to wrong decisions. Therefore, the
number of fuzzy terms of each feature does affect the number of rules
generated.
[Figure 3: bar chart titled 12RSI; x-axis: Fuzzy Terms (1 to 9); y-axis: Data Numbers (0 to 25); bar values 4, 10, 17, 18, 16, 22, 7, 4, 2]
Fig. 3. 12RSI_delta divided into 9 fuzzy terms
[Figure 4: bar chart titled 12RSI; x-axis: Fuzzy Terms (1 to 3); y-axis: Data Numbers]
Fig. 4. 12RSI divided into 3 fuzzy terms
V. CONCLUSION
A considerable amount of research has been conducted to study the
behavior of stock price movement. However, the investor is more
interested in making a profit from simple trading decisions such as
Buy/Hold/Sell provided by the system than in predicting the stock
price itself. Therefore, we take a different approach by applying a
case based fuzzy decision tree to predict the stock price movement. A
stepwise regression analysis (SRA) method is applied to select the
most important factors from the set of inputs. Next, a weighted
clustering method is adopted to divide the case base into smaller
cases. Within each case, more homogeneous data are grouped together.
Therefore, these data can react more effectively to the current stock
price movement. Finally, a GA is applied to evolve the fuzzy terms of
each factor in order to derive the best fuzzy decision tree for each
case. Through a series of experimental tests, the CBFDT outperforms
other approaches with an average hit rate around 91%. This is the
highest hit rate among those published in the literature up to the
present. The hit ratio (buy or sell) of the future stock price
movement can be applied to help investors make better decisions in
trading stocks.
In the future, the proposed system can be further investigated by
incorporating other soft computing techniques or a better data mining
forecasting model than the ID3 decision tree. Possible directions are
listed as follows:
1) A different forecasting model: Numerous forecasting models other
than the ID3 model exist in the academic literature. It is worthwhile
to study the behavior of these models when applied to the prediction
of stock price movement. Different input factors and different
forecasting models such as CART and C4.5 are possible candidates for
improving the accuracy of the performance measure.
2) A different data fuzzification method: Different kinds of fuzzy
membership functions can be applied to transform the original data,
including trapezoid and Gaussian membership functions. These functions
may lead to better performance.
REFERENCES
1. A. Abraham, N. Baikunth, and P.K. Mahanti, "Hybrid Intelligent
Systems for Stock Market Analysis," Lecture Notes in Computer
Science, vol. 2074, pp. 337-345, 2001.
2. Y.S. Abu-Mostafa and A.F. Atiya, "Introduction to financial
forecasting," Applied Intelligence, vol. 6, pp. 205-213, 1996.
3. N. Baba, N. Inoue, and H. Asakawa, "Utilization of Neural
Networks & GAs for Constructing Reliable Decision Support Systems to
Deal Stocks," IEEE-INNS-ENNS International Joint Conference on Neural
Networks (IJCNN'00), vol. 5, pp. 5111-5116, 2000.
4. P.C. Chang, Chen-Hao Liu, Chin-Yuan Fan, and Wei-Hsiu Huang,
"Establishing a Cluster Based Evolving Fuzzy Decision Tree on
Financial Time Series Data," The 8th Asia Pacific Industrial
Engineering & Management, Kaohsiung, 2007.
5. P.C. Chang and C.H. Liu, "A TSK type Fuzzy Rule Based System for
Stock Price Prediction," Expert Systems with Applications, vol. 34
(1), pp. 135-144, 2006.
6. P.C. Chang and T. Warren Liao, "Combining SOM and Fuzzy Rule Base
for Flow Time Prediction in Semiconductor Manufacturing Factory,"
Applied Soft Computing, vol. 6 (2), pp. 198-206, 2006.
7. S.C. Chi, H.P. Chen, and C.H. Cheng, "A Forecasting Approach for
Stock Index Future Using Grey Theory and Neural Networks," IEEE
International Joint Conference on Neural Networks, pp. 3850-3855,
1999.
8. G. Corani and G. Guariso, "Coupling fuzzy modeling and neural
networks for river flood prediction," IEEE Transactions on Systems,
Man and Cybernetics, Part C: Applications and Reviews, vol. 35 (3),
pp. 382-390, 2005.
9. G.P. Zhang, "Avoiding Pitfalls in Neural Network Research," IEEE
Transactions on Systems, Man, and Cybernetics, Part C, vol. 37,
pp. 3-16, 2007.
10. H.L. Larsen and R.R. Yager, "A framework for fuzzy recognition
technology," IEEE Transactions on Systems, Man, and Cybernetics,
Part C, vol. 30, pp. 65-76, 2000.
11. C.Z. Janikow, "Fuzzy decision tree: Issues and methods," IEEE
Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics,
vol. 28, no. 1, pp. 1-14, 1998.
12. B. Kosko, Neural Networks and Fuzzy Systems, Prentice-Hall,
Englewood Cliffs, NJ, 1992.
13. Mu-Chun Su, Chih-Wen Liu, and Shuenn-Shing Tsay,
"Neural-network-based fuzzy model and its application to transient
stability prediction in power systems," IEEE Transactions on Systems,
Man, and Cybernetics, Part C, vol. 29, pp. 149-157, 1999.
14. E.M. Mugambi, A. Hunter, G. Oatley, and L. Kennedy,
"Polynomial-fuzzy decision tree structures for classifying medical
data," Knowledge-Based Systems, vol. 17, issue 2-4, pp. 81-87, 2004.
15. T. Murata, H. Ishibuchi, and M. Gen, "Adjusting Fuzzy Partitions
by Genetic Algorithms and Histograms for Pattern Classification
Problems," Proc. of IEEE Conf. on Computational Intelligence,
pp. 9-14, 1998.
16. J.R. Quinlan, "Induction of decision trees," Machine Learning,
vol. 1, 1986.
17. R.H. Golan and W. Ziarko, "A methodology for stock market analysis
utilizing rough set theory," Proceedings of the IEEE/IAFE 1996
Conference on Computational Intelligence for Financial Engineering,
pp. 32-40, 1995.
18. S.C.K. Shiu, Y. Li, and X.Z. Wang, "Using fuzzy integral to model
case-base competence," Proc. of Soft Computing in Case-based
Reasoning Workshop, in conjunction with the 4th Int. Conf. in
Case-Based Reasoning, ICCBR 2001, Vancouver, Canada, pp. 206-212,
2001.
19. E.H. Sorensen, K.L. Miller, and C.K. Ooi, "The Decision Tree
Approach to Stock Selection," Journal of Portfolio Management, Fall,
pp. 42-45, 2000.
20. L.A. Zadeh, "Fuzzy sets," Information and Control, vol. 8,
pp. 338-353, 1965.