
Pattern Discovery of Fuzzy Time Series for Financial Prediction

- IEEE Transactions on Knowledge and Data Engineering

Presented by Hong Yancheng For COMP630P, Spring 2009

Outline
Introduction and target problem
Background knowledge and related work
Modeling the candlestick pattern
Candlestick pattern for financial prediction
Experiments and applications
Conclusion and discussion

Problems with existing stock prediction tools


Many tools exist for predicting stock prices
Artificial neural networks, SVM, neuro-fuzzy systems, Naive Bayes and so on

Three major problems with these tools


The training process is nontrivial, and the training result cannot be reused for other targets
Prediction results are incomprehensible
Parameters are hard for users to tune

A gap exists between prediction results and investment decisions
Improving prediction vs. making buy/sell decisions

Target problem
Data preprocessing is needed before applying the various techniques
Data mining, machine learning and pattern recognition

A good knowledge representation method can assist investors
A knowledge-based method to transform financial data into comprehensible rules and visual patterns


Japanese Candlestick Theory


Four general ways to represent stock price fluctuation
Original daily fluctuation
Single close price
Bar chart
Candlestick chart
The candlestick chart carries more visual information

Fuzzy Time Series


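Fuzzy time series
Assume $U$ is the universe of discourse, where $U = \{x_1, x_2, \ldots, x_n\}$. A fuzzy set $A_i$ of $U$ is defined by
$$A_i = \mu_{A_i}(x_1)/x_1 + \mu_{A_i}(x_2)/x_2 + \cdots + \mu_{A_i}(x_n)/x_n$$
where $\mu_{A_i}$ is the membership function of the fuzzy set $A_i$, $\mu_{A_i}: U \to [0, 1]$, and $\mu_{A_i}(x_k)$ is the degree of membership of $x_k$ in $A_i$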


Fuzzy candlestick pattern


A fuzzy candlestick pattern is composed of related fuzzy candlestick lines in a period
A fuzzy candlestick line has seven parts
Sequence, open style, close style, upper shadow, body, body color and lower shadow
Sequence defines the location of the candlestick line within the pattern
Open/close style models the relationship between consecutive candlestick lines

Candlestick line modeling


Modeling the length of shadow and body
Four linguistic variables, EQUAL, SHORT, MIDDLE and LONG, indicate the fuzzy sets of length
$$L_{\text{upper}} = \frac{high - \max(open, close)}{open} \times 100$$
$$L_{\text{lower}} = \frac{\min(open, close) - low}{open} \times 100$$
$$L_{\text{body}} = \frac{\max(open, close) - \min(open, close)}{open} \times 100$$
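The three length formulas can be sketched in a few lines of Python. The `Candle` class and the `line_lengths` function are illustrative names, not taken from the paper; the arithmetic follows the formulas on this slide.

```python
from dataclasses import dataclass


@dataclass
class Candle:
    """One candlestick line (hypothetical container for the four measured prices)."""
    open: float
    high: float
    low: float
    close: float


def line_lengths(c: Candle) -> dict:
    """Return upper shadow, body and lower shadow lengths as percentages of the open price."""
    top = max(c.open, c.close)      # upper edge of the body
    bottom = min(c.open, c.close)   # lower edge of the body
    return {
        "upper": (c.high - top) / c.open * 100,
        "body": (top - bottom) / c.open * 100,
        "lower": (bottom - c.low) / c.open * 100,
    }


lengths = line_lengths(Candle(open=100.0, high=104.0, low=97.0, close=102.0))
print(lengths)  # upper and body are about 2 percent, lower about 3 percent
```

All three lengths are non-negative by construction, since high is never below the body and low is never above it.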

Candlestick line modeling


The membership functions of the four fuzzy sets are shown in a figure (not reproduced here)
The range is set to (0, 14) because of the daily price limit on the Taiwan stock market
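Since the membership-function figure is not available here, the following is only a sketch of how such fuzzy sets could be coded. The trapezoidal shape and every breakpoint in `LENGTH_SETS` are assumptions chosen for illustration; only the four labels and the (0, 14) range come from the slide.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical breakpoints over the range (0, 14); not the paper's values.
LENGTH_SETS = {
    "EQUAL": (0.0, 0.0, 0.0, 0.5),
    "SHORT": (0.0, 0.5, 1.5, 2.5),
    "MIDDLE": (1.5, 2.5, 3.5, 5.0),
    "LONG": (3.5, 5.0, 14.0, 14.0),
}


def fuzzify_length(length):
    """Return the membership degree of a length in each of the four fuzzy sets."""
    return {name: trapezoid(length, *params) for name, params in LENGTH_SETS.items()}


def best_label(length):
    """Return the label with the highest membership (first one wins on ties)."""
    degrees = fuzzify_length(length)
    return max(degrees, key=degrees.get)


print(best_label(3.0))  # MIDDLE under the assumed breakpoints
```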

Candlestick line modeling


The body color is defined by three terms: BLACK, WHITE and CROSS
If open - close > 0 then body color is BLACK
If open - close < 0 then body color is WHITE
If open - close = 0 then body color is CROSS
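The body color rule maps directly to a small function. The name `body_color` is illustrative; the three branches mirror the three conditions on this slide.

```python
def body_color(open_price: float, close_price: float) -> str:
    """Classify the candlestick body by the sign of open - close."""
    diff = open_price - close_price
    if diff > 0:
        return "BLACK"   # closed below the open
    if diff < 0:
        return "WHITE"   # closed above the open
    return "CROSS"       # open equals close


print(body_color(100.0, 102.0))  # WHITE
```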

Candlestick line modeling


The open/close style is another important feature
Five linguistic variables, LOW, EQUAL_LOW, EQUAL, EQUAL_HIGH and HIGH, indicate the fuzzy sets of open/close style

Trend modeling
Two linguistic variables are used to model the trends before and after the candlestick pattern
The previous trend is represented by a weekly candlestick line
Six fuzzy sets are used to define the trend
CROSS, EQUAL, WEAK, NORMAL, STRONG, and EXTREME
BEARISH and BULLISH define the body color

Trend modeling
The following trend is derived from the variation of the close price
$$\frac{Close_{t+n} - Close_t}{Close_t} \times 100$$
$Close_{t+n}$ and $Close_t$ are the close prices at day $t+n$ and day $t$ respectively
$n$ is a user-defined parameter
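As a minimal sketch, the following-trend percentage can be computed from a list of daily close prices. The function name and the choice to return `None` when day t+n lies beyond the data are assumptions made for this example.

```python
def following_trend(closes, t, n):
    """Percentage change of the close price from day t to day t+n.

    Returns None when day t+n is outside the series (an assumption of this sketch).
    """
    if t + n >= len(closes):
        return None
    return (closes[t + n] - closes[t]) / closes[t] * 100


closes = [100.0, 101.0, 103.0, 102.0, 105.0, 110.0]
print(following_trend(closes, 0, 5))  # about 10 percent over five days
```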


Three major pattern recognition problems


Sensing problem
Measured values are open, close, high, low

Feature extraction problem


Fuzzy candlestick patterns

Pattern classification problem


Can be determined by user

Forecast procedure
Step 1
Calculate the variation percentage between two close prices.
Use the minimum increase $I_{min}$ and maximum increase $I_{max}$ to define the universe of discourse
$UoD = [I_{min} - D_1, I_{max} + D_2]$, where $D_1$ and $D_2$ are positive margins
E.g. $I_{min} = -5.83$, $I_{max} = 7.66$, then $UoD = [-6, 8]$ (so $D_1 = 0.17$, $D_2 = 0.34$)

Step 2
Partition the UoD into several intervals
E.g. partition [-6, 8] into seven intervals [-6, -4], [-4, -2], ..., [6, 8]
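Steps 1 and 2 can be sketched as follows. Rounding the bounds outward to whole numbers is an assumption about how the margins are chosen; it happens to reproduce the example on this slide. Function names are illustrative.

```python
import math


def universe_of_discourse(variations):
    """Round the min and max variation outward to integers (assumed choice of margins)."""
    return math.floor(min(variations)), math.ceil(max(variations))


def partition(lo, hi, k):
    """Split [lo, hi] into k equal-width intervals, returned as (left, right) pairs."""
    width = (hi - lo) / k
    return [(lo + i * width, lo + (i + 1) * width) for i in range(k)]


lo, hi = universe_of_discourse([-5.83, 0.40, 7.66])
print(lo, hi)                 # -6 8
print(partition(lo, hi, 7))   # seven intervals of width 2
```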

Forecast procedure
Step 3
Define fuzzy sets on the UoD associated with the intervals in Step 2

Step 4
Fuzzify the values calculated in Step 1
If $v \in u_x$ and $A_y$ is the fuzzy set whose maximum membership occurs at $u_x$, then $v$ is translated to $A_y$
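A simplified sketch of Step 4, assuming one fuzzy set per interval with $A_i$ peaking on interval $u_i$ (the slide does not spell out the membership values, so this pairing is an assumption). Under that assumption, fuzzifying a value reduces to finding the interval that contains it.

```python
def fuzzify(v, lo, width, k):
    """Map a variation v to the label of the fuzzy set peaking on its interval.

    Assumes k equal-width intervals starting at lo, with A1 on the first interval.
    A value exactly on a boundary is assigned to the upper interval, and values
    outside the universe of discourse are clamped to the first or last set.
    """
    idx = int((v - lo) // width)
    idx = min(max(idx, 0), k - 1)
    return f"A{idx + 1}"


print(fuzzify(-5.83, -6, 2, 7))  # A1
print(fuzzify(7.66, -6, 2, 7))   # A7
```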

Forecast procedure
Step 5
Calculate all the candlestick patterns

Step 6
Refine extracted patterns, identify important attributes

Step 7
Select patterns for forecasting based on the probability $P(A_x \mid P_y)$
The statistic $T = \mathrm{Count}(P_y \cap A_x) / \mathrm{Count}(P_y)$ is used as the threshold to select the patterns
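The statistic T is an empirical conditional frequency, which can be sketched with simple counting. The input format (a list of `(pattern, following_trend)` pairs) and the function names are assumptions made for this example.

```python
from collections import Counter


def pattern_statistics(observations):
    """Compute T = Count(pattern and trend) / Count(pattern) for every observed pair."""
    pattern_counts = Counter(pattern for pattern, _ in observations)
    joint_counts = Counter(observations)
    return {
        (pattern, trend): joint / pattern_counts[pattern]
        for (pattern, trend), joint in joint_counts.items()
    }


def select_patterns(stats, threshold):
    """Keep only the (pattern, trend) pairs whose T exceeds the threshold."""
    return {pair: t for pair, t in stats.items() if t > threshold}


obs = [("P1", "A6"), ("P1", "A6"), ("P1", "A3"), ("P2", "A2")]
stats = pattern_statistics(obs)
print(select_patterns(stats, 0.5))  # keeps (P1, A6) and (P2, A2)
```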

Forecast procedure
Step 8
Forecast the trend as follows
Rule 1: if the test pattern is not found, set the variation v to 0
Rule 2: if the test pattern is found, set the variation v to the arithmetic average of the midpoints of the matched patterns
Forecast = close + close * v

Step 9
Evaluate the forecast
$$MSE = \frac{1}{N} \sum_{i=1}^{N} (Forecast_i - Actual_i)^2$$
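Steps 8 and 9 can be sketched together. One assumption is labeled in the code: since Step 1 computes variations as percentages, the sketch divides v by 100 before applying it to the close price, whereas the slide writes the formula as close + close * v without stating the unit. Function names are illustrative.

```python
def forecast(close, matched_midpoints):
    """Apply Rule 1 or Rule 2 to forecast the next close price.

    matched_midpoints holds the interval midpoints (in percent) of the matched patterns.
    """
    if not matched_midpoints:
        v = 0.0                                              # Rule 1: no pattern found
    else:
        v = sum(matched_midpoints) / len(matched_midpoints)  # Rule 2: average midpoint
    return close + close * v / 100                           # assumes v is a percentage


def mse(forecasts, actuals):
    """Mean squared error between forecast and actual values."""
    n = len(forecasts)
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / n


print(forecast(100.0, []))          # 100.0
print(forecast(100.0, [1.0, 3.0]))  # 102.0
print(mse([102.0, 100.0], [101.0, 100.0]))  # 0.5
```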


Experiments and Applications


The experiments use the TAIEX index from 2004-01-02 to 2005-01-31 and stock 2330 (TSMC) from 1997-10-23 to 2002-12-25

Experiments and Applications


Experiment for TAIEX index

Experiments and Applications


Experiment results for TAIEX

Problems with existing stock prediction tools


Three major problems with these tools
The training process is nontrivial, and the training result cannot be reused for other targets
Prediction results are incomprehensible
Parameters are hard for users to tune

A gap exists between prediction results and investment decisions
Improving prediction vs. making buy/sell decisions

Experiments and Applications


Experiment with 2330 (TSMC)
The focus is to find the buying time of the stock
The rule is: IF T > 0.5 and the following trend is STRONG_INCREASE or EXTREME_INCREASE THEN select the pattern
The 5-day return is 2.9% on average
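The buy-time selection rule is a simple predicate over a pattern's statistic T and its following-trend label. The function and constant names below are illustrative; the threshold and the two labels come from the rule on this slide.

```python
BUY_TRENDS = {"STRONG_INCREASE", "EXTREME_INCREASE"}


def is_buy_pattern(t_stat: float, following_trend: str) -> bool:
    """Select a pattern when T > 0.5 and its following trend is a strong or extreme increase."""
    return t_stat > 0.5 and following_trend in BUY_TRENDS


print(is_buy_pattern(0.62, "STRONG_INCREASE"))  # True
print(is_buy_pattern(0.62, "WEAK_INCREASE"))    # False
```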

Experiments and Applications


Fuzzy modifiers can be implemented to help users tune the parameters
ABOVE, BELOW, PLUS, VERY, EXTREMELY, MORE_OR_LESS, SOMEWHAT, and NOT
E.g. STRONG_BEARISH and EXTREME_BEARISH can be merged as ABOVE STRONG_BEARISH
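The slide lists the modifiers without defining them. As a rough sketch, three of them have widely used textbook definitions that act on a membership degree: VERY squares it, MORE_OR_LESS takes its square root, and NOT takes its complement. Whether the paper uses exactly these forms is an assumption, and the remaining modifiers are omitted because their definitions are not given here.

```python
import math


def very(mu: float) -> float:
    """Concentration: makes a fuzzy set stricter (assumed classic definition)."""
    return mu ** 2


def more_or_less(mu: float) -> float:
    """Dilation: makes a fuzzy set looser (assumed classic definition)."""
    return math.sqrt(mu)


def fuzzy_not(mu: float) -> float:
    """Complement of a membership degree."""
    return 1.0 - mu


mu = 0.25
print(very(mu), more_or_less(mu), fuzzy_not(mu))  # 0.0625 0.5 0.75
```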


Conclusion and Discussion


Pros
Knowledge-based method to represent the financial time series and to facilitate knowledge discovery
Comprehensible, computable and visual
Can be used directly or as data preprocessing

Cons
Time complexity
How many candlestick lines to use for a pattern

Thanks for listening

Q&A
