
**TIME SERIES PREDICTION BASED ON ENSEMBLE ANFIS**

DE-WANG CHEN¹, JUN-PING ZHANG²

¹ School of Electronics and Information Engineering, Beijing Jiaotong University, Beijing, 100044, China
² Department of Computer Science and Engineering, Fudan University, Shanghai, 200433, China

E-MAIL: cdw@telecom.njtu.edu.cn, jpzhang@fudan.edu.cn

Abstract:

In this paper, random sampling, bootstrap sampling, and ANFIS (Adaptive Network-based Fuzzy Inference System) are integrated into En-ANFIS (an ensemble ANFIS) to predict chaotic and traffic flow time series. The prediction results of En-ANFIS are compared with those of an ANFIS trained on all the training data and with each ANFIS unit within En-ANFIS. Experimental results show that the prediction accuracy of En-ANFIS is higher than that of any single ANFIS unit, while the number of training samples and the training time of En-ANFIS are smaller than those of the ANFIS using all training data. En-ANFIS is therefore an effective method for achieving both high accuracy and low computational complexity in time series prediction.

Keywords:

Time series prediction; ANFIS; ensemble learning; bootstrap; traffic flow

1. Introduction

Time series prediction is a branch of probability and statistics with many applications, including economic forecasting, weather analysis, and traffic flow prediction [1]. There are many methods for time series prediction, such as linear regression, Kalman filtering [2], neural networks [3], and fuzzy systems [4]. Linear regression is simple but poorly adaptive. Kalman filtering is adaptive but intrinsically linear. A neural network can approximate any nonlinear function, but it demands a great deal of training data and is hard to interpret. By contrast, a fuzzy system is easy to interpret, but its adaptability is relatively low.

ANFIS [5], put forward by Jang in 1993, integrates the advantages of both neural networks and fuzzy systems: it not only has good learning capability but is also easy to interpret. ANFIS has been applied in many areas, such as function approximation, intelligent control, and time series prediction.

As to sample selection, many papers on time series prediction have not given good methods. On the one hand, they simply partition the data into training and testing sets at random, so the training data do not always reflect the real distribution underlying the prediction model, and the effectiveness of the prediction algorithm cannot be assured. On the other hand, when there are too many training data, the training time is long. How to choose a set of training data that reflects the real distribution of the prediction model while decreasing the training time on a huge training set is therefore a very important problem in time series prediction.

In this paper, ensemble learning and ANFIS are integrated into the En-ANFIS time series prediction algorithm; the focus is the influence of the sample selection and weighting methods in ensemble learning on the prediction performance. The principles of ANFIS and ensemble learning are introduced in Section II. The En-ANFIS algorithm and its application to chaotic time series and traffic flow prediction are discussed in Section III. The prediction results of En-ANFIS, the ANFIS using all training data (allANFIS), and the ANFIS units within En-ANFIS are compared in Section IV. The main conclusions and future work are given in Section V.

2. ANFIS and Ensemble learning

ANFIS is a neural network implementation of a T-S (Takagi-Sugeno) fuzzy inference system. ANFIS applies a hybrid learning algorithm that integrates BP (backpropagation) and LSE (least squares estimation), so it learns rapidly. An ensemble is a learning paradigm in which multiple component learners are trained for the same task and their predictions are combined when dealing with future instances [6]. Since an ensemble is often more accurate than its component learners, this paradigm has become a hot topic in recent years and has already been successfully applied to optical character recognition, face recognition, scientific image analysis, medical diagnosis, etc. [7].

In this paper, the output of the ensemble is a weighted average of the outputs of the individual ensemble units.

0-7803-9091-1/05/$20.00 ©2005 IEEE

3552

Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, 18-21 August 2005

When used in classification, the output of the ensemble is often the voting result of all ensemble units. There are many methods for realizing ensemble learning. In this paper, we use bootstrap sampling (sampling with replacement) and random sampling without replacement to construct the subsystems of the ensemble [8].
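The two sampling schemes can be sketched in a few lines of Python (the function names and the toy index list below are ours, not the paper's):

```python
import random

def bootstrap_sample(data, m):
    """Sampling with replacement: m draws, repetitions possible."""
    return [random.choice(data) for _ in range(m)]

def random_sample(data, m):
    """Random sampling without replacement: m distinct items."""
    return random.sample(data, m)

pairs = list(range(500))            # stand-in for 500 training pairs
boot = bootstrap_sample(pairs, 150)
rand = random_sample(pairs, 150)
print(len(set(boot)) <= 150)        # bootstrap may repeat items
print(len(set(rand)) == 150)        # random sample is all distinct
```

Each ANFIS unit is then trained on one such subsample instead of the full training set.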

3. En-ANFIS structure and time series data description

In this paper, the structure of the proposed En-ANFIS

is illustrated in Fig.1.

[Figure 1: block diagram. The input layer takes the training and testing data; the sample layer applies the sampling technique to the training data; ANFIS1'…ANFISn' form the training layer; the trained units ANFIS1…ANFISn form the testing layer and receive the testing data; their outputs are combined into the system output in the output layer.]

Figure 1. The structure of En-ANFIS

In Fig. 1, En-ANFIS is divided into five layers: input layer, sample layer, training layer, testing layer, and output layer. Each ANFISi' is trained using bootstrapped or randomly selected training data; ANFISi is the trained ANFISi'. The testing data are input to every ANFISi at the same time. The output of En-ANFIS is the combined output of all ANFIS units in the testing layer. Two methods are adopted in this paper to calculate the output of En-ANFIS: uniform weighting as in (1), and non-uniform weighting according to the reciprocal of the training error of ANFISi' as in (2) and (3).

Apparently, En-ANFIS is more complicated than a single ANFIS, but because the units can be computed in parallel, the computational complexity does not increase. Furthermore, since each random or bootstrap sample is smaller than the full training set, the computational cost decreases accordingly. Taking multiple bootstrap or random samples helps to approximate the intrinsic distribution of the data. A chaotic time series and a traffic flow time series will be used to validate the effectiveness of the proposed En-ANFIS.

EnANFIS = \frac{1}{n} \sum_{i=1}^{n} ANFIS_i    (1)

EnANFIS = \sum_{i=1}^{n} k_i \cdot ANFIS_i    (2)

k_i = \frac{1/trnRMSE_i'}{\sum_{i=1}^{n} (1/trnRMSE_i')}    (3)
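The two combination rules, uniform weighting as in (1) and weighting by the reciprocal of each unit's training RMSE as in (2) and (3), can be sketched as follows (the unit outputs and RMSE values below are made-up illustrations):

```python
def uniform_combine(unit_outputs):
    # Eq. (1): simple average of the n ANFIS unit outputs
    n = len(unit_outputs)
    return sum(unit_outputs) / n

def weighted_combine(unit_outputs, trn_rmse):
    # Eq. (3): weight k_i proportional to 1/trnRMSE_i', normalized to sum to 1
    inv = [1.0 / e for e in trn_rmse]
    total = sum(inv)
    k = [v / total for v in inv]
    # Eq. (2): weighted sum of the unit outputs
    return sum(ki * yi for ki, yi in zip(k, unit_outputs))

outputs = [0.95, 1.00, 1.05]   # hypothetical unit predictions
rmse = [0.004, 0.003, 0.005]   # hypothetical training RMSEs
print(uniform_combine(outputs))          # simple average, about 1.0
print(weighted_combine(outputs, rmse))   # pulled toward the low-RMSE unit
```

The normalization in (3) makes the weights sum to one, so both rules are convex combinations of the unit outputs.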

The chaotic series used is the Mackey-Glass time series [9], whose delay differential equation is defined as:

\dot{x}(t) = \frac{0.2\, x(t-\tau)}{1 + x^{10}(t-\tau)} - 0.1\, x(t)    (4)

When x(0) = 1.2 and \tau = 17, this yields a non-periodic, non-convergent time series, illustrated in Fig. 2, which is very sensitive to initial conditions. (We assume x(t) = 0 when t < 0.)

[Figure 2 plots x(t) over t = 0…1200 s; the values range roughly from 0.2 to 1.6.]

Figure 2. The Mackey-Glass chaotic time series
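Equation (4) can be integrated numerically; the sketch below uses a crude forward-Euler step of 1 (the paper does not state its integration scheme), with x(0) = 1.2, τ = 17, and x(t) = 0 for t < 0:

```python
def mackey_glass(n, tau=17, x0=1.2, dt=1.0):
    """Forward-Euler integration of Eq. (4); x(t)=0 for t<0, x(0)=1.2."""
    x = [x0]
    for t in range(n - 1):
        x_tau = x[t - tau] if t - tau >= 0 else 0.0   # delayed term x(t-tau)
        dx = 0.2 * x_tau / (1.0 + x_tau ** 10) - 0.1 * x[t]
        x.append(x[t] + dt * dx)
    return x

series = mackey_glass(1200)
print(min(series) > 0, max(series) < 2)   # the series stays bounded in (0, 2)
```

A smaller step (or a Runge-Kutta scheme) would reproduce the standard benchmark series more faithfully; the sketch only illustrates the delayed feedback structure of (4).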

The other time series we used is traffic flow data collected on the Beijing urban expressway (3rd ring road). Details of the data collection and transmission are given in [10]. The one-week traffic flow data are illustrated in Fig. 3.


[Figure 3 plots traffic flow (veh/h) against time (2-min intervals) over one week; the flow ranges from 0 to about 2000 veh/h.]

Figure 3. Traffic flow data on the Beijing 3rd ring road

4. Comparison and analysis of different prediction algorithms

4.1. Training data and test data

Similar to [5], we predict x(t+6) from four past values of the chaotic time series: x(t-18), x(t-12), x(t-6), and x(t). Therefore the format of the training data is [x(t-18), x(t-12), x(t-6), x(t); x(t+6)].

From t = 118 to 1117, we collect 1000 data pairs of the above format. For allANFIS, the first 500 are used for training and the rest for testing. Each ANFIS unit uses only 30% of the training data, that is, 150 data pairs. The pairs drawn by random sampling are all distinct, whereas those drawn by bootstrap contain some repetitions. Fig. 4 shows the training data of ANFIS10' after bootstrapping, consisting of 131 distinct training pairs and 19 repetitions.
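Constructing such input-output pairs from a raw series can be sketched as follows (a generic helper, not from the paper; the linear stand-in series is only for illustration):

```python
def make_pairs(x, lags=(18, 12, 6, 0), horizon=6):
    """Build [x(t-18), x(t-12), x(t-6), x(t); x(t+6)] pairs from a series."""
    pairs = []
    for t in range(max(lags), len(x) - horizon):
        inputs = [x[t - d] for d in lags]   # four past values
        target = x[t + horizon]             # value to predict
        pairs.append((inputs, target))
    return pairs

x = [0.1 * t for t in range(1130)]          # stand-in series
pairs = make_pairs(x)
train, test = pairs[:500], pairs[500:1000]  # first 500 train, next 500 test
```

Each ANFIS unit would then receive a 150-pair bootstrap or random subsample of `train`.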

[Figure 4 plots the selected training-pair indices (0–500) against the bootstrap draw number (0–150).]

Figure 4. New training data for ANFIS10 after bootstrapping

In traffic flow prediction, after input selection we use y(t), y(t-1), y(t-2), and y(t-4) to predict y(t+1). Therefore the format of the training data is [y(t-4), y(t-2), y(t-1), y(t); y(t+1)]. We obtained 5030 sets of traffic data in total; 70% of them (3521) are used for training and the rest (1509) for testing. We use the same setting for bootstrap and random sampling: each ANFIS unit uses only 30% of all the training data.

To ensure a fair comparison, En-ANFIS has 10 ANFIS units, and each unit adopts the same parameter set: 55 nodes, 80 linear parameters, 24 nonlinear parameters, 16 fuzzy rules, and 10 training epochs. The only difference among the ANFIS units is their training data. The comparative performance indices (PI) include RMSE (root mean squared error), APE(%) (average percentage error), TT(s) (training time in seconds), and NTD (number of training data). APE is defined as:

APE = \frac{1}{n} \sum_{i=1}^{n} \frac{|y(i) - \hat{y}(i)|}{y(i)} \times 100\%    (5)
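The two accuracy indices, RMSE and the APE of (5), can be sketched as follows (the sample values are made up):

```python
def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n) ** 0.5

def ape(y_true, y_pred):
    """Eq. (5): mean absolute error relative to the true value, in percent."""
    n = len(y_true)
    return sum(abs(a - b) / a for a, b in zip(y_true, y_pred)) * 100.0 / n

y = [100.0, 200.0, 400.0]   # made-up true values
p = [110.0, 190.0, 400.0]   # made-up predictions
print(rmse(y, p))           # sqrt((100 + 100 + 0) / 3)
print(ape(y, p))            # about 5 percent
```

Note that APE divides by the true value y(i), so it is undefined when a true value is zero; the traffic counts here are strictly positive.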

Different sampling and weighting methods yield different variants of En-ANFIS, as shown in Table 1.

Table 1. The different kinds of En-ANFIS

                    Uniform weighting    Non-uniform weighting
Random sample       En-ANFIS1            En-ANFIS2
Bootstrap sample    En-ANFIS3            En-ANFIS4

4.2. Chaotic time series prediction

Table 2. The PI comparison of different algorithms

                         RMSE     APE    TT     NTD
Random     ANFISmin      0.0031   0.23   1.82   150
sampling   ANFISmax      0.0040   0.30   2.32   150
           ANFISmean     0.0034   0.25   2.05   150
Bootstrap  ANFISmin      0.0034   0.24   1.72   150 (124)
sampling   ANFISmax      0.0048   0.30   2.64   150 (135)
           ANFISmean     0.0039   0.28   2.09   150 (129)
En-ANFIS1                0.0027   0.21   2.32   150
En-ANFIS2                0.0029   0.22   2.32   150
En-ANFIS3                0.0027   0.21   2.64   150 (135)
En-ANFIS4                0.0029   0.22   2.64   150 (135)
allANFIS                 0.0025   0.19   6.38   500

The main PI of En-ANFIS, allANFIS, and the ANFIS units are shown in Table 2. If the sampling time is omitted and the ANFIS units are assumed to work in parallel, the TT of En-ANFIS can be taken as the maximum TT of all ANFIS units, and the NTD of En-ANFIS as the maximum NTD of all ANFIS units. The number in brackets in Table 2 is the number of distinct training pairs, as bootstrap sampling yields some repetitions.

From Table 2, we can see that En-ANFIS is always better than any single ANFIS unit, whatever the sampling technique or weighting method. Fig. 5 shows the output errors of ANFIS1, ANFIS2, and En-ANFIS under bootstrap sampling, from which we can see that the error scatterplot of En-ANFIS is contained within those of the two ANFIS units. Compared with allANFIS, the PI of En-ANFIS is almost the same, only slightly worse; however, the TT and NTD of En-ANFIS decrease markedly. The PI of En-ANFIS under different sampling techniques and weighting methods do not differ appreciably, although uniform weighting seems a little better. Fig. 6 shows the output error scatterplots of En-ANFIS and allANFIS under bootstrap sampling.

[Figure 5 plots the output errors over the 500 test data for ANFIS1, ANFIS2, and enANFIS3 under bootstrap sampling; the error axis spans -0.025 to 0.025.]

Figure 5. Comparison of the output error scatterplots of ANFIS1, ANFIS2, and En-ANFIS

[Figure 6 plots the output errors over the 500 test data for enANFIS3, enANFIS4, and allANFIS under bootstrap sampling; the error axis spans -0.015 to 0.015.]

Figure 6. The output error scatterplots of En-ANFIS and allANFIS

4.3. Traffic flow prediction

Following the same analysis as for chaotic time series prediction, the main PI of En-ANFIS, allANFIS, and the ANFIS units in traffic flow prediction are shown in Table 3.

Table 3. The PI comparison

                         RMSE     APE     TT      NTD
Random     ANFISmin      134.41   15.20   11.30   1056
sampling   ANFISmax      152.45   16.43   13.07   1056
           ANFISmean     138.03   15.56   11.68   1056
Bootstrap  ANFISmin      133.84   14.96   12.78   1056 (903)
sampling   ANFISmax      150.15   16.47   17.99   1056 (922)
           ANFISmean     142.41   15.70   15.48   1056 (915)
En-ANFIS1                128.83   14.93   13.07   1056
En-ANFIS2                128.67   14.93   13.07   1056
En-ANFIS3                127.56   14.73   17.99   1056 (922)
En-ANFIS4                127.87   14.77   17.99   1056 (922)
allANFIS                 126.97   14.72   40.02   3521

From Table 3, we obtain results similar to those for chaotic time series prediction. En-ANFIS is always better than any single ANFIS unit in prediction accuracy. The prediction accuracy of En-ANFIS is similar to that of allANFIS, yet En-ANFIS uses much less time and training data. The sampling method and weighting manner have little influence on the prediction accuracy of En-ANFIS; in this case, bootstrap sampling is slightly better than random sampling, while the weighting manner has almost no influence at all.

[Figure 7 plots the output errors over the 1509 test data for ANFIS1, ANFIS2, and enANFIS1 under random sampling; the error axis spans -1000 to 800.]

Figure 7. The output errors of two ANFIS units and En-ANFIS


[Figure 8 plots the output errors over the 1509 test data for enANFIS1, enANFIS2, and allANFIS under random sampling; the error axis spans -500 to 500.]

Figure 8. The output errors of two En-ANFIS variants and allANFIS

Fig. 7 shows the error difference between two ANFIS units and En-ANFIS under random sampling; the error curve of En-ANFIS is again contained within those of the ANFIS units. Fig. 8 shows the error curves of two kinds of En-ANFIS and allANFIS under random sampling; the three curves overlap each other, so it is hard to determine which method is best.

5. Conclusions

From the above analysis and comparisons on chaotic time series and traffic flow prediction, we find that En-ANFIS achieves much better performance than any single ANFIS unit. In other words, an ensemble of multiple weak ANFIS units can reach high performance, so it is possible to use less training data and less training time to achieve a good result. We also discussed how the sampling method and weighting manner influence the PI of En-ANFIS; according to the experimental results, there is no apparent difference among the various sampling techniques and weighting methods.

The method put forward in this paper can be used not only in ANFIS ensembles but also in ensembles of other systems, such as ensemble neural networks and ensemble fuzzy systems. Furthermore, we believe this method is effective not only for prediction problems but also in other domains, such as classification and pattern recognition.

In this paper, we have presented only preliminary work on integrating ensemble learning with ANFIS for time series prediction. Much future work remains, such as new sampling techniques, the convergence of the algorithm, the number of ANFIS units, the weighting method, and the proportion of the training data used by each ANFIS unit.

Acknowledgements

This work is supported by an open research foundation from the Shanghai Key Laboratory of Intelligent Information Processing, under grant IIPL-04-014.

References

[1] R. H. Shumway and D. S. Stoffer, Time Series Analysis and Its Applications. New York: Springer-Verlag, 2000.

[2] Jie Ma and Jian-fu Teng, "Predict chaotic time-series using unscented Kalman filter", Proceedings of the Third International Conference on Machine Learning and Cybernetics, Shanghai, China, pp. 867-890, 26-29 August 2004.

[3] R. S. Crowder, "Predicting the Mackey-Glass time series with cascade correlation learning", in Proc. 1990 Connectionist Models Summer School, D. Touretzky et al., Eds., Carnegie Mellon Univ., pp. 117-123, 1990.

[4] A. Kandel, Fuzzy Expert Systems. Boca Raton, FL: CRC Press, 1992.

[5] Jyh-Shing Roger Jang, "ANFIS: adaptive-network-based fuzzy inference system", IEEE Trans. on Systems, Man, and Cybernetics, Vol. 23, No. 3, pp. 665-685, 1993.

[6] Thomas G. Dietterich, "Machine learning research: four current directions", AI Magazine, 18(4), pp. 97-136, 1997.

[7] Z.-H. Zhou, J. Wu, and W. Tang, "Ensembling neural networks: many could be better than all", Artificial Intelligence, 137(1-2), pp. 239-263, 2002.

[8] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Springer-Verlag, 2001.

[9] M. C. Mackey and L. Glass, "Oscillation and chaos in physiological control systems", Science, Vol. 197, pp. 287-289, July 1977.

[10] Dewang Chen, Junping Zhang, Shuming Tang, and Jue Wang, "Freeway Traffic Stream Modeling Based on Principal Curves and Its Analysis", IEEE Transactions on Intelligent Transportation Systems, Vol. 5, No. 4, pp. 246-258, Oct. 2004.
