
Forecasting the Runoff Using Least Square Support Vector Machine

FENG Lijun1,2  LI Shuquan1
1 Tianjin University of Finance and Economics, P.R. China, 300222
2 College of Urban and Rural Construction, Agricultural University of Hebei, P.R. China, 071001
flj69@126.com

Abstract: Forecasting medium- and long-term runoff is a difficult research problem in the natural science domain, and forecasting the runoff accurately is an important foundation for preventing floods, reducing natural disasters and optimizing the management of water resources. For this problem, we propose a new forecast method based on the least square support vector machine (LS-SVM) to forecast medium- and long-term runoff, and we compare its forecast results with those of an artificial neural network (back propagation, BP). The experiments prove that the LS-SVM method has the advantages of lower simulation error and higher forecast precision compared with the BP artificial neural network.
Keywords: Runoff, Least Square Support Vector Machine, Forecast, BP

1 Introduction
Forecasting medium- and long-term runoff is a difficult research problem in the natural science domain. Its difficulty lies in that the hydrology situation is affected by various uncertain factors, such as climate, weather, human activity and geographical environment change [1]. But forecasting the runoff accurately is an important foundation for preventing floods, reducing natural disasters and optimizing the management of water resources, so improving the precision of medium- and long-term runoff forecasting has always been a research focus at home and abroad. At present the methods to forecast runoff are numerous. According to the technical route, we can divide these methods into two categories. One is to construct the forecast model based on the change law of the runoff itself, such as the time series model, artificial neural network model, wavelet model, pattern recognition model and so on. The other is to construct the forecast model based on the relation between runoff and its influence factors, such as the multiple linear regression model, ANN model, fuzzy model and so on. These models generally forecast runoff successfully. But the important foundation of the parameter learning methods of these forecast models is statistics, and traditional statistical theory is based on the large-sample situation. In fact it is difficult to get a large data sample when forecasting runoff, so it is very necessary to study statistical theory based on small data samples. In the 1970s, Vapnik began to study statistical learning theory based on small samples, and he proposed the learning method of the support vector machine in 1995. This method is new and different from the learning method of the ANN. At present this method is used in a lot of research fields, and the research results indicate that it is feasible and superior. But its application in the fields of hydrology and water resources has just started.
Some documents indicate that authors at home have studied flood prediction, groundwater prediction, etc. using the support vector machine method [2]. Recently, overseas scholars have made great progress in the research of statistical learning theory. This method has many merits in setting up models with small samples, in knowledge acquirement, and the like. In this paper the authors adopt the least square support vector machine (LS-SVM) method and forecast the runoff using it. Our main purpose lies in briefly introducing the thinking, characteristics and key points of the LS-SVM and offering a new approach for forecasting the runoff.

2 Theory of support vector machine


Project supported by the Tianjin Teaching Committee (No. TJGL06-099)


Support Vector Machine (SVM) [3] is a new kind of machine learning algorithm developed on the basis of statistical learning theory. Built on the principle of structural risk minimization, this algorithm can solve the problem of overfitting effectively, and it has good generalization capability and better classification accuracy. It is becoming a new study focus of the machine learning field after pattern recognition and neural networks.

2.1 Vapnik's SVM theory
The theory of SVM was originally put forward by Boser, Guyon and Vapnik [4] at the Computational Learning Theory conference held in 1992. SVM was originally used to find the optimal separating hyperplane of a linear classification problem. The so-called optimal separating hyperplane not only separates the data correctly but also maximizes the margin. Therefore, for known observation samples (x_1, y_1), (x_2, y_2), ..., (x_n, y_n) of two separate classes, y_i = ±1, we can construct the optimal separating hyperplane to classify the samples. The problem of constructing the optimal separating hyperplane can be turned into the following optimization problem:

  min Φ(w) = (1/2)‖w‖²                                                  (1)
  s.t. y_i (w · x_i + b) ≥ 1, i = 1, 2, ..., n

The problem above can be transformed into the following dual problem by using the Lagrange optimization method:

  min W(α) = (1/2) Σ_{i=1..n} Σ_{j=1..n} y_i y_j α_i α_j (x_i · x_j) − Σ_{i=1..n} α_i    (2)
  s.t. Σ_{i=1..n} α_i y_i = 0,  α_i ≥ 0, i = 1, 2, ..., n

in which α_i denotes the Lagrange multipliers. Solving Equation (2) with its constraints determines the Lagrange multipliers, and the optimal separating hyperplane is given by the following equation,

  f(x) = sgn{(w* · x) + b*} = sgn{Σ_{i=1..n} α_i y_i (x_i · x) + b*}    (3)

in which sgn(·) denotes the sign function. So far the discussion has been restricted to the case where the training sample is linearly separable. However, in general this will not be the case. In the case of misclassification, alternatively a more complex function can be used to describe the boundary. To enable the optimal separating hyperplane method to be generalized, Cortes and Vapnik (1995) introduced non-negative slack variables ξ_i ≥ 0, where ξ_i is a measure of the misclassification error. The optimization problem is now posed so as to minimize the classification error as well as minimizing the bound on the VC dimension of the classifier. The generalized optimal separating hyperplane is determined by the vector w that minimizes the functional

  min Φ(w, ξ) = (1/2)‖w‖² + C Σ_{i=1..n} ξ_i                            (4)

where C is a given value. The generalized optimal separating hyperplane is solved in nearly the same way as the linearly separable problem; only the constraints of Equation (2) turn into 0 ≤ α_i ≤ C, i = 1, 2, ..., l. For a non-linear problem, we can transform it into a problem in a high dimensional feature space by the use of reproducing kernels. The idea of the kernel function is to enable operations to be performed in the input space rather than the potentially high dimensional feature space; hence the inner product does not need to be evaluated in the feature space. This provides a way of addressing the curse of dimensionality. So the optimal separating hyperplane is transformed to:

  f(x) = sgn{Σ_i y_i α_i K(x_i, x) + b}                                 (5)

where K(x_i, x_j) = φ(x_i) · φ(x_j). Hence, if we select different kernel functions, we acquire different support vector machines.

2.2 LS-SVM theory
Suykens [5] put the Least Squares Support Vector Machine (LS-SVM) forward in 1999. For known observation samples (x_1, y_1), (x_2, y_2), ..., (x_n, y_n), y_i ∈ R, we can construct the following optimization problem:

  min_{w,b,e} J(w, e) = (1/2) wᵀw + (γ/2) Σ_{i=1..n} e_i²               (6)
  s.t. y_i = wᵀ φ(x_i) + b + e_i, i = 1, 2, ..., n                      (7)

where γ is the regularization parameter, determining the trade-off between fitting error minimization and smoothness. The solution is obtained after constructing the Lagrangian,

  L(w, b, e, α) = J(w, e) − Σ_{i=1..n} α_i {wᵀ φ(x_i) + b + e_i − y_i}

where the α_i are Lagrange multipliers. Application of the conditions for optimality yields the following linear system (8):

  [ 0    1ᵀ        ] [ b ]   [ 0 ]
  [ 1    Ω + γ⁻¹I  ] [ α ] = [ y ]                                      (8)

where 1 = [1 1 ... 1]ᵀ, α = [α_1 α_2 ... α_n]ᵀ, y = [y_1 y_2 ... y_n]ᵀ, and Mercer's condition is applied in the matrix Ω:

  Ω_ij = K(x_i, x_j) = φ(x_i) · φ(x_j)                                  (9)

The resulting LS-SVM model for function estimation becomes function (10),

  y(x) = Σ_{i=1..n} α_i K(x, x_i) + b                                   (10)

where α_i and b comprise the solution to the linear system. In Equation (10), K(x, x_i) is the so-called kernel function, with which the input vector is mapped implicitly into a high-dimension feature space. The most usual kernel functions are polynomial, Gaussian-like or some particular sigmoids.
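The linear system (8) and model (10) above translate directly into a few lines of code. The following sketch uses Python with NumPy rather than the Matlab toolbox used in the experiments below, and the function names are our own; it builds the kernel matrix Ω with a Gaussian kernel, solves the block system for b and α, and evaluates Equation (10):

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Gaussian (RBF) kernel matrix: K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma, sigma):
    # Solve the linear system of Equation (8):
    #   [ 0   1^T              ] [ b     ]   [ 0 ]
    #   [ 1   Omega + I/gamma  ] [ alpha ] = [ y ]
    n = len(y)
    Omega = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = Omega + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # b, alpha

def lssvm_predict(X_new, X_train, alpha, b, sigma):
    # Equation (10): y(x) = sum_i alpha_i K(x, x_i) + b
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

With a large γ the model nearly interpolates the training data; a smaller γ trades fitting error for smoothness, as noted under Equation (6).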

3 Forecasting process
In order to forecast the runoff, we constructed the following forecasting process, and we may carry on the work according to it. The process, shown in Figure 1, consists of four steps: constructing the forecast index system, getting the sample, preprocessing the data, and forecasting the runoff.

Figure 1. The process of runoff forecasting: constructing the forecast index system → getting the sample → preprocessing the data → forecasting the runoff

In Figure 1, first, we need to construct the forecast index system. In other words, we must know what factors we should take into consideration in the process of forecasting runoff. For different regions, the influence factors selected are dissimilar, so we must select the influence factors scientifically and reasonably according to the region's location when carrying out the runoff forecasting work. Whether we select the influence factors scientifically and reasonably affects the soundness and accuracy of the runoff forecast.
Second, we need to get the LS-SVM training sample. According to the constructed forecast index system, we collect the historical hydrology data of the region so that we can acquire the observation sample of the LS-SVM, (x_1, y_1), (x_2, y_2), ..., (x_n, y_n). In this sample, x_i (i = 1, 2, 3, ..., n) is a vector in which each element is a forecast index, and y_i denotes the forecast result; different values of y_i reflect different values of runoff.
Third, we need to preprocess the LS-SVM training sample, including normalizing the data, and so on. By means of preprocessing the sample data, we may optimize the LS-SVM's capability of learning and decision-making.
Fourth, we forecast the runoff by the LS-SVM model which we have got. By training on the observation sample, we can get the LS-SVM decision-making model y(x) = Σ_i α_i K(x, x_i) + b. Because these parameters decide the LS-SVM's learning and forecast capability to a great degree, the selection of the regularization parameter and the kernel function parameter is very important when we train the LS-SVM. After we get the LS-SVM model, we can input the preprocessed data into it; according to the output result of the model we can get the value of runoff.

4 Experiments
According to the requirements of the research task, we collected the hydrology data of the recent 25 years from a hydrology station. We divided the data into two groups, taking the previous 21 years' data as training data and the remaining 4 years' data as test data. In these data, y_i denotes the annual

runoff value and x_i denotes the forecast indexes, i = 1, 2, 3, 4: x_1 denotes the total rainfall from December to November of last year; x_2 denotes the total rainfall of January of this year; x_3 denotes the total rainfall of February of this year; x_4 denotes the total rainfall of March of this year. Then we analyzed the data using the LS-SVM Matlab toolbox. For the kernel function, we compared the polynomial kernel, radial basis kernel, sigmoid kernel and so on, and at last we found the radial basis kernel fits the problem of runoff forecasting best. For the kernel parameters, we found the forecast result is good when they are set to 0.01 and 0.05. In order to carry out a method comparison, we analyzed the same data using the BP algorithm again. See Table 1; in Table 1, samples No. 22 to 25 (marked *) are test samples and the rest are training samples.

Table 1. The runoff data of a hydrology station and the comparison of forecast results using different methods

No.   Real value (m³s⁻¹)   LS-SVM forecast (m³s⁻¹)   LS-SVM error (%)   BP forecast (m³s⁻¹)   BP error (%)
1     22.9                 23.667                    3.349              29.438                28.550
2     23.4                 23.012                    1.658              26.290                12.352
3     36.8                 32.165                    12.595             31.327                15.144
4     22.0                 21.301                    3.177              19.162                12.900
22*   39.9                 36.552                    8.391              31.066                22.140
23*   24.6                 23.040                    6.341              30.840                25.367
24*   20.4                 22.796                    11.745             17.877                12.365
25*   30.3                 33.124                    9.320              30.982                2.251
* Test samples.

From Table 1 we draw the conclusion that the LS-SVM method has an obvious superiority compared with the artificial neural network method. Besides, when using the artificial neural network method we are prone to get into local optima, and we need to determine the network structure and its parameters; generally, choosing the parameters of an artificial neural network is more difficult than for the LS-SVM. Considering computation speed, the LS-SVM method is also quicker than the artificial neural network. So, the LS-SVM method has more accurate forecast ability and a better application prospect in the hydrology and water resource domain.
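As a concrete illustration of the preprocessing step of Section 3 and of how the error columns of Table 1 are obtained, the sketch below min-max normalizes the forecast indexes and computes the relative forecast error in percent. This is Python rather than the Matlab toolbox used in the experiment, and the function names are our own:

```python
import numpy as np

def minmax_normalize(X, lo=None, hi=None):
    # Scale each forecast index (column) to [0, 1]. Test-period data should be
    # scaled with the training-period min/max, not with its own statistics.
    lo = X.min(axis=0) if lo is None else lo
    hi = X.max(axis=0) if hi is None else hi
    return (X - lo) / (hi - lo), lo, hi

def forecast_error_pct(real, forecast):
    # Relative forecast error in percent, as in the error columns of Table 1.
    return abs(forecast - real) / real * 100.0

# First row of Table 1: real runoff 22.9, LS-SVM forecast 23.667, BP forecast 29.438
print(f"{forecast_error_pct(22.9, 23.667):.3f}")  # 3.349
print(f"{forecast_error_pct(22.9, 29.438):.3f}")  # 28.550
```

The two printed values reproduce the first-row error entries of Table 1.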

5 Conclusions
We use the LS-SVM method to deal with the runoff forecasting problem in this paper. This method can take full advantage of the distribution characteristics of the data, and we do not need prior knowledge or special skill when constructing the decision-making function. This provides a new approach for runoff forecasting. In future work we will further study how to choose the kernel function and its parameters, and thus make runoff forecasting more convenient and accurate.

References
[1] Wang Wensheng, Cao Xuewei, Lei Danfa. Application of Wavelet Network Model to Forecast Runoff of Geheyan Reservoir. Hubei Water Power, 2005, (4), pp. 10-12 (in Chinese)
[2] Liao Jie, Wang Wensheng, Li Yueqing, Huang Weijun. Support Vector Machine Method and Its Application to Prediction of Runoff. Journal of Sichuan University, 2006, 38(6), pp. 24-28 (in Chinese)
[3] Corinna Cortes, Vladimir Vapnik. Support-vector networks. Machine Learning, 1995, 20(3), pp. 273-297
[4] Vapnik V. Statistical Learning Theory. New York: John Wiley & Sons, 1998
[5] J.A.K. Suykens, J. Vandewalle. Least squares support vector machine classifiers. Neural Processing Letters, 1999, 9(3), pp. 293-300
