

Wind Speed Forecasting Using Fully Recurrent Neural Network in Wind Power Plants

Hong Liangyou, Jiang Dongxiang, Huang Qian, Ding Yongshan
Department of Thermal Engineering, Tsinghua University, Beijing 100084, China

Abstract: Wind speed forecasting is very important to the operation of wind power plants and power systems. Because of the nonlinearity and non-stationarity of wind, forecasting is a difficult task. This paper deals with the problem of short-term wind speed forecasting based on historical time-series meteorological data. A kind of recurrent neural network called the pipelined recurrent neural network (PRNN) is adopted here to make the prediction; it is a NARMAX ANN model (nonlinear autoregressive moving average artificial neural network with external inputs). First, the raw wind speed data are made stationary using a logarithmic difference method. Second, the phase space reconstruction method of chaos theory is adopted to determine the embedding dimension and time delay. Third, based on the selected embedding dimension and time delay, we develop a PRNN+TDL (tapped-delay-line filter) ANN model; several on-line learning and optimization algorithms are used to train the network. Finally, the ANN prediction is amended based on the statistical characteristics of wind speed. The model is tested on data from a wind power plant over a one-year period. The raw data interval is 1 minute, and the prediction horizon ranges from about five minutes to one hour. The results show that forecasting accuracy is effectively improved by the proposed method; the average prediction error is within 10%. In addition, the model parameters are important factors affecting prediction precision: different types of wind data need different parameters.

Keywords: wind speed forecasting, time series, recurrent neural networks, real-time learning

1. Introduction

Wind speed is a statistically non-stationary and nonlinear signal, due to highly complex interactions among various meteorological parameters. Many prediction methods have been suggested in the literature. The simplest is the persistence forecast: the prediction is set equal to the last available measurement. Other methods include ARMAX models, Kalman filters, artificial neural networks (ANNs), fuzzy logic, etc. The architecture of recurrent neural networks (RNNs) enables information to be temporally memorized in the network. RNNs can exhibit a wide range of dynamics, due to feedback, and are also tractable nonlinear maps; their application as predictors in nonlinear dynamical systems is increasing [1]. In 1995, Haykin and Li presented a novel, computationally efficient nonlinear predictor based on the pipelined recurrent neural network (PRNN) [2]. The PRNN consists of a number of small-scale RNNs, but maintains a relatively low computational complexity considering the total number of neurons in its architecture. In addition, the PRNN architecture helps to circumvent the problem of vanishing gradients in several respects [7].
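The persistence baseline mentioned above is simple enough to state in code. The sketch below is ours, not from the paper; it assumes the measurements are held in a plain list with the newest sample last.

```python
def persistence_forecast(history, horizon=1):
    """Persistence forecast: repeat the last observed value for every future step."""
    if not history:
        raise ValueError("need at least one measurement")
    return [history[-1]] * horizon

speeds = [5.2, 5.6, 5.1, 4.9]           # wind speeds in m/s, most recent last
print(persistence_forecast(speeds, 3))  # -> [4.9, 4.9, 4.9]
```

Despite its simplicity, this baseline is the standard yardstick for short horizons, which is why the paper reports gains relative to it.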

2. The Haykin-Li Nonlinear Predictor

The predictor is a combination of two subsections. The nonlinear subsection is the PRNN, consisting of many levels of recurrent signal processing; the linear subsection is represented by a conventional tapped-delay-line (TDL) filter, whose function is to linearly filter the input signal. This combination of nonlinear and linear processing should be able to extract both the nonlinear and the linear relationships contained in the input signal [6]. Of course, the system is reconstructed by the theory of phase space reconstruction before forecasting.

2.1 Nonlinear subsection

The PRNN is a modular neural network and consists of a certain number M of fully connected RNNs as its modules, with each module consisting of N neurons. The RNN modules are connected as shown in Fig. 1. Each module of the PRNN is a fully connected RNN. The (p x 1)-dimensional external signal vector is delayed by i time steps before feeding module i, and all modules operate using the same weight matrix W. In modules 1, ..., M-1, one of the feedback signals is substituted with the output of the first neuron of the following module, whereas module M feeds back its own first-neuron output. The overall output signal of the PRNN is y_{1,1}(k), the output of the first neuron of the first module.

Fig. 1. Pipelined recurrent neural network [7]

Equations (1) give a full description of the PRNN. For neuron n of module i at time k,

  y_{i,n}(k) = Phi(v_{i,n}(k)),   v_{i,n}(k) = sum_l w_{n,l}(k) u_{i,l}(k),   (1)

where the input vector u_i(k) gathers the p delayed external inputs of module i, a unit bias, and the N feedback signals. The overall cost function of the PRNN becomes

  E(k) = sum_{i=1}^{M} lambda^{i-1} e_i^2(k),   (2)

where e_i(k) is the one-step forward prediction error from module i and lambda in (0, 1] is a forgetting factor which determines the weighting of the individual modules.
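As a rough illustration of this architecture, the sketch below implements one forward pass of an M-module PRNN with a shared weight matrix, plus the weighted cost of equation (2). All names, sizes, the error convention and the logistic activation are our assumptions for illustration; this is a sketch of the idea, not the authors' implementation.

```python
import math

def logistic(v):
    """Logistic activation assumed for Phi in eq. (1)."""
    return 1.0 / (1.0 + math.exp(-v))

def prnn_forward(x, W, M, N, p, y_prev):
    """One forward pass of a PRNN at time k (illustrative sketch).

    x      : input series, x[-1] is the newest sample
    W      : shared N x (p + 1 + N) weight matrix (same for every module)
    y_prev : previous outputs y_prev[i][n] of neuron n in module i
    Returns per-module outputs y and one-step prediction errors e.
    """
    y = [[0.0] * N for _ in range(M)]
    e = [0.0] * M
    # process modules from the last (index M-1) to the first (index 0):
    # module i receives the fresh first-neuron output of module i+1
    for i in reversed(range(M)):
        ext = [x[-(i + j + 1)] for j in range(p)]  # p external taps, delayed by i
        fb = list(y_prev[i])                       # delayed own feedback
        if i < M - 1:
            fb[0] = y[i + 1][0]                    # nesting between modules
        u = ext + [1.0] + fb                       # external + bias + feedback
        for n in range(N):
            y[i][n] = logistic(sum(w * s for w, s in zip(W[n], u)))
        e[i] = x[-i - 1] - y[i][0]                 # one-step error of module i
    return y, e

def prnn_cost(e, lam=0.9):
    """Overall cost of eq. (2): sum over modules of lam**i * e_i**2."""
    return sum((lam ** i) * err ** 2 for i, err in enumerate(e))
```

With all-zero weights every neuron outputs logistic(0) = 0.5, which is a convenient sanity check before training.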

2.2 Linear subsection

The linear subsection of the neural-network-based predictor consists of a tapped-delay-line (TDL) filter, which is shown in Fig. 2.

Fig. 2. Tapped-delay-line filter [2]

2.3 Prediction procedure

The procedure is composed of the three following subtasks [2]:

1) Prediction: compute the one-step forward nonlinear prediction errors of the PRNN at time instant k, using equations (1) and (3),

  e_i(k) = s(k - i + 1) - y_{i,1}(k),   (3)

where e_i(k) is the one-step forward prediction error from module i.

2) Weight updating: a learning algorithm uses the suitably chosen overall cost function (2) in order to calculate the weight-matrix correction Delta W(k), which updates the weight matrix W(k+1) = W(k) + Delta W(k).

3) Filtering: using (1), the output of the PRNN is computed; the updated input signal to every module is formed by substituting the external signal input with the updated external signal input. The output of the PRNN is then fed into the LMS filter in order to produce the predicted signal of the nonlinear predictor.

3. Training Algorithms for the Predictor

The original learning algorithm for the PRNN is RTRL. This algorithm has been shown to suffer from some serious drawbacks, such as divergence. This paper adopts a normalized version of the RTRL, achieved via local linearization of the RTRL around the current point in the state space of the network. Such an algorithm provides an adaptive learning rate normalized by the norm of the gradient vector at the output neuron.

3.1 RLS algorithm for the linear subsection

The description of the RLS algorithm is as follows [1]:

  k(n) = lambda^{-1} P(n-1) u(n) / (1 + lambda^{-1} u^T(n) P(n-1) u(n))
  e(n) = d(n) - w^T(n-1) u(n)
  w(n) = w(n-1) + k(n) e(n)
  P(n) = lambda^{-1} P(n-1) - lambda^{-1} k(n) u^T(n) P(n-1)   (4)

where u(n) is the tap-input vector, d(n) the desired response, w(n) the filter coefficient vector, P(n) the inverse correlation matrix, and lambda the forgetting factor.
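A textbook RLS update of this form can be sketched in pure Python as follows; the variable names, the forgetting factor and the initialization P(0) = delta^{-1} I are illustrative choices of ours.

```python
def rls_step(w, P, u, d, lam=0.99):
    """One recursive-least-squares update (eq. 4, sketch).

    w: coefficient vector, P: inverse correlation matrix (list of lists),
    u: tap-input vector, d: desired sample, lam: forgetting factor.
    """
    n = len(u)
    Pu = [sum(P[i][j] * u[j] for j in range(n)) for i in range(n)]   # P @ u
    denom = lam + sum(u[i] * Pu[i] for i in range(n))
    k = [x / denom for x in Pu]                                      # gain vector
    e = d - sum(wi * ui for wi, ui in zip(w, u))                     # a priori error
    w = [wi + ki * e for wi, ki in zip(w, k)]                        # coefficient update
    uTP = [sum(u[i] * P[i][j] for i in range(n)) for j in range(n)]  # u^T P
    P = [[(P[i][j] - k[i] * uTP[j]) / lam for j in range(n)]
         for i in range(n)]                                          # Riccati update
    return w, P, e
```

Started from w = 0 and a large P(0), repeated calls converge to the least-squares coefficients of the TDL filter; with lam < 1 older samples are exponentially forgotten, which suits non-stationary wind data.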

3.2 Normalized RTRL algorithm for the PRNN

Here, we adopt the NRTRL algorithm to train the network. The weights are adapted as follows [4, 5]:

  Delta W(k) = eta(k) e(k) Pi(k),   eta(k) = mu / (C + ||Pi(k)||^2),   (5)

where Pi(k) collects the gradients at the output neuron with respect to the weights from the jth neuron to the ith neuron, calculated by the RTRL algorithm, eta(k) is the adaptive learning rate, and C is an additional positive constant parameter we introduced. For the normalized algorithm, a stability condition on the learning rate must be satisfied [1].

3.3 Initialization of the algorithm

Initialization of the synaptic weight matrix of the PRNN uses the traditional epochwise training method of RNNs [3], with 100 epochs run over 10% of the total data points. In fact, the nonlinear predictors in this paper are a kind of NARMAX network.

4. The Theory of State Space Reconstruction

The concept of low-dimensional chaos has proven to be fruitful in nonlinear time series analysis. According to Packard et al. [8] and Takens [9], it is believed that a nonlinear dynamic system can be reconstructed from a single time series signal through the method of delays (MOD). If the sampling time is dt and the index lag is J, the delay time is tau = J dt, and the method of delays embeds a scalar time series {x(n)} into a d-dimensional space as follows:

  X(n) = [x(n), x(n + tau), ..., x(n + (d - 1) tau)].   (6)

For real data with noise, the optimal values of the delay time and embedding dimension are important for the quality of the reconstruction. Many methods have been suggested for estimating these parameters, such as the autocorrelation function [10] and mutual information [11] for the delay time, and G-P [12] and FNN (false nearest neighbours) [13] for the embedding dimension. Other methods, such as the C-C method [14] and the time-window method [15], determine these parameters together.

5. Prediction Results of Wind Speed

5.1 Some basic information

All the experiments take the same strategy. The function Phi in equation (1) is given by the logistic function

  Phi(v) = 1 / (1 + e^{-v}).   (7)
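The method-of-delays embedding of equation (6) takes only a few lines; in this sketch the function name, d, tau and the toy series are ours for illustration.

```python
def delay_embed(x, d, tau):
    """Method-of-delays embedding (eq. 6): rows [x[n], x[n+tau], ..., x[n+(d-1)tau]]."""
    n_vectors = len(x) - (d - 1) * tau
    return [[x[n + j * tau] for j in range(d)] for n in range(n_vectors)]

series = list(range(10))
print(delay_embed(series, d=3, tau=2)[:2])  # -> [[0, 2, 4], [1, 3, 5]]
```

Each row is one reconstructed state vector; in the experiments below, d also fixes the dimension p of the PRNN's external input vector.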

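The wind signals in the experiments below are preprocessed by differencing and rescaling to [-1, 1], with the transform inverted after prediction. A minimal sketch of such a pipeline follows; the helper names and sample values are ours, and a logarithm could be taken before differencing, as the abstract suggests.

```python
def preprocess(x):
    """First-order difference, then rescale to [-1, 1]; return params for inversion."""
    diff = [b - a for a, b in zip(x, x[1:])]     # first-order difference
    lo, hi = min(diff), max(diff)
    scale = (hi - lo) / 2 or 1.0                 # guard against a constant series
    mid = (hi + lo) / 2
    y = [(v - mid) / scale for v in diff]
    return y, (x[0], mid, scale)

def reconstruct(y, params):
    """Invert the rescaling and the differencing to recover the original series."""
    x0, mid, scale = params
    out = [x0]
    for v in y:
        out.append(out[-1] + v * scale + mid)    # undo scaling, then cumulate
    return out

speeds = [5.0, 5.5, 5.2, 6.0, 5.8]
y, params = preprocess(speeds)
print(reconstruct(y, params))  # recovers the original series up to float rounding
```

Keeping the inversion parameters alongside the transformed series is what allows the predicted differences to be mapped back to absolute wind speeds.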
The wind speed signals are taken from a wind power plant. The time series is sampled at one point per minute over a period of one year. All three original wind speed signals used here consist of 1000 points. The wind signals are preprocessed before being fed into the predictor; this is a necessary step for better performance gain. The prediction is made as follows: first, the first-order difference operator is applied to the signal; then, the signal is rescaled to the range between -1 and 1; finally, after prediction, the original signal is reconstructed and the new signal is obtained.

5.2 Selection of delay time and embedding dimension

Here, 1000 data points are chosen for the determination of the delay time and embedding dimension. The delay time is determined by the first minimum of the average mutual information (Fig. 3). The embedding dimension is chosen by the method of Cao false nearest neighbours (Cao-FNN): both E1(d) and E2(d), the invariants of Cao-FNN, are calculated for determining the minimum embedding dimension of the wind speed signal. The latter is helpful to distinguish deterministic data from random data: for deterministic data, E2(d) is certainly related to d and, as a result, cannot be a constant for all d. We have also tested the sensitivity of the Cao-FNN method to data length; the result does not strongly depend on the length of the data (Fig. 4). The dimension p of the external signal vector of the PRNN is then set to the embedding dimension, namely p = d.

Fig. 3. Average mutual information versus time lag
Fig. 4. E1 and E2 versus dimension (E1-1 and E1-2 represent the E1 values obtained using 1000 and 2000 data points, respectively; the same holds for E2)

5.3 Typical prediction results and discussion

In this section, we illustrate the application of the predictor on coarsely sampled signals. The measure that was used to assess the performance of the predictors was the

forward prediction gain, given by [6]

  Rp = 10 log10(sigma_s^2 / sigma_e^2)  [dB],   (8)

where sigma_s^2 denotes the estimated variance of the speed signal and sigma_e^2 denotes the estimated variance of the forward prediction error signal. There are some places where the relative error becomes very large; the reason is that the original wind velocity there is near zero, so a small absolute prediction error leads to a large relative error. That is also why Rp is chosen as a performance indicator here.

The results of the state space reconstruction are as follows: tau = 2, d = 6 for wind1; tau = 2, d = 8 for wind2; tau = 2, d = 6 for wind3. Table 2 shows the parameters of the PRNN. The results given by the RTRL+LMS algorithm and the NRTRL+NLMS algorithm are shown in Table 1. Compared with persistence forecasting, the PRNN+TDL model with the NRTRL+RLS training algorithm gives an effectively improved result. Figs. 5-7 show typical prediction results. It is clear from the figures that the predictor can trace changes of the signal quickly.

Fig. 5. Prediction results of wind1
Fig. 6. Prediction results of wind2
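The forward prediction gain of equation (8) can be computed directly from a signal and its prediction errors. The sketch below uses population variances and toy values of our own choosing.

```python
import math

def prediction_gain_db(signal, errors):
    """Forward prediction gain of eq. (8): Rp = 10*log10(var(s) / var(e)), in dB."""
    def var(v):
        m = sum(v) / len(v)
        return sum((x - m) ** 2 for x in v) / len(v)
    return 10.0 * math.log10(var(signal) / var(errors))

s = [1.0, -1.0, 1.0, -1.0]              # toy "signal", variance 1.0
e = [0.1, -0.1, 0.1, -0.1]              # toy prediction errors, variance 0.01
print(round(prediction_gain_db(s, e)))  # -> 20
```

Because Rp is a ratio of variances, it stays meaningful even where the wind speed itself is near zero, unlike a relative error.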

Fig. 7. Prediction results of wind3

Table 1. Performance of different predictors: forward prediction gain, in dB, of the persistence forecast and of the RTRL+LMS, NRTRL+NLMS and NRTRL+RLS predictors on wind1, wind2 and wind3; the improvement of the normalized algorithms over the persistence forecast ranges from roughly 20% to 35%

Table 2. Parameters of the PRNN

6. Conclusions

In this paper, the pipelined recurrent neural network (PRNN) is adopted as a predictor for the wind speed signal. Due to the non-stationarity and nonlinearity of wind, prediction is a difficult task. First, the original wind speed signal is reconstructed by the theory of state space reconstruction: the delay time is determined by the first minimum of the average mutual information, and the embedding dimension is chosen by the method of Cao false nearest neighbours (Cao-FNN). The sensitivity of the Cao-FNN method to data length was tested, and the result does not strongly depend on the length of the data; the experiments show that MOD is an effective method for determining the parameters of the PRNN. Second, the paper presents the architecture of the nonlinear predictor, and a normalized algorithm is introduced to train the PRNN. Third, typical prediction results are shown and the performance of the different predictors is compared: NRTRL+RLS is the best training algorithm for this nonlinear predictor, and the improved PRNN+TDL predictor has a great advantage over the persistence forecast.

Acknowledgment

This work was supported by the National Basic Research (973) Program of China (No. 2007CB210304).

References

[1] Danilo P. Mandic and Jonathon A. Chambers, Recurrent Neural Networks for Prediction, John Wiley & Sons, 2001.
[2] S. Haykin and L. Li, "Nonlinear adaptive prediction of nonstationary signals," IEEE Transactions on Signal Processing, vol. 43, no. 2, pp. 526-535, 1995.

[3] Ronald J. Williams and David Zipser, "A learning algorithm for continually running fully recurrent neural networks," Neural Computation, vol. 1, pp. 270-280, 1989.
[4] Ronald J. Williams and Jing Peng, "An efficient gradient based algorithm for on line training of recurrent network trajectories," Neural Computation, vol. 2, pp. 490-501, 1990.
[5] Danilo P. Mandic and Jonathon A. Chambers, "A normalised real time recurrent learning algorithm," Signal Processing, vol. 80, pp. 1909-1916, 2000.
[6] Jens Baltersee and Jonathon A. Chambers, "Nonlinear adaptive prediction of speech with a pipelined recurrent neural network," IEEE Transactions on Signal Processing, vol. 46, no. 8, August 1998.
[7] Danilo P. Mandic and Jonathon A. Chambers, "Toward an optimal PRNN-based nonlinear predictor," IEEE Transactions on Neural Networks, vol. 10, no. 6, November 1999.
[8] N.H. Packard, J.P. Crutchfield, J.D. Farmer, and R.S. Shaw, "Geometry from a time series," Physical Review Letters, vol. 45, p. 712, 1980.
[9] F. Takens, "Detecting strange attractors in turbulence," in D. Rand and L.S. Young (Eds.), Dynamical Systems and Turbulence, Lecture Notes in Mathematics, vol. 898, Springer, Berlin, 1981, p. 336.
[10] H. Kantz and T. Schreiber, Nonlinear Time Series Analysis, Cambridge University Press, Cambridge, 1997.
[11] A.M. Fraser and H.L. Swinney, "Independent coordinates for strange attractors from mutual information," Physical Review A, vol. 33, pp. 1134-1140, 1986.
[12] P. Grassberger and I. Procaccia, "Measuring the strangeness of strange attractors," Physica D, vol. 9, pp. 189-208, 1983.
[13] M.B. Kennel, R. Brown, and H.D.I. Abarbanel, "Determining embedding dimension for phase-space reconstruction using a geometrical construction," Physical Review A, vol. 45, p. 3403, 1992.
[14] H.S. Kim, R. Eykholt, and J.D. Salas, "Nonlinear dynamics, delay times, and embedding windows," Physica D, vol. 127, pp. 48-60, 1999.
[15] D. Kugiumtzis, "State space reconstruction parameters in the analysis of chaotic time series: the role of the time window length," Physica D, vol. 95, pp. 13-28, 1996.