

IEEE TRANSACTIONS ON POWER SYSTEMS, VOL. 21, NO. 1, FEBRUARY 2006

Forecasting System Imbalance Volumes in Competitive Electricity Markets


Maria P. Garcia and Daniel S. Kirschen, Senior Member, IEEE
Abstract: Forecasting in power systems has been made considerably more complex by the introduction of competitive electricity markets. Furthermore, new variables need to be predicted by various market participants. This paper shows how a new methodology that combines classical and data mining techniques can be used to forecast the system imbalance volume, a key variable for the system operator in the market of England and Wales under the New Electricity Trading Arrangements (NETA).

Index Terms: Data mining, electricity markets, multidimensional forecasting, neural networks, time series.

I. INTRODUCTION

FORECASTING in power systems used to deal mostly with predicting future values of the load. With the introduction of competitive electricity markets, forecasting has become not only a much larger but also a much more complex topic. While predicting future prices is obviously an important issue, it is not the only question for which market participants seek answers. Their commercial performance is also strongly affected by their ability to predict other variables such as the volumes that will be traded in different segments of the market. Since in a competitive environment all participants have the freedom to operate independently, the overall level of uncertainty in the operation of the power system increases, and the variables that might be relevant proliferate. Choosing, among the dozens of variables that are recorded in a typical market, those that are the best predictors of the quantity of interest becomes a major challenge.

This paper explores how data mining can be used to select the most promising predictors and how new and traditional techniques can be combined to develop more accurate forecasting tools. It adopts the perspective of a system operator that must procure the energy needed to maintain the balance between generation and load. If system operators can accurately forecast the amount of balancing energy needed during each market period, they can purchase this energy on the forward market rather than on the spot market. Such an approach usually helps minimize the balancing cost. In the New Electricity Trading Arrangements (NETA), which are in effect in England and Wales, this key variable is called the net imbalance volume (NIV). Traditional time series forecasting methods are first compared with neural network techniques. The advantages of multidimensional forecasting over one-dimensional approaches are then highlighted. This paper also shows how exploratory analysis can be used to describe the structure of the data series and determine the relationships between the different variables involved in the forecasting process. Results based on actual NETA market data demonstrate the accuracy that can be achieved.

II. FORECASTING IN ELECTRICITY MARKETS

Dash et al. [1], Bunn [2], Sfetsos [3], and Kermanshahi et al. [4] provide overviews of the progress that has been made recently in the broad field of forecasting in power systems. The introduction of competitive electricity markets has considerably increased not only the complexity of this task but also the breadth of this field. Forecasting is no longer an activity performed only by the system operator. All market participants must do some forecasting to maximize their profitability and control their exposure to risk. Load is no longer the only uncertain variable that must be forecasted. Market participants are interested in prices [5]-[10], traded volumes, and market length.

Market variables are much noisier than the system load. The values of these variables are driven in complex ways by many interacting factors. It is thus important to expand previous one-dimensional approaches (see, for example, [6]) to multidimensional inputs. The amount of data to be considered is huge and involves not only the market clearing data but also the positions that the participants took for every market period and synthetic indicators of market activity. Changes in market rules affect the way some variables are calculated and influence the behavior of market participants. These changes reduce the amount of historical data that can reliably be used for forecasting.

One could try to adapt to electricity markets the techniques that are used for forecasting the behavior of other markets. Unfortunately, these techniques cannot easily be applied to electricity because it differs significantly from other physical commodities, such as corn or oil, and from financial instruments, such as stocks and bonds. Furthermore, the rules and characteristics of electricity markets are quite different from those of these other markets [11].

III. NETA AND THE NIV

Since March 2001, electricity trading in England and Wales has been governed by NETA. Unlike its predecessor, the Electricity Pool of England and Wales, NETA does not dictate how
Manuscript received December 20, 2004; revised July 6, 2005. This work was supported in part by the National Grid Transco and in part by Statsoft. Paper no. TPWRS-00661-2004. The authors are with the Department of Electrical Engineering and Electronics, University of Manchester, Manchester M60 1QD, U.K. (e-mail: Daniel.kirschen@manchester.ac.uk). Digital Object Identifier 10.1109/TPWRS.2005.860924

0885-8950/$20.00 © 2006 IEEE

GARCIA AND KIRSCHEN: FORECASTING SYSTEM IMBALANCE VOLUMES


Fig. 1. NETA timeline.

electrical energy is to be bought and sold. Instead, it establishes a framework for bilateral trading between generators, suppliers, traders, and consumers [12]. Participants can choose the timing and the instruments they use for trading. NETA only provides mechanisms for keeping the system in balance and for settling the imbalances that inevitably arise between the physical and contractual positions of the market participants. Fig. 1 illustrates the timeline for market operation under NETA. At gate closure, one hour ahead of real time, bilateral trading comes to an end for the current half-hourly trading period, and all participants must notify the system operator of their expected production or consumption. They may also submit to the system operator bids and offers expressing their willingness to deviate from these levels. These bids and offers in the balancing mechanism involve a quantity, a price, and technical parameters, indicating the speed at which these adjustments can be made. The system operator chooses the bids and offers needed to keep the generation and the load in balance and to maintain the security of the system. If the system is short, the operator accepts offers from generators to increase their production or bids from the demand side to reduce their consumption. On the other hand, if the system is long, the system operator accepts bids to reduce the output of generators or offers from the demand side to increase the load. Keeping the system in balance thus has a cost that is ultimately passed on to the consumers. To keep this cost under control, the regulator and the system operator agree each year on an annual target cost. If the system operator manages to operate the system for less than this target cost, it is rewarded by being allowed to retain part of the difference [13]. On the other hand, if it exceeds the target, it must pay part of the excess. 
This scheme gives the system operator, which is a for-profit company, a strong incentive to minimize the balancing cost. Being able to forecast accurately the amount of balancing energy that it will need to buy or sell during each half-hourly market period helps the system operator meet this goal. Instead of accepting some of the bids and offers that are made by market participants in the balancing mechanism, the system operator also has the option to buy or sell energy in the forward market. Since the prices it can get through this advance trading in the forward market are often better than those that it can achieve through the balancing mechanism, this strategy can save a significant amount of money as long as the forecast is sufficiently accurate. If the forecast is incorrect, the system operator might indeed have to compensate for excessive trades it made in the forward market by buying or selling energy in the balancing mechanism. The system operator is thus very interested in forecasting as accurately as possible the NIV, which is defined as the algebraic

Fig. 2. NIV detail from 01/04/2001 to 21/11/2004 in megawatts.

sum of the imbalances of all the individual market participants. This variable represents the total net energy that it must buy or sell in the forward market or through the balancing mechanism. Fig. 2 shows the values of NIV over a seven-month period. Observation of this figure suggests that this variable does not display any obvious seasonality, such as the daily and weekly patterns that one can observe in the demand for electrical energy.

IV. COMPARISON OF TRADITIONAL TIME SERIES ANALYSIS AND DATA MINING TECHNIQUES

Exploratory time series analysis aims to identify the nature and the structure of an event through observation of its past behavior. Time series analysis can also be used to predict future values of a variable based on these past values. Traditional forecasting techniques [e.g., autoregressive integrated moving average (ARIMA), exponential smoothing] are limited to predicting values for one variable based on its previous values [14]. ARIMA also assumes that the time series is stationary, that the values are normally distributed, and that the residuals are independent. Many of the time series relevant to electricity markets do not satisfy these conditions. More complex models must be used to forecast nonperiodic, nonstationary, and noisy series. Emerging data mining modeling techniques can be adapted to uncover nonlinear relations in a priori irregular data. This can lead to more accurate forecasting techniques [15].

The Cross-Industry Standard Process for data mining was developed in the mid-1990s to organize the process of transforming raw data into useful information [16]. This data mining process can be adapted to data analysis and forecasting in electricity markets. The basic steps of this process are defined as follows.

Business understanding: In the context of this paper, this step involves understanding the rules of the market.
Data understanding: Having collected all the relevant data, their structure must be analyzed. Exploratory time series analysis techniques (such as correlation analysis, singular spectrum decomposition, and distributed lags analysis) are used to determine the statistical characteristics of the series.
Data preparation: This step involves data selection, data cleaning, data construction, data integration, and data formatting [17]. While these tasks may appear mundane, they are critical to the success of the whole process.


Modeling: Various techniques are applied in this phase, and the corresponding parameters are adjusted to their optimal values. Wehenkel [18] provides a complete description of data mining modeling techniques and their application to power systems. Nowadays, modeling for forecasting is most commonly based on neural networks (NNs).
Evaluation: Once the model has been created, its output must be assessed. For forecasting, this step involves the computation of error measurements to quantify the accuracy that has been achieved and some sensitivity analysis to appraise the robustness of the process.
Deployment: In forecasting, this phase involves the automation of the collection of real-time data and the timely processing of this data.

These steps are not necessarily completed in sequential order. It may be necessary to iterate between them. Traditional time series analysis and data mining are not incompatible techniques. Classical exploratory time series techniques are a useful tool not only to identify the time structure of a series but also to select the input variables and analyze the relations that might exist between them. In practice, they are complementary techniques, with time series analysis aiding in the critical steps of the data mining process.

V. ONE-DIMENSIONAL ANALYSIS OF NIV

One-dimensional analysis uses time series to identify the time structure of the data and then to produce a medium-term forecast (one week ahead) of this variable [19].

A. Analysis of the Time Structure of NIV Data

The objective of this analysis is to identify NIV's time structure and to separate its main temporal components: trend, seasonality, and noise. The data set is preprocessed to better expose the embedded information while preserving the original nature of the pattern [20], [21]. The two steps involved in this preprocessing are as follows.
Filtering and smoothing: Moving medians with window lengths of 8 and 48 periods are applied to the original series.
Variable transformations: mean subtraction, normalization, linear trend subtraction, and autocorrelation correction [22].

The transformed series are the inputs used in the modeling phase. The modeling tools applied to uncover the data time structure are as follows.

Autocorrelation and partial autocorrelation analysis [23] is used to detect NIV's seasonal patterns. The partial autocorrelation is an extension of the autocorrelation function that clarifies the existence of seasonal effects by removing the effect of the correlation of the intermediate elements within a specific lag.
Spectrum Fourier analysis [24] is a mathematical prism that decomposes the data into its sinusoidal components to detect seasonal and cyclical components.
Caterpillar decomposition [25] decomposes the original series into independent additive time series. This decomposition is used to identify, extract, and isolate the trend,

Fig. 3. Autocorrelation and partial autocorrelation of the current value of NIV with previous values (i.e., consecutive lags).

oscillatory, and noise components of the transformed series.

Fig. 3 shows the correlation and partial correlation results obtained with NIV data. The correlation results show the high autocorrelation at lag 1 and how it decays slowly thereafter. The partial correlation also shows how, despite the strong correlation at lag 1, none of the partial autocorrelations at higher lags are important. The spectrum Fourier analysis gives consistent results for all the transformed series, confirming the noisy structure of the NIV data. Similarly, the caterpillar decomposition does not detect strong seasonal components but can be used as a filter to eliminate the high-frequency components that are present in NIV. The combination of these results for the original and differentiated NIV series shows that the NIV time series has no seasonality, no constant mean, and a constant noisy structure [26].

B. NIV Forecasting Using Time Series Techniques

This section compares the performance of different time series forecasting methods when applied to NIV and explores the effect of the size of the training data set on the quality of the forecast. The following three techniques have been investigated [14], [22]:

ARIMA, which combines the facts that elements of time series are serially dependent (autoregressive process) and that each of these elements is affected by past errors not included in the autoregressive process (moving average process);
exponential smoothing, which operates like a moving average process where the recent observations are given a higher weight than older ones;
caterpillar forecasting, which is based on the linear recurrent formula of the decomposition of the same name.

The data analyzed consist of a series smoothed on the basis of an eight-period window. The original series is divided into seen (training) and unseen data blocks. Unseen data sets correspond to a one-week period (42 observations). Two different unseen data sets are selected. In the first one, NIV presents an increasing trend and, in the second one, a decreasing trend. For each of these unseen data sets, the seen data sets consist of series of 500, 1000, and 1500 observations. Fig. 4 illustrates a detail of the forecasted values for the decreasing trend data set.
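The seen/unseen split described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `split_seen_unseen`, the synthetic series, and the position of the unseen week are hypothetical and are not taken from the paper or its data.

```python
import random


def split_seen_unseen(series, unseen_start, n_seen, n_unseen=42):
    """Return (seen, unseen) blocks around index `unseen_start`.

    `seen` holds the n_seen observations immediately before the unseen
    week; `unseen` holds the n_unseen observations to be forecast.
    """
    if unseen_start - n_seen < 0 or unseen_start + n_unseen > len(series):
        raise ValueError("not enough observations for the requested blocks")
    seen = series[unseen_start - n_seen:unseen_start]
    unseen = series[unseen_start:unseen_start + n_unseen]
    return seen, unseen


# Synthetic stand-in for the smoothed NIV series (assumption, not market data).
random.seed(0)
niv = [random.gauss(0.0, 800.0) for _ in range(2000)]

# Same unseen week, three different amounts of seen data.
blocks = {n: split_seen_unseen(niv, unseen_start=1600, n_seen=n)
          for n in (500, 1000, 1500)}
```

Keeping the unseen block fixed while the seen block grows isolates the effect of the training set size on each forecasting method.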


In summary, these results show that past NIV data are not a good predictor of future values. NIV time series are noisy, unstructured, changing, and normally distributed. It should not come as a surprise that NIV is a clear example of the central limit theorem because it results from a large number of actions by different market participants acting independently.

VI. MULTIDIMENSIONAL ANALYSIS OF NIV

The results from the one-dimensional analysis suggest that a multidimensional approach is necessary to effectively forecast NIV. One of the main challenges when applying multidimensional forecasting techniques is the selection of the most relevant input variables. In the case of NIV, an exploratory analysis was performed to understand the possible interactions between NIV and other variables from the balancing mechanism. This made possible a systematic and rational selection of the variables used as input to the forecasting [19].

A. Multidimensional Exploratory Analysis

To detect the qualitative, quantitative, and temporal relations between NIV and the other variables, the following techniques were applied.

Time series analysis [23] was used to uncover the relations between the variables at different points in time. This included a cross spectrum analysis to determine the correlation between variables at different frequencies, as well as a distributed lag analysis to evaluate the delayed relationships between variables, i.e., the lagged effect on NIV of other variables.
Multivariate exploratory techniques [27] were used to understand multidimensional relations and their statistical significance. For example, multidimensional correlation analysis was used to measure the linear relations between variables. Similarly, data mining Kohonen networks [28] determined cluster areas of common values mapped in the two-dimensional spatial information given by the output neurons.

All variables are first preprocessed using moving medians to filter and smooth the series.
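A minimal sketch of this preprocessing chain (moving-median filtering, followed by the normalization and differencing that the text applies next) is given below. The trailing window, the zero-mean/unit-variance normalization, and all function names are assumptions made for illustration; the paper does not specify these details.

```python
from statistics import mean, median, pstdev


def moving_median(values, window):
    """Trailing moving median (assumed alignment).

    The first window-1 points use whatever shorter history is available.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(median(values[start:i + 1]))
    return out


def normalize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot normalize a constant series")
    return [(v - mu) / sigma for v in values]


def difference(values):
    """First difference: x[t] - x[t-1]."""
    return [b - a for a, b in zip(values, values[1:])]


def preprocess(values, window):
    """Filter, normalize, then difference a series."""
    return difference(normalize(moving_median(values, window)))
```

For example, `preprocess(series, window=8)` would correspond to the eight-period smoothing used for the weekly forecasts, and `window=48` to the daily one.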
Each variable is then normalized and differentiated. Table I shows how the use of the transformed series allows the analysis of different interactions between NIV and the rest of the balancing mechanism variables. The input variables have been divided into pre-gate closure and post-gate closure variables, where gate closure refers to the time when the bilateral forward markets close and the balancing mechanism begins operation until real time. Under NETA, gate closure takes place one hour before each half-hour trading period. Pre-gate closure variables include demand forecast, submitted offer volumes, submitted bid volumes, maximum declared capacity at gate closure, gate closure imbalance volume (GCIV), market imbalance volume, and activity on the electronic power exchange. Post-gate closure variables include demand, demand forecast error (DFE), post-gate closure effects, accepted bid volumes, accepted offer volumes, accepted bid cashflows, accepted offer cashflows, and imbalance prices. Statistical analysis of the pre-gate closure variables only confirms the expected relation between GCIV and NIV. On the

Fig. 4. NIV forecasted solutions (in megawatts) for different forecasting bases (500, 1000, and 1500 cases).

It shows how the forecasts converge quickly either to an almost constant value [for a (1,0,6) ARIMA model] or to a constant slope (exponential smoothing). Only the caterpillar forecasting method is capable of predicting an oscillatory behavior. With an average prediction error of 500 MW (50%) [19], the numerical error measurements do not show any consistent advantage for any one method over the other two. Not only is the accuracy of the forecast poor, but there are also substantial differences between the dynamics of NIV and the various forecasts. The number of cases included in the training data sets has a different effect on each method. For ARIMA, it only affects the final value and, for exponential smoothing, the final slope. On the other hand, for the caterpillar method, an increase in the number of observations affects the series decomposition and the reconstruction that forms the basis of the forecast. The forecasted results become smoother as the amount of seen data increases (similar results are obtained for the increasing trend data set).
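The convergence behavior noted above can be illustrated with two simplified models. This is a sketch under assumptions: a pure AR(1) process stands in for the (1,0,6) ARIMA model (the moving average terms only affect the first few steps), and Holt's linear method stands in for the unspecified exponential smoothing variant. Function names and parameter values are hypothetical.

```python
def ar1_forecast(last_value, mean_level, phi, horizon):
    """h-step-ahead AR(1) forecasts: mean + phi**h * (last - mean).

    For |phi| < 1 the forecasts decay geometrically toward the mean,
    i.e., toward an almost constant value.
    """
    return [mean_level + phi ** h * (last_value - mean_level)
            for h in range(1, horizon + 1)]


def holt_forecast(values, alpha, beta, horizon):
    """Holt's linear exponential smoothing.

    After the last observation, every forecast lies on a straight line
    defined by the final level and the final trend (constant slope).
    """
    level, trend = values[0], values[1] - values[0]
    for y in values[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + h * trend for h in range(1, horizon + 1)]


# One week ahead = 42 observations, as in the unseen data sets.
ar_path = ar1_forecast(last_value=500.0, mean_level=0.0, phi=0.5, horizon=42)
holt_path = holt_forecast([0.0, 1.0, 2.0, 3.0, 4.0], alpha=0.5, beta=0.5, horizon=3)
```

Neither model can reproduce an oscillatory week, which is consistent with the observation that only the caterpillar method predicts such behavior.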


TABLE I RELATION BETWEEN PREPROCESSED INPUT AND OUTPUT VARIABLES

tackled by any structure of NN suitable for regression purposes, provided that the data set is suitably preprocessed into the correct form. Choosing the optimal NN structure is thus an important task.

2) Development of the NNs: The main stages of the development of the NNs used to forecast NIV follow the standard data mining procedure [22], [31].

Data selection: The selection of variables is guided by the physical quantity that they represent and by the results of the multivariable exploratory analysis. Experience demonstrated that the choice of variables has a strong influence on the quality of the results. Including too many variables or the wrong variables can lead to dimensionality problems.
Cases selection: Three different data sets are required to develop an NN for forecasting. The training data set is used to train the network. The selection data set is used to select the best trained network. Finally, the test data set (containing only unseen data) is used to measure the performance of the networks. The number of cases needed for both the training and the selection data sets depends on the number of connections and the complexity of the function to be modeled. However, over-fitting problems can arise if excessively large sets are used.
Data preparation: NNs usually require pre- and post-processing of the data, first to adapt the input variables to the characteristics of the neurons' activation functions and second to transform the output back to the normal data range.
Training: This process consists of a progressive adaptation of the NN parameters to learn the desired behavior. Several networks of different architectures are trained using the training data set. Each network produces its own prediction for the unseen data set.
Assessment: Once the different networks have been created, several measures of performance are obtained (error measurements, training performance, sensitivity analysis, and residuals analysis). These results can also be used as feedback to modify the parameters of the networks. The best architecture is selected based on performance over the selection data set.

This development process does not produce a unique network that can be used in all cases. Various architectures produce optimal results for different forecasting conditions. A different optimal network, therefore, must be created for each forecasting scenario. The parameters of the networks also need to be adjusted to the timeframe considered. The following two forecasting scenarios were considered.

Case 1: One month ahead forecast performed on a daily basis. Each forecast value represents the median NIV value for a whole day.
Case 2: One week ahead forecast where six values are forecast for each day. Working and nonworking days are treated separately.
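The two time resolutions can be sketched as block medians of the half-hourly series. This is a hedged illustration: it assumes that the six daily values of Case 2 are medians over consecutive eight-period (four-hour) blocks, which matches the eight-period window mentioned in the text but is not stated explicitly. The function names are hypothetical.

```python
from statistics import median

PERIODS_PER_DAY = 48  # half-hourly trading periods


def block_medians(values, block):
    """Median of each consecutive, non-overlapping block of `block` values."""
    if len(values) % block != 0:
        raise ValueError("series length must be a multiple of the block size")
    return [median(values[i:i + block]) for i in range(0, len(values), block)]


def case1_daily_values(half_hourly):
    """Case 1: one median NIV value per day."""
    return block_medians(half_hourly, PERIODS_PER_DAY)


def case2_six_per_day(half_hourly):
    """Case 2 (assumed): six values per day, one per eight-period block."""
    return block_medians(half_hourly, PERIODS_PER_DAY // 6)


# Two days of toy data: 0, 1, ..., 95.
two_days = list(range(96))
daily = case1_daily_values(two_days)
six_a_day = case2_six_per_day(two_days)
```

With this convention, one week of Case 2 data contains 7 x 6 = 42 values, the same count as the one-week unseen data sets used in the one-dimensional analysis.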

other hand, the results obtained with the post-gate closure variables show that these variables can be used as input variables for forecasting NIV. A delayed time relation between NIV and the DFE was exposed. However, due to the time scale used for forecasting, this relation can be omitted.

B. Multidimensional NN Forecasting

This forecasting technique is based on the relation between the past (seen) values of the balancing mechanism variables and the future (unseen) values of NIV. However, the relations linking the past and future values of these variables are neither simple nor linear. Data mining techniques, in particular NNs, have been shown to be able to uncover these complex associations while maintaining the time structure of the analyzed series.

1) Possible NN Architectures: NN techniques continue to be enhanced through improvements in computational performance and in the flexibility of the software used for their implementation [15], [29], [30]. Depending on the ways the input, hidden, and output neurons are connected, various NN architectures are produced. In this project, the following structures were used and tested:

linear networks (LNs);
multilayer perceptron (MLP);
radial basis function (RBF);
probabilistic neural networks (PNNs);
generalized regression networks (GRNNs).

The selection of a network architecture depends on the problem characteristics. While MLP is one of the most popular architectures for modeling functions of any complexity, RBFs are extensively used for large and recurrent problems as they are quick to train. PNNs, on the other hand, have been extensively used for classification problems considering the outputs as the probability value of class membership. Finally, GRNNs are similar to the PNN, but they perform only regression tasks. Although these different network structures work on different principles, all of them can be adapted to suit a specific problem. For example, a PNN can be used for regression and forecasting problems if the output is treated as the expected value of the model. One important characteristic of NIV forecasting is the structure of the time series of both the input variables and the forecasted output. This transforms the problem of forecasting NIV into a specialized form of regression. As such, it can be

3) Data Preparation and Selection: All variables are filtered using smoothing moving medians with windows of 48 periods for Case 1 and eight periods for Case 2. The series are then normalized. Finally, the time relation between the input and output


TABLE II DEFINITION OF THE TRAINING, SELECTION, AND TEST DATA SETS FOR CASE 1

TABLE III BEST NNs AND CORRESPONDING ERRORS FOR EACH SUBCASE OF CASE 1

variables is transformed to expose the information in such a way that the present values of the input variables can predict future values of NIV. Let $\mathbf{v}_t$ be a vector from the known (training and selection) data set

$$\mathbf{v}_t = \left[x_{1,t},\; x_{2,t},\; \ldots,\; x_{n,t},\; \mathrm{NIV}_{t+k}\right] \tag{1}$$

where
$t$ is the time index (e.g., $t = 1$ corresponds to day one);
$x_{i,t}$ is the $i$th balancing mechanism variable at time $t$;
$\mathrm{NIV}_{t+k}$ is the value of NIV (i.e., the output) at time $t+k$, with $k$ the forecasting lead.

In the training and selection data sets, all the input and output components of the vectors are known. The unknown (or test) data set is formed by vectors defined as

$$\mathbf{v}_T = \left[x_{1,T},\; x_{2,T},\; \ldots,\; x_{n,T},\; \mathrm{NIV}_{T+k}\right] \tag{2}$$

where
$T$ is the time index;
$x_{i,T}$ is the $i$th balancing mechanism variable at time $T$;
$\mathrm{NIV}_{T+k}$ is the value of NIV at time $T+k$.

In the test data set, known values of the balancing mechanism variables (input) allow us to calculate the unknown (future) values of NIV (output). The data selection process differs for each of the two cases that were considered.

Case 1 includes six different subcases. Each of them corresponds to a four-week period of data. Table II shows the definitions of the training, selection, and test data sets.
Case 2 consists of twelve different subcases, each of them corresponding to a week's worth of data. The forecasted period consists of 12 consecutive weeks (weeks 17 to 29 in 2002) divided into working and nonworking days. The training, selection, and test data sets were selected as follows.

Working days: The selection data set consists of the week preceding the test data. The training data set consists of the four-week period finishing two weeks prior to the test data set.

Nonworking days: The selection data set consists of the nonworking days of the previous week. The training data set corresponds to a two-week period finishing two weeks prior to the test data set.

The variables used as predictors for this forecast are the demand forecast, DFE, accepted bid volumes, accepted offer volumes, forward trades, gate closure imbalance volume, min(accepted offers, accepted bids), imbalance prices, and type of day (Monday, Tuesday, ...).

4) Results:

a) Case 1: Forecast Over a One-Month Period: Table III shows the NN configurations that produce the smallest error for each month as well as the accuracy that has been achieved in terms of the root mean-squared error (RMSE) and the mean absolute relative error (MARE), which are defined in (3) and (4), respectively. The numbers following the network names indicate the number of units in the input, hidden, and output layers, separated by a colon (e.g., 12:72-5-1:1 indicates 12 units in the input layer; 72, 5, and 1 units, respectively, in the first, second, and third hidden layers; and 1 unit in the output layer). For the cases considered, the optimum network architectures are either multilayer perceptron or radial basis function. Linear networks perform badly in all the cases studied. These results show that there is not a unique network structure or even type that is best for all the forecasted months.

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{NIV}_i - \widehat{\mathrm{NIV}}_i\right)^2} \tag{3}$$

$$\mathrm{MARE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{\mathrm{NIV}_i - \widehat{\mathrm{NIV}}_i}{\mathrm{NIV}_i}\right| \tag{4}$$
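The two error measures can be computed as in the sketch below, which assumes the usual textbook definitions: RMSE as the square root of the mean squared forecast error, and MARE as the mean of the absolute errors relative to the actual values. The function names and the sample numbers are illustrative only.

```python
from math import sqrt


def rmse(actual, forecast):
    """Root mean-squared error between actual and forecast values."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mare(actual, forecast):
    """Mean absolute relative error, relative to the actual values.

    Undefined when an actual value is zero, so that case is rejected.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MARE is undefined when an actual value is zero")
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical NIV values in MW, for illustration only.
actual_niv = [100.0, 200.0]
forecast_niv = [110.0, 190.0]
error_rmse = rmse(actual_niv, forecast_niv)
error_mare = mare(actual_niv, forecast_niv)
```

Because NIV changes sign and can be close to zero, a relative measure such as MARE can be dominated by a few small actual values, which is one reason to report it alongside RMSE.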

Fig. 5 compares in the time domain the actual values of NIV with the best forecasts for each month.

b) Case 2: Forecast Over a One-Week Period: Similar tests show that for forecasts over a one-week period, the multilayer perceptron is the network architecture that provides optimum solutions for most of the analyzed cases and for both working and nonworking days [19]. Table IV shows the minimum errors for working and nonworking days separately and a weighted average for the weekly error. This table shows that there is a big difference between the forecasting accuracy for working and nonworking days. Forecasts for nonworking days are more accurate than those for working days. Possible reasons for this difference include the following.


Fig. 5. Actual and forecasted values of NIV for the month-ahead forecasts using the best network.

TABLE IV MEASURES OF ERROR FOR CASE 2 (WEEKLY FORECASTING)

The number of cases considered: For working days, the forecast is based on 30 cases, while for nonworking days, it is based on 12 cases only.
The difference in input data: NIV is less volatile for nonworking days than it is for working days. The standard deviation of NIV for working days is 835.47 MW, while it is only 721.11 MW for nonworking days.

c) Sensitivity Analysis: As part of the assessment process, a sensitivity analysis was performed to evaluate the relative importance of the input variables on the accuracy of the forecast. This evaluation was based on the effect that omitting a predictor from the development of the NN has on the accuracy of the forecast. Predictors were then ranked on the basis of the deterioration that their omission causes. This process was repeated for 12 separate single week forecasts for working days. Table V shows the relative importance of each predictor based on this sample. It also shows the range of the rankings (1 for most important, 15 for least important) and a raw cumulative score obtained by summing the rankings of each variable for each of the 12 weekly forecasts. These results suggest that some variables are better predictors than others but that the relative importance varies from week to week. For all the cases considered, ignoring a predictor never improves the accuracy of the forecast. 5) Discussion: While the NNs that have been developed are able to predict NIV with reasonable accuracy for both weekly and monthly horizons, no single network architecture provides optimal results for all market conditions. A new network must, therefore, be developed regularly if this accuracy is to be maintained. Comparing the results obtained for cases 1 and 2 shows that increasing the frequency of observations does not improve the accuracy of the results. A more accurate forecast is obtained on a monthly basis (average MAE: 363 MW) than on a weekly basis (average MAE: 440 MW). This can be explained by the nature of the data: A daily aggregation of NIV is less scattered than an


TABLE V SENSITIVITY ANALYSIS FOR WEEKLY FORECAST (WORKING DAYS)
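The ranking procedure behind Table V (omit one predictor, retrain, measure the deterioration, rank, then repeat over several weekly forecasts) can be sketched as follows. This is only an illustration under stated assumptions: a least-squares model stands in for the NN used in the paper, the data are synthetic, and the function names (fit_predict, rank_predictors, summarize) are hypothetical.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Stand-in model (ASSUMPTION): least squares with intercept, not the paper's NN."""
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([X_test, np.ones(len(X_test))]) @ coef

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))

def rank_predictors(X_train, y_train, X_test, y_test):
    """Rank predictors (1 = most important) by the MAE deterioration
    caused by omitting each one from model development."""
    n = X_train.shape[1]
    base = mae(y_test, fit_predict(X_train, y_train, X_test))
    deterioration = np.empty(n)
    for j in range(n):
        keep = [k for k in range(n) if k != j]
        err = mae(y_test, fit_predict(X_train[:, keep], y_train, X_test[:, keep]))
        deterioration[j] = err - base
    order = np.argsort(-deterioration)      # largest deterioration first
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(1, n + 1)
    return ranks

def summarize(weekly_ranks):
    """Range of rankings and raw cumulative score over all weekly forecasts."""
    r = np.asarray(weekly_ranks)
    return {
        "cumulative_score": r.sum(axis=0),  # lower total = more important
        "best_rank": r.min(axis=0),
        "worst_rank": r.max(axis=0),
    }

# Synthetic example (ASSUMPTION): 3 predictors, 12 weekly forecasts,
# 30 training cases and 5 test cases per week.
rng = np.random.default_rng(0)
n_weeks, n_train, n_test, n_pred = 12, 30, 5, 3
weekly_ranks = []
for _ in range(n_weeks):
    X = rng.normal(size=(n_train + n_test, n_pred))
    y = 5.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=n_train + n_test)
    weekly_ranks.append(
        rank_predictors(X[:n_train], y[:n_train], X[n_train:], y[n_train:])
    )
summary = summarize(weekly_ranks)
print(summary)
```

Summing ranks gives the raw cumulative score described in the text, while the best and worst rank per predictor show how much its importance varies from week to week.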

The number of cases and their selection for the required data sets can be modified as more data become available. Different forecasting scenarios may require a different number of cases to be included in the data sets. The more up-to-date information the network can use for learning, the better the forecast becomes. However, it is important to avoid overtraining since that would lead to an inflexible network and inaccurate results.

Multidimensional forecasting using NNs gives better accuracy than other multidimensional methods based on linear regression. Table VI compares the monthly forecast obtained with NNs and with a production-grade program based on linear regression methods. Table VII presents a similar comparison for working and nonworking days in the weekly scenario. NNs also outperform one-dimensional linear forecasting methods. Table VIII compares the accuracy of multidimensional NNs with one-dimensional methods for the weekly forecast. In all cases, the NNs outperform other methods, even when the time horizon is larger than the one considered in the one-dimensional forecasting.

VII. CONCLUSION

Competition increases the need for forecasting in power systems. New market variables present complex structures that are not easily modeled by traditional techniques. A recently developed data mining technique can provide the necessary tools to uncover nonlinear relations between these variables and identify the best predictors for the variables to be forecast. The application of classical techniques in combination with data mining NNs yields a more accurate and realistic performance than conventional forecasting techniques. However, to maintain a reasonable accuracy, these networks must be updated on a regular basis.
While the results presented in this paper demonstrate that the proposed approach yields forecasts that are useful and financially valuable, it is clear that the accuracy of forecasts of market variables is still much lower than the accuracy that one can achieve when trying to predict daily or weekly load profiles and prices.

Differences in accuracy when forecasting different variables can be explained by a number of factors. For any variable, the main difficulties in forecasting arise from high noise, nonlinear effects, data availability, and length of the forecasting horizon. Therefore, when comparing the accuracy of forecasts of NIV and electricity prices, one should consider the following.

NIV is more volatile than prices. If the volatility is defined as the standard deviation of the rate of change of the normalized variables, NIV's volatility is 0.08 compared with 0.04 and 0.009 for the system buy and sell prices, respectively.

NIV needs to be forecasted one week to one month ahead, while electricity price forecasts are usually performed for shorter horizons [5]–[7], [9], [10], [32].

There are no clear linear relations between NIV and other market variables, while several studies have shown linear relations between prices and other market variables, such as demand and capacity shortfalls [7], [32].

NIV is a newer variable than prices. Much less historical data are thus available since the market only started in

TABLE VI ERROR COMPARISON FOR MONTHLY FORECAST

TABLE VII ERROR COMPARISON FOR WEEKLY FORECAST (WORKING AND NONWORKING DAYS)

TABLE VIII ERROR COMPARISON FOR WEEKLY FORECAST (ONE-DIMENSIONAL AND MULTIDIMENSIONAL NNs)
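The conclusion defines volatility as the standard deviation of the rate of change of the normalized variables. A minimal sketch of that measure is given below. The paper does not spell out the normalization or the rate-of-change formula, so two assumptions are made here: min-max scaling to [0, 1] and a first difference between consecutive periods. The function name volatility is hypothetical, and the sketch is not expected to reproduce the values 0.08, 0.04, and 0.009 quoted for the authors' data.

```python
import numpy as np

def volatility(series):
    """Standard deviation of the period-to-period change of a normalized series.

    ASSUMPTIONS (not stated in the paper): min-max normalization to [0, 1],
    first differences as the rate of change, population standard deviation.
    """
    x = np.asarray(series, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return 0.0                      # a constant series has no volatility
    normalized = (x - x.min()) / span   # scale to [0, 1]
    rate_of_change = np.diff(normalized)
    return float(np.std(rate_of_change))

# A series that swings between its extremes is far more volatile
# than a steady ramp over the same range.
print(volatility([0.0, 1.0, 0.0, 1.0]))
print(volatility([1.0, 2.0, 3.0, 4.0]))
```

Because the series is normalized first, the measure does not depend on units or offsets, which is what allows a volume such as NIV to be compared with prices.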

aggregation in six blocks of four hours per day (i.e., aggregated in so-called EFA blocks).


2001 and has since undergone several rule changes that have caused external modifications in the data.

Finally, the true measure of improvement when forecasting market imbalance volumes is not an abstract error index but rather the savings in balancing costs that this improvement makes possible. With a one-month-ahead time scale in particular, the 20% error reduction in the forecasted volume that the proposed method achieves makes possible a significant increase in the amount of energy that can be traded in the forward market and hence a substantial saving in balancing costs.

ACKNOWLEDGMENT

The authors would like to thank Dr. C. Aldridge for his assistance.

REFERENCES
[1] P. K. Dash, G. Ramakrishna, A. C. Liew, and S. Rahman, "Fuzzy neural networks for time-series forecasting of electric load," Proc. Inst. Elect. Eng., Gener., Transm., Distrib., vol. 142, no. 5, pp. 535–544, Sep. 1995.
[2] D. W. Bunn, "Forecasting loads and prices in competitive power markets," Proc. IEEE, vol. 88, no. 2, pp. 163–169, Feb. 2000.
[3] A. Sfetsos, "Short-term load forecasting with a hybrid clustering algorithm," in Proc. Inst. Elect. Eng., Gener., Transm., Distrib., vol. 150, May 2003, pp. 257–262.
[4] B. Kermanshahi and H. Iwamiya, "Up to year 2020 load forecasting using neural nets," Int. J. Elect. Power Energy Syst., vol. 24, pp. 789–797, 2002.
[5] J. Bastian, J. Zhu, V. Banunarayanan, and R. Mukerji, "Forecasting energy prices in a competitive market," Inst. Elect. Eng. Comput. Appl. Power, pp. 40–45, 1999.
[6] J. C. Cuaresma, J. Hlouskova, S. Kossmeier, and M. Obersteiner, "Forecasting electricity spot-prices using linear univariate time-series models," Appl. Energy, vol. 77, pp. 87–106, 2004.
[7] C. P. Rodriguez and G. J. Anders, "Energy price forecasting in the Ontario competitive power system market," IEEE Trans. Power Syst., vol. 19, no. 1, pp. 366–374, Feb. 2004.
[8] X. Wang, N. Hatziargyriou, and L. H. Tsoulakas, "A new methodology for nodal forecasting in deregulated power systems," IEEE Power Eng. Rev., pp. 48–51, 2002.
[9] F. J. Nogales, J. Contreras, A. J. Conejo, and R. Espinola, "Forecasting next-day electricity prices by time series models," IEEE Trans. Power Syst., vol. 17, no. 2, pp. 342–348, May 2002.
[10] J. Contreras, R. Espinola, F. Nogales, and A. J. Conejo, "ARIMA models to predict next-day electricity prices," IEEE Trans. Power Syst., vol. 18, no. 3, pp. 1014–1020, Aug. 2003.
[11] D. Pilipovic, "Energy risk: Valuing and managing energy derivatives," J. Energy Lit., vol. 4, p. 111, 1998.
[12] P. Stephenson and M. Paun, "Electricity market trading," Power Eng. J., vol. 15, no. 6, pp. 277–288, Dec. 2001.
[13] Office of Gas and Electricity Markets (OFGEM), "NGC System Operator Incentive Scheme from April 2004," Dec. 2003.
[14] R. S. Tsay, "Analysis of financial time series," in Wiley Series in Probability & Statistics. New York: Wiley, Dec. 21, 2001, p. 472.
[15] Z. Vojinovic, K. Vojislav, and R. Seidel, "A data mining approach to financial time series modeling and forecasting," Int. J. Intell. Syst. Account., Fin., Manage., vol. 10, pp. 225–239, 2001.
[16] C. Pete, C. Julian, K. Randy, and K. Thomas, "CRISP-DM 1.0 Step-by-Step Data Mining Guide," 2000.
[17] D. Pyle, Data Preparation for Data Mining. San Francisco, CA: Morgan Kaufmann, 1999.
[18] L. A. Wehenkel, Automatic Learning Techniques in Power Systems. Norwell, MA: Kluwer, 1998.
[19] M. P. Garcia, "Forecasting system imbalance volumes and analysis of unusual events in competitive electricity markets," Ph.D. dissertation, Univ. Manchester, Manchester, U.K., 2005.
[20] C. Antunes and A. Oliveira, "Temporal data mining: An overview," presented at the 7th ACM SIGKDD Int. Conf. Knowledge Discovery Data Mining (KDD-2001), San Francisco, CA, 2001.
[21] M. Gavrilov, D. Anguelov, P. Indyk, and R. Motwani, "Mining the stock market: Which measure is best?," presented at the Conf. Knowledge Discovery Data, Boston, MA, 2000.
[22] (2006) Electronic Statistics Textbook. Statsoft, Tulsa, OK. [Online]. Available: http://www.statsoft.com/textbook/stathome.html
[23] D. Peña, G. C. Tiao, and R. Tsay, "Basic concepts in univariate time series," in A Course in Time Series Analysis, Wiley series in Probability and Statistics. Probability and statistics, J. W. Sons, Ed. New York: Wiley-Interscience, 2000, p. 496.
[24] J. B. Elsner and A. A. Tsonis, "Singular spectrum analysis: A new tool in time series analysis," in A New Tool in Time Series Analysis, J. B. Elsner, Ed. New York: Plenum, 1996.
[25] N. Golyandina, V. Nekrutkin, and A. Zhigljavsky, Analysis of Time Series Structure. SSA and Related Techniques, 2001.
[26] C. Alexander, Market Models: A Guide to Financial Data Analysis. New York: Wiley, 2001.
[27] S. K. Kachigan, Multivariate Statistical Analysis: A Conceptual Introduction, 2nd ed. New York: Radius, 1991.
[28] T. Kohonen, Self-Organizing Maps, 3rd ed., 2001.
[29] T. Kolarik and G. Rudorfer, "Time series forecasting using neural networks," J. Time Series Neural Netw., pp. 86–94, 2004.
[30] S.-H. Chun and S. H. Kim, "Data mining for financial prediction and trading: Application to single and multiple markets," Expert Syst. Appl., vol. 26, pp. 131–139, 2004.
[31] Y. JingTao and T. C. Lim, "Guidelines for financial forecasting with neural networks," presented at the Int. Conf. Neural Information Processing, Shanghai, China, 2001.
[32] H. Y. Yamin, S. Shahidehpour, and Z. Li, "Adaptive short-term electricity price forecasting using artificial neural networks in the restructured power markets," Elect. Power Energy Syst., vol. 26, pp. 571–581, 2004.

Maria P. Garcia received the electrical engineer's degree from the Universidad Pontificia de Comillas, Madrid, Spain, in 2001 and the M.Sc. degree in power systems from the University of Manchester Institute of Science and Technology (UMIST), Manchester, U.K., in 2001. She is currently working toward the Ph.D. degree at UMIST.

Daniel S. Kirschen (M'86–SM'91) received the electrical and mechanical engineer's degree from the Free University of Brussels, Brussels, Belgium, in 1979 and the M.S. and Ph.D. degrees in electrical engineering from the University of Wisconsin-Madison in 1980 and 1985, respectively. From 1985 to 1994, he worked for Control Data Corporation and for Siemens. He is currently a Professor of electrical energy systems at the University of Manchester, Manchester, U.K.