
Electrical Power and Energy Systems 31 (2009) 409–417

journal homepage: www.elsevier.com/locate/ijepes

Hybrid evolutionary algorithms in a SVR-based electric load forecasting model


Wei-Chiang Hong *
Department of Information Management, Oriental Institute of Technology, 58, Sec. 2, Sichuan Rd., Panchiao, Taipei County, Taipei 220, Taiwan

Article info

Article history:
Received 18 January 2007
Received in revised form 16 September 2008
Accepted 20 March 2009

Keywords:
Support vector regression (SVR)
Chaotic genetic algorithm (CGA)
Electric load forecasting

Abstract

Accurate electric load forecasting has become the most important issue in energy management; however, electric load often presents nonlinear data patterns. Therefore, looking for a novel forecasting approach with strong general nonlinear mapping capabilities is essential. Support vector regression (SVR) reveals superior nonlinear modeling capabilities by applying the structural risk minimization principle to minimize an upper bound of the generalization error, which is quite different from ANN models, which minimize the training error. The purpose of this paper is to present an SVR model with a hybrid evolutionary algorithm (chaotic genetic algorithm, CGA) to forecast electric loads; CGA is applied to determine the parameters of the SVR model. With the increasing complexity and larger problem scale of electric loads, genetic algorithms (GAs) often face the problems of premature convergence, slowly reaching the global optimal solution, or becoming trapped in a local optimum. The proposed CGA, based on the chaos optimization algorithm and GAs, employs the internal randomness of chaos iterations and is used to overcome premature convergence to a local optimum in determining the three parameters of an SVR model. The empirical results indicate that the SVR model with CGA (SVRCGA) results in better forecasting performance than the other methods, namely SVMG (SVM model with GAs), the regression model, and the ANN model.
© 2009 Elsevier Ltd. All rights reserved.

* Tel.: +886 2 7738 0145x5316; fax: +886 2 7738 6310.
E-mail address: samuelhong@ieee.org

0142-0615/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.ijepes.2009.03.020

1. Introduction

In recent years, with the privatization and deregulation of power systems, accurate electric load forecasting has received more attention in regional and national systems. Errors in electric load forecasting may increase operating costs [1–3]. Overestimation of future load results in excess supply, which is also unwelcome to the international energy network. In contrast, underestimation of load leads to a failure to provide enough reserve and implies high costs for peaking units. Adequate electric production requires each member of the global cooperation to be able to forecast its demand accurately. However, predicting electric load is complex, because the influencing factors include climate factors, social activities, and seasonal factors. Climate factors depend on temperature and humidity; social factors are human social activities, including work, school and entertainment, that affect the electric load; seasonal factors include seasonal climate change and load growth year after year.

In the last few decades, there have been widespread references on efforts to improve the accuracy of forecasting methods. One of these methods is a weather-insensitive approach that uses historical load data to infer the future electric load. It is widely known as Box-Jenkins' ARIMA [4–6], which is theoretically based on univariate time sequences. In addition, Christianse [7] and Park et al. [8] proposed exponential smoothing models using Fourier series transformation to forecast electric load. Douglas et al. [9] considered verifying the impacts of the forecasting model in terms of temperature. They combined Bayesian estimation with a dynamic linear model for load forecasting. The results indicate that the presented model is suitable for predicting load with imperfect weather information. The disadvantage of these methods is that they are time consuming, particularly when the number of variables increases.

To achieve accurate load forecasting, state space and Kalman filtering technologies, developed to reduce the difference between actual loads and predicted loads (random error), are employed in load forecasting models. This approach introduces the periodic component of load as a random process. It requires more than 3–10 years of historical data to construct the periodic load variation and to estimate the dependent variables (load or temperature) of the power system [10,11]. Moghram and Rahman [12] proposed a model based on this technique and verified that the proposed model outperforms four other forecasting methods (multiple linear regression, time series, exponential smoothing, and knowledge-based approach). The disadvantage of these methods is that it is difficult to avoid observation noise in the forecasting process, especially when multiple variables are considered.

Regression models construct cause-effect relationships between electric load and independent variables. The most popular models are linear regression models, proposed by Asbury [13], which bring the "weather" variable into the forecasting model. Papalexopoulos and

Hesterberg [14] added the factors of "holiday" and "temperature" into their proposed model. The proposed model used the weighted least squares method to obtain robust parameter estimates in the presence of heteroskedasticity. Soliman et al. [15] proposed a multivariate linear regression model for load forecasting, including temperature and wind cooling/humidity factors. The empirical results indicate that the proposed model outperforms the harmonic model as well as the hybrid model. These models are based on a linear assumption; however, the use of these independent variables is unjustified because the terms are known to be nonlinear. Recently, Asber et al. [16] employed a kernel regression model to establish a relationship among past, current and future temperatures and the system loads to forecast the load in the Hydro-Québec distribution network. A set of past load history comprising weather information and load consumption is used. The paper proposes a class of flexible conditional probability models and techniques for classification and regression problems. A group of regression models is used, each one focusing on consumer classes characterizing specific load behavior. Numerical investigations show that the suggested technique is an efficient way of computing forecast statistics.

In the recent decade, many researchers have tried to apply artificial intelligence techniques to improve the accuracy of load forecasting. Knowledge-based expert systems (KBES) and artificial neural networks (ANNs) are the popular representatives. The KBES approaches constructed electric load forecasts by simulating the experience of system operators who were well experienced in the processes of electricity generation, such as Rahman and Bhatnagar [17]. The characteristic feature of this approach is that it is rule-based, which implies that the system transforms received information into new rules. In other words, an expert capability that is trained on existing presumptions can greatly increase forecasting accuracy [17–19]. This approach derives its rules from on-the-job training, and transforming the information logic into equations can sometimes be impractical.

Meanwhile, many researchers have also tried to apply ANNs to improve load forecasting accuracy. Dillon et al. [20] used adaptive pattern recognition and self-organizing techniques for short-term load forecasting. Dillon et al. [21] presented a three-layered feed-forward adaptive neural network to forecast short-term load. Their proposed model was trained by back-propagation. This model was additionally applied to real data from a power system, and superior comparative results against other methods are given. Meanwhile, Park et al. [22] applied a 3-layer back-propagation neural network to daily load forecasting problems. The inputs include three indices of temperature: average, peak and lowest loads. The outputs are peak loads. The proposed model outperforms the regression model and the time series model in terms of the forecasting accuracy index, mean absolute percent error (MAPE). Novak [23] applied radial basis function (RBF) neural networks to forecast electricity load. The results indicate that RBF is at least 11 times faster and more reliable than back-propagation neural networks. Darbellay and Slama [24] applied ANNs to predict the electricity load in the Czech Republic. The experimental results indicate that the proposed ANN model outperforms the ARIMA model in terms of the forecasting accuracy index, normalized mean square error (NMSE). Abdel-Aal [25] proposed an abductive network to conduct one-hour-ahead load forecasts for 5 years. Hourly temperature and hourly load data are considered. The results of the proposed model are very promising in terms of the forecasting accuracy index, MAPE. Hsu and Chen [26] employed an ANN model to forecast the regional electricity load in Taiwan. The empirical results indicate that the proposed model is superior to the traditional regression model. Recently, Kandil et al. [27] applied ANNs to short-term load forecasting using real load and weather data from the Hydro-Québec databases, where three types of variables were used as inputs to the neural network. Their proposed model demonstrates the capabilities of ANNs in load forecasting without the use of load history as an input. In addition, only temperature (from the weather variables) is used in this application, where results show that other variables such as sky condition (cloud cover) and wind velocity have no serious effect and need not be considered in the load forecasting procedure.

Support vector machines (SVMs) implement the structural risk minimization (SRM) principle rather than the empirical risk minimization principle implemented by most traditional neural network models. Based on this principle, SVMs achieve an optimum network structure. In addition, training an SVM is equivalent to solving a linearly constrained quadratic programming problem, so that the solution of SVMs is always unique and globally optimal. With the introduction of Vapnik's ε-insensitive loss function [28], SVMs have also been extended to solve nonlinear regression estimation problems. Therefore, SVMs have been successfully applied in time series forecasting. Cao [29] used SVM experts for time series forecasting. The generalized SVM experts contained a two-stage neural network architecture. The numerical results indicated that the SVM experts are capable of outperforming single SVM models in terms of generalization. Cao and Gu [30] proposed a dynamic SVM (DSVM) model to deal with non-stationary time series problems. Experimental results showed that the DSVMs outperform standard SVMs in forecasting non-stationary time series. Meanwhile, Tay and Cao [31] used SVMs in forecasting financial time series. The numerical results indicated that the SVMs are superior to the multi-layer back-propagation neural network in financial time series forecasting. Hong and Pai [32] applied SVMs to predict engine reliability. Their experimental results indicated that SVMs outperform the Duane model, the ARIMA model and the general regression neural network model. Pai et al. [33] proposed a multi-factor support vector machine (MSVM) model to forecast Taiwanese demand for travel to Hong Kong from 1967 to 1996. They indicated that the proposed MSVM model outperforms the BP model, FF model, Holt's model, MA model, Naïve model, and multiple-regression model. For electric load forecasting, Chen et al. [34] are the pioneers in proposing an SVM model, which was the winning entry of a competition on mid-term load forecasting (predicting the daily maximum load of the next 31 days) organized by the EUNITE network in 2001. They discuss in detail how SVM, a new learning technique, is successfully applied to load forecasting. Pai and Hong [35] employed the concepts of Jordan recurrent neural networks to construct a recurrent SVR model for Taiwan regional long-term load forecasting. In addition, they used genetic algorithms to determine approximately optimal parameters in the proposed RSVMG model. They concluded that RSVMG outperformed other models, such as SVMG, ANN, and regression models. Similarly, Pai and Hong [36] proposed a hybrid model of SVR and simulated annealing (SA) algorithms to forecast Taiwan long-term electric load. Here, SA is employed to select approximately optimal parameters in the proposed SVMSA model. In conclusion, they indicated that SVMSA is superior to ARIMA and GRNN models in terms of MAPE, MAD, and NRMSE.

In Pai and Hong [35], SVR with genetic algorithms (GAs) is superior to other competitive forecasting models (regression and ANNs). However, based on the selection operation rules of GAs, only a few best-fitted members of the whole population of a generation can survive. After some generations the population diversity would be greatly reduced, and GAs might converge prematurely to a local optimum when searching for suitable parameters of an SVR model. To overcome these drawbacks, it is necessary to find effective approaches and improvements to GAs that maintain the population diversity and avoid being led to a misleading local optimum. One possible approach is to

divide the chromosome population into several subgroups and limit the crossover between the members of different subgroups to maintain the population diversity. However, such a method requires a sufficiently huge population size, which is not typical in business forecasting applications.

The other feasible approach focuses on chaos, owing to its easy implementation and special ability to avoid being trapped in a local optimum [37]. Chaos often occurs in deterministic nonlinear dynamic systems [38,39]. It is a highly unstable motion in finite phase space. Such a motion is very similar to a random process ("randomicity"). Therefore, any variable in the chaotic space can travel ergodically over the whole space of interest ("ergodicity"). The variation of those chaotic variables follows a delicate inherent rule even though it looks disordered ("regularity"). In addition, it is extremely sensitive to the initial condition, an important property sometimes referred to as the butterfly effect [40]. Attempting to simulate a global weather system numerically, Lorenz discovered that minute changes in initial conditions steered subsequent simulations towards radically different final states. Based on the two advantages of chaos, the chaotic optimization algorithm (COA) was proposed to solve complex function optimization [38]. The basic idea of the COA is to transform the variables of a problem from the solution space to the chaos space and then search for the solution using the three characteristics (randomicity, ergodicity, and regularity) of the chaotic variables. Recently, the chaotic genetic algorithm (CGA), which integrates GAs with COA, was originally proposed by Yuan et al. [41] to fully apply their respective searching advantages. Firstly, the three characteristics of the chaotic variable are employed to distribute the individuals of sub-generations ergodically in the defined space and thus to avoid premature convergence of the individuals in the sub-generations. Secondly, CGA also takes advantage of the convergence characteristic of GAs to overcome the randomness of the chaotic process and hence to increase the probability of producing better optimization individuals and finding the global optimal solution. Since then, a series of applications of CGA has also been proposed [42–44].

The investigation presented in this paper is motivated by a desire to solve the problem, mentioned above, of maintaining the population diversity of GAs when determining the three free parameters of the SVR regional electric load forecasting model. Therefore, the CGA method proposed by Yuan et al. [41] is employed in the SVR model, namely SVRCGA, to provide good forecasting performance in capturing nonlinear tendencies in electric load changes. A numerical example from the literature [26] is employed to compare the forecasting performance of the proposed model. In addition, a comparison of optimum search algorithms in determining the three parameters of an SVR model is conducted. The remainder of this paper is organized as follows. The SVR and CGA (SVRCGA) are introduced in Section 2. A numerical example is presented in Section 3. Conclusions are discussed in Section 4.

2. Methodology

2.1. SVR model

The support vector machines (SVMs) were proposed by Vapnik [45]. The basic concept of the SVR is to map the original data x nonlinearly into a higher dimensional feature space (Fig. 1a and b). Hence, given a set of data $G = \{(x_i, d_i)\}_{i=1}^{N}$ (where $x_i$ is the input vector, $d_i$ is the actual value, and N is the total number of data patterns), the SVR function is

$$y = f(x) = w\,\psi(x) + b \tag{1}$$

where $\psi(x)$ is called the feature, which is nonlinearly mapped from the input space x. The coefficients w and b are estimated by minimizing the regularized risk function

$$R(C) = \frac{C}{N}\sum_{i=1}^{N} L_\varepsilon(d_i, y_i) + \frac{\|w\|^2}{2} \tag{2}$$

where

$$L_\varepsilon(d, y) = \begin{cases} 0, & |d - y| \le \varepsilon \\ |d - y| - \varepsilon, & \text{otherwise} \end{cases} \tag{3}$$

and C and $\varepsilon$ are prescribed parameters. In Eq. (2), $L_\varepsilon(d, y)$ is called the $\varepsilon$-insensitive loss function (shown as the thick line in Fig. 1c). The loss equals zero if the forecasted value is within the $\varepsilon$-tube [46,47] (see Eq. (3)). The second term, $\|w\|^2/2$, measures the flatness of the function.

Therefore, C is considered to specify the trade-off between the empirical risk and the model flatness. Both C and $\varepsilon$ are user-determined parameters. Two positive slack variables $\xi$ and $\xi^*$, which represent the distance from the actual values to the corresponding boundary values of the $\varepsilon$-tube, are introduced. Then, Eq. (2) is transformed into the following constrained form:

Minimize

$$R(w, \xi, \xi^*) = \frac{\|w\|^2}{2} + C\left(\sum_{i=1}^{N} (\xi_i + \xi_i^*)\right) \tag{4}$$

with the constraints,

$$w\,\psi(x_i) + b - d_i \le \varepsilon + \xi_i^*, \quad i = 1, 2, \ldots, N$$

$$d_i - w\,\psi(x_i) - b \le \varepsilon + \xi_i, \quad i = 1, 2, \ldots, N$$

$$\xi_i, \xi_i^* \ge 0, \quad i = 1, 2, \ldots, N$$

This constrained optimization problem is solved using the following primal Lagrangian form:

$$\begin{aligned} L(w, b, \xi, \xi^*, \alpha_i, \alpha_i^*, \beta_i, \beta_i^*) ={}& \frac{1}{2}\|w\|^2 + C\left(\sum_{i=1}^{N} (\xi_i + \xi_i^*)\right) \\ &- \sum_{i=1}^{N} \beta_i \left[ w\,\psi(x_i) + b - d_i + \varepsilon + \xi_i \right] \\ &- \sum_{i=1}^{N} \beta_i^* \left[ d_i - w\,\psi(x_i) - b + \varepsilon + \xi_i^* \right] \\ &- \sum_{i=1}^{N} (\alpha_i \xi_i + \alpha_i^* \xi_i^*) \end{aligned} \tag{5}$$

Equation (5) is minimized with respect to the primal variables w, b, $\xi$ and $\xi^*$, and maximized with respect to the nonnegative Lagrangian multipliers $\alpha_i$, $\alpha_i^*$, $\beta_i$ and $\beta_i^*$. Therefore, Eqs. (6)–(9) are obtained.

$$\frac{\partial L}{\partial w} = w - \sum_{i=1}^{N} (\beta_i - \beta_i^*)\,\psi(x_i) = 0 \tag{6}$$

$$\frac{\partial L}{\partial b} = \sum_{i=1}^{N} (\beta_i^* - \beta_i) = 0 \tag{7}$$

$$\frac{\partial L}{\partial \xi_i} = C - \beta_i - \alpha_i = 0 \tag{8}$$

$$\frac{\partial L}{\partial \xi_i^*} = C - \beta_i^* - \alpha_i^* = 0 \tag{9}$$

Finally, the Karush–Kuhn–Tucker conditions are applied to the regression, and Eq. (4) thus yields the dual Lagrangian by substituting Eqs. (6)–(9) into Eq. (5). Then, the dual Lagrangian, Eq. (10), is obtained when the kernel function is $K(x_i, x_j) = \psi(x_i)\psi(x_j)$,

Fig. 1. Transformation process illustration of a SVR model.

$$\vartheta(\beta_i, \beta_i^*) = \sum_{i=1}^{N} d_i (\beta_i - \beta_i^*) - \varepsilon \sum_{i=1}^{N} (\beta_i + \beta_i^*) - \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\beta_i - \beta_i^*)(\beta_j - \beta_j^*) K(x_i, x_j) \tag{10}$$

subject to the constraints,

$$\sum_{i=1}^{N} (\beta_i - \beta_i^*) = 0$$

$$0 \le \beta_i \le C, \quad i = 1, 2, \ldots, N$$

$$0 \le \beta_i^* \le C, \quad i = 1, 2, \ldots, N$$

The Lagrange multipliers in Eq. (10) satisfy the equality $\beta_i \beta_i^* = 0$. The Lagrange multipliers $\beta_i$ and $\beta_i^*$ are calculated, and the optimal desired weight vector of the regression hyperplane is

$$w = \sum_{i=1}^{N} (\beta_i - \beta_i^*)\,\psi(x_i) \tag{11}$$

Hence, the regression function is Eq. (12).

$$f(x, \beta, \beta^*) = \sum_{i=1}^{N} (\beta_i - \beta_i^*) K(x, x_i) + b \tag{12}$$

Here, $K(x, x_i)$ is called the kernel function. The value of the kernel is equal to the inner product of the two vectors x and $x_i$ in the feature space, $\psi(x)$ and $\psi(x_i)$; i.e., $K(x, x_i) = \psi(x) \cdot \psi(x_i)$. In general, there are three common examples of kernel functions: the polynomial kernel, $K(x_i, x) = (a_1 x_i^T x + a_2)^d$ (with degree d, where $a_1$ and $a_2$ are coefficients); the multi-layer perceptron kernel function, $K(x_i, x) = \tanh(x_i^T x - b)$ (where b is a constant); and the Gaussian RBF kernel function, $K(x_i, x) = \exp(-\|x_i - x\|^2 / 2\sigma^2)$. To date, it is hard to determine the type of kernel function for specific data patterns [47,48]. However, any function that satisfies Mercer's condition by Vapnik [45] can be used as the kernel function. In this work, the Gaussian function is used in the SVR. The parameters that users have to specify are the error goal $\varepsilon$, the constant C and the width of the radial basis function $\sigma$.

The selection of the three parameters, $\sigma$, $\varepsilon$ and C, of an SVR model is important to the accuracy of forecasting. For example, if C is too large (approaching infinity), then the objective is to minimize only the empirical risk, $L_\varepsilon(d, y)$, without the model flatness in the optimization formulation Eq. (4). Parameter $\varepsilon$ controls the width of the $\varepsilon$-insensitive loss function, which is used to fit the training data. Large $\varepsilon$-values result in a flatter estimated regression function. Parameter $\sigma$ controls the Gaussian function width, which reflects the distribution range of the x-values of the training data. Therefore, all three parameters affect model construction in different ways. There are many existing practical approaches to the selection of C and $\varepsilon$, such as user definition based on prior knowledge and experience, cross-validation, and asymptotical optimization [49]. However, structural methods for efficiently and simultaneously confirming the selection of those three parameters are lacking. In addition, as mentioned above, GAs lack knowledge memory functions, which makes them time consuming and leads to premature convergence to a local optimum when searching for suitable parameters of an SVR model. Therefore, the CGA is used in the proposed SVR model to optimize the parameter selection.

2.2. Chaotic genetic algorithms in selecting parameters of the SVR model

2.2.1. Chaotic sequence

Chaos is an irregular nonlinear phenomenon in the natural world and is the highly unstable, unpredictable motion of deterministic systems in finite phase space. Thus, a nonlinear system is said to be chaotic if it exhibits sensitive dependence on initial conditions and has an infinite number of different periodic responses. This sensitive dependence on initial conditions is generally exhibited by systems containing multiple elements with nonlinear interactions. In addition, it is observed not only in complex systems but even in the simplest logistic equation.

A chaotic sequence can often be represented by the famous logistic function (one-dimensional) defined by [50], as Eq. (13),

$$x^{(i+1)} = \mu x^{(i)} (1 - x^{(i)}) \tag{13}$$

$$x^{(i)} \in (0, 1), \quad i = 0, 1, 2, \ldots$$

where $x^{(i)}$ is the value of the chaotic variable x at the ith iteration, and $\mu$ is the so-called bifurcation parameter of the system, $\mu \in [0, 4]$. The system behavior varies significantly with $\mu$: the value of $\mu$ determines whether x stabilizes at a constant size, oscillates between a limited sequence of sizes, or behaves chaotically in an unpredictable pattern. For certain values of the parameter $\mu$, of which $\mu = 4$ is one, and $x^{(0)} \notin \{0.25, 0.5, 0.75\}$, the above system exhibits chaotic behavior. It is easy to observe that a very small difference in the initial value of x causes a large difference in its future behavior, which is the basic characteristic of chaos. In addition, x can travel ergodically over the whole space of interest; thus, the variation of x follows a delicate inherent rule even though it looks disordered.

2.2.2. Implementation steps of chaotic genetic algorithm

GAs have been employed in many empirical applications owing to their versatility and robustness in solving optimization problems. However, GAs have two major shortcomings, slow convergence and becoming trapped in local optima, which are mainly caused by the reduction of population diversity. The diversity of an initial population cannot be maintained under selective pressure. Even if the initial individuals are drawn randomly so as to be diversified, i.e., distributed uniformly, it cannot be guaranteed that the qualities of the initial population are also uniformly arranged; the initial individuals can only be supposed to be fully diversified in the search space. Thus, most of the initial chromosomes are banal and far from the global optimum. If the initial population is not well designed, the GA search tends to become trapped in a local optimum. Thus, chaotic optimization, instead of a random approach, is employed to generate the initial population.
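As an illustration of the logistic map in Eq. (13) with the bifurcation parameter set to 4, the following minimal Python sketch iterates the map from two nearly identical starting points and prints how quickly the two trajectories separate. The function names `logistic_map` and `chaotic_sequence`, the starting value 0.3, and the perturbation 1e-9 are assumptions chosen for illustration only; they are not taken from the paper and do not represent the author's implementation.

```python
def logistic_map(x, mu=4.0):
    """One iteration of Eq. (13): x_next = mu * x * (1 - x)."""
    return mu * x * (1.0 - x)


def chaotic_sequence(x0, n, mu=4.0):
    """Return n successive iterates of the logistic map starting from x0.

    Hypothetical helper for illustration; x0 must lie in (0, 1).
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie in the open interval (0, 1)")
    values = []
    x = x0
    for _ in range(n):
        x = logistic_map(x, mu)
        values.append(x)
    return values


if __name__ == "__main__":
    # Two starting points that differ by only 1e-9 (illustrative values).
    a = chaotic_sequence(0.3, 60)
    b = chaotic_sequence(0.3 + 1e-9, 60)
    for i in (0, 10, 20, 30, 40, 50):
        print(f"iteration {i + 1:2d}: {a[i]:.6f} vs {b[i]:.6f}")
```

Running the script should show the two sequences agreeing closely for the first iterations and then diverging, which is the sensitivity to initial conditions described in Section 2.2.1. A sequence of this kind could be rescaled to a parameter interval to seed an initial population, although the exact seeding scheme used in the paper is the one given in its own steps.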

On the other hand, traditional GAs and related improvements share a common arrangement that completely ignores the individuals' experiences during their lifetime. Because they are based on randomized searches, there are no necessary connections between the current and next generations except for some controlling operators such as the crossover and mutation operators. Mutation is an effective operator for increasing and retaining the population diversity, and is also an efficient way to escape from a local optimum. The purpose of mutation is to continuously pursue individuals of higher fitness value and guide the evolution of the whole population. A large mutation scale is good for acquiring the optimum solution in an extensive search; however, the search is rough and the solution accuracy is poor. On the contrary, if the precision is satisfactory, the solution will often become trapped at a local optimum or take too long to converge. Therefore, this paper applies the annealing chaotic mutation operation. It can not only simulate the chaotic evolutionary process of biology, but also easily employ chaotic variables in carrying out an ergodic search of the solution space, to find a better solution in the current neighborhood of the optimum solution, and to let GAs retain ongoing motivity throughout. The proposed CGA procedure is described as follows, and the flowchart is shown in Fig. 2.

Fig. 2. Chaotic genetic algorithm flowchart. [Flowchart boxes: Start (set parameters: $p_{size}$, $p_c$, $\delta$, $p_m$, $q_{max}$); Chaotic optimization generates initial population (Generation = 1); Is the number of generations less than or equal to the maximal number? (Yes/No); Calculate the fitness function; Parent selection; Crossover; Annealing chaotic mutation; Generation = generation + 1; End.]

Step 1: Generating initial population by chaotic optimization. The values of the three parameters in an SVR model in the ith iteration can be represented as $X_k^{(i)}$, $k = C, \sigma, \varepsilon$. Set $i = 0$, and employ Eq. (14) to map the three parameters from the intervals $(\mathrm{Min}_k, \mathrm{Max}_k)$ into the chaotic variable $x_k^{(i)}$ located in the interval (0, 1).

$$x_k^{(i)} = \frac{X_k^{(i)} - \mathrm{Min}_k}{\mathrm{Max}_k - \mathrm{Min}_k}, \quad k = C, \sigma, \varepsilon \tag{14}$$

Then, use Eq. (13) with $\mu = 4$ to compute the next iteration chaotic variable, $x_k^{(i+1)}$. Transform $x_k^{(i+1)}$ to obtain the three parameters for the next iteration, $X_k^{(i+1)}$, by the following Eq. (15).

$$X_k^{(i+1)} = \mathrm{Min}_k + x_k^{(i+1)} (\mathrm{Max}_k - \mathrm{Min}_k) \tag{15}$$

After this transformation, the three parameters, C, $\sigma$, and $\varepsilon$, are encoded into a binary format and represented by a chromosome that is composed of "genes" of binary numbers. Each chromosome has three genes, which represent the three parameters. Each gene has 40 bits, so a chromosome contains 120 bits. More bits in a gene correspond to a finer partition of the search space.

Step 2: Evaluating fitness. Evaluate the fitness (forecasting errors) of each chromosome. In this paper, the negative mean absolute percentage error (−MAPE) is used as the fitness function. The MAPE is given by Eq. (16),

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{a_i - f_i}{a_i} \right| \times 100\% \tag{16}$$

where $a_i$ and $f_i$ represent the actual and forecast values, and N is the number of forecasting periods.

Step 3: Selection. Based on fitness functions, chromosomes with higher fitness values are more likely to yield offspring in the next generation. The roulette wheel selection principle is applied to choose chromosomes for reproduction.

Step 4: Crossover. In crossovers, chromosomes are paired randomly. The single-point-crossover principle is employed herein. Segments of paired chromosomes between two determined break-points are swapped. For simplicity, suppose a gene has four bits; thus, a chromosome contains 12 bits (Fig. 3). Before crossover is performed, the values of the three parameters in #1 parent are 1.5, 1.25 and 0.34375, respectively. For #2 parent, the three values are 0.625, 8.75 and 0.15625, accordingly. After crossover, for #1 offspring, the three values are 1.625, 3.75 and 0.40625, accordingly. For #2 offspring, the three values are 0.5, 6.25 and 0.09375, respectively. Finally, decode the three crossover parameters into a decimal format.

Fig. 3. A simplified example of parameter representation.
Before crossover (three parameters per chromosome):
Parent 1: 1100 0010 1011
Parent 2: 0101 1110 0101
Crossover point = 1
After crossover:
Offspring 1: 1101 0110 0101
Offspring 2: 0100 1010 0011

Step 5: Annealing chaotic mutation. For the ith iteration (generation), the crossover population ($\hat{X}_k^{(i)}$, $k = C, \sigma, \varepsilon$) of the current solution space $(\mathrm{Min}_k, \mathrm{Max}_k)$ is mapped to the chaotic variable interval [0, 1] to form the crossover chaotic variable space $\hat{x}_k^{(i)}$, $k = C, \sigma, \varepsilon$, as in Eq. (17),

$$\hat{x}_k^{(i)} = \frac{\hat{X}_k^{(i)} - \mathrm{Min}_k}{\mathrm{Max}_k - \mathrm{Min}_k}, \quad k = C, \sigma, \varepsilon; \; i = 1, 2, \ldots, q_{max} \tag{17}$$

where $q_{\max}$ is the maximum evolutional generation of the population. Then, the $i$th chaotic variable $x_k^{(i)}$ is added to $\hat{x}_k^{(i)}$, and the resulting chaotic mutation variable is also mapped to the interval [0, 1], as in Eq. (18),

$$\tilde{x}_k^{(i)} = \hat{x}_k^{(i)} + \delta x_k^{(i)} \tag{18}$$

where $\delta$ is the annealing operation parameter. Finally, the chaotic mutation variable obtained in the interval [0, 1] is mapped back to the solution interval $(\mathrm{Min}_k, \mathrm{Max}_k)$ with a definite probability of mutation ($p_m$), which completes a mutative operation:

$$\tilde{X}_k^{(i)} = \mathrm{Min}_k + \tilde{x}_k^{(i)} (\mathrm{Max}_k - \mathrm{Min}_k) \tag{19}$$

Step 6: Stop condition. If the number of generations is equal to a given scale, then the best chromosomes are presented as a solution; otherwise, go back to Step 2.

3. Numerical example

3.1. Data set

This study uses Taiwan regional electric load data to compare the forecasting performance of the SVRCGA models with that of the ANN and regression models proposed by Hsu and Chen [26]. In addition, because it is the same type of SVM application, the authors' previously proposed model, SVMG, is also included in the comparison (note that the RSVMG model combines the concepts of SVM with a Jordan recurrent neural network; it is therefore outside the comparable scope of the proposed SVRCGA model and is not included in the comparison). Table 1 lists the data used in this example. In total, there are 20 data (from 1981 to 2000) of Taiwan regional electricity load. The forecasting performance must be compared on the same basis. Therefore, the data are divided into three data sets: the training data set (12 years, from 1981 to 1992), the validation data set (4 years, from 1993 to 1996), and the testing data set (4 years, from 1997 to 2000). The data sets are listed in Table 2.

3.2. Parameter setting in the CGA algorithm

The parameters of the CGA algorithm in the proposed model for two numerical examples are experimentally set as shown in Table 3. The population sizes ($p_{size}$) are both 200, and the maximum evolutional generations of the population ($q_{\max}$) are both fixed at 500. Generally, the probability of crossover ($p_c$) is chosen between 0.5 and 0.8; based on previous empirical experience, it is set to 0.5. The probability of mutation ($p_m$) is a critical factor in keeping the diversity of the population, and it is set to 0.1; the annealing operation parameter ($\delta$) is set to 0.9 [51].

3.3. Determination of the three parameters of the SVRCGA models

In the training stage, the rolling-based forecasting procedure is conducted, which divides the training data into two subsets, namely fed-in (for example, eight load data) and fed-out (four load data). Firstly, the first eight load data of the fed-in subset are fed into the SVRCGA model, and the structural risk minimization principle is employed to minimize the training error; the one-step-ahead forecasting load, namely the 9th forecasting load, is then obtained. Secondly, the next eight load data, comprising seven of the fed-in subset data (from the 2nd to the 8th) plus the 9th data in the fed-out subset, are similarly fed into the SVRCGA model; the structural risk minimization principle is again employed to minimize the training error, and the one-step-ahead forecasting load, namely the 10th forecasting load, is obtained. The rolling-based forecasting procedure is repeated until the 12th forecasting load is obtained. Meanwhile, the training error in this training stage is also obtained. The electric loads of the different regions of Taiwan are fed into the SVRCGA model as time series to forecast the electric load in the subsequent validation period.

Whenever the training error improves, the three parameters σ, C, and ε of the SVRCGA model, as adjusted by CGA, are employed to calculate the validation error. Then, the adjusted parameters with the minimum validation error are selected as the most appropriate parameters. Finally, a four-steps-ahead policy is used to forecast the electric load in each region. Note that the testing data sets are not used for modeling but for examining the accuracy of the forecasting model. The forecasting results and the suitable parameters for the different regional SVRCGA models are given in Table 4.

3.4. Forecasting results and discussions

The forecasting results of the various forecasting models, with the accuracy index MAPE (see Eq. (16)), are given in Table 5. Firstly, the proposed SVRCGA models have smaller MAPE values than the ANN and regression models (except in the northern region, where SVRCGA is slightly less accurate than ANN, with an accuracy difference of 0.30%). In particular, the ANN model seems to fail to capture the decreasing load trend from 1998 to 1999 in both the central and southern regions. Secondly, the SVRCGA model is superior to the SVMG model in terms of MAPE. Figs. 4–7 illustrate the real values and the forecasting values of the different models for each region.

Regarding the first result, namely that the ANN model fails to capture the decreasing load trend from 1998 to 1999 in both the central and southern regions, this reflects a known drawback: the ANN model is time consuming and requires sufficient training data to learn more “expert rules” in order to approximately predict transitions in the electric load trend caused by accidental events, such as (1) the missile military maneuvers in the Taiwan Strait in 1995, which led many businesses to decide to move out of Taiwan (those businesses' manufacturing factories are often set up in the central and southern regions); and (2) the September 21 earthquake in central Taiwan in 1999, which left the electricity supply out of control for at least 6 months. The electric load data employed in this manuscript comprise only 20 data (from 1981 to 2000), which may not provide sufficient information on data tendency for the training process in ANN modeling. Meanwhile, the two accidental events mentioned above did not seriously affect the northern region; thus, the actual electric load there did not show a decreasing trend from 1998 to 1999, and the ANN model could easily learn to capture the transition tendency of the electric load. For the eastern region, due to the Southeast Asian Financial Crisis in 1997, the tourism industries

Table 1
Taiwan regional electric load (from 1981 to 2000) (unit: MW).

Year    Northern region    Central region    Southern region    Eastern region
1981    3388               1663              2272               122
1982    3523               1829              2346               127
1983    3752               2157              2494               148
1984    4296               2219              2686               142
1985    4250               2190              2829               143
1986    5013               2638              3172               176
1987    5745               2812              3351               206
1988    6320               3265              3655               227
1989    6844               3376              3823               236
1990    7613               3655              4256               243
1991    7551               4043              4548               264
1992    8352               4425              4803               292
1993    8781               4594              5192               307
1994    9400               4771              5352               325
1995    10,254             4483              5797               343
1996    10,719             4935              6369               363
1997    11,222             5061              6336               358
1998    11,642             5246              6318               397
1999    11,981             5233              6259               401
2000    12,924             5633              6804               420

Table 2
Training, validation, and testing data sets of the proposed model.

Data sets          SVRCGA model    SVMG model (Pai and Hong [31])    ANN model (Novak [23])
Training data      1981–1992       1981–1992                         1981–1996
Validation data    1993–1996       1993–1996                         –
Testing data       1997–2000       1997–2000                         1997–2000

Table 3
CGA's parameters setting in the numerical example.
(p_size: population size; q_max: maximal generation; p_c: probability of crossover; δ: annealing operation parameter; p_m: probability of mutation.)

Numerical models    p_size    q_max    p_c    δ      p_m
SVRCGA              200       500      0.5    0.9    0.1
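As a rough illustration of the chaotic mutation of Eqs. (17)–(19) with the Table 3 settings (δ = 0.9, p_m = 0.1), the following minimal Python sketch may help. Several details are assumptions rather than the paper's stated method: the chaotic variable is generated with a logistic map, the sum in Eq. (18) is clipped back into [0, 1], and the search bounds and sample chromosome are hypothetical values chosen only for demonstration.

```python
import random


def normalize(value, lo, hi):
    """Eq. (17): map a parameter value from (lo, hi) onto [0, 1]."""
    return (value - lo) / (hi - lo)


def logistic_map(x, mu=4.0):
    """One chaos iteration (assumed logistic map; an assumption of this sketch)."""
    return mu * x * (1.0 - x)


def chaotic_mutation(value, lo, hi, chaos, delta=0.9):
    """Apply Eqs. (17)-(19) to one parameter and return the mutated value."""
    x_hat = normalize(value, lo, hi)        # Eq. (17)
    x_tilde = x_hat + delta * chaos         # Eq. (18)
    x_tilde = min(max(x_tilde, 0.0), 1.0)   # assumption: clip back into [0, 1]
    return lo + x_tilde * (hi - lo)         # Eq. (19)


def mutate_chromosome(chromosome, bounds, chaos, rng, delta=0.9, p_m=0.1):
    """Mutate each parameter with probability p_m.

    Returns the new parameter dict and the updated chaotic state.
    """
    mutated = {}
    for name, value in chromosome.items():
        chaos = logistic_map(chaos)         # advance the chaotic variable
        lo, hi = bounds[name]
        if rng.random() < p_m:
            value = chaotic_mutation(value, lo, hi, chaos, delta)
        mutated[name] = value
    return mutated, chaos


# Hypothetical search bounds and chromosome, for illustration only.
bounds = {"sigma": (0.0, 50.0), "C": (0.0, 1e11), "epsilon": (0.0, 10.0)}
chromosome = {"sigma": 0.71, "C": 2.143e10, "epsilon": 0.52}
mutated, state = mutate_chromosome(chromosome, bounds, chaos=0.3, rng=random.Random(0))
```

Clipping is only one way to keep the mutated variable inside [0, 1]; a wrap-around or a convex combination would behave differently near the interval ends.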

Fig. 4. Forecasting values of SVRCGA, SVMG, ANN, regression models and actual values in northern region. (Plot of load in MW versus year, 1997–2000.)

Fig. 5. Forecasting values of SVRCGA, SVMG, ANN, regression models and actual values in central region. (Plot of load in MW versus year, 1997–2000.)

Table 4
Forecasting results and parameters of SVRCGA models.

Regions     σ        C                ε       MAPE of testing (%)
Northern    0.710    2.143 × 10^10    0.52    1.36
Central     8.312    6.846 × 10^10    0.46    1.72
Southern    0.740    1.325 × 10^10    8.70    2.00
Eastern     20.85    7.667 × 10^10    1.10    2.57

Table 5
Forecasting results of SVRCGA, SVMG, ANN, and regression models (unit: MW).

Year    Actual    SVRCGA    SVMG      ANN       Regression
Northern regional
1997    11,222    11,203    11,213    10,991    11,262
1998    11,642    11,665    11,747    11,643    12,162
1999    11,981    12,064    12,173    11,804    12,395
2000    12,924    12,360    12,543    12,834    13,122
MAPE              1.36      1.40      1.06      2.45
Southern regional
1997    6336      6373      6265      6305      6493
1998    6318      6462      6389      6476      6868
1999    6259      6415      6346      6537      7013
2000    6804      6623      6513      6672      7481
MAPE              2.00      2.02      2.48      8.29
Central regional
1997    5061      5071      5060      5112      5361
1998    5246      5118      5203      5301      5711
1999    5233      5244      5230      5350      5780
2000    5633      5406      5297      5572      6131
MAPE              1.72      1.81      1.73      8.52
Eastern regional
1997    358       362       358       378       380
1998    397       374       373       403       407
1999    401       398       397       410       413
2000    420       409       408       435       440
MAPE              2.57      2.65      3.62      4.1
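The MAPE rows of Table 5 can be cross-checked directly from the tabulated loads. The short Python sketch below assumes the usual definition of MAPE (the mean of |actual − forecast| / actual, in percent), which Eq. (16) is taken to follow; the function and variable names are illustrative and do not come from the paper.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / a for a, f in zip(actual, forecast)) / len(actual)


# Actual and SVRCGA forecasting values for 1997-2000, copied from Table 5.
table5 = {
    "Northern": ([11222, 11642, 11981, 12924], [11203, 11665, 12064, 12360]),
    "Central": ([5061, 5246, 5233, 5633], [5071, 5118, 5244, 5406]),
    "Southern": ([6336, 6318, 6259, 6804], [6373, 6462, 6415, 6623]),
    "Eastern": ([358, 397, 401, 420], [362, 374, 398, 409]),
}

svrcga_mape = {
    region: mape(actual, forecast)
    for region, (actual, forecast) in table5.items()
}

for region, value in svrcga_mape.items():
    print(f"{region}: {value:.2f}%")
```

Rounded to two decimals, the computed values agree with the SVRCGA MAPE entries of Table 5 (1.36, 1.72, 2.00 and 2.57).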

Fig. 6. Forecasting values of SVRCGA, SVMG, ANN, regression models and actual values in southern region. (Plot of load in MW versus year, 1997–2000.)

Fig. 7. Forecasting values of SVRCGA, SVMG, ANN, regression models and actual values in eastern region. (Plot of load in MW versus year, 1997–2000.)

in the region all suffered from the economic crisis; global arrival figures for 1997 reflected the difficult market conditions in the declining growth rate. Thus, the actual electric load did not keep growing in 1997; however, the ANN model optimistically forecast large growth, which led to an irretrievable forecasting error. The original SVR model focuses on, as mentioned previously, empirical risk minimization (ε-insensitive loss function) rather than on learning “expert rules” to approximately predict the transition trend of the electric load. This advantage enables the SVR model to deal with any data pattern, whether the data tendency presents fluctuation or a sustained increasing or decreasing type. Contributing to forecasting accuracy, the SVR model provides nearly the smallest MAPE values in each regional electric load forecast. However, expert rule learning is still useful to incorporate into the SVR model; the SVRCGA model is the empirical representative.

It is also worth addressing why the SVRCGA model outperforms the SVMG model. The results clearly show the ability of CGA to overcome the tendency of GAs to be trapped in a local optimum. For example, for the northern region, the local solution of GAs in Pai and Hong [35] with the eight fed-in data rolling type, $(\sigma, C, \varepsilon) = (0.30, 2.10 \times 10^{10}, 400.00)$, with a local optimal forecasting error of 1.40%, could be improved by CGA to $(\sigma, C, \varepsilon) = (0.71, 2.143 \times 10^{10}, 0.52)$, with a more appropriate local optimal forecasting error of 1.36%. Similarly, for the central region, the local solution of GAs in Pai and Hong [35] with the eight fed-in data rolling type, $(\sigma, C, \varepsilon) = (0.90, 1.85 \times 10^{10}, 50)$, with a local optimal forecasting error of 1.81%, could be improved by CGA to $(\sigma, C, \varepsilon) = (8.312, 6.846 \times 10^{10}, 0.46)$, with a more appropriate local optimal forecasting error of 1.72%. It is effective to integrate the chaotic adjustment approach with GAs in the SVR model; i.e., hybrid (combined or integrated) evolutionary algorithms are feasible for SVR-based forecasting models.

4. Conclusions

From the historical data, the Taiwan regional electric load values show a strong growth trend, particularly in the northern region. This is a common electric load phenomenon in developing countries. However, given this growth rate in electric demand, avoiding overproduction or underproduction of electricity becomes a rigorous mission. In this paper, we employed a novel forecasting technique, SVRCGA, to examine its potential in forecasting regional electric loads. Three other forecasting approaches, the SVMG, ANN, and regression models, are used to compare the forecasting performance. Experimental results indicate that the proposed SVRCGA model outperforms the other approaches in terms of forecasting accuracy, except in the northern region.

This study is the first to apply the SVR model with CGA to forecast electric load. The authors have so far completed a series of studies on determining suitable parameters of an SVR load forecasting model by employing novel optimization algorithms (including genetic algorithms and the simulated annealing algorithm, see Pai and Hong [35,36]). The empirical results obtained in these studies reveal that the hybrid model of SVR and the proposed algorithms is a valid alternative in the electric industry where the original SVR model is applied (see Pai and Hong [35,36], and this manuscript). If the data pattern is more complicated, such that the original SVR model cannot obtain satisfactory forecasting results, it is feasible to apply the recurrent SVR model (Pai and Hong [35]). Of course, recurrent SVR models are more time consuming than original SVR models. On the other hand, an electric load forecasting model ought to contain several social factors to increase its explanatory capability; i.e., in multivariate forecasting models, factors such as social activities and seasonal factors could be introduced into the SVRCGA model to forecast electric load. In addition, other advanced hybrid evolutionary algorithms for determining the suitable parameters should be combined with the SVR model to forecast electricity demand. Finally, employing other types of kernel functions should be an advanced research issue in SVR theory.

Acknowledgment

This research was conducted with the support of National Science Council, Taiwan (NSC 97-2410-H-161-001).

References

[1] Bunn DW. Forecasting loads and prices in competitive power markets. Proc IEEE 2000;88(2):163–9. doi:10.1109/5.823996.
[2] Douglas AP, Breipohl AM, Lee FN, Adapa R. Risk due to load forecast uncertainty in short term power system planning. IEEE Trans Power Syst 1998;13(4):1493–9. doi:10.1109/59.736296.
[3] Gross G, Galiana FD. Short term load forecasting. Proc IEEE 1987;75(12):1558–73.
[4] Box GEP, Jenkins GM. Time series analysis, forecasting and control. San Francisco: Holden-Day; 1970.
[5] Chen JF, Wang WM, Huang CM. Analysis of an adaptive time-series autoregressive moving-average (ARMA) model for short-term load forecasting. Electr Power Syst Res 1995;34(3):187–96. doi:10.1016/0378-7796(95)00977-1.
[6] Vemuri S, Hill D, Balasubramanian R. Load forecasting using stochastic models. In: Proceedings of the 8th power industrial computing application conference, 1973. p. 31–7.
[7] Christianse WR. Short term load forecasting using general exponential smoothing. IEEE Trans Power Apparat Syst 1971;PAS-90:900–11.
[8] Park JH, Park YM, Lee KY. Composite modeling for adaptive short-term load forecasting. IEEE Trans Power Syst 1991;6(1):450–7. doi:10.1109/59.76686.
[9] Douglas AP, Breipohl AM, Lee FN, Adapa R. The impact of temperature forecast uncertainty on Bayesian load forecasting. IEEE Trans Power Syst 1998;13(4):1507–13. doi:10.1109/59.736298.

[10] Brown RG. Introduction to random signal analysis and Kalman filtering. New York: John Wiley and Sons Inc.; 1983.
[11] Gelb A. Applied optimal estimation. Massachusetts: The MIT Press; 1974.
[12] Moghram I, Rahman S. Analysis and evaluation of five short-term load forecasting techniques. IEEE Trans Power Syst 1989;4(4):1484–91. doi:10.1109/59.41700.
[13] Asbury C. Weather load model for electric demand energy forecasting. IEEE Trans Power Apparat Syst 1975;PAS-94:1111–6.
[14] Papalexopoulos AD, Hesterberg TC. A regression-based approach to short-term system load forecasting. IEEE Trans Power Syst 1990;5(4):1535–47. doi:10.1109/59.99410.
[15] Soliman SA, Persaud S, El-Nagar K, El-Hawary ME. Application of least absolute value parameter estimation based on linear programming to short-term load forecasting. Int J Electr Power Energ Syst 1997;19(3):209–16. doi:10.1016/S0142-0615(96)00048-8.
[16] Asber D, Lefebvre S, Asber J, Saad M, Desbiens C. Non-parametric short-term load forecasting. Int J Electr Power Energ Syst 2007;29(8):630–5. doi:10.1016/j.ijepes.2006.09.007.
[17] Rahman S, Bhatnagar R. An expert system based algorithm for short-term load forecasting. IEEE Trans Power Syst 1998;3(2):392–9. doi:10.1109/59.192889.
[18] Chiu CC, Kao LJ, Cook DF. Combining a neural network with a rule-based expert system approach for short-term power load forecasting in Taiwan. Expert Syst Appl 1997;13(4):299–305. doi:10.1016/S0957-4174(97)00048-1.
[19] Rahman S, Hazim O. A generalized knowledge-based short-term load-forecasting technique. IEEE Trans Power Syst 1993;8(2):508–14. doi:10.1109/59.260833.
[20] Dillon TS, Morsztyn K, Phua K. Short term load forecasting using adaptive pattern recognition and self organizing techniques. In: Proceedings of the fifth world power system computation conference (PSCC-5), September 1975, Cambridge, paper 2.4/3. p. 1–15.
[21] Dillon TS, Sestito S, Leung S. Short term load forecasting using an adaptive neural network. Int J Electr Power Energ Syst 1991;13(4):186–92. doi:10.1016/0142-0615(91)90021-M.
[22] Park DC, El-Sharkawi MA, Marks II RJ, Atlas LE, Damborg MJ. Electric load forecasting using an artificial neural network. IEEE Trans Power Syst 1991;6(2):442–9. doi:10.1109/59.76685.
[23] Novak B. Superfast autoconfiguring artificial neural networks and their application to power systems. Electr Power Syst Res 1995;35(1):6–11. doi:10.1016/0378-7796(95)00980-9.
[24] Darbellay GA, Slama M. Forecasting the short-term demand for electricity – do neural networks stand a better chance? Int J Forecast 2000;16(1):71–83. doi:10.1016/S0169-2070(99)00045-X.
[25] Abdel-Aal RE. Short-term hourly load forecasting using abductive networks. IEEE Trans Power Syst 2004;19(1):164–73. doi:10.1109/TPWRS.2003.820695.
[26] Hsu CC, Chen CY. Regional load forecasting in Taiwan – application of artificial neural networks. Energy Convers Manage 2003;44(12):1941–9. doi:10.1016/S0196-8904(02)00225-X.
[27] Kandil N, Wamkeue R, Saad M, Georges S. An efficient approach for short term load forecasting using artificial neural networks. Int J Electr Power Energ Syst 2006;28(8):525–30. doi:10.1016/j.ijepes.2006.02.014.
[28] Vapnik V, Golowich S, Smola A. Support vector machine for function approximation, regression estimation, and signal processing. Adv Neural Inform Process Syst 1996;9:281–7.
[29] Cao L. Support vector machines experts for time series forecasting. Neurocomputing 2003;51:321–39. doi:10.1016/S0925-2312(02)00577-5.
[30] Cao L, Gu Q. Dynamic support vector machines for non-stationary time series forecasting. Intell Data Anal 2002;6:67–83.
[31] Tay FEH, Cao L. Application of support vector machines in financial time series forecasting. Omega 2001;29(4):309–17. doi:10.1016/S0305-0483(01)00026-3.
[32] Hong WC, Pai PF. Predicting engine reliability by support vector machines. Int J Adv Manuf Technol 2006;28(1–2):154–61. doi:10.1007/s00170-004-2340-z.
[33] Pai PF, Hong WC, Lin CS. Forecasting tourism demand using a multi-factor support vector machine model. Lect Notes Artif Intell 2005;3801:513–21.
[34] Chen BJ, Chang MW, Lin CJ. Load forecasting using support vector machines: a study on EUNITE competition 2001. IEEE Trans Power Syst 2004;19(4):1821–30. doi:10.1109/TPWRS.2004.835679.
[35] Pai PF, Hong WC. Forecasting regional electric load based on recurrent support vector machines with genetic algorithms. Electr Power Syst Res 2005;74(3):417–25. doi:10.1016/j.epsr.2005.01.006.
[36] Pai PF, Hong WC. Support vector machines with simulated annealing algorithms in electricity load forecasting. Energ Convers Manage 2005;46(17):2669–88. doi:10.1016/j.enconman.2005.02.004.
[37] Wang L, Zheng DZ, Lin QS. Survey on chaotic optimization methods. Comput Technol Autom 2001;20(1):1–5.
[38] Li B, Jiang W. Optimizing complex functions by chaos search. Cybernet Syst 1998;29(4):409–19. doi:10.1080/019697298125678.
[39] Ohya M. Complexities and their applications to characterization of chaos. Int J Theor Phys 1998;37(1):495–505. doi:10.1023/A:1026620313483.
[40] Lorenz EN. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences 1963;20(2):130–41. doi:10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2.
[41] Yuan X, Yuan Y, Zhang Y. A hybrid chaotic genetic algorithm for short-term hydro system scheduling. Math Comput Simul 2002;59(4):319–27. doi:10.1016/S0378-4754(01)00363-9.
[42] Liao GC. Hybrid chaos search genetic algorithm and meta-heuristics method for short-term load forecasting. Electr Eng 2006;88(3):265–76. doi:10.1007/s00202-004-0272-0.
[43] Lü QZ, Shen GL, Yu RQ. A chaotic approach to maintain the population diversity of genetic algorithm in network training. Comput Biol Chem 2003;27(3):363–71. doi:10.1016/S1476-9271(02)00083-X.
[44] Yan X, Chen D, Hu S. Chaos-genetic algorithms for optimizing the operating conditions based on RBF-PLS model. Comput Chem Eng 2003;27(10):1393–404. doi:10.1016/S0098-1354(03)00074-7.
[45] Vapnik V. The nature of statistic learning theory. New York: Springer; 1995.
[46] Drucker H, Burges CJC, Kaufman L, Smola A, Vapnik VN. Support vector regression machines. Adv Neural Inform Process Syst 1997;9:155–61.
[47] Vojislav K. Learning and soft computing-support vector machines, neural networks and fuzzy logic models. Massachusetts: The MIT Press; 2001.
[48] Amari S, Wu S. Improving support vector machine classifiers by modifying kernel functions. Neural Networks 1999;12(6):783–9. doi:10.1016/S0893-6080(99)00032-5.
[49] Cherkassky V, Ma Y. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Networks 2004;17(1):113–26. doi:10.1016/S0893-6080(03)00169-2.
[50] May RM. Simple mathematical models with very complicated dynamics. Nature 1976;261:459–67. doi:10.1038/261459a0.
[51] Dekkers A, Aarts EHL. Global optimization and simulated annealing. Math Program 1991;50(1–3):367–93. doi:10.1007/BF0159494.