You are on page 1of 21

M PRA

Munich Personal RePEc Archive

Performance of Differential Evolution
Method in Least Squares Fitting of Some
Typical Nonlinear Curves

Mishra, SK
North-Eastern Hill University, Shillong (India)

29 August 2007

Online at http://mpra.ub.uni-muenchen.de/4656/
MPRA Paper No. 4656, posted 07. November 2007 / 04:05

Performance of Differential Evolution Method in
Least Squares Fitting of Some Typical Nonlinear Curves

SK Mishra
Department of Economics
North-Eastern Hill University
Shillong (India)

I. Introduction: Curve fitting or fitting a statistical/mathematical model to data finds its
application in almost all empirical sciences - viz. physics, chemistry, zoology, botany,
environmental sciences, economics, etc. It has four objectives: the first, to describe the
observed (or experimentally obtained) dataset by a statistical/mathematical formula; the
second, to estimate the parameters of the formula so obtained and interpret them so that
the interpretation is consistent with the generally accepted principles of the discipline
concerned; the third, to predict, interpolate or extrapolate the expected values of the
dependent variable with the estimated formula; and the last, to use the formula for
designing, controlling or planning. There are many principles of curve fitting: the Least
Squares (of errors), the Least Absolute Errors, the Maximum Likelihood, the Generalized
Method of Moments and so on.

The principle of Least Squares (method of curve fitting) lies in minimizing the
sum of squared errors, s 2 = ∑ i =1[ yi − g ( xi , b)]2 , where yi , (i = 1, 2,..., n) is the observed value of
n

dependent variable and xi = ( xi1 , xi 2 ,..., xim ); i = 1, 2,..., n is a vector of values of independent
(explanatory or predictor) variables. As a problem the dataset, ( y, x) , is given and the
parameters ( bk ; k = 1, 2,..., p ) are unknown. Note that m (the number of independent
variables, x j ; j = 1, 2,..., m ) and p (the number of parameters) need not be equal. However,
the number of observations ( n ) almost always exceeds the number of parameters ( p ).
The system of equations so presented is inconsistent such as not to permit s2 to be zero; it
must always be a positive value. In case s2 may take on a zero value, the problem no
longer belongs to the realm of statistics; it is a purely mathematical problem of solving a
system of equations. However, the method of the Least Squares continues to be
applicable to this case too. It is also applicable to the cases where n does not exceed p .

Take for example two simple cases; the first of two (linear and consistent)
equations in two unknowns; and the second of three (linear and consistent) equations in
two unknowns, presented in the matrix form as y = Xb + u :
10   1 2   u1 
10  1 2   b1   u1     
 1 b   
 26  = 3 5  b  + u  and  26  =  3 5  b  + u2 
    2  2 14   −1 4   2  u3 
Since y = Xb + u, it follows that X − g ( y − u ) = b. Here X − g the generalized inverse of X
(Rao and Mitra, 1971). Further, since X − g = ( X ′X ) −1 X ′ (such that ( X ′X ) −1 X ′X = I , an
identity matrix), it follows that b = ( X ′X ) −1 X ′y − ( X ′X )−1 X ′u. Now, if X ' u = 0 , we have
b = ( X ′X ) −1 X ′y. For the first system of equations given above, we have

. . Note that an approximate value of the first derivative (elements of the Jacobian matrix) of a function ϕ (b) at any  ∂ϕ  [ϕ (ba + ∆ ) − ϕ (ba −∆ )] point ba may be obtained numerically as  ∂b   .... b(0) = (b(0)1 . This fact immediately leads us to the Gauss-Newton method (of nonlinear Least Squares). obtain b(2) = b(1) + ∆b..10]/2 = 13. . let us look at the problem slightly differently. ( X ′X ) =  −17 10  =  −17 10  .... evaluate the equations at b(0) to obtain y(0)i . .. Obtain the next approximation of b as b(1) = b(0) + ∆b. In the system of equations that we have at hand (i... This method is an iterative method and may be described as follows. ∀ i. x1m  ∂y1 ∂b1 ∂y1 ∂b2 .. Then. Also. . ( X ′X ) −1 =   . ∂yn ∂bm  Thus. y − u = Xb ). .e. Thus we obtain the estimated parameters.. which is equal to 2 . we have 1 2  1 3 −1  11 13  1  45 −13 1 19 70 −97  X ′X =   3 5  =  . j ) is constant. find ∆b = ( J (0) ' J (0) )−1 J (0) ' ( y − y(0) )... Evaluate the equations at b(1) to obtain y(1) and also find J (1) at b(1) . ∂y1 ∂bm  x x22 . Now. find ∆b = ( J (1)' J (1) )−1 J (1)' ( y − y(1) ). This is so since the three equations are mutually consistent. the first   ba (ba + ∆ − ba −∆ ) derivative of φ (v) = 2v 2 + 5v + 3 at v = 2 may be obtained as [ϕ (v = 2 + 1) − ϕ (v = 2 − 1)) /[2 + 1 − (2 − 1)] which is [(18 + 15 + 3) – (2 + 5 + 3)] / (3 . .. the Jacobian (J.. for the second system of equations.. b(0)2 . 1 3 1 2  10 17  −1 1  29 −17   29 −17  −1  −5 2  X ′X =   3 5  = 17 29  . This y(0) will (almost always) be different from the y given in the dataset.... if the system is nonlinear (in parameters). ∂y2 ∂bm  X =  21 = =J ... Similarly. As before... the J matrix varies in accordance with the value of b j at which yi is evaluated.1) = [36 . b̂ .       xn1 xn 2 . x2 m  ∂y2 ∂b1 ∂y2 ∂b2 . ( X ′X ) X ′ =  3 −1  2 5     290 − 289        −5 2  10   2   b1  ( X ′X ) −1 X ′y =     =   =  . For example. or the matrix of the first partial derivatives of yi with respect to b j ) is X.. or the matrix of ∂yi ∂b j ∀ i. ( X ′X ) −1 X ′ =  2 5 4   13 45  326  −13 11  326 9 16 57   −1 4   10  1 19 70 −97    1  652   2   b1  ( X ′X ) −1 X ′y =  26 = = = 326 9 16 57    326 1304   4  b2  14  This solution is identical to any solution that we would have obtained by solving any combination of two equations (taken from the three equation).  .  x11 x12 . Or.  3 −1  26   4  b2  This solution is identical to the one obtained if we would have solved the first system of equations by any algebraic method (assuming ui = 0 ∀ i ). .. xnm  ∂yn ∂b1 ∂yn ∂b2 .. b = ( X ′X ) −1 X ′y may be considered as ( J ′J ) −1 J ′y . In a system of linear equations J (the Jacobian.. Take any arbitrary value of b... Now. b(0) p ) and find J(0) at that. And continue until ∆b is negligibly small. However.

Stanislaw Ulam. we would obtain. in general. Törn & Viitanen. The Differential Evolution Method of Optimization: The method of Differential Evolution (DE) was developed by Price and Storn in an attempt to solve the Chebychev polynomial fitting problem. 1987. A new point (pz) is constructed from those three points by adding the weighted difference between two points (w(pb-pc)) to the third point (pa). Initially. the “Simulated Annealing Method “ of Kirkpatrick and others (1983) and Cerny (1985). All these methods use the one or the other stochastic process to search the global optima. These methods supplement other mathematical methods used to this end. only an approximate value. yielding a candidate point. II. 1953) and it was gradually realized that the simulation approach could provide an alternative methodology to mathematical investigations in optimization. The simulation-based optimization became a hotbed of research due to the invention of the ‘genetic algorithm’ by John Holland (1975). but unfortunately such applications have been only few and far between. f(p) is obtained) for their fitness. the “Particle Swarm Method” of Kennedy and Eberhart (1995) and the “Differential Evolution Method” of Storn and Price (1995) are quite effective. its development to deal with nonconvex (multimodal) functions stagnated until the mid 1950’s. Metropolis et al. the ‘Clustering Method” of Aimo Törn (1978. George Box (1957) was perhaps the first mathematician who exploited the idea and developed his evolutionary method of nonlinear optimization. These methods may be applied to nonlinear curve fitting problem (Mishra. Hence. pb and pc) are randomly chosen from the population. is evaluated and if found better than pi then it 3 . Among them. 1994). MJ Box (1965) developed his complex method. It may be noted that a nonlinear least squares problem is fundamentally a problem in optimization of nonlinear functions. John Nelder and Roger Mead (1964) developed their simplex method and incorporated in it the ability to learn from its earlier search experience and adapt itself to the topography of the surface of the optimand function. A number of other methods of global optimization were soon developed. On account of the ability of these methods to search optimal solutions of quite difficult nonlinear functions. This point. they provide a great scope to deal with the nonlinear curve fitting problems. which strews random numbers over the entire domain of the decision variables and therefore has a great potentiality to escape local optima and locate the global optimum of a nonlinear function. ill conditioned or multimodal problems. The Gauss-Newton method is very powerful. many methods have been developed to deal with difficult. say pu. Note that although in this example we obtain the exact ∂ϕ ∂v = 4v + 5 value of the first derivative. pu. a population of points (p in m-dimensional space) is generated and evaluated (i. “Tabu Search Method” of Fred Glover (1986). John von Neumann and Nicolas Metropolis had in the late 1940’s proposed the Monte Carlo method of simulation (Metropolis. The crucial idea behind DE is a scheme for generating trial parameter vectors. evaluated at v =2. Then for each point (pi) three different points (pa. Then this new point (pz) is subjected to a crossover with the current point (pi) with a probability of crossover (cr). Almost a decade later. 2006).e. but it fails to work when the problem is ill conditioned or multi-modal. Initially optimization of nonlinear functions was methodologically based on the Lagrange-Leibniz-Newton principles and therefore could not easily escape local optima. Hence.

Some models (and datasets) have been obtained from other sources also. a particular problem might be simple for the one algorithm but very difficult for the others and so on. (3) Higher. Misra-1(d). They are also very sensitive to the (guessed) starting points. Sin-Cos. the problems may be classified into four categories: (1) Lower. US Department of Commerce. MGH-09. The list of problems (dealt with in the present study) so categorized is given below: Table-1: Classification of Problems according to Difficulty Level Difficulty Source of Classified Problem Names level Problem by Lower Chwirut. 4 .mostly from two main sources.com/neuralpower now new website at www. Lanczos-3 NIST NIST Judge Goffe Author Mount.geocities. USA at http://www.gov/div898/strd/nls/nls_main. and (4) Extra Hard.replaces pi else pi remains. III. the first from the website of NIST [National Institute of Standards and Technology (NIST). Gauss-2. According to the level of difficulty. Nelson. MGH-17. two different datasets may present different levels of difficulty. in fitting a nonlinear function to a dataset.shtml] and the second. Misra-1(c). there’s many a slip between cup and lip. Those with better exploitative abilities converge faster but are easily caught into the local optimum trap.com). Thurber Hougen Mathworks. MGH-10. Gauss-3. Ratkowsky-43. For the same model and the optimization algorithm starting at the same point. (ii) the dataset. and (iv) the guessed range (or the starting points of search) of parameters. the website of the CPC-X Software (makers of the AUTO2FIT Software at http://www. Cos-Sin CPC-X Author Average ENSO. Objectives of the Present Work: The objective of the present work is to evaluate the performance of the Differential Evolution at nonlinear curve fitting. Again. (2) Average. Roszman Higher Bennett. BoxBOD.nist. Therefore. In this paper we will use ‘CPC-X’ and ‘AUTO2FIT’ interchangeably. Gauss-1. For this purpose. Lanczos-1 NIST NIST Lanczos-2. This process makes the differential evaluation scheme completely self-organizing.com Author Multi-output CPC-X Author Extra CPC-X problems (all 9 challenge functions) CPC-X CPC-X Hard It may be noted that the difficulty level of a Least Squares curve fitting problem depends on: (i) the (statistical) model.7d-soft. The algorithms that have excellent explorative power often do not converge fast. different algorithms have different abilities to combine their explorative and exploitative functions while searching for an optimum solution.models and datasets .itl. Hahn. NIST NIST Ratkowsky-42. (iii) the algorithm used for optimization. Similarly. Eckerle. This new vector is used for the next iteration. we have collected problems . Kirby. Thus we obtain a new vector in which all points are either better than or as good as the current points.

48234 (not f(2. the function has been fitted to the related data and the estimated values.19137809) = 0. The model is specified as: x b1 x2 − 3 b5 y = rate = +u 1 + b2 x1 + b3 x2 + b4 x3 For the given dataset the minimum s 2 = ∑ i =1 ( yi − yˆi ) 2 = f (b1 . The rate of kinetic reaction (y) is dependent on the quantities of three inputs: hydrogen (x1). CPC-X Software.498571. s2 or RMS. 2001) f (2. The purpose is to highlight the discrepancies between the observed and the expected values of y. b4 . The Judge’s Function: This function is given in Judge et al (1990).25258511.IV. 1994 as well as in the computer program simann.2357485)=16. The Judge Function Hougen-Watson Function 2.0400477234. 0. The Findings: In what follows.0627757706. Mathworks. of the predicted variable (y or the dependent variable) has been obtained. see at Mathworks. ŷ . b2 . -0. 0. 1988.298900981.f). 1. 5 . It is an easy task for the Differential Evolution method to minimize this function.9826092)=20.com) for reaction kinetics is a typical example of nonlinear regression model. b3 . The datasets and the models are available at the source (NIST. Goffe’s SIMANN). b1 ) is bimodal.112414719. In case of any model. It has the global n minimum for s 2 = f (0.0817301 and a local minimum (as pointed out by Wild. 0.35. Along with the associated data it is a rather simple example of nonlinear least squares curve fitting (and parameter estimation) where s 2 = ∑ i =1 ( yi − yˆi ) 2 = f (b0 . The Hougen-Watson Function: The Hougen-Watson model (Bates and Watts. The expected values ( ŷ ) have been arranged in an ascending order and against the serial number so obtained the expected ŷ and observed y have been plotted.319) = 20. b5 ) = n f (1.864787293. These values (along with the certified values) have been presented to compare the performance of the Differential Evolution 1.9805 as mentioned by Goffe. n-pentane (x2) and isopentane (x3). The goodness of fit of a function to a dataset may be summarily judged by R2 (that always lies between 0 and 1). 1. we present our findings on the performance of the Differential Evolution method at optimization of the Least Squares problems. -0. The graphical presentation of the observed values against the expected values of y suggests that the model fits to the data very well.

2299428125E-11.00576317. This function has been fitted to two sets of data (data-1 and data-2).22999349E-11 against the certified value of 2.0951 exp(-x) + 0. b3 .0. However. b2 . 5) are obtained.0105309084) = 2384. b2 . For the third data set we have obtained 6 .00000728. The Chwirut Function: This function (specified as y = exp(−b1 x) /(b2 + b3 x) + u ) describes ultrasonic response (y) to metal distance (x). In case of the first set of data the Differential Evolution method has found the minimum value of s 2 = ∑ i =1 ( yi − yˆi ) 2 = f (b1 . Lanczos Function: Lanczos (1956) presented several data sets (at different accuracy levels) generated by an exponential function g(x) = 0.47714 . for the second set of data the results are marginally sub-optimal.190278183 0. b5 .0962570522.15955. b6 ) = (0.00613140045 0.3. Chwirut Function: Data Set 1 Chwirut Function: Data Set 2 Lanczos Function [ h(x) = b1exp(-b 2 x) + b3exp(-b 4 x) + b5 exp(-b 6 x)+u ] Data Set .1. For the second set of data. 3.55759463. 0.1.167327315 0.8607 exp(-3x) + 1.5.00000322) = 9.0121159344) = 515.04802941.3. The estimated parameters are very close to the true parameters.3. Using the given dataset of this problem one may estimate the parameters of h (x) = b1exp(-b 2 x) + b3exp(-b 4 x) + b5 exp(-b 6 x) + u and check if the values of (b1 .5576 exp(-5x).07870717E-18 for the first data set.860703939.55287658.864265983.00000927.0.1 Data Set -2 Data Set . b4 . The estimated parameters are once again very close to the true ones.1. For the second data set we obtained s 2 = f (0. 1.0951.4307867721E-25.00786966. b3 ) is 513.8607.1. but we have obtained n f (0.00517431911 0. b3 ) n which is f (0. b2 .3 4. while the certified value is 1. the certified value of s 2 = ∑ i =1 ( yi − yˆi ) 2 = f (b1 . 1.5576. We have obtained s 2 = f (0.5.0951014297.00289537) =2.

525544858) = 788.. -0.2.49668704.58215875.98659893.s 2 = f (1.26.53978668. 1979) measured response values (y) against input values (x) to scanning electron microscope line width standards. 4.8876144. Kirby Function ENSO Function 6.16648026E-005) = 3.0. 0.13927398.9050739624.95235111. The Hahn Function: Hahn (197?) studied thermal expansion of copper and fitted to data a model in which the coefficient of thermal expansion of copper (y) is explained by a ratio of two cubic polynomials of temperature (x) measured in the Kelvin scale. The Kirby function is the ratio of two quadratic polynomials.00408637261.-0. 44. The model was: y = (b1 + b 2 x + b3 x 2 + b 4 x 3 )/(1 + b5 x + b6 x 2 + b 7 x 3 ) + u .53243829 against the certified value = 1. We have obtained s 2 = f (10. 0.3110885 -1. We have obtained s 2 = f (1.0762128.-0.3.000240537241. The difference in the atmospheric pressure (y) drives the trade winds in the southern hemisphere (NIST. Australia and time (x).1.07763262. If in place of specifying the cubic in the denominator as (1 + b5 x + b6 x 2 + b7 x 3 ). 0. The Kirby Function: Kirby (NIST. The certified value is 1.67450632.0.) functions are in radians.00172418116.539787 against the certified value of 788. 2. 1989) relates y.42626427E-006.6117193594E-08.532801425. y = g ( x) = (b1 + b 2 x1 + b3 x12 )/(1+ b 4 x1 + b5 x12 ) + u .212322867. The function is specified as y = b1 + b2cos(2 π x/12) + b3sin(2 π x/12) + b5cos(2 π x/b4) + b6sin(2 π x/b4) + b8cos(2 π x/b7) + b9sin(2 π x/b7) + u Arguments to the sin(. 7.23144401E-007) = 1. monthly averaged atmospheric pressure differences between Easter Island and Darwin.0869370574. -1. -0. The ENSO Function: This function (Kahaner.122692829. 0.844297096.5107492.0057609942.0.955661374) to be 1. We have obtained s 2 = f (1. 5. we permit the specifications as (b8 + b5 x + b6 x 2 + b7 x 3 ) such that the model specification is y = (b1 + b 2 x + b3 x 2 + b 4 x 3 )/(b8 + b5 x + b6 x 2 + b 7 x 3 ) + u and fit it to Hahn’s data. we have: 7 .00259611813. USA). -1.90507396 against the certified value of 3. et al.) and cos(. 0.5324382854.62314288 0.61172482E-008.

75747467) = 1. 5.93584702. USA) is the model (Meyer. in degrees centigrade). 0. -0. 0.0128675349. Another minimum of S 2 = f (b1 . USA) is the model (Osborne. 0. 1978. -1. Garbow and Hillstrom (1981) presented some nonlinear least squares problems for testing unconstrained optimization software.797683317645138. 1972) specified as y = b1 + b 2 exp(-b 4 x) + b3exp(-b5 x) + u whose parameters are to be estimated on MGH-17 data.s 2 = f (-1.075056038492363E-04.-0. 6181. We have obtained s 2 = f (2.4648946975E-05 . 0.123056508. the NIST certified value of s2. USA) is specified as y = b1 (x 2 + b 2 x)/(x 2 + b3 x + b 4 ) + u that fits to MGH-09 data with NIST certified s 2 = 3.215629874.time (x1.7976833176. 0.0221226992) = 5. NIST.00718170192.00560963647.34635. The MGH Functions: More.50662711E-006. Yet another problem (MGH-17.5324382854) for entirely different set of parameters. We have obtained s 2 = f (0. Another problem (MGH-10. These problems were found to be difficult for some very good algorithms. Of these functions.375410053.223635) = 87.5906836. -1. The Nelson Function: Nelson (1981) studied performance degradation data from accelerated tests and explained the response variable dialectric breakdown strength (y. -0. 9.61777188E-009. He specified the model as y = b1 − b2 x1 exp(-b3 x 2 ) + u .945855171 . b3 ) is found to be s 2 = f (-7.61777132E-009. NIST.000422738373. 1970) specified as y = b1exp(b 2 /(x + b3 )) + u whose parameters are to be estimated on MGH-10 data.0101248026. 0.0577010134) = 3. The value of b8 is remarkably different from unity. We have obtained s 2 = f (0. Of course.16423365E-007. NIST. -0. 0. 1. in kilo-volts) by two explanatory variables . Hahn Function Nelson Function 8.192806935.797683317645143 against the NIST-certified value. Hahn’s specification is parsimonious.4093164. 345. MGH-09 (Kowalik and Osborne.464894697482394E-05 against s 2 = 5. that meets the certified value given by 2.89391801. in weeks) and temperature (x2. b2 .0750560385E-04 against which have obtained s 2 = f (0.0577010131) = 3.191282322. 3.94585517018605 against the NIST certified value of s 2 = 87.136062327) = 3.46468725. 8 . 5. 2.532438285361130 NIST (1.

if we specify the model as y = 1 2 + u .000208136273) = 0.408029x 2 + 96. Misra (1978) recorded a number of datasets and formulated a model that describes volume (y) as a function of pressure (x). 0. (1 + b5 x + b 6 x 2 + b 7 x 3 ) We fitted this model to the given data and obtained minimum s 2 = 5.642708239666791E+03 against the NIST-certified value = 5.0635539913x 3 9 .508629371x 2 + 0.040966836971 obtained s 2 = f (636. y = b1b 2 x((1+b 2 x)-1 )+ u was fitted to Misra-1[d] data set and we obtained s 2 = f (437.05641929528263857 against the NIST certified value.642708239666863E+03 1. The estimated model is obtained as: 1288. MGH-09 Function MGH-10 Function MGH-17 Function 10.27805041 + 1. We have fitted this function to data (Misra-1[c]) and against the NIST certified value of 0.0497272963x 3 (b + b x + b3 x 2 + b 4 x 3 ) Alternatively.04096683697065384 .369708.07925x + 583. we obtain (b8 + b5 x + b 6 x 2 + b 7 x 3 ) 1646.13968 + 1491. 0. Misra-1[c] Function Misra-1[d] Function 11.397972858x 2 + 0. The Thurber Function: Thurber (NIST. 0.238368x 2 + 75. Another model.4166441x 3 yˆ = 1 + 0.000302273244) = 0.6427082397E+03.30744 + 1905.67444x + 745. measured in natural log) by a model y = +u .23497375x + 0.96629503x + 0. 197?) studied electron mobility (y) as a (b1 + b 2 x + b3 x 2 + b 4 x 3 ) function of density (x.427256.386272x 3 yˆ = . The Misra Functions: In his dental research monomolecular adsorption study. s 2 = 5.056419295283.5 ) + u . His model Misra-1[c] is: y = b1 (1-(1+2b 2 x)-0.

has been estimated by us 10 . USA). 14. (1978) explained the biochemical oxygen demand (y.809409. -6.9484847331E-04.27805041 in the model serves no purpose except demonstrating that the parameters of the model are not unique.4557.948484733096893E-04 against NIST certified value 4. in days) by the model y = b1 (1-exp(-b 2 x)) + u . 0. Thurber Model Thurber Model (alternative specification) 12. 1204. The BoxBOD Function: Box et al.b 2 x .547237484) = 1. We estimated it on the given data and obtained s 2 = f (0.1680088766E+03. 1. NIST. The model was specified as y = b1 . Roszman Function BoxBOD Function 13. -181.1953505E-006. specified as y = b1 / (1+exp(b 2 -b3 x)) + u with the dataset RAT-42.27805041) we do not get the estimated parameters of the original model.arctan(b3 / (x-b 4 ))/π + e . We have obtained the minimum s 2 = f (213.168008876555550E+03 against the NIST certified value. USA).34271) = 4. in mg/l) by incubation time (x.201968657. The Roszman Function: In a NIST study Roszman (19??) investigated the number of quantum defects (y) in iodine atoms and explained them by the excited energy state (x in radians) involving quantum defects in iodine atoms (NIST.It appears that replacing of 1 by b8 = 1. The Ratkowsky Functions: Two least squares curve-fitting problems presented by Ratkowsky (1983) are considered relatively hard. The first (RAT-43. Note that on uniformly dividing all the parameters of the (estimated) alternative model by b8 (=1.

4635887487E-03.5((x-b3 )/b 2 ) 2 ) + u where y is transmittance and x is wavelength.61807684.641513. 1. We have obtained s 2 = f (-1. The rate of convergence of the DE solution towards the minimum has been rather slow. 8. 0. Against the NIST certified value of minimum 3 s 2 =5.159008779E-03 of CPC-X.0673592002) = 8. 0.0565229338. specified as y = b1 / ((1+exp(b 2 -b3 x))(1/b ) + u with the dataset RAT-43. Against the reference value of 5. has been estimated by us to yield 4 s 2 = f (699. The second model (RAT-43.to yield s 2 = f (72. 451. The Mount Function: Although the specification of this function is identical to the Eckerle function. we have obtained s 2 = f (-2523. USA).7864049080E+03. (NIST. -4.43 15. The Eckerle Function: In a NIST study Eckerle (197?. The Bennett Function: Bennett et al. 1.786404907963108E+03 against the NIST certified value. 46. 0.27924837) = 8. NIST.80508.241207571054023E-04.2404744073E-04. Bennett Function Eckerle Function 16.55438272. USA) fitted the model specified as y = (b1/b 2 ) exp(-0. 1994) conducted superconductivity magnetization modeling and explained magnetism (y) by duration (x.463588748727469E-03 against the NIST certified value. log of time in minutes) by the model y = b1 (b 2 + x)(-1/b ) + u . the CPC-X Software have fitted it to a different dataset.75962938. we have obtained the value of 11 . 17.7378212.42 Ratkowsky Function .08883218.27712526.541218) = 1. 5. Ratkowsky Function . NIST.932164428) = 5. 8. 2.4622375.056522933811241 against the NIST certified value.

Mount (Eckerle) Function CPC-X-9 Function 18. Further. The Multi-output Function: The CPC-X has given an example of a multi-output function in which two dependent variables ( y1 and y2 ) are determined by the common independent variables ( x1 .1581777.9699794119704664 (against 0.997148464588044 respectively. against the reference values of RMS and R2 (5.362592746.s 2 = f (1.159008010368E-03 .01728442 450.29795109) = 14. first when (i) we have not constrained the sum of errors ∑ u1 and ∑ u2 individually to be near zero.5412806 4. -29.1546909) obtained by AUTO2FIT. S 2 = f (19.9971484642) we have obtained 5. x3 ) and they have some common parameters ( b ) such that: y1 = x1b1 + b 2 ln(x 2 ) exp(x 3b3 ) + u1 y 2 = x1b1 + b 2 exp(x b24 ) ln(x 3b3 ) + u 2 Multi-output Function -1 Multi-output Function -2 We have fitted these functions to the dataset.9704752) and RMS = 1.8159227.028842682e-03 and 0. provided by the CPC-X. x2 . We 4 fitted this function to the given data.154690900182629 (against 1.0E-06 in magnitude. We obtained R2 = 0. 2. (ii) we have constrained each of them to be less than 1. 19.028842307409E-03 and 0. The CPC-X-9 Function: This function is specified as y = b1exp(b 2 (x + b3 ) b ) + u . in two ways.892013) = 5.66642182461028 . The two fits differ marginally as shown in the table below: 12 . -0.

100767089 0.13226556) .96331268 0.97236378 0.01977698 obtained by AUTO2FIT.219 146. Estimated Parameters of Multi-output function: (i).493461213.100648248 0. Unconstrained and (ii). 10.701 ŷ = + 160. Constrained b1 b2 b3 b4 R12 R22 s12 s22 RMS1 RMS2 i 0. Further.41672 + x1 ) (1 + 0.98468 1962. The CPC-X-8 Function: This is a composite multivariate sigmoid function given as b1 y= + b5 x 3b6 + u (b 2 + x1 ) (1 + b3 x 2 ) (x 3 -b 4 ) 2 We have fitted this function to AUTO2FIT data and obtained R2 = 0.990723 0. 21.429694 1.255886425 2.cos(b 2 x 2 )) x 3 b3 /x1 + u We have obtained bˆ = (2.99740460 and RMS = 0.0926076197 0. This function is more difficult than the Sin-Cos function to fit. 5. 2.89 146.011115788 respectively.256305001 2.5 (3615.9953726879097797 slightly larger than the R2 (= 0.995372) obtained by AUTO2FIT.9999618.025161467 against reference values 0. The Cos-Sin Function: This function (given by the CPC-X Software) is specified as y = ((b1 /x1 ) .536364662 x 2 ) (x 3 -27. R2 = 0. 20. there is some inconsistency in the figures of R2 and 13 .990522 and 0.83684187) .99072 0.984674 1962.9980138. R2 = 0. -49.21047 ii 0.01056060934407798 and RMS = 0. It may be noted that we have no information as to how the CPC-X has defined the minimand function.0197770998736 against 0.9974045694 and 0. our results are not quite different from theirs. The estimated function is 174808.8118343) 2 The value of s2 is 0.210697 The reference values of R12 and R22 are 0.5788 4.9930915320764427 and RMS = 1.430452 1. Yet.016475 x 3-2.02516162826 respectively. Sin-Cos Function Cos-Sin Function 22. 2.011115788318 against reference values 0.0929527106 0.49225824.9930915321 and 1.984717 respectively.93908006.5238 4. The Sin-Cos Function: This function (given by the CPC-X Software) is specified as y = (b1 + x1 /b 2 + cos(b3 x 2 /x 3 ))/(b 4 sin(x1 +x 2 +x 3 )) + u We have obtained bˆ = (0.

RMS (of errors) reported by CPC-X.263771900781600.0.72078474 x 2 + 0.8767278) and s 2 = f (-101.000384550462 x1 .00626078. The CPC-X-4 Function: This function is a ratio of two linear functions.113552) = 7.9715471304250647 against the R2 = 0.50244.969929562).9715471 of AUTO2FIT. CPC-X-3 Function CPC-X-4 Function 25.68651757015526 . both in four b 0 + b1 x1 + b 2 x 2 + b3 x 3 + b 4 x 4 predictor variables.0738767 .87672786941874 (against 0.000744446437 x1 x 2 ŷ = 1 . CPC-X-8 Function CPC-X-7 Function b1 + b 2 x1 + b3 x 2 + b 4 x1 x 2 23.0. -170.0303920084 x 2 + (1. The value of RMS(E) is 1. 1 + b 5 x1 + b 6 x 2 + b 7 x1 x 2 We have fitted it to CPC-X data and obtained R2 = 0.0. The CPC-X-3 Function: The function specified as y = b1 / (1 + b 2 /x + x/b3 ) + u has been fitted to the test dataset provided by the CPC-X.078841. 0.2. Its specification is: y = +u.0267347156 x1 . The estimated function is 92.07039964E-005) x1 x 2 24.969923509396039 (against reference value. Our s2 is 21. RMS = 0. The CPC-X-7 Function: This function is specified as y = +u.006260685261970 against the AUTO2FIT value 1. -1258. We have fitted 1 + a 1 x1 + a 2 x 2 + a 3 x 3 + a 4 x 4 14 . We obtain R2 = 0. If their R2 is smaller than our R2 then their RMS(E) cannot be smaller than our RMS(E).

233856 0.4846360 1244.00228812301 x1.489906 67. It is worth reporting that the function fitting to dataset-1 is easier as it is robust to a choice of b2 than the other two datasets.294186 0.6682212 100.05714). The CPC-X-5 Function: The function y = b1 + b 2 x1b3 + b 4 x b25 + b 6 x1b7 x b28 + u has been fitted to the data provided by CPC-X.4811113 23.32578x 2 + 1.822206428 2.0183284 0.761643 19. 26. Estimated Parameters of Blended Gaussian Function with Different Datasets Function b1 b2 b3 b4 b5 b6 Gauss1 98.this function to the data (given by CPC-X) and obtained R2 = 0.3893889 1315.745644x1 + 2120. The estimated function is: 674.634308339x1.528209231 2.998050 18.2703296 reported by the makers of AUTO2FIT.1) else the algorithm is caught in the local optimum trap.996486539 Gauss3 111.631664635x1-0.994632848 and RMS = 0.833300621 + 0.0894933939x1-0.1297733 71.560015248x 4 The s 2 = 53118.996899074 Blended Gaussian Function (1) Blended Gaussian Function (2) Blended Gaussian Function (3) 27. Ours s2 = 2.636195 23.80514286.0571405953 (against reference value 48.9932818431032495 against 0.578584 72.484636013 2.9945029 Gauss2 99.3005001 1244.5282092 1247.9403689 0. The problem is extremely ill conditioned and very sensitive to the choice of the domain or the initial (starting) values of parameters.0104972764 100.0109949454 101.7782107 0. 15 .1953565114881. However.8222432 1315.0455895 Gauss3 98.0334385585x 3 -0.0109458794 73.695531 2 2 b7 b8 NIST certified s2 Ours s Ours RMS Ours R Gauss1 178. the other two datasets need (0 < b2 < 0.42617267 2 -0.5259727 1247.996962322 Gauss2 153. The function is specified as y = b1exp(-b 2 x) + b3 exp(-(x-b 4 ) 2 /b52 ) + b6 exp(-(x-b7 ) 2 /b82 ) + u We have fitted this function to the three sets of data and obtained the following results. 0.8051428644699052 against the reference value.67934 + 227.270102 19. All the three datasets are problematic if b5 or b8 is given a range much beyond (0.312443915 + 0.572582178x1 + 5.7050314 147.030955 23. The estimated model is yˆ = 0.880225 107.231129 0.42909022 2 We would also like to mention that the present solution needed several trials to get at these values.64254986x 3 -176. A range (0 < b2 < 10) yields the results. The Blended Gaussian Function: NIST has given three datasets (with different difficulty levels) to fit a blended Gassian funcion.3024453470938 against 0.2415305900 and RMS = 48.051025x 4 yˆ = 1 + 0. 50). We have obtained R2 = 0.55641932x 2 + 0.

439647321698 against the reference values of 0.9614190785232305 and RMS = 0. We obtain R2 = 0.0066977514x 3 + 0.479215814564.882331799699548 against the 16 .3028129 respectively.04612509E-006x 2 -0. The CPC-X-1 Function: This function is given as: y = + b 4 x b5 + u .0132847799 The problem is extremely unstable.0498 + 1042.5000473232604 for the following estimated model.0214726136 as reported.8622291597468909 and RMS = 0.000262177177x1 -7.2236173793023 against the reference values 0.58342731 + 0. We have b1 + b 2 x b3 fitted this function to CPC-X-1 data to obtain R2 = 0.02134x 0.499998867 + 12184.999644261 and 0.54335611E-005x1 -3.0270514576x 3 + 0. ŷ = -13104.0331768444x 4 yˆ = 1+ 9. CPC-X-2 Function CPC-X-1 Function [y = g(x) view] 1 30. We obtained s2 = 0.00668129827x 4 For this estimated model the value of s2 is 3.2476x -0. The estimated function is 4.95307951E-006x 2 -0.9346422 and 0. The CPC-X-2 Function: This function is a ratio of two other linear functions given as b1 +b 2 x1 + b3 x 2 + b 4 x 3 + b5 x 4 y= +u 1+ a1 x1 + a 2 x 2 + a 3 x 3 + a 4 x 4 We have obtained R2 = 0. CPC-X-5 Function CPC-X-6 Function 28.199999589 -114. The CPC-X-6 Function: The function y = b1 + b 2 x b3 + b 4 x b5 + b 6 x b7 + u has been fitted to CPC-X-6 data. 29.09568x 0.

3671976 0.376667. although. Among the CPC-X functions (including the Mount. the Cos-Sin and the Multi-output functions) either comfortably or with some trial and error in setting the ranges of parameters to be estimated. the Sin-Cos.450036183x -0. sub-optimal.450036183 . Concluding Remarks: The Differential Evaluation (DE) method applied to fit functions to datasets given by NIST and others has exhibited a mixed performance.7034 against 104.reference value 0. It has been successful at the job for all problems. wayward or haywire. Oftentimes. the Blended Gauss functions have been relatively difficult and sensitive to the choice of initial values or range of parameters. However. It may be noted that unless otherwise stated or discussed. In particular.46250826E-006x 4. Such cases have been duly reported. and other functions namely the Mount. after all. narrower domains were specified. of various degrees of difficulty. The function # 5 has been quite difficult to optimize and although the DE took the solution very close to the one reported by CPC-X. the DE has been successful to obtain the optimum results even if the domains of parameters were too wide. the Cos-Sin and the Multi-output functions) . the DE does not require the domain (of parameters) to be specified in a narrow range as do the other software/methods to solve the nonlinear least squares problem.99678004 and RMS = 622. yhat view] V. The value of s2 obtained by us is 69021203. the Sin-Cos. Our estimates are far off the mark. 17 . the Cos-Sin and the Multi-output functions have been very easy to fit.ten of them posed by the CPC-X Software as the challenge problems . 4.0. in a few cases when a too wide domain made the program unstable. 3. the Sin-Cos. the Mount.the DE has been able to deal with nine (challenge functions # 9. but it remained. The DE solution to the CPC-X-6 function remained appreciably far from the optimal fit.98784. 8. given by NIST. We obtain 1 ŷ = + 1.00346072131 CPC-X-1 Function [ y vs. 7.

This choice is often problem oriented and that for obvious reasons. The Differential Evolution optimizer is a (stochastic) population-based method. you may try to change the control parameter of ‘Population Size’ …”. As a result. it does not ensure every run will be successful. In spite of several trials. the DE oftentimes allows for a large and wide domain for the parameters to start the search. Additionally. we may conclude that its success rate is appreciably high and it may be used for solving nonlinear curve fitting problem with some good degree of reliability and dependability (for performance of other software on NIST functions see Lilien.not by the Global Levenberg-Marquard or the Global BFGS method. The DE performed miserably in dealing with two CPC-X functions: #1 and #2. 2000). Larger is the number of attempts greater is the probability that they would find out the optimum. the DE method operates with a number of parameters that may be changed at choice to make it more effective. but it might be ineffective (or counterproductive) in certain other cases. It may be noted that all population-based methods of optimization partake of the probabilistic nature inherent to them. the DE failed to reach any closer to the optimal solution (the reference R2 provided by the CPC-X). It may be noted that there cannot be any ‘sure success method’ to solve all the problem of nonlinear least squares curve fitting. Secondly. … In some cases. Additionally. Surfaces may be varied and different for different functions. Even for Auto2Fit. The scheme of adaptation is largely based on some guesswork since nobody knows as to the true nature of the problem (environment or surface) and the most suitable scheme of adaptation to fit the given environment. A particular choice may be extremely effective in a few cases. there is a relation of trade-off among those parameters. The CPC-X has also fitted the multi-output function by the DE method . The CPC-X problems are the challenge problems for any nonlinear Least Squares algorithm. like any other population-based method of optimization. About these problems. They have suggested that to solve these problems one should use Global Levenberg-Marquard or Global BFGS method. one cannot obtain certainty in their results. unless they are permitted to go on for indefinitely large search attempts. the CPC-X Software themselves remark: “Some of those test data are very hard. Further. 18 . The most of other algorithms for solving a nonlinear least squares problem are too (impracticably) demanding on the initial guess of parameters. all of them adapt themselves to the surface on which they find the optimum. and may never get right answers without using Auto2Fit. If the DE has performed well at more than half the number of such challenge problems (and done better than the AUTO2FIT in some cases).

• Hahn. Swartzendruber. H (1990) The Theory and Practice of Econometrics. 272-280. Moler. L. D. D (1988) Nonlinear Regression Analysis and Its Applications.com/neuralpower • Eberhart RC and Kennedy J (1995) “A New Optimizer using Particle Swarm Theory”. US Department of Commerce. IEEE Service Center. • Bennett. NJ. of Michigan Press. • Kowalik. Piscataway. Applied Analysis. NJ: pp. pp.. CD Jr. Prentice Hall. pp. MJ (1965) “A New Method of Constrained Optimization and a Comparison with Other Methods”. (1956). pp. • Lilien. US Department of Commerce. 1087-1092. USA. • Glover F (1986) " Future Paths for Integer Programming and Links to Artificial Intelligence". • Box. %20One%20More%20Test%20Data of http://www. H (1994) Superconductivity Magnetization Modeling.html 19 . National Institute of Standards and Technology (NIST). D and Watts.mathworks. Phys. References • 7d-soft High Technology Inc (--) AUTO2FIT Software The New Website of CPC-X Software http://www. S (1989) Numerical Methods and Software.com • Bates. 671-680. pp. (1979) Scanning Electron Microscope Line Width Standards. 5:533-549. Ann Arbor. MP (1983) "Optimization by Simulated Annealing". Gelatt. J. pp.7d-soft. Hunter.com (. L and Brown. New York.com/neuralpower/Regression_Test. • Mathworks. • Cerny. US Department of Commerce. MR (1978) Methods for Unconstrained Optimization Problems. 6 . US Department of Commerce. Science. DM (2000) “Review: Econometric Software Reliability and Nonlinear Estimation In Eviews: Comment” Journal of Applied Econometrics. C and Nash. New York. J (1975) Adaptation in Natural and Artificial Systems. and Vecchi. RC. GP. John Wiley and Sons. V (1985) "Thermodynamical Approach to the Traveling Salesman Problem: An Efficient Calculations by Fast Computing Machines". • Box. Applied Statistics.geocities. National Institute of Standards and Technology (NIST). pp. CH and Lotkepohl. 483-487. S.com/access/helpdesk_r13/help/toolbox/stats/nonlin_3. T (197?) Copper Thermal Expansion Study. • Eckerle. GG. Computers and Operations Research. New York. Englewood Cliffs. pp. 65-100. JS and Osborne. New York. 4598. USA. • CPC-X Software (--) At http://www. USA. GEP (1957) “Evolutionary Operation: A Method for Increasing Industrial Productivity”. Chem. Hill. WG and Hunter.21.Example: Nonlinear Modeling Hougen-Watson Model http://www.) Statistical Toolbox . C. JS (1978) Statistics for Experimenters. Univ. 107-110. J. National Institute of Standards and Technology (NIST). Ferrier and Rogers (1994) "Global Optimization of Statistical Functions with Simulated Annealing. Englewood Cliffs. • Box. • Kahaner. Elsevier North-Holland. Griffith. 39–43.geocities.htm#3. 8. National Institute of Standards and Technology (NIST). 15(1). • Judge. 60 (1/2).com/neuralpower now at www.geocities. • Lanczos. Comp. WE. Wiley. K (197?) Circular Interference Transmittance Study. 81-101. 220. 42-52. • Goffe. • Kirby. R. Lee. USA. John Wiley. NJ. 441-445. 6. • Kirkpatrick.. • Holland. Prentice Hall. Proceedings Sixth Symposium on Micro Machine and Human Science." Journal of Econometrics.

N (1987) The Beginning of the Monte Carlo Method. LCW and Szegö. 7: pp. National Institute of Standards and Technology (NIST). • Thurber. 2-R-30(2). • Meyer. pp. • Ratkowsky. in Numerical Methods for Nonlinear Optimization. Marcel Dekker. pp.com 20 .f . USA. RR (1970) Theoretical and Computational Aspects of Nonlinear Regression. USA. The source codes are available on request to mishrasknehu@yahoo.utk. IEEE Transactions on Reliability.edu. R (1964) “A Simplex Method for Function Minimization” Computer Journal. 465-486. Wiley. in Nonlinear Programming. 7(1). and Teller. Berkley. pp. Theory Appl. Special Issue. KE (1981) Testing unconstrained optimization software.Bug in Description of Judge's Function” letter to netlib_maintainers@netlib. A. Teller. • Nelson. Rosenbluth.itl. Rosen.shtml • Osborne. L (19??) Quantum Defects for Sulfur I Atom.nist. SK (2006) “Fitting a Logarithmic Spiral to Empirical Data with Displaced Origin”. National Institute of Standards and Technology (NIST).f Note: The author has written his own program (FORTRAN 77). US Department of Commerce. pp. • Rao. North Holland. 267-276. and Hillstrom. Opt. Garbow. New York. W (1981) Analysis of Performance-Degradation Data. 15. Academic Press. ACM Transactions on Mathematical Software. 45. D (1978) Dental Research Monomolecular Adsorption Study. 1. OL and Ritter.org and bgoffe@whale.f available at http://netlib2. USA. J.P. BS. • Storn. 171-189. Amsterdam. Lootsma (Ed). • NIST (--) Nonlinear Regression http://www.edu/opt/simann. No. AA and Viitanen. J. • Wild. • Törn.• Metropolis. • Metropolis. N. E (1953) "Equation of State Simulation Algorithm". M. Mangasarian. R (197?) Semiconductor electron mobility modeling. Academic Press. New York. G. CR and Mitra. JB. • Mishra. • Nelder. of Global Optimization. US Department of Commerce. SK (1971) Generalized Inverse of Matrices and its Applications. (Eds) Towards Global Optimization – 2. New York. • Roszman. 149-155. 41-51. K (1995) "Differential Evolution .st. in Dixon. 308-313. 17-41. J (2001) “Simann.. MR (1972) Some Aspects of Nonlinear Least Squares Calculations. DA (1983) Nonlinear Regression Modeling. 5.gov/div898/strd/nls/nls_main. National Institute of Standards and Technology (NIST). Los Alamos Science. K (Eds). • More. Rosenbluth. JA and Mead. US Department of Commerce.usm. pp. • Törn. R and Price.A Simple and Efficient Adaptive Scheme for Global Optimization over Continuous Spaces": Technical Report. A. International Computer Science Institute. SSRN http://ssrn.com/abstract=897863 • Misra. New York. pp. in the Simulated Annealing based Fortran Computer program for nonlinear optimization Simann. S (1994) “Topographical Global Optimization using Presampled Points”. JJ. 125-130. AA (1978) “A search Clustering Approach to Global Optimization” .cs.