Optimal Model Structure Identification 2 - Nonlinear Regression

Vol. 8, 2023-17
Optimal Model Structure Identification. 2. Nonlinear Regression
Hugo Hernandez
ForsChem Research, 050030 Medellin, Colombia
hugo.hernandez@forschem.org
doi: 10.13140/RG.2.2.25901.87527
Abstract
Nonlinear regression consists in finding the best possible model parameter values of a given
homoscedastic mathematical structure with nonlinear functions of the model parameters. In
this report, the second part of the series, the mathematical structure of models with nonlinear
functions of their parameters is optimized, resulting in the minimum estimation of model error
variance. The uncertainty in the estimation of model parameters is evaluated using a linear
approximation of the model about the optimal model parameter values found. The
homoscedasticity of model residuals must be evaluated to validate this important assumption.
The model structure identification procedure is implemented in R language and shown in the
Appendix. Several examples are considered for illustrating the optimization procedure. In many
practical situations, the optimal model obtained has heteroscedastic residuals. If the purpose
of the model is only describing the experimental observations, the violation of the
homoscedastic assumption may not be critical. However, for explanatory or extrapolating
models, the presence of heteroscedastic residuals may lead to flawed conclusions.
Keywords
Homoscedasticity, Mathematical Structure, Modeling, Nonlinear Regression, Optimization,
Parameter Identification, Uncertainty
1. Introduction
This is the second part of a series of reports discussing different strategies for optimizing the
structure of mathematical models fitted from experimental data. In the first report of this
series, the concept of randomistic models was introduced as follows [1]:
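$$y(\mathbf{x}) = f(\mathbf{x}, \boldsymbol{\beta})\,\delta + g(\mathbf{x}, \boldsymbol{\beta})\,\epsilon \tag{1.1}$$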
Cite as: Hernandez, H. (2023). Optimal Model Structure Identification. 2. Nonlinear Regression.
ForsChem Research Reports, 8, 2023-17, 1 - 35. Publication Date: 11/12/2023.
where the functions $f$ and $g$ are deterministic functions of the independent variable vector $\mathbf{x}$, $\boldsymbol{\beta} = [\beta_{[1]} \;\; \beta_{[2]} \;\; \cdots]$ is a vector of model parameters, $\delta$ is a standard deterministic variable [2] (equivalent to the number 1), and $\epsilon$ is a type I standard random variable [3].
A general formulation of the multi-objective optimization problem of model structure identification can be represented as follows:
$$\min_{M \in \mathcal{M}} \left\{ P_B(M),\; P_E(M),\; P_\beta(M) \right\} \tag{1.2}$$
where $\mathcal{M}$ represents the set of possible models considered, $P_B$ is a bias penalty term to be minimized, $P_E$ is a residual error penalty term to be minimized, and $P_\beta$ is a parametric mismatch penalty term to be minimized.
Considering a homoscedastic model:
$$y(\mathbf{x}) = f(\mathbf{x}, \boldsymbol{\beta}) + \sigma\,\epsilon \tag{1.3}$$
where the constant $\sigma$ is the standard deviation of the model from the experimental data, the optimization problem can be simplified, resulting in the minimization of the sum of squared residuals ($SSR$) [1]:
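$$\min_{\hat{\boldsymbol{\beta}}} \; SSR = \sum_{i=1}^{n} e_i^2 = n \langle e^2 \rangle \tag{1.4}$$
or in the parsimonious minimization of the estimated residual variance ($\hat{\sigma}^2$), where $n$ is the number of observations and $p$ is the number of estimated model parameters:
$$\min_{\hat{\boldsymbol{\beta}}} \; \hat{\sigma}^2 = \frac{SSR}{n - p} = \frac{n \langle e^2 \rangle}{n - p} \tag{1.5}$$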
where the residuals are defined as the difference between each experimental observation ($y_i$) and its corresponding model prediction ($\hat{y}(\mathbf{x}_i)$):
$$e_i = y_i - \hat{y}(\mathbf{x}_i) \tag{1.6}$$
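The quantities above can be sketched in a few lines of R. This is only an illustrative helper, not part of the report's code: the function name `residual_stats`, the toy data, and the choice of dividing by the number of observations minus the number of estimated parameters are assumptions made for the example.

```r
# Hypothetical helper (not part of the report's code): residual statistics
# for a fitted model with n_par estimated parameters.
residual_stats <- function(y, y_hat, n_par) {
  e <- y - y_hat                     # residuals: observation minus prediction
  ssr <- sum(e^2)                    # sum of squared residuals
  s2 <- ssr / (length(y) - n_par)    # residual variance estimate (assumed d.o.f.)
  list(residuals = e, ssr = ssr, s2 = s2)
}

# Toy data: observations and predictions of an assumed two-parameter model
y     <- c(1.1, 1.9, 3.2, 3.8, 5.1)
y_hat <- c(1.0, 2.0, 3.0, 4.0, 5.0)
out <- residual_stats(y, y_hat, n_par = 2)
out$ssr
out$s2
```

Adding a parameter always keeps or lowers the sum of squared residuals, but it also lowers the denominator of the variance estimate, which is why minimizing the variance is the parsimonious criterion.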
In the previous report [1], the particular case of homoscedastic models with a linear function of
the parameters was considered. A stepwise multiple linear regression strategy proceeding in
both directions (backward elimination and forward selection) was suggested based on the
selection of relevant terms for the model prioritized on their absolute linear correlation
coefficients with respect to the response variable, followed by the identification of statistically
significant or explanatory terms based on optimal significance levels. Two additional optional
constraints were included: a lower limit on the normality value of the residuals
(normality assumption check), and a lower limit on the standard residual error (avoiding
model overfitting). This stepwise strategy, which overcomes several limitations of
conventional stepwise regression, was implemented in the R language (https://cran.r-project.org/),
and different examples were considered to illustrate its use.
In this report, a more general case of homoscedastic models with nonlinear functions of the
model parameters is considered. Unfortunately, analytical expressions can only be obtained for
specific nonlinear functions, and for that reason, numerical approaches are considered for both
parameter identification and model structure optimization.
2. Models with Nonlinear Functions of Parameters
2.1. General Formulation
The general homoscedastic model described by nonlinear functions of model parameters was
already presented in Eq. (1.3). However, a slight modification will be introduced to guarantee
the existence of parameters resulting in unbiased model residuals. Thus, the modified version
is:
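$$y(\mathbf{x}) = f^*(\mathbf{x}, \boldsymbol{\beta}^*) + \sigma\,\epsilon \tag{2.1}$$
where
$$f^*(\mathbf{x}, \boldsymbol{\beta}^*) = \beta_0 + f(\mathbf{x}, \boldsymbol{\beta}) \tag{2.2}$$
$$\boldsymbol{\beta}^* = [\beta_0 \;\; \boldsymbol{\beta}] \tag{2.3}$$
and
$$\beta_0 = \bar{y} - \frac{1}{n}\sum_{i=1}^{n} f(\mathbf{x}_i, \boldsymbol{\beta}) \tag{2.4}$$
where $\bar{y}$ is the average value of all $n$ observations of variable $y$.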

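Notice that Eq. (2.4) is used to determine the bias correction term $\beta_0$, and therefore the parameter estimation procedure is performed only considering vector $\boldsymbol{\beta}$. In other words, the unbiased model structure considered is ultimately:
$$y(\mathbf{x}) = \bar{y} + f(\mathbf{x}, \boldsymbol{\beta}) - \frac{1}{n}\sum_{i=1}^{n} f(\mathbf{x}_i, \boldsymbol{\beta}) + \sigma\,\epsilon \tag{2.5}$$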
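As a minimal sketch of this unbiased structure, the bias correction term can be computed directly from the observations once a parameter vector is given. The helper name `unbiased_predict`, the exponential test function, and the data below are hypothetical; the model function is assumed to be vectorized over `x`.

```r
# Hypothetical helper: unbiased predictions for any vectorized model f(x, beta)
unbiased_predict <- function(f, x, y, beta) {
  fx <- f(x, beta)                # nonlinear function at each observation
  b0 <- mean(y) - mean(fx)        # bias correction term (mean of y minus mean of f)
  list(bias = b0, fitted = b0 + fx)
}

# Illustrative function and data (assumptions, not taken from the report)
f_exp <- function(x, beta) beta[1] * exp(-beta[2] * x)
x <- c(0, 1, 2, 3, 4)
y <- c(5.1, 3.9, 3.2, 2.8, 2.6)

fit <- unbiased_predict(f_exp, x, y, beta = c(3, 0.5))
mean_residual <- mean(y - fit$fitted)   # zero up to rounding, for any beta
mean_residual
```

Because the bias term is recomputed from the data for every candidate parameter vector, the residuals average to zero regardless of how good the parameter values are.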
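Alternatively, the nonlinear function $f(\mathbf{x}, \boldsymbol{\beta})$ can also be described by an infinite polynomial Taylor series expansion (in terms of the model parameters) as follows:
$$f(\mathbf{x}, \boldsymbol{\beta}) = \sum_{k_1=0}^{\infty}\sum_{k_2=0}^{\infty}\cdots\sum_{k_p=0}^{\infty} \left(\frac{\partial^{\sum_{j=1}^{p} k_j} f}{\prod_{j=1}^{p} \partial\beta_{[j]}^{k_j}}\right)_{(\mathbf{x}, \hat{\boldsymbol{\beta}})} \prod_{j=1}^{p} \frac{\left(\beta_{[j]} - \hat{\beta}_{[j]}\right)^{k_j}}{k_j!} \tag{2.6}$$
where $p$ is the number of parameters considered in function $f$, and $\hat{\boldsymbol{\beta}}$ is a vector of model parameter estimates.
Now, model (2.5) will be expressed as follows:
$$y(\mathbf{x}) = \bar{y} + \sum_{k_1=0}^{\infty}\cdots\sum_{k_p=0}^{\infty} \left(\frac{\partial^{\sum_{j=1}^{p} k_j} f}{\prod_{j=1}^{p} \partial\beta_{[j]}^{k_j}}\right)_{(\mathbf{x}, \hat{\boldsymbol{\beta}})} \prod_{j=1}^{p} \frac{\left(\beta_{[j]} - \hat{\beta}_{[j]}\right)^{k_j}}{k_j!} - \frac{1}{n}\sum_{i=1}^{n}\sum_{k_1=0}^{\infty}\cdots\sum_{k_p=0}^{\infty} \left(\frac{\partial^{\sum_{j=1}^{p} k_j} f}{\prod_{j=1}^{p} \partial\beta_{[j]}^{k_j}}\right)_{(\mathbf{x}_i, \hat{\boldsymbol{\beta}})} \prod_{j=1}^{p} \frac{\left(\beta_{[j]} - \hat{\beta}_{[j]}\right)^{k_j}}{k_j!} + \sigma\,\epsilon \tag{2.7}$$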
2.2. Parameter Identification
The optimal parameter values can be found by solving either problem (1.4) or problem (1.5). Let
us consider again the general homoscedastic nonlinear model (2.5). Then, the corresponding
objective functions to minimize will be:
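$$SSR = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n}\left(y_i - \bar{y} - f(\mathbf{x}_i, \boldsymbol{\beta}) + \frac{1}{n}\sum_{k=1}^{n} f(\mathbf{x}_k, \boldsymbol{\beta})\right)^2 \tag{2.8}$$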

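$$\hat{\sigma}^2 = \frac{1}{n - p - 1}\sum_{i=1}^{n} e_i^2 \tag{2.9}$$
where $n$ is the number of observations and $p$ is the number of parameters in $f$ (one additional degree of freedom is taken by the bias correction term).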
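Considering a fixed model structure (the number of model parameters remains constant) and a fixed set of observations for parameter identification, both optimization problems can be equivalently replaced by:
$$\min_{\boldsymbol{\beta}} \left( \mathrm{Var}\left(f(\mathbf{x}, \boldsymbol{\beta})\right) - 2\,\mathrm{Cov}\left(y, f(\mathbf{x}, \boldsymbol{\beta})\right) \right) \tag{2.10}$$
where
$$\mathrm{Var}\left(f(\mathbf{x}, \boldsymbol{\beta})\right) = \frac{1}{n-1}\sum_{i=1}^{n}\left(f(\mathbf{x}_i, \boldsymbol{\beta}) - \bar{f}(\boldsymbol{\beta})\right)^2 \tag{2.11}$$
$$\mathrm{Cov}\left(y, f(\mathbf{x}, \boldsymbol{\beta})\right) = \frac{1}{n-1}\sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(f(\mathbf{x}_i, \boldsymbol{\beta}) - \bar{f}(\boldsymbol{\beta})\right) \tag{2.12}$$
$$\bar{f}(\boldsymbol{\beta}) = \frac{1}{n}\sum_{i=1}^{n} f(\mathbf{x}_i, \boldsymbol{\beta}) \tag{2.13}$$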
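The equivalence between minimizing the sum of squared residuals of the unbiased model and minimizing the variance of the function minus twice its covariance with the response can be checked numerically. The sketch below uses a hypothetical exponential function and toy data; it relies only on the sample definitions used by R's `var` and `cov`, and the conclusion does not depend on whether the normalization is $n$ or $n-1$.

```r
# Illustrative function and data (assumptions, not taken from the report)
f_exp <- function(x, beta) beta[1] * exp(-beta[2] * x)
x <- c(0, 1, 2, 3, 4)
y <- c(5.1, 3.9, 3.2, 2.8, 2.6)
n <- length(y)

# Sum of squared residuals of the unbiased model structure
ssr_unbiased <- function(beta) {
  fx <- f_exp(x, beta)
  sum((y - mean(y) - fx + mean(fx))^2)
}

# Reduced objective: variance of f minus twice the covariance between y and f
reduced_obj <- function(beta) {
  fx <- f_exp(x, beta)
  var(fx) - 2 * cov(y, fx)
}

# For a fixed data set, differences in SSR between two parameter vectors
# equal (n - 1) times the differences in the reduced objective
beta_a <- c(3, 0.5)
beta_b <- c(2.5, 0.8)
lhs <- ssr_unbiased(beta_a) - ssr_unbiased(beta_b)
rhs <- (n - 1) * (reduced_obj(beta_a) - reduced_obj(beta_b))
c(lhs = lhs, rhs = rhs)
```

The term that was dropped, the variance of the response, does not depend on the parameters, so both objectives rank any two parameter vectors identically.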
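The corresponding optimality conditions require, for each parameter $j$:
$$\left(\frac{\partial\,\mathrm{Var}\left(f(\mathbf{x}, \boldsymbol{\beta})\right)}{\partial\beta_{[j]}}\right)_{(\hat{\boldsymbol{\beta}})} = 2\left(\frac{\partial\,\mathrm{Cov}\left(y, f(\mathbf{x}, \boldsymbol{\beta})\right)}{\partial\beta_{[j]}}\right)_{(\hat{\boldsymbol{\beta}})} \tag{2.14}$$
where $\hat{\boldsymbol{\beta}}$ represents the optimal estimation of the model parameter values for the set of observations considered.
Eq. (2.14) can be rearranged and alternatively expressed as follows:
$$\sum_{i=1}^{n}\left(f(\mathbf{x}_i, \hat{\boldsymbol{\beta}}) - \bar{f}(\hat{\boldsymbol{\beta}})\right)\left(\left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_i, \hat{\boldsymbol{\beta}})} - \frac{1}{n}\sum_{k=1}^{n}\left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_k, \hat{\boldsymbol{\beta}})}\right) = \sum_{i=1}^{n}\left(y_i - \bar{y}\right)\left(\left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_i, \hat{\boldsymbol{\beta}})} - \frac{1}{n}\sum_{k=1}^{n}\left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_k, \hat{\boldsymbol{\beta}})}\right) \tag{2.15}$$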
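Unfortunately, the set of nonlinear equations given by Eq. (2.14) or (2.15) does not have a general analytical solution for $\hat{\boldsymbol{\beta}}$, as it depends on the particular functionality of $\left(\frac{\partial f}{\partial\beta_{[j]}}\right)$.
However, notice that the nonlinear function can be approximated about $\hat{\boldsymbol{\beta}}_0$ (where $\hat{\boldsymbol{\beta}}_0$ is an estimation close to the optimum $\hat{\boldsymbol{\beta}}$) by a linear function of the model parameters, simply by truncating Eq. (2.6) after the first power terms, resulting in:
$$f(\mathbf{x}, \boldsymbol{\beta}) \approx f(\mathbf{x}, \hat{\boldsymbol{\beta}}_0) + \sum_{j=1}^{p}\left[\left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}, \hat{\boldsymbol{\beta}}_0)}\left(\beta_{[j]} - \hat{\beta}_{0[j]}\right)\right] \tag{2.16}$$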
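The set of optimality conditions (from Eq. 2.15) can be rearranged into:
$$\sum_{l=1}^{p}\left[\sum_{i=1}^{n}\phi_{ij}\,\phi_{il}\right]\hat{\beta}_{[l]} = \sum_{i=1}^{n}\phi_{ij}\left(\left(y_i - \bar{y}\right) - \left(f(\mathbf{x}_i, \hat{\boldsymbol{\beta}}_0) - \frac{1}{n}\sum_{k=1}^{n} f(\mathbf{x}_k, \hat{\boldsymbol{\beta}}_0)\right) + \sum_{l=1}^{p}\left[\phi_{il}\,\hat{\beta}_{0[l]}\right]\right) \tag{2.17}$$
where the shorthand
$$\phi_{ij} = \left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_i, \hat{\boldsymbol{\beta}}_0)} - \frac{1}{n}\sum_{k=1}^{n}\left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_k, \hat{\boldsymbol{\beta}}_0)}$$
denotes the centered partial derivatives evaluated at $\hat{\boldsymbol{\beta}}_0$.
Eq. (2.17) represents a set of linear equations with $\hat{\boldsymbol{\beta}}$ as the unknown vector. This set of equations can be solved iteratively from an arbitrary initial estimate until convergence is achieved, yielding the optimum estimate. Unfortunately, this method can diverge.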
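The iterative solution of the linearized optimality conditions can be sketched as follows. This is not the `stepnlm` implementation from the Appendix: the helper names `centered_jacobian` and `linearize_fit`, the forward-difference derivatives, the stopping rule, and the noise-free test data are all assumptions made for illustration.

```r
# Illustrative model function (assumption): exponential decay
f_exp <- function(x, beta) beta[1] * exp(-beta[2] * x)

# Centered sensitivity matrix using forward-difference derivatives
centered_jacobian <- function(f, x, beta, h = 1e-6) {
  f0 <- f(x, beta)
  J <- sapply(seq_along(beta), function(j) {
    step <- h * max(abs(beta[j]), 1)
    b <- beta
    b[j] <- b[j] + step
    (f(x, b) - f0) / step
  })
  sweep(J, 2, colMeans(J))     # subtract the column means
}

# Repeatedly solve the linearized normal equations until the update is small
linearize_fit <- function(f, x, y, beta0, tol = 1e-8, maxit = 50) {
  beta <- beta0
  for (it in seq_len(maxit)) {
    Phi <- centered_jacobian(f, x, beta)
    fx <- f(x, beta)
    r <- (y - mean(y)) - (fx - mean(fx))     # centered residuals
    delta <- solve(crossprod(Phi), crossprod(Phi, r))
    beta <- beta + as.vector(delta)
    if (max(abs(delta)) < tol) return(list(beta = beta, iterations = it))
  }
  stop("linearization did not converge; try another method or starting point")
}

# Noise-free data generated with bias 2 and parameters (3, 0.5)
x <- seq(0, 5, by = 0.5)
y <- 2 + 3 * exp(-0.5 * x)
fit <- linearize_fit(f_exp, x, y, beta0 = c(2.8, 0.45))
fit$beta
```

Solving for the update and adding it to the current estimate is algebraically the same as solving the linear system for the new estimate directly. With a poor starting point the full step can overshoot, which is the divergence risk mentioned above.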
Alternatively, minimization problem (2.10) can also be solved by any suitable numerical
optimization method. When $f(\mathbf{x}, \boldsymbol{\beta})$ is linear with respect to the parameters, the optimization
is easily solved by gradient-based optimization methods. However, nonlinear functions may
result in local optima that do not necessarily correspond to the global optimum. Thus, the use
of global optimization strategies is highly advisable. One possible approach is the
multi-algorithm strategy proposed in a previous report [4], where multiple types of optimization
algorithms (including gradient-based and non-gradient-based methods) are sequentially used
to find the global optimum.
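A much simplified illustration of chaining different algorithm types is given below. It is not the multi-algorithm strategy of [4]; it only combines two methods available in base R's `optim` (a derivative-free Nelder-Mead search followed by a gradient-based BFGS refinement). The model function and data are hypothetical.

```r
# Illustrative function and noise-free data (assumptions)
f_exp <- function(x, beta) beta[1] * exp(-beta[2] * x)
x <- seq(0, 5, by = 0.5)
y <- 2 + 3 * exp(-0.5 * x)

# Objective: sum of squared residuals of the unbiased model structure
objective <- function(beta) {
  fx <- f_exp(x, beta)
  sum((y - mean(y) - fx + mean(fx))^2)
}

# Stage 1: derivative-free search from a rough starting point
stage1 <- optim(c(1, 1), objective, method = "Nelder-Mead")

# Stage 2: gradient-based refinement started from the stage 1 result
stage2 <- optim(stage1$par, objective, method = "BFGS")

stage2$par
stage2$value
```

A sequential chain like this does not guarantee a global optimum; repeating it from several starting points is a common additional safeguard.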

2.3. Parameter Uncertainty and Identifiability
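The exact analytical determination of the uncertainty in the estimation of model parameters is
challenging for the case of nonlinear functions. Considering the similarity between multiple
linear regression and the linear approximation given in Eq. (2.17), an approximate expression
for parameter uncertainty can be obtained [1]:
$$\sigma^2_{\beta_{[j]}}(\hat{\boldsymbol{\beta}}) \approx \left[\left(\Phi(\hat{\boldsymbol{\beta}})^T\,\Phi(\hat{\boldsymbol{\beta}})\right)^{-1}\right]_{[j,j]}\hat{\sigma}^2(\hat{\boldsymbol{\beta}}) \tag{2.18}$$
where $\Phi(\hat{\boldsymbol{\beta}})$ is the corresponding approximated Fisher information matrix (an $n \times p$ matrix)
evaluated at $\hat{\boldsymbol{\beta}}$, with elements:
$$\Phi(\hat{\boldsymbol{\beta}})_{[i,j]} = \left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_i, \hat{\boldsymbol{\beta}})} - \frac{1}{n}\sum_{k=1}^{n}\left(\frac{\partial f}{\partial\beta_{[j]}}\right)_{(\mathbf{x}_k, \hat{\boldsymbol{\beta}})} \tag{2.19}$$
Of course, numerical methods like Monte Carlo simulation can also be used to determine
uncertainty in the estimation of model parameters.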
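The linear approximation of parameter uncertainty can be sketched numerically and checked against a case where the answer is known. In the example below a one-parameter function that is linear in its parameter is used, so the result should match the slope standard error reported by `lm`. The helper names, the forward-difference derivatives, and the degrees of freedom (observations minus parameters minus one for the bias term) are assumptions of this sketch.

```r
# Centered sensitivity matrix using forward-difference derivatives
centered_jacobian <- function(f, x, beta, h = 1e-6) {
  f0 <- f(x, beta)
  J <- sapply(seq_along(beta), function(j) {
    step <- h * max(abs(beta[j]), 1)
    b <- beta
    b[j] <- b[j] + step
    (f(x, b) - f0) / step
  })
  sweep(J, 2, colMeans(J))
}

# Approximate standard uncertainty of each parameter
param_uncertainty <- function(f, x, y, beta) {
  Phi <- centered_jacobian(f, x, beta)
  fx <- f(x, beta)
  e <- (y - mean(y)) - (fx - mean(fx))               # unbiased residuals
  s2 <- sum(e^2) / (length(y) - length(beta) - 1)    # assumed degrees of freedom
  sqrt(diag(solve(crossprod(Phi))) * s2)
}

# Check against ordinary linear regression (toy data)
f_lin <- function(x, beta) beta[1] * x
x <- 1:8
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)
fit_lm <- lm(y ~ x)
beta_hat <- unname(coef(fit_lm)[2])

se_sketch <- param_uncertainty(f_lin, x, y, beta_hat)
se_lm <- summary(fit_lm)$coefficients[2, 2]
c(sketch = se_sketch, lm = se_lm)
```

For functions that are nonlinear in the parameters the same computation is only an approximation, valid near the optimum.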
Now, the set of model parameters can be identified if and only if all terms in $\left(\Phi(\hat{\boldsymbol{\beta}})^T\,\Phi(\hat{\boldsymbol{\beta}})\right)^{-1}$
are finite. In other words, when $\Phi(\hat{\boldsymbol{\beta}})^T\,\Phi(\hat{\boldsymbol{\beta}})$ is not singular (or close to singular). The matrix $\Phi(\hat{\boldsymbol{\beta}})^T\,\Phi(\hat{\boldsymbol{\beta}})$ is non-singular only when i) all columns in $\Phi(\hat{\boldsymbol{\beta}})$ are linearly
independent, and ii) the number of rows is not less than the number of columns in $\Phi(\hat{\boldsymbol{\beta}})$. In addition, it can become computationally singular when iii) the
values of at least two model parameters differ by several orders of magnitude from each other
(in which case parameter re-scaling or standardization is needed).
Two or more columns in $\Phi(\hat{\boldsymbol{\beta}})$ are linearly correlated when at least one of the parameters is
redundant in the model. Take for example the following function:
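$$f(x, \boldsymbol{\beta}) = \left(\frac{\beta_{[1]}\,\beta_{[2]}}{\beta_{[3]}}\right) x \tag{2.20}$$
The $i$-th row of matrix $\Phi(\hat{\boldsymbol{\beta}})$ then becomes:
$$\Phi(\hat{\boldsymbol{\beta}})_{[i]} = \left[\frac{\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}}\left(x_i - \frac{1}{n}\sum_{k=1}^{n} x_k\right) \;\;\; \frac{\hat{\beta}_{[1]}}{\hat{\beta}_{[3]}}\left(x_i - \frac{1}{n}\sum_{k=1}^{n} x_k\right) \;\;\; -\frac{\hat{\beta}_{[1]}\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}^2}\left(x_i - \frac{1}{n}\sum_{k=1}^{n} x_k\right)\right] = \left(x_i - \frac{1}{n}\sum_{k=1}^{n} x_k\right)\left[\frac{\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}} \;\;\; \frac{\hat{\beta}_{[1]}}{\hat{\beta}_{[3]}} \;\;\; -\frac{\hat{\beta}_{[1]}\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}^2}\right] \tag{2.21}$$
and
$$\Phi(\hat{\boldsymbol{\beta}})^T\,\Phi(\hat{\boldsymbol{\beta}}) = \left(\sum_{i=1}^{n}\left(x_i - \frac{1}{n}\sum_{k=1}^{n} x_k\right)^2\right)\begin{bmatrix} \left(\frac{\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}}\right)^2 & \frac{\hat{\beta}_{[1]}\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}^2} & -\frac{\hat{\beta}_{[1]}}{\hat{\beta}_{[3]}}\left(\frac{\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}}\right)^2 \\ \frac{\hat{\beta}_{[1]}\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}^2} & \left(\frac{\hat{\beta}_{[1]}}{\hat{\beta}_{[3]}}\right)^2 & -\frac{\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}}\left(\frac{\hat{\beta}_{[1]}}{\hat{\beta}_{[3]}}\right)^2 \\ -\frac{\hat{\beta}_{[1]}}{\hat{\beta}_{[3]}}\left(\frac{\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}}\right)^2 & -\frac{\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}}\left(\frac{\hat{\beta}_{[1]}}{\hat{\beta}_{[3]}}\right)^2 & \left(\frac{\hat{\beta}_{[1]}\hat{\beta}_{[2]}}{\hat{\beta}_{[3]}^2}\right)^2 \end{bmatrix} \tag{2.22}$$
The matrix $\Phi(\hat{\boldsymbol{\beta}})^T\,\Phi(\hat{\boldsymbol{\beta}})$ is singular because the columns of $\Phi(\hat{\boldsymbol{\beta}})$ are linearly correlated. In
addition, writing $v_1 = \hat{\beta}_{[2]}/\hat{\beta}_{[3]}$, $v_2 = \hat{\beta}_{[1]}/\hat{\beta}_{[3]}$ and $v_3 = -\hat{\beta}_{[1]}\hat{\beta}_{[2]}/\hat{\beta}_{[3]}^2$ for the three terms in Eq. (2.21), it can also be observed that:
$$\det\begin{bmatrix} v_1^2 & v_1 v_2 & v_1 v_3 \\ v_1 v_2 & v_2^2 & v_2 v_3 \\ v_1 v_3 & v_2 v_3 & v_3^2 \end{bmatrix} = v_1^2\left(v_2^2 v_3^2 - v_2^2 v_3^2\right) - v_1 v_2\left(v_1 v_2 v_3^2 - v_1 v_2 v_3^2\right) + v_1 v_3\left(v_1 v_2^2 v_3 - v_1 v_2^2 v_3\right) = 0 \tag{2.23}$$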
So, it is impossible to determine all three parameters, regardless of the number of
observations available. In fact, all three parameters should have been grouped as a single
parameter $\beta$, resulting in the following identifiable model:
$$f(x, \beta) = \beta\,x \tag{2.24}$$
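The loss of identifiability caused by redundant parameters can be detected numerically from the rank or the reciprocal condition number of the sensitivity matrix. The sketch below assumes a hypothetical function of the form $(\beta_{[1]}\beta_{[2]}/\beta_{[3]})\,x$, in which three parameters enter only through a single ratio, and uses analytical derivatives with arbitrary parameter values.

```r
# Assumed redundant function: f(x, beta) = (beta1 * beta2 / beta3) * x
x <- c(1, 2, 3, 4, 5, 6)
beta <- c(2, 3, 4)

# Analytical partial derivatives with respect to each parameter
J <- cbind(beta[2] / beta[3] * x,
           beta[1] / beta[3] * x,
           -beta[1] * beta[2] / beta[3]^2 * x)
Phi <- sweep(J, 2, colMeans(J))     # centered sensitivity matrix

rank_Phi <- qr(Phi)$rank            # number of independent columns
rc_redundant <- rcond(crossprod(Phi))

# Grouped model with a single parameter: f(x, beta) = beta * x
Phi_grouped <- matrix(x - mean(x), ncol = 1)
rc_grouped <- rcond(crossprod(Phi_grouped))

c(rank = rank_Phi, rcond_redundant = rc_redundant, rcond_grouped = rc_grouped)
```

A rank lower than the number of parameters, or a reciprocal condition number close to machine precision, signals that the structure should be changed before parameter uncertainty is evaluated.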

2.4. Model Structure Simplification and Optimization
After an identifiable model has been obtained, proceeding similarly as in the case of multiple
linear regression [1], a $T$ statistic can be defined for each model parameter:
$$\hat{T}_{\beta_{[j]}}\left(\beta_{ref[j]}\right) = \frac{\hat{\beta}_{[j]} - \beta_{ref[j]}}{\sigma_{\beta_{[j]}}(\hat{\boldsymbol{\beta}})} \tag{2.25}$$
where $\beta_{ref[j]}$ is an arbitrary reference value for $\hat{\beta}_{[j]}$, typically 0. It is also assumed that $\epsilon$ is a
standard normal random variable. This assumption can be verified using a suitable normality
test [5] on the model residuals.
If ̂[] , it is highly likely that the model parameter can be replaced by , resulting in
[ ]( )
a simpler model. However, care must be taken because this action may also remove other
model parameters from the model structure. This is particularly the case for . Thus, it is
always advisable to check for the resulting ̂ value (keeping all other model parameter values
constant). Only if ̂ decreases (without overfitting) the model simplification can be accepted.
It is then desirable to test for the simplification of model parameters starting from the lowest
value and continuing in ascending order. When or when ̂ increases, then the
simplification procedure may stop.
It is also advisable considering integer values of ( , , , etc.), as well as rational values ( ,

, , etc.) and transcendental constants (such as , , , , etc.).
Notice that forcing a particular value on the model parameter increases the degrees of
freedom for the estimation of residual error variance. However, it might also increase the sum
of squared residuals, ultimately increasing the error variance. For that reason, all changes must
always be tested. In addition, if a forced parameter value is accepted, all remaining model
parameters should be estimated again before testing a new change.
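The test of a forced parameter value can be sketched as a comparison of residual variance estimates before and after fixing the parameter, re-estimating the remaining parameter in between. The function, the simulated data, the candidate value 0.5, and the degrees of freedom used below are assumptions for illustration only; the overfitting check mentioned above is not included.

```r
# Illustrative function and simulated data (assumptions)
f_exp <- function(x, beta) beta[1] * exp(-beta[2] * x)
x <- seq(0, 5, by = 0.5)
set.seed(1)
y <- 2 + 3 * exp(-0.5 * x) + rnorm(length(x), sd = 0.05)
n <- length(y)

# Sum of squared residuals of the unbiased model structure
ssr <- function(beta) {
  fx <- f_exp(x, beta)
  sum((y - mean(y) - fx + mean(fx))^2)
}

# Both parameters free: two parameters plus the bias term
free <- optim(c(2.5, 0.4), ssr, method = "BFGS")
s2_free <- free$value / (n - 2 - 1)

# Second parameter forced to 0.5; the first one is estimated again
fixed <- optimize(function(b1) ssr(c(b1, 0.5)), interval = c(0, 10))
s2_fixed <- fixed$objective / (n - 1 - 1)

# The forced value is acceptable only if the variance estimate decreases
accept <- s2_fixed < s2_free
c(s2_free = s2_free, s2_fixed = s2_fixed, accept = accept)
```

Fixing a parameter can only keep or raise the sum of squared residuals, so the decision hinges on whether the gained degree of freedom compensates for that increase.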
The final step consists in evaluating the bias correction term ( ). The statistic is
approximated in this case by:
̂ (̅ ∑ ( ̂ ))
(̂ ) ̂ ̂
(2.26)
The bias correction term can be neglected (and removed from the model) if (̂ )
.

The proposed procedure for optimizing a nonlinear model structure is illustrated in Figure 1. A
numerical implementation of this procedure in R language (https://cran.r-project.org/) is
included in the Appendix. The proposed function stepnlm is similar to the function used for
multiple linear regression (steplm [1]). The main differences of the new stepnlm function with
respect to the previous steplm are: i) The definition of arbitrary values for simplifying the
model, ii) the identification of model parameters by numerical optimization, iii) the numerical
evaluation of partial derivatives for the estimation of parameter uncertainty, and iv) the lack of
predefined model structures.
Figure 1. Proposed procedure for structure optimization of nonlinear models
A very important step in the proposed procedure is the evaluation of the homoscedasticity of the
residuals. This evaluation can be done using a suitable scedasticity test, preferably one that is also suitable
for non-normal distributions of residuals. In particular, the method for evaluating
homoscedasticity proposed in [6] (reported as Hvalue in the function output) is used in this procedure. However, any other preferred homoscedasticity
test can be employed instead. Results are still reported when the residuals are found to be heteroscedastic.

Homoscedasticity of residuals is the basis of the nonlinear model (1.3) employed for optimizing
the model structure. Thus, if residuals are heteroscedastic, the results obtained by this
procedure may not be reliable. In some cases, a monotonic nonlinear transformation of the
response variable (for example, a logarithmic transformation) may result in homoscedastic residuals.
In other cases, a different model is needed. Typically, such nonlinear
transformations are found by trial and error, although some systematic
methods such as the Box-Cox transformation [7] are available.
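The effect of a logarithmic transformation on residual spread can be illustrated with a small synthetic example. This is not the homoscedasticity test of [6]: the data, the deterministic multiplicative perturbation standing in for noise, and the crude `spread_ratio` indicator (standard deviation of residuals in the upper half of the range divided by that in the lower half) are assumptions made only for this sketch.

```r
# Synthetic response with a multiplicative (scale-dependent) perturbation
x <- seq(0, 8, length.out = 40)
y <- exp(0.5 * x) * (1 + 0.1 * sin(7 * x))

# Residuals about the underlying trend, on both scales
res_orig <- y - exp(0.5 * x)
res_log  <- log(y) - 0.5 * x

# Crude indicator: spread in the upper half relative to the lower half
spread_ratio <- function(res) {
  lower <- seq_along(res) <= length(res) / 2
  sd(res[!lower]) / sd(res[lower])
}

ratio_orig <- spread_ratio(res_orig)   # much larger than 1: heteroscedastic
ratio_log  <- spread_ratio(res_log)    # close to 1: roughly constant spread
c(original = ratio_orig, log = ratio_log)
```

When the spread of the residuals grows with the magnitude of the response, as here, a logarithmic transformation is a natural first candidate. When the pattern comes from a structural deficiency of the model, no transformation of this kind will help.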
3. Illustrative Examples
3.1. Multiple Linear Regression
As a first illustrative example, let us consider the extreme multiple linear regression example
(Example 3) introduced in the previous part of this series [1]. The data is shown in Table 1. It is
considered an extreme example for illustrating the potential failure of stepwise regression.
Table 1. Extreme Multiple Linear Regression Data [1]
x1    x2     y
0.11  16.55  12.37
0.69  15.08  12.66
5.50  0.00   12.00
2.89  7.77   11.93
4.47  2.16   11.06
1.81  12.09  13.03
3.15  8.18   13.13
0.00  15.94  11.44
3.15  7.91   12.86
3.02  6.29   10.84
4.67  1.69   11.20
0.16  15.58  11.56
0.68  13.28  10.83
5.71  0.00   12.63
3.87  5.36   12.46
The general model function considered is the following linear expression:
$$f(\mathbf{x}, \boldsymbol{\beta}) = \beta_{[1]}\,x_1 + \beta_{[2]}\,x_2 \tag{3.1}$$
This function is implemented in R language as follows:
modelfn1<-function(x,param) sum(param*x)

The optimization of the model structure can be performed either considering multiple linear
regression (steplm) or nonlinear regression (stepnlm). The first case was considered previously
resulting in [1]:
x1=c(0.11,0.69,5.5,2.89,4.47,1.81,3.15,0,3.15,3.02,4.67,0.16,0.68,5.71,3.87)
x2=c(16.55,15.08,0,7.77,2.16,12.09,8.18,15.94,7.91,6.29,1.69,15.58,13.28,0,
5.36)
x=data.frame(x1,x2)
y=c(12.37,12.66,12,11.93,11.06,13.03,13.13,11.44,12.86,10.84,11.2,11.56,
10.83,12.63,12.46)
steplm(y,x)
$model
[1] "Y~1+X2+X1"
$coefficients
(Intercept) X2 X1
-4.534347 1.002182 3.005421
$Tvalues
(Intercept) X2 X1
-110.1527 401.2022 401.1985
$model_performance
s R2adj Nvalue
0.007463931 0.999913422 1.827062191
$significant_vars
Df |T| value Pr(>|T|)
X2 1 401.20 < 2.2e-16 ***
X1 1 401.20 < 2.2e-16 ***
Residuals 12
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
$significance_level
[1] 0.04812713
The second case is evaluated as follows (model predictions and model residuals, which are included in the output of the function, are not shown for simplicity):
stepnlm(y,x,param0=c(0,0),modelfn=modelfn1)
$bias
[1] -4.534347
$param
coeff Tvalue uncertainty
1 3.005421 401.1985 0.007491108
2 1.002182 401.2022 0.002497947
$model_performance
s R2adj Nvalue Hvalue
0.007463931 0.999913422 1.827062191 3.266647954
Notice that the bias term in nonlinear regression represents the intercept in multiple linear
regression.
As we can see, both results are exactly identical. Thus, we may conclude that the nonlinear
regression approach is more general and can be also used for linear models.
However, in nonlinear regression the significance of the regressor variables is not evaluated
because it will not be necessarily associated to the significance of an individual model
parameter, as in the case of multiple linear regression.
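The bias term reported for this example can be reproduced directly from the data in Table 1 and the reported coefficients, since it is the average of the observations minus the average of the function values. The sketch below only evaluates `modelfn1` row by row with `apply`; how `stepnlm` calls the function internally is not assumed here.

```r
modelfn1 <- function(x, param) sum(param * x)

# Data from Table 1
x1 <- c(0.11, 0.69, 5.5, 2.89, 4.47, 1.81, 3.15, 0, 3.15, 3.02, 4.67, 0.16,
        0.68, 5.71, 3.87)
x2 <- c(16.55, 15.08, 0, 7.77, 2.16, 12.09, 8.18, 15.94, 7.91, 6.29, 1.69,
        15.58, 13.28, 0, 5.36)
y <- c(12.37, 12.66, 12, 11.93, 11.06, 13.03, 13.13, 11.44, 12.86, 10.84,
       11.2, 11.56, 10.83, 12.63, 12.46)
X <- cbind(x1, x2)

# Coefficients reported for this example (rounded values from the output)
beta_hat <- c(3.005421, 1.002182)

# Evaluate the function for each observation and compute the bias term
fx <- apply(X, 1, modelfn1, param = beta_hat)
bias <- mean(y) - mean(fx)
bias
```

The value obtained agrees with the intercept of the multiple linear regression up to the rounding of the reported coefficients.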
3.2. General Arrhenius Model
As a second example, let us consider the general Arrhenius equation with a Maxwell-Boltzmann
correction for the pre-exponential coefficient [6]:
$$k(T) = A\,T^{n}\exp\left(-\frac{E_a}{T}\right)\frac{1 + \left(1 + \frac{E_a}{T}\right)^2}{4} \tag{3.2}$$
This expression describes the effect of absolute temperature ($T$) on the rate coefficient ($k$) of a
reaction. The model parameters are:
- $A$: Pre-exponential coefficient associated with collision frequency [cm³/(s·Kⁿ)]
- $n$: Characteristic power of the rate-limiting step of the reaction.
- $E_a$: Activation energy (expressed in kelvin)
Eq. (3.2) will be used to describe the following bimolecular reaction:
( ) ( ) ( ) ( )
(3.3)
140 different measurements of the rate of reaction at different temperatures, reported in more
than 20 different works between 1963 and 2000, are used for the estimation of reaction
parameters. These are shown in Table 2. The temperature range of the experiments is
approximately between and . The values of the activation energy barrier (in K),
previously estimated by different authors, are between and , and the pre-
exponential temperature power ranges from to [7].
Since parameter identification has previously been performed for this set of
experimental data [6], those results will be used as the starting point ($\hat{\boldsymbol{\beta}}_0$) for the optimization of
the nonlinear model structure:
$$\hat{\boldsymbol{\beta}}_0 = [1.1\times10^{-25} \;\;\; 4.32 \;\;\; 475] \tag{3.5}$$
Table 2. Kinetic data for reaction (3.3) [7]

T [K] k [cm3/s] T [K] k [cm3/s] T [K] k [cm3/s] T [K] k [cm3/s]
3246.5 8.6941E-11 1330.6 1.3359E-12 622.4 1.0407E-14 412.9 1.9014E-16
3178.1 6.9259E-11 1319.7 5.2286E-13 612.0 1.1335E-14 412.9 2.3870E-16
2987.6 4.9248E-11 1319.1 9.7722E-13 607.1 7.4008E-15 408.3 2.9131E-16
2987.1 5.3633E-11 1284.7 1.0644E-12 590.3 3.4363E-15 405.0 2.1926E-16
2986.4 6.1824E-11 1273.4 1.3748E-12 590.3 4.5661E-15 396.3 1.6983E-16
2818.4 3.6029E-11 1251.6 1.7757E-12 572.1 2.3756E-15 396.3 2.1319E-16
2765.1 4.1534E-11 1242.5 3.9362E-13 572.0 5.5736E-15 391.1 1.9033E-16
2764.7 4.5231E-11 1241.7 8.7248E-13 550.7 4.3175E-15 384.0 1.2788E-16
2764.0 5.3643E-11 1241.4 1.2626E-12 544.7 2.9840E-15 378.1 1.6058E-16
2665.3 4.3968E-11 1201.9 4.8037E-13 521.8 1.8415E-15 377.2 7.8899E-17
2575.9 1.8741E-11 1201.5 8.2442E-13 519.8 4.3208E-15 375.2 1.2435E-16
2493.6 5.6797E-12 1191.5 1.1596E-12 516.3 1.4671E-15 372.4 7.2467E-17
2447.3 2.3531E-11 1173.0 6.9524E-13 514.5 2.2474E-15 371.5 1.1101E-16
2226.9 1.9846E-11 1103.0 7.7927E-13 509.2 1.5981E-15 369.6 1.2090E-16
2225.8 2.8719E-11 1041.3 4.8081E-13 504.0 2.1238E-15 368.7 7.2481E-17
2101.5 1.3717E-11 1034.3 2.7232E-13 499.0 1.2030E-15 364.2 5.3030E-17
2101.3 1.4520E-11 992.8 2.2325E-13 497.2 2.1242E-15 363.3 7.4592E-17
1841.1 8.7086E-12 918.8 2.3644E-13 493.9 1.4681E-15 362.4 6.6578E-17
1775.3 6.7437E-12 913.4 1.5004E-13 492.3 1.7914E-15 356.3 5.6157E-17
1754.6 5.6865E-12 902.2 1.7298E-13 484.2 1.6928E-15 353.8 4.3486E-17
1694.6 4.9338E-12 880.8 1.2655E-13 479.6 6.8170E-16 349.6 5.6178E-17
1603.8 2.5664E-12 865.3 1.6347E-13 479.6 1.1372E-15 348.0 4.1097E-17
1602.0 8.2320E-12 831.8 4.8164E-14 468.9 1.1375E-15 345.6 2.9223E-17
1569.7 2.7168E-12 827.0 6.2209E-14 460.2 9.5943E-16 340.0 2.6091E-17
1537.5 2.0448E-12 813.2 1.0088E-13 458.7 1.2392E-15 336.1 3.5677E-17
1491.5 1.4132E-12 804.7 4.4240E-14 447.7 6.2664E-16 328.0 2.0217E-17
1490.5 3.0448E-12 755.5 2.9732E-14 447.7 7.4318E-16 322.3 1.8051E-17
1490.3 3.4114E-12 747.6 5.8825E-14 445.0 5.5934E-16 321.6 2.1408E-17
1447.9 1.2262E-12 740.5 2.3025E-14 441.0 7.2252E-16 318.8 1.6114E-17
1433.7 1.3739E-12 740.3 4.0654E-14 437.2 4.5854E-16 317.4 1.9664E-17
1419.3 2.4960E-12 736.8 1.9415E-14 425.9 3.3556E-16 315.4 1.5227E-17
1393.6 1.0051E-12 733.0 3.5271E-14 425.9 4.4588E-16 300.7 1.0838E-17
1393.4 1.1586E-12 715.5 1.8347E-14 425.8 5.1399E-16 297.7 8.8846E-18
1393.3 1.3356E-12 676.4 1.2686E-14 422.2 3.6548E-16 297.7 1.0242E-17
1330.8 1.1923E-12 673.3 1.1323E-14 415.1 3.9813E-16 295.9 1.1155E-17
In addition, the following arbitrary parameter value tolerances ($\boldsymbol{\beta}_{tol}$, argument ptol in the code) are considered:
$$\boldsymbol{\beta}_{tol} = [10^{-31} \;\;\; 10^{-6} \;\;\; 10^{-4}] \tag{3.6}$$
The experimental data is saved as a data frame in R with the name “Arrhenius”. The function
initially considered is (from Eq. 3.2):
modelfn2<-function(x,param){
# x: absolute temperature; param = c(pre-exponential coefficient, power, activation energy)
param[1]*x^param[2]*exp(-param[3]/x)*(1+(1+param[3]/x)^2)/4
}
The results obtained after nonlinear model structure optimization for this case are the
following:
stepnlm(Arrhenius$y,Arrhenius$x,param0=c(1.1e-25,4.32,475),modelfn=modelfn2,
ptol=c(1e-31,1e-6,1e-4))
Error in solve.default(t(phi0)%*%phi0): system is computationally singular:
reciprocal condition number = 2.1656e-62
Error in stepnlm(Arrhenius$y, Arrhenius$x, param0 = c(1.1e-25, 4.32, 475), :
modelfn is an unidentifiable function. Please check the model structure
(change structure, re-scale or remove parameters) and try again!
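This error was obtained due to the large differences in parameter values, which span almost
28 orders of magnitude (from about $10^{-25}$ to about $10^{2}$). As suggested, we must re-scale one or more parameters. Particularly, the
pre-exponential coefficient will be re-scaled. The new model is then:
$$10^{25}\,k(T) = A'\,T^{n}\exp\left(-\frac{E_a}{T}\right)\frac{1 + \left(1 + \frac{E_a}{T}\right)^2}{4}, \qquad A' = 10^{25} A \tag{3.7}$$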
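Because the model is linear in the pre-exponential coefficient, multiplying both the response and that coefficient by the same factor leaves the fit unchanged while bringing the parameter values to comparable magnitudes. The following sketch checks this numerically; the function name `arrhenius_sketch`, the form assumed for the Maxwell-Boltzmann factor, and the test temperatures are assumptions of the example.

```r
# Assumed form of the general Arrhenius model (illustrative sketch)
arrhenius_sketch <- function(x, param) {
  param[1] * x^param[2] * exp(-param[3] / x) * (1 + (1 + param[3] / x)^2) / 4
}

temp <- c(300, 600, 1200, 2400)    # arbitrary test temperatures in K

# Original scale: pre-exponential coefficient of order 1e-25
k_original <- arrhenius_sketch(temp, c(1.1e-25, 4.32, 475))

# Re-scaled problem: response multiplied by 1e25, coefficient of order 1
k_rescaled <- arrhenius_sketch(temp, c(1.1, 4.32, 475))

same_predictions <- isTRUE(all.equal(1e25 * k_original, k_rescaled))
same_predictions
```

Only the numerical conditioning of the problem changes with this re-scaling; the model structure and the remaining parameters are unaffected.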
The model structure remains the same, but the initial value and tolerance of the pre-exponential coefficient, and
the regressand, change as follows:
stepnlm(1e25*Arrhenius$y,Arrhenius$x,param0=c(1.1,4.32,475),modelfn=modelfn2,
ptol=c(1e-6,1e-6,1e-4))
Error in stepnlm(1e+25 * Arrhenius$y, Arrhenius$x, param0 = c(1.1, 4.32, :
The default identification method diverges. Please use a different method
or a different starting point and try again!
The problem is no longer computationally singular thanks to parameter re-scaling, but the
default parameter optimization (i.e. sequential linearization) diverges. As an alternative, the
default optimization function available in R (optim) is used:
stepnlm(1e25*Arrhenius$y,Arrhenius$x,param0=c(1.1,4.32,475),modelfn=modelfn2,
ptol=c(1e-6,1e-6,1e-4),optmethod=optim)
$bias
[1] 0
$param
coeff Tvalue uncertainty
1 1.252043 0.7227056 1.7324386
2 4.303841 24.8903615 0.1729119
3 0.000000 NaN 0.0000000
$model_performance
s R2adj Nvalue Hvalue
3.016737e+13 9.582601e-01 -Inf -6.948636e+01
Warning message:
Model residuals are heteroscedastic! Use a suitable transformation of the
response variable, or a different model, and try again.
There are several points to notice from these results:
- First, the bias correction term ($\beta_0$) was exactly zero. That is, the function considered is
unbiased.
- Second, the activation energy ($E_a$) was no different from zero either. It means that the
uncertainty in the estimation of $E_a$ was much larger than the optimal parameter value
obtained.
- Third, the pre-exponential term ($A'$) also has a large uncertainty ($|T| \approx 0.72$), but it
is a relevant term (it cannot be exactly zero).
- Fourth, the residuals are far from normal (Nvalue $= -\infty$), with an apparently large standard
deviation ($s = 3.02\times10^{13}$), due to the scaling factor used ($10^{25}$).
- Fifth, the model performance is good ($R^2_{adj} = 0.958$).
- Finally, the data is highly heteroscedastic (Hvalue $= -69.5$). Thus, the results of this
procedure may not be reliable.
Notice that the model obtained by this procedure is the following homoscedastic model:
$$10^{25}\,k(T) = \frac{1.252}{2}\,T^{4.304} + 3.017\times10^{13}\,\epsilon \tag{3.8}$$
where $\epsilon$ is a non-normal standard random variable.
The negligible activation energy obtained (a consequence of its high
uncertainty) probably reflects the inadequacy of homoscedastic model (3.8), since the residuals are
heteroscedastic.
Let us observe the performance of such model graphically (Figure 2), using the following code:
curve((1.252043/2)*x^4.303841,from = min(Arrhenius$x),to=max(Arrhenius$x),
col="green",xlab="Temperature [K]",ylab="Reaction rate [10^-25 cm^3/s]")
points(Arrhenius$x,1e25*Arrhenius$y,col="blue")
While Eq. (3.8) seems to provide a good description of the experimental data, the residual plot
shown in Figure 3 illustrates the heteroscedasticity of model residuals. Unfortunately, this
means that the current procedure employed for parameter identification and structure
optimization may not be reliable for the function considered in Eq. (3.7).
res=1e25*Arrhenius$y-modelfn2(Arrhenius$x,c(1.252043,4.303841,0))
plot(log(Arrhenius$x),res,xlab="Log(Temperature [K])",ylab="Model residuals",
col="red")
Figure 2. Graphical comparison of model predictions (Eq. 3.8) and experimental observations (Table 2)
Figure 3. Model residuals plot obtained from Eq. 3.8 and experimental observations (Table 2)

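A possible alternative in this situation is to transform the regressand nonlinearly. For example,
let us consider the model for the natural logarithm of the reaction rate
coefficient:
$$\ln k(T) = \beta_0 + n\ln T - \frac{E_a}{T} + \ln\left(1 + \left(1 + \frac{E_a}{T}\right)^2\right) + \sigma\,\epsilon \tag{3.9}$$
where $\beta_{[1]} = n$, $\beta_{[2]} = E_a$ (preserving the previous notation), and the new bias correction term
($\beta_0$) corresponds to:
$$\beta_0 = \ln\left(\frac{A}{4}\right) \tag{3.10}$$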
The new function to be used is:
modelfn2L<-function(x,param){
param[1]*log(x)-param[2]/x+log(1+(1+param[2]/x)^2)
}
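The relation between the logarithmic function and the original model can be verified numerically: the logarithm of the general Arrhenius expression should equal the function `modelfn2L` plus a constant that is absorbed by the bias term. The sketch below assumes that the untransformed model has the Maxwell-Boltzmann factor $(1+(1+E_a/T)^2)/4$, so that the constant is the logarithm of one quarter of the pre-exponential coefficient; the name `arrhenius_sketch` and the test values are hypothetical.

```r
modelfn2L <- function(x, param) {
  param[1] * log(x) - param[2] / x + log(1 + (1 + param[2] / x)^2)
}

# Assumed form of the untransformed model (illustrative sketch)
arrhenius_sketch <- function(x, param) {
  param[1] * x^param[2] * exp(-param[3] / x) * (1 + (1 + param[3] / x)^2) / 4
}

temp <- c(300, 600, 1200, 2400)    # arbitrary test temperatures in K
A  <- 1.1e-25                      # pre-exponential coefficient
n  <- 4.32                         # temperature power
Ea <- 475                          # activation energy in K

log_of_model  <- log(arrhenius_sketch(temp, c(A, n, Ea)))
bias_plus_fnL <- log(A / 4) + modelfn2L(temp, c(n, Ea))

consistent <- isTRUE(all.equal(log_of_model, bias_plus_fnL))
consistent
```

Under this assumption, a fitted bias term can be converted back into a pre-exponential coefficient as four times the exponential of the bias.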
And the nonlinear model structure optimization procedure yields:
stepnlm(log(Arrhenius$y),Arrhenius$x,param0=c(4.32,475),modelfn=modelfn2L,
ptol=c(1e-6,1e-4),optmethod=optim)
$bias
[1] -59.0002
$param
coeff Tvalue uncertainty
1 4.357406 32.60131 0.1336574
2 3004.853511 26.87288 111.8173227
$model_performance
s R2adj Nvalue Hvalue
0.3472787 0.9946762 -1.8500093 1.1568497
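Notice that the default identification method (iterative linearization) converged, yielding the
following optimal parameter values:
$$n = 4.357, \qquad E_a = 3004.9\ \mathrm{K}, \qquad \beta_0 = -59.0002 \tag{3.11}$$
with $R^2_{adj} = 0.9947$, and a non-normal (Nvalue $= -1.85$), but now homoscedastic (Hvalue $= 1.16$),
distribution of residuals, as seen in Figure 4.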
par(mfcol=c(1,2))
curve(-59.0002+modelfn2L(x,c(4.357406,3004.853511)),from = min(Arrhenius$x),
to=max(Arrhenius$x), col="green",xlab="Temperature [K]",
ylab="Log(reaction rate [cm^3/s])")
points(Arrhenius$x,log(Arrhenius$y),col="blue")
res=log(Arrhenius$y)-modelfn2L(Arrhenius$x,c(4.357406,3004.853511))+59.0002
plot(log(Arrhenius$x),res,xlab="Log(Temperature [K])",ylab="Model residuals",
col="red")
Figure 4. Model predictions and residuals from Eq. (3.9) and experimental observations (Table 2)
Now, let us test whether the model parameters can be replaced by arbitrary constant values.
For example, if we assume $n = 4.4$ and $E_a = 3000$ K, then we obtain:
stepnlm(log(Arrhenius$y),Arrhenius$x,param0=c(4.357406,3004.853511),
modelfn=modelfn2L,ptol=c(1e-6,1e-4),testparam=c(4.4,3000),optmethod=optim)
$bias
[1] -59.28836
$param
coeff Tvalue uncertainty
1 4.4 NaN 0
2 3000.0 NaN 0
$model_performance
s R2adj Nvalue Hvalue
0.3457520 0.9947229 -2.6301052 2.2655936
confirming that they are suitable assumptions for this data set. The $T$-values are undetermined
because we are assuming constant values with zero uncertainty.

3.3. Vapor Pressure Model
The behavior of vapor pressure of liquids ($P_v$) as a function of temperature ($T$) is typically
modeled using the empirical Antoine equation [10]:
$$\ln\left(P_v\right) = A - \frac{B}{T + C} \tag{3.12}$$
where $A$, $B$, and $C$ are model parameters.
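An alternative empirical model for describing vapor pressure has been proposed having only
two model parameters [11]:
$$\frac{P_v(T)}{P_{ref}} = \left(\frac{T}{T_{ref}}\;\frac{\mathrm{erfc}\left(\beta_{[1]}/T\right)}{\mathrm{erfc}\left(\beta_{[1]}/T_{ref}\right)}\right)^{\beta_{[2]}} \tag{3.13}$$
where $T_{ref}$ and $P_{ref}$ represent a reference observation, and $\beta_{[1]}$ and $\beta_{[2]}$ are the model
parameters.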
Structure optimization and parameter estimation for both models will be performed
considering the vapor pressure of pure water at different temperatures in the range
from 255.85 K to 647.10 K [12-14]. The data is summarized in Table 3.
The Antoine model is implemented in R using the following function:
modelfn3A<-function(x,param){
exp(param[1]-param[2]/(x+param[3]))
}
On the other hand, the alternative model considering $T_{ref} = 373.15$ K and $P_{ref} = 1$ atm is
implemented as follows:
modelfn3B<-function(x,param){
((x/373.15)*erfc(param[1]/x)/erfc(param[1]/373.15))^param[2]
}
where erfc is:
erfc<-function(x) 2*pnorm(x*sqrt(2),lower.tail=FALSE)
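A quick numerical check of these two definitions is sketched below. It verifies the complementary error function built from `pnorm` against two known values and confirms that the alternative model returns the reference pressure (1 atm) at the reference temperature for any parameter values. The parameter values used are only illustrative.

```r
# Complementary error function expressed through the normal distribution
erfc <- function(x) 2 * pnorm(x * sqrt(2), lower.tail = FALSE)

# Alternative vapor pressure model with reference point (373.15 K, 1 atm)
modelfn3B <- function(x, param) {
  ((x / 373.15) * erfc(param[1] / x) / erfc(param[1] / 373.15))^param[2]
}

erfc_at_zero <- erfc(0)     # the complementary error function equals 1 at 0
erfc_at_one  <- erfc(1)     # tabulated value is approximately 0.1572992

# At the reference temperature every factor inside the power equals 1
p_ref <- modelfn3B(373.15, c(459.37, 2.8176))

c(erfc0 = erfc_at_zero, erfc1 = erfc_at_one, p_ref = p_ref)
```

Because the reference observation is built into the function, the model reproduces that point exactly and no multiplicative constant has to be estimated.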
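The optimization of Antoine’s model is arbitrarily started from a previously reported parameter
set: $[A \;\; B \;\; C] = [12.549 \;\; 4646.02 \;\; 0.001341]$. For the alternative empirical model, the initial parameter
values are: $[\beta_{[1]} \;\; \beta_{[2]}] = [459.37 \;\; 2.8176]$.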

Table 3. Vapor pressure data for water between 255.85 K and 647.10 K [12-14].
T (K)   Pv (atm)  T (K)   Pv (atm)  T (K)   Pv (atm)
255.85  0.0013    333.15  0.197     473.15  15.35
273.16  0.0060    339.65  0.263     486.25  20.00
274.35  0.0066    343.15  0.308     493.15  22.89
275.15  0.0070    353.15  0.468     494.15  23.34
277.15  0.0080    356.15  0.526     498.15  25.17
283.15  0.0121    363.15  0.693     507.75  30.00
284.45  0.0132    369.15  0.866     513.15  33.03
287.15  0.0158    373.15  1.000     523.15  39.25
291.15  0.0204    379.15  1.230     524.25  40.00
293.15  0.0231    383.15  1.420     533.15  46.31
295.35  0.0263    393.15  1.960     537.85  50.00
298.15  0.0313    393.25  2.000     548.15  58.70
303.15  0.0419    398.15  2.290     549.65  60.00
307.15  0.0526    409.48  3.210     553.15  63.33
307.25  0.0526    413.15  3.570     573.15  84.76
313.15  0.0729    423.15  4.700     593.15  111.4
314.75  0.0789    425.55  5.000     613.15  144.1
317.15  0.0899    433.15  6.100     633.15  184.2
323.15  0.122     448.15  8.810     647.10  217.8
324.75  0.132     453.15  9.900
327.15  0.148     453.65  10.00
stepnlm(VaporPressure$Pv,VaporPressure$T,param0=c(12.549,4646.02,0.001341),
modelfn=modelfn3A,ptol=c(1e-6,1e-4,1e-6),optmethod=optim)
$bias
[1] 0
$param
coeff Tvalue uncertainty
1 12.61495 1186.0166 0.0106364
2 4683.13217 711.0819 6.5859250
3 0.00000 NaN 0.0000000
$model_performance
s R2adj Nvalue Hvalue
0.2634182 0.9999641 -18.2642636 -55.6057223
Warning message:
Now, considering the logarithm transformation of vapor pressure, the new model becomes:
modelfn3AL<-function(x,param){
-param[1]/(x+param[2])
}
Parameter $A$ is removed from the function code, as it will now be represented by the model
bias.

stepnlm(log(VaporPressure$Pv),VaporPressure$T,param0=c(4646.02,0.001341),
modelfn=modelfn3AL,ptol=c(1e-4,1e-6),optmethod=optim)
$bias
[1] 11.72558
$param
coeff Tvalue uncertainty
1 3847.95216 157.23624 24.472426
2 -44.94323 -43.84521 1.025043
$model_performance
s R2adj Nvalue Hvalue
0.01914141 0.99996602 -21.15615121 -50.47679520
Warning message:
Unfortunately, in this case heteroscedasticity was not removed by the transformation, as can
be observed in Figure 5. Notice that heteroscedasticity in this example is caused by a structural
deviation of the model rather than by an effect of scale (as in the previous example). For that
reason, no nonlinear transformation will remove heteroscedasticity. Nevertheless, the model
fitness is extremely satisfactory for descriptive purposes, as can be confirmed in Figure 6.
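The parameters obtained on the logarithmic scale can be mapped back to the original Antoine form, because the bias term of the transformed model plays the role of the first Antoine parameter. The sketch below uses the values reported above for the log-transformed fit and evaluates the prediction at a few temperatures, including the normal boiling point of water; the test temperatures are arbitrary choices.

```r
modelfn3A <- function(x, param) {
  exp(param[1] - param[2] / (x + param[3]))
}
modelfn3AL <- function(x, param) {
  -param[1] / (x + param[2])
}

# Values reported for the log-transformed fit
bias    <- 11.72558
par_log <- c(3847.95216, -44.94323)

temp <- c(300, 373.15, 500)    # temperatures in K

# Prediction from the log-scale model, converted back to atmospheres
p_from_log <- exp(bias + modelfn3AL(temp, par_log))

# Same prediction using the original Antoine function with bias as parameter 1
p_direct <- modelfn3A(temp, c(bias, par_log))

equivalent <- isTRUE(all.equal(p_from_log, p_direct))
p_boiling <- p_from_log[2]     # expected to be close to 1 atm
c(equivalent = equivalent, p_boiling = p_boiling)
```

The prediction at 373.15 K lies close to 1 atm, consistent with the good descriptive performance noted above despite the heteroscedastic residuals.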
Figure 5. Model residuals as function of Temperature obtained for Antoine Equation of water. Left plot:
Eq. (3.12). Right plot: Logarithm transformation of the vapor pressure.
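Since the adjusted coefficient of determination (R2adj) is almost identical for both approaches
(0.99996 in each case), but the uncertainty in the estimation of parameters is smaller for the
untransformed model, we may conclude that the optimal parameter values for Antoine’s model of
water vapor pressure (in atmospheres), over the temperature range of the data in Table 3, are
those obtained for the untransformed model, with the third parameter removed:
param[1] = 12.615 ± 0.011
param[2] = 4683.1 ± 6.6
(3.14)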

Figure 6. Model predictions and experimental observations as function of temperature obtained for
Antoine Equation of water. Left plot: Eq. (3.12). Right plot: Logarithm transformation of the vapor
pressure.
Proceeding similarly with the alternative empirical model given by Eq. (3.13), the following
results are obtained:
stepnlm(VaporPressure$Pv,VaporPressure$T,param0=c(459.37,2.8176),modelfn=modelfn3B,ptol=c(1e-4,1e-6),optmethod=optim)
$bias
[1] 0
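$param
coeff Tvalue uncertainty
1 421.127693 151.1680 2.78582612
2 3.119176 137.3454 0.02271046
$model_performance
s R2adj Nvalue Hvalue
0.2182447 0.9999754 -15.9727490 -62.2246764
Warning message:
Model residuals are heteroscedastic! Use a suitable transformation of the response variable, or a different model, and try again.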
stepnlm(log(VaporPressure$Pv),VaporPressure$T,param0=c(459.37,2.8176),modelfn=modelfn3BL,ptol=c(1e-4,1e-6),optmethod=optim)
$bias
[1] 0
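$param
coeff Tvalue uncertainty
1 430.372634 137.0045 3.14130383
2 3.043107 106.5623 0.02855708
$model_performance
s R2adj Nvalue Hvalue
0.01539341 0.99997803 -13.40162090 -26.33647921
Warning message:
Model residuals are heteroscedastic! Use a suitable transformation of the response variable, or a different model, and try again.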
using:

modelfn3BL<-function(x,param){
log(((x/373.15)*erfc(param[1]/x)/erfc(param[1]/373.15))^param[2])
}
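Base R does not provide erfc, so modelfn3BL needs a helper if one is not already defined in the session. A minimal sketch, using the standard identity between the complementary error function and the normal distribution function, is shown below. The function body of modelfn3BL is repeated only to make the block self-contained.

```r
# Complementary error function via the standard normal distribution
erfc <- function(x) 2 * pnorm(x * sqrt(2), lower.tail = FALSE)
modelfn3BL <- function(x, param) {
  log(((x / 373.15) * erfc(param[1] / x) / erfc(param[1] / 373.15))^param[2])
}
print(erfc(0))
print(erfc(1))
# At the reference temperature of 373.15 K the function returns log(1) = 0
print(modelfn3BL(373.15, c(430.372634, 3.043107)))
```

The last line shows that the model structure is normalized at 373.15 K, so the logarithm of the predicted pressure is zero there regardless of the parameter values.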
The results obtained are graphically summarized in Figure 7. In practice, the behavior of this
model is similar to Antoine’s model (and even slightly better), and is also suitable for
descriptive purposes.
Figure 7. Model predictions (top plots) and residuals (bottom plots) of vapor pressure of water as
functions of temperature, obtained for Eq. (3.13). Left plot: Original scale. Right plot: Logarithm
transformation of the vapor pressure.
3.4. Barometric Model
Let us now consider a general barometric model obtained after combining the structure of
different barometric models previously reported [15,16]. The general model describes the
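vertical profile of atmospheric pressure ($P$) as a function of altitude ($z$):
$$P(z) = \beta_1 e^{-\beta_2 z} \left(1 - \beta_3 z\right)^{\beta_4}$$
(3.15)
Or equivalently,
$$\ln P(z) = \ln \beta_1 - \beta_2 z + \beta_4 \ln\left(1 - \beta_3 z\right)$$
(3.16)
where $\beta_1, \ldots, \beta_4$ denote the model parameters.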
The optimal parameter values will be obtained considering the set of experimental
observations summarized in Table 4.
The model functions employed are:
modelfn4<-function(x,param){
param[1]*exp(-param[2]*x)*(1-x*param[3])^param[4]
}
modelfn4L<-function(x,param){
-param[1]*x+param[3]*log(1-x*param[2])
}
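and the initial estimations of the parameter values are param0=c(1,0.02,0.02,2) for the
untransformed model and param0=c(0.02,0.02,2) for the logarithm-transformed model.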
The following results are obtained:
stepnlm(Barometric$P,Barometric$Z,modelfn=modelfn4,param0=c(1,0.02,0.02,2),ptol=c(1e-4,1e-6,1e-6,1e-4),optmethod=optim)
Error in solve.default(t(phi1) %*% phi1, tol = NULL) :
Lapack routine dgesv: system is exactly singular: U[3,3] = 0
stepnlm(log(Barometric$P),Barometric$Z,modelfn=modelfn4L,param0=c(0.02,0.02,2),ptol=c(1e-6,1e-6,1e-4),optmethod=optim)
$bias
[1] 0
$param
coeff Tvalue uncertainty
1 0.00000000 NaN 0.000000000
2 0.02637225 13.02545 0.002024671
3 4.15025167 11.76454 0.352776505
$model_performance
s R2adj Nvalue Hvalue
0.006847938 0.995205953 -10.843383640 -30.167745076
During the execution of the procedure, the untransformed model resulted in a singular matrix.
The logarithm transformation, on the other hand, converged to an optimal result, although the
residuals were heteroscedastic (see Figure 8). Interestingly, the first parameter of modelfn4L
(the coefficient of the exponential term) was set to zero, simplifying the model structure to:
$$\ln P(z) = 4.15 \ln\left(1 - 0.0264\, z\right)$$
(3.17)
where $P$ is the atmospheric pressure (in atm) and $z$ is the altitude (in km).
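As a quick check of the simplified structure, the sketch below evaluates it with the fitted values reported in the output above and compares the predictions with three observations taken from Table 4. The function name baro317 and the argument names b and m are hypothetical.

```r
# Simplified barometric model with the fitted values reported above
baro317 <- function(z, b = 0.02637225, m = 4.15025167) {
  (1 - b * z)^m      # pressure in atm, altitude z in km
}
z_obs <- c(0.000, 3.654, 8.848)      # altitudes from Table 4 (km)
P_obs <- c(1.0000, 0.6346, 0.3316)   # pressures from Table 4 (atm)
print(cbind(z_obs, P_obs, P_pred = baro317(z_obs)))
```

Because the bias is zero, the model returns exactly 1 atm at zero altitude. The expression is only meaningful while 1 - b*z remains positive, that is, well beyond the range of altitudes in Table 4.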
Table 4. Atmospheric pressure values at different altitudes [15]
Altitude (km) Pressure (atm) Altitude (km) Pressure (atm) Altitude (km) Pressure (atm) Altitude (km) Pressure (atm)
0.000 1.0000 0.650 0.9317 0.915 0.9030 1.425 0.8537
0.504 0.9376 0.658 0.9307 0.933 0.9011 1.432 0.8438
0.507 0.9386 0.659 0.9287 0.936 0.8971 1.479 0.8527
0.513 0.9396 0.662 0.9307 0.943 0.9020 1.510 0.8399
0.514 0.9386 0.664 0.9257 0.946 0.8991 1.522 0.8458
0.515 0.9405 0.666 0.9159 0.950 0.9090 1.531 0.8389
0.522 0.9445 0.666 0.9218 0.955 0.9001 1.533 0.8497
0.524 0.9376 0.667 0.9297 0.955 0.9001 1.543 0.8330
0.526 0.9396 0.670 0.9238 0.956 0.8991 1.543 0.8409
0.526 0.9376 0.671 0.9228 0.960 0.8951 1.555 0.8399
0.532 0.9445 0.673 0.9218 0.971 0.9011 1.566 0.8330
0.533 0.9386 0.674 0.9257 0.977 0.8922 1.593 0.8270
0.533 0.9396 0.686 0.9287 0.989 0.8971 1.621 0.8359
0.536 0.9396 0.689 0.9218 0.997 0.8863 1.647 0.8221
0.540 0.9376 0.692 0.9218 1.031 0.8932 1.680 0.8261
0.545 0.9376 0.698 0.9257 1.054 0.8912 1.680 0.8290
0.551 0.9336 0.709 0.9247 1.063 0.8833 1.683 0.8201
0.554 0.9267 0.712 0.9247 1.067 0.8833 1.750 0.8211
0.554 0.9317 0.713 0.9119 1.069 0.8784 1.808 0.8093
0.557 0.9405 0.717 0.9228 1.091 0.8902 1.817 0.8211
0.564 0.9356 0.753 0.9139 1.117 0.8784 1.855 0.8014
0.565 0.9336 0.758 0.9188 1.125 0.8823 1.870 0.8162
0.565 0.9405 0.759 0.9218 1.130 0.8754 1.872 0.8093
0.567 0.9307 0.760 0.9169 1.142 0.8774 1.876 0.8014
0.569 0.9386 0.760 0.9178 1.145 0.8833 1.880 0.8043
0.570 0.9326 0.765 0.9208 1.146 0.8784 1.909 0.8053
0.571 0.9415 0.767 0.9149 1.151 0.8803 2.074 0.8004
0.573 0.9326 0.769 0.9129 1.160 0.8774 2.110 0.7866
0.576 0.9376 0.779 0.9169 1.161 0.8784 2.132 0.7797
0.577 0.9474 0.790 0.9139 1.166 0.8803 2.139 0.7876
0.577 0.9346 0.790 0.9119 1.179 0.8774 2.155 0.7807
0.579 0.9336 0.792 0.9109 1.184 0.8715 2.187 0.7866
0.580 0.9336 0.802 0.9149 1.204 0.8665 2.225 0.7836
0.581 0.9415 0.816 0.9109 1.207 0.8744 2.241 0.7688
0.582 0.9307 0.822 0.9169 1.223 0.8626 2.250 0.7678
0.590 0.9317 0.824 0.9119 1.230 0.8705 2.250 0.7678
0.592 0.9297 0.830 0.9159 1.239 0.8695 2.307 0.7481
0.599 0.9287 0.832 0.9090 1.287 0.8665 2.345 0.7728
0.606 0.9326 0.832 0.9040 1.298 0.8497 2.354 0.7678
0.607 0.9336 0.838 0.9099 1.305 0.8606 2.440 0.7639
0.615 0.9257 0.847 0.9099 1.310 0.8655 2.554 0.7520
0.629 0.9376 0.860 0.9129 1.312 0.8655 2.584 0.7520
0.632 0.9356 0.871 0.9040 1.366 0.8586 2.618 0.7471
0.635 0.9238 0.875 0.9050 1.386 0.8586 2.826 0.7283
0.640 0.9218 0.876 0.9040 1.396 0.8448 3.654 0.6346
0.643 0.9356 0.877 0.9050 1.397 0.8537 8.848 0.3316

Figure 8. Model predictions (left plot) and residuals (right plot) of atmospheric pressure (logarithm) as a
function of altitude, obtained for Eq. (3.17).
Although heteroscedasticity is not evident graphically, the statistical test is strongly
influenced by the scarcity of data at both very low and very high altitudes. Nevertheless, a
good description of the experimental data is obtained. In this case too, the lack of
homoscedasticity may be due to the constant lapse rate assumption; a model with a non-constant
lapse rate might reduce the heteroscedasticity.
3.5. Oscillatory Damping Model
Maali et al. [17] reported the behavior of the interaction stiffness versus the gap between a
cantilever tip of an atomic force microscope (AFM) and the surface of liquid
octamethylcyclotetrasiloxane. A sample of data extracted from [17] is summarized in Table 5.
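Let us now consider the following empirical oscillatory damping model for stiffness ($S$) as a
function of the gap ($x$):
$$S(x) = \beta_1 e^{-\beta_2 x} + \beta_3 e^{-\beta_4 x} \cos\left(\beta_5 x - \beta_6\right)$$
(3.18)
where $\beta_1, \ldots, \beta_6$ denote the model parameters.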
The model function implemented in R language is:
modelfn5<-function(x,param){
param[1]*exp(-param[2]*x)+param[3]*exp(-param[4]*x)*cos(param[5]*x-param[6])
}
The starting values considered for the parameters are param0=c(10,0.35,1.6,0.15,0.8,0).
These values were obtained by simple trial and error, until a reasonable initial fit was
obtained. For this example, an initial is obtained. An experimental measurement error of
0.1/√12 N/m (ErrSD=0.1/sqrt(12)) is arbitrarily considered.
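The text only states that this measurement error is arbitrary. One possible reading, offered here as an assumption, is that 0.1/√12 corresponds to the standard deviation of an error uniformly distributed over an interval of total width 0.1 N/m. The sketch below compares the closed-form value with a Monte Carlo estimate; the variable names are illustrative.

```r
width <- 0.1                  # assumed total width of the error interval (N/m)
err_sd <- width / sqrt(12)    # standard deviation of a uniform distribution
print(err_sd)
# Monte Carlo check with uniform errors on [-width/2, width/2]
set.seed(123)
print(sd(runif(1e5, -width / 2, width / 2)))
```

The value err_sd is what the ErrSD argument of stepnlm receives, and the function squares it internally to obtain the error variance.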
Table 5. Stiffness vs. cantilever gap in AFM [17]
Gap (Å) Stiffness (N/m) Gap (Å) Stiffness (N/m) Gap (Å) Stiffness (N/m) Gap (Å) Stiffness (N/m)
10.28 0.338 13.92 0.099 21.83 -0.004 35.96 -0.022
10.35 0.279 14.13 0.115 22.05 0.015 37.25 -0.014
10.42 0.236 14.34 0.131 22.20 0.029 37.89 -0.009
10.49 0.206 14.56 0.150 22.48 0.042 38.32 -0.004
10.56 0.198 14.77 0.163 22.69 0.053 39.17 0.004
10.63 0.069 14.98 0.181 22.91 0.061 39.39 0.010
10.70 0.139 15.63 0.187 23.33 0.069 40.24 0.004
10.77 0.109 15.78 0.228 23.76 0.074 41.10 -0.004
10.84 0.061 16.06 0.187 24.19 0.058 41.53 -0.004
10.91 0.050 16.27 0.204 24.62 0.048 41.96 -0.004
10.98 0.039 16.48 0.166 24.83 0.034 42.60 -0.017
11.06 0.010 16.70 0.150 25.26 0.010 43.46 -0.020
11.14 -0.004 16.91 0.139 25.61 -0.001 44.10 -0.020
11.22 -0.009 17.13 0.115 25.90 -0.017 44.95 -0.014
11.30 -0.020 17.28 0.128 26.33 -0.036 45.38 -0.012
11.38 -0.039 17.34 0.096 27.19 -0.044 45.60 -0.014
11.46 -0.030 17.49 0.080 27.40 -0.044 46.67 -0.006
11.54 -0.047 17.55 0.058 28.04 -0.036 47.74 0.002
11.62 -0.052 17.77 0.039 28.47 -0.028 48.59 -0.004
11.70 -0.060 17.98 0.031 28.69 -0.017 49.24 -0.009
11.78 -0.047 18.41 0.010 28.90 -0.012 49.88 -0.009
11.99 -0.041 18.56 -0.004 29.54 -0.001 50.73 -0.012
12.33 -0.030 18.62 -0.025 30.61 0.004 51.38 -0.017
12.63 -0.022 19.05 -0.044 31.04 0.013 52.23 -0.009
12.75 -0.009 19.27 -0.057 31.47 0.021 53.09 -0.009
12.87 0.002 19.69 -0.065 31.68 0.023 53.94 -0.012
12.99 0.018 20.12 -0.074 32.32 0.015 55.23 -0.012
13.11 0.029 20.55 -0.068 32.97 0.010 55.44 -0.006
13.23 0.045 20.76 -0.052 33.39 -0.001 56.30 -0.012
13.35 0.058 20.91 -0.044 34.46 -0.014 56.94 -0.009
13.47 0.069 21.19 -0.033 34.68 -0.020 58.01 -0.009
13.62 0.074 21.34 -0.025 35.11 -0.022 59.30 -0.012
13.77 0.091 21.62 -0.001 35.32 -0.036 59.72 -0.009
The optimization results are the following:
stepnlm(Damping$S,Damping$x,modelfn=modelfn5,param0=c(10,0.35,1.6,0.15,0.8,0),ptol=c(1e-4,1e-6,1e-5,1e-6,1e-5,1e-4),optmethod=optim,ErrSD=0.1/sqrt(12))
$bias
[1] 0
$param
coeff Tvalue uncertainty
1 9.31074116 3.7242487 2.500032065
2 0.31418664 14.5560461 0.021584614
3 1.61828294 8.3911594 0.192855703
4 0.14869886 18.6481041 0.007973940
5 0.78590415 94.1577711 0.008346673
6 -0.05466918 0.4313885 0.126728428
$model_performance
s R2adj CF Nvalue Hvalue
0.0185772 0.9419912 0.5513500 -7.0771538 -13.9300396
Warning message:
Model residuals are heteroscedastic! Use a suitable transformation of the response variable, or a different model, and try again.


The model performance is illustrated in Figure 9.
Figure 9. Model predictions (left plot) and residuals (right plot) of interaction stiffness as a function of
cantilever gap, obtained for Eq. (3.18).
Heteroscedasticity of residuals, again in this example, indicates a structural lack of fit. In other
words, the model structure might be further improved.
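The fitted coefficients can be translated into characteristic lengths of the model function given above. The sketch below uses the coefficients reported in the output, and relies only on the form of the model function: the cosine term has a period of 2π/param[5], and each exponential decays by a factor e over a distance equal to the reciprocal of its rate. The variable names are illustrative.

```r
# Fitted coefficients reported in the output above
fit5 <- c(9.31074116, 0.31418664, 1.61828294, 0.14869886, 0.78590415, -0.05466918)
period <- 2 * pi / fit5[5]    # oscillation period, in gap units
decay_bg <- 1 / fit5[2]       # decay length of the non-oscillatory term
decay_osc <- 1 / fit5[4]      # decay length of the oscillatory envelope
print(c(period = period, decay_bg = decay_bg, decay_osc = decay_osc))
```

These lengths are expressed in the same units as the gap in Table 5.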
4. Conclusion
A numerical procedure for optimizing the mathematical structure of models including nonlinear
functions of the model parameters was presented. The procedure was implemented in R
language using the function stepnlm. This procedure combines nonlinear regression with a
stepwise elimination procedure based on the estimation uncertainty of the model parameters.
The estimation uncertainty is determined using a linear approximation about the optimal
parameter values. One of the main assumptions of this procedure is the homoscedasticity of
model residuals, which is evaluated using a statistical test.
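The linear approximation behind the uncertainty estimates can be illustrated with a reduced sketch on synthetic data. It is not the stepnlm implementation: the model, data and names below are assumptions made for illustration, and the columns of the derivative matrix are not mean-centered as they are in the Appendix, because no bias term is considered here.

```r
# Synthetic example: y = a*exp(-b*x) + noise (illustrative data only)
set.seed(42)
x <- seq(0, 5, length.out = 50)
f <- function(x, p) p[1] * exp(-p[2] * x)
y <- f(x, c(2, 0.7)) + rnorm(50, sd = 0.02)
# Least-squares estimate using optim (Nelder-Mead by default)
fit <- optim(c(1, 1), function(p) sum((y - f(x, p))^2))
p_hat <- fit$par
# Central-difference derivatives of the model at the estimate
h <- 1e-6
J <- sapply(seq_along(p_hat), function(l) {
  dp <- replace(numeric(length(p_hat)), l, h)
  (f(x, p_hat + dp) - f(x, p_hat - dp)) / (2 * h)
})
# Residual variance and linearized parameter uncertainty
s2 <- sum((y - f(x, p_hat))^2) / (length(y) - length(p_hat))
p_sd <- sqrt(diag(solve(t(J) %*% J)) * s2)
print(cbind(estimate = p_hat, uncertainty = p_sd))
```

A parameter whose estimate is smaller than its uncertainty would be a candidate for elimination in the stepwise procedure.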
Several examples are considered for illustrating the proposed procedure. It is shown that this
method is general and also works for multiple linear regression models (linear with respect to
the model parameters). It was also shown that model structures leading to homoscedastic
residuals provide a more reliable estimation of parameter values. Models with heteroscedastic
residuals may still be used for descriptive purposes only.
Acknowledgment and Disclaimer
This report provides data, information and conclusions obtained by the author(s) as a result of original
scientific research, based on the best scientific knowledge available to the author(s). The main purpose
of this publication is the open sharing of scientific knowledge. Any mistake, omission, error or inaccuracy
published, if any, is completely unintentional.
This research did not receive any specific grant from funding agencies in the public, commercial, or not-
for-profit sectors.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC
4.0). Anyone is free to share (copy and redistribute the material in any medium or format) or adapt
(remix, transform, and build upon the material) this work under the following terms:
• Attribution: Appropriate credit must be given, providing a link to the license, and indicating if
changes were made. This can be done in any reasonable manner, but not in any way that
suggests endorsement by the licensor.
• NonCommercial: This material may not be used for commercial purposes.
References
[1] Hernandez, H. (2023). Optimal Model Structure Identification. 1. Multiple Linear Regression.
ForsChem Research Reports, 8, 2023-13, 1 - 53. doi: 10.13140/RG.2.2.31051.57121.
[2] Hernandez, H. (2022). Standard Deterministic, Standard Random, and Randomistic Variables.
ForsChem Research Reports, 7, 2022-06, 1 - 18. doi: 10.13140/RG.2.2.36316.87688.
[3] Hernandez, H. (2018). Multidimensional Randomness, Standard Random Variables and Variance
Algebra. ForsChem Research Reports, 3, 2018-02, 1-35. doi: 10.13140/RG.2.2.11902.48966.
[4] Hernandez, H. (2023). Multi-Algorithm Optimization. ForsChem Research Reports, 8, 2023-12, 1 -
35. doi: 10.13140/RG.2.2.21772.49284.
[5] Hernandez, H. (2021). Testing for Normality: What is the Best Method? ForsChem Research
Reports, 6, 2021-05, 1-38. doi: 10.13140/RG.2.2.13926.14406.
[6] Hernandez, H. (2023). Evaluating Scedasticity using H-values. ForsChem Research Reports, 8, 2023-
16, 1 - 35. doi:
[7] Box, G. E., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical
Society Series B: Statistical Methodology, 26 (2), 211-243. doi: 10.1111/j.2517-6161.1964.tb00553.x.
[8] Hernandez, H. (2019). Collision Energy between Maxwell-Boltzmann Molecules: An Alternative
Derivation of Arrhenius Equation. ForsChem Research Reports, 4, 2019-13, 1-27. doi:
10.13140/RG.2.2.21596.33926.
[9] Baulch, D. L., Bowman, C. T., Cobos, C. J., Cox, R. A., Just, T., Kerr, J. A., ... & Walker, R. W. (2005).
Evaluated kinetic data for combustion modeling: Supplement II. Journal of Physical and Chemical
Reference Data, 34 (3), 757-1397. doi: 10.1063/1.1748524.
[10] Thomson, G. W. (1946). The Antoine equation for vapor-pressure data. Chemical Reviews, 38 (1), 1-
39. doi: 10.1021/cr60119a001.
[11] Hernandez, H. (2022). Molecular Modeling of Macroscopic Phase Changes 2: Vapor Pressure
Parameters. ForsChem Research Reports, 7, 2022-16, 1 - 43. doi: 10.13140/RG.2.2.10226.38086.
[12] Stull, D. R. (1947). Vapor Pressure of Pure Substances. Inorganic Compounds. Industrial &
Engineering Chemistry, 39 (4), 540-550. doi: 10.1021/ie50448a023.

[13] Liu, C. T., & Lindsay Jr, W. T. (1970). Vapor pressure of deuterated water from 106 to 300. deg.
Journal of Chemical and Engineering Data, 15 (4), 510-513. doi: 10.1021/je60047a015.
[14] The Engineering ToolBox (2010). Water - Heat of Vaporization vs. Temperature. Available at:
https://www.engineeringtoolbox.com/water-properties-d_1573.html. Accessed: December 1, 2023.
[15] NOAA, NASA & USAF (1976). US standard atmosphere, 1976. Washington.
https://www.ngdc.noaa.gov/stp/space-weather/online-publications/miscellaneous/us-standard-
atmosphere-1976/us-standard-atmosphere_st76-1562_noaa.pdf.
[16] Hernandez, H. (2020). A Barometric Formula without the Hydrostatic Pressure Assumption.
ForsChem Research Reports, 5, 2020-14, 1-22. doi: 10.13140/RG.2.2.20093.49126.
[17] Maali, A., Cohen-Bouhacina, T., Couturier, G., & Aimé, J. P. (2006). Oscillatory dissipation of a
simple confined liquid. Physical Review Letters, 96 (8), 086105. doi:
10.1103/PhysRevLett.96.086105.
Appendix. R function
stepnlm<-function(y,x,param0,modelfn,testparam=NULL,optmethod=NULL,ptol=1e-8,ErrSD=NULL,Nmin=-Inf){
if(is.function(modelfn)==FALSE) stop("modelfn is not a suitable R function")
if (is.null(ErrSD)) ErrSD=0
ErrVar=ErrSD^2
#Regressand
Y=data.frame(y)
#Number of observations
n=nrow(Y)
#Independent variables
X=data.frame(x)
#Number of independent variables
nx=ncol(X)
#Number of model parameters
np=length(param0)
if (length(ptol)==1) ptol=ptol*rep(1,np)
if (is.null(testparam)) testparam=rep(0,np)
fixed=rep(FALSE,np)
#Initial model evaluation

fM0=c()
for (i in 1:n){
fM0[i]=modelfn(X[i,],param0)
}
res0=y-fM0
res0=res0-mean(res0)
VR0=sum(res0^2)/(n-np+sum(fixed)-1)
#Initial derivatives evaluation
dfM0=matrix(NA,nrow=n,ncol=np)
for (i in 1:n){
for (l in 1:np){
dp=rep(0,np)
dp[l]=ptol[l]
dfM0[i,l]=0.5*(modelfn(X[i,],param0+dp)-modelfn(X[i,],param0-dp))/ptol[l]
}
}
#Initial regressor matrix
phi0=matrix(NA,nrow=n,ncol=np)
for (l in 1:np){
phi0[,l]=dfM0[,l]-mean(dfM0[,l])

}
#Parameter Uncertainty and Identifiability
Q0=NULL
try(Q0<-solve(t(phi0)%*%phi0),silent=TRUE)
if (is.null(Q0)) {
try(print(det(solve(t(phi0)%*%phi0),tol=NULL)))
stop("modelfn is an unidentifiable function. Please check the model structure (change structure, re-scale or remove parameters) and try again!")
}
Vp0=diag(Q0)*VR0
t2p0=(param0-testparam)^2/Vp0
CF0=1-abs((VR0-ErrVar)/(VR0+ErrVar))
if (exists("N.norm.test")){
N0=N.norm.test(res0,display=FALSE)$N
} else {
N0=Nmin
}
#Parameter estimation
if (is.null(optmethod)){
Y=y-mean(y)-(fM0-mean(fM0))+(phi0%*%param0)
paramopt=Q0%*%(t(phi0)%*%Y)
pdiff=abs(paramopt-param0)
pdmax=max(pdiff-ptol)
while (pdmax>0){
param=paramopt
fM=c()
for (i in 1:n){
fM[i]=modelfn(X[i,],param)
}
dfM=matrix(NA,nrow=n,ncol=np)
for (i in 1:n){
for (l in 1:np){
dp=rep(0,np)
dp[l]=ptol[l]
dfM[i,l]=0.5*(modelfn(X[i,],param+dp)-modelfn(X[i,],param-dp))/ptol[l]
}
}
phi=matrix(NA,nrow=n,ncol=np)
for (l in 1:np){
phi[,l]=dfM[,l]-mean(dfM[,l])
}
Q=solve(t(phi)%*%phi,tol=NULL)
Y=y-mean(y)-(fM-mean(fM))+(phi%*%param)
paramopt=Q%*%(t(phi)%*%Y)
pdiff=abs(paramopt-param)
if (pdmax>max(pdiff-ptol)){
pdmax=max(pdiff-ptol)
} else {
stop("The default identification method diverges. Please use a different method or a different starting point and try again!")
}
}
} else {
save(y,X,fixed,testparam,file="stepnlmTemp.R")
objfn<-function(p){
load("stepnlmTemp.R")
param=p
param[which(fixed==TRUE)]=testparam[which(fixed==TRUE)]
fM=c()
for (i in 1:n){
fM[i]=modelfn(X[i,],param)
}
return(var(fM)-2*cov(y,fM))
}
paramopt=optmethod(par=param0,fn=objfn)$par
}
fM1=c()

for (i in 1:n){
fM1[i]=modelfn(X[i,],paramopt)
}
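res1=y-fM1
#Centered residuals and residual variance (as in the initial model evaluation)
res1=res1-mean(res1)
VR1=sum(res1^2)/(n-np+sum(fixed)-1)
#Derivatives evaluation at the optimal parameter values
dfM1=matrix(NA,nrow=n,ncol=np)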
for (i in 1:n){
for (l in 1:np){
if (fixed[l]==FALSE){
dp=rep(0,np)
dp[l]=ptol[l]
dfM1[i,l]=0.5*(modelfn(X[i,],paramopt+dp)-modelfn(X[i,],paramopt-dp))/ptol[l]
}
}
}
phi1=matrix(NA,nrow=n,ncol=np-sum(fixed))
counter=0
for (l in 1:np){
if (fixed[l]==FALSE) {
counter=counter+1
phi1[,counter]=dfM1[,l]-mean(dfM1[,l])
}
}
Q1=solve(t(phi1)%*%phi1,tol=NULL)
Vp1=rep(NA,np)
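for (l in 1:np){
if (fixed[l]==FALSE){
Vp1[l]=Q1[l,l]*VR1
} else {
Vp1[l]=0
}
}
t2p1=(paramopt-testparam)^2/Vp1
CF1=1-abs((VR1-ErrVar)/(VR1+ErrVar))
if (exists("N.norm.test")){
N1=N.norm.test(res1,display=FALSE)$N
} else {
N1=Nmin
}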
while (min(t2p1,na.rm=TRUE)<=1){
j=which(t2p1==min(t2p1,na.rm=TRUE))
fixed[j]=TRUE
param2=paramopt
param2[which(fixed==TRUE)]=testparam[which(fixed==TRUE)]
if (is.null(optmethod)){
pdiff=abs(paramopt-param2)
paramopt2=param2
while (max(pdiff-ptol)>0){
param=paramopt2
fM=c()
for (i in 1:n){
fM[i]=modelfn(X[i,],param)
}
dfM=matrix(NA,nrow=n,ncol=np)
for (i in 1:n){
for (l in 1:np){
if (fixed[l]==FALSE){
dp=rep(0,np)
dp[l]=ptol[l]
dfM[i,l]=0.5*(modelfn(X[i,],param+dp)-modelfn(X[i,],param-dp))/ptol[l]
}
}
}
phi=matrix(NA,nrow=n,ncol=np-sum(fixed))
counter=0
for (l in 1:np){
if (fixed[l]==FALSE){
counter=counter+1
phi[,counter]=dfM[,l]-mean(dfM[,l])
}
}
Q=solve(t(phi)%*%phi,tol=NULL)
Y=y-mean(y)-(fM-mean(fM))+(phi%*%param[which(fixed==FALSE)])
paramnew=Q%*%(t(phi)%*%Y)
counter=0
for (l in 1:np){
if (fixed[l]==FALSE){
counter=counter+1
paramopt2[l]=paramnew[counter]
} else {
paramopt2[l]=testparam[l]
}
}
pdiff=abs(param-paramopt2)
}
} else {
save(y,X,fixed,testparam,file="stepnlmTemp.R")
objfn<-function(p){
load("stepnlmTemp.R")
param=p
param[which(fixed==TRUE)]=testparam[which(fixed==TRUE)]
fM=c()
for (i in 1:n){
fM[i]=modelfn(X[i,],param)
}
return(var(fM)-2*cov(y,fM))
}
paramopt2=optmethod(par=param2,fn=objfn)$par
paramopt2[which(fixed==TRUE)]=testparam[which(fixed==TRUE)]
}
fM2=c()
for (i in 1:n){
fM2[i]=modelfn(X[i,],paramopt2)
}
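res2=y-fM2
#Centered residuals and residual variance of the reduced model
res2=res2-mean(res2)
VR2=sum(res2^2)/(n-np+sum(fixed)-1)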
if (VR2<VR1 & VR2>=ErrVar){
paramopt=paramopt2
fM1=fM2
res1=res2
VR1=VR2
for (i in 1:n){
for (l in 1:np){
if (fixed[l]==FALSE){
dp=rep(0,np)
dp[l]=ptol[l]
dfM1[i,l]=0.5*(modelfn(X[i,],paramopt+dp)-modelfn(X[i,],paramopt-dp))/ptol[l]
}
}
}
phi1=matrix(NA,nrow=n,ncol=np-sum(fixed))
counter=0
for (l in 1:np){
if (fixed[l]==FALSE){
counter=counter+1
phi1[,counter]=dfM1[,l]-mean(dfM1[,l])
}
}
if (sum(fixed)<np){
Q1=solve(t(phi1)%*%phi1,tol=NULL)
}

Vp1=rep(NA,np)
counter=0
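for (l in 1:np){
Vp1[l]=0
if (fixed[l]==FALSE){
counter=counter+1
if (Q1[counter,counter]>0) Vp1[l]=Q1[counter,counter]*VR1
}
}
t2p1=(paramopt-testparam)^2/Vp1
CF1=1-abs((VR1-ErrVar)/(VR1+ErrVar))
if (exists("N.norm.test")){
N1=N.norm.test(res1,display=FALSE)$N
} else {
N1=Nmin
}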
} else {
fixed[j]=FALSE
t2p1[j]=NA
}
}
if (exists("H.sked.test")){
H1=H.sked.test(list(res1,X),ttype="homoscedastic",display=FALSE)$H.value
} else {
H1=0
}
if (min(H1)<0) warning("Model residuals are heteroscedastic! Use a suitable transformation of the response variable, or a different model, and try again.",call.=FALSE)
bias=mean(y)-mean(fM1)
t2i=bias^2/VR1
if (t2i<=1){
resi=y-fM1
VRi=sum(resi^2)/(n-np+sum(fixed))
VR1=VRi
res1=resi
bias=0
t2i=NaN
}
#Fitness Coefficient & N-value validation
if (ErrVar==0) CF1=NULL
if (exists("N.norm.test")==FALSE) N1=NULL
uncertainty=sqrt(Vp1)
Tvalue=(paramopt-testparam)*sign(paramopt)/uncertainty
return(list(bias=bias,param=data.frame(coeff=paramopt,Tvalue=Tvalue,uncertainty=uncertainty),
model_performance=c(s=sqrt(VR1),R2adj=1-VR1/var(y),CF=CF1,Nvalue=N1,Hvalue=min(H1)),
ypred=bias+fM1,res=res1))
}
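The objective function used in the optmethod branch of stepnlm, var(fM)-2*cov(y,fM), differs from the variance of the residuals only by the constant var(y), so minimizing either one gives the same parameter values. The sketch below checks this identity on synthetic data; the model, data and names are illustrative assumptions and are not part of the function above.

```r
set.seed(7)
x <- seq(1, 10, length.out = 40)
f <- function(x, p) p[1] * x^p[2]
y <- 0.5 + f(x, c(3, 0.5)) + rnorm(40, sd = 0.05)   # includes a constant bias
# Objective in the form used by stepnlm, and plain residual variance
obj_cov <- function(p) { fM <- f(x, p); var(fM) - 2 * cov(y, fM) }
obj_var <- function(p) var(y - f(x, p))
p_test <- c(2.5, 0.6)
# Both objectives differ only by var(y), which does not depend on p
print(obj_var(p_test) - obj_cov(p_test) - var(y))
# The bias is then recovered as in stepnlm: mean(y) - mean(fM)
p_opt <- optim(c(1, 1), obj_cov)$par
print(mean(y) - mean(f(x, p_opt)))
```

Working with the variance rather than the sum of squares makes the objective insensitive to a constant offset, which is why the bias can be estimated separately after the optimization.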

