
Gauss-Newton Method for Ordinary Differential Equation (ODE) Models

In this chapter we are concentrating on the Gauss-Newton method for the estimation of unknown parameters in models described by a set of ordinary differential equations (ODEs).
6.1 FORMULATION OF THE PROBLEM

As mentioned in Chapter 2, the mathematical models are of the form

$$\frac{dx(t)}{dt} = f(x(t), u, k); \quad x(t_0) = x_0 \tag{6.1}$$

$$y(t) = Cx(t) \tag{6.2}$$

or, more generally,

$$y(t) = h(x(t), k) \tag{6.3}$$

where

$k = [k_1, k_2, \ldots, k_p]^T$ is a p-dimensional vector of parameters whose numerical values are unknown;

Copyright 2001 by Taylor & Francis Group, LLC


$x = [x_1, x_2, \ldots, x_n]^T$ is an n-dimensional vector of state variables;

$x_0$ is an n-dimensional vector of initial conditions for the state variables, which are assumed to be known precisely;

$u = [u_1, u_2, \ldots, u_r]^T$ is an r-dimensional vector of manipulated variables, which are either set by the experimentalist or have been measured, and whose numerical values are assumed to be precisely known;

$f = [f_1, f_2, \ldots, f_n]^T$ is an n-dimensional vector function of known form (the differential equations);

$y = [y_1, y_2, \ldots, y_m]^T$ is the m-dimensional output vector, i.e., the set of variables that are measured experimentally; and

$C$ is the m×n observation matrix, which indicates the state variables (or linear combinations of state variables) that are measured experimentally.

Experimental data are available as measurements of the output vector as a function of time, i.e., $[\hat{y}_i, t_i]$, i = 1,...,N, where $\hat{y}_i$ denotes the measurement of the output vector at time $t_i$. These are to be matched to the values calculated by the model at the same time, $y(t_i)$, in some optimal fashion. Based on the statistical properties of the experimental error involved in the measurement of the output vector, we determine the weighting matrices $Q_i$ (i = 1,...,N) that should be used in the objective function to be minimized, as mentioned earlier in Chapter 2. The objective function is of the form

$$S(k) = \sum_{i=1}^{N} [\hat{y}_i - y(t_i, k)]^T Q_i [\hat{y}_i - y(t_i, k)] \tag{6.4}$$
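As a minimal sketch of Equation 6.4, the objective function can be evaluated as shown below once the model outputs at the sampling times are available. The function name `objective`, the nested-list data layout and the numbers are illustrative assumptions and do not come from the original text.

```python
def objective(y_meas, y_model, Q):
    """Weighted sum of squares: S = sum_i r_i^T Q_i r_i, r_i = y_meas[i] - y_model[i]."""
    total = 0.0
    for y_hat, y_calc, Q_i in zip(y_meas, y_model, Q):
        r = [a - b for a, b in zip(y_hat, y_calc)]  # residual at data point i
        m = len(r)
        total += sum(r[a] * Q_i[a][b] * r[b] for a in range(m) for b in range(m))
    return total


# Two data points, two measured outputs each (made-up numbers).
y_meas = [[1.0, 2.0], [0.5, 1.5]]
y_model = [[0.9, 2.1], [0.5, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, 0.0], [0.0, 4.0]]

S_ls = objective(y_meas, y_model, [identity, identity])   # unweighted least squares
S_wls = objective(y_meas, y_model, [weights, weights])    # weighted least squares
```

Choosing identity matrices for all Q_i gives the ordinary least squares objective; any other positive definite choice gives a weighted version.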

Minimization of S(k) can be accomplished by using almost any technique available from optimization theory. However, since each objective function evaluation requires the integration of the state equations, the use of quadratically convergent algorithms is highly recommended. The Gauss-Newton method is the most appropriate one for ODE models (Bard, 1970) and it is presented in detail below.

6.2 THE GAUSS-NEWTON METHOD

Again, let us assume that an estimate $k^{(j)}$ of the unknown parameters is available at the jth iteration. Linearizing the output vector around $k^{(j)}$ and retaining first-order terms yields

$$y(t, k^{(j+1)}) = y(t, k^{(j)}) + \left(\frac{\partial y^T}{\partial k}\right)^T \Delta k^{(j+1)} \tag{6.5}$$

Assuming a linear relationship between the output vector and the state variables (y = Cx), the above equation becomes

$$y(t, k^{(j+1)}) = Cx(t, k^{(j)}) + C\left(\frac{\partial x^T}{\partial k}\right)^T \Delta k^{(j+1)} \tag{6.6}$$

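In the case of ODE models, the sensitivity matrix $G(t) = (\partial x^T/\partial k)^T$ cannot be obtained by a simple differentiation. However, we can find a differential equation that G(t) satisfies and hence, the sensitivity matrix G(t) can be determined as a function of time by solving another set of differential equations simultaneously with the state ODEs. This set of ODEs is obtained by differentiating both sides of Equation 6.1 (the state equations) with respect to k, namely

$$\frac{\partial}{\partial k}\left(\frac{dx}{dt}\right) = \frac{\partial}{\partial k} f(x, u, k) \tag{6.7}$$

Reversing the order of differentiation on the left-hand side of Equation 6.7 and performing the implicit differentiation of the right-hand side, we obtain

$$\frac{d}{dt}\left(\frac{\partial x^T}{\partial k}\right)^T = \left(\frac{\partial f^T}{\partial x}\right)^T \left(\frac{\partial x^T}{\partial k}\right)^T + \left(\frac{\partial f^T}{\partial k}\right)^T \tag{6.8}$$

or better

$$\frac{dG(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T G(t) + \left(\frac{\partial f^T}{\partial k}\right)^T \tag{6.9}$$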

The initial condition $G(t_0)$ is obtained by differentiating the initial condition, $x(t_0) = x_0$, with respect to k. Since the initial state is independent of the parameters, we have

$$G(t_0) = 0 \tag{6.10}$$
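To make Equations 6.9 and 6.10 concrete, the sketch below integrates the state and sensitivity equations for a one-state, one-parameter model that is assumed purely for illustration, dx/dt = -kx. For this model ∂f/∂x = -k and ∂f/∂k = -x, so the sensitivity equation is dG/dt = -kG - x with G(0) = 0, and the exact solution G(t) = -t x0 exp(-kt) is available for comparison. The hand-written fourth-order Runge-Kutta integrator and all numerical values are assumptions.

```python
import math


def integrate_state_and_sensitivity(k, x0, t_end, n_steps=200):
    """RK4 integration of z = [x, G] for the assumed model dx/dt = -k*x."""
    def phi(z):
        x, G = z
        # State equation, then Equation 6.9 with df/dx = -k and df/dk = -x.
        return (-k * x, -k * G - x)

    h = t_end / n_steps
    z = (x0, 0.0)  # G(t0) = 0, Equation 6.10
    for _ in range(n_steps):
        s1 = phi(z)
        s2 = phi(tuple(v + 0.5 * h * s for v, s in zip(z, s1)))
        s3 = phi(tuple(v + 0.5 * h * s for v, s in zip(z, s2)))
        s4 = phi(tuple(v + h * s for v, s in zip(z, s3)))
        z = tuple(v + h / 6.0 * (a + 2 * b + 2 * c + d)
                  for v, a, b, c, d in zip(z, s1, s2, s3, s4))
    return z


k, x0, t_end = 0.7, 2.0, 2.0
x_num, G_num = integrate_state_and_sensitivity(k, x0, t_end)
x_exact = x0 * math.exp(-k * t_end)
G_exact = -t_end * x0 * math.exp(-k * t_end)  # d/dk of x0*exp(-k*t)
```

The numerical and exact sensitivities should agree to within the integration error, which is a useful check before moving to models without closed-form solutions.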

Equation 6.9 is a matrix differential equation and represents a set of n×p ODEs. Once the sensitivity coefficients are obtained by solving the above ODEs numerically, the output vector, $y(t, k^{(j+1)})$, can be computed.

Substitution of the latter into the objective function and use of the stationary condition $\partial S(k^{(j+1)})/\partial k^{(j+1)} = 0$ yields a linear equation for $\Delta k^{(j+1)}$

$$A \Delta k^{(j+1)} = b \tag{6.11}$$

where

$$A = \sum_{i=1}^{N} G^T(t_i) C^T Q_i C G(t_i) \tag{6.12}$$

and

$$b = \sum_{i=1}^{N} G^T(t_i) C^T Q_i [\hat{y}_i - Cx(t_i, k^{(j)})] \tag{6.13}$$

Solution of the above equation yields $\Delta k^{(j+1)}$ and hence, $k^{(j+1)}$ is obtained from

$$k^{(j+1)} = k^{(j)} + \mu \Delta k^{(j+1)} \tag{6.14}$$

where μ is a stepping parameter (0 < μ ≤ 1) to be determined by the bisection rule. The simple bisection rule is presented later in this chapter, whereas optimal step-size determination procedures are presented in detail in Chapter 8.

In summary, at each iteration, given the current estimate of the parameters, $k^{(j)}$, we obtain x(t) and G(t) by integrating the state and sensitivity differential equations. Using these values we compute the model output, $y(t_i, k^{(j)})$, and the sensitivity coefficients, $G(t_i)$, for each data point i = 1,...,N, which are subsequently used to set up matrix A and vector b. Solution of the linear equation yields $\Delta k^{(j+1)}$ and hence $k^{(j+1)}$ is obtained. Thus, a sequence of parameter estimates, $k^{(1)}, k^{(2)}, \ldots$, is generated which often converges to the optimum, k*, if the initial guess, $k^{(0)}$, is sufficiently close. The converged parameter values represent the Least Squares (LS), Weighted Least Squares (WLS) or Generalized Least Squares (GLS) estimates depending on the choice of the weighting matrices $Q_i$. Furthermore, if certain assumptions regarding the statistical distribution of the residuals hold, these parameter values could also be the Maximum Likelihood (ML) estimates.

6.2.1 Gauss-Newton Algorithm for ODE Models

1. Input the initial guess for the parameters, $k^{(0)}$, and NSIG.

2. For j = 0, 1, 2, ..., repeat.

3. Integrate the state and sensitivity equations to obtain x(t) and G(t). At each sampling period, $t_i$, i = 1,...,N, compute $y(t_i, k^{(j)})$ and $G(t_i)$ to set up matrix A and vector b.

4. Solve the linear equation $A \Delta k^{(j+1)} = b$ and obtain $\Delta k^{(j+1)}$.

5. Determine μ using the bisection rule and obtain $k^{(j+1)} = k^{(j)} + \mu \Delta k^{(j+1)}$.

6. Continue until the maximum number of iterations is reached or convergence is achieved (i.e., $\|\Delta k^{(j+1)}\| < 10^{-NSIG}$).

7. Compute statistical properties of parameter estimates (see Chapter 11).

The above method is the well-known Gauss-Newton method for differential equation systems and it exhibits quadratic convergence to the optimum.

Computational modifications to the above algorithm for the incorporation of prior knowledge about the parameters (Bayesian estimation) are discussed in detail in Chapter 8.
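The steps of the algorithm can be sketched for a deliberately small problem. The model dx/dt = -kx with a known initial state, the noise-free synthetic data generated with k = 0.3, the relative form of the convergence test and the hand-written Runge-Kutta integrator are all assumptions made for illustration; with one state, one parameter, C = 1 and Q_i = 1, matrix A and vector b reduce to scalars.

```python
import math


def simulate(k, x0, times, steps=50):
    """Return [(x(t_i), G(t_i))] for the assumed model dx/dt = -k*x."""
    def phi(z):
        x, G = z
        return (-k * x, -k * G - x)  # state and sensitivity equations

    z, t_prev, out = (x0, 0.0), 0.0, []
    for t in times:
        h = (t - t_prev) / steps
        for _ in range(steps):  # classical fourth-order Runge-Kutta
            s1 = phi(z)
            s2 = phi(tuple(v + 0.5 * h * s for v, s in zip(z, s1)))
            s3 = phi(tuple(v + 0.5 * h * s for v, s in zip(z, s2)))
            s4 = phi(tuple(v + h * s for v, s in zip(z, s3)))
            z = tuple(v + h / 6.0 * (a + 2 * b + 2 * c + d)
                      for v, a, b, c, d in zip(z, s1, s2, s3, s4))
        out.append(z)
        t_prev = t
    return out


def objective(k, x0, times, y_meas):
    return sum((y - x) ** 2 for (x, _), y in zip(simulate(k, x0, times), y_meas))


def gauss_newton(k0, x0, times, y_meas, nsig=6, max_iter=50):
    k = k0
    for _ in range(max_iter):
        traj = simulate(k, x0, times)                            # step 3
        A = sum(G * G for _, G in traj)                          # scalar matrix A
        b = sum(G * (y - x) for (x, G), y in zip(traj, y_meas))  # scalar vector b
        dk = b / A                                               # step 4
        S_old = sum((y - x) ** 2 for (x, _), y in zip(traj, y_meas))
        mu = 1.0
        for _ in range(30):                                      # step 5: bisection rule
            if objective(k + mu * dk, x0, times, y_meas) < S_old:
                break
            mu *= 0.5
        k += mu * dk                                             # Equation 6.14
        if abs(dk) < 10.0 ** (-nsig) * abs(k):                   # step 6 (relative test)
            break
    return k


times = [1.0, 2.0, 3.0, 4.0, 5.0]
y_meas = [10.0 * math.exp(-0.3 * t) for t in times]  # synthetic, noise-free data
k_hat = gauss_newton(k0=1.0, x0=10.0, times=times, y_meas=y_meas)
```

Starting from k = 1.0, the first full step overshoots to a negative rate constant, so the bisection rule halves it once before the iteration settles near k = 0.3.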
6.2.2 Implementation Guidelines for ODE Models

1. Use of a Differential Equation Solver

If the dimensionality of the problem is not excessively high, simultaneous integration of the state and sensitivity equations is the easiest approach to implement the Gauss-Newton method without the need to store x(t) as a function of time. The latter is required in the evaluation of the Jacobians in Equation 6.9 during the solution of this differential equation to obtain G(t). Let us rewrite G(t) as

$$G(t) = \left(\frac{\partial x^T}{\partial k}\right)^T = \left[\frac{\partial x}{\partial k_1}, \frac{\partial x}{\partial k_2}, \ldots, \frac{\partial x}{\partial k_p}\right] = [g_1, g_2, \ldots, g_p] \tag{6.15}$$

In this case the n-dimensional vector $g_1$ represents the sensitivity coefficients of the state variables with respect to parameter $k_1$ and satisfies the following ODE,

$$\frac{dg_1(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_1(t) + \frac{\partial f}{\partial k_1}; \quad g_1(t_0) = 0 \tag{6.16a}$$

Similarly, the n-dimensional vector $g_2$ represents the sensitivity coefficients of the state variables with respect to parameter $k_2$ and satisfies the following ODE,

$$\frac{dg_2(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_2(t) + \frac{\partial f}{\partial k_2}; \quad g_2(t_0) = 0 \tag{6.16b}$$

Finally, for the last parameter, $k_p$, we have the corresponding sensitivity vector $g_p$

$$\frac{dg_p(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_p(t) + \frac{\partial f}{\partial k_p}; \quad g_p(t_0) = 0 \tag{6.16c}$$

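Since most numerical differential equation solvers require the equations to be integrated to be of the form

$$\frac{dz}{dt} = \varphi(z); \quad z(t_0) = \text{given} \tag{6.17}$$

we generate the following n×(p+1)-dimensional vector z

$$z = \begin{bmatrix} x(t) \\ \partial x/\partial k_1 \\ \partial x/\partial k_2 \\ \vdots \\ \partial x/\partial k_p \end{bmatrix} = \begin{bmatrix} x(t) \\ g_1(t) \\ g_2(t) \\ \vdots \\ g_p(t) \end{bmatrix} \tag{6.18}$$

and the corresponding n×(p+1)-dimensional vector function φ(z)

$$\varphi(z) = \begin{bmatrix} f(x, u, k) \\ \left(\dfrac{\partial f^T}{\partial x}\right)^T g_1(t) + \dfrac{\partial f}{\partial k_1} \\ \left(\dfrac{\partial f^T}{\partial x}\right)^T g_2(t) + \dfrac{\partial f}{\partial k_2} \\ \vdots \\ \left(\dfrac{\partial f^T}{\partial x}\right)^T g_p(t) + \dfrac{\partial f}{\partial k_p} \end{bmatrix} \tag{6.19}$$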
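A generic builder for the augmented right-hand side of Equation 6.19 might look like the sketch below. The function name `make_phi`, the nested-list conventions for the Jacobians and the two-state test model (consecutive first-order reactions A -> B -> C) are assumptions introduced for illustration.

```python
def make_phi(f, dfdx, dfdk, n, p):
    """Return phi(z) for z = [x, g_1, ..., g_p] (Equations 6.18 and 6.19).

    f(x)    -> list of n values
    dfdx(x) -> n x n nested list, entry [i][l] = d f_i / d x_l
    dfdk(x) -> n x p nested list, entry [i][j] = d f_i / d k_j
    """
    def phi(z):
        x = z[:n]
        J = dfdx(x)
        B = dfdk(x)
        out = list(f(x))  # first n entries: the state equations
        for j in range(p):
            g = z[n * (j + 1): n * (j + 2)]  # sensitivity vector g_(j+1)
            out.extend(sum(J[i][l] * g[l] for l in range(n)) + B[i][j]
                       for i in range(n))
        return out
    return phi


# Assumed test model: dx1/dt = -k1*x1, dx2/dt = k1*x1 - k2*x2
k = (2.0, 1.0)


def f(x):
    return [-k[0] * x[0], k[0] * x[0] - k[1] * x[1]]


def dfdx(x):
    return [[-k[0], 0.0], [k[0], -k[1]]]


def dfdk(x):
    return [[-x[0], 0.0], [x[0], -x[1]]]


phi = make_phi(f, dfdx, dfdk, n=2, p=2)
z = [1.0, 0.5, 0.1, 0.2, 0.3, 0.4]  # x, g_1, g_2 (arbitrary values)
dz_dt = phi(z)
```

The returned `phi` has the single-argument form of Equation 6.17, so it can be handed to any integrator that accepts a right-hand-side function.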

If the equation solver permits it, information can also be provided about the Jacobian of φ(z), particularly when we are dealing with stiff differential equations. The Jacobian is of the form

$$\left(\frac{\partial \varphi^T}{\partial z}\right)^T = \begin{bmatrix} \left(\dfrac{\partial f^T}{\partial x}\right)^T & 0 & \cdots & 0 \\ * & \left(\dfrac{\partial f^T}{\partial x}\right)^T & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ * & 0 & \cdots & \left(\dfrac{\partial f^T}{\partial x}\right)^T \end{bmatrix} \tag{6.20}$$

where the "*" in the first column represents terms that have second order derivatives of f with respect to x. In most practical situations these terms can be neglected and hence, this Jacobian can be considered as a block diagonal matrix as far as the ODE solver is concerned. This results in significant savings in terms of memory requirements and robustness of the numerical integration.

In a typical implementation, the numerical integration routine is requested to provide z(t) at each sampling point, $t_i$, i = 1,...,N and hence, $x(t_i)$ and $G(t_i)$ become available for the computation of $y(t_i, k^{(j)})$ as well as for adding the appropriate terms in matrix A and vector b.

2. Implementation of the Bisection Rule


As mentioned in Chapter 4, an acceptable value for the stepping parameter μ is obtained by starting with μ = 1 and halving μ until the objective function becomes less than that obtained in the previous iteration; namely, the first value of μ that satisfies the following inequality is accepted:

$$S(k^{(j)} + \mu \Delta k^{(j+1)}) < S(k^{(j)}) \tag{6.21}$$

In the case of ODE models, evaluation of the objective function, $S(k^{(j)} + \mu \Delta k^{(j+1)})$, for a particular value of μ implies the integration of the state equations. It should be emphasized here that it is unnecessary to integrate the state equations for the entire data length $[t_0, t_N]$ for each trial value of μ. Once the objective function becomes greater than $S(k^{(j)})$, a smaller value of μ can be chosen. By this procedure, besides the savings in computation time, numerical instability is also avoided, since the objective function becomes large quickly and the integration is often stopped well before computer overflow is threatened (Kalogerakis and Luus, 1983a).

The importance of using a good integration routine should also be emphasized. When $\Delta k^{(j+1)}$ is excessively large (severe overstepping) during the determination of an acceptable value for μ, numerical instability may cause computer overflow well before we have a chance to compute the output vector at the first data point and compare the objective functions. In this case, a good integration routine is of great importance because it provides a message indicating that the tolerance requirements cannot be met. At that moment we can stop the integration, halve μ and start the integration of the state equations again. Several restarts may be necessary before an acceptable value for μ is obtained.

Furthermore, when $k^{(j)} + \mu \Delta k^{(j+1)}$ is used at the next iteration as the current estimate, we do not anticipate any problems in the integration of both the state and sensitivity equations. This is simply due to the fact that the eigenvalues of the Jacobian of the sensitivity equations (inversely related to the governing time constants) are the same as those in the state equations, where the integration was performed successfully. These considerations are of particular importance when the model is described by a set of stiff differential equations, where the wide range of the prevailing time constants creates additional numerical difficulties that tend to shrink the region of convergence (Kalogerakis and Luus, 1983a).
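The early-termination idea described above can be sketched as follows. The one-state model dx/dt = -kx, the synthetic data, the current estimate k = 1.0 and the deliberately overshooting step of -1.6 are assumptions chosen for illustration; the function names are hypothetical. The objective function is accumulated data point by data point, and the integration of a trial value of μ is abandoned as soon as the running sum exceeds the objective function of the current estimate.

```python
import math


def trial_objective(k, x0, times, y_meas, S_limit, steps=50):
    """Integrate dx/dt = -k*x point by point; abort once S exceeds S_limit.

    Returns (S, number_of_data_points_reached, completed_flag).
    """
    x, t_prev, S = x0, 0.0, 0.0
    for n_used, (t, y) in enumerate(zip(times, y_meas), start=1):
        h = (t - t_prev) / steps
        for _ in range(steps):  # classical RK4 on the state equation only
            s1 = -k * x
            s2 = -k * (x + 0.5 * h * s1)
            s3 = -k * (x + 0.5 * h * s2)
            s4 = -k * (x + h * s3)
            x += h / 6.0 * (s1 + 2 * s2 + 2 * s3 + s4)
        S += (y - x) ** 2
        t_prev = t
        if S > S_limit:
            return S, n_used, False  # stop early; the rest of the data is skipped
    return S, len(times), True


def bisection_rule(k, dk, S_current, x0, times, y_meas, max_halvings=30):
    """Halve mu from 1 until S(k + mu*dk) < S(k); log the points integrated."""
    mu, points_used = 1.0, []
    for _ in range(max_halvings):
        S, n_used, completed = trial_objective(k + mu * dk, x0, times,
                                               y_meas, S_current)
        points_used.append(n_used)
        if completed and S < S_current:
            return mu, points_used
        mu *= 0.5
    return None, points_used  # no acceptable step size was found


times = [1.0, 2.0, 3.0, 4.0, 5.0]
y_meas = [10.0 * math.exp(-0.3 * t) for t in times]  # synthetic data
S_current, _, _ = trial_objective(1.0, 10.0, times, y_meas, math.inf)
mu, points_used = bisection_rule(1.0, -1.6, S_current, 10.0, times, y_meas)
```

In this run the full step is rejected after integrating to the first data point only, and the halved step is accepted after a complete pass over the data.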
6.3 THE GAUSS-NEWTON METHOD - NONLINEAR OUTPUT RELATIONSHIP

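When the output vector (measured variables) is related to the state variables (and possibly to the parameters) through a nonlinear relationship of the form y(t) = h(x(t),k), we need to make some additional minor modifications. The sensitivity of the output vector to the parameters can be obtained by performing the implicit differentiation to yield:

$$\left(\frac{\partial y^T}{\partial k}\right)^T = \left(\frac{\partial h^T}{\partial x}\right)^T \left(\frac{\partial x^T}{\partial k}\right)^T + \left(\frac{\partial h^T}{\partial k}\right)^T \tag{6.22}$$

Substitution into the linearized output vector (Equation 6.5) yields

$$y(t, k^{(j+1)}) = h(x(t, k^{(j)})) + W(t) \Delta k^{(j+1)} \tag{6.23}$$

where

$$W(t) = \left(\frac{\partial h^T}{\partial x}\right)^T G(t) + \left(\frac{\partial h^T}{\partial k}\right)^T \tag{6.24}$$

and hence the corresponding normal equations are obtained, i.e.,

$$A \Delta k^{(j+1)} = b \tag{6.25}$$

where

$$A = \sum_{i=1}^{N} W^T(t_i) Q_i W(t_i) \tag{6.26}$$

and

$$b = \sum_{i=1}^{N} W^T(t_i) Q_i [\hat{y}_i - h(x(t_i, k^{(j)}))] \tag{6.27}$$

If the nonlinear output relationship is independent of the parameters, i.e., it is of the form

$$y(t) = h(x(t)) \tag{6.28}$$

then $W(t_i)$ simplifies to

$$W(t_i) = \left(\frac{\partial h^T}{\partial x}\right)^T_i G(t_i) \tag{6.29}$$

and the corresponding matrix A and vector b become

$$A = \sum_{i=1}^{N} G^T(t_i) \left(\frac{\partial h^T}{\partial x}\right)_i Q_i \left(\frac{\partial h^T}{\partial x}\right)^T_i G(t_i) \tag{6.30}$$

and

$$b = \sum_{i=1}^{N} G^T(t_i) \left(\frac{\partial h^T}{\partial x}\right)_i Q_i [\hat{y}_i - h(x(t_i, k^{(j)}))] \tag{6.31}$$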
In other words, the observation matrix C from the case of a linear output relationship is substituted with the Jacobian matrix $(\partial h^T/\partial x)^T$ in setting up matrix A and vector b.
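As a small illustration of this substitution, the sketch below forms the output sensitivity matrix W as the product of the output Jacobian and G, plus an optional explicit parameter term. The function name `output_sensitivity`, the example output y = x1 x2 and the numerical values of G are assumptions made for illustration.

```python
def output_sensitivity(dhdx, G, dhdk=None):
    """W = (dh/dx) G + dh/dk, with all matrices given as nested lists.

    dhdx is m x n, G is n x p and the optional dhdk is m x p.
    """
    m, n, p = len(dhdx), len(G), len(G[0])
    W = [[sum(dhdx[a][l] * G[l][j] for l in range(n)) for j in range(p)]
         for a in range(m)]
    if dhdk is not None:  # explicit dependence of h on the parameters
        W = [[W[a][j] + dhdk[a][j] for j in range(p)] for a in range(m)]
    return W


# Assumed example: one output y = x1 * x2, no explicit parameter dependence.
x = [2.0, 3.0]
dhdx = [[x[1], x[0]]]            # 1 x 2 Jacobian of h with respect to x
G = [[0.5, -1.0], [0.25, 4.0]]   # 2 x 2 sensitivity matrix (made-up values)
W = output_sensitivity(dhdx, G)
```

With W available at each sampling time, A and b are accumulated exactly as in the linear case, with W taking the place of the product CG.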
6.4 THE GAUSS-NEWTON METHOD - SYSTEMS WITH UNKNOWN INITIAL CONDITIONS

Let us consider a system described by a set of ODEs as in Section 6.1.

$$\frac{dx(t)}{dt} = f(x(t), u, k); \quad x(t_0) = x_0 \tag{6.32}$$

$$y(t) = Cx(t) \tag{6.33}$$

The only difference here is that it is further assumed that some or all of the components of the initial state vector $x_0$ are unknown. Let the q-dimensional vector p (0 < q ≤ n) denote the unknown components of the vector $x_0$. In this class of parameter estimation problems, the objective is to determine not only the parameter vector k but also the unknown vector p containing the unknown elements of the initial state vector $x(t_0)$. Again, we assume that experimental data are available as measurements of the output vector at various points in time, i.e., $[\hat{y}_i, t_i]$, i = 1,...,N. The objective function that should be minimized is the same as before. The only difference is that the minimization is carried out over k and p, namely the objective function is viewed as

$$S(k, p) = \sum_{i=1}^{N} [\hat{y}_i - y(t_i, k, p)]^T Q_i [\hat{y}_i - y(t_i, k, p)] \tag{6.34}$$

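Let us suppose that estimates $k^{(j)}$ and $p^{(j)}$ of the unknown parameter and initial state vectors are available at the jth iteration. Linearization of the output vector around $k^{(j)}$ and $p^{(j)}$ yields

$$y(t_i, k^{(j+1)}, p^{(j+1)}) = y(t_i, k^{(j)}, p^{(j)}) + \left(\frac{\partial y^T}{\partial x}\right)^T \left(\frac{\partial x^T}{\partial k}\right)^T \Delta k^{(j+1)} + \left(\frac{\partial y^T}{\partial x}\right)^T \left(\frac{\partial x^T}{\partial p}\right)^T \Delta p^{(j+1)} \tag{6.35}$$

Assuming a linear output relationship (i.e., y(t) = Cx(t)), the above equation becomes

$$y(t_i, k^{(j+1)}, p^{(j+1)}) = Cx(t_i, k^{(j)}, p^{(j)}) + CG(t_i) \Delta k^{(j+1)} + CP(t_i) \Delta p^{(j+1)} \tag{6.36}$$

where G(t) is the usual n×p parameter sensitivity matrix $(\partial x^T/\partial k)^T$ and P(t) is the n×q initial state sensitivity matrix $(\partial x^T/\partial p)^T$. The parameter sensitivity matrix G(t) can be obtained as shown in the previous section by solving the matrix differential equation,

$$\frac{dG(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T G(t) + \left(\frac{\partial f^T}{\partial k}\right)^T \tag{6.37}$$

with the initial condition,

$$G(t_0) = 0 \tag{6.38}$$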

Similar to the parameter sensitivity matrix, the initial state sensitivity matrix, P(t), cannot be obtained by a simple differentiation. P(t) is determined by solving a matrix differential equation that is obtained by differentiating both sides of Equation 6.1 (state equation) with respect to p. Reversing the order of differentiation and performing implicit differentiation on the right-hand side, we arrive at

$$\frac{d}{dt}\left(\frac{\partial x^T}{\partial p}\right)^T = \left(\frac{\partial f^T}{\partial x}\right)^T \left(\frac{\partial x^T}{\partial p}\right)^T \tag{6.39}$$

or better

$$\frac{dP(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T P(t) \tag{6.40}$$

The initial condition is obtained by differentiating both sides of the initial condition, $x(t_0) = x_0$, with respect to p, yielding

$$P(t_0) = \begin{bmatrix} I_{q \times q} \\ 0_{(n-q) \times q} \end{bmatrix} \tag{6.41}$$

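Without any loss of generality, it has been assumed that the unknown initial states correspond to state variables that are placed as the first elements of the state vector x(t). Hence the structure of the initial condition in Equation 6.41. Thus, integrating the state and sensitivity equations (Equations 6.1, 6.9 and 6.40), a total of n×(p+q+1) differential equations, the output vector, $y(t, k^{(j+1)}, p^{(j+1)})$, is obtained as a linear function of $k^{(j+1)}$ and $p^{(j+1)}$. Next, substitution of $y(t_i, k^{(j+1)}, p^{(j+1)})$ into the objective function and use of the stationary criteria

$$\frac{\partial S(k^{(j+1)}, p^{(j+1)})}{\partial k^{(j+1)}} = 0 \tag{6.42a}$$

and

$$\frac{\partial S(k^{(j+1)}, p^{(j+1)})}{\partial p^{(j+1)}} = 0 \tag{6.42b}$$

yields the following linear equation

$$\begin{bmatrix} \sum_{i=1}^{N} G^T(t_i) C^T Q_i C G(t_i) & \sum_{i=1}^{N} G^T(t_i) C^T Q_i C P(t_i) \\ \sum_{i=1}^{N} P^T(t_i) C^T Q_i C G(t_i) & \sum_{i=1}^{N} P^T(t_i) C^T Q_i C P(t_i) \end{bmatrix} \begin{bmatrix} \Delta k^{(j+1)} \\ \Delta p^{(j+1)} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{N} G^T(t_i) C^T Q_i [\hat{y}_i - Cx(t_i, k^{(j)}, p^{(j)})] \\ \sum_{i=1}^{N} P^T(t_i) C^T Q_i [\hat{y}_i - Cx(t_i, k^{(j)}, p^{(j)})] \end{bmatrix} \tag{6.43}$$

Solution of the above equation yields $\Delta k^{(j+1)}$ and $\Delta p^{(j+1)}$. The estimates $k^{(j+1)}$ and $p^{(j+1)}$ are obtained next as

$$\begin{bmatrix} k^{(j+1)} \\ p^{(j+1)} \end{bmatrix} = \begin{bmatrix} k^{(j)} \\ p^{(j)} \end{bmatrix} + \mu \begin{bmatrix} \Delta k^{(j+1)} \\ \Delta p^{(j+1)} \end{bmatrix} \tag{6.44}$$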

where a stepping parameter μ (to be determined by the bisection rule) is also used. If the initial guess $k^{(0)}$, $p^{(0)}$ is sufficiently close to the optimum, this procedure converges quadratically to the optimum. However, the same difficulties as those discussed earlier arise whenever the initial estimates are far from the optimum.

If we consider the limiting case where p = 0 and q ≠ 0, i.e., the case where there are no unknown parameters and only some of the initial states are to be estimated, the previously outlined procedure represents a quadratically convergent method for the solution of two-point boundary value problems. Obviously, in this case we need to compute only the sensitivity matrix P(t). It can be shown that under these conditions the Gauss-Newton method is a typical quadratically convergent "shooting method." As such it can be used to solve optimal control problems using the Boundary Condition Iteration approach (Kalogerakis, 1983).
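The initial state sensitivity equations (Equations 6.40 and 6.41) can be checked on a model with a known solution. The sketch below assumes two consecutive first-order reactions, dx1/dt = -K1 x1 and dx2/dt = K1 x1 - K2 x2, with only the first initial state unknown (q = 1); the rate constants, the integrator and the function names are illustrative assumptions. Because this model is linear, the Jacobian is constant and P(t) has the closed form used for comparison.

```python
import math

K1, K2 = 1.0, 0.5  # assumed rate constants for A -> B -> C


def dP_dt(P):
    """Equation 6.40 for the assumed linear model (constant Jacobian)."""
    J = [[-K1, 0.0], [K1, -K2]]  # J[i][l] = d f_i / d x_l
    return [J[0][0] * P[0] + J[0][1] * P[1],
            J[1][0] * P[0] + J[1][1] * P[1]]


def integrate_P(t_end, n_steps=200):
    """RK4 integration of the n x q = 2 x 1 matrix P, stored as a list."""
    h = t_end / n_steps
    P = [1.0, 0.0]  # Equation 6.41 with q = 1: identity block over a zero block
    for _ in range(n_steps):
        s1 = dP_dt(P)
        s2 = dP_dt([v + 0.5 * h * s for v, s in zip(P, s1)])
        s3 = dP_dt([v + 0.5 * h * s for v, s in zip(P, s2)])
        s4 = dP_dt([v + h * s for v, s in zip(P, s3)])
        P = [v + h / 6.0 * (a + 2 * b + 2 * c + d)
             for v, a, b, c, d in zip(P, s1, s2, s3, s4)]
    return P


t_end = 2.0
P_num = integrate_P(t_end)
P_exact = [math.exp(-K1 * t_end),
           K1 / (K2 - K1) * (math.exp(-K1 * t_end) - math.exp(-K2 * t_end))]
```

For a nonlinear model the Jacobian depends on x(t), so P(t) would be integrated together with the state equations rather than on its own.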

6.5 EXAMPLES

6.5.1 A Homogeneous Gas Phase Reaction

Bellman et al. (1967) have considered the estimation of the two rate constants $k_1$ and $k_2$ in the Bodenstein-Linder model for the homogeneous gas phase reaction of NO with O2:

2NO + O2 <-> 2NO2

The model is described by the following equation

$$\frac{dx}{dt} = k_1(\alpha - x)(\beta - x)^2 - k_2 x^2; \quad x(0) = 0 \tag{6.45}$$

where α = 126.2, β = 91.9 and x is the concentration of NO2. The concentration of NO2 was measured experimentally as a function of time and the data are given in Table 6.1. The model is of the form $dx/dt = f(x, k_1, k_2)$ where $f(x, k_1, k_2) = k_1(\alpha - x)(\beta - x)^2 - k_2 x^2$. The single state variable x is also the measured variable (i.e., y(t) = x(t)). The sensitivity matrix, G(t), is a (1×2)-dimensional matrix with elements:

$$G(t) = [G_1(t), G_2(t)] = \left[\frac{\partial x}{\partial k_1}, \frac{\partial x}{\partial k_2}\right] \tag{6.46}$$

Table 6.1 Data for the Homogeneous Gas Phase Reaction of NO with O2

Time    Concentration of NO2
0       0
1       1.4
2       6.3
3       10.5
4       14.2
5       17.6
6       21.4
7       23.0
9       27.0
11      30.5
14      34.4
19      38.8
24      41.6
29      43.5
39      45.3

Source: Bellman et al. (1967).

In this case, Equation 6.16 simply becomes,

$$\frac{dG_1}{dt} = \left(\frac{\partial f}{\partial x}\right) G_1 + \frac{\partial f}{\partial k_1}; \quad G_1(0) = 0 \tag{6.47a}$$

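and similarly for $G_2(t)$,

$$\frac{dG_2}{dt} = \left(\frac{\partial f}{\partial x}\right) G_2 + \frac{\partial f}{\partial k_2}; \quad G_2(0) = 0 \tag{6.47b}$$

where

$$\frac{\partial f}{\partial x} = -k_1(\beta - x)^2 - 2k_1(\alpha - x)(\beta - x) - 2k_2 x \tag{6.48a}$$

$$\frac{\partial f}{\partial k_1} = (\alpha - x)(\beta - x)^2 \tag{6.48b}$$

$$\frac{\partial f}{\partial k_2} = -x^2 \tag{6.48c}$$

Equations 6.47a and 6.47b should be solved simultaneously with the state equation (Equation 6.45). The three ODEs are put into the standard form (dz/dt = φ(z)) used by differential equation solvers by setting

$$z(t) = \begin{bmatrix} x(t) \\ G_1(t) \\ G_2(t) \end{bmatrix} \tag{6.49a}$$

and

$$\varphi(z) = \begin{bmatrix} k_1(\alpha - x)(\beta - x)^2 - k_2 x^2 \\ -[k_1(\beta - x)^2 + 2k_1(\alpha - x)(\beta - x) + 2k_2 x] G_1 + (\alpha - x)(\beta - x)^2 \\ -[k_1(\beta - x)^2 + 2k_1(\alpha - x)(\beta - x) + 2k_2 x] G_2 - x^2 \end{bmatrix} \tag{6.49b}$$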

Integration of the above equation yields x(t) and G(t) which are used in setting up matrix A and vector b at each iteration of the Gauss-Newton method.
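A minimal sketch of this integration is given below. The right-hand side follows Equation 6.49b; the trial rate constants (4.0e-6 and 2.0e-4), the step size and the hand-written Runge-Kutta integrator are assumptions for illustration and are not the estimates for this problem. As a consistency check, the integrated sensitivities are compared with central finite differences of the state with respect to each parameter.

```python
ALPHA, BETA = 126.2, 91.9


def phi(z, k1, k2):
    """Right-hand side of Equation 6.49b for z = [x, G1, G2]."""
    x, G1, G2 = z
    f = k1 * (ALPHA - x) * (BETA - x) ** 2 - k2 * x ** 2
    dfdx = -(k1 * (BETA - x) ** 2 + 2 * k1 * (ALPHA - x) * (BETA - x) + 2 * k2 * x)
    return (f,
            dfdx * G1 + (ALPHA - x) * (BETA - x) ** 2,
            dfdx * G2 - x ** 2)


def integrate(k1, k2, t_end, h=0.05):
    """Classical RK4 from t = 0 with x(0) = 0 and G1(0) = G2(0) = 0."""
    z = (0.0, 0.0, 0.0)
    for _ in range(round(t_end / h)):
        s1 = phi(z, k1, k2)
        s2 = phi(tuple(v + 0.5 * h * s for v, s in zip(z, s1)), k1, k2)
        s3 = phi(tuple(v + 0.5 * h * s for v, s in zip(z, s2)), k1, k2)
        s4 = phi(tuple(v + h * s for v, s in zip(z, s3)), k1, k2)
        z = tuple(v + h / 6.0 * (a + 2 * b + 2 * c + d)
                  for v, a, b, c, d in zip(z, s1, s2, s3, s4))
    return z


k1, k2 = 4.0e-6, 2.0e-4  # arbitrary trial values, not fitted estimates
x_end, G1_end, G2_end = integrate(k1, k2, 39.0)

# Central finite differences of x(39) with respect to each parameter.
d1, d2 = 1e-3 * k1, 1e-3 * k2
fd1 = (integrate(k1 + d1, k2, 39.0)[0] - integrate(k1 - d1, k2, 39.0)[0]) / (2 * d1)
fd2 = (integrate(k1, k2 + d2, 39.0)[0] - integrate(k1, k2 - d2, 39.0)[0]) / (2 * d2)
```

Requesting z at each sampling time of Table 6.1 instead of only at the final time would supply the x and G values needed for A and b.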

6.5.2 Pyrolytic Dehydrogenation of Benzene to Diphenyl and Triphenyl

Let us now consider the pyrolytic dehydrogenation of benzene to diphenyl and triphenyl (Seinfeld and Gavalas, 1970; Hougen and Watson, 1948):

2C6H6 <-> C12H10 + H2
C6H6 + C12H10 <-> C18H14 + H2

The following kinetic model has been proposed

$$\frac{dx_1}{dt} = -r_1 - r_2 \tag{6.50a}$$

$$\frac{dx_2}{dt} = \frac{r_1}{2} - r_2 \tag{6.50b}$$

where

$$r_1 = k_1 \left[ x_1^2 - x_2(2 - 2x_1 - x_2)/3K_1 \right] \tag{6.51a}$$

$$r_2 = k_2 \left[ x_1 x_2 - (1 - x_1 - 2x_2)(2 - 2x_1 - x_2)/9K_2 \right] \tag{6.51b}$$

where $x_1$ denotes lb-mole of benzene per lb-mole of pure benzene feed and $x_2$ denotes lb-mole of diphenyl per lb-mole of pure benzene feed. The parameters $k_1$ and $k_2$ are unknown reaction rate constants whereas $K_1$ and $K_2$ are equilibrium constants. The data consist of measurements of $x_1$ and $x_2$ in a flow reactor at eight values of the reciprocal space velocity t and are given in Table 6.2.

Table 6.2 Data for the Pyrolytic Dehydrogenation of Benzene

Reciprocal Space Velocity (t) x 10^4    x1       x2
5.63                                    0.828    0.0737
11.32                                   0.704    0.113
16.97                                   0.622    0.1322
22.62                                   0.565    0.1400
34.0                                    0.499    0.1468
39.7                                    0.482    0.1477
45.2                                    0.470    0.1477
169.7                                   0.443    0.1476

Source: Seinfeld and Gavalas (1970); Hougen and Watson (1948).

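As both state variables are measured, the output vector is the same as the state vector, i.e., $y_1 = x_1$ and $y_2 = x_2$. The feed to the reactor was pure benzene. The equilibrium constants $K_1$ and $K_2$ were determined from the run at the lowest space velocity to be 0.242 and 0.428, respectively. Using our standard notation, the above problem is written as follows:

$$\frac{dx_1}{dt} = f_1(x_1, x_2; k_1, k_2) \tag{6.52a}$$

$$\frac{dx_2}{dt} = f_2(x_1, x_2; k_1, k_2) \tag{6.52b}$$

where $f_1 = (-r_1 - r_2)$ and $f_2 = r_1/2 - r_2$. The sensitivity matrix, G(t), is a (2×2)-dimensional matrix with elements:

$$G(t) = \left(\frac{\partial x^T}{\partial k}\right)^T = [g_1(t), g_2(t)] = \begin{bmatrix} G_{11}(t) & G_{12}(t) \\ G_{21}(t) & G_{22}(t) \end{bmatrix} = \begin{bmatrix} \partial x_1/\partial k_1 & \partial x_1/\partial k_2 \\ \partial x_2/\partial k_1 & \partial x_2/\partial k_2 \end{bmatrix} \tag{6.53}$$

Equations 6.16 then become,

$$\frac{dg_1(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_1(t) + \frac{\partial f}{\partial k_1}; \quad g_1(t_0) = 0 \tag{6.54a}$$

and

$$\frac{dg_2(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_2(t) + \frac{\partial f}{\partial k_2}; \quad g_2(t_0) = 0 \tag{6.54b}$$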

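Taking into account Equation 6.53, the above equations can also be written as follows:

$$\frac{d}{dt}\begin{bmatrix} G_{11} \\ G_{21} \end{bmatrix} = \begin{bmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{bmatrix} \begin{bmatrix} G_{11} \\ G_{21} \end{bmatrix} + \begin{bmatrix} \partial f_1/\partial k_1 \\ \partial f_2/\partial k_1 \end{bmatrix}; \quad G_{11}(t_0) = 0, \; G_{21}(t_0) = 0 \tag{6.55a}$$

and

$$\frac{d}{dt}\begin{bmatrix} G_{12} \\ G_{22} \end{bmatrix} = \begin{bmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{bmatrix} \begin{bmatrix} G_{12} \\ G_{22} \end{bmatrix} + \begin{bmatrix} \partial f_1/\partial k_2 \\ \partial f_2/\partial k_2 \end{bmatrix}; \quad G_{12}(t_0) = 0, \; G_{22}(t_0) = 0 \tag{6.55b}$$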

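Finally, we obtain the following equations

$$\frac{dG_{11}}{dt} = \left(\frac{\partial f_1}{\partial x_1}\right) G_{11} + \left(\frac{\partial f_1}{\partial x_2}\right) G_{21} + \frac{\partial f_1}{\partial k_1}; \quad G_{11}(0) = 0 \tag{6.56a}$$

$$\frac{dG_{21}}{dt} = \left(\frac{\partial f_2}{\partial x_1}\right) G_{11} + \left(\frac{\partial f_2}{\partial x_2}\right) G_{21} + \frac{\partial f_2}{\partial k_1}; \quad G_{21}(0) = 0 \tag{6.56b}$$

$$\frac{dG_{12}}{dt} = \left(\frac{\partial f_1}{\partial x_1}\right) G_{12} + \left(\frac{\partial f_1}{\partial x_2}\right) G_{22} + \frac{\partial f_1}{\partial k_2}; \quad G_{12}(0) = 0 \tag{6.56c}$$

$$\frac{dG_{22}}{dt} = \left(\frac{\partial f_2}{\partial x_1}\right) G_{12} + \left(\frac{\partial f_2}{\partial x_2}\right) G_{22} + \frac{\partial f_2}{\partial k_2}; \quad G_{22}(0) = 0 \tag{6.56d}$$

where

$$\frac{\partial f_1}{\partial x_1} = -k_1\left(2x_1 + \frac{2x_2}{3K_1}\right) - k_2\left[x_2 - \frac{4x_1 - 4 + 5x_2}{9K_2}\right] \tag{6.57a}$$

$$\frac{\partial f_1}{\partial x_2} = -\frac{k_1}{3K_1}(2x_2 + 2x_1 - 2) - k_2\left[x_1 - \frac{5x_1 - 5 + 4x_2}{9K_2}\right] \tag{6.57b}$$

$$\frac{\partial f_2}{\partial x_1} = \frac{k_1}{2}\left(2x_1 + \frac{2x_2}{3K_1}\right) - k_2\left[x_2 - \frac{4x_1 - 4 + 5x_2}{9K_2}\right] \tag{6.57c}$$

$$\frac{\partial f_2}{\partial x_2} = \frac{k_1}{6K_1}(2x_2 + 2x_1 - 2) - k_2\left[x_1 - \frac{5x_1 + 4x_2 - 5}{9K_2}\right] \tag{6.57d}$$

$$\frac{\partial f_1}{\partial k_1} = -\left[x_1^2 + \frac{x_2^2 + 2x_1 x_2 - 2x_2}{3K_1}\right] \tag{6.57e}$$

$$\frac{\partial f_2}{\partial k_1} = \frac{1}{2}\left[x_1^2 + \frac{x_2^2 + 2x_1 x_2 - 2x_2}{3K_1}\right] \tag{6.57f}$$

$$\frac{\partial f_1}{\partial k_2} = -\left[x_1 x_2 - \frac{2x_1^2 - 4x_1 + 5x_1 x_2 - 5x_2 + 2x_2^2 + 2}{9K_2}\right] \tag{6.57g}$$

$$\frac{\partial f_2}{\partial k_2} = -\left[x_1 x_2 - \frac{2x_1^2 - 4x_1 + 5x_1 x_2 - 5x_2 + 2x_2^2 + 2}{9K_2}\right] \tag{6.57h}$$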

The four sensitivity equations (Equations 6.56a-d) should be solved simultaneously with the two state equations (Equation 6.52). Integration of these six [= n×(p+1) = 2×(2+1)] equations yields x(t) and G(t), which are used in setting up matrix A and vector b at each iteration of the Gauss-Newton method. The ordinary differential equation that a particular element, $G_{ij}$, of the (n×p)-dimensional sensitivity matrix satisfies can be written directly using the following expression,

$$\frac{dG_{ij}}{dt} = \sum_{l=1}^{n} \left(\frac{\partial f_i}{\partial x_l}\right) G_{lj} + \frac{\partial f_i}{\partial k_j}; \quad G_{ij}(t_0) = 0 \tag{6.58}$$
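Because analytical derivatives of this kind are easy to get wrong, a quick numerical check is worthwhile. The sketch below codes the state-variable derivatives of the benzene model by differentiating the rate expressions r1 and r2 directly and compares them with central finite differences. The evaluation point and the trial rate constants (350 and 400) are arbitrary assumptions, not estimates, and the function names are hypothetical.

```python
K1, K2 = 0.242, 0.428  # equilibrium constants quoted in the text


def f(x1, x2, k1, k2):
    """Right-hand sides f1 = -r1 - r2 and f2 = r1/2 - r2."""
    r1 = k1 * (x1 ** 2 - x2 * (2 - 2 * x1 - x2) / (3 * K1))
    r2 = k2 * (x1 * x2 - (1 - x1 - 2 * x2) * (2 - 2 * x1 - x2) / (9 * K2))
    return (-r1 - r2, 0.5 * r1 - r2)


def jacobian(x1, x2, k1, k2):
    """Analytical d f_i / d x_l built from the derivatives of r1 and r2."""
    dr1_dx1 = k1 * (2 * x1 + 2 * x2 / (3 * K1))
    dr1_dx2 = k1 * (2 * x1 + 2 * x2 - 2) / (3 * K1)
    dr2_dx1 = k2 * (x2 - (4 * x1 - 4 + 5 * x2) / (9 * K2))
    dr2_dx2 = k2 * (x1 - (5 * x1 - 5 + 4 * x2) / (9 * K2))
    return [[-dr1_dx1 - dr2_dx1, -dr1_dx2 - dr2_dx2],
            [0.5 * dr1_dx1 - dr2_dx1, 0.5 * dr1_dx2 - dr2_dx2]]


def fd_jacobian(x1, x2, k1, k2, d=1e-6):
    """Central-difference approximation of the same Jacobian."""
    cols = []
    for dx1, dx2 in ((d, 0.0), (0.0, d)):
        fp = f(x1 + dx1, x2 + dx2, k1, k2)
        fm = f(x1 - dx1, x2 - dx2, k1, k2)
        cols.append([(a - b) / (2 * d) for a, b in zip(fp, fm)])
    # cols[l][i] = d f_i / d x_l; transpose to rows i, columns l
    return [[cols[l][i] for l in range(2)] for i in range(2)]


x1, x2, k1, k2 = 0.7, 0.1, 350.0, 400.0  # arbitrary test point and trial constants
J = jacobian(x1, x2, k1, k2)
J_fd = fd_jacobian(x1, x2, k1, k2)
max_err = max(abs(J[i][l] - J_fd[i][l]) for i in range(2) for l in range(2))
```

Since f is quadratic in the state variables, the central differences are exact apart from round-off, so any sizeable discrepancy points to an algebra error.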

6.5.3 Catalytic Hydrogenation of 3-Hydroxypropanal (HPA) to 1,3-Propanediol (PD)

The hydrogenation of 3-hydroxypropanal (HPA) to 1,3-propanediol (PD) over Ni/SiO2/Al2O3 catalyst powder was studied by Professor Hoffman's group at the Friedrich-Alexander University in Erlangen, Germany (Zhu et al., 1997). PD is a potentially attractive monomer for polymers like polypropylene terephthalate. They used a batch stirred autoclave. The experimental data were kindly provided by Professor Hoffman and consist of measurements of the concentration of HPA and PD ($C_{HPA}$, $C_{PD}$) versus time at various operating temperatures and pressures.

The complete data set will be given in the case studies section. In this chapter, we will discuss how we set up the equations for the regression of an isothermal data set given in Tables 6.3 or 6.4.
The same group also proposed a reaction scheme and a mathematical model that describe the rates of HPA consumption, PD formation as well as the formation of acrolein (Ac). The model is as follows

$$\frac{dC_{HPA}}{dt} = -[r_1 + r_2] C_k - [r_3 + r_4 - r_{-3}] \tag{6.59a}$$

$$\frac{dC_{PD}}{dt} = [r_1 - r_2] C_k \tag{6.59b}$$

$$\frac{dC_{Ac}}{dt} = r_3 - r_4 - r_{-3} \tag{6.59c}$$

where $C_k$ is the concentration of the catalyst (10 g/L). The reaction rates are given below

$$r_1 = \frac{k_1 \dfrac{P}{H} C_{HPA}}{\left[1 + \left(\dfrac{K_1 P}{H}\right)^{0.5} + K_2 C_{HPA}\right]^3} \tag{6.60a}$$

$$r_2 = \frac{k_2 C_{PD} C_{HPA}}{1 + \left(\dfrac{K_1 P}{H}\right)^{0.5} + K_2 C_{HPA}} \tag{6.60b}$$

$$r_3 = k_3 C_{HPA} \tag{6.60c}$$

$$r_{-3} = k_{-3} C_{Ac} \tag{6.60d}$$

$$r_4 = k_4 C_{Ac} C_{HPA} \tag{6.60e}$$

In the above equations, $k_j$ (j = 1, 2, 3, -3, 4) are rate constants (L/(mol min g)), and $K_1$ and $K_2$ are the adsorption equilibrium constants (L/mol) for H2 and HPA respectively. P is the hydrogen pressure (MPa) in the reactor and H is the Henry's law constant with a value equal to 1379 (L bar/mol) at 298 K. The seven parameters ($k_1$, $k_2$, $k_3$, $k_{-3}$, $k_4$, $K_1$ and $K_2$) are to be determined from the measured concentrations of HPA and PD.
Table 6.3 Data for the Catalytic Hydrogenation of 3-Hydroxypropanal (HPA) to 1,3-Propanediol (PD) at 5.15 MPa and 45 °C

t (min)    CHPA (mol/L)    CPD (mol/L)
0.0        1.34953         0.0
10         1.36324         0.00262812
20         1.25882         0.0700394
30         1.17918         0.184363
40         0.972102        0.354008
50         0.825203        0.469777
60         0.697109        0.607359
80         0.421451        0.852431
100        0.232296        1.03535
120        0.128095        1.16413
140        0.0289817       1.30053
160        0.00962368      1.31971

Source: Zhu et al. (1997).

Table 6.4 Data for the Catalytic Hydrogenation of 3-Hydroxypropanal (HPA) to 1,3-Propanediol (PD) at 5.15 MPa and 80 °C

t (min)    CHPA (mol/L)    CPD (mol/L)
0.0        1.34953         0.0
5          0.873513        0.388568
10         0.44727         0.816032
15         0.140925        0.967017
20         0.0350076       1.05125
25         0.0130859       1.08239
30         0.00581597      1.12024

Source: Zhu et al. (1997).

In order to use our standard notation we introduce the following vectors:

$x = [x_1, x_2, x_3]^T = [C_{HPA}, C_{PD}, C_{Ac}]^T$

$k = [k_1, k_2, k_3, k_4, k_5, k_6, k_7]^T = [k_1, k_2, k_3, k_{-3}, k_4, K_1, K_2]^T$

$y = [y_1, y_2]^T = [C_{HPA}, C_{PD}]^T$

Hence, the differential equation model takes the form,

$$\frac{dx_1}{dt} = f_1(x_1, x_2, x_3; k_1, k_2, \ldots, k_7; u_1, u_2) \tag{6.61a}$$

$$\frac{dx_2}{dt} = f_2(x_1, x_2, x_3; k_1, k_2, \ldots, k_7; u_1, u_2) \tag{6.61b}$$

$$\frac{dx_3}{dt} = f_3(x_1, x_2, x_3; k_1, k_2, \ldots, k_7; u_1, u_2) \tag{6.61c}$$

and the observation matrix is simply

$$C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \tag{6.62}$$

In Equations 6.61, $u_1$ denotes the concentration of catalyst present in the reactor ($C_k$) and $u_2$ the hydrogen pressure (P). As far as the estimation problem is concerned, both these variables are assumed to be known precisely. Actually, as will be discussed later in connection with experimental design (Chapter 12), the value of such variables is chosen by the experimentalist and can have a paramount effect on the quality of the parameter estimates. Equations 6.61 are rewritten as follows

$$\frac{dx_1}{dt} = -u_1(r_1 + r_2) - (r_3 + r_4 - r_{-3}) \tag{6.63a}$$

$$\frac{dx_2}{dt} = u_1(r_1 - r_2) \tag{6.63b}$$

$$\frac{dx_3}{dt} = r_3 - r_4 - r_{-3} \tag{6.63c}$$

where

$$r_1 = \frac{k_1 \dfrac{u_2}{H} x_1}{\left[1 + \left(\dfrac{k_6 u_2}{H}\right)^{0.5} + k_7 x_1\right]^3} \tag{6.64a}$$

$$r_2 = \frac{k_2 x_2 x_1}{1 + \left(\dfrac{k_6 u_2}{H}\right)^{0.5} + k_7 x_1} \tag{6.64b}$$

$$r_3 = k_3 x_1 \tag{6.64c}$$

$$r_{-3} = k_4 x_3 \tag{6.64d}$$

$$r_4 = k_5 x_3 x_1 \tag{6.64e}$$

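The sensitivity matrix, G(t), is a (3×7)-dimensional matrix with elements:

$$G(t) = \left(\frac{\partial x^T}{\partial k}\right)^T = [g_1(t), g_2(t), \ldots, g_7(t)] = \left[\frac{\partial x}{\partial k_1}, \frac{\partial x}{\partial k_2}, \ldots, \frac{\partial x}{\partial k_7}\right] \tag{6.65a}$$

$$G(t) = \begin{bmatrix} G_{11}(t) & \cdots & G_{17}(t) \\ G_{21}(t) & \cdots & G_{27}(t) \\ G_{31}(t) & \cdots & G_{37}(t) \end{bmatrix}, \quad G_{ij}(t) = \frac{\partial x_i}{\partial k_j} \tag{6.65b}$$

Equations 6.16 then become,

$$\frac{dg_1(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_1(t) + \frac{\partial f}{\partial k_1}; \quad g_1(t_0) = 0 \tag{6.66a}$$

$$\frac{dg_2(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_2(t) + \frac{\partial f}{\partial k_2}; \quad g_2(t_0) = 0 \tag{6.66b}$$

$$\vdots$$

$$\frac{dg_7(t)}{dt} = \left(\frac{\partial f^T}{\partial x}\right)^T g_7(t) + \frac{\partial f}{\partial k_7}; \quad g_7(t_0) = 0 \tag{6.66c}$$

where

$$\left(\frac{\partial f^T}{\partial x}\right)^T = \begin{bmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 & \partial f_1/\partial x_3 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 & \partial f_2/\partial x_3 \\ \partial f_3/\partial x_1 & \partial f_3/\partial x_2 & \partial f_3/\partial x_3 \end{bmatrix} \tag{6.67a}$$

and

$$\frac{\partial f}{\partial k_j} = \left[\frac{\partial f_1}{\partial k_j}, \frac{\partial f_2}{\partial k_j}, \frac{\partial f_3}{\partial k_j}\right]^T; \quad j = 1, 2, \ldots, 7 \tag{6.67b}$$

Taking into account the above equations we obtain

$$\begin{aligned}
\frac{dG_{11}}{dt} &= \left(\frac{\partial f_1}{\partial x_1}\right) G_{11} + \left(\frac{\partial f_1}{\partial x_2}\right) G_{21} + \left(\frac{\partial f_1}{\partial x_3}\right) G_{31} + \frac{\partial f_1}{\partial k_1}; & G_{11}(0) &= 0 \\
\frac{dG_{21}}{dt} &= \left(\frac{\partial f_2}{\partial x_1}\right) G_{11} + \left(\frac{\partial f_2}{\partial x_2}\right) G_{21} + \left(\frac{\partial f_2}{\partial x_3}\right) G_{31} + \frac{\partial f_2}{\partial k_1}; & G_{21}(0) &= 0 \\
\frac{dG_{31}}{dt} &= \left(\frac{\partial f_3}{\partial x_1}\right) G_{11} + \left(\frac{\partial f_3}{\partial x_2}\right) G_{21} + \left(\frac{\partial f_3}{\partial x_3}\right) G_{31} + \frac{\partial f_3}{\partial k_1}; & G_{31}(0) &= 0 \\
&\;\;\vdots \\
\frac{dG_{17}}{dt} &= \left(\frac{\partial f_1}{\partial x_1}\right) G_{17} + \left(\frac{\partial f_1}{\partial x_2}\right) G_{27} + \left(\frac{\partial f_1}{\partial x_3}\right) G_{37} + \frac{\partial f_1}{\partial k_7}; & G_{17}(0) &= 0 \\
\frac{dG_{27}}{dt} &= \left(\frac{\partial f_2}{\partial x_1}\right) G_{17} + \left(\frac{\partial f_2}{\partial x_2}\right) G_{27} + \left(\frac{\partial f_2}{\partial x_3}\right) G_{37} + \frac{\partial f_2}{\partial k_7}; & G_{27}(0) &= 0 \\
\frac{dG_{37}}{dt} &= \left(\frac{\partial f_3}{\partial x_1}\right) G_{17} + \left(\frac{\partial f_3}{\partial x_2}\right) G_{27} + \left(\frac{\partial f_3}{\partial x_3}\right) G_{37} + \frac{\partial f_3}{\partial k_7}; & G_{37}(0) &= 0
\end{aligned} \tag{6.68}$$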
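The partial derivatives with respect to the state variables in Equation 6.67a that are needed in the above ODEs are given next

$$\frac{\partial f_1}{\partial x_1} = -u_1\left(\frac{\partial r_1}{\partial x_1} + \frac{\partial r_2}{\partial x_1}\right) - k_3 - k_5 x_3 \tag{6.69a}$$

$$\frac{\partial f_1}{\partial x_2} = -u_1 \frac{\partial r_2}{\partial x_2} \tag{6.69b}$$

$$\frac{\partial f_1}{\partial x_3} = -k_5 x_1 + k_4 \tag{6.69c}$$

$$\frac{\partial f_2}{\partial x_1} = u_1\left(\frac{\partial r_1}{\partial x_1} - \frac{\partial r_2}{\partial x_1}\right) \tag{6.69d}$$

$$\frac{\partial f_2}{\partial x_2} = -u_1 \frac{\partial r_2}{\partial x_2} \tag{6.69e}$$

$$\frac{\partial f_2}{\partial x_3} = 0 \tag{6.69f}$$

$$\frac{\partial f_3}{\partial x_1} = k_3 - k_5 x_3 \tag{6.69g}$$

$$\frac{\partial f_3}{\partial x_2} = 0 \tag{6.69h}$$

$$\frac{\partial f_3}{\partial x_3} = -k_5 x_1 - k_4 \tag{6.69i}$$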

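The partial derivatives with respect to the parameters in Equation 6.67b that are needed in the above ODEs are given next

$$\frac{\partial f}{\partial k_1} = \left[-u_1 \frac{\partial r_1}{\partial k_1}, \; u_1 \frac{\partial r_1}{\partial k_1}, \; 0\right]^T \tag{6.70a}$$

$$\frac{\partial f}{\partial k_2} = \left[-u_1 \frac{\partial r_2}{\partial k_2}, \; -u_1 \frac{\partial r_2}{\partial k_2}, \; 0\right]^T \tag{6.70b}$$

$$\frac{\partial f}{\partial k_3} = [-x_1, \; 0, \; x_1]^T \tag{6.70c}$$

$$\frac{\partial f}{\partial k_4} = [x_3, \; 0, \; -x_3]^T \tag{6.70d}$$

$$\frac{\partial f}{\partial k_5} = [-x_3 x_1, \; 0, \; -x_3 x_1]^T \tag{6.70e}$$

$$\frac{\partial f}{\partial k_6} = \left[-u_1\left(\frac{\partial r_1}{\partial k_6} + \frac{\partial r_2}{\partial k_6}\right), \; u_1\left(\frac{\partial r_1}{\partial k_6} - \frac{\partial r_2}{\partial k_6}\right), \; 0\right]^T \tag{6.70f}$$

$$\frac{\partial f}{\partial k_7} = \left[-u_1\left(\frac{\partial r_1}{\partial k_7} + \frac{\partial r_2}{\partial k_7}\right), \; u_1\left(\frac{\partial r_1}{\partial k_7} - \frac{\partial r_2}{\partial k_7}\right), \; 0\right]^T \tag{6.70g}$$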

The 21 equations (given as Equation 6.68) should be solved simultaneously with the three state equations (Equation 6.63, with the rates given by Equation 6.64). Integration of these 24 equations yields x(t) and G(t), which are used in setting up matrix A and vector b at each iteration of the Gauss-Newton method. Given the complexity of the ODEs when the dimensionality of the problem increases, it is quite helpful to have a general purpose computer program that sets up the sensitivity equations automatically. Furthermore, since analytical derivatives are subject to user input error, numerical evaluation of the derivatives can also be used in a typical computer implementation of the Gauss-Newton method. Details for a successful implementation of the method are given in Chapter 8.
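One way to evaluate the derivatives numerically is by central differences, as in the sketch below. The function name `numerical_jacobians`, the relative step size and the small two-state test model are assumptions for illustration; a production code would need a more careful choice of step size.

```python
def numerical_jacobians(f, x, k, rel_step=1e-6):
    """Central-difference estimates of df/dx (n x n) and df/dk (n x p).

    f(x, k) must return a list of n values; x and k are lists.
    """
    def partials(vec, evaluate):
        cols = []
        for j, v in enumerate(vec):
            d = rel_step * max(abs(v), 1.0)  # perturbation for component j
            up, dn = list(vec), list(vec)
            up[j], dn[j] = v + d, v - d
            cols.append([(a - b) / (2.0 * d)
                         for a, b in zip(evaluate(up), evaluate(dn))])
        n = len(cols[0])
        return [[col[i] for col in cols] for i in range(n)]  # rows i, columns j

    dfdx = partials(x, lambda xv: f(xv, k))
    dfdk = partials(k, lambda kv: f(x, kv))
    return dfdx, dfdk


# Assumed test model: dx1/dt = -k1*x1, dx2/dt = k1*x1 - k2*x2
def f(x, k):
    return [-k[0] * x[0], k[0] * x[0] - k[1] * x[1]]


dfdx, dfdk = numerical_jacobians(f, x=[1.0, 0.5], k=[2.0, 1.0])
```

The two returned matrices are exactly the ingredients needed on the right-hand side of the sensitivity equations, so they can replace hand-coded derivatives at the cost of extra function evaluations.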

6.6 EQUIVALENCE OF GAUSS-NEWTON WITH THE QUASILINEARIZATION METHOD

The quasilinearization method (QM) is another method for solving off-line parameter estimation problems described by Equations 6.1, 6.2 and 6.3 (Bellman and Kalaba, 1965). Quasilinearization converges quadratically to the optimum but has a small region of convergence (Seinfeld and Gavalas, 1970). Kalogerakis and Luus (1983b) presented an alternative development of the QM that enables a more efficient implementation of the algorithm. Furthermore, they showed that this simplified QM is very similar to the Gauss-Newton method. Next, the quasilinearization method as well as the simplified quasilinearization method are described and the equivalence of QM to the Gauss-Newton method is demonstrated.

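6.6.1 The Quasilinearization Method and its Simplification

An estimate $k^{(j)}$ of the unknown parameter vector is available at the jth iteration. Equation 6.1 then becomes

$$\frac{dx^{(j)}(t)}{dt} = f(x^{(j)}(t), k^{(j)}) \tag{6.71}$$

Using the parameter estimate $k^{(j+1)}$ from the next iteration we obtain from Equation 6.1

$$\frac{dx^{(j+1)}(t)}{dt} = f(x^{(j+1)}(t), k^{(j+1)}) \tag{6.72}$$

By using a Taylor series expansion on the right-hand side of Equation 6.72 and keeping only the linear terms we obtain the following equation

$$\frac{dx^{(j+1)}(t)}{dt} = f(x^{(j)}(t), k^{(j)}) + \left(\frac{\partial f^T}{\partial x}\right)^T [x^{(j+1)}(t) - x^{(j)}(t)] + \left(\frac{\partial f^T}{\partial k}\right)^T [k^{(j+1)} - k^{(j)}] \tag{6.73}$$

where the partial derivatives are evaluated at $x^{(j)}(t)$. The above equation is linear in $x^{(j+1)}$ and $k^{(j+1)}$. Integration of this linear equation will result in the following equation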

Copyright 2001 by Taylor & Francis Group, LLC

112

Chapter 6

x ti+1) (t) = g(t) + G(t)k u+ "

(6.74)

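where $\mathbf{g}(t)$ is an n-dimensional vector and $\mathbf{G}(t)$ is an $n \times p$ matrix. Equation 6.74 is differentiated and the RHS of the resultant equation is equated with the RHS of Equation 6.73 to yield

$$\frac{d\mathbf{g}(t)}{dt} = \mathbf{f}\left(\mathbf{x}^{(j)}(t), \mathbf{k}^{(j)}\right) + \left(\frac{\partial \mathbf{f}^T}{\partial \mathbf{x}}\right)^T \left(\mathbf{g}(t) - \mathbf{x}^{(j)}(t)\right) - \left(\frac{\partial \mathbf{f}^T}{\partial \mathbf{k}}\right)^T \mathbf{k}^{(j)} \tag{6.75}$$

and

$$\frac{d\mathbf{G}(t)}{dt} = \left(\frac{\partial \mathbf{f}^T}{\partial \mathbf{x}}\right)^T \mathbf{G}(t) + \left(\frac{\partial \mathbf{f}^T}{\partial \mathbf{k}}\right)^T \tag{6.76}$$

The initial conditions for Equations 6.75 and 6.76 are as follows

$$\mathbf{g}(t_0) = \mathbf{x}_0 \tag{6.77a}$$

$$\mathbf{G}(t_0) = \mathbf{0} \tag{6.77b}$$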

Equations 6.71, 6.75 and 6.76 can be solved simultaneously to yield $\mathbf{g}(t)$ and $\mathbf{G}(t)$ when the initial state vector $\mathbf{x}_0$ and the parameter estimate vector $\mathbf{k}^{(j)}$ are given. In order to determine $\mathbf{k}^{(j+1)}$ the output vector (given by Equation 6.2) is inserted into the objective function (Equation 6.4) and the stationary condition yields

$$\frac{\partial S\left(\mathbf{k}^{(j+1)}\right)}{\partial \mathbf{k}^{(j+1)}} = \mathbf{0} \tag{6.78}$$

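The case of a nonlinear observational relationship (Equation 6.3) will be examined later. Equation 6.78 yields the following linear equation, which is solved by LU decomposition (or any other technique) to obtain $\mathbf{k}^{(j+1)}$

$$\left[\sum_{i=1}^{N} \mathbf{G}^T(t_i)\,\mathbf{C}^T \mathbf{Q}_i\, \mathbf{C}\,\mathbf{G}(t_i)\right] \mathbf{k}^{(j+1)} = \sum_{i=1}^{N} \mathbf{G}^T(t_i)\,\mathbf{C}^T \mathbf{Q}_i \left[\hat{\mathbf{y}}_i - \mathbf{C}\,\mathbf{g}(t_i)\right] \tag{6.79}$$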

As matrix $\mathbf{Q}_i$ is positive definite, the above equation gives the minimum of the objective function.
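The assembly and solution of Equation 6.79 can be sketched as follows. This is a minimal illustration, assuming that g(t_i) and G(t_i) have already been obtained by integration; the function name `solve_qm_update` and the randomly generated test arrays are hypothetical and serve only to check that the linear solve recovers a known parameter vector.

```python
import numpy as np

def solve_qm_update(G_list, g_list, y_list, C, Q_list):
    """Assemble and solve the linear system of Equation 6.79 for k^(j+1)."""
    p = G_list[0].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for G, g, y, Q in zip(G_list, g_list, y_list, Q_list):
        CG = C @ G                        # m x p
        A += CG.T @ Q @ CG                # left hand side matrix
        b += CG.T @ Q @ (y - C @ g)       # right hand side vector
    # numpy.linalg.solve calls a LAPACK routine based on LU factorization
    # with partial pivoting.
    return np.linalg.solve(A, b)

# Synthetic check (made-up numbers): outputs generated exactly from
# y_i = C (g_i + G_i k_true), so the solve should return k_true.
rng = np.random.default_rng(0)
n, p, N = 2, 2, 5
C = np.eye(n)
k_true = np.array([1.0, 0.5])
G_list = [rng.normal(size=(n, p)) for _ in range(N)]
g_list = [rng.normal(size=n) for _ in range(N)]
Q_list = [np.eye(n) for _ in range(N)]
y_list = [C @ (g + G @ k_true) for G, g in zip(G_list, g_list)]
k_new = solve_qm_update(G_list, g_list, y_list, C, Q_list)
```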

Since linearization of the differential Equation 6.1 around the trajectory $\mathbf{x}^{(j)}(t)$, resulting from the choice of $\mathbf{k}^{(j)}$, has been used, the above method gives $\mathbf{k}^{(j+1)}$, which is an approximation to the best parameter vector. Using this value as $\mathbf{k}^{(j)}$, a new $\mathbf{k}^{(j+1)}$ can be obtained, and thus a sequence of vectors $\mathbf{k}^{(0)}, \mathbf{k}^{(1)}, \mathbf{k}^{(2)}, \ldots$ is obtained. This sequence converges rapidly to the optimum provided that the initial guess is sufficiently good. The above described methodology constitutes the Quasilinearization Method (QM). The total number of differential equations which must be integrated at each iteration step is $n \times (p+2)$.

Kalogerakis and Luus (1983b) noticed that Equation 6.75 is redundant. Since Equation 6.74 is obtained by linearization around the nominal trajectory $\mathbf{x}^{(j)}(t)$ resulting from $\mathbf{k}^{(j)}$, if we let $\mathbf{k}^{(j+1)}$ be $\mathbf{k}^{(j)}$ then Equation 6.74 becomes

$$\mathbf{x}^{(j)}(t) = \mathbf{g}(t) + \mathbf{G}(t)\,\mathbf{k}^{(j)} \tag{6.80}$$

Equation 6.80 is exact rather than a first order approximation as Equation 6.74 is. This is simply because Equation 6.80 is Equation 6.74 evaluated at the point of linearization, $\mathbf{k}^{(j)}$. Thus Equation 6.80 can be used to compute $\mathbf{g}(t)$ as

$$\mathbf{g}(t) = \mathbf{x}^{(j)}(t) - \mathbf{G}(t)\,\mathbf{k}^{(j)} \tag{6.81}$$
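The redundancy of Equation 6.75 can be checked numerically. The sketch below integrates the full set of n×(p+2) equations (6.71, 6.75 and 6.76) for a hypothetical two-state model with a hand-written fixed-step Runge-Kutta scheme, and then compares the integrated g(t) with the value given by Equation 6.81. The model, the parameter values and the helper names (`qm_rhs`, `rk4`) are assumptions made for this example.

```python
import numpy as np

# Hypothetical model A -> B -> C: dx1/dt = -k1*x1, dx2/dt = k1*x1 - k2*x2.
def f(x, k):
    return np.array([-k[0] * x[0], k[0] * x[0] - k[1] * x[1]])

def dfdx(x, k):
    return np.array([[-k[0], 0.0], [k[0], -k[1]]])

def dfdk(x, k):
    return np.array([[-x[0], 0.0], [x[0], -x[1]]])

def qm_rhs(z, k, n, p):
    """Right hand sides of Equations 6.71, 6.75 and 6.76 in one vector."""
    x, g, G = z[:n], z[n:2 * n], z[2 * n:].reshape(n, p)
    fx, fk = dfdx(x, k), dfdk(x, k)
    dx = f(x, k)                              # Eq. 6.71
    dg = f(x, k) + fx @ (g - x) - fk @ k      # Eq. 6.75
    dG = fx @ G + fk                          # Eq. 6.76
    return np.concatenate([dx, dg, dG.ravel()])

def rk4(rhs, z0, t_end, steps, *args):
    """Classical fourth-order Runge-Kutta for an autonomous system."""
    z, h = z0.copy(), t_end / steps
    for _ in range(steps):
        s1 = rhs(z, *args)
        s2 = rhs(z + 0.5 * h * s1, *args)
        s3 = rhs(z + 0.5 * h * s2, *args)
        s4 = rhs(z + h * s3, *args)
        z = z + (h / 6.0) * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
    return z

n, p = 2, 2
x0 = np.array([1.0, 0.0])
k_j = np.array([0.8, 0.4])                        # current estimate k^(j)
z0 = np.concatenate([x0, x0, np.zeros(n * p)])    # x(t0)=x0, g(t0)=x0, G(t0)=0
z = rk4(qm_rhs, z0, 2.0, 200, k_j, n, p)
x, g, G = z[:n], z[n:2 * n], z[2 * n:].reshape(n, p)

# Largest difference between the integrated g and Equation 6.81.
gap = np.max(np.abs(g - (x - G @ k_j)))
```

In this sketch `gap` is expected to be at the level of round-off error, so the n equations for g(t) can be dropped and g(t) computed algebraically instead.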

It is obvious that the use of Equation 6.81 leads to a simplification because the number of differential equations that now need to be integrated is $n \times (p+1)$. Kalogerakis and Luus (1983b) then proposed the following algorithm for the QM.

Step 1. Select an initial guess $\mathbf{k}^{(0)}$. Hence j=0.

Step 2. Integrate Equations 6.71 and 6.76 simultaneously to obtain $\mathbf{x}^{(j)}(t)$ and $\mathbf{G}(t)$.

Step 3. Use Equation 6.81 to obtain $\mathbf{g}(t_i)$, i=1,2,...,N and set up matrix A and vector b in Equation 6.79.

Step 4. Solve Equation 6.79 to obtain $\mathbf{k}^{(j+1)}$.

Step 5. Continue until

$$\left\| \mathbf{k}^{(j+1)} - \mathbf{k}^{(j)} \right\| < TOL \tag{6.82}$$

where TOL is a preset small number to ensure termination of the iterations. If the above inequality is not satisfied then we take $\mathbf{k}^{(j+1)}$ as the current estimate, increase j by one and go to Step 2 to repeat the calculations.
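The five steps above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the implementation of Kalogerakis and Luus: the two-state test model, the noise-free synthetic data, the fixed-step Runge-Kutta integrator, the use of a single weighting matrix Q for all measurements, and the names `integrate` and `simplified_qm` are all hypothetical choices made for the example.

```python
import numpy as np

# Hypothetical model A -> B -> C: dx1/dt = -k1*x1, dx2/dt = k1*x1 - k2*x2.
def f(x, k):
    return np.array([-k[0] * x[0], k[0] * x[0] - k[1] * x[1]])

def dfdx(x, k):
    return np.array([[-k[0], 0.0], [k[0], -k[1]]])

def dfdk(x, k):
    return np.array([[-x[0], 0.0], [x[0], -x[1]]])

def rhs(z, k, n, p):
    """Equations 6.71 and 6.76 stacked into one vector of n*(p+1) ODEs."""
    x, G = z[:n], z[n:].reshape(n, p)
    return np.concatenate([f(x, k), (dfdx(x, k) @ G + dfdk(x, k)).ravel()])

def integrate(k, x0, t_obs, substeps=25):
    """Fixed-step RK4 from t0 = 0; returns x(t_i) and G(t_i) for each t_i."""
    n, p = x0.size, k.size
    z, t = np.concatenate([x0, np.zeros(n * p)]), 0.0   # G(t0) = 0
    xs, Gs = [], []
    for t_i in t_obs:
        h = (t_i - t) / substeps
        for _ in range(substeps):
            s1 = rhs(z, k, n, p)
            s2 = rhs(z + 0.5 * h * s1, k, n, p)
            s3 = rhs(z + 0.5 * h * s2, k, n, p)
            s4 = rhs(z + h * s3, k, n, p)
            z = z + (h / 6.0) * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
        t = t_i
        xs.append(z[:n].copy())
        Gs.append(z[n:].reshape(n, p).copy())
    return xs, Gs

def simplified_qm(k0, x0, t_obs, y_obs, C, Q, tol=1e-8, max_iter=20):
    k = np.asarray(k0, dtype=float)                 # Step 1
    for _ in range(max_iter):
        xs, Gs = integrate(k, x0, t_obs)            # Step 2
        A = np.zeros((k.size, k.size))
        b = np.zeros(k.size)
        for x, G, y in zip(xs, Gs, y_obs):          # Step 3
            g = x - G @ k                           # Eq. 6.81
            CG = C @ G
            A += CG.T @ Q @ CG
            b += CG.T @ Q @ (y - C @ g)
        k_new = np.linalg.solve(A, b)               # Step 4 (Eq. 6.79)
        if np.linalg.norm(k_new - k) < tol:         # Step 5 (Eq. 6.82)
            return k_new
        k = k_new
    raise RuntimeError("no convergence within max_iter iterations")

x0 = np.array([1.0, 0.0])
t_obs = np.linspace(0.25, 2.0, 8)
k_true = np.array([1.0, 0.5])
C, Q = np.eye(2), np.eye(2)
# Noise-free synthetic "measurements" generated from the model itself.
y_obs = [C @ x for x in integrate(k_true, x0, t_obs)[0]]
k_hat = simplified_qm([0.8, 0.4], x0, t_obs, y_obs, C, Q)
```

No step-size control is included, so, in line with the small region of convergence noted earlier, the sketch relies on the initial guess being reasonably close to the optimum.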


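6.6.2 Equivalence to Gauss-Newton Method

If we compare Equations 6.79 and 6.11 we notice that the only difference between the quasilinearization method and the Gauss-Newton method is the nature of the equation that yields the parameter estimate vector $\mathbf{k}^{(j+1)}$. If one substitutes Equation 6.81 into Equation 6.79, one obtains the following equation

$$\left[\sum_{i=1}^{N} \mathbf{G}^T(t_i)\,\mathbf{C}^T \mathbf{Q}_i\, \mathbf{C}\,\mathbf{G}(t_i)\right] \mathbf{k}^{(j+1)} = \sum_{i=1}^{N} \mathbf{G}^T(t_i)\,\mathbf{C}^T \mathbf{Q}_i \left[\hat{\mathbf{y}}_i - \mathbf{C}\,\mathbf{x}^{(j)}(t_i) + \mathbf{C}\,\mathbf{G}(t_i)\,\mathbf{k}^{(j)}\right] \tag{6.83}$$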

By taking the last term on the right hand side of Equation 6.83 to the left hand side one obtains Equation 6.11 that is used for the Gauss-Newton method. Hence, when the output vector is linearly related to the state vector (Equation 6.2), the simplified quasilinearization method is computationally identical to the Gauss-Newton method.

Kalogerakis and Luus (1983b) compared the computational effort required by the Gauss-Newton, simplified quasilinearization and standard quasilinearization methods. They found that all methods produced the same new estimates at each iteration, as expected. Furthermore, the required computational time for the Gauss-Newton and the simplified quasilinearization methods was the same and about 90% of that required by the standard quasilinearization method.
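The rearrangement that links Equation 6.83 to the Gauss-Newton equation can be written out step by step. The sketch below assumes that Equation 6.11 has the usual Gauss-Newton form, a linear system for the parameter change; the symbol $\Delta\mathbf{k}^{(j+1)} = \mathbf{k}^{(j+1)} - \mathbf{k}^{(j)}$ is introduced here on that assumption.

```latex
\begin{align*}
% Equation 6.83 with the k^{(j)} term written separately
\left[\sum_{i=1}^{N} \mathbf{G}^T(t_i)\mathbf{C}^T\mathbf{Q}_i\mathbf{C}\mathbf{G}(t_i)\right]\mathbf{k}^{(j+1)}
  &= \sum_{i=1}^{N} \mathbf{G}^T(t_i)\mathbf{C}^T\mathbf{Q}_i\left[\hat{\mathbf{y}}_i - \mathbf{C}\mathbf{x}^{(j)}(t_i)\right]
   + \left[\sum_{i=1}^{N} \mathbf{G}^T(t_i)\mathbf{C}^T\mathbf{Q}_i\mathbf{C}\mathbf{G}(t_i)\right]\mathbf{k}^{(j)} \\
% Moving the last term to the left hand side
\left[\sum_{i=1}^{N} \mathbf{G}^T(t_i)\mathbf{C}^T\mathbf{Q}_i\mathbf{C}\mathbf{G}(t_i)\right]
  \left(\mathbf{k}^{(j+1)} - \mathbf{k}^{(j)}\right)
  &= \sum_{i=1}^{N} \mathbf{G}^T(t_i)\mathbf{C}^T\mathbf{Q}_i\left[\hat{\mathbf{y}}_i - \mathbf{C}\mathbf{x}^{(j)}(t_i)\right]
\end{align*}
```

The second line has the same matrix on the left as Equation 6.79, while the right hand side now contains the residuals between the measurements and the model output computed with the current estimate $\mathbf{k}^{(j)}$.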
6.6.3 Nonlinear Output Relationship

When the output vector is nonlinearly related to the state vector (Equation 6.3), substitution of $\mathbf{x}^{(j+1)}$ from Equation 6.74 into Equation 6.3, followed by substitution of the resulting equation into the objective function (Equation 6.4), yields the following equation after application of the stationary condition (Equation 6.78)
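Under the assumption that h depends on the parameters only through the state vector (any explicit dependence of h on k is neglected here), applying the chain rule to Equation 6.4 with $\mathbf{x}^{(j+1)}(t_i) = \mathbf{g}(t_i) + \mathbf{G}(t_i)\mathbf{k}^{(j+1)}$ suggests a condition of the following form. This is a sketch derived from the surrounding equations, with the Jacobian of h evaluated at $\mathbf{x}^{(j+1)}(t_i)$.

```latex
\begin{equation*}
\sum_{i=1}^{N} \mathbf{G}^T(t_i)
  \left(\frac{\partial \mathbf{h}^T}{\partial \mathbf{x}}\right)_i
  \mathbf{Q}_i
  \left[\hat{\mathbf{y}}_i - \mathbf{h}\left(\mathbf{g}(t_i) + \mathbf{G}(t_i)\,\mathbf{k}^{(j+1)}\right)\right]
  = \mathbf{0}
\end{equation*}
```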

The above equation represents a set of p nonlinear equations which can be solved to obtain $\mathbf{k}^{(j+1)}$. The solution of this set of equations can be accomplished by two methods: first, by employing Newton's method or, alternatively, by linearizing the output vector around the trajectory $\mathbf{x}^{(j)}(t)$. Kalogerakis and Luus (1983b) showed that when linearization of the output vector is used, the quasilinearization computational algorithm and the Gauss-Newton method yield the same results.
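The second option, linearization of the output vector, can be sketched as follows. The shorthand $\mathbf{H}_i$ for the $m \times n$ Jacobian of h evaluated at $\mathbf{x}^{(j)}(t_i)$ is introduced here for brevity, and h is again assumed to depend on the parameters only through the state.

```latex
\begin{align*}
% First-order expansion of the output around the current trajectory;
% Equations 6.74 and 6.80 give x^{(j+1)} - x^{(j)} = G (k^{(j+1)} - k^{(j)})
\mathbf{h}\left(\mathbf{x}^{(j+1)}(t_i)\right)
  &\approx \mathbf{h}\left(\mathbf{x}^{(j)}(t_i)\right)
   + \mathbf{H}_i\,\mathbf{G}(t_i)\left(\mathbf{k}^{(j+1)} - \mathbf{k}^{(j)}\right) \\
% Stationary condition of the objective function with the linearized output
\left[\sum_{i=1}^{N} \mathbf{G}^T(t_i)\mathbf{H}_i^T\mathbf{Q}_i\mathbf{H}_i\mathbf{G}(t_i)\right]
  \left(\mathbf{k}^{(j+1)} - \mathbf{k}^{(j)}\right)
  &= \sum_{i=1}^{N} \mathbf{G}^T(t_i)\mathbf{H}_i^T\mathbf{Q}_i
     \left[\hat{\mathbf{y}}_i - \mathbf{h}\left(\mathbf{x}^{(j)}(t_i)\right)\right]
\end{align*}
```

With $\mathbf{H}_i = \mathbf{C}$ the second line reduces to the linear-output case, which is consistent with the equivalence result quoted above.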