Computers and Chemical Engineering 24 (2000) 39 – 51

www.elsevier.com/locate/compchemeng

A reduced space interior point strategy for optimization of differential algebraic systems
Arturo M. Cervantes a, Andreas Wächter a, Reha H. Tütüncü b, Lorenz T. Biegler a,*
a Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
b Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA

Received 18 August 1999; received in revised form 21 January 2000; accepted 21 January 2000

Abstract

A novel nonlinear programming (NLP) strategy is developed and applied to the optimization of differential algebraic equation
(DAE) systems. Such problems, also referred to as dynamic optimization problems, are common in process engineering and
remain challenging applications of nonlinear programming. These applications often consist of large, complex nonlinear models
that result from discretizations of DAEs. Variables in the NLP include state and control variables, with far fewer control variables
than states. Moreover, all of these discretized variables have associated upper and lower bounds that can be potentially active. To
deal with this large, highly constrained problem, an interior point NLP strategy is developed. Here a log barrier function is used
to deal with the large number of bound constraints in order to transform the problem to an equality constrained NLP. A modified
Newton method is then applied directly to this problem. In addition, this method uses an efficient decomposition of the discretized
DAEs and the solution of the Newton step is performed in the reduced space of the independent variables. The resulting approach
exploits many of the features of the DAE system and is performed element by element in a forward manner. Several large dynamic
process optimization problems are considered to demonstrate the effectiveness of this approach; these include complex separation
and reaction processes (including reactive distillation) with several hundred DAEs. NLP formulations with over 55 000 variables
are considered. These problems are solved in 5–12 CPU min on small workstations. © 2000 Elsevier Science Ltd. All rights
reserved.

Keywords: Interior point; Dynamic optimization; Nonlinear programming

* Corresponding author. Tel.: +1-412-2682232; fax: +1-412-268-7139.

0098-1354/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0098-1354(00)00302-1

1. Introduction

Interest in dynamic simulation and optimization of chemical processes has increased significantly during the last decade. Common problems include control and scheduling of batch processes; startup, upset, shutdown and transient analysis; safety studies and the evaluation of control schemes. Chemical processes are modeled dynamically using differential-algebraic equations (DAEs). The DAE formulation consists of differential equations that describe the dynamic behavior of the system, such as mass and energy balances, and algebraic equations that ensure physical and thermodynamic relations. Despite the vast amount of work done in the area of dynamic simulation, the tools developed have only limited direct applicability to dynamic optimization. Initial value formulations for DAEs, which can fail in the presence of unstable dynamic modes, are examples of the limiting factors.

DAE optimization problems can be solved using a variational approach (Pontryagin, Boltyanskii, Gamkrelidze & Mishchenko, 1962) or by applying a nonlinear programming (NLP) solver to the DAE model. The variational approach works well for problems without bounds. However, if the problem requires the handling of active constraints, finding the correct switching structure as well as suitable initial guesses for state and adjoint variables is sometimes impossible. The methods that apply NLP solvers can be separated into two groups: the sequential and the simultaneous strategies. In the first strategy, the control variables are parameterized using a finite set of control parameters. Then, the objective and constraint functions are evaluated for a given set of parameters by integration of the dynamic model using a DAE solver. The sensitivities with respect to the parameters are obtained from the same DAE solver and a small optimization problem is solved in the space of the parameters. Several studies describe this approach; their algorithms differ in the integration technique and the method used for obtaining sensitivities. See Vassiliadis (1993) for a review of these methods.

Simultaneous approaches couple the solution of the DAE system with the optimization problem. These techniques discretize the state and control variables, leading to large-scale NLP problems which can require special solution strategies. Despite this characteristic, the simultaneous approaches have advantages for problems with path constraints and for systems where instabilities occur for a range of inputs. In addition, the simultaneous approach solves the DAE system only once, at the optimal point, and therefore can avoid intermediate solutions that may not exist or may require excessive computational effort.

The large-scale NLP problems that arise from the full discretization of the DAE system are usually solved using sequential quadratic programming (SQP) methods. These methods can be classified into full-space and reduced space approaches. Full-space methods take advantage of the DAE optimization problem structure and the sparsity of the model. They are very efficient for problems with many degrees of freedom (Lucia & Xu, 1990; Betts & Huffman, 1992; Betts & Frank, 1994) as the optimality conditions can be easily stored and factored. Characteristics of these methods are that second derivatives of the objective function and constraints are usually required, and special precautions are necessary to ensure convergence properties. In particular, Betts and Huffman (1992) and Betts and Frank (1994) deal with a regularization based on a Gershgorin bound. In addition, a direct full-space factorization is developed by Forsgren and Gill (1998) that takes advantage of directions of negative curvature. Moreover, there are a number of studies that use indirect methods including truncated Newton methods that use preconditioned conjugate gradient methods (Byrd, Hribar & Nocedal, 1999).

For dynamic optimization, a full-space algorithm, which exploits the almost block diagonal structure of the DAE optimization problem, was developed (Albuquerque, Gopal, Staus, Biegler & Ydstie, 1997). This approach decouples the optimality conditions for each block of the quadratic programming (QP) subproblem using an affine transformation. This way, the first-order conditions in the state and control variables can be solved recursively, so that the solution effort increases only linearly with the number of blocks. A full-space method is also presented in Betts and Huffman (1992) and Betts and Frank (1994). In this work, the sparsity and the block diagonal structure are also exploited, but the degrees of freedom of the problems solved are relatively large compared to the number of variables.

On the other hand, in most process engineering problems $(n - m)/n \ll 1$, where $n$ is the number of variables and $m$ is the number of equality constraints, as the number of state variables is much larger than the number of control variables. Moreover, in current process applications it is rare, even for dynamic optimization problems, for $(n - m)$ to exceed a few hundred. For these cases, a reduced space SQP approach (rSQP) can be more efficient. With this approach, either projected Hessian matrices or their quasi-Newton approximations may be used, thus avoiding the necessity of second derivatives. This becomes important because there are few commercial modeling packages that provide second derivatives (Betts & Frank, 1994) for optimization and none of these relate to process engineering applications. As a result, there is still a strong need to consider quasi-Newton methods and to implement them properly. Here an efficient algorithm can be constructed by decoupling the search direction into its components in range and null spaces and solving a smaller QP subproblem at every iteration.

In Cervantes and Biegler (2000), we presented a simultaneous rSQP algorithm that exploits the sparsity of the DAE system, as well as the almost block diagonal structure of the DAE optimization problem. The DAE system is discretized using collocation on finite elements. The variables are then partitioned into dependent and independent variables in each element. A Newton step for the dependent variables is obtained by solving small square systems of equations for each element. The step for the discretized independent variables is obtained by solving a QP for all the elements. For this system, unstable modes are detected and avoided by selecting a numerically stable pivot sequence for the LU factorization of the collocation matrix in each element. Therefore, unstable modes are treated by selecting state variables as decisions in the partitioning step. The system is thus stabilized without imposing additional boundary conditions.

With this approach, solution and storage of the large collocation matrix is avoided, but the size of the reduced QP subproblem remains the same. Moreover, the number of inequalities can still be large, because of the lower and upper bounds in state and control variables, and solution of the QP subproblem using active-set techniques can be inefficient. Instead, the utilization of barrier methods is a possible alternative as they eliminate the combinatorial problem of selecting an active set. These techniques, also known as interior point (IP) methods, have been shown to solve large linear programs more efficiently than the standard active-set methods (the simplex method). Their applicability and success
also extends to the solution of QP problems and other classes of nonlinear programs.

In particular, SQP methods that solve the QP subproblem with interior point methods have been developed. These algorithms follow either a full space (Albuquerque et al., 1997; Kyriakopoulou, 1997; Sargent, 1997) or a reduced space SQP approach (Kyriakopoulou, 1997; Ternet & Biegler, 1999), in which the quadratic programming subproblem generated at each iteration is solved using a primal-dual interior point method. For process problems with many active inequality constraints, these interior point approaches perform very well because their combinatorial complexity is very low (Albuquerque et al., 1997; Kyriakopoulou, 1997). However, in the solution of each QP, there still is a fixed cost which requires a number of interior point iterations (on the order of ten) and the solution of a linear system for each IP iteration. On the other hand, warm starts of the active constraints, which greatly accelerate active set QP solvers, have not yet been developed for interior point solvers. Because of this fixed cost, it was found (Ternet & Biegler, 1999) that interior point QP solvers were not competitive with active set strategies if only a few inequality constraints become active.

More recently, interior point methods have been applied directly to the solution of large NLP problems (Vanderbei & Shanno, 1999; Yamashita, Yabe & Tanabe, 1997; Gay, Overton & Wright, 1998; Byrd et al., 1999). These methods stem from the classical work on penalty functions (Fiacco & McCormick, 1968) but careful analysis and modern implementations have shown that these methods are much better conditioned than previously believed (Wright, 1998). In these approaches inequality constraints are replaced by logarithmic penalty terms in the objective function and the resulting equality constrained problem is solved with Newton-type methods applied to the optimality conditions. Moreover, in contrast to SQP methods that use interior point QP solvers, this approach requires only one linear system to be solved for each NLP iteration.

In this paper we consider a new algorithm that falls into this category. Here we develop a reduced space decomposition for the NLP problem and apply quasi-Newton methods to the reduced Hessian quantity. This was encouraged by recent work (Liu, 1998) on a related interior point algorithm that compares quasi-Newton methods with full space methods with exact Hessians. The resulting approach also allows us to use our previous elemental decomposition strategy for the collocation matrix.

In the following section we briefly present the NLP formulation of the DAE optimization problem as well as the decomposition strategy used during the solution. Section 3 includes the description of our interior point algorithm applied to the NLP. In Section 4 we present some computational examples that illustrate the performance of this algorithm, while Section 5 concludes the paper and describes some future directions.

2. NLP problem formulation

The general DAE optimization problem can be stated as follows:

$$\min_{z(t),\, y(t),\, u(t),\, t_f,\, p} \; \varphi(z(t_f), y(t_f), u(t_f), t_f, p) \tag{1}$$

s.t. DAE model:

$$F\left(\frac{dz(t)}{dt}, z(t), y(t), u(t), t, p\right) = 0 \tag{2}$$

$$G(z(t), y(t), u(t), t, p) = 0 \tag{3}$$

initial conditions:

$$z(0) = z^0 \tag{4}$$

point conditions:

$$H_s(z(t_s), y(t_s), u(t_s), t_s, p) = 0 \quad \text{for } s \in \{1, \ldots, n_s\} \tag{5}$$

bounds:

$$\begin{aligned}
z^L &\le z(t) \le z^U \\
y^L &\le y(t) \le y^U \\
u^L &\le u(t) \le u^U \\
p^L &\le p \le p^U \\
t_f^L &\le t_f \le t_f^U
\end{aligned} \tag{6}$$

where

$\varphi$ is a scalar objective function
$F$ are differential equation constraints
$G$ are algebraic equation constraints
$H_s$ are additional point conditions at fixed times $t_s$
$z$ are differential state profile vectors
$z^0$ are the initial values of $z$
$y$ are algebraic state profile vectors
$u$ are control profile vectors
$p$ is a time-independent parameter vector
$t_f$ is the final time

The DAE optimization problem is converted into an NLP by approximating state and control profiles by a family of polynomials on finite elements. Here we use a monomial basis representation (Bader & Ascher, 1987) for the differential profiles as follows:

$$z(t) = z_{i-1} + h_i \sum_{q=1}^{\mathrm{ncol}} \Omega_q\!\left(\frac{t - t_{i-1}}{h_i}\right) \left(\frac{dz}{dt}\right)_{i,q} \tag{7}$$

where $z_{i-1}$ is the value of the differential variable at the beginning of element $i$, $h_i$ is the length of element $i$, $(dz/dt)_{i,q}$ is the value of its first derivative in element $i$ at
the collocation point $q$, and $\Omega_q$ is a polynomial of order ncol, satisfying

$$\Omega_q(0) = 0 \quad \text{for } q = 1, \ldots, \mathrm{ncol}$$

$$\Omega_q'(\rho_r) = \delta_{q,r} \quad \text{for } q, r = 1, \ldots, \mathrm{ncol}$$

where $\rho_r$ is the $r$th collocation point within each element. Here, projected Gauss–Legendre points are used because they allow us to set constraints easily at the end of each element and to stabilize the system more efficiently if high index is present. This monomial representation is also recommended because it leads to smaller condition numbers and smaller rounding errors (Bader & Ascher, 1987).

In addition, the control and algebraic profiles are approximated using a similar monomial basis representation which takes the form:

$$y(t) = \sum_{q=1}^{\mathrm{ncol}} \psi_q\!\left(\frac{t - t_{i-1}}{h_i}\right) y_{i,q} \tag{8}$$

$$u(t) = \sum_{q=1}^{\mathrm{ncol}} \psi_q\!\left(\frac{t - t_{i-1}}{h_i}\right) u_{i,q}. \tag{9}$$

Here $y_{i,q}$ and $u_{i,q}$ represent the values of the algebraic and control variables, respectively, in element $i$ at collocation point $q$. $\psi_q$ is a Lagrange polynomial of order ncol satisfying

$$\psi_q(\rho_r) = \delta_{q,r} \quad \text{for } q, r = 1, \ldots, \mathrm{ncol}.$$

From Eq. (7), the differential variables are required to be continuous throughout the time horizon, while the control and algebraic variables are allowed to have discontinuities at the boundaries of the elements. It should be mentioned that with this representation (Eq. (7)), the bounds on the differential variables are only enforced directly at element boundaries; however, they can be enforced at all collocation points by writing appropriate point constraints (Eq. (5)).

Here it is assumed that the number of finite elements, ne, and their lengths are pre-determined. With this assumption, the substitution of Eqs. (7)–(9) into Eqs. (1)–(6) leads to the following nonlinear programming problem (NLP)

$$\min_{x \in \mathbb{R}^n} \; f(x) \tag{10}$$

$$\text{s.t. } c(x) = 0 \tag{11}$$

$$x^L \le x \le x^U \tag{12}$$

where

$$x = \left(\left(\frac{dz}{dt}\right)_{i,q},\, z_i,\, y_{i,q},\, u_{i,q},\, t,\, p\right)^T, \quad f: \mathbb{R}^n \to \mathbb{R} \ \text{ and } \ c: \mathbb{R}^n \to \mathbb{R}^m.$$

2.1. Reduced-Hessian successive quadratic programming (rSQP)

The NLP problem (Eqs. (10)–(12)) can be solved using an rSQP method (Tanartkit & Biegler, 1995; Cervantes & Biegler, 1998, 2000). This method is very efficient for solving DAE optimization problems, especially when the dimension of the state variables is much larger than that of the control variables ($n \gg n - m$). The efficiency of the solution procedure is also improved by performing matrix factorizations in each element. This allows us to preserve and exploit the structure of the problem and to detect ill-conditioning due to unstable modes in the DAE system. At each iteration $k$, a search direction $d_k$ is obtained by solving a quadratic approximation of the original problem (Eqs. (10)–(12))

$$\min_{d_k \in \mathbb{R}^n} \; g(x_k)^T d_k + \frac{1}{2} d_k^T H(x_k) d_k \tag{13}$$

$$\text{s.t. } c_k + A_k^T d_k = 0 \tag{14}$$

$$x^L \le x_k + d_k \le x^U \tag{15}$$

where $g$ is the gradient of $f$, $H$ denotes the Hessian of the Lagrangian function, the columns of $A_k = A(x_k)$ are the gradients of the constraints at iteration $k$, and $c_k = c(x_k)$.

The variables are further partitioned into $m$ dependent ($R$ space) and $n - m$ independent ($Q$ space) variables. The independent variable space occupies the null space of $A_k^T$. The complete set of variables spans the full space. Note that the control variables and parameters are not necessarily the independent variables. With this partition $A$ takes the form

$$A_k^T = [\,C_k \;\; N_k\,] \tag{16}$$

where the $m \times m$ basis matrix $C$ is nonsingular. Defining the matrices

$$Q_k = \begin{bmatrix} -C_k^{-1} N_k \\ I \end{bmatrix}, \qquad R_k = \begin{bmatrix} I \\ 0 \end{bmatrix}, \tag{17}$$

the search direction can be written as

$$d_k = R_k d_R + Q_k d_Q \tag{18}$$

where the matrix $Q$ satisfies

$$A_k^T Q_k = 0.$$

The range space direction $d_R$ is now determined by solving

$$d_R = -C_k^{-1} c_k, \tag{19}$$

and the null space direction $d_Q$ is obtained from the following reduced QP subproblem.

$$\min_{d_Q \in \mathbb{R}^{n-m}} \; (Q_k^T g_k + Q_k^T H_k R_k d_R)^T d_Q + \frac{1}{2} d_Q^T (Q_k^T H_k Q_k) d_Q \tag{20}$$

$$\text{s.t. } x^L - x_k - R_k d_R \le Q_k d_Q \le x^U - x_k - R_k d_R. \tag{21}$$
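
The range and null space decomposition of Eqs. (16)–(19) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes small dense random data, a positive definite Hessian, and inactive bounds, so that the reduced QP (20) is solved without the constraints (21). The function name `reduced_space_step` and all test data are hypothetical.

```python
import numpy as np

def reduced_space_step(g, H, C, N, c):
    """Return d = R d_R + Q d_Q for the QP (13)-(14), ignoring the bounds (15)."""
    m = C.shape[0]
    n = m + N.shape[1]
    # Q spans the null space of A^T = [C N]; R picks the dependent variables (Eq. 17).
    Q = np.vstack([-np.linalg.solve(C, N), np.eye(n - m)])
    R = np.vstack([np.eye(m), np.zeros((n - m, m))])
    d_R = -np.linalg.solve(C, c)                  # range space step, Eq. (19)
    rhs = Q.T @ g + Q.T @ H @ (R @ d_R)           # linear term of the reduced QP (20)
    d_Q = -np.linalg.solve(Q.T @ H @ Q, rhs)      # unconstrained minimizer of (20)
    return R @ d_R + Q @ d_Q, Q

# Hypothetical test data (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
m, n = 4, 6
C = rng.standard_normal((m, m)) + 4.0 * np.eye(m)   # assumed nonsingular basis
N = rng.standard_normal((m, n - m))
G = rng.standard_normal((n, n))
H = G @ G.T + np.eye(n)                             # assumed positive definite
g = rng.standard_normal(n)
c = rng.standard_normal(m)

d, Q = reduced_space_step(g, H, C, N, c)
A_T = np.hstack([C, N])                             # Eq. (16)
```

Under these assumptions the step `d` satisfies the linearized constraints (14) and coincides with the solution of the full-space optimality system, while only an $m \times m$ solve with $C$ and an $(n-m) \times (n-m)$ solve with the reduced Hessian are needed.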
2.2. Elemental decomposition

The partitioning in Eq. (18) allows us to perform a special decomposition of the matrix $A$, that we will briefly explain. In the remainder of this section we will omit the iteration index $k$ in our notation for simplicity. First, consider the Jacobian of the discretized system of equations.

[Eq. (22): block-structured Jacobian $A^T$ of the discretized system; matrix not reproduced]

where $I$ represents the identity matrix, and $D^i$ is a matrix containing the coefficients of the continuity equations of the $i$th element. $Z_q^i$, $DZ_q^i$, $Y_q^i$, $U_q^i$, and $P_q^i$ represent the Jacobian of the collocation equations with respect to $z_i$, $(dz/dt)_{i,q}$, $y_{i,q}$, $u_{i,q}$ and $p$, at collocation point $q$ and element $i$. $Z_a^i$, $D_a^i$, $Y_a^i$, and $P_a^i$ correspond to the Jacobian of the additional constraints. As indicated in Eq. (22), it is assumed that these constraints can be separated by elements. The factorization of this matrix is performed over smaller matrices, each one representing a finite element. In most cases these matrices have the same sparsity structure and consequently allow re-use of the sparse matrix pivot sequence. To explore this decomposition, consider the rows and columns of $A^T$ corresponding to element $i$:

$$(A^i)^T = \begin{bmatrix}
Z_1^{i-1} & DZ_1^i & Y_1^i & 0 & U_1^i & P_1^i \\
Z_2^{i-1} & DZ_2^i & Y_2^i & 0 & U_2^i & P_2^i \\
\vdots & \vdots & \vdots & 0 & \vdots & \vdots \\
Z_{\mathrm{ncol}}^{i-1} & DZ_{\mathrm{ncol}}^i & Y_{\mathrm{ncol}}^i & 0 & U_{\mathrm{ncol}}^i & P_{\mathrm{ncol}}^i \\
Z_a^{i-1} & D_a^i & Y_a^i & 0 & U_a^i & P_a^i \\
I & D^i & 0 & -I & 0 & 0
\end{bmatrix}. \tag{23}$$

Here, it is assumed that the additional point constraints (the last row in Eq. (23)) can be separated by elements. If no parameters $p$ are present, the decomposition of this matrix can be performed directly, as all the variables can be eliminated locally. In the case that a parameter is present, the last column of $A^i$, which corresponds to the parameters, will be coupled to the entire system. In this case, we create separate dummy parameters for each finite element.

In order to apply a reduced space algorithm, each matrix $A^i$ is partitioned into a basis submatrix $C^i$ and a nonbasic submatrix $N^i$. This partition is performed by applying an LU factorization with partial pivoting on the rectangular system $A^i$. Following De Hoog and Mattheij (1987), this LU factorization will yield a dichotomous system in each element. If an unstable mode is present in the DAE, $A^i$ is required to be partitioned so that the end conditions of any increasing mode are fixed or become decision variables. Here, if a differential variable $z_j$ has an increasing mode, $(dz/dt)_{j,\mathrm{ncol}}$ would be specified and would correspond to a column in the null space. In the same way, a column corresponding to a control variable or a parameter would be added to the range space. By considering the variables that span the columns of the null space to be fixed, the decomposition approach is equivalent to solving a discretized, linear BVP.

As shown in Cervantes and Biegler (2000), the whole matrix $A$ is not stored, and we perform a forward elimination in order to obtain the step for the dependent variables $d_R$ and $C^{-1} N$ for the construction of the QP subproblem. After the basis is selected, we can represent the matrix $A$ with the following structure and partition:

$$A^T = \left[\begin{array}{ccccccc|c}
I & & & & & & & \\
T^1 & C^1 & & & & & & N^1 \\
I & \hat{C}^1 & -I & & & & & \hat{N}^1 \\
 & & T^2 & C^2 & & & & N^2 \\
 & & I & \hat{C}^2 & -I & & & \hat{N}^2 \\
 & & & & T^3 & C^3 & & N^3 \\
 & & & & & & \ddots & \vdots
\end{array}\right]$$

and the corresponding right hand sides are

$$c^T = [\,c^0 \;\; c^1 \;\; \hat{c}^1 \;\; c^2 \;\; \hat{c}^2 \;\; c^3 \;\; \ldots\,].$$

By premultiplying $T^i$ and $C^i$ by the inverse of $C^i$ in each element, we can develop a forward decomposition strategy that allows us to calculate $C^{-1} N$ and $C^{-1} c$.
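
The idea of a forward, element-by-element elimination can be sketched in NumPy for a simplified case. The sketch assumes a block lower-bidiagonal basis matrix in which each element has one square diagonal block and one block coupling it to the previous element; the separate continuity rows of the structure above are taken as merged into these blocks. The names `forward_block_solve`, `C_blocks`, `T_blocks`, and the random data are hypothetical and only illustrate the principle, not the authors' code.

```python
import numpy as np

def forward_block_solve(C_blocks, T_blocks, B_blocks):
    """Solve C X = B for block lower-bidiagonal C, one element at a time.

    C_blocks[i] is the square diagonal block of element i, T_blocks[i] couples
    element i to element i-1 (T_blocks[0] is unused), and B_blocks[i] holds the
    matching rows of the right-hand side.
    """
    X_blocks = []
    for i, (C_i, B_i) in enumerate(zip(C_blocks, B_blocks)):
        rhs = B_i if i == 0 else B_i - T_blocks[i] @ X_blocks[i - 1]
        X_blocks.append(np.linalg.solve(C_i, rhs))   # only a small system per element
    return np.vstack(X_blocks)

# Hypothetical sizes and data (assumptions, not taken from the paper).
rng = np.random.default_rng(1)
ne, nb, nq = 5, 3, 2     # elements, block size, number of independent variables
C_blocks = [rng.standard_normal((nb, nb)) + 4.0 * np.eye(nb) for _ in range(ne)]
T_blocks = [None] + [rng.standard_normal((nb, nb)) for _ in range(ne - 1)]
N_blocks = [rng.standard_normal((nb, nq)) for _ in range(ne)]
c_blocks = [rng.standard_normal((nb, 1)) for _ in range(ne)]

# Eliminate N and c together so that each element block is solved once.
B_blocks = [np.hstack([N_i, c_i]) for N_i, c_i in zip(N_blocks, c_blocks)]
X = forward_block_solve(C_blocks, T_blocks, B_blocks)
CinvN, Cinvc = X[:, :nq], X[:, nq:]
```

In this simplified setting the full basis matrix is never assembled, and the work grows linearly with the number of elements, which is the property the forward decomposition relies on.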
This decomposition strategy is very efficient, but as the number of discretized variables increases, the solution of the reduced QP (Eq. (13)) subproblem with an active set algorithm (Ternet & Biegler, 1997) can become a bottleneck. This is the result of the large number of bounds and the combinatorial problem of choosing an active set associated with them. One possible solution is to apply an interior point method for the solution of the QP. This strategy has not always been efficient for the solution of general NLPs (Ternet & Biegler, 1999). In this work, we introduce a new interior point algorithm that is applied directly to the NLP problem. As a result, the inequality constraints (bounds) are eliminated from the NLP, avoiding the necessity of choosing an active set. In the next section we present a detailed description of this algorithm.

3. Interior point method applied to NLP

In order to simplify the presentation of our new algorithm we assume in this section that all variables have only lower bounds of zero, and consider the following problem:

$$\min \; f(x) \tag{24}$$

$$\text{s.t. } c(x) = 0 \tag{25}$$

$$x \ge 0 \tag{26}$$

where $f: \mathbb{R}^n \to \mathbb{R}$ and $c: \mathbb{R}^n \to \mathbb{R}^m$ are assumed to be sufficiently smooth. Extending the approach described below to the general problem formulation (Eqs. (10)–(12)) is straightforward. The algorithm follows a barrier approach, where the bound constraints (Eq. (26)) are replaced by a logarithmic barrier term which is added to the objective function to give

$$\min \; \varphi_\mu(x) := f(x) - \mu \sum_{i=1}^{n} \ln(x^{(i)}) \tag{27}$$

$$\text{s.t. } c(x) = 0 \tag{28}$$

with a barrier parameter $\mu > 0$. Here, $x^{(i)}$ denotes the $i$th component of the vector $x$. Since the objective function of this barrier problem becomes arbitrarily large as $x$ approaches the boundary of the nonnegative orthant $\{x \mid x \ge 0\}$, it is clear that a local solution $x^*(\mu)$ of this problem lies in the interior of this set, i.e. $x^*(\mu) > 0$. The degree of influence of the barrier is determined by the size of $\mu$, and $x^*(\mu)$ converges to a local solution $x^*$ of the original problem (Eqs. (24)–(26)) as $\mu \to 0$. Consequently, a strategy for solving the original NLP is to solve a sequence of barrier problems (Eqs. (27) and (28)) for decreasing barrier parameters $\mu_l$, where $l$ is the counter for the sequence of subproblems. Since the exact solution $x^*(\mu_l)$ is not of interest for large $\mu_l$, the corresponding barrier problem is solved only to a relaxed accuracy $\varepsilon_l$, and the approximate solution is then used as a starting point for the next barrier problem with $\lim_{l \to \infty} \varepsilon_l = 0$.

It remains to show how a barrier problem for a fixed value of $\mu_l$ is solved. We follow a primal-dual approach (see e.g. El-Bakry, Tapia, Tsuchiya & Zhang, 1995), that generates search directions for primal variables $x > 0$ as well as for dual variables $v > 0$, which correspond to the Lagrange multipliers for the bound constraints (Eq. (26)) as $\mu_l \to 0$.

3.1. Primal-dual search directions

In order to motivate the choice of the search direction, let us consider the first order optimality conditions of the barrier problem (Eqs. (27) and (28)):

$$\nabla f(x) + A(x)\lambda - \mu X^{-1} e = 0$$

$$c(x) = 0,$$

where $A(x) := \nabla c(x)$, and the components of the vector $\lambda$ are the Lagrange multipliers for the original equality constraints (Eq. (28)). Throughout this section, $e$ denotes the vector of appropriate dimension of all ones, and a capital letter of a vector name (e.g. $X$) denotes the diagonal matrix with the vector elements in the diagonal. After defining the so-called dual variables $v := \mu X^{-1} e$, the above system is equivalent to

$$\nabla f(x) + A(x)\lambda - v = 0 \tag{29}$$

$$X V e - \mu e = 0 \tag{30}$$

$$c(x) = 0. \tag{31}$$

If we were to solve this system of nonlinear equations by Newton's method, at an iterate $(x_k, \lambda_k, v_k)$ the corresponding search direction would be obtained by solving

$$\begin{bmatrix} H_k & A_k & -I \\ A_k^T & 0 & 0 \\ V_k & 0 & X_k \end{bmatrix}
\begin{pmatrix} d_k^x \\ d_k^\lambda \\ d_k^v \end{pmatrix}
= -\begin{pmatrix} \nabla f(x_k) + A_k \lambda_k - v_k \\ c_k \\ X_k V_k e - \mu e \end{pmatrix}$$

where $H_k := \nabla^2_{xx} \mathcal{L}(x_k, \lambda_k)$ with $\mathcal{L}(x, \lambda) := f(x) + c(x)^T \lambda$ denotes the Hessian of the Lagrangian of the NLP (Eqs. (24)–(26)), $A_k := A(x_k)$, and $c_k := c(x_k)$. Reducing this linear system yields:

$$\begin{bmatrix} H_k + \Sigma_k & A_k \\ A_k^T & 0 \end{bmatrix}
\begin{pmatrix} d_k^x \\ \lambda_k + d_k^\lambda \end{pmatrix}
= -\begin{pmatrix} \nabla \varphi_\mu(x_k) \\ c_k \end{pmatrix}
\quad \text{with } \Sigma_k := X_k^{-1} V_k \tag{32}$$

and

$$d_k^v = \mu X_k^{-1} e - v_k - \Sigma_k d_k^x. \tag{33}$$

It is well known that obtaining $d_k^x$ from Eq. (32) is equivalent to solving the quadratic problem
$$\min_{d^x \in \mathbb{R}^n} \; \nabla \varphi_\mu(x_k)^T d^x + \frac{1}{2} (d^x)^T (H_k + \Sigma_k) d^x \tag{34}$$

$$\text{s.t. } A_k^T d^x + c_k = 0 \tag{35}$$

if the matrix $H_k + \Sigma_k$ is positive definite in the null space of $A_k^T$.

The similarity of this QP to Eqs. (13)–(15) allows us to employ rSQP techniques previously developed, in particular in terms of decomposition and multiplier-free reduced space quasi-Newton updates. As in Eq. (18), we partition the overall primal step into a range and null space component. The range step can be obtained using Eq. (19), while $d_Q$ is the solution of the reduced QP (20). This QP is now unconstrained and its solution can directly be computed as

$$d_Q = -[Q_k^T (H_k + \Sigma_k) Q_k]^{-1} (Q_k^T \nabla \varphi_\mu(x_k) + w_k). \tag{36}$$

In our implementation we choose the cross term to be

$$w_k = Q_k^T \Sigma_k R_k d_R, \tag{37}$$

i.e. we omit the term involving the full space second order information $H_k$, since it is not available to us.

3.2. Quasi-Newton estimates for the reduced Hessian

As in our previous rSQP algorithm the reduced Hessian $\tilde{B}_k := [Q_k^T (H_k + \Sigma_k) Q_k]$ in Eq. (36) is approximated by means of a quasi-Newton method. A straightforward application of our previous algorithm would use BFGS updates to estimate the whole matrix $B_k \approx [Q_k^T (H_k + \Sigma_k) Q_k]$. Although this approach guarantees good local convergence results for fixed $\mu$ and would allow us to bypass the expensive explicit computation of the null space matrix $Q_k$, it has two major drawbacks. First, after decreasing $\mu$ during the overall optimization (after sufficiently solving a barrier problem (Eqs. (27) and (28))), $B_k$ becomes the approximation of the reduced Hessian of a different optimization problem, and several iterations will be needed before the estimate of $\tilde{B}_k$ is good enough to provide fast convergence to the optimal point of the next barrier problem. This contradicts our goal to solve the sequence of barrier problems increasingly fast, in order to have fast local convergence of the overall algorithm for the original NLP. We observed this behavior in practical experiments. Secondly, during the solution of a barrier problem, the iterates $x_k$ may come close to their bounds. In such a case, the term $\Sigma_k$ becomes very large and the quasi-Newton approximation behaves poorly.

Alternatively, we separate the two terms in the reduced Hessian $\tilde{B}_k = Q_k^T H_k Q_k + Q_k^T \Sigma_k Q_k$ and only estimate the first term $\bar{B}_k \approx Q_k^T H_k Q_k$ by means of a quasi-Newton method. The explicit computation of the second term is easy since $\Sigma_k$ is diagonal, and $\bar{B}_k + Q_k^T \Sigma_k Q_k$ is then used as an estimate for $\tilde{B}_k$ in Eq. (36). This approach is related to a well known update for nonlinear least squares, described in Dennis and Schnabel (1983).

In our implementation we use BFGS or SR1 quasi-Newton updates for $B_k$. The advantage of using positive definite BFGS estimates is that the reduced Hessian $\tilde{B}_k$ of the barrier problem will always be positive definite and no modifications are necessary to obtain a descent direction for the merit function. On the other hand, the reduced Hessian of the Lagrangian of the original NLP, approximated by $B_k$, may not be positive definite at the local solution of the barrier problem or even of the original problem, if there are active bounds at the local solution of the original NLP. In this case, BFGS can produce poor estimates and lead to slow local convergence. For this reason, we also employ SR1 updates for $B_k$. Since this estimate of the overall reduced Hessian can become indefinite, we perform an eigenvalue decomposition of $B_k + Q_k^T \Sigma_k Q_k$ and, as a preliminary approach, we correct negative or very small eigenvalues to be sufficiently positive. In our examples $n - m$ is very small compared to the total size $n$ of the problem, so that the eigenvalue decomposition of this $(n - m)$-dimensional matrix requires only a fraction of the computation time for the range space step Eq. (19).

3.3. A primal-dual merit function

Having computed the primal-dual search directions $(d_k^x, d_k^v)$ from Eqs. (19), (36), (18) and (33), we need to choose a step length $\alpha_k \in (0, 1]$ to obtain the next iterate

$$x_{k+1} = x_k + \alpha_k d_k^x \tag{38}$$

$$v_{k+1} = v_k + \alpha_k d_k^v. \tag{39}$$

Among other things we need to ensure that the implicit positivity constraints $x_{k+1} > 0$ and $v_{k+1} > 0$ are satisfied, since a full step with $\alpha_k = 1$ might violate these constraints. For this, we compute a maximal step size $\tilde{\alpha} \in (0, 1]$ such that the 'fraction-to-the-boundary-rule'

$$x_k + \tilde{\alpha} d_k^x \ge (1 - \tau) x_k \tag{40}$$

$$v_k + \tilde{\alpha} d_k^v \ge (1 - \tau) v_k \tag{41}$$

with $\tau = 0.95$ is satisfied. Starting from this maximal step size, an Armijo line search is performed using a primal-dual $\ell_1$-penalty function

$$\phi_\nu(x, v) := \varphi_\mu(x) + V_\mu(x, v) + \nu \|c(x)\|_1 \tag{42}$$

where

$$V_\mu(x, v) := x^T v - \mu \sum_{i=1}^{n} \ln(x^{(i)} v^{(i)})$$

(see Anstreicher & Vial, 1994) is a function that has a global minimum of $n\mu(1 - \ln(\mu))$ for all points satisfying $XVe = \mu e$; in other words, $V_\mu$ is minimized if and only if the relaxed complementarity condition (Eq. (30)) is satisfied. In this way, we control the dual variables $v$
and ensure that $v_k$ converges to the bound multipliers as $\mu \to 0$. The value of the penalty parameter $\nu \ge 0$ is adapted during the optimization in order to ensure that the directions generated by the algorithms are descent directions for the merit function (see Biegler, Schmid & Ternet, 1997), i.e. the directional derivative $D\phi_\nu((x_k, v_k); (d^x_k, d^v_k))$ is always negative. We emphasize that these updates can be performed without the explicit computation of the multipliers $\lambda_k$ (see Biegler et al., 1997).

3.4. Description of the algorithm

3.4.1. Algorithm

Given: Initial barrier parameter $\mu_0 > 0$, initial point $z_0 = (x_0, v_0) \in R^n_+ \times R^n_+$, initial estimate of the reduced Hessian $B_0$, desired tolerance $\epsilon$, and positive constants $0 < \tau < \bar{\tau} < 1$, $\eta, L \in (0, 1)$, $\rho, M > 0$.
Initialize iteration counter $k := 0$.

1. Evaluate $f(x_k)$, $\nabla f(x_k)$, $c_k$, $A_k$; compute $R_k$ and $Q_k$ from Eq. (17).
2. Check convergence of barrier problem

\[ \max \{ \| Q_k^T (\nabla f(x_k) - v_k) \|, \; \| X_k V_k e - \mu_k e \| \} < M \mu_k . \]

If satisfied,
(a) Check convergence of original NLP

\[ \max \{ \| Q_k^T (\nabla f(x_k) - v_k) \|, \; \| c(x_k) \| \} < \epsilon . \]

(b) If satisfied, STOP [converged].
(c) Otherwise, decrease barrier parameter $\mu_k \leftarrow L \mu_k$ and go back to step 2.
3. Compute range space step from Eq. (19).
4. Compute correction term from Eq. (37).
5. Compute null space step from

\[ d^Q_k = - \big[ B_k + Q_k^T \Sigma_k Q_k \big]^{-1} \big( Q_k^T \nabla \varphi_\mu(x_k) + w_k \big) . \]

If necessary, correct $[B_k + Q_k^T \Sigma_k Q_k]$ so that it is sufficiently positive definite.
6. Compute primal search direction $d^x_k$ from Eq. (18), and dual search direction $d^v_k$ from Eq. (33).
7. Do the line search:
(a) Update penalty parameter to obtain $\nu_k$.
(b) Determine maximal step size $\tilde{\alpha}_k \in (0, 1]$ that satisfies Eqs. (40) and (41).
(c) Set $\alpha_k := \tilde{\alpha}_k$.
(d) Evaluate $f(x_k + \alpha_k d^x_k)$ and $c(x_k + \alpha_k d^x_k)$.
(e) Check if the Armijo sufficient decrease condition

\[ \phi_{\nu_k}(x_k + \alpha_k d^x_k, \, v_k + \alpha_k d^v_k) \le \phi_{\nu_k}(x_k, v_k) + p \, \alpha_k \, D\phi_{\nu_k}((x_k, v_k); (d^x_k, d^v_k)) \tag{43} \]

is satisfied. If not, choose a new trial step size $\alpha_k \in [\tau \alpha_k, \bar{\tau} \alpha_k]$ and go back to step 7(d).
8. If Eq. (43) is satisfied, accept the new iterate (Eqs. (38) and (39)).
9. Perform the quasi-Newton update of the reduced Hessian estimate to obtain $B_{k+1}$.
10. Increase iteration counter $k \leftarrow k + 1$, set $\mu_k := \mu_{k-1}$, and go back to step 1.

4. Examples

In this section we present three chemical engineering examples. The first one is a small batch reactor, while the following two are distillation columns. The CPU times are in seconds and were obtained with a DEC Alpha 400. All the examples were initialized to a feasible solution and the CPU time includes this computation.

4.1. Batch reactor

We first consider a small batch reactor (Logsdon, 1990), where the following reactions take place.

\[ A \to B \to C \]

The objective is to maximize the mole fraction of B at a given final time, by controlling the reactor temperature T:

\[ \max \; x_B(t_f) \tag{44} \]
\[ \text{s.t.} \quad \frac{dx_A}{dt} = -k_1(T)\, x_A \tag{45} \]
\[ \frac{dx_B}{dt} = k_1(T)\, x_A - k_2(T)\, x_B \tag{46} \]
\[ 0 = x_A + x_B + x_C - 1 \tag{47} \]
\[ 680\ \mathrm{K} \le T \le 750\ \mathrm{K}. \tag{48} \]

Fig. 1. Reactor profiles.

The problem was solved with and without bounds on the control variable (Eq. (48)), using a variable number of finite elements and three collocation points. The temperature profile was initialized to 700 K. The results for ten elements are presented in Fig. 1. We compare

Table 1
Computational results, batch reactor

Elements  n/m        Unbounded (iter/CPU s)     Bounded (iter/CPU s)
                     rSQP        IP             rSQP        IP
5         72/62      19/0.21     15/0.12        12/0.20     23/0.16
10        142/122    15/0.26     15/0.22        13/0.27     31/0.34
15        352/302    28/1.26     27/1.21        24/1.42     39/1.73
20        702/602    41/6.89     31/6.10        35/7.94     39/7.01
25        1402/1202  58/42.29    37/37.04       47/55.72    43/42.81

the results with those obtained by solving the problem with a rSQP algorithm. The computational results are presented in Table 1. Note that in both cases, the interior point (IP) method generally takes a similar number of iterations as the rSQP method. However, the CPU time is generally less for the IP method. Moreover, in the bounded case, the time per iteration is about 20% less for the IP method. Also note that as the number of active constraints increases with n–m, the IP algorithm shows a clear improvement over the rSQP approach.

4.2. Distillation columns

We consider two different examples of distillation columns. The first one is a batch reactive column, while the second one is a continuous column where a change of set point is simulated. The dynamic model for both cases consists of dynamic MESH equations. This leads to an index one DAE system, obtained by reformulating the dynamic model of Ruiz, Basualdo and Scenna (1995). The general model of the columns consists of the following equations:

\[ \frac{dM_i}{dt} = F_i + V_{i+1} + L_{i-1} - V_i - L_i + \sum_{n=1}^{n_r} R_{i,n}. \tag{49} \]

Here, $M_i$ corresponds to the molar holdup on tray i, $V_i$ and $L_i$ are the vapor and liquid flowrates, $F_i$ is the feed flow rate, $n_r$ is the number of reactions, and $R_{i,n}$ is the difference between the rates of production and consumption of each reaction n.

The liquid mole fraction x of each component j can be expressed as

\[ M_i \frac{dx_{i,j}}{dt} = F_i (z_{i,j} - x_{i,j}) + V_{i+1} (y_{i+1,j} - x_{i,j}) + L_{i-1} (x_{i-1,j} - x_{i,j}) - V_i (y_{i,j} - x_{i,j}) + \hat{R}_{i,n} \tag{50} \]

where

\[ \hat{R}_{i,n} = \sum_{n=1}^{n_r} \nu_{j,n} \, r_{i,n} \, \frac{M_i}{\rho_i} - x_{i,j} \sum_{n=1}^{n_r} R_{i,n}. \tag{51} \]

$\rho_i$ is the liquid molar density, $z_{i,j}$ is the molar fraction of j in the feed, $y_{i,j}$ is the vapor mole fraction, $\nu_{j,n}$ is the stoichiometric coefficient, and $r_{i,n}$ is the rate of production per unit volume. The phase equilibrium relationship is represented by

\[ y_{i,j} = K_{i,j} \, x_{i,j} \tag{52} \]
\[ K_{i,j} = f_j(T_i) \tag{53} \]

while the sum of the vapor fractions on each tray is required to be equal to one

\[ \sum_{j=1}^{n_c} y_{i,j} = 1. \tag{54} \]

The vapor flowrates are calculated using a modified index one energy balance (see Cervantes & Biegler, 1998)

\[ \frac{1}{M_i} \Big[ V_{i+1} \big( h^v_{i+1} \big) + L_{i-1} \big( h^v_{i-1} - h^l_i \big) - V_i \big( h^v_i - h^l_i \big) + \sum_{n=1}^{n_r} r_{i,n} \, \frac{M_i}{\rho_i} \, \Delta H^R_n \Big] = RHS_i \tag{55} \]

where

\[ RHS_i = \sum_{j=1}^{n_c} \frac{\partial h^l_i}{\partial x_{i,j}} \cdot \frac{dx_{i,j}}{dt} - \frac{\partial h^l_i}{\partial T_i} \cdot \frac{\sum_{j=1}^{n_c} K_{i,j} \, \dfrac{dx_{i,j}}{dt}}{\sum_{j=1}^{n_c} x_{i,j} \, \dfrac{dK_{i,j}}{dT_i}}. \tag{56} \]

$h^v$ and $h^l$ are the vapor and liquid enthalpies, and $\Delta H^R_n$ is the heat of reaction. We now consider two specific dynamic systems.

4.2.1. Batch reactive distillation

This example considers the reversible reaction between acetic acid and ethanol (Wajge & Reklaitis, 1995),

\[ \mathrm{CH_3COOH + CH_3CH_2OH \rightleftharpoons CH_3COOCH_2CH_3 + H_2O}. \]

The model consists of $13 + 5n_t$ differential and $6 + 5n_t$ algebraic equations, where $n_t$ is the number of trays. The column, in all cases, was fed with an equimolar mixture of ethanol, acetic acid, ethyl acetate and water. The objective is to maximize the amount of distillate D produced within 1 h by manipulating the reflux ratio as a function of time, as follows:

\[ \max \int_0^{t_f = 1} D \, dt \]
\[ \text{s.t.} \quad \text{DAE model (49)–(56)} \]
\[ x_D^{Ester} \ge 0.4800. \]
We consider two cases. First we solved the problem with
a variable number of DAEs. For this we use eight, 15
and 24 trays with five finite elements and three colloca-
tion points. As seen in Fig. 2, the reflux ratio is at its
upper bound at the beginning, and at its lower bound at
the end, after obtaining the adequate composition of the
distillate.
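The active bounds described above are what the log barrier has to handle. As a minimal sketch (not the authors' implementation), the following one-variable example minimizes a linear cost over a bounded interval with a log barrier and shows the minimizer moving toward the bound as the barrier parameter shrinks, while the product of the bound distance and the dual estimate stays equal to the barrier parameter. The cost coefficient `c`, the bounds 0 and 1, and the function name `barrier_minimizer` are illustrative assumptions, not values from the paper.

```python
def barrier_minimizer(c, lo, hi, mu, tol=1e-12):
    """Minimize c*u - mu*ln(u - lo) - mu*ln(hi - u) over lo < u < hi.

    The derivative c - mu/(u - lo) + mu/(hi - u) is strictly increasing
    on the open interval, so plain bisection on its sign is sufficient.
    """
    def dphi(u):
        return c - mu / (u - lo) + mu / (hi - u)

    a, b = lo, hi
    while b - a > tol:
        mid = 0.5 * (a + b)
        if dphi(mid) > 0.0:
            b = mid  # minimizer lies to the left of mid
        else:
            a = mid  # minimizer lies to the right of mid
    return 0.5 * (a + b)


# Hypothetical data: cost pushes u toward its lower bound (assumed bounds 0 and 1).
C, LO, HI = 1.0, 0.0, 1.0
gaps = []
for mu in (1e-1, 1e-2, 1e-3):
    u = barrier_minimizer(C, LO, HI, mu)
    v = mu / (u - LO)  # dual estimate for the lower bound; (u - LO) * v == mu
    gaps.append(u - LO)
    print(f"mu={mu:g}  u={u:.6f}  dual estimate v={v:.4f}")
```

In this toy problem the dual estimate tends to the cost coefficient, which is the multiplier of the active lower bound, so no active set has to be guessed.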
We then solved for increasing n–m by changing the number of finite elements and fixing the number of trays and collocation points (eight and three, respectively). In all cases the problem was initialized with a feasible point with a constant reflux ratio of 20. We also compare the results obtained with a rSQP algorithm. The computational results are presented in Tables 2 and 3. As in the batch reactor example, the interior point method takes more iterations than rSQP in some cases, but the total solution time is comparable or smaller because the CPU time per iteration is always smaller. Also, as n–m increases, the IP approach performs better, as seen in Table 3. This is due to the increase in the number of active constraints at the optimum, as the number of elements increases (see Fig. 2). Here the IP approach requires less computation in dealing with these constraints than the combinatorial active set strategy in rSQP, and this leads to noticeably improved CPU times.

Fig. 2. Reflux ratio.

Table 2
Computational results, different number of DAEs
Entries: Iter / CPU (s), with CPU (s) per iteration in parentheses.

Trays  DAEs  n/m        rSQP               IP
8      99    1812/1802  28/5.73 (0.20)     48/7.14 (0.15)
15     169   2072/2062  40/21.84 (0.55)    55/20.52 (0.37)
24     259   4692/4682  42/53.44 (1.27)    59/55.16 (0.93)

Table 3
Computational results, different number of elements
Entries: Iter / CPU (s), with CPU (s) per iteration in parentheses.

Elements  DAEs  n/m        rSQP               IP
5         99    1812/1802  28/5.73 (0.20)     48/7.14 (0.15)
7         99    2516/2502  179/46.75 (0.26)   96/21.87 (0.23)
9         99    3220/3202  98/39.14 (0.40)    53/15.41 (0.29)

4.2.2. Continuous distillation

In this example we simulate the air separation process using a continuous distillation column with 15 trays. The basic model is given by Eqs. (49)–(56) without the reaction terms. The feed is assumed to have a composition of 78.1% nitrogen, 21% oxygen, and 0.9% argon. The purity of nitrogen taken out at the top of the column is 99.8%. The complete model consists of 70 differential equations and 356 algebraic equations. Here, we simulate a change of set point in the distillate flow rate from $D(0) = 301.8$ mol s$^{-1}$ to $D^{set} = 256.0$ mol s$^{-1}$. The objective is to minimize the offset produced during the change from one steady state to another by controlling the feed flowrate F and/or the heat in the reboiler Q:

\[ \min \int_0^{t_f = 1} (D - D^{set})^2 \, dt \]
\[ \text{s.t.} \quad \text{DAE model (49)–(56)} \]
\[ 0\ \mathrm{kmol\ s^{-1}} \le F \le 2\ \mathrm{kmol\ s^{-1}} \]
\[ 1\ \mathrm{MJ\ s^{-1}} \le Q \le 100\ \mathrm{MJ\ s^{-1}}. \]
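To make the tracking objective above concrete, the following sketch evaluates the integral of the squared offset element by element with a three-point Gauss-Legendre rule. It is an illustration under stated assumptions only: the paper does not say which collocation points it uses, and the first-order distillate response with time constant `THETA` is a hypothetical trajectory, not a result from the column model. Only the two set-point values 301.8 and 256.0 come from the text.

```python
import math

# Three-point Gauss-Legendre rule on [-1, 1] (assumed quadrature choice).
GAUSS3_NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
GAUSS3_WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def tracking_offset(distillate, d_set, t_final=1.0, n_elements=20):
    """Approximate the integral of (D(t) - d_set)^2 over [0, t_final].

    The horizon is split into equal finite elements and the integrand is
    sampled at three Gauss points inside each element.
    """
    h = t_final / n_elements
    total = 0.0
    for e in range(n_elements):
        mid = (e + 0.5) * h  # element midpoint
        for node, weight in zip(GAUSS3_NODES, GAUSS3_WEIGHTS):
            t = mid + 0.5 * h * node
            total += 0.5 * h * weight * (distillate(t) - d_set) ** 2
    return total


D0, D_SET = 301.8, 256.0  # set-point change quoted in the text
THETA = 0.1               # hypothetical time constant, for illustration only


def first_order_response(t):
    """Hypothetical smooth transition from D0 to D_SET."""
    return D_SET + (D0 - D_SET) * math.exp(-t / THETA)


offset = tracking_offset(first_order_response, D_SET)
exact = (D0 - D_SET) ** 2 * THETA / 2.0 * (1.0 - math.exp(-2.0 / THETA))
print(f"quadrature: {offset:.6f}   closed form: {exact:.6f}")
```

In the simultaneous approach the same element-wise sum would be written in terms of the discretized state variables, so the objective becomes an algebraic function of the NLP variables.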
Figs. 3–5 show three cases where we consider different combinations of the feed and reboil control profiles. In all cases we have a fine discretization of these profiles and observe interesting underdamped, oscillatory responses. These responses are very typical in PI or PID feedback control on higher order linear processes and we investigated these in more detail. First, the oscillatory characteristics of this response remained even with several adjustments of the finite element mesh. Second, it is clear that the oscillations can be eliminated through a coarser discretization of the control profile. We examined this feature to eliminate the oscillations and found that the optimal objective function became slightly worse. Moreover, when we restarted the original problem from the coarser solution, we observed that the NLP returned to the optimal oscillatory profiles again.

Fig. 3. Feed as control.
Fig. 4. Reboiler heat as control.
Fig. 5. Both controls.

Finally, we also ran the same problems but added a penalty into the objective function, as follows:

\[ \int_0^{t_f} (D - D^{set})^2 \, dt + \rho \sum_i (u_i - u_{i-1})^T (u_i - u_{i-1}) \tag{57} \]

where $\rho \approx 10^{-4}$. This approach is a well known device for stabilizing model predictive controllers and, as expected, eliminates the oscillations in the state and control profiles. These results are presented in Fig. 6 for the case of two controls. A similar result is observed for just one control variable. As a result of these tests, we are confident about the veracity of the profiles given in these figures.

Fig. 6. Both controls, penalizing oscillations.

Table 4 shows the computational results for this example for different numbers of elements and collocation points as well as for different control variables. As can be seen, even the largest problem with over 55 000 variables and 120 degrees of freedom could be solved in less than 12 min. We can also see that the computation time increases with the number of control variables, n–m. This is due to higher computational costs to construct and factorize the reduced Hessian matrix but this remains independent of the number of active constraints.

5. Conclusions and future work

We have presented a new reduced space NLP barrier algorithm which has proved to be efficient and well suited for the solution of DAE optimization problems. This is especially the case for large-scale problems and when a large number of bounds are active, as this approach eliminates the necessity of choosing an active set. Here, an efficient implementation allows us to solve problems with more than 55 000 variables in 5–12 CPU min on standard workstations.

As future work, we are currently studying the global and local convergence properties of the IP method. These issues will be addressed in a separate work. Moreover, the number of degrees of freedom n–m is relatively small in the problems we solved, and this is usually the case in DAE optimization of chemical processes. However, as future work we intend to exploit the structure of the exact full-space Hessian matrix to obtain better behavior when n–m becomes large. This can occur when a lot of elements are needed and the DAE model contains more than a few control variables. We will also incorporate the algorithm within a

Table 4
Computational results for air separation optimization
Entries for each control case: Iter / CPU (s).

Elements  Collocation  n/m           F control    Q control    n/m           F, Q controls
          points       (1 control)                             (2 controls)
20        2            18550/18530   23/19.39     28/23.10     18590/18550   65/68.81
20        3            27090/27050   41/83.35     51/103.04    27150/27070   48/141.61
40        2            37030/36990   43/93.61     62/132.36    37110/37030   119/384.70
60        2            55510/55450   87/360.74    69/295.22    55630/55510   98/670.36

graphical interface, with automated initialization, graphics for input and output, and other supporting tools. Finally, we plan to incorporate the element sizes as additional decision variables, and to solve further large-scale DAE optimization problems.

Acknowledgements

Funding from the Universidad Nacional Autónoma de México, the National Science Foundation (CTS9729075, CCR9875559 and DMS9706950) and the American Chemical Society Petroleum Research Fund (31243 AC9) is gratefully acknowledged. The authors thank Professor Jorge Nocedal from Northwestern University for his insightful comments on the interior point method. We also thank Robert Grosch for his contributions to the air separation model.

References

Albuquerque, J., Gopal, V., Staus, G., Biegler, L. T., & Ydstie, B. E. (1997). Interior point SQP strategies for structured process optimization problems. Computers & Chemical Engineering, 21, 283.
Anstreicher, K. M., & Vial, J.-P. (1994). On the convergence of an infeasible primal-dual interior-point method for convex programming. Optimization Methods and Software, 3, 273–283.
Bader, G., & Ascher, U. (1987). A new basis implementation for mixed order boundary value ODE solver. SIAM Journal of Scientific Computing, 8, 483–500.
Betts, J. T., & Frank, P. D. (1994). A sparse nonlinear optimization algorithm. Journal of Optical Theoretical Applications, 82, 543.
Betts, J. T., & Huffman, W. P. (1992). Application of sparse nonlinear programming to trajectory optimization. Journal of Guidance Dynamic Control, 15, 198.
Biegler, L. T., Schmid, C., & Ternet, D. (1997). A multiplier-free, reduced Hessian method for process optimization, large-scale optimization with applications, part II: optimal design and control. In L. T. Biegler, T. F. Coleman, A. R. Conn, & F. N. Santosa, IMA Volumes in Mathematics and Applications (p. 101). Springer Verlag.
Byrd, R. H., Hribar, M. E., & Nocedal, J. (1999). An interior point algorithm for large scale nonlinear programming. SIAM Journal on Optimization, 9, 877–900.
Cervantes, A., & Biegler, L. T. (1998). Large-scale DAE optimization using simultaneous nonlinear programming formulations. American Institute of Chemical Engineers Journal, 44, 1038.
Cervantes, A., & Biegler, L. T. (2000). A stable elemental decomposition for dynamic process optimization, Journal of Computational and Applied Mathematics (in press).
De Hoog, R., & Mattheij, M. (1987). On dichotomy and well conditioning in BVP. SIAM Journal of Numerical Analysis, 24, 89–105.
Dennis, J. E., & Schnabel, R. B. (1983). Numerical methods for unconstrained optimization and nonlinear equations. Englewood Cliffs, N.J., USA: Prentice-Hall.
El-Bakry, A. S., Tapia, R. A., Tsuchiya, T., & Zhang, Y. (1995). On the formulation and theory of the Newton interior-point method for nonlinear programming. Journal of Optimization Theory and Applications, 89, 507–541.
Fiacco, A., & McCormick, G. (1968). Nonlinear programming: sequential unconstrained minimization techniques. New York: John Wiley and Sons.
Forsgren, A., & Gill, P. E. (1998). Primal-dual interior point methods for nonconvex nonlinear programming. SIAM Journal of Optimization, 8 (4), 1132–1152.
Gay, D. M., Overton, M. L., & Wright, M. H. (1998). A primal-dual interior method for nonconvex nonlinear programming. In Y. Yuan, Proceedings of the 1996 International Conference on Nonlinear Programming (pp. 83–92). Beijing, China, Kluwer Academic Publishers, Dordrecht, The Netherlands.
Kyriakopoulou, D. (1997). Development and implementation of an interior point optimization algorithm for process engineering, Ph.D. Thesis, Université de Liège, Belgium.
Liu, G. (1998). Design issues in algorithms for large-scale nonlinear programming, Ph.D. Thesis, Department of Industrial Engineering, Northwestern University, Evanston, IL.
Logsdon, J. S. (1990). Efficient determination of optimal profiles for differential algebraic systems, Ph.D. Thesis, Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, PA.
Lucia, A., & Xu, J. (1990). Chemical process optimization using newton-like methods. Computers & Chemical Engineering, 14, 119.
Pontryagin, V. V., Boltyanskii, Gamkrelidze, R., & Mishchenko, E. (1962). The mathematical theory of optimal processes. New York, NY: Interscience Publishers.
Ruiz, C. A., Basualdo, M. S., & Scenna, N. J. (1995). Reactive distillation dynamic simulation. Transactions of the Institute of Chemical Engineering, 73, 363–378.
Sargent, R. W. H. (1997). The development of the SQP algorithm for nonlinear programming. Large scale optimization with applications (pp. 1–19). Springer-Verlag.
Tanartkit, P., & Biegler, L. T. (1995). Stable decomposition for dynamic optimization. Industrial Engineering & Chemistry Research, 34, 1253–1266.
Ternet, D., & Biegler, L. T. (1997). Recent improvements to a multiplier-free reduced Hessian quadratic programming algorithm. Computers & Chemical Engineering, 22, 963.

Ternet, D., & Biegler, L. T. (1999). Interior-point methods for reduced Hessian successive quadratic programming. Computers and Chemical Engineering, 23 (7), 859–873.
Vanderbei, R. J., & Shanno, D. F. (1999). An interior point algorithm for nonconvex nonlinear programming. Comp. Opt. Applic., 13, 231–252.
Vassiliadis, V. (1993). Computational solution of dynamic optimization problems with general differential algebraic constraints. Ph.D. Thesis, University of London, London, UK.
Wajge, R. M., & Reklaitis, G. V. (1995). An optimal campaign structure for multicomponent batch distillation with reversible reaction, American Institute of Chemical Engineers Annual Meeting.
Wright, M. (1998). Ill-conditioning and computational error in interior methods for nonlinear programming. SIAM Journal on Optimization, 9 (1), 84–111.
Yamashita, H., Yabe, H., & Tanabe, T. (1997). A globally and superlinearly convergent primal-dual interior point trust region method for large scale constrained optimization. Technical Report, Mathematical systems, Tokyo, Japan, July 1997 (revised July, 1998).
