Comput Optim Appl (2014) 57:599–621
DOI 10.1007/s10589-013-9607-y

A derivative-free algorithm for linearly constrained optimization problems

E.A.E. Gumma · M.H.A. Hashim · M. Montaz Ali

Received: 30 November 2011 / Published online: 3 October 2013


© Springer Science+Business Media New York 2013

Abstract Based on the NEWUOA algorithm, a new derivative-free algorithm, named LCOBYQA, is developed. The main aim of the algorithm is to find a minimizer
x* ∈ R^n of a non-linear function, whose derivatives are unavailable, subject to linear
inequality constraints. The algorithm is based on a model of the given function constructed from a set of interpolation points. LCOBYQA is iterative; at each iteration
it constructs a quadratic approximation (model) of the objective function that satisfies interpolation conditions, and leaves some freedom in the model. The remaining
freedom is resolved by minimizing the Frobenius norm of the change to the second derivative matrix of the model. The model is then minimized over a trust region,
using the conjugate gradient method, to obtain a new iterate. At times the new
iterate is found from a model iteration, designed to improve the geometry of the interpolation points. Numerical results are presented which show that LCOBYQA works
well and is very competitive against available model-based derivative-free algorithms.

E.A.E. Gumma
Department of Mathematics, Faculty of Pure and Applied Sciences, International University of
Africa, P.O. Box: 2469, Khartoum, Sudan
e-mail: elzain@aims.ac.za
E.A.E. Gumma
e-mail: elzain.elzain@yahoo.com

M.H.A. Hashim
Department of Applied Mathematics, Faculty of Mathematical Sciences, University of Khartoum,
P.O. Box: 321, Khartoum, Sudan.
e-mail: mhashim@uofk.edu
M.H.A. Hashim
e-mail: mohsinhashim@yahoo.com

M.M. Ali (corresponding author)
School of Computational and Applied Mathematics, University of the Witwatersrand, Johannesburg,
South Africa
e-mail: Montaz.Ali@wits.ac.za

Keywords Derivative-free optimization · Linearly constrained problem · Least Frobenius norm method · Trust-region subproblem · Conjugate gradient method

1 Introduction

Generally, in optimization problems, there is useful information in the derivatives
of the function one wishes to optimize. For instance, the gradient and the Hessian
matrix of the objective function characterize a local minimum of the function to be
optimized. However, in many engineering problems the analytical expressions of the
objective function and constraints are unavailable and their values are computed by
means of complex black-box simulations. Therefore, the derivatives of such problems
are unavailable, although we assume that the derivatives exist. Derivative-free
methods are useful for solving these problems.
In this paper we design an algorithm to solve the optimization problem:

min f (x), x ∈ Rn , (1)


subject to AT x ≥ b, A ∈ Rn×m̂ , b ∈ Rm̂ , (2)

where f (x) is a smooth nonlinear real-valued function; the gradient and the Hes-
sian of f(x) are unavailable. The algorithm we are going to design can also solve
unconstrained and simple bound constrained problems. The simple bounds

l ≤ x ≤ u, l, u ∈ Rn , (3)

can be written as
I x ≥ l, −I x ≥ −u. (4)
So, we can set A = [I, −I] and b = [l^T, −u^T]^T, where I is the n × n identity matrix.
In Eq. (3), if we let the components of l be very large negative numbers and the
components of u be very large numbers, then we have an unconstrained optimization
problem. Thus, our algorithm can also solve unconstrained and simple bound
constrained problems.
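As a concrete illustration of this reformulation, the following minimal sketch (in Python with NumPy, rather than the MATLAB implementation mentioned later; the function name is our own) packs simple bounds into the data A and b of (2):

```python
import numpy as np

def bounds_to_linear(l, u):
    """Pack simple bounds l <= x <= u into the form A^T x >= b of (2),
    i.e. the constraints I x >= l and -I x >= -u of (4)."""
    l = np.asarray(l, dtype=float)
    u = np.asarray(u, dtype=float)
    n = l.size
    I = np.eye(n)
    A = np.hstack([I, -I])            # the columns of A are the constraint normals
    b = np.concatenate([l, -u])
    return A, b

# Example: the box 0 <= x <= 1 in R^2; the point (0.5, 0.25) satisfies A^T x >= b.
A, b = bounds_to_linear([0.0, 0.0], [1.0, 1.0])
x = np.array([0.5, 0.25])
assert np.all(A.T @ x >= b)
```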
Derivative-free methods can be classified into two classes: the direct search methods and the model-based methods. The algorithm presented in this paper belongs to
the latter class. In the literature, several model-based derivative-free algorithms have
been proposed. The first attempt at employing available objective function values f(x)
for building a quadratic model by interpolation was made by Winfield [21, 22].
In 1994, Powell [14] proposed the COBYLA algorithm for constrained optimiza-
tion without derivatives. In this proposal, the objective function and the constraints
are approximated by linear multivariate models. Powell [15] also extended the idea
of Winfield further by developing his UOBYQA algorithm for unconstrained opti-
mization without derivatives by using quadratic interpolation models of the objective
function. A variant model-based algorithm (DFO), using quadratic Newton fundamental polynomials, was proposed by Conn, Sheinberg and Toint [3, 4]. In 2005, Frank
[7] proposed a variant of the UOBYQA algorithm [15], named CONDOR. CONDOR

is a derivative-free algorithm for unconstrained and easy constrained problems. In


2006, Powell [17] proposed NEWUOA, which is a derivative-free algorithm for un-
constrained optimization problems based on the least Frobenius norm method. An
extension to NEWUOA for simple bound constrained problems based on the least
Frobenius norm method was also proposed by Powell [19], named BOBYQA.
Our strategy to solve problems (1)–(2) is based on approximating the objective
function by a quadratic model. This technique is highly useful to obtain a fast rate
of convergence in derivative-free optimization, because it approximates the curvature
of the objective function. Therefore, at each iteration, at a point x, we construct the
quadratic model Q(x + p) = c + p^T g + (1/2) p^T G p. We cannot define g = ∇f(x) and
G = ∇²f(x) because, as mentioned earlier, the gradient and the Hessian matrix of
f(x) are unavailable. Instead, the constant c, the vector g and the symmetric matrix
G ∈ R^{n×n} are determined by imposing the interpolation conditions

Q(x_i) = f(x_i),   i = 1, 2, . . . , s,   (5)

where s = (1/2)(n + 1)(n + 2). In this case, the parameters of Q can be written as a linear
system of equations in the coefficients of the model. If we choose the interpolation
points so that the linear system is nonsingular, then the model Q will be defined
uniquely. On the other hand, the use of a full quadratic model limits the size of the
problems that can be solved in practice. One of the methods that overcomes this draw-
back was proposed by Powell [16]. The method of Powell constructs a full quadratic
model that satisfies interpolation conditions, and leaves some freedom in the model.
The remaining freedom is taken up by minimizing the Frobenius norm of the second
derivative matrix of the change to the model. This variational problem is expressed
as the solution of an (m + n + 1) × (m + n + 1) system of linear equations, where m is the
number of interpolation points, which satisfies n + 2 ≤ m ≤ (1/2)(n + 1)(n + 2).
The algorithm presented here (LCOBYQA) is based on the above principle
of Powell and thus, an extension to Powell’s algorithms (NEWUOA, BOBYQA)
[17, 19]. The name LCOBYQA is an acronym for Linearly Constrained Optimization
BY Quadratic Approximation. LCOBYQA is an iterative algorithm. A typical iter-
ation of the algorithm generates a new vector of variables either by minimizing the
quadratic model in a trust-region subject to linear constraints (trust-region subprob-
lem), or by a procedure that should improve the geometry of the interpolation points
(model iteration subproblem).
The rest of the paper is organized as follows. Since our algorithm is an extension of
Powell’s work, the latter is discussed in detail in Sect. 2. This makes it easy to describe
our algorithm in Sect. 3. Several numerical results are presented and discussed in
Sect. 4. The paper ends in Sect. 5 with a conclusion and an outlook to possible future
research.

2 Least Frobenius norm method for updating quadratic models

Quadratic approximations of the objective function are highly useful for obtaining a
fast rate of convergence in derivative-free algorithms, because usually some attention

has been given to the curvature of the objective function. On the other hand, each full
quadratic model has (1/2)(n + 1)(n + 2) independent parameters, and this limits the size
of the problems that can be solved in practice. Therefore, Powell [17, 19] investigated the idea of constructing a suitable quadratic model from m interpolation points,
where m is much less than (1/2)(n + 1)(n + 2). If m = 2n + 1, then there are enough data
to define a quadratic model with a diagonal second derivative matrix, which is done
before the first iteration. Specifically, on each iteration, the methods of NEWUOA
and BOBYQA construct a quadratic model Q(x), x ∈ R^n, of the objective function
f(x), x ∈ R^n, that is required to satisfy the interpolation conditions
Q(x i ) = f (x i ), i = 1, 2, . . . , m, (6)
where m is prescribed by the user, the value m = 2n + 1 being typical, and where
the positions of the different points x i , i = 1, 2, . . . , m, are generated automatically.
These conditions leave much freedom in Q, taken up when the model is updated by
minimizing the Frobenius norm of the change to the second derivative matrix of Q.
The success of the method of NEWUOA and BOBYQA is due to a well-known
technique that is suggested by the symmetric Broyden method for updating ∇ 2 Q
when first derivatives of f (x) are available (see p. 73 of [6]). Let an old model Qold
be present, and let the new model Qnew be required to satisfy conditions (6) and
leave some freedom in the parameters of Qnew . The technique takes up this freedom
by minimizing the Frobenius norm of ∇ 2 Qnew − ∇ 2 Qold . One reason for trying the
symmetric Broyden method is that the calculation of Qnew from Qold requires only
O(n²) operations in the case m = 2n + 1, but O(n⁴) operations are needed if Q_new
is defined completely by conditions (6), when m = (1/2)(n + 1)(n + 2), see [19]. The
second reason is that if the objective function f (x) is quadratic, then the symmetric
Broyden method has the property:
‖∇²Q_new − ∇²f‖²_F = ‖∇²Q_old − ∇²f‖²_F − ‖∇²Q_new − ∇²Q_old‖²_F,   (7)

which suggests that the approximations ∇²Q ≈ ∇²f become more accurate as the
iterations proceed [16].

2.1 The solution of the variational problem

Let the current quadratic model Q_old be present and have the form

Q_old(x) = c_old + (x − x_0)^T g_old + (1/2)(x − x_0)^T G_old (x − x_0),   x ∈ R^n,   (8)
where x 0 is a fixed point. Let x opt be the point that satisfies:
 
f(x_opt) = min{ f(x_i) : i = 1, 2, . . . , m },   (9)
where x i , i = 1, 2, . . . , m, are the interpolation points of Qold . On each iteration of
NEWUOA, the current iteration generates a new point x + . The position of x opt is
central to the choice of x + in trust region methods. Indeed, x + is calculated to be a
sufficiently accurate estimate of the vector x ∈ Rn that solves the subproblem
min Q_old(x),   subject to ‖x − x_opt‖ ≤ ρ,   (10)

where ρ is adjusted automatically [17]. On some iterations x^+ may be generated in a


different way that is intended to improve the accuracy of the quadratic model. Once
x^+ is calculated, one of the old interpolation points, x_t say, is removed to make
room for x^+. Thus, the interpolation points x_i^+, i = 1, 2, . . . , m, of Q_new are x^+ and
m − 1 of the old points x_i. Let

Q_new(x) = c^+ + (x − x_0)^T g^+ + (1/2)(x − x_0)^T G^+ (x − x_0),   x ∈ R^n,   (11)

be the new quadratic model that satisfies the conditions (6) at the new interpolation
points, leaving some freedom in the parameters of Qnew . NEWUOA takes up this
freedom by minimizing the Frobenius norm of ∇ 2 D(x), where

D(x) = Q_new(x) − Q_old(x) = c + (x − x_0)^T g + (1/2)(x − x_0)^T G (x − x_0),   x ∈ R^n,   (12)
subject to the conditions (6). This problem can be written as:

min (1/4)‖G^+ − G_old‖²_F = (1/4) Σ_{i=1}^n Σ_{j=1}^n G_ij²,   (13)

subject to

c + (x_i^+ − x_0)^T g + (1/2)(x_i^+ − x_0)^T G (x_i^+ − x_0) = [ f(x_i^+) − Q_old(x_i^+) ] δ_it,
i = 1, 2, . . . , m,   (14)

where δit is the Kronecker delta.


As reported in [16], the positions of the interpolation points x i are required to have
the properties:
• A1: Let M be the space of quadratic polynomials from Rn to R that are zero at
x_i, i = 1, 2, . . . , m. Then the dimension of M is s − m, where s = (1/2)(n + 1)(n + 2).
• A2: If q(x), x ∈ Rn , is any linear polynomial that is zero at x i , i = 1, 2, . . . , m,
then q is identically zero.
Problem (13), subject to (14) is a convex quadratic problem. Therefore, from the KKT
conditions, there exist unique Lagrange multipliers λk , k = 1, 2, . . . , m, such that the
first derivative of the expression

L(c, g, G) = (1/4) Σ_{i=1}^n Σ_{j=1}^n G_ij² − Σ_{k=1}^m λ_k [ c + (x_k^+ − x_0)^T g + (1/2)(x_k^+ − x_0)^T G (x_k^+ − x_0) ],   (15)

with respect to the parameters of L is zero. This implies that


Σ_{k=1}^m λ_k = 0,   Σ_{k=1}^m λ_k (x_k^+ − x_0) = 0,   (16)

and

G = Σ_{k=1}^m λ_k (x_k^+ − x_0)(x_k^+ − x_0)^T.   (17)

The second part of Eq. (16) and the conditions (14) give rise to the variational square
linear system
W (λ, c, g)^T = [ Λ  X^T ; X  O ] (λ, c, g)^T = (r, 0)^T,   (18)
where W is an (m + n + 1) × (m + n + 1) matrix, Λ has the elements

Λ_ij = (1/2) [ (x_i^+ − x_0)^T (x_j^+ − x_0) ]²,   i, j = 1, 2, . . . , m,   (19)
X is the (n + 1) × m matrix
 
X = [ 1, 1, . . . , 1 ; x_1^+ − x_0, x_2^+ − x_0, . . . , x_m^+ − x_0 ],   (20)

and where r is a vector in R^m with components [ f(x_i^+) − Q_old(x_i^+) ] δ_it,
i = 1, 2, . . . , m. Suppose that conditions A1 and A2 hold for the interpolation points x_i^+,
i = 1, 2, . . . , m. Then the matrix X has full row rank, the solution of the system (18)
gives the parameters of D, and the new model is given by

Qnew (x) = Qold (x) + D(x). (21)
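To make the variational calculation concrete, the following sketch (Python/NumPy; the function name is our own, and a dense solve replaces the factorized updates the algorithm actually maintains) assembles W from (18)–(20), solves for the parameters of D, and recovers its Hessian via (17):

```python
import numpy as np

def solve_variational_system(pts, x0, r):
    """Assemble W of (18) for interpolation points pts (m x n), solve
    W [lambda; c; g] = [r; 0], and return lambda, c, g and the Hessian G of D."""
    m, n = pts.shape
    Y = pts - x0                          # shifted points, rows y_k = x_k - x0
    Lam = 0.5 * (Y @ Y.T) ** 2            # Lambda_ij of (19)
    X = np.vstack([np.ones(m), Y.T])      # (n+1) x m matrix of (20)
    W = np.block([[Lam, X.T],
                  [X, np.zeros((n + 1, n + 1))]])
    rhs = np.concatenate([r, np.zeros(n + 1)])
    sol = np.linalg.solve(W, rhs)
    lam, c, g = sol[:m], sol[m], sol[m + 1:]
    G = (Y.T * lam) @ Y                   # eq (17): sum_k lambda_k y_k y_k^T
    return lam, c, g, G
```

In the notation of Sect. 2.1, r has the single nonzero component f(x_t^+) − Q_old(x_t^+), and Q_new is then obtained from (21) by adding D to Q_old.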

2.2 The Lagrange functions of the interpolation points

As presented in [16], Lagrange polynomials play a fundamental role in preserving


nonsingularity of the system (18). The Lagrange polynomials of the interpolation
problem at the interpolation points x_i, i = 1, 2, . . . , m, are quadratic polynomials
ℓ_j(x), x ∈ R^n, j = 1, 2, . . . , m, that satisfy the conditions

ℓ_j(x_i) = δ_ij,   1 ≤ i, j ≤ m,   (22)

where δij is the Kronecker delta. In order for these polynomials to be applicable to the
variational system (18), the conditions A1 and A2 are retained on the positions of the
interpolation points, and for each j = 1, 2, . . . , m, the remaining freedom in ℓ_j(x)
is taken up by minimizing the Frobenius norm ‖∇²ℓ_j‖_F subject to the constraints
in (22). Therefore, if the right hand side of the system (18) is replaced by the j-th

coordinate vector in R^{m+n+1}, then the parameters of ℓ_j are defined by this system.
Thus, if Q is the quadratic polynomial

Q(x) = Σ_{j=1}^m f(x_j) ℓ_j(x),   x ∈ R^n,   (23)

then its parameters satisfy Eq. (18). It follows from the nonsingularity of the system
(18) that the expression (23) is the Lagrange form of the solution of the variational
problem (18). Let
 
H = [ Ω  Ξ^T ; Ξ  Υ ] = W^{−1}

be the inverse of the matrix W of the system (18). The definition of ℓ_j, where j is any
integer in {1, 2, . . . , m}, implies that the j-th column of H provides the parameters
of ℓ_j. In particular, because of Eq. (17), ℓ_j has the second derivative matrix


G_j = ∇²ℓ_j = Σ_{k=1}^m H_kj (x_k − x_0)(x_k − x_0)^T,   j = 1, 2, . . . , m.   (24)

Therefore, ℓ_j is the polynomial

ℓ_j(x) = c_j + (x − x_0)^T g_j + (1/2)(x − x_0)^T G_j (x − x_0),   x ∈ R^n,   (25)
where c_j equals H_{m+1,j} and g_j is the vector with components H_ij, i =
m + 2, m + 3, . . . , m + n + 1. Since the parameters of ℓ_j(x) depend
on H, the elements of the matrix H are required to be available.
To discuss the relation between the polynomials ℓ_j(x), j = 1, 2, . . . , m, and nonsingularity of the system (18), let x^+ be the new vector of variables that will replace one of the interpolation points x_i, i = 1, 2, . . . , m. When x^+ replaces x_t,
t ∈ {1, 2, . . . , m}, x_t is dismissed, so the new interpolation points are the vectors

x_t^+ = x^+,   x_i^+ = x_i,   i ∈ {1, 2, . . . , m}\{t}.   (26)

One advantage of the Lagrange polynomials is that they provide a convenient way
of maintaining the conditions A1 and A2 of Sect. 2.1, see [16]. These conditions are
inherited by the new interpolation points if t is chosen so that ℓ_t(x^+) is nonzero.
It can be seen that at least one of the values ℓ_j(x^+), j = 1, 2, . . . , m, is nonzero,
because interpolation to a constant function yields

Σ_{j=1}^m ℓ_j(x) = 1,   x ∈ R^n.   (27)

Another advantage of the Lagrange polynomials is that they improve the accuracy
of the quadratic model. In order to improve the accuracy of the quadratic model,
Powell has chosen an alternative to solving the trust region subproblem (10). In this

alternative the interpolation point that is going to be replaced by x + , namely x t , is


selected before the position of x^+ is chosen. However, x_t is often the element of the
set {x_i : i = 1, . . . , m} that is furthest from the best point x_opt. Having picked the
index t, the value of |ℓ_t(x^+)| is made relatively large by letting x^+ be an estimate
of the vector x ∈ R^n that solves the model iteration subproblem

max ℓ_t(x),   subject to ‖x − x_opt‖ ≤ Δ̄_k,   (28)

where Δ̄k is prescribed.

2.3 The procedure for updating the matrix H

In this subsection, a procedure for the calculation of H + from H is discussed. It so


happens that the change (26) to the interpolation points causes the symmetric matrices
W = H −1 and W + = (H + )−1 to differ only in their t-th row and t-th column. Also,
recall that W is not stored. Therefore, our formula for H^+ is going to depend only
on H and the vector w_t^+ ∈ R^{m+n+1}, which is the t-th column of W^+. These data
completely define H^+, because in theory the updating can be done by inverting H to
give W. The availability of w_t^+ allows the symmetric matrix W^+ to be formed from
W, and H^+ is set to the inverse of W^+. This procedure provides excellent protection
against the accumulation of computer rounding errors. As presented in [16], H^+ is
given by:

H^+ = H + (1/σ_t) [ α_t (e_t − Hw)(e_t − Hw)^T − β_t H e_t e_t^T H + τ_t ( H e_t (e_t − Hw)^T + (e_t − Hw) e_t^T H ) ],   (29)

where et is the t-th coordinate vector in Rm+n+1 , w ∈ Rm+n+1 is the vector that has
the components

w_i = (1/2) [ (x_i − x_0)^T (x^+ − x_0) ]²,   i = 1, 2, . . . , m,
w_{m+1} = 1,   and   w_{m+i+1} = (x^+ − x_0)_i,   i = 1, 2, . . . , n,   (30)

also

α_t = e_t^T H e_t,   β_t = (e_t − Hw)^T w + η_t,   τ_t = e_t^T H w,   and   σ_t = α_t β_t + τ_t²,   (31)

where
η_t = (1/2)‖x^+ − x_0‖⁴ − e_t^T w.   (32)
Once H + is constructed, the submatrices Ξ and Υ of H are overwritten by Ξ +
and Υ + respectively, and the factorization of Ω + is stored instead of Ω. The pur-
pose of the factorization is to reduce the damage from rounding errors to the identity

W = H −1 , which is fulfilled at the beginning of each iteration, see [17]. Let the fac-
torization Ω = V V T be given, the new factorization of Ω + can be constructed by
changing only one column of V , the first column say, see [19]. Specifically, the first
column has the form
V_{i1}^+ = σ_t^{−1/2} [ τ_t V_{i1} + ( e_t − e_opt − H(w − v) )_i V_{t1} ],   i = 1, 2, . . . , m,   (33)

where v ∈ Rm+n+1 is defined by


v_i = (1/2) [ (x_i − x_0)^T (x_opt − x_0) ]²,   i = 1, 2, . . . , m,
v_{m+1} = 1,   v_{m+1+i} = (x_opt − x_0)_i,   i = 1, 2, . . . , n.   (34)

Here the subscript opt in e_opt is the index of the best point x_opt. The quantities σ_t, τ_t are defined by
Eq. (31) and the vector w is defined by Eq. (30). For full details of these calculations,
see [17, 19].
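The update (29)–(32) can be sketched as follows (Python/NumPy; the function name and the dense representation of H are our own simplifications, whereas the actual code stores Ξ, Υ and the factorization of Ω):

```python
import numpy as np

def update_H(H, pts, x0, x_plus, t):
    """Apply (29)-(32): update H = W^{-1} after the interpolation point
    pts[t] is replaced by x_plus (t is a 0-based index)."""
    m, n = pts.shape
    Y = pts - x0
    yp = x_plus - x0

    w = np.empty(m + n + 1)                        # the vector w of (30)
    w[:m] = 0.5 * (Y @ yp) ** 2
    w[m] = 1.0
    w[m + 1:] = yp

    e_t = np.zeros(m + n + 1); e_t[t] = 1.0
    Hw = H @ w
    alpha = H[t, t]                                # alpha_t = e_t^T H e_t
    eta = 0.5 * np.dot(yp, yp) ** 2 - w[t]         # eq (32)
    beta = (e_t - Hw) @ w + eta                    # eq (31)
    tau = Hw[t]                                    # tau_t = e_t^T H w
    sigma = alpha * beta + tau ** 2

    d = e_t - Hw
    He = H[:, t]                                   # H e_t (H is symmetric)
    return H + (alpha * np.outer(d, d) - beta * np.outer(He, He)
                + tau * (np.outer(He, d) + np.outer(d, He))) / sigma
```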

3 Linearly constrained optimization by the quadratic approximation algorithm

In this section, we describe our algorithm, LCOBYQA, for solving problem (1) sub-
ject to the linear constraints (2). Both NEWUOA and LCOBYQA are based on the
main idea of Powell, but our algorithm differs from the NEWUOA in three main
procedures, namely:
• the initial calculations procedure,
• the trust-region subproblem, and
• the geometry improvement subproblem (the model iteration subproblem).

3.1 Initial calculations

The LCOBYQA algorithm requires the initial point x_0, the coefficient matrix A^T of
the linear constraints, the right hand side vector b of the constraints, the parameters
ρ_beg, ρ_end, ρ_1 and Δ_1, which satisfy ρ_1 = Δ_1 = ρ_beg and ρ_beg > ρ_end, and an integer
m = 2n + 1, where n is the number of variables. In order to construct the initial
interpolation points, we need the point x_0 to be strictly feasible. If x_0 is not strictly
feasible, the algorithm calls the phase one procedure of linear programming to provide a vertex point (see [12]), and then uses the following steps to construct a strictly
feasible point.
Let x̄ be the vertex point generated by the phase one procedure. Suppose
that {1, 2, . . . , l} is the index set of the active constraints at x̄, i.e.,

a_i^T x̄ = b_i,   i = 1, 2, . . . , l.   (35)

For a direction d to be a nonbinding feasible direction with respect to i, i =


1, 2, . . . , l, it must satisfy

a_i^T d > 0,   i = 1, 2, . . . , l.   (36)

Let A_l = [a_1, . . . , a_l], so A_l^T d > 0 characterizes the feasible direction d which is
nonbinding with respect to 1, 2, . . . , l. Such a direction can be found by solving the
linear system

A_l^T d = e,   e = [1, 1, . . . , 1]^T.   (37)

The candidate point x_0 is set to x̄ + αd, 0 ≤ α ≤ 2Δ_1. Initially, we set α = 2Δ_1;
if x_0 = x̄ + αd is strictly feasible then we are done. Otherwise, we reduce α iteratively. These steps produce a strictly feasible point; for more details see [10]. The
procedure that calculates the strictly feasible point x_0 is denoted by MOVE in
the LCOBYQA algorithm, see Sect. 3.6.
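A minimal sketch of the MOVE idea, under the stated assumptions (Python/NumPy; the function name, the least-squares solve of (37) and the halving of α are our own choices):

```python
import numpy as np

def move_to_interior(x_bar, A, b, active_idx, Delta1, shrink=0.5, tol=1e-12):
    """From a vertex x_bar with active constraints active_idx, find a strictly
    feasible point x_bar + alpha*d, where d solves A_l^T d = e as in (37)."""
    A_l = A[:, active_idx]                    # normals of the active constraints
    e = np.ones(len(active_idx))
    d, *_ = np.linalg.lstsq(A_l.T, e, rcond=None)

    alpha = 2.0 * Delta1                      # initial value 2*Delta_1
    while alpha > tol:
        x0 = x_bar + alpha * d
        if np.all(A.T @ x0 > b):              # strict feasibility of x0
            return x0
        alpha *= shrink                       # reduce alpha iteratively
    raise RuntimeError("failed to find a strictly feasible point")
```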
Once x 0 is constructed, the first m interpolation points are given by [17]:

x_1 = x_0,   x_{i+1} = x_0 + ρ_beg e_i,   x_{n+1+i} = x_0 − ρ_beg e_i,   i = 1, 2, . . . , n,   (38)

where e_i is the i-th coordinate vector in R^n. If necessary, the values of
ρ_beg, ρ_1 and Δ_1 are reduced to guarantee that all the initial points we construct are feasible, see [10].
Given x 0 , we now consider the first quadratic model
Q_1(x_0 + p_1) = Q_1(x_0) + p_1^T g_1(x_0) + (1/2) p_1^T G_1 p_1,   p_1 ∈ R^n,
subject to A^T(x_0 + p_1) ≥ b,   (39)

that satisfies the interpolation conditions

Q1 (x i ) = f (x i ), i = 2, . . . , m, (40)

where m = 2n + 1. The vector g 1 and the diagonal elements (G1 )ii , i = 1, 2, . . . , n,


of the diagonal matrix G1 , are given uniquely by the conditions (40). The initial
calculations of LCOBYQA also set the initial matrix H = W −1 of the first model,
where W occurs in the linear system (18).
It is straightforward to derive the elements of W for the initial points x_i, i =
1, 2, . . . , m, but, as mentioned in Sect. 2.3, it is required to have the elements of
Ξ and Υ explicitly, with the factorization of Ω. Fortunately, the chosen positions of
the initial interpolation points provide convenient formulae for all of these terms, see
[17]. The first row of the initial (n + 1) × m matrix Ξ has the simple form

Ξ1j = δ1j , j = 1, 2, . . . , m. (41)

Further, for integer i = 2, 3, . . . , n + 1, the i-th row of the initial Ξ also has just two
nonzero elements

Ξ_{i,i} = (2ρ_beg)^{−1}   and   Ξ_{i,i+n} = −(2ρ_beg)^{−1}.   (42)

This completes the definition of Ξ for the initial interpolation points. Moreover, the
initial (n + 1) × (n + 1) matrix Υ is identically zero. As mentioned in the last paragraph of Sect. 2.3, the factorization of Ω, which guarantees that the rank of Ω is at
most m − n − 1, has the form


Ω = Σ_{k=1}^{m−n−1} v_k v_k^T = V V^T,   (43)

where the components of the initial vector vk ∈ Rm , which is the k-th column of V ,
are given the values
V_{1,k} = −√2 ρ_beg^{−2},   V_{k+1,k} = (1/2)√2 ρ_beg^{−2},   V_{k+n+1,k} = (1/2)√2 ρ_beg^{−2},
V_{j,k} = 0 otherwise,   where 1 ≤ k ≤ n, j = 1, 2, . . . , m.   (44)

We see that each of these columns has just three nonzero elements.
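For reference, the initial interpolation points (38) and the sparse initial Ξ and V of (41)–(44) can be generated as in the following sketch (Python/NumPy, with m = 2n + 1; the function name is our own):

```python
import numpy as np

def initial_data(x0, rho_beg):
    """Initial interpolation points (38) together with the initial Xi of
    (41)-(42) and V of (44); the initial Upsilon is the zero matrix."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    m = 2 * n + 1
    pts = np.tile(x0, (m, 1))
    for i in range(n):
        pts[i + 1, i] += rho_beg              # x_{i+1} = x0 + rho_beg * e_i
        pts[n + 1 + i, i] -= rho_beg          # x_{n+1+i} = x0 - rho_beg * e_i

    Xi = np.zeros((n + 1, m))
    Xi[0, 0] = 1.0                            # eq (41)
    for i in range(1, n + 1):
        Xi[i, i] = 1.0 / (2.0 * rho_beg)      # eq (42)
        Xi[i, i + n] = -1.0 / (2.0 * rho_beg)

    V = np.zeros((m, n))                      # m - n - 1 = n columns, eqs (43)-(44)
    s = np.sqrt(2.0) / rho_beg ** 2
    for k in range(n):
        V[0, k] = -s
        V[k + 1, k] = 0.5 * s
        V[k + n + 1, k] = 0.5 * s
    return pts, Xi, V
```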
Let x opt be an interpolation point such that f (x opt ) is the least calculated value
of f so far. Each trust-region iteration solves a quadratic subproblem at x opt subject
to linear inequality constraints, using a version of the active set method for indefinite
quadratic programming problems. However, this method requires the initial reduced Hessian matrix at x_opt to be positive definite. Therefore, if the reduced Hessian
at x_opt is not positive definite, then artificial constraints are added by the algorithm to
the initial working set (the set of active constraints at x_opt). These constraints involve
artificial variables y_i, and are of the form y_i ≥ (x_opt)_i or y_i ≤ (x_opt)_i. The purpose
of the artificial constraints is to convert the reduced Hessian matrix at x_opt into a positive definite matrix [9]. As the iterations of the algorithm proceed, the artificial
constraints are removed automatically.

3.2 The trust-region subproblem

Let x_opt be an interpolation point such that f(x_opt) is the least calculated value of
f(x) so far. Assume that q constraints are active at x_opt; let Â^T denote the matrix
whose rows correspond to the active constraints at x_opt, let b̂ be the corresponding
right hand side vector, and let Î be the working set at x_opt, i.e. Î is the index set of the active
constraints. The quadratic model at the k-th iteration is defined by

Q_k(x_opt + p_k) = f(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k,   p_k ∈ R^n,   (45)
its parameters being the vector g k ∈ Rn and the n × n symmetric matrix Gk . The
model Qk has to satisfy the interpolation conditions (6). Once the quadratic model
is constructed, LCOBYQA is directed to one of the two iterations, namely the ‘trust-
region’ iteration or the ‘model’ iteration. In each ‘trust-region’ iteration, a step p k
from x_opt is defined as the vector that solves

min Q_k(x_opt + p_k) = Q_k(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k,   p_k ∈ R^n,
subject to A^T(x_opt + p_k) ≥ b,   ‖p_k‖ ≤ Δ_k.   (46)

The solution of problem (46) is presented in detail in Sect. 3.4. If it occurs that
‖p_k‖ < (1/2)ρ_k then x_opt + p_k is considered to be sufficiently close to x_opt, and the
algorithm is switched to the ‘model’ iteration. Otherwise, the new function value
f(x_opt + p_k) is calculated; ρ_{k+1} is set to ρ_k; and the ratio RATIO is calculated by the
formula

RATIO = [ f(x_opt) − f(x_opt + p_k) ] / [ Q_k(x_opt) − Q_k(x_opt + p_k) ].   (47)
A new trust-region radius Δk+1 is chosen by the formula (see [19]):

Δ_{k+1} = min[ (1/2)Δ_k, ‖p_k‖ ],    if RATIO ≤ 0.1,
Δ_{k+1} = max[ (1/2)Δ_k, ‖p_k‖ ],    if 0.1 < RATIO ≤ 0.7,   (48)
Δ_{k+1} = max[ (1/2)Δ_k, 2‖p_k‖ ],   if RATIO > 0.7,

and the new point is defined by the equation:



x^+ = x_opt + p_k,   if f(x_opt + p_k) < f(x_opt),
x^+ = x_opt,          if f(x_opt + p_k) ≥ f(x_opt).   (49)
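The following sketch gathers (47)–(49) into one routine (Python; the names are our own and the actual algorithm interleaves these tests with the updates described below):

```python
def trust_region_update(f_opt, f_new, q_opt, q_new, Delta_k, step_norm):
    """RATIO of (47), the radius update (48) and the acceptance rule (49)."""
    ratio = (f_opt - f_new) / (q_opt - q_new)
    if ratio <= 0.1:
        Delta_next = min(0.5 * Delta_k, step_norm)
    elif ratio <= 0.7:
        Delta_next = max(0.5 * Delta_k, step_norm)
    else:
        Delta_next = max(0.5 * Delta_k, 2.0 * step_norm)
    accept = f_new < f_opt          # eq (49): move only on strict decrease
    return ratio, Delta_next, accept
```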

If RATIO> 0.1 then we select the integer t, the index of the interpolation point
that will be removed. The selection of t ∈ {1, 2, . . . , m} provides a relatively large
denominator |σ_t| = |α_t β_t + τ_t²|. Specifically, t is set to the integer in the set
{1, 2, . . . , m}\{opt} that maximizes the weighted denominator

max{ 1, ‖x_t − x_opt‖²/Δ_k² } [ H_tt ( (1/2)‖x^+ − x_0‖⁴ − w^T H w ) + (e_t^T H w)² ],   (50)

where w is given by (30). For justification of this choice, see [19].


Also, if RATIO> 0.1 then the model Q is updated using the formula (29) such
that Q interpolates f(x) at x^+ instead of x_t. If x^+ minimizes f(x) subject to the
constraints whose indices are in Î then we calculate the Lagrange multipliers at x^+. If all
Lagrange multipliers at x^+ are positive then we terminate the algorithm. Otherwise
we delete the index of the constraint with the least Lagrange multiplier from the working
set and update Â^T, b̂ and Î. If x^+ reaches the boundary of a constraint whose index
is not in the working set then we add this constraint to the working set and update Â^T, b̂ and Î.
Finally, we set k = k + 1 and the algorithm performs another trust-region iteration.

3.3 The model iteration subproblem

If the RATIO calculated by (47) satisfies RATIO < 0.1, then the algorithm sets t to
be an integer in {1, 2, . . . , m} such that x_t maximizes the distance DIST = ‖x_i −
x_opt‖ over all i. If DIST ≥ 2Δ_k, then the geometry of the interpolation points needs to be
improved; otherwise the trust region is shrunk. If the geometry of the interpolation points
needs to be improved, then the algorithm invokes the ‘model’ iteration. The ‘model’
iteration tries to improve the geometry of the interpolation points by choosing a step
p_k which solves problem (28), subject to A^T(x_opt + p_k) ≥ b, ‖p_k‖ ≤ Δ_k, i.e. it

maximizes |ℓ_t(x_opt + p_k)| subject to A^T(x_opt + p_k) ≥ b, ‖p_k‖ ≤ Δ_k. The solution
of this problem is presented in Sect. 3.5. Once p_k is chosen, the model Q is
updated using the formula (29) such that Q interpolates f(x) at x_opt + p_k instead of
x_t. If f(x_opt + p_k) < f(x_opt) then x_opt is overwritten by x_opt + p_k. Finally, we set
k = k + 1 and the algorithm performs another trust-region iteration.
If the geometry of the interpolation points does not need to be improved then the
value of ρ_k is decreased from ρ_k to ρ_{k+1} by the following rule [19]:

ρ_{k+1} = ρ_end,              if ρ_k ≤ 16ρ_end,
ρ_{k+1} = (ρ_k ρ_end)^{1/2},  if 16ρ_end < ρ_k ≤ 250ρ_end,   (51)
ρ_{k+1} = 0.1ρ_k,             if ρ_k > 250ρ_end.
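Rule (51) translates directly into a few lines (Python; the function name is our own):

```python
def reduce_rho(rho_k, rho_end):
    """Reduce the lower bound rho on the trust-region radius, following (51)."""
    if rho_k <= 16.0 * rho_end:
        return rho_end
    if rho_k <= 250.0 * rho_end:
        return (rho_k * rho_end) ** 0.5   # geometric mean of rho_k and rho_end
    return 0.1 * rho_k
```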

3.4 The solution of the trust-region subproblem

Consider the trust-region subproblem

min Q_k(x_opt + p_k) = Q_k(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k,   p_k ∈ R^n,
subject to A^T(x_opt + p_k) ≥ b,   ‖p_k‖ ≤ Δ_k,

where the parameters g k and Gk are given data. We will use a null space active
set version of the truncated conjugate gradient procedure (for indefinite quadratic
programming problems) to solve the above subproblem. For more details see [5, 8–
10, 13]. The idea of choosing the active set method in this work is motivated by the
fact that if the correct working set at the solution is known a priori then the solution of
the linear equality constrained problem would also be a solution to the linear inequality
constrained problem.
Assume that q constraints are active at x opt , let ÂT denote the matrix whose rows
correspond to the active constraints at x opt , and b̂ be the corresponding right hand
side vector. Therefore, in order to solve (46), we solve the subproblem

min Q_k(x_opt + p_k) = Q_k(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k,   p_k ∈ R^n,
subject to Â^T(x_opt + p_k) = b̂,   ‖p_k‖ ≤ Δ_k,   (52)

where Â^T ∈ R^{q×n}, b̂ ∈ R^q and Â^T is of full rank. Let Î be the index set of active
constraints at x opt (the working set at x opt ). Then any step pk from a feasible point
to any other feasible point must satisfy:

ÂT p k = 0. (53)

Now, let Y and Z be n × q and n × (n − q) matrices, respectively, such that
[Y : Z] is nonsingular. In addition, let Â^T Y = I and Â^T Z = O; the columns of Z
then form a basis for the null space of Â^T. So, from Eq. (53), any feasible direction p_k

can be written as p k = Zy, where y is any vector in Rn−q . Therefore, any solution
of ÂT x = b̂ is given by pk = Y b̂ + Zy. Thus, problem (52) can be written as:

min_{y∈R^{n−q}} ψ(y) = (1/2) y^T Z^T G Z y + (g + G Y b̂)^T Z y + (1/2)(g + G Y b̂)^T Y b̂,
subject to ‖y‖ ≤ Δ_r,   (54)

where Δ_r = (Δ_k² − ‖Y b̂‖²)^{1/2}, see [7]. We observe that the constant term of (54) is
independent of y, so we can rewrite this problem in the form

min_{y∈R^{n−q}} ψ(y) = (1/2) y^T Z^T G Z y + (g + G Y b̂)^T Z y,   subject to ‖y‖ ≤ Δ_r.   (55)

Once the solution y ∗ is calculated then we substitute

p k = Y b̂ + Zy ∗ . (56)
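A sketch of the null-space construction behind (54)–(56) is given below (Python/NumPy; a dense QR factorization of Â and the names are our own choices, made only to illustrate the reduction):

```python
import numpy as np

def null_space_data(A_hat, b_hat, g, G):
    """Build Y and Z with A_hat^T Y = I and A_hat^T Z = 0 (the columns of Z
    span the null space of A_hat^T), and the reduced gradient and Hessian
    of (55).  A_hat is n x q with full column rank; the step of (56) is
    p_k = Y @ b_hat + Z @ y_star."""
    n, q = A_hat.shape
    Q, R = np.linalg.qr(A_hat, mode='complete')
    R1 = R[:q, :]                        # q x q upper-triangular factor
    Y = Q[:, :q] @ np.linalg.inv(R1.T)   # A_hat^T Y = R1^T Q1^T Q1 R1^{-T} = I
    Z = Q[:, q:]                         # orthonormal basis with A_hat^T Z = 0
    g_red = Z.T @ (g + G @ (Y @ b_hat))  # reduced gradient of (55)
    G_red = Z.T @ G @ Z                  # reduced Hessian Z^T G Z
    return Y, Z, g_red, G_red
```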

We use an active set version of the truncated conjugate gradient method to find y ∗ , see
[2]. This method produces a piecewise linear path in R^{n−q}, beginning at the centre
y_0 = 0 of the trust region {y : ‖y‖ ≤ Δ_r}. For j ≥ 1, y_j is then generated by

y_j = y_{j−1} + α s_j,   j = 1, 2, . . . , n − q,   (57)

where s_j is given by

s_j = −Z^T ∇Q_k(x_opt),   j = 1,
s_j = −Z^T ∇Q_k(x_opt + Z y_{j−1}) + β_j s_{j−1},   j ≥ 2,   (58)

and β_j = ‖Z^T ∇Q_k(x_opt + Z y_{j−1})‖² / ‖Z^T ∇Q_k(x_opt + Z y_{j−2})‖².


We now discuss the step length α in (57). For each iteration j, let α_Δ be the step
along the direction s_j to the trust-region boundary, i.e. ‖y_{j−1} + α_Δ s_j‖ = Δ_r. Let α_ψ be such
that the derivative of ψ(y_{j−1} + α_ψ s_j) with respect to α_ψ is zero, except that α_ψ is
regarded as infinity if the first derivative of ψ(y_{j−1} + α_ψ s_j) with respect to α_ψ is
negative for every α_ψ ≥ 0. Let α_c be such that

α_c = min_{i ∉ Î} { [ b_i − a_i^T(Y b̂ + Z y_j) ] / [ a_i^T(Y b̂ + Z y_j) ] : a_i^T(Y b̂ + Z y_j) < 0 }.   (59)

We now define α = min{αΔ , αψ , αc }, see [19], and y j is overwritten by y j −1 + αs j .
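The bound α_c is the usual ratio test over the inactive constraints; a hedged sketch (Python/NumPy, written for a step of length α along a direction d from the current trial point x_cur, with names of our own choosing rather than the reduced variables of (59)) is:

```python
import numpy as np

def constraint_step_bound(A, b, active_idx, x_cur, d):
    """Largest alpha with A^T (x_cur + alpha*d) >= b for every constraint not
    in the working set, i.e. the ratio test behind (59).  Returns numpy.inf
    when no inactive constraint restricts the step."""
    alpha_c = np.inf
    for i in range(A.shape[1]):
        if i in active_idx:
            continue
        slope = A[:, i] @ d
        if slope < 0.0:                   # moving towards this boundary
            alpha_c = min(alpha_c, (b[i] - A[:, i] @ x_cur) / slope)
    return alpha_c
```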


In the case α = αΔ , the trust-region boundary has been reached which completes
the iteration of the conjugate gradient method. If α = αc , the current line search is
restricted by a constraint. Its index is added to Iˆ so that the subsequent choice of
x opt + (Y b̂ + Zy j ) will remain on the boundary of the additional constraint. At this

stage Q(x_opt) − Q(x_opt + (Y b̂ + Z y_j)) is the total reduction in Q that has occurred so
far, and the product ‖∇Q(x_opt + (Y b̂ + Z y_j))‖ Δ_k is likely to be an upper bound on
any further reduction. Therefore, termination occurs if the condition

‖∇Q(x_opt + (Y b̂ + Z y_j))‖ Δ_k ≤ 0.01 [ Q(x_opt) − Q(x_opt + (Y b̂ + Z y_j)) ]   (60)

is achieved. Otherwise, the conjugate gradient procedure is restarted at the point


x opt + (Y b̂ + Zy j ) with s j = −Z T ∇Q(x opt + (Y b̂ + Zy j )) as the next search di-
rection. In the remaining case, α ≤ α_Δ, α ≤ α_c and α = α_ψ, so α is a full projected conjugate gradient step without any interference from a constraint, which gives a
strict reduction in Q_k. If this reduction is at most the right hand side of (60) or the
inequality (60) holds at the new point x opt + (Y b̂ + Zy j ) then the termination also
occurs. If the new solution x opt + (Y b̂ + Zy j ) does not satisfy (60) then we per-
form the alternative operation. The alternative is a line search from the new point
along the direction s j , which is chosen as a projected steepest descent direction
−Z^T ∇Q(x_opt + (Y b̂ + Z y_j)) augmented by a multiple of the previous search di-
rection, as given in (58). If x opt + p k minimizes f (x) with respect to the constraints
in Î then we calculate the Lagrange multipliers at x_opt + p_k. If the Lagrange multipliers are all positive then the algorithm is terminated; otherwise we remove the index of
the constraint corresponding to the least Lagrange multiplier from the working
set Î, see [10] for more details.

3.5 The solution of the model iteration subproblem

If RATIO of (47) satisfies RATIO < 0.1, then the algorithm tests whether the geom-
etry of the interpolation points needs to be improved. We set t to be an integer in
{1, 2, . . . , m} such that x t maximizes the distance

DIST = ‖x_i − x_opt‖,   i = 1, 2, . . . , m,   (61)

where x opt is the interpolation point such that f (x opt ) is the least calculated value
of f (x) so far. If DIST ≥ 2Δk then the procedure that improves the geometry of the
interpolation points is to be invoked. This procedure tests the condition
σ = αβ + τ² ≤ (1/2) τ².   (62)
If condition (62) is true, then a subroutine RESCUE is invoked (see [19]). Other-
wise the procedure replaces the current interpolation point x t by a new point x + ,
in order to improve the geometry of the interpolation points. This is done by using
the Lagrange interpolation polynomials. The Lagrange interpolation polynomial is
a quadratic polynomial ℓ_t(x), x ∈ R^n, that satisfies the Lagrange conditions (22),
and the remaining degrees of freedom are used to minimize the Frobenius norm
‖∇²ℓ_t(x)‖_F. Therefore, ℓ_t is the quadratic function

ℓ_t(x) = c + (x − x_0)^T g + (1/2) Σ_{k=1}^m λ_k [ (x − x_0)^T (x_k − x_0) ]²,   x ∈ R^n.   (63)

The parameters c, g and λ_k, k = 1, 2, . . . , m, are defined by the linear system (18),
where the right hand side is the coordinate vector e_t ∈ R^{m+n+1}. Thus, the parameters
of ℓ_t are the t-th column of the matrix H. Since, by (31), τ_t = e_t^T H w =
ℓ_t(x_opt + p_k), we expect a relatively large modulus of the denominator σ_t = α_t β_t + τ_t²
to be beneficial when the formula (31) is applied. Therefore, when the geometry of
the interpolation points needs to be improved, the point x_t is replaced by the point
x^+ = x_opt + p_k, where the direction p_k solves the following problem

max ℓ_t(x_opt + p_k),   subject to ‖p_k‖ ≤ Δ_k,   Â^T(x_opt + p_k) ≥ b̂.   (64)

We now adopt the following procedure to solve problem (64). It is reported in [18, 19]
that x_opt + p_k is usually selected from one of the m − 1 line segments in R^n through x_opt
and the other interpolation points. Let x_t be the point that will be removed in order to
improve the interpolation set; then the direction p_k is chosen as p_k = α(x_t − x_opt),
where α ∈ (0, 1) is such that |ℓ_t(x_opt + p_k)| is maximum. Clearly, |ℓ_t(x_opt + p_k)| is positive, since ℓ_t(x_t) = 1. The above choice of p_k guarantees that x^+ = x_opt + p_k is
feasible, since the line segment between any two feasible points of a convex feasible region (defined by linear constraints) is feasible. We begin with α = 0.95 and reduce it iteratively, see [10]
for more details.
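A sketch of this segment search (Python/NumPy; the callable ℓ_t, the shrink factor and the number of trial values are our own assumptions, the paper only fixes the starting value α = 0.95):

```python
import numpy as np

def model_step_on_segment(ell_t, x_opt, x_t, A, b, alpha0=0.95, shrink=0.9, n_trials=20):
    """Choose p_k = alpha*(x_t - x_opt), alpha in (0, 1), so that
    |ell_t(x_opt + p_k)| is largest among the feasible trial values tried,
    as described in Sect. 3.5.  ell_t evaluates the Lagrange function (25)."""
    best_alpha, best_val = None, -np.inf
    alpha = alpha0
    for _ in range(n_trials):
        x_trial = x_opt + alpha * (x_t - x_opt)
        if np.all(A.T @ x_trial >= b):        # feasible, by convexity of the region
            val = abs(ell_t(x_trial))
            if val > best_val:
                best_alpha, best_val = alpha, val
        alpha *= shrink
    return None if best_alpha is None else best_alpha * (x_t - x_opt)
```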

3.6 The LCOBYQA algorithm

We now summarize our algorithm, which solves unconstrained, simple bound constrained and linear inequality constrained derivative-free optimization
problems. The summary is divided into nine steps, where
each step refers to the relevant part of the material discussed in previous sections.
The step by step description of the LCOBYQA algorithm is as follows:
• Step 1:
The user of the algorithm supplies the following data: The initial point x 0 ,
the parameters ρbeg , ρend , the matrix AT , and the vector b.
• Step 2: Initialization
Set m = 2n + 1.
If x_0 is not strictly feasible, invoke the procedures PHASE ONE and MOVE to
generate a strictly feasible point, see Sect. 3.1.
Set Δ1 = ρbeg , ρ1 = ρbeg , ρ1 > ρend .
Construct the initial interpolation points, as shown in (38).
Select a point x_opt from the interpolation points such that f(x_opt) is the least
calculated value of f(x).
Select Î to be the initial working set at x opt , set ÂT to be the coefficient matrix
of the constraints in the working set, set b̂ to be the corresponding right hand side
vector.
Construct the first quadratic model Q, set k = 1, set r = 0.
• Step 3:
if ρk ≤ ρend
go to Step 9.
end(if).

Calculate the step pk that minimizes problem (52).


If ‖p_k‖ < (1/2)ρ_k
set r = 1
if three recent values of |f(x_opt) − Q(x_opt)| and ‖p_k‖ are small
go to Step 8.
else
reduce Δk to Δk+1 by using (48), set RATIO=-1, go to Step 6.
end(if)
end(if)
• Step 4:
Calculate f (x opt + pk ) and RATIO by (47).
Revise Δk by (48), subject to Δk+1 ≥ ρk .
If RATIO ≥ 0.1
Select an integer t by (50), the index of the interpolation point that will be
dropped.
else
set t = 0
end(if).
• Step 5:
If t > 0
Update the model Q such that Q interpolates f(x) at x_opt + p_k instead of x_t.
If x_opt + p_k minimizes f(x) with respect to the constraints in Î
if the Lagrange multipliers at x_opt + p_k are positive
go to Step 9.
else
delete the constraint with the least Lagrange multiplier from the working set,
update Î, ÂT , b̂.
end(if)
end(if)
If x opt + p k reaches the boundary of a constraint not in the working set
add this constraint to the working set, update Î, ÂT , b̂.
end(if)
If f (x opt + pk ) < f (x opt )
overwrite x opt by x opt + pk .
end(if)
end(if)
If RATIO≥ 0.1
set k = k + 1, go to Step 3
end(if)
• Step 6:
Select an integer t that maximizes the distance DIST = ‖x_t − x_opt‖.
If DIST≥ 2Δk
If condition (62) is true
invoke the subroutine RESCUE.
else
Replace x t by x opt + p k , where pk solves problem (64).

Update the model Q such that Q interpolates f(x) at x_opt + p_k instead


of x t .
If f (x opt + pk ) < f (x opt )
overwrite x opt by x opt + pk .
end(if)
end(if)
set k = k + 1, go to Step 3.
end(if).
• Step 7:
If max[ ‖p_k‖, Δ_k ] > ρ_k or RATIO > 0
set k = k + 1, go to Step 3.
end(if)
• Step 8:
If ρk > ρend
Reduce ρ_k by (51); reduce Δ_k by Δ_k = max[ (1/2)ρ_k, ρ_{k+1} ]
set k = k + 1, go to Step 3.
end(if).
• Step 9:
If r = 1
Calculate f (x opt + p k ).
If f (x opt + pk ) < f (x opt )
overwrite x opt by x opt + pk .
end(if)
end(if)
output the final optimal point x opt , stop.

4 Numerical results and discussion

In this section, we compare the performance of LCOBYQA with that of other avail-
able model-based derivative-free algorithms. The LCOBYQA algorithm is tested on
unconstrained, simple bound constrained and linear constrained problems. Most of
these test functions are non-convex. We use the number of function evaluations (that
each algorithm takes to solve the problem) and the final function values as the cri-
teria for comparison. The numerical results discussed in this section are carried out
on a Pentium 4, 2.0 GHz PC, and the algorithm is implemented entirely in MATLAB.
First, we compared LCOBYQA with UOBYQA [15], and COBYLA [14] using
the unconstrained test problems ARWHEAD, BDQRTIC [3] and CHROSEN [20].
The software packages are run in the same environment. The number of evaluations of f(x)
for LCOBYQA, UOBYQA [15], and COBYLA [14] is reported in Table 1. Table 1 shows that our algorithm is attractive in comparison with its competitors, namely
COBYLA [14] and UOBYQA [15], except for the results on the function BDQRTIC, where we observe that the performance of UOBYQA is superior to that of
LCOBYQA.
Next, we compare LCOBYQA with CONDOR [7] and DFO [3, 4] using the test
problems POWER, DQDRTIC and VARDIM [1]. We have run the software packages in the

Table 1 Comparative results between LCOBYQA, COBYLA and UOBYQA

NAME, n    Number of function evaluations          Final function values
           LCOBYQA   COBYLA   UOBYQA               LCOBYQA      COBYLA       UOBYQA

ARWHEAD, 10 184 280 219 8.8374e-14 1.776e-15 3.7747e-13


ARWHEAD, 15 224 522 458 1.9387e-009 1.2854e-11 1.3624e-12
ARWHEAD, 20 368 678 837 2.9088e-010 3.5527e-15 2.03925e-12
ARWHEAD, 25 457 900 1320 1.5336e-11 8.8817e-16 2.9070e-12
BDQRTIC, 10 660 1106 434 11.8654 18.9344 11.8654
BDQRTIC, 15 1273 2323 843 23.6405 30.7017 23.6405
BDQRTIC, 20 1764 3616 1541 35.4091 42.4700 35.4906
BDQRTIC, 25 2369 5619 2302 47.1774 54.2384 47.1774
CHROSEN, 10 411 4661 505 7.7337e-010 3.2103e-15 1.4485e-12
CHROSEN, 15 701 6935 1204 1.0806e-010 1.9841e-15 4.5296e-13
CHROSEN, 20 977 8912 2034 9.2520e-009 1.5176e-15 5.8361e-13
CHROSEN, 25 1216 10861 2933 6.9363e-009 2.0238e-15 1.4764e-12

Table 2 Comparative results between LCOBYQA, CONDOR and DFO

NAME, n    Number of function evaluations          Final function values
           LCOBYQA   CONDOR   DFO                  LCOBYQA      CONDOR       DFO

DQDRTIC, 10 28 201 403 2.1777e-25 2.0929e-18 1.6260e-20


POWER, 10 28 550 206 6.6831e-26 9.5433e-7 2.0582e-7
VARDIM, 10 2517 2686 2061 3.15423e-10 2.177e-25 1.626e-20

same environment. Table 2 shows the number of function evaluations and the final
function values for the POWER, DQDRTIC and VARDIM test functions. It shows
that the performance of LCOBYQA is far superior to that of the other two algorithms.
However, LCOBYQA does not behave as well as DFO on VARDIM.
LCOBYQA is also tested and compared to NEWUOA on different uncon-
strained test functions. The test functions used are ARWHEAD, CHROSEN,
PENALTY1, PENALTY2 and PENALTY3 [1]. We have run LCOBYQA and
NEWUOA in the same environment, with x_0 = 0. We set ρ_end = 10^{−6}
in each case, while ρ_beg is given the values 0.5, 0.5, 1.0, 0.1 and 0.1 for ARWHEAD, CHROSEN, PENALTY1, PENALTY2 and PENALTY3, respectively. Table 3 shows the number of function evaluations of LCOBYQA and NEWUOA for
the 5 test functions. A star in Table 3 indicates that the CPU time is very long, so
the problem is not tried. In this table, we observe that the performance of the two
algorithms is similar. For example, the results of LCOBYQA on ARWHEAD are better than those of NEWUOA, but the results of NEWUOA on CHROSEN, PENALTY2 and
PENALTY3 are better than those of LCOBYQA. The results are therefore comparable in spite
of the fact that NEWUOA was specifically designed for unconstrained problems.
LCOBYQA is also tested and compared with CONDOR [7], BOBYQA [19] and
COBYLA [14] using test problems in [11] with simple bound constraints. The results

Table 3 Comparative results between LCOBYQA, and NEWUOA

NAME, n    Number of function evaluations          Final function values
           LCOBYQA   NEWUOA                        LCOBYQA      NEWUOA

ARWHEAD, 20 321 404 1.7764e-015 3.8156e-12


ARWHEAD, 40 1107 1497 2.3208e-012 1.7674e-11
ARWHEAD, 80 2491 3287 5.2358e-013 2.6176e-11
ARWHEAD, 160 8453 8504 3.2557e-008 1.6807e-10
CHROSEN, 20 818 845 1.0806e-009 4.3624e-12
CHROSEN, 40 2042 1876 1.7063e-008 5.4617e-11
CHROSEN, 80 4852 4314 2.7119e-008 2.9747e-11
CHROSEN, 160 * 9875 * 2.3506e-10
PENALTY1, 20 7507 7476 1.7843e-004 1.5777e-4
PENALTY1, 40 16704 14370 3.4791e-004 3.3925e-4
PENALTY1, 80 27407 32390 5.2643e-004 7.1305e-4
PENALTY1, 160 * 72519 * 1.4759e-3
PENALTY2, 20 1612 2443 6.3457e+2 6.3457e+2
PENALTY2, 40 5354 2455 5.5419e+004 5.5418e+4
PENALTY2, 80 13475 5703 1.771e+008 1.7760e+8
PENALTY2, 160 * * * *
PENALTY3, 20 4337 3219 3.2579e+2 3.630e+2
PENALTY3, 40 14221 16589 1.4916e+3 1.5322e+3
PENALTY3, 80 36039 136902 1.4916e+3 6.2844e+3
PENALTY3, 160 * * * *

are presented in Table 4. The numerical results of LCOBYQA reported in this table
are encouraging. Out of 8 test cases, the results of LCOBYQA on 6 cases are superior
to those of the others.
Finally, LCOBYQA is also compared to CONDOR and COBYLA using problems in [11] with linear constraints. The results for these test functions are given in
Table 5. From this table we see that LCOBYQA performs very well compared to
CONDOR and COBYLA for most of the test functions. LCOBYQA was also tested
on other functions. Results for these problems are reported in [10].
In our numerical experiments with LCOBYQA, it has been observed that it converges faster when the RESCUE procedure of BOBYQA [19] happens to be called early,
during the first few iterations. This holds for ARWHEAD, which was solved by
LCOBYQA with up to 900 variables using a single processor, a remarkable result.

5 Conclusion and future outlook

Based on Powell’s algorithm NEWUOA, we have developed a successful new
derivative-free algorithm, named LCOBYQA, to solve linearly constrained optimization problems. The algorithm is based on quadratic interpolation. It constructs a
quadratic model of the objective function from a few data, and uses the remaining
freedom in the model to minimize the Frobenius norm of the Hessian matrix of the
change to the model.
Table 4 Comparative results between LCOBYQA, CONDOR, BOBYQA and COBYLA

NAME, n    Number of function evaluations           Final function value
           LCOBYQA  CONDOR  BOBYQA  COBYLA          LCOBYQA      CONDOR       BOBYQA       COBYLA

HS01, 2    273   142   222   36       9.9003e-10   1.5560e-13   1.6081e-14   4.9982e-10
HS02, 2    51    36    62    53       0.0504       4.9412       4.94122      5.0426e-2
HS03, 2    9     73    31    45       1.3332e-33   −7.4113e-22  2.8393e-34   1.6331e-17
HS04, 2    12    54    21    18       2.6667       2.6667       2.6666       2.6666
HS05, 2    30    34    33    77       −1.9130      −1.3415      −1.91322     1.2283
HS38, 4    174   311   546   382      2.8731e-10   7.8251e-13   2.22716e-13  7.8770e+0
HS45, 5    16    377   40    69       1.0000       1.0000       1.0000       1.5000
HS110, 10  177   201   219   failed   −45.7785     −45.7785     −45.7784     failed

Table 5 Comparative results between LCOBYQA, CONDOR, and COBYLA

NAME, n    Number of function evaluations          Final function value
           LCOBYQA   CONDOR   COBYLA               LCOBYQA      CONDOR       COBYLA

HS09, 2 20 50 53 −0.5000 −0.5000 6.0220e-1


HS21, 2 13 59 48 −99.9600 −99.99 −99.9600
HS24, 2 13 79 71 −1.0000 −0.5774 −2.6214
HS28, 3 52 128 142 5.5114e-18 6.6151e-26 7.4213e-17
HS35, 3 84 130 46 0.1117 0.1111 −9.0000
HS36, 3 20 123 66 −3.3000e+3 −3.3000e+3 4.0692e-14
HS37, 3 113 69 46 −3.4560e+3 −3456 −3.4560e+3
HS44, 4 15 23 45 −15 −15 −15
HS45, 5 16 377 69 1.0000 1.0000 1.0000
HS48, 5 166 269 57 1.3471e-14 7.7073e-23 1.7392e-12
HS50, 5 107 260 95 1.0913e-12 1.0806e-25 2.7840e-16
HS51, 5 87 257 90 2.0866e-11 1.0806e-25 1.7392e-16
HS76, 4 16 21 45 −4.6818 −4.6818 −4.6818

In Sect. 4, we tested our algorithm (LCOBYQA) on various test problems, most
of which are nonconvex problems. Tables 1–5 show that our algorithm is attractive in
comparison with its competitors. The results obtained by the algorithm prove its effi-
ciency and show that it competes favourably against other model-based algorithms.
The work reported in this paper is considered a starting point for the solution of
the large class of constrained optimization problems. For future work in this regard,
we suggest the following:
In our future research, we would like to extend our algorithm to handle quadratic,
general nonlinear and difficult constraints (general nonlinear constraints which,
like the objective, are expensive to evaluate and whose derivatives are not available). In order to handle quadratic and general constraints, we can use the augmented Lagrangian
method, an exact penalty function or a filter method. For the difficult constraints,
the idea is to use interpolation to construct models of the constraints in the same way as the
interpolation of the objective function.
The code of the algorithm is a self-contained MATLAB package; there are no calls to
external, unavailable libraries, which makes the algorithm relatively fast. To increase
the speed of the algorithm further, it would be interesting to exploit parallelism in the implementation.

Acknowledgements Parts of this work were done during the first author’s two visits to the African Institute
for Mathematical Sciences (AIMS), South Africa. He is very grateful to all AIMS staff. Special thanks
to Professors Fritz Hanhe and Barry Green, the previous and current directors of AIMS, respectively, for
their excellent hospitality and the facilities that supported the research. He would also like to thank Professor
M.J.D. Powell, an emeritus Professor at the Centre for Mathematical Sciences, Department of Applied
Mathematics and Theoretical Physics, University of Cambridge, for his encouragement and help during
his Ph.D. work.

References

1. Buckley, A.G., Jenning, L.S.: Test functions for unconstrained minimization. Technical report, CS-3,
Dalhousie University, Canada (1989)
2. Conn, A., Gould, N., Toint, Ph.: Trust Region Methods. SIAM, Philadelphia (2000)
3. Conn, A., Sheinberg, K., Toint, Ph.: An algorithm using quadratic interpolation for unconstrained
derivative-free optimization. In: Pillo, G.Di., Giannessi, F. (eds.) Nonlinear Optimization and Appli-
cation, pp. 27–47. Plenum, New York (1996)
4. Conn, A., Sheinberg, K., Toint, Ph.: A derivative free optimization (DFO) algorithm. Report 98(11)
(1998)
5. Dong, S.: Methods for Constrained Optimization. Springer, Berlin (2002)
6. Fletcher, R.: Practical Methods of Optimization, 2nd edn. Wiley, New York (1987)
7. Frank, V., Bersini, H.B.: CONDOR, a new parallel constrained extension of Powell’s UOBYQA al-
gorithm: experimental results and comparison with DFO algorithm. J. Comput. Appl. Math. 181,
157–175 (2005)
8. Gay, D.M.: In: Lecture Notes in Mathematics, vol. 1066, pp. 74–105. Springer, Berlin (1984)
9. Gill, P.E., Murray, W., Wright, M.H.: In: Inertia-Controlling Methods for General Quadratic Program-
ming, vol. 33, pp. 1–36. SIAM, Philadelphia (1991)
10. Gumma, E.: A derivative-free algorithm for linearly constrained optimization problems. Ph.D., Uni-
versity of Khartoum, Sudan (2011)
11. Hock, W., Schittkowski, K.: Test Examples for Nonlinear Programming Codes. Lecture Notes in
Economics and Mathematical Systems, vol. 187 (1981)
12. Igor, G., Nash, G., Sofer, A.: In: Linear and Nonlinear Programming. SIAM, Philadelphia (2009)
13. Ju, Z., Wang, C., Jiguo, J.: Combining trust region and line search algorithm for equality constrained
optimization. Appl. Math. Comput. 14(1–2), 123–136 (2004)
14. Powell, M.J.D.: A direct search optimization method that models the objective and constraint
functions by linear interpolation. In: Gomez, S., Hennart, J.P. (eds.) Advances in Optimization and
Numerical Analysis, Proceedings of the Sixth Workshop on Optimization and Numerical Analysis,
Oaxaca, Mexico, vol. 275, pp. 15–67 (1994)
15. Powell, M.J.D.: UOBYQA: unconstrained optimization by quadratic approximation. Math. Program.
92, 555–582 (2002)
16. Powell, M.J.D.: Least Frobenius norm updating for quadratic models that satisfy interpolation condi-
tions. Math. Program. 100, 183–215 (2004)
17. Powell, M.J.D.: The NEWUOA software for unconstrained optimization without derivatives. In: Di
Pillo, G., Roma, M. (eds.) Large-Scale Nonlinear Optimization. Springer, New York (2006)
18. Powell, M.J.D.: Development of NEWUOA for minimization without derivatives. IMA J. Numer.
Anal. 28, 649–664 (2008)
19. Powell, M.J.D.: The BOBYQA algorithm for bound constrained optimization without derivatives.
Technical report, 2009/NA06, CMS University of Cambridge (2009)
20. Toint, Ph.: Some numerical results using a sparse matrix updating formula in unconstrained optimiza-
tion. Math. Comput. 32, 839–859 (1987)
21. Winfield, D.: Function and Functional Optimization by Interpolation in Data Table. Ph.D., Harvard
University, Cambridge, USA (1969)
22. Winfield, D.: Function minimization by interpolation in data table. J. Inst. Math. Appl. 12, 339–347
(1973)
