A Derivative-Free Algorithm For Linearly Constrained Optimization Problems - Gumma, Hashim, Ali (2013)
ISSN 0926-6003, Volume 57, Number 3
Comput Optim Appl (2014) 57:599–621
DOI 10.1007/s10589-013-9607-y
E.A.E. Gumma
Department of Mathematics, Faculty of Pure and Applied Sciences, International University of Africa, P.O. Box 2469, Khartoum, Sudan
e-mail: elzain@aims.ac.za; elzain.elzain@yahoo.com

M.H.A. Hashim
Department of Applied Mathematics, Faculty of Mathematical Sciences, University of Khartoum, P.O. Box 321, Khartoum, Sudan
e-mail: mhashim@uofk.edu; mohsinhashim@yahoo.com

M.M. Ali (corresponding author)
School of Computational and Applied Mathematics, University of the Witwatersrand, Johannesburg, South Africa
e-mail: Montaz.Ali@wits.ac.za
1 Introduction

In this paper we consider the linearly constrained optimization problem

    min f(x), x ∈ R^n, (1)

subject to the linear constraints

    A^T x ≥ b, (2)

where f(x) is a smooth nonlinear real-valued function whose gradient and Hessian are unavailable. The algorithm we are going to design can also solve unconstrained and simple bound constrained problems. The simple bounds

    l ≤ x ≤ u, l, u ∈ R^n, (3)

can be written as

    I x ≥ l, −I x ≥ −u, (4)

so we can set A = [I, −I]^T, b = [l, −u]^T, where I is the n × n identity matrix. In Eq. (3), if we let the components of l be very large negative numbers and the components of u be very large numbers, then we recover an unconstrained optimization problem. Thus, our algorithm can also solve unconstrained and simple bound constrained problems.
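The rewriting of the bounds (3) as the linear constraints (4) can be sketched in a few lines of NumPy; the helper name and the test data below are illustrative, not part of the paper:

```python
import numpy as np

def bounds_to_linear(l, u):
    """Rewrite simple bounds l <= x <= u as A^T x >= b with A = [I, -I], b = [l; -u].

    This mirrors Eq. (4): I x >= l and -I x >= -u.
    """
    n = len(l)
    I = np.eye(n)
    A = np.hstack([I, -I])         # A is n x 2n, so A^T x >= b has 2n rows
    b = np.concatenate([l, -u])
    return A, b

# A point satisfies A^T x >= b exactly when it satisfies both bounds.
l = np.array([-1.0, 0.0]); u = np.array([2.0, 3.0])
A, b = bounds_to_linear(l, u)
x = np.array([0.5, 1.0])
print(np.all(A.T @ x >= b))   # True: l <= x <= u holds
```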
Derivative-free methods can be classified into two classes: direct search methods and model-based methods. The algorithm presented in this paper belongs to the latter class. In the literature, several model-based derivative-free algorithms have been proposed. The first attempt at employing available objective function values f(x) to build a quadratic model by interpolation was made by Winfield [21, 22].
In 1994, Powell [14] proposed the COBYLA algorithm for constrained optimiza-
tion without derivatives. In this proposal, the objective function and the constraints
are approximated by linear multivariate models. Powell [15] also extended the idea
of Winfield further by developing his UOBYQA algorithm for unconstrained opti-
mization without derivatives by using quadratic interpolation models of the objective
function. A variant model-based algorithm (DFO) using quadratic Newton fundamen-
tal polynomial was proposed by Conn, Sheinberg and Toint [3, 4]. In 2005, Frank
[7] proposed a variant of UOBYQA algorithm [15], named CONDOR. CONDOR
A full quadratic model Q of the objective function is required to satisfy the interpolation conditions

    Q(x_i) = f(x_i), i = 1, 2, ..., s, (5)

where s = (1/2)(n + 1)(n + 2). In this case, the interpolation conditions form a linear system of equations in the coefficients of the model. If we choose the interpolation points so that this linear system is nonsingular, then the model Q is defined uniquely. On the other hand, the use of a full quadratic model limits the size of the problems that can be solved in practice. One of the methods that overcomes this drawback was proposed by Powell [16]. Powell's method constructs a quadratic model that satisfies the interpolation conditions, while leaving some freedom in the model. The remaining freedom is taken up by minimizing the Frobenius norm of the second derivative matrix of the change to the model. This variational problem is expressed as the solution of an (m + n + 1) × (m + n + 1) system of linear equations, where m is the number of interpolation points and satisfies n + 2 ≤ m ≤ (1/2)(n + 1)(n + 2).
The algorithm presented here (LCOBYQA) is based on the above principle of Powell and is thus an extension of Powell's algorithms NEWUOA and BOBYQA [17, 19]. The name LCOBYQA is an acronym for Linearly Constrained Optimization BY Quadratic Approximation. LCOBYQA is an iterative algorithm. A typical iteration generates a new vector of variables either by minimizing the quadratic model in a trust region subject to the linear constraints (the trust-region subproblem), or by a procedure that improves the geometry of the interpolation points (the model iteration subproblem).
The rest of the paper is organized as follows. Since our algorithm is an extension of
Powell’s work, the latter is discussed in detail in Sect. 2. This makes it easy to describe
our algorithm in Sect. 3. Several numerical results are presented and discussed in
Sect. 4. The paper ends in Sect. 5 with a conclusion and an outlook to possible future
research.
Quadratic approximations of the objective function are highly useful for obtaining a
fast rate of convergence in derivative-free algorithms, because usually some attention
has been given to the curvature of the objective function. On the other hand, each full quadratic model has (1/2)(n + 1)(n + 2) independent parameters, and this limits the size of the problems that can be solved in practice. Therefore, Powell [17, 19] investigated the idea of constructing a suitable quadratic model from m interpolation points, where m is much less than (1/2)(n + 1)(n + 2). If m = 2n + 1, then there are enough data to define a quadratic model with a diagonal second derivative matrix, which is done before the first iteration. Specifically, on each iteration the methods of NEWUOA and BOBYQA construct a quadratic model Q(x), x ∈ R^n, of the objective function f(x) that is required to satisfy the interpolation conditions
Q(x i ) = f (x i ), i = 1, 2, . . . , m, (6)
where m is prescribed by the user, the value m = 2n + 1 being typical, and where
the positions of the different points x i , i = 1, 2, . . . , m, are generated automatically.
These conditions leave much freedom in Q, taken up when the model is updated by
minimizing the Frobenius norm of the change to the second derivative matrix of Q.
The success of the methods of NEWUOA and BOBYQA is due to a well-known technique suggested by the symmetric Broyden method for updating ∇²Q when first derivatives of f(x) are available (see p. 73 of [6]). Let an old model Q_old be present, and let the new model Q_new be required to satisfy conditions (6), which leave some freedom in the parameters of Q_new. The technique takes up this freedom by minimizing the Frobenius norm of ∇²Q_new − ∇²Q_old. One reason for trying the symmetric Broyden method is that the calculation of Q_new from Q_old requires only O(n²) operations in the case m = 2n + 1, whereas O(n⁴) operations are needed if Q_new is defined completely by conditions (6) with m = (1/2)(n + 1)(n + 2), see [19]. The second reason is that if the objective function f(x) is quadratic, then the symmetric Broyden method has the property
    ‖∇²Q_new − ∇²f‖²_F = ‖∇²Q_old − ∇²f‖²_F − ‖∇²Q_new − ∇²Q_old‖²_F. (7)
Let the current quadratic model Q_old be present and have the form

    Q_old(x) = c_old + (x − x_0)^T g_old + (1/2)(x − x_0)^T G_old (x − x_0), x ∈ R^n, (8)
where x_0 is a fixed point. Let x_opt be the point that satisfies

    f(x_opt) = min{f(x_i) : i = 1, 2, ..., m}, (9)

where x_i, i = 1, 2, ..., m, are the interpolation points of Q_old. On each iteration of NEWUOA, the current iteration generates a new point x⁺. The position of x_opt is central to the choice of x⁺ in trust region methods. Indeed, x⁺ is calculated to be a sufficiently accurate estimate of the vector x ∈ R^n that solves the subproblem

    min Q_old(x), subject to ‖x − x_opt‖ ≤ ρ. (10)
Let

    Q_new(x) = c⁺ + (x − x_0)^T g⁺ + (1/2)(x − x_0)^T G⁺ (x − x_0), x ∈ R^n, (11)

be the new quadratic model that satisfies the conditions (6) at the new interpolation points, leaving some freedom in the parameters of Q_new. NEWUOA takes up this freedom by minimizing the Frobenius norm of ∇²D(x), where
    D(x) = Q_new(x) − Q_old(x) = c + (x − x_0)^T g + (1/2)(x − x_0)^T G (x − x_0), x ∈ R^n, (12)
subject to the conditions (6). This problem can be written as:
    min (1/4)‖G⁺ − G_old‖²_F = (1/4) Σ_{i=1}^{n} Σ_{j=1}^{n} G_ij², (13)
subject to
    c + (x⁺_i − x_0)^T g + (1/2)(x⁺_i − x_0)^T G (x⁺_i − x_0) = [f(x⁺_i) − Q_old(x⁺_i)] δ_it, i = 1, 2, ..., m. (14)
The Lagrangian function of this variational problem has the form

    L(c, g, G) = (1/4) Σ_{i=1}^{n} Σ_{j=1}^{n} G_ij² − Σ_{k=1}^{m} λ_k [c + (x⁺_k − x_0)^T g + (1/2)(x⁺_k − x_0)^T G (x⁺_k − x_0)], (15)

where λ_k, k = 1, 2, ..., m, are the Lagrange multipliers. The first-order conditions with respect to c and g provide the constraints
    Σ_{k=1}^{m} λ_k = 0, Σ_{k=1}^{m} λ_k (x⁺_k − x_0) = 0, (16)

and

    G = Σ_{k=1}^{m} λ_k (x⁺_k − x_0)(x⁺_k − x_0)^T. (17)
The second part of Eq. (16) and the conditions (14) give rise to the square variational linear system

    W (λ, c, g)^T = [ Λ  X^T ; X  O ] (λ, c, g)^T = (r, 0)^T, (18)

where W is an (m + n + 1) × (m + n + 1) matrix, O denotes the (n + 1) × (n + 1) zero matrix, Λ has the elements

    Λ_ij = (1/2)[(x⁺_i − x_0)^T (x⁺_j − x_0)]², i, j = 1, 2, ..., m, (19)

and X is the (n + 1) × m matrix

    X = [ 1  1  ...  1 ; x⁺_1 − x_0  x⁺_2 − x_0  ...  x⁺_m − x_0 ]. (20)
The Lagrange polynomials ℓ_j(x), x ∈ R^n, j = 1, 2, ..., m, of the interpolation problem are quadratic polynomials that satisfy the conditions

    ℓ_j(x_i) = δ_ij, 1 ≤ i, j ≤ m, (22)

where δ_ij is the Kronecker delta. In order for these polynomials to be applicable to the variational system (18), the conditions A1 and A2 are retained on the positions of the interpolation points, and for each j = 1, 2, ..., m, the remaining freedom in ℓ_j(x) is taken up by minimizing the Frobenius norm ‖∇²ℓ_j‖_F subject to the constraints in (22). Therefore, if the right-hand side of the system (18) is replaced by the j-th
coordinate vector in R^{m+n+1}, then the parameters of ℓ_j are defined by this system. Thus, if Q is the quadratic polynomial

    Q(x) = Σ_{j=1}^{m} f(x_j) ℓ_j(x), x ∈ R^n, (23)

then its parameters satisfy Eq. (18). It follows from the nonsingularity of the system (18) that expression (23) is the Lagrange form of the solution of the variational problem (18). Let

    H = [ Ω  Ξ^T ; Ξ  Υ ] = W^{-1}

be the inverse of the matrix W of the system (18). The definition of ℓ_j, where j is any integer in {1, 2, ..., m}, implies that the j-th column of H provides the parameters of ℓ_j. In particular, because of Eq. (17), ℓ_j has the second derivative matrix

    G_j = ∇²ℓ_j = Σ_{k=1}^{m} H_kj (x_k − x_0)(x_k − x_0)^T, j = 1, 2, ..., m, (24)

so ℓ_j is the quadratic polynomial

    ℓ_j(x) = c_j + (x − x_0)^T g_j + (1/2)(x − x_0)^T G_j (x − x_0), x ∈ R^n, (25)

where c_j is equal to H_{m+1,j} and g_j is the vector with components H_ij, i = m + 2, m + 3, ..., m + n + 1. Because the parameters of ℓ_j(x) depend on H, the elements of the matrix H are required to be available.
To discuss the relation between the polynomials ℓ_j(x), j = 1, 2, ..., m, and the nonsingularity of the system (18), let x⁺ be the new vector of variables that will replace one of the interpolation points x_i, i = 1, 2, ..., m. When x⁺ replaces x_t, t ∈ {1, 2, ..., m}, x_t is dismissed, so the new interpolation points are the vectors

    x⁺_t = x⁺, x⁺_i = x_i, i ∈ {1, 2, ..., m}\{t}. (26)

One advantage of the Lagrange polynomials is that they provide a convenient way of maintaining the conditions A1 and A2 of Sect. 2.1, see [16]. These conditions are inherited by the new interpolation points if t is chosen so that ℓ_t(x⁺) is nonzero. It can be seen that at least one of the values ℓ_j(x⁺), j = 1, 2, ..., m, is nonzero, because interpolation to a constant function yields

    Σ_{j=1}^{m} ℓ_j(x) = 1, x ∈ R^n. (27)
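The construction above can be checked numerically: assembling W from Eqs. (18)-(20) for a standard set of 2n + 1 starting points and inverting it yields the Lagrange polynomial parameters of Eqs. (24)-(25). The following sketch (the helper name and the small test data are our own, not from the paper) verifies the conditions (22) and the identity (27):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2
x0 = np.zeros(n)
rho = 1.0
# NEWUOA-style interpolation set: x0, x0 + rho*e_i, x0 - rho*e_i (m = 2n+1 points)
pts = [x0] + [x0 + rho * np.eye(n)[i] for i in range(n)] \
           + [x0 - rho * np.eye(n)[i] for i in range(n)]
pts = np.array(pts)
m = len(pts)

# Assemble W = [[Lambda, X^T], [X, O]] of Eqs. (18)-(20)
Y = pts - x0                                   # rows are x_i - x0
Lam = 0.5 * (Y @ Y.T) ** 2                     # Lambda_ij = (1/2)((x_i-x0)^T(x_j-x0))^2
X = np.vstack([np.ones(m), Y.T])               # (n+1) x m
W = np.block([[Lam, X.T], [X, np.zeros((n + 1, n + 1))]])
H = np.linalg.inv(W)

def lagrange_poly(j, x):
    """Evaluate the j-th Lagrange polynomial from column j of H (Eqs. (24)-(25))."""
    lam = H[:m, j]                 # multipliers giving the second derivative matrix
    c = H[m, j]
    g = H[m + 1:, j]
    d = x - x0
    quad = 0.5 * sum(lam[k] * (d @ Y[k]) ** 2 for k in range(m))
    return c + d @ g + quad

# Lagrange conditions (22): l_j(x_i) = delta_ij
L = np.array([[lagrange_poly(j, pts[i]) for j in range(m)] for i in range(m)])
print(np.allclose(L, np.eye(m)))               # True

# Interpolating the constant function gives sum_j l_j(x) = 1 (Eq. (27))
x = rng.standard_normal(n)
print(np.isclose(sum(lagrange_poly(j, x) for j in range(m)), 1.0))  # True
```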
Another advantage of the Lagrange polynomials is that they can be used to improve the accuracy of the quadratic model. For this purpose, Powell has chosen an alternative to solving the trust region subproblem (10). In this case, when an interpolation point is replaced, the matrix H is updated: the new inverse matrix H⁺ is given by
    H⁺ = H + (1/σ_t) [ α_t (e_t − Hw)(e_t − Hw)^T − β_t H e_t e_t^T H + τ_t (H e_t (e_t − Hw)^T + (e_t − Hw) e_t^T H) ], (29)
where e_t is the t-th coordinate vector in R^{m+n+1}, and w ∈ R^{m+n+1} is the vector that has the components

    w_i = (1/2)[(x_i − x_0)^T (x⁺ − x_0)]², i = 1, 2, ..., m,
    w_{m+1} = 1, and w_{m+i+1} = (x⁺ − x_0)_i, i = 1, 2, ..., n, (30)
also

    α_t = e_t^T H e_t, τ_t = e_t^T H w, β_t = (1/2)‖x⁺ − x_0‖⁴ − w^T H w, σ_t = α_t β_t + τ_t², (31)

where

    η_t = (1/2)‖x⁺ − x_0‖⁴ − e_t^T w. (32)
Once H + is constructed, the submatrices Ξ and Υ of H are overwritten by Ξ +
and Υ + respectively, and the factorization of Ω + is stored instead of Ω. The pur-
pose of the factorization is to reduce the damage from rounding errors to the identity
W = H^{-1}, which is fulfilled at the beginning of each iteration, see [17]. Let the factorization Ω = V V^T be given; the new factorization of Ω⁺ can be constructed by changing only one column of V, the first column say, see [19]. Specifically, the new first column has the components

    V⁺_i1 = σ_t^{-1/2} [ τ_t V_i1 + (e_t − e_opt − H(w − v))_i V_t1 ], i = 1, 2, ..., m. (33)
Here opt in eopt is the index of the best point x opt . The quantities σt , τt are defined by
Eq. (31) and the vector w is defined by Eq. (30). For full details of these calculations,
see [17, 19].
In this section, we describe our algorithm, LCOBYQA, for solving problem (1) subject to the linear constraints (2). Both NEWUOA and LCOBYQA are based on the main idea of Powell, but our algorithm differs from NEWUOA in three main procedures, namely:
• the initial calculations procedure,
• the trust-region subproblem, and
• the geometry improvement subproblem (the model iteration subproblem).
The LCOBYQA algorithm requires the initial point x_0, the coefficient matrix A^T of the linear constraints, the right-hand side vector b of the constraints, the parameters ρ_beg, ρ_end, ρ_1 and Δ_1, which satisfy ρ_1 = Δ_1 = ρ_beg and ρ_beg > ρ_end, and an integer m = 2n + 1, where n is the number of variables. In order to construct the initial interpolation points, we need the point x_0 to be strictly feasible. If x_0 is not strictly feasible, the algorithm calls the phase one procedure of linear programming to provide a vertex point (see [12]), and then uses the following steps to construct a strictly feasible point.

Let x̄ be the vertex point generated by the phase one procedure. Suppose that {1, 2, ..., l} is the index set of the active constraints at x̄, i.e.,

    a_i^T x̄ = b_i, i = 1, 2, ..., l. (35)

The algorithm then chooses a direction d that satisfies

    a_i^T d > 0, i = 1, 2, ..., l. (36)
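As a rough illustration of the move away from a vertex, the sketch below picks a direction d with a_i^T d = 1 > 0 for every active constraint, as in (36), and backtracks along it until strict feasibility holds. The function name, the least-squares choice of d and the data are our own illustrative assumptions, not the paper's exact MOVE procedure:

```python
import numpy as np

def move_to_interior(x_bar, A, b, t=1e-3):
    """Given a vertex x_bar of {x : A^T x >= b}, return a nearby strictly feasible point.

    Sketch only: take a direction d with a_i^T d = 1 > 0 for each active
    constraint, obtained from a least-squares solve, and step a small
    distance t along it, shrinking t until all constraints hold strictly.
    """
    r = A.T @ x_bar - b
    act = np.isclose(r, 0.0)                  # active set of Eq. (35)
    A_hat = A[:, act].T                       # rows are the active a_i^T
    d, *_ = np.linalg.lstsq(A_hat, np.ones(A_hat.shape[0]), rcond=None)
    x = x_bar + t * d
    while not np.all(A.T @ x - b > 0):        # back off until strictly feasible
        t *= 0.5
        x = x_bar + t * d
    return x

# Hypothetical data: the positive quadrant, vertex at the origin.
A = np.eye(2)                                  # constraints x_1 >= 0, x_2 >= 0
b = np.zeros(2)
x = move_to_interior(np.zeros(2), A, b)
print(np.all(A.T @ x > b))                     # True: strictly feasible
```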
Q1 (x i ) = f (x i ), i = 2, . . . , m, (40)
Further, for integer i = 2, 3, . . . , n + 1, the i-th row of the initial Ξ also has just two
nonzero elements
This completes the definition of Ξ for the initial interpolation points. Moreover, the initial (n + 1) × (n + 1) matrix Υ is identically zero. As mentioned in the last paragraph of Sect. 2.3, the factorization of Ω, which guarantees that the rank of Ω is at most m − n − 1, is stored in the form

    Ω = Σ_{k=1}^{m−n−1} v_k v_k^T = V V^T, (43)

where the components of the initial vector v_k ∈ R^m, which is the k-th column of V, are given the values

    V_{1k} = −√2 ρ_beg^{−2}, V_{k+1,k} = (1/2)√2 ρ_beg^{−2}, V_{k+n+1,k} = (1/2)√2 ρ_beg^{−2},
    V_{jk} = 0 otherwise, where 1 ≤ k ≤ n, j = 1, 2, ..., m. (44)
We see that each of these columns has just three nonzero elements.
Let x_opt be an interpolation point such that f(x_opt) is the least calculated value of f so far. Each trust-region iteration solves a quadratic subproblem at x_opt subject to linear inequality constraints, using a version of the active set method for indefinite quadratic programming problems. However, this method requires the initial reduced Hessian matrix at x_opt to be positive definite. Therefore, if the reduced Hessian at x_opt is not positive definite, then artificial constraints are added by the algorithm to the initial working set (the set of active constraints at x_opt). These constraints involve artificial variables y_i, and are of the form y_i ≥ (x_opt)_i or y_i ≤ (x_opt)_i. The purpose of the artificial constraints is to convert the reduced Hessian matrix at x_opt into a positive definite matrix [9]. As the iterations of the algorithm proceed, the artificial constraints are removed automatically.
Let x_opt be an interpolation point such that f(x_opt) is the least calculated value of f(x) so far. Assume that q constraints are active at x_opt; let Â^T denote the matrix whose rows correspond to the active constraints at x_opt, b̂ be the corresponding right-hand side vector, and Î be the working set at x_opt, i.e. Î is the index set of the active constraints. The quadratic model at the k-th iteration is defined by

    Q_k(x_opt + p_k) = f(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k, p_k ∈ R^n, (45)
its parameters being the vector g k ∈ Rn and the n × n symmetric matrix Gk . The
model Qk has to satisfy the interpolation conditions (6). Once the quadratic model
is constructed, LCOBYQA is directed to one of the two iterations, namely the ‘trust-
region’ iteration or the ‘model’ iteration. In each ‘trust-region’ iteration, a step p k
from x opt , is defined as the vector that solves:
    min Q_k(x_opt + p_k) = Q_k(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k, p_k ∈ R^n,
    subject to A^T (x_opt + p_k) ≥ b, ‖p_k‖ ≤ Δ_k. (46)
The solution of problem (46) is presented in detail in Sect. 3.4. If it occurs that ‖p_k‖ < (1/2)ρ_k, then x_opt + p_k is considered to be sufficiently close to x_opt, and the algorithm switches to the 'model' iteration. Otherwise, the new function value f(x_opt + p_k) is calculated; ρ_{k+1} is set to ρ_k; and the ratio RATIO is calculated by the formula

    RATIO = [f(x_opt) − f(x_opt + p_k)] / [Q_k(x_opt) − Q_k(x_opt + p_k)]. (47)
A new trust-region radius Δ_{k+1} is chosen by the formula (see [19]):

    Δ_{k+1} = min[(1/2)Δ_k, ‖p_k‖],  if RATIO ≤ 0.1,
            = max[(1/2)Δ_k, ‖p_k‖],  if 0.1 < RATIO ≤ 0.7, (48)
            = max[(1/2)Δ_k, 2‖p_k‖], if RATIO > 0.7.
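The radius rule (48) translates directly into code; a minimal sketch (the function name is ours):

```python
def update_radius(delta_k, step_norm, ratio):
    """Trust-region radius rule of Eq. (48)."""
    if ratio <= 0.1:
        return min(0.5 * delta_k, step_norm)
    if ratio <= 0.7:
        return max(0.5 * delta_k, step_norm)
    return max(0.5 * delta_k, 2.0 * step_norm)

# Poor agreement between f and the model shrinks the region;
# good agreement allows it to grow up to twice the step length.
print(update_radius(1.0, 0.8, 0.05))  # 0.5
print(update_radius(1.0, 0.8, 0.9))   # 1.6
```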
If RATIO > 0.1, then we select the integer t, the index of the interpolation point that will be removed. The selection of t ∈ {1, 2, ..., m} provides a relatively large denominator |σ_t| = |α_t β_t + τ_t²|. Specifically, t is set to the integer in the set {1, 2, ..., m}\{opt} that maximizes the weighted denominator

    max{1, ‖x_t − x_opt‖²/Δ_k²} · | H_tt [(1/2)‖x⁺ − x_0‖⁴ − w^T H w] + (e_t^T H w)² |. (50)
If the RATIO calculated by (47) satisfies RATIO < 0.1, then the algorithm sets t to be an integer in {1, 2, ..., m} such that x_t maximizes the distance DIST = ‖x_i − x_opt‖ over all i. If DIST ≥ 2Δ_k, then the geometry of the interpolation points needs to be improved; otherwise the trust region is shrunk. If the geometry of the interpolation points needs to be improved, then the algorithm invokes the 'model' iteration. The 'model' iteration tries to improve the geometry of the interpolation points by choosing a step p_k which solves problem (28), subject to A^T (x_opt + p_k) ≥ b, ‖p_k‖ ≤ Δ_k.
    min Q_k(x_opt + p_k) = Q_k(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k, p_k ∈ R^n,
    subject to A^T (x_opt + p_k) ≥ b, ‖p_k‖ ≤ Δ_k,
where the parameters g_k and G_k are given data. We will use a null space active set version of the truncated conjugate gradient procedure (for indefinite quadratic programming problems) to solve the above subproblem; for more details see [5, 8–10, 13]. The choice of the active set method in this work is motivated by the fact that if the correct working set at the solution were known a priori, then the solution of the linear equality constrained problem would also be a solution of the linear inequality constrained problem.
Assume that q constraints are active at x_opt; let Â^T denote the matrix whose rows correspond to the active constraints at x_opt, and b̂ be the corresponding right-hand side vector. Therefore, in order to solve (46), we solve the subproblem

    min Q_k(x_opt + p_k) = Q_k(x_opt) + g_k^T p_k + (1/2) p_k^T G_k p_k, p_k ∈ R^n,
    subject to Â^T (x_opt + p_k) = b̂, ‖p_k‖ ≤ Δ_k, (52)

where Â^T ∈ R^{q×n}, b̂ ∈ R^q and Â^T is of full rank. Let Î be the index set of active constraints at x_opt (the working set at x_opt). Then any step p_k from a feasible point to any other feasible point must satisfy

    Â^T p_k = 0. (53)
Any vector satisfying (53) can be written as p_k = Zy, where y is any vector in R^{n−q} and the columns of Z form a basis of the null space of Â^T. Therefore, any solution of Â^T x = b̂ is given by p_k = Y b̂ + Zy, where Y is a matrix such that Â^T Y b̂ = b̂. Thus, problem (52) can be written as

    min_{y ∈ R^{n−q}} ψ(y) = (1/2) y^T Z^T G Z y + (g + G Y b̂)^T Z y + (g + (1/2) G Y b̂)^T Y b̂,
    subject to ‖y‖ ≤ Δ_r, (54)

where Δ_r = (Δ_k² − ‖Y b̂‖²)^{1/2}, see [7]. We observe that the constant term of (54) is independent of y, so we can rewrite this problem in the form

    min_{y ∈ R^{n−q}} ψ(y) = (1/2) y^T Z^T G Z y + (g + G Y b̂)^T Z y, subject to ‖y‖ ≤ Δ_r. (55)

If y* solves (55), then the step is

    p_k = Y b̂ + Z y*. (56)
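The null-space machinery around Eqs. (52)-(56) can be realized with a QR factorization of Â. The matrices Y and Z below are one standard choice, shown for illustration; they are not necessarily the ones used inside LCOBYQA:

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 5, 2
A_hat_T = rng.standard_normal((q, n))      # active-constraint matrix \hat{A}^T, full rank
b_hat = rng.standard_normal(q)

# Full QR of \hat{A}: the first q columns of Q span range(\hat{A}),
# the last n-q columns span the null space of \hat{A}^T.
Q, R = np.linalg.qr(A_hat_T.T, mode='complete')
Z = Q[:, q:]                               # basis of {p : \hat{A}^T p = 0}, Eq. (53)
Yb = Q[:, :q] @ np.linalg.solve(R[:q, :].T, b_hat)   # particular solution of \hat{A}^T x = \hat{b}

# Every x = Yb + Z y satisfies the equality constraints, for any y in R^{n-q}
y = rng.standard_normal(n - q)
x = Yb + Z @ y
print(np.allclose(A_hat_T @ x, b_hat))     # True
print(np.allclose(A_hat_T @ Z, 0))         # True
```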
We use an active set version of the truncated conjugate gradient method to find y*, see [2]. This method produces a piecewise linear path in R^{n−q}, beginning at the centre y_0 = 0 of the trust region {y : ‖y‖ ≤ Δ_r}. For j ≥ 1, y_j is generated by

    y_j = y_{j−1} + α_j s_j, j = 1, 2, ..., n − q, (57)

where s_j is given by

    s_j = −Z^T ∇Q_k(x_opt), j = 1,
    s_j = −Z^T ∇Q_k(x_opt + Z y_{j−1}) + β_j s_{j−1}, j ≥ 2, (58)
At each stage Q(x_opt) − Q(x_opt + (Y b̂ + Z y_j)) is the total reduction in Q that has occurred so far, and the product ‖∇Q(x_opt + (Y b̂ + Z y_j))‖ Δ_k is likely to be an upper bound on any further reduction. Therefore, termination occurs if the condition

    ‖∇Q(x_opt + (Y b̂ + Z y_j))‖ Δ_k ≤ 0.01 [Q(x_opt) − Q(x_opt + (Y b̂ + Z y_j))] (60)

holds.
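A textbook Steihaug-Toint truncated conjugate gradient routine [2] for the reduced problem (55) might look as follows. This is a sketch of the general technique with our own names, not the exact LCOBYQA implementation (which also handles the active set and the termination test (60)):

```python
import numpy as np

def truncated_cg(B, c, delta, tol=1e-10, max_iter=None):
    """Steihaug-Toint truncated CG for min 0.5 y^T B y + c^T y, ||y|| <= delta.

    Starts from y = 0 and follows conjugate directions, stopping at the
    trust-region boundary or at negative curvature (see [2]).
    """
    y = np.zeros_like(c)
    r = c.copy()                    # gradient of the model at y
    s = -r                          # first search direction, as in Eq. (58)
    max_iter = max_iter or len(c)
    for _ in range(max_iter):
        Bs = B @ s
        curv = s @ Bs
        if curv <= 0:               # negative curvature: go to the boundary
            return y + _boundary_step(y, s, delta) * s
        alpha = (r @ r) / curv
        if np.linalg.norm(y + alpha * s) >= delta:
            return y + _boundary_step(y, s, delta) * s
        y = y + alpha * s
        r_new = r + alpha * Bs      # updated gradient B y + c
        if np.linalg.norm(r_new) < tol:
            break
        beta = (r_new @ r_new) / (r @ r)
        s = -r_new + beta * s       # Eq. (58), j >= 2
        r = r_new
    return y

def _boundary_step(y, s, delta):
    # largest t >= 0 with ||y + t s|| = delta
    a, b, c0 = s @ s, 2 * (y @ s), y @ y - delta**2
    return (-b + np.sqrt(b * b - 4 * a * c0)) / (2 * a)

# Strictly convex example: the unconstrained minimizer lies inside the
# region, so truncated CG should recover it.
B = np.array([[4.0, 1.0], [1.0, 3.0]])
c = np.array([-1.0, -2.0])
y = truncated_cg(B, c, delta=10.0)
print(np.allclose(B @ y, -c))       # True: B y + c = 0
```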
If RATIO of (47) satisfies RATIO < 0.1, then the algorithm tests whether the geometry of the interpolation points needs to be improved. We set t to be an integer in {1, 2, ..., m} such that x_t maximizes the distance

    DIST = ‖x_i − x_opt‖, i = 1, 2, ..., m, (61)

where x_opt is the interpolation point such that f(x_opt) is the least calculated value of f(x) so far. If DIST ≥ 2Δ_k, then the procedure that improves the geometry of the interpolation points is invoked. This procedure tests the condition

    σ = αβ + τ² ≤ (1/2)τ². (62)
If condition (62) is true, then a subroutine RESCUE is invoked (see [19]). Otherwise the procedure replaces the current interpolation point x_t by a new point x⁺, in order to improve the geometry of the interpolation points. This is done by using the Lagrange interpolation polynomials. The Lagrange interpolation polynomial is a quadratic polynomial ℓ_t(x), x ∈ R^n, that satisfies the Lagrange conditions (22), and the remaining degrees of freedom are used to minimize the Frobenius norm ‖∇²ℓ_t(x)‖_F. Therefore, ℓ_t is the quadratic function

    ℓ_t(x) = c + (x − x_0)^T g + (1/2) Σ_{k=1}^{m} λ_k [(x − x_0)^T (x_k − x_0)]², x ∈ R^n. (63)
We now adopt the following procedure to solve problem (64). It is reported in [18, 19] that x_opt + p_k is usually selected from one of the m − 1 line segments in R^n through x_opt and the other interpolation points. Let x_t be the point that will be removed in order to improve the interpolation set; then the direction p_k is chosen as p_k = α(x_t − x_opt), where α ∈ (0, 1) is such that |ℓ_t(x_opt + p_k)| is maximum. Clearly, |ℓ_t(x_opt + p_k)| can be made positive, since ℓ_t(x_t) = 1. The above choice of p_k guarantees that x⁺ = x_opt + p_k is feasible, since the line segment between any two feasible points of a convex set (linear constraints) is feasible. We begin with α = 0.95 and reduce it iteratively, see [10] for more details.
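The line-segment search for the 'model' step can be sketched as follows; the shrink factor, the toy Lagrange function and the constraint data are our own illustrative assumptions:

```python
import numpy as np

def geometry_step(x_opt, x_t, ell_t, A, b, delta, alpha0=0.95, shrink=0.9, tries=20):
    """Sketch of the 'model' step: p_k = alpha (x_t - x_opt), alpha in (0, 1).

    Starting from alpha = 0.95 and shrinking it, keep the feasible step inside
    the trust region that gives the largest |ell_t(x_opt + p_k)|; feasibility of
    the segment follows from convexity of the linear constraints.
    """
    best, best_val = None, -np.inf
    alpha = alpha0
    for _ in range(tries):
        p = alpha * (x_t - x_opt)
        x = x_opt + p
        if np.linalg.norm(p) <= delta and np.all(A.T @ x >= b):
            val = abs(ell_t(x))
            if val > best_val:
                best, best_val = p, val
        alpha *= shrink
    return best

# Hypothetical data: a linear toy "Lagrange function" with ell_t(x_t) = 1,
# ell_t(x_opt) = 0, and positive-quadrant constraints.
x_opt = np.array([1.0, 1.0]); x_t = np.array([2.0, 1.0])
ell_t = lambda x: x[0] - 1.0
A = np.eye(2); b = np.zeros(2)
p = geometry_step(x_opt, x_t, ell_t, A, b, delta=2.0)
print(np.isclose(abs(ell_t(x_opt + p)), 0.95))  # True: alpha = 0.95 is accepted
```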
We now summarize our algorithm, which solves unconstrained, simple bound constrained and linear inequality constrained derivative-free optimization problems. The summary is divided into nine steps, where each step refers to the relevant part of the material discussed in previous sections. We now present the step-by-step description of the LCOBYQA algorithm:
• Step 1:
The user of the algorithm supplies the following data: the initial point x_0, the parameters ρ_beg and ρ_end, the matrix A^T, and the vector b.
• Step 2: Initialization
Set m = 2n + 1.
If x_0 is not strictly feasible, invoke the procedures PHASE ONE and MOVE to generate a strictly feasible point, see Sect. 3.1.
Set Δ_1 = ρ_beg, ρ_1 = ρ_beg, ρ_1 > ρ_end.
Construct the initial interpolation points, as shown in (38).
Select a point x_opt from the interpolation points such that f(x_opt) is the least value of f(x).
Select Î to be the initial working set at x_opt; set Â^T to be the coefficient matrix of the constraints in the working set, and b̂ the corresponding right-hand side vector.
Construct the first quadratic model Q; set k = 1, r = 0.
• Step 3:
if ρk ≤ ρend
go to Step 9.
end(if).
In this section, we compare the performance of LCOBYQA with that of other available model-based derivative-free algorithms. The LCOBYQA algorithm is tested on unconstrained, simple bound constrained and linearly constrained problems. Most of these test functions are non-convex. We use the number of function evaluations (that each algorithm takes to solve the problem) and the final function values as the criteria for comparison. The numerical experiments discussed in this section were carried out on a Pentium 4, 2.0 GHz PC, and the algorithm is coded entirely in MATLAB.
First, we compared LCOBYQA with UOBYQA [15] and COBYLA [14] using the unconstrained test problems ARWHEAD, BDQRTIC [3] and CHROSEN [20]. The codes were run in the same environment. The numbers of evaluations of f(x) for LCOBYQA, UOBYQA [15] and COBYLA [14] are reported in Table 1. Table 1 shows that our algorithm is attractive in comparison with its competitors, namely COBYLA [14] and UOBYQA [15], except for the results on the function BDQRTIC, where we observe that the performance of UOBYQA is superior to that of LCOBYQA.
Next, we compare LCOBYQA with CONDOR [7] and DFO [3, 4] using the test problems POWER, DQDRTIC and VARDIM [1]. We have run the codes in the
same environment. Table 2 shows the number of function evaluations and the final function values for the POWER, DQDRTIC and VARDIM test functions. Table 2 shows that the performance of LCOBYQA is far superior to that of the other two algorithms. However, LCOBYQA does not behave as well as DFO on VARDIM.
LCOBYQA was also tested and compared to NEWUOA on different unconstrained test functions. The test functions used are ARWHEAD, CHROSEN, PENALTY1, PENALTY2 and PENALTY3 [1]. We have run LCOBYQA and NEWUOA in the same environment. We have used x_0 = 0. We set ρ_end = 10^{−6} in each case, while ρ_beg is given the values 0.5, 0.5, 1.0, 0.1 and 0.1 for ARWHEAD, CHROSEN, PENALTY1, PENALTY2 and PENALTY3, respectively. Table 3 shows the number of function evaluations of LCOBYQA and NEWUOA for the 5 test functions. The star in Table 3 indicates that the CPU time is very long, so the problem was not tried. In this table, we observe that the performance of the two algorithms is similar. For example, the results of LCOBYQA on ARWHEAD are better than those of NEWUOA, but the results of NEWUOA on CHROSEN, PENALTY2 and PENALTY3 are better than those of LCOBYQA. The results are therefore comparable, in spite of the fact that NEWUOA was specifically designed for unconstrained problems.
LCOBYQA is also tested and compared with CONDOR [7], BOBYQA [19] and
COBYLA [14] using test problems in [11] with simple bound constraints. The results
are presented in Table 4. The numerical results of LCOBYQA reported in this table are encouraging. Out of 8 test cases, the results of LCOBYQA on 6 cases are superior to those of the others.
Finally, LCOBYQA is also compared to CONDOR and COBYLA using problems in [11] with linear constraints. The results for these test functions are given in Table 5. From this table we see that LCOBYQA performs very well compared to CONDOR and COBYLA for most of the test functions. LCOBYQA was also tested on other functions; results for these problems are reported in [10].

In our numerical experiments with LCOBYQA, it has been observed that it converges faster when the RESCUE procedure of BOBYQA [19] happens to be called early, during the first few iterations. This was the case for ARWHEAD, which was solved by LCOBYQA with up to 900 variables using a single processor, a remarkable result for a derivative-free method.
freedom in the model to minimize the Frobenius norm of the Hessian matrix of the
change to the model.
In Sect. 4, we tested our algorithm (LCOBYQA) on various test problems, most of which are nonconvex. Tables 1–5 show that our algorithm is attractive in comparison with its competitors. The results obtained by the algorithm demonstrate its efficiency and show that it competes favourably with other model-based algorithms.
The work reported in this paper is a starting point for the solution of a large class of constrained optimization problems. For future work in this regard, we suggest the following.

In our future research, we would like to extend our algorithm to handle quadratic, general nonlinear and difficult constraints (general nonlinear constraints which, like the objective, are expensive to evaluate and whose derivatives are not available). In order to handle quadratic and general constraints, we can use the augmented Lagrangian method, an exact penalty function, or the filter method. For the difficult constraints, the idea is to use interpolation to construct the constraints in the same way as the interpolation of the objective function.
The code of the algorithm is a complete MATLAB package; there are no calls to external, unavailable libraries, which makes the algorithm self-contained and relatively fast. To increase the speed of the algorithm further, it would be interesting to use parallelism in the coding.
Acknowledgements Parts of this work were done during the first author's two visits to the African Institute for Mathematical Sciences (AIMS), South Africa. He is very grateful to all AIMS staff. Special thanks to Professors Fritz Hahne and Barry Green, the previous and current directors of AIMS, respectively, for their excellent hospitality and the facilities that supported the research. He would also like to thank Professor M.J.D. Powell, an emeritus Professor at the Centre for Mathematical Sciences, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, for his encouragement and help during his Ph.D. work.
References
1. Buckley, A.G., Jenning, L.S.: Test functions for unconstrained minimization. Technical report, CS-3,
Dalhousie University, Canada (1989)
2. Conn, A., Gould, N., Toint, Ph.: Trust Region Methods. SIAM, Philadelphia (2000)
3. Conn, A., Sheinberg, K., Toint, Ph.: An algorithm using quadratic interpolation for unconstrained derivative-free optimization. In: Di Pillo, G., Giannessi, F. (eds.) Nonlinear Optimization and Applications, pp. 27–47. Plenum, New York (1996)
4. Conn, A., Sheinberg, K., Toint, Ph.: A derivative free optimization (DFO) algorithm. Report 98(11)
(1998)
5. Dong, S.: Methods for Constrained Optimization. Springer, Berlin (2002)
6. Fletcher, R.: Practical Methods of Optimization, 2nd edn. Wiley, New York (1987)
7. Frank, V., Bersini, H.B.: CONDOR, a new parallel constrained extension of Powell’s UOBYQA al-
gorithm: experimental results and comparison with DFO algorithm. J. Comput. Appl. Math. 181,
157–175 (2005)
8. Gay, D.M.: In: Lecture Notes in Mathematics, vol. 1066, pp. 74–105. Springer, Berlin (1984)
9. Gill, P.E., Murray, W., Wright, M.H.: Inertia-controlling methods for general quadratic programming. SIAM Rev. 33, 1–36 (1991)
10. Gumma, E.: A derivative-free algorithm for linearly constrained optimization problems. Ph.D., Uni-
versity of Khartoum, Sudan (2011)
11. Hock, W., Schittkowski, K.: Test Examples for Nonlinear Programming Codes. Lecture Notes in Economics and Mathematical Systems, vol. 187. Springer, Berlin (1981)
12. Igor, G., Nash, G., Sofer, A.: Linear and Nonlinear Programming. SIAM, Philadelphia (2009)
13. Ju, Z., Wang, C., Jiguo, J.: Combining trust region and line search algorithm for equality constrained
optimization. Appl. Math. Comput. 14(1–2), 123–136 (2004)
14. Powell, M.J.D.: A direct search optimization method that models the objective and constraint functions by linear interpolation. In: Gomez, S., Hennart, J.P. (eds.) Advances in Optimization and Numerical Analysis, Proceedings of the Sixth Workshop on Optimization and Numerical Analysis, Oaxaca, Mexico, vol. 275, pp. 15–67 (1994)
15. Powell, M.J.D.: UOBYQA: unconstrained optimization by quadratic approximation. Math. Program.
92, 555–582 (2002)
16. Powell, M.J.D.: Least Frobenius norm updating for quadratic models that satisfy interpolation condi-
tions. Math. Program. 100, 183–215 (2004)
17. Powell, M.J.D.: The NEWUOA software for unconstrained optimization without derivatives. In: Di Pillo, G., Roma, M. (eds.) Large-Scale Nonlinear Optimization. Springer, New York (2006)
18. Powell, M.J.D.: Development of NEWUOA for minimization without derivatives. IMA J. Numer.
Anal. 28, 649–664 (2008)
19. Powell, M.J.D.: The BOBYQA algorithm for bound constrained optimization without derivatives. Technical report 2009/NA06, CMS, University of Cambridge (2009)
20. Toint, Ph.: Some numerical results using a sparse matrix updating formula in unconstrained optimization. Math. Comput. 32, 839–859 (1978)
21. Winfield, D.: Function and Functional Optimization by Interpolation in Data Table. Ph.D., Harvard
University, Cambridge, USA (1969)
22. Winfield, D.: Function minimization by interpolation in data table. J. Inst. Math. Appl. 12, 339–347
(1973)