Chapter 13
Deterministic Global Optimization
13.1 Introduction
Global optimization is a branch of mathematical optimization concerned with finding
global optima of nonconvex, multiextremal problems. A typical global optimization
problem has numerous local optima that are not global, which makes finding a globally
optimal solution extremely difficult. In its most general form, it may involve a highly
nonlinear, nonsmooth, difficult-to-evaluate objective function and nonconvex feasible
region, with both continuous and discrete variables.
It is well known that convexity plays a central role in nonlinear optimization
(NLO), and most classical optimization methods take advantage of the convexity as-
sumptions for both the objective function and the feasible region. One of the funda-
mental properties of a convex problem is the fact that its local minimum is also its
global minimum. Finding a global minimum of a smooth convex problem reduces to
the task of detecting a point satisfying the KKT conditions. While this still requires
solving a nonlinear system, the goal is clear (find a KKT point) and, under certain ad-
ditional assumptions, can be achieved efficiently. In fact, the classical methods in
smooth NLO focus on the problem of determining a KKT (stationary) point. Such
methods are referred to as local methods. Since the numerous available local methods
converge to a stationary point, a natural question to ask is, can we find a large class of
problems for which a stationary point is guaranteed to be a global minimizer? It ap-
pears that if such a class is required to be sufficiently wide to include linear functions
and to be closed under the operations of addition of two functions and multiplication
by a positive scalar, then it is exactly the class of smooth convex functions [1394]. Thus,
the local methods are not expected to converge to a global minimum for nonconvex
optimization problems.
The field of global optimization originated in 1964, when Hoang Tuy published
his seminal paper on concave minimization [1798]. With advances in computer tech-
nologies, the field has expanded in many different directions, having become one of
the most active and fruitful avenues of research in optimization. Hundreds of journal
articles, edited volumes, research monographs, and textbooks have been devoted to
the field.
The remainder of this chapter is organized as follows. Section 13.2 describes some examples of
typical global optimization problems, with emphasis on continuous formulations of
discrete problems. Section 13.3 focuses on the computational complexity issues related
to global optimization. Global optimality conditions are discussed in Section 13.4.
Section 13.5 presents some basic ideas behind algorithmic approaches to solving global
optimization problems and briefly reviews global optimization software. Finally,
Section 13.6 concludes the chapter with pointers for further reading.
13.2 Examples of Global Optimization Problems

A classical example of a global optimization problem is the minimization of a fixed-charge cost function, in which the cost of operating an activity at level $x_i \ge 0$ is given by
$$g_i(x_i) = \begin{cases} 0, & x_i = 0, \\ a_i + c_i(x_i), & x_i > 0, \end{cases}$$
where $a_i > 0$ is the fixed setup cost and $c_i : \mathbb{R}_+ \to \mathbb{R}$ is a continuous concave function
representing the variable cost of the activity.
One of the interesting directions in global optimization research deals with con-
tinuous (nonconvex) approaches to discrete optimization problems. Bridging the con-
tinuous and discrete domains may reveal surprising connections between problems of
seemingly different natures and lead to new developments in both discrete and con-
tinuous optimization. Many integer linear optimization (LO) and integer quadratic
optimization (QO) problems can be reformulated as QO problems in continuous vari-
ables. For example, a binary LO problem of the form
where e is the appropriate vector of ones and μ is a sufficiently large positive num-
ber used to ensure that the global minimum of this problem is attained only when
x T (e − x) = 0. Other nonlinear binary problems can be reduced to equivalent concave
minimization problems in a similar fashion.
Next, we consider a quadratic unconstrained binary optimization (QUBO)
problem,
$$\min\ f(x) = x^T Q x + c^T x \quad \text{s.t.}\ x \in \{0,1\}^n, \qquad (13.4)$$
which is equivalent to
$$\min\ f_\mu(x) = x^T (Q + \mu I) x + (c - \mu e)^T x \quad \text{s.t.}\ x \in \{0,1\}^n, \qquad (13.5)$$
since $f_\mu(x) = f(x)$ for any $x \in \{0,1\}^n$. If $\mu \le -\lambda_{\max}$, where $\lambda_{\max}$ is the largest
eigenvalue of $Q$, then $f_\mu(x)$ is concave and $\{0,1\}^n$ can be replaced with the unit
hypercube $[0,1]^n$ as the feasible region.
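For concreteness, here is a minimal NumPy sketch of this construction; the instance data and all variable names are illustrative, not from the text. It builds $f_\mu$ with $\mu = -\lambda_{\max}$ and checks that $f_\mu$ agrees with $f$ at every binary point while being concave:

```python
import numpy as np

# Hypothetical small QUBO instance; Q symmetric, data chosen for illustration only.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 4))
Q = (Q + Q.T) / 2
c = rng.standard_normal(4)
e = np.ones(4)

def f(x):
    return x @ Q @ x + c @ x

# Concavified objective: f_mu(x) = x^T (Q + mu*I) x + (c - mu*e)^T x.
mu = -np.linalg.eigvalsh(Q)[-1]  # mu = -lambda_max makes Q + mu*I negative semidefinite
def f_mu(x):
    return x @ (Q + mu * np.eye(4)) @ x + (c - mu * e) @ x

# f_mu agrees with f at every binary point, since x_i^2 = x_i there ...
for k in range(2 ** 4):
    x = np.array([(k >> i) & 1 for i in range(4)], dtype=float)
    assert np.isclose(f(x), f_mu(x))

# ... and the Hessian 2*(Q + mu*I) has no positive eigenvalues, so f_mu is concave.
assert np.linalg.eigvalsh(Q + mu * np.eye(4)).max() <= 1e-12
```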
In yet another example of a continuous approach to combinatorial optimization,
we consider the classical maximum clique and maximum independent set problems on
graphs. Given a simple undirected graph G = (V , E) with the set of vertices V =
{1, . . . , n} and the set of edges E, a clique C in G is a subset of pairwise-adjacent ver-
tices, and an independent set I is a subset of pairwise-nonadjacent vertices of V . The
maximum clique problem is to find a clique C of the largest size in G, and the clique
number ω(G) is the size of a maximum clique in G. The maximum independent set
problem is defined likewise, and the independence number is denoted by α(G).
Let $A_G$ be the adjacency matrix of $G$, and let $e$ be the $n$-dimensional vector with
all components equal to one. For a subset $C$ of vertices, its scaled characteristic vector $x^C$ is
defined by $x_i^C = 1/|C|$ if $i \in C$ and $x_i^C = 0$ otherwise, $i = 1, \ldots, n$.
Motzkin and Straus [1359] showed that the clique number of $G$ is characterized by the global optimal value of the following quadratic program (Motzkin–Straus QP) over the standard simplex $S = \{x \in \mathbb{R}^n_+ : e^T x = 1\}$:
$$1 - \frac{1}{\omega(G)} = \max_{x \in S}\ x^T A_G x. \qquad (13.6)$$
Similar continuous characterizations are available for the independence number; in particular,
$$\alpha(G) = \max_{x \in [0,1]^n} \sum_{i \in V} \frac{x_i}{1 + \sum_{j \in N(i)} x_j} = \max_{x \in [0,1]^n} \left( \sum_{i \in V} \frac{x_i}{1 + \sum_{j \in N(i)} x_j} - \sum_{(i,j) \in E} x_i x_j \right), \qquad (13.10)$$
where $N(i)$ denotes the neighborhood of vertex $i$, which is the set of all vertices adjacent to $i$ in $G$. Just like Bomze's regularization of the Motzkin–Straus formulation, the second formulation in (13.10) is characterized by a one-to-one correspondence between its local maxima and maximal-by-inclusion independent sets in $G$.
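As a quick numerical illustration of (13.6) and (13.10), consider the following sketch on a hypothetical four-vertex graph (a triangle with a pendant vertex); the graph and the chosen clique and independent set are illustrative assumptions:

```python
import numpy as np

# Hypothetical test graph: triangle {0,1,2} plus pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Motzkin-Straus objective at the scaled characteristic vector of the clique C = {0,1,2}:
# x^T A x = 1 - 1/omega(G) = 1 - 1/3.
xC = np.array([1/3, 1/3, 1/3, 0.0])
print(xC @ A @ xC)  # 0.666...

# Second formulation in (13.10) at the characteristic vector of the independent set
# I = {1, 3}: sum_i x_i/(1 + sum over neighbors) - sum over edges x_i*x_j = alpha(G) = 2.
x = np.array([0.0, 1.0, 0.0, 1.0])
frac = sum(x[i] / (1 + A[i] @ x) for i in range(4))
penalty = 0.5 * x @ A @ x   # each edge counted once
print(frac - penalty)       # 2.0
```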
The examples above illustrate several important classes of global optimization prob-
lems, such as concave minimization, nonconvex QO, polynomial optimization, and
fractional optimization. In addition, they can be shown to belong to the class of difference of convex (D.C.) optimization problems.

13.3 Computational Complexity

Many natural classes of nonconvex optimization problems are NP-hard. In particular, the problem of minimizing a quadratic function subject to nonnegativity constraints has been shown to be NP-hard by observing that the KKT conditions for this problem are given by the linear complementarity problem, which is NP-hard [958, 1444]. The problem of checking whether a given point is a local minimum of a given quadratic problem is also NP-hard [1448].
Even though nonconvex optimization is extremely hard in general, there exist
nonconvex problems that can be solved efficiently. One example is the problem of
minimizing a quadratic function over a sphere [1832]:
$$\min\ \frac{1}{2} x^T Q x + c^T x \quad \text{s.t.}\ x^T x \le 1. \qquad (13.12)$$
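For example, (13.12) can be solved to high accuracy by a one-dimensional search for the Lagrange multiplier: when the minimizer is not interior, the solution satisfies $(Q + sI)x = -c$ with $\|x\| = 1$ for some $s \ge \max(0, -\lambda_{\min}(Q))$. The following bisection sketch illustrates this idea (it deliberately ignores the degenerate "hard case," where $c$ is orthogonal to the eigenspace of $\lambda_{\min}$):

```python
import numpy as np

def trust_region_subproblem(Q, c, tol=1e-10):
    """Minimize 0.5 x^T Q x + c^T x over ||x|| <= 1 (sketch; hard case not handled)."""
    n = len(c)
    lam_min = np.linalg.eigvalsh(Q)[0]

    # Interior solution: Q positive definite and unconstrained minimizer inside the ball.
    if lam_min > 0:
        x = np.linalg.solve(Q, -c)
        if np.linalg.norm(x) <= 1:
            return x

    # Boundary solution: find s with ||(Q + s I)^{-1} (-c)|| = 1 by bisection on s.
    def x_of(s):
        return np.linalg.solve(Q + s * np.eye(n), -c)

    lo = max(0.0, -lam_min) + 1e-12
    hi = lo + 1.0
    while np.linalg.norm(x_of(hi)) > 1:   # ||x(s)|| decreases as s grows
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(x_of(mid)) > 1:
            lo = mid
        else:
            hi = mid
    return x_of(hi)

# Example: an indefinite Q, for which the global minimizer lies on the boundary.
Q = np.diag([-2.0, 1.0])
c = np.array([0.5, 0.5])
print(trust_region_subproblem(Q, c))
```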
Another example is the problem of maximizing the Euclidean norm $\|x\|_2 = \left(\sum_{i=1}^n x_i^2\right)^{1/2}$ over a rectangular parallelotope [292].
The last example of an efficiently solvable nonconvex problem that we mention here is the following fractional linear optimization problem [1831]:
$$\min\ \frac{c^T x + \gamma}{d^T x + \delta} \quad \text{s.t.}\ Ax \le b,$$
where the denominator $d^T x + \delta$ is assumed to be positive (or negative) over the feasible domain. The objective of this problem is pseudoconvex and can be minimized using the ellipsoid algorithm. Alternatively, the problem can be solved efficiently using Dinkelbach's algorithm [600, 1643].
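To illustrate, the sketch below applies Dinkelbach's parametric idea: solve $\min\,(c - \lambda d)^T x$ over the feasible polyhedron, update $\lambda$ to the current objective ratio, and stop when the parametric optimal value reaches zero. The polyhedron $\{x : Ax \le b,\ x \ge 0\}$ (assumed bounded) and the tiny test instance are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

def dinkelbach(c, gamma, d, delta, A, b, tol=1e-9, max_iter=100):
    """Minimize (c^T x + gamma) / (d^T x + delta) over {x : A x <= b, x >= 0},
    assuming the denominator is positive on the feasible set (illustrative sketch)."""
    lam = 0.0
    for _ in range(max_iter):
        # Parametric subproblem: minimize (c - lam * d)^T x over the polyhedron.
        res = linprog(c - lam * d, A_ub=A, b_ub=b)
        x = res.x
        value = (c @ x + gamma) - lam * (d @ x + delta)
        if abs(value) < tol:     # F(lam) = 0 exactly when lam is the optimal ratio
            return x, lam
        lam = (c @ x + gamma) / (d @ x + delta)
    return x, lam

# Tiny example: minimize (x1 - 2) / (x1 + 1) over 0 <= x1 <= 3; the minimum is at x1 = 0.
x, lam = dinkelbach(np.array([1.0]), -2.0, np.array([1.0]), 1.0,
                    A=np.array([[1.0]]), b=np.array([3.0]))
print(x, lam)   # x1 = 0, optimal ratio -2
```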
13.4 Global Optimality Conditions

One classical tool for stating global optimality conditions is the convex envelope. Geometrically, the convex envelope $C_f(x)$ of a function $f(x)$ over a set $S$ is the function whose epigraph coincides with the convex hull of the epigraph of $f(x)$. With respect to the global optima of the problem of minimizing $f(x)$ over $S$, it has the following properties: the minimum values of $f$ and $C_f$ over $S$ coincide, and every global minimizer of $f$ over $S$ is also a global minimizer of $C_f$ over $S$.

Global optimality conditions can also be stated for minimizing a D.C. function, that is, a function representable as the difference of two convex functions. As another example, consider the binary QO problem
$$\min\ q(x) = \frac{1}{2} x^T Q x + c^T x \quad \text{s.t.}\ x \in B = \{-1,1\}^n, \qquad (13.19)$$
where $Q$ is an $n \times n$ real symmetric matrix and $c \in \mathbb{R}^n$. Beck and Teboulle [182] use a continuous representation of this problem to derive global optimality conditions for (13.19), expressed in terms of the smallest eigenvalue $\lambda_{\min}(Q)$, the matrix $X = \operatorname{diag}(x)$, and the vector $e = [1, \ldots, 1]^T \in \mathbb{R}^n$.
13.5 Algorithmic Approaches and Software

Most deterministic global optimization algorithms are built on the branch-and-bound framework, which recursively partitions the feasible region and computes bounds on the optimal objective function value over each resulting subset. Its main components can be outlined as follows.

Bounds For each subset resulting from the partitioning, upper and lower bounds on the optimal objective function value are computed. The objective function value at any evaluation point constitutes an upper bound, which can be further improved using heuristic strategies. The best evaluation point encountered in the process is recorded and updated whenever a better-quality solution is found. The methods used to obtain lower bounds include the reformulation-linearization technique (RLT) [1687], αBB approaches [28, 85], McCormick relaxations [1304, 1654], Lagrangian dual bounds [109, 201, 355, 966, 1384, 1802], semidefinite and other convex relaxations [300, 1004, 1241, 1929, 1960], bounds based on interval analysis [902, 1549], and Lipschitz bounds [904, 1494, 1730].
Pruning (Fathoming) If for some subset the lower bound exceeds the currently best
upper bound, the subset cannot contain an optimal solution and hence is eliminated
from further consideration.
Termination The algorithm terminates when the best upper bound is sufficiently close to the smallest lower bound among all the active subsets, yielding a global minimum or a close estimate of it.
Partitioning a continuous feasible region during the branching process is more com-
plicated than dealing with discrete variables, since the number of partitions may be in-
finite and the partitions may overlap on the boundary. Hence, even establishing finite
convergence is a nontrivial task [159].
Since a general global optimization problem is extremely hard to solve, many prac-
tical algorithms exploit certain structural properties or make additional assumptions
regarding the problem of interest to make it more tractable. In the following two
subsections, we illustrate this point by discussing some ideas behind algorithmic ap-
proaches for two important classes of global optimization problems: concave mini-
mization and Lipschitz optimization.
A classical approach to concave minimization is outer approximation (OA): a sequence of nested sets $X_k$, $k = 1, 2, \ldots$, containing the feasible region $X$ is constructed step by step, and the following approximating problem of (13.25) is considered at each step:
$$\min\ f(x) \quad \text{s.t.}\ x \in X_k. \qquad (13.26)$$
If the computed optimal solution $x^k$ of (13.26) belongs to $X$, it is also an optimal solution of (13.25), and the original problem is solved. Otherwise, we construct a new OA $X_{k+1} \subset X_k \setminus \{x^k\}$, which cuts off the infeasible point $x^k$, and repeat the process. Most often, the improved OA $X_{k+1}$ is obtained using inequality cuts separating $x^k$ from $X$. A much less common alternative approach is based on the notion of collapsing polytopes [686].
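In outline, the OA loop can be written with three problem-specific callbacks; the skeleton below is a generic sketch, and the callback names are placeholders rather than an established API:

```python
def outer_approximation(solve_relaxation, is_feasible, add_cut, max_iter=1000):
    """Generic OA scheme: solve over X_k; if the solution x^k lies in X, stop;
    otherwise add a cut separating x^k from X to obtain X_{k+1}."""
    for _ in range(max_iter):
        x = solve_relaxation()   # optimal solution of min f over the current X_k
        if is_feasible(x):       # x in X, hence optimal for the original problem
            return x
        add_cut(x)               # refine the outer approximation: X_{k+1} excludes x
    raise RuntimeError("outer approximation did not converge within max_iter")
```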
In the inner approximation approach, also known as polyhedral annexation, the feasible region is approximated from inside by constructing an expanding sequence of approximating sets $X_1 \subset X_2 \subset \cdots \subset X$. In this case, the sequence of optimal objective function values of the corresponding approximating problems is monotonically nonincreasing and converges to the optimal value of the original problem from above [1799, 1800]. The successive underestimation methods approximate the problem of minimizing a concave function over a bounded polyhedron by replacing the original objective function with its underestimates at each iteration [685].
Just as in the case for global optimization problems in general, branch-and-bound
methods represent the most common approach to solving concave minimization prob-
lems. The reader is referred to [222,223,959] for detailed surveys of concave minimiza-
tion techniques.
Given evaluation points $x^1, \ldots, x^k$ and a Lipschitz constant $L$ of $f$ on $X$, the function
$$h_k(x) = \max_{1 \le i \le k} \left( f(x^i) - L \|x - x^i\| \right)$$
provides a lower bound on $f$ over $X$, that is, $h_k(x) \le f(x)$ for all $x \in X$. In particular, in the univariate case, $h_k$ is a "saw-tooth" function and is easy to optimize. This observation forms the core of Piyavskii's algorithm [1494], which proceeds by adding a global minimizer $x^{k+1}$ of $h_k$ as the new evaluation point at the $k$th step, thus yielding an updated lower-bounding function $h_{k+1}$.
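The univariate version is compact enough to sketch in full. Between consecutive evaluation points the minimizer of the saw-tooth $h_k$ and its value have closed forms, which the code below exploits (the test function and the constant $L$ in the example are illustrative):

```python
import numpy as np

def piyavskii(f, a, b, L, n_iter=50):
    """Piyavskii's algorithm for an L-Lipschitz f on [a, b] (minimal sketch).
    Lower-bounding saw-tooth: h_k(x) = max_i ( f(x_i) - L * |x - x_i| )."""
    xs, fs = [a, b], [f(a), f(b)]
    for _ in range(n_iter):
        order = np.argsort(xs)
        best_lb, best_x = np.inf, None
        # Between consecutive points x_i < x_j, h_k attains its minimum where the
        # two downward cones intersect; the point and the bound have closed forms.
        for i, j in zip(order[:-1], order[1:]):
            x_new = 0.5 * (xs[i] + xs[j]) + (fs[i] - fs[j]) / (2 * L)
            lb = 0.5 * (fs[i] + fs[j]) - 0.5 * L * (xs[j] - xs[i])
            if lb < best_lb:
                best_lb, best_x = lb, x_new
        xs.append(best_x)
        fs.append(f(best_x))
    k = int(np.argmin(fs))
    return xs[k], fs[k], best_lb   # incumbent, its value, and a global lower bound

# Example: a multiextremal test function on [0, 10]; |f'| <= 2, so L = 2 is valid.
x_best, f_best, lb = piyavskii(lambda x: np.sin(x) + np.sin(3 * x) / 3, 0.0, 10.0, L=2.0)
print(x_best, f_best, lb)
```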
Piyavskii’s algorithm can be generalized to the multivariate case. As in the uni-
variate case, an underestimate of the objective is built and updated by adding eval-
uation points corresponding to the global minima of the lower-bounding function.
Several algorithms for computing these global minima have been proposed in the lit-
erature [1340, 1495].
Alternatively, a multivariate Lipschitz optimization problem can be solved by re-
duction to a sequence of univariate problems using nested optimization [904, 1494].
Another way to reduce a multidimensional Lipschitz optimization problem to a uni-
variate one is based on the concept of space-filling curves. A space-filling curve or Peano
curve is a curve passing through every point of a multidimensional region. Such curves
have been studied by mathematicians since the end of the 19th century. To exploit
them in a global optimization context, we first use a space-filling curve to transform the multidimensional problem $\min\{f(x) : x \in [0,1]^n\}$ into the equivalent univariate problem $\min\{f(x(t)) : t \in [0,1]\}$, where $x(t)$ is a Peano curve. A one-dimensional global optimization method is then applied to the resulting univariate problem. This methodological direction has been explored in depth
for Lipschitz-continuous problems, as discussed in the monographs [1661,1729,1730].
The proposed techniques utilize the fact that if f (x) is Lipschitz continuous with a
constant L on [0, 1]n , then the one-dimensional function f (x(t )), t ∈ [0, 1], satisfies
the following Hölder condition:
$$|f(x(t')) - f(x(t''))| \le 2L\sqrt{n+3}\, |t' - t''|^{1/n}, \qquad t', t'' \in [0,1]. \qquad (13.29)$$
Strongin and Sergeyev [1730] provide a detailed account of the corresponding method-
ology. A more recent book by Sergeyev et al. [1661] gives a brief introduction to the
topic and discusses the most recent related developments.
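To give a flavor of the reduction, the sketch below maps indices along a two-dimensional Hilbert curve (a standard bit-manipulation construction) to grid points of $[0,1]^2$ and scans a test function along the curve; a practical method would instead apply a univariate Lipschitz algorithm exploiting (13.29) to $t \mapsto f(x(t))$. The test function and grid resolution are illustrative choices:

```python
import numpy as np

def d2xy(n, d):
    """Map index d in [0, n*n) to integer coordinates (x, y) on the order-n
    Hilbert curve (n a power of 2); standard bit-twiddling construction."""
    x = y = 0
    s, t = 1, d
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        t //= 4
        s *= 2
    return x, y

f = lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2   # hypothetical test function
n = 2 ** 6                                            # 64 x 64 grid
best = min(f(np.array(d2xy(n, d), dtype=float) / (n - 1)) for d in range(n * n))
print(best)   # approaches the true minimum 0 as n grows
```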
In addition to the approaches mentioned above, several branch-and-bound algo-
rithms have been developed for multivariate Lipschitz optimization [681, 904, 1310,
1311, 1461, 1487]. Typically such algorithms utilize partitions of the feasible region
into hyperrectangles; the objective function f is evaluated at the center of the current
hyperrectangle to get the corresponding upper bound; branching rules are based on
splitting the current hyperrectangle along its longest edge into several equal hyperrect-
angles, and the lower bound is based on the Lipschitz constant.
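A minimal sketch of this scheme follows (the Lipschitz constant, box, tolerance, and test function are illustrative assumptions):

```python
import heapq
import numpy as np

def lipschitz_bb(f, lo, hi, L, tol=1e-3, max_nodes=200000):
    """Minimize f over the box [lo, hi], assuming f is Lipschitz with constant L.
    Branch-and-bound sketch: evaluate the center of each box (upper bound), bound
    below by f(center) - L * half-diagonal, and bisect the longest edge."""
    def make_node(blo, bhi):
        center = 0.5 * (blo + bhi)
        f_center = f(center)                                  # an upper bound
        lb = f_center - 0.5 * L * np.linalg.norm(bhi - blo)   # Lipschitz lower bound
        return (lb, f_center, blo.tolist(), bhi.tolist())

    root = make_node(np.asarray(lo, float), np.asarray(hi, float))
    heap, ub = [root], root[1]
    for _ in range(max_nodes):
        if not heap:
            break
        lb, f_center, blo, bhi = heapq.heappop(heap)          # most promising box
        if ub - lb < tol:                                     # optimality gap closed
            return ub, lb
        blo, bhi = np.array(blo), np.array(bhi)
        i = int(np.argmax(bhi - blo))                         # longest edge
        mid = 0.5 * (blo[i] + bhi[i])
        for a, b in ((blo[i], mid), (mid, bhi[i])):
            clo, chi = blo.copy(), bhi.copy()
            clo[i], chi[i] = a, b
            child = make_node(clo, chi)
            ub = min(ub, child[1])
            if child[0] < ub - tol:                           # prune dominated boxes
                heapq.heappush(heap, child)
    return ub, lb

# Example: the function from the Piyavskii sketch, extended to a 2-D box.
g = lambda p: np.sin(p[0]) + np.sin(3 * p[0]) / 3 + (p[1] - 0.5) ** 2
print(lipschitz_bb(g, [0.0, 0.0], [10.0, 1.0], L=3.0))
```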
Despite their theoretical attractiveness, the early Lipschitz optimization methods
suffered from several drawbacks in practice, such as the need to specify the Lipschitz
constant, low speed of convergence, and high computational complexity in higher di-
mensions. In particular, their slow convergence can be explained by the high emphasis
placed on global search, dictated by the large values of Lipschitz constants typically
used to ensure a valid upper bound on the rate of change of the objective function.
This motivated more recent approaches that take advantage of an alternative role given
to the Lipschitz constant. One such method, DIRECT [1011, 1012], interprets the Lip-
schitz constant as a weighting parameter used in balancing between global and local
search. Due to its simplicity and effectiveness, DIRECT and its variations have attracted
considerable attention in the literature [711, 775, 1210, 1460].
In the last two decades, we have witnessed significant progress in the development
of global optimization software. What used to be “the road less traveled” 20 years
ago [1444] is becoming an essential feature of state-of-the-art optimization software.
Most modern optimization modeling systems and solvers include global optimiza-
tion functionalities; see [1486] for a survey. Prominent global optimization solvers
include αBB [28, 85], BARON (Branch and Reduce Optimization Navigator) [1611,
1754,1755], GloMIQO (Global Mixed Integer Quadratic Optimizer) [1331], and LGO
(Lipschitz Global Optimizer) [1488], to name a few.
Some standard testbeds and problem generators have also been developed that can
be used to evaluate the performance of computer implementations of global optimiza-
tion algorithms. For example, the book by Floudas et al. [742] contains a large col-
lection of various types of test instances for global optimization, including QO prob-
lems, minimum concave cost transportation problems, quadratically constrained prob-
lems, and polynomial problems. Gaviano et al. [790] propose a method for generating
test functions with user-specified properties for global optimization. Websites [1339,
1399] maintaining lists of available global optimization software provide download
links to numerous open-source packages. Rios and Sahinidis [1569] review algorithms
and compare 22 derivative-free optimization solvers on 502 test problems; their prob-
lems and computational results are available at http://archimedes.cheme.cmu.edu/?q=dfocomp.