Chapter 13
Deterministic Global Optimization
13.1 Introduction
Global optimization is a branch of mathematical optimization concerned with finding
global optima of nonconvex, multiextremal problems. A typical global optimization
problem has numerous local optima that are not global, which makes finding a globally
optimal solution extremely difficult. In its most general form, it may involve a highly
nonlinear, nonsmooth, difficult-to-evaluate objective function and nonconvex feasible
region, with both continuous and discrete variables.
It is well known that convexity plays a central role in nonlinear optimization
(NLO), and most classical optimization methods take advantage of the convexity as-
sumptions for both the objective function and the feasible region. One of the funda-
mental properties of a convex problem is the fact that its local minimum is also its
global minimum. Finding a global minimum of a smooth convex problem reduces to
the task of detecting a point satisfying the KKT conditions. While this still requires
solving a nonlinear system, the goal is clear (find a KKT point) and, under certain ad-
ditional assumptions, can be achieved efficiently. In fact, the classical methods in
smooth NLO focus on the problem of determining a KKT (stationary) point. Such
methods are referred to as local methods. Since the numerous available local methods
converge to a stationary point, a natural question to ask is, can we find a large class of
problems for which a stationary point is guaranteed to be a global minimizer? It ap-
pears that if such a class is required to be sufficiently wide to include linear functions
and to be closed under the operations of addition of two functions and multiplication
by a positive scalar, then it is exactly the class of smooth convex functions [1394]. Thus,
the local methods are not expected to converge to a global minimum for nonconvex
optimization problems.
The field of global optimization originated in 1964, when Hoang Tuy published
his seminal paper on concave minimization [1798]. With advances in computer tech-
nologies, the field has expanded in many different directions, having become one of
the most active and fruitful avenues of research in optimization. Hundreds of journal
articles, edited volumes, research monographs, and textbooks have been devoted to
the field.
The remainder of this chapter is organized as follows. Section 13.2 describes some examples of
typical global optimization problems, with emphasis on continuous formulations of
discrete problems. Section 13.3 focuses on the computational complexity issues related
to global optimization. Global optimality conditions are discussed in Section 13.4.
Section 13.5 presents some basic ideas behind algorithmic approaches to solving global
optimization problems and briefly reviews global optimization software. Finally,
Section 13.6 concludes the chapter with pointers for further reading.
13.2 Examples of Global Optimization Problems

A classical example of a global optimization problem is the minimization of a fixed-charge cost function, in which the cost of operating an activity at level $x_i \ge 0$ is given by
$$g_i(x_i) = \begin{cases} 0, & x_i = 0, \\ a_i + c_i(x_i), & x_i > 0, \end{cases}$$
where $a_i > 0$ is the fixed setup cost and $c_i : \mathbb{R}_+ \to \mathbb{R}$ is a continuous concave function
representing the variable cost of the activity.
One of the interesting directions in global optimization research deals with con-
tinuous (nonconvex) approaches to discrete optimization problems. Bridging the con-
tinuous and discrete domains may reveal surprising connections between problems of
seemingly different natures and lead to new developments in both discrete and con-
tinuous optimization. Many integer linear optimization (LO) and integer quadratic
optimization (QO) problems can be reformulated as QO problems in continuous vari-
ables. For example, a binary LO problem of the form
where e is the appropriate vector of ones and μ is a sufficiently large positive num-
ber used to ensure that the global minimum of this problem is attained only when
x T (e − x) = 0. Other nonlinear binary problems can be reduced to equivalent concave
minimization problems in a similar fashion.
Next, we consider a quadratic unconstrained binary optimization (QUBO)
problem,
$$\min\ f(x) = x^T Q x + c^T x \quad \text{s.t.}\ x \in \{0,1\}^n, \qquad (13.4)$$
which is equivalent to
$$\min\ f_\mu(x) = x^T (Q + \mu I) x + (c - \mu e)^T x \quad \text{s.t.}\ x \in \{0,1\}^n, \qquad (13.5)$$
since $f_\mu(x) = f(x)$ for any $x \in \{0,1\}^n$. If $\mu \le -\lambda_{\max}$, where $\lambda_{\max}$ is the largest
eigenvalue of $Q$, then $f_\mu(x)$ is concave and $\{0,1\}^n$ can be replaced with the unit
hypercube $[0,1]^n$ as the feasible region.
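For concreteness, here is a minimal NumPy sketch of this construction; the instance data and all variable names are illustrative, not from the text. It builds $f_\mu$ with $\mu = -\lambda_{\max}$ and checks that $f_\mu$ agrees with $f$ at every binary point while being concave:

```python
import numpy as np

# Hypothetical small QUBO instance; Q symmetric, data chosen for illustration only.
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 4))
Q = (Q + Q.T) / 2
c = rng.standard_normal(4)
e = np.ones(4)

def f(x):
    return x @ Q @ x + c @ x

# Concavified objective: f_mu(x) = x^T (Q + mu*I) x + (c - mu*e)^T x.
mu = -np.linalg.eigvalsh(Q)[-1]  # mu = -lambda_max makes Q + mu*I negative semidefinite
def f_mu(x):
    return x @ (Q + mu * np.eye(4)) @ x + (c - mu * e) @ x

# f_mu agrees with f at every binary point, since x_i^2 = x_i there ...
for k in range(2 ** 4):
    x = np.array([(k >> i) & 1 for i in range(4)], dtype=float)
    assert np.isclose(f(x), f_mu(x))

# ... and the Hessian 2*(Q + mu*I) has no positive eigenvalues, so f_mu is concave.
assert np.linalg.eigvalsh(Q + mu * np.eye(4)).max() <= 1e-12
```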
In yet another example of a continuous approach to combinatorial optimization,
we consider the classical maximum clique and maximum independent set problems on
graphs. Given a simple undirected graph G = (V , E) with the set of vertices V =
{1, . . . , n} and the set of edges E, a clique C in G is a subset of pairwise-adjacent ver-
tices, and an independent set I is a subset of pairwise-nonadjacent vertices of V . The
maximum clique problem is to find a clique C of the largest size in G, and the clique
number ω(G) is the size of a maximum clique in G. The maximum independent set
problem is defined likewise, and the independence number is denoted by α(G).
Let $A_G$ be the adjacency matrix of $G$, and let $e$ be the $n$-dimensional vector with
all components equal to one. For a subset $C$ of vertices, its scaled characteristic vector $x^C$ is
defined by $x_i^C = 1/|C|$ if $i \in C$ and $x_i^C = 0$ otherwise, $i = 1, \ldots, n$.
Motzkin and Straus [1359] showed that the clique number of $G$ is characterized by the global optimal value of the following quadratic program (Motzkin–Straus QP) over the standard simplex $S = \{x \in \mathbb{R}^n_+ : e^T x = 1\}$:
$$1 - \frac{1}{\omega(G)} = \max_{x \in S}\ x^T A_G x. \qquad (13.6)$$
Similar continuous characterizations are available for the independence number; in particular,
$$\alpha(G) = \max_{x \in [0,1]^n} \sum_{i \in V} \frac{x_i}{1 + \sum_{j \in N(i)} x_j} = \max_{x \in [0,1]^n} \left( \sum_{i \in V} \frac{x_i}{1 + \sum_{j \in N(i)} x_j} - \sum_{(i,j) \in E} x_i x_j \right), \qquad (13.10)$$
where $N(i)$ denotes the neighborhood of vertex $i$, which is the set of all vertices adjacent to $i$ in $G$. Just like Bomze's regularization of the Motzkin–Straus formulation, the second formulation in (13.10) is characterized by a one-to-one correspondence between its local maxima and maximal-by-inclusion independent sets in $G$.
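As a quick numerical illustration of (13.6) and (13.10), consider the following sketch on a hypothetical four-vertex graph (a triangle with a pendant vertex); the graph and the chosen clique and independent set are illustrative assumptions:

```python
import numpy as np

# Hypothetical test graph: triangle {0,1,2} plus pendant vertex 3 attached to 2.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Motzkin-Straus objective at the scaled characteristic vector of the clique C = {0,1,2}:
# x^T A x = 1 - 1/omega(G) = 1 - 1/3.
xC = np.array([1/3, 1/3, 1/3, 0.0])
print(xC @ A @ xC)  # 0.666...

# Second formulation in (13.10) at the characteristic vector of the independent set
# I = {1, 3}: sum_i x_i/(1 + sum over neighbors) - sum over edges x_i*x_j = alpha(G) = 2.
x = np.array([0.0, 1.0, 0.0, 1.0])
frac = sum(x[i] / (1 + A[i] @ x) for i in range(4))
penalty = 0.5 * x @ A @ x   # each edge counted once
print(frac - penalty)       # 2.0
```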
The examples above illustrate several important classes of global optimization prob-
lems, such as concave minimization, nonconvex QO, polynomial optimization, and
fractional optimization. In addition, they can be shown to belong to the class of difference of convex (D.C.) optimization problems.

13.3 Computational Complexity

Many natural classes of nonconvex optimization problems are NP-hard. In particular, the problem of minimizing a quadratic function subject to nonnegativity constraints has been shown to be NP-hard by observing that the KKT conditions for this problem are given by the linear complementarity problem, which is NP-hard [958, 1444]. The problem of checking whether a given point is a local minimum of a given quadratic problem is also NP-hard [1448].
Even though nonconvex optimization is extremely hard in general, there exist
nonconvex problems that can be solved efficiently. One example is the problem of
minimizing a quadratic function over a sphere [1832]:
$$\min\ \frac{1}{2} x^T Q x + c^T x \quad \text{s.t.}\ x^T x \le 1. \qquad (13.12)$$
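For example, (13.12) can be solved to high accuracy by a one-dimensional search for the Lagrange multiplier: when the minimizer is not interior, the solution satisfies $(Q + sI)x = -c$ with $\|x\| = 1$ for some $s \ge \max(0, -\lambda_{\min}(Q))$. The following bisection sketch illustrates this idea (it deliberately ignores the degenerate "hard case," where $c$ is orthogonal to the eigenspace of $\lambda_{\min}$):

```python
import numpy as np

def trust_region_subproblem(Q, c, tol=1e-10):
    """Minimize 0.5 x^T Q x + c^T x over ||x|| <= 1 (sketch; hard case not handled)."""
    n = len(c)
    lam_min = np.linalg.eigvalsh(Q)[0]

    # Interior solution: Q positive definite and unconstrained minimizer inside the ball.
    if lam_min > 0:
        x = np.linalg.solve(Q, -c)
        if np.linalg.norm(x) <= 1:
            return x

    # Boundary solution: find s with ||(Q + s I)^{-1} (-c)|| = 1 by bisection on s.
    def x_of(s):
        return np.linalg.solve(Q + s * np.eye(n), -c)

    lo = max(0.0, -lam_min) + 1e-12
    hi = lo + 1.0
    while np.linalg.norm(x_of(hi)) > 1:   # ||x(s)|| decreases as s grows
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(x_of(mid)) > 1:
            lo = mid
        else:
            hi = mid
    return x_of(hi)

# Example: an indefinite Q, for which the global minimizer lies on the boundary.
Q = np.diag([-2.0, 1.0])
c = np.array([0.5, 0.5])
print(trust_region_subproblem(Q, c))
```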
Another example is the problem of maximizing the Euclidean norm $\|x\|_2 = \left(\sum_{i=1}^n x_i^2\right)^{1/2}$ over a rectangular parallelotope [292].
The last example of an efficiently solvable nonconvex problem that we mention here is the following fractional linear optimization problem [1831]:
$$\min\ \frac{c^T x + \gamma}{d^T x + \delta} \quad \text{s.t.}\ Ax \le b,$$
where the denominator $d^T x + \delta$ is assumed to be positive (or negative) over the feasible domain. The objective of this problem is pseudoconvex and can be minimized using the ellipsoid algorithm. Alternatively, the problem can be solved efficiently using Dinkelbach's algorithm [600, 1643].
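To illustrate, the sketch below applies Dinkelbach's parametric idea: solve $\min\,(c - \lambda d)^T x$ over the feasible polyhedron, update $\lambda$ to the current objective ratio, and stop when the parametric optimal value reaches zero. The polyhedron $\{x : Ax \le b,\ x \ge 0\}$ (assumed bounded) and the tiny test instance are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

def dinkelbach(c, gamma, d, delta, A, b, tol=1e-9, max_iter=100):
    """Minimize (c^T x + gamma) / (d^T x + delta) over {x : A x <= b, x >= 0},
    assuming the denominator is positive on the feasible set (illustrative sketch)."""
    lam = 0.0
    for _ in range(max_iter):
        # Parametric subproblem: minimize (c - lam * d)^T x over the polyhedron.
        res = linprog(c - lam * d, A_ub=A, b_ub=b)
        x = res.x
        value = (c @ x + gamma) - lam * (d @ x + delta)
        if abs(value) < tol:     # F(lam) = 0 exactly when lam is the optimal ratio
            return x, lam
        lam = (c @ x + gamma) / (d @ x + delta)
    return x, lam

# Tiny example: minimize (x1 - 2) / (x1 + 1) over 0 <= x1 <= 3; the minimum is at x1 = 0.
x, lam = dinkelbach(np.array([1.0]), -2.0, np.array([1.0]), 1.0,
                    A=np.array([[1.0]]), b=np.array([3.0]))
print(x, lam)   # x1 = 0, optimal ratio -2
```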
13.4 Global Optimality Conditions

One classical tool for stating global optimality conditions is the convex envelope. Geometrically, the convex envelope $C_f(x)$ of a function $f(x)$ over a set $S$ is the function whose epigraph coincides with the convex hull of the epigraph of $f(x)$. With respect to the global optima of the problem of minimizing $f(x)$ over $S$, it has the following properties: the minimum values of $f$ and $C_f$ over $S$ coincide, and every global minimizer of $f$ over $S$ is also a global minimizer of $C_f$ over $S$.

Global optimality conditions can also be stated for minimizing a D.C. function, that is, a function representable as the difference of two convex functions. As another example, consider the binary QO problem
$$\min\ q(x) = \frac{1}{2} x^T Q x + c^T x \quad \text{s.t.}\ x \in B = \{-1,1\}^n, \qquad (13.19)$$
where $Q$ is an $n \times n$ real symmetric matrix and $c \in \mathbb{R}^n$. Beck and Teboulle [182] use a continuous representation of this problem to derive global optimality conditions for (13.19), expressed in terms of the smallest eigenvalue $\lambda_{\min}(Q)$, the matrix $X = \operatorname{diag}(x)$, and the vector $e = [1, \ldots, 1]^T \in \mathbb{R}^n$.
13.5 Algorithmic Approaches and Software

Most deterministic global optimization algorithms are built on the branch-and-bound framework, which recursively partitions the feasible region and computes bounds on the optimal objective function value over each resulting subset. Its main components can be outlined as follows.

Bounds For each subset resulting from the partitioning, upper and lower bounds on the optimal objective function value are computed. The objective function value at any evaluation point constitutes an upper bound, which can be further improved using heuristic strategies. The best evaluation point encountered in the process is recorded and updated whenever a better-quality solution is found. The methods used to obtain lower bounds include the reformulation-linearization technique (RLT) [1687], αBB approaches [28, 85], McCormick relaxations [1304, 1654], Lagrangian dual bounds [109, 201, 355, 966, 1384, 1802], semidefinite and other convex relaxations [300, 1004, 1241, 1929, 1960], bounds based on interval analysis [902, 1549], and Lipschitz bounds [904, 1494, 1730].
Pruning (Fathoming) If for some subset the lower bound exceeds the currently best
upper bound, the subset cannot contain an optimal solution and hence is eliminated
from further consideration.
Termination The algorithm terminates when the best upper bound is sufficiently close to the smallest lower bound among all the active subsets, yielding a global minimum or a close estimate of it.
Partitioning a continuous feasible region during the branching process is more com-
plicated than dealing with discrete variables, since the number of partitions may be in-
finite and the partitions may overlap on the boundary. Hence, even establishing finite
convergence is a nontrivial task [159].
Since a general global optimization problem is extremely hard to solve, many prac-
tical algorithms exploit certain structural properties or make additional assumptions
regarding the problem of interest to make it more tractable. In the following two
subsections, we illustrate this point by discussing some ideas behind algorithmic ap-
proaches for two important classes of global optimization problems: concave mini-
mization and Lipschitz optimization.
A classical approach to concave minimization is outer approximation (OA): a sequence of nested sets $X_k$, $k = 1, 2, \ldots$, containing the feasible region $X$ is constructed step by step, and the following approximating problem of (13.25) is considered at each step:
$$\min\ f(x) \quad \text{s.t.}\ x \in X_k. \qquad (13.26)$$
If the computed optimal solution $x^k$ of (13.26) belongs to $X$, it is also an optimal solution of (13.25), and the original problem is solved. Otherwise, we construct a new OA $X_{k+1} \subset X_k \setminus \{x^k\}$, which cuts off the infeasible point $x^k$, and repeat the process. Most often, the improved OA $X_{k+1}$ is obtained using inequality cuts separating $x^k$ from $X$. A much less common alternative approach is based on the notion of collapsing polytopes [686].
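In outline, the OA loop can be written with three problem-specific callbacks; the skeleton below is a generic sketch, and the callback names are placeholders rather than an established API:

```python
def outer_approximation(solve_relaxation, is_feasible, add_cut, max_iter=1000):
    """Generic OA scheme: solve over X_k; if the solution x^k lies in X, stop;
    otherwise add a cut separating x^k from X to obtain X_{k+1}."""
    for _ in range(max_iter):
        x = solve_relaxation()   # optimal solution of min f over the current X_k
        if is_feasible(x):       # x in X, hence optimal for the original problem
            return x
        add_cut(x)               # refine the outer approximation: X_{k+1} excludes x
    raise RuntimeError("outer approximation did not converge within max_iter")
```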
In the inner approximation approach, also known as polyhedral annexation, the feasible region is approximated from inside by constructing an expanding sequence of approximating sets $X_1 \subset X_2 \subset \cdots \subset X$. In this case, the sequence of optimal objective function values of the corresponding approximating problems is monotonically nonincreasing and converges to the optimal value of the original problem from above [1799, 1800]. The successive underestimation methods approximate the problem of minimizing a concave function over a bounded polyhedron by replacing the original objective function with its underestimates at each iteration [685].
Just as in the case for global optimization problems in general, branch-and-bound
methods represent the most common approach to solving concave minimization prob-
lems. The reader is referred to [222,223,959] for detailed surveys of concave minimiza-
tion techniques.
Given evaluation points $x^1, \ldots, x^k$ and a Lipschitz constant $L$ of $f$ on $X$, the function
$$h_k(x) = \max_{1 \le i \le k} \left( f(x^i) - L \|x - x^i\| \right)$$
provides a lower bound on $f$ over $X$, that is, $h_k(x) \le f(x)$ for all $x \in X$. In particular, in the univariate case, $h_k$ is a "saw-tooth" function and is easy to optimize. This observation forms the core of Piyavskii's algorithm [1494], which proceeds by adding a global minimizer $x^{k+1}$ of $h_k$ as the new evaluation point at the $k$th step, thus yielding an updated lower-bounding function $h_{k+1}$.
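The univariate version is compact enough to sketch in full. Between consecutive evaluation points the minimizer of the saw-tooth $h_k$ and its value have closed forms, which the code below exploits (the test function and the constant $L$ in the example are illustrative):

```python
import numpy as np

def piyavskii(f, a, b, L, n_iter=50):
    """Piyavskii's algorithm for an L-Lipschitz f on [a, b] (minimal sketch).
    Lower-bounding saw-tooth: h_k(x) = max_i ( f(x_i) - L * |x - x_i| )."""
    xs, fs = [a, b], [f(a), f(b)]
    for _ in range(n_iter):
        order = np.argsort(xs)
        best_lb, best_x = np.inf, None
        # Between consecutive points x_i < x_j, h_k attains its minimum where the
        # two downward cones intersect; the point and the bound have closed forms.
        for i, j in zip(order[:-1], order[1:]):
            x_new = 0.5 * (xs[i] + xs[j]) + (fs[i] - fs[j]) / (2 * L)
            lb = 0.5 * (fs[i] + fs[j]) - 0.5 * L * (xs[j] - xs[i])
            if lb < best_lb:
                best_lb, best_x = lb, x_new
        xs.append(best_x)
        fs.append(f(best_x))
    k = int(np.argmin(fs))
    return xs[k], fs[k], best_lb   # incumbent, its value, and a global lower bound

# Example: a multiextremal test function on [0, 10]; |f'| <= 2, so L = 2 is valid.
x_best, f_best, lb = piyavskii(lambda x: np.sin(x) + np.sin(3 * x) / 3, 0.0, 10.0, L=2.0)
print(x_best, f_best, lb)
```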
Piyavskii’s algorithm can be generalized to the multivariate case. As in the uni-
variate case, an underestimate of the objective is built and updated by adding eval-
uation points corresponding to the global minima of the lower-bounding function.
Several algorithms for computing these global minima have been proposed in the lit-
erature [1340, 1495].
Alternatively, a multivariate Lipschitz optimization problem can be solved by re-
duction to a sequence of univariate problems using nested optimization [904, 1494].
Another way to reduce a multidimensional Lipschitz optimization problem to a uni-
variate one is based on the concept of space-filling curves. A space-filling curve or Peano
curve is a curve passing through every point of a multidimensional region. Such curves
have been studied by mathematicians since the end of the 19th century. To exploit
them in a global optimization context, we first use a space-filling curve to transform the multidimensional problem $\min\{f(x) : x \in [0,1]^n\}$ into the equivalent univariate problem $\min\{f(x(t)) : t \in [0,1]\}$, where $x(t)$ is a Peano curve. A one-dimensional global optimization method is then applied to the resulting univariate problem. This methodological direction has been explored in depth
for Lipschitz-continuous problems, as discussed in the monographs [1661,1729,1730].
The proposed techniques utilize the fact that if f (x) is Lipschitz continuous with a
constant L on [0, 1]n , then the one-dimensional function f (x(t )), t ∈ [0, 1], satisfies
the following Hölder condition:
$$|f(x(t')) - f(x(t''))| \le 2L\sqrt{n+3}\, |t' - t''|^{1/n}, \qquad t', t'' \in [0,1]. \qquad (13.29)$$
Strongin and Sergeyev [1730] provide a detailed account of the corresponding method-
ology. A more recent book by Sergeyev et al. [1661] gives a brief introduction to the
topic and discusses the most recent related developments.
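To give a flavor of the reduction, the sketch below maps indices along a two-dimensional Hilbert curve (a standard bit-manipulation construction) to grid points of $[0,1]^2$ and scans a test function along the curve; a practical method would instead apply a univariate Lipschitz algorithm exploiting (13.29) to $t \mapsto f(x(t))$. The test function and grid resolution are illustrative choices:

```python
import numpy as np

def d2xy(n, d):
    """Map index d in [0, n*n) to integer coordinates (x, y) on the order-n
    Hilbert curve (n a power of 2); standard bit-twiddling construction."""
    x = y = 0
    s, t = 1, d
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                       # rotate the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x, y = x + s * rx, y + s * ry
        t //= 4
        s *= 2
    return x, y

f = lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2   # hypothetical test function
n = 2 ** 6                                            # 64 x 64 grid
best = min(f(np.array(d2xy(n, d), dtype=float) / (n - 1)) for d in range(n * n))
print(best)   # approaches the true minimum 0 as n grows
```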
In addition to the approaches mentioned above, several branch-and-bound algo-
rithms have been developed for multivariate Lipschitz optimization [681, 904, 1310,
1311, 1461, 1487]. Typically such algorithms utilize partitions of the feasible region
into hyperrectangles; the objective function f is evaluated at the center of the current
hyperrectangle to get the corresponding upper bound; branching rules are based on
splitting the current hyperrectangle along its longest edge into several equal hyperrect-
angles, and the lower bound is based on the Lipschitz constant.
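A minimal sketch of this scheme follows (the Lipschitz constant, box, tolerance, and test function are illustrative assumptions):

```python
import heapq
import numpy as np

def lipschitz_bb(f, lo, hi, L, tol=1e-3, max_nodes=200000):
    """Minimize f over the box [lo, hi], assuming f is Lipschitz with constant L.
    Branch-and-bound sketch: evaluate the center of each box (upper bound), bound
    below by f(center) - L * half-diagonal, and bisect the longest edge."""
    def make_node(blo, bhi):
        center = 0.5 * (blo + bhi)
        f_center = f(center)                                  # an upper bound
        lb = f_center - 0.5 * L * np.linalg.norm(bhi - blo)   # Lipschitz lower bound
        return (lb, f_center, blo.tolist(), bhi.tolist())

    root = make_node(np.asarray(lo, float), np.asarray(hi, float))
    heap, ub = [root], root[1]
    for _ in range(max_nodes):
        if not heap:
            break
        lb, f_center, blo, bhi = heapq.heappop(heap)          # most promising box
        if ub - lb < tol:                                     # optimality gap closed
            return ub, lb
        blo, bhi = np.array(blo), np.array(bhi)
        i = int(np.argmax(bhi - blo))                         # longest edge
        mid = 0.5 * (blo[i] + bhi[i])
        for a, b in ((blo[i], mid), (mid, bhi[i])):
            clo, chi = blo.copy(), bhi.copy()
            clo[i], chi[i] = a, b
            child = make_node(clo, chi)
            ub = min(ub, child[1])
            if child[0] < ub - tol:                           # prune dominated boxes
                heapq.heappush(heap, child)
    return ub, lb

# Example: the function from the Piyavskii sketch, extended to a 2-D box.
g = lambda p: np.sin(p[0]) + np.sin(3 * p[0]) / 3 + (p[1] - 0.5) ** 2
print(lipschitz_bb(g, [0.0, 0.0], [10.0, 1.0], L=3.0))
```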
Despite their theoretical attractiveness, the early Lipschitz optimization methods
suffered from several drawbacks in practice, such as the need to specify the Lipschitz
constant, low speed of convergence, and high computational complexity in higher di-
mensions. In particular, their slow convergence can be explained by the high emphasis
placed on global search, dictated by the large values of Lipschitz constants typically
used to ensure a valid upper bound on the rate of change of the objective function.
This motivated more recent approaches that take advantage of an alternative role given
to the Lipschitz constant. One such method, DIRECT [1011, 1012], interprets the Lip-
schitz constant as a weighting parameter used in balancing between global and local
search. Due to its simplicity and effectiveness, DIRECT and its variations have attracted
considerable attention in the literature [711, 775, 1210, 1460].
In the last two decades, we have witnessed significant progress in the development
of global optimization software. What used to be “the road less traveled” 20 years
ago [1444] is becoming an essential feature of state-of-the-art optimization software.
Most modern optimization modeling systems and solvers include global optimiza-
tion functionalities; see [1486] for a survey. Prominent global optimization solvers
include αBB [28, 85], BARON (Branch and Reduce Optimization Navigator) [1611,
1754,1755], GloMIQO (Global Mixed Integer Quadratic Optimizer) [1331], and LGO
(Lipschitz Global Optimizer) [1488], to name a few.
Some standard testbeds and problem generators have also been developed that can
be used to evaluate the performance of computer implementations of global optimiza-
tion algorithms. For example, the book by Floudas et al. [742] contains a large col-
lection of various types of test instances for global optimization, including QO prob-
lems, minimum concave cost transportation problems, quadratically constrained prob-
lems, and polynomial problems. Gaviano et al. [790] propose a method for generating
test functions with user-specified properties for global optimization. Websites [1339,
1399] maintaining lists of available global optimization software provide download
links to numerous open-source packages. Rios and Sahinidis [1569] review algorithms
and compare 22 derivative-free optimization solvers on 502 test problems; their prob-
lems and computational results are available at http://archimedes.cheme.cmu.edu/?q=dfocomp.