I. INTRODUCTION
Convex optimization can be described as a fusion of three disciplines: optimization [22], [20], [1], [3], [4], convex analysis [19], [24], [27], [16], [13], and numerical computation [26], [12], [10], [17]. It has recently become a tool of central importance in engineering, enabling the reliable and efficient solution of very large practical problems. In some sense, convex optimization provides new, indispensable computational tools that naturally extend our ability to solve problems such as least squares and linear programming to a much larger and richer class of problems.
Our ability to solve these new types of problems comes from recent breakthroughs in algorithms for solving convex optimization problems [18], [23], [29], [30], coupled with dramatic improvements in computing power, both of which have happened only in the past decade or so. Today, new applications of convex optimization are constantly being reported from almost every area of engineering, including control, signal processing, networks, circuit design, communications, information theory, computer science, operations research, economics, statistics, and structural design. See [7], [2], [5], [6], [9], [11], [15], [8], [21], [14], [28] and the references therein.
The objectives of this tutorial are:
1) to show that there are straightforward, systematic rules and facts which, when mastered, allow one to quickly deduce the convexity (and hence tractability) of a problem.

Throughout, we will be concerned with optimization problems in the standard form

    minimize    f0(x)
    subject to  fi(x) ≤ 0,  i = 1, . . . , m
                hi(x) = 0,  i = 1, . . . , p.        (1)
Recall that S ⊆ R^n is a subspace if it contains the plane through any two of its points and the origin, i.e.,

    x, y ∈ S, λ, μ ∈ R ⟹ λx + μy ∈ S.

Two common representations of a subspace are as the range of a matrix,

    range(A) = {Aw | w ∈ R^q} = {w1 a1 + · · · + wq aq | wi ∈ R},

where A = [a1 · · · aq]; or as the nullspace of a matrix,

    nullspace(B) = {x | Bx = 0} = {x | b1^T x = 0, . . . , bp^T x = 0},

where B = [b1 · · · bp]^T. As an example, let S^n = {X ∈ R^{n×n} | X = X^T} denote the set of symmetric matrices. Then S^n is a subspace, since symmetric matrices are closed under addition and scalar multiplication. Another way to see this is to note that S^n can be written as {X ∈ R^{n×n} | Xij = Xji for all i, j}, which is the nullspace of the linear function X − X^T.

A set S ⊆ R^n is affine if it contains the line through any two points in it, i.e.,

    x, y ∈ S, λ, μ ∈ R, λ + μ = 1 ⟹ λx + μy ∈ S.

An affine set can be represented as the solution set of a system of linear equations,

    S = {x | b1^T x = d1, . . . , bp^T x = dp} = {x | Bx = d}.

An affine function has the form

    F(x) = A0 + x1 A1 + · · · + xn An,

where Ai ∈ R^{p×q}. Affine functions are sometimes loosely referred to as linear.
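As a quick numerical illustration (added here; not part of the original tutorial), the sketch below builds both representations of a subspace with NumPy/SciPy. The matrices A and B are arbitrary example data.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)

# A subspace of R^5 given in range form: S = {A w | w in R^2}.
A = rng.standard_normal((5, 2))

# The same subspace in nullspace form: S = {x | B x = 0}, where the
# rows of B span the orthogonal complement of range(A).
B = null_space(A.T).T           # B @ A == 0 by construction

x = A @ rng.standard_normal(2)  # an arbitrary element of S
print(np.allclose(B @ x, 0))    # True: x satisfies the equality form
```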
A set S ⊆ R^n is a convex cone if it contains all nonnegative combinations of its points, i.e.,

    x, y ∈ S, λ, μ ≥ 0 ⟹ λx + μy ∈ S.

Geometrically, x, y ∈ S means that S contains the entire "pie slice" between x and y.

(Figure: the hyperplane {x | a^T x = b} and the halfspace {x | a^T x ≤ b}, two basic examples of convex sets.)
We now come to a very important fact about properties which are preserved under intersection: let A be an arbitrary index set (possibly uncountably infinite) and {Sα | α ∈ A} a collection of sets; then we have the following:

    Sα is a subspace     ⟹  ∩_{α∈A} Sα is a subspace
    Sα is affine         ⟹  ∩_{α∈A} Sα is affine
    Sα is convex         ⟹  ∩_{α∈A} Sα is convex
    Sα is a convex cone  ⟹  ∩_{α∈A} Sα is a convex cone

(A convex cone K ⊆ R^n also induces a generalized inequality: x ⪯_K y ⟺ y − x ∈ K, and x ≺_K y ⟺ y − x ∈ int K.)
As an example, consider the set of positive semidefinite matrices,

    S^n_+ = { X ∈ S^n | z^T X z = Σ_{i,j=1}^n zi zj Xij ≥ 0 for all z ∈ R^n }.

Now observe that, for each fixed z, the summation above is actually linear in the components of X, so S^n_+ is the infinite intersection of halfspaces containing the origin (which are convex cones) in S^n, and hence is itself a convex cone.
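A small numerical check (added for illustration; not in the original text): membership of X in S^n_+ can be tested through eigenvalues, or by sampling the defining halfspaces.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# Build a positive semidefinite X = G^T G (an assumption for the demo).
G = rng.standard_normal((n, n))
X = G.T @ G

# Test 1: a symmetric X is PSD iff all its eigenvalues are >= 0.
print(np.all(np.linalg.eigvalsh(X) >= -1e-12))               # True

# Test 2: sample the halfspace description {X | z^T X z >= 0}.
zs = rng.standard_normal((1000, n))
print(np.all(np.einsum('ki,ij,kj->k', zs, X, zs) >= -1e-9))  # True
```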
We continue with our listing of useful convex sets. A polyhedron is the intersection of a finite number of halfspaces:

    P = {x | ai^T x ≤ bi, i = 1, . . . , k} = {x | Ax ⪯ b},

where ⪯ denotes componentwise inequality.
(Figure: a polyhedron P, the intersection of five halfspaces with outward normals a1, . . . , a5.)

Another family of convex sets are the norm balls. For p ≥ 1, the ℓp norm

    ||x||_p = ( Σ_{i=1}^n |xi|^p )^{1/p}   (with ||x||_∞ = max_i |xi|)

defines the convex unit ball {x | ||x||_p ≤ 1}.

(Figure: unit balls {x | ||x||_p ≤ 1} in R² for p = 0.5, 1, 1.5, 2, 3, and ∞; for p < 1 the ball is not convex.)

A related convex set is the second-order cone

    S = {(x, t) | ||x||_2 ≤ t} = { (x, t) | (x, t)^T [ I 0; 0 −1 ] (x, t) ≤ 0, t ≥ 0 }.
The image (or preimage) under an affine transformation of a convex set is convex, i.e., if S and T are convex, then so are the sets

    f^{-1}(S) = {x | Ax + b ∈ S},    f(T) = {Ax + b | x ∈ T}.
An ellipsoid is another common convex set:

    E = {x | (x − xc)^T A^{-1} (x − xc) ≤ 1},

where the parameters are A = A^T ≻ 0 and the center xc ∈ R^n.
Finally, the solution set of a linear matrix inequality (LMI),

    F(x) = A0 + x1 A1 + · · · + xm Am ⪰ 0,

with symmetric Ai, is convex.
A. Convex functions

A function f : R^n → R is convex if its domain dom f is convex and for all x, y ∈ dom f, λ ∈ [0, 1],

    f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y);

f is concave if −f is convex.
(Figure: example functions that are convex, concave, and neither, with sample values f(x1), . . . , f(x4); and the epigraph epi f of a convex function.)
Two sets associated with f are its epigraph, epi f = {(x, t) | x ∈ dom f, f(x) ≤ t}, and its α-sublevel sets,

    Sα = {x ∈ dom f | f(x) ≤ α}.

From the basic definition of convexity, it follows that f is a convex function if and only if its epigraph epi f is a convex set. It also follows that if f is convex, then its sublevel sets Sα are convex (the converse is false; see quasiconvex functions later).
The convexity of a differentiable function f : R^n → R can also be characterized by conditions on its gradient ∇f and Hessian ∇²f. Recall that, in general, the gradient yields a first-order Taylor approximation at x0:

    f(x) ≈ f(x0) + ∇f(x0)^T (x − x0).

It turns out that f is convex if and only if this first-order approximation is a global underestimator, i.e., for all x, x0 ∈ dom f,

    f(x) ≥ f(x0) + ∇f(x0)^T (x − x0).

In terms of the Hessian, f is convex if and only if ∇²f(x) ⪰ 0 for all x ∈ dom f.

(Figures: the first-order approximation supporting the graph of f at x0, and the epigraph of the pointwise maximum max{f1, f2}.)
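The first-order condition is easy to check numerically. The sketch below (an added illustration, not from the original text) verifies it for the log-sum-exp function, whose gradient is the softmax of x.

```python
import numpy as np
from scipy.special import logsumexp

# Check the first-order condition for the (convex) log-sum-exp function.
f = lambda x: logsumexp(x)
grad = lambda x: np.exp(x - logsumexp(x))   # the softmax of x

rng = np.random.default_rng(2)
x0 = rng.standard_normal(5)
ok = all(
    f(x) >= f(x0) + grad(x0) @ (x - x0) - 1e-12
    for x in rng.standard_normal((1000, 5))
)
print(ok)  # True: the tangent plane at x0 underestimates f everywhere
```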
The following are some examples of convex functions:
- powers: x^α is convex on R++ for α ≥ 1;
- x log x is convex on R+;
- the log-sum-exp function f(x) = log Σi e^{xi} (tricky!);
- affine functions f(x) = a^T x + b, where a ∈ R^n, b ∈ R, which are both convex and concave since ∇²f ≡ 0;
- quadratic functions f(x) = x^T P x + 2q^T x + r (P = P^T), whose Hessian ∇²f(x) = 2P, are convex ⟺ P ⪰ 0 (concave ⟺ P ⪯ 0).
Convexity can also be expressed through Jensen's inequality, in several forms:
- two points: λ1 + λ2 = 1, λi ≥ 0 ⟹ f(λ1 x1 + λ2 x2) ≤ λ1 f(x1) + λ2 f(x2);
- more than two points: Σi λi = 1, λi ≥ 0 ⟹ f(Σi λi xi) ≤ Σi λi f(xi);
- continuous version: p(x) ≥ 0, ∫ p(x) dx = 1 ⟹ f(∫ x p(x) dx) ≤ ∫ f(x) p(x) dx;
or, more generally, f(E x) ≤ E f(x).

Another instructive example is the quadratic-over-linear function f(x, y) = x²/y, with dom f = R × R++, which is convex since its Hessian is positive semidefinite on its domain (∇²f ⪰ 0).
B. Quasiconvex functions

The next best thing to a convex function in optimization is a quasiconvex function, since it can be minimized in a few iterations, each involving the solution of a convex feasibility problem (see the bisection sketch below). A function f : R^n → R is quasiconvex if every sublevel set Sα = {x ∈ dom f | f(x) ≤ α} is convex. Note that if f is convex, then it is automatically quasiconvex. Quasiconvex functions can have locally flat regions.
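The bisection idea can be sketched as follows. This code is an illustration added to the tutorial text, using the CVXPY package (not referenced in the original); it minimizes a linear-fractional function f(x) = (a^T x + b)/(c^T x + d) over a box by solving one convex feasibility problem per iteration. All problem data, and the initial bracket on the optimal value, are arbitrary assumptions.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
n = 4
a, c = rng.standard_normal(n), rng.standard_normal(n)
b, d = 1.0, 5.0

x = cp.Variable(n)
lo, hi = -10.0, 10.0        # assumed bracket on the optimal value
for _ in range(30):         # bisection on t: is {f(x) <= t} feasible?
    t = (lo + hi) / 2
    feas = cp.Problem(
        cp.Minimize(0),
        [cp.norm(x, 'inf') <= 1,                # feasible set: a box
         c @ x + d >= 1e-6,                     # stay in f's domain
         a @ x + b <= t * (c @ x + d)])         # f(x) <= t, linearized
    feas.solve()
    if feas.status == cp.OPTIMAL:
        hi = t              # t is achievable: shrink from above
    else:
        lo = t              # infeasible: the optimal value exceeds t
print("optimal value ~", hi)
```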
Examples of quasiconvex functions:
- the linear-fractional function

    f(x) = (a^T x + b)/(c^T x + d)

is quasilinear (quasiconvex and quasiconcave) on the halfspace c^T x + d > 0;
- the distance ratio

    f(x) = ||x − a||_2 / ||x − b||_2

is quasiconvex on the halfspace {x | ||x − a||_2 ≤ ||x − b||_2}.
(Figure: sublevel sets Sα1 ⊆ Sα2 ⊆ Sα3 of a quasiconvex function, for α1 < α2 < α3; each sublevel set is convex.)
Quasiconvexity is preserved by a number of operations:
- positive multiples: f quasiconvex, α ≥ 0 ⟹ αf quasiconvex;
- pointwise supremum: f1, f2 quasiconvex ⟹ max{f1, f2} quasiconvex (this extends to the supremum over an arbitrary set);
- affine transformation of domain: f quasiconvex ⟹ f(Ax + b) quasiconvex;
- linear-fractional transformation of domain: f quasiconvex ⟹ f((Ax + b)/(c^T x + d)) quasiconvex on c^T x + d > 0;
- composition with a monotone increasing function: f quasiconvex, g monotone increasing ⟹ g(f(x)) quasiconvex;
- sums of quasiconvex functions are not quasiconvex in general;
- partial minimization: f quasiconvex in (x, y) ⟹ g(x) = inf_y f(x, y) quasiconvex in x (the projection of the epigraph remains quasiconvex).
An optimization problem in standard form is

    minimize    f0(x)
    subject to  fi(x) ≤ 0,  i = 1, . . . , m
                hi(x) = 0,  i = 1, . . . , p.

A closely related task is the feasibility problem: find an x that satisfies all the constraints, or determine that the constraints are inconsistent. Without further structure, such problems can be extremely hard; for example, the Boolean feasibility problem

    find x  such that  Ax ⪯ b,  xi ∈ {0, 1},

is in general intractable.

An optimization problem in standard form is a convex optimization problem if f0, f1, . . . , fm are all convex and the hi are all affine:

    minimize    f0(x)
    subject to  fi(x) ≤ 0,  i = 1, . . . , m
                ai^T x − bi = 0,  i = 1, . . . , p.

Note that convexity is a property of the problem description, not just of the feasible set: the problem

    minimize    x1 + x2
    subject to  −x1 ≤ 0
                −x2 ≤ 0
                1 − x1 x2 ≤ 0

has a convex feasible set, but it is not a convex problem in standard form, since the constraint function 1 − x1 x2 is not convex.

Let C denote the feasible set. A point x ∈ C is locally optimal if there is an R > 0 such that

    y ∈ C, ||y − x||_2 ≤ R ⟹ f0(y) ≥ f0(x).

A point x ∈ C is globally optimal means that

    y ∈ C ⟹ f0(y) ≥ f0(x).

For convex optimization problems, any local solution is also global. [Proof sketch: suppose x is locally optimal, but that there is a y ∈ C with f0(y) < f0(x). Then we may take a small step from x towards y, i.e., z = θy + (1 − θ)x with θ > 0 small. By convexity of C, z is feasible; z is near x, with f0(z) ≤ θf0(y) + (1 − θ)f0(x) < f0(x), which contradicts local optimality.]

There is also a first-order condition that characterizes optimality in convex optimization problems. Suppose f0 is differentiable. Then x ∈ C is optimal iff
    y ∈ C ⟹ ∇f0(x)^T (y − x) ≥ 0.

So ∇f0(x) defines a supporting hyperplane for C at x. This means that if we move from x towards any other feasible y, f0 does not decrease (to first order).
(Figure: contour lines of f0 and the gradient ∇f0(x) at an optimal point x on the boundary of the feasible set C.)
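As an added illustration (CVXPY-based, not part of the original text), one can solve a small convex problem and verify the first-order condition numerically. The data A, b and the nonnegativity constraint are arbitrary choices.

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(4)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)

# minimize ||Ax - b||_2^2 subject to x >= 0  (a convex problem)
x = cp.Variable(5)
cp.Problem(cp.Minimize(cp.sum_squares(A @ x - b)), [x >= 0]).solve()

xstar = x.value
g = 2 * A.T @ (A @ xstar - b)          # gradient of f0 at the solution

# First-order condition: grad f0(x*)^T (y - x*) >= 0 for all feasible y.
ys = np.abs(rng.standard_normal((1000, 5)))   # random feasible points
print(np.all(ys @ g - g @ xstar >= -1e-6))    # True
```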
Many standard convex optimization algorithms assume a linear objective function. So it is important to realize that any convex optimization problem can be converted to one with a linear objective function. This is called putting the problem into epigraph form. It involves rewriting the problem as

    minimize    t
    subject to  f0(x) − t ≤ 0
                fi(x) ≤ 0,  i = 1, . . . , m
                hi(x) = 0,  i = 1, . . . , p,

with variables (x, t): minimizing t subject to f0(x) ≤ t is equivalent to minimizing f0(x).
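A minimal sketch of the transformation (an added illustration; the quadratic objective and the equality constraint are arbitrary stand-ins):

```python
import cvxpy as cp
import numpy as np

x, t = cp.Variable(2), cp.Variable()
f0 = cp.sum_squares(x - np.array([1.0, 2.0]))   # an example convex objective

direct = cp.Problem(cp.Minimize(f0), [cp.sum(x) == 1])
epi = cp.Problem(cp.Minimize(t), [f0 - t <= 0, cp.sum(x) == 1])
print(direct.solve(), epi.solve())   # same optimal value, linear objective
```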
There is a hierarchy of canonical convex problem families, including linear programs (LPs), quadratic programs (QPs), quadratically constrained quadratic programs (QCQPs) of the form

    minimize    c^T x
    subject to  x^T Pi x + qi^T x + ri ≤ 0,  i = 1, . . . , K   (Pi = Pi^T ⪰ 0),

second-order cone programs (SOCPs), and semidefinite programs (SDPs). Each family subsumes the previous ones, but the added level of modeling capability is at the cost of longer computation time. Thus one should use the least complex form for the problem at hand.
A general linear program (LP) has the form

    minimize    c^T x + d
    subject to  Gx ⪯ h
                Ax = b.
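For concreteness, a tiny LP in this form, solved with CVXPY (an added illustration; the data are arbitrary):

```python
import cvxpy as cp
import numpy as np

c = np.array([1.0, 2.0]); d = 0.0
G = np.array([[-1.0, 0.0], [0.0, -1.0], [1.0, 1.0]])
h = np.array([0.0, 0.0, 1.0])          # x >= 0 and x1 + x2 <= 1
A = np.array([[1.0, -1.0]]); b = np.array([0.25])

x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(c @ x + d), [G @ x <= h, A @ x == b])
prob.solve()
print(prob.value, x.value)             # optimal value and minimizer
```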
A second-order cone program (SOCP) has the form

    minimize    f^T x
    subject to  ||Ai x + bi||_2 ≤ ci^T x + di,  i = 1, . . . , m
                Fx = g.
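A sketch of one such constraint in CVXPY (an added illustration; the data are chosen so the constraint set is bounded):

```python
import cvxpy as cp
import numpy as np

# minimize f^T x subject to ||A1 x + b1||_2 <= c1^T x + d1
A1, b1 = np.eye(2), np.zeros(2)
c1, d1 = np.array([0.3, 0.0]), 1.0
f = np.array([1.0, 1.0])

x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(f @ x),
                  [cp.norm(A1 @ x + b1, 2) <= c1 @ x + d1])
prob.solve()
print(prob.value, x.value)   # ||x|| <= 0.3*x1 + 1 is a bounded region
```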
More generally, a convex problem with generalized inequality constraints has the form

    minimize    f0(x)
    subject to  fi(x) ⪯_{Ki} 0,  i = 1, . . . , L
                Ax = b,

where f0 : R^n → R is convex; the Ki are generalized inequalities on R^{mi}; and the fi : R^n → R^{mi} are Ki-convex.
A quasiconvex optimization problem is exactly the same as a convex optimization problem, except for one difference: the objective function f0 is quasiconvex.
We will see an important example of quasiconvex
optimization in the next section: linear-fractional programming.
A semidefinite program (SDP) has the form

    minimize    c^T x
    subject to  x1 F1 + · · · + xn Fn + G ⪯ 0
                Ax = b,

where G, F1, . . . , Fn ∈ S^k, and A ∈ R^{p×n}.
inequality here is a linear matrix inequality. As shown
earlier, since SOCP constraints can be written as LMIs,
SDPs subsume SOCPs, and hence LPs as well. (If
there are multiple LMIs, they can be stacked into one
large block diagonal LMI, where the blocks are the
individual LMIs.)
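A small SDP in this form, written with CVXPY (added for illustration; the symmetric matrices below are arbitrary):

```python
import cvxpy as cp
import numpy as np

# minimize c^T x subject to x1*F1 + x2*F2 + G <= 0 in the matrix sense
F1 = np.array([[1.0, 0.0], [0.0, -1.0]])
F2 = np.array([[0.0, 1.0], [1.0, 0.0]])
G  = np.array([[-1.0, 0.0], [0.0, -1.0]])
c  = np.array([1.0, 1.0])

x = cp.Variable(2)
lmi = x[0] * F1 + x[1] * F2 + G << 0     # CVXPY's matrix inequality
prob = cp.Problem(cp.Minimize(c @ x), [lmi])
prob.solve()
print(prob.value, x.value)   # here the LMI carves out the unit disk
```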
We will now consider some examples which are
themselves very instructive demonstrations of modeling
and the power of convex optimization.
First, we show how slack variables can be used to convert a given problem into canonical form. Consider the constrained ℓ∞ (Chebychev) approximation problem

    minimize    ||Ax − b||_∞
    subject to  Fx ⪯ g.
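The standard reformulation introduces a scalar slack t bounding every residual, giving the LP: minimize t subject to −t·1 ⪯ Ax − b ⪯ t·1, Fx ⪯ g. A CVXPY sketch of both forms (an added illustration; the data are arbitrary):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(6)
A, b = rng.standard_normal((30, 5)), rng.standard_normal(30)
F, g = np.vstack([np.eye(5), -np.eye(5)]), np.ones(10)  # box |x_i| <= 1

x, t = cp.Variable(5), cp.Variable()

# LP form with the slack variable t bounding every residual.
lp = cp.Problem(cp.Minimize(t),
                [A @ x - b <= t, A @ x - b >= -t, F @ x <= g])
lp.solve()

# Direct form, for comparison; the optimal values agree.
direct = cp.Problem(cp.Minimize(cp.norm(A @ x - b, 'inf')), [F @ x <= g])
print(lp.value, direct.solve())
```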
A generalization is the generalized linear-fractional program (GLFP):

    minimize    max_{i=1,...,r}  (ci^T x + di) / (ei^T x + fi),

with domain requiring ei^T x + fi > 0 for all i; the objective is quasiconvex, so the problem can be solved by bisection.

Another instructive example is a linear program with random constraint vectors, where we require each constraint ai^T x ≤ bi to hold with probability (confidence) at least η:

    Prob(ai^T x ≤ bi) ≥ η,  i = 1, . . . , m.

Suppose ai is Gaussian with mean āi and covariance Σi. Then ai^T x is a Gaussian random variable with mean u = āi^T x and variance σ² = x^T Σi x, so with Φ denoting the standard normal distribution function the constraint can be written

    Φ((bi − u)/σ) ≥ η,

or, equivalently,

    u + Φ^{-1}(η) σ ≤ bi.

From u = āi^T x and σ = (x^T Σi x)^{1/2} we obtain

    āi^T x + Φ^{-1}(η) ||Σi^{1/2} x||_2 ≤ bi.

Thus, for η ≥ 1/2 (so that Φ^{-1}(η) ≥ 0), the problem becomes the SOCP

    minimize    c^T x
    subject to  āi^T x + Φ^{-1}(η) ||Σi^{1/2} x||_2 ≤ bi,  i = 1, . . . , m.
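A sketch of one such probabilistic constraint in CVXPY, using SciPy for Φ^{-1} (an added illustration; ā, Σ, b, η, and the objective are arbitrary assumptions):

```python
import cvxpy as cp
import numpy as np
from scipy.stats import norm
from scipy.linalg import sqrtm

abar = np.array([1.0, 2.0])
Sigma = np.array([[0.5, 0.1], [0.1, 0.3]])
b1, eta = 10.0, 0.95
S = np.real(sqrtm(Sigma))               # Sigma^{1/2}
c = np.array([-1.0, -1.0])              # i.e., maximize x1 + x2

x = cp.Variable(2)
chance = abar @ x + norm.ppf(eta) * cp.norm(S @ x, 2) <= b1
prob = cp.Problem(cp.Minimize(c @ x), [chance, x >= 0])
prob.solve()
print(x.value)   # a^T x <= b1 then holds with probability >= 95%
```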
The next example is a power allocation problem for a wireless network. (Figure: transmitter i and receiver (i, j).) Transmitter i broadcasts at power level pi; the path gain from transmitter k to receiver (i, j) is Aijk, so the received signal power is Sij = Aiji pi, and the interference-plus-noise power is Iij = Σ_{k≠i} Aijk pk + Nij. Therefore the signal to interference/noise ratio (SINR) is Sij/Iij. Maximizing the worst-case SINR subject to power limits gives

    maximize    min_{i,j}  Aiji pi / ( Σ_{k≠i} Aijk pk + Nij )
    subject to  0 ≤ pi ≤ pmax.

This is a GLFP.
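Bisection applies here as well: for a fixed target t, the constraints Aiji pi ≥ t (Σ_{k≠i} Aijk pk + Nij) are linear in p. A compact sketch (added here; it assumes one receiver per transmitter and arbitrary gains, a simplification of the setup above):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(7)
n, pmax = 3, 1.0
Gain = 0.1 * rng.random((n, n)) + np.eye(n)   # Gain[i, k], strong diagonal
N = 0.05 * np.ones(n)

p = cp.Variable(n)
lo, hi = 0.0, 10.0
for _ in range(40):                     # bisection on the SINR target t
    t = (lo + hi) / 2
    interf = Gain @ p - cp.multiply(np.diag(Gain), p) + N  # sum over k != i
    feas = cp.Problem(cp.Minimize(0),
                      [cp.multiply(np.diag(Gain), p) >= t * interf,
                       p >= 0, p <= pmax])
    feas.solve()
    lo, hi = (t, hi) if feas.status == cp.OPTIMAL else (lo, t)
print("best worst-case SINR ~", lo)
```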
Next we consider two related ellipsoid approximation problems. In addition to the use of LMI constraints, this example illustrates techniques for computing with ellipsoids, and also how the choice of representation of the ellipsoids and polyhedra can make the difference between a tractable convex problem and a hard, intractable one.
First consider computing the minimum volume ellipsoid around points v1, . . . , vK ∈ R^n. This is equivalent to finding the minimum volume ellipsoid around the polytope defined by the convex hull of those points, represented in vertex form. (Figure: an ellipsoid E covering a polytope P.) Parametrizing the ellipsoid as E = {y | ||Ay + d||_2 ≤ 1} with A ≻ 0, the volume of E is proportional to det A^{-1}, and the problem reduces to the convex problem of minimizing log det A^{-1} subject to ||Avi + d||_2 ≤ 1, i = 1, . . . , K.

The related problem of finding the maximum volume ellipsoid E = {By + d | ||y||_2 ≤ 1} contained in a polyhedron given in halfspace form, P = {x | ai^T x ≤ bi, i = 1, . . . , L}, reduces to maximizing log det B subject to

    ||B ai||_2 + ai^T d ≤ bi,  i = 1, . . . , L.
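A CVXPY sketch of the covering problem (an added illustration; the points are random):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(8)
pts = rng.standard_normal((20, 2))      # points v1, ..., vK in R^2

A = cp.Variable((2, 2), PSD=True)       # ellipsoid {y : ||A y + d|| <= 1}
d = cp.Variable(2)
constraints = [cp.norm(A @ v + d, 2) <= 1 for v in pts]

# Minimizing log det A^{-1} is the same as maximizing log det A.
prob = cp.Problem(cp.Maximize(cp.log_det(A)), constraints)
prob.solve()
print(A.value, d.value)
```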
C. Geometric programming
A function of the form

    f(x) = Σ_{k=1}^K ck x1^{a1k} x2^{a2k} · · · xn^{ank},

with ck ≥ 0, is called a posynomial. A geometric program (GP) is a problem of minimizing a posynomial subject to posynomial inequality constraints and monomial equality constraints, with the implicit constraint x ≻ 0:

    minimize    f0(x)
    subject to  fi(x) ≤ 1,  i = 1, . . . , m
                hi(x) = 1,  i = 1, . . . , p.

A GP is not convex in its natural variables, but it becomes convex after a logarithmic change of variables. Let yi = log xi, so xi = e^{yi}; then, with ak = (a1k, . . . , ank) and bk = log ck,

    f(e^{y1}, . . . , e^{yn}) = Σ_{k=1}^K e^{ak^T y + bk}.

Taking logs, the GP is transformed into the convex problem

    minimize    log Σ_{k=1}^K e^{a0k^T y + b0k}
    subject to  fi(y) = log Σ_{k=1}^K e^{aik^T y + bik} ≤ 0,  i = 1, . . . , m
                hi(y) = gi^T y + hi = 0,  i = 1, . . . , p,

since the log-sum-exp function is convex.
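The transformation is easy to express with CVXPY's log_sum_exp atom (an added sketch; the exponent data below are an arbitrary toy instance):

```python
import cvxpy as cp
import numpy as np

# Convex form of a toy GP: minimize log sum_k exp(a_k^T y + b_k).
A0 = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # rows are a_k^T
b0 = np.log(np.array([1.0, 1.0, 1.0]))                 # b_k = log c_k

y = cp.Variable(2)
prob = cp.Problem(cp.Minimize(cp.log_sum_exp(A0 @ y + b0)))
prob.solve()
x = np.exp(y.value)        # recover the original GP variables x = e^y
print(x)                   # the minimizer of the posynomial
```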
VII. CONCLUSION
In this paper we have reviewed some essentials of
convex sets, functions and optimization problems. We
hope that the reader is convinced that convex optimization dramatically increases the range of problems in
engineering that can be modeled and solved exactly.
ACKNOWLEDGEMENT
I am deeply indebted to Stephen Boyd and Lieven
Vandenberghe for generously allowing me to use material from [7] in preparing this tutorial.
REFERENCES
[1] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear
Programming. Theory and Algorithms. John Wiley & Sons,
second edition, 1993.
[2] A. Ben-Tal and A. Nemirovski. Lectures on Modern Convex Optimization. Analysis, Algorithms, and Engineering Applications.
Society for Industrial and Applied Mathematics, 2001.
[3] D. P. Bertsekas. Nonlinear Programming. Athena Scientific,
1999.
[4] D. Bertsimas and J. N. Tsitsiklis. Introduction to Linear
Optimization. Athena Scientific, 1997.
[5] S. Boyd and C. Barratt. Linear Controller Design: Limits of
Performance. Prentice-Hall, 1991.
[6] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan. Linear
Matrix Inequalities in System and Control Theory. Society for
Industrial and Applied Mathematics, 1994.
[7] S.P. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2003. In Press. Material available at
www.stanford.edu/boyd.
[8] D. Colleran, C. Portmann, A. Hassibi, C. Crusius, S. Mohan, S. Boyd, T. Lee, and M. Hershenson. Optimization of phase-locked loop circuits via geometric programming. In Custom Integrated Circuits Conference (CICC), San Jose, California, September 2003.
[9] M. A. Dahleh and I. J. Diaz-Bobillo. Control of Uncertain
Systems: A Linear Programming Approach. Prentice-Hall, 1995.
[10] J. W. Demmel. Applied Numerical Linear Algebra. Society for
Industrial and Applied Mathematics, 1997.
[11] G. E. Dullerud and F. Paganini. A Course in Robust Control
Theory: A Convex Approach. Springer, 2000.
[12] G. Golub and C. F. Van Loan. Matrix Computations. Johns
Hopkins University Press, second edition, 1989.
[13] M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms
and Combinatorial Optimization. Springer, 1988.
[14] T. Hastie, R. Tibshirani, and J. Friedman. The Elements of
Statistical Learning. Data Mining, Inference, and Prediction.
Springer, 2001.
[15] M. del Mar Hershenson, S. P. Boyd, and T. H. Lee. Optimal
design of a CMOS op-amp via geometric programming. IEEE
Transactions on Computer-Aided Design of Integrated Circuits
and Systems, 20(1):1-21, 2001.