In optimal design problems, values for a set of n design variables, (x1, x2, ···, xn), are
to be found that minimize a scalar-valued objective function of the design variables, such
that a set of m inequality constraints is satisfied. Constrained optimization problems are
generally expressed as

    min_{x1,x2,···,xn}  J = f(x1, x2, ···, xn)   such that   gj(x1, x2, ···, xn) ≤ 0 ,   j = 1, 2, ···, m        (1)
If the objective function is a positive-definite quadratic in the design variables and the
constraint equations are linearly independent, the optimization problem has a unique solution.
Consider the simplest constrained minimization problem:
    min_x  ½kx² ,  k > 0 ,   such that   x ≥ b.        (2)
This problem has a single design variable, the objective function is quadratic (J = ½kx²),
and there is a single constraint inequality, which is linear in x (g(x) = b − x). If g(0) = b > 0,
the unconstrained minimum x = 0 is infeasible, so the constraint binds the optimum and the
optimal solution, x∗, is given by x∗ = b. If b ≤ 0, the constraint does not bind the optimum
and the optimal solution is given by x∗ = 0. Not all optimization problems are so easy,
however, and most require more systematic solution methods. The method of Lagrange
multipliers is one such method, and it will be applied to this simple problem.
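As a quick numerical check of equation (2), here is a minimal sketch using scipy's SLSQP solver; the values k = 2 and b = 1 are illustrative assumptions, not values from the text.

```python
# Numerical check of equation (2): minimize 0.5*k*x^2 such that x >= b.
from scipy.optimize import minimize

k, b = 2.0, 1.0                                     # illustrative values
res = minimize(lambda x: 0.5 * k * x[0] ** 2,       # objective J = (1/2) k x^2
               x0=[2.0],                            # feasible starting guess
               constraints=[{"type": "ineq",        # SLSQP: fun(x) >= 0
                             "fun": lambda x: x[0] - b}],
               method="SLSQP")
print(res.x)  # -> [1.0]; the constraint binds, so x* = b
# With b = -1.0 the constraint is inactive and the solver returns x* = 0.
```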
Lagrange multiplier methods involve the modification of the objective function through
the addition of terms that describe the constraints. The objective function J = f(x) is
augmented by the constraint equations through a set of non-negative multiplicative Lagrange
multipliers, λj ≥ 0. The augmented objective function, JA(x, λ), is a function of the n design
variables and the m Lagrange multipliers,

    JA(x1, x2, ···, xn, λ1, λ2, ···, λm) = f(x1, x2, ···, xn) + Σ_{j=1}^{m} λj gj(x1, x2, ···, xn)        (3)
Case 1: b = 1
If b = 1, the minimum of ½kx² is constrained by the inequality x ≥ b. For this problem the
augmented objective is

    JA(x, λ) = ½kx² + λ(b − x) ,        (4)

and the optimal value of λ should minimize JA(x, λ) at x = b. Figure 1(a) plots JA(x, λ) for a few
non-negative values of λ and Figure 1(b) plots contours of JA(x, λ).
[Figure 1. (a) Augmented cost JA(x) versus the design variable x for several values of λ between 0 and 4, with b = 1; (b) contours of JA(x, λ) over x and λ, with the saddle point marked "∗".]
• JA(x, λ) is independent of λ at x = b.

The saddle point of JA(x, λ) satisfies the stationarity conditions

    ∂JA(x, λ)/∂x = 0   at  x = x∗, λ = λ∗        (5a)
    ∂JA(x, λ)/∂λ = 0   at  x = x∗, λ = λ∗        (5b)

Applying these conditions to equation (4),

    ∂JA(x, λ)/∂x = 0  ⇒  kx∗ − λ∗ = 0  ⇒  λ∗ = kx∗        (6a)
    ∂JA(x, λ)/∂λ = 0  ⇒  −x∗ + b = 0  ⇒  x∗ = b        (6b)
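A small numerical sketch (with illustrative values k = 2 and b = 1) shows this saddle structure: for fixed λ, the minimizer of JA over x is λ/k, which reaches x∗ = b exactly when λ = λ∗ = kb, in agreement with equations (6a) and (6b).

```python
# For fixed lam, J_A(x, lam) = 0.5*k*x^2 + lam*(b - x) is minimized at
# x = lam/k; at the saddle point, x* = b and lam* = k*b.
import numpy as np

k, b = 2.0, 1.0                      # illustrative values
x = np.linspace(0.0, 2.0, 2001)
for lam in (0.0, 1.0, 2.0, 4.0):
    JA = 0.5 * k * x**2 + lam * (b - x)
    print(f"lam = {lam}: argmin_x JA = {x[np.argmin(JA)]:.2f}")
# argmin_x moves from 0.0 (lam = 0) through b = 1.0 (at lam = k*b = 2.0)
```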
This example has a physical interpretation. The objective function J = ½kx² represents
the potential energy in a spring. The minimum potential energy in a spring corresponds to
a stretch of zero (x∗ = 0). The constrained problem:

    min_x  ½kx²   such that   x ≥ 1

means “minimize the potential energy in the spring such that the stretch in the spring is
greater than or equal to 1.” The solution to this problem is to set the stretch in the spring
equal to the smallest allowable value (x∗ = 1). The force applied to the spring in order to
achieve this objective is f = kx∗. This force is the Lagrange multiplier for this problem,
(λ∗ = kx∗).
Case 2: b = −1
If b = −1, the minimum of ½kx² is not constrained by the inequality x ≥ b. The
derivation above would give x∗ = −1, with λ∗ = −k. The negative value of λ∗ indicates
that the constraint does not affect the optimal solution, and λ∗ should therefore be set to
zero. Setting λ∗ = 0, JA(x, λ) is minimized at x∗ = 0. Figure 2(a) plots JA(x, λ) for a few
negative values of λ and Figure 2(b) plots contours of JA(x, λ).
[Figure 2. (a) Augmented cost JA(x) versus the design variable x for several values of λ between 0 and −4, with b = −1; (b) contours of JA(x, λ) over x and λ, with the saddle point marked "∗" at a negative value of λ.]
• JA(x, λ) is independent of λ at x = b,
• the saddle point of JA(x, λ) occurs at a negative value of λ, so ∂JA/∂λ ≠ 0 for any λ ≥ 0,
• the constraint x ≥ −1 does not affect the solution, and is called a non-binding or an inactive constraint,
• the Lagrange multipliers associated with non-binding inequality constraints are negative.
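The sign test of Case 2 is mechanical enough to state in a few lines of code; a sketch with the illustrative value k = 2:

```python
# Case 2 in code: the stationarity conditions (6a), (6b) give the
# candidate lam* = k*b; a negative multiplier signals a non-binding
# constraint, so lam* is set to zero and the unconstrained minimum used.
k, b = 2.0, -1.0                   # illustrative k; Case 2 has b = -1
x_star, lam_star = b, k * b        # candidate from equations (6a), (6b)
if lam_star < 0:                   # constraint x >= b is non-binding
    x_star, lam_star = 0.0, 0.0    # unconstrained minimum of 0.5*k*x^2
print(x_star, lam_star)            # -> 0.0 0.0
```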
Summary
In summary, if the inequality g(x) ≤ 0 constrains the minimum of f(x), then at the
optimum point the augmented objective JA(x, λ) = f(x) + λg(x) is minimized with respect
to x and maximized with respect to λ. The optimization problem of equation (1) may be
written in terms of the augmented objective function (equation (3)),
    ∂JA/∂xk = 0   at  xi = xi∗, λj = λj∗ ,   k = 1, ···, n        (8a)
    ∂JA/∂λk = 0   at  xi = xi∗, λj = λj∗ ,   k = 1, ···, m        (8b)
    λj ≥ 0 ,   j = 1, ···, m        (8c)
These equations define the relations used to derive expressions for, or compute values of, the
optimal design variables x∗i . Lagrange multipliers for inequality constraints, gj (x1 , · · · , xn ) ≤ 0,
are non-negative. If an inequality gj (x1 , · · · , xn ) ≤ 0 constrains the optimum point, the cor-
responding Lagrange multiplier, λj , is positive. If an inequality gj (x1 , · · · , xn ) ≤ 0 does not
constrain the optimum point, the corresponding Lagrange multiplier, λj , is set to zero.
At the optimum, a small change δgj in the constraints changes the optimal cost by

    δf(x∗) = Σ_{j=1}^{m} λj∗ δgj        (9)
The value of the Lagrange multiplier is the sensitivity of the constrained objective to (small)
changes in the constraint, δg. If λj > 0 then the inequality gj(x) ≤ 0 constrains the optimum
point, and a small increase of the constraint gj(x∗) increases the cost. If λj < 0 then
gj(x∗) < 0 and the constraint does not bind the solution; a small change of this constraint
does not affect the cost. So if a Lagrange multiplier associated with an inequality constraint
gj(x) ≤ 0 is computed as a negative value, it is subsequently set to zero. Such constraints
are called non-binding or inactive constraints.
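Equation (9) can be checked by finite differences on the spring problem; the sketch below uses illustrative values k = 2, b = 1 and a small perturbation δg = 1e−4.

```python
# Finite-difference check of equation (9) on min 0.5*k*x^2 s.t. x >= b:
# increasing the constraint g(x) = b - x by dg (i.e., increasing b by dg)
# changes the optimal cost by about lam* * dg, with lam* = k*b.
k, b, dg = 2.0, 1.0, 1e-4             # illustrative values

def constrained_cost(b):
    x_star = max(b, 0.0)              # x* = b if binding, else x* = 0
    return 0.5 * k * x_star ** 2

lam_star = k * b                      # from equation (6a)
df = constrained_cost(b + dg) - constrained_cost(b)
print(df, lam_star * dg)              # the two agree to first order in dg
```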
Figure 3. Left: The optimum point is constrained by the condition g(x∗) ≤ 0. A small increase
of the constraint from g to g + δg increases the cost from f(x∗) to f(x∗) + λδg (λ > 0). Right:
The constraint is negative at the minimum of f(x), so the optimum point is not constrained
by the condition g(x∗) ≤ 0. A small increase of the constraint function from g to g + δg does
not affect the optimum solution. If this constraint were an equality constraint, g(x) = 0, then
the Lagrange multiplier would be negative and an increase of the constraint would decrease the
cost from f to f + λδg (λ < 0).
The previous examples consider the minimization of a simple quadratic with a single
inequality constraint. By expanding this example to two inequality constraints we can see
again how Lagrange multipliers indicate whether or not the associated constraint bounds
the solution.
Figure 4. Minimization of f(x) = ½kx² such that x ≥ b and x ≥ c, with 0 < b < c. Feasible
solutions for x are greater than or equal to both b and c.
At the saddle point (x∗, λ1∗, λ2∗), stationarity with respect to each multiplier would require

    ∂JA(x, λ1, λ2)/∂λ1 = 0  ⇒  −x∗ + b = 0        (12b)
    ∂JA(x, λ1, λ2)/∂λ2 = 0  ⇒  −x∗ + c = 0        (12c)

Equations (12b) and (12c) cannot both hold when b ≠ c, so both constraints cannot be active
at once. With 0 < b < c the binding constraint is x ≥ c, the solution is x∗ = c (as shown in
Figure 4), and the multiplier associated with the non-binding constraint x ≥ b is set to zero.
As a further example, consider a problem with two design variables and two inequality
constraints:

    min_{x1,x2}  x1² + 0.5x1 + 3x1x2 + 5x2²   such that   g1 = 3x1 + 2x2 + 2 ≤ 0   and   g2 = 15x1 − 3x2 − 1 ≤ 0        (14)
This example also has a quadratic objective function and inequality constraints that are
linear in the design variables. Contours of the objective function and the two inequality
constraints are plotted in Figure 5. The feasible region of these two inequality constraints is
to the left of the lines labeled “g1 ok” and “g2 ok” in the figure. The figure shows
that the inequality g1(x1, x2) constrains the solution and the inequality g2(x1, x2) does not.
This is visible in Figure 5 with n = 2, but for more complicated problems it may not be
immediately clear which inequality constraints are “active.”
Using the method of Lagrange multipliers, the augmented objective function is
    JA(x1, x2, λ1, λ2) = x1² + 0.5x1 + 3x1x2 + 5x2² + λ1(3x1 + 2x2 + 2) + λ2(15x1 − 3x2 − 1)        (15)
Unlike the first examples with n = 1 and m = 1, we cannot plot contours of JA (x1 , x2 , λ1 , λ2 )
since this would be a plot in four-dimensional space. Nonetheless, the same optimality
conditions hold.
    min_{x1} JA  ⇒  ∂JA/∂x1 = 0  ⇒  2x1∗ + 0.5 + 3x2∗ + 3λ1∗ + 15λ2∗ = 0        (16a)
    min_{x2} JA  ⇒  ∂JA/∂x2 = 0  ⇒  3x1∗ + 10x2∗ + 2λ1∗ − 3λ2∗ = 0        (16b)
    max_{λ1} JA  ⇒  ∂JA/∂λ1 = 0  ⇒  3x1∗ + 2x2∗ + 2 = 0        (16c)
    max_{λ2} JA  ⇒  ∂JA/∂λ2 = 0  ⇒  15x1∗ − 3x2∗ − 1 = 0        (16d)

(each partial derivative is evaluated at x1∗, x2∗, λ1∗, λ2∗)
If the objective function is quadratic in the design variables and the constraints are linear
in the design variables, the optimality conditions are simply a set of linear equations in the
design variables and the Lagrange multipliers. In this example the optimality conditions are
expressed as four linear equations with four unknowns. In general we may not know which
inequality constraints are active. If there are only a few constraint equations, it is not too
hard to try every combination of active constraints, fixing the Lagrange multipliers of the
remaining inequality constraints to zero; a short numerical sketch of this enumeration is
given below, and the bullets that follow work through it by hand.
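A minimal numpy sketch of the enumeration; the matrices are read off from equations (15) and (16), while the tolerance and loop structure are implementation choices:

```python
# Brute-force active-set enumeration for the QP of equation (14): for
# each candidate set of active constraints, solve the linear stationarity
# system; accept a candidate only if the inactive constraints are
# satisfied and all multipliers are non-negative.
from itertools import combinations
import numpy as np

H = np.array([[2.0, 3.0], [3.0, 10.0]])   # f = (1/2) x'Hx + c'x
c = np.array([0.5, 0.0])
A = np.array([[3.0, 2.0], [15.0, -3.0]])  # g_j(x) = A[j] @ x - b[j] <= 0
b = np.array([-2.0, 1.0])
n, m = 2, 2

for k in range(m + 1):
    for active in combinations(range(m), k):
        if k == 0:
            K, rhs = H, -c                 # unconstrained minimum
        else:
            Aa = A[list(active)]           # rows of the active constraints
            K = np.block([[H, Aa.T], [Aa, np.zeros((k, k))]])
            rhs = np.concatenate([-c, b[list(active)]])
        sol = np.linalg.solve(K, rhs)
        x, lam = sol[:n], sol[n:]
        if np.all(A @ x - b <= 1e-9) and np.all(lam >= 0):
            print("active:", active, "x* =", x, "lambda* =", lam)
# -> only active = (0,) survives: x* ~ (-0.81, 0.21), lambda1* ~ 0.16
```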
• First, let’s find the unconstrained minimum by assuming that neither constraint g1(x1, x2)
  nor g2(x1, x2) is active, λ1∗ = 0, λ2∗ = 0, and

      [ 2   3 ] [x1∗]   [ −0.5 ]        [x1∗]   [ −0.45 ]
      [ 3  10 ] [x2∗] = [  0   ]   ⇒   [x2∗] = [  0.14 ] ,        (17)

  which is the unconstrained minimum shown in Figure 5 as a “∗”. Plugging this so-
  lution into the constraint equations gives g1(x1∗, x2∗) = 0.93 and g2(x1∗, x2∗) = −8.17,
  so the unconstrained minimum is not feasible with respect to constraint g1, since
  g1(−0.45, 0.14) > 0.
• Next, assuming both constraints g1(x1, x2) and g2(x1, x2) are active, optimal values for
  x1∗, x2∗, λ1∗, and λ2∗ are sought, and all four equations must be solved together,

      [  2   3   3   15 ] [x1∗]   [ −0.5 ]        [x1∗]   [ −0.10 ]
      [  3  10   2   −3 ] [x2∗]   [  0   ]        [x2∗]   [ −0.85 ]
      [  3   2   0    0 ] [λ1∗] = [ −2   ]   ⇒   [λ1∗] = [  3.55 ]
      [ 15  −3   0    0 ] [λ2∗]   [  1   ]        [λ2∗]   [ −0.56 ] ,        (18)

  which is the constrained minimum shown in Figure 5 as a “o” at the intersection of the
  g1 line with the g2 line. Note that λ2∗ < 0, indicating that constraint g2 should not be
  active; g2(−0.10, −0.85) = 0 (on the boundary). Enforcing the constraint g2 needlessly
  compromises the optimality of this solution. So, while this solution is feasible (both g1
  and g2 evaluate to zero), the solution could be improved by letting go of the g2 constraint
  and moving along the g1 line.
• So, assuming only constraint g1 is active and g2 is not, λ2∗ = 0, and

      [ 2   3   3 ] [x1∗]   [ −0.5 ]        [x1∗]   [ −0.81 ]
      [ 3  10   2 ] [x2∗] = [  0   ]   ⇒   [x2∗] = [  0.21 ]
      [ 3   2   0 ] [λ1∗]   [ −2   ]        [λ1∗]   [  0.16 ] ,        (19)

  which is feasible (g2(−0.81, 0.21) < 0) and has a positive Lagrange multiplier, so it is
  the constrained optimum: the minimum of JA along the g1 line in Figure 5.
Figure 5. Contours of the objective function and the constraint equations for the example of
equation (14). (a): J = f(x1, x2); (b): JA = f(x1, x2) + λ1∗ g1(x1, x2). Note that the
contours of JA are shifted so that the minimum of JA is at the optimal point along the g1 line.
More generally, a quadratic program in two design variables with two inequality constraints
may be written

    min_{x1,x2}  ½H11x1² + H12x1x2 + ½H22x2² + c1x1 + c2x2 + d   such that   A11x1 + A12x2 − b1 ≤ 0   and   A21x1 + A22x2 − b2 ≤ 0        (21)

or, even more generally, for any number (n) of design variables, as

    min_{x1,x2,···,xn}  ½ Σ_{i=1}^{n} Σ_{j=1}^{n} xi Hij xj + Σ_{i=1}^{n} ci xi + d   such that   Σ_{j=1}^{n} Akj xj − bk ≤ 0 ,   k = 1, ···, m        (22)
where Hij = Hji and Σi Σj xi Hij xj > 0 for any nonzero set of design variables (H is symmetric
and positive definite). In matrix-vector notation, equation (22) is written

    min_x  ½ xᵀHx + cᵀx + d   such that   Ax − b ≤ 0        (23)
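For instance, the example of equation (14) fits this form with

    H = [ 2  3 ; 3  10 ] ,   c = (0.5, 0)ᵀ ,   d = 0 ,   A = [ 3  2 ; 15  −3 ] ,   b = (−2, 1)ᵀ .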
The augmented optimization problem is
    max_{λ1,···,λm} min_{x1,···,xn}  ½ Σ_{i=1}^{n} Σ_{j=1}^{n} xi Hij xj + Σ_{i=1}^{n} ci xi + d + Σ_{k=1}^{m} λk ( Σ_{j=1}^{n} Akj xj − bk ) ,        (24)
Setting the gradients of the augmented objective with respect to both x and λ to zero gives,
in matrix-vector notation,

    Hx + c + Aᵀλ = 0   and   Ax − b = 0        (27)
Minimizing the augmented objective with respect to the design variables,

    ∂JA(x, λ)/∂x = 0ᵀ        (29)

results in Hx + c + Aᵀλ = 0, from which the design variables x may be solved in terms of λ,

    x = −H⁻¹(Aᵀλ + c).        (30)
Substituting this solution for x in terms of λ into (25) eliminates x from equation (25) and
results in a really long expression for the augmented objective in terms of λ only,

    max_λ  ½ ( −(λᵀA + cᵀ)H⁻¹ ) H ( −H⁻¹(Aᵀλ + c) ) − cᵀH⁻¹(Aᵀλ + c) + d + λᵀ( A(−H⁻¹(Aᵀλ + c)) − b )

... and after a little algebra involving some convenient cancellations and combinations ... we obtain
the dual problem,

    max_λ  −½ λᵀAH⁻¹Aᵀλ − (cᵀH⁻¹Aᵀ + bᵀ)λ − ½ cᵀH⁻¹c + d   such that   λ ≥ 0        (31)
Equation (25) is called the primal quadratic programming problem and equation (31)
is called the dual quadratic programming problem. The primal problem has n + m unknown
variables (x and λ) whereas the dual problem has only m unknown variables (λ) and is
therefore easier to solve.
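As a check, the dual (31) can be solved numerically for the example of equation (14) and the design variables recovered from equation (30). A sketch using scipy's bound-constrained minimizer (the solver choice is an assumption, not from the text):

```python
# Solve the dual QP (31) for the example of equation (14), with
# lam >= 0, then recover x from equation (30): x = -H^{-1}(A'lam + c).
import numpy as np
from scipy.optimize import minimize

H = np.array([[2.0, 3.0], [3.0, 10.0]])
c = np.array([0.5, 0.0])
A = np.array([[3.0, 2.0], [15.0, -3.0]])
b = np.array([-2.0, 1.0])
Hi = np.linalg.inv(H)

def neg_dual(lam):
    # negative of the dual objective (31); the constant term is omitted
    return 0.5 * lam @ A @ Hi @ A.T @ lam + (c @ Hi @ A.T + b) @ lam

lam = minimize(neg_dual, np.zeros(2), bounds=[(0.0, None)] * 2).x
x = -Hi @ (A.T @ lam + c)      # primal solution recovered via (30)
print(lam, x)                  # -> lam ~ [0.16, 0.0], x ~ [-0.81, 0.21]
```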
Note that if the quadratic term in the primal form (quadratic in the primal variable x)
is positive definite, then the quadratic term in the dual form (quadratic in the dual variable λ)
is negative semi-definite. So, again, it makes sense to minimize over the primal variable x
and to maximize over the dual variable λ.