Journal of Optimization Theory and Applications: Vol. 123, No. 1, pp. 213–221, October 2004 (© 2004)

TECHNICAL NOTE

On the Classical Necessary Second-Order Optimality Conditions1

A. Baccari2

Communicated by J. P. Crouzeix

Abstract. In this paper, the property of a necessary second-order optimality condition to hold with the same Lagrange multiplier for all critical vectors is investigated. It is limited to nonconvex optimization problems in IRn with equality and inequality constraints; the Mangasarian-Fromovitz constraint qualification is assumed to hold. A counterexample was given recently by Anitescu. We give some sufficient conditions and we prove that this property holds if n ≤ 2 or if the number of active inequality constraints is at most two. For three active inequality constraints and n = 3, a counterexample is given.

Key Words. Nonconvex optimization, Lagrange multipliers, pairs of quadratic forms, Mangasarian-Fromovitz constraint qualification.

1. Introduction

This paper deals with the necessary second-order optimality conditions for nonconvex optimization problems with equality and inequality constraints. The optimization problem has the following form:

(P)  min [f(x) : gi(x) ≤ 0, i ∈ I, hj(x) = 0, j ∈ J],

where x ∈ IRn, I = {1, . . . , p}, J = {1, . . . , q}, and the functions f, gi, hj are real-valued and twice continuously differentiable. In particular, Problem (P) may have no equality or inequality constraints.
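As a purely illustrative aside (not part of the original note), a problem of the form (P) can be set up numerically with off-the-shelf software; the particular functions f, gi, hj below are invented placeholders, chosen only to show how the data of (P) enter such a solver.

```python
import numpy as np
from scipy.optimize import minimize

# A toy instance of (P); these particular f, g_i, h_j are illustrative only.
f = lambda x: x[0]**2 + x[1]**2
g = [lambda x: 1.0 - x[0] - x[1]]      # inequality constraints g_i(x) <= 0
h = [lambda x: x[0] - x[1]]            # equality constraints   h_j(x) = 0

# scipy expects inequality constraints as fun(x) >= 0, hence the sign flip.
constraints = (
    [{"type": "ineq", "fun": lambda x, gi=gi: -gi(x)} for gi in g]
    + [{"type": "eq", "fun": hj} for hj in h]
)
res = minimize(f, x0=np.array([2.0, 0.0]), constraints=constraints, method="SLSQP")
print(res.x)  # expected to be close to (0.5, 0.5)
```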

1 Thanks are due to A. Trad, F. Bonnans, and F. Khadi for helpful discussions and numerous references.
2 Assistant Professor, Ecole Supérieure des Sciences et Techniques de Tunis, Tunis, Tunisia.


We recall the main notations, definitions, and classical results that will
be used in the sequel.
The Lagrangian function of (P) is defined on IRn × IR+^(p+1) × IRq by

L(x, λ, µ) = λ0 f(x) + Σ_{i=1}^{p} λi gi(x) + Σ_{j=1}^{q} µj hj(x).

The gradient and the Hessian matrix of L with respect to x are denoted respectively by ∇L(x, λ, µ) and ∇²L(x, λ, µ). The gradient of f with respect to x is ∇f(x) and ∇f(x)t is its transpose. The feasible set is

F = {x | gi(x) ≤ 0, i ∈ I, hj(x) = 0, j ∈ J}.

For x ∈ F, the active index set I(x), the critical cone C(x), the set of generalized Lagrange multipliers Λ0(x), and the set of Lagrange multipliers Λ(x) are defined as follows:

I(x) = {i ∈ I | gi(x) = 0},

C(x) = {d ∈ TF(x) | ∇f(x)t d ≤ 0},

where

TF(x) = {d | ∇gi(x)t d ≤ 0, i ∈ I(x); ∇hj(x)t d = 0, j ∈ J},

Λ0(x) = {(λ, µ) ≠ 0 | λ ∈ IR+^(p+1), µ ∈ IRq, ∇L(x, λ, µ) = 0, λi gi(x) = 0, i ∈ I},

Λ(x) = {(λ, µ) ∈ Λ0(x) | λ0 = 1},

and px is the number of active inequality constraints.
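As a small numerical illustration of these definitions (an aside, not from the paper), the active index set and a membership test for the critical cone can be coded directly from the formulas above; the function handles, gradient callables, and tolerance below are assumptions of the sketch.

```python
import numpy as np

def active_set(x, g, tol=1e-8):
    """Return I(x): indices i with g_i(x) = 0, up to a numerical tolerance."""
    return [i for i, gi in enumerate(g) if abs(gi(x)) <= tol]

def is_critical(d, x, grad_f, grad_g, grad_h, active, tol=1e-8):
    """Check d in C(x): grad f(x)'d <= 0, grad g_i(x)'d <= 0 (i active), grad h_j(x)'d = 0."""
    if grad_f(x) @ d > tol:
        return False
    if any(grad_g[i](x) @ d > tol for i in active):
        return False
    return all(abs(grad_hj(x) @ d) <= tol for grad_hj in grad_h)
```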


We recall the two most classical constraint qualifications that will be
used.

Definition 1.1. A feasible point x satisfies the linear independence constraint qualification (LICQ) if the vectors

∇gi(x), i ∈ I(x), ∇hj(x), j ∈ J

are linearly independent.

The Mangasarian-Fromovitz constraint qualification (Ref. 1) is defined as follows.

Definition 1.2. The Mangasarian-Fromovitz constraint qualification (MFCQ) holds at a feasible point x ∈ F if the following two conditions hold:
(i) the q vectors ∇hj(x), j ∈ J, are linearly independent;
(ii) there exists a vector d∗ such that

∇gi(x)t d∗ < 0, i ∈ I(x),
∇hj(x)t d∗ = 0, j ∈ J.
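Condition (ii) is a strict feasibility requirement on a system of linear inequalities in d, so MFCQ can be tested numerically by a small linear program that maximizes the margin t in ∇gi(x)t d ≤ −t. The sketch below is only an illustration (not taken from the paper), under the assumption that the constraint gradients at the point of interest are supplied as numpy arrays.

```python
import numpy as np
from scipy.optimize import linprog

def mfcq_holds(grad_g_active, grad_h, tol=1e-9):
    """Rough numerical test of MFCQ at a given feasible point.

    grad_g_active: (px, n) array, rows are gradients of the active inequality constraints.
    grad_h:        (q, n) array, rows are gradients of the equality constraints (q may be 0).
    """
    q, n = grad_h.shape
    # Condition (i): the equality-constraint gradients must be linearly independent.
    if q > 0 and np.linalg.matrix_rank(grad_h) < q:
        return False
    if grad_g_active.shape[0] == 0:
        return True                       # no active inequality constraints: d = 0 works
    # Condition (ii): find (d, t) with grad_g_active @ d + t <= 0, grad_h @ d = 0,
    # and t as large as possible; d is boxed into [-1, 1]^n only to keep the LP bounded.
    c = np.zeros(n + 1); c[-1] = -1.0     # minimize -t, i.e. maximize t
    A_ub = np.hstack([grad_g_active, np.ones((grad_g_active.shape[0], 1))])
    b_ub = np.zeros(grad_g_active.shape[0])
    A_eq = np.hstack([grad_h, np.zeros((q, 1))]) if q > 0 else None
    b_eq = np.zeros(q) if q > 0 else None
    bounds = [(-1.0, 1.0)] * n + [(0.0, 1.0)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return bool(res.success and res.x[-1] > tol)
```

For the counterexample of Section 3, for instance, grad_g_active would be the 3 × 3 matrix whose rows are all (0, 0, −1), grad_h would be empty, and the test succeeds with d = (0, 0, 1).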

It is well known that the following lemma holds (see for instance
Ref. 2, p. 241).

Lemma 1.1. x ∈ F satisfies MFCQ if and only if there is no pair (λ, µ) ≠ 0 such that

Σ_{i∈I(x)} λi ∇gi(x) + Σ_{j∈J} µj ∇hj(x) = 0,    (1)

λi ≥ 0, i ∈ I(x).    (2)

The following result is known as the Karush-Kuhn-Tucker necessary optimality conditions: Assume that x∗ is a local optimal solution of (P) and satisfies LICQ; then, Λ(x∗) is a singleton [i.e., Λ(x∗) = {(λ∗, µ∗)}] and the pair (λ∗, µ∗) satisfies the following classical necessary second-order optimality condition:

(CN2)  (d)t ∇²L(x∗, λ∗, µ∗)d ≥ 0, for all d ∈ C(x∗).

Condition CN2 has the important property that the Lagrange multiplier (λ∗, µ∗) is the same for all critical vectors d ∈ C(x∗). In many applications, LICQ is very restrictive. If LICQ does not hold, Λ(x∗) may fail to be a singleton and CN2 may fail to be satisfied (see Ref. 3). However, CN2 holds if one of the following conditions holds (see for instance Ref. 4):
(i) all functions gi and hj are affine;
(ii) the functions f and gi are convex, the functions hj are affine, and x∗ satisfies the Slater condition;
(iii) there exists (λ, µ) ∈ IR+^p × IRq such that (x∗, λ, µ) is a saddle point for the Lagrangian function of (P) when λ0 = 1.
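When the critical cone C(x∗) happens to be a linear subspace of the form {d | ∇gi(x∗)t d = 0, i ∈ I(x∗), ∇hj(x∗)t d = 0, j ∈ J} (as it is, for instance, in item (iv) of the proof in Section 2), checking CN2 for a fixed multiplier pair amounts to testing positive semidefiniteness of ∇²L(x∗, λ∗, µ∗) restricted to a null space. The following sketch (an illustration of that special case only, with invented function names) is not part of the original note.

```python
import numpy as np
from scipy.linalg import null_space

def cn2_on_subspace(hess_L, A, tol=1e-10):
    """Test d' (hess_L) d >= 0 for all d with A d = 0.

    hess_L: (n, n) Hessian of the Lagrangian at (x*, lambda*, mu*).
    A:      matrix whose rows are the gradients of the active constraints,
            so that its null space equals the critical cone in this special case.
    """
    Z = null_space(A)                 # columns form an orthonormal basis of {d : A d = 0}
    if Z.size == 0:
        return True                   # the critical cone is {0}; CN2 holds trivially
    reduced = Z.T @ hess_L @ Z        # Hessian restricted to the subspace
    return bool(np.linalg.eigvalsh(reduced).min() >= -tol)
```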



Without any constraint qualification, a local optimal solution x∗ of problem (P) satisfies the following Fritz John necessary first-order and second-order optimality conditions (see for example Ref. 5, p. 443):

Λ0(x∗) ≠ ∅,    (3)

∀d ∈ C(x∗), ∃(λ, µ) ∈ Λ0(x∗) : (d)t ∇²L(x∗, λ, µ)d ≥ 0.    (4)

The necessary second-order optimality condition (4) has two defects:
(i) the first component λ0 of λ may vanish;
(ii) the multiplier pair (λ, µ) is not necessarily the same for all critical vectors.
Assume that a local optimal solution x∗ of problem (P) satisfies MFCQ; then, Λ(x∗) is not empty and is bounded (Ref. 6), convex, and compact. Every (λ, µ) ∈ Λ0(x∗) satisfies λ0 > 0. Condition (4) can be written as

(GN2)  sup_{(λ,µ)∈Λ(x∗)} (d)t ∇²L(x∗, λ, µ)d ≥ 0,  ∀d ∈ C(x∗),

and GN2 is free from the first defect. It is well known that a sufficient optimality condition for x∗ ∈ F is that Λ(x∗) ≠ ∅ and that, for every d ∈ C(x∗), d ≠ 0,

(SGC2)  sup_{(λ,µ)∈Λ(x∗)} (d)t ∇²L(x∗, λ, µ)d > 0.

A question arises: For a local optimal solution x∗ of problem (P), does MFCQ imply CN2? The first counterexample was given in Ref. 3, with n = 3 and px∗ = 4. Without any convexity hypothesis or linear independence constraint qualification, the gap between conditions CN2 and GN2 is not easy to handle. There is a limited number of papers devoted to this question (see Ref. 7). The first result of this paper is the following theorem.

Theorem 1.1. Let x∗ be a local optimal solution of problem (P) and let it satisfy MFCQ and SGC2. Assume that one of the following conditions holds:
(C1) n ≤ 2,
(C2) px∗ ≤ 2.

Then, there exists a pair of multipliers (λ∗, µ∗) ∈ Λ(x∗) such that

(d)t ∇²L(x∗, λ∗, µ∗)d > 0, for all d ∈ C(x∗), d ≠ 0.    (5)

Remark 1.1. If SGC2 does not hold in Theorem 1.1, we replace the objective function f by the function defined by f(x) + ε||x − x∗||², ε > 0; we apply Theorem 1.1 to obtain a pair (λε, µε) ∈ Λ(x∗) such that (5) holds, let ε go to 0, and get the following main result of this paper.

Theorem 1.2. Let x∗ be a local optimal solution of problem (P) and let it satisfy MFCQ. Assume that one of the following conditions holds:
(C1) n ≤ 2,
(C2) px∗ ≤ 2.
Then, there exists a pair (λ∗, µ∗) ∈ Λ(x∗) such that condition CN2 holds.

This paper is organized as follows. In Section 2, we prove Theorem 1.1. In Section 3, a new counterexample for Theorem 1.2, with n = 3 and px∗ = 3, proves that conditions C1 and C2 are the best possible additional conditions to MFCQ.

2. Proof of Theorem 1.1

Because of SGC2, it suffices to prove the theorem in the case where Λ(x∗) is not a singleton. MFCQ and Lemma 1.1 imply that Λ(x∗) is a singleton if we have

∃(λ, µ) ∈ Λ(x∗) s.t. i ∈ I(x∗) ⇒ λi = 0.    (6)

Suppose that n ≤ 2. We have the two cases below.

(a) If n = 1, then C(x∗) ⊂ IR. If there exists d1 ≠ 0 in C(x∗), we can find, via SGC2, a pair (λ∗, µ∗) ∈ Λ(x∗) such that (d1)t ∇²L(x∗, λ∗, µ∗)d1 > 0. Every other d ∈ C(x∗), d ≠ 0, can be written as d = αd1 and (d)t ∇²L(x∗, λ∗, µ∗)d = α²(d1)t ∇²L(x∗, λ∗, µ∗)d1 > 0.
(b) If n = 2 and there exist (λ, µ) ∈ Λ(x∗) and i ∈ I(x∗) such that λi > 0, then C(x∗) ⊂ Ker(∇gi(x∗)t), which is a one-dimensional subspace, and then (a) can be applied.

Suppose that px∗ ≤ 2 and n ≥ 3. If px∗ ≤ 1, then x∗ satisfies LICQ and Λ(x∗) is a singleton.

Let px∗ = 2. Without loss of generality, we can suppose that I(x∗) = {1, 2}. For each of the following conditions, Λ(x∗) is a singleton:

(λ, µ) ∈ Λ(x∗) ⇒ λ1 = 0,    (7)

(λ, µ) ∈ Λ(x∗) ⇒ λ2 = 0,    (8)

(λ, µ) ∈ Λ(x∗) ⇒ λ1 > 0 and λ2 > 0.    (9)

From (6)–(9), it is easy to see that, if Λ(x∗) is not a singleton, then:

(i) There exist (λ1, µ1) and (λ2, µ2) in Λ(x∗) whose components satisfy

(λ1)1 > 0, (λ1)2 = 0, (λ2)1 = 0, (λ2)2 > 0.

(ii) (λ1, µ1) and (λ2, µ2) are extreme points of Λ(x∗).
(iii) Every (λ, µ) in Λ(x∗) is a convex combination of these extreme points.
(iv) C(x∗) is the vector space

{d | ∇gi(x∗)t d = 0, i = 1, 2; ∇hj(x∗)t d = 0, j ∈ J}.

We use (iv) and SGC2 to prove, by contradiction, that there exists c∗ > 0 such that, for every c ≥ c∗, one has, for every d ∈ IRn with d ≠ 0,

max_{(λ,µ)∈Λ(x∗)} (d)t ∇²L(x∗, λ, µ)d + cR(d) > 0,    (10)

where

R(d) = Σ_{i=1}^{2} ||∇gi(x∗)t d||² + Σ_{j=1}^{q} ||∇hj(x∗)t d||².

(v) (d)t ∇²L(x∗, λ, µ)d is linear in (λ, µ), Λ(x∗) is convex and compact, and the max in (10) is attained at (λ1, µ1) or (λ2, µ2). Let

P(d) = (d)t ∇²L(x∗, λ1, µ1)d + cR(d),

Q(d) = (d)t ∇²L(x∗, λ2, µ2)d + cR(d).

(vi) The quadratic forms P and Q satisfy

max(P(d), Q(d)) > 0, ∀d ∈ IRn, d ≠ 0,

and we use a new version of the Yuan lemma (see Ref. 8), as stated in Ref. 9 for n ≥ 3, to find positive numbers α1 and α2, with α1 + α2 = 1, such that

α1 P(d) + α2 Q(d) > 0, ∀d ∈ IRn, d ≠ 0.    (11)

If

(λ∗, µ∗) = α1(λ1, µ1) + α2(λ2, µ2),

then (λ∗, µ∗) ∈ Λ(x∗) and

α1 P(d) + α2 Q(d) = (d)t ∇²L(x∗, λ∗, µ∗)d + cR(d).

Since R(d) = 0 for every d ∈ C(x∗), we obtain from (11)

(d)t ∇²L(x∗, λ∗, µ∗)d > 0, ∀d ∈ C(x∗), d ≠ 0.
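The Yuan-lemma step can be illustrated numerically: given symmetric matrices P and Q with max((d)t Pd, (d)t Qd) > 0 for every d ≠ 0 (and n ≥ 3), one looks for α ∈ (0, 1) such that αP + (1 − α)Q is positive definite. The grid search below is only a sketch of that idea (not the constructive argument of Refs. 8–9, and a failure of the grid search proves nothing).

```python
import numpy as np

def pd_convex_combination(P, Q, grid=1000, tol=1e-12):
    """Search for alpha in (0, 1) with alpha*P + (1 - alpha)*Q positive definite.

    P, Q are symmetric (n, n) matrices.  Returns alpha, or None if the grid search fails.
    """
    for alpha in np.linspace(0.0, 1.0, grid + 1)[1:-1]:
        if np.linalg.eigvalsh(alpha * P + (1.0 - alpha) * Q).min() > tol:
            return float(alpha)
    return None
```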

3. Counterexample for Theorem 1.2

The problem (P) is

min{x3 | gi(x) ≤ 0, i = 1, 2, 3},

where

x = (x1, x2, x3) ∈ IR3,

g1(x) = 2√3 x1x2 − 2x2² − x3,
g2(x) = x2² − 3x1² − x3,
g3(x) = −2√3 x1x2 − 2x2² − x3.

Put

w1(x) = 2√3 x1x2 − 2x2²,
w2(x) = x2² − 3x1²,
w3(x) = −2√3 x1x2 − 2x2².

We have

gi(x) = wi(x) − x3, i = 1, 2, 3.

It is easy to see that

w1(x)w3(x) = 4x2² w2(x).
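This identity, which drives the feasibility argument below, is elementary to verify by hand or symbolically; the following one-off check (not part of the original note) uses sympy.

```python
import sympy as sp

x1, x2 = sp.symbols("x1 x2", real=True)
w1 = 2*sp.sqrt(3)*x1*x2 - 2*x2**2
w2 = x2**2 - 3*x1**2
w3 = -2*sp.sqrt(3)*x1*x2 - 2*x2**2

# w1(x) * w3(x) - 4*x2**2 * w2(x) should simplify to zero.
print(sp.simplify(w1*w3 - 4*x2**2*w2))   # prints: 0
```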



Let x be a feasible point. If w1(x) ≥ 0 or w3(x) ≥ 0, then x3 ≥ 0. If w1(x) < 0 and w3(x) < 0, then w2(x) > 0 and x3 > 0. So, for every feasible point x, the objective function satisfies f(x) = x3 ≥ 0; hence x∗ = (0, 0, 0) is a global optimal solution, and it satisfies MFCQ. Also, for every x3 ≥ 0, the point x = (0, 0, x3) is feasible. For λ0 = 1, the Lagrangian function is


L(x1, x2, x3, λ) = x3 + Σ_{i=1}^{3} λi gi(x1, x2, x3).

Computation of the first and second derivatives yields

Dx1 L(x, λ) = 2√3 x2λ1 − 6x1λ2 − 2√3 x2λ3,
Dx1x1 L(x, λ) = −6λ2,
Dx1x2 L(x, λ) = 2√3 λ1 − 2√3 λ3,
Dx2 L(x, λ) = (2√3 x1 − 4x2)λ1 + 2x2λ2 − (2√3 x1 + 4x2)λ3,
Dx2x2 L(x, λ) = −4λ1 + 2λ2 − 4λ3.

The set of Lagrange multipliers is

Λ(x∗) = {λ ∈ IR4 | λ0 = 1, λi ≥ 0, i = 1, 2, 3, Σ_{i=1}^{3} λi = 1}.

The critical cone is


C(x∗) = {d = (d1, d2, d3)t | d3 = 0}.

The Hessian matrix of L is

∇²L(x∗, λ) = [ Dx1x1 L(x∗, λ)   Dx1x2 L(x∗, λ)   0
               Dx1x2 L(x∗, λ)   Dx2x2 L(x∗, λ)   0
               0                 0                0 ].

For the critical vectors d1 = (1, 0, 0)t and d2 = (0, 1, 0)t, we have

∇²L(x∗, λ)d1 = (−6λ2, 2√3(λ1 − λ3), 0)t,
(d1)t ∇²L(x∗, λ)d1 = −6λ2,
∇²L(x∗, λ)d2 = (2√3(λ1 − λ3), −4λ1 + 2λ2 − 4λ3, 0)t,
(d2)t ∇²L(x∗, λ)d2 = −4λ1 + 2λ2 − 4λ3,

and, for every λ ∈ Λ(x∗),

(d1)t ∇²L(x∗, λ)d1 + (d2)t ∇²L(x∗, λ)d2 = −4.



This means that there is no λ ∈ Λ(x∗) such that

(d1)t ∇²L(x∗, λ)d1 ≥ 0 and (d2)t ∇²L(x∗, λ)d2 ≥ 0,

so condition CN2 fails at x∗ even though x∗ is a global optimal solution satisfying MFCQ.
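The computation above is easy to double-check numerically. The short script below (an independent check, not part of the original note) assembles ∇²L(x∗, λ) for random multipliers in the simplex and confirms that the two quadratic-form values always sum to −4, so they can never both be nonnegative.

```python
import numpy as np

def hessian_L(lam):
    """Hessian of the Lagrangian at x* = 0 for lam = (lambda_1, lambda_2, lambda_3)."""
    l1, l2, l3 = lam
    off = 2.0 * np.sqrt(3.0) * (l1 - l3)
    return np.array([[-6.0 * l2, off, 0.0],
                     [off, -4.0 * l1 + 2.0 * l2 - 4.0 * l3, 0.0],
                     [0.0, 0.0, 0.0]])

d1 = np.array([1.0, 0.0, 0.0])
d2 = np.array([0.0, 1.0, 0.0])
rng = np.random.default_rng(0)
for _ in range(1000):
    lam = rng.random(3)
    lam /= lam.sum()                      # a random point of the multiplier simplex
    H = hessian_L(lam)
    assert abs(d1 @ H @ d1 + d2 @ H @ d2 + 4.0) < 1e-12

print("(d1)' H d1 + (d2)' H d2 = -4 for every sampled multiplier")
```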

References

1. Mangasarian, O. L., and Fromovitz, S., The Fritz John Necessary Optimal-
ity Conditions in the Presence of Equality and Inequality Constraints, Journal of
Mathematical Analysis and Applications, Vol. 17, pp. 37–47, 1967.
2. Hestenes, M. R., Optimization Theory: The Finite-Dimensional Case, Robert
E. Krieger Publishing Company, New York, NY, 1981.
3. Anitescu, M., Degenerate Nonlinear Programming with a Quadratic Growth
Condition, SIAM Journal on Optimization, Vol. 10, pp. 1116–1135, 2000.
4. Bazaraa, M. S., Sherali, H. D., and Shetty, C. M., Nonlinear Programming: Theory and Algorithms, John Wiley and Sons, New York, NY, 1993.
5. Bonnans, J. F., and Shapiro, A., Perturbation Analysis of Optimization Prob-
lems, Springer, New York, NY, 2000.
6. Gauvin, J., A Necessary and Sufficient Regularity Condition to Have Bounded
Multipliers in Nonconvex Programming, Mathematical Programming, Vol. 12,
pp. 136–138, 1977.
7. Ben-Tal, A., and Zowe, J., A Unified Theory of First and Second-Order Con-
ditions for Extremum Problems in Topological Vector Spaces, Mathematical
Programming Study, Vol. 19, pp. 39–76, 1982.
8. Yuan, Y., On a Subproblem of Trust-Region Algorithms for Constrained Optimization, Mathematical Programming, Vol. 47, pp. 53–63, 1990.
9. Hiriart-Urruty, J. B., and Torki, M., Permanently Going Back and Forth
between the Quadratic World and the Convexity World in Optimization, Applied
Mathematics and Optimization, Vol. 45, pp. 169–184, 2002.
