
Applied Mathematics and Computation 172 (2006) 1000–1017

www.elsevier.com/locate/amc

A trust region filter method for general non-linear programming


Pu-yan Nie a,*, Chang-feng Ma b

a Department of Mathematics, College of Information Science and Technology, Jinan University, Guangzhou 510632, PR China
b College of Mathematics and Physics, Zhejiang Normal University, Zhejiang 321004, PR China

Abstract

The filter approach was first proposed by Fletcher and Leyffer in 2002. Because of promising numerical results, filter methods have recently received considerable attention. The basic idea of a filter is that a trial step is accepted if it reduces either the objective function value or the constraint violation. In this paper, the filter technique is employed in a trust region algorithm, in which the length of every trial step is controlled by a trust region radius. Moreover, our purpose is not to reduce the objective function and the constraint violation directly. To overcome some bad cases, we aim to reduce the degree of constraint violation and the value of a function that combines the objective function with the degree of constraint violation. The algorithm in this paper requires neither Lagrangian multipliers nor a strong decrease condition. Under certain conditions, the method produces KT points of the original problem. Moreover, the Maratos effect can be avoided by our algorithm. © 2005 Elsevier Inc. All rights reserved.
Keywords: Filter method; Trust region approach; Non-linear programming; Constrained optimization

* Corresponding author. E-mail address: pynie2002@hotmail.com (P.-y. Nie).

0096-3003/$ - see front matter © 2005 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2005.03.004

P.-y. Nie, C.-f. Ma / Appl. Math. Comput. 172 (2006) 1000–1017


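1. Introduction

In this paper, we consider the minimization of a non-linear function subject to non-linear constraints, which can be stated as

\[
\begin{aligned}
\text{minimize}\quad & f(x) \\
\text{subject to}\quad & c_i(x) = 0, \quad i = 1, 2, \ldots, m_e, \\
& c_i(x) \le 0, \quad i = m_e + 1, \ldots, m,
\end{aligned}
\tag{1.1}
\]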

where $x \in \mathbb{R}^n$, $c: \mathbb{R}^n \to \mathbb{R}^m$, $f: \mathbb{R}^n \to \mathbb{R}$, and $n$, $m_e$ and $m$ are positive integers. The non-linear programming model (1.1), which arises often in science, engineering and many other fields, is extremely important. There is an extensive literature on non-linear programming; see the monograph [11] and the references therein. There are various methods to attack (1.1), for instance, sequential quadratic programming (SQP) methods, interior point methods, penalty approaches and trust region techniques.

The filter approach, proposed by Fletcher and Leyffer in 2002 [4], has been extensively applied since. In [5], the filter method is applied to an SLP (sequential linear programming) approach and global convergence to a first-order critical point is achieved. Moreover, to obtain good properties, an equality constrained quadratic programming model is employed in an SLP filter method in [2]. In 2003, the filter method is combined with the SQP strategy by Fletcher et al. [7,9] and first-order critical accumulation points are obtained. In [8], a filter method is proposed for bundle non-smooth approaches. Audet and Dennis [1] present a pattern search filter method for derivative-free non-linear programming problems without any requirement of sufficient descent. Nie [10] combines the filter with composite-step like methods. In 2000, Ulbrich et al. [12] give an interior-point filter method for non-convex programming. In [13], super-linear local convergence is achieved for filter-SQP methods. Furthermore, the filter idea has proved to be very successful numerically in the SLP/SQP framework [6].

Filter methods have several advantages over penalty function methods. Firstly, no penalty parameter estimates, which could be difficult to obtain, are required. Secondly, practical experience shows that they exhibit a certain degree of non-monotonicity, which may be beneficial.
Finally, filter approaches play an important role in balancing the objective function and the constraints. When the objective function and the constraint violation function respond inconsistently to a change of the variables, some good points may be filtered. For example, in the problem of Maratos, a good point, which belongs to a sequence converging superlinearly to the optimal solution, is filtered because both its constraint violation and its objective value become worse. Such bad cases motivate us to take steps to avoid them. A new trust region filter method, which is based on a penalty function and differs from that in [9], is presented in this paper. In general, penalty parameters are changed at each step, whereas the penalty parameter in this paper is a constant, because the filter method guarantees that the degree of constraint violation tends to zero. Meanwhile, global convergence of the new trust region filter approach is achieved.

This paper is organized as follows. An introduction to the trust region method and the filter technique is given in the next section. In Section 3, a new trust region filter algorithm is put forward. The convergence properties are analyzed in Section 4. Some numerical results and remarks are presented in Section 5.

2. Filter technique and trust region method

Both trust region techniques and filter approaches have received much attention in recent years because of their advantages and promising numerical results. We introduce them in turn.

2.1. Filter technique

The general purpose of constrained optimization methods is to minimize both the objective function $f$ and a non-negative continuous constraint violation function $h$, where $h(x) \ge 0$ and $h(x) = 0$ if and only if $x$ is feasible (so $h(x) > 0$ if and only if $x$ is infeasible). The filter is employed as a criterion to accept or reject a trial step generated by a subproblem. The definition of a filter by Fletcher et al. is based on the notion of dominance, which originates from multi-objective terminology. The definition of dominance is now given.

Definition 1. For a pair of vectors $x$, $x'$ with finite components, $x$ dominates $x'$, written $x \prec x'$, if and only if $x_i \le x'_i$ for each $i$ and $x \ne x'$.

Similarly, we write $x \preceq x'$ to indicate that either $x \prec x'$ or $x = x'$, which is the notion of dominance in earlier filter papers. Applied to our problem, we define $x \prec_{(h,p)} x'$ if and only if $(h(x), p(x)) \prec (h(x'), p(x'))$, where $\prec$ is as in Definition 1 and $p(x)$ is a function closely related to $f(x)$. To simplify the terminology, we write $x \prec x'$ rather than $x \prec_{(h,p)} x'$. As above, $x \preceq x'$ indicates that either $x \prec x'$ or the two pairs are equal. A filter is defined as follows.

Definition 2. A filter set $\mathcal{F}$ is a set of points in $\mathbb{R}^n$ such that no point dominates any other.
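Definitions 1 and 2 can be illustrated with a minimal Python sketch. It works directly on the pairs $(h(x), p(x))$ rather than on the points themselves; the function names `dominates` and `is_filter` are hypothetical and are not taken from the paper.

```python
def dominates(a, b):
    """Return True if the pair a = (h, p) dominates the pair b (Definition 1):
    every component of a is <= the matching component of b, and a != b."""
    return all(ai <= bi for ai, bi in zip(a, b)) and tuple(a) != tuple(b)


def is_filter(pairs):
    """Return True if no pair in the collection dominates another (Definition 2)."""
    return not any(dominates(a, b) for a in pairs for b in pairs)


# Two mutually non-dominated pairs form a filter; adding a dominated pair breaks it.
print(is_filter([(0.1, 2.0), (0.2, 1.0)]))
print(is_filter([(0.1, 1.0), (0.2, 1.0)]))
```

The quadratic scan in `is_filter` is chosen for readability only; a practical implementation would keep the filter sorted by $h$.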


To obtain convergence, stronger conditions are required to decide whether to accept a point into the filter. In earlier papers on filter approaches, $p(x) = f(x)$, whereas this relation is not required in this work. In an algorithm, a filtered point is rejected by the set $\mathcal{F}$, which means that the trial step does not produce a successful iteration. Conversely, unfiltered points are accepted. We point out that whether a point is filtered depends closely on when it is generated. In practice, some additional conditions are necessary to obtain good properties. For example, Fletcher and Leyffer [4] define an envelope and Audet and Dennis use a poll search in [1]. In this paper, the following notation is used throughout:

\[
p(x) := f(x) + \sigma \| c(x)^+ \|_\infty ,
\]

where $c_i^+ = |c_i|$ for $i = 1, 2, \ldots, m_e$ and $c_i^+ = \max\{0, c_i\}$ for $i = m_e + 1, m_e + 2, \ldots, m$. In this work, we therefore aim to reduce both $\| c(x)^+ \|_\infty$ and $p(x)$. For convenience, we also denote

\[
h(x) := \| c(x)^+ \|_\infty . \tag{2.1}
\]
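As a small illustration of this notation, the following Python sketch evaluates $c^+$, $h$ and $p$ from a vector of constraint values. It assumes that the first `m_e` entries are the equality constraints and that the norm is the infinity norm; replacing `max` in `h` by a sum would give the 1-norm instead. The function names are hypothetical.

```python
def violation(c, m_e):
    """Componentwise c^+: |c_i| for the first m_e (equality) constraints,
    max(0, c_i) for the remaining (inequality) constraints."""
    return [abs(ci) if i < m_e else max(0.0, ci) for i, ci in enumerate(c)]


def h(c, m_e):
    """Constraint violation h = ||c^+|| in the infinity norm (assumed norm)."""
    return max(violation(c, m_e), default=0.0)


def p(f_val, c, m_e, sigma):
    """Combined function p = f + sigma * h, with sigma a constant of either sign."""
    return f_val + sigma * h(c, m_e)


c_vals = [-0.5, 0.3, -2.0]          # one equality, two inequalities
print(violation(c_vals, 1), h(c_vals, 1), p(1.0, c_vals, 1, -4.0))
```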

Note: In filter methods, when $x' \succeq x$ and $x$ is in the filter set, $x'$ is rejected by the filter no matter what good properties $x'$ has. Indeed, under the Maratos effect a good point is rejected if a filter method is employed. We hope to overcome the Maratos effect by relaxing the acceptance criterion. This motivates us to use $\| c(x)^+ \|_\infty$ in both objectives $p$ and $h$. On the one hand, the early filter approaches are special cases of this work with $\sigma = 0$. On the other hand, with a suitable $\sigma$, the Maratos effect can be overcome. The definition of the filter in this paper thus differs from that of others. A point $x_k + s_k$ is acceptable to the filter if either

\[
p(x_k + s_k) \le p_j - \gamma h(x_k + s_k) \quad \text{or} \quad h(x_k + s_k) \le \beta h_j \tag{2.2}
\]

for all $j \in \mathcal{F}_k$, where $1 > \beta > \gamma > 0$, with $\beta$ close to 1 and $\gamma$ close to 0, and $p_j := p(x_j)$, $h_j := h(x_j)$. The early filter methods are illustrated in Fig. 1 with the coordinates $p = f$ and $h$. The filter in this paper is illustrated in Fig. 1 with the coordinates $p$ and $h$, or in Fig. 2 with $f$ and $h$ if $\sigma < 0$. When $\sigma < 0$, the acceptance conditions are relaxed. The boundary of the region of unfiltered points in Fig. 2 is therefore zigzag.

2.2. Trust region method

Trust region methods are exceedingly important for ensuring global convergence while retaining fast local convergence in optimization algorithms; see the monograph [3] and the references therein. Moreover, they can flexibly use diverse approaches locally. Trust region methods consequently have very good local properties.


Fig. 1.

Fig. 2.

In a trust region method, a trial step is obtained by solving a trust region model. A merit function is then used to evaluate the new iterate and to decide whether to accept it. If the new point is accepted, the trust region radius is increased and a new quadratic model is formed. If the new point is rejected, the trust region radius is decreased and the step is recomputed. There are many ways to obtain a quadratic model and a merit function at the current point $x_k$, for example the quadratic model of Yuan [14]:

\[
\hat{\Psi}_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d + \sigma_k \| (c_k + A_k^T d)^+ \|_\infty ,
\]

where $\sigma_k > 0$ is the penalty parameter, $d \in \mathbb{R}^n$, $(c_k + A_k^T d)_i^+ = |(c_k + A_k^T d)_i|$ for $i = 1, \ldots, m_e$ and $(c_k + A_k^T d)_i^+ = \max\{0, (c_k + A_k^T d)_i\}$ for $i = m_e + 1, \ldots, m$. $B_k$ is the Hessian matrix, or an approximation to the Hessian matrix, of the Lagrangian function of (1.1), $A_k = \nabla c_k$ and $c_k = c(x_k)$. The trust region subproblem is

\[
\begin{aligned}
\text{minimize}\quad & \hat{\Psi}_k(d) \\
\text{subject to}\quad & \| d \|_\infty \le \Delta_k .
\end{aligned}
\tag{2.3}
\]
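To make the structure of such a model concrete, here is a minimal pure-Python sketch that evaluates a penalized quadratic model of the above form and checks the trust region bound. It assumes that `A` is stored as an $n \times m$ list of lists whose columns are the constraint gradients, and that both norms are infinity norms; the function names are hypothetical.

```python
def model_value(d, g, B, c, A, m_e, sigma):
    """Evaluate g^T d + 0.5 d^T B d + sigma * ||(c + A^T d)^+||_inf."""
    n = len(d)
    linear = sum(g[i] * d[i] for i in range(n))
    quadratic = 0.5 * sum(d[i] * B[i][j] * d[j] for i in range(n) for j in range(n))
    # Linearised constraints: (c + A^T d)_i = c_i + sum_r A[r][i] * d[r]
    lin_c = [c[i] + sum(A[r][i] * d[r] for r in range(n)) for i in range(len(c))]
    viol = [abs(v) if i < m_e else max(0.0, v) for i, v in enumerate(lin_c)]
    return linear + quadratic + sigma * max(viol, default=0.0)


def in_trust_region(d, delta):
    """Check the bound ||d||_inf <= delta."""
    return max(abs(di) for di in d) <= delta


g, B = [1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
c, A = [1.0], [[1.0], [0.0]]            # one equality constraint
print(model_value([0.0, 0.0], g, B, c, A, 1, 2.0))
print(model_value([-1.0, 0.0], g, B, c, A, 1, 2.0))
```

In this toy instance the step $d = (-1, 0)$ removes the linearized violation and lowers the model value from 2.0 to -0.5.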

In [14], Yuan uses the $L_\infty$ exact penalty function $f(x) + \sigma_k \| c^+(x) \|_\infty$ as the merit function, where $\sigma_k > 0$ is the penalty parameter. In this work, the following function is employed as the merit function in every step, where $\sigma$ is a constant which may be positive or negative:

\[
M_k(x) = f(x) + \sigma \| c^+(x) \|_\infty . \tag{2.4}
\]

Moreover, the subproblem is

\[
\begin{aligned}
\text{minimize}\quad & \Psi_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d + \sigma \| (c_k + A_k^T d)^+ \|_\infty \\
\text{subject to}\quad & \| (c_k + A_k^T d)^+ \|_\infty = 0, \\
& \| d \|_\infty \le \Delta_k .
\end{aligned}
\tag{2.5}
\]

There are several reasons for this choice:

(1) In the restoration algorithm, the penalty parameter is not very large (the restoration algorithm is given in Section 3).
(2) We can avoid some bad cases. For example, when $\| \hat{g}_k \|$ is very small while $\| A_k \|$ is very large, it is very difficult to find a point accepted by the filter, where $\hat{g}_k$ is the gradient of the Lagrangian function of (1.1).
(3) When (2.5) is inconsistent, a technique called the restoration algorithm in this paper is used to ensure that a new good point is obtained.

Although (2.5) is difficult to solve exactly, we need only an approximate solution, without very strong conditions.

3. New trust region filter algorithm

The advantages of the trust region approach and the filter method motivate us to combine them. The basic idea of our algorithm is as follows. In every step, we use the trust region approach to produce a new iterate. Filter rules are then employed to determine whether this point is accepted by the filter set. In every step, at the current point $x_k$, (2.5) is solved to obtain $s_k$. Although (2.5) has the form of a penalty function, it differs from a penalty function in that $\sigma$ is constant throughout. If (2.5) has no solution, the restoration algorithm is employed. Otherwise, some rules are used to evaluate $x_k + s_k$. If the conditions are satisfied, the filter criterion is employed to decide whether this point is added to the filter. Otherwise, we reduce the trust region radius and solve (2.5) again. In our algorithm, the filter criterion is defined in (2.2). If $x_k + s_k$ is accepted by the filter, then $x_{k+1} := x_k + s_k$ and $\mathcal{D}_{k+1} = \{ i \mid x_i \succeq x_{k+1},\ i \in \mathcal{F}_k \}$. The filter set is updated by the rule

\[
\mathcal{F}_{k+1} = \mathcal{F}_k \cup \{ k+1 \} \setminus \mathcal{D}_{k+1} .
\]

The purpose of optimization methods is to reduce both $h(x)$ and $p(x)$. When $h(x)$ is reduced, $p(x)$ may increase. Meanwhile, we hope to avoid $h(x)$ becoming very large; when it does, we take steps to control it. In [5,12], a restoration algorithm is presented. In [1], the poll step plays the same role as the restoration algorithm in [12]. For convenience, we define some notation that is used throughout.

Definition 3.

\[
h_k^I = \min\{ h_i \mid h_i > 0,\ i \in \mathcal{F}_k \}, \qquad p_k^F = \min\{ p_i \mid h_i = 0,\ i \in \mathcal{F}_k \},
\]

and $p_k^I$ is the value of $p$ corresponding to $h_k^I$, for $k = 0, 1, 2, \ldots$
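The acceptance test (2.2) and the filter update can be sketched in a few lines of Python. This is only an illustration under the assumption that the filter is stored as a list of $(p_j, h_j)$ pairs; the function names are hypothetical, and the default values of `beta` and `gamma` are the ones quoted for the numerical examples in Section 5.

```python
def acceptable(p_new, h_new, filt, beta=0.98, gamma=0.05):
    """Rule (2.2): against every filter entry (p_j, h_j), the trial point must
    either reduce p by a margin gamma * h_new or reduce h by the factor beta."""
    return all(p_new <= p_j - gamma * h_new or h_new <= beta * h_j
               for (p_j, h_j) in filt)


def add_to_filter(p_new, h_new, filt):
    """Drop the entries dominated by the new pair, then append the new pair."""
    kept = [(p_j, h_j) for (p_j, h_j) in filt
            if not (p_new <= p_j and h_new <= h_j)]
    return kept + [(p_new, h_new)]


filt = [(1.0, 0.5), (2.0, 0.1)]
print(acceptable(1.5, 0.45, filt))      # improves h against one entry, p against the other
print(acceptable(2.5, 0.6, filt))       # worse in both measures than (1.0, 0.5)
print(add_to_filter(0.5, 0.05, filt))   # dominates both existing entries
```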

In other words, some device is employed to make $(h(x'), p(x'))$ lie near $(0, p^F)$ or $(h^I, p^I)$. In our algorithm, the restoration algorithm is used for this purpose. For the update of $\Delta_k$, the approach of Yuan [15] is employed. We now formally state our algorithm, which is based on (2.5), for handling (1.1).

Algorithm 1 (Trust Region Filter Algorithm).

Step 0. Choose $\Delta_0$, $x_0$, $\sigma$, $\alpha_1$, $\alpha_2$, $\beta$, $\eta$, $\eta_2$, $\gamma$, where $\Delta_0 > 0$, $0 < \alpha_1, \alpha_2, \beta, \eta, \eta_2 < 1$, $\gamma \in (0, \tfrac{1}{2})$ and $\beta \in (\tfrac{1}{2}, 1)$. Set $k := 0$, $\mathcal{F}_0 = \{ x_0 \}$.

Step 1. Compute $h_k^I$, $p_k^I$, $p_k^F$.

Step 2. Solve (2.5) to obtain $s_k$. Denote $x_k(\Delta_k) := x_k + s_k$. If $s_k = 0$, then a KT point of (1.1) is obtained; stop.

Step 3. If (2.5) has no solution, use the Restoration Algorithm (Algorithm 2) to obtain $s_k^r$. Let $x_k := x_k + s_k^r$, and update $h_k^I$ and $p_k^I$. Go to Step 2.

Step 4. Compute

\[
r_k = \frac{M_k(x_k) - M_k(x_k + s_k)}{\Psi_k(0) - \Psi_k(s_k)} . \tag{3.1}
\]

If $r_k \le \eta$, then let $x_{k+1} := x_k$, $\Delta_{k+1} := \tfrac{1}{2} \Delta_k$, $k := k + 1$ and go to Step 2. Otherwise, compute $\hat{p}_k := p(x_k(\Delta_k))$, $\hat{h}_k := h(x_k(\Delta_k))$. If $x_k(\Delta_k) = x_k + s_k$ is not acceptable to the filter, then $\Delta_{k+1} := \tfrac{1}{2} \Delta_k$. If $h(x_k) \ge \Delta_k \min\{ \eta_2, \alpha_1 \Delta_k^{\alpha_2} \}$, call the Restoration Algorithm (Algorithm 2) to produce a point $x_k^r = x_k + s_k^r$ such that:

(A) $x_k^r$ is acceptable to the filter;
(B) $h(x_k^r) \le \eta_2 \min\{ h_k^I, \alpha_1 \Delta_k^2 \}$.

Go to Step 2.

Step 5. $x_k(\Delta_k)$ is acceptable to the filter. Set $x_{k+1} := x_k(\Delta_k)$, $p_{k+1} = \hat{p}_k$, $h_{k+1} = \hat{h}_k$. Remove the points dominated by $(p_{k+1}, h_{k+1})$ from the filter according to the update rule for $\mathcal{F}_{k+1}$. Let $\Delta_{k+1} := 2\Delta_k$ and generate $B_{k+1}$.

Step 6. If $h(x_{k+1}) \le \Delta_{k+1} \min\{ \eta_2, \alpha_1 \Delta_{k+1}^{\alpha_2} \}$, then set $k := k + 1$ and go to Step 1. Otherwise, set $k := k + 1$ and call the Restoration Algorithm (Algorithm 2) to produce a point $x_k^r = x_k + s_k^r$ such that (A) and (B) are met. Go to Step 2.
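The ratio test (3.1) and the radius updates in Steps 4 and 5 can be sketched as follows. This is a simplified illustration, not the full algorithm: the subproblem solver, the filter and the restoration call are left out, the result of the filter test is passed in as a boolean, and the function names are hypothetical.

```python
def reduction_ratio(M_old, M_new, Psi_zero, Psi_step):
    """r_k = (M_k(x_k) - M_k(x_k + s_k)) / (Psi_k(0) - Psi_k(s_k)), as in (3.1)."""
    predicted = Psi_zero - Psi_step
    if predicted == 0.0:
        raise ZeroDivisionError("predicted reduction is zero")
    return (M_old - M_new) / predicted


def next_radius(ratio, filter_ok, delta, eta=0.25):
    """Return (step_accepted, new_delta): the radius is halved when the ratio
    test or the filter test fails (Step 4) and doubled on acceptance (Step 5)."""
    if ratio <= eta or not filter_ok:
        return False, 0.5 * delta
    return True, 2.0 * delta


r = reduction_ratio(M_old=1.0, M_new=0.4, Psi_zero=0.0, Psi_step=-1.0)
print(next_radius(r, filter_ok=True, delta=1.0))
print(next_radius(r, filter_ok=False, delta=1.0))
```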
If $h(x_k) > \Delta_k \min\{ \eta_2, \alpha_1 \Delta_k^{\alpha_2} \}$, the restoration algorithm (Algorithm 2) is used to compute $x_k^r$ such that (A) and (B) are satisfied. In a restoration algorithm, the aim is to decrease the value of $h(x)$. The direct way is to use Newton's method or a similar method to attack $c(x + s^r)^+ = 0$. Let

\[
M_k^j(d) = h(x_k^j) - h(x_k^j + d) .
\]

We now give a restoration algorithm, where $c_k^j = c(x_k^j)$,

\[
\Psi_k^j(d) = h(x_k^j) - \| (A_k^{jT} d + c_k^j)^+ \|_\infty \tag{3.2}
\]

and

\[
r_k^j = \frac{M_k^j(s_k^j)}{\Psi_k^j(s_k^j)} . \tag{3.3}
\]

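Algorithm 2 (Restoration Algorithm).

Step 0. Let $x_k^0 := x_k$, $\Delta_k^0 := \Delta_k$, $j := 0$, $\eta, \eta_2 \in (0, 1)$.

Step 1. If $h(x_k^j) \le \eta_2 \min\{ h_k^I, \alpha_1 \Delta_k^2 \}$ and $x_k^j$ is acceptable to the filter, then let $x_k^r := x_k^j$ and stop.

Step 2. Solve

\[
\begin{aligned}
\text{maximize}\quad & \Psi_k^j(d) \\
\text{subject to}\quad & \| d \|_\infty \le \Delta_k^j
\end{aligned}
\tag{3.4}
\]

(equivalently, minimize the linearized violation $\| (A_k^{jT} d + c_k^j)^+ \|_\infty$ over the trust region) to get $s_k^j$. Calculate $r_k^j$.

Step 3. If $r_k^j \le \eta$, then let $x_k^{j+1} := x_k^j$, $\Delta_k^{j+1} := \tfrac{1}{2} \Delta_k^j$, $j := j + 1$ and go to Step 2.

Step 4. Otherwise, let $x_k^{j+1} := x_k^j + s_k^j$, $\Delta_k^{j+1} := 2 \Delta_k^j$. Compute $A_k^{j+1}$, set $j := j + 1$ and go to Step 1.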
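The loop structure of the restoration algorithm can be illustrated on the simplest possible case: one variable and one equality constraint $c(x) = 0$, so that $h(x) = |c(x)|$ and the subproblem has the closed-form solution "Newton step clipped to the trust region". The sketch below makes several simplifying assumptions: the filter test in Step 1 is omitted, the termination threshold is passed in as a single hypothetical parameter `h_target`, and the function name `restore` is not from the paper.

```python
def restore(c, dc, x0, delta0, h_target, eta=0.25, max_iter=100):
    """Trust-region Newton-type restoration for one equality constraint c(x) = 0."""
    x, delta = x0, delta0
    for _ in range(max_iter):
        h_x = abs(c(x))
        if h_x <= h_target:                 # Step 1 (filter test omitted here)
            return x
        a = dc(x)
        if a == 0.0:
            break                           # the linearisation offers no reduction
        # Step 2: reduce |c + a*d| as far as possible subject to |d| <= delta.
        s = max(-delta, min(delta, -c(x) / a))
        predicted = h_x - abs(c(x) + a * s)     # predicted reduction of h
        actual = h_x - abs(c(x + s))            # actual reduction of h
        if actual / predicted <= eta:       # Step 3: reject the step, halve the radius
            delta *= 0.5
        else:                               # Step 4: accept the step, double the radius
            x, delta = x + s, 2.0 * delta
    raise RuntimeError("restoration did not reach the target violation")


x_r = restore(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=3.0, delta0=0.5, h_target=1e-10)
print(x_r)   # close to sqrt(2)
```

With several constraints and variables the clipped Newton step has to be replaced by a solver for the trust region subproblem; the accept/reject logic stays the same.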

The above restoration algorithm is a Newton-type method for $c(x)^+ = 0$, which is frequently used. Of course, there are other restoration algorithms, for example the interior point restoration algorithm in [12] and the SLP restoration algorithm in [4]. One can also solve a quadratic program to obtain the result. In fact, Algorithm 2 handles a sequence of linear programming problems. Note that $M_k(x_k) - M_k(x_k + s_k)$ and $\Psi_k(0) - \Psi_k(s_k)$ are permitted to be negative at the same time. The above method is thus a non-monotone algorithm. The minimum requirement on the restoration algorithm is finite termination, which is shown in Section 4. The choice of $\sigma$ is discussed in Section 5.

4. The convergence properties

As in [1,4,5,7], our analysis of the algorithm is based on the following standard assumptions. To obtain convergence, the sufficient reduction condition plays a crucial role.

Assumption 1.

(1) The set $\{ x_k \} \subset X$ is non-empty and bounded.
(2) The functions $f(x)$ and $c(x)$ are twice continuously differentiable on an open set containing $X$.
(3) When solving (3.4), we have $\Psi_k^j(s_k^j) = h(x_k^j) - \| (A_k^{jT} s_k^j + c(x_k^j))^+ \|_\infty \ge \beta_2 \min\{ h(x_k^j), \Delta_k^j \}$, where $\beta_2 > 0$ is a constant.
(4) The matrix sequence $\{ B_k \}$ is bounded.

(1) and (2) are standard assumptions. (3) is the sufficient reduction condition; it is very mild because the Cauchy step satisfies it. In a trust region method, (3) guarantees global convergence. To simplify the analysis, we take (3) as an assumption. (4) plays an important role in obtaining the convergence result, but it has only a minor effect on the local convergence rate. The following results are based on Assumption 1. For the restoration algorithm we obtain the following result.

Lemma 1. The Restoration Algorithm (Algorithm 2) terminates in a finite number of iterations.

Proof. Assume that the restoration algorithm does not terminate finitely. The termination criterion is certainly satisfied if $\lim_{j \to \infty} h(x_k^j) = 0$. We hence assume that there exists a constant $\epsilon > 0$ with $h(x_k^j) > \epsilon$ for all $j$, and show that this leads to a contradiction. Let $K = \{ j : r_k^j \ge \eta \}$. According to Assumption 1 and (3.3), we have

\[
\infty > \sum_{j=1}^{\infty} \left[ h(x_k^j) - h(x_k^{j+1}) \right] \ge \eta \sum_{j \in K} \Psi_k^j(s_k^j) \ge \eta \beta_2 \sum_{j \in K} \min\{ \epsilon, \Delta_k^j \} .
\]

Therefore, $\lim_{j \to \infty} \Delta_k^j = 0$ for $j \in K$. In terms of Algorithm 2, we obtain $\lim_{j \to \infty} \Delta_k^j = 0$ for all $j$. On the other hand,

\[
\Psi_k^j(s_k^j) = M_k^j(s_k^j) + o(\Delta_k^j)
\]

when $\Delta_k^j \to 0$. Since $\Psi_k^j(s_k^j) \ge \beta_2 \min\{ \epsilon, \Delta_k^j \}$, the ratio $r_k^j$ tends to 1 and hence exceeds $\eta$ for all sufficiently small $\Delta_k^j$. Thus $\Delta_k^{j+1} \ge \Delta_k^j$ by virtue of Step 4 of the Restoration Algorithm. Namely, $\{ \Delta_k^j \}$ is increased when $\Delta_k^j$ is very small, which contradicts $\lim_{j \to \infty} \Delta_k^j = 0$. The result therefore holds and the proof is complete. □

We next investigate the convergence properties of Algorithm 1.

Lemma 2. Every new iterate $x_{k+1} \ne x_k$ is acceptable to the filter set $\mathcal{F}$.

Proof. In Algorithm 1, a new iterate is produced in Step 5 or Step 6. In both cases, $x_{k+1}$ is accepted by the filter. □

Theorem 1. Suppose that infinitely many points are added to the filter. Then

\[
\lim_{k \to \infty} h(x_k) = 0 .
\]

Proof. See [10]. □

If only finitely many points are added to the filter, the following conclusion holds.

Theorem 2. Suppose that finitely many points are added to the filter. Then $h(x_k) = 0$ for some $k_1$ and all $k > k_1$.

Proof. The result is obvious from Algorithm 1. □

Similar to [5,8], the global convergence theorem concerns the Kuhn–Tucker (KT) necessary conditions under the Mangasarian–Fromowitz constraint qualification (MFCQ). This is an extended form of the Fritz John conditions for a problem that includes equality constraints. We now analyze the properties of the solution of (2.5). In fact, (2.5) can be stated in the following equivalent form:

\[
\begin{aligned}
\text{minimize}\quad & \Psi_k(d) = g_k^T d + \tfrac{1}{2} d^T B_k d \\
\text{subject to}\quad & c_i + A_i^T d = 0, \quad i \in E, \\
& c_i + A_i^T d \le 0, \quad i \in I, \\
& \| d \|_\infty \le \Delta_k ,
\end{aligned}
\tag{4.1}
\]

where $E = \{ 1, 2, \ldots, m_e \}$ and $I = \{ m_e + 1, m_e + 2, \ldots, m \}$. Without loss of generality, we assume that $\| \nabla c_i \|_\infty \le M_2$, $\| \nabla^2 f \|_\infty \le M_2$ and $\| \nabla^2 c_i \|_\infty \le M_2$ for all $x \in X$ and $i = 1, 2, \ldots, m$, where $M_2 > 0$ is a constant. We now consider $c_i(x_k(\Delta_k))$. From the Taylor expansion, when (4.1) is consistent,

\[
c_i(x_k(\Delta_k)) \le c_i(x_k) + A_i^T d + \tfrac{1}{2} n^2 M_2 \Delta_k^2 .
\]

Thus,

\[
h(x_k(\Delta_k)) \le \tfrac{1}{2} n^2 M_2 \Delta_k^2 . \tag{4.2}
\]

As for (4.1), Fletcher et al. give a thorough analysis in [7], which we restate as follows.

Theorem 3. Let Assumption 1 hold and let $x^\circ \in X$ be a feasible point of problem (1.1) at which MFCQ holds, but which is not a KT point. Then there exist a neighborhood $N^\circ$ of $x^\circ$ and positive constants $\xi_1, \xi_2, \xi_3$ such that for all $x_k \in N^\circ \cap X$ and all $\Delta_k$ for which

\[
\xi_2 h(c(x_k)) \le \Delta_k \le \xi_3 , \tag{4.3}
\]

it follows that (4.1) has a feasible solution $s_k$ at which the predicted reduction satisfies

\[
-\Psi_k(s_k) \ge \tfrac{1}{3} \xi_1 \Delta_k . \tag{4.4}
\]

If $\Delta_k \le (1 - \eta_3) \xi_1 / (3 n M_2)$, then

\[
f(x_k) - f(x_k + d) \ge -\eta_3 \Psi_k(d) , \tag{4.5}
\]

where $\eta < \eta_3$. If $h_k > 0$ and $\Delta_k \le \sqrt{2 \beta h_k / (n^2 M_2)}$, then $h(x_k + d) \le \beta h_k$.

Proof. The results are obtained by arguments similar to those of Lemmas 4 and 5 in [7]. (4.3)–(4.5) are obtained by an approach similar to that of Lemma 5 in [7]. We have $h(x_k + d) \le \beta h_k$ when $h_k > 0$ and $\Delta_k \le \sqrt{2 \beta h_k / (n^2 M_2)}$ according to (4.2) or Lemma 4 in [7]. The result therefore holds and the proof is complete. □

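Lemma 3. Let $\{ x_k \} \subset X$ be generated by the algorithm and have an accumulation point satisfying the conditions of Theorem 3. If infinitely many points enter the filter, then only finitely many points are related to the Restoration Algorithm.

Proof. According to Theorem 3, consider the conditions under which a point is accepted. We first show the result for $\sigma \ne 0$. According to the result in [10], we have $\Delta \to 0$ if the algorithm does not generate a point satisfying the conditions of Theorem 3. When

\[
\Delta_k \le \min\left\{ \xi_3,\ \frac{(1 - \eta_3) \xi_1}{3 n M_2},\ \left( \frac{\eta_3 \xi_1}{3 \alpha_1 |\sigma|} \right)^{1/\alpha_2},\ \sqrt{\frac{2 \beta h_k}{n^2 M_2}} \right\}, \tag{4.6}
\]

\[
\Delta_k \ge \left( \frac{1}{\alpha_1} h_k \right)^{\frac{1}{1 + \alpha_2}}, \tag{4.7}
\]

and $h_k > 0$, then (4.4) and (4.5) both hold. Moreover, according to Theorem 3, we obtain $h(x_k + s_k) \le \beta h_k$. The point $x_k + s_k$ is therefore accepted by the filter, because

\[
\begin{aligned}
p(x_k) - p(x_k + s_k) &= f(x_k) - f(x_k + s_k) + \sigma \left( h_k - h(x_k + s_k) \right) \\
&\ge -\eta_3 \Psi_k(s_k) - |\sigma| h_k \ge \tfrac{1}{3} \eta_3 \xi_1 \Delta_k - |\sigma| \alpha_1 \Delta_k^{1 + \alpha_2} > 0
\end{aligned}
\]

and $h(x_k + s_k) \le \beta h_k$.

Further, under (4.6) and (4.7), $\Delta_{k+j} \ge \left( \frac{1}{\alpha_1} h_k \right)^{\frac{1}{1 + \alpha_2}}$ is satisfied for any $j \ge 1$ if $k$ is large enough. Namely, every trial step satisfies (4.7) for $j \ge 1$. If there is a Restoration Algorithm, then from (4.4) there exists a constant $\epsilon_0$ with $\tfrac{1}{3} \xi_1 \ge \epsilon_0 > 0$ such that $\Psi^*_{k+j}(s_{k+j}) \ge \epsilon_0 \| s_{k+j} \|$.

From (4.2) and the above analysis, if $\Delta_{k+j}^{\alpha_2} \le \dfrac{\eta_3 \epsilon_0}{|\sigma| \alpha_1 + 2 \gamma n^2 M_2}$, we have

\[
\begin{aligned}
p(x_{k+j}) - p(x_{k+j} + s_{k+j}) &\ge \eta_3 \Psi^*_{k+j}(s_{k+j}) + \sigma h(x_{k+j}) - \sigma h(x_{k+j} + s_{k+j}) \\
&\ge \eta_3 \Psi^*_{k+j}(s_{k+j}) - |\sigma| h(x_{k+j}) \\
&\ge \eta_3 \epsilon_0 \Delta_{k+j} - |\sigma| \alpha_1 \Delta_{k+j}^{1 + \alpha_2} \ge 2 \gamma n^2 M_2 \Delta_{k+j}^{1 + \alpha_2} \\
&\ge 2 \gamma n^2 M_2 \Delta_{k+j}^2 \ge \gamma h(x_{k+j}(\Delta_{k+j})) .
\end{aligned}
\]

The above inequality holds because $\lim_{k \to \infty} \| s_{k+j}^r \| \le \lim_{k \to \infty} \tau_0 h_{k+j} = 0$ and $\Delta_k \to 0$. It means that $p$ is monotonically decreasing and that there exists a trial point with radius $\Delta_{k+j}$ accepted by the filter for

\[
\Delta_{k+j} \ge \left( \frac{1}{\alpha_1} h_k \right)^{\frac{1}{1 + \alpha_2}} \quad \text{and} \quad \Delta_{k+j} \le \min\left\{ \xi_3,\ \frac{(1 - \eta_3) \xi_1}{3 n M_2},\ \left( \frac{\eta_3 \xi_1}{3 \alpha_1 |\sigma|} \right)^{1/\alpha_2},\ \sqrt{\frac{2 \beta h_k}{n^2 M_2}} \right\} .
\]

Therefore, $\Delta_{k+j+1} \ge \left( \frac{1}{\alpha_1} h_k \right)^{\frac{1}{1 + \alpha_2}}$.

We now show that (4.7) is reasonable. If (4.7) were always false, we would have $\Delta_k < \left( \frac{1}{\alpha_1} h_k \right)^{\frac{1}{1 + \alpha_2}}$ and $\lim_{k \to \infty} \Delta_k = 0$ according to $\lim_{k \to \infty} h_k = 0$. Furthermore, from (4.2), when $2 \Delta_{k+1} \ge \Delta_k$ and $\Delta_{k+1} \le \left( \dfrac{\alpha_1}{8 n^2 M_2} \right)^{\frac{1}{1 - \alpha_2}}$ for sufficiently large $k$,

\[
h_{k+1} \le 2 n^2 M_2 \Delta_k^2 \le 8 n^2 M_2 \Delta_{k+1}^2 .
\]

Thus, $\left( \frac{1}{\alpha_1} h_{k+1} \right)^{\frac{1}{1 + \alpha_2}} \le \Delta_{k+1}$. With a similar analysis, we have $\left( \frac{1}{\alpha_1} h_{k+i} \right)^{\frac{1}{1 + \alpha_2}} \le \Delta_{k+i}$ for all integers $i \ge 1$. On the other hand, by the hypothesis that infinitely many points enter the filter, there is a corresponding $j$ with $\Delta_{k+j} < \left( \frac{1}{\alpha_1} h_{k+j} \right)^{\frac{1}{1 + \alpha_2}}$ because $\lim_{k \to \infty} \Delta_k = 0$. This is a contradiction. The result therefore holds if $\sigma \ne 0$. If $\sigma = 0$, replacing (4.6) by

\[
\Delta_k \le \min\left\{ \xi_3,\ \frac{(1 - \eta_3) \xi_1}{3 n M_2},\ \sqrt{\frac{2 \beta h_k}{n^2 M_2}} \right\}, \tag{4.6b}
\]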

we obtain a similar conclusion. The result is accordingly obtained and the proof is complete. □

The global convergence can then be obtained by using the results of Theorems 1–3 and Lemma 3.

Theorem 4. Let Assumption 1 hold and let $\{ x_k \} \subset X$ be generated by the algorithm, with an accumulation point that is a feasible point of problem (1.1) at which MFCQ holds. Then $\{ x_k \}$ has an accumulation point which is a KT point.

Proof. If Algorithm 1 terminates finitely and $s_k = 0$ for some $k$, a KT point is obtained and the result holds. Now assume that Algorithm 1 does not terminate finitely. If the result were false, then according to Theorem 1, the result in [14] and the above analysis, there would exist an integer $k_0$ such that $h_k < \xi_2$ is small enough and $p(x_k) - p(x_k + s_k) \ge \eta [\Psi_k(0) - \Psi_k(s_k)]$ for all $k > k_0$. Furthermore, by Lemma 3 the Restoration Algorithm is not invoked for $k > k_0$. For convenience, we denote

\[
K_2 = \{ k \mid k > k_0 \ \text{and} \ p(x_k) - p(x_k + s_k) \ge \eta [\Psi_k(0) - \Psi_k(s_k)] \} .
\]

On the other hand, from Assumption 1, we have

\[
\infty > \sum_{K_2} \left[ p(x_k) - p(x_{k+1}) \right] \ge \sum_{K_2} \left[ -\eta_3 \Psi_k(s_k) + \sigma (h_k - h_{k+1}) \right] \ge \sum_{K_2} \left[ \tfrac{1}{3} \eta_3 \xi_1 \Delta_k + \sigma (h_k - h_{k+1}) \right] . \tag{4.8}
\]

Therefore, $\lim_{k \to \infty} \Delta_k = 0$ for $k \in K_2$ because $h_k \to 0$. In terms of the definition of $\Delta_k$, we hence have $\lim_{k \to \infty} \Delta_k = 0$ for all $k$. There thus exists $\Delta_k$ such that
a2 D > k

6 jrj maxf1 b; cg; g3 n1 a1

4:9

for sufficiently large $k$ and

\[
2 \alpha_1 \Delta_k^{1 + \alpha_2} \ge h(x_k(\Delta_k)) . \tag{4.10}
\]

Eq. (4.10) is justified by the Taylor expansion of $c(x_k(\Delta_k))$. In terms of (4.2), $h(x_k(\Delta_k)) \le \tfrac{1}{2} n^2 M_2 \Delta_k^2$, and $h(x_k) \le \alpha_1 \Delta_k^{1 + \alpha_2}$ from the Restoration Algorithm. We consider the next trial step $x_k(\Delta_k)$. (4.10) hence holds when $\Delta_k \le \left( \dfrac{4 \alpha_1}{n^2 M_2} \right)^{\frac{1}{1 - \alpha_2}}$. From the above analysis, when $\Delta$ is small enough to satisfy the condition (4.6) or (4.6b) in Lemma 3, $p(x_k)$ is monotonically decreasing and
1

1 pxk pxk Dk P g3 n1 Dk jrjjhk hxk Dk j 3 1 a2 1a2 g3 n1 D jrjjhk hxk Dk j k Dk 3 a2 jrhk j P jrj maxf1 b; cg2a1 D1 k P maxf1 b; cghxk Dk ; 4:11

which means that $x_k(\Delta_k)$ will be accepted by the filter. Then $\Delta_k$ will not be reduced when $\Delta_k$ is small enough and $k > k_0$, which contradicts the fact that $\lim_{k \to \infty} \Delta_k = 0$. The result therefore holds and the proof is complete. □

The global convergence properties have thus been established. Algorithm 1 is an SQP trust-region method combined with a new filter technique. A new filter technique is therefore added to the family of filter methods in this paper.

5. Concluding remarks and numerical results

Researchers attach importance to filter approaches because filter methods can efficiently balance the objective function and the constraint violation function. Combining various objective functions with the filter technique yields new methods. From Figs. 1 and 2, the differences between our approach and other filter methods are clear. Of course, different choices of $p$ can be employed to obtain various methods. $B_k$ is an approximate Hessian, which can be updated by methods such as BFGS; it has a minor effect on the convergence properties. We also do not discuss the initial trust region radius, although it plays an important role in obtaining good numerical results. As for $\sigma$, it can be positive or negative because the filter method guarantees $h_k \to 0$. Therefore, it has no effect on the convergence of our algorithm. In earlier papers on filter methods, $\sigma = 0$. When $\sigma > 0$, the acceptance condition becomes stricter, whereas when $\sigma < 0$ it becomes milder. With a suitable $\sigma$, some bad cases, for example the Maratos effect, are efficiently overcome. Given a certain $\sigma < 0$, the Maratos effect can be avoided. (In the following examples, the tolerance is 1.0e-7, $\beta = 0.98$, $\gamma = 0.05$, $\alpha_1 = \alpha_2 = \tfrac{1}{2}$, $\eta_2 = 0.1$ and $\eta = 0.25$.)

Example 1. Consider the following problem:

\[
\begin{aligned}
\text{minimize}\quad & f(u, v) = 3 v^2 - 2 u \\
\text{subject to}\quad & u - v^2 = 0 .
\end{aligned}
\]

This problem has an optimal solution $(u, v) = (0, 0)$. The initial point is $x_0 = (u_0, v_0) = (t_0^2, t_0)$, where $t_0$ is very small. Let

\[
B = W(x^*, \lambda^*) = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix} .
\]
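The role of $\sigma$ in Example 1 can be checked by hand with a short Python sketch. It is not the computation reported in the paper: it takes a single full step of the equality constrained quadratic subproblem with $B = \mathrm{diag}(0, 2)$, ignores the trust region bound and the restoration phase, and tests the acceptance rule (2.2) against a filter that contains only the starting point. The helper names are hypothetical.

```python
GAMMA, BETA = 0.05, 0.98


def f(u, v):
    return 3.0 * v ** 2 - 2.0 * u


def h(u, v):
    return abs(u - v ** 2)          # violation of the single equality constraint


def p(u, v, sigma):
    return f(u, v) + sigma * h(u, v)


def sqp_step(u, v):
    """Full step for B = diag(0, 2), ignoring the trust region bound.
    The linearised constraint c + d_u - 2 v d_v = 0 gives d_u = 2 v d_v - c;
    substituting into -2 d_u + 6 v d_v + d_v^2 leaves 2 v d_v + d_v^2 + const,
    which is minimised at d_v = -v."""
    c = u - v ** 2
    d_v = -v
    d_u = 2.0 * v * d_v - c
    return d_u, d_v


def accepted(sigma, t0):
    """Apply rule (2.2) to the full step from x0 = (t0^2, t0), filter = {x0}."""
    u0, v0 = t0 ** 2, t0
    d_u, d_v = sqp_step(u0, v0)
    u1, v1 = u0 + d_u, v0 + d_v
    p0, h0 = p(u0, v0, sigma), h(u0, v0)
    p1, h1 = p(u1, v1, sigma), h(u1, v1)
    return p1 <= p0 - GAMMA * h1 or h1 <= BETA * h0


for sigma in (-4.0, 0.0, 4.0):
    print(sigma, accepted(sigma, 0.2))
```

Starting from the feasible point $(t_0^2, t_0)$, the full step leads to $(-t_0^2, 0)$, where $f$ rises from $t_0^2$ to $2 t_0^2$ and $h$ rises from 0 to $t_0^2$. With $\sigma = 0$ or $\sigma = 4$ the step is therefore rejected, while with $\sigma = -4$ the value $p = f + \sigma h$ decreases and the step passes the test.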

We now give some numerical results for various values of $\sigma$ and various initial points (Table 1). When $t_0 = 0.2$, the change of $\sigma$ has some effect on the result. When $\sigma = 4.0$, the algorithm fails because $\Delta$ is reduced to less than 2.0e-20. We then give the solution for $t_0 = 0.01$ (Table 2). When $\sigma = 4.0$, the algorithm fails because $\Delta$ becomes too small, being reduced to less than 2.0e-19. When $\sigma = 0$, we get the optimal solution with 2 iterates but with 14 trial steps. We compare them in Table 3, where Iter, NF and NG denote the numbers of iterates, evaluations of $f$ and evaluations of the gradient, respectively. In brief, according to the numerical results in Example 1, choosing $\sigma < 0$ is an efficient way to avoid the Maratos effect.

Table 1
Iterates for t0 = 0.2

sigma    Iterate 1                                        Iterate 2
-4.0     (2.40000000000000000e-2, 0.00000000000000000)    (0.00000000000000000, 0.000000000000000000)
 4.0

Table 2
Iterates for t0 = 0.01

sigma    Iterate 1                                            Iterate 2
-4.0     (9.80000000000000010e-5, 1.7347234759768071e-18)     (0.00000000000000000e+00, 1.7350634817780987e-18)
 4.0

Table 3
Comparison for t0 = 0.01

sigma    Iter    NF    NG
-4.0     2       3     2
 0.0     3       14    3
 4.0

The second example is now considered.

Example 2 (Distribution of electrons on a sphere). Given $n_p$ electrons, find the equilibrium state distribution (of minimal Coulomb potential) of the electrons positioned on a conducting sphere. This problem, known as the Thomson problem of finding the lowest energy configuration of $n_p$ point charges on a conducting sphere, is representative of an important class of problems in physics and chemistry that determine a structure with respect to atomic positions. The problem is defined by

\[
\begin{aligned}
\text{minimize}\quad & f(x, y, z) = \sum_{i=1}^{n_p - 1} \sum_{j=i+1}^{n_p} \left( (x_i - x_j)^2 + (y_i - y_j)^2 + (z_i - z_j)^2 \right)^{-\frac{1}{2}} \\
\text{subject to}\quad & x_i^2 + y_i^2 + z_i^2 = 1, \quad i = 1, \ldots, n_p .
\end{aligned}
\]
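For readers who wish to experiment with Example 2, the objective and the constraint residuals can be written down directly. The following Python sketch only evaluates the functions; it is not the solver used for the reported results, and the function names are hypothetical.

```python
import math


def coulomb_energy(points):
    """Sum of 1 / distance over all unordered pairs of points (x, y, z)."""
    total = 0.0
    n_p = len(points)
    for i in range(n_p - 1):
        for j in range(i + 1, n_p):
            total += 1.0 / math.dist(points[i], points[j])
    return total


def sphere_residuals(points):
    """Equality constraint values x_i^2 + y_i^2 + z_i^2 - 1, one per point."""
    return [x * x + y * y + z * z - 1.0 for (x, y, z) in points]


poles = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
print(coulomb_energy(poles), sphere_residuals(poles))
```

For two antipodal points the distance is 2, so the energy is 0.5 and both constraint residuals vanish.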

The starting point is a quasi-uniform distribution of the points on the unit sphere. The results are given in Table 4.

Table 4

n = 3np    m = np    sigma    NF    Ng    CPU time (s)
60         20        -5.0     4     3     0.587612986564636
60         20         0.0     6     4     0.657778024673461
60         20         5.0     7     5     0.692695021629333
150        50        -5.0     4     3     8.371495246887207
150        50         0.0     5     4     9.153695106506348

We let $\sigma$ be a constant throughout this paper, but the detailed choice of $\sigma$ is extremely flexible. One can certainly give criteria to choose or to update $\sigma$. Moreover, monotonicity of $\sigma$ is not necessary, and the value of $\sigma$ need not be very large. In a restoration algorithm, the natural idea is to minimize $h(x)$. In the convergence analysis of Algorithm 1, the sufficient reduction condition is required, as it is in both filter methods and trust region approaches. In our algorithm, the merit function is necessary so that the information on $f(x)$ fits the quadratic model. There are certainly some differences from [4,5,12]; we thus give a different kind of trust region filter method. Algorithm 1 is non-monotone with respect to $f$, $h$ and $p$. As for the parameters $\beta$ and $\gamma$, where $1 > \beta > \gamma > 0$, $\beta$ is required to be close enough to 1 and $\gamma$ sufficiently close to 0. This has several advantages: (1) the acceptance criterion is not very strong; (2) the relation between $f$ and $h$ becomes more implicit. Certainly, some conditions in the algorithm can be modified while retaining global convergence. In the algorithm, we would like to avoid the linkage between $f$ and $h$, but this is not yet possible, because when the relation is lost the convergence properties are no longer guaranteed and a non-descent direction may be produced. As for the norm in the constraint violation function, other forms can be chosen, such as the $\ell_1$-norm and the $\ell_2$-norm; for example, the $\ell_1$-norm is employed in [1]. Similar conclusions can be obtained immediately. In the numerical results, this technique performs better to a certain degree. It is well known that filter methods have the advantage of high efficiency.

References
[1] C. Audet, J.E. Dennis, A pattern search filter method for nonlinear programming without derivatives, SIAM Journal on Optimization 14 (2004) 980–1010.
[2] C.M. Chin, R. Fletcher, On the global convergence of an SLP-filter algorithm that takes EQP steps, Mathematical Programming 96 (2003) 161–177.
[3] A.R. Conn, N.I.M. Gould, Ph. Toint, Trust Region Methods, MPS-SIAM Series on Optimization, SIAM Publications, Philadelphia, PA, 2000.
[4] R. Fletcher, S. Leyffer, Nonlinear programming without a penalty function, Mathematical Programming 91 (2002) 239–269.
[5] R. Fletcher, S. Leyffer, P.L. Toint, On the global convergence of an SLP-filter algorithm, Tech. Report NA/183, Department of Mathematics, University of Dundee, 1998.
[6] R. Fletcher, S. Leyffer, User manual for filter SQP, Tech. Report NA/181, Department of Mathematics, University of Dundee, 1998.
[7] R. Fletcher, S. Leyffer, P.L. Toint, On the global convergence of a SQP-filter algorithm, SIAM Journal on Optimization 13 (2002) 44–59.
[8] R. Fletcher, S. Leyffer, A bundle filter method for nonsmooth nonlinear optimization, Tech. Report NA/195, Department of Mathematics, University of Dundee, 1999.
[9] R. Fletcher, N.I.M. Gould, S. Leyffer et al., Global convergence of a trust-region SQP-filter algorithm for general nonlinear programming, SIAM Journal on Optimization 13 (2003) 635–659.
[10] P.Y. Nie, Composite-step like filter methods for equality constraint problems, Journal of Computational Mathematics 21 (2003) 613–624.
[11] J. Nocedal, S. Wright, Numerical Optimization, Springer Verlag, New York, NY, 1999.
[12] M. Ulbrich, S. Ulbrich, L.N. Vicente, A globally convergent primal-dual interior-point filter method for nonconvex nonlinear programming, Mathematical Programming 100 (2004) 379–410.
[13] S. Ulbrich, On the superlinear local convergence of a filter-SQP methods, Mathematical Programming 100 (2004) 217–245.
[14] Y.X. Yuan, On the convergence of a new trust region algorithm, Numerische Mathematik 70 (1995) 515–539.
[15] Y. Yuan, On a subproblem of trust region algorithms for constrained optimization, Mathematical Programming 47 (1990) 53–63.