
International Journal of Advanced Science and Technology

Vol. 29, No. 05, (2020), pp. 187-198

Convergence Analysis of a New Coefficient Conjugate Gradient Method Under Exact Line Search

1,*Maulana Malik, 2Mustafa Mamat, 3Siti Sabariah Abas, 4Sukono

1 Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Indonesia, Depok, Indonesia
2,3 Faculty of Informatics and Computing, Universiti Sultan Zainal Abidin, Terengganu, Malaysia
4 Department of Mathematics, Faculty of Mathematics and Natural Science, Universitas Padjadjaran, Bandung, Indonesia

Abstract
Conjugate gradient (CG) methods are instrumental in solving large-scale unconstrained optimization problems. In this paper, we propose a new family of CG coefficients that satisfies the sufficient descent condition and possesses global convergence properties. The new CG method is evaluated on a set of test functions under the exact line search, and its performance is compared with that of several well-known CG methods in terms of the number of iterations (NOI) and central processing unit (CPU) time. The results show that, of all the methods tested, the new CG method has the best performance.

Keywords: Conjugate Gradient Method, Conjugate Gradient Coefficient, Sufficient Descent, Global Convergence, Exact Line Search

1. Introduction
Consider the unconstrained optimization problem in 𝑛 variables:

min_{𝒙 ∈ ℝ^n} 𝑓(𝒙),    (1)

where 𝑓: ℝ^n → ℝ is smooth and ℝ^n denotes the 𝑛-dimensional Euclidean space. The nonlinear conjugate gradient method for (1) is defined by the iterative formula [1]

𝒙𝑘+1 = 𝒙𝑘 + 𝛼𝑘 𝒅𝑘 , 𝑘 = 0, 1, 2, … (2)

where 𝒙𝑘 is the 𝑘th iterate, 𝛼𝑘 > 0 is the step length obtained by a one-dimensional line search, and 𝒅𝑘 is the search direction of 𝑓 at 𝒙𝑘 defined by

𝒅𝑘 = −𝒈𝑘 for 𝑘 = 0, and 𝒅𝑘 = −𝒈𝑘 + 𝛽𝑘 𝒅𝑘−1 for 𝑘 ≥ 1,    (3)

where 𝒈𝑘 = ∇𝑓(𝒙𝑘) is the gradient of 𝑓 at 𝒙𝑘 and 𝛽𝑘 ∈ ℝ is a scalar parameter that determines the particular conjugate gradient method. The step length 𝛼𝑘 is obtained by a line search, which may be exact or inexact, in one of the following forms:

𝑓(𝒙𝑘 + 𝛼𝑘 𝒅𝑘) = min_{𝛼 ≥ 0} 𝑓(𝒙𝑘 + 𝛼 𝒅𝑘),    (4)

and

𝑓(𝒙𝑘 + 𝛼𝑘 𝒅𝑘) ≤ 𝑓(𝒙𝑘) + 𝛿𝛼𝑘 𝒈𝑘^T 𝒅𝑘,
𝒈(𝒙𝑘 + 𝛼𝑘 𝒅𝑘)^T 𝒅𝑘 ≥ 𝜎𝒈𝑘^T 𝒅𝑘,    (5)

with 0 < 𝛿 < 𝜎 < 1 [2], or

𝑓(𝒙𝑘 + 𝛼𝑘 𝒅𝑘) ≤ 𝑓(𝒙𝑘) + 𝛿𝛼𝑘 𝒈𝑘^T 𝒅𝑘,


|𝒈(𝒙𝑘 + 𝛼𝑘 𝒅𝑘)^T 𝒅𝑘| ≤ −𝜎𝒈𝑘^T 𝒅𝑘,    (6)

with 0 < 𝛿 < 𝜎 < 1; here and throughout, ‖·‖ denotes the Euclidean norm [2]. Equation (4) is the exact line search, while conditions (5) and (6) are the (standard and strong) Wolfe forms of inexact line search. Other inexact line searches include the Armijo [3], Goldstein [4], Wolfe [5] and Grippo-Lucidi [6] rules. In this paper, the step size 𝛼𝑘 is computed by the exact line search (4).
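
To make the exact line search (4) concrete, the following is a minimal sketch (not the authors' MATLAB code) that approximates 𝛼𝑘 numerically with SciPy's bounded scalar minimizer; the upper bound alpha_max and the helper name exact_line_search are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def exact_line_search(f, x, d, alpha_max=10.0):
    """Approximate alpha_k = argmin_{alpha >= 0} f(x + alpha * d), as in (4)."""
    phi = lambda alpha: f(x + alpha * d)          # one-dimensional restriction of f
    return minimize_scalar(phi, bounds=(0.0, alpha_max), method="bounded").x

# Example on the quadratic f(x) = x^T x along the steepest descent direction:
f = lambda x: float(x @ x)
x0 = np.array([3.0, -2.0])
d0 = -2.0 * x0                                    # -grad f(x0)
print(exact_line_search(f, x0, d0))               # about 0.5, the exact minimizer
```
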
There are many well-known formulas for 𝛽𝑘, such as those of Hestenes and Stiefel (HS) [7], Fletcher and Reeves (FR) [8], Conjugate Descent (CD) [9], Dai and Yuan (DY) [10], Wei, Yao and Liu (WYL) [11], Rivaie, Mustafa, Ismail and Leong (RMIL) [12], and Polak and Ribiere (PRP) [13]. They are given, respectively, as follows:

𝛽𝑘^HS = 𝒈𝑘^T(𝒈𝑘 − 𝒈𝑘−1) / 𝒅𝑘−1^T(𝒈𝑘 − 𝒈𝑘−1),    (7)

𝛽𝑘^FR = ‖𝒈𝑘‖² / ‖𝒈𝑘−1‖²,    (8)

𝛽𝑘^CD = −‖𝒈𝑘‖² / 𝒅𝑘−1^T 𝒈𝑘−1,    (9)

𝛽𝑘^DY = ‖𝒈𝑘‖² / 𝒅𝑘−1^T(𝒈𝑘 − 𝒈𝑘−1),    (10)

𝛽𝑘^WYL = 𝒈𝑘^T(𝒈𝑘 − (‖𝒈𝑘‖/‖𝒈𝑘−1‖) 𝒈𝑘−1) / ‖𝒈𝑘−1‖²,    (11)

𝛽𝑘^RMIL = 𝒈𝑘^T(𝒈𝑘 − 𝒈𝑘−1) / ‖𝒅𝑘−1‖²,    (12)

𝛽𝑘^PRP = 𝒈𝑘^T(𝒈𝑘 − 𝒈𝑘−1) / ‖𝒈𝑘−1‖².    (13)
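
As an illustration only, the following sketch collects a few of the coefficients (8), (11), (12) and (13) as NumPy functions; the function names and the small safeguard EPS are assumptions added for numerical safety, not part of the original formulas.

```python
import numpy as np

EPS = 1e-16  # tiny safeguard against division by zero (an added assumption)

def beta_fr(g, g_prev, d_prev):        # (8) Fletcher-Reeves
    return (g @ g) / (g_prev @ g_prev + EPS)

def beta_wyl(g, g_prev, d_prev):       # (11) Wei-Yao-Liu
    scaled = (np.linalg.norm(g) / (np.linalg.norm(g_prev) + EPS)) * g_prev
    return g @ (g - scaled) / (g_prev @ g_prev + EPS)

def beta_rmil(g, g_prev, d_prev):      # (12) RMIL
    return g @ (g - g_prev) / (d_prev @ d_prev + EPS)

def beta_prp(g, g_prev, d_prev):       # (13) Polak-Ribiere
    return g @ (g - g_prev) / (g_prev @ g_prev + EPS)
```
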

The sufficient descent condition and the global convergence property are the most widely studied properties of CG methods. Wei et al. [11] proposed the WYL coefficient, a modification of the PRP coefficient, which satisfies the sufficient descent condition and converges globally under the exact, Grippo-Lucidi and Wolfe line searches [11]. Similarly, the RMIL coefficient of Rivaie et al. [12] satisfies the sufficient descent condition and converges globally under the exact line search. RMIL replaces the denominator of PRP and HS with a new formula while retaining their numerator, and its numerical results show the best performance compared with the other standard CG methods (see [12]).
For studies describing recent CG coefficients, with important results and various modifications of 𝛽𝑘, see Rivaie et al. [14], Yousif [15], Basri and Mustafa [16], Waziri et al. [17], Yuan et al. [18], Liu [19], Babaie-Kafaki [20], Liu et al. [21], Huang et al. [22], Kui et al. [23], Yang et al. [24], Xu et al. [25], Zhu et al. [26] and Guo and Wan [27].
In this paper we present a new CG coefficient 𝛽𝑘 and compare its efficiency with the classical FR, CD, DY, WYL and RMIL formulas. Section 2 introduces the new formula for the CG coefficient together with an algorithm for solving unconstrained optimization problems. Section 3 establishes the sufficient descent condition and proves the global convergence of the new method. Section 4 presents the numerical results and discussion. Finally, the conclusion is given in Section 5.

2. New Conjugate Gradient Coefficient

In this section we describe how the new CG coefficient is developed. The new coefficient is denoted 𝛽𝑘^MMSS, where MMSS stands for Malik, Mustafa, Sabariah and Sukono. For 𝛽𝑘^MMSS we modify the numerator of WYL by subtracting an additional gradient 𝒈𝑘−1,

retain the original denominator of RMIL, and prevent negative values of 𝛽𝑘. Hence,
𝛽𝑘^MMSS = max{0, 𝒈𝑘^T(𝒈𝑘 − (‖𝒈𝑘‖/‖𝒈𝑘−1‖) 𝒈𝑘−1 − 𝒈𝑘−1) / ‖𝒅𝑘−1‖²}.    (14)

The algorithm is given as follows:


Algorithm 1
Step 1: Initialization. Given 𝒙0 ∈ ℝ^n, set 𝑘 = 0.
Step 2: Compute 𝛽𝑘 based on (8) to (12) or (14).
Step 3: Compute 𝒅𝑘 based on (3). If 𝒈𝑘 = 0, then stop.
Step 4: Compute 𝛼𝑘 based on the exact line search (4).
Step 5: Update the new point based on (2).
Step 6: Convergence test and stopping criterion:
if 𝑓(𝒙𝑘+1) < 𝑓(𝒙𝑘) and ‖𝒈𝑘‖ ≤ 𝜖, then stop; otherwise, set 𝑘 = 𝑘 + 1 and go to Step 2.
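
Below is a minimal, self-contained sketch of Algorithm 1 with the MMSS coefficient (14), assuming NumPy/SciPy and approximating the exact line search numerically as above; the names cg_mmss and beta_mmss, the bound alpha_max and the iteration cap are illustrative assumptions rather than the authors' MATLAB implementation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def beta_mmss(g, g_prev, d_prev, eps=1e-16):
    """Coefficient (14): WYL-type numerator with an extra -g_{k-1}, RMIL denominator, clipped at zero."""
    num = g @ (g - (np.linalg.norm(g) / (np.linalg.norm(g_prev) + eps)) * g_prev - g_prev)
    return max(0.0, num / (d_prev @ d_prev + eps))

def cg_mmss(f, grad, x0, tol=1e-6, max_iter=10000, alpha_max=10.0):
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                                        # Step 3 with k = 0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:                              # stopping criterion ||g_k|| <= eps
            break
        phi = lambda a: f(x + a * d)                              # Step 4: exact line search (4)
        alpha = minimize_scalar(phi, bounds=(0.0, alpha_max), method="bounded").x
        x_new = x + alpha * d                                     # Step 5: update (2)
        g_new = grad(x_new)
        d = -g_new + beta_mmss(g_new, g, d) * d                   # Steps 2-3: new direction (3)
        x, g = x_new, g_new
    return x

# Usage on the 2-D Rosenbrock function, one of the test problems in Table 1:
rosen = lambda x: 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2
rosen_grad = lambda x: np.array([-400.0 * x[0] * (x[1] - x[0] ** 2) - 2.0 * (1.0 - x[0]),
                                 200.0 * (x[1] - x[0] ** 2)])
print(cg_mmss(rosen, rosen_grad, [-1.2, 1.0]))                    # should approach (1, 1)
```
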

3. Convergence Analysis
In this section we use the exact line search to establish the sufficient descent condition and the global convergence properties of 𝛽𝑘^MMSS.

3.1. Sufficient Descent Condition


The sufficient descent condition holds when

𝒈𝑘^T 𝒅𝑘 ≤ −𝐶‖𝒈𝑘‖², for 𝑘 ≥ 0 and 𝐶 > 0.    (15)

The following theorem establishes the sufficient descent condition.

Theorem 1. Consider a CG method with search direction 𝒅𝑘 given by (3) and 𝛽𝑘^MMSS given by (14). Then condition (15) holds for all 𝑘 ≥ 0.
Proof: If 𝑘 = 0, then 𝒅0 = −𝒈0, so that 𝒈0^T 𝒅0 = 𝒈0^T(−𝒈0) = −‖𝒈0‖². Hence, condition (15) holds for 𝑘 = 0. Next, we show that condition (15) also holds for 𝑘 ≥ 1. Multiplying (3) by 𝒈𝑘^T gives

𝒈𝑘^T 𝒅𝑘 = −𝒈𝑘^T 𝒈𝑘 + 𝛽𝑘^MMSS 𝒈𝑘^T 𝒅𝑘−1 = −‖𝒈𝑘‖² + 𝛽𝑘^MMSS 𝒈𝑘^T 𝒅𝑘−1.

For the exact line search, 𝒈𝑘^T 𝒅𝑘−1 = 0. Thus,

𝒈𝑘^T 𝒅𝑘 = −‖𝒈𝑘‖².    (16)

Hence, condition (15) holds for 𝑘 ≥ 1, so for every 𝑘 ≥ 0 the search direction 𝒅𝑘 is a descent direction. The proof is complete. ∎
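
The key step of the proof, 𝒈𝑘^T 𝒅𝑘−1 = 0 under the exact line search and hence 𝒈𝑘^T 𝒅𝑘 = −‖𝒈𝑘‖², can also be checked numerically; the small illustration below on a convex quadratic is an added example (the matrix and the value of the coefficient are arbitrary choices, not from the paper).

```python
import numpy as np
from scipy.optimize import minimize_scalar

A = np.array([[3.0, 1.0], [1.0, 2.0]])            # symmetric positive definite (arbitrary choice)
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

x = np.array([1.0, -1.0])
g = grad(x)
d = -g
alpha = minimize_scalar(lambda a: f(x + a * d), bounds=(0.0, 10.0), method="bounded").x
x1 = x + alpha * d
g1 = grad(x1)

beta = 0.7                                        # any coefficient: the term beta * g_1^T d_0 vanishes
d1 = -g1 + beta * d
print(g1 @ d)                                     # ~ 0, orthogonality from the exact line search
print(g1 @ d1, -(g1 @ g1))                        # the two values agree, matching (16)
```
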

3.2. Global Convergence Properties


In this subsection we show that the CG method with 𝛽𝑘^MMSS converges globally. First, however, we simplify 𝛽𝑘^MMSS so that the convergence proof becomes substantially easier. From (14), there are two cases:

Case 1: If 𝒈𝑘^T(𝒈𝑘 − (‖𝒈𝑘‖/‖𝒈𝑘−1‖) 𝒈𝑘−1 − 𝒈𝑘−1) > 0, then

𝛽𝑘^MMSS = 𝒈𝑘^T(𝒈𝑘 − (‖𝒈𝑘‖/‖𝒈𝑘−1‖) 𝒈𝑘−1 − 𝒈𝑘−1) / ‖𝒅𝑘−1‖²
         = (‖𝒈𝑘‖² − (‖𝒈𝑘‖/‖𝒈𝑘−1‖) 𝒈𝑘^T 𝒈𝑘−1 − 𝒈𝑘^T 𝒈𝑘−1) / ‖𝒅𝑘−1‖²
         < ‖𝒈𝑘‖² / ‖𝒅𝑘−1‖².

Case 2: If 𝒈𝑘^T(𝒈𝑘 − (‖𝒈𝑘‖/‖𝒈𝑘−1‖) 𝒈𝑘−1 − 𝒈𝑘−1) ≤ 0, then 𝛽𝑘^MMSS = 0.    (17)

The following basic assumptions are often needed when analyzing the global
convergence properties of the CG methods.

Assumption 1.
(i) The level set Ω = {𝒙 ∈ ℝ^n : 𝑓(𝒙) ≤ 𝑓(𝒙0)} is bounded, where 𝒙0 is a given starting point.
(ii) In an open convex set Ω0 containing Ω, 𝑓 is continuously differentiable and its gradient is Lipschitz continuous; that is, there exists a constant 𝐿 > 0 such that ‖𝒈(𝒙) − 𝒈(𝒚)‖ ≤ 𝐿‖𝒙 − 𝒚‖ for any 𝒙, 𝒚 ∈ Ω0.

Lemma 1. Suppose that Assumption 1 holds and let {𝒙𝑘} be generated by Algorithm 1, where 𝒅𝑘 is a descent search direction and 𝛼𝑘 is obtained by the exact line search (4). Then the following condition, known as the Zoutendijk condition, holds:

∑_{𝑘=0}^{∞} (𝒈𝑘^T 𝒅𝑘)² / ‖𝒅𝑘‖² < ∞.

The proof of this lemma can be found in [28]. The following convergence theorem for the CG method is obtained by using Lemma 1 and (17).

Theorem 2. Suppose Assumption 1 holds. Consider any CG method of the form (3), where the step length 𝛼𝑘 is determined by the exact line search (4). In addition, suppose that the sufficient descent condition holds. Then

lim inf_{𝑘→∞} ‖𝒈𝑘‖ = 0.    (18)

Proof: We argue by contradiction. Suppose that (18) does not hold; then there is a constant 𝐶 > 0 such that

‖𝒈𝑘‖ ≥ 𝐶 ⟺ ‖𝒈𝑘‖² ≥ 𝐶² ⟺ 1/‖𝒈𝑘‖² ≤ 1/𝐶², for every 𝑘 ≥ 0.    (19)

From (3) we know that

𝒅𝑘 = −𝒈𝑘 + 𝛽𝑘^MMSS 𝒅𝑘−1,

and squaring both sides gives

‖𝒅𝑘‖² = (𝛽𝑘^MMSS)² ‖𝒅𝑘−1‖² + ‖𝒈𝑘‖² − 2𝛽𝑘^MMSS 𝒈𝑘^T 𝒅𝑘−1.

For the exact line search we know that 𝒈𝑘^T 𝒅𝑘−1 = 0, therefore


‖𝒅𝑘‖² = (𝛽𝑘^MMSS)² ‖𝒅𝑘−1‖² + ‖𝒈𝑘‖².

Applying (17),

‖𝒅𝑘‖² = ‖𝒈𝑘‖², for all 𝑘.    (20)

Dividing both sides of (20) by ‖𝒈𝑘‖⁴,

‖𝒅𝑘‖² / ‖𝒈𝑘‖⁴ = 1/‖𝒈𝑘‖²,

and we obtain

‖𝒅𝑘‖² / ‖𝒈𝑘‖⁴ ≤ 1/‖𝒈𝑘‖² + ‖𝒅𝑘−1‖² / ‖𝒈𝑘−1‖⁴.    (21)

For 𝑘 = 0, 𝒅0 = −𝒈0, so ‖𝒅0‖² = ‖𝒈0‖²;

for 𝑘 = 1, ‖𝒅1‖²/‖𝒈1‖⁴ ≤ ‖𝒅0‖²/‖𝒈0‖⁴ + 1/‖𝒈1‖² = 1/‖𝒈0‖² + 1/‖𝒈1‖² = ∑_{𝑘=0}^{1} 1/‖𝒈𝑘‖²;

for 𝑘 = 2, ‖𝒅2‖²/‖𝒈2‖⁴ ≤ ‖𝒅1‖²/‖𝒈1‖⁴ + 1/‖𝒈2‖² ≤ 1/‖𝒈0‖² + 1/‖𝒈1‖² + 1/‖𝒈2‖² = ∑_{𝑘=0}^{2} 1/‖𝒈𝑘‖²; …

for 𝑘 = 𝑛, ‖𝒅𝑛‖²/‖𝒈𝑛‖⁴ ≤ ‖𝒅𝑛−1‖²/‖𝒈𝑛−1‖⁴ + 1/‖𝒈𝑛‖² ≤ 1/‖𝒈0‖² + 1/‖𝒈1‖² + ⋯ + 1/‖𝒈𝑛‖² = ∑_{𝑘=0}^{𝑛} 1/‖𝒈𝑘‖².

So that,

‖𝒅𝑛‖² / ‖𝒈𝑛‖⁴ ≤ ∑_{𝑘=0}^{𝑛} 1/‖𝒈𝑘‖².    (22)

From (19), the right-hand side of (22) satisfies

∑_{𝑘=0}^{𝑛} 1/‖𝒈𝑘‖² ≤ (𝑛 + 1)/𝐶².

So we have

∑_{𝑘=0}^{𝑛} ‖𝒈𝑘‖⁴/‖𝒅𝑘‖² ≥ 𝐶² ∑_{𝑘=0}^{𝑛} 1/(𝑘 + 1),

and further, since 𝒈𝑘^T 𝒅𝑘 = −‖𝒈𝑘‖² by (16), we get

∑_{𝑘=0}^{∞} (𝒈𝑘^T 𝒅𝑘)²/‖𝒅𝑘‖² ≥ 𝐶² ∑_{𝑘=0}^{∞} 1/(𝑘 + 1).    (23)

From (23), it can be concluded that

∑_{𝑘=0}^{∞} (𝒈𝑘^T 𝒅𝑘)²/‖𝒅𝑘‖² ≥ 𝐶² ∑_{𝑘=0}^{∞} 1/(𝑘 + 1) = +∞,

since the harmonic series diverges. This contradicts the Zoutendijk condition in Lemma 1. Therefore, the proof is complete. ∎

4. Numerical Results and Discussion

This section presents the numerical results for the MMSS coefficient 𝛽𝑘 compared with the FR, CD, DY, WYL and RMIL coefficients. To assess performance we use test problems of small, medium and high dimension considered in Andrei [29], as in the paper of Yousif [15], with dimensions 2, 3, 4, 10, 50, 100, 500, 1000 and 10,000. The employed


functions are artificial test functions, which are used to probe the behaviour of an algorithm under various conditions, such as long narrow valleys, unimodal functions, and functions with a large number of local optima.
In this paper, thirty-one nonlinear functions, listed in Table 1, are evaluated. For each test dimension, one of the initial points suggested by Andrei [29] is used. The methods are compared on the number of iterations (NOI) and the CPU time in seconds required to solve each test problem. The evaluation is based on the Nocedal line-search algorithm for the exact condition (4), coded in MATLAB, with the stopping criterion set to ‖𝒈𝑘‖ ≤ 10⁻⁶. The tests were performed on a laptop with an Intel Core i7 CPU @ 1.80 GHz (8 CPUs, ~2.0 GHz), 16 GB of RAM and the Windows 10 Professional 64-bit operating system.
The numerical results are summarized using the performance profiles described by Dolan and Moré [30]. The profiles are shown in Figures 1 and 2, which present the iteration and running-time profiles, respectively. The results in Figures 1 and 2 are obtained as follows:

𝑟_{𝑝,𝑠} = 𝑎_{𝑝,𝑠} / min{𝑎_{𝑝,𝑠} : 𝑠 ∈ 𝑆},

where 𝑟_{𝑝,𝑠} is the performance ratio, 𝑎_{𝑝,𝑠} is the number of iterations or the CPU time of solver 𝑠 on problem 𝑝, 𝑃 is the test set, and 𝑆 is the set of solvers on the test set 𝑃. The overall profile is obtained as

𝜌_𝑠(𝑡) = (1/𝑛_𝑝) size{𝑝 ∈ 𝑃 : 𝑟_{𝑝,𝑠} ≤ 𝑡},

where 𝜌_𝑠(𝑡) is the probability for solver 𝑠 ∈ 𝑆 that the performance ratio 𝑟_{𝑝,𝑠} is within a factor 𝑡 ∈ ℝ of the best possible ratio, and 𝑛_𝑝 is the number of functions. The function 𝜌_𝑠(𝑡) is the distribution function of the performance ratio, and 𝜌_𝑠(1) is the probability that the solver wins over the rest of the solvers.
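
A sketch of how such a profile can be computed is given below; the array layout (problems by solvers), the use of np.inf to mark failed runs, and the toy numbers are assumptions for illustration and are not the data of Tables 1 and 2.

```python
import numpy as np

def performance_profile(a, ts):
    """a[p, s]: NOI or CPU time of solver s on problem p (np.inf = fail).
    Returns rho[s, i], the fraction of problems with r_{p,s} <= ts[i]."""
    best = np.min(a, axis=1, keepdims=True)       # best measure per problem over all solvers
    r = a / best                                  # performance ratios r_{p,s}
    return np.array([[np.mean(r[:, s] <= t) for t in ts] for s in range(a.shape[1])])

# Toy usage with three problems and two solvers (illustrative numbers only):
a = np.array([[18.0, 28.0],
              [59.0, 30.0],
              [26.0, 28.0]])
print(performance_profile(a, ts=[1.0, 1.5, 2.0]))  # row s gives rho_s(t); rho_s(1) is the win rate
```
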

Table 1. Comparison of Different CG Methods Based on NOI


Function Dim. Initial Point MMSS RMIL FR CD DY WYL
Extended 1000 (-1.2,1, … ,1) 18 28 26 26 26 1251
White & Holst (10, … ,10) 59 30 293 222 278 2227
10000 (-1.2,1, … ,1) 18 28 30 30 30 1180
(5, … ,5) 25 32 180 193 145 2923
1000 (-1.2,1, … ,1) 26 28 211 100 221 1283
(10, … ,10) 32 47 56 56 56 1157
Extended 10000 (-1.2,1, … ,1) 26 28 227 99 230 415
Rosenbrock (5, … ,5) 36 27 206 219 224 922
Freudenstein 4 (0.5, -2, …) 8 9 15 15 15 560
& Roth (5, 5, 5, 5) 4 6 7 7 7 435
1000 (1, 0.8, … ,0.8) 9 52 75 75 75 315
(0.5, …, 0.5) 19 45 81 81 81 263
Extended 10000 (-1, …, -1) 13 15 87 87 87 111
Beale (0.5, …, 0.5) 20 45 87 87 87 450
Extended Wood 4 (-3, -1, -3, -1) 220 495 45545 10820 21279 990
10 (1, …, 1) 19 19 19 19 19 24
(10, …, 10) 69 84 19827 13849 20043 208
100 (-1, …, -1) 88 107 99 90 96 249
Raydan 1 (-10, …, -10) 147 180 985 751 fail 376
500 (2, …, 2) 64 202 453 452 453 3133
(10, …, 10) 89 189 7 90 90 2667
Extended 1000 (1, …, 1) 74 181 517 517 517 4381
Tridiagonal 1 (-10, …, -10) 42 115 8 32 78 3651
500 (1, …, 1) 3 3 5 5 5 46
Diagonal 4 (-20, …, -20) 4 4 5 5 5 46


1000 (1, …, 1) 3 3 5 5 5 46
(-30, …, -30) 4 4 5 5 5 64
1000 (1, …, 1) 9 11 15 15 15 20
(20, …, 20) 6 6 8 8 8 10
Extended 10000 (-1, …, -1) 10 9 27 26 27 12
Himmelblau (50, …, 50) 6 9 17 17 17 20
10 (0, …, 0) 68 72 1214 973 1208 198
FLETCHCR (10, …, 10) 32 31 30 30 30 48
Ext. Powell 100 (3, -1, 0, 1, …) 4790 161067 5644 5640 5653 29913
2 (3,3) 10 15 50 50 50 118
NONSCOMP (10, 10) 12 16 915 241 1755 109
10 (1, …, 1) 5 5 9 9 9 10
(10, …, 10) 10 10 13 13 13 13
Extended 100 (10, …, 10) 10 10 13 13 13 14
DENSCHNB (-50, …, -50) 11 9 77 77 77 14
10 (1, 2, …, 10) 15 20 13 13 13 48
(-10, …, -10) 13 21 14 14 14 39
Extended 100 (5, …, 5) 8 15 13 13 14 fail
Penalty (-10, …, -10) 8 fail 31 39 fail fail
10 (1, …, 1) 12 12 11 11 11 13
Hager (-10, …, -10) 18 18 97 99 97 21
Ext Maratos 10 (1.1, 0.1, …,) 55 42 4820 1022 4096 fail
Six-hump 2 (-1,2) 7 8 24 24 24 7
camel (-5, 10) 7 6 11 11 11 9
2 (5, 5) 3 3 3 3 3 5
Booth (10, 10) 3 3 3 3 3 4
2 (-1, 0.5) 1 1 1 1 1 1
Trecanni (-5, 10) 5 6 14 14 14 13
2 (-1, 2) 11 21 11 11 11 51
Zettl (10, 10) 11 19 10 10 10 73
1000 (0, …, 0) 8 26 18 18 18 76
(10, …, 10) 12 11 175 175 175 94
10000 (-1, …, -1) 16 37 47 47 47 73
Shallow (-10, …, -10) 12 35 43 43 43 68
Generalized 1000 (1, … ,1) 5 6 6 6 6 6
Quartic (20, … ,20) 7 10 12 12 13 9
50 (0.5, …, 0.5) 69 77 116 117 116 151
Quadratic QF2 (30, …, 30) 67 77 125 124 126 153
Generalized 10 (2, …, 2) 21 22 27 27 27 28
Tridiagonal 1 (10, …, 10) 26 27 43 43 43 41
Generalized (1, …, 1) 4 4 5 5 5 5
Tridiagonal 2 4 (10, …, 10) 9 7 fail fail fail 20
10 (1, …, 1) 102 123 20 20 21 146
POWER (10, …, 10) 116 139 24 24 23 157
50 (1, …, 1) 61 69 38 38 38 126
(10, …, 10) 69 78 41 41 41 129
500 (1, …, 1) 391 422 131 131 131 532
Quadratic QF1 (-5, …, -5) 421 538 137 150 137 525
Ext. quadratic 100 (1, …, 1) 37 33 189 231 141 598
penalty QP2 (10, …, 10) 40 31 2758 101 fail 567
Ext. quadratic 4 (1, …, 1) 6 14 20 20 20 17
penalty QP1 (10, …, 10) 13 9 19 19 19 21
2 (1, 1) 1 1 1 1 1 1
Matyas (20, 20) 1 1 1 1 1 1
Dixon and 3 (1, 1, 1) 28 35 15 15 15 74
Price (10, 10, 10) 31 56 29 29 29 76


Figure 1. Performance Profile Based on Number of Iterations (NOI)

Table 2. Comparison of Different CG Methods Based on CPU Time


Function Dim. Initial Point MMSS RMIL FR CD DY WYL
Extended 1000 (-1.2,1, … ,1) 0.4893 0.8948 0.8585 0.982 1.0212 29.7987
White & Holst (10, … ,10) 1.5649 0.817 9.7359 7.54 9.6918 53.518
10000 (-1.2,1, … ,1) 4.5836 7.282 8.0132 7.9301 8.0372 281.6603
(5, … ,5) 6.4817 8.1429 48.6315 53.4613 39.8922 704.4778
1000 (-1.2,1, … ,1) 0.1342 0.1369 0.8497 0.407 0.8766 3.7323
(10, … ,10) 0.1535 0.2037 0.2329 0.2491 0.3524 3.3508
Extended 10000 (-1.2,1, … ,1) 0.475 0.5125 3.8702 1.6556 3.9131 9.1311
Rosenbrock (5, … ,5) 0.6514 1.3893 3.438 3.5928 3.6501 17.902
Freudenstein & 4 (0.5, -2, …) 0.0338 0.0574 0.0676 0.0639 0.0778 1.032
Roth (5, 5, 5, 5) 0.035 0.0279 0.0334 0.0346 0.0348 0.8117
1000 (1, 0.8, … ,0.8) 0.3183 1.5775 2.2823 2.2307 2.4375 8.4078
(0.5, …, 0.5) 0.6889 1.3736 2.526 2.5458 2.4948 7.1318
10000 (-1, …, -1) 3.5595 4.1676 24.0085 24.0719 24.0302 29.9674
Extended Beale (0.5, …, 0.5) 5.5427 12.4285 23.9661 23.9937 23.9873 121.5545
Extended Wood 4 (-3, -1, -3, -1) 0.5237 1.1665 258.8326 62.6266 118.3625 1.8304
10 (1, …, 1) 0.0799 0.0788 0.0449 0.0452 0.0484 0.0748
(10, …, 10) 0.1971 0.2591 42.4045 30.5942 43.2475 0.4272
100 (-1, …, -1) 0.3153 0.3758 0.3171 0.2834 0.306 0.6654
Raydan 1 (-10, …, -10) 0.4548 0.5898 2.8474 2.0725 fail 0.9673
500 (2, …, 2) 0.9477 2.943 7.2773 7.2176 7.2673 46.1204
(10, …, 10) 1.3757 2.7909 0.1176 1.493 1.4984 38.5006
Extended 1000 (1, …, 1) 1.9916 4.9043 15.361 15.4591 20.1142 118.1998
Tridiagonal 1 (-10, …, -10) 1.1555 3.1173 0.303 1.1196 2.9221 97.6139
500 (1, …, 1) 0.0162 0.0141 0.0347 0.0311 0.0312 0.1409
(-20, …, -20) 0.0305 0.0158 0.0346 0.0349 0.0308 0.1427
1000 (1, …, 1) 0.0226 0.0207 0.0337 0.0343 0.0387 0.1559
Diagonal 4 (-30, …, -30) 0.0417 0.0245 0.0397 0.0367 0.0393 0.2108
1000 (1, …, 1) 0.0569 0.0496 0.0893 0.0832 0.0802 0.083
(20, …, 20) 0.0369 0.0276 0.0496 0.0548 0.0507 0.0483
Extended 10000 (-1, …, -1) 0.2037 0.177 0.5326 0.5417 0.5198 0.2166
Himmelblau (50, …, 50) 0.1641 0.2144 0.3447 0.3584 0.3445 0.3414
10 (0, …, 0) 0.2037 0.1655 3.1757 2.5102 3.0949 0.3852
FLETCHCR (10, …, 10) 0.0978 0.0774 0.1165 0.1232 0.0959 0.1204


Ext. Powell 100 (3, -1, 0, 1, …) 21.317 765.7152 87.6384 89.2627 90.6282 131.7058
2 (3,3) 0.0456 0.0387 0.1546 0.16 0.1612 0.2481
NONSCOMP (10, 10) 0.0525 0.0396 4.7677 0.6989 4.8267 0.2265
10 (1, …, 1) 0.0281 0.0184 0.0414 0.0415 0.0388 0.0427
(10, …, 10) 0.0377 0.0341 0.056 0.059 0.0608 0.0366
Extended 100 (10, …, 10) 0.0425 0.0285 0.0645 0.062 0.0675 0.0474
DENSCHNB (-50, …, -50) 0.0572 0.0265 0.4324 0.2332 0.2346 0.0441
10 (1, 2, …, 10) 0.0549 0.0509 0.0614 0.0562 0.0595 0.1293
(-10, …, -10) 0.0569 0.0617 0.0639 0.0686 0.0657 0.0994
Extended 100 (5, …, 5) 0.0357 0.0557 0.0647 0.0735 0.0619 fail
Penalty (-10, …, -10) 0.0538 fail 0.1023 0.1576 fail fail
10 (1, …, 1) 0.0469 0.0562 0.0304 0.03 0.0315 0.1578
Hager (-10, …, -10) 0.0667 0.0706 0.2343 0.2295 0.2283 0.0535
Ext Maratos 10 (1.1, 0.1, …,) 0.1602 0.1242 8.5759 1.9205 7.5708 fail
Six-hump 2 (-1,2) 0.034 0.0394 0.0564 0.0549 0.055 0.0419
camel (-5, 10) 0.0431 0.0375 0.0389 0.031 0.0405 0.0336
2 (5, 5) 0.0174 0.0182 0.0144 0.0151 0.0161 0.0184
Booth (10, 10) 0.0202 0.0186 0.0187 0.0182 0.0205 0.0145
2 (-1, 0.5) 0.0073 0.0053 0.0084 0.0066 0.0075 0.0084
Trecanni (-5, 10) 0.0308 0.0344 0.0524 0.062 0.0619 0.038
2 (-1, 2) 0.0516 0.0813 0.0445 0.0495 0.0382 0.1497
Zettl (10, 10) 0.0539 0.076 0.0456 0.0476 0.0474 0.1813
1000 (0, …, 0) 0.0495 0.1391 0.103 0.1154 0.0851 0.2866
(10, …, 10) 0.0724 0.0665 0.6468 0.7126 0.6634 0.3217
10000 (-1, …, -1) 0.3241 0.7005 0.875 0.8105 0.8397 1.2541
Shallow (-10, …, -10) 0.2433 0.6463 0.7564 0.7327 0.7296 1.1735
Generalized 1000 (1, … ,1) 0.042 0.0456 0.0469 0.0488 0.0508 0.0427
Quartic (20, … ,20) 0.063 0.0646 0.0732 0.0737 0.0976 0.0462
50 (0.5, …, 0.5) 0.226 0.2463 0.3075 0.3051 0.3219 0.3246
Quadratic QF2 (30, …, 30) 0.2239 0.2262 0.4089 0.3247 0.3264 0.3462
Generalized 10 (2, …, 2) 0.0797 0.0966 0.1314 0.1189 0.107 0.085
Tridiagonal 1 (10, …, 10) 0.098 0.1068 0.1707 0.169 0.169 0.1117
Generalized (1, …, 1) 0.0364 0.0342 0.0239 0.0235 0.023 0.0252
Tridiagonal 2 4 (10, …, 10) 0.0447 0.0382 fail fail fail 0.2117
10 (1, …, 1) 0.2811 0.3127 0.0486 0.0487 0.0483 0.3292
POWER (10, …, 10) 0.3067 0.3425 0.0546 0.0689 0.0659 0.3125
50 (1, …, 1) 0.1855 0.3085 0.0847 0.0844 0.0863 0.2668
(10, …, 10) 0.1998 0.2213 0.1004 0.0959 0.0932 0.268
500 (1, …, 1) 2.395 3.2311 0.58 0.546 0.5904 3.0803
Quadratic QF1 (-5, …, -5) 2.4197 3.3194 0.5643 0.8401 0.5979 1.8863
Ext. quadratic 100 (1, …, 1) 0.1645 0.1521 0.9002 0.7706 0.7473 1.5307
penalty QP2 (10, …, 10) 0.1933 0.3543 7.671 0.4509 fail 1.4758
Ext. quadratic 4 (1, …, 1) 0.0321 0.0619 0.0558 0.0462 0.048 0.0531
penalty QP1 (10, …, 10) 0.0577 0.052 0.0574 0.0577 0.0562 0.0673
2 (1, 1) 0.0079 0.0051 0.005 0.005 0.0083 0.0054
Matyas (20, 20) 0.0095 0.0103 0.0045 0.0048 0.0052 0.0099
Dixon and 3 (1, 1, 1) 0.0901 0.1129 0.0457 0.0343 0.0341 0.1607
Price (10, 10, 10) 0.1163 0.1721 0.0668 0.0676 0.0693 0.167


Figure 2. Performance Profile Based on CPU Time

From Tables 1 and 2 we see that MMSS successfully reaches the solution point for every test function, whereas RMIL, FR and CD solve only 98% of the problems, DY 95% and WYL 96%. The performance profile in Figure 1 shows that the MMSS method almost always outperforms the other tested methods (RMIL, FR, CD, DY and WYL) in terms of the number of iterations (NOI), since its curve lies on top. Likewise, the performance profile in Figure 2 shows that MMSS almost always outperforms the other tested methods in terms of CPU time. Overall, on the problems tested, MMSS performed better than the other methods.

5. Conclusion
In this paper we proposed a new coefficient for the conjugate gradient method, named MMSS. We first proved that the sufficient descent property holds and then, under some assumptions, showed that the proposed algorithm is globally convergent under the exact line search. The comparison of the proposed method with the RMIL, FR, CD, DY and WYL methods shows that the new method has the best performance.

References
[1] E. Polak, “Algorithms and Consistent Approximations”, Springer, Berlin, (1997).
[2] J. Nocedal and S.J. Wright, “Numerical Optimization”, Springer, New York, (2000).
[3] L. Armijo, “Minimization of functions having Lipschitz continuous first partial
derivatives”, Pacific Journal of mathematics., vol.16, no. 1, (1966), pp.1-3.
[4] A. A. Goldstein, “On steepest descent”, Journal of the Society for Industrial and Applied
Mathematics, Series A: Control., vol. 3, no. 1, (1965), pp. 147-151.
[5] P. Wolfe, “Convergence conditions for ascent methods”, SIAM review., vol. 11, no. 2,
(1969), pp. 226-235.


[6] L. Grippo and S. Lucidi, “A globally convergent version of the Polak-Ribiere conjugate
gradient method”, Mathematical Programming., vol. 78, no. 3, (1997), pp. 375-391.
[7] M. R. Hestenes and E. Stiefel, “Methods of Conjugate Gradients for Solving Linear
Systems”, Journal of Research of The National Bureau of Standards., vol. 49, no. 6, (1952),
pp. 409-435.
[8] R. Fletcher and C. M. Reeves, “Function minimization by conjugate gradients”, The Computer Journal, vol. 7, no. 2, (1964), pp. 149-154.
[9] R. Fletcher, “Practical methods of optimization”, Wiley Interscience John Wiley and Sons,
New York, USA, 2nd edition, (1987).
[10] Y. H. Dai and Y. Yuan, “A Nonlinear Conjugate Gradient Method with A Strong Global
Convergence Property”, SIAM Journal on optimization., vol. 10, no. 1, (1999), pp. 177-
182.
[11] Z. Wei, S. Yao, and L. Liu, “The convergence properties of some new conjugate gradient
methods”, Appl. Math. Comput., vol. 183, no. 2, (2006), pp. 1341-1350.
[12] M. Rivaie, M. Mamat, L.W. June, and I. Mohd, “A new class of nonlinear conjugate
gradient coefficients with global convergence properties”, Appl. Math. Comput., vol. 218,
no. 22, (2012), pp.11323-11332.
[13] E. Polak and G. Ribiere, “Note on The Convergence of Methods of Conjugate Directions”,
Revue Francaise d’Informatique Et De Recherche Operationnelle., vol.3, no. 16, (1969),
pp. 35-43.
[14] M. Rivaie, M. Mamat, and A. Abashar, “A new class of nonlinear conjugate gradient
coefficients with exact and inexact line searches”, Appl. Math. Comput., vol. 268, no.
October, (2015), pp. 1152–1163.
[15] O. O. O. Yousif, “The convergence properties of RMIL+ conjugate gradient method under
the strong Wolfe line search”, Appl. Math. Comput., vol. 367, (2020), p. 124777.
[16] S. Basri and M. Mamat, “A new class of nonlinear conjugate gradient with global
convergence properties”, in Materials Today: Proceedings., (2018).
[17] M. Y. Waziri, K. Ahmed, and J. Sabi’u, “A family of Hager–Zhang conjugate gradient
methods for system of monotone nonlinear equations”, Appl. Math. Comput., vol. 361,
(2019), pp. 645-660.
[18] G. Yuan, T. Li, and W. Hu, “A conjugate gradient algorithm for large-scale nonlinear
equations and image restoration problems”, Appl. Numer. Math., vol. 147, no. 11661009,
(2020), pp. 129-141.
[19] J. Liu, “Convergence properties of a class of nonlinear conjugate gradient methods”,
Comput. Oper. Res., vol. 40, no. 11, (2013), pp. 2656-2661.
[20] S. Babaie-Kafaki, “Two modified scaled nonlinear conjugate gradient methods”, J. Comput.
Appl. Math., vol. 261, (2014), pp. 172-182.
[21] D. Liu, L. Zhang, and G. Xu, “Spectral method and its application to the conjugate gradient
method”, Appl. Math. Comput., vol. 240, (2014), pp. 339-347.
[22] Y. Huang, S. Liu, X. Du, and X. Dong, “A Globally Convergent Hybrid Conjugate Gradient
Method and Its Numerical Behaviors”, Mathematical Problem in Engineering., vol. 2013,
no. 5, (2013), pp. 1-14.
[23] L. Jin-kui, Z. Li-min, and S. Xiao-qian, “Global Convergence of a Nonlinear Conjugate
Gradient Method”, Mathematical Problem in Engineering., vol. 2011, (2011), pp. 1-23.
[24] X. Yang, Z. Luo, and X. Dai, “A Global Convergence of LS-CD Hybrid Conjugate
Gradient Method,” Adv. Numer. Anal., vol. 2013, (2013), pp. 1–5.
[25] C. Xu, J. Zhu, Y. Shang, and Q. Wu, “Method over Networks”, Complexity., vol. 2020,
(2020), pp. 1-13.
[26] T. Zhu, Z. Yan, and X. Peng, “A Modified Nonlinear Conjugate Gradient Method for
Engineering Computation,” Math. Probl. Eng., vol. 2017, (2017), pp. 1-11.
[27] J. Guo and Z. Wan, “A Modified Spectral PRP Conjugate Gradient Projection Method for
Solving Large-Scale Monotone Equations and Its Application in Compressed Sensing”,
Math. Probl. Eng., vol. 2019, (2019), pp. 23–27.


[28] G. Zoutendijk, “Nonlinear programming, computational methods”, Integer and Nonlinear Programming, (1970), pp. 37-86.
[29] N. Andrei, “An Unconstrained Optimization Test Functions Collection”, Adv. Model. Optim., vol. 10, no. 1, (2008), pp. 147-161.
[30] E. D. Dolan and J. J. More, “Benchmarking optimization software with performance profiles”, Mathematical Programming, vol. 91, no. 2, (2002), pp. 201-213.
