Article
An Inertial Parametric Douglas–Rachford Splitting Method for
Nonconvex Problems
Tianle Lu and Xue Zhang *
School of Mathematics and Computer Science, Shanxi Normal University, Taiyuan 030031, China;
221109020@sxnu.edu.cn
* Correspondence: zhangxue2100@sxnu.edu.cn
Abstract: In this paper, we propose an inertial parametric Douglas–Rachford splitting method for min-
imizing the sum of two nonconvex functions, which has a wide range of applications. The proposed
algorithm combines the inertial technique, the parametric technique, and the Douglas–Rachford
method. Subsequently, in theoretical analysis, we construct a new merit function and establish
the convergence of the sequence generated by the inertial parametric Douglas–Rachford splitting
method. Finally, we present some numerical results on nonconvex feasibility problems to illustrate
the efficiency of the proposed method.
1. Introduction
The Douglas–Rachford (DR) splitting method is a classical optimization algorithm
initially proposed by Douglas and Rachford [1] for the numerical solution of heat conduction problems. Later, Lions and Mercier [2], through their pioneering work, made the algorithm
applicable to a class of optimization problems formulated as follows:
min_u ϕ(u) = f(u) + g(u),  (1)

where f and g are closed convex functions. Following that, the DR splitting method has also been widely applied to various optimization problems that arise from signal processing, tensor recovery, and image processing, where the objective function is the sum of two proper closed convex functions; see, for example, Combettes and Pesquet [3], Gandy et al. [4], He and Yuan [5], and Qu et al. [6].
Moreover, the DR splitting method can solve the more general problem of finding the zeros of the sum of two maximal monotone operators. When both f and g in Problem (1) are convex functions, and the corresponding operators are the subdifferentials of f and g, the DR splitting method corresponds to the following iterative procedure:

x^{k+1} = (1/2)x^k + (1/2)(2prox_{γg} − I)(2prox_{γf} − I)(x^k),  (2)
where the step-size parameter γ > 0, I is the identity mapping, and the proximal mapping is defined as

prox_{γf}(x) := arg min_u { f(u) + (1/(2γ))∥u − x∥² }.

Although both f and g are convex functions in Problem (1), the proximal operator of the objective function ϕ is challenging to compute. As indicated by (2), the DR splitting method overcomes this difficulty: it converts solving Problem (1) into solving two easily solved subproblems:
y^{k+1} ∈ arg min_y { f(y) + (1/(2γ))∥y − x^k∥² },
z^{k+1} ∈ arg min_z { g(z) + (1/(2γ))∥z − (2y^{k+1} − x^k)∥² },
x^{k+1} = x^k + (z^{k+1} − y^{k+1}).
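To make the scheme concrete, here is a minimal Python sketch of these two subproblems in the convex case, assuming the proximal mappings of f and g are supplied as callables (the function name and interface are our illustration, not the paper's code). Expanding the updates, x + (z − y) = x/2 + (1/2)(2z − (2y − x)), so each loop pass is exactly one application of (2).

```python
import numpy as np

def douglas_rachford(prox_f, prox_g, x0, gamma, max_iter=500):
    """Sketch of the classical DR iteration (2) via the two subproblems above.

    prox_f and prox_g are callables (x, gamma) -> prox_{gamma * h}(x)."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        y = prox_f(x, gamma)              # y^{k+1}: prox of f at x^k
        z = prox_g(2.0 * y - x, gamma)    # z^{k+1}: prox of g at 2y^{k+1} - x^k
        x = x + (z - y)                   # x^{k+1} = x^k + (z^{k+1} - y^{k+1})
    return y
```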
In the nonconvex setting, Li and Pong [11] established convergence of this scheme provided that the parameters satisfy

(1 + γL)² + (5/2)γs − 3/2 < 0.  (3)
More recently, Bian and Zhang [16] proposed a parameterized Douglas–Rachford (PDR) splitting method:

y^{k+1} ∈ arg min_y { f(y) + (1/(2γ))∥y − x^k∥² },
z^{k+1} ∈ arg min_z { g(z) + (1/(2γ))∥z − (αy^{k+1} − x^k)∥² },
x^{k+1} = x^k + (z^{k+1} − y^{k+1}),
where α ∈ (3/2, 2] is the parameterized coefficient and γ > 0 is the step-size, and they satisfy the following inequality:

((4 − α)/2)(1 + γL)² + ((9 − 2α)/2)γs − 3/2 < 0.  (4)
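Note that setting α = 2 in (4) gives ((4 − 2)/2)(1 + γL)² + ((9 − 4)/2)γs − 3/2 = (1 + γL)² + (5/2)γs − 3/2 < 0, which is exactly condition (3); this specialization is revisited in Remark 4 below.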
In nonconvex settings, the PDR splitting method can be viewed as a specific instance of the adaptive DR method [14]. Parameterization techniques make DR algorithms more flexible in practice. Moreover, in the numerical experiments on PDR, we observe that it saves a significant amount of running time compared to DR. Beyond the PDR splitting method, much further work on DR splitting methods (see [17–19]) has introduced parameters into their formulations, which illustrates the advantages of parameterization.
On the other hand, the inertial scheme, also called the heavy ball method proposed
by Polyak [20], has been widely used in optimization algorithms, and its effectiveness has
been proved. Inertia strategies help solve nonconvex minimization problems more rapidly.
For instance, Boţ and Csetnek [21] proposed the inertial Douglas–Rachford splitting for
monotone inclusion problems. Based on the inexact proximal point algorithm, Alves et
al. [22] combined inertial step and overrelaxation to propose a partially inexact inertial-
relaxed Douglas–Rachford algorithm. Meanwhile, Han et al. [19] studied the randomized r-sets–Douglas–Rachford (RrDR) method with the inertial scheme and showed that it converges at an accelerated linear rate. Additionally, Feng et al. [23] developed an inertial
Douglas–Rachford splitting (IDRS) method and demonstrated its validity in terms of signal
recovery. The IDRS method is given as follows:
y^{k+1} ∈ arg min_y { f(y) + (1/(2γ))∥y − u^k∥² },
z^{k+1} ∈ arg min_z { g(z) + (1/(2γ))∥z − (2y^{k+1} − u^k)∥² },
x^{k+1} = u^k + (z^{k+1} − y^{k+1}),
u^{k+1} = x^{k+1} + β(x^{k+1} − x^k),
where β > 0 is the inertial parameter and satisfies the conditions in (5).
The use of parameterization and inertial strategies can improve the performance of
the DR method. It is natural to pose the following question: Can we develop an efficient
approach to solving Problem (1) by leveraging both parameterization and inertial strategies?
Given the prevalence of nonconvex optimization problems in practical applications,
investigating the DR splitting method for solving these problems holds significant and
far-reaching implications. In this paper, we consider the nonconvex and nonsmooth mini-
mization problems (1), where f and g are properly closed, possibly nonconvex functions.
We propose to combine the inertial and parametric techniques with the DR splitting method
and introduce an inertial parametric Douglas–Rachford (IPDR) splitting method. Moreover,
we demonstrate that the IPDR splitting method generates a stationary point if the sequence
generated by the method has a cluster point. Our analysis relies heavily on our defined
merit function (see Definition 5), which generates a decreasing sequence along the IPDR
splitting method. Additionally, the IPDR method reduces to DR, PDR, and IDRS under appropriate parameter choices. Then, we obtain a unified
convergence analysis of DR, PDR, and IDRS as a byproduct. Finally, numerical results
on nonconvex feasibility problems demonstrate that our algorithm saves computing time
while improving accuracy.
The structure of this paper is organized as follows. In Section 2, we present some
fundamental concepts and preliminary materials. In Section 3, we introduce the IPDR
method. We also demonstrate the convergence of the proposed algorithm with suitable
assumptions. Section 4 presents the results of numerical experiments, which are illustrated
and discussed. Finally, we provide concluding remarks in Section 5.
Definition 1 ([24]). (Limiting subdifferential) Let f be a proper function. The limiting subdifferential of f at x ∈ dom f is defined by

∂f(x) := { v ∈ ℝⁿ : ∃ x^t →_f x, v^t → v with lim inf_{z→x^t} [f(z) − f(x^t) − ⟨v^t, z − x^t⟩]/∥z − x^t∥ ≥ 0 for each t },  (6)

where x^t →_f x means that x^t → x and f(x^t) → f(x).
Lemma 1 ([25]). Given a positive real number α with α < 1, suppose {a_k}_{k∈ℕ} is a summable sequence and {b_k}_{k∈ℕ} is a non-negative real sequence satisfying

b_{k+1} ≤ αb_k + a_k, ∀k ≥ 1;

then ∑_{k=0}^∞ b_k < +∞.
Lemma 2 ([24]). Given a bounded sequence {x^k}_{k∈ℕ} ⊆ ℝⁿ, the set of its cluster points C({x^k}_{k∈ℕ}) is nonempty and compact. Moreover, if ∇f is Lipschitz continuous with modulus L, we have

|f(y) − f(x) − ⟨∇f(x), y − x⟩| ≤ (L/2)∥y − x∥².
Proposition 1. Let f, f₁ : ℝⁿ → ℝ ∪ {+∞} be proper functions. Then the following conclusions hold.
1. Let {x^k}, {v^k} be sequences. If x^k → x, v^k → v, v^k ∈ ∂f(x^k), and f(x^k) → f(x), then v ∈ ∂f(x);
A point x^∗ is called a stationary point of f if

0 ∈ ∂f(x^∗).  (8)
Recall that a function f is strongly convex with modulus α > 0 if

f(y) ≥ f(x) + ⟨v, y − x⟩ + (α/2)∥y − x∥², ∀x, y, ∀v ∈ ∂f(x).
The proximal operator plays a crucial role in the design of our algorithm. Therefore,
we review its definition and significant properties as follows.
Definition 2 ([24]). (Proximal mapping) Given a proper and lower semicontinuous function f : ℝⁿ → ℝ ∪ {+∞} and a parameter α > 0, the proximal mapping P_{αf} of f is defined by

P_{αf}(x) := arg min_y { f(y) + (1/(2α))∥y − x∥² }.
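As a standard example (our illustration, not from the paper), for f = ∥·∥₁ the minimization defining P_{αf} separates coordinate-wise and reduces to soft-thresholding:

```python
import numpy as np

def prox_l1(x, a):
    """P_{a f}(x) for f = ||.||_1, i.e., argmin_y ||y||_1 + ||y - x||^2 / (2a);
    the parameter a plays the role of alpha in Definition 2. Each coordinate
    solves a scalar problem whose solution is soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - a, 0.0)
```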
Proposition 2. Suppose the function f is proper and lower semicontinuous. Given α > 0.
1. If f is lower bounded, i.e., inf f > −∞, then for each x̄ ∈ dom( f ), the set Pα f ( x̄ ) is
nonempty and compact [24] (Theorem 1.17);
2. If f is convex, the proximal mapping P_{αf} is single-valued and firmly nonexpansive [27] (Lemma 11.1), i.e.,

∥P_{αf}(x) − P_{αf}(y)∥² ≤ ⟨P_{αf}(x) − P_{αf}(y), x − y⟩, ∀x, y ∈ ℝⁿ.
Definition 3 ([24]). (Indicator function) For a closed set S ⊆ ℝⁿ, its indicator function I_S is defined by

I_S(x) = 0 if x ∈ S, and I_S(x) = +∞ if x ∉ S.
ψ′(f(x) − f(x^∗)) · dist(0, ∂f(x)) ≥ 1.  (9)

A function that satisfies the KL property at each point of its domain is called a KL function.
Remark 2. Based on Assumption 1, we conclude from Lemma 3 that f(·) + (l/2)∥·∥² is convex for any fixed l ≥ L. Specifically, take s ≤ L such that

f_{s+}(·) := f(·) + (s/2)∥·∥²

is convex.
Given parameters α, β, and a step-size γ > 0, the IPDR splitting method (Algorithm 1) iterates

y^{k+1} ∈ arg min_y { f(y) + (1/(2γ))∥y − u^k∥² },  (11)
z^{k+1} ∈ arg min_z { g(z) + (1/(2γ))∥z − (αy^{k+1} − u^k)∥² },  (12)
x^{k+1} = u^k + (z^{k+1} − y^{k+1}),  (13)
u^{k+1} = x^{k+1} + β(x^{k+1} − x^k).  (14)
Remark 3. Using the optimality conditions and the subdifferential calculus rule for the y- and z-updates in (11) and (12), we have

0 = ∇f(y^{k+1}) + (1/γ)(y^{k+1} − u^k),  (15)
0 ∈ ∂g(z^{k+1}) + (1/γ)(z^{k+1} + u^k − αy^{k+1}).  (16)
0 ∈ ∇f(y^∗) + ∂g(y^∗) + ((2 − α)/γ)y^∗.  (17)

According to Formula (17), the limiting point is a critical point of Problem (1) with the additional regularization term ((2 − α)/(2γ))∥y∥². If the function g in Algorithm 1 is replaced with g̃ = g − ((2 − α)/(2γ))∥·∥², the corresponding cluster points will also be critical points of Problem (1). We set ϕ̃ = f + g̃.
1. For any k ∈ ℕ, define f_k : ℝⁿ → ℝ ∪ {+∞} by

f_k(x) := f(x) + (1/(2γ))∥x − u^k∥²
        = f_{s+}(x) + (1/2)(1/γ − s)∥x − (1/(1 − γs))u^k∥² − (s/(2(1 − γs)))∥u^k∥².  (18)

Then f_k(x) is strongly convex with coefficient 1/c, and

y^{k+1} = arg min_x { f_k(x) } = P_{c f_{s+}}((1/(1 − γs))u^k),

where c = γ/(1 − γs). By combining Proposition 2.2 with Propositions 1.1 and 1.4,
we can obtain that, for all k ∈ N, the following two inequalities hold:
∥y^{k+1} − y^k∥² ≤ (1/(1 − γs))⟨y^{k+1} − y^k, u^k − u^{k−1}⟩,  (19)

f_k(x) − f_k(y^{k+1}) ≥ (1/(2c))∥x − y^{k+1}∥², ∀x;  (20)
2. For any k ∈ ℕ, define g_k : ℝⁿ → ℝ ∪ {+∞} by

g_k(x) := g(x) + (1/(2γ))∥x − (αy^{k+1} − u^k)∥².  (21)

Then

z^{k+1} = arg min_x { g_k(x) };
3. Denote

v^k = (y^k; z^k; x^k),  w^k = (y^k; z^k; x^k; x^{k−1}; x^{k−2}),  (22)

Δ₁^k = y^k − y^{k−1},  Δ₂^k = z^k − z^{k−1},  Δ^k = x^k − x^{k−1}.  (23)
To analyze the convergence of the IPDR method, we formulate a new merit function.

Definition 5 (Merit function). Let γ > 0. For any y, z, x, x̃, x̂ ∈ ℝⁿ, the merit function is defined by

M(y, z, x, x̃, x̂) := M0(y, z, x) + (β²/(2γ) + ρ0)∥x̃ − x̂∥²,  (24)

where ρ0 = (2 − α)β²/(2γ) and

M0(y, z, x) = f(y) + g(z) − (1/(2γ))∥y − z∥² + (1/γ)⟨x − (α − 1)y, z − y⟩ + ((2 − α)/(2γ))∥y∥².  (25)
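The decrease property established in Theorem 1 below can be monitored numerically; the following small Python helper (our own sketch with illustrative naming, not code from the paper) evaluates M from (24) and (25), taking f and g as callables:

```python
import numpy as np

def merit(f, g, y, z, x, x_tilde, x_hat, alpha, beta, gamma):
    """Evaluate the merit function M(y, z, x, x_tilde, x_hat) in (24)-(25)."""
    rho0 = (2.0 - alpha) * beta**2 / (2.0 * gamma)          # as in (24)
    M0 = (f(y) + g(z)
          - np.dot(y - z, y - z) / (2.0 * gamma)            # -(1/2γ)||y - z||²
          + np.dot(x - (alpha - 1.0) * y, z - y) / gamma    # (1/γ)<x - (α-1)y, z - y>
          + (2.0 - alpha) * np.dot(y, y) / (2.0 * gamma))   # ((2-α)/2γ)||y||²
    d = x_tilde - x_hat
    return M0 + (beta**2 / (2.0 * gamma) + rho0) * np.dot(d, d)
```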
Equation (26) follows from the elementary relation ⟨a, b⟩ = (1/2)(∥a + b∥² − ∥a∥² − ∥b∥²) applied with a = x − (α − 1)y and b = z − y in (25), and Equation (27) follows from the relation ⟨a, b⟩ = (1/2)(∥a + b∥² − ∥a − b∥²) applied with a = x − y and b = z − y in (25).
Using the definitions of the x-update (13) and the u-update (14), we can obtain the following two relations:
y^k − z^k = u^{k−1} − x^k = u^{k−1} − u^k + u^k − x^k = u^{k−1} − u^k + β(x^k − x^{k−1}),  (30)

z^k − y^k = x^k − u^{k−1} = x^k − x^{k−1} + β(x^{k−2} − x^{k−1}).  (31)
Theorem 1 (Decrease property). Suppose that Assumptions 1 and 2 hold, and let {w^k} be the sequence of points generated by the IPDR algorithm. Then, for all k ∈ ℕ,

M(w^{k+1}) − M(w^k) ≤ −ρ1∥Δ₁^{k+1}∥² − (ρ2 − ρ0)∥Δ^k∥².  (32)
Proof. First, from the definition of M0 (see (25)) and the equality (29), we can obtain

M0(y^{k+1}, z^{k+1}, x^{k+1}) − M0(y^{k+1}, z^{k+1}, x^k)
= (1/γ)⟨x^{k+1} − x^k, z^{k+1} − y^{k+1}⟩
= (1/γ)⟨z^{k+1} − y^{k+1} + β(x^k − x^{k−1}), z^{k+1} − y^{k+1}⟩  (33)
= (1/γ)∥z^{k+1} − y^{k+1}∥² + (β/γ)⟨Δ^k, z^{k+1} − y^{k+1}⟩.
Second, employing (26) and the fact that z^{k+1} is a minimizer, we have

M0(y^{k+1}, z^{k+1}, x^k) − M0(y^{k+1}, z^k, x^k)
= g(z^{k+1}) + (1/(2γ))∥z^{k+1} − (αy^{k+1} − x^k)∥² − (1/γ)∥y^{k+1} − z^{k+1}∥²
  − [g(z^k) + (1/(2γ))∥z^k − (αy^{k+1} − x^k)∥² − (1/γ)∥y^{k+1} − z^k∥²]  (34)
= g_k(z^{k+1}) − g_k(z^k) − (β/γ)⟨Δ₂^{k+1}, Δ^k⟩ − (1/γ)∥y^{k+1} − z^{k+1}∥² + (1/γ)∥y^{k+1} − z^k∥²
≤ −(β/γ)⟨Δ₂^{k+1}, Δ^k⟩ − (1/γ)∥y^{k+1} − z^{k+1}∥² + (1/γ)∥y^{k+1} − z^k∥²,
M0(y^{k+1}, z^k, x^k) − M0(y^k, z^k, x^k)
= f(y^{k+1}) + (1/(2γ))∥x^k − y^{k+1}∥² − [f(y^k) + (1/(2γ))∥x^k − y^k∥²]
  + (1/γ)⟨(2 − α)y^{k+1}, z^k − y^{k+1}⟩ − (1/γ)⟨(2 − α)y^k, z^k − y^k⟩
  + ((2 − α)/(2γ))(∥y^{k+1}∥² − ∥y^k∥²)  (36)
= f_k(y^{k+1}) − f_k(y^k) + (β/γ)⟨Δ₁^{k+1}, Δ^k⟩ + ((2 − α)/(2γ))∥y^{k+1}∥² − ((2 − α)/(2γ))∥y^k∥²
  + ((2 − α)/γ)[⟨y^{k+1}, z^k − y^{k+1}⟩ − ⟨y^k, z^k − y^k⟩],
where the last equality follows from (18). By the inequality (20), we have

f_k(y^{k+1}) − f_k(y^k) ≤ −(1/(2c))∥Δ₁^{k+1}∥².  (37)
Moreover, for the last term in (36), we further simplify

⟨y^{k+1}, z^k − y^{k+1}⟩ − ⟨y^k, z^k − y^k⟩
= ⟨y^{k+1} − y^k, z^k − y^{k+1}⟩ + ⟨y^k, y^k − y^{k+1}⟩  (38)
≤ (1/2)∥y^{k+1} − y^k∥² + (1/2)∥z^k − y^{k+1}∥² + ⟨y^k, y^k − y^{k+1}⟩,
M0(y^{k+1}, z^k, x^k) − M0(y^k, z^k, x^k)
≤ −(1/(2c))∥Δ₁^{k+1}∥² + (β/γ)⟨Δ₁^{k+1}, Δ^k⟩ + ((2 − α)/(2γ))∥Δ₁^{k+1}∥²
  + ((2 − α)/(2γ))∥z^k − y^{k+1}∥² + ((2 − α)/(2γ))∥Δ₁^{k+1}∥²  (40)
= (−1/(2c) + (2 − α)/γ)∥Δ₁^{k+1}∥² + (β/γ)⟨Δ₁^{k+1}, Δ^k⟩ + ((2 − α)/(2γ))∥z^k − y^{k+1}∥².
M0(v^{k+1}) − M0(v^k)
≤ (−1/(2c) + (2 − α)/γ)∥Δ₁^{k+1}∥² + (β/γ)⟨Δ^k, Δ₁^{k+1} − Δ₂^{k+1} + z^{k+1} − y^{k+1}⟩
  + (1/γ + (2 − α)/(2γ))∥z^k − y^{k+1}∥²
= (−1/(2c) + (2 − α)/γ)∥Δ₁^{k+1}∥² + (β/γ)⟨Δ^k, z^k − y^k⟩ + (1/γ + (2 − α)/(2γ))∥z^k − y^{k+1}∥²  (41)
= (−1/(2c) + (2 − α)/γ)∥Δ₁^{k+1}∥² + (β/γ)∥Δ^k∥² − (β²/γ)⟨Δ^k, Δ^{k−1}⟩
  + (1/γ + (2 − α)/(2γ))∥z^k − y^{k+1}∥²,
where the second equality uses Formula (31). From Formula (19) and the relation (30), we obtain

∥y^{k+1} − z^k∥² = ∥y^{k+1} − y^k∥² + ∥y^k − z^k∥² + 2⟨y^{k+1} − y^k, y^k − z^k⟩
= ∥Δ₁^{k+1}∥² + ∥y^k − z^k∥² − 2⟨Δ₁^{k+1}, u^k − u^{k−1}⟩ + 2β⟨Δ₁^{k+1}, Δ^k⟩  (42)
≤ −(1 − 2γs)∥Δ₁^{k+1}∥² + ∥y^k − z^k∥² + 2β⟨Δ₁^{k+1}, Δ^k⟩.
In addition, (15) gives γ∇f(y^{k+1}) = u^k − y^{k+1}.
Therefore, combining (43), (28), and (30), for all k ∈ ℕ, it holds that

M0(v^{k+1}) − M0(v^k)
≤ (−1/(2c) + (2 − α)/γ)∥Δ₁^{k+1}∥² + (β/γ)∥Δ^k∥² − (β²/γ)⟨Δ^k, Δ^{k−1}⟩
  + (1/γ + (2 − α)/(2γ))[(2γs − 1)∥Δ₁^{k+1}∥² + 2β⟨Δ₁^{k+1}, Δ^k⟩ + (1 + γL)²∥Δ₁^{k+1}∥² − (2β + β²)∥Δ^k∥² + 2β²⟨Δ^{k−1}, Δ^k⟩]
= −(ρ/(2γ))∥Δ₁^{k+1}∥² − [((4 − α)/(2γ))(2β + β²) − β/γ]∥Δ^k∥² + ((3 − α)β²/γ)⟨Δ^k, Δ^{k−1}⟩ + ((4 − α)β/γ)⟨Δ₁^{k+1}, Δ^k⟩
≤ (((4 − α)q − ρ)/(2γ))∥Δ₁^{k+1}∥² − [((4 − α)/(2γ))(2β + β²) − β/γ − ((3 − α)β²)/(2γ) − ((4 − α)β²)/(2γq)]∥Δ^k∥² + (((3 − α)β²)/(2γ))∥Δ^{k−1}∥²
= (((4 − α)q − ρ)/(2γ))∥Δ₁^{k+1}∥² − [((3 − α)β)/γ + β²/(2γ) − ((4 − α)β²)/(2γq)]∥Δ^k∥² + (((3 − α)β²)/(2γ))∥Δ^{k−1}∥²,  (45)

where the second inequality uses the relation 2⟨a, b⟩ ≤ ∥a∥²/q + q∥b∥² with q > 0.
M(w^{k+1}) − M(w^k)
= M0(v^{k+1}) + (β²/(2γ) + ρ0)∥Δ^k∥² − M0(v^k) − (β²/(2γ) + ρ0)∥Δ^{k−1}∥²
≤ −((ρ − (4 − α)q)/(2γ))∥Δ₁^{k+1}∥² − [((3 − α)β)/γ − ((4 − α)/(2γ))·(β²/q)]∥Δ^k∥² + (((2 − α)β²)/(2γ))∥Δ^{k−1}∥² + ρ0∥Δ^k∥² − ρ0∥Δ^{k−1}∥²  (46)
= −ρ1∥Δ₁^{k+1}∥² − (ρ2 − ρ0)∥Δ^k∥²,

where ρ2 = ((3 − α)β)/γ − ((4 − α)/(2γ))·(β²/q). This completes the proof.
Remark 4. 1. If α = 2, then the IPDR method becomes the IDRS method, and Assumption 2 reduces to a corresponding relation. Compared to (5), it is clear that in this case, the parameter requirement has one less condition: 1/γ > s.
2. If the inertial parameter β = 0, then the IPDR method degenerates to the PDR method, and Assumption 2 also degenerates to inequality (4).
3. When α = 2 and β = 0, it is evident that the IPDR method degenerates to the DR method. Additionally, the inequality relation on the parameters is the same as in (3).
Theorem 2. Suppose that Assumptions 1 and 2 hold and the sequence {v^k} is generated by the IPDR method. If there exists a bounded subsequence {v^{k_j}} ⊆ {v^k}, then we have
1. ∑_{k=0}^∞ ∥Δ₁^k∥² < +∞ and ∑_{k=0}^∞ ∥Δ^k∥² < +∞;
2. lim_{k→∞} ∥Δ₁^k∥ = lim_{k→∞} ∥Δ₂^k∥ = lim_{k→∞} ∥Δ^k∥ = lim_{k→∞} ∥y^k − z^k∥ = 0.
Proof. 1. The sequence {M(w^k)}_{k∈ℕ} is decreasing by Theorem 1, and {M(w^{k_j})}_{j∈ℕ} has a lower bound by Assumption 1. Then {M(w^k)}_{k∈ℕ} is convergent. Summing (32) from k = 0 to N ≥ 1, we obtain that

M(w^{N+1}) − M(w^0) ≤ −ρ1 ∑_{k=1}^{N+1} ∥Δ₁^k∥² − (ρ2 − ρ0) ∑_{k=0}^{N} ∥Δ^k∥².

As N → ∞, we have

ρ1 ∑_{k=1}^∞ ∥Δ₁^k∥² + (ρ2 − ρ0) ∑_{k=0}^∞ ∥Δ^k∥² ≤ M(w^0) − lim_{N→∞} M(w^N) < +∞.
lim_{k→∞} ∥u^{k+1} − u^k∥ = 0,  lim_{k→∞} ∥z^k − y^k∥ = 0,  lim_{k→∞} ∥z^{k+1} − z^k∥ = 0.
The following theorem demonstrates that all cluster points of {z^k} (resp. {y^k}) are stationary points of the objective function ϕ̃ = f + g̃. Since z^{k+1} is a minimizer of g_k, we have

g(z^{k+1}) + (1/(2γ))∥z^{k+1} − (αy^{k+1} − u^k)∥² ≤ g(z^∗) + (1/(2γ))∥z^∗ − (αy^{k+1} − u^k)∥².
Substituting k with k_j − 1 and taking the upper limits on both sides, we obtain that

lim_{j→∞} ϕ̃(z^{k_j}) = lim_{j→∞} f(z^{k_j}) + lim_{j→∞} g(z^{k_j}) − ((2 − α)/(2γ)) lim_{j→∞} ∥z^{k_j}∥²
= f(z^∗) + g(z^∗) − ((2 − α)/(2γ))∥z^∗∥² = ϕ̃(z^∗).
Adding (15) and (16) yields

0 ∈ ∇f(y^{k+1}) + ∂g(z^{k+1}) + (1/γ)(z^{k+1} − (α − 1)y^{k+1}), ∀k ∈ ℕ.
For the function ϕ̃ = ϕ − ((2 − α)/(2γ))∥·∥², applying the optimality condition to the subproblem, we obtain further that

0 ∈ ∇f(y^{k+1}) + ∂g̃(z^{k+1}) + (1/γ)(z^{k+1} − y^{k+1}), ∀k ∈ ℕ.  (50)
where

q^{k+1} = ∇f(z^{k+1}) − ∇f(y^{k+1}) + (1/γ)(y^{k+1} − z^{k+1}).
Thus, we have q^{k_j} → 0 as j → ∞ by (18) and Theorem 2. Therefore, we obtain the conclusion based on Proposition 1.1.
then

t^{k+1} ∈ ∂M(w^{k+1}), ∀k ∈ ℕ.

Moreover, there exists a positive constant ρ3 such that

∥t^{k+1}∥ ≤ ρ3(∥Δ^{k+1}∥ + ∥Δ^k∥).
M(w^{k+1}) = f(y^{k+1}) + g(z^{k+1}) − (1/(2γ))∥y^{k+1} − z^{k+1}∥²
  + (1/γ)⟨x^{k+1} − (α − 1)y^{k+1}, z^{k+1} − y^{k+1}⟩  (51)
  + ((2 − α)/(2γ))∥y^{k+1}∥² + (β²/(2γ))∥x^k − x^{k−1}∥² + ρ0∥x^k − x^{k−1}∥².
We first consider the subdifferential of M at wk+1 = (yk+1 ; zk+1 ; x k+1 ; x k ; x k−1 ). Notice
that for any k ≥ 0, we have
∂_y M(w^{k+1}) = ∇f(y^{k+1}) − (1/γ)(y^{k+1} − z^{k+1}) + (1/γ)[−x^{k+1} − (α − 1)z^{k+1} + 2(α − 1)y^{k+1}] + ((2 − α)/γ)y^{k+1}
= ∇f(y^{k+1}) + (−1/γ + 2(α − 1)/γ + (2 − α)/γ)y^{k+1} + (1/γ − (α − 1)/γ)z^{k+1} − (1/γ)x^{k+1}
= ∇f(y^{k+1}) + ((α − 1)/γ)y^{k+1} + ((1 − α)/γ)z^{k+1} + (1/γ)z^{k+1} − (1/γ)x^{k+1}  (52)
= ∇f(y^{k+1}) + ((α − 1)/γ)(y^{k+1} − z^{k+1}) + (1/γ)(z^{k+1} − x^{k+1})
= ((α − 1)/γ)(u^k − x^{k+1}),
where the last equality uses Formula (15) and the definition of x^{k+1} in (13). From Formulas (13) and (16), we obtain

∂_z M(w^{k+1}) = ∂g(z^{k+1}) + ((2 − α)/γ)y^{k+1} + (1/γ)(x^{k+1} − z^{k+1})
∋ (1/γ)[(αy^{k+1} − u^k) − z^{k+1}] + ((2 − α)/γ)y^{k+1} + (1/γ)(x^{k+1} − z^{k+1})
= (2/γ)y^{k+1} − (1/γ)u^k − (1/γ)z^{k+1} + (1/γ)x^{k+1} − (1/γ)z^{k+1}  (53)
= (2/γ)(y^{k+1} − z^{k+1}) + (1/γ)(x^{k+1} − u^k)
= (1/γ)(u^k − x^{k+1}).
Similarly,

∂_x M(w^{k+1}) = (1/γ)(z^{k+1} − y^{k+1}),
∂_x̃ M(w^{k+1}) = (β²/γ + 2ρ0)(x^k − x^{k−1}),
∂_x̂ M(w^{k+1}) = −(β²/γ + 2ρ0)(x^k − x^{k−1}).
Therefore, we obtain

(1/γ)[(α − 1)(u^k − x^{k+1}); u^k − x^{k+1}; z^{k+1} − y^{k+1}; (β² + 2ρ0γ)(x^k − x^{k−1}); (−β² − 2ρ0γ)(x^k − x^{k−1})] ∈ ∂M(w^{k+1}).  (54)
By (31), we have

u^k − x^{k+1} = −(x^{k+1} − u^k) = −[x^{k+1} − x^k + β(x^{k−1} − x^k)]
= −(x^{k+1} − x^k) − β(x^{k−1} − x^k) = −Δ^{k+1} + βΔ^k,  (55)

z^{k+1} − y^{k+1} = x^{k+1} − x^k + β(x^{k−1} − x^k) = Δ^{k+1} − β(x^k − x^{k−1}) = Δ^{k+1} − βΔ^k.  (56)
Hence, (54) can be rewritten as

t^{k+1} ∈ ∂M(w^{k+1}), ∀k ∈ ℕ,

with ∥t^{k+1}∥ ≤ ρ3(∥Δ^{k+1}∥ + ∥Δ^k∥),
where ρ3 = max{(α + 1)/γ, [(α + 1) β + (6 − 2α) β2 ]/γ}. The main convergence result is
as follows.
Theorem 4 (Strong convergence). Suppose that ϕ is a KL function, {v^k}_{k∈ℕ} is a bounded sequence, and Assumptions 1 and 2 hold. Then:
1. ∑_{k=0}^∞ ∥v^{k+1} − v^k∥ < +∞;
2. The sequence {z^k}_{k∈ℕ} converges to a stationary point of ϕ̃.
Proof. Let Ω := C({w^k}_{k∈ℕ}). Since the sequence {v^k}_{k∈ℕ} is bounded and w^k = (y^k; z^k; x^k; x^{k−1}; x^{k−2}), the sequence {w^k}_{k∈ℕ} is also bounded. Moreover, by Lemma 2, we can deduce that Ω is a nonempty compact set and dist(w^k, Ω) → 0.
1. From Theorem 2, we obtain

lim_{k→∞} M(w^k) = lim_{j→∞} M(w^{k_j}) = M(w^∗) = ϕ̃(x^∗) + (1/(2γ))∥y^∗∥².
Recall from Theorem 1 that the sequence {M(wk )}k∈N is nonincreasing, so if for some
k0 ∈ N, we have M(wk0 ) = M(w∗ ), then for all k ≥ k0 , M(wk ) = M(w∗ ).
Thus, by Theorem 1, {yk }, { x k } must be eventually constant, and so is {zk } by (13).
Hence, {vk } is of finite length. The proof is complete.
Define M_k = M(w^k) − M(w^∗) for all k ∈ ℕ.
∥Δ^{k+1}∥ ≤ (ρ3/(ρ2 − ρ0))(ψ(M_{k+1}) − ψ(M_{k+2})) + (∥Δ^{k+1}∥ + ∥Δ^k∥)/4.  (65)

That is,

∥Δ^{k+1}∥ ≤ (1/3)∥Δ^k∥ + (4ρ3/(3(ρ2 − ρ0)))(ψ(M_{k+1}) − ψ(M_{k+2})).
In addition, we have ψ(M_k) > 0 for k ≥ k1 + 1. Then

∑_{k=k1+1}^{N} (ψ(M_k) − ψ(M_{k+1})) = ψ(M_{k1+1}) − ψ(M_{N+1}) ≤ ψ(M_{k1+1}),  (66)

that is,

∑_{k=k1+1}^∞ (ψ(M_k) − ψ(M_{k+1})) < +∞.  (67)
∑_{k=0}^∞ ∥x^{k+1} − x^k∥ < +∞.  (68)
Similar to the above proof process, replacing the superscript k by k − 1 in Equations (59)–(62), we also obtain

∥Δ₁^{k+1}∥ ≤ (ρ3/ρ1)(ψ(M_k) − ψ(M_{k+1})) + (∥Δ^k∥ + ∥Δ^{k−1}∥)/4.
Therefore, by (67) and (68), we obtain

∑_{k=0}^∞ ∥y^{k+1} − y^k∥ < +∞.  (73)
2. From conclusion 1, we know that {v^k}_{k∈ℕ} is a Cauchy sequence, and thus {v^k}_{k∈ℕ} is convergent. Consequently, by applying conclusion 2 of Theorem 3, we obtain that {z^k} converges to a stationary point of ϕ̃.
4. Numerical Results
In this section, we evaluate the effectiveness and feasibility of the IPDR splitting
method through the nonconvex feasibility problem coming from [16,30]. All experiments
are implemented in MATLAB R2021a on a laptop computer with a 3.20 GHz AMD Ryzen 7
6800H processor and 16 GB memory.
We solve a nonconvex feasibility problem, which searches for a point in the intersection of two nonempty sets. It can be structured as follows:

min_x (1/2)d_C²(x)  s.t. x ∈ D,  (75)

which can be rewritten as

min_x f(x) + g(x).  (76)
As discussed in Section 3, when the IPDR splitting method is applied directly to Problem (76) with f(x) = (1/2)inf_{y∈C} ∥y − x∥² and g(x) = I_D(x), the resulting sequence does not converge to a critical point of (76). However, as an alternative, we can take f(x) = (1/2)inf_{y∈C} ∥y − x∥² and g(x) = I_D − ((2 − α)/(2γ))∥x∥² and apply the IPDR splitting method accordingly; then the sequence generated converges to a critical point of (76). Thus, we obtain the following algorithm:
(IPDR)
y^{t+1} = (1/(1 + γ))(u^t + γP_C(u^t)),
z^{t+1} ∈ P_D((αy^{t+1} − u^t)/(α − 1)),  (77)
x^{t+1} = u^t + (z^{t+1} − y^{t+1}),
u^{t+1} = x^{t+1} + β(x^{t+1} − x^t),
where PC is the projector onto the set C and PD is the projector onto the set D; α is the
parameterized coefficient; β is the inertial parameter; and γ > 0 is the step-size.
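To fix ideas, the following Python sketch (ours; the function name, default parameter values, and stopping test are illustrative placeholders, not the paper's code) implements iteration (77) for the sparse linear-system instance used below, assuming C = {x : Ax = b}, so that P_C(x) = x − A†(Ax − b) as in the Alt update (82), and D = {x : ∥x∥₀ ≤ r, ∥x∥∞ ≤ 10⁶}, so that P_D keeps the r largest-magnitude entries and clips them at 10⁶:

```python
import numpy as np

def ipdr_feasibility(A, b, r, alpha=1.6, beta=0.05, gamma=0.03,
                     tol=1e-8, max_iter=10000):
    """Sketch of the IPDR iteration (77) for finding a point in
    C = {x : Ax = b} intersected with D = {x : ||x||_0 <= r, ||x||_inf <= 1e6}.
    Default parameter values are placeholders; see the paper's choices below."""
    A_pinv = np.linalg.pinv(A)

    def proj_C(x):                        # projection onto the affine set C
        return x - A_pinv @ (A @ x - b)

    def proj_D(x):                        # projection onto the sparsity set D
        z = np.zeros_like(x)
        idx = np.argsort(np.abs(x))[-r:]  # indices of the r largest magnitudes
        z[idx] = np.clip(x[idx], -1e6, 1e6)
        return z

    n = A.shape[1]
    x_prev = x = u = np.zeros(n)          # all methods start from the origin
    for _ in range(max_iter):
        y = (u + gamma * proj_C(u)) / (1.0 + gamma)   # y-update in (77)
        z = proj_D((alpha * y - u) / (alpha - 1.0))   # z-update in (77)
        x_prev, x = x, u + (z - y)                    # x-update in (77)
        u = x + beta * (x - x_prev)                   # inertial u-update in (77)
        if np.linalg.norm(y - z) < tol * max(np.linalg.norm(y), 1.0):
            break                         # placeholder stopping test
    return z
```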
We compare our method with that of several other algorithms, including DR [11], PR
(the Peaceman–Rachford splitting method [30]), PDR [16], Alt (the alternating projection
method [11]), and IDRS [23].
In Li and Pong [11], the DR splitting method is applied directly to Problem (76). Similar to the PR method, in Bian and Zhang [16], they take f(x) = (1/2)inf_{y∈C} ∥y − x∥² and g(x) = I_D(x) − ((2 − α)/(2γ))∥x∥² in Problem (76), which yields the PDR splitting method. In Feng et al. [23], the IDRS method is studied for the signal recovery problem; we apply the IDRS method to the feasibility problem and make the same substitution for the functions f, g as in the PDR method.
The Alt method is a typical algorithm for solving Problem (75); it reads:

(Alt)  x^{t+1} ∈ arg min_{∥x∥₀ ≤ r, ∥x∥∞ ≤ 10⁶} ∥x − (x^t + A†(b − Ax^t))∥.  (82)
we set b = Ax̂. We choose the same initialization and stopping criteria as in Li and Pong [11], Li et al. [30], and Bian and Zhang [16].
All algorithms are initialized at the origin, and the stopping criteria are as follows:
• DR, PR, PDR, IDRS, IPDR: the same criterion as in [11,16,30];
• Alt:

∥x^t − x^{t−1}∥ / max{∥x^{t−1}∥, 1} < 10⁻⁸.  (84)
The parameters of various methods are listed in Table 1, and for the definition of the
algorithmic parameters in Table 1, please refer to [11,16,23,30].
Table 1. Parameters of various methods for solving the nonconvex feasibility problem.
To select the parameter γ, we first choose γ0 > 0. However, γ0 may become very small
during actual numerical calculations. Therefore, we use a heuristic: γ = k · γ0 . Here, in the
DR method, k is chosen to be 150, while in the PDR method, the optimal parameters (as
referenced in [16]) are chosen to be k = 150 and α = 1.7. Most notably, in the IPDR splitting
method, we reduce the value of γ0 while increasing k to ensure appropriate scaling, where
k = 375.
We further verify the assumptions in Section 3 that the functions f, g satisfy:
1. Given that C is both closed and convex, the function f is smooth with a Lipschitz continuous gradient, where the Lipschitz continuity modulus L is 1. Moreover, similar to PDR [16], we can set s = 0 in this experiment;
2. From Assumption 2, we have q < ρ/(4 − α); we then choose the largest q, and it is easy to calculate that

β ≤ β₁ = (6 − 2α)q / ((4 − α) + (2 − α)q).
In summary, the parameters of the IPDR method are as follows:
1. We choose the largest parameter β, i.e., β = β₁;
2. We choose the parameter α = 1.6;
3. γ = 0.7 · (√((1 + α)/(4 − α)) − 1).
In our experiments, we generated 50 random instances where m was chosen from the
set {300, 400, 500, 600} and n was chosen from the set {4000, 5000, 6000}. The results of
the numerical calculations are presented in Tables 2–5, which include the runtime (tim),
number of iterations (iter), as well as the largest and smallest function values at termination
(fmax, fmin). To assess the correctness of the sparse solution of the linear system, we report
the number of successes (succ) and failures (fail). We declare success when the value of the function at termination is less than 10⁻¹², and we declare failure when the value of the function at termination is greater than 10⁻⁶.
Table 2. Compare DR and PR methods for feasibility problems on random instances.

                      DR                               PR
 m     n    Tim (s)  Iter  Succ  Fail      Tim (s)  Iter  Succ  Fail
300  4000     0.41    593    50     0        0.10    133    37    13
300  5000     0.65    710    50     0        0.16    167    29    21
300  6000     0.94    801    50     0        0.24    206    22    28
400  4000     0.79    525    50     0        0.16     97    45     5
400  5000     1.06    578    50     0        0.25    126    44     6
400  6000     1.37    650    50     0        0.34    157    32    18
500  4000     1.10    498    50     0        0.40    176    48     2
500  5000     1.31    525    50     0        0.25     94    49     1
500  6000     1.74    561    50     0        0.36    124    43     7
600  4000     2.17    486    50     0        9.39   2498     4    46
600  5000     2.38    502    50     0        0.58    116    49     1
600  6000     2.12    524    50     0        0.52     92    50     0
Table 3. Compare Alt and IDRS methods for feasibility problems on random instances.
Table 4. Compare PDR and IPDR methods for feasibility problems on random instances.
As demonstrated in the tables, when m ∈ {400, 500, 600}, the success rate of the IPDR
method closely matches that of DR and PDR while exhibiting superior computational
efficiency. In other words, IPDR has a high success rate, fewer iterations, and less CPU time.
In cases where m = 300, our proposed method achieves a success rate that outperforms all other algorithms except DR, ranking second only to the DR method. It can also be seen that the
success rate of the method may be related to the experimental model, and the DR method
can solve some problems with worse nonconvexity. In Table 5, we list the fmax and fmin
values of the algorithms with high success rates: DR, PDR, and IPDR. Notably, the final
function value of IPDR consistently outperforms PDR in most cases. It is worth mentioning
that the PR, Alt, and IDRS methods exhibited low success rates. These observations show
the effectiveness of combining parametric and inertial techniques.
5. Conclusions
In this paper, we focus on solving a class of nonconvex and nonsmooth minimiza-
tion problems. Specifically, we propose an inertial parametric Douglas–Rachford (IPDR)
splitting method, which combines inertial and parametric techniques with the DR splitting
method. We also construct a new merit function and use the Kurdyka-Łojasiewicz property
to prove the boundedness and convergence of the iterative sequences generated by the
proposed IPDR method. Finally, by applying the IPDR method to nonconvex feasibility
problems, our numerical experimental results demonstrate the potential advantage of
combining parametric and inertia techniques.
Author Contributions: Conceptualization, T.L.; methodology, T.L.; validation, T.L. and X.Z.; inves-
tigation, T.L.; writing—original draft preparation, T.L.; writing—review and editing, T.L. and X.Z.;
supervision, X.Z. All authors have read and agreed to the published version of the manuscript.
Funding: This research was funded by the National Natural Science Foundation of China (11901368)
and the Shanxi Province Science Foundation for Youths (20210302124530).
Data Availability Statement: The data presented in this study are available on request from the
corresponding author.
Conflicts of Interest: There is no conflict of interest.
Abbreviations
The following abbreviations are used in this manuscript:
DR Douglas–Rachford
DRE Douglas–Rachford Envelope
ADMM Alternating direction method of multipliers
PDR Parameterized Douglas–Rachford
RrDR Randomized r-sets-Douglas–Rachford
IDRS Inertial Douglas–Rachford splitting
IPDR Inertial parametric Douglas–Rachford
PR Peaceman–Rachford
Alt Alternating projection
References
1. Douglas, J.; Rachford, H.H. On the numerical solution of heat conduction problems in two and three space variables. Trans. Am.
Math. Soc. 1956, 82, 421–439.
2. Lions, P.L.; Mercier, B. Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 1979, 16, 964–979.
3. Combettes, P.L.; Pesquet, J.C. A Douglas–Rachford Splitting Approach to Nonsmooth Convex Variational Signal Recovery. IEEE J.
Sel. Top. Signal Process. 2007, 1, 564–574.
4. Gandy, S.; Recht, B.; Yamada, I. Tensor completion and low-n-rank tensor recovery via convex optimization. Inverse Probl. 2011,
27, 025010.
5. He, B.; Yuan, X. On the O(1/n) convergence rate of the Douglas–Rachford alternating direction method. SIAM J. Numer. Anal.
2012, 50, 700–709.
6. Qu, Y.; He, H.; Han, D. A Partially Inertial Customized Douglas–Rachford Splitting Method for a Class of Structured Optimization
Problems. SIAM J. Sci. Comput. 2024, 98, 9.
7. Patrinos, P.; Stella, L.; Bemporad, A. Douglas-Rachford Splitting: Complexity Estimates and Accelerated Variants. In Proceedings
of the 53rd IEEE Conference on Decision and Control, Piscataway, NJ, USA, 15–17 December 2014; pp. 4234–4239.
8. Barshad, K.; Gibali, A.; Reich, S. Unrestricted Douglas-Rachford algorithms for solving convex feasibility problems in Hilbert
space. Optim. Methods Softw. 2023, 38, 655–667.
9. Lindstrom, S.B.; Sims, B. Survey: Sixty Years of Douglas-Rachford. J. Aust. Math. Soc. 2021, 110, 333–370.
10. Han, D.R. A Survey on Some Recent Developments of Alternating Direction Method of Multipliers. J. Oper. Res. Soc. China. 2022,
10, 1–52.
11. Li, G.; Pong, T.K. Douglas-Rachford Splitting for Nonconvex Optimization with Application to Nonconvex Feasibility Problems.
Math. Program. 2016, 159, 371–401.
12. Themelis, A.; Patrinos, P. Douglas-Rachford Splitting and ADMM for Nonconvex Optimization: Tight Convergence Results.
SIAM J. Optim. 2020, 30, 149–181.
13. Themelis, A.; Stella, L.; Patrinos, P. Douglas-Rachford splitting and ADMM for nonconvex optimization: Accelerated and
Newton-type linesearch algorithms. Comput. Optim. Appl. 2022, 82, 395–440.
14. Dao, M.N.; Phan, H.M. Adaptive Douglas–Rachford Splitting Algorithm for the Sum of Two Operators. SIAM J. Optim. 2019, 29,
2697–2724.
15. Tran Dinh, Q.; Pham, N.H.; Phan, D.; Nguyen, L. FedDR-randomized Douglas-Rachford splitting algorithms for nonconvex
federated composite optimization. Adv. Neural Inf. Process. Syst. 2021, 34, 30326–30338.
16. Bian, F.; Zhang, X. A Parameterized Douglas-Rachford Splitting Algorithm for Nonconvex Optimization. Appl. Math. Comput.
2021, 410, 126425.
17. Dao, M.N.; Phan, H.M. Linear Convergence of Projection Algorithms. Math. Oper. Res. 2019, 44, 715–738.
18. Wang, D.; Wang, X. A Parameterized Douglas–Rachford Algorithm. Comput. Optim. Appl. 2019, 73, 839–869.
19. Han, D.; Su, Y.; Xie, J. Randomized Douglas-Rachford Method for Linear Systems: Improved Accuracy and Efficiency. arXiv 2022,
arXiv:2207.04291.
20. Polyak, B.T. Some Methods of Speeding Up the Convergence of Iteration Methods. USSR Comput. Math. Math. Phys. 1964, 4, 1–17.
21. Boţ, R.I.; Csetnek, E.R.; Hendrich, C. Inertial Douglas–Rachford splitting for monotone inclusion problems. Appl. Math. Comput.
2015, 256, 472–487.
22. Alves, M.M.; Eckstein, J.; Geremia, M.; Melo, J.G. Relative-Error Inertial-Relaxed Inexact Versions of Douglas-Rachford and
ADMM Splitting Algorithms. Comput. Optim. Appl. 2020, 75, 389–422.
23. Feng, J.; Zhang, H.; Zhang, K.; Zhao, P. An Inertial Douglas-Rachford Splitting Algorithm for Nonconvex and Nonsmooth
Problems. Concurr. Comput. Pract. Exper. 2021, e6343.
24. Rockafellar, R.T.; Wets, R.J.B. Variational Analysis; Springer Science & Business Media: New York, NY, USA, 2009; Volume 317.
25. Boţ, R.I.; Csetnek, E.R. An Inertial Tseng’s Type Proximal Algorithm for Nonsmooth and Nonconvex Optimization Problems. J.
Optim. Theory Appl. 2016, 171, 600–616.
26. Nesterov, Y. Introductory Lectures on Convex Programming Volume I: Basic Course. Lect. Notes 1998, 3, 5.
27. Goebel, K.; Reich, S. Uniform Convexity, Hyperbolic Geometry, and Nonexpansive Mappings; Marcel Dekker: New York, NY, USA,
1984; Volume 83.
28. Attouch, H.; Bolte, J.; Svaiter, B.F. Convergence of Descent Methods for Semi-Algebraic and Tame Problems: Proximal Algorithms,
Forward-Backward Splitting, and Regularized Gauss-Seidel Methods. Math. Program. 2013, 137, 91–129.
29. Bolte, J.; Sabach, S.; Teboulle, M. Proximal Alternating Linearized Minimization for Nonconvex and Nonsmooth Problems. Math.
Program. 2014, 146, 459–494.
30. Li, G.; Liu, T.; Pong, T.K. Peaceman-Rachford Splitting for a Class of Nonconvex Optimization Problems. Comput. Optim. Appl.
2017, 68, 407–436.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.