Article history:
Received 7 August 2023
Received in revised form 7 October 2023
Accepted 7 October 2023
Available online 10 October 2023

Keywords:
Ordinary differential equation
Halpern fixed-point algorithm
Hessian-driven damping

Abstract. The ordinary differential equation is a powerful tool for analyzing optimization algorithms. Motivated by this fact, this paper revisits the Halpern fixed-point algorithm from the viewpoint of ordinary differential equations. More specifically, we establish a second-order ordinary differential equation with Hessian-driven damping, which is the limit of the Halpern fixed-point algorithm. The Hessian-driven damping makes it possible to significantly attenuate the oscillations.

© 2023 Elsevier Ltd. All rights reserved.
1. Introduction
✩ This work was supported in part by the National Natural Science Foundation of China under Grants 62276202 and 62106186, in part by the Natural Science Basic Research Plan in Shaanxi Province of China under Grant 2022JQ-670, and in part by the Fundamental Research Funds for the Central Universities, China under Grants QTZX22047 and JB210701.
∗ Corresponding author.
E-mail addresses: zhanglulu202204@126.com (L. Zhang), gaoweifeng2004@126.com (W. Gao), xj6417@126.com (J. Xie),
lihong@mail.xidian.edu.cn (H. Li).
https://doi.org/10.1016/j.aml.2023.108889
0893-9659/© 2023 Elsevier Ltd. All rights reserved.
L. Zhang, W. Gao, J. Xie et al. Applied Mathematics Letters 148 (2024) 108889
where s ∈ (0, 2/L] and B : Rd → Rd is a single-valued 1/L-co-coercive operator. This paper discusses the
Halpern fixed-point algorithm (4).
Notice that the Halpern fixed-point algorithm (4) can be applied to monotone inclusion and convex
optimization problems, especially in game theory, robust optimization and minimization problems [8]. Thus, we
also consider the unconstrained minimization problem

min f(x), x ∈ Rd, (5)

where f(x) is a convex function and ∇f is 1/L-co-coercive. Actually, if B = ∇f, then the problem (1)
reduces to (5). Therefore, we can apply the Halpern fixed-point algorithm (4) to solve (5). The Halpern
fixed-point algorithm for solving (5) has the following scheme

xk+1 = βk x0 + (1 − βk)xk − s(1 − βk)∇f(xk), (6)
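For concreteness, the Halpern scheme xk+1 = βk x0 + (1 − βk)xk − s(1 − βk)∇f(xk) with the classical choice βk = 1/(k + 2) can be sketched in a few lines; the quadratic test problem, step size and iteration budget below are our own illustrative choices, not from the paper.

```python
import numpy as np

def halpern_gradient(x0, grad, s, num_iters, beta=lambda k: 1.0 / (k + 2)):
    """Halpern fixed-point iteration applied to the gradient-step map:
    x_{k+1} = beta_k * x0 + (1 - beta_k) * x_k - s * (1 - beta_k) * grad(x_k)."""
    x = x0.copy()
    for k in range(num_iters):
        bk = beta(k)
        x = bk * x0 + (1 - bk) * x - s * (1 - bk) * grad(x)
    return x

# Hypothetical test problem: f(x) = 0.5 * x^T A x with A = diag(4, 1), so L = 4.
A = np.diag([4.0, 1.0])
grad = lambda x: A @ x
x0 = np.array([1.0, 1.0])
x_final = halpern_gradient(x0, grad, s=0.25, num_iters=5000)  # s in (0, 2/L]
```

The anchor term βk x0 vanishes as k grows, so the iterates approach the unique fixed point x* = 0 at a rate governed by βk.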
ODE. More precisely, if f is a twice differentiable convex function with L-Lipschitz continuous gradient, the
high-resolution ODE of Nesterov's accelerated gradient method is

Ẍ(t) + (3/t)Ẋ(t) + √s ∇²f(X(t))Ẋ(t) + (1 + 3√s/(2t))∇f(X(t)) = 0, (8)

with the initial conditions X(3√s/2) = x0 and Ẋ(3√s/2) = −√s ∇f(x0). The main difference between
(7) and (8) is that (8) possesses the term √s ∇²f(X(t))Ẋ(t). We call √s ∇²f(X(t))Ẋ(t) Hessian-driven
damping. Furthermore, for a general convex function f , Attouch et al. [18,19] presented an ODE framework
with Hessian-driven damping (DIN-AVD)α,β,b
Ẍ(t) + (α/t)Ẋ(t) + β(t)∇²f(X(t))Ẋ(t) + b(t)∇f(X(t)) = 0, (9)
where β(t) is the damping parameter, b(t) is the time scale parameter and b(t) > β̇(t) + β(t)/t. Notice
that the ODEs (7) and (8) are two special cases of (DIN-AVD)α,β,b. More specifically, (9) reduces to the
continuous version (7) as α = 3, β(t) ≡ 0 and b(t) ≡ 1. When α = 3, β(t) ≡ √s and b(t) = 1 + 3√s/(2t), (9) can
be interpreted as (8). Based on the previous works, we consider the Halpern fixed-point algorithm via its
limit ODE. We believe that it will be interesting to understand the Halpern fixed-point algorithm through ODEs.
Paper organization. The rest of this paper is organized as follows. Section 2 provides some necessary
notations and preliminary results. Section 3 presents the main results and derives the corresponding ODE
(HF-ODE) of the Halpern fixed-point algorithm (6). In Section 4, we conclude this work.
2. Preliminaries

This section recalls some necessary notations and concepts. Throughout this paper, we use ∥ · ∥ to denote
the standard Euclidean norm and ⟨·, ·⟩ to denote the standard inner product. A single-valued operator B
is said to be L-Lipschitz continuous if ∥Bx − By∥ ≤ L∥x − y∥ for all x, y ∈ dom(B), where L ≥ 0 is a
Lipschitz constant and dom(B) = {x ∈ Rd : Bx ̸= ∅} denotes the domain of B. If L = 1, then we say that
B is nonexpansive. If L ∈ [0, 1), then we say that B is L-contractive, where L is the contraction factor. We
say that B is monotone if ⟨Bx − By, x − y⟩ ≥ 0 for all x, y ∈ dom(B). We say that B is 1/L-co-coercive if
⟨Bx − By, x − y⟩ ≥ (1/L)∥Bx − By∥² for all x, y ∈ dom(B). If L = 1, then we say that B is firmly nonexpansive.
Notice that if B is 1/L-co-coercive, then it is also monotone and L-Lipschitz continuous. ∇f(·) denotes the
first-order derivative of f(·), ∇²f(·) denotes the second-order derivative of f(·) and x∗ denotes a minimizer
of f. Ẋ(·) denotes the first-order derivative of X(·) and Ẍ(·) denotes the second-order derivative of X(·).
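To make the definitions above concrete, a small numerical check (our illustration, not from the paper) verifies 1/L-co-coercivity for the linear operator B = ∇f of a convex quadratic, where L is the largest eigenvalue of the Hessian:

```python
import numpy as np

rng = np.random.default_rng(0)

# B = A is the gradient of f(x) = 0.5 * x^T A x; for symmetric PSD A,
# B is 1/L-co-coercive with L = lambda_max(A).
A = np.diag([10000.0, 1.0])          # ill-conditioned, as in Section 3
L = float(np.max(np.linalg.eigvalsh(A)))

def is_cocoercive(A, L, trials=1000, dim=2, tol=1e-9):
    """Check <Bx - By, x - y> >= (1/L) * ||Bx - By||^2 on random pairs."""
    for _ in range(trials):
        x, y = rng.standard_normal(dim), rng.standard_normal(dim)
        Bx, By = A @ x, A @ y
        lhs = (Bx - By) @ (x - y)
        rhs = np.dot(Bx - By, Bx - By) / L
        if lhs < rhs - tol:
            return False
    return True

ok = is_cocoercive(A, L)
```

By the remark above, the same operator is then automatically monotone and L-Lipschitz continuous.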
3. Main results
In this section, we introduce the limit HF-ODE of the Halpern fixed-point algorithm (6). To the best of
our knowledge, this work is the first to use a continuous dynamical system to model the Halpern fixed-point
algorithm. Here we offer a new condition p(βk−1 − βk) = βk βk−1 for some constant p.
Theorem 3.1. Let f : Rd → R be a twice continuously differentiable convex function such that ∇f is
1/L-co-coercive. If there exists a constant p ≥ β0/(2 − β0) such that the coefficient βk of (6) satisfies
p(βk−1 − βk) = βk βk−1, then the Halpern fixed-point algorithm (6) has the limit HF-ODE

Ẍ(t) + ((p + 1)/t)Ẋ(t) + √s ∇²f(X(t))Ẋ(t) + (√s/t)∇f(X(t)) = 0, (10)

with the initial conditions X((p/β0 − (1 + p)/2)√s) = x0 and Ẋ((p/β0 − (1 + p)/2)√s) = −√s(1 − β0)∇f(x0).
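A useful intermediate observation (our addition, not spelled out in the paper) is that the coefficient condition can be solved in closed form, which also shows how the examples in Section 3 arise:

```latex
% Divide p(\beta_{k-1}-\beta_k)=\beta_k\beta_{k-1} by \beta_k\beta_{k-1}>0:
\frac{1}{\beta_k}-\frac{1}{\beta_{k-1}}=\frac{1}{p}
\quad\Longrightarrow\quad
\frac{1}{\beta_k}=\frac{1}{\beta_0}+\frac{k}{p}
\quad\Longrightarrow\quad
\beta_k=\frac{p\,\beta_0}{p+k\,\beta_0}.
% For example, \beta_k = 1/(k+2) is the case p=1, \beta_0=1/2, and
% \beta_k = (w+1)/(k+2w+2) is the case p=w+1, \beta_0=1/2.
```

In particular, tk = (p/βk − (1 + p)/2)√s = (k + p/β0 − (1 + p)/2)√s grows linearly in k, consistent with the choices of tk in Section 3.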
Proof. We start by recalling the Halpern fixed-point algorithm (6), which has the following form
xk+1 = βk x0 + (1 − βk )xk − s(1 − βk )∇f (xk ), (11)
where x0 is the initial point and p(βk−1 − βk ) = βk βk−1 . Shifting the index from k to k − 1 in (11) and
multiplying this expression by −βk , we deduce
− βk xk = −βk βk−1 x0 − (1 − βk−1 )βk xk−1 + s(1 − βk−1 )βk ∇f (xk−1 ). (12)
Adding (11) × βk−1 and (12), we obtain
βk−1 xk+1 − βk xk = (1 − βk )βk−1 xk − s(1 − βk )βk−1 ∇f (xk ) − (1 − βk−1 )βk xk−1
+ s(1 − βk−1 )βk ∇f (xk−1 ),
which leads to

βk−1(xk+1 − xk) = (1 − βk−1)βk(xk − xk−1) − s(1 − βk)βk−1∇f(xk) + s(1 − βk−1)βk∇f(xk−1). (13)
Introduce the Ansatz xk = X(tk) for some smooth curve X(t) defined for t ≥ (p/β0 − (1 + p)/2)√s. Letting
tk := (p/βk − (1 + p)/2)√s, we have by Taylor expansion

(xk+1 − xk)/√s = Ẋ(tk) + (1/2)Ẍ(tk)√s + o(√s),
(xk − xk−1)/√s = Ẋ(tk) − (1/2)Ẍ(tk)√s + o(√s), (14)
and we use a Taylor expansion for ∇f, which gives

∇f(xk−1) = ∇f(xk) − ∇²f(X(tk))Ẋ(tk)√s + o(√s). (15)

Then plugging (14) and (15) into (13) × (1/√s), we have

βk−1(Ẋ(tk) + (√s/2)Ẍ(tk) + o(√s)) = (1 − βk−1)βk(Ẋ(tk) − (√s/2)Ẍ(tk) + o(√s))
+ √s(1 − βk−1)βk(∇f(X(tk)) − ∇²f(X(tk))Ẋ(tk)√s + o(√s))
− √s(1 − βk)βk−1∇f(X(tk)).
Rearranging it, we obtain

(√s/2)[βk−1 + (1 − βk−1)βk]Ẍ(tk) + [βk−1 − (1 − βk−1)βk]Ẋ(tk)
+ s(1 − βk−1)βk∇²f(X(tk))Ẋ(tk) + √s(βk−1 − βk)∇f(X(tk)) + o(s) = 0. (16)

Multiplying both sides of (16) by 2/(√s[βk−1 + (1 − βk−1)βk]), we obtain

Ẍ(tk) + (2[βk−1 − (1 − βk−1)βk]/(√s[βk−1 + (1 − βk−1)βk]))Ẋ(tk)
+ (2√s(1 − βk−1)βk/(βk−1 + (1 − βk−1)βk))∇²f(X(tk))Ẋ(tk)
+ (2(βk−1 − βk)/(βk−1 + (1 − βk−1)βk))∇f(X(tk)) + o(√s) = 0. (17)
Substituting p(βk−1 − βk) = βk βk−1 into (17) and ignoring the o(√s) term, we have the HF-ODE of the
Halpern fixed-point algorithm (11)

Ẍ(t) + ((p + 1)/t)Ẋ(t) + √s ∇²f(X(t))Ẋ(t) + (√s/t)∇f(X(t)) = 0, (18)

where the first initial condition is X((p/β0 − (1 + p)/2)√s) = x0. By (11), we obtain the second initial condition

(x1 − x0)/√s = −√s(1 − β0)∇f(x0).

Therefore, the second initial condition is Ẋ((p/β0 − (1 + p)/2)√s) = −√s(1 − β0)∇f(x0). □
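To sanity-check Theorem 3.1 numerically (our own experiment, not from the paper), one can compare the Halpern iterates (11) with a numerically integrated solution of (18) on a one-dimensional quadratic; the step size, horizon and SciPy solver settings below are hypothetical choices.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical setup: f(x) = 0.5 * x^2, so grad f(x) = x, Hess f = 1 and L = 1.
p, beta0, s = 1.0, 0.5, 0.01
rs = np.sqrt(s)
beta = lambda k: p * beta0 / (p + k * beta0)    # solves p(b_{k-1}-b_k)=b_k b_{k-1}
t = lambda k: (p / beta(k) - (1 + p) / 2) * rs  # t_k from Theorem 3.1

# Discrete Halpern iterates (11) with grad f(x) = x.
K, x0 = 200, 1.0
xs = [x0]
for k in range(K):
    bk = beta(k)
    xs.append(bk * x0 + (1 - bk) * xs[-1] - s * (1 - bk) * xs[-1])

# HF-ODE (18): X'' + ((p+1)/t) X' + sqrt(s) X' + (sqrt(s)/t) X = 0.
def rhs(tt, y):
    X, V = y
    return [V, -((p + 1) / tt) * V - rs * V - (rs / tt) * X]

sol = solve_ivp(rhs, (t(0), t(K)), [x0, -rs * (1 - beta0) * x0],
                t_eval=[t(k) for k in range(K + 1)], rtol=1e-8, atol=1e-10)
err = float(np.max(np.abs(sol.y[0] - np.array(xs))))
```

With these choices the iterates xk and the ODE solution X(tk) stay close along the whole trajectory, illustrating the approximate equivalence stated above.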
Fig. 1. Evolution of the trajectories for NHF-ODE (p = 1.1) and HF-ODE (p = 1.1) on an ill-conditioned quadratic problem in R2.
As anticipated above, by introducing the condition p(βk−1 − βk) = βk βk−1, we obtain the ODEs of the Halpern
fixed-point algorithm in the literature and generalize the results.

• βk = 1/(k + 2)

In this case, we choose p = 1 and tk = (k + 1)√s, and the limit ODE of the Halpern fixed-point algorithm
(6) in [5] is

Ẍ(t) + (2/t)Ẋ(t) + √s ∇²f(X(t))Ẋ(t) + (√s/t)∇f(X(t)) = 0.

• βk = (w + 1)/(k + 2w + 2)

Let us set p = w + 1 and tk = (k + 3w/2 + 1)√s, and the limit ODE of the Halpern fixed-point algorithm (6)
in [11] is

Ẍ(t) + ((w + 2)/t)Ẋ(t) + √s ∇²f(X(t))Ẋ(t) + (√s/t)∇f(X(t)) = 0.
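Both coefficient families above can be checked against the condition p(βk−1 − βk) = βk βk−1 directly; the short script below (our illustration) verifies it up to floating-point rounding.

```python
# Verify p * (b_{k-1} - b_k) == b_k * b_{k-1} for the two coefficient
# families listed above (a quick numerical check, exact up to rounding).

def check(beta, p, kmax=1000, tol=1e-12):
    return all(
        abs(p * (beta(k - 1) - beta(k)) - beta(k) * beta(k - 1)) < tol
        for k in range(1, kmax)
    )

ok1 = check(lambda k: 1.0 / (k + 2), p=1.0)                      # beta_k = 1/(k+2)
ok2 = all(
    check(lambda k, w=w: (w + 1.0) / (k + 2 * w + 2), p=w + 1.0)  # beta_k = (w+1)/(k+2w+2)
    for w in range(5)
)
```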
Remark 3.1. If we set α = p + 1, β(t) ≡ √s and b(t) = √s/t, then the HF-ODE (10) satisfies
b(t) = β̇(t) + β(t)/t, which is different from the ODE framework (DIN-AVD)α,β,b, where b(t) > β̇(t) + β(t)/t is required.
Remark 3.2. In the absence of the Hessian-driven damping √s ∇²f(X(t))Ẋ(t), the corresponding ODE of the
Halpern fixed-point algorithm (6) is

Ẍ(t) + ((p + 1)/t)Ẋ(t) + (√s/t)∇f(X(t)) = 0,

which is called the NHF-ODE. We omit the proof; for more details, please see [12].
To better understand the role of Hessian-driven damping, we compare the two continuous systems NHF-
ODE and HF-ODE on a simple minimization problem. Taking d = 2, the step size s = 0.09 and the
ill-conditioned function f(x1, x2) = (1/2)(10000x1² + x2²), we obtain the trajectories in Figs. 1 and 2. For Fig. 1, we
set p = 1.1 and the initial conditions (x1(0.345), x2(0.345)) = (1, 1), (ẋ1(0.345), ẋ2(0.345)) = (−1500, −0.15).
For Fig. 2, we take p = 2.1 and the initial conditions (x1(0.795), x2(0.795)) = (1, 1), (ẋ1(0.795), ẋ2(0.795)) =
(−1500, −0.15). The wild oscillations of the NHF-ODE are neutralized by the Hessian-driven damping in the
HF-ODE.
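The setup of Fig. 1 can be reproduced with a standard ODE solver. The sketch below (our code; the integration horizon and SciPy solver settings are our own choices) integrates the HF-ODE and NHF-ODE for p = 1.1 and compares the oscillation amplitude of the stiff coordinate x1.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Fig. 1 setup: f(x1, x2) = 0.5 * (10000 * x1^2 + x2^2), s = 0.09, p = 1.1.
s, p = 0.09, 1.1
rs = np.sqrt(s)                      # sqrt(s) = 0.3
H = np.diag([10000.0, 1.0])          # Hessian of f
grad = lambda x: H @ x

def hf_rhs(t, y):
    # HF-ODE (10): X'' + ((p+1)/t) X' + sqrt(s) H X' + (sqrt(s)/t) grad f(X) = 0
    x, v = y[:2], y[2:]
    return np.concatenate([v, -((p + 1) / t) * v - rs * (H @ v) - (rs / t) * grad(x)])

def nhf_rhs(t, y):
    # NHF-ODE: same system without the Hessian-driven damping term
    x, v = y[:2], y[2:]
    return np.concatenate([v, -((p + 1) / t) * v - (rs / t) * grad(x)])

t0, T = 0.345, 5.0                   # initial time from the paper; horizon is ours
y0 = np.array([1.0, 1.0, -1500.0, -0.15])
opts = dict(method="LSODA", rtol=1e-8, atol=1e-8)
hf = solve_ivp(hf_rhs, (t0, T), y0, **opts)
nhf = solve_ivp(nhf_rhs, (t0, T), y0, **opts)

hf_amp = float(np.max(np.abs(hf.y[0])))    # x1 with Hessian-driven damping
nhf_amp = float(np.max(np.abs(nhf.y[0])))  # x1 without it: large swings
```

With these settings the Hessian-damped trajectory stays on the order of its initial value in x1, while the undamped trajectory overshoots by an order of magnitude, matching the qualitative behavior in Fig. 1.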
Fig. 2. Evolution of the trajectories for NHF-ODE (p = 2.1) and HF-ODE (p = 2.1) on an ill-conditioned quadratic problem in R2 .
4. Conclusions
This paper is devoted to ordinary differential equations for modeling the Halpern fixed-point algorithm. More
precisely, we derive the limit ordinary differential equation HF-ODE of the Halpern fixed-point algorithm under
some given coefficients. This limit ordinary differential equation exhibits approximate equivalence to the Halpern
fixed-point algorithm and thus can serve as a tool for analysis.
CRediT authorship contribution statement

Lulu Zhang: Originated the proposed idea, Algorithm analysis, Writing – original draft. Weifeng Gao:
Responsible for the research direction, Paper organization. Jin Xie: Writing – original draft, Management of
research funding. Hong Li: Writing – original draft, Management of research funding.
Data availability
We do not analyze or generate any datasets, because this work is purely theoretical and mathematical.
The relevant materials can be obtained from the references below.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China under Grants
62276202 and 62106186, in part by the Natural Science Basic Research Plan in Shaanxi Province of China
under Grants 2022JQ-670 and 2022JM-372, and in part by the Fundamental Research Funds for the Central
Universities, China under Grants QTZX22047 and JB210701.
References
[1] H. Attouch, M. Soueycatt, Augmented Lagrangian and proximal alternating direction methods of multipliers in Hilbert
spaces. Applications to games, PDEs and control, Pac. J. Optim. 5 (1) (2009) 17–37.
[2] H.H. Bauschke, P.L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert Spaces, Springer, New
York, 2017.
[3] P.L. Combettes, Monotone operator theory in convex optimization, Math. Program. B 170 (1) (2018) 177–206.
[4] R.T. Rockafellar, Monotone operators and the proximal point algorithm, SIAM J. Control Optim. 14 (1976) 877–898.
[5] F. Lieder, On the convergence rate of the Halpern-iteration, Optim. Lett. 15 (2) (2021) 405–418.
[6] K. Goebel, W.A. Kirk, Topics in Metric Fixed Point Theory, in: Cambridge Studies in Advanced Mathematics, Cambridge
University Press, Cambridge, 1990.
[7] H.H. Bauschke, J.M. Borwein, On projection algorithms for solving convex feasibility problems, SIAM Rev. 38 (3) (1996)
367–426.
[8] M.A. Krasnosel'skiĭ, Two remarks on the method of successive approximations, Uspekhi Mat. Nauk 10 (1955) 123–127.
[9] B. Halpern, Fixed points of nonexpanding maps, Bull. Amer. Math. Soc. 73 (1967) 957–961.
[10] K. Nakajo, W. Takahashi, Strong convergence theorems for nonexpansive mappings and nonexpansive semigroups, J.
Math. Anal. Appl. 279 (2003) 372–379.
[11] T.D. Quoc, The connection between Nesterov's accelerated methods and Halpern fixed-point iterations, 2022,
arXiv:2203.04869v1.
[12] W.J. Su, S. Boyd, E.L. Candès, A differential equation for modeling Nesterov’s accelerated gradient method: theory and
insights, J. Mach. Learn. Res. 17 (153) (2016) 1–43.
[13] Y. Nesterov, A method of solving a convex programming problem with convergence rate O(1/k²), Sov. Math. Doklady
27 (1983) 372–376.
[14] A. Wibisono, A.C. Wilson, M.I. Jordan, A variational perspective on accelerated methods in optimization, Proc. Natl.
Acad. Sci. 113 (47) (2016) E7351–E7358.
[15] B.T. Polyak, Some methods of speeding up the convergence of iteration methods, USSR Comput. Math. Math. Phys. 4
(5) (1964) 1–17.
[16] B. Shi, S.S. Du, M.I. Jordan, W.J. Su, Understanding the acceleration phenomenon via high-resolution differential
equations, Math. Program. 195 (2022) 79–148.
[17] B. Shi, S.S. Du, W.J. Su, M.I. Jordan, Acceleration via symplectic discretization of high-resolution differential equations,
in: Advances in Neural Information Processing Systems, 2019, pp. 5745–5753.
[18] H. Attouch, Z. Chbani, J. Fadili, H. Riahi, First-order optimization algorithms via systems with Hessian driven damping,
Math. Program. 193 (2022) 113–155.
[19] H. Attouch, Z. Chbani, J. Fadili, H. Riahi, Convergence of iterates for first-order optimization algorithms with inertia
and Hessian driven damping, Optimization 72 (5) (2023) 1199–1238.
[20] B. Xue, H.L. Du, X.G. Geng, New integrable peakon equation and its dynamic system, Appl. Math. Lett. (2023)
http://dx.doi.org/10.1016/j.aml.2023.108795.