

Neural Lyapunov Control of Unknown Nonlinear Systems with Stability Guarantees

Ruikun Zhou, Department of Applied Mathematics, University of Waterloo, ruikun.zhou@uwaterloo.ca
Thanin Quartz, Department of Applied Mathematics, University of Waterloo, tquartz@uwaterloo.ca
Hans De Sterck, Department of Applied Mathematics, University of Waterloo, hans.desterck@uwaterloo.ca
Jun Liu, Department of Applied Mathematics, University of Waterloo, j.liu@uwaterloo.ca

Abstract

Learning for control of dynamical systems with formal guarantees remains a challenging task. This paper proposes a learning framework to simultaneously stabilize an unknown nonlinear system with a neural controller and learn a neural Lyapunov function to certify a region of attraction (ROA) for the closed-loop system. The algorithmic structure consists of two neural networks and a

satisfiability modulo theories (SMT) solver. The first neural network is responsible
for learning the unknown dynamics. The second neural network aims to identify
a valid Lyapunov function and a provably stabilizing nonlinear controller. The
SMT solver then verifies that the candidate Lyapunov function indeed satisfies the Lyapunov conditions.
We provide theoretical guarantees of the proposed learning framework in terms
of the closed-loop stability for the unknown nonlinear system. We illustrate the
effectiveness of the approach with a set of numerical experiments.

1 Introduction

Finding the region of attraction (ROA) of an asymptotically stable equilibrium point for a nonlinear
dynamical system remains one of the most challenging tasks, especially for systems with uncertainty
[1]. Even though the well-known Lyapunov theorems, which use the level sets of a Lyapunov
function to estimate the ROA, were proposed more than a century ago, there is no general approach
to finding a valid Lyapunov function for general nonlinear systems with provable guarantees [2].
Thanks to the development of learning-based methods, several data-driven approaches have
recently shown their effectiveness in identifying unknown (or partly known) systems [3, 4].
However, how to simultaneously find Lyapunov functions and nonlinear controllers for systems
with unknown dynamics is still an open and active problem in control and robotics applications [5].
By exploiting the expressiveness of neural networks to tackle the complex nonlinearities of dynamical
systems, we propose a learning framework for simultaneously learning control functions and neural
Lyapunov functions. We build upon the framework in [6], but address the more challenging problem
of stabilizing a nonlinear system with unknown dynamics, with formal stability guarantees. In addition,
we complement the computational framework in [6] by proving the existence of a neural network
satisfying the Lyapunov conditions, except on an arbitrarily small neighborhood of the origin, which
offers a theoretical basis for the learning framework here and in [6].

36th Conference on Neural Information Processing Systems (NeurIPS 2022).



Our proposed learning framework consists of two neural networks. The first neural network learns
the unknown dynamics, while the second one aims to identify a valid Lyapunov function and a
stabilizing nonlinear controller. After training the neural networks, we derive error bounds that can
be verified directly by the satisfiability modulo theories (SMT) solver. We then use Lipschitz
continuity of the nonlinear dynamics and its neural approximation to prove rigorous theoretical
guarantees on the closed-loop stability of the unknown system under the learned nonlinear
controller. To the best of the authors' knowledge, no prior work has provided closed-loop stability
analysis in such a general setting of simultaneously learning the dynamics, a nonlinear controller, and a Lyapunov function.
We experimented on three typical nonlinear systems: the Van der Pol oscillator, the
inverted pendulum, and unicycle path following. To align with the theoretical results, we
carefully establish bounds for the Lipschitz constants, keep track of the training losses, and adjust
the Lyapunov conditions to be verified by the SMT solver accordingly, so that the verified ROA is
valid for the unknown system. The numerical experiments show that the ROAs learned this way are
comparable with the ones computed from the actual dynamics, and improve on those obtained with the LQR controller.
Related work. Artificial neural networks are capable of approximating any nonlinear function to
arbitrary desired accuracy [7]. As both the right-hand side of a differential equation and the
Lyapunov function can be regarded as nonlinear (or linear) functions mapping from the state
space to some vector space, a natural question is: can we use neural networks to approximate
the dynamics and find a Lyapunov function that achieves a larger estimate of the ROA, compared
with the widely used linear-quadratic regulator (LQR) and sum of squares (SOS) methods [8, 9]?
Various applications of neural networks for system identification and Lyapunov stability have been
proposed recently [5, 10]. For instance, Wang et al. [11] use a recurrent neural network to generate
the state-space representation of an unknown dynamical system. A Lyapunov neural network is
derived in [12] to learn a Lyapunov function of a fixed quadratic form for finding the largest estimated ROA.
However, when it comes to finding a Lyapunov function with a typical feedforward neural network
to certify the stability of a nonlinear system, this kind of regression task is difficult for
shallow neural networks, since only equality and/or inequality Lyapunov conditions, rather than
explicit expressions or values of the function, are given. Consequently, it is crucial to develop
methods to test or verify the outputs of a neural network. There are two main approaches to tackling this issue.
One is converting the neural networks into a mixed-integer programming (MIP) format which can be solved by MIP
solvers. The other is to encode the verification problem into an SMT problem. In the first category, [13, 14] find a robust
Lyapunov function for an uncertain nonlinear system or a hybrid system which can be partly approximated by piecewise
linear neural networks, with the help of a counterexample-guided method using mixed-integer quadratic programming
(MIQP), and the corresponding robust ROA is estimated. With a similar technique, [15] synthesizes a neural-network
controller and Lyapunov function simultaneously to certify the dynamical system's stability by converting the dynamics,
the Lyapunov function, and the controller into piecewise linear functions and solving the problem with mixed-integer linear
programming (MILP). On the other hand, several SMT solvers have been developed [16, 17], for
instance dReal [18], with which tasks involving various nonlinear real functions can be handled elegantly. These can be
employed in most cases for verifying existing neural network structures, e.g., with tanh and sigmoid activation functions.

With the help of these SMT solvers, Chang et al. propose an algorithm to find and validate a
neural Lyapunov function satisfying the Lyapunov conditions with a one-hidden-layer neural network [6].
More recent work in [19] uses the same methodology to approximate control Lyapunov barrier functions for nonlinear
systems to guarantee both stability and safety in reach-avoid robot tasks, while [20] uses a similar framework to design
stabilizing controllers for unknown nonlinear systems using Koopman operator theory. Gaussian
process (GP) regression is used in [21, 22] to learn the unknown system while providing safety and stability guarantees
for the states in the ROA with high probability; how to enlarge the estimated ROA is also well studied there.
On top of that, a method to obtain a larger ROA by iteratively re-designing a control Lyapunov function is presented in
[23]. Apart from the above machine learning methods, [24] proposes a piecewise learning framework with
piecewise affine models, where stability guarantees are verified with quadratic Lyapunov functions by solving an
optimization problem. However, we have to contend with the safety-critical nature of certain control and robotics
applications, so worst-case bounds and performance guarantees are more desirable. In a similar vein, upper
bounds on the infinity-norm approximation error were established in the context of training deep residual networks [25].

We remark that none of these works analyzed closed-loop stability in a setting as general as ours.


Contributions. We summarize the main contributions of the paper as follows.

• We propose a framework for simultaneously learning a nonlinear system, a stabilizing nonlinear controller, and a Lyapunov function for verifying the closed-loop stability of the unknown system.

• We provide theoretical analysis on the existence of a neural Lyapunov function that verifies the Lyapunov conditions, except on an arbitrarily small neighborhood of the origin, provided that the origin is asymptotically stable. This provides theoretical support for the proposed method as well as other neural Lyapunov frameworks in the literature.

• Combining classic Lyapunov analysis with rigorous error estimates for neural networks, we establish closed-loop stability for the unknown system under the learned controller. This is accomplished with the help of SMT solvers and recent tools for estimating Lipschitz constants for neural networks.

2 Preliminaries
Throughout this work, we consider a nonlinear control system of the form
$$\dot{x} = f(x, u), \quad x(0) = x_0, \tag{1}$$
where $x \in D$ is the state of the system, and $D \subseteq \mathbb{R}^n$ is an open set containing the origin; $u \in U \subseteq \mathbb{R}^m$ is the feedback control input given by $u = u(x)$, where $u(x)$ is a continuous function of $x$. Without loss of generality, we assume the origin is an equilibrium point of the closed-loop system
$$\dot{x} = f(x, u(x)), \quad x(0) = x_0. \tag{2}$$

When there is no ambiguity, we also refer to the right-hand side of (2) by f.

We assume that we do not have explicit knowledge of the right-hand side of the system (1). The
main objective is to stabilize the unknown dynamical system by designing a feedback control function $u(x)$.
Stability guarantees are established using Lyapunov functions. We next present some preliminaries on model
assumptions and Lyapunov stability analysis.

Model assumptions
Assumption 1 (Lipschitz Continuity). The right-hand side of the nonlinear system (1) is assumed to be Lipschitz continuous, i.e.,
$$\|f(x, u) - f(y, v)\| \le L\,\|(x, u) - (y, v)\| \quad \forall x, y \in D,\ \forall u, v \in U,$$
where $L$ is called the Lipschitz constant, and $(x, u)$ and $(y, v)$ denote the concatenation of the corresponding vectors. We assume the Lipschitz constant $L$ is known.

Assumption 2 (Partly Known Dynamics). The linearized model about the origin, in the form $\dot{x} = Ax + Bu$ with constant matrices $A$ and $B$, is known for the nonlinear system (1).

To analyze the stability properties of the closed-loop system for (1) in the presence of uncertainty (due to the
need to approximate the unknown dynamics), we introduce a more general notion of stability about a closed set
$\mathcal{A}$ (see, e.g., [26], and the appendix for a definition). When $\mathcal{A} = \{0\}$, this coincides with the standard notion of
stability about an equilibrium point. Intuitively, set stability with respect to $\mathcal{A}$ is measured by closeness and convergence of
solutions to the set $\mathcal{A}$. To this end, define the distance from $x$ to $\mathcal{A}$ by $\|x\|_{\mathcal{A}} = \inf_{y \in \mathcal{A}} \|x - y\|$.

Set invariance [27] plays an important role in our analysis.


Definition 1 (Forward Invariance). A set $\Omega \subseteq \mathbb{R}^n$ is said to be forward invariant for (2) if $x_0 \in \Omega$ implies that $x(t) \in \Omega$ for all $t \ge 0$.

Definition 2 (Region of Attraction). For a closed forward invariant set $\mathcal{A}$ that is UAS, a region of attraction is a set of initial conditions in $D$ such that the solution of the closed-loop system (2) is defined for all $t \ge 0$ and $\|x(t)\|_{\mathcal{A}} \to 0$ as $t \to \infty$.

Remark 1. Any set satisfying Definition 2 is called a region of attraction. The ROA is the largest set contained in $D$ satisfying Definition 2.
Definition 3 (Lie Derivatives). The Lie derivative of a continuously differentiable scalar function $V : D \to \mathbb{R}$ over a vector field $f$ and a nonlinear controller $u$ is defined as
$$\nabla_f V(x) = \sum_{i=1}^{n} \frac{\partial V}{\partial x_i}\,\dot{x}_i = \sum_{i=1}^{n} \frac{\partial V}{\partial x_i}\, f_i(x, u). \tag{3}$$
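In practice, the Lie derivative of a neural candidate $V$ along a (learned) vector field can be evaluated with automatic differentiation. The sketch below is a minimal PyTorch illustration of (3); the placeholder functions `v_net` and `f` are hypothetical, not the specific networks used later in the paper.

```python
# Minimal sketch: evaluating the Lie derivative of a scalar function V along a
# vector field f via automatic differentiation. `v_net` and `f` are placeholder
# callables, not the specific networks used in the paper.
import torch

def lie_derivative(v_net, f, x):
    """Return sum_i dV/dx_i * f_i(x) for a batch of states x (shape [N, n])."""
    x = x.clone().requires_grad_(True)
    v = v_net(x).sum()                    # sum over the batch so grad gives dV/dx row-wise
    grad_v = torch.autograd.grad(v, x, create_graph=True)[0]   # shape [N, n]
    return (grad_v * f(x)).sum(dim=1)     # shape [N]

# Example with a quadratic V and a linear field f(x) = -x (hypothetical):
if __name__ == "__main__":
    v_net = lambda x: (x ** 2).sum(dim=1)
    f = lambda x: -x
    x = torch.randn(4, 2)
    print(lie_derivative(v_net, f, x))    # negative values: -2 * ||x||^2
```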


The Lie derivative measures the rate of change of $V$ along the system dynamics.
The following is a standard Lyapunov theorem for the UAS property of a compact invariant set.
Theorem 1 (Sufficient Condition for UAS property). Consider the closed-loop nonlinear system (2).
Let $\mathcal{A} \subset D$ be a compact invariant set of this system. Suppose there exists a continuously differentiable function $V : D \to \mathbb{R}$ that is positive definite with respect to $\mathcal{A}$, i.e.,
$$V(x) = 0 \ \ \forall x \in \mathcal{A} \quad \text{and} \quad V(x) > 0 \ \ \forall x \in D \setminus \mathcal{A}, \tag{4}$$
and the Lie derivative is negative definite with respect to $\mathcal{A}$, i.e.,
$$\nabla_f V(x) < 0 \quad \forall x \in D \setminus \mathcal{A}. \tag{5}$$


Then, $\mathcal{A}$ is UAS for the system.

See [26, 28] for sufficiency and necessity of Lyapunov conditions for set stability under more general
settings. The function $V$ satisfying (4) and (5) is called a Lyapunov function with respect to $\mathcal{A}$.
From the forward invariance of level sets of a Lyapunov function it follows that these sets provide an
estimate of the ROA (see [2]).
Lemma 1 (Region of Attraction with Lyapunov Functions). Suppose that $V$ satisfies the conditions in Theorem 1. Denote
$$V^c := \{x \in D \mid V(x) \le c\}.$$
For every $c > 0$, $V^c$ is a region of attraction for the closed-loop system (2).
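In numerical experiments this lemma is used by picking a sublevel set of the verified $V$ that stays inside the verified domain. A hedged sketch of that step is given below; the quadratic `V` and the domain radius `r` are placeholders for illustration only.

```python
# Sketch: numerically estimate the largest c such that the sublevel set
# {V <= c} stays inside the verified domain D = {||x||_2 <= r}. Both V and r
# are placeholders; the paper selects the level from the verified domain.
import numpy as np

def largest_sublevel(V, r=1.2, n_grid=400, margin=1.05):
    xs = np.linspace(-margin * r, margin * r, n_grid)
    X1, X2 = np.meshgrid(xs, xs)
    pts = np.stack([X1.ravel(), X2.ravel()], axis=1)
    vals = V(pts)
    inside = np.linalg.norm(pts, axis=1) <= r
    # {V <= c} is contained in D whenever c is below the minimum of V over the
    # grid points lying outside D.
    return vals[~inside].min()

if __name__ == "__main__":
    V = lambda x: (x ** 2).sum(axis=1)     # hypothetical quadratic V
    print(largest_sublevel(V))              # approximately r^2 = 1.44 for this V
```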

3 Learn and Stabilize Dynamics with Neural Lyapunov Functions

In this section, we show how to use two shallow neural networks to learn the dynamics and find a
valid neural Lyapunov function. The first network is designed to learn the dynamics as shown in (1),
and the second network aims to identify a valid neural Lyapunov function and its corresponding
nonlinear controller. The algorithmic structure can be found in Fig. 1 and the pseudocode in Algorithm 1.

Figure 1: The schematic diagram of the proposed algorithm.

3.1 Learn the Unknown Dynamics

According to the well-known universal approximation theorem [29], shallow neural networks are able
to approximate the right-hand side $f(x, u)$ of the nonlinear system (1) arbitrarily well. Here we use a
one-hidden-layer feedforward neural network, denoted $\hat f(x, u)$, to conduct this regression task with
a mean square loss. For notational convenience, we slightly abuse the notation $\hat f$ to denote both
the learned dynamics and the parameters that define the function. Note that if we use the tanh or
sigmoid function as the activation function, the output layer should not have such an activation function,
as $f$ is not bounded in most cases.


When learning the dynamics, $f$ is regarded as a function of two variables, $x$ and $u$. The input
dimension of the neural network is $(n + m)$, and the output dimension is $n$. The training data for $x$ and $u$
are sampled uniformly and independently over their respective spaces. In most robotic systems the
input is provided by motors, which are typically saturated in practice. To reflect this, we use the tanh
activation function to model this property. The form of the controller is
$$u = u(x) = C \tanh(kx + b), \tag{6}$$
where $C$ is a constant matrix, determined by the saturation property of the controller in real systems,
which defines the bounds of $U$; $k$ and $b$ are the weight matrix and bias vector, respectively, which
are initialized with an LQR controller based on the linearized system $\dot{x} = Ax + Bu$. The bias $b$ is
chosen such that $f(0, C\tanh(b)) = 0$, i.e., the origin is an equilibrium point for the closed-loop system (2).
Note that the bias vector $b$ is not updated in the learning process.
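To make the setup concrete, a minimal PyTorch sketch of the dynamics-learning network (FNN) and the saturated controller in (6) follows. The layer width, the saturation matrix `C`, and the LQR gain `k_lqr` below are illustrative assumptions, not the exact values used in the experiments.

```python
# Sketch (assumed hyperparameters): one-hidden-layer network for the dynamics
# f(x, u) and a saturated controller u = C * tanh(k x + b) initialized from LQR.
import torch
import torch.nn as nn

class DynamicsNet(nn.Module):            # "FNN": input dim n+m, output dim n, no output activation
    def __init__(self, n, m, hidden=100):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n + m, hidden), nn.Tanh(), nn.Linear(hidden, n))

    def forward(self, x, u):
        return self.net(torch.cat([x, u], dim=1))

class SaturatedController(nn.Module):    # u = C tanh(k x + b); b and C are fixed during training
    def __init__(self, k_lqr, b, C):
        super().__init__()
        self.k = nn.Parameter(torch.as_tensor(k_lqr, dtype=torch.float32))
        self.register_buffer("b", torch.as_tensor(b, dtype=torch.float32))
        self.register_buffer("C", torch.as_tensor(C, dtype=torch.float32))

    def forward(self, x):
        return torch.tanh(x @ self.k.T + self.b) @ self.C.T

# Hypothetical usage for a 2-state, 1-input system:
f_hat = DynamicsNet(n=2, m=1)
ctrl = SaturatedController(k_lqr=[[-1.0, -2.0]], b=[0.0], C=[[5.0]])
x = torch.randn(8, 2)
print(f_hat(x, ctrl(x)).shape)           # torch.Size([8, 2])
```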

3.2 Neural Lyapunov Function and Nonlinear Controller

Following learning the dynamics using $\hat f$, the other neural network is designed to learn a nonlinear
controller and a neural Lyapunov function simultaneously during the same training process. The
detailed structure can be found in the appendix. It is worth mentioning that since we train the
Lyapunov function with data generated by the learned dynamics, which is an approximation of
the actual dynamics, we rewrite (4) and (5) as follows to accommodate for approximation errors:
$$V(0) = 0, \quad \text{and} \quad V(x) > 0 \ \text{and} \ \nabla_{\hat f} V(x) < -\gamma \quad \forall x \in D \setminus \{0\}. \tag{7}$$

Here $\gamma$ is a positive real number, which can be determined as follows.

Suppose that $(x, u)$ is a pair of unsampled state and input, and $(y, v)$ is its nearest known sample used in training or testing the neural network that learns the dynamics. Let $\delta > 0$ be such that $\|(x, u) - (y, v)\| \le \delta$ holds for all such $(x, u)$ and $(y, v)$, and let $M > 0$ be such that $\left\|\frac{\partial V}{\partial x}\right\| < M$ for all $x \in D$. Denote by $\bar{\varepsilon}$ the maximum of the 2-norm loss among all known samples, which can be taken from either the training dataset or the test dataset. Then the bound on the generalization error can be calculated as
$$\|f(x, u) - \hat f(x, u)\| \le \|f(x, u) - f(y, v)\| + \|f(y, v) - \hat f(y, v)\| + \|\hat f(y, v) - \hat f(x, u)\| \le K_f \delta + \bar{\varepsilon} + K_{\hat f}\, \delta < \frac{\gamma}{M}, \tag{8}$$
where $f$ and $\hat f$ are Lipschitz continuous with Lipschitz constants $K_f$ and $K_{\hat f}$, respectively, and $\gamma > 0$ is some sufficiently large constant. Then, choosing $\gamma$ to satisfy (8) implies that
$$\nabla_f V(x) - \nabla_{\hat f} V(x) \le \left\|\frac{\partial V}{\partial x}\right\|\, \|f(x, u(x)) - \hat f(x, u(x))\| < M \cdot \frac{\gamma}{M} = \gamma. \tag{9}$$

In view of (7) and (9), we have
$$\nabla_f V(x) < \nabla_{\hat f} V(x) + \gamma < -\gamma + \gamma = 0, \quad \forall x \in D \setminus \{0\}, \tag{10}$$
which implies that the actual closed-loop dynamics is also certified stable by the obtained neural Lyapunov function.
Here, we calculate $K_{\hat f}$ by using LipSDP-network developed in [30], and $M$ can be determined by
checking the inequality with an SMT solver. An initial guess of $\gamma$ is needed based on some
prior knowledge, and it will be re-computed in each epoch using (8).
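As a rough cross-check of these quantities (LipSDP [30] gives tighter certificates), for a one-hidden-layer tanh network one can bound $K_{\hat f}$ from above by the product of the layer spectral norms, since tanh is 1-Lipschitz, and then read off the lower bound that (8) imposes on $\gamma$. A small sketch with hypothetical numbers:

```python
# Sketch: crude Lipschitz upper bound for f_hat = W2 tanh(W1 z + B1) + B2 and
# the resulting lower bound on gamma from (8). All numerical values are hypothetical.
import numpy as np

def lipschitz_upper_bound(W1, W2):
    # tanh is 1-Lipschitz, so ||f_hat(z1) - f_hat(z2)|| <= ||W2||_2 ||W1||_2 ||z1 - z2||.
    return np.linalg.norm(W2, 2) * np.linalg.norm(W1, 2)

def gamma_lower_bound(K_f, K_fhat, delta, eps_bar, M):
    # From (8): K_f*delta + eps_bar + K_fhat*delta < gamma / M.
    return M * (K_f * delta + eps_bar + K_fhat * delta)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W1, W2 = rng.normal(size=(100, 3)), rng.normal(size=(2, 100))
    K_fhat = lipschitz_upper_bound(W1, W2)
    print("K_fhat upper bound:", K_fhat)
    print("gamma must exceed:",
          gamma_lower_bound(K_f=3.5, K_fhat=K_fhat, delta=5e-4, eps_bar=8.5e-3, M=1.25))
```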
A valid neural Lyapunov function can be guaranteed by the SMT solver with all the Lyapunov
conditions written as a falsification constraint in the form of a first-order logic formula over the reals [6]:
$$\Phi_{\varepsilon}(x) := \left(\sum_{i=1}^{n} x_i^2 \ge \varepsilon\right) \wedge \left(V(x) \le 0 \ \vee \ \nabla_{\hat f} V(x) \ge -\gamma\right), \tag{11}$$
where $\varepsilon$ is a numerical error parameter, which is explicitly introduced for controlling numerical
sensitivity around the origin. If the SMT solver returns UNSAT, this means that the falsification
constraint is guaranteed not to have any solutions and confirms that all the Lyapunov conditions are
met. If the SMT solver returns $\delta$-SAT, this means there exists at least one counterexample under
the $\delta$-weakening condition [31] that satisfies the falsification constraint.
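To illustrate how (11) is handed to the solver, below is a hedged sketch using dReal's Python bindings for a two-dimensional system. The closed-form expressions for `V_expr` and `lie_expr`, the domain, and the numerical constants are placeholders; in the actual framework they are generated symbolically from the trained network weights.

```python
# Sketch of encoding the falsification constraint (11) with dReal's Python API.
# V_expr and lie_expr are simple placeholder expressions, not the trained VNN/FNN.
from dreal import Variable, logical_and, logical_or, CheckSatisfiability

x1, x2 = Variable("x1"), Variable("x2")

V_expr = x1 * x1 + x2 * x2                       # placeholder Lyapunov candidate
lie_expr = -2.0 * (x1 * x1 + x2 * x2)            # placeholder Lie derivative along f_hat

eps, gamma = 0.2, 0.02                           # ball radius and margin (hypothetical values)
domain = x1 * x1 + x2 * x2 <= 1.2 ** 2           # verified region (hypothetical)

falsification = logical_and(
    domain,
    x1 * x1 + x2 * x2 >= eps,                    # exclude a small ball around the origin
    logical_or(V_expr <= 0, lie_expr >= -gamma)  # violation of either Lyapunov condition
)

result = CheckSatisfiability(falsification, 0.01)    # delta-precision 0.01
print("counterexample box:" if result else "UNSAT (conditions verified)", result)
```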


We use $\theta$ to denote the parameter vector of a neural Lyapunov function candidate $V_\theta$. The parameters $\theta$ and $k$ are found by minimizing the following cost function, which is a modification of the so-called empirical Lyapunov risk in [6], obtained by adding one more $\left\|\frac{\partial V}{\partial x}\right\|$ term, as we need $\left\|\frac{\partial V}{\partial x}\right\|$ to be small as well:
$$L(\theta, k) = \frac{1}{N}\sum_{i=1}^{N}\Big( C_1 \max(-V_\theta(x_i), 0) + C_2 \max(0, \nabla_{\hat f} V_\theta(x_i)) \Big) + C_3\, V_\theta^2(0) + C_4 \left\|\frac{\partial V_\theta}{\partial x}\right\|, \tag{12}$$
where $C_1$, $C_2$, $C_3$ and $C_4$ are tunable constants. The cost function can be regarded as a positive penalty on any violation of the Lyapunov conditions in (4) and (5). Note that the ROA can also be maximized by adding an $L_2$-like cost term to the Lyapunov risk, i.e., using $L(\theta, k) + \frac{1}{N}\sum_{i=1}^{N}\left(\|x_i\|^2 - \alpha V_\theta(x_i)\right)$, where $\alpha$ is a tunable parameter, as shown in the original paper [6].
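A minimal PyTorch sketch of this modified Lyapunov risk is shown below. The constants `C1`–`C4`, the placeholder networks, and the way $\|\partial V/\partial x\|$ is aggregated over the batch (here, its maximum) are illustrative assumptions rather than the exact training code.

```python
# Sketch of the modified empirical Lyapunov risk (12). The constants C1..C4,
# the placeholder networks, and the batch aggregation of ||dV/dx|| are assumptions.
import torch

def lyapunov_risk(v_net, f_hat, controller, x, C1=1.0, C2=1.0, C3=1.0, C4=0.1):
    x = x.clone().requires_grad_(True)
    V = v_net(x)                                                      # shape [N]
    grad_V = torch.autograd.grad(V.sum(), x, create_graph=True)[0]    # shape [N, n]
    lie = (grad_V * f_hat(x, controller(x))).sum(dim=1)               # Lie derivative along f_hat
    zero = torch.zeros(1, x.shape[1])
    risk = (C1 * torch.relu(-V) + C2 * torch.relu(lie)).mean()        # penalize condition violations
    risk = risk + C3 * v_net(zero).pow(2).squeeze() + C4 * grad_V.norm(dim=1).max()
    return risk
```

During training, this risk is minimized jointly over the VNN parameters $\theta$ and the controller gain $k$.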

Algorithm 1 Neural Lyapunov Control with Unknown Dynamics

1: function LEARNING-DYNAMICS(X_dyn, U^(bdd))
2:     Set learning rate $\alpha$, input dimension $(n + m)$, output dimension $n$
3:     repeat
4:         $\hat f \leftarrow$ NN$(x, u)$    ▷ output of forward pass of the neural network
5:         Compute MSE loss $L(f, \hat f)$
6:         Update the network weights using SGD on $L(f, \hat f)$
7:     until convergence
8:     return $\hat f$
9: end function
10: function LEARNING-LYAPUNOV(X_Lyp, $\hat f$, $k_{lqr}$)
11:     Set learning rate $\alpha$, input dimension $n$, output dimension 1
12:     Initialize the feedback controller $u$ to the LQR solution $k_{lqr}$
13:     repeat
14:         $V_\theta(x), u(x) \leftarrow$ NN$_{\theta,u}(x)$    ▷ output of forward pass of the neural network
15:         $\nabla_{\hat f} V(x) \leftarrow \sum_{i=1}^{n} \frac{\partial V}{\partial x_i}\,[\hat f]_i(x)$
16:         Compute the Lyapunov risk $L(\theta, k)$
17:         $\theta \leftarrow \theta - \alpha \nabla_\theta L(\theta, k)$
18:         $k \leftarrow k - \alpha \nabla_k L(\theta, k)$    ▷ update weights using SGD
19:     until convergence
20:     return $V_\theta$, $u$
21: end function
22: function FALSIFICATION($\hat f$, $u$, $V_\theta$, $\gamma$, $\varepsilon$, precision)
23:     Encode the conditions from (11)
24:     Use the SMT solver with the given precision to verify the conditions
25:     return satisfiability
26: end function
27: function MAIN( )
28:     input: initial guess of the bound $\gamma$, LQR parameters $k_{lqr}$, radius $\varepsilon$, solver precision, sampled states $X$, sampled inputs $U$
29:     $\hat f \leftarrow$ LEARNING-DYNAMICS(X_dyn, U^(bdd))
30:     while satisfiable do
31:         Add counterexamples to $X$
32:         $V_\theta$, $u$ $\leftarrow$ LEARNING-LYAPUNOV(X_Lyp, $\hat f$, $k_{lqr}$)
33:         Update $\gamma$ according to (8)
34:         CE $\leftarrow$ FALSIFICATION($\hat f$, $u$, $V_\theta$, $\gamma$, $\varepsilon$, precision)
35:     end while
36: end function
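The interaction between learner and falsifier in Algorithm 1 is a counterexample-guided loop. A compact Python sketch of that outer loop is given below; `train_lyapunov` and `falsify` stand in for the LEARNING-LYAPUNOV and FALSIFICATION routines above, and `sample_counterexamples` is a hypothetical helper that turns a solver box into training points.

```python
# Sketch of the counterexample-guided outer loop of Algorithm 1.
# `train_lyapunov` and `falsify` are placeholders for LEARNING-LYAPUNOV and
# FALSIFICATION; `sample_counterexamples` is a hypothetical helper.
import torch

def cegis_loop(train_lyapunov, falsify, x_train, max_rounds=20):
    for round_idx in range(max_rounds):
        v_net, controller = train_lyapunov(x_train)      # minimize the Lyapunov risk (12)
        ce_box = falsify(v_net, controller)              # encode (11) and call the SMT solver
        if ce_box is None:                               # UNSAT: Lyapunov conditions verified
            return v_net, controller
        x_train = torch.cat([x_train, sample_counterexamples(ce_box)], dim=0)
    raise RuntimeError("no verified Lyapunov function found within the round budget")

def sample_counterexamples(ce_box, n=100):
    # Hypothetical helper: sample points uniformly inside the counterexample box;
    # ce_box is assumed to be a list of (lo, hi) intervals, one per state dimension.
    lo = torch.tensor([b[0] for b in ce_box])
    hi = torch.tensor([b[1] for b in ce_box])
    return lo + (hi - lo) * torch.rand(n, len(ce_box))
```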

4 Theoretical Guarantees

In this section, we analyze the theoretical guarantees of using neural networks to learn a controller for an unknown
nonlinear system together with a Lyapunov function certifying closed-loop stability.


4.1 Approximation guarantee of unknown dynamics

Our analysis provides stability guarantees for the unknown system by rigorously quantifying the
errors using Lipschitz constants of the unknown dynamics and its neural approximation. To this end,
we need a theoretical guarantee that extends the universal approximation theorem, stating that we
can approximate the Lipschitz constants and function values by a neural network to an arbitrary precision.
Theorem 2 (Approximation of Lipschitz constants). Suppose that $K \subseteq \mathbb{R}^n$ is a compact set.
(a) If $f : K \to \mathbb{R}^m$ is $L$-Lipschitz in the uniform norm, i.e.,
$$\|f(x) - f(y)\|_\infty \le L\,\|x - y\|_\infty, \tag{13}$$
then for every $\epsilon > 0$ there exists a neural network of the form $\phi(x) = C(\sigma \circ (Ax + b))$, with $\sigma \in C^1(\mathbb{R})$ and not a polynomial, $A \in \mathbb{R}^{k \times n}$, $b \in \mathbb{R}^{k}$ and $C \in \mathbb{R}^{m \times k}$ for some $k \in \mathbb{N}$, such that $\sup_{x \in K} |f(x) - \phi(x)| < \epsilon$ and $\phi$ has a Lipschitz constant of $L + \epsilon$ in the same norms as (13).
(b) If $f : K \to \mathbb{R}^m$ is $L$-Lipschitz in the two-norm, i.e.,
$$\|f(x) - f(y)\| \le L\,\|x - y\|_2, \tag{14}$$
then for every $\epsilon > 0$ there exists a neural network $\phi$ of the same form such that $\sup_{x \in K} |f(x) - \phi(x)| < \epsilon$ and $\phi$ has a Lipschitz constant of $L + \frac{L\epsilon}{2} + \frac{\epsilon\sqrt{n + n/\epsilon}}{2}$ in the same norms as (14).

The idea of the proof is to first approximate $f$ by a smooth function $F$ and then approximate $F$ by a
neural network $\phi$; the details can be found in the appendix.
Remark 2. The equivalence of norms gives an upper bound on the Lipschitz constant for all norms.

4.2 Existence of neural Lyapunov functions

As a theoretical guarantee we show that it is possible to train a neural network as a Lyapunov function,
provided that a Lyapunov function exists. According to converse Lyapunov theorems [26, 28],
Lyapunov functions do exist when the origin is UAS. More specifically, we show that the learned
neural network satisfies the Lyapunov conditions outside some neighborhood of the origin that can be
chosen to be arbitrarily small in measure. The idea will be to perform an under-approximation of the
domain $D$ in a controlled way. Let $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), \mu)$ denote the standard measure space, where $\mathcal{B}(\mathbb{R}^n)$
is the Borel $\sigma$-algebra and $\mu$ is the Lebesgue measure. The following lemma from [32] states that it is
possible to arbitrarily under-approximate open sets by compact sets in measure.
Lemma 2. For every open set $O \in \mathcal{B}(\mathbb{R}^n)$ such that $\mu(O) < \infty$ and every $\epsilon > 0$, there exists a compact set
$K$ such that $\mu(O \setminus K) < \epsilon$.

By the pointwise approximation of the universal approximation theorem, it is not possible to satisfy the
Lyapunov conditions on all of $D$, as this set contains the origin. However, if there is a Lyapunov function for
(2) that a neural network could learn, and if practical stability (i.e., set stability with respect to a sufficiently small
neighborhood of the origin) is sufficient, Theorem 3 states that there exists a neural network satisfying
the Lyapunov conditions on a compact set $K$ except on a closed neighborhood $B$ of the origin that is
UAS. Moreover, this approximation can be controlled in measure. The details of the proof can be
found in the appendix.
Theorem 3. Suppose that the origin is UAS for system (2) and $I$ is a forward invariant set contained in
the ROA of the origin. Fix any $\epsilon_1, \epsilon_2 > 0$. There exists a forward invariant and compact set $K \subseteq I$
satisfying the under-approximation $\mu(I \setminus K) < \epsilon_1$. On $K$ there exists a neural network $V_\theta$ that satisfies
the Lyapunov conditions on $K \setminus A$, where $A$ is a closed neighborhood of the origin. The neural
Lyapunov function $V_\theta$ can certify that a closed invariant set $B$ containing $A$ and satisfying $\mu(B \setminus A) < \epsilon_2$ is UAS. Furthermore, the set $K$ is contained in the ROA of $B$.

An important feature of Lyapunov functions is that the level sets approach the ROA. We can show
that the neural Lyapunov function $V_\theta$ has a similar property. Details can be found in the appendix.

4.3 Asymptotic Stability Guarantees of Unknown Nonlinear Systems

With the theoretical guarantee of learning a neural Lyapunov function established above, we show
that the neural network trained with the learned dynamics is robust; that is, this neural network also


satisfies the Lyapunov conditions with respect to the actual dynamics $f$. Since the SMT solver
verifies the Lyapunov conditions outside of some $\varepsilon$-ball, which is not necessarily forward invariant,
the following technical assumption helps bridge this gap. The assumption is mild, because for
the nonlinear system to be stabilized, it is reasonable to assume that it has a stabilizable linearization.
A linear system $\dot{x} = Ax + Bu$ is said to be stabilizable if there exists a matrix $K$ such that $A + BK$ is Hurwitz, i.e., all
eigenvalues of $A + BK$ have negative real parts. If $(A, B)$ is stabilizable, then the gain matrix $K$ can be obtained by
an LQR controller.
Assumption 3 (ROA of LQR Controller). Suppose that the linearized model $\dot{x} = Ax + Bu$ from
Assumption 2 is stabilizable. Consequently, an LQR controller and a quadratic Lyapunov function can
be found such that the origin is UAS for the closed-loop system (2). Furthermore, we assume
that the set $B_\varepsilon$ which is not verified by the SMT solver lies in the interior of a ROA of the closed-loop system, provided by the quadratic Lyapunov function.

Since this $\varepsilon$-ball is small, we further assume that the level sets of the Lyapunov function are contained
in the ROA of the LQR controller.
Assumption 4 (Controlled Level Sets). Denote $B_\varepsilon := \{x : \|x\| \le \varepsilon\}$. Let $V$ be a continuously differentiable function
satisfying the Lyapunov conditions on $D \setminus B_\varepsilon$. Suppose that there exist constants $0 < c_1 < c_2$ such that the
following chain of inclusions holds:
$$\{x \in D : V(x) \le c_1\} \subseteq B_\varepsilon \subseteq \{x \in D : V(x) \le c_2\}, \tag{15}$$
and $\{x \in D : V(x) \le c_2\}$ lies in the interior of the ROA of the closed-loop system provided by the quadratic Lyapunov function.

Theorem 4 (Stability Guarantees for the Unknown System). Let $\hat f$ be the approximated dynamics of the right-hand
side of the closed-loop system (2) trained by the first neural network. There exists a neural Lyapunov function
$V$ which is learned using $\hat f$ and verified by an SMT solver that satisfies the Lyapunov conditions with respect to
the actual dynamics $f$. Furthermore, if the system satisfies Assumption 3 and $V$ satisfies Assumption 4, then
the origin is UAS for the closed-loop system (2).

The details of this proof can be found in the appendix. This theorem proves that if the dynamics are
approximated to sufficient precision, then the neural Lyapunov function satisfies the Lyapunov conditions on $D \setminus B_\varepsilon$ for the actual dynamics. Furthermore, if the level sets of the neural Lyapunov function are sufficiently well
behaved and the set $B_\varepsilon$ excluded from SMT verification is small, then this learning framework certifies that the
origin is UAS for the actual system (2).

5 Experiments

In this section, we demonstrate the effectiveness of the proposed algorithm on learning the
unknown dynamics and finding provably stable controllers as well as neural Lyapunov functions
for various classic nonlinear systems. In all the following problems, we use a two-layer neural
network with one hidden layer for both learning the dynamics and learning the neural Lyapunov
function. For learning the dynamics, the number of neurons in the hidden layer varies from 100 to
200 without an output layer activation function as stated before, and we call this neural network FNN for convenience.
However, for learning the neural Lyapunov function, there are six neurons in the hidden layer for all the
experiments, and we call this neural network VNN for short. Regarding other parameters, we use the Adam
optimizer for both FNN and VNN, and we use dReal as the SMT solver, setting its precision parameter to 0.01 for all experiments.
All the training of FNN is performed on Google Colab with a 16GB GPU, and VNN training is done on a 3 GHz
6-Core Intel Core i5.

5.1 Van der Pol Oscillator

As a starting point, we test the proposed algorithm on the nonlinear system without input u first to
show its effectiveness in learning unknown dynamical systems and finding the valid neural
Lyapunov function. The Van der Pol oscillator is a well-known nonlinear system with stable oscillations
in the form of a limit cycle. The area within the limit cycle is the non-convex ROA, as shown in the appendix.
According to the algorithm described in Section 3, we learn the two nonlinear dynamical
equations with 100 hidden neurons. To ensure an accurate model, we use 9 million data points
sampled on $(-1.5, 1.5) \times (-1.5, 1.5)$, and the learning rate of the training process varies from 0.1 to 1e-5 to


acquire a small enough $\bar\varepsilon$. With the learned dynamics, we aim to find a valid Lyapunov function
over the domain $\|x\|_2 < 1.2$, and the obtained neural Lyapunov function is shown in Fig. 2a. The
corresponding ROA estimate can be found in Fig. 2b. For comparison, the ROAs found by the
neural Lyapunov function and the classical LQR technique in [2] using the actual dynamics are also
plotted, as the blue and magenta ellipses respectively. The phase portrait of the system is given
as the gray curves with small arrows. It is obvious that the neural Lyapunov function after tuning
obtains a larger ROA. We also obtain a verified ROA for the unknown system that is comparable with the one
obtained with the actual dynamics based on the same neural Lyapunov approach, both larger than
the LQR case. The values of the parameters can be found in Table 1. The dynamics, the neural
Lyapunov function, and the nonlinear controller for all experiments are described in detail in the appendix.

Table 1: Parameters in the Van der Pol oscillator case

$K_f$     $K_{\hat f}$     $\delta$     $\bar\varepsilon$     $\|\partial V/\partial x\|$     $\gamma$     $\varepsilon$
3.4599    5.197            5e-4         8.5e-3                1.249                           0.02         0.2

(a) Neural Lyapunov function (b) ROA comparison with LQR

Figure 2: Neural Lyapunov function and the corresponding estimated ROA for Van der Pol oscillator.

5.2 Unicycle path following

The path following task is a typical stabilization problem for a nonlinear system. Here, we consider
path tracking control using a kinematic unicycle with error dynamics [33]. With the learned dynamics
$\hat f$, a neural Lyapunov function can be learned on the valid region $\|x\|_2 \le 0.8$, and the neural controller
is set as $u = 5\tanh(kx + b)$. The ROA comparison with the LQR method can be found in Fig. 3a.
Apparently, the neural network method yields a larger estimated ROA compared to the classical
LQR approach, in which the level set is determined by considering some relaxation of the largest
reasonable range of linearization for practical systems under a small-angle assumption ($< 9°$), given
the fact that the actual dynamics is unknown.
The parameters for this valid neural Lyapunov function and the learned dynamics in this case are
listed in Table 2. The Lipschitz constant $K_f$ is computed by using the bound $\|J_f\|_2 \le \sqrt{m}\,\|J_f\|_\infty$,
where $J_f$ is the Jacobian matrix of $f$ and $m$ is the number of rows of $J_f$. Note that $\bar\varepsilon$ can be the
maximum of the 2-norm loss over a test dataset, as stated in Section 3.2, since what we need here
is the discrepancy between the actual values and the approximated values at known samples.
In this regard, we can train FNN with fewer data samples, which is more computationally efficient.
Then, on top of that, a much larger dataset uniformly sampled over the state and input spaces is used
to calculate $\bar\varepsilon$, which contributes to a smaller $\delta$. We implement the same approach on the inverted
pendulum case as well.
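The two data-dependent quantities $\bar\varepsilon$ and $\delta$ can be computed directly from sampled data, e.g. with a k-d tree for the nearest-sample distance. A small numpy/scipy sketch follows; the stand-in dynamics, the stand-in learned model, and the sample counts are placeholders.

```python
# Sketch: eps_bar as the maximum 2-norm prediction error on known samples, and
# delta as the largest distance from a dense evaluation set to its nearest known
# sample. f_true, f_hat and the sample counts are placeholders.
import numpy as np
from scipy.spatial import cKDTree

def eps_bar(f_true, f_hat, samples):
    return np.linalg.norm(f_true(samples) - f_hat(samples), axis=1).max()

def delta_bound(known_samples, dense_samples):
    tree = cKDTree(known_samples)
    dists, _ = tree.query(dense_samples)     # nearest known sample for each dense point
    return dists.max()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    known = rng.uniform(-1.5, 1.5, size=(10_000, 3))    # (x, u) points where the loss is evaluated
    dense = rng.uniform(-1.5, 1.5, size=(200_000, 3))   # much denser uniform sample of the domain
    f_true = lambda z: np.stack([-z[:, 1], z[:, 0] + (z[:, 0] ** 2 - 1) * z[:, 1]], axis=1)
    f_hat = lambda z: f_true(z) + 1e-4                   # stand-in for the learned network
    print("eps_bar:", eps_bar(f_true, f_hat, known), "delta:", delta_bound(known, dense))
```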


Table 2: Parameters in the unicycle path following case

$K_f$     $K_{\hat f}$     $\delta$     $\bar\varepsilon$     $\|\partial V/\partial x\|$     $\gamma$     $\varepsilon$
< 45      108              1e-4         7e-3                  4.43                            0.1          0.1

(a) ROA comparison for path following (b) ROA comparison for inverted pendulum

Figure 3: Comparison of obtained ROAs for path following and inverted pendulum.

5.3 Inverted pendulum

The inverted pendulum is another well-known nonlinear control problem. This system has two state
variables, $\theta$ and $\dot\theta$, and one control input $u$. Here, $\theta$ and $\dot\theta$ represent the angular position from the
inverted position and the angular velocity, respectively. The valid domain is $\|x\|_2 \le 4$. A similar ROA comparison is
shown in Fig. 3b, where the LQR approach uses the same function as given in [6]. The details
regarding the bounds and parameters are given in the appendix.

6 Conclusion

This paper explores the ability of neural networks to approximate an unknown nonlinear system with
sufficient precision and to find a neural Lyapunov function as well as a feedback controller that stabilizes
the unknown dynamics. Provable guarantees are provided with the help of bounds on the
approximation error and on the partial derivatives of the Lyapunov function. Moreover, experimental results
show that the proposed approach outperforms the existing classical approach, given that part of the
dynamics is known. The algorithm is only tested on low-dimensional systems; how to address
high-dimensional systems is an interesting future research direction.

Acknowledgments
This work was funded in part by the Natural Sciences and Engineering Research Council of Canada
(NSERC), the Canada Research Chairs (CRC) program, and the Ontario Early Researcher Awards
(ERA) program.

References
[1] LG Matallana, AM Blanco, and JA Bandoni, “Estimation of domains of attraction: A global
optimization approach,” Mathematical and Computer Modeling, vol. 52, no. 3-4, pp. 574–585,
Aug. 2010. [Online]. Available: https://linkinghub.elsevier.com/retrieve/pii/ S0895717710001913

[2] H. Khalil, Nonlinear Control, ser. Always Learning. Pearson, 2015. [Online]. Available:
https://books.google.ca/books?id=OfHhnQEACAAJ


[3] A. Chiuso and G. Pillonetto, “System identification: A machine learning perspective,” Annual
Review of Control, Robotics, and Autonomous Systems, vol. 2, pp. 281–304, 2019.

[4] G. Pillonetto, F. Dinuzzo, T. Chen, G. De Nicolao, and L. Ljung, “Kernel methods in system
identification, machine learning and function estimation: A survey,” Automatica, vol. 50, no. 3,
pp. 657–682, 2014.

[5] Y. Jiang and Z.-P. Jiang, Robust adaptive dynamic programming. John Wiley & Sons, 2017.

[6] Y.-C. Chang, N. Roohi, and S. Gao, “Neural lyapunov control,” Advances in neural information
processing systems, vol. 32, 2019.

[7] K. Hornik, M. Stinchcombe, and H. White, “Multilayer feedforward networks are universal
approximators,” Neural networks, vol. 2, no. 5, pp. 359–366, 1989.

[8] A. Papachristodoulou and S. Prajna, “A tutorial on sum of squares techniques for systems
analysis,” in Proceedings of the 2005, American Control Conference, 2005. Portland, OR, USA:
IEEE, 2005, pp. 2686–2700. [Online]. Available: http://ieeexplore.ieee.org/document/1470374/

[9] L. Khodadadi, B. Samadi, and H. Khaloozadeh, “Estimation of region of attraction for poly
nominal nonlinear systems: A numerical method,” ISA transactions, vol. 53, no. 1, pp. 25–32,
2014.

[10] N. Gaby, F. Zhang, and X. Ye, “Lyapunov-Net: A Deep Neural Network Architecture for Lyapunov
Function Approximation,” arXiv:2109.13359 [cs, math], Sep. 2021, arXiv: 2109.13359. [Online].
Available: http://arxiv.org/abs/2109.13359

[11] Jeen-Shing Wang and Yen-Ping Chen, “A fully automated recurrent neural network for unknown
dynamic system identification and control,” IEEE Transactions on Circuits and Systems I:
Regular Papers, vol. 53, no. 6, pp. 1363–1372, Jun. 2006. [Online]. Available: http://
ieeexplore.ieee.org/document/1643442/

[12] SM Richards, F. Berkenkamp, and A. Krause, “The Lyapunov Neural Network: Adaptive Stability
Certification for Safe Learning of Dynamical Systems,” in arXiv:1808.00924 [cs], Oct. 2018,
arXiv: 1808.00924. [Online]. Available: http://arxiv.org/abs/1808.00924

[13] S. Chen, M. Fazlyab, M. Morari, GJ Pappas, and VM Preciado, “Learning Region of Attraction
for Nonlinear Systems,” arXiv:2110.00731 [cs, eess], Oct. 2021, arXiv: 2110.00731.
[Online]. Available: http://arxiv.org/abs/2110.00731
[14] ——, “Learning lyapunov functions for hybrid systems,” in Proceedings of the 24th International
Conference on Hybrid Systems: Computation and Control. Nashville Tennessee: ACM, May
2021, pp. 1–11. [Online]. Available: https://dl.acm.org/doi/10.1145/3447928. 3456644

[15] H. Dai, B. Landry, L. Yang, M. Pavone, and R. Tedrake, “Lyapunov-stable neural-network


control,” arXiv:2109.14152 [cs, eess], Sep. 2021, arXiv: 2109.14152. [Online]. Available: http://
arxiv.org/abs/2109.14152

[16] L. de Moura and N. Bjørner, “Z3: An efficient SMT solver,” in International conference on Tools
and Algorithms for the Construction and Analysis of Systems. Springer, 2008, pp. 337–340.

[17] G. Katz, C. Barrett, DL Dill, K. Julian, and MJ Kochenderfer, “Reluplex: An efficient smt solver
for verifying deep neural networks,” in International conference on computer aided verification.
Springer, 2017, pp. 97–117.

[18] S. Gao, S. Kong, and EM Clarke, “dReal: An SMT solver for nonlinear theories over the reals,” in
International conference on automated deduction. Springer, 2013, pp. 208–214.

[19] C. Dawson, Z. Qin, S. Gao, and C. Fan, “Safe Nonlinear Control Using Robust Neural Lyapunov-
Barrier Functions,” arXiv:2109.06697 [cs, eess], Oct. 2021, arXiv: 2109.06697.
[Online]. Available: http://arxiv.org/abs/2109.06697
[20] V. Zinage and E. Bakolas, “Neural Koopman Lyapunov Control,” arXiv:2201.05098 [cs, eess],
Jan. 2022, arXiv: 2201.05098. [Online]. Available: http://arxiv.org/abs/2201.05098


[21] F. Berkenkamp, R. Moriconi, AP Schoellig, and A. Krause, “Safe learning of regions of attraction for
uncertain, nonlinear systems with gaussian processes,” in 2016 IEEE 55th Conference on Decision
and Control (CDC). IEEE, 2016, pp. 4661–4666.

[22] F. Berkenkamp, M. Turchetta, A. Schoellig, and A. Krause, “Safe model-based reinforcement learning
with stability guarantees,” Advances in neural information processing systems, vol. 30, 2017.

[23] A. Mehrjou, M. Ghavamzadeh, and B. Schölkopf, “Neural lyapunov redesign,” arXiv preprint
arXiv:2006.03947, 2020.

[24] M. Farsi, Y. Li, Y. Yuan, and J. Liu, “A piecewise learning framework for control of unknown nonlinear
systems with stability guarantees,” in Learning for Dynamics and Control Conference.
PMLR, 2022, pp. 830–843.

[25] M. Marchi, B. Gharesifard, and P. Tabuada, “Training deep residual networks for uniform approximation
guarantees,” in Proceedings of the 3rd Conference on Learning for Dynamics and Control, ser.
Proceedings of Machine Learning Research, vol. 144. PMLR, 07 – 08 June 2021, pp. 677–688.
[Online]. Available: https://proceedings.mlr.press/v144/marchi21a.html

[26] Y. Lin, ED Sontag, and Y. Wang, “A smooth converse lyapunov theorem for robust stability,”
SIAM Journal on Control and Optimization, vol. 34, no. 1, pp. 124–160, 1996.

[27] F. Blanchini, “Set invariance in control,” Automatica, vol. 35, no. 11, pp. 1747–1767, 1999.

[28] AR Teel and L. Praly, “A smooth Lyapunov function from a class-KL estimate involving two positive
semidefinite functions,” ESAIM: Control, Optimization and Calculus of Variations, vol. 5, pp. 313–367,
2000. [Online]. Available: http: //www.numdam.org/item/COCV_2000__5__313_0/

[29] G. Cybenko, “Approximation by superpositions of a sigmoidal function,” Mathematics of


control, signals and systems, vol. 2, no. 4, pp. 303–314, 1989.

[30] M. Fazlyab, A. Robey, H. Hassani, M. Morari, and GJ Pappas, “Efficient and Accurate Estimation of
Lipschitz Constants for Deep Neural Networks,” arXiv:1906.04893 [cs, math, stat], Jun . 2019, arXiv:
1906.04893. [Online]. Available: http://arxiv.org/abs/1906.04893

[31] S. Gao, J. Avigad, and EM Clarke, “δ-complete decision procedures for satisfiability over the reals,” in
International Joint Conference on Automated Reasoning. Springer, 2012, pp. 286–300.

[32] H. Royden and P. Fitzpatrick, Real Analysis. Prentice Hall, 2010. [Online]. Available:
https://books.google.ca/books?id=0Y5fAAAACAAJ
[33] R. Carona, AP Aguiar, and J. Gaspar, “Control of unicycle type robots tracking, path following and point
stabilization,” in International Proceeding of IV Electronics and Telecommunications.
Citeseer, November 2008.

[34] A. Pinkus, “Approximation theory of the mlp model in neural networks,” Acta Numerica, vol. 8, p. 143–
195, 1999.

[35] AF Filippov, Differential equations with discontinuous right hand sides, ser. Mathematics and its
Applications (Soviet Series). Dordrecht: Kluwer Academic Publishers Group, 1988, vol. 18, translated
from the Russian.

[36] J. Schwartz and H. Karcher, Nonlinear Functional Analysis, ser. Notes on mathematics and its
applications. Gordon and Breach, 1969. [Online]. Available: https://books.google.ca/books?
id=7DYv1rkD1E4C


Appendix
A.1 Appendix A : Algorithm

The structure of the neural network (VNN) mentioned in Section 3.2 for simultaneously learning the neural
Lyapunov function and the nonlinear controller is detailed in Fig. 4 [6].

Figure 4: Algorithmic structure of learning a neural Lyapunov function and the corresponding nonlinear
controller with a 1-hidden layer neural network and an SMT solver.

A.2 Appendix B: Proofs

We begin by defining the notions of stability that are necessary for the proofs.

Definition 4 (Set stability). A closed set $\mathcal{A} \subseteq \mathbb{R}^n$ is said to be uniformly asymptotically stable (UAS) for the closed-loop system (2) if the following two conditions are met: (1) (Uniform stability) For every $\epsilon > 0$, there exists a $\delta > 0$ such that $\|x(0)\|_{\mathcal{A}} < \delta$ implies that $x(t)$ is defined for $t \ge 0$ and $\|x(t)\|_{\mathcal{A}} < \epsilon$ for any solution $x$ of (2) for all $t \ge 0$; and (2) (Uniform attractivity) There exists some $\delta > 0$ such that, for every $\epsilon > 0$, there exists some $T > 0$ such that $x(t)$ is defined for $t \ge 0$ and $\|x(t)\|_{\mathcal{A}} < \epsilon$ for any solution $x(t)$ of (2) whenever $\|x(0)\|_{\mathcal{A}} < \delta$ and $t \ge T$.

Definition 5 (Reachable Set). Let $R^t(x_0)$ denote the point $x(t)$ reached by the solution of (2) at time $t$ starting at $x_0$. For $T \ge 0$, define the finite-time-horizon reachable set as $R^{0 \le t \le T}(x_0) = \cup_{0 \le t \le T}\, R^t(x_0)$. Similarly, for a set $W \subseteq D$, define $R^{0 \le t \le T}(W) := \cup_{x_0 \in W}\, R^{0 \le t \le T}(x_0)$. Similarly, if solutions are defined for all $t \ge 0$, then the reachable set is defined as $R(W) := \cup_{x_0 \in W} \cup_{t \ge 0} R^t(x_0)$.

Before we prove Theorem 3 we introduce some lemmas. First, we state an extension of the universal approximation theorem, which says that it is possible to simultaneously approximate a function and its partial derivatives pointwise by a neural network. The proof of this result can be found in [34].

Theorem 5. Let $K \subseteq \mathbb{R}^m$ be a compact set and suppose $f : K \to \mathbb{R}^n$ is $C^1$. Then, for every $\epsilon > 0$ there exists a neural network $\phi : K \to \mathbb{R}^n$ of the form $\phi(x) = C(\sigma \circ (Ax + b))$, with $\sigma \in C^1(\mathbb{R})$ and not a polynomial, $A \in \mathbb{R}^{k \times m}$, $b \in \mathbb{R}^{k}$ and $C \in \mathbb{R}^{n \times k}$ for some $k \in \mathbb{N}$, such that
$$\|f - \phi\|_\infty := \sup_{x \in K} |f(x) - \phi(x)| < \epsilon \tag{16}$$
and, for all $i = 1, \dots, m$, the following holds simultaneously:
$$\left\|\frac{\partial f}{\partial x_i} - \frac{\partial \phi}{\partial x_i}\right\|_\infty < \epsilon. \tag{17}$$

To make the connection between the topology of the dynamical system and the compact set
guaranteed by Lemma 2, we consider the reachable set. The following result states that the finite-time-horizon reachable set is compact; it can be found in [35].
Lemma 3. Suppose that $K \subseteq \mathbb{R}^n$ is a compact set. Then the set $R^{0 \le t \le T}(K)$ is compact for any $T \ge 0$.

We recall Theorem 3.
Theorem 3. Suppose that the origin is UAS for system (2) and $I$ is a forward invariant set contained in
the ROA of the origin. Fix any $\epsilon_1, \epsilon_2 > 0$. There exists a forward invariant and compact set $K \subseteq I$
satisfying the under-approximation $\mu(I \setminus K) < \epsilon_1$. On $K$ there exists a neural network $V_\theta$ that satisfies
the Lyapunov conditions on $K \setminus A$, where $A$ is a closed neighborhood of the origin. The neural Lyapunov
function $V_\theta$ can certify that a closed invariant set $B$ containing $A$ and satisfying $\mu(B \setminus A) < \epsilon_2$ is UAS.
Furthermore, the set $K$ is contained in the ROA of $B$.

Proof. By the converse Lyapunov theorem [2], there exists a function $V$ satisfying the Lyapunov conditions on $I$. Lemma 2 states that there exists a compact set $W$ such that $\mu(I \setminus W) < \epsilon_1/2$. Since unions preserve compactness, we can suppose without loss of generality that $W$ contains the closed ball of radius $r$, denoted $B_r$, for $r > 0$ sufficiently small that $B_r$ lies in the interior of $I$.

By virtue of $I$ being a forward invariant set contained in the region of attraction, and since for autonomous systems asymptotic stability is equivalent to uniform asymptotic stability, there exists a time $T > 0$ such that all solutions starting in $W$ enter $A_\gamma := \{x \in B_r : V(x) \le \gamma\}$ for any $\gamma > 0$ without leaving $I$. The continuity of measure and the Lyapunov condition $V(0) = 0$ imply that there exists a constant $\gamma_0 > 0$ such that $\mu(A_{\gamma_0}) < \epsilon_1/2$, since $\cap_{\gamma > 0} A_\gamma = \{0\}$ and this is a set of measure zero. For ease of notation, we simply refer to $A_{\gamma_0}$ as $A$. Let $T \ge 0$ be the time such that all solutions starting in $W$ enter $A$. By Lemma 3, the reachable set $R^{0 \le t \le T}(W)$ is compact and satisfies $\mu(I \setminus R^{0 \le t \le T}(W)) < \epsilon_1/2$. Denote
$$K := R^{0 \le t \le T}(W) \cup A.$$
We see that $K$, which contains the origin, is compact and forward invariant. Similarly, by the continuity of $V$ and the continuity of measure, it follows that $\cap_{\gamma > \gamma_0} A_\gamma = A$, and this implies there exists a level set $A_{\gamma_1}$ such that $\mu(A_{\gamma_1} \setminus A) < \epsilon_2$.

By the continuity of $V$ there exists a constant $\beta > 0$ such that $V > \beta$ and $\nabla_f V(x) < -\beta$ on $K \setminus A$. We can suppose that $\beta < \gamma_1 - \gamma_0$ in the inequality above. By Theorem 5, there exists a neural network approximation of $V$, denoted $V_\theta$, satisfying the pointwise bounds $\|V_\theta - V\|_\infty < \beta/2$ and $\|\nabla_f V - \nabla_f V_\theta\|_\infty < \beta/2$ on $K \setminus A$. This proves that $V_\theta$ satisfies the Lyapunov conditions on $K \setminus A$. To summarize, we have established that
$$V_\theta(x) > \beta/2 \quad \forall x \in K \setminus A,$$
and
$$\nabla_f V_\theta(x) < -\beta/2 \quad \forall x \in K \setminus A.$$
By the pointwise bound $\|V_\theta - V\|_\infty < \beta/2$ it follows that
$$A \subseteq B := \{x \in B_r : V_\theta(x) \le \gamma_0 + \beta/2\} \subseteq A_{\gamma_1}.$$
This proves that $\mu(B \setminus A) < \epsilon_2$. Now we show that $V_\theta$ verifies that the set $B$ is uniformly asymptotically stable.

(Uniform Stability) Let $\epsilon > 0$ be given as per the definition of uniform stability. Denote
$$B_\epsilon(B) := \cup_{x \in B} B_\epsilon(x).$$
Without loss of generality, by taking $\epsilon < r$ we can assume that $B_\epsilon(B) \subseteq K$. Choose $c > 0$ such that
$$0 < c < \min_{\|x\|_B = \epsilon} V_\theta(x)$$
holds. Then, by a contradiction argument, the set
$$\Omega^c := \{x \in B_\epsilon(B) : V_\theta(x) \le c\}$$
is contained in the interior of $B_\epsilon(B)$. By the continuity of $V_\theta$ and the compactness of $B$, $V_\theta$ is uniformly continuous on $B$. Thus, there exists $0 < \delta < \epsilon$ such that
$$\|x_0 - x\| < \delta \implies |V_\theta(x) - V_\theta(x_0)| < c - \gamma_0 - \beta/2 \quad \text{for all } x_0 \in B.$$
In particular, this proves that $B_\delta(B) \subseteq \Omega^c \subseteq B_\epsilon(B)$. A standard argument shows that the set $\Omega^c$ is forward invariant, and hence for all $x_0 \in B_\delta(B)$ this implies that $x(t) \in \Omega^c$ for all $t \ge 0$, which proves uniform stability.

(Uniform Attractivity) Given that $K$ is a compact positive invariant set that contains $A$, we claim that there exists some time $T > 0$ at which the solution enters $A$. Indeed, suppose otherwise; this means that $x(t) \in K \setminus A$ for all $t \ge 0$. By the bounds established above, $\nabla_f V_\theta(x(t)) < -\beta/2$ for all $t \ge 0$ on $K \setminus A$. It follows that
$$V_\theta(x(t)) = V_\theta(x(0)) + \int_0^t \nabla_f V_\theta(x(\tau))\, d\tau \le V_\theta(x(0)) - \beta t/2.$$
As the right-hand side will eventually become negative, this contradicts $V_\theta > 0$ on $K \setminus A$. To conclude, note that the set $A$ is forward invariant, which implies that $\|x(t)\|_A = 0$ for all $t \ge T$. In particular, as $A$ is contained in $B$, this proves the uniform attractivity of $B$.

Suppose further that $I$ is the ROA of the system (2). It was mentioned as a closing remark that if the Lyapunov function $V$ is radially unbounded, meaning that $V(x) \to \infty$ as $x \to \partial I$ (the boundary of $I$), then the level sets of $V$ approach the ROA. If the origin is UAS for system (2), then by the converse Lyapunov theorem, it follows that $V(x)$ is radially unbounded. We show that the neural Lyapunov function inherits a similar property, where the level sets approach $K$.
Theorem 6. In addition to the assumptions of Theorem 3, suppose that $I$ is the region of attraction and that it is bounded. Set $W_c := \{x \in K : V_\theta(x) \le c\}$. Then, for any sequence $k_n \to \infty$, $\cup_{n \in \mathbb{N}} W_{k_n} = K$.

Proof. Without loss of generality, suppose that $k_n$ is an increasing sequence and $\epsilon < k_1$. Again, define $V^c = \{x \in D : V(x) \le c\}$. Note that if $x \in K$ satisfies $V_\theta(x) \le c$, then $V(x) \le c + \epsilon$. Therefore, $W_{k_n} \subseteq V^{k_n + \epsilon} \cap K$. A similar argument shows that $V^{k_n - \epsilon} \cap K \subseteq W_{k_n} \subseteq V^{k_n + \epsilon} \cap K$. Since $K \subseteq D$, we have $\cup_{n \in \mathbb{N}} V^{k_n - \epsilon} \cap K = K$ and $\cup_{n \in \mathbb{N}} V^{k_n + \epsilon} \cap K = K$, and this implies that $\cup_{n \in \mathbb{N}} W_{k_n} = K$.

We have now established guarantees for the existence of neural Lyapunov functions. The proof of Theorem 4 is analogous, so we defer the proof of Theorem 2 to the end of this section.
Theorem 4 (Stability Guarantees for the Unknown System). Let $\hat f$ be the approximated dynamics of the right-hand side of the closed-loop system (2) trained by the first neural network. There exists a neural Lyapunov function $V$ which is learned using $\hat f$ and verified by an SMT solver that satisfies the Lyapunov conditions with respect to the actual dynamics $f$. Furthermore, if the system satisfies Assumption 3 and $V$ satisfies Assumption 4, then the origin is UAS for the closed-loop system (2).

Proof. Fix $\gamma > 0$ and let $M > 0$ be chosen such that $\left\|\frac{\partial V}{\partial x}\right\| < M$. As $V$ is learned using the learned dynamics $\hat f$, $V$ satisfies $V \ge 0$ and $\nabla_{\hat f} V(x) < -\gamma$ on $D \setminus B_\varepsilon$. To certify that $V$ satisfies the Lyapunov conditions on $D \setminus B_\varepsilon$, it suffices to verify that $\nabla_f V < 0$. By the universal approximation theorem, there exists a neural network $\hat f$ approximating $f$ such that $\|f(x, u(x)) - \hat f(x, u(x))\| < \frac{\gamma}{M}$ on $D \setminus \{x : \|x\| < \varepsilon\}$. As in (10), the following holds:
$$\nabla_f V(x) < \nabla_{\hat f} V(x) + \gamma < -\gamma + \gamma = 0, \quad \forall x \in D \setminus B_\varepsilon. \tag{18}$$
Therefore, $V$ satisfies the Lyapunov conditions on $D \setminus B_\varepsilon$.


(Uniform Stability). The uniform stability property follows from the Assumption 3 as the quadratic
Lyapunov function guarantees uniform stability at the origin.
(Uniform Attractiveness). We show that under these assumptions, the neural Lyapunov function is able
to verify uniform attractivity. As uniform attractiveness is equivalent to attractiveness for autonomous

15
Machine Translated by Google

systems, it suffices to verify that any level set of V, denoted V c,


which contains Bÿ is a ROA for this dynamical
system. By a similar argument to the Uniform Stability part of Theorem 3, any trajectory starting
in V c must eventually enter Bÿ. Since Bÿ is contained in the ROA of the closed loop system
provided by the quadratic Lyapunov function, it follows that x(t) ÿ 0 as t ÿ ÿ.

As a remark, it can be seen from this proof that if Assumption 3 is satisfied then any level set containing
the ball must be a ROA of the system. This is because trajectories flow from one level set to a sublevel
set. Eventually trajectories must enter the level set contained in Bÿ.
To prove Theorem 2, we introduce concepts that will allow us to approximate Lipschitz functions smoothly. Let us first restate the theorem.
Theorem 2 (Approximation of Lipschitz constants). Suppose that $K \subseteq \mathbb{R}^n$ is a compact set.
(a) If $f : K \to \mathbb{R}^m$ is $L$-Lipschitz in the uniform norm, i.e.,
$$\|f(x) - f(y)\|_\infty \le L\,\|x - y\|_\infty, \tag{13}$$
then for every $\epsilon > 0$ there exists a neural network of the form $\phi(x) = C(\sigma \circ (Ax + b))$, with $\sigma \in C^1(\mathbb{R})$ and not a polynomial, $A \in \mathbb{R}^{k \times n}$, $b \in \mathbb{R}^{k}$ and $C \in \mathbb{R}^{m \times k}$ for some $k \in \mathbb{N}$, such that $\sup_{x \in K} |f(x) - \phi(x)| < \epsilon$ and $\phi$ has a Lipschitz constant of $L + \epsilon$ in the same norms as (13).
(b) If $f : K \to \mathbb{R}^m$ is $L$-Lipschitz in the two-norm, i.e.,
$$\|f(x) - f(y)\| \le L\,\|x - y\|_2, \tag{14}$$
then for every $\epsilon > 0$ there exists a neural network $\phi$ of the same form such that $\sup_{x \in K} |f(x) - \phi(x)| < \epsilon$ and $\phi$ has a Lipschitz constant of $L + \frac{L\epsilon}{2} + \frac{\epsilon\sqrt{n + n/\epsilon}}{2}$ in the same norms as (14).

The idea will be to first approximate $f$ by a smooth function and then approximate this smooth function by a neural network. To this end, define $\eta \in C^\infty(\mathbb{R}^n)$ by
$$\eta(x) := \begin{cases} C \exp\left(\dfrac{1}{|x|^2 - 1}\right) & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1, \end{cases}$$
where the constant $C > 0$ is a normalizing constant selected so that $\int_{\mathbb{R}^n} \eta\, dx = 1$. Some standard properties of $\eta(x)$ are that $\eta \ge 0$, $\eta \in C^\infty(\mathbb{R}^n)$ and $\mathrm{spt}(\eta) \subseteq B_1(0)$, the unit ball in $\mathbb{R}^n$. For each $\epsilon > 0$, set
$$\eta_\epsilon(x) := \frac{1}{\epsilon^n}\, \eta\!\left(\frac{x}{\epsilon}\right).$$
We call $\eta_\epsilon$ the standard mollifier. The functions $\eta_\epsilon$ are $C^\infty$ and satisfy
$$\int_{\mathbb{R}^n} \eta_\epsilon\, dx = 1, \qquad \mathrm{spt}(\eta_\epsilon) \subseteq B(0, \epsilon).$$
Then, taking the convolution of $f$ with the mollifier $\eta_\epsilon$, denote $f_\epsilon = f * \eta_\epsilon$, which can be written as
$$f_\epsilon(x) = \int_{\mathbb{R}^n} f(y)\,\eta_\epsilon(x - y)\, dy = \int_{\mathbb{R}^n} f(x - y)\,\eta_\epsilon(y)\, dy = \frac{1}{\epsilon^n}\int_{B(0, \epsilon)} f(x - y)\,\eta\!\left(\frac{y}{\epsilon}\right) dy = \int_{B(0, 1)} f(x - \epsilon y)\,\eta(y)\, dy.$$
It is well known that $f_\epsilon \in C^\infty$. Additionally, by the Lipschitz continuity of $f$, uniform convergence holds on $\mathbb{R}^n$:
$$|f_\epsilon(x) - f(x)| \le \int_{B(0,1)} |(f(x - \epsilon y) - f(x))\,\eta(y)|\, dy \le L\epsilon \int_{B(0,1)} \|y\|\,\eta(y)\, dy \le L\epsilon \int_{B(0,1)} \eta(y)\, dy \le L\epsilon.$$
However, to define $f_\epsilon$, we need the function $f$ to be defined on all of $\mathbb{R}^n$. Therefore, we need a special case of the following lemma, known as the Kirszbraun theorem. The proof of this result can be found in [36].


Lemma 4. Suppose that $U \subseteq \mathbb{R}^n$. For any Lipschitz map $f : U \to \mathbb{R}^m$ there exists a Lipschitz continuous map
$$F : \mathbb{R}^n \to \mathbb{R}^m$$
that extends $f$ and has the same Lipschitz constant as $f$.

We are now ready to prove Theorem 2.

Proof. As the proof of (a) is analogous to the proof of (b) and simpler, we elect to prove (b) only. Since $f$ is $L$-Lipschitz, the component functions $f_i$ are $L$-Lipschitz. Since neural networks can be stacked in parallel, it suffices to prove the result for the case $m = 1$. Moreover, since $K$ is compact, by taking the approximation on some hypercube containing $K$ and then restricting onto $K$, we can also assume without loss of generality that $K$ is convex. The uniform norm is still well-defined, as the restriction of a continuous function is continuous. By the Kirszbraun theorem there exists an extension $F$ to $\mathbb{R}^n$ with the same Lipschitz constant $L$, so we can suppose without loss of generality that $f$ is defined on $\mathbb{R}^n$ and has Lipschitz constant $L$. Denote $f_{\epsilon'} := f * \eta_{\epsilon'}$. Since $f_{\epsilon'}(x) = \int_{\mathbb{R}^n} f(x - y)\,\eta_{\epsilon'}(y)\, dy$ from the above calculation, we see that
$$|f_{\epsilon'}(x_1) - f_{\epsilon'}(x_2)| \le \int_{\mathbb{R}^n} |(f(x_1 - y) - f(x_2 - y))\,\eta_{\epsilon'}(y)|\, dy \le L\|x_1 - x_2\| \int_{\mathbb{R}^n} \eta_{\epsilon'}(y)\, dy = L\|x_1 - x_2\|.$$
This shows that $f_{\epsilon'}$ is $L$-Lipschitz. Since $f_{\epsilon'} \to f$ uniformly, we can choose $\epsilon' > 0$ small enough so that $\|f_{\epsilon'} - f\|_\infty < \epsilon/2$. Therefore, by Theorem 5, there exists a neural network $\phi$ such that $\sup_{x \in K} |f_{\epsilon'}(x) - \phi(x)| < \epsilon/2$ and $\sup_{x \in K} \left|\frac{\partial f_{\epsilon'}}{\partial x_i}(x) - \frac{\partial \phi}{\partial x_i}(x)\right| < \epsilon/2$ for all $i = 1, \dots, n$. Since $f_{\epsilon'}$ is $L$-Lipschitz, it follows that $\|\nabla f_{\epsilon'}\|_2 \le L$. By the uniform bound on the partial derivatives and the inequality $(a + b)^2 \le (1 + \epsilon)a^2 + (1 + 1/\epsilon)b^2$, this gives
$$\|\nabla\phi\|_2^2 = \left(\frac{\partial\phi}{\partial x_1}\right)^2 + \dots + \left(\frac{\partial\phi}{\partial x_n}\right)^2 \le (1 + \epsilon)\left[\left(\frac{\partial f_{\epsilon'}}{\partial x_1}\right)^2 + \dots + \left(\frac{\partial f_{\epsilon'}}{\partial x_n}\right)^2\right] + (1 + 1/\epsilon)\,\frac{n\epsilon^2}{4} \le (1 + \epsilon)L^2 + \frac{(n + n/\epsilon)\,\epsilon^2}{4}.$$
Taking square roots and using $\sqrt{1 + \epsilon} \le 1 + \epsilon/2$ gives $\|\nabla\phi\|_2 \le L + \frac{L\epsilon}{2} + \frac{\epsilon\sqrt{n + n/\epsilon}}{2}$. Therefore, by the mean value theorem and the convexity of $K$, it follows that $\phi$ is $\left(L + \frac{L\epsilon}{2} + \frac{\epsilon\sqrt{n + n/\epsilon}}{2}\right)$-Lipschitz.

A.3 Appendix C: Experiments

As stated in Section 5, the learned dynamics is of the form $\hat f = W_2 \tanh(W_1 X + B_1) + B_2$, where $X = [x, u]$. In VNN, the valid neural Lyapunov function is of the following form: $V_\theta = \tanh(W_2 \tanh(W_1 x + B_1) + B_2)$. It is worth mentioning that we have access to the true nonlinear dynamics and, consequently, we are able to test the neural Lyapunov functions on the actual dynamics.
In all three experiments, we observe that the neural Lyapunov functions are indeed valid Lyapunov functions for the actual dynamics, which shows the effectiveness of the proposed algorithm. The code is open sourced at https://github.com/RuikunZhou/Unknown_Neural_Lyapunov.
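For reference, the sketch below evaluates a neural Lyapunov function of this form from explicit weight matrices with numpy; the small matrices used here are random placeholders rather than the trained values listed in the following subsections.

```python
# Sketch: evaluating V_theta(x) = tanh(W2 tanh(W1 x + B1) + B2) from explicit
# weight matrices. The matrices below are random placeholders, not trained values.
import numpy as np

def v_theta(x, W1, B1, W2, B2):
    """x: array of shape [N, n]; returns V_theta(x) of shape [N]."""
    hidden = np.tanh(x @ W1.T + B1)             # [N, 6]
    return np.tanh(hidden @ W2.T + B2).ravel()  # [N]

rng = np.random.default_rng(0)
W1, B1 = rng.normal(size=(6, 2)), rng.normal(size=6)    # 6 hidden neurons, 2 states
W2, B2 = rng.normal(size=(1, 6)), rng.normal(size=1)
print(v_theta(np.array([[0.1, -0.2], [0.0, 0.0]]), W1, B1, W2, B2))
```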


A.3.1 Van der Pol oscillator

The state equations of the Van der Pol oscillator are:
$$\dot{x}_1 = -x_2, \qquad \dot{x}_2 = x_1 + (x_1^2 - 1)\,x_2. \tag{19}$$
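A minimal sketch of generating supervised data for FNN from (19), i.e. uniform samples of the state mapped to the right-hand side, is given below; the sample count mirrors the setup of Section 5.1 but is otherwise illustrative.

```python
# Sketch: generating (state, derivative) training pairs for the Van der Pol
# oscillator (19) by uniform sampling on (-1.5, 1.5)^2. The sample count is illustrative.
import numpy as np

def van_der_pol(x):
    x1, x2 = x[:, 0], x[:, 1]
    return np.stack([-x2, x1 + (x1 ** 2 - 1.0) * x2], axis=1)

rng = np.random.default_rng(0)
X = rng.uniform(-1.5, 1.5, size=(1_000_000, 2))   # the paper uses about 9 million points
Y = van_der_pol(X)                                 # regression targets for FNN
print(X.shape, Y.shape)
```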

Correspondingly, the phase plot and the limit cycle of this nonlinear system are shown in Figure 5 [2].

Figure 5: Phase space plot and the limit cycle (bold black line) of the Van der Pol oscillator without controller,
where the area within the bold black curve forms the actual ROA.

Clearly, we can write $x = [x_1\ x_2]^T$, and we use 100 hidden neurons to learn the dynamics in FNN.
With the learned dynamics, the weight and bias matrices of the obtained neural Lyapunov function in VNN are:
$$W_1 = \begin{bmatrix} -1.82994 & -0.70762 & 3.35979 & -6.42827 & -1.14237 & 0.39034 \\ 0.32546 & -1.16843 & -0.03503 & 1.30866 & 0.57501 & 0.27398 \end{bmatrix}^T,$$
$$W_2 = \begin{bmatrix} -1.32270 & -0.73489 & 1.87897 & 0.89612 & 1.65451 & 1.17499 \end{bmatrix},$$
$$B_1 = \begin{bmatrix} -2.30191 & 0.38658 & 0.47604 & 0.83902 & 0.87791 & 1.18262 \end{bmatrix} \quad \text{and} \quad B_2 = [0.62172].$$

A.3.2 Unicycle path following

In this case, we have two state variables, the angle error $\theta_e$ and the distance error $d_e$, and the dynamics of this system can be written as:
$$\dot{s} = \frac{v\cos(\theta_e)}{1 - d_e\,\kappa(s)}, \qquad \dot{d}_e = v\sin(\theta_e), \qquad \dot{\theta}_e = \omega - \frac{v\,\kappa(s)\cos(\theta_e)}{1 - d_e\,\kappa(s)}. \tag{20}$$
Here we assume the target path is a unit circle, $\kappa(s) = 1$, and take $\omega$ as the input $u$ with $x = [d_e\ \theta_e]^T$; consequently, the dynamical system is of the form $\dot{x} = f(x, u)$. Similarly, after obtaining the learned dynamics $\hat f(x, u)$ with 200 hidden neurons, the weight and bias matrices of $V_\theta$ for this experiment

are recorded below:
$$W_1 = \begin{bmatrix} -2.13787 & -0.02771 & 2.83659 & -3.33855 & 0.61321 & 4.98050 \\ 1.07949 & -0.25036 & 0.69794 & -2.23639 & -1.62861 & 0.11680 \end{bmatrix}^T,$$
$$W_2 = \begin{bmatrix} -1.23695 & 1.08396 & -2.13833 & -0.76877 & -0.84737 & 1.47562 \end{bmatrix},$$
$$B_1 = \begin{bmatrix} -1.90726 & 0.87544 & 0.18892 & 0.73855 & 1.09844 & -0.79774 \end{bmatrix} \quad \text{and} \quad B_2 = [0.59095],$$
and the nonlinear controller function is $u = 5\tanh(-5.95539\, d_e - 4.03426\, \theta_e + 0.19740)$.

A.3.3 Inverted pendulum

The system dynamics of the inverted pendulum can be described as
$$\ddot{\theta} = \frac{mg\ell\sin(\theta) + u - 0.1\,\dot{\theta}}{m\ell^2}. \tag{21}$$
In this example, the only nonlinear function we need to learn with FNN is (21). Therefore, the input $[x\ u]^T$ is 3-dimensional and the output is 1-dimensional. Using the constants $g = 9.81$, $m = 0.15$ and $\ell = 0.5$, by the same process as in the previous experiment, the weight and bias matrices of the neural Lyapunov function for this experiment are listed below, and the corresponding parameters can be found in Table 3.
$$W_1 = \begin{bmatrix} 0.03331 & 0.03467 & 2.12564 & -0.39925 & 0.12885 & 0.95375 \\ -0.03113 & -0.01892 & 0.02354 & -0.10678 & -0.32245 & 0.01298 \end{bmatrix}^T,$$
$$W_2 = \begin{bmatrix} -0.33862 & 0.65177 & -0.52607 & 0.23062 & -0.04802 & 0.66825 \end{bmatrix},$$
$$B_1 = \begin{bmatrix} -0.48061 & 0.88048 & 0.86448 & -0.87253 & 0.81866 & -0.26619 \end{bmatrix} \quad \text{and} \quad B_2 = [0.22032],$$
and the nonlinear controller function is $u = 20\tanh(-23.28632\,\theta - 5.27055\,\dot{\theta})$.

Table 3: Parameters in the inverted pendulum case

$K_f$        $K_{\hat f}$     $\delta$     $\bar\varepsilon$     $\|\partial V/\partial x\|$     $\gamma$     $\varepsilon$
< 33.214     633.806          5e-5         5e-3                  0.51                            0.02         0.4

A.4 Appendix D: Limitations and future work

This work is concerned with formulating an algorithm to learn the unknown dynamics with stability guarantees
using Lyapunov functions with nonlinear controllers. Since the experiments are run in an ideal setting, we must
admit there are some limitations, which will be further addressed or taken into consideration in future work.

1. The dataset used to train the unknown dynamics and learn the Lyapunov function is generated from the
trajectories of solutions to ordinary differential equations, but in practice the actual measurements of the
states are typically noisy, and sometimes it is difficult to have direct access to the state measurements
and obtain a significant number of data points. In this manuscript, we first aim to show that the proposed
approach works well with ideal measurements, both practically and theoretically. In the future, we will
study how to learn the unknown dynamics and a robust Lyapunov function with different values of the margin $\gamma$ in
(11) to guarantee stability with noisy measurements. Further, the implementation of this algorithm on
real dynamical systems will be investigated afterwards as well.

2. Although all the nonlinear systems in Section 5 are widely studied and standard nonlinear problems, the
systems considered here are relatively low-dimensional. This is primarily due to the limited expressiveness
of shallow neural networks with limited width and the scalability of the SMT solvers. As in all such tasks, we
have to consider the trade-off between computational time and performance, so the proposed number of
neurons in Section 5 achieves this balance.
For now, the main bottleneck we experienced is approximating high-dimensional dynamics


with one-hidden-layer shallow neural networks. Regarding the scalability of the SMT
solver, we have seen that dReal works well with learned complex high-dimensional systems in
[6] and [20]. In our future work, we will try to learn high-dimensional unknown dynamics,
for instance quadrotor dynamics in six dimensions, with deeper neural networks. We
believe that dReal should be able to handle such a case.
3. Computing a valid Lyapunov function is challenging and has been a well-studied topic for
dynamical systems. We admit that our method is not complete, that is, it does not guarantee
that we can obtain a valid Lyapunov function after running the algorithm. The original paper
of this framework [6] suffers from the same issue and, to the best of our knowledge,
no papers in the literature have addressed it.
Finding a complete algorithm for computing Lyapunov functions is an interesting topic for
future research.
