You are on page 1of 8

Disturbance Rejection-Guarded Learning for Vibration Suppression of

Two-Inertia Systems
Fan Zhang1 , Jinfeng Chen1 , Yu Hu1 , Zhiqiang Gao1 , Ge Lv2 , Qin Lin1

Abstract— Model uncertainty presents significant challenges design of a disturbance rejection controller. Among the array
in vibration suppression of multi-inertia systems, as these of existing disturbance observers, the extended state observer
systems often rely on inaccurate nominal mathematical models (ESO) [6] is gaining popularity due to its simplicity in
due to system identification errors or unmodeled dynamics.
An observer, such as an extended state observer (ESO), can implementation. For the formulation of an ESO, the system
arXiv:2404.10240v1 [eess.SY] 16 Apr 2024

estimate the discrepancy between the inaccurate nominal model is modeled as a simple chained integrator with a total
and the true model, thus improving control performance via disturbance term (also called lumped disturbance, f ) that
disturbance rejection. The conventional observer design is includes both internal and external disturbances. The total
memoryless in the sense that once its estimated disturbance disturbance is treated as an extended state to be estimated
is obtained and sent to the controller, the datum is discarded.
In this research, we propose a seamless integration of ESO and together with other states. The estimated disturbance can be
machine learning. On one hand, the machine learning model mitigated through various means, including a simple state
attempts to model the disturbance. With the assistance of prior feedback controller or more advanced control strategies such
information about the disturbance, the observer is expected as sliding mode control [7] and model predictive control [8].
to achieve faster convergence in disturbance estimation. On It is worth noting that the traditional ESO operates in a
the other hand, machine learning benefits from an additional
assurance layer provided by the ESO, as any imperfections in memoryless fashion, i.e., once it estimates a disturbance and
the machine learning model can be compensated for by the transmits it to the controller, the datum used for estimation
ESO. We validated the effectiveness of this novel learning-for- is then discarded. However, as a control system operates, we
control paradigm through simulation and physical tests on two- can improve our understanding of the disturbance through
inertial motion control systems used for vibration studies. collected operational data. Prior works [9], [10] show that
Index Terms— Machine Learning, Disturbance Rejection,
Extended State Observer, Model Uncertainty a model-based ESO (MB-ESO), which utilizes prior model
information about the disturbance (such as a detailed dy-
namic model obtained through system identification), tends
I. I NTRODUCTION to exhibit reduced sensitivity to noise when compared to a
model-free ESO (MF-ESO) that assumes a simple chained
Vibration suppression of multi-inertia systems is critical
integrator as a nominal model. In order to circumvent the
in many engineering applications, including automotive sus-
need for extensive system identification and maximize the
pensions, series elastic actuators (SEA), and various other
utilization of disturbance information, we propose to leverage
motion control systems [1]. These systems often involve mul-
machine learning (ML), which has powerful capacities for
tiple inertia components with a two-inertia subsystem serving
nonlinear optimization, to memorize and generalize the past
as a fundamental block, connected by flexible couplings,
estimations from the ESO as a feedforward estimation of
which leads to inherent resonance issues. This resonance
the disturbance. The learning component is expected to
can cause dynamic stresses, energy wastes, and performance
capture the internal dynamics as well as patterns of external
degradation, therefore posing significant challenges to the
disturbances.
systems’ efficiency and stability [2], [3]. Given the funda-
[11], [12], [13] combine ESO with iterative learning con-
mental challenge of system identification and the necessity
trol (ILC) for repetitive control tasks. Our approach focuses
for real-time performance, it is common practice to employ
on general control tasks rather than just the repetitive ones.
a simplified or inaccurate nominal dynamic model. Conse-
In addition, we assume that system dynamics, as well as
quently, the disturbances become inevitable, necessitating
disturbances, are unknown and not necessarily repetitive. In
their rejection to achieve robust control. The disturbance
[14], a neural network is utilized to tune the parameters of
includes internal (i.e., unknown or unmodelled parts of the
ESO rather than explicitly learning the disturbance. Other
plant dynamics) and external (i.e., perturbations from the
learning-for-control approaches such as [15] employ neural
outside affecting the dynamics) [4], [5].
networks to capture discrepancies between a nominal model
The observer-based method has emerged as a promising
F̂ (xk , uk ) and the true model F (xk , uk ). Since the state
approach to estimating the disturbance for the subsequent
of the true model is unknown, the measured next state
1 The authors are with the Center for Advanced Control xk+1 is used to update the error model represented by the
Technologies (CACT), Cleveland State University, 2121 Euclid neural network. However, these methods always assume full-
Avenue, Cleveland, OH 44115, USA. Corresponding author: Qin Lin, state information is available. In addition, when the learning
q.lin80@csuohio.edu
2 Ge Lv is with the Department of Mechanical Engineering, Clemson performance falls short of expectations, it may result in sub-
University, 105 Sikes Hall, Clemson, SC 29634, USA. optimal performance for subsequent model-based controllers.
In contrast, our approach represents a novel paradigm that design is two-fold: 1) no need to change the existing
aims at learning the total disturbance with the help of output framework; 2) users can customize the learning com-
measurements instead of true values for states. Furthermore, ponents by choosing any appropriate machine learning
our paradigm includes a correction mechanism for cases model.
where the learning component fails to accurately capture • Our learning and estimation are real-time and online.
the disturbances. The residual total disturbance, i.e., the We showcase the efficacy of our framework through
remainder excluding the disturbance already estimated by simulations and a real-world two-inertia testbed as a
the learning component, will be estimated by a conventional fundamental block for a multi-inertia system.
ESO in a feedback correction manner. Through this seam-
The remainder of this paper is structured as follows.
less integration, even when the learning-based estimation
We first go through the preliminaries in Sec. II. Then, we
struggles to converge effectively, we can leverage the ESO
construct our framework in Sec. III. Simulation results of the
for feedback correction, thereby adding an extra layer of
two-mass-spring benchmark system are presented in Sec. IV,
robustness and assurance to the system.
followed by the hardware experiments of a torsional plant in
In our new framework, as visualized in Fig. 1, we refer
Sec. V. Finally, we conclude our work and discuss possible
to the learning-enabled extended state observer as L-ESO.
future research directions in Sec. VI.
The estimation fˆ of the true total disturbance f consists of
fˆL and ∆fˆ, which are from the learning component and
the ESO, respectively. First, ESO uses the information of II. P RELIMINARY
control u and observation y to estimate the system’s states x̂
and the residual disturbance ∆fˆ. Second, ESO’s estimation, The multi-inertia system can be represented as the sum of
including x̂ and u are fed as input to the learning component a nominal part and a nonlinear time-varying part:
for learning a regression model. The learning component car-
ries out the feedforward estimation fˆL , after which an online (
˙
x̄(t) = A0 x̄(t) + B0 u(t) + E0 f (x(t), d(t), t)
optimization iteratively minimizes the difference between fˆL (1)
and fˆ, allowing the learning component to approximate the y = C0 x̄
total disturbance accurately. In situations where imperfect
learning introduces errors, the ESO serves as an additional where x̄ ∈ Rn is the state vector, u ∈ R is a control input,
layer to rectify. y ∈ R is a measured output, and f : Rn+1 ×[0, ∞] → R is an
unknown function representing the time-varying uncertainty,
which contains external disturbance d(t) ∈ R, unmodeled
dynamics, and parameter uncertainty. Terms A0 , B0 , E0 and
C0 are real and known matrices with appropriate dimensions.
For the particular case of a two-inertial system with n =
4, meaning two states for each inertial position/angular and
velocity/angular velocity, please refer to the details in the
example in Sec. IV. The justification of classifying (1) as a
nonlinear time-varying system can be found in [16], [17].
Traditionally, an ESO is established for a system in a
chained integrator form [6]. However, in our most recent
Fig. 1: The proposed framework in this paper, where the red work [18], we have significantly expanded the applicability
and the blue blocks represent the L-ESO and the disturbance scope of ESO and rigorously proved that for a general
rejection tracking controller, respectively. Once the total system (1), given that Assumption 1 and the Assumption
disturbance is estimated, the tracking controller will be able 2 are satisfied, an ESO can be established to estimate f by
to reject disturbance. releasing the chained integrator form requirement.

The contributions of our work are summarized as follows: Assumption 1. (A0 , C0 ) is observable.
• We propose a novel framework that combines ML and
Assumption 2. (A0 , E0 , C0 ) has no invariant zeros.
ESO for feedforward estimation and feedback correc-
tion for a general disturbance rejection tracking con- For system (1), under the Assumptions 1, and 2 , there
trol task. Compared with existing learning-for-control exists a matrix
frameworks, we estimate states and disturbances in a 
C0

unique way. We also have an extra error correction  C0 A0 
mechanism for the learning component. S=

..

 (2)
• The learning component serves as an add-on to existing  . 
ESO-based control architecture. As shown in Fig. 1, C0 An−1
0
only a learning component and a few connections (in
green) are introduced. The advantage of our modular such that
where −a0 x1 − · · · − an−1 xn + (b − b0 )u is the internal dis-

0 1 ... 0
 turbance (unknown/unmodelled dynamics), b0 is the nominal
 .. .. .. ..  control gain, and d is the external disturbance. In such a case,
Ā0 = SA0 S −1 =  .
 . . . 
 the total disturbance becomes:
 0 ... 0 1 
−a0 −a1 ... −an−1 (3) f = −a0 x1 − · · · − an−1 xn + (b − b0 )u + d (8)
 T
B̄0 = SB0 = 0 0 ... b ESO treats the total disturbance f as an extended state,
C̄0 = C0 S −1 = 1 0 such that a Luenberger observer can be designed to estimate
 
... 0
 T both the original system state x and the total disturbance f .
Ē0 = SE0 = 0 0 ... 1 The augmented dynamic system is as follows:
form the following new system
" # " #
 ẋ = A x + Bu + E f˙

f˙ f (9)
(
ẋ = Ā0 x + B̄0 u + Ē0 f
(4)

y = Cx

y = C̄0 x
Ā0 E¯0
   
B̄0
The readers are referred to [18] for more details on the where A = , B = ,
01×n 0 (n+1)×(n+1) 0 (n+1)×1
matrix transformation. The new system (4) has an observable
canonical form such that an ESO can be established for C = [C¯0 , 0]1×(n+1) , E = [0, · · · , 0, 1]T(n+1)×1 .
estimating f . The Luenberger observer has the following form:
" #
Remark 1. Assumption 2 is equivalent to the following x̂˙
    
x̂ x̂
˙ = A ˆ + Bu + L y − C (10)
conditions. The proof can be found in [18]. fˆ f fˆ

C0 E0 = 0, C0 A0 E0 = 0, . . . , C0 An−1
0 E0 ̸= 0 where x̂ and fˆ are estimations of x and f , L is the observer
gain. We have the following estimation error dynamics:
According to whether or not the system dynamics are
available, we have the following two variants of ESO: ė = (A − LC)e + E f˙ (11)
T
where e = x − x̂ f − fˆ .

A. MB-ESO
If the model information, i.e., −a0 , −a1 , · · · , −an−1 , b, in Theorem 1. Under Assumption 1 and Assumption 2, the
matrix Ā0 and B̄0 is available, we have eigenvalues A − LC can be placed at the left side of the
      plane to make the estimation converge [18], [17].
0 1 ... 0 0 0
 .. . . . . ..  0 0 All eigenvalues can be placed at −ωo , which is called the
ẋ =  .
 . . .   x +  ..  u +  ..  |{z}
   
d
 0 observer bandwidth of ESO [19].
... 0 1  . .
f
−a0 ... ... −an−1 b 1 III. L EARNING -E NABLED ESO
| {z
Ā0,M B
} |{z}
B̄0,M B
|{z}
Ē0
The model-based ESO in (5) and the model-free ESO in
(5) (7) can be further expanded as follows:
 
The total disturbance can be represented as: 0 1 ... 0
0
 
 . .. .. ..
 ..

f =d (6) . . . E¯0    
 .. 

 0 . . . 0 1  x̂  . 
 fˆ +  u =
 
where d is the external disturbance, b is the true control gain. 
−a0 . . . . . . −an−1

 b0 + b − b 0 
  | {z }
| {z }  B̄ 
0,M B
B. MF-ESO

 Ā0,M B 
0
If the model information, i.e., −a0 , −a1 , · · · , −an−1 , b, in 01×n 0 | {z }
BM B
matrix Ā0 and B̄0 , is not available, we have
| {z }
 AM B 

0 1 ... 0
  
0 0 1 ... 0
0
 
 .. . . . . . . .
. 0  .. .. . . .. ¯ 

. . . .. 
  
E0     . 
ẋ = .

 x +  ..  u+ 
0 . . .  x̂  .. 
0 . . . 0 1 . 0 1
 fˆ +  u+
 

0 . . . . . . 0
 b0 
0 0 ... 0 b0    |{z} 
B̄0,M F
| {z }   
| {z } | {z }
(7)  Ā0,M F 
  Ā0,M F B̄0,M F 0
0 01×n 0 | {z }
BM F
 .. 
| {z }
 .  (−a0 x1 − · · · − an−1 xn + (b − b0 )u + d)   AM F
| {z } Ē0
1 f (−a0 x1 · · · − an−1 xn + (b − b0 )u)
|{z} 0
Ē0 (12)
Remark 2. By incorporating model information, MF-ESO Algorithm 1 L-ESO
becomes equivalent to MB-ESO. Input: Control input u, system output y, learning rate α,
batch size n, maximum running time Nmax
Remark 3. The motivation for proposing the learning com- Output: Total disturbance fˆ
ponent can be justified in that the model information is learn- 1 Initialize:
able to facilitate the incorporation of model information. 2 machine learning input batch I 0 = ∅
Remark 4. The learning component is even possible to 3 disturbance estimation by ESO batch ∆F 0 = ∅
learn the external disturbance together with the internal 4 machine learning output batch FL 0 = ∅
disturbance to be incorporated. 5 machine learning model parameter θ
6 machine learning output fˆL0 = 0
Since the learning component has a feedforward esti-
7 for i = 1 to n do
mation fˆL for the total disturbance, ESO can serve as a
feedback correction to estimate the residual total disturbance 8 Get x̂i and ∆fˆi by running L-ESO ▷ see (13)
as ∆fˆ. The combination of the feedforward estimation and 9 Compute ui ▷ see (22)
the feedback correction is realized as follows: 10 I i := [I i−1 , [x̂i1 , x̂i2 , . . . , x̂in , ui , 1]T ]
" # 11 ∆F i := [∆F i−1 , ∆fˆi ]
x̂˙
      
x̂ x̂ Ē0 ˆ 12 FL i := [FL i−1 , 0] ▷ append data into three
˙ = A ∆fˆ + Bu + L y − C ∆fˆ + 0 fL
∆fˆ batches
(13) 13 fˆLi = 0
Since the learning component is expected to capture the 14 end
unknown dynamics, we employ a model-free ESO, see Fig. 15 for i = n to Nmax do
1. The learning block in Fig. 1 is a function hθ (x, u) 16 Get x̂i and ∆fˆi by running L-ESO ▷ see (13)
parameterized by θ. To learn the total disturbance (see (8)), 17 Update I i ▷ pop oldest datum, push new
we establish a mapping from the input (x̂ estimated by ESO datum
and control input u) to the output fˆ, where fˆ = fˆL + ∆fˆ. 18 Update ∆F i ▷ pop oldest datum, push new
The total disturbance estimation consists of two parts: 1) datum
the feedforward estimation from the learning component 19 Update θi ▷ According to (14)
fˆL = hθ (x̂, u); 2) feedback correction for the residual 20 FL i = hθi (I i )
disturbance ∆fˆ by an MF-ESO. To optimize the parameters 21 fˆLi = hθi (xi )
of the machine learning model, a general regression problem
22 fˆi = fˆLi + ∆fˆi ▷ compute total disturbance
is formulated using the following cost function:
23 Compute ui ▷ see (22)
n
1X 24 end
J(θ) = (hθ (x̂i , ui ) − fˆi )2 (14)
2 i=1

where n is the size of the training data. The details are in k. The system is subject to two external disturbance forces
Alg. 1. When the batch is not yet filled, we run the MF- w1 and w2 , which act on masses m1 and m2 , respectively.
ESO (see Line 7-14, the learning component does not return The control signal u is the force applied to mass m1 . Both
optimized parameters). the positions of mass m1 and mass m2 are measured, and
Our framework has superior modularity. The design of the either one can be used as an output to be controlled.
ESO is just a conventional model-free convention. We only The states of the two-mass-spring system are defined as the
need to use the estimation from ESO to drive the training of displacements and velocities of the two masses. Specifically,
our learning component. First, the learning component can the displacement and velocity of mass m1 are x1 and x3 ,
serve as an add-on to existing ESO-based control architecture respectively, while the displacement and velocity of mass
by just adding a few connections. Second, the learning m2 are x2 and x4 , respectively. The dynamics of the system
component is so flexible that users can customize it by can be represented in the following state-space form:
choosing appropriate machine learning models, e.g., linear,     
non-linear, parametric, non-parametric, etc. ẋ1 0 0 1 0 x1
ẋ2   0 0 0 1  x2 
 
 = k
IV. S IMULATION R ESULTS ẋ3  − m k
0 0  x3 
1 m1
k k
A. Two-Mass-Spring Problem Formulation ẋ4 − 0 0 x4
 m2 m2
 
Fig. 2 depicts a schematic of a two-mass-spring system, 0 0 (15)
which is from a well-known benchmark control problem  0   0 
+ 1  (u + w1 ) +  0  w2
  
[20]. The system includes two masses: m1 and m2 , which m1
1
can slide freely over a horizontal surface without friction. 0 m2
Note that it has been proved that a non-friction setting is   T
y = c1 c2 0 0 x1 x2 x3 x4
more challenging for a controller design [9]. The masses are
connected by a light horizontal spring with a spring constant A time-varying unknown external disturbance w2 is from
the above-mentioned model-free design, such a system tries
to leverage the prior knowledge of the dynamic model, by
assuming −k mm11+mm2 is known (see (16)). In this case, the
2

total disturbance becomes:


k 1
Fig. 2: Two-mass-spring system with uncertain parameters f= w2 + ẅ2 + (b − b0 )u (21)
m1 m2 m2
such that y (4) = −k mm11+m
m2 ÿ + f + b0 u
2

the mass m2 , control needs to be conducted on m1 to allow The observer gain is chosen where all eigenvalues of
x2 track any desired trajectory. For the output y, i.e., x2 , a AM B − LC are placed at −ωo [19]. Let a = −k mm11+mm2 ,
2

chained integrator system is derived by taking the derivatives the coefficients of LM B are listed in Table I.
of the output four times. The input and disturbance are in
the last channel of this fourth-order system with b = m1km2 : Parameters Values

m1 + m2 k 1 LM B,1 5ωo
y (4) = −k ÿ + w2 + ẅ2 + bu (16) LM B,2 a + 10ωo2
m1 m2 m1 m2 m2 LM B,3 5aωo + 10ωo3
B. ESO design LM B,4 a2 + 10aωo2 + 5ωo4
LM B,5 5a2 ωo + 10aωo3 + ωo5
The states in the system are:
 ...T TABLE I: coefficients of LM B
x = y ẏ ÿ y (17)
The state-space description of the system is
" # " # 3) L-ESO: As shown in (19), the internal disturbance has
 ẋ = A x + Bu + E f˙
 a linearly structured mapping between the input (state and
f˙ f (18) control) and the output (disturbance). Therefore, a linear
 regression model is a reasonable choice for the learning
y = Cx
  T
T
component, with hθ (·) = θ x̂1 x̂2 x̂3 x̂4 u 1 .
1) Model-free ESO: The state-space
 modelis:  Note that as we mentioned before, the learning model is
0 1 0 0 0 0 flexible to be linear, nonlinear, parametric, non-parametric,
0 0 1 0 0 0 etc. Our contribution is not about the complexity of the
   
AM F =  0 0 0 1 0, B =  0 , C = learning model but the novel design to seamlessly combine
  
0 0 0 0 1 b0  machine learning models with an ESO. A batch gradient
0 0 0 0 0 0 descent method is used for optimizing the cost function. In
   T
1 0 0 0 0 , E = 0 0 0 0 1 . As we can see, our experiments, we initialize θ with all zeros.
the model-free design assumes unknown dynamics, such that
the total disturbance f can be represented as: C. Controller Design

m1 + m2 k 1 The control law for the system (20) can be designed as:
f = −k ÿ + w2 + ẅ2 + (b − b0 )u (19)
m1 m2 m1 m2 m2 −fˆ + u0
u= (22)
where −k mm11+m
m2
2
is the model parameter information, b0 is b0
the nominal control gain. We have such that
y (4)
= f + b0 u (20) y (4) = u0 (23)
where everything besides b0 u is considered as total distur- It can be controlled by a state feedback controller
bance (see (16)). It can be validated that such a system
satisfies Assumptions 1, 2, and 3. Therefore, an ESO can u0 = −K x̂ = k1 (r − x̂1 ) − k2 x̂2 − k3 x̂3 − k4 x̂4 (24)
be designed for the estimation of f , see (10). with a control gain K = ωc4 4ωc3 6ωc2 4ωc , where ωc
 
The observer gain is chosen where all the eigenvalues is the close-loop natural frequency [19].
of AM F − LC are placed at −ωo [19], i.e., LM F =
[5ωo 10ωo2 10ωo3 5ωo4 ωo5 ]. D. Simulation Results
2) Model-based ESO: The model-based design has the The system parameters are taken from the benckmark
following state-space
 representation:    problem [20], i.e., m1 = m2 = 1 kg, k = 1 N/m,
0 1 0 0 0 0 c1 = 0, c2 = 1. Tracking a desired trajectory for the
0 0 1 0 0 0
position of mass m2 is the control objective. A sinusoidal

   
AM B =  0 0 0 1 0 , B =  0 , C = wave with a frequency of 1 rad/s and amplitude 1 is applied

0 0 −k m1 +m2 0 1

b0  in the training phase for L-ESO. After 110 seconds, a step
m1 m2
0 0 0 0 0 0 reference is given to all three approaches. A band-limited
T
white noise with noise power 10−12 is added at the system
  
1 0 0 0 0 , E = 0 0 0 0 1 . In contrast to
output side. A sinusoidal external disturbance with frequency has a linear component, a linear regression component
π/10 rad/s is applied on m2 as w2 starting at 150 s. The can still capture it, e.g., the trends of going up and
learning algorithm is running online. The learning phase down in a sinusoidal external disturbance.
is designed to emulate the typical operational scenarios of 3) Adding external disturbance information to the ob-
the machine under general conditions, whereas the step server can help reduce the required bandwidth. In our
response is employed to assess and compare the tracking experiments, we found that MF-ESO and MB-ESO
performance. All the control parameters are set identically will need three times more bandwidth to achieve the
for fair comparison. same performance as the L-ESO.
The controller bandwidth ωc and the observer bandwidth 4) The control input of the L- ESO has more fluctuations
ωo are set to 1 rad/s and 10 rad/s, respectively. The control compared with MF-ESO and MB-ESO, as shown in
gain is set to 1. All three approaches share such same settings Fig. 4. This is caused by the noise signal and the batch
for fair comparison. gradient descent method we choose to minimize the
cost function. It can be smoothened by increasing the
1.2
Position batch size in this example.

1
V. H ARDWARE E XPERIMENTS R ESULTS
1
0.8
0.95

0.9
We conduct physical experiments on our ECP Model 205
0.6
0.85 torsional testbed [21], see Fig.5. It is a mechanical system
126 127 128 129 130 131 132 133 134

0.4
1.05
that consists of a flexible vertical shaft connecting two disks
1 - a lower disk and an upper disk. Each disk is equipped with
0.2

0.95
an encoder for position measurement. A DC servo motor
170 175 180 185 190 195
0 ref
MB-ESO
drives the lower disk through a belt and pulley system, which
-0.2
MF-ESO
L-ESO provides a 3:1 speed reduction ratio. The system can be used
120 140 160 180
Time (s)
200 220 240 260
to study the vibration of a torsional two-mass-spring system.

Fig. 3: Tracking performance for MB-ESO, MF-ESO, L-ESO


plotting from 120s.

Control input
1
MB-ESO
0.5

-0.5
120 140 160 180 200 220 240 260

1
Torque (Nm)

MF-ESO
0.5

-0.5
120 140 160 180 200 220 240 260 Fig. 5: ECP Model 205 torsional testbed
1
L-ESO

0.5
A personal computer with MATLAB® Simulink Desktop
0

120 140 160 180 200 220 240 260


Real-Time™ installed is used for computation. The computer
Time (s) is also equipped with a four-channel quadrature encoder
input card (NI-PCI6601) and a multi-function analog and
Fig. 4: Control signal for MB-ESO, MF-ESO, L-ESO plot-
digital I/O card (NI-PCI6221). These cards interface with
ting from 120s.
the torsional plant Model 205 for real-time data acquisition
and control. The quadrature encoder input card enables the
The tracking performance and the control input are shown computer to receive position and velocity data from the
in Fig. 3 and Fig. 4, respectively. encoders on the disks of the plant. The multi-function analog
1) MB-ESO and L-ESO have similar performance for the and digital I/O card allows the computer to send control
step reference tracking (see the zoom-in plot from 126 signals to the DC servo motor that drives the lower disk.
s to 134 s, Fig. 3) after the training phase, see the
position plot of m2 in Fig. 3, which are better than A. System Model
MF-ESO in terms of overshoot percentage (0 vs. 5‰
) and settling time (12s vs. 16s). Since the MB-ESO, as a baseline approach, needs the
2) For external disturbance rejection (see the zoom-in plot dynamics information, we first use MATLAB® System iden-
from 170 s to 195 s, Fig. 3), L-ESO’s performance is tification toolbox and get the transfer function: G(s) =
4.6×104
the best. By re-visiting (8), if the external disturbance s4 +1.901s3 +1683s2 +1812s+0.1032 .
B. ESO and Controller Design VI. CONCLUSIONS
As this testbed is again a fourth-order dynamic system, A novel learning-enabled extended state observer L-ESO
the same ESO design pipeline shown before can be applied. with the capacity to memorize and generalize from past
C. Experiment Results estimated disturbances is proposed in this paper. The ma-
chine learning model is seamlessly integrated into existing
Tracking a desired trajectory for the upper disk is the disturbance rejection control architecture as a flexible add-
control objective. A sinusoidal wave with a frequency of on for boosting robustness performance against unknown and
π/2 rad/s and an amplitude 0.5π is applied in the training time-varying disturbances. Compared with existing learning
phase of L-ESO. ωc and ωo are set to 90 rad/s and 40 rad/s, for control framework, our new paradigm does not rely on
respectively. The control gain is 5.5 × 104 . A trapezoidal access to full states. In addition, the learning is guarded
profile reference with the final value π is used. by disturbance rejection that provides an extra assurance
layer to compensate for the imperfections of the machine
Position
3.5 learning model. The efficacy of the proposed approach has
3
been supported by simulation and hardware experiments. In
3.2
the future, we will further validate in real robotic testbeds.
2.5
Upper disk position (Rad)

2
3.15 R EFERENCES
1.5
3.1
[1] Y. Hori, H. Iseki, and K. Sugiura, “Basic consideration of vibration
1 suppression and disturbance rejection control of multi-inertia system
using SFLAC (state feedback and load acceleration control),” IEEE
0.5 1 1.2 1.4 1.6
Transactions on Industry Applications, vol. 30, no. 4, pp. 889–896,
ref
0 MB-ESO 1994.
MF-ESO
L-ESO
[2] S. Zhao and Z. Gao, “An active disturbance rejection based approach to
-0.5 vibration suppression in two-inertia systems,” Asian Journal of control,
0 0.5 1 1.5 2 2.5 3
Time (s) vol. 15, no. 2, pp. 350–362, 2013.
[3] Y. Wang, L. Dong, Z. Chen, M. Sun, and X. Long, “Integrated
Fig. 6: Upper disk position tracking: MB-ESO, MF-ESO, skyhook vibration reduction control with active disturbance rejection
and L-ESO decoupling for automotive semi-active suspension systems,” Nonlinear
Dynamics, pp. 1–16, 2024.
[4] J. Chen, Y. Hu, and Z. Gao, “On practical solutions of series elastic
actuator control in the context of active disturbance rejection,” Ad-
vanced Control for Applications: Engineering and Industrial Systems,
Control input
2 vol. 3, no. 2, p. e69, 2021.
MB-ESO
1 [5] Q. Zheng, Z. Ping, S. Soares, Y. Hu, and Z. Gao, “An active distur-
0 bance rejection control approach to fan control in servers,” in 2018
-1
IEEE Conference on Control Technology and Applications (CCTA).
0 0.5 1 1.5 2 2.5 3 IEEE, 2018, pp. 294–299.
2 [6] J. Han, “From PID to active disturbance rejection control,” IEEE
Torque (Nm)

MF-ESO
Transactions on Industrial Electronics, vol. 56, no. 3, pp. 900–906,
0
2009.
-2
[7] R. Cui, L. Chen, C. Yang, and M. Chen, “Extended state observer-
0 0.5 1 1.5 2 2.5 3
based integral sliding mode control for an underwater robot with un-
2
L-ESO
known disturbances and uncertain nonlinearities,” IEEE Transactions
0
on Industrial Electronics, vol. 64, no. 8, pp. 6785–6795, 2017.
[8] H. Zhang, Y. Li, Z. Li, C. Zhao, F. Gao, F. Xu, and P. Wang,
-2 “Extended-state-observer based model predictive control of a hybrid
0 0.5 1 1.5 2 2.5 3
Time (s) modular DC transformer,” IEEE Transactions on Industrial Electron-
ics, vol. 69, no. 2, pp. 1561–1572, 2021.
Fig. 7: Control signal for MB-ESO, MF-ESO, L-ESO [9] H. Zhang, S. Zhao, and Z. Gao, “An active disturbance rejection
control solution for the two-mass-spring benchmark problem,” in 2016
American Control Conference (ACC). IEEE, 2016, pp. 1566–1571.
From the results illustrated in Fig. 6 and Fig. 7, we [10] C. Fu and W. Tan, “Tuning of linear ADRC with known plant
information,” ISA transactions, vol. 65, pp. 384–393, 2016.
have the following observations: 1) L-ESO has the best [11] Y. Hui, R. Chi, B. Huang, and Z. Hou, “Extended state observer-
performance among all the methods after the training phase based data-driven iterative learning control for permanent magnet
in terms of overshoot percentage and settling time. The linear motor with initial shifts and disturbances,” IEEE Transactions
on Systems, Man, and Cybernetics: Systems, vol. 51, no. 3, pp. 1881–
reasons for L-ESO outperforming MB-ESO could be the 1891, 2021.
imperfection of system identification or that our approach [12] J. Wang, D. Huang, S. Fang, Y. Wang, and W. Xu, “Model predictive
can learn internal as well as external disturbance. 2) The control for ARC motors using extended state observer and iterative
fluctuation of control input of L-ESO is between MF-ESO learning methods,” IEEE Transactions on Energy Conversion, vol. 37,
no. 3, pp. 2217–2226, 2022.
and MB-ESO, as shown in Fig. 7, which is different from [13] J. Zhang and D. Meng, “Improving tracking accuracy for repetitive
the simulation result. This is because the learning rate is learning systems by high-order extended state observers,” IEEE Trans-
conservatively chosen due to the large noise in the hardware. actions on Neural Networks and Learning Systems, 2022.
[14] P. Kicki, K. Łakomy, and K. M. B. Lee, “Tuning of extended state
Also, the trapezoidal profile reference is more smooth than observer with neural network-based control performance assessment,”
the step reference, which is beneficial for learning. European Journal of Control, vol. 64, p. 100609, 2022.
[15] G. Shi, X. Shi, M. O’Connell, R. Yu, K. Azizzadenesheli, A. Anand-
kumar, Y. Yue, and S.-J. Chung, “Neural lander: Stable drone landing
control using learned dynamics,” in 2019 International Conference on
Robotics and Automation (ICRA), 2019, pp. 9784–9790.
[16] B. Guo and Z. Zhao, “On the convergence of an extended state
observer for nonlinear systems with uncertainty,” Systems & Control
Letters, vol. 60, no. 6, pp. 420–430, 2011.
[17] W. Bai, S. Chen, Y. Huang, B. Guo, and Z. Wu, “Observers and ob-
servability for uncertain nonlinear systems: A necessary and sufficient
condition,” International Journal of Robust and Nonlinear Control,
vol. 29, no. 10, pp. 2960–2977, 2019.
[18] J. Chen, Z. Gao, Y. Hu, and S. Shao, “A general model-based
extended state observer with built-in zero dynamics,” arXiv preprint
arXiv:2208.12314, 2023.
[19] Z. Gao, “Scaling and bandwidth-parameterization based controller
tuning,” in Proceedings of the 2003 American Control Conference,
2003. IEEE, 2003, pp. 4989–4996.
[20] B. Wie and D. S. Bernstein, “Benchmark problems for robust control
design,” Journal of Guidance, Control, and Dynamics, vol. 15, no. 5,
pp. 1057–1059, 1992.
[21] Open AI, “Safety gym,” http://www.ecpsystems.com/controls torplant.
htm [Accessed: 3-23-2024].

You might also like