
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 69, NO. 10, OCTOBER 2022 4177

Model Order Reduction for Delayed PEEC Models With Guaranteed Accuracy and Observed Stability

Lihong Feng, Luigi Lombardi, Peter Benner, Daniele Romano, and Giulio Antonini, Senior Member, IEEE

Abstract— In this work, a stable and accurate model order reduction method for large-scale partial element equivalent circuit (PEEC) models with many delays is proposed. The method reduces the dimension of the original model by interpolating the original transfer function. The interpolation points are iteratively selected using a greedy algorithm. An efficient error estimator for the reduced transfer function makes the greedy algorithm very successful both in properly selecting the interpolation points and in tightly measuring the error of the reduced transfer function. The resulting reduced-order model is accurate both in the frequency and in the time domain. Numerical tests have shown that the reduced-order model successfully filters the unstable behavior of the original model and exhibits stability over a large time interval.

Index Terms— Partial element equivalent circuit (PEEC) method, model order reduction, greedy algorithm, error estimation.

Manuscript received 28 February 2022; revised 12 June 2022; accepted 3 July 2022. Date of publication 14 July 2022; date of current version 29 September 2022. This article was recommended by Associate Editor P. A. Beerel. (Corresponding author: Giulio Antonini.)
Lihong Feng and Peter Benner are with the Max Planck Institute for Dynamics of Complex Technical Systems, 39106 Magdeburg, Germany (e-mail: feng@mpi-magdeburg.mpg.de; benner@mpi-magdeburg.mpg.de).
Luigi Lombardi is with Micron Semiconductor, 67051 Avezzano, Italy (e-mail: luigilombardi89@gmail.com).
Daniele Romano and Giulio Antonini are with the UAq EMC Laboratory, Department of Industrial and Information Engineering and Economics, University of L’Aquila, 67100 L’Aquila, Italy (e-mail: daniele.romano@univaq.it; giulio.antonini@univaq.it).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCSI.2022.3189389.
Digital Object Identifier 10.1109/TCSI.2022.3189389
1549-8328 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

I. INTRODUCTION

THE design of high-speed circuits requires the time-delay phenomena due to the propagation delays to be taken into account. Such delays arise in the modeling and design of interconnects in circuit packaging and printed circuit boards (PCBs). Large-scale time-delay systems with many delays are very time consuming to simulate in both the frequency and the time domain. Many model order reduction (MOR) techniques have been proposed and successfully applied to reduce the complexity and dimension of large-scale systems [1]–[4]. In recent years, considerable effort has been devoted to speeding up the simulation of time-delay systems [5]–[7], including PEEC models [8], via different MOR methods [5], [9], [10]. All those methods tried to apply the conventional moment-matching methods [11] by approximating the original non-rational transfer function with a rational function using a truncated power series expansion. The resulting moment-matching property of the reduced transfer function is only satisfied for the approximate transfer function. In [12], a physics-based MOR method for PEEC circuits is proposed. The method presented in [13] is a direct mesh-based MOR method. The works in [12] and [13] are both limited to quasi-static problems.

In [14], a parameterized MOR technique for circuits described by the delayed PEEC method is presented. An implicit multiparameter moment matching algorithm is used for the moments related to the design parameters. The parameterized reduced-order model (ROM) is constructed only at a fixed frequency, so many ROMs must be constructed for many different frequency samples. The inverse Laplace transform is applied to the frequency domain solution to obtain the time-domain response, which requires a second-stage approximation: truncating a Padé approximation.

In a recent work [15], the original transfer function is not approximated before MOR. A reduced transfer function which interpolates the original transfer function at iteratively chosen interpolation points is obtained using greedy algorithms. A reduced-order model (ROM) is derived by applying Petrov-Galerkin projection to the original PEEC model, which is referred to from here on as the full-order model (FOM). The interpolation property of the reduced transfer function is also proved in [15], but essentially follows from the structure-preserving interpolatory model reduction framework introduced in [16]. At each iteration of the greedy algorithm, the frequency at which the approximate $L_\infty$-error between the original transfer function and the reduced transfer function is attained is selected as the interpolation point. It is then used to compute new basis vectors that are combined with the previous basis vectors to update the ROM. Novel algorithms are proposed in [15] to numerically compute the $L_\infty$-error. The method nevertheless has some weak points. The ROMs for most of the tested models are usually inaccurate and unstable in the time domain, although the corresponding reduced transfer functions approximate the original transfer functions well in the frequency domain. Moreover, the $L_\infty$ error measure used to stop the algorithm has no direct relation to the magnitude error of the reduced transfer function, especially for multiple-input multiple-output systems, and it tends to overestimate the true error of the reduced transfer function.

The method proposed in this work overcomes the above difficulties of the existing works. First, as in [15], it interpolates the original transfer function $H(s)$ itself rather than an approximation or truncation of $H(s)$. Second, it employs a new error estimator proposed in [17] to measure the magnitude error of the reduced transfer function. The new error estimator tightly catches the


true error of the reduced transfer function. Third, the ROMs are not only accurate but also stable in the time domain. In contrast to the $L_\infty$-error, the new error estimator is a more intuitive and straightforward error measure. Due to the tightness of the error estimator, the greedy algorithm converges in only a few iterations by selecting a few interpolation points in a very wide frequency band. The ROM accurately catches the behavior of the FOM in both the frequency domain and the time domain. Finally, we construct a single ROM for the PEEC model, which is accurate for all the frequencies in the frequency band of interest. Our time domain response is obtained by directly simulating the ROM in the time domain without a second stage of approximation [14].

Compared to our proceedings paper [18], this work contributes most of the numerical analysis which is missing in [18]. The new contributions are as follows. The adaptation of an error estimator from [17] to time-delayed PEEC models is explored, and its numerical computation is detailed. Instability of the ROMs obtained by the method in [15] is observed and empirically remedied. A numerical comparison with the method in [15] in both the frequency domain and the time domain is provided. The proposed algorithm is tested on two more PEEC models with up to 191 delays and some 16,000 degrees of freedom.

The paper is organized as follows. In Section II, we describe the PEEC method and the time-delayed PEEC model under consideration. In Section III, the proposed method is discussed in detail. We first introduce a greedy algorithm for delayed PEEC models. Then the instability of the method in [15] is analyzed and remedied. A new error estimator from [17] is extended to time-delayed PEEC models, based on which a new greedy algorithm is proposed. More details on computing the error estimator for MOR of the delayed PEEC models are addressed. Section IV presents numerical tests of the proposed greedy algorithm on three large-scale delayed PEEC models with hundreds of delays. The behavior of the ROMs in both the frequency domain and the time domain is compared with that of the ROMs obtained from the algorithms in [15]. Conclusions are given in Section V.

II. DELAYED PEEC MODELS

The PEEC method is an integral-equation-based method which solves Maxwell’s equations using the electric field integral equation and the continuity equation [8], [19]–[21]. Several different formulations have been proposed through the years for the PEEC method. Each formulation is characterized by a different set of variables which may confer particular numerical properties to the model. Since we need to keep the propagation delays explicit for both partial inductances and potential coefficients, the choice which best fits our needs is the one proposed in [22].

A. Brief Introduction to the PEEC Method

In this subsection, we briefly present the PEEC method in [22], based on which we derive our time-delayed PEEC models in the next subsection. The modified nodal analysis (MNA) form of the adopted formulation is:

$$
E \frac{d x(t)}{dt} = A x(t) + B u(t).
$$

The unknown vector is:

$$
x(t) = \begin{bmatrix} i(t) & \phi_{sr}(t) & \phi_i(t) & v_d(t) & q_s(t) \end{bmatrix}^T \in \mathbb{R}^{n_u \times 1},
$$

where $i(t)$ represents the branch currents, $\phi_{sr}(t)$ the scalar potentials for surface nodes, $\phi_i(t)$ the scalar potentials for internal nodes, $v_d(t)$ the excess capacitance voltages for dielectric branches and $q_s(t)$ the surface charges [8]. The state space matrices $E$, $A$ and $B$ are:

$$
E = \begin{bmatrix}
L_p^{(QS)} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & M^T \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & C_d & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix},
\quad
A = \begin{bmatrix}
-R & -A_s & -A_i & -\Gamma & 0 \\
A_s^T & -G_{le} & 0 & 0 & 0 \\
A_i^T & 0 & 0 & 0 & 0 \\
\Gamma^T & 0 & 0 & 0 & 0 \\
0 & M & 0 & 0 & -P^{(QS)}
\end{bmatrix},
\quad
B = \begin{bmatrix}
I_{n_b \times n_b} & 0 \\
0 & I_{n_{ns} \times n_{ns}} \\
0 & 0 \\
0 & 0 \\
0 & 0
\end{bmatrix},
$$

where the block rows of $E$, $A$ and $B$ and the block columns of $E$ and $A$ have sizes $n_b$, $n_{ns}$, $n_{ni}$, $n_{bd}$ and $n_p$, the two block columns of $B$ have sizes $n_b$ and $n_{ns}$, and each zero block has the size implied by its position. Here $n_b$, $n_{ns}$, $n_{ni}$, $n_{bd}$ and $n_p$ represent the number of branches, surface nodes, internal nodes, dielectric cells and surface cells, respectively. Furthermore, $L_p^{(QS)}$ and $P^{(QS)}$ are the quasi-static partial inductance and coefficients of potential matrices where propagation delays have been neglected, $C_d$ is the excess capacitance matrix, $R$ is the branch resistance matrix, $A_s$ is the incidence matrix for the surface nodes, $A_i$ is the incidence matrix for the internal nodes, $\Gamma$ is the dielectric region selection matrix, $M$ is the surface-to-node reduction matrix and $G_{le}$ is the load conductance matrix (assuming for simplicity of notation that only resistive lumped elements are connected to the PEEC model). The source vector $u(t)$ is:

$$
u(t) = \begin{bmatrix} v_s(t) \\ i_s(t) \end{bmatrix},
$$

where $v_s(t)$ and $i_s(t)$ are the voltage and current sources which are applied to branches and nodes, respectively. In this formulation, the surface charges $q_s$ are kept as unknowns, avoiding the inversion of the potential coefficients matrix $P^{(QS)}$. It is also worth observing that the proposed PEEC formulation, based on separate charge and current basis functions, demonstrates favorable low-frequency behavior, as confirmed in [23], [24].
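The block structure of the quasi-static MNA matrices can be made concrete with a small sketch. The code below assembles $E$, $A$ and $B$ with NumPy for tiny block sizes. The function name `assemble_mna`, the random and identity sub-blocks, the dense storage, and the name `Gamma` for the dielectric region selection matrix are all assumptions made for illustration; they are not PEEC partial elements and not the authors' implementation.

```python
import numpy as np

def assemble_mna(Lp, P, Cd, R, As, Ai, Gamma, M, Gle):
    """Assemble E, A, B of E dx/dt = A x + B u from the PEEC sub-blocks."""
    nb, nns = As.shape        # number of branches, surface nodes
    nni = Ai.shape[1]         # number of internal nodes
    nbd = Gamma.shape[1]      # number of dielectric cells
    npc = P.shape[0]          # number of surface cells

    def Z(rows, cols):
        return np.zeros((rows, cols))

    E = np.block([
        [Lp,         Z(nb, nns),  Z(nb, nni),  Z(nb, nbd),  Z(nb, npc)],
        [Z(nns, nb), Z(nns, nns), Z(nns, nni), Z(nns, nbd), M.T],
        [Z(nni, nb), Z(nni, nns), Z(nni, nni), Z(nni, nbd), Z(nni, npc)],
        [Z(nbd, nb), Z(nbd, nns), Z(nbd, nni), Cd,          Z(nbd, npc)],
        [Z(npc, nb), Z(npc, nns), Z(npc, nni), Z(npc, nbd), Z(npc, npc)],
    ])
    A = np.block([
        [-R,         -As,         -Ai,         -Gamma,      Z(nb, npc)],
        [As.T,       -Gle,        Z(nns, nni), Z(nns, nbd), Z(nns, npc)],
        [Ai.T,       Z(nni, nns), Z(nni, nni), Z(nni, nbd), Z(nni, npc)],
        [Gamma.T,    Z(nbd, nns), Z(nbd, nni), Z(nbd, nbd), Z(nbd, npc)],
        [Z(npc, nb), M,           Z(npc, nni), Z(npc, nbd), -P],
    ])
    # Voltage sources drive the branch equations, current sources the surface nodes.
    B = np.vstack([
        np.hstack([np.eye(nb), Z(nb, nns)]),
        np.hstack([Z(nns, nb), np.eye(nns)]),
        Z(nni + nbd + npc, nb + nns),
    ])
    return E, A, B

# Made-up block sizes and entries, only to exercise the assembly.
rng = np.random.default_rng(0)
nb, nns, nni, nbd, npc = 4, 2, 1, 1, 3
As = rng.standard_normal((nb, nns))
Ai = rng.standard_normal((nb, nni))
Gamma = rng.standard_normal((nb, nbd))
M = rng.standard_normal((npc, nns))
E, A, B = assemble_mna(np.eye(nb), np.eye(npc), np.eye(nbd), 0.1 * np.eye(nb),
                       As, Ai, Gamma, M, 0.02 * np.eye(nns))
print(E.shape, A.shape, B.shape)
```

In a realistic setting the blocks are large and mostly sparse, so a sparse block assembly would be the more natural choice; dense arrays are used here only to keep the sketch short.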

FENG et al.: MOR FOR DELAYED PEEC MODELS WITH GUARANTEED ACCURACY AND OBSERVED STABILITY 4179

B. Time-Delay Formulation of the PEEC Model

When propagation delays are considered and modeled as center-to-center delays between the spatial supports of the basis functions, the partial inductances $L_p$ and coefficients of potential $P$ are expanded in the time domain as

$$
L_p(t) = \sum_{j=0}^{d} L_{p_j}\, \delta(t - \tau_j), \qquad
P(t) = \sum_{j=0}^{d} P_j\, \delta(t - \tau_j), \tag{1}
$$

where $\delta(t)$ denotes the Dirac delta function and $0 = \tau_0 < \tau_1 < \ldots < \tau_d$ represent a set of significant delays. The delay $\tau_0 = 0$ occurs in the self (diagonal) terms and for basis functions whose center-to-center delay is smaller than the time step; it accounts for instantaneous magnetic and electric field interactions.

The expressions of $L_p(t)$ and $P(t)$ in (1) imply that the matrices $E$ and $A$ are decomposed in the Laplace domain as

$$
E(s) = \sum_{j=0}^{d} E_j e^{-s\tau_j}, \qquad
A(s) = \sum_{j=0}^{d} A_j e^{-s\tau_j}. \tag{2}
$$

When the matrices $E(s)$ and $A(s)$ in (2) are transformed back to the time domain and applied to the unknown vector $x(t)$, the following time-delay formulation of the PEEC model is obtained:

$$
\sum_{j=0}^{d} E_j \dot{x}(t - \tau_j) = \sum_{j=0}^{d} A_j x(t - \tau_j) + B u(t), \qquad
y(t) = C x(t), \quad \forall\, t \geq 0, \tag{3}
$$

with an initial condition $x(t) = \Phi(t) \in \mathbb{C}^n$, $\forall\, t \in [-\tau_d, 0]$. Here, $E_0, \ldots, E_d, A_0, \ldots, A_d \in \mathbb{C}^{n \times n}$, $B \in \mathbb{C}^{n \times m}$, $C \in \mathbb{C}^{p \times n}$, $0 = \tau_0 < \tau_1 < \ldots < \tau_d$ and $n$ is called the order of the delay system.

Equations (3) are a set of delayed differential equations of the neutral type (NDDE). They can be solved by the Lobatto III-C method, which provides a numerical solution that is accurate to order 2 for the differential equation scheme, as shown in [25]. Accurate time domain simulations for delayed PEEC circuits have also been obtained using the backward Euler formulas [22], [26], [27]. It should be pointed out that the PEEC method is able to reproduce the DC solution better than other integral equation-based techniques, like the Method of Moments (MoM) [28], by virtue of the fact that it keeps the charge and current effects separate, making it more robust than the MoM with respect to the low-frequency breakdown problem [24].

The transfer function of the delay system is defined as:

$$
H(s) = C \mathcal{K}^{-1}(s) B, \tag{4}
$$

where $\mathcal{K}(s) = s \sum_{j=0}^{d} E_j e^{-s\tau_j} - \sum_{j=0}^{d} A_j e^{-s\tau_j}$.

C. MOR for the Delayed PEEC Model

A ROM of the delayed PEEC model, which has the same delays as the original system, can be obtained via Petrov-Galerkin projection using two projection matrices $W$, $V$, i.e.,

$$
\sum_{j=0}^{d} \hat{E}_j \dot{z}(t - \tau_j) = \sum_{j=0}^{d} \hat{A}_j z(t - \tau_j) + \hat{B} u(t), \qquad
\hat{y}(t) = \hat{C} z(t), \quad \forall\, t \geq 0, \tag{5}
$$

where $\hat{E}_j = W^T E_j V \in \mathbb{R}^{r \times r}$, $\hat{A}_j = W^T A_j V \in \mathbb{R}^{r \times r}$, $\hat{B} = W^T B \in \mathbb{R}^{r \times m}$, $\hat{C} = C V \in \mathbb{R}^{p \times r}$, with $r \ll n$ being the order of the ROM. The original state vector $x(t)$ in (3) can be recovered by the approximation $x(t) \approx V z(t)$. The transfer function of the ROM is

$$
\hat{H}(s) = \hat{C} \hat{\mathcal{K}}^{-1}(s) \hat{B},
$$

where $\hat{\mathcal{K}}(s) = s \sum_{j=0}^{d} \hat{E}_j e^{-s\tau_j} - \sum_{j=0}^{d} \hat{A}_j e^{-s\tau_j}$. In [15], the following interpolation property for delay systems is proved.

Proposition 1: Let a delay system be given by $E_0, \ldots, E_d$, $A_0, \ldots, A_d$, $\tau_1, \ldots, \tau_d$, $B$, $C$ and let $s_0 \in \mathbb{C}$ be a fixed expansion point. Define the sequence of matrices $(\mathcal{K}_k)_{k=0}^{\infty} \subset \mathbb{C}^{n \times n}$ by

$$
\mathcal{K}_0(s_0) = s_0 \sum_{j=0}^{d} e^{-s_0 \tau_j} E_j - \sum_{j=0}^{d} e^{-s_0 \tau_j} A_j,
$$

$$
\mathcal{K}_k(s_0) = \sum_{j=0}^{d} \frac{(-\tau_j)^{k-1}}{(k-1)!} e^{-s_0 \tau_j} E_j
+ s_0 \sum_{j=0}^{d} \frac{(-\tau_j)^{k}}{k!} e^{-s_0 \tau_j} E_j
- \sum_{j=0}^{d} \frac{(-\tau_j)^{k}}{k!} e^{-s_0 \tau_j} A_j \quad \forall\, k \geq 1,
$$

which satisfies $\mathcal{K}(s) = \sum_{k=0}^{\infty} \mathcal{K}_k \sigma^k$, with $s = s_0 + \sigma$. Here $\mathcal{K}(s)$ is defined in (4). Use these to define two other sequences of matrices $(F_k)_{k=0}^{\infty} \subset \mathbb{C}^{n \times m}$ and $(G_k)_{k=0}^{\infty} \subset \mathbb{C}^{n \times p}$ recursively as

$$
F_0(s_0) = \mathcal{K}_0(s_0)^{-1} B, \qquad
F_k(s_0) = -k!\, \mathcal{K}_0(s_0)^{-1} \sum_{i=0}^{k-1} \mathcal{K}_{k-i} F_i,
$$

$$
G_0(s_0) = (\mathcal{K}_0(s_0))^{-T} C^T, \qquad
G_k(s_0) = -k!\, (\mathcal{K}_0(s_0))^{-T} \sum_{i=0}^{k-1} \mathcal{K}_{k-i}(s_0)^T G_i.
$$

Let $V, W \in \mathbb{C}^{n \times r}$ be projection matrices and $\hat{H}$ be the transfer function of the corresponding ROM.
• If $F_0(s_0), \ldots, F_l(s_0) \in \mathrm{Range}(V)$, then $H^{(i)}(s_0) = \hat{H}^{(i)}(s_0)$ $\forall\, i = 0, \ldots, l$.
• If $G_0(s_0), \ldots, G_k(s_0) \in \mathrm{Range}(W)$, then $c^T H^{(i)}(s_0) = c^T \hat{H}^{(i)}(s_0)$ $\forall\, i = 0, \ldots, k$.
• If $F_0(s_0), \ldots, F_l(s_0) \in \mathrm{Range}(V)$ and $G_0(s_0), \ldots, G_k(s_0) \in \mathrm{Range}(W)$, then $H^{(i)}(s_0) = \hat{H}^{(i)}(s_0)$ $\forall\, i = 0, \ldots, l + k + 1$.
Here, $(\cdot)^{(i)}$ denotes the $i$-th derivative of a function.
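The zeroth-order case of the interpolation property can be checked numerically on a toy problem. The sketch below evaluates the delayed transfer function $H(s) = C\mathcal{K}^{-1}(s)B$, builds a real basis $V$ whose complex span contains $F_0(s_0) = \mathcal{K}_0(s_0)^{-1}B$, applies Galerkin projection ($W = V$), and compares $H(s_0)$ with $\hat{H}(s_0)$. The random system, its size, the delays, the helper names and the splitting of $F_0$ into real and imaginary parts are assumptions for illustration only; it is not a PEEC model.

```python
import numpy as np

def K(s, Es, As, taus):
    """K(s) = s * sum_j E_j exp(-s tau_j) - sum_j A_j exp(-s tau_j)."""
    return sum((s * E - A) * np.exp(-s * tau) for E, A, tau in zip(Es, As, taus))

def transfer(s, Es, As, taus, B, C):
    """H(s) = C K(s)^{-1} B."""
    return C @ np.linalg.solve(K(s, Es, As, taus), B)

def project(Es, As, B, C, V, W=None):
    """Petrov-Galerkin projection; W defaults to V (Galerkin projection)."""
    W = V if W is None else W
    return ([W.T @ E @ V for E in Es], [W.T @ A @ V for A in As], W.T @ B, C @ V)

# Hypothetical toy delay system: n = 40, two nonzero delays, one input, one output.
rng = np.random.default_rng(1)
n = 40
taus = [0.0, 0.3, 0.7]

def small(scale):
    """Random matrix rescaled to a prescribed spectral norm."""
    R = rng.standard_normal((n, n))
    return scale * R / np.linalg.norm(R, 2)

Es = [np.eye(n), small(0.01), small(0.01)]
As = [-np.diag(np.linspace(1.0, 5.0, n)), small(0.2), small(0.2)]
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

s0 = 2.0j                                            # interpolation point
F0 = np.linalg.solve(K(s0, Es, As, taus), B)         # F_0(s0) = K_0(s0)^{-1} B
V, _ = np.linalg.qr(np.hstack([F0.real, F0.imag]))   # real basis, F0 lies in its complex span

Er, Ar, Br, Cr = project(Es, As, B, C, V)
H0 = transfer(s0, Es, As, taus, B, C)
H0_rom = transfer(s0, Er, Ar, taus, Br, Cr)          # the ROM keeps the same delays
print(abs(H0 - H0_rom).max())                        # expected to be at round-off level
```

Splitting $F_0$ into real and imaginary parts keeps the reduced matrices real, at the price of two basis columns per interpolation point and input.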


Note that for $l = k = 1$, this is a particular instance of the general interpolation result from [16].

Proposition 1 shows that the reduced transfer function $\hat{H}(s)$ and its $i$-th ($i = 1, \ldots, N$) order derivatives interpolate the original transfer function $H(s)$ and its corresponding derivatives at the interpolation point $s_0$, respectively. In this work, we use this proposition to construct a projection matrix $V$, so that a ROM is computed via Galerkin projection using $W = V$. It can be seen that when $V$ is constructed following the above proposition, the reduced transfer function interpolates the original transfer function itself, rather than any truncated approximation, up to the $l$-th derivative.

From the expressions of $F_i$ and $G_i$, $i \geq 1$, we can see that they are much more complicated to compute than $F_0$ and $G_0$, especially for systems with many delays, e.g., $d \geq 100$. Therefore it is desirable to use only $F_0$ and $G_0$ to compute the projection matrices $V$ and $W$, respectively, and to obtain the ROM from $W$, $V$ following (5). However, if only a single interpolation point $s_0$ is used, then the resulting ROM is inaccurate for problems with a wide frequency band. Multiple interpolation points must be considered. A proposition is given in [15] to show the multiple-point interpolation property of the ROM. For completeness, we repeat it below.

Proposition 2:
• If $F_0(s_i) \in \mathrm{Range}(V)$, $i = 1, \ldots, l$, then $H(s_i) = \hat{H}(s_i)$, $i = 1, \ldots, l$.
• If $G_0(s_i) \in \mathrm{Range}(W)$, $i = 1, \ldots, l$, then $H(s_i) = \hat{H}(s_i)$, $i = 1, \ldots, l$.
• If $F_0(s_i) \in \mathrm{Range}(V)$ and $G_0(s_i) \in \mathrm{Range}(W)$, $i = 1, \ldots, l$, then $H(s_i) = \hat{H}(s_i)$ and $H'(s_i) = \hat{H}'(s_i)$, $i = 1, \ldots, l$.

The proposition says that if only $F_0(s_i)$, $i = 1, \ldots, l$, are included in $\mathrm{Range}(V)$, then the reduced transfer function $\hat{H}$ interpolates the original transfer function at the interpolation points $s_i$, $i = 1, \ldots, l$. However, different interpolation points result in ROMs with quite different accuracy. Therefore, how to properly select the interpolation points so as to obtain a ROM with acceptable accuracy and dimension is the key to MOR for delayed PEEC models.

III. THE PROPOSED INTERPOLATION METHOD

We propose to use a greedy procedure presented in the following Algorithm 1, shown in Fig. 1, to iteratively select the interpolation points. The robustness of the procedure mainly depends on the efficiency of the error measure $\Delta^*(s)$ in Step 7 of the algorithm.

Fig. 1. Greedy interpolation for delayed PEEC models.

A. A Greedy Algorithm and the Error Measure in [15]

The greedy procedure of Algorithm 1, shown in Fig. 1, is initially introduced in [15], where $\Delta^*(s_{k+1}) =: \Delta^*_0$ is defined as

$$
\Delta^*_0 = \max_{\omega \in [\underline{\omega}, \overline{\omega}]} \| H(j\omega) - \hat{H}(j\omega) \|.
$$

Here, $\|\cdot\|$ is the spectral norm of a matrix and $[\underline{\omega}, \overline{\omega}]$ is the frequency band of interest. Recall the definition of the $L_\infty$-norm of any matrix function $f(s) \in \mathbb{C}^{p \times m}$: $\| f \|_{L_\infty} = \sup_{\omega \in \mathbb{R}} \| f(j\omega) \|$, where $s = j\omega$, $j$ is the imaginary unit, and $\omega$ is the angular frequency. Therefore, $\Delta^*_0$ might be seen as an estimate of $\| H - \hat{H} \|_{L_\infty}$ which takes the supremum over a finite interval of $s$. A heuristic algorithm, H-greedy, is proposed in [15], where $\Delta^*_0$ is computed by taking some samples of $\omega$ from $[\underline{\omega}, \overline{\omega}]$ and taking $\Delta^*_0$ as the maximal value of $\| H(j\omega) - \hat{H}(j\omega) \|$ over those samples. It is clear that this heuristic method is unreliable and $\Delta^*_0$ may have different values when different samples, or a different number of samples, are taken. Another more advanced algorithm, SSI-greedy, is also proposed in [15]. However, SSI-greedy has high computational complexity, as it contains too many nested iteration loops in which not only the FOM but also the ROMs need to be simulated many times; these simulations constitute the main computational cost of SSI-greedy. Numerical tests in [15] also show that SSI-greedy is much slower than H-greedy for large models, though it is more reliable.

It is worth pointing out that since $\Delta^*_0$ corresponds to the maximum of the spectral norm of $H(j\omega) - \hat{H}(j\omega)$ over a finite interval of $s$, this error measure does not directly reflect the actual magnitude of the reduced transfer function error, $\epsilon(s) := \max_{i,j} |H_{ij}(s) - \hat{H}_{ij}(s)|$, $i = 1, \ldots, p$, $j = 1, \ldots, m$. The subscript $ij$ means the $i,j$-th entry of the transfer function. It is shown in [15] that measuring the error of the reduced transfer function using the $L_\infty$-norm tends to overestimate $\epsilon(s)$. This is also seen in Tables I-II, Fig. 10, Fig. 14, Fig. 12 and Fig. 16 in Section IV, where the $L_\infty$-errors (Valid.er) are still much larger than the given error tolerance, but from the figures, the errors $\epsilon(s)$ of those ROMs are already very small.
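To make the sampled greedy selection more tangible, here is a minimal sketch in the spirit of the sample-based heuristic with Galerkin projection ($W = V$). It is not the algorithm of Fig. 1 or of [15] verbatim: the error measure is simply the largest spectral-norm error over a fixed grid of frequency samples, the FOM transfer function is precomputed at all samples (which is exactly what is expensive for a real FOM), and the toy system, grid, tolerance and function names are assumptions.

```python
import numpy as np

def K(s, Es, As, taus):
    """K(s) = s * sum_j E_j exp(-s tau_j) - sum_j A_j exp(-s tau_j)."""
    return sum((s * E - A) * np.exp(-s * tau) for E, A, tau in zip(Es, As, taus))

def transfer(s, Es, As, taus, B, C):
    """H(s) = C K(s)^{-1} B."""
    return C @ np.linalg.solve(K(s, Es, As, taus), B)

def greedy_rom(Es, As, taus, B, C, omegas, tol=1e-6, max_iter=20):
    """Sample-based greedy interpolation with Galerkin projection (W = V)."""
    H_fom = [transfer(1j * w, Es, As, taus, B, C) for w in omegas]  # costly for a real FOM
    w_next, V, history = omegas[0], np.zeros((B.shape[0], 0)), []
    for _ in range(max_iter):
        # Enrich the basis with K(s)^{-1} B at the selected interpolation point.
        F0 = np.linalg.solve(K(1j * w_next, Es, As, taus), B)
        V, _ = np.linalg.qr(np.hstack([V, F0.real, F0.imag]))
        Er = [V.T @ E @ V for E in Es]
        Ar = [V.T @ A @ V for A in As]
        Br, Cr = V.T @ B, C @ V
        # Sampled error measure: largest spectral-norm error over the grid.
        errs = [np.linalg.norm(Hf - transfer(1j * w, Er, Ar, taus, Br, Cr), 2)
                for Hf, w in zip(H_fom, omegas)]
        k = int(np.argmax(errs))
        history.append(errs[k])
        if errs[k] < tol:
            break
        w_next = omegas[k]       # next interpolation point: worst sampled frequency
    return V, history

# Hypothetical toy delay system (not a PEEC model).
rng = np.random.default_rng(2)
n = 30
taus = [0.0, 0.3, 0.7]

def small(scale):
    R = rng.standard_normal((n, n))
    return scale * R / np.linalg.norm(R, 2)

Es = [np.eye(n), small(0.01), small(0.01)]
As = [-np.diag(np.linspace(1.0, 5.0, n)), small(0.2), small(0.2)]
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

omegas = np.linspace(0.1, 10.0, 60)
V, history = greedy_rom(Es, As, taus, B, C, omegas)
print(V.shape[1], history[-1])
```

The result depends on the sample grid, which mirrors the reliability concern raised in the text: a coarser grid can miss the frequency where the error actually peaks.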


B. Stable Versions of the Algorithms in [15]

We find that H-greedy and SSI-greedy in [15] produce ROMs that are unstable in the time domain. This is probably caused by the Petrov-Galerkin projection. In this work, we propose to replace the Petrov-Galerkin projection in H-greedy and SSI-greedy with Galerkin projection by setting $W = V$, so that only $V$ is computed during the greedy iterations. Finally, the ROMs are obtained by applying Galerkin projection to the FOM using $V$. We name the Galerkin versions of H-greedy and SSI-greedy H-greedy-G and SSI-greedy-G, respectively. It will be demonstrated in Section IV that H-greedy-G and SSI-greedy-G do construct ROMs that are stable in the time domain, while the ROMs obtained from H-greedy and SSI-greedy are not stable.

In the next section, we propose a new error estimator leading to a fast-to-compute $\Delta^*(s)$ in Step 7 of Algorithm 1 reported in Fig. 1. Moreover, the error estimator directly measures the magnitude of the reduced transfer function error, i.e., $\epsilon(s)$. Therefore, it is a more intuitive error measure. It is demonstrated in Section IV that $\epsilon(s)$ can be tightly estimated by the new error estimator. The proposed algorithm based on the new error estimator also uses Galerkin projection to avoid instability of the ROMs in the time domain.

C. The New Error Estimator

In the following, we introduce an error estimator for the reduced transfer function $\hat{H}(s)$ of the ROM for the delayed PEEC model in (3). The error estimator is initially proposed in [17] and applied to MNA-formulated circuit examples without delay. Since the error estimator is applicable to any linear system, we consider adapting it to the delayed PEEC models in this work. A detailed description of the error estimator for the delay system in (3) is as follows. Define a primal system in the frequency domain as

$$
\mathcal{K}(s) X_{pr}(s) = B. \tag{6}
$$

The reduced primal system is defined as

$$
\hat{\mathcal{K}}(s) Z_{pr}(s) = \hat{B}, \tag{7}
$$

so that $\hat{X}_{pr}(s) := V Z_{pr}(s)$ well approximates the solution $X_{pr}(s)$. The primal residual is defined as

$$
R_{pr}(s) = B - \mathcal{K}(s) \hat{X}_{pr}(s).
$$

Following the definition of $\hat{H}(s)$ and the reduced primal system (7), we obtain the error between $H(s)$ and $\hat{H}(s)$:

$$
\begin{aligned}
\| H(s) - \hat{H}(s) \|
&= \| C(s) \big( \mathcal{K}^{-1}(s) B(s) - V \hat{\mathcal{K}}^{-1}(s) \hat{B}(s) \big) \| \\
&= \| C(s) \mathcal{K}^{-1}(s) \big( B(s) - \mathcal{K}(s) \underbrace{V \hat{\mathcal{K}}^{-1}(s) \hat{B}(s)}_{\hat{X}_{pr}(s) := V Z_{pr}(s)} \big) \| \\
&= \| C(s) \mathcal{K}^{-1}(s) R_{pr}(s) \|.
\end{aligned} \tag{8}
$$

Define the primal-residual system as below,

$$
\mathcal{K}(s) X_r(s) = R_{pr}(s). \tag{9}
$$

Inserting $X_r(s) = \mathcal{K}^{-1}(s) R_{pr}(s)$ into the last equality in (8), we get

$$
\| H(s) - \hat{H}(s) \| = \| C(s) X_r(s) \|. \tag{10}
$$

In [17], instead of solving the system (9) of the FOM dimension, a ROM of (9) is first constructed by computing another projection matrix $V_r$, i.e.,

$$
V_r^T \mathcal{K}(s) V_r Z_r(s) = V_r^T R_{pr}(s). \tag{11}
$$

An approximate solution $\tilde{X}_r(s)$ to (9) is obtained by $\tilde{X}_r(s) = V_r Z_r(s)$. Replacing $X_r(s)$ in (10) with $\tilde{X}_r(s)$, we obtain the error estimator for $\hat{H}(s)$:

$$
\| H(s) - \hat{H}(s) \| \approx \| C(s) \tilde{X}_r(s) \| =: \| D(s) \|.
$$

In this work, we consider the matrix max-norm of the error, i.e.,

$$
\| H(s) - \hat{H}(s) \|_{\max} \approx \| D(s) \|_{\max} = \max_{ij} |D_{ij}(s)| =: \Delta(s), \tag{12}
$$

where $D_{ij}(s)$ is the $i,j$-th entry of the error matrix $D(s)$. Finally, we use the maximum as our error estimator $\Delta(s)$ for estimating the error of $\hat{H}(s)$. On the one hand, it is shown in [17] that $\Delta(s)$ can tightly estimate the error of $\hat{H}(s)$ in the sense that it is very close to the true error of $\hat{H}(s)$. On the other hand, computing the error estimator $\Delta(s)$ is much faster than computing the true error defined as $\epsilon(s) = \max_{ij} |H_{ij}(s) - \hat{H}_{ij}(s)|$, where the original transfer function $H(s)$ needs to be computed, which involves a computational cost that scales with the FOM dimension $n$. Note that computing $\Delta(s)$ needs only an approximate solution computed from the ROM of the primal-residual system (9). Moreover, constructing the ROM (11) of this system can be done in parallel with constructing the ROM of the original system. In particular, the projection matrix $V_r$ can be computed in parallel with $V$ during the iteration of Algorithm 1 reported in Fig. 1. Based on the new error estimator $\Delta(s)$, we propose a new algorithm reported in Fig. 2, denoted as Algorithm 2, where the error measure $\Delta^*(s)$ in Step 7 of Algorithm 1 is replaced by $\Delta(s)$ and the computation of $\Delta(s)$ for the PEEC model in (3) is specified.

Remark 1: From the analysis in [17], $V_r$ should be computed in the same way as $V$, but with a different interpolation point $s_k^r$ at every iteration. Otherwise, if $s_k = s_k^r$, then $\Delta(s)$ is identically zero [17]. It is also shown in [17] that $s_k$ should be far away from $s_k^r$ to avoid $\Delta(s)$ being close to zero, which would lead to underestimation of the true error. In the subsequent iterations, $s_k^r$, $k > 1$, is chosen such that $\| r_r(s) \|_2$ ($r_r(s)$ is defined in Step 8 of Algorithm 2, reported in Fig. 2) is maximized, to make sure that $s_k^r$ is different from $s_k$. As has been mentioned before, we use Galerkin projection to construct the ROM, i.e., $W = V$, so that the cost of constructing $W$ in Steps 4-5 of Algorithm 1, reported in Fig. 1, is saved in Algorithm 2.

Remark 2: Different error estimators for the reduced transfer function are proposed in [17]. Theoretical analyses in [17] indicate that $\Delta(s)$ has the least computational complexity,


while numerical tests there show that $\Delta(s)$ is one of the tightest error estimators. Therefore, we choose to use $\Delta(s)$ in this work.

Fig. 2. The proposed greedy algorithm for delayed PEEC models.

Fig. 3. Three coplanar microstrips.
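The chain (6)-(12) can be traced step by step in a few lines. The sketch below computes the reduced primal solution, the primal residual, the reduced primal-residual solution and the resulting max-norm estimate for a toy delay system, and compares it with the true error. The random system, the interpolation points, and the helper names `interp_basis` and `estimate` are assumptions for illustration; in particular, $V_r$ is built here from $\mathcal{K}(s)^{-1}B$ at different points than $V$, following the wording of Remark 1, and $\mathcal{K}(s)$ is formed at full size only for readability.

```python
import numpy as np

def K(s, Es, As, taus):
    """K(s) = s * sum_j E_j exp(-s tau_j) - sum_j A_j exp(-s tau_j)."""
    return sum((s * E - A) * np.exp(-s * tau) for E, A, tau in zip(Es, As, taus))

def interp_basis(points, Es, As, taus, B):
    """Real orthonormal basis whose complex span contains K(s_i)^{-1} B."""
    F = np.hstack([np.linalg.solve(K(s, Es, As, taus), B) for s in points])
    Q, _ = np.linalg.qr(np.hstack([F.real, F.imag]))
    return Q

def estimate(s, Es, As, taus, B, C, V, Vr):
    """Return (estimate of max |H - H_rom|, primal residual, reduced transfer function)."""
    Ks = K(s, Es, As, taus)   # full size for clarity; projected blocks could be precomputed
    Zpr = np.linalg.solve(V.T @ Ks @ V, V.T @ B)      # reduced primal system (7)
    Rpr = B - Ks @ (V @ Zpr)                          # primal residual
    Zr = np.linalg.solve(Vr.T @ Ks @ Vr, Vr.T @ Rpr)  # reduced primal-residual system (11)
    D = C @ (Vr @ Zr)                                 # D(s) = C * approximate X_r(s)
    return np.abs(D).max(), Rpr, C @ (V @ Zpr)

# Hypothetical toy delay system (not a PEEC model).
rng = np.random.default_rng(3)
n = 40
taus = [0.0, 0.3, 0.7]

def small(scale):
    R = rng.standard_normal((n, n))
    return scale * R / np.linalg.norm(R, 2)

Es = [np.eye(n), small(0.01), small(0.01)]
As = [-np.diag(np.linspace(1.0, 5.0, n)), small(0.2), small(0.2)]
B = rng.standard_normal((n, 1))
C = rng.standard_normal((1, n))

V = interp_basis([1.0j, 6.0j], Es, As, taus, B)    # interpolation points of the ROM
Vr = interp_basis([3.0j, 9.0j], Es, As, taus, B)   # different points for the residual system
s = 4.5j
delta, Rpr, H_rom = estimate(s, Es, As, taus, B, C, V, Vr)
H = C @ np.linalg.solve(K(s, Es, As, taus), B)
true_err = np.abs(H - H_rom).max()
delta_same, _, _ = estimate(s, Es, As, taus, B, C, V, V)   # Vr = V: degenerate case
print(delta, true_err, delta_same)
```

With $V_r = V$ the projected residual $V^T R_{pr}(s)$ vanishes by construction, so the estimate collapses to round-off level, which illustrates why Remark 1 asks for different interpolation points for $V_r$.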
Next, we apply the proposed Algorithm 2, reported in Fig. 2, to three large-scale delayed PEEC models and present the numerical results both in the frequency domain and in the time domain. Algorithm 2 is also compared with the algorithms H-greedy and SSI-greedy proposed in [15], as well as their stable versions H-greedy-G and SSI-greedy-G from Section III-B.

IV. NUMERICAL TESTS

We first describe the structures of the three PEEC models in subsection IV-A, then we compare the results of Algorithm 2, reported in Fig. 2, with those of H-greedy, SSI-greedy, H-greedy-G and SSI-greedy-G in terms of both accuracy and wall time¹ in the frequency domain. The speed-ups with respect to simulating the FOM are also presented. The time-domain results are shown afterwards in subsection IV-C. The stability of the ROMs derived from Algorithm 2, H-greedy-G and SSI-greedy-G can be clearly seen from the figures. Numerical tests in the frequency domain and the time domain were done separately by two of the coauthors on their available computers. Tests in the frequency domain were done with MATLAB R2016b on a computer server with 4 Intel Xeon E7-8837 CPUs running at 2.67 GHz and 1 TB main memory, split into four 256 GB partitions. The results in the time domain were obtained with MATLAB R2020a running on a computer with a Core i7-8565U and 40 GB of RAM.

¹ This is also called the wall clock time. It is “the actual time taken from the start of a computer program to its termination. In other words, it is the difference between the time at which a task finishes and the time at which the task started.” (from Wikipedia https://en.wikipedia.org/wiki/Elapsed-real-time)

A. The PEEC Models

1) Coplanar Microstrips: In this example, we consider the three coplanar microstrips structure shown in Fig. 3. The width of the metal strips is $m_w = 0.178$ mm, the thickness of the metal strips and ground plane is $m_t = 0.035$ mm, while the left and right wings of the microstrips are $w_d = 3$ mm. Finally, the length of each strip is $\ell = 5$ cm, the thickness of the dielectric is $d_t = 0.8$ mm, and the spacing between two strips is $s = 0.3$ mm. The relative dielectric constant is set to be $\varepsilon_r = 4$ and the conductivity of the metal is assumed to be $\sigma = 5.8 \cdot 10^7$ S/m. The six ports, located between the ends of each strip and the ground plane below, are terminated on load resistors $R_{load} = 50\ \Omega$. The order of the FOM is $n = 16{,}644$, and there are $d = 168$ significant delays. The frequency band of interest is [0, 10] GHz.

Fig. 4. Microstrip filter geometry adopted for the numerical example.

2) Microstrip Filter: The 3D structure of a microstrip filter is depicted in Fig. 4. The physical dimensions for the geometry of the 3D structure are: $w_{zl} = 0.5$ mm, $w_{z0} = 1.125$ mm, $w_{zC} = 4$ mm, $\ell_{zl} = 18.3$ mm, $\ell_{z0} = 1$ mm, $\ell_{zC} = 14.1$ mm, $w = 2.4$ cm, $\ell = 2\ell_{zl} + 2\ell_{z0} + \ell_{zC}$, $t_m = 100$ μm, $t_s = 100$ μm,


$t_d = 508$ μm. The two ends of the microstrip are terminated on 50 Ω resistors. The order of the FOM is $n = 12{,}132$, and there are $d = 190$ delays. The frequency band of interest is [0, 5] GHz.

3) Three-Port Divider: A three-port microstrip power-divider circuit has been modeled. The structure is shown in Fig. 5 ($P_1$, $P_2$ and $P_3$ denote the ports). The dimensions of the circuit are [20, 20, 0.5] mm in the [x, y, z] directions and the width of the microstrips is set as 0.8 mm. Furthermore, the dimensions $l_{X1}$, $l_{Y1}$, and $l_{Y3}$ are 9, 7.2 and 7.2 mm, respectively. The relative dielectric constant is $\varepsilon_r = 2.2$. All the ports are terminated on 50 Ω resistances. The order of the FOM is $n = 10{,}626$, and it has $d = 93$ delays. The frequency band of interest is [0, 20] GHz.

Fig. 5. The three-port microstrip power-divider circuit.

Simulating the FOM in both the frequency domain and the time domain is very time consuming: for the largest system, it can take up to 1.5 days to compute the transfer function at 1000 frequency samples, and over 2 hours to finish one simulation in the time domain. After the ROM is constructed, simulating the ROM can be done in seconds to minutes. This will be illustrated by the following numerical tests on the three models.

B. Frequency Domain Results

This subsection presents the results of the proposed algorithm and the existing greedy algorithms in the frequency domain. For ease of reading, some common variables in the tables and figures are listed below.
• $r$: the order of the ROM. $n$: the order of the FOM. Iter is the number of iterations for the algorithm to converge.
• “SP-off” in the tables is the speed-up factor of the algorithm when the offline time of constructing the ROM is included. Let T-off be the wall time of running each algorithm till convergence, which is also the offline time of constructing the ROM, Tr be the online simulation time of computing the reduced transfer function at the 1000 validation samples, and Tf be the online simulation time of computing the original transfer function at the same samples; then SP-off=Tf/(T-off+Tr). It is the ratio between the FOM simulation time and the offline ROM construction time plus the online ROM simulation time. “SP-on” is the online speed-up factor, when only the online simulation time of the ROM is considered, i.e., SP-on=Tf/Tr. It is the ratio between the online FOM simulation time and the online ROM simulation time.
• For the H-greedy algorithm, samples of $s$ need to be taken in order to compute the approximate $L_\infty$-error. We take $n_H = 100$ samples of $s$ over the frequency band of interest, since it is shown in [15] that $n_H = 50$ produces less reliable ROMs than $n_H = 100$.
• For the SSI-greedy algorithm, an optimization algorithm needs to be implemented as an inner loop, at each iteration of SSI-greedy. It is shown in [15] that a sample optimization approach obtains the best results. The number of samples there is taken as $n_S = 5$ or $n_S = 10$. We use $n_S = 5$ in this work, as it produced results almost as good as $n_S = 10$, as demonstrated in [15], while the algorithm converges much faster with $n_S = 5$.
• For algorithms H-greedy-G and SSI-greedy-G, we use the same $n_H = 100$ and $n_S = 5$ to compute the ROMs.
• The “error estimator” in Figs. 6-8 means the maximal error estimator over the training set $\Xi$, i.e., $\max_{i=1,\ldots,n_\Xi} \Delta(s_i)$, where $n_\Xi$ is the number of samples in $\Xi$ used in Algorithm 2. The “true error” is the maximal magnitude of the reduced transfer function error evaluated at the samples in $\Xi$, i.e., $\max_{i=1,\ldots,n_\Xi} \epsilon(s_i)$.
• In all the figures, the waveforms indicated by the legend “FOM” are the corresponding results from simulating the FOM.
• The error tolerance ε for the ROMs are set to be
ε = 1 × 10−3 . Tables I-III present the results of the proposed Algorithm 2
• Valid.er: the validated error. The finally obtained ROM is
(shown in Fig. 2), H-greedy, SSI-greedy, as well as their
tested on a separate test set including 1000 frequency Galerkin versions: H-greedy-G, SSI-greedy-G. The offline,
samples. The test set is much larger than the set of online speed-ups, the iteration numbers, the order of the
frequency samples used by H-greedy, H-greedy-G and the finally derived ROM and the validated error are listed for
training set of Algorithm 2, reported in Fig. 2. To test the each algorithm. When the offline time of constructing the
efficacy of the error estimators used in the algorithms, ROMs are considered, all algorithms have achieved speed-
the validated errors are taken according to the corre- ups. In particular, Algorithm 2 is faster than the SSI-greedy
sponding error estimators. In particular, for algorithms algorithms, but is a bit slower than the H-greedy algorithms.
H-greedy, H-greedy-G, SSI-greedy and SSI-greedy-G, the If only the online simulation time is considered, the speed-ups
validated error is the L∞ -error of the reduced transfer SP-on are significant, since any ROM simulation can be done
function computed over a set of 1000 samples of ω, i.e., within seconds to minutes, which is negligible as compared
max H (j ωi )− Ĥ (j ωi ) . For Algorithm 2, the vali- to the FOM simulation that takes hours. The order of the
i=1,...,1000 derived ROM is only slightly bigger if not smaller. The most
dated error is the maximal magnitude error of the reduced important is the ROMs obtained from Algorithm 2 are the
transfer function evaluated at the same 1000 samples with most accurate ones. The errors of the ROMs after validation
s = j ω, i.e., max (si ), where (si ) is defined in are either below the error tolerance or very close to the error
i=1,...,1000
subsection III-A. tolerance. In contrast, the existing greedy algorithms cannot

Authorized licensed use limited to: HANGZHOU DIANZI UNIVERSITY. Downloaded on April 23,2023 at 06:19:21 UTC from IEEE Xplore. Restrictions apply.
4184 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS, VOL. 69, NO. 10, OCTOBER 2022

TABLE I
C OPLANAR M ICROSTRIP : n = 16, 644, 168 D ELAYS

TABLE II
M ICROSTRIP F ILTER : n = 12, 132, 190 D ELAYS

Fig. 6. Coplanar microstrips: error decay in the greedy iterations, true error
vs error estimator.

TABLE III
M ICROSTRIP P OWER -D IVIDER : n = 10, 626, 93 D ELAYS

construct ROMs with the required accuracy, the validated


errors are much larger than the error tolerance ε for most cases.
It is clear that the proposed Algorithm 2 outperforms the other
algorithms in terms of the above comparison quantities.
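The two speed-up factors defined in the list above can be written out as a short sketch. The function name and the timing values below are hypothetical placeholders chosen only for illustration; they are not taken from Tables I-III.

```python
def speedups(t_off, t_rom_online, t_fom_online):
    """Return (SP-off, SP-on) from wall times given in the same unit.

    t_off        : offline time for constructing the ROM (T-off)
    t_rom_online : online time for evaluating the ROM at the validation samples (Tr)
    t_fom_online : online time for evaluating the FOM at the same samples (Tf)
    """
    sp_off = t_fom_online / (t_off + t_rom_online)  # SP-off = Tf / (T-off + Tr)
    sp_on = t_fom_online / t_rom_online             # SP-on  = Tf / Tr
    return sp_off, sp_on


# Hypothetical timings in seconds, for illustration only.
sp_off, sp_on = speedups(t_off=3600.0, t_rom_online=60.0, t_fom_online=36000.0)
print(f"SP-off = {sp_off:.2f}, SP-on = {sp_on:.2f}")
```

Because T-off appears only in the denominator of SP-off, SP-off can never exceed SP-on; the gap between the two shows how much of the total cost is spent in the offline greedy construction.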
In Table III, the validated error of the ROM computed by Algorithm 2 is still a bit larger than the tolerance 0.001. This might be caused by too few (50) training samples in the training set for this model with its very wide frequency band [0, 20] GHz. If we increase the number of training samples from 50 to 70, then the ROM has maximal error 0.001 over the 1000 validation samples, which meets the error tolerance. For the case of using 50 training samples, when checking the error estimator on the 1000 validation frequency samples, we find that the maximal value of the estimator is also 0.0037. This implies that the error estimator tightly estimates the true error, so that we could adaptively update the training set by only watching this highly reliable error estimator. A technique for adaptively updating the samples in the training set based on an efficient a posteriori error estimator was recently proposed in [29] and has been successfully used in model order reduction of large-scale electromagnetic models that have large frequency bands. The technique can be applied to adaptively sample the training set of Algorithm 2. Since the focus of this work is on the applicability and robustness of the new error estimator for model reduction of time-delayed PEEC models, a detailed application of the adaptive sampling approach could be considered in future work.

Figures 6-8 plot the behavior of the proposed error estimator by comparing the error estimator with the true error at each iteration of Algorithm 2. The error estimator is robust for all the three models and behaves as a tight error bound, except for the first iteration for the microstrip filter model. This is reasonable, since the ROM is not yet accurate in the beginning, and the accuracy of the error estimator depends on the accuracy of the ROM. As has been analyzed in [17], the more accurate the ROM, the less likely it is that the error estimator underestimates the true error.

Fig. 7. Microstrip filter: error decay in the greedy iterations, true error vs error estimator.

Fig. 8. Microstrip power-divider: error decay in the greedy iterations, true error vs error estimator.

We further compare the proposed algorithm with the existing greedy algorithms in terms of accuracy in the frequency domain. We plot the magnitudes of the transfer functions of the FOM and the ROM as well as the corresponding phases or angles, respectively. It should be pointed out that although the validated errors for the existing greedy algorithms H-greedy, SSI-greedy, H-greedy-G, SSI-greedy-G are very big, this does not mean that the magnitude error or phase error of $\hat{H}(s)$ is also that big, since the validated errors in Tables I-III are $L_\infty$-errors, which have no direct relation with the magnitude error, especially for the multiple-input and multiple-output systems considered in this work.

In fact, ROMs constructed by all the greedy algorithms produce transfer function magnitudes and phases (or angles) that are indistinguishable from those of the original transfer functions for all the three PEEC models. To avoid redundancy, we only present the transfer function magnitude, phase and angle computed by Algorithm 2, reported in Fig. 2, and by the FOM simulation, respectively. We compare the algorithms using only the coplanar microstrip model and only present the results for two entries of the transfer matrix, $H_{11}(s)$ and $H_{16}(s)$; the results for the other entries are omitted. Furthermore, to clearly present the accuracy of the ROMs constructed by different algorithms, we plot the errors of the reduced transfer functions in terms of magnitude, phase and angle.

Figure 9 shows the magnitude of the transfer function from input port 1 to output port 1, i.e., the (1,1)-th entry $H_{11}(s)$ of $H(s)$, computed by Algorithm 2 and by simulating the FOM, respectively. Figure 10 plots the magnitude errors of the reduced transfer functions computed by different algorithms. It is seen that H-greedy and H-greedy-G have the biggest errors, which are larger than 0.01 at some frequencies, while the largest errors of the other algorithms are much smaller, around $10^{-4}$.

Fig. 9. Coplanar microstrips: magnitude of the transfer function from input port 1 to output port 1 computed by Algorithm 2 and FOM simulation.

Fig. 10. Coplanar microstrips: magnitude errors of the reduced transfer function from input port 1 to output port 1 computed by different algorithms.

Figure 11 plots the phase of $H_{11}(s)$ computed by Algorithm 2 and that by the FOM simulation. The absolute values of the phase errors by different algorithms are plotted in Fig. 12. Again, the transfer functions computed by H-greedy and H-greedy-G have the biggest phase errors.

Fig. 11. Coplanar microstrips: phase of the transfer function from input port 1 to output port 1 computed by Algorithm 2 and FOM simulation.

Fig. 12. Coplanar microstrips: phase error of the reduced transfer function from input port 1 to output port 1 computed by different algorithms.
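The difference between the matrix-valued $L_\infty$-error listed in the tables and the entrywise magnitude and phase errors plotted in the figures can be illustrated with a minimal NumPy sketch. It assumes that the $L_\infty$-error is measured in the spectral norm, which the text does not state explicitly, and it uses a made-up 2 × 2 toy transfer function rather than any of the PEEC models; the function names are hypothetical.

```python
import numpy as np


def linf_error(H, H_hat):
    """Maximum over the samples of the spectral norm of H - H_hat.

    H, H_hat: complex arrays of shape (n_samples, n_outputs, n_inputs).
    The spectral norm is an assumption made for this sketch.
    """
    return max(np.linalg.norm(Hi - Hri, 2) for Hi, Hri in zip(H, H_hat))


def entry_errors(H, H_hat, i, j):
    """Magnitude and absolute phase error of entry (i, j) at every sample."""
    mag_err = np.abs(np.abs(H[:, i, j]) - np.abs(H_hat[:, i, j]))
    # angle(H * conj(H_hat)) is the phase difference wrapped to (-pi, pi].
    phase_err = np.abs(np.angle(H[:, i, j] * np.conj(H_hat[:, i, j])))
    return mag_err, phase_err


# Toy data: every entry equals exp(-j*omega); the "ROM" is off by 1 % in gain.
omega = np.array([1.0, 2.0, 3.0])
H = np.exp(-1j * omega)[:, None, None] * np.ones((2, 2))
H_hat = 1.01 * H

mag_err, phase_err = entry_errors(H, H_hat, 0, 0)
print("L-infinity error (spectral norm):", linf_error(H, H_hat))
print("max magnitude error of H11:     ", mag_err.max())
print("max phase error of H11:         ", phase_err.max())
```

In this toy case the matrix-norm error is twice the largest entrywise magnitude error, because the norm accumulates the deviations of all entries, and the phase error is essentially zero. This is the kind of gap the text points to when it notes that a large validated $L_\infty$-error need not imply a large magnitude or phase error in a single entry.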


Figures 13 and 15 present the results for the transfer function from input port 1 to output port 6, i.e., the (1,6)-th entry $H_{16}(s)$ of $H(s)$. Finally, Figures 14 and 16 show the corresponding magnitude errors and the absolute values of the angle errors computed by different algorithms. This time, the reduced transfer functions computed by Algorithm 2, H-greedy and H-greedy-G have similar errors, which are a bit larger than the errors of the transfer functions computed by SSI-greedy and SSI-greedy-G. However, note that all the errors are smaller than 0.001, which is already very small.

Fig. 13. Coplanar microstrips: magnitude of the transfer function from input port 1 to output port 6 computed by Algorithm 2 and FOM simulation.

Fig. 14. Coplanar microstrips: magnitude error of the reduced transfer function from input port 1 to output port 6 computed by different algorithms.

Fig. 15. Coplanar microstrips: angle of the transfer function from input port 1 to output port 6 computed by Algorithm 2 and FOM simulation.

Fig. 16. Coplanar microstrips: angle error of the reduced transfer function from input port 1 to output port 6 computed by different algorithms.

In summary, H-greedy and H-greedy-G produce larger errors than the other algorithms in the frequency domain, though the results of all the algorithms are acceptable. SSI-greedy and SSI-greedy-G are the slowest algorithms, consuming much more offline time than the others.

C. Time Domain Results

Time domain simulations of neutral delay differential equations (NDDEs), like the ones arising from the delayed-PEEC method, are often challenging to solve using Marching on Time (MoT) methods because of the late-time instabilities that may appear. While alternative methods like the Numerical Inversion of the Laplace Transform (NILT) exhibit interesting properties, as shown in [30], MoT remains the most widespread general-purpose way to solve time domain systems. The time domain simulations of the FOM and ROMs have been performed by means of a solver based on the Backward Differentiation Formula of second order (BDF2) with triangular basis functions for time interpolation [22]. The transient behavior of the ROMs obtained from the different greedy algorithms is presented in this subsection. As in the frequency domain, the time-domain ROM simulation needs negligible runtime compared to the time-domain FOM simulation. To avoid repetition, we do not provide the wall time of the ROM and FOM simulations in the time domain. The input signals used for the PEEC models are plotted in Fig. 17. There, the coplanar microstrips and the microstrip power-divider have the same input signal, while the input signal for the microstrip filter has a longer time delay. The performance of each ROM at different output ports of the same model is very similar. Therefore, we only show the results of the ROMs at one randomly chosen output port for each PEEC model. For the coplanar microstrips and the microstrip filter, we show the results at port 1, and the results at port 3 are presented for the microstrip power-divider model.

Fig. 17. Input signals for the three PEEC models.

1) Coplanar Microstrips: This part presents the results for the coplanar microstrips model. The FOM response at port 1 is plotted in Fig. 18, where the waveform begins to blow up at around 15 nanoseconds (ns). In contrast, the response of the ROM produced by Algorithm 2 in Fig. 20 is stable till 20 ns. The ROMs by H-greedy and SSI-greedy cannot give rise to meaningful outputs, which are plotted in Fig. 19, though the ROM by SSI-greedy is accurate in the frequency domain (see Figs. 10, 12, 14, 16). However, if Galerkin projection is applied to construct the ROMs, the corresponding algorithms H-greedy-G and SSI-greedy-G construct ROMs that are stable over the whole time interval. The outputs are plotted in Fig. 20. Note that the output waveform produced by the ROM of SSI-greedy-G is not identically zero at later times and has almost invisible oscillations around zero. H-greedy-G behaves similarly to SSI-greedy-G, and it produces a ROM that is even more stable than that of SSI-greedy-G in the time domain. This also suggests that the instability of the ROMs might be caused by the Petrov-Galerkin projection. This conclusion is further supported by the results for the other two PEEC models.

Fig. 18. Coplanar microstrips: FOM response at port 1.

Fig. 19. Coplanar microstrips: H-greedy- and SSI-greedy-ROM responses at port 1.

Fig. 20. Coplanar microstrips: SSI-greedy-G-, H-greedy-G-, Algorithm 2-ROM responses at port 1.

2) Microstrip Filter: The results for the microstrip filter are analyzed in this part. Figure 21 plots the FOM response at port 1, which again blows up at 15 ns, before the final time 20 ns. The ROM response by Algorithm 2 in Fig. 23 is still stable till the final time. Once more, the Galerkin projection based algorithms H-greedy-G and SSI-greedy-G construct ROMs that are also accurate and stable in the time domain, as can be seen from Fig. 23. It is interesting to see that H-greedy also constructs a ROM that is stable and accurate in the time domain, see Fig. 22. Note that the ROMs constructed by all the algorithms have accurate results in the frequency domain for the microstrip filter model, though they are not shown in subsection IV-B.

Fig. 21. Microstrip filter: FOM response at port 1.

Fig. 22. Microstrip filter: H-greedy- and SSI-greedy-ROM responses at port 1.

Fig. 23. Microstrip filter: H-greedy-G-, SSI-greedy-G-, Algorithm 2-ROM responses at port 1.

3) Microstrip Power-Divider: The results for the microstrip power-divider model are similar to those for the coplanar microstrips model and are presented in Figs. 24-26. The FOM response blows up already at 5 ns, while the ROM response of Algorithm 2 is stable till 20 ns. The ROMs constructed by H-greedy and SSI-greedy cannot reproduce the output waveforms. In contrast, the Galerkin projection based ROMs by H-greedy-G and SSI-greedy-G produce accurate and stable outputs. The frequency-domain results of all the ROMs are also accurate for the microstrip power-divider model.

Fig. 24. Microstrip power-divider: FOM response at port 3.

Fig. 25. Microstrip power-divider: H-greedy- and SSI-greedy-ROM responses at port 3.

Fig. 26. Microstrip power-divider: SSI-greedy-G-, H-greedy-G-, Algorithm 2-ROM responses at port 3.

In general, SSI-greedy and H-greedy produce unstable ROMs (with only one exception) in the time domain, though SSI-greedy is very accurate in the frequency domain. In most


cases, H-greedy behaves worse than the others both in the frequency and in the time domain, though it uses much less offline time (T-off) than SSI-greedy. SSI-greedy-G has slight oscillations in the time domain for the coplanar microstrips model. Finally, considering runtime, accuracy and stability, the proposed Algorithm 2 is the most efficient. The results in the time domain show that Galerkin projection produces ROMs that are more stable, since all ROMs derived using Galerkin projection are stable, and only a single Petrov-Galerkin ROM, derived by H-greedy for the microstrip filter model, is stable. A method based on linear matrix inequalities to study the input-to-state stability of PEEC models with multiple non-commensurate time delays has been presented in [31], but it becomes inefficient when a large number of delays is considered. To the best knowledge of the authors, theoretical analysis of the stability of ROMs for time-delayed PEEC models with a large number of delays rarely appears in the literature. It should be an interesting topic for future research.

V. CONCLUSION

A robust model order reduction algorithm for time-delayed PEEC models is proposed in this work. A tight error estimator is the key factor that makes the algorithm successful both in the frequency domain and in the time domain. Computing the error estimator involves computing two residuals that are introduced by the ROM of the FOM and another ROM of a residual system. Constructing the ROM of the residual system can be implemented simultaneously with the targeted ROM construction within one greedy algorithm. The error estimator is much cheaper than the previously proposed error estimator, making the greedy algorithm much more efficient than the existing greedy algorithms. Compared with the existing methods for time-delay systems, which interpolate an approximation of the original transfer function, the proposed algorithm interpolates the original transfer function directly, leading to more accurate results in the frequency domain. Results in both the frequency domain and the time domain for three large-scale PEEC models with up to 191 delays support our analysis and demonstrate the robustness of the proposed algorithm.

APPENDIX

This appendix summarizes the modified Gram-Schmidt process, reported in Fig. 27, applied in this work.

Fig. 27. Modified Gram-Schmidt process for orthogonalizing the columns of $V_1$ with the columns in an orthogonal matrix $V$.

REFERENCES

[1] J. Cullum, A. Ruehli, and T. Zhang, “A method for reduced-order modeling and simulation of large interconnect circuits and its application to PEEC models with retardation,” IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 47, no. 4, pp. 261–273, Apr. 2000.
[2] P. Benner, S. Gugercin, and K. Willcox, “A survey of projection-based model reduction methods for parametric dynamical systems,” SIAM Rev., vol. 57, no. 4, pp. 483–531, 2015.
[3] A. C. Antoulas, C. A. Beattie, and S. Gugercin, Interpolatory Methods for Model Reduction (Computational Science & Engineering). Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2020.
[4] P. Benner, S. Grivet-Talocia, A. Quarteroni, G. Rozza, W. H. A. Schilders, and L. M. Silveira, Model Order Reduction: System- and Data-Driven Methods and Algorithms, vol. 1. Berlin, Germany: De Gruyter, 2021.
[5] E. R. Samuel, L. Knockaert, and T. Dhaene, “Model order reduction of time-delay systems using a Laguerre expansion technique,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 61, no. 6, pp. 1815–1823, Jun. 2014.
[6] E. R. Samuel, D. Deschrijver, F. Ferranti, L. Knockaert, and T. Dhaene, “Multipoint model order reduction for systems with delays,” in IEEE MTT-S Int. Microw. Symp. Dig., Aug. 2015, pp. 1–3.
[7] E. R. Samuel, L. Knockaert, and T. Dhaene, “Matrix-interpolation-based parametric model order reduction for multiconductor transmission lines with delays,” IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 62, no. 3, pp. 276–280, Mar. 2015.
[8] A. E. Ruehli, G. Antonini, and L. Jiang, Circuit Oriented Electromagnetic Modeling Using the PEEC Techniques. Hoboken, NJ, USA: Wiley, 2017.
[9] W. Tseng, C. Chen, E. Gad, M. Nakhla, and R. Achar, “Passive order reduction for RLC circuits with delay elements,” IEEE Trans. Adv. Packag., vol. 30, no. 4, pp. 830–840, Nov. 2007.
[10] F. Ferranti, M. S. Nakhla, G. Antonini, T. Dhaene, L. Knockaert, and A. E. Ruehli, “Multipoint full-wave model order reduction for delayed PEEC models with large delays,” IEEE Trans. Electromagn. Compat., vol. 53, no. 4, pp. 959–967, Nov. 2011.
[11] A. Odabasioglu, M. Celik, and L. T. Pileggi, “PRIMA: Passive reduced-order interconnect macromodeling algorithm,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 17, no. 8, pp. 645–654, Aug. 1998.
[12] Y. Dou and K.-L. Wu, “A passive PEEC-based micromodeling circuit for high-speed interconnection problems,” IEEE Trans. Microw. Theory Techn., vol. 66, no. 3, pp. 1201–1214, Mar. 2018.
[13] Y. Dou and K.-L. Wu, “Direct mesh-based model order reduction of PEEC model for quasi-static circuit problems,” IEEE Trans. Microw. Theory Techn., vol. 64, no. 8, pp. 2409–2422, Aug. 2016.
[14] L. Lombardi, Y. Tao, B. Nouri, F. Ferranti, G. Antonini, and M. S. Nakhla, “Parameterized model order reduction of delayed PEEC circuits,” IEEE Trans. Electromagn. Compat., vol. 62, no. 3, pp. 859–869, Jun. 2020.
[15] D. Alfke, L. Feng, L. Lombardi, G. Antonini, and P. Benner, “Model order reduction for delay systems by iterative interpolation,” Int. J. Numer. Methods Eng., vol. 122, no. 3, pp. 684–706, Feb. 2021.
[16] C. Beattie and S. Gugercin, “Interpolatory projection methods for structure-preserving model reduction,” Syst. Control Lett., vol. 58, no. 3, pp. 225–232, Mar. 2009.
[17] L. Feng and P. Benner, “On error estimation for reduced-order modeling of linear non-parametric and parametric systems,” ESAIM, Math. Model. Numer. Anal., vol. 55, no. 2, pp. 561–594, Mar. 2021.
[18] L. Feng, L. Lombardi, G. Antonini, and P. Benner, “Stable macromodels for delayed PEEC models with error estimation,” in Proc. Int. Appl. Comput. Electromagn. Soc. (ACES) Symp., Aug. 2021, pp. 1–4.
[19] A. E. Ruehli, “Inductance calculations in a complex integrated circuit environment,” IBM J. Res. Develop., vol. 16, no. 5, pp. 470–481, Sep. 1972.


[20] A. E. Ruehli and P. A. Brennan, “Efficient capacitance calculations for three-dimensional multiconductor systems,” IEEE Trans. Microw. Theory Techn., vol. MTT-21, no. 2, pp. 76–82, Feb. 1973.
[21] A. E. Ruehli, “Equivalent circuit models for three-dimensional multiconductor systems,” IEEE Trans. Microw. Theory Techn., vol. MTT-22, no. 3, pp. 216–221, Mar. 1974.
[22] C. Gianfagna, L. Lombardi, and G. Antonini, “Marching-on-in-time solution of delayed PEEC models of conductive and dielectric objects,” IET Microw., Antennas Propag., vol. 13, no. 1, pp. 42–47, Jan. 2019.
[23] M. Taskinen and P. Ylä-Oijala, “Current and charge integral equation formulation,” IEEE Trans. Antennas Propag., vol. 54, no. 1, pp. 58–67, Jan. 2006.
[24] D. Gope, A. Ruehli, and V. Jandhyala, “Solving low-frequency EM-CKT problems using the PEEC method,” IEEE Trans. Adv. Packag., vol. 30, no. 2, pp. 313–320, May 2007.
[25] A. Bellen, N. Guglielmi, and A. E. Ruehli, “Methods for linear systems of circuit delay differential equations of neutral type,” IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 46, no. 1, pp. 212–215, Jan. 1999.
[26] A. Ruehli, U. Miekkala, A. Bellen, and H. Heeb, “Stable time domain solutions for EMC problems using PEEC circuit models,” in Proc. IEEE Symp. Electromagn. Compat., Aug. 1994, pp. 371–376.
[27] W. Pinello, A. C. Cangellaris, and A. Ruehli, “Hybrid electromagnetic modeling of noise interactions in packaged electronics based on the partial-element equivalent-circuit formulation,” IEEE Trans. Microw. Theory Techn., vol. 45, no. 10, pp. 1889–1896, Oct. 1997.
[28] R. F. Harrington, Field Computation by Moment Methods. Malabar, FL, USA: Krieger, 1982.
[29] S. Chellappa, L. Feng, V. de la Rubia, and P. Benner, “Adaptive interpolatory MOR by learning the error estimator in the parameter domain,” in Model Reduction of Complex Dynamical Systems (International Series of Numerical Mathematics), vol. 171, P. Benner, T. Breiten, H. Faßbender, M. Hinze, T. Stykel, and R. Zimmermann, Eds. Cham, Switzerland: Birkhäuser, 2021, pp. 97–117.
[30] L. Lombardi et al., “Time-domain analysis of retarded partial element equivalent circuit models using numerical inversion of Laplace transform,” IEEE Trans. Electromagn. Compat., vol. 63, no. 3, pp. 870–879, Jun. 2021.
[31] G. Antonini and P. Pepe, “Input-to-state stability analysis of partial-element equivalent-circuit models,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 3, pp. 673–684, Mar. 2009.

Lihong Feng received the Ph.D. degree in computational mathematics from Fudan University, Shanghai, in 2002. She received an Alexander-von-Humboldt Fellowship for the period 2007–2009 and worked at the host university TU Chemnitz. Since 2010, she has been a Senior Scientist at the Max Planck Institute for Dynamics of Complex Technical Systems in Magdeburg, and became a Team Leader with the CSC Group headed by Peter Benner. Her research interests include model order reduction and fast simulation of complex models arising from engineering applications, such as fluid dynamics, chemical engineering, mechanical engineering and electrical engineering, as well as numerical analysis and scientific computing.

Luigi Lombardi received the Laurea (M.D.) and Ph.D. (cum laude) degrees in electronic engineering from the University of L’Aquila, L’Aquila, Italy. In early 2018, he was a Visiting Researcher with the Department of Electronics, Carleton University, Ottawa, ON, Canada. Since November 2018, he has been working as a Design Engineer with the Non Volatile Memory (NVM) Design Team, Micron Semiconductor, Avezzano, Italy.

Peter Benner received the Diploma degree in mathematics from the RWTH Aachen University, Aachen, Germany, in 1993, the Ph.D. degree in mathematics from the University of Kansas, Lawrence, KS, USA, and the TU Chemnitz-Zwickau, Germany, in February 1997, and the Habilitation (Venia Legendi) degree in mathematics from the University of Bremen, Germany, in 2001. After spending a term as a Visiting Associate Professor with the TU Hamburg-Harburg, Germany, he was a Lecturer in Mathematics with the TU Berlin, Germany, from 2001 to 2003. Since 2003, he has been a Professor of mathematics in industry and technology with the TU Chemnitz. In 2010, he was appointed as one of the four Directors of the Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany. Since 2011, he has been an Honorary Professor with the Otto-von-Guericke University of Magdeburg, Germany. His research interests include scientific computing, numerical mathematics, systems theory, and optimal control. He is a SIAM Fellow (Class of 2017).

Daniele Romano was born in Campobasso, Italy, in 1984. He received the Laurea degree in computer science and automation engineering from the University of L’Aquila, L’Aquila, Italy, in 2012, and the Ph.D. degree in 2018. Since 2012, he has been with the UAq EMC Laboratory, University of L’Aquila, focusing on EMC modeling and analysis, algorithm engineering, and speed-up techniques applied to EMC problems.

Giulio Antonini (Senior Member, IEEE) received the Laurea degree (cum laude) in electrical engineering from the University of L’Aquila, L’Aquila, Italy, in 1994, and the Ph.D. degree in electrical engineering from the University of Rome “La Sapienza” in 1998. Since 1998, he has been with the UAq EMC Laboratory, University of L’Aquila, where he is currently a Professor. He has authored more than 300 papers published in international journals and in the proceedings of international conferences. He has also coauthored the book Circuit Oriented Electromagnetic Modeling Using the PEEC Techniques (Wiley–IEEE Press, 2017). His research interests are in the field of computational electromagnetics.
