An Improved Multiple Edge Responses Method for Memory Simulation Considering Worst Case and Nonlinear Crosstalk

Shiji Pan (1), Jiangyuan Qian (2)
(1) Inphi Corporation, Santa Clara, California, US
(2) Qualcomm Technologies, Inc, San Diego, California, US
span@inphi.com

Abstract: Although the multiple edge responses method has been proven effective in capturing input and output nonlinearity in channel simulation, it may not be applicable to memory simulation because of the nonlinear crosstalk effect and the massive number of step responses it requires, which significantly reduces simulation efficiency. In this work, we present an improved simulation methodology based on multiple edge responses that accurately estimates the worst-case eye diagram and is specifically applicable to memory simulation. The proposed method captures the nonlinearity due to crosstalk and achieves a nearly 20 times simulation acceleration compared with the traditional long random input bit pattern simulation, based on a multiple-processor system.

Keywords: intersymbol interference; peak distortion analysis; worst case crosstalk; eye diagram; multiple edge responses; nonlinearity; parallel link; DDR; memory

I. INTRODUCTION
To estimate the performance of high-speed signaling, accurate channel simulation that includes both the passive interconnections and the active input/output (I/O) circuits is critical. Although SPICE simulation is considered accurate, it is typically very time-consuming, especially for a complex I/O circuit topology. More importantly, simulations using a random bit pattern (e.g., a pseudorandom binary sequence, PRBS) of limited length may not be sufficient and may therefore fail to capture the worst-case situation of a channel.

In recent years, peak distortion analysis (PDA) [1] has been used to quickly determine the worst-case eye of the link. In PDA, the worst-case voltage magnitude at any given sampling point due to intersymbol interference (ISI) is calculated directly from the single bit response (SBR), which is supposed to characterize the whole channel. Although PDA is computationally efficient, it assumes that the whole channel is linear time invariant (LTI), or at least well approximated by an LTI system. In most cases, a differential I/O circuit satisfies this criterion. However, for a channel involving single-ended I/O circuits, this assumption usually cannot be satisfied. In other words, an SBR cannot capture the nonlinearity of the channel. Even though the superposition of double edge responses (DERs) can resolve the asymmetry issue, it still cannot fully overcome the driver nonlinearity. To solve the issue of nonlinearity in channel simulation, the multiple edge response (MER) method was introduced in [2, 3].

The MER method distinguishes a single-bit step response by a number (N) of previous bits (pre-bits). Instead of using the SBR as in PDA, time-shifted superposition is applied by selecting the corresponding step response based on the instantaneous transition and a few preceding bits, as expressed by

Y(t) = \sum_{k=1}^{\infty} \left\{ |X_{k+1} - X_k| \cdot S_{X_{k-N} \cdots X_{k-1} X_k}(t - k \cdot UI) \right\}

in which X_1, X_2, ..., X_k stand for the first, second, and kth bits of the input pattern, N indicates the order of the MER method, UI is the unit interval, and S(t) is the particular rising or falling edge response. For an Nth order MER method, a total of 2^N step responses need to be simulated before the time-shifted superposition. Increasing the order N makes the MER method approach a full PRBS HSPICE simulation (in this work, we generally call it "HSPICE simulation" for simplicity), and therefore a better match with HSPICE simulation can be achieved. The value of N needs to be large enough to capture the nonlinearity, at the cost of a larger number of required step responses. In [4], this method was employed even when considering the effect of the power distribution network (PDN), although only for a single-bit case.

Nevertheless, to the best of our knowledge, no study has applied the MER method to a parallel link eye diagram simulation with crosstalk considered. In this work, the feasibility of using the MER method in a parallel link eye diagram simulation is studied. It is found that the linear addition of the victim's and aggressors' step responses could severely overestimate the aperture size and the jitter of the eye diagram, which is solely due to the intrinsic nonlinear property of crosstalk. A tentative solution is to superpose time-shifted step responses that explicitly include crosstalk, under all possible combinations. However, this approach requires a massive number of step responses, a number that grows exponentially with the number

978-1-4799-1993-2/15/$31.00 ©2015 IEEE


of aggressors. This leads to a long simulation time, possibly making the approach even less efficient than an HSPICE simulation with long random input bit patterns.

Therefore, we reconstruct the MER method by including the worst-case crosstalk in each step response. According to our simulations, this improved method is not only numerically efficient but also accurate enough to estimate the worst-case eye diagram for a targeted victim. The proposed method has been validated for various nonlinear I/O circuits. It should be noted that in this work we assume an ideal, clean power supply for the I/O circuits, so that power supply induced jitter (PSIJ) is ignored.

II. MER METHOD CONSIDERING CROSSTALK
First, we use the MER method to verify a single-bit channel involving a single-ended nonlinear I/O. Fig. 1(a) and (b) compare the eye diagrams obtained by HSPICE simulation, by the superposition of double edge responses (DERs), and by 2nd order MERs, using a 1024-bit PRBS pattern. The DER method is based on two step responses, one rising (r01) and one falling (f10), while the 2nd order MER method requires four step responses, r001, r101, f110, and f010, in which the letter indicates a rising (r) or falling (f) step response and the first digit indicates the one bit preceding the transition. The comparison shows a certain discrepancy between DER and HSPICE at the top right corner of the eye, whereas 2nd order MER and HSPICE match well. This means that 2nd order MER is sufficient to capture the nonlinearity while DER is not.

Fig. 1 Comparison of simulated eye diagrams (voltage in V versus time in UI) for a 1024-bit PRBS pattern: without crosstalk, (a) DER and HSPICE, (b) 2nd order MER and HSPICE; with crosstalk, (c) 2nd order MER and HSPICE, (d) 3rd order MER and HSPICE.

Fig. 1(c) and (d) compare the eye diagrams obtained by HSPICE and by the 2nd and 3rd order MER methods for a three-bit channel (one victim and two aggressors). The 2nd order MER method requires a total of 12 (= 3×2^2) single-source step responses, all observed at the victim output, while the 3rd order MER method is based on 24 (= 3×2^3) step responses. It can be observed that (i) a higher order MER does not reduce the deviation between the HSPICE simulation and the MER method in the eye diagram, and (ii) even for the 3rd order MER, there is a large discrepancy spread over all feature regions of the eye. This discrepancy arises because the linear combination of the victim's and aggressors' step responses cannot capture the intrinsic nonlinearity of crosstalk.

To validate this observation, Fig. 2(a) compares the rising edge at the victim output between the case in which the victim and the aggressors rise simultaneously (red dotted line) and the case in which the victim and the aggressors rise in separate simulations and the total victim output is obtained by point-to-point superposition (blue solid line). Similarly, Fig. 2(b) shows the case in which the victim is rising while the aggressors are falling. In both plots, there is a certain deviation between the blue solid line and the red dotted line. In Fig. 2(a), the red dotted line is mostly above the blue line, which means that linear superposition could miss a severe overshoot. In Fig. 2(b), in contrast, the red dotted line is entirely below the blue line, which implies that in this case linear superposition overestimates the eye and the timing margin. It is noteworthy that although only single-bit transition edges are plotted here, in a practical scenario the ISI from the previous bit pattern could further reinforce this deviation caused by nonlinearity.

Fig. 2 Comparison of the victim output (voltage in V versus time in UI) between case (i), in which the victim and the aggressors are simulated separately and the results are combined by post-processing (superposition), and case (ii), in which the victim and the aggressors are simulated simultaneously: (a) both the victim and the aggressors are rising; (b) the victim is rising while the aggressors are falling.

The comparison in Fig. 2 indicates that, in order to capture the nonlinear effect of crosstalk, the channel characterization process must include step responses in which the victim and the aggressors switch simultaneously. A generalized Nth order MER method with crosstalk should consider all possible combinations of N bits at both the aggressors and the victim. Consider a 2nd order MER method for
a 3-bit channel with one victim and two aggressors. At the edge of interest, each signal could be in one of five states: "quiet", "r001", "r101", "f110", and "f010". Overall, there are 5^3 - 1 = 124 combinations of step responses (minus one because the case in which all signals are quiet is unnecessary). Requiring so many step responses greatly degrades the efficiency of the MER method, and this is only a 3-bit channel using 2nd order MER. If the channel has M signals and an Nth order MER method is used, the total number of step responses will be proportional to 2^(M×N). The requirement for such a massive number of step responses jeopardizes the simulation efficiency and makes the generalized MER method impractical. To overcome this issue, we reconstruct the method as described in the next section.

III. MER METHOD BASED ON STEP RESPONSES CONSIDERING WORST-CASE CROSSTALK

A. Worst-case Crosstalk
The purpose of plotting an eye diagram is to locate the worst-case situation of the signal waveform, more specifically, the worst rising and falling edges. Intuitively, the worst rising and falling edges in the eye diagram correspond to worst-case crosstalk plus destructive ISI from the previous patterns. As an initial step, it is important to understand under which condition the strongest crosstalk happens. We find that the worst-case crosstalk happens when all the aggressors switch at the same transition as the victim but in the reverse direction (e.g., all the aggressors are falling when the victim is rising, and vice versa). Two test cases have been used to validate this assumption. In the first case, we show a group of unit step responses at the victim output for various combinations of the two aggressors. As shown in Fig. 3(a), the worst "1" and the worst "0" of the victim rising edge both appear when both aggressors are falling. Likewise, in Fig. 3(b), both the worst "1" and the worst "0" of the victim falling edge occur when both aggressors are rising. This comparison shows that reverse switching of all aggressors leads to the worst crosstalk on a victim at that transition. It should be mentioned that this comparison gives no information about how the following transition would be affected by ISI. In other words, it is still reasonable to question whether reverse switching of all aggressors at the current transition could lead to an underestimated transition at the following bits. To address this doubt, we consider a second case as below.

Fig. 3 Comparison of the victim output (voltage in V versus time in UI) under the effect of different patterns of the two aggressors (Agg1/Agg2: Static/Static, Fall/Fall, Fall/Static, Fall/Rise, Rise/Rise), with the worst "1" and the worst "0" marked: (a) victim is rising; (b) victim is falling.

In this second case, we compare the victim eye diagram between the case in which all aggressors have different PRBS patterns and the case in which all aggressors have the same pattern. With 1024-bit PRBS patterns employed in both simulations, the ISI effect is considered to be fully characterized. With an infinitely long input PRBS pattern, the two eye diagrams should ideally match each other perfectly. However, as the comparison in Fig. 4 shows, with a finite-length input bit pattern considering ISI, all aggressors in the same pattern result in a worse victim eye diagram in terms of both aperture size and jitter. This further supports our assumption that the worst-case crosstalk occurs when all aggressors switch in the reverse direction at the same time, even when ISI is considered.

Fig. 4 Comparison of the victim eye diagram (voltage in V versus time in UI) when all the aggressors are in the same PRBS pattern (blue) and when all the aggressors are in different PRBS patterns (red).

This worst-case crosstalk observation implies that it is not necessary to traverse all the possible combinations of step responses between the victim and the aggressors in the MER method. By considering only the case in which all aggressors switch in the same pattern, the number of step responses is significantly reduced from the order of 2^(M×N) (N is the order of the MER method and M is the number of signals) to the order of 2^N, which is independent of the number of signals in a parallel link.

It should be mentioned that the worst crosstalk pattern is based on a targeted victim. For a group of channels in which the worst victim is not known, a certain technique should be employed to locate the worst victim first, for example, by checking the summation of the far-end crosstalk at each signal output based on the passive channel models.
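The step-response counts quoted in Sections II and III.A can be reproduced with a short calculation. The following is a minimal sketch, not the authors' tooling: the function names are hypothetical, and it assumes that each signal has 2^N edge patterns for an Nth order MER (r001, r101, f110, f010 for N = 2) plus one "quiet" state in the generalized case.

```python
def edge_patterns(order: int) -> int:
    """Distinct edge responses per signal for an MER of the given order (2^N)."""
    return 2 ** order


def single_source_count(num_signals: int, order: int) -> int:
    """Linear MER: each signal is stepped on its own and observed at the victim."""
    return num_signals * edge_patterns(order)


def generalized_count(num_signals: int, order: int) -> int:
    """Joint MER: every signal is quiet or in one edge pattern; all-quiet is dropped."""
    states_per_signal = edge_patterns(order) + 1  # edge patterns plus "quiet"
    return states_per_signal ** num_signals - 1


if __name__ == "__main__":
    print(single_source_count(3, 2))  # 12, the 2nd order case of Fig. 1(c)
    print(single_source_count(3, 3))  # 24, the 3rd order case of Fig. 1(d)
    print(generalized_count(3, 2))    # 124, the generalized 3-bit example
    # Growth of the generalized count with the number of signals (2nd order).
    for m in range(2, 10):
        print(m, generalized_count(m, 2))
```

The loop illustrates why the generalized approach becomes impractical for wide parallel links, whereas a count that depends only on the MER order stays fixed as signals are added.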
B. Step Responses Considering Worst-case Crosstalk
Fig. 5 shows the step responses at the victim output for various combinations of the victim's and aggressors' input patterns, considering 2nd order MER. In all cases, both the victim and the aggressors are switching, and all the aggressors switch in the same pattern, as in Section III.A. In each plot, we show the transition region of interest while the aggressors take one of four possible switching patterns, indicated by "f110", "f010", "r001", and "r101". In Fig. 5(a), the victim is rising in an r001 pattern, while in Fig. 5(b), the victim is rising in an r101 pattern. Similarly, Fig. 5(c) and (d) represent the cases in which the victim goes through f110 and f010, respectively. It should be stressed that, in all the curves in Fig. 5, the ISI effect due to both the victim's and the aggressors' previous bit has been removed by subtracting a shifted rising or falling step response (i.e., r011 or f100, with the victim only or with the aggressors only).

Fig. 5 Comparison of the step responses' transition edges (voltage in V versus time in UI) for four cases of 2nd order crosstalk on the victim's rising and falling edges: (a) victim r001(t); (b) victim r101(t) with the 1→0 transition effect removed; (c) victim f110(t); (d) victim f010(t) with the 0→1 transition effect removed. All aggressors have the same switching pattern (curves: AGG:110, AGG:010, AGG:001, AGG:101, No Xtalk).

According to Fig. 5, it is clear that the aggressors' bit pattern plays a significant role in the victim's rising and falling edges. Let us discuss only the case in Fig. 5(a), since similar conclusions can be drawn from the other three plots. Compared with the case without crosstalk (solid green line), two observations can be made.

(i) In terms of the eye aperture size, the falling bit patterns on the aggressors ("Agg: f110" in the black solid line and "Agg: f010" in the magenta dash-dotted line) affect the victim's rising edge more strongly than the two rising aggressor patterns. This is because, when the aggressors and the victim switch in opposite directions, the odd mode of the lines establishes a severe negative far-end crosstalk (FEXT). The negative FEXT reaches the victim after a certain time (in this case, after 0.45·UI) and therefore greatly lowers the rising waveform. In addition, the negative FEXT is preceded by a positive FEXT, which elevates the waveform before 0.45·UI. Nevertheless, there is still a small range, between 0.6 V and 0.7 V, in which the rising bit pattern ("Agg: r101") provides the worst "1", which is partially due to the change of timing. Therefore, to fully estimate the eye margin covering all the regions, all four step responses in Fig. 5(a) should be considered.

(ii) In terms of the jitter, the signal timing lags with the two rising bit patterns on the aggressors (dashed blue and dotted red). This is because, when both the aggressors and the victim are rising, the even mode is dominant, and it is known that the even mode propagates more slowly than the odd mode because of its larger effective dielectric constant. Conversely, the two falling aggressor bit patterns make the victim lead compared with the case in which no crosstalk is involved. Since both leading and lagging induce jitter, all four step responses are needed in order to locate the worst-case jitter.

From the discussion above, it can be concluded that, in order to accurately estimate every region of an eye diagram, all step responses with the possible combinations of crosstalk patterns must be considered in the eye estimation. This means that a total of 16 step responses are required when considering 2nd order MER.

C. Superposition of MERs Considering Worst-case Crosstalk
As in the MER method, the long input bit pattern is decomposed into short N-bit patterns (N = 3 for 2nd order MER, matching three-bit labels such as r001) for both the victim and the aggressors. To include the worst-case crosstalk, all the aggressors switch in the same pattern. The system response is calculated by superimposing the time-shifted responses of the short N-bit patterns, which are selected from among 20 step responses. Besides the 16 step responses shown in Fig. 5, four other step responses, in which either the victim is quiet or the aggressors are quiet, are required.

Fig. 6 shows the eye diagram for a 1024-bit PRBS pattern using the proposed method and the HSPICE simulation. It shows that the proposed method effectively accounts for the nonlinear crosstalk effect and is much more accurate than the linear superposition result shown in Fig. 1(c) and (d). The proposed method closely approximates the HSPICE simulation in terms of both the eye aperture size and the jitter.

Table 1 compares the simulation time and the capability to capture the worst-case eye diagram for the different methods. Without a targeted victim, it is very difficult to capture the worst-case scenario of the eye. In contrast, our proposed method can readily capture the worst-case eye based on a targeted victim and saves about 80% of the simulation time compared with HSPICE, even when using a single processor. With 4 processors, our proposed method provides a nearly ×20 simulation acceleration compared with the traditional HSPICE simulation, which is one of the main advantages of the proposed method.
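The pattern-selected superposition described in Section III.C can be illustrated with a small sketch. This is an illustration under stated assumptions, not the authors' implementation. The function names and the dictionary layout are hypothetical. All aggressors are assumed to share one bit sequence (the worst-case condition). Each stored response is assumed to be a delta waveform, i.e., the change of the victim output relative to its level before the edge. The edge at index k is taken as the change from bit k-1 to bit k, which may differ from the index convention of the equation in Section I.

```python
from typing import Dict, List, Tuple

EdgeKey = Tuple[str, str]  # (victim label, aggressor label); hypothetical layout


def edge_label(bits: List[int], k: int, order: int) -> str:
    """Label the edge into bit k, e.g. 'r001' or 'f110'; 'quiet' if bit k repeats bit k-1."""
    if bits[k] == bits[k - 1]:
        return "quiet"
    window = bits[k - order:k + 1]  # pre-bits followed by the new bit
    return ("r" if bits[k] == 1 else "f") + "".join(str(b) for b in window)


def superpose(victim: List[int], aggressor: List[int],
              responses: Dict[EdgeKey, List[float]],
              order: int, samples_per_ui: int) -> List[float]:
    """Victim waveform built from time-shifted responses, starting at bit index `order`."""
    n_out = (len(victim) - order) * samples_per_ui
    # Assumption: normalized logic levels, so the starting level equals the previous bit.
    wave = [float(victim[order - 1])] * n_out
    for k in range(order, len(victim)):
        key = (edge_label(victim, k, order), edge_label(aggressor, k, order))
        if key == ("quiet", "quiet"):
            continue  # nothing switches, so no response is added
        delta = responses[key]
        start = (k - order) * samples_per_ui
        for i in range(start, n_out):
            j = i - start
            # Once the stored samples run out, hold the settled value.
            wave[i] += delta[j] if j < len(delta) else delta[-1]
    return wave


if __name__ == "__main__":
    # Idealized stand-in data: instantaneous unit steps with no crosstalk coupling.
    labels = ["quiet", "r001", "r101", "f110", "f010"]
    step = {"r": 1.0, "f": -1.0, "q": 0.0}
    ideal = {(v, a): [step[v[0]]] * 4 for v in labels for a in labels}
    print(superpose([0, 0, 1, 1, 0, 1], [0, 0, 0, 1, 1, 0], ideal, 2, 4))
```

With the idealized responses above, the result simply reproduces the victim bits from index 2 onward, which is a convenient sanity check. In practice, each dictionary entry would hold a simulated victim response for that joint victim/aggressor pattern, so the crosstalk nonlinearity enters through the selected response rather than through linear addition of separate victim and aggressor responses.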
Fig. 6 Comparison of simulated eye diagrams (voltage in V versus time in UI) for a 1024-bit PRBS pattern with crosstalk, between the modified 2nd order MER method (in black) and the long input bit pattern in HSPICE (in red).

TABLE 1 COMPARISON OF SIMULATION TIME FOR A 3-BITS CHANNEL

Method                                                          Capture worst case    Simulation time
HSPICE w/o targeted victim                                      Very Difficult        NA
HSPICE w/ targeted victim                                       Difficult             48 mins
Proposed method (single processor), w/ targeted victim          Easy                  10 mins
Proposed method (4 processors, parallel), w/ targeted victim    Easy                  2.5 mins

REFERENCES
[1] B. K. Casper, M. Haycock, and R. Mooney, "An accurate and efficient analysis method for multi-Gb/s chip-to-chip signaling schemes," in 2002 Symposium on VLSI Circuits, 2002, pp. 54-57.
[2] J. Ren and K. S. Oh, "Multiple edge responses for fast and accurate system simulations," IEEE Transactions on Advanced Packaging, vol. 31, pp. 741-748, 2008.
[3] D. Oh, "Multiple edge responses for fast and accurate system simulations," in 2006 IEEE Electrical Performance of Electronic Packaging, 2006, pp. 163-166.
[4] C.-C. Chou, H.-H. Chuang, T.-L. Wu, S.-H. Weng, and C.-K. Cheng, "Eye prediction of digital driver with power distribution network noise," in 2012 IEEE 21st Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS), 2012, pp. 131-134.
