
Variation Library Application Note

What Is a Variation Library?


A variation library is just like a standard cell library, but with additional
information to model the effects of process parameters. A traditional standard
cell library contains information that allows the cell behavior to be modeled as a
function of input slew and output load for a timing arc:

Figure 1: Normal ("Deterministic") Library Cell Model

This information allows the cell's behavior to be modeled as a function of its context, but
only at a single process condition.

A variation library is simply an extension of a regular library that allows the cell behavior to
be additionally modeled across a variety of process conditions. This is done by enhancing the
cell model to be a function of one or more additional process parameters (represented below
by parameters A, B, and C):

Figure 2: Variation Library Cell Model

The variation library itself says nothing about how the process parameters vary -- whether
they vary normally, uniformly, or across what range. This additional information comes from
the analysis tool, PrimeTime VX. The figure below shows the relationship between the
analysis tool and the variation library:
Figure 3: Analysis Tool and Variation Library

Here we can see that PrimeTime VX models the distributions of the process parameters.
Using these parameter distributions (the input distributions), PrimeTime VX can plug
parameter values into the variation library's cell model, observe the timing responses under
a variety of conditions, and determine the variation timing response (the output response) of
the logic to the process variation.
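Conceptually, this flow can be sketched in a few lines. Everything here is illustrative: the linear delay model, the sensitivity values, and the standard-normal input distributions are assumptions for the sketch, not PrimeTime VX internals.

```python
import random

# Hypothetical linear cell delay model (ps) as a function of two global
# process parameters A and B, each expressed in sigma units.  The nominal
# delay and the sensitivities are made-up numbers for illustration.
def cell_delay(a, b):
    return 100.0 + 5.0 * a + 3.0 * b

# The analysis tool supplies the input distributions; here both A and B
# are standard normal.  Sampling them and evaluating the cell model
# yields the output response of the logic to the process variation.
random.seed(0)
samples = [cell_delay(random.gauss(0, 1), random.gauss(0, 1))
           for _ in range(10000)]

mean = sum(samples) / len(samples)
sigma = (sum((d - mean) ** 2 for d in samples) / len(samples)) ** 0.5
print(f"mean delay ~ {mean:.1f} ps, sigma ~ {sigma:.1f} ps")
```

With independent parameters, the output sigma approaches the root-sum-square of the individual sensitivities (here, sqrt(5² + 3²) ≈ 5.8 ps).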

Types of Variation
The autocorrelation of a parameter describes the correlation of the parameter
values within a given die. Parameters with an autocorrelation of 1 will always
share the same (fully correlated) parameter value across objects in the die, and
parameters with an autocorrelation of 0 will have independent (uncorrelated)
parameter values.

For the purposes of our discussion, process parameters can be categorized into two major
types based on their autocorrelation characteristics:

• inter-die (global) parameters - autocorrelation of 1, all objects in the
die share the same parameter value
• intra-die (local) parameters - autocorrelation of 0, all objects in the die
have independent parameter values

To help understand this more intuitively, we can visually show how PrimeTime VX plugs
process values into the cell instance models for both types of parameters. Parameters A and
B are inter-die (global) parameters, and parameter C is an intra-die (local) parameter:
Figure 4: Autocorrelation, or Correlation Across Objects on a Die

From the figure above, we can see that parameter values for the inter-die variations are
shared by all cells in the die. We can also see that PrimeTime VX applies a unique
distribution of parameter values to each cell for parameter C. In reality, parameter C
represents not only the variation of every individual cell in the die, but also the variation of
every individual transistor within those cells. This raises the question of how a single
parameter per cell can model the variation of all transistors within each cell, and we'll
explore how this is done in the section entitled "Modeling Local Variation".

Modeling Global Variation


Let's say we wish to build a variation model for our two global variation
parameters, A and B. This model should allow us to predict some cell behavior of
interest (arc delay, output transition, receiver model value, setup/hold constraint
arc, DRC check value, and so on), given any values of parameters A and B.

First, we know that if we set both parameters to their nominal values and vary each
parameter through its possible range, we will get some change in cell simulation behavior as
a function of each parameter. These "behavior versus parameter value" curves are shown in
Figure 5:
Figure 5: Cell Simulation Behaviors as a Function of Each Parameter

To build the model, first the cell behavior is captured at nominal process conditions (both A
and B set to their nominal values). This nominal behavior is the baseline cell behavior stored
in the library. On our behavior-versus-parameter curves, this single nominal library provides
us with the following characterization points:

Figure 6: Nominal Baseline Cell Behaviors

The point on each of these graphs actually represents a complete set of characterized library
data, captured under the X-axis process conditions.

Then, for each parameter, the cell behavior is also captured at two additional points, one on
either side of nominal. To capture these points, all other parameters are kept at their
nominal values, and only the parameter of interest is varied away from nominal to capture
the additional characterization points:
Figure 7: Non-Nominal Cell Behaviors

This allows us to isolate the effects of each individual parameter, with all other parameters
held at nominal.

We can then use the three characterization points in each graph to build a piecewise-linear
model (consisting of two line segments) to approximate the real process behavior across
each parameter's range of interest. By evaluating the behavior at the characterized points
and interpolating (or extrapolating) as needed using the two linear segments, the effect of
each individual parameter can be predicted throughout its range.
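A minimal sketch of such a two-segment model, using hypothetical delay values characterized at -3σ, nominal, and +3σ:

```python
def pwl_model(p, lo, nom, hi, p_lo=-3.0, p_nom=0.0, p_hi=3.0):
    """Two-segment piecewise-linear model of one cell behavior versus one
    parameter, built from the three characterization points
    (p_lo, lo), (p_nom, nom), (p_hi, hi).  Values outside the characterized
    range are linearly extrapolated from the nearer segment."""
    if p <= p_nom:
        slope = (nom - lo) / (p_nom - p_lo)
    else:
        slope = (hi - nom) / (p_hi - p_nom)
    return nom + slope * (p - p_nom)

# Hypothetical delays (ps) characterized at -3, 0, +3 sigma of parameter A.
# Note the asymmetric response, which a single straight line could not capture.
print(pwl_model(1.5, lo=94.0, nom=100.0, hi=109.0))   # interpolated: 104.5
print(pwl_model(-1.0, lo=94.0, nom=100.0, hi=109.0))  # interpolated: 98.0
```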

For N global parameters, the total number of required characterization points is 2N+1 -- two
non-nominal points for each parameter of interest, plus the shared nominal baseline point. If
we consider the N-dimensional process variation space created by our N parameters, these
2N+1 points lie along orthogonal axes in the space. For our two parameters of interest
above, we can visually show the characterization points in their process space as follows:

Figure 8: N-dimensional Process Space for Parameters A and B

Now we have models which allow us to accurately predict the effect of each parameter
varying alone. How can we predict what happens when the parameters vary together? Since
the parameters are independent, we can apply the property of superposition to combine
their effects. Superposition is simply a fancy way of saying that we determine the
contribution of each parameter separately, then add them together.
Let's say we vary A and B as shown, and individually measure their contributions to variation
away from nominal:

Figure 9: Measuring Individual Parameter Variation Contributions

Once we have determined how much variation away from the nominal behavior each
parameter has contributed, we can apply superposition and add up these contributions to
determine the total amount of variation away from nominal:

Figure 10: Using Superposition to Combine Parameter Variation Contributions
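Numerically, the superposition step is just an addition of the measured contributions. The values below are hypothetical:

```python
# Hypothetical values, in ps.  The contributions are measured by varying
# one parameter away from nominal while holding the other at nominal,
# and recording the change in behavior.
nominal = 100.0   # delay with A and B both at nominal
delta_a = +4.5    # contribution from varying A alone
delta_b = -2.0    # contribution from varying B alone

# Superposition: because A and B are independent, the combined behavior is
# the nominal value plus the sum of the individual contributions.
total = nominal + delta_a + delta_b
print(total)  # 102.5
```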

Modeling Local Variation


So far we have avoided discussing one subtle but tricky detail. A parameter's
autocorrelation actually describes the correlation between different transistors
across the die, not cells. Transistors are the fundamental building blocks of
digital logic, and standard cells are simply arbitrary wrappers around different
numbers of transistors that happen to perform functions we find useful. With
global variation, this distinction between cells and transistors was not important.
If all cells in the die share the same parameter value during analysis, and all
transistors in the cell shared the same value during characterization, then by
extension all transistors in the die have the same value during analysis:
Figure 11: Variation as a Function of Global Transistor Parameters

But what about local variation? We define local variation as the truly independent variation
of each transistor. (Note that this effect is different from distance-based spatial variation,
which is beyond the scope of this application note.) Let's say each transistor has two local
variation parameters, K and L. Most process models actually use separate sets of parameters
for the NMOS and PMOS transistors, but we'll use the same name across transistor types for
simplicity. If we take the "global variation" approach of modeling the cell's behavior as a
function of every independent source of variation, then each cell would need to be
characterized as a function of every parameter of every transistor inside that cell:

Figure 12: Variation as a Function of Local Transistor Parameters

This would drive up the cost of characterization tremendously! Furthermore, the number of
parameters would explode with the size and complexity of the cell and the number of
transistor parameters. For local variation, a different approach is needed.

Fortunately, the Liberty NCX characterization tool provides a solution by
allowing us to model the cell's behavior by its output side (observed
behaviors) instead of by its input side
(the process parameters that affect the transistors). To understand this, imagine that we
place the cell into the simulator and allow every parameter of every transistor to vary
according to its own independent distribution. If we perform a Monte Carlo simulation and
allow the cell to vary naturally according to its multiple independent input distributions, we
will also observe some Gaussian (or near-Gaussian) distribution of outcomes:

Figure 13: Varying Cell's Input Side to Observe Output Side
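This Monte Carlo idea can be sketched as follows. The four-transistor cell, the per-transistor sensitivities, and the delay model are all made up for illustration:

```python
import random

random.seed(2)

# Hypothetical delay model for a four-transistor cell: each transistor
# contributes through its own local parameters K and L.  The sensitivities
# (0.8 and 0.5 ps per sigma) are made-up numbers.
def cell_delay(transistors):
    return 100.0 + sum(0.8 * k + 0.5 * l for k, l in transistors)

# Monte Carlo: draw independent K and L values for every transistor and
# observe the resulting distribution of cell delays.
samples = []
for _ in range(20000):
    transistors = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
    samples.append(cell_delay(transistors))

mean = sum(samples) / len(samples)
sigma = (sum((d - mean) ** 2 for d in samples) / len(samples)) ** 0.5

# The observed +/-1 sigma behaviors become the characterization points
# for the synthetic local-variation parameter.
print(f"-1 sigma ~ {mean - sigma:.1f} ps, +1 sigma ~ {mean + sigma:.1f} ps")
```

The many independent per-transistor inputs produce a single near-Gaussian output distribution, which is what the synthetic parameter captures.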

The key to building the model is to apply some clever analysis techniques to the outcomes,
and derive a representative +1σ (sigma) behavior and a -1σ behavior from them. These ±1σ
outcomes are then stored as characterization points for a new synthetic parameter, which
we'll call "parameter C":

Figure 14: Constructing Synthetic Parameter Characterization Points from Outcomes

Parameter C is then assigned a normal distribution so that a +1σ value of
parameter C results in the +1σ output behavior, and a -1σ value of parameter C
results in the -1σ output
behavior. By constructing such a model based on the output behavior, we can reproduce the
natural set of outcomes for the cell (automatically aggregating the effects of all individual
transistors and transistor parameters) as parameter C varies normally.
Of course, it would be impractically expensive to run a full Monte Carlo simulation for every
table point of every timing arc. The Monte Carlo simulation concept is a good way to explain
the approach, but Liberty NCX employs some more complex analysis algorithms that allow it
to build the cell model based on the individual transistor contributions. In addition, it handles
the interactions and relationships between the different types of behaviors that the model
represents (such as delay and slew).

Types of Variation Libraries


So far, we have simply been talking about our variation models as data stored in
characterization points. Somehow, we must store these characterization points
as library files. There are two primary types of variation libraries:

• unified library - All 2N+1 characterization points are stored in a single CCS
library file, using Liberty variation constructs
• separate libraries - The data for each of the 2N+1 characterization points is
stored in a separate conventional (deterministic) library

Below is an example of a unified variation library. Only two parameters, A and B, are shown
to keep the examples simple:

/* unified variation library for parameters A and B */


library ("lib") {
...
va_parameters("A", "B");
...
cell ("INV") {
pin (a) {...}
pin (y) {
timing () {
/* all parameters nominal */
compact_ccs_rise (...) {...}
compact_ccs_fall (...) {...}
receiver_capacitance1_rise (...) {...}
receiver_capacitance1_fall (...) {...}
receiver_capacitance2_rise (...) {...}
receiver_capacitance2_fall (...) {...}

timing_based_variation () {
va_parameters("A", "B");
nominal_va_values(0, 0);

/* A = -3 */
va_compact_ccs_rise (...) {va_values(-3, 0); ...}
va_compact_ccs_fall (...) {va_values(-3, 0); ...}
va_receiver_capacitance1_rise (...) {va_values(-3, 0); ...}
va_receiver_capacitance1_fall (...) {va_values(-3, 0); ...}
va_receiver_capacitance2_rise (...) {va_values(-3, 0); ...}
va_receiver_capacitance2_fall (...) {va_values(-3, 0); ...}

/* A = +3 */
va_compact_ccs_rise (...) {va_values(+3, 0); ...}
va_compact_ccs_fall (...) {va_values(+3, 0); ...}
va_receiver_capacitance1_rise (...) {va_values(+3, 0); ...}
va_receiver_capacitance1_fall (...) {va_values(+3, 0); ...}
va_receiver_capacitance2_rise (...) {va_values(+3, 0); ...}
va_receiver_capacitance2_fall (...) {va_values(+3, 0); ...}

/* B = -3 */
va_compact_ccs_rise (...) {va_values(0, -3); ...}
va_compact_ccs_fall (...) {va_values(0, -3); ...}
va_receiver_capacitance1_rise (...) {va_values(0, -3); ...}
va_receiver_capacitance1_fall (...) {va_values(0, -3); ...}
va_receiver_capacitance2_rise (...) {va_values(0, -3); ...}
va_receiver_capacitance2_fall (...) {va_values(0, -3); ...}

/* B = +3 */
va_compact_ccs_rise (...) {va_values(0, +3); ...}
va_compact_ccs_fall (...) {va_values(0, +3); ...}
va_receiver_capacitance1_rise (...) {va_values(0, +3); ...}
va_receiver_capacitance1_fall (...) {va_values(0, +3); ...}
va_receiver_capacitance2_rise (...) {va_values(0, +3); ...}
va_receiver_capacitance2_fall (...) {va_values(0, +3); ...}
}
}
}
}
}

Figure 15: Unified Variation Library (Source Code)

Inside each timing arc, the nominal timing data is stored in the usual way. In addition to the
nominal timing, Liberty variation constructs beginning with va_* are used to capture the
non-nominal timings. Each set of non-nominal timing data has a va_values entry that
specifies the parameter vector used to capture the timing, which corresponds directly to one
of the characterization points shown in Figure 8.

Unified variation libraries always use compact CCS timing constructs (compact_ccs_*) to
efficiently store the CCS timing data. In an uncompacted CCS library, the current response
waveforms are stored as I-versus-t (current versus time) response curves. Storing complete
current waveforms for every table point can result in a significant amount of library data. In
a compact CCS library, each response is instead stored as I-versus-V (current versus
voltage). The advantage of such I/V response curves is that they are often very similar in
shape, differing only in their size and positioning. When storing each new I/V response
curve, a compact CCS library contains a set of full I/V base curves and stores each
current response as a geometric transformation of one of those base curves. (The
geometric transformations include things like time-to-activity, magnitude of current
response, and duration of current response.) Since storing the curve as a geometric
transformation requires only a few numbers instead of a full curve, the library space
requirements are significantly reduced.
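The base-curve idea can be sketched as follows. The curve shape, the transformation fields, and the function name are all illustrative; the actual Liberty compact CCS encoding differs.

```python
# Shared base curve: a normalized current response sampled at evenly
# spaced normalized-voltage points (shape is made up for illustration).
BASE_CURVE = [0.0, 0.3, 0.7, 1.0, 0.7, 0.3, 0.0]

def expand(peak_current, v_span, v_start):
    """Reconstruct a full I/V response from three scalars plus the shared
    base curve: scale the voltage axis to [v_start, v_start + v_span] and
    scale the current axis by the peak current."""
    n = len(BASE_CURVE)
    return [(v_start + v_span * i / (n - 1),   # voltage point (V)
             peak_current * BASE_CURVE[i])     # scaled current (A)
            for i in range(n)]

# One table point stores just a few numbers instead of a full waveform:
response = expand(peak_current=2.4e-3, v_span=0.9, v_start=0.05)
print(response[3])  # the peak of the reconstructed response
```

Storing three scalars per table point instead of a sampled waveform is where the space savings come from.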

When using separate libraries, 2N+1 individual conventional (non-merged) libraries are
required:

/* Parameter B +3sigma library */


library ("lib") {
...
cell ("INV") {
pin (a) {...}
pin (y) {
timing () {
compact_ccs_rise (...) {...}
compact_ccs_fall (...) {...}
receiver_capacitance1_rise (...) {...}
receiver_capacitance1_fall (...) {...}
receiver_capacitance2_rise (...) {...}
receiver_capacitance2_fall (...) {...}
}
}
}
}
/* Parameter A -3sigma library */
library ("lib") {
...
cell ("INV") {
pin (a) {...}
pin (y) {
timing () {
compact_ccs_rise (...) {...}
compact_ccs_fall (...) {...}
receiver_capacitance1_rise (...) {...}
receiver_capacitance1_fall (...) {...}
receiver_capacitance2_rise (...) {...}
receiver_capacitance2_fall (...) {...}
}
}
}
}

/* Nominal library */
library ("lib") {
...
cell ("INV") {
pin (a) {...}
pin (y) {
timing () {
compact_ccs_rise (...) {...}
compact_ccs_fall (...) {...}
receiver_capacitance1_rise (...) {...}
receiver_capacitance1_fall (...) {...}
receiver_capacitance2_rise (...) {...}
receiver_capacitance2_fall (...) {...}
}
}
}
}

/* Parameter A +3sigma library */
library ("lib") {
...
cell ("INV") {
pin (a) {...}
pin (y) {
timing () {
compact_ccs_rise (...) {...}
compact_ccs_fall (...) {...}
receiver_capacitance1_rise (...) {...}
receiver_capacitance1_fall (...) {...}
receiver_capacitance2_rise (...) {...}
receiver_capacitance2_fall (...) {...}
}
}
}
}
/* Parameter B -3sigma library */
library ("lib") {
...
cell ("INV") {
pin (a) {...}
pin (y) {
timing () {
compact_ccs_rise (...) {...}
compact_ccs_fall (...) {...}
receiver_capacitance1_rise (...) {...}
receiver_capacitance1_fall (...) {...}
receiver_capacitance2_rise (...) {...}
receiver_capacitance2_fall (...) {...}
}
}
}
}

Figure 16: Separate Libraries (Source Code)

Again, the separate libraries correspond directly to the characterization points in Figure 8.
These individual libraries can be in NLDM, CCS, or compact CCS format. The requirements
for the library set are:

1. All libraries must be in the same format (all CCS, all NLDM, or all compact
CCS).
2. All libraries must have the same type of timing data -- the same cells,
same timing arcs, same conditionals, and so on. These are the same library
compatibility rules enforced by the CCS scaling group command,
define_scaling_group. The contents must be compatible but the ordering can
change.

Note that in both the unified and separate library forms, the timing variation data
is stored as complete sets of library timing data rather than as simple scalar
sensitivities ("k-factor" of value change per change in parameter value). This
form of library data ensures that the timing response to each variation parameter
is modeled as accurately as possible. As each parameter varies, the full power of
library timing modeling is available to determine how all of the different aspects
of timing are affected:

• cell arc drive capability


• crosstalk delta delay behavior
• crosstalk noise bump behavior
• receiver model behaviors
• Miller effect characteristics
• effective capacitance
When the variation timing is kept as full detailed library data, it can always be simplified
down to sensitivities as needed for less demanding tasks (such as reporting simple library
cell sensitivities). However, keeping the full detailed response information in the library
(rather than simplifying the data down to sensitivities) makes the detailed information
available for more demanding tasks such as statistical static timing analysis (SSTA) delay
calculation.
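The simplification direction (full data down to a scalar sensitivity) can be sketched with hypothetical delay values:

```python
# Hypothetical delays (ps) at the -3, 0, +3 sigma characterization points
# of one global parameter, taken from the full library data.
delay_m3, delay_nom, delay_p3 = 94.0, 100.0, 109.0

# A simple scalar sensitivity ("k-factor"), in ps per sigma, averaged
# across the characterized range -- adequate for simple reporting tasks.
k_factor = (delay_p3 - delay_m3) / 6.0
print(f"k-factor ~ {k_factor:.2f} ps/sigma")

# Note what the scalar hides: the true response is asymmetric
# (+9 ps above nominal but only -6 ps below), detail that the full
# library data preserves for SSTA delay calculation.
```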

Limitations of the Synthetic Parameter


In the local variation section, we saw how we were able to take an arbitrary
number of independent per-transistor parameters, and aggregate them into a
single synthetic parameter that models the possible outcomes of the cell
behavior. The question that often next arises is, why wouldn't we want to do the
same thing with global variation, and aggregate multiple global parameters
together into a single synthetic global parameter?

This is not an easy question to answer, but we can understand it by taking a look at a
fictional example and performing some step-by-step reasoning. First, let's say we have a
library with two buffer cells, BUFA and BUFB. These buffer cells are special magic buffer cells
whose timing does not vary with input slew or output load, but only with the process
conditions. This process model has two independent global variation parameters, A and B.

As it happens, BUFA is affected only by parameter A:

Figure 15: BUFA Response to Parameters A and B

and BUFB is affected only by parameter B:


Figure 16: BUFB Response to Parameters A and B

We decide that two global parameters are too many, and we would like to reduce them down
to a single synthetic global parameter. To do this, first we vary parameters A and B
simultaneously but independently, observe their combined effect on BUFA and BUFB, and
capture their statistical responses as a function of a synthetic global parameter G:

Figure 17: BUFA and BUFB Responses to Synthetic Outcome Parameter G

We can see that parameter A affects BUFA such that the standard deviation in delay
outcomes is 1. We also see that parameter B similarly affects BUFB such that its standard
deviation of delay outcomes is 1. As a result, both buffers have identical responses to the
single synthetic parameter G.

Now, let's put aside our libraries and models and statistical analyses for a moment, and
imagine how the real physical timing would work for two BUFA cells in a row:

Figure 18: Physical Endpoint Arrival Behavior of BUFA + BUFA Chain

Since BUFA cells are affected only by parameter A and not by input slew or output load, we
know that at every point in the N-dimensional process space, the two BUFA cells will have
identical delays. In other words, since they are affected by the same physical process
parameter, the delays of the two BUFA cells are fully correlated. Statistics tells us that
when two fully correlated normal distributions are added together, the standard deviation
of the result is the sum of the standard deviations of the addends (that is, the quantities
being added). As a result, the standard deviation of our endpoint arrival is 2.
Figure 19: Physical Endpoint Arrival Behavior of BUFA + BUFB Chain

Now, BUFA is affected only by parameter A, and BUFB only by parameter B. We
also know that the two parameters are statistically independent, which means that the cell
delays of the two buffers are fully uncorrelated. Statistics tells us that when two fully
uncorrelated normal distributions are added together, the standard deviation of the result is
the root-sum-square of the standard deviations of the addends. As a result, the standard
deviation of our endpoint arrival is sqrt(2) due to the statistical cancellation effects of A and
B.
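The two chain behaviors can be checked numerically. The buffer delay model (nominal 10, sigma 1 per buffer) is hypothetical:

```python
import random

random.seed(3)
N = 50000
corr_arrivals, uncorr_arrivals = [], []
for _ in range(N):
    a = random.gauss(0, 1)   # global parameter A for this die
    b = random.gauss(0, 1)   # global parameter B for this die
    # BUFA + BUFA: both delays driven by the same parameter A.
    corr_arrivals.append((10 + a) + (10 + a))
    # BUFA + BUFB: delays driven by independent parameters A and B.
    uncorr_arrivals.append((10 + a) + (10 + b))

def stddev(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

print(f"BUFA + BUFA sigma ~ {stddev(corr_arrivals):.2f}")    # ~ 1 + 1 = 2
print(f"BUFA + BUFB sigma ~ {stddev(uncorr_arrivals):.2f}")  # ~ sqrt(2)
```

The fully correlated chain sums the sigmas; the independent chain combines them root-sum-square, exactly as the text describes.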

Now, let's get back to our libraries and models and statistical analyses using our synthetic
parameter G. When we analyze the timing of our BUFA + BUFA chain, we feed the synthetic
global parameter G into both cells. Since both cells have the same synthetic parameter
response with a standard deviation of 1, their delays in our analysis are fully correlated and
their standard deviations sum together to yield the correct response:

Figure 20: Synthetic Parameter G Arrival Behavior of BUFA + BUFA Chain

Next, we consider the BUFA + BUFB chain case (and with any luck, you've already gotten to
the conclusion). We know that the responses of both BUFA and BUFB to synthetic parameter
G have a standard deviation of 1 since their responses to their respective dominant
parameters A and B were also identical. As a result, once again their delays in our analysis
are fully correlated and the endpoint arrival distribution has a standard deviation of 2.
However, this does not match the real physical behavior of this chain, where the buffer
delays are fully uncorrelated and the endpoint arrival actually has a standard deviation of
sqrt(2)!
Figure 21: Synthetic Parameter G Arrival Behavior of BUFA + BUFB Chain

We know we have reduced all of the independent variation parameters down to a single
global synthetic parameter. We also know that the synthetic parameter response for any
individual cell model is correct and properly models the statistical cancellation between the
original independent global parameters. The problem lies in the fact that the cell outcomes
for global parameters are also correlated across multiple cells on the die. When the global
parameters are kept separate, the independent effects on each cell are maintained and the
resulting statistical cancellation across cells is computed. When global parameters are
reduced down to a synthetic parameter, their independent effects on each cell are lost and
the cancellation can no longer be modeled.

To put it more succinctly, a single global synthetic parameter cannot be used to model
multiple global variation parameters when:

1. the variation parameters are independent (or have some degree of
independence)
2. different cells (and their timing arcs) are affected by the parameters in
different ways

Although the case above is contrived to prove a point, it is relevant to real library modeling.
In addition to different cells being affected in different ways, there are other types of arc-to-
arc differences. For some global parameters, rise behaviors may be affected differently from
fall behaviors. For other parameters, arc delays may be affected differently from arc output
slews. Both of these effects have been observed in real process models. As a result of these
types of differing parameter behaviors, the parameters must be kept separate so that the
resulting cell-to-cell interactions and cancellations can be modeled.

Why is this not a problem for local variation? For local variation parameters, every cell (and
in fact every transistor) varies independently as we have seen in Figure 12. As a result,
there is no parameter correlation to maintain across cells, and each cell varies
independently. Therefore, it does not matter whether we create the cell model with multiple
physical parameters or a single synthetic parameter. The only requirement for accurate local
variation modeling is that each cell's model properly reproduces the cell's timing outcomes.

Choosing Global Parameter Characterization Points

When characterizing a library for global variation, characterization is performed
at the parameter's nominal value as well as a value on either side of nominal.
However, what non-nominal values should be used for characterization?

It is usually best to characterize the global variation parameters at values at or near the
most extreme parameter values which are reasonably likely to occur. For normally-
distributed parameters, the ±3σ parameter values are typically used.

There are two primary reasons for using characterization points at the extreme ends of the
parameter range.

• Interpolation vs. Extrapolation Error

When creating a linear model by interpolating the timing outcomes between two
points, the delay calculation error at any point between those two points can only be
somewhere between the errors of either point. But when extrapolating, the resulting
delay calculation error can be magnified.

Consider the following global variation parameter which has been characterized at
±3σ. For the sake of example, let's assume a worst-case error configuration:

o the real simulation response curve to the parameter variation is
concave-downwards
o delay calculation at nominal yields -1% error versus the simulation
timing
o delay calculation at the non-nominal characterization points yields
+1% error

Figure 22: ±3σ Characterization Points

If we interpolate the timing outcome at any point within the ±3σ parameter range,
the error can only fall somewhere between -1% and +1%.

Now consider the same library, but characterized at ±1σ:


Figure 23: ±1σ Characterization Points

Again, delay calculation error at the nominal point is -1%, and delay calculation error
at the non-nominal characterization points is +1%. Due to the non-nominal points
being within the parameter range, any delay calculation beyond the ±1σ
characterization points requires linear extrapolation instead of interpolation. When
extrapolation past the characterization points is used, the delay calculation error at
the characterization points can be magnified by the extrapolation process. This error
magnification effect depends on the relative error magnitudes at the nominal and
non-nominal points.
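The magnification effect can be reproduced numerically with a hypothetical concave-down response and the worst-case error assumptions above (-1% at nominal, +1% at the ±1σ characterization points):

```python
def true_delay(p):
    # Hypothetical simulated delay (ps) versus parameter value (sigma):
    # a concave-downwards response.
    return 100.0 + 2.0 * p - 0.5 * p * p

# Linear model through the nominal and +1 sigma characterization points,
# deliberately carrying the worst-case errors described above.
model_nom = true_delay(0.0) * 0.99    # -1% error at nominal
model_char = true_delay(1.0) * 1.01   # +1% error at +1 sigma

def model_delay(p):
    # Slope per sigma is (model_char - model_nom) since the points are
    # one sigma apart.
    return model_nom + (model_char - model_nom) * p

errors = {}
for p in (1.0, 2.0, 3.0):
    errors[p] = 100.0 * (model_delay(p) - true_delay(p)) / true_delay(p)
    print(f"p = +{p:.0f} sigma: error {errors[p]:+.2f}%")
# The +1% error at the characterization point grows with every sigma of
# extrapolation beyond it.
```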

By choosing global parameter characterization points at the extremes, the
interpolation region is maximized and the potential for error magnification due
to extrapolation is minimized.

• Greater Accuracy at Extremes

One might argue that the above behavior is acceptable because the increased error is
likely to happen at the extremes, where the individual cell outcomes are most
improbable. However, recall that with global variation, the parameter value is shared
by all devices on the chip. With global variation parameters, all cells tend to get
slower or faster together.

In addition, it is these improbable outcomes that are usually of primary
interest! In variation analysis, the primary outcome of interest is the
quantile outcome, or the
worst-case outcome at the most conservative side of the curve. If we were looking to
find the slowest endpoint arrival from a chain of buffers, the slowest endpoint arrival
occurs when each of the buffers in the chain is exhibiting its slowest behavior:
Figure 24: +3σ Arrival Results From +3σ Global Buffer Delays

Since it is desirable to have the best accuracy at the extreme ends of the timing
variation curves (whether they are arrivals, slacks, or transitions), it is reasonable to
conclude that we should place the characterization points at the ends of the
parameter range, near the part of the curve which is most important to our analysis.

By placing the global parameter characterization points at the extremes of the
parameter range, the direct error for the extreme outcomes is reduced, and the
potential for error magnification due to extrapolation is minimized.
