1st Edition
A Greater Measure of Confidence
Foreword
As the dimensions of modern integrated circuits continue to shrink and the use
of innovative materials grows, device fabrication and parametric testing have become
more challenging with each passing year. Every device shrink, every process innova-
tion, and every new material has made the repeatability and volume of data produced
by parametric test more critical to process development and controlling modern fabs.
Today’s fabs must understand how to produce and characterize materials like high κ gate dielectrics and the low κ insulators used between conductive layers quickly and cost-effectively; tomorrow’s IC producers may well need to learn how to manufacture and
test transistors formed from carbon nanotubes or other technologies that researchers
have only begun to imagine.
This book offers a high level overview of how parallel parametric testing can help to-
day’s highly automated, 24/7 fabs maximize the throughput of their existing parametric
test hardware and reduce their cost of test. But parallel test has the potential to do much
more than help fabs reduce the cost of ownership for their parametric test equipment.
By extracting more data from every probe touchdown, parallel test offers fabs the flex-
ibility to choose whether they want to increase their wafer test throughput dramatically
or to use the time to acquire significantly more data, so they can gain greater insight into
their processes than ever before.
I hope you find this book as informative and enlightening as I have during the de-
velopment and review process. And best of luck with your parallel test implementation
journey!
Flavio Riva
Front-End Technology and Manufacturing
Advanced R&D—Non-Volatile Memory Technology Development
Laboratory & Parameter Testing—Group Leader
STMicroelectronics Srl
Section I
What is Parallel Parametric Test?
Introduction
The shortest, simplest definition of parallel parametric test is that it’s an emerging
strategy for wafer-level parametric testing that involves concurrent execution of multiple
tests on multiple scribe line test structures. It offers enormous potential for increasing
test throughput with existing test hardware.
The market pressure to minimize test times is the most powerful motivator driving
fabs to explore parallel testing. It offers a relatively inexpensive way to increase through-
put, thereby lowering the Cost of Ownership significantly. Just as important, parallel
testing can address the growing need to perform more tests on the same structures in
less time as device scaling increases the randomness of failures.
As of this writing (2006), the structures being tested in parallel are typically located
within a single Test Element Group (TEG). Even among leading-edge IC manufacturers,
very few have progressed to the point of testing structures in different TEGs simulta-
neously. Implementing this strategy involves using the tester’s controller to interleave
execution of the multiple tests in a way that maximizes the use of processing time and
test instrumentation capacity that would otherwise be standing idle. When the design
of the test structures allows, this “multi-threaded” approach to test sequencing reduces
the execution time for multiple tests on multiple structures to little more than the time
needed to execute the longest test in the sequence.
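The effect of this “multi-threaded” sequencing can be sketched in a few lines of Python. The test names and durations below are purely illustrative (not taken from any real TEG); each thread stands in for one connect-force-measure sequence:

```python
import threading
import time

# Hypothetical per-structure test times in seconds (illustrative only).
TESTS = {"vt_test": 0.05, "leakage_test": 0.20, "res_test": 0.03, "vf_test": 0.08}

def run_test(name, duration, results):
    # Stand-in for one connect-force-measure sequence on one structure.
    time.sleep(duration)
    results[name] = duration

def run_parallel(tests):
    """Run all tests concurrently; return results and elapsed wall time."""
    results = {}
    threads = [threading.Thread(target=run_test, args=(n, d, results))
               for n, d in tests.items()]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

if __name__ == "__main__":
    results, elapsed = run_parallel(TESTS)
    # Wall time is close to the longest single test (0.20 s here),
    # not the 0.36 s total a sequential run would need.
    print(f"parallel wall time: {elapsed:.2f} s")
```

As the section states, the total time collapses to little more than the longest test in the sequence, provided the structures can actually be driven independently.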
Parallel test vs. traditional sequential mode approach
To illustrate the throughput advantage that parallel testing offers, it may be helpful
to contrast it with the traditional approach to parametric test, in which each test in the
sequence must be completed before the next one can begin. The total test time for an
individual TEG is approximately the sum of the test times for the individual test devices,
plus any delays due to switching latencies, which can be significant.
Today’s parametric test systems can be equipped with up to eight Source-Measure
Units (SMUs), although most systems have fewer installed. However, for the sake of
argument, if a tester configured with eight SMUs was operated in sequential mode for
simple tests such as measuring a resistor (which requires one SMU for the two nodes),
then seven SMUs would be sitting idle. Parallel test increases utilization of both the
tester and prober and boosts throughput by measuring multiple devices simultaneously.
(Figure 1‑1 illustrates the difference between the amount of time required to perform a set of tests sequentially and the time required to perform the same tests in parallel.)
Devices tested in parallel may all be of the same type (homogeneous) or of different
types (heterogeneous). For example, two transistors, one resistor, and one diode could
potentially be measured independently and asynchronously by performing different
connect-force-measure sequences on all four devices simultaneously. Figures 1-2 and
1-3 illustrate the difference between how tests within a single TEG are tested in sequential mode and in parallel mode. Note how the parallel mode test sequence maximizes the use of the instrument resources available.

[Figure 1‑1: Timing diagram. In sequential mode, the four Conpin–ForceV–Delay–MeasI–Devint sequences run back to back (total time tseq); in true parallel mode they overlap, so the total time ttp is only a fraction of tseq.]

[Figures 1‑2 and 1‑3: The same transistor test lists (Vt, Res, Vf, leakage, Idoff, Idsat, Isub, BVdss) on the gate, source, drain, and substrate terminals, executed sequentially (Figure 1‑2) and in parallel (Figure 1‑3).]
Wafer-level parallel parametric test vs. parallel functional test
Although the concept of parallel test has been discussed extensively in the semicon-
ductor industry over the last few years, many of those discussions focus on parallel func-
tional testing of packaged components, rather than on wafer-level parallel parametric
test. For example, Keithley’s Model 4500-MTS Multi-Channel I-V Test System and Series
2600 System SourceMeter® Multi-Channel I-V Test Solutions can be applied to parallel
functional test applications. Keithley’s S680 Automatic Parametric Test System falls into the true parallel parametric test category.
Although both types of parallel testing use a similar testing strategy (i.e., the use of
multiple SMUs operating asynchronously to reduce total test time), there are obvious
differences. The most significant one, other than the size and cost of the test hardware
itself, is that functional tests of packaged devices are largely immune to the parasitic ca-
pacitances between devices under test that can interfere with parametric test accuracy,
whether those tests are performed sequentially or in parallel. Parallel functional testing with the Model 4500-MTS or the Series 2600 instruments also supports the use of channel groups for testing multiple devices and for eliminating further tests on failed devices, which parallel parametric test does not. Series 2600 instruments can also be grouped
through the use of multiple GPIB addresses. These groups of SMUs, each with a master
unit and one or more slaves, can execute tests in the same asynchronous, concurrent
fashion as the Model 4500-MTS’s channel groups.
The parallel test continuum
When discussing parallel test strategies, it’s important to remember that not all fabs (or even all test cells within the same fab) should necessarily implement it in the same way.
Rather than a single approach to parallel test implementation, it’s more productive to
think of parallel test implementation as a continuum. The point on this continuum
that’s most appropriate for a particular fab or test cell will depend on a number of factors, including the manufacturing technology, the maturity of the production process and TEGs, and the product’s anticipated manufacturing lifespan (i.e., how long the fab will continue to produce it). The following paragraphs describe the end points and midpoint of
this implementation continuum:
• Picking the “low hanging fruit”: For many fabs or test cells working with
mature processes, this approach to parallel test will be the most attractive
because it involves changing only the test sequencing on existing TEGs. Typi-
cally, this would require analysis of both the TEG and the test sequence to
identify opportunities for reordering or regrouping existing tests on hetero-
geneous structures in a way that minimizes the time needed for switching
between test pads. A majority of the discussion in this book will focus on
this approach because it represents the fastest, surest way for fabs to achieve
significant throughput improvements with a relatively limited investment in
analysis effort, new software, and test sequence modifications.
• Doing the heavy lifting: This point in the continuum demands much more
extensive analysis of both the test sequence and the TEG itself because it
requires significant changes to both. A number of new reticles typically must
be designed, created, and validated to allow parallel testing of more struc-
tures within the TEG. This point in the parallel test continuum may also
require changes to the probe card design, as well as the installation of ad-
ditional source-measure instrumentation. While it’s important for prospec-
tive users to understand the expense and time required at this point in the
continuum, for many fabs, the throughput gains parallel test makes possible
may justify the effort.
• Plowing the “green field”: During technology development for new prod-
ucts, it’s relatively inexpensive to design the new TEGs in a way that maxi-
mizes the number of structures that can be tested in parallel. Given that
there are no existing reticles or test sequences that must be replaced, there’s
no existing testing process to disrupt. While this point in the continuum of-
fers the highest potential for payback in terms of throughput, it’s wiser not
to try implementing parallel test for the first time on a new product, when
there are many other priorities to consider while trying to ramp up produc-
tion. Instead, the knowledge gained from first implementing parallel test
on mature processes using the “low hanging fruit” approach can be applied
to the process of implementing it on new products later. Parametric test
vendors can also provide enormous assistance by reviewing test structures
and algorithms, which may make it possible to ramp parallel test technology
significantly faster.
Weighing the advantages of parallel test
Parallel parametric test offers a variety of advantages over traditional sequential
parametric testing:
• Cost of ownership advantages. The most obvious advantage of parallel
test is its impact on the cost of ownership (COO) of the parametric test sys-
tem on which it is implemented. The largest “lever” on the cost of ownership
for a process or metrology tool is system throughput; therefore, by increas-
ing throughput, parallel test decreases the system’s cost of ownership. Users
have documented throughput increases due to parallel test ranging from
Section II
The Parallel Test Implementation Process
Introduction
Making the transition from strictly sequential parametric test to the use of parallel
test techniques can appear daunting, even to experienced parametric test engineers.
The best way to approach this challenge is to break the process down into a number of
smaller, more do-able phases, as outlined in Figure 2-1. It’s also important to remember
that parallel test doesn’t necessarily demand test structure modifications or developing
new structures for new processes—there’s plenty of potential for reducing test times or
increasing the number of parameters even when continuing to test existing structures.
Although parallel testing of legacy structures will be touched on in this section, it will be
discussed in greater detail in Section 3.
Considerations Prior to Parallel Test Implementation
Parallel test is appropriate for use with virtually any solid-state technology—it’s just
as suitable for gallium arsenide processes as it is for mainstream silicon processes. There
are only a few minor caveats associated with selecting a process for parallel test:
• The test structure shouldn’t introduce instability into the measurement by
being tested in parallel with another structure. Structures with shared ter-
minals at any level, whether in the diffusion or the interconnect, have the
potential to produce skewed results. Unfortunately, these shared terminals
are fairly common in legacy structures—test structure designers often use
common pads in multiple DUTs to conserve space in crowded scribe lines.
See Section 3 for more information on parallel testing of existing scribe line
TEGs.
• Particularly for new device technologies, it’s essential to establish a mea-
surement baseline using sequential testing prior to implementing parallel
test. Variations in device performance are more common with new technolo-
gies than with existing ones. Given that one of the objectives of parametric
testing is to understand where the variations in the process are and then
to reduce them through the development process, it’s critical to establish
this sequential test baseline; parallel testing may introduce additional vari-
ations as a result of either tester timing or device interference. Without a
sequential test baseline for comparison, it’s impossible to distinguish be-
tween “new device” variations and “parallel test” variations. Fortunately,
pt_execute, Keithley’s unique toolset and coding method for parallel test, allows switching from sequential to parallel testing quickly and easily. It also manages test resource allocation. See Appendix A for more information on pt_execute.
• For new processes, the best time to turn parallel test “on” is at the beginning
of a volume ramp, because this point offers the greatest bang for the investment buck by reducing the number of testers needed once the product goes into volume production. Note: It is best to learn how to use parallel test on a mature process, not during ramp-up of a new process.

[Figure 2‑1: Implementation phases: pick a mature process for parallel test learning; review the sampling strategy to understand prober overhead; perform a feasibility study (Section 2).]
Identifying Throughput Limitations Imposed by the Prober Used
Before test engineers begin designing new structures or modifying test sequences
to implement parallel test, it’s critical that they consider everything that affects the throughput of the test cell as a whole, not just raw tester speed. Weighing the impact of any prober throughput limitations is an important first step in ensuring the implementation effort achieves the maximum potential test time reduction.

Figure 2‑2: Prober overhead can dilute the gains parallel test makes possible.

To review briefly, five different timing parameters affect prober throughput:
• First wafer load and align times (typically ~90 seconds)
• Site index time (typically ~600–700ms)
• Subsite index time (typically ~350ms)
• Wafer swap time (typically ~45 seconds)
• Last wafer unload (expel) time (typically ~30 seconds)
The typical times listed here don’t reflect any specific prober’s performance and are
offered simply to provide an indication of magnitude. Obviously, the exact times for
each parameter will depend on the mechanical design trade-offs that particular prober’s
manufacturer has chosen to make. When designing test structure sets for parallel test,
one of the typical goals has long been to minimize the number of subsite moves in
order to reduce the impact of subsite index time on throughput. However, all the prob-
er throughput parameters must be considered and factored into the test cell’s overall
throughput budget and return on parallel test implementation effort.
The gains achieved from implementing parallel test can be greatly diluted by prober overhead: the larger the prober time is relative to the test time, the more dilution occurs. In the example in Figure 2‑2, when the test time is equal to the prober time, a 2× improvement in test time yields only a 25% overall improvement. Still, on a heavily loaded test floor, this smaller gain may provide the needed ROI
to justify the implementation effort.
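The dilution arithmetic behind Figure 2‑2 is a direct application of Amdahl’s law, and can be sketched in a few lines (the function name is ours, not from the book):

```python
def overall_improvement(test_time, prober_time, test_speedup):
    """Overall test-cell improvement when only the test portion speeds up.

    Prober overhead (indexing, load/swap) is unaffected by parallel test,
    so it dilutes the gain -- an instance of Amdahl's law.
    """
    before = test_time + prober_time
    after = test_time / test_speedup + prober_time
    return (before - after) / before

# Figure 2-2's example: test time equal to prober time, 2x test speedup.
gain = overall_improvement(test_time=1.0, prober_time=1.0, test_speedup=2.0)
print(f"overall improvement: {gain:.0%}")  # prints "overall improvement: 25%"
```

With equal test and prober times, halving the test time removes only a quarter of the total cycle, reproducing the 25% figure quoted above.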
Understanding/Revisiting the Current Sampling Strategy
Over the manufacturing lifespan of a device, the numbers of sites and subsites tested
on each wafer will typically change fairly substantially, with fabs generally testing the
most in the early weeks after the device enters production, then whittling the list down
to a smaller set of tests on far fewer sites and subsites as the process reaches maturity.
Those test sequences and site/subsite numbers will also depend significantly on who’s
doing the testing and their business objectives. When fewer tests are performed on
fewer sites and subsites—thereby reducing the amount of time the prober requires for
site and subsite indexing—the other prober timing parameters (first wafer load and
align times, swap times, and unload times) assume greater significance in the overall
throughput picture.
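The shifting weight of those fixed overheads can be estimated with a rough per-lot model. The default timings below are the illustrative magnitudes listed earlier; the site counts, subsite counts, and per-subsite test time are assumptions chosen only to show the trend:

```python
def lot_overhead_share(n_wafers, sites, subsites, test_s,
                       site_index_s=0.65, subsite_index_s=0.35,
                       first_load_s=90.0, swap_s=45.0, unload_s=30.0):
    """Fraction of total lot time spent on wafer load/swap/unload.

    Per-wafer time is site indexing plus subsite indexing plus test time;
    the fixed portion is first-wafer load, wafer swaps, and final unload.
    """
    per_wafer = sites * (site_index_s + (subsites - 1) * subsite_index_s
                         + subsites * test_s)
    fixed = first_load_s + (n_wafers - 1) * swap_s + unload_s
    return fixed / (fixed + n_wafers * per_wafer)

# Early production (heavy sampling) vs. mature process (light sampling),
# for a hypothetical 25-wafer lot with 2 s of test time per subsite:
early = lot_overhead_share(25, sites=20, subsites=8, test_s=2.0)
mature = lot_overhead_share(25, sites=5, subsites=3, test_s=2.0)
print(f"fixed-overhead share: early {early:.0%}, mature {mature:.0%}")
```

As sampling shrinks, the load/swap/unload share of the total grows substantially, which is exactly why those parameters assume greater significance late in a product’s life.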
Pre-implementation Feasibility Study
Keithley recommends implementing parallel test for the first time on a mature ex-
isting process, rather than on a new product with new test structures. The knowledge
gained during the implementation process on a well-known process will provide valuable insights for subsequent implementations on newer products. In addition,
the necessity of ramping up production on a new product as quickly as possible also
makes it unlikely that a fab would spare the tester capacity and human resources neces-
sary for a throughput improvement project. In fact, parametric test experts recommend
employing conventional throughput improvement techniques first, given that these
approaches offer more straightforward throughput benefits, even where parallel test
techniques are impractical.
Prior to attempting to implement parallel test on an existing process, a team of test
engineers must perform an in-depth feasibility study. This study, which starts with a
review of the documentation for the wafer’s existing test structures, allows the imple-
mentation team to determine the most appropriate DUT test pairings and test groupings
to maximize test time reduction. This job falls to the test engineers because they have
the best understanding of the test resources (numbers of SMUs and other sourcing and
measurement instrumentation) available on specific test systems, the most insight into
the structures themselves and which ones can be paired, and the greatest understand-
ing of the tests performed now and those likely to continue to be performed as the
process approaches maturity.
As part of the feasibility study, the implementation team must evaluate the opportunities the TEG offers for pairing “like” tests, such as multiple ION or VT tests, keeping in mind that pairing low-level measurements may increase the variability of the test results. Pairing long-duration tests that are performed on the same type of structure, such as two gate oxide integrity (GOI) tests performed on two different gate dielectrics within the same TEG, is another possibility for test time reduction. It’s also important
to evaluate the various test grouping options, based on the tester resources available
(i.e., the SMUs and other instruments). In actual parallel testing, this grouping function will be performed automatically by the software’s pt_execute function, but it’s important to gain an early awareness of the level of throughput improvement possible with the existing tester configuration. A test pairing is the sequence (order) of tests; for example, if the test engineer puts three VT tests in order, those tests are performed in that order. Test grouping is done by the pt_execute tool, based on the resources available (pins and instruments) and on limitations imposed by test conditions that must be executed separately. For example, if the test engineer has four VT tests paired together, and each VT test employs three SMUs, pt_execute can only perform two VT tests at a time (in parallel); therefore, only two VT tests are grouped together. If the test grouping review indicates the current configuration offers limited throughput improvement, it may be a sign that the fab should consider investing in additional SMUs for the tester in question.
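The resource arithmetic in the VT example above can be sketched as a simple greedy pass over the pairing. This is only an illustration of the constraint, not Keithley’s actual pt_execute implementation:

```python
def group_paired_tests(paired_tests, smus_available):
    """Split an ordered test pairing into groups that fit the SMU budget.

    Tests are taken in pairing order; a new group starts whenever the next
    test would exceed the number of SMUs available. (Illustrative logic
    only -- not the real pt_execute scheduler.)
    """
    groups, current, used = [], [], 0
    for name, smus_needed in paired_tests:
        if current and used + smus_needed > smus_available:
            groups.append(current)
            current, used = [], 0
        current.append(name)
        used += smus_needed
    if current:
        groups.append(current)
    return groups

# Four VT tests paired together, each needing three SMUs, on an 8-SMU tester:
pairing = [("VT1", 3), ("VT2", 3), ("VT3", 3), ("VT4", 3)]
print(group_paired_tests(pairing, smus_available=8))
# prints [['VT1', 'VT2'], ['VT3', 'VT4']] -- two VT tests run at a time
```

With eight SMUs and three SMUs per test, only two tests fit per group, matching the example in the text.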
Long duration tests that represent fundamental elements of the manufacturing pro-
cess, such as the GOI test mentioned previously, are the best candidates for both the
feasibility study and performing in parallel. During the feasibility study, the team can
prevent wasted time by not bothering to evaluate tests that are likely to be eliminated
as part of the typical test time reduction activities, the usual test streamlining that goes
along with increasing process maturity. Obviously, it would be unwise to base a parallel
test cost justification decision or go to the effort of rewriting sequences on tests that are
likely to be deleted, unless the ROI for the time the structures are used is very compel-
ling. For example, tests used for structure-related debug (such as tests on structures
intended to monitor silicide formation) or process-maturity-related debug (such as tests
on comb or serpentine structures used to monitor yield variability) during technology
development or ramp-up require special consideration.
The next step in the feasibility study is to create a set of test algorithms for a conservative DUT test grouping. In this instance, “conservative” means a sequence that includes no low-level tests, such as leakage tests (because of their potential for disruption by other tests, such as breakdown tests), and no unlike groupings (such as attempting to test breakdown voltage and threshold voltage at the same time). Once the algorithms
for this set of tests are complete, they should be loaded and run on the tester in both
sequential and parallel modes. Then, the data from the parallel test run should be com-
pared carefully with the data acquired with the same set of tests performed sequentially.
The sequences can be run and the data taken at the sub-program (Keithley Interactive
Test Tool or KITT) level. In other words, rather than testing an entire wafer, complete
with prober indexing from one subsite to the next, this comparison process requires
only looping through the tests for a single subsite. This comparison, which involves
close examination of the control charts for both the parallel test run and the sequential
test run, will allow the implementation team to identify unintended offsets or interfer-
ences that parallel testing may introduce. Even though it only provides a comparison of
the sequential vs. parallel test execution times, it offers an important indication of the
potential for overall test time reduction. Fortunately for the implementation team, the
pt_execute software supports switching parallel testing on and off quickly to gauge
the impact on test execution time of any code changes and to track down the source of
correlation problems.
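A minimal sketch of such a sequential-vs-parallel correlation check follows. The VT readings are hypothetical numbers invented for illustration; a real study would use the control-chart data described above:

```python
from statistics import mean, stdev

def correlation_offset(sequential, parallel):
    """Per-site offsets between sequential- and parallel-mode readings.

    Returns the mean offset and its spread. A mean offset that is large
    relative to the spread suggests parallel test introduced a systematic
    shift (e.g., device interference) rather than random noise.
    """
    offsets = [p - s for s, p in zip(sequential, parallel)]
    return mean(offsets), stdev(offsets)

# Hypothetical VT readings (volts) from the same subsite, both modes:
seq_vt = [0.412, 0.415, 0.409, 0.414, 0.411]
par_vt = [0.413, 0.414, 0.410, 0.415, 0.410]
off_mean, off_sd = correlation_offset(seq_vt, par_vt)
print(f"mean offset {off_mean * 1000:.2f} mV, spread {off_sd * 1000:.2f} mV")
```

Here the offsets are small and centered near zero, which is what a successful correlation run should look like before parallel mode is accepted.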
NOTE: Parallel testing is a licensed option for the Keithley Test Environment. Parallel
tests can be developed and executed only on a licensed workstation. For more information
on this licensing option, contact your Keithley representative or field service engineer.
Parallel test development process
Once the feasibility study is complete it’s time to review which of the existing test
algorithms (or macros) can be modified and reused in a parallel test environment (typically, roughly two-thirds of them) and which ones require the creation of new algorithms. Fortunately, the pt_execute package helps increase the percentage of existing test libraries that can be reused in parallel test. Refer to Appendix A for details on how pt_execute can simplify the process of creating parallel test algorithms.
However, it’s important to keep in mind that a macro-by-macro review process typically
tends to lead to some rework of almost all algorithms once the implementation team
recognizes the reductions in test times that adjustments to delays and integration times
can make possible.
In one sense, creating new macros involves a “deconstruction” process for some
implementation teams. For example, if the test engineer who created the original se-
quential test program was especially diligent about reducing tester overhead, he or she
may have grouped the “connect” statements for all the instruments applied to a specific set of test structures on a subsite, then forced/measured sequentially on all of the structures. This practice of grouping many tests into one large test algorithm is often referred to as writing jumbo algorithms. The earlier use of these “pseudo-parallel test” sequences, unfortunately, typically forces the team to take a step back and start over with far simpler, single-purpose algorithms designed to be performed in parallel with other single-purpose algorithms on a single DUT. Adding pt_execute commands at the appropriate points in the test sequences will automatically associate the macros with the appropriate DUTs.
Ongoing correlation studies between test results obtained in sequential and parallel
modes are critical throughout the development process, initially at the sub-program
(KITT) level, and eventually at the composite program level. While it’s obviously satisfy-
ing to identify test execution time reductions of up to 60% at the sub-program level on a
single subsite, it’s much more important to see significant throughput gains on the com-
posite program level, which also includes the subsite and site indexing times. Keithley
typically recommends analyzing and correlating the data from three wafer lots for gage
performance and throughput modeling. This stage can reveal new test issues, such as
probe card charging or other problems, which must be resolved before the implementa-
tion process can be considered complete.
Subsequent parallel test implementations
As is true with virtually any type of implementation process, implementing parallel
test tends to be somewhat easier the second time around, particularly if the original
implementation team has been diligent about documenting their efforts and sharing
that knowledge with their colleagues through a formal “Best Known Method” process.
Subsequent implementations on new wafer designs may allow for significantly greater throughput improvement than parallel test on legacy test structures will permit,
particularly if the lessons learned in the first implementation can feed into the creation
of new test structures optimized for parallel test. For example, a number of IDMs with
experience in parallel test choose to incorporate the devices associated with all their
long duration tests into one test structure. Others take advantage of the flexibility that
parallel test’s higher test execution speed offers to add new tests or more test devices, so
they can gather levels of information that were previously impractical.
In an ideal testing world, every test structure would be very simple, totally electri-
cally isolated, and equipped with a pad for every DUT terminal. Oddly enough, this
is somewhat similar to the test structure design philosophy typically followed during
technology development, when the objective is to obtain the highest possible data gran-
ularity. This is achieved by testing many of the same types of devices with various gate
lengths, structures with contact chains of various lengths, etc.
Implementation—How long should it take?
Obviously, every team will have its own timetable for implementing parallel test,
depending on organizational priorities and the resources available, including test cell
capacity and test engineer and structure designer time. However, generally speaking, if
the fab is already using an S680 test system successfully, the team should plan that the
first implementation of parallel test will require approximately three months to com-
plete, from feasibility study to final switch over.
Section III
Identifying and Harvesting the “Low Hanging Fruit” in Existing Scribe Line TEGs
Introduction
As every test engineer knows, typical scribe line test structures are components and
groupings of components that represent the manufacturing process being supported by
electrical measurement Statistical Process Control (SPC). In order to minimize pad usage, test structure designers frequently connect device terminals together or use other techniques to minimize the amount of space these structures require. From the perspective of parallel testing, this lack of device isolation can create problems. Fortunately,
despite these kinds of limitations, experienced test sequence developers have been able
to produce parametric test throughput improvements (including prober overhead)
ranging from 5% to 40% with existing scribe line test structures and from 40% to 50%
with structure layouts designed to increase the potential for parallel test.
In traditional (i.e., sequential) parametric test programs, each DUT is connected to
the measurement instruments one after the other. During the period in which the DUT
is connected, forcing conditions are applied to it and measurement resources record its
response. Once a single test or group of tests for a DUT is complete, the connections
are cleared to allow connection to the next DUT. These connect and disconnect times
represent some proportion of the overall throughput budget because the relay switch-
ing and settling times for high isolation mechanical devices are fixed.
In addition to the relay connect (Conpin) and disconnect (Reinitialize) overheads
just described, there is a delay (Delay) overhead, the length of which can vary widely,
depending on the DUT type and the measurement conditions. When addressed sequen-
tially, these connect, disconnect, and delay overheads can reduce the overall through-
put gains that faster measurement instruments promise. Fortunately, by connecting
multiple DUTs to different measurement resources simultaneously for different types of
tests, it’s possible to reduce the impact of relay connect and disconnect times on overall
throughput significantly (Figure 3‑1).
Figure 3‑1: Sequential vs. parallel measurement of four resistors (ts ≈ 3.8 tp). In this example, testing four DUTs in parallel can be completed approximately 4× faster than sequential testing would allow. Note that the speed improvement is slightly less than 4× because there is some overhead.
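A back-of-the-envelope model reproduces Figure 3‑1’s result. All the timing numbers here are assumptions chosen only to illustrate the “slightly less than 4×” behavior, not measured values:

```python
# Illustrative timing model for four resistors tested in parallel.
per_dut_ms = 40.0         # one Conpin-ForceV-Delay-MeasI-Devint sequence (assumed)
n_duts = 4
serial_overhead_ms = 2.0  # switching overhead that cannot be overlapped (assumed)

t_seq = n_duts * per_dut_ms               # sequential: four sequences back to back
t_par = per_dut_ms + serial_overhead_ms   # parallel: sequences overlap, overhead remains
print(f"ts = {t_seq:.0f} ms, tp = {t_par:.0f} ms, ts/tp = {t_seq / t_par:.1f}")
# ts/tp comes out at 3.8 -- a bit under the ideal 4x, matching the figure
```

The residual serial overhead is what keeps the measured speedup below the DUT count, exactly as the figure caption notes.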
Figure 3‑2: All instruments of the same type must have identical capabilities. In this configuration (a VXI CPU with an IEEE bus interface and eight full-specification SMUs, plus FCM, PGU, and CGK instruments), all 64 I‑V/C‑V paths are identical and all paths provide lab‑grade resolution. The KTE software running on the S680 system is multi‑threaded to allow parallel testing.
Table 3‑1: Throughput improvement due to parallel testing

Test      Total Sequential   No. of DUTs          Total Parallel   Throughput Improvement
          Test Time          Tested in Parallel   Test Time        due to Parallel Testing
VT        553 ms             2                    296 ms           1.87×
IDSS      120 ms             2                    67 ms            1.80×
IDLEAK    1154 ms            4                    294 ms           3.92×
IGLEAK    1125 ms            4                    295 ms           3.81×
BVDSS     1101 ms            4                    299 ms           3.68×
TOTAL     4053 ms                                 1251 ms          3.24×
The examples in Table 3-1 offer one way of evaluating the general benefit of paral-
lel parametric test. In these examples, a fixed set of instrument resources is applied to
discrete devices to perform a set of measurements (in this case, threshold voltage, ter-
minal leakages, drive current, and breakdown voltage), which characterize the dynamic
measurement range required for the measurement application. In this case, tests of the
same type, performed on electrically isolated, identical DUTs, produce maximum, near-
theoretical throughput improvements.
In many cases, however, existing test structure designs lack sufficient electrical iso-
lation (due to shared DUT pins) to produce optimal results. However, empirical data
taken on wafer-level structures that are not electrically isolated (Table 3-2) shows the
same throughput benefit as that realized on discrete devices. This is significant because
it indicates no instrument interference occurs in this parallel test use case.
Table 3-2: Actual Parallel Test Program Examples (Keithley Series S600 Customers)
Optimizing throughput
Once the general and hardware-specific limitations are understood, the first step in
the process of throughput optimization is establishing a sequential performance base-
line—in other words, characterizing how long it takes to complete each portion of a
specific set of tests in sequential mode. While performing any portions of the tests in
parallel will result in some throughput improvement, it typically doesn’t achieve all the
potential time savings (Figure 3‑3). It’s critical to group tests with similar test times
to get the greatest benefit from parallel testing. Engineers must take these factors into
consideration when generating test programs.
Figure 3‑3: Group tests of similar lengths in order to achieve the highest throughput benefit from parallel test.
Figure 3‑4: Two totally separate (discrete) transistors are easy to test in parallel.
Figure 3‑5: Disallowed test pairings often involve direct connections between structures, such as a shared gate, source, or substrate.
Totally separate structures like those illustrated in Figure 3‑4 are well suited for testing in parallel. Much more common, however, is the space-saving, shared-terminal type of legacy structure (like that illustrated in Figure 3‑5), which often presents problems in parallel testing. Testing this device in parallel would require applying different voltages to the two gates, which is clearly impossible.
Measuring resistor chains in parallel requires special attention. Figure 3‑6 illus-
trates a test structure where resistor chains can present measurement problems.
Figure 3‑7: Despite the use of a measurement technique that's inappropriate for parallel test, the measurement of the 1kΩ resistor produces a valid result.
Figure 3‑8: In this portion of the test, the incorrect technique produces an invalid result for the measurement of the 10Ω resistor.
Chain structures must be recognized as shared-pin structures, subject to the same interference issues as any structure with shared pins, and are best treated as a single DUT. Even if the chains are measured without proper consideration of current flows, tests on some structures within them may produce valid results, such as the 1kΩ resistor shown in Figure 3‑7, while tests on others, such as the 10Ω resistor, will produce invalid results, as illustrated in Figure 3‑8.
When the correct technique is used (Figure 3‑9), both the 10Ω and the 1kΩ resistors will be measured correctly. However, sorting out the unintended measurement paths in legacy test structures typically demands additional engineering effort, which can slow the payback on a parallel test implementation investment.

Figure 3‑9: The appropriate measurement technique allows for valid parallel measurements of both the 10Ω and 1kΩ resistors.
In some cases, evaluating legacy structures for parallel test requires more than
studying their schematic representations—examining a device’s cross-section may also
be necessary. For example, with a P+ resistor in an N well, a common diffusion means a
common measurement path, but some common diffusions can be managed by applying
the correct bias (Figure 3‑10). Even though the reverse bias leakage paths may produce
some surprises, this structure would generally be considered an acceptable candidate
for parallel test.
[Figures 3‑10 through 3‑12, not reproduced: P+ resistors (~25Ω each) in a common N well and N+ resistors (~25Ω each) in a common substrate, with a parasitic current path (~500Ω to ~1kΩ) through the shared diffusion during parallel test, followed by a transistor array with common source and substrate connections.]
In the example shown in Figure 3‑13, the gate and drain voltages must be adjusted to compensate for the voltage drop on the shared source connection. In many designs, a long source line effectively acts as a resistor, which means measurements made on one device can affect the results of measurements on others. For example, passing high current through one device can cause a voltage drop in the test structure as a whole, thereby changing the VDS and VGS voltages.
Figure 3‑13: Measurements made on one device often affect the results of measurements on others. (Here, roughly 2Ω of parasitic resistance per source-line segment produces an approximately 200mV drop.)
Test structure design changes, although almost never made to wafers already in pro-
duction, can minimize the effect of cumulative currents on shared ground lines during
parallel testing. These changes usually involve increasing the area of the structure. In
the case of the structure shown in Figure 3‑14, an alternative to the structure in Figure
3‑12, duplicate ground lines are used to minimize the parasitic resistance drops within
the structure. Adding a ground pad at the other end of the structure is another possible
option for minimizing the drops. Once these parasitic voltage drop considerations are
dealt with, only the pad contact resistive drops remain to be managed.
Figure 3‑14: Duplicate ground lines can be used to minimize the parasitic resistance drops within the structure.
Many of these issues can be avoided while test structures are being designed. By examining the parallel vs. sequential mode program correlation on existing device layouts, most of these issues can be resolved without the need to examine the structure itself. In fact, Keithley engineers have developed pt_execute, a test program characterization and optimization tool, which, among many other things, allows correlation problems to be identified quickly.

The pt_execute tool automates many of the decisions a test program developer would make. Developed by Keithley applications engineers and based on their years of experience with implementing high throughput parametric test systems at customer sites, pt_execute is now part of Keithley's standard parallel parametric test product.
The software automatically detects the test hardware configuration of the parametric test system on which it is installed, grouping tests based on the instrument resources available, so there's no need to keep a running tally of how many SMUs are available to apply to a specific test. It also allows for easy switching in and out of parallel test mode, which simplifies weighing throughput improvements and detecting sequential vs. parallel correlation issues. However, pt_execute can only be employed within Keithley test macros (KTMs) used with the Keithley Test Environment (KTE) version 5.1 or later, not in stand-alone C language programs or User Access Points (UAPs), which are software modules that allow creating and running specialized routines for a variety of functions other than parallel test.
For more information on the pt_execute tool and its role within Keithley's parallel test solution, we suggest reviewing Appendix A, followed by discussion with your Keithley Applications Engineer.
Section IV
Test Structure Design Considerations for Parallel Testing
Parallel testing provides high throughput. However, although parallel testing can
sometimes be performed successfully on existing test structures, efficient execution of
tests in parallel, without signal loss, generally requires addressing several issues. This
section describes some of these issues.
Common substrate issues
The semiconductor wafers produced by most processes have a common substrate. (Wafers produced by dielectric isolation processes are an exception.) Wells with a polarity opposite to that of the substrate are isolated—for example, separate n-doped wells in a p-doped substrate produced by a CMOS process. However, wells having a polarity identical to that of the substrate (for example, p-doped wells in a p-doped substrate) are all shorted together. As a result, simultaneously forcing different voltages at different points can introduce significant error:
• The different voltages cause current flow and a voltage gradient across the substrate.
• The voltage gradient can cause uncertainty about the exact substrate voltages under the gates of the transistors under test.
Parasitic voltage-drop issues
Semiconductor test structures are generally much smaller than the probe pads used
to connect the tester to these structures. As a result, the total area dedicated to a test
structure is roughly the same as the area occupied by its probe pads. Understandably,
during test structure design, substantial effort is devoted to minimizing the number of
probe pads.
Probe pad count is often minimized by using common pads. Common pads are
probe pads that connect to more than one DUT. The most frequent application for
common pads is to connect the source terminals together for a set of transistors, as
illustrated in Figure 4‑1.
Figure 4‑1: Unoptimized test structure, containing three transistors with common source and substrate connections.
However, connection of multiple DUTs to one probe pad can require substantial
lengths of metal line between the DUTs, resulting in substantial parasitic resistances. In
turn, currents flowing through the metal line can introduce substantial voltage drops.
For example, in Figure 4-1, resistors R1, R2, and R3 represent parasitic resistances
and voltage drops in the common line connecting the common pad to the transistor
source terminals:
• If transistors are tested one at a time, the parasitic voltage drop in the com-
mon-source line to a transistor is the product of the source current to the
transistor and the cumulative line resistances:
– Voltage drop for T3 = I3(R3)
– Voltage drop for T2 = I2(R2 + R3)
– Voltage drop for T1 = I1(R1 + R2 + R3)
• If all three transistors are tested in parallel, the voltage drops are the products of the cumulative line resistances and the cumulative source currents to the three transistors, where I1, I2, and I3 are the source currents for transistors T1, T2, and T3, respectively:
– Voltage drop for T3 = (I1 + I2 + I3)R3
– Voltage drop for T2 = (I1 + I2)R2 + (I1 + I2 + I3)R3
– Voltage drop for T1 = I1(R1) + (I1 + I2)R2 + (I1 + I2 + I3)R3
Voltage drop calculations for unoptimized test structure
Assume that the circuit shown in Figure 4-1 has the following characteristics:
Transistor location T1 — To the left of the Drain 1 pad.
T2 — Between the Gate 1 and Drain 2 pads.
T3 — Between the Gate 2 and Drain 3 pads
Pad characteristics Number of pads — 9
Pad dimensions — 100µm × 100µm
Pad spacing — 100µm
Common-source line The source of each transistor is connected to a single, 1.0µm wide metal
line that runs down the length of the scribe line to the common-source
probe pad. The lengths of the common-source line segments between
transistors are as follows:
T1 to T2 — 400µm
T2 to T3 — 400µm
T3 to common-source probe pad — 800µm
Source line resistivity 0.05Ω/square (sheet resistivity)
Source currents 5mA to each transistor
Calculations
Figure 4‑2 expands the test structure in Figure 4‑1 to show the following details for
the common-source line sections between each transistor, in the order shown below:
• Dimensions, length × width
• Resistivity
• Current when the three transistors are tested in parallel
• Voltage drop [(length/width) × sheet resistivity × current] when the three
transistors are tested in parallel
Figure 4‑2: Unoptimized test structure—same circuit as Figure 4‑1, but with added details (segment currents I1, I1 + I2, and I1 + I2 + I3 when all three transistors are tested in parallel).
Not shown are the individual parasitic resistances [(length/width) × sheet resistivity]: 20Ω for R1 and R2 and 40Ω for R3.
Implications
When the three transistors are tested in parallel, the cumulative voltage drops affect
Vds values as follows:
• Vds across T3 = 0.6V less than the voltage forced on the Drain 3 pad.
• Vds across T2 = 0.8V (0.2V + 0.6V) less than the voltage forced on the Drain
2 pad.
• Vds across T1 = 0.9V (0.1V + 0.2V + 0.6V) less than the voltage forced on the
Drain 1 pad.
These results clearly show the problem with parasitic resistance. A 0.6–0.9V drop
due to parasitic resistance will be highly significant for a process designed for a 3.3V
power supply. Even if the transistors were tested one at a time, similar calculations
would show that the parasitic voltage drops would be reduced to:
• 0.2V for T3
• 0.3V for T2
• 0.4V for T1
The above voltage drops are still unacceptable, but significantly less than when the
transistors are tested in parallel. The example above suggests a need for the following
changes:
• A common-source pad located as close as possible to the transistors.
• Individual connections to the common-source pad.
The next subsection illustrates these changes.
Voltage drop calculations for optimized test structure
Figure 4‑3 shows the same basic circuit as in Figure 4‑2—with the same number
of pads—but with a modified pad order and layout that significantly reduces parasitic
resistance.
Figure 4‑3: Optimized test structure—same transistors as in Figure 4‑1, but with changed layout to minimize series resistance.
Transistor location T1 — Between the Drain 1 pad and the common-source pad.
T2 — Between the Drain 2 pad and the common-source pad.
T3 — Between the Drain 2 pad and the Drain 3 pad (by necessity) 1
Pad characteristics Number of pads — 9
Pad dimensions — 100µm × 100µm
Pad spacing — 100µm
Common-source lines The source of each transistor is connected individually to a
common-source pad, in each case with a 1.0µm wide metal line. 2
The lengths of the individual source lines are as follows:
T1 to common-source probe pad — 20µm
T2 to common-source probe pad — 20µm
T3 to common-source probe pad — 200µm
Source line resistivity 0.05Ω/square (sheet resistivity)
Source currents 5mA to each transistor
1 T3 could perhaps be placed next to T2, but a similar series resistance would result—between the Drain 3 pad and
the T3 drain terminal, instead of between the common-source pad and the T3 source terminal.
2 Could be widened/paralleled for further improvement. Refer to "Further optimization" on page 4-7.
Calculations
Figure 4‑4 expands the optimized test structure in Figure 4‑3 to show the follow-
ing details for the metal line section between each transistor and the common source
pad, in the order shown below:
• Dimensions, length × width
• Resistivity
• Current when the three transistors are tested in parallel
• Voltage drop [(length/width) × sheet resistivity × current] when the three
transistors are tested in parallel.
Figure 4‑4: Optimized test structure—same circuit as Figure 4‑3, but with added details.
Not shown are the individual parasitic resistances [(length/width) × sheet resistivity]: 1Ω for R1 and R2 and 10Ω for R3.
Implications
When the three transistors are tested in parallel, the calculated source-to-source pad
voltage drops are the same as the individual voltage drops calculated in Figure 4-4.
Therefore, the effects on Vds values are as follows:
• Vds across T3 will be 0.050V less than the voltage forced on the Drain 3 pad.
• Vds across T2 will be 0.005V less than the voltage forced on the Drain 2 pad.
• Vds across T1 will be 0.005V less than the voltage forced on the Drain 1 pad.
These results are a dramatic improvement over the results for the unoptimized test
structure. Further improvements are possible, as discussed below.
Further optimization
To reduce the series resistances further, the metal line widths could be increased
significantly. There is no need to keep the metal line width at one micron, because
the line need not run outside of the probe pads in this arrangement. For example, the
linewidths could easily be made ten microns, thereby reducing parasitic resistances and
voltages by a factor of ten.
Adding similar, parallel metal lines in other metal layers will also reduce the series
resistance.3
If it is not possible to run parallel metal lines under probe pads, the metal line width
can be increased significantly where lines run between probe pads.
Efficient use of assets
Traditionally, test structures are arranged functionally. For example, a set of
three test structures may consist of the following:
• Structure A — A probe-pad set connected to a group of transistors.
• Structure B — A probe-pad set connected to a group of capacitors.
• Structure C — A probe-pad set connected to a group of resistors.
While such an arrangement seems logical, it does not necessarily provide maximum
throughput for parallel testing. Alternate arrangements can often improve throughput
by optimizing the use of multiple instruments in parallel. Consider the following:
• A Keithley S600 Series tester can contain up to eight SMUs. A parallel execu-
tion thread can control multiple pairs of SMUs.
3 Because all three metal source lines run independently, running them in parallel causes no additional…
Figure 4‑5: Set of four unoptimized test structures, each containing twelve probe pads:
• Unoptimized structure #1 — twelve probe pads and seven n‑channel FETs (Tn1–Tn7)
• Unoptimized structure #2 — twelve probe pads and seven p‑channel FETs (Tp1–Tp7)
• Unoptimized structure #3 — twelve probe pads and six capacitors (C1–C6)
• Unoptimized structure #4 — twelve probe pads and three 4‑terminal resistors (R1–R3)
Given these assumptions, it would take the tester 10.39 seconds to test all four unop-
timized structures sequentially. Table 4-1 details the time calculation.4
Table 4-1: Time to test the unoptimized test structures.

Structure   Test   Device                             Individual test time (s)   Test time (s)*
1           1      Tn1                                0.5                        0.50
1           2      Tn2                                0.5                        0.50
1           3      Tn3                                0.5                        0.50
1           4      Tn4                                0.5                        0.50
1           5      Tn5                                0.5                        0.50
1           6      Tn6                                0.5                        0.50
1           7      Tn7                                0.5                        0.50
            Move to unoptimized structure #2          0.5                        0.50
2           1      Tp1                                0.5                        0.50
2           2      Tp2                                0.5                        0.50
2           3      Tp3                                0.5                        0.50
2           4      Tp4                                0.5                        0.50
2           5      Tp5                                0.5                        0.50
2           6      Tp6                                0.5                        0.50
2           7      Tp7                                0.5                        0.50
            Move to unoptimized structure #3          0.5                        0.50
3           1      C1                                 0.3                        0.30
3           2      C2                                 0.3                        0.30
3           3      C3                                 0.3                        0.30
3           4      C4                                 0.3                        0.30
3           5      C5                                 0.3                        0.30
3           6      C6                                 0.3                        0.30
            Move to unoptimized structure #4          0.5                        0.50
4           1      R1                                 0.03                       0.03
4           2      R2                                 0.03                       0.03
4           3      R3                                 0.03                       0.03
Total                                                                            10.39
*Parallel test time is the largest individual test time for the devices that are tested.
4 In this calculation, unoptimized structure #4 is tested sequentially, given the minimal advantages to testing the resistors in parallel. If the resistors were tested in parallel, the total test time would be 10.33 seconds, an improvement of only 0.06 seconds.
…must be assigned even though only three are needed. This constraint does not apply when using Model 60111-SMU SMUs.
Figure 4‑6: The devices of Figure 4‑5 after rearranging into three asset‑optimized, 16‑pad test structures (optimized structure #1 includes C1 and C2; #2 includes C3 and C4; #3 includes R1–R3, C5, and C6).
The second capacitor would be tested in parallel with the second pair of
transistors. No capacitors would be tested in parallel with the last two pairs
of transistors.
Assume the same individual device test times as those assumed for the unoptimized
test structures:
• Full transistor test (VT, gm, Isat, IDlin, GIDL, gate leakage, ISUB, VBDSS, VBDII…)
— 0.5 seconds
• Capacitor test — 0.3 seconds
• Resistor test — 0.03 seconds
• Prober movement from one pad set to another — 0.5 seconds
Based on the above assumptions, the total calculated test time for optimized struc-
ture #1 would be approximately the time required to test four transistors or 2.0 sec-
onds, assuming 100% efficiency. If one conservatively assumes only 80% efficiency for
parallel testing, the total test time for this structure would be 2/0.8 = 2.5 seconds.
Optimized structure #2
Optimized structure #2 is similar to optimized structure #1, except that it contains
only three n-channel and three p-channel transistors. Again, two parallel execution
threads, controlling four SMUs each, would be defined, so that one p-channel and one
n-channel transistor can be tested in parallel. Again, there are also two capacitors in
this pad set. Each capacitor is tested in parallel with a transistor pair using a GPIB capacitance meter.
Based on the assumed 0.5 second transistor test time and 0.3 second capacitor test
time, the total test time for optimized structure #2 would be approximately the time
required to test three transistors or 1.5 seconds, assuming 100% efficiency. If one con-
servatively assumes only 80% efficiency for parallel testing, the total time for this set
would be 1.5/0.8 = 1.875 seconds.
Optimized structure #3
Optimized structure #3 includes the three resistors and the remaining two capaci-
tors. In this case, three execution threads could be defined, each containing two SMUs.
In each pair of SMUs, one SMU would force current and the other would measure the
differential voltage drop across the resistor. A fourth executor would be defined as a
GPIB capacitance meter.
In this case, each capacitance measurement would take much longer than the par-
allel testing of the three resistors, so the total test time would be the sequential time
to test two capacitors or 0.6 seconds. If one conservatively assumes 97% efficiency for
the parallel test (three resistors and one capacitor in parallel) and 100% efficiency for
the second capacitor alone, the total time for this set would be 0.3/0.97 + 0.3 = 0.609
seconds.
Total throughput time for the set of optimized test structures
Based on the calculations above for individual optimized structures, plus two
prober movements between structures, the total, conservatively derated time to test
the optimized structures would be 5.98 seconds. Compare the detailed time breakdown
for the optimized structures in Table 4-2 with the detailed time breakdown for the
unoptimized structures in Table 4-1.
Conclusion
Even assuming conservatively derated times to test the optimized structures (5.98 seconds total), these examples show that parallel testing of the optimized test structures, instead of sequentially testing the unoptimized test structures, would result in a test time reduction of more than 40%:
(10.39 – 5.98) / 10.39 = 0.42

Table 4-2: Time to test the optimized test structures.

Structure   Test   Devices tested      Parallel test   Conservative      Derated parallel-
                   in parallel         time (s)*       derate            test time (s)
1           1      Tn1, Tp1, C1        0.50            1.25 (100%/80%)   0.625
1           2      Tn2, Tp2, C2        0.50            1.25 (100%/80%)   0.625
1           3      Tn3, Tp3            0.50            1.25 (100%/80%)   0.625
1           4      Tn4, Tp4            0.50            1.25 (100%/80%)   0.625
            Move to optimized structure #2  0.50       N/A               0.500
2           1      Tn5, Tp5, C3        0.50            1.25 (100%/80%)   0.625
2           2      Tn6, Tp6, C4        0.50            1.25 (100%/80%)   0.625
2           3      Tn7, Tp7            0.50            1.25 (100%/80%)   0.625
            Move to optimized structure #3  0.50       N/A               0.500
3           1      C5, R1, R2, R3      0.30            1.03 (100%/97%)   0.309
3           2      C6                  0.30            1.00              0.300
Total                                                                    5.984
*Parallel test time is the largest individual test time for the devices that are tested. The values in this column assume that each individual test runs as efficiently in parallel mode as in sequential mode.
This improvement is achieved without compromising efficient use of space on the wafer.6
6 The total pad count in both optimized and unoptimized designs is the same, suggesting that the total…
Appendix A
This appendix is not a complete reference to the use of pt_execute; instead, it's merely intended to provide a quick overview of the tools and techniques associated with this approach to parallel test sequence creation and modification.

The material presented here assumes the reader has at least a basic knowledge of the underlying theory and concepts of parametric test and device characterization, as well as some background in C programming. All sample code snippets were developed specifically for use with a Keithley Series S600 Automated Parametric Test System equipped with a licensed Parallel Test Option, so they are not applicable to other types of parametric testers.
The pt_execute tool kit is designed to simplify the development of parallel test modules. The main idea underlying pt_execute is its multiple-phase execution and task queue mechanism. A parallel test module first runs in a phase called SIGNUP. In this phase, a test does not run immediately; instead, it signs itself up to a task queue maintained by pt_execute. Each node of the task queue is a task, which stores the function name and the arguments passed when calling this function.

When the pt_execute function is called, it walks the task queue and runs all the tasks during a phase called FETCHINFO. During the FETCHINFO phase, only the system resource request routines are processed. As a result, once this execution is complete, pt_execute knows which instruments each task requires, and it divides the task queue into several groups, each of which includes one or more non-conflicting tasks.
Next, pt_execute executes the groups one by one. For each group, pt_execute first sets the phase to CONNECT and executes all the tasks in sequence. Then the tasks are executed in parallel during the PARALLEL phase, when they are known as threads. In the CONNECT phase, only code for the CONNECT phase is executed; in the PARALLEL phase, only code for the PARALLEL phase is executed.

When using pt_execute, rather than using hard-coded SMU1, SMU2, ... identifiers as with the traditional programming method, users call the pt_smu function to get an SMU. pt_execute maintains an instrument table, which is fetched from the tester. Therefore, the pt_execute function always knows whether enough instruments are available before adding a task to a group. This feature simplifies migrating routines from one test cell to another, making it unnecessary to take each tester's specific instrument configuration into consideration. Once properly programmed, the same code can run on different machines with no changes.

To learn more about how to apply the pt_execute tool kit to parallel test, contact your Keithley sales representative or field engineer.
Step 1
Replace the hard-coded SMU identifiers in the test module with the SMUs received from the pt_smu function call, and include the pt_execute.h header file, as shown here:

#include <pt_execute.h>
double resist(int hi, int lo, double ipgm, double vlim)
{
	int smu;
	double volt;
	smu = pt_smu();
	conpin(smu, hi, KI_EOC);
	conpin(GND, smu + 1, lo, KI_EOC);
	limitv(smu, vlim);
	forcei(smu, ipgm);
	measv(smu, &volt);
	devint();
	return volt / ipgm;
}
Step 2
Specify the CONNECT and PARALLEL phase codes. Using the connect_phase macro and a pair of braces, define the code that is to be executed sequentially. Using the parallel_phase macro and a pair of braces, define the code to be executed in parallel. The braces can be omitted if there is only one statement.

#include <pt_execute.h>
double resist(int hi, int lo, double ipgm, double vlim)
{
	int smu;
	double volt;
	smu = pt_smu();
	connect_phase {
		conpin(smu, hi, KI_EOC);
		conpin(GND, smu + 1, lo, KI_EOC);
	}
	parallel_phase {
		limitv(smu, vlim);
		forcei(smu, ipgm);
		measv(smu, &volt);
		devint();
		return volt / ipgm;
	}
}

Keithley recommends including the conpin statements with the commands executed in parallel; in that case, the conpin statements are moved into parallel_phase and connect_phase is omitted.
Step 3
Add the signup statement. The function name is pt_signup. Its first argument is always the pt_this macro; the rest of the argument list is exactly the argument list of this function.

#include <pt_execute.h>
double resist(int hi, int lo, double ipgm, double vlim)
{
	int smu;
	double volt;
	pt_signup(pt_this, hi, lo, ipgm, vlim);
	/* connect_phase and parallel_phase blocks as in Step 2 */
}
Step 4
Once we've saved the module, compiled it, set the user library dependency to pt_execute, and built the library, we're done with this test module. The following code snippet is a sample Keithley Test Macro (KTM) that shows how it works.

If four or more SMUs are installed in the system, the four tests will run in parallel. However, if there are only two SMUs installed, the first two tests will run in parallel first, followed by RESIST3 and RESIST4 in parallel. To verify the system action, turn on the debug switch with the command setenv KI_LPT_DEBUG /dev/tty before calling KITT.
Reference of Functions Called by Parallel Test Modules
This section provides an overview of the functions that describe the properties of a parallel test module or request instrument resources. Only parallel test modules may call these functions. Unless explicitly noted in the remarks, they must be called outside the phase declaration braces. Their prototypes are declared in $KI_KULT_PATH/pt_execute.h.
prior_connect	Equivalent to "if the current phase is CONNECT, and this is the first task of this group." The statements between the pair of braces following prior_connect will be executed for the first task of a group at the CONNECT phase. If there is only one statement to be executed, the braces may be omitted. If there is no code to be executed for the first task of a group at the CONNECT phase, then prior_connect may be omitted.

post_connect	Equivalent to "if the current phase is CONNECT, and this is the last task of this group." The statements between the pair of braces following post_connect will be executed for the last task of a group at the CONNECT phase. If there is only one statement to be executed, the braces may be omitted. If there is no code to be executed for the last task of a group at the CONNECT phase, then post_connect may be omitted.

signup_phase	Equivalent to "if the current phase is SIGNUP." The statements between the pair of braces following signup_phase will be executed in sequence at the SIGNUP phase. Only pt_signup is recommended at this phase. If there is only one statement to be executed, the braces may be omitted. Given that pt_signup does nothing at other phases, signup_phase is usually omitted.
Sample Codes
#include <stdio.h>
#include <lptdef.h>
#include <pt_execute.h>

double res_pt(int hi, int lo, double ipgm, double vlim, double delaytime)
{
	int smu;
	double volt;
	pt_hotpin(hi, KI_EOC);
	pt_gndpin(lo, KI_EOC);
	smu = pt_smu();
	parallel_phase {
		conpin(smu, hi, KI_EOC);
		conpin(GND, smu + 1, lo, KI_EOC);
		limitv(smu, vlim);
		rangei(smu, ipgm * 10.0);
		forcei(smu, ipgm);
		rdelay(delaytime);
		measv(smu, &volt);
		devint();
		return volt / ipgm;
	}
}
#include <stdio.h>
#include <stdlib.h>
#include <lptdef.h>
#include <pt_execute.h>

void vth_pt(int drain, int gate, int source, int subst, double vds,
	double vgstart, double vgstop, int steps, double delaytime,
	double *vth, double *gm)
{
	int dsmu, gsmu, i;
	double *vgs, *ids, s, id_gm, vg_gm;
	dsmu = pt_smu();
	gsmu = pt_smu();
	parallel_phase {
		*vth = 0.0;
		*gm = 0.0;
		if ((vgs = (double *)malloc(sizeof(double)
				* (steps + 1) * 2)) == NULL)
			return;
		ids = vgs + steps + 1;
		conpin(dsmu, drain, KI_EOC);
		conpin(gsmu, gate, KI_EOC);
		conpin(GND, dsmu + 1, gsmu + 1, source, KI_EOC);
		forcev(dsmu, vds);
		rtfary(vgs);
		smeasi(dsmu, ids);
		sweepv(gsmu, vgstart, vgstop, steps, delaytime);
		devint();
		for (i = 0; i < steps; i++)
			if ((s = (ids[i+1] - ids[i]) / (vgs[i+1] - vgs[i])) > *gm) {
				*gm = s;
				id_gm = ids[i];
				vg_gm = vgs[i];
			}
		if (*gm > 0.0)
			*vth = vg_gm - id_gm / (*gm);
		free(vgs);
	}
}
#include <stdio.h>
#include <lptdef.h>
#include <pt_execute.h>

double ids_pt(int drain, int gate, int source, int subst, double vds,
	double vgs, double vbs, double delaytime)
{
	int dsmu, gsmu;
	double ids;
	dsmu = pt_smu();
	gsmu = pt_smu();
	parallel_phase {
		conpin(dsmu, drain, KI_EOC);
		conpin(gsmu, gate, KI_EOC);
		conpin(GND, dsmu + 1, gsmu + 1, source, KI_EOC);
		forcev(dsmu, vds);
		forcev(gsmu, vgs);
		rdelay(delaytime);
		measi(dsmu, &ids);
		devint();
		return ids;
	}
}
/*
 * This sample shows how a wrapper is converted to parallel code.
 * The signup_phase block is used here to enclose the return statement,
 * so that ids_pt is not entered at the SIGNUP phase.
 */
#include <stdio.h>
#include <lptdef.h>
#include <pt_execute.h>

double idoff_pt(int drain, int gate, int source, int subst, double vds,
	double delaytime)
{
	double ids_pt(int, int, int, int, double, double, double, double);
	signup_phase {
		pt_signup(pt_this, drain, gate, source, subst, vds,
			delaytime);
		return 0.0;
	}
	return ids_pt(drain, gate, source, subst, vds, 0.0, 0.0, delaytime);
}
#include <stdio.h>
#include <lptdef.h>
#include <pt_execute.h>

void bvdss_pt(int drain, int gate, int source, int subst, double ibkdn,
	double vdlim, double dlytime, double *bvdss)
{
	int dsmu;
	pt_destructive();
	pt_hotpin(drain, KI_EOC);
	pt_gndpin(gate, source, subst, KI_EOC);
	dsmu = pt_smu();
	parallel_phase {
		conpin(dsmu, drain, KI_EOC);
		conpin(GND, dsmu + 1, gate, source, subst, KI_EOC);
		setmode(dsmu, KI_LIM_MODE, KI_VALUE);
		limitv(dsmu, vdlim);
		rangei(dsmu, ibkdn * 10.0);
		forcei(dsmu, ibkdn);
		rdelay(dlytime);
		measv(dsmu, bvdss);
		devint();
	}
}
#include <stdio.h>
#include <lptdef.h>
#include <pt_execute.h>

double capvxi_pt(int hi, int lo, double vpgm, double dlytime)
{
	double cap;
	parallel_phase {
		insbind(CMTR1, SMU1);
		conpin(CMTR1H, hi, KI_EOC);
		conpin(CMTR1L, lo, KI_EOC);
		forcev(CMTR1, vpgm);
		rdelay(dlytime);
		intgc(CMTR1, &cap);
		devint();
		return cap;
	}
}
#include <stdio.h>
#include <lptdef.h>
#include <cmtr_hp4284.h>
#include <pt_execute.h>

double cap4284_pt(int hi, int lo, double freq, double vpgm, double
	dlytime)
{
	float cap, cond;
	parallel_phase {
		if (!HP4284LibInit) {
			c_initialize(1, "");
			setcmtr(1, MODE, CpG, 0.0);
		}
		setcmtr(1, FREQUENCY, freq, 0.0);
		conpin(CMTR2H, hi, KI_EOC);
Appendix B
Glossary