Abstract

The DFT architecture, pattern generation and application, and economic issues encountered in large ASIC designs are increased when at-speed testing is introduced. This case study shows enhancements over the current state of the art hierarchical methodology by improving transition testing, test scheduling, and using compression on a production design.

1.0 Introduction

System-on-Chip (SoC) design focusing on the integration of IP Cores is now a mainstream technology and is becoming quite mature in application. Because of the core based nature in the design of SoC devices, the hierarchical methodology employed for DFT makes use of the natural partitioning and follows the same core-based flow. DFT and test architectures can be segmented and related to the different hierarchical logic groupings. However, due to the increasing size of modern designs plus the inclusion of AC scan tests used to detect the more prevalent speed related defects, the pattern counts are growing at a higher rate than economics and tester memory can match.

The key technologies investigated with this case study design were exactly the management of vector sizing by both hierarchical partitioning and the use of embedded on-chip compression; the inclusion of an enhanced set of AC transition delay vectors; and an evaluation of the methodology needed to allow post-design test scheduling.

The design that is the focus of the case study was a large SoC, with multiple instances of one core. The end customer had very high quality requirements. Because it included the detection of speed-related defects, AC scan type testing using transition delay (TDF) was added to traditional single stuck-at fault (SSF) testing. The AC scan methodology that was applied was enhanced from previously implemented techniques to include TDF coverage at the boundaries of the cores.

The addition of TDF vectors resulted in a larger vector volume that would not fit on the target tester. To this end, vector compression DFT was added to the chip to reduce the memory requirements for the target tester. The most efficient method of adding compression DFT, at design time, was to implement it at the individual core level and to then integrate it at the top-level.

Once the DFT was in place, there were concerns about both test application time and test power consumption. DFT techniques to allow the testing of multiple cores at once were included to reduce test time. However, to provide a level of guarantee against power problems at test, enhancements were added that allow flexibility in choosing which cores and how many to test simultaneously. This design methodology is currently being employed in multiple designs, across multiple sites.

This paper is organized as follows: section 2 presents the motivation for the work; section 3 presents the details and requirements of the design; section 4 covers the pattern-sizing estimate that drove the DFT design; section 5 presents the core-level DFT; section 6 details the top-level DFT; section 7 briefly outlines the role of 1149.1 in the on-chip DFT; section 8 presents the specifics of TDF testing; section 9 discusses the role of the clock-chopping PLL; section 10 reviews some of the challenges encountered in the case study design; section 11 concludes the paper; and finally section 12 outlines future work to be considered.

2.0 Motivation

In this case study design, the size of the device posed a problem with high pattern count. Even considering the inherent compression benefits found in the hierarchical methodology, the introduction of transition fault patterns caused the total tester memory requirements to exceed tester limitations. The target
One key element in the use of a hierarchical methodology is the registration of Inputs and Outputs (I/Os) in the top level blocks of the chip. In many designs, the ports are registered to have accurately characterized blocks (for reuse). Registration is also used to minimize the constraints on global routing. Since the core performance is characterized at the block level, it is not dependent on asynchronous timing from other blocks. This reduces the timing constraints on those inter-block routes. In this hierarchical methodology, the DFT flow takes advantage of those registers to isolate the blocks. This isolation provides the ability to generate patterns for the block that are independent of the logic in other blocks within the design. In the design used with this case study, one major core, the DSP, did not register all I/Os. This posed a challenge for the methodology, requiring special handling for those unregistered ports.

Figure 3 – Ports fan out to multiple Input Partition Registers

Secondly, in some cases, the combinational logic had feedbacks to input partition registers from logic internal to the core (within the partition registers, Figure 4). This caused a problem in testing the final phase, or the top level logic. In the testing of the top level logic, all partition scan chains were placed in scan mode. Patterns were fed into those registers while the cores (inside the partition chains) were in “don’t care” states. However, when there was feedback from an internal node back to the input of a partition scan chain, it caused an unknown to be present on the partition register. This caused a loss of test coverage at that register and the cones of logic fed by that register.
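
The effect of internal feedback on an input partition register can be illustrated with a small three-valued logic model. The sketch below is only an illustration under stated assumptions, not the flow used in the case study: the function names and the AND gate in front of the register are hypothetical, and an unknown (X) is represented by None. It shows that an X arriving from inside the core is captured by the partition register unless the port side holds a controlling value.

```python
from typing import Optional

# Three-valued logic value: 0, 1, or None for an unknown (X).
Logic = Optional[int]


def and3(a: Logic, b: Logic) -> Logic:
    """Three-valued AND: a 0 on either input forces 0, otherwise any X gives X."""
    if a == 0 or b == 0:
        return 0
    if a is None or b is None:
        return None
    return 1


def captured_value(port_value: Logic, core_feedback: Logic) -> Logic:
    """Value captured by a hypothetical input partition register.

    Assumption for illustration: the register's D input is an AND of the
    port value and a feedback net driven from logic internal to the core.
    """
    return and3(port_value, core_feedback)


# During top-level test the core internals are "don't care", so the
# feedback net is modeled as unknown (None).
x_feedback: Logic = None

for port in (0, 1):
    value = captured_value(port, x_feedback)
    shown = "X" if value is None else str(value)
    print(f"port={port} feedback=X -> captured={shown}")
```

With the port at 1 the register captures X, so no fault observed through that register can be credited. With the port at 0 the controlling value blocks the X, but the feedback path itself is then untested.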
P3 3 1 1 1 1
P4 4 1
P5 5 1
P6 6 1
P7 7 1
P8 8 1
P9 9 1
P10 10 1
P11 11 1
P12 12 1
P13 13 1
P14 14 1
P15 15 1
6.0 Chip Level

In the chip level testing, the strategy of using multi-phase muxes was used [1]. In multi-phase muxing, the scan in and scan out from multiple cores were multiplexed into and out of the chip. Using this technique, multiple cores can use the same few I/O pins available for use in test. In previous designs, the selection of cores under test was controlled through primary inputs to the chip.

This methodology was enhanced by using JTAG instructions to control the selection of the blocks under test. Using JTAG for control reduced the number of dedicated package pins required for test. It does add a level of complexity in the application of test patterns to the device, but this is essentially a test procedure that is set up and automated in the EDA tools.

During the chip level test, each of the cores was tested independently in each phase. The final phase focused on testing the top level logic, which is outside of the defined cores. In this case study design, compression technology was used in all the major blocks but was

• Phase 2 - the RX/TX cores, the Sand and CompoundTop cores are tested in parallel.
• Phase 3 - the top level logic with all partition scan chains from each core are tested at the same time.
• Phases 4 to 15 - each core can be tested individually.

A decision was made to have the default operation test four cores at once. As the table describes, in Phase 0, four of the DSP Hard Macros were tested in parallel. Patterns for Hard Macro 0 are delivered through Mux1. Patterns for Hard Macro 2 are delivered through Mux2. Mux3 provides patterns to Hard Macro 4 and Mux4 supplies patterns to Hard Macro 6.

The one exception to this strategy is the top level testing. In the final phase, the scan chains from all of the top level logic are brought out to package pins along with all partition chains from the cores. This is the technique which picks up the combinational logic that resides outside the partition cells in the cores. It also covers the cones of logic that reside between the registers in the top level and the ports in the cores. In this final phase, all 32
Figure 7 – Transition Pattern Origin

[Block diagram; signal and block labels: CLK, MASK0[7:0] to MASK3[7:0], CLK_EN0 to CLK_EN3, MASK SHIFT STAGE, CLOCK GATING, GATED_CLK, GATED_CLK_1/2, GATED_CLK_1/4, GATED_CLK_1/8, SCAN_CHAIN VALUES, LAUNCH_CAPTURE, EDGE DETECT, SHIFT_IN_BIT, LD_SHFT & CTRL, BEGIN_AC, TEST_EN]

The reason for the scan_enable signals and a lock-up flop is to ensure that there is a valid state during the two clock pulses required for at-speed scan testing. In a hierarchical implementation, it is necessarily assumed that Xs propagate from outside the block. After loading the scan chain for a transition fault pattern, the scan_enable signal for the partition chains