
Test Response Compression and Bitmap Encoding

for Embedded Memories in Manufacturing Process Monitoring

John T Chen Jitendra Khare*1 Ken Walker* Saghir Shaikh* Janusz Rajski** Wojciech Maly

Carnegie Mellon University, Pittsburgh, PA 15213


jtchen@ece.cmu.edu
*Intel Corporation, Sacramento, CA 95827
**Mentor Graphics Corporation, Wilsonville, OR 97070

1. Now at Ample Communications, Fremont, CA 94538

Abstract

This paper introduces a method that enables the diagnosis of embedded memories via test response compression and automatic bitmap recognition. The proposed method has been tested via simulation with various memory specifications, fail patterns and test algorithms; it has also been implemented in a 0.18 µm CMOS test chip.

Keywords: BIST, embedded memories, process monitoring, bitmap, diagnosis, RAM testing, memory repair.

1 Introduction

Traditionally, diagnosis of stand-alone static random access memories (SRAMs) serves two purposes. First, if the coordinates of the failing cells can be determined, the failing memories may be repaired by laser-cutting to improve production yield.

Second, the result of diagnosis can provide feedback to IC manufacturing [1][2][3][4]. A defect in an SRAM, depending on its characteristics, causes a group of cells to fail, forming a fail pattern. The test response of a faulty SRAM can be used to generate a bitmap as a vehicle of diagnosis, in which each pixel represents the behavior of one cell. From a collection of these bitmaps, the fail patterns may be identified and mapped to their potential defects. This helps to improve and/or maintain yield by facilitating or eliminating the need for expensive and time-consuming physical failure analysis.

Because testing of stand-alone SRAMs is usually done externally, the test responses are readily available to generate bitmaps. Embedded memories, on the other hand, are inherently difficult to test externally due to their limited controllability and observability. In addition, assigning memory ports to I/O pins is often infeasible because IC designs today are becoming more pin-limited than area-limited. Thus, Built-in Self Test (BIST) has become a popular method to test embedded SRAMs. The BIST circuit generates the input vectors on-chip and compacts the output response of the SRAM into either a “signature” or a “PASS/FAIL” signal to reduce the bandwidth. However, the bandwidth-saving output of the traditional BIST does not provide enough information for detailed diagnosis.

Due to the System on Chip (SOC) trend, there is an increasing usage of embedded memories. As a result, the ability to diagnose embedded SRAMs has become more important.

Several methods that attempt to address the bandwidth problem exist. In one method, additional probe pads are added to the wafer for diagnosis, but probe pads result in a substantial increase in the circuit area and become useless after packaging. In another method, when a fail is detected, the test execution is halted to scan out register values [5][6]. If fails occur frequently, the test time is unreasonably long.

In [7], a BIST circuit is proposed to recognize fail patterns on-chip during testing. The locations and types of fail patterns are scanned off-chip at the end of the test. Although eliminating the need for traditional bitmap classification, this hardware will capture only certain types of fail patterns (single cells and columns) but will not record the different behaviors of the failing cells.

We proposed a method in [8][9] to compress the test responses on-chip to reduce the number of I/O pins needed to export a test response. The method compresses the width of the test response from the word-width of the memory to 6 bits. The compressed test response is then captured during production testing and can be decompressed on a workstation to reconstruct the original test response. The reconstructed test response is almost always identical to the original response. Thus, the decompressed response may be used to generate a bitmap for the purpose of manufacturing feedback, just like the original response.

However, such high quality reconstruction of the original test response may not be necessary for manufacturing feedback. Uncertainty in a bitmap generated from a decompressed response can be tolerated as long as its fail patterns can be identified. The ability to tolerate imperfect decompression can contribute to a higher compression ratio and a lower area overhead.
Building on this observation, this paper introduces a technique, composed of two steps, for embedded memory diagnosis. First, a “stripped-down” compressor modified from [8][9] is described. It compresses the width of the test response to 2 bits, instead of the original 6 bits, but requires only 1/3 of the original area overhead. However, it does not always preserve all the data. Thus, the decompressed test response cannot be used directly for diagnostic purposes like the original test response. To overcome the loss of data, in the second step a novel method to recognize bitmaps that are imperfect due to lossy compression is proposed.

This methodology is applicable to both SRAMs and DRAMs (Dynamic Random Access Memories), but this discussion focuses mainly on SRAMs because they are the dominant embedded RAMs used today. This paper concentrates on diagnosis for manufacturing feedback rather than on laser repair, although the proposed method provides a solution for both.

The rest of the paper is organized as follows: Section 2 introduces the background and outlines the important objectives that the proposed method must meet. In Section 3, the compression/decompression method modified from [9] is described. Section 4 introduces a method by which bitmaps generated from decompressed test responses can be automatically recognized. Section 5 shows the simulation results of applying the proposed methodology to a variety of test algorithms and fail patterns. Finally, Section 6 reports the implementation results of the proposed method.

2 Background

A fail vector is a bit-wise comparison of the memory output with the expected output, as shown in Fig. 1. The width of a fail vector is inherently equal to the word width, denoted as m. A 0 in the fail vector indicates that the corresponding bit of memory output matches its expected value; a 1 indicates a mismatch.

Successive fail vectors generated from a test execution compose a fail matrix, as shown in the bottom of Fig. 1. The fail matrix is the test response which will be compressed and decompressed by our method.

Figure 1. Fail matrix

2.1 Representative Fail Patterns

Several examples of monochrome 8x8 bitmaps with fail patterns are shown in Fig. 2. The white pixels represent good cells, and the black pixels represent failing cells. Aside from a single failing cell (a), the failing cells may also form a row (b) or a column (c). Further, a combination of a column and a row of failing cells may also occur (d).

(a) (b) (c) (d)
Figure 2. Examples of fail bitmaps

The above list of 4 fail patterns is by no means an exhaustive collection; obviously, more complicated fail patterns exist. However, they do represent the basic geometric nature of fail patterns, due to the topology of the SRAM circuit.

The more fail patterns can be differentiated, the higher the manufacturing diagnostic resolution [2]. In the simplest example, a single “failing” cell may be further categorized as a “stuck-at-0” or “stuck-at-1” cell. Color bitmaps, instead of monochrome bitmaps, will be necessary to differentiate fail patterns and will do so by indicating the failing behaviors of cells with different colors. Our method is capable of producing color bitmaps.

2.2 Example Fail Matrices

Other than the fail pattern, the fail matrix is also affected by the test algorithm, memory organization and addressing scheme. Using the fail patterns in Fig. 2, a few sample fail matrices will be constructed in this section to demonstrate the type of data that will be compressed.

First, the arrays from Fig. 2 are organized in an 8x2x4 fashion, as in Fig. 3(a). Because the proposed method works with the fast row addressing scheme as shown in Fig. 3(b), the fast row scheme will be used instead of the fast column.

The march test algorithms, frequently used in production testing, are composed of several traversals through the address space. One such traversal through the address space which expects to read 0s will be used to generate the example fail matrices; assuming that all failing cells in Fig. 2 are stuck-at-1s, reading any failing cell would produce a mismatch at the comparator.

(a) (b)
Figure 3. (a) An 8x2x4 memory organization, (b) fast row addressing scheme

Given the above specifications, the fail matrices corresponding to the fail bitmaps in Section 2 will be produced, as shown in Fig. 4. Since the failing bit in Fig. 2(a) belongs to the read port Q1, a 1 occurs in the corresponding column in the fail matrix Fig. 4(a). The failing row of Fig. 2(b) results in a fail matrix with rows of 1s, separated by the height of the array, as in Fig. 4(b). The failing column of Fig. 2(c) produces a column of 1s, with its height equal to the height of the memory array, as in Fig. 4(c). Since the bitmap of Fig. 2(d) is a combination of bitmaps (b) and (c), the matrix of Fig. 4(d) is also composed of matrices (b) and (c).

(a) (b) (c) (d)
Figure 4. Examples of fail matrices

Since the march algorithms are composed of several such traversals, the matrices in Fig. 4 resemble portions of actual fail matrices from production testing. Because the bitmaps in Fig. 2 represent the basic geometric nature of fail patterns, the patterns in Fig. 4 exemplify the kind of geometry that a compression method should target.

2.3 Compression Criteria

With traditional BIST, the generation of bitmaps requires a number of pins equal to the word-width, which is often impractical. To overcome this limitation, it is necessary that the test response be compressed as much as possible while maintaining the information needed to identify the fail patterns. In addition, the following characteristics must also be met to ensure that the proposed method is practical:

1. Enabling of production test: the time required should be short enough for production testing.
2. Scalability: the method must handle memories of different dimensions.
3. Compatibility with traditional BIST methodology.
4. The ability to generate color, instead of monochrome, bitmaps.
5. Handling most production test algorithms [10].
6. Small area overhead.

3 Compression/Decompression Algorithm

3.1 Hardware Compression

The compressor to be shown here is modified from the compressor proposed in [9]. It generates 2 bits of output per read operation of memory. The compression ratio is the width of a fail vector divided by the output width of the compressor. For example, if the word width is 32, the compression ratio is 16. The two bits of output are the results of two functions applied to two different types of fail matrix subsets: column segments and rows.

3.1.1 Apply OR to Column Segments

The fail matrix can be decomposed into column segments in the following way: First, the left-most column is fragmented into equal length segments. Then, each subsequent column is segmented in the same fashion with its segmentation alignment shifted down by one element with respect to the previous column. Fig. 5 shows a fail matrix with its column segments inscribed by ovals. The OR function is computed for each column segment. The OR is chosen because it is able to encode all 0s or a single 1 in a subset, as will be explained in Section 3.2.1.

Figure 5. Applying OR function to column segments

By setting the column segment size to m, the OR produces an output for each fail vector, conveniently forming an output stream.
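The two OR functions of Section 3.1 can be illustrated in software. The following is a minimal sketch, not the hardware compressor of the paper: the names `segment_index`, `compress` and `compression_ratio` are hypothetical, the segment length is assumed equal to the word width m, and the short segments at the top and bottom edges of the matrix are assumed to be treated as ordinary (shorter) segments. It returns the row ORs (described in Section 3.1.2) and the column-segment ORs as separate collections rather than as the 2-bit output stream.

```python
from typing import Dict, List, Tuple

Matrix = List[List[int]]          # one fail vector (list of 0/1) per read
SegmentKey = Tuple[int, int]      # (column, segment index within that column)


def segment_index(row: int, col: int, seg_len: int) -> int:
    """Index of the column segment containing element (row, col).

    Segment boundaries in column `col` are shifted down by `col` rows.
    Index -1 is the short segment above the first boundary (an assumption
    about how the top edge of the matrix is handled).
    """
    return (row - col) // seg_len


def compress(fail_matrix: Matrix) -> Tuple[List[int], Dict[SegmentKey, int]]:
    """Return (row ORs, column-segment ORs) for a fail matrix."""
    m = len(fail_matrix[0])       # word width, also used as segment length
    row_or = [int(any(vec)) for vec in fail_matrix]
    seg_or: Dict[SegmentKey, int] = {}
    for r, vec in enumerate(fail_matrix):
        for c, bit in enumerate(vec):
            key = (c, segment_index(r, c, m))
            seg_or[key] = seg_or.get(key, 0) | bit
    return row_or, seg_or


def compression_ratio(word_width: int, output_bits: int = 2) -> float:
    """Fail-vector width divided by the compressor output width."""
    return word_width / output_bits


# Example: 8 fail vectors of width 4 with one mismatch on bit 1 of read 1.
fail_matrix = [[0, 0, 0, 0] for _ in range(8)]
fail_matrix[1][1] = 1
row_or, seg_or = compress(fail_matrix)
```

In this example only the row OR of read 1 and the OR of the column segment holding element (1, 1) are set; every other OR is 0.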
3.1.2 Apply OR to Rows

Fig. 6 shows an example of a fail matrix with its rows inscribed in ovals. An OR function is applied to each row, which results in a 0 if the row contains all 0s; otherwise it results in a 1. This is essentially how the “PASS/FAIL” signal is generated in traditional BIST; thus, no additional hardware overhead is required.

Figure 6. Applying OR function to rows

3.1.3 Multiple-Memory Compression

Memories of interest for the purpose of diagnosis may each contain their own dedicated compressor or share a single compressor. Dedicated compressors have a larger area overhead. However, this may be advantageous for interconnect-limited designs as fail vectors do not need to be routed to a single compressor. If the memories are tested in parallel, additional I/O pads may be needed because each compressor will generate its own compressed test response.

If a shared compressor is used, the compressor may need to be designed to handle the different word-widths of different memories. If the memories are tested in parallel, the fail vectors of different memories can be multiplexed to the compressor; only the faulty fail vector is selected to be compressed.

3.2 Software Decompression

The fail matrix is reconstructed by first initializing a matrix identical in dimension to the original fail matrix with all its elements marked as unknowns, denoted as Xs. The Xs are replaced by 0s and 1s using the determination rules.

3.2.1 Determination Rules

There are two determination rules. The first determination rule, as depicted by Fig. 7(a), states that if the result of an OR applied to a set of elements is a 0, then it may be safely assumed that all elements within the set are 0s. Conversely, the second determination rule, as shown by Fig. 7(b), states that if the result of an OR function applied to a set of elements is a 1, it can be assumed that at least one 1 exists in the set; if all but one element has already been assigned a 0, a 1 may be assigned to the only undetermined element.

(a) x x x x, OR=0 -> 0 0 0 0
(b) 0 0 0 x, OR=1 -> 0 0 0 1
Figure 7. Determination rules

3.2.2 Determination Order

The process of applying determination rules to a group of elements is referred to as a determination. Notice that the second determination rule depends on elements that were previously determined to be 0s. Thus, the first determination rule must be applied to the subsets before the second determination rule. The process ends either when all Xs have been removed or when all the determination rules have been applied to all subsets.

3.2.3 Decompressing Sample Fail Matrices

In this section, the fail matrices introduced in Fig. 4 will be used to verify the compression/decompression method. The compression step is not explicitly shown due to its trivial nature, but its OR results are provided as shown in Fig. 8.

For the fail matrix shown in Fig. 4(a), we will only look at the first 8 rows, which include the only 1 in the fail matrix, as shown in the left-most matrix of Fig. 8(a); it is obvious that the other rows, which are all-0 rows, can be successfully determined. First, all 0s are identified by both rows and column segments that resulted in OR=0. The remaining X is then removed by applying the determination rule of OR=1. The entire fail matrix is reconstructed.

For Fig. 4(b), the first 8 rows will be used to demonstrate the decompression method since the last 8 rows are identical to the first 8 rows. As shown in Fig. 8(b), the three rows of 0s are first identified because they resulted in 0s for the OR. The remaining row of Xs is then determined by applying the determination rule of OR=1 to the intersecting column segments. This fail matrix is also fully reconstructed.

The first 8 rows of Fig. 4(c) will be examined, as shown in Fig. 8(c). The Xs of column segments that resulted in OR=0 are replaced by 0s. The remaining Xs are then determined by applying the determination rule of OR=1 to the intersecting rows. All remaining Xs are determined to be 1s. This fail matrix is fully reconstructed too.

For Fig. 4(d), the decompression of the last 8 rows has already been shown in Fig. 8(b). The first eight rows will be examined in Fig. 8(d). First, the Xs in columns that resulted in OR=0 are replaced with 0s. Then, the rows that contain only a single X are determined. Further determination does not lead to the removal of any Xs. As a result, the decompression process ends without completely reconstructing the fail matrix.
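The determination rules and their ordering can be sketched in software as follows. This is an illustrative sketch under stated assumptions, not the authors' decompressor: `build_subsets` and `decompress` are hypothetical names, `None` stands for an X, the column segments are assumed to have length m with truncated segments at the matrix edges, and the 8x4 matrices below are made-up examples in the spirit of Fig. 4 rather than reproductions of Fig. 8.

```python
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]
Subset = Tuple[List[Cell], int]   # (cells of a row or column segment, its OR)


def build_subsets(fail_matrix: List[List[int]]) -> List[Subset]:
    """Rows plus shifted column segments of length m, each with its OR."""
    m = len(fail_matrix[0])
    groups: Dict[tuple, List[Cell]] = {}
    for r in range(len(fail_matrix)):
        for c in range(m):
            groups.setdefault(("row", r), []).append((r, c))
            groups.setdefault(("seg", c, (r - c) // m), []).append((r, c))
    return [(cells, int(any(fail_matrix[r][c] for r, c in cells)))
            for cells in groups.values()]


def decompress(n_rows: int, n_cols: int,
               subsets: List[Subset]) -> List[List[Optional[int]]]:
    """Rebuild a fail matrix from subset ORs; None marks an undetermined X."""
    grid: List[List[Optional[int]]] = [[None] * n_cols for _ in range(n_rows)]
    # Rule 1 first: OR = 0 means every element of the subset is 0.
    for cells, or_bit in subsets:
        if or_bit == 0:
            for r, c in cells:
                grid[r][c] = 0
    # Rule 2, repeated until nothing changes: OR = 1 with exactly one X
    # left and all other elements 0 means that X must be a 1.
    changed = True
    while changed:
        changed = False
        for cells, or_bit in subsets:
            if or_bit == 0:
                continue
            unknown = [(r, c) for r, c in cells if grid[r][c] is None]
            no_one_yet = all(grid[r][c] != 1 for r, c in cells)
            if len(unknown) == 1 and no_one_yet:
                r, c = unknown[0]
                grid[r][c] = 1
                changed = True
    return grid


# A single failing bit is recovered exactly.
single = [[0, 0, 0, 0] for _ in range(8)]
single[1][1] = 1
rec_single = decompress(8, 4, build_subsets(single))

# A failing row crossing a failing column leaves some Xs undetermined.
cross = [[1 if (r == 2 or c == 1) else 0 for c in range(4)] for r in range(8)]
rec_cross = decompress(8, 4, build_subsets(cross))
```

Every element the sketch does determine agrees with the original matrix; the loss shows up only as remaining Xs, which is the property the recognition method of Section 4 relies on.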
(a) (b) (c) (d)
Figure 8. Determination of fail matrices

3.2.4 Decompressed Fail Matrix

At the completion of decompression, if all Xs have been removed, the fail matrix is fully reconstructed without any loss of information. In general, reconstructed fail matrices of fail patterns containing two or more rows or columns of the same behavior (both SA0 or both SA1) will contain Xs. As a result, bitmaps generated from reconstructed fail matrices, denoted as reconstructed bitmaps, may also contain unknown pixels, also denoted as Xs. For instance, Fig. 9(a) shows an actual bitmap generated from a raw fail matrix without compression. In Fig. 9(b), cells whose behavior is unknown are shown in grey.

(a) (b)
Figure 9. (a) Bitmap generated from raw fail matrix; (b) bitmap generated from decompressed fail matrix

3.2.5 Efficiency and Run-Time

After applying the first, independent determination rule to all rows and column segments, the remaining Xs typically cluster and occupy small regions of the matrix. The decompression time can be significantly reduced by marking these regions during the first pass of determinations. The subsequent determinations will be applied only to subsets in those regions.

The decompression run-time has a linear relationship with the size of the fail matrix. In our experience, decompression of a 16k x 32 SRAM tested by the March C- algorithm (6X, 5 read passes) requires approximately 5 seconds on a 300 MHz UltraSparc II.

4 Proposed Bitmap Recognition Method

The reconstructed bitmaps with Xs can be manually recognized and classified after comparing a few examples of original and reconstructed bitmaps. Reconstructed bitmaps also contain enough information for laser repair. On the other hand, automated recognition is necessary for rapid manufacturing feedback.

Current methods to recognize fail bitmaps [11][12][13][14] are not designed to handle bitmaps with Xs. Thus, a novel method to recognize bitmaps is proposed here. In our method, the fail pattern within a bitmap is encoded into a signature to reduce data size while maintaining its uniqueness; i.e., no two fail patterns should produce the same signature, although each fail pattern may produce many different signatures.

4.1 Center of Interest

Because a defect usually affects only cells located in the same columns and rows as the defect, the first step in creating a fail pattern signature is to identify these columns and rows of interest. They are identified by finding the consecutive rows and consecutive columns with the most non-passing pixels (pixels that are either failing or unknown). The example in Fig. 10 shows the columns and rows of interest of a bitmap enclosed by dotted lines. Because defect sizes are rarely larger than 2x2 cells, the widths of these columns and rows are set to 2.

Figure 10. Center of Interest

The 2x2 pixel square where the rows and columns overlap is the center of interest (COI). For a fail pattern with only a single cell, such as the one shown in Fig. 2(a), the COI encloses the failing cell. For a failing column, such as the one in Fig. 2(c), the COI may lie anywhere along the column.

If multiple fail patterns are to be recognized on a single bitmap, traditional pattern recognition may be employed to differentiate the fail patterns before identifying the COIs.

4.2 Signature Generation

A fail pattern signature is composed of three sub-sections: the center, the extension and the continuity signatures.

4.2.1 Center Signature

The center signature is composed of the pixel values of the COI. The center pixels are included in the signature because many fail patterns, such as the one shown in Fig. 2(a), can be identified only by their COI.

4.2.2 Extension Signatures

The segments of cell pairs to the top, bottom, left and right of the COI are the 4 extensions. Each extension generates a signature by the following process: Scanning from the pixel pairs closest to the COI, the very first and any non-repeating pixel pair are recorded as the signature. A unique threshold is used to limit the number of pixel pairs recorded into the signature for each extension.

Assuming that the symbols “.”, “1”, “0” and “?” represent the memory cell behaviors PASS, SA1, SA0 and UNKNOWN respectively, setting the unique threshold to 3 would lead to the example shown in Fig. 11.

Figure 11. Example of the encoding of an extension.

A unique threshold of 3 should be able to produce unique extension signatures in almost all cases. Because the method runs quickly, the threshold value can easily be fine-tuned by trial and error, if necessary.

4.2.3 Continuity Signatures

Finally, the continuity signatures record whether the failing rows and columns span the whole memory array. Because of the unique threshold, the rows (columns) of interest are not always encoded in their entirety, which leaves out information that would differentiate partial rows (columns) from whole rows (columns). The continuity signatures fill in this information.

4.3 Building Signature Dictionary

A dictionary is created to record the relationship between fail patterns and their signatures. The dictionary may be constructed using bitmaps generated by a software program or provided by historical fabrication data or both. With this dictionary, bitmaps collected from a fabrication line can be “looked up” and their fail patterns determined for process monitoring. If the pattern signature of a bitmap cannot be found, the bitmap can be examined manually and its signature appended to the dictionary, improving the dictionary’s coverage.

4.3.1 Signature Compatibility

The fail pattern signatures generated by the proposed method are independent of memory size and dimension. This simplifies the process monitoring task because the engineer does not need to maintain many different dictionaries for different memory sizes. The scalability also means that even if the dictionary is meant to recognize bitmaps of a large memory (over 1 Mb), it can be built using a very small memory. Building a dictionary using a smaller memory, besides reducing the computation time for each bitmap, holds another advantage: the number of runs needed is reduced because smaller bitmaps generate different signatures, including many rare corner cases, more frequently than larger bitmaps.

The minimal specifications of the SRAM used to build the dictionary depend on the fail patterns expected. In general, we have found the following set of rules to be useful: First, the block width must be larger than the width of the COI. The word width needs to be larger than 2 times the unique threshold. The word width should also be larger than the width and height of the COI. The height of the SRAM should be equal to or larger than 2 times the word width. These parameters may also be fine-tuned easily via trials because the method runs quickly.
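The extension encoding of Section 4.2.2 and the dictionary lookup of Section 4.3 can be sketched as follows. This is only an illustration under assumptions: the function and class names are hypothetical, each pixel pair is written as a two-character string over the symbols “.”, “1”, “0” and “?”, “non-repeating” is read as “different from the pair scanned just before it”, and the center and continuity parts are passed in as plain strings because their exact encoding is not specified here.

```python
from typing import Dict, List, Optional, Tuple

Signature = Tuple[str, ...]


def extension_signature(pixel_pairs: List[str],
                        unique_threshold: int = 3) -> List[str]:
    """Encode one extension, scanning outward from the COI.

    The first pair and every pair that differs from the previous one are
    recorded, up to `unique_threshold` pairs.
    """
    signature: List[str] = []
    for pair in pixel_pairs:
        if len(signature) == unique_threshold:
            break
        if not signature or pair != signature[-1]:
            signature.append(pair)
    return signature


def fail_pattern_signature(center: str, extensions: List[List[str]],
                           continuity: str,
                           unique_threshold: int = 3) -> Signature:
    """Join center, four extensions (top, bottom, left, right), continuity."""
    parts = [center]
    for ext in extensions:
        parts.append("|".join(extension_signature(ext, unique_threshold)))
    parts.append(continuity)
    return tuple(parts)


class SignatureDictionary:
    """Maps signatures to fail pattern names; unknown signatures can be added."""

    def __init__(self) -> None:
        self.entries: Dict[Signature, str] = {}

    def lookup(self, signature: Signature) -> Optional[str]:
        return self.entries.get(signature)

    def learn(self, signature: Signature, fail_pattern: str) -> None:
        self.entries.setdefault(signature, fail_pattern)


# A stuck-at-1 row seen in a small and in a larger (hypothetical) bitmap.
small = fail_pattern_signature("11..", [[".."] * 3, [".."] * 4,
                                        ["1."] * 5, ["1."] * 2], "row")
large = fail_pattern_signature("11..", [[".."] * 200, [".."] * 300,
                                        ["1."] * 40, ["1."] * 90], "row")
dictionary = SignatureDictionary()
dictionary.learn(small, "Row Stuck-At")
recognized = dictionary.lookup(large)
```

Because repeated pixel pairs collapse into a single entry, the two bitmaps yield the same signature even though their extensions differ in length, which is the size independence discussed in Section 4.3.1.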
The signatures are dependent on the test algorithm used because the locations of the Xs on a reconstructed bitmap are dependent on the test algorithm. However, this can be remedied. Although many test algorithms exist, they usually share common components. Thus, by utilizing only the portion of the test response from the part of the test algorithm common to other test algorithms, comparable bitmaps and comparable signatures can be generated.

5 Software Simulation

An experiment to determine the quality of this scheme is carried out by first building a signature dictionary using a smaller memory, then testing the dictionary with a larger memory. This is repeated for three different test algorithms.

5.1 Simulation Setup

5.1.1 Simulation Flow

Fig. 12 details the flow of one simulation run. First, a bitmap is generated from a fail pattern. Different bitmaps can be generated from one fail pattern. For example, a fail pattern may define a row to be stuck-at-0. The bitmaps generated from this fail pattern will all contain a row of cells stuck-at-0, but the row number is randomly chosen.

Figure 12. Simulation data flow

By emulating the test execution, a fail matrix may be obtained from the bitmap. Using the algorithm described in Section 3, the fail matrix is first compressed into a 2-bit wide compressed fail matrix, then decompressed to a decompressed fail matrix. A bitmap is then generated from the decompressed fail matrix. Using the steps proposed in Section 4, a signature is produced from the bitmap. The width and height of the COI are set to 2, and the unique threshold to 3. The signature is looked up in the signature dictionary. If a match is found, the recognition is a success. If no match is found during dictionary building, the new signature is appended to the dictionary.

During the simulation, a bitmap is generated for every fail pattern in a loop; each loop is called a learning cycle. For dictionary building using the smaller memory, the learning cycle is repeated 1000 times, and during dictionary testing, 100 times.

5.1.2 Memory Specifications

To demonstrate the scalability of this method, the dictionary will be built using a smaller memory (32x8x8 or 2Kbit) and then tested by a larger memory (512x16x32 or 256Kbit) as shown in Table 1. Multiplying the block widths and word widths in Table 1 results in the widths of the memory arrays. The last column of the table, compression ratio, is derived by dividing the word width by the width of the compressor output, 2. The compression ratio increases with the word width.

Table 1. Specifications of memories for software simulation
RAM #   Block Height   Block Width   Word Width   Compression Ratio
1       32             8             8            4
2       512            16            32           16

5.1.3 Fail Patterns

A total of 70 fail patterns are used in this experiment. It is not possible to describe all fail patterns in detail. However, they are categorized into 16 classes as shown in Table 2. The integer in parentheses after each class represents the number of fail patterns in that class.

Table 2. Fail pattern classes
1. Single Cell Stuck-At (3)
2. 2 Neighboring Cells Stuck-At (8)
3. 3 Cells Stuck-At (8)
4. 2x2 Cells Stuck-At (7)
5. Two Diagonal Cells (3)
6. Row Stuck-At (3)
7. Two Rows Stuck-At (3)
8. Partial Row Stuck-At (3)
9. Row Stuck-At 010101 (2)
10. Column Stuck-At (3)
11. Two Columns Stuck-At (6)
12. Column Partially Stuck-At (3)
13. Row Stuck-At, Column Stuck-At (4)
14. Row, Partial Column Stuck-At (4)
15. Partial Row, Partial Column Stuck-At (4)
16. Two Columns, Two Rows Stuck-At (6)

Other, and possibly more complicated, fail patterns may occur. However, for process monitoring purposes, it is sufficient to be able to identify the majority of fail patterns, particularly those whose probable causes are well understood.

5.1.4 Test Algorithms

Given a bitmap, the generation of the fail matrix depends on the test input vector, which is determined by the test algorithm used. Thus, three commonly used test algorithms, namely the “March C-,” “March C+” and “Topological Checkerboard” test algorithms, are selected to evaluate the compression/recognition system.

5.2 Simulation Results

5.2.1 Dictionary Building

The quality of this system is measured by the recognition ratio: the ratio of successfully recognized fail patterns to all fail patterns attempted to be recognized. During dictionary building, the recognition ratio is calculated for each learning cycle, which indicates the “learning progress” of the dictionary. The recognition ratios of bitmaps generated from March C-’s test response during dictionary building are plotted in Fig. 13. The recognition ratio improves with respect to the logarithm of the number of learning cycles because the probability of creating a signature not already in the dictionary decreases as the recognition ratio increases.

Figure 13. Recognition ratio during dictionary building (x-axis: Learning Cycle; y-axis: Recognition Ratio (%))

The average recognition ratios of the last 3 learning cycles are displayed in Table 3. The recognition ratio of the dictionary when used to recognize a large bitmap is expected to be higher because the probability of creating a case which is not yet in the dictionary is lower for a larger memory. No two fail patterns created the same signature during the dictionary building.

Table 3. Average recognition ratio of RAM1
March C-   March C+   Top. Chkrbrd.
99.0%      96.7%      98.1%

On a 300 MHz UltraSparc II, dictionary building for March C- consumes approximately 40 minutes; March C+, 1 hour; Checkerboard, 30 minutes. Although this is well tolerable, dictionary collection time can be reduced. Because some fail patterns produce many more signatures than others, the run-time can be greatly reduced by generating bitmaps more frequently for those fail patterns that produce more signatures.

Notice that RAM2 is 128 times larger than RAM1. If the signatures of different memory sizes were not compatible, RAM1 could not be used to build the signature dictionary for RAM2. Both the time per simulation run and the number of runs required to achieve the same recognition ratio increase with memory size, so the run time would increase by a factor of 128^2. Constructing the same dictionary using RAM2 with the March C- test algorithm would consume 10,923 hours, or 15 months, instead of 40 minutes using RAM1.

5.2.2 Recognition of RAM2

The recognition ratios of different test algorithms for RAM2 are shown in Table 4. Notice that the ratios are consistently high, regardless of the test algorithm used. Also, no mismatch occurred, i.e., the signature dictionary did not misidentify any fail pattern.

Table 4. Average recognition ratio of RAM2
March C-   March C+   Top. Chkrbrd.
100%       97.0%      99.5%

In Table 5, the average recognition ratios of the fail pattern classes are shown. Recognition is consistently high.

5.2.3 Production Test Data Flow

In our simulation for the test response of RAM2 tested by March C-, the average time consumed for decompression is 1.5 seconds; for bitmap generation, 1.5 seconds; and for recognition, 0.5 second. In other words, once the test response of a failing memory has been transferred from the tester to a workstation, 3.5 seconds are required to construct and classify the bitmap. This time requirement should be short enough to monitor IC manufacturing processes.

6 Hardware Implementation Experiment

The above method, along with two embedded SRAMs and a BIST circuit, is implemented in a test chip along with other test structures. A total of 19 wafers are manufactured in a 0.18 µm mixed-signal process.

6.1 Design Specifications

The SRAMs used in this experiment are generated using a commercial memory compiler according to the specifications shown in Table 6.

Table 5. Averaged recognition ratio for each fail pattern class
Fail Pattern Class   1     2     3     4     5     6     7     8     9     10    11    12    13    14    15    16
Recog. Ratio (%)     100   100   100   99.9  100   100   100   100   100   99.9  100   99.9  99.0  97.3  90.9  98.6
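
As a reading aid for Table 5, the per-class ratios can be summarized with a few lines of Python. This is only a sketch under our own assumptions: the values are transcribed from Table 5, the helper name `summarize` and the 99% threshold are illustrative choices, and the unweighted mean across classes is not a figure reported in the paper (the averages in Table 4 may weight the classes differently).

```python
# Values transcribed from Table 5: fail pattern class -> recognition ratio (%).
TABLE5 = {
    1: 100.0, 2: 100.0, 3: 100.0, 4: 99.9,
    5: 100.0, 6: 100.0, 7: 100.0, 8: 100.0,
    9: 100.0, 10: 99.9, 11: 100.0, 12: 99.9,
    13: 99.0, 14: 97.3, 15: 90.9, 16: 98.6,
}


def summarize(ratios, threshold=99.0):
    """Return (unweighted mean, worst class, classes strictly below threshold).

    The threshold is an arbitrary illustrative cut-off, not one from the paper.
    """
    mean = sum(ratios.values()) / len(ratios)
    worst = min(ratios, key=ratios.get)
    below = sorted(c for c, r in ratios.items() if r < threshold)
    return mean, worst, below


mean, worst, below = summarize(TABLE5)
print(f"unweighted mean over classes: {mean:.2f}%")
print(f"lowest class: {worst} ({TABLE5[worst]}%)")
print(f"classes below 99%: {below}")
```

Under these assumptions, the summary singles out the last few classes as the ones where the dictionary is weakest, with class 15 clearly the lowest.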

Table 6. Memory specification of hardware implementation
RAM #    Block Height    Block Width    Word Width    Compression Ratio
3        1024            16             32            16
4        128             8              23            11.5

The BIST netlist is generated using a commercial BIST tool. The two memories are tested in series. The BIST circuit executes three different test algorithms on both memories. The compressed response of one of the test algorithms, March C-, which uses the fast row addressing scheme, is captured and decompressed.

To accommodate two memories with different word widths, the compressor is designed to be reconfigurable for 32-bit and 23-bit input fail vectors as suggested in Section 3.1.3. The BIST circuit signals the compressor whether the fail vector is 23 bits or 32 bits.

Because the compressor hardware is built as part of a larger circuit, the hardware overhead of the proposed compressor cannot be exactly calculated but is estimated at 364 equivalent NAND gates. The BIST circuit consumes 1141 equivalent NAND gates.

6.2 Implementation Results

The number of failing memories of RAM4 is too small to form a meaningful sample. Thus, we examined only RAM3. A total of 259 bitmaps were collected from RAM3 as the result of the testing. Via manual inspection, a total of 61 fail patterns were identified. To build the signature dictionary, a program was written to generate fail patterns similar to those observed from testing. The dictionary building consumed approximately 1 hour on a 300 MHz UltraSparc II and resulted in an average of 99.7% recognition on the last three learning cycles.

Of the 259 bitmaps, 253 (97.7%) were successfully recognized and 6 (2.3%) were not recognized. None of the bitmaps were misidentified. The 6 bitmaps not recognized contained two types of fail patterns. Both fail patterns were “large clustering patterns” which did not have failing rows or columns and lacked an obvious COI. However, they were easily recognizable by manual inspection.

7 Conclusion

In [9], we presented a method to enable embedded memory diagnosis in pin-limited designs by compressing the test response on-chip to reduce the I/O pins needed. The compression almost always resulted in no loss of data.

This paper addresses the fact that a lossless compression is not necessary for diagnosis as long as the fail patterns can be identified. A technique composed of two steps is shown in this paper: First, a “stripped-down” compressor modified from [9] is described. It achieves higher compression ratio and consumes less area than [9], but is lossy. In the second step, to overcome the loss of data, a novel method to recognize bitmaps that are imperfect due to lossy compression is introduced.

Simulation with a variety of test algorithms shows that this technique can be used for production testing. This technique works consistently well with a large variety of fail patterns.

Implementation of the compression hardware in silicon consumes little area and pin overhead, even when shared by two memories. Our recognition method is able to uniquely identify almost all bitmaps generated from compressed test responses.

Acknowledgments

The authors would like to thank Omar Kebichi, Hans Heineken, Manuel D’Abreu, and Laszlo Gutai, among others, for their support in this project.

References

[1] W. Maly and S. Naik, “Process Monitoring Oriented Testing,” Proc. Int’l Test Conf., 1989, pp. 527-532.

[2] S. Naik, F. Agricola, and W. Maly, “Failure Analysis of High-Density CMOS SRAMs Using Realistic Defect Modeling and Iddq Testing,” IEEE Design & Test of Computers, Jun. 1993, pp. 13-23.

[3] W. Maly et al., “Yield Diagnosis through Interpretation of Tester Data,” Proc. Int’l Test Conf., 1987, pp. 10-20.

[4] J. Khare et al., “SRAM-based Extraction of Defect Characteristics,” Proc. Int’l Conf. on Microelectronic Test Structures, 1994, pp. 98-107.

[5] A. L. Crouch, DFT for Digital IC’s and Embedded Core Systems, Prentice Hall PTR, Upper Saddle River, NJ, 1999.

[6] C. Pyron et al., “DFT Advances in Motorola’s MPC7400, a PowerPC Microprocessor,” Proc. Int’l Test Conf., 1999, pp. 137-146.

[7] I. Schanstra et al., “Semiconductor Manufacturing Process Monitoring using Built-In Self-Test for Embedded Memories,” Proc. Int’l Test Conf., 1998, pp. 872-881.

[8] J. T. Chen and J. Rajski, “A Method and Apparatus for Diagnosing Memory using Self-Testing Circuits,” US Patent Pending, Appl. No. 09/522279.

[9] J. T. Chen et al., “Enabling Embedded Memory Diagnosis via Test Response Compression,” Rec. of VLSI Test Symp., 2001.

[10] A. J. Van De Goor, Testing Semiconductor Memories, Theory and Practice, John Wiley & Sons, Chichester, UK, 1991.

[11] M. Faucher, “Pattern Recognition of Bit-Fail Maps,” Proc. Int’l Test Conf., 1983, pp. 108-113.

[12] B. B. Sindahl, “Interactive Graphical Analysis of Bit-Fail Map Data Using Interactive Pattern Recognition,” Proc. Int’l Test Conf., 1987, pp. 687-692.

[13] R. S. Collica et al., “SRAM Bitmap Shape Recognition and Sorting Using Neural Networks,” IEEE Transactions on Semiconductor Manufacturing, vol. 8, no. 3, Aug. 1995, pp. 326-332.

[14] J. Vollrath et al., “Compressed Bit Fail Maps for Memory Fail Pattern Classification,” Proc. IEEE European Test Workshop, 2000, pp. 125-130.
