
Fault Diagnosis Overview

David Lavo UC Santa Cruz January 13, 2005

Outline
- Introduction: What is Fault Diagnosis?
- Components: What's involved?
- Algorithm details: How does it work?
- Diagnosis in practice: How does it really work?
- Research: Why does (or doesn't) it work? How should it work?

2005 David Lavo


What is Fault Diagnosis?


- A guess as to what's wrong with a malfunctioning circuit
- Narrows the search for physical root cause
- Makes inferences based on observed behavior
- Usually based on the logical operation of the circuit


VLSI Fault Diagnosis (in One Slide)


[Figure: tests are applied to the defective circuit; the diagnosis algorithm takes the observed behavior and produces a diagnosis (a location or fault) that directs physical analysis.]

Two Types of Diagnosis


- Circuit Partitioning (Effect-Cause Diagnosis)
  - Identify fault-free or possibly-faulty portions
  - Identify suspect components, logic blocks, interconnects
- Model-Based Diagnosis (Cause-Effect Diagnosis)
  - Assume one or more specific fault models
  - Compare behavior to fault simulations

Circuit Partitioning
- Separate known-good portions of circuit from likely areas of failure
- Simplest method: identify failing flip-flops
  - Tester can identify failing flops or outputs
  - Input cone of logic is suspect
  - Intersection of multiple cones is highly suspect
- Single clock pulse with scan can be used for sequential/functional fails
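
The cone-intersection idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the netlist (a dict mapping each node to its fan-in nodes), the node names, and the helpers `input_cone` and `suspect_nodes` are all hypothetical, and intersecting cones is only meaningful if a single defect is assumed to explain every failure.

```python
from functools import reduce

# Hypothetical netlist: each node maps to the nodes that drive it (its fan-in).
# Primary inputs have an empty fan-in list.
NETLIST = {
    "a": [], "b": [], "c": [], "d": [],
    "n1": ["a", "b"],
    "n2": ["b", "c"],
    "n3": ["c", "d"],
    "ff1": ["n1", "n2"],  # observed flip-flop
    "ff2": ["n2", "n3"],  # observed flip-flop
}


def input_cone(netlist, node):
    """Return every node that can influence `node`, including `node` itself."""
    cone, stack = set(), [node]
    while stack:
        current = stack.pop()
        if current not in cone:
            cone.add(current)
            stack.extend(netlist[current])
    return cone


def suspect_nodes(netlist, failing):
    """Intersect the input cones of all failing flops/outputs."""
    cones = [input_cone(netlist, f) for f in failing]
    if not cones:
        return set()
    return reduce(set.intersection, cones)
```

With one failing flop the whole cone is suspect; with both `ff1` and `ff2` failing, only the shared logic (`b`, `c`, `n2`) remains.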

Back-Tracing Failures

aka Effect-Cause Diagnosis


- Reasoning based on observed behavior and expected (good-circuit) functions
- Commonly used at system and board levels
- Tries to separate good and suspect areas
- Advantage: Simple and general
- Disadvantage: Not very precise, often gives no indication of defect mechanism


Cause-Effect Diagnosis
- Start from possible causes (fault models), compare to observed effects
- A simulator is used to predict behavior of the circuit in the presence of various faults
- Match prediction(s) against observed behavior
- Advantage: Implicates a mechanism as well as a location
- Disadvantage: Can be fooled by unmodeled defects

Cause-Effect Diagnosis
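[Figure: cause-effect diagnosis flow. Tests applied to the defective circuit produce a behavior signature; the fault simulator produces candidate signatures; the diagnosis algorithm compares them and draws a conclusion.]

Behavior signature:
  010001010100010101010

Candidate signatures:
  010100110000101010100
  101000100001011101100
  010100010100011101100
  000111000101010011110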
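
The comparison step can be illustrated with the signatures shown in the figure. The sketch below assumes that each bit position means the same thing in every signature and uses Hamming distance as the comparison; the slide does not say which comparison the figure intends, so both points are assumptions, and the function names are hypothetical.

```python
# Signatures copied from the figure.
OBSERVED = "010001010100010101010"
CANDIDATES = [
    "010100110000101010100",
    "101000100001011101100",
    "010100010100011101100",
    "000111000101010011110",
]


def hamming(a, b):
    """Count the positions at which two equal-length signatures differ."""
    if len(a) != len(b):
        raise ValueError("signatures must have equal length")
    return sum(x != y for x, y in zip(a, b))


def rank_candidates(observed, candidates):
    """Return (distance, candidate index) pairs, closest candidate first."""
    return sorted((hamming(observed, c), i) for i, c in enumerate(candidates))
```

None of the four candidates matches the observed signature exactly; under this distance the third candidate (index 2) is the closest, differing in 5 of 21 positions.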

Outline
- Introduction: What is Fault Diagnosis?
- Components: What's involved?
- Algorithm details: How does it work?
- Diagnosis in practice: How does it really work?
- Research: Why does (or doesn't) it work? How should it work?


Components of Fault Diagnosis


- Fault models
- Fault simulators
- Fault dictionaries
- Diagnosis algorithms


Fault Models
- A fault model is an abstraction of a type of defect behavior
- A fault instance is the application of a model to a circuit wire, node, gate, etc.
- Used to create and evaluate test sets
- For diagnosis, they can be used to simulate and predict faulty behaviors


Stuck-at Fault Model


- The most-used fault model (by far)
- Simple to simulate and enumerate
- Effective for testing, fault grading, and diagnosis of some defects
- Many defects are not well represented by the stuck-at model

[Figure: node A stuck-at 1 at a gate with inputs A and B. Signal labels give fault-free/faulty logic values: 0/1, 1, and 0/1.]
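
A stuck-at fault is easy to simulate because it simply overrides one node. The sketch below assumes a 2-input AND gate, which is a guess made for illustration (the gate type in the figure is not recoverable from the text), and the function names are hypothetical.

```python
from itertools import product


def simulate(a, b, stuck_a=None):
    """Evaluate a 2-input AND gate; if stuck_a is 0 or 1, node A is forced to it."""
    if stuck_a is not None:
        a = stuck_a
    return a & b


def detecting_vectors(stuck_a):
    """Input vectors (a, b) whose output differs between good and faulty gate."""
    return [
        (a, b)
        for a, b in product((0, 1), repeat=2)
        if simulate(a, b) != simulate(a, b, stuck_a)
    ]
```

For A stuck-at 1, the only detecting vector is A=0, B=1: the fault-free output is 0 and the faulty output is 1, which agrees with the 0/1 labels in the figure.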

Bridging Fault Model


- Shorts are a common defect type in CMOS
- Different bridging fault models have varying accuracy and precision, from simplistic to very sophisticated
- Difficult or impractical to enumerate
[Figure: nodes X and Y bridged. Node X forces Y to a value of 0; the affected signal is labeled 1/0 (fault-free/faulty).]
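
One of the simplest bridging models lets one node overpower the other. The sketch below assumes that behavior (the victim node takes the dominant node's value) and also counts unordered node pairs to show why bridges are hard to enumerate. The function names and the node count are hypothetical.

```python
from math import comb


def apply_dominance_bridge(values, dominant, victim):
    """Return a copy of `values` in which `victim` takes the value of `dominant`."""
    bridged = dict(values)
    bridged[victim] = values[dominant]
    return bridged


def bridge_is_activated(values, dominant, victim):
    """A bridge only changes behavior when the two nodes carry opposite values."""
    return values[dominant] != values[victim]


def candidate_pair_count(node_count):
    """Number of unordered node pairs, i.e. potential two-line bridges."""
    return comb(node_count, 2)
```

With X=0 and Y=1, the bridged circuit sees Y=0, matching the figure. A circuit with 10,000 nodes already has 49,995,000 possible node pairs, which is why practical tools restrict the candidate list rather than enumerate every pair.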

Some Diagnostic Fault Models

[Figure: four diagnostic fault models: gate fault, net fault, bridging fault, path fault.]

Fault Simulators
- A fault simulator can simulate instances of a particular fault model
- Inputs:
  - Circuit (netlist)
  - Test set
  - Faultlist (list of fault instances)
- Output: circuit response
- Usually simulates the presence of a single fault instance (single-fault assumption)
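
The inputs and output listed above map onto a very small single-stuck-at simulator. This is a toy sketch, not how a production fault simulator is built: the three-gate netlist, the node names, and the `(node, stuck_value)` fault encoding are all hypothetical.

```python
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

# Hypothetical netlist in topological order: (output node, gate type, input nodes).
NETLIST = [
    ("n1", "AND", ("a", "b")),
    ("n2", "OR", ("b", "c")),
    ("out", "XOR", ("n1", "n2")),
]
PRIMARY_INPUTS = ("a", "b", "c")
PRIMARY_OUTPUTS = ("out",)


def simulate(vector, fault=None):
    """Simulate one test vector; `fault` is None or a (node, stuck_value) pair."""
    values = dict(zip(PRIMARY_INPUTS, vector))

    def force(node):
        # Single-fault assumption: at most one node is forced to a stuck value.
        if fault is not None and fault[0] == node:
            values[node] = fault[1]

    for node in PRIMARY_INPUTS:
        force(node)
    for out, gate, ins in NETLIST:
        values[out] = GATES[gate](*(values[i] for i in ins))
        force(out)
    return tuple(values[o] for o in PRIMARY_OUTPUTS)


def fault_simulate(test_set, fault_list):
    """Return {fault: [response per vector]} for every fault instance."""
    return {f: [simulate(v, f) for v in test_set] for f in fault_list}
```

Calling `fault_simulate(test_set, fault_list)` yields one response list per fault instance; comparing each list with the fault-free responses (`fault=None`) shows which vectors detect that fault.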

Fault Dictionaries
- A fault dictionary is a database of the simulated responses for all faults in the faultlist
- Used by some diagnosis algorithms for convenience:
  - Fast: no simulation at time of diagnosis
  - Self-contained: netlist, simulator, and test set not needed after dictionary creation
- Can be very large, however!


The Full-Response Dictionary


- For each fault (f), store the response to each test vector (v)
  - One bit per vector, pass (0) or fail (1)
- For each vector, store the expected output response (o)
- Total storage requirement: f × v × o bits


The Pass-Fail Dictionary


- For each fault, store only the test vector responses
  - One bit per vector, pass (0) or fail (1)
- Total storage requirement: f × v bits
- Much smaller than full-response, and often practical for even very large circuits
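
To get a feel for the difference between the two dictionary sizes, the formulas can be evaluated for a made-up circuit. The numbers below are hypothetical and chosen only for illustration, and `o` is assumed to be the number of observed output bits per vector.

```python
def full_response_bits(f, v, o):
    """Full-response dictionary: one bit per fault, per vector, per output."""
    return f * v * o


def pass_fail_bits(f, v):
    """Pass-fail dictionary: one bit per fault, per vector."""
    return f * v


def megabytes(bits):
    """Convert a bit count to megabytes (10**6 bytes)."""
    return bits / 8 / 1_000_000


# Hypothetical circuit: 100,000 faults, 1,000 vectors, 500 observed outputs.
FULL_MB = megabytes(full_response_bits(100_000, 1_000, 500))
PASS_FAIL_MB = megabytes(pass_fail_bits(100_000, 1_000))
```

For these numbers the full-response dictionary needs 6,250 MB while the pass-fail dictionary needs 12.5 MB, a factor of o = 500.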


Dynamic Diagnosis
- Alternative to dictionary-based diagnosis
- Fault simulation is only done for certain faults, based on test results
  - Only simulate faults in input cones of failing flip-flops/outputs
- Dictionary is eliminated, but requires complete netlist and test pattern file
- Used by most commercial ATPG tools: Mentor Fastscan, Synopsys, Cadence, etc.

Outline
- Introduction: What is Fault Diagnosis?
- Components: What's involved?
- Algorithm details: How does it work?
- Diagnosis in practice: How does it really work?
- Research: Why does (or doesn't) it work? How should it work?


Algorithm Details
- Role of a diagnosis algorithm
- Scoring methods
- Types of diagnosis algorithms


Diagnosis Algorithms
- Algorithms compare observed behavior to predicted behaviors
- An algorithm attempts to explain the observed failures with fault candidates
- The job of a diagnosis algorithm is to report the best fault candidate(s)
- "Best" is determined by the scoring method


Fault Candidate Scoring


- Two common scoring methods:
  - Match/mismatch points
  - Fault candidate probability
- Other common scorings:
  - Hamming distance
  - Set intersection/overlap
  - Nearest neighbor


Match/mismatch Point Scoring


- Award points for matching observed failures
- Optionally deduct points for not predicting fails
  - Nonprediction: a behavior not predicted by the candidate
  - Misprediction: a prediction not fulfilled by the behavior
- Commercial tools (e.g., Fastscan) are usually biased to lowest nonprediction
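
The three counts can be computed with plain set operations. The sketch below is a generic illustration, not the scoring used by any particular tool: failures are represented as sets of failing vector numbers, the candidate names are hypothetical, and the ranking key (fewest nonpredictions first, then fewest mispredictions) is one assumed way to express a bias toward lowest nonprediction.

```python
def score(observed_fails, predicted_fails):
    """Count matches, nonpredictions, and mispredictions for one candidate."""
    observed, predicted = set(observed_fails), set(predicted_fails)
    return {
        "match": len(observed & predicted),          # observed and predicted
        "nonprediction": len(observed - predicted),  # observed, not predicted
        "misprediction": len(predicted - observed),  # predicted, not observed
    }


def rank(observed_fails, candidates):
    """Order candidate names by fewest nonpredictions, then fewest mispredictions."""
    scores = {name: score(observed_fails, fails) for name, fails in candidates.items()}
    return sorted(
        scores,
        key=lambda n: (scores[n]["nonprediction"], scores[n]["misprediction"], n),
    )


# Hypothetical data: vectors 1, 4, and 7 failed on the tester.
OBSERVED = {1, 4, 7}
CANDIDATES = {
    "U1/A sa0": {1, 4, 7, 9},  # explains every failure, one extra prediction
    "U2/Z sa1": {1, 4},        # misses failing vector 7
    "U3/B sa1": {2, 5},        # explains nothing
}
```

Here `rank(OBSERVED, CANDIDATES)` puts `U1/A sa0` first because it leaves no observed failure unexplained, even though it mispredicts one vector.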

Probabilistic Scoring
- Probability score based on matches, mismatches, and error assumptions
  - Weights for non- and mis-prediction
  - Different prediction probabilities for different fault candidates (bridges vs. stuck-at)
- Usually normalized so that the total over all candidates equals 1.0
- UCSC method uses probabilities to compare stuck-at candidates to bridges in the same diagnosis
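
A toy version of weighted, normalized scoring is sketched below. It is not the UCSC method, whose details are not given here: the multiplicative penalty form, the weight values, and the candidate names are all assumptions chosen to show how per-model weights and normalization interact.

```python
# Hypothetical per-model penalty weights in (0, 1]: the factor applied for each
# nonprediction or misprediction. Values closer to 1 mean a milder penalty.
WEIGHTS = {
    "stuck-at": {"non": 0.1, "mis": 0.5},
    "bridge": {"non": 0.1, "mis": 0.8},
}


def raw_score(model, nonpredictions, mispredictions):
    """Unnormalized score: multiply one penalty factor per non-/misprediction."""
    w = WEIGHTS[model]
    return (w["non"] ** nonpredictions) * (w["mis"] ** mispredictions)


def normalize(candidates):
    """Map {name: (model, nonpredictions, mispredictions)} to {name: probability}."""
    raw = {name: raw_score(*c) for name, c in candidates.items()}
    total = sum(raw.values())
    return {name: r / total for name, r in raw.items()}


# Hypothetical candidates from one diagnosis run.
CANDIDATES = {
    "n5 sa0": ("stuck-at", 0, 2),
    "n5-n9 bridge": ("bridge", 0, 2),
    "n7 sa1": ("stuck-at", 1, 0),
}
PROBABILITIES = normalize(CANDIDATES)
```

Because a bridge is assumed to mispredict more often than a stuck-at fault, the bridge candidate ends up with the highest normalized score even though it has the same mismatch counts as `n5 sa0`.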

Types of Diagnosis Algorithms


- Stuck-at
  - Most common, best supported by tools
  - Surprisingly effective (~60% exact matches)
  - Very fast
- IDDQ
  - Orthogonal set of failing data
  - Requires interpretation of tester results
  - Not well supported by tools

IDDQ Threshold Setting


[Figure: IDDQ threshold-setting plot; vertical axis 0 to 180, horizontal axis 0 to 200.]

Types of Diagnosis Algorithms (Cont)


- Bridging-fault
  - May better represent common CMOS faults
  - More complicated fault model
  - Biggest problem: candidate selection
- Other possible (future) directions:
  - Functional fails
  - Delay fails
  - Parametric failures

Outline
- Introduction: What is Fault Diagnosis?
- Components: What's involved?
- Algorithm details: How does it work?
- Diagnosis in practice: How does it really work?
- Research: Why does (or doesn't) it work? How should it work?


Diagnosis in Practice
- Using a diagnosis
- Translating the results: circuit navigation
- Evaluating diagnosis quality
- Commercial diagnosis tools


Using a Diagnosis
- Fault diagnosis is used to aid physical inspection and root-cause identification
- Diagnosis output is logical, not physical:
  - Abstract faults (such as stuck-at)
  - Gates, ports (nodes), and nets
  - No information about location or size
- Translation to physical location requires navigation of circuit

Types of Circuit Navigation


- Netlist
  - Examine RTL (Verilog/VHDL, etc.) for gates and data paths
- Schematic
  - Symbolic view of gates and wires
- Layout/artwork
  - Graphical view of metal lines, poly, vias, cell boundaries, etc.

Circuit Netlist
module TOP (CLK, Reset, StartOut, SiReady, Rst_CntN, Up_DnN, Wr, SDin,
            Wr_RAM, Wr_Rreg, RAM_Addr, ATG_TESTMODE, BIST_TESTMODE, SDout,
            TwoOnes, OneOne, NoOnes, TwoZeros, OneZero, NoZeros);

  input CLK;
  inout Reset, StartOut, SiReady, Rst_CntN, Up_DnN, Wr, SDin, Wr_RAM;
  inout [2:0] RAM_Addr;
  inout ATG_TESTMODE;
  inout BIST_TESTMODE;
  inout SDout, OneZero, NoZeros;
  inout TwoOnes, OneOne, NoOnes, TwoZeros, Wr_Rreg;

  // Tie off cells
  TLOW tielow1 (.Q(tielow));
  THIGH tiehigh1 (.Q(tiehigh));

  // Inverted CLK
  wire CLK_N;
  INVFF clkinv (.Q(CLK_N), .A(CLK));

  // PADS
  PADNMIOSCM0H08N05B50 PAD001_StartOut (
    .PUEN(tiehigh), .PDE(tielow), .IEN(tielow), .I(StartOut_I),
    .SIGNAME(StartOut), .INMODE(in_mode_avail), .TESTI(jumper001),
    .TESTIEN(tiehigh), .SCANIN(jumper001), .OUTMODE(out_mode_avail),
    .TESTO(tiehigh), .TESTOEN(tiehigh), .O(tielow), .OEN(tiehigh));

Netlist Navigation
- Either use a text editor on the netlist, or use the browser function in a simulator
- Browsers allow you to trace forward and backward and see logic values
- Can be used to view hierarchy and functional blocks
- Can be tedious


Circuit Schematic
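[Figure: schematic fragment showing gates U475, U493, U509, U510, U515, and U454 (including AOI21 and OAI222 cells) and nets n27 and n673, annotated with logic values and ranked stuck-at candidates: #1 (100): SA0, #2 (96): SA1, #3 (85): SA0, #4 (75): SA1.]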

Schematic Navigation
- Either hand-drawn (from netlist navigation) or tool-generated gate symbols and wires
- Schematic tools in simulators also allow forward and backward traversal and display of logic values
- Used to verify fault propagation
- Does not reflect physical distances


Circuit Artwork

Layout (Artwork) Navigation


- Use routing/floorplanning tools to view artwork
- Can usually input cell or wire name and tool will highlight the object
- Useful for determining (x,y) values
- Also good for evaluating physical implications of a set of fault candidates
  - Faults clustered in a small area are good
  - Faults/nets spread around large die areas are bad

Fault Proximity

- Net runs across die: physical examination is almost impossible
- Faults contained in small area: physical examination is possible
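
Once candidate locations have been translated to (x,y) coordinates, a rough proximity check is simple to script. The measure below (the largest extent of the candidates' bounding box as a fraction of the die dimension) is an assumption made for illustration, as are the coordinates and die size; no tool is implied to work this way.

```python
def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) for a list of (x, y) points."""
    xs, ys = zip(*points)
    return min(xs), min(ys), max(xs), max(ys)


def relative_extent(points, die_width, die_height):
    """Largest bounding-box side as a fraction of the matching die dimension."""
    x0, y0, x1, y1 = bounding_box(points)
    return max((x1 - x0) / die_width, (y1 - y0) / die_height)


# Hypothetical candidate coordinates on a 1000 x 1000 die.
CLUSTERED = [(100, 100), (120, 110), (110, 130)]
SPREAD = [(10, 10), (990, 500)]
```

The clustered candidates span about 3% of the die, so physical examination is plausible; the spread candidates span about 98% of the die width, which corresponds to the "net runs across die" case.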

Evaluating a Diagnosis
- A diagnosis without one or a few strong (high-scoring) candidates is usually poor
- Can indicate:
  - Multiple defects
  - Unmodeled (complex) behavior
  - Inappropriate algorithm
- If the diagnosis is poor, either try another algorithm or look for more data (failures)


Evaluating a Diagnosis (cont)


- Many diagnoses (~60%) implicate a single stuck-at fault
  - Usually a good sign, but you must consider equivalent faults
- Many defects can mimic a stuck-at fault without being a short to Vdd or Gnd
  - Consider nearby nodes also, if practical


Dominance Bridging Fault


[Figure: a FIB short bridges a strong inverter and a weak inverter. Annotation: "Top candidate is stuck-at fault on this node."]

Candidate #2 is Best

[Figure: locations of Candidate #1, Candidate #2, and Candidate #3 relative to the FIB short.]

Commercial Tool: Mentor Graphics


- ATPG tool: Fastscan
  - Stuck-at diagnosis only
  - No IDDQ capability
  - Orders candidates by number of matched failures (biased to lowest non-prediction)
  - Also has netlist & schematic browser
  - Based on Waicukauski & Lindbloom (D&T '89)


Commercial Tool: Synopsys


- ATPG tool: TetraMAX
  - J. Waicukauski moved to Synopsys after writing Fastscan
  - Diagnosis capability unknown: assumed to be similar to Fastscan


Commercial Tool: Cadence


- ATPG tool: Encounter Test
  - Test and diagnosis tools purchased from IBM
  - IBM has had good diagnosis research, but Encounter's capabilities are unknown
- Also of interest: Silicon Ensemble - routing tool
  - Graphical artwork viewer
  - Good for highlighting nets and cells based on diagnosis results
  - Good for determining (x,y) and producing screen shots


Outline
- Introduction: What is Fault Diagnosis?
- Components: What's involved?
- Algorithm details: How does it work?
- Diagnosis in practice: How does it really work?
- Research: Why does (or doesn't) it work? How should it work?


Prior Art
- Waicukauski & Lindbloom, IEEE Design & Test, Aug. '89
  - Most widely used algorithm for commercial tools
  - Finds candidates to match individual tests, attempts to explain all failing tests
- Abramovici & Breuer, IEEE Trans. Computers, June '80
  - Effect-cause diagnosis
  - Permanent stuck-at fault assumption
- Aitken & Maxwell, HP Journal, Feb. '95
  - Analysis of relative importance of models vs. algorithms
- Lavo, Larrabee, et al., Proceedings of ITC '98
  - Probabilistic scoring
  - Mixed-model diagnosis
- Bartenstein et al., Proceedings of ITC '01
  - SLAT: Single Location At-a-Time diagnosis
  - Focus on matching per-vector results

Prior Art (cont)


- Jee & Ferguson, Proceedings of ISTFA '93
  - Carafe
  - Inductive Fault Analysis (IFA)
  - Examine circuit to determine likely failure locations
- Aitken, Proceedings of ITC '95
  - Using FIBs to insert defects
  - Calibrate/evaluate diagnosis methods
- Henderson & Soden, Proceedings of ITC '97
  - Probabilistic physical failure analysis
- Nigh, Vallett, et al., Proceedings of ITC '98
  - Large-scale, multi-company SEMATECH experiment
  - Failure analysis of timing and IDDQ fails


Research Directions
- Complex defect behaviors
  - Beyond stuck-at and 2-line bridges
  - Intermittent faults
  - Delay and timing-related defects
  - Parametric & process-related defects
  - Multiple simultaneous defects
- Is there a simple, inductive way to infer complex defects?

Research Directions (cont)


- Diagnosability
  - What makes a particular circuit easy or hard to diagnose?
  - What can we do to make diagnosis easier?
- Evaluation of diagnoses
  - What makes a good diagnosis?
  - Can we quantify our confidence in a diagnosis?

Research Directions (cont)


- Integration with physical FA & yield improvement
  - Can we incorporate process information?
  - Can we produce a physical diagnosis?
- On-line (or even on-chip) diagnosis
- Commercial toolflow integration
  - Can diagnosis tools use industry-standard data formats?
  - Can commercial tools be scripted or programmed to do better diagnosis?