
Six Sigma Tolerance Design Case Study:

Optimizing an Analog Circuit Using Monte Carlo Analysis

Andy Sleeper
Successful Statistics LLC
970-420-0243
andy@OQPD.com

1. Abstract

Tolerance Design is the science of predicting the variation in system performance caused by variations in component values or the environment. This article shows how Monte Carlo simulation can be applied to predict and improve the quality of a system before even one prototype has been built. Using these methods allows new products to be developed rapidly and introduced with fewer unexpected problems. The case study in this article is a simple analog circuit. The analytical methods and optimization process may be successfully applied to any engineering problem where a transfer function can be derived.

2. Overview of Tolerance Design

In general, any product or process is a system converting inputs to outputs. This is shown graphically in Figure 1.

At the center of the system is a transfer function, which converts the inputs (X) into outputs (Y). The transfer function is a mathematical equation, which may be known, estimated, or unknown.

[Figure 1 - Generic System: inputs X (part characteristics, process characteristics, environmental characteristics) pass through the transfer function Y = f(X) to produce outputs Y (system characteristics)]

These three types of transfer functions are common in engineering problems:

• White box transfer functions are derived analytically, using principles of science and engineering.

• Gray box transfer functions are estimated by simulating the behavior of the system, using computer programs like SPICE. The function itself may be too complicated to derive, or it may have no closed-form solution.



• Black box transfer functions are estimated by observing the behavior of a
physical system. This is done by designing an orthogonal experiment,
collecting the data, and estimating the transfer function using analysis of
variance and linear regression methods.

This paper describes an example of Tolerance Design applied to a white-box transfer function. Whenever possible, white-box transfer functions are preferred, because they can be derived earlier in the development process, leading to faster introduction of new products.

[Figure 2 - Tolerance Design Process: Step 1: Define tolerance for Y → Step 2: Develop transfer function → Step 3: Compile variation data on X → Step 4: Predict variation of Y → Step 5: Optimize system]

Figure 2 illustrates an effective process for tolerance design, using these five
steps. More details on these steps will be explained later, using the case study
as an example.

1. Define tolerance for Y: Based on customer requirements for the system, define the widest limits on Y which provide tolerable performance for the system.

2. Develop transfer function: Derive the transfer function for the initial design
of the system. Set up an Excel worksheet with formulas to calculate the
transfer function.

3. Compile variation data on X: If real data is available on the X’s, compute statistics from that data, and select distributions that represent the variation seen in the data. Usually, no data is available, and an assumption is needed. When nothing is known about X, assume that it is uniformly distributed between its tolerance limits. This is a conservative assumption, because it is worse than real life in most cases. Using Crystal Ball® software, define Assumption cells for each input X, based on this information.

4. Predict variation of Y: Using Crystal Ball software, define Forecast cells for each output Y. Set run preferences and run the simulation. In a Six Sigma environment, compute capability metrics for Y, such as CP, CPK and DPMLT. If these predicted quality metrics meet Six Sigma criteria, then stop!

5. Optimize system: If the system is not acceptable, what needs to be changed? Consider these questions:

a. Does the tolerance for Y accurately reflect customer needs?

b. Which X contributes most to variation in Y? The sensitivity chart produced by Crystal Ball tells you this. For the biggest contributor, either get some data to replace the default assumption, or choose a different component with less variation. Don’t waste time fiddling with the small contributors on the sensitivity chart.

c. If the design needs to be changed, a new transfer function must be developed. The results of the simulation and the sensitivity chart provide clues to help in your redesign effort.

3. Case Study

The schematic shown below is part of a 5V power supply designed to detect when the 5V voltage drops too low. When this happens, the comparator changes state, resetting the processor before it starts doing evil things.
[Figure 3 – Undervoltage Comparator, Original Design: R1 = 4.99k ±1%; R2 = 5.36k ±1%; R3 = 499k ±1%; R4 = 10K ±1%; VR1 = AD780 2.5V reference, ±0.2%; U1 = LM2903 comparator, Voffset = 0 ±15 mV]



Step 1: Define Tolerance for Y

First, what is Y? What characteristics of this circuit are we interested in? Here
are three:
• VTRIP-DOWN – This is the voltage of the +5V bus when the comparator
changes state, when the +5V is going down, for instance, when the power
supply is shutting off.
• VTRIP-UP – This is the voltage of the +5V bus when the comparator changes
state, when the +5V is going up, for instance, when the power supply
starts up.
• VHYST = VTRIP-UP – VTRIP-DOWN. For stability, the comparator circuit requires a certain amount of hysteresis.

For simplicity in this article, we will only analyze VTRIP-DOWN. If you wish to practice
using these techniques, try analyzing the other two Ys as an exercise!

So what are the customer requirements for VTRIP-DOWN? This circuit is buried
inside a product, and appears to be far away from the customer. No customer is
ever aware of this circuit, unless it fails to work properly. This circuit is a safety
device, intended to prevent undesired malfunction of the digital circuitry. So the
customer requirement for VTRIP-DOWN is to shut down the processor before its
supply voltage goes out of range at 4.75V. Therefore, 4.75V is the lower
tolerance limit.

The upper tolerance limit is set by the variation of the +5V output itself. If VTRIP-DOWN is above 4.85V, and the +5V voltage is low because of load conditions or its inherent variation, the system will not work correctly.

So the tolerance limits for VTRIP-DOWN are 4.75V to 4.85V.

Step 2: Develop Transfer Function

For many problems, this step can be the most difficult. But a few simple
guidelines help make this easier:

• Do not include inputs which have negligible impact


• Use new symbols to represent intermediate values
• Keep equations short. Look for opportunities to substitute symbols for
portions of the equation

For the undervoltage comparator, there are many inputs I choose to ignore. This
is risky, and requires some engineering judgment. There is a risk of ignoring an
input that is actually significant. So when in doubt, either leave it in, or use some
other method (such as circuit simulation) to determine if the input is significant or
not.



In this case, I choose to ignore the effect of the resistor in series with the
reference diode. Based on the specifications of the diode, I can calculate that
the effect of the resistor tolerance is in the nanovolt range, which is swamped out
by the voltage tolerance of the diode. So I feel safe in ignoring this input.

Likewise, the input bias current of the comparator and the load impedance of the
circuit following the comparator have effects, but these are extremely small, and I
ignore them.

What follows is one way to derive the transfer function. In this derivation:
VTRIP-DOWN is the +5V bus voltage at the point where the comparator
changes state
V+ is the voltage at the + input to the comparator
V- is the voltage at the – input to the comparator
V+ = V− + VOFFSET at the trip point
V- = VR1
Since we are analyzing VTRIP-DOWN, the output of the comparator before it changes state is high, so the open-collector output of the LM2903 is floating.

$$V_+ = V_{TRIP\text{-}DOWN}\,\frac{R2}{R1 \parallel (R3+R4) + R2}$$

$$V_+ = V_{R1} + V_{OFFSET} = V_{TRIP\text{-}DOWN}\,\frac{R2}{\frac{R1(R3+R4)}{R1+R3+R4} + R2}$$

$$V_{TRIP\text{-}DOWN} = \left[V_{R1} + V_{OFFSET}\right]\left[\frac{R1(R3+R4)}{R2(R1+R3+R4)} + 1\right]$$
This last equation is the transfer function to be analyzed.
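As a sanity check, the transfer function can be evaluated at the nominal component values before any simulation is run. The short Python sketch below performs the same arithmetic as the Excel worksheet shown in Figure 4 below (the function name and structure are mine):

```python
def v_trip_down(vr1, voffset, r1, r2, r3, r4):
    """Transfer function for the undervoltage comparator."""
    numerator = r1 * (r3 + r4)         # 2,539,910,000 at nominal values
    denominator = r2 * (r1 + r3 + r4)  # 2,754,986,400 at nominal values
    return (vr1 + voffset) * (numerator / denominator + 1)

print(v_trip_down(2.5, 0.0, 4990, 5360, 499000, 10000))  # about 4.8048
```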

Figure 4 illustrates an Excel worksheet containing this formula. Here are some tips to make this process easier:

• Enter a name in the cell to the left of each component. In the next step, Crystal Ball will automatically pick up this name for each Assumption cell.

• Format each cell with a reasonable number of decimal places.

• Split the transfer function into small pieces to minimize errors. Here, the numerator and denominator of the fraction were calculated separately.

[Figure 4 - Worksheet (Undervoltage Comparator): VR1 2.5000; Voffset 0.0000; R1 4990; R2 5360; R3 499000; R4 10000; Numerator 2539910000; Denominator 2754986400; Vtrip-down 4.8048]

Step 3: Compile Variation Data on Each Input X

In the ideal world, engineers would have access to vast databases with actual
measured values from samples of all these parts. From this data, we could
select the most appropriate probability distribution and use that distribution for
the Monte Carlo simulation.

But in real life, most engineers have no data.

For the first simulation in data-poor real life, I recommend assuming that each
component is uniformly distributed between its specification limits. This is a
conservative assumption, because it is usually (but not always) worse than real
data will be.

A handy way to implement this assumption with Crystal Ball is to define the
tolerance limits in worksheet cells. For each X, define a uniform distribution and
enter references to the cells where the tolerance limits are located. This is
illustrated in Figure 5.

Figure 5 - Defining Assumptions with Calculated Parameter Values


After defining the first assumption, use the Crystal Ball “Copy Data” and “Paste
Data” functions to quickly define the rest of the assumption cells.

Step 4: Predict Variation of Y

Select the cell containing the calculated value for VTRIP-DOWN and define that as a
Crystal Ball forecast cell, so that Crystal Ball will keep track of the randomly
generated values. At this point, the spreadsheet looks like Figure 6.

© 2003 Successful Statistics LLC 6 www.OQPD.com


Figure 6 - Spreadsheet ready for simulation

Next, we must decide how many trials to run. We could pick a number out of the
air, but Crystal Ball provides a better approach, called precision control. Using
this feature, the simulation runs until we have “enough” information.

In this case, I asked Crystal Ball to run until the mean and standard deviation of
VTRIP-DOWN are known to within 1%, with 95% confidence. For this model, this
precision was achieved after 15,500 trials, which were completed in 10 seconds
on my computer.
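Crystal Ball automates the sampling, but the same experiment is easy to reproduce outside Excel. Here is a minimal Python sketch under the same uniform-between-limits assumptions, with a fixed trial count for simplicity (names and seed are mine; results will differ slightly from the Crystal Ball run):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 15_500  # trial count reported in the text

def uniform_tol(nominal, tol):
    """Sample uniformly between the tolerance limits nominal*(1 ± tol)."""
    return rng.uniform(nominal * (1 - tol), nominal * (1 + tol), N)

vr1 = uniform_tol(2.5, 0.002)          # AD780 reference, ±0.2%
voff = rng.uniform(-0.015, 0.015, N)   # LM2903 offset voltage, ±15 mV
r1, r2 = uniform_tol(4990, 0.01), uniform_tol(5360, 0.01)
r3, r4 = uniform_tol(499_000, 0.01), uniform_tol(10_000, 0.01)

y = (vr1 + voff) * (r1 * (r3 + r4) / (r2 * (r1 + r3 + r4)) + 1)

print(y.mean(), y.std(ddof=1))                  # near 4.805 and 0.026
certainty = np.mean((y >= 4.75) & (y <= 4.85))  # fraction within tolerance
print(100 * certainty)                          # near 94.6%
```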

I also selected “Latin Hypercube Sampling”, which tends to converge faster than
the default simple random sampling used by Crystal Ball.

For more complicated models which require more calculation time, relaxing the
precision control to 5% or more may be needed to finish the simulation in a
practical time.
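Precision control can be approximated in code with a simple stopping rule. The sketch below is my own simplified version, not Crystal Ball's exact algorithm: it keeps sampling until the 95% confidence intervals for both the mean and the standard deviation are within 1%, relative:

```python
import numpy as np

rng = np.random.default_rng(0)

def run_until_precise(sample_batch, rel=0.01, z=1.96, batch=1000,
                      max_trials=1_000_000):
    """Sample in batches until the mean and standard deviation are each
    estimated to within `rel` (relative), with ~95% confidence."""
    y = sample_batch(batch)
    while len(y) < max_trials:
        n, m, s = len(y), y.mean(), y.std(ddof=1)
        mean_ok = z * s / np.sqrt(n) <= rel * abs(m)
        sd_ok = z / np.sqrt(2 * n) <= rel  # large-sample CI for sigma
        if mean_ok and sd_ok:
            break
        y = np.concatenate([y, sample_batch(batch)])
    return y

# Stand-in sampler; substitute the circuit model from the sketch above
y = run_until_precise(lambda n: rng.normal(4.805, 0.0257, n))
print(len(y))  # on the order of 20,000 trials
```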



[Figure 7 - Predicted forecast distribution: frequency chart of Vtrip-down, 15,500 trials, x-axis from 4.7286 to 4.8806; certainty is 94.63% from 4.7500 to 4.8500]

Figure 7 displays the frequency chart for the forecast VTRIP-DOWN. The certainty
grabbers are set at the tolerance limits, 4.75 and 4.85.

Clearly, this design has a problem. Based on this simulation, only 94.63% of
these circuits would meet their tolerance requirements.

In a Six Sigma environment, we must calculate other metrics, such as CP, CPK
and DPMLT. To do this, we need the mean and standard deviation of VTRIP-DOWN
which Crystal Ball predicts are 4.8049 and 0.02568, respectively. I plug these
values into another spreadsheet to make the capability calculations. (This
worksheet, CapMet16.xls, is available on my web site, www.OQPD.com)
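The capability arithmetic itself is straightforward. The sketch below reproduces the key numbers in Figure 8, assuming a normal distribution and the conventional 1.5-sigma long-term shift (my reconstruction; CapMet16.xls may differ in details):

```python
from scipy.stats import norm

LSL, USL = 4.75, 4.85
mu, sigma = 4.8049, 0.02568     # from the simulation

cp = (USL - LSL) / (6 * sigma)  # 0.649
cpu = (USL - mu) / (3 * sigma)  # 0.585
cpl = (mu - LSL) / (3 * sigma)  # 0.713
cpk = min(cpu, cpl)             # 0.585

# Long-term DPM: shift the mean 1.5 sigma toward the nearer tolerance limit
shift = 1.5
dpm_lt = 1e6 * (norm.cdf(-(3 * cpu - shift)) + norm.cdf(-(3 * cpl + shift)))
print(cp, cpk, dpm_lt)          # about 0.65, 0.585, and 399,000 DPM
```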



Capability metrics

  Cp       0.6490991
  Cc       0.0981429
  Cpu      0.5853946
  Cpl      0.7128036
  Cpk      0.5853946
  Z-bench  1.5913082
  Z-st     1.7561839
  Z-lt     0.2561839

Quality Prediction, assuming normal distribution

                                       Short-Term   Shifted Up   Shifted Down   Long-Term
  Defects per million, upper (DPMU)     39528.466   398904.492        564.664
  Defects per million, lower (DPML)     16241.654      137.197     261603.116
  Defects per million (DPM)             55770.119                                399041.689

[Plot: normal probability function against the specification limits, showing the target distribution and the shifted-up and shifted-down distributions, x-axis 4.6 to 5.0]

Figure 8 - Capability of Initial Design


This report predicts a CPK of 0.58 and a long term defect rate of 399,042 Defects
Per Million Units (DPMLT). These metrics are clearly unacceptable. The shifted
distributions in the chart illustrate the effects of inevitable shifts and drifts which
happen during the production of a product.

Step 5: Optimize System

Clearly, improvement is needed. We could revisit the tolerance for VTRIP-DOWN, but for the reasons explained above, no changes to the tolerance are possible.

So what is causing most of the variation in this system? The Crystal Ball
sensitivity chart, shown in Figure 9, has the answer.



The biggest contributor to variation is VOFFSET, followed closely by R1 and R2.

[Figure 9 - Sensitivity Chart for target forecast Vtrip-down, measured by rank correlation: Voffset .65, R1 .50, R2 -.50, VR1 .22, R4 -.02, R3 .00]

So the first change to the system should be to improve VOFFSET.

Revision 1: Better Comparator

For a modest increase in parts cost, the LM2903 comparator can be replaced with an LM293, which controls offset voltage to 0 ± 9 mV over temperature.

The organization of the Excel worksheet used in this case study makes revisions very convenient. By changing the tolerance in cell C5 to .009, the parameters of the Voffset assumption are automatically updated.

[Figure 10 - Revision 1: worksheet with the Voffset tolerance changed to .009]

After repeating the simulation with these settings, CPK is now 0.68 and DPMLT is now 290,947. It’s better, but not good yet.

[Figure 11 - Sensitivity Chart - Revision 1, target forecast Vtrip-down, measured by rank correlation: R2 -.60, R1 .59, Voffset .44, VR1 .24, R4 -.01, R3 .00]

The sensitivity chart in Figure 11 shows that R1 and R2 are now the big culprits.
Further improvement to the comparator would not be cost-effective.



Revision 2: Using 0.1% resistors for R1 and R2

It is possible (at high cost) to purchase 0.1% resistors. What if these were used
in place of R1 and R2? It’s easy to find out. Change the values in cells C6 and
C7 to 0.1% and repeat the simulation.

[Figure 12 - Frequency chart - Revision 2: forecast Vtrip-down, 12,350 trials, x-axis from 4.7500 to 4.8500]


Figure 12 shows the predicted frequency chart with the tolerance limits set as the
limits of the plot. None of the trials fell outside of tolerance limits. As a result, CP
= 1.44, CPK = 1.31 and DPMLT = 7,829. These numbers are better, and out of all
the simulated units, none failed.

But there are still two big problems with this design:

First, the odd-value 0.1% resistors are expensive, and using them creates costly
problems for procurement and inventory. If these are not part of the standard
parts stocked for assembly, additional equipment and setup will be necessary.

Second, this quality level is still not good enough for Six Sigma. To meet Design
For Six Sigma (DFSS) standards, CPK must be 2 or greater. After a product goes
into production, shifts and drifts caused by components, processes and
uncontrolled environmental factors may shift the average by 1.5 standard
deviations or more, without being detected. A DFSS product must be designed
so that quality is good even after the average values are shifted by 1.5 standard
deviations.

What is good enough? For a normally distributed process, if CPK = 2.00, then the
long term defect rate (DPMLT) is 3.4 Defects Per Million Units. That’s world-class
quality for this type of product.
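The 3.4 DPM figure follows directly from the normal distribution and the 1.5-sigma shift; for example:

```python
from scipy.stats import norm

# Cpk = 2.00 puts the nearer limit 6 sigma from the mean;
# a 1.5-sigma shift toward that limit leaves 4.5 sigma of margin.
print(norm.sf(4.5) * 1e6)  # about 3.4 defects per million
```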

So what can we do if the system is already too costly and still does not meet
quality requirements? Redesign it.



Take a look at the transfer function, shown again here:
$$V_{TRIP\text{-}DOWN} = \left[V_{R1} + V_{OFFSET}\right]\left[\frac{R1}{R2}\cdot\frac{R3+R4}{R1+R3+R4} + 1\right]$$
I regrouped the equation to illustrate the impact of the ratio R1/R2 on the result.
Can we control the ratio R1/R2 and reduce cost? Yes we can!

Revision 3: Network of matched resistors

There are resistor networks containing two resistors with tightly controlled ratio.
Because the resistors are manufactured on a single die, these parts are
reasonably priced. One such part contains two 10,000 Ohm resistors with 0.1%
absolute tolerance, while the ratio is controlled to 1 ± 0.025%. This part is less
expensive than even one 0.1% resistor.

The drawing below shows a revision of the design, using this component.

[Figure 13 – Undervoltage Comparator, Revision 3: R1-2 = matched 10k pair, ±0.1% absolute, ratio ±.025%; R5 = 634 ±1%; R3 = 499k ±1%; R4 = 10K ±1%; VR1 = AD780 2.5V reference, ±0.2%; U1 = LM293 comparator, Voffset = 0 ± 9 mV]

So far, the system models we have used assume that all components are
independent of each other. Here, we have intentionally introduced a dependency
between R1 and R2. How do we set up the Monte Carlo model so Crystal Ball
will simulate this dependency?

• If we had a number of samples of the resistor network, we could measure them, compute the correlation coefficient between R1 and R2, and specify this correlation in Crystal Ball.



• But if we have no samples and no data, we must make an assumption. A reasonable assumption is that the values of R1 and R2 are uniformly distributed within their tolerance zones.

Figure 14 illustrates the tolerance zone of these two resistors. Each part must be within 0.1% (10 ohms) of the nominal value, and the ratio is controlled to within 0.025%. So if R1 = 10,000 ohms, the tolerance for R2 is 9,997.5 to 10,002.5 ohms.

[Figure 14 - Tolerance zone of R1, R2: a narrow diagonal band inside the square spanning 9990 to 10010 ohms on each axis]

One way to express this to Crystal Ball is to use the following trick:

• Specify R1 as 10,000 ± 0.1%

• In the transfer function, replace R2 by (R1 + R2A). Specify R2A as 0 ± 2.5 ohms.
The new transfer function is shown below:

$$V_{TRIP\text{-}DOWN} = \left[V_{R1} + V_{OFFSET}\right]\left[\frac{R1}{R1+R2A+R5}\cdot\frac{R3+R4}{R1+R3+R4} + 1\right]$$
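In code, the dependency trick costs one extra random variable. A sketch of the Revision 3 model, continuing the earlier Python example (variable names and seed are mine; I use the 2.5V nominal reference implied by Figure 3 and the capability numbers in the text):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 11_800  # trial count reported in the text

def uniform_tol(nominal, tol):
    return rng.uniform(nominal * (1 - tol), nominal * (1 + tol), N)

r1 = uniform_tol(10_000, 0.001)       # network resistor, ±0.1% absolute
r2a = rng.uniform(-2.5, 2.5, N)       # ratio error: R2 = R1 + R2A
r5 = uniform_tol(634, 0.01)
r3, r4 = uniform_tol(499_000, 0.01), uniform_tol(10_000, 0.01)
vr1 = uniform_tol(2.5, 0.002)         # AD780 reference, ±0.2%
voff = rng.uniform(-0.009, 0.009, N)  # LM293 offset, ±9 mV

y = (vr1 + voff) * (r1 / (r1 + r2a + r5) * (r3 + r4) / (r1 + r3 + r4) + 1)
print(y.mean(), y.std(ddof=1))        # near 4.806 and 0.0115
```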

Simulating this transfer function leads to this frequency chart:


[Figure 15 - Frequency chart - Revision 3: forecast Vtrip-down, 11,800 trials, x-axis from 4.7500 to 4.8500]

Now, CP = 1.45, CPK = 1.29 and DPMLT = 8,863, about the same as the previous revision.

So cost has improved, but quality has not.

To plan the next step, once again look at the sensitivity chart, shown in Figure 16.



[Figure 16 - Sensitivity chart - Revision 3, target forecast Vtrip-down, measured by rank correlation: Voffset .88, VR1 .45, R5 -.07, R2 -.03, R1 -.01, R3 .01, R4 .01]


Once again, VOFFSET is the biggest culprit, while all the resistors now have a trivial
impact.

Revision 4: Define Assumption Based on Real Data

It is time to question the default assumption that each component is uniformly distributed between its tolerance limits. After all, the comparator comes from a company that publicly champions its “Six Sigma” program. It should be of high quality.

So a sample of 50 LM293 parts is drawn from stock, including samples from different date codes. The offset voltage is measured on all these parts. The figure below shows a histogram of this data.

[Figure 17 - Histogram of Voffset data (Sample 1), x-axis from -0.01 to 0.01]



The Crystal Ball Batch Fit tool may be used to select a distribution model which best fits this data. In this case, we decide to use a normal distribution, with parameters set based on the statistics of this sample: µ = 3 × 10⁻⁶ and σ = 0.00207.

In the spreadsheet model of the transfer function, we change the assumption for VOFFSET to a normal distribution with the parameters listed above, and repeat the simulation.
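In the Python sketch of Revision 3, only the VOFFSET line changes. If the 50 measurements were available as an array, scipy.stats.norm.fit could estimate the parameters directly; here I plug in the statistics reported above (the file name is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 14_150  # trial count reported in the text

# offsets = np.loadtxt("lm293_offsets.txt")  # hypothetical data file
# mu, sigma = scipy.stats.norm.fit(offsets)  # maximum-likelihood fit
mu, sigma = 3e-6, 0.00207                    # statistics from the text

voff = rng.normal(mu, sigma, N)  # replaces the uniform ±9 mV assumption
```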

The results are shown below:


[Figure 18 - Frequency chart - Revision 4: forecast Vtrip-down, 14,150 trials, x-axis from 4.7500 to 4.8500]


Quality Prediction, assuming normal distribution

                                       Short-Term   Shifted Up   Shifted Down   Long-Term
  Defects per million, upper (DPMU)         0.000        0.342          0.000
  Defects per million, lower (DPML)         0.000        0.000          0.000
  Defects per million (DPM)                 0.000                                    0.342

Figure 19 - Six Sigma Capability Plot for Revision 4



Now, CP = 2.43, CPK = 2.15 and the predicted long-term defect rate is 0.3 DPM.
Figure 19 illustrates that even with a 1.5-sigma shift added to this process,
quality levels are extremely good.

4. Summary

In this article, an analog circuit design is used to illustrate the power of Tolerance
Design techniques and Monte Carlo simulation. The initial design proved to be
unsatisfactory, and through a series of revisions, we generated a new design of
extremely high quality at reasonable cost. Here are the steps we followed:

1. We analyzed the initial design using Crystal Ball Monte Carlo simulation,
assuming that each component is uniformly distributed between its
tolerance limits. The results showed unacceptably high variation.
2. The sensitivity chart identified the biggest cause of variation, so we
replaced it with a tighter tolerance part. This reduced variation, but not
enough.
3. We tried 0.1% resistors, which further improved quality, but at
unacceptable parts cost.
4. We recognized that the transfer function depends heavily on the ratio
R1/R2. Instead of discrete 0.1% resistors, we used a resistor network with
controlled ratio. This reduced parts cost to acceptable levels, but variation
was still too high.
5. Again, the sensitivity chart identified the biggest cause of variation. We
gathered a sample of parts and measured them, using actual data instead
of the default assumption. This change brought the predicted quality to an
acceptable level.

New product design is always iterative. To introduce products more quickly, these iterations must be done rapidly, in the analysis phase. Later, in the prototype phase, revisions are slow and costly. This case study illustrates how a design may be fully optimized before building a single prototype.

Tolerance Design and Monte Carlo simulation are the keys to a safe, robust and
successful new product.

About the Author

Andy Sleeper is a DFSS Master Black Belt and General Manager of Successful Statistics LLC. Andy provides training and consulting services to engineers in new product development. Andy holds a BS degree in Electrical Engineering and an MS degree in Statistics. For more information, please e-mail andy@OQPD.com or call 970-420-0243.

