

Published on Quality Digest


Performing a Short-Term MSA Study

By: Steven Ouellette

Gauging your conformance decisions

In the past couple of articles, we have been having fun together testing whether a measurement
device is usable for the crazy purpose of determining if we are actually making product in or out
of specification. Last month, we performed a measurement systems analysis (MSA) [1] “potential
study” using a snazzy MSA spreadsheet (if I do say so myself)*. We found that the Hard-A-Tron
was not only pretty highly variable (compared to our spec), but that the material we were
measuring actually might have been changing over time. But a potential study was not enough
for you, was it? You asked, nay demanded, that we perform a short-term MSA, and I, your
humble servant, gave you the data [2] to do so. After the jump, we will perform the analysis, so
unless you are the type of person that flips to the back of the book to see if you want to read it,
finish up your analysis, and then click to read more.

OK, let’s review the scenario. We are weighing a plastic preform before placing it into a
compression mold. The weight specification is 465 ±50 grams. We are assuming that the
measurement is independent of operator, so we only have one operator do the test.

We have been using this scale for years—and it has a digital readout, so the plant manager
likes it. On the other hand, we have had a lot of defective parts for years, too, and the area the
scale is in is pretty contaminated with phenolic dust.

You perform the study by having the operator go through all 25 samples in a random order
(while preventing him from seeing the ID number of each). You record these readings, then go
through the same 25 samples in a different random order, and repeat this so you have five
measurements on each of the 25 pieces. Each row contains the five measurements for that part.
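The blind, randomized run order described above could be generated with a short script like this (a sketch of mine, not part of the MSA spreadsheet; part IDs and trial count match this study):

```python
# Generate five independent random passes through the 25 part IDs.
# The operator sees only the run position, never the part ID.
import random

random.seed(42)                     # fixed seed so the sheet is reproducible
parts = list(range(1, 26))          # part IDs 1..25
run_order = []
for trial in range(5):              # five passes through all parts
    order = parts[:]
    random.shuffle(order)           # fresh random order each pass
    run_order.append(order)

print(run_order[0][:5])             # first five parts of trial 1
```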

First off, what are we trying to accomplish by doing a short-term study?

In this case, we want to know if the gauge itself might be adding variability into the process. In
other situations, we might want to learn more about how a gauge performs before plunking
down $10,000 for it. By increasing the number of samples we are testing, we give the gauge a
better opportunity for “stuff” to happen that would affect the measurement, and give ourselves
some more data to help us make a conclusion. I mean, if something happens during the hour of
this test, it is only going to perform worse once we get it into production, right?

So we need to look at:

1. The repeatability, or how much variability the same operator and the same gauge produce on
the same part. This comes from the range within operator within part and will be shown on a
modified range chart.

2. The reproducibility, or how much variability is due to differences in operator and gauge. (This
is not applicable in this scenario.) This comes from the range across operators.

3. The discrimination of the gauge, or the ability of the gauge to tell nominally different parts
from each other. This will come from the total measurement error for each operator and be
shown on a modified mean chart.

4. The ability of the gauge to correctly classify product as conforming or nonconforming on a
single measure. This will be indicated by the metric %R&R.

Once the data are in the spreadsheet, a whole bunch of things happen. Let’s go through each
and see what we can learn about this gauge.

As with the potential study, we are using the range across each unit’s measurements to get an
estimate of the measurement error within sample. The average range across all those samples
ought to give us an estimate of that component of the measurement error. Now that we have 25
samples being measured five times each, we can use the concept of a control chart to help us
determine if any of the ranges are unexpectedly high, instead of just eyeballing it. (We can’t do
that with the potential study, because the limits on the range fluctuate with sampling error on
those smaller sample sizes and fewer repeated measures.)

So the spreadsheet creates the usual type of range chart, but the range is for each part, not
across multiple parts. Be sure you understand what this range chart is doing—it is critical to
making conclusions about your measurement system. Each dot is the average range of
repeated measurements on the same part, and so represents measurement error. Measurement
error is one of the few events you can usually count on being normally distributed and hopefully
the same for each part (we will verify this two ways), so the average of the ranges of those
repeated measurements can use the usual formulas for calculating the control limits for the
range chart:

    UCL_R = D4 × R̄        LCL_R = D3 × R̄

where D3 and D4 are constants related to sample size. As it turns out, with a sample size of five,
there is no lower limit on the range chart. Because each dot is the range across a part, the dots
are not in time order, we don’t connect the dots and we don’t use any of the time-based control
rules like runs and trends. We just look for one or more points outside of the limits.
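A minimal sketch of that range-chart arithmetic, using the standard D3 and D4 constants for a subgroup size of five (the weights here are simulated for illustration, not the actual study data):

```python
# Range-chart limits for an MSA: 25 parts, each measured 5 times.
import numpy as np

rng = np.random.default_rng(0)
# 25 parts x 5 repeated measures (hypothetical preform weights, grams)
data = rng.normal(loc=465, scale=15, size=(25, 5))

ranges = data.max(axis=1) - data.min(axis=1)   # one range per part
r_bar = ranges.mean()                          # average within-part range

D3, D4 = 0.0, 2.114          # standard range-chart constants for n = 5
ucl_r = D4 * r_bar           # upper control limit for the range chart
lcl_r = D3 * r_bar           # zero: no lower limit for n = 5

out_of_limits = np.flatnonzero(ranges > ucl_r)
print(f"R-bar = {r_bar:.2f} g, UCL = {ucl_r:.2f} g, "
      f"parts out of limits: {out_of_limits}")
```

Any part index landing in `out_of_limits` would be the one to investigate, just as described above.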

And here is what we see:

Figure 1: Range chart of repeated measures

This chart would detect if one or more parts had an unusually high range, which you might see
if the part gets damaged during the MSA, or if there was something unique to that part that
made getting a measurement difficult. Just like with regular statistical process control (SPC), we
would investigate any point outside the limits to try to understand what happened. But with no
points outside the limits, we can say that the within-part variability looks to be pretty stable
across all of our parts.

Those of you who have ever used a mass balance are looking at that average range and going,
“Whoa!” But we did learn something important here: There is nothing unusual about any part
that is causing a large range in measurements, which implies that there is something inherent in
the measurement process itself causing that. Whether that is due to operator error or the
measurement device, we don’t know by looking at the graph (though the assumption in this case
was that operator didn’t have an effect).

Because we have built a range chart, it seems like we ought to take a look at a mean chart, too.
But it is again important to understand what the mean chart is telling us.

Just like the range chart, the mean chart here is the mean of the repeated measures, not the
mean of multiple samples, and the samples are selected from nominals across the entire range
over which the gauge is expected to be used. These two facts have a profound effect on how we
interpret this chart.

Remember how control limits are (by default) calculated for the location charts for continuous
data? Except in very particular circumstances, we use the average dispersion metric multiplied
by a constant, as with a range chart:

    UCL, LCL = (grand mean) ± A2 × R̄

We do this because the average range will give us a better idea of the true underlying variability.
The way we usually make a control chart is to take, say, five sequential units for each sample.
That way they are about as similar as we can make them, so what variability we see within-
sample is hopefully due to just the total process variation (which includes inherent variability in
the parts and measurement error). If there are no changes through time (out of control events)
then the within-sample error is the same as the between-sample error, and the chart shows
random, normally distributed means. And don’t forget, we are taking a sample of five, and due
to the central limit theorem I know that the random sampling distribution of the means is going
to be more narrowly distributed than the individuals. Remember this?

    σ_x̄ = σ / √n

So I use the range to estimate the process standard deviation (σ), and then reduce that to
account for the sample size used in generating the means. Otherwise, the limits would be too
big for our averages.
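Numerically, that shrinkage looks like this (the R̄ value is a hypothetical one chosen so that R̄/d2 lands near the roughly 14.69 g measurement error found in this study; d2 = 2.326 is the standard constant for five repeated measures):

```python
# Estimate sigma from the average range via d2, then shrink by sqrt(n)
# to get the spread of subgroup means (the usual mean-chart logic).
import math

r_bar = 34.17        # hypothetical average range, grams
d2 = 2.326           # range-to-sigma constant for n = 5
n = 5

sigma_hat = r_bar / d2                 # estimated std. dev. (~14.69 g)
sigma_xbar = sigma_hat / math.sqrt(n)  # std. dev. of subgroup means

print(f"sigma = {sigma_hat:.2f} g, sigma of means = {sigma_xbar:.2f} g")
```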

But that is not what is happening here.

In an MSA we are trying to understand only the measurement error, and our ranges across the
repeated measures represent only that error. Also, we should be choosing, as samples for our
MSA, parts from across the entire range over which I expect to use that gauge. If I am testing a 1- to 2-inch
micrometer, I will knowingly choose parts that are to a 1-inch nominal, 2-inch nominal, and
anything in between. So I don’t expect my means to be distributed in any particular way—I
chose how they are distributed when I decided which parts I wanted for my sample. If my
micrometer varies around a 100th of an inch on repeated measures, the range is going to
predict a tiny control limit, but my parts are up to an inch different from each other. We can have
an additional component of variability here, and that is the nominal differences between parts.

So the mean chart is going to look weird. You might try to make a case that you don’t even need
the mean chart, but we can still use a chart with the means to learn about our measurement
system. Again we don’t draw lines from average to average (which implies time order) but now I
add in points for the individual readings as well (you’ll see why in a moment). I have big blue
dots for the mean and little black dots for each measurement. This will visually show the spread
of the repeated measurements, which can help us figure out what is going on.

We are not doing process control here—control limits make no sense to put on this chart. We
don’t care about the part-to-part variability, we care about the variation of the repeated
measures. We can put some lines on the chart that show that variation. To do so, we use this
good old formula—

    σ_e = R̄ / d2

—to give us an estimate of the measurement error based on the range, which is about 14.69 g.
We do not correct for sample size—we run this process by taking a single measure, so we want
to see visually how much variation we can have on a single measurement. Let’s use ±3
measurement error standard deviations to generate the natural tolerance of repeated
measurements on the exact same part time and time again. For convenience, we will place
these lines (not control limits) around the mean of all the parts.
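Those ±3σ lines come out of a couple of lines of arithmetic (the grand mean here is a hypothetical stand-in; σ_e matches the value estimated above):

```python
# Natural-tolerance lines for the mean plot: grand mean ± 3 * (R-bar / d2).
# No sqrt(n) correction, because conformance decisions in production are
# made on a SINGLE reading, not an average of five.
r_bar = 34.17          # hypothetical average range, grams
d2 = 2.326             # constant for 5 repeated measures
grand_mean = 465.0     # hypothetical grand mean of all readings

sigma_e = r_bar / d2               # measurement-error std. dev. (~14.69 g)
upper = grand_mean + 3 * sigma_e
lower = grand_mean - 3 * sigma_e
print(f"lines at {lower:.1f} g and {upper:.1f} g")
```

Skipping the √n correction is the whole point: the lines show how far a single repeated reading on the exact same part can wander.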

And we see this:

Figure 2: Mean plot of repeated measures with ±3σ lines and individual readings

Note that I made those lines dashed black, so as not to confuse anyone into thinking that they
are control limits.

This chart shows a process with so much variation that whatever part-to-part differences might
exist are swamped by the huge measurement error. See that blue dot pretty close to the green
line? If I measured that 100 times, I could get readings spanning the entire range between the
black dashed lines. Yikes!

What I actually want to see on this graph is something like this:

Figure 3: Mean plot of repeated measures with ±3σ lines and individual readings for a different gauge

The black dashed lines show the discrimination of the gauge on a single measurement. With
the gauge in figure 3, we have a pretty good ability to discriminate one part from another. In
figure 2, we can’t tell one part from another with a single measure on the mass balance—any
real differences are small compared to the measurement variation.

Still, we chose a bunch of parts from the process—maybe they really are pretty much the same.
If the specification is wide compared to this variation, I can still count on the gauge to correctly
classify my parts as in or out of spec. Remember that goofy graph from last month showing the
probability of incorrectly classifying a part? Here it is again for this gauge.

So even if our preform masses are well within specification, our measurement system is going to
be classifying a good chunk of them as out of spec. And we stand a pretty good chance of
classifying parts that are out of spec as in spec. Hmmm….

We don’t have to make that graph as part of an MSA; we just need a metric to show this inability
of the gauge to make the right decision. That is the %R&R calculation, which will tell me what
proportion of the spec is taken up solely by measurement error:

    %R&R = 100% × (6 × σ_e) / (USL − LSL)
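As a rough sketch of the calculation, using the ≈14.69 g measurement error and the 465 ±50 g spec from this study (I assume the common 6σ spread here; some older references use 5.15σ instead, which would give a smaller number):

```python
# %R&R: share of the tolerance consumed by measurement error alone.
sigma_e = 14.69          # measurement-error std. dev. from the range, grams
usl, lsl = 515.0, 415.0  # spec: 465 +/- 50 g

pct_rr = 100.0 * (6 * sigma_e) / (usl - lsl)
print(f"%R&R = {pct_rr:.0f}%")
```

At roughly 88% of the specification, almost the whole tolerance is eaten by the gauge before the parts themselves get a say.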

Again, %R&R should not be used as the only consideration for acceptability, but in this case the
high measurement error relative to the spec combined with the fact that we have had problems
with this process for years, would seem to indicate that it is time to give that mass balance
salesman a call. And to think about how to protect the new balance from phenolic

One last thing to consider is to see if there is a relationship between the magnitude of the mass
reading and its variation. It is not uncommon for a measurement system to have different
variability on the high versus the low end of the scale. We can check that with a quick
correlation, which the spreadsheet generated for us.

We test to see if that correlation is significant using the correlation test for ρ = 0 and (no
surprise) find no significant correlation. If there were, the variation of the measurements
would change with the magnitude of the reading, so the ability to correctly classify our preforms
would change depending on how much they weigh, and we would have a different %R&R for
different nominal masses. It is good to know if this is the case, especially with a gauge that
measures a wide span of nominal values as part of production. (Yes, I am talking to you, QA/QC.)
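That mean-versus-dispersion check can be sketched as follows (simulated data again; the t statistic for testing ρ = 0 is the standard t = r√(k−2)/√(1−r²)):

```python
# Correlate each part's mean reading with its range, then test rho = 0.
import math
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(465, 15, size=(25, 5))   # 25 parts x 5 readings, simulated

means = data.mean(axis=1)                  # magnitude of each part's reading
ranges = data.max(axis=1) - data.min(axis=1)   # its dispersion

r = np.corrcoef(means, ranges)[0, 1]
k = len(means)
t = r * math.sqrt(k - 2) / math.sqrt(1 - r * r)
print(f"r = {r:.3f}, t = {t:.2f} on {k - 2} df")
```

Compare |t| to the two-sided critical value for 23 degrees of freedom (about 2.07 at the 5% level); a significant result would mean %R&R varies with nominal mass.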

The short-term study gives us a lot of information on how a gauge performs during a snapshot
in time, and the %R&R indicates if I can use that gauge to make conformance decisions. (And I
hope by this point, you can see the total insufficiency of a calibration sticker to tell you that.) But
once I start using a gauge, how do I know that it is still good today?

The long-term study monitors a gauge over time to ensure that a gauge that is acceptable today
remains so tomorrow. To do that, we set aside five to eight samples from our usual production,
spanning the range of what the gauge is to measure. Every day we remeasure these same
samples, so again we are getting repeated measures. If there is a change in the readings we
investigate to see what changed. And if it wasn’t the samples, then the gauge is giving different
numbers today than it did yesterday.
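The long-term idea can be sketched in a few lines (everything here is simulated; the sample count, shift size, and limit rule are invented for illustration, not taken from the article's spreadsheet):

```python
# Remeasure the same set-aside samples daily; flag days whose average
# reading drifts outside limits set from the early, known-good data.
import numpy as np

rng = np.random.default_rng(2)
n_days, n_samples = 30, 6
true_values = rng.uniform(420, 510, n_samples)  # set-aside samples, grams

readings = true_values + rng.normal(0, 3, size=(n_days, n_samples))
readings[20:] += 8.0                            # inject a gauge shift on day 20

daily_means = readings.mean(axis=1)
baseline = daily_means[:15]                     # establish limits early on
center, sd = baseline.mean(), baseline.std(ddof=1)
flagged = np.flatnonzero(np.abs(daily_means - center) > 3 * sd)
print("days flagged:", flagged)
```

A flagged day triggers exactly the investigation described above: first check the samples, and if they have not changed, the gauge has.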

So here is your mission, should you choose to accept it:

A statistical facilitator and an engineer wish to conduct a gauge capability analysis (long-term)
for a particular Ignition Signal Processing Test on engine control modules. The test selected for
study measures voltage which has the following specifications:


IGGND = 1.4100 ± .0984 Volts (Specification)

Eight control modules are randomly selected from the production line at the plant, and run (in
random order, of course) through the tester at one hour intervals (but randomly within each
hour). This sequence is repeated until 25 sample measures (j = 25) of size eight (n = 8) have
been collected.

The data can be found here [3].

Play around with the data in the MSA spreadsheet [4], and let me know what you think about
your voltage test fixture.

*By the way, I noticed that the MSA spreadsheet will give an error in Excel 2007 where it hasn’t
before. The new version with this workaround is posted online [4].


© 2010 Quality Digest Magazine. All Rights Reserved.


Source URL (retrieved on 12/05/2010):


[3] MSA-SRO-col.txt
[4] Forms 3.22.xls
