Avoid Two Common Mistakes in Measurement System Analysis

Rohin Raina
Measurement system analysis (MSA) determines whether the measurement system is
adequate and confirms that significant error is not introduced to the true value of a process
characteristic. MSA is one of the most misunderstood and underused concepts in Six
Sigma. This article highlights two of the common mistakes made during the study and
explains how to avoid them.

Maintain Low Measurement System Error

Mathematically, total variance is equivalent to the sum of true variance and the
measurement system error. Ideally, measurement system error would be zero but, practically
speaking, this is rarely the case because of factors such as worn and uncalibrated
gauges, inconsistency of an appraiser, and different knowledge levels among the appraisers. In
other words, total variation should arise only from differences among the parts being measured. It
is important to keep measurement system error as low as possible.
Variance (total) = variance (true) + variance (measurement error)
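
As a minimal numeric sketch of this relationship, assuming the true (part) variation and the measurement error are independent, the snippet below adds the two variances and reports the share contributed by the measurement system. The function names and the values are hypothetical and chosen only for illustration.

```python
def total_variance(true_variance, measurement_variance):
    """Observed variance, assuming the two sources of variation are independent."""
    return true_variance + measurement_variance


def measurement_share(true_variance, measurement_variance):
    """Fraction of the observed variance that comes from the measurement system."""
    return measurement_variance / total_variance(true_variance, measurement_variance)


# Hypothetical variances, in squared measurement units.
true_var = 9.0
measurement_var = 1.0

print(total_variance(true_var, measurement_var))     # 10.0
print(measurement_share(true_var, measurement_var))  # 0.1
```

The lower the share, the more of the observed variation reflects real differences between parts.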

Continuous or Discrete Data

For a measurement system to be considered adequate, it must meet set rules based on the data type
being used. For continuous data, 1) gage R&R has to be within 10 percent (10 percent to 30
percent allowed if the process is not critical) of the total study variation, and 2) the number
of distinct categories has to be greater than four. (For discrete data where attribute
agreement analysis is used, kappa value has to be at least 0.7 for nominal and ordinal data,
and Kendall's correlation coefficient [with a known standard] has to be at least 0.9 for
ordinal data.)
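
The continuous-data rules can be sketched in a few lines. This is an illustration under stated assumptions, not the output of any particular statistical package: the function name is hypothetical, the inputs are variance components that a gage R&R study would normally estimate, the number of distinct categories uses the common textbook form sqrt(2) x (part standard deviation / measurement standard deviation) truncated to an integer, and "within 10 percent" is read as "at most 10 percent".

```python
import math


def assess_continuous_msa(part_variance, measurement_variance):
    """Apply the two continuous-data rules to hypothetical variance components."""
    sd_parts = math.sqrt(part_variance)
    sd_measurement = math.sqrt(measurement_variance)
    sd_total = math.sqrt(part_variance + measurement_variance)

    # Gage R&R as a percentage of total study variation (ratio of standard deviations).
    pct_study_variation = 100.0 * sd_measurement / sd_total
    # Number of distinct categories, truncated (an assumed convention).
    ndc = math.floor(math.sqrt(2) * sd_parts / sd_measurement)

    if ndc <= 4 or pct_study_variation > 30.0:
        verdict = "inadequate"
    elif pct_study_variation <= 10.0:
        verdict = "adequate"
    else:
        verdict = "adequate only if the process is not critical"
    return pct_study_variation, ndc, verdict


# Hypothetical variance components in squared millimetres.
print(assess_continuous_msa(99.75, 0.25))
print(assess_continuous_msa(96.0, 4.0))
print(assess_continuous_msa(75.0, 25.0))
```
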
The process of conducting an MSA study for continuous and discrete data is similar. Take 10
to 20 samples for a study, provide them to two or three appraisers for the first trial, and then
rerun the study. The main difference lies in the fact that the appraisers use a gauge to
measure the part in continuous data. For discrete data, however, it is left to the knowledge
of the appraisers whether the transaction is defective.

MSA for Discrete Data

One common challenge in an MSA study of discrete data concerns the two trials.
How can the bias be removed when appraisers are given the same samples for the two trials
through an email? When provided the same sample twice at the same time, the appraisers
will surely provide the experimenter the same results for Trials 1 and 2; thus, no
repeatability issues will be detected when the study is done in this manner. Additionally, if
the two appraisers are aware of the study being run, then the reproducibility component
results will be biased. The following example highlights such a mistake being made during
an MSA study.
Example: Compliance Project in Banking
A project leader at a financial institution was asked to do an MSA study to confirm that the
measurement system was adequate. He ran the study for a week, put 10 samples in a
spreadsheet and sent them to the two appraisers. The study was completed and the data was
shared with the Black Belt (BB). The BB completed the study in a statistical analysis
program and found that there was no issue in repeatability. There were, however, some
mismatches between the two appraisers. Curious, the BB asked the project leader how the
study was conducted.
The project leader explained that he documented 10 samples in a spreadsheet and sent them
to the two appraisers through separate emails. For the second trial, the project leader again
sent the 10 samples in a spreadsheet via email. The BB told the project leader that, while
sending separate emails ensured that neither appraiser knew a second individual was
taking part in the study, there was still a repeatability bias involved in the
process. The BB suggested that the project leader instead use the following procedure to
ensure that there would be no repeatability or reproducibility bias involved in the study.
1. Write the unique identification numbers of 10 transactions on paper and make a copy.
2. Give each of those hard copies to the two appraisers (or subject-matter experts,
SMEs) but do not tell the SMEs that two trials will be conducted.
3. The SMEs should review the 10 transactions and provide their decisions on each
transaction (defective or nondefective).
4. Have the SMEs return those original copies with their now-added decisions.
5. After a week's time has passed, put the same 10 samples again on paper. Make a copy.
6. Give each of those two papers again to the same SMEs.
7. Have the SMEs review the 10 transactions and make their decisions.
8. Collate all four papers.
9. Mark the SME names and trial numbers (1 or 2) on each paper and collate in a
10. Send the data to the BB to run the study in the statistical analysis program.
The project leader took 10 new samples and provided them to the SMEs following the
new documented method. This time there were differences within appraisers, but the kappa
value was within the permissible limit. By using this process, the repeatability bias was
removed and the true measurement system error was determined.
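
To make the kappa check concrete, here is a minimal sketch that computes Cohen's kappa for one appraiser's Trial 1 and Trial 2 decisions (within-appraiser agreement, i.e. repeatability). Cohen's kappa is one common form of the statistic; a statistical analysis program may report a different variant such as Fleiss' kappa. The function name and the decisions are hypothetical, not data from the banking study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)

    # Proportion of items on which the two sets of ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each set's category frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(ratings_a) | set(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if expected == 1.0:
        return 1.0  # both sets used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical decisions: D = defective, N = nondefective.
trial_1 = ["D", "D", "D", "D", "N", "N", "N", "N", "N", "N"]
trial_2 = ["D", "D", "D", "N", "N", "N", "N", "N", "N", "N"]

kappa = cohen_kappa(trial_1, trial_2)
print(round(kappa, 3), kappa >= 0.7)
```

The same function can compare two different SMEs on the same trial to gauge between-appraiser agreement (reproducibility).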

MSA for Continuous Data

Another common challenge is frequently observed when an MSA study is done for a set of
continuous data. How should a sample be selected when the manufacturing process
runs on a number of machines that produce varying product sizes? Can that influence
the MSA study?
Example: Multiple Machines in Manufacturing
A supervisor was conducting an MSA study for the thickness parameter of a grinding wheel.
She had parts produced from different presses, which came in sizes varying from 5
mm to 200 mm in thickness (categorized into large, medium and small thickness wheels).
The supervisor thought that one study of 10 samples done with two appraisers would be
good enough for the study.
She met with the Six Sigma expert in the organization and asked if she was using the right
approach to conduct the study. The Six Sigma expert asked her how she would ensure that
no measurement error was introduced (taking linearity into consideration). The expert
recommended that the supervisor ensure that the gauge was linear across the entire
range of measurements (the varying range of thicknesses).
The supervisor then took another set of 10 samples each for the small, medium and large
thickness wheels to check the linearity of the gauges (the gage R&R). This way the
supervisor ensured that both accuracy and precision-related measurement errors were
correctly addressed during the study.
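
One common way to check linearity, sketched here under assumptions, is to measure reference parts of known thickness across the range, compute the bias (measured minus reference) for each, and fit a straight line of bias against reference value. A slope near zero suggests the bias does not change across the range. The function name and all readings below are hypothetical, and a real study would also test whether the slope is statistically significant.

```python
def linearity_fit(reference_values, measured_values):
    """Least-squares line of bias (measured - reference) against reference value."""
    biases = [m - r for r, m in zip(reference_values, measured_values)]
    n = len(reference_values)
    mean_ref = sum(reference_values) / n
    mean_bias = sum(biases) / n

    sxx = sum((r - mean_ref) ** 2 for r in reference_values)
    sxy = sum((r - mean_ref) * (b - mean_bias)
              for r, b in zip(reference_values, biases))

    slope = sxy / sxx
    intercept = mean_bias - slope * mean_ref
    return slope, intercept


# Hypothetical reference thicknesses (mm) and gauge readings (mm).
reference = [5.0, 50.0, 100.0, 150.0, 200.0]
measured = [5.1, 50.55, 101.05, 151.55, 202.05]

slope, intercept = linearity_fit(reference, measured)
print(f"bias grows by about {slope:.3f} mm per mm of thickness")
```

In this made-up data the bias grows with thickness, which is the pattern a linearity check is meant to expose.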

While conducting MSA studies, be aware of their practical challenges and how to remove
them so as to avoid measurement errors.

brent bowler

Regarding the Multiple Machines in Manufacturing MSA with different sizes of grinding
wheels: what we don't know is why the measurement study was being performed.
Normally the purpose of an MSA is to ensure the product meets an external/internal
requirement for a customer (in spec/out of spec, in control, out of control, etc.). In this case,
it almost appears as if the purpose is to tell the difference between a 5 mm wheel and a 200
mm wheel. The Six Sigma expert did well to help her understand that the samples needed
to be in smaller buckets. However, to lump the grinding wheels into buckets with ranges of
70 mm each is still a nonstarter. A metric Stanley tape measure will provide all the
discrimination needed to tell the difference between a 5 mm, 10 mm, 15 mm, etc., up to the
ten samples in the study. I have normally seen this type of MSA when a Green
Belt/Black Belt was trying to check off a box as part of their certification.

Chris Seider

Your continuous example may be misleading. If you pick 3 different SKUs and test the
measurement system across such a broad range, the % error of the MSA relative to the
process will be mistakenly thought to be small. You want to evaluate a measurement
system for a product line and compare the measurement variation with
the process variation and specs for that one product.
There's nothing wrong with stating you'd want to check the linearity across the entire range,
but you'd get misled on the MSA variation, based on my understanding of what was presented.
If there's a huge difference in product characteristics on the same measurement device, one
could easily say you would need to do an MSA on the various points across the spectrum
(e.g., the 5-200 mm thickness is too large a product variation; I'm assuming no single
product spec is 5-200 mm but a much smaller range).
Good topic.
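
The effect described in this comment can be illustrated with a small sketch. It assumes the same measurement standard deviation in both cases and treats the listed thicknesses as the true part values. The function name and all numbers are hypothetical.

```python
import math
import statistics


def pct_study_variation(part_values, sd_measurement):
    """Measurement standard deviation as a percentage of total study variation."""
    sd_parts = statistics.stdev(part_values)
    sd_total = math.sqrt(sd_parts ** 2 + sd_measurement ** 2)
    return 100.0 * sd_measurement / sd_total


SD_MEASUREMENT = 0.1  # mm, assumed identical for both studies

one_product = [49.7, 50.1, 49.9, 50.3, 50.0]       # parts from a single product (mm)
mixed_products = [5.0, 50.0, 100.0, 150.0, 200.0]  # parts spanning the whole range (mm)

narrow = pct_study_variation(one_product, SD_MEASUREMENT)
wide = pct_study_variation(mixed_products, SD_MEASUREMENT)
print(f"single product: {narrow:.1f}%   mixed products: {wide:.2f}%")
```

With identical measurement error, the mixed-product study reports a far smaller percentage simply because the parts differ so much from one another.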

I like the article. Here are some suggestions to enhance its understandability.
1. You state at the beginning (and provide a formula), "Mathematically, total variance is
equivalent to the sum of true variance and the measurement system error." You should
explain that, since we are measuring several parts, the equation becomes variance(total)
= variance(parts) + variance(measurement error). That is, the true variance in MSA is the
variance of the parts. If you only measure one part that does not change during the
measurement period, then any variance observed is measurement error.
2. You state that one criterion for a good measurement system is that "the number of
distinct categories has to be greater than four" when doing continuous MSA. However,
your solution for your example of multiple machines with "sizes varying from 5 mm to 200
mm in thickness" is not appropriate for this criterion. (Note you also mention these
thicknesses are "categorized into large, medium and small thickness wheels," suggesting
that this should be a discrete MSA and not continuous. Clarify this for the readers.)
3. You should never knowingly use parts of multiple sizes (even if the tolerances are the
same) to determine the number of distinct categories. The formula is: number of distinct
categories = √2 × σ_parts / σ_measurement. Thus, the greater the variation of the parts, the
more likely this quantity will be four or more. By merely choosing parts that are very
different (as in your example where they range from 5 to 200 mm), you will get more than
four. Yet, you have no idea what the smallest difference your measurement system can
detect. Or worse, you believe it is acceptable for distinguishing parts you need to recognize
as different when it isn't capable of that. Your recommendation of checking for linearity is
good but not applicable to determining number of distinct categories, or, more
informatively, the smallest difference your measurement system can detect. It is a myth that
4. Fortunately, there is a solution and it doesn't require measuring multiple parts. You
select the smallest difference you want to recognize with your measurement system.
Then determine the standard error of your measurement system by measuring only one part
(two if you need to check that the variance is constant across the linear range) multiple
times. Then if _measurement 2, you can distinguish between parts that differ by or
more. I have explained this, along with some myths surrounding measurement
system analyses, in one of my books.
5. You need to separate the criteria for evaluating a measurement system into two areas: a)
to distinguish between good parts and bad parts relative to specifications and b) to
distinguish between parts, regardless of whether they meet specs. Percent of tolerance
addresses the first but not the second, while number of distinct categories addresses the second
but not the first.
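
The distinction in point 5 can be sketched numerically. The snippet assumes two common textbook forms: percent of tolerance as 6 x measurement standard deviation divided by the tolerance width (some references use 5.15 instead of 6), and number of distinct categories as sqrt(2) x (part standard deviation / measurement standard deviation), truncated to an integer. The function names, spec limits and standard deviations are hypothetical.

```python
import math


def percent_of_tolerance(sd_measurement, lower_spec, upper_spec, spread_multiplier=6.0):
    """Measurement spread as a percentage of the tolerance width."""
    return 100.0 * spread_multiplier * sd_measurement / (upper_spec - lower_spec)


def number_of_distinct_categories(sd_parts, sd_measurement):
    """sqrt(2) * sd_parts / sd_measurement, truncated (an assumed convention)."""
    return math.floor(math.sqrt(2) * sd_parts / sd_measurement)


# Hypothetical single-product study, all values in mm.
sd_measurement = 0.1
sd_parts = 0.1
lower_spec, upper_spec = 45.0, 55.0

pt = percent_of_tolerance(sd_measurement, lower_spec, upper_spec)
ndc = number_of_distinct_categories(sd_parts, sd_measurement)
print(f"percent of tolerance: {pt:.1f}%   distinct categories: {ndc}")
```

In this made-up case the gauge looks fine for sorting good parts from bad against a wide tolerance, yet it cannot tell the parts apart from one another, so the two criteria answer different questions.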